Performs the cross-validation calculations for mvr.
This function is not meant to be called directly, but through the generic functions pcr, plsr, cppls or mvr with the argument validation set to "CV" or "LOO". All arguments to mvrCv can be specified in the generic function call.
If segments is a list, the arguments segment.type and length.seg are ignored. The elements of the list should be integer vectors specifying the indices of the segments. See cvsegments for details.
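As a minimal sketch of the list form, the example below builds the segments by hand and passes them through pcr. The block size of 7 and the object names my.segments and yarn.cv are arbitrary choices for illustration, not something the package prescribes.

```r
library(pls)
data(yarn)

# Build consecutive blocks of (at most) 7 row indices each.
# The block size 7 is an arbitrary choice for this illustration.
n <- nrow(yarn)
my.segments <- split(seq_len(n), ceiling(seq_len(n) / 7))

# Pass the list as `segments`; segment.type and length.seg would be ignored.
yarn.cv <- pcr(density ~ NIR, ncomp = 6, data = yarn,
               validation = "CV", segments = my.segments)

# The segments that were used are kept in the validation results.
str(yarn.cv$validation$segments)
```

Any partition of the row indices can be supplied this way, for instance one that keeps replicate measurements in the same segment.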
Otherwise, segments of type segment.type are generated. How many segments to generate is selected by specifying the number of segments in segments, or giving the segment length in length.seg. If both are specified, segments is ignored.
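The three ways of requesting generated segments can be sketched as follows. The values 4 and 6 and the object names are illustrative assumptions; the seed only makes the default "random" segments reproducible.

```r
library(pls)
data(yarn)
set.seed(42)  # segments of type "random" depend on the RNG state

# Choose the number of segments ...
cv.k <- pcr(density ~ NIR, ncomp = 6, data = yarn,
            validation = "CV", segments = 4)

# ... or the segment length ...
cv.len <- pcr(density ~ NIR, ncomp = 6, data = yarn,
              validation = "CV", length.seg = 4)

# ... and when both are given, length.seg takes precedence.
cv.both <- pcr(density ~ NIR, ncomp = 6, data = yarn,
               validation = "CV", segments = 4, length.seg = 4)

length(cv.k$validation$segments)     # number of segments requested
lengths(cv.len$validation$segments)  # sizes of the generated segments
length(cv.both$validation$segments)  # same count as for cv.len
```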
If jackknife is TRUE, jackknifed regression coefficients are returned, which can be used for variance estimation (var.jack) or hypothesis testing (jack.test).
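A short sketch of the jackknife workflow, assuming a PLSR model on the yarn data with 3 components and 7 segments (both numbers are chosen only for illustration):

```r
library(pls)
data(yarn)
set.seed(42)

# Request jackknifed coefficients together with the cross-validation.
yarn.jack <- plsr(density ~ NIR, ncomp = 3, data = yarn,
                  validation = "CV", segments = 7, jackknife = TRUE)

# Array of jackknifed coefficients:
# predictors x responses x components x segments
dim(yarn.jack$validation$coefficients)

# Jackknife variance estimates and approximate t tests
# for the coefficients of the 3-component model.
jk.var  <- var.jack(yarn.jack, ncomp = 3)
jk.test <- jack.test(yarn.jack, ncomp = 3)
```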
X and Y do not need to be centered.
Note that this function cannot be used in situations where \(X\) needs to be recalculated for each segment (except for scaling by the standard deviation), for instance with msc or other preprocessing. For such models, use the more general (but slower) function crossval.
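For a model with msc preprocessing, the crossval route might look like the sketch below. The model, the 7 segments and the object names are assumptions made for the example; trace = FALSE only suppresses progress output.

```r
library(pls)
data(yarn)
set.seed(42)

# Fit without built-in validation; msc() is applied inside the formula.
yarn.msc <- plsr(density ~ msc(NIR), ncomp = 6, data = yarn)

# crossval() cross-validates the fitted model, preprocessing included.
yarn.msc.cv <- crossval(yarn.msc, segments = 7, trace = FALSE)

# The returned object carries a validation component like that of mvrCv.
RMSEP(yarn.msc.cv)
```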
Also note that if needed, the function will silently(!) reduce ncomp to the maximal number of components that can be cross-validated, which is \(n - l - 1\), where \(n\) is the number of observations and \(l\) is the length of the longest segment. The (possibly reduced) number of components is returned as the component ncomp.
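The limit is simple arithmetic and can be checked before fitting. The sketch below uses base R only; the 28 observations, the segment length 7 and the requested 25 components are assumed numbers for illustration.

```r
# Illustrative numbers (assumed): 28 observations,
# consecutive segments of length 7.
n <- 28
segs <- split(seq_len(n), ceiling(seq_len(n) / 7))

l <- max(lengths(segs))   # length of the longest segment
max.ncomp <- n - l - 1    # largest number of components that can be cross-validated

requested <- 25
used <- min(requested, max.ncomp)  # what the cross-validation would actually use
used
```

Comparing the requested value with the returned ncomp component after a fit is a cheap way to notice such a reduction.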
By default, the cross-validation will be performed serially. However, it can be done in parallel using functionality in the parallel package by setting the option parallel in pls.options. See pls.options for the different ways to specify the parallelism.
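One possible setup, sketched under the assumption that a two-worker cluster from the parallel package is available on the machine, passes a cluster object as the parallel option. Consecutive segments are used so that the serial and parallel runs can be compared directly; the worker count and object names are illustrative.

```r
library(pls)
library(parallel)
data(yarn)

# Serial reference run with deterministic (consecutive) segments.
fit.serial <- pcr(density ~ NIR, ncomp = 6, data = yarn, validation = "CV",
                  segments = 4, segment.type = "consecutive")

# Assumed setup: a local cluster with two workers; adjust to the machine.
cl <- makeCluster(2)
pls.options(parallel = cl)

fit.par <- pcr(density ~ NIR, ncomp = 6, data = yarn, validation = "CV",
               segments = 4, segment.type = "consecutive")

# Clean up and switch back to serial cross-validation.
stopCluster(cl)
pls.options(parallel = 1)

# Both runs should give the same PRESS values.
all.equal(fit.serial$validation$PRESS, fit.par$validation$PRESS)
```

For small data sets such as this one the cluster start-up time may outweigh any gain, so the sketch is about the mechanics rather than speed.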
mvrCv(
  X,
  Y,
  ncomp,
  Y.add = NULL,
  weights = NULL,
  method = pls.options()$mvralg,
  scale = FALSE,
  segments = 10,
  segment.type = c("random", "consecutive", "interleaved"),
  length.seg,
  jackknife = FALSE,
  trace = FALSE,
  ...
)
A list with the following components:
method: equals "CV" for cross-validation.
pred: an array with the cross-validated predictions.
coefficients: (only if jackknife is TRUE) an array with the jackknifed regression coefficients. The dimensions correspond to the predictors, responses, number of components, and segments, respectively.
PRESS0: a vector of PRESS values (one for each response variable) for a model with zero components, i.e., only the intercept.
PRESS: a matrix of PRESS values for models with 1, ..., ncomp components. Each row corresponds to one response variable.
adj: a matrix of adjustment values for calculating bias corrected MSEP. MSEP uses this.
segments: the list of segments used in the cross-validation.
ncomp: the actual number of components used.
If method cppls is used, gamma values for the powers of each CV segment are also returned.
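When the cross-validation is run through one of the generics, these results can be inspected on the fitted object. The sketch below assumes they are stored in its validation component, as in the pls fits used elsewhere on this page; the model and the 7 segments are illustrative.

```r
library(pls)
data(yarn)
set.seed(42)

yarn.cv <- pcr(density ~ NIR, ncomp = 6, data = yarn,
               validation = "CV", segments = 7)

val <- yarn.cv$validation
names(val)      # components returned by the cross-validation

dim(val$pred)   # observations x responses x components
val$PRESS0      # PRESS of the intercept-only model
val$PRESS       # one row per response, one column per number of components

# PRESS divided by the number of observations gives a
# mean squared prediction error per model size.
val$PRESS / nrow(yarn)

val$ncomp       # number of components actually cross-validated
```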
X: a matrix of observations. NAs and Infs are not allowed.
Y: a vector or matrix of responses. NAs and Infs are not allowed.
ncomp: the number of components to be used in the modelling.
Y.add: a vector or matrix of additional responses containing relevant information about the observations. Only used for cppls.
weights: a vector of individual weights for the observations. Only used for cppls. (Optional)
method: the multivariate regression method to be used.
scale: logical. If TRUE, the learning \(X\) data for each segment is scaled by dividing each variable by its sample standard deviation. The prediction data is scaled by the same amount.
segments: the number of segments to use, or a list with segments (see the details on segments).
segment.type: the type of segments to use. Ignored if segments is a list.
length.seg: Positive integer. The length of the segments to use. If specified, it overrides segments unless segments is a list.
jackknife: logical. Whether jackknifing of regression coefficients should be performed.
trace: logical; if TRUE, the segment number is printed for each segment.
...: additional arguments, sent to the underlying fit function.
Ron Wehrens and Bjørn-Helge Mevik
Mevik, B.-H., Cederkvist, H. R. (2004) Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics, 18(9), 422-429.
mvr
crossval
cvsegments
MSEP
var.jack
jack.test
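library(pls)
data(yarn)
# 10-fold cross-validated PCR with up to 6 components
yarn.pcr <- pcr(density ~ NIR, 6, data = yarn, validation = "CV", segments = 10)
# Not run: plot the cross-validated MSEP curves
if (FALSE) plot(MSEP(yarn.pcr))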