One of five filter methods can be chosen for repeated shaving of
a certain percentage of the worst performing variables. Performance of the
reduced models is stored and viewable through print and plot methods.
shaving(
y,
X,
ncomp = 10,
method = c("SR", "VIP", "sMC", "LW", "RC"),
prop = 0.2,
min.left = 2,
comp.type = c("CV", "max"),
validation = c("CV", 1),
fixed = integer(0),
newy = NULL,
newX = NULL,
segments = 10,
plsType = "plsr",
Y.add = NULL,
...
)
# S3 method for shaved
plot(x, y, what = c("error", "spectra"), index = "min", log = "x", ...)
# S3 method for shaved
print(x, ...)
Returns a list object of class shaved containing the method type,
the error, number of components, and number of variables per reduced model. It
also contains a list of all reduced variable sets plus the original data.
y: vector of response values (numeric or factor).
X: numeric predictor matrix.
ncomp: integer number of components (default = 10).
method: filter method, i.e. "SR", "VIP", "sMC", "LW" or "RC", given as character.
prop: proportion of variables to be removed in each iteration (numeric).
min.left: minimum number of remaining variables.
comp.type: use the number of components chosen by cross-validation, "CV", or a fixed number, "max".
validation: type of validation for plsr. The default is "CV". If more than one set of CV segments is wanted, use a vector of length two, e.g. c("CV", 5).
fixed: vector of indices for compulsory/fixed variables that should always be included in the modelling.
newy: validation response for RMSEP/error computations.
newX: validation predictors for RMSEP/error computations.
segments: see mvr for documentation of segment choices.
plsType: type of PLS model, "plsr" or "cppls".
Y.add: additional response for CPPLS, see plsType.
...: additional arguments for plsr or cvsegments.
x: object of class shaved for plotting or printing.
what: plot type. Default = "error". Alternative = "spectra".
index: which iteration to plot. Default = "min", corresponding to minimum RMSEP.
log: logarithmic x (default) or y scale.
Kristian Hovde Liland
Variables are first sorted with respect to some importance measure; usually one of the filter measures described above is used. Secondly, a threshold is used to eliminate a subset of the least informative variables. Then a model is fitted again to the remaining variables and performance is measured. The procedure is repeated until maximum model performance is achieved.
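The loop described above can be sketched in base R. This is only an illustration under stated assumptions, not the plsVarSel implementation: `shave_sketch` is a hypothetical helper, ordinary least squares stands in for PLS, the absolute regression coefficient stands in for the filter measure, a simple odd/even hold-out split stands in for cross-validation, and rounding the number of removed variables up with `ceiling` is an assumption.

```r
# Simulated data: only variables 1 and 2 carry signal.
set.seed(1)
n <- 80; p <- 20
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] - 2 * X[, 2] + rnorm(n, sd = 0.5)

# Hypothetical helper: sort by importance, drop the worst share, refit, repeat.
shave_sketch <- function(y, X, prop = 0.2, min.left = 2) {
  train <- seq_len(nrow(X)) %% 2 == 1      # odd rows fit the model, even rows test it
  keep  <- seq_len(ncol(X))                # indices of variables still in the model
  sets  <- list()
  err   <- numeric(0)
  repeat {
    Xtr <- cbind(1, X[train, keep, drop = FALSE])
    Xte <- cbind(1, X[!train, keep, drop = FALSE])
    b   <- qr.solve(Xtr, y[train])         # least-squares fit (stand-in for PLS)
    err <- c(err, sqrt(mean((y[!train] - Xte %*% b)^2)))  # hold-out RMSE
    sets[[length(sets) + 1]] <- keep
    # Remove a share `prop` of the variables, never going below `min.left`.
    n_drop <- min(ceiling(prop * length(keep)), length(keep) - min.left)
    if (n_drop < 1) break
    imp  <- abs(b[-1])                     # stand-in importance: |coefficient|
    keep <- keep[-order(imp)[seq_len(n_drop)]]
  }
  list(variables = sets, error = err, nvar = lengths(sets))
}

res  <- shave_sketch(y, X)
best <- which.min(res$error)               # iteration with the lowest hold-out error
res$nvar                                   # variables left in each reduced model
res$variables[[best]]                      # variable set of the best reduced model
```

Storing every reduced set together with its error mirrors the idea of inspecting all iterations afterwards and picking the one with minimum error, rather than stopping at the first increase.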
VIP (SR/sMC/LW/RC), filterPLSR, shaving, stpls, truncation, bve_pls, ga_pls, ipw_pls, mcuve_pls, rep_pls, spa_pls, lda_from_pls, lda_from_pls_cv, setDA.
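library(plsVarSel)
# NIR data shipped with the pls package
data(mayonnaise, package = "pls")
# Shave variables from the MSC-corrected spectra; type = "interleaved" is passed on through ...
sh <- shaving(mayonnaise$design[,1], pls::msc(mayonnaise$NIR), type = "interleaved")
# Stack the two plot types in one figure, then restore the graphics settings
pars <- par(mfrow = c(2,1), mar = c(4,4,1,1))
plot(sh)
plot(sh, what = "spectra")
par(pars)
print(sh)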
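The example above passes type = "interleaved", which selects interleaved cross-validation segments. As a rough illustration of what such segments look like, the base-R sketch below assigns observation i to segment ((i - 1) mod k) + 1. `interleaved_segments` is a hypothetical helper written for this note, not a function from pls or plsVarSel.

```r
# Hypothetical helper, base R only: split 1..N into k interleaved segments.
interleaved_segments <- function(N, k) {
  unname(split(seq_len(N), rep_len(seq_len(k), N)))
}

segs <- interleaved_segments(N = 10, k = 3)
segs[[1]]  # 1 4 7 10
segs[[2]]  # 2 5 8
segs[[3]]  # 3 6 9
```

Interleaving spreads each segment across the whole sample order, which can matter when neighbouring samples are similar, e.g. replicates measured one after another.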