Allows the user to select the optimal number of components for a PLS model
# S3 method for pls
selectCompNum(obj, ncomp = NULL, selcrit = obj$ncomp.selcrit, ...)
the same model with selected number of components
obj: PLS model (object of class pls)
ncomp: number of components to select
selcrit: criterion for selecting the optimal number of components ('min' for the first local minimum of RMSECV, 'wold' for Wold's rule)
...: other parameters, if any
The method sets the ncomp.selected parameter of the model and returns the model. The parameter indicates the optimal number of components in the model. You can either specify it manually, via the ncomp argument, or use one of the algorithms for automatic selection.
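A minimal usage sketch of both modes follows. It assumes the mdatools package and its bundled people dataset are installed; the dataset, the choice of the fourth column as response, and the argument values are illustrative assumptions, not part of this page.

```r
# Sketch only: runs when the 'mdatools' package (with its 'people' data) is available.
if (requireNamespace("mdatools", quietly = TRUE)) {
  library(mdatools)
  data(people)

  x <- people[, -4]                 # predictors: all columns except the 4th
  y <- people[, 4, drop = FALSE]    # response: the 4th column

  # Calibrate a PLS model with full cross-validation
  m <- pls(x, y, ncomp = 5, scale = TRUE, cv = 1)

  # Manual choice: set the number of components to 3
  m <- selectCompNum(m, ncomp = 3)
  print(m$ncomp.selected)

  # Automatic choice using Wold's rule
  m <- selectCompNum(m, selcrit = "wold")
  print(m$ncomp.selected)
}
```

Because the method returns the model, the result has to be assigned back, as done above.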
Automatic selection is by default based on cross-validation statistics. If no cross-validation results are found in the model, the method uses test set validation results. If those are not available either, the method uses calibration results and gives a warning, because in this case the selected number of components will lead to an overfitted model.
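The fallback order described above can be sketched in base R. The helper name pick_results and the list element names cv, test and cal are assumptions made for this illustration and do not necessarily match the internal structure of a pls object.

```r
# Hypothetical helper: 'results' is a plain list with optional elements
# 'cv', 'test' and 'cal' (names assumed for this sketch only).
pick_results <- function(results) {
  if (!is.null(results$cv)) return(results$cv)       # 1. cross-validation
  if (!is.null(results$test)) return(results$test)   # 2. test set validation
  warning("No validation results found; using calibration results, ",
          "which may lead to an overfitted model.")
  results$cal                                        # 3. calibration (last resort)
}

# Made-up RMSE values for illustration; no cross-validation results present
res <- list(cal = c(1.0, 0.6, 0.4), test = c(1.1, 0.8, 0.9))
chosen <- pick_results(res)   # falls back to the test set values
print(chosen)
```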
There are two algorithms for automatic selection to choose between: the first local minimum of RMSE (`selcrit="min"`) or Wold's rule (`selcrit="wold"`).
The first local minimum criterion finds the component, A, at which the prediction error starts rising and selects (A - 1) as the optimal number. Wold's criterion finds the component A that does not reduce the error by at least 5% compared to the previous value and selects (A - 1) as the optimal number.
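The two criteria can be sketched in base R as follows. This is an illustration of the rules as described here, not the package's code: the function names, the min_gain parameter, the choice to apply both rules directly to a vector of RMSE values, and the behaviour when no component triggers the rule (keep all components) are assumptions.

```r
# First local minimum: find the first component A where the error goes up,
# then select A - 1. If the error never rises, keep all components.
select_first_min <- function(rmse) {
  rising <- which(diff(rmse) > 0)   # diff(rmse)[i] > 0: component i + 1 is worse than i
  if (length(rising) == 0) return(length(rmse))
  rising[1]                         # A = rising[1] + 1, so A - 1 = rising[1]
}

# Wold-style rule (assumed form): find the first component A that fails to
# reduce the error by at least 5% relative to component A - 1, select A - 1.
select_wold <- function(rmse, min_gain = 0.05) {
  ratio <- rmse[-1] / rmse[-length(rmse)]   # error at A divided by error at A - 1
  weak <- which(ratio > 1 - min_gain)
  if (length(weak) == 0) return(length(rmse))
  weak[1]
}

rmse <- c(1.00, 0.70, 0.68, 0.69, 0.50)     # made-up values for illustration
n_min <- select_first_min(rmse)   # error first rises at component 4
n_wold <- select_wold(rmse)       # component 3 improves the error by less than 5%
print(c(min = n_min, wold = n_wold))
```

With these values the two rules disagree, which is one reason to inspect the error curve rather than rely on either rule alone.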
If the model is a PLS2 model (has several response variables), the method computes the optimal number of components for each response and returns the smallest value. For example, if 2 components give the smallest error for the first response and 3 components for the second, A = 2 will be selected as the final result.
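The PLS2 case can be illustrated with a small base R sketch. The RMSE matrix is made up, and first_min is a hypothetical helper implementing the first local minimum rule; neither comes from the package.

```r
# Rows are responses, columns are components (made-up RMSE values)
rmse <- rbind(
  y1 = c(1.0, 0.5, 0.6, 0.7),
  y2 = c(1.0, 0.8, 0.6, 0.9)
)

# Hypothetical helper: number of components before the error first rises
first_min <- function(e) {
  up <- which(diff(e) > 0)
  if (length(up) == 0) length(e) else up[1]
}

per_response <- apply(rmse, 1, first_min)   # optimum for each response
final <- min(per_response)                  # the smallest value is selected
print(per_response)
print(final)
```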
Automatic selection is not recommended for real applications. Always investigate your model (via RMSE, the Y-variance plot, and regression coefficients) to make the correct decision.
See examples in help for the pls function.