mixOmics (version 4.1-4)

valid: Compute validation criterion for PLS, sPLS, PLS-DA and sPLS-DA

Description

Function to estimate measures of the prediction error for fitted PLS, sparse PLS, PLS-DA and sparse PLS-DA models. M-fold and leave-one-out cross-validation are implemented.

Usage

## S3 method for class 'pls':
valid(object, criterion = c("all", "MSEP", "R2", "Q2"),
      validation = c("Mfold", "loo"), folds = 10,
      max.iter = 500, tol = 1e-06, ...)

## S3 method for class 'spls':
valid(object, criterion = c("all", "MSEP", "R2", "Q2"),
      validation = c("Mfold", "loo"), folds = 10,
      max.iter = 500, tol = 1e-06, ...)

## S3 method for class 'plsda':
valid(object, method = c("all", "max.dist", "centroids.dist",
                         "mahalanobis.dist"),
      validation = c("Mfold", "loo"), folds = 10,
      max.iter = 500, tol = 1e-06, ...)

## S3 method for class 'splsda':
valid(object, method = c("all", "max.dist", "centroids.dist",
                         "mahalanobis.dist"),
      validation = c("Mfold", "loo"), folds = 10,
      max.iter = 500, tol = 1e-06, ...)

Arguments

object
object of class inheriting from "pls", "plsda", "spls" or "splsda".
criterion
the validation criteria to compute for pls or spls objects. Should be "all" or a subset of "MSEP", "R2" and "Q2". Default is "all".
method
prediction method to be applied for plsda or splsda. Should be a subset of "max.dist", "centroids.dist", "mahalanobis.dist". Default is "all". See
validation
character. What kind of (internal) validation to use, matching one of "Mfold" or "loo" (see below). Default is "Mfold".
folds
the number of folds in the M-fold cross-validation, or a list of vectors of sample indices defining each fold. See Details.
max.iter
integer, the maximum number of iterations.
tol
a non-negative real number, the tolerance used in the iterative algorithm.
...
arguments to pass to nearZeroVar.
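
The Details section states that folds may also be given as a list of index vectors, as produced by split. The following is a minimal base R sketch of building such a list; the sample size n and the fold count M are hypothetical values chosen for illustration, and the commented call at the end only indicates how the list would be passed according to that description.

```r
set.seed(42)                       # reproducible fold assignment
n <- 64                            # hypothetical number of samples
M <- 10                            # hypothetical number of folds

# Shuffle the sample indices, then deal them into M groups of near-equal size.
fold.id <- rep(seq_len(M), length.out = n)
folds <- split(sample(seq_len(n)), fold.id)

lengths(folds)                     # size of each fold

# Assumed usage, following the Details section:
# valid(object, validation = "Mfold", folds = folds)
```

Every sample index appears in exactly one fold, and fold sizes differ by at most one.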

Value

  • For PLS and sPLS models, valid produces a list with the following components:
  • MSEP: mean squared error of prediction for each $Y$ variable.
  • R2: a matrix of $R^2$ values of the $Y$-variables for models with $1, \ldots ,$ncomp components.
  • Q2: if $Y$ contains one variable, a vector of $Q^2$ values; otherwise a list with a matrix of $Q^2$ values for each $Y$-variable and a vector of $Q^2$-total values, for models with $1, \ldots ,$ncomp components.
  • For PLS-DA and sPLS-DA models, valid produces a matrix of classification error rate estimation. The dimensions correspond to the components in the model and to the prediction method used, respectively.
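
To make the cross-validated criteria concrete, here is a hedged base R sketch that computes an MSEP and a $Q^2$-style statistic by M-fold cross-validation. It is not the mixOmics implementation: lm stands in for a PLS fit, the data are simulated, and the $Q^2$-style value is taken as 1 - PRESS/TSS (prediction error sum of squares over total sum of squares), which is an assumption made for illustration.

```r
set.seed(1)
n <- 60
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.5)    # simulated response
dat <- data.frame(x = x, y = y)

M <- 5                             # hypothetical number of folds
folds <- split(sample(seq_len(n)), rep(seq_len(M), length.out = n))

# Predict each fold from a model fitted on the remaining folds.
pred <- numeric(n)
for (test in folds) {
  fit <- lm(y ~ x, data = dat[-test, ])            # lm is a stand-in, not PLS
  pred[test] <- predict(fit, newdata = dat[test, ])
}

press <- sum((dat$y - pred)^2)     # prediction error sum of squares
msep  <- press / n                 # mean squared error of prediction
q2    <- 1 - press / sum((dat$y - mean(dat$y))^2)  # Q2-style statistic

c(MSEP = msep, Q2 = q2)
```

Because every prediction comes from a model that never saw the predicted sample, both values describe predictive rather than fitted accuracy.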

Details

For fitted PLS and sPLS regression models, valid estimates the mean squared error of prediction (MSEP), $R^2$, and $Q^2$ to assess the predictive validity of the model using M-fold or leave-one-out cross-validation. Note that only the classic, regression and invariant modes can be applied.

If validation = "Mfold", M-fold cross-validation is performed. The number of folds is set with folds. The folds can also be supplied as a list of vectors containing the indices that define each fold, as produced by split. If validation = "loo", leave-one-out cross-validation is performed.

For fitted PLS-DA and sPLS-DA models, valid estimates the classification error rate using cross-validation. The number of folds is chosen such that there is at least 1 sample for each class in the test set.
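
The requirement that every test set contain at least one sample per class can be illustrated with stratified fold assignment. The base R sketch below is one possible way to meet that condition, not necessarily how valid builds its folds; the class labels and counts are hypothetical.

```r
set.seed(7)
# Hypothetical class labels; the smallest class has 12 samples.
cls <- factor(rep(c("A", "B", "C", "D"), times = c(23, 29, 12, 18)))
M <- 8                             # must not exceed the smallest class size

# Within each class, shuffle fold labels 1..M and assign them to its samples.
fold.id <- integer(length(cls))
for (k in levels(cls)) {
  idx <- which(cls == k)
  fold.id[idx] <- sample(rep(seq_len(M), length.out = length(idx)))
}
folds <- split(seq_along(cls), fold.id)

# Class counts per fold: no entry is zero when M <= smallest class size.
sapply(folds, function(i) table(cls[i]))
```

If M exceeded the size of the smallest class, some test sets would necessarily miss that class, which is why the number of folds is bounded by the class sizes.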

References

Tenenhaus, M. (1998). La régression PLS: théorie et pratique. Paris: Editions Technip.

Lê Cao, K. A., Rossouw, D., Robert-Granié, C. and Besse, P. (2008). A sparse PLS for variable selection when integrating Omics data. Statistical Applications in Genetics and Molecular Biology 7, article 35.

Mevik, B.-H., Cederkvist, H. R. (2004). Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics 18(9), 422-429.

See Also

predict, nipals, plot.valid and http://www.math.univ-toulouse.fr/~biostat/mixOmics/ for more details.

Examples

library(mixOmics)

## validation for objects of class 'pls' or 'spls'
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic

liver.pls <- pls(X, Y, ncomp = 3)
liver.val <- valid(liver.pls, validation = "Mfold")
				   
plot(liver.val, criterion = "R2", type = "l", layout = c(2, 2))

## validation for objects of class 'plsda' or 'splsda'
data(srbct)
X <- srbct$gene
Y <- srbct$class  

ncomp <- 5
srbct.splsda <- splsda(X, Y, ncomp = ncomp, keepX = rep(10, ncomp))
error <- valid(srbct.splsda, validation = "Mfold", folds = 8, 
               method = "all")

plot(error, type = "l")