splsda: Sparse Partial Least Squares Discriminant Analysis (sPLS-DA)

Description

Performs sparse Partial Least Squares Discriminant Analysis (sPLS-DA) to classify samples (supervised analysis). The sPLS-DA approach also performs variable selection.

Usage

splsda(X, Y, ncomp = 2, keepX = rep(ncol(X), ncomp),
       max.iter = 500, tol = 1e-06, near.zero.var = TRUE)

Arguments

X

numeric matrix of predictors. NAs are allowed.

Y

a factor or a class vector for the discrete outcome.

ncomp

the number of components to include in the model (see Details).

keepX

numeric vector of length ncomp, the number of variables to keep in $X$-loadings. By default all variables are kept in the model.

max.iter

integer, the maximum number of iterations.

tol

a positive real, the tolerance used in the iterative algorithm.

near.zero.var

boolean, see the internal nearZeroVar function (should be set to TRUE in particular for data with many zero values). Setting this argument to FALSE (when appropriate) will speed up the computations.
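
As a rough illustration of what keepX controls, the base R sketch below zeroes all but the k largest (in absolute value) entries of a loading vector. The helper keep_top and the toy values are assumptions made up for this example; it is not the package's internal code, which may select variables differently.

```r
# Hypothetical helper, for illustration only: keep the k entries of a
# loading vector with the largest absolute value and set the rest to zero.
keep_top <- function(loading, k) {
  stopifnot(k >= 1, k <= length(loading))
  keep <- order(abs(loading), decreasing = TRUE)[seq_len(k)]
  sparse <- numeric(length(loading))
  sparse[keep] <- loading[keep]
  sparse
}

# Toy loading vector for 5 variables (made-up values).
loading <- c(0.10, -0.70, 0.05, 0.60, -0.20)
sparse_loading <- keep_top(loading, k = 2)
n_selected <- sum(sparse_loading != 0)

# The documented default, keepX = rep(ncol(X), ncomp), keeps every
# variable on every component (here: 5 variables, 2 components).
X_demo <- matrix(0, nrow = 20, ncol = 5)
default_keepX <- rep(ncol(X_demo), 2)
```

With keepX = c(25, 25), as in the first example of this page, each of the two components would retain 25 variables in its X-loadings.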

Value

splsda returns an object of class "splsda", a list that contains the following components:

X

the centered and standardized original predictor matrix.

Y

the centered and standardized indicator response vector or matrix.

ind.mat

the indicator matrix.

ncomp

the number of components included in the model.

keepX

number of $X$ variables kept in the model on each component.

mat.c

matrix of coefficients to be used internally by predict.

variates

list containing the variates.

loadings

list containing the estimated loadings for the X and Y variates.

names

list containing the names to be used for individuals and variables.

nzv

list containing the zero- or near-zero predictors information.

tol

the tolerance used in the iterative algorithm, used for subsequent S3 methods.

max.iter

the maximum number of iterations, used for subsequent S3 methods.

iter

number of iterations of the algorithm for each component.
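
The tol, max.iter and iter components all relate to an iterative fit. The base R sketch below shows, under stated assumptions, how a tolerance and an iteration cap typically interact: a power iteration that estimates the dominant left singular vector of a small matrix M, which stands in for a cross-product matrix such as t(X) %*% Y. The matrix values and the stopping rule are assumptions chosen for illustration, not the package's internal algorithm.

```r
# Toy 4 x 2 matrix standing in for a cross-product matrix (made-up values).
M <- matrix(c(3, 1, 0, 2,
              0, 1, 2, -1), nrow = 4)

tol <- 1e-06      # stop when the squared change in the vector is below tol
max.iter <- 500   # or when this many iterations have been run

a <- rep(1 / sqrt(nrow(M)), nrow(M))  # unit-norm starting vector
iter <- 0
repeat {
  iter <- iter + 1
  a_new <- drop(M %*% crossprod(M, a))   # one power-iteration step on M M'
  a_new <- a_new / sqrt(sum(a_new^2))    # renormalise to unit length
  converged <- sum((a_new - a)^2) < tol
  a <- a_new
  if (converged || iter >= max.iter) break
}

# Reference solution from a full SVD; the sign of a singular vector is arbitrary.
u1 <- svd(M)$u[, 1]
alignment <- abs(sum(a * u1))
```

A looser tol ends the loop earlier at the cost of a less precise vector; max.iter only matters when convergence is slow.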

Details

The splsda function fits sPLS models with $1, \ldots,$ ncomp components to the factor or class vector Y. The appropriate indicator (dummy) matrix is created from Y.
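
For readers unfamiliar with indicator matrices, here is a minimal base R sketch of how a factor can be expanded into one 0/1 column per class. The names Y_demo and ind_mat_demo are made up for this example, and the construction is only one possible way to build such a matrix; the package may do it differently.

```r
# Toy class vector with three classes (made-up labels).
Y_demo <- factor(c("A", "B", "A", "C", "B"))

# One column per class level: 1 if the sample belongs to that class, else 0.
ind_mat_demo <- sapply(levels(Y_demo), function(lev) as.numeric(Y_demo == lev))
rownames(ind_mat_demo) <- seq_along(Y_demo)

# Each sample belongs to exactly one class, so every row sums to 1.
row_totals <- rowSums(ind_mat_demo)
```

The result has one row per sample and one column per class, which is the shape of response that a PLS-type model can be fitted to.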

References

On sPLS-DA: Le Cao, K.-A., Boitard, S. and Besse, P. (2011). Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12:253.

Examples

## First example
library(mixOmics)
data(breast.tumors)
X <- breast.tumors$gene.exp
# Y will be transformed as a factor in the function,
# but we set it as a factor to set up the colors.
Y <- as.factor(breast.tumors$sample$treatment)

res <- splsda(X, Y, ncomp = 2, keepX = c(25, 25))


# samples are labelled with their treatment class
plotIndiv(res, ind.names = Y, add.legend = TRUE, plot.ellipse = TRUE)


## Second example
data(liver.toxicity)
X <- as.matrix(liver.toxicity$gene)
# Y will be transformed as a factor in the function,
# but we set it as a factor to set up the colors.
Y <- as.factor(liver.toxicity$treatment[, 4])

splsda.liver <- splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# individual names are set to the treatment
plotIndiv(splsda.liver, ind.names = Y, plot.ellipse = TRUE, add.legend = TRUE)