pls: Partial Least Squares (PLS) Regression

Description

Function to perform Partial Least Squares (PLS) regression.

Usage

pls(X, Y, ncomp = 2, 
    mode = c("regression", "canonical", "invariant", "classic"), 
    max.iter = 500, tol = 1e-06, near.zero.var = TRUE)

Arguments

X

numeric matrix of predictors. NAs are allowed.

Y

numeric vector or matrix of responses (for multi-response models). NAs are allowed.

ncomp

the number of components to include in the model. Defaults to 2.

mode

character string. The type of algorithm to use, (partially) matching one of "regression", "canonical", "invariant" or "classic". See Details.

max.iter

integer, the maximum number of iterations.

tol

a non-negative real number, the tolerance used in the iterative algorithm.

near.zero.var

boolean, see the internal nearZeroVar function (should be set to TRUE in particular for data with many zero values). Setting this argument to FALSE (when appropriate) will speed up the computations.
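The max.iter and tol arguments bound an iterative loop. The following base R sketch shows how such a loop might look for a single component, using a NIPALS-type iteration. It is an illustration under stated assumptions, not the code used by pls: the function name pls_component, the convergence criterion (squared change in the X loadings) and the normalization of both loading vectors are all assumptions.

```r
# Hypothetical helper: first PLS component by a NIPALS-type iteration.
# Not the package implementation; the stopping rule is an assumption.
pls_component <- function(X, Y, max.iter = 500, tol = 1e-06) {
  X <- scale(as.matrix(X))            # center and standardize the predictors
  Y <- scale(as.matrix(Y))            # center and standardize the responses
  u_y <- Y[, 1]                       # start from the first response column
  a_old <- rep(0, ncol(X))
  for (iter in seq_len(max.iter)) {
    a <- crossprod(X, u_y)            # X loadings
    a <- a / sqrt(sum(a^2))
    t_x <- X %*% a                    # X variate
    b <- crossprod(Y, t_x)            # Y loadings
    b <- b / sqrt(sum(b^2))
    u_y <- Y %*% b                    # Y variate
    if (sum((a - a_old)^2) < tol) break   # stop once the loadings stabilize
    a_old <- a
  }
  list(loadings.X = drop(a), loadings.Y = drop(b),
       variate.X = drop(t_x), variate.Y = drop(u_y), iter = iter)
}

# Small simulated data set, for illustration only
set.seed(1)
X_sim <- matrix(rnorm(60), nrow = 20, ncol = 3)
Y_sim <- matrix(rnorm(40), nrow = 20, ncol = 2)
fit_sim <- pls_component(X_sim, Y_sim)
fit_sim$iter                          # iterations used, at most max.iter
```

A smaller tol asks for a more stable solution at the cost of more iterations; max.iter caps the loop if the criterion is never met.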

Value

pls returns an object of class "pls", a list that contains the following components:

X

the centered and standardized original predictor matrix.

Y

the centered and standardized original response vector or matrix.

ncomp

the number of components included in the model.

mode

the algorithm used to fit the model.

mat.c

matrix of coefficients to be used internally by predict.

variates

list containing the $X$ and $Y$ variates.

loadings

list containing the estimated loadings for the variates.

names

list containing the names to be used for individuals and variables.

nzv

list containing information on the zero- or near-zero-variance predictors.

tol

the tolerance used in the iterative algorithm, kept for subsequent S3 methods.

max.iter

the maximum number of iterations, kept for subsequent S3 methods.

iter

the number of iterations of the algorithm for each component.

Details

The pls function fits PLS models with $1, \ldots,$ ncomp components. Multi-response models are fully supported. The X and Y datasets can contain missing values.

The type of algorithm to use is specified with the mode argument. Four PLS algorithms are available: PLS regression ("regression"), PLS canonical analysis ("canonical"), redundancy analysis ("invariant") and the classical PLS algorithm ("classic") (see References).
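Because mode is matched partially, an unambiguous prefix is enough to select an algorithm. The base R sketch below shows that behaviour with match.arg. Whether pls uses match.arg internally is an assumption, and resolve_mode is a hypothetical helper written only for this illustration.

```r
# Hypothetical helper mirroring the mode argument of pls.
resolve_mode <- function(mode = c("regression", "canonical",
                                  "invariant", "classic")) {
  match.arg(mode)   # partial matching against the allowed choices
}

resolve_mode()          # no value supplied: the first choice is used
resolve_mode("canon")   # unambiguous prefix of "canonical"
resolve_mode("inv")     # unambiguous prefix of "invariant"

# "c" is a prefix of both "canonical" and "classic", so it is rejected
ambiguous <- tryCatch(resolve_mode("c"), error = function(e) "error")
ambiguous
```

Spelling the mode out in full avoids any ambiguity and keeps scripts readable.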

Missing values can be estimated beforehand by reconstituting the data matrix with the nipals function. Otherwise, the pls function handles them itself by casewise deletion: missing entries are left out of the computations, so rows with missing data do not have to be removed.
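One common way for NIPALS-type algorithms to leave out missing entries is to compute each score from the observed cells only and to rescale by the loadings that were actually used. The base R sketch below illustrates that rule. It is an assumption about the general technique, not a description of the pls internals, and score_skip_na is a hypothetical helper.

```r
# Hypothetical helper: scores computed from observed entries only.
# For each row, missing cells are dropped and the result is rescaled
# by the squared loadings of the cells that were present.
score_skip_na <- function(X, a) {
  apply(X, 1, function(x) {
    obs <- !is.na(x)
    sum(x[obs] * a[obs]) / sum(a[obs]^2)
  })
}

X_na <- matrix(c(1,  2, 3,
                 4, NA, 6), nrow = 2, byrow = TRUE)
a <- c(2, 1, 2) / 3                  # unit-norm loading vector

scores <- score_skip_na(X_na, a)
scores                               # one score per row, none of them NA
```

For a complete row and a unit-norm loading vector the rule reduces to the usual product of the row with the loadings, so complete and incomplete rows are treated consistently.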

References

Tenenhaus, M. (1998). La régression PLS: théorie et pratique. Paris: Éditions Technip.

Wold, H. (1966). Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. R. (editor), Multivariate Analysis. Academic Press, N.Y., 391-420.

Examples

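library(mixOmics)  # provides pls() and the example data sets

# Linnerud data: exercise variables as predictors, physiological variables as responses
data(linnerud)
X <- linnerud$exercise
Y <- linnerud$physiological
# Classical PLS algorithm with the default two components
linn.pls <- pls(X, Y, mode = "classic")

# Liver toxicity data: gene expression as predictors, clinical measurements as responses
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic

# Default "regression" mode with three components
toxicity.pls <- pls(X, Y, ncomp = 3)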
