The function pls.regression performs PLS multivariate regression (with several response variables and several predictor variables) using de Jong's SIMPLS algorithm. This function is an adaptation of R. Wehrens' code from the package pls.pcr.
pls.regression(Xtrain, Ytrain, Xtest=NULL, ncomp=NULL, unit.weights=TRUE)
A list with the following components:
the (p x m x length(ncomp)) array containing the regression coefficients. Each row corresponds to a predictor variable and each column to a response variable. The third dimension of B corresponds to the number of PLS components used to compute the regression coefficients. If ncomp has length 1, B is just a (p x m) matrix.
the (ntest x m x length(ncomp)) array containing the predicted values of the response variables for the observations from Xtest. The third dimension of Ypred corresponds to the number of PLS components used to compute the regression coefficients.
the (p x max(ncomp)) matrix containing the X-loadings.
the (m x max(ncomp)) matrix containing the Y-loadings.
the (ntrain x max(ncomp)) matrix containing the X-scores (latent components).
the (p x max(ncomp)) matrix containing the weights used to construct the latent components.
the p-vector containing the means of the columns of Xtrain.
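The layout of the three-dimensional components described above can be illustrated with a small sketch. The dimensions and the zero-filled placeholder arrays below are hypothetical and are not real output of pls.regression; they only show how the slice for a given number of components would be selected, assuming the documented shapes.

```r
# Hypothetical dimensions, chosen only for illustration
p <- 4       # number of predictor variables
m <- 2       # number of response variables
ntest <- 3   # number of test observations
ncomp <- c(1, 2, 3)

# Placeholder arrays with the documented shapes (values are NOT real PLS output)
B <- array(0, dim = c(p, m, length(ncomp)))
Ypred <- array(0, dim = c(ntest, m, length(ncomp)))

# Select the slice that corresponds to the model built with 2 components
k <- which(ncomp == 2)
B_2 <- B[, , k]          # (p x m) coefficient matrix
Ypred_2 <- Ypred[, , k]  # (ntest x m) matrix of predictions

dim(B_2)
dim(Ypred_2)
```

Indexing by position in ncomp, rather than by the number of components itself, matters when ncomp does not start at 1, for example ncomp = c(2, 5).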
a (ntrain x p) data matrix of predictors. Xtrain may be a matrix or a data frame. Each row corresponds to an observation and each column to a predictor variable.
a (ntrain x m) data matrix of responses. Ytrain may be a vector (if m=1), a matrix or a data frame. If Ytrain is a matrix or a data frame, each row corresponds to an observation and each column to a response variable. If Ytrain is a vector, it contains the value of the single response variable for each observation.
a (ntest x p) matrix containing the predictors for the test data set. Xtest may also be a vector of length p (corresponding to only one test observation).
the number of latent components to be used for regression. If ncomp is a vector of integers, the regression model is built successively with each number of components. If ncomp=NULL, the maximal number of components min(ntrain,p) is chosen.
if unit.weights=TRUE, the latent components are constructed from weight vectors that are standardized to length 1. Otherwise, the weight vectors do not have length 1 but the latent components have norm 1.
Anne-Laure Boulesteix (https://www.ibe.med.uni-muenchen.de/mitarbeiter/professoren/boulesteix/index.html) and Korbinian Strimmer (https://strimmerlab.github.io/korbinian.html).
Adapted in part from pls.pcr code by R. Wehrens (in a former version of the 'pls' package https://CRAN.R-project.org/package=pls).
The columns of the data matrices Xtrain and Ytrain must not be centered to have mean zero, since centering is performed by the function pls.regression as a preliminary step before the SIMPLS algorithm is run.
In the original definition of SIMPLS by de Jong (1993), the weight vectors have length 1. If the weight vectors are standardized to have length 1, they satisfy a simple optimality criterion (de Jong, 1993). However, it is also usual (and computationally efficient) to standardize the latent components to have length 1.
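The two normalizations can be compared for the first latent component with a short base-R sketch. This is not the package's code: the data are simulated, all variable names are hypothetical, and only the first SIMPLS component is shown, using the fact that the first weight vector is the dominant left singular vector of the cross-product matrix of the centered X and Y.

```r
set.seed(1)
ntrain <- 20; p <- 5; m <- 2

# Simulated data, for illustration only
X <- matrix(rnorm(ntrain * p), ntrain, p)
Y <- matrix(rnorm(ntrain * m), ntrain, m)

# Center the columns of X and Y
X0 <- scale(X, center = TRUE, scale = FALSE)
Y0 <- scale(Y, center = TRUE, scale = FALSE)

# First weight vector: dominant left singular vector of S = X0' Y0
S <- crossprod(X0, Y0)
r <- svd(S)$u[, 1]

# Option 1: weight vector standardized to length 1
r_unit <- r / sqrt(sum(r^2))
t_from_unit_r <- X0 %*% r_unit

# Option 2: latent component standardized to norm 1;
# the weight vector is rescaled by the same factor
t_raw <- X0 %*% r
t_norm <- sqrt(sum(t_raw^2))
t_unit <- t_raw / t_norm
r_scaled <- r / t_norm

sum(r_unit^2)   # squared length of the weight vector in option 1
sum(t_unit^2)   # squared norm of the latent component in option 2
```

In this sketch both options give latent components that point in the same direction; only the scale of the weight vector and of the component differs.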
In contrast to the original version found in the package pls.pcr, the prediction for the observations from Xtest is performed after centering the columns of Xtest by subtracting the column means calculated from Xtrain.
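The centering step can be sketched in base R as follows. The data are simulated, the variable names are hypothetical, and the coefficient matrix is obtained by ordinary least squares on the centered training data purely as a stand-in for PLS coefficients. Adding the training mean of Y back to the predictions is an assumption of this sketch, not something stated above.

```r
set.seed(2)
ntrain <- 30; ntest <- 5; p <- 3

# Simulated training and test data, for illustration only
Xtrain <- matrix(rnorm(ntrain * p, mean = 10), ntrain, p)
Ytrain <- Xtrain %*% c(1, -2, 0.5) + rnorm(ntrain, sd = 0.1)
Xtest <- matrix(rnorm(ntest * p, mean = 10), ntest, p)

# Column means are computed from the training data only
meanX <- colMeans(Xtrain)
meanY <- colMeans(Ytrain)

# Center the training data
Xc <- sweep(Xtrain, 2, meanX)
Yc <- sweep(Ytrain, 2, meanY)

# Stand-in coefficients: least squares on centered data (NOT a PLS fit)
B <- solve(crossprod(Xc), crossprod(Xc, Yc))

# Center Xtest with the TRAINING means, then predict
Xtest_c <- sweep(Xtest, 2, meanX)
Ypred <- sweep(Xtest_c %*% B, 2, meanY, "+")  # assumption: mean of Y is added back

Ypred
```

Centering Xtest with its own column means instead would shift every prediction by an amount that depends on the test sample, which is why the training means are reused.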
S. de Jong (1993). SIMPLS: an alternative approach to partial least squares regression, Chemometrics Intell. Lab. Syst. 18, 251--263.
C. J. F. ter Braak and S. de Jong (1998). The objective function of partial least squares regression, Journal of Chemometrics 12, 41--54.
pls.lda, TFA.estimate, pls.regression.cv.
# load plsgenomics library
library(plsgenomics)
# load the Ecoli data
data(Ecoli)
# perform pls regression
# with unit latent components
pls.regression(Xtrain=Ecoli$CONNECdata, Ytrain=Ecoli$GEdata, Xtest=Ecoli$CONNECdata,
               ncomp=1:3, unit.weights=FALSE)
# with unit weight vectors
pls.regression(Xtrain=Ecoli$CONNECdata, Ytrain=Ecoli$GEdata, Xtest=Ecoli$CONNECdata,
               ncomp=1:3, unit.weights=TRUE)