Algorithm fitting a multi-block PLS1 or PLS2 model between the explanatory variables \(Xlist\) and the responses \(Y\), based on the "Improved kernel algorithm #1" proposed by Dayal and MacGregor (1997).
For weighted versions, see for instance Schaal et al. 2002, Sicard & Sabatier 2006, Kim et al. 2011 and Lesnoff et al. 2020.
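For intuition, a rough base-R sketch of the unweighted, single-block (PLS1) form of the improved kernel algorithm #1 is given below. It is illustrative only and is not the multi-block implementation used by mbplsr (no block structure, no weights, no scaling options); the function and object names are made up.

## Rough PLS1 sketch of the improved kernel algorithm #1 (illustrative only):
## only the cross-product t(X) %*% y is deflated, never X or y.
pls1_kernel1 <- function(X, y, nlv) {
  X <- scale(X, center = TRUE, scale = FALSE)  # mean-centering only
  y <- y - mean(y)
  p <- ncol(X)
  Xty <- crossprod(X, y)                       # (p, 1) cross-product
  R <- P <- matrix(0, p, nlv)                  # projection (R) and X-loading (P) matrices
  Tmat <- matrix(0, nrow(X), nlv)              # X-score matrix
  q <- numeric(nlv)                            # y-loadings
  for (a in seq_len(nlv)) {
    w <- Xty / sqrt(sum(Xty^2))                # loading weights (PLS1 case)
    r <- w
    if (a > 1)
      for (j in 1:(a - 1)) r <- r - sum(P[, j] * w) * R[, j]
    tsc <- X %*% r                             # scores of component a
    tt <- sum(tsc^2)
    P[, a] <- crossprod(X, tsc) / tt
    q[a] <- sum(Xty * r) / tt
    Xty <- Xty - P[, a] * q[a] * tt            # deflate the cross-product only
    R[, a] <- r
    Tmat[, a] <- tsc
  }
  list(T = Tmat, P = P, R = R, q = q, b = R %*% q)  # b: coefficients for centered data
}
fit <- pls1_kernel1(matrix(rnorm(100), 20, 5), rnorm(20), nlv = 2)
fit$b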
Auxiliary functions
transform
Calculates the LVs for any new \(Xlist\)-data \(X\) from the model.
summary
Returns summary information for the model.
coef
Calculates b-coefficients from the model, adjusted for raw data.
predict
Calculates the predictions for any new \(Xlist\)-data \(X\) from the model.
mbplsr(Xlist, Y, blockscaling = TRUE, weights = NULL, nlv,
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

# S3 method for Mbplsr
transform(object, X, ..., nlv = NULL)
# S3 method for Mbplsr
summary(object, X, ...)
# S3 method for Mbplsr
coef(object, ..., nlv = NULL)
# S3 method for Mbplsr
predict(object, X, ..., nlv = NULL)
A list of outputs, such as
The X-score matrix (\(n, nlv\)).
The X-loadings matrix (\(p, nlv\)).
The X-loading weights matrix (\(p, nlv\)).
The Y-loading weights matrix (C = t(Beta), where Beta is the scores regression coefficients matrix).
The PLS projection matrix (\(p, nlv\)).
The list of centering vectors of \(Xlist\).
The centering vector of \(Y\) (\(q, 1\)).
The list of \(Xlist\) variable standard deviations.
The vector of \(Y\) variable standard deviations (\(q, 1\)).
Weights applied to the training observations.
The X-score normalization factor.
The block scaling option.
The "norm" of each block, i.e. the square root of the sum of the variances of each column of the block, computed on the training data and used as the scaling factor (see the sketch after the argument descriptions below).
Intermediate output.
For transform.Mbplsr: X-scores matrix for the new Xlist-data.
For summary.Mbplsr: matrix of explained variances.
For coef.Mbplsr: matrix \((1, q)\) with the intercepts and matrix \((p, q)\) with the b-coefficients.
For predict.Mbplsr: a list of matrices \((m, q)\) with the \(Y\) predicted values for the new Xlist-data.
For the main function: list of training X-data (\(n\) rows).
For the auxiliary functions: list of new X-data, with the same variables as the training X-data.
Training Y-data (\(n, q\)).
logical. If TRUE, each block is scaled: the scaling factor (computed on the training data) is the "norm" of the block, i.e. the square root of the sum of the variances of each column of the block (see the sketch after these argument descriptions).
Weights (\(n, 1\)) to apply to the training observations. Internally, weights are "normalized" to sum to 1. Defaults to NULL (weights are set to \(1 / n\)).
For the main function: The number(s) of LVs to calculate. --- For the auxiliary functions: The number(s) of LVs to consider.
Character vector (of length equal to the number of blocks in \(Xlist\)) giving the variable scaling for each block, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling) and "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
Character. Variable scaling for the \(Y\)-block, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling) and "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
For the auxiliary functions: A fitted model, output of a call to the main functions.
For the auxiliary functions: Optional arguments. Not used.
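To make the scaling factors concrete, here is a minimal base-R sketch (illustrative only, not the package code; the object names are made up) of the quantities described above: the uncorrected standard deviation used by the "sd" and "pareto" options, and the block "norm" used when blockscaling = TRUE. Whether the block norm uses corrected or uncorrected column variances is not stated here, so the use of var() below is an assumption.

## Illustrative sketch of the scaling factors described above (object names made up)
x <- rnorm(20)                                 # one column of a block
sd_uncorrected <- sqrt(mean((x - mean(x))^2))  # "sd": divide the centered column by the uncorrected sd
pareto_factor  <- sqrt(sd_uncorrected)         # "pareto": divide by the square root of the sd

block <- matrix(rnorm(20 * 3), ncol = 3)       # one training block (3 variables)
block_norm <- sqrt(sum(apply(block, 2, var)))  # block "norm": sqrt of the sum of the column variances
block_scaled <- block / block_norm             # applied to the block when blockscaling = TRUE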
Andersson, M., 2009. A comparison of nine PLS1 algorithms. Journal of Chemometrics 23, 518-529.
Dayal, B.S., MacGregor, J.F., 1997. Improved PLS algorithms. Journal of Chemometrics 11, 73-85.
Hoskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics 2, 211-228. https://doi.org/10.1002/cem.1180020306
Kim, S., Kano, M., Nakagawa, H., Hasebe, S., 2011. Estimation of active pharmaceutical ingredients content using locally weighted partial least squares and statistical wavelength selection. Int. J. Pharm., 421, 269-274.
Lesnoff, M., Metz, M., Roger, J.M., 2020. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR Data. Journal of Chemometrics. e3209. https://onlinelibrary.wiley.com/doi/abs/10.1002/cem.3209
Rannar, S., Lindgren, F., Geladi, P., Wold, S., 1994. A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm. Journal of Chemometrics 8, 111-125. https://doi.org/10.1002/cem.1180080204
Schaal, S., Atkeson, C., Vijayakumar, S., 2002. Scalable techniques from nonparametric statistics for real time robot learning. Applied Intell., 17, 49-60.
Sicard, E., Sabatier, R., 2006. Theoretical framework for local PLS1 regression and application to a rainfall data set. Comput. Stat. Data Anal., 51, 1393-1410.
Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
Wold, S., Sjostrom, M., Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst., 58, 109-130.
mbplsr_mbplsda_allsteps
Function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters, and predict new observations.
## Simulate a small training set (n observations, p variables) and a test set (m observations)
n <- 10 ; p <- 10
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
m <- 2
Xtest <- matrix(rnorm(m * p), ncol = p)
colnames(Xtrain) <- colnames(Xtest) <- paste("v", 1:p, sep = "")
Xtrain
Xtest

## Split the variables into blocks to build the Xlist data
blocks <- list(1:2, 4, 6:8)
X1 <- mblocks(Xtrain, blocks = blocks)
X2 <- mblocks(Xtest, blocks = blocks)

## Fit the multi-block PLSR model with 3 LVs
nlv <- 3
fm <- mbplsr(Xlist = X1, Y = ytrain, Xscaling = c("sd", "none", "none"),
  blockscaling = TRUE, weights = NULL, nlv = nlv)

## Explained variances, b-coefficients, scores, and predictions for the new data
summary(fm, X1)
coef(fm)
transform(fm, X2)
predict(fm, X2)
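A possible extension of this example (a sketch relying only on the arguments documented above; whether predict accepts a vector of LV numbers is an assumption based on the "number(s) of LVs" wording):

## Weighted fit: weights are "normalized" internally to sum to 1
w <- runif(n)
fmw <- mbplsr(Xlist = X1, Y = ytrain, blockscaling = TRUE, weights = w, nlv = nlv)
predict(fmw, X2)

## Predictions for several numbers of LVs (assuming a vector is accepted)
predict(fm, X2, nlv = 1:2)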