
rchemo (version 0.1-3)

plsrda: PLSDA models

Description

Discrimination (DA) based on PLS.

The training variable \(y\) (univariate class membership) is first transformed to a dummy table containing \(nclas\) columns, where \(nclas\) is the number of classes present in \(y\). Each column is a dummy variable (0/1). A PLS2 is then fitted on the \(X-\)data and the dummy table, returning latent variables (LVs) that are used as predictors in a DA model.
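The dummy-coding step can be sketched in base R as follows (an illustrative reconstruction, not the internal rchemo implementation, which may differ in details):

```r
## Sketch of the dummy-coding step: each class level of y becomes
## a 0/1 column, giving an n x nclas dummy table.
y <- c("a", "b", "a", "c")
lev <- sort(unique(y))                          # class levels
Ydummy <- sapply(lev, function(l) as.numeric(y == l))
colnames(Ydummy) <- lev
Ydummy                                          # 4 x 3 matrix; each row sums to 1
```

A PLS2 is then run with this dummy table as the Y-block.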

- plsrda: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to a PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable with the highest predicted value.
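The final class assignment described above amounts to a per-row argmax over the predicted dummy values, which can be sketched as (hypothetical fitted values for illustration):

```r
## Sketch of the plsrda decision rule: per observation, pick the class
## whose predicted dummy value is largest (illustrative numbers only).
Ypred <- rbind(c(0.7, 0.2, 0.1),
               c(0.1, 0.3, 0.6))   # predicted dummy values (2 obs, 3 classes)
lev <- c("a", "b", "c")
pred <- lev[apply(Ypred, 1, which.max)]
pred                               # "a" "c"
```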

- plslda and plsqda: Probabilistic LDA and QDA, respectively, are run on the PLS2 LVs.

Usage

plsrda(X, y, weights = NULL, nlv,
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

plslda(X, y, weights = NULL, nlv, prior = c("unif", "prop"),
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

plsqda(X, y, weights = NULL, nlv, prior = c("unif", "prop"),
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

# S3 method for Plsrda
predict(object, X, ..., nlv = NULL)

# S3 method for Plsprobda
predict(object, X, ..., nlv = NULL)

Value

For plsrda, plslda, plsqda:

fm

list with the model:

- T: X-scores matrix
- P: X-loading matrix
- R: the PLS projection matrix (p, nlv)
- W: X-loading weights matrix
- C: the Y-loading weights matrix
- TT: the X-score normalization factor
- xmeans: the centering vector of X (p, 1)
- ymeans: the centering vector of Y (q, 1)
- xscales: the scaling vector of X (p, 1)
- yscales: the scaling vector of Y (q, 1)
- weights: vector of observation weights
- U: intermediate output

lev

the class levels of y

ni

number of observations in each class

For predict.Plsrda, predict.Plsprobda:

pred

predicted class for each observation

posterior

estimated probability of membership in each class, for each observation

Arguments

X

For the main functions: Training X-data (\(n, p\)). --- For the auxiliary functions: New X-data (\(m, p\)) to consider.

y

Training class membership (\(n\)). Note: If y is a factor, it is replaced by a character vector.

weights

Weights (\(n\)) to apply to the training observations for the PLS2. Internally, weights are normalized to sum to 1. Defaults to NULL (each weight is set to \(1 / n\)).
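The normalization described above can be sketched in base R (an illustrative reconstruction based on this description, not the internal rchemo code):

```r
## Sketch of the internal weight normalization: weights are divided
## by their sum, so the NULL default ends up as 1/n per observation.
n <- 5
w <- rep(1, n)        # corresponds to weights = NULL
w <- w / sum(w)       # normalized to sum to 1
sum(w)                # 1; each weight equals 1/n
```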

nlv

The number(s) of LVs to calculate.

prior

The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).

Xscaling

X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit-variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.

Yscaling

Y variable scaling, applied once Y has been converted to dummy (0/1) variables, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit-variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
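The three scaling options can be sketched for a single column (a base-R illustration; "uncorrected" is assumed here to mean the population formula dividing by \(n\) rather than \(n - 1\)):

```r
## Sketch of the scaling options for one column x.
x <- c(2, 4, 6, 8)
n <- length(x)
sd_pop <- sqrt(sum((x - mean(x))^2) / n)   # uncorrected standard deviation
xc <- x - mean(x)                          # "none":   mean-centering only
xp <- xc / sqrt(sd_pop)                    # "pareto": divide by sqrt(sd)
xs <- xc / sd_pop                          # "sd":     unit variance
```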

object

For the auxiliary functions: A fitted model, output of a call to the main functions.

...

For the auxiliary functions: Optional arguments. Not used.

See Also

plsr_plsda_allsteps function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.

Examples


## EXAMPLE OF PLSDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plsrda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
names(fm)

predict(fm, Xtest)
predict(fm, Xtest, nlv = 0:2)$pred

pred <- predict(fm, Xtest)$pred
err(pred, ytest)

zfm <- fm$fm
transform(zfm, Xtest)
transform(zfm, Xtest, nlv = 1)
summary(zfm, Xtrain)
coef(zfm)
coef(zfm, nlv = 0)
coef(zfm, nlv = 2)

## EXAMPLE OF PLS LDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plslda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
predict(fm, Xtest)
predict(fm, Xtest, nlv = 1:2)$pred

zfm <- fm$fm[[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrain)
transform(zfm, Xtest[1:2, ])
coef(zfm)
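## EXAMPLE OF PLS QDA

The examples above cover plsrda and plslda; plsqda follows the same calling convention. A sketch on the same kind of simulated data (the "prop" prior here is an illustrative choice, not a recommendation):

```r
## Same simulated-data pattern as the examples above;
## prior = "prop" shown only for illustration.
n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plsqda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv, prior = "prop")
predict(fm, Xtest)$pred
```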
