
rchemo (version 0.1-3)

plsrda: PLSDA models

Description

Discrimination (DA) based on PLS.

The training variable \(y\) (univariate class membership) is first transformed to a dummy table containing \(nclas\) columns, where \(nclas\) is the number of classes present in \(y\). Each column is a dummy variable (0/1). A PLS2 is then fitted on the \(X-\)data and the dummy table, returning latent variables (LVs) that are used as predictors in a DA model.
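The dummy-coding step can be sketched in base R as follows (an illustrative reconstruction, not the internal rchemo implementation, which may differ in details):

```r
## Sketch of the dummy-coding step: each class level of y becomes
## a 0/1 column, giving an n x nclas dummy table.
y <- c("a", "b", "a", "c")
lev <- sort(unique(y))                          # class levels
Ydummy <- sapply(lev, function(l) as.numeric(y == l))
colnames(Ydummy) <- lev
Ydummy                                          # 4 x 3 matrix; each row sums to 1
```

A PLS2 is then run with this dummy table as the Y-block.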

- plsrda: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to a PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable with the highest predicted value.
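The final class assignment described above amounts to a per-row argmax over the predicted dummy values, which can be sketched as (hypothetical fitted values for illustration):

```r
## Sketch of the plsrda decision rule: per observation, pick the class
## whose predicted dummy value is largest (illustrative numbers only).
Ypred <- rbind(c(0.7, 0.2, 0.1),
               c(0.1, 0.3, 0.6))   # predicted dummy values (2 obs, 3 classes)
lev <- c("a", "b", "c")
pred <- lev[apply(Ypred, 1, which.max)]
pred                               # "a" "c"
```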

- plslda and plsqda: Probabilistic LDA and QDA, respectively, are run on the PLS2 LVs.

Usage

plsrda(X, y, weights = NULL, nlv,
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

plslda(X, y, weights = NULL, nlv, prior = c("unif", "prop"),
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

plsqda(X, y, weights = NULL, nlv, prior = c("unif", "prop"),
    Xscaling = c("none", "pareto", "sd")[1],
    Yscaling = c("none", "pareto", "sd")[1])

# S3 method for Plsrda
predict(object, X, ..., nlv = NULL)

# S3 method for Plsprobda
predict(object, X, ..., nlv = NULL)

Value

For plsrda, plslda, plsqda:

fm

list with the model:

- T: X-scores matrix
- P: X-loading matrix
- R: the PLS projection matrix (p, nlv)
- W: X-loading weights matrix
- C: the Y-loading weights matrix
- TT: the X-score normalization factor
- xmeans: the centering vector of X (p, 1)
- ymeans: the centering vector of Y (q, 1)
- xscales: the scaling vector of X (p, 1)
- yscales: the scaling vector of Y (q, 1)
- weights: vector of observation weights
- U: intermediate output

lev

the class levels of y

ni

number of observations in each class

For predict.Plsrda, predict.Plsprobda:

pred

predicted class for each observation

posterior

estimated probability of membership in each class, for each observation

Arguments

X

For the main functions: Training X-data (\(n, p\)). --- For the auxiliary functions: New X-data (\(m, p\)) to consider.

y

Training class membership (\(n\)). Note: If y is a factor, it is replaced by a character vector.

weights

Weights (\(n\)) to apply to the training observations for the PLS2. Internally, weights are normalized to sum to 1. Defaults to NULL (each weight is set to \(1 / n\)).
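The normalization described above can be sketched in base R (an illustrative reconstruction based on this description, not the internal rchemo code):

```r
## Sketch of the internal weight normalization: weights are divided
## by their sum, so the NULL default ends up as 1/n per observation.
n <- 5
w <- rep(1, n)        # corresponds to weights = NULL
w <- w / sum(w)       # normalized to sum to 1
sum(w)                # 1; each weight equals 1/n
```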

nlv

The number(s) of LVs to calculate.

prior

The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).

Xscaling

X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit-variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.

Yscaling

Y variable scaling, applied once Y has been converted to dummy (0/1) variables, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit-variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
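The three scaling options can be sketched for a single column (a base-R illustration; "uncorrected" is assumed here to mean the population formula dividing by \(n\) rather than \(n - 1\)):

```r
## Sketch of the scaling options for one column x.
x <- c(2, 4, 6, 8)
n <- length(x)
sd_pop <- sqrt(sum((x - mean(x))^2) / n)   # uncorrected standard deviation
xc <- x - mean(x)                          # "none":   mean-centering only
xp <- xc / sqrt(sd_pop)                    # "pareto": divide by sqrt(sd)
xs <- xc / sd_pop                          # "sd":     unit variance
```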

object

For the auxiliary functions: A fitted model, output of a call to the main functions.

...

For the auxiliary functions: Optional arguments. Not used.

See Also

plsr_plsda_allsteps function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.

Examples


## EXAMPLE OF PLSDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plsrda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
names(fm)

predict(fm, Xtest)
predict(fm, Xtest, nlv = 0:2)$pred

pred <- predict(fm, Xtest)$pred
err(pred, ytest)

zfm <- fm$fm
transform(zfm, Xtest)
transform(zfm, Xtest, nlv = 1)
summary(zfm, Xtrain)
coef(zfm)
coef(zfm, nlv = 0)
coef(zfm, nlv = 2)

## EXAMPLE OF PLS LDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plslda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
predict(fm, Xtest)
predict(fm, Xtest, nlv = 1:2)$pred

zfm <- fm$fm[[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrain)
transform(zfm, Xtest[1:2, ])
coef(zfm)
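## EXAMPLE OF PLS QDA

The examples above cover plsrda and plslda; plsqda follows the same calling convention. A sketch on the same kind of simulated data (the "prop" prior here is an illustrative choice, not a recommendation):

```r
## Same simulated-data pattern as the examples above;
## prior = "prop" shown only for illustration.
n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plsqda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv, prior = "prop")
predict(fm, Xtest)$pred
```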
