plsrda_agg: PLSDA with aggregation of latent variables

Description

Ensemblist approach where the predictions are calculated by "averaging" the predictions of PLSDA models built with different numbers of latent variables (LVs).

For instance, if argument nlv is set to nlv = "5:10", the prediction for a new observation is the most occurent level (vote) over the predictions returned by the models with 5 LVS, 6 LVs, ... 10 LVs.

- plsrda_agg: use plsrda.

- plslda_agg: use plslda.

- plsqda_agg: use plsqda.

Usage

plsrda_agg(X, y, weights = NULL, nlv)
plslda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop"))
plsqda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop"))
# S3 method for Plsda_agg
predict(object, X, ...)

Value

For plsrda_agg, plslda_agg and plsqda_agg:

fm: list contaning: the model((fm)=(T): X-scores matrix; (P): X-loading matrix;(R): The PLS projection matrix (p,nlv); (W): X-loading weights matrix ;(C): The Y-loading weights matrix; (TT): the X-score normalization factor; (xmeans): the centering vector of X (p,1); (ymeans): the centering vector of Y (q,1); (weights): vector of observation weights; (U): intermediate output), (lev):classes, (ni):number of observations in each class
nlv: range of the numbers of LVs considered

For predict.Plsda_agg:

pred: Final predictions (after aggregation)
predlv: Intermediate predictions (Per nb. LVs)

Arguments

X: For the main functions: Training X-data (\(n, p\)). --- For the auxiliary function: New X-data (\(m, p\)) to consider.
y: Training class membership (\(n\)). Note: If y is a factor, it is replaced by a character vector.
weights: Weights (\(n, 1\)) to apply to the training observations. Internally, weights are "normalized" to sum to 1. Default to NULL (weights are set to \(1 / n\)).
nlv: A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with nb LVS = 5, 6, ..., 20 are averaged). Syntax such as "10" is also allowed (here: correponds to the single model with 10 LVs).
prior: The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).
object: For the auxiliary function: A fitted model, output of a call to the main functions.
...: For the auxiliary function: Optional arguments. Not used.

Examples

Run this code


## EXAMPLE OF PLSRDA-AGG

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE)

m <- 5
Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m]

nlv <- "2:5"
fm <- plsrda_agg(Xtrain, ytrain, nlv = nlv)
names(fm)
res <- predict(fm, Xtest)
names(res)
res$pred
err(res$pred, ytest)
res$predlv

pars <- mpars(nlv = c("1:3", "2:5"))
pars

res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plsrda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = plslda_agg, 
    pars = pars,
    verb = TRUE)
res

## EXAMPLE OF PLSLDA-AGG

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE)
#ytrain <- sample(c("a", "10", "d"), size = n, replace = TRUE)
m <- 5
Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m]

nlv <- "2:5"
fm <- plslda_agg(Xtrain, ytrain, nlv = nlv, prior = "unif")
names(fm)
res <- predict(fm, Xtest)
names(res)
res$pred
err(res$pred, ytest)
res$predlv

pars <- mpars(nlv = c("1:3", "2:5"), prior = c("unif", "prop"))
pars
res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plslda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = plslda_agg, 
    pars = pars,
    verb = TRUE)
res

Run the code above in your browser using DataLab