enpls (version 6.1)

enspls.ad: Ensemble Sparse Partial Least Squares for Model Applicability Domain Evaluation

Description

Model applicability domain evaluation with ensemble sparse partial least squares.

Usage

enspls.ad(x, y, xtest, ytest, maxcomp = 5L, cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2), space = c("sample", "variable"),
  method = c("mc", "boot"), reptimes = 500L, ratio = 0.8,
  parallel = 1L)

Arguments

x

Predictor matrix of the training set.

y

Response vector of the training set.

xtest

List, with the i-th component being the i-th test set's predictor matrix (see example code below).

ytest

List, with the i-th component being the i-th test set's response vector (see example code below).

maxcomp

Maximum number of components included in each model. Default is 5.

cvfolds

Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.

alpha

Grid of values for the parameter controlling the sparsity of the model. Default is seq(0.2, 0.8, 0.2).

space

Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable").

method

Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".

reptimes

Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500.

ratio

Sampling ratio used when method = "mc". Default is 0.8.

parallel

Integer. Number of CPU cores to use. Default is 1 (not parallelized).
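The space, method, and ratio arguments together determine how each resampled training subset is drawn. The base R sketch below illustrates the general idea on a toy matrix; it is not the package's internal code, and the names x_toy, ratio_toy, mc_rows, boot_rows, and mc_cols are hypothetical. It assumes that Monte-Carlo resampling draws a fraction of the indices without replacement and that bootstrapping draws as many indices as there are samples, with replacement.

```r
set.seed(42)

# Toy predictor matrix: 10 samples (rows) by 6 variables (columns)
x_toy <- matrix(rnorm(60), nrow = 10, ncol = 6)
ratio_toy <- 0.8
n <- nrow(x_toy)
p <- ncol(x_toy)

# Sample space, Monte-Carlo: a subset of rows drawn without replacement
mc_rows <- sample(n, size = floor(n * ratio_toy), replace = FALSE)

# Sample space, bootstrap: n rows drawn with replacement
boot_rows <- sample(n, size = n, replace = TRUE)

# Variable space, Monte-Carlo: all rows kept, a subset of columns drawn
mc_cols <- sample(p, size = floor(p * ratio_toy), replace = FALSE)

# Dimensions of the resulting training subsets
dim(x_toy[mc_rows, , drop = FALSE])
dim(x_toy[boot_rows, , drop = FALSE])
dim(x_toy[, mc_cols, drop = FALSE])
```

Resampling in the sample space changes which observations each model sees, while resampling in the variable space changes which predictors each model sees.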

Value

A list containing:

  • tr.error.mean - absolute mean prediction error for training set

  • tr.error.median - absolute median prediction error for training set

  • tr.error.sd - standard deviation of prediction error for training set

  • tr.error.matrix - raw prediction error matrix for training set

  • te.error.mean - list of absolute mean prediction error for test set(s)

  • te.error.median - list of absolute median prediction error for test set(s)

  • te.error.sd - list of standard deviations of prediction error for test set(s)

  • te.error.matrix - list of raw prediction error matrices for test set(s)
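To make the summary components concrete, the following base R sketch computes per-sample absolute mean, absolute median, and standard deviation from a toy error matrix. The layout of one row per sample and one column per model is an assumption made for illustration, not something this page states, and the names err, err_mean, err_median, and err_sd are hypothetical.

```r
set.seed(7)

# Toy error matrix: 5 samples (rows) by 20 models (columns).
# The row/column layout is an assumption made for this illustration.
err <- matrix(rnorm(100), nrow = 5, ncol = 20)

# Absolute value of the mean error for each sample
err_mean <- abs(rowMeans(err))

# Absolute value of the median error for each sample
err_median <- abs(apply(err, 1, median))

# Standard deviation of the error for each sample
err_sd <- apply(err, 1, sd)

print(round(cbind(err_mean, err_median, err_sd), 3))
```

A sample with a large absolute mean error and a large standard deviation is predicted both poorly and inconsistently across the ensemble, which is the kind of pattern an applicability domain analysis looks for.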

Examples

# NOT RUN {
data("logd1k")
# remove low variance variables
x <- logd1k$x[, -c(17, 52, 59)]
y <- logd1k$y

# training set
x.tr <- x[1:300, ]
y.tr <- y[1:300]

# two test sets
x.te <- list(
  "test.1" = x[301:400, ],
  "test.2" = x[401:500, ]
)
y.te <- list(
  "test.1" = y[301:400],
  "test.2" = y[401:500]
)

set.seed(42)
ad <- enspls.ad(
  x.tr, y.tr, x.te, y.te,
  maxcomp = 3, alpha = c(0.3, 0.6, 0.9),
  space = "variable", method = "mc",
  ratio = 0.8, reptimes = 10
)
print(ad)
plot(ad)
# }
# NOT RUN {
# the interactive plot requires an HTML viewer
plot(ad, type = "interactive")
# }