DAP (version 1.0)

cv_DAP: Cross-validation for DAP

Description

Chooses the optimal tuning parameter lambda for DAP by k-fold cross-validation, minimizing the misclassification error rate.

Usage

cv_DAP(X, Y, lambda_seq, nfolds = 5, eps = 1e-04, maxiter = 1000,
  myseed = 1001, prior = TRUE)

Arguments

X

An n x p training dataset, with the n observations in rows and the p features in columns.

Y

A vector of length n containing the training group labels, each either 1 or 2.

lambda_seq

A sequence of tuning parameters to choose from.

nfolds

Number of folds for cross-validation; the default is 5.

eps

Convergence threshold for the block-coordinate descent algorithm, based on the maximum element-wise change in \(V\). The default is 1e-4.

maxiter

Maximum number of iterations; the default is 1000.

myseed

Optional random seed used when generating the folds; the default is 1001.

prior

A logical indicating whether to assign larger weights to larger groups; the default is TRUE.
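The role of nfolds and myseed can be illustrated with a minimal sketch of reproducible fold assignment. This is not the package's internal code: make_folds is a hypothetical helper, and whether cv_DAP balances or stratifies its folds in this way is an assumption.

```r
# Hypothetical helper, not part of the DAP package: assign each of n
# observations to one of nfolds folds in a reproducible way.
make_folds <- function(n, nfolds = 5, seed = 1001) {
  set.seed(seed)  # fix the RNG state so the same folds are generated each time
  # Build balanced fold labels 1..nfolds, then shuffle them
  sample(rep(seq_len(nfolds), length.out = n))
}

folds <- make_folds(n = 100, nfolds = 5)
table(folds)  # each of the 5 folds receives 20 observations
```

Calling the helper twice with the same seed returns the same fold labels, which is the practical reason for exposing a seed argument: cross-validation results can then be reproduced exactly.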

Value

A list with the following components:

lambda_seq

The sequence of tuning parameters used.

cvm

The mean cross-validated error rates, a vector of length length(lambda_seq).

cvse

The estimated standard error vector corresponding to cvm.

lambda_min

Value of tuning parameter corresponding to the minimal error in cvm.

lambda_1se

The largest value of the tuning parameter such that the corresponding error is within 1 standard error of the minimal error in cvm.

nfeature_mat

An nfolds x length(lambda_seq) matrix of the number of selected features.

error_mat

An nfolds x length(lambda_seq) matrix of the error rates.
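The relationship between error_mat, cvm, cvse, lambda_min and lambda_1se can be sketched as follows. The error matrix holds made-up numbers for illustration only, and two details are assumptions rather than documented behavior: the standard error is taken as the per-column standard deviation divided by sqrt(nfolds), and the one-standard-error threshold uses the standard error at the minimizing lambda.

```r
# Illustrative error matrix: 3 folds (rows) x 4 lambda values (columns).
# These numbers are invented for demonstration, not produced by cv_DAP.
lambda_seq <- c(0.2, 0.3, 0.5, 0.7)
error_mat <- matrix(c(0.10, 0.04, 0.09, 0.30,
                      0.12, 0.10, 0.10, 0.28,
                      0.14, 0.10, 0.08, 0.32),
                    nrow = 3, byrow = TRUE)

# Mean error per lambda across folds
cvm <- colMeans(error_mat)
# Assumed standard-error formula: column sd divided by sqrt(number of folds)
cvse <- apply(error_mat, 2, sd) / sqrt(nrow(error_mat))

# Lambda with the smallest mean error (first one in case of ties)
i_min <- which.min(cvm)
lambda_min <- lambda_seq[i_min]

# Largest lambda whose mean error is within one SE of the minimum
lambda_1se <- max(lambda_seq[cvm <= cvm[i_min] + cvse[i_min]])

lambda_min  # 0.3
lambda_1se  # 0.5
```

Here the mean errors are 0.12, 0.08, 0.09 and 0.30, so lambda_min is 0.3. The standard error at that column is 0.02, giving a threshold of 0.10, and 0.5 is the largest lambda whose mean error stays under it. Since a larger lambda typically corresponds to a stronger penalty, lambda_1se tends to select fewer features at a small cost in error.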

Examples

# NOT RUN {
## This is an example for cv_DAP

## Generate data
n_train = 50
n_test = 50
p = 100
mu1 = rep(0, p)
mu2 = rep(3, p)
Sigma1 = diag(p)
Sigma2 = 0.5* diag(p)

## Build training data
x1 = MASS::mvrnorm(n = n_train, mu = mu1, Sigma = Sigma1)
x2 = MASS::mvrnorm(n = n_train, mu = mu2, Sigma = Sigma2)
xtrain = rbind(x1, x2)
ytrain = c(rep(1, n_train), rep(2, n_train))

## Apply cv_DAP
fit = cv_DAP(X = xtrain, Y = ytrain, lambda_seq = c(0.2, 0.3, 0.5, 0.7, 0.9))
# }