VDA (version 1.3)

cv.vda.r: Choose $\lambda$ using K-fold cross validation

Description

Choose the optimal tuning parameter $\lambda$ for Vertex Discriminant Analysis using K-fold cross validation.

Usage

cv.vda.r(x, y, k, lam.vec)
cv.vda(x, y, k, lam.vec)

Arguments

x
An n x p matrix or data frame of feature values. Rows correspond to cases and columns to features. Do not include an intercept column.
y
An n x 1 vector representing the outcome variable. Each element gives the class to which the corresponding case belongs.
k
The number of folds to be used in cross-validation.
lam.vec
A vector of candidate values of $\lambda$. VDA is run with each value in turn.

Value

k
The value of K used for the K-fold cross validation.
lam.vec
The values of lambda tested.
mean.error
The mean error for each lambda, averaged across the k folds.
lam.opt
The value in lam.vec that gives the smallest prediction error. This is the optimal lambda value for use in vda.r.
error.cv
The matrix of prediction errors produced by the cross validation.

Details

Performs K-fold cross validation to select the optimal lambda for use in Vertex Discriminant Analysis (vda.r). The optimal value is the lambda that returns the lowest testing error over the cross validation. If more than one lambda value gives the minimum testing error, the largest lambda is selected.
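
The selection rule described above can be illustrated with a small base R sketch. It is not the package's internal code: the error values are made up, and the layout of the error matrix (rows as folds, columns as lambda values) is an assumption, since the orientation of error.cv is not stated here.

```r
# Hypothetical cross-validation errors: rows = folds, columns = lambdas.
# (This orientation is an assumption, not taken from the VDA package.)
lam.vec <- c(0.1, 0.2, 0.3, 0.4)
error.cv <- rbind(
  c(0.20, 0.10, 0.10, 0.30),
  c(0.30, 0.10, 0.10, 0.20),
  c(0.10, 0.10, 0.10, 0.40)
)

# Mean error for each lambda across the folds
mean.error <- colMeans(error.cv)

# Indices of all lambdas attaining the minimum mean error
# (a small tolerance guards against floating-point noise)
best <- which(mean.error <= min(mean.error) + 1e-12)

# Tie-break: keep the largest lambda among the minimizers
lam.opt <- max(lam.vec[best])
lam.opt
```

In this made-up example the second and third lambda values tie for the lowest mean error, so the rule selects the larger of the two, 0.3.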

A plot of the cross validation errors can be viewed through plot.cv.vda.r.

References

Lange, K. and Wu, T.T. (2008) An MM Algorithm for Multicategory Vertex Discriminant Analysis. Journal of Computational and Graphical Statistics, Volume 17, No 3, 527-544.

See Also

vda.r, plot.cv.vda.r

Examples

# load zoo data
# column 1 is name, columns 2:17 are features, column 18 is class
data(zoo)

# feature matrix without intercept
x <- zoo[,2:17]

# class vector
y <- zoo[,18]

# lambda vector
lam.vec <- (1:10)/10

# search for the best lambda with 10-fold cross validation and plot the CV errors
cv <- cv.vda.r(x, y, 10, lam.vec)
plot(cv)

# run VDA
out <- vda.r(x, y, cv$lam.opt)

# Predict five cases based on VDA
fivecases <- matrix(0,5,16)
fivecases[1,] <- c(1,0,0,1,0,0,0,1,1,1,0,0,4,0,1,0)
fivecases[2,] <- c(1,0,0,1,0,0,1,1,1,1,0,0,4,1,0,1)
fivecases[3,] <- c(0,1,1,0,1,0,0,0,1,1,0,0,2,1,1,0)
fivecases[4,] <- c(0,0,1,0,0,1,1,1,1,0,0,1,0,1,0,0)
fivecases[5,] <- c(0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0)
predict(out, fivecases)