glinternet (version 1.0.12)

glinternet.cv: Cross-validation for glinternet

Description

Performs k-fold cross validation for glinternet and returns the value of lambda that minimizes the cross validation error.

Usage

glinternet.cv(X, Y, numLevels, nFolds = 10, lambda=NULL, nLambda=50,
lambdaMinRatio=0.01, interactionCandidates=NULL, interactionPairs=NULL,
screenLimit=NULL, family=c("gaussian", "binomial"), tol=1e-5, maxIter=5000,
verbose=FALSE, numCores=1)

Arguments

X

X matrix as in glinternet.

Y

Target Y as in glinternet.

numLevels

Number of levels numLevels as in glinternet.

nFolds

Number of folds - default is 10.

lambda

lambda as in glinternet.

nLambda

nLambda as in glinternet.

lambdaMinRatio

lambdaMinRatio as in glinternet.

interactionCandidates

interactionCandidates as in glinternet.

interactionPairs

interactionPairs as in glinternet.

screenLimit

screenLimit as in glinternet.

family

family as in glinternet.

tol

tol as in glinternet.

maxIter

maxIter as in glinternet.

verbose

verbose as in glinternet.

numCores

numCores as in glinternet.

Value

An object of class glinternet.cv with the components

call

The user function call.

glinternetFit

Glinternet object fitted on the full data using a lambda sequence that terminates at lambdaHat.

fitted

Vector of fitted values (same length as Y), taken from the model fitted at lambdaHat.

activeSet

List of the variables found for the model fitted at lambdaHat.

betahat

Unstandardized coefficients for the variables in activeSet.

lambda

The actual sequence of lambda values used for the cross validation.

lambdaHat

The value of lambda that minimizes the cv error curve.

lambdaHat1Std

The largest value of lambda that produces a cv error that is within 1 standard deviation of the minimum cv error. This will always be at least as large as lambdaHat.

cvErr

The vector of cross validation errors. Same length as lambda.

cvErrStd

Standard deviation for cv errors across the nFolds folds.

family

The response type.

numLevels

Input number of levels for each variable.

nFolds

The number of folds used.
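
The relationship between lambda, cvErr, cvErrStd, lambdaHat and lambdaHat1Std described above can be sketched in base R. The numbers below are made up purely for illustration and are not output from glinternet; the selection rule is one reading of the definitions given here, not the package's internal code.

```r
# Toy values, made up for illustration only (not glinternet output).
lambda   <- c(1.0, 0.5, 0.25, 0.125, 0.0625)  # decreasing lambda sequence
cvErr    <- c(2.0, 1.3, 1.1, 1.0, 1.05)       # mean cv error per lambda
cvErrStd <- c(0.30, 0.25, 0.20, 0.15, 0.15)   # sd across folds per lambda

# lambdaHat: the lambda with the smallest mean cv error.
iMin <- which.min(cvErr)
lambdaHat <- lambda[iMin]

# lambdaHat1Std: the largest lambda whose cv error lies within one
# standard deviation (taken at the minimum) of the minimum cv error.
threshold <- cvErr[iMin] + cvErrStd[iMin]
lambdaHat1Std <- max(lambda[cvErr <= threshold])

lambdaHat1Std >= lambdaHat  # holds by construction
```

Because lambdaHat itself always satisfies the threshold, lambdaHat1Std can never be smaller than lambdaHat, which matches the statement in the component description.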

Details

The lambda sequence is computed using all the data. nFolds models are then fit, each with one of the folds omitted. The error on the omitted fold is accumulated, and the average error and its standard deviation over the folds are computed for each lambda. The lambda value that minimizes the average error is returned, and a model with this lambda is fit to the full data set.
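
The procedure above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: a closed-form ridge regression (the hypothetical helper fitRidge) stands in for the glinternet fit, and a fixed log-spaced grid stands in for the data-derived lambda sequence.

```r
set.seed(1)
n <- 100; p <- 5
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1] - 2 * X[, 2] + rnorm(n)

# Stand-in fitter (assumption): closed-form ridge regression, NOT glinternet.
fitRidge <- function(X, Y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, Y))
}

nFolds <- 5
# One lambda sequence shared by all folds (here a fixed, decreasing grid).
lambda <- 10^seq(2, -2, length.out = 20)
# Randomly assign each observation to one of the folds.
folds <- sample(rep(seq_len(nFolds), length.out = n))

# Fit with one fold omitted and record the error on the omitted fold.
foldErr <- matrix(NA_real_, nFolds, length(lambda))
for (k in seq_len(nFolds)) {
  test <- folds == k
  for (j in seq_along(lambda)) {
    beta <- fitRidge(X[!test, , drop = FALSE], Y[!test], lambda[j])
    pred <- X[test, , drop = FALSE] %*% beta
    foldErr[k, j] <- mean((Y[test] - pred)^2)
  }
}

# Average error and standard deviation over the folds, per lambda.
cvErr <- colMeans(foldErr)
cvErrStd <- apply(foldErr, 2, sd)

# Pick the minimizing lambda and refit on the full data set.
lambdaHat <- lambda[which.min(cvErr)]
betaHat <- fitRidge(X, Y, lambdaHat)
```

Sharing a single lambda sequence across folds is what makes the per-fold errors comparable column by column, so they can be averaged into one error curve.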

See Also

glinternet, predict.glinternet, predict.glinternet.cv, plot.glinternet.cv

Examples

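# NOT RUN {
library(glinternet)
# Simulate a response and 10 predictors with 1 to 5 levels each
# (numLevels of 1 marks a continuous variable).
Y = rnorm(100)
numLevels = sample(1:5, 10, replace=TRUE)
# Continuous columns are Gaussian; categorical columns are coded 0, ..., levels-1.
X = sapply(numLevels, function(x) if (x==1)
  rnorm(100) else sample(0:(x-1), 100, replace=TRUE))
# 3-fold cross validation over the default lambda sequence
fit = glinternet.cv(X, Y, numLevels, nFolds=3)
# }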
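
As a follow-up to the example, the sketch below inspects some of the components listed under Value and calls the predict method named under See Also. It assumes the glinternet package is installed (the block does nothing otherwise) and that predict accepts the fitted object followed by a new X matrix; the seed and variable names are illustrative choices.

```r
# Runs only when the glinternet package is available.
if (requireNamespace("glinternet", quietly = TRUE)) {
  library(glinternet)
  set.seed(1)
  Y <- rnorm(100)
  numLevels <- sample(1:5, 10, replace = TRUE)
  X <- sapply(numLevels, function(x) {
    if (x == 1) rnorm(100) else sample(0:(x - 1), 100, replace = TRUE)
  })
  fit <- glinternet.cv(X, Y, numLevels, nFolds = 3)

  # Selected penalty values and the cv error curve.
  print(fit$lambdaHat)
  print(fit$lambdaHat1Std)
  print(cbind(lambda = fit$lambda, cvErr = fit$cvErr, cvErrStd = fit$cvErrStd))

  # Variables selected at lambdaHat and their coefficients.
  str(fit$activeSet)
  str(fit$betahat)

  # Predictions from the cross-validated fit (assumed call form).
  yhat <- predict(fit, X)
}
```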
