cv.sparsenet: Cross-validation for sparsenet

Description

Performs k-fold cross-validation for sparsenet, produces a plot, and returns values for gamma and lambda.

Usage

cv.sparsenet(x, y, weights, type.measure = c("mse", "mae"), ..., nfolds = 10,
       foldid, keep = FALSE, trace.it = FALSE)

Value

An object of class "cv.sparsenet" is returned, which is a list with the ingredients of the cross-validation fit.

lambda: the values of lambda used in the fits. This is an nlambda x ngamma matrix.
cvm: the mean cross-validated error, a matrix shaped like lambda.
cvsd: estimate of the standard error of cvm.
cvup: upper curve = cvm+cvsd.
cvlo: lower curve = cvm-cvsd.
nzero: number of non-zero coefficients at each lambda, gamma pair.
name: a text string indicating the type of measure (for plotting purposes).
sparsenet.fit: a fitted sparsenet object for the full data.
call: the call that produced this object.
parms.min: values of gamma, lambda that give the minimum cvm.
which.min: indices of the above.
lambda.1se: gamma, lambda of the smallest model (df) such that the error is within 1 standard error of the minimum.
which.1se: indices of the above.
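
The two selection rules described above can be illustrated on toy matrices. This is a minimal sketch in base R, not the package's internal code: the numbers are made up, and the tie-breaking and the exact form of the one-standard-error rule (fewest non-zero coefficients among cells within one standard error of the minimum) are assumptions based on the descriptions of cvm, cvsd and nzero.

```r
# Toy cross-validation summaries on a 4 x 2 grid (rows: lambda, columns: gamma).
# All numbers are made up for illustration.
cvm   <- matrix(c(1.00, 0.80, 0.70, 0.75,
                  0.95, 0.72, 0.65, 0.90), nrow = 4)
cvsd  <- matrix(0.06, nrow = 4, ncol = 2)
nzero <- matrix(c(2, 5, 7, 14,
                  1, 4, 8, 12), nrow = 4)

# Minimum rule: the grid cell with the smallest mean CV error.
which.min.cell <- which(cvm == min(cvm), arr.ind = TRUE)[1, ]

# One-standard-error rule (assumed form): among cells whose error is within
# one standard error of the minimum, take the one with the fewest non-zeros.
threshold  <- min(cvm) + cvsd[which.min.cell[1], which.min.cell[2]]
candidates <- which(cvm <= threshold, arr.ind = TRUE)
sizes      <- nzero[candidates]          # matrix indexing: one value per row
which.1se.cell <- candidates[which.min(sizes), ]

which.min.cell   # row and column of the minimum-error cell
which.1se.cell   # row and column chosen by the one-standard-error rule
```

In this toy grid the two rules pick different cells, because a slightly worse cell within the threshold has fewer non-zero coefficients.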

Arguments

x: x matrix as in sparsenet.
y: response y as in sparsenet.
weights: Observation weights; defaults to 1 per observation.
type.measure: loss to use for cross-validation. Currently two options: squared error (type.measure="mse") or mean absolute error (type.measure="mae").
...: Other arguments that can be passed to sparsenet.
nfolds: number of folds - default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3.
foldid: an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing.
keep: If TRUE, we include the prevalidation array as component fit.preval on the returned object. Default is keep = FALSE.
trace.it: If TRUE, a printout shows the progress of the cross-validation.
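
Supplying foldid is useful when several cross-validation runs should share the same folds. The sketch below shows one common way to build a balanced fold vector in base R; the sizes n and nfolds are illustrative, and the guarded call at the end only runs if the sparsenet package is installed.

```r
set.seed(1)
n      <- 100   # number of observations (illustrative)
nfolds <- 5     # number of folds (illustrative)

# Balanced random assignment: each label 1..nfolds appears n / nfolds times.
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)

# Reuse the same folds across calls so that CV errors are comparable.
# Guarded, since it needs the sparsenet package and its gendata() helper.
if (requireNamespace("sparsenet", quietly = TRUE)) {
  dat   <- sparsenet::gendata(n, 50, nonzero = 5, rho = 0.3, snr = 3)
  cvfit <- sparsenet::cv.sparsenet(dat$x, dat$y, foldid = foldid)
}
```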

Author

Rahul Mazumder, Jerome Friedman and Trevor Hastie

Maintainer: Trevor Hastie <hastie@stanford.edu>

Details

The function runs sparsenet nfolds+1 times: the first run, on the full data, fixes the lambda sequence, and the remaining nfolds runs compute the fit with each fold omitted in turn. The prediction error on each omitted fold is accumulated, and the mean error and its standard deviation over the folds are computed.
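
The fold loop described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: ordinary least squares (fit_fun and pred_fun, both hypothetical helpers) stands in for sparsenet, so there is a single model rather than a lambda, gamma grid and no initial full-data fit, and the standard error is taken as the standard deviation of the fold errors divided by sqrt(nfolds).

```r
set.seed(42)
n <- 60; p <- 3; nfolds <- 5
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(2, -1, 0) + rnorm(n))
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Stand-in fitter (ordinary least squares); cv.sparsenet would call
# sparsenet here and obtain one prediction per (lambda, gamma) pair.
fit_fun  <- function(x, y) lm.fit(cbind(1, x), y)$coefficients
pred_fun <- function(beta, x) drop(cbind(1, x) %*% beta)

fold_err <- numeric(nfolds)
for (k in seq_len(nfolds)) {
  held_out <- foldid == k
  beta <- fit_fun(x[!held_out, , drop = FALSE], y[!held_out])  # fit without fold k
  pred <- pred_fun(beta, x[held_out, , drop = FALSE])          # predict fold k
  fold_err[k] <- mean((y[held_out] - pred)^2)                  # "mse" loss on fold k
}

cvm  <- mean(fold_err)                 # mean cross-validated error
cvsd <- sd(fold_err) / sqrt(nfolds)    # standard error of that mean (assumed form)
c(cvlo = cvm - cvsd, cvm = cvm, cvup = cvm + cvsd)
```

With a grid of tuning parameters, fold_err would become an array with one slice per fold, and cvm and cvsd would be matrices shaped like lambda.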

References

Mazumder, Rahul, Friedman, Jerome and Hastie, Trevor (2011) SparseNet: Coordinate Descent with Nonconvex Penalties. JASA, Vol 106(495), 1125-38, https://hastie.su.domains/public/Papers/Sparsenet/Mazumder-SparseNetCoordinateDescent-2011.pdf

Examples

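library(sparsenet)
# Simulate 100 observations on 1000 predictors, with nonzero=30
train.data=gendata(100,1000,nonzero=30,rho=0.3,snr=3)
# Fit sparsenet on the full data and plot the fit in a 3 x 3 grid of panels
fit=sparsenet(train.data$x,train.data$y)
par(mfrow=c(3,3))
plot(fit)
par(mfrow=c(1,1))
# Cross-validate with the default nfolds=10, printing progress
fitcv=cv.sparsenet(train.data$x,train.data$y,trace.it=TRUE)
plot(fitcv)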
