Performs k-fold cross-validation for gglasso, produces a plot, and returns values for lambda. This function is adapted from the cv function in the glmnet package.
cv.gglasso(
x,
y,
group,
lambda = NULL,
pred.loss = c("misclass", "loss", "L1", "L2"),
nfolds = 5,
foldid,
delta,
...
)
An object of class cv.gglasso is returned, which is a list with the ingredients of the cross-validation fit:

lambda: the values of lambda used in the fits.

cvm: the mean cross-validated error; a vector of length length(lambda).

cvsd: estimate of the standard error of cvm.

cvupper: upper curve = cvm + cvsd.

cvlo: lower curve = cvm - cvsd.

name: a text string indicating the type of measure (for plotting purposes).

gglasso.fit: a fitted gglasso object for the full data.

lambda.min: the optimal value of lambda, giving the minimum cross-validation error cvm.

lambda.1se: the largest value of lambda such that the error is within 1 standard error of the minimum.
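As a minimal sketch, assuming cv is an object returned by cv.gglasso and that the selected tuning parameters are stored under the names lambda.min and lambda.1se (an assumption based on the descriptions above), the components can be inspected directly:

```r
# cv: a fitted cv.gglasso object (assumed to already exist in the workspace)
cv$lambda.min    # lambda giving the minimum mean cross-validated error
cv$lambda.1se    # largest lambda within 1 standard error of the minimum
plot(cv)         # CV curve with error bars, via the plot.cv.gglasso method
```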
x: matrix of predictors, of dimension \(n \times p\); each row is an observation vector.

y: response variable. This argument should be quantitative for regression (least squares), and a two-level factor for classification (logistic model, huberized SVM, squared SVM).

group: a vector of consecutive integers describing the grouping of the coefficients (see example below).

lambda: optional user-supplied lambda sequence; default is NULL, in which case gglasso chooses its own sequence.
pred.loss: loss to use for cross-validation error. Valid options are:

"loss" for classification: margin-based loss function.

"misclass" for classification: misclassification error.

"L1" for regression: mean absolute error, used with least squares regression (loss="ls"); it measures the absolute deviation of the fitted mean from the response.

"L2" for regression: mean square error, used with least squares regression (loss="ls"); it measures the squared deviation of the fitted mean from the response.

Default is "loss".
nfolds: number of folds; default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds=3.

foldid: an optional vector of values between 1 and nfolds identifying the fold each observation belongs to. If supplied, nfolds can be missing.
delta: parameter \(\delta\), used only in the huberized SVM for computing the log-likelihood on the validation set; only available with pred.loss = "loss", loss = "hsvm".

...: other arguments that can be passed to gglasso.
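As a sketch of how a custom foldid vector might be built, a balanced fold assignment can be constructed with base R alone (the sample size n here is an assumption for illustration):

```r
# build a balanced 5-fold assignment for n observations
set.seed(1)
n <- 97                                     # sample size (assumed for illustration)
foldid <- sample(rep(1:5, length.out = n))  # fold sizes differ by at most 1
table(foldid)                               # check: roughly equal fold sizes
# then pass it on, e.g. cv.gglasso(x, y, group, foldid = foldid, ...)
```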
Yi Yang and Hui Zou
Maintainer: Yi Yang <yi.yang6@mcgill.ca>
The function runs gglasso nfolds+1 times; the first run obtains the lambda sequence, and the remaining runs compute the fit with each of the folds omitted. The average error and standard deviation over the folds are computed.
Yang, Y. and Zou, H. (2015), ``A Fast Unified Algorithm for
Computing Group-Lasso Penalized Learning Problems,'' Statistics and
Computing. 25(6), 1129-1141.
BugReport:
https://github.com/emeryyi/gglasso
gglasso, plot.cv.gglasso, predict.cv.gglasso, and coef.cv.gglasso methods.
# load gglasso library
library(gglasso)
# load data set
data(bardet)
# define group index
group <- rep(1:20,each=5)
# 5-fold cross validation using group lasso
# penalized least squares regression
cv <- cv.gglasso(x=bardet$x, y=bardet$y, group=group, loss="ls",
pred.loss="L2", lambda.factor=0.05, nfolds=5)
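Continuing the example, the CV-selected lambda can be used with the coef.cv.gglasso and predict.cv.gglasso methods listed above; the character values accepted for s are assumed here to mirror glmnet's convention:

```r
# coefficients at the CV-selected lambda (s values assumed as in glmnet)
beta.min <- coef(cv, s = "lambda.min")
# fitted values on the training predictors at the same lambda
yhat <- predict(cv, newx = bardet$x, s = "lambda.min")
```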