cv.glmgraph: Cross-validation for glmgraph

Description

Performs k-fold cross-validation for glmgraph.

Usage

cv.glmgraph(X,Y,L,...,type.measure=c("mse","mae","deviance","auc"),nfolds=5,trace=TRUE)

Arguments

X

X matrix as in glmgraph.

Y

Response Y as in glmgraph.

L

User-specified Laplacian matrix L as in glmgraph.

...

Additional arguments as in glmgraph.

type.measure

The loss used for cross-validation. If family is "gaussian", type.measure can be "mse" (mean squared error) or "mae" (mean absolute error); if family is "binomial", type.measure can be "deviance" or "auc" (area under the curve). The default is "mse".

nfolds

The number of cross-validation folds. Default is 5.

trace

If TRUE, print the cross-validation steps as they run. Default is TRUE.
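The four type.measure options correspond to standard loss definitions, which can be computed by hand as in the sketch below. The helper names (mse, mae, binomial_deviance, auc) are hypothetical and not part of glmgraph, and the exact scaling glmgraph applies internally (for example, mean versus summed deviance) is an assumption here.

```r
# Textbook definitions of the four measures, in base R only.
# These helpers are illustrative; they are not glmgraph functions.
mse <- function(y, yhat) mean((y - yhat)^2)
mae <- function(y, yhat) mean(abs(y - yhat))

# Mean binomial deviance for 0/1 responses y and fitted probabilities p.
binomial_deviance <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)  # clip so that log() stays finite
  -2 * mean(y * log(p) + (1 - y) * log(1 - p))
}

# AUC via the rank (Mann-Whitney) formula; tied scores get average ranks.
auc <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Gaussian-type measures on a toy example
y_num <- c(1.0, 2.0, 3.0)
yhat  <- c(1.5, 2.0, 2.0)
mse(y_num, yhat)
mae(y_num, yhat)

# Binomial-type measures on a toy example
y_bin <- c(0, 0, 1, 1)
p_hat <- c(0.1, 0.4, 0.35, 0.8)
binomial_deviance(y_bin, p_hat)
auc(y_bin, p_hat)
```

Note that "mse", "mae" and "deviance" are better when smaller, while "auc" is better when larger.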

Value

obj: The glmgraph object fitted to the whole data set.
cvmat: A data frame summarizing the cross-validation results, which can be displayed with the print function. Its columns are lambda2, lambda1.min, cvmin, semin and lambda1.1se. Each row corresponds to one value of lambda2: the lambda1 with the best type.measure value (cvmin) is reported as lambda1.min. If the one-standard-error rule is applied, lambda1.1se and its corresponding type.measure value semin are reported.
cvm: The mean cross-validated type.measure values, stored as a list of vectors. Each element of the list corresponds to one lambda2 and holds the type.measure value for every lambda1 in the sequence, averaged over the K folds.
cvsd: The estimated standard error of cvm.
cvmin: The best cross-validated type.measure value across all combinations of lambda1 and lambda2. It is the minimum "mse" or "mae" if family is "gaussian"; it is the maximum "auc" or the minimum "deviance" if family is "binomial".
cv.1se: Similar to cvmin, except that the one-standard-error rule is applied.
lambda1.min: Together with lambda2.min, the optimal regularization parameter selection.
lambda2.min: Together with lambda1.min, the optimal regularization parameter selection.
lambda1.1se: Together with lambda2.1se, the optimal regularization parameter selection when the one-standard-error rule is applied.
lambda2.1se: Together with lambda1.1se, the optimal regularization parameter selection when the one-standard-error rule is applied.
beta.min: Estimated beta with the best type.measure value, obtained at the regularization parameters lambda1.min and lambda2.min.
beta.1se: Estimated beta obtained at the regularization parameters lambda1.1se and lambda2.1se.
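To make the difference between the ".min" and ".1se" selections concrete, the following sketch applies both rules to one made-up error curve for a single lambda2. It uses the conventional form of the one-standard-error rule (the largest lambda1 whose error is within one standard error of the minimum); whether glmgraph implements the rule in exactly this way is an assumption, and all numbers and variable names are illustrative.

```r
# Hypothetical CV summary for one lambda2 value: a decreasing lambda1
# sequence, the mean CV error, and its standard error (made-up numbers).
lambda1 <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cvm     <- c(2.10, 1.40, 1.10, 1.00, 1.05)
cvsd    <- c(0.20, 0.15, 0.12, 0.15, 0.16)

# Minimum rule: the lambda1 with the smallest mean CV error.
i_min       <- which.min(cvm)
lambda1_min <- lambda1[i_min]

# One-standard-error rule (conventional form): the largest lambda1 whose
# mean error does not exceed the minimum error plus one standard error.
threshold   <- cvm[i_min] + cvsd[i_min]
lambda1_1se <- max(lambda1[cvm <= threshold])

lambda1_min
lambda1_1se
```

For a measure that is maximized, such as "auc", the same idea applies with the comparison reversed: the threshold becomes the maximum minus one standard error.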

Details

The function runs glmgraph nfolds+1 times: the first run obtains the lambda1 and lambda2 sequences, and the remaining runs compute the fit with each of the folds omitted in turn. The error is accumulated, and the average error and its standard deviation over the folds are computed. Note that the results of cv.glmgraph are random, since the folds are selected at random. Users can reduce this randomness by running cv.glmgraph many times and averaging the error curves.
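The fold-wise procedure described above can be sketched in base R. This is a minimal illustration, not the glmgraph implementation: ordinary least squares via lm() stands in for the glmgraph fit, there is no lambda1 or lambda2 sequence, and cv_mse is a hypothetical helper name.

```r
# Stand-in k-fold cross-validation using lm() instead of glmgraph,
# so that the sketch runs with base R only.
cv_mse <- function(X, Y, nfolds = 5) {
  n    <- nrow(X)
  fold <- sample(rep(seq_len(nfolds), length.out = n))  # random fold labels
  dat  <- data.frame(Y = Y, X)
  err  <- numeric(nfolds)
  for (k in seq_len(nfolds)) {
    test   <- fold == k
    fit    <- lm(Y ~ ., data = dat[!test, ])       # fit with fold k omitted
    pred   <- predict(fit, newdata = dat[test, ])
    err[k] <- mean((dat$Y[test] - pred)^2)         # error on the held-out fold
  }
  # Mean error over folds and the standard error of that mean
  c(cvm = mean(err), cvsd = sd(err) / sqrt(nfolds))
}

set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)
Y <- as.numeric(X %*% c(1, -1, 0.5) + rnorm(n))

one_run <- cv_mse(X, Y)               # depends on the random fold assignment
runs    <- replicate(20, cv_mse(X, Y))  # 2 x 20 matrix, one column per run
avg_cvm <- mean(runs["cvm", ])        # averaged over repeated runs
```

Each call to cv_mse draws new folds, so one_run changes from call to call; averaging over the columns of runs gives a more stable estimate, which is the same idea as averaging the error curves from repeated cv.glmgraph runs.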

References

Li Chen, Han Liu, Hongzhe Li, Jun Chen (2015). glmgraph: Graph-constrained Regularization for Sparse Generalized Linear Models. (Working paper)

Examples

 set.seed(1234)
 library(glmgraph)
 n <- 100
 p1 <- 10
 p2 <- 90
 p <- p1+p2
 X <- matrix(rnorm(n*p), n,p)
 magnitude <- 1
 ## construct laplacian matrix from adjacency matrix
 A <- matrix(rep(0,p*p),p,p)
 A[1:p1,1:p1] <- 1
 A[(p1+1):p,(p1+1):p] <- 1
 diag(A) <- 0
 diagL <- apply(A,1,sum)
 L <- -A
 diag(L) <- diagL
 btrue <- c(rep(magnitude,p1),rep(0,p2))
 intercept <- 0
 eta <- intercept+X%*%btrue
 ### gaussian
 Y <- eta+rnorm(n)
 cv.obj <- cv.glmgraph(X,Y,L,penalty="lasso",lambda2=c(0,1.28))
 beta.min <- coef(cv.obj)
 print(cv.obj)
 ### binomial
 Y <- rbinom(n,1,prob=1/(1+exp(-eta)))
 cv.obj <- cv.glmgraph(X,Y,L,family="binomial",lambda2=c(0,1.28),penalty="lasso",type.measure="auc")
 beta.min <- coef(cv.obj)
 print(cv.obj)
