Performs cross validation for a group penalty.
rq.group.pen.cv(
x,
y,
tau = 0.5,
groups = 1:ncol(x),
lambda = NULL,
a = NULL,
cvFunc = NULL,
nfolds = 10,
foldid = NULL,
groupError = TRUE,
cvSummary = mean,
tauWeights = rep(1, length(tau)),
printProgress = FALSE,
weights = NULL,
...
)
An rq.pen.seq.cv object.
Matrix of the cross-validation error, summarized by the cvSummary function (default is the average), for each model (a combination of tau and a) and each lambda.
Matrix of the standard error of cverr for each model (a combination of tau and a) and each lambda.
The rq.pen.seq object fit to the full data.
A data.table of the values of a and lambda that are best as determined by the minimum cross validation error and the one standard error rule, which fixes a. In btr the values of lambda and a are selected separately for each quantile.
A data.table for the combination of a and lambda that minimizes the cross validation error across all tau.
Group, across all quantiles, cross-validation error results for each value of a and lambda.
Original call to the function.
x: Matrix of predictors.
y: Vector of responses.
tau: Vector of quantiles.
groups: Vector of group assignments for the predictors.
lambda: Vector of lambda values; if set to NULL they will be generated automatically.
a: Vector of the other tuning parameter values.
cvFunc: Function used for cross-validation error; default is quantile loss.
nfolds: Number of folds used for cross validation.
foldid: Fold assignments; if not set these will be randomly created.
groupError: Whether errors are reported as a group or as the average for each fold.
cvSummary: Function used to summarize the cross-validation errors; default is mean.
tauWeights: Weights for the tau penalty, used only in the group tau results (gtr).
printProgress: If set to TRUE, prints which fold the process is working on.
weights: Weights for the quantile loss function. Used in both model fitting and cross-validation.
...: Additional parameters that will be sent to rq.group.pen().
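As a rough illustration of the foldid argument, the sketch below builds balanced, randomly ordered fold assignments in base R and passes them to the function. The helper name make_foldid is hypothetical and not part of the package, and the call is wrapped in if (FALSE) so that it is not executed automatically. It assumes that nfolds should match the number of distinct values in foldid.

```r
# Hypothetical helper (not part of rqPen): balanced random fold labels 1..nfolds
make_foldid <- function(n, nfolds) {
  sample(rep(seq_len(nfolds), length.out = n))
}

set.seed(1)
x <- matrix(rnorm(100 * 8), ncol = 8)
y <- 1 + x[, 1] + 3 * x[, 3] - x[, 8] + rt(100, 3)
g <- c(1, 1, 1, 1, 2, 2, 3, 3)

# One fold label per observation, 20 observations in each of 5 folds
foldid <- make_foldid(nrow(x), nfolds = 5)

if (FALSE) {
  # Reusing the same foldid across calls keeps the folds identical,
  # which makes cross-validation errors comparable between penalties.
  fit_cv <- rq.group.pen.cv(x, y, tau = c(.25, .75), groups = g,
                            nfolds = 5, foldid = foldid)
}
```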
Ben Sherwood, ben.sherwood@ku.edu and Shaobo Li, shaobo.li@ku.edu
Two cross validation results are returned. The first considers the best combination of a and lambda for each quantile. The second considers the best combination of the tuning
parameters for all quantiles. Let \(y_{b,i}\), \(x_{b,i}\), and \(m_{b,i}\) index the response, predictors, and weights of observations in
fold b. Let \(\hat{\beta}_{\tau,a,\lambda}^{-b}\) be the estimator for a given quantile and tuning parameters that did not use the bth fold. Let \(n_b\) be the number of observations in fold
b. Then the cross validation error for fold b is
$$\mbox{CV}(b,\tau) = \frac{1}{n_b} \sum_{i=1}^{n_b} m_{b,i} \rho_\tau(y_{b,i}-x_{b,i}^\top\hat{\beta}_{\tau,a,\lambda}^{-b}).$$
Note that \(\rho_\tau()\) can be replaced by a different function by setting the cvFunc parameter. The function returns two different cross-validation summaries. The first is btr, by tau results.
It provides the values of lambda and a that minimize the average, or whatever function is used for cvSummary, of \(\mbox{CV}(b,\tau)\). In addition, it provides the sparsest solution that is within one standard error of the minimum results.
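The fold-level error and the by-tau selection described above can be sketched in base R as follows. This is an illustration only, not the package's internal code: the function names check_loss, fold_cv_error and select_lambda are hypothetical, the standard error is assumed to be the fold standard deviation divided by the square root of the number of folds, and the largest lambda within one standard error of the minimum is used as a stand-in for the sparsest solution.

```r
# Quantile (check) loss: rho_tau(u) = u * (tau - I(u < 0))
check_loss <- function(u, tau) u * (tau - (u < 0))

# CV(b, tau) for one held-out fold b.
# y_b, x_b, m_b: response, design matrix and weights of fold b
#   (x_b is assumed to already contain an intercept column if one is used)
# beta_minus_b: coefficients estimated without fold b
fold_cv_error <- function(y_b, x_b, m_b, beta_minus_b, tau) {
  resid <- drop(y_b - x_b %*% beta_minus_b)
  sum(m_b * check_loss(resid, tau)) / length(y_b)
}

# cv_mat: folds in rows, lambda values in columns, for one tau and one a.
# Returns the lambda with minimum summarized error and the largest lambda
# whose summarized error is within one standard error of that minimum.
select_lambda <- function(cv_mat, lambda, cvSummary = mean) {
  err <- apply(cv_mat, 2, cvSummary)
  se <- apply(cv_mat, 2, sd) / sqrt(nrow(cv_mat))  # assumed SE definition
  i_min <- which.min(err)
  within <- which(err <= err[i_min] + se[i_min])
  i_1se <- within[which.max(lambda[within])]
  c(lambda_min = lambda[i_min], lambda_1se = lambda[i_1se])
}

# Toy input: 2 folds, 3 candidate lambda values
cv_mat <- rbind(c(1.0, 0.5, 0.55),
                c(1.2, 0.7, 0.75))
lambda <- c(0.01, 0.1, 0.5)
sel <- select_lambda(cv_mat, lambda)
```

Passing a different function as cvSummary, for example median, changes only how the fold errors are combined before the minimum is located.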
The other approach is the group tau results, gtr. Consider the case of estimating Q quantiles \(\tau_1,\ldots,\tau_Q\) with quantile weights (tauWeights) of \(v_q\). The gtr returns the values of lambda and a that minimize the average, or again whatever function is used for cvSummary, of
$$\sum_{q=1}^Q v_q\mbox{CV}(b,\tau_q).$$
If only one quantile is modeled then the gtr results can be ignored, as they provide the same minimum solution as btr.
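The weighted sum used for the group tau results can be illustrated with a small base R sketch. The function name group_cv_error and the input layout (folds in rows, quantiles in columns, for a single pair of a and lambda) are assumptions made for this example and are not taken from the package.

```r
# cv_by_tau: matrix of CV(b, tau_q), folds in rows and quantiles in columns,
#            for ONE (a, lambda) pair
# tauWeights: the weights v_q, one per quantile
group_cv_error <- function(cv_by_tau,
                           tauWeights = rep(1, ncol(cv_by_tau)),
                           cvSummary = mean) {
  # Weighted sum over quantiles within each fold: sum_q v_q * CV(b, tau_q)
  per_fold <- drop(cv_by_tau %*% tauWeights)
  # Summarize across folds (the average by default)
  cvSummary(per_fold)
}

# Toy input: 2 folds, 2 quantiles
cv_by_tau <- rbind(c(1, 2),
                   c(3, 4))
equal_w <- group_cv_error(cv_by_tau)                 # both quantiles weighted 1
first_only <- group_cv_error(cv_by_tau, c(2, 0))     # all weight on tau_1
```

Evaluating this quantity for every candidate pair of a and lambda and taking the smallest value corresponds to the single combination reported in gtr.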
set.seed(1)
x <- matrix(rnorm(100*8,sd=1),ncol=8)
y <- 1 + x[,1] + 3*x[,3] - x[,8] + rt(100,3)
g <- c(1,1,1,1,2,2,3,3)
tvals <- c(.25,.75)
if (FALSE) {
m1 <- rq.group.pen.cv(x,y,tau=c(.1,.3,.7),groups=g)
m2 <- rq.group.pen.cv(x,y,penalty="gAdLASSO",tau=c(.1,.3,.7),groups=g)
m3 <- rq.group.pen.cv(x,y,penalty="gSCAD",tau=c(.1,.3,.7),a=c(3,4,5),groups=g)
m4 <- rq.group.pen.cv(x,y,penalty="gMCP",tau=c(.1,.3,.7),a=c(3,4,5),groups=g)
}