rq.group.pen.cv: Performs cross validation for a group penalty.

Description

Performs cross validation for a group penalty.

Usage

rq.group.pen.cv(
  x,
  y,
  tau = 0.5,
  groups = 1:ncol(x),
  lambda = NULL,
  a = NULL,
  cvFunc = NULL,
  nfolds = 10,
  foldid = NULL,
  groupError = TRUE,
  cvSummary = mean,
  tauWeights = rep(1, length(tau)),
  printProgress = FALSE,
  weights = NULL,
  ...
)

Value

An rq.pen.seq.cv object.

cverr: Matrix of the cross-validation error, summarized across folds by the cvSummary function (default is the mean), for each model (a combination of tau and a) and each lambda.
cvse: Matrix of the standard errors of cverr for each model (a combination of tau and a) and each lambda.
fit: The rq.pen.seq object fit to the full data.
btr: A data.table of the values of a and lambda that are best as determined by the minimum cross validation error and the one standard error rule, which fixes a. In btr the values of lambda and a are selected separately for each quantile.
gtr: A data.table for the combination of a and lambda that minimizes the cross validation error across all tau.
gcve: Group, across all quantiles, cross-validation error results for each value of a and lambda.
call: Original call to the function.

Arguments

x: Matrix of predictors.
y: Vector of responses.
tau: Vector of quantiles.
groups: Vector of group assignments for the predictors.
lambda: Vector of lambda values. If set to NULL, they will be generated automatically.
a: Vector of the other tuning parameter values.
cvFunc: Function used for cross-validation error, default is quantile loss.
nfolds: Number of folds used for cross validation.
foldid: Fold assignments. If not set, these will be randomly created.
groupError: If errors are to be reported as a group or as the average for each fold.
cvSummary: Function used to summarize the cross-validation errors across folds, default is mean.
tauWeights: Weights for each quantile, only used in the group tau results (gtr).
printProgress: If set to TRUE will print which fold the process is working on.
weights: Weights for the quantile loss function. Used in both model fitting and cross-validation.
...: Additional parameters that will be sent to rq.group.pen().
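
Supplying foldid makes the fold split reproducible across calls. The sketch below builds a balanced random assignment in base R. It assumes foldid is an integer vector with one entry per observation taking values 1 to nfolds, which is not spelled out above, so treat that layout as an assumption.

```r
set.seed(1)
n <- 100       # number of observations (rows of x)
nfolds <- 10   # number of folds

# Repeat the fold labels 1..nfolds to length n, then shuffle them,
# so every fold receives n / nfolds observations.
# Assumed layout: one integer fold label per observation.
foldid <- sample(rep(seq_len(nfolds), length.out = n))
```

The resulting vector could then be passed as the foldid argument so that different penalties are compared on the same folds.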

Author

Ben Sherwood, ben.sherwood@ku.edu and Shaobo Li, shaobo.li@ku.edu

Details

Two cross validation results are returned. The first considers the best combination of a and lambda for each quantile. The second considers the best combination of the tuning parameters for all quantiles.

Let $y_{b,i}$, $x_{b,i}$, and $m_{b,i}$ denote the response, predictors, and weight of the ith observation in fold b. Let $\hat{\beta}_{\tau,a,\lambda}^{-b}$ be the estimator for a given quantile and tuning parameters that did not use the bth fold. Let $n_b$ be the number of observations in fold b. Then the cross validation error for fold b is $$\mbox{CV}(b,\tau) = \frac{1}{n_b} \sum_{i=1}^{n_b} m_{b,i} \rho_\tau(y_{b,i}-x_{b,i}^\top\hat{\beta}_{\tau,a,\lambda}^{-b}).$$ Note that $\rho_\tau()$ can be replaced by a different function by setting the cvFunc parameter.

The first summary is btr, the by tau results. It provides the values of lambda and a that minimize the average, or whatever function is used for cvSummary, of $\mbox{CV}(b,\tau)$ across the folds. In addition, it provides the sparsest solution that is within one standard error of the minimum results.
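
The fold error $\mbox{CV}(b,\tau)$ can be sketched in base R. This is a minimal illustration of the formula, not the package's internal code: check_loss and cv_fold_error are hypothetical helper names, the numbers are made up, and beta_hat stands in for an estimate that was fit without fold b (with the intercept assumed to be its first element).

```r
# Quantile (check) loss: rho_tau(r) = r * (tau - I(r < 0))
check_loss <- function(r, tau) {
  r * (tau - (r < 0))
}

# CV(b, tau): weighted check loss on the held-out fold b, divided by n_b.
# x_b: n_b x p predictor matrix, y_b: responses, m_b: observation weights,
# beta_hat: coefficients (intercept first) estimated without fold b.
cv_fold_error <- function(x_b, y_b, beta_hat, tau,
                          m_b = rep(1, length(y_b))) {
  fitted <- as.numeric(cbind(1, x_b) %*% beta_hat)
  resid <- y_b - fitted
  sum(m_b * check_loss(resid, tau)) / length(y_b)
}

# Toy held-out fold with made-up numbers
x_b <- matrix(c(1, 2, 3, 4), ncol = 1)
y_b <- c(2, 2, 5, 4)
beta_hat <- c(0, 1)   # fitted values are 1, 2, 3, 4

err <- cv_fold_error(x_b, y_b, beta_hat, tau = 0.25)
```

Replacing check_loss with another function of the residuals mirrors what setting cvFunc does in the description above.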

The other approach is the group tau results, gtr. Consider the case of estimating Q quantiles $\tau_1,\ldots,\tau_Q$ with quantile weights (tauWeights) $v_1,\ldots,v_Q$. The gtr returns the values of lambda and a that minimize the average, or again whatever function is used for cvSummary, of $$\sum_{q=1}^Q v_q\mbox{CV}(b,\tau_q).$$ If only one quantile is modeled then the gtr results can be ignored as they provide the same minimum solution as btr.
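
The weighted sum over quantiles can be sketched in base R for a single combination of a and lambda. The matrix cv_err and the weights below are made-up values for illustration, and mean plays the role of cvSummary; none of this is taken from the package internals.

```r
# cv_err[b, q]: hypothetical values of CV(b, tau_q) for one (a, lambda) pair,
# with 3 folds in the rows and 2 quantiles in the columns
cv_err <- matrix(c(0.40, 0.50, 0.60,
                   0.20, 0.30, 0.10), nrow = 3)

# v_q, playing the role of tauWeights
tau_weights <- c(1, 2)

# Weighted sum over quantiles within each fold: sum_q v_q * CV(b, tau_q)
fold_group_err <- as.numeric(cv_err %*% tau_weights)

# Summarize across folds, here with the mean
group_cv <- mean(fold_group_err)
```

Repeating this for every combination of a and lambda and taking the smallest group_cv gives the kind of selection that gtr reports.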

Examples

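set.seed(1)
# Simulate 100 observations of 8 predictors; only predictors 1, 3, and 8 affect y
x <- matrix(rnorm(100*8,sd=1),ncol=8)
y <- 1 + x[,1] + 3*x[,3] - x[,8] + rt(100,3)
# Group assignments: predictors 1-4, 5-6, and 7-8
g <- c(1,1,1,1,2,2,3,3)
tvals <- c(.25,.75)
# Not run
if (FALSE) {
library(rqPen)
# Default penalty of rq.group.pen()
m1 <- rq.group.pen.cv(x,y,tau=c(.1,.3,.7),groups=g)
# Other penalties are passed on to rq.group.pen() through ...
m2 <- rq.group.pen.cv(x,y,penalty="gAdLASSO",tau=c(.1,.3,.7),groups=g)
# gSCAD and gMCP with a grid of values for the second tuning parameter a
m3 <- rq.group.pen.cv(x,y,penalty="gSCAD",tau=c(.1,.3,.7),a=c(3,4,5),groups=g)
m4 <- rq.group.pen.cv(x,y,penalty="gMCP",tau=c(.1,.3,.7),a=c(3,4,5),groups=g)
}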
