cv.rq.pen: Cross Validated quantile regression

Description

Produces penalized quantile regression models for a range of lambdas and a penalty of choice. If lambda is not specified, an iterative algorithm is used to find a maximum lambda such that the penalty is large enough to produce an intercept-only model. The range of lambdas then goes from the maximum lambda found down to "eps" on the log scale. For non-convex penalties, the local linear approximation (LLA) approach of Zou and Li (2008) is used, as extended to the quantile regression setting by Wang, Wu and Li (2012).
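
The log-scale lambda grid described above can be sketched in base R. This is only an illustration, not the package's internal code: lambda_max is a hypothetical stand-in for the value the iterative search would return, while nlambda and eps take the defaults shown under Usage.

```r
# Hypothetical value; in cv.rq.pen the maximum lambda comes from an
# iterative search starting at init.lambda (not reproduced here).
lambda_max <- 2
nlambda <- 100   # default number of lambdas
eps <- 1e-4      # default smallest lambda

# Equally spaced on the log scale, from lambda_max down to eps.
lambda_grid <- exp(seq(log(lambda_max), log(eps), length.out = nlambda))

range(lambda_grid)
```

Spacing the grid on the log scale places more candidate values near the small lambdas, where the fitted models tend to change most quickly.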

Usage

cv.rq.pen(x,y,tau=.5,lambda=NULL,weights=NULL,penalty="LASSO",
          intercept=TRUE,criteria="CV",cvFunc="check",nfolds=10,
          foldid=NULL,nlambda=100,eps=.0001,init.lambda=1, penVars=NULL,
          alg = ifelse(ncol(x) < 50, "LP", "QICD"),...)

Value

Returns the following:

models: List of penalized models fit. The number of models matches the number of lambdas, and the models correspond to cv$lambda.
cv: Data frame whose first column is "lambda" and whose second column is the evaluation based on the criteria selected.
lambda.min: Lambda which provides the smallest statistic for the selected criteria.
penalty: Penalty selected.
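
As a rough illustration of how lambda.min relates to cv, the sketch below picks the lambda with the smallest criterion value from a made-up data frame. The values and the column name cve are assumptions for illustration only, so the second column is indexed by position rather than by name.

```r
# Made-up evaluation table with the same layout as the "cv" element:
# first column "lambda", second column the criterion value.
cv <- data.frame(lambda = c(1, 0.5, 0.1, 0.01),
                 cve    = c(0.92, 0.61, 0.48, 0.55))  # hypothetical name and values

# Lambda with the smallest statistic for the selected criterion.
lambda_min <- cv$lambda[which.min(cv[[2]])]

# Position of the matching model within a "models" list of the same length.
best_index <- which(cv$lambda == lambda_min)
```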

Arguments

x: Matrix of predictors.
y: Vector of response values.
tau: Conditional quantile being modelled.
lambda: Vector of lambdas. Default is for lambdas to be automatically generated.
weights: Weights for the objective function.
penalty: Type of penalty: "LASSO", "SCAD" or "MCP".
intercept: Whether model should include an intercept. Constant does not need to be included in "x".
criteria: How models will be evaluated. Either cross-validation "CV", BIC "BIC" or large P BIC "PBIC".
cvFunc: How errors are evaluated when cross-validation is used. Check function "check", squared error "SqErr" or absolute error "AE".
nfolds: K for K-folds cross-validation.
foldid: Group id for cross-validation. Function will randomly generate groups if not specified.
nlambda: Number of lambdas for which models are fit.
eps: Smallest lambda used.
init.lambda: Initial lambda used to find the maximum lambda. Not needed if lambda values are set.
penVars: Variables that should be penalized. With default value of NULL all variables are penalized.
alg: Algorithm that will be used, either linear programming ("LP") or the coordinate descent algorithm ("QICD") of Peng and Wang (2015).
...: Additional arguments to be sent to rq.lasso.fit or rq.nc.fit.
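
The three cvFunc options and the foldid argument can be sketched in base R. These are standard textbook definitions written for illustration, with hypothetical function names; the package's own implementations may differ in detail.

```r
# Check (quantile) loss: tau * r for r >= 0, (tau - 1) * r for r < 0.
check_loss <- function(r, tau = 0.5) r * (tau - (r < 0))
sq_err     <- function(r) r^2      # squared error
abs_err    <- function(r) abs(r)   # absolute error

# Random, balanced fold ids for n observations (one possible scheme).
set.seed(1)
n <- 100
nfolds <- 10
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Average check loss of some example residuals at tau = 0.25.
res <- c(-2, -1, 0, 1, 2)
mean(check_loss(res, tau = 0.25))
```

With tau below 0.5, the check loss penalizes negative residuals (over-prediction) more heavily than positive ones, which is what makes it a natural cross-validation error for a low quantile.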

Author

Ben Sherwood

References

[1] Peng, B. and Wang, L. (2015). An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression. Journal of Computational and Graphical Statistics, 24, 676--694.

[2] Wang, L., Wu, Y. and Li, R. (2012). Quantile regression for analyzing heterogeneity in ultra-high dimension. J. Am. Statist. Ass, 107, 214--222.

[3] Wu, Y. and Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19, 801--817.

[4] Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist., 36, 1509--1533.

Examples

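if (FALSE) {
library(rqPen)
set.seed(1)                            # reproducible simulated data
x <- matrix(rnorm(800),nrow=100)       # 100 observations, 8 predictors
y <- 1 + x[,1] - 3*x[,5] + rnorm(100)  # only predictors 1 and 5 are active
cv_model <- cv.rq.pen(x,y)             # defaults: tau=.5, LASSO, 10-fold CV
}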
