SL.glmnet: Elastic net regression, including lasso and ridge

Description

Penalized regression using elastic net. Alpha = 0 corresponds to ridge regression and alpha = 1 corresponds to Lasso.

See vignette("glmnet_beta", package = "glmnet") for a nice tutorial on glmnet.

Usage

SL.glmnet(Y, X, newX, family, obsWeights, id, alpha = 1, nfolds = 10,
  nlambda = 100, useMin = TRUE, loss = "deviance", ...)

Arguments

Outcome variable

Covariate dataframe

newX

Dataframe to predict the outcome

family

"gaussian" for regression, "binomial" for binary classification. Untested options: "multinomial" for multiple classification or "mgaussian" for multiple response, "poisson" for non-negative outcome with proportional mean and variance, "cox".

obsWeights

Optional observation-level weights

Optional id to group observations from the same unit (not used currently).

alpha

Elastic net mixing parameter, range [0, 1]. 0 = ridge regression and 1 = lasso.

nfolds

Number of folds for internal cross-validation to optimize lambda.

nlambda

Number of lambda values to check, recommended to be 100 or more.

useMin

If TRUE use lambda that minimizes risk, otherwise use 1 standard-error rule which chooses a higher penalty with performance within one standard error of the minimum (see Breiman et al. 1984 on CART for background).

loss

Loss function, can be "deviance", "mse", or "mae". If family = binomial can also be "auc" or "class" (misclassification error).

...

Any additional arguments are passed through to cv.glmnet.

References

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1), 1.

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 267-288.

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320.

Examples

Run this code

# NOT RUN {
# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")
data = PimaIndiansDiabetes2

# Omit observations with missing data.
data = na.omit(data)

Y = as.numeric(data$diabetes == "pos")
X = subset(data, select = -diabetes)

set.seed(1, "L'Ecuyer-CMRG")

sl = SuperLearner(Y, X, family = binomial(),
                  SL.library = c("SL.mean", "SL.glm", "SL.glmnet"))
sl

# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

References

See Also

Examples