Fit solution paths for linear or logistic regression models penalized by lasso (alpha = 1) or elastic-net (1e-4 < alpha < 1) over a grid of values for the regularization parameter lambda.
COPY_biglasso_main(X, y.train, ind.train, ind.col, covar.train,
family = c("gaussian", "binomial"), alphas = 1, K = 10,
ind.sets = sample(rep_len(1:K, n)), nlambda = 200, lambda.min = if
(n > p) 1e-04 else 0.001, nlam.min = 50, n.abort = 10,
base.train = NULL, eps = 1e-05, max.iter = 1000, dfmax = 50000,
warn = FALSE, return.all = FALSE, ncores = 1)
Either "gaussian" (linear) or "binomial" (logistic).
The elastic-net mixing parameter that controls the relative
contribution from the lasso (l1) and the ridge (l2) penalty. The penalty is
defined as $$\alpha \|\beta\|_1 + \frac{1-\alpha}{2} \|\beta\|_2^2.$$
alpha = 1 is the lasso penalty, and alpha between 0 (in practice 1e-4) and 1
is the elastic-net penalty. Default is 1. You can pass multiple values;
only the one selected by grid search is used.
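As a minimal sketch of the penalty formula above, the following computes it for a given coefficient vector. The helper name enet_penalty is hypothetical and is not part of the package; it only illustrates how alpha trades off the l1 and l2 terms.

```r
# Hypothetical helper (not a package function): evaluates
# alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2
enet_penalty <- function(beta, alpha) {
  stopifnot(alpha > 0, alpha <= 1)
  alpha * sum(abs(beta)) + (1 - alpha) / 2 * sum(beta^2)
}

beta <- c(0.5, -2, 0)
pen_lasso <- enet_penalty(beta, alpha = 1)    # lasso: l1 term only
pen_enet  <- enet_penalty(beta, alpha = 0.5)  # elastic-net: mix of l1 and l2 terms
```

With alpha = 1 the l2 term vanishes, so the value is simply the sum of absolute coefficients.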
Number of sets used in the Cross-Model Selection and Averaging
(CMSA) procedure. Default is 10.
Integer vector of values between 1 and K specifying which set each
index of the training set is in. Default randomly assigns these values.
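The default shown in the usage, sample(rep_len(1:K, n)), can be tried on its own. The sketch below assumes n is the number of training observations; the value 23 and the seed are illustrative only.

```r
set.seed(1)  # only to make this sketch reproducible
n <- 23      # hypothetical number of training observations
K <- 10

# Repeat 1..K until n labels exist, then shuffle them
ind.sets <- sample(rep_len(1:K, n))
set_sizes <- table(ind.sets)  # sizes of the K sets
```

Because rep_len() cycles through 1..K before shuffling, the set sizes differ by at most one.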
The number of lambda values. Default is 200.
The smallest value for lambda, as a fraction of lambda.max. Default
is .0001 if the number of observations is larger than the number of
variables and .001 otherwise.
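To make the roles of nlambda and lambda.min concrete, here is a rough sketch of how such a grid could be built. It assumes the values are equally spaced on the log scale, which this page does not state, and it uses a made-up lambda.max; in practice that value is derived from the data.

```r
n <- 500; p <- 10000                     # hypothetical dimensions
lambda.min <- if (n > p) 1e-4 else 1e-3  # default rule from the usage
nlambda <- 200
lambda.max <- 0.8                        # hypothetical value for illustration

# Assumption: decreasing grid, equally spaced on the log scale,
# from lambda.max down to lambda.min * lambda.max
lambdas <- exp(seq(log(lambda.max), log(lambda.min * lambda.max),
                   length.out = nlambda))
```

Here n < p, so lambda.min is 0.001 and the smallest lambda in the grid is 0.001 * lambda.max.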
Minimum number of lambda values to investigate. Default is 50.
Number of lambda values for which prediction on the validation
set must decrease before stopping. Default is 10.
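One plausible reading of how nlam.min and n.abort interact is sketched below. This is an illustration only, not the package's implementation: stop_index is a hypothetical helper, val_loss is an invented vector of validation losses (one per lambda), and "prediction decreases" is interpreted as "the loss fails to improve on the best value seen so far".

```r
# Hypothetical helper: index of the lambda at which the path would stop
stop_index <- function(val_loss, nlam.min = 50, n.abort = 10) {
  best <- Inf
  n_worse <- 0
  for (l in seq_along(val_loss)) {
    if (val_loss[l] < best) {
      best <- val_loss[l]   # new best validation loss: reset the counter
      n_worse <- 0
    } else {
      n_worse <- n_worse + 1
    }
    # Stop only once enough lambdas were tried AND the loss stopped improving
    if (l >= nlam.min && n_worse >= n.abort) return(l)
  }
  length(val_loss)
}

# Invented losses: improve for 30 lambdas, then get steadily worse
val_loss <- c(seq(1, 0.5, length.out = 30), seq(0.51, 0.9, length.out = 70))
stop_default <- stop_index(val_loss)                # nlam.min = 50 applies
stop_early   <- stop_index(val_loss, nlam.min = 5)  # n.abort = 10 applies
```

In this toy example the defaults stop at the 50th lambda because nlam.min is the binding constraint, whereas with nlam.min = 5 the path stops at the 40th lambda, ten values after the best one.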
Convergence threshold for inner coordinate descent.
The algorithm iterates until the maximum change in the objective after any
coefficient update is less than eps times the null deviance.
Default value is 1e-5.
Maximum number of iterations. Default is 1000.
Upper bound for the number of nonzero coefficients. Default is 50e3
because, for large data sets, the computational burden may be heavy for
models with a large number of nonzero coefficients.
Return warning messages for failures to converge and model
saturation? Default is FALSE.
Whether to return coefficients for all alpha and lambda values.
Default is FALSE, which returns only the coefficients that maximize
prediction on the validation sets.
The objective function for linear regression (family = "gaussian") is
$$\frac{1}{2n}\textrm{RSS} + \textrm{penalty},$$ and for logistic regression
(family = "binomial") it is
$$-\frac{1}{n}\textrm{loglike} + \textrm{penalty}.$$
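The two objectives can be written out directly, as in the sketch below. The helpers obj_gaussian and obj_binomial are hypothetical names, not package functions; the penalty is passed in as an already computed number, and the logistic log-likelihood is assumed to be the usual Bernoulli one with y coded as 0/1.

```r
# Hypothetical helpers, for illustration only

# Linear regression: RSS / (2n) + penalty
obj_gaussian <- function(y, pred, penalty) {
  sum((y - pred)^2) / (2 * length(y)) + penalty
}

# Logistic regression: -loglike / n + penalty, with y in {0, 1}
# and prob the fitted probabilities of y = 1
obj_binomial <- function(y, prob, penalty) {
  loglike <- sum(y * log(prob) + (1 - y) * log(1 - prob))
  -loglike / length(y) + penalty
}

val_gaussian <- obj_gaussian(y = c(1, 2, 3), pred = c(1, 2, 5), penalty = 0.1)
val_binomial <- obj_binomial(y = c(1, 0), prob = c(0.5, 0.5), penalty = 0)
```

In the logistic example every fitted probability is 0.5, so the average negative log-likelihood is log(2).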