Input matrix of dimension n * p; each of the n rows is an observation vector of p variables. The intercept should be included in the first column as (1,...,1). If it is missing, it is added automatically.
Y
Response variable of length n.
mu
Positive regularization sequence to be used for the Lasso.
m
Number of bootstrap iterations of the Lasso. Default is m=100.
probaseuil
A frequency threshold for selecting the most stable variables over the m bootstrap iterations of the Lasso. Default is 1.
penalty.factor
Separate penalty factors can be applied to each coefficient. This is a number that multiplies lambda to allow differential shrinkage. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is 1 for all variables except the intercept.
random
Optional parameter: a matrix of size n*m. If random is provided, the m bootstrap samples are constructed from its m columns.
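A minimal sketch of how the two optional arguments might be prepared. It assumes that each column of random holds resampled row indices (the description above does not specify the encoding) and that the intercept in column 1 receives a penalty factor of 0; the values of n, p and m are purely illustrative.

```r
set.seed(42)
n <- 20   # number of observations (illustrative)
p <- 6    # number of columns of the input matrix, intercept included
m <- 5    # number of bootstrap iterations

# Assumption: the intercept (column 1) is left unpenalized.
penalty.factor <- c(0, rep(1, p - 1))
penalty.factor[3] <- 0   # no shrinkage: column 3 is always kept

# Assumption: column j of `random` lists the rows drawn, with
# replacement, for the j-th bootstrap sample.
random <- replicate(m, sample(n, n, replace = TRUE))
dim(random)   # n rows, m columns
```

Supplying the same random matrix to several calls would make the bootstrap samples identical across runs, which may help when comparing different values of mu.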
Value
A plot method is available for the returned object.
data
A list containing:
Y - the input response vector
means.X - Vector of means of the input data matrix.
sigma.X - Vector of variances of the input data matrix.
ind
Set of selected variables for the regularization mu and the threshold probaseuil.
frequency
Appearance frequency of each variable: the number of times each variable is selected over the m bootstrap iterations.
Details
The Lasso from the glmnet package is fitted with the regularization parameter mu on each of m bootstrap samples. This yields an appearance frequency for each variable, which indicates its predictive power. The frequency is the number of times a variable has been selected by the Lasso over the m bootstrap iterations.
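The procedure can be sketched directly with glmnet. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, the intercept is left to glmnet instead of being a column of X, a single value of mu is used, and a variable is kept when its count reaches probaseuil * m.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 10; m <- 50
X <- matrix(rnorm(n * p), n, p)                      # simulated predictors
Y <- as.numeric(X[, 1:2] %*% c(3, -2) + rnorm(n))    # only columns 1 and 2 matter
mu <- 0.2          # single regularization value (illustrative)
probaseuil <- 1    # keep variables selected in every bootstrap sample

selected <- matrix(FALSE, nrow = m, ncol = p)
for (b in seq_len(m)) {
  idx <- sample(n, n, replace = TRUE)                # bootstrap sample of the rows
  fit <- glmnet(X[idx, ], Y[idx], lambda = mu)       # Lasso at penalty mu
  beta <- as.matrix(coef(fit))[-1, 1]                # drop glmnet's intercept
  selected[b, ] <- beta != 0
}

frequency <- colSums(selected)             # selections per variable, out of m
ind <- which(frequency >= probaseuil * m)  # assumed thresholding rule
```

With probaseuil = 1 only variables selected in all m samples are retained; lowering the threshold trades stability for a larger selected set.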
References
Model-consistent sparse estimation through the bootstrap; F. Bach 2009