Runs cvr.glmnet with different penalty factors for predictors from different blocks, and chooses the best penalty factors by cross-validation from the candidate list pflist.
cvr2.ipflasso(X, Y, family, type.measure, standardize=TRUE,
alpha=1, blocks, pflist, nfolds, ncv,
nzeromax = +Inf, plot=FALSE)
a (nxp) matrix of predictors with observations in rows and predictors in columns
n-vector giving the value of the response (either continuous, numeric-binary 0/1, or Surv object)
should be "gaussian" for continuous Y, "binomial" for binary Y, "cox" for Y of type Surv
The accuracy/error measure computed in cross-validation. If not specified, type.measure is "class" (classification error) if family="binomial", "mse" (mean squared error) if family="gaussian" and partial likelihood if family="cox". If family="binomial", one may specify type.measure="auc" (area under the ROC curve).
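To make the difference between "class" and "auc" concrete, here is a minimal base-R sketch on made-up predictions. The vectors y and prob and the 0.5 classification threshold are assumptions for illustration; this is not the code the package runs internally.

```r
# Made-up binary outcomes and predicted probabilities (illustration only).
y    <- c(0, 0, 1, 1, 1)
prob <- c(0.2, 0.6, 0.4, 0.7, 0.9)

# Classification error: share of observations misclassified at threshold 0.5.
class_error <- mean((prob > 0.5) != y)

# AUC via the rank-sum (Mann-Whitney) identity.
r   <- rank(prob)
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

class_error
auc
```

Classification error depends on the chosen threshold, whereas AUC only depends on how the predictions are ranked.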
whether the predictors should be standardized or not. Default is TRUE.
the elastic net mixing parameter: alpha=1 yields the L1 penalty (lasso), alpha=0 yields the L2 penalty (ridge). Default is alpha=1 (lasso).
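The role of alpha can be sketched with the elastic net penalty in the parameterization documented for glmnet. The helper enet_penalty below is hypothetical and only evaluates the penalty term for a given coefficient vector; it does not fit a model.

```r
# Hypothetical helper: elastic net penalty
#   lambda * ( (1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)) )
enet_penalty <- function(beta, lambda, alpha = 1) {
  lambda * sum((1 - alpha) / 2 * beta^2 + alpha * abs(beta))
}

beta <- c(1, -2, 0)
enet_penalty(beta, lambda = 0.5, alpha = 1)   # pure L1 (lasso) penalty
enet_penalty(beta, lambda = 0.5, alpha = 0)   # pure L2 (ridge) penalty
```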
a list of length M of the format list(block1=...,block2=...,...), where the dots should be replaced by the indices of the predictors included in this block. The blocks should form a partition of 1:p.
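Because blocks must form a partition of 1:p, it can be worth checking a candidate list before passing it on. The helper is_partition below is a hypothetical sketch and not a function of the ipflasso package.

```r
# Hypothetical helper: TRUE if every index in 1:p appears in exactly one block.
is_partition <- function(blocks, p) {
  idx <- unlist(blocks, use.names = FALSE)
  length(idx) == p && all(sort(idx) == seq_len(p))
}

blocks <- list(block1 = 1:50, block2 = 51:200)
is_partition(blocks, p = 200)                                 # TRUE
is_partition(list(block1 = 1:50, block2 = 50:200), p = 200)   # FALSE: index 50 appears twice
```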
a list of candidate penalty factors (see the argument pf of the function cvr.ipflasso) of the format pflist=list(c(1,1),c(1,2),c(2,1),...).
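Each candidate in pflist holds one penalty factor per block. A plausible way to picture its effect is to expand it into one factor per predictor, which is the shape expected by the penalty.factor argument of glmnet. The helper expand_pf is hypothetical, and whether ipflasso performs exactly this expansion internally is an assumption.

```r
blocks <- list(block1 = 1:50, block2 = 51:200)
pflist <- list(c(1, 1), c(1, 2), c(2, 1))

# Hypothetical helper: repeat each block's penalty factor for all its predictors.
expand_pf <- function(pf, blocks) {
  stopifnot(length(pf) == length(blocks))
  out <- numeric(length(unlist(blocks)))
  for (m in seq_along(blocks)) out[blocks[[m]]] <- pf[m]
  out
}

# Second candidate: factor 1 for block1, factor 2 for block2.
per_predictor <- expand_pf(pflist[[2]], blocks)
table(per_predictor)
```

With the candidate c(1,2), predictors in the second block are penalized twice as strongly as those in the first.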
the number of folds of the CV procedure.
the number of repetitions of CV. Not to be confused with nfolds. For example, if one repeats 5-fold CV 50 times (i.e. considers 50 random partitions into 5 folds in turn and averages the results), nfolds equals 5 and ncv equals 50.
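The distinction between nfolds and ncv can be illustrated by generating the fold assignments by hand. This base-R sketch only mimics the idea of repeated random partitions; the way the package draws its folds internally may differ.

```r
set.seed(1)
n      <- 50   # number of observations
nfolds <- 5    # folds per partition
ncv    <- 50   # number of random partitions

# One column per repetition: a random assignment of the n observations to folds.
foldids <- replicate(ncv, sample(rep(seq_len(nfolds), length.out = n)))

dim(foldids)          # n rows, ncv columns
table(foldids[, 1])   # fold sizes within the first repetition
```

Each column is one complete nfolds-fold partition, and CV errors would be averaged over the ncv columns.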
the maximal number of predictors allowed in the final model. Default is +Inf, i.e. the best model is selected based on CV without restriction.
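The effect of nzeromax can be sketched on mock CV results. The numbers below are invented, and the selection rule in best_index (lowest CV error among models that respect the size limit) is an assumption about how such a restriction is typically applied, not a transcription of the package code.

```r
# Mock CV results for five lambda values (illustrative numbers, not real output).
cvm   <- c(0.40, 0.31, 0.27, 0.25, 0.26)   # CV error per lambda
nzero <- c(0, 3, 8, 15, 22)                # number of non-zero coefficients per lambda

# Hypothetical rule: lowest CV error among models with at most nzeromax predictors.
best_index <- function(cvm, nzero, nzeromax = Inf) {
  allowed <- which(nzero <= nzeromax)
  allowed[which.min(cvm[allowed])]
}

best_index(cvm, nzero)                  # 4: unrestricted minimum
best_index(cvm, nzero, nzeromax = 10)   # 3: best model with at most 10 predictors
```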
If plot=TRUE, the function outputs plots of CV errors and number of included predictors for each block.
A list with the following components:
the matrix of coefficients obtained with the best combination of penalty factors, with covariates corresponding to rows and lambda values corresponding to columns. The first row contains the intercept of the model.
the index of the best lambda as selected by CV for the best combination of penalty factors.
the best lambda as selected by CV for the best combination of penalty factors.
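Given a coefficient matrix with covariates in rows, lambda values in columns and the intercept in the first row, the selected model can be read off with the best-lambda index. The objects coeff and ind_bestlambda below are mock stand-ins with assumed names; they are not taken from an actual call to the function.

```r
# Mock coefficient matrix: 4 rows (intercept + 3 predictors), 3 lambda values.
coeff <- matrix(c(0.5, 0,   0, 0,
                  0.4, 1.2, 0, 0,
                  0.3, 1.5, 0, -0.7),
                nrow = 4,
                dimnames = list(c("(Intercept)", "x1", "x2", "x3"), NULL))
ind_bestlambda <- 2   # assumed index of the lambda selected by CV

best   <- coeff[, ind_bestlambda]   # coefficients at the selected lambda
slopes <- best[-1]                  # drop the intercept stored in the first row
names(slopes)[slopes != 0]          # predictors kept in the selected model
```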
the index of the best penalty factor selected by CV from the list of candidates pflist.
the CV error for each candidate lambda value, averaged over the ncv runs of cv.glmnet.
a list of length length(pflist) containing the outputs of the function cvr.ipflasso for all candidate penalty factors from pflist.
See arguments.
Boulesteix AL, De Bin R, Jiang X, Fuchs M, 2017. IPF-lasso: integrative L1-penalized regression with penalty factors for prediction based on multi-omics data. Comput Math Methods Med 2017:7691937.
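# NOT RUN {
# load ipflasso library
library(ipflasso)
# generate dummy data: 50 observations, 200 predictors, binary response
X <- matrix(rnorm(50 * 200), 50, 200)
Y <- rbinom(50, 1, 0.5)
# compare three candidate penalty-factor combinations for two blocks,
# using 10 repetitions of 5-fold CV
cvr2.ipflasso(X = X, Y = Y, family = "binomial", type.measure = "class",
              standardize = FALSE,
              blocks = list(block1 = 1:50, block2 = 51:200),
              pflist = list(c(1, 1), c(1, 2), c(2, 1)), nfolds = 5, ncv = 10)
# }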