ipflasso (version 1.1)

cvr.ipflasso: Cross-validated integrative lasso with fixed penalty factors

Description

Runs cvr.glmnet giving different penalty factors to predictors from different blocks.

Usage

cvr.ipflasso(X, Y, family, type.measure, standardize=TRUE, alpha=1, blocks, pf, nfolds,
  ncv)

Arguments

X

an (n x p) matrix of predictors with observations in rows and predictors in columns

Y

n-vector giving the value of the response (either continuous, numeric-binary 0/1, or Surv object)

family

should be "gaussian" for continuous Y, "binomial" for binary Y, "cox" for Y of type Surv

type.measure

The accuracy/error measure computed in cross-validation. If not specified, type.measure is "class" (classification error) if family="binomial", "mse" (mean squared error) if family="gaussian" and partial likelihood if family="cox". If family="binomial", one may specify type.measure="auc" (area under the ROC curve).

standardize

whether the predictors should be standardized or not. Default is TRUE.

alpha

the elastic net mixing parameter: alpha=1 yields the L1 penalty (lasso), alpha=0 yields the L2 penalty (ridge). Default is alpha=1 (lasso).

blocks

a list of length M in the format list(block1=...,block2=...), where the dots should be replaced by the indices of the predictors included in each block. The blocks should form a partition of 1:p.

pf

a vector of length equal to the number of blocks M. Each entry contains the penalty factor to be applied to the predictors of the corresponding block. Example: if pf=c(1,2), the penalty applied to the predictors of the 2nd block is twice as large as the penalty applied to the predictors of the first block.

nfolds

the number of folds of the CV procedure.

ncv

the number of repetitions of CV. Not to be confused with nfolds. For example, if one repeats 50 times 5-fold-CV (i.e. considers 50 random partitions into 5 folds in turn and averages the results), nfolds equals 5 and ncv equals 50.
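
As an illustration of how blocks and pf relate, the sketch below expands block-level penalty factors into one factor per predictor, which is the shape that glmnet's penalty.factor argument expects. It uses base R only; the name penalty_per_predictor and the sizes are assumptions chosen for illustration, and the package's internal code may do this differently.

```r
# Hypothetical setup: p = 200 predictors split into two blocks
p <- 200
blocks <- list(block1 = 1:50, block2 = 51:200)
pf <- c(1, 2)

# The blocks should form a partition of 1:p, with one penalty factor per block
stopifnot(identical(sort(unlist(blocks, use.names = FALSE)), seq_len(p)))
stopifnot(length(pf) == length(blocks))

# Expand the block-level factors to one factor per predictor
penalty_per_predictor <- numeric(p)
for (m in seq_along(blocks)) {
  penalty_per_predictor[blocks[[m]]] <- pf[m]
}

# Count how many predictors receive each penalty factor
table(penalty_per_predictor)
```

With pf=c(1,2), the predictors in block2 are penalized twice as heavily as those in block1, as described for the pf argument.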

Value

A list with the following components:

coeff

the matrix of coefficients with predictors corresponding to rows and lambda values corresponding to columns. The first row contains the intercept of the model (for all families other than "cox").

ind.bestlambda

the index of the best lambda according to CV.

lambda

the lambda sequence.

cvm

the CV estimate of the measure specified by type.measure for each candidate lambda value.

nzero

the number of non-zero coefficients in the selected model.

family

See arguments.
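
The components above can be combined to read off the selected model. The sketch below does not call the package: res is a hand-built stand-in with made-up values that only mimics the documented layout (intercept in the first row of coeff, one column per lambda), so it should be read as an assumption about typical use rather than real output.

```r
# Hand-built stand-in for a cvr.ipflasso result (hypothetical values):
# an intercept row plus 3 predictors, and 4 candidate lambda values
res <- list(
  coeff = matrix(
    c(0.5, 0.0, 0.0, 0.0,
      0.4, 0.3, 0.0, 0.0,
      0.3, 0.6, 0.0, 0.1,
      0.2, 0.8, 0.2, 0.3),
    nrow = 4,
    dimnames = list(c("(Intercept)", "x1", "x2", "x3"), NULL)
  ),
  ind.bestlambda = 3,
  lambda = c(0.4, 0.2, 0.1, 0.05),
  cvm = c(0.50, 0.42, 0.38, 0.41),
  family = "binomial"
)

best <- res$ind.bestlambda
best_lambda <- res$lambda[best]    # lambda chosen by CV
best_cvm <- res$cvm[best]          # CV estimate of the measure at that lambda
best_coef <- res$coeff[, best]     # coefficient column for the chosen lambda

# Drop the intercept (first row) and keep predictors with non-zero coefficients
selected <- names(best_coef)[-1][best_coef[-1] != 0]
selected
```

For family="cox" there is no intercept row, so the [-1] step would have to be dropped.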

References

Boulesteix AL, De Bin R, Jiang X, Fuchs M, 2017. IPF-lasso: integrative L1-penalized regression with penalty factors for prediction based on multi-omics data. Comput Math Methods Med 2017:7691937.

Examples

# NOT RUN {
# load ipflasso library
library(ipflasso)

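# generate dummy data: 50 observations, 200 predictors, binary response
set.seed(1)  # fix the seed so the simulated data are reproducible
X <- matrix(rnorm(50*200), 50, 200)
Y <- rbinom(50, 1, 0.5)

# 10 repetitions of 5-fold CV; block2 is penalized twice as heavily as block1
cvr.ipflasso(X=X, Y=Y, family="binomial", standardize=FALSE,
             blocks=list(block1=1:50, block2=51:200),
             pf=c(1,2), nfolds=5, ncv=10, type.measure="class")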
# }
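
To make the difference between nfolds and ncv in the call above concrete, the following base-R sketch draws ncv random partitions of the observations into nfolds folds. It only illustrates the idea of repeated CV; the variable fold_ids is hypothetical, and how the package itself draws its folds is not documented here.

```r
set.seed(1)
n <- 50        # number of observations, as in the dummy data
nfolds <- 5    # folds per CV run
ncv <- 10      # number of repeated CV runs

# One column per repetition; each column assigns every observation to a fold
fold_ids <- replicate(ncv, sample(rep(seq_len(nfolds), length.out = n)))

dim(fold_ids)          # n rows, ncv columns
table(fold_ids[, 1])   # fold sizes in the first repetition
```

Each column is one random partition; a repeated CV procedure would compute the error measure on each partition and average the results across the ncv columns.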
