Coxnet, loCoxnet: Fit a Cox Model with Various Regularization Forms

Description

Coxnet fits a Cox model regularized with a net, elastic-net or lasso penalty, and their adaptive forms, such as the adaptive lasso and a net penalty that adjusts for the signs of linked coefficients. Moreover, it treats the number of non-zero coefficients as another tuning parameter and selects it simultaneously with the regularization parameter lambda. loCoxnet fits a varying-coefficient Cox model by kernel smoothing, combined with the aforementioned penalties. The package uses a one-step coordinate descent algorithm and gains speed by taking into account the sparsity structure of the coefficients.

Usage

Coxnet(x, y, Omega = NULL, penalty = c("Lasso", "Enet", "Net"), 
  alpha = 1, lambda = NULL, nlambda = 50, rlambda = NULL, nfolds = 1, foldid = NULL,
  inzero = TRUE, adaptive = c(FALSE, TRUE), aini = NULL, isd = FALSE,
  ifast = TRUE, keep.beta = FALSE, thresh = 1e-6, maxit = 1e+5)

loCoxnet(x, y, w, w0 = NULL, h = NULL, hnext = NULL, Omega = NULL,
  penalty = c("Lasso", "Enet", "Net"), alpha = 1, lambda = NULL,
  nlambda = 50, rlambda = NULL, nfolds = 1, foldid = NULL,
  adaptive = c(FALSE, TRUE), aini = NULL, isd = FALSE, keep.beta = FALSE,
  thresh = 1e-6, thresh2 = 1e-10, maxit = 1e+5)

Arguments

x

input matrix. Each row is an observation vector.

y

response variable. y should be a two-column matrix with columns named `time' and `status'. The latter is a binary variable, with `1' indicating an event and `0' indicating right censoring.

w

input vector, same length as y. The coefficients vary with w.

w0

local evaluation points. The estimates are evaluated at these local values w0. If w0 = NULL, w0 is generated as 10 equally spaced points in the range of w.

h

bandwidth.

hnext

an increase in bandwidth h. Default is 0.01.

Omega

correlation/adjacency matrix with zero diagonal, used for penalty = "Net" to calculate the Laplacian matrix.

penalty

penalty type. Can be "Net", "Enet" or "Lasso". For "Net", Omega must be specified; otherwise, "Enet" is performed.

alpha

ratio between L1 and Laplacian for "Net", or between L1 and L2 for "Enet". Can take values from zero to one, inclusive. For penalty = "Net", the penalty is defined as $$\lambda\left\{\alpha\|\beta\|_1+\frac{1-\alpha}{2}\,\beta^{T}L\beta\right\},$$ where $

lambda

a user-supplied decreasing sequence. If lambda = NULL, a sequence of lambda values is generated based on nlambda and rlambda. Supplying a value of lambda overrides this.

nlambda

number of lambda values. Default is 50.

rlambda

fraction of lambda.max used to determine the smallest value of lambda. The default is rlambda = 0.0001 when the number of observations is larger than or equal to the number of variables; otherwise, rlambda = 0.01.

nfolds

number of folds. With the defaults nfolds = 1 and foldid = NULL, cross-validation is not performed. For cross-validation, the smallest allowable value is nfolds = 3. Specifying foldid overrides this.

foldid

an optional vector of values between 1 and nfolds specifying which fold each observation is in.

inzero

logical flag for simultaneously tuning the number of non-zero coefficients with lambda. Default is inzero = TRUE.

adaptive

logical flags for adaptive version. Default is adaptive = c(FALSE, TRUE). The first element is for adaptive on $\beta$ in L1 and the second for adjusting for signs of linked coefficients in Laplacian matrix.

aini

a user supplied initial estimate of $\beta$. It is a list including wbeta for adaptive L1 and sgn for adjusting Laplacian matrix. wbeta is the absolute value of inverse initial estimates. If aini = NULL

isd

logical flag for outputting standardized coefficients. x is always standardized prior to fitting the model. Default is isd = FALSE, returning $\beta$ on the original scale.

ifast

logical flag for efficient calculation of risk set updates. Default is ifast = TRUE.

keep.beta

logical flag for returning estimates for all lambda values. For keep.beta = FALSE, only return the estimate with the largest cross-validation partial likelihood.

thresh

convergence threshold for coordinate descent. Default value is 1E-6.

thresh2

threshold for removing very small lambda values for local methods. The algorithm computes along a sequence of lambda values until any absolute value of the second derivative is smaller than thresh2. The estimates are r

maxit

Maximum number of iterations for coordinate descent. Default is 10^5.
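
As a rough illustration of the "Net" penalty described under alpha, the base R sketch below evaluates the penalty value for a given coefficient vector. The helpers laplacian_from_omega and net_penalty are hypothetical and not part of the package. The Laplacian is assumed here to be the unnormalized form D - |Omega|, with D the diagonal matrix of row sums of |Omega|; the package may use a different normalization.

```r
# Hypothetical helper (not from the package): unnormalized graph Laplacian
# built from a zero-diagonal correlation/adjacency matrix. This form of the
# Laplacian is an assumption made for illustration.
laplacian_from_omega <- function(Omega) {
  A <- abs(Omega)
  diag(A) <- 0
  diag(rowSums(A)) - A
}

# Hypothetical helper: lambda * { alpha * ||beta||_1 + (1 - alpha)/2 * beta' L beta }
net_penalty <- function(beta, L, lambda, alpha) {
  l1 <- sum(abs(beta))
  quad <- as.numeric(crossprod(beta, L %*% beta))
  lambda * (alpha * l1 + (1 - alpha) / 2 * quad)
}

# Toy chain graph on three coefficients: 1 - 2 - 3
Omega <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)
L <- laplacian_from_omega(Omega)
beta <- c(1, -1, 0.5)

pen_lasso <- net_penalty(beta, L, lambda = 0.1, alpha = 1)    # L1 term only
pen_net   <- net_penalty(beta, L, lambda = 0.1, alpha = 0.5)  # L1 plus Laplacian term
```

With alpha = 1 the quadratic term drops out and the penalty reduces to the lasso penalty; smaller alpha puts more weight on differences between linked coefficients.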

Value

Coxnet outputs an object with S3 class "Coxnet".

Beta

estimated coefficients.

Beta0

coefficients after tuning the number of non-zeros, for inzero = TRUE.

fit

a data.frame containing lambda and the number of non-zero coefficients nzero. For cross-validation, additional results are reported, such as the average cross-validation partial likelihood cvm and its standard error cvse, and index, with max indicating the largest cvm.

fit0

a data.frame containing lambda, cvm and nzero based on inzero = TRUE.

lambda.max

value of lambda that gives the maximum cvm.

lambda.opt

value of lambda based on inzero = TRUE.

cv.nzero

cvm with length equal to the number of non-zero components of Beta0. The kth value of cv.nzero corresponds to retaining the k largest non-zero coefficients (in absolute value) in Beta0. The optimal number of non-zeros is selected by the maximum value of cv.nzero at lambda = lambda.opt.

penalty

penalty type.

adaptive

logical flags for adaptive version (see above).

flag

convergence flag (for internal debugging). flag = 0 means converged.

loCoxnet outputs an object with S3 classes "Coxnet" and "loCoxnet".

Beta

a list of estimated coefficients with length equal to the number of lambda values. If there is more than one w0 value, each element of the list is a matrix with p rows and as many columns as the length of w0. If there is one w0, Beta is a matrix rather than a list, with p rows and nlambda columns.

fit

a data.frame containing lambda and the number of non-zero coefficients nzero. For cross-validation, additional results are reported, such as the average cross-validation partial likelihood cvm and its standard error cvse, and index, with max indicating the largest cvm.

lambda.max

value of lambda that gives the maximum cvm.

cvh

a data.frame containing bandwidth, cvm and cvse.

penalty

penalty type.

adaptive

logical flags for adaptive version (see above).

flag

convergence flag (for internal debugging). flag = 0 means converged.
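
The description of cv.nzero above can be pictured with a small base R sketch: for each k, keep the k largest coefficients in absolute value and zero out the rest, then pick the k with the largest cross-validated value. The helper keep_k_largest and all numbers below are made up for illustration; this is not the package's internal code.

```r
# Hypothetical helper: keep the k largest coefficients (absolute value),
# set all others to zero.
keep_k_largest <- function(beta, k) {
  out <- numeric(length(beta))
  if (k > 0) {
    idx <- order(abs(beta), decreasing = TRUE)[seq_len(k)]
    out[idx] <- beta[idx]
  }
  out
}

beta0 <- c(0.8, 0, -1.2, 0.05, 0.3)            # made-up estimate with 4 non-zeros
cv_nzero <- c(-210.4, -205.1, -204.7, -205.9)  # made-up cvm for k = 1, ..., 4

k_opt <- which.max(cv_nzero)             # number of non-zeros with the largest cvm
beta_opt <- keep_k_largest(beta0, k_opt) # coefficients retained at that k
```

In this toy case the smallest non-zero coefficient (0.05) is dropped because retaining it does not improve the made-up cross-validated value.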

Details

A one-step coordinate descent algorithm is applied for each lambda. ifast = TRUE adopts an efficient way to update the risk set, and the algorithm sometimes ends before all nlambda values of lambda have been evaluated. To evaluate small values of lambda, use ifast = FALSE. The two methods affect only the efficiency of the algorithm, not the estimates.

Cross-validation partial likelihood is used for tuning parameters. For inzero = TRUE, we further select the number of non-zero coefficients obtained from the regularized Cox model at each lambda. This is motivated by formulating L0 variable selection in ADMM form.

For varying-coefficient methods, the bandwidth is selected by cross-validation. We recommend checking whether a small increase of h, say h + hnext, improves the current cvm.
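
For cross-validation with a fixed fold assignment, foldid needs to be a vector of values between 1 and nfolds, one per observation. One possible way to build such a vector in base R is sketched below. The page lists coxsplit under See Also, but its interface is not described here, so the sketch does not rely on it, and the balanced random assignment is an assumption rather than the package's own scheme.

```r
set.seed(1)
N <- 100      # number of observations (illustrative)
nfolds <- 10  # number of folds; the page states 3 is the smallest allowable value

# Repeat the fold labels 1..nfolds to length N, then shuffle them so that
# each observation gets a random fold and fold sizes stay balanced.
foldid <- sample(rep(seq_len(nfolds), length.out = N))

fold_sizes <- table(foldid)
```

Passing the same foldid to repeated fits keeps the folds identical across calls, which makes cvm values comparable, for example when comparing bandwidths h and h + hnext.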

References

Friedman, J., Hastie, T. and Tibshirani, R. (2008) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22, Feb 2010.
http://www.jstatsoft.org/v33/i01/

Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13.
http://www.jstatsoft.org/v39/i05/

Sun, H., Lin, W., Feng, R. and Li, H. (2014) Network-regularized high-dimensional Cox regression for analysis of genomic data, Statistica Sinica.
http://www3.stat.sinica.edu.tw/statistica/j24n3/j24n319/j24n319.html

van Houwelingen, H. C., Bruinsma, T., Hart, A. A., van't Veer, L. J. and Wessels, L. F. (2006) Cross-validated Cox regression on microarray gene expression data, Statistics in Medicine, 25(18), 3201-3216.
http://onlinelibrary.wiley.com/doi/10.1002/sim.2353/full

See Also

print.Coxnet, coxsplit

Examples

set.seed(1213)
N=100;p=30;p1=5
x=matrix(rnorm(N*p),N,p)
beta=rnorm(p1)
xb=x[,1:p1]%*%beta  # linear predictor from the first p1 covariates
ty=rexp(N,exp(xb))  # event times with rate exp(xb)
tcens=rbinom(n=N,prob=.3,size=1)  # censoring indicator
y=cbind(time=ty,status=1-tcens)

fiti=Coxnet(x,y,penalty="Lasso",nlambda=10,nfolds=10) # Lasso
# attributes(fiti)
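
A network-penalized fit could follow the same pattern. The sketch below is hedged: it regenerates the simulated data so that it stands alone, builds Omega from absolute sample correlations with a zero diagonal (an assumed construction, chosen only to satisfy the "correlation/adjacency matrix with zero diagonal" requirement), and calls Coxnet only if the package is installed. The argument names come from the Usage section; alpha = 0.5 is an arbitrary illustrative choice.

```r
set.seed(1213)
N <- 100; p <- 30; p1 <- 5
x <- matrix(rnorm(N * p), N, p)
beta <- rnorm(p1)
xb <- x[, 1:p1] %*% beta
ty <- rexp(N, exp(xb))
tcens <- rbinom(n = N, prob = 0.3, size = 1)  # censoring indicator
y <- cbind(time = ty, status = 1 - tcens)

# Assumed construction of Omega: absolute sample correlations, zero diagonal.
Omega <- abs(cor(x))
diag(Omega) <- 0

# Only attempt the fit when the package is available.
if (requireNamespace("Coxnet", quietly = TRUE)) {
  fitn <- Coxnet::Coxnet(x, y, Omega = Omega, penalty = "Net",
                         alpha = 0.5, nlambda = 10, nfolds = 10)
  print(fitn$fit)         # lambda, nzero and cross-validation summaries
  print(fitn$lambda.max)  # lambda with the largest cvm
}
```

In practice Omega would more likely come from prior knowledge of the network, such as a gene pathway adjacency matrix, rather than from the sample correlations of x.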