Fit a negative binomial linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net), snet, and mnet penalties at a grid of values for the regularization parameter lambda.
glmregNB(formula, data, weights, offset=NULL, nlambda = 100, lambda=NULL,
lambda.min.ratio = ifelse(nobs
An object with S3 class "glmreg", "glmregNB" for the various types of models.
the call that produced the model fit
Intercept sequence of length length(lambda)
An nvars x length(lambda) matrix of coefficients.
The actual sequence of lambda values used
The computed deviance. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).
Null deviance (per observation). This is defined to be 2*(loglike_sat - loglike(Null)). The null model refers to the intercept-only model.
number of observations
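The deviance definition above can be sketched in base R as follows. This is only an illustration of the formula 2*(loglike_sat - loglike), not the package's internal code: the helper names `nb_loglik` and `nb_deviance`, the toy data, and the fixed value of theta are all assumptions made for the example.

```r
# Negative binomial log-likelihood contributions; theta is the size parameter
nb_loglik <- function(y, mu, theta) {
  dnbinom(y, size = theta, mu = mu, log = TRUE)
}

# Deviance as 2 * (loglike_sat - loglike), with optional observation weights
nb_deviance <- function(y, mu, theta, weights = rep(1, length(y))) {
  loglik_sat <- nb_loglik(y, mu = y, theta)   # saturated model: fitted mean equals y
  loglik_fit <- nb_loglik(y, mu = mu, theta)  # model under evaluation
  2 * sum(weights * (loglik_sat - loglik_fit))
}

# Toy data (hypothetical) and a fixed theta
y <- c(0, 1, 3, 2, 7)
theta <- 2

# Null deviance: the intercept-only model fits the sample mean everywhere
mu_null <- rep(mean(y), length(y))
nulldev <- nb_deviance(y, mu_null, theta)
```

A model that reproduces the data exactly has deviance zero, so `nb_deviance(y, y, theta)` returns 0.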
formula used to describe a model.
argument controlling formula processing via model.frame.
an optional vector of "prior weights" to be used in the fitting process. Should be NULL or a numeric vector. Default is a vector of 1s with equal weight for each observation.
optional numeric vector with an a priori known component to be included in the linear predictor of the model.
The number of lambda values; default is 100.
A user supplied lambda sequence
Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value (i.e. the smallest value for which all coefficients are zero). The default depends on the sample size nobs relative to the number of variables nvars. If nobs > nvars, the default is 0.001, close to zero. If nobs < nvars, the default is 0.05.
The L2 penalty mixing parameter, with \(0\le\alpha\le 1\). alpha=1 is the lasso (mcp, scad) penalty and alpha=0 is the ridge penalty.
The tuning parameter of the snet or mnet penalty.
logical value, if TRUE, adaptive rescaling of the penalty parameter for penalty="mnet" or penalty="snet" with family other than "gaussian". See reference.
Logical flag for x variable standardization, prior to fitting the model sequence. The coefficients are always returned on the original scale. Default is standardize=TRUE. If variables are in the same units already, you might not wish to standardize.
This is a number that multiplies lambda to allow differential shrinkage of coefficients. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is same shrinkage for all variables.
Convergence threshold for coordinate descent. Default value is 1e-6.
Maximum number of iterations for estimating the theta scaling parameter
Maximum number of coordinate descent iterations for each lambda value; default is 1000.
If a number is less than eps in magnitude, then this number is considered as 0
If TRUE, fitting progress is reported
arguments for the glmreg function
initial scaling parameter theta
Estimate scale parameter theta? Default is FALSE. Note, the algorithm may become slow. In this case, one may use the glmreg function with family="negbin" and a fixed theta.
initial scale parameter vector theta, with length nlambda if theta.fixed=TRUE. Default is NULL
Calculate index for which objective function ceases to be locally convex? Default is FALSE and only useful if penalty="mnet" or "snet".
link function, default is log
Type of regularization
estimation method
logicals. If TRUE the corresponding components of the fit (model frame, response, model matrix) are returned.
the contrasts corresponding to levels from the respective models
a logical value indicating whether to use parallel computing for the sequence of lambda values, together with the number of CPU cores to use. The lambda loop will attempt to send different lambda values to different cores.
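As a rough illustration of how nlambda, lambda.min.ratio and lambda.max interact, the sketch below builds a decreasing lambda grid in base R. It assumes the grid is equally spaced on the log scale, which is a common convention for regularization paths but is not stated on this page; the value of `lambda_max` is hypothetical, since the real entry value is derived from the data.

```r
nobs  <- 200   # sample size (hypothetical)
nvars <- 5     # number of variables (hypothetical)

# Default ratio as described above: 0.001 if nobs > nvars, 0.05 if nobs < nvars
lambda_min_ratio <- ifelse(nobs < nvars, 0.05, 0.001)

lambda_max <- 0.5   # hypothetical entry value; in practice derived from the data
nlambda    <- 100

# Assumption: values equally spaced on the log scale, from largest to smallest
lambda <- exp(seq(log(lambda_max),
                  log(lambda_min_ratio * lambda_max),
                  length.out = nlambda))

range(lambda)
```

A sequence built this way could be passed through the lambda argument in place of the default grid.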
Zhu Wang <zwang145@uthsc.edu>
The sequence of models implied by lambda is fit by coordinate descent. This is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting negative binomial linear regression models by maximizing the penalized log-likelihood.
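To make the roles of alpha and the penalty factors concrete, here is a minimal base R sketch of an elastic net penalty. The parameterization (alpha times the L1 term plus (1 - alpha)/2 times the squared L2 term, with per-variable factors applied to both) is an assumption borrowed from the common glmnet-style convention and may differ from the package's internal scaling. The function name `enet_penalty` and its `penalty_factor` argument are hypothetical.

```r
# Elastic net penalty for one coefficient vector (excluding the intercept).
# Assumed form: sum_j pf_j * (alpha * |b_j| + (1 - alpha) / 2 * b_j^2)
enet_penalty <- function(beta, alpha = 1, penalty_factor = rep(1, length(beta))) {
  stopifnot(alpha >= 0, alpha <= 1, length(penalty_factor) == length(beta))
  sum(penalty_factor * (alpha * abs(beta) + (1 - alpha) / 2 * beta^2))
}

beta <- c(0.5, -1, 0, 2)

lasso_pen <- enet_penalty(beta, alpha = 1)    # pure L1 penalty
ridge_pen <- enet_penalty(beta, alpha = 0)    # pure (halved) squared L2 penalty
mixed_pen <- enet_penalty(beta, alpha = 0.5)  # equal mix of the two

# A zero factor leaves the last coefficient unpenalized
free_last <- enet_penalty(beta, alpha = 1, penalty_factor = c(1, 1, 1, 0))
```

Setting a factor to 0 removes that coefficient from the penalty, which matches the description of unshrunk variables that always stay in the model.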
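Note that the objective function is
$$-\sum_i \mathrm{weights}_i \cdot \mathrm{loglik}_i + \lambda \cdot \mathrm{penalty}$$ if standardize=FALSE
and $$-\sum_i \frac{\mathrm{weights}_i}{\sum_j \mathrm{weights}_j} \cdot \mathrm{loglik}_i + \lambda \cdot \mathrm{penalty}$$ if standardize=TRUE.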
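The two forms of the objective can be compared with a small base R sketch. It reads the standardize=TRUE form as a weighted sum over observations with weights rescaled to sum to one, which is an interpretation of the formula above. The function `nb_objective` is hypothetical, uses a plain lasso penalty for simplicity, and takes the fitted means and theta as given rather than estimating them.

```r
# Penalized negative log-likelihood for given fitted means mu and fixed theta.
# The penalty here is a plain lasso term, lambda * sum(|beta|), for illustration.
nb_objective <- function(y, mu, theta, beta, lambda,
                         weights = rep(1, length(y)), standardize = TRUE) {
  loglik <- dnbinom(y, size = theta, mu = mu, log = TRUE)
  # standardize = TRUE rescales the weights so that they sum to one
  w <- if (standardize) weights / sum(weights) else weights
  -sum(w * loglik) + lambda * sum(abs(beta))
}

# Toy inputs (hypothetical)
y     <- c(0, 1, 3, 2, 7)
mu    <- c(1.0, 1.5, 2.5, 2.0, 5.0)
theta <- 2
beta  <- c(0.3, -0.2)

obj_raw <- nb_objective(y, mu, theta, beta, lambda = 0.1, standardize = FALSE)
obj_std <- nb_objective(y, mu, theta, beta, lambda = 0.1, standardize = TRUE)
```

With unit weights the likelihood term of the standardized form is the unstandardized one divided by the number of observations, so the same lambda penalizes relatively more heavily when standardize=TRUE.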
Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]
if (FALSE) {
# Article counts of biochemistry PhD students, from the pscl package
data("bioChemists", package = "pscl")
# Time the default path fit, first serially and then on two cores
system.time(fm_nb1 <- glmregNB(art ~ ., data = bioChemists, parallel=FALSE))
system.time(fm_nb2 <- glmregNB(art ~ ., data = bioChemists, parallel=TRUE, n.cores=2))
coef(fm_nb1)
### ridge regression
fm <- glmregNB(art ~ ., alpha=0, data = bioChemists, lambda=seq(0.001, 1, by=0.01))
# Cross-validation over the same lambda sequence (this overwrites fm)
fm <- cv.glmregNB(art ~ ., alpha=0, data = bioChemists, lambda=seq(0.001, 1, by=0.01))
}