Learn R Programming

mpath (version 0.4-2.26)

glmreg: fit a GLM with lasso (or elastic net), snet or mnet regularization

Description

Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net penalty), scad (or snet) and mcp (or mnet penalty), at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.

Usage

# S3 method for formula
glmreg(formula, data, weights, offset=NULL, contrasts=NULL, 
x.keep=FALSE, y.keep=TRUE, ...)
# S3 method for matrix
glmreg(x, y, weights, offset=NULL, ...)
# S3 method for default
glmreg(x,  ...)

Value

An object with S3 class "glmreg" for the various types of models.

call

the call that produced this object

b0

Intercept sequence of length length(lambda)

beta

A nvars x length(lambda) matrix of coefficients.

lambda

The actual sequence of lambda values used

offset

the offset vector used.

resdev

The computed deviance (for "gaussian", this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).

nulldev

Null deviance (per observation). This is defined to be 2*(loglike_sat -loglike(Null)); The NULL model refers to the intercept model.

nobs

number of observations

pll

penalized log-likelihood values for standardized coefficients in the IRLS iterations. For family="gaussian", not implemented yet.

pllres

penalized log-likelihood value for the estimated model on the original scale of coefficients

fitted.values

the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.

Arguments

formula

symbolic description of the model, see details.

data

argument controlling formula processing via model.frame.

weights

optional numeric vector of weights. If standardize=TRUE, weights are renormalized to weights/sum(weights). If standardize=FALSE, weights are kept as original input

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.

x

input matrix, of dimension nobs x nvars; each row is an observation vector

y

response variable. Quantitative for family="gaussian". Non-negative counts for family="poisson" or family="negbin". For family="binomial" should be either a factor with two levels or a vector of proportions.

x.keep, y.keep

logical values: keep response variables or keep response variable?

contrasts

the contrasts corresponding to levels from the respective models

...

Other arguments passing to glmreg_fit

Author

Zhu Wang <zwang145@uthsc.edu>

Details

The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence. For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting the generalized linear regression paths, by maximizing the appropriate penalized log-likelihood. Note that the objective function for "gaussian" is $$1/2* weights*RSS + \lambda*penalty,$$ if standardize=FALSE and $$1/2* \frac{weights}{\sum(weights)}*RSS + \lambda*penalty,$$ if standardize=TRUE. For the other models it is $$-\sum (weights * loglik) + \lambda*penalty$$ if standardize=FALSE and $$-\frac{weights}{\sum(weights)} * loglik + \lambda*penalty$$ if standardize=TRUE.

References

Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]

See Also

print, predict, coef and plot methods, and the cv.glmreg function.

Examples

Run this code
#binomial
x=matrix(rnorm(100*20),100,20)
g2=sample(0:1,100,replace=TRUE)
fit2=glmreg(x,g2,family="binomial")
#poisson and negative binomial
data("bioChemists", package = "pscl")
fm_pois <- glmreg(art ~ ., data = bioChemists, family = "poisson")
coef(fm_pois)
fm_nb1 <- glmreg(art ~ ., data = bioChemists, family = "negbin", theta=1)
coef(fm_nb1)
#offset
x <- matrix(rnorm(100*20),100,20)
y <- rpois(100, lambda=1)
exposure <- rep(0.5, length(y))
fit2 <- glmreg(x,y, lambda=NULL, nlambda=10, lambda.min.ratio=1e-4, 
	       offset=log(exposure), family="poisson")
predict(fit2, newx=x, newoffset=log(exposure))
if (FALSE) {
fm_nb2 <- glmregNB(art ~ ., data = bioChemists)
coef(fm_nb2)
}

Run the code above in your browser using DataLab