Internal function that conducts k-fold cross-validation for glmreg, produces a plot,
and returns the cross-validated log-likelihood values for each value of lambda.
cv.glmreg_fit(x, y, weights, offset, lambda=NULL, balance=TRUE,
family=c("gaussian", "binomial", "poisson", "negbin"),
type=c("loss", "error"), nfolds=10, foldid, plot.it=TRUE,
se=TRUE, n.cores=2, trace=FALSE, parallel=FALSE, ...)
An object of class "cv.glmreg" is returned, which is a list with the ingredients of the cross-validation fit:
fit: a fitted glmreg object for the full data.
residmat: matrix of log-likelihood values, with one row per lambda value and one column per cross-validation fold k.
cv: the mean cross-validated log-likelihood values, a vector of length(lambda).
cv.error: estimate of the standard error of cv.
foldid: a vector of values between 1 and nfolds identifying which fold each observation is in.
lambda: a vector of lambda values.
lambda.which: index of the lambda that gives the maximum cv value.
lambda.optim: value of the lambda that gives the maximum cv value.
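As a rough illustration of how the returned components relate to one another, the sketch below summarizes a made-up matrix of per-fold log-likelihood values. The numbers are hypothetical, and the standard-error formula (standard deviation across folds divided by the square root of the number of folds) is an assumption, not taken from the package source.

```r
# Hypothetical per-fold log-likelihood values:
# 4 lambda values (rows) by 5 folds (columns).
lambda <- c(1.0, 0.5, 0.1, 0.01)
residmat <- rbind(
  c(-12.0, -11.5, -12.4, -11.9, -12.2),
  c(-10.1,  -9.8, -10.5, -10.0, -10.1),
  c( -9.0,  -9.4,  -8.8,  -9.1,  -9.2),
  c( -9.6,  -9.9,  -9.5,  -9.7,  -9.8)
)
K <- ncol(residmat)

# Mean over folds: one value per lambda.
cv <- rowMeans(residmat)

# Assumed standard-error estimate: fold-to-fold SD divided by sqrt(K).
cv.error <- apply(residmat, 1, sd) / sqrt(K)

# Index and value of the lambda with the largest mean log-likelihood.
lambda.which <- which.max(cv)
lambda.optim <- lambda[lambda.which]
```

With these made-up values the third row has the largest mean, so lambda.which is 3 and lambda.optim is 0.1.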
x: x matrix as in glmreg.
y: response y as in glmreg.
weights: observation weights; defaults to 1 per observation.
offset: an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.
lambda: optional user-supplied lambda sequence; default is NULL, and glmreg chooses its own sequence.
balance: for family="binomial" only.
family: response variable distribution.
type: cross-validation criterion. With type="loss", the loss function (negative log-likelihood) values are used; with type="error", the misclassification error is used if family="binomial".
nfolds: number of folds; must be at least 3. Default is 10.
foldid: an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing and will be ignored.
plot.it: a logical value; if TRUE, plot the estimated log-likelihood values.
se: a logical value; whether to plot with standard errors.
parallel, n.cores: a logical value indicating whether to use parallel computing, and the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
trace: a logical value; whether to print the progress of cross-validation.
...: other arguments that can be passed to glmreg.
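A fold assignment suitable for the foldid argument can be built in base R. The following is a minimal sketch; the sample size and the number of folds are made up for illustration.

```r
set.seed(1)
n <- 23        # hypothetical number of observations
nfolds <- 5    # hypothetical number of folds

# Repeat the fold labels 1..nfolds until there are n of them, then shuffle,
# so fold sizes differ by at most one observation.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)
```

Storing this vector and passing it unchanged to repeated calls keeps the folds identical across runs, so differences in the cross-validated values are not caused by different random splits.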
Zhu Wang <zwang145@uthsc.edu>
The function runs glmreg nfolds+1 times: the first run computes the lambda sequence, and the remaining runs compute the fit with each of the folds omitted. The error or the log-likelihood value is accumulated, and the average value and standard deviation over the folds are computed. Note that cv.glmreg can also be used to search for values of alpha: to do so, call cv.glmreg with the same fixed foldid vector for each value of alpha.
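The fold loop and the reuse of a fixed foldid can be sketched in base R. In the sketch below, glm stands in for glmreg, so there is no penalty and no lambda sequence, and two candidate formulas play the role of two alpha values. The simulated data, the helper cv_loglik, and the use of the held-out Poisson log-likelihood as the criterion are all assumptions made for illustration, not the package's implementation.

```r
set.seed(42)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rpois(n, lambda = exp(0.3 + 0.6 * x1))   # simulated count response
dat <- data.frame(y, x1, x2)

nfolds <- 5
# Fixed once and reused for every candidate below.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Hypothetical helper: held-out Poisson log-likelihood for each fold.
cv_loglik <- function(formula, data, foldid) {
  sapply(sort(unique(foldid)), function(k) {
    train <- data[foldid != k, ]                 # fit with fold k omitted
    test  <- data[foldid == k, ]                 # evaluate on fold k
    fit <- glm(formula, family = poisson(), data = train)
    mu  <- predict(fit, newdata = test, type = "response")
    sum(dpois(test$y, lambda = mu, log = TRUE))
  })
}

# Two candidates compared on the same folds.
candidates <- list(small = y ~ x1, full = y ~ x1 + x2)
per_fold <- sapply(candidates, cv_loglik, data = dat, foldid = foldid)

colMeans(per_fold)        # average value over the folds, per candidate
apply(per_fold, 2, sd)    # standard deviation over the folds, per candidate
```

Because both candidates are evaluated on identical folds, their average values are directly comparable, which is the reason a fixed foldid is needed when comparing different values of alpha.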
Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]