These functions compute deletion influence diagnostics for linear mixed-effects models fit by lmer
in the lme4 package and
lme
in the nlme package and for generalized linear mixed-effects models fit by glmer
in the
lme4 package. The main functions are methods for the influence
generic function. Other functions are provided for
computing dfbeta
, dfbetas
, cooks.distance
, and influence on variance-covariance components based
on the objects computed by influence.merMod
and influence.lme
.
# S3 method for merMod
influence(model, groups, data, maxfun=1000, ncores=1, ...)
# S3 method for lme
influence(model, groups, data, ncores=1, ...)# S3 method for influence.merMod
cooks.distance(model, ...)
# S3 method for influence.lme
cooks.distance(model, ...)
# S3 method for influence.merMod
dfbeta(model, which = c("fixed", "var.cov"), ...)
# S3 method for influence.lme
dfbeta(model, which = c("fixed", "var.cov"), ...)
# S3 method for influence.merMod
dfbetas(model, ...)
# S3 method for influence.lme
dfbetas(model, ...)
in the case of influence.merMod
or influence.lme
, a model of class "merMod"
or "lme"
;
in the case of cooks.distance
, dfbeta
, or dfbetas
, an object returned by
influence.merMod
or influence.lme
.
a character vector containing the name of a grouping factor or names of grouping factors; if more than one name is supplied, then groups are defined by all combinations of levels of the grouping factors that appear in the data. If omitted, then each individual row of the data matrix is treated as a "group" to be deleted in turn.
an optional data frame with the data to which model
was fit; influence.merMod
can usually reconstruct
the data, and influence.lme
can access the data unless keep.data=FALSE
was specified in the call to lme
,
so it's usually unnecessary to supply the data
argument.
The maximum number of function evaluations (for influence.merMod
)
to perform after deleting each group; the defaults are large enough so that the iterations will typically continue to convergence.
Setting to maxfun=20
for an lmer
model or 100
for a glmer
model will typically produce a faster reasonable approximation.
An even smaller value can be used if interest is only in influence on the fixed effects.
number of cores for parallel computation of diagnostics; if 1
(the default), the computation isn't parallelized; if Inf
, all of the available physical cores
(not necessarily logical cores --- see detectCores
) on the computer will be used.
if "fixed.effects"
(the default), return influence on the fixed effects; if "var.cov"
, return influence on the variance-covariance components.
ignored.
influence.merMod
and influence.lme
return objects of class "influence.merMod"
and "influence.lme"
respectively,
each of which contains the following elements:
"fixed.effects"
the estimated fixed effects for the model.
"fixed.effects[-groups]"
a matrix with columns corresponding to the fixed-effects coefficients and rows corresponding to groups, giving the estimated fixed effects with each group deleted in turn; groups is formed from the name(s) of the grouping factor(s).
"var.cov.comps"
the estimated variance-covariance parameters for the model.
"var.cov.comps[-groups]"
a matrix with the estimated covariance parameters (in columns) with each group deleted in turn.
"vcov"
The estimated covariance matrix of the fixed-effects coefficients.
"vcov[-groups]"
a list each of whose elements is the estimated covariance matrix of the fixed-effects coefficients with one group deleted.
"groups"
a character vector giving the names of the grouping factors.
"deleted"
the possibly composite grouping factor, each of whose elements is deleted in turn.
"converged"
for influence.merMod
, a logical vector indicating whether the computation converged for each group.
"function.evals"
for influence.merMod
, a vector of the number of function evaluations performed for each group.
For plotting "influence.merMod"
and "influence.lme"
objects, see infIndexPlot
.
influence.merMod
and influence.lme
start with the estimated variance-covariance components from model
and then refit
the model omitting each group in turn, not necessarily iterating to completion. For example, maxfun=20
takes up to 20 function evaluations
step away from the ML or REML solution for the full data, which usually provides decent approximations to the fully iterated estimates.
The other functions are methods for the dfbeta
, dfbetas
, and cooks.distance
generics, to be applied to the
"influence.merMod"
or "influence.lme"
object produced by the influence
function; the dfbeta
methods can also return
influence on the variance-covariance components.
Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition, Sage.
lmer
, glmer
, lme
, infIndexPlot
.
# NOT RUN {
if (require("lme4")){
print(fm1 <- lmer(Reaction ~ Days + (Days | Subject),
sleepstudy)) # from ?lmer
infIndexPlot(influence(fm1, "Subject"))
}
if (require("lme4")){
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
infIndexPlot(influence(gm1, "herd", maxfun=100))
gm1.11 <- update(gm1, subset = herd != 11) # check deleting herd 11
compareCoefs(gm1, gm1.11)
}
# }
Run the code above in your browser using DataLab