influence.mixed.models: Influence Diagnostics for Mixed-Effects Models

Description

These functions compute deletion influence diagnostics for linear mixed-effects models fit by lmer in the lme4 package and lme in the nlme package and for generalized linear mixed-effects models fit by glmer in the lme4 package. The main functions are methods for the influence generic function. Other functions are provided for computing dfbeta, dfbetas, cooks.distance, and influence on variance-covariance components based on the objects computed by influence.merMod and influence.lme.

Usage

# S3 method for merMod
influence(model, groups, data, maxfun=1000, ...)
# S3 method for lme
influence(model, groups, data, ...)
# S3 method for influence.merMod
cooks.distance(model, ...)
# S3 method for influence.lme
cooks.distance(model, ...)
# S3 method for influence.merMod
dfbeta(model, which = c("fixed", "var.cov"), ...)
# S3 method for influence.lme
dfbeta(model, which = c("fixed", "var.cov"), ...)
# S3 method for influence.merMod
dfbetas(model, ...)
# S3 method for influence.lme
dfbetas(model, ...)

Arguments

model

in the case of influence.merMod or influence.lme, a model of class "merMod" or "lme"; in the case of cooks.distance, dfbeta, or dfbetas, an object returned by influence.merMod or influence.lme.

groups

a character vector containing the name of a grouping factor or names of grouping factors; if more than one name is supplied, then groups are defined by all combinations of levels of the grouping factors that appear in the data. If omitted, then each individual row of the data matrix is treated as a "group" to be deleted in turn.

data

an optional data frame with the data to which model was fit; influence.merMod can usually reconstruct the data, and influence.lme can access the data unless keep.data=FALSE was specified in the call to lme, so it's usually unnecessary to supply the data argument.

maxfun

the maximum number of function evaluations (for influence.merMod) to perform after deleting each group; the default (1000) is large enough that the iterations will typically continue to convergence. Setting maxfun=20 for an lmer model or maxfun=100 for a glmer model will typically produce a reasonable approximation more quickly. An even smaller value can be used if interest is only in influence on the fixed effects.

which

if "fixed" (the default), return influence on the fixed effects; if "var.cov", return influence on the variance-covariance components.

...

ignored.

Value

influence.merMod and influence.lme return objects of class "influence.merMod" and "influence.lme" respectively, each of which contains the following elements:

"fixed.effects": the estimated fixed effects for the model.
"fixed.effects[-groups]": a matrix with columns corresponding to the fixed-effects coefficients and rows corresponding to groups, giving the estimated fixed effects with each group deleted in turn; groups is formed from the name(s) of the grouping factor(s).
"var.cov.comps": the estimated variance-covariance parameters for the model.
"var.cov.comps[-groups]": a matrix with columns corresponding to the variance-covariance parameters and rows corresponding to groups, giving the estimated variance-covariance parameters with each group deleted in turn.
"vcov": the estimated covariance matrix of the fixed-effects coefficients.
"vcov[-groups]": a list each of whose elements is the estimated covariance matrix of the fixed-effects coefficients with one group deleted.
"groups": a character vector giving the names of the grouping factors.
"deleted": the possibly composite grouping factor, each of whose elements is deleted in turn.
"converged": for influence.merMod, a logical vector indicating whether the computation converged for each group.
"function.evals": for influence.merMod, a vector of the number of function evaluations performed for each group.

For plotting "influence.merMod" and "influence.lme" objects, see infIndexPlot.
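As a minimal sketch of how the elements listed above can be inspected, the following fits the sleepstudy model from the Examples and looks inside the returned object. It assumes that the lme4 and car packages are installed; depending on the installed versions, the merMod methods may be supplied by either package, so both are loaded here. The object names fm1 and inf are illustrative only.

```r
library(lme4)
library(car)   # loaded as a precaution; see the note above

# Linear mixed model from ?lmer
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Delete each Subject in turn and refit
inf <- influence(fm1, groups = "Subject")

# Element names embed the grouping factor, e.g. "fixed.effects[-Subject]"
names(inf)

# Full-data fixed effects, then the estimates with each Subject deleted
inf[["fixed.effects"]]
head(inf[["fixed.effects[-Subject]"]])

# Did the refit converge for each deleted group?
table(inf[["converged"]])
```

Because the element names contain brackets, they have to be extracted with [[ ]] and a quoted name rather than with $.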

Details

influence.merMod and influence.lme start with the estimated variance-covariance components from model and then refit the model omitting each group in turn, not necessarily iterating to completion. For example, maxfun=20 takes up to 20 function evaluations, stepping away from the ML or REML solution for the full data, which usually provides a decent approximation to the fully iterated estimates.
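One way to judge whether a truncated refit is adequate for a given model is to compare it with the fully iterated result. The sketch below does this for the sleepstudy model; it assumes lme4 and car are installed, the names inf_full and inf_quick are illustrative, and the size of the discrepancy will depend on the model and data. Truncated refits may produce convergence warnings.

```r
library(lme4)
library(car)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Default maxfun: typically iterates to convergence for each deleted group
inf_full <- influence(fm1, groups = "Subject")

# Truncated refits: at most 20 function evaluations per deleted group
inf_quick <- influence(fm1, groups = "Subject", maxfun = 20)

# Largest absolute difference in the deleted-group fixed effects
b_full  <- inf_full[["fixed.effects[-Subject]"]]
b_quick <- inf_quick[["fixed.effects[-Subject]"]]
max_diff <- max(abs(b_full - b_quick))
max_diff

# Number of function evaluations used for each deleted group
inf_quick[["function.evals"]]
```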

The other functions are methods for the dfbeta, dfbetas, and cooks.distance generics, to be applied to the "influence.merMod" or "influence.lme" object produced by the influence function; the dfbeta methods can also return influence on the variance-covariance components.
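The following sketch applies these methods to an "influence.merMod" object for the sleepstudy model. It assumes lme4 and car are installed; the object names are illustrative.

```r
library(lme4)
library(car)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
inf <- influence(fm1, groups = "Subject")

# One Cook's distance per deleted Subject
cd <- cooks.distance(inf)
head(sort(cd, decreasing = TRUE))

# Influence on the fixed effects: one row per Subject, one column per coefficient
db  <- dfbeta(inf)    # which = "fixed" is the default
dbs <- dfbetas(inf)   # standardized version
head(db)

# Influence on the variance-covariance components
dvc <- dfbeta(inf, which = "var.cov")
head(dvc)
```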

References

Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition, Sage.

Examples


# NOT RUN {
if (require("lme4") && require("car")){ # infIndexPlot() is from car
    print(fm1 <- lmer(Reaction ~ Days + (Days | Subject),
        sleepstudy)) # from ?lmer
    infIndexPlot(influence(fm1, "Subject")) # delete each Subject in turn
    }
if (require("lme4") && require("car")){ # compareCoefs() is from car
    gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
        data = cbpp, family = binomial)
    infIndexPlot(influence(gm1, "herd", maxfun=100)) # approximate refits
    gm1.11 <- update(gm1, subset = herd != 11) # check deleting herd 11
    compareCoefs(gm1, gm1.11)
    }
# }
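The examples above cover only models fit with lme4. A corresponding sketch for an lme model is given here, under the assumptions that the nlme and car packages are installed and that the Orthodont data from nlme (a grouped data set with Subject as the grouping factor) is a suitable illustration; the names fm_lme and inf_lme are hypothetical.

```r
library(nlme)
library(car)

# Orthodont is a groupedData object, so the default random effects
# are grouped by Subject
fm_lme <- lme(distance ~ age, data = Orthodont)

# Delete each Subject in turn and refit
inf_lme <- influence(fm_lme, groups = "Subject")

# Same diagnostics as for merMod models
head(sort(cooks.distance(inf_lme), decreasing = TRUE))
head(dfbeta(inf_lme))
```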
