extractAIC: Extract AIC from a Fitted Model

Description

Computes the (generalized) Akaike Information Criterion (AIC) for a fitted parametric model.

Usage

extractAIC(fit, scale, k = 2, ...)

Arguments

fit

fitted model, usually the result of a fitter like lm.

scale

optional numeric specifying the scale parameter of the model, see scale in step. Currently only used in the "lm" method, where scale specifies the estimate of the error variance, and scale = 0 indicates that it is to be estimated by maximum likelihood.

k

numeric specifying the ‘weight’ of the equivalent degrees of freedom ($\equiv$ edf) part in the AIC formula.

...

further arguments (currently unused in base R).
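As a minimal sketch of how k enters the result, the following compares the default k = 2 with k = log(n). The model and the built-in cars data set are illustrative choices, not part of the original documentation.

```r
# Illustrative fit on the built-in 'cars' data (an assumption of this sketch)
fit <- lm(dist ~ speed, data = cars)
n <- nobs(fit)

aic_k2   <- extractAIC(fit)              # default weight k = 2
aic_logn <- extractAIC(fit, k = log(n))  # weight k = log(n)

# The edf element does not depend on k ...
identical(aic_k2[1], aic_logn[1])

# ... and the criterion changes by (log(n) - 2) * edf
all.equal(aic_logn[2] - aic_k2[2], (log(n) - 2) * aic_k2[1])
```

Both comparisons should evaluate to TRUE, since only the penalty term k * edf depends on k.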

Value

A numeric vector of length 2, with first and second elements giving

edf

the ‘equivalent degrees of freedom’ for the fitted model fit.

AIC

the (generalized) Akaike Information Criterion for fit.
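A short sketch of unpacking the returned vector, again assuming an illustrative linear model on the built-in cars data. The names edf and aic below are local variables chosen for this example.

```r
# Illustrative fit; any "lm", "aov" or "glm" object could be used instead
fit <- lm(dist ~ speed, data = cars)

res <- extractAIC(fit)
length(res)    # 2

edf <- res[1]  # equivalent degrees of freedom: intercept + slope, so 2 here
aic <- res[2]  # the (generalized) AIC value
```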

Details

This is a generic function, with methods in base R for classes "aov", "glm" and "lm" as well as for "negbin" (package MASS, https://CRAN.R-project.org/package=MASS) and "coxph" and "survreg" (package survival, https://CRAN.R-project.org/package=survival).

The criterion used is

$$AIC = - 2\log L + k \times \mbox{edf},$$

where $L$ is the likelihood and edf the equivalent degrees of freedom (i.e., the number of free parameters for usual parametric models) of fit.

For linear models with unknown scale (i.e., for lm and aov), $-2\log L$ is computed from the deviance and uses a different additive constant from the one used by logLik and hence by AIC. If $RSS$ denotes the (weighted) residual sum of squares, then extractAIC uses for $- 2\log L$ the formulae $RSS/s - n$ (corresponding to Mallows' $C_p$) in the case of known scale $s$ and $n \log (RSS/n)$ for unknown scale. AIC only handles unknown scale and uses the formula $n \log (RSS/n) + n + n \log 2\pi - \sum \log w$, where $w$ are the weights. Further, AIC counts the scale estimation as a parameter in the edf and extractAIC does not.

For glm fits the family's aic() function is used to compute the AIC: see the note under logLik about the assumptions this makes.

k = 2 corresponds to the traditional AIC; using k = log(n) provides the BIC (Bayesian IC) instead.

Note that the methods for this function may differ in their assumptions from those of methods for AIC (usually via a method for logLik). We have already mentioned the case of "lm" models with estimated scale, and there are similar issues in the "glm" and "negbin" methods where the dispersion parameter may or may not be taken as ‘free’. This is immaterial as extractAIC is only used to compare models of the same class (where only differences in AIC values are considered).
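The "lm" formulae above can be checked numerically. The following is a sketch under stated assumptions: an unweighted linear model on the built-in cars data, and a scale value s that is hypothetical and chosen only for illustration.

```r
# Illustrative unweighted fits on the built-in 'cars' data
fit  <- lm(dist ~ speed, data = cars)
fit0 <- lm(dist ~ 1, data = cars)   # nested intercept-only model

n   <- nobs(fit)
RSS <- deviance(fit)                # residual sum of squares
edf <- n - df.residual(fit)         # number of estimated coefficients

# Unknown scale: extractAIC uses n * log(RSS / n) for -2 log L
all.equal(extractAIC(fit)[2], n * log(RSS / n) + 2 * edf)

# Known scale s: extractAIC uses RSS / s - n for -2 log L
s <- 200                            # hypothetical error variance
all.equal(extractAIC(fit, scale = s)[2], RSS / s - n + 2 * edf)

# AIC() adds n + n * log(2 * pi) and counts the scale as one more parameter
all.equal(AIC(fit), n * log(RSS / n) + n + n * log(2 * pi) + 2 * (edf + 1))

# Differences between models of the same class agree under both functions
all.equal(extractAIC(fit)[2] - extractAIC(fit0)[2], AIC(fit) - AIC(fit0))
```

Each all.equal() call should return TRUE. The last comparison illustrates why the differing constants do not matter when only AIC differences between models of the same class, fitted to the same data, are considered.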

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer (4th ed).