extractAIC(fit, scale, k = 2, ...)
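fit: a fitted model, usually the result of a fitter such as lm.

scale: an optional numeric value specifying the scale parameter of the model; see scale in step. Currently only used in the "lm" method, where scale specifies the estimate of the error variance, and scale = 0 indicates that it is to be estimated by maximum likelihood.

k: a numeric value specifying the ‘weight’ of the equivalent degrees of freedom (edf) part in the AIC formula.

The value is a numeric vector of length 2: first the equivalent degrees of freedom (edf) for the fitted model fit, then the (generalized) Akaike Information Criterion for fit.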
This is a generic function, with methods in base R for classes "aov", "glm" and "lm" as well as for "negbin" (package MASS, https://CRAN.R-project.org/package=MASS) and "coxph" and "survreg" (package survival, https://CRAN.R-project.org/package=survival).

The criterion used is
$$AIC = - 2\log L + k \times \mbox{edf},$$
where \(L\) is the likelihood and edf is the equivalent degrees of freedom (i.e., the number of free parameters for usual parametric models) of fit.

For linear models with unknown scale (i.e., for lm and aov), \(-2\log L\) is computed from the deviance and uses a different additive constant from logLik and hence from AIC. If \(RSS\) denotes the (weighted) residual sum of squares, then extractAIC uses for \(-2\log L\) the formulae \(RSS/s - n\) (corresponding to Mallows' \(C_p\)) in the case of known scale \(s\) and \(n \log (RSS/n)\) for unknown scale.
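The two formulae can be checked by hand. The following is a minimal sketch, assuming the built-in cars data set as an arbitrary example and an arbitrary illustrative value for the known scale s; neither choice comes from this page.

```r
# Simple linear model on the built-in 'cars' data (illustrative choice).
fit <- lm(dist ~ speed, data = cars)

n   <- nobs(fit)              # number of observations
edf <- n - df.residual(fit)   # number of estimated coefficients
rss <- deviance(fit)          # residual sum of squares of the lm fit
k   <- 2                      # traditional AIC penalty

# Unknown scale: -2 log L is taken to be n * log(RSS / n).
by_hand_unknown <- c(edf, n * log(rss / n) + k * edf)

# Known scale s: -2 log L is taken to be RSS / s - n (Mallows' Cp form).
s <- 250                      # assumed value, for illustration only
by_hand_known <- c(edf, rss / s - n + k * edf)

# Both hand computations should agree with extractAIC().
stopifnot(isTRUE(all.equal(extractAIC(fit, k = k), by_hand_unknown)))
stopifnot(isTRUE(all.equal(extractAIC(fit, scale = s, k = k), by_hand_known)))
```

In both cases the first element of the result is the edf and the second is the criterion value.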
The function AIC only handles unknown scale and uses the formula \(n \log (RSS/n) + n + n \log 2\pi - \sum \log w\), where \(w\) are the weights. Furthermore, AIC counts the estimation of the scale as a parameter in the edf, whereas extractAIC does not.

For glm fits, the family's aic() function is used to compute the AIC: see the note under logLik about the assumptions this makes.

k = 2 corresponds to the traditional AIC; using k = log(n) provides the BIC (Bayesian IC) instead.

Note that the methods for this function may differ in their
assumptions from those of methods for AIC (usually via a method for logLik). We have already mentioned the case of "lm" models with estimated scale, and there are similar issues in the "glm" and "negbin" methods where the dispersion parameter may or may not be taken as ‘free’. This is immaterial as extractAIC is only used to compare models of the same class (where only differences in AIC values are considered).

See also AIC, deviance, add1, step.
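The relationship between extractAIC and AIC can be checked numerically. The sketch below assumes the built-in cars and warpbreaks data sets as arbitrary examples; the model formulas are illustrative and not taken from this page. It covers unweighted lm fits and a Poisson glm only.

```r
# Two nested linear models on the built-in 'cars' data (illustrative choice).
fit1 <- lm(dist ~ speed, data = cars)
fit2 <- lm(dist ~ speed + I(speed^2), data = cars)
n <- nobs(fit1)

# Unweighted lm: AIC() includes the constant n + n * log(2 * pi) and counts
# the error variance as one extra parameter; extractAIC() does neither.
gap_aic <- AIC(fit1) - extractAIC(fit1)[2]
stopifnot(isTRUE(all.equal(gap_aic, n + n * log(2 * pi) + 2)))

# k = log(n) gives the BIC-type penalty; the extra parameter then costs log(n).
gap_bic <- BIC(fit1) - extractAIC(fit1, k = log(n))[2]
stopifnot(isTRUE(all.equal(gap_bic, n + n * log(2 * pi) + log(n))))

# The gap is identical for both models, so differences between models agree.
stopifnot(isTRUE(all.equal(AIC(fit1) - AIC(fit2),
                           extractAIC(fit1)[2] - extractAIC(fit2)[2])))

# Poisson glm (no free dispersion parameter): the two values coincide here.
gfit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
stopifnot(isTRUE(all.equal(extractAIC(gfit)[2], AIC(gfit))))
```

Because the gap depends only on n and k for these lm fits, it cancels when models fitted to the same data are compared, which is why the differing constants do not affect model selection in step.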