A generic function for computing the recursive residuals (standardized one-step prediction errors) of a linear regression model.
# S3 method for default
recresid(x, y, start = ncol(x) + 1, end = nrow(x),
  tol = sqrt(.Machine$double.eps)/ncol(x), qr.tol = 1e-7,
  engine = c("R", "C"), ...)
# S3 method for formula
recresid(formula, data = list(), ...)
# S3 method for lm
recresid(x, data = list(), ...)
A vector containing the recursive residuals.
specification of the linear regression model: either by a regressor matrix x and a response variable y, or by a formula, or by a fitted object x of class "lm".
integer. Index of the first and last observation, respectively, for which recursive residuals should be computed. By default, the maximal range is selected.
numeric. A relative tolerance for precision of recursive coefficient estimates, see details.
numeric. The tolerance (argument tol) passed to lm.fit for detecting linear dependencies.
character. In addition to the R implementation of the default method, there is also a faster C implementation (see below for further details).
an optional data frame containing the variables in the model. By default the variables are taken from the environment which recresid is called from. Specifying data might also be necessary when applying recresid to a fitted model of class "lm" if this does not contain the regressor matrix and the response.
currently not used.
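The three ways of specifying the model can be compared in a short sketch. It assumes that the strucchange package (which provides recresid) is installed; the simulated data and the object names d, fm and rr_* are illustrative and not part of the original documentation.

```r
# Illustrative only: simulated data, not from the original documentation.
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)

if (requireNamespace("strucchange", quietly = TRUE)) {
  X <- cbind(1, d$x)                                    # regressor matrix with intercept
  rr_matrix  <- strucchange::recresid(X, d$y)           # default method: matrix and response
  rr_formula <- strucchange::recresid(y ~ x, data = d)  # formula method
  fm <- lm(y ~ x, data = d)
  rr_lm      <- strucchange::recresid(fm)               # method for fitted "lm" objects

  # With the default start and end there are nrow(X) - ncol(X) residuals,
  # and the three interfaces should give the same values.
  stopifnot(length(rr_matrix) == nrow(X) - ncol(X))
  stopifnot(isTRUE(all.equal(as.numeric(rr_matrix), as.numeric(rr_formula))))
  stopifnot(isTRUE(all.equal(as.numeric(rr_matrix), as.numeric(rr_lm))))
}
```

Passing start and end to the default method restricts the range of observations in the same way as described for those arguments.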
Recursive residuals are standardized one-step-ahead prediction errors. Under the usual assumptions for the linear regression model they are (asymptotically) normal and i.i.d. (see Brown, Durbin, Evans, 1975, for details).
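To make the definition concrete, the following base R sketch computes recursive residuals naively, by refitting the model on the first r - 1 observations and standardizing the prediction error for observation r by sqrt(1 + x_r' (X'X)^{-1} x_r). The function name naive_recresid and the simulated data are hypothetical and serve only as an illustration; this is not how recresid itself is implemented.

```r
# Naive recursive residuals (illustration only, not the package code):
# refit on rows 1..(r - 1), then predict row r.
naive_recresid <- function(X, y) {
  X <- as.matrix(X)
  n <- nrow(X)
  k <- ncol(X)
  w <- numeric(n - k)
  for (r in (k + 1):n) {
    Xp <- X[1:(r - 1), , drop = FALSE]
    XtX_inv <- solve(crossprod(Xp))                # (X'X)^{-1} from the first r - 1 rows
    b <- XtX_inv %*% crossprod(Xp, y[1:(r - 1)])   # OLS coefficients from those rows
    xr <- X[r, ]
    fr <- 1 + drop(t(xr) %*% XtX_inv %*% xr)       # scaling of the prediction error variance
    w[r - k] <- (y[r] - sum(xr * b)) / sqrt(fr)    # standardized one-step prediction error
  }
  w
}

set.seed(1)
X <- cbind(1, rnorm(50))
y <- drop(X %*% c(1, 2)) + rnorm(50)
w <- naive_recresid(X, y)
length(w)  # n - k = 48

# Known identity: the squared recursive residuals sum to the residual
# sum of squares of the full-sample OLS fit.
all.equal(sum(w^2), sum(lm.fit(X, y)$residuals^2))
```

The residuals computed this way are not divided by an estimate of the error standard deviation.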
The default method computes the initial coefficient estimates via QR decomposition, using lm.fit. In subsequent steps, the updating formula provided by Brown, Durbin, Evans (1975) is employed. To avoid numerical instabilities in the first steps (with typically small sample sizes), the QR solution is computed for comparison. When the relative difference (assessed by all.equal) between the two solutions falls below tol, only the updating formula is used in subsequent steps.
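The updating idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package code: the hypothetical function update_recresid starts from a direct solve on the first k observations instead of lm.fit, and it omits the QR comparison and the tol check described above.

```r
# Sketch of the rank-one updating scheme (illustration only).
update_recresid <- function(X, y) {
  X <- as.matrix(X)
  n <- nrow(X)
  k <- ncol(X)
  X1 <- X[1:k, , drop = FALSE]
  P <- solve(crossprod(X1))             # (X'X)^{-1} from the first k rows
  b <- P %*% crossprod(X1, y[1:k])      # initial coefficient estimates
  w <- numeric(n - k)
  for (r in (k + 1):n) {
    xr <- X[r, ]
    Px <- P %*% xr
    fr <- 1 + sum(xr * Px)              # 1 + x_r' (X'X)^{-1} x_r
    err <- y[r] - sum(xr * b)           # one-step prediction error
    w[r - k] <- err / sqrt(fr)
    P <- P - tcrossprod(Px) / fr        # update (X'X)^{-1} to include row r
    b <- b + (P %*% xr) * err           # update coefficients with the new P
  }
  w
}

set.seed(1)
X <- cbind(1, rnorm(50))
y <- drop(X %*% c(1, 2)) + rnorm(50)
w <- update_recresid(X, y)

# Sanity check: squared recursive residuals sum to the full-sample RSS.
all.equal(sum(w^2), sum(lm.fit(X, y)$residuals^2))
```

Each step costs only a few matrix-vector products, which is why updating is preferred over refitting once it is numerically safe to do so.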
In large data sets, the R implementation can become rather slow. Hence, a C implementation is also available. This is not yet the default because it should receive more testing in numerically challenging cases. In addition to the R and C implementations, there is also an Armadillo-based C++ implementation available on R-Forge in package strucchangeArmadillo. For models with about 10 parameters, the C and C++ versions perform similarly. For larger models, the C++ implementation seems to scale better.
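A quick way to compare the two engines on one's own machine is sketched below. It assumes an installed strucchange version that supports the engine argument shown in the usage section; the simulated data and the sample size are arbitrary choices for illustration, and timings will differ between systems.

```r
# Illustrative comparison of engine = "R" and engine = "C" on simulated data.
set.seed(1)
n <- 2000
X <- cbind(1, matrix(rnorm(n * 9), n, 9))   # intercept plus 9 regressors
y <- drop(X %*% rep(1, 10)) + rnorm(n)

if (requireNamespace("strucchange", quietly = TRUE)) {
  t_R <- system.time(rr_R <- strucchange::recresid(X, y, engine = "R"))
  t_C <- system.time(rr_C <- strucchange::recresid(X, y, engine = "C"))
  cat("R:", t_R[["elapsed"]], "s; C:", t_C[["elapsed"]], "s\n")

  # Check whether the two engines agree up to numerical tolerance.
  print(all.equal(as.numeric(rr_R), as.numeric(rr_C)))
}
```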
Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.
efp
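library(strucchange)
## series with a mean shift from 0 to 2 after observation 50
x <- rnorm(100) + rep(c(0, 2), each = 50)
rr <- recresid(x ~ 1)
## cumulative sum of the recursive residuals
plot(cumsum(rr), type = "l")
## corresponding recursive CUSUM fluctuation process
plot(efp(x ~ 1, type = "Rec-CUSUM"))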