recresid: Recursive Residuals

Description

A generic function for computing the recursive residuals (standardized one-step-ahead prediction errors) of a linear regression model.

Usage

# S3 method for default
recresid(x, y, start = ncol(x) + 1, end = nrow(x),
  tol = sqrt(.Machine$double.eps)/ncol(x), qr.tol = 1e-7,
  engine = c("R", "C"), ...)
# S3 method for formula
recresid(formula, data = list(), ...)
# S3 method for lm
recresid(x, data = list(), ...)

Arguments

x, y, formula

specification of the linear regression model: either by a regressor matrix x and a response variable y, or by a formula or by a fitted object x of class "lm".

start, end

integer. Index of the first and last observation, respectively, for which recursive residuals should be computed. By default, the maximal range is selected.

tol

numeric. A relative tolerance for the precision of the recursive coefficient estimates; see Details.

qr.tol

numeric. The tolerance passed to lm.fit for detecting linear dependencies.

engine

character. In addition to the R implementation of the default method, there is also a faster C implementation (see below for further details).

data

an optional data frame containing the variables in the model. By default, the variables are taken from the environment from which recresid is called. Specifying data might also be necessary when applying recresid to a fitted model of class "lm" that does not contain the regressor matrix and the response.

...

currently not used.

Value

A vector containing the recursive residuals.

Details

Recursive residuals are standardized one-step-ahead prediction errors. Under the usual assumptions for the linear regression model they are (asymptotically) normal and i.i.d. (see Brown, Durbin, Evans, 1975, for details).
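As a rough illustration of this definition, the following base R sketch refits the model by OLS before every prediction. The helper name naive_recresid and the simulated data are assumptions made for this example and are not part of the package; the loop is much slower than recresid and has none of its numerical safeguards. Each one-step-ahead prediction error is divided by sqrt(1 + x_t' (X'X)^{-1} x_t), so the results share a common variance but are not divided by an estimate of the error standard deviation.

```r
# Hypothetical helper (not part of strucchange): recursive residuals
# computed directly from the definition by refitting OLS at each step.
naive_recresid <- function(X, y) {
  X <- as.matrix(X)
  n <- nrow(X)
  k <- ncol(X)
  w <- numeric(n - k)
  for (t in (k + 1):n) {
    idx <- seq_len(t - 1)
    Xt <- X[idx, , drop = FALSE]
    # coefficients estimated from observations 1, ..., t - 1
    beta <- lm.fit(Xt, y[idx])$coefficients
    xt <- X[t, ]
    # prediction variance factor 1 + x_t' (X'X)^{-1} x_t
    f <- 1 + drop(crossprod(xt, solve(crossprod(Xt), xt)))
    # standardized one-step-ahead prediction error
    w[t - k] <- (y[t] - sum(xt * beta)) / sqrt(f)
  }
  w
}

# Simulated data (assumed for illustration): intercept plus one regressor
set.seed(1)
n <- 60
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(1, 2)) + rnorm(n)

w <- naive_recresid(X, y)
rss_full <- sum(lm.fit(X, y)$residuals^2)
all.equal(sum(w^2), rss_full)
```

The last line checks a known identity: the squared recursive residuals sum to the residual sum of squares of the full-sample OLS fit.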

The default method computes the initial coefficient estimates via QR decomposition, using lm.fit. In subsequent steps, the updating formula provided by Brown, Durbin, Evans (1975) is employed. To avoid numerical instabilities in the first steps (with typically small sample sizes), the QR solution is also computed for comparison. Once the relative difference (assessed by all.equal) between the two solutions falls below tol, only the updating formula is used in subsequent steps.
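The strategy in this paragraph can be sketched in base R as follows. This is a simplified reading of the description, not the package's actual code: the helper name update_recresid, the simulated data, and the choice to fall back to the QR solution whenever the two estimates disagree are all assumptions made for illustration. The coefficient estimates and the inverse cross-product matrix are refreshed with a rank-one update, and the QR cross-check is dropped once both solutions agree within tol.

```r
# Hypothetical helper (not the package's code): recursive residuals via
# rank-one updates, with a QR cross-check during the early steps.
update_recresid <- function(X, y, tol = sqrt(.Machine$double.eps) / ncol(X)) {
  X <- as.matrix(X)
  n <- nrow(X)
  k <- ncol(X)
  first <- seq_len(k)
  # initial estimates from the first k observations (QR via lm.fit)
  beta <- unname(lm.fit(X[first, , drop = FALSE], y[first])$coefficients)
  XtXinv <- solve(crossprod(X[first, , drop = FALSE]))
  w <- numeric(n - k)
  check <- TRUE
  for (t in (k + 1):n) {
    xt <- X[t, ]
    Px <- drop(XtXinv %*% xt)
    f <- 1 + sum(xt * Px)           # 1 + x_t' (X'X)^{-1} x_t
    err <- y[t] - sum(xt * beta)    # one-step-ahead prediction error
    w[t - k] <- err / sqrt(f)
    # rank-one updates of (X'X)^{-1} and of the coefficients
    XtXinv <- XtXinv - tcrossprod(Px) / f
    beta <- beta + drop(XtXinv %*% xt) * err
    if (check) {
      upto <- seq_len(t)
      Xt <- X[upto, , drop = FALSE]
      beta_qr <- unname(lm.fit(Xt, y[upto])$coefficients)
      if (isTRUE(all.equal(beta_qr, beta, tolerance = tol))) {
        check <- FALSE              # trust the updating formula from now on
      } else {
        beta <- beta_qr             # assumption: fall back to the QR solution
        XtXinv <- solve(crossprod(Xt))
      }
    }
  }
  w
}

# Simulated data (assumed for illustration)
set.seed(1)
n <- 60
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(1, 2)) + rnorm(n)

w_upd <- update_recresid(X, y)
rss_full <- sum(lm.fit(X, y)$residuals^2)
all.equal(sum(w_upd^2), rss_full)
```

After the check is switched off, each step costs only a few matrix-vector products instead of a full refit, which is why the updating formula is preferred once it has proven stable.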

In large data sets, the R implementation can become rather slow. Hence, a C implementation is also available. It is not yet the default because it should receive more testing in numerically challenging cases. In addition to the R and C implementations, there is also an Armadillo-based C++ implementation available on R-Forge in package strucchangeArmadillo. For models with about 10 parameters, the C and C++ versions perform similarly. For larger models, the C++ implementation seems to scale better.

References

Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.

Examples


# NOT RUN {
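library(strucchange)
set.seed(1)  # reproducible draw; the seed is arbitrary
## series with a mean shift of 2 after observation 50
x <- rnorm(100) + rep(c(0, 2), each = 50)
## recursive residuals of the constant-mean model
rr <- recresid(x ~ 1)
plot(cumsum(rr), type = "l")

## the corresponding Rec-CUSUM fluctuation process
plot(efp(x ~ 1, type = "Rec-CUSUM"))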
# }
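The three interfaces listed under Usage are expected to give the same result for the same model. The sketch below compares them on simulated data; the variable names and the data are assumptions made for illustration, and the calls are wrapped in requireNamespace so that the code is skipped when strucchange is not installed.

```r
# Simulated data (assumed for illustration): y depends linearly on z
set.seed(42)
d <- data.frame(z = rnorm(50))
d$y <- 1 + 2 * d$z + rnorm(50)

if (requireNamespace("strucchange", quietly = TRUE)) {
  # default method: regressor matrix (with intercept column) and response
  rr_matrix <- strucchange::recresid(cbind(1, d$z), d$y)
  # formula method
  rr_formula <- strucchange::recresid(y ~ z, data = d)
  # lm method
  rr_lm <- strucchange::recresid(lm(y ~ z, data = d))

  print(all.equal(rr_matrix, rr_formula, check.attributes = FALSE))
  print(all.equal(rr_matrix, rr_lm, check.attributes = FALSE))
}
```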
