blus: Compute Best Linear Unbiased Scalar-Covariance (BLUS) residuals from a linear model

Description

This function computes the Best Linear Unbiased Scalar-Covariance (BLUS) residuals from a linear model, as defined in Theil65;textualskedastic and explained further in Theil68;textualskedastic.

Usage

blus(
  mainlm,
  omit = c("first", "last", "random"),
  keepNA = TRUE,
  exhaust = NA,
  seed = 1234
)

Arguments

mainlm

Either an object of class "lm" (e.g., generated by lm), or a list of two objects: a response vector and a design matrix. The objects are assumed to be in that order, unless they are given the names "X" and "y" to distinguish them. The design matrix passed in a list must begin with a column of ones if an intercept is to be included in the linear model. The design matrix passed in a list should not contain factors, as all columns are treated 'as is'. For tests that use ordinary least squares residuals, one can also pass a vector of residuals in the list, which should either be the third object or be named "e".

omit

A numeric vector of length \(p\) (the number of columns in the linear model design matrix) giving the indices of \(p\) observations to omit in the BLUS residual vector; or a character partially matching "first" (for the first \(p\)) observations, "last" (for the last \(p\) observations), or "random" (for a random sample of \(p\) indices between 1 and \(n\)). Defaults to "first".

keepNA

A logical. Should BLUS residuals for omitted observations be returned as NA_real_ to preserve the length of the residual vector?

exhaust

An integer. If singular matrices are encountered using the passed value of omit, how many random combinations of \(p\) indices should be attempted before an error is thrown? If NA (the default), all possible combinations are attempted provided that \({n \choose p} \le 10^4\); otherwise up to \(10^4\) random samples of size p from 1:n are attempted (with replacement). Integer values of exhaust greater than 1e4L are treated as NA.

seed

An integer specifying a seed to pass to set.seed for random number generation. This allows reproducibility of bootstrap results. The default value, NA, results in not setting a seed.

Value

A double vector of length \(n\) containing the BLUS residuals (with NA_real_) for omitted observations), or a double vector of length \(n-p\) containing the BLUS residuals only (if keepNA is set to FALSE)

Details

Under the ideal linear model conditions, the BLUS residuals have a scalar covariance matrix \(\sigma^2 I\) (meaning they have a constant variance and are mutually uncorrelated), unlike the OLS residuals, which have covariance matrix \(\sigma^2 M\) where \(M\) is a function of the design matrix. Use of BLUS residuals could improve the performance of tests for heteroskedasticity and/or autocorrelation in the linear model. A linear model with \(n\) observations and an \(n\times p\) design matrix yields only \(n-p\) BLUS residuals. The choice of which \(p\) observations will not be represented in the BLUS residuals is specified within the algorithm.

References

Examples

Run this code

# NOT RUN {
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
blus(mtcars_lm)
plot(mtcars_lm$residuals, blus(mtcars_lm))
# Same as first example
mtcars_list <- list("y" = mtcars$mpg, "X" = cbind(1, mtcars$wt, mtcars$qsec, mtcars$am))
blus(mtcars_list)
# Again same as first example
mtcars_list2 <- list("e" = mtcars_lm$residuals, "X" = cbind(1, mtcars$wt, mtcars$qsec, mtcars$am))
blus(mtcars_list2)
# BLUS residuals cannot be computed with `omit = "last"` in this example, so
# omitted indices are randomised:
blus(mtcars_lm, omit = "last")

# }

Run the code above in your browser using DataLab