This function implements the nonparametric test of Horn81;textualskedastic for testing for heteroskedasticity in a linear regression model.
horn(
mainlm,
deflator = NA,
restype = c("ols", "blus"),
alternative = c("two.sided", "greater", "less"),
exact = (nres
Either an object of class
"lm"
(e.g., generated by lm
), or
a list of two objects: a response vector and a design matrix. The objects
are assumed to be in that order, unless they are given the names
"X"
and "y"
to distinguish them. The design matrix passed
in a list must begin with a column of ones if an intercept is to be
included in the linear model. The design matrix passed in a list should
not contain factors, as all columns are treated 'as is'. For tests that
use ordinary least squares residuals, one can also pass a vector of
residuals in the list, which should either be the third object or be
named "e"
.
Either a character specifying a column name from the
design matrix of mainlm
or an integer giving the index of a
column of the design matrix. This variable is suspected to be
related to the error variance under the alternative hypothesis.
deflator
may not correspond to a column of 1's (intercept).
Default NA
means the data will be left in its current order
(e.g. in case the existing index is believed to be associated with
error variance).
A character specifying which residuals to use: "ols"
for OLS residuals (the default) or the "blus"
for
BLUS residuals. The advantage of using BLUS residuals is
that, under the null hypothesis, the assumption that the random series
is independent and identically distributed is met (whereas with OLS
residuals it is not). The disadvantage of using BLUS residuals is that
only \(n-p\) residuals are used rather than the full \(n\).
A character specifying the form of alternative
hypothesis; one of "two.sided"
(default), "greater"
,
or "less"
. "two.sided"
corresponds to any trend in the
absolute residuals when ordered by deflator
. "greater"
corresponds to a negative trend in the absolute residuals when ordered by
deflator
. "less"
corresponds to a positive trend in the
absolute residuals when ordered by deflator
. (Notice that \(D\)
tends to be small when there is a positive trend.)
A logical. Should exact \(p\)-values be computed? If
FALSE
, a normal approximation is used. Defaults to TRUE
only if the number of absolute residuals being ranked is \(\le 10\).
A logical. If TRUE
, only the test statistic value
is returned, instead of an object of class
"htest"
. Defaults to FALSE
.
Optional further arguments to pass to blus
.
An object of class
"htest"
. If object is
not assigned, its attributes are displayed in the console as a
tibble
using tidy
.
The test entails specifying a 'deflator', an explanatory variable
suspected of being related to the error variance. Residuals are ordered
by the deflator and the nonparametric trend statistic
\(D=\sum (R_i - i)^2\) proposed by
Lehmann75;textualskedastic is
then computed on the absolute residuals and used to test for an
increasing or decreasing trend, either of which would correspond to
heteroskedasticity. Exact probabilities for the null distribution of
\(D\) can be obtained from functions dDtrend
and
pDtrend
, but since computation time increases rapidly with
\(n\), use of a normal approximation is recommended for \(n>10\).
Lehmann75;textualskedastic proves that \(D\) is
asymptotically normally distributed and the approximation of the
statistic \(Z=(D-E(D))/\sqrt{V(D)}\) to the standard normal
distribution is already quite good for \(n=11\).
The expectation and variance of \(D\) (when ties are absent) are respectively \(E(D)=\frac{n^3-n}{6}\) and \(V(D)=\frac{n^2(n+1)^2(n-1)}{36}\); see Lehmann75;textualskedastic for \(E(D)\) and \(V(D)\) when ties are present. When ties are absent, a continuity correction is used to improve the normal approximation. When exact distribution is used, two-sided \(p\)-value is computed by doubling the one-sided \(p\)-value, since the distribution of \(D\) is symmetric. The function does not support the exact distribution of \(D\) in the presence of ties, so in this case the normal approximation is used regardless of \(n\).
# NOT RUN {
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
horn(mtcars_lm, deflator = "qsec")
horn(mtcars_lm, deflator = "qsec", restype = "blus")
# }
Run the code above in your browser using DataLab