This function implements the two methods (parametric and nonparametric) of Goldfeld and Quandt (1965) for testing for heteroskedasticity in a linear regression model.
goldfeld_quandt(
  mainlm,
  method = c("parametric", "nonparametric"),
  deflator = NA,
  prop_central = 1/3,
  group1prop = 1/2,
  alternative = c("greater", "less", "two.sided"),
  prob = NA,
  twosidedmethod = c("doubled", "kulinskaya"),
  restype = c("ols", "blus"),
  statonly = FALSE,
  ...
)
Either an object of class "lm" (e.g., generated by lm), or a list of two objects: a response vector and a design matrix. The objects are assumed to be in that order, unless they are given the names "X" and "y" to distinguish them. The design matrix passed in a list must begin with a column of ones if an intercept is to be included in the linear model. The design matrix passed in a list should not contain factors, as all columns are treated 'as is'. For tests that use ordinary least squares residuals, one can also pass a vector of residuals in the list, which should either be the third object or be named "e".
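As an illustration of the list form described above, the following sketch builds a response vector and a design matrix by hand from the built-in mtcars data. The object names (y, X, main_list) are chosen here for illustration only. The call to goldfeld_quandt is guarded so that the snippet still runs when the skedastic package is not installed.

```r
# Response vector and design matrix built by hand (base R only).
y <- mtcars$mpg
# The first column is a column of ones so that the model has an intercept.
X <- cbind(intercept = 1, as.matrix(mtcars[, c("wt", "qsec", "am")]))

# Naming the elements "y" and "X" removes any dependence on their order.
main_list <- list(y = y, X = X)

# Call the package only if it is available; column 3 of X holds qsec.
if (requireNamespace("skedastic", quietly = TRUE)) {
  print(skedastic::goldfeld_quandt(main_list, deflator = 3, prop_central = 0.25))
}
```

Because columns are treated 'as is', any factor would have to be expanded into dummy columns before being placed in X.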
A character indicating which of the two tests derived in Goldfeld and Quandt (1965) should be implemented. Possible values are "parametric" and "nonparametric". Default is "parametric". It is acceptable to specify only the first letter.
Either a character specifying a column name from the design matrix of mainlm or an integer giving the index of a column of the design matrix. This variable is suspected to be related to the error variance under the alternative hypothesis. deflator may not correspond to a column of 1's (intercept). Default NA means the data will be left in its current order (e.g. in case the existing index is believed to be associated with error variance).
A double specifying the proportion of central observations to exclude from the F test (when method is "parametric" only). round is used to ensure the number of central observations is an integer. The value must be small enough to allow the two auxiliary regressions to be fit; otherwise an error is thrown. Defaults to 1 / 3.
A double specifying the proportion of remaining observations (after excluding central observations) to allocate to the first group. The default value of 1 / 2 means that an equal number of observations is assigned to the first and second groups.
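The arithmetic behind these two proportions can be sketched as follows. The helper gq_group_sizes is hypothetical and not part of the package. The use of round for the central count follows the description above, but the rounding rule applied to the first group is an assumption made here for illustration.

```r
# Hypothetical helper: how n observations might be split into three blocks.
gq_group_sizes <- function(n, prop_central = 1/3, group1prop = 1/2) {
  n_central <- round(n * prop_central)         # central observations to drop
  n_remaining <- n - n_central                 # observations left for the two groups
  n_group1 <- round(n_remaining * group1prop)  # assumed rounding rule
  c(central = n_central, group1 = n_group1, group2 = n_remaining - n_group1)
}

# With the 32 rows of mtcars and a quarter of the observations dropped:
sizes <- gq_group_sizes(32, prop_central = 0.25)
print(sizes)

# Each auxiliary regression needs more observations than parameters (p = 4 here).
p <- 4
print(all(sizes[c("group1", "group2")] > p))
```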
A character specifying the form of the alternative hypothesis. Use "greater" if it is suspected that the error variance is positively associated with the deflator variable, "less" if it is suspected that the error variance is negatively associated with the deflator variable, and "two.sided" if no information is available on the suspected direction of the association. Defaults to "greater".
A vector of probabilities corresponding to values of the test statistic (number of peaks) from 0 to \(n-1\) inclusive (used only when method is "nonparametric"). If NA (the default), probabilities are calculated within the function by calling ppeak. The user can improve computational performance of the test (for instance, when the test is being used repeatedly in a simulation) by pre-specifying the exact probability distribution of the number of peaks using this argument, e.g. by extracting the \(n\)th element of dpeakdat (or the \((n-p)\)th element, if BLUS residuals are used).
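To make the shape of such a probability vector concrete, here is a base R sketch that computes the null distribution of the number of peaks by recursion. It assumes the absolute residuals behave like an i.i.d. sample from a continuous distribution, so that the \(m\)th value is a new maximum with probability \(1/m\). The function names peak_pmf and peak_upper_tail are hypothetical, and agreement with the values stored in dpeakdat is assumed rather than verified here.

```r
# Hypothetical helper: P(number of peaks = k) for k = 0, ..., n - 1,
# assuming an i.i.d. continuous series of length n.
peak_pmf <- function(n) {
  pmf <- 1  # a single observation has zero peaks with probability 1
  if (n >= 2) {
    for (m in 2:n) {
      no_peak <- c(pmf, 0) * (m - 1) / m  # m-th value is not a new maximum
      new_peak <- c(0, pmf) / m           # m-th value is a new maximum
      pmf <- no_peak + new_peak
    }
  }
  names(pmf) <- 0:(n - 1)
  pmf
}

# Upper-tail probability P(number of peaks >= k) under the null.
peak_upper_tail <- function(k, n) {
  pmf <- peak_pmf(n)
  sum(pmf[(k + 1):n])
}

prob32 <- peak_pmf(32)  # one probability per value 0, ..., 31
print(round(prob32[1:8], 4))
```

Computing the vector once and reusing it is the kind of saving the argument is meant to enable in simulations.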
A character indicating the method to be used to compute two-sided \(p\)-values for the parametric test when alternative is "two.sided". The argument is passed to twosidedpval as its method argument.
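For intuition, a common way to "double" a one-sided \(p\)-value for an F statistic is sketched below in base R. This is an assumption about what the "doubled" option amounts to, not a reproduction of twosidedpval, and the function name doubled_pval_f is hypothetical. The "kulinskaya" option is not sketched.

```r
# Hypothetical helper: two-sided p-value obtained by doubling the smaller tail.
doubled_pval_f <- function(f_stat, df1, df2) {
  lower <- pf(f_stat, df1, df2)                      # P(F <= f_stat)
  upper <- pf(f_stat, df1, df2, lower.tail = FALSE)  # P(F >= f_stat)
  min(1, 2 * min(lower, upper))                      # cap the result at 1
}

# Example with 8 numerator and 8 denominator degrees of freedom.
print(doubled_pval_f(2.5, df1 = 8, df2 = 8))
```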
A character specifying which residuals to use: "ols" for OLS residuals (the default) or "blus" for BLUS residuals. The advantage of using BLUS residuals is that, under the null hypothesis, the assumption that the random series is independent and identically distributed is met (whereas with OLS residuals it is not). The disadvantage of using BLUS residuals is that only \(n-p\) residuals are used rather than the full \(n\). This argument is ignored if method is "parametric".
A logical. If TRUE, only the test statistic value is returned, instead of an object of class "htest". Defaults to FALSE.
Optional further arguments to pass to blus.
An object of class "htest". If the object is not assigned, its attributes are displayed in the console as a tibble using tidy.
The parametric test entails putting the data rows in increasing order of some specified deflator (one of the explanatory variables). A specified proportion of the most central observations (under this ordering) is removed, leaving a subset of lower observations and a subset of upper observations. Separate OLS regressions are fit to these two subsets of observations (using all variables from the original model). The test statistic is the ratio of the sum of squared residuals from the 'upper' model to the sum of squared residuals from the 'lower' model. Under the null hypothesis, and when the two subsets are of equal size (as with the default group1prop of 1 / 2), the test statistic is exactly F-distributed with numerator and denominator degrees of freedom both equal to \((n-c)/2 - p\), where \(n\) is the number of observations in the original regression model, \(c\) is the number of central observations removed, and \(p\) is the number of columns in the design matrix (number of parameters to be estimated, including intercept).
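The steps of the parametric test can be traced by hand in base R. The sketch below is illustrative only: gq_by_hand is a hypothetical function, it assumes an equal split of the remaining observations and the "greater" alternative, and it is not the package's implementation. It divides each sum of squared residuals by its residual degrees of freedom, which leaves the ratio unchanged when the two groups have the same size.

```r
# Hypothetical helper tracing the parametric Goldfeld-Quandt steps.
gq_by_hand <- function(formula, data, deflator, prop_central = 1/3) {
  n <- nrow(data)
  ordered <- data[order(data[[deflator]]), , drop = FALSE]  # sort rows by deflator
  n_central <- round(n * prop_central)   # central rows to remove
  n_lower <- floor((n - n_central) / 2)  # assumed equal split of the rest
  n_upper <- n - n_central - n_lower
  lower <- ordered[seq_len(n_lower), , drop = FALSE]
  upper <- ordered[seq(n - n_upper + 1, n), , drop = FALSE]

  fit_lower <- lm(formula, data = lower)  # auxiliary regression, lower group
  fit_upper <- lm(formula, data = upper)  # auxiliary regression, upper group
  df_lower <- df.residual(fit_lower)
  df_upper <- df.residual(fit_upper)

  # Ratio of mean squared residuals, 'upper' over 'lower'.
  f_stat <- (sum(residuals(fit_upper)^2) / df_upper) /
    (sum(residuals(fit_lower)^2) / df_lower)

  list(statistic = f_stat, df1 = df_upper, df2 = df_lower,
       p.value = pf(f_stat, df_upper, df_lower, lower.tail = FALSE))
}

res <- gq_by_hand(mpg ~ wt + qsec + am, data = mtcars,
                  deflator = "qsec", prop_central = 0.25)
print(res)
```

With the 32 rows of mtcars, 8 central rows are removed and each group keeps 12 rows, so both degrees of freedom equal \((32-8)/2 - 4 = 8\).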
The nonparametric test entails putting the residuals of the linear model in increasing order of some specified deflator (one of the explanatory variables). The test statistic is the number of peaks, with the \(j\)th absolute residual \(|e_j|\) defined as a peak if \(|e_j|\ge|e_i|\) for all \(i<j\). The first observation does not constitute a peak. If the number of peaks is large relative to the distribution of peaks under the null hypothesis, this constitutes evidence for heteroskedasticity.
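The peak count defined above can be computed with a running maximum. The following base R sketch uses OLS residuals ordered by a deflator. The function name count_peaks is hypothetical and the code is not taken from the package.

```r
# Hypothetical helper: number of peaks in a series of residuals.
count_peaks <- function(e) {
  abs_e <- abs(e)
  n <- length(abs_e)
  if (n < 2) return(0L)  # the first observation never counts as a peak
  running_max <- cummax(abs_e)
  # |e_j| is a peak when it is at least as large as every earlier value,
  # i.e. at least as large as the running maximum up to position j - 1.
  sum(abs_e[-1] >= running_max[-n])
}

# OLS residuals of the example model, ordered by the deflator qsec.
fit <- lm(mpg ~ wt + qsec + am, data = mtcars)
e_ordered <- residuals(fit)[order(mtcars$qsec)]
n_peaks <- count_peaks(e_ordered)
print(n_peaks)
```

If the error variance grows with the deflator, later absolute residuals tend to exceed earlier ones, which pushes the count upward.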
lmtest::gqtest, another implementation of the Goldfeld-Quandt Test (parametric method only).
library(skedastic)

# Fit the main linear model
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Parametric test, ordering by qsec and dropping the central 25% of observations
goldfeld_quandt(mtcars_lm, deflator = "qsec", prop_central = 0.25)
# This is equivalent to lmtest::gqtest(mtcars_lm, fraction = 0.25, order.by = mtcars$qsec)

# Nonparametric (peaks) test using BLUS residuals
goldfeld_quandt(mtcars_lm, deflator = "qsec", method = "nonparametric",
                restype = "blus")

# Two-sided versions of both tests
goldfeld_quandt(mtcars_lm, deflator = "qsec", prop_central = 0.25, alternative = "two.sided")
goldfeld_quandt(mtcars_lm, deflator = "qsec", method = "nonparametric",
                restype = "blus", alternative = "two.sided")