goldfeld_quandt: Goldfeld-Quandt Tests for Heteroskedasticity in a Linear Regression Model

Description

This function implements the two tests (parametric and nonparametric) of Goldfeld and Quandt (1965) for heteroskedasticity in a linear regression model.

Usage

goldfeld_quandt(
  mainlm,
  method = c("parametric", "nonparametric"),
  deflator = NA,
  prop_central = 1/3,
  group1prop = 1/2,
  alternative = c("greater", "less", "two.sided"),
  prob = NA,
  twosidedmethod = c("doubled", "kulinskaya"),
  restype = c("ols", "blus"),
  statonly = FALSE,
  ...
)

Arguments

mainlm

Either an object of class "lm" (e.g., generated by lm), or a list of two objects: a response vector and a design matrix. The objects are assumed to be in that order, unless they are given the names "X" and "y" to distinguish them. The design matrix passed in a list must begin with a column of ones if an intercept is to be included in the linear model. The design matrix passed in a list should not contain factors, as all columns are treated 'as is'. For tests that use ordinary least squares residuals, one can also pass a vector of residuals in the list, which should either be the third object or be named "e".
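As an illustration of the list form of mainlm, the following sketch builds a response vector and a design matrix from mtcars using base R only. The object names (y, X, mainlm_list) are arbitrary, and the commented-out call assumes that the package providing goldfeld_quandt is installed and loaded.

```r
# Response vector and design matrix for the model mpg ~ wt + qsec + am
y <- mtcars$mpg
X <- model.matrix(~ wt + qsec + am, data = mtcars)  # first column is the intercept (all ones)

# Named list, so the order of the two elements does not matter
mainlm_list <- list(y = y, X = X)

# Assumed usage (not run here):
# goldfeld_quandt(mainlm_list, deflator = "qsec")
```

Building X with model.matrix expands any factors into numeric columns beforehand, which suits the requirement that the design matrix contain no factors.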

method

A character indicating which of the two tests derived in Goldfeld and Quandt (1965) should be implemented. Possible values are "parametric" and "nonparametric". Default is "parametric". It is acceptable to specify only the first letter.

deflator

Either a character specifying a column name from the design matrix of mainlm or an integer giving the index of a column of the design matrix. This variable is suspected to be related to the error variance under the alternative hypothesis. deflator may not correspond to a column of 1's (intercept). Default NA means the data will be left in its current order (e.g. in case the existing index is believed to be associated with error variance).

prop_central

A double specifying the proportion of central observations to exclude from the F test (when method is "parametric" only). round is used to ensure the number of central observations is an integer. The value must be small enough to allow the two auxiliary regressions to be fit; otherwise an error is thrown. Defaults to 1 / 3.

group1prop

A double specifying the proportion of remaining observations (after excluding central observations) to allocate to the first group. The default value of 1 / 2 means that an equal number of observations is assigned to the first and second groups.

alternative

A character specifying the form of the alternative hypothesis. Use "greater" if the error variance is suspected to be positively associated with the deflator variable, "less" if it is suspected to be negatively associated with the deflator variable, and "two.sided" if no information is available on the suspected direction of the association. Defaults to "greater".

prob

A vector of probabilities corresponding to values of the test statistic (number of peaks) from 0 to \(n-1\) inclusive (used only when method is "nonparametric"). If NA (the default), probabilities are calculated within the function by calling ppeak. The user can improve computational performance of the test (for instance, when the test is being used repeatedly in a simulation) by pre-specifying the exact probability distribution of the number of peaks using this argument, e.g. by supplying the \(n\)th element of dpeakdat (or the \((n-p)\)th element, if BLUS residuals are used).
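To make the shape of such a probability vector concrete, here is a minimal base R sketch of the null distribution of the number of peaks. It assumes the ordered series consists of \(n\) independent, identically distributed continuous values (so ties have probability zero), in which case observation \(j\) is a peak with probability \(1/j\) independently of earlier observations. The helper peak_pmf is hypothetical and not part of any package; whether its output matches what ppeak or dpeakdat provide is an assumption that should be checked before use.

```r
# Probability mass function of the number of peaks among n i.i.d.
# continuous values (hypothetical helper, for illustration only).
peak_pmf <- function(n) {
  pmf <- 1                              # n = 1: zero peaks with certainty
  if (n >= 2) {
    for (j in 2:n) {
      stay <- c(pmf, 0) * (j - 1) / j   # observation j is not a peak
      up   <- c(0, pmf) / j             # observation j is a peak
      pmf  <- stay + up
    }
  }
  pmf                                   # element k + 1 is P(number of peaks = k)
}

prob32 <- peak_pmf(32)                  # n = 32 observations, as in mtcars

# Assumed usage (not run here):
# goldfeld_quandt(mtcars_lm, deflator = "qsec", method = "nonparametric", prob = prob32)
```

The returned vector has length \(n\), covering 0 to \(n-1\) peaks, which is the layout described for prob.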

twosidedmethod

A character indicating the method to be used to compute two-sided \(p\)-values for the parametric test when alternative is "two.sided". The argument is passed to twosidedpval as its method argument.

restype

A character specifying which residuals to use: "ols" for OLS residuals (the default) or "blus" for BLUS residuals. The advantage of using BLUS residuals is that, under the null hypothesis, the assumption that the random series is independent and identically distributed is met (whereas with OLS residuals it is not). The disadvantage of using BLUS residuals is that only \(n-p\) residuals are used rather than the full \(n\). This argument is ignored if method is "parametric".

statonly

A logical. If TRUE, only the test statistic value is returned, instead of an object of class "htest". Defaults to FALSE.

...

Optional further arguments to pass to blus.

Value

An object of class "htest". If the object is not assigned, its attributes are displayed in the console as a tibble using tidy.

Details

The parametric test entails putting the data rows in increasing order of some specified deflator (one of the explanatory variables). A specified proportion of the most central observations (under this ordering) is removed, leaving a subset of lower observations and a subset of upper observations. Separate OLS regressions are fit to these two subsets of observations (using all variables from the original model). The test statistic is the ratio of the sum of squared residuals from the 'upper' model to the sum of squared residuals from the 'lower' model. Under the null hypothesis, and when the two subsets are of equal size (the default), the test statistic is exactly F-distributed with numerator and denominator degrees of freedom both equal to \((n-c)/2 - p\), where \(n\) is the number of observations in the original regression model, \(c\) is the number of central observations removed, and \(p\) is the number of columns in the design matrix (number of parameters to be estimated, including intercept).
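The steps of the parametric test can be traced by hand in base R. The sketch below is an illustration under stated assumptions, not the function's actual implementation: it uses mtcars with qsec as the deflator, drops 25% of the observations from the centre, and splits the remainder equally. The handling of ties in the deflator and of rounding inside goldfeld_quandt may differ.

```r
# Manual sketch of the parametric test for mpg ~ wt + qsec + am
dat <- mtcars[order(mtcars$qsec), ]      # sort rows by the deflator
n <- nrow(dat)
n_central <- round(0.25 * n)             # number of central observations to drop
n_lower <- (n - n_central) / 2           # equal split of the remainder (a whole number here)
lower <- dat[seq_len(n_lower), ]
upper <- dat[(n_lower + n_central + 1):n, ]

# Separate OLS fits on the two subsets, using all original regressors
fit_lower <- lm(mpg ~ wt + qsec + am, data = lower)
fit_upper <- lm(mpg ~ wt + qsec + am, data = upper)

# Ratio of residual sums of squares, 'upper' over 'lower'
f_stat <- sum(residuals(fit_upper)^2) / sum(residuals(fit_lower)^2)
df <- df.residual(fit_upper)             # (n - c) / 2 - p
p_greater <- pf(f_stat, df, df, lower.tail = FALSE)  # upper-tail p-value ("greater")
```

Because both subsets have the same residual degrees of freedom, the ratio of residual sums of squares equals the ratio of residual mean squares, which is why it can be referred directly to an F distribution.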

The nonparametric test entails putting the residuals of the linear model in increasing order of some specified deflator (one of the explanatory variables). The test statistic is the number of peaks, with the \(j\)th absolute residual \(|e_j|\) defined as a peak if \(|e_j|\ge|e_i|\) for all \(i<j\). The first observation does not constitute a peak. If the number of peaks is large relative to the distribution of peaks under the null hypothesis, this constitutes evidence for heteroskedasticity.
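The peak-counting rule can be written compactly with a running maximum. The following base R sketch is illustrative only: count_peaks is a hypothetical helper, OLS residuals are used, and the residuals are ordered by qsec as an assumed deflator.

```r
# Count peaks: |e_j| is a peak if it is at least as large as every
# earlier absolute residual; the first observation never counts.
count_peaks <- function(e) {
  a <- abs(e)
  if (length(a) < 2) return(0L)
  running_max <- cummax(a)[-length(a)]   # max of |e_1|, ..., |e_(j-1)| for j = 2, ..., n
  sum(a[-1] >= running_max)
}

fit <- lm(mpg ~ wt + qsec + am, data = mtcars)
e_ordered <- residuals(fit)[order(mtcars$qsec)]  # residuals in increasing order of the deflator
n_peaks <- count_peaks(e_ordered)
```

For example, the absolute values of c(1, -2, 1, 3, -3) are 1, 2, 1, 3, 3, so the second, fourth, and fifth elements are peaks and the count is 3.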

References

Examples

# NOT RUN {
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
goldfeld_quandt(mtcars_lm, deflator = "qsec", prop_central = 0.25)
# This is equivalent to lmtest::gqtest(mtcars_lm, fraction = 0.25, order.by = mtcars$qsec)
goldfeld_quandt(mtcars_lm, deflator = "qsec", method = "nonparametric",
  restype = "blus")
goldfeld_quandt(mtcars_lm, deflator = "qsec", prop_central = 0.25, alternative = "two.sided")
goldfeld_quandt(mtcars_lm, deflator = "qsec", method = "nonparametric",
  restype = "blus", alternative = "two.sided")
# }
