robcp (version 0.3.8)

scale_stat: Test Statistic to Detect Scale Changes

Description

Computes the test statistic for CUSUM-based tests on scale changes.

Usage

scale_stat(x, version = c("empVar", "MD", "GMD", "Qalpha"), method = "kernel",
           control = list(), alpha = 0.8)

Value

Test statistic (numeric value) with the following attributes:

cp-location

indicating at which index a change point is most likely.

teststat

test process (before taking the maximum).

lrv-estimation

long run variance estimation method.

If method = "kernel" the following attributes are also included:

sigma

estimated long run variance.

param

parameter used for the lrv estimation.

kFun

kernel function used for the lrv estimation.

The returned value is an S3 object of class "cpStat".
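
For example, the attributes listed above can be read off with attr() (a minimal sketch; the simulated data are arbitrary):

res <- scale_stat(rnorm(100))
attr(res, "cp-location")   # most likely change point index
attr(res, "sigma")         # long run variance estimate (kernel method only)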

Arguments

x

time series (numeric or ts vector).

version

scale estimation method; one of "empVar", "MD", "GMD" or "Qalpha". See 'Details' below.

method

either "kernel" for performing a kernel-based long run variance estimation, or "bootstrap" for performing a dependent wild bootstrap. See 'Details' below.

control

a list of control parameters.

alpha

quantile of the distribution function of all absolute pairwise differences, used if version = "Qalpha".

Author

Sheila Görz

Details

Let \(n\) be the length of the time series. The CUSUM test statistic for testing for a change in scale is then defined as $$\hat{T}_{s} = \max_{1 < k \leq n} \frac{k}{\sqrt{n}} |\hat{s}_{1:k} - \hat{s}_{1:n}|,$$ where \(\hat{s}_{1:k}\) is a scale estimator computed using only the first \(k\) elements of the time series \(x\).
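
For exposition, the unscaled maximum can be traced in a few lines of base R, here with the empirical variance as the scale estimator (a sketch, not the package's implementation):

x <- rnorm(100)
n <- length(x)
s <- sapply(2:n, function(k) var(x[1:k]))      # scale estimates on x_1, ..., x_k
T_s <- max((2:n) / sqrt(n) * abs(s - var(x)))  # maximum over 1 < k <= n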

If method = "kernel", the test statistic \(\hat{T}_s\) is divided by the estimated long run variance \(\hat{D}_s\) so that it asymptotically follows a Kolmogorov distribution. \(\hat{D}_s\) is computed by the function lrv using kernel-based estimation.
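
Since the limit is the Kolmogorov distribution, an approximate p-value can be computed from its series expansion (a sketch; p_kolmogorov is a made-up helper name, not part of robcp):

# P(K > t) = 2 * sum_{j >= 1} (-1)^(j - 1) * exp(-2 * j^2 * t^2)
p_kolmogorov <- function(t, terms = 100) {
  j <- seq_len(terms)
  2 * sum((-1)^(j - 1) * exp(-2 * j^2 * t^2))
}
p_kolmogorov(1.36)  # approximately 0.05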

For the scale estimator \(\hat{s}_{1:k}\), there are four different options which can be selected via the version argument (minimal base-R sketches of each follow the list):

Empirical variance (empVar) $$\hat{\sigma}^2_{1:k} = \frac{1}{k-1} \sum_{i = 1}^k (x_i - \bar{x}_{1:k})^2; \; \bar{x}_{1:k} = k^{-1} \sum_{i = 1}^k x_i.$$

Mean deviation (MD) $$\hat{d}_{1:k}= \frac{1}{k-1} \sum_{i = 1}^k |x_i - med_{1:k}|,$$ where \(med_{1:k}\) is the median of \(x_1, ..., x_k\).

Gini's mean difference (GMD) $$\hat{g}_{1:k} = \frac{2}{k(k-1)} \sum_{1 \leq i < j \leq k} |x_i - x_j|.$$

\(Q^{\alpha}\) (Qalpha) $$\hat{Q}^{\alpha}_{1:k} = U_{1:k}^{-1}(\alpha) = \inf\{x | \alpha \leq U_{1:k}(x)\},$$ where \(U_{1:k}\) is the empirical distribution function of \(|x_i - x_j|, \, 1 \leq i < j \leq k\) (cf. Qalpha).
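
The following are minimal base-R sketches of the four estimators, for exposition only (the package computes them internally, and sequentially in \(k\)):

emp_var <- function(x) var(x)                                         # empVar
md      <- function(x) sum(abs(x - median(x))) / (length(x) - 1)      # MD
pd      <- function(x) { d <- abs(outer(x, x, "-")); d[lower.tri(d)] }
gmd     <- function(x) mean(pd(x))                                    # GMD
qalpha  <- function(x, alpha = 0.8) quantile(pd(x), alpha, type = 1)  # Qalpha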

For the kernel-based long run variance estimation, the default bandwidth \(b_n\) is determined as follows:

If \(\hat{\rho}_j\) is the estimated autocorrelation at lag \(j\), a maximal lag \(l\) is selected as the smallest integer \(k\) such that $$\max \{|\hat{\rho}_k|, ..., |\hat{\rho}_{k + \kappa_n}|\} \leq 2 \sqrt{\log_{10}(n) / n},$$ where \(\kappa_n = \max \{5, \sqrt{\log_{10}(n)}\}\). This \(l\) is determined for both the original data \(x\) and the squared data \(x^2\), and the maximum \(l_{max}\) of the two is taken. The bandwidth \(b_n\) is then the minimum of \(l_{max}\) and \(n^{1/3}\).
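
A rough base-R sketch of this rule under the definitions above (the actual selection is carried out inside lrv and may differ in details; the rounding via ceiling() and floor() is an assumption here):

bandwidth <- function(x) {
  n <- length(x)
  kappa <- ceiling(max(5, sqrt(log10(n))))  # kappa_n, rounded up (assumption)
  thr <- 2 * sqrt(log10(n) / n)
  max_lag <- function(y) {
    rho <- drop(acf(y, lag.max = n - 1, plot = FALSE)$acf)[-1]  # rho_1, rho_2, ...
    for (k in seq_len(length(rho) - kappa))
      if (max(abs(rho[k:(k + kappa)])) <= thr) return(k)
    length(rho)
  }
  min(max(max_lag(x), max_lag(x^2)), floor(n^(1/3)))  # b_n
}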

References

Gerstenberger, C., Vogel, D., and Wendler, M. (2020). Tests for scale changes based on pairwise differences. Journal of the American Statistical Association, 115(531), 1336-1348.

See Also

lrv, Qalpha

Examples

x <- arima.sim(list(ar = 0.5), 100)  # stationary AR(1) series

# under H_0 (constant scale):
scale_stat(x, "GMD")
scale_stat(x, "Qalpha", method = "bootstrap")

# under the alternative (scale triples in the second half):
x[51:100] <- x[51:100] * 3
scale_stat(x)  # default version "empVar"
scale_stat(x, "Qalpha", method = "bootstrap")
