robcp (version 0.3.8)

scale_stat: Test Statistic to Detect Scale Changes

Description

Computes the test statistic for CUSUM-based tests on scale changes.

Usage

scale_stat(x, version = c("empVar", "MD", "GMD", "Qalpha"), method = "kernel",
           control = list(), alpha = 0.8)

Value

Test statistic (numeric value) with the following attributes:

cp-location

indicating at which index a change point is most likely.

teststat

test process (before taking the maximum).

lrv-estimation

long run variance estimation method.

If method = "kernel" the following attributes are also included:

sigma

estimated long run variance.

param

parameter used for the lrv estimation.

kFun

kernel function used for the lrv estimation.

The returned value is an S3 object of class "cpStat".
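
For example, the attributes listed above can be read off with attr() (a minimal sketch; the simulated data are arbitrary):

res <- scale_stat(rnorm(100))
attr(res, "cp-location")   # most likely change point index
attr(res, "sigma")         # long run variance estimate (kernel method only)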

Arguments

x

time series (numeric or ts vector).

version

scale estimation method; one of "empVar", "MD", "GMD" or "Qalpha". See 'Details' below.

method

either "kernel" for performing a kernel-based long run variance estimation, or "bootstrap" for performing a dependent wild bootstrap. See 'Details' below.

control

a list of control parameters.

alpha

quantile of the distribution function of all absolute pairwise differences, used if version = "Qalpha".

Author

Sheila Görz

Details

Let \(n\) be the length of the time series. The CUSUM test statistic for testing for a change in scale is then defined as $$\hat{T}_{s} = \max_{1 < k \leq n} \frac{k}{\sqrt{n}} |\hat{s}_{1:k} - \hat{s}_{1:n}|,$$ where \(\hat{s}_{1:k}\) is a scale estimator computed using only the first \(k\) elements of the time series \(x\).
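
For exposition, the unscaled maximum can be traced in a few lines of base R, here with the empirical variance as the scale estimator (a sketch, not the package's implementation):

x <- rnorm(100)
n <- length(x)
s <- sapply(2:n, function(k) var(x[1:k]))      # scale estimates on x_1, ..., x_k
T_s <- max((2:n) / sqrt(n) * abs(s - var(x)))  # maximum over 1 < k <= n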

If method = "kernel", the test statistic \(\hat{T}_s\) is divided by the estimated long run variance \(\hat{D}_s\) so that it asymptotically follows a Kolmogorov distribution. \(\hat{D}_s\) is computed by the function lrv using kernel-based estimation.
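
Since the limit is the Kolmogorov distribution, an approximate p-value can be computed from its series expansion (a sketch; p_kolmogorov is a made-up helper name, not part of robcp):

# P(K > t) = 2 * sum_{j >= 1} (-1)^(j - 1) * exp(-2 * j^2 * t^2)
p_kolmogorov <- function(t, terms = 100) {
  j <- seq_len(terms)
  2 * sum((-1)^(j - 1) * exp(-2 * j^2 * t^2))
}
p_kolmogorov(1.36)  # approximately 0.05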

For the scale estimator \(\hat{s}_{1:k}\), there are four different options which can be selected via the version argument (minimal base-R sketches of each follow the list):

Empirical variance (empVar) $$\hat{\sigma}^2_{1:k} = \frac{1}{k-1} \sum_{i = 1}^k (x_i - \bar{x}_{1:k})^2; \; \bar{x}_{1:k} = k^{-1} \sum_{i = 1}^k x_i.$$

Mean deviation (MD) $$\hat{d}_{1:k}= \frac{1}{k-1} \sum_{i = 1}^k |x_i - med_{1:k}|,$$ where \(med_{1:k}\) is the median of \(x_1, ..., x_k\).

Gini's mean difference (GMD) $$\hat{g}_{1:k} = \frac{2}{k(k-1)} \sum_{1 \leq i < j \leq k} |x_i - x_j|.$$

\(Q^{\alpha}\) (Qalpha) $$\hat{Q}^{\alpha}_{1:k} = U_{1:k}^{-1}(\alpha) = \inf\{x | \alpha \leq U_{1:k}(x)\},$$ where \(U_{1:k}\) is the empirical distribution function of \(|x_i - x_j|, \, 1 \leq i < j \leq k\) (cf. Qalpha).
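
The following are minimal base-R sketches of the four estimators, for exposition only (the package computes them internally, and sequentially in \(k\)):

emp_var <- function(x) var(x)                                         # empVar
md      <- function(x) sum(abs(x - median(x))) / (length(x) - 1)      # MD
pd      <- function(x) { d <- abs(outer(x, x, "-")); d[lower.tri(d)] }
gmd     <- function(x) mean(pd(x))                                    # GMD
qalpha  <- function(x, alpha = 0.8) quantile(pd(x), alpha, type = 1)  # Qalpha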

For the kernel-based long run variance estimation, the default bandwidth \(b_n\) is determined as follows:

If \(\hat{\rho}_j\) is the estimated autocorrelation at lag \(j\), a maximal lag \(l\) is selected as the smallest integer \(k\) such that $$\max \{|\hat{\rho}_k|, ..., |\hat{\rho}_{k + \kappa_n}|\} \leq 2 \sqrt{\log_{10}(n) / n},$$ where \(\kappa_n = \max \{5, \sqrt{\log_{10}(n)}\}\). This \(l\) is determined for both the original data \(x\) and the squared data \(x^2\), and the maximum \(l_{max}\) of the two is taken. The bandwidth \(b_n\) is then the minimum of \(l_{max}\) and \(n^{1/3}\).
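
A rough base-R sketch of this rule under the definitions above (the actual selection is carried out inside lrv and may differ in details; the rounding via ceiling() and floor() is an assumption here):

bandwidth <- function(x) {
  n <- length(x)
  kappa <- ceiling(max(5, sqrt(log10(n))))  # kappa_n, rounded up (assumption)
  thr <- 2 * sqrt(log10(n) / n)
  max_lag <- function(y) {
    rho <- drop(acf(y, lag.max = n - 1, plot = FALSE)$acf)[-1]  # rho_1, rho_2, ...
    for (k in seq_len(length(rho) - kappa))
      if (max(abs(rho[k:(k + kappa)])) <= thr) return(k)
    length(rho)
  }
  min(max(max_lag(x), max_lag(x^2)), floor(n^(1/3)))  # b_n
}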

References

Gerstenberger, C., Vogel, D., and Wendler, M. (2020). Tests for scale changes based on pairwise differences. Journal of the American Statistical Association, 115(531), 1336-1348.

See Also

lrv, Qalpha

Examples

x <- arima.sim(list(ar = 0.5), 100)  # stationary AR(1) series

# under H_0 (constant scale):
scale_stat(x, "GMD")
scale_stat(x, "Qalpha", method = "bootstrap")

# under the alternative (scale triples in the second half):
x[51:100] <- x[51:100] * 3
scale_stat(x)  # default version "empVar"
scale_stat(x, "Qalpha", method = "bootstrap")
