seqOpenEndCpMean: Open-end Nonparametric Sequential Change-Point Detection Test for Univariate Time Series Sensitive to Changes in the Mean

Description

Open-end nonparametric sequential test for change-point detection based on the retrospective CUSUM statistic. The observations need to be univariate but can be serially dependent. The test is carried out in two steps. The first step, performed by detOpenEndCpMean(), computes a detector function. The second step, performed by monOpenEndCpMean(), compares the detector function to a suitable constant threshold function. The current implementation is preliminary and not optimized for real-time monitoring (although it can still be used for that purpose). Details can be found in the third reference (Holmes and Kojadinovic, 2021).
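The two-step logic (compute a detector, then compare it to a constant threshold) can be illustrated with a minimal base-R sketch. This is not the implementation used by the package: toy_detector() is a hypothetical helper computing a simplified ordinary CUSUM-type detector, sd() is used as a scale estimate that is only appropriate for independent observations, and the threshold value is an arbitrary constant chosen for illustration rather than a calibrated critical value.

```r
## Toy illustration of the two-step procedure (NOT the npcp implementation)
set.seed(1)
m <- 100                                  # size of the learning sample
x.learn <- rnorm(m)                       # learning sample, no change
x <- c(rnorm(400), rnorm(600, mean = 1))  # monitored observations, change in mean

## Step 1: compute a detector function (hypothetical, simplified CUSUM)
toy_detector <- function(x.learn, x) {
  m <- length(x.learn)
  k <- seq_along(x)                       # number of monitored observations so far
  s <- sd(x.learn)                        # assumption: independent observations
  abs(cumsum(x - mean(x.learn))) / (s * sqrt(m) * (1 + k / m))
}
det <- toy_detector(x.learn, x)

## Step 2: compare the detector to a constant threshold
threshold <- 2.24                         # illustrative value only
exceed <- det > threshold
alarm <- any(exceed)
## Toy convention: time counted from the start of the learning sample
time.alarm <- if (alarm) m + which(exceed)[1] else NA_integer_
```

In the package itself, the detectors and thresholds are those described in the third reference; the sketch only mirrors the overall workflow.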

Usage

detOpenEndCpMean(x.learn, x, sigma = NULL, b = NULL,
          weights = c("parzen", "bartlett"))
monOpenEndCpMean(det, statistic = c("t", "s", "r", "e", "cs"), eta = 0.001,
          gamma = 0.45, alpha = 0.05, sigma = NULL, plot = TRUE)

Value

Both functions return lists whose components have explicit names. The function monOpenEndCpMean() in particular returns a list whose components are

alarm: a logical indicating whether the detector function has exceeded the threshold function.
time.alarm: an integer corresponding to the time at which the detector function has exceeded the threshold function, or NA if no alarm was raised.
times.max: a vector of times at which the successive detectors "r" (if statistic = "r", statistic = "s" or statistic = "t") or "e" (if statistic = "e") have reached their maximum; a vector of NA's if statistic = "cs"; this sequence of times can be used to estimate the time of change from the time of alarm.
time.change: an integer giving the estimated time of change if alarm is TRUE; the latter is simply the value in times.max which corresponds to time.alarm.
statistic: the value of statistic in the call of the function.
eta: the value of eta in the call of the function.
gamma: the value of gamma in the call of the function.
alpha: the value of alpha in the call of the function.
sigma: the value of sigma in the call of the function.
detector: the successive values of the chosen detector.
threshold: the value of the constant threshold for the chosen detector.

Arguments

x.learn: a numeric vector representing the learning sample.
x: a numeric vector representing the observations collected after the beginning of the monitoring for a change in mean.
sigma: an estimate of the long-run variance of the time series of which x.learn is a stretch. If set to NULL, sigma will be estimated using an approach similar to those described in the fourth reference.
b: strictly positive integer specifying the value of the bandwidth for the estimation of the long-run variance if sigma is not provided. If set to NULL, b will be estimated from x.learn using the function bOpt().
weights: a string specifying the kernel for creating the weights used for the estimation of the long-run variance if sigma is not provided; see Section 5 of the first reference.
det: an object of class det.cpMean representing a detector function computed using detOpenEndCpMean().
statistic: a string specifying the statistic/detector to be used for the monitoring; can be either "t", "s", "r", "e" or "cs"; "t" corresponds to the detector \(T_m\) in the third reference, "s" to the detector \(S_m\), "r" to the detector \(R_m\), "e" to the detector \(E_m\) and "cs" to the so-called ordinary CUSUM detector denoted by \(Q_m\) in the third reference. Note that the detector \(E_m\) was proposed in the second reference.
eta: a real parameter whose role is described in detail in the third reference.
gamma: a real parameter that can improve the power of the sequential test at the beginning of the monitoring; possible values are 0, 0.1, 0.25, 0.45, 0.65 and 0.85, but not for all statistics; see the third reference.
alpha: the value of the desired significance level for the sequential test.
plot: logical indicating whether the monitoring should be plotted.

Details

The testing procedure is described in detail in the third reference. An alternative way of estimating the long-run variance is to use the function lrvar() of the package sandwich and to pass it through the argument sigma.
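To give an idea of what a long-run variance estimate passed through sigma looks like, here is a minimal base-R sketch of a Bartlett-kernel estimator. It is neither the estimator used internally by the package nor the one computed by lrvar(); lrv_bartlett() and the bandwidth b = 5 are hypothetical choices made for illustration. The call to detOpenEndCpMean() is only attempted if the package npcp is installed.

```r
## Hypothetical helper: Bartlett-kernel long-run variance estimate
lrv_bartlett <- function(x, b) {
  n <- length(x)
  xc <- x - mean(x)
  ## Sample autocovariances at lags 0, ..., b
  acov <- sapply(0:b, function(h) sum(xc[1:(n - h)] * xc[(1 + h):n]) / n)
  w <- 1 - (0:b) / (b + 1)                # Bartlett weights
  acov[1] + 2 * sum(w[-1] * acov[-1])
}

set.seed(123)
x.learn <- rnorm(100)                     # learning sample
x <- rnorm(200)                           # monitored observations
sigma.hat <- lrv_bartlett(x.learn, b = 5) # b = 5 is an arbitrary bandwidth

## Pass the estimate through the argument sigma (requires npcp)
if (requireNamespace("npcp", quietly = TRUE)) {
  det <- npcp::detOpenEndCpMean(x.learn = x.learn, x = x, sigma = sigma.hat)
}
```

With b = 0 the helper reduces to the (biased) sample variance, which is a reasonable choice only when the observations are serially independent.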

References

A. Bücher and I. Kojadinovic (2016), A dependent multiplier bootstrap for the sequential empirical copula process under strong mixing, Bernoulli 22:2, pages 927-968, https://arxiv.org/abs/1306.3930.

J. Gösmann, T. Kley and H. Dette (2021), A new approach for open-end sequential change point monitoring, Journal of Time Series Analysis 42:1, pages 63-84, https://arxiv.org/abs/1906.03225.

M. Holmes and I. Kojadinovic (2021), Open-end nonparametric sequential change-point detection based on the retrospective CUSUM statistic, Electronic Journal of Statistics 15:1, pages 2288-2335, doi:10.1214/21-EJS1840.

D.N. Politis and H. White (2004), Automatic block-length selection for the dependent bootstrap, Econometric Reviews 23:1, pages 53-70.

Examples


if (FALSE) {
## Example of open-end monitoring
m <- 100 # size of the learning sample

## The learning sample
set.seed(123)
x.learn <- rnorm(m)

## New observations with a change in mean
## to simulate monitoring for the period m+1, ..., n
n <- 5000
k <- 2500 ## the true change-point
x <- c(rnorm(k-m), rnorm(n-k, mean = 0.2))

## Step 1: Compute the detector
det <- detOpenEndCpMean(x.learn = x.learn, x = x)

## Step 2: Monitoring with the default detector
m1 <- monOpenEndCpMean(det)
str(m1)

## Monitoring with another detector
m2 <- monOpenEndCpMean(det, statistic = "s", gamma = 0.85)
str(m2)
}
