Estimates the long run variance or, for multivariate data, the long run covariance matrix of the supplied time series.
lrv(x, method = c("kernel", "subsampling", "bootstrap", "none"), control = list())
Long run variance \(\sigma^2\) (numeric) or long run covariance matrix \(\Sigma\) (numeric matrix).
vector or matrix with each column representing a time series (numeric).
method of estimation. Options are kernel, subsampling, bootstrap and none.
a list of control parameters. See 'Details'.
Sheila Görz
The long run variance equals the limit of \(n\) times the variance of the arithmetic mean of a short range dependent time series, where \(n\) is the length of the time series. It is used to standardize tests concerning the mean of dependent data.
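As a hedged illustration: for a stationary short range dependent series, the limit described above coincides with the sum of the autocovariances over all lags. The sketch below checks this numerically for an AR(1) process with coefficient 0.5 and unit innovation variance. The process and the names `phi`, `acov` and `lrv_true` are assumptions chosen for the example and are not part of lrv.

```r
# AR(1) process x_t = phi * x_{t-1} + e_t with Var(e_t) = 1 (illustrative choice)
phi <- 0.5

# Autocovariance of this AR(1) process at lag h
acov <- function(h) phi^abs(h) / (1 - phi^2)

# Long run variance as the sum of autocovariances, truncated at lag 200
lrv_true <- sum(acov(-200:200))

# Compare with the closed form 1 / (1 - phi)^2
c(sum_of_autocovariances = lrv_true, closed_form = 1 / (1 - phi)^2)
```

For positively correlated data the long run variance exceeds the ordinary variance (here 4 versus 4/3), which is why tests on the mean need this standardization.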
If method = "none", no long run variance estimation is performed and the value 1 is returned (i.e. the test statistic is not altered).
The control argument is a list that can supply any of the following components:
kFun
Kernel function (character string). More in 'Notes'.
b_n
Bandwidth (numeric > 0 and smaller than sample size).
gamma0
Replace a negative long run variance estimate by the estimated variance (autocovariance at lag 0)? Boolean.
l
Block length (numeric > 0 and smaller than sample size).
overlapping
Overlapping subsampling estimation? Boolean.
distr
Transform observations by their empirical distribution function? Boolean. Default is FALSE.
B
Bootstrap repetitions (integer).
seed
RNG seed (numeric).
version
What property does the CUSUM test test for? Character string, details below.
loc
Estimated location corresponding to version. Numeric value, details below.
scale
Estimated scale corresponding to version. Numeric value, details below.
Kernel-based estimation
The kernel-based long run variance estimation is available for various testing scenarios (set by control$version) and for both one- and multi-dimensional data. It uses the bandwidth \(b_n = \) control$b_n and the kernel function \(k(x) = \) control$kFun. For tests on certain properties, a corresponding location estimate control$loc (\(m_n\)) and scale estimate control$scale (\(v_n\)) must also be supplied. Supported testing scenarios are:
"mean"
1-dim. data:
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^2 + \frac{2}{n} \sum_{h = 1}^{b_n} \sum_{i = 1}^{n - h} (x_i - \bar{x}) (x_{i + h} - \bar{x}) k(h / b_n).$$
If control$distr = TRUE, then the long run variance is estimated on the empirical distribution of \(x\). The resulting value is then multiplied by \(\sqrt{\pi} / 2\).
Default values: b_n = \(0.9 n^{1/3}\), kFun = "bartlett".
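The one-dimensional estimator for version "mean" can be sketched in base R as follows. This is a minimal illustration of the formula above with the Bartlett kernel and the default bandwidth, not the package's implementation. The helper names `bartlett` and `lrv_kernel_mean` are hypothetical, and truncating the lag sum at `floor(b_n)` is an assumption for non-integer bandwidths.

```r
# Bartlett kernel: k(u) = 1 - |u| for |u| <= 1 and 0 otherwise
bartlett <- function(u) pmax(1 - abs(u), 0)

# Hypothetical helper (not part of the package): kernel-based estimate, version "mean"
lrv_kernel_mean <- function(x, b_n = 0.9 * length(x)^(1/3), k = bartlett) {
  n <- length(x)
  xc <- x - mean(x)                    # centred observations
  est <- sum(xc^2) / n                 # lag-0 term
  H <- min(floor(b_n), n - 1)          # assumption: lags run up to floor(b_n)
  for (h in seq_len(H)) {
    # weighted autocovariance at lag h, counted twice for lags h and -h
    est <- est + 2 / n * sum(xc[1:(n - h)] * xc[(1 + h):n]) * k(h / b_n)
  }
  est
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 500))
lrv_kernel_mean(x)
```

With a bandwidth below 1 no lagged terms enter, and the estimate reduces to the variance with divisor \(n\).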
multivariate time series: The \(k,l\)-element of \(\Sigma\) is estimated by $$\hat{\Sigma}^{(k,l)} = \frac{1}{n} \sum_{i,j = 1}^{n}(x_i^{(k)} - \bar{x}^{(k)}) (x_j^{(l)} - \bar{x}^{(l)}) k((i-j) / b_n),$$ \(k, l = 1, ..., m\).
Default values: b_n = \(\log_{1.8 + m / 40}(n / 50)\), kFun = "bartlett".
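The multivariate estimator can be written compactly in matrix form: with the centred data matrix \(X_c\) and the weight matrix \(W_{ij} = k((i-j)/b_n)\), the double sum equals \(X_c^T W X_c / n\). The sketch below follows that reading. The function name `lrv_kernel_multi` is hypothetical, and the simulated data are only an example.

```r
# Hypothetical helper (not part of the package): kernel-based long run covariance matrix
lrv_kernel_multi <- function(X, b_n, k = function(u) pmax(1 - abs(u), 0)) {
  X <- as.matrix(X)
  n <- nrow(X)
  Xc <- sweep(X, 2, colMeans(X))                   # centre each column
  W <- k(outer(seq_len(n), seq_len(n), "-") / b_n) # weights k((i - j) / b_n)
  crossprod(Xc, W %*% Xc) / n                      # t(Xc) %*% W %*% Xc / n
}

set.seed(1)
X <- matrix(rnorm(400), ncol = 2)                  # n = 200 observations, m = 2 series
b_n <- log(nrow(X) / 50, base = 1.8 + ncol(X) / 40) # default bandwidth from the text
Sigma_hat <- lrv_kernel_multi(X, b_n)
Sigma_hat
```

Note that the default bandwidth is only positive for \(n > 50\).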
"empVar"
for tests on changes in the empirical variance.
$$\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} ((x_i - m_n)^2 - v_n)((x_{i+|h|} - m_n)^2 - v_n).$$
Default values: \(m_n =\) mean(x), \(v_n = \) var(x).
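For version "empVar" the formula amounts to a kernel-weighted sum of autocovariances of the transformed series \((x_i - m_n)^2 - v_n\). A minimal base R sketch follows. The helpers `kernel_sum` and `lrv_empvar` are hypothetical, the Bartlett kernel stands in for \(W\), and the bandwidth in the usage line is an arbitrary illustrative choice.

```r
bartlett <- function(u) pmax(1 - abs(u), 0)

# Hypothetical helper: sum over h of W(|h| / b_n) * (1/n) * sum_i psi_i * psi_{i+|h|}
kernel_sum <- function(psi, b_n, W = bartlett) {
  n <- length(psi)
  total <- sum(psi^2) / n                         # lag h = 0
  for (h in seq_len(n - 1)) {
    w <- W(h / b_n)
    # lags h and -h contribute equally, hence the factor 2
    if (w != 0) total <- total + 2 * w * sum(psi[1:(n - h)] * psi[(1 + h):n]) / n
  }
  total
}

# Hypothetical helper: estimate for version "empVar" with the default m_n and v_n
lrv_empvar <- function(x, b_n, m_n = mean(x), v_n = var(x)) {
  kernel_sum((x - m_n)^2 - v_n, b_n)
}

set.seed(1)
x <- rnorm(100)
lrv_empvar(x, b_n = 0.9 * length(x)^(1/3))        # bandwidth chosen for illustration only
```

The same `kernel_sum` helper would apply to version "MD" after replacing the transformed series by \(|x_i - m_n| - v_n\).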
"MD"
for tests on a change in the median deviation.
$$\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} (|x_i - m_n| - v_n)(|x_{i+|h|} - m_n| - v_n).$$
Default values: \(m_n =\) median(x), \(v_n = \frac{1}{n-1} \sum_{i = 1}^n |x_i - m_n|\).
"GMD"
for tests on changes in Gini's mean difference.
$$\hat{\sigma}^2 = 4 \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|})$$
with \(\hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n |x - x_i| - v_n\).
Default value: \(v_n =\) \(\frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} |x_i - x_j|.\)
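A sketch of the "GMD" estimator under the same conventions: \(\hat{\phi}_n\) is evaluated at each observation via the matrix of pairwise distances, and the result is plugged into the kernel-weighted sum. The function name `lrv_gmd` is hypothetical and the Bartlett kernel is an assumed choice for \(W\).

```r
bartlett <- function(u) pmax(1 - abs(u), 0)

# Hypothetical helper (not part of the package): estimate for version "GMD"
lrv_gmd <- function(x, b_n, W = bartlett) {
  n <- length(x)
  D <- abs(outer(x, x, "-"))            # pairwise distances |x_i - x_j|
  v_n <- sum(D) / (n * (n - 1))         # Gini's mean difference, the default v_n
  phi <- rowMeans(D) - v_n              # phi_n evaluated at x_1, ..., x_n
  # (1/n) * sum_i phi_i * phi_{i+h} for h = 0, ..., n - 1
  gam <- sapply(0:(n - 1), function(h) sum(phi[1:(n - h)] * phi[(1 + h):n]) / n)
  w <- W((0:(n - 1)) / b_n)
  4 * (w[1] * gam[1] + 2 * sum(w[-1] * gam[-1]))
}

set.seed(1)
lrv_gmd(rnorm(100), b_n = 4)            # bandwidth chosen for illustration only
```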
"Qalpha"
for tests on changes in Qalpha.
$$\hat{\sigma}^2 = \frac{4}{\hat{u}(v_n)} \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n(x_i)\hat{\phi}_n(x_{i+|h|}),$$
where \(\hat{\phi}_n(x) = n^{-1} \sum_{i = 1}^n 1_{\{|x - x_i| \leq v_n\}} - m_n\) and
$$\hat{u}(t) = \frac{2}{n(n-1)h_n} \sum_{1 \leq i < j \leq n} K\left(\frac{|x_i - x_j| - t}{h_n}\right)$$
is the kernel density estimate of the density \(u\) corresponding to the distribution function \(U(t) = P(|X-Y| \leq t)\), with bandwidth \(h_n =\) IQR(x) \(n^{-\frac{1}{3}}\), and \(K\) is the quadratic kernel function.
Default values: \(m_n = \alpha = 0.5\), \(v_n =\) Qalpha(x, m_n)[n-1].
"tau"
for tests on changes in Kendall's tau.
Only available for bivariate data: assume that the given data x has the format \((x_i, y_i)_{i = 1, ..., n}\).
$$\hat{\sigma}^2 = \sum_{h = -(n-1)}^{n-1} W \left( \frac{|h|}{b_n} \right) \frac{1}{n} \sum_{i = 1}^{n - |h|} \hat{\phi}_n((x_i, y_i))\hat{\phi}_n((x_{i+|h|}, y_{i+|h|})),$$
where \(\hat{\phi}_n((x, y)) = 4 F_n(x, y) - 2 F_{X,n}(x) - 2 F_{Y,n}(y) + 1 - v_n\) and \(F_n\), \(F_{X,n}\) and \(F_{Y,n}\) are the empirical distribution functions of \(((X_i, Y_i))_{i = 1, ..., n}\), \((X_i)_{i = 1, ..., n}\) and \((Y_i)_{i = 1, ..., n}\).
Default value: \(v_n = \hat{\tau}_n = \frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} sign\left((x_j - x_i)(y_j - y_i)\right)\).
"rho"
for tests on changes in Spearman's rho.
Only available for \(d\)-variate data with \(d > 1\): assume that the given data x has the format \((x_{i,j} | i = 1, ..., n; j = 1, ..., d)\).
$$\hat{\sigma}^2 = a(d)^2 2^{2d} \left\{ \sum_{h = -(n-1)}^{n-1} K\left( \frac{|h|}{b_n} \right) \left( \sum_{i = 1}^{n-|h|} n^{-1} \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j) \hat{\phi}_n(x_{i+|h|}, x_j) - M^2 \right) \right\} ,$$
where \(a(d) = (d+1) / (2^d - d - 1)\), \(M = n^{-1} \sum_{i = 1}^n \prod_{j = 1}^d \hat{\phi}_n(x_i, x_j)\) and \(\hat{\phi}_n(x, y) = 1 - \hat{U}_n(x, y)\), \(\hat{U}_n(x, y) = n^{-1}\) (rank of \(x_{i,j}\) in \(x_{i,1}, ..., x_{i,n})\).
When control$gamma0 = TRUE (default), negative estimates of the long run variance are replaced by the autocovariance at lag 0 (= ordinary variance of the data). The function will then throw a warning.
Subsampling estimation
For method = "subsampling" there are an overlapping and a non-overlapping version (parameter control$overlapping). It can also be specified whether the observations x are transformed by their empirical distribution function \(\tilde{F}_n\) (parameter control$distr). The block length \(l\) can be controlled via control$l.
If control$overlapping = TRUE and control$distr = TRUE:
$$\hat{\sigma}_n = \frac{\sqrt{\pi}}{\sqrt{2l}(n - l + 1)} \sum_{i = 0}^{n-l} \left| \sum_{j = i+1}^{i+l} (F_n(x_j) - 0.5) \right|.$$
Otherwise, if control$distr = FALSE, the estimator is
$$\hat{\sigma}^2 = \frac{1}{l (n - l + 1)} \sum_{i = 0}^{n-l} \left( \sum_{j = i + 1}^{i+l} x_j - \frac{l}{n} \sum_{j = 1}^n x_j \right)^2.$$
If control$overlapping = FALSE and control$distr = TRUE:
$$\hat{\sigma} = \frac{1}{n/l} \sqrt{\pi/2} \sum_{i = 1}^{n/l} \frac{1}{\sqrt{l}} \left| \sum_{j = (i-1)l + 1}^{il} F_n(x_j) - \frac{l}{n} \sum_{j = 1}^n F_n(x_j) \right|.$$
Otherwise, if control$distr = FALSE, the estimator is
$$\hat{\sigma}^2 = \frac{1}{n/l} \sum_{i = 1}^{n/l} \frac{1}{l} \left(\sum_{j = (i-1)l + 1}^{il} x_j - \frac{l}{n} \sum_{j = 1}^n x_j\right)^2.$$
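The two subsampling estimators for control$distr = FALSE differ only in which blocks are used, so they can share one sketch. The function name `lrv_subsampling` is hypothetical. For the non-overlapping case the sketch assumes that \(n/l\) is rounded down and that leftover observations at the end are ignored. This is an illustration of the formulas, not the package's implementation.

```r
# Hypothetical helper (not part of the package): subsampling estimate without the distr transform
lrv_subsampling <- function(x, l, overlapping = TRUE) {
  n <- length(x)
  cs <- c(0, cumsum(x))                         # cs[i + 1] = x_1 + ... + x_i
  if (overlapping) {
    starts <- 0:(n - l)                         # blocks x_{i+1}, ..., x_{i+l}, i = 0, ..., n - l
  } else {
    starts <- (seq_len(n %/% l) - 1) * l        # disjoint blocks (assumes floor(n / l) blocks)
  }
  block_sums <- cs[starts + l + 1] - cs[starts + 1]
  # average squared deviation of the block sums from l * mean(x), divided by l
  mean((block_sums - l * mean(x))^2) / l
}

x <- 1:6
lrv_subsampling(x, l = 2)                       # overlapping blocks
lrv_subsampling(x, l = 2, overlapping = FALSE)  # non-overlapping blocks
```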
Default values: overlapping = TRUE, the block length is chosen adaptively: $$l_n = \max{\left\{ \left\lceil n^{1/3} \left( \frac{2 \rho}{1 - \rho^2} \right)^{(2/3)} \right\rceil, 1 \right\}}$$ where \(\rho\) is the Spearman autocorrelation at lag 1.
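The adaptive block length can be sketched as below. The function name `block_length` is hypothetical. Taking the absolute value of \(\rho\) is an assumption of this sketch: without it, a negative autocorrelation would produce NaN when raised to the power 2/3 in R, and how the package treats that case is not stated here.

```r
# Hypothetical helper (not part of the package): adaptive block length l_n
block_length <- function(x) {
  n <- length(x)
  # Spearman autocorrelation at lag 1: rank correlation of (x_1..x_{n-1}) and (x_2..x_n)
  rho <- cor(x[-n], x[-1], method = "spearman")
  # abs() is an assumption of this sketch to keep the power defined for rho < 0
  max(ceiling(n^(1/3) * (2 * abs(rho) / (1 - rho^2))^(2/3)), 1)
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
l_n <- block_length(x)
l_n
```

Stronger serial dependence yields longer blocks, while nearly uncorrelated data fall back to the minimum block length of 1.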
Bootstrap estimation
If method = "bootstrap", a dependent wild bootstrap with the parameters \(B = \) control$B, \(l = \) control$l and \(k(x) = \) control$kFun is performed:
$$ \hat{\sigma}^2 = \sqrt{n} Var(\bar{x^*_k} - \bar{x}), k = 1, ..., B$$
A single \(x_{ik}^*\) is generated by \(x_i^* = \bar{x} + (x_i - \bar{x}) a_i\), where the \(a_i\) are independent of the data x and are generated from a multivariate normal distribution with \(E(A_i) = 0\), \(Var(A_i) = 1\) and \(Cov(A_i, A_j) = k\left(\frac{i - j}{l}\right), i = 1, ..., n; j \neq i\). A seed can optionally be specified via control$seed (cf. set.seed). Only "bartlett", "parzen" and "QS" are supported as kernel functions. Uses the function sqrtm from package pracma.
Default values: B = 1000, kFun = "bartlett", l is the same as for subsampling.
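Drawing one bootstrap replicate can be sketched in base R as follows. The helper names `kernel_cov_root` and `dwb_replicate` are hypothetical. The matrix square root is computed here with `eigen` from base R, which is a substitution made for self-containedness and not the sqrtm call mentioned above. The sketch covers only the generation of \(x^*\), not the final variance estimate.

```r
bartlett <- function(u) pmax(1 - abs(u), 0)

# Hypothetical helper: symmetric square root of the matrix with entries k((i - j) / l)
kernel_cov_root <- function(n, l, k = bartlett) {
  K <- k(outer(seq_len(n), seq_len(n), "-") / l)
  e <- eigen(K, symmetric = TRUE)
  # clip tiny negative eigenvalues that can arise from rounding
  e$vectors %*% diag(sqrt(pmax(e$values, 0))) %*% t(e$vectors)
}

# Hypothetical helper: one dependent wild bootstrap replicate x*
dwb_replicate <- function(x, l, k = bartlett) {
  n <- length(x)
  a <- drop(kernel_cov_root(n, l, k) %*% rnorm(n))  # multipliers with Cov(a_i, a_j) = k((i - j) / l)
  mean(x) + (x - mean(x)) * a
}

set.seed(1)
x <- c(rnorm(20), rnorm(20, 2))
x_star <- dwb_replicate(x, l = 5)
head(x_star)
```

Repeating the draw \(B\) times and collecting the means of the replicates gives the sample from which the bootstrap variance is computed.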
Andrews, D.W. "Heteroskedasticity and autocorrelation consistent covariance matrix estimation." Econometrica: Journal of the Econometric Society (1991): 817-858.
Dehling, H., et al. "Change-point detection under dependence based on two-sample U-statistics." Asymptotic laws and methods in stochastics. Springer, New York, NY, (2015). 195-220.
Dehling, H., Fried, R., and Wendler, M. "A robust method for shift detection in time series." Biometrika 107.3 (2020): 647-660.
Parzen, E. "On consistent estimates of the spectrum of a stationary time series." The Annals of Mathematical Statistics (1957): 329-348.
Shao, X. "The dependent wild bootstrap." Journal of the American Statistical Association 105.489 (2010): 218-235.
CUSUM, HodgesLehmann, wilcox_stat
Z <- c(rnorm(20), rnorm(20, 2))
## kernel-based estimation
lrv(Z)
## non-overlapping subsampling on the transformed data with block length 5
lrv(Z, method = "subsampling", control = list(overlapping = FALSE, distr = TRUE, l = 5))
## dependent wild bootstrap estimation
lrv(Z, method = "bootstrap", control = list(l = 5, kFun = "parzen"))