rMRCov: Modulated realized covariance

Description

Calculate the univariate or multivariate pre-averaging estimator, as defined in Hautsch and Podolskij (2013).

Usage

rMRCov(
  pData,
  pairwise = FALSE,
  makePsd = FALSE,
  theta = 0.8,
  crossAssetNoiseCorrection = FALSE,
  ...
)

Value

A $d \times d$ covariance matrix.

Arguments

pData: a list. Each list-item contains an xts or data.table object with the intraday price data of a stock.
pairwise: boolean, should be TRUE when refresh times are based on pairs of assets. FALSE by default.
makePsd: boolean. If TRUE, the positive definite version of rMRCov is returned. FALSE by default.
theta: a numeric controlling the pre-averaging horizon. Defaults to 0.8 as recommended by Hautsch and Podolskij (2013).
crossAssetNoiseCorrection: a logical denoting whether to apply the bias correction term to the off-diagonal (covariance) terms. This is FALSE by default, as noise is typically seen as independent across assets.
...: used internally, do not change.

Author

Giang Nguyen, Jonathan Cornelissen, Kris Boudt, and Emil Sjoerup.

Details

In practice, market microstructure noise leads to a departure from the pure semimartingale model. We consider the process $Y$ in period $\tau$: $$ Y_{\tau} = X_{\tau} + \epsilon_{\tau}, $$ where the observed $d$-dimensional log-prices are the sum of an underlying Brownian semimartingale process $X$ and a noise term $\epsilon_{\tau}$.

$\epsilon_{\tau}$ is an i.i.d. process independent of $X$.

It is intuitive that under mean zero i.i.d. microstructure noise some form of smoothing of the observed log-price should tend to diminish the impact of the noise. Effectively, we are going to approximate a continuous function by an average of observations of $Y$ in a neighborhood, the noise being averaged away.
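
The smoothing intuition can be illustrated with a small simulation. This is only a sketch under stated assumptions: the random-walk price, the noise level, and the window length `k` are made up for illustration, and the equal-weight moving average stands in for the weighted pre-averaging that the estimator actually uses.

```r
set.seed(1)
n <- 10000

# Hypothetical efficient log-price: a driftless random walk (assumption)
x <- cumsum(rnorm(n, sd = 0.0005))
# Mean-zero i.i.d. microstructure noise, independent of x (assumption)
eps <- rnorm(n, sd = 0.001)
y <- x + eps

k <- 20  # illustrative window length, not the package's choice

# Equal-weight local averages of the observed and the efficient price
yBar <- stats::filter(y, rep(1 / k, k), sides = 2)
xBar <- stats::filter(x, rep(1 / k, k), sides = 2)

# Compare the noise left in the raw and in the smoothed series
keep <- !is.na(yBar)
noiseVarRaw <- var(y[keep] - x[keep])
noiseVarSmoothed <- var(yBar[keep] - xBar[keep])

c(raw = noiseVarRaw, smoothed = noiseVarSmoothed)
```

Averaging `k` independent noise terms divides their variance by roughly `k`, which is why the smoothed series carries far less noise than the raw one.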

Assume there are $N$ equispaced returns in period $\tau$ for each asset in the list (after refreshing the data). Let $r_{\tau_i}$ be a return (with $i=1, \ldots,N$) of an asset in period $\tau$. Assume there are $d$ assets.

In order to define the univariate pre-averaging estimator, we first define the pre-averaged returns as $$ \bar{r}_{\tau_j}^{(k)}= \sum_{h=1}^{k_N-1}g\left(\frac{h}{k_N}\right)r_{\tau_{j+h}}^{(k)} $$ where the superscript $(k)$ indexes the asset and $g$ is a non-zero real-valued function $g:[0,1] \rightarrow \mathbb{R}$ given by $g(x) = \min(x,1-x)$. $k_N$ is a sequence of integers satisfying $k_N = \lfloor\theta N^{1/2}\rfloor$. We use $\theta = 0.8$ as recommended in Hautsch and Podolskij (2013). The pre-averaged returns are simply a weighted average over the returns in a local window. This averaging diminishes the influence of the noise. The order of the window size $k_N$ is chosen to lead to optimal convergence rates.

The pre-averaging estimator is then simply the analogue of the realized variance, but based on pre-averaged returns and with an additional term to remove the bias due to noise: $$ \hat{C}= \frac{N^{-1/2}}{\theta \psi_2}\sum_{i=0}^{N-k_N+1} (\bar{r}_{\tau_i})^2-\frac{\psi_1^{k_N}N^{-1}}{2\theta^2\psi_2^{k_N}}\sum_{i=0}^{N}r_{\tau_i}^2 $$ with $$ \psi_1^{k_N}= k_N \sum_{j=1}^{k_N}\left(g\left(\frac{j+1}{k_N}\right)-g\left(\frac{j}{k_N}\right)\right)^2, $$ $$ \psi_2^{k_N}= \frac{1}{k_N}\sum_{j=1}^{k_N-1}g^2\left(\frac{j}{k_N}\right), $$ $$ \psi_2= \frac{1}{12}, $$ where $\psi_2 = \int_0^1 g^2(x)\,dx$ is the limit of $\psi_2^{k_N}$ as $k_N$ grows.

The multivariate counterpart is very similar. The estimator is called the Modulated Realized Covariance (MRC), computed by rMRCov, and is defined as $$ \mbox{MRC}= \frac{N}{N-k_N+2}\frac{1}{\psi_2k_N}\sum_{i=0}^{N-k_N+1}\bar{\boldsymbol{r}}_{\tau_i}\cdot \bar{\boldsymbol{r}}'_{\tau_i} -\frac{\psi_1^{k_N}}{\theta^2\psi_2^{k_N}}\hat{\Psi}_N $$ where $\hat{\Psi}_N = \frac{1}{2N}\sum_{i=1}^N \boldsymbol{r}_{\tau_i}(\boldsymbol{r}_{\tau_i})'$. The second term is a bias correction that makes the estimator consistent. However, because of this correction, the estimator is not guaranteed to be positive semidefinite (PSD).

An alternative is to slightly enlarge the bandwidth such that $k_N = \lfloor\theta N^{1/2+\delta}\rfloor$. Setting $\delta = 0.1$ yields a consistent and PSD estimate without the bias correction, in which case: $$ \mbox{MRC}^{\delta}= \frac{N}{N-k_N+2}\frac{1}{\psi_2k_N}\sum_{i=0}^{N-k_N+1}\bar{\boldsymbol{r}}_{\tau_i}\cdot \bar{\boldsymbol{r}}'_{\tau_i} $$
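
The univariate formulas above can be traced step by step in base R. The following is a minimal sketch, not the package's implementation: `preAvgVariance()` is a hypothetical helper written directly from the displayed equations, and the simulated price path, its integrated variance `iv`, and the noise level are assumptions chosen only for illustration.

```r
# Hypothetical helper: univariate pre-averaging estimator, transcribed
# from the formulas in this section (not the highfrequency internals).
preAvgVariance <- function(r, theta = 0.8) {
  N  <- length(r)
  kN <- floor(theta * sqrt(N))
  g  <- function(x) pmin(x, 1 - x)

  # Finite-sample constants psi_1^{k_N} and psi_2^{k_N}, as written above
  j     <- seq_len(kN)
  psi1k <- kN * sum((g((j + 1) / kN) - g(j / kN))^2)
  h     <- seq_len(kN - 1)
  psi2k <- sum(g(h / kN)^2) / kN
  psi2  <- 1 / 12

  # Pre-averaged returns for j = 0, ..., N - kN + 1
  w    <- g(h / kN)
  rBar <- vapply(0:(N - kN + 1), function(j0) sum(w * r[j0 + h]), numeric(1))

  # Pre-averaged realized variance minus the noise bias correction
  N^(-1 / 2) / (theta * psi2) * sum(rBar^2) -
    psi1k / (2 * theta^2 * psi2k * N) * sum(r^2)
}

# Simulated check (all values are assumptions for illustration)
set.seed(42)
N  <- 23400                                   # one return per second, 6.5 hours
iv <- 1e-4                                    # integrated variance of the efficient price
x  <- cumsum(rnorm(N + 1, sd = sqrt(iv / N))) # efficient log-price
y  <- x + rnorm(N + 1, sd = 1e-4)             # observed log-price with i.i.d. noise
r  <- diff(y)

rv  <- sum(r^2)           # plain realized variance, inflated by the noise
est <- preAvgVariance(r)  # pre-averaging estimate

c(iv = iv, realizedVariance = rv, preAveraged = est)
```

In this setup the plain realized variance is dominated by the noise, while the pre-averaged estimate should land much closer to `iv`. The sketch keeps the univariate form only; the multivariate estimator replaces the squared pre-averaged returns with outer products.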

References

Hautsch, N., and Podolskij, M. (2013). Preaveraging-based estimation of quadratic variation in the presence of noise and jumps: theory, implementation, and empirical evidence. Journal of Business & Economic Statistics, 31, 165-183.

Examples


if (FALSE) {
library("xts")
library("highfrequency")  # provides rMRCov() and sampleOneMinuteData
# Note that this ought to be tick-by-tick data; this example only shows the usage.
a <- list(as.xts(sampleOneMinuteData[as.Date(DT) == "2001-08-04", list(DT, MARKET)]),
          as.xts(sampleOneMinuteData[as.Date(DT) == "2001-08-04", list(DT, STOCK)]))
rMRCov(a, pairwise = TRUE, makePsd = TRUE)

# We can also use data.tables and a named list to convey asset names
a <- list(foo = sampleOneMinuteData[as.Date(DT) == "2001-08-04", list(DT, MARKET)],
          bar = sampleOneMinuteData[as.Date(DT) == "2001-08-04", list(DT, STOCK)])
rMRCov(a, pairwise = TRUE, makePsd = TRUE)
}
