leadLag: Lead-Lag estimation

Description

Function that estimates whether one series leads (or lags) another.

Let $X_{t}$ and $Y_{t}$ be two observed price over the time interval $[0,1]$.

For every integer $k \in \cal{Z}$, we form the shifted time series

$$ Y_{\left(k+i\right)/n}, \quad i = 1, 2, \dots $$ $H=\left(\underline{H},\overline{H}\right]$ is an interval for $\vartheta\in\Theta$, define the shift interval $H_{\vartheta}=H+\vartheta=\left(\underline{H}+\vartheta,\overline{H}+\vartheta\right]$ then let

$$ X\left(H\right)_{t}=\int_{0}^{t}1_{H}\left(s\right)\textrm{d}X_{s} $$

Which will be abbreviated: $$ X\left(H\right)=X\left(H\right)_{T+\delta}=\int_{0}^{T+\delta}1_{H}\left(s\right)\textrm{d}X_{s} $$

Then the shifted HY contrast function is: $$ \tilde{\vartheta}\rightarrow U^{n}\left(\tilde{\vartheta}\right)= \\ 1_{\tilde{\vartheta}\geq0}\sum_{I\in{\cal{I}},J\in{\cal{J}},\overline{I}\leq T}X\left(I\right)Y\left(J\right)1_{\left\{ I\cap J_{-\tilde{\vartheta}}\neq\emptyset\right\}} \\ +1_{\tilde{\vartheta}<0}\sum_{I\in{\cal{I}},J\in{\cal{J}},\overline{J}\leq T}X\left(I\right)Y\left(Y\right)1_{\left\{ J\cap I_{\tilde{\vartheta}}\neq\emptyset\right\} } $$ This contrast function is then calculated for all the lags passed in the argument lags

Usage

leadLag(
  price1 = NULL,
  price2 = NULL,
  lags = NULL,
  resolution = "seconds",
  normalize = TRUE,
  parallelize = FALSE,
  nCores = NA
)

Value

A list with class leadLag which contains contrasts, lead-lag-ratio, and lags, denoting the estimated values for each lag calculated, the lead-lag-ratio, and the tested lags respectively.

Arguments

price1: xts or data.table containing prices in levels, in case of data.table, use a column DT to denote the date-time in POSIXct format, and a column PRICE to denote the price
price2: xts or data.table containing prices in levels, in case of data.table, use a column DT to denote the date-time in POSIXct format, and a column PRICE to denote the price
lags: a numeric denoting which lags (in units of resolution) should be tested as leading or lagging
resolution: the resolution at which the lags is measured. The default is "seconds", use this argument to gain 1000 times resolution by setting it to either "ms", "milliseconds", or "milli-seconds".
normalize: logical denoting whether the contrasts should be normalized by the product of the L2 norms of both the prices. Default = TRUE. This does not change the value of the lead-lag-ratio.
parallelize: logical denoting whether to use a parallelized version of the C++ code (parallelized using OPENMP). Default = FALSE
nCores: integer valued numeric denoting how many cores to use for the lead-lag estimation procedure in case parallelize is TRUE. Default is NA, which does not parallelize the code.

Details

The lead-lag-ratio (LLR) can be used to see if one asset leads the other. If LLR < 1, then price1 MAY be leading price2 and vice versa if LLR > 1.

References

Hoffmann, M., Rosenbaum, M., and Yoshida, N. (2013). Estimation of the lead-lag parameter from non-synchronous data. Bernoulli, 19, 1-37.

Examples

Run this code

if (FALSE) {
# Toy example to show the usage
# Spread prices
spread <- spreadPrices(sampleMultiTradeData[SYMBOL %in% c("ETF", "AAA")])
# Use lead-lag estimator
llEmpirical <- leadLag(spread[!is.na(AAA), list(DT, PRICE = AAA)], 
                       spread[!is.na(ETF), list(DT, PRICE = ETF)], seq(-15,15))
plot(llEmpirical)
}

Run the code above in your browser using DataLab