Estimates spot volatility using a wide variety of estimators.
spotVol(
data,
method = "detPer",
alignBy = "minutes",
alignPeriod = 5,
marketOpen = "09:30:00",
marketClose = "16:00:00",
tz = "GMT",
...
)
A spotVol
object, which is a list containing one or more of the
following outputs, depending on the method used:
spot
An xts
or matrix
object (depending on the input) containing
spot volatility estimates \(\sigma_{t,i}\), reported for each interval
\(i\) between marketOpen
and marketClose
for every day
\(t\) in data
. The length of the intervals is specified by alignPeriod
and alignBy
. Methods that provide this output: All.
daily
An xts
or numeric
object (depending on the input) containing
estimates of the daily volatility levels for each day \(t\) in data
,
if the method used decomposes spot volatility into a daily and an intraday
component. Methods that provide this output: "detPer"
.
periodic
An xts
or numeric
object (depending on the input) containing
estimates of the intraday periodicity factor for each intraday interval \(i\)
between marketOpen
and marketClose
, if the spot volatility was
decomposed into a daily and an intraday component. If the output is in
xts
format, this periodicity factor will be dated to the first day of
the input data, but it is identical for each day in the sample. Methods that
provide this output: "detPer"
.
par
A named list containing parameter estimates, for methods that estimate one
or more parameters. Methods that provide this output:
"stochPer", "kernel"
.
cp
A vector containing the change points in the volatility, i.e. the observation
indices after which the volatility level changed, according to the applied
tests. The vector starts with a 0. Methods that provide this output:
"piecewise"
.
ugarchfit
A ugarchfit
object, as used by the rugarch
package, containing all output from fitting the GARCH model to the data.
Methods that provide this output: "garch"
.
The spotVol
function offers several methods to estimate spot
volatility and its intraday seasonality, using high-frequency data. It
returns an object of class spotVol
, which can contain various outputs,
depending on the method used. See `Details' for a description of each method.
In any case, the output will contain the spot volatility estimates.
The input can consist of price data or return data, either tick by tick or sampled at set intervals. The data will be converted to equispaced high-frequency returns \(r_{t,i}\) (read: the \(i\)-th return on day \(t\)).
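The conversion to equispaced returns can be pictured with a small base-R sketch. It assumes previous-tick sampling on a 5-minute grid. The vectors tick_secs and tick_price are made-up illustration data, and the aggregation performed inside spotVol may differ in its details.

```r
# Hypothetical irregularly spaced ticks: seconds after the open, and prices.
tick_secs  <- c(0, 40, 130, 310, 420, 650, 900)
tick_price <- c(100.0, 100.2, 100.1, 100.4, 100.3, 100.6, 100.5)

# Equispaced 5-minute grid (300 seconds) covering the ticks.
grid <- seq(0, 900, by = 300)

# Previous-tick sampling (an assumption of this sketch):
# take the last observed price at or before each grid point.
idx <- findInterval(grid, tick_secs)
grid_price <- tick_price[idx]

# Equispaced log returns r_{t,i} for this single day.
r <- diff(log(grid_price))
```

Four grid points give three 5-minute returns; with real data the same step would be repeated for every day between marketOpen and marketClose.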
data
Can be one of two input types, xts
or data.table
.
It is assumed that the input comprises prices in levels. Irregularly spaced
observations are allowed. They will be aggregated to the level specified by
parameters alignBy
and alignPeriod
.
method
specifies which method will be used to estimate the spot
volatility. Valid options are "detPer"
, "stochPer"
, "kernel"
, "piecewise"
, "garch"
, "RM"
, and "PARM"
. See `Details' below for an explanation of each method and the parameters it accepts.
alignBy
character, indicating the time scale in which alignPeriod
is expressed.
Possible values are: "ticks"
, "secs"
, "seconds"
, "mins"
, "minutes"
, and "hours"
.
alignPeriod
positive integer, indicating the number of periods to aggregate
over. For example, to aggregate an xts
object to the 5-minute frequency, set
alignPeriod = 5
and alignBy = "minutes"
.
marketOpen
the market opening time. This should be in the time zone
specified by tz
. By default, marketOpen = "09:30:00"
.
marketClose
the market closing time. This should be in the time zone
specified by tz
. By default, marketClose = "16:00:00"
.
tz
fallback time zone used in case we are unable to identify the timezone of the data, by default: tz = NULL
.
We attempt to extract the timezone from the DT column (or index) of the data, which may fail.
In case of failure we use tz
if specified, and if it is not specified, we use "UTC"
.
...
method-specific parameters (see `Details' below).
Jonathan Cornelissen, Kris Boudt, Onno Kleen, and Emil Sjoerup.
The following estimation methods can be specified in method
:
Deterministic periodicity method ("detPer"
)
Parameters:
dailyVol
A string specifying the estimation method for the daily component \(s_t\).
Possible values are "rBPCov", "rRVar", "rMedRVar"
. "rBPCov"
by default.
periodicVol
A string specifying the estimation method for the component of intraday volatility
that depends in a deterministic way on the intraday time at which the return is observed.
Possible values are "SD", "WSD", "TML", "OLS"
. See Boudt et al. (2011) for details. Default = "TML"
.
P1
A positive integer corresponding to the number of cosine terms used in the flexible Fourier
specification of the periodicity function, see Andersen and Bollerslev (1997) for details. Default = 5.
P2
Same as P1
, but for the sine terms. Default = 5.
dummies
Boolean: in case it is TRUE
, the parametric estimator of periodic standard deviation
specifies the periodicity function as the sum of dummy variables corresponding to each intraday period.
If it is FALSE
, the parametric estimator uses the flexible Fourier specification. Default is FALSE
.
Outputs (see `Value' for a full description of each component):
spot
daily
periodic
Let there be \(T\) days of \(N\) equally-spaced log-returns \(r_{i,t}\),
\(i = 1, \dots, N\) and \(t = 1, \dots, T\).
In case of method = "detPer"
, the returns are modeled as
$$
r_{i,t} = f_i s_t u_{i,t}
$$
with independent \(u_{i,t} \sim \mathcal{N}(0,1)\).
The spot volatility is decomposed into a deterministic periodic factor
\(f_{i}\) (identical for every day in the sample) and a daily factor
\(s_{t}\) (identical for all observations within a day).
Both components are then estimated separately, see Taylor and Xu (1997)
and Andersen and Bollerslev (1997). The jump robust versions by Boudt et al.
(2011) have also been implemented.
If periodicVol = "SD"
, we have
$$
\hat f_i^{SD} = \frac{SD_i}{\sqrt{\frac{1}{\lfloor{\lambda / \Delta}\rfloor} \sum_{j = 1}^N SD_j^2}}
$$
with \(\Delta = 1 / N\), cross-daily averages \(SD_i = \sqrt{1/T \sum_{t = 1}^T r_{i,t}^2}\),
and \(\lambda\) being the length of the intraday time intervals.
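As a rough illustration of the "SD" estimator, the base-R sketch below simulates returns from the model \(r_{i,t} = f_i s_t u_{i,t}\) and recovers the periodic factor from the cross-daily averages \(SD_i\). It assumes the normalisation is chosen so that the squared factors average to one over the \(N\) intraday intervals. All names (f_true, s_true, f_hat) are illustrative and are not part of the package.

```r
set.seed(1)
n_days <- 500   # T
n_int  <- 12    # N intraday intervals per day

# Hypothetical periodic factor with mean(f_true^2) equal to 1.
f_true <- sqrt(1 + 0.5 * cos(2 * pi * (1:n_int) / n_int))
# Hypothetical daily factor s_t.
s_true <- runif(n_days, 0.5, 1.5)

# r[i, t] = f_i * s_t * u_{i,t} with standard normal u.
r <- outer(f_true, s_true) * matrix(rnorm(n_int * n_days), n_int, n_days)

# Cross-daily averages SD_i = sqrt(1/T * sum_t r_{i,t}^2).
sd_i <- sqrt(rowMeans(r^2))

# Normalise so that the squared factors average to one (assumption of this sketch).
f_hat <- sd_i / sqrt(mean(sd_i^2))
```

Because the daily factor scales every interval alike, it cancels in the normalisation, and f_hat follows the shape of f_true up to sampling noise.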
If periodicVol = "WSD"
, we have another nonparametric estimator that is robust to jumps in contrast to
periodicVol = "SD"
. The definition of this estimator can be found in Boudt et al. (2011, Eqs. 2.9-2.12).
The estimates when periodicVol = "OLS"
and periodicVol = "TML"
are based on the regression equation
$$
\log \left| 1/T \sum_{t = 1}^T r_{i,t} \right| - c = \log f_i + \varepsilon_i
$$
with i.i.d. zero-mean error term \(\varepsilon_i\) and \(c = -0.63518\).
periodicVol = "OLS"
employs ordinary-least-squares estimation and
periodicVol = "TML"
truncated maximum-likelihood estimation (see Boudt et al., 2011, Section 2.2, for further details).
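The value of \(c\) in the regression equation coincides numerically with the mean of \(\log |u|\) for a standard normal \(u\), which would explain why it is subtracted before regressing on \(\log f_i\). The following check is a sketch based on that reading, not a statement about how the package computes the constant.

```r
# E[log|u|] for u ~ N(0,1): u^2 is chi-squared with one degree of freedom,
# and E[log(u^2)] = digamma(1/2) + log(2), so E[log|u|] is half of that.
c_closed <- (digamma(0.5) + log(2)) / 2

# Monte Carlo cross-check (seed chosen arbitrarily for reproducibility).
set.seed(123)
c_mc <- mean(log(abs(rnorm(1e6))))

round(c_closed, 5)  # -0.63518
```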
Stochastic periodicity method ("stochPer"
)
Parameters:
P1
: A positive integer corresponding to the number of cosine terms used in the flexible Fourier
specification of the periodicity function. Default = 5.
P2
: Same as P1
, but for the sine terms. Default = 5.
init
: A named list of initial values to be used in the optimization routine ("BFGS"
in optim
).
Default = list(sigma = 0.03, sigma_mu = 0.005, sigma_h = 0.005, sigma_k = 0.05,
phi = 0.2, rho = 0.98, mu = c(2, -0.5), delta_c = rep(0, max(1,P1)),
delta_s = rep(0, max(1,P2)))
.
The naming of the parameters follows Beltratti and Morana (2001); the corresponding model equations are listed below.
init
can contain any number of these parameters.
For parameters not specified in init
, the default initial value will be used.
control
: A list of options to be passed down to optim
.
Outputs (see `Value' for a full description of each component):
spot
par
This method by Beltratti and Morana (2001) assumes the periodicity factor to
be stochastic. The spot volatility estimation is split into four components:
a random walk, an autoregressive process, a stochastic cyclical process and
a deterministic cyclical process. The model is estimated using a
quasi-maximum likelihood method based on the Kalman Filter. The package
FKF
is used to apply the Kalman filter. In addition to
the spot volatility estimates, all parameter estimates are returned.
The model for the intraday change in the return series is given by
$$
r_{t,n} = \sigma_{t,n} \varepsilon_{t,n}, \ t = 1, \dots, T; \ n = 1, \dots, N,
$$
where \(\sigma_{t,n}\) is the conditional standard deviation of the \(n\)-th interval of day \(t\) and \(\varepsilon_{t,n}\) is an i.i.d. mean-zero unit-variance process.
The conditional standard deviations are modeled as
$$
\sigma_{t,n} = \sigma \exp \left(\frac{\mu_{t,n} + h_{t,n} + c_{t,n}}{2} \right)
$$
with \(\sigma\) being a scaling factor. \(\mu_{t,n}\) is the non-stationary volatility component
$$
\mu_{t,n} = \mu_{t,n-1} + \xi_{t,n}
$$
with independent \(\xi_{t,n} \sim \mathcal{N}(0,\sigma_\xi^2)\). \(h_{t,n}\) is the stochastic stationary acyclical volatility component
$$
h_{t,n} = \phi h_{t,n-1} + \eta_{t,n}
$$
with independent \(\eta_{t,n} \sim \mathcal{N}(0,\sigma_\eta^2)\) and \(| \phi | \leq 1\).
The cyclical component is separated into two components:
$$
c_{t,n} = c_{1,t,n} + c_{2,t,n}
$$
The first component is written in state-space form,
$$
\left( \begin{array}{r} c_{1,t,n} \\ c_{1,t,n}^* \end{array}\right) = \rho \left(\begin{array}{rr} \cos \lambda & \sin \lambda \\ -\sin \lambda & \cos \lambda \end{array}\right) \left(\begin{array}{r} c_{1,t,n - 1} \\ c_{1,t,n-1}^* \end{array}\right) + \left(\begin{array}{r} \kappa_{1,t,n} \\ \kappa_{1,t,n}^* \end{array}\right)
$$
with \(0 \leq \rho \leq 1\), where \(\kappa_{1,t,n}\) and \(\kappa_{1,t,n}^*\) are mutually independent zero-mean normal random variables with variance \(\sigma_\kappa^2\). All other parameters and the process \(c_{1,t,n}^*\) in the state-space representation are only of instrumental use and are not part of the return value, which is why we won't introduce them in detail in this vignette; see Beltratti and Morana (2001, pp. 208-209) for more information.
The second component is given by
$$
c_{2,t,n} = \mu_1 n_1 + \mu_2 n_2 + \sum_{p = 2}^P (\delta_{cp} \cos(p \lambda n) + \delta_{sp} \sin (p \lambda n))
$$
with \(n_1 = 2n / (N+1)\) and \(n_2 = 6n^2 / (N+1) / (N+2)\).
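To make the roles of the random-walk and autoregressive components more concrete, the base-R sketch below simulates \(\mu_{t,n}\) and \(h_{t,n}\) and turns them into conditional standard deviations and returns. It is an illustration only: the cyclical component is set to zero for brevity, the parameter values are borrowed from the default starting values listed under init, and no Kalman filtering or estimation is performed.

```r
set.seed(42)
n_obs <- 780   # hypothetical number of intraday intervals in the sample

# Illustrative parameter values (taken from the default starting values).
sigma    <- 0.03
sigma_mu <- 0.005
sigma_h  <- 0.005
phi      <- 0.2

# Non-stationary component: random walk mu_n = mu_{n-1} + xi_n.
mu <- cumsum(rnorm(n_obs, sd = sigma_mu))

# Stationary acyclical component: AR(1) h_n = phi * h_{n-1} + eta_n.
h <- as.numeric(stats::filter(rnorm(n_obs, sd = sigma_h),
                              filter = phi, method = "recursive"))

# Cyclical component c_n omitted (set to zero) in this sketch.
c_comp <- rep(0, n_obs)

# Conditional standard deviation and simulated returns.
sigma_tn <- sigma * exp((mu + h + c_comp) / 2)
r <- sigma_tn * rnorm(n_obs)
```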
Nonparametric filtering ("kernel"
)
Parameters:
type
String specifying the type of kernel to be used. Options
include "gaussian", "epanechnikov", "beta"
. Default = "gaussian"
.
h
Scalar or vector specifying bandwidth(s) to be used in kernel.
If h
is a scalar, it will be assumed equal throughout the sample. If
it is a vector, it should contain bandwidths for each day. If left empty,
it will be estimated. Default = NULL
.
est
String specifying the bandwidth estimation method. Possible
values include "cv", "quarticity"
. Method "cv"
equals
cross-validation, which chooses the bandwidth that minimizes the Integrated
Square Error. "quarticity"
multiplies the simple plug-in estimator
by a factor based on the daily quarticity of the returns. est
is
ignored if h
has already been specified by the user.
"cv"
by default.
lower
Lower bound to be used in bandwidth optimization routine,
when using cross-validation method. Default is \(0.1n^{-0.2}\).
upper
Upper bound to be used in bandwidth optimization routine,
when using cross-validation method. Default is \(n^{-0.2}\).
Outputs (see `Value' for a full description of each component):
spot
par
This method by Kristensen (2010) filters the spot volatility in a nonparametric way by applying kernel weights to the standard realized volatility estimator. Different kernels and bandwidths can be used to focus on specific characteristics of the volatility process.
Estimation results heavily depend on the bandwidth parameter \(h\), so it is important that this parameter is well chosen. However, it is difficult to come up with a method that determines the optimal bandwidth for any kind of data or kernel that can be used. Although some estimation methods are provided, it is advised that you specify \(h\) yourself, or make sure that the estimation results are appropriate.
One way to estimate \(h\) is by using cross-validation. For each day in
the sample, \(h\) is chosen so as to minimize the Integrated Square Error,
which is a function of \(h\). However, this function often has multiple
local minima, or no minima at all (\(h \rightarrow \infty\)). To ensure a reasonable
optimum is reached, strict boundaries have to be imposed on \(h\). These
can be specified by lower
and upper
, which by default are
\(0.1n^{-0.2}\) and \(n^{-0.2}\) respectively, where \(n\) is the
number of observations in a day.
When using the method "kernel"
, in addition to the spot volatility
estimates, all used values of the bandwidth \(h\) are returned.
A formal definition of the estimator is too extensive for the context of this vignette. Please refer to Kristensen (2010) for more detailed information. Our parameter names are aligned with this reference.
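The general idea of kernel weighting can still be sketched in a few lines of base R. The example assumes a Gaussian kernel, time rescaled to the unit interval, and an estimator of the form \(\sum_i K_h(t_i - \tau) r_i^2\); it ignores boundary corrections and the bandwidth selection discussed above. The function kernel_spot_var is a hypothetical helper, not part of the package.

```r
set.seed(7)
n <- 10000
times <- (1:n) / n                 # observation times on [0, 1]
sigma_true <- 0.2                  # constant volatility used for the simulation
r <- rnorm(n, sd = sigma_true * sqrt(1 / n))

# Kernel-weighted sum of squared returns around time tau (Gaussian kernel).
kernel_spot_var <- function(r, times, tau, h) {
  w <- dnorm((times - tau) / h) / h
  sum(w * r^2)
}

# A bandwidth inside the default search interval [0.1 n^(-0.2), n^(-0.2)].
h <- 0.5 * n^(-0.2)
spot_vol_mid <- sqrt(kernel_spot_var(r, times, tau = 0.5, h = h))
```

With a constant true volatility the estimate at the middle of the day should be close to sigma_true; a smaller h makes the estimate noisier, a larger h smooths over more of the day.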
Piecewise constant volatility ("piecewise"
)
Parameters:
type
string specifying the type of test to be used. Options
include "MDa", "MDb", "DM"
. See Fried (2012) for details. Default = "MDa"
.
m
number of observations to include in reference window.
Default = 40
.
n
number of observations to include in test window.
Default = 20
.
alpha
significance level to be used in tests. Note that the test
will be executed many times (roughly equal to the total number of
observations), so it is advised to use a small value for alpha
, to
avoid a lot of false positives. Default = 0.005
.
volEst
string specifying the realized volatility estimator to be
used in local windows. Possible values are "rBPCov", "rRVar", "rMedRVar"
.
Default = "rBPCov"
.
online
boolean indicating whether estimations at a certain point
\(t\) should be done online (using only information available at
\(t-1\)), or ex post (using all observations between two change points).
Default = TRUE
.
Outputs (see `Value' for a full description of each component):
spot
cp
This nonparametric method by Fried (2012) is a two-step approach and
assumes the volatility to be
piecewise constant over local windows. Robust two-sample tests are applied to
detect changes in variability between subsequent windows. The spot volatility
can then be estimated by evaluating regular realized volatility estimators
within each local window.
"MDa", "MDb"
refer to different test statistics, see Section 2.2 in Fried (2012).
Along with the spot volatility estimates, this method will return the
detected change points in the volatility level. When plotting a
spotVol
object containing cp
, these change points will be
visualized.
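Once change points are known, the ex post step amounts to estimating volatility separately on each segment. The base-R sketch below assumes a cp vector in the format described under `Value' (observation indices after which the level changes, starting with 0) and uses the root mean squared return as a stand-in for the realized measures offered by volEst. The change-point tests themselves are not reproduced.

```r
set.seed(11)
# Hypothetical returns with a volatility shift after observation 100.
r  <- c(rnorm(100, sd = 0.01), rnorm(100, sd = 0.03))
cp <- c(0, 100)   # assumed known here; normally detected by the tests

# Segment label for every observation: observations cp[k] + 1, ..., cp[k + 1].
seg <- findInterval(seq_along(r), cp + 1)

# Piecewise constant volatility: root mean squared return per segment.
spot <- sqrt(ave(r^2, seg))
```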
GARCH models with intraday seasonality ("garch"
)
Parameters:
model
string specifying the type of GARCH model to be fitted. Options
include "sGARCH", "eGARCH"
. See ugarchspec
in the
rugarch
package. Default = "eGARCH"
.
garchorder
numeric value of length 2, containing the order of
the GARCH model to be estimated. Default = c(1,1)
.
dist
string specifying the distribution to be assumed on the
innovations. See distribution.model
in ugarchspec
for
possible options. Default = "norm"
.
solver.control
list containing solver options.
See ugarchfit
for possible values. Default = list()
.
P1
a positive integer corresponding to the number of cosine
terms used in the flexible Fourier specification of the periodicity function.
Default = 5.
P2
same as P1
, but for the sine terms. Default = 5.
Outputs (see `Value' for a full description of each component):
spot
ugarchfit
Along with the spot volatility estimates, this method will return the
ugarchfit
object used by the rugarch
package.
In this model, daily returns \(r_t\) based on intraday observations \(r_{i,t}, i = 1, \dots, N\) are modeled as
$$
r_t = \sum_{i = 1}^N r_{i,t} = \sigma_t \frac{1}{\sqrt{N}} \sum_{i = 1}^N s_i Z_{i,t}
$$
with \(\sigma_t > 0\), intraday seasonality \(s_i > 0\), and \(Z_{i,t}\) being a zero-mean unit-variance error term.
The overall approach is as in Appendix B of Andersen and Bollerslev (1997).
This method generates the external regressors \(s_i\) needed to model the intraday
seasonality with a flexible Fourier form (Andersen and Bollerslev, 1997, Eqs. A.1-A.4).
The rugarch
package is then employed to estimate the specified intraday GARCH(1,1) model
on the residuals \(r_{i,t} / s_i\).
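The trigonometric part of such regressors can be sketched as follows. The helper fourier_terms is hypothetical, and the sketch assumes a period equal to the number of intraday intervals; the full flexible Fourier form in Andersen and Bollerslev (1997) contains further terms that are left out here.

```r
# Build P1 cosine and P2 sine regressors for n_int intraday intervals.
fourier_terms <- function(n_int, P1 = 5, P2 = 5) {
  n <- seq_len(n_int)
  cos_part <- sapply(seq_len(P1), function(p) cos(2 * pi * p * n / n_int))
  sin_part <- sapply(seq_len(P2), function(p) sin(2 * pi * p * n / n_int))
  colnames(cos_part) <- paste0("cos", seq_len(P1))
  colnames(sin_part) <- paste0("sin", seq_len(P2))
  cbind(cos_part, sin_part)
}

# Example: 78 five-minute intervals in a 6.5-hour trading day.
X <- fourier_terms(78)
dim(X)  # 78 rows, 10 columns
```

Each column repeats identically on every day, which is what makes the resulting seasonality deterministic.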
Realized Measures ("RM"
)
This estimator takes trailing rolling window observations of intraday returns to estimate the spot volatility.
Parameters:
RM
string denoting which realized measure to use to estimate the local volatility.
Possible values are: "rBPCov", "rMedRVar", "rMinRVar", "rCov", "rRVar"
.
Default = "rBPCov"
.
lookBackPeriod
positive integer denoting the number of sub-sampled returns to use
for the estimation of the local volatility. Default is 10
.
dontIncludeLast
logical indicating whether to omit the last return in the calculation of the local volatility.
This is done in Lee-Mykland (2008) to produce jump-robust estimates of spot volatility.
Setting this to TRUE
will then use lookBackPeriod - 1
returns in the construction of the realized measures. Default = FALSE
.
Outputs (see `Value' for a full description of each component):
spot
RM
lookBackPeriod
This method returns the estimates of the spot volatility, a string containing the realized measure used, and the lookBackPeriod.
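A trailing-window estimate of this kind can be sketched in base R as below. The helper rolling_rv is hypothetical; it uses the root mean squared return over the last lookBackPeriod observations and reports volatility on a per-interval scale, whereas the scaling and the realized measures used by the package may differ.

```r
# Trailing-window volatility: root mean squared return over the last
# lookBackPeriod observations (NA until a full window is available).
rolling_rv <- function(r, lookBackPeriod = 10) {
  n <- length(r)
  out <- rep(NA_real_, n)
  if (n < lookBackPeriod) return(out)
  for (i in lookBackPeriod:n) {
    window <- r[(i - lookBackPeriod + 1):i]
    out[i] <- sqrt(mean(window^2))
  }
  out
}

# Constant absolute returns give a constant estimate once the window is full.
spot_rm <- rolling_rv(rep(0.01, 20), lookBackPeriod = 10)
```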
(Non-overlapping) Pre-Averaged Realized Measures ("PARM"
)
This estimator takes rolling historical window observations of intraday returns to estimate the spot volatility
as in the option "RM"
but adds return pre-averaging of the realized measures.
For a description of return pre-averaging see the details on spotDrift.
Parameters:
RM
String denoting which realized measure to use to estimate the local volatility.
Possible values are: "rBPCov", "rMedRVar", "rMinRVar", "rCov", and "rRVar"
. Default = "rBPCov"
.
lookBackPeriod
positive integer denoting the number of sub-sampled returns to use for the estimation of the local volatility. Default = 50.
Outputs (see `Value' for a full description of each component):
spot
RM
lookBackPeriod
kn
Andersen, T. G. and Bollerslev, T. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4, 115-158.
Beltratti, A. and Morana, C. (2001). Deterministic and stochastic methods for estimation of intraday seasonal components with high frequency data. Economic Notes, 30, 205-234.
Boudt, K., Croux, C., and Laurent, S. (2011). Robust estimation of intraweek periodicity in volatility and jump detection. Journal of Empirical Finance, 18, 353-367.
Fried, R. (2012). On the online estimation of local constant volatilities. Computational Statistics and Data Analysis, 56, 3080-3090.
Kristensen, D. (2010). Nonparametric filtering of the realized spot volatility: A kernel-based approach. Econometric Theory, 26, 60-93.
Taylor, S. J. and Xu, X. (1997). The incremental volatility information in one million foreign exchange quotations. Journal of Empirical Finance, 4, 317-340.
# Stochastic periodicity method with user-supplied starting values
if (FALSE) {
  init <- list(sigma = 0.03, sigma_mu = 0.005, sigma_h = 0.007,
               sigma_k = 0.06, phi = 0.194, rho = 0.986, mu = c(1.87, -0.42),
               delta_c = c(0.25, -0.05, -0.2, 0.13, 0.02),
               delta_s = c(-1.2, 0.11, 0.26, -0.03, 0.08))
  # Next method will take around 370 iterations
  vol1 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "stochPer", init = init)
  plot(vol1$spot[1:780])
  legend("topright", c("stochPer"), col = c("black"), lty = 1)
}
# Various kernel estimates
if (FALSE) {
  # Simple plug-in bandwidth computed on the observation times (in seconds)
  h1 <- bw.nrd0((1:nrow(sampleOneMinuteData[, list(DT, PRICE = MARKET)])) * 60)
  vol2 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "kernel", h = h1)
  vol3 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "kernel", est = "quarticity")
  vol4 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "kernel", est = "cv")
  plot(cbind(vol2$spot, vol3$spot, vol4$spot))
  xts::addLegend("topright", c("h = simple estimate", "h = quarticity corrected",
                               "h = crossvalidated"), col = 1:3, lty = 1)
}
# Piecewise constant volatility
if (FALSE) {
  vol5 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "piecewise", m = 200, n = 100, online = FALSE)
  # The plot also marks the detected change points
  plot(vol5)
}
# Compare regular GARCH(1,1) model to eGARCH, both with external regressors
if (FALSE) {
  vol6 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "garch", model = "sGARCH")
  vol7 <- spotVol(sampleOneMinuteData[, list(DT, PRICE = MARKET)],
                  method = "garch", model = "eGARCH")
  plot(as.numeric(t(vol6$spot)), type = "l")
  lines(as.numeric(t(vol7$spot)), col = "red")
  legend("topleft", c("GARCH", "eGARCH"), col = c("black", "red"), lty = 1)
}
# Compare realized measure spot vol estimation to pre-averaged version
if (FALSE) {
  vol8 <- spotVol(sampleTDataEurope[, list(DT, PRICE)], method = "RM",
                  marketOpen = "09:00:00", marketClose = "17:30:00", tz = "UTC",
                  alignPeriod = 1, alignBy = "mins", lookBackPeriod = 10)
  vol9 <- spotVol(sampleTDataEurope[, list(DT, PRICE)], method = "PARM",
                  marketOpen = "09:00:00", marketClose = "17:30:00", tz = "UTC",
                  lookBackPeriod = 10)
  plot(zoo::na.locf(cbind(vol8$spot, vol9$spot)))
}