Computes moments over a sliding time window, then adjusts the data accordingly: centering, scaling, z-scoring, and so on.
t_running_centered(
v,
time = NULL,
time_deltas = NULL,
window = NULL,
wts = NULL,
na_rm = FALSE,
min_df = 0L,
used_df = 1,
lookahead = 0,
restart_period = 100L,
variable_win = FALSE,
wts_as_delta = TRUE,
check_wts = FALSE,
normalize_wts = TRUE,
check_negative_moments = TRUE
)
t_running_scaled(
v,
time = NULL,
time_deltas = NULL,
window = NULL,
wts = NULL,
na_rm = FALSE,
min_df = 0L,
used_df = 1,
lookahead = 0,
restart_period = 100L,
variable_win = FALSE,
wts_as_delta = TRUE,
check_wts = FALSE,
normalize_wts = TRUE,
check_negative_moments = TRUE
)
t_running_zscored(
v,
time = NULL,
time_deltas = NULL,
window = NULL,
wts = NULL,
na_rm = FALSE,
min_df = 0L,
used_df = 1,
lookahead = 0,
restart_period = 100L,
variable_win = FALSE,
wts_as_delta = TRUE,
check_wts = FALSE,
normalize_wts = TRUE,
check_negative_moments = TRUE
)
t_running_sharpe(
v,
time = NULL,
time_deltas = NULL,
window = NULL,
wts = NULL,
lb_time = NULL,
na_rm = FALSE,
compute_se = FALSE,
min_df = 0L,
used_df = 1,
restart_period = 100L,
variable_win = FALSE,
wts_as_delta = TRUE,
check_wts = FALSE,
normalize_wts = TRUE,
check_negative_moments = TRUE
)
t_running_tstat(
v,
time = NULL,
time_deltas = NULL,
window = NULL,
wts = NULL,
lb_time = NULL,
na_rm = FALSE,
compute_se = FALSE,
min_df = 0L,
used_df = 1,
restart_period = 100L,
variable_win = FALSE,
wts_as_delta = TRUE,
check_wts = FALSE,
normalize_wts = TRUE,
check_negative_moments = TRUE
)
A vector the same size as the input, consisting of the adjusted version of the input. When there are not sufficient (non-NA) elements for the computation, NaN is returned.
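For instance, a minimal sketch of the return convention (assuming the fromo package, where these functions are defined; min_df here forces NaN during burn-in):

set.seed(1234)
x <- rnorm(10)
tm <- cumsum(runif(10, min = 0.5, max = 1.5))   # non-decreasing timestamps
z <- t_running_zscored(x, time = tm, window = 5, min_df = 2L)
length(z) == length(x)   # TRUE: output is the same size as the input
z[1]                     # NaN: too few observations in the first window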
v: a vector of data.

time: an optional vector of the timestamps of v. If given, must be the same length as v. If not given, we try to infer it by summing the time_deltas.

time_deltas: an optional vector of the deltas of timestamps. If given, must be the same length as v. If not given, and wts are given and wts_as_delta is true, we take the wts as the time deltas. The deltas must be positive. We sum them to arrive at the times.

window: the window size, in time units. If given as a finite integer or double, it is passed through. If NULL, NA_integer_, NA_real_ or Inf is given and variable_win is true, then we infer the window from the lookback times: the first window is infinite, but the remaining are the deltas between successive lookback times. If variable_win is false, these undefined values are equivalent to an infinite window. If negative, an error is thrown.

wts: an optional vector of weights. Weights are ‘replication’ weights, meaning a value of 2 is shorthand for having two observations with the corresponding v value. If NULL, corresponds to equal unit weights, the default. Note that weights are typically only meaningfully defined up to a multiplicative constant, meaning the units of weights are immaterial, with the exception that methods which check for a minimum df will, in the weighted case, check against the sum of weights. For this reason, weights less than 1 could cause NA to be returned unexpectedly due to the minimum condition. When weights are NA, the same rules for checking v are applied. That is, the observation will not contribute to the moment if the weight is NA when na_rm is true. When there is no checking, an NA weight will cause the output to be NA.

na_rm: whether to remove NA; false by default.

min_df: the minimum df to return a value; otherwise NaN is returned. This can be used to prevent, e.g., Z-scores from being computed on only 3 observations. Defaults to zero, meaning no restriction, which can result in infinite Z-scores during the burn-in period.

used_df: the number of degrees of freedom consumed, used in the denominator of the centered moments computation. These are subtracted from the number of observations.

lookahead: for some of the operations, the value is compared to the mean and standard deviation possibly using ‘future’ or ‘past’ information by means of a non-zero lookahead. Positive values mean data are taken from the future. This is in time units, and so should be a real.

restart_period: the recompute period. Because subtraction of elements can cause loss of precision, the computation of moments is restarted periodically based on this parameter. Larger values mean fewer restarts and faster, though less accurate, results.

variable_win: if true, and the window is not a concrete number, the computation window becomes the time between lookback times.

wts_as_delta: if true, and time and time_deltas are not given but wts are, we take the wts as the time_deltas.

check_wts: a boolean for whether the code shall check for negative weights, throwing an error when they are found. Defaults to false for speed.

normalize_wts: a boolean for whether the weights should be renormalized to have a mean value of 1. This mean is computed over elements which contribute to the moments, so if na_rm is set, that means non-NA elements of wts that correspond to non-NA elements of the data vector.

check_negative_moments: a boolean flag. Normal computation of running moments can result in negative estimates of even order moments due to loss of numerical precision. With this flag active, the computation checks for negative even order moments and restarts the computation when one is detected. This should eliminate the possibility of negative even order moments. The downside is the speed hit of checking on every output step. Note also that the code checks for negative moments of every even order tracked, even if they are not output; that is, if the kurtosis, say, is being computed, and a negative variance is detected, then the computation is restarted. Defaults to TRUE to avoid negative even moments. Set to FALSE only if you know what you are doing.

lb_time: a vector of the times from which lookback will be performed. The output should be the same size as this vector. If not given, defaults to time.

compute_se: for t_running_sharpe, return an extra column of the standard error, as computed by Mertens' correction.
These functions support time (or other counter) based running computation. Here the inputs are the data \(x_i\), an optional vector of weights, \(w_i\), defaulting to 1, and a vector of time indices, \(t_i\), of the same length as \(x\). The times must be non-decreasing: $$t_1 \le t_2 \le \ldots$$ It is assumed that \(t_0 = -\infty\). The window, \(W\), is now a time-based window. An optional set of lookback times, \(b_j\), may also be given; these may have a different length than \(x\) and \(w\). The output corresponds to the lookback times and will be the same length. The \(j\)th output is computed over indices \(i\) such that $$b_j - W < t_i \le b_j.$$
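For instance, a hedged sketch of the lookback-time convention (unit-spaced times, a 10-unit window):

set.seed(567)
x <- rnorm(100)
tm <- seq_along(x)           # unit-spaced times t_i
lb <- c(25, 50, 75, 100)     # lookback times b_j
sr <- t_running_sharpe(x, time = tm, window = 10, lb_time = lb)
NROW(sr) == length(lb)       # TRUE: one output per lookback time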
For comparison functions (like Z-score, rescaling, centering), which compare values of \(x_i\) to local moments, the lookbacks may not be given, but a lookahead \(L\) is admitted. In this case, the \(j\)th output is computed over indices \(i\) such that $$t_j - W + L < t_i \le t_j + L.$$
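A short sketch of the lookahead: with \(L = 2\) time units, each \(x_i\) is compared to moments computed over \((t_i - W + 2, t_i + 2]\), which draws on 'future' observations:

set.seed(99)
x <- rnorm(30)
tm <- seq_along(x)
# 5-unit window shifted 2 time units into the future
fwd <- t_running_centered(x, time = tm, window = 5, lookahead = 2)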
If the times are not given, ‘deltas’ may be given instead. If \(\delta_i\) are the deltas, then we compute the times as $$t_i = \sum_{1 \le j \le i} \delta_j.$$ The deltas must be the same length as \(x\). If times and deltas are not given, but weights are given and the ‘weights as deltas’ flag is set true, then the weights are used as the deltas.
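Explicit times and the equivalent deltas should yield the same result, as in this sketch:

x <- rnorm(8)
deltas <- rep(1, 8)
a <- t_running_centered(x, time = cumsum(deltas), window = 4)
b <- t_running_centered(x, time_deltas = deltas, window = 4)
all.equal(a, b)   # TRUE: times are recovered as the cumulative sum of the deltas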
Sometimes it makes sense for the computational window to be the space between lookback times. That is, the \(j\)th output is computed over indices \(i\) such that $$b_{j-1} < t_i \le b_j.$$ This can be achieved by setting the ‘variable window’ flag true and setting the window to null. This will not make much sense if the lookback times equal the times, since each moment computation is then over a single index, and most moments are underdefined.
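A sketch of the variable window, with the window left NULL so that the \(j\)th output is computed over \((b_{j-1}, b_j]\):

set.seed(89)
x <- rnorm(50)
tm <- seq_along(x)
lb <- seq(10, 50, by = 10)   # spaced-out lookback times
sr <- t_running_sharpe(x, time = tm, window = NULL,
                       lb_time = lb, variable_win = TRUE)
# the first output uses all data up to t = 10; later outputs use disjoint decades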
Steven E. Pav shabbychef@gmail.com
Given the length \(n\) vector \(x\), for a given index \(i\), define \(x^{(i)}\) as the elements of \(x\) defined by the sliding time window (see the section on time windowing). Then define \(\mu_i\), \(\sigma_i\) and \(n_i\) as, respectively, the sample mean, standard deviation and number of non-NA elements in \(x^{(i)}\).
We compute output vector \(m\) the same size as \(x\). For the 'centered' version of \(x\), we have \(m_i = x_i - \mu_i\). For the 'scaled' version of \(x\), we have \(m_i = x_i / \sigma_i\). For the 'z-scored' version of \(x\), we have \(m_i = (x_i - \mu_i) / \sigma_i\). For the 't-scored' version of \(x\), we have \(m_i = \sqrt{n_i} \mu_i / \sigma_i\).
We also allow a 'lookahead' for some of these operations. If positive, the moments are computed using data from larger indices; if negative, from smaller indices.
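As a worked check of these definitions under unit-spaced times: with \(W = 10\) and the default used_df = 1 (so \(\sigma_i\) matches R's sd), the last z-scored output should equal a direct computation over the final window:

set.seed(42)
x <- rnorm(20)
z <- t_running_zscored(x, time = seq_along(x), window = 10)
xw <- x[11:20]    # observations with 10 < t_i <= 20
all.equal(z[20], (x[20] - mean(xw)) / sd(xw))   # TRUE, up to numerical error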
Terriberry, T. "Computing Higher-Order Moments Online." https://web.archive.org/web/20140423031833/http://people.xiph.org/~tterribe/notes/homs.html
J. Bennett, et al., "Numerically Stable, Single-Pass, Parallel Statistics Algorithms," Proceedings of IEEE International Conference on Cluster Computing, 2009. doi:10.1109/CLUSTR.2009.5289161
Cook, J. D. "Accurately computing running variance." https://www.johndcook.com/standard_deviation/
Cook, J. D. "Comparing three methods of computing standard deviation." https://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
running_centered, scale