determinism: Detecting determinism in a time series

Description

Infers the existence of deterministic structure in a given time series. If fractal structure exists, this function helps the user decide whether a deterministic chaotic model or a stochastic fractal time series model is appropriate for their data.

Usage

determinism(x, dimension=6, tlag=NULL,
    olag=1, scale.min=NULL, scale.max=NULL,
    resolution=NULL, method="ce",
    n.realization=10, attach.summary=TRUE,
    seed=0)

Arguments

x

a numeric vector or matrix containing a uniformly sampled, real-valued time series.

attach.summary

a logical flag. If TRUE, a summary of the results is calculated and attached to the returned object as an attribute named "summary". The summary statistics are calculated using the summary method. Default: TRUE.

dimension

an integer defining the maximum embedding dimension to use in analyzing the data. Default: 6.

method

a character string representing the method to be used to generate surrogate data. Choices are:

"aaft": Theiler's Amplitude Adjusted Fourier Transform.
"phase": Theiler's phase randomization.
"ce": Davies and Harte's Circulant Embedding.
"dh": Davison and Hinkley's phase and amplitude randomization.

Default: "ce".

n.realization

an integer denoting the number of surrogate realizations to create and analyze for comparison to the ensemble of E-statistics. Default: 10.

olag

the number of points along the trajectory of the current point that must be exceeded in order for another point in the phase space to be considered a neighbor candidate. This argument is used to help attenuate temporal correlation in the embedding, which can lead to spuriously low correlation dimension estimates. The orbital lag must be positive or zero. Default: length(x)/10 or 500, whichever is smaller.

resolution

a numeric value representing the spacing between scales (Euclidean bin size). Default: diff(range(x))/1000.

scale.max

a numeric value defining the maximum scale over which the results should be returned. Default: diff(range(x)) * sqrt(dimension).

scale.min

a numeric value defining the minimum scale over which the results should be returned. Default: min(diff(sort(x)))/1000.
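As a rough illustration, the three scale-related defaults described above (resolution, scale.max and scale.min) can be reproduced by hand. This is a minimal sketch in base R that simply evaluates the stated expressions; the series x is made up for the example, and the code is not taken from the package source.

```r
# Illustrative series (an assumption for this sketch); any numeric vector works.
set.seed(1)
x <- cumsum(rnorm(256))
dimension <- 6

# Defaults as stated in the argument descriptions.
resolution <- diff(range(x)) / 1000             # spacing between scales
scale.max  <- diff(range(x)) * sqrt(dimension)  # largest scale returned
scale.min  <- min(diff(sort(x))) / 1000         # smallest scale returned

c(scale.min = scale.min, resolution = resolution, scale.max = scale.max)
```

Evaluating these expressions before calling determinism can help judge whether explicit scale.min and scale.max values are worth supplying for a given series.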

seed

a positive integer representing the initial seed value for generating surrogate realizations of the original input time series. These surrogates are used to collect an ensemble of determinism statistics (see DETAILS section for more information). If the specified seed value is positive, the seeds used for generating the surrogate ensemble will be calculated via set.seed(seed);rsample(.Machine\$integer.max, size=n.realization). This argument should only be used (by specifying a positive seed value) if the user wishes to replicate a particular set of results, such as those illustrated in the casebook examples. If seed=0, then the random seeds will be generated based on the current time. Default: 0 (generate the random seeds based on the current time).
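The reproducibility idea behind a positive seed can be sketched as follows. This is only an illustration: it uses base R's sample.int as a stand-in for rsample, so the seed values it yields should not be expected to match those generated inside determinism. The helper name surrogate_seeds is hypothetical.

```r
# Hypothetical helper: derive one seed per surrogate realization from a
# single master seed. sample.int() is a base R stand-in for rsample().
surrogate_seeds <- function(seed, n.realization) {
  set.seed(seed)
  sample.int(.Machine$integer.max, size = n.realization)
}

s1 <- surrogate_seeds(seed = 42, n.realization = 10)
s2 <- surrogate_seeds(seed = 42, n.realization = 10)

# The same master seed yields the same set of per-surrogate seeds.
identical(s1, s2)
```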

tlag

the time delay between coordinates. Default: the decorrelation time of the autocorrelation function.

Value

an object of class determinism.

S3 METHODS

eda.plot: plots a barplot of the determinism level (expressed as a percentage on [0,100]) based on the fraction of overlap between the E-statistics for the original series and those of the ensemble of surrogates. The amount of non-overlap is calculated relative to both the first quartile and the extreme values of the E-statistics for the surrogate ensemble.
plot: plots the E-statistics at small scales of the original series overlaid with those of the ensemble of surrogates (illustrated using boxplots over a subsampled set of the surrogate E-statistics).
print: prints a summary of the analysis.
summary: produces a summary of the E-statistics for use in the print, plot, and eda.plot methods.

Details

This function performs the so-called delta-epsilon test for detecting deterministic structure in a time series by exploiting the (possible) continuity of orbits comprising a phase space topology created by a time-delayed embedding of the original time series. This phase space continuity is non-existent for stochastic white noise processes. The delta-epsilon test proceeds as follows:

1: an ensemble of randomized realizations of the original time series, i.e., surrogate data, is created.
2: an appropriate phase space statistic (called the E-statistic) is calculated for both the time-delayed embedding of the original time series and the ensemble of surrogates.
3: a comparison of the E-statistic for the original series and the ensemble of surrogate data is made. If there is a separation of the original E-statistic from that of the ensemble, it implies the existence of deterministic structure in the original time series. Conversely, an overlap of E-statistics implies that the original series cannot be discriminated from the ensemble of randomized surrogates, and thus it is inferred that the original series is a realization of a random process.
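Step 1 can be illustrated with a generic phase-randomization surrogate written in base R. This is a minimal sketch of the general technique (keep the Fourier amplitudes, randomize the phases), not the package's implementation of method="phase" and not the default "ce" method. The function name phase_surrogate is hypothetical.

```r
# Hypothetical helper: generic phase-randomized surrogate of a real series.
phase_surrogate <- function(x) {
  n <- length(x)
  X <- fft(x)
  # Positive frequencies, excluding DC and (for even n) the Nyquist term.
  half <- seq_len(floor((n - 1) / 2))
  phi <- runif(length(half), 0, 2 * pi)
  # Keep each amplitude, replace each phase with a uniform random phase.
  X[half + 1] <- Mod(X[half + 1]) * exp(1i * phi)
  # Enforce conjugate symmetry so the inverse transform is real.
  X[n - half + 1] <- Conj(X[half + 1])
  Re(fft(X, inverse = TRUE)) / n
}

set.seed(1)
x <- rnorm(64)
s <- phase_surrogate(x)

# The surrogate shares the amplitude spectrum of the original series.
all.equal(Mod(fft(s)), Mod(fft(x)))
```

Because only the phases change, the surrogate keeps the linear correlation structure of the original while destroying any deterministic phase relationships, which is what makes it a useful null for the comparison in step 3.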

The discriminating E-statistic is calculated as follows: Define $$\delta _{j,k} = |z_{j} - z_{k}|$$ $$\epsilon _{j,k} = |z_{j+ \kappa} - z_{k+\kappa}|$$ $$e(r) \equiv \overline{\epsilon _{j,k}} \qquad\hbox{for $j,k$ s.t. } r \leq \delta_{j,k} < r + \Delta r $$

where $\delta_{j,k}$ is the Euclidean distance (using an infinity-norm metric) between phase space points $z_j$ and $z_k$, and $\epsilon_{j,k}$ is the corresponding separation distance between the points at a time $\kappa$ points in the future along their respective orbits. These future points are referred to as images of the original pair. The variable $\kappa$ is referred to as the orbital lag. The increment $\Delta r$ is the width of a specified Euclidean bin size. Given $\Delta r$, the distance $\delta_{j,k}$ is used solely to identify the proper bin in which to store the image distance $\epsilon_{j,k}$. The average of each bin forms the $e(r)$ statistic. Finally, the E-statistic is formed by calculating a cumulative summation over the $e(r)$ statistic, i.e., $$E(r) \equiv \sum \overline{e(r)}. $$

If there exists a distinct separation of the E-statistics for the original time series and the ensemble of surrogate data, it implies that the signal is deterministic. The orbital lag $\kappa$ should be chosen large enough to sufficiently decorrelate the points evaluated along a given orbit.
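The definitions above can be turned into a small base R sketch. It is a simplified illustration under several assumptions: every pair of embedded points is used (no temporal exclusion of neighbors), the bin width is derived from a fixed number of bins n.bin, and a random shuffle stands in for the surrogate methods offered by the function. The names delay_embed, e_statistic and n.bin are hypothetical and do not come from the package.

```r
# Hypothetical helper: row i is (x[i], x[i + tlag], ..., x[i + (dimension - 1) * tlag]).
delay_embed <- function(x, dimension, tlag) {
  n <- length(x) - (dimension - 1) * tlag
  sapply(seq_len(dimension) - 1, function(d) x[seq_len(n) + d * tlag])
}

# Hypothetical helper: e(r) and its cumulative sum E(r) as defined above.
e_statistic <- function(x, dimension = 3, tlag = 1, kappa = 1, n.bin = 20) {
  z <- delay_embed(x, dimension, tlag)
  m <- nrow(z) - kappa                         # points whose image exists
  d <- as.matrix(dist(z, method = "maximum"))  # infinity-norm distances
  pairs <- which(upper.tri(matrix(TRUE, m, m)), arr.ind = TRUE)
  delta   <- d[pairs]                          # |z_j - z_k|
  epsilon <- d[pairs + kappa]                  # |z_{j+kappa} - z_{k+kappa}|
  dr  <- max(delta) / n.bin                    # bin width (Delta r)
  bin <- pmin(floor(delta / dr) + 1, n.bin)    # r <= delta < r + dr
  e <- as.numeric(tapply(epsilon, factor(bin, levels = seq_len(n.bin)), mean))
  ok <- !is.na(e)                              # drop empty bins
  data.frame(r = (seq_len(n.bin)[ok] - 1) * dr, e = e[ok], E = cumsum(e[ok]))
}

# Deterministic test series: the logistic map in its chaotic regime.
set.seed(1)
x <- numeric(500)
x[1] <- 0.3
for (i in 2:500) x[i] <- 3.9 * x[i - 1] * (1 - x[i - 1])

orig <- e_statistic(x)
surr <- e_statistic(sample(x))  # shuffled copy as a crude surrogate

# At the smallest scale, nearby points of the deterministic series have
# nearby images, so its e(r) lies below that of the shuffled series.
c(original = orig$e[1], shuffled = surr$e[1])
```

A full test would repeat the surrogate step n.realization times and compare the original E-statistic against the spread of the ensemble rather than against a single shuffled series.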

References

Kaplan, D. (1994), Exceptional Events as Evidence for Determinism, Physica D, 73, 38--48.

Examples

# NOT RUN {
## perform a determinism test for the beamchaos
## series. In order to do so, it is vitally
## important to provide the proper orbital lag,
## which can be estimated as the lag value
## associated with the first common maximum over
## all contours in a spaceTime plot.
plot(spaceTime(beamchaos))

## we estimate an appropriate olag of 30, and
## subsequently perform the determinism test
beam.det <- determinism(beamchaos, olag=30)
print(beam.det)
plot(beam.det)

eda.plot(beam.det)

## perform a similar analysis for a Gaussian white 
## noise realization 
rnorm.det <- determinism(rnorm(1024),olag=1)
print(rnorm.det)
plot(rnorm.det)

eda.plot(rnorm.det)
# }
