Estimates the correlation dimension by forming a delay embedding of a time series, calculating correlation summation curves (one per embedding dimension), and subsequently fitting the slopes of these curves on a log-log scale using a robust linear regression model. If the slopes converge at a given embedding dimension \(E\), then \(E\) is the correct embedding dimension and the (convergent) slope value is an estimate of the correlation dimension for the data.
corrDim(x, dimension=5,
tlag=timeLag(x, method="acfdecor"), olag=0, resolution=2)
x: a vector containing a uniformly-sampled real-valued time series or a matrix containing an embedding with each column representing a different coordinate. If the latter, the dimension input is set to the number of columns and the tlag input is ignored.
dimension: the maximal embedding dimension. Default: 5.
olag: the number of points along the trajectory of the current point that must be exceeded in order for another point in the phase space to be considered a neighbor candidate. This argument is used to help attenuate temporal correlation in the embedding, which can lead to spuriously low correlation dimension estimates. The orbital lag must be positive or zero. Default: length(x)/10 or 500, whichever is smaller.
resolution: an integer representing the spatial resolution factor. A value of P increases the number of effective scales by a factor of P at a cost of raising the \(\ell_\infty\) norm to the Pth power. For example, setting the resolution to 2 will double the number of scales while imposing an additional multiplication operation. The resolution must exceed unity. Default: 2.
tlag: the time delay between coordinates. Default: timeLag(x, method="acfdecor"), the decorrelation time of the autocorrelation function.
an object of class chaoticInvariant.
plots an extended data analysis plot, which graphically summarizes the process of obtaining a correlation dimension estimate. A time history, phase plane embedding, correlation summation curves, and the slopes of correlation summation curves as a function of scale are plotted.
plots the correlation summation curves on a log-log scale. The following options may be used to adjust the plot components:
Character string denoting the type of data to be plotted. The "stat" option plots the correlation summation curves while the "dstat" option plots a 3-point estimate of the derivatives of the correlation summation curves. The "slope" option plots the estimated slope of the correlation summation curves as a function of embedding dimension. Default: "stat".
Logical flag. If TRUE, a regression line is overlaid for each curve. Default: TRUE.
Logical flag. If TRUE, a grid is overlaid on the plot. Default: TRUE.
Logical flag. If TRUE, a legend of the estimated slopes as a function of embedding dimension is displayed. Default: TRUE.
Additional plot arguments (set internally by the par function).
prints a qualitative summary of the results.
To estimate the correlation dimension, correlation summation
curves must be generated and subsequently fit with a
robust linear regression model to obtain the slopes of these
curves on a log-log plot. The dimension at which these
slope estimates (appear to) converge reveals the proper
embedding dimension for the data and the slope at this
(and higher) embedding dimensions is an estimate of the
correlation dimension. The function used to fit the
correlation summation curves is lmsreg, which fits a robust
linear model to the data using the method of least median of squares
regression. See the on-line help documentation for the lmsreg
function: in R, lmsreg is found in the MASS package, while in S-PLUS
it is built in and appears in the splus database.
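As a rough illustration of the fitting step, the sketch below fits a least-median-of-squares line to a synthetic log-log curve with lmsreg from the MASS package and compares it with an ordinary least-squares fit. The data, the slope of 2.05, and all variable names are invented for illustration and are not output from corrDim; the plateau at the largest scales only mimics the kind of saturation that can pull a non-robust fit away from the scaling-region slope.

```r
library(MASS)  # provides lmsreg (least median of squares regression)

set.seed(1)  # keeps runs repeatable if lqs resorts to random resampling

# Synthetic log2-log2 curve (illustrative only): slope 2.05 in the scaling
# region, flattening to a plateau at the three largest scales.
log_eps <- seq(-8, -1, by = 0.5)
log_C   <- 2.05 * log_eps - 1 + 0.01 * sin(seq_along(log_eps))
plateau <- log_eps > -2.5
log_C[plateau] <- log_C[log_eps == -2.5]

curve_df <- data.frame(log_eps = log_eps, log_C = log_C)

# Robust fit versus ordinary least squares on the same curve.
robust_fit <- lmsreg(log_C ~ log_eps, data = curve_df)
ols_fit    <- lm(log_C ~ log_eps, data = curve_df)

robust_slope <- unname(coef(robust_fit)["log_eps"])
ols_slope    <- unname(coef(ols_fit)["log_eps"])
```

In this toy setting the robust slope stays close to the scaling-region value, while the least-squares slope is dragged down by the plateau points.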
The correlation summation at scale \(\varepsilon\) for a given embedding dimension is defined as $$C_2(\varepsilon)={ 2 \over (N - \gamma)(N - \gamma - 1) } \sum_{i=1}^N\sum_{j=i+\gamma+1}^N\Theta(\varepsilon - || \mathbf{X_i} - \mathbf{X_j} ||),$$ where \(\Theta(\cdot)\) is the Heaviside function $$ \Theta(x)=\left\{ \begin{array}{ll} 0,& \mbox{if $x \le 0$;}\\ 1,& \mbox{otherwise} \end{array} \right.$$
and \(\mathbf{X_i}\) is the \(i\)th point of a collection of N points in the phase space. The parameter \(\gamma\) is the orbital lag.
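The definition above can be sketched directly in R. This is a naive, quadratic-time illustration of the formula using the \(\ell_\infty\) norm, not the optimized routine that corrDim uses; the helper names delay_embed and corr_sum are hypothetical and exist only for this example.

```r
# Delay-embed a scalar series: row i is (x[i], x[i + tlag], ..., x[i + (E-1)*tlag]).
delay_embed <- function(x, E, tlag) {
  n <- length(x) - (E - 1) * tlag
  stopifnot(n >= 1)
  do.call(cbind, lapply(seq_len(E) - 1, function(k) x[seq_len(n) + k * tlag]))
}

# Naive correlation sum C_2(eps) over the rows of X, with orbital lag olag.
corr_sum <- function(X, eps, olag = 0) {
  N <- nrow(X)
  stopifnot(N > olag + 1)
  count <- 0
  for (i in seq_len(N - olag - 1)) {
    j <- (i + olag + 1):N  # only pairs separated by more than olag steps
    # l-infinity distance from point i to each candidate point j
    d <- apply(abs(sweep(X[j, , drop = FALSE], 2, X[i, ])), 1, max)
    count <- count + sum(d < eps)  # Heaviside(eps - d) is 1 only when d < eps
  }
  2 * count / ((N - olag) * (N - olag - 1))
}

# Small hand-checkable case: four points on a line at 0, 1, 2, 3.
pts <- matrix(c(0, 1, 2, 3), ncol = 1)
c_half <- corr_sum(pts, eps = 1.5)            # 3 of the 6 pairs are within 1.5
c_lag  <- corr_sum(pts, eps = 2.5, olag = 1)  # 2 of the 3 admissible pairs
```

The normalization equals the number of admissible pairs, so the sum reaches 1 once \(\varepsilon\) exceeds the largest inter-point distance.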
The algorithm used to calculate the correlation summation is made computationally efficient by using:
The \(\ell_\infty\) norm to calculate the distance between neighbors in the phase space as opposed to (say) the \(\ell_2\) norm, which involves computationally intensive square root and squaring operations. The \(\ell_\infty\) distance between two points in the phase space is the maximal absolute difference between the points' respective coordinates, i.e. if \(\mathbf{X}=[z_1, z_2, z_3]^T\) then \(||\mathbf{X}||_\infty \equiv \max_i |z_i|\).
Bitwise masking and shift operations to reveal the radix-2 exponent of the \(\ell_\infty\) norm. This direct means of obtaining the exponent immediately yields the associated scale of the distance between neighbors in the phase space while avoiding costly log operations. The bitwise mask and shift factors are based on the IEEE standard 754 for binary floating-point arithmetic. Initial tests are performed in the code to verify that the current machine follows this standard.
A computationally efficient routine to calculate the resulting value of a float raised to a positive integer power. Specifically, the \(\ell_\infty\) norm is raised to an integer power (p) to effectively increase the spatial resolution by a factor of p.
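The effect of the integer power on resolution can be illustrated with a short sketch. It assumes that a distance is assigned to a scale bin by its radix-2 exponent, and it uses log2 for clarity instead of the bitwise masking described above; int_pow and scale_bin are hypothetical helper names, and the repeated-squaring scheme is one common way to compute an integer power, not necessarily the routine used internally.

```r
# Raise x to a non-negative integer power p by repeated squaring.
int_pow <- function(x, p) {
  result <- 1
  while (p > 0) {
    if (p %% 2 == 1) result <- result * x
    x <- x * x
    p <- p %/% 2
  }
  result
}

# Radix-2 exponent of d^p. Dividing the result by p gives the scale on the
# original log2 axis, so bins are 1/p octaves wide instead of one octave.
scale_bin <- function(d, p = 1) floor(log2(int_pow(d, p)))

# Two distances that share a bin at p = 1 are separated at p = 2.
bins_p1 <- c(scale_bin(2.5, 1), scale_bin(3, 1))
bins_p2 <- c(scale_bin(2.5, 2), scale_bin(3, 2))
```

With p = 1 both distances fall in the octave starting at 2, whereas with p = 2 they land in adjacent half-octave bins.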
Given the correlation summation curves \(C_2(E,\varepsilon)\),
where E is the embedding dimension and
\(\varepsilon\) is the scale, the correlation dimension curves
\(D_2(E,\varepsilon)\) can be calculated by
$$D_2(E,\varepsilon) ={\ln C_2(E,2\varepsilon) - \ln C_2(E,\varepsilon/2) \over
\ln 2\varepsilon - \ln \varepsilon/2} ={1 \over 2} \log_2{ C_2(E,2\varepsilon) \over
C_2(E,\varepsilon/2) }.$$
This formulation is used to help suppress numerical instabilities that are present in
other numerical derivative schemes such as a first order difference.
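A minimal sketch of this three-point estimate follows, assuming the correlation sums for one embedding dimension are sampled at octave-spaced scales so that the neighbors of each sample sit at \(2\varepsilon\) and \(\varepsilon/2\). The function name d2_curve and the power-law test curve are illustrative assumptions.

```r
# C2[k] holds the correlation sum at the k-th octave-spaced scale, so
# C2[k + 1] and C2[k - 1] are the values at 2*eps and eps/2 respectively.
d2_curve <- function(C2) {
  stopifnot(length(C2) >= 3)
  k <- 2:(length(C2) - 1)
  0.5 * log2(C2[k + 1] / C2[k - 1])
}

# A pure power law C2 = eps^2.05 should return 2.05 at every interior scale.
eps <- 2^(-8:-1)
d2  <- d2_curve(eps^2.05)
```

The estimate is only defined at interior scales, so the returned vector is two elements shorter than the input.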
As a caveat to the user, the slope estimates of the correlation summation curves will typically display a fair amount of variability, and the range of scales over which the curves are approximately linear may be small. As such, the correlation dimension estimate should always be interpreted as a subjective summary statistic, even when the original time series is representative of a truly noise-free chaotic response.
Peter Grassberger and Itamar Procaccia (1983), Measuring the strangeness of strange attractors, Physica D, 9, 189--208.
Holger Kantz and Thomas Schreiber (1997), Nonlinear Time Series Analysis, Cambridge University Press.
Peter Grassberger and Itamar Procaccia (1983), Characterization of strange attractors, Physical Review Letters, 50(5), 346--349.
Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871--880.
infoDim, embedSeries, timeLag, chaoticInvariant, lyapunov, poincareMap, spaceTime, findNeighbors, determinism.
# NOT RUN {
## calculate the correlation dimension estimates
## for chaotic beam data using a delay
## embedding for dimensions 1 through 10, an
## orbital lag of 10, and a spatial resolution
## of 4.
beam.d2 <- corrDim(beamchaos, olag=10, dim=10, res=4)
## print a summary of the results
print(beam.d2)
## plot the correlation summation curves
plot(beam.d2, fit=FALSE, legend=FALSE)
## plot an extended data analysis plot
eda.plot(beam.d2)
# }