This function estimates the information dimension by forming a delay embedding of a time series, calculating related statistical curves (one per embedding dimension), and subsequently fitting the slopes of these curves on a log-log scale using a robust linear regression model. If the slopes converge at a given embedding dimension \(E\), then \(E\) is the correct embedding dimension and the (convergent) slope value is an estimate of the information dimension for the data.
infoDim(x, dimension=5, tlag=NULL,
olag=0, n.density=100, metric=Inf,
max.neighbors=as.integer(min(c(round(length(x) / 3), 100))),
n.reference=as.integer(round(length(x) / 20)))
x: a vector containing a uniformly-sampled real-valued time series.
dimension: the maximal embedding dimension. Default: 5.
max.neighbors: let \(p=k/N\) for \(0 < p \le 1\) be the mass density, where \(N\) is the number of points in the embedding and \(k\) is the number of neighbors found near an arbitrary reference point in the embedding. The max.neighbors parameter defines the maximum value for \(k\), regardless of the required density. If the number of neighbors \(k\) required to meet the density \(p\) exceeds max.neighbors, then \(k\) is limited to max.neighbors and \(N\) is instead decreased to \(N'=\lfloor \mbox{max.neighbors} / p \rfloor\). Note that only the database of neighbors (formed by an efficient kd-tree algorithm) is reduced to \(N'\) points, while all \(N\) points in the embedding are considered as neighbor candidates for any given reference point. The purpose of this reduction is to lessen the computational burden. Setting max.neighbors to a value larger than the default increases the computational burden but lessens the error in estimating the average neighborhood radius of all reference points with a (specified) constant neighborhood density. Default: min(c(round(length(x) / 3), 100)).
metric: the metric used to define the distance between points in the embedding. Choices are limited to 1, 2, or Inf, which represent an \(L_1\), \(L_2\), and \(L_\infty\) norm, respectively. Default: Inf.
n.density: the number of points to create in developing the density vector. For a given reference point in the phase space, the density is defined by the relation \(p=k / N\), where \(k\) is the number of neighbors in the phase space and \(N\) is the total number of points in the embedding. To obtain the information dimension statistics, the density is varied logarithmically from \(1/N\) to \(1.0\). Default: 100.
n.reference: the number of reference points to use in forming the information dimension statistic. This argument directly specifies the number of equi-dense neighborhoods to average in forming the average neighborhood radius statistic. As with the max.neighbors argument, increasing n.reference beyond the default will increase the computational burden, with the benefit of obtaining (perhaps) less variable statistics. Default: round(length(x) / 20).
olag: the orbital lag, i.e. the number of points along the trajectory of the current point that must be exceeded in order for another point in the phase space to be considered a neighbor candidate. This argument is used to help attenuate temporal correlation in the embedding, which can lead to spuriously low dimension estimates. The orbital lag must be positive or zero. Default: 0.
tlag: the time delay between coordinates. Default: the decorrelation time of the autocorrelation function.
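As an illustration of how n.density and max.neighbors interact, the following base R sketch builds a logarithmic density grid and applies the neighbor cap described above. It is not the package's internal code: the value of N, the use of round() to turn a density into a neighbor count, and the names k_needed, k_used and N_used are assumptions made for the example.

```r
# Hypothetical embedding size; in practice N depends on length(x),
# the embedding dimension and the time lag.
N <- 2000
n.density <- 100
max.neighbors <- min(round(N / 3), 100)

# Densities spaced logarithmically from 1/N to 1.
p <- exp(seq(log(1 / N), log(1), length.out = n.density))

# Neighbors needed to reach each density with the full database
# (rounding rule is an assumption of this sketch).
k_needed <- pmax(1, round(p * N))

# Where the cap binds, hold k at max.neighbors and shrink the
# database to N' = floor(max.neighbors / p) instead.
capped <- k_needed > max.neighbors
k_used <- ifelse(capped, max.neighbors, k_needed)
N_used <- ifelse(capped, floor(max.neighbors / p), N)

# Inspect a few grid points on either side of the cap.
print(head(data.frame(p, k_used, N_used)[capped, ], 3))
print(head(data.frame(p, k_used, N_used)[!capped, ], 3))
```

With these settings the cap starts to bind once the requested density exceeds roughly max.neighbors / N, which is where the reduced database size N' begins to fall below N.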
an object of class chaoticInvariant.
eda.plot: plots an extended data analysis plot, which graphically summarizes the process of obtaining an information dimension estimate. A time history, phase plane embedding, information dimension curves, and the slopes of the information dimension curves as a function of scale are plotted.
plot: plots the information dimension curves on a log-log scale. The following options may be used to adjust the plot components:
type: Character string denoting the type of data to be plotted. The "stat" option plots the information dimension curves, while the "dstat" option plots a 3-point estimate of the derivatives of the information dimension curves. The "slope" option plots the estimated slope of the information dimension curves as a function of embedding dimension. Default: "stat".
fit: Logical flag. If TRUE, a regression line is overlaid for each curve. Default: TRUE.
grid: Logical flag. If TRUE, a grid is overlaid on the plot. Default: TRUE.
legend: Logical flag. If TRUE, a legend of the estimated slopes as a function of embedding dimension is displayed. Default: TRUE.
...: Additional plot arguments (set internally by the par function).
print: prints a qualitative summary of the results.
The information dimension (\(D_1\)) is one of an infinite number of fractal dimensions of a chaotic system. For generalized fractal dimension estimations, correlation integral moments are determined as an average of the contents of neighborhoods in the phase space of equal radius \(\varepsilon\). Using this approach, the information dimension for a given embedding dimension \(E\) is estimated via \(D_1(E)=\langle\ln(p)\rangle / \ln(\varepsilon)\) in the limit as \(\varepsilon\) approaches zero, where \(\varepsilon\) is the radius of an \(E\)-dimensional hypersphere, \(p\) is the density (also known as the mass fraction), and \(\langle\ln(p)\rangle\) is the average Shannon information needed to specify an arbitrary point in the phase space with accuracy \(\varepsilon\).
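The fixed-radius estimate can be sketched in base R on a point cloud of known dimension. This is a minimal illustration and not the package's implementation: the test data (points uniform in the unit square, so the expected dimension is 2), the Euclidean metric, the range of radii, the inclusion of each reference point in its own count (to avoid taking the log of zero), and all variable names are assumptions of the example.

```r
set.seed(42)

# Toy data: points uniform in the unit square (expected dimension 2).
N <- 1000
pts <- matrix(runif(2 * N), ncol = 2)

# Pairwise Euclidean distances between all points.
d <- as.matrix(dist(pts))

# Radii spaced logarithmically; every point acts as a reference point.
eps <- exp(seq(log(0.1), log(0.3), length.out = 8))

# For each radius, the density p_i is the fraction of points within eps
# of reference point i (the point itself is counted), and the statistic
# is the average of ln(p_i) over all reference points.
mean_log_p <- vapply(eps, function(e) mean(log(rowSums(d <= e) / N)),
                     numeric(1))

# Slope of <ln p> against ln(eps) estimates the information dimension.
slope <- unname(coef(lm(mean_log_p ~ log(eps)))[2])
print(slope)
```

On this finite sample the slope should come out somewhat below 2, because neighborhoods near the edges of the square are clipped.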
Alternatively, the neighborhoods can be constructed with variable radii but with constant density. The scaling behavior of the average radii of these neighborhoods as a function of density is then used to estimate the fractal dimensions. In this function, we use this constant density approach to calculate the statistics for estimating the information dimension.
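The constant-density approach can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's algorithm: it uses a brute-force distance matrix rather than a kd-tree, uses every point as a reference point, applies no orbital lag, and runs on toy data (points on a line segment in the plane, so the expected dimension is 1). The function name mean_log_radius and the choice of neighbor counts are hypothetical.

```r
set.seed(1)

# Toy data: N points along a diagonal line segment in the plane.
N <- 400
u <- runif(N)
pts <- cbind(u, u)

# Pairwise L-infinity (maximum norm) distances.
d <- as.matrix(dist(pts, method = "maximum"))

# Average log radius of the neighborhoods that hold exactly k neighbors.
mean_log_radius <- function(d, k) {
  # In each sorted row, element 1 is the point itself (distance 0),
  # so the k-th neighbor sits at position k + 1.
  r_k <- apply(d, 1, function(row) sort(row)[k + 1])
  mean(log(r_k))
}

# Neighbor counts spaced roughly logarithmically, and their densities.
k_values <- unique(round(exp(seq(log(5), log(60), length.out = 12))))
p <- k_values / N
log_r <- vapply(k_values, function(k) mean_log_radius(d, k), numeric(1))

# Slope of ln(p) against the average ln(radius) estimates the dimension.
slope <- unname(coef(lm(log(p) ~ log_r))[2])
print(slope)
```

For this data set the fitted slope should be close to 1, the dimension of the line segment.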
For single variable time series, the phase space is approximated with a delay embedding and \(D_1(E)\) is thus estimated over statistics gathered for dimensions \(1,\ldots,E\). For chaotic systems, these statistics will 'saturate' at a finite embedding dimension, revealing both the (estimated) information dimension and an appropriate embedding dimension for the system. A linear regression scheme should be used to estimate \(D_1(E)\) from the statistics returned by this function.
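The embed-then-fit procedure can be sketched end to end in base R. This is an illustrative sketch and not the package's code: the Hénon map is used as a stand-in chaotic series, the time lag is fixed at 1, no orbital lag is applied, all embedded points serve as reference points, and ordinary least squares via lm() replaces a robust regression. The helper names henon, delay_embed and slope_for_dim are hypothetical.

```r
# Henon map in delay form; the first `discard` iterates are dropped.
henon <- function(n, a = 1.4, b = 0.3, discard = 100) {
  x <- numeric(n + discard)
  for (i in 3:(n + discard)) x[i] <- 1 - a * x[i - 1]^2 + b * x[i - 2]
  x[-seq_len(discard)]
}

# Delay embedding: row i is (x[i], x[i + tlag], ..., x[i + (E - 1) * tlag]).
delay_embed <- function(x, E, tlag = 1) {
  n <- length(x) - (E - 1) * tlag
  idx <- outer(seq_len(n), (seq_len(E) - 1) * tlag, "+")
  matrix(x[idx], nrow = n)
}

# Constant-density statistic and fitted slope for one embedding dimension.
slope_for_dim <- function(x, E, k_values, tlag = 1) {
  emb <- delay_embed(x, E, tlag)
  d <- as.matrix(dist(emb, method = "maximum"))
  sorted <- apply(d, 1, sort)   # column i: sorted distances from point i
  # Row k + 1 holds the distance to the k-th neighbor (row 1 is the point itself).
  log_r <- rowMeans(log(sorted[k_values + 1, , drop = FALSE]))
  p <- k_values / nrow(emb)
  unname(coef(lm(log(p) ~ log_r))[2])
}

x <- henon(800)
k_values <- c(5, 8, 12, 18, 27, 40, 60)
slopes <- vapply(1:4, function(E) slope_for_dim(x, E, k_values), numeric(1))

# One slope per embedding dimension; look for the dimension at which
# the values stop changing appreciably.
print(round(slopes, 2))
```

Comparing consecutive entries of slopes is a simple way to judge saturation; with short series the estimates at higher embedding dimensions become noisier, so any apparent convergence should be read with caution.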
Peter Grassberger and Itamar Procaccia (1983), Measuring the strangeness of strange attractors, Physica D, 9, 189-208.
Holger Kantz and Thomas Schreiber (1997), Nonlinear Time Series Analysis, Cambridge University Press.
corrDim, embedSeries, timeLag, chaoticInvariant, lyapunov, poincareMap, spaceTime, findNeighbors, determinism.
# NOT RUN {
## calculate the information dimension estimates
## for chaotic beam data using a delay
## embedding for dimensions 1 through 10
beam.d1 <- infoDim(beamchaos, dimension=10)
## print a summary of the results
print(beam.d1)
## plot the information dimension curves without
## regression lines
plot(beam.d1, fit=FALSE, legend=FALSE)
## plot an extended data analysis plot
eda.plot(beam.d1)
# }