fractal (version 2.0-4)

infoDim: Information dimension

Description

This function estimates the information dimension by forming a delay embedding of a time series, calculating related statistical curves (one per embedding dimension), and subsequently fitting the slopes of these curves on a log-log scale using a robust linear regression model. If the slopes converge at a given embedding dimension \(E\), then \(E\) is the correct embedding dimension and the (convergent) slope value is an estimate of the information dimension for the data.

Usage

infoDim(x, dimension=5, tlag=NULL,
    olag=0, n.density=100, metric=Inf,
    max.neighbors=as.integer(min(c(round(length(x) / 3), 100))),
    n.reference=as.integer(round(length(x) / 20)))

Arguments

x

a vector containing a uniformly-sampled real-valued time series.

dimension

the maximal embedding dimension. Default: 5.

max.neighbors

let \(p=k/N\) for \(0 < p \le 1\) be the mass density where \(N\) is the number of points in the embedding and \(k\) is the number of neighbors found near an arbitrary reference point in the embedding. The max.neighbors parameter defines the maximum value for \(k\), regardless of the required density. In the case where the number of neighbors \(k\) required to meet the density \(p\) exceeds max.neighbors, then \(k\) is limited to max.neighbors and (instead) \(N\) is decreased accordingly to \(N'=\lfloor \mbox{max.neighbors} / p \rfloor\). It is important to note that only the database of neighbors (formed by an efficient kd-tree algorithm) is reduced to \(N'\) points while all \(N\) points in the embedding are considered as neighbor candidates for any given reference point. The point of all this is to reduce the computational burden. Setting max.neighbors to a larger value than the default will increase the computational burden but will lessen the error in estimating the average neighborhood radius of all reference points with a (specified) constant neighborhood density. Default: min(c(round(length(x) / 3), 100)).

metric

the metric used to define the distance between points in the embedding. Choices are limited to 1, 2, or Inf which represent an \(L_1\), \(L_2\), and \(L_\infty\) norm, respectively. Default: Inf.

n.density

the number of points to create in developing the density vector. For a given reference point in the phase space, the density is defined by the relation \(p=k / N\) where \(k\) is the number of neighbors in the phase space and \(N\) is the total number of points in the embedding. To obtain the information dimension statistics, the density is varied logarithmically from \(1/N\) to \(1.0\). Default: 100.

n.reference

the number of reference points to use in forming the information dimension statistic. This argument directly specifies the number of equi-dense neighborhoods to average in forming the average neighborhood radius statistic. As with the max.neighbors argument, increasing n.reference beyond the default will increase the computational burden at the benefit of obtaining (perhaps) less variable statistics. Default: round(length(x) / 20).

olag

the number of points along the trajectory of the current point that must be exceeded in order for another point in the phase space to be considered a neighbor candidate. This argument is used to help attenuate temporal correlation in the embedding, which can lead to spuriously low dimension estimates. The orbital lag must be positive or zero. Default: 0 (as given in the Usage signature).

tlag

the time delay between coordinates. Default: the decorrelation time of the autocorrelation function.
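The interaction between the n.density and max.neighbors arguments described above can be made concrete with a little arithmetic. The following is a minimal sketch in base R, not code from the fractal package: the helper name density_plan and its argument n.embed (the number of points \(N\) in the embedding) are hypothetical, and the rounding of \(k\) is an assumption made for illustration.

```r
# Hypothetical helper, not part of the fractal package: tabulates, for each
# density p, the neighbor count k and the database size implied by the
# max.neighbors cap described in the Arguments section.
density_plan <- function(n.embed, n.density = 100, max.neighbors = 100) {
  # densities spaced logarithmically from 1/N to 1
  p <- exp(seq(log(1 / n.embed), log(1), length.out = n.density))
  # neighbors needed to reach each density if all N points are kept
  # (rounding is an assumption of this sketch)
  k.wanted <- pmax(1, round(p * n.embed))
  # where k would exceed the cap, hold k at max.neighbors and shrink the
  # searched database to N' = floor(max.neighbors / p)
  capped <- k.wanted > max.neighbors
  k <- ifelse(capped, max.neighbors, k.wanted)
  n.database <- ifelse(capped, floor(max.neighbors / p), n.embed)
  data.frame(p = p, k = k, n.database = n.database, capped = capped)
}

plan <- density_plan(n.embed = 2000, n.density = 5, max.neighbors = 100)
print(plan)
```

In this sketch the low densities use the full embedding, while the high densities keep \(k\) at max.neighbors and trade database size for speed, which is the cost and accuracy trade-off the max.neighbors description refers to.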

Value

an object of class chaoticInvariant.

S3 METHODS

eda.plot

plots an extended data analysis plot, which graphically summarizes the process of obtaining an information dimension estimate. A time history, phase plane embedding, information dimension curves, and the slopes of information dimension curves as a function of scale are plotted.

plot

plots the information dimension curves on a log-log scale. The following options may be used to adjust the plot components:

type

Character string denoting the type of data to be plotted. The "stat" option plots the information dimension curves while the "dstat" option plots a 3-point estimate of the derivatives of the information dimension curves. The "slope" option plots the estimated slope of the information dimension curves as a function of embedding dimension. Default: "stat".

fit

Logical flag. If TRUE, a regression line is overlaid for each curve. Default: TRUE.

grid

Logical flag. If TRUE, a grid is overlaid on the plot. Default: TRUE.

legend

Logical flag. If TRUE, a legend of the estimated slopes as a function of embedding dimension is displayed. Default: TRUE.

...

Additional plot arguments (set internally by the par function).

print

prints a qualitative summary of the results.

Details

The information dimension (\(D_1\)) is one of an infinite number of fractal dimensions of a chaotic system. For generalized fractal dimension estimations, correlation integral moments are determined as an average of the contents of neighborhoods in the phase space of equal radius \(\varepsilon\). Using this approach, the information dimension for a given embedding dimension \(E\) is estimated via \(D_1(E)=\langle \ln(p) \rangle / \ln(\varepsilon)\) in the limit as \(\varepsilon\) approaches zero, where \(\varepsilon\) is the radius of an \(E\)-dimensional hypersphere, \(p\) is the density (also known as the mass fraction), and \(\langle \ln(p) \rangle\) is the average Shannon information needed to specify an arbitrary point in the phase space with accuracy \(\varepsilon\).

Alternatively, the neighborhoods can be constructed with variable radii but with constant density. The scaling behavior of the average radii of these neighborhoods as a function of density is then used to estimate the fractal dimensions. In this function, we use this constant density approach to calculate the statistics for estimating the information dimension.
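The constant density idea can be sketched in a few lines of base R. In the fixed-mass form, the slope of \(\ln(p)\) against the average log radius \(\langle \ln \varepsilon(p) \rangle\) serves as the dimension estimate. The code below is only an illustration under stated assumptions: the helper fixed_mass_d1 is hypothetical, it searches neighbors by brute force rather than with a kd-tree, it uses the \(L_\infty\) metric, and it fits the slope with ordinary least squares (lm) instead of a robust regression.

```r
# Hypothetical helper, not part of the fractal package: brute-force
# fixed-mass estimate of D1 for the rows of a numeric matrix `pts`.
fixed_mass_d1 <- function(pts, k.values, n.reference = 50) {
  pts <- as.matrix(pts)
  n <- nrow(pts)
  stopifnot(length(k.values) >= 2, max(k.values) < n)
  # evenly spaced reference rows
  ref <- unique(round(seq(1, n, length.out = n.reference)))
  # one column per reference point, one row per neighbor count k
  log.radius <- sapply(ref, function(i) {
    # L-infinity distance from point i to every point in the embedding
    d <- apply(abs(sweep(pts, 2, pts[i, ])), 1, max)
    # radius enclosing k neighbors, excluding the reference point itself
    log(sort(d[-i])[k.values])
  })
  mean.log.radius <- rowMeans(log.radius)
  log.p <- log(k.values / n)
  # slope of log(p) against <log(radius)>, fitted by ordinary least squares
  unname(coef(lm(log.p ~ mean.log.radius))[2])
}

# Points on a straight line in the plane: the estimate should be close to 1
t <- seq(0, 1, length.out = 500)
fixed_mass_d1(cbind(t, 2 * t), k.values = c(4, 8, 16, 32, 64))
```

Reference points near the ends of the line see larger radii than interior points, so the estimate falls slightly below 1. The same kind of edge effect is one reason the choice of scaling region matters for real data.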

For single variable time series, the phase space is approximated with a delay embedding and \(D_1(E)\) is thus estimated over statistics gathered for dimensions \(1,\ldots,E\). For chaotic systems, these statistics will "saturate" at a finite embedding dimension, revealing both the (estimated) information dimension and an appropriate embedding dimension for the system. A linear regression scheme should be used to estimate \(D_1(E)\) from the statistics returned by this function.
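To make the delay embedding step concrete, here is a minimal base R sketch. The helper delay_embed is hypothetical and written only for illustration. The package's own embedSeries function (listed under See Also) is the supported way to build embeddings, and its output format may differ from this sketch.

```r
# Hypothetical helper: build a delay embedding matrix whose rows are
# (x[t], x[t + tlag], ..., x[t + (dimension - 1) * tlag]).
delay_embed <- function(x, dimension, tlag = 1) {
  n.rows <- length(x) - (dimension - 1) * tlag
  stopifnot(n.rows >= 1)
  # index matrix: row t, column j holds t + (j - 1) * tlag
  idx <- outer(seq_len(n.rows), (seq_len(dimension) - 1) * tlag, "+")
  matrix(x[idx], nrow = n.rows)
}

emb <- delay_embed(1:10, dimension = 3, tlag = 2)
dim(emb)   # 6 rows, 3 columns
emb[1, ]   # 1 3 5
```

Each additional embedding dimension costs tlag points at the end of the series, so long lags and high dimensions reduce the number of embedded points available for the neighbor statistics.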

References

Peter Grassberger and Itamar Procaccia (1983), Measuring the strangeness of strange attractors, Physica D, 9, 189-208.

Holger Kantz and Thomas Schreiber (1997), Nonlinear Time Series Analysis, Cambridge University Press.

See Also

corrDim, embedSeries, timeLag, chaoticInvariant, lyapunov, poincareMap, spaceTime, findNeighbors, determinism.

Examples

# NOT RUN {
## calculate the information dimension estimates 
## for chaotic beam data using a delay 
## embedding for dimensions 1 through 10 
beam.d1 <- infoDim(beamchaos, dimension=10)

## print a summary of the results 
print(beam.d1)

## plot the information dimension curves without 
## regression lines 
plot(beam.d1, fit=FALSE, legend=FALSE)

## plot an extended data analysis plot 
eda.plot(beam.d1)
# }