
fractal (version 2.0-4)

KDE: Nonparametric multidimensional probability density function estimation

Description

Given a training matrix, this function estimates a multidimensional probability density function using the Epanechnikov kernel as a smoother. The density function is estimated at a specified and arbitrary set of points, i.e., at points not necessarily members of the training set.
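
To make the idea concrete, here is a minimal base-R sketch of a fixed-bandwidth Epanechnikov density estimate evaluated at arbitrary points. It is not the package's implementation: the helper names `epanechnikov` and `kde_sketch` are hypothetical, and the use of a product of one-dimensional kernels is an assumption, since the form of the multidimensional kernel is not stated here.

```r
# Univariate Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0.
epanechnikov <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Hypothetical helper (not part of fractal): product-kernel density estimate
# with a single fixed bandwidth h.
#   train: n x d matrix of training points (one point per row)
#   at:    m x d matrix of points at which to evaluate the density
kde_sketch <- function(train, at, h) {
  train <- as.matrix(train)
  at    <- as.matrix(at)
  n <- nrow(train)
  d <- ncol(train)
  vapply(seq_len(nrow(at)), function(i) {
    # scaled offsets between evaluation point i and every training point
    u <- sweep(train, 2, at[i, ], "-") / h
    # product of per-dimension kernel values, averaged over training points
    sum(apply(epanechnikov(u), 1, prod)) / (n * h^d)
  }, numeric(1))
}

set.seed(1)
train <- matrix(rnorm(400), ncol = 2)   # 200 training points in 2-D
at    <- rbind(c(0, 0), c(10, 10))      # points need not be in the training set
dens  <- kde_sketch(train, at, h = 0.8)
```

Because the Epanechnikov kernel has compact support, the estimate is exactly zero at any evaluation point farther than `h` (in some coordinate) from every training point, as with the second row of `at` above.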

Usage

KDE(x, at=NULL, n.grid=100)

Arguments

x

a matrix whose columns contain the coordinates for each dimension. Each row represents the location of a single point in a multidimensional embedding.

at

the locations of the points over which the KDE is to be calculated. Default: a multidimensional uniform grid of points spanning the training data space (defined by x).

n.grid

the number of divisions per dimension to use in forming the default grid when the at input is unspecified. Default: 100.
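
As an illustration of what a default grid of this kind looks like, the following base-R sketch builds a uniform grid spanning the range of each column of the training matrix. The helper `grid_sketch` is hypothetical, and it assumes that n.grid is the number of grid points per dimension; the package may define the divisions differently.

```r
# Hypothetical helper (not part of fractal): uniform grid spanning the
# range of each column of x, with n.grid points along each dimension.
grid_sketch <- function(x, n.grid = 100) {
  x <- as.matrix(x)
  axes <- lapply(seq_len(ncol(x)), function(j) {
    seq(min(x[, j]), max(x[, j]), length.out = n.grid)
  })
  # every combination of the per-dimension axis values, one point per row
  as.matrix(expand.grid(axes))
}

x <- cbind(a = c(0, 1, 2), b = c(10, 30, 20))
g <- grid_sketch(x, n.grid = 5)
dim(g)   # 25 rows (5^2 grid points), 2 columns
```

A grid of this form has n.grid^d points for d-dimensional data, so its size grows quickly with the number of columns in x.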

Value

an object of class KDE.

S3 METHODS

eda.plot

extended data analysis plot showing the original data along with a perspective and contour plot of the resulting KDE. In the case that the primary input x is a single variable (a time series), only the KDE is plotted.

plot

plot the KDE or original (training) data. Options are:

style

a character string denoting the type of plot to produce. Choices are "original", "perspective", and "contour" for plotting the original training data, a perspective plot of the KDE, or a contour plot of the KDE over the specified dimensions. In the case that the primary input x is a single variable (a time series), a KDE is plotted. Default: "original".

dimensions

a two-element integer vector denoting the dimensions/variables/columns to select from the training data and resulting multidimensional KDE for perspective and contour plotting. In the case that the primary input x is a single variable (a time series), this parameter is automatically set to unity and a KDE is plotted. Default: 1:2 for multivariate training data, 1 for univariate training data.

xlab

character string defining the x-axis label. Default: dimnames of the specified dimensions of the training data. If missing, "X" is used. For univariate training data, the x-axis label is set to the name of the original time series.

ylab

character string defining the y-axis label. Default: dimnames of the specified dimensions of the training data. If missing, "Y" is used. For univariate training data, the y-axis label is set to "KDE".

zlab

character string defining the z-axis label for perspective plots. Default: "KDE".

grid

a logical flag. If TRUE, a grid is plotted for the "original" style plot. Default: FALSE.

...

Optional arguments to be passed directly to the specified plotting routine.

print

a summary of the KDE object is printed. Available options are:

justify

text justification à la prettyPrintList. Default: "left".

sep

header separator à la prettyPrintList. Default: ":".

...

Additional print arguments sent directly to the prettyPrintList function.

Details

The kernel bandwidth is constant (non-adaptive) and is determined by first computing the minimum variance of all dimensions (columns) of x. This minimum variance is then used in Scott's Rule to compute the final bandwidth.
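
The bandwidth selection described above can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's code: the helper `scott_bandwidth_sketch` is hypothetical, and it uses one common form of Scott's Rule, h = sigma * n^(-1/(d+4)), with sigma taken as the square root of the minimum column variance. The exact constant and scaling used by the package are not stated here.

```r
# Hypothetical helper (not part of fractal): a single, non-adaptive
# bandwidth from the smallest column variance via Scott's Rule.
scott_bandwidth_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)                               # number of training points
  d <- ncol(x)                               # number of dimensions
  sigma_min <- sqrt(min(apply(x, 2, var)))   # std. dev. of the tightest dimension
  sigma_min * n^(-1 / (d + 4))               # assumed form of Scott's Rule
}

x <- cbind(c(1, 2, 3, 4), c(10, 20, 30, 40))
h <- scott_bandwidth_sketch(x)
```

Basing the bandwidth on the dimension with the smallest variance keeps the kernel narrow enough for the most tightly distributed coordinate, though it may undersmooth along the more widely spread ones.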

This function is primarily used for estimating the mutual information of a time series and is included here for illustrative purposes.

See Also

timeLag.

Examples

library(fractal)  # provides KDE(), eda.plot() and the beamchaos series

## create a mixture of 2-D Gaussian distributed
## RVs with different means, standard
## deviations, point density, and orientation.
## (rmvnorm() with mean/sd/rho arguments is assumed
## to be the splus2R version)
n.sample <- c(1000, 500, 300)
ind      <- rep(1:3, n.sample)
x <- rmvnorm(sum(n.sample),
    mean = rbind(c(-10,-20), c(10,0), c(0,0))[ ind, ],
    sd   = rbind(c(5,3), c(1,3) , c(0.3,1))[ ind, ],
    rho  = c(0.5, 1, -0.4)[ind])

## perform the KDE
z <- KDE(x)
print(z)

## plot a summary of the results
eda.plot(z)

## form KDE of beamchaos series
plot(KDE(beamchaos), type="l")
