cepp (version 1.7)

CvM: Projection Pursuit Indices based on the bivariate empirical distribution function.

Description

Computes the projection pursuit indices described in Perisic and Posse (2005), which are based on the bivariate empirical distribution function of the projected data.

Usage

ecdf.indices(A, sphered = FALSE)

Arguments

A
The projected data.
sphered
Logical. Whether the data have already been sphered. If FALSE (the default), the function spheres the data before computing the indices.
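Sphering usually means centering the data and rescaling it so that its sample covariance matrix is the identity. The sketch below shows one common way to do this, using the symmetric inverse square root of the covariance matrix. It is an illustration only: `sphere_data` is a hypothetical helper, and the transform the package applies internally may differ.

```r
# Hypothetical helper (not part of cepp): center the columns and whiten
# them so that the sample covariance becomes the identity matrix.
sphere_data <- function(A) {
  A <- as.matrix(A)
  centered <- sweep(A, 2, colMeans(A))            # subtract column means
  eig <- eigen(cov(centered), symmetric = TRUE)   # eigendecomposition of the covariance
  # Symmetric inverse square root of the covariance matrix
  whitener <- eig$vectors %*%
    diag(1 / sqrt(eig$values), nrow = length(eig$values)) %*%
    t(eig$vectors)
  centered %*% whitener
}

set.seed(1)
A <- cbind(rnorm(200), rnorm(200))
A[, 2] <- A[, 2] + 0.8 * A[, 1]   # make the two columns correlated
Z <- sphere_data(A)

colMeans(Z)   # numerically zero
cov(Z)        # numerically the 2 x 2 identity matrix
```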

Value

  • A named numeric vector with the values of the following indices: the Cramér-von Mises index, the Kolmogorov-Smirnov index, the D2 symmetry index, and the D-infinity symmetry index.

Details

The two-dimensional empirical distribution function is defined as $$F_n(x, y) = \frac{1}{n} \#\{(x_j, y_j) : x_j \leq x \text{ and } y_j \leq y\}$$
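As a small worked illustration of this definition, the sketch below evaluates $F_n$ at each observed point by direct counting. The function name `bivariate_ecdf` is hypothetical and is not part of the package.

```r
# Hypothetical helper: evaluate the bivariate empirical distribution
# function F_n at the points (at_x[i], at_y[i]) by direct counting.
bivariate_ecdf <- function(x, y, at_x = x, at_y = y) {
  n <- length(x)
  vapply(seq_along(at_x), function(i) {
    # number of observations with x_j <= at_x[i] and y_j <= at_y[i], divided by n
    sum(x <= at_x[i] & y <= at_y[i]) / n
  }, numeric(1))
}

x <- c(1, 2, 3)
y <- c(2, 1, 3)
Fn <- bivariate_ecdf(x, y)
Fn   # 1/3, 1/3, 1
```

Only the point (3, 3) dominates all three observations in both coordinates, so it is the only one with $F_n = 1$. This direct count takes on the order of $n^2$ comparisons, which is fine for illustration but may be slow for large samples.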

Perisic and Posse (2005) use this function to construct the following four indices.

Cramér-von Mises: $$\sum_i (F_n(x_i, y_i) - \Phi(x_i)\Phi(y_i))^2$$

Kolmogorov-Smirnov: $$\max_i |F_n(x_i, y_i) - \Phi(x_i)\Phi(y_i)|$$

D2: $$\sum_i (F_n(x_i, y_i) - F_n(y_i, x_i))^2$$

D-infinity: $$\max_i |F_n(x_i, y_i) - F_n(y_i, x_i)|$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.
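The four formulas can be transcribed almost literally into R. The following is a minimal sketch, not the package's implementation: it assumes `A` is a two-column matrix that has already been sphered, and both the function name `ecdf_indices_sketch` and the element names of the result are hypothetical.

```r
# Hypothetical sketch of the four indices for an n x 2 matrix A that is
# assumed to be sphered already. Result names are illustrative only.
ecdf_indices_sketch <- function(A) {
  x <- A[, 1]
  y <- A[, 2]
  # Bivariate empirical distribution function evaluated at (u[i], v[i])
  Fn <- function(u, v) {
    vapply(seq_along(u), function(i) mean(x <= u[i] & y <= v[i]), numeric(1))
  }
  F_xy <- Fn(x, y)                 # F_n(x_i, y_i)
  F_yx <- Fn(y, x)                 # F_n(y_i, x_i), coordinates swapped
  normal <- pnorm(x) * pnorm(y)    # Phi(x_i) * Phi(y_i)
  c(CvM  = sum((F_xy - normal)^2),
    KS   = max(abs(F_xy - normal)),
    D2   = sum((F_xy - F_yx)^2),
    Dinf = max(abs(F_xy - F_yx)))
}

set.seed(42)
A <- cbind(rnorm(100), rnorm(100))
idx <- ecdf_indices_sketch(A)
idx
```

The first two indices compare the empirical distribution with the product of standard normal distribution functions, while D2 and D-infinity compare it with its own reflection across the line $y = x$, so they vanish when the two coordinates are identical.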

When using any of these indices, the original authors recommended rotating the data projection several times to obtain rotational invariance. In simulations, the indices performed well even without rotations.
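One way to apply such rotations is sketched below: the two-column projection is rotated in the plane by several angles and the index values are combined. The choice of equally spaced angles in $[0, \pi/2)$ and the use of the mean to combine them are assumptions made for illustration, not the scheme prescribed by the original authors. The helpers `rotate_2d`, `ks_index`, and `rotated_index` are hypothetical.

```r
# Rotate the rows of an n x 2 matrix by the angle theta (in radians).
rotate_2d <- function(A, theta) {
  R <- matrix(c(cos(theta), -sin(theta),
                sin(theta),  cos(theta)), nrow = 2)
  A %*% R
}

# Illustrative Kolmogorov-Smirnov type index for sphered bivariate data.
ks_index <- function(A) {
  x <- A[, 1]
  y <- A[, 2]
  F_xy <- vapply(seq_len(nrow(A)),
                 function(i) mean(x <= x[i] & y <= y[i]), numeric(1))
  max(abs(F_xy - pnorm(x) * pnorm(y)))
}

# Assumption: combine the index over equally spaced rotations by averaging.
rotated_index <- function(A, index_fun, n_rotations = 4) {
  angles <- seq(0, pi / 2, length.out = n_rotations + 1)[seq_len(n_rotations)]
  mean(vapply(angles, function(theta) index_fun(rotate_2d(A, theta)),
              numeric(1)))
}

set.seed(7)
A <- cbind(rnorm(100), rnorm(100))
plain   <- ks_index(A)
rotated <- rotated_index(A, ks_index, n_rotations = 4)
c(plain = plain, rotated = rotated)
```

Because a rotation is orthogonal, it leaves sphered data sphered, so the rotated projections can be passed to the index function without sphering again.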

References

Perisic, Igor, and Christian Posse. "Projection pursuit indices based on the empirical distribution function." Journal of Computational and Graphical Statistics 14.3 (2005).