Computes the halfspace median and its corresponding halfspace depth for a \(p\)-dimensional data set x. Computation is exact for \(p \le 2\) and approximate for \(p > 2\).
hdepthmedian(x, maxdir = NULL)
A list containing:
The coordinates of the halfspace median.
Approximate when \(p>2\).
The halfspace depth of the halfspace median.
Approximate when \(p>2\).
Logical indicating whether dithering has been applied in the exact algorithm.
FALSE indicates that no dithering has been applied.
TRUE indicates that dithering has been applied.
The number of projections used by the approximate algorithm. Because some \(p\)-subsamples may be singular, not all maxdir directions are necessarily evaluated.
Indicates which stopping rule was used by the approximate algorithm.
0 indicates that the maximum number of projections was reached.
1 indicates that no improvement of the location estimate was made after \(10(p+1)\) steps.
If the data lie in a lower-dimensional subspace, the dimension of this subspace.
If the data lie in a lower-dimensional subspace, a direction orthogonal to this subspace.
x: An \(n\) by \(p\) data matrix.
maxdir: The number of projections used in the approximate algorithm. Defaults to \(250p\).
P. Segaert based on Fortran code by P.J. Rousseeuw, I. Ruts and A. Struyf
The halfspace median, or Tukey median, is the multivariate point with the largest halfspace depth with respect to the data x. This point is not always unique. When it is not, the halfspace median is taken to be the center of gravity of the convex set of deepest points.
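To make the notion of halfspace depth concrete, here is a minimal base R sketch for bivariate data. It is not the package's algorithm: the function name `approx_hdepth_2d` and the fixed grid of directions are assumptions made for illustration, and the depth is reported as a count of points rather than a fraction.

```r
# Illustrative only: approximate halfspace depth of `theta` with respect to
# the rows of the two-column matrix `x`, using a fixed grid of directions.
# `approx_hdepth_2d` is a hypothetical helper, not part of any package.
approx_hdepth_2d <- function(x, theta, n_dir = 360) {
  angles <- seq(0, 2 * pi, length.out = n_dir + 1)[-1]
  dirs <- cbind(cos(angles), sin(angles))  # one unit vector per row
  centered <- sweep(x, 2, theta)           # x_i - theta for every row
  proj <- centered %*% t(dirs)             # n x n_dir matrix of projections
  # For each direction u, count the points in the closed halfspace
  # {z : u'(z - theta) >= 0}; the depth is the smallest such count.
  min(colSums(proj >= 0))
}

x <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1), c(0.5, 0.5))
approx_hdepth_2d(x, c(0.5, 0.5))  # central point: depth 3
approx_hdepth_2d(x, c(2, 2))      # point outside the convex hull: depth 0
```

The halfspace median is the point at which this quantity is maximal. A grid of directions can only overestimate the true depth, which is why an exact algorithm is preferable when one is available.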
The function first checks whether the data lie in a subspace of dimension lower than \(p\). If so, it issues a warning and returns the dimension of the subspace together with a direction orthogonal to it, which describes a hyperplane containing this subspace.
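A check of this kind can be sketched in base R with a singular value decomposition of the centered data. This is an assumption about one reasonable way to do it, not a description of the package's routine; the helper `subspace_check`, its tolerance and its output names are hypothetical.

```r
# Illustrative only: detect whether the rows of `x` lie in a
# lower-dimensional affine subspace and, if so, return an orthogonal direction.
# `subspace_check` and the tolerance `tol` are hypothetical choices.
subspace_check <- function(x, tol = 1e-8) {
  centered <- scale(x, center = TRUE, scale = FALSE)
  sv <- svd(centered, nv = ncol(x))
  # Numerical rank: number of singular values that are not negligible
  rank <- sum(sv$d > tol * max(sv$d))
  list(
    dimension = rank,
    # The last right singular vector is orthogonal to the centered data
    # whenever the rank is smaller than the number of columns.
    normal = if (rank < ncol(x)) sv$v[, ncol(x)] else NULL
  )
}

# Three-dimensional points that all satisfy z = x + y
set.seed(1)
xy <- matrix(rnorm(40), ncol = 2)
x3 <- cbind(xy, xy[, 1] + xy[, 2])
res <- subspace_check(x3)
res$dimension  # 2: the points lie in a plane
res$normal     # proportional to (1, 1, -1), up to sign
```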
For bivariate data the exact algorithm of Rousseeuw and Ruts (1998) is applied.
When the data are not in general position (i.e. when there is a line containing more than two observations), dithering is performed by adding random Gaussian noise to the data. In this case the output element dithered is set to TRUE.
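The idea of dithering can be illustrated with a short base R sketch. The helpers `has_collinear_triple` and `dither`, the brute-force search over triples and the noise scale are all assumptions for illustration; the source does not state how the package detects collinearity or how it scales the noise.

```r
# Illustrative only: test bivariate data for general position and dither it.
# Both helpers and the noise scale `sd` are hypothetical.
has_collinear_triple <- function(x, tol = 1e-12) {
  idx <- combn(nrow(x), 3)  # all triples of row indices
  for (k in seq_len(ncol(idx))) {
    p1 <- x[idx[1, k], ]
    p2 <- x[idx[2, k], ]
    p3 <- x[idx[3, k], ]
    # Twice the signed area of the triangle; zero means the points are collinear
    area2 <- (p2[1] - p1[1]) * (p3[2] - p1[2]) -
             (p2[2] - p1[2]) * (p3[1] - p1[1])
    if (abs(area2) <= tol) return(TRUE)
  }
  FALSE
}

dither <- function(x, sd = 1e-6) {
  # Add small independent Gaussian noise to every coordinate
  x + matrix(rnorm(length(x), sd = sd), nrow = nrow(x))
}

set.seed(42)
x <- rbind(c(0, 0), c(1, 1), c(2, 2), c(0, 1))
has_collinear_triple(x)   # TRUE: the first three points lie on one line
xd <- dither(x)
has_collinear_triple(xd)  # FALSE after dithering
```

The triple search costs on the order of \(n^3\) operations, so this sketch is only suitable for small data sets.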
When \(p > 2\) the approximate algorithm of Struyf and Rousseeuw (2000) is applied. It is an iterative procedure based on projections, the number of which can be chosen with the input parameter maxdir.
Rousseeuw P.J., Ruts I. (1998). Constructing the bivariate Tukey median. Statistica Sinica, 8, 827--839.
Struyf A., Rousseeuw P.J. (2000). High-dimensional computation of the deepest location. Computational Statistics & Data Analysis, 34, 415--436.
# Compute a location estimate of a two-dimensional dataset.
data(cardata90)
Result <- hdepthmedian(x = cardata90)
plot(cardata90, pch = 16)
points(Result$median, col = "red", pch = 18, cex = 1.5)