
restlos (version 0.2-2)

pMST: The pMST Algorithm

Description

The function determines a robust subsample by pruning the minimum spanning tree of the data and computes estimates of location and scatter on that subsample.

Usage

pMST(data, N = floor((nrow(data) + ncol(data) + 1)/2), lmax = nrow(data) * 100)

Arguments

data
Data set to be analyzed: a matrix with at least two columns whose number of rows (i.e. observations n) is greater than the number of columns (i.e. dimension d).
N
Size of the (robust) subsample to be determined. Default is floor((n+d+1)/2).
lmax
Numerical option: determines the maximal number of pruning steps, see Details.
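
The defaults for N and lmax follow directly from the Usage line. As a small worked example (the toy matrix x is hypothetical and not part of the package), for n = 100 observations in d = 2 dimensions they evaluate as follows:

```r
# Hypothetical toy data: n = 100 observations, d = 2 dimensions
x <- matrix(rnorm(200), ncol = 2)
n <- nrow(x)
d <- ncol(x)

# Defaults as written in the Usage line
N_default    <- floor((n + d + 1) / 2)  # floor(103 / 2) = 51
lmax_default <- n * 100                 # 100 * 100 = 10000
```

Roughly half of the observations end up in the default subsample, and the cap on pruning steps grows linearly with n.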

Value

loc
Location estimate based on the robust subsample.
cov
Covariance estimate based on the robust subsample.
sample
Indices of the observations in the robust subsample.
data
The input data set.
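
To illustrate the shape of the returned list, the following sketch builds an object with the same four components by hand. It assumes that loc and cov are the ordinary sample mean and sample covariance of the selected rows, which this page does not state explicitly; the index vector idx is hypothetical and stands in for a subsample chosen by pMST.

```r
set.seed(42)
x   <- matrix(rnorm(40), ncol = 2)  # toy data, 20 observations in 2 dimensions
idx <- 1:12                         # hypothetical subsample indices

# Assumption: classical estimates computed on the selected rows only
sub <- x[idx, , drop = FALSE]
est <- list(
  loc    = colMeans(sub),  # location estimate, length d
  cov    = cov(sub),       # scatter estimate, d x d
  sample = idx,            # indices of the subsample
  data   = x               # the input data
)
```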

Details

The function uses the minimum.spanning.tree function from the igraph package to determine the minimum spanning tree (MST) of the data. The MST is then pruned iteratively by deleting edges, starting with the longest edge, until a connected subset of sufficient size (N) remains. Location and scatter are then estimated on this robust subsample.
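
The pruning idea described above can be sketched in a few lines of R. This is a simplified illustration, not the package's implementation: the function name prune_mst_sketch and its stopping rule (stop just before the largest connected component would fall below N) are assumptions, and it uses mst(), the current igraph name for minimum.spanning.tree.

```r
library(igraph)

# Hypothetical helper: returns the row indices of the pruned subsample
prune_mst_sketch <- function(x, N = floor((nrow(x) + ncol(x) + 1) / 2),
                             lmax = nrow(x) * 100) {
  # Complete graph weighted by Euclidean distances, then its MST
  g <- graph_from_adjacency_matrix(as.matrix(dist(x)), mode = "undirected",
                                   weighted = TRUE)
  tree <- mst(g)
  V(tree)$row <- seq_len(nrow(x))  # remember the original row numbers

  for (step in seq_len(lmax)) {
    if (ecount(tree) == 0) break
    # Tentatively delete the longest remaining edge
    candidate <- delete_edges(tree, which.max(E(tree)$weight))
    comp <- components(candidate)
    # Stop before the largest connected part drops below N
    if (max(comp$csize) < N) break
    # Otherwise keep only the largest connected part and continue
    largest <- which(comp$membership == which.max(comp$csize))
    tree <- induced_subgraph(candidate, largest)
  }
  sort(V(tree)$row)
}

# Toy data: 50 points around the origin plus 5 far-away outliers
set.seed(1)
x <- rbind(matrix(rnorm(100), ncol = 2),
           matrix(rnorm(10, mean = 20), ncol = 2))
idx <- prune_mst_sketch(x)
```

In this toy example the longest MST edge is the bridge to the outlier cluster, so the first pruning step already separates the five outliers from the bulk of the data.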

References

Kirschstein, T., Liebscher, S., and Becker, C. (2013): Robust estimation of location and scatter by pruning the minimum spanning tree, Journal of Multivariate Analysis, 120, 173-184, DOI: 10.1016/j.jmva.2013.05.004.

Liebscher, S. and Kirschstein, T. (2015): Efficiency of the pMST and RDELA Location and Scatter Estimators, AStA Advances in Statistical Analysis, 99(1), 63-82, DOI: 10.1007/s10182-014-0231-7.

Examples

library(restlos)
data(halle)  # assumption: the halle example data set ships with restlos
# Determine subsample of minimal size
sub <- pMST(halle)
# Determine subsample of size 900
extsub <- pMST(halle, N = 900)
