MVCH(data, ps=0.75, pf=0.2, k=1000, a.poi=2, del.poi=1)
ps: default 0.75. See Details for more information.
pf: default 0.2. See Details for more information.
k: default 1000. See Details for more information.
a.poi: default 2. See Details for more information.
del.poi: default 1. See Details for more information.
The algorithm iteratively determines a sequence of subsets of decreasing size, each with minimum convex hull volume (i.e. minimum volume subsets), until a threshold size is reached. In the first iteration a minimum volume subset of size $n_1=\lfloor n \cdot \mathrm{ps} \rfloor$ is sought. In the second iteration, a subset of size $n_2=\lfloor n_1 \cdot \mathrm{ps} \rfloor$ is determined from the subset found in iteration 1. The procedure continues until the threshold $\lceil n \cdot \mathrm{pf} \rceil$ is reached, where n is the number of observations in data. The mode is calculated as the arithmetic mean of the observations in the final subset. Hence, the combination of ps and pf determines the running time and robustness of the procedure. Highest robustness (in terms of maximum breakdown point) is achieved for $\mathrm{ps}=\lfloor (n+d+1)/2 \rfloor$, where d is the number of variables (columns) in data. Small values of pf guarantee an accurate mode estimate even for asymmetric data sets, but they increase running time.
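The shrinking schedule described above can be traced with a few lines of R. This is only a sketch of the subset sizes, not of the subset search itself: `subset_sizes` is a hypothetical helper that is not part of the package, and clamping the last step at the threshold is an assumption about how the procedure terminates.

```r
# Hypothetical helper (not part of the package): lists the subset sizes
# visited when shrinking by the fraction ps until ceiling(n * pf) is reached.
subset_sizes <- function(n, ps = 0.75, pf = 0.2) {
  stopifnot(n >= 1, ps > 0, ps < 1, pf > 0, pf <= 1)
  threshold <- ceiling(n * pf)   # size of the final subset
  sizes <- numeric(0)
  current <- n
  while (current > threshold) {
    # Assumption: the last step is clamped so it never drops below the threshold
    current <- max(floor(current * ps), threshold)
    sizes <- c(sizes, current)
  }
  sizes
}

subset_sizes(100)             # 75 56 42 31 23 20
subset_sizes(100, ps = 0.25)  # 25 20
```

With n = 100 the default ps = 0.75 needs six iterations, while ps = 0.25 needs two, which matches the "slower" and "quicker" labels used in the examples.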
To find a minimum volume subset, in each iteration in.subs atomic subsets (each consisting of d+1 observations) are constructed. Each of these atomic subsets is iteratively expanded by adding the a.poi closest points and deleting del.poi points. All three values determine the accuracy of the subset identification (and, hence, of the estimate) as well as the running time of the algorithm. Small values of in.subs reduce running time. Choosing similar values for a.poi and del.poi increases running time and algorithm accuracy.
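One way to see why similar values of a.poi and del.poi lengthen the run is to count the add/delete rounds. The sketch below assumes that every round adds a.poi points and then removes del.poi points, so a subset grows by a.poi - del.poi per round. `expansion_rounds` and its argument `m` (the target subset size) are hypothetical names used only for illustration.

```r
# Hypothetical helper (not part of the package): number of add/delete rounds
# needed to grow one atomic subset of d + 1 points to a target size m.
expansion_rounds <- function(m, d, a.poi = 2, del.poi = 1) {
  net <- a.poi - del.poi              # assumed net growth per round
  stopifnot(net > 0, m >= d + 1)
  ceiling((m - (d + 1)) / net)
}

expansion_rounds(m = 75, d = 2)                          # 72
expansion_rounds(m = 75, d = 2, a.poi = 6, del.poi = 1)  # 15
expansion_rounds(m = 75, d = 2, a.poi = 6, del.poi = 5)  # 72
```

Under this assumption, a.poi = 6 with del.poi = 5 needs as many rounds as the defaults, while each round handles more points.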
For more details on the algorithm see the reference.
# maximum breakdown point estimation
# MVCH(halle, ps = floor((nrow(halle) + ncol(halle) + 1)/2), pf = 0.05)
# slower estimation
# MVCH(halle, ps = 0.75, pf = 0.05)
# quicker estimation
# MVCH(halle, ps = 0.25, pf = 0.05)