BalancedSampling (version 2.0.6)

hlpm2: Hierarchical Local Pivotal Method 2

Description

Selects an initial sample using lpm2(), and then splits this sample into subsamples of given sizes using successive, hierarchical selection with lpm2(). The method is used to select several subsamples such that each subsample, and their combination (i.e. the union of all subsamples), is spatially balanced.

Usage

hlpm2(prob, x, sizes, type = "kdtree2", bucketSize = 50, eps = 1e-12)

Value

A matrix with the population indices (in 1,2,...,N) of the combined sample in the first column, and the associated subsample in the second column.
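
As a rough illustration of how a matrix with this layout can be used, the sketch below splits a combined sample into its subsamples with base R. The matrix s here is a small hand-built stand-in, not actual hlpm2() output, and the variable names are hypothetical.

```r
# Hand-built stand-in for the return value (an assumption for illustration):
# column 1 holds population indices, column 2 holds the subsample label.
s <- cbind(c(3, 8, 15, 21, 42, 57), c(1, 1, 2, 2, 2, 3))

# Population indices of the combined sample.
combined <- s[, 1]

# List of population indices, one element per subsample.
subsamples <- split(s[, 1], s[, 2])

# Realized subsample sizes, which can be compared against `sizes`.
realized_sizes <- as.vector(table(s[, 2]))
```

With real output, the same split() call should give one vector of population indices per entry of sizes.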

Arguments

prob

A vector of length N with inclusion probabilities.

x

An N by p matrix of (standardized) auxiliary variables. Squared Euclidean distance is used in the x space.

sizes

A vector of integers containing the sizes of the subsamples. sum(sizes) = sum(prob) must hold.

type

The method used to find nearest neighbours. Must be one of "kdtree0", "kdtree1", "kdtree2", or "notree".

bucketSize

The maximum size of the terminal nodes in the k-d-trees.

eps

A small value used to determine when an updated probability is close enough to 0.0 or 1.0.
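
One plausible reading of eps, sketched below in base R, is as a tolerance for snapping an updated probability to exactly 0 or 1. The helper snap_prob() is hypothetical and not part of the package, and the package's internal rule may differ in detail.

```r
# Hypothetical helper: treat probabilities within eps of 0 or 1 as decided.
snap_prob <- function(p, eps = 1e-12) {
  p[p < eps] <- 0        # close enough to 0: unit is excluded
  p[p > 1 - eps] <- 1    # close enough to 1: unit is included
  p
}

p <- c(1e-15, 0.3, 1 - 1e-14, 0.7)
snapped <- snap_prob(p)
```

A tolerance of this kind guards against floating-point round-off leaving a unit with a probability such as 0.9999999999999 that would otherwise never count as decided.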

k-d-trees

The "kdtree" types create k-d-trees whose terminal node bucket sizes are set by bucketSize.

  • "kdtree0" creates a k-d-tree using a median split on alternating variables.

  • "kdtree1" creates a k-d-tree using a median split on the largest range.

  • "kdtree2" creates a k-d-tree using a sliding-midpoint split.

  • "notree" does a naive search for the nearest neighbour.
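
To make the "notree" option concrete, here is a minimal base-R sketch of a naive nearest-neighbour search under squared Euclidean distance. The function nearest_neighbour() is a hypothetical helper, not the package's implementation; for simplicity it searches all rows and ignores which units have already been decided.

```r
# Hypothetical helper: index of the row of x closest to row i,
# measured by squared Euclidean distance.
nearest_neighbour <- function(x, i) {
  diffs <- sweep(x, 2, x[i, ])   # subtract row i from every row
  d2 <- rowSums(diffs^2)         # squared Euclidean distances to row i
  d2[i] <- Inf                   # a unit is not its own neighbour
  which.min(d2)                  # first minimum; ties are not randomized
}

x <- rbind(c(0, 0), c(1, 0), c(0.1, 0.1), c(5, 5))
nn_of_first <- nearest_neighbour(x, 1)
```

Each call scans all N rows, which is the cost the k-d-tree types are meant to reduce for large populations.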

Details

The inclusion probabilities prob must sum to an integer n. The subsample sizes must sum to the same integer, i.e. sum(sizes) must equal n.
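
These two conditions can be checked before calling the function. The sketch below is one way to do so in base R; check_hlpm2_inputs() and its tolerance tol are hypothetical and not part of the package.

```r
# Hypothetical pre-check: sum(prob) is (numerically) an integer n,
# and sum(sizes) equals that same n.
check_hlpm2_inputs <- function(prob, sizes, tol = 1e-9) {
  total <- sum(prob)
  n <- round(total)
  prob_ok <- abs(total - n) < tol      # allow floating-point round-off
  sizes_ok <- abs(sum(sizes) - n) < tol
  prob_ok && sizes_ok
}

prob <- rep(100 / 1000, 1000)
ok <- check_hlpm2_inputs(prob, c(10, 20, 30, 40))
bad <- check_hlpm2_inputs(prob, c(10, 20, 30))
```

The tolerance is needed because a sum such as sum(rep(0.1, 1000)) may differ from 100 by round-off, so an exact equality test could fail on valid input.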

References

Friedman, J. H., Bentley, J. L., & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software (TOMS), 3(3), 209-226.

Maneewongvatana, S., & Mount, D. M. (1999, December). It’s okay to be skinny, if your friends are fat. In Center for geometric computing 4th annual workshop on computational geometry (Vol. 2, pp. 1-8).

Grafström, A., Lundström, N.L.P. & Schelin, L. (2012). Spatially balanced sampling through the Pivotal method. Biometrics 68(2), 514-520.

Lisic, J. J., & Cruze, N. B. (2016, June). Local pivotal methods for large surveys. In Proceedings of the Fifth International Conference on Establishment Surveys.

See Also

Other sampling: cube(), lcube(), lpm(), scps()

Examples

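if (FALSE) {
library(BalancedSampling)

set.seed(12345)
N <- 1000                              # population size
n <- 100                               # total sample size
prob <- rep(n / N, N)                  # equal inclusion probabilities
x <- matrix(runif(N * 2), ncol = 2)    # two uniform auxiliary variables
sizes <- c(10, 20, 30, 40)             # subsample sizes; they sum to n
s <- hlpm2(prob, x, sizes)
# Per the Value section, s is a matrix: population indices in column 1,
# subsample in column 2, so index x with s[, 1] only.
plot(x[, 1], x[, 2])
points(x[s[, 1], 1], x[s[, 1], 2], pch = 19)
}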