NCSampling (version 1.0)

Centroids: Calculate centroids

Description

Separates a single stratum of the population file into n clusters and finds the centroid of each cluster, where n is the required sample size for the stratum (nrefs). Not intended to be called directly.

Usage

Centroids(popfile, nrefs, desvars, ctype, imax, nst)

Arguments

popfile

population file - dataframe containing information relating to all plots in the stratum.

nrefs

scalar defining the number of reference plots - required sample size for the stratum.

desvars

character vector containing the names of the design variables.

ctype

clustering type - either k-means ('km') or Ward's D2 ('WD').

imax

maximum number of iterations when calling the k-means clustering procedure.

nst

number of random initial centroid sets when calling the k-means clustering procedure.

Value

centroids

dataframe containing centroids.

cmns

dataframe containing centroid means.

Details

The virtual plots are partitioned so as to minimise the sum of squared distances from plots to their cluster centroids. This is done using a multivariate clustering procedure, either k-means clustering (Hartigan & Wong, 1979) or Ward's D2 clustering (Murtagh & Legendre, 2014), applied to standardized design variables with a Euclidean distance metric.
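The partitioning described above can be illustrated with base R alone. The following is a minimal sketch, not the package's own implementation: the toy data frame, its column names (plot_id, height, density) and the intermediate object names are hypothetical, and mapping imax and nst onto the iter.max and nstart arguments of kmeans is an assumption based on the argument descriptions.

```r
# Hypothetical stand-in for one stratum of a population file
set.seed(1)
popfile <- data.frame(
  plot_id = 1:60,
  height  = c(rnorm(20, 10), rnorm(20, 20), rnorm(20, 30)),
  density = c(rnorm(20, 5),  rnorm(20, 15), rnorm(20, 25))
)
desvars <- c("height", "density")  # design variables
nrefs   <- 3                       # required sample size = number of clusters
imax    <- 200                     # assumed to map to kmeans(iter.max = )
nst     <- 20                      # assumed to map to kmeans(nstart = )

# Standardize the design variables (mean 0, sd 1) before clustering
z <- scale(popfile[, desvars])

# k-means clustering; the default algorithm in stats::kmeans is Hartigan-Wong
km <- kmeans(z, centers = nrefs, iter.max = imax, nstart = nst)
km_centroids <- km$centers         # one row per cluster, standardized units

# Ward's D2 clustering on Euclidean distances, cut into nrefs clusters
hc <- hclust(dist(z, method = "euclidean"), method = "ward.D2")
wd_cluster <- cutree(hc, k = nrefs)

# Centroid of each Ward cluster = mean of the standardized design variables
wd_centroids <- aggregate(as.data.frame(z),
                          by = list(cluster = wd_cluster),
                          FUN = mean)
```

Both approaches return one centroid per cluster in standardized units. In this sketch, k-means reports the centroids directly, whereas for Ward's D2 they are computed afterwards as cluster means.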

References

Hartigan, J. A. & Wong, M. A. (1979) Algorithm AS 136: a K-means clustering algorithm. Applied Statistics, 28, 100-108, DOI: 10.2307/2346830.

Murtagh, F. & Legendre, P. (2014) Ward's hierarchical agglomerative clustering method: which algorithms implement Ward's criterion? Journal of Classification, 31, 274-295, DOI: 10.1007/s00357-014-9161-z.

See Also

Existing, NC.sample and kmeans.

Examples

## Centroids(popfile, nrefs, desvars, ctype='km', imax=200, nst=20) 