mclust (version 5.4.6)

hc: Model-based Agglomerative Hierarchical Clustering

Description

Agglomerative hierarchical clustering based on maximum likelihood criteria for Gaussian mixture models parameterized by eigenvalue decomposition.

Usage

hc(data,
   modelName = mclust.options("hcModelName"),  
   use = mclust.options("hcUse"), …)

# S3 method for hc
plot(x, …)

# S3 method for hc
as.dendrogram(object, …)

# S3 method for hc
as.hclust(x, …)

Arguments

data

A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations (\(n\)) and columns correspond to variables (\(d\)).

modelName

A character string indicating the model to be used. Possible models are:

"E"

equal variance (one-dimensional)

"V"

spherical, variable variance (one-dimensional)

"EII"

spherical, equal volume

"VII"

spherical, unequal volume

"EEE"

ellipsoidal, equal volume, shape, and orientation

"VVV"

ellipsoidal, varying volume, shape, and orientation

By default the model provided by mclust.options("hcModelName") is used. See mclust.options.

use

A string or a vector of character strings specifying the type of input variables/data transformation to be used for model-based hierarchical clustering. By default the method specified in mclust.options("hcUse") is used. See mclust.options.

…

Arguments for the method-specific hc functions. See for example hcE.

object, x

An object of class 'hc' resulting from a call to hc().
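
As a minimal sketch of how modelName and use relate to the package defaults, the following queries the current options and then overrides both arguments in a single call. The value "VARS" for use (original, untransformed variables) is taken from the mclust.options documentation and is assumed to be available in this version; iris is used only as convenient example data.

```r
library(mclust)

# Package-wide defaults that hc() falls back on when the arguments are omitted
mclust.options("hcModelName")
mclust.options("hcUse")

X <- iris[, -5]   # numeric columns only; categorical variables are not allowed

# Rely on the defaults
hcDefault <- hc(X)

# Override both arguments for this call only
# ("VARS" is assumed to be a valid hcUse value, see mclust.options)
hcEII <- hc(X, modelName = "EII", use = "VARS")

class(hcEII)
```

Passing the arguments explicitly leaves the global options unchanged, so later calls to hc() without arguments still use the defaults.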

Value

The function hc() returns a numeric two-column matrix in which the ith row gives the minimum index for observations in each of the two clusters merged at the ith stage of agglomerative hierarchical clustering. Additional information is also returned as attributes.

The plotting function plot.hc() draws a dendrogram by first converting the input object from class 'hc' to class 'dendrogram' and then plotting the converted object using plot.dendrogram.

The functions as.dendrogram.hc() and as.hclust.hc() convert the input object from class 'hc' to class 'dendrogram' and class 'hclust', respectively.
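
The conversions described above can be sketched as follows. This is an illustrative example on the iris data rather than part of the original documentation; the attribute names printed by attributes() depend on the installed version, so no particular output is assumed.

```r
library(mclust)

hcTree <- hc(iris[, -5], modelName = "VVV")

# Names of the additional information stored as attributes
names(attributes(hcTree))

# Convert to the standard tree classes from the stats package
dend <- as.dendrogram(hcTree)
hcl  <- as.hclust(hcTree)
class(dend)
class(hcl)

# plot.hc() performs the conversion to 'dendrogram' itself
plot(hcTree)
```

Once converted, the object can be passed to tools that expect a 'dendrogram' or 'hclust' object.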

Details

Most models have memory usage of the order of the square of the number of groups in the initial partition, which allows fast execution. Some models, such as equal variance or "EEE", do not admit a fast algorithm under the usual agglomerative hierarchical clustering paradigm. These use less memory but are much slower to execute.
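
One rough way to observe this trade-off is to time the same data under a model with a fast algorithm and under "EEE". The sketch below assumes iris as example data; with only 150 observations the difference may be negligible, and the gap is expected to matter mainly for larger inputs.

```r
library(mclust)

X <- iris[, -5]

# "VII" is one of the models with a fast algorithm;
# "EEE" is named above as one of the slower, lower-memory models
tVII <- system.time(hcVII <- hc(X, modelName = "VII"))
tEEE <- system.time(hcEEE <- hc(X, modelName = "EEE"))

# Elapsed wall-clock time in seconds for each model
print(tVII["elapsed"])
print(tEEE["elapsed"])
```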

References

J. D. Banfield and A. E. Raftery (1993). Model-based Gaussian and non-Gaussian Clustering. Biometrics 49:803-821.

C. Fraley (1998). Algorithms for model-based Gaussian hierarchical clustering. SIAM Journal on Scientific Computing 20:270-281.

C. Fraley and A. E. Raftery (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631.

See Also

hcE, ..., hcVVV, hclass, mclust.options

Examples

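library(mclust)

# Build the merge tree under the unconstrained "VVV" model
hcTree <- hc(modelName = "VVV", data = iris[,-5])
# Classifications for the 2- and 3-cluster solutions (one column per G)
cl <- hclass(hcTree, c(2,3))

# Pairwise scatterplots marked by each classification
par(pty = "s", mfrow = c(1,1))
clPairs(iris[,-5], cl = cl[,"2"])
clPairs(iris[,-5], cl = cl[,"3"])

# Side-by-side projections onto the first two variables
par(mfrow = c(1,2))
dimens <- c(1,2)
coordProj(iris[,-5], dimens = dimens, classification = cl[,"2"])
coordProj(iris[,-5], dimens = dimens, classification = cl[,"3"])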