coef.hclust: Agglomerative / Divisive Coefficient for 'hclust' Objects

Description

Computes the “agglomerative coefficient” (called the “divisive coefficient” for diana objects), which measures the strength of the clustering structure found in the dataset.

For each observation i, denote by $m(i)$ its dissimilarity to the first cluster it is merged with, divided by the dissimilarity of the merger in the final step of the algorithm. The agglomerative coefficient is the average of all $1 - m(i)$. It can also be seen as the average width (or the percentage filled) of the banner plot.
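Written out as a formula, the definition above reads as follows. The symbols $n$ (number of observations), $d(i)$ (dissimilarity at which observation $i$ is first merged), $h_{\max}$ (dissimilarity of the final merger) and $\mathrm{AC}$ are notation introduced here for illustration; they do not appear in the original text.

$$m(i) = \frac{d(i)}{h_{\max}}, \qquad \mathrm{AC} = \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - m(i)\bigr)$$

As a small worked case under these assumptions: if observations 1 and 2 are merged at dissimilarity 1 and observation 3 joins them at dissimilarity 2 (the final step), then $m(1) = m(2) = 1/2$ and $m(3) = 1$, so $\mathrm{AC} = (1/2 + 1/2 + 0)/3 = 1/3$.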

coefHier() directly interfaces to the underlying C code, and “proves” that only object$height is needed to compute the coefficient.

Because it grows with the number of observations, this measure should not be used to compare datasets of very different sizes.

Usage

coefHier(object)
coef.hclust(object, ...)
# S3 method for hclust
coef(object, ...)
# S3 method for twins
coef(object, ...)

Value

a number specifying the agglomerative (or divisive, for diana objects) coefficient as defined by Kaufman and Rousseeuw; see agnes.object$ac or diana.object$dc.

Arguments

object

an object of class "hclust" or "twins", i.e., typically the result of hclust(.), agnes(.), or diana(.).

Since coef.hclust only uses object$height and object$merge, object can be any list-like object with appropriate merge and height components.

For coefHier, only object$height is needed.

...

potential further arguments; currently unused.
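To illustrate why merge and height components are enough for an hclust-style object, here is a minimal sketch in base R that follows the definition in the Description. The helper name ac_from_merge is hypothetical and not part of the cluster package; the sketch assumes heights sorted in increasing order (as hclust returns for monotone methods), so that the largest height belongs to the final merger. It is not the package's own implementation.

```r
# Hypothetical helper (not part of 'cluster'): agglomerative coefficient
# computed from hclust-style 'merge' and 'height' components only.
# Assumes 'height' is non-decreasing, so max(height) is the final merger.
ac_from_merge <- function(merge, height) {
  n <- nrow(merge) + 1L            # number of observations
  first_step <- integer(n)
  for (k in seq_len(nrow(merge))) {
    # A negative entry -i in row k means singleton i is first merged at step k
    singles <- -merge[k, merge[k, ] < 0]
    first_step[singles] <- k
  }
  m <- height[first_step] / max(height)  # m(i) from the definition
  mean(1 - m)                            # average of 1 - m(i)
}

# Usage with a base-R clustering of a built-in dataset
hc <- hclust(dist(USArrests), method = "average")
ac <- ac_from_merge(hc$merge, hc$height)

# A plain list with the two components works just as well
toy <- list(merge = rbind(c(-1, -2), c(-3, 1)), height = c(1, 2))
ac_toy <- ac_from_merge(toy$merge, toy$height)
```

The toy list encodes three observations: 1 and 2 merge at height 1, then 3 joins at height 2, so the helper averages 1/2, 1/2 and 0.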

Examples

library(cluster)  # provides agnes(), coefHier() and the 'agriculture' data
data(agriculture)
aa <- agnes(agriculture)
coef(aa)             # really just extracts aa$ac
coef(as.hclust(aa))  # recomputes
coefHier(aa)         # ditto
# Consistency checks (not shown in the rendered help page)
stopifnot(all.equal(coef(aa), coefHier(aa)))
d.a <- dist(agriculture, "manhattan")
for (m in c("average", "single", "complete"))
    stopifnot(all.equal(coef(hclust(d.a, method = m)),
                        coef(agnes (d.a, method = m)), tolerance = 1e-13))