
This function implements hierarchical clustering with the same interface as hclust
from the stats package but with much faster algorithms.
hclust(d, method="complete", members=NULL)
d: a dissimilarity structure as produced by dist.
method: the agglomeration method to be used. This must be (an unambiguous abbreviation of) one of "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid" or "median".
members: NULL or a vector with length the number of observations.
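The abbreviation rule for method can be illustrated with a short sketch. It assumes the fastcluster package is installed and uses the built-in USArrests data; the variable names are illustrative only, and the rejection of the ambiguous abbreviation is the behaviour expected from the argument description rather than a documented guarantee.

```r
d <- dist(USArrests)

# "ave" is an unambiguous abbreviation of "average"
hc_abbrev <- fastcluster::hclust(d, method = "ave")
hc_full   <- fastcluster::hclust(d, method = "average")
same_heights <- isTRUE(all.equal(hc_abbrev$height, hc_full$height))

# "c" could mean "complete" or "centroid", so it is expected to be rejected
ambiguous <- tryCatch(
  fastcluster::hclust(d, method = "c"),
  error = function(e) e
)
is_rejected <- inherits(ambiguous, "error")
```

Spelling out the full method name in scripts avoids relying on abbreviation matching at all.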
An object of class 'hclust'. It encodes a stepwise dendrogram.
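As a rough illustration of what the returned object holds, the sketch below inspects the components described in the stats::hclust documentation (merge, height, order). It assumes fastcluster is installed and uses the built-in USArrests data.

```r
hc <- fastcluster::hclust(dist(USArrests), method = "average")
n  <- nrow(USArrests)

# One row per merge step: n - 1 rows, 2 columns.
# Negative entries refer to single observations, positive entries
# to the cluster formed at that earlier step.
merge_dim <- dim(hc$merge)

# One merge height per step
n_heights <- length(hc$height)

# A permutation of the observations, used to draw the leaves
# without crossing branches
is_permutation <- all(sort(hc$order) == seq_len(n))

head(hc$merge)
head(hc$height)
```

Because the object has class 'hclust', the usual methods such as plot() and cutree() apply to it.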
See the documentation of the original function hclust in the stats package.
A comprehensive User's manual, fastcluster.pdf, is available as a vignette. Get this from the R command line with vignette('fastcluster').
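Since the interface matches stats::hclust, the two implementations can be called side by side. The following is a minimal sketch, assuming fastcluster is installed; it uses average linkage on USArrests and expects agreement only up to floating-point tolerance and in the absence of ties.

```r
d <- dist(USArrests)

hc_fast  <- fastcluster::hclust(d, method = "average")
hc_stats <- stats::hclust(d, method = "average")

# Merge heights should agree up to floating-point tolerance
heights_match <- isTRUE(all.equal(hc_fast$height, hc_stats$height))

# Cutting both trees into 5 clusters should give the same partition
same_partition <- all(cutree(hc_fast, k = 5) == cutree(hc_stats, k = 5))
```

Calling both functions with an explicit namespace prefix avoids depending on which package was attached last.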
# NOT RUN {
# Taken and modified from stats::hclust
#
# hclust(...) # new method
# stats::hclust(...) # old method
require(fastcluster)
require(graphics)
hc <- hclust(dist(USArrests), "ave")
plot(hc)
plot(hc, hang = -1)
## Do the same with centroid clustering and squared Euclidean distance,
## cut the tree into ten clusters and reconstruct the upper part of the
## tree from the cluster centers.
hc <- hclust(dist(USArrests)^2, "cen")
memb <- cutree(hc, k = 10)
cent <- NULL
for (k in 1:10) {
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}
hc1 <- hclust(dist(cent)^2, method = "cen", members = table(memb))
opar <- par(mfrow = c(1, 2))
plot(hc, labels = FALSE, hang = -1, main = "Original Tree")
plot(hc1, labels = FALSE, hang = -1, main = "Re-start from 10 clusters")
par(opar)
# }
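The cluster centers in the example above are built by growing a matrix inside a loop. As an alternative sketch (not part of the original example), the same centers can be computed without a loop by splitting the rows by cluster membership. It assumes fastcluster is installed; cent_loop and cent_split are illustrative names.

```r
hc   <- fastcluster::hclust(dist(USArrests)^2, method = "centroid")
memb <- cutree(hc, k = 10)

# Loop version, as in the example above
cent_loop <- NULL
for (k in 1:10) {
  cent_loop <- rbind(cent_loop, colMeans(USArrests[memb == k, , drop = FALSE]))
}

# Loop-free version: split the rows by cluster, then average each group
cent_split <- t(sapply(split(USArrests, memb), colMeans))

# The two differ only in row names, so compare values alone
centers_agree <- isTRUE(all.equal(cent_loop, cent_split,
                                  check.attributes = FALSE))
```

Either matrix can be passed to dist() and then to hclust() together with members = table(memb) to restart the clustering from the ten centers.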