factoextra (version 1.0.3)

hkmeans: Hierarchical k-means clustering

Description

The final k-means clustering solution is very sensitive to the initial random selection of cluster centers. This function addresses that problem with a hybrid approach that combines hierarchical clustering and k-means. The procedure is explained in the "Details" section.
  • hkmeans(): computes hierarchical k-means clustering
  • print.hkmeans(): prints the result of hkmeans()
  • hkmeans_tree(): plots the initial dendrogram
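
The sensitivity to starting centers can be checked with a small experiment. The sketch below is an illustration only: it uses plain kmeans() from base R (not hkmeans()), with a single random start per run, and the seeds 1 to 20 and k = 4 are arbitrary choices made here. How much the results vary depends on the data and may be small.

```r
# Illustration only: plain kmeans() from base R, one random start per run.
# The seeds (1:20) and k = 4 are arbitrary assumptions for this sketch.
df <- scale(USArrests)

# Total within-cluster sum of squares obtained from 20 different random starts
tot_wss <- sapply(1:20, function(seed) {
  set.seed(seed)
  kmeans(df, centers = 4, nstart = 1)$tot.withinss
})

# A spread between min and max means different starts reached different solutions
range(tot_wss)
```

If the two values returned by range() differ, at least some random starts converged to different solutions.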

Usage

hkmeans(x, k, hc.metric = "euclidean", hc.method = "ward.D2", iter.max = 10, km.algorithm = "Hartigan-Wong")
## S3 method for class 'hkmeans'
print(x, ...)
hkmeans_tree(hkmeans, rect.col = NULL, ...)

Arguments

x
a numeric matrix, data frame or vector
k
the number of clusters to be generated
hc.metric
the distance measure to be used. Possible values are "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski" (see ?dist).
hc.method
the agglomeration method to be used. Possible values include "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid" (see ?hclust).
iter.max
the maximum number of iterations allowed for k-means.
km.algorithm
the algorithm to be used for kmeans (see ?kmeans).
...
other arguments to be passed to the function plot.hclust() (see ?plot.hclust)
hkmeans
an object of class hkmeans (returned by the function hkmeans())
rect.col
a vector of border colors for the rectangles drawn around the clusters in the dendrogram

Value

hkmeans returns an object of class "hkmeans" containing the following components:
  • The elements returned by the standard function kmeans() (see ?kmeans)
  • data: the data used for the analysis
  • hclust: an object of class "hclust" generated by the function hclust()

Details

The procedure is as follows:
  1. Compute hierarchical clustering.
  2. Cut the tree into k clusters.
  3. Compute the center (i.e. the mean) of each cluster.
  4. Run k-means, using the cluster centers from step 3 as the initial cluster centers.
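
The four steps can be sketched with base R functions alone. This is a minimal illustration of the procedure as described, not necessarily how hkmeans() is implemented internally. The variable names (hc, grp, centers, km) are chosen here for the example, and the settings mirror the defaults shown under "Usage".

```r
# A sketch of the four steps with base R only; names are illustrative.
df <- scale(USArrests)
k <- 4

# 1. Compute hierarchical clustering on the distance matrix
hc <- hclust(dist(df, method = "euclidean"), method = "ward.D2")

# 2. Cut the tree into k clusters (integer group label per observation)
grp <- cutree(hc, k = k)

# 3. Compute the center (column means) of each cluster: a k x p matrix
centers <- apply(df, 2, function(col) tapply(col, grp, mean))

# 4. Run k-means, starting from the centers computed in step 3
km <- kmeans(df, centers = centers, iter.max = 10,
             algorithm = "Hartigan-Wong")

# Cluster sizes of the final solution
km$size
```

Because the initial centers come from the dendrogram rather than from a random draw, repeated runs of this sketch give the same result.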

Examples

# Load the package
library(factoextra)

# Load data
data(USArrests)
# Scale the data
df <- scale(USArrests)

# Compute hierarchical k-means clustering
res.hk <- hkmeans(df, 4)

# Elements returned by hkmeans()
names(res.hk)

# Print the results
res.hk

# Visualize the tree
hkmeans_tree(res.hk, cex = 0.6)
# or use this
fviz_dend(res.hk, cex = 0.6)


# Visualize the hkmeans final clusters
fviz_cluster(res.hk, frame.type = "norm", frame.level = 0.68)