kmedoids: K-Medoids Clustering

Description

Compute a \(k\)-medoids partition of a dissimilarity object.

Usage

kmedoids(x, k)

Value

An object of class "kmedoids" representing the obtained partition, which is a list with the following components.

cluster: the class ids of the partition.
medoid_ids: the indices of the medoids.
criterion: the value of the criterion function of the partition.

Arguments

x: a dissimilarity object inheriting from class "dist", or a square matrix of pairwise object-to-object dissimilarity values.
k: an integer giving the number of classes to be used in the partition.

Details

Let \(d\) denote the pairwise object-to-object dissimilarity matrix corresponding to x. A \(k\)-medoids partition of x is defined as a partition of the numbers from 1 to \(n\), the number of objects in x, into \(k\) classes \(C_1, \ldots, C_k\) such that the criterion function \(L = \sum_l \min_{j \in C_l} \sum_{i \in C_l} d_{ij}\) is minimized.

This is an NP-hard optimization problem. PAM (Partitioning Around Medoids, see Kaufman & Rousseeuw (1990), Chapter 2) is a very popular heuristic for obtaining optimal \(k\)-medoids partitions, and provided by pam in package cluster.

kmedoids is an exact algorithm based on a binary linear programming formulation of the optimization problem (e.g., Gordon & Vichi (1998), [P4']), using lp from package lpSolve as solver. Depending on available hardware resources (the number of constraints of the program is of the order \(n^2\)), it may not be possible to obtain a solution.

References

L. Kaufman and P. J. Rousseeuw (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

A. D. Gordon and M. Vichi (1998). Partitions of partitions. Journal of Classification, 15, 265--285. tools:::Rd_expr_doi("10.1007/s003579900034").