The \(k\)-modes algorithm (Huang, 1997) is an extension of the \(k\)-means algorithm by MacQueen (1967).
The data given by data is clustered by the \(k\)-modes method (Huang, 1997),
which aims to partition the objects into \(k\) groups such that the
distance from objects to the assigned cluster modes is minimized.
By default, the simple-matching distance is used to determine the dissimilarity of two objects. It is computed by counting the number of variables in which the two objects differ.
Alternatively, this distance is weighted by the frequencies of the categories in data (see Huang, 1997, for details).
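The default simple-matching distance can be illustrated with a small, language-neutral sketch in Python. This is not the package's implementation; the function name simple_matching and the example objects are hypothetical and chosen only for illustration. The frequency-weighted variant is not covered here.

```python
def simple_matching(x, y):
    """Count the variables in which two categorical objects differ."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of variables")
    return sum(a != b for a, b in zip(x, y))


# Two hypothetical objects described by three categorical variables.
a = ("red", "small", "round")
b = ("red", "large", "square")

# The objects agree on the first variable and differ on the other two.
print(simple_matching(a, b))  # 2
```

An object compared with itself has distance 0, and the largest possible distance equals the number of variables.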
If an initial matrix of modes is supplied, it is possible that
no object will be closest to one or more modes. In this case, fewer clusters than supplied modes will be returned
and a warning is given.
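How a supplied mode can end up with no objects may be easier to see in a toy example. The following Python sketch assumes the unweighted simple-matching distance and resolves ties in favour of the first mode; the tie rule, the helper names and the data are assumptions made for illustration, not a description of the package internals.

```python
def simple_matching(x, y):
    """Number of variables in which two categorical objects differ."""
    return sum(a != b for a, b in zip(x, y))


def assign(objects, modes):
    """Index of the nearest mode for each object (assumed: ties go to the first mode)."""
    return [
        min(range(len(modes)), key=lambda j: simple_matching(obj, modes[j]))
        for obj in objects
    ]


# Hypothetical data: three objects, three supplied initial modes.
objects = [("a", "x"), ("a", "y"), ("b", "x")]
modes = [("a", "x"), ("b", "x"), ("c", "z")]

labels = assign(objects, modes)
used = sorted(set(labels))

print(labels)  # [0, 0, 1]
print(f"{len(used)} of {len(modes)} modes attracted at least one object")
```

The third mode shares no category with any object, so it is never the closest one and only two clusters remain.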
If called using fast = TRUE, the reassignment of the data to clusters is done for the entire data set before the modes are recomputed. For computational reasons, this option should be chosen unless the data set is of moderate size.
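The batch-style update described for fast = TRUE (reassign all objects first, then recompute the modes) can be sketched as a single iteration in Python. This is a minimal illustration under stated assumptions, not the package's code: the names column_modes and batch_step are hypothetical, the distance is the unweighted simple-matching distance, and a cluster left without members simply keeps its previous mode.

```python
from collections import Counter


def simple_matching(x, y):
    """Number of variables in which two categorical objects differ."""
    return sum(a != b for a, b in zip(x, y))


def column_modes(rows):
    """Most frequent category of each variable among the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def batch_step(objects, modes):
    """One batch iteration: assign every object, then recompute all modes."""
    # Step 1: reassign the entire data set using the current modes.
    labels = [
        min(range(len(modes)), key=lambda j: simple_matching(o, modes[j]))
        for o in objects
    ]
    # Step 2: only now recompute the mode of each cluster.
    new_modes = []
    for j, old_mode in enumerate(modes):
        members = [o for o, lab in zip(objects, labels) if lab == j]
        # Assumption: an empty cluster keeps its old mode.
        new_modes.append(column_modes(members) if members else old_mode)
    return labels, new_modes


# Hypothetical data with two categorical variables and two initial modes.
objects = [("a", "x"), ("a", "y"), ("a", "y"), ("b", "z"), ("b", "z")]
modes = [("a", "x"), ("b", "x")]

labels, new_modes = batch_step(objects, modes)
print(labels)     # [0, 0, 0, 1, 1]
print(new_modes)  # [('a', 'y'), ('b', 'z')]
```

Repeating batch_step until the labels stop changing gives a complete, if simplified, clustering loop.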
For clustering mixed-type data, see kproto.