mkmeans:

Description

K-means variant that uses a class-wise Mahalanobis metric. The implementation loosely follows Lloyd's algorithm, with a class-wise covariance computation step performed after the centre update step.
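To make the role of the class-wise metric concrete, here is a minimal sketch of what an assignment step could look like under this scheme. It is not taken from the package source: assign_labels, means and covs are hypothetical names, and only stats::mahalanobis() and max.col() from base R are used.

```r
# Hypothetical helper, not part of the package: assign each row of `dat`
# to the cluster with the smallest squared Mahalanobis distance, where
# every cluster j has its own mean means[[j]] and covariance covs[[j]].
assign_labels <- function(dat, means, covs) {
  dat <- as.matrix(dat)
  # One vector of squared distances per cluster.
  d2 <- vapply(
    seq_along(means),
    function(j) mahalanobis(dat, means[[j]], covs[[j]]),
    numeric(nrow(dat))
  )
  # Force an n x k matrix, also when dat has a single row.
  d2 <- matrix(d2, nrow = nrow(dat))
  # Index of the smallest distance in each row.
  max.col(-d2, ties.method = "first")
}

dat <- rbind(c(0, 0), c(0.1, 0), c(5, 5), c(5.1, 5))
means <- list(c(0, 0), c(5, 5))
covs <- list(diag(2), diag(2))
labels <- assign_labels(dat, means, covs)
```

With identity covariances, as in this toy call, the assignment reduces to the usual Euclidean rule of standard k-means.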

Usage

mkmeans(dat, k, maxiter = 100, seeds = NULL, prior = 1)

Arguments

dat

Matrix with n rows and d columns, holding one d-dimensional data element to cluster per row.

k

Number of clusters in the output.

maxiter

Maximum number of iterations.

seeds

Optional indices of the input data elements to use as initial centres. If NULL, the initial centres are sampled uniformly from the input data.

prior

Prior population size used for regularizing components.

Value

labels

Cluster labels taking values in 1..k

Numeric vector of cluster weights

mean

List of mean vectors

cov

List of covariance matrices

Details

K-means is characterized by the use of the identity as the metric. To remain close to this in spirit, each class-wise covariance matrix is normalized after computation so that its trace equals d. This avoids excessively unbalanced classes. It also simplifies the case where a cluster contains fewer than 2 elements: the covariance cannot be computed in this case, so it defaults to the identity. In addition, to prevent degeneracies when 2 < cluster size < d, a regularization term proportional to sample data features is added to the covariance diagonal. The returned value follows the GMM data structure (i.e., as returned by, e.g., varbayes() and newGmm()).
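The normalization described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: normalized_cov and eps are hypothetical names, and the regularization is assumed here to be a small constant added to the diagonal, which may differ from the term actually used.

```r
# Hypothetical helper, not part of the package: trace-normalized
# covariance of the rows of `x` assigned to one cluster.
normalized_cov <- function(x, eps = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  # Fewer than 2 elements: covariance is undefined, default to identity.
  if (n < 2) return(diag(d))
  s <- cov(x)
  # Assumed form of the regularization: a small ridge on the diagonal
  # when the cluster holds fewer elements than dimensions.
  if (n < d) s <- s + eps * diag(d)
  tr <- sum(diag(s))
  # Zero trace (all elements identical): default to identity as well.
  if (tr <= 0) return(diag(d))
  # Rescale so that the trace equals d, as for the identity matrix.
  s * d / tr
}

x <- as.matrix(iris[, 1:4])
s <- normalized_cov(x)
trace_s <- sum(diag(s))
```

After rescaling, every cluster covariance has the same trace as the d-dimensional identity, so no cluster can dominate the assignment simply by having a larger overall scale.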

Examples

	mod <- mkmeans(irisdata, 3)