FKM.gkb: Gustafson, Kessel and Babuska-like fuzzy k-means

Description

Performs the Gustafson, Kessel and Babuska-like fuzzy k-means clustering algorithm.
Unlike standard fuzzy k-means, it can discover non-spherical clusters.
The variant of Babuska et al. improves the computation of the fuzzy covariance matrices in the standard Gustafson and Kessel clustering algorithm.
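
The covariance adjustment described by Babuska et al. (2002) can be sketched in base R as follows. This is only an illustration of the idea, not the code used inside FKM.gkb: the helper name regularize_cov is hypothetical, and it assumes that the arguments gam and mcn play the roles of the weighting parameter and the condition-number bound in that paper, with F0 denoting the covariance matrix of the whole data set.

```r
# Hypothetical helper, not part of fclust: adjust one fuzzy covariance
# matrix Fi along the lines described by Babuska et al. (2002).
regularize_cov <- function(Fi, F0, gam = 0, mcn = 1e15) {
  p <- ncol(Fi)
  # Step 1: shrink Fi towards a scaled identity matrix derived from F0
  Fi <- (1 - gam) * Fi + gam * det(F0)^(1 / p) * diag(p)
  # Step 2: bound the condition number by raising the smallest eigenvalues
  e <- eigen(Fi, symmetric = TRUE)
  lam <- e$values
  lam_max <- max(lam)
  lam[lam < lam_max / mcn] <- lam_max / mcn
  # Rebuild the matrix from the (possibly modified) eigenvalues
  e$vectors %*% diag(lam, nrow = p) %*% t(e$vectors)
}

# A nearly singular covariance matrix: eigenvalues 4 and 1e-8
Fi <- matrix(c(4, 0, 0, 1e-8), nrow = 2)
F0 <- diag(2)
R <- regularize_cov(Fi, F0, gam = 0, mcn = 100)
ev <- eigen(R, symmetric = TRUE)$values
max(ev) / min(ev)   # ratio of largest to smallest eigenvalue after clipping
```

With gam = 0 only the eigenvalue clipping acts, so the smallest eigenvalue is raised until the ratio of largest to smallest eigenvalue no longer exceeds mcn.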

Usage

FKM.gkb(X, k, m, vp, gam, mcn, RS, stand, startU, index, alpha, conv, maxit, seed)

Value

Object of class fclust, which is a list with the following components:

U: Membership degree matrix
H: Prototype matrix
F: Array containing the covariance matrices of all the clusters
clus: Matrix containing the indexes of the clusters to which the objects are assigned (column 1) and the associated membership degrees (column 2)
medoid: Vector containing the indexes of the medoid objects (NULL for FKM.gkb)
value: Vector containing the loss function values for the RS starts
criterion: Vector containing the values of clustering index
iter: Vector containing the numbers of iterations for the RS starts
k: Number of clusters
m: Parameter of fuzziness
ent: Degree of fuzzy entropy (NULL for FKM.gkb)
b: Parameter of the polynomial fuzzifier (NULL for FKM.gkb)
vp: Volume parameter
delta: Noise distance (NULL for FKM.gkb)
gam: Weighting parameter for the fuzzy covariance matrices
mcn: Maximum condition number for the fuzzy covariance matrices
stand: Standardization (Yes if stand=1, No if stand=0)
Xca: Data used in the clustering algorithm (standardized data if stand=1)
X: Raw data
D: Dissimilarity matrix (NULL for FKM.gkb)
call: Matched call
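
A few of the components listed above can be inspected as in the following sketch. It assumes that the fclust package is installed (the call is skipped otherwise) and uses the unemployment data from the Examples section; the values of RS and seed are arbitrary choices for illustration.

```r
if (requireNamespace("fclust", quietly = TRUE)) {
  library(fclust)
  data(unemployment)
  clust <- FKM.gkb(unemployment, k = 3, RS = 5, seed = 123)
  print(dim(clust$U))       # membership degree matrix
  print(clust$H)            # cluster prototypes
  print(dim(clust$F))       # array of cluster covariance matrices
  print(head(clust$clus))   # assigned cluster and its membership degree
  print(clust$value)        # loss function values for the RS starts
  # TRUE if a start stopped because a covariance matrix became singular
  print(any(is.nan(clust$value)))
}
```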

Arguments

X: Matrix or data.frame
k: An integer value or vector specifying the number of clusters for which the index is to be calculated (default: 2:6)
m: Parameter of fuzziness (default: 2)
vp: Volume parameter (default: rep(1,k))
gam: Weighting parameter for the fuzzy covariance matrices (default: 0)
mcn: Maximum condition number for the fuzzy covariance matrices (default: 1e+15)
RS: Number of (random) starts (default: 1)
stand: Standardization: if stand=1, the clustering algorithm is run using standardized data (default: no standardization)
startU: Rational start for the membership degree matrix U (default: no rational start)
index: Cluster validity index to select the number of clusters: PC (partition coefficient), PE (partition entropy), MPC (modified partition coefficient), SIL (silhouette), SIL.F (fuzzy silhouette), XB (Xie and Beni) (default: "SIL.F")
alpha: Weighting coefficient for the fuzzy silhouette index SIL.F (default: 1)
conv: Convergence criterion (default: 1e-9)
maxit: Maximum number of iterations (default: 1e+2)
seed: Seed value for random number generation (default: NULL)
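
To give an idea of what the simpler validity indices in index measure, the sketch below computes the partition coefficient, the partition entropy and the modified partition coefficient from a membership degree matrix, using their usual textbook definitions. The helpers pc, pe and mpc are hypothetical and are not the fclust implementations; the natural logarithm is assumed for the entropy.

```r
# Hypothetical helpers (not the fclust implementations).
# U is an n x k membership degree matrix whose rows sum to 1.
pc <- function(U) sum(U^2) / nrow(U)
pe <- function(U) -sum(ifelse(U > 0, U * log(U), 0)) / nrow(U)
mpc <- function(U) {
  k <- ncol(U)
  1 - k / (k - 1) * (1 - pc(U))
}

# Maximally fuzzy partition: every object belongs equally to both clusters
U_fuzzy <- matrix(0.5, nrow = 4, ncol = 2)
# Crisp partition: every object belongs fully to one cluster
U_crisp <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))

c(PC = pc(U_fuzzy), PE = pe(U_fuzzy), MPC = mpc(U_fuzzy))
c(PC = pc(U_crisp), PE = pe(U_crisp), MPC = mpc(U_crisp))
```

PC and MPC are largest, and PE smallest, for a crisp partition, which is why they can be compared across candidate values of k.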

Author

Paolo Giordani, Maria Brigida Ferraro, Alessio Serafini

Details

If startU is given, the argument k is ignored (the number of clusters is ncol(startU)).
If startU is given, the first elements of value, cput and iter refer to the rational start.
If a cluster covariance matrix becomes singular, then the algorithm stops and the corresponding element of value is NaN.
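
A rational start can be prepared as in the sketch below. The helper random_membership is hypothetical and simply builds a membership degree matrix with non-negative entries and rows summing to 1; in practice startU would more likely come from a previous clustering. The FKM.gkb call is skipped unless fclust is installed.

```r
# Hypothetical helper: random n x k membership matrix with rows summing to 1
random_membership <- function(n, k) {
  U <- matrix(runif(n * k), nrow = n, ncol = k)
  U / rowSums(U)
}

set.seed(123)
U0 <- random_membership(10, 3)
rowSums(U0)   # every row sums to 1

if (requireNamespace("fclust", quietly = TRUE)) {
  data(unemployment, package = "fclust")
  # startU needs one row per object; k is taken from ncol(startU)
  U_start <- random_membership(nrow(unemployment), 3)
  fit <- fclust::FKM.gkb(unemployment, startU = U_start)
  print(fit$k)
}
```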

References

Babuska R., van der Veen P.J., Kaymak U., 2002. Improved covariance estimation for Gustafson-Kessel clustering. Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1081-1085.
Gustafson D.E., Kessel W.C., 1978. Fuzzy clustering with a fuzzy covariance matrix. Proceedings of the IEEE Conference on Decision and Control, 761-766.

Examples

if (FALSE) {
library(fclust)
## unemployment data
data(unemployment)
## Gustafson, Kessel and Babuska-like fuzzy k-means, fixing the number of clusters
clust <- FKM.gkb(unemployment, k = 3, RS = 10)
## Gustafson, Kessel and Babuska-like fuzzy k-means, selecting the number of clusters
clust <- FKM.gkb(unemployment, k = 2:6, RS = 10)
}
