cshell: Fuzzy C-Shell Clustering

Description

Performs fuzzy c-shell clustering, the shell-prototype (ring-prototype) variant of the fuzzy k-means clustering method.

Usage

cshell(x, centers, iter.max=100, verbose=FALSE, dist="euclidean",
       method="cshell", m=2, radius = NULL)

Arguments

x

The data matrix, where columns correspond to variables and rows to observations.

centers

Number of clusters or initial values for cluster centers

iter.max

Maximum number of iterations

verbose

If TRUE, make some output during learning

dist

Must be either "euclidean" or "manhattan". If "euclidean", the mean squared error is computed; if "manhattan", the mean absolute error. Abbreviations are also accepted.

method

Currently, only "cshell" (the fuzzy c-shell clustering method) is supported.

m

The degree of fuzzification. It is defined for values greater than 1.

radius

The radius of resulting clusters

Value

cshell returns an object of class "cshell".

centers

The final cluster centers.

size

The number of data points in each cluster.

cluster

Vector containing the indices of the clusters to which the data points are assigned. Each point is assigned to the cluster for which its membership value is largest.

iter

The number of iterations performed.

membership

A matrix with the membership values of the data points to the clusters.

withinerror

The sum of squared distances within the clusters.

call

The call, with all arguments specified by name.
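
The relationship between the membership, cluster and size components can be sketched in base R. This is only an illustration under the description given here, not the package's internal code; the membership matrix below is made up for the example.

```r
# Hypothetical membership matrix: 4 points (rows) x 2 clusters (columns).
membership <- matrix(c(0.9, 0.1,
                       0.2, 0.8,
                       0.6, 0.4,
                       0.3, 0.7), ncol = 2, byrow = TRUE)

# Hard assignment: index of the largest membership value in each row.
cluster <- apply(membership, 1, which.max)

# Cluster sizes derived from the hard assignment.
size <- tabulate(cluster, nbins = ncol(membership))

print(cluster)
print(size)
```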

Details

The data given by x is clustered by the fuzzy c-shell algorithm. If centers is a matrix, its rows are taken as the initial cluster centers. If centers is an integer, centers rows of x are randomly chosen as initial values. The algorithm stops when the maximum number of iterations (given by iter.max) is reached.
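
The two ways of supplying centers described above can be sketched as follows. The helper init_centers is hypothetical and exists only for this illustration; it is not part of the package and may differ from what cshell does internally.

```r
set.seed(1)                          # only to make the sketch reproducible
x <- matrix(rnorm(40), ncol = 2)     # 20 observations, 2 variables

# Hypothetical helper mirroring the documented behaviour of `centers`.
init_centers <- function(x, centers) {
  if (is.matrix(centers)) {
    centers                                      # rows are used as given
  } else {
    x[sample(nrow(x), centers), , drop = FALSE]  # `centers` random rows of x
  }
}

c1 <- init_centers(x, 3)                                  # 3 random rows of x
c2 <- init_centers(x, matrix(c(0, 0, 1, 1), ncol = 2,
                             byrow = TRUE))               # explicit centers
```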

If verbose is TRUE, it displays, for each iteration, the iteration number and the value of the objective function.

If dist is "euclidean", the distance between the cluster center and the data points is the Euclidean distance (ordinary kmeans algorithm). If dist is "manhattan", the distance between the cluster center and the data points is the sum of the absolute coordinate differences. If method is "cshell", the fuzzy c-shell clustering method is used.
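
As a small illustration of the two distance options, the following base R sketch computes both distances from a single center to two made-up points. It only shows the definitions and makes no claim about how the package computes them.

```r
# Two illustrative points (rows) and one center.
x <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
center <- c(0, 0)

diffs <- sweep(x, 2, center)        # subtract the center from every row

euclid <- sqrt(rowSums(diffs^2))    # Euclidean distance
manhat <- rowSums(abs(diffs))       # Manhattan distance

print(euclid)
print(manhat)
```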

The parameter m defines the degree of fuzzification. It is defined for real values greater than 1, and the larger it is, the fuzzier the membership values of the clustered data points. By default, the parameter radius is set to 0.2 for every cluster.
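
The effect of m can be illustrated with a minimal sketch. It assumes the textbook fuzzy-clustering membership update applied to the squared distance of each point from a ring of the given radius; the function shell_membership and this exact formula are assumptions made for illustration and are not taken from the package source.

```r
# Hypothetical helper: memberships of points to ring-shaped clusters.
# Assumes u[k, i] is proportional to d[k, i]^(-2 / (m - 1)), where d[k, i]
# is the distance of point k from the ring of cluster i.
shell_membership <- function(x, centers, radius, m = 2) {
  d <- sapply(seq_len(nrow(centers)), function(i) {
    diffs <- sweep(x, 2, centers[i, ])
    abs(sqrt(rowSums(diffs^2)) - radius[i])   # distance from the ring
  })
  d <- matrix(d, nrow = nrow(x))
  w <- (d^2 + 1e-12)^(-1 / (m - 1))   # small constant avoids division by zero
  w / rowSums(w)                      # each row sums to 1
}

x <- matrix(c(0.3, 0.0,
              1.0, 0.9), ncol = 2, byrow = TRUE)
centers <- matrix(c(0, 0,
                    1, 1), ncol = 2, byrow = TRUE)
radius <- c(0.2, 0.2)

u2 <- shell_membership(x, centers, radius, m = 2)
u5 <- shell_membership(x, centers, radius, m = 5)
print(round(u2, 3))
print(round(u5, 3))
```

Under these assumptions, the largest membership of each point is smaller for m = 5 than for m = 2, which is the sense in which a larger m gives a fuzzier partition.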

References

Rajesh N. Dave. Fuzzy Shell-Clustering and Applications to Circle Detection in Digital Images. Int. J. of General Systems, Vol. 16, pp. 343-355, 1990.

Examples

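## a 2-dimensional example
library(e1071)  # assumption: cshell() is provided by the e1071 package
x <- rbind(matrix(rnorm(50, sd = 0.3), ncol = 2),
           matrix(rnorm(50, mean = 1, sd = 0.3), ncol = 2))
cl <- cshell(x, 2, 20, verbose = TRUE, method = "cshell", m = 2)
print(cl)

# assign classes to some new data
# (14 values per block, so each 2-column matrix is filled without recycling)
y <- rbind(matrix(rnorm(14, sd = 0.3), ncol = 2),
           matrix(rnorm(14, mean = 1, sd = 0.3), ncol = 2))
# ycl <- predict(cl, y, type = "both")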