Two types of conjugate convex functions are available: one that is based on
powers of the norm of the prototype vectors and another that is based on a
logarithmic transformation of the norm. Both are intended to obtain more
robust partitions.
Using par = 2 is equivalent to performing ordinary k-means with
Euclidean distances, par = 1 is equivalent to LVQ of Kohonen type
(the directions of the prototypes from the center of the data are used),
and par = 0 is equivalent to using 2*ln(cosh(|p|))/2.
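The par = 2 case can be checked directly: minimizing the squared Euclidean
distance to a prototype is the same as maximizing the linear score
x.p - |p|^2/2, since |x - p|^2 = |x|^2 - 2*(x.p - |p|^2/2) and |x|^2 does
not depend on the prototype. A minimal sketch (NumPy, with made-up data; the
variable names are illustrative, not part of the function's interface):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))   # data points
p = rng.normal(size=(4, 3))    # prototype vectors

# Ordinary k-means assignment: nearest prototype in Euclidean distance.
d2 = ((x[:, None, :] - p[None, :, :]) ** 2).sum(axis=2)
by_distance = d2.argmin(axis=1)

# Equivalent linear form: |x - p|^2 = |x|^2 - 2*(x.p - |p|^2/2),
# so minimizing the distance is maximizing the score x.p - |p|^2/2.
score = x @ p.T - 0.5 * (p ** 2).sum(axis=1)
by_score = score.argmax(axis=1)

assert (by_distance == by_score).all()
```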
Internally the algorithm uses sparse data structures and avoids computations
with zero data values. Thus, the data must not be centered (the algorithm
does this internally with the option to further standardize the data). For
dense data this is slightly inefficient.
If initial prototypes are omitted the number of prototypes must be specified.
In this case the initial prototypes are drawn from the data (without
replacement).
If the number of retries is greater than zero, the best among that number
of trial solutions is returned. Note that in this case the number of
prototypes must be specified, as the initial prototypes are sampled from
the data for each trial.
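The retry scheme above can be sketched as follows. This is an assumption
about the overall control flow only (drawing prototypes from the data rows
without replacement, keeping the lowest-cost trial); the function names
`sample_prototypes`, `lloyd`, and `best_of_retries` are hypothetical, and a
plain k-means (Lloyd) step stands in for the actual update rule:

```python
import numpy as np

def sample_prototypes(x, k, rng):
    """Draw k initial prototypes from the data rows without replacement."""
    return x[rng.choice(len(x), size=k, replace=False)]

def lloyd(x, p, n_iter=10):
    """A few ordinary k-means (Lloyd) iterations; returns prototypes and cost."""
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - p[None, :, :]) ** 2).sum(axis=2)
        member = d2.argmin(axis=1)
        for j in range(len(p)):
            if (member == j).any():   # only "active" prototypes are updated
                p[j] = x[member == j].mean(axis=0)
    return p, d2.min(axis=1).sum()

def best_of_retries(x, k, retries, rng):
    """Return the best of 1 + retries trial solutions (smallest cost)."""
    best_p, best_cost = None, np.inf
    for _ in range(1 + retries):
        p, cost = lloyd(x, sample_prototypes(x, k, rng).copy())
        if cost < best_cost:
            best_p, best_cost = p, cost
    return best_p, best_cost

rng = np.random.default_rng(1)
x = rng.normal(size=(50, 2))
p, cost = best_of_retries(x, k=3, retries=5, rng=rng)
```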
The debugging output shows the iteration number, the inverted information
and the variance of the current partition as a percentage of the total (if
each data point were a cluster), and the number of active prototypes (those
with at least one member, i.e. a data point that is not closer to any
other prototype).
Note that the algorithm uses tie-breaking when it determines the cluster
memberships of the samples.