Bagging for clustering is really a general conceptual framework rather
than a specific algorithm. If the primary partitions generated in the
bootstrap stage form a cluster ensemble (so that class memberships of
the objects in x can be obtained), consensus methods for cluster
ensembles (as implemented, e.g., in cl_consensus and cl_medoid) can be
employed for the aggregation stage. In particular, (possibly new)
bagging algorithms can easily be realized by directly running
cl_consensus on the results of cl_boot.
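
The two-stage scheme just described can be sketched in a few lines of R. This is a minimal illustration, not a recommended recipe: the toy data, the number of bootstrap replicates (B = 20), and the choice of k-means with k = 2 as base algorithm are all assumptions made for the example. Setting resample = TRUE asks cl_boot to draw bootstrap samples of x and predict class memberships for all original objects.

```r
library(clue)

set.seed(1)
# Toy data (made up for illustration): two well-separated groups of 25 points
x <- rbind(matrix(rnorm(50, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(50, mean = 3, sd = 0.3), ncol = 2))

# Bootstrap stage: an ensemble of B = 20 k-means partitions of the objects in x
ens <- cl_boot(x, B = 20, k = 2, algorithm = "kmeans", resample = TRUE)

# Aggregation stage, variant 1: consensus partition (default consensus method)
cons <- cl_consensus(ens)

# Aggregation stage, variant 2: the ensemble member closest to all others
med <- cl_medoid(ens)

# Hard class ids of the aggregated partitions
ids_cons <- cl_class_ids(cons)
ids_med <- cl_class_ids(med)
table(ids_cons)
```

Other consensus methods can be selected through the method argument of cl_consensus, which gives different bagging variants from the same ensemble.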
In BagClust1, aggregation proceeds by generating a reference partition
by running the base clustering algorithm on the whole given data set,
and averaging the ensemble memberships after optimally matching them
to the reference partition (in fact, by minimizing Euclidean
dissimilarity, see cl_dissimilarity).
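
As a hedged illustration, package clue offers function cl_bag, whose method "DFBC1" is documented as an implementation of BagClust1. The sketch below assumes that interface; the toy data, B = 20, and k = 2 are choices made only for the example.

```r
library(clue)

set.seed(1)
# Toy data (made up for illustration): two well-separated groups of 25 points
x <- rbind(matrix(rnorm(50, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(50, mean = 3, sd = 0.3), ncol = 2))

# BagClust1: the reference partition comes from k-means on all of x; the
# bootstrap memberships are matched to it and then averaged
bag <- cl_bag(x, B = 20, k = 2, algorithm = "kmeans", method = "DFBC1")

# Hard class ids derived from the averaged memberships
ids_bag <- cl_class_ids(bag)
table(ids_bag)
```

The averaged memberships themselves can be inspected with cl_membership(bag), which shows how consistently each object was assigned across the bootstrap replicates.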
If the base clustering algorithm yields prototypes, aggregation can be
based on clustering these. This is the idea underlying the
“Bagged Clustering” algorithm introduced in Leisch (1999) and
implemented by function bclust in package e1071.
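
A minimal sketch of prototype-based aggregation with bclust follows. The toy data and all parameter values (10 base runs, 5 prototypes per run, 2 final clusters) are assumptions chosen to keep the example small, not recommended settings.

```r
library(e1071)

set.seed(1)
# Toy data (made up for illustration): two well-separated groups of 25 points
x <- rbind(matrix(rnorm(50, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(50, mean = 3, sd = 0.3), ncol = 2))

# Run k-means with 5 centers on each of 10 bootstrap samples, then cluster
# the pooled prototypes hierarchically and cut the tree into 2 clusters
bc <- bclust(x, centers = 2, iter.base = 10, base.centers = 5,
             verbose = FALSE)

# Cluster assignment of each original observation
ids_bc <- bc$cluster
table(ids_bc)
```

Here the ensemble is aggregated through its prototypes rather than through class memberships, so no matching of cluster labels across bootstrap runs is needed.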