
Runs a user-specified set of clustering methods (CBI-functions, see kmeansCBI) with several numbers of clusters on a dataset, with unified output.
cluster.magazine(data,G,diss = inherits(data, "dist"),
scaling=TRUE, clustermethod,
distmethod=rep(TRUE,length(clustermethod)),
ncinput=rep(TRUE,length(clustermethod)),
clustermethodpars,
trace=TRUE)
List of lists comprising
Two-dimensional list. The first list index i is the number of the clustering method (ordering as specified in clustermethod), the second list index j is the number of clusters. This stores the full output of clustermethod i run on number of clusters j.
Two-dimensional list. The first list index i is the number of the clustering method (ordering as specified in clustermethod), the second list index j is the number of clusters. This stores the clustering integer vector (i.e., the partition-component of the CBI-function, see kmeansCBI) of clustermethod i run on number of clusters j.
Two-dimensional list. The first list index i is the number of the clustering method (ordering as specified in clustermethod), the second list index j is the number of clusters. List entries are single logicals. If TRUE, the clustering method estimated some noise, i.e., points not belonging to any cluster, which in the clustering vector are indicated by the highest number (number of clusters plus one in case that the number of clusters was fixed).
List of integer vectors of length 2. The first number is the number of the clustering method (the order is determined by argument clustermethod), the second number is the number of clusters for those methods that estimate the number of clusters themselves and estimate a number that is smaller than min(G) or larger than max(G).
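The index convention described above (first index: clustering method, second index: number of clusters) can be illustrated with a small base R sketch. This does not call cluster.magazine; it only builds a nested list of the same shape by hand, and the object names (x, methods, clustering) are made up for illustration.

```r
# Hypothetical stand-in for the nested layout; NOT output of cluster.magazine,
# only base R mimicking the [[method]][[number of clusters]] indexing.
set.seed(1)
x <- scale(matrix(rnorm(40), ncol = 2))   # 20 observations, 2 variables
G <- 2:3                                  # numbers of clusters to try
methods <- c("kmeans", "average")         # illustrative labels only

clustering <- vector("list", length(methods))
for (i in seq_along(methods)) {
  clustering[[i]] <- list()
  for (j in G) {
    # The second index is the number of clusters itself, not its position in G
    clustering[[i]][[j]] <- if (methods[i] == "kmeans") {
      kmeans(x, centers = j)$cluster
    } else {
      cutree(hclust(dist(x), method = "average"), k = j)
    }
  }
}

# Partition produced by method 2 with 3 clusters
table(clustering[[2]][[3]])
# Slot 1 stays empty because 1 is not in G
is.null(clustering[[1]][[1]])
```

Because the second index is the number of clusters rather than a position in G, slots for cluster numbers that were not requested simply remain NULL in this sketch.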
data: data matrix or dist-object.
G: vector of integers. Numbers of clusters to consider.
diss: logical. If TRUE, the data matrix is assumed to be a distance/dissimilarity matrix, otherwise it is an observations times variables matrix.
scaling: either a logical or a numeric vector of length equal to the number of columns of data. If FALSE, data won't be scaled, otherwise scaling is passed on to scale as argument scale.
clustermethod: vector of strings specifying names of CBI-functions (see kmeansCBI). These are the clustering methods to be applied.
distmethod: vector of logicals, of the same length as clustermethod. TRUE means that the clustering method operates on distances. If diss=TRUE, all entries have to be TRUE. Otherwise, if an entry is TRUE, the corresponding method will be applied on dist(data).
ncinput: vector of logicals, of the same length as clustermethod. TRUE indicates that the corresponding clustering method requires the number of clusters as input and will not estimate the number of clusters itself.
clustermethodpars: list of the same length as clustermethod. Specifies parameters for all involved clustering methods. Its jth entry is passed to clustermethod number j. Can be an empty entry in case all defaults are used for a clustering method. The number of clusters does not need to be specified here.
trace: logical. If TRUE, some runtime information is printed.
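As a rough illustration of what the scaling and distance options amount to, the following base R sketch applies scale and dist by hand. It is not taken from the function's source; the toy matrix x and the chosen divisors are assumptions, and centering is left at scale's own default.

```r
# Toy data with two columns on very different scales (made up for illustration)
set.seed(2)
x <- cbind(a = rnorm(10, sd = 1), b = rnorm(10, sd = 100))

# A logical TRUE as the scale argument: each column is centred and
# divided by its standard deviation
s1 <- scale(x, scale = TRUE)

# A numeric vector with one entry per column: each column is centred and
# divided by the supplied value instead
s2 <- scale(x, scale = c(1, 100))

# A distance-based method would then operate on the dissimilarities
d <- dist(s1)
```

With 10 observations, d holds the 45 pairwise Euclidean distances between rows of the scaled matrix.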
Christian Hennig christian.hennig@unibo.it https://www.unibo.it/sitoweb/christian.hennig/en/
Hennig, C. (2017) Cluster validation by measurement of clustering characteristics relevant to the user. In C. H. Skiadas (ed.) Proceedings of ASMDA 2017, 501-520, https://arxiv.org/abs/1703.09282
clusterbenchstats, kmeansCBI
set.seed(20000)
options(digits=3)
face <- rFace(10,dMoNo=2,dNoEy=0,p=2)
clustermethod=c("kmeansCBI","hclustCBI","hclustCBI")
# A clustering method can be used more than once, with different
# parameters
clustermethodpars <- list()
clustermethodpars[[2]] <- clustermethodpars[[3]] <- list()
clustermethodpars[[2]]$method <- "complete"
clustermethodpars[[3]]$method <- "average"
cmf <- cluster.magazine(face,G=2:3,clustermethod=clustermethod,
distmethod=rep(FALSE,3),clustermethodpars=clustermethodpars)
# str() prints the structure itself; wrapping it in print() only adds a trailing NULL
str(cmf)