The function implements the k-means algorithm for a set of histogram-valued data.
WH_kmeans(
x,
k,
rep = 5,
simplify = FALSE,
qua = 10,
standardize = FALSE,
verbose = FALSE
)
x
A MatH object (a matrix of distributionH objects).
k
An integer, the number of groups.
rep
An integer, the maximum number of repetitions of the algorithm (default rep = 5).
simplify
A logical value (default is FALSE). If TRUE, histograms are recomputed in order to speed up the algorithm.
qua
An integer. If simplify = TRUE, the number of quantiles used to recode the histograms.
standardize
A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable, using the Wasserstein-based standard deviation. Use it to give every variable a standard deviation equal to one.
verbose
A logical value (default is FALSE). If TRUE, details on the computations are shown.
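The effect of simplify and qua can be pictured with a small base R sketch. It recodes one histogram onto a grid of equal-probability bins by linear interpolation of its quantile function. The helper name recode_histogram, its arguments, and the reading of qua as the number of equal-probability bins are assumptions made for illustration; the package may recode histograms differently.

```r
# Hypothetical helper (not part of the package): recode a histogram onto
# `qua` equal-probability bins.
# `breaks` are the bin bounds; `cum_p` are the cumulative probabilities at
# those bounds (from 0 to 1, strictly increasing).
recode_histogram <- function(breaks, cum_p, qua = 10) {
  p <- seq(0, 1, length.out = qua + 1)
  # Linear interpolation of the quantile function, i.e. a uniform density
  # is assumed inside each original bin.
  q <- approx(x = cum_p, y = breaks, xout = p)$y
  list(x = q, p = p)
}

# A histogram with bins [0,1], [1,2], [2,4] and masses 0.2, 0.5, 0.3,
# recoded to 4 equal-probability bins.
h <- recode_histogram(breaks = c(0, 1, 2, 4), cum_p = c(0, 0.2, 0.7, 1), qua = 4)
h$x
```

Once every histogram shares the same probability grid, comparing two of them reduces to comparing two vectors of quantiles, which is one plausible reason the recoding speeds up the algorithm.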
A list with the results of the k-means clustering of the set of histogram-valued data x into k clusters.
solution
A list. Returns the best solution among the rep repetitions, i.e., the one with the minimum sum-of-squares criterion.
solution$IDX
A vector. The clusters to which the objects are assigned.
solution$cardinality
A vector. The cardinality of each final cluster.
solution$centers
A MatH object with the description of the centers.
solution$Crit
A number. The value of the criterion (sum of squared deviations from the centers) at the end of the run.
quality
A number. The percentage of the sum of squared deviations explained by the model (the higher, the better).
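The sum-of-squares criterion can be illustrated with a minimal base R sketch for a single variable. It assumes that the deviation from a center is the squared L2 Wasserstein distance, that all histograms are described by quantiles on a common probability grid, and that the density is uniform inside each bin. The function names wass2_sq and ss_criterion are hypothetical and are not part of the package.

```r
# Squared L2 Wasserstein distance between two histograms given by their
# quantiles q1 and q2 at the same cumulative probabilities p.
# With uniform bins, each bin contributes
#   w * ((c1 - c2)^2 + (r1 - r2)^2 / 3)
# where w is the bin mass, c the bin center and r the bin radius.
wass2_sq <- function(q1, q2, p) {
  w  <- diff(p)
  c1 <- (q1[-length(q1)] + q1[-1]) / 2
  c2 <- (q2[-length(q2)] + q2[-1]) / 2
  r1 <- diff(q1) / 2
  r2 <- diff(q2) / 2
  sum(w * ((c1 - c2)^2 + (r1 - r2)^2 / 3))
}

# Sum of squared distances of each object from the center of its cluster.
# `objects` and `centers` are lists of quantile vectors; `idx` holds the
# cluster label of each object.
ss_criterion <- function(objects, centers, idx, p) {
  sum(vapply(seq_along(objects), function(i) {
    wass2_sq(objects[[i]], centers[[idx[i]]], p)
  }, numeric(1)))
}

p  <- c(0, 0.5, 1)
h1 <- c(0, 1, 2)
h2 <- c(1, 2, 3)       # h1 shifted by one unit
wass2_sq(h1, h2, p)    # 1
```

Under these assumptions a pure shift of a histogram changes only the bin centers, so the squared distance equals the squared shift.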
Irpino A., Verde R., Lechevallier Y. (2006). Dynamic clustering of histograms using Wasserstein metric. In: Rizzi A., Vichi M. (eds.), COMPSTAT 2006 - Advances in Computational Statistics, pp. 869-876. Heidelberg: Physica-Verlag.
# NOT RUN {
results <- WH_kmeans(x = BLOOD, k = 2, rep = 10, simplify = TRUE,
                     qua = 10, standardize = TRUE, verbose = TRUE)
# }
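The returned list can be inspected with a small helper such as the one sketched here. The helper summarize_wh_kmeans is hypothetical; it relies only on the elements named in the Value section. The usage part assumes that WH_kmeans and the BLOOD data are provided by the HistDAWass package, and it is skipped when that package is not installed.

```r
# Hypothetical helper: collect the main elements of the list returned by
# WH_kmeans, as documented in the Value section.
summarize_wh_kmeans <- function(res) {
  sol <- res$solution
  list(
    n_objects     = length(sol$IDX),
    # Objects per cluster, recomputed from the assignment vector; this
    # should agree with sol$cardinality.
    cluster_sizes = as.vector(table(sol$IDX)),
    criterion     = sol$Crit,
    quality       = res$quality
  )
}

# Usage (assumption: the HistDAWass package is installed).
if (requireNamespace("HistDAWass", quietly = TRUE)) {
  library(HistDAWass)
  results <- WH_kmeans(x = BLOOD, k = 2, rep = 10)
  str(summarize_wh_kmeans(results))
}
```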