WH_hclust: Hierarchical clustering of histogram data
Description
The function implements hierarchical clustering
for a set of histogram-valued data, based on the L2 Wasserstein distance.
It extends the hclust function of the stats package.
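As a rough illustration of the idea, the squared L2 Wasserstein distance between two one-dimensional distributions equals the integral over (0, 1) of the squared difference of their quantile functions. The sketch below approximates that quantity in base R from raw samples on a grid of quantile levels, then passes the resulting distance matrix to stats::hclust. It is not the package's implementation: the helper wass2, the grid size n_q, and the simulated samples are all hypothetical and chosen only for illustration.

```r
# Sketch only: approximate L2 Wasserstein distances from empirical quantiles.
# `wass2`, `n_q` and the simulated samples are illustrative assumptions,
# not part of the documented package.
set.seed(1)

# Five "units", each described by a sample from a different distribution.
samples <- list(
  a = rnorm(500, mean = 0,   sd = 1),
  b = rnorm(500, mean = 0.2, sd = 1),
  c = rnorm(500, mean = 5,   sd = 1),
  d = rnorm(500, mean = 5.3, sd = 2),
  e = rnorm(500, mean = 10,  sd = 1)
)

# Midpoint grid of probability levels on (0, 1).
n_q <- 100
p <- (seq_len(n_q) - 0.5) / n_q

# Approximate L2 Wasserstein distance: square root of the mean squared
# difference between the two empirical quantile functions on the grid.
wass2 <- function(x, y) {
  qx <- quantile(x, p, names = FALSE)
  qy <- quantile(y, p, names = FALSE)
  sqrt(mean((qx - qy)^2))
}

# Pairwise distance matrix between all units.
n <- length(samples)
D <- matrix(0, n, n, dimnames = list(names(samples), names(samples)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- wass2(samples[[i]], samples[[j]])
  }
}

# Standard agglomerative clustering on the precomputed distances.
tree <- hclust(as.dist(D), method = "complete")
groups <- cutree(tree, k = 3)
print(groups)
```

Using more quantile levels gives a finer approximation of the integral at a higher computational cost, which is the same kind of trade-off the simplify and qua arguments control.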
Arguments
simplify
A logical value (default is FALSE). If TRUE, histograms are recomputed in order to speed up the algorithm.
qua
An integer. If simplify=TRUE, this is the number of quantiles used to recode the histograms.
standardize
A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable,
using the Wasserstein-based standard deviation. Use it when every variable should have a standard deviation equal to one.
distance
A string, default "WDIST", which selects the L2 Wasserstein distance (other distances will be implemented).
method
A string, default "complete": the agglomeration method to be used.
This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2",
"single", "complete", "average" (= UPGMA), "mcquitty"
(= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
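The method names listed above match those of stats::hclust, so their effect can be explored on any distance matrix. The following minimal base R sketch uses toy points on a line rather than histogram data; the object names are illustrative only.

```r
# Toy data: two well-separated groups of points on a line (not histogram data).
x <- c(0, 0.5, 1, 10, 10.5, 11)
d <- dist(x)

# A full method name and an unambiguous abbreviation select the same linkage.
h_full <- hclust(d, method = "average")
h_abbr <- hclust(d, method = "ave")

# Compare the height of the final merge across several agglomeration methods.
methods <- c("single", "complete", "average", "ward.D2")
top_height <- sapply(methods, function(m) max(hclust(d, method = m)$height))
print(top_height)
```

Single linkage joins the two groups at the smallest between-group distance and complete linkage at the largest, which is why the final merge heights differ even though the groups found are the same.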
Value
An object of class hclust which describes the tree produced by
the clustering process.
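Because the return value is a standard hclust object, the usual base R tools apply to it. The sketch below builds an hclust object from the built-in USArrests data (chosen only for illustration, it is not histogram data) and shows the components most often inspected.

```r
# Build a small hclust object from built-in data (illustrative only).
hc <- hclust(dist(USArrests[1:10, ]), method = "complete")

class(hc)    # "hclust"
names(hc)    # components such as merge, height, order, labels
hc$height    # merge heights, one per agglomeration step

# Cut the tree into 3 groups; the result is a named integer vector of labels.
labels3 <- cutree(hc, k = 3)
table(labels3)
```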
References
Irpino A., Verde R. (2006). A new Wasserstein based distance for the hierarchical clustering
of histogram symbolic data. In: Batagelj et al. (eds.), Data Science and Classification, IFCS 2006, pp. 185-192.
Berlin: Springer. ISBN: 3-540-34415-2
Examples
# NOT RUN {
library(HistDAWass) # assumed: the package that provides WH_hclust and the BLOOD data
results <- WH_hclust(x = BLOOD, simplify = TRUE, method = "complete")
plot(results) # plots the dendrogram
cutree(results, k = 5) # returns the labels for 5 clusters
# }