cvi(a, b = NULL, type = "valid", ..., log.base = 10)
"cvi"(a, b = NULL, type = "valid", ..., log.base = 10)
comPart, so please refer to that function.
"RI": Rand Index (to be maximized).
"ARI": Adjusted Rand Index (to be maximized).
"J": Jaccard Index (to be maximized).
"FM": Fowlkes-Mallows (to be maximized).
"VI": Variation of Information (Meila (2003); to be minimized).
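As a rough illustration of what the external indices above measure, the following base R sketch computes three of the pair-counting indices (RI, J, FM) and the Variation of Information for two crisp partitions. The helpers pair_counting_cvis and variation_of_information are hypothetical names that are not part of the package, the label vectors are invented, and log_base mirrors log.base on the assumption that it sets the logarithm used for VI. The package's own implementation may differ in detail.

```r
# Pair-counting external CVIs for two crisp partitions (base R only).
pair_counting_cvis <- function(a, b) {
  stopifnot(length(a) == length(b))
  same_a <- outer(a, a, "==")            # TRUE where two objects share a cluster in a
  same_b <- outer(b, b, "==")            # same for b
  up <- upper.tri(same_a)                # visit each unordered pair exactly once
  n11 <- sum(same_a[up] & same_b[up])    # together in both partitions
  n10 <- sum(same_a[up] & !same_b[up])   # together only in a
  n01 <- sum(!same_a[up] & same_b[up])   # together only in b
  n00 <- sum(!same_a[up] & !same_b[up])  # separated in both partitions
  c(RI = (n11 + n00) / (n11 + n10 + n01 + n00),
    J  = n11 / (n11 + n10 + n01),
    FM = n11 / sqrt((n11 + n10) * (n11 + n01)))
}

# Variation of Information: H(a) + H(b) - 2 * I(a, b).
variation_of_information <- function(a, b, log_base = 10) {
  p_ab <- table(a, b) / length(a)        # joint distribution of the labels
  p_a <- rowSums(p_ab)
  p_b <- colSums(p_ab)
  entropy <- function(p) -sum(p * log(p, base = log_base))
  nz <- p_ab > 0                         # skip empty cells to avoid log(0)
  mutual_info <- sum(p_ab[nz] *
                     log(p_ab[nz] / outer(p_a, p_b)[nz], base = log_base))
  entropy(p_a) + entropy(p_b) - 2 * mutual_info
}

# Invented labels for six objects.
predicted <- c(1, 1, 2, 2, 3, 3)
truth     <- c(1, 1, 1, 2, 2, 2)

res <- pair_counting_cvis(predicted, truth)
vi  <- variation_of_information(predicted, truth)
print(res)
print(vi)
```

Identical partitions give RI = 1 and VI = 0, which matches the direction in which each index is to be optimized.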
shape_extraction with series of different length). The indices marked with a tilde (~) require the calculation of a global centroid. Since DBA and shape_extraction (for series of different length) have some randomness associated, these indices might not be appropriate for those centroids.
"Sil" (!): Silhouette index (Arbelaitz et al. (2013); to be maximized).
"D" (!): Dunn index (Arbelaitz et al. (2013); to be maximized).
"COP" (!): COP index (Arbelaitz et al. (2013); to be minimized).
"DB" (?): Davies-Bouldin index (Arbelaitz et al. (2013); to be minimized).
"DBstar" (?): Modified Davies-Bouldin index (DB*) (Kim and Ramakrishna (2005); to be minimized).
"CH" (~): Calinski-Harabasz index (Arbelaitz et al. (2013); to be maximized).
"SF" (~): Score Function (Saitta et al. (2007); to be maximized).
"valid": Returns all valid indices depending on the type of a and whether b was provided or not.
"internal": Returns all internal CVIs. Only supported for dtwclust-class objects.
"external": Returns all external CVIs. Requires b to be provided.
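To make the role of the global centroid in the tilde-marked indices more concrete, here is a minimal sketch of the classic Calinski-Harabasz ratio. It assumes equal-length series stored as matrix rows, column means as centroids and squared Euclidean distance; these choices are for illustration only and are not necessarily what the package computes. calinski_harabasz is a hypothetical helper and the data are invented.

```r
# Classic Calinski-Harabasz ratio with mean centroids (illustrative assumptions).
calinski_harabasz <- function(x, cl) {
  ids <- sort(unique(cl))
  k <- length(ids)
  n <- nrow(x)
  global <- colMeans(x)                  # global centroid of all series
  between <- 0
  within <- 0
  for (id in ids) {
    members <- x[cl == id, , drop = FALSE]
    centroid <- colMeans(members)        # centroid of this cluster
    # Dispersion of the cluster centroid around the global centroid.
    between <- between + nrow(members) * sum((centroid - global)^2)
    # Dispersion of the members around their own centroid.
    within <- within + sum(sweep(members, 2, centroid)^2)
  }
  (between / (k - 1)) / (within / (n - k))
}

# Four invented two-point "series" forming two well-separated clusters.
x  <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
cl <- c(1, 1, 2, 2)

ch <- calinski_harabasz(x, cl)
print(ch)
```

If the global centroid came from a procedure with associated randomness, such as DBA, repeated evaluations of the same partition could yield different values, which is the concern raised above.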
CVIs can be classified as internal, external or relative, depending on how they are computed. Focusing on the first two, the crucial difference is that internal CVIs consider only the partitioned data and try to define a measure of cluster purity, whereas external CVIs compare the obtained partition to the correct one. Thus, external CVIs can only be used if the ground truth is known. Each index defines its own range of values and whether it is to be minimized or maximized. In many cases, these CVIs can be used to evaluate the result of a clustering algorithm regardless of how the clustering works internally or how the partition came to be.
Which CVI will work best cannot be determined a priori, so they should be tested for each specific application. Usually, several CVIs are computed and compared with each other, possibly using a majority vote to decide on a final result. Furthermore, many CVIs perform additional distance calculations when being computed, which can be computationally expensive when using DTW.
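One possible way to carry out such a majority vote is sketched below in base R. The CVI values are invented for three candidate partitions, and the names scores and minimize are hypothetical. Each index votes for the candidate it ranks best, taking into account whether it is to be minimized or maximized.

```r
# Invented CVI values for three candidate partitions (k = 2, 3, 4).
scores <- data.frame(
  k   = c(2, 3, 4),
  Sil = c(0.41, 0.58, 0.52),   # to be maximized
  CH  = c(120, 150, 155),      # to be maximized
  DB  = c(0.90, 0.60, 0.75),   # to be minimized
  COP = c(0.35, 0.22, 0.25)    # to be minimized
)
minimize <- c("DB", "COP")

# Each index votes for the k it considers best.
best <- sapply(setdiff(names(scores), "k"), function(idx) {
  vals <- scores[[idx]]
  pos <- if (idx %in% minimize) which.min(vals) else which.max(vals)
  scores$k[pos]
})

# Tally the votes and pick the most frequent choice (first one wins ties).
votes <- table(best)
winner <- as.numeric(names(votes)[which.max(votes)])
print(best)
print(winner)
```

A plain vote ignores how strongly each index prefers its choice, so inspecting the raw values alongside the tally is usually worthwhile.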
Note that, even though a fuzzy partition can be converted into a crisp one, making it compatible with many of the existing CVIs, there are also fuzzy CVIs tailored specifically to fuzzy clustering. These may be more suitable in those situations, but they have not been implemented here yet.
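A common way to turn a fuzzy partition into a crisp one is to assign each series to the cluster with its highest membership. The sketch below assumes a membership matrix with one row per series and one column per cluster; the matrix u is invented for illustration.

```r
# Invented fuzzy membership matrix: rows are series, columns are clusters.
u <- rbind(
  c(0.7, 0.2, 0.1),
  c(0.1, 0.3, 0.6),
  c(0.4, 0.5, 0.1)
)

# Crisp label = column with the largest membership (first column wins ties).
crisp <- max.col(u, ties.method = "first")
print(crisp)
```

The resulting integer vector can be used wherever crisp cluster memberships are expected, at the cost of discarding the membership degrees.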
Kim, M., & Ramakrishna, R. S. (2005). New indices for cluster validity assessment. Pattern Recognition Letters, 26(15), 2353-2363.
Meila, M. (2003). Comparing clusterings by the variation of information. In Learning theory and kernel machines (pp. 173-187). Springer Berlin Heidelberg.
Saitta, S., Raphael, B., & Smith, I. F. (2007). A bounded index for cluster validity. In International Workshop on Machine Learning and Data Mining in Pattern Recognition (pp. 174-187). Springer Berlin Heidelberg.