
clustMixType (version 0.4-2)

stability_kproto: Determination of the stability of k-Prototypes Clustering

Description

Calculates the stability of a k-Prototypes clustering with k clusters, or computes the stability-based optimal number of clusters for k-Prototypes clustering. Available stability indices are Jaccard, Rand, Fowlkes & Mallows, and Luxburg.
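The pair-counting indices named above can be illustrated in base R. The following is a minimal sketch using the textbook definitions of the Rand, Jaccard, and Fowlkes & Mallows indices for two partitions of the same objects; the helper names `pair_counts` and `pair_indices` are hypothetical, the package's internal implementation may differ, and the Luxburg index is not covered here.

```r
# Count object pairs by whether they share a cluster in each partition.
# a: together in both, b: together only in p1,
# c: together only in p2, d: separated in both.
pair_counts <- function(p1, p2) {
  stopifnot(length(p1) == length(p2))
  up <- upper.tri(matrix(0, length(p1), length(p1)))
  s1 <- outer(p1, p1, "==")[up]
  s2 <- outer(p2, p2, "==")[up]
  c(a = sum(s1 & s2), b = sum(s1 & !s2),
    c = sum(!s1 & s2), d = sum(!s1 & !s2))
}

# Textbook pair-counting similarity indices (hypothetical helper).
pair_indices <- function(p1, p2) {
  n <- pair_counts(p1, p2)
  a <- n[["a"]]; b <- n[["b"]]; cc <- n[["c"]]; d <- n[["d"]]
  c(rand           = (a + d) / (a + b + cc + d),
    jaccard        = a / (a + b + cc),
    fowlkesmallows = a / sqrt((a + b) * (a + cc)))
}

p1 <- c(1, 1, 2, 2)
p2 <- c(1, 1, 2, 3)
idx <- pair_indices(p1, p2)
print(round(idx, 4))
# rand = 0.8333, jaccard = 0.5, fowlkesmallows = 0.7071
```

All three indices equal 1 when the two partitions are identical; they differ in how they treat pairs that are separated in both partitions.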

Usage

stability_kproto(
  object,
  method = c("rand", "jaccard", "luxburg", "fowlkesmallows"),
  B = 100,
  verbose = FALSE,
  ...
)

Value

The output contains the stability for a given k-Prototypes clustering in a list with two elements:

kp_stab

stability values for the given clustering

kp_bts_stab

stability values for each bootstrap sample
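To make the relation between an overall stability value and per-sample values concrete, here is a hedged base-R sketch of one common bootstrap scheme: recluster each bootstrap sample and compare the result with the reference labels of the sampled objects. It is not necessarily the procedure the package uses. The names `rand_index`, `bootstrap_stability_sketch`, `toy_cluster`, `overall`, and `bts_values` are hypothetical, and the clustering function is a deterministic stand-in so that the example runs without the package.

```r
# Rand index between two label vectors (pair-counting definition).
rand_index <- function(p1, p2) {
  up <- upper.tri(matrix(0, length(p1), length(p1)))
  s1 <- outer(p1, p1, "==")[up]
  s2 <- outer(p2, p2, "==")[up]
  mean(s1 == s2)
}

# Generic bootstrap loop: cluster_fun maps a data frame to cluster labels.
bootstrap_stability_sketch <- function(x, cluster_fun, B = 100) {
  ref <- cluster_fun(x)                      # reference clustering
  bts <- vapply(seq_len(B), function(b) {
    idx <- sample(nrow(x), replace = TRUE)   # bootstrap sample (duplicates kept)
    rand_index(ref[idx], cluster_fun(x[idx, , drop = FALSE]))
  }, numeric(1))
  list(overall = mean(bts), bts_values = bts)
}

# Stand-in clusterer: splits on the sign of one numeric column.
toy_cluster <- function(d) ifelse(d$x3 > 0, 1L, 2L)

set.seed(1)
toy <- data.frame(x3 = rnorm(40))
res <- bootstrap_stability_sketch(toy, toy_cluster, B = 20)
print(res$overall)
```

Because the stand-in clusterer assigns each object the same label regardless of the sample, every bootstrap value here is exactly 1. With a real clustering function, for instance `function(d) kproto(d, 4)$cluster`, the values would vary between samples.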

Arguments

object

Object of class kproto resulting from a call with kproto(..., keep.data=TRUE)

method

character specifying the stability index; one or more of luxburg, fowlkesmallows, rand and jaccard.

B

numeric, number of bootstrap samples

verbose

logical, whether information about the bootstrap procedure should be given.

...

Further arguments passed to kproto, like:

  • nstart: If > 1, kproto is run repeatedly with random initial prototypes.

  • lambda: Factor to trade off between Euclidean distance of numeric variables and simple matching coefficient between categorical variables.
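The role of lambda can be sketched in a few lines of base R. This is an illustration only: it assumes squared Euclidean distance on the numeric variables, as in Huang's formulation of k-prototypes, plus lambda times the number of mismatching categorical variables. The function name `kproto_dist_sketch` is hypothetical and is not part of the package.

```r
# Distance between one observation and one prototype, k-prototypes style.
# Assumption: squared Euclidean distance on the numeric part; the
# package's internals may differ in detail.
kproto_dist_sketch <- function(x_num, x_cat, p_num, p_cat, lambda) {
  d_num <- sum((x_num - p_num)^2)   # numeric part
  d_cat <- sum(x_cat != p_cat)      # simple matching: count of mismatches
  d_num + lambda * d_cat
}

x_num <- c(1, 2); x_cat <- c("A", "B")   # observation
p_num <- c(0, 0); p_cat <- c("A", "A")   # prototype

d0 <- kproto_dist_sketch(x_num, x_cat, p_num, p_cat, lambda = 0)
d2 <- kproto_dist_sketch(x_num, x_cat, p_num, p_cat, lambda = 2)
print(c(lambda0 = d0, lambda2 = d2))
# lambda = 0: numeric part only, 5; lambda = 2: 5 + 2 * 1 = 7
```

A larger lambda gives the categorical variables more weight relative to the numeric ones; with lambda = 0 they are ignored entirely.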

Author

Rabea Aschenbruck

References

  • Aschenbruck, R., Szepannek, G., Wilhelm, A.F.X. (2023): Stability of mixed-type cluster partitions for determination of the number of clusters. Submitted.

  • von Luxburg, U. (2010): Clustering stability: an overview. Foundations and Trends in Machine Learning, Vol 2, Issue 3. doi:10.1561/2200000008.

  • Ben-Hur, A., Elisseeff, A., Guyon, I. (2002): A stability based method for discovering structure in clustered data. Pacific Symposium on Biocomputing. doi:10/bhfxmf.

Examples

if (FALSE) {
library(clustMixType)

# generate toy data with factors and numerics
n   <- 10
prb <- 0.99
muk <- 2.5 

x1 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x1 <- c(x1, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x1 <- as.factor(x1)
x2 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x2 <- c(x2, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x2 <- as.factor(x2)
x3 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))
x4 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))
x <- data.frame(x1,x2,x3,x4)

# apply k-prototypes
kpres <- kproto(x, 4, keep.data = TRUE)

# calculate cluster stability
stab <- stability_kproto(method = c("luxburg","fowlkesmallows"), object = kpres)

}
