pcm: Compute Pairwise Co-occurrence Matrix

Description

Let clustering be a label from data of \(N\) observations and suppose we are given \(M\) such labels. Co-occurrent matrix counts the number of events where two observations \(X_i\) and \(X_j\) belong to the same category/class. PCM serves as a measure of uncertainty embedded in any algorithms with non-deterministic components.

Usage

pcm(partitions)

Arguments

partitions

partitions can be provided in either (1) an \((M\times N)\) matrix where each row is a clustering for \(N\) objects, or (2) a length-\(M\) list of length-\(N\) clustering labels.

Value

an \((N\times N)\) matrix, whose elements \((i,j)\) are counts for how many times observations \(i\) and \(j\) belong to the same cluster, ranging from \(0\) to \(M\).

Examples

Run this code

# NOT RUN {
# -------------------------------------------------------------
#               PSM with 'iris' dataset + k-means++
# -------------------------------------------------------------
## PREPARE WITH SUBSET OF DATA
data(iris)
X     = as.matrix(iris[,1:4])
lab   = as.integer(as.factor(iris[,5]))

## EMBEDDING WITH PCA
X2d = Rdimtools::do.pca(X, ndim=2)$Y

## RUN K-MEANS++ 100 TIMES
partitions = list()
for (i in 1:100){
  partitions[[i]] = kmeanspp(X)$cluster
}

## COMPUTE PCM
iris.pcm = pcm(partitions)

## VISUALIZATION
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,2), pty="s")
plot(X2d, col=lab, pch=19, main="true label")
image(iris.pcm[,150:1], axes=FALSE, main="PCM")
par(opar)

# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

Value

See Also

Examples