Tools for Cluster Analysis

T4cluster is an R package that serves as a computational toolkit for cluster analysis, with broad coverage of related topics. It contains several classes of algorithms for

  • Clustering with Vector-Valued Data
  • Clustering with Functional Data
  • Clustering on the Unit Hypersphere
  • Subspace Clustering
  • Measures : Compare Two Clusterings
  • Learning with Multiple Clusterings

and other utility functions for further use. If you would like additional functionality or have suggestions, please contact the maintainer.

Installation

You can install the released version of T4cluster from CRAN with:

install.packages("T4cluster")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("kisungyou/T4cluster")

Minimal Example : Clustering SMILEY Data

T4cluster offers a variety of clustering algorithms through a common interface. In this example, we show a basic pipeline with the SMILEY dataset, which can be generated as follows:

# load the library
library(T4cluster)

# generate the data
smiley = T4cluster::genSMILEY(n=200)
data   = smiley$data
label  = smiley$label

# visualize
plot(data, pch=19, col=label, xlab="", ylab="", main="SMILEY Data")

Each component of the face is treated as one cluster, so the data has 4 clusters. Here we compare 4 methods: (1) k-means (kmeans), (2) k-means++ (kmeanspp), (3) Gaussian mixture model (gmm), and (4) spectral clustering by Ng, Jordan, and Weiss (scNJW).

# run algorithms
run1 = T4cluster::kmeans(data, k=4)
run2 = T4cluster::kmeanspp(data, k=4)
run3 = T4cluster::gmm(data, k=4)
run4 = T4cluster::scNJW(data, k=4, sigma = 0.1)
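
To give some intuition for what a bandwidth such as sigma controls in spectral clustering, here is a minimal base-R sketch of the Ng, Jordan, and Weiss procedure. It is not the T4cluster implementation: the helper name njw_sketch is hypothetical, and the Gaussian kernel form exp(-d^2 / (2 * sigma^2)) is an assumption that may differ from how scNJW parameterizes sigma. The sketch calls stats::kmeans explicitly because T4cluster provides its own function named kmeans.

```r
# Hypothetical helper (not part of T4cluster): NJW-style spectral clustering.
njw_sketch <- function(X, k, sigma = 1) {
  D2 <- as.matrix(dist(X))^2                 # squared Euclidean distances
  A  <- exp(-D2 / (2 * sigma^2))             # Gaussian affinity (assumed kernel form)
  diag(A) <- 0                               # no self-affinity
  dinv <- 1 / sqrt(rowSums(A))               # diagonal of D^{-1/2}
  L  <- A * outer(dinv, dinv)                # D^{-1/2} A D^{-1/2}
  # eigen() returns eigenvalues in decreasing order, so the first k
  # columns are the eigenvectors of the k largest eigenvalues
  V  <- eigen(L, symmetric = TRUE)$vectors[, 1:k, drop = FALSE]
  Y  <- V / sqrt(rowSums(V^2))               # rescale each row to unit length
  stats::kmeans(Y, centers = k, nstart = 10)$cluster
}

# Toy data: two well-separated Gaussian blobs in the plane
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
truth <- rep(1:2, each = 20)
cl <- njw_sketch(X, k = 2, sigma = 1)
print(table(cl, truth))                      # cross-tabulate found vs. true labels
```

A very small sigma makes most affinities vanish and can leave points numerically isolated, while a very large one blurs distinct groups together, so the value usually needs tuning to the scale of the data.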

# visualize
par(mfrow=c(2,2))
plot(data, pch=19, xlab="", ylab="", col=run1$cluster, main="k-means")
plot(data, pch=19, xlab="", ylab="", col=run2$cluster, main="k-means++")
plot(data, pch=19, xlab="", ylab="", col=run3$cluster, main="gmm")
plot(data, pch=19, xlab="", ylab="", col=run4$cluster, main="scNJW")
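
Besides visual inspection, each result can be scored against the true labels. The package index lists compare.rand and compare.adjrand for this purpose. Since their exact signatures are not shown here, the following is a self-contained base-R sketch of the adjusted Rand index; the helper name adj_rand is hypothetical and this is not the package's own code.

```r
# Hypothetical helper (not the T4cluster implementation): adjusted Rand index
# between two label vectors of equal length.
adj_rand <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab   <- table(x, y)                        # contingency table of the two labelings
  comb2 <- function(m) m * (m - 1) / 2        # number of unordered pairs among m items
  sum_ij   <- sum(comb2(tab))                 # pairs grouped together in both labelings
  sum_a    <- sum(comb2(rowSums(tab)))        # pairs grouped together in x
  sum_b    <- sum(comb2(colSums(tab)))        # pairs grouped together in y
  expected <- sum_a * sum_b / comb2(length(x))
  maximum  <- (sum_a + sum_b) / 2
  # Note: undefined (NaN) when both labelings put everything in one cluster
  (sum_ij - expected) / (maximum - expected)
}

# Identical partitions with permuted label names score 1
ari_same <- adj_rand(c(1, 1, 2, 2), c(2, 2, 1, 1))
# Partitions that split every pair differently score below 0
ari_diff <- adj_rand(c(1, 1, 2, 2), c(1, 2, 1, 2))
print(c(ari_same, ari_diff))
```

With the objects from the example above, a call such as adj_rand(label, run1$cluster) would score the k-means result against the true SMILEY labels; values near 1 indicate close agreement.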

  • Monthly Downloads: 205
  • Version: 0.1.2
  • License: MIT + file LICENSE
  • Maintainer: Kisung You
  • Last Published: August 16th, 2021

Functions in T4cluster (0.1.2)

  • compare.adjrand: (+) Adjusted Rand Index
  • LSR: Least Squares Regression
  • SSQP: Subspace Segmentation via Quadratic Programming
  • compare.rand: (+) Rand Index
  • LRR: Low-Rank Representation
  • EKSS: Ensembles of K-Subspaces
  • dpmeans: DP-Means Clustering
  • MSM: Bayesian Mixture of Subspaces of Different Dimensions
  • LRSC: Low-Rank Subspace Clustering
  • SSC: Sparse Subspace Clustering
  • genLP: Generate Line and Plane Example with Fixed Number of Components
  • gskmeans: Geodesic Spherical K-Means
  • genSMILEY: Generate SMILEY Data
  • household: Load 'household' data
  • sc05Z: Spectral Clustering by Zelnik-Manor and Perona (2005)
  • funkmeans03A: Functional K-Means Clustering by Abraham et al. (2003)
  • funhclust: Functional Hierarchical Clustering
  • sc09G: Spectral Clustering by Gu and Wang (2009)
  • kmeans: K-Means Clustering
  • kmeans18B: K-Means Clustering with Lightweight Coreset
  • gen3S: Generate from Three 5-dimensional Subspaces in 200-dimensional Space
  • gmm11R: Regularized GMM by Ruan et al. (2011)
  • gmm16G: Weighted GMM by Gebru et al. (2016)
  • gmm: Finite Gaussian Mixture Model
  • scSM: Spectral Clustering by Shi and Malik (2000)
  • sc12L: Spectral Clustering by Li and Guo (2012)
  • gmm03F: Ensemble of Gaussian Mixtures with Random Projection
  • scNJW: Spectral Clustering by Ng, Jordan, and Weiss (2002)
  • genDONUTS: Generate Nested Donuts
  • kmeanspp: K-Means++ Clustering
  • pcm: Compute Pairwise Co-occurrence Matrix
  • sc10Z: Spectral Clustering by Zhang et al. (2010)
  • sc11Y: Spectral Clustering by Yang et al. (2011)
  • scUL: Spectral Clustering with Unnormalized Laplacian
  • psm: Compute Posterior Similarity Matrix
  • predict.MSM: S3 method to predict class label of new data with 'MSM' object
  • spkmeans: Spherical K-Means Clustering