gama: a Genetic Approach to Maximize clustering criteria in R

We present an R package that performs hard partitional clustering guided by a user-specified cluster validation criterion. The algorithm obtains high cluster validation indices when applied to datasets that contain superellipsoid clusters. It can estimate the number of partitions for a given dataset either by automatically inferring the elbow in the within-cluster sum of squared errors (WCSSE) graph or by a broad search across 24 cluster validation criteria. The package ships six built-in datasets for experimentation: two are in-house datasets collected from real executions of distributed machine learning algorithms on Spark clusters, and the other four are well-known datasets used to benchmark clustering problems.
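
The elbow idea mentioned above can be illustrated with base R alone. The sketch below is not the package's implementation: it uses `kmeans` as a stand-in partitioner on synthetic data, and the "farthest point from the chord" rule is just one common elbow heuristic, chosen here as an assumption. All variable names are illustrative.

```r
# Illustrative only: locate the elbow of a WCSSE curve with base R.
set.seed(42)

# Synthetic data: three well-separated 2-D Gaussian blobs, 50 points each.
centers <- rbind(c(0, 0), c(6, 6), c(0, 6))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centers[i, 1], 0.5), rnorm(50, centers[i, 2], 0.5))
}))

# WCSSE for k = 1..8 (k-means is a stand-in, not the genetic search).
ks <- 1:8
wcsse <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)

# Rescale both axes to [0, 1] so that neither dominates the distance.
kx <- (ks - min(ks)) / (max(ks) - min(ks))
wy <- (wcsse - min(wcsse)) / (max(wcsse) - min(wcsse))

# Distance of every point to the chord joining the first and last points.
n  <- length(ks)
x1 <- kx[1]; y1 <- wy[1]
x2 <- kx[n]; y2 <- wy[n]
chord_dist <- abs((y2 - y1) * kx - (x2 - x1) * wy + x2 * y1 - y2 * x1) /
  sqrt((y2 - y1)^2 + (x2 - x1)^2)

# The elbow is taken as the k whose point lies farthest from the chord.
elbow_k <- ks[which.max(chord_dist)]
print(elbow_k)
```

On data with clusters this clearly separated, the heuristic should recover the number of blobs; on real data the curve is usually smoother and the choice of heuristic matters more.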

Install

install.packages('gama')

Monthly Downloads

8

Version

1.0.3

License

GPL (>= 2)

Maintainer

Jairson Rodrigues

Last Published

February 26th, 2019

Functions in gama (1.0.3)

print.gama

Prints results of a Gama clustering.

cpu.als

CPU usage metrics for distributed ALS algorithm.

path.based

Circular cluster.

aggregation

Synthetic dataset of two-dimensional points.

cpu.pca

CPU usage metrics for distributed PCA algorithm.

compound

Synthetic dataset of two-dimensional points.

gama.how.many.k

Estimates the optimal number of partitions.

gama.plot.partitions

Plots results of a Gama clustering.

flame

DNA microarray data.

gama

Segments a dataset by using genetic search.
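
To give a feel for what "genetic search" over a clustering criterion can look like, here is a minimal base-R sketch. It is not the package's code and does not call the package: the chromosome encoding (a flat vector of centroid coordinates), the operators (tournament selection, arithmetic crossover, Gaussian mutation, single-elite survival), the parameter values, and the choice of average silhouette width as the fitness are all assumptions made for illustration.

```r
# Illustrative only: a tiny genetic search that maximizes the average
# silhouette width of a nearest-centroid partition. Base R, no packages.
set.seed(1)

# Synthetic data: two 2-D blobs, 30 points each.
x <- rbind(cbind(rnorm(30, 0, 0.4), rnorm(30, 0, 0.4)),
           cbind(rnorm(30, 4, 0.4), rnorm(30, 4, 0.4)))
k <- 2
d <- ncol(x)
dmat <- as.matrix(dist(x))  # pairwise distances, reused by the fitness

# A chromosome is a vector of k * d centroid coordinates (row-major).
assign_points <- function(chrom) {
  cen <- matrix(chrom, nrow = k, ncol = d, byrow = TRUE)
  d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, cen[j, ])^2))
  max.col(-d2, ties.method = "first")  # index of the nearest centroid
}

# Fitness: average silhouette width; empty clusters get the worst score.
avg_silhouette <- function(labels) {
  if (length(unique(labels)) < k) return(-1)
  s <- sapply(seq_along(labels), function(i) {
    own <- labels == labels[i]
    own[i] <- FALSE
    if (!any(own)) return(0)  # singleton cluster
    a <- mean(dmat[i, own])
    b <- min(sapply(setdiff(seq_len(k), labels[i]),
                    function(j) mean(dmat[i, labels == j])))
    (b - a) / max(a, b)
  })
  mean(s)
}
fitness <- function(chrom) avg_silhouette(assign_points(chrom))

pop_size <- 20
generations <- 30
lo <- apply(x, 2, min)
hi <- apply(x, 2, max)

# Initial population: random centroids inside the data's bounding box.
pop <- t(replicate(pop_size, runif(k * d, rep(lo, k), rep(hi, k))))
best_history <- numeric(generations)

for (g in seq_len(generations)) {
  fit <- apply(pop, 1, fitness)
  best_history[g] <- max(fit)
  elite <- pop[which.max(fit), ]  # the best individual survives unchanged
  pick <- function() {            # binary tournament selection
    cand <- sample(pop_size, 2)
    pop[cand[which.max(fit[cand])], ]
  }
  children <- t(replicate(pop_size - 1, {
    p1 <- pick()
    p2 <- pick()
    w <- runif(1)
    child <- w * p1 + (1 - w) * p2  # arithmetic crossover
    mutate <- runif(k * d) < 0.1    # per-gene mutation mask
    child[mutate] <- child[mutate] + rnorm(sum(mutate), sd = 0.3)
    child
  }))
  pop <- rbind(elite, children)
}

fit <- apply(pop, 1, fitness)
best <- pop[which.max(fit), ]
labels <- assign_points(best)
print(table(labels))
```

Because the elite individual is carried over unchanged and the fitness is deterministic, the best fitness per generation never decreases. Swapping `avg_silhouette` for another validation index changes what the search optimizes without touching the rest of the loop.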