KMeansTrainer: K-Means Trainer

Description

Trains an unsupervised K-Means clustering model. It uses the mini-batch k-means function from the ClusterR package, which is written in C++ and is therefore quite fast.
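To make the mini-batch idea concrete, here is a minimal base R sketch of one simplified variant: each iteration draws a small random batch and moves the nearest centroid toward each sampled point using a per-centroid learning rate. It is illustrative only and is not ClusterR's C++ implementation; the function names `mini_batch_kmeans` and `assign_clusters` are hypothetical, and the sketch uses random initial centroids rather than kmeans++.

```r
# Simplified mini-batch k-means in base R (illustrative sketch, not ClusterR's code).
mini_batch_kmeans <- function(X, k, batch_size = 10, max_iters = 100, seed = 1) {
  set.seed(seed)
  X <- as.matrix(X)
  # Random rows as starting centroids (an assumption; ClusterR defaults to kmeans++).
  centroids <- X[sample(nrow(X), k), , drop = FALSE]
  counts <- numeric(k)  # number of points each centroid has absorbed so far
  for (iter in seq_len(max_iters)) {
    batch <- X[sample(nrow(X), batch_size), , drop = FALSE]
    for (i in seq_len(nrow(batch))) {
      x <- batch[i, ]
      # Squared Euclidean distance from x to every centroid.
      d <- rowSums(sweep(centroids, 2, x)^2)
      j <- which.min(d)
      counts[j] <- counts[j] + 1
      eta <- 1 / counts[j]  # learning rate shrinks as the centroid sees more points
      centroids[j, ] <- (1 - eta) * centroids[j, ] + eta * x
    }
  }
  centroids
}

# Assign each row of X to its nearest centroid.
assign_clusters <- function(X, centroids) {
  X <- as.matrix(X)
  apply(X, 1, function(x) which.min(rowSums(sweep(centroids, 2, x)^2)))
}

X <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 10), ncol = 2))
centroids <- mini_batch_kmeans(X, k = 2)
labels <- assign_clusters(X, centroids)
```

Because each iteration touches only `batch_size` rows, the cost per iteration does not grow with the size of the full data set, which is the main reason mini-batch k-means scales well.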

Usage

KMeansTrainer

Format

R6Class object.

Usage

For usage details see Methods, Arguments and Examples sections.

kmt = KMeansTrainer$new(clusters, batch_size = 10, num_init = 1, max_iters = 100,
                        init_fraction = 1, initializer = "kmeans++", early_stop_iter = 10,
                        verbose = FALSE, centroids = NULL, tol = 1e-04, tol_optimal_init = 0.3,
                        seed = 1, max_clusters = NA)
kmt$fit(X_train, y_train = NULL)
prediction <- kmt$predict(X_test)

Methods

$new(): Initialises an instance of the k-means model
$fit(): Fits the model to the input training data
$predict(): Returns the predicted cluster for each row of the given data

Arguments

params: For an explanation of the parameters, refer to the documentation of the MiniBatchKmeans function in the ClusterR package: https://CRAN.R-project.org/package=ClusterR
find_optimal: Used to find the optimal number of clusters during the fit method. To use it, make sure that max_clusters > 0.
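Since the parameters are documented in ClusterR, it can help to see them on the underlying function itself. The sketch below calls `MiniBatchKmeans` and `predict_MBatchKMeans` from ClusterR directly, with argument names matching those of `KMeansTrainer$new()`. It assumes ClusterR is installed; that KMeansTrainer passes its arguments through in exactly this way is an assumption, not something this page states.

```r
library(ClusterR)

set.seed(1)
# Two well-separated groups of 100 rows with 20 features each.
X <- rbind(matrix(rnorm(2000, mean = 2), ncol = 20),
           matrix(rnorm(2000, mean = -1), ncol = 20))

# Same argument names as KMeansTrainer$new(); all values except
# clusters and batch_size are the defaults shown in the Usage section.
mb <- MiniBatchKmeans(X, clusters = 2, batch_size = 30, num_init = 1,
                      max_iters = 100, init_fraction = 1,
                      initializer = "kmeans++", early_stop_iter = 10,
                      verbose = FALSE, tol = 1e-04, tol_optimal_init = 0.3,
                      seed = 1)

# Assign every row to its nearest fitted centroid.
labels <- predict_MBatchKMeans(X, mb$centroids)
```

Here `mb$centroids` holds one row per cluster and one column per feature, so it can be inspected or reused to label new data.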

Examples

# NOT RUN {
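library(superml)  # assumed to be the package that provides KMeansTrainer

# Simulate 30,000 rows x 20 columns drawn from three normal distributions
data <- rbind(replicate(20, rnorm(1e4, 2)),
              replicate(20, rnorm(1e4, -1)),
              replicate(20, rnorm(1e4, 5)))

# Fit with a fixed number of clusters; find_optimal = FALSE skips the search
km_model <- KMeansTrainer$new(clusters = 2, batch_size = 30, max_clusters = 6)
km_model$fit(data, find_optimal = FALSE)
predictions <- km_model$predict(data)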
# }