The input tibble with an additional column
containing the cluster assignments as a factor.
The name of the new column is prefixed with "cls_".
The fit result is stored in the attribute stats.kmeans.fit of the new column.
The names of the items used for clustering are stored in the attribute stats.kmeans.items.
The clustering diagnostics (Within-Cluster and Between-Cluster Sum of Squares) are stored in the attribute stats.kmeans.wss.
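A minimal sketch of how these attributes could be read back with base::attr, assuming they sit on the cluster column itself as described above. The data frame `result`, the item columns `item_a` and `item_b`, and the column name `cls_item` are built by hand purely for illustration; the internal structure of the stored values (a kmeans object, a character vector, a list) is likewise an assumption, not the documented output.

```r
# Hypothetical stand-in for the function's output: a data frame whose
# cluster column carries the three documented attributes.
set.seed(1)
items <- data.frame(item_a = rnorm(20), item_b = rnorm(20))
fit <- stats::kmeans(base::scale(items), centers = 2)

result <- items
result$cls_item <- factor(fit$cluster)
attr(result$cls_item, "stats.kmeans.fit") <- fit
attr(result$cls_item, "stats.kmeans.items") <- names(items)
attr(result$cls_item, "stats.kmeans.wss") <- list(
  wss = fit$tot.withinss,  # within-cluster sum of squares (assumed layout)
  bss = fit$betweenss      # between-cluster sum of squares (assumed layout)
)

# Reading the attributes back from the cluster column
fit_stored   <- attr(result$cls_item, "stats.kmeans.fit")
items_stored <- attr(result$cls_item, "stats.kmeans.items")
wss_stored   <- attr(result$cls_item, "stats.kmeans.wss")
```

Note that many data manipulation steps drop custom attributes from a column, so it is safest to read them directly after clustering.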
Arguments
data
A data frame.
cols
A tidy selection of item columns.
newcol
Name of the new cluster column as a character vector.
Set to NULL (default) to automatically build a name
from the common column prefix, prefixed with "cls_".
k
Number of clusters to calculate.
Set to NULL to output a scree plot for up to 10 clusters
and automatically choose the number of clusters based on the elbow criterion.
The within-cluster sums of squares for the scree plot are calculated by
stats::kmeans.
method
The clustering method as a character value. Currently, only "kmeans" is supported.
All items are scaled before performing the cluster analysis using
base::scale.
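The steps described for k = NULL can be sketched in plain R: scale the items with base::scale, compute the within-cluster sum of squares with stats::kmeans for 1 to 10 clusters, and pick the elbow. How the function actually locates the elbow is not stated here, so the rule below (the largest second difference of the scree curve) is an assumption, as are the example data, `nstart`, and `iter.max`.

```r
# Illustrative data: three well-separated groups measured on two items
set.seed(42)
items <- data.frame(
  item_a = c(rnorm(30, 0), rnorm(30, 5), rnorm(30, 10)),
  item_b = c(rnorm(30, 0), rnorm(30, 5), rnorm(30, 0))
)

# Scale all items (mean 0, standard deviation 1) before clustering
scaled <- base::scale(items)

# Total within-cluster sum of squares for k = 1..10
ks <- 1:10
wss <- sapply(ks, function(k) {
  stats::kmeans(scaled, centers = k, nstart = 10, iter.max = 50)$tot.withinss
})

# Assumed elbow heuristic: the k where the curve bends most sharply,
# i.e. the largest second difference of the WSS values.
# Element i of the second difference is centred on k = i + 1.
bend <- diff(wss, differences = 2)
k_elbow <- ks[which.max(bend) + 1]

# Scree plot and final fit with the chosen number of clusters
plot(ks, wss, type = "b",
     xlab = "Number of clusters", ylab = "Within-cluster sum of squares")
fit <- stats::kmeans(scaled, centers = k_elbow, nstart = 10, iter.max = 50)
clusters <- factor(fit$cluster)
```

Scaling matters here because k-means relies on Euclidean distances: without it, items with a larger numeric range would dominate the cluster solution.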