auto_grouping(data, input, target, n_groups, model = "kmeans", seed = 999)
Arguments
data
data frame source
input
string naming the categorical variable in data to be re-grouped
target
string naming the target variable used to optimize the re-grouping
n_groups
number of groups for the new category based on input, normally between 3 and 10.
model
clustering model used to create the grouping. Supported models: "kmeans" (default) or "hclust" (hierarchical clustering).
seed
optional, random seed used internally by k-means; changing this value can change the resulting model
Value
A list containing 3 elements: recateg_results, which contains the description of the target variable for the new groups;
df_equivalence, a data frame mapping each input category to its new category; and fit_cluster, the cluster model used to do the re-grouping.
# NOT RUN {
# Reducing quantity of countries based on has_flu variable
auto_grouping(data=data_country, input='country', target="has_flu", n_groups=8)
# }
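The idea behind the re-grouping can be illustrated with a small base R sketch. It is not the internals of auto_grouping, only an assumption about the general approach: summarize each category by its target rate, cluster those summaries with k-means, and build an equivalence table. The data and the names toy, city, flag and city_group are hypothetical.

```r
# Hypothetical toy data: 6 categories, 50 rows each, with a fixed number of
# positive cases per category so the example is fully reproducible.
n_pos <- c(a = 2, b = 4, c = 15, d = 17, e = 35, f = 36)
toy <- do.call(rbind, lapply(names(n_pos), function(ct) {
  data.frame(city = ct,
             flag = rep(c(1, 0), times = c(n_pos[[ct]], 50 - n_pos[[ct]])))
}))

# Profile each category by the rate of the target variable.
rate <- tapply(toy$flag, toy$city, mean)
profile <- data.frame(city = names(rate), rate = as.numeric(rate))

# Cluster the categories on their target rate. The seed fixes the random
# starting centers of k-means; nstart repeats the fit and keeps the best one.
set.seed(999)
fit <- kmeans(profile$rate, centers = 3, nstart = 25)

# Equivalence table: original category next to its new group.
# (Group labels are arbitrary cluster numbers, not an ordering.)
equivalence <- data.frame(city = profile$city,
                          city_group = paste0("group_", fit$cluster))
print(equivalence)
```

Categories with similar target rates (here a and b, c and d, e and f) should land in the same group. Replacing kmeans() with cutree(hclust(dist(profile$rate)), k = 3) would give the hierarchical variant of the same sketch.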