creditmodel (version 1.0)

customer_segmentation: Customer Segmentation

Description

customer_segmentation clusters customers and finds the best segmentation variable.

Usage

customer_segmentation(dat, x_list = NULL, ex_cols = NULL,
  cluster_control = list(meth = "Kmeans", kc = 2, nstart = 1, epsm =
  0.000001, sf = 2, max_iter = 100), tree_control = list(cv_folds = 5,
  maxdepth = kc + 1, minbucket = nrow(dat)/(kc + 1)), save_data = FALSE,
  file_name = NULL, dir_path = tempdir())

Arguments

dat

A data.frame containing only predictor variables.

x_list

A list of x variables. Default is NULL.

ex_cols

A list of excluded variables. Default is NULL.

cluster_control

A list of controls for clustering.

  • meth Clustering method. Two methods are provided: "Kmeans" and "FCM" (fuzzy c-means). Default is "Kmeans".

  • kc Number of cluster centers (default is 2).

  • nstart Number of random groups (default is 1).

  • max_iter Maximum number of iterations (default is 100).
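
To give a feel for what the clustering controls govern, here is a minimal sketch that uses base R's kmeans() on the built-in iris data. It is not creditmodel's internal code. The mapping of kc, nstart, and max_iter onto the centers, nstart, and iter.max arguments of kmeans() is an assumption made for illustration only.

```r
# Illustrative only: base R kmeans(), not the internals of customer_segmentation().
set.seed(42)                        # make the random starts reproducible
x <- scale(iris[, 1:4])             # numeric predictors only, standardized

fit <- kmeans(x,
              centers  = 2,         # assumed analogue of kc
              nstart   = 1,         # assumed analogue of nstart
              iter.max = 100)       # assumed analogue of max_iter

table(fit$cluster)                  # number of observations per cluster
```

Raising nstart usually makes the result less sensitive to an unlucky random start, at the cost of extra computation.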

tree_control

A list of controls for the decision tree used to find the best segment variable.

  • cv_folds Number of cross-validation folds (default is 5).

  • maxdepth Maximum depth of the tree (default is kc + 1).

  • minbucket Minimum number of observations in any terminal (leaf) node (default is nrow(dat) / (kc + 1)).
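
The idea of using a tree to pick a segment variable can be sketched as follows: fit a classification tree that predicts the cluster label from the predictors, then look at which variable the tree splits on first. The sketch uses the rpart package and the built-in iris data. Whether customer_segmentation relies on rpart internally, and the mapping of cv_folds to rpart's xval, are assumptions. The names seg and tree_dat are hypothetical.

```r
# Illustrative only: a classification tree fitted on cluster labels.
library(rpart)

set.seed(42)
kc  <- 2
x   <- iris[, 1:4]
seg <- factor(kmeans(scale(x), centers = kc)$cluster)   # cluster labels as the target
tree_dat <- data.frame(x, seg = seg)

tree <- rpart(seg ~ ., data = tree_dat, method = "class",
              control = rpart.control(
                xval      = 5,                                  # assumed analogue of cv_folds
                maxdepth  = kc + 1,                             # mirrors the documented default
                minbucket = round(nrow(tree_dat) / (kc + 1))))  # mirrors the documented default

# Variable used at the root node ("<leaf>" if the tree made no split)
as.character(tree$frame$var[1])
```

A large minbucket forces every leaf to hold a sizeable share of the data, so the tree can only report splits that produce broad segments.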

save_data

Logical. If TRUE, save the segmentation file to the folder specified by dir_path. Default is FALSE.

file_name

The name of the periodically saved segmentation file. Default is NULL.

dir_path

The path for the periodically saved segmentation file. Default is tempdir().

Value

A data.frame containing the clustering results.

References

Bezdek, James C. "FCM: The fuzzy c-means clustering algorithm." Computers & Geosciences (ISSN 0098-3004). https://doi.org/10.1016/0098-3004(84)90020-7

Examples

# NOT RUN {
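# Cluster the first 10,000 rows of lendingclub (columns 40 to 50) into two
# groups with fuzzy c-means, excluding columns that match the ex_cols pattern.
clust <- customer_segmentation(dat = lendingclub[1:10000, 40:50],
                              x_list = NULL, ex_cols = "id$|loan_status",
                              cluster_control = list(meth = "FCM", kc = 2),
                              tree_control = list(minbucket = round(nrow(lendingclub) / 10)),
                              file_name = NULL, dir_path = tempdir())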
# }
