Features are sorted in descending order of the relevance value obtained by applying a specific heuristic. Next, features are distributed into N clusters following a card-dealing methodology. Finally, the distribution with the highest homogeneity is selected as the best distribution.
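The three steps above can be illustrated with a minimal base R sketch. It is not the package's implementation: the relevance scores, the helper names deal_features() and homogeneity(), and the homogeneity measure itself (negative variance of the per-cluster relevance totals) are all assumptions made for illustration.

```r
# Hypothetical relevance scores, one per feature (illustrative values only).
relevance <- c(f1 = 0.90, f2 = 0.10, f3 = 0.75, f4 = 0.40, f5 = 0.55, f6 = 0.20)

# Deal features into k clusters, highest relevance first, the way cards are
# dealt around a table: feature 1 to cluster 1, feature 2 to cluster 2, ...
deal_features <- function(relevance, k) {
  ordered <- sort(relevance, decreasing = TRUE)
  slot <- rep_len(seq_len(k), length(ordered))
  unname(split(names(ordered), slot))
}

# ASSUMED homogeneity measure: clusters whose relevance totals are closer to
# each other score higher (negative variance of the totals).
homogeneity <- function(clusters, relevance) {
  totals <- vapply(clusters, function(fs) sum(relevance[fs]), numeric(1))
  -stats::var(totals)
}

# Build one candidate distribution per cluster count and keep the most
# homogeneous one as the "best" distribution.
candidates <- lapply(2:3, function(k) deal_features(relevance, k))
scores <- vapply(candidates, homogeneity, numeric(1), relevance = relevance)
best <- candidates[[which.max(scores)]]
```

Dealing in relevance order spreads the strongest features across clusters instead of concentrating them in one group, which is why the resulting groups tend to be balanced.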
D2MCS::GenericClusteringStrategy
-> SimpleStrategy
new()
Method for initializing the object arguments during runtime.
SimpleStrategy$new(
subset,
heuristic,
configuration = StrategyConfiguration$new()
)
subset
The Subset used to apply the feature-clustering strategy.
heuristic
The heuristic used to compute the relevance of each feature. Must inherit from the GenericHeuristic abstract class.
configuration
Optional parameter to customize the configuration parameters of the strategy. Must inherit from the StrategyConfiguration abstract class.
execute()
Function responsible for performing the clustering strategy over the defined Subset.
SimpleStrategy$execute(verbose = FALSE)
verbose
A logical value to specify if more verbosity is needed.
getBestClusterDistribution()
The function obtains the best clustering distribution.
SimpleStrategy$getBestClusterDistribution()
A list of clusters. Each list element represents a feature group.
getUnclustered()
The function is used to return the features that cannot be clustered due to incompatibilities with the used heuristic.
SimpleStrategy$getUnclustered()
A character vector containing the unclustered features.
getDistribution()
Function used to obtain a specific cluster distribution.
SimpleStrategy$getDistribution(
num.clusters = NULL,
num.groups = NULL,
include.unclustered = FALSE
)
num.clusters
A numeric value to select the number of clusters (defines the distribution).
num.groups
A single numeric value or a numeric vector identifying the specific group or groups that form the clustering distribution.
include.unclustered
A logical value to determine if unclustered features should be included.
A list with the features comprising a specific clustering distribution.
createTrain()
The function is used to create a Trainset object from a specific clustering distribution.
SimpleStrategy$createTrain(
subset,
num.clusters = NULL,
num.groups = NULL,
include.unclustered = FALSE
)
subset
The Subset object used as a basis to create the train set (see Trainset class).
num.clusters
A numeric value to select the number of clusters (defines the distribution).
num.groups
A single numeric value or a numeric vector identifying the specific group or groups that form the clustering distribution.
include.unclustered
A logical value to determine if unclustered features should be included.
If num.clusters and num.groups are not defined, the best clustering distribution is used to create the train set.
A Trainset object.
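The methods documented so far can be chained as in the following sketch. It only uses the signatures listed above; the wrapper name run_simple_strategy() is hypothetical, and the subset and heuristic arguments are assumed to be a Subset object and an object inheriting from GenericHeuristic that were created elsewhere. Calling the wrapper requires the D2MCS package to be installed.

```r
# Hypothetical convenience wrapper (not part of D2MCS).
# `subset`    : assumed to be a D2MCS Subset object built elsewhere.
# `heuristic` : assumed to inherit from the GenericHeuristic abstract class.
run_simple_strategy <- function(subset, heuristic, verbose = FALSE) {
  # The default StrategyConfiguration is used because `configuration` is omitted.
  strategy <- D2MCS::SimpleStrategy$new(subset = subset, heuristic = heuristic)
  strategy$execute(verbose = verbose)

  list(
    best = strategy$getBestClusterDistribution(),
    unclustered = strategy$getUnclustered(),
    # num.clusters and num.groups are left as NULL, so the best clustering
    # distribution is used to build the train set.
    train = strategy$createTrain(subset = subset)
  )
}
```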
plot()
The function is responsible for creating a plot to visualize the clustering distribution.
SimpleStrategy$plot(dir.path = NULL, file.name = NULL)
dir.path
An optional argument to define the name of the directory where the exported plot will be saved. If not defined, the file path will be automatically assigned to the current working directory, 'getwd()'.
file.name
A character string defining the name of the PDF file to which the plot is exported.
saveCSV()
The function is used to save the clustering distribution to a CSV file.
SimpleStrategy$saveCSV(dir.path, name = NULL, num.clusters = NULL)
dir.path
The name of the directory in which the CSV file is saved.
name
Defines the name of the CSV file.
num.clusters
An optional parameter to select the number of clusters to be saved. If not defined, all cluster distributions will be saved.
clone()
The objects of this class are cloneable with this method.
SimpleStrategy$clone(deep = FALSE)
deep
Whether to make a deep clone.
The strategy is suitable for all features that are valid for the indicated heuristic. Invalid features are automatically grouped into a specific cluster named 'unclustered'.
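A minimal base R sketch of this separation is shown below. It assumes that a feature counts as invalid when the heuristic yields a non-finite score (NA or NaN) for it; that criterion, the feature names, and the score values are illustrative assumptions rather than documented package behaviour.

```r
# Hypothetical heuristic output: NA/NaN marks features the heuristic could not
# evaluate (ASSUMED criterion for "invalid").
scores <- c(age = 0.42, gender = NA, weight = 0.87, id = NaN)

# Features with a usable score go on to be clustered ...
valid <- scores[is.finite(scores)]

# ... while the rest are set aside under the 'unclustered' label.
unclustered <- names(scores)[!is.finite(scores)]

distribution <- list(clustered = names(valid), unclustered = unclustered)
```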
GenericClusteringStrategy, StrategyConfiguration