ctrl: CTRL model for multi-label classification

Description

Create a binary relevance with ConTRolled Label correlation exploitation (CTRL) model for multilabel classification.

Usage

ctrl(
  mdata,
  base.algorithm = getOption("utiml.base.algorithm", "SVM"),
  m = 5,
  validation.size = 0.3,
  validation.threshold = 0.3,
  ...,
  predict.params = list(),
  cores = getOption("utiml.cores", 1),
  seed = getOption("utiml.seed", NA)
)

Arguments

mdata

A mldr dataset used to train the binary models.

base.algorithm

A string with the name of the base algorithm. (Default: getOption("utiml.base.algorithm", "SVM"))

m

The maximum number of Binary Relevance models used in the binary ensemble. (Default: 5)

validation.size

The size of the validation set, used internally to prune error-prone class labels. The value must be between 0.1 and 0.5. (Default: 0.3)

validation.threshold

Thresholding parameter determining whether any class label in Y is regarded as error-prone or not. (Default: 0.3)

...

Other arguments passed to the base algorithm for all subproblems.

predict.params

A list of default arguments passed to the predictor algorithm. (Default: list())

cores

The number of cores used to parallelize the training. Values higher than 1 require the parallel package. (Default: getOption("utiml.cores", 1))

seed

An optional integer used to set the seed. This is useful when the method is run in parallel. (Default: getOption("utiml.seed", NA))

Value

An object of class CTRLmodel containing the set of fitted models, including:

rounds: The value passed in the m parameter
validation.size: The value passed in the validation.size parameter
validation.threshold: The value passed in the validation.threshold parameter
Y: Names of the labels less susceptible to error, according to the validation process
R: List of the labels in Y that are closely related to each label, obtained by using a feature selection technique
models: A list of the generated models. For each label, a list of models is built based on its closely related labels.
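The components listed above can be read directly from the fitted object. The following is a minimal sketch, assuming that utiml and FSelector are installed and that the returned object can be accessed like a list with `$`; the guard simply skips the snippet when either package is missing.

```r
# Inspect the components of a fitted CTRL model.
# Assumption: the CTRLmodel object is list-like, so `$` access works.
if (requireNamespace("utiml", quietly = TRUE) &&
    requireNamespace("FSelector", quietly = TRUE)) {
  library(utiml)

  model <- ctrl(toyml, "RANDOM", m = 3, seed = 123)

  print(model$rounds)                # value passed in m
  print(model$validation.size)       # value passed in validation.size
  print(model$validation.threshold)  # value passed in validation.threshold
  print(model$Y)                     # labels kept after validation
  str(model$R)                       # closely related labels per label
  print(length(model$models))        # number of per-label model lists
}
```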

Details

CTRL employs a two-stage filtering procedure to exploit label correlations in a controlled manner. In the first stage, error-prone class labels are pruned from Y to generate the candidate label set for correlation exploitation. In the second stage, classification models are built for each class label by exploiting its closely-related labels in the candidate label set.
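The two stages can be illustrated with a small base R sketch. It is not the package implementation: the label matrix and validation scores are made-up toy values, the scores are assumed to be higher-is-better, and absolute correlation is used as a simple stand-in for the feature selection step.

```r
# Illustrative sketch only: toy numbers, and absolute correlation stands in
# for the feature-selection-based relevance used by the real method.
set.seed(42)

# Toy label matrix: 200 instances, 5 binary labels (hypothetical data).
n <- 200
labels <- matrix(rbinom(n * 5, 1, 0.4), nrow = n,
                 dimnames = list(NULL, paste0("y", 1:5)))

# Hypothetical per-label validation scores (assumed higher is better).
validation.score <- c(y1 = 0.72, y2 = 0.15, y3 = 0.48, y4 = 0.29, y5 = 0.61)
validation.threshold <- 0.3
m <- 2

# Stage 1: prune error-prone labels to obtain the candidate label set Y.
Y <- names(validation.score)[validation.score > validation.threshold]

# Stage 2: for each label, keep at most m closely related labels from Y.
R <- lapply(colnames(labels), function(label) {
  candidates <- setdiff(Y, label)
  relevance <- sapply(candidates, function(other) {
    abs(cor(labels[, other], labels[, label]))
  })
  names(sort(relevance, decreasing = TRUE))[seq_len(min(m, length(candidates)))]
})
names(R) <- colnames(labels)

print(Y)
str(R)
```

In this sketch, every label (including the pruned ones) still receives a model, but only labels in Y are eligible to serve as extra inputs for other labels.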

Dependencies: The degree of label correlation is estimated via supervised feature selection techniques. Thus, this implementation uses the relief method available in the FSelector package.
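Because of this dependency, it may help to confirm that FSelector is available before calling ctrl. A minimal check using base R only (the variable name has.fselector is just illustrative):

```r
# Check whether the FSelector package is installed before calling ctrl().
has.fselector <- requireNamespace("FSelector", quietly = TRUE)

if (!has.fselector) {
  message("FSelector is not installed; try install.packages(\"FSelector\").")
}
```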

References

Li, Y., & Zhang, M. (2014). Enhancing Binary Relevance for Multi-label Learning with Controlled Label Correlations Exploitation. In 13th Pacific Rim International Conference on Artificial Intelligence (pp. 91-103). Gold Coast, Australia.

Examples

# NOT RUN {
model <- ctrl(toyml, "RANDOM")
pred <- predict(model, toyml)

# Change default values and use 4 CORES
model <- ctrl(toyml, 'C5.0', m = 10, validation.size = 0.4,
              validation.threshold = 0.5, cores = 4)

# Use seed
model <- ctrl(toyml, 'RF', cores = 4, seed = 123)

# Set parameters for all subproblems
# (dataset$train is assumed to be an mldr training set defined elsewhere)
model <- ctrl(dataset$train, 'KNN', k=5)
# }