utiml (version 0.1.6)

baseline: Baseline reference for multilabel classification

Description

Create a baseline model for multilabel classification.

Usage

baseline(
  mdata,
  metric = c("general", "F1", "hamming-loss", "subset-accuracy", "ranking-loss"),
  ...
)

Arguments

mdata

An mldr dataset used as training data for the baseline.

metric

Define the strategy used to predict the labels.

The possible values are: 'general', 'F1', 'hamming-loss', 'subset-accuracy' or 'ranking-loss'. See the Details section for more information. (Default: 'general').

...

Not used.

Value

An object of class BASELINEmodel containing the set of fitted models, including:

labels

A vector with the label names.

predict

A list with the labels that will be predicted.

Details

Baseline is a naive multi-label classifier that maximizes or minimizes a specific measure without inducing a learning model. It uses only general information about the labels in the training dataset to estimate the labels of a test dataset.

The following strategies are available:

general

Predict the k most frequent labels, where k is the integer closest to the label cardinality.

F1

Predict the most frequent labels that obtain the best F1 measure on the training data. (In the original paper, the authors use the less frequent labels instead.)

hamming-loss

Predict the labels that are associated with more than 50% of instances.

subset-accuracy

Predict the most common labelset.

ranking-loss

Predict a ranking based on the most frequent labels.
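
The strategies above can be illustrated with a minimal base-R sketch. It is not the utiml implementation: the label matrix Y, the helper f1_for_k and all variable names are hypothetical, and the F1 step assumes an example-based F1 averaged over instances, which is only one possible reading of "best F1 measure".

```r
# Hypothetical toy label matrix: 6 instances (rows) x 4 labels (columns).
Y <- matrix(
  c(1, 0, 0, 0,
    1, 0, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 1, 0, 0,
    1, 1, 1, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("y1", "y2", "y3", "y4"))
)

# Label counts, sorted from most to least frequent.
freq <- sort(colSums(Y), decreasing = TRUE)

# general: the k most frequent labels, with k = rounded label cardinality
# (label cardinality = average number of labels per instance).
cardinality <- mean(rowSums(Y))
general <- names(freq)[seq_len(round(cardinality))]

# hamming-loss: labels associated with more than 50% of the instances.
hamming <- names(which(colMeans(Y) > 0.5))

# subset-accuracy: the most common labelset, encoded here as a 0/1 string.
labelsets <- apply(Y, 1, paste, collapse = "")
counts <- table(labelsets)
top <- names(counts)[which.max(counts)]
subset_acc <- colnames(Y)[strsplit(top, "")[[1]] == "1"]

# ranking-loss: a ranking of all labels by frequency.
ranking <- names(freq)

# F1 (assumed reading): try the top-k labels for every k and keep the k
# with the best example-based F1 averaged over the training instances.
f1_for_k <- function(k) {
  pred <- names(freq)[seq_len(k)]
  hits <- rowSums(Y[, pred, drop = FALSE])  # correctly predicted labels
  mean(2 * hits / (k + rowSums(Y)))
}
scores <- vapply(seq_along(freq), f1_for_k, numeric(1))
f1 <- names(freq)[seq_len(which.max(scores))]
```

On this toy data the strategies do not all agree: general and F1 select y1 and y2, while hamming-loss and subset-accuracy select only y1, because y2 appears in exactly half of the instances and the most common labelset contains y1 alone.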

References

Metz, J., Abreu, L. F. de, Cherman, E. A., & Monard, M. C. (2012). On the Estimation of Predictive Evaluation Measure Baselines for Multi-label Learning. In 13th Ibero-American Conference on AI (pp. 189-198). Cartagena de Indias, Colombia.

Examples

# NOT RUN {
model <- baseline(toyml)
pred <- predict(model, toyml)

## Change the metric
model <- baseline(toyml, "F1")
model <- baseline(toyml, "subset-accuracy")
# }