mlr3measures (version 1.0.0)

bacc: Balanced Accuracy

Description

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Usage

bacc(truth, response, sample_weights = NULL, ...)

Value

Performance value as numeric(1).

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Meta Information

  • Type: "classif"

  • Range: \([0, 1]\)

  • Minimize: FALSE

  • Required prediction: response

Details

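Balanced accuracy is a weighted accuracy suited to imbalanced data sets: every class has the same influence on the score, regardless of how many observations it contains. Without sample weights it reduces to the mean of the per-class recalls. The definition follows the one used in sklearn.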

Let \(t_i\) and \(r_i\) denote the true and the predicted label of observation \(i\), for \(i = 1, \dots, n\). First, all sample weights \(w_i\) are normalized per class so that each class has the same influence:

$$ \hat{w}_i = \frac{w_i}{\sum_{j=1}^n w_j \cdot \mathbf{1}(t_j = t_i)}. $$

The Balanced Accuracy is then calculated as

$$ \frac{1}{\sum_{i=1}^n \hat{w}_i} \sum_{i=1}^n \hat{w}_i \cdot \mathbf{1}(r_i = t_i). $$

This definition is equivalent to acc() with class-balanced sample weights.
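The two formulas above can be traced by hand in base R. The following is only a minimal sketch under stated assumptions: bacc_manual() is a hypothetical helper written for illustration, not a function of mlr3measures and not the package's actual implementation, and it assumes that every class present in truth has a positive total weight.

```r
# Hypothetical helper for illustration only (not part of mlr3measures).
# Assumes truth and response are factors with identical levels and that
# each class occurring in truth has a positive total weight.
bacc_manual = function(truth, response, sample_weights = NULL) {
  stopifnot(length(truth) == length(response))
  if (is.null(sample_weights)) {
    sample_weights = rep(1, length(truth))  # equal weights by default
  }
  # Denominator of w_hat_i: total weight of the true class of observation i
  class_totals = ave(sample_weights, truth, FUN = sum)
  w_hat = sample_weights / class_totals
  # Weighted share of correct predictions under the class-balanced weights
  sum(w_hat * (response == truth)) / sum(w_hat)
}

lvls = c("a", "b")
truth = factor(c("a", "a", "a", "b"), levels = lvls)
response = factor(c("a", "a", "b", "b"), levels = lvls)

# Class "a": 2 of 3 correct, class "b": 1 of 1 correct
bacc_manual(truth, response)  # (2/3 + 1) / 2 = 0.8333333
mean(response == truth)       # plain accuracy for comparison: 0.75
```

In this toy data set the majority class "a" pulls plain accuracy down to 0.75, while the class-balanced weights give both classes the same influence, so the result equals the mean of the two per-class recalls.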

References

Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010). “The Balanced Accuracy and Its Posterior Distribution.” In 2010 20th International Conference on Pattern Recognition. doi:10.1109/icpr.2010.764.

Guyon I, Bennett K, Cawley G, Escalante HJ, Escalera S, Ho TK, Macia N, Ray B, Saeed M, Statnikov A, Viegas E (2015). “Design of the 2015 ChaLearn AutoML challenge.” In 2015 International Joint Conference on Neural Networks (IJCNN). doi:10.1109/ijcnn.2015.7280767.

See Also

Other Classification Measures: acc(), ce(), logloss(), mauc_aunu(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
bacc(truth, response)