prauc: Area Under the Precision-Recall Curve

Description

Measure to compare true observed labels with predicted probabilities in binary classification tasks.

Usage

prauc(truth, prob, positive, na_value = NaN, ...)

Value

Performance value as numeric(1).

Arguments

truth: (factor())
True (observed) labels. Must have exactly two levels and the same length as prob.
prob: (numeric())
Predicted probability for the positive class. Must have exactly the same length as truth.
positive: (character(1))
Name of the positive class.
na_value: (numeric(1))
Value that should be returned if the measure is not defined for the input (as described in Details). Default is NaN.
...: (any)
Additional arguments. Currently ignored.

Meta Information

Type: "binary"
Range: [0, 1]
Minimize: FALSE
Required prediction: prob

Details

Computes the area under the Precision-Recall curve (PRC). The PRC plots precision against recall (sensitivity) across classification thresholds, and is considered more appropriate than the ROC curve for imbalanced datasets. The AUC-PRC is computed by integrating the piecewise function.

This measure is undefined if the true values are either all positive or all negative.
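
To make the curve and the undefined case concrete, the following is a minimal base R sketch. The helpers pr_points() and step_area() are hypothetical names introduced here for illustration and are not part of the package. The step-wise sum (often called average precision) is a simplification assumed for this sketch, so its result will generally differ from the piecewise integration that prauc() performs.

```r
# Hypothetical helper (not part of the package): precision and recall
# at every distinct probability threshold.
pr_points = function(truth, prob, positive) {
  stopifnot(is.factor(truth), nlevels(truth) == 2L, length(truth) == length(prob))
  is_pos = truth == positive
  # The curve is undefined if only one class is observed.
  if (all(is_pos) || !any(is_pos)) {
    return(NULL)
  }
  thresholds = sort(unique(prob), decreasing = TRUE)
  # True and false positives when predicting "positive" for prob >= threshold.
  tp = vapply(thresholds, function(t) sum(prob >= t & is_pos), integer(1))
  fp = vapply(thresholds, function(t) sum(prob >= t & !is_pos), integer(1))
  data.frame(
    threshold = thresholds,
    recall = tp / sum(is_pos),
    precision = tp / (tp + fp)
  )
}

# Hypothetical helper: step-wise area, i.e. precision weighted by the
# recall gained at each threshold (a simplification, see text above).
step_area = function(pts) {
  sum(diff(c(0, pts$recall)) * pts$precision)
}

truth = factor(c("a", "a", "a", "b"))
prob = c(.6, .7, .1, .4)
pts = pr_points(truth, prob, "a")
print(pts)
print(step_area(pts))

# Only positive labels observed: no curve can be computed.
only_pos = factor(c("a", "a"), levels = c("a", "b"))
print(is.null(pr_points(only_pos, c(.6, .7), "a")))
```

In the last call no negative observation exists, so the sketch returns NULL; this is the situation in which prauc() returns na_value.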

References

Davis J, Goadrich M (2006). “The relationship between precision-recall and ROC curves.” In Proceedings of the 23rd International Conference on Machine Learning. ISBN 9781595933836.

Examples

library(mlr3measures)

# prob holds the predicted probability of the positive class "a"
truth = factor(c("a", "a", "a", "b"))
prob = c(.6, .7, .1, .4)
prauc(truth, prob, "a")
