The AUC measures how well positive cases are ranked above negative ones. It is
simply the area under the ROC (receiver operating characteristic) curve.
The AUC estimation of this function follows the procedure described by
Hand & Till (2001). The AUC_roc estimated with the trapezoid approach is
equivalent to the average of recall and specificity (Powers, 2011), which in
turn is equivalent to the balanced accuracy (balacc):
\(AUC_{roc} = \frac{recall - FPR + 1}{2} = \frac{recall + specificity}{2} = 1 - \frac{FPR + FNR}{2}\)
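To make this equivalence concrete, below is a minimal sketch (in Python, not the package's own code) that computes the trapezoid AUC for hard-class binary predictions and checks it against the balanced accuracy; the labels and the helper name auc_binary are illustrative assumptions:

```python
# A minimal sketch: for hard-class binary predictions the ROC "curve" has a
# single operating point (FPR, TPR), and the trapezoidal area under the
# segments (0,0)-(FPR,TPR)-(1,1) reduces to (recall + specificity) / 2.

def auc_binary(obs, pred):
    """obs, pred: lists of 0/1 labels (1 = positive class)."""
    tp = sum(1 for o, p in zip(obs, pred) if o == 1 and p == 1)
    fn = sum(1 for o, p in zip(obs, pred) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(obs, pred) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(obs, pred) if o == 0 and p == 1)
    recall = tp / (tp + fn)            # TPR
    specificity = tn / (tn + fp)       # 1 - FPR
    fpr = 1 - specificity
    # Trapezoid rule over the two segments (0,0)-(fpr,recall)-(1,1):
    trapezoid = fpr * recall / 2 + (1 - fpr) * (recall + 1) / 2
    assert abs(trapezoid - (recall + specificity) / 2) < 1e-12
    return trapezoid

obs  = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(auc_binary(obs, pred))  # 0.75 = (recall 0.75 + specificity 0.75) / 2
```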
Interpretation: the AUC is equivalent to the probability that a randomly chosen
case from a given class (positive for binary) will have a smaller estimated
probability of belonging to the other class (negative for binary) than a
randomly chosen member of that other class.
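This ranking interpretation can be verified directly by comparing all (positive, negative) pairs. The sketch below uses made-up probabilities and a hypothetical helper auc_rank; it scores each pair 1 if the positive case receives the higher estimated probability of the positive class (equivalently, the smaller estimated probability of the negative class), and 0.5 on ties:

```python
# A minimal sketch of the probabilistic reading: the AUC equals the fraction
# of (positive, negative) pairs ranked correctly by the estimated
# probabilities, i.e. the Mann-Whitney U statistic scaled to [0, 1].
from itertools import product

def auc_rank(obs, prob):
    """obs: 0/1 labels; prob: estimated probability of the positive class."""
    pos = [p for o, p in zip(obs, prob) if o == 1]
    neg = [p for o, p in zip(obs, prob) if o == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp, pn in product(pos, neg))
    return wins / (len(pos) * len(neg))

obs  = [1, 1, 0, 0, 1, 0]
prob = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(auc_rank(obs, prob))  # 0.888...: 8 of the 9 pairs are ranked correctly
```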
Values: the AUC is bounded between 0 and 1; the closer to 1, the better.
Values close to 0 indicate inaccurate predictions. An AUC = 0.5 suggests no
discrimination ability between classes; 0.7 < AUC < 0.8 is considered acceptable,
0.8 < AUC < 0.9 is considered excellent, and AUC > 0.9 is outstanding
(Mandrekar, 2010).
For multiclass cases, the AUC is equivalent to the average of the AUC of each
class (Hand & Till, 2001).
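One way to read "the AUC of each class" is a one-vs-rest AUC per class; the sketch below averages those, which is an assumption for illustration (Hand & Till's M statistic additionally averages over class pairs). Data, class names, and helpers are hypothetical:

```python
# A minimal one-vs-rest sketch of the multiclass average.
from itertools import product

def auc_one_vs_rest(binary_obs, prob):
    """Pairwise ranking AUC for 0/1 labels vs. estimated probabilities."""
    pos = [p for o, p in zip(binary_obs, prob) if o == 1]
    neg = [p for o, p in zip(binary_obs, prob) if o == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp, pn in product(pos, neg))
    return wins / (len(pos) * len(neg))

def auc_multiclass(obs, probs, classes):
    """obs: class labels; probs: dict mapping class -> estimated probabilities."""
    per_class = [auc_one_vs_rest([1 if o == c else 0 for o in obs], probs[c])
                 for c in classes]
    return sum(per_class) / len(per_class)   # unweighted average over classes

obs = ["a", "b", "c", "a", "b", "c"]
probs = {"a": [0.8, 0.1, 0.2, 0.6, 0.7, 0.1],
         "b": [0.1, 0.7, 0.3, 0.2, 0.5, 0.3],
         "c": [0.1, 0.2, 0.5, 0.2, 0.2, 0.6]}
print(auc_multiclass(obs, probs, ["a", "b", "c"]))  # ~0.958
```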
Finally, the AUC is directly related to the Gini index (a.k.a. G1), since
Gini + 1 = 2*AUC (Hand & Till, 2001).
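For example, an AUC of 0.75 corresponds to Gini = 2(0.75) - 1 = 0.5, and
conversely Gini = 0.5 recovers AUC = (0.5 + 1)/2 = 0.75.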
For the formula and more details, see the online-documentation.