A list containing:

 - a tibble with summarized results (called summarized_metrics)
 - a tibble with random evaluations (random_evaluations)
 - a tibble with the summarized class level results
   (summarized_class_level_results)
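
For orientation, here is a minimal sketch of creating such an object and
accessing the three tibbles. It assumes the cvms::baseline() interface and
uses a small, hypothetical three-class test set:

  library(cvms)
  library(dplyr)

  # Hypothetical three-class test set
  set.seed(1)
  test_data <- dplyr::tibble(
    target = factor(sample(c("class_1", "class_2", "class_3"),
                           size = 30, replace = TRUE))
  )

  # Create the multinomial baseline evaluations
  bsl <- baseline(
    test_data = test_data,
    dependent_col = "target",
    family = "multinomial",
    n = 10
  )

  # The three elements of the returned list
  bsl$summarized_metrics
  bsl$random_evaluations
  bsl$summarized_class_level_results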
....................................................................
Macro metrics

Based on the generated predictions, one-vs-all (binomial) evaluations
are performed and aggregated to get the following macro metrics:
Balanced Accuracy, F1, Sensitivity, Specificity,
Positive Predictive Value, Negative Predictive Value, Kappa,
Detection Rate, Detection Prevalence, and Prevalence.
In general, the metrics mentioned in binomial_metrics() can be enabled
as macro metrics (excluding MCC, AUC, Lower CI, Upper CI, and the
AIC/AICc/BIC metrics). These metrics also have a weighted average
version.
N.B. we also refer to the one-vs-all evaluations as the class level results.
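
To make the distinction concrete, here is a minimal, self-contained sketch
of the difference between a macro metric and its weighted average version,
using hypothetical per-class scores:

  # A macro metric is the unweighted mean of the one-vs-all scores;
  # the weighted version weights each class by its number of
  # observations (the Support). Both vectors below are hypothetical.
  f1_per_class <- c(0.82, 0.75, 0.91)  # one-vs-all F1 scores
  support <- c(10, 12, 8)              # observations per class
  macro_f1 <- mean(f1_per_class)
  weighted_f1 <- weighted.mean(f1_per_class, w = support)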
Multiclass metrics

In addition, the Overall Accuracy and multiclass MCC metrics are
computed. Multiclass AUC can be enabled but is slow to calculate with
many classes.
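
As a sketch, enabling additional metrics presumably goes through the
metrics argument; the metric names used here ("Accuracy", "Weighted F1",
"AUC") are assumptions based on the metrics described above:

  # Sketch: enabling an extra macro metric, a weighted version,
  # and multiclass AUC (slow with many classes)
  bsl_extra <- baseline(
    test_data = test_data,
    dependent_col = "target",
    family = "multinomial",
    n = 10,
    metrics = list(
      "Accuracy" = TRUE,      # from binomial_metrics(), as macro metric
      "Weighted F1" = TRUE,   # weighted average version of F1
      "AUC" = TRUE            # multiclass AUC
    )
  )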
....................................................................
The Summarized Results tibble contains:

A summary of the random evaluations.
How: The one-vs-all binomial evaluations are aggregated by repetition
and summarized. Besides the metrics from the binomial evaluations, it
also includes Overall Accuracy and multiclass MCC.

The Measure column indicates the statistical descriptor used on the
evaluations. The Mean, Median, SD, IQR, Max, Min, NAs, and INFs
measures describe the Random Evaluations tibble, while the CL_Max,
CL_Min, CL_NAs, and CL_INFs measures describe the Class Level results.

The rows where Measure == All_<<class name>> are the evaluations when
all the observations are predicted to be in that class.
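
Continuing the sketch from above, selected descriptors could be pulled out
with a simple filter (the class name in All_class_1 refers to the
hypothetical test set):

  # Sketch: extract selected descriptors from the summary
  bsl$summarized_metrics %>%
    dplyr::filter(Measure %in% c("Mean", "SD", "All_class_1"))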
....................................................................
The Summarized Class Level Results tibble contains:

The (nested) summarized results for each class, with the same metrics
and descriptors as the Summarized Results tibble. Use tidyr::unnest on
the tibble to inspect the results.
How: The one-vs-all evaluations are summarized by class.

The rows where Measure == All_0 are the evaluations when none of the
observations are predicted to be in that class, while the rows where
Measure == All_1 are the evaluations when all of the observations are
predicted to be in that class.
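
For example (continuing the sketch; the name of the nested column,
Results, is an assumption):

  # Sketch: unnest the per-class summaries for inspection
  bsl$summarized_class_level_results %>%
    tidyr::unnest(cols = "Results")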
....................................................................
The Random Evaluations tibble contains:

The repetition results with the same metrics as the Summarized Results
tibble.
How: The one-vs-all evaluations are aggregated by repetition. If a
metric contains one or more NAs in the one-vs-all evaluations, it will
lead to an NA result for that repetition.
Also includes:

 - A nested tibble with the one-vs-all binomial evaluations (Class
   Level Results), including nested Confusion Matrices and the Support
   column, which is a count of how many observations from the class
   are in the test set.
 - A nested tibble with the predictions and targets.
 - A list of ROC curve objects.
 - A nested tibble with the multiclass confusion matrix.
 - A nested Process information object with information about the
   evaluation.
 - The name of the dependent variable.
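
Continuing the sketch, the nested objects might be accessed per repetition
like so (column names such as Class Level Results, Predictions, ROC, and
Confusion Matrix are assumptions):

  # Sketch: inspect the nested objects for the first repetition
  evals <- bsl$random_evaluations
  evals$`Class Level Results`[[1]]  # one-vs-all results incl. Support
  evals$Predictions[[1]]            # predictions and targets
  evals$ROC[[1]]                    # ROC curve object(s)
  evals$`Confusion Matrix`[[1]]     # multiclass confusion matrix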