most_challenging: Find the data points that were hardest to predict

Description

[Experimental]

Finds the data points that, overall, were the most challenging to predict, based on a prediction metric.

Usage

most_challenging(
  data,
  type,
  obs_id_col = "Observation",
  target_col = "Target",
  prediction_cols = ifelse(type == "gaussian", "Prediction", "Predicted Class"),
  threshold = 0.15,
  threshold_is = "percentage",
  metric = NULL,
  cutoff = 0.5
)

Value

data.frame with the most challenging observations and their metrics.

`>=` / `<=` denotes the threshold as score.

Arguments

data

data.frame with predictions, targets and observation IDs. Can be grouped by dplyr::group_by().

Predictions can be passed as values, predicted classes or predicted probabilities:

N.B. Adds .Machine$double.eps to all probabilities to avoid log(0).

Multinomial

When `type` is "multinomial", the predictions can be passed in one of two formats.

Probabilities (Preferable)

One column per class with the probability of that class. The columns should have the name of their class, as they are named in the target column. E.g.:

class_1	class_2	class_3	target
0.269	0.528	0.203	class_2
0.368	0.322	0.310	class_3
0.375	0.371	0.254	class_2
...	...	...	...
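As a minimal sketch of this format, the data frame below reuses the example values from this section, with one probability column per class named exactly like the labels in the target column. The object names `multinom_preds` and `class_cols` and the `Observation` IDs are illustrative assumptions, not part of the package.

```r
# Hypothetical data in the probability format described above
multinom_preds <- data.frame(
  Observation = 1:3,
  class_1 = c(0.269, 0.368, 0.375),
  class_2 = c(0.528, 0.322, 0.371),
  class_3 = c(0.203, 0.310, 0.254),
  target = c("class_2", "class_3", "class_2"),
  stringsAsFactors = FALSE
)

# Names of the probability columns (one per class)
class_cols <- c("class_1", "class_2", "class_3")

# Every target label should have a matching probability column
labels_match <- all(multinom_preds$target %in% class_cols)

# Each row of probabilities should sum to (approximately) 1
row_sums <- rowSums(multinom_preds[, class_cols])
```

With data shaped like this, the class columns would be passed as `prediction_cols = class_cols` together with `target_col = "target"`.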

Classes

A single column of type character with the predicted classes. E.g.:

prediction	target
class_2	class_2
class_1	class_3
class_1	class_2
...	...

Binomial

When `type` is "binomial", the predictions can be passed in one of two formats.

Probabilities (Preferable)

One column with the probability of the observation belonging to the second class alphabetically ("dog" if the classes are "cat" and "dog"). E.g.:

prediction	target
0.769	"dog"
0.368	"dog"
0.375	"cat"
...	...

Note: For the alphabetical ordering, the class labels are treated as type character, which is why e.g. "100" would come before "7".
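The ordering rule can be checked in base R. This is only an illustration of character sorting; the variable names are hypothetical.

```r
# Class labels are compared as character strings, not as numbers
labels <- c(7, 100)

numeric_order <- sort(labels)                  # 7 comes first
character_order <- sort(as.character(labels))  # "100" comes first

# With labels "cat" and "dog", "dog" is second alphabetically,
# so the probability column is read as the probability of "dog"
positive_class <- sort(c("dog", "cat"))[2]
```

If numeric-looking labels need a particular ordering, renaming them (for example with a zero-padded prefix) before evaluation is one way to make the second class explicit.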

Classes

A single column of type character with the predicted classes. E.g.:

prediction	target
class_0	class_1
class_1	class_1
class_1	class_0
...	...

Gaussian

When `type` is "gaussian", the predictions should be passed as one column with the predicted values. E.g.:

prediction	target
28.9	30.2
33.2	27.1
23.4	21.3
...	...

type

Type of task used to get the predictions:

"gaussian" for regression (like linear regression).

"binomial" for binary classification.

"multinomial" for multiclass classification.

obs_id_col

Name of column with observation IDs. This will be used to aggregate the performance of each observation.

target_col

Name of column with the true classes/values in `data`.

prediction_cols

Name(s) of column(s) with the predictions.

threshold

Threshold to filter observations by. Depends on `type` and `threshold_is`.

The threshold can either be a percentage or a score. For percentages, a lower threshold returns fewer observations. For scores, this depends on `type`.

Gaussian

threshold_is "percentage"

(Approximate) percentage of the observations with the largest root mean square errors to return.

threshold_is "score"

Observations with a root mean square error larger than or equal to the threshold will be returned.
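The two Gaussian threshold modes can be sketched in base R. This is not the package's implementation: the quantile-based cut for the percentage mode is an assumption, and names such as `rmse_per_obs` are hypothetical.

```r
set.seed(1)

# Hypothetical predictions: 10 observations, each predicted 3 times
df <- data.frame(
  Observation = rep(1:10, times = 3),
  Target = rnorm(30, mean = 25, sd = 5),
  Prediction = rnorm(30, mean = 27, sd = 7)
)

# Squared error of each individual prediction
df$sq_err <- (df$Target - df$Prediction)^2

# Root mean square error per observation ID
rmse_per_obs <- aggregate(sq_err ~ Observation, data = df,
                          FUN = function(x) sqrt(mean(x)))
names(rmse_per_obs)[2] <- "RMSE"

# threshold_is = "score": keep observations with RMSE >= threshold
by_score <- rmse_per_obs[rmse_per_obs$RMSE >= 9, ]

# threshold_is = "percentage": keep roughly the top 20% by RMSE
# (assumption: a quantile cut; the package may select differently)
cut_point <- quantile(rmse_per_obs$RMSE, probs = 1 - 0.2)
by_percentage <- rmse_per_obs[rmse_per_obs$RMSE >= cut_point, ]
```

The percentage mode is described as approximate because the number of observations rarely divides evenly by the requested share.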

Binomial, Multinomial

threshold_is "percentage"

(Approximate) percentage of the observations to return with:

MAE, Cross Entropy: Highest error scores.

Accuracy: Lowest accuracies

threshold_is "score"

MAE, Cross Entropy: Observations with an error score above or equal to the threshold will be returned.

Accuracy: Observations with an accuracy below or equal to the threshold will be returned.

threshold_is

Either "score" or "percentage". See `threshold`.

metric

The metric to use. If NULL, the default metric depends on the format of the prediction columns.

Binomial, Multinomial

"Accuracy", "MAE" or "Cross Entropy".

When one prediction column with predicted classes is passed, the default is "Accuracy". In this configuration, the other metrics are not calculated.

When one or more prediction columns with predicted probabilities are passed, the default is "MAE". This is the Mean Absolute Error of the probability of the target class.
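To make the probability-based metrics concrete, here is a hedged base R sketch of the per-observation MAE of the target-class probability, alongside a cross entropy term. It assumes the error of a prediction is one minus the probability given to the true class and that cross entropy is the negative log of that probability; the data and names are hypothetical and the package's internals may differ.

```r
# Hypothetical predictions: one probability column per class
preds <- data.frame(
  Observation = c(1, 1, 2, 2),
  A = c(0.7, 0.6, 0.2, 0.1),
  B = c(0.3, 0.4, 0.8, 0.9),
  Target = c("A", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# Probability assigned to the true class in each row
idx <- cbind(seq_len(nrow(preds)), match(preds$Target, c("A", "B")))
p_target <- as.matrix(preds[, c("A", "B")])[idx]

# Absolute error of the target-class probability (ideal value is 1)
preds$abs_err <- abs(1 - p_target)

# Cross entropy term; double.eps guards against log(0)
preds$cross_entropy <- -log(p_target + .Machine$double.eps)

# Average per observation ID to get one score per observation
mae_per_obs <- aggregate(abs_err ~ Observation, data = preds, FUN = mean)
```

In this toy data, observation 2 has the higher MAE because one of its predictions gave only 0.2 to the true class.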

Gaussian

Ignored. Always uses "RMSE".

cutoff

Threshold for predicted classes. (Numeric)

N.B. Binomial only.
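As a rough illustration of how a cutoff turns probabilities into predicted classes, the sketch below assumes that probabilities strictly above the cutoff map to the second class alphabetically. Whether the package uses a strict or non-strict comparison is not stated here, so treat that detail as an assumption.

```r
# Hypothetical probabilities of the second class alphabetically ("dog")
prob_dog <- c(0.769, 0.368, 0.375)
cutoff <- 0.5

# Assumption: probabilities above the cutoff map to the second class
predicted_class <- ifelse(prob_dog > cutoff, "dog", "cat")
```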

Author

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Examples


# \donttest{
# Attach packages
library(cvms)
library(dplyr)

##
## Multinomial
##

# Find the most challenging data points (per classifier)
# in the predicted.musicians dataset
# which resembles the "Predictions" tibble from the evaluation results

# Passing predicted probabilities
# Observations with 30% highest MAE scores
most_challenging(
  predicted.musicians,
  obs_id_col = "ID",
  prediction_cols = c("A", "B", "C", "D"),
  type = "multinomial",
  threshold = 0.30
)

# Observations with 25% highest Cross Entropy scores
most_challenging(
  predicted.musicians,
  obs_id_col = "ID",
  prediction_cols = c("A", "B", "C", "D"),
  type = "multinomial",
  threshold = 0.25,
  metric = "Cross Entropy"
)

# Passing predicted classes
# Observations with 30% lowest Accuracy scores
most_challenging(
  predicted.musicians,
  obs_id_col = "ID",
  prediction_cols = "Predicted Class",
  type = "multinomial",
  threshold = 0.30
)

# The 40% lowest-scoring on accuracy per classifier
predicted.musicians %>%
  dplyr::group_by(Classifier) %>%
  most_challenging(
    obs_id_col = "ID",
    prediction_cols = "Predicted Class",
    type = "multinomial",
    threshold = 0.40
  )

# Accuracy scores at or below 0.05
most_challenging(
  predicted.musicians,
  obs_id_col = "ID",
  type = "multinomial",
  threshold = 0.05,
  threshold_is = "score"
)

##
## Binomial
##

# Subset the predicted.musicians
binom_data <- predicted.musicians %>%
  dplyr::filter(Target %in% c("A","B")) %>%
  dplyr::rename(Prediction = B)

# Passing probabilities
# Observations with 30% highest MAE
most_challenging(
  binom_data,
  obs_id_col = "ID",
  type = "binomial",
  prediction_cols = "Prediction",
  threshold = 0.30
)

# Observations with 30% highest Cross Entropy
most_challenging(
  binom_data,
  obs_id_col = "ID",
  type = "binomial",
  prediction_cols = "Prediction",
  threshold = 0.30,
  metric = "Cross Entropy"
)

# Passing predicted classes
# Observations with 30% lowest Accuracy scores
most_challenging(
  binom_data,
  obs_id_col = "ID",
  type = "binomial",
  prediction_cols = "Predicted Class",
  threshold = 0.30
)

##
## Gaussian
##

set.seed(1)

df <- data.frame(
  # rep() has no `n` argument; `times` repeats the 10 IDs three times
  "Observation" = rep(1:10, times = 3),
  "Target" = rnorm(n = 30, mean = 25, sd = 5),
  "Prediction" = rnorm(n = 30, mean = 27, sd = 7)
)

# The 20% highest RMSE scores
most_challenging(
  df,
  type = "gaussian",
  threshold = 0.2
)

# RMSE scores at or above 9
most_challenging(
  df,
  type = "gaussian",
  threshold = 9,
  threshold_is = "score"
)
# }
