mlr3 (version 0.1.1)

BenchmarkResult: Container for Results of benchmark()


This is the result container object returned by benchmark().

Note that all stored objects are accessed by reference. Do not modify any object without cloning it first.
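The warning above follows from R6 reference semantics. The sketch below illustrates it with a hypothetical toy class (`Counter` is not part of mlr3 and is used only for illustration): assigning an R6 object to a second name does not copy it, whereas `$clone()` produces an independent copy.

```r
library(R6)

# Hypothetical toy class, used only to illustrate R6 reference semantics
Counter = R6Class("Counter",
  public = list(
    value = 0,
    increment = function() {
      self$value = self$value + 1
      invisible(self)
    }
  )
)

original = Counter$new()
alias = original         # no copy: both names refer to the same object
copy = original$clone()  # independent copy

alias$increment()

original$value  # 1: changed through the alias
copy$value      # 0: unaffected by the modification
```

Note that `$clone()` is shallow by default; for objects that themselves contain R6 objects, `$clone(deep = TRUE)` is usually the safer choice.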



Format

R6::R6Class object.


Construction

bmr = BenchmarkResult$new(data)


Fields

  • data :: data.table::data.table() Internal data storage. Users are discouraged from working with this field directly.

  • tasks :: data.table::data.table() Table of used tasks with three columns: "task_hash" (character(1)), "task_id" (character(1)) and "task" (Task).

  • learners :: data.table::data.table() Table of used learners with three columns: "learner_hash" (character(1)), "learner_id" (character(1)) and "learner" (Learner).

  • resamplings :: data.table::data.table() Table of used resamplings with three columns: "resampling_hash" (character(1)), "resampling_id" (character(1)) and "resampling" (Resampling).


Methods

  • aggregate(measures = NULL, ids = TRUE, params = FALSE, warnings = FALSE, errors = FALSE) (list() of Measure, logical(1), logical(1), logical(1), logical(1)) -> data.table::data.table() Returns a result table where resampling iterations are aggregated together into ResampleResults. Arguments control the number of additional columns:

    • ids :: logical(1) Adds object ids ("task_id", "learner_id", "resampling_id") as extra character columns.

    • params :: logical(1) Adds the hyperparameter values as extra list column "params". You can unnest them with mlr3misc::unnest().

    • warnings :: logical(1) Adds the number of resampling iterations with at least one warning as extra integer column "warnings".

    • errors :: logical(1) Adds the number of resampling iterations with errors as extra integer column "errors".

  • performance(measures = NULL, ids = TRUE) (list() of Measure, logical(1)) -> data.table::data.table() Returns a table with one row per resampling iteration, including all involved objects. The provided performance measures are calculated and bound to the table as extra columns. If ids is TRUE, character columns with the object ids are added for convenient filtering.

  • best(measure) (Measure) -> ResampleResult Returns the ResampleResult with the best performance according to the given measure.

  • resample_result(hash) (character(1)) -> ResampleResult Retrieves the ResampleResult with the given hash.

  • combine(bmr) (BenchmarkResult) -> self Fuses a second BenchmarkResult into this one, mutating this BenchmarkResult in place.
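The methods listed above can be sketched on a small benchmark as follows. This is a minimal sketch that assumes the API exactly as documented on this page for mlr3 0.1.1 (including `expand_grid()` and `$performance()`); other releases may name these functions differently. The task, learner and measure ids are just convenient choices.

```r
library(mlr3)

# Small benchmark: one task, two learners, 3-fold cross-validation
tasks = mlr_tasks$mget("sonar")
learners = mlr_learners$mget(c("classif.featureless", "classif.rpart"))
resamplings = mlr_resamplings$get("cv3")
design = expand_grid(tasks = tasks, learners = learners, resamplings = resamplings)
bmr = benchmark(design)

# One row per task/learner/resampling combination,
# with counts of iterations that raised warnings or errors
agg = bmr$aggregate(measures = "classif.acc", warnings = TRUE, errors = TRUE)

# One row per resampling iteration
perf = bmr$performance(measures = "classif.acc")

# ResampleResult with the best accuracy
best_rr = bmr$best(mlr_measures$get("classif.acc"))
```

With one task, two learners and three folds, `agg` should have two rows and `perf` six.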

S3 Methods

Syntactic Sugar

The mlr3 package provides some shortcuts to ease the creation of its objects.

First, instead of an object, it is possible to pass a string identifier, which is used to look up the object in a mlr3misc::Dictionary.

Additionally, each task type has an associated default measure (stored in mlr_reflections) which is used as a fallback if no other measure is provided. Classification tasks default to the classification error in "classif.ce", regression tasks to the mean squared error in "regr.mse".
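As an illustration of the dictionary lookup described above, the following sketch retrieves objects explicitly by their string identifiers. It assumes the dictionaries `mlr_measures`, `mlr_tasks` and `mlr_resamplings` exported by mlr3; passing the string directly to a method is then a shortcut for this explicit lookup.

```r
library(mlr3)

# Explicit lookup of a measure by its string identifier
measure = mlr_measures$get("classif.acc")
measure$id

# The same pattern applies to the other dictionaries
task = mlr_tasks$get("sonar")
resampling = mlr_resamplings$get("cv3")
```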


Examples
library(mlr3)

tasks = mlr_tasks$mget(c("sonar", "spam"))
learners = mlr_learners$mget(c("classif.featureless", "classif.rpart"), predict_type = "prob")
resamplings = mlr_resamplings$get("cv3")
design = expand_grid(tasks = tasks, learners = learners, resamplings = resamplings)

bmr = benchmark(design)


# first 5 individual resamplings
head(bmr$performance(measures = c("classif.acc", "classif.auc")), 5)

# aggregate results
bmr$aggregate()

# aggregate results with hyperparameters as separate columns
mlr3misc::unnest(bmr$aggregate(params = TRUE), "params")

# extract resample result for classif.rpart
rr = bmr$aggregate()[learner_id == "classif.rpart", resample_result][[1]]

# access the confusion matrix of the first resampling iteration
