This function is a highly parallel version of [benchmark] using batchtools. Experiments are created in the provided registry for each combination of learners, tasks and resamplings, and the runs can then be started via [batchtools::submitJobs]. A job is one train/test split of the outer resampling. In case of nested resampling (e.g. with [makeTuneWrapper]), each job is a full run of the inner resampling, which can be parallelized in a second step with parallelMap. For details on usage and supported backends, have a look at the batchtools tutorial page: <https://github.com/mllg/batchtools>.
The general workflow with `batchmark` looks like this:
1. Create an ExperimentRegistry using [batchtools::makeExperimentRegistry].
2. Call `batchmark(...)`, which defines jobs for all learners and tasks in an [base::expand.grid] fashion.
3. Submit jobs using [batchtools::submitJobs].
4. Babysit the computation and wait for all jobs to finish using [batchtools::waitForJobs].
5. Call `reduceBatchmarkResults()` to reduce the results into a [BenchmarkResult].
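The five steps above can be sketched as follows. This is a minimal illustration, not a recommended setup: the temporary registry (`file.dir = NA`), the seed, the two learners, the example tasks `iris.task` and `sonar.task`, and the 2-fold cross-validation are all assumptions chosen to keep the run short. With the default batchtools configuration the jobs are assumed to run interactively in the current session.

```r
library(mlr)
library(batchtools)

# Step 1: create an experiment registry.
# file.dir = NA places it in a temporary directory (assumption for this sketch).
reg = makeExperimentRegistry(file.dir = NA, seed = 1)

# Illustrative learners, tasks and resampling (not prescribed by batchmark).
learners = list(makeLearner("classif.rpart"), makeLearner("classif.lda"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 2)

# Step 2: define the jobs. One job is one train/test split, so this
# sketch defines 2 learners x 2 tasks x 2 iterations = 8 jobs.
ids = batchmark(learners, tasks, rdesc, measures = list(mmce), reg = reg)

# Step 3: submit all jobs.
submitJobs(reg = reg)

# Step 4: block until every job has terminated.
waitForJobs(reg = reg)

# Step 5: collect the results into a BenchmarkResult.
bmr = reduceBatchmarkResults(reg = reg)
perf = getBMRAggrPerformances(bmr, as.df = TRUE)
print(perf)
```

On a cluster, the same script would typically differ only in the registry location and the cluster functions configured for batchtools; the calls to `batchmark` and `reduceBatchmarkResults` stay the same.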
If you want to use this with OpenML datasets, you can easily generate tasks from a vector of dataset IDs with `tasks = lapply(data.ids, function(x) convertOMLDataSetToMlr(getOMLDataSet(x)))`.
batchmark(learners, tasks, resamplings, measures, models = TRUE,
reg = batchtools::getDefaultRegistry())
learners: (list of Learner | character) Learning algorithms which should be compared; can also be a single learner. If you pass strings, the learners will be created via makeLearner.
tasks: (list of Task) Tasks that learners should be run on.
resamplings: ((list of) [ResampleDesc]) Resampling strategy for each task. If only one is provided, it will be replicated to match the number of tasks. If missing, a 10-fold cross validation is used.
measures: (list of Measure) Performance measures for all tasks. If missing, the default measure of the first task is used.
models: (logical(1)) Should all fitted models be stored in the ResampleResult? Default is TRUE.
reg: ([batchtools::Registry]) Registry, created by [batchtools::makeExperimentRegistry]. If not explicitly passed, uses the last created registry.
Returns a [data.table]. The generated job ids are stored in the column “job.id”.
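Because the job ids come back as a table, they can be passed on to batchtools to submit only part of the benchmark. The sketch below assumes a temporary registry, a single illustrative learner and task, and a 3-fold cross-validation; submitting the first two rows is an arbitrary choice for demonstration.

```r
library(mlr)
library(batchtools)

# Temporary registry for the sketch (assumption).
reg = makeExperimentRegistry(file.dir = NA, seed = 1)

# One learner, one task, 3 resampling iterations: 3 jobs in total.
ids = batchmark(
  learners = list(makeLearner("classif.rpart")),
  tasks = list(iris.task),
  resamplings = makeResampleDesc("CV", iters = 3),
  reg = reg
)

# The generated ids are in the column "job.id".
print(ids$job.id)

# Submit only the first two jobs and wait for them.
submitJobs(ids[1:2], reg = reg)
waitForJobs(reg = reg)

# batchtools::findDone() lists the jobs that have finished successfully.
done = findDone(reg = reg)
print(done)
```

The remaining jobs stay in the registry and can be submitted later, for example with `submitJobs(findNotSubmitted(reg = reg), reg = reg)`.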
Other benchmark: BenchmarkResult, benchmark, convertBMRToRankMatrix, friedmanPostHocTestBMR, friedmanTestBMR, generateCritDifferencesData, getBMRAggrPerformances, getBMRFeatSelResults, getBMRFilteredFeatures, getBMRLearnerIds, getBMRLearnerShortNames, getBMRLearners, getBMRMeasureIds, getBMRMeasures, getBMRModels, getBMRPerformances, getBMRPredictions, getBMRTaskDescs, getBMRTaskIds, getBMRTuneResults, plotBMRBoxplots, plotBMRRanksAsBarChart, plotBMRSummary, plotCritDifferences, reduceBatchmarkResults