mlr3

Package website: release | dev

Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.

Resources (for users and developers)

Installation

Install the latest release from CRAN:

install.packages("mlr3")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3")

Example

Constructing Learners and Tasks

library(mlr3)

# create learning task
task_iris <- TaskClassif$new(id = "iris", backend = iris, target = "Species")
task_iris
## <TaskClassif:iris> (150 x 5)
## * Target: Species
## * Properties: multiclass
## * Features (4):
##   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

# load learner and set hyperparameter
learner <- lrn("classif.rpart", cp = .01)
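
As a complement to the explicit constructor above, the following is a minimal sketch of building the same kind of objects through the sugar functions and inspecting them. It assumes the predefined `iris` task from the task dictionary; the `maxdepth` value is an arbitrary choice for illustration.

```r
library(mlr3)

# Retrieve the predefined iris task from the task dictionary.
task <- tsk("iris")

# Construct the learner, then inspect and update its hyperparameters.
learner <- lrn("classif.rpart", cp = 0.01)
learner$param_set$values$cp
learner$param_set$values$maxdepth <- 3L  # arbitrary value, for illustration only

# Query basic task metadata.
task$nrow
task$feature_names
task$target_names
```

Hyperparameters live in the learner's parameter set, so they can be changed after construction without rebuilding the learner.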

Basic train + predict

# train/test split
train_set <- sample(task_iris$nrow, 0.8 * task_iris$nrow)
test_set <- setdiff(seq_len(task_iris$nrow), train_set)

# train the model
learner$train(task_iris, row_ids = train_set)

# predict data
prediction <- learner$predict(task_iris, row_ids = test_set)

# calculate performance
prediction$confusion
##             truth
## response     setosa versicolor virginica
##   setosa         11          0         0
##   versicolor      0         12         1
##   virginica       0          0         6

measure <- msr("classif.acc")
prediction$score(measure)
## classif.acc 
##   0.9666667
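
The split above is random, so the numbers will differ between runs. A minimal sketch of a reproducible variant follows; the seed is an arbitrary assumption, and the example also converts the prediction to a table and scores it with two measures at once.

```r
library(mlr3)

set.seed(42)  # arbitrary seed, chosen only to make the split reproducible
task <- tsk("iris")
learner <- lrn("classif.rpart")

# 80/20 train/test split on row ids
train_set <- sample(task$nrow, 0.8 * task$nrow)
test_set <- setdiff(seq_len(task$nrow), train_set)

learner$train(task, row_ids = train_set)
prediction <- learner$predict(task, row_ids = test_set)

# One row per test observation, including truth and response
pred_table <- as.data.table(prediction)
head(pred_table)

# Score with several measures in a single call
scores <- prediction$score(msrs(c("classif.acc", "classif.ce")))
scores
```

Accuracy and classification error sum to one, which makes for a quick sanity check on the scores.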

Resample

# automatic resampling
resampling <- rsmp("cv", folds = 3L)
rr <- resample(task_iris, learner, resampling)
rr$score(measure)
##             task task_id               learner    learner_id     resampling
## 1: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart <ResamplingCV>
## 2: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart <ResamplingCV>
## 3: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart <ResamplingCV>
##    resampling_id iteration prediction classif.acc
## 1:            cv         1     <list>        0.92
## 2:            cv         2     <list>        0.92
## 3:            cv         3     <list>        0.94

rr$aggregate(measure)
## classif.acc 
##   0.9266667
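
Resampling a single learner extends naturally to comparing several. The sketch below assumes a comparison between the tree learner and the featureless baseline on the iris task; the seed and the choice of learners are illustrative only.

```r
library(mlr3)

set.seed(1)  # arbitrary seed, for reproducible folds

# Cross every task with every learner and resampling strategy.
design <- benchmark_grid(
  tasks = list(tsk("iris")),
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = list(rsmp("cv", folds = 3L))
)

# Run all resampling iterations and aggregate accuracy per learner.
bmr <- benchmark(design)
agg <- bmr$aggregate(msr("classif.acc"))
agg[, c("learner_id", "classif.acc"), with = FALSE]
```

Including a featureless learner gives a baseline against which the accuracy of the real model can be judged.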

Why a rewrite?

mlr was first released to CRAN in 2013. Its core design and architecture date back even further. The addition of many features has led to feature creep, which makes mlr hard to maintain and hard to extend. We also think that while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. In addition, many helpful R libraries did not exist when mlr was created, and including them would result in non-trivial API changes.

Design principles

  • Only the basic building blocks for machine learning are implemented in this package.
  • Focus on computation here. Visualization and other functionality can go in extra packages.
  • Overcome the limitations of R’s S3 classes with the help of R6.
  • Embrace R6 for a clean OO design, object state changes, and reference semantics. This might be less “traditional R”, but seems to fit mlr nicely.
  • Embrace data.table for fast and convenient data frame computations.
  • Combine data.table and R6; for this we will make heavy use of list columns in data.tables.
  • Defensive programming and type safety. All user input is checked with checkmate. Return types are documented, and mechanisms popular in base R which “simplify” the result unpredictably (e.g., sapply() or drop argument in [.data.frame) are avoided.
  • Be light on dependencies. mlr3 requires the following packages at runtime:
    • future.apply: Resampling and benchmarking are parallelized with the future abstraction, which interfaces with many parallel backends.
    • backports: Ensures backward compatibility with older R releases. Developed by members of the mlr team. No recursive dependencies.
    • checkmate: Fast argument checks. Developed by members of the mlr team. No extra recursive dependencies.
    • mlr3misc: Miscellaneous functions used in multiple mlr3 extension packages. Developed by the mlr team. No extra recursive dependencies.
    • paradox: Descriptions for parameters and parameter sets. Developed by the mlr team. No extra recursive dependencies.
    • R6: Reference class objects. No recursive dependencies.
    • data.table: Extension of R’s data.frame. No recursive dependencies.
    • digest: Hash digests. No recursive dependencies.
    • uuid: Create unique string identifiers. No recursive dependencies.
    • lgr: Logging facility. No extra recursive dependencies.
    • mlr3measures: Performance measures. No extra recursive dependencies.
    • mlbench: A collection of machine learning data sets. No dependencies.
  • Reflections: Objects are queryable for properties and capabilities, allowing you to program on them.
  • Additional functionality that comes with extra dependencies:
    • To capture output, warnings and exceptions, evaluate and callr can be used.
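
Two of the principles above, reference semantics and reflections, can be sketched in a few lines. This is an illustration under assumptions rather than a complete tour: the `cp` value is arbitrary, and the exact contents of the reflection objects may differ between versions.

```r
library(mlr3)

# R6 objects are modified in place: training changes the learner itself.
learner <- lrn("classif.rpart")
trained_before <- !is.null(learner$model)
learner$train(tsk("iris"))
trained_after <- !is.null(learner$model)

# Assignment copies the reference, not the object; use $clone() for a copy.
alias <- learner
copy <- learner$clone(deep = TRUE)
alias$param_set$values$cp <- 0.05  # arbitrary value, for illustration only
learner$param_set$values$cp        # changed through the alias
copy$param_set$values$cp           # the deep clone is unaffected

# Reflections: objects can be queried for their capabilities.
learner$properties
learner$feature_types
mlr_reflections$task_types
```

Because objects are shared by reference, clone a learner explicitly before modifying it if the original must stay untouched.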

Extension Packages

Consult the wiki for short descriptions and links to the respective repositories.

Contributing to mlr3

This R package is licensed under the LGPL-3. If you encounter problems using this software (lack of documentation, misleading or wrong documentation, unexpected behaviour, bugs, …) or just want to suggest features, please open an issue in the issue tracker. Pull requests are welcome and will be included at the discretion of the maintainers.

Please consult the wiki for a style guide, a roxygen guide and a pull request guide.

Citing mlr3

If you use mlr3, please cite our JOSS article:

@Article{mlr3,
  title = {{mlr3}: A modern object-oriented machine learning framework in {R}},
  author = {Michel Lang and Martin Binder and Jakob Richter and Patrick Schratz and Florian Pfisterer and Stefan Coors and Quay Au and Giuseppe Casalicchio and Lars Kotthoff and Bernd Bischl},
  journal = {Journal of Open Source Software},
  year = {2019},
  month = {dec},
  doi = {10.21105/joss.01903},
  url = {https://joss.theoj.org/papers/10.21105/joss.01903},
}

Version: 0.3.0

License: LGPL-3

Maintainer: Michel Lang

Last Published: June 2nd, 2020

Monthly Downloads: 10,758

Functions in mlr3 (0.3.0)

  • DataBackend: DataBackend
  • LearnerClassif: Classification Learner
  • MeasureClassif: Classification Measure
  • BenchmarkResult: Container for Benchmarking Results
  • MeasureRegr: Regression Measure
  • Learner: Learner Class
  • DataBackendMatrix: DataBackend for Matrix
  • DataBackendDataTable: DataBackend for data.table
  • LearnerRegr: Regression Learner
  • Measure: Measure Class
  • TaskRegr: Regression Task
  • PredictionClassif: Prediction Object for Classification
  • Prediction: Abstract Prediction Object
  • TaskSupervised: Supervised Task
  • Task: Task Class
  • Resampling: Resampling Class
  • TaskClassif: Classification Task
  • TaskGenerator: TaskGenerator Class
  • benchmark: Benchmark Multiple Learners on Multiple Tasks
  • auto_convert: Column Auto-Converter
  • mlr3-package: mlr3: Machine Learning in R - Next Generation
  • mlr_assertions: Assertion for mlr3 Objects
  • benchmark_grid: Generate a Benchmark Grid Design
  • default_measures: Get a Default Measure
  • as_benchmark_result: Convert to BenchmarkResult
  • PredictionRegr: Prediction Object for Regression
  • ResampleResult: Container for Results of resample()
  • mlr_learners_classif.rpart: Classification Tree Learner
  • mlr_learners_regr.rpart: Regression Tree Learner
  • mlr_learners_classif.debug: Classification Learner for Debugging
  • as_data_backend.data.frame: Create a Data Backend
  • mlr_learners_regr.featureless: Featureless Regression Learner
  • mlr_learners_classif.featureless: Featureless Classification Learner
  • mlr_coercions: Object Coercion
  • mlr_learners: Dictionary of Learners
  • mlr_measures_classif.auc: Area Under the ROC Curve
  • mlr_measures_classif.acc: Classification Accuracy
  • as.data.table
  • mlr_measures: Dictionary of Performance Measures
  • TaskUnsupervised: Unsupervised Task
  • mlr_measures_classif.ce: Classification Error
  • mlr_measures_classif.fdr: False Discovery Rate
  • mlr_measures_classif.dor: Diagnostic Odds Ratio
  • mlr_measures_classif.fn: False Negatives
  • mlr_measures_classif.fomr: False Omission Rate
  • mlr_measures_classif.fnr: False Negative Rate
  • mlr_measures_classif.bbrier: Binary Brier Score
  • mlr_measures_classif.bacc: Balanced Accuracy
  • mlr_measures_classif.costs: Cost-sensitive Classification Measure
  • mlr_measures_classif.fbeta: F-beta Score
  • mlr_measures_classif.recall: True Positive Rate
  • mlr_measures_classif.mbrier: Multiclass Brier Score
  • mlr_measures_classif.logloss: Log Loss
  • mlr_measures_classif.mcc: Matthews Correlation Coefficient
  • mlr_measures_classif.sensitivity: True Positive Rate
  • mlr_measures_classif.ppv: Positive Predictive Value
  • mlr_measures_classif.precision: Positive Predictive Value
  • mlr_measures_classif.npv: Negative Predictive Value
  • mlr_measures_classif.fp: False Positives
  • mlr_measures_classif.fpr: False Positive Rate
  • mlr_measures_regr.bias: Bias
  • mlr_measures_debug: Debug Measure
  • mlr_measures_classif.tpr: True Positive Rate
  • mlr_measures_classif.tn: True Negatives
  • mlr_measures_classif.specificity: True Negative Rate
  • mlr_measures_regr.ktau: Kendall's tau
  • mlr_measures_classif.tp: True Positives
  • mlr_measures_classif.tnr: True Negative Rate
  • mlr_measures_elapsed_time: Elapsed Time Measure
  • mlr_measures_regr.rae: Relative Absolute Error
  • mlr_measures_oob_error: Out-of-bag Error Measure
  • mlr_measures_regr.maxae: Max Absolute Error
  • mlr_measures_regr.msle: Mean Squared Log Error
  • mlr_measures_regr.pbias: Percent Bias
  • mlr_measures_regr.medae: Median Absolute Errors
  • mlr_measures_regr.rmse: Root Mean Squared Error
  • mlr_measures_regr.mape: Mean Absolute Percent Error
  • mlr_measures_regr.mae: Mean Absolute Errors
  • mlr_measures_regr.mse: Mean Squared Error
  • mlr_measures_selected_features: Selected Features Measure
  • mlr_measures_regr.rmsle: Root Mean Squared Log Error
  • mlr_measures_regr.sae: Sum of Absolute Errors
  • mlr_measures_regr.medse: Median Squared Error
  • mlr_measures_regr.rse: Relative Squared Error
  • mlr_measures_regr.smape: Symmetric Mean Absolute Percent Error
  • mlr_measures_regr.srho: Spearman's rho
  • mlr_reflections: Reflections for mlr3
  • mlr_measures_regr.sse: Sum of Squared Errors
  • mlr_measures_regr.rrse: Root Relative Squared Error
  • mlr_measures_regr.rsq: R Squared
  • mlr_resamplings_holdout: Holdout Resampling
  • mlr_resamplings_insample: Insample Resampling
  • mlr_resamplings_custom: Custom Resampling
  • mlr_sugar: Syntactic Sugar for Object Construction
  • mlr_resamplings_repeated_cv: Repeated Cross Validation Resampling
  • mlr_resamplings: Dictionary of Resampling Strategies
  • mlr_resamplings_subsampling: Subsampling Resampling
  • mlr_resamplings_cv: Cross Validation Resampling
  • mlr_resamplings_bootstrap: Bootstrap Resampling
  • mlr_task_generators: Dictionary of Task Generators
  • mlr_tasks: Dictionary of Tasks
  • mlr_task_generators_friedman1: Friedman1 Regression Task Generator
  • mlr_task_generators_2dnormals: 2D Normals Classification Task Generator
  • mlr_tasks_boston_housing: Boston Housing Regression Task
  • predict.Learner: Predict Method for Learners
  • resample: Resample a Learner on a Task
  • mlr_tasks_pima: Pima Indian Diabetes Classification Task
  • mlr_tasks_mtcars: Motor Trend Regression Task
  • mlr_task_generators_xor: XOR Classification Task Generator
  • mlr_tasks_wine: Wine Classification Task
  • mlr_task_generators_smiley: Smiley Classification Task Generator
  • mlr_tasks_iris: Iris Classification Task
  • mlr_tasks_sonar: Sonar Classification Task
  • mlr_tasks_german_credit: German Credit Classification Task
  • mlr_tasks_zoo: Zoo Classification Task
  • mlr_tasks_spam: Spam Classification Task
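
Many entries in this index are dictionaries, or objects stored in them under string keys. As a rough sketch of how the dictionaries and the sugar functions relate, assuming only objects that ship with mlr3 itself:

```r
library(mlr3)

# Dictionaries store objects under string keys.
learner_keys <- mlr_learners$keys()
head(learner_keys)
mlr_measures$has("classif.acc")

# A dictionary can be converted to a table for an overview.
as.data.table(mlr_resamplings)

# The sugar functions are shortcuts for dictionary lookups.
task_from_dict <- mlr_tasks$get("iris")
task_from_sugar <- tsk("iris")
```

Extension packages can add their own entries to these dictionaries, so the available keys depend on which packages are loaded.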