mlflow: R interface for MLflow

  • Install MLflow from R to track experiments locally.
  • Connect to MLflow servers to share experiments with others.
  • Use MLflow to export models that can be served locally and remotely.

Prerequisites

To use the MLflow R API, you must install the MLflow Python package.

pip install mlflow

Optionally, you can set the MLFLOW_PYTHON_BIN and MLFLOW_BIN environment variables to specify the Python and MLflow binaries to use. By default, the R client automatically finds them using Sys.which("python") and Sys.which("mlflow").

export MLFLOW_PYTHON_BIN=/path/to/bin/python
export MLFLOW_BIN=/path/to/bin/mlflow

Installation

Install mlflow as follows:

devtools::install_github("mlflow/mlflow", subdir = "mlflow/R/mlflow")

Development

Install the mlflow package as follows:

devtools::install_github("mlflow/mlflow", subdir = "mlflow/R/mlflow")

Then install the latest released mlflow runtime.

However, the development runtime of mlflow is currently also required, which means you also need to download or clone the mlflow GitHub repo:

git clone https://github.com/mlflow/mlflow

And upgrade the runtime to the development version as follows:

# Upgrade to the latest development version
pip install -e <local github repo>

Tracking

MLflow Tracking lets you log parameters, code versions, metrics, and output files when running R code, and later visualize the results.

MLflow allows you to group runs under experiments, which can be useful for comparing runs intended to tackle a particular task. You can create and activate a new experiment locally using mlflow as follows:

library(mlflow)
mlflow_set_experiment("Test")

Then you can view your experiments in MLflow's user interface by running:

mlflow_ui()

You can also use an MLflow tracking server to track and share experiments (see the MLflow documentation on running a tracking server), and then make use of this server by running:

mlflow_set_tracking_uri("http://tracking-server:5000")

Once the tracking URI is set, the experiments will be stored and tracked on the specified server, which others will also be able to access.
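As a sketch, a run logged against such a shared server might look like the following; the server URL, experiment name, and logged values are all illustrative:

```r
library(mlflow)

# point the client at a shared tracking server (URL is illustrative)
mlflow_set_tracking_uri("http://tracking-server:5000")
mlflow_set_experiment("Shared-Experiment")

# log a run explicitly; with() ends the run when the block exits
with(mlflow_start_run(), {
  mlflow_log_param("lr", 0.01)
  mlflow_log_metric("loss", 0.42)
})
```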

Projects

An MLflow Project is a format for packaging data science code in a reusable and reproducible way.

MLflow projects can be explicitly created or implicitly used by running R with mlflow from the terminal as follows:

mlflow run examples/r_wine --entry-point train.R

Notice that this is equivalent to running, from within examples/r_wine,

Rscript -e "mlflow::mlflow_source('train.R')"

with train.R performing training and logging as follows:

library(mlflow)

# read parameters
column <- mlflow_log_param("column", 1)

# log total rows
mlflow_log_metric("rows", nrow(iris))

# train model
model <- lm(
  Sepal.Width ~ x,
  data.frame(Sepal.Width = iris$Sepal.Width, x = iris[,column])
)

# log model's intercept
mlflow_log_metric("intercept", model$coefficients[["(Intercept)"]])

Parameters

You will often want to parameterize your scripts to support running and tracking multiple experiments. You can define typed parameters in a script, for example params_example.R, as follows:

library(mlflow)

# define parameters
my_int <- mlflow_param("my_int", 1, "integer")
my_num <- mlflow_param("my_num", 1.0, "numeric")

# log parameters
mlflow_log_param("param_int", my_int)
mlflow_log_param("param_num", my_num)

Then run mlflow run with custom parameters as follows:

mlflow run tests/testthat/examples/ --entry-point params_example.R -P my_int=10 -P my_num=20.0 -P my_str=XYZ

=== Created directory /var/folders/ks/wm_bx4cn70s6h0r5vgqpsldm0000gn/T/tmpi6d2_wzf for downloading remote URIs passed to arguments of type 'path' ===
=== Running command 'source /miniconda2/bin/activate mlflow-da39a3ee5e6b4b0d3255bfef95601890afd80709 && Rscript -e "mlflow::mlflow_source('params_example.R')" --args --my_int 10 --my_num 20.0 --my_str XYZ' in run with ID '191b489b2355450a8c3cc9bf96cb1aa3' ===
=== Run (ID '191b489b2355450a8c3cc9bf96cb1aa3') succeeded ===

These run results can then be viewed with mlflow_ui().

Models

An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools—for example, real-time serving through a REST API or batch inference on Apache Spark. They provide a convention to save a model in different “flavors” that can be understood by different downstream tools.

To save a model use mlflow_save_model(). For instance, you can add the following lines to the previous train.R script:

# train model (...)

# save model
mlflow_save_model(
  crate(~ stats::predict(model, .x), model)
)

And trigger a run that will also save your model as follows:

mlflow run train.R

Each MLflow Model is simply a directory containing arbitrary files, together with an MLmodel file in the root of the directory, which can define multiple flavors in which the model can be viewed.

The directory containing the model looks as follows:

dir("model")
## [1] "crate.bin" "MLmodel"

and the model definition in model/MLmodel looks like:

cat(paste(readLines("model/MLmodel"), collapse = "\n"))
## flavors:
##   crate:
##     version: 0.1.0
##     model: crate.bin
## time_created: 18-10-03T22:18:25.25.55
## run_id: 4286a3d27974487b95b19e01b7b3caab

Later on, the R model can be deployed and will perform predictions using mlflow_rfunc_predict():

mlflow_rfunc_predict("model", data = data.frame(x = c(0.3, 0.2)))
## Warning in mlflow_snapshot_warning(): Running without restoring the
## packages snapshot may not reload the model correctly. Consider running
## 'mlflow_restore_snapshot()' or setting the 'restore' parameter to 'TRUE'.


##        1        2
## 3.400381 3.406570

Deployment

MLflow provides tools for deployment on a local machine and several production environments. You can use these tools to easily apply your models in a production environment.

You can serve a model by running,

mlflow rfunc serve model

which is equivalent to running,

Rscript -e "mlflow_rfunc_serve('model')"

You can also run:

mlflow rfunc predict model data.json

which is equivalent to running,

Rscript -e "mlflow_rfunc_predict('model', 'data.json')"
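The data.json file holds the input data as JSON. For the linear model saved above, which predicts from a single column x, a minimal input file might look like the following; this is a sketch, since the exact shape depends on how the crated function consumes its input:

```json
{"x": [0.3, 0.2]}
```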

Dependencies

When running a project, mlflow_snapshot() is automatically called to generate an r-dependencies.txt file, which contains a list of required packages and versions.

However, restoring dependencies is not automatic since it’s usually an expensive operation. To restore dependencies run:

mlflow_restore_snapshot()

Notice that the MLFLOW_SNAPSHOT_CACHE environment variable can be set to a cache directory to reduce the time required to restore dependencies.
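For instance, the cache directory can be set from R before restoring; the path below is illustrative:

```r
# use a persistent cache directory so repeated restores can reuse
# previously downloaded packages (the path here is illustrative)
cache_dir <- file.path(tempdir(), "mlflow-snapshot-cache")
dir.create(cache_dir, showWarnings = FALSE)

# mlflow_restore_snapshot() will pick this up from the environment
Sys.setenv(MLFLOW_SNAPSHOT_CACHE = cache_dir)
```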

RStudio

To enable fast iteration while tracking model improvements with MLflow, RStudio 1.2.897 and newer can be configured to automatically trigger mlflow_run() when a script is sourced. This is enabled by including a # !source mlflow::mlflow_run comment at the top of the R script as follows:
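A minimal sketch of such a script is shown below; the parameter and metric are illustrative, and the special comment on the first line is what triggers mlflow_run() on source:

```r
# !source mlflow::mlflow_run
library(mlflow)

# illustrative parameter and metric
alpha <- mlflow_param("alpha", 0.1, "numeric")
mlflow_log_metric("alpha", alpha)
```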

Contributing

See the MLflow contribution guidelines.

Install

install.packages('mlflow')

Monthly Downloads

68,529

Version

2.21.2

License

Apache License 2.0

Maintainer

Matei Zaharia

Last Published

April 2nd, 2025

Functions in mlflow (2.21.2)

  • mlflow_get_experiment: Get Experiment
  • mlflow_get_model_version: Get a model version
  • mlflow_get_metric_history: Get Metric History
  • mlflow_get_tracking_uri: Get Remote Tracking URI
  • mlflow_get_latest_versions: Get latest model versions
  • mlflow_log_batch: Log Batch
  • mlflow_log_model: Log Model
  • mlflow_id: Get Run or Experiment ID
  • mlflow_ui: Run MLflow User Interface
  • mlflow_set_model_version_tag: Set Model version tag
  • mlflow_set_tag: Set Tag
  • mlflow_search_runs: Search Runs
  • mlflow_restore_experiment: Restore Experiment
  • mlflow_log_param: Log Parameter
  • mlflow_rename_registered_model: Rename a registered model
  • mlflow_server: Run MLflow Tracking Server
  • mlflow_predict: Generate Prediction with MLflow Model
  • mlflow_load_flavor: Load MLflow Model Flavor
  • mlflow_param: Read Command-Line Parameter
  • mlflow_update_model_version: Update model version
  • mlflow_delete_run: Delete a Run
  • mlflow_register_external_observer: Register an external MLflow observer
  • mlflow_rename_experiment: Rename Experiment
  • mlflow_delete_tag: Delete Tag
  • mlflow_list_artifacts: List Artifacts
  • mlflow_log_metric: Log Metric
  • mlflow_load_model: Load MLflow Model
  • mlflow_update_registered_model: Update a registered model
  • mlflow_set_experiment: Set Experiment
  • mlflow_set_experiment_tag: Set Experiment Tag
  • mlflow_run: Run an MLflow Project
  • mlflow_set_tracking_uri: Set Remote Tracking URI
  • mlflow_restore_run: Restore a Run
  • mlflow_log_artifact: Log Artifact
  • mlflow_rfunc_serve: Serve an RFunc MLflow Model
  • mlflow_source: Source a Script with MLflow Params
  • reexports: Objects exported from other packages
  • mlflow_save_model.crate: Save Model for MLflow
  • mlflow_search_registered_models: List registered models
  • mlflow_transition_model_version_stage: Transition ModelVersion Stage
  • mlflow_search_experiments: Search Experiments
  • mlflow_start_run: Start Run
  • mlflow_create_experiment: Create Experiment
  • build_context_tags_from_databricks_job_info: Get information from a Databricks job execution context
  • build_context_tags_from_databricks_notebook_info: Get information from Databricks Notebook environment
  • mlflow_create_registered_model: Create registered model
  • mlflow-package: mlflow: Interface to 'MLflow'
  • mlflow_delete_experiment: Delete Experiment
  • mlflow_delete_registered_model: Delete registered model
  • mlflow_delete_model_version: Delete a model version
  • mlflow_client: Initialize an MLflow Client
  • mlflow_create_model_version: Create a model version
  • mlflow_download_artifacts: Download Artifacts
  • mlflow_get_registered_model: Get a registered model
  • mlflow_end_run: End a Run
  • mlflow_get_run: Get Run