targets (version 0.0.0.9000)

tar_make_clustermq: Run a pipeline of targets in parallel with persistent clustermq workers.

Description

This function is like tar_make() except that targets run in parallel with persistent clustermq workers. It requires that you set global options like clustermq.scheduler and clustermq.template inside the _targets.R script. clustermq is not a strict dependency of targets, so you must install clustermq yourself.

Usage

tar_make_clustermq(
  names = NULL,
  reporter = "verbose",
  garbage_collection = FALSE,
  workers = 1L,
  log_worker = FALSE,
  callr_function = callr::r,
  callr_arguments = list()
)

Arguments

names

Names of the targets to build or check. Set to NULL to check/build all the targets (default). Otherwise, you can supply symbols, a character vector, or tidyselect helpers like starts_with().

reporter

Character of length 1, name of the reporter to use. Controls how messages are printed as targets run in the pipeline. Choices:

  • "verbose": print one message for each target that runs (default).

  • "silent": print nothing.

  • "timestamp": print a time-stamped message for each target that runs.

  • "summary": print a running total of the number of targets in each status category (queued, running, skipped, built, cancelled, or errored).

garbage_collection

Logical, whether to run base::gc() between targets. The pipeline will run slower but consume less memory.

workers

Positive integer, number of persistent clustermq workers to create.

log_worker

Logical, whether to write a log file for each worker. Same as the log_worker argument of clustermq::Q() and clustermq::workers().

callr_function

A function from callr to start a fresh clean R process to do the work. Set to NULL to run in the current session instead of an external process (but restart your R session just before you do in order to clear debris out of the global environment). callr_function needs to be NULL for interactive debugging, e.g. tar_option_set(debug = "your_target"). However, callr_function should not be NULL for serious reproducible work.

callr_arguments

A list of arguments to callr_function.
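
As a rough illustration of how these arguments combine, the calls below assume that a _targets.R script already exists in the working directory, that clustermq is installed, and that the pipeline defines targets whose names begin with data_ (a hypothetical naming scheme, not something taken from this page).

```r
library(targets)

# Build only the targets whose names start with "data_" (hypothetical names),
# using two persistent workers and time-stamped progress messages.
tar_make_clustermq(
  names = starts_with("data_"),
  reporter = "timestamp",
  workers = 2L
)

# Run in the current R session instead of a fresh callr process,
# e.g. for interactive debugging. Restart R first to get a clean session.
tar_make_clustermq(callr_function = NULL)
```

The second call trades reproducibility for convenience, so it is best kept for debugging sessions only.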

Value

NULL, unless callr_function = callr::r_bg, in which case a handle to the callr background process is returned. Either way, the value is returned invisibly.
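
Because the value is returned invisibly, it has to be assigned before it can be inspected. The sketch below assumes a valid _targets.R script in the working directory; the variable name process is arbitrary, and is_alive() and wait() are methods of the process handle that callr returns.

```r
library(targets)

# Launch the pipeline in a background R process; control returns immediately.
process <- tar_make_clustermq(callr_function = callr::r_bg)

# Check whether the background process is still running.
process$is_alive()

# Block until the pipeline has finished.
process$wait()
```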

Details

To use with a cluster, you will need to set the global options clustermq.scheduler and clustermq.template inside _targets.R. To read more about configuring clustermq for your scheduler, visit https://mschubert.github.io/clustermq/articles/userguide.html#configuration and navigate to the appropriate link under "Setting up the scheduler". Wildcards in the template file are filled in with elements from tar_option_get("resources").
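
A minimal sketch of such a _targets.R script might look like the following. The SLURM scheduler, the template file name clustermq.tmpl, and the memory wildcard are assumptions chosen for illustration; substitute whatever your cluster and template actually use.

```r
# _targets.R (sketch)
library(targets)

# Tell clustermq which scheduler to use and where its template file lives.
# "slurm" and the file name "clustermq.tmpl" are assumptions for illustration.
options(
  clustermq.scheduler = "slurm",
  clustermq.template = "clustermq.tmpl"
)

# Elements of `resources` fill the matching wildcards in the template,
# e.g. a hypothetical {{ memory }} placeholder.
tar_option_set(resources = list(memory = 4096))

tar_pipeline(tar_target(x, 1 + 1))
```

With this script in place, calling tar_make_clustermq(workers = 2L) from the project directory would submit two persistent workers through the configured scheduler.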

Examples

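# NOT RUN {
# Run inside a temporary directory so no files are left behind.
tar_dir({
  # Write a _targets.R script for the pipeline.
  tar_script({
    # Use local multicore workers instead of a cluster scheduler.
    options(clustermq.scheduler = "multicore")
    tar_option_set()
    tar_pipeline(tar_target(x, 1 + 1))
  })
  tar_make_clustermq()
})
# }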