make
Run your project (build the targets).
make(plan, targets = drake::possible_targets(plan), envir = parent.frame(),
verbose = TRUE, cache = NULL,
parallelism = drake::default_parallelism(), jobs = 1,
packages = (.packages()), prework = character(0),
prepend = character(0), command = "make",
args = drake::default_system2_args(jobs = jobs, verbose = verbose),
return_config = FALSE, clear_progress = TRUE, imports_only = FALSE)
plan
workflow plan data frame.
A workflow plan data frame is a data frame with a target column and a
command column. Targets are the objects and files that drake generates,
and commands are the pieces of R code that produce them.
Use the function plan() to generate workflow plan data frames easily, and
see functions analyses(), summaries(), evaluate(), expand(), and gather()
for easy ways to generate large workflow plan data frames.
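As a minimal sketch of the structure described above, a workflow plan can also be written by hand as an ordinary data frame with base R. The target names and commands below are hypothetical, and stringsAsFactors = FALSE is an assumption made here so that both columns stay character vectors.

```r
# A hand-built workflow plan: one row per target.
# Target names and commands are hypothetical examples.
my_plan <- data.frame(
  target = c("small_data", "fit", "coefs"),
  command = c(
    "head(mtcars, 10)",
    "lm(mpg ~ wt, data = small_data)",
    "coef(fit)"
  ),
  stringsAsFactors = FALSE
)

# Each command is R code stored as text; nothing is evaluated here.
print(my_plan)
```

Later commands refer to earlier targets by name (fit uses small_data, coefs uses fit), which is the kind of dependency the description says is built along with the requested targets.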
targets
character string, names of targets to build. Dependencies are built too.
envir
environment to use. Defaults to the current workspace, so you should not
need to worry about this most of the time. A deep copy of envir is made,
so you don't need to worry about your workspace being modified by make.
The deep copy inherits from the global environment.
Wherever necessary, objects and functions are imported from envir and the
global environment and then reproducibly tracked as dependencies.
verbose
logical, whether to print progress to the console. Skipped objects are not printed.
cache
drake cache as created by new_cache().
See also get_cache(), this_cache(), and recover_cache().
parallelism
character, type of parallelism to use.
To list the options, call parallelism_choices().
For detailed explanations, see ?parallelism_choices, the tutorial
vignettes, or the tutorial files generated by example_drake("basic").
jobs
number of parallel processes or jobs to run.
See max_useful_jobs() or plot_graph() to help figure out what the number
of jobs should be.
Windows users should not set jobs > 1 if parallelism is "mclapply" because
mclapply() is based on forking. Windows users who use
parallelism == "Makefile" will need to download and install Rtools.
If parallelism is "Makefile", Makefile-level parallelism is only used for
targets in your workflow plan data frame, not imports. To process imported
objects and files, drake selects the best parallel backend for your system
and uses the number of jobs you give to the jobs argument to make().
To use at most 2 jobs for imports and at most 4 jobs for targets, run
make(..., parallelism = "Makefile", jobs = 2, args = "--jobs=4")
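The forking restriction mentioned above can be illustrated with base R's parallel package on its own, without drake. This is only a sketch: the job count of 2 and the squaring function are arbitrary choices for illustration, and the platform check is one possible way to pick a safe value.

```r
library(parallel)

# Forking is unavailable on Windows, so fall back to a single job there.
# The value 2 is an arbitrary illustrative choice.
jobs <- if (.Platform$OS.type == "windows") 1L else 2L

# mclapply() applies the function to each input using up to `jobs`
# forked processes.
squares <- mclapply(1:3, function(x) x^2, mc.cores = jobs)
print(unlist(squares))
```

A value computed this way could then be supplied as the jobs argument when parallelism is "mclapply".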
packages
character vector of packages to load, in the order they should be loaded.
Defaults to (.packages()), so you shouldn't usually need to set this
manually. Just call library() to load your packages before make().
However, sometimes packages need to be strictly forced to load in a
certain order, especially if parallelism is "Makefile". To do this, do not
use library() or require() or loadNamespace() or attachNamespace() to load
any libraries beforehand. Just list your packages in the packages argument
in the order you want them to be loaded.
If parallelism is "mclapply", the necessary packages are loaded once
before any targets are built. If parallelism is "Makefile", the necessary
packages are loaded once on initialization and then once again for each
target right before that target is built.
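To make the default concrete, the sketch below attaches a few packages in a fixed order and then inspects (.packages()), the expression used as the default for this argument. The package names are hypothetical stand-ins chosen because they ship with R; the loop is an illustration of ordered loading, not drake's own code.

```r
# Hypothetical ordered list of packages (all of these ship with R).
pkgs <- c("stats", "utils", "tools")

# Attach each package in the order given.
for (pkg in pkgs) {
  library(pkg, character.only = TRUE)
}

# (.packages()) returns the names of the currently attached packages.
attached <- (.packages())
print(attached)
```

A vector like pkgs is the kind of value that would be passed as the packages argument when load order matters.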
prework
character vector of lines of code to run before build time. This code can
be used to load packages, set options, etc., although the packages in the
packages argument are loaded before any prework is done.
If parallelism is "mclapply", the prework is run once before any targets
are built. If parallelism is "Makefile", the prework is run once on
initialization and then once again for each target right before that
target is built.
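Since prework is code stored as text, the following sketch shows one way such lines could be evaluated in order with base R. The two prework lines are hypothetical, and the parse()/eval() loop only mimics the idea; it is not a description of how drake runs prework internally.

```r
# Hypothetical prework: each element is one line of R code stored as text.
prework <- c(
  "options(digits = 4)",
  "set.seed(2017)"
)

# Parse and evaluate each line in order.
for (line in prework) {
  eval(parse(text = line))
}

# The option set by the first prework line is now in effect.
print(getOption("digits"))
```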
prepend
lines to prepend to the Makefile if parallelism is "Makefile".
See the vignettes (vignette(package = "drake")) to learn how to use
prepend to take advantage of multiple nodes of a supercomputer.
command
character scalar, command to call the Makefile generated for distributed
computing. Only applies when parallelism is "Makefile".
Defaults to the usual "make", but it could also be "lsmake" on supporting
systems, for example.
command and args are executed via system2(command, args) to run the
Makefile. If args has something like "--jobs=2", or if jobs >= 2 and args
is left alone, targets will be distributed over independent parallel R
sessions wherever possible.
args
command line arguments to call the Makefile for distributed computing.
For advanced users only. If set, jobs and verbose are overwritten as they
apply to the Makefile.
command and args are executed via system2(command, args) to run the
Makefile. If args has something like "--jobs=2", or if jobs >= 2 and args
is left alone, targets will be distributed over independent parallel R
sessions wherever possible.
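The system2(command, args) pattern described above can be tried in isolation. The sketch below assumes GNU Make is the make on the system path; the "--silent" and "--version" flags are assumptions added here so the call is harmless, and no Makefile is actually run.

```r
# Command and arguments in the form system2() expects.
command <- "make"
args <- c("--jobs=2", "--silent")

# Only call out to the system if the command is actually installed.
if (nzchar(Sys.which(command))) {
  # "--version" is appended so that make reports its version and exits
  # instead of looking for a Makefile (assumes GNU Make).
  status <- system2(command, c(args, "--version"))
} else {
  message("'", command, "' was not found on this system.")
}
```

Swapping command for another value, such as the "lsmake" mentioned in the description, changes only which program system2() invokes.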
return_config
logical, whether to return the internal list of runtime configuration
parameters used by make().
clear_progress
logical, whether to clear the saved record of progress seen by progress()
and in_progress() before anything is imported or built.
imports_only
logical, whether to skip building the targets in plan and just import
objects and files.
# NOT RUN {
load_basic_example()
outdated(my_plan) # Which targets need to be (re)built?
my_jobs <- max_useful_jobs(my_plan) # Depends on what is up to date.
make(my_plan, jobs = my_jobs) # Build what needs to be built.
outdated(my_plan) # Everything is up to date.
reg2 <- function(d) { # Change one of your functions.
  d$x3 <- d$x^3
  lm(y ~ x3, data = d)
}
outdated(my_plan) # Some targets depend on reg2().
plot_graph(my_plan) # See how they fit in an interactive graph.
make(my_plan) # Rebuild just the outdated targets.
outdated(my_plan) # Everything is up to date again.
plot_graph(my_plan) # The colors changed in the graph.
# }