These functions are based on forking and so are not available on Windows.
mcparallel starts a parallel R process which evaluates the given expression.
mccollect collects results from one or more parallel processes.
mcparallel(expr, name, mc.set.seed = TRUE, silent = FALSE,
           mc.affinity = NULL, mc.interactive = FALSE,
           detached = FALSE)

mccollect(jobs, wait = TRUE, timeout = 0, intermediate = FALSE)
expr: expression to evaluate (do not use any on-screen devices or GUI elements in this code).
name: an optional name (character vector of length one) that can be associated with the job.
mc.set.seed: logical: see section ‘Random numbers’.
silent: if set to TRUE then all output on stdout will be suppressed (stderr is not affected).
mc.affinity: either a numeric vector specifying CPUs to restrict the child process to (1-based) or NULL to not modify the CPU affinity.
mc.interactive: logical, if TRUE or FALSE then the child process will be set as interactive or non-interactive respectively. If NA then the child process will inherit the interactive flag from the parent.
detached: logical, if TRUE then the job is detached from the current session and cannot deliver any results back; it is used for the code's side effects only.
jobs: list of jobs (or a single job) to collect results for. Alternatively jobs can also be an integer vector of process IDs. If omitted, mccollect will wait for all currently existing children.
wait: if set to FALSE it checks for any results that are available within timeout seconds from now, otherwise it waits for all specified jobs to finish.
timeout: timeout (in seconds) to check for job results; applies only if wait is FALSE.
intermediate: FALSE or a function which will be called while mccollect waits for results. The function will be called with one parameter which is the list of results received so far.
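The intermediate argument can be used for simple progress reporting. The following is a minimal sketch, not part of the original help page: the job names and the progress function are hypothetical, and it assumes that results not yet received appear as NULL entries in the list passed to the callback.

```r
library(parallel)

# Three jobs that finish at different times (names are illustrative only)
jobs <- lapply(1:3, function(i) {
  mcparallel({ Sys.sleep(i / 10); i^2 }, name = paste0("job", i))
})

# Hypothetical callback: counts the entries that are no longer NULL
progress <- function(res) {
  done <- sum(!vapply(res, is.null, logical(1)))
  message(done, " of ", length(res), " results received")
}

# The callback is invoked while mccollect waits for the jobs
res <- mccollect(jobs, intermediate = progress)
```

The callback is called for its side effect only; the return value of mccollect is unchanged.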
mcparallel returns an object of the class "parallelJob" which inherits from "childProcess" (see the ‘Value’ section of the help for mcfork). If argument name was supplied this will have an additional component name.
mccollect returns any results that are available in a list. The results will have the same order as the specified jobs. If there are multiple jobs and a job has a name it will be used to name the result, otherwise its process ID will be used. If none of the specified children are still running, it returns NULL.
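As an illustration of how results are named, here is a small sketch (the job name "total" is an arbitrary choice for this example). One job is given a name and the other is not, so the second result is expected to be labelled with the process ID of its child.

```r
library(parallel)

a <- mcparallel(sum(1:10), name = "total")  # named job
b <- mcparallel(mean(1:10))                 # unnamed job

res <- mccollect(list(a, b))

names(res)   # "total", followed by the process ID of job b as a string
res$total    # result of the named job
b$pid        # process ID stored in the job object
```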
If mc.set.seed = FALSE, the child process has the same initial random number generator (RNG) state as the current R session. If the RNG has been used (or .Random.seed was restored from a saved workspace), the child will start drawing random numbers at the same point as the current session. If the RNG has not yet been used, the child will set a seed based on the time and process ID when it first uses the RNG: this is almost certain to give a different random-number stream from the current session and any other child process.
The behaviour with mc.set.seed = TRUE is different only if RNGkind("L'Ecuyer-CMRG") has been selected. Then each time a child is forked it is given the next stream (see nextRNGStream). So if you select that generator, set a seed and call mc.reset.stream just before the first use of mcparallel, the results of simulations will be reproducible provided the same tasks are given to the first, second, … forked process.
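The reproducibility recipe described above can be sketched as follows. The helper run_once and the seed value are hypothetical, introduced only for this illustration; the sketch restores the previous RNG kind on exit so that the session is left as it was found.

```r
library(parallel)

# Hypothetical helper: runs three forked jobs from a fixed starting point
run_once <- function(seed) {
  old <- RNGkind()
  on.exit(do.call(RNGkind, as.list(old)))  # restore the previous generator

  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  mc.reset.stream()   # restart the streams from the seed just set

  # mc.set.seed = TRUE is the default, so each child gets the next stream
  jobs <- lapply(1:3, function(i) mcparallel(rnorm(2)))
  unname(mccollect(jobs))  # drop the names (process IDs differ between runs)
}

first  <- run_once(123)
second <- run_once(123)
identical(first, second)  # should be TRUE if the streams are reproducible
```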
mcparallel evaluates the expr expression in parallel to the current R process. Everything is shared read-only (or in fact copy-on-write) between the parallel process and the current process, i.e., no side effects of the expression affect the main process. The result of the parallel execution can be collected using the mccollect function.
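A short sketch of the copy-on-write behaviour (the variable names are illustrative): an assignment made inside the child changes only the child's copy, so the parent's value is left untouched and only the returned value travels back.

```r
library(parallel)

x <- 1

# The assignment happens in the forked child, not in this session
job <- mcparallel({ x <- x + 100; x })
child_value <- mccollect(job)[[1]]

child_value  # value computed in the child
x            # unchanged in the parent
```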
The mccollect function collects any available results from parallel jobs (or in fact any child process). If wait is TRUE then mccollect waits for all specified jobs to finish before returning a list containing the last reported result for each job. If wait is FALSE then mccollect merely checks for any results available at the moment and will not wait for jobs to finish. If jobs is specified, jobs not listed there will not be affected or acted upon.
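Non-blocking collection with wait = FALSE can be used to poll a job while doing other work. The loop below is a minimal sketch under stated assumptions: the polling interval, the cap of 50 polls, and the final clean-up call are choices made for this illustration, not requirements of the API.

```r
library(parallel)

job <- mcparallel({ Sys.sleep(1); "done" })

res <- NULL
polls <- 0
# Poll until a result arrives; the cap guards against looping forever
while (is.null(res) && polls < 50) {
  res <- mccollect(job, wait = FALSE, timeout = 0.2)  # NULL if nothing yet
  polls <- polls + 1
  # other work in the parent could be done here between polls
}

res[[1]]  # the value returned by the job

# A further non-blocking call lets the finished job be signalled as terminating
invisible(mccollect(job, wait = FALSE, timeout = 1))
```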
Note: If expr uses low-level multicore functions such as sendMaster, a single job can deliver results multiple times and it is the responsibility of the user to interpret them correctly. mccollect will return NULL for a terminating job that has sent its results already, after which the job is no longer available.
The mc.affinity parameter can be used to try to restrict the child process to specific CPUs. The availability and the extent of this feature are system-dependent (e.g., some systems will only consider the CPU count, others will ignore it completely).
p <- mcparallel(1:10)
q <- mcparallel(1:20)
# wait for both jobs to finish and collect all results
res <- mccollect(list(p, q))
## IGNORE_RDIFF_BEGIN
## reports process ids, so not reproducible
p <- mcparallel(1:10)
mccollect(p, wait = FALSE, timeout = 10) # will retrieve the result (since it's fast)
mccollect(p, wait = FALSE) # will signal the job as terminating
mccollect(p, wait = FALSE) # there is no longer such a job
## IGNORE_RDIFF_END
# a naive parallel lapply can be created using mcparallel alone:
jobs <- lapply(1:10, function(x) mcparallel(rnorm(x), name = x))
mccollect(jobs)