parDosa is a wrapper function around several features of the parallel package.
It is designed to work closely with MCMC fitting functions,
and can easily be called from inside another function.
parDosa(cl, seq, fun, cldata,
lib = NULL, dir = NULL, evalq = NULL,
size = 1, balancing = c("none", "load", "size", "both"),
rng.type = c("none", "RNGstream"),
cleanup = TRUE, unload = FALSE, iseed = NULL, ...)
Usually a list with results returned by the cluster.
cl: A cluster object created by makeCluster, or an integer. It can also be NULL, see Details.
seq: A vector to split.
fun: A function or character string naming a function.
cldata: A list containing data. This list is exported to the cluster by clusterExport and stored in a hidden environment. Data in cldata can be used by fun.
lib: Character, name of package(s) to load onto the cluster (optional). More than one package can be specified as a character vector. Packages already loaded are skipped.
dir: Working directory to use. If NULL (default), the working directory is not set on the workers. Can be a vector to set different directories on different workers.
evalq: Character, expressions to evaluate, e.g. for changing global options (passed to clusterEvalQ). More than one expression can be specified as a character vector.
balancing: Character, type of balancing to perform (see Details).
size: Vector of problem sizes (or relative performance information) corresponding to elements of seq (recycled if needed). The default 1 indicates equal problem sizes.
rng.type: Character. "none" does not set any seeds on the workers. "RNGstream" selects the "L'Ecuyer-CMRG" RNG and then distributes streams to the members of the cluster, optionally setting the seed of the streams by set.seed(iseed) (otherwise they are set from the current seed of the master process, after selecting the L'Ecuyer generator). See clusterSetRNGStream. The logical value !(rng.type == "none") is used for forking (e.g. when cl is an integer).
cleanup: Logical, whether cldata should be removed from the workers after applying fun. If TRUE, the effects of the dir argument are also cleaned up.
unload: Logical, whether the packages in lib should be unloaded after applying fun.
iseed: Integer or NULL, passed to clusterSetRNGStream to be supplied to set.seed on the workers. Use NULL to not set reproducible seeds.
...: Other arguments of fun that are simple values and not objects. (Arguments passed as objects should be specified in cldata; otherwise they are not exported to the cluster by this function.)
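The "RNGstream" option can be pictured with plain parallel calls. The sketch below derives one L'Ecuyer-CMRG stream per worker from a single seed, following what clusterSetRNGStream is documented to do. It does not start a cluster and is not the code used by parDosa; the helper make_streams, the worker count and the seed value are illustrative assumptions.

```r
library(parallel)

# Hypothetical helper: derive one L'Ecuyer-CMRG stream seed per worker.
make_streams <- function(n_workers, iseed) {
  old_kind <- RNGkind()[1]                 # remember the current generator
  on.exit(RNGkind(old_kind))               # restore it on exit
  RNGkind("L'Ecuyer-CMRG")                 # select the L'Ecuyer generator
  set.seed(iseed)                          # seed the first stream
  seeds <- vector("list", n_workers)
  seeds[[1]] <- get(".Random.seed", envir = globalenv())
  for (i in seq_len(n_workers - 1)) {
    seeds[[i + 1]] <- nextRNGStream(seeds[[i]])  # advance to the next stream
  }
  seeds
}

s1 <- make_streams(3, iseed = 123)
s2 <- make_streams(3, iseed = 123)
identical(s1, s2)   # the same iseed yields the same set of streams
```

Each worker would receive a different element of the list, so the workers draw from separate streams while the whole run stays reproducible for a fixed seed and a fixed number of workers.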
Peter Solymos, solymos@ualberta.ca
The function uses 'snow' type clusters when cl is a cluster object,
and 'multicore' type forking (shared memory) when cl is an integer.
If cl is NULL, the value from getOption("mc.cores") is used.
The function sets the random seeds, loads the packages in lib onto the cluster,
sets the working directory to dir, exports cldata, and evaluates fun on seq.
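The sequence of steps above can be approximated with plain parallel calls. The following is a minimal sketch under stated assumptions, not the source of parDosa: it exports cldata to the workers' global environment rather than to a hidden environment, and the names run_steps and offset and the seed value are hypothetical.

```r
library(parallel)

cl <- makeCluster(2)                          # 'snow' type cluster, 2 workers
cldata <- list(offset = 10)                   # hypothetical data used by fun
fun <- function(i) cldata$offset + i + runif(1)
environment(fun) <- globalenv()               # look up cldata in the worker's global env

run_steps <- function(cl, seq, fun, cldata, iseed) {
  clusterSetRNGStream(cl, iseed = iseed)      # set the random seeds
  clusterEvalQ(cl, library(stats))            # load a package on the workers
  clusterExport(cl, "cldata", envir = environment())  # export the data
  res <- parLapply(cl, seq, fun)              # evaluate fun on seq
  clusterEvalQ(cl, rm(cldata))                # remove the exported data again
  res
}

res1 <- run_steps(cl, 1:6, fun, cldata, iseed = 123)
res2 <- run_steps(cl, 1:6, fun, cldata, iseed = 123)
stopCluster(cl)
identical(res1, res2)   # same seed and same split give the same results
```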
No balancing (balancing = "none") means that the problem
is split into roughly equal subsets, without regard to size
(see clusterSplit). This splitting is deterministic (reproducible).
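As an illustration of this deterministic split, the sketch below uses splitIndices from the parallel package, which clusterSplit relies on for the number of workers in the cluster. The item vector and the worker count are made-up values, and no cluster is started.

```r
library(parallel)

seq_to_split <- letters[1:10]     # hypothetical problem with 10 items
n_workers <- 3                    # hypothetical number of workers

# Split the item indices into contiguous, roughly equal chunks
idx <- splitIndices(length(seq_to_split), n_workers)
chunks <- lapply(idx, function(i) seq_to_split[i])

lengths(chunks)                   # chunk sizes differ by at most one here
```

Because the split depends only on the length of the sequence and the number of workers, repeating the call gives the same chunks.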
Load balancing (balancing = "load") means
that the problem is not split into subsets
a priori; instead, each subsequent item is sent to the
next worker that becomes free
(see clusterApplyLB for load balancing).
This splitting is non-deterministic (might not be reproducible).
Size balancing (balancing = "size") means
that the problem is split into subsets with respect to size
(see clusterSplitSB and parLapplySB).
In size balancing, the problem is re-ordered from
largest to smallest, and then subsets are
determined by minimizing the total approximate processing time.
This splitting is deterministic (reproducible).
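The idea of size balancing can be sketched with a simple greedy rule: order the items from largest to smallest, then give each item to the worker with the smallest accumulated size. This is an illustrative stand-in written for this page, not the algorithm implemented in clusterSplitSB; the helper split_by_size and the sizes are hypothetical.

```r
# Hypothetical helper: largest-first greedy assignment of items to workers.
split_by_size <- function(seq, size, n_workers) {
  size <- rep(size, length.out = length(seq))   # recycle size if needed
  ord <- order(size, decreasing = TRUE)         # largest to smallest
  load <- numeric(n_workers)                    # accumulated size per worker
  groups <- vector("list", n_workers)
  for (j in ord) {
    w <- which.min(load)                        # least loaded worker so far
    groups[[w]] <- c(groups[[w]], seq[j])
    load[w] <- load[w] + size[j]
  }
  list(groups = groups, load = load)
}

out <- split_by_size(seq = 1:6, size = c(8, 1, 5, 2, 4, 3), n_workers = 2)
out$groups   # items assigned to each worker
out$load     # approximate processing time per worker
```

The assignment depends only on seq, size and the number of workers, so it is reproducible, in line with the deterministic behaviour described above.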
Size and load balancing (balancing = "both") means
that the problem is re-ordered from largest to smallest,
and then non-deterministic load balancing
is used (see parLapplySLB).
If size is correct, this is identical to size balancing.
This splitting is non-deterministic (might not be reproducible).
Size balancing: parLapplySB, parLapplySLB, mclapplySB.
Optimizing the number of workers: clusterSize, plotClusterSize.
parDosa is used internally by the parallel dclone functions
jags.parfit, dc.parfit, parJagsModel, parUpdate, and parCodaSamples.
parDosa manipulates specific environments described on the help page DcloneEnv.