dopar and combinepar are interfaces primarily designed to apply some function fn in parallel to the columns of a matrix, although other uses are possible. Depending on the nb_cores argument, parallel or serial computation is performed. A socket cluster is used by default for parallel computations, but a fork cluster can be requested on Linux and similar operating systems by using the argument cluster_args=list(type="FORK").
dopar has been designed to provide a progress bar by default in all evaluation contexts. A drawback is that different procedures are called depending on, e.g., the type of cluster, with different possible controls. In particular, foreach is called in some cases but not others, so non-trivial values of its .combine control are not always enforced. The alternative interface combinepar always uses foreach, and still tries to provide a progress bar by default, but may fail to do so in some cases (see Details).
dopar(newresp, fn, nb_cores = NULL, fit_env,
      control = list(.final=function(v) if( ! is.list(v[[1]])) {do.call(cbind,v)} else v),
      cluster_args = NULL, debug. = FALSE, iseed = NULL,
      showpbar = eval(spaMM.getOption("barstyle")),
      pretest_cores = NULL, ...)

combinepar(newresp, fn, nb_cores = NULL, cluster = NULL, fit_env,
           control = list(.final=function(v) if( ! is.list(v[[1]])) {do.call(cbind,v)} else v),
           cluster_args = NULL, debug. = FALSE, iseed = NULL,
           showpbar = eval(spaMM.getOption("barstyle")),
           pretest_cores = NULL, ...)
The result of calling foreach, pbapply or mclapply, depending on the control argument and the interface used. A side effect of either interface is to show a progress bar whose character indicates the type of parallelisation performed: an "F" or the default "=" character for fork clusters, a "P" for parallel computation via foreach and doSNOW, a "p" for parallel computation via foreach and doFuture or via pbapply, and an "s" for serial computation via foreach and doParallel or via pbapply.
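The effect of the default .final element of control on the returned value can be checked in isolation. The sketch below copies the default function from the usage signature and applies it to two hypothetical result lists (vec_results and list_results are made-up names standing for per-column results of fn); it only uses base R and does not call dopar itself.

```r
# Default .final, copied from the usage signature
final_default <- function(v) if (!is.list(v[[1]])) { do.call(cbind, v) } else v

# Hypothetical per-column results that are plain vectors:
# they are bound column-wise into a matrix
vec_results <- list(c(1, 2), c(3, 4), c(5, 6))
m <- final_default(vec_results)
dim(m)  # 2 rows, 3 columns

# Hypothetical per-column results that are lists:
# they are returned unchanged, as a list of lists
list_results <- list(list(a = 1), list(a = 2))
unchanged <- final_default(list_results)
identical(unchanged, list_results)
```

So a fn that returns a vector for each column yields a matrix with one column per column of newresp, while a fn that returns lists yields a list.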
newresp: A matrix to whose columns fn will be applied (e.g., as used internally in spaMM, the return value of a simulate.HLfit() call); or an integer, which is then converted to the trivial matrix matrix(seq(newresp),ncol=newresp,nrow=1).
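As a small base-R illustration of the integer case, the conversion stated above can be reproduced directly. The value 4L and the squaring step are arbitrary choices for the example; the last line only mimics, serially, what applying a function to each column means.

```r
newresp <- 4L
# Conversion quoted in the argument description
trivial <- matrix(seq(newresp), ncol = newresp, nrow = 1)
dim(trivial)   # 1 row, 4 columns

# Each column holds a single index, so fn would receive y = 1, then y = 2, ...
squares <- sapply(seq_len(ncol(trivial)), function(i) trivial[, i]^2)
squares
```

Passing an integer is therefore a convenient way to run fn a given number of times with y acting as an iteration index.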
fn: A function whose first argument is named y. The function will be applied with y taken to be each column of newresp in turn.
nb_cores: Integer. Number of cores to use for parallel computations. If >1 (and no cluster is provided by the cluster argument), a cluster of nb_cores nodes is created, used, and stopped on completion of the computation. Otherwise, no parallel computation is performed.
cluster: (for combinepar only) A cluster object (as returned by parallel::makeCluster or parallel::makeForkCluster). If this is used, the nb_cores and cluster_args arguments are ignored. The cluster is not stopped on completion of the computation.
fit_env: (for socket clusters only) An environment, or a list, containing variables to be exported to the nodes of the cluster (by parallel::clusterExport); e.g., list(bar=bar) to pass object bar to each node. The argument control=list(.errorhandling = "pass"), described below, is useful for finding missing variables.
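The export mechanism named above can be tried with the parallel package alone. The following is a minimal sketch, not spaMM's internal code: it assumes that a list such as list(bar=bar) is first turned into an environment (here with list2env), since parallel::clusterExport takes its variables from an environment. The object bar and its value are made up for the example.

```r
library(parallel)

bar <- 42
fit_env <- list(bar = bar)        # list form, as in the argument description

cl <- makeCluster(2)              # socket cluster (the default type)
# Assumption: the list is converted to an environment before exporting
clusterExport(cl, varlist = names(fit_env), envir = list2env(fit_env))
# Each node can now evaluate expressions that refer to bar
on_nodes <- unlist(clusterEvalQ(cl, bar * 2))
stopCluster(cl)

on_nodes
```

A variable that fn needs but that was not exported this way would raise an "object not found" error on the nodes, which is what .errorhandling = "pass" helps to reveal.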
control: A list following the foreach control syntax, even if foreach is not used. There are limitations when dopar (but not combinepar) is used, in all but the first case below:
For socket clusters, with doSNOW attached, foreach is called with default arguments including i = 1:ncol(newresp), .inorder = TRUE, .errorhandling = "remove", .packages = "spaMM", and further arguments taken from the present function's control argument, which may also be used to override the defaults. For example, .errorhandling = "pass" is useful to get error messages from the nodes, and is therefore strongly recommended when first experimenting with this function.
For socket clusters, with doSNOW not attached, dopar calls pbapply instead of foreach, but control$.packages is still handled. The result is still in the format returned in the first case, i.e. by foreach, taking the control argument into account. pbapply arguments may be passed through the ... argument.
If a fork cluster is used, dopar calls mclapply instead of foreach. control$mc.silent can be used to control the mc.silent argument of mclapply.
(If nb_cores=1, dopar calls mclapply.)
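The way a user-supplied control list overrides the defaults of the first case can be pictured with base R's modifyList. This is only an illustration of the override logic described above, not the code spaMM actually runs; the defaults shown are the ones listed in the text (the iteration variable i is left out), and the control list is a hypothetical user choice.

```r
# Defaults listed above for socket clusters with doSNOW attached
defaults <- list(.inorder = TRUE, .errorhandling = "remove", .packages = "spaMM")

# Hypothetical user-supplied control: keep errors instead of dropping them
control <- list(.errorhandling = "pass")

# modifyList() keeps the defaults and replaces the entries named in control
merged <- modifyList(defaults, control)
str(merged)
```

With .errorhandling = "pass", a failing evaluation returns its error object in the results instead of being silently removed, which makes missing variables or packages on the nodes much easier to spot.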
cluster_args: A list of arguments passed to parallel::makeCluster. E.g., outfile="log.txt" may be useful to collect output from the nodes, and type="FORK" to force a fork cluster on Linux and similar systems.
debug.: (for socket clusters only) For debugging purposes. The effect, if any, is to be defined by the fn provided by the user.
iseed: (all parallel contexts) Integer, or NULL. If an integer, it is used as the iseed argument of clusterSetRNGStream to initialize the "L'Ecuyer-CMRG" random-number generator (see Details). If iseed is NULL, the default generator is selected on each node, where its seed is not controlled.
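The role of iseed can be illustrated with the parallel package directly. The sketch below is not spaMM code: draw_once is a hypothetical helper that creates a two-node socket cluster, seeds it with parallel::clusterSetRNGStream, and draws one uniform number per task with parSapply. It assumes that at least two local worker processes can be started.

```r
library(parallel)

# Hypothetical helper: seed a fresh 2-node cluster, then draw 4 random numbers
draw_once <- function(iseed) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  # Selects "L'Ecuyer-CMRG" on the nodes and gives each its own stream
  clusterSetRNGStream(cl, iseed = iseed)
  parSapply(cl, 1:4, function(i) runif(1))
}

a <- draw_once(123)
b <- draw_once(123)
same_draws <- identical(a, b)   # same iseed, same cluster size, same task split
same_draws
```

Reproducibility here relies on the tasks being split among the nodes in the same way on each run; as noted in the Details, this kind of control is not sufficient when the doSNOW backend is used.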
showpbar: (for socket clusters only) Controls the display of the progress bar. See the barstyle option for details.
pretest_cores: (for socket clusters only) A function to run on the cores before running fn. It may be used to check that all arguments of fn can be evaluated in the cores' environments (the internal function .pretest_fn_on_cores provides an example).
...: Further arguments to be passed (unevaluated) to fn, unless they are caught on the way by pbapply. This means that different results may in principle be obtained depending on the mode of parallelisation, which is the kind of design issue that combinepar aims to resolve by always calling foreach.
Control of random numbers through the "L'Ecuyer-CMRG" generator and the iseed argument is not sufficient for consistent results when the doSNOW parallel backend is used, so if you really need such control in a fn that uses random numbers, do not use doSNOW. It is nevertheless fine to use doSNOW for bootstrap procedures in spaMM, because the fitting functions do not use random numbers: only sample simulation uses them, and it is not performed in parallel.
combinepar calls foreach::%dopar%, which assumes that a cluster has been declared using a suitable backend such as doSNOW, doFuture or doParallel. If only the latter is available, no progress bar is displayed. A method to render a bar when doParallel is used can be found on the Web, but that bar is not a valid progress bar, as it is displayed only after all the processes have been run.
dofuture is yet another interface with (essentially) the same functionalities as dopar. See the documentation of the wrap_parallel option for its differences from dopar.
## See source code of spaMM_boot()

if (FALSE) {
  # Useless function, but requiring some argument beyond the first
  foo <- function(y, somearg, ...) {
    if ( is.null(somearg) || TRUE ) length(y)
  }

  # Whether FORK can be used depends on the OS and on whether RStudio is used:
  dopar(matrix(1,ncol=4,nrow=3), foo, fit_env=list(), somearg=NULL,
        nb_cores=2, cluster_args=list(type="FORK"))
}