Parallelize applying a function over a list or vector according to the registered parallelization engine.
tm_parLapply(X, FUN, ...)
tm_parLapply_engine(new)
X: a vector (atomic or list), or another object suitable for the engine in use.
FUN: the function to be applied to each element of X.
...: optional arguments to FUN.
new: an object inheriting from class cluster as created by makeCluster() from package parallel, or a function with formals X, FUN and ..., or NULL, corresponding to the default of using no parallelization engine.
A list of the same length as X, containing the result of applying FUN, together with the ... arguments, to each element of X.
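As a minimal sketch of the call and its return value, the following applies a function with one extra argument to a short integer vector. The helper `power` and its argument `p` are hypothetical names chosen for illustration; the example assumes package tm is installed and is skipped otherwise.

```r
# Runs only if package tm is available.
if (requireNamespace("tm", quietly = TRUE)) {
  old <- tm::tm_parLapply_engine()   # remember the current engine
  tm::tm_parLapply_engine(NULL)      # no parallelization engine

  # Hypothetical helper: raise x to the power p.
  power <- function(x, p) x^p

  # p = 2 is passed on to FUN through '...'.
  res <- tm::tm_parLapply(1:3, power, p = 2)

  # The result is a list with one element per element of X.
  str(res)

  tm::tm_parLapply_engine(old)       # restore the previous engine
}
```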
Parallelization can be employed to speed up some of the embarrassingly parallel computations performed in package tm, specifically tm_index(), tm_map() on a non-lazy-mapped VCorpus, and TermDocumentMatrix() on a VCorpus or PCorpus. Functions tm_parLapply() and tm_parLapply_engine() can be used to customize parallelization according to the available resources.
tm_parLapply_engine() is used for getting (with no arguments) or setting (with argument new) the parallelization engine employed (see below for examples).
If the engine is set to an object inheriting from class cluster, tm_parLapply() calls parLapply() with this cluster and the given arguments. If set to a function, tm_parLapply() calls the function with the given arguments. Otherwise, it simply calls lapply().
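The three cases above can be sketched in base R as follows. This is an illustration of the described behavior, not the source code of tm: `dispatch_lapply`, `square` and `counting_engine` are hypothetical names, and the engine is passed explicitly here rather than looked up from a registry.

```r
library(parallel)

# Hypothetical helper mirroring the three cases described above.
dispatch_lapply <- function(engine, X, FUN, ...) {
  if (inherits(engine, "cluster")) {
    # Cluster engine: delegate to parLapply() on that cluster.
    parLapply(engine, X, FUN, ...)
  } else if (is.function(engine)) {
    # Function engine: call it with the given arguments.
    engine(X, FUN, ...)
  } else {
    # Anything else (e.g. NULL): plain sequential lapply().
    lapply(X, FUN, ...)
  }
}

square <- function(x) x^2

# NULL engine falls through to lapply().
res_seq <- dispatch_lapply(NULL, 1:3, square)

# A function engine that counts how often it is invoked.
calls <- 0L
counting_engine <- function(X, FUN, ...) {
  calls <<- calls + 1L
  lapply(X, FUN, ...)
}
res_fun <- dispatch_lapply(counting_engine, 1:3, square)
```

Any function with formals X, FUN and ... can serve as an engine in this scheme, which is what makes wrappers around other apply-style functions possible.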
Hence, to achieve parallelization via parLapply() and a default cluster registered via setDefaultCluster(), one can use
tm_parLapply_engine(function(X, FUN, ...) parallel::parLapply(NULL, X, FUN, ...))
or re-register the cluster, say cl, using
tm_parLapply_engine(cl)
(note that there is no mechanism for programmatically getting the registered default cluster). Using
tm_parLapply_engine(function(X, FUN, ...) parallel::parLapplyLB(NULL, X, FUN, ...))
or
tm_parLapply_engine(function(X, FUN, ...) parallel::parLapplyLB(cl, X, FUN, ...))
gives load-balancing parallelization with the registered default or given cluster, respectively. To achieve parallelization via forking (on Unix-alike platforms), one can use the above with clusters created by makeForkCluster(), or use
tm_parLapply_engine(parallel::mclapply)
or
tm_parLapply_engine(function(X, FUN, ...) parallel::mclapply(X, FUN, ..., mc.cores = n))
to use mclapply() with the default or given number n of cores.
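The engine settings described above might be exercised end to end as in the following sketch. It assumes package tm is installed (and is skipped otherwise); the worker count of 2 and the helper `square` are arbitrary choices for illustration. The forking variant is attempted only on Unix-alike platforms.

```r
library(parallel)

if (requireNamespace("tm", quietly = TRUE)) {
  old <- tm::tm_parLapply_engine()   # remember the current engine
  square <- function(x) x^2          # hypothetical workload

  # Register a cluster object directly as the engine.
  cl <- makeCluster(2)
  tm::tm_parLapply_engine(cl)
  res_cluster <- tm::tm_parLapply(1:4, square)

  # Load-balancing variant using the same cluster.
  tm::tm_parLapply_engine(
    function(X, FUN, ...) parLapplyLB(cl, X, FUN, ...)
  )
  res_lb <- tm::tm_parLapply(1:4, square)

  stopCluster(cl)

  # Forking via mclapply() with a given number of cores.
  if (.Platform$OS.type == "unix") {
    tm::tm_parLapply_engine(
      function(X, FUN, ...) mclapply(X, FUN, ..., mc.cores = 2)
    )
    res_fork <- tm::tm_parLapply(1:4, square)
  }

  tm::tm_parLapply_engine(old)       # restore the previous engine
}
```

Restoring the previous engine at the end matters here because the engine is a package-wide setting: a function engine that still refers to a stopped cluster would fail on the next call.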
See also makeCluster(), parLapply(), parLapplyLB(), and mclapply() in package parallel.