Parallelize applying a function over a list or vector according to the registered parallelization engine.
tm_parLapply(X, FUN, ...)
tm_parLapply_engine(new)
A list of the same length as X, containing the results of applying FUN
together with the ... arguments to each element of X.
X: a vector (atomic or list), or another object suitable for the engine in use.
FUN: the function to be applied to each element of X.
...: optional arguments to FUN.
new: an object inheriting from class cluster as created
by makeCluster() from package
parallel, or a function with formals X, FUN and
..., or NULL, corresponding to the default of using no
parallelization engine.
Parallelization can be employed to speed up some of the embarrassingly
parallel computations performed in package tm, specifically
tm_index(), tm_map() on a non-lazy-mapped
VCorpus, and TermDocumentMatrix() on a
VCorpus or PCorpus.
Functions tm_parLapply() and tm_parLapply_engine() can
be used to customize parallelization according to the available
resources.
tm_parLapply_engine() is used for getting (with no arguments)
or setting (with argument new) the parallelization engine
employed (see below for examples).
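As a minimal sketch of the get/set pattern just described, assuming package tm is installed: the current engine is saved, a plain function with formals X, FUN and ... is registered, tm_parLapply() is called, and the previous engine is restored. The wrapper around lapply() is only an illustrative stand-in for a real parallel backend.

```r
library(tm)

# Get the currently registered engine (NULL unless one was set earlier).
old_engine <- tm_parLapply_engine()

# Register a function engine. This stand-in just delegates to lapply();
# a real engine would dispatch to a parallel backend instead.
tm_parLapply_engine(function(X, FUN, ...) lapply(X, FUN, ...))

# Extra arguments (here k) are passed on to FUN via '...'.
res <- tm_parLapply(1:3, function(x, k) x * k, k = 2)

# Restore whatever engine was registered before.
tm_parLapply_engine(old_engine)
```

Saving and restoring the engine keeps the change local, which matters because the engine is a session-wide setting.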
If the engine is set to an object inheriting from class cluster,
tm_parLapply() calls
parLapply() with this cluster and
the given arguments. If set to a function, tm_parLapply()
calls the function with the given arguments. Otherwise, it simply
calls lapply().
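The three-way dispatch described above can be sketched in base R as follows. This is an illustration of the documented behavior, not the package's source: dispatch_lapply() and square() are hypothetical names, and the engine is passed in explicitly rather than looked up from a registry.

```r
library(parallel)

# Hypothetical stand-in for the documented dispatch rule.
dispatch_lapply <- function(engine, X, FUN, ...) {
  if (inherits(engine, "cluster")) {
    # Cluster object: hand the work to parLapply() on that cluster.
    parLapply(engine, X, FUN, ...)
  } else if (is.function(engine)) {
    # Function engine: call it with the given arguments.
    engine(X, FUN, ...)
  } else {
    # Anything else (e.g. NULL): plain sequential lapply().
    lapply(X, FUN, ...)
  }
}

square <- function(x) x^2

# No engine: sequential evaluation.
res_serial <- dispatch_lapply(NULL, 1:4, square)

# Function engine with formals X, FUN and '...'.
res_fun <- dispatch_lapply(function(X, FUN, ...) lapply(X, FUN, ...),
                           1:4, square)

# Cluster engine: two local workers, always stopped afterwards.
cl <- makeCluster(2)
res_cluster <- tryCatch(dispatch_lapply(cl, 1:4, square),
                        finally = stopCluster(cl))
```

All three calls should return the same list; only where the work is executed differs.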
Hence, parallelization via
parLapply()
and a default cluster registered via
setDefaultCluster() can be
achieved via
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapply(NULL, X, FUN, ...))
or re-registering the cluster, say cl, using
tm_parLapply_engine(cl)
(note that since R version 3.5.0, one can use
getDefaultCluster() to get
the registered default cluster). Using
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapplyLB(NULL, X, FUN, ...))
or
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapplyLB(cl, X, FUN, ...))
gives load-balancing parallelization with the registered default or
given cluster, respectively. To achieve parallelization via forking
(on Unix-alike platforms), one can use the above with clusters created
by makeForkCluster(), or use
tm_parLapply_engine(parallel::mclapply)
or
tm_parLapply_engine(function(X, FUN, ...)
    parallel::mclapply(X, FUN, ..., mc.cores = n))
to use mclapply() with the default or
given number n of cores.
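Putting the pieces together, a complete session might look like the following sketch. It assumes package tm is installed and that two local worker processes can be started; the two-document corpus and the worker count are purely illustrative.

```r
library(tm)
library(parallel)

# Tiny illustrative corpus; real corpora would be much larger.
docs <- c("first document text", "second document text")
corpus <- VCorpus(VectorSource(docs))

# Start two local workers (an arbitrary choice for this sketch).
cl <- makeCluster(2)

tdm <- tryCatch({
  # Register the cluster so that tm_parLapply() uses parLapply() on it.
  tm_parLapply_engine(cl)
  # TermDocumentMatrix() on a VCorpus is one of the parallelized computations.
  TermDocumentMatrix(corpus)
}, finally = {
  # Reset to no engine and shut the workers down, even on error.
  tm_parLapply_engine(NULL)
  stopCluster(cl)
})
```

Resetting the engine before stopping the cluster avoids leaving a stopped cluster registered for later calls. For a corpus this small, the cost of starting workers will likely exceed any speed-up; the benefit shows only on larger inputs.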
makeCluster(),
parLapply(),
parLapplyLB(), and
mclapply().