The harvestr Parallel Simulation Framework.

The harvestr package is a framework for conducting replicable parallel simulations in R. It builds on the popular plyr package for its split-apply-combine framework, and on the parallel combined multiple-recursive generator from L'Ecuyer (1999).
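To make the idea of independent, replayable random streams concrete, here is a minimal sketch that uses only base R and the parallel package. It does not show harvestr's internals; the helper draw_from is a hypothetical name introduced for illustration.

```r
library(parallel)

# Switch to the L'Ecuyer combined multiple-recursive generator.
RNGkind("L'Ecuyer-CMRG")
set.seed(12345)

# The current generator state serves as the seed of the first stream.
stream1 <- get(".Random.seed", envir = globalenv())
# nextRNGStream() jumps ahead to the start of the next independent stream.
stream2 <- nextRNGStream(stream1)

# Hypothetical helper: install a stream seed, then draw n uniforms.
draw_from <- function(seed, n) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}

a       <- draw_from(stream1, 3)
b       <- draw_from(stream2, 3)
a_again <- draw_from(stream1, 3)

# Replaying a stream reproduces its draws; different streams differ.
identical(a, a_again)
identical(a, b)
```

Because each stream is identified by its own seed, the draws for one task do not depend on the order in which, or the worker on which, the other tasks run.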

Because the replicable simulations are based on seed values, this package takes seeds and farming as its theme. The principal functions are as follows:

  • gather - creates a list of parallel RNG seeds.
  • farm - uses seeds from gather to evaluate an expression after each seed has been set. This is useful for generating data.
  • harvest - takes the results from farm and continues evaluation with the random number generation where farm left off. This is useful for evaluating data generated with farm through stochastic methods such as Markov chain Monte Carlo.
  • reap - the single-element version of harvest, for one element that has appropriately structured seed attributes.
  • plant - takes a list of objects, assumed to be of the same class, and gives each element a parallel seed value to use with harvest for evaluation.
  • graft - splits RNG sub-streams from a main object.
  • sprout - gets the seeds for use in graft.
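The gather, farm and harvest steps described above can be sketched as follows. This is an illustrative example rather than canonical usage: it assumes the call forms gather(n, seed = ...), farm(seeds, expr) and harvest(list, fun, ...), so check the package help pages before relying on the argument names.

```r
library(harvestr)

# Ten independent seeds derived from one master seed.
seeds <- gather(10, seed = 12345)

# Evaluate the expression once per seed to generate data.
first  <- farm(seeds, rnorm(100))
second <- farm(seeds, rnorm(100))

# The same seeds give the same draws, so the simulation is replicable.
same_draws <- isTRUE(all.equal(unlist(first), unlist(second)))

# Continue each element's random stream where farm left off:
# here each generated vector is resampled with replacement.
resampled <- harvest(first, sample, replace = TRUE)
```

Because harvest continues from the seed stored with each element, rerunning the harvest step on the same farm output should give the same resamples.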

Lists

All of the functions work on lists: they expect and return lists, which can easily be converted to data frames. I would do this with ldply(list, I) from plyr.

Parallel

The advantage of setting the seeds like this is that parallelization is seamless and transparent. As in the plyr framework, each function has a .parallel argument, which defaults to FALSE; when set to TRUE, evaluation runs in parallel. An appropriate parallel backend must be registered first. For example, with a multicore backend you would run the following code.

library(doMC)
registerDoMC()

See the documentation of the plyr and foreach packages for the backends that are currently supported.
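As a rough sketch of what the .parallel argument buys you, the example below runs the same farm call serially and in parallel and compares the draws. It assumes a Unix-like system where doMC is available; the choice of two cores and the seed value are arbitrary, and the gather and farm call forms are assumptions to verify against the help pages.

```r
library(harvestr)
library(doMC)

# Register a multicore backend for foreach (not available on Windows).
registerDoMC(cores = 2)

seeds <- gather(4, seed = 2024)

# Same seeds, evaluated serially and then in parallel.
serial_run   <- farm(seeds, rnorm(5))
parallel_run <- farm(seeds, rnorm(5), .parallel = TRUE)

# Each element is tied to its own seed, so the draws should match
# regardless of which worker evaluated it.
matches <- isTRUE(all.equal(unlist(serial_run), unlist(parallel_run)))
```

On platforms without doMC, another foreach backend can be registered instead; the farm call itself stays the same.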

Operating Systems

harvestr is limited in its capabilities by the packages that it depends on, mainly foreach and plyr. The parallel backends are platform limited; read the documentation of the individual packages.

Notes

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Install

install.packages('harvestr')

Version

0.7.1

License

GPL (>= 2)

Last Published

August 29th, 2016

Functions in harvestr (0.7.1)

  • bale - Combine results into a data frame.
  • getAttr - Retrieve an attribute or a default if not present.
  • called_from - Test if a function was called from others.
  • harvest - Harvest the results.
  • total_time - Retrieve the total time for a simulation.
  • plant - Assign elements of a list with seeds.
  • sprout - Create substreams of numbers based on a current stream.
  • plow - Apply over rows of a data frame.
  • reap - Call a function continuing the random number stream.
  • withseed - Do a computation with a given seed.
  • use_method - Use a reference class method.
  • noattr - Strip attributes from an object.
  • gather - Gather independent seeds.
  • harvestr - A Simple Reproducible Parallel Simulation Framework.
  • farm - Evaluate an expression for a set of seeds.
  • is_seeded - Check if an object or list of objects has seed attributes.
  • Interactive - Smarter interactive test.