
Create a "hexbin" object of hexagonally binned data for a distributed data frame. This computation is division-agnostic: the result does not depend on how the data frame is split up.
drHexbin(data, xVar, yVar, by = NULL, xTransFn = identity,
yTransFn = identity, xRange = NULL, yRange = NULL, xbins = 30,
shape = 1, params = NULL, packages = NULL, control = NULL)
data: a distributed data frame
xVar, yVar: names of the variables to use
by: an optional variable name or vector of variable names by which to group hexbin computations
xTransFn, yTransFn: transformation functions to apply to the x and y variables, respectively, prior to binning
xRange, yRange: ranges of the x and y variables (can be left as NULL if summaries have been computed)
xbins: the number of bins partitioning the range of the x variable (xbnds in hexbin)
shape: the shape = yheight/xwidth of the plotting regions
params: a named list of objects external to the input data that are needed in the distributed computing (most should be taken care of automatically, so this rarely needs to be specified)
packages: a vector of R package names that contain functions used in fn (most should be taken care of automatically, so this rarely needs to be specified)
control: parameters specifying how the backend should handle things (most likely parameters to rhwatch in RHIPE); see rhipeControl and localDiskControl
a "hexbin" object
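As a rough illustration of what a "hexbin" object contains, the sketch below builds one directly with the hexbin package on simulated data rather than through drHexbin. The variable names h and centers are illustrative only, and it is an assumption here that the object returned by drHexbin can be inspected the same way.

```r
library(hexbin)

set.seed(42)
# Bin 500 simulated points; xbins and shape match the drHexbin defaults
h <- hexbin(rnorm(500), rnorm(500), xbins = 30, shape = 1)

class(h)        # S4 class of the result
head(h@cell)    # ids of the non-empty hexagon cells
head(h@count)   # number of points falling in each of those cells

# Convert cell ids to the x/y coordinates of the hexagon centers
centers <- hcell2xy(h)
str(centers)
```

The cell ids and counts are the pieces that matter for combining results across subsets, since counts for the same cell can simply be added.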
Carr, D. B. et al. (1987) Scatterplot Matrix Techniques for Large
# NOT RUN {
# load datadr (divide, drHexbin) and hexbin (hexbin)
library(datadr)
library(hexbin)
# create dummy data and divide it
dat <- data.frame(
xx = rnorm(1000),
yy = rnorm(1000),
by = sample(letters, 1000, replace = TRUE))
d <- divide(dat, by = "by", update = TRUE)
# compute hexbins on divided object
dhex <- drHexbin(d, xVar = "xx", yVar = "yy")
# dhex is equivalent to running on undivided data:
hexbin(dat$xx, dat$yy)
# }
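The division-agnostic property can also be checked without a distributed backend. The following is a minimal sketch using only the hexbin package, not the drHexbin implementation itself: each subset is binned on the same grid (shared xbnds and ybnds, which play the role of xRange and yRange), and per-cell counts are summed. The names x, y, g, parts, and merged are illustrative assumptions.

```r
library(hexbin)

set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
g <- sample(letters, 1000, replace = TRUE)  # arbitrary division of the rows

# A shared range is what makes the subset grids line up
xr <- range(x)
yr <- range(y)

# Reference: bin all of the data at once
full <- hexbin(x, y, xbins = 30, xbnds = xr, ybnds = yr)
full_df <- data.frame(cell = full@cell, count = full@count)
full_df <- full_df[order(full_df$cell), ]

# Bin each subset separately on the same grid
parts <- lapply(split(seq_along(x), g), function(i) {
  h <- hexbin(x[i], y[i], xbins = 30, xbnds = xr, ybnds = yr)
  data.frame(cell = h@cell, count = h@count)
})

# Combine: add up the counts that landed in the same cell
merged <- aggregate(count ~ cell, data = do.call(rbind, parts), FUN = sum)
merged <- merged[order(merged$cell), ]

# Same non-empty cells and same counts as the undivided computation
all.equal(as.numeric(merged$cell), as.numeric(full_df$cell))
all.equal(as.numeric(merged$count), as.numeric(full_df$count))
```

Changing how g is generated changes the subsets but not the merged counts, which is the sense in which the split does not matter. Without a shared range, each subset would be binned on its own grid and the cell ids would no longer be comparable.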