
ddR (version 0.1.2)

repartition: Repartitions a distributed object

Description

Repartitions a distributed object. The function takes two inputs, a distributed object ('dobj') and a skeleton, which must both be distributed objects of the same type and the same dimensions. If 'dobj' and 'skeleton' are partitioned differently, the function returns a new distributed object that holds the same data as 'dobj' but uses the partitioning scheme of 'skeleton'.

Usage

repartition(dobj, skeleton)

## S3 method for class 'DObject'
repartition(dobj, skeleton)

Arguments

dobj
distributed object whose data is to be preserved, but repartitioned.
skeleton
distributed object whose partitioning is to be emulated in the output.

Value

A new distributed object with the data of 'dobj' and the partitioning of 'skeleton'.

Methods (by class)

  • DObject: The default implementation of repartition.

References

Prasad, S., Fard, A., Gupta, V., Martinez, J., LeFevre, J., Xu, V., Hsu, M., and Roy, I. (2015) Large scale predictive analytics in Vertica: Fast data transfer, distributed model creation and in-database prediction. _SIGMOD 2015_, 1657-1668.

Venkataraman, S., Bodzsar, E., Roy, I., AuYoung, A., and Schreiber, R. (2013) Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices. _EuroSys 2013_, 197-210.

Homepage: https://github.com/vertica/ddR

Examples

## Not run: 
a <- dlist(1, 2, 3, 4, nparts = 2)
b <- dmapply(function(x) x, 11:14, nparts = 4)
# c will have 4 partitions of length 1 each, containing the values 1 to 4.
c <- repartition(a, b)
## End(Not run)
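
The effect of repartitioning can also be illustrated without a ddR backend. The following is a minimal base-R sketch, not ddR's implementation: it assumes a distributed list can be modeled as a plain list of partitions, and `repartition_sketch`, `a_parts`, `b_parts`, and `c_parts` are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper, not part of ddR: a "distributed list" is modeled
# as a plain R list of partitions, where each partition is itself a list.
repartition_sketch <- function(parts, skeleton_parts) {
  # Flatten one level so the elements keep their order across partitions.
  values <- unlist(parts, recursive = FALSE)
  # Partition sizes to emulate, taken from the skeleton.
  sizes <- unname(lengths(skeleton_parts))
  # Both inputs must hold the same total number of elements.
  stopifnot(length(values) == sum(sizes))
  # First and last element index of each target partition.
  ends <- cumsum(sizes)
  starts <- ends - sizes + 1
  # Slice the flattened data into partitions of the skeleton's sizes.
  unname(Map(function(s, e) values[seq_len(e - s + 1) + s - 1], starts, ends))
}

a_parts <- list(list(1, 2), list(3, 4))                  # 2 partitions of length 2
b_parts <- list(list(11), list(12), list(13), list(14))  # 4 partitions of length 1
c_parts <- repartition_sketch(a_parts, b_parts)

lengths(c_parts)  # 1 1 1 1: partition sizes follow b_parts
unlist(c_parts)   # 1 2 3 4: data still comes from a_parts
```

As in the example above, the result keeps the data of the first argument and takes only the partition sizes from the second; the values stored in the skeleton are never read.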
