
h2o (version 3.8.1.3)

h2o.merge: Merge Two H2O Data Frames

Description

Merges two H2OFrame objects on the columns whose names they share. Unlike base R's merge, h2o.merge only supports merging on shared column names.

Usage

h2o.merge(x, y, all.x = FALSE, all.y = FALSE, by.x = NULL, by.y = NULL,
  method = "hash")

Arguments

x,y
H2OFrame objects
all.x
If all.x is TRUE, all rows of x are included in the result, even if there is no matching row in y; all.y does the same for the rows of y.
all.y
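If all.y is TRUE, all rows of y are included in the result, even if there is no matching row in x. See all.x.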
by.x
x columns used for merging.
by.y
y columns used for merging.
method
Merge method: "auto", "radix", or "hash" (the default).

Details

For h2o.merge to work in multinode clusters, one of the datasets must be small enough to exist on every node. Currently, this function only supports all.x = TRUE. All other combinations of all.x and all.y will fail.
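
As a rough illustration of what all.x = TRUE means, the sketch below uses base R's merge on ordinary data frames rather than h2o.merge on H2OFrames. The data and the names x, y, key, x_val and y_val are hypothetical, and the output of h2o.merge (row order, column types) may differ from what base R produces.

```r
# Base R analogue only: plain data.frames, not H2OFrames.
x <- data.frame(key = c("a", "b", "c"), x_val = 1:3,
                stringsAsFactors = FALSE)
y <- data.frame(key = c("a", "b", "d"), y_val = c(10, 20, 40),
                stringsAsFactors = FALSE)

# Columns shared by name; these are the columns the merge joins on.
shared <- intersect(names(x), names(y))

# all.x = TRUE keeps every row of x; "c" has no match in y,
# so its y_val is NA. Row "d" of y is dropped.
left_join <- merge(x, y, by = shared, all.x = TRUE)

# With the default all.x = FALSE, only rows with a match in y remain.
inner_join <- merge(x, y, by = shared)

print(left_join)
print(inner_join)
```

Checking intersect(names(x), names(y)) before merging is a cheap way to confirm that the two frames share the key columns you expect, since the merge is driven by shared column names.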

Examples

library(h2o)
h2o.init()  # start, or connect to, a local H2O cluster
left <- data.frame(fruit = c('apple', 'orange', 'banana', 'lemon', 'strawberry', 'blueberry'),
                   color = c('red', 'orange', 'yellow', 'yellow', 'red', 'blue'))
right <- data.frame(fruit = c('apple', 'orange', 'banana', 'lemon', 'strawberry', 'watermelon'),
                    citrus = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE))
# Upload both data frames to the cluster as H2OFrames.
l.hex <- as.h2o(left)
r.hex <- as.h2o(right)
# Merge on the shared column (fruit), keeping every row of l.hex.
left.hex <- h2o.merge(l.hex, r.hex, all.x = TRUE)
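
To get a feel for the shape of the result without a running cluster, the same join can be approximated with base R's merge. This is only an analogue under the assumption that h2o.merge follows the usual left-join semantics described for all.x; the name expected is hypothetical, and the H2O result may order rows or type columns differently.

```r
# Same data as the example, as plain data.frames (no H2O cluster needed).
left <- data.frame(
  fruit = c('apple', 'orange', 'banana', 'lemon', 'strawberry', 'blueberry'),
  color = c('red', 'orange', 'yellow', 'yellow', 'red', 'blue'),
  stringsAsFactors = FALSE
)
right <- data.frame(
  fruit  = c('apple', 'orange', 'banana', 'lemon', 'strawberry', 'watermelon'),
  citrus = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Left join on the shared column: every fruit in `left` is kept.
expected <- merge(left, right, by = "fruit", all.x = TRUE)

# 'blueberry' has no row in `right`, so its citrus value is NA;
# 'watermelon' appears only in `right` and is therefore absent.
print(expected)
```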
