As base::merge() does for data.frames, this function takes two datasets,
matches rows based on a specified key variable, and adds columns from one to
the other.
joinDatasets(
x,
y,
by = intersect(names(x), names(y)),
by.x = by,
by.y = by,
all = FALSE,
all.x = TRUE,
all.y = FALSE,
copy = TRUE
)

extendDataset(
x,
y,
by = intersect(names(x), names(y)),
by.x = by,
by.y = by,
all = FALSE,
all.x = TRUE,
all.y = FALSE,
...
)
# S3 method for CrunchDataset
merge(
x,
y,
by = intersect(names(x), names(y)),
by.x = by,
by.y = by,
all = FALSE,
all.x = TRUE,
all.y = FALSE,
...
)
x extended by the columns of y, matched on the "by" variables.
x: CrunchDataset to add data to
y: CrunchDataset to copy data from. May be filtered by rows and/or columns.
by: character, optional shortcut for specifying by.x and by.y by alias if the key variables have the same alias in both datasets.
by.x: CrunchVariable in x on which to join, or the alias (following crunch.namekey.dataset) of a variable. Must be type numeric or text and have all unique, non-missing values.
by.y: CrunchVariable in y on which to join, or the alias (following crunch.namekey.dataset) of a variable. Must be type numeric or text and have all unique, non-missing values.
all: logical. Should all rows in x and y be kept, i.e. a "full outer" join? Only FALSE is currently supported.
all.x: logical. Should all rows in x be kept, i.e. a "left outer" join? Only TRUE is currently supported.
all.y: logical. Should all rows in y be kept, i.e. a "right outer" join? Only FALSE is currently supported.
copy: logical. Make a virtual or materialized join. Default is TRUE, which means materialized. Virtual joins are not currently implemented, so the default is the only valid value.
...: additional arguments, ignored
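The only supported combination of all, all.x, and all.y amounts to a left outer join on a unique key. As a minimal sketch of those semantics, the example below uses base::merge() on two small data.frames rather than CrunchDatasets; the data and column names are made up for illustration, and the uniqueness checks mirror the requirement stated for by.x and by.y.

```r
# Illustrative data only: two data.frames standing in for x and y.
x <- data.frame(id = c(1, 2, 3), age = c(34, 51, 27))
y <- data.frame(id = c(3, 1, 4), region = c("West", "East", "North"))

# Join keys must be unique and non-missing on both sides.
stopifnot(!anyNA(x$id), anyDuplicated(x$id) == 0)
stopifnot(!anyNA(y$id), anyDuplicated(y$id) == 0)

# Left outer join: keep every row of x, drop rows of y with no match in x.
joined <- merge(x, y, by = "id", all.x = TRUE, all.y = FALSE)
print(joined)
```

Every row of x is kept: the row with id 2 has no match in y, so its region is NA, and the row of y with id 4 is dropped because all.y is FALSE.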
Since joining two datasets can sometimes produce unexpected results if the
keys differ between the two datasets, you may want to follow the
fork-edit-merge workflow for this operation. To do this, fork the dataset
with forkDataset(), join the new data to the fork, ensure that
the resulting dataset is correct, and merge it back to the original dataset
with mergeFork(). For more, see
vignette("fork-and-merge", package = "crunch").
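The fork-edit-merge workflow described above might be wrapped up roughly as follows. This is a sketch, not a definitive recipe: join_via_fork is a hypothetical helper name, the row-count check is only one assumed way to verify the result, and running it requires the crunch package, an authenticated session, and two existing CrunchDatasets.

```r
# Hypothetical helper: join new_data onto ds by way of a fork.
# ds and new_data are assumed to be CrunchDatasets; key is the shared alias.
join_via_fork <- function(ds, new_data, key) {
    # Fork first, so the original dataset is untouched until the join is verified.
    fork <- crunch::forkDataset(ds)

    # Join the new columns onto the fork, matching rows on the key alias.
    fork <- crunch::joinDatasets(fork, new_data, by = key)

    # Assumed sanity check: a left join should not change the number of rows.
    stopifnot(nrow(fork) == nrow(ds))

    # Merge the verified fork back into the original dataset.
    crunch::mergeFork(ds, fork)
}
```

In practice you would likely inspect the joined variables on the fork interactively before calling mergeFork(), rather than relying on a single automated check.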