data.table (version 1.11.4)

setops: Set operations for data tables

Description

Set operations for data.tables, analogous to base R's union, intersect, setdiff and setequal, but applied to whole rows. The additional all argument controls whether and how duplicate rows are returned. Columns of class bit64::integer64 are also supported.

Unlike SQL, these data.table functions retain the order of rows in the result.
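
As a rough illustration of the row-wise behaviour described above, the sketch below applies the three table-returning functions to two small multi-column tables. The table contents and the column names id and g are hypothetical and chosen only for this example; both tables must share the same column names and types.

```r
library(data.table)

# Hypothetical two-column tables; a "row" is the (id, g) pair
x <- data.table(id = c(1, 2, 2), g = c("a", "b", "b"))
y <- data.table(id = c(2, 3),    g = c("b", "c"))

u <- funion(x, y)      # distinct rows found in x or y
i <- fintersect(x, y)  # distinct rows found in both x and y
d <- fsetdiff(x, y)    # distinct rows of x that are absent from y

print(u)
print(i)
print(d)
```

With the default all = FALSE, each result holds distinct rows only, so the duplicated (2, "b") row of x appears at most once.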

Usage

fintersect(x, y, all = FALSE)
fsetdiff(x, y, all = FALSE)
funion(x, y, all = FALSE)
fsetequal(x, y)

Arguments

x,y

data.tables.

all

Logical. The default, FALSE, removes duplicate rows from the result. When TRUE, if there are xn copies of a particular row in x and yn copies of the same row in y, then:

  • fintersect will return min(xn, yn) copies of that row.

  • fsetdiff will return max(0, xn-yn) copies of that row.

  • funion will return xn+yn copies of that row.
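
The counting rules above can be checked on a small example. This is a minimal sketch with made-up data (the column name v is an assumption): the value 2 occurs three times in x and once in y, and the value 4 occurs twice in x and three times in y.

```r
library(data.table)

# Hypothetical toy data with duplicated rows
x <- data.table(v = c(1, 2, 2, 2, 3, 4, 4))
y <- data.table(v = c(2, 3, 4, 4, 4, 5))

inter_all <- fintersect(x, y, all = TRUE)  # min(xn, yn) copies of each row
diff_all  <- fsetdiff(x, y, all = TRUE)    # max(0, xn - yn) copies of each row
union_all <- funion(x, y, all = TRUE)      # xn + yn copies of each row

# Count how many copies of each row survive in each result
print(inter_all[, .N, by = v])
print(diff_all[, .N, by = v])
print(union_all[, .N, by = v])
```

For the value 2, for instance, the intersection keeps min(3, 1) = 1 copy, the difference keeps max(0, 3 - 1) = 2 copies, and the union keeps 3 + 1 = 4 copies.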

Value

A data.table for fintersect, funion and fsetdiff. A logical TRUE or FALSE for fsetequal.
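
A short sketch of the return types, using hypothetical tables a, b and cc that contain no duplicate rows. The first two hold the same rows in a different order.

```r
library(data.table)

# Hypothetical tables: a and b contain the same rows in a different order
a  <- data.table(v = c(1, 2, 3))
b  <- data.table(v = c(3, 1, 2))
cc <- data.table(v = c(1, 2))

res_union  <- funion(a, cc)    # a data.table
same_rows  <- fsetequal(a, b)  # a single logical; row order is not compared
other_rows <- fsetequal(a, cc) # a single logical

print(is.data.table(res_union))
print(same_rows)
print(other_rows)
```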

Details

Columns of type complex and list are not supported, except by funion.

References

https://db.apache.org/derby/papers/Intersect-design.html

See Also

data.table, rbindlist, all.equal.data.table, unique, duplicated, uniqueN, anyDuplicated

Examples

# NOT RUN {
library(data.table)         # provides data.table() and the set functions
x = data.table(c(1,2,2,2,3,4,4))
y = data.table(c(2,3,4,4,4,5))
fintersect(x, y)            # intersect
fintersect(x, y, all=TRUE)  # intersect all
fsetdiff(x, y)              # except
fsetdiff(x, y, all=TRUE)    # except all
funion(x, y)                # union
funion(x, y, all=TRUE)      # union all
fsetequal(x, y)             # setequal
# }