J: Creates a join data.table

Description

Creates a data.table to be passed as the i argument of a [.data.table join.

Usage

# DT[J(...)]                           # J() only for use inside DT[...].
SJ(...)                                # DT[SJ(...)]
CJ(..., sorted = TRUE, unique = FALSE)  # DT[CJ(...)]

Arguments

...

Each argument is a vector. Generally the vectors are all the same length; if they are not, the usual silent repetition (recycling) is applied.

sorted

logical. Should the input be sorted (ascending order)? If FALSE, the input order is retained.

unique

logical. When TRUE, only the unique values of each vector are used (computed automatically).

Value

J : the same result as calling list. J is a direct alias for list, but it results in clearer, more readable code.

SJ : (S)orted (J)oin. The same value as J(), but additionally setkey() is called on all the columns, in the order they were passed to SJ. This is done for efficiency: it invokes a binary merge rather than a repeated full binary search for each row of i.

CJ : (C)ross (J)oin. A data.table is formed from the cross product of the vectors. For example, given 10 ids and 100 dates, CJ returns a 1000-row table containing all the dates for all the ids. It also accepts the sorted argument, which defaults to TRUE for backwards compatibility; sorted = FALSE retains the input order.
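
The behaviour of CJ and SJ described above can be sketched as follows. This is a minimal illustration, assuming the data.table package is installed; the variable names (ids, dates, grid, unsorted, sj) and the column names are hypothetical and chosen only for this example.

```r
library(data.table)

# Hypothetical inputs: 10 ids and 100 consecutive dates.
ids   <- 1:10
dates <- as.Date("2024-01-01") + 0:99

# CJ: every id is paired with every date.
grid <- CJ(id = ids, date = dates)
nrow(grid)   # number of rows is length(ids) * length(dates)
key(grid)    # with the default sorted = TRUE the result is keyed

# sorted = FALSE keeps the input order and leaves the result unkeyed.
unsorted <- CJ(a = c(3, 1, 2), b = c("b", "a"), sorted = FALSE)
unsorted$a   # the first column varies slowest
key(unsorted)

# SJ: same columns as supplied, with setkey() applied to all of them.
sj <- SJ(a = c(3, 1, 2), b = c("c", "a", "b"))
key(sj)
sj$a         # rows are now in key order
```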

Details

SJ and CJ are convenience functions for creating a data.table in the context of a data.table 'query' on x.

x[data.table(id)] is the same as x[J(id)] but the latter is more readable. Identical alternatives are x[list(id)] and x[.(id)].
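
The four spellings can be compared side by side. This is a minimal sketch, assuming the data.table package is installed; the table x, its columns and the lookup vector wanted are hypothetical. The lookup vector is deliberately given a name that is not a column of x, because names inside i are looked up in x first.

```r
library(data.table)

# Hypothetical keyed table.
x <- data.table(id = c("c", "a", "b"), value = c(30, 10, 20))
setkey(x, id)

# Lookup values; the name differs from every column name in x.
wanted <- c("b", "c")

r1 <- x[data.table(wanted)]
r2 <- x[J(wanted)]
r3 <- x[list(wanted)]
r4 <- x[.(wanted)]

# All four forms select the same rows of x, in the order of the lookup values.
r1$value
r2$value
r3$value
r4$value
```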

x must have a key when a join table is passed in as the i. See [.data.table.
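
As an illustration of the key requirement, the sketch below keys a table on two columns before joining. It assumes the data.table package is installed; the table x and its columns (id, day, value) are hypothetical.

```r
library(data.table)

# Hypothetical table with no key yet.
x <- data.table(id    = c("b", "a", "a"),
                day   = c(1L, 2L, 1L),
                value = c(30, 20, 10))
keyed_before <- haskey(x)   # no key has been set at this point

# setkey() sorts x by (id, day) and marks it as keyed.
setkey(x, id, day)
keyed_after <- haskey(x)
key(x)

# The vectors passed to J() are matched against the key columns in order.
res <- x[J("a", 2L)]
res$value
```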

Examples

# NOT RUN {
library(data.table)
DT = data.table(A=5:1,B=letters[5:1])
setkey(DT,B)    # re-orders table and marks it sorted.
DT[J("b")]      # returns the 2nd row
DT[.("b")]      # same. Style of package plyr.
DT[list("b")]   # same

# CJ usage examples
CJ(c(5,NA,1), c(1,3,2)) # sorted and keyed data.table
do.call(CJ, list(c(5,NA,1), c(1,3,2))) # same as above
CJ(c(5,NA,1), c(1,3,2), sorted=FALSE) # same order as input, unkeyed
# use of the 'unique=' argument
x = c(1,1,2)
y = c(4,6,4)
CJ(x, y) # output columns are automatically named 'x' and 'y'
CJ(x, y, unique=TRUE) # unique(x) and unique(y) are computed automatically

# }