monotonically_increasing_id: monotonically_increasing_id

Description

Return a column that generates monotonically increasing 64-bit integers.

Usage

monotonically_increasing_id(x = "missing")
# S4 method for missing
monotonically_increasing_id()

Arguments

x: empty. Should be used with no arguments.

Details

The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. This assumes that the SparkDataFrame has fewer than 1 billion partitions and that each partition has fewer than 8 billion records.

As an example, consider a SparkDataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
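
The arithmetic behind that example can be sketched in plain R, without a Spark session. This is only an illustration of the bit layout described above, not the SparkR implementation: `sketch_id` is a hypothetical helper, and it uses doubles (exact for integers below 2^53) in place of true 64-bit integers, so it is only meaningful for small partition IDs.

```r
# Hypothetical helper (not part of SparkR): places the partition ID above
# the lower 33 bits and the record number inside them, using doubles.
sketch_id <- function(partition_id, record_number) {
  partition_id * 2^33 + record_number
}

# Two partitions (0 and 1), each holding three records (0, 1, 2).
partition_id  <- rep(0:1, each = 3)
record_number <- rep(0:2, times = 2)

ids <- sketch_id(partition_id, record_number)

# Print without scientific notation so the large values are readable.
print(sprintf("%.0f", ids))
```

The first three values belong to partition 0 and the last three to partition 1, which is why the sequence jumps by 2^33 between the third and fourth IDs.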

This is equivalent to the MONOTONICALLY_INCREASING_ID function in SQL.

Examples

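# NOT RUN {
# Requires an active Spark session; `faithful` is a built-in R dataset
sparkR.session()
df <- createDataFrame(faithful)
select(df, monotonically_increasing_id())
# }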
