sdf_with_sequential_id

Add a sequential ID column to a Spark DataFrame. The IDs are produced with
Spark's <code>zipWithIndex</code> function. Unlike
<code>sdf_with_unique_id</code>, the generated IDs are independent of
partitioning.

R interface to Apache Spark, a fast and general
engine for big data processing, see <https://spark.apache.org/>. This
package supports connecting to local and remote Apache Spark clusters,
provides a 'dplyr' compatible back-end, and provides an interface to
Spark's built-in machine learning algorithms.

Edgar Ruiz

sparklyr

R Interface to Apache Spark

Javier Luraschi

Kevin Kuo

Kevin Ushey

JJ Allaire

Samuel Macedo

Hossein Falaki

Lu Wang

Andy Zhang

Yitao Li

Jozef Hajnala

Maciej Szymkiewicz

Wil Davis

 RStudio

 The Apache Software Foundation


Arguments


<dl>
<dt>x</dt>
<dd>A <code>spark_connection</code>, <code>ml_pipeline</code>, or a <code>tbl_spark</code>.</dd>
<dt>id</dt>
<dd>The name of the column to host the generated IDs.</dd>
<dt>from</dt>
<dd>The starting value of the ID column.</dd>
</dl>
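
The arguments above might be used as in the following minimal sketch. It assumes a local Spark installation is available to sparklyr and uses the built-in <code>mtcars</code> data purely for illustration; the table name <code>cars_tbl</code> and the column name <code>row_id</code> are hypothetical choices, not part of the package.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Copy a small built-in data frame to Spark; the table name is illustrative.
cars_tbl <- sdf_copy_to(sc, mtcars, name = "cars_tbl", overwrite = TRUE)

# Add a sequential ID column named "row_id", starting at 1.
cars_with_id <- sdf_with_sequential_id(cars_tbl, id = "row_id", from = 1L)

# Collect the generated IDs into a local vector to inspect them.
ids <- cars_with_id %>% pull(row_id)

spark_disconnect(sc)
```

Since <code>from</code> sets the first value, passing <code>from = 0L</code> should yield zero-based IDs instead.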
