Write a Spark DataFrame to RDS files. Each partition of the DataFrame will be exported to a separate RDS file so that all partitions can be processed in parallel.
Usage:

    spark_write_rds(x, dest_uri)
Value:

A tibble containing the partition ID and the RDS file location for each partition of the input Spark DataFrame.
Arguments:

x: A Spark DataFrame to be exported.
dest_uri: Can be a URI template containing `{partitionId}` (e.g., "hdfs://my_data_part_{partitionId}.rds"), where `{partitionId}` will be substituted with the ID of each partition using `glue`, or a list of URIs to be assigned to the RDS output of all partitions (e.g., "hdfs://my_data_part_0.rds", "hdfs://my_data_part_1.rds", and so on).
If working with a Spark instance running locally, then all URIs should be in "file://<local file path>" form. Otherwise the scheme of the URI should reflect the underlying file system the Spark instance is working with (e.g., "hdfs://"). If the resulting list of URIs does not contain unique values, then it will be post-processed with `make.unique()` to ensure uniqueness.
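
Examples:

A minimal end-to-end sketch, assuming a local Spark connection; the `mtcars` dataset, the 3-partition split, and the `tempdir()` output paths are illustrative, and `collect_from_rds()` is sparklyr's reader for RDS files produced by `spark_write_rds()`:

```r
library(sparklyr)

# Connect to a locally running Spark instance; local connections
# require "file://" destination URIs
sc <- spark_connect(master = "local")

# Copy a sample R dataframe into Spark, split into 3 partitions
# (the dataset and partition count are illustrative)
mtcars_tbl <- copy_to(sc, mtcars, repartition = 3L, overwrite = TRUE)

# Export each partition to its own RDS file: `{partitionId}` is
# substituted with 0, 1, 2 via glue
dest_template <- paste0("file://", tempdir(), "/mtcars_part_{partitionId}.rds")
partition_info <- spark_write_rds(mtcars_tbl, dest_uri = dest_template)

# The returned tibble lists the partition ID and RDS file location
# of every partition
print(partition_info)

# Alternatively, assign one URI per partition explicitly; if the
# supplied URIs are not unique, they are post-processed with
# make.unique() to ensure uniqueness
dest_list <- paste0("file://", tempdir(), "/mtcars_alt_part_", 0:2, ".rds")
spark_write_rds(mtcars_tbl, dest_uri = dest_list)

# Each exported file can then be deserialized into an R dataframe
# independently, e.g., from parallel worker processes
part_0 <- collect_from_rds(file.path(tempdir(), "mtcars_part_0.rds"))

spark_disconnect(sc)
```

Because every partition lands in its own RDS file, the exported files can be consumed by separate R processes without routing the data back through the Spark driver, which is what enables the parallel downstream processing described above.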