spark_read_source

The name to assign to the newly generated table.

name

A data source capable of reading data.

source

A list of strings with additional options. See <http://spark.apache.org/docs/latest/sql-programming-guide.html#configuration>.

options

The number of partitions used to distribute the
generated table. Use 0 (the default) to avoid partitioning.

repartition

Boolean; should the data be loaded eagerly into memory? (That
is, should the table be cached?)

memory

Boolean; overwrite the table with the given name if it
already exists?

overwrite

A vector of column names or a named vector of column types.

columns
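The two accepted forms of `columns` can be sketched as plain R vectors. The column names below are hypothetical, and the type strings ("integer", "character", "double") are assumed to follow the R-style type names used by sparklyr's other readers; this page does not list the accepted values.

```r
# Form 1: an unnamed character vector supplies column names only.
column_names <- c("year", "carrier", "dep_delay")

# Form 2: a named character vector maps each column name to a type.
# The type strings are an assumption, not taken from this page.
column_types <- c(
  year      = "integer",
  carrier   = "character",
  dep_delay = "double"
)

# Either vector would be passed as the `columns` argument, for example:
# spark_read_source(sc, name = "flights", source = "csv", columns = column_types)
```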

Optional arguments; currently unused.

...

Read from a generic source into a Spark DataFrame.
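A minimal sketch of a call, under several assumptions: the connection argument `sc` (as returned by `spark_connect()`) is not listed in the arguments on this page and is assumed here; the table name, the file path, and the `path` and `header` option keys are hypothetical and depend on the chosen source; and `read_flights` is a wrapper invented for illustration.

```r
# Reader options are given as a list of strings, as described above.
# The keys ("path", "header") are assumptions about what the "csv"
# source understands; they are not defined by this page.
csv_options <- list(
  path   = "data/flights.csv",  # hypothetical file location
  header = "true"
)

# Hypothetical wrapper: reads the file through the generic "csv" source
# and registers the result as a Spark table named "flights".
read_flights <- function(sc) {
  sparklyr::spark_read_source(
    sc,
    name        = "flights",
    source      = "csv",
    options     = csv_options,
    repartition = 0,     # default: do not repartition
    memory      = TRUE,  # cache the table eagerly
    overwrite   = TRUE   # replace an existing table of the same name
  )
}

# Usage (requires sparklyr and a working Spark installation):
# sc <- sparklyr::spark_connect(master = "local")
# flights_tbl <- read_flights(sc)
# sparklyr::spark_disconnect(sc)
```

Setting `memory = FALSE` in such a call would skip the eager caching step, which may be preferable for tables that are larger than the available memory.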

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Javier Luraschi

sparklyr

R Interface to Apache Spark

Kevin Kuo

Kevin Ushey

JJ Allaire

RStudio

The Apache Software Foundation

