spark-api

spark_context

java_context

hive_context

spark_session

Access the commonly used Spark objects associated with a Spark instance.
These objects provide access to different facets of the Spark API.

R interface to Apache Spark, a fast and general
engine for big data processing (see <https://spark.apache.org/>). This
package supports connecting to local and remote Apache Spark clusters,
provides a 'dplyr' compatible back-end, and provides an interface to
Spark's built-in machine learning algorithms.

Edgar Ruiz

sparklyr

R Interface to Apache Spark

Javier Luraschi

Kevin Kuo

Kevin Ushey

JJ Allaire

Samuel Macedo

Hossein Falaki

Lu Wang

Andy Zhang

Yitao Li

Jozef Hajnala

Maciej Szymkiewicz

Wil Davis

 RStudio

 The Apache Software Foundation

spark-api function

<dl><dt>sc</dt>
<dd>A <code>spark_connection</code>.</dd></dl>

Arguments

The main entry point for Spark functionality. The Spark Context
represents the connection to a Spark cluster, and can be used to create
<code>RDD</code>s, accumulators and broadcast variables on that cluster.
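
As a minimal sketch of how these accessors might be used, the snippet below retrieves the Spark Context and the Java Spark Context from a connection and calls a JVM method on one of them with <code>invoke()</code>. It assumes a local Spark installation is available (for example via <code>spark_install()</code>); the variable names are illustrative only.

```r
library(sparklyr)

# Assumption: Spark is installed locally and can be started in local mode.
sc <- spark_connect(master = "local")

# Retrieve the underlying JVM objects as sparklyr jobj references.
ctx  <- spark_context(sc)  # the Spark Context
jctx <- java_context(sc)   # the Java-friendly wrapper around it

# Call methods on the JVM object by name.
spark_version_string <- invoke(ctx, "version")
app_name             <- invoke(ctx, "appName")

spark_disconnect(sc)
```

The returned objects are references to JVM objects, so they are manipulated through <code>invoke()</code> rather than ordinary R function calls.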

Spark Context

A Java-friendly version of the aforementioned Spark Context.

Java Spark Context

An instance of the Spark SQL execution engine that integrates with data
stored in Hive. Configuration for Hive is read from <code>hive-site.xml</code> on
the classpath.
As of Spark 2.0.0, the Hive Context class is deprecated: it is
superseded by the Spark Session class, and
<code>hive_context</code> returns a Spark Session object instead.
Both classes share a SQL interface, so SQL can be invoked through
either object.
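
The shared SQL interface can be illustrated with a short sketch. It assumes a local Spark installation and uses a trivial query chosen only for illustration; on Spark 2.0.0 and above, <code>hive_context(sc)</code> is expected to hand back a Spark Session, and the same <code>sql</code> method is called either way.

```r
library(sparklyr)

# Assumption: Spark is installed locally and can be started in local mode.
sc <- spark_connect(master = "local")

# hive_context() returns a Hive Context or, on Spark >= 2.0.0, a Spark Session.
hc <- hive_context(sc)

# Both classes expose a sql() method that returns a Spark DataFrame (as a jobj).
df <- invoke(hc, "sql", "SELECT 1 AS x")

# Count the rows on the JVM side.
n_rows <- invoke(df, "count")

# Optionally wrap the jobj so it can be used as a tbl_spark.
tbl <- sdf_register(df)

spark_disconnect(sc)
```

Because the method name is the same on both classes, this code does not need to branch on the Spark version.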

Hive Context

Available since Spark 2.0.0, the Spark Session unifies the
Spark Context and Hive Context classes into a single
interface. Its use is recommended over the older APIs for code
targeting Spark 2.0.0 and above.

Spark Session

Access the Spark API — spark-api

