Access the commonly-used Spark objects associated with a Spark instance. These objects provide access to different facets of the Spark API.
spark_context(sc)
java_context(sc)
hive_context(sc)
spark_session(sc)
sc: A spark_connection.
The main entry point for Spark functionality. The Spark Context
represents the connection to a Spark cluster, and can be used to create
RDDs, accumulators and broadcast variables on that cluster.
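As a minimal sketch of retrieving the Spark Context and querying it, the following assumes sparklyr is installed and a local Spark installation is available; the `master = "local"` connection is an assumption made for illustration. The method names come from the Scala `SparkContext` class.

```r
library(sparklyr)

# Assumption: a local Spark installation is available for this connection.
sc <- spark_connect(master = "local")

# Retrieve the Spark Context associated with the connection.
ctx <- spark_context(sc)

# Call zero-argument methods on the underlying Scala SparkContext.
invoke(ctx, "version")             # Spark version string
invoke(ctx, "appName")             # application name
invoke(ctx, "defaultParallelism")  # default number of partitions

spark_disconnect(sc)
```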
The Java Context, returned by java_context, is a Java-friendly version of the aforementioned Spark Context.
The Hive Context, returned by hive_context, is an instance of the Spark SQL
execution engine that integrates with data stored in Hive. Configuration
for Hive is read from hive-site.xml on the classpath.
Starting with Spark 2.0.0, the Hive Context class has been
deprecated; it is superseded by the Spark Session class, and
hive_context will return a Spark Session object instead.
Note that both classes share a SQL interface, and therefore one can invoke
SQL through these objects.
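The shared SQL interface can be sketched as follows. This assumes a local Spark installation (the `master = "local"` connection is illustrative) and uses a trivial query that needs no existing tables. The `sql` method exists on both the Hive Context and Spark Session classes, so the same call should work whichever object hive_context returns.

```r
library(sparklyr)

# Assumption: a local Spark installation is available for this connection.
sc <- spark_connect(master = "local")

# On Spark >= 2.0.0 this is a Spark Session; on older versions, a Hive Context.
hc <- hive_context(sc)

# Run a SQL statement; the result is a reference to a Spark DataFrame.
df <- invoke(hc, "sql", "SELECT 1 AS x")

# Call a method on the returned DataFrame to count its rows.
invoke(df, "count")

spark_disconnect(sc)
```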
Available since Spark 2.0.0, the Spark Session unifies the Spark Context and Hive Context classes into a single interface. Its use is recommended over the older APIs for code targeting Spark 2.0.0 and above.
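A brief sketch of working with the Spark Session directly, assuming Spark 2.0.0 or later and a local installation (the `master = "local"` connection is an assumption). The `version`, `catalog` and `currentDatabase` methods belong to the Scala `SparkSession` and `Catalog` classes.

```r
library(sparklyr)

# Assumption: a local Spark (>= 2.0.0) installation is available.
sc <- spark_connect(master = "local")

# Retrieve the Spark Session associated with the connection.
session <- spark_session(sc)

# Spark version reported by the session.
invoke(session, "version")

# Look up the current database through the session's catalog.
catalog <- invoke(session, "catalog")
invoke(catalog, "currentDatabase")

spark_disconnect(sc)
```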
The Scala API documentation is useful for discovering what methods are
available for each of these objects. Use invoke to call methods on these
objects.
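Calls to invoke can be chained, with method arguments passed after the method name. The sketch below assumes a local Spark installation (the `master = "local"` connection is illustrative) and reads a configuration value through the `getConf` and `get` methods of the Scala `SparkContext` and `SparkConf` classes.

```r
library(sparklyr)

# Assumption: a local Spark installation is available for this connection.
sc <- spark_connect(master = "local")

# Chain two method calls: fetch the SparkConf, then read one setting from it.
spark_context(sc) %>%
  invoke("getConf") %>%
  invoke("get", "spark.master")

spark_disconnect(sc)
```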