Create a Spark Configuration for Livy
livy_config(
config = spark_config(),
username = NULL,
password = NULL,
negotiate = FALSE,
custom_headers = list(`X-Requested-By` = "sparklyr"),
proxy = NULL,
curl_opts = NULL,
...
)
livy_config() returns a named list with configuration data.

Its arguments are:

config: Optional base configuration.
username: The username to use in the Authorization header.
password: The password to use in the Authorization header.
negotiate: Whether to use the gssnegotiate method or not.
custom_headers: List of custom headers to append to HTTP requests. Defaults to list("X-Requested-By" = "sparklyr").
proxy: Either NULL or a proxy specified by httr::use_proxy(). Defaults to NULL.
curl_opts: List of CURL options (e.g., verbose, connecttimeout, dns_cache_timeout; see httr::httr_options() for a list of valid options). Note that these options apply to libcurl only and are separate from HTTP headers or Livy session parameters.
...: Additional Livy session parameters.
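
For instance, a connection using basic authentication, a proxy, and a couple of curl options might look like the following minimal sketch; the Livy URL, credentials, and proxy host are illustrative placeholders:

library(sparklyr)

# Build a Livy configuration with basic authentication; the proxy and
# curl options are optional and shown only for illustration.
config <- livy_config(
  config = spark_config(),
  username = "my_user",
  password = "my_password",
  proxy = httr::use_proxy("proxy.example.com", port = 8080),
  curl_opts = list(verbose = TRUE, connecttimeout = 30)
)

# Connect to the Livy server with the resulting configuration.
sc <- spark_connect(
  master = "http://livy-server:8998",
  method = "livy",
  config = config
)

These HTTP settings apply to the requests sparklyr makes against the Livy REST endpoint.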
Extends a Spark spark_config() configuration with settings for Livy. For instance, username and password define the basic authentication settings for a Livy session. The default value of custom_headers is set to list("X-Requested-By" = "sparklyr") in order to facilitate connections to Livy servers with CSRF protection enabled.
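
If a server expects extra headers beyond the CSRF default, they can be supplied alongside it; in this hypothetical sketch the additional header name and value are placeholders:

config <- livy_config(
  custom_headers = list(
    `X-Requested-By` = "sparklyr",  # keep the CSRF default
    `X-Custom-Header` = "my-value"  # hypothetical extra header
  )
)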
Additional parameters for Livy sessions, passed through ..., are listed below; a short example using several of them follows the list.

proxy_user: User to impersonate when starting the session.
jars: JARs to be used in this session.
py_files: Python files to be used in this session.
files: Files to be used in this session.
driver_memory: Amount of memory to use for the driver process.
driver_cores: Number of cores to use for the driver process.
executor_memory: Amount of memory to use per executor process.
executor_cores: Number of cores to use for each executor.
num_executors: Number of executors to launch for this session.
archives: Archives to be used in this session.
queue: The name of the YARN queue to which the session is submitted.
name: The name of this session.
heartbeat_timeout: Timeout in seconds after which the session is orphaned.
conf: Spark configuration properties (Map of key=value).
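
For instance, a session request that impersonates a user and sizes its resources explicitly could be sketched as follows; all values are illustrative:

# Livy session parameters are passed through the ... argument of livy_config().
config <- livy_config(
  proxy_user = "etl_user",     # illustrative impersonated user
  driver_memory = "2g",
  driver_cores = 2,
  executor_memory = "4g",
  executor_cores = 4,
  num_executors = 3,
  name = "sparklyr-session"    # illustrative session name
)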
Note that queue is supported only by Livy version 0.4.0 or newer. If you are using an older version, specify the queue via config instead by setting the spark.yarn.queue property on a spark_config() object, as sketched below.
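
A sketch of both approaches; spark_config() returns a named list, so the property can be set on it directly:

# Livy 0.4.0 or newer: pass `queue` as a session parameter.
config_new <- livy_config(queue = "my_queue")

# Older Livy versions: set the YARN queue on the Spark configuration instead.
config_old <- spark_config()
config_old[["spark.yarn.queue"]] <- "my_queue"
config_old <- livy_config(config = config_old)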