spark_config_kubernetes: Kubernetes Configuration

Description

Convenience function to initialize a Kubernetes configuration instead of spark_config(), exposes common properties to set in Kubernetes clusters.

Usage

spark_config_kubernetes(
  master,
  version = "3.2.3",
  image = "spark:sparklyr",
  driver = random_string("sparklyr-"),
  account = "spark",
  jars = "local:///opt/sparklyr",
  forward = TRUE,
  executors = NULL,
  conf = NULL,
  timeout = 120,
  ports = c(8880, 8881, 4040),
  fix_config = identical(.Platform$OS.type, "windows"),
  ...
)

Arguments

master: Kubernetes url to connect to, found by running kubectl cluster-info.
version: The version of Spark being used.
image: Container image to use to launch Spark and sparklyr. Also known as spark.kubernetes.container.image.
driver: Name of the driver pod. If not set, the driver pod name is set to "sparklyr" suffixed by id to avoid name conflicts. Also known as spark.kubernetes.driver.pod.name.
account: Service account that is used when running the driver pod. The driver pod uses this service account when requesting executor pods from the API server. Also known as spark.kubernetes.authenticate.driver.serviceAccountName.
jars: Path to the sparklyr jars; either, a local path inside the container image with the sparklyr jars copied when the image was created or, a path accesible by the container where the sparklyr jars were copied. You can find a path to the sparklyr jars by running system.file("java/", package = "sparklyr").
forward: Should ports used in sparklyr be forwarded automatically through Kubernetes? Default to TRUE which runs kubectl port-forward and pkill kubectl on disconnection.
executors: Number of executors to request while connecting.
conf: A named list of additional entries to add to sparklyr.shell.conf.
timeout: Total seconds to wait before giving up on connection.
ports: Ports to forward using kubectl.
fix_config: Should the spark-defaults.conf get fixed? TRUE for Windows.
...: Additional parameters, currently not in use.