sparkbq (version 0.1.1)

bigquery_defaults: Google BigQuery Default Settings

Description

Sets default values for several Google BigQuery related settings.

Usage

bigquery_defaults(billingProjectId, gcsBucket, datasetLocation = "US",
  serviceAccountKeyFile = NULL, type = "direct")

Arguments

billingProjectId

Default Google Cloud Platform project ID for billing purposes. This is the project on whose behalf to perform BigQuery operations.

gcsBucket

Google Cloud Storage (GCS) bucket to use for storing temporary files. Temporary files are used when importing through BigQuery load jobs and exporting through BigQuery extraction jobs (i.e. when using file-based formats such as Parquet, Avro, ORC, ...). The service account specified in serviceAccountKeyFile needs to be given appropriate rights. This should be the name of an existing storage bucket.

datasetLocation

Geographic location where newly created datasets should reside: "EU" or "US". Defaults to "US".

serviceAccountKeyFile

Google Cloud service account key file to use for authentication with Google Cloud services. The use of service accounts is highly recommended. Specifically, the service account will be used to interact with BigQuery and Google Cloud Storage (GCS). If not specified, Google Application Default Credentials (ADC) are used.

type

Default BigQuery import/export type to use. Options are "direct", "parquet", "avro", "orc", "json", and "csv". Defaults to "direct". Note that only "direct" and "avro" are supported for both importing and exporting; "csv" and "json" are not recommended due to their lack of type safety.

See the table below for supported type and import/export combinations.

                                          Direct  Parquet  Avro  ORC  JSON  CSV
  Import to Spark (export from BigQuery)     X               X          X     X
  Export from Spark (import to BigQuery)     X       X       X     X
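
As an illustration, a minimal sketch of setting package-wide defaults once per session; the project ID, bucket name, and key file path below are placeholders to be replaced with your own values:

  library(sparkbq)

  bigquery_defaults(
    billingProjectId      = "my-billing-project",    # placeholder GCP project ID
    gcsBucket             = "my-temp-bucket",        # existing GCS bucket for temporary files
    datasetLocation       = "US",                    # "US" or "EU"
    serviceAccountKeyFile = "/path/to/keyfile.json", # omit to fall back to ADC
    type                  = "avro"                   # supported for both import and export
  )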

Value

A list of the options that were set, with their previous values.

References

https://github.com/miraisolutions/spark-bigquery
https://cloud.google.com/bigquery/pricing
https://cloud.google.com/bigquery/docs/dataset-locations
https://cloud.google.com/bigquery/docs/authentication/service-account-file
https://cloud.google.com/docs/authentication/
https://cloud.google.com/bigquery/docs/authentication/
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-parquet
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-orc
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-json
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv

See Also

spark_read_bigquery, spark_write_bigquery, default_billing_project_id, default_gcs_bucket, default_dataset_location
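
For context, a sketch of how these defaults are picked up by spark_read_bigquery. The Spark connection is set up via sparklyr, the example table is BigQuery's public samples dataset, and the argument names (name, projectId, datasetId, tableId) are assumed to match those documented for sparkbq 0.1.x:

  library(sparklyr)
  library(sparkbq)

  sc <- spark_connect(master = "local[*]")

  bigquery_defaults(
    billingProjectId = "my-billing-project",  # placeholder
    gcsBucket        = "my-temp-bucket",      # placeholder
    datasetLocation  = "US"
  )

  # The defaults set above are reused, so only the source table needs naming.
  shakespeare <- spark_read_bigquery(
    sc,
    name      = "shakespeare",
    projectId = "bigquery-public-data",
    datasetId = "samples",
    tableId   = "shakespeare"
  )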