Serialize a Spark DataFrame to the Parquet format.
spark_write_parquet(x, path, mode = NULL, options = list(),
partition_by = NULL, ...)
x: A Spark DataFrame or dplyr operation.
path: The path to the file. Needs to be accessible from the cluster. Supports the "hdfs://", "s3a://" and "file://" protocols.
mode: A character element. Specifies the behavior when data or a table already exists. Supported values include: 'error', 'append', 'overwrite' and 'ignore'. Note that 'overwrite' will also change the column structure. For more details, see http://spark.apache.org/docs/latest/sql-programming-guide.html#save-modes for your version of Spark.
options: A list of strings with additional options. See http://spark.apache.org/docs/latest/sql-programming-guide.html#configuration.
partition_by: A character vector. Partitions the output by the given columns on the file system.
...: Optional arguments; currently unused.
Other Spark serialization routines: spark_load_table, spark_read_csv, spark_read_jdbc, spark_read_json, spark_read_libsvm, spark_read_parquet, spark_read_source, spark_read_table, spark_read_text, spark_save_table, spark_write_csv, spark_write_jdbc, spark_write_json, spark_write_source, spark_write_table, spark_write_text
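
A minimal usage sketch follows. It is not taken from the original documentation. It assumes a local Spark installation reachable through spark_connect(master = "local"), and the output directory, table names and the choice of the built-in mtcars data set are illustrative only.

```r
library(sparklyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark so there is something to write.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Hypothetical output location; on a real cluster this would typically be
# an "hdfs://" or "s3a://" path that every node can reach.
out_path <- file.path(tempdir(), "mtcars_parquet")

# Write the data as Parquet, replacing any existing output and
# creating one subdirectory per distinct value of `cyl`.
spark_write_parquet(
  mtcars_tbl,
  path = out_path,
  mode = "overwrite",
  partition_by = "cyl"
)

# Read the files back into a new Spark table to check the round trip.
mtcars_back <- spark_read_parquet(sc, name = "mtcars_back", path = out_path)

spark_disconnect(sc)
```

Using mode = "overwrite" here keeps the sketch re-runnable; with 'error', a second run against the same path would be expected to fail because the output already exists.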