
SparkR (version 2.4.6)

write.parquet: Save the contents of a SparkDataFrame as a Parquet file, preserving the schema.

Description

Save the contents of a SparkDataFrame as a Parquet file, preserving the schema. Files written out with this method can be read back in as a SparkDataFrame using read.parquet().
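The round trip described above can be sketched as follows. This is a minimal illustration, not part of the original page: it assumes a local Spark installation that SparkR can reach, uses R's built-in faithful dataset as sample data, and writes to a hypothetical directory under tempdir().

```r
library(SparkR)

# Start (or reuse) a Spark session; assumes Spark is available locally.
sparkR.session()

# Sample data: R's built-in 'faithful' dataset (an assumption for illustration).
df <- createDataFrame(faithful)

# Hypothetical output location; Spark writes a directory of Parquet part files here.
out_dir <- file.path(tempdir(), "faithful-parquet")

# mode = "overwrite" keeps the sketch re-runnable within one R session.
write.parquet(df, out_dir, mode = "overwrite")

# Read the directory back in as a SparkDataFrame.
df2 <- read.parquet(out_dir)

# Column names and types should match those of the original SparkDataFrame.
printSchema(df2)
```

Comparing dtypes(df) with dtypes(df2) is one way to check that column names and types survived the round trip.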

Usage

write.parquet(x, path, ...)

saveAsParquetFile(x, path)

# S4 method for SparkDataFrame,character
write.parquet(x, path, mode = "error", ...)

# S4 method for SparkDataFrame,character
saveAsParquetFile(x, path)

Arguments

x

A SparkDataFrame

path

The directory where the file is saved

...

additional argument(s) passed to the method.

mode

The save mode: one of 'append', 'overwrite', 'error', 'errorifexists', or 'ignore' ('error' by default).
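As a rough sketch of how the mode values behave when the target path already exists: 'append' adds rows to the existing data, 'overwrite' replaces it, 'ignore' leaves it untouched, and 'error' (or 'errorifexists') fails. The example below is illustrative only; it assumes a local Spark installation, and the data frame and the directory under tempdir() are hypothetical.

```r
library(SparkR)

# Start (or reuse) a Spark session; assumes Spark is available locally.
sparkR.session()

# Tiny illustrative data set with three rows.
df <- createDataFrame(data.frame(id = 1:3))

# Hypothetical output directory.
out_dir <- file.path(tempdir(), "mode-demo-parquet")

write.parquet(df, out_dir, mode = "overwrite")  # replaces any existing data: 3 rows
write.parquet(df, out_dir, mode = "append")     # adds to existing data: 6 rows
write.parquet(df, out_dir, mode = "ignore")     # path exists, so nothing is written

n_rows <- nrow(read.parquet(out_dir))

# The default mode, "error", refuses to write because the path already exists.
failed <- tryCatch({
  write.parquet(df, out_dir)
  FALSE
}, error = function(e) TRUE)
```

Wrapping the default-mode call in tryCatch() is just one way to surface the failure; in a pipeline, an explicit mode is usually clearer than relying on the default.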

See Also

Other SparkDataFrame functions: SparkDataFrame-class, agg(), alias(), arrange(), as.data.frame(), attach,SparkDataFrame-method, broadcast(), cache(), checkpoint(), coalesce(), collect(), colnames(), coltypes(), createOrReplaceTempView(), crossJoin(), cube(), dapplyCollect(), dapply(), describe(), dim(), distinct(), dropDuplicates(), dropna(), drop(), dtypes(), exceptAll(), except(), explain(), filter(), first(), gapplyCollect(), gapply(), getNumPartitions(), group_by(), head(), hint(), histogram(), insertInto(), intersectAll(), intersect(), isLocal(), isStreaming(), join(), limit(), localCheckpoint(), merge(), mutate(), ncol(), nrow(), persist(), printSchema(), randomSplit(), rbind(), rename(), repartitionByRange(), repartition(), rollup(), sample(), saveAsTable(), schema(), selectExpr(), select(), showDF(), show(), storageLevel(), str(), subset(), summary(), take(), toJSON(), unionByName(), union(), unpersist(), withColumn(), withWatermark(), with(), write.df(), write.jdbc(), write.json(), write.orc(), write.stream(), write.text()

Examples

# NOT RUN {
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
write.parquet(df, "/tmp/sparkr-tmp1/")
saveAsParquetFile(df, "/tmp/sparkr-tmp2/")
# }
