
SparkR (version 3.1.2)

read.df: Load a SparkDataFrame

Description

Loads a dataset from a data source and returns it as a SparkDataFrame.

Usage

read.df(path = NULL, source = NULL, schema = NULL, na.strings = "NA", ...)

loadDF(path = NULL, source = NULL, schema = NULL, ...)

Arguments

path

The path of files to load.

source

The name of the external data source.

schema

The data schema defined in structType or a DDL-formatted string.

na.strings

The string value that is interpreted as NA when source is "csv".

...

Additional named properties specific to the external data source.

Value

SparkDataFrame

Details

The data source is specified by source and a set of options (...). If source is not specified, the default data source configured by "spark.sql.sources.default" is used. Similar to R's read.csv, when source is "csv", the string "NA" is interpreted as NA by default.
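
For example, a minimal sketch of these two behaviors, assuming a SparkSession has been started and using hypothetical file paths: na.strings marks a custom string (here "null") as missing instead of the default "NA", and omitting source falls back to the data source configured by "spark.sql.sources.default" (typically "parquet").

sparkR.session()
# Read a CSV file, treating the string "null" as NA;
# "header" is passed through as a CSV data source option
df_csv <- read.df("path/to/people.csv", source = "csv",
                  na.strings = "null", header = "true")
# Omit source to use the default data source configured by
# "spark.sql.sources.default"
df_default <- read.df("path/to/data.parquet")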

See Also

read.json

Examples

# NOT RUN {
sparkR.session()
# Read a JSON file, letting Spark infer the schema
df1 <- read.df("path/to/file.json", source = "json")
# Read multi-line JSON with an explicit schema built from structType/structField
schema <- structType(structField("name", "string"),
                     structField("info", "map<string,double>"))
df2 <- read.df(mapTypeJsonPath, "json", schema, multiLine = TRUE)
# loadDF is equivalent to read.df; extra named arguments are passed to the data source
df3 <- loadDF("data/test_table", "parquet", mergeSchema = "true")
# The schema can also be given as a DDL-formatted string
stringSchema <- "name STRING, info MAP<STRING, DOUBLE>"
df4 <- read.df(mapTypeJsonPath, "json", stringSchema, multiLine = TRUE)
# }
