arrow (version 0.14.1)

read_delim_arrow: Read a CSV or other delimited file with Arrow

Description

These functions use the Arrow C++ CSV reader to read a delimited file into a data.frame. Arrow C++ options have been mapped to argument names that follow those of readr::read_delim(), and col_select was inspired by vroom::vroom().

Usage

read_delim_arrow(file, delim = ",", quote = "\"",
  escape_double = TRUE, escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_csv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_tsv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

Arguments

file

A character path to a local file, or an Arrow input stream.

delim

Single character used to separate fields within a record.

quote

Single character used to quote strings.

escape_double

Does the file escape quotes by doubling them? That is, if this option is TRUE, the value """" represents a single quote character, ".

escape_backslash

Does the file use backslashes to escape special characters? This is more general than escape_double as backslashes can be used to escape the delimiter character, the quote character, or to add special characters like \n.

col_select

A tidy selection specification of columns, as used in dplyr::select().

skip_empty_rows

Should blank rows be ignored altogether? If TRUE, blank rows will not be represented at all. If FALSE, they will be filled with missing values.

parse_options

See csv_parse_options(). If given, this overrides any parsing options provided in other arguments (e.g. delim, quote, etc.).

convert_options

See csv_convert_options().

read_options

See csv_read_options().

as_tibble

Should the function return a data.frame (TRUE, the default) or an arrow::Table (FALSE)?
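
As a rough illustration of how delim and escape_double interact, the sketch below reads a small pipe-delimited file in which one quoted field contains doubled quotes. The file contents and column names (id, label) are made up for this example, and the argument names are those listed in Usage for this version.

```r
library(arrow)

# Hypothetical input: pipe-delimited, with a quoted field that
# escapes its inner quotes by doubling them.
tf <- tempfile(fileext = ".txt")
writeLines(c(
  "id|label",
  "1|\"say \"\"hi\"\"\"",
  "2|plain"
), tf)

# delim sets the field separator; escape_double = TRUE tells the
# reader to treat "" inside a quoted field as one literal quote.
df <- read_delim_arrow(tf, delim = "|", escape_double = TRUE)

names(df)
df$label

unlink(tf)
```

With escape_double = TRUE, the first label should come back with a single pair of quotes around hi rather than the doubled ones stored in the file.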

Value

A data.frame, or an arrow::Table if as_tibble = FALSE.

Details

read_csv_arrow() and read_tsv_arrow() are wrappers around read_delim_arrow() that specify a delimiter.
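
The relationship between the wrappers and read_delim_arrow() can be checked with a small sketch like the following. It assumes only the functions documented on this page; the temporary files and the use of head(iris) are choices made for the example.

```r
library(arrow)

# Comma-separated file: read_csv_arrow() should match
# read_delim_arrow() called with delim = ",".
csv_file <- tempfile(fileext = ".csv")
write.csv(head(iris), file = csv_file, row.names = FALSE)
a <- read_csv_arrow(csv_file)
b <- read_delim_arrow(csv_file, delim = ",")

# Tab-separated file: read_tsv_arrow() should match
# read_delim_arrow() called with delim = "\t".
tsv_file <- tempfile(fileext = ".tsv")
write.table(head(iris), file = tsv_file, sep = "\t", row.names = FALSE)
c1 <- read_tsv_arrow(tsv_file)
c2 <- read_delim_arrow(tsv_file, delim = "\t")

identical(names(a), names(b))
identical(dim(c1), dim(c2))

unlink(c(csv_file, tsv_file))
```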

Note that not all readr options are currently implemented here. Please file an issue if you encounter one that arrow should support.

If you need to control Arrow-specific reader parameters that don't have an equivalent in readr::read_csv(), you can either provide them in the parse_options, convert_options, or read_options arguments, or you can call csv_table_reader() directly for lower-level access.
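
A minimal sketch of passing Arrow-specific options through parse_options is shown here. It assumes that csv_parse_options() accepts a delimiter argument (mirroring the Arrow C++ parse options); check ?csv_parse_options in your installed version before relying on it. The semicolon-delimited file and its column names are invented for the example.

```r
library(arrow)

# Hypothetical semicolon-delimited input.
tf <- tempfile()
writeLines(c("x;y", "1;a", "2;b"), tf)

# Assumption: csv_parse_options() takes a `delimiter` argument.
# When parse_options is supplied, it overrides delim, quote, etc.
opts <- csv_parse_options(delimiter = ";")
df <- read_delim_arrow(tf, parse_options = opts)

names(df)
dim(df)

unlink(tf)
```

Building the options object explicitly is mostly useful when the setting you need has no readr-style argument on read_delim_arrow() itself.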

Examples

# NOT RUN {
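library(arrow)

try({
  # Write iris to a temporary CSV file and register its removal
  tf <- tempfile()
  on.exit(unlink(tf))
  write.csv(iris, file = tf)
  # Read the whole file into a data.frame
  df <- read_csv_arrow(tf)
  dim(df)
  # Can select columns
  df <- read_csv_arrow(tf, col_select = starts_with("Sepal"))
})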
# }