These functions use the Arrow C++ CSV reader to read into a data.frame.
Arrow C++ options have been mapped to argument names that follow those of
readr::read_delim(), and col_select was inspired by vroom::vroom().
read_delim_arrow(file, delim = ",", quote = "\"",
  escape_double = TRUE, escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_csv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_tsv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)
file: A character path to a local file, or an Arrow input stream.
delim: Single character used to separate fields within a record.
quote: Single character used to quote strings.
escape_double: Does the file escape quotes by doubling them? That is, if this option is TRUE, the value """" represents a single quote, \".
escape_backslash: Does the file use backslashes to escape special characters? This is more general than escape_double, as backslashes can be used to escape the delimiter character, the quote character, or to add special characters like \n.
col_select: A tidy selection specification of columns, as used in dplyr::select().
skip_empty_rows: Should blank rows be ignored altogether? If TRUE, blank rows will not be represented at all. If FALSE, they will be filled with missings.
parse_options: See csv_parse_options(). If given, this overrides any parsing options provided in other arguments (e.g. delim, quote, etc.).
as_tibble: Should the function return a data.frame or an arrow::Table?
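As a rough illustration of the quoting and blank-row arguments described above, the sketch below reads a small hand-written file using only the defaults (escape_double = TRUE, skip_empty_rows = TRUE). The file contents and the column names id and note are made up for this example and do not come from the package.

```r
library(arrow)

# Hypothetical sample data: one quoted field containing doubled quotes,
# followed by a blank line and a second record.
tf <- tempfile(fileext = ".csv")
writeLines(c('id,note', '1,"say ""hi"""', '', '2,plain'), tf)

# Defaults apply: escape_double = TRUE, skip_empty_rows = TRUE
df <- read_csv_arrow(tf)

nrow(df)     # the blank line is not represented as a row
df$note[1]   # each doubled quote is read as one quote character

unlink(tf)
```

Only the default skip_empty_rows = TRUE is shown here; how a blank line is handled with skip_empty_rows = FALSE in a multi-column file may depend on the underlying Arrow C++ version.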
A data.frame, or an arrow::Table if as_tibble = FALSE.
read_csv_arrow() and read_tsv_arrow() are wrappers around
read_delim_arrow() that specify a delimiter.
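A minimal sketch of that wrapper relationship, assuming a tab-separated copy of the first rows of the built-in iris data: reading it with read_tsv_arrow() should give the same result as calling read_delim_arrow() with delim = "\t". The variable names via_wrapper and via_delim are illustrative only.

```r
library(arrow)

# Write a small tab-separated file (no row names, no quoting)
tf <- tempfile(fileext = ".tsv")
write.table(head(iris), file = tf, sep = "\t", quote = FALSE, row.names = FALSE)

# The wrapper fixes the delimiter; the general function takes it explicitly
via_wrapper <- read_tsv_arrow(tf)
via_delim <- read_delim_arrow(tf, delim = "\t")

# Compare the two results
all.equal(as.data.frame(via_wrapper), as.data.frame(via_delim))

unlink(tf)
```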
Note that not all readr options are currently implemented here. Please file
an issue if you encounter one that arrow should support.
If you need to control Arrow-specific reader parameters that don't have an
equivalent in readr::read_csv(), you can either provide them in the
parse_options, convert_options, or read_options arguments, or you can
call csv_table_reader() directly for lower-level access.
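The following sketch shows one way the parse_options route might look. It assumes that csv_parse_options() accepts a delimiter argument, which matches the Arrow C++ ParseOptions field of that name but should be checked against the installed version of arrow; the semicolon-separated sample file is invented for the example.

```r
library(arrow)

# Hypothetical semicolon-separated sample file
tf <- tempfile(fileext = ".csv")
writeLines(c("x;y", "1;2", "3;4"), tf)

# Assumption: csv_parse_options() takes `delimiter`, mirroring Arrow C++
opts <- csv_parse_options(delimiter = ";")

# parse_options takes precedence over the delim argument
df <- read_delim_arrow(tf, parse_options = opts)
names(df)

unlink(tf)
```

For a one-off delimiter change, passing delim = ";" is simpler; an explicit options object is mainly useful for settings that have no readr-style argument.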
# NOT RUN {
try({
  # Write a sample CSV to a temporary file
  tf <- tempfile()
  on.exit(unlink(tf))
  write.csv(iris, file = tf)
  # Read the whole file into a data.frame
  df <- read_csv_arrow(tf)
  dim(df)
  # Can select columns
  df <- read_csv_arrow(tf, col_select = starts_with("Sepal"))
})
# }