Explicitly create tokenizer objects. Usually you will not call these functions directly, but will instead use one of the user-friendly wrappers like read_csv().
tokenizer_delim(delim, quote = "\"", na = "NA", quoted_na = TRUE,
  comment = "", trim_ws = TRUE, escape_double = TRUE,
  escape_backslash = FALSE, skip_empty_rows = TRUE)

tokenizer_csv(na = "NA", quoted_na = TRUE, quote = "\"",
  comment = "", trim_ws = TRUE, skip_empty_rows = TRUE)

tokenizer_tsv(na = "NA", quoted_na = TRUE, quote = "\"",
  comment = "", trim_ws = TRUE, skip_empty_rows = TRUE)

tokenizer_line(na = character(), skip_empty_rows = TRUE)

tokenizer_log()

tokenizer_fwf(begin, end, na = "NA", comment = "", trim_ws = TRUE,
  skip_empty_rows = TRUE)

tokenizer_ws(na = "NA", comment = "", skip_empty_rows = TRUE)
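To inspect the raw tokens a tokenizer produces, you can pass one to tokenize(), the low-level readr function documented alongside these constructors. A minimal sketch (assuming readr >= 2.0, where literal data is wrapped in I()):

library(readr)

# Each element of the result holds the fields of one record.
tokenize(I("x,y\n1,\"a,b\"\n"), tokenizer = tokenizer_csv())

# The same input split on tabs: the commas are no longer separators.
tokenize(I("x,y\n1,\"a,b\"\n"), tokenizer = tokenizer_tsv())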
delim: Single character used to separate fields within a record.

quote: Single character used to quote strings.

na: Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.

quoted_na: Should missing values inside quotes be treated as missing values (the default) or as strings?

comment: A string used to identify comments. Any text after the comment characters will be silently ignored.

trim_ws: Should leading and trailing whitespace be trimmed from each field before parsing it?

escape_double: Does the file escape quotes by doubling them? If this option is TRUE, the value """" represents a single quote, \".

escape_backslash: Does the file use backslashes to escape special characters? This is more general than escape_double, as backslashes can be used to escape the delimiter character, the quote character, or to add special characters like \n (demonstrated in the sketch after this list).

skip_empty_rows: Should blank rows be ignored altogether? If this option is TRUE, blank rows will not be represented at all. If it is FALSE, they will be represented by NA values in all the columns (also demonstrated below).

begin, end: Begin and end offsets for each field. These are C++ offsets, so the first column is column zero and the ranges are [begin, end), i.e. inclusive-exclusive (see the last call in the sketch below).
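The friendly wrappers pass these options straight through to the tokenizer, which is usually the easiest way to exercise them. A minimal sketch using read_delim(), read_csv(), and tokenize() (assuming readr >= 2.0 for the I() literal-data syntax):

library(readr)

# Backslash-escaped delimiter: the first field of the data row is the
# literal string "a,b".
read_delim(I("x,y\na\\,b,c\n"), delim = ",",
  escape_backslash = TRUE, escape_double = FALSE)

# skip_empty_rows = FALSE: the blank line becomes a row of NAs.
read_csv(I("x,y\n1,2\n\n3,4\n"), skip_empty_rows = FALSE)

# Zero-based, inclusive-exclusive offsets: characters 0-2 ("abc") and
# 4-7 ("defg") of each line.
tokenize(I("abc defg\n"),
  tokenizer = tokenizer_fwf(begin = c(0, 4), end = c(3, 8)))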
tokenizer_csv()
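The constructors do no parsing themselves; each one simply records its options in a classed list that readr's C++ readers interpret. A quick way to see this (a sketch; the exact fields may vary across readr versions):

str(tokenizer_csv())
# A list of the options above, with a tokenizer class attribute.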