Reads a text file in its entirety, re-encodes it, and splits it into text lines.
stri_read_lines(con, encoding = NULL, fname = con)
Returns a character vector in which each text line is a separate string. The output is always marked as UTF-8.
con: name of the input file or a connection object (opened in the binary mode)
encoding: single string; input encoding; NULL or '' for the current default encoding.
fname: [DEPRECATED] alias of con
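As an illustration of the two accepted forms of con, the following minimal sketch reads the same temporary file once by name and once through a connection opened in binary mode. The file contents and the variable names are made up for the example; only stri_read_lines itself comes from this page.

```r
library(stringi)

# Create a small ASCII file to read back (illustrative data only).
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("alpha\nbeta\n"), tmp)

# Form 1: pass the file name.
by_name <- stri_read_lines(tmp, encoding = "UTF-8")

# Form 2: pass a connection opened in binary mode ("rb").
con <- file(tmp, open = "rb")
by_con <- stri_read_lines(con, encoding = "UTF-8")
close(con)

print(by_name)
print(by_con)
```

Stating the encoding explicitly, rather than relying on NULL, keeps the result independent of the current default encoding of the session.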
Marek Gagolewski and other contributors
This function aims to be a substitute for readLines, with the ability to re-encode the input file in a much more robust way and to split the text into lines with stri_split_lines1 (which conforms to the Unicode guidelines for newline markers).
The function calls stri_read_raw, stri_encode, and stri_split_lines1, in this order.
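The three-step pipeline described above can be reproduced by hand. The sketch below is an approximation for illustration, not the package's actual source: it writes a small ISO-8859-2 file with mixed newline markers (the sample text and variable names are assumptions), runs the three calls manually, and compares the result with a direct call to stri_read_lines.

```r
library(stringi)

# Sample text with Polish letters, one CRLF and one LF line break.
txt <- "za\u017c\u00f3\u0142\u0107\r\ng\u0119\u015bl\u0105\nja\u017a\u0144"

# Store it on disk as ISO-8859-2 bytes.
tmp <- tempfile(fileext = ".txt")
latin2_bytes <- stri_encode(txt, from = "UTF-8", to = "ISO-8859-2", to_raw = TRUE)[[1]]
writeBin(latin2_bytes, tmp)

# Step 1: read the whole file as a raw vector.
raw_bytes <- stri_read_raw(tmp)

# Step 2: convert the bytes from the input encoding to UTF-8.
utf8_text <- stri_encode(raw_bytes, from = "ISO-8859-2", to = "UTF-8")

# Step 3: split the single string into lines.
lines_manual <- stri_split_lines1(utf8_text)

# The one-call equivalent.
lines_direct <- stri_read_lines(tmp, encoding = "ISO-8859-2")

print(identical(lines_manual, lines_direct))
```

Running the steps separately can be handy when the raw bytes need to be inspected, for example to guess an unknown encoding, before committing to a conversion.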
Because of the way this function is currently implemented, the file size cannot exceed ~0.67 GB.
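One way to respect this limit is to check the file size before reading. The helper below is hypothetical and not part of stringi. It assumes that "~0.67 GB" can be read conservatively as 0.67e9 bytes, and it only works for file names, not for connections.

```r
library(stringi)

# Hypothetical helper (not part of stringi): refuse to read files above a
# size threshold. The default of 0.67e9 bytes is an assumed, conservative
# interpretation of the "~0.67 GB" limit.
read_lines_checked <- function(path, encoding = NULL, max_bytes = 0.67e9) {
    size <- file.size(path)
    if (is.na(size)) {
        stop("cannot determine the size of ", path)
    }
    if (size > max_bytes) {
        stop("file is too large for stri_read_lines: ", size, " bytes")
    }
    stri_read_lines(path, encoding = encoding)
}

# Usage on a small temporary file.
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("first\nsecond\n"), tmp)
small <- read_lines_checked(tmp, encoding = "UTF-8")
print(small)
```

Larger files would have to be processed differently, for instance in chunks, which is outside the scope of this function.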
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other files: stri_read_raw(), stri_write_lines()