These functions detect whether a given byte stream is valid UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.
stri_enc_isutf16be(str)
stri_enc_isutf16le(str)
stri_enc_isutf32be(str)
stri_enc_isutf32le(str)
Returns a logical vector.
str: a character vector, a raw vector, or a list of raw vectors
Marek Gagolewski and other contributors
These functions are independent of the way R marks encodings in character strings (see Encoding and stringi-encoding). Most often, these functions act on raw vectors.
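As a minimal sketch of the raw-vector use case, the example below passes hand-written bytes to one of the checks, first as a single raw vector and then as a list of raw vectors. The byte values and the variable names (`bytes_le`, `streams`, `res`) are illustrative assumptions, not part of the package documentation.

```r
library(stringi)

# Bytes of "hi" in UTF-16LE, written out by hand (no byte order mark).
bytes_le <- as.raw(c(0x68, 0x00, 0x69, 0x00))

# A single raw vector is treated as one byte stream.
stri_enc_isutf16le(bytes_le)

# A list of raw vectors is checked element by element;
# the second stream has an odd number of bytes.
streams <- list(bytes_le, as.raw(c(0x68, 0x00, 0x69)))
res <- stri_enc_isutf16le(streams)
print(res)
```

The result for the list input is a logical vector with one element per byte stream.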
A result of FALSE means that a string is surely not valid UTF-16 or UTF-32. However, false positives are possible.
Also note that a data stream may sometimes be classified as both valid UTF-16LE and valid UTF-16BE.
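The ambiguity can be explored with a short sketch that runs all four checks on the same bytes. The input here is an assumption chosen for illustration: "hi" in UTF-16LE without a byte order mark. Read as big-endian, the same bytes give the code units 0x6800 and 0x6900, which are also assigned code points, so byte inspection alone may well fail to rule out UTF-16BE. The names `bytes`, `checks`, and `candidates` are hypothetical.

```r
library(stringi)

# "hi" encoded as UTF-16LE without a byte order mark.
bytes <- as.raw(c(0x68, 0x00, 0x69, 0x00))

# Run every check on the same byte stream.
checks <- c(
  utf16le = stri_enc_isutf16le(bytes),
  utf16be = stri_enc_isutf16be(bytes),
  utf32le = stri_enc_isutf32le(bytes),
  utf32be = stri_enc_isutf32be(bytes)
)
print(checks)

# Encodings that were not ruled out. More than one name here means
# the byte stream alone does not identify the encoding.
candidates <- names(checks)[checks]
print(candidates)
```

Treating the results as a list of candidates rather than a verdict fits the caveat above: FALSE excludes an encoding, while TRUE only means it was not excluded.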
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other encoding_detection: about_encoding, stri_enc_detect2(), stri_enc_detect(), stri_enc_isascii(), stri_enc_isutf8()