UTF-32 is a 32-bit encoding in which each Unicode code point corresponds to exactly one integer value. This function converts a character vector to a list of integer vectors so that, for example, individual code points can easily be accessed or modified.
stri_enc_toutf32(str)
str: a character vector (or an object coercible to such a vector) to be converted
Returns a list of integer vectors.
Missing values are converted to NULLs.
See stri_enc_fromutf32 for a dual operation.
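As a brief illustration of the conversion and its dual, the following sketch assumes the stringi package is installed; the input strings and variable names are illustrative only.

```r
library(stringi)

# Two strings plus a missing value;
# "\u0105" is LATIN SMALL LETTER A WITH OGONEK (code point 261)
x <- c("abc", "\u0105b", NA)

# Each string becomes an integer vector of code points;
# the missing value becomes NULL
cp <- stri_enc_toutf32(x)

# Individual code points can be changed directly, e.g. 'a' (97) -> 'A' (65)
cp[[1]][1] <- 65L

# Convert the two non-missing elements back to a character vector
y <- stri_enc_fromutf32(cp[1:2])
```

Working on integer vectors avoids any byte-level arithmetic: one list element position always corresponds to one code point.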
This function is roughly equivalent to a vectorized call to utf8ToInt(enc2utf8(str)).
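The rough equivalence can be sketched in base R as follows. The helper name to_code_points is hypothetical and not part of stringi, and this sketch does not reproduce the function's handling of improper UTF-8 input (NULL with a warning).

```r
# Hypothetical base-R approximation; not the stringi implementation
to_code_points <- function(str) {
  # enc2utf8 re-encodes each element as UTF-8 and keeps NA as NA
  lapply(enc2utf8(as.character(str)), function(s) {
    # utf8ToInt accepts a single string only, hence the element-wise loop;
    # missing values are mapped to NULL
    if (is.na(s)) NULL else utf8ToInt(s)
  })
}

res <- to_code_points(c("abc", NA))
```

Here res is a list of length two: the integer vector 97, 98, 99 followed by NULL.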
If you want a list of raw vectors on output, use stri_encode.
Unlike utf8ToInt, if improper UTF-8 byte sequences are detected, the corresponding element is set to NULL and a warning is given; see also stri_enc_toutf8 for a method to deal with such cases.
Other encoding_conversion: stri_enc_fromutf32, stri_enc_toascii, stri_enc_tonative, stri_enc_toutf8, stri_encode, stringi-encoding