Setting a locale's codeset (specifically, the LC_CTYPE category)
produces side effects in R's handling of strings. The most
important of these affects how the R parser marks strings. R has
specific internal support for latin1 (a single-byte encoding) and
UTF-8 (a multibyte, variable-width encoding) strings. If the locale
codeset is latin1 or UTF-8, the parser will mark all strings with
the corresponding encoding. It is important for strings to have
consistent encoding markers, because these markers determine which
internal encoding conversions take place when R or packages handle
strings (see set_str_encoding() for some examples).
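To make the notion of an encoding marker concrete, the following sketch uses only base R (Encoding(), iconv(), charToRaw()) rather than the helpers documented here. It assumes that a string written with a Unicode escape is marked as UTF-8 by the parser; the variable names x and y are illustrative only.

```r
# A Unicode escape is expected to yield a string marked as UTF-8.
x <- "caf\u00e9"
Encoding(x)

# iconv() re-encodes the bytes and marks the result as latin1.
y <- iconv(x, from = "UTF-8", to = "latin1")
Encoding(y)

# Same text, different bytes: the accented letter takes two bytes
# in UTF-8 and one byte in latin1.
charToRaw(x)
charToRaw(y)

# Comparison translates differently marked strings before comparing them.
x == y
```

The two strings compare as equal even though their bytes differ, which is only possible because R consults the markers before comparing.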
mut_utf8_locale()
mut_latin1_locale()
mut_mbcs_locale()
The previous locale (invisibly).
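The save-and-restore pattern that a returned previous locale allows can be sketched with base R alone. This is not the implementation of the helpers above. The locale name "en_US.UTF-8" is an assumption and may not exist on a given system, in which case Sys.setlocale() returns an empty string and the locale is left unchanged.

```r
# Remember the current LC_CTYPE setting so it can be restored later.
old_ctype <- Sys.getlocale("LC_CTYPE")

# Try to switch the codeset. "en_US.UTF-8" is an assumed locale name.
new_ctype <- suppressWarnings(Sys.setlocale("LC_CTYPE", "en_US.UTF-8"))
if (identical(new_ctype, "")) {
  message("Requested locale is not available; LC_CTYPE is unchanged.")
}

# ... interactive experiments would go here ...

# Restore the previous LC_CTYPE setting.
invisible(Sys.setlocale("LC_CTYPE", old_ctype))
```

Restoring the locale does not undo any strings or symbols that R cached while the other codeset was active.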
If you are changing the locale encoding for testing purposes, be aware that R caches strings and symbols to save memory. Changing the locale during an R session can therefore lead to surprising and difficult-to-reproduce results. If in doubt, restart your R session.
Note that these helpers are provided only for interactively
testing the effects of changing the locale codeset. They let you
quickly change the default text encoding to latin1, UTF-8, or a
non-UTF-8 multibyte character set (MBCS). They are not widely
tested and do not provide a way of setting the language and region
of the locale. They have permanent side effects and should probably
not be used in package examples, unit tests, or in the course of a
data analysis. Finally, note that mut_utf8_locale() will not work
on Windows, as only latin1 and MBCS locales are supported on this
OS.
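When experimenting with the codeset, it can help to check which of the three cases above the session is currently in. The sketch below relies on base R's l10n_info(); the function name describe_codeset() is hypothetical and not part of any package.

```r
# Hypothetical helper: classify the current locale codeset using the
# MBCS, UTF-8, and Latin-1 flags reported by l10n_info().
describe_codeset <- function(info = l10n_info()) {
  if (isTRUE(info[["UTF-8"]])) {
    "UTF-8"
  } else if (isTRUE(info[["Latin-1"]])) {
    "latin1"
  } else if (isTRUE(info[["MBCS"]])) {
    "non-UTF-8 MBCS"
  } else {
    "other single-byte codeset"
  }
}

describe_codeset()
```

Calling it before and after changing the locale gives a quick confirmation that the change took effect.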