
Keywords: R, text processing, strings, internationalization, localization, ICU, ICU4C, i18n, l10n, Unicode.
Homepage:
License: The BSD-3-clause license for the package code, the ICU license for the accompanying ICU4C distribution, and the UCD license for the Unicode Character Database. See the COPYRIGHTS and LICENSE files for more details.
stri_stats_general and stri_stats_latex for gathering some fancy statistics on a character vector's contents.

stri_join, stri_dup, %s+%, and stri_flatten for concatenation-based operations.

stri_sub for extracting and replacing substrings, and stri_reverse for a joyful function to reverse all code points in a string.

stri_trim (among others) for trimming characters from the beginning or/and end of a string, see also stringi-search-charclass, and stri_pad for padding strings so that they have the same minimal number of code points. Additionally, stri_wrap wraps text into lines.

stri_length (among others) for determining the number of code points in a string. See also stri_count_boundaries for counting the number of Unicode characters.

stri_trans_tolower (among others) for case mapping, i.e. conversion to lower, UPPER, or Title Case, stri_trans_nfc (i.a.) for Unicode normalization, and stri_trans_general for other very general yet powerful text transforms, including transliteration.

stri_cmp, %s<%, stri_order, stri_sort, stri_unique, and stri_duplicated for collation-based, locale-aware operations, see also stringi-locale.

stri_split_lines (among others) to split a string into text lines.

stri_escape_unicode (among others) for escaping certain code points.

stri_rand_strings, stri_rand_shuffle, and stri_rand_lipsum for generating (pseudo)random strings.

stri_read_raw, stri_read_lines, and stri_write_lines for reading and writing text files.

Note that each man page has many links to other interesting facilities.
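As a quick orientation, the following minimal sketch exercises a few of the facilities listed above (concatenation, substrings, trimming and padding, lengths, and case mapping). The sample strings and variable names are illustrative assumptions and do not come from the package documentation.

```r
library(stringi)

# Illustrative input (an assumption, not taken from the manual)
x <- c("spam", "bacon and eggs")

# Concatenation-based operations
joined    <- stri_join(x, "!")                 # element-wise; no separator by default
pasted    <- "a" %s+% "b"                      # binary operator form: "ab"
flattened <- stri_flatten(x, collapse = ", ")  # one string: "spam, bacon and eggs"
repeated  <- stri_dup("ab", 3)                 # "ababab"

# Substring extraction and code point reversal
prefix   <- stri_sub("bacon", 1, 3)            # "bac"
reversed <- stri_reverse("spam")               # "maps"

# Trimming and padding
trimmed <- stri_trim_both("  spam  ")          # "spam"
padded  <- stri_pad_left("7", 3, pad = "0")    # "007"

# Number of code points in each string
n_cp <- stri_length(x)                         # 4 14

# Case mapping
upper <- stri_trans_toupper("spam")            # "SPAM"

print(joined)
print(n_cp)
```

All of these functions are vectorized over their main arguments, so the same calls work unchanged on longer character vectors.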
You are encouraged to call stri_install_check
after the package installation or update.
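The collation-based operations depend on the locale in use. The sketch below illustrates this with two words whose order differs between Polish and Slovak (in Slovak, "ch" is treated as a separate letter that sorts after "h"). The variable names are illustrative assumptions, and the results assume that ICU locale data for pl_PL and sk_SK are available in the installation.

```r
library(stringi)

words <- c("hladny", "chladny")

# Collator settings for two locales
pl <- stri_opts_collator(locale = "pl_PL")
sk <- stri_opts_collator(locale = "sk_SK")

# Sorting gives a different order depending on the locale
sorted_pl <- stri_sort(words, opts_collator = pl)  # "chladny" "hladny"
sorted_sk <- stri_sort(words, opts_collator = sk)  # "hladny"  "chladny"

# stri_cmp returns -1, 0, or 1, like a three-way comparison
cmp_pl <- stri_cmp("hladny", "chladny", opts_collator = pl)  # 1
cmp_sk <- stri_cmp("hladny", "chladny", opts_collator = sk)  # -1

# A lower collation strength ignores case differences
loose    <- stri_opts_collator(strength = 2)
cmp_case <- stri_cmp("hello", "HELLO", opts_collator = loose)  # 0

print(sorted_pl)
print(sorted_sk)
```

Passing the settings through opts_collator keeps the locale choice explicit, so the result does not depend on the session's default locale.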
See stri_opts_collator for a description of the string collation algorithm, which is used for string comparing, ordering, sorting, case-folding, and searching.

References:
ICU -- International Components for Unicode
ICU4C API Documentation
The Unicode Consortium
UTF-8, a transformation format of ISO 10646 -- RFC 3629
See also: stringi-arguments; stringi-encoding; stringi-locale; stringi-search-boundaries; stringi-search-charclass; stringi-search-coll; stringi-search-fixed; stringi-search-regex; stringi-search