This function computes a locale-dependent sort key: an alternative
character representation of a string such that, when the keys are ordered
in the C locale (which compares the underlying bytes directly), the result
matches the locale-aware ordering of the original strings. It is useful for
making algorithms that sort only in the C locale (e.g., the strcmp function
in libc) locale-aware.
stri_sort_key(str, ..., opts_collator = NULL)
The result is a character vector with the same length as str that
contains the sort keys. The output is marked as bytes-encoded.
str: a character vector
...: additional settings for opts_collator
opts_collator: a named list with ICU Collator's options, see stri_opts_collator; NULL for default collation options
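The two ways of supplying collator settings described above can be sketched as follows. This is a minimal illustration, assuming that `locale` and `strength` are among the settings accepted by stri_opts_collator; the variable names `x`, `k1` and `k2` are made up for this example.

```r
library(stringi)

x <- c("hladny", "chladny")

# Collator settings passed directly through `...`
k1 <- stri_sort_key(x, locale = "sk_SK", strength = 1)

# The same settings bundled into a named list via stri_opts_collator()
k2 <- stri_sort_key(
  x,
  opts_collator = stri_opts_collator(locale = "sk_SK", strength = 1)
)

# Both calls are expected to produce the same keys, one per input string
identical(k1, k2)
length(k1) == length(x)
```

Passing settings through `...` is the shorter form for one-off calls; building the list once with stri_opts_collator may be more convenient when the same options are reused across several functions.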
Marek Gagolewski and other contributors
For more information on ICU's Collator and how to tune it up
in stringi, refer to stri_opts_collator.
See also stri_rank for ranking strings with a single character
vector, i.e., generating relative sort keys.
Collation - ICU User Guide, https://unicode-org.github.io/icu/userguide/collation/
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other locale_sensitive: %s<%(), about_locale, about_search_boundaries, about_search_coll, stri_compare(), stri_count_boundaries(), stri_duplicated(), stri_enc_detect2(), stri_extract_all_boundaries(), stri_locate_all_boundaries(), stri_opts_collator(), stri_order(), stri_rank(), stri_sort(), stri_split_boundaries(), stri_trans_tolower(), stri_unique(), stri_wrap()
stri_sort_key(c('hladny', 'chladny'), locale='pl_PL')
stri_sort_key(c('hladny', 'chladny'), locale='sk_SK')
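The sketch below checks the property stated in the description: comparing two sort keys byte by byte should agree with a locale-aware comparison of the original strings. The helper `bytes_cmp` is hypothetical (it is not part of stringi) and only imitates a strcmp-style comparison on the raw bytes; `stri_cmp` is the stringi function used as the reference.

```r
library(stringi)

# Hypothetical helper: strcmp-like comparison of the raw bytes of two
# strings; returns -1, 0, or 1.
bytes_cmp <- function(a, b) {
  ra <- as.integer(charToRaw(a))
  rb <- as.integer(charToRaw(b))
  n <- min(length(ra), length(rb))
  d <- ra[seq_len(n)] - rb[seq_len(n)]
  d <- d[d != 0]
  if (length(d) > 0) return(sign(d[1]))  # first differing byte decides
  sign(length(ra) - length(rb))          # otherwise the shorter one is first
}

x <- c("hladny", "chladny")

# Slovak treats "ch" as a separate letter placed after "h", whereas
# Polish does not, so the two locales should order the words differently.
for (loc in c("pl_PL", "sk_SK")) {
  keys <- stri_sort_key(x, locale = loc)
  via_keys <- bytes_cmp(keys[1], keys[2])
  via_collator <- stri_cmp(x[1], x[2], locale = loc)
  cat(loc, ": keys agree with collator:", via_keys == via_collator, "\n")
}
```

The comparison is done on raw bytes via charToRaw rather than with R's own sorting routines, so the check does not depend on how the current R session collates bytes-encoded strings.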