
stri_extract_all_* extracts all the matches. On the other hand, stri_extract_first_* and stri_extract_last_* provide the first or the last matches, respectively.
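The difference between the three families can be sketched as follows. This is a minimal illustration only; the input string and the digit pattern are made up for the example and do not come from the package documentation.

```r
library(stringi)

x <- "a1b22c333"  # hypothetical input

# all matches: a list with one character vector per input string
all_m <- stri_extract_all_regex(x, "[0-9]+")

# first and last match: plain character vectors
first_m <- stri_extract_first_regex(x, "[0-9]+")
last_m  <- stri_extract_last_regex(x, "[0-9]+")

print(all_m)
print(first_m)
print(last_m)
```

Here all_m holds the three digit runs, while first_m and last_m each hold a single string.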
stri_extract_all(str, ..., regex, coll, charclass)
stri_extract_first(str, ..., regex, coll, charclass)
stri_extract_last(str, ..., regex, coll, charclass)
stri_extract(str, ..., regex, coll, charclass, mode = c("first", "all",
"last"))
stri_extract_all_charclass(str, pattern, merge = TRUE, simplify = FALSE)
stri_extract_first_charclass(str, pattern)
stri_extract_last_charclass(str, pattern)
stri_extract_all_coll(str, pattern, simplify = FALSE, opts_collator = NULL)
stri_extract_first_coll(str, pattern, opts_collator = NULL)
stri_extract_last_coll(str, pattern, opts_collator = NULL)
stri_extract_all_regex(str, pattern, simplify = FALSE, opts_regex = NULL)
stri_extract_first_regex(str, pattern, opts_regex = NULL)
stri_extract_last_regex(str, pattern, opts_regex = NULL)
mode: one of "first" (the default), "all", "last"
merge: stri_extract_all_charclass only
simplify: if TRUE, then a character matrix is returned; otherwise (the default), a list of character vectors is given, see Value; stri_extract_all_* only
opts_collator: see stri_opts_collator; NULL for default settings; stri_extract_*_coll only
opts_regex: see stri_opts_regex; NULL for default settings; stri_extract_*_regex only

For stri_extract_all*, if simplify == FALSE (the default), then a list of character vectors is returned. Each list element represents the results of a separate search scenario. If a pattern is not found, then a character vector of length 1, with a single NA value, will be generated.

Otherwise, i.e., if simplify == TRUE, then stri_list2matrix with the byrow=TRUE argument is called on the resulting object. In such a case, a character matrix with an appropriate number of rows (according to the length of str, pattern, etc.) is returned.

stri_extract_first* and stri_extract_last*, on the other hand, return a character vector. An NA element indicates no match.
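The return shapes described above can be sketched like this. The input vector is hypothetical, and the value used to pad short rows of the matrix may differ between stringi versions, so the example does not rely on it.

```r
library(stringi)

x <- c("a1b22", "xyz", "7")  # hypothetical input; "xyz" contains no digits

# simplify = FALSE (the default): a list, with NA where nothing matched
as_list <- stri_extract_all_regex(x, "[0-9]+")

# simplify = TRUE: one row per input string, one column per match
as_matrix <- stri_extract_all_regex(x, "[0-9]+", simplify = TRUE)

# first match only: a character vector, NA where nothing matched
first_only <- stri_extract_first_regex(x, "[0-9]+")

print(as_list)
print(dim(as_matrix))
print(first_only)
```

The matrix form is convenient when every string is expected to yield about the same number of matches; the list form keeps the per-string results apart without padding.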
Vectorized over str and pattern.
Note that a stri_extract_*_fixed family of functions does not make sense. Thus, it has not been implemented in stringi.
If you would like to extract regex capture groups individually, check out stri_match.
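A short sketch of both points: element-wise vectorization over str and pattern, and stri_match_first_regex for capture groups. The strings and patterns are made up for illustration.

```r
library(stringi)

# str and pattern are paired element by element
paired <- stri_extract_first_regex(c("a1", "b2"), c("[a-z]", "[0-9]"))

# a single string is recycled against several patterns
recycled <- stri_extract_first_regex("a1", c("[a-z]", "[0-9]"))

# stri_match_* returns the whole match plus each capture group,
# one column per group
groups <- stri_match_first_regex("key=value", "(\\w+)=(\\w+)")

print(paired)
print(recycled)
print(groups)
```

In groups, the first column is the whole match and the remaining columns are the capture groups, which stri_extract_* does not expose.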
stri_extract, stri_extract_all, stri_extract_first, and stri_extract_last are convenience functions. They just call stri_extract_*_*, depending on the arguments used. Unless you are a very lazy person, please call the underlying functions directly for better performance.
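As a rough check of that equivalence, the wrappers and the underlying functions can be compared directly. This sketch assumes the wrapper dispatches purely on the named argument (regex here) and on mode, as described above.

```r
library(stringi)

x <- "XaaaaX"

# wrapper selected by the regex= argument vs. the direct call
via_wrapper <- stri_extract_all(x, regex = "\\p{Ll}+")
direct      <- stri_extract_all_regex(x, "\\p{Ll}+")

# generic wrapper selected by mode= vs. the direct call
last_wrapper <- stri_extract(x, regex = "\\p{Ll}", mode = "last")
last_direct  <- stri_extract_last_regex(x, "\\p{Ll}")

print(identical(via_wrapper, direct))
print(identical(last_wrapper, last_direct))
```

Both comparisons should agree, so the only thing the direct call saves is the dispatch step.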
See also: stri_extract_words; stri_match, stri_match_all, stri_match_all_regex, stri_match_first, stri_match_first_regex, stri_match_last, stri_match_last_regex; stringi-search
stri_extract_all('XaaaaX', regex=c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_all('Bartolini', coll='i')
stri_extract_all('stringi is so good!', charclass='\\p{Zs}') # all whitespaces
stri_extract_all_charclass(c('AbcdeFgHijK', 'abc', 'ABC'), '\\p{Ll}')
stri_extract_all_charclass(c('AbcdeFgHijK', 'abc', 'ABC'), '\\p{Ll}', merge=FALSE)
stri_extract_first_charclass('AaBbCc', '\\p{Ll}')
stri_extract_last_charclass('AaBbCc', '\\p{Ll}')
stri_extract_all_coll(c('AaaaaaaA', 'AAAA'), 'a')
stri_extract_first_coll(c('Yy\u00FD', 'AAA'), 'y',
   opts_collator=stri_opts_collator(strength=2, locale="sk_SK"))
stri_extract_last_coll(c('Yy\u00FD', 'AAA'), 'y',
   opts_collator=stri_opts_collator(strength=1, locale="sk_SK"))
stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_first_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_last_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_list2matrix(stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+')))
stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+'), simplify=TRUE)