These functions return or modify a sub-vector where there is a match to
a given pattern. In other words, they are roughly equivalent to
(but faster and easier to use than) a call to
str[stri_detect(str, ...)] or str[stri_detect(str, ...)] <- value.
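The equivalence described above can be checked directly. This is a minimal sketch, assuming the stringi package is installed; the sample vector and the variable names are made up for illustration.

```r
library(stringi)

# Hypothetical sample data
x <- c("apple", "banana", "cherry", "date")

# Subsetting in a single call
via_subset <- stri_subset_regex(x, "an")

# The longer form: detect matches first, then index with the logical vector
via_detect <- x[stri_detect_regex(x, "an")]

print(via_subset)
print(identical(via_subset, via_detect))
```

Both forms should select the same elements; the single-call form simply avoids building and indexing the intermediate logical vector by hand.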
stri_subset(str, ..., regex, fixed, coll, charclass)
stri_subset(str, ..., regex, fixed, coll, charclass) <- value
stri_subset_fixed(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_fixed = NULL
)
stri_subset_fixed(str, pattern, negate=FALSE, ..., opts_fixed=NULL) <- value
stri_subset_charclass(str, pattern, omit_na = FALSE, negate = FALSE)
stri_subset_charclass(str, pattern, negate=FALSE) <- value
stri_subset_coll(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_collator = NULL
)
stri_subset_coll(str, pattern, negate=FALSE, ..., opts_collator=NULL) <- value
stri_subset_regex(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_regex = NULL
)
stri_subset_regex(str, pattern, negate=FALSE, ..., opts_regex=NULL) <- value
The stri_subset_* functions return a character vector.
As usual, the output encoding is UTF-8.
The stri_subset_*<- functions modify str 'in-place'.
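As a small illustration of the replacement form, the sketch below (assuming stringi is installed; the vectors x and y are hypothetical) modifies a copy and leaves the original untouched, so the two can be compared afterwards.

```r
library(stringi)

x <- c("alpha", "beta", "gamma")
y <- x                                 # keep the original for comparison

# Replace every element of y that contains "mm"
stri_subset_fixed(y, "mm") <- "DELTA"

print(y)                               # only the matching element changes
print(x)                               # the original vector is unaffected
print(length(y) == length(x))          # the length is preserved
```

Here 'in-place' refers to the variable named on the left-hand side of the assignment; other variables that held the same values keep them, following R's usual copy semantics.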
str: character vector; strings to search within
...: supplementary arguments passed to the underlying functions, including additional settings for opts_collator, opts_regex, opts_fixed, and so on
value: non-empty character vector of replacement strings; replacement function only
pattern, regex, fixed, coll, charclass: character vector; search patterns (no more than the length of str); for more details refer to stringi-search
omit_na: single logical value; should missing values be excluded from the result?
negate: single logical value; whether non-matching elements should be selected instead
opts_collator, opts_fixed, opts_regex: a named list used to tune the search engine's settings; see stri_opts_collator, stri_opts_fixed, and stri_opts_regex, respectively; NULL for the defaults
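The sketch below shows how the logical flags and the engine options described above might be combined. It assumes stringi is installed; the input vector and result names are invented for the example.

```r
library(stringi)

# Hypothetical input with one missing value
x <- c("stringi R", NA, "ID456", "abc")

# omit_na = TRUE drops missing values from the result
with_digits <- stri_subset_regex(x, "[0-9]", omit_na = TRUE)

# negate = TRUE selects the elements that do NOT match
no_digits <- stri_subset_regex(x, "[0-9]", negate = TRUE, omit_na = TRUE)

# Engine settings can be supplied as a list built by stri_opts_regex() ...
ci_opts <- stri_subset_regex(x, "^id", omit_na = TRUE,
                             opts_regex = stri_opts_regex(case_insensitive = TRUE))

# ... or passed directly through `...`
ci_dots <- stri_subset_regex(x, "^id", omit_na = TRUE, case_insensitive = TRUE)

print(with_digits)
print(no_digits)
print(identical(ci_opts, ci_dots))
```

Passing options through `...` is convenient for one-off calls, while a prebuilt options list is easier to reuse across several calls.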
Marek Gagolewski and other contributors
Vectorized over str as well as partially over pattern and value,
with recycling of the elements in the shorter vector if necessary.
As the aim here is to subset str, pattern cannot be longer than str.
Moreover, if the number of items to replace is not a multiple of the length of value,
a warning is emitted and the unused elements are ignored.
Hence, the length of the output will be the same as the length of str.
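To make the recycling rules concrete, here is a minimal sketch, assuming stringi is installed; the vectors x and y and the patterns are hypothetical.

```r
library(stringi)

x <- c("ab", "cd", "ca", "af")

# pattern is recycled along str:
# "^a" is tested against x[1] and x[3], "^c" against x[2] and x[4]
recycled <- stri_subset_regex(x, c("^a", "^c"))
print(recycled)

y <- c("a1", "b", "c2")

# Two elements of y contain a digit and two replacement values are given,
# so the count is a multiple of length(value) and no warning is expected
stri_subset_regex(y, "[0-9]") <- c("X", "Y")
print(y)
```

The replacement values are consumed in order, one per matching element, which mirrors the behaviour of an ordinary str[index] <- value assignment.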
stri_subset and stri_subset<- are convenience functions.
They call either stri_subset_regex, stri_subset_fixed, stri_subset_coll,
or stri_subset_charclass, depending on the argument used.
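The dispatch by argument name can be sketched as follows, assuming stringi is installed; the sample vector is made up for illustration.

```r
library(stringi)

x <- c("a.b", "ab", "a1")

# The named argument selects the search engine
by_regex <- stri_subset(x, regex = "^a\\.")      # "." must be escaped in a regex
by_fixed <- stri_subset(x, fixed = ".")          # "." is matched literally
by_class <- stri_subset(x, charclass = "\\p{N}") # any numeric character

# Each call should agree with the corresponding engine-specific function
print(identical(by_regex, stri_subset_regex(x, "^a\\.")))
print(identical(by_fixed, stri_subset_fixed(x, ".")))
print(identical(by_class, stri_subset_charclass(x, "\\p{N}")))
```

Calling the engine-specific function directly makes the chosen engine explicit in the code, whereas the wrapper is handy when the engine is decided by the caller.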
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other search_subset:
about_search
stri_subset_regex(c('stringi R', '123', 'ID456', ''), '^[0-9]+$')
x <- c('stringi R', '123', 'ID456', '')
`stri_subset_regex<-`(x, '[0-9]+$', negate=TRUE, value=NA) # returns a copy
stri_subset_regex(x, '[0-9]+$') <- NA # modifies `x` in-place
print(x)