String searching facilities described here
provide a way to locate a specific sequence of bytes in a string.
The search engine's settings may be tuned (for example, to perform
a case-insensitive search) via a call to the stri_opts_fixed function.
The fast Knuth-Morris-Pratt search algorithm, with a worst-case time complexity of
O(n+p), where n == length(str) and p == length(pattern),
is implemented (with some tweaks for very short search patterns).
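To make the O(n+p) bound concrete, here is a minimal Python sketch of a textbook Knuth-Morris-Pratt search over bytes. It is an illustration only, not the package's actual implementation: the function names build_failure_table and kmp_find_all are hypothetical, the tweaks for very short patterns are not reproduced, and reporting overlapping matches is a choice made for this sketch.

```python
def build_failure_table(pattern: bytes) -> list[int]:
    """For each prefix of pattern, length of its longest proper border."""
    table = [0] * len(pattern)
    k = 0  # length of the border currently being extended
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text: bytes, pattern: bytes) -> list[int]:
    """Return the 0-based start offsets of all (possibly overlapping) matches."""
    if not pattern:
        return []  # assumption of this sketch: an empty pattern matches nothing
    table = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern bytes matched so far
    for i, byte in enumerate(text):
        while k > 0 and byte != pattern[k]:
            k = table[k - 1]  # reuse the matched prefix; never re-read text
        if byte == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # continue, allowing overlapping matches
    return matches


if __name__ == "__main__":
    print(kmp_find_all(b"abababa", b"aba"))
```

Each text byte is read once and the fallback loop can only undo progress that was made earlier, which is where the linear bound comes from: building the table costs O(p) and the scan costs O(n).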
Be aware that, for natural language processing, fixed pattern searching might not be what you actually require. This is because a byte-wise match will not give correct results in cases of:
accented letters;
conjoined letters;
ignorable punctuation;
ignorable case.
See also stringi-search-coll.
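The accented-letter case from the list above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a description of how the package behaves internally: the helper names bytewise_contains and normalized_contains are hypothetical, and NFC normalization is used here merely as a simple stand-in for the locale-aware comparison that a collator-based search performs.

```python
import unicodedata


def bytewise_contains(haystack: str, needle: str) -> bool:
    """Fixed-pattern search: compare the UTF-8 byte sequences directly."""
    return needle.encode("utf-8") in haystack.encode("utf-8")


def normalized_contains(haystack: str, needle: str) -> bool:
    """Normalize both strings to NFC first, then compare byte-wise."""
    h = unicodedata.normalize("NFC", haystack)
    n = unicodedata.normalize("NFC", needle)
    return bytewise_contains(h, n)


# The same visible word, encoded in two canonically equivalent ways.
composed = "caf\u00e9"      # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"   # plain 'e' followed by a combining acute accent
needle = "\u00e9"           # precomposed 'e with acute'

if __name__ == "__main__":
    print(bytewise_contains(composed, needle))
    print(bytewise_contains(decomposed, needle))
    print(normalized_contains(decomposed, needle))
```

The two haystacks render identically, yet a byte-wise search finds the pattern only in the precomposed one. Normalization repairs this particular mismatch; ignorable punctuation and case need the richer comparison rules described in stringi-search-coll.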
Note that the conversion of input data to Unicode is done as usual.
Other search_fixed: stri_opts_fixed, stringi-search
Other stringi_general_topics: stringi-arguments, stringi-encoding, stringi-locale, stringi-package, stringi-search-boundaries, stringi-search-charclass, stringi-search-coll, stringi-search-regex, stringi-search