Searches for approximate matches to pattern (the first argument) within each element of the string x (the second argument) using the generalized Levenshtein edit distance (the minimal, possibly weighted, number of insertions, deletions and substitutions needed to transform one string into another).
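The edit distance described above can be inspected directly. The following is a minimal sketch, assuming a version of R that provides adist() in the utils package (R 2.14.0 or later); adist() is a separate function and is used here only to illustrate the distance, not to show how agrep is implemented. The variable names are illustrative.

```r
# Sketch only: adist() comes from the utils package, not from agrep itself.

# Whole-string distance: one substitution (s -> z).
d_whole <- adist("lasy", "lazy")

# Whole-string distance to a longer string also counts the surrounding characters.
d_long <- adist("lasy", "1 lazy 2")

# Partial (substring) distance: the pattern is compared with the best-matching
# substring of the target, which is the kind of match agrep looks for.
d_partial <- adist("lasy", "1 lazy 2", partial = TRUE)

# Weighted distance: substitutions cost 2, insertions and deletions cost 1.
d_weighted <- adist("lasy", "lazy",
                    costs = c(insertions = 1, deletions = 1, substitutions = 2))

print(c(whole = d_whole, long = d_long, partial = d_partial, weighted = d_weighted))
```

The whole-string and partial distances differ because only the partial form ignores the characters around the best-matching substring.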
agrep(pattern, x, max.distance = 0.1, costs = NULL, ignore.case = FALSE, value = FALSE, fixed = TRUE, useBytes = FALSE)
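pattern: a non-empty character string, or a character string containing a regular expression (for fixed = FALSE), to be matched. Coerced by as.character to a string if possible.
x: character vector where matches are sought. Coerced by as.character to a character vector if possible.
max.distance: maximum distance allowed for a match. Expressed either as an integer, or as a fraction of the pattern length times the maximal transformation cost (replaced by the smallest integer not less than the corresponding fraction), or as a list with possible components:
  cost: maximum number or fraction of match cost (generalized Levenshtein distance).
  all: maximal number or fraction of all transformations (insertions, deletions and substitutions).
  insertions: maximum number or fraction of insertions.
  deletions: maximum number or fraction of deletions.
  substitutions: maximum number or fraction of substitutions.
If cost is not given, all defaults to 10%, and the other transformation number bounds default to all. The component names can be abbreviated.
costs: a numeric vector or list with names partially matching insertions, deletions and substitutions, giving the respective costs for computing the generalized Levenshtein distance, or NULL (default) indicating using unit cost for all three possible transformations. Coerced to integer via as.integer if possible.
ignore.case: if FALSE, the pattern matching is case sensitive; if TRUE, case is ignored during matching.
value: if FALSE, a vector containing the (integer) indices of the matches is returned; if TRUE, a vector containing the matching elements themselves is returned.
fixed: logical. If TRUE (default), the pattern is matched literally (as is). Otherwise, it is matched as a regular expression.
useBytes: logical. In a multibyte locale, should the comparison be character-by-character (the default) or byte-by-byte?
Returns either a vector giving the indices of the elements that yielded a match or, if value is TRUE, the matched elements (after coercion, preserving names but no other attributes).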
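How the default fractional bound plays out, and what value = TRUE returns, can be sketched as follows. This is an illustration under stated assumptions rather than a description of the internals: the named vector x and the variable names are hypothetical, and the rounding step simply restates the rule that a fraction is replaced by the smallest integer not less than it (with unit costs).

```r
# Hypothetical named input, used only for illustration.
x <- c(first = "1 lazy", second = "1", third = "1 LAZY")
pattern <- "laysy"

# With unit costs, the default fraction 0.1 of a 5-character pattern
# rounds up to a bound of 1 edit.
default_bound <- ceiling(0.1 * nchar(pattern))

# The closest substring ("lazy") is 2 edits away from "laysy",
# so the default bound yields no match.
idx_default <- agrep(pattern, x)

# An absolute bound of 2 edits admits the first element.
idx_two <- agrep(pattern, x, max.distance = 2)

# value = TRUE returns the matching elements themselves and keeps their names.
hits <- agrep(pattern, x, max.distance = 2, value = TRUE)

print(default_bound)
print(idx_default)
print(idx_two)
print(hits)
```

Passing an integer bound is often easier to reason about for short patterns, because a 10% fraction rounds up to a single edit for any pattern of up to ten characters.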
As of R 2.10.0 this uses tre by Ville Laurikari (http://laurikari.net/tre/), which supports multibyte character set (MBCS) matching much better than the previous version.
The main effect of useBytes is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes".
See also: grep
agrep("lasy", "1 lazy 2")  # 1: "lazy" is one substitution away
agrep("lasy", c(" 1 lazy 2", "1 lasy 2"), max.distance = list(sub = 0))  # 2: no substitutions allowed
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2)  # 1
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, value = TRUE)  # "1 lazy"
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, ignore.case = TRUE)  # 1 3