Public methods
Method new()
Usage
FuzzExtract$new(decoding = NULL)
Arguments
decoding
either NULL or a character string. If not NULL, the decoding parameter must be one of the standard Python encodings (such as 'utf-8'). See the details and references for more information.
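A minimal sketch of constructing the class, with and without an explicit decoding. It assumes the fuzzywuzzyR package and its Python backend are installed; the `pkg_ready` guard and the object names are illustrative, not part of the class.

```r
# Run only when the package and its Python backend can be loaded.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

if (pkg_ready) {
  # Default construction: decoding is NULL.
  extractor <- fuzzywuzzyR::FuzzExtract$new()

  # Construction with a standard Python encoding name.
  extractor_utf8 <- fuzzywuzzyR::FuzzExtract$new(decoding = "utf-8")
}
```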
Method Extract()
Usage
FuzzExtract$Extract(
string = NULL,
sequence_strings = NULL,
processor = NULL,
scorer = NULL,
limit = 5L
)
Arguments
string
a character string.
sequence_strings
a character vector of choices to match against.
processor
either NULL or a function of the form f(a) -> b, where a is the query or individual choice and b is the choice to be used in matching. See the examples for more details.
scorer
a function for scoring matches between the query and an individual processed choice. It should have the form f(query, choice) -> int. By default, FuzzMatcher$WRATIO() is used, which expects both query and choice to be strings. See the examples for more details.
limit
an integer value for the maximum number of elements to be returned. Defaults to 5L.
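A hedged sketch of a call to Extract(), passing the processor and scorer explicitly. It assumes the package's FuzzUtils class provides a Full_process method usable as the processor; the query and choices are made-up sample data.

```r
# Run only when the package and its Python backend can be loaded.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

# Illustrative sample data.
query   <- "new york jets"
choices <- c("Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys")

if (pkg_ready) {
  # Processor f(a) -> b: assumed to be FuzzUtils' Full_process method.
  processor <- fuzzywuzzyR::FuzzUtils$new()$Full_process
  # Scorer f(query, choice) -> int: the WRATIO method of FuzzMatcher.
  scorer <- fuzzywuzzyR::FuzzMatcher$new()$WRATIO

  extractor <- fuzzywuzzyR::FuzzExtract$new()
  # Ask for at most the two best-scoring choices.
  top_matches <- extractor$Extract(
    string = query,
    sequence_strings = choices,
    processor = processor,
    scorer = scorer,
    limit = 2L
  )
  print(top_matches)
}
```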
Method ExtractBests()
Usage
FuzzExtract$ExtractBests(
string = NULL,
sequence_strings = NULL,
processor = NULL,
scorer = NULL,
score_cutoff = 0L,
limit = 5L
)
Arguments
string
a character string.
sequence_strings
a character vector of choices to match against.
processor
either NULL or a function of the form f(a) -> b, where a is the query or individual choice and b is the choice to be used in matching. See the examples for more details.
scorer
a function for scoring matches between the query and an individual processed choice. It should have the form f(query, choice) -> int. By default, FuzzMatcher$WRATIO() is used, which expects both query and choice to be strings. See the examples for more details.
score_cutoff
an integer value for the score threshold. No matches with a score less than this number will be returned. Defaults to 0L.
limit
an integer value for the maximum number of elements to be returned. Defaults to 5L.
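A sketch of ExtractBests(), which combines a score threshold with a result limit. It assumes FuzzUtils' Full_process method as the processor; the cutoff of 70L and the sample data are arbitrary choices for illustration.

```r
# Run only when the package and its Python backend can be loaded.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

# Illustrative sample data.
query   <- "new york jets"
choices <- c("Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys")

if (pkg_ready) {
  processor <- fuzzywuzzyR::FuzzUtils$new()$Full_process  # assumed processor
  scorer    <- fuzzywuzzyR::FuzzMatcher$new()$WRATIO

  extractor <- fuzzywuzzyR::FuzzExtract$new()
  # Drop matches scoring below 70, then keep at most two of the rest.
  best_matches <- extractor$ExtractBests(
    string = query,
    sequence_strings = choices,
    processor = processor,
    scorer = scorer,
    score_cutoff = 70L,
    limit = 2L
  )
  print(best_matches)
}
```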
Method ExtractWithoutOrder()
Usage
FuzzExtract$ExtractWithoutOrder(
string = NULL,
sequence_strings = NULL,
processor = NULL,
scorer = NULL,
score_cutoff = 0L
)
Arguments
string
a character string.
sequence_strings
a character vector of choices to match against.
processor
either NULL or a function of the form f(a) -> b, where a is the query or individual choice and b is the choice to be used in matching. See the examples for more details.
scorer
a function for scoring matches between the query and an individual processed choice. It should have the form f(query, choice) -> int. By default, FuzzMatcher$WRATIO() is used, which expects both query and choice to be strings. See the examples for more details.
score_cutoff
an integer value for the score threshold. No matches with a score less than this number will be returned. Defaults to 0L.
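To illustrate the f(query, choice) -> int form of the scorer, the sketch below defines a hypothetical scorer, `edit_scorer`, from base R's edit distance and passes it to ExtractWithoutOrder(). It assumes the method accepts a plain R function as the scorer; `edit_scorer` is not part of the package, and FuzzUtils' Full_process is assumed as the processor.

```r
# Hypothetical scorer: 100 for identical strings, lower as the edit
# distance grows relative to the longer string.
edit_scorer <- function(query, choice) {
  longest <- max(nchar(query), nchar(choice))
  if (longest == 0L) return(100L)
  distance <- utils::adist(query, choice)[1, 1]
  as.integer(round(100 * (1 - distance / longest)))
}

# Run the package call only when the package and its Python backend load.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

# Illustrative sample data.
query   <- "new york jets"
choices <- c("Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys")

if (pkg_ready) {
  processor <- fuzzywuzzyR::FuzzUtils$new()$Full_process  # assumed processor

  extractor <- fuzzywuzzyR::FuzzExtract$new()
  # Keep every choice scoring at least 50, in no particular order.
  unordered <- extractor$ExtractWithoutOrder(
    string = query,
    sequence_strings = choices,
    processor = processor,
    scorer = edit_scorer,
    score_cutoff = 50L
  )
  print(unordered)
}
```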
Method ExtractOne()
Usage
FuzzExtract$ExtractOne(
string = NULL,
sequence_strings = NULL,
processor = NULL,
scorer = NULL,
score_cutoff = 0L
)
Arguments
string
a character string.
sequence_strings
a character vector of choices to match against.
processor
either NULL or a function of the form f(a) -> b, where a is the query or individual choice and b is the choice to be used in matching. See the examples for more details.
scorer
a function for scoring matches between the query and an individual processed choice. It should have the form f(query, choice) -> int. By default, FuzzMatcher$WRATIO() is used, which expects both query and choice to be strings. See the examples for more details.
score_cutoff
an integer value for the score threshold. No matches with a score less than this number will be returned. Defaults to 0L.
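To illustrate the f(a) -> b form of the processor, the sketch below defines a hypothetical processor, `normalize`, that trims and lower-cases its input, and passes it to ExtractOne(). It assumes the method accepts a plain R function as the processor; `normalize` is not part of the package.

```r
# Hypothetical processor: applied to the query and to each choice
# before scoring.
normalize <- function(a) tolower(trimws(a))

# Run the package call only when the package and its Python backend load.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

# Illustrative sample data.
query   <- "  New York Jets "
choices <- c("Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys")

if (pkg_ready) {
  scorer <- fuzzywuzzyR::FuzzMatcher$new()$WRATIO

  extractor <- fuzzywuzzyR::FuzzExtract$new()
  # Request the single best match that scores at least 60.
  best_one <- extractor$ExtractOne(
    string = query,
    sequence_strings = choices,
    processor = normalize,
    scorer = scorer,
    score_cutoff = 60L
  )
  print(best_one)
}
```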
Method Dedupe()
Usage
FuzzExtract$Dedupe(contains_dupes = NULL, threshold = 70L, scorer = NULL)
Arguments
contains_dupes
a character vector of strings to deduplicate.
threshold
a numeric value between 0 and 100 giving the score threshold above which strings are expected to be duplicates. Defaults to 70L.
scorer
a function for scoring matches between the query and an individual processed choice. It should have the form f(query, choice) -> int. By default, FuzzMatcher$WRATIO() is used, which expects both query and choice to be strings. See the examples for more details.
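A sketch of Dedupe() on a small vector containing exact and near duplicates. The names in `with_dupes` are made-up sample data, and the scorer is passed explicitly rather than relying on the default.

```r
# Run the package call only when the package and its Python backend load.
pkg_ready <- requireNamespace("fuzzywuzzyR", quietly = TRUE) &&
  isTRUE(tryCatch(fuzzywuzzyR::check_availability(), error = function(e) FALSE))

# Illustrative input with exact and near duplicates.
with_dupes <- c(
  "Frodo Baggins", "Tom Sawyer", "Bilbo Baggin", "Samuel L. Jackson",
  "F. Baggins", "Tom Sawyer", "Frodo Baggins"
)

if (pkg_ready) {
  scorer <- fuzzywuzzyR::FuzzMatcher$new()$WRATIO

  extractor <- fuzzywuzzyR::FuzzExtract$new()
  # Strings scoring above the threshold against each other are
  # expected to be treated as duplicates.
  deduped <- extractor$Dedupe(
    contains_dupes = with_dupes,
    threshold = 70L,
    scorer = scorer
  )
  print(deduped)
}
```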
Method clone()
The objects of this class are cloneable with this method.
Usage
FuzzExtract$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.