revtools (version 0.4.1)

fuzz_functions: Functions for fuzzy string matching

Description

Reimplementations of functions from the Python library 'fuzzywuzzy' (https://github.com/seatgeek/fuzzywuzzy). These functions have been recoded from scratch based on a description of the original functions. For consistency with stringdist, however, they return distances rather than similarities; i.e. low values signify similar strings.
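
To make the distance convention concrete, here is a minimal sketch in base R. The helper `sketch_dist` is hypothetical and is not the revtools implementation. It assumes plain Levenshtein edit distance (via `utils::adist`), scaled by the length of the longer string, so that 0 means identical strings and values near 1 mean very different ones.

```r
# Hypothetical helper for illustration only; NOT part of revtools.
# Assumption: Levenshtein edit distance divided by the longer string's length.
sketch_dist <- function(a, b) {
  # adist() returns a 1 x length(b) matrix when 'a' is a single string
  edits <- as.vector(utils::adist(a, b))
  # scale to a proportion: 0 = identical, 1 = maximally different
  edits / pmax(nchar(a), nchar(b))
}

sketch_scores <- sketch_dist(
  "YANKEES",
  c("YANKEES", "YNAKEES", "something else")
)
sketch_scores  # first element is 0 because the strings are identical
```

A similarity score in the fuzzywuzzy style would run the other way, with high values for close matches; the functions documented here follow the low-is-similar convention instead.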

Usage

fuzzdist(a, b, method)
fuzz_m_ratio(a, b)
fuzz_partial_ratio(a, b)
fuzz_token_sort_ratio(a, b)
fuzz_token_set_ratio(a, b)

Arguments

a

a string

b

a vector containing one or more strings

method

a function to be called by 'fuzzdist'

Value

A vector of scores of the same length as b, giving the proportional dissimilarity between a and each element of b.
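
The shape of the return value can be illustrated with a small base R sketch. The helpers `sketch_sort_tokens` and `sketch_token_sort_dist` are hypothetical and not part of revtools. They assume a token-sort comparison in the spirit of fuzzywuzzy (lower-case, split on whitespace, sort the tokens, then compare), with the distance taken as scaled Levenshtein distance. The actual revtools scoring may differ.

```r
# Hypothetical helpers for illustration only; NOT part of revtools.
# Lower-case each string, split it on whitespace, and sort its tokens.
sketch_sort_tokens <- function(x) {
  vapply(
    strsplit(tolower(x), "\\s+"),
    function(tokens) paste(sort(tokens), collapse = " "),
    character(1)
  )
}

# Assumption: scaled Levenshtein distance on the token-sorted strings.
sketch_token_sort_dist <- function(a, b) {
  a_sorted <- sketch_sort_tokens(a)
  b_sorted <- sketch_sort_tokens(b)
  as.vector(utils::adist(a_sorted, b_sorted)) /
    pmax(nchar(a_sorted), nchar(b_sorted))
}

token_scores <- sketch_token_sort_dist(
  "mets vs braves",
  c("braves vs mets", "mets vs cubs")
)
length(token_scores)  # one score per element of b
```

Because the tokens are sorted before comparison, the first element of `b` differs from `a` only in word order and so scores 0 under this sketch.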

Examples

# NOT RUN {
fuzz_m_ratio("NEW YORK METS", "NEW YORK MEATS")
fuzz_partial_ratio(
  "YANKEES",
  c("NEW YORK YANKEES", "something else", "YNAKEES")
)
fuzz_token_sort_ratio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Melts")
fuzz_token_set_ratio(
  a = "mariners vs angels other words",
  b = c("los angeles angels of anaheim at seattle mariners", "angeles angels of anaheim ")
)
# }