The function metaphone phonetically encodes the given string using the metaphone algorithm.
metaphone(word, maxCodeLen = 10L, clean = TRUE)
word: string or vector of strings to encode
maxCodeLen: maximum length of the resulting encodings, in characters
clean: if TRUE, return NA for unknown alphabetical characters
a character vector containing the metaphones of word, or an NA if the word value is NA
There is some discrepancy in how the metaphone algorithm is implemented. For instance, the Java Apache Commons library provides one version and PHP provides another, and the two do not produce the same results. On the questionable theory that the PHP implementation is probably better known, this code should match its output.
This implementation is based on a JavaScript implementation which is itself based on the PHP internal implementation.
The variable maxCodeLen is the limit on how long the returned metaphone should be.
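As a minimal sketch of how maxCodeLen affects the result, the snippet below encodes the same words with the default limit and with a tighter one. It assumes the phonics package is installed; the word list is purely illustrative, and the exact codes are not shown because they depend on the implementation.

```r
library(phonics)

# Illustrative inputs only; any A-Z words would do.
words <- c("school", "Thompson", "Washington")

# Default limit (maxCodeLen = 10L).
full <- metaphone(words)

# Tighter limit: the returned codes should be capped at about 3 characters.
short <- metaphone(words, maxCodeLen = 3L)

# Compare the code lengths side by side instead of assuming exact codes.
data.frame(word = words,
           full = full, full_len = nchar(full),
           short = short, short_len = nchar(short))
```

A shorter limit makes more words collide on the same code, which may be useful for loose matching but costs precision.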
The metaphone algorithm is only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to metaphone. For inputs outside of its known range, the output is undefined, so NA is returned and a warning is thrown. If clean is FALSE, metaphone attempts to process the strings anyway. The default is TRUE.
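The behaviour of clean described above can be explored with a short sketch like the following. It assumes the phonics package is installed; the inputs are made up for illustration, and no particular codes are claimed, since the result for letters outside A-Z is undefined when clean is FALSE.

```r
library(phonics)

# "M\u00fcller" contains a letter outside A-Z; the other inputs use only
# A-Z letters plus characters that are stripped (hyphen, space, digits).
words <- c("wheel", "Smith-Jones", "route 66", "M\u00fcller")

# clean = TRUE (the default): per the description above, inputs with
# unknown letters should come back as NA, accompanied by a warning.
checked <- metaphone(words, clean = TRUE)

# clean = FALSE: metaphone attempts to encode every input anyway; the
# code produced for the unknown letter is undefined.
unchecked <- metaphone(words, clean = FALSE)

data.frame(word = words, checked = checked, unchecked = unchecked)
```

Leaving clean at its default is the safer choice when the input may contain accented or non-English letters, because an NA is easier to detect downstream than an arbitrary code.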
James P. Howard, II, "Phonetic Spelling Algorithm Implementations for R," Journal of Statistical Software, vol. 95, no. 8, (2020), p. 1--21, doi:10.18637/jss.v095.i08.
Other phonics: caverphone(), cologne(), lein(), mra_encode(), nysiis(), onca(), phonex(), phonics(), rogerroot(), soundex(), statcan()
metaphone("wheel")
metaphone(c("school", "benji"))