phonics (version 1.3.2)

metaphone: Generate phonetic versions of strings with Metaphone


The function metaphone phonetically encodes the given string using the metaphone algorithm.


metaphone(word, maxCodeLen = 10L, clean = TRUE)



word: string or vector of strings to encode


maxCodeLen: maximum length of the resulting encodings, in characters


clean: if TRUE, return NA for unknown alphabetical characters


A character vector containing the metaphones of word, or NA where the word value is NA.


There is some discrepancy in how the metaphone algorithm is implemented. For instance, the Java Apache Commons library provides one version and PHP provides another, and the two do not produce the same results. On the questionable theory that the PHP implementation is probably better known, this code should match its output.

This implementation is based on a JavaScript implementation, which is itself based on the PHP internal implementation.

The argument maxCodeLen limits the length, in characters, of the returned metaphone.
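
As a small illustration of the length limit, the sketch below encodes the same words with the default limit and with a shorter one. It assumes the phonics package is installed; the example words and the variable names are arbitrary, and no particular encodings are claimed here.

```r
library(phonics)

words <- c("school", "benji", "international")

# Default limit: encodings of up to 10 characters
full <- metaphone(words)

# Tighter limit: per the documented meaning of maxCodeLen,
# each encoding should be at most 4 characters long
short <- metaphone(words, maxCodeLen = 4L)

# Put the two sets of encodings side by side, with their lengths
data.frame(word = words,
           full = full,
           short = short,
           nchar_full = nchar(full),
           nchar_short = nchar(short))
```

A shorter limit makes more words collide on the same code, so the choice trades precision for recall when matching names.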

The metaphone algorithm is only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to metaphone. For inputs outside of its known range, the output is undefined, so NA is returned and a warning is thrown. If clean is FALSE, metaphone attempts to process the strings anyway. The default is TRUE.
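
The input handling described above can be approximated in base R. The helper below is a hypothetical sketch, not the package's internal code: the name prepare_input and its return structure are assumptions made for illustration. It strips characters that are not letters in the current locale, then flags inputs that still contain letters outside A-Z, which are the cases where metaphone with clean = TRUE would return NA.

```r
# Hypothetical helper (not part of phonics): mimics the documented input handling
prepare_input <- function(word) {
  word <- toupper(as.character(word))
  # Remove anything that is not a letter in the current locale
  # (spaces, hyphens, digits, punctuation)
  stripped <- gsub("[^[:alpha:]]", "", word)
  # Flag inputs that still contain letters outside the A-Z range
  unknown <- grepl("[^A-Z]", stripped, perl = TRUE)
  list(stripped = stripped, unknown = unknown)
}

# The first input reduces to plain A-Z letters; how the second is
# treated depends on the locale, since it contains a non-English letter
res <- prepare_input(c("Mary-Ann 2", "M\u00fcller"))
res
```

Inputs flagged as unknown are the ones to inspect or transliterate before encoding, rather than passing clean = FALSE and relying on undefined output.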

See Also

Other phonics: caverphone, cologne, lein, mra_encode, nysiis, onca, phonex, phonics, rogerroot, soundex, statcan


Examples
metaphone(c("school", "benji"))

