Daitch-Mokotoff Soundex

From Wikipedia, the free encyclopedia

Daitch-Mokotoff Soundex (D-M Soundex) is a phonetic algorithm invented in 1985 by genealogist Gary Mokotoff and later improved by Randy Daitch, both of the Jewish Genealogical Society. It is a refinement of the Russell and American Soundex algorithms, designed to match Slavic and Yiddish surnames that are pronounced alike but spelled differently.

Daitch-Mokotoff Soundex is sometimes referred to as "Jewish Soundex" or "Eastern European Soundex", although the authors discourage the use of these nicknames.

Improvements

Improvements over the older Soundex algorithms include:

  • Coded names are six digits long, resulting in greater search precision (traditional Soundex codes are four characters long: one letter followed by three digits)
  • Coded names can be stored as numeric values, which can save space in some applications (traditional Soundex codes are alphanumeric text)
  • Several rules in the algorithm encode multi-character n-grams as single digits (American and Russell Soundex do not handle multi-character n-grams)
  • Multiple possible encodings can be returned for a single name (traditional Soundex returns only one encoding, even if the spelling of a name allows more than one pronunciation)
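
The numeric-storage and multiple-encoding points can be illustrated with a minimal Python sketch. It does not implement the D-M encoding rules; the codes are copied from the examples table in this article, and the names `DM_CODES`, `to_numeric`, `to_text`, and `names_match` are hypothetical helpers introduced only for illustration. Two names are treated as a match when they share at least one code.

```python
# Hypothetical lookup table: D-M codes copied from this article's examples.
# A real application would compute these with a D-M Soundex implementation.
DM_CODES = {
    "Peters": ["739400", "734000"],
    "Peterson": ["739460", "734600"],
    "Moskowitz": ["645740"],
    "Moskovitz": ["645740"],
    "Auerbach": ["097500", "097400"],
    "Uhrbach": ["097500", "097400"],
}


def to_numeric(code: str) -> int:
    """Store a six-digit code as an integer (a leading zero is dropped)."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    return int(code)


def to_text(value: int) -> str:
    """Restore the six-digit text form, padding with leading zeros."""
    return f"{value:06d}"


def names_match(a: str, b: str) -> bool:
    """Two names match if any of their encodings coincide."""
    codes_a = {to_numeric(c) for c in DM_CODES[a]}
    codes_b = {to_numeric(c) for c in DM_CODES[b]}
    return not codes_a.isdisjoint(codes_b)


if __name__ == "__main__":
    print(names_match("Auerbach", "Uhrbach"))
    print(to_text(to_numeric("097500")))
```

Because a code such as 097500 begins with zero, the integer form must be padded back to six digits before it is displayed or compared as text.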

Examples

The following surnames are shown with their American Soundex and D-M Soundex codes:

Surname          American Soundex  D-M Soundex
Peters           P362              739400, 734000
Peterson         P362              739460, 734600
Moskowitz        M232              645740
Moskovitz        M213              645740
Auerbach         A612              097500, 097400
Uhrbach          U612              097500, 097400
Jackson          J250              154600, 454600, 145460, 445460
Jackson-Jackson  J252              154664, 454664, 145466, 445466, 154646, 454646, 145464, 445464
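
For comparison, the American Soundex codes above can be reproduced with a short Python sketch. It follows the commonly described American Soundex rules (keep the first letter, map consonants to digits, collapse adjacent equal codes, let H and W pass through without separating equal codes, pad or truncate to four characters). The function name `american_soundex` is an assumption made for this example, and the sketch does not attempt the D-M rules.

```python
# Letter groups and their digits in American Soundex.
_GROUPS = (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
)
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def american_soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    # Drop hyphens, spaces, and other non-letters.
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # A vowel sets prev to "", so a repeated code after it is kept.
        prev = code
    # Pad with zeros or truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for surname in ("Peters", "Moskowitz", "Moskovitz", "Jackson-Jackson"):
        print(surname, american_soundex(surname))
```

Moskowitz and Moskovitz receive different American Soundex codes (M232 and M213) but the same D-M code in the table, which is the kind of spelling variation D-M Soundex was designed to bridge.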
