Talk:Daitch-Mokotoff Soundex


DM Codes

D-M can return up to 32 possible distinct codes (not just two!)... Your samples are a little off as well:

Auerbach ==> 097500,097400

Peters ==> 739400,734000

Peterson ==> 739460,734600

Uhrbach ==> 097500,097400

And check out some of these:

Jackson ==> 154600,454600,145460,445460

(compound name) Jackson-Johnson ==> 154664,454664,145466,445466,154646,454646,145464,445464

Every time you encounter certain letter combinations, your results effectively double. CK is one example (the rule for CK is "Try K (5) and TSK (45)", doubling your results). Since there is a maximum of 6 digits per result, you can theoretically have up to 2 ^ (6 - 1) = 32 possible results. If you want to check your results, use the calculator at http://www.jewishgen.org/jos/jossound.htm. According to Mr. Mokotoff, the creator of D-M Soundex, this calculator is the "official" implementation of the algorithm. There is also a SQL Server implementation based on the ruleset at http://www.avotaynu.com/soundex.html; it can be found at http://www.sqlservercentral.com/columnists/mcoles/sql2000dbatoolkitpart3.asp.
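To illustrate the doubling, here is a minimal Python sketch of only the branching step. It is not a D-M Soundex implementation: the function name `expand_codes` and the hand-tokenized input are hypothetical, and the per-letter codes for "Jackson" are inferred from the sample results above (J as 1 or 4, CK as 5 or 45, S as 4, N as 6, non-initial vowels skipped). The real rule table, and details such as collapsing adjacent identical codes, are left out.

```python
from itertools import product


def expand_codes(token_alternatives, length=6):
    """Combine per-token code alternatives into fixed-length codes.

    token_alternatives is a list of tuples; each tuple holds the
    alternative digit strings for one letter group. This covers only
    the branching step, not the D-M rule table itself.
    """
    results = []
    for combo in product(*token_alternatives):
        # Join the chosen alternatives, cut to `length` digits, pad with zeros.
        code = "".join(combo)[:length].ljust(length, "0")
        if code not in results:  # keep each distinct code once
            results.append(code)
    return results


# Hypothetical hand-tokenized "Jackson" (codes assumed from the samples above):
# J -> 1 or 4, CK -> 5 or 45, S -> 4, N -> 6.
jackson = [("1", "4"), ("5", "45"), ("4",), ("6",)]
print(expand_codes(jackson))

# Five two-way branch points give 2 ** 5 distinct codes.
print(len(expand_codes([("1", "2")] * 5)))
```

Under those assumptions the two branch points in "Jackson" (J and CK) yield the same four codes listed above, though not necessarily in the same order.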

Enjoy.

Fixed.

Inventors

BTW, D-M Soundex is often referred to as "Jewish Soundex" or "Eastern European Soundex", although the authors discourage use of those nicknames. The "official name", according to the creators, is the "Daitch-Mokotoff Soundex Algorithm" (or D-M Soundex). It is the official searching algorithm for the Holocaust Museum and for the Ellis Island Database Project. It was invented by Gary Mokotoff and improved by Randy Daitch one year later; the article implies that both genealogists co-invented it together back in 1985.

Fixed