IDN homograph attack

From Wikipedia, the free encyclopedia

The internationalized domain name (IDN) homograph attack is a means by which a malicious party may seek to deceive computer users about what remote system they are communicating with, by exploiting the fact that many different characters may have nearly (or wholly) indistinguishable glyphs.

Homographs

In multilingual computer systems, different logical characters may have identical or very similar appearances. For example, Unicode character U+0430, Cyrillic small letter a ("а"), can look identical to Unicode character U+0061, Latin small letter a ("a"), which is the lowercase "a" used in English. Technically, characters that look alike in this way are known as homoglyphs (a subgroup of homographs). Spoofing attacks based on these similarities are known as homograph spoofing attacks.

The problem arises from the different treatment of the characters in the user's mind and in the computer's programming. From the viewpoint of the user, a Cyrillic "а" within a Latin string is a Latin "a"; there is literally no difference in the glyphs for these characters in most fonts. However, the computer treats them as different characters when processing the string as an identifier. Thus, the user's assumption of a one-to-one correspondence between the visual appearance of a name and the named entity breaks down.
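The mismatch between what the user sees and what the computer compares can be shown in a few lines. The following is a minimal sketch using Python's standard `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

latin_a = "\u0061"     # Latin small letter a
cyrillic_a = "\u0430"  # Cyrillic small letter a, visually identical in most fonts

# Print the code point and official Unicode name of each character.
for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

genuine = "paypal.com"
spoofed = "p\u0430ypal.com"  # second character is Cyrillic, not Latin

# The two strings render alike but are different identifiers.
print(genuine == spoofed)            # False
print(len(genuine) == len(spoofed))  # True: same length, different code points
```

To software, the two names are simply unequal strings, even though a reader cannot tell them apart on screen.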

In a typical example of a hypothetical attack, someone could register a domain name that appears identical to an existing domain but goes somewhere else. For example, the spoofed domain "pаypal.com" contains a Cyrillic a, not a Latin a. In many ways, this is not a new problem. Even within the old character set of A-Z, 0-9 and hyphen, G00GLE.COM looks much like GOOGLE.COM in some fonts; or, using a mix of uppercase and lowercase characters, googIe.com (capital I, not small ell) looks much like google.com in some fonts. PayPal itself was a target of a phishing scam exploiting this, using the domain PayPaI.com. Or, displaying characters in lowercase alone, rnozilla.org ("RNOZILLA.ORG") looks very much like mozilla.org in many fonts. What is new is that the internationalized domain name system expanded the character repertoire from a few dozen characters in a single alphabet to many thousands of characters in many scripts, which greatly increased the scope for homograph attacks.

Homographs in internationalized domain names

The limitation of domain names to ASCII characters may not last forever, and is coming under pressure from organizations based in regions that do not use Latin characters. Internationalized domain names provide a backward-compatible way for domain names to use the full Unicode character set, and this standard is already widely supported.
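The backward compatibility comes from encoding each non-ASCII label as an ASCII string (Punycode with an "xn--" prefix). As a rough illustration, Python's built-in `idna` codec performs this conversion; note that it implements the older IDNA 2003 rules, and the helper names `to_ascii` and `to_unicode` are chosen here for the example.

```python
def to_ascii(domain: str) -> str:
    """Encode a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Decode an ASCII (Punycode) domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


# Cyrillic domain name from the article, written with escapes so that
# the code points are unambiguous.
cyrillic_domain = "\u0433\u0430\u0437\u0435\u0442\u0430.\u0440\u0443"

encoded = to_ascii(cyrillic_domain)
print(encoded)              # every label now starts with "xn--"
print(to_unicode(encoded))  # round-trips to the original name

# Pure ASCII names pass through unchanged.
print(to_ascii("gazeta.ru"))
```

Because the encoded form uses only letters, digits and hyphens, existing DNS infrastructure can carry it without modification.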

For example, the Russian newspaper website gazeta.ru may wish to use the URL газета.ру, reflecting the newspaper's name spelled in Cyrillic. The disadvantage in this example is that the Cyrillic letters 'а', 'е', 'р', 'у' are indistinguishable in writing from their Latin counterparts. Some of the letters (such as a) are close etymologically, while others look similar by coincidence. For instance, the Cyrillic letter 'р' represents a phoneme similar to the English 'r', but the glyph is identical to the Latin letter 'p'.

This opens a rich vein of opportunities for phishing and other varieties of fraud. An attacker could register a domain name that looks just like that of a legitimate website, but in which some of the letters have been replaced by homographs in another alphabet. The attacker could then send e-mail messages purporting to come from the original site, but directing people to the bogus site. The spoof site could then record information such as passwords or account details, while passing traffic through to the real site. The victims may never notice the difference, until suspicious or criminal activity occurs with their accounts.

Defending against the attack

The simplest defense is for web browsers not to support IDNA or other similar mechanisms, or for users to turn off whatever support their browsers have. That could mean blocking access to IDNA sites, but generally browsers permit access and just display IDNs in Punycode. Either way, this amounts to abandoning non-ASCII domain names.

Firefox and Opera display Punycode for IDNs unless the top-level domain (TLD, for example, .ac or .museum) prevents homograph attacks by restricting which characters can be used in domain names.[1] Both browsers also allow users to manually add TLDs to the allowed list.[2][3]
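The TLD-based policy can be sketched as follows. This is not the browsers' actual code: the function name `display_name` and the contents of `TRUSTED_TLDS` are assumptions for illustration (the two TLDs are simply the examples named above), and Python's `idna` codec stands in for the browser's own IDNA implementation.

```python
# Hypothetical allowed list; real browsers ship and maintain their own.
TRUSTED_TLDS = {"ac", "museum"}


def display_name(domain: str, trusted=TRUSTED_TLDS) -> str:
    """Return the form of the domain to show in the address bar.

    Names under a trusted TLD are shown as Unicode; all others are
    shown in their ASCII (Punycode) form.
    """
    ascii_form = domain.encode("idna").decode("ascii")
    tld = ascii_form.rsplit(".", 1)[-1].lower()
    if tld in trusted:
        return domain
    return ascii_form


spoofed = "p\u0430ypal.com"  # Cyrillic a in place of Latin a
print(display_name(spoofed))        # shown as Punycode, so the spoof is visible
print(display_name("example.com"))  # plain ASCII is unaffected
```

Adding a TLD to the set corresponds to the manual allowed-list setting described above.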

Internet Explorer 7 allows IDNs except for labels that mix scripts for different languages. Labels that mix scripts are displayed in Punycode. Exceptions are made for locales where ASCII characters are commonly mixed with localized scripts.[4]
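A mixed-script check of this kind can be approximated as below. Python's standard library does not expose the Unicode Script property, so this sketch assumes the first word of each character's Unicode name ("LATIN", "CYRILLIC", and so on) is a usable stand-in; that is a crude heuristic, not how Internet Explorer implements the rule, and the function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return the approximate set of scripts used by the letters in a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens are shared by all scripts
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    return len(scripts_in_label(label)) > 1


def display_form(domain: str) -> str:
    """Show mixed-script labels as Punycode and leave other labels as they are."""
    shown = []
    for label in domain.split("."):
        if is_mixed_script(label):
            label = label.encode("idna").decode("ascii")
        shown.append(label)
    return ".".join(shown)


print(is_mixed_script("p\u0430ypal"))   # True: Latin letters plus one Cyrillic letter
print(is_mixed_script("paypal"))        # False
print(display_form("p\u0430ypal.com"))  # first label is shown as Punycode
```

A label written entirely in one non-Latin script is not flagged by this check, and the name-based heuristic would also flag legitimate combinations in languages that routinely mix scripts, which is why locale exceptions are needed.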

As an additional defense, Internet Explorer 7, Firefox 2.0 and Opera 9.10 include phishing filters to alert users when they visit malicious websites.[5][6][7]

Another possible defense would be for web browsers to display non-ASCII characters in URLs distinctively, perhaps by changing their color or that of their background. This would not protect against spoofing that replaces one non-ASCII character with another similar-looking one. (A solution to this problem would be to use a different color for each character group, but no software implements it that way.) Highlighting of non-ASCII characters was adopted, as of July 9, 2005, by the Quero Toolbar plug-in for Internet Explorer. Besides IDN highlighting, Quero has implemented several other techniques to mitigate IDN spoofing attacks, such as mixed-script and missing-glyph detection, IDN/digit indication and "core domain" highlighting.
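The highlighting idea can be illustrated in plain text by bracketing runs of non-ASCII characters instead of coloring them. This is only a sketch of the concept; the function `highlight_non_ascii` and its bracket markers are invented for the example and do not reflect how Quero or any browser renders URLs.

```python
def highlight_non_ascii(domain: str, open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap each run of non-ASCII characters in visible markers."""
    out = []
    in_run = False
    for ch in domain:
        non_ascii = ord(ch) > 127
        if non_ascii and not in_run:
            out.append(open_mark)   # a non-ASCII run starts here
        if not non_ascii and in_run:
            out.append(close_mark)  # the run ended before this character
        out.append(ch)
        in_run = non_ascii
    if in_run:
        out.append(close_mark)
    return "".join(out)


# The Cyrillic letter stands out once it is marked.
print(highlight_non_ascii("p\u0430ypal.com"))  # p[а]ypal.com
print(highlight_non_ascii("paypal.com"))       # paypal.com
```

As noted above, marking non-ASCII characters as a single group cannot reveal a swap between two look-alike non-ASCII characters, since both fall inside the same marked run.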

There is not yet (as of March 2005) a clear consensus as to the best way to balance the needs of the international community with protection against domain-name spoofing.

References

  1. ^ Advisory: Internationalized domain names (IDN) can be used for spoofing. Opera (2005-02-25). Retrieved on February 24, 2007.
  2. ^ IDN-enabled TLDs. Mozilla (2006-08-07). Retrieved on November 30, 2006.
  3. ^ Opera's Settings File Explained: IDNA White List. Opera Software (2006-12-18). Retrieved on February 24, 2007.
  4. ^ Sharif, Tariq (2006-07-31). Changes to IDN in IE7 to now allow mixing of scripts. IEBlog. Microsoft. Retrieved on November 30, 2006.
  5. ^ Sharif, Tariq (2005-09-09). Phishing Filter in IE7. IEBlog. Microsoft. Retrieved on November 30, 2006.
  6. ^ Firefox 2 Phishing Protection. Mozilla (2006). Retrieved on November 30, 2006.
  7. ^ Opera Fraud Protection. Opera Software (2006-12-18). Retrieved on February 24, 2007.
