Legacy encoding

From Wikipedia, the free encyclopedia

In computing, a legacy encoding is a character encoding that cannot represent all of Unicode but is still used for compatibility or other reasons.

Many legacy encodings predate Unicode. Others are slight modifications of older encodings, made either to support important new characters such as the euro sign (€) or to satisfy countries that felt there were significant omissions for their language. The best-known encoding of this kind is probably ISO-8859-15.
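The euro sign case can be illustrated with a short Python sketch. It relies only on the codecs shipped with the Python standard library; the helper name can_encode is hypothetical and introduced purely for illustration. It shows that ISO-8859-1 cannot represent the euro sign, while the modified ISO-8859-15 can.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding.

    Hypothetical helper for illustration; it simply attempts the conversion.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


euro = "\u20ac"  # the euro sign

# The older encoding has no code point for the euro sign.
print(can_encode(euro, "iso-8859-1"))   # False

# The modified encoding assigns it to a single byte.
print(can_encode(euro, "iso-8859-15"))  # True
euro_byte = euro.encode("iso-8859-15")
print(euro_byte)                        # b'\xa4'

# The same byte read as ISO-8859-1 is a different character
# (the generic currency sign, U+00A4).
misread = euro_byte.decode("iso-8859-1")
print(misread == "\u00a4")              # True
```

The last step shows why the two encodings are not interchangeable: data written as ISO-8859-15 and read as ISO-8859-1 silently yields a different character rather than an error.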

Legacy encodings are numerous, and include the following major groups:

  • The ISO-8859-n group of single-byte encodings.
  • The IBM/DOS/Windows OEM series of single-byte code pages (437, 850 and others).
  • The single-byte Windows "ANSI" code pages (125x).
  • The Windows multibyte code pages, used by Windows as both ANSI and OEM code pages for CJK languages.
  • Various other multibyte CJK encodings, such as ISO-2022 and EUC.
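As a rough illustration of how these groups differ in practice, the following Python sketch decodes one byte under several single-byte encodings and encodes one Japanese character under several multibyte encodings. The codec names are those used by the Python standard library (for example, cp932 for the Windows Japanese code page); the choice of sample byte and character is arbitrary and only for demonstration.

```python
# One byte, interpreted under encodings from the single-byte groups.
raw = b"\xe9"
single_byte_codecs = ("iso-8859-1", "cp437", "cp850", "cp1252")
decoded = {codec: raw.decode(codec) for codec in single_byte_codecs}
for codec, char in decoded.items():
    # Print the code point so the result is unambiguous on any terminal.
    print(f"{codec:12} U+{ord(char):04X}")

# One character (HIRAGANA LETTER A), encoded under multibyte CJK encodings.
hiragana_a = "\u3042"
multibyte_codecs = ("cp932", "shift_jis", "euc_jp", "iso2022_jp")
forms = {codec: hiragana_a.encode(codec) for codec in multibyte_codecs}
for codec, data in forms.items():
    print(f"{codec:12} {data!r} ({len(data)} bytes)")
```

The byte 0xE9 decodes to "é" in ISO-8859-1 and Windows-1252 but to a Greek capital theta in code page 437, so the encoding must be known before the bytes can be interpreted. In the multibyte group, ISO-2022-JP differs from the others in using escape sequences to switch character sets, which makes its output longer for a single character.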