JIS encoding

From Wikipedia, the free encyclopedia

In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language. Strictly speaking, the term means either:

  • A set of standard character sets for Japanese, notably:
    • JIS X 0201, the Japanese version of ISO 646 (ASCII), containing the base 7-bit ASCII characters (with some modifications) and 63 half-width katakana characters.
    • JIS X 0208, the most common kanji character set, containing 6,879 characters, of which 6,355 are kanji.
    • JIS X 0212, a supplementary character set containing 6,067 characters.
    • JIS X 0213, which extends JIS X 0208.
  • JIS X 0202 (the Japanese counterpart of ISO/IEC 2022), a set of encoding mechanisms for sending JIS data over transmission media that support only 7-bit data. Its best-known profile is ISO-2022-JP.

In practice, "JIS encoding" usually refers to JIS X 0208 data encoded with JIS X 0202.
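
As a rough illustration of what this looks like on the wire, the following Python sketch encodes a short string with the standard library's `iso2022_jp` codec, which implements the ISO-2022-JP profile of these mechanisms. The sample string and the variable names are chosen purely for illustration.

```python
# Sketch: JIS X 0208 text carried over a 7-bit channel via ISO-2022-JP.
text = "日本語"  # sample string, chosen for illustration

encoded = text.encode("iso2022_jp")

# Escape sequences announce which character set the following bytes belong to.
ESC_TO_JIS_X_0208 = b"\x1b$B"  # ESC $ B
ESC_TO_ASCII = b"\x1b(B"       # ESC ( B

starts_in_kanji_mode = encoded.startswith(ESC_TO_JIS_X_0208)
returns_to_ascii = encoded.endswith(ESC_TO_ASCII)

# Every byte fits in 7 bits, so the data survives a 7-bit-only medium.
is_seven_bit = all(byte < 0x80 for byte in encoded)

print(encoded)
print(starts_in_kanji_mode, returns_to_ascii, is_seven_bit)
```

Each kanji occupies two 7-bit bytes between the escape sequences, so the three-character sample takes twelve bytes in total once the six bytes of escapes are counted.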

There is also the Shift JIS encoding, which adds the kanji, full-width hiragana, and full-width katakana from JIS X 0208 to JIS X 0201 in a backward-compatible way. Shift JIS is perhaps the most widely used encoding in Japan. Its compatibility with the single-byte JIS X 0201 character set allowed electronic equipment manufacturers (such as cash register manufacturers) to offer an upgrade from older, cheaper equipment that could not display kanji to newer equipment while retaining character-set compatibility.
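
The compatibility described above can be observed with Python's built-in `shift_jis` codec. This is only a small sketch; the sample characters and variable names are illustrative assumptions.

```python
# Sketch: Shift JIS keeps the single-byte JIS X 0201 codes and adds
# two-byte codes for characters taken from JIS X 0208.
ascii_a = "A".encode("shift_jis")      # ASCII letter
halfwidth_a = "ｱ".encode("shift_jis")  # half-width katakana A (JIS X 0201)
fullwidth_a = "ア".encode("shift_jis")  # full-width katakana A (JIS X 0208)
kanji = "漢".encode("shift_jis")        # a kanji (JIS X 0208)

for label, data in [
    ("ASCII", ascii_a),
    ("half-width katakana", halfwidth_a),
    ("full-width katakana", fullwidth_a),
    ("kanji", kanji),
]:
    print(f"{label}: {data.hex()} ({len(data)} byte(s))")

# Data produced by a JIS X 0201-only device still decodes unchanged.
legacy_bytes = b"A\xb1"
decoded_legacy = legacy_bytes.decode("shift_jis")
```

The single-byte characters keep their one-byte codes, while the JIS X 0208 characters take two bytes each, which is why older single-byte data remains readable.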

The main alternatives to JIS encoding are EUC (Extended Unix Code, used on UNIX systems, where the JIS encodings are incompatible with POSIX standards) and, more recently, Unicode, particularly in the form of UTF-8.
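
To make the relationship between these alternatives more concrete, the sketch below compares the same sample string under Python's `iso2022_jp`, `euc_jp`, and `utf-8` codecs. It relies on the fact that, for JIS X 0208 characters, EUC-JP uses the same two-byte codes with the high bit of each byte set. The sample string and variable names are assumptions made for illustration.

```python
# Sketch: compare ISO-2022-JP, EUC-JP, and UTF-8 for the same text.
text = "日本語"  # sample string, chosen for illustration

jis = text.encode("iso2022_jp")
euc = text.encode("euc_jp")
utf8 = text.encode("utf-8")

# Drop the leading ESC $ B and trailing ESC ( B (three bytes each)
# to obtain the raw JIS X 0208 codes.
jis_payload = jis[3:-3]

# Setting the high bit of every byte yields the EUC-JP form.
euc_from_jis = bytes(byte | 0x80 for byte in jis_payload)

print(euc_from_jis == euc)
print(len(jis), len(euc), len(utf8))
```

EUC-JP needs no escape sequences because the high bit alone distinguishes Japanese characters from ASCII, at the cost of requiring an 8-bit-clean channel.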

See also