JIS encoding

From Wikipedia, the free encyclopedia

In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language. Strictly speaking, the term means either:

A set of standard character sets for Japanese, notably:
- JIS X 0201, the Japanese version of ISO 646 (ASCII), containing the base 7-bit ASCII characters (with some modifications) and 63 half-width katakana characters
- JIS X 0208, the most common kanji character set, containing 6,879 characters, of which 6,355 are kanji
- JIS X 0212, a character set containing 6,067 characters
- JIS X 0213, which extends JIS X 0208
JIS X 0202 (the Japanese counterpart of ISO/IEC 2022, of which ISO-2022-JP is a widely used profile), a set of encoding mechanisms for sending JIS data over transmission media that support only 7-bit data.

In practice, "JIS encoding" usually refers to JIS X 0208 data encoded with JIS X 0202.
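As a rough illustration of this 7-bit scheme, the sketch below uses Python's built-in `iso2022_jp` codec, which implements the ISO-2022-JP profile rather than the full JIS X 0202 standard. The sample string and variable names are assumptions chosen for the example.

```python
# Illustrative only: Python's built-in "iso2022_jp" codec stands in for
# JIS X 0208 data carried with ISO 2022-style escape sequences.
text = "JIS 日本語"  # hypothetical sample string

encoded = text.encode("iso2022_jp")

# Every byte fits in 7 bits, so the data can pass through a 7-bit channel.
seven_bit_clean = all(byte < 0x80 for byte in encoded)

# ESC $ B switches to JIS X 0208; ESC ( B switches back to ASCII.
has_kanji_shift = b"\x1b$B" in encoded
returns_to_ascii = encoded.endswith(b"\x1b(B")

print(encoded)
print(seven_bit_clean, has_kanji_shift, returns_to_ascii)
```

The escape sequences make the stream stateful: a decoder has to track which character set is currently selected in order to interpret the bytes that follow.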

There is also the Shift JIS encoding, which adds the kanji, full-width hiragana and full-width katakana from JIS X 0208 to JIS X 0201 in a compatible way. Shift JIS is perhaps the most widely used encoding in Japan, because its compatibility with the single-byte JIS X 0201 character set let electronic equipment manufacturers (such as cash register manufacturers) offer an upgrade path from older, cheaper equipment that could not display kanji to newer equipment while retaining character-set compatibility.
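The compatibility with JIS X 0201 can be seen in the byte lengths Shift JIS produces. The following is a minimal sketch using Python's built-in `shift_jis` codec; the sample characters and the `samples` and `byte_lengths` names are assumptions made for illustration.

```python
# Illustrative only: compare how many bytes Shift JIS uses per character.
samples = {
    "ASCII letter": "A",
    "half-width katakana": "\uff71",  # U+FF71, HALFWIDTH KATAKANA LETTER A
    "hiragana": "あ",
    "kanji": "漢",
}

# Characters inherited from JIS X 0201 stay single-byte;
# characters taken from JIS X 0208 occupy two bytes.
byte_lengths = {
    name: len(char.encode("shift_jis")) for name, char in samples.items()
}

for name, length in byte_lengths.items():
    print(f"{name}: {length} byte(s)")
```

Because the single-byte range is left untouched, text that uses only JIS X 0201 characters has the same bytes in both encodings.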

The main alternatives to JIS encoding are EUC (used on UNIX systems, where the JIS encodings are incompatible with POSIX standards) and, more recently, Unicode, particularly in the form of UTF-8.
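To make the comparison concrete, the sketch below encodes the same text with Python's built-in `euc_jp` and `utf-8` codecs. It also checks an assumption that holds for characters from JIS X 0208: the EUC-JP bytes are the 7-bit JIS bytes with the high bit set. The sample string and variable names are chosen for the example.

```python
# Illustrative only: the same JIS X 0208 text in EUC-JP and UTF-8.
text = "日本語"  # hypothetical sample string

# ISO-2022-JP wraps the payload in two 3-byte escape sequences
# (ESC $ B ... ESC ( B); strip them to get the raw 7-bit JIS bytes.
jis_payload = text.encode("iso2022_jp")[3:-3]

euc = text.encode("euc_jp")
utf8 = text.encode("utf-8")

# For JIS X 0208 characters, EUC-JP sets the high bit of each JIS byte,
# so no escape sequences are needed and ASCII bytes never appear inside
# a multi-byte character.
euc_from_jis = bytes(byte | 0x80 for byte in jis_payload)

print(len(euc), len(utf8))
print(euc_from_jis == euc)
```

For this kanji-only sample, EUC-JP uses two bytes per character while UTF-8 uses three.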


This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.
