ISO/IEC 8859-6
ISO 8859-6, also known as Arabic, is an 8-bit character encoding and part of the ISO 8859 standard. It was originally designed to cover languages written in the Arabic alphabet, but it lacks many of the characters those languages need and was therefore never very popular. It has largely given way to Unicode. Displaying text in this character set requires Arabic joining (contextual shaping) processing.
ISO_8859-6:1987, better known by its preferred MIME name ISO-8859-6, is the IANA charset that combines this standard with the control codes from ISO/IEC 6429 for the C0 (0x00–0x1F) and C1 (0x80–0x9F) ranges. Text in this charset is in logical order (never in visual, left-to-right order, despite an RFC to the contrary), so bidi processing is required for display. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. The charset has the aliases iso-ir-127, ISO_8859-6, ECMA-114, ASMO-708, Arabic and csISOLatinArabic. There are also variants whose names end in -e or -i, which specify explicit or implicit directionality respectively.
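As an illustration of the naming and ordering described above, the following sketch asks Python's standard `codecs` module to resolve each charset name and then inspects the bidirectional class of two decoded letters. It assumes that the codec bundled with Python corresponds to the IANA charset; the variable names are chosen for this example only.

```python
import codecs
import unicodedata

# Charset names and aliases from the paragraph above (lookup is case-insensitive).
NAMES = [
    "ISO_8859-6:1987", "ISO-8859-6", "iso-ir-127", "ISO_8859-6",
    "ECMA-114", "ASMO-708", "Arabic", "csISOLatinArabic",
]

# Map each alias to the canonical codec name that Python resolves it to.
resolved = {name: codecs.lookup(name).name for name in NAMES}

# Bytes 0xC7 0xE4 are ALEF then LAM, stored in logical (reading) order.
text = bytes([0xC7, 0xE4]).decode("iso-8859-6")

# Bidirectional class "AL" marks right-to-left Arabic letters, which is why
# a bidi algorithm has to run before the text is displayed.
bidi_classes = [unicodedata.bidirectional(ch) for ch in text]

if __name__ == "__main__":
    for name, canonical in resolved.items():
        print(f"{name} -> {canonical}")
    print([unicodedata.name(ch) for ch in text], bidi_classes)
```

Decoding leaves the characters in the order in which the bytes were stored; reordering for display is left to the rendering layer.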
Codepage layout
ISO/IEC 8859-6 | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | unused | | | | | | | | | | | | | | | |
1x | unused | | | | | | | | | | | | | | | |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
8x | unused | | | | | | | | | | | | | | | |
9x | unused | | | | | | | | | | | | | | | |
Ax | NBSP | | | | ¤ | | | | | | | | ، | SHY | | |
Bx | | | | | | | | | | | | ؛ | | | | ؟ |
Cx | | ء | آ | أ | ؤ | إ | ئ | ا | ب | ة | ت | ث | ج | ح | خ | د |
Dx | ذ | ر | ز | س | ش | ص | ض | ط | ظ | ع | غ | | | | | |
Ex | ـ | ف | ق | ك | ل | م | ن | ه | و | ى | ي | ً | ٌ | ٍ | َ | ُ |
Fx | ِ | ّ | ْ | | | | | | | | | | | | | |
In the table above, 0x20 is the regular SPACE character and 0xA0 is the NO-BREAK SPACE. 0xAD is the SOFT HYPHEN, which is normally invisible in compliant web browsers and is shown only where a line is broken at that position.
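The layout can be checked with a short round trip. The sketch below encodes a sample word and then collects every Arabic-script character in the upper half of the table to compare its byte value with its Unicode code point. It assumes that Python's bundled `iso-8859-6` codec reflects the table; the helper name `arabic_cells` is hypothetical.

```python
def arabic_cells(encoding="iso-8859-6"):
    """Map byte value -> character for the Arabic-script cells (0xA0-0xFF)."""
    cells = {}
    for b in range(0xA0, 0x100):
        try:
            ch = bytes([b]).decode(encoding)
        except UnicodeDecodeError:
            continue  # unassigned cell in the table
        if ord(ch) >= 0x0600:  # skip NBSP, the currency sign and SHY
            cells[b] = ch
    return cells

# SEEN, LAM, ALEF, MEEM in logical order
word = "\u0633\u0644\u0627\u0645"
encoded = word.encode("iso-8859-6")

cells = arabic_cells()
# Difference between Unicode code point and byte value for each cell
offsets = {ord(ch) - b for b, ch in cells.items()}

if __name__ == "__main__":
    print(encoded.hex(" "))
    print(sorted(hex(o) for o in offsets))
```

If the set `offsets` holds a single value, every Arabic-script cell sits at a fixed distance from its Unicode code point, so conversion for those cells reduces to adding or subtracting a constant.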
Code values 0x00–0x1F, 0x7F, 0x80–0x9F, 0xA1–0xA3, 0xA5–0xAB, 0xAE–0xBA, 0xBC–0xBE, 0xC0, 0xDB–0xDF, and 0xF3–0xFF are not assigned to characters by ISO/IEC 8859-6.
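The unassigned cells in the upper half can be enumerated programmatically. This is a minimal sketch that assumes Python's bundled `iso-8859-6` codec rejects unassigned bytes with a `UnicodeDecodeError`. It scans only 0xA0–0xFF, because that codec follows the IANA charset and maps 0x00–0x9F to the C0 and C1 control codes. The function names are hypothetical.

```python
def unassigned_high_bytes(encoding="iso-8859-6"):
    """Return the byte values in 0xA0-0xFF that the codec refuses to decode."""
    missing = []
    for b in range(0xA0, 0x100):
        try:
            bytes([b]).decode(encoding)
        except UnicodeDecodeError:
            missing.append(b)
    return missing

def to_ranges(values):
    """Collapse a sorted list of ints into inclusive (start, end) ranges."""
    ranges = []
    for v in values:
        if ranges and v == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], v)  # extend the current run
        else:
            ranges.append((v, v))  # start a new run
    return ranges

ranges = to_ranges(unassigned_high_bytes())

if __name__ == "__main__":
    print(", ".join(
        f"0x{lo:02X}" if lo == hi else f"0x{lo:02X}-0x{hi:02X}"
        for lo, hi in ranges
    ))
```

A decoder that needs to tolerate such bytes could pass `errors="replace"` to `decode` instead of letting the exception propagate.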
Code values 0xEB–0xF2 are assigned to combining characters.
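To make the combining behaviour concrete, the sketch below decodes the bytes 0xEB–0xF2 and looks up their Unicode properties with the standard `unicodedata` module. It assumes Python's bundled `iso-8859-6` codec; the variable names are illustrative.

```python
import unicodedata

MARK_BYTES = range(0xEB, 0xF3)  # 0xEB through 0xF2 inclusive

marks = bytes(MARK_BYTES).decode("iso-8859-6")

# (byte value, Unicode name, general category, canonical combining class)
info = [
    (f"0x{b:02X}", unicodedata.name(ch), unicodedata.category(ch),
     unicodedata.combining(ch))
    for b, ch in zip(MARK_BYTES, marks)
]

# A mark follows the base letter it modifies: BEH (0xC8) then FATHA (0xEE).
syllable = bytes([0xC8, 0xEE]).decode("iso-8859-6")

if __name__ == "__main__":
    for row in info:
        print(*row)
    print(len(syllable), "code points, rendered as one unit:", syllable)
```

A general category of `Mn` (nonspacing mark) and a non-zero combining class indicate that the character is drawn on the preceding base letter rather than occupying its own cell.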
External links
- ISO/IEC 8859-6:1999
- Standard ECMA-114: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Arabic Alphabet 2nd edition (December 2000)
- ISO-IR 127 Right-Hand Part of Latin/Arabic Alphabet (November 30, 1986)