ISO/IEC 8859-8

From Wikipedia, the free encyclopedia

ISO 8859-8, more formally cited as ISO/IEC 8859-8 (but not as Latin-8!), is part 8 of ISO/IEC 8859, a standard character encoding defined by ISO. ISO 8859-8 contains all the Hebrew letters, but no Hebrew vowel signs.

ISO_8859-8:1988, more commonly known by its preferred MIME name of ISO-8859-8, is the IANA charset that combines the characters of ISO/IEC 8859-8 with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) ranges. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. This charset also has the aliases iso-ir-138, ISO_8859-8, Hebrew and csISOLatinHebrew.
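As an illustration of how these names behave in practice, the following sketch asks Python's standard codec registry to resolve each of them. It assumes CPython's built-in alias table, which is an implementation detail of Python rather than part of the standard or the IANA registry; the variable names are chosen for this example only.

```python
import codecs

# Charset names taken from the paragraph above. Python's lookup is
# case-insensitive and treats hyphens and underscores alike.
NAMES = ["ISO-8859-8", "iso-ir-138", "ISO_8859-8", "Hebrew", "csISOLatinHebrew"]

# Map each name to the canonical codec name Python resolves it to.
resolved = {name: codecs.lookup(name).name for name in NAMES}

for name, canonical in resolved.items():
    print(f"{name:>18} -> {canonical}")
```

If the alias table is as assumed, every name resolves to the same codec.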

ISO-8859-8 exists in three different forms, which differ in how text direction is handled. If just ISO-8859-8 is given, visual order is assumed, meaning that Hebrew, a right-to-left (RTL) script, is stored left to right (LTR), i.e. backwards relative to reading order. If ISO-8859-8-I is given, logical order is used (also for plain text, such as unformatted emails), and Hebrew must be stored in its correct reading order. ISO-8859-8-E requires directionality to be explicitly specified with special control characters. As of 2004, visual order is dying out in Hebrew-language computing, being rapidly replaced everywhere by logical order (as ISO-8859-8-I, Windows-1255 or UTF-8).
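The difference between logical and visual order can be sketched for a single Hebrew word. This is a deliberately naive illustration: the helper to_visual is hypothetical and simply reverses the string, which only works for a run made up entirely of Hebrew letters. Mixed-direction text, digits and punctuation need a full bidirectional algorithm.

```python
# Logical order: letters stored in reading order (shin, lamed, vav, final mem).
logical = "\u05e9\u05dc\u05d5\u05dd"

def to_visual(text):
    """Naive conversion: reverse a string that consists only of RTL letters."""
    return text[::-1]

visual = to_visual(logical)

# Both orders use the same byte values; only the sequence differs.
logical_bytes = logical.encode("iso8859_8")
visual_bytes = visual.encode("iso8859_8")

print("logical:", logical_bytes.hex(" "))
print("visual: ", visual_bytes.hex(" "))
```

The byte values are identical in both cases, so nothing in the data itself reveals which order was used; that information has to come from the charset label.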

Codepage layout

The following table lists the characters in ISO 8859-8.

ISO/IEC 8859-8 (-- marks a code value with no assigned character)
   x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x (unused)
1x (unused)
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ --
8x (unused)
9x (unused)
Ax NBSP -- ¢ £ ¤ ¥ ¦ § ¨ © × « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ ÷ » ¼ ½ ¾ --
Cx -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Dx -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ‗
Ex א ב ג ד ה ו ז ח ט י ך כ ל ם מ ן
Fx נ ס ע ף פ ץ צ ק ר ש ת -- -- LRM RLM --
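The Hebrew letters in rows Ex and Fx form one contiguous run from E0 to FA, in the same order as the Unicode Hebrew letters starting at U+05D0, so they can be converted with a fixed offset. The sketch below shows this; the function name hebrew_byte_to_char and the constants are made up for the example, and the loop cross-checks each result against Python's own iso8859_8 codec.

```python
import unicodedata

ALEF_BYTE = 0xE0  # first Hebrew letter in the table
TAV_BYTE = 0xFA   # last Hebrew letter in the table
ALEF_CODE_POINT = 0x05D0

def hebrew_byte_to_char(value):
    """Map a byte in E0..FA to its Unicode character using a fixed offset."""
    if not ALEF_BYTE <= value <= TAV_BYTE:
        raise ValueError(f"0x{value:02X} is not a Hebrew letter in ISO 8859-8")
    return chr(ALEF_CODE_POINT + (value - ALEF_BYTE))

for value in range(ALEF_BYTE, TAV_BYTE + 1):
    char = hebrew_byte_to_char(value)
    # Cross-check the offset rule against the standard library codec.
    assert bytes([value]).decode("iso8859_8") == char
    print(f"{value:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```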

In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which compliant web browsers should display only where a line is broken at that position.

FD is the left-to-right mark (U+200E) and FE is the right-to-left mark (U+200F), as specified in the newer edition, ISO/IEC 8859-8:1999.
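The special code values mentioned above can be inspected by decoding them and printing their Unicode names. This is a small sketch that assumes Python's iso8859_8 codec follows the 1999 mapping for FD and FE; the name SPECIAL_BYTES is chosen for this example.

```python
import unicodedata

# 20 = SPACE, A0 = NO-BREAK SPACE, AD = SOFT HYPHEN, FD = LRM, FE = RLM
SPECIAL_BYTES = bytes([0x20, 0xA0, 0xAD, 0xFD, 0xFE])

decoded = SPECIAL_BYTES.decode("iso8859_8")

for byte, char in zip(SPECIAL_BYTES, decoded):
    print(f"{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```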

Code values 00-1F, 7F, 80-9F, A1, BF-DE, FB-FC and FF are not assigned to characters by ISO/IEC 8859-8.
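The unassigned code values can be found empirically by trying to decode every byte. Note one assumption about the tool used here: Python's iso8859_8 codec behaves like the IANA charset, so it maps 00-1F, 7F and 80-9F to the corresponding control characters and rejects only the remaining gaps. The function name undefined_bytes is hypothetical.

```python
def undefined_bytes(encoding="iso8859_8"):
    """Return the byte values that the given codec refuses to decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            missing.append(value)
    return missing

missing = undefined_bytes()
print(" ".join(f"{value:02X}" for value in missing))
```

Under that assumption, the rejected values are A1, BF-DE, FB-FC and FF, matching the non-control entries in the list above.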

External links