ISO/IEC 8859-2

ISO 8859-2, more formally cited as ISO/IEC 8859-2 or less formally as Latin-2, is part 2 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 2, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value.

ISO_8859-2:1987, more commonly known by its preferred MIME name ISO-8859-2 (note the extra hyphen), is the IANA charset name for this standard when it is used together with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) ranges. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not interpreted. The charset also has the aliases ISO_8859-2, latin2, l2 and csISOLatin2.
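As an illustration of how these names behave in practice, the following minimal sketch asks Python's codec registry to resolve each label from the paragraph above. The list `NAMES` and the helper `canonical_name` are names chosen for this example, and the assumption that Python's alias table recognises every label is a property of that implementation, not of the IANA registry.

```python
import codecs

# Charset labels taken from the paragraph above. That Python's alias table
# knows all of them is an assumption about the implementation.
NAMES = ["ISO_8859-2:1987", "ISO-8859-2", "ISO_8859-2", "latin2", "l2", "csISOLatin2"]


def canonical_name(label: str) -> str:
    """Return the codec name that Python resolves a charset label to."""
    return codecs.lookup(label).name


if __name__ == "__main__":
    for label in NAMES:
        print(f"{label:16} -> {canonical_name(label)}")
```

Lookup is case-insensitive and treats punctuation loosely, which is why the colon and the extra hyphen do not matter here.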

This encoding shares many assignments with windows-1250 but is not a strict subset of it (unlike ISO 8859-1, whose graphic characters all keep the same positions in windows-1252).
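The differences can be listed mechanically. The sketch below compares the two encodings byte by byte over A0-FF, using Python's `iso8859_2` and `cp1250` codecs as stand-ins for the two standards; the function name `differing_bytes` is hypothetical. The range 80-9F is left out because ISO 8859-2 assigns no characters there.

```python
def differing_bytes(enc_a: str = "iso8859_2", enc_b: str = "cp1250") -> dict:
    """Map each byte in A0-FF that decodes differently to its two characters."""
    diffs = {}
    for value in range(0xA0, 0x100):
        raw = bytes([value])
        # errors="replace" keeps the loop going if a codec leaves a byte undefined.
        char_a = raw.decode(enc_a, errors="replace")
        char_b = raw.decode(enc_b, errors="replace")
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    for value, (iso_char, win_char) in sorted(differing_bytes().items()):
        print(f"0x{value:02X}: ISO 8859-2 {iso_char!r}  windows-1250 {win_char!r}")
```

Any byte reported here decodes to a different character depending on which of the two encodings is assumed, which is why mislabelled text shows a few wrong letters rather than being wholly unreadable.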

These code values can be used in almost any data interchange system to communicate in the following European languages: Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper Sorbian and Lower Sorbian. It is also suitable for some western European languages such as Finnish (except for the å used in Swedish-Finnish names) and German. Texts written only in these latter languages would normally use ISO 8859-1, but the code points they need are assigned identically in ISO 8859-2, which matters for multilingual documents.
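A quick way to check whether a given text fits is to try encoding it. The sketch below assumes Python's `iso8859_2` and `iso8859_1` codecs; the helper `fits_latin2` and the sample strings are chosen for illustration only.

```python
def fits_latin2(text: str) -> bool:
    """Return True if every character of text has an ISO 8859-2 code value."""
    try:
        text.encode("iso8859_2")
    except UnicodeEncodeError:
        return False
    return True


SAMPLES = {
    "Polish": "Zażółć gęślą jaźń",
    "Czech": "Příliš žluťoučký kůň",
    "German": "Größe",
    "Swedish-Finnish name": "Åbo",  # contains Å, which ISO 8859-2 lacks
}

if __name__ == "__main__":
    for language, text in SAMPLES.items():
        print(f"{language:22} {text!r:26} fits: {fits_latin2(text)}")
    # German letters occupy the same code values in both parts of ISO 8859.
    print("Größe".encode("iso8859_2") == "Größe".encode("iso8859_1"))
```

Because each character occupies exactly one byte, the encoded length of a fitting string equals its character count.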

It may be argued that ISO 8859-2 is not really suitable for Romanian because it lacks the letters s and t with comma below and contains s and t with cedilla instead. These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should, therefore, have glyphs with comma below at those code points.
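The practical consequence is that text using the separately encoded comma-below letters (U+0218 to U+021B) cannot be encoded directly. The sketch below shows one possible workaround, substituting the cedilla letters before encoding; the names `COMMA_TO_CEDILLA` and `encode_romanian` are hypothetical, and the substitution is an assumption made for this example, not something the standard prescribes.

```python
# Comma-below letters mapped to the cedilla letters that ISO 8859-2 encodes.
# This substitution is an illustrative workaround, not part of the standard.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # Ș -> Ş (code value AA)
    "\u0219": "\u015F",  # ș -> ş (code value BA)
    "\u021A": "\u0162",  # Ț -> Ţ (code value DE)
    "\u021B": "\u0163",  # ț -> ţ (code value FE)
})


def encode_romanian(text: str) -> bytes:
    """Encode text as ISO 8859-2 after replacing comma-below letters."""
    return text.translate(COMMA_TO_CEDILLA).encode("iso8859_2")


if __name__ == "__main__":
    word = "\u0219tiin\u021b\u0103"  # "știință" written with comma-below letters
    try:
        word.encode("iso8859_2")
    except UnicodeEncodeError as exc:
        print("direct encoding fails:", exc.reason)
    print(encode_romanian(word))
```

Decoding the result yields the cedilla letters, so whether the reader sees a comma or a cedilla depends on the font, as described above.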

ISO/IEC 8859-2
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x unused
1x unused
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~  
8x unused
9x unused
Ax NBSP Ą ˘ Ł ¤ Ľ Ś § ¨ Š Ş Ť Ź SHY Ž Ż
Bx ° ą ˛ ł ´ ľ ś ˇ ¸ š ş ť ź ˝ ž ż
Cx Ŕ Á Â Ă Ä Ĺ Ć Ç Č É Ę Ë Ě Í Î Ď
Dx Đ Ń Ň Ó Ô Ő Ö × Ř Ů Ú Ű Ü Ý Ţ ß
Ex ŕ á â ă ä ĺ ć ç č é ę ë ě í î ď
Fx đ ń ň ó ô ő ö ÷ ř ů ú ű ü ý ţ ˙

In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which a compliant web browser should not display unless it breaks a line at that point.

Code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-2.
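These ranges can be checked against an implementation. The sketch below assumes Python's `iso8859_2` codec, which follows the IANA charset and decodes the unassigned positions to the C0 and C1 control characters; it separates control positions from graphic ones and looks up the Unicode names of the three special characters mentioned above. The helper `decode_byte` is a name chosen for this example.

```python
import unicodedata


def decode_byte(value: int) -> str:
    """Decode a single code value with Python's ISO 8859-2 codec."""
    return bytes([value]).decode("iso8859_2")


# Positions that decode to a control character (Unicode category "Cc").
controls = [v for v in range(256) if unicodedata.category(decode_byte(v)) == "Cc"]
# Everything else is one of the characters assigned by the standard.
graphic = [v for v in range(256) if v not in controls]

if __name__ == "__main__":
    print(len(controls), "control positions,", len(graphic), "assigned characters")
    for value in (0x20, 0xA0, 0xAD):
        print(f"{value:02X}: {unicodedata.name(decode_byte(value))}")
```

The control positions are exactly 00-1F, 7F and 80-9F, which leaves the 191 characters that the standard assigns.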

Code page layout

In the following table, the characters for code values A0-FF are shown together with their corresponding Unicode code points (in hexadecimal).

    .0   .1   .2   .3   .4   .5   .6   .7   .8   .9   .A   .B   .C   .D   .E   .F
A.  NBSP Ą    ˘    Ł    ¤    Ľ    Ś    §    ¨    Š    Ş    Ť    Ź    SHY  Ž    Ż
    A0   104  2D8  141  A4   13D  15A  A7   A8   160  15E  164  179  AD   17D  17B
B.  °    ą    ˛    ł    ´    ľ    ś    ˇ    ¸    š    ş    ť    ź    ˝    ž    ż
    B0   105  2DB  142  B4   13E  15B  2C7  B8   161  15F  165  17A  2DD  17E  17C
C.  Ŕ    Á    Â    Ă    Ä    Ĺ    Ć    Ç    Č    É    Ę    Ë    Ě    Í    Î    Ď
    154  C1   C2   102  C4   139  106  C7   10C  C9   118  CB   11A  CD   CE   10E
D.  Đ    Ń    Ň    Ó    Ô    Ő    Ö    ×    Ř    Ů    Ú    Ű    Ü    Ý    Ţ    ß
    110  143  147  D3   D4   150  D6   D7   158  16E  DA   170  DC   DD   162  DF
E.  ŕ    á    â    ă    ä    ĺ    ć    ç    č    é    ę    ë    ě    í    î    ď
    155  E1   E2   103  E4   13A  107  E7   10D  E9   119  EB   11B  ED   EE   10F
F.  đ    ń    ň    ó    ô    ő    ö    ÷    ř    ů    ú    ű    ü    ý    ţ    ˙
    111  144  148  F3   F4   151  F6   F7   159  16F  FA   171  FC   FD   163  2D9
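The same mapping can be regenerated from a codec instead of being typed by hand. The following sketch assumes Python's `iso8859_2` codec agrees with the table above; the function names `high_half_table` and `format_row` are hypothetical.

```python
def high_half_table(encoding: str = "iso8859_2") -> dict:
    """Map each code value A0-FF to the Unicode code point it decodes to."""
    return {b: ord(bytes([b]).decode(encoding)) for b in range(0xA0, 0x100)}


def format_row(table: dict, high_nibble: int) -> str:
    """Format one row of code points (hex, no padding) for a high nibble A-F."""
    cells = [f"{table[(high_nibble << 4) | low]:X}" for low in range(16)]
    return f"{high_nibble:X}. " + " ".join(cells)


if __name__ == "__main__":
    table = high_half_table()
    for nibble in range(0xA, 0x10):
        print(format_row(table, nibble))
```

Characters that also appear in ISO 8859-1 keep a code point equal to their code value (for example A7 and DF), while the rest map into the ranges U+0100 to U+017F and U+02C7 to U+02DD.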

External links