ISO/IEC 8859-2

From Wikipedia, the free encyclopedia

ISO 8859-2, more formally cited as ISO/IEC 8859-2 or less formally as Latin-2, is part 2 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 2, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value.

ISO_8859-2:1987, more commonly known by its preferred MIME name ISO-8859-2 (note the extra hyphen), is the IANA charset name for this standard when it is used together with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) parts. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. This character set also has the aliases ISO_8859-2, latin2, l2 and csISOLatin2.
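As an illustration of these charset names, the following Python sketch checks whether several of them resolve to one and the same codec. It assumes Python's standard `codecs` module; the alias table it consults belongs to Python, not to the IANA registry, so the result only shows how one implementation treats the names.

```python
import codecs

# Charset names taken from the paragraph above; Python's lookup ignores case.
names = ["ISO-8859-2", "ISO_8859-2", "latin2", "l2", "csISOLatin2"]

# Map each charset name to the canonical codec name Python reports for it.
resolved = {name: codecs.lookup(name).name for name in names}

for name, canonical in resolved.items():
    print(f"{name:12} -> {canonical}")

# True when every alias points at the same codec.
same_codec = len(set(resolved.values())) == 1
print(same_codec)
```

A name that Python does not recognise would raise `LookupError` in `codecs.lookup`, so the sketch also works as a quick check for any additional alias.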

This encoding shares many of its assignments with windows-1250 but is not a strict subset of it (unlike ISO 8859-1, which is a subset of windows-1252).
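The relationship to windows-1250 can be inspected byte by byte. The sketch below assumes the Python codecs named `iso-8859-2` and `windows-1250` and compares only code values A0-FF, because some values in 80-9F are undefined in windows-1250 and would raise a decoding error. The name `differing` is chosen here for illustration.

```python
# Collect every code value in A0-FF that decodes to a different character
# in ISO 8859-2 than in windows-1250.
differing = {}
for value in range(0xA0, 0x100):
    raw = bytes([value])
    iso_char = raw.decode("iso-8859-2")
    win_char = raw.decode("windows-1250")
    if iso_char != win_char:
        differing[value] = (iso_char, win_char)

# Print code points rather than glyphs so the output is plain ASCII.
for value, (iso_char, win_char) in differing.items():
    print(
        f"0x{value:02X}: ISO 8859-2 U+{ord(iso_char):04X}, "
        f"windows-1250 U+{ord(win_char):04X}"
    )
```

Text that uses any of the listed code values is decoded incorrectly if one encoding is mistaken for the other, which is why the two labels are not interchangeable.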

These code values can be used in almost any data interchange system to communicate in the following European languages: Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper Sorbian and Lower Sorbian. Furthermore, it is suitable for representing some western European languages such as Finnish (with the exception of å, used in Swedish-Finnish names) or German. Text written only in these latter languages would nominally use the ISO 8859-1 encoding, but the code points it needs are shared with ISO 8859-2, which is an important property for multilingual documents.
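Whether a given text fits into the encoding can be tested by attempting to encode it. The following is a minimal sketch using Python's `iso-8859-2` codec; the helper name `fits_latin2` and the sample sentences are illustrative choices, not part of the standard.

```python
def fits_latin2(text: str) -> bool:
    """Return True if every character of text exists in ISO 8859-2."""
    try:
        text.encode("iso-8859-2")
    except UnicodeEncodeError:
        return False
    return True

# Sample strings chosen for illustration only.
samples = {
    "Polish": "Zażółć gęślą jaźń",
    "Czech": "Příliš žluťoučký kůň",
    "Hungarian": "Árvíztűrő tükörfúrógép",
    "German": "Größe",
}

for language, text in samples.items():
    print(language, fits_latin2(text))

# The letter å (here in its capital form) is not part of the repertoire.
print("Åland", fits_latin2("Åland"))

# German text yields the same bytes in ISO 8859-1 and ISO 8859-2.
shared = "Größe".encode("iso-8859-1") == "Größe".encode("iso-8859-2")
print(shared)
```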

It may be argued that ISO 8859-2 is not really suitable for Romanian because it lacks the letters s and t with comma below and contains s and t with cedilla instead. These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as a distinction between separate characters; fonts intended for use with Romanian should therefore have glyphs with comma below at those code points.
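In current Unicode the comma-below letters have their own code points (U+0218 to U+021B), which ISO 8859-2 cannot encode directly. One possible workaround, shown here as a sketch and not as a prescribed practice, is to substitute the cedilla letters before encoding. The names `COMMA_TO_CEDILLA` and `encode_romanian` are hypothetical.

```python
# Comma-below letters mapped to the cedilla letters that ISO 8859-2 encodes.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # S with comma below -> S with cedilla
    "\u0219": "\u015F",  # s with comma below -> s with cedilla
    "\u021A": "\u0162",  # T with comma below -> T with cedilla
    "\u021B": "\u0163",  # t with comma below -> t with cedilla
})

def encode_romanian(text: str) -> bytes:
    """Encode Romanian text as ISO 8859-2, falling back to cedilla letters."""
    return text.translate(COMMA_TO_CEDILLA).encode("iso-8859-2")

word = "\u0219tiin\u021B\u0103"  # "știință" written with comma-below letters
encoded = encode_romanian(word)
print(encoded)
```

The substitution loses information: after decoding, the text contains the cedilla letters, and it is left to the font to draw them with a comma below.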

ISO/IEC 8859-2
	x0	x1	x2	x3	x4	x5	x6	x7	x8	x9	xA	xB	xC	xD	xE	xF
0x	unused
1x	unused
2x	SP	!	"	#	$	%	&	'	(	)	*	+	,	-	.	/
3x	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
4x	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
5x	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
6x	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
7x	p	q	r	s	t	u	v	w	x	y	z	{	|	}	~
8x	unused
9x	unused
Ax	NBSP	Ą	˘	Ł	¤	Ľ	Ś	§	¨	Š	Ş	Ť	Ź	SHY	Ž	Ż
Bx	°	ą	˛	ł	´	ľ	ś	ˇ	¸	š	ş	ť	ź	˝	ž	ż
Cx	Ŕ	Á	Â	Ă	Ä	Ĺ	Ć	Ç	Č	É	Ę	Ë	Ě	Í	Î	Ď
Dx	Đ	Ń	Ň	Ó	Ô	Ő	Ö	×	Ř	Ů	Ú	Ű	Ü	Ý	Ţ	ß
Ex	ŕ	á	â	ă	ä	ĺ	ć	ç	č	é	ę	ë	ě	í	î	ď
Fx	đ	ń	ň	ó	ô	ő	ö	÷	ř	ů	ú	ű	ü	ý	ţ	˙

In the table above, code value 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which is normally invisible and is rendered only where a line is broken at that position.

Code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-2.

Code page layout

In the following table, the characters for code values A0-FF are shown together with their corresponding Unicode code points (in hexadecimal).

	.0	.1	.2	.3	.4	.5	.6	.7	.8	.9	.A	.B	.C	.D	.E	.F
A.	NBSP A0	Ą 104	˘ 2D8	Ł 141	¤ A4	Ľ 13D	Ś 15A	§ A7	¨ A8	Š 160	Ş 15E	Ť 164	Ź 179	SHY AD	Ž 17D	Ż 17B
B.	° B0	ą 105	˛ 2DB	ł 142	´ B4	ľ 13E	ś 15B	ˇ 2C7	¸ B8	š 161	ş 15F	ť 165	ź 17A	˝ 2DD	ž 17E	ż 17C
C.	Ŕ 154	Á C1	Â C2	Ă 102	Ä C4	Ĺ 139	Ć 106	Ç C7	Č 10C	É C9	Ę 118	Ë CB	Ě 11A	Í CD	Î CE	Ď 10E
D.	Đ 110	Ń 143	Ň 147	Ó D3	Ô D4	Ő 150	Ö D6	× D7	Ř 158	Ů 16E	Ú DA	Ű 170	Ü DC	Ý DD	Ţ 162	ß DF
E.	ŕ 155	á E1	â E2	ă 103	ä E4	ĺ 13A	ć 107	ç E7	č 10D	é E9	ę 119	ë EB	ě 11B	í ED	î EE	ď 10F
F.	đ 111	ń 144	ň 148	ó F3	ô F4	ő 151	ö F6	÷ F7	ř 159	ů 16F	ú FA	ű 171	ü FC	ý FD	ţ 163	˙ 2D9
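The code points in this table can be regenerated from a codec implementation. The sketch below assumes Python's `iso-8859-2` codec and prints one tab-separated row of hexadecimal code points per high nibble; the function name `latin2_to_unicode` is chosen here for illustration.

```python
def latin2_to_unicode(value: int) -> int:
    """Return the Unicode code point for one ISO 8859-2 code value."""
    return ord(bytes([value]).decode("iso-8859-2"))

# Build rows A. to F., each holding sixteen hexadecimal code points.
rows = []
for high in range(0xA, 0x10):
    cells = [f"{latin2_to_unicode(high * 16 + low):X}" for low in range(16)]
    rows.append(f"{high:X}.\t" + "\t".join(cells))

print("\n".join(rows))
```

Comparing the printed rows with the table is a simple way to spot transcription errors in either one.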

External links

ISO 8859-2:1999
Standard ECMA-94: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 2nd edition (June 1986)
ISO-IR 101 Right-Hand Part of Latin Alphabet No.2 (February 1, 1986)
ISO 8859-2 (Latin 2) Resources
