Unicode Phonetic Symbols

From Wikipedia, the free encyclopedia


Note: This article contains special characters.

Unicode supports several phonetic alphabets and notations through its existing writing systems and through several blocks added for phonetic use: IPA Extensions (0250–02AF); Spacing Modifier Letters (02B0–02FF); Phonetic Extensions (1D00–1D7F); Phonetic Extensions Supplement (1D80–1DBF); Modifier Tone Letters (A700–A71F); and Superscripts and Subscripts (2070–209F).

Phonetic alphabets, such as the International Phonetic Alphabet, make use of letters from other writing systems, most notably Latin, Greek and Cyrillic. Combining diacritics also add meaning to phonetic text. Finally, these phonetic alphabets make use of modifier letters. A "modifier letter" is strictly intended not as an independent grapheme but as a modification of the preceding character [1], resulting in a distinct grapheme, notably in the context of the International Phonetic Alphabet. For example, ʰ should not occur on its own but modifies the preceding or following symbol. Thus, tʰ is a single IPA symbol, distinct from t. In practice, however, several of these "modifier letters" are also used as full graphemes, e.g. ʿ transliterating Semitic ayin or Hawaiian okina, or ˚ transliterating Abkhaz ә.

1 Blocks
2 Semantic Phonemes and character names
- 2.1 Consonants
- 2.2 Vowels
3 See also
4 External links

Blocks

Unicode ranges encoding phonetic notation.

IPA Extensions (0250–02AF)
Spacing Modifier Letters (02B0–02FF)
Phonetic Extensions (1D00–1D7F)
Phonetic Extensions Supplement (1D80–1DBF)
Modifier Tone Letters (A700–A71F)
Superscripts and Subscripts (2070–209F)

U+	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
0250	ɐ	ɑ	ɒ	ɓ	ɔ	ɕ	ɖ	ɗ	ɘ	ə	ɚ	ɛ	ɜ	ɝ	ɞ	ɟ
0260	ɠ	ɡ	ɢ	ɣ	ɤ	ɥ	ɦ	ɧ	ɨ	ɩ	ɪ	ɫ	ɬ	ɭ	ɮ	ɯ
0270	ɰ	ɱ	ɲ	ɳ	ɴ	ɵ	ɶ	ɷ	ɸ	ɹ	ɺ	ɻ	ɼ	ɽ	ɾ	ɿ
0280	ʀ	ʁ	ʂ	ʃ	ʄ	ʅ	ʆ	ʇ	ʈ	ʉ	ʊ	ʋ	ʌ	ʍ	ʎ	ʏ
0290	ʐ	ʑ	ʒ	ʓ	ʔ	ʕ	ʖ	ʗ	ʘ	ʙ	ʚ	ʛ	ʜ	ʝ	ʞ	ʟ
02A0	ʠ	ʡ	ʢ	ʣ	ʤ	ʥ	ʦ	ʧ	ʨ	ʩ	ʪ	ʫ	ʬ	ʭ	ʮ	ʯ
02B0	ʰ	ʱ	ʲ	ʳ	ʴ	ʵ	ʶ	ʷ	ʸ	ʹ	ʺ	ʻ	ʼ	ʽ	ʾ	ʿ
02C0	ˀ	ˁ	˂	˃	˄	˅	ˆ	ˇ	ˈ	ˉ	ˊ	ˋ	ˌ	ˍ	ˎ	ˏ
02D0	ː	ˑ	˒	˓	˔	˕	˖	˗	˘	˙	˚	˛	˜	˝	˞	˟
02E0	ˠ	ˡ	ˢ	ˣ	ˤ	˥	˦	˧	˨	˩	˪	˫	ˬ	˭	ˮ	˯
02F0	˰	˱	˲	˳	˴	˵	˶	˷	˸	˹	˺	˻	˼	˽	˾	˿
1D00	ᴀ	ᴁ	ᴂ	ᴃ	ᴄ	ᴅ	ᴆ	ᴇ	ᴈ	ᴉ	ᴊ	ᴋ	ᴌ	ᴍ	ᴎ	ᴏ
1D10	ᴐ	ᴑ	ᴒ	ᴓ	ᴔ	ᴕ	ᴖ	ᴗ	ᴘ	ᴙ	ᴚ	ᴛ	ᴜ	ᴝ	ᴞ	ᴟ
1D20	ᴠ	ᴡ	ᴢ	ᴣ	ᴤ	ᴥ	ᴦ	ᴧ	ᴨ	ᴩ	ᴪ	ᴫ	ᴬ	ᴭ	ᴮ	ᴯ
1D30	ᴰ	ᴱ	ᴲ	ᴳ	ᴴ	ᴵ	ᴶ	ᴷ	ᴸ	ᴹ	ᴺ	ᴻ	ᴼ	ᴽ	ᴾ	ᴿ
1D40	ᵀ	ᵁ	ᵂ	ᵃ	ᵄ	ᵅ	ᵆ	ᵇ	ᵈ	ᵉ	ᵊ	ᵋ	ᵌ	ᵍ	ᵎ	ᵏ
1D50	ᵐ	ᵑ	ᵒ	ᵓ	ᵔ	ᵕ	ᵖ	ᵗ	ᵘ	ᵙ	ᵚ	ᵛ	ᵜ	ᵝ	ᵞ	ᵟ
1D60	ᵠ	ᵡ	ᵢ	ᵣ	ᵤ	ᵥ	ᵦ	ᵧ	ᵨ	ᵩ	ᵪ	ᵫ	ᵬ	ᵭ	ᵮ	ᵯ
1D70	ᵰ	ᵱ	ᵲ	ᵳ	ᵴ	ᵵ	ᵶ	ᵷ	ᵸ	ᵹ	ᵺ	ᵻ	ᵼ	ᵽ	ᵾ	ᵿ
1D80	ᶀ	ᶁ	ᶂ	ᶃ	ᶄ	ᶅ	ᶆ	ᶇ	ᶈ	ᶉ	ᶊ	ᶋ	ᶌ	ᶍ	ᶎ	ᶏ
1D90	ᶐ	ᶑ	ᶒ	ᶓ	ᶔ	ᶕ	ᶖ	ᶗ	ᶘ	ᶙ	ᶚ	ᶛ	ᶜ	ᶝ	ᶞ	ᶟ
1DA0	ᶠ	ᶡ	ᶢ	ᶣ	ᶤ	ᶥ	ᶦ	ᶧ	ᶨ	ᶩ	ᶪ	ᶫ	ᶬ	ᶭ	ᶮ	ᶯ
1DB0	ᶰ	ᶱ	ᶲ	ᶳ	ᶴ	ᶵ	ᶶ	ᶷ	ᶸ	ᶹ	ᶺ	ᶻ	ᶼ	ᶽ	ᶾ	ᶿ
2070	⁰	ⁱ			⁴	⁵	⁶	⁷	⁸	⁹	⁺	⁻	⁼	⁽	⁾	ⁿ
2080	₀	₁	₂	₃	₄	₅	₆	₇	₈	₉	₊	₋	₌	₍	₎
2090	ₐ	ₑ	ₒ	ₓ	ₔ
A700	꜀	꜁	꜂	꜃	꜄	꜅	꜆	꜇	꜈	꜉	꜊	꜋	꜌	꜍	꜎	꜏
A710	꜐	꜑	꜒	꜓	꜔	꜕	꜖
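The chart above can also be read programmatically. The following is a minimal sketch, assuming only the six block ranges listed in this article; the names `PHONETIC_BLOCKS`, `phonetic_block` and `chart_row` are illustrative and not part of any Unicode library.

```python
# Block ranges copied from the list in this article (inclusive bounds).
PHONETIC_BLOCKS = [
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1D00, 0x1D7F, "Phonetic Extensions"),
    (0x1D80, 0x1DBF, "Phonetic Extensions Supplement"),
    (0x2070, 0x209F, "Superscripts and Subscripts"),
    (0xA700, 0xA71F, "Modifier Tone Letters"),
]


def phonetic_block(char):
    """Return the name of the listed block containing `char`, or None."""
    cp = ord(char)
    for first, last, name in PHONETIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


def chart_row(start):
    """Return the 16 characters of one chart row, e.g. start=0x0250."""
    return [chr(start + offset) for offset in range(16)]


if __name__ == "__main__":
    print(phonetic_block("\u0259"))      # block of U+0259
    print(" ".join(chart_row(0x0250)))   # first row of the chart
```

A plain letter such as t falls outside all six ranges, which reflects the point made in the introduction that phonetic alphabets also borrow from the ordinary Latin, Greek and Cyrillic ranges.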

Semantic Phonemes and character names

Unicode includes letters and marks from the International Phonetic Alphabet (IPA) as well as those supporting other phonetic writing systems. Essentially, these characters serve as graphemes for phonemes. In terms of script or writing system, these phonetic alphabets are basically one writing system; what distinguishes them is their glyphs. However, as with numerals, the UCS often focuses on the presentational forms, or glyphs, that the various phonetic alphabets give to these phonemes. This contrasts with the alternate names provided for these characters by the Unicode NamesList property, which typically reflect the phoneme semantics shared by those writing systems regardless of the glyphs used. The difference therefore shows up in the two kinds of name a character carries: the canonical UCS name and the NamesList property names. Similarly, Unicode assigns the value “Latin” to the script property of many of these characters. However, the primary purpose of including these characters in the character set is to support the various phonetic writing systems. These phonetic writing systems in many ways constitute a single unified writing system of their own, despite borrowing glyphs from the Latin, Greek and Cyrillic scripts.

This possibly results in a larger allocation of characters than necessary, likely because the UCS often inherits character distinctions from legacy character sets. The practice also raises other complications, because the vast majority of changes in phonetic alphabets slightly alter or completely replace glyphs. Seldom do these alphabets change the underlying phonemes those glyphs represent. Such glyph changes would be better handled through font updates than through changes to the UCS and Unicode. The semantic phonemes have been fairly stable for decades, especially the set of theoretically possible phonemes derived from our understanding of human aural anatomy. The phonemes have names like “labiodental flap”, while the glyph might be called “right-hook v” in informal IPA usage. For example, the UCS name for character U+1D18 is “Latin Letter Small Capital P”, while the semantic phoneme name added by Unicode is “semi-voiced [p]”.
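The two kinds of name can be placed side by side in code. This is a rough sketch: `unicodedata.name` from the Python standard library returns the canonical character name, but the standard library does not expose NamesList annotations, so the `PHONEME_NOTES` dictionary below is a hypothetical hand-written table holding only the annotation quoted in this article.

```python
import unicodedata

# Hypothetical table: annotation text copied from this article, not read
# from any Unicode data file.
PHONEME_NOTES = {
    0x1D18: "semi-voiced [p]",
}


def describe(cp):
    """Return 'U+XXXX CANONICAL NAME (phoneme note)' for a code point."""
    ucs_name = unicodedata.name(chr(cp), "<unnamed>")
    note = PHONEME_NOTES.get(cp)
    suffix = f" ({note})" if note else ""
    return f"U+{cp:04X} {ucs_name}{suffix}"


if __name__ == "__main__":
    print(describe(0x1D18))
    print(describe(0x0298))
```

The canonical name describes the shape of the glyph, while the annotation, where one exists, describes the sound.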

The alternate names provided by the UCS and Unicode are an excellent example of the motivation for, and benefits of, semantic unification like that used for Unihan characters. If the phonemes themselves were encoded semantically in Unicode, rather than the glyphs used in one or several phonetic alphabets, text processing could occur independently of visual presentation. One person could view phonemic writing using a font created with IPA glyphs while another could read the same text with a font created with Americanist phonetic notation glyphs. In searching, sorting and similar operations, the characters would be independent of the glyphs representing the phonemes. When the various phonetic associations alter the glyph for a phoneme, the update could take place in the fonts used to display the text and not in the underlying characters. Archived text would display with the new glyphs simply by selecting the updated font for display.
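The idea of storing phonemes and leaving glyph choice to the font can be illustrated with a toy model. This is only a sketch of the proposal, not how Unicode actually works: the two "fonts" are hypothetical dictionaries, and the three phonemes are chosen because their IPA and Americanist glyphs are commonly given as differing.

```python
# Hypothetical "fonts": each maps a phoneme description to the glyph that
# one notation uses for it.
IPA_FONT = {
    "voiceless postalveolar fricative": "\u0283",  # ʃ
    "voiced postalveolar fricative": "\u0292",     # ʒ
    "palatal approximant": "j",
}
AMERICANIST_FONT = {
    "voiceless postalveolar fricative": "\u0161",  # š
    "voiced postalveolar fricative": "\u017e",     # ž
    "palatal approximant": "y",
}


def render(phonemes, font):
    """Display a stored phoneme sequence using the glyphs of one font."""
    return "".join(font[p] for p in phonemes)


# The stored text is the same for every reader; only the display differs.
stored = ["voiceless postalveolar fricative", "palatal approximant"]

if __name__ == "__main__":
    print(render(stored, IPA_FONT))
    print(render(stored, AMERICANIST_FONT))
```

Searching or sorting would operate on `stored`, so two readers using different fonts would still get the same results.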

Consonants

The following tables indicate the Unicode code point sequences for phonemes as used in the International Phonetic Alphabet. A bold code point indicates that the Unicode chart provides an application note such as "voiced retroflex lateral" for U+026D "LATIN SMALL LETTER L WITH RETROFLEX HOOK". An entry in bold italics indicates the character name itself refers to a phoneme, such as “LATIN LETTER BILABIAL CLICK” for U+0298.

	Bilabial		Labiodental		Dental		Alveolar		Postalveolar		Retroflex		Labial-palatal
Plosive	p (U+0070)	b (U+0062)	p̪ (U+0070, U+032A)	b̪ (U+0062, U+032A)	t̪ (U+0074, U+032A)	d̪ (U+0064, U+032A)	t (U+0074)	d (U+0064)			ʈ (U+0288)	ɖ (U+0256)
Implosive	ɓ̥ (U+0253, U+0325)	ɓ (U+0253)				ɗ̪ (U+0257, U+032A)		ɗ (U+0257)				*
Ejective	pʼ (U+0070, U+02BC)				t̪ʼ (U+0074, U+032A, U+02BC)		tʼ (U+0074, U+02BC)				ʈʼ (U+0288, U+02BC)
Nasal	m̥ (U+006D, U+0325)	m (U+006D)	ɱ̊ (U+0271, U+030A)	ɱ (U+0271)	n̪̊ (U+006E, U+032A, U+030A)	n̪ (U+006E, U+032A)	n̥ (U+006E, U+0325)	n (U+006E)			ɳ̊ (U+0273, U+030A)	ɳ (U+0273)
Trill		ʙ (U+0299)					r̥ (U+0072, U+0325)	r (U+0072)				*
Tap or Flap		*		*				ɾ (U+027E)				ɽ (U+027D)
Lateral flap								ɺ (U+027A)				*
Fricative	ɸ (U+0278)	β (U+03B2)	f (U+0066)	v (U+0076)	θ (U+03B8)	ð (U+00F0)	s (U+0073)	z (U+007A)	ʃ (U+0283)	ʒ (U+0292)	ʂ (U+0282)	ʐ (U+0290)
Lateral fricative							ɬ (U+026C)	ɮ (U+026E)			*
Ejective fricative							sʼ (U+0073, U+02BC)		ʃʼ (U+0283, U+02BC)
Ejective lateral fricative							ɬʼ (U+026C, U+02BC)
Percussive	ʬ (U+02AC)				ʭ (U+02AD)
Approximant	β̞̊ (U+03B2, U+031E, U+030A)	β̞ (U+03B2, U+031E)	ʋ̥ (U+028B, U+0325)	ʋ (U+028B)		ð̞ (U+00F0, U+031E)	ɹ̥ (U+0279, U+0325)	ɹ (U+0279)			ɻ̊ (U+027B, U+030A)	ɻ (U+027B)	ɥ̊ (U+0265, U+030A)	ɥ (U+0265)
Lateral approximant							l̥ (U+006C, U+0325)	l (U+006C)				ɭ (U+026D)
Click consonant	ʘ (U+0298)				ǀ (U+01C0)		ǃ (U+01C3)		ǃ / ǂ (U+01C3 / U+01C2)
Lateral click					*		ǁ (U+01C1)

	Alveolo-palatal		Palatal		Labial-velar		Velar		Uvular		Pharyngeal		Epiglottal		Glottal
Plosive	ȶ (U+0236)	ȡ (U+0221)	c (U+0063)	ɟ (U+025F)	k͡p (U+006B, U+0361, U+0070)	ɡ͡b (U+0067, U+0361, U+0062)	k (U+006B)	g (U+0067)	q (U+0071)	ɢ (U+0262)			ʡ (U+02A1)		ʔ (U+0294)
Implosive				ʄ (U+0284)				ɠ (U+0260)		ʛ (U+029B)
Ejective			cʼ (U+0063, U+02BC)				kʼ (U+006B, U+02BC)		qʼ (U+0071, U+02BC)
Nasal		ȵ (U+0235)		ɲ (U+0272)		ŋ͡m (U+014B, U+0361, U+006D)		ŋ (U+014B)		ɴ (U+0274)
Trill										ʀ (U+0280)				*
Tap or Flap														*
Lateral flap				*				*
Fricative	ɕ (U+0255)	ʑ (U+0291)	ç (U+0063, U+0327)	ʝ (U+029D)			x (U+0078)	ɣ (U+0263)	χ (U+03C7)	ʁ (U+0281)	ħ (U+0127)	ʕ (U+0295)	ʜ (U+029C)	ʢ (U+02A2)	h (U+0068)	ɦ (U+0266)
Approximant				j (U+006A)	ʍ (U+028D)	w (U+0077)		ɰ (U+0270)
Lateral approximant		ȴ (U+0234)		ʎ (U+028E)				ʟ (U+029F)
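Many table entries are sequences of a base letter followed by combining marks or modifier letters. The sketch below shows one way such a sequence could be assembled and inspected in Python; the helper names `from_code_points` and `to_code_points` are illustrative, while `unicodedata.combining`, `unicodedata.category` and `unicodedata.normalize` are standard library functions.

```python
import unicodedata


def from_code_points(spec):
    """Turn a string like 'U+0074, U+032A, U+02BC' into the text it encodes."""
    return "".join(chr(int(part.strip()[2:], 16)) for part in spec.split(","))


def to_code_points(text):
    """Turn text back into the 'U+XXXX, U+XXXX' notation used in the tables."""
    return ", ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    dental_ejective = from_code_points("U+0074, U+032A, U+02BC")
    print(dental_ejective, to_code_points(dental_ejective))

    # U+032A is a combining mark (non-zero combining class), whereas
    # U+02BC is a spacing modifier letter (general category Lm).
    print(unicodedata.combining("\u032a") != 0)
    print(unicodedata.category("\u02bc"))

    # The table writes the palatal fricative as U+0063 + U+0327; NFC
    # normalization composes that pair into the single character U+00E7.
    print(to_code_points(unicodedata.normalize("NFC", "c\u0327")))
```

Because a sequence and its precomposed form can both occur in real text, comparing phonetic strings after normalizing them to one form is usually the safer choice.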

Vowels

The following figures depict the phonetic vowels and their Unicode / UCS code points. Vowels appearing in pairs in the chart indicate unrounded and rounded variations respectively. Again, characters with Unicode names referring to phonemes are indicated by bold text. Those with explicit application notes are indicated by bold italic text. Those borrowed unchanged from another script (Latin, Greek or Cyrillic) are indicated by italics.

Unicode code points

Close vowels	U+0069	U+0079	U+0268	U+0289	U+026F	U+0075
Near-close vowels	U+026A	U+028F	U+028A
Close-mid vowels	U+0065	U+00F8	U+0258	U+0275	U+0264	U+006F
Mid vowels	U+0259
Open-mid vowels	U+025B	U+0153	U+025C	U+025E	U+028C	U+0254
Near-open vowels	U+00E6	U+0250
Open vowels	U+0061	U+0276	U+0251	U+0252

	Front	Near-front	Central	Near-back	Back
Close	i • y		ɨ • ʉ		ɯ • u
Near-close		ɪ • ʏ		• ʊ
Close-mid	e • ø		ɘ • ɵ		ɤ • o
Mid			ə
Open-mid	ɛ • œ		ɜ • ɞ		ʌ • ɔ
Near-open	æ		ɐ
Open	a • ɶ				ɑ • ɒ
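The code point list and the glyph chart describe the same vowels, which a short script can confirm. This sketch assumes the close vowel code points are listed in the same order as the chart pairs (unrounded first, then rounded); `CLOSE_VOWELS` and `as_pairs` are illustrative names.

```python
# Close vowel code points in the order given in this section.
CLOSE_VOWELS = [0x0069, 0x0079, 0x0268, 0x0289, 0x026F, 0x0075]


def as_pairs(code_points):
    """Group consecutive code points into (unrounded, rounded) glyph pairs."""
    chars = [chr(cp) for cp in code_points]
    return list(zip(chars[0::2], chars[1::2]))


if __name__ == "__main__":
    for unrounded, rounded in as_pairs(CLOSE_VOWELS):
        print(f"{unrounded} \u2022 {rounded}")
```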

See also

Unicode mapping tables
BMP		SMP	SIP		SSP
0000–0FFF	8000–8FFF	10000–10FFF	20000–20FFF	28000–28FFF	E0000–E0FFF
1000–1FFF	9000–9FFF		21000–21FFF	29000–29FFF
2000–2FFF	A000–AFFF	12000–12FFF	22000–22FFF	2A000–2AFFF
3000–3FFF	B000–BFFF		23000–23FFF
4000–4FFF	C000–CFFF	1D000–1DFFF	24000–24FFF	2F000–2FFFF
5000–5FFF	D000–DFFF		25000–25FFF
6000–6FFF	E000–EFFF		26000–26FFF
7000–7FFF	F000–FFFF		27000–27FFF