Wikipedia:Indic transliteration scheme

From Wikipedia, the free encyclopedia

All transliteration should be from the written form in the original script of the original language of the name or term. The original text in the original script should also be included near the start of the article for reference and checking.

An unpronounced inherent 'a' should be removed when both of the following hold: the source script does not itself mark the removal of the inherent 'a', and the 'a' is not pronounced in the source language.

The scheme is based on ISO 15919 for Indic scripts.[1] This is very close to IAST, with minor differences to accommodate non-Devanagari scripts. The differences are:

ए - IAST: e, ISO: ē
ओ - IAST: o, ISO: ō
अं - IAST: ṃ, ISO: ṁ (ṃ is used to specifically represent Gurmukhi Tippi ੰ)
ऋ - IAST: ṛ, ISO: r̥
ॠ - IAST: ṝ, ISO: r̥̄

The advantage of using ISO 15919 is that it can be applied uniformly across all Indic scripts.
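The five differences listed above can be applied mechanically. The following is a minimal Python sketch, not part of the scheme itself: the function name iast_to_iso15919 is hypothetical, and it assumes lowercase Sanskrit-style IAST input in which every 'e' and 'o' is long.

```python
import unicodedata

# IAST -> ISO 15919 substitutions, limited to the five differences listed above.
# Keys are single NFC code points; values are the ISO 15919 spellings.
IAST_TO_ISO = {
    "e": "\u0113",              # e -> e with macron
    "o": "\u014d",              # o -> o with macron
    "\u1e43": "\u1e41",         # m with dot below -> m with dot above
    "\u1e5b": "r\u0325",        # r with dot below -> r + combining ring below
    "\u1e5d": "r\u0325\u0304",  # long vocalic r -> r + ring below + macron
}
_TABLE = {ord(key): value for key, value in IAST_TO_ISO.items()}


def iast_to_iso15919(text):
    """Convert lowercase IAST text to ISO 15919 (hypothetical helper)."""
    # Normalise to NFC first so that decomposed input (for example
    # r + combining dot below) matches the single-code-point keys.
    return unicodedata.normalize("NFC", text).translate(_TABLE)
```

Text that already contains ISO-only letters passes through unchanged, because str.translate only touches the five IAST keys.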

1 Inherent vowel
2 Vowels
3 Consonants
4 Nasalisation
5 Other Signs
6 References

Inherent vowel

The inherent vowel is always transliterated as 'a' in the formal ISO 15919 transliteration. In the simplified transliteration, 'a' is also normally used except in Bengali, Assamese, and Oriya, where 'o'/'ô' is used. See Romanization of Bengali for the transliteration scheme set for Bengali on Wikipedia.

TODO Talk about differing IPAs for inherent vowels.

Vowels

Vowels are presented in their independent form on the left of each column, and combined with the consonant ka on the right. An asterisk indicates that the letter or ligature exists but either has not been encoded in Unicode or is archaic or obsolete.

ISO 15919	Simplified	IPA	Devanagari		Bengali		Gurmukhi		Gujarati		Oriya		Tamil		Telugu		Kannada		Malayalam		Sinhala
a	a	ə/ɐ/ä/ɔ	अ	-	অ	-	ਅ	-	અ	-	ଅ	-	அ	-	అ	-	ಅ	-	അ	-	අ	-
ā	a	aː	आ	का	আ	কা	ਆ	ਕਾ	આ	કા	ଆ	କା	ஆ	கா	ఆ	కా	ಆ	ಕಾ	ആ	കാ	ආ	කා
æ	ae	?	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	ඇ	කැ
ǣ	ae	?	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	ඈ	කෑ
i	i	i	इ	कि	ই	কি	ਇ	ਕਿ	ઇ	કિ	ଇ	କି	இ	கி	ఇ	కి	ಇ	ಕಿ	ഇ	കി	ඉ	කි
ī	i	iː	ई	की	ঈ	কী	ਈ	ਕੀ	ઈ	કી	ଈ	କୀ	ஈ	கீ	ఈ	కీ	ಈ	ಕೀ	ഈ	കീ	ඊ	කී
u	u	u	उ	कु	উ	কু	ਉ	ਕੁ	ઉ	કુ	ଉ	କୁ	உ	கு	ఉ	కు	ಉ	ಕು	ഉ	കു	උ	කු
ū	u	uː	ऊ	कू	ঊ	কূ	ਊ	ਕੂ	ઊ	કૂ	ଊ	କୂ	ஊ	கூ	ఊ	కూ	ಊ	ಕೂ	ഊ	കൂ	ඌ	කූ
ĕ	e	æ	ऍ	कॅ	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
e	e	e	ऎ	कॆ	-	-	-	-	-	-	-	-	எ	கெ	ఎ	కె	ಎ	ಕೆ	എ	കെ	එ	කෙ
ē	e	eː	ए	के	এ	কে	ਏ	ਕੇ	એ	કે	ଏ	କେ	ஏ	கே	ఏ	కే	ಏ	ಕೇ	ഏ	കേ	ඒ	කේ
ai	ai	ɛː/əj/æ/ɔj	ऐ	कै	ঐ	কৈ	ਐ	ਕੈ	ઐ	કૈ	ଐ	କୈ	ஐ	கை	ఐ	కై	ಐ	ಕೈ	ഐ	കൈ	ඓ	කෛ
ŏ	o	ɔ	ऑ	कॉ	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
o	o	o	ऒ	कॊ	-	-	-	-	-	-	-	-	ஒ	கொ	ఒ	కొ	ಒ	ಕೊ	ഒ	കൊ	ඔ	කො
ō	o	oː	ओ	को	ও	কো	ਓ	ਕੋ	ઓ	કો	ଓ	କୋ	ஓ	கோ	ఓ	కో	ಓ	ಕೋ	ഓ	കോ	ඕ	කෝ
au	au	ɔ/əw/ɔw	औ	कौ	ঔ	কৌ	ਔ	ਕੌ	ઔ	કૌ	ଔ	କୌ	ஔ	கௌ	ఔ	కౌ	ಔ	ಕೌ	ഔ	കൌ	ඖ	කෞ
r̥	ri	ṛ	ऋ	कृ	ঋ	কৃ	-	-	ઋ	કૃ	ଋ	କୃ	-	-	ఋ	కృ	ಋ	ಕೃ	ഋ	കൃ	ඍ	කෘ
r̥̄	ri	ṛː	ॠ	कॄ	ৠ	কৄ	-	-	ૠ	કૄ	ୠ	-	-	-	ౠ	కౄ	ೠ	ಕೄ	ൠ	*	ඎ	කෲ
l̥	?	ḷ	ऌ	कॢ	ঌ	কৢ	-	-	-	-	ଌ	-	-	-	ఌ	-	ಌ	-	ഌ	*	ඏ	කෟ
l̥̄	?	ḷː	ॡ	कॣ	ৡ	কৣ	-	-	-	-	ୡ	-	-	-	ౡ	-	ೡ	-	ൡ	*	ඐ	කෳ

Consonants

ISO 15919	Simplified	IPA	Devanagari	Bengali	Gurmukhi	Gujarati	Oriya	Tamil	Telugu	Kannada	Malayalam	Sinhala
k	k	k	क	ক	ਕ	ક	କ	க	క	ಕ	ക	ක
kh	kh	kʰ	ख	খ	ਖ	ખ	ଖ	-	ఖ	ಖ	ഖ	ඛ
g	g	g	ग	গ	ਗ	ગ	ଗ	க	గ	ಗ	ഗ	ග
gh	gh	gʱ	घ	ঘ	ਘ^[2]	ઘ	ଘ	-	ఘ	ಘ	ഘ	ඝ
ṅ	n	ŋ	ङ	ঙ	ਙ	ઙ	ଙ	ங	ఙ	ಙ	ങ	ඞ
c	ch	ʧ	च	চ	ਚ	ચ	ଚ	ச	చ	ಚ	ച	ච
ch	chh	ʧʰ	छ	ছ	ਛ	છ	ଛ	ச	ఛ	ಛ	ഛ	ඡ
j	j	ʤ	ज	জ	ਜ	જ	ଜ	ஜ	జ	ಜ	ജ	ජ
jh	jh	ʤʱ	झ	ঝ	ਝ^[3]	ઝ	ଝ	-	ఝ	ಝ	ഝ	ඣ
ñ	n	ɲ	ञ	ঞ	ਞ	ઞ	ଞ	ஞ	ఞ	ಞ	ഞ	ඤ
ṭ	t	ʈ	ट	ট	ਟ	ટ	ଟ	ட	ట	ಟ	ട	ට
ṭh	th	ʈʰ	ठ	ঠ	ਠ	ઠ	ଠ	த	ఠ	ಠ	ഠ	ඨ
ḍ	d	ɖ	ड	ড	ਡ	ડ	ଡ	ட	డ	ಡ	ഡ	ඩ
ḍh	dh	ɖʱ	ढ	ঢ	ਢ^[4]	ઢ	ଢ	-	ఢ	ಢ	ഢ	ඪ
ṇ	n	ɳ	ण	ণ	ਣ	ણ	ଣ	ண	ణ	ಣ	ണ	ණ
t	t	t̪	त	ত	ਤ	ત	ତ	த	త	ತ	ത	ත
th	th	t̪ʰ	थ	থ	ਥ	થ	ଥ	த	థ	ಥ	ഥ	ථ
d	d	d̪	द	দ	ਦ	દ	ଦ	ட	ద	ದ	ദ	ද
dh	dh	d̪ʰ	ध	ধ	ਧ^[5]	ધ	ଧ	-	ధ	ಧ	ധ	ධ
n	n	n̪/n^[6]	न	ন	ਨ	ન	ନ	ந	న	ನ	ന	න
ṉ	n	n	ऩ	ন়	ਨ਼	ન઼	-	ன	-	-	^[7]	න.^[8]
p	p	p	प	প	ਪ	પ	ପ	ப	ప	ಪ	പ	ප
ph	ph	pʰ	फ	ফ	ਫ	ફ	ଫ	-	ఫ	ಫ	ഫ	ඵ
b	b	b	ब	ব	ਬ	બ	ବ	ப	బ	ಬ	ബ	බ
bh	bh	bʱ	भ	ভ	ਭ^[9]	ભ	ଭ	-	భ	ಭ	ഭ	භ
m	m	m	म	ম	ਮ	મ	ମ	ம	మ	ಮ	മ	ම
y	y	j	य	য	ਯ	ય	ଯ	ய	య	ಯ	യ	ය
r	r	r/ɾ^[10]	र	র/ৰ^[11]	ਰ	ર	ର	ர	ర	ರ	ര	ර
ṟ	r	r	ऱ	-	ਰ਼	ર઼	-	ற	ఱ	ಱ	റ	ර.^[12]
r̆^[13]	r	r	र्‍	-	-	-	-	-	-	-	-	-
l	l	l	ल	ল	ਲ	લ	ଲ	ல	ల	ಲ	ല	ල
ḷ	l	ɭ	ळ	-	ਲ਼	ળ	ଳ	ள	ళ	ಳ	ള	ළ
ḻ	l	ɻ	ऴ	-	-	ળ઼	-	ழ	-	-	ഴ	ළ.^[14]
v	v	ʋ/w^[15]	व	ৱ^[16]	ਵ	વ	-	வ	వ	ವ	വ	ව
ś	sh	ɕ	श	শ	ਸ਼	શ	ଶ	-	శ	ಶ	ശ	ශ
ṣ	sh	ʂ	ष	ষ	-	ષ	ଷ	ஷ	ష	ಷ	ഷ	ෂ
s	s	s	स	স	ਸ	સ	ସ	ஸ	స	ಸ	സ	ස
h	h	ɦ	ह	হ	ਹ^[17]	હ	ହ	ஹ	హ	ಹ	ഹ	හ
q	q	q	क़	ক়	ਕ਼	ક઼	କ଼	-	-	-	-	-
ḵẖ	kh	x	ख़	খ়	ਖ਼	ખ઼	ଖ଼	-	-	-	-	-
ġ	g	ɣ	ग़	গ়	ਗ਼	ગ઼	ଗ଼	-	-	-	-	-
z	z	z	ज़	জ়	ਜ਼	જ઼	ଜ଼	-	-	-	-	-
ṛ	r	ɽ	ड़	ড়	ੜ	ડ઼	ଡ଼	-	-	-	-	-
ṛh	rh	ɽʱ	ढ़	ঢ়	ੜ੍ਹ	ઢ઼	ଢ଼	-	-	-	-	-
f	f	f	फ़	ফ়	ਫ਼	ફ઼	ଫ଼	-	-	-	ഫ	ෆ
ẏ	y	j	य़	য়	ਯ਼	ય઼	ୟ	-	-	-	-	-
t̤	t	t̪	त़	ত়	ਤ਼	ત઼	ତ଼	-	-	-	-	-
s̤	s	s	स़	স়	-	સ઼	ସ଼	-	-	-	-	-
h̤	h	ɦ	ह़	হ়	ਹ਼	હ઼	ହ଼	-	-	-	-	-
w	w	w	व़	র^[18]	ਵ਼	વ઼	ୱ	-	-	-	-	-
ṯ	t	t	-	-	-	-	-	-	-	-	റ്റ^[19]	-

^ [2][3][4][5][9] See the special notes for Punjabi, specifically the section on voiced aspirates.
^ [6] In Indo-Aryan languages, this letter is theoretically a dental nasal but is actually pronounced as an alveolar. In Tamil and Malayalam it is a dental nasal, and the alveolar nasal has a separate letter (ṉ; see the note below).
^ [7] This letter is obsolete. See the Malayalam language article for further details.
^ [10] In languages that contrast two rhotic consonants, this is generally [ɾ]. In Indo-Aryan languages that do not make this distinction but have [ɾ] and [r] as allophones, the /r/ phoneme is generally pronounced [ɾ] when following a voiced consonant (although there are exceptions, such as the consonant j /ʤ/) and [r] in most other environments.
^ [13] Use when the distinction between the reph and eyelash forms of Ra is required; otherwise transliterate as 'r'.
^ [8][12][14] Used when writing Tamil in Sinhala script.
^ [11] Use র for Bengali and Manipuri, and ৰ for Assamese.
^ [16] Assamese and Manipuri only.
^ [15] May be pronounced 'w' in some languages.
^ [17] See the special notes for Punjabi, specifically the section on 'ha'.
^ [18] Need further info on Ba + Nukta being used as Ra and Wa.
^ [19] This is the symbol for the geminate consonant; the letter for the single [t] has become obsolete.
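The Simplified columns of the vowel and consonant tables mostly drop diacritics, with a few rewrites (c to ch, ch to chh, ś and ṣ to sh, r̥ to ri, æ to ae). The Python sketch below illustrates that reading of the tables; the function name iso_to_simplified is hypothetical and lowercase input is assumed. Where the tables give no simplified form (for example l̥, marked '?'), the sketch simply strips the diacritic, which is an assumption rather than a rule of the scheme.

```python
import re
import unicodedata

# Rewrites that do more than drop a diacritic, read off the Simplified
# columns of the tables. Patterns are written in NFD (decomposed) form.
_SPECIAL = {
    "r\u0325": "ri",  # vocalic r, short or long (the macron is stripped later)
    "ch": "chh",      # ch -> chh
    "c": "ch",        # c -> ch
    "s\u0301": "sh",  # s with acute -> sh
    "s\u0323": "sh",  # s with dot below -> sh
    "\u00e6": "ae",   # ae ligature, short or long -> ae
}
# Longest alternatives first so that "ch" is tried before "c".
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(_SPECIAL, key=len, reverse=True))
)


def iso_to_simplified(text):
    """Reduce lowercase ISO 15919 text to the simplified form (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    rewritten = _PATTERN.sub(lambda match: _SPECIAL[match.group(0)], decomposed)
    # Every remaining combining mark is dropped (macrons, dots, tildes, ...).
    return "".join(ch for ch in rewritten if not unicodedata.combining(ch))
```

A single regular-expression pass is used instead of chained str.replace calls so that the output of one rule ("ch" producing "chh") is never rewritten again by another ("c" producing "ch").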

Sinhalese half-nasals

ISO 15919	Simplified	IPA	Sinhala
n̆g	ng	ng	ඟ
jñ^[20]	jn	ʤɲ	ඥ
n̆j	nj	nʤ	ඦ
n̆ḍ	nd	nd	ඬ
n̆d	nd	nð	ඳ
m̆b	mb	mb	ඹ

^ This character is technically a conjunct, but is encoded separately in Unicode.

Sindhi/Western Punjabi consonants

ISO 15919	Simplified	IPA	Devanagari	Gurmukhi
gg^[21]	gg	ɠ	ॻ (ग॒)	ੱਗ
jj^[22]	jj	ʄ	ॼ (ज॒)	ੱਜ
ḍḍ^[23]	dd	ɗ̢	ॾ (ड॒)	ੱਡ
bb^[24]	bb	ɓ	ॿ (ब॒)	ੱਬ

^ [21] Represents Sindhi ggē (ڳ) or Western Punjabi ggāf (ڰ).
^ [22] Represents Sindhi/Western Punjabi jjē (ڄ).
^ [23] Represents Sindhi dd.ē (ڏ) or Western Punjabi dd.āl (ڋ).
^ [24] Represents Sindhi/Western Punjabi bbē (ٻ).

Special notes for Punjabi

Punjabi is unusual among Indo-European languages in that tone is a prominent feature of speech. As a result, the IPA values given in the tables above are not accurate for Punjabi. Fortunately, tone correlates directly with certain aspirated consonants and with the use of subscript /ha/.

Voiced aspirates

The consonants that represent voiced aspirates in other Indian languages are not pronounced as such in Punjabi, where they instead mark changes in tone. The table below shows how each consonant is pronounced according to its position within a word.

Consonant	Beginning of word	All other positions
ਘ	ਕ [k]	ਗ [g]
ਝ	ਚ [ʧ]	ਜ [ʤ]
ਢ	ਟ [ʈ]	ਡ [ɖ]
ਧ	ਤ [t̪]	ਦ [d̪]
ਭ	ਪ [p]	ਬ [b]

At the beginning or middle of a word, a voiced aspirate indicates a low tone on the following vowel. Examples:

ਘੋੜਾ [gʱoːɽaː] is actually pronounced [kòːɽaː]
ਪਘਾਰਨਾ [pəgʱaːrnaː] is actually pronounced [pəgàːrnaː]
ਮਘਾਣਾ [məgʱaːɳaː] is actually pronounced [məgàːɳaː]

At the end of a word (stem-final), a voiced aspirate indicates a high tone on the preceding vowel. Examples:

ਕੁਝ [kuʤʱ] is actually pronounced [kúʤ]
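The table and the two tone rules above can be combined into one lookup. The following Python sketch is illustrative only: the function name punjabi_realisation and the position labels "initial", "medial" and "final" are hypothetical, and the caller is assumed to know where the consonant sits in the word.

```python
# Rows of the table above: consonant -> (word-initial sound, sound elsewhere).
VOICED_ASPIRATES = {
    "\u0a18": ("k", "g"),              # GURMUKHI LETTER GHA
    "\u0a1d": ("ʧ", "ʤ"),              # GURMUKHI LETTER JHA
    "\u0a22": ("ʈ", "ɖ"),              # GURMUKHI LETTER DDHA
    "\u0a27": ("t\u032a", "d\u032a"),  # GURMUKHI LETTER DHA (dental t / d)
    "\u0a2d": ("p", "b"),              # GURMUKHI LETTER BHA
}


def punjabi_realisation(consonant, position):
    """Return (IPA consonant, tone effect) for a voiced-aspirate letter.

    position is one of "initial", "medial" or "final" (hypothetical labels).
    """
    initial_sound, other_sound = VOICED_ASPIRATES[consonant]
    if position == "initial":
        return initial_sound, "low tone on following vowel"
    if position == "medial":
        return other_sound, "low tone on following vowel"
    if position == "final":
        return other_sound, "high tone on preceding vowel"
    raise ValueError("position must be 'initial', 'medial' or 'final'")
```

For example, word-initial ਘ yields [k] with a low tone on the following vowel, which matches the first example above.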

Ha

At the beginning of a word, ਹ indicates [ha].

In the middle or at the end of a word, ha indicates a high tone on the preceding vowel. Examples:

ਚਾਹ [ʧaːh] is actually pronounced [ʧáː]

Subscript ha also indicates a high tone on the preceding vowel. Examples:

ਪੜ੍ਹ [pəɽʱ] is actually pronounced [pə́ɽ]

The following conventions apply everywhere except at the beginning of a word:

ਿਹ converts into a high tone ੇ (e.g. ਸਿਹਤ is pronounced ਸੇਤ [séːt̪]).
ੁਹ converts into a high tone ੋ (e.g. ਸੁਹਣਾ is pronounced ਸੋਣਾ [sóːɳaː]).
ਹਿ converts into a high tone ੈ (e.g. ਸ਼ਹਿਰ is pronounced ਸ਼ੈਰ [ɕǽr]).
ਹੁ converts into a high tone ੌ (e.g. ਬਹੁਤ is pronounced ਬੌਤ [bɔ́t̪]).
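The four conventions above amount to a small substitution table over Gurmukhi code points. The Python sketch below applies them to a single word and returns the respelling used in the examples; the function name respell_ha is hypothetical, and the high tone itself is not marked in the output, only implied on the substituted vowel.

```python
import re

# Vowel sign + ha (or ha + vowel sign) -> the vowel sign that replaces it.
HA_SEQUENCES = {
    "\u0a3f\u0a39": "\u0a47",  # sign i + ha -> sign ee
    "\u0a41\u0a39": "\u0a4b",  # sign u + ha -> sign oo
    "\u0a39\u0a3f": "\u0a48",  # ha + sign i -> sign ai
    "\u0a39\u0a41": "\u0a4c",  # ha + sign u -> sign au
}
_HA_PATTERN = re.compile("|".join(HA_SEQUENCES))


def respell_ha(word):
    """Respell one Gurmukhi word according to the conventions above."""
    # The first code point is left alone, so a word-initial ha is never changed.
    head, tail = word[:1], word[1:]
    return head + _HA_PATTERN.sub(lambda match: HA_SEQUENCES[match.group(0)], tail)
```

Holding back the first code point is a simple way to honour the "except at the beginning of a word" condition; a fuller treatment would need proper word segmentation.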

References

Teach Yourself Panjabi ISBN 1-07143161-6 (p16, 19-21)

Nasalisation

ISO 15919	IPA	Devanagari	Bengali	Gurmukhi	Gujarati	Oriya	Tamil	Telugu	Kannada	Malayalam	Sinhala
ṁ^[28]	?	ं	ং	ਂ	ં	ଂ	ஂ	ం	ಂ	ം^[29]	ං
ṃ^[30]	?	-	-	ੰ	-	-	-	-	-	-	-
m̐^[31]	?	ँ	ঁ	ਁ	ઁ	ଁ	-	-	-	-	-

^ The signs ṁ and ṃ are essentially identical. However, Gurmukhi has two separate nasal characters and if this distinction is to be retained separate identifiers must be used.

^ In Malayalam, this sign is transliterated as 'm' at the end of a word. Malayalam has no phonemic nasalisation, so the sign indicates nasalisation only when Malayalam script is used to write Sanskrit. Otherwise it represents consonantal /m/ or consonantal /ŋ/ (without the inherent vowel), mostly in borrowed Sanskrit words that originally had nasalisation. Some of these borrowings are pronounced with /m/ and others with /ŋ/; by analogy, the sign has come to represent these phonemes in native words as well when the vowel is suppressed (otherwise the normal letters are used).

Should we include this point? When used with a semi-vowel (y, r, l, ḷ or v), candrabindu is placed before the semi-vowel. For example, यँ is written m̐ya and not yam̐.

The standard nasal signs (ṁ and ṃ) should be used only at the end of a word, or where it is essential to preserve the distinction between Bindi and Tippi in Gurmukhi. Otherwise, the following rules apply:

When followed by	ISO 15919	IPA
k, kh, g, gh or ṅ; q, ḵẖ or ġ	ṅ	ŋ
c, ch, j, jh or ñ; z	ñ	ɲ
ṭ, ṭh, ḍ, ḍh or ṇ	ṇ	ɳ
t, th, d, dh or n	n	n
p, ph, b, bh or m; f	m	m
y, r, l, v, ś, ṣ, s or h; ẏ	n	n
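The table above can be read as a lookup on the consonant that follows the nasal sign. The Python sketch below is one possible reading, not a definitive implementation: the function name assimilate_nasals is hypothetical, and lowercase ISO 15919 input is assumed. A nasal sign is left unchanged at the end of a word, before a vowel, and before any consonant the table does not list (including ṛ and ṛh). It should not be applied where the Bindi/Tippi distinction must be kept.

```python
import unicodedata


def _nfc(text):
    return unicodedata.normalize("NFC", text)


# Rows of the table above: (replacement nasal, following consonants).
NASAL_RULES = [
    ("ṅ", ["k", "kh", "g", "gh", "ṅ", "q", "ḵẖ", "ġ"]),
    ("ñ", ["c", "ch", "j", "jh", "ñ", "z"]),
    ("ṇ", ["ṭ", "ṭh", "ḍ", "ḍh", "ṇ"]),
    ("n", ["t", "th", "d", "dh", "n"]),
    ("m", ["p", "ph", "b", "bh", "m", "f"]),
    ("n", ["y", "r", "l", "v", "ś", "ṣ", "s", "h", "ẏ"]),
]
# Flatten to (consonant, nasal) pairs, longest consonant first.
_FOLLOWERS = sorted(
    ((_nfc(cons), _nfc(nasal)) for nasal, group in NASAL_RULES for cons in group),
    key=lambda pair: len(pair[0]),
    reverse=True,
)
_NASAL_SIGNS = {"\u1e41", "\u1e43"}  # m with dot above, m with dot below


def assimilate_nasals(text):
    """Replace a nasal sign by the nasal that the table assigns (hypothetical helper)."""
    text = _nfc(text)
    out = []
    for index, ch in enumerate(text):
        if ch in _NASAL_SIGNS:
            rest = text[index + 1:]
            for cons, nasal in _FOLLOWERS:
                if rest.startswith(cons):
                    following = rest[len(cons):len(cons) + 1]
                    # A trailing combining mark means the letter is one the
                    # table does not list (e.g. vocalic r), so leave the sign.
                    if not (following and unicodedata.combining(following)):
                        ch = nasal
                    break
        out.append(ch)
    return "".join(out)
```

Both sides are normalised to NFC so that letters such as ṭ and ṅ are single code points and a plain "t" or "n" cannot match them by accident.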

Not sure about ṛ and ṛh...

Also, should nasalisation always be written as /ⁿ/ ?

Other Signs

TODO Talk about Nukta on its own, Visarga, Avagraha, and other signs (Om, Ek Onkar, etc.).

References

Script specific resources
