Romanization of Bengali
From Wikipedia, the free encyclopedia
The romanization of Bengali is the representation of the Bengali language in the Latin script. While different standards for romanization have been proposed for Bengali, they have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit[1]. Most standardized Bengali romanizations are adapted from standards proposed for Indic languages, and these models are compared below.
Transliteration vs Transcription
In the context of romanization, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (i.e. the pronunciation can be reproduced).
The distinction is important in Bengali because its orthography was adopted from Sanskrit and ignores several millennia of sound change. To some degree, all alphabetic writing systems differ from the way the language is pronounced, but the gap may be more extreme for many Indic languages. For example, the three letters শ, ষ, and স were distinct in Sanskrit, but over several centuries the standard pronunciation of Bengali (usually modeled on the Nadia dialect) has lost these distinctions (all three are usually pronounced as IPA: [ʃ]). The distinction nevertheless persists in orthography, leading to homophones (words that sound the same).
In written texts, it is easy to distinguish homophones such as শাপ shap "curse" and সাপ shap "snake". The distinction is particularly relevant when searching for a term (e.g. in an encyclopedia). Under a transcription model, both words would have the same written form (shap). Transliteration, on the other hand, would distinguish the two forms (e.g. Harvard-Kyoto: zAp and sAp respectively). There are many other instances where differently written words have the same sound.
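The effect on search can be illustrated with a minimal Python sketch. It indexes the two words from the paragraph above first by transcription and then by Harvard-Kyoto transliteration. The `WORDS` list and the `build_index` helper are hypothetical names introduced only for this illustration; the romanized forms are the ones quoted above.

```python
from collections import defaultdict

# Each entry: (Bengali spelling, meaning, Harvard-Kyoto transliteration, transcription).
# The romanized forms are taken from the paragraph above; the layout is illustrative.
WORDS = [
    ("শাপ", "curse", "zAp", "shap"),
    ("সাপ", "snake", "sAp", "shap"),
]

def build_index(entries, key_column):
    """Group meanings by the romanized form found in the given column."""
    index = defaultdict(list)
    for entry in entries:
        index[entry[key_column]].append(entry[1])
    return dict(index)

# Keyed by transcription, the two homophones share one index entry.
by_transcription = build_index(WORDS, 3)

# Keyed by transliteration, each spelling keeps its own entry.
by_transliteration = build_index(WORDS, 2)
```

In this sketch a lookup of "shap" in `by_transcription` returns both meanings, whereas `by_transliteration` keeps "zAp" and "sAp" apart, which is the property that makes transliteration the better fit for search.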
Occasionally, words written in the same way (homographs) may have different pronunciations for differing meanings: মত can mean "opinion" (pronounced môt), or "similar to" (môto). However, such instances are relatively rare, and most homographs are pronounced the same way.
Often, different phonemes (meaningful sound differences) are represented by the same grapheme (unit of written text). Thus, the vowel এ can represent either [e] (এল elo [elo] "came") or [ɛ] (এক êk [ɛk] "one"). This distinction is important for pronouncing the text correctly.
One reason that many commonly encountered romanizations of Bengali are transliterations rather than transcriptions is that transcription is notoriously difficult to standardize. Pronunciations vary across different language subgroups, and even for the same speakers in different social settings (registers), so the same unit of language (lexeme) may have many different transcriptions. Even a simple word like মন "mind" may be pronounced "mon" or "môn", and the final vowel may also be pronounced, giving "mônô" (e.g. in the Indian national anthem, Jana Gana Mana).
Finally, the determination of a single standard pronunciation is a difficult social challenge, since speakers of different varieties may feel disenfranchised. Orthography is comparatively easier to standardize owing to the weight of past usage.
The choice of the specific romanization used is related to the intent. For written texts that are intended to be referenced using search, or where sorting may be a consideration, or just to aid comprehension, transliteration is preferable. On the other hand, for representing a sound so that non-Bengali speakers can pronounce it easily, it may be better to use a transcription.
Comparison of Romanizations
Standard romanization schemes for Bengali are compared in the table below. Two kinds of standard are commonly used for the transliteration of Indic languages, including Bengali. Many standards (e.g. NLK / ISO) use diacritic marks and permit case markings for proper nouns. Newer forms (e.g. Harvard-Kyoto) are better suited to ASCII-derivative keyboards; they use upper- and lower-case letters contrastively and forgo the normal conventions of English capitalization.
- "NLK" stands for the diacritic-based letter-to-letter transliteration schemes, best represented by the National Library at Kolkata romanization, ISO 15919, or IAST. ISO 15919 is the ISO standard, and these schemes use diacritic marks (e.g. ā) to reflect the additional characters and sounds of Bengali letters.
- ITRANS is an ASCII representation for Sanskrit; it is one-to-many, i.e. there may be more than one way of transliterating characters, which can make internet searching more complicated. ITRANS representations forgo capitalization norms of English so as to be able to represent the characters using a normal ASCII keyboard.
- "HK" stands for two other case-sensitive letter-to-letter transliteration schemes: Harvard-Kyoto and XIAST scheme. These are similar to the ITRANS scheme, and use only one form for each character.
- "XHK" stands for the case-sensitive letter-to-letter Extended Harvard-Kyoto transliteration. This adds some specific characters for handling Bengali text to IAST.
- "Wiki" stands for a phonemic transcription-based romanization. It is a sound-preserving transcription based on what is perceived to be the standard pronunciation of the Bengali words, with no reference to how it is written in Bengali script. It uses diacritics often used by linguists specializing in Bengali (other than IPA), and is the transcription system used to represent Bengali sounds in Wikipedia articles.
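The search problem caused by the one-to-many nature of ITRANS can be reduced by folding alternative spellings into one canonical form before comparing strings. The following Python sketch assumes a small set of ITRANS alternatives (`aa`/`A`, `ii`/`I`, `uu`/`U` for long vowels, and `.m`/`.n`/`M` for the anusvara); the `normalize_itrans` function and its rule list are hypothetical and cover only these cases, not the full scheme.

```python
# Assumed ITRANS alternatives, each folded into a single-letter form.
# This list is illustrative and deliberately incomplete.
ITRANS_VARIANTS = [
    ("aa", "A"),   # long a
    ("ii", "I"),   # long i
    ("uu", "U"),   # long u
    (".m", "M"),   # anusvara
    (".n", "M"),   # anusvara (alternative spelling)
]

def normalize_itrans(text: str) -> str:
    """Rewrite the assumed ITRANS variants into one canonical spelling."""
    for variant, canonical in ITRANS_VARIANTS:
        text = text.replace(variant, canonical)
    return text

def same_word(a: str, b: str) -> bool:
    """Compare two ITRANS strings after normalization."""
    return normalize_itrans(a) == normalize_itrans(b)
```

With this normalization, `same_word("saapa", "sApa")` holds, so a search index built on the canonical form would find the word regardless of which variant the user typed.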
Examples
The following table gives examples of Bengali words romanized using the systems described above.
In orthography | Meaning | NLK | XHK | ITRANS | HK | Wiki | IPA |
---|---|---|---|---|---|---|---|
মন | mind | mana | mana | mana | mana | mon | [mon] |
সাপ | snake | sāpa | sApa | saapa | sApa | shap | [ʃap] |
শাপ | curse | śāpa | zApa | shaapa | zApa | shap | [ʃap] |
মত | opinion | mata | mata | mata | mata | môt | [mɔt̪] |
মত | like | mata | mata | mata | mata | môto | [mɔt̪o] |
তেল | oil | tēla | tela | tela | tela | tel | [t̪el] |
গেল | went | gēla | gela | gela | gela | gêlo | [gɛlo] |
জ্বর | fever | jvara | jvara | jvara | jvara | jôr | [dʒɔɹ] |
স্বাস্থ্য | health | svāsthya | svAsthya | svaasthya | svAsthya | shastho | [ʃast̪ʰo] |
বাংলাদেশ | Bangladesh | bāṃlādēśa | bAMlAdeza | baa.mlaadesha | bAMlAdeza | bangladesh | [baŋlad̪eʃ] |
ব্যঞ্জনধ্বনি | consonant | byañjanadhvani | byaJjanadhvani | bya~njanadhvani | byaJjanadhvani | bênjondhoni | [bɛndʒond̪ɦoni] |
আত্মহত্যা | suicide | ātmahatyā | AtmahatyA | aatmahatyaa | AtmahatyA | attõhotta | [at̪ːõhot̪ːa] |
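For the rows in this table, the NLK and HK columns differ only in a handful of characters, so one can be derived from the other by a character-for-character substitution. The Python sketch below encodes just the five correspondences visible in the table (ā, ē, ś, ṃ, ñ); the `NLK_TO_HK` mapping and the `nlk_to_hk` function are hypothetical names, and the mapping is not a complete implementation of either scheme.

```python
import unicodedata

# Correspondences observed in the examples table only.
NLK_TO_HK = str.maketrans({
    "\u0101": "A",  # ā -> A
    "\u0113": "e",  # ē -> e
    "\u015b": "z",  # ś -> z
    "\u1e43": "M",  # ṃ -> M
    "\u00f1": "J",  # ñ -> J
})

def nlk_to_hk(text: str) -> str:
    """Convert an NLK-style string to HK for the characters covered above."""
    # NFC composes base letter + combining mark into the single code
    # points that the mapping is keyed on.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(NLK_TO_HK)
```

Applied to the NLK column, this reproduces the HK column for the rows shown, for example `nlk_to_hk("bāṃlādēśa")` gives `bAMlAdeza`. Characters outside the mapping pass through unchanged, so a fuller converter would need the remaining letters of both schemes.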
Romanization Reference
The IPA (International Phonetic Alphabet) transcription is provided in the rightmost column, representing the most common pronunciation of the glyph in Standard Colloquial Bengali, alongside the various romanizations described above.
References
- ^ In Japanese, there is some debate over whether to mark certain distinctions, such as Tōhoku vs Tohoku. Sanskrit is well standardized because the speaking community is relatively small and sound change is not a large concern.