Tamil script
From Wikipedia, the free encyclopedia
Tamil | ||
---|---|---|
Type: | Abugida | |
Languages: | Saurashtra, Sanskrit, Tamil | |
Time period: | ||
ISO 15924 code: | Taml | |
Note: This page may contain IPA phonetic symbols in Unicode. See IPA chart for English for an English-based pronunciation key. |
The Tamil script (or vaṭṭeḻuttu "rounded writing") is an Indic script that is used to write the Tamil language. With the use of special diacritics to represent aspirated and voiced consonants not represented in the basic script, it is also used to write Saurashtra and, by Tamils, to write Sanskrit.
Overview
Characteristics
The Tamil script has 12 vowels (uyireḻuttu "soul-letters"), 18 consonants (meyyeḻuttu "body-letters") and one character, the aytam, which is classified in Tamil grammar as being neither a consonant nor a vowel (aliyeḻuttu "the hermaphrodite letter"). The script, however, is syllabic and not alphabetic. The complete script, therefore, consists of the 31 letters in their independent form, and an additional 216 combinant letters (uyirmeyyeḻuttu) representing every possible combination of a vowel and a consonant. These combinant letters are formed by adding a vowel marker to the consonant. Some vowels require the basic shape of the consonant to be altered in a way that is specific to that vowel. Others are written by adding a vowel-specific suffix to the consonant, yet others a prefix, and finally some vowels require adding both a prefix and a suffix to the consonant. In every case the vowel marker is different from the standalone character for the vowel.
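The count of 216 combinant letters can be checked with a short sketch. It assumes the Unicode encoding of Tamil, in which a combinant letter is the consonant followed by a dependent vowel sign; the names `CONSONANTS`, `VOWEL_SIGNS` and `combinant_letters` are illustrative only.

```python
# The 18 consonant letters in their basic form (each carries the inherent vowel a).
CONSONANTS = "கஙசஞடணதநபமயரலவழளறன"

# Dependent vowel signs in the order a, ā, i, ī, u, ū, e, ē, ai, o, ō, au.
# The inherent a has no sign of its own, hence the empty string.
VOWEL_SIGNS = ["", "\u0bbe", "\u0bbf", "\u0bc0", "\u0bc1", "\u0bc2",
               "\u0bc6", "\u0bc7", "\u0bc8", "\u0bca", "\u0bcb", "\u0bcc"]


def combinant_letters():
    """Return every consonant + vowel combination as a Unicode string."""
    return [c + sign for c in CONSONANTS for sign in VOWEL_SIGNS]


letters = combinant_letters()
print(len(letters))             # 216 = 18 consonants x 12 vowels
print(" ".join(letters[:12]))   # the twelve forms built on the letter k
```

How each combination is drawn (prefix, suffix, both, or an altered consonant shape) is decided by the font, not by the code point sequence.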
The Tamil script is an abugida, in that the basic form of the symbol for every consonant has an inherent following vowel a, and must be modified not only to replace the inherent vowel with a different one, but also to produce a pure consonant without the inherent a. Thus, for example, the basic form of the letter k is க ka. The pure consonant k is written க், with an added marker that suppresses the inherent vowel. The sign used to suppress the inherent trailing vowel is always an overdot (see image), called puḷḷi in Tamil.
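In Unicode text, the puḷḷi is a separate code point (U+0BCD, named TAMIL SIGN VIRAMA) that follows the consonant. A minimal sketch, in which the helper name `pure_consonant` is an assumption made for illustration:

```python
import unicodedata

PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA, the overdot that suppresses the inherent a


def pure_consonant(base: str) -> str:
    """Append the puḷḷi to a basic consonant letter, e.g. ka -> k."""
    return base + PULLI


ka = "\u0b95"            # க, the letter k with its inherent vowel a
k = pure_consonant(ka)   # க், the pure consonant k

# Show the code points that make up the pure consonant.
for ch in k:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```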
The Tamil script is written from left to right.
History
The Tamil script, like the other Indic scripts, is thought to have evolved from the Brahmi script. Archaeologists from the Archaeological Survey of India have recently suggested, tentatively, that graffiti etched into a potsherd provisionally dated to the 5th century BC is an example of a very rudimentary form of Tamil writing[1]. However, the earliest inscriptions which are accepted examples of Tamil writing date to a time just after the Asokan period.
The script used by the earliest accepted inscriptions is commonly known as the Tamil Brahmi or Tamili script, and differs in many ways from standard Asokan Brahmi. For example, as the chart to the right shows, early Tamil Brahmi, unlike Asokan Brahmi, had a system to distinguish between pure consonants (m in this example) and consonants with an inherent vowel (ma in this example). In addition, early Tamil Brahmi used slightly different vowel markers, and had extra characters to represent letters not found in Sanskrit.
Inscriptions from the 2nd century AD use a later form of the Tamil Brahmi script, which is substantially similar to the writing system described in the Tolkappiyam, an ancient Tamil grammar. Most notably, they use the puḷḷi to suppress the inherent vowel. The Tamil letters thereafter evolved towards a more rounded form, and by the 5th or 6th century AD had reached a form called the early vaṭṭeḻuttu, the immediate ancestor of the vaṭṭeḻuttu ("rounded writing") script in use today. The rounded shape of the letters is partly the result of the fact that in ancient times, writing involved using a sharp-pointed stylus to carve the letters on palm leaves (olaiccuvaṭi), a process which made it easier to produce curves than straight lines. Some scholars state that the script was originally called veṭṭeḻuthu, meaning "script that was cut" (on stone), a name that reflects the ease with which it could be carved in stone.
In addition to producing rounder letters, the use of palm leaves as the primary medium for writing led to other changes in the Tamil script. The scribe had to be careful not to pierce the leaves with the stylus while writing, because a leaf with a hole was likelier to tear and decayed faster. As a result, the use of the puḷḷi to distinguish pure consonants became rare, with pure consonants usually being written as if the inherent vowel were present. Similarly, the vowel marker for the kuṟṟiyal ukaram, a half-rounded u which occurs at the end of some words and in the medial position in certain compound words, also fell out of use and was replaced by the marker for the simple u. The puḷḷi did not fully reappear until the introduction of printing, but the marker for the kuṟṟiyal ukaram never came back into use, although the sound itself still exists and plays an important role in Tamil prosody.
The forms of some of the letters were simplified in the 19th century to make the script easier to typeset. In the 20th century, the script was simplified even further in a series of reforms, which regularised the vowel markers used with consonants by eliminating special markers and most irregular forms.
Relationship with other Indic scripts
The Tamil script differs from other Brahmi-derived scripts in a number of ways. Unlike every other Indic script, it uses the same character to represent both an unvoiced stop and its voiced equivalent. Thus the character க் k, for example, represents both [k] and [g]. This is because Tamil grammar treats only unvoiced stops as being "true" consonants, treating voiced and aspirated sounds as euphonic variants of unvoiced sounds. Traditional Tamil grammars contain detailed rules, observed in formal speech, for when a stop is to be pronounced with and without voice. These rules are not followed in colloquial or dialectal speech, where voiced and unvoiced versions of a stop are, in effect, allophones, being used in specific phonetic contexts, without serving to distinguish words.
Also unlike other Indic scripts, the Tamil script does not use special consonantal ligatures to represent conjunct consonants, which are far less frequent in Tamil than in other Indian languages. Conjunct consonants, where they occur, are written by writing the character for the first consonant, adding the puḷḷi to suppress its inherent vowel, and then writing the character for the second consonant.
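The rule for writing conjunct consonants can be sketched at the level of Unicode code points. The function name `write_cluster` is hypothetical, and the sketch only builds the character sequence; how a font draws that sequence is a separate matter.

```python
PULLI = "\u0bcd"  # the puḷḷi (Unicode name: TAMIL SIGN VIRAMA)


def write_cluster(consonants):
    """Join basic consonant letters into a cluster.

    Every consonant except the last receives a puḷḷi to suppress its
    inherent vowel; the last one keeps its basic form.
    """
    if not consonants:
        return ""
    *initial, last = consonants
    return "".join(c + PULLI for c in initial) + last


# k + ṣa: the first consonant takes the puḷḷi, the second is written plainly.
print(write_cluster(["\u0b95", "\u0bb7"]))

# A doubled consonant is written the same way: p + pa.
print(write_cluster(["\u0baa", "\u0baa"]))
```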
The Tamil letters
Basic Consonants
Consonants are called the 'body' (mei) letters. The consonants are classified into three categories: vallinam or the hard consonants, mellinam or the soft consonants (including all nasals), and idaiyinam or medium consonants.
There are some lexical rules for the formation of words, which the Tolkāppiyam describes. For example, a word cannot end in certain consonants, and cannot begin with some consonants, including 'r', 'l' and 'll'. There are also two consonants for the dental 'n'; which one should be used depends on whether the 'n' occurs at the start of the word and on the letters around it.
Consonant | Transliteration | Category | IPA |
---|---|---|---|
க் | k | vallinam | [k], [g], [x], [ɣ], [h] |
ங் | ṅ | mellinam | [ŋ] |
ச் | c | vallinam | [ʧ], [ʤ], [ʃ], [ʒ] |
ஞ் | ñ | mellinam | [ɲ] |
ட் | ṭ | vallinam | [ʈ], [ɖ], [ɽ] |
ண் | ṇ | mellinam | [ɳ] |
த் | t | vallinam | [t̪], [d̪], [ð] |
ந் | n | mellinam | [n] |
ப் | p | vallinam | [p], [b], [β] |
ம் | m | mellinam | [m] |
ய் | y | idaiyinam | [j] |
ர் | r | idaiyinam | [ɾ] |
ல் | l | idaiyinam | [l] |
வ் | v | idaiyinam | [ʋ] |
ழ் | ẓ, ḻ, ṛ | idaiyinam | [ɻ] |
ள் | ḷ | idaiyinam | [ɭ] |
ற் | ṟ, R | vallinam | [r], [t], [d] |
ன் | ṉ, N | mellinam | [n] |
Borrowed consonants
Also called Grantha letters, these are used exclusively for writing words borrowed from Sanskrit, English, and other languages. Not every borrowed word contains one of these letters.
Consonant | Transliteration | IPA |
---|---|---|
ஜ | j | [ʤ] |
ஷ | ṣ | [ʂ] |
ஸ | s | [s] |
ஹ | h | [h] |
க்ஷ | kṣ | [kʂ] |
Vowels
Vowels are also called the 'life' (uyir) or 'soul' letters. Together with the consonants (which are called 'body' letters), they form compound, syllabic (abugida) letters that are called 'living' letters (uyirmei, i.e. letters that have both 'body' and 'soul').
Tamil vowels are divided into short and long (five of each type) and two diphthongs.
Isolated Form
Vowel | Transliteration | IPA |
---|---|---|
அ | a | [ɐ] |
ஆ | ā | [ɑː] |
இ | i | [i] |
ஈ | ī | [iː] |
உ | u | [u], [ɯ] |
ஊ | ū | [uː] |
எ | e | [e] |
ஏ | ē | [eː] |
ஐ | ai | [ɐj] |
ஒ | o | [o] |
ஓ | ō | [oː] |
ஔ | au | [ɐʋ] |
Compound Form
The following table uses the consonant 'k' as an example.
Compound form | Transliteration | IPA |
---|---|---|
க | ka | [kɐ] |
கா | kā | [kɑː] |
கி | ki | [ki] |
கீ | kī | [kiː] |
கு | ku | [ku], [kɯ] |
கூ | kū | [kuː] |
கெ | ke | [ke] |
கே | kē | [keː] |
கை | kai | [kɐj] |
கொ | ko | [ko] |
கோ | kō | [koː] |
கௌ | kau | [kɐʋ] |
The special letter ஃ (pronounced 'akh') is rarely used by itself. It normally serves a purely grammatical function as the independent vowel form of the dot on consonants that suppresses the inherent 'a' sound in plain consonants.
The long (nedil) vowels are about twice as long as the short (kuRil) vowels. The diphthongs are usually pronounced about 1.5 times as long as the short vowels, though some grammatical texts place them with the long (nedil) vowels.
As can be seen in the compound form, the vowel sign can be added to the right, left or both sides of the consonants. It can also form a ligature. These rules are evolving and older use has more ligatures than modern use. What you actually see on this page depends on your font selection; for example, Code 2000 will show more ligatures than Latha.
There are proponents of script reform who want to eliminate all ligatures and let all vowel signs appear on the right side.
Unicode encodes the character in logical order (always the consonant first), whereas legacy 8-bit encodings (like TSCII) prefer the written order. This makes it necessary to reorder when converting from one encoding to another; it is not sufficient simply to map one set of codepoints to the other.
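The reordering step can be sketched as follows. This is a minimal illustration and not a TSCII converter: the function name `logical_to_visual` and both lookup tables are assumptions made for the example, and the result is kept as Unicode code points in written order, whereas a real converter would go on to map each item to the byte values of the legacy encoding. Consonant clusters and other edge cases are ignored.

```python
# Vowel signs that are written to the left of the consonant: e, ē, ai.
LEFT_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}

# Two-part vowel signs, split into (left part, right part): o, ō, au.
SPLIT_SIGNS = {
    "\u0bca": ("\u0bc6", "\u0bbe"),
    "\u0bcb": ("\u0bc7", "\u0bbe"),
    "\u0bcc": ("\u0bc6", "\u0bd7"),
}


def is_consonant(ch: str) -> bool:
    """True for code points in the Tamil consonant range U+0B95..U+0BB9."""
    return "\u0b95" <= ch <= "\u0bb9"


def logical_to_visual(text: str) -> str:
    """Move left-side vowel signs in front of the consonant they follow."""
    out = []
    for ch in text:
        after_consonant = bool(out) and is_consonant(out[-1])
        if after_consonant and ch in LEFT_SIGNS:
            out.insert(len(out) - 1, ch)
        elif after_consonant and ch in SPLIT_SIGNS:
            left, right = SPLIT_SIGNS[ch]
            out.insert(len(out) - 1, left)
            out.append(right)
        else:
            out.append(ch)
    return "".join(out)


kai = "\u0b95\u0bc8"  # logical order: consonant k, then the vowel sign ai
print([f"U+{ord(c):04X}" for c in logical_to_visual(kai)])  # ['U+0BC8', 'U+0B95']
```

A one-to-one code point table cannot express this, because the position of the vowel sign relative to the consonant changes, and the two-part signs become two separate items.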
Tamil in Unicode
The Unicode range for Tamil is U+0B80 ... U+0BFF.
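A program can use this range to test whether a character belongs to the Tamil block. The sketch below is illustrative; the names `TAMIL_BLOCK`, `is_tamil` and `count_tamil` are assumptions, and membership in the block does not mean that a code point is actually assigned to a character.

```python
# U+0B80 through U+0BFF inclusive, so the upper bound of the range is 0x0C00.
TAMIL_BLOCK = range(0x0B80, 0x0C00)


def is_tamil(ch: str) -> bool:
    """True if the single character ch lies in the Tamil Unicode block."""
    return ord(ch) in TAMIL_BLOCK


def count_tamil(text: str) -> int:
    """Count the code points in text that fall inside the Tamil block."""
    return sum(1 for ch in text if is_tamil(ch))


print(is_tamil("\u0b95"))             # True
print(is_tamil("k"))                  # False
print(count_tamil("\u0b95\u0bcd k"))  # 2
```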
Please note that the characters shown below are only one interpretation of the Unicode code points. The Unicode encoding of Tamil does not stipulate any mutilation or alteration of the Tamil characters of the kind this interpretation may show, and the characters below are not authorised by any relevant government or educational authority.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
B80 | | | ஂ | ஃ | | அ | ஆ | இ | ஈ | உ | ஊ | | | | எ | ஏ | |
B90 | ஐ | | ஒ | ஓ | ஔ | க | | | | ங | ச | | ஜ | | ஞ | ட | |
BA0 | | | | ண | த | | | | ந | ன | ப | | | | ம | ய | |
BB0 | ர | ற | ல | ள | ழ | வ | ஶ | ஷ | ஸ | ஹ | | | | | ா | ி | |
BC0 | ீ | ு | ூ | | | | ெ | ே | ை | | ொ | ோ | ௌ | ் | | | |
BD0 | ௐ | | | | | | | ௗ | | | | | | | | | |
BE0 | | | | | | | ௦ | ௧ | ௨ | ௩ | ௪ | ௫ | ௬ | ௭ | ௮ | ௯ | |
BF0 | ௰ | ௱ | ௲ | ௳ | ௴ | ௵ | ௶ | ௷ | ௸ | ௹ | ௺ | | | | | |
See also
Languages | Kannada - Kodava Takk - Malayalam - Tamil - Telugu - Tulu | ||
Script | Kannada script - Malayalam script - Tamil script - Telugu script - Tulu script | ||
Literature | Kannada literature - Malayalam literature - Tamil literature - Telugu literature - Tulu literature | ||
People | Kannada people - Kodava people - Malayali people - Tamil people - Telugu people - Tulu people | ||
Music | Carnatic Music - Ancient Tamil music | ||
States | Andhra Pradesh - Karnataka - Kerala - Tamil Nadu | ||
Related | South India - South Indian culture - Self-respect movement |
External links
- Tamil Alphabet & Basics - (PDF)
- Phonetics of spoken Tamil
- Unicode Character
- Unicode Chart - For Tamil (PDF)
- The Unicode Book: Chapter 9 - South and Southeast Asian Scripts (PDF)
- Unicode Converter - Online JavaScript tool to convert text in various Tamil encodings into Unicode
- Tamil script and language - From Omniglot
- NLS Information - NLS information page for Windows XP
- Tamil fonts - Links to download various Tamil fonts
- Sooriyan.com - A free Unicode Tamil font
- thamizhlinux.org - A community website for Linux and Open Source, in Tamil
- Transliterator - A means to transliterate romanized text to Unicode Tamil.
- [2] - A collection of links to learn Tamil.
- Windows Bamini Keyboard - Windows program to type in Tamil Unicode using Bamini keyboard layout.
- Tamil Studies Conference 2006: Tropes, Territories and Competing Realities
- Encoding converters for various encodings of Tamil
- Tamil Unicode Fonts And Software - Tamil Unicode fonts, encoding solutions and software. Founded by Mr Naa Govindasamy.
References
- Steever, Sanford B. (1996) "Tamil Writing" in Peter T. Daniels and William Bright (eds.) The World's Writing Systems. New York: Oxford University Press. ISBN 0-19-507993-0