Turkish dotted and dotless I
The Turkish alphabet, which is a variant of the Latin alphabet, includes two distinct versions of the letter I, one dotted and the other dotless. Dotted and dotless "i" are used in the Turkish, Azerbaijani, Crimean Tatar and Tatar languages.
I ı represents the close back unrounded vowel sound (/ɯ/). Neither the upper nor the lower case version has a dot.
İ i represents the close front unrounded vowel sound (/i/). Both the upper and lower case versions have a dot.
Examples:
- İstanbul (starts with an i sound, not an ı).
- Diyarbakır (the first and last vowels are spelled and pronounced differently)
Consequence for ligatures
In several fonts, the common ligatures for "fi" and "ffi" make the dot of the letter "i" disappear by merging it with the dot-like end of the curve of the minuscule "f". These ligatures should be avoided when typesetting text in Turkish, where the merged form can be misread as a dotless ı.
In computing
In Unicode, U+0131 is the lower case dotless i (ı) and U+0130 is the capital I with dot (İ). ISO-8859-9 has them at positions 0xFD and 0xDD respectively. In normal typography, when a lower case i is combined with other diacritics, the dot is generally removed before the diacritic is added; however, Unicode still lists the equivalent combining sequences as including the dotted i, since logically it is the normal dotted i character that is being modified.
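The code points and byte positions can be checked with a short Python sketch. It relies only on the standard `unicodedata` module and Python's built-in `iso-8859-9` codec; the helper name `describe` is made up for this illustration.

```python
import unicodedata

DOTLESS_LOWER = "\u0131"  # ı
DOTTED_UPPER = "\u0130"   # İ


def describe(ch):
    """Return (Unicode name, ISO-8859-9 byte value) for a single character."""
    return unicodedata.name(ch), ch.encode("iso-8859-9")[0]


for ch in (DOTLESS_LOWER, DOTTED_UPPER):
    name, byte = describe(ch)
    print(f"U+{ord(ch):04X} {name}: 0x{byte:02X}")
# U+0131 LATIN SMALL LETTER DOTLESS I: 0xFD
# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE: 0xDD

# The canonical decomposition of an accented i keeps the ordinary dotted i
# as its base character, followed by the combining diacritic.
decomposed = unicodedata.normalize("NFD", "\u00ed")  # í
print([f"U+{ord(c):04X}" for c in decomposed])
# ['U+0069', 'U+0301']
```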
Software handling Unicode uppercasing and lowercasing will generally change ı to I and İ to i, but unless it is specifically set up for Turkish it will change I to i and i to I rather than I to ı and i to İ. This means that uppercasing followed by lowercasing can give a different result from lowercasing alone for texts that contain these characters.
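As an illustration, Python's built-in `str.upper()` and `str.lower()` apply the default, non-Turkish mappings, so the round-trip difference can be reproduced directly. This is only a sketch of the default behaviour in one language, and other software may differ in detail.

```python
word = "Diyarbakır"

lower_only = word.lower()
upper_then_lower = word.upper().lower()

print(lower_only)        # diyarbakır
print(word.upper())      # DIYARBAKIR  (both i and ı become I)
print(upper_then_lower)  # diyarbakir  (the dotless ı has been lost)
print(lower_only == upper_then_lower)  # False
```

Note that Python lowercases İ (U+0130) to an ordinary i followed by the combining dot above (U+0307), so `len("İ".lower())` is 2 rather than 1.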
In the Microsoft Windows SDK, beginning with Windows Vista, several relevant functions have a NORM_LINGUISTIC_CASING flag, to indicate that for Turkish and Azeri locales, I should map to ı and i to İ.
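The Turkish and Azeri mapping itself is small enough to sketch in plain Python. The helpers `turkish_upper` and `turkish_lower` below are hypothetical names for this illustration; they are not part of the Windows SDK or of Python's standard library, and they ignore edge cases such as a capital I followed by a combining dot above.

```python
def turkish_upper(text):
    """Uppercase with Turkish rules: i -> İ and ı -> I."""
    # Map dotted i first; str.upper() already maps ı (U+0131) to I.
    return text.replace("i", "\u0130").upper()


def turkish_lower(text):
    """Lowercase with Turkish rules: İ -> i and I -> ı."""
    # Handle both special letters before the default mapping runs.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


print(turkish_upper("Diyarbakır"))  # DİYARBAKIR
print(turkish_lower("ISPARTA"))     # ısparta
print(turkish_lower(turkish_upper("Diyarbakır")))  # diyarbakır
```

With these rules, uppercasing followed by lowercasing preserves the distinction between the two letters.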
In the LaTeX typesetting language the dotless i can be written with the backslash-i command: \i.
Dotless i (and dotted capital I) is also known for being handled badly under Turkish locales by several software products, including Oracle DBMS, Java (this bug in Java will be fixed in the upcoming Java 6.0 release), and Unixware 7, where application developers did not foresee the implicit capitalization of keywords, variables and table names. When applications written for such software act strangely, it is better to switch the locale to C or US English via system-wide or application-specific settings. Bugs should be logged in such situations and, if necessary, patches submitted by developers to the software involved.
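The kind of failure described here can be sketched as follows. The example assumes a hypothetical application that recognises keywords by uppercasing user input; `turkish_upper` stands in for a runtime whose uppercasing follows the Turkish locale, and the keyword set and function names are invented for this illustration rather than taken from any of the products named above.

```python
KEYWORDS = {"SELECT", "INSERT", "WHERE"}


def turkish_upper(text):
    """Stand-in for locale-sensitive uppercasing under a Turkish locale."""
    return text.replace("i", "\u0130").upper()


def is_keyword_locale_sensitive(word):
    # Buggy under a Turkish locale: "insert" becomes "İNSERT".
    return turkish_upper(word) in KEYWORDS


def is_keyword_locale_independent(word):
    # Uses the default mapping, independent of the user's locale.
    return word.upper() in KEYWORDS


print(is_keyword_locale_sensitive("insert"))    # False
print(is_keyword_locale_independent("insert"))  # True
print(is_keyword_locale_sensitive("select"))    # True (no i in the word)
```

Only identifiers that contain an i or I are affected, which is one reason such bugs can go unnoticed until the software is run under a Turkish locale.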
See also
- Tittle - general name for dots, diacritics, etc.
References
- http://www.unicode.org/charts/PDF/U0100.pdf
- Tex Texin, Internationalization for Turkish: Dotted and Dotless Letter "I", accessed 15 Nov 2005