Help:Multilingual support (Indic)

From Wikipedia, the free encyclopedia

This article contains Indic text.
Without rendering support, you may see question marks, boxes or other symbols instead of Indic characters; or irregular vowel positioning and a lack of conjuncts.

Several pages on Wikipedia use Indic scripts to illustrate the native representation of names, places, quotes and literature. Unicode is the encoding used on Wikipedia and it contains support for a number of Indic scripts. However, before Indic scripts can be viewed or edited, support for Complex Text Layout must be enabled on your operating system. Some older operating systems do not support complex text rendering and you should not use such systems to edit Indic scripts.

This page lists the methods for enabling complex text rendering based on the operating environment or browser you are using. Many of the methods highlighted can be used for non-Indic complex scripts such as Arabic.

Check for existing support

The table below compares how a correctly configured computer renders each script with how your computer renders it:

Script       Correct rendering (image)                                   Your computer
Bengali      Image:Examples.of.complex.text.rendering.Bengali.png        ক + ি → কি
Devanagari   Image:Examples.of.complex.text.rendering.Devanagari.png     क + ि → कि
Gujarati     Image:Examples.of.complex.text.rendering.Gujarati.png       ક + િ → કિ
Gurmukhi     Image:Examples.of.complex.text.rendering.Gurmukhi.png       ਕ + ਿ → ਕਿ
Kannada      Image:Examples.of.complex.text.rendering.Kannada.png        ಕ + ಿ → ಕಿ
Malayalam    Image:Examples.of.complex.text.rendering.Malayalam.png      ക + െ → കെ
Oriya        Image:Examples.of.complex.text.rendering.Oriya.png          କ + େ → କେ
Sinhala      (no image)                                                  ඵ + ේ → ඵේ
Tibetan      Image:Examples_of_complex_text_rendering_Tibetan.png        ར + ྐ + ྱ → རྐྱ
Tamil        Image:Examples.of.complex.text.rendering.Tamil.png          க + ே → கே
Telugu       Image:Examples.of.complex.text.rendering.Telugu.png         య + ీ → యీ

If the rendering on your computer matches the rendering in the images for the scripts, then you have already enabled complex text support. You should be able to view text correctly in that script. However, this does not mean you will be able to edit text in that script. To edit such text you need to have the appropriate text entry software on your operating system.
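Each test pair in the table is stored as separate code points, consonant first, and it is the rendering engine that combines or reorders them on screen. As a rough illustration only (the helper name hex_bytes is made up for this sketch), the following prints the UTF-8 bytes of the Devanagari sample, U+0915 (KA) followed by U+093F (vowel sign I), showing that the stored order stays consonant-then-vowel even though the vowel sign is drawn to the left:

```shell
# Print the bytes read from standard input as one lowercase hex string.
# hex_bytes is a hypothetical helper name used only in this sketch.
hex_bytes() {
    od -An -v -tx1 | tr -d ' \t\n'
}

# The Devanagari sample from the table, written as octal escapes for the
# UTF-8 encoding of U+0915 (KA) followed by U+093F (vowel sign I).
printf '\340\244\225\340\244\277' | hex_bytes
echo
# prints: e0a495e0a4bf
```

The first three bytes (e0 a4 95) are the consonant and the last three (e0 a4 bf) are the vowel sign. A system without complex text support draws them in that stored order, which is the "irregular vowel positioning" described at the top of this page.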

Platform Independent support on Mozilla Firefox

Indic IME, a plugin for Firefox 1.0+, can help you write in many Indian languages on web pages. It is easy to install and works on all platforms where Firefox or other Mozilla-based browsers run.

The Indic IME toolbar project was started to address the need to type in Indian languages in web forms, e-mails, blogs, search boxes and so on.

Padma, a plugin for Firefox 2.0+, converts text from several Indic font encodings to Unicode. This lets several popular Indian vernacular websites render correctly without any additional font installation.

Firefox 3 can render Indic text properly.

Windows 95, 98, ME and NT

These operating systems contain no built-in support for Indic scripts. Indic scripts can be displayed properly only in Internet Explorer, and you also need an appropriate Unicode font for each script installed on your system. Installing Internet Explorer 6.0 is suggested because it has better support for Indic scripts.

Mozilla Firefox does not support Indic scripts properly on these operating systems unless a modified version of the program is used, such as the one found here. This is due to a bug in Firefox [1], [2]. The bug has been fixed in Firefox 3 Alpha, but Firefox 3 does not support Windows 98/ME.

No Unicode keyboard driver engines (such as Indic IME or BarahaIME) are available for these older systems. You can use either online typing tools or offline text editors made specially for this purpose. A list of such tools is given here.

Windows 2000

Supports: Devanagari, Kannada, Tamil

Complex text support needs to be manually enabled.


Viewing Indic text

  • Go to Start > Settings > Control Panel > Regional Options > General [Tab].
  • In the "Language settings for this system" frame, check the box next to "Indic".
  • Copy the appropriate files from the Windows 2000 CD when prompted.
  • If prompted, reboot your computer once the files have been installed.

If you do not have the Windows CD, or do not want to look for it right now, you can download this zip file and extract its contents to a folder. When prompted for the Windows CD, point to this folder using the 'Browse' option in the prompt window.

Inputting Indic text

You must follow the steps above before you perform the remaining steps.

  • Select "Input Locale" [Tab].
  • Click the "Add" button in the "Installed input locales" frame.
  • Select the desired language in the "Input Locale" drop-down box on the "Add Input Locale" dialogue box.
  • Now select the appropriate keyboard you wish to use.
  • If you cannot use the InScript keyboard above, you can use the phonetic keyboards from Baraha. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.
  • If you cannot download the above software, or you are on the move, you can use dboard, an Indian language sandbox that provides an online virtual (visual) keyboard: type your text there, copy it to the clipboard and paste it into the Wikipedia editing box.

Windows XP and Server 2003

This is where we install Complex Scripts in Windows XP & 2003

Supports: Bengali (XP SP2), Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam (XP SP2), Tamil, Telugu

Complex text support needs to be manually enabled.

Viewing Indic text

  1. Go to Start > Control Panel.
  2. If you are in "Category View" select the icon that says "Date, Time, Language and Regional Options" and then select "Regional and Language Options".
  3. If you are in Classic View select the icon that says "Regional and Language Options".
  4. Select the "Languages" tab and make sure you select the option saying "Install files for complex script and right-to-left languages (including Thai)". A confirmation message should now appear - press "OK" on this confirmation message.
  5. Allow the OS to install necessary files from the Windows XP CD and then reboot if prompted.

This is not sufficient to render Indic scripts in Firefox. You also need the latest version of usp10.dll on your system and it may be necessary to install a Unicode OpenType font.

Inputting Indic text

Windows XP has built-in InScript keyboards for nearly all Indian languages. You can add them via the Control Panel. You must follow the steps above before you perform the remaining steps.

  • In the "Regional and Language Options", click the "Languages" tab.
  • Click the "Details..." button.
  • Click the "Add" button to add a keyboard for your particular language.
  • In the drop-down box, select your required Indian language.
  • Make sure the check box labelled "Keyboard layout/IME" is selected and ensure you select an appropriate keyboard.
  • Now select "OK" to save changes.

You can use the combination ALT + SHIFT to switch between different keyboard layouts (e.g. from a UK Keyboard to Gurmukhi and vice-versa). If you want a language bar, you can select it by pressing the "Language Bar..." button on the "Text Services and Input Languages" dialog and then selecting "Show the language bar on my desktop". The language bar enables you to visually select the keyboard layout you are using.

  • If you cannot use the InScript keyboard above, other keyboard drivers are available. BarahaIME is suggested for phonetic typing and Indic IME for Remington typing.

Baraha is phonetic-based software that covers nearly all Indic languages. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.

  • Indic IME 1 (v5.0) is available from Microsoft Bhasha India. It supports Hindi scripts, Gujarati, Kannada and Tamil. Indic IME 1 gives the user a choice between a number of keyboards including Phonetic, InScript and Remington.

If you do not have the Windows CD, there is a modified version of the installer for Hindi, named Hindi Toolkit, which automatically installs Indic support as well as the Hindi Indic IME.

  • If you cannot download the above software, or you are on the move, you can use dboard, an Indian language sandbox that provides an online virtual (visual) keyboard: type your text there, copy it to the clipboard and paste it into the Wikipedia editing box.
  • MyMyanmar Projects provides the MyMyanmar Unicode System for inputting Myanmar (Burmese) text.[1]

Windows Vista

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, Telugu, Tibetan

Complex text support is automatically enabled.

Viewing Indic text

You do not need to do anything to enable viewing of Indic text.

Inputting Indic text

Windows Vista, like Windows XP, has built-in InScript keyboards for nearly all Indian languages. You can add them via the Control Panel.

For Phonetic typing BarahaIME is suggested and for Remington typing IndicIME is suggested.

Mac OS 9 and earlier

The Indian Language Kit, available from Apple at additional cost,[3] provides support for Devanagari, Gujarati and Gurmukhi. No third-party Unicode solutions are known, though numerous custom-encoded fonts exist.

Mac OS X

Viewing Indic text

You do not need to do anything to enable viewing of Indic text as long as you use Safari or most other Cocoa applications, which fully support rearrangement and substitution for AAT-based fonts. Firefox after 2.0 renders Indic text (except Sinhala and Tibetan), although it does not reorder प+ि as ि+प. (You will need to select a Unicode font that supports Indic scripts, such as Code2000.) Opera also provides some support, although considerable bugs remain as of version 9.2 (though Opera at least renders the glyphs).

Carbon software such as Microsoft Word, Adobe Photoshop and their siblings do not generally support Indic scripts, due to broken or non-existent ATSUI implementations.

Inputting Indic text

Specific keyboard layouts can be enabled in System Preferences, in the International pane. Switching among enabled keyboard layouts is done through the input menu in the upper right corner of the screen. The input menu appears as an icon indicating the current input method or keyboard layout — often a flag identified with the country, language, or script. Specific instructions are available from the "Help" menu (search for "Writing text in other languages").

Mac OS X 10.4 system software comes with two installable keyboard input options for Tamil: Murasu Anjal and Tamilnet 99. Follow these steps to activate them:

i) Open "international" located within System Preferences and select "language". Select the "edit list", select "Tamil" from the list of languages shown and click OK.

ii) Select "input menu" to see a list of keyboard options available. Select "Anjal" and "Tamilnet99" keyboards under Murasu Anjal Tamil and Click OK.

iii) Anjal and Tamilnet99 keyboard icons appear immediately in the list of keyboards to select under the country flag in the top menu bar.

An alternative way to activate the keyboard(s) for Devanagari (Hindi etc.):

i) Open "International" located within System Preferences and select the "Input Menu" tab.

ii) Check the option for "Devanagari" and/or "Devanagari - QWERTY".

iii) Check the "Show input menu in menu bar" option at the bottom of the "International" panel. Close the panel, and the new keyboard(s) should be available for selection when you click on the menu bar icon (upper right corner).

SIL distributes Ukelele, a freeware tool that allows anyone to design their own keyboard layout for Mac OS X.

Linux

GNOME

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan

Viewing Indic text

You do not need to do anything to enable viewing of Indic text in GNOME 2.8 or later. Older versions may have support for some, but not all Indic scripts. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.

Some web browsers may require you to enable Pango rendering to view Indic text properly.

  • For Epiphany, Pango rendering can be enabled in GConf. Press Alt+F2 to bring up the Run Application dialog, then enter gconf-editor and click Run. The Configuration Editor window will appear. In the left pane, unfold apps > epiphany and click the web section. In the right pane, check the box next to the enable_pango option, then restart Epiphany.
  • When using Mozilla or Firefox, you can enable Pango rendering by opening xterm and typing MOZ_ENABLE_PANGO=1 mozilla or MOZ_ENABLE_PANGO=1 firefox. A variable set on the command line this way applies only to the browser session started by that command; to make it permanent, export it from a login file as described for SUSE below.
    • This will work only on Firefox compiled with --enable-pango.
    • The easiest way to check whether --enable-pango was used in your copy of Firefox is to type about:buildconfig in the address bar and look for the string --enable-pango.
    • For Ubuntu 6.06, this support has been turned off due to speed issues. To enable support, you must type MOZ_DISABLE_PANGO=0 firefox. Future sessions do not remember this setting, so it must be repeated.
    • For Ubuntu 7.10, this support can be enabled just by installing the relevant language support packs. For instance, to support Tamil display, the following is sufficient: sudo apt-get install language-pack-ta language-support-ta language-pack-gnome-ta ttf-tamil-fonts
    • For SUSE 10.1, add MOZ_ENABLE_PANGO=1 to your .profile to make the effect permanent:
      1. Go to your home directory and open the .profile file for editing (it is a hidden file).
      2. Scroll down to the last line of the file and add: export MOZ_ENABLE_PANGO=1
      3. Save the .profile file, then restart for the change to take effect.
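
The SUSE steps above amount to adding one export line to ~/.profile. As a minimal sketch (the helper name add_line_once is made up for illustration and is not part of any distribution), the following function adds the line only if an identical line is not already present, so running it twice does not duplicate the setting:

```shell
# Append a line to a file unless an identical line is already there.
# $1 = file to edit, $2 = line to add.
# add_line_once is a hypothetical helper name used only in this sketch.
add_line_once() {
    file=$1
    line=$2
    # Create the file if it does not exist yet.
    [ -e "$file" ] || : > "$file"
    # Nothing to do if the exact line is already present.
    if grep -qxF -e "$line" "$file"; then
        return 0
    fi
    # Make sure existing content ends with a newline before appending.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
}

# Example call, commented out so that running this sketch changes nothing:
# add_line_once "$HOME/.profile" 'export MOZ_ENABLE_PANGO=1'
```

The variable only reaches the browser after .profile has been read again, which is why the steps above end with a restart.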

Inputting Indic text

  • Go to Applications > Preferences > Keyboard.
  • Select the "Layouts" tab.
  • Select the keyboard for the language or script you wish to use from the "Available Layouts" frame and then press "Add".
  • Press "Close" to dismiss the dialogue box.
  • Right click on the main menu on your desktop and select "Add to Panel...".
  • Select "Keyboard Indicator" and click "Add".
  • Position the keyboard indicator on your menu bar and click it to switch between keyboard layouts.

Using SCIM

Another option is to use SCIM. To enable it:

  • Install Hindi font support. On Fedora, run as root: yum groupinstall hindi-support
  • Then enable SCIM using System -> Personal -> Input Method from the menu, and select Hindi phonetic support.

For more details, see http://www.ruturaj.net/fedora-6-hindi-support-scim

KDE

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu.

Viewing Indic text

You do not need to do anything to enable viewing of Indic text. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.

Inputting Indic text

  • In the Control Center, go to Regional & Accessibility, Keyboard Layout
  • In the tab Layout, click on Enable keyboard layouts
  • Choose the layout you want in Available layouts
  • Click on Apply
  • Now, you will have an icon for the KDE Keyboard Tool in your panel, in which you can choose the layout you want

Distribution-specific advice

Debian (and derivatives like Ubuntu)

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan, Punjabi.

Viewing Indic text

Enter as root:

apt-get install ttf-indic-fonts

and when the installation is complete restart the X server.

For Tibetan script:

apt-get install ttf-tmuni

For Mozilla and Firefox, see the comments above under "GNOME". Rendering should work correctly "out of the box" as of Debian 4.0 (etch).

Inputting Indic text

SCIM supports text input in Indic languages including phonetic layout. SCIM should be working by default in recent distributions. More instructions on using and configuring SCIM can be found on help.ubuntu.com [4]

Fedora

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Punjabi among others.

Installing Indic fonts

For example, to install Kannada fonts, open a console as root and run:

yum install fonts-kannada

This will download the Kannada fonts from the repositories and install them.

Similarly, for Hindi, open a console as root and run:

yum install fonts-hindi

Keyboard support

Start the Add/Remove Software applet. In KDE, for example, navigate to System and then Add/Remove Software. In the applet window, select Languages in the list box on the left. In the list box on the right, select the Indian languages of interest to you.

For example, for Kannada keyboard support, check the box for Kannada Support. Similarly, for Hindi support, check the box for Hindi Support.

For Kannada, Fedora has been observed to install not only Kannada keyboard support but also transliteration support and support for KGP (Kannada Ganaka Parishad) keyboards. With transliteration, you can type Kannada words in Roman script and have them converted to Kannada text in the application of your choice, for example a browser, text editor, document editor or e-mail client. You can also use native Kannada keyboards, KGP-based or otherwise, to type Kannada text directly.

Arch Linux

Supports: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.

To install Indic fonts:

pacman -S ttf-indic-otf

To enter Indic text in GNOME/KDE, follow the instructions in the respective sections above.

Gentoo

Supports: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.

Installing Indic fonts

emerge lohit-fonts

Note: The lohit-fonts package was earlier named media-fonts/fonts-indic.

The mozilla-*-bin products shipped by Gentoo are taken directly from Mozilla's FTP servers and are not built with Pango support. If you notice a problem with this, you need to build your own copy with the "moznopango" USE flag disabled: USE="-moznopango" (notice the minus sign, which in this case results in a double negation). Firefox 3 will ship with Pango enabled by default.

Inputting Indic text

emerge -av scim-tables scim-m17n

Study the USE flags and the LINGUAS flags and set them accordingly depending on your desktop environment and language support needed. The following needs to be set whenever you login (append it to your .xinitrc or .xsession).

export XMODIFIERS=@im=SCIM    #case matters for this variable!
export GTK_IM_MODULE=scim
export QT_IM_MODULE=scim
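
Once these lines are in .xinitrc or .xsession and you have logged in again, a quick check like the following can confirm that the session picked them up. This is only a sketch: the function name check_scim_env is hypothetical, and the expected values are simply the ones listed above.

```shell
# Report whether the input-method variables match the values given above.
# Prints one line per variable and returns non-zero if any of them differs.
# check_scim_env is a hypothetical name used only in this sketch.
check_scim_env() {
    rc=0
    for pair in "XMODIFIERS=@im=SCIM" "GTK_IM_MODULE=scim" "QT_IM_MODULE=scim"; do
        name=${pair%%=*}     # text before the first "="
        want=${pair#*=}      # text after the first "="
        eval "have=\${$name-}"
        if [ "$have" = "$want" ]; then
            echo "$name ok"
        else
            echo "$name is '$have', expected '$want'"
            rc=1
        fi
    done
    return $rc
}

# Example call, commented out so that sourcing this sketch prints nothing:
# check_scim_env
```

The comparison is case-sensitive, which matters for XMODIFIERS as the comment above points out.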

Mozilla apps and precompiled software such as acroread might not play well with scim (C++). In such cases, make use of scim-bridge (C - avoiding C++ ABI issues) [5].

emerge scim-bridge

and start Firefox as:

GTK_IM_MODULE=scim-bridge firefox

You might have to start the SCIM daemon manually (add it to your session's startup):

scim -d

SCIM is a unified frontend for currently available input method libraries.

FreeBSD

Supports: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.

Installing Indic fonts

cd /usr/ports/x11-fonts/fonts-indic && make install clean

The binary package of Firefox (installed with pkg_add -r firefox) might have the same problems as Gentoo's binary package (needs confirmation).

Inputting Indic text

See Gentoo's section above.

Unicode OpenType fonts

This section lists OpenType fonts, supported by Microsoft Windows and most Linux distributions. For AAT fonts (required for the Apple Macintosh), see the Mac OS X section above.

If you have followed the instructions for your computer system as mentioned above and you still cannot view Indic text properly, you may need to install a Unicode font:

Department of Information Technology, India has provided Unicode Indic fonts for most of the Indian languages.

WAZU JAPAN's Gallery of Unicode Fonts is an excellent resource for all Indic scripts.
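
Before installing anything on Linux, it can help to see which installed fonts already cover a language. The sketch below assumes a system that uses fontconfig and has its fc-list tool available; the wrapper name fonts_for_lang is made up for illustration, and the language codes (hi for Hindi, ta for Tamil, bn for Bengali and so on) are the ones fontconfig understands.

```shell
# List the font families that fontconfig reports as covering a language.
# $1 is a language code such as hi, ta or bn.
# fonts_for_lang is a hypothetical name used only in this sketch.
fonts_for_lang() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found; fontconfig does not appear to be installed" >&2
        return 127
    fi
    fc-list ":lang=$1" family | sort -u
}

# Example call, commented out because the result depends on the fonts installed:
# fonts_for_lang hi
```

An empty result suggests that no installed font covers that language, in which case one of the fonts listed in this section is worth installing.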

References

Sinhala = Kaputa Unicode
