Internationalization and localization

In computing, internationalization and localization (also spelled internationalisation and localisation) are means of adapting computer software to different languages, regional differences and technical requirements of a target market. Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. Localization is the process of adapting internationalized software for a specific region or language by adding locale-specific components and translating text.

Because of their length, the terms are frequently abbreviated to the numeronyms i18n (where 18 stands for the number of letters between the first i and the last n in internationalization, a usage coined at DEC in the 1970s or 1980s)^[1] and L10n respectively. The capital L in L10n helps to distinguish it from the lowercase i in i18n.
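
The counting rule behind these numeronyms can be sketched in a few lines. The function name `numeronym` and the choice to leave very short words unabbreviated are assumptions made for this illustration only.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of omitted letters + last letter."""
    if len(word) <= 3:
        # Assumption of this sketch: words this short are returned unchanged.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


for term in ("internationalization", "localization"):
    print(term, "->", numeronym(term))
```

The sketch keeps the case of the input, so the conventional capital L in L10n would have to be applied separately.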

Some companies, like IBM and Sun Microsystems, use the term "globalization" for the combination of internationalization and localization.^[2]

Microsoft^[3] defines internationalization as a combination of World-Readiness and localization. World-Readiness is a developer task that enables a product to be used with multiple scripts and cultures (globalization) and separates user interface resources into a localizable format (localizability, abbreviated to L12y).^[4]

This concept is also known as NLS (National Language Support or Native Language Support).

Nomenclature

The support of multiple languages by computer systems can be considered a continuum between localization ("L10n"), through multilingualization (or "m17n"), to internationalization ("i18n").

A localized system has been adapted or converted for use in a particular locale (other than the one it was originally developed for), including the language of the user interface (UI), input, and display, and features such as time/date display and currency. Each instance of the system only supports a single locale, and there is no explicit support for languages that are not part of that locale (although the character set may coincidentally be usable for other languages).
Multilingualized software supports multiple languages for display and input, but has a single UI language which cannot be changed after installation of the software. Multi-locale support for other features like date, time, number, and currency formats varies as the system tends towards full internationalization. At present, most multilingual software relies on the host operating system (e.g., Microsoft Windows or Mac OS X) of the machine on which it runs for these features, and may thus be able to support character sets for different languages within the same document. In general, a multilingualized system is intended for use in one specific locale, but is capable of handling multilingual content as data.
An internationalized system is equipped for use in a range of "locales" (or by users of multiple languages), by allowing the co-existence of several languages and character sets for input, display, and UI. In particular, a system may not be considered internationalized in the fullest sense unless the UI language is selectable by the user at runtime. Full internationalization may extend beyond support for multiple languages and orthography to compliance with jurisdiction-specific legislation (in respect of copyright, for instance) and other non-linguistic conventions.

The distinction arises because it is significantly more difficult to create a multilingual UI than simply to support the character sets and keyboards needed to express multiple languages. To internationalize a UI, every text string employed in interaction must be translated into all supported languages; then all output of literal strings, and all literal parsing of input, in UI code must be replaced by hooks to i18n libraries.
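
The replacement of literal strings by library hooks can be illustrated with a minimal sketch. The catalog contents and the names `CATALOGS`, `tr`, `parse_yes_no` and `confirm_prompt` are hypothetical; a real application would normally delegate the lookup to an i18n library and load translations from resource files.

```python
# Hypothetical in-memory catalogs, hard-coded here only to keep the sketch
# self-contained; real projects usually load these from translated files.
CATALOGS = {
    "de": {"Delete file?": "Datei löschen?", "yes": "ja", "no": "nein"},
    "fr": {"Delete file?": "Supprimer le fichier ?", "yes": "oui", "no": "non"},
}


def tr(message: str, lang: str) -> str:
    """Look up a UI string, falling back to the source text if untranslated."""
    return CATALOGS.get(lang, {}).get(message, message)


def confirm_prompt(lang: str) -> str:
    # Before internationalization this was the literal "Delete file? (yes/no)".
    return f"{tr('Delete file?', lang)} ({tr('yes', lang)}/{tr('no', lang)})"


def parse_yes_no(answer: str, lang: str) -> bool:
    """Interpret user input against the localized words for yes and no."""
    normalized = answer.strip().lower()
    if normalized == tr("yes", lang):
        return True
    if normalized == tr("no", lang):
        return False
    raise ValueError(f"unrecognized answer: {answer!r}")


print(confirm_prompt("de"))
print(parse_yes_no("Ja", "de"))
```

Both the output path and the input parser go through the same lookup, so adding a language means adding a catalog rather than touching the UI code.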

"Internationalized" does not necessarily mean that a system can be used absolutely anywhere, since simultaneous support for all possible locales is both nearly impossible in practice and commercially very hard to justify. In many cases an internationalized system includes full support only for the most spoken languages, plus any others of particular relevance to the application.

Scope

Focal points of internationalization and localization efforts include:

Language
- Computer-encoded text
  - Alphabets/scripts; most recent systems use the Unicode standard to solve many of the character encoding problems.
  - Different systems of numerals
  - Writing direction: left-to-right in most European languages (e.g. German), right-to-left in Hebrew and Arabic, vertical in some Asian languages
  - Complex text layout
  - Text processing differences, such as the concept of capitalization which exists in some scripts and not in others, different text sorting rules, etc.
  - Plural forms in text output, which differ depending upon language^[5]
- Input
  - Enablement of keyboard shortcuts on any keyboard layout^[6]
- Graphical representations of text (printed materials, online images containing text)
- Spoken (Audio)
- Subtitling of film and video
Culture
- Images and colors: issues of comprehensibility and cultural appropriateness
- Names and titles
- Government-assigned numbers (such as the Social Security number in the US, National Insurance number in the UK, Isikukood in Estonia, and Resident registration number in South Korea) and passports
- Telephone numbers, addresses and international postal codes
- Currency (symbols, positions of currency markers)
- Weights and measures
- Paper sizes
Writing conventions
- Date/time format, including use of different calendars
- Time zones (UTC in internationalized environments)
- Formatting of numbers (decimal separator, digit grouping)
- Differences in symbols (e.g. quoting text using double-quotes (" "), as in English, or guillemets (« »), as in French).
Any other aspect of the product or service that is subject to regulatory compliance
- Disputed borders shown on maps (e.g. failing to show Kashmir as Indian is a crime in India)
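
As an illustration of the plural-forms item in the list above, the following sketch selects a message form from a per-language rule. The rules follow the plural formulas commonly used with gettext for English, French and Russian; the names `PLURAL_RULES`, `FORMS` and `pluralize`, and the sample vocabulary, are assumptions of this example.

```python
from typing import Callable, Dict, Tuple

# Each rule maps a count to an index into that language's tuple of forms.
PLURAL_RULES: Dict[str, Callable[[int], int]] = {
    "en": lambda n: 0 if n == 1 else 1,
    "fr": lambda n: 0 if n <= 1 else 1,  # French treats 0 and 1 as singular
    "ru": lambda n: (
        0 if n % 10 == 1 and n % 100 != 11
        else 1 if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14
        else 2
    ),
}

# Illustrative message forms; Russian needs three, English and French two.
FORMS: Dict[str, Tuple[str, ...]] = {
    "en": ("{n} file", "{n} files"),
    "fr": ("{n} fichier", "{n} fichiers"),
    "ru": ("{n} файл", "{n} файла", "{n} файлов"),
}


def pluralize(lang: str, n: int) -> str:
    """Pick the grammatically correct form for the count n."""
    index = PLURAL_RULES[lang](n)
    return FORMS[lang][index].format(n=n)


for count in (1, 2, 5, 21):
    print(pluralize("en", count), "|", pluralize("ru", count))
```

A hard-coded `if n == 1` test in application code would handle only the first of these languages correctly, which is why the rule is kept as data next to the translations.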

The distinction between internationalization and localization is subtle but important. Internationalization is the adaptation of products for potential use virtually everywhere, while localization is the addition of special features for use in a specific locale. Internationalization is done once per product, while localization is done once for each combination of product and locale. The processes are complementary, and must be combined to produce a system that works globally. Subjects unique to localization include the following:

Language translation
National varieties of languages (see language localization)
Special support for certain languages such as East Asian languages
Local customs
Local content
Symbols
Order of sorting (Collation)
Aesthetics
Cultural values and social context
Differing laws/regulations (e.g. taxation laws, labour laws, etc.)

Business process for internationalizing software

To internationalize a product, it is important to look at the variety of markets that the product will foreseeably enter. Details such as the field length for street addresses, country-specific address formats, the ability to make the zip code field optional to accommodate countries that do not have zip codes, and the introduction of new registration flows that adhere to local laws are just some of the examples that make internationalization a complex project.^[7]
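
One way to picture these address details is as per-country validation data rather than hard-coded checks. The sketch below is only an illustration: the class `AddressRules`, the table `RULES`, the function `validate_address`, and the field lengths and patterns are all assumed values, not an authoritative data set.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AddressRules:
    postal_code_required: bool
    postal_code_pattern: Optional[str]  # regular expression, or None if unused
    max_street_length: int


# Illustrative values only; real rules should come from a maintained source.
RULES: Dict[str, AddressRules] = {
    "US": AddressRules(True, r"\d{5}(-\d{4})?", 60),
    "DE": AddressRules(True, r"\d{5}", 60),
    "HK": AddressRules(False, None, 100),  # Hong Kong does not use postal codes
}


def validate_address(country: str, street: str, postal_code: str = "") -> List[str]:
    """Return a list of problems; an empty list means the address passes."""
    rules = RULES[country]
    errors = []
    if len(street) > rules.max_street_length:
        errors.append("street too long")
    if rules.postal_code_required and not postal_code:
        errors.append("postal code required")
    elif postal_code and rules.postal_code_pattern:
        if not re.fullmatch(rules.postal_code_pattern, postal_code):
            errors.append("postal code format invalid")
    return errors


print(validate_address("HK", "1 Queen's Road Central"))
print(validate_address("US", "1 Main St"))
```

Keeping the rules in a table means that entering a new market adds a row instead of a new branch in the registration flow.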

A broader approach takes cultural factors into account, for example the adaptation of business process logic or the inclusion of individual cultural (behavioral) aspects.^[8]

Coding practice

The current prevailing practice is for applications to place text in resource strings which are loaded during program execution as needed. These strings, stored in resource files, are relatively easy to translate. Programs are often built to reference resource libraries depending on the selected locale data. One software library that aids this is gettext.

To support multiple languages, an application is therefore designed to select the relevant language resource file at runtime, and the resource files are translated into the required languages. This method tends to be application-specific and, at best, vendor-specific. The code that handles date entry verification and many other locale-sensitive data types must also support differing locale requirements. Modern development systems and operating systems include sophisticated libraries for international support of these types.
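
Runtime selection of a resource file might look like the sketch below. The file naming scheme `messages.<locale>.json`, the fallback order and the function names `fallback_chain` and `load_strings` are assumptions of this example; libraries such as gettext use their own catalog formats and lookup rules.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def fallback_chain(locale: str, default: str = "en") -> List[str]:
    """'pt_BR' -> ['pt_BR', 'pt', 'en']: most specific resource first."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_", 1)[0])
    if default not in chain:
        chain.append(default)
    return chain


def load_strings(resource_dir: Path, locale: str) -> Dict[str, str]:
    """Merge resource files so that specific locales override general ones."""
    strings: Dict[str, str] = {}
    for name in reversed(fallback_chain(locale)):
        path = resource_dir / f"messages.{name}.json"
        if path.exists():
            strings.update(json.loads(path.read_text(encoding="utf-8")))
    return strings


# Demonstration with throwaway resource files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    resource_dir = Path(tmp)
    (resource_dir / "messages.en.json").write_text(
        json.dumps({"greeting": "Hello", "farewell": "Goodbye"}), encoding="utf-8"
    )
    (resource_dir / "messages.pt.json").write_text(
        json.dumps({"greeting": "Olá"}), encoding="utf-8"
    )
    strings = load_strings(resource_dir, "pt_BR")

print(strings)
```

Strings missing from the Portuguese file fall back to the default language, so a partially translated resource file degrades gracefully instead of failing.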

Some tools help in detecting i18n issues and guiding software resolution of those issues, such as Lingoport's Globalyzer^[9] or Parasoft Test.^[10]

Difficulties

While translating existing text to other languages may seem easy, it is more difficult to maintain the parallel versions of texts throughout the life of the product. For instance, if a message displayed to the user is modified, all of the translated versions must be changed. This in turn results in a somewhat longer development cycle.
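
One possible way to keep parallel versions in step is to record, with each translation, a fingerprint of the source text it was made from, and to flag entries whose source has since changed. The sketch below assumes this scheme; the names `fingerprint` and `audit` and the catalog layout are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(text: str) -> str:
    """Short, stable digest of a source string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def audit(
    source: Dict[str, str],
    translated: Dict[str, Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Compare a source catalog with a translated one.

    `translated` maps a message id to a pair: the fingerprint of the source
    text at translation time, and the translation itself.
    """
    report: Dict[str, List[str]] = {"missing": [], "stale": [], "obsolete": []}
    for msg_id, text in source.items():
        if msg_id not in translated:
            report["missing"].append(msg_id)
        elif translated[msg_id][0] != fingerprint(text):
            report["stale"].append(msg_id)
    report["obsolete"] = [m for m in translated if m not in source]
    return report


# The German catalog was produced against an earlier version of the source.
german = {
    "save": (fingerprint("Save"), "Speichern"),
    "quit": (fingerprint("Quit"), "Beenden"),
    "help": (fingerprint("Help"), "Hilfe"),
}
source_v2 = {"save": "Save", "quit": "Quit the application", "open": "Open"}

report = audit(source_v2, german)
print(report)
```

Running such a check for every supported language at build time makes the extra translation work visible as soon as a message is edited.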

Many localization issues (e.g. writing direction, text sorting) require more profound changes in the software than text translation. For example, OpenOffice.org achieves this with compilation switches.
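
Text sorting shows why such issues go beyond translation. In the sketch below, a plain code point sort places accented words after "z". The accent-stripping key `rough_key` is a crude stand-in introduced for this example; it is not a real collation algorithm, and correct order depends on the locale (Swedish, for instance, sorts ä after z).

```python
import unicodedata

words = ["zebra", "Äpfel", "apple", "éclair"]

# Default string comparison orders by code point, so accented initials,
# which have higher code points than "z", end up at the end of the list.
naive = sorted(words)


def rough_key(word: str) -> str:
    """Approximate key: decompose, drop combining marks, ignore case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


approximate = sorted(words, key=rough_key)

print(naive)
print(approximate)
```

Because no single key function is right for every language, sorting code generally has to accept the locale as a parameter and call a collation library.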

To some degree (e.g. for quality assurance), the development team needs someone who understands foreign languages and cultures and has a technical background. In large societies with one dominant language or culture, it may be difficult to find such a person.

One example of the pitfalls of localization is the attempt made by Microsoft to keep some keyboard shortcuts significant in local languages. This has resulted in some (but not all) programs in the Italian version of Microsoft Office using "CTRL + S" (sottolineato) as a replacement for "CTRL + U" (underline), rather than the (almost) universal "Save" function.

Costs and benefits

In a commercial setting, the benefit from localization is access to more markets. However, there are considerable costs involved, which go far beyond just engineering. First, software must generally be re-engineered to make it world-ready.

Then, providing a localization package for a given language is in itself a non-trivial undertaking, requiring specialized technical writers to construct a culturally appropriate syntax for potentially complicated concepts, coupled with engineering resources to deploy and test the localization elements. Further, business operations must adapt to manage the production, storage and distribution of multiple discrete localized products, which are often sold in completely different currencies, regulatory environments and tax regimes.

Finally, sales, marketing and technical support must also facilitate their own operations in the new languages, in order to support customers for the localized products. Particularly for relatively small language populations, it may thus never be economically viable to offer a localized product. Even where large language populations could justify localization for a given product, and a product's internal structure already permits localization, a given software developer/publisher may lack the size and sophistication to manage the ancillary functions associated with operating in multiple locales.

One alternative, most often used by open source software communities, is self-localization by teams of end-users and volunteers. The KDE project, for example, has been translated into over 100 languages.^[11] However, self-localization requires that the underlying product first be engineered to support such activities, which is a non-trivial endeavor.

Notes

[1] "Glossary of W3C Jargon". World Wide Web Consortium. http://www.w3.org/2001/12/Glossary#I18N. Retrieved 2008-10-13.
[2] IBM Globalization web site
[3] Microsoft "Globalization Step-by-Step" guide
[4] MSDN.microsoft.com
[5] GNU.org
[6] Blog.i18n.ro
[7] Internationalizing a Product: Product Internationalization 101
[8] Pawlowski, J.M. (2008): Culture Profiles: Facilitating Global Learning and Knowledge Sharing. Proc. of ICCE 2008, Taiwan, Nov. 2008. Draft Version
[9] Globalyzer.com
[10] Parasoft.com
[11] For the current list see KDE.org
