Complex text layout

From Wikipedia, the free encyclopedia

The Devanagari ddhrya-ligature of JanaSanskritSans, which the layout engine should invoke to render the sequence of seven Unicode characters द + ् + ध + ् + र + ् + य = द्ध्र्य.

Complex text layout (abbreviated CTL), or complex text rendering, is the typesetting of writing systems that require complex transformations between text input and text display for proper rendering on the screen or the printed page. Such writing systems are also known as complex scripts. In other words, for these scripts the stored text does not map straightforwardly to the displayed text. The term is used in the field of software internationalization.
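The gap between stored and displayed text can be made concrete with a minimal Python sketch. It lists the code points that make up the Devanagari ddhrya conjunct; the names `CONJUNCT` and `describe` are illustrative choices, not part of any layout engine. A renderer is expected to draw these seven stored characters as a single conjunct glyph.

```python
import unicodedata

# The ddhrya conjunct, written as escapes so the stored sequence is explicit:
# DA, VIRAMA, DHA, VIRAMA, RA, VIRAMA, YA
CONJUNCT = "\u0926\u094d\u0927\u094d\u0930\u094d\u092f"


def describe(text):
    """Return (code point, character name) pairs in stored (logical) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(CONJUNCT):
    print(code_point, name)

# Seven characters are stored, although a reader sees one conjunct on screen.
print(len(CONJUNCT))
```

The sketch only inspects the stored text; choosing and positioning the actual glyph is the job of the font and the layout engine.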

Examples of writing systems requiring CTL are the Arabic alphabet and scripts of the Brahmic family such as Devanagari or the Thai alphabet.

CTL is a generalization of the concept of ligature. For the Latin alphabet, ligatures are usually considered a marginal aesthetic concern, but there is no fundamental difference between the ligatures required for acceptable typesetting of the Arabic script and those needed to typeset a Latin cursive.[1] Conversely, most characters of the Chinese script are compositional and could be considered ligatures, but each is usually encoded as an individual character, so typesetting Chinese requires an enormous typeface rather than sophisticated layout. An example of a contextual variant that is not considered a ligature is the Greek final sigma ς, the word-final variant of the usual σ shape. Unicode encodes the two variants separately, at U+03C2 and U+03C3 respectively, so the strings "δῖος Ἀχιλλεύς." and "δῖοσ Ἀχιλλεύσ." are distinct code point sequences that Unicode normalization does not unify, even though they arguably represent the same text.
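The final sigma rule can be sketched as a small contextual substitution. The following Python example is an illustration only: `shape_final_sigma` is a hypothetical helper, and its word-boundary test (a sigma preceded by a word character and not followed by one) is a simplifying assumption that ignores edge cases such as combining marks and apostrophes.

```python
import re

SIGMA = "\u03c3"        # σ GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # ς GREEK SMALL LETTER FINAL SIGMA


def shape_final_sigma(text):
    """Replace σ with ς where it ends a word (hypothetical helper)."""
    # Word-final here means: preceded by a word character and
    # not followed by another word character.
    return re.sub(rf"(?<=\w){SIGMA}(?!\w)", FINAL_SIGMA, text)


def same_ignoring_sigma_form(a, b):
    """Caseless comparison; case folding maps ς to σ."""
    return a.casefold() == b.casefold()


print(shape_final_sigma("σοφοσ μεσοσ"))
print(same_ignoring_sigma_form("οδος", "οδοσ"))
```

Because the two sigmas are separate characters, software that wants to treat both spellings alike has to do so explicitly, for instance with a caseless comparison as in `same_ignoring_sigma_form`.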

The main characteristics of CTL language complexity are:

  • Bi-directional text, where characters may be written either right-to-left or left-to-right.
  • Context-sensitive shaping (ligatures), where a character may change its shape depending on its position and/or the surrounding characters. For example, a letter in the Arabic script can have up to four different forms (isolated, initial, medial, and final), depending on context.
  • Ordering, where the displayed order of the characters is not the same as the logical order. For example, in Devanagari, which is written from left to right, the grapheme for "short i" appears to the left of ("before") the consonant it follows: in कि ki, the ि -i should render on the left, its bow reaching over the k- to its right.
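The three characteristics above can each be observed in the Unicode character data. The Python sketch below uses only the standard `unicodedata` module; the function names are illustrative. It reads the bidirectional class of a Latin and an Arabic letter, collects the positional forms of the Arabic letter beh from the Arabic Presentation Forms-B block (compatibility characters, used here only to make the forms visible), and shows that कि is stored consonant first.

```python
import unicodedata


def bidi_classes(text):
    """Bidirectional class of each character ('L', 'R', 'AL', ...)."""
    return [unicodedata.bidirectional(ch) for ch in text]


def presentation_forms(base, candidates):
    """Map form tag -> character for compatibility forms of `base`."""
    forms = {}
    for cp in candidates:
        # Decompositions look like "<final> 0628"; empty if there is none.
        tag, _, target = unicodedata.decomposition(chr(cp)).partition(" ")
        if target == f"{ord(base):04X}":
            forms[tag.strip("<>")] = chr(cp)
    return forms


def logical_order(text):
    """Character names in stored (logical) order."""
    return [unicodedata.name(ch) for ch in text]


BEH = "\u0628"  # ARABIC LETTER BEH
print(bidi_classes("a" + BEH))
print(sorted(presentation_forms(BEH, range(0xFE70, 0xFF00))))
print(logical_order("\u0915\u093f"))  # KA, then VOWEL SIGN I
```

The vowel sign follows the consonant in memory, so placing it visually to the left is left to the layout engine.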

Notes

  1. Indeed, historically, the Arabic alphabet is simply a cursive form of the Nabataean alphabet, with context-dependent letter shapes that became mandatory from about the 4th century AD.
