Interword separation

From Wikipedia, the free encyclopedia

Interword separation is the practice of visually separating the written representations of words from one another.

According to Spaces between Words[1], the early Semitic languages—which had no vowel signs—had interword separation, but languages with vowels (principally Greek and Latin) lost the separation, not regaining it until much later.

In modern languages, punctuation marks used for other reasons (such as commas or semicolons) may have the side effect of separating consecutive words, but the general problem of separating distinct, consecutively written terms remains. Depending on the language and the era, interword separation may be achieved by means of special symbols or conventions, or by means of "blank zones" called spaces.

Types of separations

Vertical lines
The ancient Anatolian hieroglyphs frequently (but not always) used vertical lines to separate words. Similarly, Linear B used short vertical lines. However, this technical advance mostly died out. In Biblical Hebrew, a vertical line between words, called a Pasek, indicates a small pause.
Slashes and dots
One reference implies that Phoenician originally used slashes and dots to mark word boundaries. It goes on to say that Hebrew and Aramaic scribes borrowed the slash and dot, and that a space was used in Aramaic.
Vertical lines/dots
Ethiopic inscriptions used a vertical line, but on paper the separator was written as two dots resembling a colon (in Unicode, "Ethiopic wordspace", at U+1361: ፡). This double-dot symbol also appears in ancient Turkic.
Interpunct
The Romans used the interpunct, a small dot, to separate words for a while before abandoning it (as in ALEA·IACTA·EST).
Different letter shapes
Because Hebrew script and Arabic script do not normally write vowels, recognizing word boundaries is particularly important. While Hebrew and Arabic have always used spaces between words, some letters also have different shapes depending upon their position.
Five Hebrew letters take a different shape when they are at the end of a word. Arabic characters have up to three different shapes, depending upon whether they are at the beginning, middle, or end of a word. Additionally, characters can have yet another shape when they stand alone as headings in an index.
Vertical space
The Nasta'liq version of the Arabic script also uses vertical space to separate words. The beginning of each word is written high above the baseline, while the end of the word is low, near the baseline, so that the line of text looks a little like the teeth of a saw. While Nasta'liq script is sometimes used to write Arabic, it is more often used for Persian, Uyghur, Pashto, and Urdu.
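
As an illustration of how some of the separators listed above can be handled in software, the following Python sketch splits text on whitespace, the interpunct, or the Ethiopic wordspace, and checks whether a Hebrew word ends in one of the five final-form letters. It is a minimal sketch, not a standard algorithm: the function names split_words and ends_with_final_form are hypothetical, and the interpunct is assumed to be encoded as U+00B7 (MIDDLE DOT).

```python
import re

# Separator characters discussed above.
ETHIOPIC_WORDSPACE = "\u1361"  # two-dot Ethiopic word separator
INTERPUNCT = "\u00b7"          # assumed encoding: MIDDLE DOT

# The five Hebrew letters with a distinct word-final shape,
# mapped from the final form to the regular form.
HEBREW_FINAL_FORMS = {
    "\u05da": "\u05db",  # final kaf   -> kaf
    "\u05dd": "\u05de",  # final mem   -> mem
    "\u05df": "\u05e0",  # final nun   -> nun
    "\u05e3": "\u05e4",  # final pe    -> pe
    "\u05e5": "\u05e6",  # final tsadi -> tsadi
}

# One or more separator characters (or whitespace) in a row.
_SEPARATORS = re.compile(
    "[" + re.escape(ETHIOPIC_WORDSPACE + INTERPUNCT) + r"\s]+"
)


def split_words(text):
    """Split text on whitespace, the interpunct, or the Ethiopic wordspace."""
    return [word for word in _SEPARATORS.split(text) if word]


def ends_with_final_form(word):
    """Return True if the last character is a Hebrew final-form letter."""
    return bool(word) and word[-1] in HEBREW_FINAL_FORMS


if __name__ == "__main__":
    print(split_words("ALEA" + INTERPUNCT + "IACTA" + INTERPUNCT + "EST"))
    print(ends_with_final_form("\u05e9\u05dc\u05d5\u05dd"))
```

The final-form check shows how letter shape can act as a redundant word-boundary cue alongside spaces. Separators that are not single characters, such as the vertical spacing of Nasta'liq, are a matter of rendering and cannot be detected this way.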

Rediscovery of spaces in Latin

The Irish appear to have been the first to consistently use blank spaces to delimit word boundaries in the Latin alphabet, sometime between AD 600 and AD 800. As Irish belongs to a different branch of the Indo-European language family than Latin, the Irish would have had much more difficulty reading Latin than people whose first language was, for example, Spanish or Italian (which descended from Latin and are still quite close to it). Thus they would have had a greater incentive to make reading Latin easier.

References

  1. Saenger, Paul (2000). Spaces between Words. Stanford University Press. ISBN 0-8047-4016-X.