Indo-European languages

Indo-European
Geographic distribution: Before the fifteenth century, Europe, and South, Central and Southwest Asia; today worldwide.
Genetic classification: One of the world's major language families; although some have proposed links with other families, none of these has received mainstream acceptance.
Subdivisions: Anatolian, Balto-Slavic, Indo-Iranian, Italic
ISO 639-2: ine
Map legend: countries with a majority of speakers of IE languages; countries with an IE minority language with official status.

The Indo-European languages comprise a family of several hundred related languages and dialects,[1] including most of the major languages of Europe, the Iranian plateau (Southwest Asia), much of Central Asia and the Indian subcontinent (South Asia). The Indo-European group (Indo refers to the Indian subcontinent, since geographically the family extends from Europe in the west to India in the east) has the largest number of speakers of the recognised language families in the world today, with approximately three billion native speakers.[2]

History of the Indo-European theory

Main article: Indo-European studies

Suggestions of similarities between Indian and European languages began to be made by European visitors to India in the sixteenth century. In 1583 Thomas Stephens, an English Jesuit missionary in Goa, noted similarities between Indian languages, specifically Konkani, and Greek and Latin. These observations were included in a letter to his brother which was not published until the twentieth century.[3]

The first account to mention Sanskrit came from Filippo Sassetti (born in Florence in 1540), a merchant who traveled to the Indian subcontinent and was among the first European observers to study the ancient Indian language. Writing in 1585, he noted some word similarities between Sanskrit and Italian (e.g. devaḥ/dio 'God', sarpaḥ/serpe 'snake', sapta/sette 'seven', aṣṭa/otto 'eight', nava/nove 'nine').[3] However, neither Stephens' nor Sassetti's observations led to further scholarly inquiry.[3]

In 1647 Dutch linguist and scholar Marcus Zuerius van Boxhorn noted the similarity among Indo-European languages, and supposed the existence of a primitive common language which he called "Scythian". He included in his hypothesis Dutch, Greek, Latin, Persian, and German, later adding Slavic, Celtic and Baltic languages. However, the suggestions of Van Boxhorn did not become widely known and did not stimulate further research.

The hypothesis re-appeared in 1786 when Sir William Jones first lectured on similarities between four of the oldest languages known in his time: Latin, Greek, Sanskrit, and Persian. Thomas Young first used the term Indo-European in 1813,[4] and it became the standard scientific term (except in Germany[5]) through the work of Franz Bopp, whose systematic comparison of these and other old languages supported the theory. Bopp's Comparative Grammar, appearing between 1833 and 1852, counts as the starting-point of Indo-European studies as an academic discipline.

Classification

Further information: List of languages by first written accounts

The various subgroups of the Indo-European language family include ten major branches (in historical order of their first attestation):

  1. Anatolian languages, earliest attested branch. Isolated terms in Old Assyrian sources from the 19th century BC, Hittite texts from about the 16th century BC; extinct by Late Antiquity.
  2. Greek language, fragmentary records in Mycenaean from the late 15th - early 14th century BC; Homeric traditions date to the 8th century BC. (See Proto-Greek language, History of the Greek language.)
  3. Indo-Iranian languages, descending from a common ancestor, Proto-Indo-Iranian
  4. Italic languages, including Latin and its descendants (the Romance languages), attested from the 7th century BC.
  5. Celtic languages, descended from Proto-Celtic. Gaulish inscriptions date as early as the 6th century BC; Old Irish manuscript tradition from about the 8th century AD.
  6. Germanic languages (from Proto-Germanic), earliest testimonies in runic inscriptions from around the 2nd century, earliest coherent texts in Gothic, 4th century AD. Old English manuscript tradition from about the 8th century.
  7. Armenian language, attested from the 5th century AD.
  8. Tocharian languages, extant in two dialects, attested from roughly the 6th to the 9th century AD. Marginalized by the Old Turkic Uyghur Khaganate and likely extinct by the 10th century.
  9. Balto-Slavic languages, believed by most Indo-Europeanists[6] to form a phylogenetic unit, while a minority ascribes the similarities to prolonged language contact.
  10. Albanian language, attested from the 15th century; Proto-Albanian likely emerged from "Paleo-Balkanic" predecessors.

In addition to the classical ten branches listed above, several extinct and little-known languages have existed.

Grouping

Further information: Language families

Of the top 20 contemporary languages in terms of native speakers according to SIL Ethnologue, 12 are Indo-European: Spanish, English, Hindi, Portuguese, Bengali, Russian, German, Marathi, French, Italian, Punjabi and Urdu, accounting for over 1.6 billion native speakers.[7]

Membership of these languages in the Indo-European language family and branches, groups and subgroups thereof, is determined by a genetic relationship, defined by shared innovations which are presumed to have taken place in a common ancestor. For example, what makes Germanic languages "Germanic" is that large parts of the structures of all the languages so designated can be stated just once for all of them. In other words, they can be treated as an innovation that took place in Proto-Germanic, the source of all the Germanic languages.

Exempted from this concept are shared innovations acquired by borrowing (or other means of convergence), which cannot be considered genetic. It has been asserted, for example, that many of the more striking features shared by Italic languages (Latin, Oscan, Umbrian, etc.) might well be "areal features". More certainly, very similar-looking alterations in the systems of long vowels in the West Germanic languages greatly postdate any possible notion of a proto-language innovation (and cannot readily be regarded as "areal", either, since English and continental West Germanic were not a linguistic area). In a similar vein, there are many similar innovations in Germanic and Balto-Slavic that are far more likely to be areal features than traceable to a common proto-language, such as the uniform development of a high vowel (*u in the case of Germanic, *i/u in the case of Baltic and Slavic) before the PIE syllabic resonants *ṛ, *ḷ, *ṃ, *ṇ, unique to these two groups among IE languages. The Balkan sprachbund even features areal convergence that comprises very different branches.

For the evolutionary history of a language family, a genetic "tree model" is considered appropriate only if communities do not remain in effective contact as their languages diverge. Otherwise, a "wave model" applies, featuring borrowings and no clear underlying genetic tree. Using an extension to the Ringe-Warnow model of language evolution, early IE was confirmed to have featured limited contact between distinct lineages, while only the Germanic subfamily exhibited less treelike behaviour, as it acquired some characteristics from neighbours early in its evolution rather than from its direct ancestors. The internal diversification of West Germanic in particular is said to have been radically non-treelike.[8]

The Indo-Iranian languages form the largest sub-branch of Indo-European in terms of the number of native speakers as well as in terms of the number of individual languages.

Proposed subgroupings

Specialists have postulated the existence of such subfamilies (subgroups) as Italo-Celtic, Graeco-Armenian, Graeco-Aryan, and Germanic with Balto-Slavic. The vogue for such subgroups waxes and wanes; Italo-Celtic for example used to be a standard subgroup of Indo-European, but it is now little honored, in part because much of the evidence on which it was based has turned out to have been misinterpreted.

Subgroupings of the Indo-European languages are commonly held to reflect genetic relationships and linguistic change. The generic differentiation of Proto-Indo-European into dialects and languages happened hand in hand with language contact and the spread of innovations over different territories.

Rather than being entirely genetic, the grouping of satem languages is commonly inferred as an innovative change that occurred just once, and subsequently spread over a large cohesive territory or PIE continuum that affected all but the peripheral areas.[9] For instance, Kortlandt proposes this satemization process involved interaction between a western and central Indo-European sphere of influence to the ancestors of Balts and Slavs.[10]

Shared features of Phrygian and Greek[11] and of Thracian and Armenian[12] group the southeastern branches of Indo-European together. Some fundamental shared features, like the verbal aorist category (this is a verb form denoting action without reference to duration or completion) having the perfect active particle -s fixed to the stem, link this group closer to Anatolian languages[13] and Tocharian. Shared features with Balto-Slavic languages, on the other hand (especially present and preterit formations), might be due to later contacts.[14]

The Indo-Hittite hypothesis proposes that the Indo-European language family consists of two main branches: one represented by the Anatolian languages and another encompassing all other Indo-European languages. Features that separate Anatolian from all other branches of Indo-European (such as the gender or the verb system) have been interpreted alternately as archaic debris or as innovations due to prolonged isolation. Points proffered in favour of the Indo-Hittite hypothesis are the (non-universal) Indo-European agricultural terminology in Anatolia[15] and the preservation of laryngeals.[16] However, in general this hypothesis is considered to attribute too much weight to the Anatolian evidence. According to another view, the Anatolian subgroup left the Indo-European parent language comparatively late, approximately at the same time as Indo-Iranian and later than the Greek or Armenian divisions. A third view, especially prevalent in the so-called French school of Indo-European studies, holds that extant similarities in non-satem languages in general, including Anatolian, might be due to their peripheral location in the Indo-European language area and early separation, rather than indicating a special ancestral relationship.[17] Holm (2008),[18] based on lexical calculations, arrives at a picture roughly replicating the general scholarly opinion and refuting the Indo-Hittite hypothesis.

Satem and centum languages

Main article: Centum-Satem isogloss
Diachronic map showing the Centum (blue) and Satem (red) areas. The supposed area of origin of satemization is shown in darker red (Sintashta/Abashevo/Srubna cultures).

The terms Centum and Satem are used to describe the evolution of the three original sets of velar consonants that have been reconstructed for Proto-Indo-European: *kʷ (labiovelars), *k (velars), and *ḱ (palatovelars). Satem languages (Indo-Iranian and Balto-Slavic) lost the distinction between labiovelar and pure velar sounds, and at the same time assibilated the palatal velars. The Centum languages (Germanic, Italic, and Celtic), on the other hand, merged the palatal velars with the pure velars.
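
As a rough illustration, the two merger patterns just described can be written as mappings from the three reconstructed series to their outcomes. The sketch below uses Python; the symbols *kʷ and *ḱ are the conventional notation for the labiovelar and palatovelar series, while the dictionary names, the helper merged_series, and the bare "s" standing in for the assibilated outcome are assumptions made for illustration only, since the actual reflexes differ from branch to branch.

```python
# Illustrative sketch only: the three reconstructed PIE velar series and the
# two merger patterns described in the text. "s" is a placeholder for the
# assibilated outcome; real reflexes vary by branch.
PALATOVELAR, PLAIN_VELAR, LABIOVELAR = "ḱ", "k", "kʷ"

# Centum pattern: palatovelars fall together with plain velars.
CENTUM = {PALATOVELAR: "k", PLAIN_VELAR: "k", LABIOVELAR: "kʷ"}

# Satem pattern: labiovelars fall together with plain velars,
# and palatovelars are assibilated.
SATEM = {PALATOVELAR: "s", PLAIN_VELAR: "k", LABIOVELAR: "k"}


def merged_series(pattern):
    """Return the groups of PIE series that share a single outcome."""
    groups = {}
    for pie_sound, outcome in pattern.items():
        groups.setdefault(outcome, []).append(pie_sound)
    return [sounds for sounds in groups.values() if len(sounds) > 1]


if __name__ == "__main__":
    print("Centum mergers:", merged_series(CENTUM))
    print("Satem mergers:", merged_series(SATEM))
```

Each pattern reduces three series to two, but a different pair merges in each case, which is why neither pattern can be derived from the other.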

Note that the terms "Centum" or "Satem" do not imply that Centum languages descend from a "proto-Centum" or that languages exhibiting Satem features descend from a "proto-Satem". Most modern scholars see the Satem sound change as an areal feature radiating outward from the central Indo-European language communities, but largely failing to reach the western and eastern peripheries.

The Satem-Centum isogloss runs right between the Greek (Centum) and Armenian (Satem) languages (which a number of scholars regard as closely related), with Greek exhibiting some marginal Satem features. Some scholars think that some languages classify neither as Satem nor as Centum (Anatolian, Tocharian, and possibly Albanian).

Areal contact among already distinct post-PIE languages (say, during the 3rd millennium BC) may have spread the sound changes involved. In any case, present-day specialists are rather less galvanized by the division than 19th-century scholars were, partly because of the recognition that it is, after all, just one isogloss among the multitudes that criss-cross Indo-European linguistic geography.

Suggested superfamilies

Some linguists propose that Indo-European languages form part of a hypothetical Nostratic language superfamily, and attempt to relate Indo-European to other language families, such as South Caucasian languages, Altaic languages, Uralic languages, Dravidian languages, and Afro-Asiatic languages. This theory remains controversial, like the similar Eurasiatic theory of Joseph Greenberg, and the Proto-Pontic postulation of John Colarusso. Objections to such groupings are not based on any theoretical claim about the likely historical existence or non-existence of such super-families; it is entirely reasonable to suppose that they existed. The difficulty in identifying the details of actual relationships between language families, however, comes in finding concrete evidence that transcends chance resemblance. Since the noise-to-signal ratio in historical linguistics increases steadily over time, at great enough time-depths it becomes open to reasonable doubt that it can even be possible to distinguish between signal and noise.

Historical evolution

Proto-Indo-European

The Proto-Indo-European language (PIE) is the common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans. The classical phase of Indo-European comparative linguistics leads from Franz Bopp's Comparative Grammar (1833) to August Schleicher's 1861 Compendium and up to Karl Brugmann's Grundriss, published from the 1880s. Brugmann's junggrammatische re-evaluation of the field and Ferdinand de Saussure's development of the laryngeal theory may be considered the beginning of "contemporary" Indo-European studies. The generation of Indo-Europeanists active in the last third of the 20th century (such as Calvert Watkins, Jochem Schindler and Helmut Rix) developed a better understanding of morphology and, in the wake of Kuryłowicz's 1956 Apophonie, of the ablaut. From the 1960s, knowledge of Anatolian became certain enough to establish its relationship to PIE. Using the method of internal reconstruction, an earlier stage, called Pre-Proto-Indo-European, has been proposed.

PIE was an inflected language, in which the grammatical relationships between words were signaled through inflectional morphemes (usually endings). The roots of PIE are basic morphemes carrying a lexical meaning. By addition of suffixes, they form stems, and by addition of desinences (usually endings), these form grammatically inflected words (nouns or verbs). The hypothetical Indo-European verb system is complex and, like the noun, exhibits a system of ablaut.
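
The root, stem and ending layering described above can be sketched in a few lines of Python, using the thematic present of *bʰer- 'to carry' as reconstructed in this article's conjugation comparison. The segmentation into root, thematic vowel (*e or *o) and personal ending is a conventional analysis assumed here for illustration, and the names ROOT, PARADIGM and inflect are hypothetical.

```python
# Minimal sketch, assuming the conventional segmentation
# root + thematic vowel (suffix) + personal ending (desinence).
ROOT = "bʰér"

# (person/number label, thematic vowel, ending): an illustrative analysis.
PARADIGM = [
    ("1sg", "o", "h₂"),
    ("2sg", "e", "si"),
    ("3sg", "e", "ti"),
    ("1pl", "o", "mos"),
    ("2pl", "e", "te"),
    ("3pl", "o", "nti"),
]


def inflect(root, thematic_vowel, ending):
    """Build a reconstructed word form: the stem is root + suffix, then the ending."""
    stem = root + thematic_vowel
    return "*" + stem + ending  # the asterisk marks a reconstructed form


FORMS = {label: inflect(ROOT, vowel, ending) for label, vowel, ending in PARADIGM}

if __name__ == "__main__":
    for label, form in FORMS.items():
        print(label, form)
```

The alternation between *e and *o in the suffix is one small instance of the ablaut mentioned above.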

Diversification

The diversification of the parent language into the attested branches of daughter languages is historically unattested. The timeline of the evolution of the various daughter languages, on the other hand, is mostly undisputed, quite regardless of the question of Indo-European origins.

mid 2nd millennium BC distribution
mid 1st millennium BC distribution
post-Roman Empire and Migrations period distribution
late medieval distribution (after Islamic, Hungarian and Turkic expansions)

Sound changes

Main article: Indo-European sound laws

As the Proto-Indo-European language broke up, its sound system diverged as well, changing according to various sound laws evidenced in the daughter-languages. Notable cases of such sound laws include Grimm's law in Proto-Germanic, loss of prevocalic *p- in Proto-Celtic, loss of prevocalic *s- in Proto-Greek, Brugmann's law in Proto-Indo-Iranian, as well as satemization (discussed above). Grassmann's law and Bartholomae's law may or may not have operated at the common Indo-European stage.
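
To make the idea of a regular sound law concrete, the following Python sketch applies a deliberately simplified version of Grimm's law to a reconstructed form. The mapping covers only the three basic stop series; labiovelars, palatovelars, Verner's law and other conditioned exceptions are left out, and the function name apply_grimm is a hypothetical helper, not part of any library.

```python
# Simplified sketch of Grimm's law (Proto-Indo-European -> Proto-Germanic).
# Assumption: only the core stop shifts are modelled, with no exceptions.
GRIMM = {
    "bʰ": "b", "dʰ": "d", "gʰ": "g",  # voiced aspirates -> plain voiced stops
    "b": "p", "d": "t", "g": "k",     # voiced stops -> voiceless stops
    "p": "f", "t": "þ", "k": "h",     # voiceless stops -> fricatives
}


def apply_grimm(form):
    """Apply every shift once, left to right, so one shift never feeds another."""
    out = []
    i = 0
    while i < len(form):
        pair = form[i:i + 2]
        if pair in GRIMM:            # two-character aspirates take priority
            out.append(GRIMM[pair])
            i += 2
        elif form[i] in GRIMM:
            out.append(GRIMM[form[i]])
            i += 1
        else:                        # vowels and other sounds pass through
            out.append(form[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(apply_grimm("bʰer"))  # compare Old Norse bera 'to carry'
```

Scanning the input in a single pass matters: applying the rules repeatedly would wrongly turn *bʰ into *b and then into *p.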

Comparison of conjugations

The following table presents a comparison of conjugations of the thematic present indicative of the verbal root *bʰer- 'to carry' (whence English verb to bear) and its reflexes in various early attested IE languages and their modern descendants or relatives, showing that all languages had in the early stage an inflectional verb system.

Proto-Indo-European (*bʰer- 'to carry')
I (1st. Sg.)          *bʰéroh₂
You (2nd. Sg.)        *bʰéresi
He/She/It (3rd. Sg.)  *bʰéreti
We (1st. Pl.)         *bʰéromos
You (2nd. Pl.)        *bʰérete
They (3rd. Pl.)       *bʰéronti

Language Family       Indo-Aryan      Greek          Italic   Germanic   Celtic     Slavic
Language              Vedic Sanskrit  Ancient Greek  Latin    Old Norse  Old Irish  OCS
I (1st. Sg.)          bhárāmi         phérō          ferō     bera       biru       berǫ
You (2nd. Sg.)        bhárasi         phéreis        fers     berr       biri       bereši
He/She/It (3rd. Sg.)  bhárati         phérei         fert     berr       berid      beretъ
We (1st. Pl.)         bhárāmas        phéromen       ferimus  berum      bermai     beremъ
You (2nd. Pl.)        bháratha        phérete        fertis   berið      beirthe    berete
They (3rd. Pl.)       bháranti        phérousi       ferunt   bera       berait     berǫtъ

Language              Hindi                Greek     French              Faroese              Irish             Czech
I (1st. Sg.)          (maiṃ) bhartā (hūṃ)  phéro     (je) {con}fère      (eg) beri            beirim            beru
You (2nd. Sg.)        (tū) bhartā (hai)    phéreis   (tu) {con}fères     (tú) bert            beireann (tú)     bereš
He/She/It (3rd. Sg.)  (vah) bhartā (hai)   phérei    (il) {con}fère      (hann/hon/tað) ber   beireann (sé/sí)  bere
We (1st. Pl.)         (ham) bharte (haiṃ)  phéroume  (nous) {con}ferons  (vit) bera           beirimid          berem(e)
You (2nd. Pl.)        (tum) bharte (ho)    phérete   (vous) {con}ferez   (tit) bera           beireann (sibh)   berete
They (3rd. Pl.)       (ve) bharte (haiṃ)   phéroun   (ils) {con}fèrent   (teir/tær/tey) bera  beireann (siad)   berou

While similarities are still visible between the modern descendants and relatives of these ancient languages, the differences have increased over time. Some IE languages have moved from synthetic verb systems to largely periphrastic systems. The pronouns of periphrastic forms are in brackets when they appear. Some of these verbs have undergone a change in meaning as well.

Citations and notes

  1. 449 according to the 2005 SIL estimate, about half (219) belonging to the Indo-Aryan sub-branch.
  2. the Sino-Tibetan family of tongues has the second-largest number of speakers.
  3. Auroux, Sylvain (2000). History of the Language Sciences. Berlin, New York: Walter de Gruyter. p. 1156. ISBN 3110167352. http://books.google.com/books?id=yasNy365EywC&pg=PA1156&vq=stephens+sassetti&dq=3110167352&as_brr=3&sig=nOsHuf3fqPmzmjmGYk1UnvSiFAs.
  4. In London Quarterly Review X/2, 1813; cf. Szemerényi 1999:12, footnote 6.
  5. In German the term is indogermanisch 'Indo-Germanic', which indicates the east-west extension. That term was first recorded in French, as indo-germanique, in 1810 by Conrad Malte-Brun, a French geographer of Danish descent.
  6. such as Schleicher 1861, Szemerényi 1957, Collinge 1985, and Beekes 1995
  7. 308 languages according to SIL; more than one billion speakers (see List of languages by number of native speakers). Historically, also in terms of geographical spread (stretching from the Caucasus to South Asia; cf. Scythia).
  8. Luay Nakhleh, Don Ringe & Tandy Warnow, "Perfect Phylogenetic Networks: A New Methodology for Reconstructing the Evolutionary History of Natural Languages", Language (Journal of the Linguistic Society of America), Volume 81, Number 2, June 2005.
  9. Britannica 15th edition, vol.22, 1981, p.588, 594
  10. Frederik Kortlandt, The spread of the Indo-Europeans, 1989.
  11. Lubotsky - The Old Phrygian Areyastis-inscription, Kadmos 27, 9-26, 1988
  12. Kortlandt - The Thraco-Armenian consonant shift, Linguistique Balkanique 31, 71-74, 1988
  13. Encyclopaedia Britannica, vol.22, Helen Hemingway Benton Publisher, Chicago, (15th ed.) 1981, p.593
  14. George S. Lane, Douglas Q. Adams, Britannica 15th edition 22:667, "The Tocharian problem"
  15. The supposed autochthony of Hittites, the Indo-Hittite hypothesis and migration of agricultural "Indo-European" societies became intrinsically linked together by C. Renfrew. (Renfrew, C 2001a The Anatolian origins of Proto-Indo-European and the autochthony of the Hittites. In R. Drews ed., Greater Anatolia and the Indo-Hittite language. family: 36-63. Washington, DC: Institute for the Study of Man).
  16. Britannica 15th edition, 22 p. 586 "Indo-European languages, The parent language, Laryngeal theory" - W.C.; p. 589, 593 "Anatolian languages" - Philo H.J. Houwink ten Cate, H. Craig Melchert and Theo P.J. van den Hout
  17. Britannica 15th edition, 22 p. 594, "Indo-Hittite hypothesis"
  18. Holm, Hans J.: The Distribution of Data in Word Lists and its Impact on the Subgrouping of Languages. In: Christine Preisach, Hans Burkhardt, Lars Schmidt-Thieme, Reinhold Decker (eds.): Data Analysis, Machine Learning, and Applications. Proc. of the 31th Annual Conference of the German Classification Society (GfKl), University of Freiburg, March 7-9, 2007. Springer-Verlag, Heidelberg-Berlin (2008).
