The Balkan sprachbund or linguistic area is the ensemble of areal features (similarities in grammar, syntax, vocabulary and phonology) among the languages of the Balkans. Several features are found across these languages, though not every feature applies to every language. The languages in question are not closely related: they belong to various branches of Indo-European (such as Slavic, Greek, Romance, Albanian and Indo-Aryan) and also include languages outside Indo-European (such as Turkish). Some of the languages use these features in their standard form (i.e. those whose homeland lies almost entirely within the region), whilst other populations, for whom the region is not a cultural centre because they have wider communities outside it, may still adopt the features in their local register; this in turn is regarded as non-standard by speakers of the same language away from the region.
While they share little vocabulary, their grammars have very extensive similarities; for example they have similar case systems and verb conjugation systems and have all become more analytic, although to differing degrees.
The earliest scholar to notice the similarities between Balkan languages belonging to different families was the Slovenian scholar Jernej Kopitar in 1829.[1] August Schleicher (1850)[2] more explicitly developed the concept of areal relationships as opposed to genetic ones, and Franc Miklošič (1861)[3] studied the relationships of Balkan Slavic and Romance more extensively.
Nikolai Trubetzkoy (1923),[4] Kristian Sandfeld-Jensen (1930),[5] and Gustav Weigand (1925)[6] developed the theory in the 1920s and 1930s.
In the 1930s, the Romanian linguist Alexandru Graur criticized the notion of “Balkan linguistics,” saying that one can talk about “relationships of borrowings, of influences, but not about Balkan linguistics”.[7]
The term "Balkan linguistic union" was coined by the Romanian linguist Alexandru Rosetti in 1958, when he claimed that the shared features conferred a special similarity on the Balkan languages. Theodor Capidan went further, claiming that the structure of Balkan languages could be reduced to a standard language. Many of the earliest reports on this theory were in German, hence the term "Balkansprachbund" is often used as well.
The languages that share these similarities belong to five distinct branches of the Indo-European languages: Albanian, Greek, Romance, Slavic and Indo-Aryan.
However, not all of these languages share the same number of features. That is why they are divided into three groups:
The Finnish linguist Jouko Lindstedt computed in 2000 a "Balkanization factor", which gives each Balkan language a score proportional to the number of features it shares with the Balkan linguistic union.[8] The results were:
Language | Score |
---|---|
Macedonian | 12 |
Balkan Slavic | 11.5 |
Albanian | 10.5 |
Greek, Balkan Romance | 9.5 |
Romani (Gypsy) | 7.5 |
Another language that may have been influenced by the Balkan language union is the Judeo-Spanish variant that used to be spoken by Sephardi Jews living in the Balkans. The grammatical features shared (especially regarding the tense system) were most likely borrowed from Greek.
The source of these features and the direction in which they spread have long been debated, and various theories have been suggested.
Since most of these features cannot be found in languages related to those that belong to the linguistic union (such as other Slavic or Romance languages), early researchers, including Kopitar, believed they must have been inherited from the Paleo-Balkan languages (Illyrian, Thracian and Dacian) which formed the substrate for modern Balkan languages. But since very little is known about Paleo-Balkan languages, it cannot be determined whether the features were present. The strongest candidate for a shared Paleo-Balkan feature is the postposed article.
Another theory, advanced by Kristian Sandfeld in 1930, was that these features were an entirely Greek influence, under the presumption that since Greece "always had a superior civilization compared to its neighbours", Greek could not have borrowed its linguistic features from them. However, no ancient dialects of Greek possessed Balkanisms, so that the features shared with other regional languages appear to be post-classical innovations. Also, Greek appears to be only peripheral to the Balkan linguistic union, lacking some important features, such as the postposed article. Nevertheless, several of the features that Greek does share with the other languages (loss of dative, replacement of infinitive by subjunctive constructions, object clitics, formation of future with auxiliary verb "to want") probably originated in Medieval Greek and spread to the other languages through Byzantine influence.
The Roman Empire ruled all of the Balkans, and local varieties of Latin may have left their mark on all the languages there, which later formed the substrate for the Slavic newcomers. This was proposed by Georg Solta. The weak point of this theory is that other Romance languages have few of the features, and there is no proof that the Balkan Romans were isolated for long enough to develop them. An argument in its favour would be the structural borrowings or "linguistic calques" into Macedonian from Aromanian, which could be explained by Aromanian being a substrate of Macedonian, but this still does not explain the origin of these innovations in Aromanian. The analytic perfect with the auxiliary verb "to have" (which Balkan languages share with Western European languages) is the only feature whose origin can fairly safely be traced to Latin.
The most commonly accepted theory, advanced by Polish scholar Zbigniew Gołąb, is that the innovations came from different sources and the languages influenced each other: some features can be traced from Latin, Slavic or Greek languages, while others, particularly features that are shared only by Romanian, Albanian, Macedonian and Bulgarian, could be explained by the substratum kept after Romanization (in the case of Romanian) or Slavicization (in the case of Bulgarian). Albanian was influenced by both Latin and Slavic, but it kept many of its original characteristics.
Several arguments favour this theory. First, throughout the turbulent history of the Balkans, many groups of people moved to another place, inhabited by people of another ethnicity. These small groups were usually assimilated quickly and sometimes left marks in the new language they acquired. Second, the use of more than one language was common in the Balkans before the modern age, and a drift in one language would quickly spread to other languages. Third, the dialects that have the most "balkanisms" are those in regions where people had contact with people of many other languages.
According to the central hypothesis of a project undertaken by the Austrian Science Fund FWF, Old Albanian had a significant influence on the development of many Balkan languages. Intensive research now aims to confirm this theory. This little-known language is being researched using all available texts before a comparison with other Balkan languages is carried out. The outcome of this work will include the compilation of a lexicon providing an overview of all Old Albanian verbs. As project leader Dr. Schumacher explains, the research is already bearing fruit: "So far, our work has shown that Old Albanian contained numerous modal levels that allowed the speaker to express a particular stance to what was being said. Compared to the existing knowledge and literature, these modal levels are actually more extensive and more nuanced than previously thought. We have also discovered a great many verbal forms that are now obsolete or have been lost through restructuring - until now, these forms have barely even been recognized or, at best, have been classified incorrectly." These verbal forms are crucial to explaining the linguistic history of Albanian and its internal usage. However, they can also shed light on the reciprocal relationship between Albanian and its neighbouring languages. The researchers are following various leads which suggest that Albanian played a key role in the Balkan Sprachbund. For example, it is likely that Albanian is the source of the suffixed definite article in Romanian, Bulgarian and Macedonian, as this has been a feature of Albanian since ancient times.[9]
The earliest contact was most likely between the Proto-Romanians and Proto-Albanians (1st to 5th century AD). This theory is supported by the Albanian vocabulary borrowed from Balkan Latin, as well as by the Romanian substrate, which has words cognate to Albanian words.
The exact area where contact occurred is under debate, ranging from Northern Albania to Transylvania. For more, see Origin of Romanians and Origin of Albanians. All Romanian varieties (from the Republic of Moldova to the Vlachs of Serbia) are part of the sprachbund, which shows that the contact happened before they diverged.
The invasion of the Slavs led to a period of migrations throughout the Balkans, which created multi-ethnic communities. The sprachbund began to form as a result around the 8th century; most features were present by the 12th century, but in some parts the process continued until the 17th century.
The number of cases is reduced, with several cases replaced by prepositional constructions; the only exception is Serbian. In Bulgarian and Macedonian, by contrast, this development has led to the loss of all cases except the vocative.
A common case system of a Balkan language is:
In the Balkan languages, the genitive and dative cases (or corresponding prepositional constructions) undergo syncretism.
Example:
Language | Dative | Genitive |
---|---|---|
English | I gave the book to Maria. | It is Maria's book. |
Albanian | Librin i'a (ja) dhashë Marisë. | Libri është i Marisë. |
Aromanian | U-ded vivliapi Maria. | Easte vivlia ali Marie. |
Bulgarian | Дадох книгата на Мария [dadoh knigata na Marija] | Книгата е на Мария [knigata e na Marija] |
Romanian | I-am dat cartea Mariei. (colloq. for fem., oblig. for masc.: I-am dat cartea lui Marian.) | Este cartea Mariei. (colloq. for fem., oblig. for masc.: Este cartea lui Marian.) |
Macedonian | Ѝ ја дадов книгата на Марија. [ì ja dadov knigata na Marija] | Книгата е на Марија. [knigata e na Marija] |
Greek | Έδωσα το βιβλίο στην Μαρία. [édhosa to vivlío stin María] or Έδωσα το βιβλίο της Μαρίας. [édhosa to vivlío tis Marías] | Είναι το βιβλίο της Μαρίας. [íne to vivlío tis Marías] |
Greek | Της το έδωσα [tis to édhosa] 'I gave it to her.' | Είναι το βιβλίο της. [íne to vivlío tis] 'It is her book.' |
Language | "in Greece" | "into Greece" |
---|---|---|
Albanian | në Greqi | në Greqi |
Aromanian | tu Gârția | tu Gârția |
Bulgarian | в Гърция (v Gărcija) | в Гърция (v Gărcija) |
Greek | στην Ελλάδα (stin Elládha) | στην Ελλάδα (stin Elládha) |
Macedonian | Во Грција (vo Grcija) | Во Грција (vo Grcija) |
Romanian | în Grecia | în Grecia |
The future tense is formed analytically, using an auxiliary verb or particle with the meaning "will, want" (a construction referred to as de-volitive), similar to the way the future is formed in English. This feature is present to varying degrees in each language. Decategorialization is less advanced in Romanian voi and in Serbian ću, ćeš, će, where the future marker is still an inflected auxiliary. In Modern Greek, Bulgarian, Macedonian, and Albanian, decategorialization and erosion have given rise to an uninflected tense form, in which the frozen third person singular of the verb has turned into an invariable particle followed by the main verb inflected for person.[8]
Language | Variant | Formation | Example: "I'll see" |
---|---|---|---|
Albanian | Tosk | "do" (invariant) + subjunctive | Do të shoh |
Albanian | Gheg | "kam" (conjugated) + infinitive | Kam me pa |
Aromanian | | "va" (invariant) + subjunctive | Va s-ved |
Greek | | "θα" (invariant) + subjunctive | Θα δω / βλέπω (tha dho / vlépo); "I'll see / be seeing" |
Bulgarian | | "ще" (invariant) + present tense | Ще видя (shte vidya) |
Macedonian | | "ќе" (invariant) + present tense | Ќе видам (kje vidam) |
Serbian | (literary standard) | "хтети/hteti" (conjugated) + infinitive | Ја ћу видети (видећу) (ja ću videti [videću]) |
Serbian | (colloquial) | "хтети/hteti" (conjugated) + subjunctive | Ја ћу да видим (ja ću da vidim) |
Romanian | (literary standard) | "a voi" (conjugated) + infinitive | Voi vedea/vedere (Note: Compare to Spanish "voy a ver") |
Romanian | (colloquial) | "o" (invariant) + subjunctive | O să văd |
Romanian | (colloquial alternate) | "a avea" (conjugated) + subjunctive | Am să văd |
Romanian | (archaic) | "va" (invariant) + subjunctive | Va să văd |
Romani | (Erli)[10] | "ka" (invariant) + subjunctive | Ka dikhav |
The analytic perfect tense is formed in the Balkan languages with the verb "to have" and, usually, a past passive participle, similarly to the construction found in Germanic and other Romance languages: e.g. Romanian am promis "I have promised", Albanian kam premtuar "I have promised". A somewhat less typical case of this is Greek, where the verb "to have" is followed by the so-called απαρέμφατο ('invariant form', historically the aorist infinitive): έχω υποσχεθεί. However, a completely different construction is used in Bulgarian and Serbian, which have inherited from Common Slavic an analytic perfect formed with the verb "to be" and the past active participle: обещал съм, obeštal sǎm (Bul.) / обећао сам, obećao sam (Ser.) - "I have promised" (lit. "I am one who has promised"). On the other hand, Macedonian, the third Slavic language in the Sprachbund, is like Romanian and Albanian in that it uses quite typical Balkan constructions consisting of the verb to have and a past passive participle (имам ветено, imam veteno = "I have promised").
The use of the infinitive (common in other languages related to some of the Balkan languages, such as Romance and Slavic) is generally replaced with subjunctive constructions, following early Greek innovation.
For example, "I want to write" in several Balkan languages:
Language | Example | Notes |
---|---|---|
Albanian | "Dua të shkruaj" | as opposed to Gheg me fjet "to sleep" or me hangër "to eat" |
Aromanian | "Voi să-ngrăpsescu" | |
Macedonian | "Сакам да пишувам" [sakam da pišuvam] | |
Bulgarian | "Искам да пиша" [iskam da piša] | |
Modern Greek | "Θέλω να γράψω" | as opposed to Ancient Greek "βούλομαι γράψαι" |
Romanian | "Vreau să scriu" (with subjunctive); "Vreau a scrie/scriere" (with infinitive) | The use of the infinitive is preferred in writing in some cases only. In speech it is more commonly used in the northern varieties (Transylvania, Banat, and Moldova) than in Southern varieties (Wallachia) of the language.[11] |
Serbian | "Želim da pišem" / "Желим да пишем" | as opposed to the more literary form "Želim pisati" / "Желим писати", where pisati/писати is the infinitive. Both phrases are correct and do not create misunderstandings, although the colloquial one is more commonly used in daily conversation. |
Bulgarian Turkish | "isterim yazayım" | In Standard Turkish in Turkey this is "yazmak istiyorum" where "yazmak" is the infinitive. |
Romani (Erli) | "Mangav te pišinav" | Many forms of Romani add the ending -a to express the indicative present, while reserving the short form for the subjunctive serving as an infinitive: e.g. "mangava te pišinav". Some varieties outside the Balkans have been influenced by non-Balkan languages and have developed new infinitives by generalizing one of the finite forms (e.g. Slovak Romani varieties may express "I want to write" as "kamav te irinel/pisinel" - generalized third person singular - or "kamav te irinen/pisinen" - generalized third person plural). |
A relict form of the infinitive is, however, preserved in Bulgarian:
Language | Without infinitive | With relict "infinitive" | Translation | Notes |
---|---|---|---|---|
Bulgarian | "Недей да пишеш." | "Недей писа." | Don't write. | The first part of the first three examples is the prohibitive element недей ("don't", composed of не, "not", and дей, "do" in the imperative). The second part of the examples, писа, я, зна and да, are relicts of what used to be an infinitive form (писати, ясти, знати and дати respectively). This second syntactic construction is colloquial and more common in the eastern dialects. The forms usually coincide with the past aorist tense of the verb in the third person singular, as in the case of писа; those that don't coincide (as in the last three examples) are highly unusual today, but do occur, above all in older literature. |
Bulgarian | "Недей да ядеш." | "Недей я." | Don't eat. | |
Bulgarian | "Недей да знаеш." | "Недей зна." | Don't know. | |
Bulgarian | "Можете ли да ми дадете?" | "Можете ли ми да?" | Can you give me? | |
Sentences which include only a subjunctive construction can be used to express a wish, a mild command, an intention or a suggestion.
The following table renders the phrase "You should go!" in the Balkan languages, using subjunctive constructions.
Language | Example | Notes |
---|---|---|
Macedonian | Да (си) одиш! | "Оди" [odi] in the imperative is more common, and has the identical meaning. |
Bulgarian | Да си ходиш! | |
Serbian | Да идеш! | "Иди!" in the imperative is grammatically correct, and has the identical meaning. |
Albanian | Të shkosh! | "Shko!" in the imperative is grammatically correct. "Të shkosh" is used in a sentence only after a modal verb, e.g. Ti duhet të shkosh (You should go), Ti mund të shkosh (You can go) etc. |
Modern Greek | Να πας! | |
Romani | Te dža! | |
Romanian | Să te duci! | |
Megleno-Romanian | S-ti duț! | |
Aromanian | S-ti duț! | |
With the exception of Greek, Serbian and Romani, all languages in the union attach the definite article to the end of the noun instead of placing it before the noun. None of the related languages (such as other Romance or Slavic languages) shares this feature, and it is thought to be either an innovation that arose in the Balkans or a borrowing from Albanian that spread there.
However, each language formed its articles from its own inherited material, so the Romanian articles are related to the articles (and demonstrative pronouns) in Italian, French, etc., while the Bulgarian articles are related to demonstrative pronouns in other Slavic languages.
Language | Feminine, without article | Feminine, with article | Masculine, without article | Masculine, with article |
---|---|---|---|---|
English | woman | the woman | man | the man |
Albanian | grua | gruaja | burrë | burri |
Aromanian | muľare | muľarea | bărbat | bărbatlu |
Bulgarian | жена | жената | мъж | мъжът |
Macedonian[12] | жена | жената | маж | мажот |
Romanian | femeie, muiere | femeia, muierea | bărbat | bărbatul |
Torlakian | жена | жената | муж | мужът |
The Slavic way of composing the numbers between 10 and 20, e.g. "one + on + ten" for eleven, called superessive, is widespread.
Greek does not follow this.
Language | The word "Eleven" | compounds |
---|---|---|
Albanian | "njëmbëdhjetë" | një + mbë + dhjetë |
Aromanian | "unăspră" | ună + spră |
Bulgarian | "единадесет" | един + (н)а(д) + десет |
Macedonian | "единаесет" | еде(и)н + (н)а(д) + (д)есет |
Romanian | "unsprezece" or, more commonly, "unșpe" | un + spre + zece < *unu + supre + dece; "unșpe" < unu + spre. The latter form is more commonly used, even in formal speech. |
Bosnian/Croatian/Serbian | "jedanaest/једанаест" | jedan+ (n)a+ (d)es(e)t/један + (н)а + (д)ес(е)т. This is not the case only with South Slavic languages. This word is formed in the same way in most Slavic languages, e.g. Polish - "jedenaście", Czech - "jedenáct", Slovak - "jedenásť", Russian - "одиннадцать", Ukrainian - "одинадцять", etc. |
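The "one + on + ten" pattern can be illustrated with a minimal Python sketch. It covers only Albanian and Romanian, because for those two rows simple concatenation of the morphemes listed in the table reproduces the attested word; the Slavic forms involve sound reductions that plain concatenation does not model. The names `SUPERESSIVE_PARTS` and `compose_eleven` are hypothetical and introduced here purely for illustration.

```python
# Illustrative sketch only: morphemes are copied from the table above.
# SUPERESSIVE_PARTS and compose_eleven are hypothetical names, not part of
# any library. Only languages where plain concatenation yields the attested
# form are included.
SUPERESSIVE_PARTS = {
    "Albanian": ("një", "mbë", "dhjetë"),  # one + on + ten
    "Romanian": ("un", "spre", "zece"),    # one + on + ten
}


def compose_eleven(language: str) -> str:
    """Join unit, linking preposition and 'ten' into a single word."""
    unit, on, ten = SUPERESSIVE_PARTS[language]
    return unit + on + ten


for language in SUPERESSIVE_PARTS:
    print(language, compose_eleven(language))
```

The sketch deliberately avoids the Bulgarian, Macedonian and Serbian rows, where the bracketed letters in the table mark sounds that are dropped in the compound.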
Direct and indirect objects are cross-referenced, or doubled, in the verb phrase by a clitic (weak) pronoun, agreeing with the object in gender, number, and case or case function. This can be found in Romanian (although mostly optional[13]), Greek, Bulgarian, Macedonian, and Albanian. In Albanian and Macedonian, this feature shows fully grammaticalized structures and is obligatory with indirect objects and to some extent with definite direct objects; in Bulgarian, however, it is optional and therefore based on discourse. In Greek, the construction contrasts with the clitic-less construction and marks the cross-referenced object as a topic. Southwest Macedonia appears to be the location of innovation.
For example, "I see George" in Balkan languages:
Language | Example |
---|---|
Albanian | "E shoh Gjergjin" |
Aromanian | "U- ved Yioryi" |
Bulgarian | "Виждам го Георги." (colloquial form; see note) |
Macedonian | "Гo гледам Ѓорѓи." |
Greek | "Τον βλέπω τον Γιώργο" |
Romanian | "Îl văd pe Gheorghe." or simply "Văd pe Gheorghe."[13] |
Note: The neutral case in normal (SVO) word order is without a clitic: "Виждам Георги." However, the form with an additional clitic pronoun is also possible in colloquial speech: "Виждам го Георги." And the clitic is obligatory in the case of a topicalized object (with OVS-word order), which serves also as the common colloquial equivalent of a passive construction. "Георги го виждам."
The replacement of synthetic adjectival comparative forms with analytic ones by means of preposed markers is common. These markers are:
Macedonian and Modern Greek have retained some of the earlier synthetic forms. In Macedonian these have become proper adjectives in their own right, without the possibility of further comparison (e.g. виш, "higher, superior"; ниж, "lower, inferior").
Also, some common suffixes can be found in the linguistic area, such as the diminutive suffix of the Slavic languages (Srb. Bul. Mac.) "-ovo" "-ica" that can be found in Albanian, Greek and Romanian.
Several hundred words are common to the Balkan union languages; the origin of most of them is either Greek, Bulgarian or Turkish, as the Byzantine Empire, the First Bulgarian Empire, the Second Bulgarian Empire and later the Ottoman Empire directly controlled the territory throughout most of its history, strongly influencing its culture and economics.
Albanian, Aromanian, Bulgarian, Greek, Romanian, Serbian and Macedonian also share a large number of words of various origins:
Source | Source word | Meaning | Albanian | Aromanian | Bulgarian | Greek | Romanian | Macedonian | Serbian | Turkish |
---|---|---|---|---|---|---|---|---|---|---|
Latin | mensa | table | mensa (tavolinë) | masã | маса (masa) | - | masă | маса (masa) | - | masa |
Thracian | rompea | spear | - | roféja | руфия (rufiya) - dialectal, meaning "thunderbolt" | ρομφαία (rhomphaía) | - | - | - | |
Byzantine Greek | λιβάδιον (libádion) | meadow | lëndinë | livadã | ливада (livada) | λιβάδι | livadă | ливада (livada) | livada ливада (livada) | - |
Byzantine Greek | διδάσκαλος (didáskalos) | teacher | - | dascal | даскал (daskal) (colloquial) | δάσκαλος | dascăl | даскал (daskal) (colloquial) | даскал (daskal) (colloquial) | - |
Byzantine Greek | κουτίον (koutíon) | box | kuti | cutii | кутия (kutiya) | κουτί | cutie | кутија (kutija) | kutija кутија (kutija) | kutu |
Turkish | boya | paint, color | bojë (but also ngjyrë) | boi | боя (boya) | μπογιά (boyá) | boia | боја (boja) | boja боја (boja) | boya |
Apart from the direct loans, there are also many calques that were passed from one Balkan language to another, most of them between Albanian, Macedonian, Bulgarian, Greek, Aromanian and Romanian.
For example, the word "ripen" (as in fruit) is formed in Albanian, Romanian and (rarely) Greek (piqem, a (se) coace, ψήνομαι), as well as in Turkish (pişmek), by derivation from the word "to bake" (pjek, a coace, ψήνω).[14]
Another example is the wish "(∅/to/for) many years":
Language | Variant | Expression | Transliteration |
---|---|---|---|
Greek | (medieval) | εις έτη πολλά | is eti polla |
Greek | (modern) | χρόνια πολλά | khronia polla |
Latin | | ad multos annos | |
Aromanian | | ti mulț ań | |
Romanian | | la mulţi ani | |
Albanian | | për shumë vjet | |
Bulgarian | | за много години | za mnogo godini |
Macedonian | | за многу години | za mnogu godini |
Serbian | | за многo годинa | za mnogo godina |
Idiomatic expressions for "whether one <verb> or not" are formed as "<verb>-not-<verb>".[15]
Language | expression | meaning |
---|---|---|
Bulgarian | ще - не ще | "whether one wants or not" |
Greek | θέλει δε θέλει | "whether one wants or not" |
Romanian | vrea nu vrea | "whether one wants or not" |
Turkish | ister istemez | "whether one wants or not" |
Serbian | Hteo- ne hteo/хтео - не хтео | "whether one wants or not" |
Albanian | deshti - nuk deshti | "whether one wants or not" |
Macedonian | сакал - не сакал / нејќел | "whether one wants or not" |
Aromanian | i vrei - i nu vrei | "whether one wants or not" |
The main phonological features consist of:
This feature also occurs in Greek, but it is lacking in some of the other Balkan languages; the central vowel is found in Romanian, Bulgarian, some dialects of Albanian, and Serbian, but not in Greek or Standard Macedonian.
Less widespread features are confined largely to either Romanian or Albanian, or both: