Linguasphere language code
From Wikipedia, the free encyclopedia
The Linguasphere language code is a reference system for the world's languages, used by the Linguasphere Observatory and published in its Linguasphere Register. It is an expansive, flexible system in which the code itself relates each language or dialect to others. In this it is quite unlike the scheme used by Ethnologue, whose three-letter codes are unrelated mnemonics. Since 2006, the Linguasphere language codes have become part of a wider cooperation with UNESCO and MAAYA on securing a multilingual internet, and so protecting endangered languages.
The first part of the Linguasphere code is a decimal classification consisting of two numerals, from 00 to 99. This part is fixed, and is a systematic framework for the classification of the world's languages. Although the classification method used in this part of the code is familiar to many linguists, the definitions in the Linguasphere Register use their own terminology. The first numeral of the code represents the sector into which the world's languages are divided. A sector can be either a phylosector, whose constituent languages are considered to be genetically related to one another, or a geosector, whose languages are grouped geographically rather than genetically.
The second numeral is used to represent the zone into which each sector is divided. The zones, like the sectors, are described as either phylozones or geozones based on the relationship of their languages, one to another: genetic or geographical.
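As an illustration of the two-numeral part, the following minimal Python sketch splits it into a sector label and a zone label. The function name `split_numeric_part` is hypothetical, and the trailing `=` on each label is an assumption that simply follows the notation used for the English example in this article (5= and 52=).

```python
def split_numeric_part(numeric: str) -> tuple[str, str]:
    """Return (sector, zone) labels for a two-numeral part such as "52".

    Assumption: labels are written with a trailing "=", as in the
    English example (5= for the sector, 52= for the zone).
    """
    if len(numeric) != 2 or not (numeric.isascii() and numeric.isdigit()):
        raise ValueError(f"expected two numerals from 00 to 99, got {numeric!r}")
    sector = numeric[0] + "="  # the first numeral alone identifies the sector
    zone = numeric + "="       # both numerals together identify the zone
    return sector, zone


sector, zone = split_numeric_part("52")
print(sector, zone)
```

The sketch cannot tell a phylosector from a geosector; that distinction would have to come from the Linguasphere Register itself.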
The second section of the Linguasphere language code consists of three capital letters: from AAA to ZZZ. Each zone is divided into one or more sets, with each set being represented by the first letter of the second section. Each set is divided into one or more chains (represented by the second letter) and each chain is divided into one or more nets (represented by the third letter). The division of the languages of a zone into sets, chains and nets is based on statistical analysis of linguistic similarity. Thus, a geozone is often divided into many more sets than a phylozone because the genetic relationship between languages of the latter usually ensures greater similarity between its members.
The third and final part of the code consists of up to three lowercase letters, from aaa to zzz, used to identify a language or dialect with precision. The first letter of this section represents the outer language. The various lects that make up the outer language are coded using a second, and often a third, letter, assigned according to statistical analysis of linguistic similarity.
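The three parts described above can be checked and separated mechanically. The sketch below is one possible way to do so in Python, not an official parser: the names `LinguasphereCode`, `CODE_RE` and `parse_code` are hypothetical, and the hyphen separator is assumed from the way codes are written in this article (for example 52-ABA-abb). Shorter labels such as 52-AB are deliberately not handled here.

```python
import re
from dataclasses import dataclass

# Two numerals, three capital letters, then an optional group of
# one to three lowercase letters, separated by hyphens (assumed layout).
CODE_RE = re.compile(r"([0-9]{2})-([A-Z]{3})(?:-([a-z]{1,3}))?")


@dataclass(frozen=True)
class LinguasphereCode:
    numeric: str   # sector and zone numerals, e.g. "52"
    capitals: str  # set, chain and net letters, e.g. "ABA"
    lower: str     # outer language and lect letters; "" if absent


def parse_code(text: str) -> LinguasphereCode:
    """Split a full code into its three parts, or raise ValueError."""
    match = CODE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised code layout: {text!r}")
    numeric, capitals, lower = match.groups()
    return LinguasphereCode(numeric, capitals, lower or "")


print(parse_code("52-ABA-abb"))
```

A frozen dataclass is used so that parsed codes are hashable and can serve as dictionary keys; this is a design choice of the sketch rather than anything prescribed by the scheme.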
Examples
The Linguasphere language code is often easier to understand through concrete examples:
- The code for the English language is 52-ABA, where 5= represents the Indo-European phylosector, 52= represents the Germanic phylozone, 52-A represents the Norsk+ Frysk set (which covers the entire phylozone), 52-AB represents the English+ Anglo-Creole chain, and 52-ABA is the English net. Within this net, the outer languages are:
  - 52-ABA-a: Scots+ Northumbrian.
  - 52-ABA-b: Anglo-English (the traditional language of southern England).
  - 52-ABA-c: Global English (English as spoken around the world).
- Some more specific examples of English lects are:
  - 52-ABA-abb is Geordie: belonging to 52-ABA-a Scots+ Northumbrian outer language, and 52-ABA-ab Northumbrian.
  - 52-ABA-bco is the Norfolk dialect: belonging to 52-ABA-b Anglo-English outer language, and 52-ABA-bc Southern Anglo-English.
  - 52-ABA-cof is Nigerian English: belonging to 52-ABA-c Global English outer language, and 52-ABA-co West-African English.
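Because each level of the hierarchy is a prefix of the full code, the chain of groupings in the examples above can be derived from the code alone. The following Python sketch illustrates this under stated assumptions: the function names `lineage` and `closest_common_level` are hypothetical, the input is assumed to be a well-formed code in the hyphenated layout used in this article, and the `=` suffix on sector and zone labels follows the English example.

```python
from typing import Optional


def lineage(code: str) -> list[str]:
    """List every level a code belongs to, from sector down to the code itself.

    Assumes a well-formed code such as "52-ABA-abb"; no validation is done.
    """
    parts = code.split("-")
    numeric, capitals = parts[0], parts[1]
    lower = parts[2] if len(parts) > 2 else ""

    levels = [numeric[0] + "=", numeric + "="]       # sector, zone
    for i in range(1, len(capitals) + 1):            # set, chain, net
        levels.append(f"{numeric}-{capitals[:i]}")
    for i in range(1, len(lower) + 1):               # outer language, lects
        levels.append(f"{numeric}-{capitals}-{lower[:i]}")
    return levels


def closest_common_level(a: str, b: str) -> Optional[str]:
    """Return the deepest level shared by two codes, or None if none is shared."""
    shared = None
    for level_a, level_b in zip(lineage(a), lineage(b)):
        if level_a != level_b:
            break
        shared = level_a
    return shared


print(lineage("52-ABA-abb"))
print(closest_common_level("52-ABA-abb", "52-ABA-bco"))  # Geordie and Norfolk
```

For Geordie (52-ABA-abb) and the Norfolk dialect (52-ABA-bco), the deepest shared level in this sketch is 52-ABA, the English net, which matches the description given in the examples.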