Wikipedia talk:Naming conventions (Cyrillic)

I've started this article to document the prevalent usage of Cyrillic in Wikipedia. Please correct or add as necessary. Michael Z. 2005-12-8 08:46 Z

Usage

Perhaps we should also devise a list of cases when a Cyrillic version is absolutely necessary, when it is optional, and when it is completely redundant. As I see it, the article's title (when given in bold in the intro line) must absolutely be accompanied by a language designation and the Cyrillic version. Infobox headers should probably also provide the Cyrillic spelling. The rest is open to discussion. A usage examples section would be helpful too; it would be a good reference tool to see when and why Cyrillic is currently used in English Wikipedia. Once it's done, we can start discussing which category (mandatory, optional, redundant) each case falls into. What do you think?—Ëzhiki (erinaceus amurensis) 18:49, 8 December 2005 (UTC)

Excellent. Random comments; feel free to adjust or add better examples:
I see the following as being treated somewhat differently, although there are grey areas in all these cases:
  • Proper names (e.g., Nikita Khrushchev)
  • Words with no precisely-corresponding English translation (oblast, khokhol)
  • Words which are simply translations, although in another language they may have special meaning or connotation (oseledets in the article "Khokhol")
Where:
  • Article title
    • e.g., "durak" (дурак), the title of the Durak article.
  • Lead sentence, as the primary name
    • e.g., "The Union of Soviet Socialist Republics... (Сою́з Сове́тских Социалисти́ческих Респу́блик..., Soyuz Sovetskikh Sotsialisticheskikh Respublik)...", from the Soviet Union article.
  • Lead sentence, as an alternate or secondary name or spelling
    • e.g., "...also called the Soviet Union (Сове́тский Сою́з, Sovetsky Soyuz)...", from the Soviet Union article.
  • Nomenclature section with other languages, and possibly less-important alternate names
  • Infobox title
  • Infobox body
  • In the article text
Comments:
  • Sometimes words are spelled the same in different Cyrillic alphabets
  • Sometimes words are pronounced the same in different languages
  • Sometimes both, but not always


I think if the title of the article is a Cyrillic name or word, then we should provide the Cyrillic spelling, as it greatly simplifies searching. If a person qualifies for more than one national label (see Wikipedia:Naming conventions/Ethno-cultural labels in biographies) and the spelling of his/her name differs between them, all the different spellings should be provided (if known). The same goes for words such as Knyaz: if different languages have different spellings that are relevant, all of them should be provided.

On the other hand, for a word merely mentioned in an article (say, marshrutka in the Moscow article), I think the rules should be different. I propose:

  • If there is an article on the word (as is the case with marshrutka), the wikilink is sufficient.
  • If a usual translation is enough for a native speaker to reverse-engineer the meaning, then only the translation is necessary: e.g., Kazan State University does not require Kazanskiy Gosudarstvenniy Universitet (nor Казанский Государственный Университет).
  • If a translation cannot be easily reverse-engineered, e.g. Vnevedomstvennaya okhrana -> Interdepartment Guards, then we need a transliteration and/or the original Cyrillic. Usually it is an indication that we need an article about the word. abakharev 06:18, 10 December 2005 (UTC)
And of course, any term that would puzzle the average reader should be explained or expanded in an article, if only briefly, so that the article can stand on its own. Michael Z. 2006-01-4 05:07 Z

UN/GOST for Russian

I would recommend using the UN/GOST system for transliterating Russian. It's what is used in Russia, and is spreading internationally. The situation strikes me as being rather like the case of Pinyin -- it's going to be adopted eventually, so we might as well start now so we don't have to convert a second time. It's also rather elegant. kwami 01:04, 10 December 2005 (UTC)

Can you point to a good reference? Sounds like something that belongs in "Transliteration of Russian into English". Michael Z. 2005-12-10 05:04 Z
Found one: Russian.pdf. Michael Z. 2005-12-10 05:50 Z
Kwami, can you elaborate on how GOST is gaining usage, particularly in the English-speaking world?
I'm not writing off the idea, but GOST looks better suited to academic usage, or to Europeans, than to a general-use English-language encyclopedia. Reading it is far from intuitive for English speakers (the pronunciations of ë, ž, c, č, š, ju, ja are foreign to English readers), and typing the diacritics is impossible for many editors. It would also be a radical change from the modified BGN/PCGN in current use. Michael Z. 2006-01-4 05:16 Z
Actually, the inclusion of ë, ž, c, č, š and the like is very possible with the "insert" panel at the bottom of the edit screen. Keep in mind, the average English reader has no idea how to pronounce Polish names such as Tadeusz Kościuszko, yet the article exists because Poland has an official Latin alphabet. Even Belarus's unofficial Lacinka alphabet is widely used - since GOST is the closest thing to an official Russian Latin alphabet, we should use that as well. Kazak 01:35, 28 January 2006 (UTC)
The insert panel has never worked in Safari, my web browser, so I've used my user style sheet to hide it completely. I notice that it almost works in Firefox, but displays some of the characters incorrectly.
But part of the point of transliteration is to make Slavic names more accessible to anglophones—hopefully more-or-less pronounceable. Going from the completely foreign Cyrillic to an only somewhat foreign Polish-like alphabet doesn't accomplish this goal.
Furthermore, switching from modified BGN/PCGN to GOST for the academic translations will have substantial editor resistance. Switching to GOST for article titles and proper names would probably be completely unacceptable to most Wikipedians. Ending up with a mix of the two would be more confusing than necessary.
What I'd like to do is switch to a more precise usage of BGN/PCGN for the pure transliterations, and possibly harmonize by adopting BGN/PCGN for some of the other languages, too. That way readers could learn one transliteration system for most Slavic names, and not worry about the technical differences.
I'm still not totally against other options, but how would you feel about using the scholarly "scientific notation" for all academic transliterations? It's almost identical to GOST (the only difference being x instead of h (ch), I think), but applicable directly to any Slavic language. Michael Z. 2006-01-28 02:04 Z
I support the idea of using the scholarly transliteration. It's already in place at many articles, but we'd still need an official policy. Cossack 04:04, 2 May 2007 (UTC)
The point is that GOST is the closest thing to an official Russian alphabet - the fact that many English speakers are still used to "Rachmaninoff" for "Rakhmaninov" shouldn't deter us from instating a more professional system. If Belarus's Lacinka is used, GOST should be used for Russian. Otherwise we might as well move the Kościuszko article to Tadeush Koschtsyushko. The fact remains that GOST is the Gosudarstvennyj Standart, meaning that it is what is used by the Russian government on official documents and maps. Why should we deviate? Kazak 18:01, 28 January 2006 (UTC)

Confirmation for Serbian and Bosnian; a note about Macedonian

Serbian and Bosnian should be written in their Latin forms. Macedonian can be written similarly to Serbian Latin: Macedonian "ѕ" should be transcribed as "dz", and "ќ" and "ѓ" as their Latin equivalents (k and g with '); the others are the same as in Serbian. But I am not sure what the rule should be for Macedonian. --millosh (talk (sr:)) 02:40, 10 December 2005 (UTC)

Thanks; I've added that to the page. Michael Z. 2005-12-10 05:12 Z
Found a reference for transliteration of Macedonian, including the systems UN 1977, ISO 9 1995, ALA-LC 1997, and the scholarly transliteration and IPA used in the book World's Writing Systems: Macedonian.pdf. Michael Z. 2005-12-10 05:48 Z
Please note that the mentioned document http://transliteration.eki.ee/pdf/Macedonian.pdf contains an error regarding ALA-LC 1997. It says that Cyrillic "х" becomes Latin "x", but in fact it has to become Latin "h" in Serbian and Macedonian. See http://www.loc.gov/catdir/cpso/romanization/serbian.pdf for the original ALA-LC 1997 publication, also available in printed form: ISBN 0-8444-0940-5. I have just removed this error from the front page. --217.232.159.92 (talk) 00:31, 8 January 2008 (UTC)

I think that we should also specify:

Use Latin letters with diacritics (čćšžđ) in titles of articles about people, locations and other terms which do not have an established English transliteration. Wherever it makes sense, also provide a diacritic-free redirection (disambiguation).

Duja 08:00, 12 December 2005 (UTC)

Bulgarian

There has been some discussion on Talk:Bulgaria about which Transliteration of Bulgarian to use. The preferred transliteration is the one given by the Bulgarian Ministry of the Interior: see here. Most place names have been transliterated using this system, at least for the article titles. I agree that the Cyrillic version should be given as well in the first line, and if the English version is not identical with the systematic transliteration, also the latter (Danube-Dunav, Sofia-Sofiya). Markussep 08:12, 10 December 2005 (UTC)

Thanks; I've added it to the page. By the way, there are a couple of other methods described at http://transliteration.eki.ee/pdf/Bulgarian.pdf. Michael Z. 2005-12-10 08:55 Z

Problems with Russian translit

Well, we all know them already. Transliteration of Russian into English is a lousy encyclopedic article, because it only describes one system, and a modified one at that. To make matters worse, it's also used as a policy for transliterating Russian in Wikipedia. So, I've just made the first step by creating Wikipedia:Transliteration of Russian into English. It's basically the same article, only stripped of the intro, links, cats, and interwikis, with a policy reference note added. I suggest that whatever changes we work out during the course of this all-Cyrillics workshop then be reflected on that page.

I will work on the actual encyclopedic article later to improve it a notch or two.

Hopefully this separation clears a lot of the confusion. Please comment.—Ëzhiki (erinaceus amurensis) 18:28, 14 December 2005 (UTC)

Documenting other encyclopedias' transliteration

We should consider how other encyclopedias, or any sources which deal with transliterating several languages in a unified text, deal with the problem. Anyone have a handy source? Michael Z. 2005-12-14 20:12 Z

I doubt they have written standards available to the general public. I was trying to find something like that a while ago, and was unsuccessful. I guess the only way is to look at each encyclopedia and try to guess what system(s) they use. It would be tedious and boring, and possibly inaccurate, but I do not see any other way. Unless someone knows someone working for an encyclopedia who would have access to such information, that is.—Ëzhiki (erinaceus amurensis) 20:27, 14 December 2005 (UTC)
I haven't looked at a lot of encyclopedias yet, but I've seen several mainstream or academic books which have a brief note mentioning which transliteration system they use, or include a transliteration table. My Ukraine: A Concise Encyclopædia includes about 2-1/2 pages describing their naming conventions, which I described briefly at Romanization of Ukrainian#Conventional romanization of proper names. I've a busy week coming up, but I'll visit the local library some time and see if I can suss out what the Britannica does. Michael Z. 2006-01-4 05:23 Z
The German-language WP has de:Wikipedia:Namenskonventionen/Kyrillisch, which covers a significant number of languages. Note that we don't transliterate, but transcribe there. I actually ended up here because I was looking for an English-language convention for transcribing Mongolian names. Has there been any discussion about this so far? --Latebird 12:14, 3 March 2006 (UTC)

"apostrophe" transliteration of soft/hard signs

Wikipedia:Naming_conventions_(Cyrillic)#Ukrainian displays ’ (U+2019 RIGHT SINGLE QUOTATION MARK) and ” (U+201D RIGHT DOUBLE QUOTATION MARK) as the suggested transliteration characters. I would never suggest these, because according to Unicode they belong to the category of punctuation marks (thus, something separating words), whereas the Cyrillic soft/hard signs are parts of words. So I'd rather suggest displaying either the conventional, semantically overloaded ASCII ' or a Unicode modifier letter like ʹ (U+02B9 MODIFIER LETTER PRIME; this one is suggested by Unicode as the character for the transliteration).

Surely, this comment applies not specifically to Ukrainian.

I also mentioned this in a note.

I also checked whether some Google search results would be lost if the specialized Unicode characters are used in place of the simple apostrophe; it seems they wouldn't: for example, in the case of Igor Melʹčuk the number of results increased, and in the case of Rusʹ it remained the same: google:Rusʹ, google:Rus'.

So, probably, if you fancy the specialized Unicode characters, you could easily recommend their use here. (Yes, that's the second point of this comment. The first point is not to recommend the punctuation marks, ’ U+2019 RIGHT SINGLE QUOTATION MARK and ” U+201D RIGHT DOUBLE QUOTATION MARK.)--Imz 23:11, 14 December 2005 (UTC)

I commented on the note, but I'll expand here.
This page currently discusses the semantic characters (primes and apostrophes). Unicode encoding issues should be mentioned, perhaps in an appendix, but they are secondary.
Typewriter punctuation marks ( ' " ) are acceptable, but semantically ambiguous and typographically inferior representations for primes and apostrophes. Punctuation primes and apostrophes are better ( ′ ″ ’ ” ). But because some software considers them word breaks, Unicode "modifier letters" are better still:
  • U+02B9 modifier letter prime ( ʹ )
  • U+02BA modifier letter double prime ( ʺ )
  • U+02BC modifier letter apostrophe ( ʼ )
  • U+02EE modifier letter double apostrophe ( ˮ )
I've found that Google essentially ignores all punctuation marks and diacritics, so Melʹčuk and Melcuk are equivalent (unless you enclose them in double quotes: "Melʹčuk"). I've heard that this is not true if you are located in the USA or UK, but it's the case here in Canada (presumably because we are an English/French bilingual country). Michael Z. 2005-12-15 01:40 Z
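As a rough illustration of the punctuation-versus-letter distinction discussed above, here is a minimal Python sketch using only the standard library's unicodedata and re modules. The helper name describe and the sample word are only for illustration, and the sketch says nothing about how any particular search engine or browser actually treats these characters; it only shows the Unicode general categories and how one \w-based tokenizer behaves.

```python
import re
import unicodedata

# Candidate characters for transliterating the soft and hard signs.
CANDIDATES = {
    "typewriter apostrophe": "\u0027",
    "right single quotation mark": "\u2019",
    "right double quotation mark": "\u201D",
    "modifier letter prime": "\u02B9",
    "modifier letter double prime": "\u02BA",
    "modifier letter apostrophe": "\u02BC",
    "modifier letter double apostrophe": "\u02EE",
}


def describe(ch):
    """Return (Unicode general category, True if Python's \\w matches the character)."""
    return unicodedata.category(ch), re.fullmatch(r"\w", ch) is not None


for label, ch in CANDIDATES.items():
    category, is_word_char = describe(ch)
    print(f"U+{ord(ch):04X} {label}: category {category}, word character: {is_word_char}")

# A punctuation mark splits the word for a \w-based tokenizer; a modifier letter does not.
print(re.findall(r"\w+", "Mel\u2019\u010Duk"))  # split into two tokens
print(re.findall(r"\w+", "Mel\u02B9\u010Duk"))  # kept as one token
```

The quotation marks report a punctuation category, while the four modifier letters report the letter category Lm, which is why software that breaks words on punctuation keeps the latter inside the word.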
The problem is still there. Omitting the soft sign makes the pronunciation incorrect and sometimes incomprehensible. Omitting the hard sign is OK. Using y for the soft sign is the wrong approach because it makes it hard. It should be an apostrophe, as all the standards specify. Elk Salmon 00:01, 17 April 2006 (UTC)

Proposal of another system of transliterating Russian

Here I propose an alternate system for transliteration of Russian. You can take a look at an example in an article which I created using this system: List of Moscow metro stations. I think it is the most natural for Russian of all the systems.

Do you have anything against it?--Nixer 21:33, 22 December 2005 (UTC)

Does this correspond to one of the established systems, or is it new? Does it work with other languages? Can you point to some documentation? Michael Z. 2005-12-22 21:37 Z
About other languages I don't know. This is the way people tend to transliterate their own names. Of course it could be described. BTW, look at MY version of the article (as I noticed, you looked through the article, but after another user edited). Permanent link to my version is [1]--Nixer 12:49, 23 December 2005 (UTC)
I can't really comment without a description of this system. We already have one "informal" system for "how people tend to transliterate their own names" described at Romanization of Russian#Conventional transcription of Russian names; is your system the same as that one or different? Has this been documented somewhere, or does it constitute original research? Michael Z. 2005-12-23 15:46 Z

I fully and honestly support your convention for the following reasons: if the letter Ы is already reserved for y, then Й should be transliterated as I. When reading, English users are bound to know when to use I as И and when as Й. Besides, in Russian the two letters are derived from one another. It is also completely pointless to use ye for Е after a vowel. Likewise, yo for Ё should be given as E, considering that in Russian it is not grammatically incorrect to write Ё as E. --Kuban kazak 14:03, 6 February 2006 (UTC)

...then Й should be transliterated... Should be? Says who? We already have a <SarcasticQuote>conventional</SarcasticQuote> translit system, which is basically BGN/PCGN with amendments. And those amendments are a source of all evil. You are essentially proposing a different system with a different set of amendments. How is that going to change anything? None of the major transliteration systems uses the convention you like so eagerly. Writing "ye" for "е" after a vowel may not make any sense to you, but it did to both the United States Board on Geographic Names and to the Permanent Committee on Geographical Names for British Official Use. It may not be the best solution for all problems, but it exists and dismissing it as "pointless" without any arguments is not going to make it disappear.
With all that in mind, please do not invent any more systems until a consensus is reached here. Whether we like it or not, we are stuck with Wikipedia:Transliteration of Russian into English as a policy. If it makes you feel any better, I strongly dislike it in its current form and am dedicated to work on this Cyrillics naming project, yet I expect editors to comply with the existing policies as long as they are in effect.—Ëzhiki (ërinacëus amurënsis) 21:18, 7 February 2006 (UTC)
Is there a description of Nixer's system anywhere? Does it correspond to any standard, or is it original research? Michael Z. 2006-02-06 17:22 Z
The closest one (but not exact) is Allworth's system. I am not aware of it being used much anywhere.—Ëzhiki (ërinacëus amurënsis) 21:18, 7 February 2006 (UTC)
Convention or not, it is a fact that it is used less than the form without it, particularly for EE as opposed to EYE (Novogireevo, for one, gets 13500 hits whilst Novogireyevo just barely passes a tenth of that). Use your logic: how many incorrect ways can you read it? Новогирей-иво; Новогиреыево; Новогирайво... --Kuban kazak 23:53, 7 February 2006 (UTC)
Those appear to be examples of ALA-LC and BGN/PCGN transcriptions. How does Nixer's system relate to this, and how can we discuss it, much less consider adopting it, if it only exists in his head? Michael Z. 2006-02-08 00:07 Z
This also appears to be a case where Google isn't doing a good job, at least not good enough for our purposes. To a Russian, it would be more natural to write the "ее" combination as "ee", not "eye", and most of those Google hits were for sites created by Russians (the first set of Google "Novogireevo" results contains only one (!) link to a non-Russian site). Now, google for "Novogireyevo", and the result set mostly comprises links to Western sites, written by non-Russians. As we are an English Wikipedia, we should gear toward the English-speaking audience, which means that a spelling that looks unnatural to a Russian may in fact be the best solution for English Wikipedia. And we already covered that no matter what transliteration system is used, there will always be a way (or ways) to incorrectly read words transliterated with the help of that system.—Ëzhiki (ërinacëus amurënsis) 16:45, 8 February 2006 (UTC)
So as I understand your point, we should use American spelling and words over British ones, because they appear to be more common on the internet. How ridiculous is that?
BTW Novogireevo still has the advantage on non-Russian websites [2] against [3]. Finally, this is an international encyclopedia, which amongst other things has to present facts to readers correctly. --Kuban kazak 17:45, 8 February 2006 (UTC)
British vs. American spelling issue is quite irrelevant here. It has nothing to do with transliteration, it is a completely different phenomenon, and it is very well covered by existing Wikipedia policies.
As for the search links you provided, I would not consider a sample that's less than 100 (22:4 in this case) to have any meaning. What's more important, if you look closer at the first result set (the one with "ee"), you will see that even though none of the sites is hosted in Russia, a good chunk of them is in Russian and/or written by Russians. Filtering down this already meaninglessly small sample to include only truly non-Russian hits will not make it any more meaningful, but will certainly reduce the margin of difference.
The internationalism of the English Wikipedia goes, of course, undisputed. The English edition is indeed unique in that it caters to the whole world, because English is a de facto international language. However, it is also undisputed that the needs of the English-speaking readers take a higher priority, and that the English WP articles will always have slight American/Commonwealth bias (which is possible even within NPOV boundaries) or will simply be written in the Western style. If you seek true internationalism, check out the Esperanto edition.
Furthermore, I am completely lost as to what you are trying to say with "an... encyclopedia... has to present facts to readers correctly." Selecting a transliteration variation from the list of widely accepted transliteration systems is in fact presenting facts to the readers correctly. I don't see how inventing a brand new system which looks and sounds good to you personally fits into what you are trying to say. Your statement is devoid of logic and you seem to contradict yourself. In case you missed it, we are not trying to invent a new translit system here. What we are trying to do is to figure out how to best utilize already existing translit systems (preferably those applicable to the English language and convenient to the English-speaking readers) to better serve Wikipedia's needs. While we are in the process of figuring that out, everyone is expected to abide by existing rules and regulations and not push their own petty little agendas. We can very well add your translit proposal to all other proposals and put it out for a vote when it's time, and you can bet anything that I'll personally be enforcing it if it is accepted. But I am absolutely not going to use it at this time while the other policy is still in effect, and I expect you to do the same. Instead of wasting time pointlessly discussing "the new system which is the best and everyone else shut up because it looks good and I like it and there is another guy that likes it too" here, you could at least put your proposal together, formalize it into a section for easy review, indicate sources, describe current usage (as applied to the English language), and post it here for everyone to comment on. If you are unable to do that, I would suggest that you retire from this discussion and apply your efforts elsewhere, preferably without breaking any rules.—Ëzhiki (ërinacëus amurënsis) 18:55, 8 February 2006 (UTC)

A single obscure place name is hardly a complete gauge for choosing a transliteration system. In my opinion, -eyevo is better for anglophones because it definitely represents three syllables, while to many readers -eevo may look like it rhymes with Tivo. How it reads for Russophones is probably a tertiary consideration, after first, the average anglophone, and second, the anglophone slightly familiar with Cyrillic transliteration or with a Slavic language.

The differences between ALA-LC and BGN/PCGN are not dramatic, with the former reflecting the Cyrillic spelling a bit better and the latter being slightly more intuitive for pronunciation by anglophones and eschewing diacritics. Both are widely used in technical applications, as well as in their simplified forms in more general publications. I don't think a switch from the latter to the former is warranted.

But anyway, this discussion of Nixer's alleged transliteration system is totally moot in the complete absence of any documentation. Michael Z. 2006-02-08 20:04 Z
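For reference, the "ye for е after a vowel" convention argued over above can be sketched in a few lines of Python. This is only a minimal sketch: the letter table is deliberately partial (just the letters of the example name), the function name romanize and the contextual_ye switch are hypothetical, and the rule encoded is the BGN/PCGN one (ye word-initially and after vowels, й, ъ and ь; plain e elsewhere). Switching the rule off gives the "ee" spelling preferred by the other side of the argument.

```python
VOWELS = set("аеёиоуыэюя")
# Letters after which BGN/PCGN writes "ye" rather than "e".
YE_TRIGGERS = VOWELS | set("йъь")

# Deliberately partial table: only the letters needed for the example below.
LETTERS = {"н": "n", "о": "o", "в": "v", "г": "g", "и": "i", "р": "r"}


def romanize(word, contextual_ye=True):
    """Romanize a word covered by the partial table, optionally applying the ye rule."""
    out = []
    prev = ""
    for ch in word.lower():
        if ch == "е":
            # Word-initial, or following a vowel, й, ъ or ь: "ye"; otherwise "e".
            if contextual_ye and (prev == "" or prev in YE_TRIGGERS):
                out.append("ye")
            else:
                out.append("e")
        else:
            out.append(LETTERS[ch])
        prev = ch
    return "".join(out).capitalize()


print(romanize("Новогиреево"))                       # Novogireyevo
print(romanize("Новогиреево", contextual_ye=False))  # Novogireevo
```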

Belarusian/Russian transliteration

"Where that spelling is established in English, Belarusian is transliterated using the Russian method (below)"

How is this done?

  1. Translating to Russian first, then transliterating?
  2. Transliterating from Belarusian, using the Russian method; if so, how to transliterate letters і and ў?

Is the following a better wording? Michael Z. 2006-01-2 21:54 Z

"If a Russian name is better-established in English than a Belarusian name, use the Russian name and transliterate as Russian (below)."
No, I think it means apply the Russian transliteration guidelines to Belarusian in preference to Lacinka.
  • ў can be transliterated via w
  • і via i

--Kuban kazak 01:06, 21 January 2006 (UTC)

Would г=h for Belarusian? Michael Z. 2006-01-28 02:12 Z
I don't think that would be necessary. I'm no expert on Belarusian, but in Ukrainian there can be some confusion between the words жодні/згодні if "h" is applied uniformly. I think "g" would be safest. Kazak 22:21, 30 January 2006 (UTC)
How then would we transliterate old works which include the Belarusian letter ge (ґ)? I don't like the idea of improvising a transliteration method, because →it's original research, →we're not experts, →there may be problems we don't anticipate, and →it wouldn't be compatible with any other transliteration. Better to choose an established standard, no?
For Ukrainian, in technically-correct ALA-LC, a tie-bar is added to join the digraphs: z͡hodni/zhodni. In BGN/PCGN, a centre dot is added to separate the non-digraphs: zhoda/z·hoda (just found this out in a note recently added at Romanizing Ukrainian#Notes-table). These are often ignored, in practice. Michael Z. 2006-02-07 03:34 Z
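The two disambiguation devices just described can be sketched as follows. This is a minimal Python sketch, not a full romanization table: BASE covers only the letters of the two example words, and the function names are hypothetical. It shows that the plain spellings collide, that the ALA-LC tie bar marks the true digraph, and that the BGN/PCGN centre dot marks the non-digraph.

```python
TIE = "\u0361"     # COMBINING DOUBLE INVERTED BREVE, placed between the two letters it joins
MIDDOT = "\u00B7"  # MIDDLE DOT

# Partial table: only the letters of the two example words.
BASE = {"ж": "zh", "з": "z", "г": "h", "о": "o", "д": "d", "н": "n", "і": "i"}


def plain(word):
    """Simplified romanization with neither tie bars nor centre dots."""
    return "".join(BASE[ch] for ch in word)


def ala_lc_strict(word):
    """Mark the digraph for ж with a tie bar, so a bare 'zh' can only mean з + г."""
    return "".join("z" + TIE + "h" if ch == "ж" else BASE[ch] for ch in word)


def bgn_pcgn_strict(word):
    """Separate з + г with a centre dot, so a bare 'zh' can only mean ж."""
    out = []
    prev = ""
    for ch in word:
        if ch == "г" and prev == "з":
            out.append(MIDDOT)
        out.append(BASE[ch])
        prev = ch
    return "".join(out)


for word in ("жодні", "згодні"):
    print(word, plain(word), ala_lc_strict(word), bgn_pcgn_strict(word))
```

In the simplified form both words come out as zhodni, which is the confusion mentioned earlier in this thread.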
Alternatively I propose this: the official transcription system for Belarusian names, very close to Łacinka, was introduced by the Belarus State Land Resources, Geodesy and Cartography Committee on November 23, 2000. The main difference from Łacinka is that the softening rule (Nn – Ńń, Cc – Ćć) is also applied to Ll: Ll – Ĺĺ. The letter Ł ł is absent. Official, hence correct, hence a suitable replacement. Although I personally would title articles with Russian names, for two reasons: Russian is an official language of Belarus, and it currently has state preference over Belarusian. Also, for geographical names, googling any Belarusian town or even a village in Russian would give more results than the different transliteration methods put together.--Kuban kazak 00:00, 8 February 2006 (UTC)

Putting aside the question of Russian/Belarusian geographic names for article titles for now, since some words will have to be translated from the language itself anyway....

There are also ISO, ALA-LC, and BGN/PCGN systems (Belarusian.pdf). BGN/PCGN may be worth considering, to harmonize with what is currently in use for Russian and Ukrainian (although everything is up for discussion here). Michael Z. 2006-02-08 00:15 Z

Macedonian

Macedonian has no official Latin script for transliteration. But during Yugoslav times it used a script which I have only known as being called yu-slova ... this is the endorsed version and is identical to the current Latin script of Serbia and Croatia. Ќ and Ѓ are transliterated as Ć and Đ, respectively. Being a Macedonian myself, and using the language every day ... using these letters appeals to me as they look cleaner, as opposed to ḱ and ǵ. Not only that, but as I mentioned before, this is practically the only script used when transliterating Macedonian into Latin script for formal purposes. --Daniel Tanevski talk 14:44, 1 February 2006 (UTC)

Dunno... I haven't seen many transliterated Macedonian texts, but ć and đ look (this is just a personal impression) wrong to me in this context. Although phonologically Ќ and Ѓ are, in most cases, cognates of the Serbo-Croatian variants, they are phonetically so different that a transliteration such as Đorče Petrov seems overly "serbianized". I'm much more comfortable with "Gorče" or "Gjorče" or "G'orče". Just my 2c... Duja 14:56, 1 February 2006 (UTC)
Btw, are there uppercase Latin Unicode letters for ḱ and ǵ? I can't find any. Duja 15:01, 1 February 2006 (UTC)

Yes, Đorče Petrov does look "Serbianized" ... --Daniel Tanevski talk 15:16, 1 February 2006 (UTC)
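On the question above about uppercase forms: a quick check against the Unicode character database suggests that both letters do have precomposed capitals. The snippet is a minimal sketch using Python's standard unicodedata module; the helper name uppercase_info is hypothetical, and whether a given font actually contains these glyphs is a separate matter that the code cannot tell you.

```python
import unicodedata


def uppercase_info(ch):
    """Return (code point, character name) for the uppercase form of a single character."""
    upper = ch.upper()
    return f"U+{ord(upper):04X}", unicodedata.name(upper)


# Lowercase k-acute (U+1E31) and g-acute (U+01F5).
for lower in ("\u1E31", "\u01F5"):
    print(f"U+{ord(lower):04X} {unicodedata.name(lower)} ->", *uppercase_info(lower))
```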

ISO 9

...can this be used. why (not)? Tobias Conradi (Talk) 00:49, 7 February 2006 (UTC)

...well, looks not as if it is directly usable.

The major advantage ISO 9 has over other competing systems is its univocal system of one character for one character equivalents (by the use of diacritics), which faithfully represents the original and allows for reverse transliteration, even if the language is unknown.

Can someone insert some major disadvantages in the article? Tobias Conradi (Talk) 00:54, 7 February 2006 (UTC)

Disadvantages which affect ISO 9's adoption in Wikipedia:
  • Cannot be typed straight from a normal English keyboard.
  • Some characters have very little computer support (e.g. Ukrainian ґ = g̀—italics form fails on my computer: )
  • Unfamiliar diacritic characters don't have an intuitive pronunciation for most anglophones (e.g. č, š, ŝ, â for ч, ш, щ, я)
  • Ignores phonetic values in different languages (e.g. Russian ge and Ukrainian he are conflated: г = g)
The main advantage over other systems is support for all of the Asian languages, which may become an issue in the future, but I suppose there may be more intuitive systems for many of these languages.
I would consider ISO 9 or the Scholarly system it is based on in linguistics articles or for use in transliterations in the leading nomenclature section of an article, if we can avoid any technical problems with character display. But I wouldn't want to see it used for article titles or proper names in the text. Michael Z. 2006-02-07 01:10 Z
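The "one character for one character" property quoted above is what makes reverse transliteration mechanical. Here is a minimal Python sketch of that idea, assuming a deliberately partial table: the four ISO 9 pairs cited in the list above (ч, ш, щ, я) plus three plain letters to make whole words. The names ISO9, REVERSE, to_latin and to_cyrillic are hypothetical.

```python
import unicodedata

# Partial table: the four ISO 9 pairs cited above, plus a few plain letters.
ISO9 = {
    "ч": "\u010D",  # c with caron
    "ш": "\u0161",  # s with caron
    "щ": "\u015D",  # s with circumflex
    "я": "\u00E2",  # a with circumflex
    "а": "a",
    "к": "k",
    "у": "u",
}

# Because every Cyrillic letter maps to a distinct single Latin character,
# inverting the table loses nothing.
REVERSE = {latin: cyrillic for cyrillic, latin in ISO9.items()}


def to_latin(word):
    return "".join(ISO9[ch] for ch in word)


def to_cyrillic(word):
    # Normalize first so that "s" + combining circumflex is treated like precomposed U+015D.
    return "".join(REVERSE[ch] for ch in unicodedata.normalize("NFC", word))


for word in ("щука", "чаша"):
    latin = to_latin(word)
    print(word, "->", latin, "->", to_cyrillic(latin))
```

The round trip works without knowing anything about the language, which is the advantage claimed; the price is the unfamiliar diacritics listed among the disadvantages.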

Technical difficulties

I just had a look at this discussion page in a few web browsers and found some problems, so I did some more quick testing.

On this discussion page:

  • MSIE: k-acute and g-acute used for Macedonian, and the modifier-letter accents draw as boxes, the z-tie-bar used in ALA-LC transliteration is rendered as an uppercase A-grave
  • Firefox: the centre dot used in BGN/PCGN transliteration for Ukrainian is rendered as a figure 6, the g-grave used in ISO 9 for Belarusian/Ukrainian ґ is rendered with the diacritic too far to the right
  • Safari: g-grave is followed by the accent, when italicized followed by a box
  • Safari with Lucida Grande font applied in style sheet: only italicized g-grave fails

In the table in ISO 9:

  • MSIE: 28 Cyrillic letter pairs, 20 Latin letter pairs, two accents, and the palochka fail to draw correctly
  • Firefox: 12-1/2 accented Latin letter pairs draw as letter followed by a box
  • Safari: 12-1/2 accented Latin letter pairs draw as letter followed by the accent
  • Safari with Lucida Grande font applied in style sheet: no problems

In the table in Romanization of Ukrainian:

  • MSIE: ALA-LC tie bars fail to render, two as different accented letters, three as boxes
  • Firefox: g-grave has the accent after the letter, the double-dagger reference link renders as a letter a
  • Safari: ALA-LC tie bars render too wide, but readable, g-grave is followed by accent
  • Safari with Lucida Grande font applied in style sheet: no problems

[Setup: MSIE 6 on vanilla WinXP; Safari 2.0.3 with a Wikipedia user style sheet specifying Lucida Grande font; Firefox 1.5.0.1 with default style sheets; latter two on Mac OS X with lots of extra international fonts.] Michael Z. 2006-02-07 04:14 Z
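
One possible factor behind the g-grave results, offered as an assumption rather than a diagnosis of any particular browser: Unicode has precomposed code points for k-acute and g-acute, but none for g-grave. The latter can only be written as a base letter plus a combining accent, and positioning that accent is then left to the browser and font. A minimal Python sketch using the standard `unicodedata` module shows the difference (the helper name `nfc_length` is made up for this illustration):

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

def nfc_length(base: str, combining: str) -> int:
    """Return how many code points remain after NFC composition."""
    return len(unicodedata.normalize("NFC", base + combining))

for label, base, mark in [("k-acute", "k", ACUTE),
                          ("g-acute", "g", ACUTE),
                          ("g-grave", "g", GRAVE)]:
    n = nfc_length(base, mark)
    kind = "precomposed letter" if n == 1 else "base letter + combining mark"
    print(f"{label}: {n} code point(s), {kind}")
```

This says nothing about the tie-bar or centre-dot failures, which would need separate testing.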

Example convention

Please see the Slavic and East European Journal's Style Sheet for Authors, sections about transliteration, translation and names. Notably:

Whenever possible, please use transliteration instead of Cyrillic, since this broadens the potential readership of the journal and is less expensive to set. However, for poetry, long quotations, and especially when a point can be better made by reference to the Cyrillic, Cyrillic may certainly be used.

Michael Z. 2006-02-07 05:15 Z

Scientific transliteration

New article: Scientific transliteration. Michael Z. 2006-02-07 06:01 Z

Translit block

Just a crazy idea of mine. Thought I'd document it here. How about introducing translit blocks to the articles? I see it as an inline table/template listing transliterations of the article title under all major translit systems used for that particular language. For example, an article on Russian Raduzhny would list the translit variants of this name under BGN/PCGN, ISO 9, ALA-LC, etc. For people, this information can probably be incorporated in the persondata section. The advantages are that all possible legitimate variations would be covered, and the intro line can be unloaded. The disadvantages are that I can't think of the best way to incorporate this information without disrupting the article flow (as a metadata block?), and that it may add little value.

I am not sure of the best implementation (or necessity for that matter), but at least this is something to think about. Please comment.—Ëzhiki (ërinacëus amurënsis) 21:28, 7 February 2006 (UTC)

Sounds like it could just go into the alternate names field of persondata (alternate spellings, actually).
The problem is inherent in persondata; because it is hidden using CSS, that means that handicapped screen reader users, some of whom already have the hardest time wading through a web page, may be forced to plough through all of it. See the list of alternate transliterations at the bottom of this page—yes, if one wants to search for Chinese sources about Khrushchev, one needs to search for all of Hei-lu-hsue-fu, He Lu Xiao Fu, and Ho-lu-hsiao-fu. Michael Z. 2006-02-07 22:30 Z
Problem is that persondata is only available in articles about people. Any ideas how to add same information to articles on geo entities? We can, of course, invent something like "locationdata", but that's another metablock with the same problems as persondata. Anyway, this idea still needs lots of thought.
P.S. Got a kick out of that list. Makes me glad I am not involved in the Chinese edition of Wikipedia :)
Ëzhiki (ërinacëus amurënsis) 23:13, 7 February 2006 (UTC)
Yeah, but that list is what serious English-language scholars may have to deal with, if they want to do a thorough literature search or are dealing with international issues. Yeesh!
I posted a question about this at persondata#. I think it's important to ask the right questions now so things don't end up screwed up; geodata, etc. will probably show up soon enough. Michael Z. 2006-02-07 23:31 Z

By the way, have you seen Wikipedia:WikiProject Geographical coordinates, which coordinates the lat/long templates? Perhaps it would be useful to extend that to cover geographic names and other related metadata. Michael Z. 2006-02-08 00:26 Z

Mediation notice

With great regret I am informing all interested parties that a mediation request against Kuban kazak has been filed by me. The request is in regards to Kuban kazak's transliteration practices in the articles pertaining to the Moscow Metro. Please review and participate as the mediation guidelines permit.—Ëzhiki (ërinacëus amurënsis) 15:49, 14 February 2006 (UTC)

Transcription for Mongolian

I would like to see a standard transcription table for Mongolian. I think that for the base characters, the table in Wikipedia:Romanization of Russian should work fine, so we just need to decide what to do with ө and ү. Using ö and ü seems to be the cleanest solution. Any opinions? --Latebird 20:53, 10 March 2006 (UTC)

We should pick a system that's already in use, rather than improvising one based on Russian transliteration. This PDF file: Mongolian.pdf (page 5) outlines four standardized systems for transliterating Mongolian Cyrillic. The BGN/PCGN system is probably quite compatible with our current Russian and Ukrainian transliteration conventions (and does use those two characters, although it is different from Russian in other ways). Michael Z. 2006-03-10 22:40 Z
I think we should be careful not to mix the concepts of transliteration and transcription (unfortunately most relevant pages in the WP namespace here make a mess out of this as well). We want the page names to reflect the correct pronunciation as closely as possible for an English-language reader, so we need a transcription with as few diacritics as possible. The PDF file purports to list transliterations, which may be technically more precise, but in many cases only linguists will be able to infer the actual pronunciation. With BGN/PCGN I don't understand the "yö" (pronunciation is "ye"), the "dz" adds nothing useful to "z", and most readers will not have seen either variation of a diacritic "i" ever before. Actually, even though I suggested using "ü" and "ö" myself, I wouldn't complain if we could get rid of those as well. --Latebird 23:50, 10 March 2006 (UTC)
Actually, we use transliteration systems for names in most contexts, and we add IPA transcription to illustrate precise pronunciation. It's helpful if a transliteration system is intuitive for an anglophone to pronounce, but many languages have sounds not represented in English. I don't know much about Mongolian, but if letters like ö and ü represent vowel sounds not found in English, I don't see a problem. If they're unfamiliar to a reader, they can be read just like o and u, but they conventionally represent sounds as in German, which I would guess helps an educated reader with the Mongolian names. Michael Z. 2006-03-11 00:14 Z

I should be more specific about use of transliteration and transcription. My comments above are a bit oversimplified, and the situation could use some improvement. For an exhaustive analysis, see #Usage at the top of this page. For names of people and places in the text, a more relaxed "conventional transliteration/transcription" is used, essentially to create an "English spelling" of a foreign proper name. However, in some contexts a more disciplined transliteration is required to illustrate specific words, and should almost always be used instead of Cyrillic (IMO)—in an article's first line or in the text, e.g. throughout "obsolete Russian units of measurement". Michael Z. 2006-03-11 00:52 Z

The German ö and ü are approximations to the actual phonemes. The Mongolian vowel system is not fully compatible with that of any European language. At least with those two examples, most English-language readers may get some reasonable idea about what is meant. But if I had encountered ï and the i with a bow from the PDF in any other context, I would be stumped about their meaning (not to mention that they look horrible in almost any font available on my computer). Interestingly, the transcription in Mongolian language, although falsely titled transliteration, comes very close to what I had in mind. We may be able to simplify it even more, to get rules that are easy for everyone to understand and apply. To be honest, I don't see any reason to use transliterations (one-to-one mappings) in Wikipedia. They are just a way to represent Cyrillic without writing "real" Cyrillic, after all. In a context where all Unicode characters are easily available to present the original form, that seems kind of pointless. --Latebird 12:02, 11 March 2006 (UTC)

Now that I actually understand what everybody is talking about, it seems that a simplified BGN/PCGN system makes sense for Mongolian as well. The simplifications I would apply are as follows:

  • Е -> ye (most common pronunciation)
  • З -> z (the d in dz doesn't add any information)
  • Х -> kh (not technically a simplification, but h just doesn't seem to be the right phoneme)
  • Ъ -> (-) (not pronounced, actually missing in the table in Mongolian language)
  • Ы -> y (diacritical i is confusing)
  • Ь -> y (diacritical i is confusing)
  • Ю -> yu (most common pronunciation, BGN/PCGN rules too complicated)

This happens to be very similar to the Russian romanization, which is no surprise, because the Mongolian Cyrillic alphabet was designed to be phonetically as close as possible to the Russian one. Anything I missed, or any other comments? --Latebird 13:47, 20 March 2006 (UTC)
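
The seven simplifications listed above can be read as overrides layered on top of a base table. The Python sketch below shows only that layering. `BASE_SAMPLE` is a deliberately tiny placeholder and not the real BGN/PCGN table for Mongolian; its "dz" and "h" entries are simply the values the list implies BGN/PCGN uses, and all names here are hypothetical.

```python
# Overrides taken from the list above (lower-case forms only).
SIMPLIFICATIONS = {
    "е": "ye",
    "з": "z",
    "х": "kh",
    "ъ": "",    # not pronounced, so it is dropped
    "ы": "y",
    "ь": "y",
    "ю": "yu",
}

# Placeholder base table: NOT the real BGN/PCGN table, just enough
# letters to run the demonstration below.
BASE_SAMPLE = {"а": "a", "м": "m", "н": "n", "з": "dz", "х": "h"}

def romanize(text: str, table: dict) -> str:
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text.lower())

# Later entries win, so the simplifications replace the base values.
simplified = {**BASE_SAMPLE, **SIMPLIFICATIONS}

for word in ("хан", "зам"):
    print(romanize(word, BASE_SAMPLE), "->", romanize(word, simplified))
```

Keeping the overrides in their own dictionary makes it easy to see, and to argue about, exactly where the simplified system departs from the published one.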

Qui tacet, licet? Or is this just such a boring question, that nobody really cares? --Latebird 10:56, 30 March 2006 (UTC)
It's not boring, it's too specialized. One has to at least have some knowledge of Mongolian to answer your question, and Mongolian Wikipedians who know English and are interested in transliteration are extremely hard to find :)—Ëzhiki (ërinacëus amurënsis) 13:33, 30 March 2006 (UTC)

It’s not really an area of concern for me. If you write your bibliography, a transliteration or rather precise transcription is necessary. I usually use a transcription very similar to KNAB, only that ь is represented by ’. For internet use, Mongolians use a kind of transcription as follows: а-a б-b в-v г-g д-d е-ye ё-yo ж-j з-z и-i й-i л-l м-m н-n о-o ө-o п-p р-r с-s т-t у-u ү-u ф-f х-h ц-ts ч-ch ш-sh щ-? ъ-? (both never used) ы-ii ь-i or nothing э-e ю-yu я-ya. There is no instance of a Mongolian word with щ, and the only instance of ъ, in the orthographical form of the voluntative suffix, is (1) written ь by most Mongols using Cyrillic and (2) pronounced as a long i and thus written ii (instead of Cyrillic ъя) in most Latin Mongolian. The same goes, by the way, for ы, which might be transliterated y but should never be transcribed that way. ii is fine. з in Mongolian (in contrast to Russian) is an affricate and not a fricative, so an English person would be closer to the Mongolian equivalent if s/he tried to pronounce dz. x as kh is fine, and it is also used in Mongolian passports. ь only indicates that the preceding consonant is palatalized, but so, in most cases, does short i in a non-initial syllable. So i might be best. As for the vowel system: any English person without practice will fail to correctly pronounce it. Non-initial short vowels are about 0.4 times the length of initial short vowels or not pronounced at all. Most of them become somewhat centralized. ө-ö and ү-ü don’t reflect the actual German pronunciation (though they would have done so 800 years ago), but for the sake of differentiability we should write them in this way. The remaining vowels don’t seem to be in dispute. (o-o and y-u are not very helpful either: o is [ɔ] and y is somewhere in between [o] and [ʊ].) G Purevdorj 14:19, 13 June 2006 (UTC)
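
To make the differentiability point concrete, here is a small Python sketch of the internet-style table quoted above. The function name and its two options are invented for illustration. The letters щ and ъ, which the table marks with "?", are left unmapped and simply pass through, as does anything else not in the table.

```python
# The internet-style table as quoted in the comment above.
INTERNET_TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "ye",
    "ё": "yo", "ж": "j", "з": "z", "и": "i", "й": "i", "л": "l",
    "м": "m", "н": "n", "о": "o", "ө": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ү": "u", "ф": "f", "х": "h",
    "ц": "ts", "ч": "ch", "ш": "sh", "ы": "ii", "э": "e",
    "ю": "yu", "я": "ya",
}

def transcribe(text: str, soft_sign: str = "i", distinct_vowels: bool = False) -> str:
    """Transcribe lower-cased Mongolian Cyrillic with the table above.

    soft_sign: what to write for ь ("i" or "" per the quoted table).
    distinct_vowels: if True, write ө and ү as ö and ü instead of o and u.
    """
    table = {**INTERNET_TABLE, "ь": soft_sign}
    if distinct_vowels:
        table.update({"ө": "ö", "ү": "ü"})
    return "".join(table.get(ch, ch) for ch in text.lower())

print(transcribe("Улаанбаатар"))
# Two different Cyrillic spellings collapse to the same Latin string...
print(transcribe("ор"), transcribe("өр"))
# ...unless the vowels are kept distinct.
print(transcribe("ор", distinct_vowels=True), transcribe("өр", distinct_vowels=True))
```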


I think Purevdorj's а-a б-b в-v г-g д-d е-ye ё-yo ж-j з-z и-i й-i л-l м-m н-n о-o ө-o п-p р-r с-s т-t у-u ү-u ф-f х-h ц-ts ч-ch ш-sh щ-? ъ-? (both never used) ы-ii ь-i or nothing э-e ю-yu я-ya is fine because it is short and still contains all the relevant information. Plus it looks familiar. Personally, I am for omitting the ь because it doesn't seem to represent anything on its own, esp. no vowel. Re x -> h vs. x -> kh, I am a bit ambivalent. Yaan,141.30.65.26 13:30, 5 February 2007 (UTC) Edit: I'm for ө->ö and ү->ü, actually. Yaan

Proposal for Russian and Ukrainian

I'd like to put forward a draft proposal. No votes yet, just discussion to see how people feel about this and suggest wording, improvements, or changes to the proposal. I see this as a refinement of the current convention, and adopting it wouldn't rule out a more radical change in the future, such as adopting GOST transliteration for Russian.

Goals:

  • No radical changes
  • Keep the editorial rules simple, or make them simpler
  • Emulate academic and general literature
  • Use existing standards
  • Use a more precise transliteration where it provides advantages
  • Replace Cyrillic text with a transliteration where possible, for accessibility
  • Harmonize Russian and Ukrainian transliteration
  • Accommodate anglophone readers

Proposal

This would apply to both Russian and Ukrainian in Wikipedia. All the usual editorial guidelines would still be in effect (use the most common name for article titles, prefer local names, etc.), including the recommendations currently at the top of the page, and I won't restate them here.

  1. For proper names of people and places in the text, and for article titles, use a "conventional name" based on simplified BGN/PCGN transliteration [minor change for Ukrainian only].
    • Except, for modern Ukrainian place names, use the simplified National system for geographic names [no change].
  2. In the leading line, use a precise BGN/PCGN transliteration after, or instead of, the Cyrillic text [more detailed transliteration for Russian, adopting a standardized system for Ukrainian]
    • Including Ukrainian place names [change, from the precise National system for geographic names].
  3. For untranslated Russian or Ukrainian words in the text, use a precise BGN/PCGN transliteration [more detailed transliteration for Russian, adopting a standardized system for Ukrainian]

Michael Z. 2006-03-11 02:12 Z

Addendum

Below please find another attempt to approach the problem from a slightly different angle. I did a bit of thinking about the nature of the problems and disagreements regarding the transliteration practices, and my conclusion is that we have (mostly) failed to accept that the problem is not one-sided. While it is possible to use one transliteration system for all purposes, as we pretty much do now, without at least some minor tweaking the debates and disagreements are not going away.

So, it is proposed to adopt slightly different practices depending on

  1. Type of the term being transliterated
  2. Its location in the text.

The major types of terms include:

  • modern geographic names (cities, villages, rivers, mountains, stations, street names...);
  • historic geographic names (one word—Chernigov/Chernihiv :));
  • human names (given names, nicknames...);
  • other names (organization names...);
  • special terminology with no exact English equivalents (raion, oblast, uyezd...);
  • all other cases (can't think of any; please fill in).

Locations in the text include:

  • article title;
  • article intro line;
  • article body (when there is no link).

As such, the following outline of rules is proposed. It takes into consideration the points mentioned in the Goals section above. This, too, is not to be voted upon; it is only a general idea.

  • If there is a term commonly used in English, it is to be used as a title.
    • Positives: obvious
    • Negatives: there is no definition of what's considered "common English use". For our purposes, a term that outnumbers all other terms by at least 100:1 (500:1? 1000:1? other ratio? any set ratio is a bad idea?) in a Google (Google Books?) search can be considered "common English"
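
For discussion purposes, the ratio test could be sketched as follows in Python. The function name, the default ratio, and the hit counts in the usage example are all invented for illustration; no real search results are implied.

```python
def dominant_term(hit_counts: dict, ratio: float = 100.0):
    """Return the term whose count is at least `ratio` times every other
    term's count, or None if no term dominates that clearly."""
    if not hit_counts:
        return None
    ranked = sorted(hit_counts.items(), key=lambda kv: kv[1], reverse=True)
    top_term, top_count = ranked[0]
    if top_count == 0:
        return None  # nothing was found at all
    if len(ranked) == 1:
        return top_term
    runner_up_count = ranked[1][1]
    # Comparing against the runner-up is enough: every other count is smaller.
    return top_term if top_count >= ratio * runner_up_count else None

# Made-up counts for three hypothetical spellings of one name.
counts = {"Spelling A": 250_000, "Spelling B": 1_200, "Spelling C": 40}
print(dominant_term(counts))             # 250000 >= 100 * 1200, so "Spelling A"
print(dominant_term(counts, ratio=500))  # 250000 < 500 * 1200, so None
```

The second call shows how sensitive the outcome is to the chosen ratio, which is exactly the concern raised above.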

When no common English term is available, use the following guidelines:

  • Modern geonames
    • article title: simplified BGN/PCGN (meaning no change from current policy for Russian; a switch from the Ukrainian National System for Ukrainian)
    • article intro: strict BGN/PCGN and (or? and/or?) GOST (for Russian geonames)/Ukrainian National System (for Ukrainian geonames)
    • article body: not applicable; geonames should always be linked.
  • Historic geonames
    • should probably be a matter of a separate policy or reviewed on case-by-case basis.
  • Human names
    • article title: simplified BGN/PCGN?
      • We will need a well-defined system of exceptions here (going beyond "common use"), or should allow no exceptions whatsoever (again, beyond "common use"). It's either we allow and substantiate something like Yevgeniy Plushenko, or place him under Yevgeny Plyushchenko no matter what.
    • article intro: scientific transliteration (?) in addition to the title
    • article body: not applicable; human names should always be linked.
  • Other names
    • article title: no need to transliterate; translation is sufficient
    • article intro: title+strict BGN/PCGN (GOST/Nat'l System?) if necessary
    • article body: translation; strict BGN/PCGN if transliteration is necessary.
  • Special terminology
    • case-by-case basis?

Ëzhiki (ërinacëus amurënsis) 21:25, 13 March 2006 (UTC)

Discussion

Overall, I think this would provide a bit more precision for Russian transliteration where needed, and simplify the rules for Ukrainian, harmonizing transliteration for the two languages, without making any radical changes. It can be phased in as articles are written and edited, and wouldn't require any major conversion effort.

Disadvantages: BGN/PCGN does not officially support pre-1918 letters fita ѳ, yat’ ѣ, and izhitsa ѵ. They could be transliterated as f, ě, and i (although Ukrainian yat’ may be better as i). In most cases it would simply be appropriate to use the modern orthography and ignore these letters. Michael Z. 2006-03-11 02:12 Z

Argh, you beat me to it :) Well, I wasn't going to put forward a proposal just yet, but I wanted to outline all the areas where transliteration/romanization is used along with the practices used in each case and possible ways to improve them (some of which mirror comments above). I will try to cough it up next week, but please don't kill me if I don't.—Ëzhiki (ërinacëus amurënsis) 14:45, 11 March 2006 (UTC)
Gotcha! Take your time: even this minor change is a big deal, and we should get lots of input. I've tried to keep the explanation as simple as possible. Although there is a lot of nuance and room for editors' judgement, I think all of it can be boiled down to: 1) proper names of foreign origin used in English, and 2) foreign words, transliterated (see for example SEEJ's Style Sheet for Authors).
Unless we apply the following I disagree:
  1. Use of Ye for E only at the start of words (otherwise it can be misread as Ы/И); and after Ы/И for that matter (as in НЫЕ - NIYE)
  2. Use of I for Й, for the same reason as above
  3. Use of IY and YI for -ИЙ/-IЙ and -ЫЙ/-ИЙ respectively
  4. Use of Sch for Щ (prevents ШЧ confusion (Веснушчатый); after all, СЩ is closer in sound)
  5. No apostrophes in Ukrainian or for Ь
  6. Use of Ë for Ё, except for start of words as in Yё.
  7. Use of W for Belarusian ў.
  8. Use of H for Belarusian and Ukrainian Г

Optional

  1. Use of Ia for Я to avoid confusion with Y being reserved for Ы/И

--Kuban Cossack 16:53, 11 March 2006 (UTC)

I'm merely suggesting a refinement of the system we already use for these two languages. If you'd like to suggest your home-grown system to replace it in a more major change, that's okay, but please don't oppose some minor changes which will improve what we are already doing.
At least three or four of your points are already covered by the systems we currently use. Michael Z. 2006-03-11 16:59 Z
Those are just my suggestions; most are currently in use, although a few minor adjustments would certainly be more appropriate. And one other: no Lacinka for Belarusian.--Kuban Cossack 17:30, 11 March 2006 (UTC)
What kind of minor adjustments? Are you still opposed to my proposal? Belarusian is irrelevant, since I'm only talking about Russian and Ukrainian. Michael Z. 2006-03-11 18:00 Z
Points one - eight, everything else as you say.--Kuban Cossack 23:54, 11 March 2006 (UTC)
Do you mean both in proper names and academic transliterations? Do you understand that I'm proposing treating the two differently, and the reasons for it? Many academic books and journals do it this way, e.g. SEEJ's Style Sheet for Authors.
  1. Russian е→e, but ye "word-initially, and after a vowel, й, ъ, or ь" is the way BGN/PCGN already does it; am I missing something?
  2. Й→i: are you proposing a home-grown transliteration standard, or switching to ALA-LC?
  3. This appears to be an exception to your no. 2; perhaps acceptable in spelling a proper name, but not in an academic transliteration. In a precise transliteration using BGN/PCGN it would be -iy and -yy, respectively.
  4. Щ→sch: this doesn't seem to conform to any standard.
  5. Dropping apostrophes in an academic transliteration makes it ambiguous. Seems to be the opposite intention of your point 4.
  6. Ё→ë, but "word-initially, and after a vowel, й, ъ, or ь" is exactly the way BGN/PCGN does it.
  7. Irrelevant: I'm not making any proposal for Belarusian, but you're welcome to.
  8. Ukrainian г→h is exactly what BGN/PCGN does.
So your points 1, 6, 7, and 8 are already satisfied or not relevant, right? Regarding points 2 and 4, I feel that creating a new standard is a radical departure and would require massive re-writing of transliterations throughout English Wikipedia for consistency—I wouldn't propose it myself. Your point 3 seems contradictory, as you insist on й→i to eliminate ambiguity, but now you propose using it sometimes—it doesn't sit right with me, although for proper names something like this might be usable. Michael Z. 2006-03-12 00:14 Z
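
For reference, the positional rule for Russian е quoted in point 1 ("word-initially, and after a vowel, й, ъ, or ь") can be sketched in Python as below. This is an illustration of that one rule only: `PARTIAL` is a hypothetical, deliberately incomplete table covering just the letters of the three sample names, the function handles single words, and the soft sign is written with a plain ASCII apostrophe.

```python
# Letters after which Russian е is written "ye" under the quoted rule.
TRIGGERS = set("аеёиоуыэюя") | set("йъь")

# Partial table: only the letters needed for the sample words below.
PARTIAL = {
    "в": "v", "д": "d", "и": "i", "й": "y", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "р": "r", "с": "s", "т": "t",
    "ц": "ts", "ь": "'",
}

def romanize(word: str) -> str:
    """Romanize one word, choosing "ye" or "e" for Cyrillic е by position."""
    lower = word.lower()
    out = []
    for i, ch in enumerate(lower):
        if ch == "е":
            triggered = i == 0 or lower[i - 1] in TRIGGERS
            out.append("ye" if triggered else "e")
        else:
            out.append(PARTIAL.get(ch, ch))  # unknown letters pass through
    latin = "".join(out)
    # Restore an initial capital if the input had one.
    return latin[:1].upper() + latin[1:] if word[:1].isupper() else latin

for name in ("Ельцин", "Достоевский", "Лермонтов"):
    print(name, "->", romanize(name))
```

In the second name the first е follows a vowel and becomes "ye", while the second follows a consonant and stays "e".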
And I proposed:
  1. Abandon the "after a vowel, й, ъ, or ь" use of ye for e.
  2. Combining the two methods, like you said introduce modifications to the existing standard.
  3. Nevertheless there are endless examples in Wikipedia, particularly in Ukrainian names, where -yi is used; this is another modification I propose.
  4. Sch is used, or at least was used for Щ like in Khruschev.
  5. Apostrophes do nothing but confuse readers; definitely no apostrophes in titles, and really I see no point in giving them in headings either.
  6. Same as point one: abandon the "after a vowel, й, ъ, or ь" use of yё for ё. --Kuban Cossack 12:09, 12 March 2006 (UTC)
So in fact what we get is a modification of BGN/PCGN (which does touch ALA-LC) that is adapted to Wikipedia. --12:09, 12 March 2006 (UTC)
Without going into the specific merits of your proposed revisions yet, how can you justify developing a new transliteration standard? (The simplified BGN/PCGN that we use for Russian now is used in many books and articles, and so is a similarly simplified ALA-LC.) But when there are a number of tried and familiar standards that have been used in thousands of publications for decades, what reason is there for us amateurs to develop a new one which doesn't correspond to the cataloguing conventions of any library in the world? And if there are good reasons, how can they override the rule against original research? Michael Z. 2006-03-12 22:58 Z
IMO, original research may be developing new standards; what I proposed is a modification, or a simplification, of the existing ones (like no ye for e after vowels), which is why calling it original research is rather strong and unsuitable. I would rather suggest it is simply looking at existing standards and combining them by taking the best bits out of each. In the end I think the simpler transliteration is the better (hence the ye and no apostrophes); on the other hand I do believe that it has to be correct and non-confusing (-IY and -YI). Finally, let's remember that some versions were much more common historically (Khruschev not Khrushchyov); hence our standard should reflect that. --Kuban Cossack 23:20, 12 March 2006 (UTC)
The whole point of a standard is that it is standardized! If you change one detail, then it is no longer the standard; it is not even "our standard", it becomes non-standard. Calling it original research is neither "strong" nor "unsuitable", it's just a fact. Just try screwing bolts that are "just about" 4 mm diameter into standard 4 mm nuts.
The merits of your proposed modifications are irrelevant. You still haven't given one reason to abandon standards and make up a new system. Michael Z. 2006-03-13 00:50 Z

I'm not familiar with the subtleties of the Russian and Ukrainian languages, so I can only make general remarks here:

  • I can't see any good reason to use different conventions for titles and body text. Every name in the text has at least the theoretical potential to be turned into a link at some time in the future, and would then need to be "fixed" again. Or worse, because it is spelled differently, future editors will not notice that it is meant to represent the same name and should be turned into a link.
  • In my opinion, the Cyrillic version should always be given in the intro line. After all, this is the ultimate reference that everything else is based on. Leaving it out would be a major omission.
  • Listing alternative romanizations in the intro line may be useful, but should never replace the Cyrillic version.
--Latebird 15:17, 16 March 2006 (UTC)
I agree with the last two points. As for the first one, the body vs. title issue only refers to words that do not have a potential of being turned into an article (those include uncommon or alternative names and clarifications, and there are plenty of cases falling under this rather narrow definition).—Ëzhiki (ërinacëus amurënsis) 15:50, 16 March 2006 (UTC)
I understand the exception that the current Usage rules make for linguistic topics. But other than that, every further variation will just confuse readers. Authors not intimately familiar with the source language will also quite often get it wrong when they try to apply strict BGN/PCGN. On the other hand, I probably wouldn't mind enough to change those examples where someone has followed your suggestion. --Latebird 17:33, 16 March 2006 (UTC)

"Commonly used"

Another remark about how to determine "commonly used" versions. In the German-language WP we look at what is called the "frequency class", as determined by a tool by the University of Leipzig, which counts occurrences in recent printed publications. A term is considered "common" if its class is lower than 16 (at class 16, the word "der" (="the") is 2^16 times more common). Obviously, Google (or any web search engine) will not serve as a useful reference, lacking quality control of the indexed text corpus. The only remotely similar tool for the English language that I found is the Sydney Morning Herald Word Database, but maybe there are others out there that I missed. --Latebird 14:40, 21 March 2006 (UTC)
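
As a rough illustration of the frequency-class idea, the Python sketch below reads "class N" as "the most frequent word in the corpus is about 2^N times more frequent than this word". That base-2 reading and the rounding used are assumptions about how the Leipzig tool defines the class, and the function names are hypothetical.

```python
import math

def frequency_class(word_count: int, top_count: int) -> int:
    """Class N such that the most frequent word occurs roughly 2**N times
    as often as the given word (assumed definition, rounded to nearest)."""
    if word_count <= 0 or top_count <= 0:
        raise ValueError("counts must be positive")
    return math.floor(math.log2(top_count / word_count) + 0.5)

def is_common(word_count: int, top_count: int, threshold: int = 16) -> bool:
    """Apply the 'class lower than 16' rule described above."""
    return frequency_class(word_count, top_count) < threshold

TOP = 2 ** 16  # pretend the most frequent word occurs 65536 times
print(frequency_class(1, TOP), is_common(1, TOP))  # class 16: not common
print(frequency_class(2, TOP), is_common(2, TOP))  # class 15: common
```

Because the class is a ratio against the corpus's own most frequent word, it is only as good as the corpus behind it.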

That Word Database would be a great tool if it were finding the words we need. I tried Novogire(y)evo, Vladivostok, and Moskva—all returned no results (which is kind of surprising). Not really useful for our purposes, I'm afraid. Now, a LexisNexis search would probably work the best, but it is not accessible to the general public. I am sure there are Wikipedians with LexisNexis accounts who would be willing to run queries for us, but it is not a solution I like. Perhaps Google Print could be an appropriate substitution, at least for now?—Ëzhiki (ërinacëus amurënsis) 15:02, 21 March 2006 (UTC)
Unfortunately, the text base of the Sydney Morning Herald is too small to be of practical use, so it just serves to demonstrate the principle. I'm not sure about Google Print either, unless you can restrict your searches to works published within the last 10 years or so. Older publications probably don't correctly reflect current word use. A selection of books might also be (randomly) biased by topic; a text corpus based on newspapers and magazines may be more balanced. But then, those are rather theoretical considerations right now anyway... --Latebird 18:23, 21 March 2006 (UTC)

Why use transliteration

I hope this is not too much in the wrong place here; I'm just trying to understand the goal of this discussion. Maybe someone can explain this to me, or point me to the place where it is explained: What is the point of using any kind of transliteration in WP? I guess those systems were useful in times when scientists only had typewriters available and couldn't actually (with reasonable effort) print Cyrillic characters in publications. But in times of Unicode, where any computer may display and print (almost) any character ever conceived, and any article about a foreign language name should list the original spelling anyway, transliteration seems rather redundant. Or am I missing something fundamental here? --Latebird 11:29, 12 March 2006 (UTC)

So are you proposing that we use Cyrillic titles in an English encyclopedia??? --Kuban Cossack 12:09, 12 March 2006 (UTC)
To list the original spelling does not mean to use it as a title. It means to include it in the article for reference. --Latebird 13:45, 13 March 2006 (UTC)
Transliteration conveys foreign words to an English-language reader who doesn't know the Cyrillic alphabet. All English-language Wikipedia readers can read "The name of tachanka appears to be a Ukrainian version of an endearing form of the word tachka, meaning 'a cart'." But if the article was entitled тачанка, and referred to the word тачка, then 99% of readers would make nothing of that sentence.
In fact, using Cyrillic letters in English-language articles should be avoided altogether, except where there is a specific point made about the orthography, say in articles about the Cyrillic alphabet or letters, or to illustrate why Soviet soldiers called the U.S. Sherman tank Em Cha. Perhaps the Cyrillic spelling of an article's title should also be present to assist with Cyrillic-language Google searches, etc. Michael Z. 2006-03-12 22:43 Z
The only purpose of transliteration that I can see is the ability to unambiguously deduce the Cyrillic original from the Latin version. I guess I'm just questioning the necessity of that. Readers able to identify Cyrillic characters will find the Cyrillic original in the introduction anyway. For all the others, a reverse mapping is pointless; they just want to know how to (approximately) pronounce the word, which is the purpose of transcription. What information does transliteration add to this? And why would you not show the Cyrillic original within the article for reference? Or are we talking about entirely different things? --Latebird 13:45, 13 March 2006 (UTC)
The transliteration serves to convey the word, precisely in Latin characters. In addition to allowing one to deduce the original Cyrillic spelling, it illustrates the word with precision to a reader who doesn't know the Cyrillic alphabet, and lets him compare a name to another source, for example to a bibliography entry in an English-language book, or a search result from a library catalogue.
A transliteration essentially serves the same purpose as the Cyrillic spelling, but it is more accessible to many more readers of English-language Wikipedia. This is even more important for a general encyclopedia like Wikipedia than it is to the expert readership of a Slavistics journal, and they normally use transliterations over Cyrillic (e.g. see the SEEJ's Style Sheet for Authors).
Also, an appropriate method of transliteration accounts for the differences between languages by also serving the role of a transcription, useful for readers whether they know the Cyrillic alphabet or not. For example, it would convey the difference between the name in Russian Григорий, Grigoriy and in Ukrainian Григорій, Hryhoriy. Michael Z. 2006-03-13 17:44 Z
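
The Григорий/Григорій example can be reproduced with per-language tables. The Python sketch below includes only the letters needed for these two names, so `BGN_PCGN_SAMPLE` is a partial, illustrative excerpt and the function name is hypothetical.

```python
# Partial per-language tables: only the letters used in the two names.
BGN_PCGN_SAMPLE = {
    "ru": {"г": "g", "р": "r", "и": "i", "о": "o", "й": "y"},
    "uk": {"г": "h", "р": "r", "и": "y", "о": "o", "і": "i", "й": "y"},
}

def romanize(word: str, lang: str) -> str:
    """Romanize one word with the table for the given language code.

    Raises KeyError for letters outside the partial tables, which is
    intentional: the excerpt should not silently guess.
    """
    table = BGN_PCGN_SAMPLE[lang]
    latin = "".join(table[ch] for ch in word.lower())
    # Restore an initial capital if the input had one.
    return latin[:1].upper() + latin[1:] if word[:1].isupper() else latin

print(romanize("Григорий", "ru"))  # Grigoriy
print(romanize("Григорій", "uk"))  # Hryhoriy
```

The same Cyrillic letters г and и come out differently depending on the language, which is the distinction a single language-neutral table cannot express.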
"by also serving the role of a transcription" - And there we are again at a point where I don't think we're talking about the same thing. Transliteration converts the characters of a script into characters of another script. Transcription translates the phonemes of a language into (approximate) phonemes of another language. I don't think that any system can do both jobs at the same time without confusing the hell out of a lot of people. Your explanation is already confusing me, at any rate! Just to make sure I understand you correctly: Do you consider the system currently in use here for Russian to be a transliteration? --Latebird 23:10, 13 March 2006 (UTC)
I'm proposing that BGN/PCGN transliteration, one of the most commonly-used systems, be offered in the first line of articles with Russian and Ukrainian titles, and also be used in many or most places where Russian and Ukrainian words are represented. In case you missed it the first time, there's a summary table of BGN/PCGN transliteration at Wikipedia:Naming conventions (Cyrillic)/Romanization table.
Kuban Kazak and Ëzhiki have also offered proposals.
I don't see the point in general discussion about what I consider to be transcription and transliteration; if you're just trying to prove that I don't know what I'm talking about, I'll gladly concede the point. Now, are you interested in discussing the merits and disadvantages of this specific, minor proposed change to our current practices? Michael Z. 2006-03-13 23:45 Z

Sorry if I'm sounding too inquisitive. I just consider consistent use of terminology important, because by using one term or the other you may be implicitly stating some of your goals. If it is one of your goals that page titles can be unambiguously converted back into the correct Cyrillic character sequence, then transliteration is the appropriate term to use. If this is not one of your goals, then using the same term just muddies the waters.

(Alright, just found that in English, the word "transliteration" can also be used in a "wider sense". I still think it isn't wise to do so in WP, but at least now I have an explanation for my confusion...) --Latebird 00:52, 14 March 2006 (UTC)
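Whether a romanized title can be converted back unambiguously (the narrow sense discussed here) can be checked mechanically for a given table. A minimal sketch, assuming toy single-letter tables; `find_collisions` is a hypothetical helper and the two sample tables are illustrative fragments, not any complete system. The first fragment mirrors the fact that BGN/PCGN for Russian writes both й and ы as y.

```python
from collections import defaultdict


def find_collisions(table):
    """Return Latin outputs produced by more than one Cyrillic letter."""
    sources = defaultdict(list)
    for cyr, lat in table.items():
        sources[lat].append(cyr)
    return {lat: cyrs for lat, cyrs in sources.items() if len(cyrs) > 1}


# Toy fragments for illustration only.
DIACRITIC_FREE = {"и": "i", "й": "y", "ы": "y"}
ONE_TO_ONE = {"и": "i", "й": "j", "ы": "y"}

print(find_collisions(DIACRITIC_FREE))  # {'y': ['й', 'ы']}
print(find_collisions(ONE_TO_ONE))      # {}
```

An empty result is necessary but not sufficient for reversibility: multi-letter outputs can still be ambiguous when strung together, so this check only catches the simplest kind of information loss.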

So you are against BGN/PCGN transliteration for use in Wikipedia because it is only "transliteration in the wider sense"? Michael Z. 2006-03-14 19:07 Z
To the contrary. I was under the false impression that you all talked about transliteration in the narrow sense, and the implications scared me. --Latebird 22:30, 14 March 2006 (UTC)

Proposal for Belarusian

NO Lacinka. Using Wikipedia as a platform to make a system that has no official usage anywhere more widespread is ridiculous and violates WP:NOR. I propose to remove it from wiki altogether. Replacement - BGN/PCGN as everywhere else. --Kuban Cossack 04:53, 16 May 2006 (UTC)

First of all, it's up to native Belarusian speakers to express the best way to transliterate their language. The national Belarusian transliteration rules should have priority over the other systems. Only if the national transliteration system is not established should we look for substitutes such as BGN/PCGN and others. KPbIC 05:30, 16 May 2006 (UTC)
First of all, it is up to Wikipedia to establish. Native ethnicity has no value, as this is an ENGLISH encyclopedia directed towards ENGLISH readers. Also check Romanization of Belarusian for that matter. BGN/PCGN, as you can see, is a Belarusian system. (Which is currently MOST widespread, that is the bottom line) --Kuban Cossack 13:50, 16 May 2006 (UTC)
BGN/PCGN is the most widespread because it does not require diacritic signs. But the article you mentioned lists at least 5 romanization systems besides Lacinka, and all of them are correct! If the technology allows us to use diacritic signs, why should we take your point of view? Huh, Mr. Kuban kazak?! --Zlobny 15:20, 31 May 2006 (UTC)
I hope you won't mind if I answer this question. As multiple similar discussions about romanization of Russian showed, BGN/PCGN is the most convenient system for Anglophones (which should come as no surprise if you remember who developed it). It is true that we are able and have all the technology to use diacritic signs, but it doesn't mean that an average Joe in Anytown, USA (or in Middleborough, UK, if you please) would appreciate your efforts in putting diacritics everywhere you can reach. Diacritics are not generally perceived well by English-speaking people, nor are they easy for them to type. BGN/PCGN systems were developed with just that problem in mind; they work well, are quite common, and are used by a variety of respectable institutions. Why overcomplicate things when there is already a workable solution? --Ëzhiki (ërinacëus amurënsis) • (yo?); 15:35, 31 May 2006 (UTC)
EXACTLY! --Kuban Cossack 17:13, 31 May 2006 (UTC)
Good point. You are neither a native speaker of Belarusian, nor a native speaker of English, and you are not known as a linguist either. Then why should the wiki community value your controversial (Russification) suggestions? KPbIC 19:13, 17 May 2006 (UTC)
Sorry, since when is adopting a transliteration method Russification? Is having Ukrainian translit being abandoned in favour of Latynka de-Russification? Lacinka, or Latynka, vs a translit system has nothing to do with Russification. If you look it up in any dictionary you will find that adopting a transliteration method instead of an archaic and unused script does not fall into Russification. It does fall into Anglicization of Cyrillic. Finally, linguistics is not exactly my speciality, but neither is it that of any of the people who wrote these conventions. (Which is why WP:NOR is there in the first place). As for being a native speaker, that too means nothing. This is not be:wiki, this is en:wiki, and its readers are English speakers, not Belarusian ones. Please don't mix consistency and convenience with your "Russification" claims. --Kuban Cossack 20:40, 17 May 2006 (UTC)

What is wrong with Lacinka

From Talk:Maładečna:

Sorry, Alex and Irpen, but I'm missing your point completely. WHY do we need to modify WP:CYR? Did I miss any dramatic historical changes in Belarus or elsewhere which happened in the last few days? And, Alex, why are the letters funny? A dozen nations in Europe use such letters, and they treat them as native. They are not in the standard English alphabet, but what is "funny" about that? KPbIC 04:52, 16 May 2006 (UTC)

WP:CYR is still not adopted, as it is a slow process that is currently getting into its final stage. Belarusian translits, on the other hand, are a MESS. There are several systems in use on wiki. On one side we have the Lacinka, which is officially not recognised anywhere (including in Belarus, mind you). On the other side we have the Russian equivalents of all names, which presently have MUCH more popularity in all English publications. Then we have several Belarusian translit systems that are scholarly acceptable and are more widespread than Lacinka (at least in atlases), but wiki seems to have no clear definition of which to use. What I see is that all Belarusian names are treated as a Kiev/Kyiv equivalence in all English press and literature (including Britannica). --Kuban Cossack 05:01, 16 May 2006 (UTC)
Well, I have a number of issues with the Lacinka:
  1. It uses characters that an ordinary reader of wikipedia has no clue how to pronounce or how to enter on their keyboard.
  2. It tends to produce names that are barely used in English, e.g. if we look into the Google results, among 36 hits for Maładečna only 4 hits use the proper letters; the other hits are actually for Maladecna. That means that an American schoolboy could not find this Maładečna place on the map or on the Internet, or, probably, realize that this is the same Molodechno his great-grandmother came from.
  3. All official Belarusian servers use another spelling for their English version. Lukashenka does not like Lacinka (I am not a fan of Lukashenka, but it is he who rules Belarus now).
  4. I have a feeling (may be wrong) that many Belarusians are not comfortable with Lacinka.
  5. No English encyclopaedias (Columbia, Britannica, etc.) use Lacinka.
  6. Some people consider Lacinka an instrument of Polonization of Belarus. Polonization is not better than Russification, is it?
My feelings on the matter are probably twisted by my Russian heritage; the feelings of other editors may be twisted by the understandable desire to make a WP:Point against Lukashenka or against Russification in general. That is why I asked User:Mikkalai to take part in the discussion. If he says that Lacinka is the best for the Belarusian editors, I will withdraw my objections; if he says that Lacinka is unsuitable, I will stand against it quite strongly. The problem is that he has been avoiding direct suggestions so far. abakharev 06:43, 16 May 2006 (UTC)
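Point 2 above, that most search hits for Maładečna are really for the diacritic-free Maladecna, reflects how such names are commonly folded to plain ASCII. A minimal sketch of that folding using Python's standard `unicodedata` module; `strip_diacritics` and `NO_DECOMPOSITION` are hypothetical names. One detail is handled explicitly: ł has no Unicode decomposition, so it needs its own mapping.

```python
import unicodedata

# Letters such as ł have no Unicode decomposition, so map them by hand.
NO_DECOMPOSITION = str.maketrans({"ł": "l", "Ł": "L"})


def strip_diacritics(text):
    """Fold accented Latin letters to their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", text.translate(NO_DECOMPOSITION))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Maładečna"))  # Maladecna
```

This only shows why the two spellings tend to be conflated in practice; it says nothing about which spelling an article title should use.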
ad 1 The same is true for just about any non-English language. It can be worked around by including a real phonetic transcription and, when possible, a recording, but an average English speaker still won't be able to pronounce Slavic names.
ad 2 I highly doubt Maładečna is referred to often by any name in English. As for the hypothetical schoolboy, the same applies to Pressburg or Fiume and a whole lot of other cities.
ad 3 doesn't really need a comment
ad 4 It just so happens there are a few Belarusians on wikipedia; let's ask them. It doesn't make sense to have a Russian and a Bulgarian discuss this.
ad 5 Today most westerners still get their knowledge about Belarus through Russia.
ad 6 Nonsense. If anything it would be Lithuanization. Have you ever seen written Polish? It's completely different; in fact written Belarusian is closer to Czech and Slovak and southern languages such as Serbian, Croat and Slovenian. --Preceding unsigned comment added by 213.91.192.5 (talk • contribs)
1 For most non-English languages we reproduce the original spelling used in their country. Lacinka is just another way of transliterating the Belarusian version of Cyrillic, so why should we artificially create problems for English readers? We had the same problem with transliterations of Russian and after long debates chose a special system (Wikipedia:Romanization of Russian, a modified BGN/PCGN transliteration) that has no non-English characters, is intuitive, gives reasonable pronunciation, and is usually closer to the most frequent English usage. The same with Ukrainian.
2 If you had bothered to read this talk, you would have found that Molodechno produces 37K English-language hits, including all official Belarusian servers, the official city site, Britannica, etc. It is a wealth of information.
3 Well, since the official Belarusian sources never use Lacinka, it means that users would have difficulty finding any Belarusian info.
4 Agree
5 Dubious, so what
6 Written Belarusian is written in Cyrillic; it might resemble Serbian or Bulgarian or Macedonian, but it hardly resembles Czech. As for the sentiments, just ask User:Irpen about the sentiments some Ukrainians feel about Latynka - a very similar system for the transliteration of Ukrainian. abakharev 08:45, 16 May 2006 (UTC)
This is the first time I hear about Latynka, and as far as I can tell it seems like an unsuccessful attempt from the 19th century. Łacinka, on the other hand, dates back to the 16th century, is the alphabet that the earliest form of modern Belarusian was written in, and was used by the greatest Belarusian writers. Now as for the Polonization, the Polish alphabet is actually quite unique among Slavic nations and seems to be somewhat influenced by German orthography (for example using the letter "w" where all others use "v"). I'm no expert on Slavic languages, it's just a hobby, so if there is anything wrong please fix it, but take a look at the following table I took the time to make. As you can see, the Latin alphabet used by Belarusians is actually much closer to any other Slavic language (and to Lithuanian) than to Polish.
sound | Belarusian | Polish | Sorbian, Czech, Slovakian, Slovenian, Croatian, Serbian and Lithuanian (where applicable)
t̠͡ɕ | Ć | Ć | Ć
 | Č | Cz | Č
[w] | Ł | Ł | Ł
ɕ | Ś | Ś |
ʃ | Š | Sz | Š
 | Ŭ | |
ʒ | Ž | Ż or Rz | Ž or Ř
Belarusian Ł doesn't sound like [w], but like hard [l]! --Zlobny 05:26, 17 May 2006 (UTC)

Yet, this is an English wikipedia, and the fact of how close or far away Lacinka is is irrelevant; it has no usage in any of the ENGLISH language publications, official and unofficial. Hence using Lacinka for titling names of cities whose non-Lacinka names (be they translits from Belarusian or Russian) exceed their use exponentially is absurd. Lacinka has to go from titles and be replaced by transliterations; for most cases this should be done via Belarusian (like Maladzechna) and for large cities - Mogilev, Vitebsk etc. --Kuban Cossack 19:15, 16 May 2006 (UTC)

I would have to agree with Kuban Cossack on this. Preferences, practices, and egos of Belarusian, Russian, Polish, or any other Slavic editors have no meaning here in English Wikipedia. All we have to keep in mind is the needs of native English-speaking users. I personally would recommend using BGN/PCGN, but would also accept whatever other system that suits the needs of the English-speaking audience. The very least we can do before breaking more spears and axes is to get a native English speaker familiar with the topic and listen to him real well. Otherwise this debate is quite meaningless.—Ëzhiki (ërinacëus amurënsis) • (yo?); 14:55, 30 May 2006 (UTC)

Gentlemen, what's wrong with you? Why mention Poland or Poles in every topic even loosely related to Central and Eastern Europe? Sorry for strong words, but this seems a tad paranoid to me. Latynka was invented by a Czech, based primarily on Czech alphabet and promoted by Czech politicians. Yet - an instrument of Polonization? Same for Łacinka: modelled after Hus' alphabet more than anything else, yet mentioned here and there as based on Polish... Gosh. Sorry for OT. //Halibutt 14:53, 17 June 2006 (UTC)

Final Proposal

As there has been no other discussion available here is a summary of facts:

  1. Presently in Wikipedia there is no applied standard for Belarus-related titles.
  2. Some people prefer to use transliteration, whilst others prefer Lacinka.
  3. The latter system is a Latin alphabet for the Belarusian language.
  4. Presently it is not even scholarly recognised in Belarus, and its grammar is under question.
  5. All scholarly English media and literature (e.g. Britannica) use Belarusian transliteration (or the corresponding Russian name if it has a clearly more widespread usage). None use Lacinka.
  • proposal - As Lacinka has no scholarly value in Belarus or in English-language print, I propose that it be fully eradicated from article titles in wikipedia by moving them to their respective transliterated versions. Separately, it can and should be given in the article lead paragraph along with the other respective names for the place described.
  • comments -
    1. The only credible argument for Lacinka is that Wikipedia uses the respective Latin scripts, and not transliteration, for Serbian, Polish and Moldavian articles, so Belarusian is entitled to be seen similarly.
      However, points 4 and 5 cannot be applied to the listed languages, as the English media, the UN, and respectable guides like Britannica all use the Latin scripts in preference to transliteration, and their grammatical structure is officially recognised by their parent countries. For Lacinka neither of these points applies.
    2. As wikipedia is presently the only English encyclopedia that has Lacinka-titled articles, some people have expressed that this will contribute to it being more widespread.
      However, WP:NOR clearly states that wiki is meant to present facts, not look for them. Likewise, the argument of using wikipedia as a platform for making Lacinka more widespread is not allowed.
--Kuban Cossack 15:17, 24 May 2006 (UTC)
Can we first ask the opinion of Belarusian editors? --Yakudza 20:40, 24 May 2006 (UTC)
"Lacinka" is a system of rendering Belarusian text in Latin script, in fact, one of several such systems going under the same name. Those were adopted and used at some locations several times in history (mostly in areas of prevalently Polish language publishing). The reason was, *briefly*, the incapability of the author or technical impossibility for the publisher to use Cyrillic script. The rendition in question is NOT a transliteration, as it introduces a different way of using "L" with vowels, seen in Polish orthography, and arbitrarily denotes "hard" "Л" with the phonetically rather different Polish "Ł". So it introduces uncharacteristic for Belarusian
This system isn't mandatory and isn't well known and is supported almost exclusively (even exclusively, perhaps) by promoters and supporters of alternative Belarusian orthography. Possibly, it could find some use in the Polish part of WP; however, Poles have their own rules for rendering Belarusian names (http://slowniki.pwn.pl/zasady/629693_1.html), and in English WP surely it's BGN/PCGN which ought to be used! It doesn't even lose the Belarusian diphthongs, like some other systems. If it's relevant, I'm a native Belarusian speaker. Yury Tarasievich 06:56, 25 May 2006 (UTC)
Well, I can add point six wrt to what is said above. Thank you for the response --Kuban Cossack 11:16, 25 May 2006 (UTC)
It would've been very funny, if it were not so sad. It's the 21st century, and a Russian imperialist on English Wikipedia pushes imperial Russian spelling into Belarusan names and forces Russification into Belarus-related articles, in all possible ways (POV, lies, spelling...). Hilarious, simply hilarious. Absolutely nothing has changed since the 1863 anti-Russian uprising of Kalinouski. --rydel 15:07, 31 May 2006 (UTC)
Irrelevant material. Please withhold personal insults and use ACADEMIC language (which I know is hard for you, but that is part of Wikietiquette... behave yourself in a civilized manner.) --Kuban Cossack 17:11, 31 May 2006 (UTC)
P.S. Yury Tarasievich: there's enough misinformation here (thanks to our Russian "friend"), why introduce more confusion by making such a strange summary of Lacinka history? I'd like to refer you to this article: http://www.cus.cam.ac.uk/~np214/lacin.htm
And I quote from the article:
The problem now is however also about "which Lacinka?". These uncertainties proved to be very harmful. ... Firstly, there is an old uncertainty about spelling the combinations of the "i" preceded by a vowel. While the traditional spelling is e.g. "akademii", "Rasiei", "racyi", "dla ich", some actually printed it as "akademiji", "Rasieji", "racyji", and "dla jich". Secondly, some have been urging to "upgrade" the traditional Lacinka, to make it supposedly 'more convenient' for modern purposes, or to 'simplify' it. There is (was?) a fairly active group urging to replace the "ŭ" (the "u"-consonant), with the "w" character (as the latter letter has otherwise remained "unemployed" since WW2). Their argument was that the "ŭ" was not in the standard computer character set. This has been, however, not the case after the Unicode fonts become available, such as for the MS Office 97, and there is the UTF-8 encoding option in the Netscape and Explorer.
So I was right, was I not? Which variant of this archaic and obsolete system is right is still disputed, yet some already want it to substitute for English wiki titles (with a system that 99% of the English-speaking world does not know of, and violating WP:Naming Conventions in the process...)--Kuban Cossack 17:11, 31 May 2006 (UTC)

I will repeat here my message from Wikipedia talk:Naming conventions (city names): There is an official transliteration system applied by State Cartography Committee of the Republic of Belarus that is to be used in transliterating Belarusian toponyms in foreign language texts. Basta.. It is not exactly Lacinka (although it has most features of it) and is actually used on maps. The official transliteration should be in the article's name - all other former and Russian names shall be redirects. We have Mumbai as the article's name - and not Bombay, don't we? Even if Bombay used to be the traditional and most widespread English name for the city. So, I think, it would be logical--Czalex 15:52, 31 May 2006 (UTC)

Nope, it's not logical. The official transliteration system you are referring to is used by Belarusian authorities in their own foreign-language publications. Russia also has its own GOST. Ukraine has its official transliteration system. All these systems have a purpose, but they are not a good fit for the needs of the English Wikipedia. This had been discussed numerous times in the past as well—foreign governments have no authority when it comes to what transliteration system Western media use; the best they can do is to provide recommendations. If a system does not efficiently serve the needs of Anglophones, why should we use it?—Ëzhiki (ërinacëus amurënsis) • (yo?); 16:12, 31 May 2006 (UTC)
That is exactly why it is unsuitable for use in English wiki. --Kuban Cossack 17:11, 31 May 2006 (UTC)

P.S. ATTENTION! I also would like to draw your attention to the fact that this Russian user User:Kuban kazak has already changed Homel to Gomel on all Wikipedia articles, without any prior consulting or permission or voting from any of the WP-community members. --rydel 16:27, 31 May 2006 (UTC)

As that is completely irrelevant to this discussion I am crossing it out. --Kuban Cossack 17:11, 31 May 2006 (UTC)
Actually speaking of Gomel which version does Britannica use which you yourself said is the sole applicable guide to ENGLISH foreign names? is it the archaic Homiel or Homyel or the BGN/PCGN Homel? --Kuban Cossack 17:11, 31 May 2006 (UTC)
Homel/Gomel problem is very relevant. It is about transliteration of Belarusan geographical names into English. It's 100% relevant. Period. But it's also relevant because it demonstrates and proves that you are the wrongest person to propose/discuss any such conventions because (1) you are NOT behaving properly within Wikipedia community conventions, (2) you do NOT follow Wikipedia rules, (3) you change spelling for political reasons, (4) you do NOT know Belarusian language; (5) you are NOT a linguist; (6) You have NO experience or knowledge of the existing transliteration schemes and conventions of Belarus geographic names, as the above discussion indicates. The person is the problem, in this case. It's not an AD HOMINEM attack. It's a fact. --rydel 18:08, 31 May 2006 (UTC)
Rydel, I can disprove every single one of those points and show why they are irrelevant to THIS discussion, but is that the stubborn and uncompromising scenario that you seem to follow? No scholarly reasons left, and now you resort to attacks and discrediting actions... Do realise one thing: the proposal is still up there, and so far, apart from attacks and insults, I am yet to see any objections to it from YOU. If that is because you have not got any, bugger off; I will blank (not cross) further comments based on the Russification or the equivalent bullshit. Please end this circus to save what remains of your reputation.--Kuban Cossack 13:30, 1 June 2006 (UTC)
  • What are you folks talking about??
That my recount of "Lacinka" history seems "strange" to Mr.Rydel, is unsurprising, considering he himself promotes it actively, both here, and on his site http://pravapis.org (I even seem to recall he wrote this article http://www.cus.cam.ac.uk/~np214/lacin.htm himself about 5 years ago).
That article about Lacinka was written by a Ph.D. student from Cambridge university, that's why it's located on Cambridge university web server, btw. I have nothing to do with it. As for my personal preferences, it is not relevant in this discussion. Besides, if you read my blog, you would know that I oppose introduction of Lacinka to Belarusians. This doesn't matter though; do you have any objection to the facts presented in that article? --rydel 14:53, 3 June 2006 (UTC)
Okay, so your preferences have no relevance here. So does your perception of "strangeness" (of my recount). Did I twist the truth? Did I omit something important? The article in question does both things. I've re-read it right now (first read it in 2000). Crafty wording (weasel-wording in WP-speak) and dropping the unsuitable facts. That's what my objections to that article are. BTW, the author is (or wants to be) Ph.D. in political sciences, for what it's worth.
And anyway, whatever the objections, of what relevance is the supposedly internal system of rendering of the Belarusian text here, in English WP? —Yury Tarasievich 07:21, 5 June 2006 (UTC)
I never doubted the facts about Lacinka. However, in the article itself they are directed against it. This discussion is not so much about Lacinka as about how to make the Belarusian articles more accessible for English readers. As you are in opposition to Lacinka, that is good, since the point of starting this discussion was to remove it from wikipedia. The article proves its unsuitability. Thank you for the reference. --Kuban Cossack 15:40, 3 June 2006 (UTC)

Let's recount the facts:

"Lacinka" existed in several versions and none of them was ever standard. The 5th (1929) and 8th (1943) editions of the Tarashkyevich grammar were just non-mandatory proposals, initiatives.
"Lacinka" is NOT transliteration. It has its issue of "L", and it introduces very incompatible new meaning of Polish "Ł".
"Lacinka" introduces artificial dichotomy on use of "J".
"Lacinka" requires diacriticals.
"Lacinka" is known and interesting and understandable -- and useful! -- only to its promoters.
All that pretty much excludes its use here..
  • Now for BGN/PCGN. Why is the Romanization_of_Russian even mentioned here?? If I am to believe the recount of BGN/PCGN here Romanization_of_Belarusian -- and I have my doubts because of the dichotomy on "G"/"H", -- then there exists -- unsurprisingly! -- a special version of BGN/PCGN for Belarusian names. As far as I can see, it is very adequate, being fairly unambiguous and not requiring diacriticals.
  • The contemporary German and American maps I've seen use what looks to me like BGN/PCGN with the addition of an apostrophe to denote the "soft sign".
  • The Belarusian Standard on Geographicals (however it is called) isn't, of course, mandatory or even applicable here -- just think to whom national standards are to be mandatory. :) And it resembles BGN/PCGN for Belarusian very much, anyway, possibly even represents the "true text" ("dakladny tekst") of it. Yury Tarasievich 14:00, 1 June 2006 (UTC)
Well, regardless of what Рыдель is on about me "lacking knowledge", you summed up exactly what I had in mind about the unsuitability of a system that has less official status than Egyptian hieroglyphics. (And I do not recall Ancient Egyptian nationalists on wikipedia moving Egyptian articles to their hieroglyphic names. ;) --Kuban Cossack 16:32, 1 June 2006 (UTC)
Re: BGN/PCGN, and why Romanization of Russian was mentioned here. Well, it was mentioned as an example, as a means to help with a comparison. If a simple comparison offends you just because "RUSSIAN!" is mentioned, I feel truly sorry for you, my friend. BGN/PCGN is available for twenty-nine languages, by the way—I was by no means offering to use Russian version of it for Belarusian names. As you rightfully noted, there is a Belarusian version of BGN/PCGN, and I am more than willing to help create an article about it if you (or anyone else) help(s) me find some examples for the summary table. The article will look a lot like BGN/PCGN romanization of what-shall-not-be-named. Once we have that up as a reference, everyone will know exactly what Belarusian BGN/PCGN looks like instead of guessing as to what is it that's being proposed.—Ëzhiki (ërinacëus amurënsis) • (yo?); 18:22, 1 June 2006 (UTC)
No need to "feel sorry", as I wasn't "offended". :) Just that talking about X, one should discuss Romanisation of X in the first place, and that was all I wanted to say with my remark. :) Phonetics differ, after all (take the issue of "E" for example).
BTW, Wikipedia's article on Romanisation of Cyrillic seems somehow misguiding to me. I've researched the matter a bit yesterday, but I have urgent work to complete, though, and can't address this myself right now, so for now I'll give just some relevant links for interested:
(PCGN) http://www.pcgn.org.uk/Romanisation_systems.htm
(BGN) http://earth-info.nga.mil/gns/html/index.html
(UN) http://www.eki.ee/wgrs/rom2_be.pdf
(good reading on GOST 7.79, and provides often dropped "Table A", too) http://www.viniti.ru/cgi-bin/nti/nti.pl?action=show&year=2_2002&issue=9&page=1
Yury Tarasievich 09:12, 2 June 2006 (UTC)
Thanks for the clarification and the links—the last one is extremely interesting and useful. The first two, not so much—I actually happen to have a hard copy of BGN's Romanization Systems and Roman-Script Spelling Conventions (covering twenty-nine languages), which I don't believe is widely available (I may be wrong). That's the reason why I offered my assistance in the first place. Also, we don't have an article on Romanization of Cyrillics; did you mean something else? Thanks.—Ëzhiki (ërinacëus amurënsis) • (yo?); 12:14, 2 June 2006 (UTC)
My mistake. Actually, I was meaning this: BGN/PCGN romanization
and this: Romanization of Belarusian. Too much unrelated material and suspicious wording. I haven't checked those articles closely for factual correctness yet.
Anyway, the whole Category:Romanization seems to be a mess and in need of massive re-organising. I got lost in it immediately. :) I'd suggest to create "disambiguation" root page catching all "general" terms like "Transliteration" and "Romanisation" and "Transcription", divided into sub-sections for each Script A -- Script B direction of conversion, each sub-section having list of relevant systems, standards and methodologies.
I do not know whether it is implementable in wiki software, though. —Yury Tarasievich 12:49, 2 June 2006 (UTC)
Yury, I am not sure why you are finding the whole structure confusing. There are a lot of gaps, true, but the hierarchy is quite straightforward. Taking Belarusian as an example, if we had all articles written, the structure would have been as follows:
  1. Romanization of Belarusian—a general overview of transliteration systems for Belarusian
    1. Scientific (scholarly) transliteration of Belarusian (currently a part of Scientific transliteration)
    2. ALA-LC romanization of Belarusian
    3. BGN/PCGN romanization of Belarusian
    4. National System of Romanization of Belarusian
    5. ISO 9 (not language-specific)
    6. Łacinka
There are also overview articles about each particular system:
  1. Transliteration/Romanization
    1. Scientific transliteration
    2. ALA-LC romanization
    3. BGN/PCGN romanization
    4. ISO 9
    5. more...
Not each of these system has a Script A→Script B designation; ISO-9, for example, is not language-specific. Category:Romanization can still be cleaned up, of course, but I think you are lost there only because there are so many gaps, substitutions, and incomplete articles.
Also, could you please point out the wording you think is "suspicious"? In what sense is it suspicious? Is it factually incorrect, or does it draw conclusions based on the wrong premises? In what way do you think it can be improved?—Ëzhiki (ërinacëus amurënsis) • (yo?); 14:32, 2 June 2006 (UTC)
Seems "confusing" to me, because info is badly structured and non-uniformly named and not very comprehensively organised inside of the articles. I had problems finding what I wanted -- and that tells me something.
"Script" as in "Cyrillic Script", "Latin Script", sub-divided into language sections, e.g., "Russian", "Belarusian", ..., for "Cyrillic to Latin" with further sub-points for, e.g., BGN, PCGN, ALA-LC, ISO 9 (BTW, non-language-specific entries should be repeated in *each* language section).
As for the "suspicious" -- I think I see some POV-pushing weasel-wording there; however, I have yet to research this better, for the problem is not limited to this category, but exists, e.g., in the Alphabets series. Have to prepare. --Yury Tarasievich 21:40, 2 June 2006 (UTC)
OK, thanks for the explanation. As promised, I created BGN/PCGN romanization of Belarusian. Please add examples (I left some Russian ones in the comments as a formatting suggestion, but feel free to improve the layout in any way you see fit).—Ëzhiki (ërinacëus amurënsis) • (yo?); 15:25, 6 June 2006 (UTC)
Thanks! I'll surely try to improve it... wait a moment, I see "Е"(cyr)-"E"(lat) conversion in the BGN/PCGN romanization of Belarusian, while Romanization of Belarusian suggests "Е"(cyr)-"YE"(lat) for the BGN/PCGN. And the current UN document on romanisation of Belarusian says that "BGN/PCGN 1979 System" uses "YE" for that purpose. What to believe? --Yury Tarasievich 07:24, 7 June 2006 (UTC)
Hey, I believe you just erred on that there. I've just obtained (sort of) the PDF of the BGN book and it tells "YE" should be used there. Could you verify, please? —Yury Tarasievich 08:15, 7 June 2006 (UTC)
Sorry, my bad. Apparently, I looked in the cursive column instead of the romanization column when typing the info in. Thanks for catching this. I have already made a correction to the article and re-checked the rest of it—there should be no more errors.—Ëzhiki (ërinacëus amurënsis) • (yo?); 14:08, 7 June 2006 (UTC)

[edit] Resolve

So can we begin the mass move of articles or not? I think we all agreed that Lacinka has NO arguments in its favour. --Kuban Cossack 22:55, 9 June 2006 (UTC)

Does "moving" mean changing the rendering of the Belarusian names to BGN/PCGN system? Can't see why not. ---Yury Tarasievich 20:25, 10 June 2006 (UTC)

[edit] Proposal is now LAW

As there have not been any serious objections, the proposal is now LAW. So any further objections can be settled only by a new policy-change proposal from the people who want to see Lacinka again. If no such proposal is raised in the near future, there will be a mass move of articles from Lacinka to transliterated titles. I thank all the people who have supported this in becoming official. --Kuban Cossack 14:43, 11 June 2006 (UTC)

I wasn't paying close attention -- where's that "modified BGN/PCGN" defined, esp. for Belarusian? There is, e.g., Belarusian BGN/PCGN (1979), which is both quite adequate and as close to being "English standard" as possible. Why "modified"? ---Yury Tarasievich 06:20, 12 June 2006 (UTC)
Another important correction: the directive on "Lacinka" should read: "L. is not to be used for the Cyrillic Belarusian names (words?)" There *had* been some cultural artifacts made exclusively in one version of Belarusian Latin alphabet or another (e.g., newspaper «Biełarus»). Re-conversion of such to BGN/PCGN would be pointless. ---Yury Tarasievich 07:09, 12 June 2006 (UTC)
I'm willing to speculate that "modified" is a remnant of a copy-paste from WP:RUS-related definitions (current Russian xlit policy indeed uses a modified version of BGN/PCGN), but I, too, would like KK to confirm that. As for the second proposition, it makes sense to me, but, again, the definition needs to be slightly amended if there are no objections.—Ëzhiki (ërinacëus amurënsis) • (yo?); 15:06, 12 June 2006 (UTC)
Agree, I think the question we are raising is about the form of the transliteration, however my concern is that Lacinka is eliminated. Anyway I am sorry for my delay in participation due to my departure to Feodosiya. --Kuban Cossack 09:27, 15 June 2006 (UTC)

[edit] Conventional naming examples

It has been proposed that the notes about "conventional names" at Wikipedia:Romanization of Russian should be moved up to this guideline, to serve as an example for all the affected languages. Please note that this is not a change to the convention, merely the addition of some cases which illustrate the principle.

If there are no objections, I'll move that section into this guideline. We could make the examples a little more international—proposals welcome. Michael Z. 2006-06-08 21:25 Z

[edit] View point of someone originally from outside the Cyrillic area

OK, I agree I have some private interest in this - as my name was once mangled beyond recognition when turned into the Cyrillic alphabet. When I married in the Ukraine, I found that the translator (the only one in the town who could translate from Dutch into Ukrainian, and I guess he also translated from German, now and then) had interpreted my Mazurian name as ... German. So, there I was - with a "ю" instead of "ы" at the end of my name and the guy responsible having gone on a holiday. In the end I decided the best way was to wait for him to come back after the marriage ceremony and then pay him extra to turn the "ю" into "y" and not "ü" for the translation of the marriage certificate in the opposite direction. Unfortunately, I had not counted on the Ukrainian passport guys putting the same "ü" in my wife's new name...

I think when it says "use the conventional name" we must be careful. In the case of Ludmilla Tourischeva we are faced with two different problems. First, the lady still lives, and lives in a country which uses the Cyrillic alphabet. Second, the Christian name is spelled wrong, apart from any transliteration issues. This is not an "Alexander" case. I think whether we use the "conventional" name (which actually googles a little better in the Tourischeva case because the votes split on Turishcheva and Turischeva, nobody writes Tourishcheva) will have to depend on a few factors:

1) How well established is that conventional name? We need something to differentiate.
2) If the name is not so "well established", does the person in question still live in the Cyrillic area? If not, do we have an idea how (s)he likes the name to be transliterated?
3) If the "established" name is spelled wrong (even if it is only one part of the name), discard the complete name and go for normal transliteration.

I hate to refer to nationalistic sentiment here, but I have noticed that using frenchified or germanized versions of Russian names in an English text (I have even seen "Poutine" in an English text) makes them look distinctively non-Russian.User_talk:Pan_Gerwazy--pgp 14:19, 14 June 2006 (UTC)

As to the "Ludmilla Tourischeva" variant, I believe it's more popular than the other ones. By the way, the previous discussion showed that the arguments for the "Ludmilla Tourischeva" variant are stronger than those against.
As to "shch" or "sch" usage for "щ": for example, transcribing "Лещенко" as "Leschenko" is much more popular on the web than as "Leshchenko". And the "Petr Leschenko" variant also seems to be more popular than "Pyotr Leshchenko". Hence, the preference for "shch" over "sch" is questionable, at least for me. Cmapm 08:18, 15 June 2006 (UTC)


You cannot have your cake and eat it! Meaning you cannot use the web to say it must be "Leschenko" (how many references are in the English language, by the way?) and then forget about it when Lyudmila Turishcheva is concerned. If you had had a look at Romanization of Russian you would know why "shch" is to be preferred as transliteration. This is the English Wikipedia, not the German or the French one - so how to transliterate is not even in doubt. The point is : do we transliterate or not. We do NOT, when there is a conventional name in English. In the case of Leshchenko a good clue is given by the number of references in the English Wikipedia itself to him IN TEMPORE NON SUSPECTO (before this argument started). 6 for Pyotr Leshchenko, 2 for Pyotr Leschenko and zilch zero nada kein einzige for Petr Leschenko. In fact, Petr is totally unacceptable, the high number of googles for that one are caused by a mistake (just Like Ludmilla with two l's is a misprint by the way) - people mixing up Russian "e" and "ë". "Conventional" would be "Peter", of course. If there is no conventional way of spelling the name in English (and the low frequency of the name in English texts suggests that) we transliterate. Using Romanization of Russian. Again, to do otherwise - use a German looking name in an English context - would mean to de-Russify the name. And judging on how desperately Leshchenko wanted to return to Russia after World War II, I suspect he may not have liked that himself.
Returning to Turishcheva, I can tell you (having experienced this once more with my own wife three years ago) that if Lyudmila is now a Ukrainian citizen, she (or anybody else) does not have much say in how her name is spelled outside the Ukraine. That is why I leave her name to the Ukrainians on Wikipedia. User_talk:Pan_Gerwazy--pgp 12:49, 15 June 2006 (UTC)
Having raised the "how many ENGLISH googles" argument myself, I decided to have a look. The number of googles for "Petr Leschenko" does look impressive - but only at first sight. When you page through, you will find that Google only finds 37 unique pages. The count for "Pyotr Leshchenko" LOOKS much lower, but in fact they come from 30 unique websites. Now add the condition "English language". Surprise, surprise: 13 times "Pyotr Leshchenko", 12 times "Pyotr Leschenko", 7 times "Petr Leschenko", 7 times "Petr Leshchenko". In fact, in my opinion, these google numbers are so low and so evenly divided we can safely say there is no established conventional name in ENGLISH. Which means: transliterate. That the correct transliteration is also the leader in English Wikipedia and the marginal leader in English googles, is a good point. User_talk:Pan_Gerwazy--pgp 13:26, 15 June 2006 (UTC)
As I said earlier, Google searches are not the argument. Even a single source, if reliable, would be worth attention. By "popular on the web" I mean that a lot of reliable sources exist on the web, not that there are a lot of Google hits. In favor of "Lev Leschenko" I can provide links to a lot of sources, while for "Lev Leshchenko" I can provide far fewer. I am not so sure about "Petr Leschenko"; it just seems popular to me, but I didn't dig deep enough into that. But if I do dig, I should, as before, analyze each of the sources, not simply count Google hits.
As a side note, I think, that a discussion would be more constructive, if each claim is backed by cited sources, not just by somebody's personal experience. Cmapm 15:35, 15 June 2006 (UTC)

[edit] Belarusian language

It seems User:Kuban kazak prefers "revert wars" to normal dispute. He has changed this page recently to:

  1. Where that spelling is established in English, the established English name is used.
  2. Elsewhere is transliterated using a modified BGN/PCGN system. Lacinka is NOT to be used.

Why?

1. First of all, what's wrong with Lacinka? It is surely the only "one-to-one" transliteration system that can provide direct transliteration from Cyrillic and back without loss of information; it is established and widely used. Diacritics are not a problem anyway - French, German, Polish names and so on use diacritics and there are no objections.

See reasoning above. --Kuban Cossack 09:38, 17 June 2006 (UTC)

2. If there are problems with Lacinka, don't forget about state supported transliteration system (so called NSR). Even it, while it is much less clear and accurate than Lacinka, is many times better than US BGN/PCGN system.

That is a POV --Kuban Cossack 09:38, 17 June 2006 (UTC)
Those points 1 and 2 are rather the sort of thing that a well-wishing but very, very uninformed person would utter.
The "state supported system" is: 1) non-mandatory outside of Belarus 2) non-recommended by UN 3) impossible to use with strictly English alphabet 4) redundant, as the "one-to-one transliteration" is already handled by ISO 9 since 1968, and English transcription is handled by BGN/PCGN romanization of Belarusian since 1979.
The "Lacinka" is: 1) quasi-orthographical, not transliterating, system 2) never codified properly (ref. to Tarashkyevich's 5th edition) 3) breaking Belarusian language traditions (chiefly with rules on "L" and "Ł", which, thankfully, the "state supported system" dares not do) 4) used by virtually nobody 5) redundant even for Polish WP, as there are already codified Polish rules of the rendering of the Belarusian names (ref. to the archive of the discussion).
That's why Belarusian names in English WP should be rendered in BGN/PCGN. ---Yury Tarasievich 12:47, 17 June 2006 (UTC)

That's why I PROPOSE next changes:


  1. Where that spelling is established in English, the established English name is used.
  2. Elsewhere is transliterated using Lacinka (preferred) or NSR system (where Lacinka could not be used).

Please speak and vote. --Monk 09:06, 17 June 2006 (UTC)

Your absence has already allowed my proposal to become law; unless you want to start a new one, these are now the established rules. --Kuban Cossack 09:38, 17 June 2006 (UTC)
Yes, Kuban kazak, I really do have things to do other than sitting in Wikipedia 24/7 as you can. Do you really think you can establish your POV as rules just while someone isn't watching? No way. We have established rules and those say we should use Łacinka. --Monk 10:25, 17 June 2006 (UTC)
The proposal has been there for a month. I too just took a lengthy break to shoo off the Nato ships in Feodosiya btw, but that does not change anything; the discussion is above and it is now settled. There was consensus based on the people who have been involved. I can't be blamed for your absence. Start a new proposal. However, break the rules and I will ask the admin to take sanctions against you. --Kuban Cossack 11:50, 17 June 2006 (UTC)

User:Kuban kazak is well known for trolling and other wikipedia rule violations. In particular, this is not the first time he has used a strategy of introducing a controversial change by claiming imaginary consensus, and then asking others to jump through hoops in order to return to the status quo. (see User talk:Rydel, Talk:Maładečna). The discussion over the Belarusian naming convention is far from over, and this is clearly indicated above. —Preceding unsigned comment added by 134.84.5.47 (talk • contribs)

Please discuss relevant issues rather than insults. --Kuban Cossack 10:02, 18 June 2006 (UTC)

I don't understand what's there to discuss at all. This is English WP, not political meeting. There exists BGN/PCGN system for Romanisation of Belarusian language (BGN/PCGN romanization of Belarusian) since 1979, which seems to be the only system internationally recognised. At very best, we could ponder the possibility of the Belarusian state system but to what purpose? And Latinka is just a non-issue here, actually. See my June 17 post several paragraphs before for more details. ---Yury Tarasievich 08:31, 18 June 2006 (UTC)

The proposal was there for a month; Kuban Cossack put notices on all possibly related forums. I personally begged for feedback on the Belarusian announcement board and on the talk pages of many involved people. The only people who answered were Yury Tarasievich and Ezhiki. All the people who participated eventually came to the consensus: Lacynka should go, BGN/PCGN should come. I think the company was reasonably qualified. Yury Tarasievich seems to be an expert in transliteration, Ezhiki is the author of the Russian wiki system, Kuban at least speaks Belarusian. Anyway, nobody else cared to participate. Now, a month later, the talks start from the very beginning. I personally do not care about the outcome; if you want to use the system that makes Belarusian towns look less represented on the internet than Ethiopian villages of the same size, go ahead. abakharev 10:52, 18 June 2006 (UTC)

[edit] Period of discussing

There seems to be a discussion here (at last) about the Belarusian places. Please do not apply the convention to move any article related to Belarusian places. Let's discuss rather than revert war. abakharev 10:59, 18 June 2006 (UTC)

Well, the message is out to the people who insist yet do not participate in discussions, and break WP rules. I put a note on Portal:Belarus a month ago; so far I have seen no objections to my proposal (apart from insults from Rydel and Krysa). So what was I to do, wait for a year? Молчание знак согласия ("silence is a sign of consent"). I acted on instinct and the policy has been changed. You can't unchange a policy just because you don't like it. There are rules to wikipedia. If you had any objections you had a month to raise them. Getting into an edit war is not going to help you, because in such a case the admin will follow the rules and defend the existing policy (even if it is only a week old).
Now the only way you can change the policy back is to start it out anew, put a disputed tag on the text and restart the discussion. However before you do so, please see the relevant notes that were discussed between me, Yuri and Ezh above. --Kuban Cossack 11:07, 18 June 2006 (UTC)

I've edited the Belarusian section of the rules somewhat, removing the bits that could lead to ambiguity and bad feelings. What do you think? ---Yury Tarasievich 14:15, 18 June 2006 (UTC)

I've further amended the Belarusian section. I hope this would conclude the dispute on the Belarusian section constructively. ---Yury Tarasievich 10:40, 8 July 2006 (UTC)

[edit] Page Moves

Is there a view on whether it would be right to use WP:CYR to justify page moves? Yuri Luzhkov to Yuriy Luzhkov, for example? --Spartaz 08:44, 8 July 2006 (UTC)

Can't see why not --- as long as all the valid alternatives have their entries. ---Yury Tarasievich 10:42, 8 July 2006 (UTC)
You mean Yury Luzhkov as well as Yuriy Luzhkov ;)? I suppose we should make sure that all the possibilities link to the main article? --Spartaz 21:38, 8 July 2006 (UTC)

[edit] Category:User Cyrl

use ISO 15924 script codes [4]

  • Category_talk:User cyr
  • Template_talk:User cyr

Tobias Conradi (Talk) 23:20, 14 August 2006 (UTC)

Just for clarity Tobias has proposed moving the above to his version. Surely we should be discussing this first? Unless there is a clear policy on using ISO naming conventions it strikes me as pointless moving these given that they are perfectly understandable as they are. --Spartaz 06:16, 15 August 2006 (UTC)
A three-letter code can easily be mistaken as language-related, which it is not. It is script-related. Don't make it personal with "his" version. It's not mine, it's ISO. And certainly we do not need a policy for every little thing, but hey, write one if you like. Tobias Conradi (Talk) 09:15, 15 August 2006 (UTC)

[edit] Kyrgyz

I just updated Romanization of Kyrgyz. How about making the BGN-PCGN system the recommended way to romanize Kyrgyz names? One problem I see: Kyrgyz last names like Akayev are actually Russifications, I don't think the "ае" combination exists in real Kyrgyz words. Markussep 18:36, 12 September 2006 (UTC)

[edit] Italics in Cyrillic characters

Is there any agreement on whether or not to italicize Cyrillic characters? If not, I propose to adopt a guideline advising against the use of italics in these cases:

  • Italics are not necessary, since the difference with "normal Latin text" is obvious.
  • Italics hinder readability, at least for those of us not used to those funny characters :-)

Of course, there would be exceptions, as for the "Bibliography" and "References" sections, where italics in these scripts do tend to make sense.

I imagine something very simple, along the lines of:

Do not use Italics for the following cases:
  • Foreign language words and texts in Cyrillic characters, such as Кириллица.

I'm posting this in Wikipedia talk:Manual of Style (text formatting)#Italics in Cyrillic and Greek characters & Wikipedia talk:Naming conventions (Greek)#Italics in Greek characters too. - - Regards, Evv 03:16, 12 October 2006 (UTC)

Usually, to denote the "alien" word quote signs suffice, and yes, italics looks not very legible in typical computer display sans-serif, no matter what script. Yury Tarasievich 06:33, 12 October 2006 (UTC)

I concur. Furthermore, in most cases in the body of an article Latin transliteration is preferable to Cyrillic letters anyway, because it is more accessible to English Wikipedia's audience. Some specific problems with Cyrillic italics:

  • Accented Cyrillic letters fail to display on some systems when they are italicized, due to a lack of precomposed characters in available fonts (this affects Safari on Mac OS X, but it appears that Firefox/Mac creates the characters by superimposing accents). This particularly affects an article's first line, where accents are often used to show stressed syllables.
  • In some cases, a limited range of Cyrillic characters will render correctly, but less common characters will not, also depending on available fonts (e.g. the latest version of the Gentium multilingual font has Russian letters, but not letters for other languages. Most fonts lack many of the characters necessary for non-Slavic languages or early Cyrillic).

 Michael Z. 2006-10-12 07:38 Z

I guess that the main discussion will be in Manual of Style (text formatting), regarding all scripts other than Latin.
At the same time, it may be a good idea to add a simple "on Cyrillic characters only" guideline to this project page, in a last section titled "Style recommendations" or similar (providing there a link to Wikipedia:Manual of Style (text formatting)).
Best regards, Evv 04:09, 13 October 2006 (UTC)
Sounds reasonable. Couple more things:
  1. There was some relevant discussion and a vote at Template talk:Lang-ru#Italicizing Cyrillic text
  2. The style manual of the Slavic and East European Journal recommends not italicizing Cyrillic text, as the visual appearance of the alphabet already emphasizes or distinguishes it from surrounding text in a Roman font.
As far as I'm concerned, italicized Cyrillic text looks beautiful (at least with Mac OS X's antialiasing and a good font), but unfortunately browser/OS/font support is too poor to display it reliably. Michael Z. 2006-10-13 06:15 Z
Evv wrote: "Italics are not necessary, since the difference with "normal Latin text" is obvious."
Is it, really? Fact is that the Cyrillic alphabet has a number of letters common with the Roman alphabet i.e. graphically identical with Roman letters (including lower case ones), making it impossible in some cases to know if a certain word is from one alphabet or the other. If you encounter say 'Tacex' in your English text, you wouldn't know whether it's the Roman 'Tacex' or the Cyrillic 'Тасех' (that would read 'Taseh' if transliterated). Whether and how often such confusion may appear in Wiki texts I cannot say, what's certain is that without italicization one cannot certainly tell Cyrillic words in the text. If not italics, then quote signs or something else would be needed to guarantee differentiation, but the arguments against italicization given above seem less than convincing to me. Apcbg 19:46, 18 December 2006 (UTC)
If Tacex (or Cyrillic Тасех) was italicized, that would not help. It could well be an unfamiliar English word, a word from another Latin-alphabet language, or a transliterated word (e.g., Тацех, Tacex, in scientific transliteration). Italicizing does not resolve ambiguity on its own, because it is used for emphasis, to define new terms, for citing titles, for foreign terms in both the original alphabet and transliterated.
Recognizing the alphabet can occasionally be a problem in very short words or word fragments in isolation, but then it should be dealt with by careful formatting and thoughtful writing. But for most words, the alphabets are different enough in character that they are easy to spot. Michael Z. 2007-07-04 06:51 Z
The purpose of my example was to refute Evv's claim which it did. That italicizing is used for several purposes does not imply it should not be used for Cyrillic words; such reasoning might be employed to infer that italicizing should not be used at all. You start with 'emphasis' which however competes neither with the other uses you mention nor with italicizing Cyrillic words, but is part of them all. Apcbg 07:42, 4 July 2007 (UTC)
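
A minimal sketch of the look-alike problem raised in this thread, assuming Python and its standard unicodedata module (the helper name scripts_used is made up for illustration). It only shows that the Latin 'Tacex' and the Cyrillic 'Тасех' are different code-point sequences that software can tell apart even where a reader cannot; it does not settle the formatting question.

```python
import unicodedata


def scripts_used(word):
    """Return the set of script names (first word of each Unicode
    character name, e.g. 'LATIN' or 'CYRILLIC') for the letters in word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts


# The two words from the discussion, written with explicit escapes so
# that the Cyrillic one cannot be confused with the Latin one here.
LATIN_WORD = "Tacex"
CYRILLIC_WORD = "\u0422\u0430\u0441\u0435\u0445"  # Cyrillic Те, а, с, е, х

print(scripts_used(LATIN_WORD))
print(scripts_used(CYRILLIC_WORD))
print(LATIN_WORD == CYRILLIC_WORD)
```

A word that mixes the two alphabets would report both scripts, which is one way such cases could be flagged for the "careful formatting" mentioned above.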

[edit] Russian/Cyrillic toponyms

Here is my $0.02 on translation of Russian toponyms (i.e. place names). Right now it's a total mess, and I think Wikipedia needs a policy for consistent representation of geographic names in English. I propose something like:

1. If there's already an established name in English - use it. Good places to look up established names are English dictionaries (i.e. Webster's, Encyclopedia Britannica, etc.) and atlases (e.g. National Geographic Atlas of the World). One can also use Google test, but the name shouldn't be treated as established unless there's at least a few thousand hits. Examples of established names that don't follow regular transliteration rules: Moscow, Red Square.
2. If there's no established name, transliterate the name.
3. Do not translate words that belong to a proper name. This applies to common words like улица (street), набережная (embankment), мост (bridge), гора (mount/hill), верхний/нижний (upper/lower), etc. E.g. translate здание на улице Кузнецкий мост as a building on Kuznetskiy Most street (not Kuznetskiy Bridge street or Blacksmith Bridge street), but здание на Улице 1905 года as a building on Ulitsa 1905 Goda street, because in this case the word "улица" (ulitsa = street) is a part of the street name. Include literal translation when appropriate, e.g. Kuznetskiy Most (literally, a blacksmith bridge) - a street in Moscow.

This is a rough draft and needs much more work - or at least many more examples - to clearly communicate the idea. Azov 02:51, 14 October 2006 (UTC)
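
The order of the three points above can be sketched in Python as follows. Everything here is an assumption made for illustration: the ESTABLISHED dictionary stands in for a lookup in dictionaries and atlases, PARTIAL_TABLE covers only the letters needed for the example (it is not a complete romanization table), and the function names are hypothetical.

```python
# Hypothetical list of names already established in English (point 1).
ESTABLISHED = {
    "Москва": "Moscow",
    "Красная площадь": "Red Square",
}

# Deliberately partial letter table, enough for the example only (point 2).
PARTIAL_TABLE = {
    "к": "k", "у": "u", "з": "z", "н": "n", "е": "e", "ц": "ts",
    "и": "i", "й": "y", "м": "m", "о": "o", "с": "s", "т": "t",
}


def transliterate(name):
    """Letter-by-letter transliteration; words are never translated (point 3)."""
    out = []
    for ch in name:
        latin = PARTIAL_TABLE.get(ch.lower())
        if latin is None:
            if ch.isalpha():
                raise ValueError(f"letter {ch!r} is not in the partial table")
            out.append(ch)  # spaces, digits and punctuation pass through
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


def english_name(name):
    """Use the established English name if there is one, else transliterate."""
    return ESTABLISHED.get(name) or transliterate(name)


print(english_name("Москва"))
print(english_name("Кузнецкий мост"))
```

The second call keeps "most" as a transliterated part of the name rather than turning it into "bridge"; capitalising it as "Most" in running English text is left to the editor.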

Why Kuznetskij rather than Kuznetskiy? Spartaz 05:53, 28 October 2006 (UTC)
Well, there are many transliteration systems, but most transliterate й as j. Azov 08:26, 28 October 2006 (UTC)
Most? The suggested transliteration table for en here suggests 'iy' or 'y'. Spartaz 08:57, 28 October 2006 (UTC)
Allright, I don't really have a strong opinion on which particular table to use. The one you linked to sounds fine, so I changed it to "Kuznetskiy" in my example. I also linked the romanization article in the main namespace to the Wiki guideline you quoted. Azov 10:29, 28 October 2006 (UTC)

<deindent>Thanks. Your proposal makes sense. Spartaz 15:53, 28 October 2006 (UTC)

It's not really within the scope of these naming conventions, that are IMO about the preferred way to render names, that are written in the Cyrillic alphabet, in the Latin alphabet. Still, this is an interesting point. Proposals 1 and 2 are completely OK with me. I have some trouble with proposal 3. Especially "Ulitsa 1905 Goda street" looks very silly to me ("street of the year 1905 street"), that's like "Lake Chiemsee" or "Rio Negro River". Some street names in other countries are not translated at all ("Friedrichstraße", "Paseo del Prado", "Prinsengracht", "Avenue d'Iéna", "Rådhuspladsen"), for some only the generic part is translated ("Omonoia Square", "Taksim Square", "Andrássy Avenue", "Andriyivskyy Descent"). Category:Streets and squares in Moscow suggests that the latter method is used for Russian streets, which I think is OK since it's not so likely that the average en.wikipedia user speaks Russian. Markussep 23:48, 28 October 2006 (UTC)
Well, parts 1 & 2 are sort of common sense; it's the third part that I am actually trying to bring up. Yes, this translation may seem redundant, but it's useful and often necessary redundancy. Take your example with Rio Negro River. I don't speak Spanish, so if you just said "Rio Negro", I wouldn't know you're talking about a river. On the other hand, if you just said "Black River", I wouldn't know which river you're talking about (as I won't be able to translate the name back to Spanish - or, perhaps, Portuguese. See, I won't even know which language to translate it to!). So, for me Rio Negro river is actually the best variant. Of course, if you use the name somewhere where it's clear from the context that you're talking about a river - the word "river" can and should be left out. However, translating the generic part of a common name is practically always a bad practice. Again, take, say, Bolshoi Kislovsky Drive from your example. I'm fluent in both Russian & English, but I have no idea what this name refers to. Is it Большой Кисловский проезд? Or maybe Большая Кисловская улица? Or, perhaps, Большой Кисловский переулок?.. And, by the way, why did they translate 'переулок', but not 'Большой'?.. There's no way to tell. Or take "Yauza Boulevard". What does this refer to? Бульвар Яузы? Яузинский бульвар? Яузовский? Яузский?.. This translation is useless unless you already know the original name, and this is what we want to avoid.
A tricky part is figuring out whether a common word is a part of the proper name. Sometimes it's obvious, e.g. the word "Most" in Kuznetskiy Most is clearly part of the name (as it is, after all, a street, not a bridge), but one can argue that in "Bagrationovskiy Most" the name is just "Bagrationovskiy", so Bagrationovskiy bridge (note capitalization!) would be a better translation.
As to the argument about average Wikipedia reader not speaking Russian - well, that's exactly why proper names should not be translated. The main purpose of the name is to identify the subject. I.e. let the reader find it on a map, recognize the name mentioned in some other context, etc. Classifying the subject - i.e. explaining that the name refers to a street, a bridge, etc. - is secondary to that. Azov 10:23, 29 October 2006 (UTC)
Russian has the additional feature (as do many other languages, but not English) that it modifies the object the street, lake etc. is named after. For instance Nevsky Prospekt - Neva Avenue, Ploshchad Lenina - Lenin Square. I don't know what's best, I'm inclined towards translating as little as possible. The translations can always be mentioned in the articles. Markussep 17:31, 29 October 2006 (UTC)

[edit] Mongolian again

The current discussion on Talk:Aymguud_of_Mongolia#Transliteration motivated me to reanimate my old proposal further up on this page. Comparing the current usage trends in WP, existing maps from Mongolia, and other sources, as well as leaning on the naming conventions for Russian to some degree, the following seems to make the most sense to me:

Cyrillic Latin Latin, alt.
А,а A,a
Б,б B,b
В,в V,v
Г,г G,g
Д,д D,d
Е,е Ye,ye
Ё,ё Yo,yo
Ж,ж J,j
З,з Z,z
И,и I,i
Й,й Y,y I,i
К,к K,k
Л,л L,l
М,м M,m
Н,н N,n
О,о O,o
Ө,ө Ö,ö
П,п P,p
Р,р R,r
С,с S,s
Т,т T,t
У,у U,u
Ү,ү Ü,ü
Ф,ф F,f
Х,х H,h Kh,kh
Ц,ц Ts,ts
Ч,ч Ch,ch
Ш,ш Sh,sh
Щ,щ Shch,shch
Ъ,ъ (omitted)
Ы,ы Y,y
Ь,ь (omitted) Y,y
Э,э E,e
Ю,ю Yu,yu
Я,я Ya,ya

There are three open questions:

  • Й,й is written I,i on some Mongolian maps, but Y,y almost anywhere else (including usually on WP).
  • Х,х is written either H,h or Kh,kh randomly on WP, apparently with a slight trend towards the former. Phonetically, sometimes one spelling makes more sense, sometimes the other. We'll probably just have to take a random choice here.
  • Ь,ь is sometimes written Y,y, sometimes omitted (not sure about the criteria, or where it is used in Mongolian anyway)

A few characters are only needed for Russian loan words, so they appear as in Wikipedia:Romanization of Russian.

Unless someone is going to present an entirely different concept, we only need to establish a consensus on those three questions, until we can define a final naming convention for Mongolian. What do the language experts think about it? --Latebird 12:13, 7 February 2007 (UTC)

ь is sometimes represented with a ', too (as in " Dundgov' "). IMO that is still better than representing it with a vowel. Й -> I makes sense as it makes ий compatible with the other double vowels (aa, oo, ee etc). What transcription do Mongolian passports use? Yaan, 217.188.99.115 10:56, 8 February 2007 (UTC)

I'm not sure if an apostrophe would be helpful for the average reader. One argument to omit it (and to use "kh" and "y" in the other two cases) would be that they are then handled the same way as for Russian, reducing the potential for confusion. The double ii is an interesting point. On the other hand, it makes it impossible to differentiate between the two characters when they appear alone. --Latebird 16:55, 8 February 2007 (UTC)

In the early 1980s different transliteration rules were officially in use: Ш - Š; Ч - Č, Й - J, Ы - Y, Ц - C, Ь - '. I think it was only a Russian-style transliteration. Ь is transliterated "i" and "ĭ" too (in the WP aymguud table). Naturally, it is very possible that the official passport transliteration would be a reasonable choice. But in Mongolia unofficial transliterations are also used on computers and in SMS: Х - X. —Preceding unsigned comment added by Bogomolov.PL (talk • contribs) --Latebird 09:12, 12 February 2007 (UTC)

Official where and defined by whom? The examples you give look like a scientific transliteration. We need something that makes sense for English-language readers. Are the Mongolian passports actually transliterated consistently?

Out of practical considerations, I must say that I'd prefer the following:

Cyrillic Latin
Й,й Y,y
Х,х Kh,kh
Ь,ь (omitted)

This is simple to use, easy to remember because it is similar to what we use for Russian, makes sense phonetically in most cases, and probably requires the least amount of changes to existing articles. --Latebird 09:12, 12 February 2007 (UTC)

Maybe Russians omit ь? Gov instead of govi or gov'? OK, I remember the proposal to use well-known names in their traditional Latin forms (usually coming via Russian). But then this transliteration project needs a list of traditional names. And I am sure of this: if a Mongolian word can be transliterated and then transliterated back into Mongolian Cyrillic, it is a good transliteration system. The WP article about transliteration had a good example: the traditional transliteration of Japanese words. In Japanese the sounds are not SHI (ŠI) or JI (ŽI) but SьI and ZьI (using the Cyrillic soft sign), but the aim was to create a direct correspondence between the Japanese syllabary and its Latin form. Is the soft sign used in native Mongolian words? Хонь - sheep, Морьт - horse, for example. These words don't look like loans... Bogomolov.PL 15:03, 12 February 2007 (UTC)

In the WP Transliteration system for Russian, the soft sign is omitted in most situations. For Mongolian, it can be omitted in all cases, because it never specifies a phoneme of its own. The sheep can be pronounced correctly as Khon, and the horse (Морь) as Mor.

Говь is a special case, because there is an established English (and German) translation: "Gobi" (compare Gobi Desert). We can skip the transliteration in such cases, which will result in Dornogobi, Gobi-Altay, etc. The other example I'm currently aware of is Ulan Bator instead of Ulaanbaatar.

Back-transliteration is explicitly NOT required for WP. We need a system that makes the most sense phonetically (transcription), and is easy to use and understand. --Latebird 15:51, 12 February 2007 (UTC)
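
The three rules proposed above (Й as y, Х as kh, Ь omitted) can be sketched as a simple lookup table. This is only an illustration, not the eventual guideline: the name romanize_sketch is hypothetical, and the entries for о, н, м and р are assumptions read off the two examples "Khon" and "Mor" given above. Any letter not in the table is passed through unchanged.

```python
# Hypothetical sketch of the three rules proposed in this thread.
# Only the letters needed for the two example words are included;
# the o, n, m, r entries are assumptions read off "Khon" and "Mor".
RULES = {
    "Й": "Y", "й": "y",      # proposed: Й as y
    "Х": "Kh", "х": "kh",    # proposed: Х as kh
    "Ь": "", "ь": "",        # proposed: soft sign omitted
    "М": "M", "м": "m",
    "Н": "N", "н": "n",
    "О": "O", "о": "o",
    "Р": "R", "р": "r",
}


def romanize_sketch(word: str) -> str:
    """Map each character through RULES; unknown characters pass through."""
    return "".join(RULES.get(ch, ch) for ch in word)


print(romanize_sketch("Хонь"))  # Khon
print(romanize_sketch("Морь"))  # Mor
```

Because the soft sign maps to an empty string, the result cannot be converted back to the Cyrillic spelling, which matches the point that back-transliteration is not required here.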

But why do Americans use i instead of ь on their military maps (http://geoengine.nga.mil/geospatial/SW_TOOLS/NIMAMUSE/webinter/muse_webinter_output/roamoutput1171299403_11371.png)? A Russian-style transliteration exists in WP at http://en.wikipedia.org/wiki/Mongolian_language , but the idea of http://en.wikipedia.org/wiki/Wikipedia:Romanization_of_Russian is to use ONLY the ENGLISH Latin alphabet (no umlauts etc.). Bogomolov.PL 17:31, 12 February 2007 (UTC)

You'll have to ask the military types about their reasons... We can't *exactly* follow the system for Russian, only just as closely as reasonable. Russian doesn't have the extra vowels that Mongolian has. And those are most conveniently, and closest to correct pronunciation, represented by ü and ö. --Latebird 01:39, 13 February 2007 (UTC)

If we claim it has to be as clear as possible for the ENGLISH reader, we need to avoid umlauts: umlauts are absent from the English alphabet, and the common English reader would need to be told how Ü and Ö are pronounced. For me, for example, the difference between Mongolian "У" and "Ү" is too subtle; I don't hear the difference. Maybe the Russians are right to transliterate "Ү" as a simple "У", but that would be too close to transcription and too far from transliteration. Russians also ignore Ö; they replace it with У and Э (Мурэн). And they ignore double vowels: for a common person (like me) a double vowel is only emphasis, ONE long vowel, not TWO. But that would again be too close to transcription and too far from transliteration, wouldn't it?
About omitting the soft sign in Russian transliteration: it is omitted only in two situations, at the end of a word and between consonants (but my Mongolian examples are exactly of the first and second kind, as it happens). We are not discussing this in empty space but in a real situation: there is Google Earth/Google Maps, the most common map source (WP supports Google, you know). These maps are based on American military maps (1:1M), which use their own transliteration system. There the soft sign is rendered as "i", Х is H (not KH), etc. Do you remember this system? If we change the transliteration system, it will not change on maps (those not made by us), in atlases, etc. Maybe we need to realize this fully: will we try to change the world? Bogomolov.PL 09:16, 13 February 2007 (UTC)
The military maps don't seem to use any really consistent transcription (or transliteration) system, or at least not one based on Mongolian written in Cyrillic letters. Or they use more than one. Otherwise they wouldn't mark Erhel nuur as Erhili nuur or Tsagaan burgasnii (or -niy or -yn) hüree as Tsagaan burgasanii huryee (both between Mörön and Hatgal). Google Earth isn't terribly authoritative when it comes to geographical names in Mongolia either.
Since this isn't only about geographic names (where probably no really definite system exists) but also about personal names, I'm all for going with the transcription the passports use. Yaan 13:32, 13 February 2007 (UTC)
Otherwise, it's of course always possible to create redirects from other spelling variants. Yaan 13:36, 13 February 2007 (UTC)

Proposal: Wikipedia:Romanization of Mongolian

I've created Wikipedia:Romanization_of_Mongolian to give the discussion a more solid basis, and to better present my proposal in context. Maybe it will be easier to discuss individual details on the talk page there. I'm also trying to attract broader attention to the topic somehow. --Latebird 17:28, 13 February 2007 (UTC)

It seems that the participants at Wikipedia talk:Romanization of Mongolian have reached a reasonable consensus and I've adapted the proposal to all the good suggestions made. Now are there any formalities necessary to promote this to an official policy? Or can I just declare it as such? --Latebird 13:44, 8 March 2007 (UTC)
I've been bold and declared Wikipedia:Naming conventions (Mongolian) an official guideline (after renaming to the more conventional title). I hope the lack of objections so far reflects actual agreement and not only the obscure nature of the topic... --Latebird 09:13, 16 March 2007 (UTC)

Accents

What about accents?
As in "Колмогóров" goes to "Kolmogorov" versus "Kolmogórov", say??
(Posted also at Talk:Romanization of Russian.) —DIV (128.250.204.118 08:48, 15 November 2007 (UTC))

As far as I know those accents are used only as a pronunciation guide. You will never find them in a normal Russian text (except in dictionaries), and that's probably the same for other languages with the Cyrillic alphabet. I think it's helpful to put an accent on the first appearance of the Cyrillic form (like in the Andrey Kolmogorov article). They should definitely not be used in article titles. Markussep Talk 09:32, 15 November 2007 (UTC)
I didn't realize this question was cross-posted in two different places, so for the sake of cross-referencing the two threads, here is the link to my answer to this question.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 15:06, 15 November 2007 (UTC)
Thanks Markussep, this makes sense I think. :-)
—DIV (128.250.204.118 00:02, 16 November 2007 (UTC))
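
For anyone who needs to get from an accented lead-sentence form to the plain form suitable for a title, here is a minimal sketch. It assumes the stress mark is encoded as the combining acute accent (U+0301) after a Cyrillic vowel. The function name strip_stress_marks is hypothetical, and other ways of marking stress are not handled.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # assumption: stress is marked with this character


def strip_stress_marks(text: str) -> str:
    """Remove combining acute accents and leave all other marks in place."""
    # Decompose so that any precomposed accented letters expose the accent.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = decomposed.replace(COMBINING_ACUTE, "")
    # Recompose so that letters such as й and ё return to their usual form.
    return unicodedata.normalize("NFC", stripped)


print(strip_stress_marks("Колмого\u0301ров"))  # Колмогоров
```

Only U+0301 is removed, so the breve of й and the diaeresis of ё survive the round trip through decomposition.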

Romanization of Russian proposal announcement

This is a formal announcement to inform the community that a proposal to redefine the criteria of conventionality in the Russian language romanization guideline has been submitted and to solicit the community to review the said proposal and vote on it. The proposal is available at Wikipedia talk:Romanization of Russian#Proposal to re-define the criteria of conventionality. Thank you for your attention.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 17:35, 21 November 2007 (UTC)

The proposal has passed.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 16:04, 12 March 2008 (UTC)