Talk:Paul Erdős/Archive of title discussions

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Policy on Extended-ASCII Titles

What's our policy about extended-ASCII titles? Should the default Paul Erdos article be at Paul_Erd%F6s or Paul_Erdos? --The Cunctator

I don't think we have a policy yet, since non-ASCII characters are new in Wikipedia, but personally I prefer the correct title (i.e. with ö), because that way the headline of the article looks right. We certainly need a redirect from Paul Erdos though, so that the page can be found by people without fancy keyboards. AxelBoldt

There's LDC's Proposed Wikipedia policy on foreign characters over on the meta wiki, but there's no consensus. Summary as far as titles: LDC wants all non-ASCII characters banned from titles -- drop most diacritics, convert umlauts to "e"s, ß to "ss". In the talk page, I disagree and say that titles with non-ASCII characters ought to be preferred where appropriate, with plain-ASCII versions as redirects. There's also some talk of a #TITLE code whereby the title shown at the top of the screen could be set to something different from the article title. Brion VIBBER, Monday, April 1, 2002

I'm in favour of full diacritics wherever possible, with search engines being left to interfile these things as required. Even so, I think it will still be a while before Unicode is generally understood by all systems. A practical goal in the shorter term would be to at least implement the characters in ISO 8859-1, and give people a chance to catch up to that. Personally I find "Erdös" easiest to write by using Alt+0246 instead of the ampersand format; the F6 format doesn't reproduce at all, and I understand that the people using German ASCII would do something else again. Eclecticology

We use ISO-8859-1 right now; the %F6 is merely a URL-encoding of the character "ö" in ISO-8859-1, and isn't something users should ever have to manually type. As far as Unicode, the current plan (see Wikitech-L archives) is to switch everything to UTF-8 internally, with downconversion to a legacy character set (for English, ISO-8859-1) for browsers that don't report themselves as grokking UTF-8. This should allow older browsers and search engines (such as Google) that have missing or buggy UTF-8 support to keep working fine. Brion VIBBER
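
For illustration, here is a minimal Python sketch of where a sequence like %F6 comes from. It is not the wiki software's code; the variable names are made up for the example, and it only assumes the standard library's `urllib.parse`. The same title is percent-encoded once as ISO-8859-1 and once as UTF-8:

```python
from urllib.parse import quote, unquote

title = "Paul_Erd\u00f6s"  # "Paul_Erdös", written with an escape for clarity

# ISO-8859-1 stores "ö" as the single byte 0xF6.
latin1_url = quote(title, encoding="iso-8859-1")

# UTF-8 stores the same character as the two bytes 0xC3 0xB6.
utf8_url = quote(title, encoding="utf-8")

print(latin1_url)  # Paul_Erd%F6s
print(utf8_url)    # Paul_Erd%C3%B6s

# Decoding only round-trips when it uses the character set used for encoding.
assert unquote(latin1_url, encoding="iso-8859-1") == title
assert unquote(utf8_url, encoding="utf-8") == title
```

The two URLs differ even though the title is the same, which is why a switch to UTF-8 needs some form of downconversion or redirect for clients that still send ISO-8859-1.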

I agree, but I would go further: search engines should

  • match exact matches first, and then
  • fall back to ignoring diacritics (as if ASCII-only), and then
  • fall back to Metaphone or other similar scheme (just replacing all vowel sequences by '*' normalises to a sufficient extent for many purposes)

Just looking at the failed searches shows that about half of them would succeed given some very simple normalisation. Many of the others would work if a combination of guess-the-spaces and stemming was used. Wikipedia is small compared to the Web, and so techniques like this will improve recall without deluging the reader in dross, providing that exact matches and article-title matches take priority. The Anome
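
The three-step fallback described above can be sketched roughly as follows. This is only an illustration in Python, not an existing search implementation: the function names are hypothetical, the diacritic stripping relies on Unicode NFKD decomposition, and the last tier uses the simple "replace every vowel run by '*'" idea rather than real Metaphone.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Lowercase the text and drop combining marks ("Erdős" -> "erdos")."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def vowel_key(text):
    """Crude phonetic key: every run of vowels collapses to '*'."""
    return re.sub(r"[aeiouy]+", "*", strip_diacritics(text))


def search(query, titles):
    """Return the matches from the first tier that produces any result."""
    tiers = (lambda s: s, strip_diacritics, vowel_key)
    for key in tiers:
        hits = [t for t in titles if key(t) == key(query)]
        if hits:
            return hits
    return []


titles = ["Paul Erd\u0151s", "Paul Erd\u00f6s", "Metaphone"]
print(search("Paul Erd\u0151s", titles))  # tier 1: exact match only
print(search("Paul Erdos", titles))       # tier 2: diacritics ignored
print(search("Paul Erdoes", titles))      # tier 3: vowel runs collapsed
```

Because the tiers are tried in order and the first non-empty result wins, an exact match is never diluted by the looser matches from later tiers.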


Incorrect Title

Why do we have his name in the title written (incorrectly) as Erdös? It should be Erdős (as it is written correctly on the line just below). Anyone care to make corrections everywhere (including on pages linking here)?

That's Unicode point "LATIN SMALL LETTER O WITH DOUBLE ACUTE" Erdős, by the way.

Testing: Erdős

note -- Unicode code point 337 (hex #151)
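
The code point can be checked with a short Python snippet using only the standard `unicodedata` module (the variable names are just for this example). It also shows why the character cannot appear in an ISO-8859-1 title, while "ö" can:

```python
import unicodedata

o_double_acute = "\u0151"  # ő, the letter in the correct spelling
o_diaeresis = "\u00f6"     # ö, the letter in the current title

print(ord(o_double_acute), hex(ord(o_double_acute)))  # 337 0x151
print(unicodedata.name(o_double_acute))  # LATIN SMALL LETTER O WITH DOUBLE ACUTE
print(ord(o_diaeresis), hex(ord(o_diaeresis)))        # 246 0xf6

# "ö" fits in ISO-8859-1 as a single byte ...
latin1_bytes = o_diaeresis.encode("iso-8859-1")

# ... but "ő" has no ISO-8859-1 byte at all, so encoding fails.
try:
    o_double_acute.encode("iso-8859-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(latin1_bytes, encodable)
```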

Title Still Incorrect

This is a featured article but its title is still incorrect. In Hungarian the correct form is „Pál Erdős”, or maybe in Hunglish the „Paul Erdős” form is admissible. „Paul Erdös” is an absolutely mixed mutant chimaera form. The English version „Paul Erdoes” or „Paul Erdős” is less shocking, but the immaculate form is „Pál Erdős”, with the character ő instead of ö. Gubbubu 13:03, 5 Aug 2004 (UTC)

  • There is a reason for that. Having characters like ő in titles causes various software problems for people who do not have the appropriate character on their computer. (There have been cases of article histories not showing correctly, etc.)
This was already mentioned once higher up on this talk page, and the situation has not changed. The software still has difficulties with handling "ő" in article titles. Andris 14:29, Aug 5, 2004 (UTC)
Currently only ISO-8859-1 is supported. This has nothing to do with difficulties, just unwillingness, or a policy not to support Unicode in article names. Since the correct title is not possible to render in ISO-8859-1, I suggest Paul Erdos as the title, which is easier to type.
Dbenbenn, thanks for putting up the title limitation template. It looks like it is the best solution for the time being, given the above arguments. BACbKA 17:13, 1 Dec 2004 (UTC)
Sure thing. I've also been going through the what links here and what links to Erdös number lists and changing Erdos and Erdös to Erdős. There are plenty of other occurrences of his name that need to be fixed. --Dbenbenn 00:51, 2 Dec 2004 (UTC)

---

To RTC: your attempt to rename this page renamed it to "Paul ErdÅ?s". Please don't do that. It will have to remain Erdös for now, until the software can cope with "ő" in article titles. -- Anon.
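
One plausible explanation for the garbled result, shown here as a Python sketch rather than a diagnosis of the actual software, is that the UTF-8 bytes of "ő" were read back as ISO-8859-1. That is an assumption; the variable names below are hypothetical.

```python
name = "Paul Erd\u0151s"  # "Paul Erdős"

# In UTF-8, "ő" becomes the two bytes 0xC5 0x91.
raw = name.encode("utf-8")

# Read as ISO-8859-1, 0xC5 is "Å" and 0x91 is an unprintable control code.
garbled = raw.decode("iso-8859-1")

# Many displays substitute "?" for characters they cannot print.
shown = "".join(ch if ch.isprintable() else "?" for ch in garbled)

print(raw)    # b'Paul Erd\xc5\x91s'
print(shown)  # Paul ErdÅ?s
```

Under that assumption the output matches the title reported above, which is consistent with the software treating titles as ISO-8859-1 byte strings.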