Wikipedia talk:Milestone statistics

See also the following archives:

  • /Archive - archive of discussion from before January 2006
  • /Watched - archive of "watched" languages from the past several months


Date of last complete update

Please indicate here when you have checked the Wikipedias in all languages for their article counts:
  • The table seems to be complete and up to date as of the time I'm posting this. - dcljr (talk) 20:49, 6 February 2007 (UTC)
  • Table is again complete and up to date. - dcljr (talk) 03:34, 25 March 2007 (UTC)

Note that the article counts listed at the link above might not be completely up to date since it is "manually" updated by users by pasting automatically generated wikitext into the page. When an active wiki is very near a milestone, you might want to check the wiki's Special:Statistics page to get the "live" count. See also Wikipedia:Multilingual monthly statistics for article counts on the first of each month, and m:Wikimedia News for article-milestone announcements and a similar table. - dcljr (talk) 06:36, 7 May 2006 (UTC)


See below for languages in need of watching this month.


Past discussion follows...


Pennsylvania Dutch Wikipedia

I listed the 1000th article for the Pennsylvania Dutch Wikipedia even though it is technically not at pdc.wikipedia.org yet. It has been accepted for a move to wikipedia.org and has just been waiting for a developer since October. Stettlerj 18:48, 17 February 2006 (UTC)

Never mind, someone removed the mention of the 1000th article in the Pennsylvania Dutch wiki, so for the record it occurred on January 6, 2006 :). Stettlerj 02:54, 2 March 2006 (UTC)

English

Did the English Wikipedia start with 1 million articles, or is there another reason why the English one is listed first? --Edroeh 13:13, 17 April 2006 (UTC)

The languages are listed by the last article-count milestone reached, in decreasing numerical order (then by date within each section). As the largest Wikipedia, the English one is listed first in the table for that reason. It didn't start at 1 million articles, but it's the only one so far that has reached that milestone. - dcljr (talk) 00:36, 7 May 2006 (UTC)
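
The ordering described above can be sketched in a few lines of Python. This is only an illustration, not how the table is actually maintained: the function names are made up, the milestone ladder is inferred from the milestones mentioned on this page, and breaking ties by earliest date first is an assumption, since the comment only says "by date within each section".

```python
from datetime import date

# Milestone ladder inferred from this page: 1, 2, 5 times each power of ten,
# from 1,000 up to 1,000,000 (an assumption about the full set).
MILESTONES = [1_000, 2_000, 5_000, 10_000, 20_000, 50_000,
              100_000, 200_000, 500_000, 1_000_000]


def last_milestone(article_count):
    """Return the largest milestone not exceeding article_count, or None below 1,000."""
    reached = [m for m in MILESTONES if m <= article_count]
    return max(reached) if reached else None


def table_order(entries):
    """Sort (language, milestone, date_reached) tuples.

    Milestone descending; ties broken by date ascending (assumed direction).
    """
    return sorted(entries, key=lambda e: (-e[1], e[2]))


# Milestones and dates as reported in the watch lists further down this page.
entries = [
    ("Czech", 50_000, date(2006, 11, 18)),
    ("German", 500_000, date(2006, 11, 23)),
    ("Hebrew", 50_000, date(2006, 12, 24)),
    ("Chinese", 100_000, date(2006, 11, 12)),
]

for language, milestone, reached_on in table_order(entries):
    print(f"{milestone:>9,}  {reached_on}  {language}")
```

A wiki with a hypothetical count of 49,999 articles would still sit in the 20,000 section, because `last_milestone` only looks at milestones already passed.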

Unfortunately...

Such milestones don't tell us much about the Wikipedia.

The Javanese Wikipedia, for example, is full of articles entirely in Indonesian with a template at the top that says "This article needs translation" (to their credit, there are a few articles entirely in Javanese, and perhaps a couple hundred articles with at least a couple of paragraphs in Javanese).

The Northern Sami Wikipedia has only stub articles, with a handful of possible exceptions (see se:Special:Longpages). The same is true for the Frappucino® Wikipedia (really Francoprovençal); see frp:Special:Longpages.

Others, such as the Uzbek Wikipedia, are filled to the brim with apparent copyvios (one wonders if the admins there understand the GFDL policies of WP); see uz:Special:Longpages.

On the other hand, we have Wikipedias such as the Taiwanese Wikipedia, which has a relatively small number of articles (not 2000 yet), but most of them are of superior quality, for example zh-min-nan:Chhùi-khí ê hoat-io̍k, an article about tooth development. The Faroese Wikipedia is all right: although most of its articles are a little stubby, they're not as bad as those of the Northern Sami or Francoprovençal Wikipedias, and they aren't all copyvios (see fo:Special:Random).

I think that we should have some way to come up with a general "rating" of Wikipedias, based not only on their number of articles, but also on their average number of bytes per article. This would obviously have to account for different byte lengths for translations of the same text between languages; perhaps the adjustment could be based on the text of the UDHR or, if that's not available, the Our Father.

An obvious statistic would be Bytes of text in a Wikipedia, but then, would you not agree that a Wikipedia with 50000 bytes divided over 10 articles is better than one with 50000 bytes divided over 100, given that 500 byte articles are going to be next to useless, no matter how many of them you have? If a Wikipedia has a stub for every country in the world saying "(country) is a country in (continent)", is that as valuable as a Wikipedia that has well-developed articles on just 10 countries, but is entirely missing articles on the others? --Node 02:35, 18 June 2006 (UTC)
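
To make the proposal a bit more concrete, here is a rough Python sketch of a bytes-per-article figure adjusted by the length of a shared reference text. The function name, its parameters, and the reference sizes in the usage lines are all hypothetical; the only numbers taken from the comment above are the 50000 bytes spread over 10 or 100 articles. Scaling by the ratio of reference-text sizes is one possible reading of the idea, not an agreed method.

```python
def normalized_bytes_per_article(total_bytes, article_count,
                                 reference_bytes, baseline_reference_bytes):
    """Average article size, rescaled to a baseline language.

    reference_bytes: size of a shared reference text (e.g. the UDHR) in this
        wiki's language.
    baseline_reference_bytes: size of the same text in the baseline language.
    """
    if article_count <= 0 or reference_bytes <= 0:
        raise ValueError("article_count and reference_bytes must be positive")
    raw_average = total_bytes / article_count
    # A language that needs more bytes for the same text is scaled down.
    scale = baseline_reference_bytes / reference_bytes
    return raw_average * scale


# The two wikis from the comment above, assuming equal reference sizes.
few_long = normalized_bytes_per_article(50000, 10, 1000, 1000)
many_short = normalized_bytes_per_article(50000, 100, 1000, 1000)
print(few_long, many_short)

# Hypothetical language whose reference text takes twice as many bytes.
verbose = normalized_bytes_per_article(50000, 10, 2000, 1000)
print(verbose)
```

Both wikis hold the same total text, but the per-article figure separates them by a factor of ten, which is the distinction the raw byte total hides.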

Note - the Taiwanese tooth development article you cite appears to be a verbatim translation of the en featured article on the same topic. Raul654 09:11, 18 June 2006 (UTC)
Doesn't mean it isn't a quality article. Are verbatim translations illegal? If you checked all the good, long articles in smaller wikis, you'd find a good portion of them are translated. What's wrong with that? --Node 05:24, 19 June 2006 (UTC)
Nothing, actually. I think the people who do translations do good work. I mentioned the fact because I thought it relevant to the discussion. Raul654 05:48, 19 June 2006 (UTC)

...ps, I created this page to rank Wikipedias using the "alternate" article count. It seems to do a great job, and it's interesting to compare. Not many differences at the very top, but as you go down further in categories, some Wikipedias get pushed down one or two, and some disappear off the list completely. --Node 05:24, 19 June 2006 (UTC)

Newar date discrepancy

This table lists the Newar / Nepal Bhasa Wikipedia as having reached 1,000 articles on October 21st instead of the 11th. As new:Special:Newpages currently stands, it looks like it did actually happen on the 21st. But m:Wikimedia News lists the date as October 11th. Someone once explained to me how the "Newpages" output can't be trusted since it lists all new pages in the main namespace, not just those that would count toward the "official" article count; OTOH, it doesn't list any "new" articles that have since been deleted. So, I guess I'm asking, should we keep the October 21st date or go with the one listed at Meta? (See also m:Talk:Wikimedia News#Clarification about Nepal Bhasa Wikipedia.) - dcljr (talk) 00:38, 29 October 2006 (UTC)

Languages watched in November 2006

Based on trends in article counts seen at m:List of Wikipedias, the following languages are the ones most likely to need promoting in the table this month (November 2006):

  • German (de:) will reach 500,000. Happened Nov 23rd. - dcljr
  • Chinese (zh:) will reach 100,000. Happened Nov 12th. - dcljr
  • Czech (cs:) will reach 50,000. Happened Nov 18th. - dcljr
  • Arabic (ar:) will reach 20,000. Happened Nov 10th. - dcljr
  • Albanian (sq:) will reach 10,000. Happened Nov 23rd. - dcljr
  • Norman (nrm:) will reach 2,000. Happened Nov 14th. - dcljr
  • Piedmontese (pms:) will reach 2,000. Happened Nov 7th or 8th. - dcljr
  • Tajik (tg:) will probably reach 5,000

The following are less likely, but possible promotions:

  • Portuguese (pt:) might reach 200,000. Happened Nov 29th. - dcljr
  • Galician (gl:) might reach 20,000. Happened Dec 6th. - dcljr
  • Aragonese (an:) might reach 5,000
  • Ukrainian (uk:) might possibly reach 50,000
  • Haitian (ht:) might possibly reach 10,000
  • Siberian/North Russian (ru-sib:) might possibly reach 10,000
  • Sundanese (su:) might possibly reach 10,000
  • Kannada (kn:) might possibly reach 5,000
  • Scottish Gaelic (gd:) might possibly reach 5,000
  • Waray-Waray (war:) might possibly reach 2,000

And these are the most likely newcomers to the table:

  • Volapük (vo:) will probably reach 1,000
  • Amharic (am:) might possibly reach 1,000
  • Maltese (mt:) might possibly reach 1,000

Within each list above, the entries are given in decreasing order of certainty, then in decreasing numerical order by milestone, and finally in alphabetical order. Note that my prediction method is biased towards saying a wiki with a recent growth spurt will continue such activity. - dcljr (talk) 00:25, 5 November 2006 (UTC)
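
The three-level ordering just described can be expressed as a single sort key. The Python sketch below is illustrative only: the names `CERTAINTY_RANK` and `watch_order` are invented, the ranking of the certainty phrases is inferred from their wording, and the final tie-break uses the language name, which is an assumption that happens to match the November lists. It does not reproduce the prediction method itself, which is not described here.

```python
# Lower rank means more certain; ranks inferred from the phrases used above.
CERTAINTY_RANK = {
    "will": 0,
    "will probably": 1,
    "might": 2,
    "might possibly": 3,
}


def watch_order(entries):
    """Sort (language, certainty, milestone) tuples.

    Certainty decreasing, then milestone decreasing, then language name A-Z.
    """
    return sorted(
        entries,
        key=lambda e: (CERTAINTY_RANK[e[1]], -e[2], e[0]),
    )


# A shuffled subset of the "less likely" list for November 2006.
less_likely = [
    ("Sundanese", "might possibly", 10_000),
    ("Galician", "might", 20_000),
    ("Kannada", "might possibly", 5_000),
    ("Portuguese", "might", 200_000),
    ("Haitian", "might possibly", 10_000),
    ("Ukrainian", "might possibly", 50_000),
]

for language, certainty, milestone in watch_order(less_likely):
    print(f"{language} {certainty} reach {milestone:,}")
```

Sorting this subset restores the order in which those languages appear in the list above.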

Languages watched in December 2006

Based on trends in article counts seen at m:List of Wikipedias, the following languages are the ones most likely to need promoting in the table this month (December 2006):

  • Swedish (sv:) will reach 200,000. Happened Dec 23rd. - dcljr
  • Hebrew (he:) will reach 50,000. Happened Dec 24th. - dcljr
  • Bishnupriya Manipuri (bpy:) surely will reach 10,000. Happened Dec 24th. - dcljr
  • Malayalam (ml:) probably will reach 2,000

The following are less likely, but possible promotions:

  • Catalan (ca:) might reach 50,000
  • Indonesian (id:) might reach 50,000
  • Aragonese (an:) might reach 5,000. Happened Dec 29th. - dcljr
  • Corsican (co:) might reach 5,000. Happened Dec 22nd. - dcljr
  • Chuvash (cv:) might reach 5,000
  • Hindi (hi:) might reach 5,000
  • Ukrainian (uk:) possibly might reach 50,000
  • Sundanese (su:) possibly might reach 10,000
  • Kapampangan (pam:) possibly might reach 2,000

And these are the most likely newcomers to the table:

  • Quechua (qu:) surely will reach 1,000. Happened Dec 10th. - dcljr
  • West Flemish (vls:) surely will reach 1,000. Happened Dec 24th. - dcljr
  • Volapük (vo:) surely will reach 1,000. Happened Dec 13th. - dcljr
  • Maltese (mt:) probably will reach 1,000. Happened Dec 20th. - dcljr
  • Nahuatl (nah:) might reach 1,000. Happened Dec 28th. - dcljr
  • Voro (fiu-vro:) possibly might reach 1,000

Within each list above, the entries are given in decreasing order of certainty, then in decreasing numerical order by milestone, and finally in alphabetical order. Note that my prediction method is biased towards saying a wiki with a recent growth spurt will continue such activity. - dcljr (talk) 03:48, 9 December 2006 (UTC)

Numbers

I would prefer that the numbers appear with zeroes (e.g. as 100,000), rather than with a suffix of M or K (e.g. as 100 K). I think there will be many readers of Wikipedia for whom the suffix doesn't make immediate sense, and at least a few who will be unsure whether the letter refers to powers of ten, e.g. mega (which it does), or powers of two, e.g. mebi. -gadfium 22:01, 28 December 2006 (UTC)

I agree. The 0s weren't hurting anything. I'm going to put them back. - dcljr (talk) 05:40, 3 January 2007 (UTC)

Languages watched in January 2007

Based on trends in article counts seen at m:List of Wikipedias, the following languages are the ones most likely to need promoting in the table this month (January 2007):

  • Romanian (ro:) will reach 50,000. Happened Jan 5th. - dcljr
  • Malayalam (ml:) will reach 2,000. Happened Jan 15th. - dcljr
  • Chuvash (cv:) will almost certainly reach 5,000. Happened Jan 12th. - dcljr
  • Dutch Low Saxon (nds-nl:) will almost certainly reach 2,000. Happened Jan 11th. - dcljr
  • Newar/Nepal Bhasa (new:) will almost certainly reach 2,000. Happened Jan 24th. - dcljr
  • Hindi (hi:) will probably reach 5,000. Happened Jan 16th. - dcljr
  • Cantonese (zh-yue:) will probably reach 2,000. Happened Jan 23rd. - dcljr

The following are less likely, but possible promotions:

  • Nynorsk (nn:) might reach 20,000. Happened Feb 4th. - dcljr
  • Quechua (qu:) might reach 2,000
  • Indonesian (id:) might possibly reach 50,000. Happened Feb 1st. - dcljr
  • Ukrainian (uk:) might possibly reach 50,000. Happened Jan 16th. - dcljr
  • Cebuano (ceb:) might possibly reach 20,000. Happened Feb 2nd. - dcljr
  • Sundanese (su:) might possibly reach 10,000
  • Amharic (am:) might possibly reach 5,000
  • Javanese (jv:) might possibly reach 5,000. Happened Jan 15th. - dcljr
  • Lombard (lmo:) might possibly reach 5,000. Happened Feb 1st. - dcljr
  • Kapampangan (pam:) might possibly reach 2,000. Happened Jan 19th. - dcljr

And these are the most likely newcomers to the table:

  • Zazaki (diq:) will almost certainly reach 1,000. Happened Jan 14th. - dcljr
  • Turkmen (tk:) might possibly reach 1,000
  • Yoruba (yo:) might possibly reach 1,000

Within each list above, the entries are given in decreasing order of certainty, then in decreasing numerical order by milestone, and finally in alphabetical order by site prefix. Note that my prediction method is biased towards saying a wiki with a recent growth spurt will continue such activity. - dcljr (talk) 09:10, 5 January 2007 (UTC)

Languages watched in February 2007

Based on trends in article counts seen at m:List of Wikipedias, the following languages are the ones most likely to need promoting in the table this month (February 2007):

  • Spanish (es:) will reach 200,000. Happened Feb 10th. - dcljr
  • Finnish (fi:) will reach 100,000. Happened Feb 11th. - dcljr
  • Norwegian (no:) will reach 100,000. Happened Feb 24th. - dcljr
  • Hungarian (hu:) will reach 50,000. Happened Feb 7th. - dcljr
  • Quechua (qu:) will reach 2,000. Happened Feb 21st. - dcljr
  • Waray-Waray (war:) will reach 2,000. Happened Feb 13th. - dcljr
  • Azeri (az:) will probably reach 5,000
  • Sanskrit (sa:) will probably reach 2,000. Happened Feb 13th. - dcljr

The following are less likely, but possible promotions:

  • Amharic (am:) might possibly reach 5,000
  • Urdu (ur:) might possibly reach 5,000
  • Volapük (vo:) might possibly reach 2,000

And this is the most likely newcomer to the table:

  • Yoruba (yo:) will almost certainly reach 1,000. Happened Feb 12th. - dcljr

Within each list above, the entries are given in decreasing order of certainty, then in decreasing numerical order by milestone, and finally in alphabetical order by site prefix. As I have pointed out several times before, my prediction method is biased towards saying a wiki with a recent growth spurt will continue such activity. - dcljr (talk) 20:58, 6 February 2007 (UTC)

Languages in need of watching

Based on trends in article counts seen at m:List of Wikipedias, the following languages are the ones most likely to need promoting in the table this month (March 2007):

  • Turkish (tr:) will reach 50,000. Happened March 9th. - dcljr
  • Greek (el:) will reach 20,000. Happened March 17th. - dcljr
  • Thai (th:) will reach 20,000. Happened March 16th. - dcljr
  • Azerbaijani (az:) will reach 5,000. Happened March 9th. - dcljr
  • Low Saxon (nds:) will reach 5,000. Happened March 3rd. - dcljr
  • Urdu (ur:) will reach 5,000. Happened March 15th. - dcljr
  • Novial (nov:) will almost certainly reach 2,000
  • Newar/Nepal Bhasa (new:) will probably reach 10,000. Happened March 5th. - dcljr

The following are less likely, but possible promotions:

  • Cebuano (ceb:) might possibly reach 50,000
  • Hindi (hi:) might possibly reach 10,000. Happened March 14th. - dcljr
  • Volapük (vo:) might possibly reach 2,000

And these are the most likely newcomers to the table:

  • Classical Chinese (zh-classical:) will probably reach 1,000
  • Upper Sorbian (hsb:) might possibly reach 1,000
  • Tongan (to:) might possibly reach 1,000

Within each list above, the entries are given in decreasing order of certainty, then in decreasing numerical order by milestone, and finally in alphabetical order by site prefix. Note that my prediction method is biased towards saying a wiki with a recent growth spurt will continue such activity. - dcljr (talk) 08:04, 3 March 2007 (UTC)