Wikipedia talk:Half-million pool

Until when are we allowed to place our bets? Certainly you'll agree with me that placing a bet when we're at 499,999 shouldn't be allowed. — Timwi 10:49, 3 Jun 2004 (UTC)

Ahem. Look up my original statement- Two weeks from this writing, this page will be locked to prevent late entries. ;) →Raul654 13:16, Jun 3, 2004 (UTC)

What do we win? GUllman 21:21, 3 Jun 2004 (UTC)

The same thing you get for contributing here, only lots of it all at once. →Raul654 21:48, Jun 3, 2004 (UTC)
Lots of wikilove, perhaps? No, we have lots of time to draw some special star or banner we can award the skilled statistican (if that is a word). ✏ Sverdrup 21:50, 16 Jun 2004 (UTC)

Million article pool

So, when does the million article pool start? I was thinking the day we hit 500k, unless other people want an intermediate level pool. Not that we need a decision any time soon. -- Cyrius| 21:17, 16 Jun 2004 (UTC)

Do you think there are really a million different things which deserve having an interesting article written about them? pir 09:59, 17 Jun 2004 (UTC)
Let me count the ways: Every living species, currently in the millions, every town, currently in the hundreds of thousands, every politico, currently in the hundreds of thousands, etc. Yes, there are probably about 2 or 3 billion things we could write about. Burgundavia 10:11, Jun 17, 2004 (UTC)
The Million pool might as well start now. The English Wikipedia's doubling time is between 10 and 13 months now, and it's only 13 if you use some of the lowest historical monthly growth rates and predict based on those. The latest 5 month moving average rate is 7.6% growth per month and it's been relatively stable between 5% and 8%. 7.6% puts the 500,000 article milestone at early February 2005 and the 1,000,000 in late October 2005. (I'll take October 25th in the pool). I'm surprised so many people guessed at dates that would require growth outside of any of the recent average growth rates. I guess few people did the analysis. Clearly most people voting in February 2005 like Angela and I did. :) Of course, as insightfully pointed out in Modelling Wikipedia's growth, the growth may be the early part of a logistic curve, but who knows where the exponential growth will stop and diminishing returns set in. I'm betting it's past 500,000, but may be before 1,000,000. - Taxman 12:38, Jun 17, 2004 (UTC)
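For anyone who wants to redo this kind of estimate, here is a minimal sketch of the arithmetic, assuming growth compounds at a constant monthly rate (so the logistic-curve caveat is ignored). The function names are illustrative only, and the current article count has to be supplied by the caller since it is not stated in this thread.

```python
import math


def doubling_time_months(monthly_growth: float) -> float:
    """Months needed to double at a constant compound monthly growth rate.

    monthly_growth is a fraction, e.g. 0.076 for 7.6% per month.
    """
    return math.log(2) / math.log(1 + monthly_growth)


def months_to_reach(current: float, target: float, monthly_growth: float) -> float:
    """Months needed to grow from `current` to `target` articles
    at a constant compound monthly growth rate."""
    return math.log(target / current) / math.log(1 + monthly_growth)
```

Calling `doubling_time_months(0.076)` gives the doubling time implied by the 7.6% figure, and `months_to_reach(count_today, 500_000, 0.076)` gives the time to the half-million mark from whatever count you start with. Small changes in the assumed rate move the answer by months, which is worth keeping in mind when comparing guesses.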

[edit] Default time of day

I've noticed that many people didn't specify a time of day. We should decide on what time to assign those guesses before closing the pool. Otherwise, it's possible that the actual time will occur when different people would win depending on whether their guess counts for 0000 UTC or 1200 UTC. I suggest 1200 UTC. -- Yath 09:42, 17 Jun 2004 (UTC)

1200 sounds good to me. Dysprosia 09:47, 17 Jun 2004 (UTC)
Since I don't see any date collisions here, I'll just assume that their vote is valid for the entire day. And for the record, all guesses are assumed to be UTC. →Raul654 18:23, Jun 17, 2004 (UTC)
Let me point out that making people's votes valid for a period of one day contradicts the contest page, which states that whoever comes closest wins. Presumably, someone gets to win even if nobody guessed the correct date? Also, are you going to assign 24-hour periods straddling the guesses of people who specified the time of day? Assigning a specific instant to each entry would be simpler, IMHO. -- Yath 11:55, 19 Jun 2004 (UTC)
Making users' votes valid for the entire day does not contradict the idea that the closest wins; it is just one way of defining the rules that removes the ambiguity of who wins. If you have the entire day, and the 500,000th article happens on that day, you are, by definition, the closest. Therefore the pronouncement that people have the entire day simply makes specific times of day irrelevant. This makes sense since few people chose a time of day, and the contest rules did not say to choose one. Therefore a rule that does not consider the time of day is more consistent with the rules that were originally put forth. - Taxman 13:37, Jun 19, 2004 (UTC)
I hate to tell you, but if we assume 12:00 for each day, then we get *exactly* the same result. Let's say someone guesses May 22, someone else May 23, and someone May 24. Let's say we hit 500,000 at 00:01 May 23; then the person who guessed May 23 is correct. Now let's say one person guessed May 22 and another May 24, and it happens at 12:01 May 23. Then the person who guessed the 24th is correct. The only slight issue is with the two people who guessed the same date, but each of them specified a specific time, so we're still good. →Raul654 15:50, Jun 19, 2004 (UTC)
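The "assume 12:00 UTC" rule discussed above can be sketched as follows. This is only an illustration, not the procedure actually used to judge the pool: the function names and the dictionary of entrants are hypothetical, and date-only guesses are pinned to noon UTC as Yath suggested.

```python
from datetime import date, datetime, time, timezone

# Assumption: date-only guesses count as 12:00 UTC on that day.
DEFAULT_TIME = time(12, 0)


def normalize(guess_date, guess_time=None):
    """Turn a guess into a single UTC instant, defaulting to noon."""
    if guess_time is None:
        guess_time = DEFAULT_TIME
    return datetime.combine(guess_date, guess_time, tzinfo=timezone.utc)


def closest_guess(guesses, actual):
    """Return the name whose guessed instant is nearest to `actual`.

    `guesses` maps a (hypothetical) entrant name to a UTC datetime.
    """
    return min(guesses, key=lambda name: abs(guesses[name] - actual))
```

With guesses for May 22 and May 24 and an actual time of 12:01 on May 23, the May 24 guess is one minute closer than the May 22 guess, matching the second example above.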

Damn it

Just my luck to miss this opportunity. Wikipedia has gained its last 100,000 articles since January 4, so it should take about another year from now (assuming growth is linear rather than exponential, though exponential is more likely). - Mark 02:22, 22 Jun 2004 (UTC)

That sounds like a vote for a 750,000 pool instead of waiting for the full million.
I just had a thought, we could open the next pool now, and leave it open until the half-million pool ends. -- Cyrius| 03:32, 22 Jun 2004 (UTC)
No, just be patient - this pool probably won't take a year to finish - 14 months at the outside. Then we can start the next one. →Raul654 03:49, Jun 22, 2004 (UTC)

Please add to Category:Wikipedia statistics

Thanks. ··gracefool | 02:36, 1 Sep 2004 (UTC)

Done. -- Cyrius| 03:12, 1 Sep 2004 (UTC)
Sorry to be niggly, but could you give it a sort key (eg. [[Category:Wikipedia statistics|Half-million pool]])? ··gracefool |
Done.
Gracefool banned. (<-- joke, you "sysops are evil" idiots!) -- Cyrius| 03:39, 1 Sep 2004 (UTC)
Lol. Cheers. ··gracefool | 04:01, 1 Sep 2004 (UTC)
Aw C'mon, I want to ban someone. I might as well start with Gracefool -- nothing personal, Grace -- so should I need to ban someone in an actual emergency, I know what the steps are. -- llywrch 23:44, 4 Sep 2004 (UTC)
Look, an emerge 'n' see!!!!!11!!111!1111 ··gracefool | 05:18, 5 Sep 2004 (UTC)

I Win!!

Sannse is the lucky winner of this tractor.

I just want to note that my entry was actually a prediction of the date we would hit one million articles in all versions - and is undoubtedly the most accurate on that basis. I assure you that this was my intention from the start and will be pleased to receive all the acclaim and cookies that this amazingly accurate prediction deserves. Thank you, thank you -- sannse (talk) 16:25, 14 Sep 2004 (UTC)

Congratulations! You won a tractor! ✏ Sverdrup 23:28, 23 Sep 2004 (UTC)

I shall treasure it always! -- sannse (talk) 17:40, 24 Sep 2004 (UTC)

Million article pool

I have started the Million pool, since Wikipedia just exceeded 400,000 articles. The pool will be closed when 500,000 is exceeded (i.e. when we find out who won this pool). Feel free to place your entries any time between now and that day. — David Remahl 11:04, 21 Nov 2004 (UTC)

My failed attempt

Screenshot of browser

About 100 articles were submitted at nearly the same time to be the 500,000th. Of those, 50 were mine, all public domain American Civil War battle articles. (It only took about half an hour to make them all.) -- BRIAN0918 21:46, 17 Mar 2005 (UTC)

Which article is it?

Which one? -- Toytoy 06:30, Mar 18, 2005 (UTC)

  1. 04:54, Mar 18, 2005 Milon's Secret Castle (1163 bytes) . . 63.249.99.51
  2. 04:54, Mar 18, 2005 Involuntary settlements in the Soviet Union (2071 bytes) . . Mikkalai
  3. 04:54, Mar 18, 2005 Fire 'n Ice (340 bytes) . . 63.249.99.51
  4. 04:54, Mar 18, 2005 Snow Brothers (265 bytes) . . 63.249.99.51
  5. 04:54, Mar 18, 2005 Little Samson (273 bytes) . . 63.249.99.51
  6. 04:54, Mar 18, 2005 Bionic Commando (607 bytes) . . 63.249.99.51
  7. 04:54, Mar 18, 2005 Keshmirian (135 bytes) . . 69.227.0.149
  8. 04:54, Mar 18, 2005 P. J. Abbott (322 bytes) . . Hedley

Any one of them? -- Toytoy 06:35, Mar 18, 2005 (UTC)

We don't know, but people I've talked to have agreed that it's probably one of the top six or so there. Alterego went nuts with trying to get a snapshot as it happened, and wrote about it on his blog. -- Cyrius| 10:38, 18 Mar 2005 (UTC)
They all now seem to be marked as stubs. So they no longer really count. --Henrygb 14:22, 18 Mar 2005 (UTC)
We went with Involuntary settlements in the Soviet Union based on Alterego's calculation of the time, down to the second. If you assume that these were all spaced out equally over that minute, that one comes closest to being the one created at that precise time. The fact that it isn't about a Nintendo game is an added bonus. --Michael Snow 17:29, 18 Mar 2005 (UTC)
The reasoning that eliminated Milon's Secret Castle was incorrect. I'm pissed off about it because I'm quoted in said reasoning as saying something I didn't. -- Cyrius| 20:10, 18 Mar 2005 (UTC)
Cyrius is referring to Alterego's explanation at Wikipedia talk:Press releases/March 2005. This assumed that certain articles weren't included in the count. My reasoning was not based on this erroneous assumption, so it shouldn't be affected by it. --Michael Snow 21:02, 18 Mar 2005 (UTC)
The line of reasoning you are using for this, then, is faulty as well. I will upload more data when I get home in a couple of hours. The articles are not spaced out evenly over a period of a minute - there was a huge rush of them. From memory (I'm not on the server I used), there was a ten second gap just before turning 500,000 during which the server did not purge the cache. If a decision is to be made on a SINGLE article, it would have to be Milon's Secret Castle (since it meets the requirements of an article). However, a more accurate representation would be a tie between all articles that were produced in that ten second period. --Alterego 22:32, Mar 18, 2005 (UTC)
Just now browsing those that I have posted, consider this page produced ten seconds before turning 500,000 (and this is with a purged cache). The count is 499993. So in those ten seconds, seven articles were created - they are the top seven in Toytoy's reproduced list above. --Alterego 22:32, Mar 18, 2005 (UTC)
Uh, how do we know they're the top seven rather than the bottom seven? I realize that it's quite possible the articles were posted more closely together in terms of time; my assumption about being evenly spaced was mostly because lots of people want to have a specific article they can point to. Calling it a tie isn't very satisfying in that sense, and this seems as good a way as any of breaking the tie. --Michael Snow 22:48, 18 Mar 2005 (UTC)
Further studying the data has presented a new problem - conflicting data. I assumed that there was no need to specifically request a non-cached page from New Pages; however, this page at 23 seconds and this page at 46 seconds are identical. This conflicts with this non-cached page at 36 seconds which says 499993, and this non-cached page at 46 seconds which says exactly 500000. Note that between 23 and 46 I received 12 copies of the exact same New Pages - all of which have the articles listed above. However, between 36 and 46 I received NO copies of User:JRM/Sandbox, which contained the article count, because I was requesting a non-cached page, which takes longer to produce. So how do we analyse this data? Clearly there were seven new articles created between 36 and 46 seconds; however, all the articles listed above existed since 23 seconds. This would on the face of it seem contradictory, unless we consider that the server timestamps are not accurate, as all articles listed after the ones posted above are stamped with 55 minutes. At this point, can someone intimately familiar with the software answer two questions? They are 1) Is New Pages cached for non-logged-in users (appears yes) and 2) is there a delay, somewhere in the processing, and not necessarily in this order, between a) the article counter going up and b) the timestamp being placed on that article and showing up on RC or NP. I understand that it may not be an intentional delay, but I have a sneaking suspicion that due to the heavy load, some of those articles marked as coming into existence at 55 minutes were actually from 54. Oh, how I could not have predicted this =) PS: I saved all pages from within a minute or so to backups.rar (250 KB) if anyone likes. --Alterego 00:54, Mar 19, 2005 (UTC)
Okay, so maybe it was Battle of Bean's Station after all? (See Wikipedia talk:Press releases/March 2005#Determining article #500,000). Anyway, as you indicated earlier, it's a kind of precision that's really beyond our technical capacity. At this point the Involuntary settlements information has already gone out beyond hope of recall, and it's about as likely to be right as any other random article within those few minutes. Supposedly a Hebrew Wikipedia article on the Flag of Kazakhstan was the millionth article overall, I believe, but I have no idea how that's supposed to have been identified - probably just an equally arbitrary designation. --Michael Snow 01:08, 19 Mar 2005 (UTC)
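To make the "evenly spaced over that minute" tie-break concrete, here is a rough sketch. It is not a reconstruction of the calculation anyone in this thread actually performed. It assumes that the New Pages list is ordered newest first, that the eight 04:54 entries split the 60 seconds into equal slots with each article placed at the midpoint of its slot, and that the counter reached 500,000 at second 46, a figure taken from Alterego's non-cached snapshot. All three assumptions are contestable, and the function names are made up.

```python
def estimate_seconds(titles_newest_first, window=60.0):
    """Estimate a creation second for each title, assuming even spacing.

    The list is reversed so the oldest entry comes first; entry i of n
    is placed at the midpoint of its slot: (i + 0.5) * window / n.
    """
    oldest_first = list(reversed(titles_newest_first))
    n = len(oldest_first)
    return {title: (i + 0.5) * window / n for i, title in enumerate(oldest_first)}


def closest_title(titles_newest_first, target_second, window=60.0):
    """Return the title whose estimated second is nearest the target."""
    estimates = estimate_seconds(titles_newest_first, window)
    return min(estimates, key=lambda t: abs(estimates[t] - target_second))


# Toytoy's list, in the order shown above (assumed newest first).
NEW_PAGES = [
    "Milon's Secret Castle",
    "Involuntary settlements in the Soviet Union",
    "Fire 'n Ice",
    "Snow Brothers",
    "Little Samson",
    "Bionic Commando",
    "Keshmirian",
    "P. J. Abbott",
]

# Assumption: second 46 is when the counter showed exactly 500000.
pick = closest_title(NEW_PAGES, target_second=46)
```

The result is sensitive to the slot convention and to the target second: shifting either by a few seconds selects a neighbouring article, which fits the point made above that this level of precision is beyond what the data supports.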

What about the deleted articles? Do they decrement the counter? What about deleted and recreated articles? Mikkalai 05:11, 19 Mar 2005 (UTC)