User:Colonel Chaos/study

From Wikipedia, the free encyclopedia

In the last several days, I conducted a small survey of vandalism to featured articles on Wikipedia. The following page includes a discussion of my methods and results. The shocking result of my study: on average, it takes the Wikipedia community 10 hours to remove serious vandalism (as opposed to childish vandalism) from Featured Articles.

Method

For my project, I studied only featured articles. Featured articles should be the very best Wikipedia has to offer, as they have been extensively checked and vetted. At the beginning, I assumed that the individuals who went to the work of creating featured articles would probably watch them heavily. I also figured that most featured articles would be on a number of watchlists. For these reasons, I assumed that featured articles would display the ideal response time for vandalism.

Of course, we all know that vandalism that includes profanity and childish insults is caught rather quickly, thanks to the efforts of our hundreds of RC patrollers and the multitude of software devices created to help find easy-to-spot vandalism. They are quite adept at finding and reverting "Your mom..." or "PENIS!" So, for this study, I engaged in slightly more complex vandalism in three categories:

  • Grave Factual Inaccuracy: For these articles, I changed or inserted material that any average reader or editor of Wikipedia would immediately know to be untrue. For example, in the hydrochloric acid article, I wrote that Martin Sheen discovered the acid by mixing potatoes with salt and that Martin Sheen also invented Agent Orange for dissolving gold. Finally, I inserted the sentence: "Most of Sheen's research received funding from the United States military due to their interest in new weapons technology". If a reader/editor didn't immediately notice that this information was false (and wholly inconsistent with the remainder of the article), clicking on any of the wikilinks I inserted would've revealed the truth.
  • Complete Nonsense: In this category, I inserted a passage of completely irrelevant prose into an article. For example, I inserted the opening of This Side of Paradise into the middle of the article Island Fox.
  • Factual Inaccuracy: In this category, I changed articles more subtly, so that a user with knowledge of the topic would be needed in order to spot the incorrect information. For example, in the article on Norman Borlaug, I changed "Between 1965 and 1970, wheat yields nearly doubled in Pakistan and India" to "Between 1968 and 1975, wheat yields nearly tripled in Pakistan and India" in the article's lead section.

Results

Complete Nonsense

In the complete nonsense category, I modified a total of five articles. The average response time was 691.8 minutes or 11.5 hours!

Article | Vandalized | Reverted | Elapsed Time | Notes
Island Fox | 23:43, 27 April 2007 | 03:04, 30 April 2007 | 51 hours 47 minutes | This one took longer to revert than any other article in the study (over two days).
Data Encryption Standard | 23:44, 27 April 2007 | 00:57, 28 April 2007 | 1 hour 17 minutes |
Cornell University | 23:48, 27 April 2007 | 04:07, 28 April 2007 | 4 hours 19 minutes | This one was reverted by an anon, as were many of my changes.
Technetium | 02:57, 28 April 2007 | 03:12, 28 April 2007 | 15 minutes | One of the quicker reverts in the study, but the editor still assumed good faith.
Fin Whale | 03:03, 28 April 2007 | 03:04, 28 April 2007 | 1 minute | Now that was a fast revert, what we're going for here!

All in all, the response in this category was shockingly slow, especially for Island Fox. Only two of the articles, Technetium and Fin Whale, were reverted in what I would deem an acceptable time frame for vandalism this easy to spot. Of course, the average revert time above is skewed by the extreme values, so if we take the average of the middle 3, we end up with 117 minutes or nearly two hours, which is better, but still terrible.

Grave Factual Inaccuracy

I concentrated the most on this category, modifying a total of 9 articles. The response time here was still quite bad. On average, it took 558.9 minutes or about 9.3 hours to revert modifications to articles in this category. Unlike the previous category, no single value threw off the data, as there were two values of just over 14 hours and one of over 42 hours.

Article | Vandalized | Reverted | Elapsed Time | Notes
Medal of Honor | 22:57, 27 April 2007 | 22:58, 27 April 2007 | 1 minute | Great revert time, congratulations to ERcheck!
Dime (United States coin) | 23:01, 27 April 2007 | 02:55, 28 April 2007 | 3 hours 54 minutes |
Butter | 23:06, 27 April 2007 | 13:25, 28 April 2007 | 14 hours 19 minutes | This one wasn't ever exactly reverted. An anon removed the entire paragraph that I changed and as such the article is simply missing a paragraph. To be honest, it wasn't an essential paragraph, so I didn't put it back in. Be bold and decide if you want to revert to an earlier version!
Hydrochloric Acid | 23:10, 27 April 2007 | 13:26, 28 April 2007 | 14 hours 16 minutes | This one was edited by a bot in between my edit and the revert.
James Joyce | 23:15, 27 April 2007 | 02:58, 28 April 2007 | 3 hours 43 minutes |
New Orleans Mint | 20:51, 28 April 2007 | 01:04, 29 April 2007 | 4 hours 13 minutes | I don't know why I kept on modifying articles to include Vietnam information, but this was one of those.
Great Lakes Storm of 1913 | 21:15, 28 April 2007 | 21:53, 28 April 2007 | 38 minutes | A good revert time, apparently because the article is heavily edited by brian0918.
Second Crusade | 21:21, 28 April 2007 | 15:59, 30 April 2007 | 42 hours 38 minutes | This one suffered another incident of vandalism and was reverted to my version before my modifications were corrected. Honestly, how long does it take to figure out that Gregory Peck, Bill Cosby, and Harry Potter didn't lead the Second Crusade and that Paul Revere wasn't involved?
Francis Petre | 19:54, 29 April 2007 | 20:02, 29 April 2007 | 8 minutes | I'd call that a good response time!

Once again, we are faced with an appalling response time. Every article in this category had obviously been vandalized, and many of them went uncorrected for several hours. I shudder a little bit when I think about people reading these articles in the meantime. What kind of impression would a casual reader form of an encyclopedia that reported that Bill Cosby led the Second Crusade along with Harry Potter and Gregory Peck?

Factual Inaccuracy

Interestingly, this category provided the single glimmer of hope in my study, with an average revert time of 57.4 minutes for five articles. Of course, prior to this survey I would've looked at a revert time of nearly an hour as atrocious, but you take what you can get. Unfortunately, the quick revert time in this category wasn't the result of actions of a group of vigilant users. Instead, Morven, a user with checkuser permissions, connected the five articles through the fact that I had made all of my edits up to that point through the same proxy. So, I feel that this category is essentially invalid. Nonetheless, here is the data.

Article | Vandalized | Reverted | Elapsed Time | Notes
Eifel Aqueduct | 22:26, 27 April 2007 | 23:41, 27 April 2007 | 1 hour 15 minutes | Revert by Morven
The Scout Association of Hong Kong | 22:29, 27 April 2007 | 23:37, 27 April 2007 | 1 hour 8 minutes | Revert by Morven
Canon T90 | 22:33, 27 April 2007 | 23:52, 27 April 2007 | 1 hour 19 minutes | Revert by Morven
Norman Borlaug | 22:36, 27 April 2007 | 23:36, 27 April 2007 | 1 hour | Revert by Morven
Order of the Thistle | 22:38, 27 April 2007 | 22:43, 27 April 2007 | 5 minutes | Quick revert and not by Morven (Doops reverted it)

Once again, I don't think this category was especially notable given the influence of Morven and his checkuser permissions. Nonetheless, it is a glimmer of hope when it comes to a possible organized vandal attack.

Conclusions

For the categories of grave factual inaccuracy and complete nonsense, the average overall response time was about 10 hours. I would categorize the edits in these categories as serious vandalism. Personally, I think that a revert time of 10 minutes would be more appropriate for featured articles than 10 hours. In other words, our anti-vandal measures have failed. While they are highly successful at catching childish vandalism, our vandal fighters don't even notice serious content vandalism that is equally, if not more, damaging.

Also, because this study involved changing facts, it reveals that the general community response to a bad fact in a featured article is far too slow.
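
For anyone who wants to check the roughly 10-hour figure, here is a minimal sketch in Python. It assumes the two "serious vandalism" categories are simply pooled into one list of 14 articles, using the elapsed times exactly as listed in the tables above; the variable and function names are my own, for illustration only.

```python
# Elapsed revert times in minutes, transcribed from the "Elapsed Time"
# column of the Complete Nonsense and Grave Factual Inaccuracy tables.
complete_nonsense = [51 * 60 + 47, 1 * 60 + 17, 4 * 60 + 19, 15, 1]
grave_inaccuracy = [
    1,             # Medal of Honor
    3 * 60 + 54,   # Dime (United States coin)
    14 * 60 + 19,  # Butter
    14 * 60 + 16,  # Hydrochloric Acid
    3 * 60 + 43,   # James Joyce
    4 * 60 + 13,   # New Orleans Mint
    38,            # Great Lakes Storm of 1913
    42 * 60 + 38,  # Second Crusade
    8,             # Francis Petre
]


def mean_minutes(times):
    """Arithmetic mean of a list of durations given in minutes."""
    return sum(times) / len(times)


# Assumption: "overall" means pooling both categories into one list.
pooled = complete_nonsense + grave_inaccuracy
pooled_hours = mean_minutes(pooled) / 60

print(f"{len(pooled)} articles, mean {pooled_hours:.1f} hours")
# prints "14 articles, mean 10.1 hours"
```

Pooling weights every article equally. Averaging the two category averages instead would weight the smaller category more heavily, but it also lands at roughly 10 hours.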

What do I suggest?

  • Discussion on this topic - people need to talk about it
  • Put more articles on your watchlist! I've already watched another half-dozen featured articles.
  • Stable versions, so people don't have to deal with bad facts
  • Divert some of the extensive resources we devote to vandal-fighting to fact-checking. It may be less glamorous, but fact-checking is essential to catch bad facts inserted not just as vandalism but also in good faith.
  • Create some sort of bot to monitor edits by new editors. Ideally this would apply to those with a low edit count, but recently created accounts could be monitored instead. I think such a bot is technically feasible and a good use of resources. Just look at the numerous variations on detecting "PENIS!" in RSS feeds or monitoring recently reverted users.
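
To make the bot suggestion a little more concrete, here is a rough sketch in Python of the filtering step only. Everything in it is hypothetical: the `Edit` record, the `needs_review` function, and the thresholds (50 edits, 7 days) are assumptions for illustration, and the sketch does not use any real Wikipedia feed or API. A real bot would have to fill in these records from a recent-changes source.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    # Hypothetical record; a real bot would build this from a changes feed.
    user: str
    article: str
    user_edit_count: int
    account_age_days: int


def needs_review(edit, featured_articles, max_edits=50, max_age_days=7):
    """True if a low-edit-count or recently created account edited a featured article.

    The thresholds are placeholder assumptions, not tested values.
    """
    is_new_editor = (
        edit.user_edit_count < max_edits
        or edit.account_age_days < max_age_days
    )
    return is_new_editor and edit.article in featured_articles


# Made-up sample data for illustration.
featured = {"Island Fox", "Second Crusade"}
edits = [
    Edit("LongTimeEditor", "Island Fox", 2500, 900),
    Edit("BrandNewAccount", "Second Crusade", 3, 0),
    Edit("BrandNewAccount", "Some Stub", 4, 0),
]

review_queue = [e for e in edits if needs_review(e, featured)]
for e in review_queue:
    print(f"{e.user} edited {e.article}")
# prints "BrandNewAccount edited Second Crusade"
```

The point of the sketch is that the bot would not try to judge whether an edit is true. It would only narrow the stream of edits down to a short queue that a human fact-checker could actually read.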

Answers to Questions

Below, I have answered some questions I anticipate receiving.

1. Isn't this a massive violation of WP:POINT?
In a word, yes. But someone needed to look at how long it actually takes to revert vandalism and this seemed like the best way.
2. Who are you?
I am a frequent editor with over 2500 mainspace edits. I am not an administrator. For obvious reasons, I have concealed my identity for this survey.
3. Why did you use registered accounts?
For several reasons, primarily so that I could operate behind the same IP without having my edits linked by other editors (except one with checkuser). Additionally, people screen anonymous edits more thoroughly, so I felt being registered would give better data.
4. What did you expect to be the results of your study?
I would've guessed about an hour to revert minor factual inaccuracies, 15 minutes for grave factual inaccuracies, and 5-10 minutes for nonsense. Yes, I was wrong.
5. Why shouldn't we ban you?
I'm not a threat. Go ahead and block the accounts that I used to do my modifications, but I am not a vandal and I don't plan to use this account to do anything wrong. This account is just for sparking discussion.
6. Why did you choose Featured Articles?
Because they are supposed to be Wikipedia's best. I think we all know that vandalism to stubs can go unnoticed for a long time. But Featured Articles, simply put, matter a lot.