Wikipedia talk:Size comparisons/Archive 1
Just how many pages about Wikipedia's size and activity do we need? We've got Wikipedia:Statistics, Wikipedia:Size of Wikipedia, Wikipedia:Traffic, and now Wikipedia:Size comparisons. Probably half a dozen more I haven't noticed yet, too. --Brion 19:15 Sep 11, 2002 (UTC)
These and the other new statistics pages are really fantastic! --Larry Sanger
I moved this here from the main article:
In addition to the comparisons above, the size of wiki articles can be compared with the size in other encyclopedias.
The table below compares the word counts for five topics, one from each of five categories, across Wikipedia, www.encyclopedia.com and Encarta.
word count |
category | subject | encarta | wiki | encyclopedia.com | notes |
literature | shakespeare | 3130 | 765 | 2230 | (without summaries of individual plays) |
mathematics | calculus | 3298 | 1093 | 1377 | (including diff. equations) |
geography | france | 60841 | 9020 | 6712 | (in wiki, including text from main articles) |
biology | elephant | 4318 | 673 | 891 | |
history | attack on pearl harbour | 339 | 940 | 315 | |
It looks like we compare well with Columbia/www.encyclopedia.com, but still have a long way to go to catch up with Encarta when it comes to the depth of information in articles.
As for the topics, I have tried to choose topics from different categories. With the exception of the mathematics topic, I suppose they are topics that kids might easily choose for a presentation at school.
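The figures in the table above can be turned into ratios to make the gaps easier to see. This is only a minimal sketch: the numbers are copied from the table, while the names `WORD_COUNTS` and `wiki_ratios` are made up for illustration and are not part of the original count.

```python
# Word counts copied from the table above, as (encarta, wiki, encyclopedia.com).
WORD_COUNTS = {
    "shakespeare": (3130, 765, 2230),
    "calculus": (3298, 1093, 1377),
    "france": (60841, 9020, 6712),
    "elephant": (4318, 673, 891),
    "pearl harbour": (339, 940, 315),
}


def wiki_ratios(counts):
    """Return {subject: (wiki / encarta, wiki / encyclopedia.com)}."""
    return {
        subject: (wiki / encarta, wiki / columbia)
        for subject, (encarta, wiki, columbia) in counts.items()
    }


RATIOS = wiki_ratios(WORD_COUNTS)

if __name__ == "__main__":
    # A ratio above 1.0 means the wiki text is longer than the other work's.
    for subject, (vs_encarta, vs_columbia) in RATIOS.items():
        print(f"{subject:15s} {vs_encarta:6.2f} {vs_columbia:6.2f}")
```

On these figures the wiki count exceeds Encarta's only in the Pearl Harbour row.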
Please describe which specific articles were used for the above comparison (article titles). I do not think this is a fair comparison if relevant linked articles that contain further information are not counted. --Eloquence 05:52 20 Jun 2003 (UTC)
hi Eloquence,
What you ask would take much more time than the original count did. You may have noted that I said I included text from main articles where appropriate. The article about France uses 'main articles' as sub-articles because the original article became too long.
Let me detail my count for one of the lines from the above table: France:
- Encarta: 15 page article, with subtopics:
- Introduction;
- Land and Resources;
- People and Society;
- Culture;
- Economy;
- Government;
- History.
Not included are the listed related topics, such as: capital, Paris; Charlemagne, major figure in medieval French history; more... Total 60841 words, excluding numerous ads. For counting I copy/pasted everything into MS Word 2000 and used the word count function.
- Columbia/www.encyclopedia.com: this article has been divided up into:
- http://www.encyclopedia.com/html/F/France.asp (introduction)
- http://www.encyclopedia.com/html/section/France_Land.asp (land)
- http://www.encyclopedia.com/html/section/France_People.asp (people)
- http://www.encyclopedia.com/html/section/France_Economy.asp (economy)
- http://www.encyclopedia.com/html/section/France_Government.asp (government)
- http://www.encyclopedia.com/html/section/France_History.asp (history)
- http://www.encyclopedia.com/html/section/France_Bibliography.asp (sources)
Not included is Geography, a list of some of the major towns. Total of 6712 words, using the same technique as for Encarta.
- for our wiki:
- France
- History of France (despite overlap with the main article)
- Politics of France (same comment)
- Département (same comment)
- Geography of France
- Economy of France
- Demographics of France
- Culture of France
- French cuisine
Not included was
List of regions in France
I am no longer sure whether I included French literature. I probably did, but it consists of little more than a list of French authors.
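For anyone who wants to repeat the count without a word processor, the copy/paste-and-count step described above could be approximated along these lines. This is a rough sketch under one loud assumption: that splitting on whitespace is close enough to what the MS Word count function reports. The function names `count_words` and `total_words` are hypothetical.

```python
from typing import Iterable


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed to approximate a word count)."""
    return len(text.split())


def total_words(sections: Iterable[str]) -> int:
    """Sum the word counts of several pasted sections, e.g. an article and its sub-articles."""
    return sum(count_words(section) for section in sections)
```

Passing the pasted text of the main article and of each included sub-article to `total_words` would give one figure per encyclopedia. Overlapping text is counted twice, just as in the manual count.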
The problem is that Wikipedia articles are typically not neatly integrated long discussions. Encarta tries to do this with main topics such as "France", whereas we have articles about each individual aspect of France's history, politics, culture etc. So if you wanted to do a fair comparison, you would have to make a keyword-style list of the topics discussed in Encarta's "France" article and then check which of these topics are covered in Wikipedia, and then compare the length of the overall coverage. And to be fair, you would have to go the other way, too, and look at all the Wikipedia articles about France and see if coverage within Encarta exists.
Here are some other France articles that are relevant:
- French Revolution
- French Resistance
- List of French monarchs
- First French Empire
- Second French Empire
- French Consulate
- French colonial empire
- Indochina
- French Wars of Religion
- French States-General
- French presidential election, 2002
- Freedom fries
...
Do equivalents exist within Encarta for each of those? If not, are the topics covered within the main article? If only the main article covers them, the size comparison must include the word count from the Wikipedia articles.
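The two-way check described above could be sketched roughly as follows, assuming each work has already been reduced to a mapping from topic keyword to word count. Building that mapping is the hard manual part and is not shown. The function name `compare_coverage` and the example topics and numbers are placeholders invented for illustration, not real counts.

```python
def compare_coverage(work_a: dict, work_b: dict) -> dict:
    """Compare two {topic: word_count} mappings in both directions."""
    topics_a, topics_b = set(work_a), set(work_b)
    shared = sorted(topics_a & topics_b)
    return {
        "shared": shared,                      # topics both works cover
        "only_a": sorted(topics_a - topics_b),  # covered by A, missing from B
        "only_b": sorted(topics_b - topics_a),  # covered by B, missing from A
        "words_a": sum(work_a[t] for t in shared),
        "words_b": sum(work_b[t] for t in shared),
    }


# Placeholder data, invented purely to show the shape of the result.
example_a = {"topic 1": 100, "topic 2": 50}
example_b = {"topic 2": 80, "topic 3": 30}
result = compare_coverage(example_a, example_b)
```

Restricting the length comparison to the shared topics keeps either work from being rewarded or penalised for material the other does not cover at all.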
As you see, it is quite difficult to use a fair methodology with projects that have such different information organization. In comparisons it is usually best not to pick articles about large meta-subjects, but to compare those about individuals, or specific historical events instead. I have done this (using random picks from Encarta and Wikipedia) in part 2 of my German article series about Wikipedia, and Encarta did not look very good. I also compared some of the Brilliant Prose stuff in Wikipedia to the Encarta equivalents, and found that, at least the German Encarta did not even have articles about essential stuff like the Milgram experiment. Coverage of controversial subjects like sexuality or real-world conspiracies (MKULTRA, COINTELPRO) is close to zero. Articles about movies, books etc. are often extremely POV ("masterpiece", "his best work").
Where Wikipedia fails is with subjects few people care about, but which are nevertheless important for a reference work, e.g. Politics of Cambodia. --Eloquence 08:17 20 Jun 2003 (UTC)
hi User:Eloquence,
You are right that it is difficult to make comparisons between the three pedias because of their different structures. I realized this from the outset, which is why I started including notes in the rightmost column about what I had and had not included. I had intended to include a note about this; no comparison will be perfect.
I did a quick count of the articles you mentioned, the same way I had counted the others (copy/paste the text into MS Word, then count words), with the exception of Freedom fries, which has more to do with US-based 'Francophobia' than with France itself. France should be expanded by 22,453 words, making a total of 9020 + 22453 = 31473. That includes numerous quotations from the 1911 Britannica.
I agree it is a major boost, but it still brings us only halfway to the Encarta size. And because we have split things up, we have numerous overlaps. Compare the main article on France with its subpages (mentioned as 'main articles' in the France article) to see what I mean.
I did not check if encarta has separate entries on any of the subjects you mentioned.
I also did a quick recount of the elephant, when I included:
The number of words is raised by 919 to 673 + 919 = 1592, still not 50% of what Encarta has.
The essential conclusion remains the same: we are on a par with Columbia (and in some ways have surpassed them) but are still not at the same level as Encarta. That's not a bad thing; I guess we grow faster.
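The recount arithmetic above can be checked with a few lines. The inputs are the figures quoted in this comment and in the table; the helper name `revised_share` is hypothetical.

```python
def revised_share(original: int, added: int, encarta: int):
    """Return the revised wiki total and its share of the Encarta word count."""
    total = original + added
    return total, total / encarta


# France: 9020 words originally, 22453 added, against 60841 in Encarta.
france_total, france_share = revised_share(9020, 22453, 60841)
# Elephant: 673 words originally, 919 added, against 4318 in Encarta.
elephant_total, elephant_share = revised_share(673, 919, 4318)

if __name__ == "__main__":
    print(f"france:   {france_total} words, {france_share:.0%} of Encarta")
    print(f"elephant: {elephant_total} words, {elephant_share:.0%} of Encarta")
```

On these figures the France total comes to roughly 52% of the Encarta count and the elephant total to roughly 37%.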
I would have loved to include info on the Encyclopaedia Britannica, but my time is too limited to use their temporary access offer for such a count.
You moved this info from the article to the discussion page. What should we do to restore it? I feel the count does contain relevant info. TeunSpaans 18:29 20 Jun 2003 (UTC)