Wikipedia:Size comparisons

From Wikipedia, the free encyclopedia

This article compares the size of Wikipedia with other encyclopedias and information collections.

The source material from which the Wikipedia statistics in this article are derived is available here; the "Footnote on Wikipedia statistics" section near the end of this page provides technical discussion of these figures.

Currently, the English Wikipedia alone has over 2,409,367 articles of any length, and the combined Wikipedias for all other languages greatly exceed the English Wikipedia in size, giving a combined total of more than 1.74 billion words in 9.25 million articles in approximately 250 languages. The English Wikipedia alone has over 1 billion words, over 25 times as many as the next largest English-language encyclopedia, Encyclopædia Britannica, and more than the enormous 119-volume Spanish-language Enciclopedia universal ilustrada europeo-americana.

Nevertheless, there are many other online databases that combine several encyclopedias and encyclopedic dictionaries and allow users to search all of the works simultaneously. One example is Oxford Reference Online, a combined database of 160 encyclopedias and encyclopedic dictionaries offering a total of 940,000 articles as of 2006, with expansions planned for the future.[1] Another example is Xrefplus, which offers access to 262 encyclopedias, dictionaries, and other reference books.[2] These added up to about 2.9 million entries when the database had 225 titles.[3] There are also HighBeam Research and GaleNet. GaleNet, which is likely the largest named so far, offers users the ability to search several encyclopedia databases, including the Biography Resource Center (1,335,000 people), Gale Virtual Reference Library (594 reference books),[4] and the Science Resource Center (58 titles),[5] among others.

The largest encyclopedia ever produced is possibly the Yongle Encyclopedia, completed in 1407 in 11,095 books containing 370 million Chinese characters. These books were small by modern standards; the work was twelve times the size of the 20 million word French Encyclopédie,[6] giving a total of 240 million words, or about 21,600 words per book, although it is unclear whether word count is the measure by which it exceeds the Encyclopédie. It is also unclear whether it is twelve times larger than the original 28-volume version of the Encyclopédie completed in 1772 or the 35-volume version completed in 1780. The Yongle Encyclopedia (Yung-lo ta-tien) was a collection of excerpts and entire existing works, rather than an original work. Only two copies were made, and all that survives is a small fraction of one copy.
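The arithmetic behind the Yongle estimate can be checked with a short sketch. It assumes, as the paragraph above does, that "twelve times the size" refers to word count; the variable names are illustrative only.

```python
# Figures quoted in the text for the Yongle Encyclopedia comparison.
ENCYCLOPEDIE_WORDS = 20_000_000  # French Encyclopédie, in words
SIZE_MULTIPLE = 12               # Yongle is said to be twelve times larger
YONGLE_BOOKS = 11_095            # number of books in the Yongle Encyclopedia

# Assumption: the multiple applies to word count (the text notes this is unclear).
yongle_words = SIZE_MULTIPLE * ENCYCLOPEDIE_WORDS
words_per_book = yongle_words / YONGLE_BOOKS

print(f"Estimated total words: {yongle_words:,}")
print(f"Estimated words per book: {words_per_book:,.0f}")
```

The per-book result comes out slightly above the rounded figure of 21,600 given in the text.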

In 2005 the English-language Wikipedia more than doubled in size, and many smaller Wikipedias grew by a higher multiple.

There have been 282,885 contributors to all Wikipedia language editions (151,937 to the English language edition), with 8.2 million edits in September 2006 (3.8 million of which were in the English version).

Wikipedia is still in need of much expansion and improvement. Many of the articles are of poor quality, and some mainstream encyclopedia topics are not covered adequately. In addition, the average article length is only a little over half that of Encyclopædia Britannica. Over time, the balance of editorial effort is expected to tilt slowly towards improving the quality, scope, classification and interlinkage of existing articles. However, new articles will probably always be created in large numbers, as Wikipedia's conventions on acceptable article topics admit huge numbers of potential new articles every year (newly prominent people, current events, media products, physical products, etc.). In mid-2006 the rate of new article creation was still rising, but only slowly. As of January 2007, it appears that the rate of article creation may have peaked in mid-2006, though it would be premature to say so for certain.

Comparison of encyclopedias

Numbers regarding total characters are based on an estimated average word length of five, plus a space, or six characters per word.

On September 1, 2006, the English Wikipedia had 1.4 million articles (see the footnote on Wikipedia statistics) and 609 million words, giving a mean article length of 435 words and over three and a half billion total characters. It also had about 850,000 photographs and illustrations, 1.4 million redirect pages, over 2.6 million links to other websites and 32.1 million cross-reference links between articles.
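The two derived figures above follow from the stated rule of six characters per word. A minimal sketch, with hypothetical function names chosen for illustration:

```python
CHARS_PER_WORD = 6  # five letters plus one space, as assumed in the text


def estimated_characters(words: int) -> int:
    """Estimate total characters from a word count."""
    return words * CHARS_PER_WORD


def mean_article_length(words: int, articles: int) -> float:
    """Mean number of words per article."""
    return words / articles


# English Wikipedia figures for September 1, 2006, as quoted above.
words = 609_000_000
articles = 1_400_000

print(f"Mean article length: {mean_article_length(words, articles):.0f} words")
print(f"Estimated characters: {estimated_characters(words):,}")
```

The same two functions can be applied to any row of the comparison table where both a word count and an article count are given.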

Encyclopedias by size
Columns: Encyclopedia; Edition; Articles (thousands); Words (millions); Est. characters (millions); Average words per article
Wikipedia English >2,000 >1,000 >3,500 435
Siku Quanshu (四庫全書)* 1782 800
Yongle Encyclopedia (永樂大典) * 1403 370[7] / 770[8]
Enciclopedia universal ilustrada europeo-americana 1933 >1,000 200 1,000
Gǔjīn Túshū Jíchéng (古今圖書集成) 1725 100
Encyclopedia of China (中国大百科全书) 1993 80 126.4 1580
Enciclopedia italiana 1939 60§ 50 247 833
Nationalencyklopedin 183**
Encyclopædia Britannica 2002 65[9] 44 650
Online 120 55 300 370
Great Soviet Encyclopedia 1978 100 21†† 200 570
Encyclopédie 1751-1780 72 20 278
Microsoft Encarta Encarta Deluxe 2002 70‡‡ 40 200 600
Encarta Deluxe 2005** 63 40 200 200
2002 Encarta Encyclopedia 40 26 200 200
Encyclopedia Americana 2004 45[10] 25 556
Grolier Multimedia Encyclopedia Online 39[11] 11 70 280
Columbia Encyclopedia Sixth 51 6.5 40 130
Meyers Konversations-Lexikon Fourth ed. 1888-92 97 15.5 110
Encyclopædia Universalis 13th ed. 2008 41.5 60 350 1450

*Classical Chinese is a very compact language, so the same content takes up comparatively little space.

It is said that Yongle is larger than Siku, but it is uncertain how they were compared.

Kenneth F. Kister, Kister's best encyclopedias: a comparative guide to general and specialized encyclopedias, (1994) p. 450. [Article count is for the 82-volume edition, rather than the 119-volume one.]

§Alfieri, G. Treccani Degli. "Enciclopedia italiana" Diccionario Literario (2001 HORA, S.A.)

**Number of encyclopedic articles. The Nationalencyklopedin contains a total of 356,000 entries.

††Kister, op. cit., p. 365.

**Includes 10,000 historical archives.

‡‡Advertised as containing "over 63,000 articles...with 36,000-plus map locations, and over 29,000 editor-approved Web site links." The 2006 Premium CD-ROM had 68,000 articles.[12]

Encyclopædia Universalis is advertised (2008 press release) as containing 41,500 articles written by 6,803 authors, 60 million words, 350 million characters, 360,000 links, 122,000 definitions in the included dictionary, and 130,000 bibliographical references.

Size of other information collections

Note that Wikipedia is neither a dictionary nor a web index; these figures are just for order-of-magnitude comparison.

Astronomy

  • The Guide Star Catalog II has entries on 998,402,801 distinct astronomical objects searchable online.
  • 5.5 TB of astronomical images (covering the whole night sky in several colours) are available online from Aladin.

Biology

  • The World Resources Institute claims that approximately 1.4 million species have been named, out of an unknown number of total species (estimates range between 2 and 100 million species).

Chemistry

Film and television

Genetics

Geography

Internet

  • Over 25 billion web pages were known to Google on February 24, 2006.
  • Netcraft logged 92,615,362 distinct websites on 28 August 2006.
  • As of August 2006, the Open Directory Project web index claims to have over 590,000 categories for 4 million websites.

Language

Law

Libraries

  • The British Library is known to hold over 150 million items.
  • The Library of Congress claims that it holds approximately 119 million items, 12 million of which are electronically searchable.
  • Copac is a searchable electronic catalogue of over 31 million books held in libraries in the United Kingdom and Ireland (it includes all electronic records from the British Library).

Music

  • The freeDB database holds information for around 1,579,205 compact discs. Many of the discs are duplicates, however, so the number of unique CDs is unclear.
  • The All Music Guide database contains entries for 834,069 unique albums, and 14,642,322 credits (as of June 2005).
  • The New Grove Dictionary of Music and Musicians, Second Edition, claims "25 million words with over 29,000 articles" about the subject of music alone.
  • The Jamendo project contains over 3,050 free and open albums.

People

  • Thomson-Gale's Biography Resource Center contains over 1,335,000 biographies. Of these, 335,000 are essays, while over a million are thumbnail entries.[15]
  • The Oxford Dictionary of National Biography has over 50,000 articles on famous Britons, in 50 million words (implying an average article size of 1000 words).
  • The old British Dictionary of National Biography had 36,500 articles in 33 million words.
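The average article lengths implied by these two biographical dictionaries can be computed directly. The sketch below is illustrative: the dictionary structure and names are assumptions, the article counts are treated as exact even though the text gives "over 50,000" for the Oxford work, and the average for the old Dictionary of National Biography is derived here rather than stated in the text.

```python
# (total words, number of articles) as quoted in the list above.
collections = {
    "Oxford Dictionary of National Biography": (50_000_000, 50_000),
    "Dictionary of National Biography (old)": (33_000_000, 36_500),
}

# Average words per article for each collection.
averages = {
    name: words / articles
    for name, (words, articles) in collections.items()
}

for name, average in averages.items():
    print(f"{name}: about {average:.0f} words per article")
```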

Larger numbers

  • As of 2006, there are about six and a half billion human beings, each with his or her own life story. Between 25 and 100 billion more have lived and died in the past, although almost all of their lives are lost to history. As Arthur C. Clarke put it in his preface to 2001: A Space Odyssey (in 1968, when the world population was only about 3.5 billion [16]):
Behind every man now alive stand thirty ghosts, for that is the ratio by which the dead outnumber the living. Since the dawn of time, roughly a hundred billion human beings have walked the planet Earth. — Now this is an interesting number, for by a curious coincidence there are approximately a hundred billion stars in our local universe, the Milky Way. So for every man who has ever lived, in this universe, there shines a star.

Footnote on Wikipedia statistics

Very detailed statistics for almost all aspects of Wikipedia are available from http://www.wikipedia.org/wikistats/EN/Sitemap.htm.

Statistics for this page are taken from the Article count (alternate) table and from the Words table.

Excluding redirect pages, there are roughly (using figures from September 1, 2006):

  • 1.4 million articles that have at least a single link.
  • 1.3 million articles that have at least a single link and 200 readable characters (roughly equivalent to at least 33 words).

Taking the difference of these two figures, there are about:

  • 100,000 articles that have at least a single link but fewer than 200 characters.

There is also an uncounted number of articles which have no links. The current statistics provide no indication of the size of this last category. The upshot is that the 609 million words in fact span the 1.3 million bona fide articles, the remaining 100,000 linked articles, and the unknown number of articles without links. A rough estimate of the word count in the latter two categories is ten million words. Dividing the remaining 600 million words by 1.3 million gives a mean article length of about 460 words.
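The adjustment described above can be laid out as a short calculation. This is a sketch using only the figures quoted in this footnote; the variable names are illustrative, and the ten million words assigned to short and unlinked articles is the rough estimate given in the text, not a measured value.

```python
# Figures from September 1, 2006, as quoted in this footnote.
total_words = 609_000_000
articles_with_link = 1_400_000        # at least one link
substantial_articles = 1_300_000      # at least one link and 200 characters

# Rough estimate from the text for words in short or unlinked articles.
short_or_unlinked_words = 10_000_000

# Articles that have a link but fewer than 200 readable characters.
short_articles = articles_with_link - substantial_articles

# Words attributable to the substantial articles (the text rounds this to 600 million).
remaining_words = total_words - short_or_unlinked_words
mean_length = remaining_words / substantial_articles

print(f"Short linked articles: {short_articles:,}")
print(f"Mean length of substantial articles: about {mean_length:.0f} words")
```

Whether the remaining words are taken as 599 million or rounded to 600 million, the mean rounds to about 460 words per article.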

Further, of the articles on the English Wikipedia, perhaps 36,000 are "data dumped" gazetteer entries about towns and cities in the United States. It is controversial whether gazetteer entries should count towards the number of "real" encyclopedia articles; however, their statistical significance is very much less now than in October 2002 when they were added. Very many have been colonised by Wikipedians who have transformed them to varying extents, including to an unimpeachably encyclopedic status.

References

See also