User:Jimbo Wales/Pushing To 1.0/archive

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Some questions, though, are outside the scope of this particular page. Should we drive to 1.0? Should we strive to create a paper-friendly version? These are interesting questions, I suppose, but outside the scope of what I want to do here, which is to assume that the answers to those are affirmative.

Clarification please. Does "paper-friendly" imply a distributable, user-friendly CD as well as a paper version? Sorry to be so dense. mirwin 02:43, 15 Sep 2003 (UTC)
I just put a couple of ideas there. lysdexia 19:48, 8 Nov 2004 (UTC)

Please note: I plan to refactor this page heavily as we go along! I intend for this to be somewhere between an 'unowned' wiki page and a 'personal' page of mine. Please don't be offended, and if you have a personal rant you'd like to include (and please do!), possibly it will be best to put it in your own userspace and just link it from here.

I may also sometimes post versions of this to the mailing list.

Basic ideas

Some of the basic ideas that I've had so far:

  • Wikipedia 1.0 should be about as good as Britannica -- better in some ways, worse in some ways, but a valid alternative
    • Should it also be comparable? I.e., similar size/range/number of articles, similar focus?
    • Not ideologically. Britannica exists to support a particular canon, that being, the British and now American concept of "what history is." It is, for instance, light on the History of India, China, Africa, Latin America and figures of those cultures - one way Wikipedia can differentiate itself is to say that it is less Anglo-centric than Britannica. Build up an audience in developing nations who can really benefit from having a neutral encyclopedia — like in China where Wikipedia.org is banned, but they won't be able to keep all the CD-ROMs out. It may thus make sense to *focus on Chinese figures and history* deliberately. How can they keep out the only encyclopedia that does their history justice?
      • Ok, Wikipedia should be better than Britannica, whatever. You know what he means, don't you? LDan 14:34, 2 Sep 2003 (UTC)
  • The push for Wikipedia 1.0 should interfere as minimally as possible with the ongoing work on the website
    • This calls for a sifter project. Try my attempt - Magnus Manske
    • It should interfere as much as it likes - as long as it does so in a positive way!
    • There are, realistically, two web interfaces: one for authors and casual editors, and one for formal editors doing Wikipedia 1.0 work. The formal editors are probably choosing particular versions of articles. Work improving those articles will continue in parallel, so an editor has to say "yes, this one is better than the one in 1.0". Making it clear in the Wikipedia.org interface which version will go to 1.0, if any, just means putting a "Wikipedia 1.0 approved version" link on the page that goes back in the log to that version. But no one should feel they can't change an approved version: it just means there's now a difference, and the "This is the" before "Wikipedia 1.0 approved version" disappears, the latter becoming a link. That's "as minimally as possible".
    • Wikipedia 1.0 should just be the systematic reviewing/editing of every article, while still in the main Wikipedia, by approved editors (all sysops should automatically be approved, and basically anyone who asks and has a good history is approved). After that, it would be assumed that any negative changes will be followed by a positive one, as is usually the case. LDan 14:34, 2 Sep 2003 (UTC)
      • I disagree that it should be the systematic reviewing/editing of every article. I'm sure an encyclopedia doesn't need to include Cyan Worlds. I think it should be a systematic reviewing/editing of every article that is included in Wikipedia 1.0, with each one looked at very carefully by professionals, added to Wikipedia (the website) and Wikipedia 1.0, and then never looked at again in the 1.0. I can guarantee that the second a 1.0 page is published on Wikipedia it will be updated with info, but you eventually have to say "we're going to perfect version X and then that's it"--John Lynch 13:16, 3 Aug 2004 (UTC)
    • Do we want every article in the main Wikipedia in the 1.0? Including stubs and NPOV's? Featured articles are a must. --zandperl 23:45, 29 Oct 2004 (UTC)
  • Wikipedia 1.0 is just Wikipedia 1.0 — by this I mean that we don't need to come up with a perfect system for all time, and that we should be prepared to learn from our mistakes on 1.0 as we, in the future, drive towards 1.1, 1.2, or 2.0
    • Whatever editorial process is arrived at, will eventually stick for good. It is worth working out in depth.
    • Well, as long as it is properly edited and fact-checked. LDan 14:34, 2 Sep 2003 (UTC)
  • Wikipedia 1.0 should be primarily about producing a single end product that is suitable for printers to print, cheaply. This will mean that we'll want to work for a state such that a printer could receive a CD-ROM from me in the mail and start producing books as easily as possible. I have no idea right now what this would involve, frankly.
    • Similarly, anyone should be able to start producing CD-ROMs to mail to others. All open - no single point of failure.
    • There should be a CD-ROM target, including everything in Simple English, specifically for poor countries with little net access where people are learning English or using it only for technical and business purposes, another CD-ROM for English speakers with more emphasis on Western European derived history and culture perhaps, and a DVD-ROM containing the full encyclopedia with audio, pictures, and all the rest. Maybe all the non-English Wikipedias will themselves fit on a single CD, then one DVD?
      • If they're truly poor, they won't have access to a CD drive, let alone a DVD drive. I think it'd just be better to have one version that's just English and one that's all languages. If the content is too big for a CD, we'll just compress it. If that still doesn't do it, we can use 2 CDs, but we should really avoid using a DVD. Currently, all of the Wikipedias, bz2-compressed with no past revisions, come to only 182 MB, so I'm confident that, even by the end of 2004, the entire Wikipedia will fit on one CD.
        • I don't think it's right to say "If they are too poor to have a DVD drive, then they are too poor to have a CD drive as well". That's the case over here, where a new DVD drive is about as cheap as a CD drive, but doesn't translate to poor countries that routinely have to keep 10-year-old computers working. PhilHibbs 14:33, 19 Nov 2004 (UTC)
        • Mozilla already has Gzip decoding built in, for servers that send gzip-encoded data (e.g. Apache with mod_gzip). Maybe Moz can be extended to read .html.gz files, or browse the contents of tar.gz archives. —Michael Z. 18:07, 2004 Aug 24 (UTC)
    • We may want to look at DocBook. User:David Merrill may be able to help with that; he wrote some tools to convert from our wiki syntax to DocBook for the Linux Documentation Project: [1], [2]. It's also one of the formats that O'Reilly is able to accept; someone from there may chat with us about how they convert from DocBook to FrameMaker format.
    • User:Stw is working on a script to convert Wikipedia to PDF via LaTeX. You can check it out here and download the source from CVS.
    • PDF gives precise control over type and layout, but HTML plus a print stylesheet (CSS) is pretty good too. If a designer is going to lay out each page, then PDF will capture all of the nuance. But if the text is going to be auto-flowed into a template, why not stick with more flexible and accessible HTML?
    • The exception is with text that flows across columns. There are plans for this in the future CSS3 standard, but I don't think it's practical yet. Maybe if we stick to Mozilla. —Michael Z. 18:07, 2004 Aug 24 (UTC)
  • Wikipedia the website should remain just as it is, but it is likely that Wikipedia 1.0 is going to require some degree of more formality and controls. These should be kept to a minimal level! We want to preserve the maximum possible openness while at the same time doing what needs doing to ensure that approved versions of articles are actually quite good.
    • Be as formal or informal as is needed - 1.0 need not compete with the website for openness - it'd lose!
    • An Editor's Guild that disapproves the approvers of articles may be effective. That way everyone's an approver and their "vote counts" until the cabal of those most trusted editors agrees to discount their vote. That need not be visible! But for instance User:JoeM's approval is at best irrelevant to Wikipedia 1.0.
    • I think the Editor's idea is not a bad one, since it will let people decide whether they want to work on this move to 1.0 or not. I think the most important element is to get different types of votes, i.e., quick yeses, quick nos, and then articles that need to be looked at in depth. I can obviously say no to a random city page and it would pass easily, but there could be debate about some articles. Flamingantichimp 22:30, 8 Jan 2004 (UTC)
    • We should still stay relatively informal and not fork everything. I think that Wikipedia 1.0 should just be every article systematically edited (including new articles made during this editing period) and made fit for printing. After that, it can still be edited, but most edits will either be benign or be fixed by later edits. We don't want parts of our encyclopedia to be up to a year out of date. LDan 14:34, 2 Sep 2003 (UTC)
  • As a working hypothesis, let's talk about this project as though we intend to finish by December 31st, 2004, just over a year from now. This should not be considered an announcement of a goal date, just yet! As we put together a more formal idea of what needs doing, we can refine the date. But my own estimate, based on the need to approve around 75,000 articles, is that we can do it by then. Comments?
    • 75,000 articles of what size?
    • I favour starting very small - about the size of the xtian bible, for example. That lets us be utterly ruthless in selecting the most encyclopedic and expertly written (most of my stuff will never make it). It also means we can have lots of pictures on the CD version. :)
    • I think someone has to start calculating file sizes to see what will fit. Remember there will also be an offline reader/cache and maybe a low-end web browser like Opera. Never ship a CD or DVD that cannot read itself, even if in practice you expect someone's real daily browser to take up the slack.
      • I wrote a script to do exactly this task: read the wikipedia database and build a static version in html, suitable to be put on CD or DVD. The current english Wikipedia is just about filling a 700MB CD, without including images. At present, stubs and very small articles take a huge amount of the used space, so if the biggest 75,000 articles are selected there would be enough space to include a good fraction of the images (the media directory on wikipedia is about 400 MB). Articles can be compressed, but then you need a reader program and it would much less portable (need different executables for win, mac, linux, etc.) At18 16:39, 23 Aug 2003 (UTC)
        • Surprised it's that small, but, fine, seems like it can fit on one CD then with the pictures. Compression should be avoided until the browsers can all handle a standard compressed-html-file format. Something like .sit would be nice, easy to uncompress on the fly, already ubiquitous on Mac, easy to implement on Win or Linux, and if it was built into the browser, it could be just like a .txt or .htm file. There's time to put this in open source browsers before Dec '04. That would save probably 50% of the text space. In the long run of course, browsers should also be able to directly interpret Wikitax.  ;-)
          • Well, it would take a whole lot more programming, but what if you put the whole thing in one compressed file and then decompressed parts of it on the fly? There are libraries and programs to do this. As for the browser, we should use an open source (or at least freeware without ads) browser instead of Opera. Mozilla Firefox is open source, freeware, and only 3 megabytes more than Opera. LDan 14:34, 2 Sep 2003 (UTC)
  • 1.0 should be completely compliant with the GFDL. Currently Wikipedia has some rough edges (eg printable version) where this probably isn't the case.
    • I think that's a given. Actually Secondary Section, Front-Cover Text, Back-Cover Text, and maybe even Invariant Section on the final distributed version ("this is part of Wikipedia 1.0") will probably play a role to make this work.
      • No invariant sections. Invariant sections are fundamentally non-free. All we need is a GNU FDL notice, and that is effectively invariant already. As for the author issue (not mentioned here yet), I think we should just reprint the Wikipedia:Wikipedians main list on a page near the front. LDan 14:34, 2 Sep 2003 (UTC)
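
The size estimates in the CD/DVD thread above (a 700 MB disc, roughly 400 MB of media, picking the biggest 75,000 articles) can be checked with a small budgeting script. The following is only a rough sketch under stated assumptions: the function name `plan_cd`, the `article_sizes` mapping and the byte counts are hypothetical, and it ignores filesystem overhead and compression.

```python
CD_CAPACITY_BYTES = 700 * 1024 * 1024  # nominal 700 MB CD (assumed figure)


def plan_cd(article_sizes, max_articles, capacity=CD_CAPACITY_BYTES):
    """Pick the largest articles first and report the space left over.

    article_sizes: dict mapping title -> size in bytes (hypothetical input).
    Returns (selected_titles, bytes_used, bytes_left); bytes_left is what
    would remain for images or a bundled reader.
    """
    # Rank by size, largest first, and only consider the top max_articles.
    ranked = sorted(article_sizes.items(), key=lambda kv: kv[1], reverse=True)
    selected, used = [], 0
    for title, size in ranked[:max_articles]:
        if used + size > capacity:
            continue  # this one would overflow the disc; a smaller one may still fit
        selected.append(title)
        used += size
    return selected, used, capacity - used
```

Calling `plan_cd(sizes, 75000)` with real article sizes would give a first answer to "75,000 articles of what size?" and show how much room is left for pictures.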

Things We Need To Do

  • Clarify our goals with a set of specifications for the end product
    • Settle CD-ROM vs. DVD-ROM capacity, number of versions/languages, first
      • Judging by the above, it seems one CD-ROM for the Main English version is fine. Plan on another CD-ROM for Simple English and all other languages - to encourage the use of Wikipedia even in village schools in the developing world, for learning English, for kids learning other languages, and all that. Figure that's a good project for 2005, and could be released say by Dec.31 of 2005, using all the tools and procedures worked out for the Main English project. The multi-lingual stuff is enough work for one year! After that imagine releasing a new Main English version at the end of every even numbered year, and an other-languages-and-Simple-English version at the end of every odd numbered year. Unless we want to release them in time for September start of school year? That might be better.
        • September isn't the start of the school year where I live... - Mark
          • Here neither. I think we should scrap that idea. - Lankiveil
  • Create a procedure and some minimal policies (which will surely change with experience)
  • Come up with a (likely bogus, but still inspiring) timeline for tasks to be completed
    • The above is certainly bogus, and hopefully also inspiring! I'd say aim for a release on September 1, 2004, to start the school year. If we miss, fine, fall back to December 31, 2004.
  • Specify some default public domain sources, like the 1911 British Encyclopedia, that could be automatically added for empty articles and later edited and improved by Wikipedians.
Not sure what you mean by this? What does this have to do with 1.0 in particular? It sounds like a good idea generally, though. Jimbo Wales 18:03, 20 Aug 2003 (UTC)
When the user is writing an article and includes a link to an empty page, s/he could click on an option "Fill up with 1911 B.E.". The user could wikify the text in a second step (and others could improve the article too).
(You mean Encyclopedia Britannica, not British Encyclopedia, right?) We can just ignore non-existent links in a print version,[sic] that is, don't do the notation that indicates we have an article. (I think a close-kerned [[bracket pair]] would look cool.) But CD-ROM versions should preferably have a stub and an ad: "You can write this article at wikipedia.org! Click [here] to connect to the Internet and start editing." Geoffrey 01:10, 22 Aug 2003 (UTC)
(Technically, it's THE ENCYCLOPÆDIA BRITANNICA ELEVENTH EDITION. :P) That's only useful if you were working from a CD-R[W], so that updates on the site could be reflected on the disc. lysdexia 19:16, 8 Nov 2004 (UTC)
  • Define a wikipedia website quick and stable base.
  • What else? Add things here...
  • Will we need some sort of dispute resolution or arbitration system for the more controversial articles? Jfeckstein 17:20, 20 Aug 2003 (UTC)
Probably. For most articles, just having 2 trusted people say 'yeah, this is good' seems good enough for me, but we have to be careful about people gaming the system to get their own POV through. Jimbo Wales 18:03, 20 Aug 2003 (UTC)
How about at least some number of 'points' needed at the date of finalization, and more on controversial articles? Each user can give +1 or -1 per revision of article (or zero of course), each sysop +1.25 or -1.25, each developer +1.5 and -1.5, Jimbo +2 and -2. Suspected vandals +.5 or -.5, soft-banned vandals -1 and +1 (yes, they're flipped). Or something... Geoffrey 01:10, 22 Aug 2003 (UTC)
It's absurd to confuse the social harmony of Wikipedia with editorial quality. Consider any factor other than someone's completeness, veracity and good-faith effort to get facts right, and the project falls apart. Giving sysops or the developers more points serves what purpose? These are people distracted by the admin process, who are clearly concentrating on something other than article quality. They are devoted but this is the wrong way to reward them. Since this is "culling", it makes more sense to "disapprove" than "approve" articles anyway.
Two trusted people say yes, and no trusted people say no. Vetos would work well, and be appropriately conservative. Any article with an NPOV or accuracy dispute should be automatically off-limits.
Yes, although there is not really a Wikipedia:trust model yet. Those interested in this should maybe start there.
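
The two approval schemes discussed in this thread can be written down in a few lines. This is a minimal sketch, not an agreed policy: the role names, the `WEIGHTS` table and both function names are assumptions made for illustration, with the weights copied from Geoffrey's proposal and the second function following the "two trusted yes, no trusted no" rule.

```python
# Vote weights from the points proposal above; role names are hypothetical labels.
WEIGHTS = {
    "user": 1.0,
    "sysop": 1.25,
    "developer": 1.5,
    "jimbo": 2.0,
    "suspected_vandal": 0.5,
    "soft_banned": -1.0,  # flipped: the vote counts against what was chosen
}


def points(votes):
    """Total score for one revision.

    votes: list of (role, vote) pairs, with vote in {+1, 0, -1}.
    """
    return sum(WEIGHTS[role] * vote for role, vote in votes)


def approved_by_veto_rule(votes, trusted):
    """Two trusted people say yes, and no trusted people say no.

    votes: dict mapping editor name -> +1 or -1.
    trusted: set of editor names whose votes count.
    """
    yes = sum(1 for editor, v in votes.items() if editor in trusted and v > 0)
    no = sum(1 for editor, v in votes.items() if editor in trusted and v < 0)
    return yes >= 2 and no == 0
```

The veto rule is the more conservative of the two: a single trusted "no" blocks an article regardless of how many points it has collected.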
  • Software to convert Wiki-markup, including HTML, into a printer-friendly format (e.g. PS/PDF) automatically. This would be a great boon to us before and after "1.0".
Is PDF proprietary?
PS and PDF are well-documented, and there are many free software tools for generating and viewing them. DanKeshet
Yes, PDF is proprietary, but it's impossible for Adobe to exert its 'proprietation' over PDF because the documentation and free use of PDF is so extensive. Even Apple fundamentally uses a form of PDF to display its entire screen. We can technically get sued I believe, but it'll likely be thrown out because there has been so much use elsewhere of PDF that they should have sued those uses first. Geoffrey 01:10, 22 Aug 2003 (UTC)
Anyone is allowed to create PDF documents; Adobe has given explicit (limited) permission. From the PDF Technical Spec, Adobe reserves the copyright, and intends to enforce it, particularly to prevent people from making their own versions of PDF (as happened with HTML in the browser wars). However, basically Adobe explicitly gives permission to prepare files as long as they conform to the PDF spec, create drivers/programs that prepare PDF to spec, and to create PDF readers. Thus, for example pdflatex is totally legal (and actually not a bad path to get from wiki markup to PDF). BTW, I'm paraphrasing the Portable Document Format Reference Manual version 1.3 by Adobe Systems Inc., March 11, 1999.
Wiki-Markup is also proprietary. Converting it into anything else remains a problem as long as there is no "Wiki-markup" defined and standardised in EBNF. Up to now there are only suggestions and a bunch of Regular Expressions. --Nichtich
Wiki-markup is not proprietary. There is nothing stopping anybody else from adopting wiki-markup; indeed, our wiki-markup is a fork from Usemod's. It is not very well documented, though. DanKeshet


Wiki markup is not proprietary because anyone who wants can implement something in Wiki markup. Microsoft Word .DOC is proprietary because it's extremely hard for just anyone to create a Word file. Most Wiki markup is documented in Wikipedia:How to edit a page. Also, Wiki markup is GPL'd (it's part of the software) - .DOC format is Microsoft's copyrighted intellectual property. Geoffrey 01:10, 22 Aug 2003 (UTC)
You're right - Wiki-Markup is not proprietary. But it is also not an open standard, since there is no specification. Somebody should urgently start an RFC or something like it. And the Wikipedia software should conform to the standard, not pretend to be one. --Nichtich 23:18, 23 Aug 2003 (UTC)
Yes. Read m:simple ideology of Wikitax, it is a start towards an RFC. So is m:Wikitax but it's really not focused on making participation easier as the "simple ideology" is. And there are issues in geographic representation, names, etc., that are dealt within the "simple ideology".
Software to convert Wiki-markup: wget? :-) Make a stylesheet that doesn't put sidebars, topbars, etc. - just shows the marked up text. Like nostalgia but ten levels simpler. Then wget the site recursively - preferably from a personal computer running MediaWiki and wgetting localhost, not off the real Wiki. Geoffrey 01:10, 22 Aug 2003 (UTC)
Would cut the load a *lot*. This is a local client solution, and a good one.
Use pdflatex. Wikimarkup -> shell/php/perl script of choice -> pdflatex = book-ready pdf.
We could do a quick mock-up in a hurry, I think. Markup to print layout is not a problem. — Sverdrup 14:22, 24 Mar 2004 (UTC)
I've written a script that does this: http://wiki.auf-trag.de/ -- Stw | Talk 17:42, 21 Jun 2004 (UTC)
Why not keep it simple and author 1.0 using HTML with a good print style sheet (CSS)? HTML is accessible and flexible, and usable on the Web, CDROM, and print. PDF production is more complex. —Michael Z. 18:17, 2004 Aug 24 (UTC)
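
To make the "wiki markup -> script -> pdflatex" idea in this thread concrete, here is a toy converter for a tiny subset of the markup (headings, bold, italics and links). It is a sketch under loud assumptions: it is regular-expression based, it is not the MediaWiki parser or any of the scripts mentioned above, and the function names are invented for illustration. Its output would still have to be wrapped in a LaTeX document and passed to pdflatex separately.

```python
import re

# Characters LaTeX treats specially, and their escaped forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def wiki_to_latex(line):
    """Convert one line of (a small subset of) wiki markup to LaTeX."""
    line = escape_latex(line)
    # == Heading == becomes \section, === Heading === becomes \subsection.
    heading = re.fullmatch(r"(={2,3})\s*(.+?)\s*\1", line)
    if heading:
        command = "section" if len(heading.group(1)) == 2 else "subsection"
        return "\\" + command + "{" + heading.group(2) + "}"
    # [[Target|label]] keeps only the label; [[Target]] keeps the title.
    line = re.sub(r"\[\[[^\]|]*\|([^\]]*)\]\]", r"\1", line)
    line = re.sub(r"\[\[([^\]]*)\]\]", r"\1", line)
    # Bold must be handled before italics, since ''' contains ''.
    line = re.sub(r"'''(.+?)'''", r"\\textbf{\1}", line)
    line = re.sub(r"''(.+?)''", r"\\emph{\1}", line)
    return line
```

Tables, templates, images and nested lists are deliberately left out; those are the cases where the lack of a formal grammar, raised earlier in the thread, starts to matter.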
  • I think you should be able to browse from the main page to any article. Jfeckstein 17:42, 20 Aug 2003 (UTC)
Not sure what this one means. Jimbo Wales 18:03, 20 Aug 2003 (UTC)
Sorry. I meant that for each article, there should be a chain of intermediate articles which lead to the main page. Ie Main->science->physics->particle physics->Fermilab. Jfeckstein 22:23, 20 Aug 2003 (UTC)
A "main page" makes no sense on paper. On CD, the requirements are different - a CD main page would have *no* dynamic content - except possibly an auto-generated "on this day" section.
There's no reason it can't be done as a dynamic main page simulator on a CD-ROM. It's just a cgi instead of a page.
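
The requirement that every article be reachable from the main page through a chain of intermediate articles is a graph reachability check. A minimal sketch using breadth-first search follows; the `links` mapping and the function name are hypothetical, and a real run would have to build `links` from the link table of the selected 1.0 articles.

```python
from collections import deque


def path_from_main(links, target, start="Main Page"):
    """Return a shortest chain of titles from start to target, or None.

    links: dict mapping title -> list of titles that page links to.
    """
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == target:
            # Walk the back-pointers to rebuild the chain.
            chain = []
            while page is not None:
                chain.append(page)
                page = previous[page]
            return chain[::-1]
        for nxt in links.get(page, []):
            if nxt not in previous:
                previous[nxt] = page
                queue.append(nxt)
    return None
```

Any article for which this returns `None` is an orphan that needs a link from some intermediate page before release.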
  • Better short URLs (i.e. www.wikipedia.org/cocoa to go to www.wikipedia.org/wiki/cocoa)
That was discussed at some point, but disregarded because they'd have to support the URLs in the root. I think we should phase out the /wiki/ URLs with "301 Moved Permanently" redirects, use the current interpretation of 'subpages' of lowercase "/wiki/","/upload/","/w/", and make the canonical form of these "/", "/_upload","/" (the only real thing in w is wiki.phtml, right? and that doesn't deserve an article...) If this is not possible, can we rewrite "/x.html" and "/x/" (x being an article, not a folder or file) as "/wiki/x"? Geoffrey 01:10, 22 Aug 2003 (UTC)
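
The fallback suggested at the end of that comment, rewriting "/x", "/x.html" and "/x/" to "/wiki/x" while leaving the existing directories alone, could look roughly like the sketch below. It only illustrates the mapping: the function name and the `RESERVED_PREFIXES` tuple are assumptions, and on the real site this would presumably be done in the web server's rewrite rules rather than in Python.

```python
# Paths that already mean something on the site and must not be rewritten
# (taken from the directories named in the comment above).
RESERVED_PREFIXES = ("/wiki/", "/upload/", "/w/")


def rewrite_short_url(path):
    """Map a short article URL path to its /wiki/ form."""
    if path == "/" or path.startswith(RESERVED_PREFIXES):
        return path  # site root or an existing directory: leave untouched
    title = path.strip("/")
    if title.endswith(".html"):
        title = title[: -len(".html")]
    return "/wiki/" + title
```

The reserved prefixes are exactly the collision risk mentioned above: an article whose title matched one of those directory names could not be reached by its short form.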
  • Accessibility: 1.0 should conform to a basic standard of markup accessibility. The HTML template and wiki text are a great start. A basic audit of image ALT tags, table markup, etc. is a good idea. If the final product is PDF, then it should carry through accessibility features from the HTML. Learn about the basics at Dive into Accessibility, or in Joe Clark Building Accessible Websites New Riders, 2002 [ISBN 0-7357-1150-X] —Michael Z. 18:25, 2004 Aug 24 (UTC)

Articles we should write

(With those last three done, if we don't have a real map, we can generate one. There are examples of this on m:Maps).

Articles we should expand/improve

Article Tracking

OK, I just discovered Wikipedia:Wikipedia_maintenance...a link to there or to Wikipedia:Most_wanted_articles etc. from Special:Wantedpages might be good for future navigators. (I would add it myself, but this page isn't editable.) This takes care of many of my previous comments...

Browsing around these many manually maintained lists (these pages need copy editing, these others need expansion, these over here need NPOV, this is a list of all the images in Wikipedia, etc.), it seems a little automation might be helpful. That might be accomplished with a Maintenance Status / Ratings section for each article, perhaps at the top or bottom of the Talk page. People could come by and either flip switches "In need of copy editing", "NPOV problem", "not in English", etc. These could be used to automatically generate lists, and could be unflipped when they are fixed.

For more subjective issues, like organization, completeness, quality, etc., you might take numerical ratings from passersby and display the average over the past N weeks, or reset when there are major changes, or something.

Looking at these statistics across articles might give a better idea of how mature various areas of the Wikipedia are. Automating certain list-making tasks would also free up more human labor for actual writing and editing.

--Beland
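
The flag-and-rating idea above could be prototyped along these lines. This is a sketch only: the flag names, the data layout and both function names are made up for illustration, and nothing like this is claimed to exist in the software.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_maintenance_lists(flags):
    """Turn per-article flags into per-flag lists of articles.

    flags: dict mapping title -> set of flag names, e.g. "NPOV problem".
    Returns a dict mapping flag name -> sorted list of titles.
    """
    lists = defaultdict(list)
    for title, names in flags.items():
        for name in names:
            lists[name].append(title)
    return {name: sorted(titles) for name, titles in lists.items()}


def recent_average(ratings, today, weeks):
    """Average the ratings given in the last `weeks` weeks.

    ratings: list of (date, score) pairs from passersby.
    Returns None when there are no recent ratings.
    """
    cutoff = today - timedelta(weeks=weeks)
    recent = [score for day, score in ratings if day >= cutoff]
    return sum(recent) / len(recent) if recent else None
```

Unflipping a flag is then just removing it from the article's set; the generated lists update on the next run, which is the labor-saving part of the proposal.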

Comments

There seem to be a lot of holes in the topics about medicine -- Youssefsan 17:29, 20 Aug 2003 (UTC)

Can you give examples? I don't disagree, I'd just like to get a handle on the shapes of our holes! Jimbo Wales 18:03, 20 Aug 2003 (UTC)
Just go to Medicine and check any of the links there. There are gaping holes in the whole field. Kosebamse 21:32, 20 Aug 2003 (UTC)
It's not limited to medicine. Look at monkey. Compare with the huge britannica article on the subject! Maybe there should be an organized effort to track holes? -- Arvindn 04:57, 21 Aug 2003 (UTC)
Yeah, look at Wikipedia:WikiProject_Ecoregions for a truly big hole. Not sure every Phylum even has an entry yet.
Wikipedia is loaded with holes, but it also has a lot of specialized information that is very useful. Like with any opensource/opencontent project, we can't just tell someone to fix it; we just do it ourselves. This Wikipedia paper 1.0 project won't do anything more to fix that except to (possibly) delegate jobs, which they still don't need to listen to. Remember, at the beginning of Wikipedia, Wikipedia was all one big hole about everything.

What's this about "approving" articles? How is this supposed to work? --Wik 17:46, Aug 20, 2003 (UTC)

I don't like this idea very much. I think the other Wikimedia encyclopedia is meant for this goal.
Well, that's what we're here to decide. My current vision is vague, but would involve having at least 2 people 'flag' an article as 'good enough' -- not 'perfect' but 'good enough'. It should be possible to replace those with later versions before the final release date, of course. And nothing will change about the existing Wikipedia website. We want to come up with a system which permits us to seamlessly and magically produce a final product, without interrupting ongoing work. Jimbo Wales 18:03, 20 Aug 2003 (UTC)
So as well as Wikipedia:Brilliant prose, there'd be Wikipedia:Adequate prose? -- Jim Regan 08:35, 21 Aug 2003 (UTC)
Yes, that's a good place to start. Having people flag articles that *can* be brilliant prose focuses energy on going from "adequate" to "good", and some of those will end up "brilliant".
Imagine that what is now Wikipedia:Brilliant prose would get a note on it saying "Wikipedia 0.9 includes this version of this article" which goes away if the article is changed. The "adequate" articles say "Wikipedia 0.9 will be including a polished version of this article" to inspire people to work on it. That doesn't go away if people edit. So far, no promises about 1.0. Snapshot of all articles not presently in NPOV dispute or marked as stub article constitutes Wikipedia 0.8. The bar for getting from 0.8 to 0.9 is set high, at present Brilliant Prose level. If this doesn't work out, well, fine, we can always fall back, approve Brilliant Prose for 1.0, and adequate stuff for 0.9, while letting fertilizer flow from 0.8, asking only for it to become adequate.

I still think a CD 1.0 and a paper 1.0 would be very different beasts. On a CD, it wouldn't matter too much if we had as much material on The Simpsons as we had on medicine (say). On paper, it would be more noticeable. I personally would like to be involved in the work on a paper edition -- a one-volume edition, which would require a fair amount of trimming in some areas, and some late-night cramming in others -- Tarquin 19:16, 20 Aug 2003 (UTC)

Gotta agree that CD and paper are different. However, I see the paper version as building on top of the CD version, i.e, we get to the CD version first and from there to the paper version. And I think the CD version is a worthy subgoal in itself; it could bring in a lot more new hands than we think -- Arvindn 05:21, 21 Aug 2003 (UTC)
Ditch all the Anglo-American pop-culture from every CD version. It's just free advertising, and is confusing to the people who need it most, in poor countries.
This is also an argument to ensure that idiom dictionary functions stay on Wikipedia consistently, even if an article in some cases is short enough to move to Wiktionary. This will be a huge strategic edge for Wikipedia over such guides as Longman's American Idioms, if they get that as part of the package. No idiom can really be explained properly in a dictionary given it needs history and culture to be referenced. So there are both integrity and marketing reasons to keep idioms here. Also how nice if Wiktionary could just fit on a floppy!

One subject that needs a thorough think-thru is the links to nonexistent articles. There are several points that we should consider:

...which seem all to belong in Wikipedia:link editing

  • On one or more sites that have "borrowed" Wikipedia content, links to nonexistent articles are quietly removed. While this is a defensible solution, I would also caution that these are votes for articles we need to write.
  • One could make a decision that any unwritten article with (say) five or more links needs to be written, & the rest will be removed. However, there are several groups of links to nonexistent articles which should be combined. An example I can immediately think of would be the articles pointing to the Controversy of the Three Chapters or the Schism of the Three Chapters - which deals with an important disagreement with political overtones of the 6th & 7th centuries.
    • Many articles deal with several aspects of a subject, and very often a link only tends to invoke one of them. So we should really be liberal with redirects, which are at least standard ways to simplify invoking one article. The practice of putting a link to an article with different anchor text is very hard to translate to a paper version. If CD is like paper, one thing that we may have to do is rephrase sentences that don't use the proper article title or a reasonable redirect as their anchor text.
Simple solution to anchor text not equal to article name: footnote or parentheses saying "see X", e.g. "one of the best things about great apes (see Primates) is that they're great." Not amazingly elegant, but seems okay, and wouldn't require much work.
With hyperlinking, we shouldn't need to depend on liberal use of quid vide (q.v.) -- or "which see". In reference to your example, the text should read: "one of the best things about [[Primates|great apes]] is that they're great." If that's not what you are saying, I don't follow your argument. -- llywrch 00:57, 25 Aug 2003 (UTC)
I was only referring to translating Wikipedia to a printed paper version, where hyperlinks aren't available, and where having to rephrase links for the entire 'pedia is a big hassle. For an electronic version, I agree there's little need for q.v./sees. Zashaw 02:44, 25 Aug 2003 (UTC)
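
The "see X" convention proposed for the paper version can be applied mechanically to piped links. A small sketch follows; the regular expression covers only plain `[[Target]]` and `[[Target|label]]` links, and the function name is invented for illustration.

```python
import re

# [[Target]] or [[Target|label]]; group 2 is None when there is no label.
_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def links_for_print(text):
    """Replace wiki links with text suitable for paper.

    A piped link whose label differs from its target becomes
    "label (see Target)"; any other link becomes just the title.
    """
    def render(match):
        target, label = match.group(1), match.group(2)
        if label is None or label == target:
            return target
        return f"{label} (see {target})"

    return _LINK.sub(render, text)
```

Run over the example from this thread, it turns the piped link to Primates into "great apes (see Primates)", which avoids rephrasing sentences by hand.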


  • And lastly, the eternal problem of links that fail to lead to existing articles due to spelling, capitalization, or failure to follow Wikipedia standards.

I guess you could consider me obsessed about links. -- llywrch 21:39, 20 Aug 2003 (UTC)
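
The "five or more links" rule from the list above, including the point that variant titles should be counted together, can be sketched as follows. All names here are hypothetical: `links` maps each article to the titles it links to, `existing` is the set of written articles, and `aliases` maps a variant title to the one it should be combined with.

```python
from collections import Counter


def wanted_articles(links, existing, aliases, threshold=5):
    """Split missing link targets into "write it" and "remove the link".

    Each linking article counts once per missing target, after variant
    titles have been merged through `aliases`.
    Returns (to_write, to_unlink) as sorted lists of titles.
    """
    counts = Counter()
    for source, targets in links.items():
        # Merge variants first so one article cannot vote twice for a topic.
        canonical_targets = {aliases.get(t, t) for t in targets}
        for target in canonical_targets:
            if target not in existing:
                counts[target] += 1
    to_write = sorted(t for t, n in counts.items() if n >= threshold)
    to_unlink = sorted(t for t, n in counts.items() if n < threshold)
    return to_write, to_unlink
```

Links that fail only because of spelling or capitalization could be folded into `aliases` as well, so they count as votes for the correctly titled article.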


One thing to consider about a print version is the order of the articles. Obviously we can't just alphabetize the titles as they are, and thereby list people under their first name. --Wik 22:09, Aug 20, 2003 (UTC)

Maybe List of people by name could be used to generate the Lastname, Firstname listing? -- Jim Regan 08:35, 21 Aug 2003 (UTC)
Disambig/redirects in printed form. If "Wales, Jimbo" redirects to "Jimbo Wales" (does it?) either say "Wales, Jimbo. See Jimbo Wales," or "Wales. 1. Founder of Wikipedia, see Jimbo Wales. 2. Inventor of the wheel, see Wales, Frumpysnarf. 3. Wales (rest of Wales here)." Geoffrey 03:09, 22 Aug 2003 (UTC)
Actually, it's worth considering just alphabetizing titles as they are, at least for 1.0. Yes, it's different than what other encyclopedias do... but it's certainly a simple convention to explain! Simply "cleaning up" for a 1.0 will be a major effort, and I doubt that switching all names could be done well. Instead, concentrate on getting strong content, and tell people it's sorted "the other way". A few redirects could help clue people in, as well as a note in the front. One minor problem is that entries for related people with the same last name will be separate, but many related people have different names anyway. Once it's released, and people try it out, we can get feedback to determine if the (major!) effort of re-alphabetizing would be worth it.
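
For anyone who wants to gauge how hard the re-alphabetizing would be, here is a deliberately naive sketch. It assumes we already have a set of titles known to be personal names (for example, harvested from List of people by name; that input is an assumption, as are the function names) and simply treats the last word as the surname.

```python
def print_heading(title, people):
    """Return 'Lastname, Firstname' for known people, else the title unchanged."""
    if title in people and " " in title:
        first, last = title.rsplit(" ", 1)
        return f"{last}, {first}"
    return title

def print_order(titles, people):
    """Sort titles the way a paper index would, ignoring case."""
    return sorted(titles, key=lambda t: print_heading(t, people).casefold())
```

With `people = {"Jimbo Wales"}`, the titles "Jimbo Wales", "Kangaroo" and "Xylophone" come out as Kangaroo, Jimbo Wales (filed as "Wales, Jimbo"), Xylophone. The last-word rule is wrong for many names (particles such as "van" or "de", family-name-first conventions, titles of nobility), which is exactly the part that would need human effort.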
  • Given the history of GPL software, I strongly prefer a version number less than 1.0 for the first offline release.
    • Presumably, to be "pushing to 1.0" means having a "0.8" soon, and a "0.9" between now and then. So let's just say everything in Brilliant Prose is at 0.9 and everything adequate is at 0.8, and let's then argue about what "adequate" is, very inclusive requiring disapproval to exclude, or exclusive requiring approval to include.
  • Are you talking about a standalone Wikipedia on CD/DVD or a printed extract of Wikipedia?

--Nichtich

Gutza's rant --Gutza 22:33, 20 Aug 2003 (UTC)

I think one of our prime foci for this should be organization... the sheer volume of data gets bulky. Let's not forget many versions of EB have Index to the Index as a separate (multi-hundred-page) volume. I'd like to see a few categorical sub-pedias, for example, off the top of my head:

  • The Chronicle: a general history, in order by historical periods, within those periods (which would probably be centuries for most things pre-Industrial Revolution) group by region or topic.
  • The Gazetteer: All our geographic/regional data. Would require more fleshing out of the automated entries, and a lot more maps, etc...
  • The Taxonomy: Organisms. We've got a good template for the sidebar, work from that.

There's a lot more of these I can think of... I'll write it up in full as a personal subpage and link here soonish. -- Jake 04:06, 2003 Aug 21 (UTC)

I'm majorly in favor of making a paper version of "Wikipedia: World History" or "Wikipedia: Quantum mechanics" or something like that before going ahead with the full effort because it will help us decide whether we want to do it at all and if so, what is required to do, without sidetracking too much effort from the main project. -- Arvindn 05:12, 21 Aug 2003 (UTC)

Yes, a good test of capability and process. I'd suggest doing Ecology first, though, since ecological borders and processes don't really change fast, then History (political borders and movements of peoples and changes in languages), then Geography (since place names and borders are totally dependent on history). That gives us a good look at the Earth. Then all Plants, all Animals and Biology. Another group can do mathematics, quantum mechanics and particle physics, things that are globally standard with only one set of names, and a small cult of high priests each that actually thinks they are real. That group should then be forced to do all of Religion next.
I'd go further than that - I think it would be good for a crack team of three or four people to spend a week doing a CD or paper version of "Wikipedia - Presidents of the United States" or "Wikipedia - Lord of the Rings" and report back how they got on. P.S. history is real and mathematics isn't? - that's an intriguing philosophical stance. Onebyone 02:59, 14 Dec 2003 (UTC)

Jimbo, I love the idea of a 1.0 release. It gives us a real focus and impetus. However, I'm not sure about the priority you place on a print release.

Firstly, printing a complete set would surely have a very high cost per copy. One would almost wonder whether it would be cheaper to give worthy recipients a second-hand computer to run a "Wikipedia-disc" on than print the complete volumes!

Secondly, to get an encyclopedia edited down to manageable size with some semblance of balanced coverage will be a very considerable task, requiring many longer articles to be truncated. This will require skilful editors. If the same long-time respected contributors to the Wikipedia spend their time working on the 1.0 release, the state of the live Wikipedia will inevitably suffer.

Thirdly, some of our articles use complex HTML layouts, and many have images of variable sizes and quality. Converting these into a form suitable for printing may be difficult, or in many cases impossible, and will require some expertise in print graphics which is of course available but possibly not amongst Wikipedia's current contributor base.

So, all in all, I wonder whether the very considerable extra effort required to produce a 1.0 print edition is worth it. At the very least, I would argue we should aim for a "0.9 CD-only" edition first.

Or even just a CD-size archive that can be downloaded and widely tested on all kinds of machines, including low-end ones like are still used in many schools.

Another issue is whether we can find funding to pay people to do some of the grunt work of putting 1.0 together. --Robert Merkel 06:00, 21 Aug 2003 (UTC)

Full agreement with Merkel. I don't think enough of us have fully considered how much time this would take away from working on the actual Wikipedia; and the cost of this in print form would likely be prohibitive. I would have thought that Wikipedia 1.0 would be on CD-ROM only. In a year or two, we could produce Wikipedia paperbacks on specific topics, such as Science, History, etc, each of which would be a lot more manageable to edit, cheaper to produce, and easier to make some sort of profit from. (I assume profit would go to the Wikipedia foundation?) RK 16:04, 22 Aug 2003 (UTC)
Paperbacks for profit to fund things is a good idea. But you must have a Wikipedia:trust model before you can have a Wikipedia:revenue model. Else you go the way of Enron.
I too like the idea of pushing toward a 1.0 just so we have a target, but I am unsure about a printed edition. How about a printable edition (on CD-ROM), a more modest effort to see if all 75,000 of our articles print out in a reasonable form? (I will do articles 67,003-67,217). Jfeckstein 03:52, 24 Aug 2003 (UTC)

I doubt if all the stuff on the cities belongs in a paper 1.0 (or even a CD, takes up space). If Random page hits a city page that often (try it, betcha it'll reach a city page) flipping through the paper version should also bring up these Rambot entries, and that's not a good thing. Also redirects and stubs...hey, can a bot be written to turn redirects into pipe links, except for those specially marked not to be changed? Geoffrey 01:10, 22 Aug 2003 (UTC)
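
On the bot question: the rewriting itself looks simple enough. Below is a minimal sketch, not an existing bot. It assumes a dict mapping redirect titles to their targets and a set of redirect titles marked "do not change"; both inputs and the name `pipe_redirects` are hypothetical, and titles are matched exactly (no case or underscore folding).

```python
import re

# Matches [[Title]] and [[Title|label]].
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def pipe_redirects(text, redirects, keep=frozenset()):
    """Point links at the redirect's target while keeping the visible text."""
    def fix(m):
        title, label = m.group(1).strip(), m.group(2)
        if title in keep or title not in redirects:
            return m.group(0)  # not a redirect, or marked to be left alone
        shown = label if label is not None else title
        return f"[[{redirects[title]}|{shown}]]"
    return LINK.sub(fix, text)
```

So `[[Great apes]]` becomes `[[Primates|Great apes]]` when "Great apes" redirects to "Primates", and anything listed in `keep` is passed through untouched.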


I think something to consider is some automated assistance with fact checking. In a pinch we could simply use a standard name off the talk page to record our observations. More ideally we could have a standard format wiki page to record work on. It would have data entry fields for facts/statements, assessment, source, etc. This would quickly improve our overall quality by allowing large teams of people (including neophytes) to dissect an existing article and document the correct facts and sources of verification while removing detected errors. This would be useful to both the online Wikipedia development process as well as the push for 1.0 release. user:mirwin


I support what you call 1.0 because it provides some focus, an endpoint, and an opportunity for proper editing that presently does not exist. Leaving aside markup, format conversion, and other strictly mechanical issues, there is a need for:

  1. Review, and a means to document what has been reviewed, how, and by whom;
  2. Categorization of content as to suitability for a print version;
  3. A means for setting a version marker for each article to show which revision is intended for print publication.

These are valuable mechanisms independent of 1.0 and independent of whether 1.0 is primarily geared towards a print or CD-rom distribution.

I offer the observation that a points-based review system such as that used at slashdot is both weak and a poor fit for the Wiki workflow model. There is too much potential to "game" the system where points are involved and it does nothing to satisfy a skeptical reader about an article's quality. I suggest a more freeform approach, where reviewers can approve content in one of several categories, to wit:

  1. Fact checking
  2. Spelling and grammar
  3. POV
  4. Suitability of title
  5. Completeness
  6. Organization
  7. Overall subjective quality

Ideally there would be software support for this, with boxes for yes, no, no opinion, along with user names such that we might see an article with a list of reviewers, dates, and revisions reviewed; should we see that prominent Wikipedians have covered all areas we would conclude an article leaves little room for improvement. Then we might have a search feature where we can look for articles that are unreviewed, and ideally we could each choose which reviewers we wish to consider when so doing. I would think that the reviews would be tied to a particular revision. Perhaps also follow-on reviewers could assert that changes made since a certain revision are minor and appropriate.
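
To make the paragraph above concrete, here is one possible shape for such review records. It is a sketch only: the `Review` record, the `uncovered` helper and the field names are invented for illustration, and dates are left out for brevity. Each review is tied to one revision, and the reader chooses which reviewers to trust.

```python
from dataclasses import dataclass

CATEGORIES = (
    "Fact checking", "Spelling and grammar", "POV", "Suitability of title",
    "Completeness", "Organization", "Overall subjective quality",
)

@dataclass(frozen=True)
class Review:
    article: str
    revision: int
    reviewer: str
    category: str
    verdict: str  # "yes", "no" or "no opinion"

def uncovered(reviews, article, revision, trusted):
    """Categories lacking a 'yes' from any trusted reviewer for this exact revision."""
    approved = {
        r.category for r in reviews
        if r.article == article and r.revision == revision
        and r.reviewer in trusted and r.verdict == "yes"
    }
    return [c for c in CATEGORIES if c not in approved]
```

An article is "fully reviewed" for a given reader when `uncovered(...)` returns an empty list; a search for unreviewed articles is the same check run over every title. Follow-on assertions that later changes are minor would need an extra record type not shown here.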

There must be some means of categorization, so that we can mark those articles that will never be suitable for a print edition, and perhaps also so we can mark those articles that are by consensus considered complete for 1.0.

Anything in Wikipedia:brilliant prose is already complete for 0.9 except perhaps for link trimming we may choose to do. But I think redirecting any missing articles to more general ones is better than changing anything in a brilliant article.

There will also be a need to mark which revision is considered most suitable for print publication. Exactly how this is done is likely to prove contentious, but the fact remains that we will need to freeze some articles when they are good enough and sufficiently reviewed, while still working on others. We may be best served by making these markers relatively easy to move at first, with increasing difficulty of movement as we reach closure.

Kat 21:14, 22 Aug 2003 (UTC)


I advocate strongly for a version on CD-ROM, even if it requires 2 or 3 CD-ROMs and must be installed on a hard disk to be usable (this doesn't exclude a DVD version). DVD burners are not very common, and even DVD readers are not very common in third world countries, where many people would be interested in a free encyclopedia. Ericd 20:12, 23 Aug 2003 (UTC)

Yes true. Save DVD for Wikipedia 3.1 ;-) The fully UN-funded version.

As some articles use complex HTML layout, a distribution in HTML seems obvious to me, but how do we implement a search feature? Ericd 23:15, 23 Aug 2003 (UTC)

That's just a "Simple Matter of Programming". Text searching and retrieval is an extremely well-studied problem. One of my lecturers wrote the book on the problem, in fact. I have no doubt that a solution can be put together in the time frame Jimbo is proposing out of bits of free software lying around on the net and maybe a bit of extra work. --Robert Merkel 03:55, 24 Aug 2003 (UTC)
Yes, but HTML is a simple solution, as it allows browsing Wikipedia with any web browser. Sadly, a web browser is not designed for text search and retrieval on a CD-ROM.
Ericd 04:08, 24 Aug 2003 (UTC)
Please see my comment near the start of this page. The CD I am talking about already has search builtin, but only for page titles. It's written in javascript (should be compatible with most browsers) and works quite well. Adding search for articles text may be doable, but probably difficult. At18 09:52, 24 Aug 2003 (UTC)
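
For a sense of scale on the full-text question, the textbook approach is an inverted index built once when the CD is mastered. The sketch below is not the JavaScript on the CD mentioned above; it is a minimal illustration in Python, assuming the articles fit in a dict of title to plain text, with made-up function names.

```python
import re
from collections import defaultdict

def build_index(articles):
    """Map each lowercased word to the set of titles containing it."""
    index = defaultdict(set)
    for title, text in articles.items():
        for word in re.findall(r"\w+", f"{title} {text}".casefold()):
            index[word].add(title)
    return index

def search(index, query):
    """Return titles containing every word of the query, sorted by title."""
    words = re.findall(r"\w+", query.casefold())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

The hard part on a CD is not this logic but shipping the index in a form a browser can load quickly on a low-end machine, which is presumably why title-only search came first.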


EoFT

As a suggestion to refactor, consider combining the discussion of "approval versus scoring versus disapproval/vote" for articles with that already in Wikipedia:approval_mechanism.

And, for the discussion of "who should be allowed to participate" or "whose votes to count" or "what differential on scoring should apply for trusted contributors versus known trolls versus anonymous IP numbers", a quite different issue, I'd suggest moving the stuff on Advogato (not a great model BTW) to Wikipedia:trust model (or to a talk file on that subject first).

Then anything regarding self-funding or donation options or payments that keep the process going, such as RK's suggestion to sell paperback compilations, can go to Wikipedia:revenue model (or to a talk file on that subject first).

There are meta: references on all of these, but to put something in the [[Wikipedia:]] space is a signal that some decisions are about to be made.

With those three hived off as separate problems the question of "what to include" or "how to deliver" or "what timelines" are easier to consider.

Hope this helps. EofT

Also, there is now a Wikipedia:list of central issues to complement the Wikipedia:list of controversial issues and Wikipedia:Sysop reading list to make it a bit easier to find the connections between community (which is not part of a CD-ROM version obviously) and the encyclopedia as a work of writing. These will aid the push to 1.0 by helping differentiate content from community. As 1.0 is "content", we need a separate (maybe distracting) thing for the "community" to pay attention to, that will ultimately pay off as better streamlined ways to deal with content. EofT

Obviously, refactor any of the above as you like when you do the "push" page. They're just suggestions. But planning a parallel push to get the community to read and understand the central issues and come to agreements on them, may be more important to the content plan than you think. EofT

What should be excluded ??IS THIS SERIOUS??

I have a strange feeling that this part of this article is just someone taking[sic] fun of us.

It was added on 11:54, 26 Aug 2003 by 213.105.87.216 even without signing. See http://www.wikipedia.org/w/wiki.phtml?title=User:Jimbo_Wales/Pushing_To_1.0&diff=1347479&oldid=1347394

Was this just a troll making fun of us? When I first read this page, I thought, Jimmy wrote this. Then I thought... NO, this can not be. But people took this comment seriously. What happened here?

If someone could clarify who added this and why it was commented on and taken seriously, please let me know. Thanks, Fantasy 15:30, 3 Oct 2003 (UTC)

I dunno if this was serious, or if other people took it seriously. I just know that if the 1.0 is excluding all this, there is very little chance I will be part of it :-) Anthère
Jimmy's reply on the mailing list:
"No, to clarify, I didn't write that part. I do think that some things should not be included in the 1.0 version, but the bulk of the things listed *should* be included."
I'd say it's pretty obviously talking about the paper 1.0 - a volume of limited space - not about the CD-ROM 1.0. As such, I think it's worthy of serious consideration at least - David Gerard 14:47, 25 Jul 2004 (UTC)

What should be excluded. First a sweeping statement - the majority of wikipedia articles are irredeemiably[sic] trivial or otherwise lacking and should not be included in 1.0, no matter how well they are written.

Below is a list of areas that should be excluded unless the subject in question is particularly famous or a landmark in the field. As you look through the list you will clearly not agreed[sic] but please keep the preceeding[sic] sentence in mind.

Articles that should not be in 1.0

  • Settlement articles (towns, cities etc), especially the Rambot articles which are, in original form, almost worthless demographic/economic information. Per country only the capital city and a few of the most noteworth[sic] other settlements should be included in individual articles. You could bundle them as a gazetteer but not in an encyclopedia
  • State, county, department, prefecture or similar articles.
  • National park or similar articles
  • All lists
  • Individual road articles
  • Airline articles
  • Company articles
  • Fictional character/entity/place articles
  • Television program articles
  • Movie articles
  • Television/movie actor articles
  • Television/movie director articles
  • Unedited EB1911 articles. These are horribly dated and 'quaint' and the idea of filling in our gaps with these is dreadful.
  • Current events or current event material from existing articles (e.g. Igla), it is so quickly dated
  • Specific board game, card game or similar articles
  • Specific aircraft articles
  • specific vehicle articles
  • Warship articles
  • Musical group/artist articles
  • Musical album articles
  • Record label articles
  • Stamp and related articles
  • Comic/graphic novel articles
  • Book articles
  • Articles of less than three or four sentences
  • Sporting team articles
  • sports person articles
  • Sports event report articles (e.g. Tour de France is fine but not the 2002, 2003 specifics)
  • Computer game articles
  • Specific weapon articles
  • Species level animal articles
  • Specific drug articles
  • Specific disease articles
  • Award articles
  • Individual pope, royalty, nobility articles
  • Individual university/school or similar articles
  • Recipe articles

  • Articles derived from Federal Standard 1037C or similar

I'm tempted to include all Israel-Palestine articles.

Just the history. No interpretation of any kind. Numbers of people estimated to have moved around where, when, etc. - the diaspora. You can't ignore this but really anything that isn't rock-solid for six months? Forget it. At the same time, it is a reputation for getting agreement on tough subjects that will actually get the mainstream to notice Wikipedia. As IBM already has - they map its controversial articles as a test of some text analysis software.
I don't want to do this if we have to exclude so much information. Much of that information would be included in something like Britannica. LDan 15:31, 2 Sep 2003 (UTC)

There are more I'm sure and a lot of articles could be substantially edited - while you could include the Simpsons, as being famous, the article could be reduced enormously and all separate Simpsons related material dropped.

A lot of summary articles or introduction articles. See what is currently going on with simple view of ethics and morals versus what is at ethics proper. It's quite possible to introduce a complex subject in two pages, in plain language giving readable references, give a more detailed and academic treatment in five pages giving all the buzzwords. For the fictional stuff, you just don't bother with the more detailed treatment. Links in the CD-ROM can go to the web version, in any case. There's no reason why not. Someone clicks on a detailed Simpsons link, it dials their modem...
there's not much left ;-) Jimbo, I regret a bit this page is only on the english wikipedia. That is a meta topic. Or...I regret we don't have combined watch list[sic].Anthère
after all the articles you mention are removed, please send me the archive and I'll produce a paper edition. All 2 pages of it.
If there are to be "no lists", and "no books" well, there must be drastically improved maps, timelines, [[geneology[sic]]] and citation capacities here. Else it will be very hard to track what work influenced what what or who[sic], who is related to who[sic] (royalty especially), when and where things happened. Thankfully there is so much on the web on all forms of indexing that this is not an area where anyone has to think: just read. There are dozens of options, and no reason in a CD-ROM version not to exploit advanced user interfaces like a fisheye or a gamelike paradigm. Just make sure it runs on oh say a Mac68K or 486, the lowest level machines that will have CD-ROMs attached, and we're fine.
The list is not an absolute. To take the above example, it is not no books, it is that the number of book articles should be carefully limited. Only those on books which clearly were influential should be included. E.g. Philip K. Dick, a number of his works are written up but they are not all deserving of inclusion. Just because a book, or indeed any subject, is written up should not be grounds for inclusion, no matter the quality. Encyclopedias should contain material centred around a certain core, Wikipedia is much looser than that and contains many articles that do not belong in any encyclopedia.
Okay, as long as it is not absolute, cool. Because some warships are very important, USS Maine for example, or DKM Bismarck, or Yamato.
Also, images needs[sic] to be talked about (and this page cleaned up), some images are more useful than others. The picture of money is not as important as a diagram, even one like Raster_graphics.
~ender 2003-09-12 05:35:MST
A traditional encyclopedia is limited in size and scope by the available manpower (=money) and by the number of pages realistically printable, before one could distribute DVDs. Wikipedia does not have these limitations. So, for a paper version, you would select 75,000 - 100,000 articles about mainstream topics. For the electronics version, throw everything in, the more the better.
Right now, wikipedia, if printed, would be about as big as britannica. There is no need for this drastic cutting, even in the printed one. LDan 15:31, 2 Sep 2003 (UTC)

Everything that is considered "good" should be in Wikipedia 1.0. Implementing a process to sort articles will add a step and consume time. What is the problem if someone has written an article on an unknown book? This article may even be the best thing ever written on that book. I think that Wikipedia will always be different from other encyclopedia[sic]. IMO we have only one priority: find an efficient way to exclude the worses[sic] articles for Wikipedia 1.0. IMO a quick and efficient solution would be a veto system with motivated vetoes and a slogan like "don't be afraid to veto an article". Ericd 22:23, 26 Aug 2003 (UTC)

Do you mean everything that's well written? If so, I agree. If you go to VfD, they're always talking about deleting articles because, for example, they're about a TV show or a song. We don't have any policy against these, we just delete them anyway. I think that stubs should be preserved in wikipedia printed version too. Although it's not the best encyclopedia, I've seen many one- or two-sentence articles in World Book. At VfD, they also often delete stubs. I usually try to write a better article there to get it to not be deleted. Remember, a stub or dictionary def (unless it's under something like made) is better than nothing. LDan 15:31, 2 Sep 2003 (UTC)
I think we should keep everything that is well written reasonably NPOV-compliant and factually correct. Making a distinction between usefull[sic] and not so usefull[sic] articles will be subjective, controversial and a waste of time.

This is also IMO in contradiction with the spirit of Wikipedia.

Ericd 12:58, 12 Sep 2003 (UTC)

Just as a bit of fun, I tried to do a mockup of what a Paper Wikipedia article might look like... Media:PaperWikipediaMockup.jpg (Warning, large picture) Matthewmayer

That's pretty good. LDan 15:31, 2 Sep 2003 (UTC)

Leave it to someone else

Why don't we leave this task to someone else? Draw a parallel with Linux: the developers are not the distributors. If we work to make Wikipedia as good as it can be, then sooner or later the profit motive will draw in redistributors to exploit the heck out of it.

All we should do is make Wikipedia as redistributable as possible - for starters, that means going to all the Wikipedia:sites that use Wikipedia content and asking them what we could do to make their lives easier and Wikipedia more redistributable. Martin 14:57, 5 Sep 2003 (UTC)

I don't see what's wrong with publishing it. It's a bit different when you're talking about linux than when you're talking about wikipedia. In linux, when distributing it, you bundle software with the kernel. We're making the whole thing, not just the base. Also, if we sell a print Wikipedia, it could make money for Wikimedia to buy more servers, something which we will undoubtedly need in the future. LDan 18:55, 7 Sep 2003 (UTC)
I would like to see it published. I trust wikipedia.org not to be deleting things, or altering things without saying so.
~ender 2003-09-12:MST

Comparing content to other Encyclopedias

There is currently a page comparing the size of Wikipedia against other encyclopedias, but I'd like to post some results comparing the coverage, quality and organization methods of different 'pedias. There was some talk in Aug 2003 about comparisons against Britannica, but does a page exist for this yet? I could not find any.

One benefit is the ability to benchmark the 1.0 topic lists against ones from other pedias, and also to test the quality of Wikipedia against others. Fuzheado 00:32, 15 Sep 2003 (UTC)


Are We Ready?

I am a little concerned about the proposed launch date of Dec. 2004.

My comments are limited to the business and economics sections of the 'pedia because these are the only areas I can speak on with any authority. When I look at the 750 or so articles in this section, I see a work in progress. There are a lot of problems with incomplete articles, missing articles, inaccuracies, and NPOV. The best developed sections are the 248 economics articles and the 192 marketing articles, most of which are of reasonable quality. But when we get into other areas, we have problems. Of the 48 human resource management articles, about 30 of them are stubs. Of the 149 finance articles, about 80 are stubs. Of the 58 information technology articles, about 25 are stubs. We have about the same ratios for the 59 business law articles, the 69 accounting articles, and the 71 management articles. The 25 production articles are fairly good, but many more articles are needed to fill in this topic. Many of the 39 business ethics articles have questionable NPOV.

My concern is that this section of the 'pedia has a long way to go before it will be taken seriously by academia. As best as I can tell there are about a dozen of us doing substantive work on this section (by "substantive" I mean to exclude copy editors and formatting). Fifteen months does not give us enough time to get Wiki' up to a level that would be useful at the undergrad level. If we launch prematurely, we may do irreversible damage to the reputation of the 'pedia. mydogategodshat 09:27, 29 Sep 2003 (UTC)

If you think that that's bad, have a look at the medical articles ... there are *some* good ones but a lot ... a whole lot ... are taken verbatim from public health sites and read like brochures or are lifted from Grays 1918 and need some editing to bring up to today's standards (not that there's anything wrong with these as starts) ... and let's not even get into the stubs ...
having said that, a deadline is a good thing. it allows focus. besides, if we're late, we can just postpone it ... -- Alex.tan 16:24, 29 Sep 2003 (UTC)

Will we ever be truly ready? I personally don't see any need to produce a static version, except for long-term archival purposes (say, every year, burn the entire article space on a solid medium and submit it to the Library of Congress and comparable archives). It just rubs me the wrong way; Wikipedia is as much a process as a product. As the wired world expands towards 100% of the human population, the *need* for static, distributable copies will decrease (as access to the online version becomes easier), and we would be diverting the efforts of valuable contributors to a project that will be (necessarily) incomplete and rapidly out of date. Just my two cents... Seth Ilys 04:33, 15 Jan 2004 (UTC)

Indeed Seth, I doubt we will ever be "ready". Also, what kind of stick in the mud would actually buy a paper edition, other than Wikipedia contributors and Mr. Little Old Man and his wife? We should sell a DVD version, including all languages, coming with FREEWARE FONTS for all current Unicode scripts (I can help with this, I have a freeware font almost ready supporting most of the scripts new in 4.1, and I have contacts who can help as well). It should offer updates with an option to make it update automatically. Updates should include new articles (what goes into an update should be sorted through to make sure no vandalized articles are included) and changes to existing articles. It should also be offered at a competitive price. Also, I think that we should wait until March 2005, but I think if free updates are offered, we should be able to get it out earlier.

Wikipedia 0.1

I think that the time is fast approaching when we should make a start, and see how our best theories translate to practice. My suggestion is that we set up a volunteer editorial board (perhaps call it Wikipedia 0.1 for now so as not to raise false hopes or needlessly consume name space) and some means of them working and see what they can come up with. For one form of infrastructure to support them see my proposal m:referees#Baselining. This article on the Meta assumes that a form of article review should be implemented before trying to develop static versions of Wikipedia, and I still think this is the best way to go, but I'm very interested in other views on this and it could work either way. Andrewa 14:59, 4 Nov 2003 (UTC)

Environmental Issues

I have said this elsewhere but I will say it here also. We should use recycled materials for Wikipedia 1.0. If it is published on paper we should use 100% recycled paper. (The quality of it now is excellent; I have some, so trust me.) See: [3]. If it is published on CD-RWs or DVD-RWs you can get recycled ones (I don't know where). "Wikipedia 1.0, the first 'Green' Encyclopaedia". -fonzy

I totally agree with this and hope that the majority of wikipedians will push for this solution if we go printed-paper. MikeCapone Jan 24, 2004.

Why?

Wikipedia is an excellent resource because it is not restrained in size or content. It's the only reference in the world where you can find not only important things like "China," but trivial things like "Klingon." The point being, it is fundamentally different from EB because it is not a summary of the most important knowledge, but aims to be a compendium of ALL knowledge. Culling only the most 'significant' articles from it (i.e., getting rid of pop-culture entries) would significantly diminish its value. Not to mention, what's the point? Wikipedia is already available online- are there that many people who have CD-ROM drives but not internet access? A book might look nice on your coffee table, but why would you refer to it when a fuller reference, fully searchable, is available on your PC? Not to mention, Wikipedia 1.0 would be uneditable and static, the absolute antithesis of www.wikipedia.org. I agree that the project needs goals of "completeness," and in fact probably should have a "1.0" in the works, but why do it in a format that negates the best of wikipedia?

  1. I don't think you claim that all people in the world have access to the Internet and even if they do, how will they know about us?
  2. Setting such a goal is useful for ourselves — do you think that if we declare, for example, that from 15 Dec it's 1.0 that somebody will do such a large job? We need some motivation.

But I agree that it's arguable. ilya 06:49, 17 Jan 2004 (UTC)

Why would someone use Wikipedia 1.0 instead of the online version?
  1. Because they don't have Internet access.
  2. Because using a book is quicker than waiting for their computer to boot (or using a CD is quicker than waiting for their modem to dial up).
  3. Because they want a version they can trust not to have been recently vandalized.
These situations should shape the style and coverage of Wikipedia 1.0. For example:
  1. If readers are unlikely to have Internet access, 1.0 articles should be designed for longevity (rather than, for example, going into great detail on what happened the week before 1.0 was published).
  2. If readers are using 1.0 to find something fast, 1.0 articles should be kept short (deferring to specialized articles for specialized topics), and should be sorted how people expect encyclopedias to be sorted (e.g. all the musical Bachs in the B section, History of Italy under I rather than H, and so on).
  3. If readers will use 1.0 because they trust it more than the online version, perhaps each article in 1.0 should be reviewed and initialled by someone at least moderately qualified in the subject of the article. (So reviewing would be distinct from copy-editing: reviewers would specialize in fact-checking and relevance, while copy-editors would specialize in clarity and importance.) -- Mpt 06:31, 6 Apr 2004 (UTC)
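As a rough illustration of the sorting point above (Bachs under B, History of Italy under I), here is a minimal sketch of a filing key. The phrase list, the `is_person` flag and the function name are all assumptions made up for this example; nothing like this has been decided for 1.0.

```python
# Hypothetical list of leading phrases that should not decide the filing letter.
LEADING_PHRASES = ("History of ", "Geography of ", "Economy of ")


def filing_key(title, is_person=False):
    """Return the string a print index might sort this title by (a sketch)."""
    for phrase in LEADING_PHRASES:
        if title.startswith(phrase):
            # "History of Italy" is filed as "italy, history of".
            return f"{title[len(phrase):]}, {phrase.strip()}".casefold()
    if is_person and " " in title:
        # Assumption: the last word of a personal name is the surname.
        given, surname = title.rsplit(" ", 1)
        return f"{surname}, {given}".casefold()
    return title.casefold()


entries = [
    ("History of Italy", False),
    ("Johann Sebastian Bach", True),
    ("Hydrogen", False),
    ("Carl Philipp Emanuel Bach", True),
]
ordered = [title for title, person in sorted(entries, key=lambda e: filing_key(*e))]
```

Here `ordered` puts both Bachs together in the B section, then Hydrogen, then History of Italy under I. Whether an article is about a person would have to come from somewhere else (a category, say); the flag is simply passed in by hand in this sketch.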

Will 1.0 Be Legal?

Whether we distribute 1.0 as a CD, DVD, in printed form, or even via audio, it seems likely to expose Wikimedia to significant new liability for copyright violations. Currently, Wikipedia is largely shielded by OCILLA, which protects us as long as we promptly take down any infringing material when the copyright holder complains to our designated agent (the individual contributor responsible isn't protected, but I doubt anybody's coming after them). But OCILLA or the CDA only protect Wikipedia on the Internet; once we move into other media, Wikimedia could be looking at lawsuits.

In part it depends on the approach we take to creating 1.0. If it's just a snapshot of all Wikipedia content as of X date, that's problematic. I don't think you can argue that we don't have infringing material present at any given moment. So we'd basically be inviting litigation.

Alternatively, 1.0 could be screened for possible violations, with only clean articles included. But even so, we could easily miss significant violations that are too subtle for whoever does the screening. Also, we have to keep in mind that fair use will not necessarily protect us everywhere that we might distribute. So it would require closer scrutiny than we currently tend to give our content. --Michael Snow 01:05, 30 Jan 2004 (UTC)

I agree. Not only would 1.0 be an easy target for lawsuits (because there will be copyright infringements), as soon as Wikipedia moves to other mediums and becomes real competition there will be motives as well. Perhaps 1.0 could be moved off-page or something (really no idea), so that lawsuits could endanger 1.0, but not Wikipedia itself. On the other hand, I might be paranoid. DrZ 12:24, 25 Feb 2004 (UTC)

One thing we should try to do is document our sources. If you contribute a paragraph or two, add something to the references section stating where you researched that information. I admit I sometimes don't bother to do this myself, but it's useful for reasons other than showing copyright compliance as well. Anthony DiPierro 01:39, 12 Mar 2004 (UTC)

Documenting sources is also important so that 1.0 will be citable, and thus far more widely accepted.
I agree that this is a significant problem, but have nothing intelligent to add to what's been said. Isomorphic 06:48, 6 Apr 2004 (UTC)
Possibly a separate foundation should be set up to do this kind of thing and take the risks and/or benefits. Because Wikipedia is GNU FDL, there's no reason why physical publishing shouldn't be done by a completely separate entity. Mozzerati 21:43, 2004 Aug 4 (UTC)

A 1.0 in 2004?

FYI, this article from the Independent (UK) should probably get this conversation started again.

"Wales must have faith because he is planning Wikipedia 1.0, a "frozen" subset of the project's content that he hopes to release later in the year. This will contain articles that have been flagged as reliable by a committee of experts, and hints at the introduction of a loose hierarchy into the Wikipedia community."

Fuzheado 01:08, 24 Mar 2004 (UTC)

I haven't followed the relevant discussion. Is the above quote an accurate representation of our plans, or are they repeating as policy what is merely a proposal for the moment? Specifically, I want to know if 1.0 is definitely going to contain only articles reviewed by "experts." Isomorphic 06:48, 6 Apr 2004 (UTC)

Where/How to sell it / Advertising

Having published the first printed Wikipedia:WikiReader, I can speak from some experience. The hardest thing is to sell it. I have sold something like 200 copies in two weeks, which is not very much. And I'm facing some very hard resistance to advertising the WikiReader on the de:Hauptseite and in the normal article namespace. So, be prepared to face some very hard resistance if you want to advertise it within Wikipedia. --TomK32 21:12, 28 Jun 2004 (UTC)

Sure, advertising within Wikipedia might be an easy way of selling some, but doesn't that defeat the purpose? Ambivalenthysteria 14:52, 3 Jul 2004 (UTC)

Lots of thoughts on the software release

I was thinking about what would happen if it was discovered that a copyright infringement was included on a CD version of Wikipedia. I think any software we do should automatically look for an internet connection and, if it detects one, check for any such copyright alerts. If there is a reported copyright infringement, the software could download a designated replacement article/picture to the hard drive to use in its place, or restrict access to it altogether.
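To make the idea concrete, here is a minimal sketch of the "apply the alerts" step only. It assumes the alert list has already been downloaded somehow (no such service exists; the data shapes and the function name are invented for this example) and that with no connection the CD content is simply used as shipped.

```python
def apply_alerts(articles, alerts):
    """Return a copy of the CD's articles with copyright alerts applied.

    articles: dict mapping title -> article text as shipped on the CD.
    alerts:   dict mapping title -> replacement text, or None to withdraw
              the article altogether (hypothetical format).
    """
    result = dict(articles)  # never modify the shipped content in place
    for title, replacement in alerts.items():
        if title not in result:
            continue  # alert concerns an article this edition does not contain
        if replacement is None:
            del result[title]  # restrict access: drop the article
        else:
            result[title] = replacement  # use the designated replacement
    return result


shipped = {"China": "original text", "Klingon": "disputed text", "Bach": "fine"}
alerts = {"Klingon": None, "China": "cleaned text", "Not on CD": None}
visible = apply_alerts(shipped, alerts)
```

After this, `visible` no longer contains "Klingon", "China" carries the replacement text, and `shipped` is untouched. Pictures could be handled the same way with a second dictionary.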

I have also been thinking about all this on the assumption that the Wikipedia 1.0 in its CD version is a piece of software native to a particular operating system. How would we select these operating systems? Windows is a must, and people would be angry if it did not run on Linux. Macs have a lot of users too. How would we accomplish this? Write it all in Java??

Now, I come to the issue of selecting which articles (and which versions of articles) should be included. I think there should be an automatically generated (but only 1000 articles or so in size) list of the most frequently visited - and edited - articles. Is this possible? I know there is a kind of hit-meter built into the MediaWiki software, but it might have been disabled on Wikipedia. Anyway, so we have a list of the most popular articles. That list would be a good starting point for the things people would be looking for in the CD version of the 'pedia. From there, we can remove articles and add all the others which should be included.
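A sketch of how such a starting list could be generated, assuming per-article view and edit counts are available from somewhere (the input format and the simple "add the two ranks" rule are my own assumptions, not how MediaWiki reports anything):

```python
def shortlist(stats, size=1000):
    """Pick the `size` articles that rank best on views and edits combined.

    stats: dict mapping title -> (view_count, edit_count), hypothetical input.
    """
    # Position of each title when sorted by views, then by edits (0 = highest).
    by_views = sorted(stats, key=lambda t: (-stats[t][0], t))
    by_edits = sorted(stats, key=lambda t: (-stats[t][1], t))
    view_rank = {title: i for i, title in enumerate(by_views)}
    edit_rank = {title: i for i, title in enumerate(by_edits)}
    # Lower combined rank is better; ties are broken alphabetically.
    ordered = sorted(stats, key=lambda t: (view_rank[t] + edit_rank[t], t))
    return ordered[:size]


stats = {"China": (900, 40), "Klingon": (300, 55), "Some stub": (5, 1)}
top_two = shortlist(stats, size=2)
```

With these made-up numbers `top_two` is China and Klingon, and the stub is left out. The list is only a starting point; people would still remove and add articles by hand as described above.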

I propose that all admins (and stewards etc) should have the ability to do the page/version selection thing (possibly by a selectable icon in the Page History). I also suggest that there be a new class of user created called "scrutineer" or something, whose only special ability above normal users is to select these articles.

I think the software will have to be engaging. To be able to compete with Britannica and Encarta requires a *LOT* of multimedia, flashy graphics and features.

My final thought is on the name of the publication/CD. I absolutely hate "Wikipedia 1.0" as a name, even as a working name. I much prefer "Wikipedia 2005" or something like that, because it emphasises that it is a work-in-progress, a snapshot in time rather than an evolving thing like the main Wikipedia website. "1.0" suggests a final release, and I think it would be conceited to think this initial Wikipedia release will be a complete work. Of course there will be problems and mistakes.

I hope someone takes the time to read this. I'm really interested in the push towards 1.0 and want to help as much as possible, even if it means giving up most of my editing on the main 'pedia. - Mark 10:32, 17 Jul 2004 (UTC)


I think it might be better to release the CD in a modified HTML format (suitable for offline viewing). It seems like a lot of work to create a special piece of software just for Wikipedia viewing, and sticking with HTML means that the CD would be readable on just about any contemporary machine. -- Beland 07:05, 23 Jul 2004 (UTC)
Call it by the number of articles it has. Screw the multimedia; put a bunch of MIDIs and text formats in it and be done. lysdexia 19:36, 8 Nov 2004 (UTC)

Africa

I see mention in other places of an 'Africa' story. Just to let you know, I have put down at Nansindlela school [4] in Ingwavuma a digital snapshot of Wikipedia - currently text-only, but, courtesy of Brion Vibber, soon pictures as well. I just have to visit there again .. somewhat far from Cape Town .. I think digital is feasible, and way better. Wizzy

I love this- but organization?

I happen to love this idea, first off. Second off, perhaps I'm jumping too far ahead here, but these are the thoughts I came up with. I'm also not on the mailing lists, so perhaps these have already been discussed, but I figure, if I put them here, at least someone may read them.

This is highly ambitious, and I think before discussing a lot of the technical issues, how to convert things, what amount of articles are needed, etc., etc., there needs to be some semblance of organization.

First off, what exactly is the plan? Mention has been made of a print edition, an English CD, an other-languages (with Simple English) CD, a DVD with all of them, a two-CD set, etc., etc. For the purposes of this discussion, I'll assume a print edition, the English CD, and the other-languages CD.

How would we go about this? First off, I think you need to look at the timing, which was mentioned above. I see it working something like this: the English CD is the first to be done. It's the easiest for several reasons: wiki --> CD is easier than wiki --> paper, and it needs less culling of articles. Set this to happen every two years. On the second year, do the other-languages CD. This requires a bit more work (more languages, coordination, etc.), but with the experience gained from the English CD, should be easy enough. With this two-year staggered system, it also allows for 6 month quarterly "updates". This is a long enough period of time to actually have a significant change in content, while also allowing to show some of the "transient" nature of Wikipedia.

The print edition is harder. Every year? Every two years? Either way, it needs to be done after the English CD. First, I assume that we'll have gained experience like I said above, and be able to have some organization for the print edition. Also, it gives time to find the appropriate services/programs/etc. to take on the task of transferring wiki to a paper edition. I assume we have more programmers in Wikipedia than we do bookbinders ;).

Secondly, organization. How does this go about? I see it done as assigning a committee or a few main people, like caretakers, to each of the three projects. That committee would be answerable to the foundation board. They would be responsible for creating a set timeline, identification of what was needed, and perhaps recruiting whoever they need to add in the project.

The OL languages group would probably need to have at least one person per smaller language, and 2-3 per larger, to be on hand for this group.

For the print version, one thing the caretakers should be responsible for is recruiting editors for each of the "topical" areas, like Biology, Mathematics, etc., etc. Those in turn would recruit a few editors for their sections.

I'd also like to point out the WikiReader projects. I happen to adore this idea, and I may start one myself soon, but regardless, these seem to be a good primer, and a good experiment, to see how converting to a paper encyclopedia might work, and what hitches may be found. A further effort towards working on these might be warranted.

I may have just spouted a lot of stuff that has already been thought of, but as it was not mentioned here, I wanted to make sure that people didn't put the "planning" aspects ahead of the "organization" aspects.

Lyellin 09:38, Jul 29, 2004 (UTC)

Danger, Will Robinson

I just wanted to say that when I first heard about this whole print Wikipedia thing, I immediately thought of the Mathworld debacle. I haven't read very much about the plans (who has the time?), but just be damn careful to read the fine print of any contracts you sign! - dcljr 05:54, 12 Aug 2004 (UTC)

Layout standards in print version

One thing to watch is consistency of article layout for the print version. While the online version can tolerate variations, since only one article can be seen at a time, once it is set on a page, it needs consistency. By this I mean the order of an article: title print size > opening paragraph > following paragraphs > links > references > categorisation, etc. How we handle lists, the degree of indent, etc. will also need attention. If we don't do this from the outset, the printed version is going to look amateurish, and be a messy read for the purchaser. Also, what about a final index to it all? Apwoolrich 17:31, 6 Sep 2004 (UTC)

Page production process

I have added a possible process for upgrading pages to production quality at User talk:M.e/Production Wikipedia   m.e. 12:01, 12 Sep 2004 (UTC).

RE: Environmental Issues...

I think a vote should be held on whether we should use recycled CDs/DVDs/paper....

For

Against

Comments

  • There should be a discussion before a vote starts. Also, you seem to be assuming we would print this ourselves and it would be our decision to make. That is not necessarily the case. Angela. 04:20, Oct 4, 2004 (UTC)

Issues

  1. The idea of providing a CD/DVD/paper variant is interesting, and thanks to the license, not necessarily funded or generated by Wikipedia's administration.
  2. The idea of limiting articles (e.g. reducing Simpsons/pop video etc. content) according to an editorial board - in my mind - appears to be nothing more than attaching an elite bias onto a collaborative project - the articles that have been authored on Wikipedia indicate the interests of the readership; does it not strike anyone that therefore, editorial decisions based on content would be considerably undesirable for the number one buyer - the current Wikipedian?
  3. The idea that one could logistically manage the process of converting articles through editorship - on the assumption that the current rate of acceleration of articles increases - also appears in my mind to be questionable.
  4. The idea that any article needs to be edited by an editorial board implies that there is something wrong with the ideology implicit in the structure of wikipedia itself.

IMHO, the best way to push Wikipedia onto a DVD (sorry, the arguments against DVD are notably dated) would be to provide a downloadable snapshot of 1.0, which can be burned to disc or mounted onto a hard drive. And it would be best left as a federation of linked web pages, albeit without the CGI/PHP. Importantly, one could leave in edit links that direct the end user to Wikipedia online. The advantage is an offline resource that also serves as a historic record. Disadvantages are those that concern attempting to make money out of the project. DVDs could be sold with a browser, etc.
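For what it's worth, the "static pages plus an edit link back to the live site" idea could look roughly like the sketch below. The function names and the footer markup are invented for the example, and the base URL is assumed to be the English Wikipedia's `index.php` entry point; titles containing characters such as "/" would need extra handling before being used as file names.

```python
from html import escape
from urllib.parse import urlencode

# Assumed entry point of the live site; adjust per language edition.
ONLINE_BASE = "https://en.wikipedia.org/w/index.php"


def local_filename(title):
    """File name of the article's static page inside the snapshot."""
    return title.replace(" ", "_") + ".html"


def online_edit_url(title):
    """URL that opens the same article for editing on the live site."""
    query = urlencode({"title": title.replace(" ", "_"), "action": "edit"})
    return f"{ONLINE_BASE}?{query}"


def footer_html(title):
    """Small footer to append to each static page (hypothetical markup)."""
    url = escape(online_edit_url(title), quote=True)  # "&" becomes "&amp;"
    return f'<p><a href="{url}">Edit this article on Wikipedia online</a></p>'


page_name = local_filename("History of Italy")
edit_link = online_edit_url("History of Italy")
```

Links between articles inside the snapshot would point at the `local_filename` of the target, so the whole thing works from a burned disc with nothing but a browser.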

Regarding paper copy - why do it? The regression back to linear communication would lose one of the greatest assets of Wikipedia - the hyperlink. The energy costs involved are high, and for what benefit? I love books - I love reading books too. But even if I had Wikipedia v1.0 on my shelf, I would never take it anywhere, and I would never use it! Why should I when I can go straight to the real one - Version "Today".

Jim et al. - You will do what you want - this is a fantastic project, which has opened many minds to the phenomenon of wikis - and Wikipedia itself. These ideas here are just a bit of a rant. Signing Off.. (20040302 16:29, 14 Oct 2004 (UTC))