Wikipedia:Snapshots
A quick FAQ about existing and future snapshots of the English Wikipedia.
- For information on German snapshots, see m:Wikipedia on CD/DVD and more detailed pages on de:WP.
- For information on multilingual snapshots, see pages here and on meta about the Mandrake collaboration (links?).
- To be created: a global page on meta about snapshots in all languages.
Existing snapshots
- Wikipedia Offline is a commercial application containing 2,600,000 articles, bundled with helper software for offline searching and display of Wikipedia content.
- User:BozMo and SOS Children offer Wikipedia for Schools, a free download of about 4,500 hand-picked and checked articles.
- User:Wikiwizzy converted the Wikipedia portion of BozMo's 2006 CD to a Plucker database, to carry it on an SD card in a Palm. It is available for download at ftp://ftp.wizzy.com/pub/wizzy/palm/Wikipedia.pdb; it weighs in at 44 megabytes, with pictures downsized to 150×150 pixels at 8-bit colour.
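The image-downsizing step mentioned above can be illustrated with a short sketch. This is not the tool User:Wikiwizzy used (the Plucker toolchain handles the actual .pdb packaging); it only shows how pictures might be reduced to fit 150×150 pixels at 8-bit colour using the Pillow library, with hypothetical directory names.

```python
# Minimal sketch (assumption: Pillow is installed; SRC_DIR and DST_DIR are
# hypothetical paths, not those used for the actual CD-to-Plucker conversion).
import os
from PIL import Image

SRC_DIR = "cd_images"        # images extracted from the CD snapshot
DST_DIR = "plucker_images"   # downsized copies to feed to the Plucker tools

os.makedirs(DST_DIR, exist_ok=True)

for name in os.listdir(SRC_DIR):
    if not name.lower().endswith((".jpg", ".jpeg", ".png", ".gif")):
        continue
    img = Image.open(os.path.join(SRC_DIR, name))
    img.thumbnail((150, 150))                                     # neither side exceeds 150 px
    img = img.convert("P", palette=Image.ADAPTIVE, colors=256)    # 8-bit indexed colour
    img.save(os.path.join(DST_DIR, os.path.splitext(name)[0] + ".png"))
```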
Snapshot projects
- Wikipedia:Pushing to 1.0, a project about generating a reviewed layer for Wikipedia content, has side-discussions about [re]using snapshots of such a reviewed release.
- Wikipedia:Stable versions looks at producing static, reviewed versions of articles best suited for print at any given time.
- An ongoing collaboration with Mandriva (formerly Mandrakesoft) has led people to think about preparing a DVD of English Wikipedia content similar to the existing German one; this gave impetus to the Wikipedia:Image tagging project last fall but left many other problems unsolved:
- finding good free software for reading and searching such a snapshot, and working out the kinks of integrating it with a database dump;
- effectively stripping and reformatting pages;
- removing articles carrying certain tags (VfD and the like) — see the filtering sketch after this list;
- catching recent vandalism, by searching the changes made in the past week, and particularly those of the past few days, for suspicious edits;
- working out details of the image and multimedia dump: auto-extracting the appropriate files and attributions from en:WP or Commons, and resizing images or storing multiple resized versions as needed by the reader.
- MediaWiki 1.5 includes routines to dump a wiki to HTML, rendering the HTML with the same parser used on a live wiki. (details of use?)
- Early discussions of creating a German DVD, on Meta, list a number of separate projects dating back to 2003 which have largely been mothballed. Directmedia's Digibux is a working reader project for Linux.
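The stripping and tag-filtering problems listed above can be sketched roughly against a pages-articles XML dump. The dump filename, the template names checked, and the crude markup stripping below are illustrative assumptions only; actual snapshot builds would use more thorough tooling such as the parsers listed under "See also".

```python
# Minimal sketch: stream a pages-articles dump, drop pages whose wikitext
# carries deletion tags, and do a very rough markup strip. The path and the
# template names are assumptions for illustration, not a fixed specification.
import re
import xml.etree.ElementTree as ET

DUMP = "pages-articles.xml"   # hypothetical path to an uncompressed dump
DELETION_TAGS = re.compile(r"\{\{\s*(vfd|afd|delete|db|copyvio)\b", re.IGNORECASE)

def kept_pages(path):
    """Yield (title, wikitext) for pages not tagged for deletion."""
    title, text = None, None
    for _event, elem in ET.iterparse(path):
        tag = elem.tag.rsplit("}", 1)[-1]      # ignore the export-format namespace
        if tag == "title":
            title = elem.text or ""
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            if text is not None and not DELETION_TAGS.search(text):
                yield title, text
            title, text = None, None
            elem.clear()                       # keep memory flat on a large dump

def strip_markup(wikitext):
    """Very rough wikitext stripping: templates, links, emphasis."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)                  # simple templates
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)   # [[a|b]] -> b
    return re.sub(r"'{2,}", "", text)                               # bold/italic quotes

for title, text in kept_pages(DUMP):
    print(title, len(strip_markup(text)))
```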
See also
- meta:Alternative parsers includes several tools to aid in creating snapshots.
- Wikipedia:Database download includes information on the raw database dumps, and also some existing manipulations of them.