Wikipedia:Verbatim copying

From Wikipedia, the free encyclopedia

This page is an essay. It is not a policy or guideline; it simply reflects some opinions of its authors. Please update the page as needed, or discuss it on the talk page.


Verbatim copying is one of two main ways to reuse Wikipedia articles and other material; the other is the creation of derived works from such material.

For the purposes of this discussion, Wikipedia is considered to be a Collection of Documents. (An alternative interpretation could be that Wikipedia is a single Document, which invalidates the discussion on this page.) Each Document comprises:

  • Title
  • Title Page
  • Main Text - the publicly editable text of the document.
  • (potentially) History Subunit - the "page history" of the page.
  • License and Copyright Statement - currently, "All text is available under the terms of the GNU Free Documentation License (see Copyrights for details)."
  • Inline images are not considered part of the Document, but rather are aggregated with it.

Other content, such as the sidebar links, the Wikipedia logo, and so forth, is not considered part of the Document, though you may consider it to be "cover pages" for the Document. An article's talk page is considered a separate Document. An image and its associated image description page are together considered a separate Document.

A verbatim copy of a Wikipedia Document is a copy that qualifies under section 2 (verbatim copying) of the GFDL. Verbatim is usually defined as "word for word".

Webpage copies

For verbatim copying on the Internet (i.e. setting up mirrors of Wikipedia), the following restrictions apply:

Title

You may not change the Title. For example, in this document, the Title is Wikipedia:Verbatim copying.

Title Page

The "Title Page" is the text just below the title, before the start of the article proper. This is currently "From Wikipedia, the free encyclopedia", but was previously (briefly) "Find out how you can help support Wikipedia's phenomenal growth", etc.

Where Wikipedia has imported text from a third party, such as Nupedia, the Title Page may additionally extend to some italicised block text immediately after the title, and before the start of the main text.

There have been objections to including this text at mirrors because in web searches a page with this text will appear to be the original at wikipedia.org, misleading people about where they are going to go if they follow the link. Including this text may be a trademark infringement.

Main Text

You may not add, remove, or change any content or links within the Main Text itself, except:

  1. Optionally, you may change links to other Wikipedia articles to point somewhere other than the local version at your site which they will normally point to. (legally questionable)
    • if you prefer not to do this, a page with a redirect at your own site will accomplish the same result.
  2. Optionally, you may remove links to Wikipedia articles that have not been copied locally. (legally questionable)
  3. Optionally, you may make unlimited formatting and linking changes (provided such changes do not have the effect of obscuring certain words). (legally questionable)

The reasoning behind each of the legally questionable exceptions is as follows:

  1. Wikipedia articles use relative links, so no changes are required to point to your own copy of an article. If you don't have a local version, the effect is a broken link. Changing the link text to point to a copy of the article thus produces the same semantic result, even if it changes the HTML. Either way, the underlying transparent copy, the wikitext itself, is not changed.
  2. The removal of links to non-copied articles may be excused as a side effect of your use of section 6 (collections of documents) - the act of "extraction" includes the act of removing those links.
  3. Verbatim copying may be considered to be "word for word" copying, as opposed to "exact copy", which would include such details as formatting (for the written word), intonation (for the spoken word), and so forth.

Note that the link text itself (typically a few words, and underlined) may not be changed.
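
As an illustration of the first exception, the following is a minimal sketch of retargeting article links in a copied page while leaving the link text alone. It assumes the copied HTML uses relative links of the form /wiki/Title; the function name retarget_links, its parameters, and the remote prefix are hypothetical choices made for this example, not part of any Wikipedia tool.

```python
import re

# Matches the href of a relative article link such as href="/wiki/Some_title".
# The "/wiki/" prefix is an assumption about how the copied HTML was rendered.
WIKI_HREF = re.compile(r'href="/wiki/([^"#]+)(#[^"]*)?"')


def retarget_links(html, local_titles, local_prefix="/wiki/",
                   remote_prefix="https://en.wikipedia.org/wiki/"):
    """Point links at the local copy when one exists, else at Wikipedia.

    Only href values are rewritten; the visible link text is never touched.
    """
    def swap(match):
        title = match.group(1)
        fragment = match.group(2) or ""
        prefix = local_prefix if title in local_titles else remote_prefix
        return 'href="{}{}{}"'.format(prefix, title, fragment)

    return WIKI_HREF.sub(swap, html)


page = ('<p><a href="/wiki/GFDL">the license</a> and '
        '<a href="/wiki/Nupedia#History">Nupedia</a></p>')
print(retarget_links(page, {"GFDL"}))
```

Because only the link target changes, the words a reader sees are the same as in the original, which is the property the note above asks you to preserve.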

Images

As inline images are considered to be aggregated with the Document, rather than part of the Document, you may add or remove any number of images at any place, with any copyright status, and still be making a verbatim copy, as you are not modifying the Document. This interpretation of images as being aggregated is the official position of Jimbo Wales, and by extension the Wikimedia Foundation (see [1], for example).

Making verbatim copies of images

Many images on Wikipedia are licensed under the GFDL. In these cases, according to the aggregation interpretation, each image and its image description page (and possibly its upload history and the "page history" of its image description page - see History Section in this essay) form an Article. You can therefore make verbatim copies of such an image and use it in a similar way to Wikipedia. In other words, there must be an easy way to view the image description information of each image that you use. In an HTML copy, you might display this if the user clicks on the image. In a printed copy, you might attach the image description information in a separate appendix. Note that if you use GFDL images to illustrate your own written work, according to the aggregation interpretation, you do not need to license your written work under the GFDL.

Other images on Wikipedia, licensed under other licenses such as Creative Commons, are also treated as aggregated. For information on how to reuse these images, check the details of the appropriate license. Again, according to the aggregation interpretation, you do not need to license your written work under the same license to use these images.

History Section

As the GFDL was never intended for wiki articles, things get complicated. Some interpret the GFDL as applied to Wikipedia such that the "history" link (known as the "page history" in some skins) is the GFDL History Section. Thus, you should at least include this "by reference" by linking to it. Regardless of the legalities, we would appreciate sub-licensees adding such a "link back". Because the page history will change if the article is edited subsequent to your copy, the date of copying should be given, to allow users to reconstruct the page history at the time of copying. Ideally, the link text should include the word "history", and it should be made clear that the history is to be considered part of the article (for example, on your copyrights page).
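
A "link back" of this kind could be generated along the following lines. This is only a sketch: the function name history_link and the exact wording of the link text are hypothetical, and the URL assumes the index.php?title=...&action=history form that Wikipedia uses for page histories.

```python
from datetime import date
from html import escape
from urllib.parse import urlencode


def history_link(title, copied_on,
                 base="https://en.wikipedia.org/w/index.php"):
    """Build an HTML link to an article's page history, with the copy date."""
    # Wikipedia URLs use underscores where titles have spaces.
    query = urlencode({"title": title.replace(" ", "_"),
                       "action": "history"})
    url = "{}?{}".format(base, query)
    # The link text includes the word "history", and the date of copying
    # lets readers reconstruct the history as it stood at that time.
    return ('<a href="{}">Page history of "{}"</a> '
            '(article copied on {})').format(
                escape(url), escape(title), copied_on.isoformat())


print(history_link("Wikipedia:Verbatim copying", date(2005, 3, 1)))
```

The date passed in here is an arbitrary example value; in practice it would be the date on which you took your copy.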

A legally questionable alternative is to link to the entire source article on Wikipedia (as above), with the understanding that its "page history" is thus included by reference. More conservatively, take a static copy of the entire page history, host it locally, and link to it. More conservatively still, take such a static copy and append it to the end of the article.

The "page history" complies with the spirit of the GFDL History requirements. However, the "page history"...

  • ... is often very large, and can be much larger than the article itself.
  • ... does not include a record of the changing titles of articles that have been moved, which the GFDL would require.
  • ... is not "Entitled" History in the Classic, Cologne Blue, or Nostalgia skins, which the GFDL would require.
  • ... is incomplete in many articles, notably those that have been merged or split, in very old pages, in pages moved or translated between Wikimedia projects, or in pages sourced in part from external GFDL sources.
  • ... is not guaranteed by the Wikimedia Foundation to be retained.


An alternative interpretation is that the "page history" should be considered a separate Document (as with the Talk page). Thus, you may ignore the page for the purposes of verbatim copying. If the body text includes a section Entitled "History", then you should copy that along with the rest of the body text.

There are issues with this interpretation also:

  • It requires Wikipedia to be viewed as a collection of collaborative original works (with the exception of articles with explicit "History" sections) rather than a collection of Derivative Works.
  • It requires Wikipedia contributors to have implicitly granted a license to combine contributions freely and redistribute the result under the GFDL.
  • You may need to add information regarding the five most significant contributors to each Article.

There has been no official declaration from the Wikimedia Foundation that the "page history" is or is not intended to represent the GFDL History Section.

License and Copyright Statement

Here you have more freedom, chiefly because Wikipedia's current License and Copyright Statement is itself questionable. However:

  • You must link to a local copy of the GFDL.
  • You must make it clear that the content from Wikipedia is available under the GFDL license.
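
The two requirements above could be met with a short notice on each copied page. The sketch below is one possible way to generate it; the function name license_footer, the CSS class, and the default path /gfdl.html are hypothetical. Requiring a site-relative path is a simplification used here to keep the GFDL link local.

```python
from html import escape


def license_footer(title, gfdl_path="/gfdl.html"):
    """Return an HTML notice linking to a locally hosted copy of the GFDL."""
    # A site-relative path is local by construction; reject anything that
    # looks like a link to another host (a simplifying assumption).
    if "://" in gfdl_path or gfdl_path.startswith("//"):
        raise ValueError("link to a copy of the GFDL hosted on your own site")
    return ('<div class="license-notice">The text of "{}" is from Wikipedia '
            'and is available under the terms of the '
            '<a href="{}">GNU Free Documentation License</a>.</div>'
            ).format(escape(title), escape(gfdl_path))


print(license_footer("Verbatim copying"))
```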

Aggregation and cover pages

Under section 7 (aggregation with independent works) you may aggregate the Document with other separate and independent documents or works. There are two forms of aggregation worth noting:

  • Embedding a Wikipedia Document within a larger webpage. (legally questionable)
  • Running a website that includes both Wikipedia Documents and other (copyrighted) documents or works.

The use of embedding as a form of aggregation is dubious because it could equally be considered to be the creation of a derivative work. You can support the "aggregation" interpretation by designing your webpage to logically separate the verbatim copy of the Wikipedia page from the rest of your content, by use of colour, lines, and other markup.

You could also consider brief headers and footers on an HTML page to be "cover pages", as discussed in section 3, though again this is legally questionable. The GFDL states that "copying with changes limited to the covers [...] can be treated as verbatim copying [...]".

If you aggregate Wikipedia content with other content, you may claim a compilation copyright for the whole. This means that others cannot copy the whole without your permission. The GFDL forbids you from using the compilation copyright to restrict the rights of users of the Wikipedia parts of your aggregation.

Copying in Quantity

If you are making more than 100 copies (printed or otherwise), then section 3 (copying in quantity) comes into play. One of the requirements is that you distribute machine-readable copies. If you are distributing an online copy of Wikipedia, your HTML pages are considered machine-readable copies.

Printed Copies

If you are distributing printed copies, you still need to offer machine-readable copies. You can do so by providing one of the following:

  • A machine-readable copy of your selection (e.g. on CDs)
  • A URL for a machine-readable copy hosted by you (must remain available for a year after publishing)
  • (legally questionable) A URL for Wikipedia

Relying on Wikipedia is legally questionable because pages, images, and individual revisions on Wikipedia are routinely deleted from public view. The GFDL requires that you take "reasonably prudent steps" to ensure that the machine-readable copy "will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy" (here, a printed copy). Wikipedia as a whole is very likely to still be accessible in a year, but it could be taken down for legal or financial reasons. Additionally, the Wikimedia Foundation makes no guarantee to retain a machine-readable copy of articles.

If you choose not to rely on Wikipedia, download the database, strip it of any content that you are not redistributing, and either distribute it with your copies, or else host it on a webserver for a year. Alternatively, if you are only copying a handful of pages, it may be easier to just host the HTML of the pages you are copying.
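
The stripping step could look something like the sketch below. It assumes the download is an XML export in which each article is a page element containing a title element; the function names are hypothetical. It also loads the whole file into memory, so it only suits small exports, and a full database dump would need a streaming parser instead.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def strip_dump(xml_text, keep_titles):
    """Return the export with every page removed except those in keep_titles."""
    root = ET.fromstring(xml_text)
    for element in list(root):  # copy the list: we remove while iterating
        if local_name(element.tag) != "page":
            continue  # keep non-page elements such as site information
        title = next((child.text for child in element
                      if local_name(child.tag) == "title"), None)
        if title not in keep_titles:
            root.remove(element)
    return ET.tostring(root, encoding="unicode")


sample = ("<mediawiki>"
          "<page><title>A</title><revision><text>a</text></revision></page>"
          "<page><title>B</title><revision><text>b</text></revision></page>"
          "</mediawiki>")
print(strip_dump(sample, {"A"}))
```

The revisions inside each kept page are left untouched, so whatever history the export contains for the articles you redistribute stays with them.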

If you distribute a Wikipedia article in printed form, you cannot "link" to a local copy of the GFDL. Instead, you must additionally print out a copy of Wikipedia:Text of the GNU Free Documentation License for each copy of the Wikipedia article you distribute, and distribute them together. We recommend you distribute the "printable version" of the article, and the "printable version" of the text of the GFDL.

If you are distributing a selection of Wikipedia articles in printed form, you must include one copy of the GFDL in each selection.

As with webpage copying, there are potential issues to do with the "page history", which some allege counts as the history subunit (see above) and hence part of the Document. If you accept this interpretation, then you would need to print out the entire "page history" of the article (you may need to URL-hack to get the entire page history on a single webpage).

Verbatim copies of images

Some images hosted on Wikipedia are available under the GFDL, some are available under other licenses, some are used under fair use, and some are public domain.

If you wish to take verbatim copies of GFDL images from Wikipedia, you should distribute them with a verbatim copy of the image description page, including the section giving the dates and sizes of the various uploads. Viewers of the images must be able to view this image description page. For web copies, consider displaying this information if the image is clicked. For printed copies, this information might go in an appendix, giving the page number where you use the image in question.

For images under other licenses, consult the license in question.


Legally binding terms

Wikipedia does not give legal advice. This page contains our interpretation of the GFDL, but it is not legally binding. For the binding legal terms see the text of the GNU Free Documentation License, particularly section two. See also Wikipedia:Copyrights.