WebCite
From Wikipedia, the free encyclopedia
WebCite is a free, non-profit service supported by a consortium of publishers and editors. It lets scholarly authors cite webpages that WebCite has previously archived, thereby preventing link rot. Its purpose is to let future readers retrieve what the author cited in the past, which is especially important in an academic context.
Rather than relying on a crawler that archives pages in a "random" fashion (as the Internet Archive does), WebCite lets users who want to cite webpages in a scholarly article initiate the archiving process themselves. They then cite, instead of or in addition to the original URL, a WebCite address containing an identifier for the specific snapshot of the page they meant to cite.
WebCite archives all types of web content, including HTML web pages, PDF files, style sheets, JavaScript, and images. It also archives metadata about the collected resources, such as access time, MIME type, and content length. This metadata is useful in establishing the authenticity and provenance of the archived collection.
History
The WebCite idea was first conceived in 1997 and mentioned in a 1998 article on quality control on the Internet, which noted that such a service would also be useful for measuring the citation impact of webpages [1]. In the same year, a pilot service was set up at the address webcite.net (see archived screenshots of that service at [2]). Shortly afterwards, however, Google and the Internet Archive entered the market, both apparently making a service like WebCite redundant. The idea was revived in 2003, when a study published in Science concluded that the publishing world still had no appropriate, agreed-upon solution [3]. Neither the Internet Archive nor Google allows "on-demand" archiving by authors, and neither has interfaces to scholarly journals and publishers to automate the archiving of cited links. In 2005, the first journal announced that it was using WebCite routinely [4], and dozens of other journals followed.
Process
WebCite allows on-demand, prospective archiving and is not crawler-based: a page is archived only if the citing author (or an editor or publisher) requested archiving of the webpage when first citing it. In other words, no cached copy will be found on WebCite unless the author or somebody else has cached the page beforehand.
Caching/archiving a page can be initiated by going to WebCite and using the "archive" menu option, or by creating the WebCite bookmarklet, which allows users to cache pages while they are surfing by clicking a button in their bookmarks folder.
Archived pages can be retrieved or cited using a transparent format like http://www.webcitation.org/query?url=URL&date=DATE, where URL is the original URL that is broken and needs to be restored. The DATE variable is optional and indicates the caching date. For example, http://www.webcitation.org/query?url=www.pewinternet.org/pdfs/PIP_Health_Report_July_2003.pdf&date=2005-12-31 retrieves the archived copy of the URL http://www.pewinternet.org/pdfs/PIP_Health_Report_July_2003.pdf that is closest to the date of December 31, 2005. Alternatively, a short form that uses the WebCite ID is available for citing WebCite'd documents. The short form is primarily used by print journals to save space.
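As an illustration, the transparent query format can be assembled programmatically. The following is a minimal sketch in Python, not an official WebCite client: the function name webcite_query_url is hypothetical, and the choice to percent-encode characters such as "?" and "&" inside the cited URL is an assumption about how the service parses its query string.

```python
from urllib.parse import quote

# Base address of the transparent query format described above.
WEBCITE_QUERY = "http://www.webcitation.org/query"


def webcite_query_url(url, date=None):
    """Build a WebCite query address for a cited URL.

    `url` is the original (possibly broken) address; `date` is the optional
    caching date, written as YYYY-MM-DD like the example in the text.
    Hypothetical helper: the encoding policy below is an assumption.
    """
    # Keep "/" and ":" readable so the result resembles the documented
    # example; encode everything that could be mistaken for a query separator.
    query = "url=" + quote(url, safe="/:")
    if date is not None:
        query += "&date=" + quote(date, safe="")
    return WEBCITE_QUERY + "?" + query


if __name__ == "__main__":
    print(webcite_query_url(
        "www.pewinternet.org/pdfs/PIP_Health_Report_July_2003.pdf",
        "2005-12-31",
    ))
```

Called with the URL and date from the example above, the function reproduces the example address; omitting the date yields an address with only the url parameter.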
References
- Gunther Eysenbach and Mathieu Trudel (2005). "Going, going, still there: using the WebCite service to permanently archive cited web pages". Journal of Medical Internet Research 7 (5).