Atom (standard)

Atom
File extension: .atom, .xml
MIME type: application/atom+xml
Type of format: Syndication
Extended from: XML

The name Atom applies to a pair of related standards. The Atom Syndication Format is an XML language used for web feeds, while the Atom Publishing Protocol (APP for short) is a simple HTTP-based protocol for creating and updating Web resources.

Web feeds allow software programs to check for updates published on a web site. To provide a web feed, a site owner may use specialized software (such as a content management system) that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable format. The feed can then be downloaded by web sites that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content.

A feed contains entries, which may be headlines, full-text articles, excerpts, summaries, and/or links to content on a web site, along with various metadata.

The development of Atom was motivated by the existence of many incompatible versions of the RSS syndication format, all of which had shortcomings, and the poor interoperability [1] of XML-RPC-based publishing protocols. The Atom syndication format was published as an IETF "proposed standard" in RFC 4287. The Atom Publishing Protocol is still in draft form.

Usage

Web feeds are used by the weblog community to share the headlines or full text of the latest entries, and even attached multimedia files. (See podcasting, vodcasting, broadcasting, screencasting, vlogging, and MP3 blogs.) These providers allow other websites to incorporate the weblog's "syndicated" headline or headline-and-short-summary feeds under various usage agreements. Atom and other web syndication formats are now used for many purposes, including journalism, marketing, bug reports, and any other activity involving periodic updates or publications. Atom also provides a standardized way to export an entire blog, or parts of it, for backup or for importing into other blogging systems.

A program known as a feed reader or aggregator can check webpages on behalf of a user and display any updated articles that it finds. It is common to find web feeds on major web sites, as well as on many smaller ones. Some websites let people choose between RSS- and Atom-formatted web feeds; others offer only RSS or only Atom. In particular, many blog and wiki sites offer their web feeds in the Atom format.
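
The update check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular feed reader works: the function name changed_entries and the seen dictionary are made up for the example, and the feed text is passed in as a string rather than downloaded. It relies on each Atom entry carrying an id and an updated timestamp.

```python
import xml.etree.ElementTree as ET

# Namespace of Atom 1.0 elements.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def changed_entries(feed_xml, seen):
    """Return (id, title) pairs for entries that are new or modified.

    `seen` (a hypothetical store for this sketch) maps an entry id to the
    <updated> value recorded on the previous check; it is updated in place.
    """
    root = ET.fromstring(feed_xml)
    changed = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip()
        updated = entry.findtext("atom:updated", default="", namespaces=ATOM_NS).strip()
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS).strip()
        if seen.get(entry_id) != updated:
            changed.append((entry_id, title))
            seen[entry_id] = updated
    return changed


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <updated>2003-12-13T18:30:02Z</updated>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
  </entry>
</feed>"""

seen = {}
first_check = changed_entries(SAMPLE, seen)   # the entry is new
second_check = changed_entries(SAMPLE, seen)  # nothing has changed
```

A real reader would also fetch the feed over HTTP and persist the seen mapping between runs; both are left out here.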

Client-side readers and aggregators may be designed as standalone programs or as extensions to existing programs like web browsers. Browsers are moving toward integrated feed reader functions, such as Safari RSS, Web Browser for S60, Opera, Firefox and Internet Explorer. Such programs are available for various operating systems.

Web-based feed readers and news aggregators require no software installation and make the user's "feeds" available on any computer with Web access. Some aggregators syndicate (combine) web feeds into new feeds, e.g., taking all football related items from several sports feeds and providing a new football feed. There are also search engines for content published via web feeds, including Technorati and Blogdigger.
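
As a rough sketch of the combining step, the following Python function filters entries from several Atom feeds by a keyword in the title and collects them into a new feed. The function name combine_feeds, the sample feeds, and the tag: identifiers are invented for the illustration; an actual aggregator would likely match on more than the title and would remove duplicates.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
NS = {"atom": ATOM}


def updated_of(entry):
    """Return the entry's <updated> text, or an empty string if absent."""
    return entry.findtext("atom:updated", default="", namespaces=NS)


def combine_feeds(feed_xmls, keyword, title, feed_id):
    """Build a new feed element from entries whose title contains keyword."""
    matches = []
    for xml_text in feed_xmls:
        for entry in ET.fromstring(xml_text).findall("atom:entry", NS):
            entry_title = entry.findtext("atom:title", default="", namespaces=NS)
            if keyword.lower() in entry_title.lower():
                matches.append(entry)
    # Newest first. Comparing the strings directly is an assumption that
    # every timestamp uses the same UTC "Z" form, as in the samples below.
    matches.sort(key=updated_of, reverse=True)

    combined = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(combined, f"{{{ATOM}}}title").text = title
    ET.SubElement(combined, f"{{{ATOM}}}id").text = feed_id
    if matches:
        ET.SubElement(combined, f"{{{ATOM}}}updated").text = updated_of(matches[0])
    combined.extend(matches)
    return combined


FEED_A = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sports News</title>
  <id>tag:example.org,2006:sports</id>
  <updated>2006-01-02T10:00:00Z</updated>
  <entry>
    <title>Football final tonight</title>
    <id>tag:example.org,2006:sports/1</id>
    <updated>2006-01-02T10:00:00Z</updated>
  </entry>
  <entry>
    <title>Tennis open begins</title>
    <id>tag:example.org,2006:sports/2</id>
    <updated>2006-01-01T09:00:00Z</updated>
  </entry>
</feed>"""

FEED_B = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Local Sports</title>
  <id>tag:example.net,2006:local</id>
  <updated>2006-01-03T08:00:00Z</updated>
  <entry>
    <title>Local football league results</title>
    <id>tag:example.net,2006:local/1</id>
    <updated>2006-01-03T08:00:00Z</updated>
  </entry>
</feed>"""

football = combine_feeds([FEED_A, FEED_B], "football",
                         "Football", "tag:example.org,2006:football")
titles = [e.findtext("atom:title", namespaces=NS)
          for e in football.findall("atom:entry", NS)]
```

Because every entry keeps its own id, a reader subscribed to both the original and the combined feed can still recognize the same item.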

On Web pages, web feeds (Atom or RSS) are typically linked with the word "Subscribe" or with the unofficial web feed logo.
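
Besides a visible link, pages commonly advertise their feeds to browsers and feed readers through a link element with rel="alternate" in the HTML head, the convention usually called autodiscovery. The Python sketch below scans a page for such elements. The class name FeedLinkFinder, the sample page, and the feed URL are assumptions made for the example.

```python
from html.parser import HTMLParser

# Media types treated as feeds in this sketch.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower().strip()
        href = attributes.get("href")
        if "alternate" in rels and media_type in FEED_TYPES and href:
            self.feeds.append((media_type, href))


# Hypothetical page: the stylesheet link is ignored, the feed link is found.
PAGE = """<html><head>
<title>Example</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/atom+xml"
      title="Example Feed" href="http://example.org/feed.atom">
</head><body>
<a href="http://example.org/feed.atom">Subscribe</a>
</body></html>"""

finder = FeedLinkFinder()
finder.feed(PAGE)
found = finder.feeds
```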

Atom Compared to RSS 2.0

The main motivation for the development of Atom was dissatisfaction with RSS [2]. Among other things, there are multiple incompatible and widely adopted versions of RSS. The intention was to ease the difficulty of developing applications with web syndication feeds.

A brief description of the ways Atom 1.0 seeks to differentiate itself from RSS 2.0 follows [3], [4]:

  • RSS 2.0 may contain either plain text or escaped HTML as a payload, with no way to indicate which of the two is provided. Atom in contrast uses an explicitly labeled (i.e. typed) "entry" (payload) container. It allows for a wider variety of payload types including plain text, escaped HTML, XHTML, XML, Base64-encoded binary, and references to external content such as documents, video and audio streams, and so forth.
  • RSS 2.0 has a "description" element which can contain either a full entry or just a description. Atom has separate “summary” and “content” elements. Atom thus allows the inclusion of non-textual content that can be described by the summary.
  • Atom standardizes autodiscovery in contrast to the many non-standard variants used with RSS 2.0.
  • Atom is defined within an XML namespace whereas RSS 2.0 is not.
  • Atom specifies use of XML's built-in xml:base attribute for relative URIs. RSS 2.0 does not have a means of differentiating between relative and non-relative URIs.
  • Atom uses XML's built-in xml:lang attribute as opposed to RSS 2.0's use of its own "language" element.
  • In Atom, it is mandatory that each entry have a globally unique ID, which is important for reliable updating of entries.
  • Atom 1.0 allows standalone Atom Entry documents whereas with RSS 2.0 only full feed documents are supported.
  • Atom specifies that dates be in the format described in RFC 3339 (which is a subset of ISO 8601). The date format in RSS 2.0 was underspecified and has led to many different formats being used.
  • Atom 1.0 has an IANA-registered MIME type. RSS 2.0 feeds are often sent as application/rss+xml, although it is not a registered MIME type.
  • Atom 1.0 includes an XML schema (an informative RELAX NG schema in an appendix of the specification). RSS 2.0 does not.
  • Atom is an open and evolvable standard developed through the IETF standardization process. RSS 2.0 is not standardized by any standards body. Furthermore, according to its copyright, it may not be modified.
  • Atom 1.0 elements can be used as extensions to other XML vocabularies, including RSS 2.0 as illustrated in a weblog post by Tim Bray entitled "Atomic RSS".
  • Atom 1.0 describes how feeds and entries may be digitally signed using the XML Digital Signatures specification such that entries can be copied across multiple Feed Documents without breaking the signature.
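
Two of the points above, typed content and RFC 3339 dates, can be illustrated with a short Python sketch. It is a simplified reading of the rules and not a complete implementation: the function names describe_content and parse_atom_date, the sample entries, and the example.org URL are assumptions, XHTML and inline XML payloads are not handled, and the date parser covers only the common timestamp forms.

```python
import base64
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"atom": "http://www.w3.org/2005/Atom"}


def describe_content(entry):
    """Return (kind, payload) for an entry, based on the type attribute."""
    content = entry.find("atom:content", NS)
    if content is None:
        return ("none", None)
    if content.get("src"):
        # Reference to external content; the element itself is empty.
        return ("external", content.get("src"))
    content_type = content.get("type", "text")  # "text" is the default
    if content_type in ("text", "html"):
        # For "html" the XML parser has already unescaped the markup.
        return (content_type, content.text or "")
    if content_type.startswith("text/"):
        return ("text", content.text or "")
    # Simplification: any other media type is treated as Base64 data.
    return ("binary", base64.b64decode(content.text or ""))


def parse_atom_date(value):
    """Parse a timestamp such as 2003-12-13T18:30:02Z into a datetime."""
    # Older Python versions do not accept "Z" in fromisoformat.
    return datetime.fromisoformat(value.strip().replace("Z", "+00:00"))


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2003:1</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <content type="html">&lt;b&gt;Bold&lt;/b&gt; news</content>
  </entry>
  <entry>
    <id>tag:example.org,2003:2</id>
    <updated>2003-12-13T08:29:29-04:00</updated>
    <content type="application/octet-stream">aGVsbG8=</content>
  </entry>
  <entry>
    <id>tag:example.org,2003:3</id>
    <updated>2003-12-14T00:00:00Z</updated>
    <content type="video/mpeg" src="http://example.org/robots.mpg"/>
  </entry>
</feed>"""

entries = ET.fromstring(SAMPLE).findall("atom:entry", NS)
payloads = [describe_content(e) for e in entries]
dates = [parse_atom_date(e.findtext("atom:updated", namespaces=NS)) for e in entries]
```

The reader never has to guess how to interpret a payload: the type attribute (or the presence of src) says so, which is the contrast with the RSS 2.0 description element drawn above.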

Despite the emergence of Atom as an IETF Proposed Standard and the decision by major companies such as Google to embrace Atom, use of the older and more widely known RSS 1.0 and RSS 2.0 formats has continued.

  • Many sites choose to publish their feeds in only a single format. For example CNN, the New York Times, and the BBC offer their web feeds only in RSS 2.0 format.
  • News articles about web syndication feeds have increasingly used the term "RSS" to refer generically to any of the several variants of the RSS format, such as RSS 2.0 and RSS 1.0, as well as to the Atom format. (For example, "There's a Popular New Code for Deals: RSS", NYT, January 29, 2006.)
  • RSS 2.0 support for enclosures led directly to the development of podcasting. While many podcasting applications, such as iTunes, support the use of Atom 1.0, RSS 2.0 remains the preferred format [5].
  • Each of the various web syndication feed formats has attracted large groups of supporters who remain satisfied by the specification and capabilities of their respective formats.

Development History

Background

Before the creation of Atom the primary method of web content syndication was the RSS family of formats.

Members of the community who felt there were significant deficiencies with this family of formats were unable to make changes directly to RSS 2.0 because it was not an open standard. RSS 2.0 was copyrighted by Harvard University, and its official specification document stated that it was purposely frozen: "no significant changes can be made and it is intended that future work be done under a different name". [6]

Initial Work

In June 2003, Sam Ruby set up a wiki to discuss what makes "a well-formed log entry". This initial posting acted as a rallying point. [7] People quickly started using the wiki to discuss a new syndication format to address the shortcomings of RSS. It also became clear that the new format could form the basis of a more robust replacement for blog editing protocols such as the Blogger API and the LiveJournal XML-RPC Client/Server Protocol.

The project aimed to develop a web syndication format that was: [8]

  • "100% vendor neutral,"
  • "implemented by everybody,"
  • "freely extensible by anybody, and"
  • "cleanly and thoroughly specified."

In short order, a project road map was built. The effort quickly attracted more than 150 supporters including David Sifry of Technorati, Mena Trott of Six Apart, Brad Fitzpatrick of LiveJournal, Jason Shellen of Blogger, Jeremy Zawodny of Yahoo, Timothy Appnel of the O'Reilly Network, Glenn Otis Brown of Creative Commons and Lawrence Lessig. Other notables supporting Atom include Mark Pilgrim, Tim Bray, Aaron Swartz, Joi Ito, and Jack Park. [9] Also, Dave Winer, the key figure behind RSS 2.0, gave tentative support to the Atom endeavor (which at the time was called Echo). [10]

After this point, discussion became chaotic, due to the lack of a decision-making process. The project also lacked a name, tentatively using "Pie," "Echo," and "Necho" before settling on Atom. After releasing a project snapshot known as Atom 0.2 in early July 2003, discussion was shifted off the wiki.

Atom 0.3 and Adoption by Google

The discussion then moved to a newly set up mailing list. The next and final snapshot during this phase was Atom 0.3, released in December 2003. This version gained widespread adoption in syndication tools, and in particular it was added to several Google-related services, such as Blogger, Google News, and Gmail. Google's Data APIs (GData, in beta) are based on Atom 1.0 and RSS 2.0.

Atom 1.0 and IETF Standardization

In 2004, discussions began about moving the project to a standards body such as the World Wide Web Consortium or the Internet Engineering Task Force (IETF). The group eventually chose the IETF and the Atompub working group was formally set up in June 2004, finally giving the project a charter and process. The Atompub working group is co-chaired by Tim Bray (the co-editor of the XML specification) and Paul Hoffman. Initial development was focused on the syndication format.

The final draft of Atom 1.0 was published in July 2005 and was accepted by the IETF as a "proposed standard" in August of 2005. Work then continued on the further development of the publishing protocol and various extensions to the syndication format.

The Atom Syndication Format was issued as a proposed "internet official protocol standard" in IETF RFC 4287 in December 2005. [11] The co-editors of RFC 4287 were Mark Nottingham and Robert Sayre.

Example of an Atom 1.0 Feed

An example of a document in the Atom Syndication Format:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Example Feed</title>
 <subtitle>A subtitle.</subtitle>
 <link href="http://example.org/"/>
 <updated>2003-12-13T18:30:02Z</updated>
 <author>
   <name>John Doe</name>
   <email>johndoe@example.com</email>
 </author>
 <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>

 <entry>
   <title>Atom-Powered Robots Run Amok</title>
   <link href="http://example.org/2003/12/13/atom03"/>
   <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
   <updated>2003-12-13T18:30:02Z</updated>
   <summary>Some text.</summary>
 </entry>

</feed>
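
As a minimal sketch of how a program might read the document above, the following Python code parses it with the standard library's ElementTree module and collects a few fields. The function name summarize and the shape of the returned dictionary are choices made for this illustration only.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

# The example feed from above, as bytes so the encoding declaration applies.
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 <title>Example Feed</title>
 <subtitle>A subtitle.</subtitle>
 <link href="http://example.org/"/>
 <updated>2003-12-13T18:30:02Z</updated>
 <author>
   <name>John Doe</name>
   <email>johndoe@example.com</email>
 </author>
 <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
 <entry>
   <title>Atom-Powered Robots Run Amok</title>
   <link href="http://example.org/2003/12/13/atom03"/>
   <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
   <updated>2003-12-13T18:30:02Z</updated>
   <summary>Some text.</summary>
 </entry>
</feed>"""


def text(node, path):
    """Return the stripped text of the first match, or an empty string."""
    return node.findtext(path, default="", namespaces=NS).strip()


def href(node):
    """Return the href of the node's first <link>, or None if it has none."""
    link = node.find("atom:link", NS)
    return link.get("href") if link is not None else None


def summarize(feed_bytes):
    """Collect feed-level fields and a list of entries into a dictionary."""
    root = ET.fromstring(feed_bytes)
    return {
        "title": text(root, "atom:title"),
        "author": text(root, "atom:author/atom:name"),
        "link": href(root),
        "entries": [
            {
                "title": text(entry, "atom:title"),
                "link": href(entry),
                "summary": text(entry, "atom:summary"),
            }
            for entry in root.findall("atom:entry", NS)
        ],
    }


info = summarize(FEED)
```

Every element lookup is qualified with the Atom namespace, since the feed element declares it as the default namespace for the whole document.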
