Sitemap index

From Wikipedia, the free encyclopedia

A Sitemap index is an XML file that lists multiple XML sitemap files; in effect, it is a sitemap of sitemaps. Its XML format is very similar to that of a Sitemap file[1]. It allows webmasters to include additional information about each XML sitemap, namely when it was last updated. Once a Sitemap index file has been created, webmasters need only notify search engines about the index file; the XML sitemaps listed in it are then picked up automatically[2].

XML Sitemap index format

The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded.

Sample

The following example shows a Sitemap index that lists two Sitemaps:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml.gz</loc>
      <lastmod>2004-10-01T18:23:17+00:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml.gz</loc>
      <lastmod>2005-01-01</lastmod>
   </sitemap>
</sitemapindex>
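
A file like the sample can also be generated programmatically. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; the function name build_sitemap_index and the (loc, lastmod) pair layout are assumptions made for illustration and are not part of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(entries):
    """Return a Sitemap index as UTF-8 encoded bytes.

    entries: iterable of (loc, lastmod) pairs; lastmod may be None.
    (Hypothetical helper, named here for illustration only.)
    """
    # Setting xmlns as a plain attribute yields the default namespace on output.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod is not None:  # <lastmod> is optional
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)

index_xml = build_sitemap_index([
    ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
    ("http://www.example.com/sitemap2.xml.gz", "2005-01-01"),
])
print(index_xml.decode("utf-8"))
```

The output is written on a single line rather than indented as in the sample, which does not change its meaning as XML.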

Submitting Sitemaps

If a Sitemap is submitted directly to a search engine, the search engine returns status information and any processing errors. For details, refer to Google Webmaster Tools or Yahoo! Site Explorer.

Search engine | Submission URL | Help page
Google | http://www.google.com/webmasters/sitemaps/ping?sitemap= | How do I resubmit my Sitemap once it has changed?
Yahoo! | http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=SitemapWriter&url= | Does Yahoo! support Sitemaps?
Yahoo! | http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap= | Does Yahoo! support Sitemaps?
Ask.com | http://submissions.ask.com/ping?sitemap= | Q: Does Ask.com support sitemaps?
Live Search | http://webmaster.live.com/ping.aspx?siteMap= | Webmaster Tools (beta)
All search engines | http://www.sitemapwriter.com/notify.php?crawler=all&url= | Ping URL for XML sitemaps
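
Each submission URL ends in a query parameter to which the address of the Sitemap index is appended. The sketch below shows one way to assemble such a URL in Python, assuming the address should be percent-encoded before it is appended. The dictionary PING_ENDPOINTS and the function build_ping_url are hypothetical names, and the endpoints are copied from a subset of the entries listed above. The code only builds the URL and does not send any request.

```python
from urllib.parse import quote

# Submission endpoints copied from the list above (a subset, for illustration).
PING_ENDPOINTS = {
    "google": "http://www.google.com/webmasters/sitemaps/ping?sitemap=",
    "ask": "http://submissions.ask.com/ping?sitemap=",
    "live": "http://webmaster.live.com/ping.aspx?siteMap=",
}

def build_ping_url(engine, sitemap_index_url):
    """Append the percent-encoded Sitemap index URL to the engine's endpoint."""
    # safe="" also encodes ":" and "/", so the address is a single parameter value.
    return PING_ENDPOINTS[engine] + quote(sitemap_index_url, safe="")

ping_url = build_ping_url("google", "http://www.example.com/sitemap_index.xml")
print(ping_url)
```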

The location of the Sitemap index can also be specified in a robots.txt file to help search engines find it. To do this, add the following line to robots.txt:

Sitemap: <sitemap_index_location>

The <sitemap_index_location> should be the complete URL to the Sitemap index, such as: http://www.example.org/sitemap_index.xml
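
Adding the directive can be automated. The following sketch, with the hypothetical function name add_sitemap_directive, appends the Sitemap line to existing robots.txt content unless an identical line is already present. It works on strings only and leaves reading and writing the file to the caller.

```python
def add_sitemap_directive(robots_txt, sitemap_index_url):
    """Return robots.txt content that includes a Sitemap line for the given URL.

    The text is returned unchanged if the same directive is already present.
    """
    directive = f"Sitemap: {sitemap_index_url}"
    lines = robots_txt.splitlines()
    if any(line.strip() == directive for line in lines):
        return robots_txt
    lines.append(directive)
    return "\n".join(lines) + "\n"

existing = "User-agent: *\nDisallow: /private/\n"
updated = add_sitemap_directive(existing, "http://www.example.org/sitemap_index.xml")
print(updated)
```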

Sitemap Limits

Sitemap index files may not list more than 1,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes).

The Sitemap index file must:

   * Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
   * Include a <sitemap> entry for each Sitemap as a parent XML tag.
   * Include a <loc> child entry for each <sitemap> parent tag.

The optional <lastmod> tag is also available for Sitemap index files.
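
The limits and structural requirements above can be checked mechanically. The following Python sketch is one possible approach, not an official validator: the function name check_sitemap_index is hypothetical, and the constants take the figures of 1,000 Sitemaps and 10MB exactly as stated in this section.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_SITEMAPS = 1000      # limit as stated in this section
MAX_BYTES = 10_485_760   # 10MB, as stated in this section

def check_sitemap_index(xml_bytes):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 10MB")
    root = ET.fromstring(xml_bytes)
    if root.tag != NS + "sitemapindex":
        problems.append("root element is not <sitemapindex>")
    sitemaps = root.findall(NS + "sitemap")
    if len(sitemaps) > MAX_SITEMAPS:
        problems.append("more than 1,000 <sitemap> entries")
    for position, sitemap in enumerate(sitemaps, start=1):
        loc = sitemap.find(NS + "loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<sitemap> entry {position} has no <loc>")
    return problems

sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml.gz</loc>
   </sitemap>
   <sitemap>
      <lastmod>2005-01-01</lastmod>
   </sitemap>
</sitemapindex>"""
print(check_sitemap_index(sample))  # ['<sitemap> entry 2 has no <loc>']
```

The second entry in the sample deliberately omits <loc>, so the check reports it.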

Time format for <lastmod> tag

The value of the <lastmod> tag should be in W3C Datetime format, for example 2007-08-25T00:00:00+00:00. This format, a profile of ISO 8601, allows the time portion to be omitted, so 2007-08-25 is also valid.

Available time formats:

Format | Example
YYYY-MM-DDThh:mm:ssTZD | 2007-08-25T00:00:00+00:00
YYYY-MM-DDThh:mmTZD | 2007-08-25T00:00+00:00
YYYY-MM-DD | 2007-08-25
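
As an illustration, the three formats listed above can be produced with Python's standard datetime module. This is a sketch that assumes a timezone-aware value in UTC; the variable names are arbitrary.

```python
from datetime import datetime, timezone

# A timezone-aware moment, so that the TZD part (+00:00) is included.
moment = datetime(2007, 8, 25, 0, 0, 0, tzinfo=timezone.utc)

full = moment.isoformat(timespec="seconds")     # YYYY-MM-DDThh:mm:ssTZD
minutes = moment.isoformat(timespec="minutes")  # YYYY-MM-DDThh:mmTZD
date_only = moment.date().isoformat()           # YYYY-MM-DD

print(full)       # 2007-08-25T00:00:00+00:00
print(minutes)    # 2007-08-25T00:00+00:00
print(date_only)  # 2007-08-25
```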

Validating Sitemap index

Google uses an XML schema to define the elements and attributes that can appear in a Sitemap index file. The schema is available at http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd

In order to validate your Sitemap or Sitemap index file against a schema, the XML file will need additional headers.[3]
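
The text does not spell out the additional headers. The sketch below assumes they are the standard XML Schema instance attributes, that is, an xsi:schemaLocation attribute on the root element that pairs the Sitemap namespace with the schema URL given above. The function name add_schema_headers is hypothetical, and the exact headers a particular validator expects should be checked against its documentation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_URL = "http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"

# Keep the Sitemap namespace as the default and use the conventional "xsi" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xsi", XSI_NS)

def add_schema_headers(xml_bytes):
    """Return the document with an xsi:schemaLocation attribute on the root element."""
    root = ET.fromstring(xml_bytes)
    # schemaLocation is a whitespace-separated pair: namespace, then schema URL.
    root.set(f"{{{XSI_NS}}}schemaLocation", f"{SITEMAP_NS} {SCHEMA_URL}")
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)

minimal_index = (
    b'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<sitemap><loc>http://www.example.com/sitemap1.xml.gz</loc></sitemap>"
    b"</sitemapindex>"
)
with_headers = add_schema_headers(minimal_index)
print(with_headers.decode("utf-8"))
```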


Notes

  1. Sitemaps.org official site
  2. FAQ of sitemapwriter.com
  3. Google Webmaster Tools

