Sitemap index
From Wikipedia, the free encyclopedia
A Sitemap index is an XML file that lists multiple XML Sitemap files. Its XML format is very similar to that of a Sitemap file[1]. It allows webmasters to include additional information about each listed Sitemap, namely when it was last updated. Once a Sitemap index file has been created, webmasters only need to notify search engines about the index file; the Sitemaps listed in it are then picked up automatically[2].
XML Sitemap index Format
The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded.
Sample
The following example shows a Sitemap index that lists two Sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml.gz</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>
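A file like the sample can also be generated programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name build_sitemap_index and the (loc, lastmod) pair layout are illustrative assumptions, not part of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Return a Sitemap index document as UTF-8 encoded bytes.

    entries: iterable of (loc, lastmod) pairs (hypothetical layout);
    lastmod may be None because the tag is optional.
    """
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    document = build_sitemap_index([
        ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
        ("http://www.example.com/sitemap2.xml.gz", "2005-01-01"),
    ])
    print(document.decode("utf-8"))
```

The output is a single line rather than the indented layout of the sample; the two are equivalent as XML.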
Submitting Sitemaps
If Sitemaps are submitted directly to a search engine, it will return status information and any processing errors. Refer to Google Webmaster Tools or Yahoo SiteExplorer.
Search engine | Submission URL | Help page |
---|---|---|
Google | http://www.google.com/webmasters/sitemaps/ping?sitemap= | How do I resubmit my Sitemap once it has changed? |
Yahoo! | http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=SitemapWriter&url= or http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap= | Does Yahoo! support Sitemaps? |
Ask.com | http://submissions.ask.com/ping?sitemap= | Q: Does Ask.com support sitemaps? |
Live Search | http://webmaster.live.com/ping.aspx?siteMap= | Webmaster Tools (beta) |
All search engines | http://www.sitemapwriter.com/notify.php?crawler=all&url= | Ping URL for XML sitemaps |
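Each submission URL in the table ends with an empty query parameter, to which the address of the Sitemap index is appended. The sketch below shows one way to assemble such a URL in Python, assuming the address should be percent-encoded before it is appended. The helper name build_ping_url is hypothetical, and the code only builds the string: it sends no request, since the endpoints listed here may no longer respond.

```python
from urllib.parse import quote


def build_ping_url(submission_url, sitemap_index_url):
    """Append the percent-encoded Sitemap index URL to a submission URL.

    submission_url is expected to end with an empty query parameter,
    as the entries in the table above do (for example "...ping?sitemap=").
    """
    # safe="" also encodes ":" and "/" so the value is a single parameter.
    return submission_url + quote(sitemap_index_url, safe="")


if __name__ == "__main__":
    print(build_ping_url(
        "http://www.google.com/webmasters/sitemaps/ping?sitemap=",
        "http://www.example.org/sitemap_index.xml",
    ))
```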
The location of the Sitemap index can also be specified in a robots.txt file, which helps search engines find the Sitemap index. To do this, add the following line to robots.txt:
Sitemap: <sitemap_index_location>
The <sitemap_index_location> should be the complete URL to the Sitemap index, such as: http://www.example.org/sitemap_index.xml
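As an illustration, the directive can be added to existing robots.txt content with a few lines of Python. This is a sketch only: the function name add_sitemap_directive is an assumption, and it works on the file's text rather than on a file on disk.

```python
def add_sitemap_directive(robots_txt, sitemap_index_url):
    """Return robots.txt content with a Sitemap line for the given URL.

    The content is returned unchanged if the same line is already present.
    """
    directive = "Sitemap: " + sitemap_index_url
    lines = robots_txt.splitlines()
    if any(line.strip() == directive for line in lines):
        return robots_txt
    lines.append(directive)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    existing = "User-agent: *\nDisallow:\n"
    print(add_sitemap_directive(existing, "http://www.example.org/sitemap_index.xml"))
```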
Sitemap Limits
Sitemap index files may not list more than 1,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes).
The Sitemap index file must:
* Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
* Include a <sitemap> entry for each Sitemap as a parent XML tag.
* Include a <loc> child entry for each <sitemap> parent tag.
The optional <lastmod> tag is also available for Sitemap index files.
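The limits and structural requirements in this section can be checked mechanically. The following Python sketch does so with the standard library; the function name check_sitemap_index and the wording of the messages are assumptions, and the two limits are simply the figures quoted in this section.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_SITEMAPS = 1000      # limit as quoted in this section
MAX_BYTES = 10_485_760   # 10MB, as quoted in this section


def check_sitemap_index(data):
    """Return a list of problems found in a Sitemap index given as bytes."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append("file is larger than %d bytes" % MAX_BYTES)
    root = ET.fromstring(data)  # raises ParseError if not well-formed
    if root.tag != NS + "sitemapindex":
        problems.append("root element is not <sitemapindex>")
        return problems
    sitemaps = root.findall(NS + "sitemap")
    if len(sitemaps) > MAX_SITEMAPS:
        problems.append("more than %d Sitemaps listed" % MAX_SITEMAPS)
    for number, sitemap in enumerate(sitemaps, start=1):
        loc = sitemap.find(NS + "loc")
        if loc is None or not (loc.text or "").strip():
            problems.append("<sitemap> entry %d has no <loc>" % number)
    return problems
```

An empty list means none of the checks failed; it does not replace validation against the schema.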
Time format for <lastmod> tag
The value of the <lastmod> tag should be in W3C Datetime format, for example 2007-08-25T00:00:00+00:00. This format, a profile of ISO 8601, allows the time portion to be omitted; for example, 2007-08-25 is also valid.
Available time formats:
Format | Example |
---|---|
YYYY-MM-DDThh:mm:ssTZD | 2007-08-25T00:00:00+00:00 |
YYYY-MM-DDThh:mmTZD | 2007-08-25T00:00+00:00 |
YYYY-MM-DD | 2007-08-25 |
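A value can be tested against the three formats in the table with a regular expression. The Python sketch below is one way to do it, under two assumptions: TZD is taken to be either Z or an offset of the form +hh:mm or -hh:mm, as in W3C Datetime, and the function name is_valid_lastmod is hypothetical. Other W3C Datetime variants not listed in the table are rejected, and the range of the offset is not checked.

```python
import re
from datetime import datetime

# Date, optionally followed by a time (seconds optional) and a mandatory TZD.
_LASTMOD_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"
    r"(?:T(\d{2}):(\d{2})(?::(\d{2}))?(Z|[+-]\d{2}:\d{2}))?"
)


def is_valid_lastmod(value):
    """Return True if value matches one of the formats in the table above."""
    match = _LASTMOD_RE.fullmatch(value)
    if match is None:
        return False
    year, month, day, hour, minute, second, _tzd = match.groups()
    try:
        # The constructor rejects impossible values such as month 13.
        datetime(int(year), int(month), int(day),
                 int(hour or 0), int(minute or 0), int(second or 0))
    except ValueError:
        return False
    return True
```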
Validating Sitemap index
Google uses an XML schema to define the elements and attributes that can appear in a Sitemap index file. The schema is available at http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd
In order to validate your Sitemap or Sitemap index file against a schema, the XML file will need additional headers.[3]
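The sketch below assumes that the additional headers are the XML Schema instance attributes (xmlns:xsi and xsi:schemaLocation) on the root element, pointing at the siteindex.xsd schema mentioned in this section. It adds them to an existing Sitemap index using Python's standard library; the function name add_schema_headers is hypothetical, and the code does not perform the validation itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_URL = "http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"


def add_schema_headers(data):
    """Return the Sitemap index (bytes) with schema location attributes added."""
    # register_namespace changes module-wide serialization settings:
    # Sitemap elements stay unprefixed and the instance namespace uses "xsi".
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xsi", XSI_NS)
    root = ET.fromstring(data)
    # schemaLocation holds a "namespace schema-URL" pair separated by a space.
    root.set("{%s}schemaLocation" % XSI_NS, SITEMAP_NS + " " + SCHEMA_URL)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

The result can then be passed to any schema-aware XML validator.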