
sitemap index

An XML file with a `<sitemapindex>` root that lists the URLs of other sitemap files -- the canonical way to publish more than 50,000 URLs without exceeding the protocol's per-file cap.

A sitemap index is an XML file whose root element is <sitemapindex> rather than the usual <urlset>. Instead of listing page URLs directly, it lists the URLs of other sitemap files. Crawlers read the index, fetch each child sitemap, and combine the URLs those files list.
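The distinction between the two root elements can be sketched in Python with the standard library. This is a minimal illustration, not how any particular crawler works; the function name classify_and_list is hypothetical, and the sketch does no network fetching.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def classify_and_list(xml_data):
    """Return (kind, locs) for a sitemap document.

    kind is "index" for a <sitemapindex> root and "urlset" for a <urlset>
    root. locs holds the <loc> values: child sitemap URLs for an index,
    page URLs for a urlset. (Hypothetical helper, for illustration only.)
    """
    root = ET.fromstring(xml_data)
    if root.tag == f"{{{SITEMAP_NS}}}sitemapindex":
        kind, child = "index", "sitemap"
    elif root.tag == f"{{{SITEMAP_NS}}}urlset":
        kind, child = "urlset", "url"
    else:
        raise ValueError(f"unexpected root element: {root.tag}")

    path = f"{{{SITEMAP_NS}}}{child}/{{{SITEMAP_NS}}}loc"
    locs = [loc.text.strip() for loc in root.iterfind(path) if loc.text]
    return kind, locs
```

A crawler-like consumer would call this on the index, then fetch each returned URL and call it again on the child, collecting the page URLs from every "urlset" result.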

Use a sitemap index when:

  • Your URL set exceeds the per-sitemap protocol cap (50,000 URLs OR 50 MB uncompressed -- whichever you hit first).
  • You want to organize by content type (/sitemap-products.xml, /sitemap-articles.xml, /sitemap-help.xml) for easier debugging and selective resubmission to Search Console.
  • You want to signal different update rates per section -- the child sitemap for a daily-changing news section can carry a fresh lastmod in the index, while the sitemap for static help docs keeps an old one, so the help docs are not flagged as fast-changing too.
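The first case above, splitting a large URL set to stay under the 50,000-URL cap, can be sketched as follows. The file-name pattern and the function names are assumptions made for illustration. The sketch enforces only the URL-count cap; a real generator would also need to check the 50 MB size limit on each serialized file.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL cap from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of `urls`, each at most `size` long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def plan_sitemaps(urls, pattern="https://example.com/sitemap-{n}.xml"):
    """Map a child sitemap URL to the page URLs it should contain.

    `pattern` is a hypothetical naming scheme; any stable scheme works.
    """
    return {
        pattern.format(n=i): chunk
        for i, chunk in enumerate(chunk_urls(urls), start=1)
    }


if __name__ == "__main__":
    pages = [f"https://example.com/p/{i}" for i in range(120_001)]
    plan = plan_sitemaps(pages)
    print([len(chunk) for chunk in plan.values()])  # [50000, 50000, 20001]
```

The keys of the returned mapping are the values that would go into the <loc> elements of the index.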

Format:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-05-09</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-articles.xml</loc>
  </sitemap>
</sitemapindex>
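A document in this format could be generated with Python's standard library roughly as shown below. This is a sketch under stated assumptions: build_sitemap_index is a hypothetical helper, and lastmod values are passed in as already-formatted date strings rather than computed.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(children: List[Tuple[str, Optional[str]]]) -> bytes:
    """Serialize (loc, lastmod) pairs as a <sitemapindex> document.

    lastmod may be None, in which case the element is omitted, as in the
    second <sitemap> entry of the example above.
    """
    # Emit the protocol namespace as the default xmlns instead of "ns0:".
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, lastmod in children:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap_index([
        ("https://example.com/sitemap-products.xml", "2026-05-09"),
        ("https://example.com/sitemap-articles.xml", None),
    ])
    print(xml_bytes.decode("utf-8"))
```

Building the tree through ElementTree rather than string concatenation means characters such as & in a URL are escaped automatically.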

A sitemap index can itself be referenced from robots.txt via a Sitemap: line. Listing every child sitemap on its own Sitemap: line also works, but a single line pointing at the index is generally cleaner. Google Search Console treats the index as the entry point and shows per-child stats.
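To illustrate how Sitemap: lines in robots.txt can be discovered, here is a small Python sketch. The function name and the sitemap-index.xml file name are hypothetical, and the parsing is deliberately simple: it matches the field name case-insensitively and treats # as the start of a comment.

```python
def sitemap_urls_from_robots(robots_txt: str) -> list:
    """Return the URLs declared on Sitemap: lines, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so "https://" stays intact.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "Sitemap: https://example.com/sitemap-index.xml\n"
    )
    print(sitemap_urls_from_robots(robots))
```

With one index declared, the returned list has a single entry, which is the entry point for every child sitemap.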

Common bug: shipping a <sitemapindex> with an empty body (no <sitemap> children) -- the BeaverCheck sitemap-depth analyzer flags this as sd-index-no-children.
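A check for this condition can be sketched in a few lines of Python. This is not BeaverCheck's actual implementation, which the text does not describe; index_has_no_children is a hypothetical name, and the sketch treats an index as empty when no <sitemap> element contains a <loc>.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def index_has_no_children(xml_data) -> bool:
    """Return True if a <sitemapindex> document lists no child sitemaps."""
    root = ET.fromstring(xml_data)
    if root.tag != f"{{{SITEMAP_NS}}}sitemapindex":
        raise ValueError("not a sitemap index")
    # Look for at least one <sitemap><loc>...</loc></sitemap> entry.
    first_loc = root.find(f"{{{SITEMAP_NS}}}sitemap/{{{SITEMAP_NS}}}loc")
    return first_loc is None
```

Running a check like this in the build step that generates the index catches the empty file before it is published.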

