Sitemap
An XML or HTML file listing the pages of a website, used to help search engines discover and index content efficiently.
Also known as: XML sitemap, sitemap.xml, HTML sitemap
A sitemap is a file that lists the pages of a website so that search engines can discover and index its content. The most common form is an XML sitemap, conventionally served at /sitemap.xml: a machine-readable file designed for search engine crawlers. An HTML sitemap is a human-readable page listing the site’s content; it is less common today but still used on some larger sites.
XML sitemap structure
An XML sitemap follows a defined schema (sitemaps.org). A minimal example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-23</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-03-10</lastmod>
  </url>
</urlset>
Each <url> entry can include:
- <loc>, the page URL (required)
- <lastmod>, last modification date (recommended; search engines actually use this)
- <changefreq>, how often the page changes (largely ignored by modern crawlers)
- <priority>, relative importance (largely ignored by modern crawlers)
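The structure above can be produced programmatically. The following is a minimal sketch using Python's standard library, not a definitive generator; the function name build_sitemap and the (loc, lastmod) tuple format are assumptions made for illustration. It emits only <loc> and <lastmod>, since the other two fields are largely ignored.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (loc, lastmod) tuples; lastmod may be None.

    build_sitemap and the tuple format are illustrative, not a standard API.
    """
    # Setting xmlns as a plain attribute makes it the default namespace on output.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-04-23"),
    ("https://example.com/about", "2026-03-10"),
])
print(sitemap_xml)
```

Building the document through an XML library rather than string concatenation means special characters in URLs are escaped automatically.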
What sitemaps do
Sitemaps help search engines:
- Discover pages they might not find through normal crawling
- Understand site structure at a glance
- Prioritize crawling of recently updated pages (via <lastmod>)
- Index large sites more efficiently
- Cover orphan pages (pages not linked from elsewhere)
For small sites with good internal linking, a sitemap may add little beyond what crawling alone provides. For larger or more complex sites, where some pages are deep in the structure or poorly linked, sitemaps can noticeably improve indexation.
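To illustrate how <lastmod> can drive prioritization, here is a rough sketch of a consumer that reads a sitemap and returns the pages modified on or after a cutoff date, newest first. It is not how any particular search engine works; the function name updated_since and the sample data are assumptions, and only <lastmod> values that start with a full YYYY-MM-DD date are handled.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample input for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-04-23</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>2026-03-10</lastmod></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>"""


def updated_since(sitemap_xml, cutoff):
    """Return (loc, date) pairs with lastmod >= cutoff, newest first."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    recent = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None or lastmod is None:
            continue  # entries without lastmod carry no freshness signal
        try:
            # A full timestamp starts with the date; keep the first 10 characters.
            modified = date.fromisoformat(lastmod.strip()[:10])
        except ValueError:
            continue  # skip year-only or otherwise unparseable values
        if modified >= cutoff:
            recent.append((loc.strip(), modified))
    return sorted(recent, key=lambda item: item[1], reverse=True)


recent_pages = updated_since(SAMPLE, date(2026, 4, 1))
print(recent_pages)
```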
Sitemap variants
| Variant | Purpose |
|---|---|
| Standard XML sitemap | Lists pages |
| Image sitemap | Lists images for image search indexing |
| Video sitemap | Lists videos with metadata |
| News sitemap | For Google News submissions |
| Sitemap index | Master file listing multiple individual sitemaps; useful for sites with more than 50,000 URLs |
A sitemap index allows breaking large sites into multiple sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
  </sitemap>
</sitemapindex>
Where to put the sitemap
The conventional location is the root of the site: https://example.com/sitemap.xml. Other locations work as well, provided the sitemap URL is referenced from robots.txt or submitted directly through the search engines' webmaster consoles.
How sitemaps are submitted to search engines
Several ways:
- Reference in robots.txt. Add Sitemap: https://example.com/sitemap.xml to robots.txt
- Google Search Console. Submit the sitemap URL in the Sitemaps section
- Bing Webmaster Tools. Submit through the Sitemaps section
- Yandex Webmaster. Similar process for the Russian search engine
- Ping (legacy). Some search engines historically accepted ping URLs to notify of sitemap updates; this is largely deprecated
After submission, search engines fetch the sitemap on a schedule (often daily for active sites).
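The robots.txt reference can be read back with Python's standard library, which is a convenient way to check that the Sitemap line is picked up. This is a small sketch: the robots.txt content is sample data, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []
print(sitemaps)
```

To check a live site instead, the parser's set_url() and read() methods can fetch the file over the network in place of parse().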
Sitemap limits
- Maximum 50,000 URLs per sitemap file
- Maximum 50 MB uncompressed file size
- For larger sites, use sitemap indexes to combine multiple sitemaps
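As a sketch of how the URL limit is typically handled, the code below splits a URL list into chunks of at most 50,000 and builds a sitemap index pointing at one file per chunk. The helper names and the sitemap-N.xml naming scheme are assumptions, and the sketch does not check the 50 MB size limit, which would need a separate check on each serialized file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split URLs into lists of at most `size` entries each."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Build a sitemap index document listing the given sitemap URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 pages.
all_urls = [f"https://example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)

# Hypothetical naming scheme: sitemap-1.xml, sitemap-2.xml, ...
child_sitemaps = [
    f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))
]
index_xml = build_index(child_sitemaps)
print(len(chunks), [len(chunk) for chunk in chunks])
```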
Generating sitemaps
Most modern web platforms generate sitemaps automatically:
- WordPress. Generates a sitemap by default since version 5.5; SEO plugins (Yoast, Rank Math) override with their own
- Static site generators. Astro, Hugo, Eleventy, Next.js, Gatsby, etc. include sitemap generation as plugins or built-in features
- Hosted CMS. Squarespace, Wix, Webflow, Shopify all generate sitemaps automatically
- Custom sites. Generated by build scripts or libraries
Manual sitemap creation is rare; automation is standard.
What to include
- All canonical, indexable URLs of the site
- Pages that should appear in search results
What to exclude:
- Pages with noindex meta tags
- Duplicate or near-duplicate pages
- Admin, login, or internal pages
- URL parameter variations (filtered, sorted, paginated versions)
- Pages requiring authentication
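The include and exclude rules above can be expressed as a filter applied before the sitemap is generated. The sketch below is one possible way to do it: the page dictionary fields (url, canonical, noindex, requires_auth) and the excluded path prefixes are hypothetical and would come from the site's own CMS or build data.

```python
from urllib.parse import urlsplit

# Hypothetical internal paths; adjust to the site in question.
EXCLUDED_PREFIXES = ("/admin", "/login", "/account")


def is_sitemap_candidate(page):
    """Return True if a page record should be listed in the sitemap.

    `page` is a dict with hypothetical keys: url, canonical, noindex,
    requires_auth.
    """
    parts = urlsplit(page["url"])
    if page.get("noindex") or page.get("requires_auth"):
        return False
    if parts.query:
        return False  # filtered, sorted, or paginated parameter variants
    for prefix in EXCLUDED_PREFIXES:
        # Match the prefix itself or anything below it, but not /administration.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            return False
    # Keep only pages whose canonical URL is the page itself.
    return page.get("canonical", page["url"]) == page["url"]


pages = [
    {"url": "https://example.com/"},
    {"url": "https://example.com/shop?sort=price"},
    {"url": "https://example.com/admin/users"},
    {"url": "https://example.com/draft", "noindex": True},
    {"url": "https://example.com/old", "canonical": "https://example.com/new"},
    {"url": "https://example.com/administration-guide"},
]
included = [page["url"] for page in pages if is_sitemap_candidate(page)]
print(included)
```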
Sitemap and SEO
Sitemaps assist indexing but do not guarantee ranking improvements. They are a discovery mechanism, not a ranking factor. The pages must still earn rankings through content quality, site authority, and other signals.
For new sites, well-structured sitemaps can speed up initial indexing. For established sites, the impact is usually smaller.
Common misconceptions
- “Submitting a sitemap guarantees indexing.” Search engines may still choose not to index pages they consider low-quality or duplicate.
- “Higher priority values improve rankings.” The <priority> element is largely ignored by modern crawlers.
- “Every page needs to be in the sitemap to rank.” Pages can rank without being in the sitemap if they are discovered through links.
- “Sitemaps replace internal linking.” They complement linking rather than replace it; well-linked pages are crawled more frequently and treated as more important.