Sitemap

A sitemap is a structured file that gives search engines a list of URLs for the pages, images, videos, and other content on a website. It serves as a "map" that helps search engine crawlers explore and index a site more efficiently.

Why It Matters

Search engines discover web pages by following links. However, for newly created pages, deep pages with insufficient internal links, or large-scale sites with hundreds of thousands of pages, crawlers may struggle to discover every page naturally. A sitemap directly informs search engines about these pages, improving crawling efficiency and preventing indexing omissions.

Sitemaps are particularly essential in the following scenarios:

  • Large-scale sites with 500 or more pages
  • New sites with very few external backlinks
  • Sites with abundant rich media content such as images and videos
  • News sites where content is frequently updated

Types

Sitemaps come in several types depending on their purpose:

XML Sitemap: The most basic and widely used format. It structures each page's URL and metadata using tags such as <url>, <loc>, and <lastmod>.
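As a rough illustration of that structure, the sketch below builds a two-page XML sitemap with Python's standard library. The page URLs and dates are made-up examples, and `build_sitemap` is a hypothetical helper name, not part of any particular tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example pages (assumed values for illustration only).
sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-03-17"),
    ("https://example.com/about", None),
])
print(sitemap_xml)
```

Building the tree with an XML library rather than string concatenation means characters such as & in URLs are escaped automatically.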

Image Sitemap: A format that specifically informs search engines about image content. Useful when you want to maximize image search visibility.
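One way an image entry might be produced is sketched below, using Google's image sitemap extension namespace. The page and image URLs are made-up examples and `build_image_sitemap` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the extension namespace with the conventional "image" prefix.
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps each page URL to a list of image URLs shown on that page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Assumed example data.
image_sitemap_xml = build_image_sitemap({
    "https://example.com/gallery": [
        "https://example.com/img/photo-1.jpg",
        "https://example.com/img/photo-2.jpg",
    ],
})
print(image_sitemap_xml)
```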

Video Sitemap: Includes metadata such as title, description, and duration of video content to help Google better understand your videos.

News Sitemap: A specialized format for news publishers that should only include articles published within the last 2 days.

Sitemap Index: When a single sitemap file would exceed 50,000 URLs or 50MB (uncompressed), the URLs are split across multiple sitemap files, which are then grouped and managed through a single index file.
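A minimal sketch of that split, assuming a site with 120,000 URLs and child files named sitemap-1.xml, sitemap-2.xml, and so on (the naming scheme and helper names are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a URL list into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build the index file that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Assumed example: 120,000 post URLs need three child sitemaps.
urls = [f"https://example.com/post/{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
child_files = [
    f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))
]
index_xml = build_sitemap_index(child_files)
print(len(chunks), "child sitemaps")
```

Only the index URL then needs to be submitted; each child file is written separately from its own chunk.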

HTML Sitemap: A sitemap designed for users rather than search engines. It is a page that aggregates links to key pages on the site, improving navigation convenience.

Setup Guide

Step 1 — Generate the Sitemap

There are three methods for generating a sitemap:

  • Use built-in CMS or framework features or plugins (e.g., Yoast SEO for WordPress)
  • Auto-generate with a crawling tool such as Screaming Frog
  • Write the XML file by hand, which is suitable for small-scale sites

Step 2 — Follow Required Rules

  • Keep each file to 50,000 URLs or fewer and 50MB or less (uncompressed)
  • Use UTF-8 encoding
  • Write every URL as an absolute URL, including the protocol (e.g., https://example.com/page)
  • Include only canonical URLs. Exclude URLs that redirect and URLs that duplicate other pages
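The size, count, and URL-form rules above can be checked mechanically before deployment. The following is a minimal sketch; `check_sitemap_rules` is a hypothetical helper, and it does not attempt to detect redirects or non-canonical URLs, which require knowledge of the live site.

```python
from urllib.parse import urlsplit

MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, measured on the uncompressed file


def check_sitemap_rules(urls, xml_text):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(urls)}")
    # Size is measured in bytes after UTF-8 encoding, not in characters.
    if len(xml_text.encode("utf-8")) > MAX_BYTES:
        problems.append("file larger than 50MB")
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {url}")
    return problems


# Assumed example input: the second entry is a relative path.
problems = check_sitemap_rules(
    ["https://example.com/page", "/about"],
    "<urlset/>",
)
print(problems)
```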

Step 3 — Deploy and Submit

Place the sitemap file in the site's root directory (e.g., https://example.com/sitemap.xml). Add Sitemap: https://example.com/sitemap.xml to your robots.txt file, and submit the URL through Google Search Console's "Sitemaps" menu.
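To confirm that the robots.txt declaration is readable, one option is Python's built-in robots.txt parser (`site_maps()` requires Python 3.8 or later). The robots.txt content below is an assumed example, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Assumed example robots.txt content.
robots_txt = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap line is present.
declared = parser.site_maps() or []
print(declared)
```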

Step 4 — Set Up Automatic Updates

Configure the sitemap to update automatically whenever content is added, modified, or deleted. Use accurate modification timestamps in the <lastmod> tag to prompt search engines to prioritize re-crawling of changed pages.
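A small sketch of producing a <lastmod> value from a stored modification time is shown below. `to_lastmod` is a hypothetical helper; it assumes the CMS stores timezone-aware timestamps and rejects naive ones rather than guessing the offset.

```python
from datetime import datetime, timedelta, timezone


def to_lastmod(modified_at):
    """Format a timezone-aware datetime for use in a <lastmod> tag."""
    if modified_at.tzinfo is None:
        raise ValueError("modified_at must be timezone-aware")
    # Drop microseconds so the value stays in the hh:mm:ss form.
    return modified_at.replace(microsecond=0).isoformat()


kst = timezone(timedelta(hours=9))
lastmod = to_lastmod(datetime(2026, 3, 17, 9, 0, 0, tzinfo=kst))
print(lastmod)  # 2026-03-17T09:00:00+09:00
```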

Common Mistakes

Including noindex pages in the sitemap: Adding pages with a noindex tag or pages blocked by robots.txt to the sitemap sends conflicting signals to search engines. Only include pages you want indexed in your sitemap.
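In a generator, this usually comes down to a filter applied before the sitemap is written. The sketch below assumes a hypothetical `Page` record with `noindex` and `blocked_by_robots` flags; a real CMS would expose this information in its own way.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical page record used only for this illustration.
    url: str
    noindex: bool = False
    blocked_by_robots: bool = False


def indexable_urls(pages):
    """Keep only URLs that are neither noindex nor blocked by robots.txt."""
    return [
        page.url
        for page in pages
        if not (page.noindex or page.blocked_by_robots)
    ]


pages = [
    Page("https://example.com/"),
    Page("https://example.com/thank-you", noindex=True),
    Page("https://example.com/admin/login", blocked_by_robots=True),
]
sitemap_urls = indexable_urls(pages)
print(sitemap_urls)
```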

Including broken links (404s): If URLs for deleted pages remain in the sitemap, Google Search Console will report "Submitted URL not found (404)" errors. Regularly audit your sitemap and remove invalid URLs.
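A periodic audit can be sketched as follows. The status lookup is passed in as a function so the example runs without network access; in practice `get_status` would issue an HTTP request. Treating 410 Gone the same as 404 is an assumption of this sketch, not something the Search Console report above mentions.

```python
# 404 comes from the text above; 410 is included here as an assumption.
DEAD_STATUSES = (404, 410)


def find_dead_urls(urls, get_status):
    """Return the URLs whose HTTP status marks them as removed."""
    return [url for url in urls if get_status(url) in DEAD_STATUSES]


# Stand-in for real HTTP checks (made-up data for illustration).
fake_statuses = {
    "https://example.com/": 200,
    "https://example.com/old-post": 404,
    "https://example.com/retired": 410,
}
dead = find_dead_urls(list(fake_statuses), fake_statuses.get)
print(dead)
```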

Date format errors: According to SEMrush research, approximately 62% of XML sitemap errors stem from date format issues. <lastmod> must follow the W3C Datetime format (e.g., 2026-03-17 or 2026-03-17T09:00:00+09:00).
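A simple pre-publish check for <lastmod> values might look like the sketch below. It covers only the full-date and date-plus-time forms of W3C Datetime (not the year-only or year-month forms), and `is_valid_lastmod` is a hypothetical helper.

```python
import re
from datetime import date

# YYYY-MM-DD, optionally followed by Thh:mm[:ss[.s]] and a required
# timezone designator (Z or +hh:mm / -hh:mm).
W3C_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
)


def is_valid_lastmod(value):
    """Check the shape of the value and that the date part is a real date."""
    if not W3C_DATETIME.fullmatch(value):
        return False
    try:
        date.fromisoformat(value[:10])
    except ValueError:
        return False
    return True


for candidate in ["2026-03-17", "2026-03-17T09:00:00+09:00",
                  "03/17/2026", "2026-03-17 09:00:00"]:
    print(candidate, is_valid_lastmod(candidate))
```

The time-of-day and offset ranges are not validated here; only the overall shape and the calendar date are.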

URL format inconsistency: Mixing https and http, or www and non-www, can cause search engines to treat the same page as separate entities. All URLs within the sitemap should use one consistent format.
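One way to catch mixed formats is to compare each URL's scheme and host against the most common combination in the file. This is a sketch under that assumption; `find_inconsistent_urls` is a hypothetical helper, and the majority format is only a heuristic for the intended one.

```python
from collections import Counter
from urllib.parse import urlsplit


def origin(url):
    """Return the (scheme, host) pair of a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc)


def find_inconsistent_urls(urls):
    """Return URLs whose scheme or host differs from the majority."""
    if not urls:
        return []
    counts = Counter(origin(url) for url in urls)
    majority, _ = counts.most_common(1)[0]
    return [url for url in urls if origin(url) != majority]


# Assumed example: two entries deviate from https://example.com.
urls = [
    "https://example.com/a",
    "https://example.com/b",
    "http://example.com/c",
    "https://www.example.com/d",
]
outliers = find_inconsistent_urls(urls)
print(outliers)
```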

Generating the sitemap but not submitting it: If you create a sitemap file but never submit it to Google Search Console or Bing Webmaster Tools, search engines may take a long time to discover it.

How inblog Helps

inblog dynamically generates XML sitemaps that automatically reflect post publishing and deletion.