Noindex
Noindex is a robots meta directive that instructs search engines not to include a specific page in search results. It can be set via an HTML `<meta>` tag or an HTTP response header (`X-Robots-Tag`), preventing the page from appearing on search engine results pages (SERPs) such as Google and Bing.
Why It Matters
Search engines attempt to crawl and index every page they can discover on a website. However, not every page belongs in search results. If pages such as login pages, internal search result pages, thank-you pages, or staging environments get indexed, crawl budget is wasted, duplicate-content issues arise, and search engines' overall quality assessment of the site can suffer. Used properly, noindex directs search engines to focus their crawling resources on the pages that genuinely provide value.
How to Set It Up
- HTML Meta Tag Method
Add the following tag to the `<head>` section of the page:

```html
<meta name="robots" content="noindex">
```
You can also target specific search engines. For example, to apply noindex only to Google, change the `name` attribute to `googlebot`:

```html
<meta name="googlebot" content="noindex">
```
To also block link crawling, use nofollow together:

```html
<meta name="robots" content="noindex, nofollow">
```
- HTTP Header Method (X-Robots-Tag)
For non-HTML resources (PDFs, images, etc.) where meta tags cannot be inserted, set the directive in the server response header:

```http
X-Robots-Tag: noindex
```
In frameworks like Next.js, you can set response headers directly within API routes or `getServerSideProps`.
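As a minimal sketch of the header method, the handler below attaches `X-Robots-Tag` before sending a non-HTML resource. The `MinimalResponse` interface, `buildRobotsHeader`, and `sendPdf` are illustrative names standing in for a framework's response object and handler, not part of any specific API:

```typescript
// Illustrative sketch: building and attaching an X-Robots-Tag header.
// MinimalResponse stands in for a framework response object
// (e.g. Node's http.ServerResponse); all names here are assumptions.
interface MinimalResponse {
  setHeader(name: string, value: string): void;
}

// Join robots directives into a single header value, e.g. "noindex, nofollow".
function buildRobotsHeader(directives: string[]): string {
  return directives.join(", ");
}

// Apply noindex to a non-HTML resource such as a generated PDF.
function sendPdf(res: MinimalResponse): void {
  res.setHeader("X-Robots-Tag", buildRobotsHeader(["noindex", "nofollow"]));
  res.setHeader("Content-Type", "application/pdf");
  // ...stream the PDF body here...
}
```

Because the directive travels in the response header, it works for any content type the server returns, not just HTML.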
When to Use It
Applying noindex is recommended for the following types of pages:
- Internal search result pages: Dynamic pages generated by on-site search functionality can be perceived as duplicate content by search engines.
- Login, sign-up, and profile pages: Pages containing personal information that do not need to be exposed in search results.
- Thank-you pages: Confirmation pages displayed after form submission have no search traffic value.
- Staging and test environments: Prevents development sites from being accidentally indexed. However, ensure that noindex is removed when deploying to production.
- Pages with duplicate content: Applying noindex keeps duplicates out of search results. If the relationship between a canonical page and a duplicate is clear, however, a `canonical` tag may be the more appropriate solution.
- Admin-only pages: Dashboards, admin panels, and similar pages do not need search exposure.
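The staging caveat above is easy to enforce in code rather than by hand: derive the robots meta tag from the deployment environment, so noindex disappears automatically in production. This is a hedged sketch; the `robotsMetaContent` helper and the environment names are assumptions for illustration:

```typescript
// Return the robots meta content for a given deployment environment,
// or null when no robots meta tag should be emitted at all.
// Environment names ("production", "staging", ...) are illustrative.
function robotsMetaContent(env: string): string | null {
  // Only the production site should be indexable.
  return env === "production" ? null : "noindex, nofollow";
}

// Render the <meta> tag (or an empty string) for a page's <head>.
function robotsMetaTag(env: string): string {
  const content = robotsMetaContent(env);
  return content === null ? "" : `<meta name="robots" content="${content}">`;
}
```

Tying the tag to an environment variable means a staging deploy can never accidentally ship an indexable site, and a production deploy can never accidentally ship noindex.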
Noindex vs Disallow Differences
Noindex and the Disallow directive in robots.txt are frequently confused, but their behavior is fundamentally different.
| Attribute | noindex (meta tag) | Disallow (robots.txt) |
|---|---|---|
| Function | Excludes the page from search results | Blocks crawler access to the page entirely |
| Indexing | Allows crawling but blocks indexing | Blocks crawling, but the page can still be indexed via external links |
| Link equity | Link value (link equity) from the page can still be passed | Crawlers cannot read the page, so link value cannot be transferred |
| Scope | Precise control at the individual page level | Batch control at the directory or URL pattern level |
The most critical caveat is that you must not use both simultaneously. If crawling is blocked via robots.txt, the search engine cannot read the noindex tag on the page, causing the noindex directive to be ignored — and the page may remain in search results. To reliably exclude a page from search results, allow crawling while using the noindex meta tag.
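The interaction above can be made concrete with a small checker: given a page's HTML, its response headers, and whether robots.txt blocks the URL, decide whether a noindex directive can actually take effect. The function names and the deliberately simplified parsing are assumptions for illustration, not how any real crawler is implemented:

```typescript
// Illustrative checker: does a page signal noindex at all?
// Looks at the X-Robots-Tag header and, with a simplified scan,
// at the robots meta tag (a real crawler parses the full DOM).
function pageSignalsNoindex(html: string, headers: Record<string, string>): boolean {
  const headerValue = headers["x-robots-tag"] ?? "";
  if (headerValue.toLowerCase().includes("noindex")) return true;
  const metaPattern =
    /<meta[^>]+name=["']robots["'][^>]+content=["'][^"']*noindex[^"']*["']/i;
  return metaPattern.test(html);
}

// A crawler only reads the page (and thus the noindex signal) if
// robots.txt allows it, which is why Disallow + noindex fails.
function noindexEffective(
  disallowedByRobotsTxt: boolean,
  html: string,
  headers: Record<string, string>
): boolean {
  if (disallowedByRobotsTxt) return false; // directive is never read
  return pageSignalsNoindex(html, headers);
}
```

The early return on `disallowedByRobotsTxt` is the whole caveat in one line: whatever the page says, a blocked crawler never sees it.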
How inblog Helps
inblog lets you set noindex on individual posts or tag pages to prevent unwanted pages from being indexed.