
robots.txt

A plain-text file at the site root telling crawlers which paths they may or may not request, following the Robots Exclusion Protocol.

robots.txt is the file served at /robots.txt that controls which URLs search-engine crawlers (and other bots) are permitted to fetch. It uses a simple line-based format: a User-agent: line selects which bots a group of rules applies to, followed by Disallow: and Allow: rules for that group. Under RFC 9309, when multiple rules match a URL, the most specific (longest) rule wins, with Allow taking precedence on ties.

A typical robots.txt:

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /search?

Sitemap: https://example.com/sitemap.xml
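Rules like these can be exercised locally with Python's standard-library urllib.robotparser; a minimal sketch (the crawler name MyBot and the URLs are illustrative). One caveat: Python's parser returns the first rule that matches, rather than the longest match described above, so the catch-all Allow is listed after the specific Disallow lines here.

```python
from urllib import robotparser

# Rules mirroring the example file above. Python's parser applies the
# first matching rule, so the specific Disallow lines come before the
# catch-all Allow.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /search?
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# "MyBot" is an illustrative crawler name; the * group matches any agent.
print(rp.can_fetch("MyBot", "https://example.com/"))             # True
print(rp.can_fetch("MyBot", "https://example.com/admin/users"))  # False
print(rp.can_fetch("MyBot", "https://example.com/search?q=x"))   # False
```

The same parser can also load a live file with set_url() and read(), which is how a well-behaved crawler would check permissions before fetching.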

robots.txt blocks crawling, not indexing: if a blocked URL is linked from elsewhere, it can still appear in search results without a snippet. To prevent indexing, use <meta name="robots" content="noindex"> on the page itself, and keep the page crawlable so the crawler can actually read the meta tag.

Test changes via Google Search Console's URL Inspection tool, which reports how the live robots.txt evaluates against any URL.
