Robots.txt Generator

Build a valid robots.txt file by adding allow/disallow rules per user-agent. Control which pages and bots can crawl your website.

Frequently Asked Questions

The robots.txt file lives at the root of your domain (e.g., https://example.com/robots.txt) and tells crawlers which URLs they are allowed to fetch. It follows the Robots Exclusion Protocol, standardized as RFC 9309. Note that it is advisory, not a security mechanism: malicious bots can and do ignore it.

A Disallow rule alone does not keep a page out of Google. It prevents Googlebot from crawling the page, but if the page has external links pointing to it, Google may still show it in search results with a "No information is available for this page" snippet. To remove a page from Google's index, use a noindex meta tag instead, and leave the page crawlable so that Googlebot can read the tag.

User-agent: * applies rules to all crawlers. You can target specific crawlers by name, for example User-agent: Googlebot for Google's main crawler or User-agent: Bingbot for Microsoft Bing. A crawler that finds a group naming it follows that group and ignores the wildcard group, so any wildcard rules you still want applied to that bot must be repeated in its own group.
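The precedence of a named group over the wildcard group can be checked with Python's standard urllib.robotparser module. This is a minimal sketch: the rules, the example.com URLs, and the ExampleBot name are hypothetical, and Python's parser is only an approximation of how any particular search engine evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: a wildcard group plus a dedicated Googlebot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot matches its own group, so the wildcard group is not applied to it.
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/report")
googlebot_drafts = parser.can_fetch("Googlebot", "https://example.com/drafts/post")

# A crawler with no dedicated group falls back to the wildcard group.
other_private = parser.can_fetch("ExampleBot", "https://example.com/private/report")

print(googlebot_private, googlebot_drafts, other_private)  # True False False
```

In this sketch Googlebot may fetch /private/ because its own group never mentions that path, which is why shared rules need to be repeated in each named group.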

Adding your sitemap to robots.txt is recommended. The Sitemap: directive lets crawlers discover your XML sitemap even if you have not submitted it through Google Search Console. You can include multiple Sitemap: lines for multiple sitemaps.

robots.txt controls whether a crawler can access and fetch a page at all, so it operates at the crawl level. The meta robots tag (<meta name="robots">) controls whether a page can be indexed or its links followed, so it operates at the indexing level. A crawler blocked by robots.txt never fetches the page and therefore never sees its meta robots tag, which means a noindex on a page blocked by robots.txt has no effect.
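The interaction between the two levels can be sketched as a simple check: a noindex tag can only take effect if the crawler is allowed to fetch the page that carries it. The helper name can_see_noindex, the rules, and the /old-page/ URL below are hypothetical and serve only as an illustration.

```python
from urllib.robotparser import RobotFileParser


def can_see_noindex(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the crawler may fetch the URL and so could read a noindex on it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical setup: /old-page/ carries <meta name="robots" content="noindex">.
BLOCKED = "User-agent: *\nDisallow: /old-page/\n"
CRAWLABLE = "User-agent: *\nDisallow: /admin/\n"

url = "https://example.com/old-page/"
print(can_see_noindex(BLOCKED, "Googlebot", url))    # False: the noindex is never read
print(can_see_noindex(CRAWLABLE, "Googlebot", url))  # True: the noindex can take effect
```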

All major legitimate crawlers respect robots.txt, including Googlebot, Bingbot, DuckDuckBot, Applebot, and reputable SEO tool crawlers like Ahrefs and Semrush. However, malicious bots, scrapers, and spam bots frequently ignore robots.txt entirely, so it should never be treated as a security measure for sensitive content.

Disallow in robots.txt prevents a bot from crawling the page — but if the URL has external links pointing to it, Google may still show it in search results with limited information. noindex (in a meta robots tag or X-Robots-Tag header) prevents the page from being added to Google's index entirely. To reliably remove a page from search results, use noindex on a page that is still crawlable.

Google does not support the Crawl-delay directive in robots.txt. The crawl rate setting that Google Search Console used to offer was retired in early 2024; to slow Googlebot down, Google's documentation now suggests temporarily returning 500, 503, or 429 responses when your server is overloaded. Bing and some other crawlers do honour Crawl-delay, so it is still worth including for them even though Google ignores it.
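For crawlers that do honour Crawl-delay, the value a given bot would read can be inspected with urllib.robotparser. The sketch below assumes a hypothetical file that requests a 10-second delay from Bingbot only; ExampleBot is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: a 10-second delay requested from Bingbot only.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10
Disallow: /tmp/

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

bing_delay = parser.crawl_delay("Bingbot")      # 10
other_delay = parser.crawl_delay("ExampleBot")  # None: the wildcard group sets no delay
print(bing_delay, other_delay)
```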

Add a Sitemap: directive followed by the full absolute URL of your sitemap file, for example: Sitemap: https://example.com/sitemap.xml. This directive applies to all bots regardless of which User-agent block it appears in, and you can list multiple sitemaps. A common convention is to place the Sitemap directive at the very end of the robots.txt file for readability.
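As a rough illustration, the sitemap URLs declared in a file can be read back with urllib.robotparser (its site_maps() method needs Python 3.8 or newer). The file contents and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with two sitemaps listed after the rule groups.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```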

In Google Search Console, navigate to Settings → robots.txt to open the robots.txt report. At the time of writing, it lists the robots.txt files Google found for your property, when each was last fetched, and any errors or warnings raised while parsing them. The older standalone robots.txt Tester, which checked individual URLs against different user agents (Googlebot, Googlebot-Image, etc.), has been retired; to check whether a specific URL is blocked, use the URL Inspection tool instead.

About This Robots.txt Generator

This free robots.txt generator builds a valid robots.txt file from a visual form. Add rules for specific user agents, allow and disallow paths, set a crawl delay, and specify your sitemap URL — then copy or download the generated file.

A correctly formatted robots.txt file tells search engine crawlers which parts of your site to crawl and which to skip. Syntax errors in robots.txt can inadvertently block your entire site from being crawled.

When to use this tool

  • Blocking admin, login, and private pages from being crawled
  • Allowing Googlebot while blocking specific other bots
  • Adding a sitemap directive to help crawlers discover content
  • Verifying syntax before uploading the file to your site root
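One sanity check worth running before upload is whether the file accidentally blocks the site root for the crawlers you care about. The sketch below uses Python's urllib.robotparser; the helper name blocked_agents and both sample files are hypothetical, and it checks rule outcomes rather than validating every aspect of the syntax.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt: str, agents: list[str]) -> list[str]:
    """Return the agents that this file would block from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, "/")]


# Hypothetical files: the first blocks everything, the second only /admin/.
RISKY = "User-agent: *\nDisallow: /\n"
SAFE = "User-agent: *\nDisallow: /admin/\n"

print(blocked_agents(RISKY, ["Googlebot", "Bingbot"]))  # ['Googlebot', 'Bingbot']
print(blocked_agents(SAFE, ["Googlebot", "Bingbot"]))   # []
```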

How It Works

Add Rules

Select a user-agent and choose Allow or Disallow for a specific path. Add as many rules as needed, or use a quick preset to start.

Preview Instantly

The robots.txt output updates in real time as you add or modify rules. Rules are grouped by user-agent in the correct format.
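Grouping rules by user-agent can be sketched in a few lines of Python. This is not the generator's actual implementation, only an illustration under assumed inputs: the function name build_robots_txt and the (user_agent, directive, path) tuple format are hypothetical.

```python
def build_robots_txt(rules, sitemaps=()):
    """Group (user_agent, directive, path) rules by user-agent and render them."""
    groups = {}  # dicts keep insertion order, so groups appear in first-seen order
    for user_agent, directive, path in rules:
        if directive not in ("Allow", "Disallow"):
            raise ValueError(f"unsupported directive: {directive!r}")
        groups.setdefault(user_agent, []).append(f"{directive}: {path}")

    # One block per user-agent, with a blank line between blocks.
    blocks = [
        "\n".join([f"User-agent: {agent}", *lines])
        for agent, lines in groups.items()
    ]
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    return "\n\n".join(blocks) + "\n"


output = build_robots_txt(
    [
        ("*", "Disallow", "/admin/"),
        ("Googlebot", "Allow", "/"),
        ("*", "Disallow", "/login/"),
    ],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(output)
```

Both wildcard rules end up in a single User-agent: * block even though they were added at different times, which is the grouping behaviour described above.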

Copy or Download

Copy the generated content or download it as robots.txt and upload it to the root directory of your web server.

Common Use Cases

Block Admin Areas

Disallow /admin/, /wp-admin/, and /login/ paths to prevent crawlers from wasting crawl budget on private areas.

Block Staging Sites

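Use Disallow: / for all bots on staging environments to keep crawlers away if the staging URL is ever discovered. Because a Disallow rule does not stop a linked URL from being indexed, also password-protect any staging site that must stay out of search results.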

Crawl Budget Management

Block paginated pages, filter/sort URLs, and internal search result pages to focus Google's crawl budget on your most important content.
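Filter and sort URLs are often blocked with wildcard patterns, where * matches any run of characters and a trailing $ anchors the end of the URL (both are described in RFC 9309 and supported by Google). Python's urllib.robotparser does not interpret these wildcards, so the sketch below uses a small hypothetical helper, pattern_matches, to show how such patterns behave. The patterns and paths are made up for illustration.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each * into "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without $, a pattern only needs to match a prefix of the path.
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None


# Hypothetical rules: Disallow: /*?sort=  and  Disallow: /search
print(pattern_matches("/*?sort=", "/shoes?sort=price"))   # True: sort URL is blocked
print(pattern_matches("/*?sort=", "/shoes"))              # False: plain listing stays crawlable
print(pattern_matches("/search", "/search/results?q=x"))  # True: prefix match
print(pattern_matches("/*.pdf$", "/docs/guide.pdf?v=2"))  # False: $ requires the URL to end in .pdf
```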

Control Media Indexing

Use Googlebot-Image disallow rules to prevent certain image directories from appearing in Google Image search results.

Block Tag & Category Pages

On blog and CMS sites, disallow tag, category, and archive pages whose thin content dilutes the authority of core pages.

Sitemap Discovery

Include the Sitemap: directive to help all search engines discover your XML sitemap automatically without manual submission in each webmaster tool.

Related Articles

Robots.txt, Meta Robots, X-Robots-Tag: Which One Do You Actually Need? A Goal-First Framework

Robots.txt, meta robots, and X-Robots-Tag aren't competing options — each addresses a different goal (crawl budget, index exclusion for HTML, index exclusion for PDFs/files), and "belt and suspenders" combining robots.txt blocking with noindex doesn't add safety, it disables the noindex entirely. Here's a goal-first decision framework for which mechanism to reach for, and why genuinely sensitive content needs authentication, not extra robots directives.

Jun 14, 2026
AI Crawlers and the New robots.txt Reality: GPTBot, Google-Extended, and ClaudeBot

GPTBot, ClaudeBot, Google-Extended, and a growing list of AI training crawlers now require active robots.txt management. Here's every major AI crawler and its user agent, how to block them selectively, the distinction between blocking Google-Extended vs Googlebot, and what "respect robots.txt" actually means in practice.

Jun 10, 2026
Robots.txt Mistakes That Silently Kill SEO — and the Correct Configurations

A wrong robots.txt can deindex your entire site and nobody warns you until rankings collapse. Here are the most dangerous mistakes (Disallow: /, blocking CSS/JS), why robots.txt can't prevent indexation alone, and correct configurations for common scenarios.

Jun 8, 2026
Robots.txt Generator — Create Crawler Access Rules for Any Website

Learn how robots.txt works, the difference between disallowing crawling vs. preventing indexing, common mistakes to avoid, and how to generate a correct robots.txt file for any site with a free tool.

Jun 6, 2026