AEO Canon · the reference for answer-engine optimization

robots.txt Generator for AI Crawlers

Toggle each AI crawler — GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, Google-Extended, and more — to allow or block, add your sitemap, and copy a ready-to-use robots.txt. Each crawler is labelled by what it actually does, so you allow the bots that get you cited.

Burke Atkerson · 1 min read

Toggle each AI crawler to allow or block, add your sitemap, and copy a ready-to-use robots.txt — generated in your browser. AI engines can only cite pages their crawlers are allowed to fetch, so the goal for visibility is simple: allow the bots that get you cited, and decide about the rest on purpose.

Quick answer

Allow the citation crawlers (OAI-SearchBot, PerplexityBot, Bingbot) so engines can quote you. Blocking training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) is a rights choice that won't change your citations. Each bot below is labelled by what it actually does.

robots.txt generator for AI crawlers


Toggle each crawler to allow or block it. To be cited, you want the Citations bots allowed. Blocking Training bots is a rights choice: it won't improve or remove your citations.

# robots.txt — generated by The AEO Canon
# https://www.aeocanon.com/tools/robots-txt-generator-ai-crawlers

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Allow: /

Place the file at the root of your domain (example.com/robots.txt). robots.txt is a public, voluntary standard — reputable crawlers honor it, but it isn't access control. Note: Google AI Overviews use Googlebot, which has no separate AI opt-out.
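
The file above follows a simple pattern: one group per crawler, each with a single Allow or Disallow rule, plus an optional Sitemap line. A minimal Python sketch of that pattern is shown here. The function name `build_robots_txt`, the `choices` mapping, and the sitemap URL are illustrative assumptions, not the generator's actual code.

```python
from typing import Dict, Optional


def build_robots_txt(choices: Dict[str, bool], sitemap_url: Optional[str] = None) -> str:
    """Render one robots.txt group per crawler (hypothetical helper).

    choices maps a user-agent token to True (allow) or False (block).
    """
    groups = []
    for user_agent, allow in choices.items():
        rule = "Allow: /" if allow else "Disallow: /"
        groups.append(f"User-agent: {user_agent}\n{rule}")
    text = "\n\n".join(groups)
    if sitemap_url:
        # Sitemap lines are independent of any user-agent group.
        text += f"\n\nSitemap: {sitemap_url}"
    return text + "\n"


if __name__ == "__main__":
    # Example choices only: allow three citation crawlers, block one training crawler.
    example = {"OAI-SearchBot": True, "PerplexityBot": True, "Bingbot": True, "GPTBot": False}
    print(build_robots_txt(example, "https://example.com/sitemap.xml"))
```

Keeping every crawler in its own group mirrors the generated file and makes each allow-or-block decision visible on its own line.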

Allow the right bots — and know what each one does

The single most common mistake is silently blocking the crawlers you want, usually with a blanket Disallow: / or a restrictive default that catches AI user-agents. The fix is to be explicit. Allow the search and live-fetch bots that drive citations; only block training bots if you have a deliberate reason to. And remember the Google nuance: blocking Google-Extended does not remove you from AI Overviews, because those are served by Googlebot, which has no separate AI opt-out. This is the Access pillar — the price of admission, with no partial credit. For the syntax rules themselves, see Google's robots.txt documentation.
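
To see how a blanket rule catches AI user-agents, the sketch below checks two small files with Python's standard-library `urllib.robotparser`. The helper name `is_allowed`, the two sample files, and the test URL are assumptions made for illustration. Python's parser is only a stand-in here, and a real crawler's matching may differ in its details.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str = "https://example.com/guide") -> bool:
    """Return True if this robots.txt text lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A restrictive default with no explicit groups: every crawler falls under "*".
BLANKET = """\
User-agent: *
Disallow: /
"""

# The same default, but with explicit groups for two citation crawlers.
EXPLICIT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    for name, text in (("blanket", BLANKET), ("explicit", EXPLICIT)):
        for bot in ("OAI-SearchBot", "PerplexityBot", "GPTBot"):
            print(name, bot, is_allowed(text, bot))
```

Under this parser, the blanket file blocks all three bots, while the explicit file allows OAI-SearchBot and PerplexityBot and still blocks GPTBot through the default group.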

Where to go next

For the full, copy-paste reference block and the one mistake that silently blocks every AI crawler, read how to allow AI crawlers in robots.txt. Then confirm it actually worked with how to check AI crawlers can read your site, and make sure your content isn't hidden behind JavaScript in why JavaScript breaks AI citation. Access is pillar one of The AEO Canon.

How we review

This guide is compiled from each vendor’s own documentation and current independent testing, and was last verified in 2026; we re-check quarterly. Pricing and features in this space change fast, so confirm current details on the vendor’s site before buying. We don’t earn affiliate commissions on the tools we cover, and we don’t accept payment for placement.

Frequently asked questions

Should I allow AI crawlers in robots.txt?
To be cited in AI answers, yes — allow the crawlers that power citations, such as OAI-SearchBot (ChatGPT), PerplexityBot, and Bingbot (Copilot). If a crawler can't fetch your page, that engine can't quote it. Blocking training crawlers like GPTBot or ClaudeBot is a separate rights decision and won't improve or remove your citations; this generator labels each bot so you can choose deliberately.
What's the difference between AI search crawlers and training crawlers?
Search crawlers (OAI-SearchBot, PerplexityBot, Bingbot) fetch pages so an engine can cite them in live answers — these are the ones that drive your AI visibility. Training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) collect content to train models. Blocking training crawlers restricts how your content trains AI but does not remove you from that engine's search citations.
Does blocking Google-Extended remove me from AI Overviews?
No. Google-Extended only controls whether your content is used to train Gemini and Vertex AI. It has no effect on Google Search or AI Overviews, which are served by Googlebot — and Google offers no separate opt-out from AI Overviews short of blocking Googlebot, which would remove you from Search entirely. Blocking Google-Extended is purely a training choice.
Where do I put the robots.txt file?
At the root of your domain, reachable at example.com/robots.txt. It must be a plain text file served from the top level — a robots.txt in a subfolder is ignored. Remember robots.txt is a public, voluntary standard that reputable crawlers honor; it signals intent but is not access control or security.
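
As a small illustration of the placement rule in the last answer, the sketch below derives the robots.txt location for any page URL by keeping the scheme and host and replacing the path. The function name `robots_txt_url` and the example URLs are assumptions. Because the location depends on the host, a subdomain resolves to its own file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt URL for the host serving page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("https://example.com/blog/post?id=1"))
    print(robots_txt_url("https://blog.example.com/a/b"))
```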
