Robots.txt for AI Crawlers: How to Allow GPTBot, ClaudeBot, and PerplexityBot
Your robots.txt file controls which AI crawlers can access your site. Here's how to configure it for maximum AI visibility.
The robots.txt file tells web crawlers what they can and can't access on your website. Traditionally, this meant Googlebot and Bingbot. But now there's a new generation of AI crawlers — and many websites are blocking them without realizing it.
If AI crawlers can't access your site, your business won't appear in AI-generated answers from ChatGPT, Claude, Perplexity, or Google AI Overviews. This guide covers every AI crawler you need to know about and how to configure robots.txt correctly.
Check your robots.txt instantly
Foglift scans your robots.txt and checks every AI crawler listed below. Free, no signup required.
What the Data Shows About AI Crawler Blocking
Before configuring your robots.txt, it helps to understand the landscape. AI agent traffic grew over 6,900% year-over-year in 2025, making this decision increasingly consequential.
A 2025 BuzzStream study of top news publishers found that 79% block AI training bots via robots.txt, while 71% also block AI retrieval bots. The most-blocked crawlers: ClaudeBot (69% of sites), PerplexityBot (67%), and GPTBot (62%).
Cloudflare's Q1 2026 data reveals that 89.4% of AI crawler traffic is training or mixed-purpose, while only 8% is search-related and just 2.2% responds to actual user queries in real time. This distinction matters for your blocking strategy.
The blocking paradox
BuzzStream found that 70.6% of sites blocking ChatGPT-User still appeared in AI citations — blocking via robots.txt does not reliably prevent AI from citing your content. But publishers who blocked AI crawlers experienced a 23.1% decline in total monthly visits and a 13.9% drop in human-only browsing. The takeaway: blocking costs you traffic but doesn't prevent citation.
The Three-Tier Crawler Framework (2026)
As of 2026, major AI companies no longer use a single crawler. They've split into three tiers — and your robots.txt strategy needs to account for each:
| Purpose | OpenAI | Anthropic | Perplexity |
|---|---|---|---|
| Training | GPTBot | ClaudeBot | PerplexityBot |
| Search indexing | OAI-SearchBot | Claude-SearchBot | — |
| User browsing | ChatGPT-User | Claude-User | Perplexity-User |
Key insight: Blocking the training bot does not block the search or user-browsing bots. If you block GPTBot but allow OAI-SearchBot, ChatGPT search can still index your site — but the base model won't train on your content. Anthropic confirmed all three of its bots honor robots.txt independently (Search Engine Journal, 2026).
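This independence is easy to verify locally with Python's standard-library robots.txt parser. A minimal sketch (the sample rules are illustrative, not a recommendation):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks OpenAI's training bot but allows its search bot.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Each bot is matched against its own user-agent group, so blocking
# GPTBot has no effect on OAI-SearchBot.
print(parser.can_fetch("GPTBot", "https://example.com/post"))         # False
print(parser.can_fetch("OAI-SearchBot", "https://example.com/post"))  # True
```

The same check works for Anthropic's three bots: a `Disallow` group for ClaudeBot leaves Claude-SearchBot and Claude-User unaffected.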
Complete AI Crawler Reference Table
Here's the complete list of AI crawlers as of April 2026, updated to include Apple, Meta, DuckDuckGo, and Common Crawl agents that launched in 2024–2025. ClaudeBot has approximately doubled its crawl rate between Q3 2025 and Q1 2026, suggesting Anthropic is scaling retrieval infrastructure significantly (TechnologyChecker.io, Q1 2026). Meta-ExternalAgent launched in July 2024 for Llama training (Meta Developers, 2024); Applebot-Extended launched June 2024 as an AI-training opt-out control for Apple Intelligence (Apple Support, 2024).
| Crawler | Company | Tier | Powers | User-Agent |
|---|---|---|---|---|
| GPTBot | OpenAI | Training | Model training data | GPTBot |
| OAI-SearchBot | OpenAI | Search | ChatGPT search indexing | OAI-SearchBot |
| ChatGPT-User | OpenAI | User | Real-time web browsing | ChatGPT-User |
| ClaudeBot | Anthropic | Training | Model training data | ClaudeBot |
| Claude-SearchBot | Anthropic | Search | Claude search indexing | Claude-SearchBot |
| Claude-User | Anthropic | User | User-requested browsing | Claude-User |
| PerplexityBot | Perplexity AI | Training | Perplexity indexing | PerplexityBot |
| Perplexity-User | Perplexity AI | User | Real-time retrieval | Perplexity-User |
| Google-Extended | Google | Training | Gemini, AI Overviews | Google-Extended |
| Amazonbot | Amazon | Mixed | Alexa answers, Amazon AI | Amazonbot |
| Bytespider | ByteDance | Training | TikTok / Doubao AI | Bytespider |
| cohere-ai | Cohere | Training | Cohere AI products | cohere-ai |
| Applebot-Extended | Apple | Training (opt-out signal) | Apple Intelligence foundation models | Applebot-Extended |
| Meta-ExternalAgent | Meta | Training | Llama / Meta AI training | meta-externalagent |
| Meta-ExternalFetcher | Meta | User | Meta AI user-requested fetches | meta-externalfetcher |
| DuckAssistBot | DuckDuckGo | User | DuckAssist cited answers | DuckAssistBot |
| CCBot | Common Crawl | Training (dataset) | Open dataset used by many LLMs | CCBot |
Sources: Search Engine Journal (Dec 2025), ALM Corp (2026), Anthropic documentation, Apple Support (119829, 120320), Meta Developers (externalagent crawler docs), DuckDuckGo Help Pages (duckassistbot), Common Crawl (commoncrawl.org/ccbot). Note: anthropic-ai and Claude-Web are deprecated — use ClaudeBot, Claude-SearchBot, and Claude-User instead.
Three crawlers worth their own callout:
Applebot-Extended is an opt-out signal, not a separate crawler. It does not fetch pages of its own. Apple's regular Applebot does the crawling; blocking Applebot-Extended tells Apple not to use the content for generative-AI training, while still leaving you indexable for Siri, Spotlight, and Apple search. Roughly 6–7% of high-traffic sites block it today — mostly news publishers including The New York Times, The Financial Times, The Atlantic, Vox Media, and Condé Nast (Apple Support, 2025).

Meta-ExternalFetcher can bypass robots.txt for user-initiated URLs. Meta states that facebookexternalhit and meta-externalfetcher may ignore robots.txt when a user explicitly provides a URL as context to a Meta AI product — the same carve-out that Perplexity-User and ChatGPT-User apply. If you need a hard block on user-triggered fetches, you need firewall rules, not robots.txt alone (Meta Developers, 2026).

CCBot blocks propagate slowly. Common Crawl publishes snapshots quarterly, and older snapshots live forever in derivative training datasets like The Pile and RedPajama. Blocking CCBot today removes you from future snapshots; content you already published remains in circulation for years (Common Crawl, 2025). Treat it as a long-horizon decision, not an instant kill switch.
Recommended robots.txt Configuration
For most businesses that want maximum AI visibility, use this robots.txt:
```
# Standard search engine crawlers
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# OpenAI (all three tiers)
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Anthropic (all three tiers)
User-agent: ClaudeBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

# Perplexity (both tiers)
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# Google AI features
User-agent: Google-Extended
Allow: /

# Amazon AI
User-agent: Amazonbot
Allow: /

# Default: allow everything else, but block private/admin areas
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
Disallow: /auth/

Sitemap: https://yoursite.com/sitemap.xml
```

Note that the private-area Disallow rules sit directly inside the `User-agent: *` group; a blank line before them could start a new group and leave them unattached in strict parsers.
How to Selectively Allow or Block AI Crawlers
The three-tier framework enables a nuanced strategy: allow search and user-browsing bots (so you appear in AI answers) while blocking training bots (so your content isn't used to train models). Here's how:
```
# ALLOW search + user bots (appear in AI answers)
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Perplexity-User
Allow: /

# BLOCK training bots (prevent model training)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```
Note: Even with this selective approach, blocking training crawlers means AI models won't have up-to-date knowledge of your site. They may still cite you from cached data, but it won't be current. And as BuzzStream's data shows, blocking doesn't reliably prevent citations — 70.6% of sites blocking ChatGPT-User still appeared in AI citations.
Compliance caveat: Bytespider (ByteDance) claimed robots.txt compliance but was observed accessing disallowed paths on test sites within 30 days of applying a block. Consider firewall-level blocking for crawlers you don't trust to honor robots.txt.
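A firewall-level block enforces the rule at the web server instead of relying on the crawler's good behavior. A minimal sketch for nginx (assuming the rule is placed inside your `server` block; the user-agent pattern is an example, adjust to the crawlers you want to refuse):

```nginx
# Return 403 to Bytespider regardless of what robots.txt says.
# ~* makes the regex match case-insensitively.
if ($http_user_agent ~* "bytespider") {
    return 403;
}
```

Note that user-agent strings can be spoofed; for stronger enforcement, combine this with IP-range or reverse-DNS verification where the crawler's operator publishes one.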
How to Edit robots.txt on Popular Platforms
WordPress
Edit via Yoast SEO plugin: SEO → Tools → File editor → robots.txt. Or create/edit the file at your site root.
Squarespace
Go to Settings → SEO → scroll to "Additional Robots.txt Rules" and add your AI crawler rules there.
Wix
Go to Dashboard → Settings → SEO (Google) → SEO Tools → Robots.txt Editor.
Shopify
Shopify auto-generates robots.txt. Edit it via theme.liquid or use a Shopify robots.txt app.
Next.js / Vercel
Create a robots.ts file in your app/ directory or add a static robots.txt in public/.
Common Mistakes
- Using a wildcard Disallow that blocks AI crawlers — `User-agent: *` followed by `Disallow: /` blocks everything, including AI crawlers.
- Not checking platform defaults — Some CMS platforms add AI crawler blocks automatically. Always check after setup.
- Blocking GPTBot but expecting ChatGPT visibility — GPTBot is how OpenAI's models learn about your site. Without it (and OAI-SearchBot), ChatGPT has to rely on third-party indexes like Bing.
- Forgetting to add a sitemap reference — Always include `Sitemap: https://yoursite.com/sitemap.xml` at the end of robots.txt.
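The wildcard-Disallow and missing-sitemap mistakes can be caught mechanically. A minimal sketch in Python (the `lint_robots` helper is hypothetical, written for this article):

```python
def lint_robots(text: str) -> list[str]:
    """Flag two common robots.txt mistakes: a wildcard full block and a missing sitemap."""
    warnings: list[str] = []
    current_agents: list[str] = []
    in_rules = False
    has_sitemap = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:           # a rule line ended the previous group
                current_agents = []
                in_rules = False
            current_agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in current_agents:
                warnings.append("'User-agent: *' with 'Disallow: /' blocks everything, including AI crawlers")
        elif field == "sitemap":
            has_sitemap = True
    if not has_sitemap:
        warnings.append("no Sitemap: line found")
    return warnings
```

Running it on `"User-agent: *\nDisallow: /\n"` yields both warnings; a file that allows crawling and ends with a Sitemap line yields none.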
How to Verify Your Configuration
- Visit `yoursite.com/robots.txt` in your browser
- Check that GPTBot, ClaudeBot, and PerplexityBot are not blocked
- Run a free Website Audit — it checks all AI crawlers and shows exactly which are blocked
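The crawler check can also be scripted with Python's standard library. A minimal sketch (`AI_BOTS` and `blocked_ai_crawlers` are names made up for this example; pass in the text of your own robots.txt):

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User", "Google-Extended",
]

def blocked_ai_crawlers(robots_text: str, probe_url: str = "https://example.com/") -> list[str]:
    """Return the AI crawlers that this robots.txt blocks from fetching probe_url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, probe_url)]

# Example: a robots.txt that only blocks Anthropic's training bot.
print(blocked_ai_crawlers("User-agent: ClaudeBot\nDisallow: /\n"))  # ['ClaudeBot']
```

An empty list means every crawler in `AI_BOTS` can reach the probe URL under the rules you supplied.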
Frequently Asked Questions
What is GPTBot?
GPTBot is OpenAI's web crawler that indexes content for ChatGPT and other OpenAI products. OpenAI now operates a three-tier system: GPTBot (training), OAI-SearchBot (search indexing), and ChatGPT-User (real-time browsing). GPTBot is the most-blocked AI crawler via robots.txt — 62% of top news sites block it (BuzzStream, 2025).
Should I allow AI crawlers on my website?
For most businesses, yes. AI agent traffic grew over 6,900% year-over-year in 2025. Allowing AI crawlers means your site can appear in AI-generated answers from ChatGPT, Perplexity, and Google AI Overviews. BuzzStream found that publishers blocking AI crawlers experienced a 23.1% decline in total monthly visits — and blocking doesn't reliably prevent citations, since 70.6% of sites blocking ChatGPT-User still appeared in AI citations.
Does Squarespace block AI crawlers?
Yes, some Squarespace sites block AI crawlers by default in their robots.txt. Check your site's robots.txt to confirm, and contact Squarespace support if you need to modify it.
What AI crawlers should I allow?
The key AI crawlers to allow in 2026 are: GPTBot, OAI-SearchBot, and ChatGPT-User (OpenAI); ClaudeBot, Claude-SearchBot, and Claude-User (Anthropic); PerplexityBot and Perplexity-User (Perplexity); Google-Extended (Google AI features); and Amazonbot (Amazon/Alexa). Each AI company now uses multiple crawlers — blocking the training bot alone does not block the search or user-browsing bots.
How do I check if AI crawlers are blocked on my site?
Visit yoursite.com/robots.txt and look for Disallow rules targeting GPTBot, ClaudeBot, or PerplexityBot. Or use Foglift's free Website Audit — it automatically checks AI crawler access as part of the AI Visibility Score.
Check your AI crawler status
Instant scan. See which AI crawlers can access your site.
Generate your robots.txt
Use our free AI Robots.txt Generator to create an optimized robots.txt with the right AI crawler settings.
Sources & Further Reading
- BuzzStream, “Which News Sites Block AI Crawlers in 2025? [New Data],” 2025. 79% of top news sites block AI training bots; 70.6% of blocking sites still cited; 23.1% traffic decline for blockers.
- TechnologyChecker.io, “We Analyzed robots.txt Across Cloudflare's Network: Which AI Crawlers Get Blocked Most and Why,” Q1 2026. ClaudeBot doubled crawl rate Q3 2025 → Q1 2026.
- Search Engine Journal, “Cloudflare Report: Googlebot Tops AI Crawler Traffic,” 2025. 89.4% of AI crawler traffic is training/mixed; 2.2% real-time user queries.
- Search Engine Land, “Googlebot dominates web crawling in 2025 as AI bots surge,” 2025. GPTBot share decreased from 35.46% to 28.97% as blocking increased.
- Paul Calvano, “AI Bots and Robots.txt,” Aug 2025. Analysis of AI crawler blocking patterns across the web.
- Search Engine Journal, “Complete Crawler List For AI User-Agents,” Dec 2025. Verified user-agent list from real server logs with IP validation.
- Search Engine Journal, “Anthropic's Claude Bots Make Robots.txt Decisions More Granular,” 2026. Three-bot framework: ClaudeBot, Claude-SearchBot, Claude-User.
- ALM Corp, “ClaudeBot, Claude-User & Claude-SearchBot: Anthropic's Three-Bot Framework,” 2026. Each bot honors robots.txt independently.
- Apple Support, “About Applebot” (article 119829) and “Applebot model training and individual privacy rights” (article 120320), updated 2025. Applebot-Extended is an AI-training opt-out signal, not a distinct crawler; Applebot continues to index for Siri, Spotlight, and Apple search regardless.
- Meta for Developers, “Meta crawler documentation” (developers.facebook.com/docs/sharing/webmasters/crawler), 2024–2026. Meta-ExternalAgent launched July 2024 for training; Meta-ExternalFetcher handles user-initiated fetches and may bypass robots.txt when a user supplies an explicit URL.
- DuckDuckGo Help Pages, “Is DuckAssistBot related to DuckDuckGo?” 2025. DuckAssistBot/1.2 crawls on-demand for DuckAssist answers; robots.txt changes take effect within 72 hours; data is not used for model training.
- Common Crawl, “CCBot” (commoncrawl.org/ccbot), 2025. CCBot now runs on dedicated IP ranges with reverse DNS for verification; Common Crawl snapshots feed derivative training datasets (The Pile, RedPajama, C4), so blocks propagate slowly.
Related Reading
- How to Optimize Your Website for ChatGPT
- What Is Generative Engine Optimization (GEO)?
- GEO vs SEO: What's the Difference?
- How to Appear in AI-Generated Answers
- Free Website Audit: What to Check
Fundamentals: Learn about GEO (Generative Engine Optimization) and AEO (Answer Engine Optimization) — the two frameworks for optimizing your content for AI search engines.