llms.txt explained
What llms.txt is
A plain-text file at your domain's root (yoursite.co/llms.txt) that:
- Lists your important pages in a structured format.
- Provides a content index AI crawlers can read quickly.
- Includes metadata about each page's intent.
Unlike robots.txt, which only grants or denies crawler permission, llms.txt is focused on content discovery.
Why it matters in 2026
AI engines (ChatGPT Search, Perplexity, Claude.ai search, Gemini) cite content based on what they index. llms.txt helps with:
- Faster indexing of your important content.
- Better citation accuracy by AI engines.
- Direct signal about what's canonical on your site.
About 60 to 75% of well-known docs sites now publish one.
The three files
The May 2026 spec separates concerns:
- llms.txt: the content index. A list of important pages with descriptions.
- llms-full.txt: a full-content fallback. Some AI engines index this when present.
- ai.txt: crawler permissions. Allow/disallow specific AI bots.
Sample llms.txt
    # AskVault

    > AI customer support and helpdesk platform.

    ## Docs

    - [Getting Started](https://docs.askvault.co/getting-started/): Set up AskVault in 5 minutes
    - [Embed on React](https://docs.askvault.co/how-to/embed-ai-chatbot-on-react-website/): React widget setup
    - [Skills overview](https://docs.askvault.co/skills/skills-overview/): The 14 built-in skills

    ## Pricing

    - [Pricing](https://askvault.co/pricing/): Plans from ₹0 to ₹8,499 per month

    ## API

    - [API getting started](https://docs.askvault.co/api/getting-started/): Auth and first request

The file is plain Markdown: an H1 title, a blockquote summary, H2 sections, and bullet links. The spec at llmstxt.org defines the structure.
Sample ai.txt
    User-agent: GPTBot
    Allow: /

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: Google-Extended
    Allow: /

    User-agent: Bytespider
    Disallow: /

This allows reputable AI crawlers and blocks aggressive scrapers.
Implementation
About 30 minutes:
- Create /public/llms.txt in your site root.
- List your top 10 to 30 pages with descriptions.
- Create /public/ai.txt with crawler permissions.
- Optional: create /public/llms-full.txt with the full content body.
- Deploy.
- Verify at yoursite.co/llms.txt.
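The steps above can be sketched as a small build script. This is a minimal sketch, not an official tool: the site name, summary, and page list below are hypothetical placeholders you would replace with your own content source.

```python
# Sketch: generate llms.txt from a hand-maintained page list.
# SITE, SUMMARY, and PAGES are hypothetical placeholders.
from pathlib import Path

SITE = "AskVault"
SUMMARY = "AI customer support and helpdesk platform."

# (section, title, url, description) for your top 10 to 30 pages.
PAGES = [
    ("Docs", "Getting Started", "https://docs.askvault.co/getting-started/",
     "Set up AskVault in 5 minutes"),
    ("Pricing", "Pricing", "https://askvault.co/pricing/",
     "Plans from ₹0 to ₹8,499 per month"),
]

def build_llms_txt(site, summary, pages):
    """Render an H1 title, blockquote summary, and H2 sections with bullets."""
    lines = [f"# {site}", "", f"> {summary}"]
    current_section = None
    for section, title, url, desc in pages:
        if section != current_section:
            lines += ["", f"## {section}", ""]
            current_section = section
        lines.append(f"- [{title}]({url}): {desc}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Write into the /public directory so it deploys to the site root.
    Path("public").mkdir(exist_ok=True)
    Path("public/llms.txt").write_text(build_llms_txt(SITE, SUMMARY, PAGES))
```

Regenerating the file from the same source as your sitemap or content collections keeps it from going stale.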
Tools
- Validators: llmstxt-validator.dev.
- Generators: some CMS plugins auto-generate these files.
- AskVault's docs site generates them automatically from content collections.
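Before reaching for an external validator, a few structural rules can be checked locally. This is a rough sketch of the shape the spec implies (an H1 title first, then blockquote, H2 headings, and bullet links), not an official validator.

```python
import re

# Bullet lines should look like "- [title](https://url): description"
# (the trailing description is optional). A rough approximation of the
# llmstxt.org shape, not the authoritative grammar.
BULLET = re.compile(r"^- \[[^\]]+\]\(https?://[^\s)]+\)(: .+)?$")

def lint_llms_txt(text):
    """Return a list of structural problems found in an llms.txt body."""
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first line must be an H1 title ('# Name')")
    for line in lines[1:]:
        if line.startswith(("> ", "## ")):
            continue  # blockquote summaries and H2 sections are fine
        if line.startswith("- ") and not BULLET.match(line):
            problems.append(f"malformed bullet: {line!r}")
    return problems
```

Run it over the generated file in CI and fail the build if any problems come back.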
SEO + GEO benefits
Beyond AI indexing:
- Search engine bots see these files too and can treat them as a supplementary sitemap signal.
- Some Google AI Overview citations appear to favor listed pages.
- The benefit compounds as more AI engines adopt the spec.
Limits
- File size. Spec recommends under 50 KB for llms.txt, under 10 MB for llms-full.txt.
- Format. Strict Markdown structure.
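The size limits are easy to enforce in a build step. A sketch using the caps quoted above (assumed values; check the current spec):

```python
from pathlib import Path

# Recommended caps from the limits above (assumed; verify against the
# current spec): 50 KB for llms.txt, 10 MB for llms-full.txt.
LIMITS = {"llms.txt": 50 * 1024, "llms-full.txt": 10 * 1024 * 1024}

def oversized(root="public"):
    """Return a list of generated files that exceed their size cap."""
    root = Path(root)
    problems = []
    for name, cap in LIMITS.items():
        path = root / name
        if path.exists() and path.stat().st_size > cap:
            problems.append(f"{name}: {path.stat().st_size} bytes > {cap} byte cap")
    return problems
```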
Common pitfalls
- File too long: trim to the essentials; AI crawlers may truncate.
- Wrong format: Markdown headings are required; validate before deploying.
- Outdated content: regenerate periodically; stale entries hurt citation accuracy.
- Forgetting ai.txt: crawlers may still index your site, but you lose control over which ones do.
FAQ
Will this guarantee AI engines cite me?
No. Improves likelihood, not certainty.
Should I copy entire docs into llms-full.txt?
Yes, if you want your full content indexed; keep it under the 10 MB recommendation.
Does Google use llms.txt?
Some Google AI surfaces do. SEO benefit unclear; treat as bonus.