
llms.txt explained

4 min read

What llms.txt is

A plain-text file at your domain's root (yoursite.co/llms.txt) that:

  • Lists your important pages in a structured format.
  • Provides a content index AI crawlers can read quickly.
  • Includes metadata about each page's intent.

Unlike robots.txt, which only grants or denies crawler access, llms.txt is focused on content discovery.

Why it matters in 2026

AI engines (ChatGPT Search, Perplexity, Claude.ai search, Gemini) cite content based on what they index. llms.txt helps:

  • Faster indexing of your important content.
  • Better citation accuracy by AI engines.
  • Direct signal about what's canonical on your site.

About 60 to 75% of well-known docs sites now publish one.

The three files

The May 2026 spec separates concerns:

llms.txt. Content index. List of important pages with descriptions.

llms-full.txt. Full content fallback. Some AI engines index this when present.

ai.txt. Crawler permissions. Allow/disallow specific AI bots.

Sample llms.txt

# AskVault
> AI customer support and helpdesk platform.
## Docs
- [Getting Started](https://docs.askvault.co/getting-started/): Set up AskVault in 5 minutes
- [Embed on React](https://docs.askvault.co/how-to/embed-ai-chatbot-on-react-website/): React widget setup
- [Skills overview](https://docs.askvault.co/skills/skills-overview/): The 14 built-in skills
## Pricing
- [Pricing](https://askvault.co/pricing/): Plans from ₹0 to ₹8,499 per month
## API
- [API getting started](https://docs.askvault.co/api/getting-started/): Auth and first request

Markdown-formatted: an H1 title, a blockquote summary, then H2 sections of bullet links. The spec at llmstxt.org defines the structure.
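As a sketch of how a crawler or validator might read this structure, here is a minimal Python parser. It assumes the llmstxt.org layout described above (one H1 title, an optional blockquote summary, and H2 sections of `- [name](url): description` bullets); it is illustrative, not a reference implementation:

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt file into a title, summary, and sections of links."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    link_re = re.compile(r"-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?")
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()          # H1 title
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()        # blockquote summary
        elif line.startswith("## "):
            current = line[3:].strip()               # start a new H2 section
            doc["sections"][current] = []
        elif (m := link_re.match(line)) and current:
            doc["sections"][current].append(
                {"name": m["name"], "url": m["url"], "desc": m["desc"] or ""}
            )
    return doc
```

Feeding it the sample above yields a dictionary keyed by section ("Docs", "Pricing", "API"), each mapping to its list of links and descriptions.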

Sample ai.txt

User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Bytespider
Disallow: /

Allows reputable AI crawlers; blocks aggressive scrapers.
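Because ai.txt reuses robots.txt syntax, you can sanity-check your rules with Python's standard-library robots parser before deploying. A quick sketch, using an abbreviated version of the file above (the hostname is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# Abbreviated version of the ai.txt sample above.
AI_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: Bytespider
Disallow: /
"""

parser = RobotFileParser()
parser.parse(AI_TXT.splitlines())

# GPTBot is allowed everywhere; Bytespider is blocked entirely.
print(parser.can_fetch("GPTBot", "https://yoursite.co/pricing/"))  # True
print(parser.can_fetch("Bytespider", "https://yoursite.co/"))      # False
```

This catches typos in `User-agent` tokens or rule paths before a crawler ever sees them.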

Implementation

About 30 minutes:

  1. Create llms.txt in your site's static directory (e.g. /public/llms.txt) so it is served at your domain root.
  2. List your top 10 to 30 pages with descriptions.
  3. Create /public/ai.txt with crawler permissions.
  4. Optional: /public/llms-full.txt with full content body.
  5. Deploy.
  6. Verify at yoursite.co/llms.txt.
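The first two steps can be scripted. A hedged sketch in Python (the page list, descriptions, and `public/` output directory are placeholders for your own):

```python
from pathlib import Path

# Hypothetical page list; replace with your own top 10 to 30 pages.
PAGES = {
    "Docs": [
        ("Getting Started", "https://docs.askvault.co/getting-started/",
         "Set up AskVault in 5 minutes"),
    ],
    "Pricing": [
        ("Pricing", "https://askvault.co/pricing/",
         "Plans from ₹0 to ₹8,499 per month"),
    ],
}

def build_llms_txt(title: str, summary: str, pages: dict) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 sections."""
    lines = [f"# {title}", f"> {summary}", ""]
    for section, links in pages.items():
        lines.append(f"## {section}")
        for name, url, desc in links:
            lines.append(f"- [{name}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines)

out = Path("public")
out.mkdir(exist_ok=True)
(out / "llms.txt").write_text(
    build_llms_txt("AskVault", "AI customer support and helpdesk platform.", PAGES),
    encoding="utf-8",
)
```

Run it as part of your build so the file regenerates on every deploy instead of drifting out of date.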

Tools

  • Validators. llmstxt-validator.dev.
  • Generators. Some CMS plugins auto-generate.
  • AskVault's docs site generates these from content collections automatically.

SEO + GEO benefits

Beyond AI indexing:

  • Search engine bots see these files too and may treat them as a sitemap-like signal.
  • Some Google AI Overview citations appear to prefer pages listed in llms.txt.
  • Compounding as more AI engines adopt the spec.

Limits

  • File size. Spec recommends under 50 KB for llms.txt, under 10 MB for llms-full.txt.
  • Format. Strict Markdown structure.
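A quick pre-deploy check against the size caps quoted above; a sketch assuming the files live in a `public/` output directory:

```python
from pathlib import Path

# Recommended caps from the spec, as quoted in this article.
CAPS = {"llms.txt": 50 * 1024, "llms-full.txt": 10 * 1024 * 1024}

def oversized(root: str = "public") -> list[str]:
    """Return the names of any generated files exceeding their recommended cap."""
    return [
        name
        for name, cap in CAPS.items()
        if (Path(root) / name).exists() and (Path(root) / name).stat().st_size > cap
    ]
```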

Common pitfalls

File too long. Trim to essentials; AI crawlers may truncate.

Wrong format. Markdown headings required. Validate.

Outdated content. Re-generate periodically; stale entries hurt citation accuracy.

Forgetting ai.txt. Crawlers may still index your site, but you lose control over which ones do.

FAQ

Will this guarantee AI engines cite me?

No. Improves likelihood, not certainty.

Should I copy entire docs into llms-full.txt?

Yes, if you want AI engines to index the full text; keep it under the recommended 10 MB.

Does Google use llms.txt?

Some Google AI surfaces do. SEO benefit unclear; treat as bonus.
