llms.txt
A Markdown file at a site's root that tells AI crawlers what the site is about and which URLs are canonical. It sits where robots.txt sits, but it points language models at content rather than restricting access.
llms.txt is a file format proposed by Jeremy Howard (Answer.AI) in September 2024. It lives at `https://yoursite.com/llms.txt`, is formatted as Markdown, and contains a site name, a one-sentence description, optional context, and a curated list of important URLs grouped into sections.
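To make the layout concrete, here is a minimal Python sketch that renders a file of that shape. It assumes the convention from the llms.txt proposal of an H1 for the site name, a blockquote for the description, and H2 headings for each group of links. The `Link`, `Section`, and `render_llms_txt` names and the example URLs are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short description shown after the link


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, sections: list[Section], context: str = "") -> str:
    """Render an H1 name, a blockquote summary, optional context, then H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    if context:
        lines += [context, ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
example = render_llms_txt(
    name="Example Store",
    summary="Hand-made widgets, shipped worldwide.",
    sections=[
        Section("Docs", [Link("Getting started", "https://yoursite.com/docs/start", "setup in five minutes")]),
        Section("Policies", [Link("Returns", "https://yoursite.com/returns")]),
    ],
)
print(example)
```

The printed text is what would be served at `/llms.txt`. Keeping the data in small structures rather than a hand-edited string makes it easier to regenerate the file when the list of important URLs changes.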
The goal is to give an LLM a single shot at understanding what a site is for. Crawling a full sitemap and inferring canonical structure from titles is lossy. Reading a 300-word llms.txt that explicitly says 'here's the product and here are the ten URLs that explain it' is not.
It has a companion format, `/llms-full.txt`, which inlines the full text of each linked page. This is expensive in bytes, but it appears to be the format best suited to Claude's retrieval, and documentation-heavy sites seem to gain the most citation uplift from it.
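The inlining step can be sketched in Python as follows. This is an illustration under stated assumptions, not a defined format: the layout of each inlined entry (an H3 title, a `Source:` line, then the page text) is invented here, and `build_llms_full` and the `page_text` mapping are hypothetical names. Page text is passed in as a dictionary so the sketch needs no network access.

```python
import re

# Matches a Markdown list item of the form "- [Title](https://...)".
LINK_RE = re.compile(r"^\s*-\s*\[([^\]]+)\]\(([^)\s]+)\)")


def build_llms_full(llms_txt: str, page_text: dict[str, str]) -> str:
    """Replace each link line with the linked page's text when that text is known."""
    out = []
    for line in llms_txt.splitlines():
        match = LINK_RE.match(line)
        if match is None:
            out.append(line)  # headings, summary and blank lines pass through
            continue
        title, url = match.groups()
        body = page_text.get(url)
        if body is None:
            out.append(line)  # keep the bare link when no text is available
        else:
            # Assumed entry layout: title, source URL, then the page text.
            out += [f"### {title}", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(out).rstrip() + "\n"


# Hypothetical input, for illustration only.
llms_txt = (
    "# Site\n\n> Summary.\n\n## Docs\n\n"
    "- [Start](https://yoursite.com/start): intro\n"
    "- [API](https://yoursite.com/api)\n"
)
pages = {"https://yoursite.com/start": "Install the thing.\n"}
full = build_llms_full(llms_txt, pages)
print(full)
```

The output grows with every page that is inlined, which is the byte cost mentioned above.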
In AIRRNK
AIRRNK checks for a well-formed llms.txt on every scan. It's a 5-point line item on the 47-point rubric. The WordPress plugin and Shopify app can both auto-generate the file, serve it for you, and keep it in sync with your content.
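As a rough idea of what a "well-formed" check might look at, here is a minimal Python sketch. It is not AIRRNK's actual scanner logic, which is not described here. The rules (an H1 first, a blockquote summary, at least one H2 section, and at least one link per section) are assumptions, and they are stricter on some points than the original proposal. The `check_llms_txt` name is hypothetical.

```python
import re

# Matches a list item that links to an absolute http(s) URL.
LINK_RE = re.compile(r"^-\s*\[[^\]]+\]\((https?://[^)\s]+)\)")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means these assumed checks passed."""
    problems = []
    non_empty = [ln.rstrip() for ln in text.splitlines() if ln.strip()]

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line should be an H1 with the site name")
    if sum(ln.startswith("# ") for ln in non_empty) > 1:
        problems.append("more than one H1")
    if not any(ln.startswith("> ") for ln in non_empty):
        problems.append("missing blockquote summary")

    # Count links under each H2 heading.
    link_counts: dict[str, int] = {}
    current = None
    for ln in non_empty:
        if ln.startswith("## "):
            current = ln[3:].strip()
            link_counts[current] = 0
        elif current is not None and LINK_RE.match(ln):
            link_counts[current] += 1

    if not link_counts:
        problems.append("no H2 sections")
    for heading, count in link_counts.items():
        if count == 0:
            problems.append(f"section '{heading}' has no links")
    return problems


# Hypothetical inputs, for illustration only.
good = "# Example Store\n\n> Hand-made widgets.\n\n## Docs\n\n- [Start](https://yoursite.com/start): intro\n"
bad = "Welcome\n## Docs\n"
print(check_llms_txt(good))  # → []
print(check_llms_txt(bad))
```

Returning a list of problems rather than a single boolean makes it straightforward to show the site owner what to fix.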
- Generative Engine Optimization: The practice of making a website more likely to be cited by AI answer engines (ChatGPT, Claude, Perplexity, Google AI Mode) rather than simply ranked on a traditional search results page.
- Schema Markup: Structured data embedded in a page (usually as JSON-LD) that describes what the page is about in a machine-readable vocabulary defined at schema.org.
- Indexing API: An API, most commonly IndexNow (Bing/Yandex) or Google's Indexing API, that pushes URL changes to a search index in real time instead of waiting for crawlers.
Written by
The AIRank Editorial Team
Research & editorial, AIRank
The AIRank editorial team runs the 47-point scanner, the Observer pings, and the GEO research programme every week. Writing is reviewed by the core engineers who build the Injector, Blaster, and Surgeon agents.