Your Full-Service Digital Agency & AI Strategy Partner
1Digital

Answer.AI Spec · Sept 2024 · Markdown at /llms.txt

llms.txt Implementation

The llms.txt spec, what it actually does, when an eCommerce site should ship one, a complete sample file you can adapt, and the implementation mistakes that erode the file's value before it ever earns a citation.

At 1Digital®, we ship llms.txt as part of our standard AI SEO engagement on Shopify Plus, BigCommerce (Elite Partner since 2012), and Adobe Commerce stores. The file is cheap to write, costs nothing to host, and accumulates value across every LLM that adopts the spec. This page documents what we ship and why.

TL;DR: llms.txt is a Markdown file at /llms.txt that gives LLMs a curated map of your site: brand summary, core pages, product categories, reference content. Proposed by Answer.AI's Jeremy Howard in September 2024, it has been partially adopted across Claude, Cursor, MCP tooling, and increasingly the major engines. It does not replace robots.txt or sitemap.xml; it complements them. Curate, don't enumerate; ship a sharp blockquote summary; link durable canonical URLs only; treat it like a footer nav (reviewed quarterly, not set-and-forget).

Sample llms.txt for an eCommerce Site

The sample below is the structure we ship for a fictional outdoor retailer. The file is intentionally short: 13 links here, organized into four labeled sections, with a short descriptive label on each link that explains what the page contains. Substitute your own brand and the structure transfers cleanly to any catalog.

# Acme Outdoor Co.
> Acme Outdoor Co. is a US specialty retailer of camping, hiking, and overlanding gear, founded in 2011 and headquartered in Bozeman, Montana. We ship across the continental US and Canada.

## Core pages
- [About](https://www.acmeoutdoor.com/about/): company history, leadership, sourcing standards, manufacturing partners.
- [Sustainability](https://www.acmeoutdoor.com/sustainability/): materials, repair-and-resole program, third-party certifications.
- [Sizing & fit guide](https://www.acmeoutdoor.com/sizing/): boot, pack, and apparel sizing methodology with measurement instructions.
- [Returns & warranty](https://www.acmeoutdoor.com/returns/): 90-day return policy, lifetime warranty terms, exclusions.

## Product categories
- [Backpacking packs](https://www.acmeoutdoor.com/c/backpacking-packs/): 35L–85L packs, suspension types, weight ranges.
- [Hiking boots](https://www.acmeoutdoor.com/c/hiking-boots/): waterproof / non-waterproof, ankle support categories, terrain ratings.
- [Tents](https://www.acmeoutdoor.com/c/tents/): 1–6 person, 3-season vs 4-season, fast-pitch and ultralight categories.
- [Sleep systems](https://www.acmeoutdoor.com/c/sleep-systems/): sleeping bags by temperature rating, pads, liners, pillows.

## Reference content
- [Pack-fitting guide](https://www.acmeoutdoor.com/guides/pack-fitting/): torso-length measurement, hip-belt sizing, load distribution.
- [Boot break-in](https://www.acmeoutdoor.com/guides/boot-break-in/): step-by-step break-in for full-grain leather and synthetic boots.
- [4-season tent FAQ](https://www.acmeoutdoor.com/guides/4-season-tent-faq/): venting, snow load, vestibule storage, pole geometry.

## Optional
- [Press kit](https://www.acmeoutdoor.com/press/): company facts, founder bios, brand assets.
- [Affiliate program](https://www.acmeoutdoor.com/affiliates/): commission rates, approved-creator program, payout terms.

Two design notes. First, every link gets a short colon-prefixed label that describes the content, not just the page name. “Backpacking packs” alone is forgettable; “Backpacking packs: 35L–85L packs, suspension types, weight ranges” is what an LLM lifts when summarizing the site. Second, the file caps at the pages that explain who, what, and how. SKU-level PDP URLs do not appear — they belong in sitemap.xml and in your structured data feed, not in llms.txt.
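One way to sanity-check a file like the sample is to parse it back into its parts and count what is there. The sketch below is a minimal illustration, not a reference parser: `LlmsTxt`, `parse_llms_txt`, and the link regex are hypothetical names of our own, and it assumes the one-link-per-bullet layout used in the sample above.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](https://example.com/path/): optional label"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<label>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # section heading -> list of (title, url, label) tuples
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split an llms.txt file into H1 title, blockquote summary, and link sections."""
    doc = LlmsTxt()
    current = None  # heading of the section we are inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and not doc.summary and current is None:
            doc.summary = line[1:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["label"] or "")
                )
    return doc


def link_count(doc: LlmsTxt) -> int:
    """Total number of links across all sections."""
    return sum(len(links) for links in doc.sections.values())
```

Run against the Acme sample, `link_count` is a quick guard against the file drifting from a curated list toward a sitemap, and an empty label in any tuple flags a link that names a page without describing it.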

The mistakes that cost the file its value

Five llms.txt Implementation Mistakes

Shipping llms.txt and assuming it replaces robots.txt or sitemap.xml

It doesn't. llms.txt is a discovery aid for LLMs, not a directive. Robots.txt still controls crawler access; sitemap.xml still feeds the search index. Crawlers and LLMs that don't read llms.txt — and that's still the majority of them in 2026 — won't see your file at all. Ship llms.txt alongside the existing files, not instead of them.
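Because robots.txt still governs access, it is worth confirming that the crawlers you care about are allowed to fetch /llms.txt in the first place. The sketch below uses Python's standard `urllib.robotparser`; the helper name `agents_allowed` and the example robots.txt body are assumptions for illustration, and the user-agent strings should be checked against each vendor's current documentation.

```python
from urllib.robotparser import RobotFileParser


def agents_allowed(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, per user agent, whether robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks one AI crawler entirely.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /cart/

User-agent: ClaudeBot
Disallow: /
"""

if __name__ == "__main__":
    result = agents_allowed(
        EXAMPLE_ROBOTS,
        "https://www.acmeoutdoor.com/llms.txt",
        ["ClaudeBot", "OAI-SearchBot", "Bingbot"],
    )
    for agent, allowed in result.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A crawler that is blocked here never reaches the file, however carefully it was written.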

Linking every page on the site, like a sitemap

The spec is explicit: llms.txt is a curated subset. The point is to tell an LLM ‘here are the 20–40 pages that explain who we are, what we sell, and how we operate’, not to enumerate 4,000 SKU URLs. A long, undifferentiated file is worse than no file, because it dilutes the signal and trains the LLM to ignore it.

Pointing at PDPs or session-bound URLs

llms.txt should link the durable, canonical, signed-out version of each resource. PDPs that 302-redirect to localized variants, paginated category URLs with ?page= parameters, and search-result URLs that expire are all bad citations. The LLM may fetch the URL months after you publish the file — make sure it'll still resolve to the same content.
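Some of these problems can be caught before publishing with a simple offline lint on each URL. The following is a rough heuristic sketch: the function name `url_warnings` and the parameter and path-segment lists are our own assumptions, not part of any spec, and an offline check cannot see redirects, so 302 behavior still needs a live request.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed examples of parameters and path segments that tend to be
# session-bound or paginated; adjust to your platform.
VOLATILE_PARAMS = {"page", "q", "query", "sid", "sessionid", "sort"}
VOLATILE_SEGMENTS = {"search", "cart", "checkout", "account"}


def url_warnings(url: str) -> list[str]:
    """Return human-readable warnings for a URL that looks non-durable."""
    parts = urlsplit(url)
    warnings = []
    params = parse_qs(parts.query, keep_blank_values=True)
    volatile = sorted(p for p in params if p.lower() in VOLATILE_PARAMS)
    if volatile:
        warnings.append("volatile query parameter(s): " + ", ".join(volatile))
    elif parts.query:
        warnings.append("has a query string; prefer the canonical path")
    segments = {seg for seg in parts.path.lower().split("/") if seg}
    if segments & VOLATILE_SEGMENTS:
        warnings.append("session-bound or search path")
    return warnings
```

For example, `url_warnings("https://www.acmeoutdoor.com/c/tents/?page=2")` flags the `page` parameter, while the bare category URL returns an empty list.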

Writing the file in HTML or marketing prose

Markdown only, per spec. Headings (#, ##), unordered bullet links, an optional > blockquote summary at the top. The format is deliberately constrained so LLMs can parse it deterministically. A page that says ‘welcome to our llms.txt!’ is being polite to no one; LLMs skip past the prose and look for the link structure.

Skipping the > blockquote summary

The blockquote that immediately follows the H1 is the single most-cited block in the entire file. LLMs lift it as the ‘what is this company’ answer. Two sentences. Specific. Named (city, state, year founded, category). Skip it and you've handed the framing decision back to whatever third-party page the LLM finds first.
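The last two mistakes are easy to check mechanically. The sketch below is one possible pre-publish check, under stated assumptions: `check_header` is a hypothetical helper, the 50-word cap is our own proxy for "two sentences" (naive sentence splitting trips on names like "Co."), and only the first line of the blockquote is inspected.

```python
import re

# A few common HTML tags; a hit suggests the file is not plain Markdown.
HTML_TAG_RE = re.compile(
    r"</?(?:html|body|div|span|p|a|br|ul|li|h[1-6])\b[^>]*>", re.IGNORECASE
)
MAX_SUMMARY_WORDS = 50  # assumed cap, not taken from the spec


def check_header(text: str) -> list[str]:
    """Check that the file opens with an H1 followed by a short blockquote summary."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        return ["first non-blank line should be an H1 ('# Brand Name')"]
    problems = []
    if len(lines) < 2 or not lines[1].startswith(">"):
        problems.append("no blockquote summary immediately after the H1")
    else:
        summary = lines[1][1:].strip()
        if len(summary.split()) > MAX_SUMMARY_WORDS:
            problems.append("summary is long; aim for about two sentences")
        if not re.search(r"\b(?:19|20)\d{2}\b", summary):
            problems.append("summary does not mention a founding year")
    if HTML_TAG_RE.search(text):
        problems.append("file contains HTML tags; use Markdown only")
    return problems
```

An empty list means the opening of the file matches the shape described above; it says nothing about whether the summary is well written.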

Where llms.txt Sits in the Larger Stack

Three layers, complementary, not competing. schema.org structured data is the foundation: Product, Offer, Article, FAQPage, Organization markup that annotates every meaningful page at the machine-readable level. llms.txt is discovery: a single curated Markdown file that tells an LLM which pages matter most. Model Context Protocol (MCP) is direct first-party data exposure: a server an LLM can call instead of fetching pages at all. Most brands ship all three at different investment levels — schema everywhere as a permanent fixture, llms.txt at root as cheap plumbing, MCP only when the category warrants it.
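To make the foundation layer concrete, here is a minimal sketch of generating Organization markup as JSON-LD. The function name `organization_jsonld` is hypothetical and the values come from the fictional Acme example; the property names (`foundingDate`, `address`, `addressLocality`, and so on) are standard schema.org vocabulary, but which properties your pages need is a separate decision.

```python
import json


def organization_jsonld(name: str, url: str, description: str,
                        founding_date: str, city: str, region: str,
                        country: str) -> str:
    """Render a schema.org Organization block as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "foundingDate": founding_date,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": city,
            "addressRegion": region,
            "addressCountry": country,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(organization_jsonld(
        name="Acme Outdoor Co.",
        url="https://www.acmeoutdoor.com/",
        description="US specialty retailer of camping, hiking, and overlanding gear.",
        founding_date="2011",
        city="Bozeman",
        region="MT",
        country="US",
    ))
```

Keeping the facts here consistent with the llms.txt blockquote summary means the annotation and discovery layers tell an LLM the same story.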

For eCommerce specifically, our default recommendation: ship schema sitewide as part of eCommerce AI optimization; ship llms.txt as part of AI SEO; consider MCP only for B2B, technical, and research-heavy categories where the engineering investment earns back through direct LLM-to-server data exchange. Full MCP scoping on /mcp-for-ecommerce.

llms.txt — FAQ

What is llms.txt?

llms.txt is a proposed text-file standard that lives at the root of a website (e.g. https://www.example.com/llms.txt) and gives large language models a curated, machine-readable map of the site's most important content. It was proposed by Jeremy Howard of Answer.AI in September 2024 and has been adopted by a growing set of LLM-friendly tools and a handful of major AI engines through 2025-2026. The file is written in Markdown, opens with the site name and a short blockquote summary, and lists key sections (about, products, reference content, policies) as labeled links. The goal: help LLMs answer questions about the site without crawling thousands of pages of marketing markup.

Is llms.txt an official standard?

It is a community-maintained proposal, not a W3C or IETF standard. The reference spec lives at llmstxt.org. Adoption has been driven by the community and a growing set of AI tools (Cursor, Claude Code, various MCP integrations, several developer-docs platforms) and increasingly by the major engines themselves. Treat it like robots.txt circa 1995: not a guarantee any specific crawler will honor it, but a low-cost piece of plumbing that increasingly does work, and that costs nothing to ship.

How is llms.txt different from robots.txt and sitemap.xml?

  • robots.txt: tells crawlers what they may and may not access. Directive.
  • sitemap.xml: enumerates every indexable URL for the search engine. Comprehensive.
  • llms.txt: curates a small set of pages that explain the site to an LLM. Editorial.

What should an eCommerce llms.txt actually contain?

  • H1 with the brand name (literally # Brand Name).
  • Blockquote summary — two sentences naming category, geography, year founded, what you sell.
  • ## Core pages — about, sustainability, sizing, returns/warranty, shipping. The pages that explain who and how.
  • ## Product categories — top-level category landing pages, each with a short label that describes the assortment (not just the category name).
  • ## Reference content — your sizing guides, buying guides, materials FAQs. Editorial content LLMs cite when grounding an answer.
  • ## Optional — press kit, affiliate program, careers. Anything that's useful but not core.
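If the page list lives in a CMS or a spreadsheet, the file can be rendered from data rather than hand-edited. The sketch below assumes a plain dictionary of sections; `render_llms_txt` and the tuple layout are our own illustrative choices, not a format the spec prescribes.

```python
def render_llms_txt(brand: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render H1, blockquote summary, and labeled link sections as Markdown.

    `sections` maps a heading to a list of (title, url, label) tuples and is
    emitted in insertion order.
    """
    lines = [f"# {brand}", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for title, url, label in links:
            lines.append(f"- [{title}]({url}): {label}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Acme Outdoor Co.",
        "Acme Outdoor Co. is a US specialty retailer of camping, hiking, and "
        "overlanding gear, founded in 2011 and headquartered in Bozeman, Montana.",
        {
            "Core pages": [
                ("About", "https://www.acmeoutdoor.com/about/",
                 "company history, leadership, sourcing standards."),
            ],
            "Optional": [
                ("Press kit", "https://www.acmeoutdoor.com/press/",
                 "company facts, founder bios, brand assets."),
            ],
        },
    ))
```

Generating the file this way keeps the section order and label format consistent from one quarterly review to the next.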

Show me a sample llms.txt for an eCommerce site.

The sample in the “Sample llms.txt for an eCommerce Site” section above is for a fictional outdoor retailer. Substitute your own brand and the structure transfers cleanly. Keep the file under ~50 links unless you have a clear reason to ship more; the value is curation, not exhaustiveness.

Should every eCommerce site ship an llms.txt?

Most should — the cost is one file, the downside is essentially zero, and the upside is non-trivial for sites that already have strong reference content (sizing, buying guides, FAQs). The sites where llms.txt moves the needle most: multi-category retailers with deep editorial content (the curation helps), brands with strong sustainability or sourcing stories (the about/sustainability links get cited heavily on values-driven prompts), and B2B sites with technical reference docs (LLMs love structured technical reference). Sites with thin content, weak category structure, or no editorial reference material are better off investing in the content first; llms.txt amplifies what you have, it doesn't create signal that isn't there.

Which AI engines actually read llms.txt today?

Adoption is partial and accelerating. The clear yes-list: Anthropic Claude (especially Claude Code and Projects ingestion), Cursor and other developer-tools layers built on LLMs, and a growing set of MCP integrations that use llms.txt for discovery. ChatGPT and Perplexity have publicly acknowledged the file format but their specific weighting is opaque. Google's Gemini stack has not committed publicly. The honest current state: it's plumbing that helps with the engines that read it and is invisible to the engines that don't — meaning the floor is zero and the ceiling rises with each engine that adopts.

What are the common implementation mistakes?

  • Skipping the blockquote summary: single highest-leverage block in the file; LLMs lift it directly.
  • Linking everything: curation is the point; an llms.txt with 500 links is functionally noise.
  • Pointing at session-bound or PDP URLs: link the durable canonical versions only.
  • Writing in HTML or prose: Markdown-only per spec, parsed deterministically.
  • Treating llms.txt as a replacement for robots.txt or sitemap.xml: it's additive, not substitutive.
  • Forgetting to update it: stale links erode the file's usefulness fast. Treat it like the footer nav — review quarterly.
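
The quarterly review in the last item can be partly automated with a link check. This is a hedged sketch using only the standard library: `fetch_status` and `stale_links` are hypothetical helpers, some servers reject HEAD requests, and because `urlopen` follows redirects, the check compares the final URL against the one listed in the file.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Return (HTTP status, final URL after redirects); status 0 means no response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.url
    except urllib.error.HTTPError as err:
        return err.code, url
    except urllib.error.URLError:
        return 0, url


def stale_links(urls: Iterable[str],
                fetch: Callable[[str], tuple[int, str]] = fetch_status) -> dict[str, str]:
    """Map each problematic URL to a short reason; healthy URLs are omitted."""
    report = {}
    for url in urls:
        status, final_url = fetch(url)
        if status != 200:
            report[url] = f"HTTP {status}"
        elif final_url != url:
            report[url] = f"redirects to {final_url}"
    return report
```

Passing a stub as `fetch` makes the function testable without network access; in a scheduled job the default performs real requests.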

Should we also ship llms-full.txt?

The spec describes an optional companion file, llms-full.txt, which can contain the full text of each linked resource concatenated into one Markdown document — useful for tools that want to ingest the whole site context in one fetch. For most eCommerce sites this is overkill and a maintenance liability (any content change requires re-generating the file). Ship it if you have a small, stable, reference-heavy site (docs, manuals, B2B knowledge base); skip it if your catalog turns over weekly or you have hundreds of category and PDP pages.
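For the sites where it does make sense, building the file is mostly concatenation. The sketch below assumes you already have each page's content as Markdown; `build_llms_full` and the `Source:` line are our own illustrative layout, since the exact shape of the companion file is not something we can point to a fixed definition for.

```python
def build_llms_full(title: str, summary: str,
                    pages: list[tuple[str, str, str]]) -> str:
    """Concatenate pages into one Markdown document.

    `pages` is a list of (page_title, url, markdown_body) tuples, emitted in
    the order given.
    """
    parts = [f"# {title}", f"> {summary}", ""]
    for page_title, url, body in pages:
        parts += [f"## {page_title}", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


if __name__ == "__main__":
    print(build_llms_full(
        "Acme Outdoor Co.",
        "US specialty retailer of camping, hiking, and overlanding gear.",
        [
            ("Pack-fitting guide",
             "https://www.acmeoutdoor.com/guides/pack-fitting/",
             "Measure torso length from the C7 vertebra to the iliac crest."),
        ],
    ))
```

Regenerating this output in the same job that publishes content changes is one way to contain the maintenance liability described above.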

How does llms.txt relate to MCP and structured data?

Different layers of the same goal: making your site machine-legible to LLMs and agents. llms.txt is discovery — “here are the pages worth reading.” schema.org structured data is annotation — “here's what each page means at a machine-readable level.” Model Context Protocol (MCP) is direct first-party data exposure — “skip the pages, ask my server.” The three are complementary, not competing. Most brands ship all three at different depth levels: schema everywhere (foundational), llms.txt at root (cheap, high-value), and MCP only where the category warrants the engineering investment. Full MCP scoping on /mcp-for-ecommerce.

Ship a curated llms.txt — and the rest of the AI SEO stack with it.

We'll author your llms.txt, validate your sitewide schema, audit ClaudeBot / OAI-SearchBot / Bingbot access, and stand up citation-share measurement. 941+ verified reviews · 4.9/5.