Search engines indexed your site for humans. AI agents need something different, and most sites are not agent-ready. Here is the practical checklist I built against my own codebase, plus the fastest way to score your own domain against the emerging agent-ready standards of 2026.
What "agent-ready" actually means
An AI agent that visits your site is not a browser. It does not render CSS, does not execute a carousel, does not wait for hydration. It wants three things in this order: a map of what exists, a clean representation of each page, and a contract for what it can do next.
Traditional SEO solves the first one. The second and third are where most sites are still blind. That gap is what the emerging stack of AI agent standards — llms.txt, `.well-known/mcp`, Agent Skills, OpenAPI discovery, markdown content negotiation — is designed to close.
The five categories that decide your score
1. Discovery
The absolute baseline. If an agent cannot answer "what lives here?" in a single request, everything downstream gets expensive.
robots.txt with an explicit `Sitemap:` line and bot-specific rules
XML sitemap with `hreflang` for multilingual sites
llms.txt — a short, structured summary aimed at LLM crawlers
HTTP Link headers that surface canonical, alternate and OpenAPI references without the agent having to parse HTML
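As a concrete illustration, the Link-header part of that surface can ride along on an ordinary HTML response. The domain and the markdown alternate are placeholders; `canonical` and `alternate` are standard relations, and `service-desc` is the registered relation for pointing at an API description (RFC 8631):

```http
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Link: <https://example.com/page>; rel="canonical",
      <https://example.com/page.md>; rel="alternate"; type="text/markdown",
      <https://example.com/.well-known/openapi.json>; rel="service-desc"
```

An agent that only issues a HEAD request gets the whole map from these three links without parsing a byte of HTML.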
2. Content accessibility
Agents prefer Markdown over HTML. It is cheaper to tokenize, less lossy, and does not carry scripts. A site that serves `text/markdown` on request is measurably easier to work with than one that forces every agent to run a headless browser.
Two ways to offer it:
Content negotiation: respond with Markdown when the request includes `Accept: text/markdown`
Explicit route: expose a `/api/md/<slug>` or `<slug>.md` path that returns the clean Markdown source of any canonical page
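The negotiation branch is a few lines of logic in any framework. Here is a minimal, framework-free sketch; a production server should also honour q-values and wildcards per RFC 9110, which this deliberately skips:

```python
def negotiate(accept_header: str) -> str:
    """Return the media type to serve based on the request's Accept header.

    Simplified sketch: splits the header into offered types, ignoring
    q-values and wildcards. If the client lists text/markdown at all,
    serve Markdown; otherwise fall back to HTML.
    """
    offered = [item.strip().split(";")[0] for item in accept_header.split(",")]
    return "text/markdown" if "text/markdown" in offered else "text/html"
```

Wire this into your handler so `text/markdown` requests get the page's clean Markdown source and everything else gets the rendered HTML, with the `Content-Type` set to whatever `negotiate` returned.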
3. Protocol discovery
This is the layer most sites skip because it is newest. Each of these files is a few kilobytes of JSON and turns your site from a passive content source into something agents can actually bind to.
`/.well-known/mcp` — Model Context Protocol descriptor
`/.well-known/agent-skills` — manifest of callable capabilities
`/.well-known/oauth-authorization-server` — lets agents negotiate auth without a human
`/.well-known/openapi.json` — machine-readable contract of every public endpoint
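Checking which of these descriptors a domain already serves is a one-screen script. This sketch probes each path with a HEAD request; the endpoint list mirrors the four files above, and any 2xx response is counted as present:

```python
import urllib.request
from urllib.parse import urljoin

WELL_KNOWN = [
    "/.well-known/mcp",
    "/.well-known/agent-skills",
    "/.well-known/oauth-authorization-server",
    "/.well-known/openapi.json",
]

def discovery_urls(origin: str) -> list[str]:
    """Absolute URLs an agent would probe for protocol descriptors."""
    return [urljoin(origin, path) for path in WELL_KNOWN]

def probe(origin: str, timeout: float = 5.0) -> dict[str, bool]:
    """HEAD each descriptor URL; True means the server answered 2xx."""
    found = {}
    for url in discovery_urls(origin):
        request = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(request, timeout=timeout) as resp:
                found[url] = 200 <= resp.status < 300
        except Exception:
            # 404s raise HTTPError; DNS and TLS failures raise URLError.
            found[url] = False
    return found
```

Run `probe("https://yourdomain.com")` against your own site to see which of the four files you are missing.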
4. Bot access controls
Not a permission, a declaration. Every major AI crawler now has a named user-agent: GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended, and more. Your robots.txt should have a yes/no answer for each one you care about. Silence gets interpreted inconsistently.
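A declarative robots.txt along these lines answers each named crawler explicitly. The allow/deny choices and the domain here are illustrative, not a recommendation:

```text
# Explicit answer per AI crawler -- adjust policy to taste
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```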
If you already sit behind Cloudflare, you get most of this for free. AI Audit shows exactly which AI bots hit your site and how often, AI Crawl Control gives you per-bot allow/deny toggles without editing robots.txt by hand, and AI Labyrinth feeds misbehaving scrapers dead-end content. Use Cloudflare's dashboard if you have it; fall back to robots.txt if you don't.
5. Agent commerce (optional)
The last category is genuinely emerging. Protocols like x402 (HTTP 402 payment headers), UCP (Universal Commerce Protocol), and ACP (Agent Commerce Protocol) let agents transact directly with your site. Skip this entirely if you don't sell anything. If you do, the early movers will own the market.
Score your site in 30 seconds
A manual checklist is guesswork. A real scanner hits your domain over HTTP, reads the headers, parses the well-known files, and reports a number.
The cleanest tool for this right now is isitagentready.com. Paste in your URL — it fetches your `robots.txt`, `sitemap.xml`, `llms.txt`, `.well-known/*` endpoints and OAuth metadata, then grades you across all five categories above. Takes about 30 seconds. No sign-up, no account.
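If you want a rough self-check before reaching for the scanner, the grading step is easy to approximate once you have pass/fail results per check. The categories and weights below are my own guesses for illustration, not isitagentready.com's actual rubric:

```python
# Hypothetical weights summing to 100 -- the real rubric is not public.
WEIGHTS = {
    "robots_txt": 10,
    "sitemap_xml": 10,
    "llms_txt": 15,
    "markdown_negotiation": 15,
    "well_known_openapi": 15,
    "well_known_mcp": 20,
    "bot_policy": 10,
    "commerce": 5,
}

def score(results: dict[str, bool]) -> int:
    """Weighted 0-100 score from per-check pass/fail results.

    Missing keys count as failures, so a partial scan still scores.
    """
    total = sum(WEIGHTS.values())
    earned = sum(w for check, w in WEIGHTS.items() if results.get(check))
    return round(100 * earned / total)
```

Note the shape of the rubric: protocol discovery carries the most weight, which matches the claim that `.well-known/mcp` is the step-change item.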
What that score actually predicts
A score under 30 means agents will still work with your site, but they will spend most of their budget on parsing instead of reasoning. You are paying for brute force.
Between 30 and 60, you have the basics. Agents can find you and read you. They cannot negotiate with you.
Above 60, you are cheap to integrate with and show up naturally in agent-assembled answers. Above 85, your site behaves like a first-class agent citizen — an agent can discover your tools, authenticate, and call them without writing any custom glue.
What I built on this site
Since I ship blog content and tools from this domain, I treated agent-readiness as a deliverable rather than a nice-to-have. The public, agent-ready surface looks like this:
`/robots.txt` with explicit AI bot rules and a single `Sitemap:` pointer
`/sitemap.xml` across 9 locales and 500+ blog posts with hreflang
`/llms.txt` with a structured site description for LLM crawlers
Markdown content negotiation on every page — send `Accept: text/markdown` and you get clean `text/markdown` back instead of HTML
`/.well-known/openapi.json`, `/.well-known/api-catalog`, `/.well-known/oauth-authorization-server`, and `/.well-known/oauth-protected-resource` metadata so agents can bind to the API without scraping
A first-party MCP server with 27 tools across blog CRUD, AI content generation, SEO analysis, sitemap health, YouTube research, Brave search and translation
A multi-agent content pipeline that reads Search Console data and writes new posts end-to-end
The result: a 100/100 score on isitagentready.com.
The MCP server is the part most people skip. It is what turns a website into an interface — not just a collection of pages.
Prioritizing the work
If you are starting from zero, the order that gives you the most score per hour is:
Fix robots.txt and sitemap — cheap, high-weight, caught by every scanner
Publish llms.txt — one static file, huge signal
Add `.well-known/openapi.json` if you have any public API at all
Stand up `.well-known/mcp` and expose even one tool — the step-change moment
Offer Markdown content — either via negotiation or a dedicated route
Declare bot policy — explicit entries per AI user-agent
Commerce protocols — only if agents could plausibly buy something from you
The first four steps are mostly configuration. The MCP step is where you start designing your site as a product that agents can use.
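Of the configuration steps, llms.txt is the one people have usually not seen before. A minimal file following the llmstxt.org proposal (site name as an H1, a blockquote summary, then sections of annotated links) might look like this, with every name and URL a placeholder:

```markdown
# Example Site

> Developer blog and tools from Example, Inc. Posts cover web
> infrastructure and AI agent integration; all tools are free to use.

## Posts

- [Agent-ready checklist](https://example.com/posts/agent-ready.md): the full checklist
- [MCP in practice](https://example.com/posts/mcp.md): building a first-party MCP server

## Optional

- [Changelog](https://example.com/changelog.md)
```

Linking to the `.md` versions of pages, as above, pairs naturally with the Markdown content negotiation from category 2.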
The uncomfortable part
Being agent-ready is not an SEO update. It is an architectural posture. You are choosing whether the next generation of traffic treats your site as raw material or as an interface.
Most sites will stay raw material. That is fine — they will get scraped, summarized, and cited without attribution. The agent-ready sites that publish their protocols, expose their tools, and speak the languages agents already know will be the ones agents prefer to call directly.
Scan your domain first to see where you stand. Then pick the highest-weight agent-ready item you are currently failing, and ship that this week.
FAQ
What is an agent-ready website?
An agent-ready website publishes the metadata, protocols, and structured content that AI agents need to discover, read, and interact with it directly. That usually means a valid robots.txt with AI bot rules, an XML sitemap, an llms.txt summary, .well-known endpoints for MCP and OpenAPI, and Markdown versions of canonical pages via content negotiation or dedicated routes.
How do I check if my website is agent-ready?
The fastest way is to run an automated scanner that actually fetches your files over HTTP. Tools like isitagentready.com grade your domain across discovery, content accessibility, protocol discovery, bot access controls, and agent commerce in about 30 seconds. Manual self-assessment is unreliable because the tests depend on real headers and status codes.
Which agent-ready feature should I add first?
Start with the cheapest, highest-weight items: a clean robots.txt with an explicit Sitemap line and AI bot policy, an up-to-date sitemap.xml, and an llms.txt file. Then add .well-known/openapi.json if you expose any public API, and stand up .well-known/mcp to let agents bind to your tools. That sequence moves most sites from zero to a solid agent-ready score in a day.
What is llms.txt?
llms.txt is a plain-text or Markdown file served at the root of your domain that gives LLM crawlers a structured summary of your site: what it is, who runs it, the main sections, and which pages matter most. It plays the same role for language models that sitemap.xml plays for search engines — a cheap, discoverable map that lets an agent decide where to spend its attention.
What is MCP (Model Context Protocol)?
MCP is an open protocol from Anthropic that lets AI agents bind to external tools and data sources through a standard interface. When a website publishes a /.well-known/mcp descriptor, compatible agents can discover its tools, authenticate, and call them without bespoke integration code. It is the closest thing the web has to a universal API contract for AI agents.
What is the difference between robots.txt and llms.txt?
robots.txt tells crawlers which URLs they are allowed to fetch. llms.txt tells LLMs what your site is about and which pages are worth reading. They solve adjacent but different problems: robots.txt is about access, llms.txt is about orientation. An agent-ready site ships both — robots.txt with explicit AI bot rules, and llms.txt with a clean structured summary.
Does Cloudflare make your website agent-ready?
Cloudflare covers a large slice of the checklist. AI Audit reports which AI bots hit your site, AI Crawl Control provides per-bot allow/deny toggles, and AI Labyrinth traps misbehaving scrapers. It does not generate your llms.txt or publish your .well-known/mcp descriptor for you, so you still need to add those yourself — but bot access controls and crawler analytics are handled at the edge.