Agent Surface & Observability

Agentic defensibility — the flip side of being AI-accessible

Accessibility has a cost

Everything else in this framework pushes you to open up: structured data, llms.txt, answer units, an MCP endpoint agents can call. That openness is the right strategy — and it creates new surface to defend. Agentic defensibility (criterion AS2 in the Agent Surface & Observability pillar) is how you stay open without being exploited.

The four threats

  1. Scraping and crawl cost. AI crawlers are relentless and can hammer dynamic pages, inflating your infrastructure bill and degrading performance for real users.
  2. Prompt injection. Text on your pages — or in user-generated content — can contain instructions aimed at any agent that reads it ("ignore previous instructions and…"). Your content can be weaponised against the very assistants you want to serve you.
  3. Data-protection exposure. Any feature where AI processes personal data raises GDPR obligations, and the EU AI Act adds a further layer of duties depending on use.
  4. Endpoint misuse. A callable surface is a callable surface for everyone — including agents acting in bad faith, scripting abuse, or probing for weaknesses.

The controls

You don't need a fortress; you need proportionate, layered defence:

  • Rate limiting on pages and especially on any API/MCP surface — per-IP and per-token where you can.
  • Bot detection and honeypots — hidden endpoints or fields that only automated scrapers will touch, flagging bad actors without affecting humans.
  • Crawler directivesrobots.txt rules and noai/noimageai signals to state clearly what's allowed. Reputable crawlers honour them; the directives also document your intent.
  • Prompt-injection hygiene — treat user-generated content as untrusted, sanitise it, and don't let it flow unfiltered into anything an agent might execute.
  • Data governance — a documented GDPR/AI-Act assessment for every feature where AI touches personal data: lawful basis, retention, and disclosure.

Open, but defended

The mistake at both extremes is easy to make. Lock everything down and you become invisible to the assistants that now mediate discovery. Leave everything open and you pay for it in cost, abuse, and compliance risk. Defensibility is the discipline of the middle: welcome the agents you want, on your terms, with the controls to enforce those terms. Pair it with observability and you can both see how agents use you and decide how they're allowed to.

What to do this week

  1. Put basic rate limiting on your most expensive dynamic pages and any API surface.
  2. Audit user-generated content paths for prompt-injection exposure and sanitise them.
  3. Set explicit robots.txt and noai directives so your intent is documented.
  4. List every feature where AI touches personal data and start a GDPR/AI-Act note for each.