# Executive Summary

Making a website *agent-friendly* means exposing content and functionality in predictable, machine-readable ways.  Drawing on UAIX’s framework, we categorize agent capabilities into **Perceive, Navigate, Act, Communicate, Authenticate/Authorize,** and **Observe/State**.  For each, we map best practices:

- **Perceive:** Use **semantic HTML** and ARIA so agents can “read” content.  Provide alternative text (`alt`), clear headings, and explicit structure.  Embed **structured data** (e.g. JSON-LD with [schema.org](https://schema.org/) vocabularies) so agents immediately understand page meaning.  
- **Navigate:** Offer explicit links and indexes.  Include `<a>` anchors, a sitemap (`sitemap.xml`, advertised through a `Sitemap:` line in `robots.txt`), and even a UAIX-style `llms.txt` file with agent hints.  Avoid hidden navigation or heavy JavaScript that blocks crawling.  
- **Act:** Expose actions through standard web endpoints.  Use HTML `<form>`s or REST/GraphQL APIs instead of obscure scripts.  Follow HTTP conventions (safe GETs, idempotent PUT/DELETE, POST for non-idempotent writes).  Respond with proper status codes and headers (e.g. `429 Too Many Requests` for rate limiting).  Enable CORS (`Access-Control-Allow-Origin`) so cross-origin clients can call your APIs.  
- **Communicate:** Provide agent protocols and endpoints.  For example, publish machine-readable guides or chat interfaces (e.g. WebSockets or SSE for real-time data).  Document expected message formats (JSON schemas, OpenAPI specs).  Support cross-agent messaging (UAIX’s Agent-to-Agent model) by exposing tokenized endpoints or webhooks.  
- **Authenticate/Authorize:** Allow non-interactive logins.  Use token-based schemes (OAuth client-credentials, API keys, JWTs) so agents can obtain limited-scope access without a human login.  For instance, the OAuth2 “client credentials” grant lets a service account fetch a bearer token.  Support **delegated tokens** (on-behalf-of flows) for acting as users when needed.  Also offer simpler options (API keys or signed tokens) for trusted bots.  
- **Observe/State:** Make site health and state visible.  Provide status and metrics endpoints (`/status`, `/health`, JSON dashboards).  Emit logs or support webhooks/events that agents can consume.  Use standardized formats for error payloads and results so clients can parse them automatically.

Across these dimensions, the key is **openness and predictability**: use well-known HTML elements and attributes (e.g. `<h1>…</h1>`, `<img alt="…">`, ARIA `role`/`aria-*`), standard HTTP headers (CORS, content type, `Link` for pagination), and documented endpoints (REST/GraphQL).  This ensures lower-capability agents (UAIX L0–L2) can at least retrieve basic data, while higher-level agents (L3–L5) can authenticate and perform complex workflows.  We summarize these best practices in the checklist and mapping table below.

## UAIX Capability Mapping

UAIX defines a **capability ladder** (L0–L5) that aligns roughly with our categories. For example: at L0 (URL-only), an agent can only `GET` public URLs; at L1 it parses JSON from simple endpoints; at L2 it reads schemas and uses `POST` JSON; higher levels add auth and state.  Table 1 maps each capability to concrete web features.

| UAIX Capability      | HTML/CSS/JS Patterns & Markup                               | HTTP Headers / Endpoints / Examples                               |
|----------------------|-------------------------------------------------------------|-------------------------------------------------------------------|
| **Perceive**         | Use semantic elements (`<article>`, `<nav>`, `<h1>…<h6>`, `<table>`, etc.) and ARIA roles to expose structure and meaning.  Provide `alt` text on images, labels on forms, transcripts on media.  Embed JSON-LD/Microdata with schema.org vocabularies so agents can extract facts.  Avoid purely canvas/SVG/Flash interfaces. | Support content negotiation (`Accept` header) and language variants (`Accept-Language` → `Vary: Accept-Language`).  Offer static formats (HTML) and also machine-oriented formats (JSON, CSV, XML).  Include a `llms.txt` (UAIX) or `robots.txt` at site root.  Indicate robots policy via `<meta name="robots" content="noindex">` or `X-Robots-Tag` header if needed. |
| **Navigate**         | Use meaningful `<a href="…">` links (not `onclick`), and include navigation landmarks (`<nav>`, `<ul>` for menus).  Include a `<link rel="next">`/`<link rel="prev">` in HTML or `Link` headers for pagination.  Provide a `sitemap.xml` or index page listing all resources.  Use breadcrumbs and content sections.  Ensure dynamic site navigation degrades gracefully (e.g. a `<noscript>` fallback in the initial HTML, or pre-rendered content). | Expose `sitemap.xml` (with `<url>` entries) and accept range requests for large lists.  Provide HTTP `Link` header for pagination or related resources.  Use standard caching headers (`Cache-Control`, `ETag`, `Last-Modified`) so agents can poll efficiently.  Publish your crawl rules in `robots.txt` (place it at `/robots.txt` as UTF-8 text). |
| **Act**              | Expose forms or APIs for actions.  Use HTML `<form method="POST">` or JavaScript `fetch()` to call REST endpoints.  If using JS, ensure a non-JS fallback (progressive enhancement).  Mark up interactive widgets with ARIA (`role="button"`, `aria-pressed`, etc.) so state changes are visible to assistive crawlers.  Use standard HTTP methods: GET for reads, POST/PUT/PATCH for writes.  Include HTML `<form action>` fields and JSON body schemas in API docs. | Follow HTTP best practices: safe methods (GET) don’t change state; idempotent methods (PUT/DELETE) can be retried; use `POST` only for non-idempotent writes.  Return clear status codes (200 OK, 201 Created, 404 Not Found, etc.) with JSON/XML bodies.  For rate limiting, return `429 Too Many Requests` with a `Retry-After` header.  Handle CORS so agents on other domains can call APIs: e.g. `Access-Control-Allow-Origin: *` or specific domains and support pre-flight `OPTIONS` requests.  Document your API (OpenAPI/GraphQL schema) so agents know how to call it. |
| **Communicate**      | If agents must exchange messages (e.g. chat), provide a machine-readable interface (e.g. WebSocket or JSON over HTTP).  Use Webhooks or publish/subscribe (Pub/Sub) endpoints.  Label message formats (JSON schema, HTML5 `<dialog>`, Atom/RSS feeds) so programs can parse them.  For interoperability, implement OAuth2/OpenID endpoints to allow agent delegation. | Provide network interfaces for push events (Server-Sent Events `/events`, WebSocket URLs, or webhook callbacks).  For custom protocols, use HTTP verbs (e.g. `POST /incoming-message`).  Tag messages with JSON-LD context or XMPP/Jabber schemas if relevant.  Include cross-agent discovery endpoints (UAIX’s “agent access manifest” or `.well-known/uaix-agent-access.json`). |
| **Authenticate/Authorize** | Support loginless auth.  Avoid requiring human-only CAPTCHA or SSO flows.  Provide OAuth2 flows (authorization code for user agents, client credentials for machines).  Allow API keys or JWT-based tokens as alternatives.  Clearly document the auth endpoint (e.g. `/oauth/token`).  Use standardized HTTP headers (e.g. `Authorization: Bearer <token>`) and schemas (RFC 6750 Bearer tokens). | Implement OAuth2 with TLS (RFC 6749) so machines can obtain tokens.  For example, the **Client Credentials** flow lets a service obtain a bearer token by sending `POST /oauth/token` with `grant_type=client_credentials` in the body and `Authorization: Basic <base64(client_id:client_secret)>`.  Support `WWW-Authenticate` challenges or `X-API-Key` headers for simpler keys.  Use secure, short-lived tokens (JWTs or opaque) and scopes to limit access.  Offer a “delegated token” (OIDC / JWT `sub` delegation) for cross-service calls.  If using cookies, have the login response issue them via `Set-Cookie` so agents can store and resend them.  Provide clear refresh-token or API-key rotation processes. |
| **Observe/State**    | Expose site health and metrics.  Provide a `/health` or `/status` endpoint that returns JSON status (uptime, version, component status).  Include events or logs in a parseable format (e.g. JSON lines, audit logs).  Mark dynamic content updates with ARIA live regions (for screen readers) so agents tracking changes get notifications. | Use a **status endpoint** returning JSON (e.g. `{"status":"ok","services":{"db":"up"}}`).  Support monitoring protocols (Prometheus, OpenAPI health check schemas).  For logging and monitoring, push events to webhooks or use syslog endpoints.  Provide API endpoints for event streams (SSE or WebSocket) with reconnection support.  Return debug info only to authenticated/authorized agents.  Expose **idempotency keys** and deterministic results so agents can retry safely (e.g. use `Idempotency-Key` header for POSTs). |

Table 1: *Mapping of UAIX agent capabilities to concrete web design practices and features.*
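
As one illustration of the content-negotiation guidance in the Perceive row, the following is a minimal server-side sketch of choosing a representation from the `Accept` header. The function names (`parse_accept`, `choose_type`) and the list of offered types are assumptions made for this example; the matching is deliberately simplified (it ranks by `q` value only and ignores the specificity rules a full implementation would apply).

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable weight as "not acceptable"
        ranges.append((media_type, q))
    # sorted() is stable, so equal-q entries keep the client's order
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def choose_type(accept_header, offered=("text/html", "application/json")):
    """Pick the offered type the client prefers; None suggests a 406 response."""
    if not accept_header:
        return offered[0]  # no preference stated: serve the default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in offered:
            exact_or_any = media_type in (candidate, "*/*")
            family = media_type.endswith("/*") and candidate.startswith(media_type[:-1])
            if exact_or_any or family:
                return candidate
    return None


if __name__ == "__main__":
    print(choose_type("application/json"))
    print(choose_type("text/html,application/json;q=0.9"))
```

Whatever type is chosen, the response would carry it in `Content-Type` together with `Vary: Accept`, so caches keep the HTML and JSON variants apart.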

## HTML/CSS/JS Patterns

- **Semantic Markup:** Always use meaningful tags. E.g. `<h1>Title</h1>`, `<article>`, `<section>`, `<nav>`, etc. Assistive and search technologies (and agents) leverage these: *“search engines give more importance to keywords inside headings, links, etc.”*.  Use `<label>` with `for=` and `alt` on images, ensure color/text contrast (for OCR), and avoid conveying information by color or image alone.
- **ARIA:** For dynamic widgets (dropdowns, tabs, progress bars), add ARIA roles and states so bots know the semantics. ARIA “improves the accessibility and interoperability of web content”.  E.g. `<button aria-pressed="false">Show</button>`.  These roles make widgets readable via accessibility APIs.
- **Progressive Enhancement:** Build features in plain HTML first, then layer JS. Agents often can’t run JavaScript fully. For example, if using a JS grid, also include a fallback HTML table or a JSON endpoint for the same data.
- **CSS/JS Hints:** Avoid techniques that hide content from text-based agents (e.g. the deprecated `<meta name="fragment">` AJAX-crawling scheme, excessive canvas use, or obfuscating scripts). If using a SPA framework, ensure server-side rendering or static snapshots exist.
- **Structured Data:** Embed JSON-LD or Microdata so machines can parse content without in-page heuristics. Google recommends JSON-LD where possible and notes: *“Google uses structured data that it finds on the web to understand the content of the page…”*.  For example:
  
  ```html
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example Title",
    "author": {"@type":"Person","name":"Alice"},
    "image": "https://example.com/image.png",
    "datePublished": "2026-06-21"
  }
  </script>
  ```
  
  This lets an agent know exactly what the page is about.  Schema.org’s community vocabularies are widely used by Google, Microsoft, etc. to enable rich features.
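
To show the agent's side of this, here is a minimal sketch, using only Python's standard library, of how a client might pull JSON-LD blocks out of a page. The names `JsonLdExtractor` and `extract_jsonld` and the sample page are illustrative assumptions, not part of any particular crawler.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of each <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip a malformed block rather than abort the page


def extract_jsonld(html):
    """Return a list of decoded JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.items


SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example Title",
 "author": {"@type": "Person", "name": "Alice"}}
</script>
</head><body><h1>Example Title</h1></body></html>
"""

if __name__ == "__main__":
    for item in extract_jsonld(SAMPLE_PAGE):
        print(item["@type"], "-", item["headline"])
```

The point of the sketch is that the agent needs no layout heuristics: the type, headline, and author arrive as plain keys.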

## APIs and Technical Rules

- **Content Negotiation:** Support `Accept` headers.  If your site can serve HTML **and** JSON, respond with the appropriate `Content-Type` based on `Accept: application/json` or `text/html`.  Include a `Vary: Accept` header.  As HTTP/1.1 says, servers “often have different ways of representing information… [so] HTTP provides mechanisms for content negotiation”.
- **CORS:** If agents call APIs from a different origin (e.g. a browser agent on domain A calling domain B), enable CORS.  Add `Access-Control-Allow-Origin: *` (or specific origins) in responses; note that the wildcard cannot be combined with credentialed requests.  For requests that are not “simple” (methods such as PUT or DELETE, or headers such as `Authorization` or a JSON `Content-Type`), support preflight: respond to `OPTIONS` with `Access-Control-Allow-Methods: GET,POST,...` and `Access-Control-Allow-Headers: Content-Type,Authorization`, etc.  Without this configuration, browsers will block cross-origin requests.
- **Rate Limiting:** Protect endpoints, but communicate limits. Use `429 Too Many Requests` (RFC 6585) when a client exceeds a quota. Include `Retry-After: <seconds>` to tell the agent how long to wait. This lets a polite agent back off properly.
- **Robots & Directives:** Put a `/robots.txt` (UTF-8 text) at your root to advise crawlers.  Compliant crawlers are expected to obey its rules once they fetch it successfully, but it is advisory and not an access control.  Also use `<meta name="robots" content="nofollow,noindex">` on specific pages if you want them kept out of indexes (this only influences indexing, not crawling, and a crawler must be allowed to fetch the page in order to see the tag).  Remember: robots.txt governs crawling, meta tags govern indexing.
- **Sitemaps:** Providing a `sitemap.xml` helps agents discover pages systematically.  Per the Sitemap protocol: *“The Sitemap protocol enables you to provide details about your pages to search engines, and we encourage its use…”*.  Include `<lastmod>`, `<changefreq>`, and `<priority>` hints if possible.  An alternative (or supplement) is an `llms.txt` (as UAIX suggests) listing agent-specific URLs and guidance.
- **GraphQL:** If using GraphQL, document the schema thoroughly. Agents will POST JSON queries to the GraphQL endpoint; ensure CORS and auth apply. Also consider having an alternative REST endpoint, as some crawlers may not introspect GraphQL. Use query complexity limits to avoid abuse.
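
The rate-limiting rule above implies matching behavior on the client. The sketch below shows how a polite agent might turn a `429` response into a wait time. `Retry-After` can carry either a number of seconds or an HTTP date, so both forms are handled. The function name `retry_delay`, the default of 1 second, the 300-second cap, and the choice to also honor `503` are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(status, headers, now=None, default=1.0, cap=300.0):
    """Seconds to wait before retrying, or None if no retry is indicated."""
    if status not in (429, 503):
        return None
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after", "").strip()
    if not value:
        return default
    if value.isdigit():
        delay = float(value)  # delta-seconds form, e.g. "30"
    else:
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return default  # unparseable hint: fall back to the default
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        delay = (when - now).total_seconds()
    return max(0.0, min(delay, cap))


if __name__ == "__main__":
    print(retry_delay(429, {"Retry-After": "30"}))
```

Capping the delay protects the agent from a misconfigured server that asks for an unreasonably long wait; the right cap depends on the workload.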

## Authentication and Agent-Friendly Auth

Most web sites expect human logins, but agents need programmatic access. **Avoid interactive flows for bots.** Instead:

- **OAuth2**: Implement OAuth2 (RFC 6749). For example, a *service account* can use the **Client Credentials** flow to get a token without user involvement. The RFC notes: *“Client credentials are used as an authorization grant typically when the client is acting on its own behalf (the client is also the resource owner)”*. In practice: the agent authenticates with its client ID and secret, does `POST /oauth/token` with `grant_type=client_credentials`, and receives JSON such as `{"access_token": "…", "token_type": "Bearer", "expires_in": …}`. Then it includes `Authorization: Bearer <token>` in future requests.  
- **API Keys/Tokens:** For simpler cases, issue long-lived API keys or signed tokens. E.g. require a header `X-Api-Key: <token>`, or use JWTs in `Authorization: Bearer`. Ensure scopes/permissions are tied to these keys so a leaked key has limited reach. Rotate keys regularly.  
- **Session Cookies:** If using cookies/sessions (common in web apps), the server must return the session cookie on login and agents must store and resend it (for cross-site use, set `SameSite=None; Secure`). However, cookie-based auth is brittle for automation and is discouraged for pure API access.  
- **Delegated Tokens:** Some platforms (like cloud providers) allow one service to impersonate a user via delegated tokens. Support this so an agent can call APIs *on behalf of a user* without sharing the user’s primary credentials.

Include **descriptions in documentation** of the auth flows (endpoints, required headers, scopes). For example, your **OpenAPI** spec should detail the security scheme (OAuth2 client-credentials, API Key, etc.). This aids automated client generation.
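
The client-credentials exchange described above can be sketched as follows. This only builds the request and interprets a token response; it does not contact a server. The token URL, client ID, secret, and scope are placeholders, and the helper names `build_token_request` and `bearer_header` are assumptions for this example. For brevity the sketch skips the form-encoding of the ID and secret that RFC 6749 specifies before Basic encoding, which matters only when they contain special characters.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # HTTP Basic credentials are base64("client_id:client_secret").
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode("ascii"),
        headers={
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def bearer_header(token_response_body):
    """Turn the JSON token response into an Authorization header dict."""
    token = json.loads(token_response_body)
    if token.get("token_type", "").lower() != "bearer":
        raise ValueError("unexpected token_type")
    return {"Authorization": "Bearer " + token["access_token"]}


if __name__ == "__main__":
    # Placeholder values; a real agent would load these from secure config.
    req = build_token_request(
        "https://auth.example.com/oauth/token", "agent-1", "s3cret", scope="read"
    )
    print(req.get_method(), req.full_url)
    print(bearer_header('{"access_token": "abc", "token_type": "Bearer", "expires_in": 3600}'))
```

Sending the request would be a call to `urllib.request.urlopen(req)` over HTTPS; the secret travels only in the header, never in the URL.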

## Error Handling, Idempotency, Observability

- **Error Responses:** Always return well-formed error messages. Use standard HTTP codes (4xx for client errors, 5xx for server errors) with a JSON body like:
  
  ```json
  { "error": "BadRequest", "message": "Missing parameter 'id'." }
  ```
  
  Include helpful fields (an error code, description, maybe a `retryable: true/false`). This lets an agent programmatically detect and handle errors.
- **Idempotency:** Wherever possible, design endpoints so that repeating a request has no adverse effect (idempotent). For `POST` actions that change state, consider requiring an `Idempotency-Key` header so retries won’t double-run. UAIX suggests having separate “GET-Action” URLs as *explicit* write fallbacks, not implicitly rewriting GETs to POSTs.
- **Observability:** Log agent actions and provide feedback endpoints. For long-running tasks, allow status polling via a `/tasks/<id>` endpoint. Return progress or logs so an agent knows what happened. Expose metrics (e.g. Prometheus format) or logs (via Splunk/ELK) behind an API if needed, so automated agents can monitor system health.
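
To make the idempotency-key idea concrete, here is a minimal in-memory sketch of a server-side handler that replays the stored response when a key is reused. The class name `IdempotentHandler`, the `422` status for a key reused with a different payload, and the error names are assumptions for illustration; a production service would persist keys with an expiry and guard against concurrent requests.

```python
import hashlib
import json


class IdempotentHandler:
    """Replays the stored response when an Idempotency-Key is reused."""

    def __init__(self, action):
        self._action = action  # callable(payload) -> response body
        self._seen = {}        # key -> (payload fingerprint, status, body)

    @staticmethod
    def _fingerprint(payload):
        canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()

    def handle(self, idempotency_key, payload):
        if not idempotency_key:
            return 400, {"error": "BadRequest",
                         "message": "Missing Idempotency-Key header.",
                         "retryable": False}
        fingerprint = self._fingerprint(payload)
        if idempotency_key in self._seen:
            stored_fp, status, body = self._seen[idempotency_key]
            if stored_fp != fingerprint:
                # Same key, different request: refuse instead of guessing.
                return 422, {"error": "IdempotencyKeyReuse",
                             "message": "Key was used with a different payload.",
                             "retryable": False}
            return status, body  # replay; the action does not run again
        body = self._action(payload)
        self._seen[idempotency_key] = (fingerprint, 201, body)
        return 201, body


if __name__ == "__main__":
    orders = []

    def create_order(payload):
        orders.append(payload)
        return {"id": len(orders), "item": payload["item"]}

    handler = IdempotentHandler(create_order)
    print(handler.handle("key-1", {"item": "widget"}))
    print(handler.handle("key-1", {"item": "widget"}))  # replayed, no second order
```

An agent that times out on the first call can resend it with the same key and receive the original result without creating a duplicate.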

## Performance, Pagination, Large Data

Large datasets must be chunked. For example:

- **Pagination:** Always page or stream results. A common practice is Link headers:
  
  ```http
  HTTP/1.1 200 OK
  Content-Type: application/json
  Link: <https://api.example.com/items?page=2>; rel="next"
  
  { "items": [ ... ], "page": 1, "pageSize": 100 }
  ```

  Or use query params: `/items?limit=100&offset=100`. Make sure to document these patterns. If using GraphQL, use cursor-based pagination (e.g. `first`/`after` arguments).
- **Bulk Endpoints:** For very large syncs, consider endpoints that return bulk archives (e.g. CSV/JSON dumps) or support **HTTP Range** requests for partial content.
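- **Caching:** Leverage `ETag` or `Last-Modified` so agents can send conditional requests (`If-None-Match`, `If-Modified-Since`) and skip re-downloading unchanged data. Support gzip compression for faster transfer: when a request carries `Accept-Encoding: gzip`, respond with `Content-Encoding: gzip`.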
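
The `Link`-header pagination shown above can be followed with a small amount of client code. This is a simplified sketch: the regular expressions cover the common `<url>; rel="name"` form rather than every legal variant of the header, and `fetch` is a hypothetical callable (returning a headers dict and a decoded body) standing in for whatever HTTP client the agent uses.

```python
import re

# Each link looks like: <https://...>; rel="next"
_LINK_PART = re.compile(r'<([^>]+)>([^<]*)')
_REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def parse_link_header(value):
    """Map each rel value in a Link header to its target URL."""
    links = {}
    for url, params in _LINK_PART.findall(value):
        match = _REL_PARAM.search(params)
        if match:
            for rel in match.group(1).split():
                links[rel.lower()] = url
    return links


def iter_pages(fetch, first_url, max_pages=100):
    """Yield response bodies, following rel="next" until it disappears."""
    url = first_url
    for _ in range(max_pages):  # hard stop in case of a pagination loop
        headers, body = fetch(url)
        yield body
        url = parse_link_header(headers.get("Link", "")).get("next")
        if not url:
            break


if __name__ == "__main__":
    # A fake two-page API standing in for real HTTP calls.
    fake_api = {
        "https://api.example.com/items?page=1": (
            {"Link": '<https://api.example.com/items?page=2>; rel="next"'},
            {"items": [1, 2], "page": 1},
        ),
        "https://api.example.com/items?page=2": ({}, {"items": [3], "page": 2}),
    }
    for page in iter_pages(fake_api.__getitem__, "https://api.example.com/items?page=1"):
        print(page)
```

Because the client follows whatever URL the server hands back, the server stays free to change its paging scheme without breaking agents.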

## Security & Privacy Trade-offs

Balancing openness with security is crucial. Some guidelines:

- **Least Privilege:** Don’t expose sensitive data to unauthenticated agents. Use scopes and ACLs. For example, public agents might see only read-only JSON; sensitive actions require tokens.
- **No Secrets in URLs:** Never include API keys or tokens in URLs (they leak via logs). Prefer headers or POST bodies.
- **Rate Limits & Abuse:** Apply stricter rate limits on unauthenticated or anonymous access. Use CAPTCHAs or abuse detection for human forms, but avoid them on agent APIs.
- **Audit Trails:** For write actions by agents, log who/what did it (via API key identity). This supports the “governed multi-agent” level (L4+) where audit evidence is needed.
- **Privacy:** If content is private, don’t use structured data that accidentally exposes it. On mixed public/private pages, separate content or use session-based APIs so agents can’t scrape private info.

## Checklist for Developers

1. **Semantic Content:** Use correct HTML tags (headings, lists, tables, forms) and include `alt`, `title`, labels, ARIA where needed.  
2. **Structured Data:** Embed JSON-LD/Microdata with schema.org definitions for key entities (products, articles, events).  
3. **Sitemaps & Robot Rules:** Publish `/sitemap.xml` and `/robots.txt` (or `llms.txt` UAIX manifest) in UTF-8 text. Ensure no vital content is blocked.  
4. **RESTful Endpoints:** Provide clean URLs for APIs (e.g. `/api/widgets`, `/api/widgets/{id}`), accept and return JSON.  Document query parameters and request/response schema.  
5. **HTTP Standards:** Honor `Accept`/`Content-Type`, use proper status codes, and include necessary headers (CORS headers such as `Access-Control-Allow-Origin`, plus `Link` and `Retry-After`).  Avoid custom non-standard headers unless documented.  
6. **Authentication:** Implement OAuth2 or API key auth for agents. Provide a “service account” or client-credentials path.  Never rely solely on JavaScript-based or interactive logins for API access.  
7. **Idempotency & Error Handling:** Design safe retries (use idempotent methods such as PUT/DELETE, or idempotency keys for POST). Return structured error JSON (including codes and retry hints). Use 4xx for client errors, 5xx for server errors.  
8. **Performance:** Limit page size (e.g. `limit=100`), support pagination, and allow `gzip`. Test under realistic load and profile common queries.  
9. **Observability:** Expose a `/status` or `/health` endpoint.  Enable logs and metrics that agents or dashboards can query. Document error messages and logs format.  
10. **Security:** Apply least-privilege for tokens. Do not reveal private data to anonymous agents. Use HTTPS everywhere and secure cookie flags.  Consider CAPTCHAs only on human-facing forms, not API endpoints.

These items are roughly prioritized: start by making content parseable (items 1–4), then ensure APIs and auth (5–7), then tune performance and security (8–10).

## Testing Plan and Metrics

**Test Plan:** Simulate agent workflows:

- **Crawl & Parse:** Write a bot that fetches `/robots.txt`, `/sitemap.xml`, and all listed URLs. It should record which pages succeed (HTTP 200) and how many warnings (missing alt text, empty titles). Use linters (e.g. [axe](https://www.deque.com/axe/) or custom scripts) to check semantic HTML and ARIA usage across pages.  
- **API Conformance:** Test all API endpoints with a variety of inputs. Verify each returns correct status codes and JSON schema. Include auth scenarios: ensure protected endpoints reject without tokens.  
- **Authentication:** Test OAuth flows end-to-end (token request, token refresh, API call). Measure time to obtain token and refresh. Ensure expired tokens yield 401.  
- **Error Handling:** Intentionally send bad requests to each endpoint. Check that errors return well-formed JSON with `error` fields. Ensure no unexpected crashes.  
- **Load & Rate:** Use a load tester to send concurrent requests and confirm that rate-limiting triggers (`429`) appropriately. Check that large paginated queries behave as expected.
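
The crawl-and-parse step can start from a check like the one below, which reads a `robots.txt` and a `sitemap.xml` and reports which listed URLs the site's own rules would block. It is a sketch that works on text already fetched; the user-agent string `example-agent`, the function name, and the sample files are assumptions for illustration.

```python
import urllib.robotparser
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def allowed_sitemap_urls(robots_txt, sitemap_xml, user_agent="example-agent"):
    """Split sitemap URLs into (allowed, blocked) under the robots.txt rules."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    root = ET.fromstring(sitemap_xml.strip())
    urls = [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]
    allowed = [u for u in urls if rules.can_fetch(user_agent, u)]
    blocked = [u for u in urls if not rules.can_fetch(user_agent, u)]
    return allowed, blocked


SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""

SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
  <url><loc>https://example.com/private/report</loc></url>
</urlset>
"""

if __name__ == "__main__":
    ok, blocked = allowed_sitemap_urls(SAMPLE_ROBOTS, SAMPLE_SITEMAP)
    print("allowed:", ok)
    print("blocked:", blocked)
```

A sitemap entry that the robots rules disallow is usually a configuration mistake, so a non-empty blocked list is worth reporting as a test failure.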

**Metrics:** Evaluate:

- **Reachability:** % of pages discovered via sitemap/robots vs total pages. (Goal: ~100%).
- **Structured Data Coverage:** % of pages with valid JSON-LD or microdata tags. (Goal: high for key pages, ideally >90%).  
- **API Success Rate:** Fraction of automated API calls that return expected results vs errors.  
- **Performance:** Average and p95 response time for key endpoints (target < 500ms under load).  
- **Error Rate:** Count of 4xx/5xx per 1K requests in production.  
- **Auth Failures:** Number of unauthorized attempts vs successes, to gauge any misconfiguration.  
- **Robot Fetch Rate:** How often bots (Googlebot, etc.) successfully crawl the site, from server logs.
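
Several of these metrics reduce to simple arithmetic over request logs. The sketch below computes the mean latency, the p95 latency (using the nearest-rank definition, one of several in common use), and the error count per 1K requests. The `(status_code, latency_ms)` record shape and the function names are assumptions for this example, and the sample data is synthetic.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests):
    """requests: list of (status_code, latency_ms) tuples."""
    latencies = [ms for _, ms in requests]
    errors = sum(1 for status, _ in requests if status >= 400)
    return {
        "mean_ms": sum(latencies) / len(latencies),
        "p95_ms": percentile(latencies, 95),
        "errors_per_1k": 1000 * errors / len(requests),
    }


if __name__ == "__main__":
    # Synthetic sample: 18 successes, then one 404 and one 500.
    sample = [(200, 10 * i) for i in range(1, 19)] + [(404, 190), (500, 200)]
    print(summarize(sample))
```

For this 20-request sample the nearest-rank p95 is the 19th-smallest latency, which shows why p95 tracks the slow tail rather than the average.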

Use these to iteratively improve the site. For example, low structured-data coverage or missing `alt` attributes indicate needed fixes. High error rates or slow APIs should be addressed with caching, indexing, or refactoring.

By following the above guidelines and continuously testing against these metrics, developers can ensure their websites are robustly **agent-usable**. This makes them accessible not only to humans, but to automated assistants, crawlers, and AI systems – aligning with UAIX’s vision of progressive agent support.

