# How Coding Agents Actually Decide Which SDK to Use

From training data to web search to tool calls — the 4-layer stack that determines whether Claude Code recommends your SDK.

**Published:** 2026-04-28
**Updated:** 2026-05-10
**Category:** Guides
**Author:** Jun Liang Lee
**Read time:** 10 min

When a developer asks Claude Code to "add Stripe payments to my app," the agent doesn't just pull from memory. It runs a multi-step decision process that most API companies aren't optimizing for.

Here's how Claude Code, Codex, and Cursor actually decide which SDK to recommend.

## TL;DR: The 4-Layer Decision Stack

| Layer                    | What Happens                             | What Influences It                                     |
| ------------------------ | ---------------------------------------- | ------------------------------------------------------ |
| **1. Training Data**     | Agent's base knowledge from pre-training | Historical documentation, GitHub repos, Stack Overflow |
| **2. Web Search**        | Real-time retrieval (~20% of prompts)    | SEO, robots.txt, content freshness                     |
| **3. Context Retrieval** | llms.txt, MCP servers, skills            | Machine-readable API descriptions                      |
| **4. Tool Execution**    | Actually calling your API                | Error rates, response quality, DX                      |

Most devtool teams optimize for Layer 1 (hoping their docs were in training data) and ignore Layers 2-4. That's why smaller, better-optimized libraries can beat dominant players in agent recommendations.

## Layer 1: Training Data (The Foundation)

Every coding agent is built on a large language model with a knowledge cutoff. Claude Code runs on Anthropic's Claude models, Codex runs on OpenAI's GPT models, and Cursor lets users pick from several providers' models. That base model learned about your API from:

- **Documentation sites** crawled before the cutoff
- **GitHub repositories** (code, READMEs, issues)
- **Stack Overflow** questions and answers
- **Blog posts and tutorials** mentioning your API
- **npm/PyPI package metadata**

### What This Means for You

If your library existed and was well-documented before the model's training cutoff, you have a baseline advantage. The agent "knows" your API exists and has seen usage patterns.

**But training data has limits:**

- Knowledge is frozen at the cutoff date (often 12-18 months old)
- Popular libraries get disproportionate representation
- Newer features or breaking changes aren't reflected
- The agent may recommend deprecated patterns

This is why Layer 1 alone isn't enough. A library that shipped last month won't exist in the agent's training data at all — but it can still get recommended through the other layers.

## Layer 2: Web Search (Real-Time Discovery)

According to [Vercel's AEO tracking research](https://vercel.com/blog/how-we-built-aeo-tracking-for-coding-agents), coding agents perform web searches in roughly **20% of prompts**.

When does an agent search?

- **Explicit requests**: "Find a library for X"
- **Uncertainty**: Agent isn't confident about current state
- **Freshness signals**: User mentions recent dates or versions
- **Comparison queries**: "What's the best X vs Y"

### What the Agent Searches For

When Claude Code searches the web, it's looking for:

1. **Official documentation** — especially quick starts and API references
2. **Comparison content** — "X vs Y" pages rank highly
3. **Recent tutorials** — freshness matters for "best practices" queries
4. **GitHub READMEs** — often the first result for library names

### What Blocks Discovery

Your API won't show up in agent web searches if:

```txt
# robots.txt blocking AI crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

The same applies if your docs are rendered client-side in JavaScript and show nothing useful to a client that doesn't run a browser.
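To check the robots.txt case, Python's standard-library `urllib.robotparser` can evaluate a robots.txt body against specific crawler names. The sketch below is one way to do that. The crawler list is taken from this article, and `docs.example.com` is a placeholder URL, not a real docs site.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the robots.txt examples in this article.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "anthropic-ai"]


def blocked_crawlers(robots_txt: str, url: str, agents=AI_CRAWLERS) -> list[str]:
    """Return the crawlers that this robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# The blocking configuration shown above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

# Placeholder URL; substitute a page from your own docs.
print(blocked_crawlers(ROBOTS_TXT, "https://docs.example.com/quickstart"))
# → ['GPTBot', 'ClaudeBot']
```

`anthropic-ai` is absent from the result because no rule names it and there is no `User-agent: *` fallback, so it is allowed by default.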

### How to Win at Layer 2

| Factor                    | Why It Matters                                 |
| ------------------------- | ---------------------------------------------- |
| **Allow AI crawlers**     | GPTBot, ClaudeBot, anthropic-ai in robots.txt  |
| **Server-side rendering** | Many crawlers don't execute JavaScript         |
| **Comparison pages**      | Directly answer "X vs Y" queries               |
| **Fresh content**         | Recent publish dates signal relevance          |
| **Clear page titles**     | "Acme API Quick Start" beats "Getting Started" |
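One rough way to test the server-side rendering row is to look at the raw HTML a non-JavaScript client receives and check whether a phrase from your docs is already in it. The sketch below uses only the standard library. The two HTML samples and the function names are illustrative assumptions, and a real check would fetch the page first.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that sits outside <script> and <style> tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html: str) -> str:
    """Return the text a client sees without running any JavaScript."""
    parser = _VisibleText()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def looks_server_rendered(raw_html: str, must_contain: str) -> bool:
    """True if a known phrase from the docs is present in the raw HTML."""
    return must_contain.lower() in visible_text(raw_html).lower()


# Illustrative samples: an empty client-side shell vs. a rendered page.
CSR_HTML = '<html><head><title>Docs</title></head><body><div id="root"></div><script src="/app.js"></script></body></html>'
SSR_HTML = "<html><body><h1>Acme API Quick Start</h1><p>POST /v1/charges</p></body></html>"

print(looks_server_rendered(CSR_HTML, "POST /v1/charges"))  # → False
print(looks_server_rendered(SSR_HTML, "POST /v1/charges"))  # → True
```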

## Layer 3: Context Retrieval (Machine-Readable APIs)

This layer is where most devtool teams have zero presence — and where the biggest opportunity exists.

Beyond web search, coding agents can access structured context about your API through:

### llms.txt

llms.txt is a Markdown file served from your domain root that gives AI agents a curated overview of your product:

```txt
# Acme Payments API

> Developer-first payment processing for startups and SaaS.

## When to Use Acme
- Subscription billing with usage-based pricing
- Quick integration (15 minutes to first charge)
- Startups that don't need enterprise compliance yet

## Quick Start
POST /v1/charges with amount, currency, payment_method_id
Auth: Bearer token in Authorization header

## Key Endpoints
- POST /v1/charges - Create payment
- POST /v1/subscriptions - Recurring billing
- GET /v1/customers/{id} - Customer details
```

When an agent encounters a payment-related prompt and finds your llms.txt, it has immediate context about what your API does and when to recommend it.
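Before publishing a file like this, it can help to confirm that it parses into the pieces an agent would look for. The sketch below is a minimal, hand-rolled parser for the layout used in the example above (one `#` title, one `>` summary, `##` sections). The function names and the "title, summary, at least one section" checklist are assumptions for illustration, not part of any official tooling.

```python
def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt file into its title, summary, and H2 sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith(">") and current is None and doc["summary"] is None:
            doc["summary"] = line[1:].strip()
        elif current is not None:
            # Drop a leading bullet marker so entries are plain strings.
            item = line[2:] if line.startswith("- ") else line
            doc["sections"][current].append(item.strip())
    return doc


def missing_parts(doc: dict) -> list[str]:
    """List the parts of the assumed checklist that the file lacks."""
    problems = []
    if not doc["title"]:
        problems.append("title")
    if not doc["summary"]:
        problems.append("summary")
    if not doc["sections"]:
        problems.append("sections")
    return problems


SAMPLE = """\
# Acme Payments API

> Developer-first payment processing for startups and SaaS.

## Quick Start
POST /v1/charges with amount, currency, payment_method_id

## Key Endpoints
- POST /v1/charges - Create payment
- GET /v1/customers/{id} - Customer details
"""

doc = parse_llms_txt(SAMPLE)
print(doc["title"])            # → Acme Payments API
print(list(doc["sections"]))   # → ['Quick Start', 'Key Endpoints']
print(missing_parts(doc))      # → []
```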

### MCP Servers (Model Context Protocol)

MCP is an open standard that lets coding agents such as Claude Code connect directly to your API. When your API has an MCP server:

1. The agent can **discover** it's available
2. The agent can **call** your endpoints directly
3. The agent can **verify** the response works

This changes the recommendation calculus. Instead of "I think Acme might work for this," the agent can say "I connected to Acme and created a test charge successfully."
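Under the hood, MCP messages are JSON-RPC 2.0: discovery maps to a `tools/list` request and a call maps to `tools/call`. The sketch below only builds those messages and inspects a result. It omits the transport and the initial `initialize` handshake, and the `create_charge` tool with its arguments is a hypothetical example rather than a real Acme or Stripe tool.

```python
import json


def jsonrpc_request(request_id: int, method: str, params=None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request(request_id: int) -> dict:
    """Discover: ask the server which tools it exposes."""
    return jsonrpc_request(request_id, "tools/list")


def call_tool_request(request_id: int, name: str, arguments: dict) -> dict:
    """Call: invoke one tool by name with JSON arguments."""
    return jsonrpc_request(request_id, "tools/call",
                           {"name": name, "arguments": arguments})


def call_succeeded(response: dict) -> bool:
    """Verify: no protocol error and the tool did not flag isError."""
    result = response.get("result")
    if "error" in response or result is None:
        return False
    return not result.get("isError", False)


# Hypothetical tool name and arguments, for illustration only.
request = call_tool_request(2, "create_charge", {"amount": 500, "currency": "usd"})
print(json.dumps(request))

# A response shaped like a successful tools/call result.
response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"content": [{"type": "text", "text": "charge created"}], "isError": False},
}
print(call_succeeded(response))  # → True
```

In practice an MCP SDK handles this framing, so the sketch is only meant to show what "discover, call, verify" looks like on the wire.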

The [MCP servers repository](https://github.com/modelcontextprotocol/servers) has 85,000+ GitHub stars and is still growing. APIs with MCP presence get a direct line into the coding agent ecosystem.

### Claude Skills

Skills are packaged workflows that users install into Claude Code. A skill for your API might include:

- Preferred configuration patterns
- Common workflows pre-built
- Error handling best practices

When a user has your skill installed, Claude Code is more likely to choose your API for relevant tasks because the context is already loaded.

## Layer 4: Tool Execution (The Proof)

This is where recommendations become reality — and where many APIs fail.

When a coding agent recommends your SDK, it typically:

1. **Writes import/install code**: `npm install your-package`
2. **Generates usage code**: API calls, config, error handling
3. **Sometimes executes it**: Runs the code to verify it works

### What Can Go Wrong

| Failure Mode       | What Happens                             | Agent Response               |
| ------------------ | ---------------------------------------- | ---------------------------- |
| **Install fails**  | Package not found or dependency conflict | Recommends alternative       |
| **Import fails**   | Wrong module path or missing export      | Suggests competitor          |
| **API call fails** | Bad auth, wrong endpoint, unclear error  | Switches recommendation      |
| **Unclear errors** | Generic 500 or cryptic message           | Can't troubleshoot, moves on |

### Why This Matters More Than You Think

Vercel's research found that agent recommendations have a different shape than chat model responses:

> "When a coding agent suggests a tool, it tends to produce working code with that tool, like an import statement, a config file, or a deployment script. The recommendation is embedded in the output, not just mentioned in prose."

This means the agent isn't just saying "try Acme" — it's writing `import Acme from 'acme-sdk'` and potentially running it. If that fails, the recommendation fails.

### How to Win at Layer 4

1. **Clear, working quick start**: The exact code the agent will generate
2. **Typed responses**: Help the agent understand your API shape
3. **Descriptive errors**: "Invalid API key" beats "Error 401"
4. **Consistent naming**: Don't make the agent guess import paths
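To make the "descriptive errors" point concrete, here is a minimal sketch of an auth check that tells the caller what went wrong and what to try next, instead of returning a bare 401. The `ApiError` fields, the error codes, and the `sk_` key prefix are all assumptions made up for this example. Only the Bearer-token scheme comes from the Acme sample earlier in the article.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ApiError:
    status: int    # HTTP status code
    code: str      # stable, machine-readable identifier
    message: str   # what went wrong
    hint: str      # what the caller should try next


def check_authorization(header: Optional[str]) -> Optional[ApiError]:
    """Return a descriptive error for a bad Authorization header, else None."""
    if not header:
        return ApiError(401, "missing_authorization",
                        "No Authorization header was sent.",
                        "Send 'Authorization: Bearer <api_key>'.")
    scheme, _, token = header.partition(" ")
    if scheme != "Bearer" or not token:
        return ApiError(401, "unsupported_auth_scheme",
                        f"Unsupported auth scheme '{scheme}'.",
                        "Use the Bearer scheme: 'Authorization: Bearer <api_key>'.")
    if not token.startswith("sk_"):  # hypothetical key prefix
        return ApiError(401, "invalid_api_key",
                        "Invalid API key.",
                        "Secret keys start with 'sk_'; check which key you copied.")
    return None


def error_body(error: ApiError) -> dict:
    """Shape the error as a JSON-serializable response body."""
    return {"error": asdict(error)}


print(error_body(check_authorization("Basic abc123")))
```

An agent that receives `unsupported_auth_scheme` plus the hint can fix the header on the next attempt, whereas a generic 401 leaves it guessing.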

## The Full Picture: How a Recommendation Happens

Let's trace a representative example. A developer asks Claude Code:

> "Add a payment form to my Next.js app"

**Layer 1 (Training)**: Claude knows about Stripe, PayPal, Square, and other payment APIs from training data. It has baseline familiarity with their SDKs.

**Layer 2 (Search)**: Claude searches "payment API Next.js 2026" and finds comparison articles, official docs, and tutorials. Stripe's SEO is strong; smaller players may not appear.

**Layer 3 (Context)**: Claude checks for llms.txt files and MCP servers. If Acme Payments has an MCP server and Stripe doesn't, Acme gets a signal boost.

**Layer 4 (Execution)**: Claude generates code using the selected API. If it can verify the code works (via MCP or execution), confidence increases.

**Final Recommendation**: Claude writes working Stripe integration code — unless a competitor won on Layers 3-4.

## Why Smaller Libraries Can Win

The 4-layer model explains something counterintuitive: **smaller, newer libraries sometimes beat dominant players in agent recommendations**.

Here's why:

| Layer             | Big Player Position                        | Small Player Opportunity    |
| ----------------- | ------------------------------------------ | --------------------------- |
| Training Data     | Advantage: more historical content         | None                        |
| Web Search        | Advantage: better SEO, more backlinks      | Can target specific queries |
| Context Retrieval | Weakness: often missing llms.txt/MCP       | Can ship these quickly      |
| Tool Execution    | Weakness: more edge cases, legacy patterns | Clean API, modern DX        |

A well-optimized small library with llms.txt, an MCP server, and clean error messages can outrank a dominant player that's coasting on training data alone.

## How to Measure Your Position

For each layer:

### Layer 1: Training Data

- **Test**: Ask Claude/GPT (without web search) about your API
- **Check**: Are the responses accurate? Up to date?
- **Metric**: Baseline mention rate in chat models

### Layer 2: Web Search

- **Test**: Search your API name + common use cases
- **Check**: Do your docs appear? What position?
- **Metric**: Search visibility for target queries

### Layer 3: Context Retrieval

- **Test**: Does yourdomain.com/llms.txt exist?
- **Check**: Is your API in MCP registries?
- **Metric**: Presence in agent context systems

### Layer 4: Tool Execution

- **Test**: Ask Claude Code to use your API
- **Check**: Does the generated code work?
- **Metric**: Tool call success rate, error clarity

Sapient is the AEO (AI Engine Optimization) platform built for coding agents — already used by leading developer tool companies in the SF Bay Area. It tracks visibility across 21 AI platforms (10 coding agents, 7 answer engines, 4 models), and the [Devtool Arena](https://usesapient.com/leaderboard) benchmarks APIs across all four layers.

Beyond tracking, Sapient identifies actionable opportunities and generates optimized content to fix visibility gaps.

## The Optimization Priority

Based on effort vs. impact:

| Priority | Action                              | Effort    | Impact      |
| -------- | ----------------------------------- | --------- | ----------- |
| 1        | Fix robots.txt for AI crawlers      | 5 min     | High        |
| 2        | Add llms.txt with clear positioning | 30 min    | High        |
| 3        | Create "X vs Y" comparison pages    | 2-4 hours | Medium-High |
| 4        | Improve error messages              | 1-2 days  | Medium      |
| 5        | Build MCP server                    | 1-2 weeks | High        |
| 6        | Publish Claude skill                | 2-4 weeks | Medium      |

Most teams skip straight to marketing (Layer 1) and ignore the mechanical layers (2-4) where they have more control.

## Related Reading

- [Why Claude Code Isn't Recommending Your Library](/blog/why-claude-code-not-recommending-your-library) — The 4 fixable reasons and how to address each one
- [We Tested 70+ APIs in Claude Code and Codex](/blog/we-tested-50-apis-in-coding-agents) — Real benchmark data from 70+ APIs
- [The Devtool Visibility Stack in 2026](/blog/devtool-visibility-stack-2026) — The measurement framework for API teams
- [AEO/GEO for Dev Tools: Why Profound & Otterly Don't Work for APIs](/blog/geo-for-developer-tools-is-different) — Why consumer GEO tools miss the mark
- [Best AEO/GEO Tools for Dev Tools in 2026](/blog/best-geo-tools-for-developer-tools-2026) — Sapient vs Profound vs Otterly

## FAQ

### Why does my well-documented API still lose to competitors?

Documentation quality for humans ≠ visibility for agents. Your competitor might have:

- Better robots.txt configuration (Layer 2)
- An llms.txt file you don't have (Layer 3)
- Cleaner error messages that help agents troubleshoot (Layer 4)

Check each layer systematically.

### Do I need to optimize for every coding agent separately?

The core patterns (llms.txt, MCP, clear errors) work across all agents, but each has different behaviors. Sapient tracks 10 coding agents (Claude Code, Codex, Cursor, GitHub Copilot, Gemini CLI, OpenClaw, OpenCode, Hermes, Pi, Kilo) so you can see where you're winning and losing across the full landscape.

### How often do coding agents actually use web search?

[Vercel's research](https://vercel.com/blog/how-we-built-aeo-tracking-for-coding-agents) found roughly 20% of prompts trigger web search. This varies by query type — comparison and discovery queries search more often than implementation queries.

### Is MCP worth the investment?

If you're an API company, yes. MCP gives agents a direct connection to your API, moving you from "mentioned in recommendations" to "integrated into workflows." The [MCP servers repository](https://github.com/modelcontextprotocol/servers) has 85k+ GitHub stars, signaling strong ecosystem momentum.

### How do I know if my changes are working?

Track these metrics over time:

- **Mention rate**: How often agents recommend you for relevant prompts
- **Accuracy**: Are recommendations correct and up-to-date?
- **Success rate**: When agents write code using your API, does it work?

Sapient's API Performance feature tracks all three across Claude Code, Codex, and Cursor.
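For a rough in-house version of the first and third metrics, you can log the outcome of each test prompt and compute the rates yourself. The sketch below assumes a hypothetical `Trial` record filled in by hand or by your own test harness. It does not cover accuracy, which usually needs a human to review the generated code.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    prompt: str
    recommended: bool  # did the agent pick your API for this prompt?
    code_worked: bool  # did the generated code run? Only counted if recommended.


def mention_rate(trials: list[Trial]) -> float:
    """Share of prompts where the agent recommended your API."""
    if not trials:
        return 0.0
    return sum(t.recommended for t in trials) / len(trials)


def success_rate(trials: list[Trial]) -> float:
    """Share of recommendations whose generated code actually worked."""
    recommended = [t for t in trials if t.recommended]
    if not recommended:
        return 0.0
    return sum(t.code_worked for t in recommended) / len(recommended)


# Made-up outcomes for four test prompts.
trials = [
    Trial("Add a payment form to my Next.js app", True, True),
    Trial("Set up subscription billing", True, False),
    Trial("Charge a saved card", True, True),
    Trial("Find a library for invoicing", False, False),
]

print(f"mention rate: {mention_rate(trials):.2f}")  # → mention rate: 0.75
print(f"success rate: {success_rate(trials):.2f}")  # → success rate: 0.67
```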

---

## Understand Your Position in the Stack

Your API's coding agent visibility depends on all four layers working together. Most teams only optimize for one.

**Free:** [Check your ranking on Devtool Arena](https://usesapient.com/leaderboard) — see how your API performs across all layers compared to competitors.

**Full analysis:** [Get a Sapient visibility report](https://usesapient.com/welcome) — layer-by-layer breakdown of where you're winning and where you're invisible.

**Community:** Join the [AI DevTool Demo Night](https://luma.com/devtooldemo5) — 3,500+ developer community, 50+ DevTool companies, hosted at AWS SF.
