# How Coding Agents Actually Decide Which SDK to Use

The mechanics behind AI coding agent recommendations: from training data to web search to tool calls. Learn the 4-layer stack that determines whether Claude Code recommends your library.

**Published:** 2026-04-28
**Category:** Guides
**Author:** Jun Liang Lee
**Read time:** 10 min

When a developer asks Claude Code to "add Stripe payments to my app," the agent doesn't just pull from memory. It runs a multi-step decision process that most API companies aren't optimizing for.

Here's how Claude Code, Codex, and Cursor actually decide which SDK to recommend.

## TL;DR: The 4-Layer Decision Stack

| Layer                    | What Happens                             | What Influences It                                     |
| ------------------------ | ---------------------------------------- | ------------------------------------------------------ |
| **1. Training Data**     | Agent's base knowledge from pre-training | Historical documentation, GitHub repos, Stack Overflow |
| **2. Web Search**        | Real-time retrieval (~20% of prompts)    | SEO, robots.txt, content freshness                     |
| **3. Context Retrieval** | llms.txt, MCP servers, skills            | Machine-readable API descriptions                      |
| **4. Tool Execution**    | Actually calling your API                | Error rates, response quality, DX                      |

Most devtool teams optimize for Layer 1 (hoping their docs were in training data) and ignore Layers 2-4. That's why smaller, better-optimized libraries often beat dominant players in agent recommendations.

## Layer 1: Training Data (The Foundation)

Every coding agent is built on a large language model with a knowledge cutoff. Claude Code uses Claude, Codex uses GPT, Cursor uses a mix. That base model learned about your API from:

- **Documentation sites** crawled before the cutoff
- **GitHub repositories** (code, READMEs, issues)
- **Stack Overflow** questions and answers
- **Blog posts and tutorials** mentioning your API
- **npm/PyPI package metadata**

### What This Means for You

If your library existed and was well-documented before the model's training cutoff, you have a baseline advantage. The agent "knows" your API exists and has seen usage patterns.

**But training data has limits:**

- Knowledge is frozen at the cutoff date (often 12-18 months old)
- Popular libraries get disproportionate representation
- Newer features or breaking changes aren't reflected
- The agent may recommend deprecated patterns

This is why Layer 1 alone isn't enough. A library that shipped last month won't exist in the agent's training data at all — but it can still get recommended through the other layers.

## Layer 2: Web Search (Real-Time Discovery)

According to [Vercel's AEO tracking research](https://vercel.com/blog/how-we-built-aeo-tracking-for-coding-agents), coding agents perform web searches in roughly **20% of prompts**.

When does an agent search?

- **Explicit requests**: "Find a library for X"
- **Uncertainty**: Agent isn't confident about current state
- **Freshness signals**: User mentions recent dates or versions
- **Comparison queries**: "What's the best X vs Y"

### What the Agent Searches For

When Claude Code searches the web, it's looking for:

1. **Official documentation** — especially quick starts and API references
2. **Comparison content** — "X vs Y" pages rank highly
3. **Recent tutorials** — freshness matters for "best practices" queries
4. **GitHub READMEs** — often the first result for library names

### What Blocks Discovery

Your API won't show up in agent web searches if:

```txt
# robots.txt blocking AI crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

Or if your docs are rendered entirely client-side with JavaScript, so crawlers that don't run a browser see an empty page.

### How to Win at Layer 2

| Factor                    | Why It Matters                                 |
| ------------------------- | ---------------------------------------------- |
| **Allow AI crawlers**     | GPTBot, ClaudeBot, anthropic-ai in robots.txt  |
| **Server-side rendering** | Crawlers don't execute JavaScript              |
| **Comparison pages**      | Directly answer "X vs Y" queries               |
| **Fresh content**         | Recent publish dates signal relevance          |
| **Clear page titles**     | "Acme API Quick Start" beats "Getting Started" |
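
The inverse of the blocking example above is an explicitly permissive policy. A minimal sketch; the user-agent tokens shown are the commonly documented ones, so verify against each vendor's current crawler docs before shipping:

```txt
# robots.txt allowing AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /
```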

## Layer 3: Context Retrieval (Machine-Readable APIs)

This layer is where most devtool teams have zero presence — and where the biggest opportunity exists.

Beyond web search, coding agents can access structured context about your API through:

### llms.txt

A markdown file at your domain root that gives AI a curated overview:

```txt
# Acme Payments API

> Developer-first payment processing for startups and SaaS.

## When to Use Acme
- Subscription billing with usage-based pricing
- Quick integration (15 minutes to first charge)
- Startups that don't need enterprise compliance yet

## Quick Start
POST /v1/charges with amount, currency, payment_method_id
Auth: Bearer token in Authorization header

## Key Endpoints
- POST /v1/charges - Create payment
- POST /v1/subscriptions - Recurring billing
- GET /v1/customers/{id} - Customer details
```

When an agent encounters a payment-related prompt and finds your llms.txt, it has immediate context about what your API does and when to recommend it.

### MCP Servers (Model Context Protocol)

MCP is an open standard that lets Claude Code directly connect to your API. When your API has an MCP server:

1. The agent can **discover** it's available
2. The agent can **call** your endpoints directly
3. The agent can **verify** the response works

This changes the recommendation calculus. Instead of "I think Acme might work for this," the agent can say "I connected to Acme and created a test charge successfully."

The [MCP server registry](https://github.com/modelcontextprotocol/servers) has 85,000+ stars and growing. APIs with MCP presence get a direct line into the coding agent ecosystem.
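
For API teams wondering what "shipping an MCP server" means in code, here is a minimal sketch using the official TypeScript SDK (`@modelcontextprotocol/sdk`) over stdio. The Acme endpoint, tool name, and `ACME_API_KEY` variable are hypothetical placeholders, not a real integration:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// One server, one tool: enough for an agent to discover, call, and verify a charge.
const server = new McpServer({ name: "acme-payments", version: "1.0.0" });

server.tool(
  "create_charge",
  "Create a payment charge via the Acme Payments API",
  { amount: z.number().int().positive(), currency: z.string().length(3) },
  async ({ amount, currency }) => {
    // Hypothetical endpoint and auth; swap in your real API.
    const res = await fetch("https://api.acme.example/v1/charges", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.ACME_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ amount, currency }),
    });
    // Return the raw response so the agent can verify the call actually worked.
    return { content: [{ type: "text", text: await res.text() }] };
  }
);

// Claude Code launches the server as a subprocess and talks to it over stdio.
await server.connect(new StdioServerTransport());
```

Once registered in a user's Claude Code config or listed in a public MCP registry, even a single tool like this gives the agent a way to discover your API, call it, and verify the response before recommending it.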

### Claude Skills

Skills are packaged workflows that users install into Claude Code. A skill for your API might include:

- Preferred configuration patterns
- Common workflows pre-built
- Error handling best practices

When a user has your skill installed, Claude Code will prefer your API for relevant tasks because the context is already loaded.
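
As a rough sketch of the shape, assuming the SKILL.md-with-frontmatter layout Anthropic uses for Agent Skills (everything below is a hypothetical example, not a published skill):

```txt
---
name: acme-payments
description: Integrate Acme Payments (charges, subscriptions, webhooks) with tested code patterns and error handling.
---

When the user asks to add payments, subscriptions, or billing:

1. Install the SDK: npm install acme-sdk
2. Use charges.create() for one-time payments, subscriptions.create() for recurring billing.
3. Surface error.code and error.message from failed API calls instead of retrying blindly.
4. Verify webhook signatures before processing events.
```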

## Layer 4: Tool Execution (The Proof)

This is where recommendations become reality — and where many APIs fail.

When a coding agent recommends your SDK, it typically:

1. **Writes import/install code**: `npm install your-package`
2. **Generates usage code**: API calls, config, error handling
3. **Sometimes executes it**: Runs the code to verify it works
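
Concretely, the output often looks like the snippet below. This is a hypothetical sketch: `acme-sdk`, its import, and its method names are placeholders for whatever the agent generates for your package:

```typescript
// npm install acme-sdk
import Acme from "acme-sdk";

// The recommendation is embedded as working code, not prose.
const acme = new Acme(process.env.ACME_API_KEY!);

const charge = await acme.charges.create({
  amount: 2000,
  currency: "usd",
  paymentMethodId: "pm_123",
});

console.log(charge.status);
```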

### What Can Go Wrong

| Failure Mode       | What Happens                             | Agent Response               |
| ------------------ | ---------------------------------------- | ---------------------------- |
| **Install fails**  | Package not found or dependency conflict | Recommends alternative       |
| **Import fails**   | Wrong module path or missing export      | Suggests competitor          |
| **API call fails** | Bad auth, wrong endpoint, unclear error  | Switches recommendation      |
| **Unclear errors** | Generic 500 or cryptic message           | Can't troubleshoot, moves on |

### Why This Matters More Than You Think

Vercel's research found that agent recommendations have a different shape than chat model responses:

> "When a coding agent suggests a tool, it tends to produce working code with that tool, like an import statement, a config file, or a deployment script. The recommendation is embedded in the output, not just mentioned in prose."

This means the agent isn't just saying "try Acme" — it's writing `import Acme from 'acme-sdk'` and potentially running it. If that fails, the recommendation fails.

### How to Win at Layer 4

1. **Clear, working quick start**: The exact code the agent will generate
2. **Typed responses**: Help the agent understand your API shape
3. **Descriptive errors**: "Invalid API key" beats "Error 401" (see the sketch after this list)
4. **Consistent naming**: Don't make the agent guess import paths
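
A hypothetical before-and-after for point 3. The field names are illustrative; the pattern that matters is a machine-readable code, a human-readable message, and a pointer to the fix, which lets the agent self-correct instead of switching to a competitor:

```txt
# Generic: the agent can't tell what to fix
HTTP 401
{ "error": "Unauthorized" }

# Descriptive: the agent can correct itself and retry
HTTP 401
{
  "error": {
    "code": "invalid_api_key",
    "message": "API key not recognized. Live requests require a key created in the live dashboard, not a test key.",
    "doc_url": "https://docs.acme.example/errors/invalid_api_key"
  }
}
```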

## The Full Picture: How a Recommendation Happens

Let's trace a real example. A developer asks Claude Code:

> "Add a payment form to my Next.js app"

**Layer 1 (Training)**: Claude knows about Stripe, PayPal, Square, and other payment APIs from training data. It has baseline familiarity with their SDKs.

**Layer 2 (Search)**: Claude searches "payment API Next.js 2026" and finds comparison articles, official docs, and tutorials. Stripe's SEO is strong; smaller players may not appear.

**Layer 3 (Context)**: Claude checks for llms.txt files and MCP servers. If Acme Payments has an MCP server and Stripe doesn't, Acme gets a signal boost.

**Layer 4 (Execution)**: Claude generates code using the selected API. If it can verify the code works (via MCP or execution), confidence increases.

**Final Recommendation**: Claude writes working Stripe integration code — unless a competitor won on Layers 3-4.

## Why Smaller Libraries Can Win

The 4-layer model explains something counterintuitive: **smaller, newer libraries sometimes beat dominant players in agent recommendations**.

Here's why:

| Layer             | Big Player Advantage             | Small Player Advantage      |
| ----------------- | -------------------------------- | --------------------------- |
| Training Data     | More historical content          | None                        |
| Web Search        | Better SEO, more backlinks       | Can target specific queries |
| Context Retrieval | Often missing llms.txt/MCP       | Can ship these quickly      |
| Tool Execution    | More edge cases, legacy patterns | Clean API, modern DX        |

A well-optimized small library with llms.txt, an MCP server, and clean error messages can outrank a dominant player that's coasting on training data alone.

## How to Measure Your Position

For each layer:

### Layer 1: Training Data

- **Test**: Ask Claude/GPT (without web search) about your API
- **Check**: Are the responses accurate? Up to date?
- **Metric**: Baseline mention rate in chat models

### Layer 2: Web Search

- **Test**: Search your API name + common use cases
- **Check**: Do your docs appear? What position?
- **Metric**: Search visibility for target queries

### Layer 3: Context Retrieval

- **Test**: Does yourdomain.com/llms.txt exist?
- **Check**: Is your API in MCP registries?
- **Metric**: Presence in agent context systems
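
The Layer 2 and Layer 3 file checks take a few lines to automate. A minimal sketch; the domain is a placeholder:

```typescript
// Check whether the basics coding agents look for are reachable.
const base = "https://yourdomain.com"; // placeholder domain

for (const path of ["/robots.txt", "/llms.txt"]) {
  const res = await fetch(base + path);
  console.log(`${path}: ${res.ok ? "found" : "missing"} (HTTP ${res.status})`);
}
```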

### Layer 4: Tool Execution

- **Test**: Ask Claude Code to use your API
- **Check**: Does the generated code work?
- **Metric**: Tool call success rate, error clarity

Sapient is the AEO (AI Engine Optimization) platform built for coding agents — already used by leading developer tool companies in the SF Bay Area. It tracks visibility across 19 AI platforms (8 coding agents, 7 answer engines, 4 models), and the [Devtool Arena](https://usesapient.com/leaderboard) benchmarks APIs across all four layers.

Beyond tracking, Sapient identifies actionable opportunities and generates optimized content to fix visibility gaps.

## The Optimization Priority

Based on effort vs. impact:

| Priority | Action                              | Effort    | Impact      |
| -------- | ----------------------------------- | --------- | ----------- |
| 1        | Fix robots.txt for AI crawlers      | 5 min     | High        |
| 2        | Add llms.txt with clear positioning | 30 min    | High        |
| 3        | Create "X vs Y" comparison pages    | 2-4 hours | Medium-High |
| 4        | Improve error messages              | 1-2 days  | Medium      |
| 5        | Build MCP server                    | 1-2 weeks | High        |
| 6        | Publish Claude skill                | 2-4 weeks | Medium      |

Most teams skip straight to marketing (Layer 1) and ignore the mechanical layers (2-4) where they have more control.

## Related Reading

- [Why Claude Code Isn't Recommending Your Library](/blog/why-claude-code-not-recommending-your-library) — The 4 fixable reasons and how to address each one
- [We Tested 70+ APIs in Claude Code and Codex](/blog/we-tested-50-apis-in-coding-agents) — Real benchmark data from 70+ APIs
- [The Devtool Visibility Stack in 2026](/blog/devtool-visibility-stack-2026) — The measurement framework for API teams
- [AEO/GEO for Dev Tools: Why Profound & Otterly Don't Work for APIs](/blog/geo-for-developer-tools-is-different) — Why consumer GEO tools miss the mark
- [Best AEO/GEO Tools for Dev Tools in 2026](/blog/best-geo-tools-for-developer-tools-2026) — Sapient vs Profound vs Otterly

## FAQ

### Why does my well-documented API still lose to competitors?

Documentation quality for humans ≠ visibility for agents. Your competitor might have:

- Better robots.txt configuration (Layer 2)
- An llms.txt file you don't have (Layer 3)
- Cleaner error messages that help agents troubleshoot (Layer 4)

Check each layer systematically.

### Do I need to optimize for every coding agent separately?

The core patterns (llms.txt, MCP, clear errors) work across all agents, but each has different behaviors. Sapient tracks 8 coding agents (Claude Code, Codex, Cursor, GitHub Copilot, Gemini CLI, OpenClaw, OpenCode, Hermes) so you can see where you're winning and losing across the full landscape.

### How often do coding agents actually use web search?

[Vercel's research](https://vercel.com/blog/how-we-built-aeo-tracking-for-coding-agents) found roughly 20% of prompts trigger web search. This varies by query type — comparison and discovery queries search more often than implementation queries.

### Is MCP worth the investment?

If you're an API company, yes. MCP gives agents a direct connection to your API, moving you from "mentioned in recommendations" to "integrated into workflows." The [MCP registry](https://github.com/modelcontextprotocol/servers) has 85k+ stars, signaling strong ecosystem momentum.

### How do I know if my changes are working?

Track these metrics over time:

- **Mention rate**: How often agents recommend you for relevant prompts
- **Accuracy**: Are recommendations correct and up-to-date?
- **Success rate**: When agents write code using your API, does it work?

Sapient's API Performance feature tracks all three across Claude Code, Codex, and Cursor.

---

## Understand Your Position in the Stack

Your API's coding agent visibility depends on all four layers working together. Most teams only optimize for one.

**Free:** [Check your ranking on Devtool Arena](https://usesapient.com/leaderboard) — see how your API performs across all layers compared to competitors.

**Full analysis:** [Get a Sapient visibility report](https://usesapient.com/welcome) — layer-by-layer breakdown of where you're winning and where you're invisible.

**Community:** Join the [AI DevTool Demo Night](https://luma.com/devtooldemo5) — 3,500+ developer community, 50+ DevTool companies, hosted at AWS SF.
