Your MCP Server is Bad (and you should feel bad) - Jeremiah Lowin, Prefect
TL;DR
Jeremiah Lowin argues that most MCP servers fail because developers treat them like REST APIs for humans rather than curated interfaces optimized for AI agents' specific constraints around discovery cost, iteration speed, and limited context windows.
🧠 The Agent-Human Divide (3 insights)
Discovery is prohibitively expensive
Unlike humans, who study documentation once, agents must rediscover the entire server on every activation, burning tokens to enumerate every tool and description.
Iteration is the enemy
While humans iterate on code rapidly, each additional agent tool call transmits the full conversation history over the wire, making multi-step workflows slow and costly.
Context windows are severely constrained
Agents work within a limited context window (around 200k tokens) and have nothing comparable to human long-term memory, so interfaces must minimize cognitive load and decision points.
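The iteration cost can be made concrete with a rough model. The sketch below assumes that every tool call resends the whole conversation so far and that each call adds a fixed number of tokens. The function name and all numbers are illustrative assumptions, not figures from the talk.

```python
def total_tokens_sent(base_context: int, tokens_per_step: int, calls: int) -> int:
    """Total input tokens transmitted across `calls` sequential tool calls.

    Simplified model (an assumption): call number i resends the base context
    plus everything added by the calls that came before it.
    """
    return sum(base_context + i * tokens_per_step for i in range(calls))


if __name__ == "__main__":
    # Hypothetical numbers: 5,000 tokens of prompt and tool listings,
    # 1,000 tokens added by each tool call and its result.
    one_call = total_tokens_sent(5_000, 1_000, calls=1)
    five_calls = total_tokens_sent(5_000, 1_000, calls=5)
    print(one_call, five_calls)  # 5000 35000
```

Under this model a five-step workflow transmits seven times as many tokens as a single call, because the cost of each extra step grows with everything that came before it.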
🎯 Outcomes Over Operations (3 insights)
Atomic operations are an anti-pattern
Exposing granular functions like get_user and filter_orders forces agents into expensive orchestration instead of single-call solutions that return complete outcomes.
Agents make terrible glue code
Using LLMs to chain atomic operations is slow, stochastic, and hard to debug compared to deterministic orchestration tools like Prefect or Airflow.
Design for agent stories
Tools should represent complete workflows (e.g., 'track latest order by email') rather than steps that require agent reasoning to sequence and execute.
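The contrast between atomic operations and an outcome-oriented tool can be sketched in plain Python. Everything here is hypothetical: the in-memory data, the `Order` type, and the function names stand in for a real backend, and registering the function with an MCP framework is left out on purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    order_id: str
    user_id: str
    placed_on: str  # ISO date, so string comparison sorts chronologically
    status: str


# Hypothetical in-memory data standing in for a real backend.
USERS = {"ada@example.com": "u1", "bob@example.com": "u2"}
ORDERS = [
    Order("o1", "u1", "2024-01-05", "delivered"),
    Order("o2", "u1", "2024-03-12", "in transit"),
    Order("o3", "u2", "2024-02-20", "delivered"),
]


# Atomic style: exposed as separate tools, these force the agent to call
# them in sequence and work out the latest order by itself.
def get_user(email: str) -> Optional[str]:
    return USERS.get(email)


def list_orders(user_id: str) -> list[Order]:
    return [o for o in ORDERS if o.user_id == user_id]


# Outcome style: one call, with the sequencing done in deterministic code.
def track_latest_order(email: str) -> str:
    """Return a one-line status for the customer's most recent order."""
    user_id = get_user(email)
    if user_id is None:
        return f"No customer found for {email}."
    orders = list_orders(user_id)
    if not orders:
        return f"No orders found for {email}."
    latest = max(orders, key=lambda o: o.placed_on)
    return f"Order {latest.order_id} placed on {latest.placed_on} is {latest.status}."


if __name__ == "__main__":
    print(track_latest_order("ada@example.com"))
```

Only `track_latest_order` would be exposed to the agent in this design. The atomic helpers stay internal, so the agent makes one decision and one call instead of three.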
🎨 The Art of Curation (3 insights)
Raw API wrappers fail agents
Simply stringifying REST API responses creates 'needles in haystacks' where agents must examine every field to find relevant data, consuming excessive context.
Agents deserve purpose-built interfaces
Just as humans use websites and apps instead of raw APIs, agents need interfaces curated specifically for their strengths and weaknesses, not repurposed human developer tools.
Curate ruthlessly to minimize discovery
Every exposed tool and description adds to the discovery tax, so servers must eliminate unnecessary surface area and optimize for the agent's limited context window.
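A small sketch of what curating a response might look like, compared with stringifying the raw payload. The payload shape and field names are invented for illustration and do not come from any real API.

```python
import json

# Hypothetical raw payload, shaped like a typical REST response.
RAW_ORDER = {
    "id": "o2",
    "status": "in transit",
    "eta": "2024-03-15",
    "carrier": {"name": "ExampleShip", "tracking": "ES123", "internal_code": 77},
    "created_at": "2024-03-12T09:30:00Z",
    "updated_at": "2024-03-13T14:02:11Z",
    "warehouse_id": "wh-9",
    "billing": {"currency": "USD", "tax_region": "CA", "ledger_ref": "L-4411"},
    "flags": ["gift_wrap", "fragile"],
}


def raw_response(order: dict) -> str:
    """Wrapper style: hand the agent the whole payload as a string."""
    return json.dumps(order)


def curated_response(order: dict) -> str:
    """Curated style: only the fields an order-tracking question needs."""
    carrier = order["carrier"]
    return (
        f"Order {order['id']} is {order['status']}, "
        f"expected {order['eta']} via {carrier['name']} "
        f"(tracking {carrier['tracking']})."
    )


if __name__ == "__main__":
    print(len(raw_response(RAW_ORDER)), "characters raw")
    print(len(curated_response(RAW_ORDER)), "characters curated")
    print(curated_response(RAW_ORDER))
```

The curated string carries the answer and nothing else, so the agent does not spend context scanning billing or warehouse fields that the question never asked about.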
Bottom Line
Design MCP tools as complete outcome-oriented workflows rather than atomic API operations, curating strictly for agent limitations on discovery, iteration, and context.