Mergeable by default: Building the context engine to save time and tokens — Peter Werry, Unblocked

Podcasts | May 03, 2026 | 15.6K views | 1:41:25

TL;DR

Peter Werry argues that as AI agents move toward autonomous 'YOLO mode' execution, simple RAG and MCP connections fail to provide adequate organizational context. The result is bottlenecks and 'satisfaction of search' failures, where agents stop at superficial answers instead of understanding the historical 'why' behind code decisions.

🔄 The Human Bottleneck Problem

Humans become cognitive bottlenecks

With parallel agents and YOLO mode execution, engineers cannot manage the context switching required to manually feed information to multiple simultaneous background processes.

Organizational knowledge requires battle scars

True context includes institutional memory of incidents, outages, and historical decisions, not just the current state of the code, so that agents can understand why systems work the way they do.

Background agents are inevitable

As code intelligence improves exponentially, the limiting factor becomes context delivery, which calls for engines that can operate autonomously without human intermediaries.

Three Myths of Context Provision

Naive RAG causes satisfaction of search

Simple vector search leads agents to stop at the first finding, much as a radiologist who spots one abnormality can miss a second, so they overlook critical context buried in Slack threads or incident reports.

MCP connections lack understanding

Wiring up data sources provides access but fails to reveal relationships between systems, historical motivations for changes, or the reasons behind architectural decisions.

Larger context windows don't solve reasoning

Even million-token windows cannot fit an entire organization's context, and a larger window doesn't help agents tell current truth from outdated information or reason across disparate sources.

🏗️ Building a True Context Engine

Prevent satisfaction of search

Engines must surface previously rejected solutions, analyze deletion history, and understand user intent rather than stopping at the first compiling code solution.
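
As a rough illustration of the difference, the sketch below contrasts a retriever that stops at its single best hit with one that keeps the best hit from every source type. Everything in it is an assumption made for illustration: the `Evidence` type, the source labels, the relevance scores, and the sample data are hypothetical and do not come from the talk or from Unblocked's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str   # hypothetical labels: "code", "rejected_pr", "incident", "slack"
    text: str
    score: float  # relevance score from some upstream retriever (assumed)


def first_hit(results):
    """Naive behavior: return only the single best match and stop."""
    return max(results, key=lambda e: e.score)


def gather_across_sources(results, min_score=0.3):
    """Keep the best match from every source type above a relevance floor."""
    best = {}
    for e in results:
        if e.score < min_score:
            continue  # drop irrelevant noise
        if e.source not in best or e.score > best[e.source].score:
            best[e.source] = e
    return sorted(best.values(), key=lambda e: e.score, reverse=True)


# Made-up sample data: the code answers "what", the other sources answer "why".
results = [
    Evidence("code", "retry_count = 3", 0.92),
    Evidence("rejected_pr", "Raising retries to 10 was rejected", 0.61),
    Evidence("incident", "Retry storm postmortem", 0.55),
    Evidence("slack", "lunch plans", 0.05),
]

naive = first_hit(results)
broad = gather_across_sources(results)
```

With this data, `first_hit` returns only the code snippet, while `gather_across_sources` also returns the rejected pull request and the incident report, which is the kind of history an agent would need before changing the value.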

Resolve conflicts beyond recency

Determining truth requires identifying contradictions between documentation and code, recognizing that the main branch isn't always the future source of truth, and learning from user corrections.
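
One way to picture conflict resolution that goes beyond recency is the minimal sketch below. It groups claims by topic, flags topics where sources disagree, and lets a user correction outrank a newer timestamp. The `Claim` type, its fields, the ranking rule, and the sample dates are all hypothetical; the talk does not specify how such a ranking should work.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    topic: str
    value: str
    source: str                   # hypothetical labels such as "docs" or "code"
    updated: date
    user_confirmed: bool = False  # set when a user has corrected or confirmed it


def find_conflicts(claims):
    """Return topics for which sources state different values."""
    by_topic = {}
    for c in claims:
        by_topic.setdefault(c.topic, []).append(c)
    return {t: cs for t, cs in by_topic.items() if len({c.value for c in cs}) > 1}


def resolve(conflicting):
    """Prefer user-confirmed claims; fall back to recency only as a tiebreaker."""
    return max(conflicting, key=lambda c: (c.user_confirmed, c.updated))


# Made-up sample data: the docs are newer, but a user confirmed the code's value.
claims = [
    Claim("auth_api_version", "v1", "docs", date(2025, 6, 1)),
    Claim("auth_api_version", "v2", "code", date(2024, 1, 1), user_confirmed=True),
    Claim("default_region", "us-east", "docs", date(2025, 2, 1)),
]

conflicts = find_conflicts(claims)
winner = resolve(conflicts["auth_api_version"])
```

A purely recency-based rule would pick the documentation here; the sketch picks the code's value because a user confirmed it.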

Enforce access controls at the core

Context engines must respect permissions like private Slack channels, ensuring sensitive information only surfaces for authorized users while maintaining strict privacy boundaries.
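
A minimal sketch of permission-aware retrieval is shown below, assuming each document carries the set of users allowed to read it and that an empty set means the document is public. The `Document` type, that convention, and the sample users are hypothetical. The point is only that filtering happens before anything reaches the agent, not after an answer has been generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    text: str
    allowed_users: frozenset = frozenset()  # assumption: empty means public


def visible_to(user, docs):
    """Return only the documents this user is permitted to see."""
    return [d for d in docs if not d.allowed_users or user in d.allowed_users]


# Made-up sample data: one public doc and one private-channel message.
docs = [
    Document("Public runbook for the deploy pipeline"),
    Document("Private channel: vendor contract details", frozenset({"alice"})),
]

alice_view = visible_to("alice", docs)
bob_view = visible_to("bob", docs)
```

Here `alice` sees both documents, while `bob` sees only the public runbook.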

Bottom Line

Organizations must build context engines that resolve conflicts between data sources, preserve institutional knowledge of past failures, and enforce access controls to prepare for autonomous background agents that understand the 'why' behind decisions, not just the 'what' of current code.
