Infinite Code Context: AI Coding at Enterprise Scale w/ Blitzy CEO Brian Elliott & CTO Sid Pardeshi
TL;DR
Blitzy's CEO and CTO reveal how they achieve 80%+ autonomous enterprise software development by orchestrating thousands of dynamically generated AI agents with deep relational codebase understanding, rather than relying on single LLM capabilities. Their approach schematizes code relationships through runtime analysis and knowledge graphs to overcome context window limitations.
⚙️ Dynamic Agent Architecture
Just-in-time agent generation
Blitzy generates agents dynamically with prompts written by other agents and tools selected contextually, avoiding hard-coded harnesses that depreciate as models improve. This allows the system to automatically adapt to new model capabilities through configuration changes rather than code rewrites.
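The pattern can be sketched in a few lines: a planner writes the worker's prompt and picks its tools from configuration, so adapting to a new model is a config change rather than a code rewrite. This is a minimal illustration, not Blitzy's implementation; the tool registry and agent shape are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool registry; a real system would wrap APIs, shells, linters, etc.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query}",
    "lint": lambda code: f"lint report for {code}",
}

@dataclass
class Agent:
    prompt: str
    tools: dict = field(default_factory=dict)

def plan_agent(task: str, config: dict) -> Agent:
    # A "planner" agent writes the worker's prompt and selects tools from config,
    # avoiding a hard-coded harness that loses value as models improve.
    tool_names = config["tools"]
    prompt = (
        f"You are a specialist agent. Task: {task}. "
        f"Available tools: {', '.join(tool_names)}."
    )
    return Agent(prompt=prompt, tools={name: TOOLS[name] for name in tool_names})

agent = plan_agent("fix the failing build", {"tools": ["search", "lint"]})
```

Because the prompt and tool set are produced at runtime, swapping in a more capable model only requires updating the configuration that the planner reads.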
Cross-model verification
The platform never relies on a single model family, instead using models from different developers to check each other's work. This redundancy catches errors and compensates for individual model limitations in long-running complex systems.
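A hedged sketch of the generate-then-verify loop, with stub functions standing in for models from two different developers; the retry policy and callback names are assumptions for illustration.

```python
def cross_checked(task, generate, verify, max_retries=3):
    """Generate with one model family, verify with another; retry on disagreement.

    `generate` and `verify` are stand-ins for calls to two different vendors'
    models, so one family's blind spots can be caught by the other.
    """
    for _ in range(max_retries):
        answer = generate(task)
        if verify(task, answer):
            return answer
    raise RuntimeError("models could not agree; escalate to a human")

# Stubbed example: one family proposes a patch, a second family reviews it.
result = cross_checked(
    "write a patch for the null-pointer bug",
    generate=lambda task: "patch-v1",
    verify=lambda task, answer: answer.startswith("patch"),
)
```

In a long-running pipeline, the disagreement branch is the valuable part: it converts silent single-model errors into explicit retry or escalation events.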
Harness over model philosophy
The team is "bearish on LLM capabilities as pure standalone assets" and "bullish on orchestration," believing AGI-type effects emerge from sophisticated system design rather than waiting for perfect single models.
🧠 Infinite Context Engineering
Schematizing vs. semantic clustering
Instead of relying on semantic similarity, Blitzy creates deep knowledge graphs mapping relational entities (globals, classes, functional relationships) across 100-million-line codebases. This schematization enables precise line-level context injection and extraction.
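To make the contrast with semantic clustering concrete, here is a tiny, single-file stand-in for that relational graph: it extracts an exact function-level call graph from source rather than embedding-based similarity. Blitzy's actual graph spans languages and millions of lines; this uses Python's stdlib `ast` purely for illustration.

```python
import ast
from collections import defaultdict

def call_graph(source: str) -> dict:
    """Map which functions call which others in one module.

    A relational edge like total -> tax is exact, so the system can inject
    precisely the lines a task depends on instead of semantically similar text.
    """
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return graph

src = "def total(x):\n    return tax(x) + x\n\ndef tax(x):\n    return x * 0.2\n"
g = call_graph(src)  # g["total"] contains "tax"
```

Extending the same idea to globals, classes, and cross-file imports yields the knowledge graph that line-level context extraction queries against.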
Effective context window management
Model quality degrades measurably once roughly 20-40% of the advertised context window is filled. Blitzy manages this by loading only the relationally relevant context for each specific task and evicting unnecessary information to maintain output quality.
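One simple way to operationalize that budget is a greedy packer that fills only a fraction of the advertised window with the highest-relevance chunks. This is a sketch under assumptions: relevance scores are presumed to come from the relational graph, and character length stands in for token count.

```python
def pack_context(chunks, budget):
    """Select the most relevant chunks until the effective budget is spent.

    `chunks` is a list of (text, relevance_score) pairs; `budget` is the
    effective context size, deliberately well below the advertised window.
    """
    selected, used = [], 0
    for text, score in sorted(chunks, key=lambda c: c[1], reverse=True):
        if used + len(text) <= budget:
            selected.append(text)
            used += len(text)
    return selected

advertised_window = 100  # pretend token budget; characters stand in for tokens
chunks = [
    ("class Invoice: ...", 0.9),       # directly related to the task
    ("unrelated README prose", 0.1),   # semantically nearby but irrelevant
    ("def tax(x): ...", 0.8),
]
# Cap at 30% of the advertised window, per the degradation observation above.
context = pack_context(chunks, budget=int(0.3 * advertised_window))
```

With a 30-unit budget only the top-scored chunk fits, which is the point: low-relevance material is kept out even when space for it nominally exists.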
Language-agnostic understanding
The platform builds an AST-like abstraction layer that understands code relationships agnostic of programming language, enabling analysis across polyglot enterprise environments without version-specific parsers.
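The core of such a layer is a single language-neutral node shape that every per-language frontend lowers into, so downstream analysis never touches a version-specific parser. The `Entity` type below is a hypothetical minimal version of that shape.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    kind: str   # "module", "class", "function", "global", ...
    name: str
    children: list = field(default_factory=list)

def walk(entity):
    """Depth-first traversal over the language-agnostic tree."""
    yield entity
    for child in entity.children:
        yield from walk(child)

# A Java, Python, or COBOL frontend would each emit this same structure:
module = Entity("module", "billing", [
    Entity("class", "Invoice", [Entity("function", "total")]),
    Entity("global", "TAX_RATE"),
])
functions = [e.name for e in walk(module) if e.kind == "function"]
```

Analyses written once against `Entity` (call graphs, dead-code checks, context extraction) then apply uniformly across a polyglot enterprise codebase.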
🔍 Runtime Verification & Integration
Parallel production environments
Blitzy spins up actual running instances of client applications in isolated cloud environments to observe runtime behavior, compile-time dependencies, and production logs—going far beyond static analysis to understand how code actually executes under load.
Recursive build validation
The system handles legacy enterprise apps with complex manual build processes, database dependencies, and credential requirements, iteratively discovering missing packages and dependencies to create correct build instructions even when documentation is absent.
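The discovery loop described above can be sketched as: run the build, parse the failure for a missing dependency, install it, and retry until the build passes. The error pattern here is Python-style and the callbacks are stubs; a real system would need per-toolchain parsers and credential handling.

```python
import re

def discover_build(run_build, install, max_rounds=10):
    """Iteratively recover build instructions when documentation is absent.

    `run_build` returns (ok, log); `install` takes a package name. The list of
    packages installed along the way becomes the reconstructed build recipe.
    """
    installed = []
    for _ in range(max_rounds):
        ok, log = run_build()
        if ok:
            return installed
        match = re.search(r"No module named '([\w.]+)'", log)
        if match is None:
            raise RuntimeError(f"unrecognized failure, needs a human: {log}")
        install(match.group(1))
        installed.append(match.group(1))
    raise RuntimeError("build did not converge")

# Simulated legacy app that fails until two undocumented packages are present.
present = set()
def fake_build():
    for dep in ("legacy_orm", "report_gen"):
        if dep not in present:
            return False, f"ImportError: No module named '{dep}'"
    return True, "build ok"

recipe = discover_build(fake_build, install=present.add)
```

The returned `recipe` is the artifact that matters: correct, reproducible build instructions discovered from failures rather than from documentation.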
Screenshot-based QA
Agents validate code by actually running applications and capturing screenshots of UI interactions, creating a recursive correction loop that verifies not just compilation but functional behavior in production-like conditions.
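The correction loop reduces to: render the app, inspect the captured UI state, and patch until it passes. Everything below is a hypothetical stand-in; in practice `render` would be a browser driver taking real screenshots and `looks_correct`/`apply_fix` would be model-driven.

```python
def screenshot_qa(render, looks_correct, apply_fix, max_iters=5):
    """Validate functional behavior, not just compilation.

    Loops: run the app, capture its UI state, and apply a fix until the
    rendered result passes inspection. Returns the number of fix attempts.
    """
    for attempt in range(max_iters):
        shot = render()
        if looks_correct(shot):
            return attempt
        apply_fix(shot)
    raise RuntimeError("QA loop exhausted without a passing screenshot")

# Stubbed example: the first render shows a broken button; one fix repairs it.
state = {"button": "missing"}
attempts = screenshot_qa(
    render=lambda: dict(state),
    looks_correct=lambda shot: shot["button"] == "rendered",
    apply_fix=lambda shot: state.update(button="rendered"),
)
```

The loop terminates on observed behavior in a production-like environment, which is what distinguishes this from compile-and-ship verification.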
📊 Evaluation & Strategic Positioning
Taste-based outcome evaluation
Rather than testing isolated functions, Blitzy evaluates against complete real-world applications (like Apache Spark) measuring progress toward 100% human-level completion. This requires "extreme taste" to judge whether 90% vs 95% completion meets enterprise standards.
Memory over fine-tuning
The team is more bullish on advances in AI memory systems than on fine-tuning approaches, believing superior context management and retrieval will deliver better results than model customization for enterprise code generation.
Labor market evolution
While senior engineers are favored in the short term, the long-term outlook favors junior engineers who can effectively leverage AI tools, shifting value from raw coding ability to architectural oversight and AI orchestration skills.
Bottom Line
Enterprise teams should adopt dynamic multi-agent systems that schematize codebase relationships and validate through runtime execution rather than relying on single LLMs or static analysis, accepting current model limitations while architecting for rapid capability improvements.