The Model Eats the Scaffolding: DeepMind's Logan Kilpatrick & Tulsee Doshi on 3.5 Flash, Omni & More

Podcasts | May 20, 2026 | 1:01:14

TL;DR

Google DeepMind's Logan Kilpatrick and Tulsee Doshi discuss the launch of Gemini 3.5 Flash, Omni video generation, and Spark agent features. They describe a strategic shift toward cost-adjusted performance and a standardized agent harness ('Antigravity') shared across Google's product ecosystem, rather than competing solely on absolute model capability.

Flash-First Strategy & The Performance Frontier

Prioritizing Speed and Cost Over Raw Scale

Google launched Gemini 3.5 Flash first, describing it as 3x faster and significantly cheaper than larger models, to optimize for latency and cost at consumer scale. The reasoning is that users will not tolerate longer waits in exchange for marginally better quality.
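To make the idea of a cost-performance frontier concrete, here is a minimal sketch that filters a set of candidate models down to those not dominated on cost, latency, and quality. The candidate names and all numbers are hypothetical placeholders chosen for illustration; they are not figures from the episode or from Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float     # hypothetical cost per request (arbitrary units, lower is better)
    latency: float  # hypothetical response time in seconds (lower is better)
    quality: float  # hypothetical quality score (higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every axis and strictly better on one."""
    no_worse = a.cost <= b.cost and a.latency <= b.latency and a.quality >= b.quality
    strictly_better = a.cost < b.cost or a.latency < b.latency or a.quality > b.quality
    return no_worse and strictly_better


def pareto_frontier(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Illustrative data only: these values are made up for the sketch.
candidates = [
    Candidate("small", cost=1.0, latency=0.5, quality=0.80),
    Candidate("large", cost=5.0, latency=1.5, quality=0.85),
    Candidate("legacy", cost=6.0, latency=2.0, quality=0.78),
]

frontier = pareto_frontier(candidates)
frontier_names = [c.name for c in frontier]
print(frontier_names)
```

In this toy data both "small" and "large" sit on the frontier, because each wins on at least one axis. Which frontier point to ship first is then a product decision, and the summary above suggests the team weighted latency and cost heavily.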

Skipping the Ultra Tier

The team abandoned the 'Ultra' branding despite scaling up Pro capabilities internally, finding that Flash and Flash-Lite variants better serve billions of real-world users and Google's mission to democratize AI access.

Bidirectional Model Development

Rather than one-way distillation from large to small, DeepMind scales recipes bidirectionally, using Pro to improve Flash while simultaneously scaling Flash techniques up to enhance Pro.

Agent Infrastructure & Antigravity

The Agent Harness as Foundation

DeepMind now ships models integrated with 'Antigravity', a standardized agent harness that provides the infrastructure layer for agentic behaviors. This replaces the old practice of handing isolated models to product teams.

Sub-Agent Orchestration

The harness enables complex multi-agent workflows previewed via the '/teamwork' feature, where sub-agents complete tasks autonomously and power upcoming consumer experiences like Gemini Spark.
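The general shape of sub-agent orchestration can be sketched as a coordinator that fans tasks out to workers and collects their results. This is a generic illustration using Python's asyncio, not the harness's actual API, which the summary does not describe; `sub_agent`, `orchestrate`, and the task names are all hypothetical.

```python
import asyncio


async def sub_agent(name: str, task: str) -> str:
    # Stand-in for real work: an actual sub-agent would call a model and tools here.
    await asyncio.sleep(0)
    return f"{name} finished: {task}"


async def orchestrate(tasks: list[str]) -> list[str]:
    # Fan out one hypothetical sub-agent per task and run them concurrently.
    jobs = [sub_agent(f"agent-{i}", task) for i, task in enumerate(tasks)]
    # asyncio.gather returns results in the same order as the jobs passed in.
    return await asyncio.gather(*jobs)


results = asyncio.run(orchestrate(["research", "draft", "review"]))
for line in results:
    print(line)
```

A real harness would also need to handle failures, shared context, and dependencies between tasks, none of which this sketch attempts.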

Cross-Product Symbiosis

Models are co-designed with the harness to ensure they function simultaneously across Search, the Gemini app, Cloud, and AI Studio, creating consistent agentic experiences across Google's diverse product surfaces.

Multimodal Expansion

Omni Video Generation Launch

Gemini Omni Flash introduces video generation and editing with avatar insertion, representing Google's first major push toward 'any modality in, any modality out' starting with creative video tools.

Gemini Live Upgrades

Real-time conversation capabilities now feature faster response times, significantly improved background noise detection, and smarter interaction patterns designed to feel like a natural partner.

Bottom Line

Google is betting that winning the AI era requires optimizing the cost-performance frontier and embedding standardized agentic infrastructure across its entire product ecosystem rather than merely competing to build the most capable single model.
