The Model Eats the Scaffolding: DeepMind's Logan Kilpatrick & Tulsee Doshi on 3.5 Flash, Omni & More

Cognitive Revolution

| Podcasts | May 20, 2026 | 60.5 Thousand views | 1:01:14

TL;DR

Google DeepMind's Logan Kilpatrick and Tulsee Doshi detail the launch of Gemini 3.5 Flash, Omni video generation, and Spark agent features. They describe a strategic pivot toward cost-adjusted performance and a standardized agent harness ('Antigravity') shared across Google's product ecosystem, rather than competing solely on absolute model capability.

⚡ Flash-First Strategy & The Performance Frontier 3 insights

Prioritizing Speed and Cost Over Raw Scale

Google launched Gemini 3.5 Flash first. It is 3x faster and significantly cheaper than larger models, and it targets latency and cost at consumer scale, on the view that users will not accept longer waits in exchange for marginally better quality.
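
The trade-off described above can be sketched as a Pareto frontier over quality, cost, and latency. This is an illustration only: the option names and every number below are hypothetical placeholders, not figures from the episode or from Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    quality: float    # hypothetical benchmark score, higher is better
    cost: float       # hypothetical price per million tokens, lower is better
    latency_s: float  # hypothetical seconds per response, lower is better


def dominates(a: ModelOption, b: ModelOption) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    at_least = (
        a.quality >= b.quality and a.cost <= b.cost and a.latency_s <= b.latency_s
    )
    strictly = a.quality > b.quality or a.cost < b.cost or a.latency_s < b.latency_s
    return at_least and strictly


def pareto_frontier(options: list[ModelOption]) -> list[ModelOption]:
    """Keep only the options that no other option dominates."""
    return [o for o in options if not any(dominates(p, o) for p in options)]


# Made-up values, chosen only to show the mechanics.
OPTIONS = [
    ModelOption("large", quality=0.90, cost=10.0, latency_s=6.0),
    ModelOption("small-fast", quality=0.86, cost=1.0, latency_s=2.0),
    ModelOption("small-slow", quality=0.80, cost=1.5, latency_s=3.0),
]

frontier_names = [o.name for o in pareto_frontier(OPTIONS)]
print(frontier_names)
```

In this toy data the slower small option drops out because the faster one beats it on every axis, while the large option stays on the frontier only because of its higher quality score.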

Skipping the Ultra Tier

The team abandoned the 'Ultra' branding despite scaling up Pro capabilities internally, finding that Flash and 'Flash-Lite' variants better serve billions of real-world users and Google's mission to democratize AI access.

Bidirectional Model Development

Rather than one-way distillation from large to small, DeepMind scales recipes bidirectionally, using Pro to improve Flash while simultaneously scaling Flash techniques up to enhance Pro.

🔗 Agent Infrastructure & Antigravity 3 insights

The Agent Harness as Foundation

DeepMind now ships models integrated with 'Antigravity', a standardized agent harness that provides the infrastructure layer for agentic behaviors, replacing the old practice of handing isolated models to product teams.

Sub-Agent Orchestration

The harness enables complex multi-agent workflows previewed via the '/teamwork' feature, where sub-agents complete tasks autonomously and power upcoming consumer experiences like Gemini Spark.
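
The general pattern of handing tasks to sub-agents and collecting their results can be sketched with Python's standard asyncio library. This is a generic fan-out example, not Google's harness or API: `run_sub_agent`, `orchestrate`, and the task names are hypothetical, and the model call is replaced by a stub.

```python
import asyncio


async def run_sub_agent(name: str, task: str) -> str:
    """Hypothetical sub-agent: a real one would call a model and tools here."""
    await asyncio.sleep(0)  # stand-in for the model call; yields to the event loop
    return f"{name} finished: {task}"


async def orchestrate(tasks: dict[str, str]) -> list[str]:
    """Run every sub-agent concurrently and return results in submission order."""
    coros = [run_sub_agent(name, task) for name, task in tasks.items()]
    return await asyncio.gather(*coros)


results = asyncio.run(
    orchestrate({"researcher": "collect sources", "writer": "draft summary"})
)
print(results)
```

`asyncio.gather` returns results in the order the coroutines were passed, so the orchestrator can match each result back to its task without extra bookkeeping.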

Cross-Product Symbiosis

Models are co-designed with the harness to ensure they function simultaneously across Search, the Gemini app, Cloud, and AI Studio, creating consistent agentic experiences across Google's diverse product surfaces.

🎬 Multimodal Expansion 2 insights

Omni Video Generation Launch

Gemini Omni Flash introduces video generation and editing with avatar insertion, representing Google's first major push toward 'any modality in, any modality out' starting with creative video tools.

Gemini Live Upgrades

Real-time conversation capabilities now feature faster response times, significantly improved background noise detection, and smarter interaction patterns designed to feel like a natural partner.

Bottom Line

Google is betting that winning the AI era requires optimizing the cost-performance frontier and embedding standardized agentic infrastructure across its entire product ecosystem rather than merely competing to build the most capable single model.
