Stanford CS153 Frontier Systems | Anjney Midha from AMP PBC on Frontier Systems
TL;DR
Anjney Midha frames the current AI landscape as 'the great transition,' where industrial-scale model training meets a complete restructuring of the eight-layer infrastructure stack, while arguing that relationships and obsessions remain the ultimate asymmetric advantages for founders against entrenched incumbents.
🏗️ The Great Infrastructure Transition
Every layer of the stack faces simultaneous disruption
From capital and energy to chips, cloud, training, models, and governance, fundamental assumptions are being revisited across the entire value chain for the first time in decades.
Uncertainty creates architectural opportunity
With leaders like Jensen Huang, Satya Nadella, and Sam Altman all racing to unblock bottlenecks, students have a rare window to redesign systems that have remained static in large organizations.
⚡ Industrialization of Training Compute
Base training now runs at factory scale
Frontier labs conduct base model training twice yearly on 100,000+ B300-equivalent GPUs, shifting from bespoke research to industrial engineering processes.
Mid-training adds capabilities quarterly
Organizations now run mid-training 2-4 times per year using approximately 10% of base training compute to inject new capabilities into existing foundation models.
Reinforcement learning consumes the majority
Post-training reinforcement learning has become so compute-intensive that it now rivals or exceeds the combined compute of all pre-training and mid-training steps, as the back-of-envelope sketch below illustrates.
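To make these magnitudes concrete, here is a minimal back-of-envelope sketch of an annual compute budget under the cadences and ratios quoted above. The GPU count comes from the talk; the 90-day run length and the mid-training cadence midpoint are illustrative assumptions, not figures from the lecture.

```python
# Back-of-envelope annual compute split for a frontier lab, using the
# cadences and ratios quoted in the talk. Run length and the mid-training
# midpoint are assumed for illustration.

BASE_RUNS_PER_YEAR = 2        # base training runs twice yearly (from the talk)
MID_RUNS_PER_YEAR = 3         # mid-training 2-4x/year; midpoint assumed
MID_FRACTION_OF_BASE = 0.10   # mid-training uses ~10% of a base run (from the talk)

GPUS = 100_000                # 100,000+ B300-equivalent GPUs (from the talk)
BASE_RUN_DAYS = 90            # assumed length of one base training run

# Annual GPU-hours per phase.
base = BASE_RUNS_PER_YEAR * GPUS * BASE_RUN_DAYS * 24
mid = MID_RUNS_PER_YEAR * MID_FRACTION_OF_BASE * GPUS * BASE_RUN_DAYS * 24
rl = base + mid               # talk's claim: RL rivals pre- + mid-training combined

total = base + mid + rl
for name, hours in [("base", base), ("mid", mid), ("RL", rl)]:
    print(f"{name:>4}: {hours:.2e} GPU-hours ({hours / total:.0%} of total)")
```

Under these assumptions, RL lands at roughly half of the total annual budget, which is what "parity with everything upstream combined" implies.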
🎯 Asymmetric Advantages for Builders
Relationships outperform rigid career planning
Midha's empirical 'life scaling law' suggests that prioritizing fun with people you trust—such as the roommates he founded AMP with—generates more impact than forecasting-based strategies.
Obsessions are non-scalable assets
While large organizations optimize for scalable efficiency, individuals can leverage specific obsessions and 'things that don't scale' as durable competitive moats.
Bottom Line
As AI training industrializes and RL compute reaches parity with pre-training, your most durable advantage lies in asymmetric bets and deep relationships that large organizations cannot replicate.
More from Stanford Online
Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 9: Scaling Laws
This lecture introduces scaling laws as predictive power-law relationships that enable practitioners to optimize language model training on small budgets and confidently extrapolate performance to million-dollar large-scale runs, while tracing these empirical patterns back to classical machine learning theory and sample complexity research from the 1990s.
Stanford Robotics Seminar ENGR319 | Spring 2026 | Ingredients for Long-Horizon Robot Autonomy
A researcher from Physical Intelligence argues that while robots now excel at short, dexterous tasks, true utility requires long-horizon autonomy for complex jobs like cleaning apartments or assembling server racks. The talk introduces MEM (Multiscale Embodied Memory), a system that uses compressed visual and linguistic memory to solve the latency and distribution shift problems that have historically prevented robots from tracking progress over extended time periods.
Stanford CS547 HCI Seminar | Spring 2026 | Observing the User Experience in 2026
Mike Kuniavsky and Elizabeth Goodman examine how AI has revolutionized UX research by automating traditional methods while simultaneously creating an 'authenticity crisis' through synthetic users and widespread fraud, arguing that maintaining 'ground truth' through direct human contact remains essential for valid insights and organizational influence.
Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 8: Parallelism
This lecture details how to scale language model training across massive clusters using 4D parallelism, contrasting TPU and GPU networking architectures while addressing the critical memory bottlenecks—particularly optimizer states—that dominate training costs at scale.