Stanford Robotics Seminar ENGR319 | Winter 2026 | Gen Control, Action Chunking, Moravec’s Paradox
TL;DR
This seminar reframes Moravec's Paradox through control theory. It shows why robot learning in continuous control suffers from exponentially compounding errors that symbolic tasks avoid, and it identifies action chunking and generative control policies as the algorithmic breakthroughs that enabled the 2023 inflection point in robotic manipulation.
🤖 The Algorithmic Moravec Paradox (3 insights)
Pragmatic vs. algorithmic barriers
Data scarcity explains only part of Moravec's Paradox. In continuous control settings, fundamental algorithmic limitations prevent learning from demonstration even when sufficient data is available.
The 2023 inflection point
Behavior cloning produced surprisingly capable manipulation, such as shirt folding, which triggered industrial interest in scaling these techniques toward more ambitious applications.
Algorithmic prerequisites for scaling
Without specific algorithmic interventions, collecting more data fails to improve performance due to inherent instability in standard behavior cloning approaches.
📉 Fundamental Challenges in Continuous Control (3 insights)
Exponential error accumulation
In continuous control, errors compound exponentially with horizon length, whereas in discrete symbolic tasks they accumulate only linearly.
Inevitable closed-loop instability
Even when experts and dynamics are perfectly stable, any smooth Markovian policy learned via standard methods necessarily induces instability in orthogonal subspaces not seen in training data.
Distribution mismatch problem
Standard squared-loss supervised learning fits the training distribution well but has no control over the rollout distribution, so errors compound once the policy runs on its own and visits states the expert never did.
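The gap between linear and exponential accumulation can be made concrete with a toy recursion. The sketch below is not the seminar's construction: it assumes a scalar deviation `e` that receives a fixed per-step imitation error `eps` and is multiplied each step by a hypothetical closed-loop factor `rho`. With `rho = 1` the deviation grows like `eps * T`; with `rho > 1` it grows like `eps * rho**T`.

```python
def accumulated_error(rho: float, eps: float, horizon: int) -> float:
    """Iterate e_{t+1} = rho * e_t + eps from e_0 = 0 and return e_horizon.

    rho is an assumed closed-loop amplification factor and eps an assumed
    per-step imitation error; both are illustrative, not from the seminar.
    """
    e = 0.0
    for _ in range(horizon):
        e = rho * e + eps
    return e


eps = 1e-3
horizon = 50

# rho = 1: deviations are neither damped nor amplified, so errors add up.
linear = accumulated_error(1.0, eps, horizon)

# rho = 1.2: each step amplifies the existing deviation before adding eps.
geometric = accumulated_error(1.2, eps, horizon)

# Closed form of the geometric series for rho != 1.
closed_form = eps * (1.2 ** horizon - 1) / (1.2 - 1)

print(f"linear accumulation:    {linear:.4f}")
print(f"geometric accumulation: {geometric:.4f}")
```

In this toy model, shrinking `eps` (a better training fit) scales both curves by the same constant, while only a change in `rho` (the closed-loop behavior) removes the exponential dependence on the horizon.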
🔧 Breakthrough Algorithmic Solutions (3 insights)
Action chunking removes Markov constraints
Predicting sequences of future actions rather than single steps removes the Markovian restriction and correlates decisions across time to improve system stability.
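As a minimal sketch of the execution pattern only (not of training, and not of any specific system from the seminar), the loop below queries a policy once per chunk and then runs the returned actions open loop before observing again. The names `rollout_with_chunks`, `toy_policy`, and `integrator_step` are hypothetical, and the dynamics are an assumed one-dimensional integrator.

```python
from typing import Callable, List, Tuple


def rollout_with_chunks(
    policy: Callable[[float, int], List[float]],
    step: Callable[[float, float], float],
    x0: float,
    horizon: int,
    chunk_len: int,
) -> Tuple[List[float], int]:
    """Run `horizon` steps, querying the policy once per chunk of actions."""
    x = x0
    states = [x0]
    queries = 0
    while len(states) - 1 < horizon:
        # One policy call yields a whole sequence of future actions.
        actions = policy(x, chunk_len)
        queries += 1
        for u in actions:
            if len(states) - 1 >= horizon:
                break
            # Actions inside a chunk are executed without re-querying.
            x = step(x, u)
            states.append(x)
    return states, queries


def integrator_step(x: float, u: float) -> float:
    """Assumed toy dynamics: x_{t+1} = x_t + u_t."""
    return x + u


def toy_policy(x: float, chunk_len: int) -> List[float]:
    """Hypothetical chunk policy: reach the origin in chunk_len equal steps."""
    return [-x / chunk_len] * chunk_len


states, queries = rollout_with_chunks(
    toy_policy, integrator_step, x0=8.0, horizon=8, chunk_len=4
)
print(states, queries)
```

Setting `chunk_len=1` recovers the usual single-step loop, where every action depends only on the current observation.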
Generative control captures multi-modality
Using generative models to predict action distributions captures bifurcations and multiple behavioral modes that deterministic policies cannot represent.
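A small illustration of why a single deterministic prediction can miss behavioral modes, under assumed data: suppose demonstrations pass an obstacle by steering left (`-1.0`) or right (`+1.0`) equally often. The constant that minimizes squared loss is the mean of the actions, which matches neither mode. Resampling the demonstrations stands in here for a learned generative model; that substitution and the function name `sample_action` are assumptions made for the sketch.

```python
import random
from statistics import fmean

# Assumed demonstrations for one observation: half steer left, half steer right.
demo_actions = [-1.0] * 50 + [1.0] * 50

# The constant prediction that minimizes squared loss is the sample mean,
# which here lies between the two modes (straight at the obstacle).
mse_action = fmean(demo_actions)


def sample_action(rng: random.Random) -> float:
    """Stand-in for sampling from a generative policy: draw a demonstrated action."""
    return rng.choice(demo_actions)


rng = random.Random(0)
samples = [sample_action(rng) for _ in range(10)]

print("squared-loss action:", mse_action)
print("sampled actions:   ", samples)
```

Every sampled action is one that an expert actually took, whereas the squared-loss action was never demonstrated.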
Reparameterizing closed-loop dynamics
These interventions effectively reparameterize the interaction between robot and learner, shifting the problem into a regime where the 'bitter lesson' of data scaling becomes effective.
Bottom Line
To overcome Moravec's Paradox in robotics, practitioners must implement action chunking and generative control policies, because standard behavior cloning inevitably induces exponentially compounding errors regardless of dataset size.