Stanford Robotics Seminar ENGR319 | Spring 2026 | Leveraging Geometry in Robot Learning

Stanford Online

Podcasts | June 04, 2026 | 4.58 thousand views | 1:03:31

TL;DR

Rob Platt argues that modern Vision-Language-Action (VLA) models discard geometric structure and therefore need massive datasets to relearn physical constraints. He proposes hybrid approaches that embed geometric symmetries (equivariance) directly into learning architectures, enabling data-efficient robot policies that respect physical laws.

🔄 The Problem: Two Extremes in Robotics 3 insights

Hand-coded geometric models dominated for decades

Before the 2010s, robotics relied on structured geometric models and CAD-based planning. These methods were powerful but brittle: when their assumptions about the world were wrong, for example when an object's location was misestimated, they often failed.

Modern VLAs solve brittleness with massive data

Current generalist models like RT-2 and Octo learn directly from data to overcome that brittleness, but they require enormous training datasets and discard geometric priors entirely.

The missing middle question

Platt explores whether machine learning models can incorporate geometry, mechanics, or physics to achieve data efficiency without sacrificing the generalization benefits of learning.

🏗️ The Flaw in Current VLA Architectures 3 insights

Disembodied reasoning destroys geometry

Standard VLAs use vision encoders followed by self-attention layers that obliterate geometric position encodings, reducing the world to a latent space with no physical structure.

Inefficient relearning of spatial relationships

Because these models discard geometric structure early, they must relearn basic spatial reasoning from scratch, driving up data requirements for physical tasks.

Uniform architectural pattern across models

Most current VLAs follow the same template regardless of specific implementation: a pre-trained visual encoder (such as CLIP or ResNet) → a self-attention or diffusion transformer → an action head.

⚖️ Equivariance: Encoding Physical Symmetry 3 insights

Noether's theorem inspires the approach

Drawing from Emmy Noether's work linking symmetries to conservation laws, Platt argues that embedding translation and rotation symmetries into models improves physical reasoning.

Equivariant neural networks hard-code constraints

These networks constrain their layers so that a transformation applied to the input (e.g., rotating a point cloud) produces the corresponding transformation of the output (e.g., a rotated action trajectory).
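Stated as an equation, the constraint reads as follows. The notation is assumed here for illustration and is not taken from the talk: f is a network layer, G is the symmetry group, and ρ_in and ρ_out describe how a group element g acts on inputs and outputs.

```latex
f\big(\rho_{\text{in}}(g)\, x\big) = \rho_{\text{out}}(g)\, f(x)
\qquad \text{for all } g \in G .
```

In the point-cloud example, ρ_in(g) rotates every input point and ρ_out(g) rotates every waypoint of the predicted trajectory by the same rotation.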

Dramatic parameter reduction example

Enforcing C4 rotation symmetry (rotations in 90° increments) reduces a convolution kernel's free parameters from 18 to 5, while mathematically guaranteeing that the layer never violates equivariance under those rotations.
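The parameter-sharing idea can be checked numerically on a small case. The sketch below is an illustration under simple assumptions, not the construction used in the talk: it takes a single-channel 3×3 kernel, ties its weights by averaging over the four 90° rotations, counts the remaining free parameters, and verifies that filtering commutes with rotating the image. The helper names `c4_symmetrize`, `count_free_parameters`, and `correlate_valid` are made up for this example.

```python
import numpy as np

def c4_symmetrize(kernel):
    """Average a square kernel over its four 90-degree rotations (ties the weights)."""
    return sum(np.rot90(kernel, r) for r in range(4)) / 4.0

def count_free_parameters(size):
    """Count groups of kernel positions that are forced to share one weight."""
    idx = np.arange(size * size).reshape(size, size)
    rotated = [np.rot90(idx, r).ravel() for r in range(4)]
    orbits = {frozenset(int(rot[p]) for rot in rotated) for p in range(size * size)}
    return len(orbits)

def correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation, written out to avoid extra dependencies."""
    k = kernel.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

rng = np.random.default_rng(0)
kernel = c4_symmetrize(rng.normal(size=(3, 3)))
image = rng.normal(size=(8, 8))

# Equivariance check: rotating the input first must match rotating the output.
rotate_then_filter = correlate_valid(np.rot90(image), kernel)
filter_then_rotate = np.rot90(correlate_valid(image, kernel))

free_params = count_free_parameters(3)
is_equivariant = np.allclose(rotate_then_filter, filter_then_rotate)
print(free_params, is_equivariant)
```

For this 3×3 kernel the nine weights collapse to three shared values (center, edges, corners), and the check passes for any input image. The 18-to-5 figure quoted from the talk refers to a kernel setup that this summary does not specify, so the counts here are not expected to match it.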

🎯 Equivariant Diffusion Policy Implementation 3 insights

Point cloud geometric representation

The model encodes scenes as point clouds processed by equivariant transformers that respect finite subgroups of SE(3), maintaining geometric structure throughout the network rather than discarding it.

End-to-end symmetry preservation

Both the encoder and diffusion action head maintain equivariance properties, ensuring that rotating the input scene automatically rotates the generated motion plan without additional learning.
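What "rotating the input scene rotates the motion plan" means can be shown with a toy example. The sketch below is a hand-written stand-in, not the equivariant diffusion policy from the talk: the hypothetical `toy_policy` simply returns straight-line waypoints from the gripper to the centroid of the point cloud, which happens to be rotation-equivariant by construction. The check compares planning in a rotated scene against rotating the original plan.

```python
import numpy as np

def rotation_z(theta):
    """3x3 rotation matrix about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def toy_policy(points, gripper, steps=5):
    """Hypothetical policy: straight-line waypoints from gripper to cloud centroid."""
    target = points.mean(axis=0)
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return gripper + t * (target - gripper)

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))      # scene as an (N, 3) point cloud
gripper = np.array([0.5, -0.2, 0.3])    # current gripper position
R = rotation_z(np.pi / 2)               # a 90-degree rotation of the whole scene

plan = toy_policy(points, gripper)
# Points are stored as rows, so rotating every point is a right-multiply by R.T.
plan_in_rotated_scene = toy_policy(points @ R.T, gripper @ R.T)
is_equivariant = np.allclose(plan_in_rotated_scene, plan @ R.T)
print(is_equivariant)
```

A learned policy without this property would need demonstrations at each orientation to produce the same behavior. An equivariant architecture gets it for the rotations in its symmetry group without further training.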

Empirical validation on limited data

Benchmarked on MimicGen manipulation tasks with only 100 to 1,000 demonstrations, the approach outperforms standard diffusion policies and ACT (Action Chunking with Transformers), particularly when trained without large pre-training datasets.

Bottom Line

Robot learning models should embed geometric equivariance (symmetry constraints) directly into their neural network architectures. Doing so drastically reduces data requirements and keeps behavior physically consistent, instead of discarding geometric structure and forcing models to relearn it from massive datasets.
