Stanford AA228V | Validation of Safety Critical Systems | Explainability

Stanford Online

| Podcasts | April 10, 2026 | 1.62 Thousand views | 1:13:42

TL;DR

This lecture reviews Project 3 results on reachability analysis, then introduces explainability methods for safety-critical AI systems. The focus is on attributing failures to specific time steps using Shapley values from game theory, which remain informative when simple ablation studies fail because of correlated noise patterns.

🏆 Project 3 Results & Verification Techniques 3 insights

AI-squared dominance on large systems

Top leaderboard performers achieved tightly clustered scores (0.70-0.72) using AI-squared verification techniques for large-scale systems, significantly outperforming other approaches.

Advanced geometric methods for small systems

Winning solutions employed zonotopes and PCA-aligned rectangles rather than simple axis-aligned box approximations, capturing more accurate reachable sets.
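To illustrate why a zonotope can capture a reachable set more tightly than an axis-aligned box, here is a minimal sketch for a single linear step. It is not the code used by any project submission; the function names and the matrix A are hypothetical and chosen only for illustration. A zonotope is stored as a center plus a list of generator vectors, and its image under a linear map is again a zonotope, so no over-approximation is introduced at that step.

```python
def box_to_zonotope(lo, hi):
    """Write the box [lo, hi] as a zonotope: a center plus one generator per axis."""
    n = len(lo)
    center = [(l + h) / 2.0 for l, h in zip(lo, hi)]
    generators = [
        [(hi[k] - lo[k]) / 2.0 if d == k else 0.0 for d in range(n)]
        for k in range(n)
    ]
    return center, generators

def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def linear_map(center, generators, A):
    """Image of a zonotope under x -> A x: map the center and every generator."""
    return matvec(A, center), [matvec(A, g) for g in generators]

def bounding_box(center, generators):
    """Tightest axis-aligned box that contains the zonotope."""
    radius = [sum(abs(g[d]) for g in generators) for d in range(len(center))]
    lo = [c - r for c, r in zip(center, radius)]
    hi = [c + r for c, r in zip(center, radius)]
    return lo, hi

def zonotope_area_2d(generators):
    """Area of a 2D zonotope: 4 * sum over generator pairs of |det(g_i, g_j)|."""
    total = 0.0
    for i in range(len(generators)):
        for j in range(i + 1, len(generators)):
            gi, gj = generators[i], generators[j]
            total += abs(gi[0] * gj[1] - gi[1] * gj[0])
    return 4.0 * total

# Hypothetical one-step dynamics: a scaled 45-degree rotation of the unit box.
A = [[1.0, 1.0], [-1.0, 1.0]]
center, gens = linear_map(*box_to_zonotope([-1.0, -1.0], [1.0, 1.0]), A)
lo, hi = bounding_box(center, gens)
zono_area = zonotope_area_2d(gens)                 # exact image of the box
box_area = (hi[0] - lo[0]) * (hi[1] - lo[1])       # axis-aligned over-approximation
```

In this toy case the axis-aligned box has twice the area of the exact zonotope image, and that slack compounds if the box is propagated through further steps.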

Second-order Taylor expansions improve accuracy

For medium systems, utilizing Hessian matrices for second-order Taylor expansion provided measurable performance gains over first-order linearization methods.
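As a rough illustration of the gain from the Hessian term, the sketch below compares first- and second-order Taylor approximations of a hypothetical scalar function f(x, y) = exp(x) cos(y) around the origin. The function, the expansion point, and all names are assumptions made for the example and are not taken from the project.

```python
import math

def f(x, y):
    """Hypothetical smooth nonlinearity used only for this example."""
    return math.exp(x) * math.cos(y)

def gradient(x, y):
    """Analytic gradient (df/dx, df/dy)."""
    return (math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y))

def hessian(x, y):
    """Analytic Hessian entries (fxx, fxy, fyy)."""
    e, c, s = math.exp(x), math.cos(y), math.sin(y)
    return (e * c, -e * s, -e * c)

def taylor1(x, y, x0, y0):
    """First-order (linearized) approximation around (x0, y0)."""
    gx, gy = gradient(x0, y0)
    return f(x0, y0) + gx * (x - x0) + gy * (y - y0)

def taylor2(x, y, x0, y0):
    """Second-order approximation: linear part plus the quadratic Hessian term."""
    dx, dy = x - x0, y - y0
    fxx, fxy, fyy = hessian(x0, y0)
    quadratic = 0.5 * (fxx * dx * dx + 2.0 * fxy * dx * dy + fyy * dy * dy)
    return taylor1(x, y, x0, y0) + quadratic

# Compare approximation errors at a nearby query point.
x, y = 0.2, 0.1
err_first = abs(f(x, y) - taylor1(x, y, 0.0, 0.0))
err_second = abs(f(x, y) - taylor2(x, y, 0.0, 0.0))
```

At this query point the second-order error is more than an order of magnitude smaller than the first-order error, which is the kind of tightening that lets a reachability method carry a smaller error bound.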

⚠️ The Safety-Critical Failure Scenario 2 insights

Post-incident stakeholder pressure

Chief engineers at autonomous-vehicle companies such as Waymo, or at aviation firms, face intense scrutiny when a rare catastrophic failure follows thousands of successful operating hours, and they must give immediate explanations to CEOs, investors, and regulators.

Three critical post-failure questions

Engineers must definitively answer why the specific failure occurred, what system or dataset modifications will prevent recurrence, and how to formally guarantee to stakeholders that the issue is resolved.

⏱️ Temporal Root Cause Analysis 2 insights

Limitations of leave-one-out analysis

Simple ablation studies that zero out individual noise variables at specific time steps often fail to identify failure causes because catastrophic outcomes frequently stem from correlated patterns across multiple consecutive steps.
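The failure mode of leave-one-out analysis can be reproduced on a toy problem. In this minimal sketch, the failure rule, the noise sequence, and the function names are all hypothetical: the system "fails" when the disturbances at steps 2 to 4 sum to at least 2, so any single step is redundant and zeroing it alone never changes the outcome.

```python
def fails(noise):
    """Hypothetical failure rule: steps 2, 3, 4 together push the system too far."""
    return sum(noise[2:5]) >= 2.0

def ablate(noise, steps):
    """Return a copy of the noise sequence with the given time steps zeroed out."""
    return [0.0 if t in steps else w for t, w in enumerate(noise)]

def leave_one_out(noise):
    """Per-step attribution: 1 if zeroing that step alone prevents the failure."""
    base = int(fails(noise))
    return [base - int(fails(ablate(noise, {t}))) for t in range(len(noise))]

noise = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]

single_step_scores = leave_one_out(noise)        # every entry is 0
group_prevents_failure = not fails(ablate(noise, {2, 3}))  # True
```

Every single-step score is zero even though the trajectory fails, while removing steps 2 and 3 together prevents the failure. The cause is only visible at the level of groups of steps.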

Group-based noise attribution required

Analyzing groups of time steps rather than isolated events is necessary to detect redundancy and synergy effects in noise sequences that drive systems into failure regimes.

🎲 Shapley Values for Rigorous Attribution 3 insights

Game theory foundations for ML explainability

Shapley values, introduced by Lloyd Shapley in 1953 as part of cooperative game theory, provide a mathematically rigorous framework for attributing system failures to specific input variables. Each variable's value is its marginal contribution to the outcome, averaged over all possible subsets of the other variables.
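For reference, the standard definition can be written as follows. The notation is an assumption made here, not taken from the lecture: N is the set of n noise variables (for example, one per time step), and v(S) is the failure measure obtained when only the variables in S are kept and the rest are set to a baseline such as zero.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(n - |S| - 1\bigr)!}{n!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr],
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The second identity is the efficiency property: the attributions add up exactly to the difference between the full outcome and the baseline outcome.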

Handling redundancy and synergy

Unlike simple ablation, Shapley values correctly account for scenarios where multiple noise variables are redundant or exhibit synergy, providing precise numerical attribution for each variable's contribution to failure.
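A small exact computation makes the redundancy and synergy cases concrete. This is a sketch under assumed toy games, not the lecture's implementation: there are three variables, variable 2 never matters, and variables 0 and 1 are either redundant (either one alone causes failure) or synergistic (both are needed). All names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(n, value):
    """Exact Shapley values by enumerating every subset of the other variables."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset S that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def redundant(kept):
    """Failure occurs if variable 0 OR variable 1 is present."""
    return 1.0 if (0 in kept or 1 in kept) else 0.0

def synergistic(kept):
    """Failure occurs only if variable 0 AND variable 1 are both present."""
    return 1.0 if (0 in kept and 1 in kept) else 0.0

phi_redundant = shapley_values(3, redundant)
phi_synergy = shapley_values(3, synergistic)

# Leave-one-out on the redundant game gives no signal: removing either
# variable from the full set leaves the failure in place.
everyone = frozenset({0, 1, 2})
loo_redundant = [redundant(everyone) - redundant(everyone - {i}) for i in range(3)]
```

In both games the two relevant variables split the credit equally (0.5 each) and the irrelevant variable gets zero, whereas leave-one-out assigns zero to everything in the redundant case.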

Computational challenges in long trajectories

Applying Shapley values to safety-critical trajectories with 40+ time steps is computationally demanding because the exact method evaluates every subset of variables: 40 time steps already yield 2^40 (about 1.1 trillion) subsets.
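One common workaround, which this summary does not attribute to the lecture, is to estimate Shapley values by sampling random orderings of the variables and averaging each variable's marginal contribution at the moment it is added. The sketch below assumes a hypothetical 40-step failure rule in which at least two of steps 10, 11, and 12 must be disturbed; the names and the sample count are illustrative.

```python
import random

def shapley_monte_carlo(n, value, num_permutations=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(order)
        kept = set()
        previous = value(kept)
        for i in order:
            kept.add(i)                      # add variable i to the coalition
            current = value(kept)
            phi[i] += current - previous     # its marginal contribution in this ordering
            previous = current
    return [p / num_permutations for p in phi]

CRITICAL_STEPS = {10, 11, 12}  # hypothetical steps that drive the failure

def toy_failure(kept):
    """Failure if at least two of the critical steps keep their disturbance."""
    return 1.0 if len(kept & CRITICAL_STEPS) >= 2 else 0.0

estimates = shapley_monte_carlo(40, toy_failure)
```

Each ordering costs n evaluations of the failure measure, so 2000 orderings over 40 steps need on the order of 80,000 evaluations instead of 2^40. For this toy rule the exact values are 1/3 for each critical step and 0 elsewhere, and the estimates converge toward them as the number of orderings grows.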

Bottom Line

Implement Shapley value analysis to rigorously attribute failures to specific correlated noise patterns across time steps, enabling targeted system modifications and verifiable guarantees to stakeholders.
