Stanford MS&E435 Economics of the AI Supercycle | Spring 2026 | Applications, Applied AI
TL;DR
Baseten CEO Tuhin Srivastava explains why AI inference is shifting from frontier models to custom post-trained models as companies scale, driven by 70-90% cost savings, latency requirements, and the strategic need to own proprietary data rather than feed it to potential competitors.
💰 The Economic Imperative of Custom Models 3 insights
The 90/5 spending reversal
While 90-95% of current inference spend flows to frontier models, successful application companies must shift to custom post-trained models to achieve viable gross margins (40-70%).
Cost-performance parity gap
Open-source models lag frontier models by roughly 90 days but cost 70-90% less to run, making them economically essential once companies reach scale.
Existential scaling pressure
High-volume AI applications currently run negative gross margins on frontier APIs, making the transition to optimized custom models a business-critical necessity for survival.
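The margin argument in this section can be illustrated with a little arithmetic. The sketch below is only an illustration: the revenue and cost figures are hypothetical, not numbers from the talk, and inference is the only cost modeled. It shows how a 70-90% reduction in inference cost can turn a negative gross margin positive.

```python
def gross_margin(revenue: float, inference_cost: float) -> float:
    """Gross margin as a fraction of revenue, with inference as the only cost."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - inference_cost) / revenue


def margin_after_savings(revenue: float, frontier_cost: float, savings: float) -> float:
    """Gross margin after cutting inference cost by `savings` (0.7 means 70%)."""
    custom_cost = frontier_cost * (1.0 - savings)
    return gross_margin(revenue, custom_cost)


if __name__ == "__main__":
    # Hypothetical figures: $100 of revenue served with $120 of frontier API spend.
    revenue, frontier_cost = 100.0, 120.0
    print(f"frontier API margin: {gross_margin(revenue, frontier_cost):.0%}")
    for savings in (0.70, 0.90):
        margin = margin_after_savings(revenue, frontier_cost, savings)
        print(f"margin at {savings:.0%} cost savings: {margin:.0%}")
```

Because other costs of goods sold are left out, the margins this prints are upper bounds for the hypothetical company.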
⚡ Infrastructure Differentiation 3 insights
Optimization as managed service
Unlike raw cloud providers (AWS, GCP, CoreWeave), Baseten handles complex performance optimizations, reducing latency for multi-model workflows like Wispr Flow's speech-to-text pipeline.
Multi-cloud resilience
The platform provides fault-tolerant inference across multiple cloud providers, ensuring reliability without customers managing infrastructure complexity themselves.
Developer experience layer
The platform offers the integrated security, observability, and flexibility that application companies need but cannot easily build on top of commodity compute offerings.
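The multi-cloud resilience idea above can be sketched as a simple ordered failover. This is a minimal illustration, not the platform's actual mechanism: the provider names and the `infer_with_failover` helper are hypothetical, and each provider is modeled as a plain Python callable standing in for a real inference endpoint.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has failed."""


def infer_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, output) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for endpoints hosted in two different clouds.
def cloud_a(prompt: str) -> str:
    raise ConnectionError("region unavailable")


def cloud_b(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(infer_with_failover("hello", [("cloud_a", cloud_a), ("cloud_b", cloud_b)]))
```

A production system would add health checks, timeouts, and load-aware routing; the sketch only shows the control flow that hides a provider outage from the caller.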
🔒 Owning Your Intelligence 3 insights
Defensibility through data ownership
Companies must 'own their intelligence' by post-training on proprietary workflow data to prevent frontier labs from capturing unique user signals and competing directly against them.
End-to-end post-training workflow
Baseten enables customers to define utility functions (e.g., minimizing medical transcription errors), provide proprietary datasets, and deploy specialized models without managing ML infrastructure.
Avoiding the East India Company trap
Using frontier APIs risks vendors 'post-training against' customer workflows using shared data, whereas custom models keep strategic advantages proprietary.
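To make the "utility function" idea in this section concrete, here is one possible sketch for the transcription example. It assumes word error rate (WER) as the error measure, which is a common choice for speech-to-text but is not confirmed by the talk as the metric used; the function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two transcripts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Levenshtein distance over words, keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def utility(reference: str, hypothesis: str) -> float:
    """Higher is better: the negated word error rate."""
    return -word_error_rate(reference, hypothesis)


if __name__ == "__main__":
    reference = "patient takes 10 mg daily"
    print(utility(reference, "patient takes 10 mg daily"))
    print(utility(reference, "patient takes 100 mg daily"))
```

A domain-specific utility might weight errors on drug names or dosages more heavily than errors on filler words, which is the kind of signal that only proprietary workflow data would supply.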
🏢 Business Model & Market Position 2 insights
Powering scaled AI applications
Baseten runs mission-critical infrastructure for companies like Wispr Flow and Abridge, simultaneously operating 20+ specialized models per customer with strict reliability requirements.
Pricing model evolution
Baseten currently monetizes through a markup on GPU compute (H100/B200) that reflects the value of its software stack, and is transitioning toward token-based pricing to demonstrate clear savings versus frontier APIs.
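One way to see why token-based pricing makes savings easier to demonstrate is to convert a GPU-hour price into an effective price per million tokens and compare it with an API's per-token price. The sketch below does that conversion; all prices, throughputs, and function names are hypothetical assumptions for illustration, not figures from the talk.

```python
def cost_per_million_tokens(
    gpu_hourly_usd: float, tokens_per_second: float, utilization: float = 1.0
) -> float:
    """Effective USD per one million tokens for a GPU billed by the hour."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def savings_vs_api(custom_usd_per_million: float, api_usd_per_million: float) -> float:
    """Fractional savings of the custom deployment relative to the API price."""
    return 1.0 - custom_usd_per_million / api_usd_per_million


if __name__ == "__main__":
    # Hypothetical: a $3.60/hour GPU sustaining 1,000 tokens/second at full utilization,
    # compared with an API priced at $4.00 per million tokens.
    custom = cost_per_million_tokens(gpu_hourly_usd=3.60, tokens_per_second=1000)
    print(f"custom: ${custom:.2f} per million tokens")
    print(f"savings vs API: {savings_vs_api(custom, 4.00):.0%}")
```

The `utilization` parameter matters in practice: a GPU that sits idle half the time doubles the effective per-token cost, which is one reason per-GPU pricing makes the comparison harder for customers to read.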
Bottom Line
To build a defensible, profitable AI business at scale, companies must transition from frontier APIs to owning their intelligence through post-trained custom models optimized for their specific workflows and data.