Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 9: Scaling Laws
This lecture introduces scaling laws: predictive power-law relationships that let practitioners tune language model training on small compute budgets and then confidently extrapolate performance to million-dollar large-scale runs. It also traces these empirical patterns back to classical machine learning theory and sample complexity research from the 1990s.
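
To make the idea of a predictive power law concrete, here is a minimal sketch of fitting one on small runs and extrapolating to a larger one. It assumes the simplest functional form, loss = a * size^(-b), with no irreducible-loss term. The function names (`fit_power_law`, `predict_loss`) and all numbers are hypothetical and synthetic, chosen for illustration only; they are not taken from the lecture or from any real training run.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-b) by least squares in log-log space.

    A power law is a straight line on log-log axes:
        log(loss) = log(a) - b * log(size)
    so a degree-1 polynomial fit recovers log(a) and -b.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), deg=1)
    return float(np.exp(intercept)), float(-slope)

def predict_loss(size, a, b):
    """Evaluate the fitted power law at a (possibly much larger) size."""
    return a * size ** (-b)

# Synthetic, noise-free points generated from a known power law.
# These are NOT real measurements; they only illustrate the mechanics.
true_a, true_b = 10.0, 0.1
small_sizes = np.array([1e6, 1e7, 1e8])           # hypothetical small-run sizes
small_losses = true_a * small_sizes ** (-true_b)  # hypothetical losses

a_hat, b_hat = fit_power_law(small_sizes, small_losses)

# Extrapolate two orders of magnitude beyond the largest small run.
extrapolated = predict_loss(1e10, a_hat, b_hat)
print(f"a={a_hat:.3f}, b={b_hat:.3f}, predicted loss at 1e10: {extrapolated:.3f}")
```

With real data the points are noisy and the extrapolation carries uncertainty, so in practice one would likely fit across several orders of magnitude and check the prediction against a held-out larger run before trusting it.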