The "Final Boss" of Deep Learning
TL;DR
Despite consuming hundreds of billions of operations per token, current large language models fail at reliable arithmetic and algorithmic reasoning, revealing a fundamental limitation that tool use cannot fix; the path forward requires categorical deep learning, which can supply the unifying theoretical framework that geometric deep learning cannot.
🧮 The Arithmetic and Reasoning Deficit
Pattern Matching vs. Algorithms
Language models approximate addition through memorized digit patterns rather than executing algorithms, failing when inputs deviate slightly from training distributions (e.g., changing one 8 to 7 in a sequence).
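For contrast, the actual addition algorithm is trivial to state and never degrades on out-of-distribution digits. A minimal sketch (the function name and string-based interface are illustrative, not from the episode):

```python
# Digit-by-digit addition with carries -- the algorithm the episode argues
# LLMs approximate via memorized patterns rather than actually executing.
def add_digits(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal strings."""
    a, b = a.zfill(len(b)), b.zfill(len(a))      # pad to equal length
    carry, out = 0, []
    for da, db in zip(reversed(a), reversed(b)): # least-significant first
        s = int(da) + int(db) + carry
        out.append(str(s % 10))
        carry = s // 10
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

# Changing any single digit never breaks the procedure:
print(add_digits("88888888", "7"))   # -> 88888895
print(add_digits("88888878", "7"))   # -> 88888885
```

Because each step depends only on two digits and a carry, the procedure is exactly as reliable on unseen digit sequences as on familiar ones, which is the robustness property the summary says pattern matching lacks.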
Compute-Robustness Misalignment
Frontier models perform hundreds of billions of multiplications per token yet cannot reliably multiply small numbers, indicating a profound disconnect between scale and reasoning capability.
Physics Approximation
Video-generation systems like Veo and Genie produce visually plausible simulations, but these remain approximations rather than accurate encapsulations of Newtonian physics suitable for robotics.
🔧 Limits of Tool Augmentation
Iterative Instability
While calculators and MCP servers can augment models, relying on external tools creates inefficiency and instability in complex reasoning chains that require multiple intermediate calculations.
Compositional Barriers
External tool use prevents seamless composition of reasoning steps, forcing expensive backtracking when intermediate results surprise the model rather than flowing through an intrinsic computational mechanism.
Architectural Necessity
Sustainable progress requires internalizing basic computation and algorithmic primitives within neural architectures rather than depending on external API calls.
🌐 Geometric Deep Learning Constraints
Symmetry-Driven Efficiency
Geometric deep learning leverages equivariance to symmetries (permutations, translations) to achieve exponential reductions in data requirements, with transformers exemplifying permutation equivariance.
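The transformer claim can be checked directly: self-attention without positional encodings commutes with any reordering of the tokens. A minimal numpy sketch (single head, no masking, all weight names illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Position-free single-head attention over token matrix X (n_tokens, d).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V

rng = np.random.default_rng(0)
n, d = 5, 4
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
P = np.eye(n)[rng.permutation(n)]   # a random permutation matrix

# Equivariance: attending to permuted tokens = permuting the attention output.
assert np.allclose(self_attention(P @ X, Wq, Wk, Wv),
                   P @ self_attention(X, Wq, Wk, Wv))
```

This is the symmetry that geometric deep learning exploits: the model need never learn separately how to handle each of the n! orderings of the same token set.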
Invertibility Limitations
Current geometric frameworks rely on group theory that assumes invertible transformations, making them incapable of modeling algorithmic computation that irreversibly destroys information.
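The contrast is easy to make concrete: group actions are invertible by definition, while ordinary algorithmic steps routinely collapse many inputs into one output. A small illustrative sketch (examples chosen for clarity, not taken from the episode):

```python
# A group action on sequences: cyclic rotation, which always has an inverse.
def rotate(xs, k):
    k %= len(xs)
    return xs[k:] + xs[:k]

assert rotate(rotate([1, 2, 3], 1), -1) == [1, 2, 3]   # undoable

# An ordinary computational step: max() irreversibly destroys information.
# Distinct inputs map to the same output, so no inverse can exist -- and
# group theory has no vocabulary for such non-invertible operations.
assert max([1, 2, 3]) == max([3, 3, 3]) == 3
```

Any framework built solely on groups therefore cannot describe reductions, pooling, or branching control flow, which is the gap the categorical approach below is meant to close.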
Hybrid Success Models
Breakthroughs like AlphaCode, FunSearch, and AlphaGeometry achieve results only by combining language models with algorithmic procedures (verifiers, genetic algorithms) that provide correctness guarantees.
🧮 Category Theory as the Unifying Framework
Partial Compositionality
Category theory generalizes geometric deep learning to handle 'algebra with colors' (partial compositionality), accommodating non-invertible operations and dimension constraints like matrix multiplication that group theory cannot express.
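Matrix multiplication illustrates the 'colors': an (m, n) matrix is a typed arrow from dimension n to dimension m, and composition is only defined when the types line up. A minimal numpy sketch of this partial, non-invertible composition (the framing as arrows is the categorical reading, not library behavior):

```python
import numpy as np

# "Colors" = dimensions. An (m, n) matrix is an arrow n -> m; matmul is
# composition, defined only when the inner dimensions match. Group theory
# assumes one carrier set and invertible elements, so it cannot express this.
f = np.ones((3, 5))   # arrow 5 -> 3 (non-square, hence non-invertible)
g = np.ones((5, 2))   # arrow 2 -> 5

h = f @ g             # composes: arrow 2 -> 3, shape (3, 2)

try:
    g @ f             # ill-typed: inner dimensions 2 and 3 don't match
except ValueError:
    print("composition undefined for mismatched dimensions")
```

Partial, typed composition of non-invertible maps is exactly the structure a category captures and a group cannot.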
Systematic Discovery
Acting as a 'periodic table' for neural networks, category theory provides systematic rules that automatically generate meaningful architectural constraints without empirical trial and error.
Unifying Foundation
This framework potentially unifies probabilistic, neuroscience, and optimization perspectives in deep learning while enabling true compositional reasoning in latent spaces.
Bottom Line
The field must transition from empirical architecture search to principled design using categorical deep learning to internalize algorithmic reasoning and compositionality, rather than merely scaling current models or bolting on external tools.