The "Final Boss" of Deep Learning
TL;DR
Despite consuming hundreds of billions of operations per token, current large language models fail at reliable arithmetic and algorithmic reasoning, revealing a fundamental limitation that tool use cannot fix; the path forward requires categorical deep learning, which can supply the unifying theoretical framework that geometric deep learning cannot.
🧮 The Arithmetic and Reasoning Deficit
Pattern Matching vs. Algorithms
Language models approximate addition through memorized digit patterns rather than executing algorithms, failing when inputs deviate slightly from training distributions (e.g., changing one 8 to 7 in a sequence).
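For contrast, the actual addition algorithm is trivial to state and never degrades on out-of-distribution digits. A minimal sketch (the function name and string-based interface are illustrative, not from the episode):

```python
# Digit-by-digit addition with carries -- the algorithm the episode argues
# LLMs approximate via memorized patterns rather than actually executing.
def add_digits(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal strings."""
    a, b = a.zfill(len(b)), b.zfill(len(a))      # pad to equal length
    carry, out = 0, []
    for da, db in zip(reversed(a), reversed(b)): # least-significant first
        s = int(da) + int(db) + carry
        out.append(str(s % 10))
        carry = s // 10
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

# Changing any single digit never breaks the procedure:
print(add_digits("88888888", "7"))   # -> 88888895
print(add_digits("88888878", "7"))   # -> 88888885
```

Because each step depends only on two digits and a carry, the procedure is exactly as reliable on unseen digit sequences as on familiar ones, which is the robustness property the summary says pattern matching lacks.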
Compute-Robustness Misalignment
Frontier models perform hundreds of billions of multiplications per token yet cannot reliably multiply small numbers, indicating a profound disconnect between scale and reasoning capability.
Physics Approximation
Video-generation systems like Veo and Genie produce visually plausible simulations, but these remain approximations rather than accurate encapsulations of Newtonian physics suitable for robotics.
🔧 Limits of Tool Augmentation
Iterative Instability
While calculators and MCP servers can augment models, relying on external tools creates inefficiency and instability in complex reasoning chains that require multiple intermediate calculations.
Compositional Barriers
External tool use prevents seamless composition of reasoning steps, forcing expensive backtracking when intermediate results surprise the model rather than flowing through an intrinsic computational mechanism.
Architectural Necessity
Sustainable progress requires internalizing basic computation and algorithmic primitives within neural architectures rather than depending on external API calls.
🌐 Geometric Deep Learning Constraints
Symmetry-Driven Efficiency
Geometric deep learning leverages equivariance to symmetries (permutations, translations) to achieve exponential reductions in data requirements, with transformers exemplifying permutation equivariance.
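The transformer claim can be checked directly: self-attention without positional encodings commutes with any reordering of the tokens. A minimal numpy sketch (single head, no masking, all weight names illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Position-free single-head attention over token matrix X (n_tokens, d).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V

rng = np.random.default_rng(0)
n, d = 5, 4
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
P = np.eye(n)[rng.permutation(n)]   # a random permutation matrix

# Equivariance: attending to permuted tokens = permuting the attention output.
assert np.allclose(self_attention(P @ X, Wq, Wk, Wv),
                   P @ self_attention(X, Wq, Wk, Wv))
```

This is the symmetry that geometric deep learning exploits: the model need never learn separately how to handle each of the n! orderings of the same token set.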
Invertibility Limitations
Current geometric frameworks rely on group theory that assumes invertible transformations, making them incapable of modeling algorithmic computation that irreversibly destroys information.
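The contrast is easy to make concrete: group actions are invertible by definition, while ordinary algorithmic steps routinely collapse many inputs into one output. A small illustrative sketch (examples chosen for clarity, not taken from the episode):

```python
# A group action on sequences: cyclic rotation, which always has an inverse.
def rotate(xs, k):
    k %= len(xs)
    return xs[k:] + xs[:k]

assert rotate(rotate([1, 2, 3], 1), -1) == [1, 2, 3]   # undoable

# An ordinary computational step: max() irreversibly destroys information.
# Distinct inputs map to the same output, so no inverse can exist -- and
# group theory has no vocabulary for such non-invertible operations.
assert max([1, 2, 3]) == max([3, 3, 3]) == 3
```

Any framework built solely on groups therefore cannot describe reductions, pooling, or branching control flow, which is the gap the categorical approach below is meant to close.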
Hybrid Success Models
Breakthroughs like AlphaCode, FunSearch, and AlphaGeometry achieve results only by combining language models with algorithmic procedures (verifiers, genetic algorithms) that provide correctness guarantees.
🧮 Category Theory as the Unifying Framework
Partial Compositionality
Category theory generalizes geometric deep learning to handle 'algebra with colors' (partial compositionality), accommodating non-invertible operations and dimension constraints like matrix multiplication that group theory cannot express.
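Matrix multiplication illustrates the 'colors': an (m, n) matrix is a typed arrow from dimension n to dimension m, and composition is only defined when the types line up. A minimal numpy sketch of this partial, non-invertible composition (the framing as arrows is the categorical reading, not library behavior):

```python
import numpy as np

# "Colors" = dimensions. An (m, n) matrix is an arrow n -> m; matmul is
# composition, defined only when the inner dimensions match. Group theory
# assumes one carrier set and invertible elements, so it cannot express this.
f = np.ones((3, 5))   # arrow 5 -> 3 (non-square, hence non-invertible)
g = np.ones((5, 2))   # arrow 2 -> 5

h = f @ g             # composes: arrow 2 -> 3, shape (3, 2)

try:
    g @ f             # ill-typed: inner dimensions 2 and 3 don't match
except ValueError:
    print("composition undefined for mismatched dimensions")
```

Partial, typed composition of non-invertible maps is exactly the structure a category captures and a group cannot.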
Systematic Discovery
Acting as a 'periodic table' for neural networks, category theory provides systematic rules that automatically generate meaningful architectural constraints without empirical trial and error.
Unifying Foundation
This framework potentially unifies probabilistic, neuroscience, and optimization perspectives in deep learning while enabling true compositional reasoning in latent spaces.
Bottom Line
The field must transition from empirical architecture search to principled design using categorical deep learning to internalize algorithmic reasoning and compositionality, rather than merely scaling current models or bolting on external tools.