Beyond Bigger Models: Recursion As The Next Scaling Law In AI

Business & Entrepreneurship | May 01, 2026 | 17.9 Thousand views | 37:53

TL;DR

Recursion at inference time, rather than simply scaling model size, may be the next breakthrough in AI reasoning. Recent research on Hierarchical Reasoning Models (HRM) and Tiny Recursive Models (TRM) suggests that recursive architectures with shared weights can solve complex reasoning benchmarks like ARC Prize with very few parameters, outperforming much larger traditional LLMs.

⚠️ The Fundamental Flaw in Modern LLMs

One-shot processing hits theoretical limits

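Transformers process inputs in parallel without iterative compression, so a single forward pass performs only a fixed amount of sequential computation. That makes them theoretically incapable of solving incompressible problems like sorting or Sudoku in one pass, because such problems carry computational lower bounds (e.g., comparison-based sorting needs on the order of n log n comparisons in the worst case).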
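As a rough illustration of where the n log n figure comes from, the sketch below computes the information-theoretic minimum number of comparisons needed to sort n distinct items, ceil(log2(n!)). The function name `min_comparisons` is a hypothetical helper for this example. It shows only the counting argument, not a proof about transformers.

```python
import math


def min_comparisons(n: int) -> int:
    """Worst-case lower bound on comparisons needed to sort n distinct items.

    There are n! possible orderings and each comparison yields one bit,
    so at least ceil(log2(n!)) comparisons are required.
    """
    return math.ceil(math.log2(math.factorial(n)))


if __name__ == "__main__":
    for n in (4, 16, 256, 1024):
        bound = min_comparisons(n)
        # The bound per item keeps growing with n, roughly like log2(n).
        print(f"n={n:5d}  lower bound={bound:6d}  per item={bound / n:.2f}")
```

Because the bound per item grows with n, no fixed budget of sequential comparison steps is enough for every input size, which is the intuition behind the claim above.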

Memory without compression

Unlike RNNs, which compress information into a hidden state, LLMs must retain the entire input context (potentially the length of a full novel) for every token they generate, and they lack the latent reasoning capabilities inherent in recursive models.
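The memory contrast can be made concrete with back-of-the-envelope arithmetic. The sketch below uses the standard key/value cache size formula for attention (two tensors per layer, one entry per token) and compares it with a fixed-size recurrent state. The model dimensions are hypothetical, chosen only for illustration.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Memory for cached keys and values: grows linearly with seq_len."""
    # Factor 2 = one key tensor plus one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


def rnn_state_bytes(hidden_size, n_layers, bytes_per_value=2):
    """Memory for a recurrent hidden state: independent of sequence length."""
    return n_layers * hidden_size * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration, not taken from any specific model.
    config = dict(n_layers=32, n_kv_heads=8, head_dim=128)
    state = rnn_state_bytes(hidden_size=4096, n_layers=32)
    for seq_len in (1_000, 10_000, 100_000):
        cache = kv_cache_bytes(seq_len, **config)
        print(f"{seq_len:7d} tokens: KV cache {cache / 1e6:9.1f} MB, "
              f"recurrent state {state / 1e6:.2f} MB")
```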

🧩 Hierarchical Reasoning Models (HRM)

Tiny model beats massive LLMs on reasoning

HRM uses only 27 million parameters and is trained on just 1,000 ARC Prize puzzles with no pre-training, reportedly achieving approximately 70% accuracy on ARC Prize 1, where OpenAI's o3 scored 0%.

Three-level hierarchical recursion

The architecture applies identical weights recursively across three nested loops: low-level high-frequency processing, high-level low-frequency processing, and outer refinement steps, inspired by brain wave patterns operating at different frequencies.
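The three nested loops can be sketched in a few lines. This is a toy under stated assumptions, not the HRM implementation: a single scalar `weight` stands in for the shared network, the update rule `shared_update` is hypothetical, and the loop counts are arbitrary.

```python
import math


def shared_update(state, context, weight):
    """One application of the shared (hypothetical) update rule."""
    return [math.tanh(weight * s + c) for s, c in zip(state, context)]


def hierarchical_recursion(x, outer_steps=3, high_steps=2, low_steps=4, weight=0.5):
    """Run three nested loops that reuse the same weight at every level."""
    z_high = [0.0] * len(x)  # slow, low-frequency state
    z_low = [0.0] * len(x)   # fast, high-frequency state
    low_updates = 0
    high_updates = 0
    for _ in range(outer_steps):          # outer refinement steps
        for _ in range(high_steps):       # high-level, low-frequency loop
            for _ in range(low_steps):    # low-level, high-frequency loop
                # The fast state sees the input plus the current slow state.
                context = [xi + h for xi, h in zip(x, z_high)]
                z_low = shared_update(z_low, context, weight)
                low_updates += 1
            # The slow state updates once per burst of fast updates.
            z_high = shared_update(z_high, z_low, weight)
            high_updates += 1
    return z_high, low_updates, high_updates


if __name__ == "__main__":
    z, low, high = hierarchical_recursion([0.1, -0.2])
    print(f"low-level updates: {low}, high-level updates: {high}")
```

The effective depth is the product of the three loop counts, while the number of parameters stays fixed, which is the sense in which recursion trades parameters for compute.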

Training Breakthroughs and TRM

Bypassing backprop through time

HRM uses Deep Equilibrium (DEQ) learning with stop gradients, treating different hidden state 'carry' values as separate batch samples rather than backpropagating through all recursive steps, avoiding the vanishing gradient problems that plagued traditional RNNs.

TRM challenges the DEQ assumption

The Tiny Recursive Models paper reveals that HRM's DEQ math is insufficient to explain its performance, demonstrating that backpropagating through the full deep recursion actually improves results significantly.
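To see what is at stake between the two training schemes, the sketch below compares them on a scalar recursion z_{t+1} = tanh(w * z_t + x). It is a toy model with hypothetical function names, not code from either paper. `one_step_gradient` treats the previous state as a constant (a stop gradient), while `full_gradient` differentiates through every recursive step.

```python
import math


def unroll(w, x, steps):
    """Return the states z_0..z_T of z_{t+1} = tanh(w * z_t + x), with z_0 = 0."""
    states = [0.0]
    for _ in range(steps):
        states.append(math.tanh(w * states[-1] + x))
    return states


def loss(w, x, y, steps):
    """Squared error between the final state and the target y."""
    return 0.5 * (unroll(w, x, steps)[-1] - y) ** 2


def full_gradient(w, x, y, steps):
    """Exact dL/dw through all steps.

    Accumulated forward in time here; the value equals what backprop
    through the whole unrolled recursion would return.
    """
    states = unroll(w, x, steps)
    dz_dw = 0.0
    for t in range(steps):
        local = 1.0 - states[t + 1] ** 2  # derivative of tanh at step t+1
        dz_dw = local * (states[t] + w * dz_dw)
    return (states[-1] - y) * dz_dw


def one_step_gradient(w, x, y, steps):
    """Approximate dL/dw with a stop gradient on the previous state."""
    states = unroll(w, x, steps)
    local = 1.0 - states[-1] ** 2
    return (states[-1] - y) * local * states[-2]


if __name__ == "__main__":
    w, x, y, steps = 0.8, 0.3, 1.0, 6
    print("full gradient:    ", full_gradient(w, x, y, steps))
    print("one-step gradient:", one_step_gradient(w, x, y, steps))
```

The one-step version costs constant memory in the number of recursive steps, but it ignores how earlier steps depend on the weight, so it is only an approximation of the true gradient.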

🧬 Bio-Inspiration vs. Computational Reality

Brain waves inspire architecture

HRM's hierarchical frequency approach draws from neuroscience observations that different brain regions operate at different frequencies, though the actual optimization mechanism may differ from biological processes.

Bio-plausibility as inspiration, not constraint

While biological analogies help generate ideas, computationally efficient solutions often diverge from bio-plausible mechanisms, as seen in the evolution from AlexNet's bio-inspired features to simpler, deeper architectures that run better on GPUs.

Bottom Line

Prioritize inference-time recursion and hierarchical reasoning over raw parameter count: shared-weight iterative processing with variable scoping can achieve superior reasoning performance with minimal training data and compute.
