🔬 "The Most Innovative Diffusion Research Is Happening in Drug Discovery, Not Image Generation"

Podcasts | June 30, 2026 | 616 views | 1:48:40

TL;DR

Evan Feinberg and Sergey Udov of Genesis Molecular AI discuss how the frontier of diffusion modeling has moved from image generation to 3D molecular structure prediction. They detail how their Pearl model applies LLM-style scaling strategies, including synthetic physics-based training data and inference-time 'thinking', to the historically intractable problem of predicting how small molecules bind to proteins.

🧬 Diffusion's Pivot from Images to Drug Discovery 2 insights

Drug discovery drives diffusion innovation

The most cutting-edge diffusion research now occurs in 3D molecular structure prediction rather than image generation, establishing drug discovery as the new frontier for generative AI primitives.

GANs failed for molecular applications

Generative adversarial networks showed promise for images in 2017-2018 but proved ineffective for proteins and drug discovery. Diffusion models supplied the generative primitive that molecular applications needed.

⚡ Scaling Molecular AI Like Large Language Models 3 insights

Three-stage scaling adapted from LLMs

Genesis applies pre-training scaling, post-training refinement, and inference-time computing to molecular models, mirroring the development path of Llama and other language models.

Synthetic data via physics simulations

Unlike proteins, small molecules can be modeled accurately with physics simulations. This lets Genesis generate synthetic training data on demand, rather than relying only on the 200,000 structures available in public databases.

Inference-time thinking in structural space

Pearl uses 'thinking tokens' adapted to molecular coordinates, allowing the model to iteratively refine crystal structure representations with physics-based guidance during inference.
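
The idea of physics-guided refinement at inference time can be illustrated with a toy sketch. This is not Pearl's method, which the summary does not describe in detail. It is a minimal NumPy example under stated assumptions: `toy_denoiser` is a hypothetical stand-in for a learned model (it simply returns a fixed, slightly distorted pose), the "physics" is a single harmonic bond-length energy, and all names and parameter values are illustrative.

```python
import numpy as np

R0 = 1.5  # assumed ideal bond length (arbitrary units) for this toy example


def bond_energy(x, bonds, r0=R0):
    """Harmonic bond energy: sum over bonds of (|x_i - x_j| - r0)^2."""
    return sum((np.linalg.norm(x[i] - x[j]) - r0) ** 2 for i, j in bonds)


def bond_energy_grad(x, bonds, r0=R0):
    """Analytic gradient of bond_energy with respect to the coordinates."""
    grad = np.zeros_like(x)
    for i, j in bonds:
        d = x[i] - x[j]
        r = np.linalg.norm(d) + 1e-9  # avoid division by zero
        g = 2.0 * (r - r0) * d / r
        grad[i] += g
        grad[j] -= g
    return grad


def toy_denoiser(x, sigma, predicted_pose):
    """Hypothetical stand-in for a learned denoiser.

    A real model would predict the clean structure from (x, sigma);
    here we return a fixed pose so the example stays self-contained.
    """
    return predicted_pose


def sample(predicted_pose, bonds, steps=50, guidance=0.1, seed=0):
    """Deterministic reverse-diffusion loop with an optional physics nudge."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(5.0, 0.01, steps)  # decreasing noise schedule
    x = rng.normal(scale=sigmas[0], size=predicted_pose.shape)
    for k, sigma in enumerate(sigmas):
        sigma_next = sigmas[k + 1] if k + 1 < steps else 0.0
        x0_hat = toy_denoiser(x, sigma, predicted_pose)
        # Euler-style denoising step toward the predicted clean pose.
        x = x0_hat + (sigma_next / sigma) * (x - x0_hat)
        # Guidance: step down the gradient of the physics energy.
        x = x - guidance * bond_energy_grad(x, bonds)
    return x


# Three atoms in a line whose predicted bonds (1.8) are too long versus R0.
pose = np.array([[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [3.6, 0.0, 0.0]])
bonds = [(0, 1), (1, 2)]

unguided = sample(pose, bonds, guidance=0.0)
guided = sample(pose, bonds, guidance=0.1)
```

In this toy setup the guided sample ends with a lower bond energy than the unguided one, because each step nudges the coordinates toward physically plausible bond lengths. A production system would replace both the denoiser and the energy term with far richer models.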

🔑 Solving the Protein-Ligand Binding Challenge 2 insights

Cracking the 'key and lock' problem

Pearl predicts 3D coordinates of protein-small molecule complexes with sufficient accuracy to determine binding potency, overcoming a barrier that previously required expensive, months-long laboratory experiments.

Navigating 10^60 molecular possibilities

The model addresses the vast search space of drug-like molecules by using diffusion to move beyond pattern matching toward generalizable physical predictions for previously undruggable targets.

Bottom Line

LLM scaling strategies, specifically synthetic data generation and inference-time computing, can transfer to domain-specific physics problems. Genesis demonstrates this by using diffusion models with physics-guided 'thinking' to predict molecular binding and open up new drug discovery targets.
