🔬 From Red Teaming GPT-4 to Automating Drug Discovery: The Future of AI in Science — Andrew White
TL;DR
Andrew White recounts his journey from molecular dynamics researcher to founder of Future House and Edison Scientific, describing how AlphaFold disrupted specialized supercomputing approaches to protein folding. Current LLMs, he argues, have already compressed a decade-long vision for automated science into two years, prompting urgent policy attention at the highest levels of government.
🧬 Computational Biology's Hardware-to-Software Revolution
D. E. Shaw's custom silicon failed against general AI
David Shaw's research group burned molecular dynamics algorithms into specialized chips near Times Square, but AlphaFold solved protein folding on desktop GPUs via Google Colab without requiring government-funded supercomputers.
Geometric deep learning supplanted maximum entropy methods
White's early work fitting molecular dynamics models to sparse experimental data via maximum entropy methods gave way to transformers, which naturally handle chemical graph structure and symmetry and allow chemistry to be treated as a language modeling problem.
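To make "chemistry as a language modeling problem" concrete, here is a minimal, hypothetical sketch (not from the talk): molecules written as SMILES strings can be split into tokens with a regex of the kind used in the molecular-transformer literature, after which a standard language model can consume them like any other text.

```python
import re

# Regex-based SMILES tokenizer (pattern adapted from the molecular-
# transformer literature). Bracketed atoms like [nH] and two-letter
# elements like Cl and Br become single tokens, so the token stream
# is a faithful, reversible encoding of the molecule.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%\d{2}|\d)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into language-model-ready tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Sanity check: joining the tokens must reproduce the input exactly.
    assert "".join(tokens) == smiles, "tokenizer dropped characters"
    return tokens

print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
print(tokenize_smiles("CCl"))                    # chlorine stays one token
```

Once molecules are token sequences, next-token prediction, masked modeling, and sequence-to-sequence "translation" (e.g., reactants to products) all apply unchanged.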
🚀 Automating Discovery: Organizations and Timelines
Future House bridges academia and venture capital
White resigned his tenured position at the University of Rochester in June to establish a nonprofit 'Focused Research Organization' dedicated to automating science with AI, then spun out Edison Scientific as a venture-backed startup when progress compressed the planned 10-year mission into two years.
Current LLMs already automate empirical biology
Unlike theoretical domains requiring perfect reasoning, biology's empirical nature means today's models can match top-quintile human experimental predictions, making immediate automation viable without waiting for more advanced AI systems.
⚠️ Red Teaming and Policy at the Frontier
ChemCrow prompted unprecedented White House attention
White spent nine months red-teaming GPT-4's chemistry capabilities before its March release, creating an agentic system that operates cloud laboratories and became the first pre-print presented to the President in a 30-minute National Security Council briefing.
Immediate dual-use implications revealed expertise gaps
Three-letter agencies immediately questioned ChemCrow's impact on explosives synthesis and nuclear weapons research timelines, highlighting the dangerous scarcity of experts at the intersection of AI and hard sciences.
Bottom Line
Organizations should immediately deploy current-generation LLMs to automate empirical scientific research through hybrid academic-industry teams, as the technology has already matured beyond the capabilities of traditional institutional structures.
More from Latent Space
🔬There Is No AlphaFold for Materials — AI for Materials Discovery with Heather Kulik
MIT professor Heather Kulik explains how AI uncovered quantum phenomena that enabled 4x tougher polymers, and why materials science lacks an 'AlphaFold' equivalent due to missing experimental datasets, emphasizing that domain expertise remains essential for validating AI predictions in chemistry.
Dreamer: the Agent OS for Everyone — David Singleton
David Singleton introduces Dreamer as an 'Agent OS' that combines a personal AI Sidekick with a marketplace of tools and agents, enabling both non-technical users and engineers to build, customize, and deploy AI applications through natural language while maintaining privacy through centralized, OS-level architecture.
Why Anthropic Thinks AI Should Have Its Own Computer — Felix Rieseberg of Claude Cowork/Code
Anthropic's Felix Rieseberg explains why AI agents need their own virtual computers to be effective, arguing that confining Claude to chat interfaces severely limits capability. He details how this philosophy shaped Claude Cowork and why product development is shifting from lengthy planning to rapidly building multiple prototypes simultaneously.
⚡️Monty: the ultrafast Python interpreter by Agents for Agents — Samuel Colvin, Pydantic
Samuel Colvin from Pydantic introduces Monty, a Rust-based Python interpreter designed specifically for AI agents that achieves sub-microsecond execution latency by running in-process, bridging the gap between rigid tool calling and heavy containerized sandboxes.