🔬 Top Black Holes Physicist: GPT5 can do Vibe Physics, here's what I found
TL;DR
Physicist Alex Lubyansky discusses how GPT-5 and reasoning models like o3 have achieved superhuman capabilities in parts of theoretical physics, resolving the year-long puzzle of single-minus gluon tree amplitudes and reproducing complex research in minutes rather than months.
🧮 AI Capabilities in Physics Research 3 insights
GPT-5 reproduces months of research in 30 minutes
Lubyansky's best paper, which took him months to develop, was reproduced by GPT-5 in about half an hour.
o3 enabled mathematical reasoning for physics
ChatGPT o3 was the first model capable of advanced math useful for theoretical physics calculations, marking the transition from AI as an email tool to a research instrument.
Codex solves expert simulation challenges rapidly
OpenAI's Codex recently wrote a simulation of the SYK model in 10 minutes, a technical quantum mechanics problem that multiple research groups had failed to solve.
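For readers unfamiliar with the SYK (Sachdev-Ye-Kitaev) model, the following is a minimal sketch of what a small simulation can look like. It is not the code Codex wrote, which the source does not show. It assumes the standard four-fermion form with Gaussian random couplings of variance 3! J² / N³, builds the Majorana operators with a Jordan-Wigner construction, and diagonalizes the Hamiltonian exactly. The function names, the system size of 8 Majoranas, and the seed are illustrative choices.

```python
import itertools
import numpy as np


def majorana_operators(n_majorana):
    """Return matrices chi_i with {chi_i, chi_j} = delta_ij (Jordan-Wigner)."""
    assert n_majorana % 2 == 0, "need an even number of Majorana fermions"
    n_qubits = n_majorana // 2
    I = np.eye(2, dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)

    def kron_chain(ops):
        out = np.eye(1, dtype=complex)
        for op in ops:
            out = np.kron(out, op)
        return out

    chis = []
    for q in range(n_qubits):
        prefix = [Z] * q                      # Jordan-Wigner string
        suffix = [I] * (n_qubits - q - 1)
        chis.append(kron_chain(prefix + [X] + suffix) / np.sqrt(2))
        chis.append(kron_chain(prefix + [Y] + suffix) / np.sqrt(2))
    return chis


def syk_hamiltonian(n_majorana, coupling=1.0, seed=0):
    """Sum over i<j<k<l of J_ijkl chi_i chi_j chi_k chi_l (assumed q=4 form)."""
    rng = np.random.default_rng(seed)
    chis = majorana_operators(n_majorana)
    dim = chis[0].shape[0]
    # Assumed convention: Var(J_ijkl) = 3! * coupling**2 / N**3.
    sigma = np.sqrt(6.0) * coupling / n_majorana ** 1.5
    H = np.zeros((dim, dim), dtype=complex)
    for i, j, k, l in itertools.combinations(range(n_majorana), 4):
        H += rng.normal(0.0, sigma) * (chis[i] @ chis[j] @ chis[k] @ chis[l])
    return H


# Exact diagonalization of one random realization with 8 Majoranas (16x16 matrix).
energies = np.linalg.eigvalsh(syk_hamiltonian(8))
print("ground-state energy:", energies[0])
```

Exact diagonalization like this scales exponentially in the number of fermions, so it only illustrates the structure of the model; the research-grade simulation mentioned in the episode presumably involved considerably more than this.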
⚛️ The Gluon Amplitude Breakthrough 3 insights
Single-minus gluon amplitudes shown to be non-zero
AI helped establish that single-minus gluon tree amplitudes are non-zero, contradicting textbook assumptions and dimensional-analysis arguments that suggested they must vanish.
Problem stumped experts for over a year
This specific problem in quantum field theory regarding scattering amplitudes with one opposite-helicity gluon puzzled physics experts for more than a year before AI solved it quickly.
Fundamental objects of quantum field theory
Scattering amplitudes describe particle interaction probabilities in colliders and encode the complete content of physical forces like the strong nuclear force mediated by gluons.
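As background for the insights above, the textbook picture can be written compactly in spinor-helicity notation. This is a sketch of the standard statements, not of the new result: the source does not say in which kinematic regime or setup the single-minus amplitudes turn out to be non-zero. Coupling constants and overall normalization are dropped, and the last line is only schematic.

```latex
% Color-ordered tree-level gluon amplitudes, couplings and normalization dropped.
\begin{align}
  A_n^{\text{tree}}(1^+, 2^+, \dots, n^+) &= 0, \\
  A_n^{\text{tree}}(1^-, 2^+, \dots, n^+) &= 0
    \quad \text{(textbook statement for } n \ge 4 \text{ and generic real momenta)}, \\
  A_n^{\text{tree}}(1^+, \dots, i^-, \dots, j^-, \dots, n^+)
    &= \frac{\langle i\,j \rangle^4}
            {\langle 1\,2 \rangle \langle 2\,3 \rangle \cdots \langle n\,1 \rangle}, \\
  d\sigma &\propto \lvert \mathcal{A}_n \rvert^2 \, d\Phi_n .
\end{align}
```

The third line is the Parke-Taylor formula for amplitudes with exactly two negative-helicity gluons, conventionally the first non-vanishing helicity configuration. The fourth line expresses the link between an amplitude and a collider observable: the differential cross section is proportional to the squared full amplitude integrated over the final-state phase space.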
🚀 Paradigm Shift in Science 3 insights
Crossing the superhuman threshold
AI has passed a milestone where it exhibits superhuman performance in specific areas of science, enabling solutions to frontier theoretical problems that were previously out of reach.
Rapid adoption by senior physicists
After initial skepticism, most senior colleagues in physics are now aware of AI's trajectory and are actively integrating these tools into their research workflows.
Continuous capability jumps
Successive model releases, including GPT-5 and version 5.4, show accelerating improvements specifically at the science frontier rather than in consumer tasks like email.
Bottom Line
Researchers should immediately integrate advanced AI models like GPT-5 and Codex into their workflows, as these tools have crossed a threshold where they can solve expert-level theoretical problems and reproduce months of human research in minutes.