Cursor's Third Era: Cloud Agents — ft. Sam Whitmore, Jonas Nelle, Cursor
TL;DR
Cursor launched Cloud Agents that provide AI models with full virtual machine access to autonomously write, test, and demonstrate code through video recordings, shifting from simple code generation to end-to-end software engineering workflows.
🖥️ Full-Computer Architecture
"Brain in a box" virtual-machine approach
Agents now run on full VMs with complete computer use (pixels in, coordinates out) rather than just reading code, enabled by an Autotab integration and a developer-experience setup that mirrors a human engineer's machine.
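The "pixels in, coordinates out" loop can be sketched minimally as below. This is an illustrative stub, not Cursor's implementation: `policy`, `take_screenshot`, `execute`-style names and the `Action` type are all assumptions.

```python
# Minimal sketch of a computer-use loop: screenshot in, action out.
# All names here (Action, policy, take_screenshot, run_agent) are
# hypothetical stand-ins, not Cursor's or Autotab's actual API.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""

def take_screenshot() -> bytes:
    # Stand-in for grabbing the VM's framebuffer as raw pixels.
    return b"\x00" * 32  # placeholder pixel buffer

def policy(pixels: bytes, goal: str) -> Action:
    # Stand-in for the model call: pixels + goal in, coordinates out.
    if goal == "open devtools":
        return Action(kind="click", x=120, y=48)
    return Action(kind="done")

def run_agent(goal: str, max_steps: int = 10) -> list[Action]:
    trace = []
    for _ in range(max_steps):
        action = policy(take_screenshot(), goal)
        trace.append(action)
        if action.kind == "done":
            break
        goal = ""  # pretend the click satisfied the goal
    return trace
```

The point of the sketch is the shape of the loop: the agent never reads application code to decide its next step, it only looks at rendered pixels and emits coordinates.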
Autonomous end-to-end testing on dev servers
Agents automatically start dev servers and test changes for 30+ minutes, returning with verified PRs rather than untested code suggestions, with default prompting that calibrates testing effort to the complexity of the change.
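The "boot the dev server, then verify the change" step can be sketched as below, using Python's built-in `http.server` as a stand-in for a real dev server; the health-check-then-test pattern is the point, not the specific server.

```python
# Sketch of the verify-before-PR loop: start a dev server, wait for it
# to become healthy, then run checks against the live instance.
# http.server is a stand-in dev server; a real agent would launch the
# project's own `npm run dev` / `make serve` equivalent.
import http.server
import threading
import time
import urllib.request

def start_dev_server(port: int) -> http.server.ThreadingHTTPServer:
    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        def log_message(self, *args):  # silence request logging
            pass
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def wait_until_healthy(url: str, timeout: float = 5.0) -> bool:
    # Poll the server until it answers 200 or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=1) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            time.sleep(0.1)
    return False
```

An agent session would run this gate before recording its demo video: if the health check never passes, there is nothing to demonstrate and the PR is not "verified."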
Multi-model synergistic base layer approach
The system leverages strengths from different model providers as base layers, creating outputs better than any single unified model tier could achieve alone.
🎥 Visual Verification System
Video demonstrations accelerate code review process
Every agent session generates a video recording showing the implemented feature in action, serving as an entry point for review that is faster than reading large diffs.
Complete VNC remote desktop environment access
Developers get full remote control of the agent's VM to hover, type, and interact with the live environment via VNC before deciding to merge or request iterations.
Zero-prompt intelligent testing strategies
Agents autonomously determine how to test changes—such as opening Chrome DevTools to inject 5,000 characters to test error limits—without explicit human instructions.
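The DevTools example boils down to an edge-case probe the agent invents on its own: push an oversized input at the feature and confirm the limit holds. A minimal sketch, where `MAX_LEN` and `validate_comment` are hypothetical stand-ins for the feature under test:

```python
# Sketch of an agent-devised limit probe: inject a 5,000-character
# string and confirm the length limit rejects it.
# MAX_LEN and validate_comment are illustrative, not from any real codebase.
MAX_LEN = 4000

def validate_comment(text: str) -> tuple[bool, str]:
    if len(text) > MAX_LEN:
        return False, f"comment exceeds {MAX_LEN} characters"
    return True, ""

def probe_limit() -> bool:
    # The agent's zero-prompt strategy: oversized input, expect rejection.
    ok, error = validate_comment("x" * 5000)
    return (not ok) and "exceeds" in error
```

In the episode's anecdote the agent ran the equivalent probe through Chrome DevTools against the live UI rather than against a function, but the test design is the same.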
🚀 Advanced Agentic Workflows
Automated bug reproduction and fix verification
The /repro command enables agents to reproduce bugs on video, fix them, and demonstrate the fix, reducing complex bug resolution from hours to 90-second review cycles.
Parallel agent swarms increase development throughput
The next major unlock involves parallelizing work through swarms of agents to dramatically increase throughput rather than just making single agents faster.
Recursive agent debugging and Datadog integration
Cloud agents can spin up sub-agents to debug themselves using Datadog MCP integration and explore logs, though recursive agent spawning is currently disabled.
Bottom Line
Start using full-computer agents with visual verification workflows immediately, as models like Claude 3.5 Sonnet and Codex 53 have crossed the threshold to autonomously handle end-to-end development including testing and bug reproduction.
More from Latent Space
🔬 There Is No AlphaFold for Materials — AI for Materials Discovery with Heather Kulik
MIT professor Heather Kulik explains how AI discovered quantum phenomena to create 4x tougher polymers and why materials science lacks an 'AlphaFold' equivalent due to missing experimental datasets, emphasizing that domain expertise remains essential to validate AI predictions in chemistry.
Dreamer: the Agent OS for Everyone — David Singleton
David Singleton introduces Dreamer as an 'Agent OS' that combines a personal AI Sidekick with a marketplace of tools and agents, enabling both non-technical users and engineers to build, customize, and deploy AI applications through natural language while maintaining privacy through centralized, OS-level architecture.
Why Anthropic Thinks AI Should Have Its Own Computer — Felix Rieseberg of Claude Cowork/Code
Anthropic's Felix Rieseberg explains why AI agents need their own virtual computers to be effective, arguing that confining Claude to chat interfaces severely limits capability. He details how this philosophy shaped Claude Cowork and why product development is shifting from lengthy planning to rapidly building multiple prototypes simultaneously.
⚡️ Monty: the ultrafast Python interpreter by Agents for Agents — Samuel Colvin, Pydantic
Samuel Colvin from Pydantic introduces Monty, a Rust-based Python interpreter designed specifically for AI agents that achieves sub-microsecond execution latency by running in-process, bridging the gap between rigid tool calling and heavy containerized sandboxes.