Securing Long-Running AI Agents: From Setup to Sandboxing
TL;DR
NVIDIA details the shift toward autonomous 'long-running' AI agents capable of independent multi-hour execution, introducing the NVIDIA Agent Toolkit featuring open Neotron models, packaged CUDA-X skills, and runtime security to enable scalable enterprise deployment.
🚀 The Evolution to Autonomous Agents 2 insights
Three inflection points in AI utility
The industry progressed from the ChatGPT moment (content generation) to the DeepSeek moment (reasoning models with 10x token growth) to the current 'Claw moment' where agents autonomously execute tasks without human-in-the-loop interaction.
Agents combine models with harnesses
Long-running agents consist of reasoning models plus a 'harness'—the orchestration layer, APIs, and data sources—that enables them to independently execute code and make decisions over hours based solely on outcome-based prompts.
🛠️ NVIDIA Agent Toolkit Architecture 3 insights
Neotron Ultra trained for tool use
Neotron Ultra, part of the open Neotron 3 family launching soon, was specifically trained on agent harnesses like OpenClaw and Hermes to excel at orchestration, delivering 5x speed and 30% lower inference cost than comparable models.
Skills convert libraries into agent tools
NVIDIA packaged its CUDA-X libraries as 'skills'—natural language instruction manuals that explain tool capabilities—allowing agents to understand and utilize complex software without loading entire codebases into their context window.
AIQ employs specialized multi-agent teams
The AIQ deep research skill uses a system of models where Neotron Nano routes intent, frontier models like GPT manage orchestration, and specialized Neotron 3 sub-agents handle specific disciplines like fact-gathering, critique, and synthesis.
🔒 Enterprise Security and Deployment 2 insights
ServiceNow autonomously resolves 90% of L1 tickets
Using NVIDIA's agent blueprint, ServiceNow deployed specialized triage and resolution agents that independently research past fixes and resolve 90% of Level 1 support tickets without human escalation.
Verified skills ensure runtime security
NVIDIA's verified skills program scans tools for vulnerabilities, evaluates cross-compatibility across different harnesses, and cryptographically signs skills so enterprises can enforce at runtime that only authorized, secure code executes.
Bottom Line
Enterprises should deploy autonomous AI by adopting model-agnostic agent harnesses with verified skill marketplaces and runtime sandboxing to safely enable long-running agents that independently execute complex business processes.
More from NVIDIA AI Podcast
View all
How NVIDIA Blackwell and NVIDIA Dynamo Scale AI Agents for Production
NVIDIA Blackwell delivers up to 40x more concurrent AI agents per GPU than Hopper through its rack-scale NVL72 architecture and Dynamo framework, fundamentally shifting AI infrastructure measurement from token throughput to agent concurrency benchmarks.
Build Video Analytics AI Agents with Skills
NVIDIA introduces the Video Search and Summarization (VSS) blueprint for building vision AI agents that process billions of camera streams using vision language models and a new 'skills' framework, enabling deep video search and summarization 60x faster than manual review.
Ask the Experts: Nemotron 3 Nano Omni | Nemotron Labs
NVIDIA researchers detail the development of Nemotron 3 Nano Omni, explaining how they evolved a text-only model into a multimodal system capable of processing vision, audio, and video through progressive training stages while maintaining the hybrid Mamba-Transformer architecture.
Apr 14 - Jetson AI Lab Research Group Call - Tensor RT Edge LLM on Jetson & Culture
NVIDIA researchers Lynn Chai and Luc introduce TensorRT Edge LLM, a purpose-built inference engine for deploying large language models on Jetson edge devices, showcasing NVFP4 quantization and speculative decoding techniques that achieve up to 7x faster prefill speeds and 500 tokens per second generation while previewing a simplified vLLM-style Python API coming soon.