Build Video Analytics AI Agents with Skills

Podcasts | May 13, 2026 | 5.72 thousand views | 59:53

TL;DR

NVIDIA introduces the Video Search and Summarization (VSS) blueprint for building vision AI agents that process billions of camera streams using vision language models and a new 'skills' framework, enabling deep video search and summarization 60x faster than manual review.

🏗️ VSS Blueprint Architecture

Three-layer vision AI pipeline

Real-time feature extraction using VLMs and CV models feeds into a metadata database for offline agentic analytics like search and summarization.
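The flow described above (extraction, then a metadata store, then offline analytics) can be sketched in a few lines. This is a minimal illustration only, not the VSS API: `FrameMetadata`, `MetadataStore`, `extract_features`, and `summarize` are hypothetical names, the in-memory list stands in for the metadata database, and the captions and object labels are passed in by hand where a real pipeline would call a VLM and CV models.

```python
from dataclasses import dataclass, field


@dataclass
class FrameMetadata:
    """One record produced by the (hypothetical) extraction layer."""
    stream_id: str
    timestamp_s: float
    caption: str          # stand-in for a VLM-generated description
    objects: list[str]    # stand-in for CV detector labels


@dataclass
class MetadataStore:
    """In-memory stand-in for the metadata database layer."""
    records: list[FrameMetadata] = field(default_factory=list)

    def insert(self, record: FrameMetadata) -> None:
        self.records.append(record)

    def search(self, keyword: str) -> list[FrameMetadata]:
        kw = keyword.lower()
        return [
            r for r in self.records
            if kw in r.caption.lower() or kw in r.objects
        ]


def extract_features(stream_id, timestamp_s, caption, objects):
    """Layer 1 placeholder: a real system would run VLM and CV models here."""
    return FrameMetadata(stream_id, timestamp_s, caption,
                         [o.lower() for o in objects])


def summarize(records):
    """Layer 3 placeholder: order stored records by time and join them."""
    ordered = sorted(records, key=lambda r: r.timestamp_s)
    return "; ".join(
        f"[{r.stream_id} @ {r.timestamp_s:.0f}s] {r.caption}" for r in ordered
    )


store = MetadataStore()
store.insert(extract_features("cam-1", 12.0,
                              "A forklift enters the loading dock", ["forklift"]))
store.insert(extract_features("cam-1", 3.0,
                              "A person walks past the gate", ["person"]))

hits = store.search("forklift")
print(summarize(hits))
```

The point of the split is that the query layer only ever touches stored metadata, so search and summarization can run offline without re-processing the video.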

Modular microservices design

Components can integrate into existing agents or applications, with training scripts provided for fine-tuning default models on custom data.

Multimodal processing support

As of June 1st, Nemotron Omni models enable single-model processing of video, audio, and text modalities within the same architecture.

🛠️ Agent Skills Framework

Pre-built workflow skills

Reference implementations for search, summarization, alerting, and reporting allow external agents (OpenClaw, Codex) to invoke VSS capabilities via standardized APIs.

Natural language deployment

An upcoming 'build a vision agent' skill will generate Docker Compose configurations from plain English descriptions to automate deployment packaging.

Advanced agentic search with critique

The system decomposes queries, fuses results from multiple embedding domains, and applies VLM-based critique to verify match accuracy before returning results.
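The decompose, fuse, and critique steps can be illustrated with a small sketch. Everything here is an assumption made for illustration: the summary does not say how VSS splits queries or fuses results, so the sketch splits on the word "and" and uses reciprocal rank fusion as a stand-in, and the `critique` callable is a hypothetical placeholder for a VLM check.

```python
from collections import defaultdict
from typing import Callable


def decompose(query: str) -> list[str]:
    """Naive decomposition (assumption): split the query on ' and '."""
    return [part.strip() for part in query.split(" and ") if part.strip()]


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists; each appearance adds 1 / (k + rank) to a clip's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, clip_id in enumerate(ranking, start=1):
            scores[clip_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by clip id for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


def agentic_search(
    query: str,
    retrievers: list[Callable[[str], list[str]]],
    critique: Callable[[str, str], bool],
    top_n: int = 3,
) -> list[str]:
    # One ranking per (sub-query, embedding domain) pair.
    rankings = [retrieve(sub) for sub in decompose(query) for retrieve in retrievers]
    fused = reciprocal_rank_fusion(rankings)
    # Keep only candidates the critique step accepts.
    verified = [clip for clip in fused if critique(query, clip)]
    return verified[:top_n]


# Toy retrievers standing in for two embedding domains (hypothetical data).
def visual_retriever(sub_query: str) -> list[str]:
    return ["clip_a", "clip_b"]


def caption_retriever(sub_query: str) -> list[str]:
    return ["clip_b", "clip_c"]


# Toy critique: pretend the VLM rejects clip_a on inspection.
def toy_critique(query: str, clip_id: str) -> bool:
    return clip_id != "clip_a"


results = agentic_search(
    "person and forklift", [visual_retriever, caption_retriever], toy_critique
)
print(results)
```

In this toy run `clip_b` ranks first because both domains return it, and `clip_a` is dropped by the critique step even though it ranked above `clip_c` after fusion.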

Performance & Edge Deployment

Real-time processing benchmarks

Deep agentic search with VLM critique completes in under 5 seconds, alert verification in under 3 seconds, and video summarization achieves 60x speedup over manual review.
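To put the 60x figure in concrete terms, the arithmetic below assumes that manual review takes as long as the footage itself (real-time playback), which the summary does not state. The function name `review_minutes` is hypothetical.

```python
def review_minutes(footage_minutes: float, speedup: float = 60.0) -> float:
    """Estimated review time, assuming manual review runs at playback speed."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return footage_minutes / speedup


for hours in (1, 8, 24):
    minutes = review_minutes(hours * 60)
    print(f"{hours} h of footage -> about {minutes:.0f} min at 60x")
```

Under that assumption, an hour of footage corresponds to roughly one minute of summarization time, and a full day of footage to roughly 24 minutes.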

Flexible hardware deployment

Edge deployment is supported on AGX, IGX, and DGX Spark for offline operation. GPUs with 32GB handle component-level processing, while fully local deployments require 80GB GPUs.

Open source availability

VSS is free on GitHub, with the complete skills codebase releasing June 1st at GTC Taipei, including new capabilities for 3D tracking and third-party system integration.

Bottom Line

Developers can rapidly deploy production-ready video analytics AI agents using NVIDIA's open-source VSS blueprint and skills framework, eliminating the need to build vision AI infrastructure from scratch while maintaining full customization capabilities.
