Company
Nvidia
Title
Automated CVE Analysis and Remediation Using Event-Driven RAG and AI Agents
Industry
Tech
Year
2024
Summary (short)
NVIDIA developed Agent Morpheus, an AI-powered system that automates the analysis of software vulnerabilities (CVEs) at enterprise scale. The system combines retrieval-augmented generation (RAG) with multiple specialized LLMs and AI agents in an event-driven workflow to analyze CVE exploitability, generate remediation plans, and produce standardized security documentation. The solution reduced CVE analysis time from hours/days to seconds and achieved a 9.3x speedup through parallel processing.
## Summary

NVIDIA presents Agent Morpheus, an internal production system designed to address the growing challenge of software vulnerability management at enterprise scale. With the CVE database hitting record highs (over 200,000 cumulative vulnerabilities reported by the end of 2023), traditional approaches to scanning and patching have become unmanageable. The solution demonstrates a sophisticated LLMOps implementation that combines multiple LLMs, RAG, and AI agents in an event-driven architecture to automate the labor-intensive process of CVE analysis and exploitability determination.

The core innovation is distinguishing between a vulnerability being *present* (a CVE signature is detected) and being *exploitable* (the vulnerability can actually be executed and abused). This nuanced analysis previously required security analysts to manually synthesize information from multiple sources, a process that could take hours or days per container. Agent Morpheus reduces this to seconds while maintaining the quality of analysis through intelligent automation and human-in-the-loop oversight.

## Technical Architecture and LLM Configuration

The system employs four distinct Llama3 large language models, three of them LoRA (Low-Rank Adaptation) fine-tuned for specific tasks within the workflow:

- **Planning LLM**: A LoRA fine-tuned model specifically trained to generate unique investigation checklists based on the CVE context. This model takes vulnerability and threat intelligence data and produces actionable task lists tailored to each specific CVE.
- **AI Agent LLM**: Another LoRA fine-tuned model that executes checklist items within the context of a specific software project. This agent can autonomously retrieve information and make decisions by accessing project assets including source code, SBOMs (Software Bills of Materials), documentation, and internet search tools.
- **Summarization LLM**: A LoRA fine-tuned model that combines all findings from the agent's investigation into coherent summaries for human analysts.
- **VEX Formatting LLM**: The base Llama3 model, which standardizes justifications for non-exploitable CVEs into the common machine-readable VEX (Vulnerability Exploitability eXchange) format for distribution.

This multi-model architecture represents a thoughtful LLMOps design decision: rather than using a single general-purpose model for all tasks, NVIDIA chose to specialize models through fine-tuning for their specific roles, likely improving accuracy and reliability at each stage of the pipeline.

## Inference Infrastructure with NVIDIA NIM

The deployment leverages NVIDIA NIM inference microservices as the core inference infrastructure. A key architectural decision was hosting all four model variants (three LoRA adapters plus the base model) in a single NIM container that dynamically loads LoRA adapters as needed. This approach optimizes resource utilization while maintaining the flexibility to serve different specialized models.

The choice of NIM was driven by several production requirements:

- **OpenAI API compatibility**: NIM provides an API specification compatible with OpenAI's interface, simplifying integration with existing tooling and agent frameworks (a sketch of this pattern follows the list).
- **Dynamic LoRA loading**: The ability to serve multiple LoRA-customized models from a single container reduces infrastructure complexity and cost.
- **Variable workload handling**: Agent Morpheus generates approximately 41 LLM queries per CVE on average. With container scans potentially producing dozens of CVEs per container, a single scan can generate thousands of outstanding LLM requests. NIM is designed to handle this bursty, variable workload pattern, which would be challenging for custom LLM services.
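The case study does not include code, but the integration pattern can be sketched: because NIM exposes an OpenAI-compatible API and serves the LoRA adapters from a single container, each pipeline stage can address its specialized model simply by name. The endpoint URL, adapter names, and prompts below are illustrative assumptions, not NVIDIA's actual configuration.

```python
# Minimal sketch of calling the four model variants through one OpenAI-compatible
# NIM endpoint. The base URL and the served model names are illustrative
# assumptions, not the production configuration described in the case study.
from openai import OpenAI

client = OpenAI(base_url="http://nim-llama3:8000/v1", api_key="not-used-locally")

# Selecting a LoRA adapter is just a matter of passing its served model name;
# the base Llama3 model handles VEX formatting in this sketch.
ADAPTERS = {
    "planner": "llama3-8b-lora-cve-planner",     # generates the investigation checklist
    "agent": "llama3-8b-lora-cve-agent",         # executes checklist items with tools
    "summarizer": "llama3-8b-lora-cve-summary",  # consolidates findings for analysts
    "vex": "meta/llama3-8b-instruct",            # base model, formats VEX justifications
}

def generate_checklist(cve_id: str, intel: str) -> str:
    """Ask the planning adapter for a CVE-specific investigation checklist."""
    response = client.chat.completions.create(
        model=ADAPTERS["planner"],
        messages=[
            {"role": "system", "content": "You plan CVE exploitability investigations."},
            {"role": "user", "content": f"CVE: {cve_id}\nThreat intel:\n{intel}\n"
                                        "Produce a numbered checklist of investigation tasks."},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content
```

Because every variant sits behind the same API surface, swapping an adapter or adding a new specialized model becomes a configuration change rather than a new service deployment.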
## Event-Driven Pipeline Architecture

The system is fully integrated into NVIDIA's container registry and security toolchain using the Morpheus cybersecurity framework. The workflow is triggered automatically when containers are uploaded to the registry, making it truly event-driven rather than batch-processed.

The pipeline operates as follows: a container upload event triggers a traditional CVE scan (using Anchore or similar tools). The scan results are passed to Agent Morpheus, which retrieves current vulnerability and threat intelligence for the detected CVEs. The planning LLM generates investigation checklists, the AI agent executes them autonomously, the summarization LLM consolidates the findings, and finally the results are presented to human analysts through a security dashboard.

One notable aspect of this architecture is that the AI agent operates autonomously without requiring human prompting during its analysis. The agent "talks to itself" by working through the generated checklist, retrieving the information it needs, and making decisions. Human analysts are engaged only once sufficient information is available for them to make final decisions, a design that optimizes analyst time and attention.

## Agent Tooling and LLM Limitations Mitigation

The case study reveals practical approaches to overcoming known LLM limitations in production. The AI agent has access to multiple tools beyond data retrieval:

- **Version comparison tool**: The team discovered that LLMs struggle to correctly compare software version numbers (e.g., determining that version 1.9.1 comes before 1.10). Rather than attempting to solve this through prompting or fine-tuning, they built a dedicated version comparison tool that the agent can invoke when needed (a sketch of such a tool follows this section).
- **Calculator tools**: Mathematical calculation is a well-known weakness of LLMs, so the system gives the agent access to calculators.

This pragmatic approach, using tools to handle tasks LLMs are poor at rather than forcing LLMs to do everything, reflects mature LLMOps thinking.
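The case study does not show how the version comparison tool is implemented. A minimal sketch might look like the following; the use of Python's `packaging` library, the function name, and the registry-style dispatch are assumptions for illustration only.

```python
# Illustrative sketch of a version-comparison tool an agent can call instead of
# reasoning about version strings itself. Uses the `packaging` library's version
# parsing; the tool name and dispatch mechanism are assumptions, not the actual
# Agent Morpheus implementation.
from packaging.version import Version, InvalidVersion

def compare_versions(installed: str, fixed_in: str) -> str:
    """Report whether `installed` is older than, equal to, or newer than `fixed_in`."""
    try:
        a, b = Version(installed), Version(fixed_in)
    except InvalidVersion as exc:
        return f"unparseable version: {exc}"
    if a < b:
        return f"{installed} is older than {fixed_in} (fix not applied)"
    if a == b:
        return f"{installed} matches {fixed_in}"
    return f"{installed} is newer than {fixed_in} (fix likely applied)"

# A hypothetical tool registry the agent LLM could dispatch to via function calling.
TOOLS = {"compare_versions": compare_versions}

# Example: the 1.9.1 vs 1.10 comparison that LLMs commonly get wrong.
print(TOOLS["compare_versions"]("1.9.1", "1.10"))
# -> "1.9.1 is older than 1.10 (fix not applied)"
```

Delegating the comparison to deterministic code removes an entire class of silent errors from the exploitability analysis, at the cost of one extra tool call per version check.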
## Parallel Processing and Performance Optimization

Using the Morpheus framework, the team built a pipeline that orchestrates the high volume of LLM requests asynchronously and in parallel. The key insight is that both the checklist items for each CVE and the CVEs themselves are completely independent, making them ideal candidates for parallelization.

The performance results are significant: processing a container with 20 CVEs takes 2,842.35 seconds when run serially but only 304.72 seconds when parallelized with Morpheus, a 9.3x speedup. This transforms the practical utility of the system from something that might take nearly an hour per container to completing in about five minutes.

The pipeline is exposed as a microservice using HttpServerSourceStage from Morpheus, enabling seamless integration with the container registry and security dashboard services.

## Continuous Learning and Human-in-the-Loop

The system implements a continuous improvement loop that leverages human analyst output. After Agent Morpheus generates its analysis, human analysts review the findings and may make corrections or additions. These human-approved patching exemptions and changes to the Agent Morpheus summaries are fed back into LLM fine-tuning datasets. This creates a virtuous cycle in which the models are continually retrained on analyst output, in principle improving system accuracy over time based on real-world corrections. This approach addresses a common LLMOps challenge: how to maintain and improve model performance in production when ground-truth labels are expensive to obtain.

## Production Integration and Workflow

The complete production workflow demonstrates enterprise-grade integration:

- Container upload triggers automatic CVE scanning
- Scan results flow automatically to Agent Morpheus
- Agent Morpheus retrieves intelligence and runs its analysis pipeline
- Results are surfaced to a security analyst dashboard
- Analysts review and make final recommendations
- Recommendations undergo peer review
- Final VEX documents are published and distributed with the containers (an illustrative VEX-style statement appears at the end of this case study)
- Analyst corrections feed back into training datasets

This end-to-end automation, from container upload to VEX document publication, represents a mature production deployment rather than a proof of concept.

## Critical Assessment

While the case study presents impressive results, several caveats are worth noting:

- The 9.3x speedup compares the team's own serial and parallel implementations, not a baseline or competing approach.
- The "hours or days to seconds" claim for triage-time improvement lacks specific baseline measurements or methodology.
- This is effectively a first-party case study from NVIDIA promoting its own NIM and Morpheus products, so claims should be evaluated with appropriate skepticism.
- The system still requires human analyst review, so "fully automated" should be understood as automated analysis with human oversight rather than completely autonomous operation.

Nevertheless, the technical architecture demonstrates sophisticated LLMOps practices, including multi-model orchestration, LoRA fine-tuning for task specialization, tool augmentation for LLM limitations, parallel inference optimization, an event-driven microservices architecture, and continuous learning from human feedback, all running in a production environment at enterprise scale.
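As a concrete reference for the artifact the workflow ultimately publishes, the sketch below shows a minimal VEX-style statement for a CVE judged not exploitable. The field names loosely follow the OpenVEX specification, and every identifier and value is a placeholder; the case study does not show NVIDIA's actual documents or schema.

```python
# Illustrative, minimal VEX-style statement for a CVE judged not exploitable.
# Field names loosely follow the OpenVEX specification; all identifiers and
# values are placeholders, not NVIDIA's actual output format or data.
import json
from datetime import datetime, timezone

vex_document = {
    "@context": "https://openvex.dev/ns/v0.2.0",
    "@id": "https://example.com/vex/agent-morpheus-0001",  # hypothetical document ID
    "author": "Product security team (human-reviewed, drafted by Agent Morpheus)",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "version": 1,
    "statements": [
        {
            "vulnerability": {"name": "CVE-2024-00000"},  # placeholder CVE
            "products": [{"@id": "pkg:oci/example-container@sha256:abc123"}],
            "status": "not_affected",
            # One of the standard VEX justifications for a non-exploitable finding:
            "justification": "vulnerable_code_not_in_execute_path",
            "impact_statement": "The vulnerable function is never invoked by this image.",
        }
    ],
}

print(json.dumps(vex_document, indent=2))
```

A machine-readable statement like this is what lets downstream consumers suppress non-exploitable findings automatically instead of re-triaging the same CVE for every container that ships the affected package.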
