## Overview
Cisco, a $56 billion networking and collaboration technology company, has implemented a comprehensive agentic AI platform to transform their customer experience operations. This case study documents one of the most detailed enterprise-scale deployments of multi-agent LLM systems in production, managing operations across a 20,000-person customer experience organization that handles over $26 billion in recurring revenue. The implementation demonstrates a sophisticated approach to combining traditional machine learning with large language models through a multi-agent architecture built on LangChain.
The company's customer experience framework follows a traditional "land, adopt, expand, and renew" model, with specialized teams handling different aspects of the customer lifecycle. What makes this implementation notable is how they've systematically deployed AI agents to augment each of these functions while maintaining human oversight and achieving measurable business outcomes. The speaker emphasizes that this isn't experimental technology but rather a production system that has been running at scale for over six months.
## Technical Architecture and Multi-Agent Design
The core technical innovation lies in Cisco's multi-agent supervisor architecture, which they developed before similar approaches became mainstream in the LangChain ecosystem. Rather than attempting to solve complex customer experience queries with a single large agent, they decomposed functionality into specialized agents that work together under a supervisor's coordination. This approach directly addresses one of the key challenges in enterprise LLM deployment: achieving the high accuracy levels required for business-critical operations.
The supervisor agent receives natural language queries and intelligently routes them to appropriate specialized agents. For example, when a renewal specialist asks about a customer's renewal status and risk factors, the supervisor decomposes this into multiple sub-queries that are handled by different agents: the renewals agent for contract information, the adoption agent for usage patterns, and the delivery agent for implementation status. Each agent has access to specific data sources and tools relevant to their domain, allowing for more accurate and contextual responses.
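A minimal sketch of this pattern, assuming LangGraph (the graph framework in the LangChain ecosystem) and illustrative agent names, is shown below. Keyword matching stands in for the LLM-driven query decomposition the supervisor actually performs; none of this is Cisco's code.

```python
# Sketch of a supervisor routing sub-queries to specialized agents.
# Agent names, state shape, and routing logic are illustrative.
from typing import List, TypedDict

from langgraph.graph import END, StateGraph

class AgentState(TypedDict):
    query: str
    pending: List[str]   # specialized agents still to consult
    findings: List[str]  # partial answers gathered so far

def supervisor(state: AgentState) -> AgentState:
    # In production this step is an LLM call that decomposes the query;
    # simple keyword matching stands in for it here.
    if not state["pending"] and not state["findings"]:
        wanted = [name for name, key in
                  [("renewals", "renewal"), ("adoption", "usage"),
                   ("delivery", "implementation")]
                  if key in state["query"].lower()]
        state["pending"] = wanted or ["renewals"]
    return state

def make_agent(domain: str):
    def agent(state: AgentState) -> AgentState:
        # A real agent would call domain-specific tools and data sources.
        state["findings"].append(f"[{domain}] answer for: {state['query']}")
        state["pending"].remove(domain)
        return state
    return agent

def route(state: AgentState) -> str:
    return state["pending"][0] if state["pending"] else END

graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor)
for name in ("renewals", "adoption", "delivery"):
    graph.add_node(name, make_agent(name))
    graph.add_edge(name, "supervisor")  # each agent reports back
graph.set_entry_point("supervisor")
graph.add_conditional_edges(
    "supervisor", route,
    {"renewals": "renewals", "adoption": "adoption",
     "delivery": "delivery", END: END})
app = graph.compile()
```

Invoking `app.invoke({"query": "What's the renewal status and usage trend here?", "pending": [], "findings": []})` would route through the renewals and adoption agents before the supervisor terminates with the gathered findings.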
The system leverages multiple large language models depending on the use case and deployment requirements: Mistral Large for on-premises deployments (working closely with Mistral on model optimization for on-premises environments), Claude 3.5 Sonnet, and a range of OpenAI models from GPT-4 through o3. This flexibility in model selection allows them to meet diverse customer requirements, particularly in highly regulated industries that require on-premises deployment in physical data centers rather than cloud-based solutions.
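Because LangChain exposes a uniform chat-model interface across providers, per-deployment model selection can collapse into configuration. A hedged sketch follows; the mapping and model identifiers are illustrative assumptions, not Cisco's actual configuration.

```python
# Illustrative model selection by deployment target. init_chat_model is
# LangChain's provider-agnostic factory; the mapping is an assumption.
from langchain.chat_models import init_chat_model

MODELS_BY_DEPLOYMENT = {
    "on_prem": ("mistral-large-latest", "mistralai"),
    "cloud":   ("claude-3-5-sonnet-latest", "anthropic"),
    "hybrid":  ("gpt-4o", "openai"),
}

def get_llm(deployment: str):
    model, provider = MODELS_BY_DEPLOYMENT[deployment]
    return init_chat_model(model, model_provider=provider)
```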
## Hybrid ML and LLM Integration
One of the most sophisticated aspects of Cisco's implementation is their deliberate combination of traditional machine learning models with LLMs, recognizing the strengths and limitations of each approach. The speaker explicitly notes that "LLMs are really bad for predictions" and emphasizes that machine learning models remain superior for deterministic prediction tasks. Their system uses LLMs for language understanding, reasoning, and interaction, while relying on traditional ML models for risk prediction, sentiment analysis, and other predictive analytics tasks.
This hybrid approach is particularly evident in their renewals agent, where natural language processing capabilities are used to understand queries and provide context, but actual risk assessment relies on purpose-built machine learning models trained on historical data and customer signals. The LangChain platform serves as the orchestration layer that enables seamless context sharing between these different types of models, maintaining conversation state and ensuring that insights from ML models are properly integrated into the LLM-generated responses.
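One plausible way to wire this division of labor is to expose the trained ML model to the agent as a tool, so the LLM can invoke predictions but never produces them itself. The sketch below assumes a scikit-learn-style classifier; the model file, features, and tool name are hypothetical.

```python
# Hybrid pattern sketch: a traditional ML classifier makes the prediction,
# and the LLM agent only calls it as a tool. All names are hypothetical.
import joblib
from langchain_core.tools import tool

risk_model = joblib.load("renewal_risk_model.joblib")  # e.g. scikit-learn

@tool
def predict_renewal_risk(usage_trend: float, support_cases: int,
                         days_to_renewal: int) -> str:
    """Return the churn-risk probability from the trained ML model; the
    LLM never estimates this number itself."""
    proba = risk_model.predict_proba(
        [[usage_trend, support_cases, days_to_renewal]])[0][1]
    return f"Predicted renewal risk: {proba:.0%}"
```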
The integration also extends to their custom AI models, where they've invested in both creating new machine learning models for specific prediction tasks and fine-tuning LLMs for their particular use cases. This fine-tuning is especially important for their on-premises deployments, where they need to achieve high accuracy levels while working within the constraints of customer data residency requirements.
## Production Deployment and Evaluation Framework
Cisco's approach to production deployment demonstrates a mature understanding of LLMOps challenges. They operate three parallel tracks: limited availability deployments with real users, full production systems, and ongoing experimentation for new use cases. This multi-track approach allows them to maintain production stability while continuing to innovate and expand their AI capabilities.
Their evaluation framework is particularly noteworthy for its organizational separation. They maintain dedicated evaluation teams that are independent from development teams, with access to golden datasets and clear performance metrics. This separation helps avoid the common pitfall of teams evaluating their own work and ensures objective assessment of system performance. The speaker metaphorically describes this as not making "the dog the custodian of the sausages," emphasizing the importance of independent evaluation.
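Mechanically, such an evaluation can stay simple even when the organizational separation is the hard part. Below is a minimal sketch assuming a JSON golden dataset and substring grading, both illustrative simplifications; production graders are typically far more sophisticated.

```python
# Sketch of an independent evaluation loop over a golden dataset.
# Dataset format, grading rule, and threshold are assumptions.
import json

def evaluate(agent, golden_path: str, threshold: float = 0.95) -> bool:
    with open(golden_path) as f:
        cases = json.load(f)  # [{"query": ..., "expected": ...}, ...]
    passed = sum(
        1 for case in cases
        if case["expected"].lower() in agent(case["query"]).lower()
    )
    accuracy = passed / len(cases)
    print(f"Accuracy: {accuracy:.1%} on {len(cases)} golden cases")
    return accuracy >= threshold
```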
The system achieves remarkable performance metrics, including 95% accuracy in risk recommendations and 60% automation of their 1.6-1.8 million annual support cases. These metrics are particularly impressive given the complexity of enterprise customer data and the high-stakes nature of customer experience decisions. The 20% reduction in operational time achieved within just three weeks of limited availability deployment demonstrates rapid time-to-value that's often challenging to achieve with enterprise AI implementations.
## Use Case Strategy and Business Impact
Cisco's use case selection methodology provides valuable insights for other enterprises considering large-scale AI deployments. Rather than allowing teams to pursue AI projects opportunistically, they established strict criteria requiring any use case to fit into one of three buckets: helping customers get immediate value from their Cisco investments, making operations more secure and reliable, or providing visibility and insights across the customer lifecycle. This framework helped them focus on high-impact applications rather than getting distracted by technologically interesting but business-irrelevant use cases.
The business impact is substantial and measurable. With over half of Cisco's revenue being recurring, improvements in renewal processes directly translate to significant financial outcomes. The speaker emphasizes the direct correlation between reducing operational burden on customer-facing teams and increasing revenue, as time saved from administrative tasks can be redirected to customer engagement and business development activities.
Their support automation represents another significant value driver, with 60% of cases being resolved without human intervention. This level of automation in technical support is particularly challenging to achieve given the complexity of networking and collaboration technology issues, making their success notable for other technology companies considering similar implementations.
## Deployment Flexibility and Enterprise Requirements
One of the most technically challenging aspects of Cisco's implementation is their support for flexible deployment models. Unlike many AI companies that focus primarily on cloud-based solutions, Cisco supports true on-premises deployment in physical data centers, hybrid cloud deployments, and full cloud implementations. This flexibility is driven by customer requirements in highly regulated industries, federal government contracts, and international markets with strict data residency requirements.
The ability to deploy the same agentic framework across these different environments without code changes represents a significant technical achievement. This deployment flexibility is enabled by their use of LangChain's abstraction layer, which allows the same agent workflows to run regardless of the underlying infrastructure. The speaker notes, however, that achieving this required close collaboration with model providers, particularly Mistral, to optimize models for on-premises hardware constraints.
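In practice this often reduces to dependency injection: the agent code accepts any chat model implementing LangChain's interface and never names a provider. A minimal sketch, with hypothetical prompt content:

```python
# The chain below never references a provider; swapping deployment targets
# means swapping the injected `llm`, not editing this code.
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.prompts import ChatPromptTemplate

def build_renewal_chain(llm: BaseChatModel):
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a renewals assistant. Use only supplied facts."),
        ("human", "{question}"),
    ])
    return prompt | llm
```

Paired with a configuration-driven factory such as the `get_llm` sketch above, the same chain can serve an on-premises Mistral deployment and a cloud-hosted model without modification.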
## Technical Challenges and Lessons Learned
The presentation includes several candid assessments of technical challenges and lessons learned. The speaker strongly emphasizes that combining LLMs with SQL for enterprise data access is "really really hard" and warns against using LLMs for SQL joins, describing the results as poor. Instead, they leverage Snowflake's Cortex for semantic context and metadata handling, while normalizing data structures to work more effectively with LLM capabilities.
The challenge of achieving high accuracy with text-to-SQL in enterprise environments reflects a broader issue in enterprise AI deployments: the complexity and inconsistency of enterprise data schemas make it difficult for LLMs to generate reliable SQL queries. Cisco's approach of using semantic layers and avoiding complex SQL operations demonstrates a pragmatic approach to working within current LLM limitations.
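One way to realize this pragmatism, sketched here without claiming Cisco's or Cortex's actual mechanics, is to confine the LLM to parameter extraction over a pre-joined view so it never authors join logic; the view and column names below are hypothetical.

```python
# The joins live upstream in ETL, behind a denormalized view. The LLM's
# only job is to extract filter parameters from the user's question; the
# SQL itself is fixed, parameterized, and reviewed by humans.
import sqlite3

SAFE_QUERY = """
    SELECT customer, renewal_date, risk_score
    FROM renewals_denormalized
    WHERE region = ? AND renewal_date <= ?
"""

def run_renewal_lookup(conn: sqlite3.Connection, region: str, before: str):
    return conn.execute(SAFE_QUERY, (region, before)).fetchall()
```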
Context management across agents represents another significant technical challenge. While the Model Context Protocol (MCP) provides some capabilities for context sharing, the speaker describes it as "Swiss cheese" with significant gaps. Cisco has contributed to an open-source initiative called AGNTCY that proposes a more comprehensive architecture for agent-to-agent communication, including concepts such as agent directories and authentication mechanisms, analogous to the role DNS servers play for internet communication.
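To make the directory concept concrete, here is an illustrative sketch of a DNS-like capability lookup; the record fields and registry shape are assumptions, not the AGNTCY specification.

```python
# Hypothetical agent directory: capability in, routable endpoint out,
# analogous to a DNS query. Fields and values are illustrative only.
from dataclasses import dataclass

@dataclass
class AgentRecord:
    name: str
    endpoint: str
    capabilities: tuple
    auth_scheme: str  # e.g. "mtls" or "oauth2"

DIRECTORY = {
    "renewals": AgentRecord("renewals", "https://agents.example/renewals",
                            ("contracts", "risk"), "mtls"),
}

def resolve(capability: str) -> AgentRecord:
    for record in DIRECTORY.values():
        if capability in record.capabilities:
            return record
    raise LookupError(f"no agent advertises capability {capability!r}")
```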
## Organizational and Process Considerations
The implementation required significant organizational changes beyond the technical deployment. Cisco emphasizes the importance of engaging with subject matter experts and end users before development rather than after, ensuring that the AI solutions actually address real user needs rather than being technically impressive but practically irrelevant. This user-centric approach involved working closely with renewal specialists, adoption teams, and support engineers to understand their specific questions and workflows.
The speaker advocates for having separate teams focused on experimentation versus production, recognizing that these functions require different metrics, risk tolerances, and operational approaches. Experimentation teams need the freedom to fail fast and try novel approaches, while production teams must focus on reliability, performance, and business outcomes. This organizational separation helps ensure that innovation continues while maintaining production system stability.
The presentation also highlights the importance of defining success metrics before selecting tools or technologies. This disciplined approach helps teams focus on business outcomes rather than getting caught up in the latest technological trends. The speaker notes that many organizations start with 400+ potential AI use cases but only find 5 that actually contribute meaningful business value, emphasizing the importance of rigorous prioritization.
## Future Directions and Industry Impact
Cisco's work represents a significant contribution to the broader understanding of how to deploy LLMs effectively in enterprise environments. Their open-source contributions through the AGNTCY initiative and their collaboration with LangChain demonstrate a commitment to advancing the entire industry rather than just their own capabilities. The architectural patterns they've developed for multi-agent systems, hybrid ML/LLM integration, and flexible deployment models provide valuable blueprints for other enterprises.
The scale of their deployment – supporting 20,000 users across multiple complex workflows – provides confidence that agentic AI systems can work reliably in large enterprise environments. Their achievement of 95% accuracy levels and significant operational improvements demonstrates that the technology has matured beyond experimental applications to deliver real business value.
However, the presentation also highlights ongoing challenges in the field, particularly around SQL integration, context management, and the need for better standardization in agent-to-agent communication protocols. These challenges suggest areas where the broader AI industry needs continued development to fully realize the potential of enterprise AI deployments.