ZenML

AI-Powered Network Operations Assistant with Multi-Agent RAG Architecture

Swisscom 2025

Swisscom, Switzerland's leading telecommunications provider, developed a Network Assistant using Amazon Bedrock to address the challenge of network engineers spending over 10% of their time manually gathering and analyzing data from multiple sources. The solution implements a multi-agent RAG architecture with specialized agents for documentation management and calculations, combined with an ETL pipeline using AWS services. The system is projected to reduce routine data retrieval and analysis time by 10%, saving approximately 200 hours per engineer annually while maintaining strict data security and sovereignty requirements for the telecommunications sector.

Industry

Telecommunications

Overview

Swisscom’s Network Assistant represents a comprehensive LLMOps implementation designed to transform network operations through AI-powered automation. As Switzerland’s leading telecommunications provider, Swisscom faced the challenge of network engineers spending more than 10% of their time on manual data gathering and analysis from multiple disparate sources. The solution leverages Amazon Bedrock as the foundation for a sophisticated multi-agent system that combines generative AI capabilities with robust data processing pipelines to deliver accurate and timely network insights.

Technical Architecture and Evolution

The solution architecture evolved through several iterations, demonstrating the iterative nature of LLMOps development. The initial implementation established basic RAG functionality using Amazon Bedrock Knowledge Bases, where user queries are matched with relevant knowledge base content through embedding models, context is enriched with retrieved information, and the LLM produces informed responses. However, the team discovered that this basic approach struggled with large input files containing thousands of rows with numerical values across multiple parameter columns, highlighting the complexity of implementing LLMs for technical, data-heavy use cases.
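This initial retrieval flow maps naturally onto Bedrock's retrieve-and-generate API. As a minimal sketch (assuming a provisioned knowledge base; the IDs and model ARN are placeholders, not Swisscom's actual configuration), the request could be assembled like this:

```python
def build_rag_request(query: str, kb_id: str, model_arn: str) -> dict:
    """Assemble a retrieve-and-generate request: the service embeds the
    query, retrieves matching knowledge base chunks, and has the LLM
    answer with that retrieved context."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,   # placeholder knowledge base ID
                "modelArn": model_arn,      # placeholder model ARN
            },
        },
    }

# The request would then be sent through the Bedrock agent runtime, e.g.:
#   boto3.client("bedrock-agent-runtime").retrieve_and_generate(**request)
```

Because retrieval here operates on text chunks, numerical tables embedded in large input files are flattened into prose-like fragments, which is precisely where this basic approach broke down.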

The architecture evolved to incorporate a multi-agent approach using Amazon Bedrock Agents, featuring three specialized components: a supervisor agent that interprets user queries and routes them to the appropriate specialist, a documentation management agent that handles RAG-based retrieval over network documentation, and a calculator agent that performs precise numerical analysis over structured network data.

Data Pipeline and Processing

A critical aspect of the LLMOps implementation is the sophisticated ETL pipeline that ensures data accuracy and scalability. The system uses Amazon S3 as the data lake with daily batch ingestion, AWS Glue for automated data crawling and cataloging, and Amazon Athena for SQL querying. Within this serverless architecture, the calculator agent translates natural language user prompts into SQL queries, dynamically selecting and executing the relevant query based on its analysis of the input parameters.
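One way to picture the calculator agent's behavior is as intent-to-query mapping: the LLM extracts an intent and parameters from the prompt, and a deterministic layer supplies the SQL that Athena executes. The sketch below uses illustrative table and column names, not Swisscom's actual schema:

```python
from string import Template

# Hypothetical query templates the agent selects among; names are illustrative.
QUERY_TEMPLATES = {
    "link_utilization": Template(
        "SELECT link_id, AVG(utilization_pct) AS avg_util "
        "FROM network_metrics WHERE day = '$day' "
        "GROUP BY link_id ORDER BY avg_util DESC LIMIT $limit"
    ),
    "packet_loss": Template(
        "SELECT link_id, MAX(packet_loss_pct) AS peak_loss "
        "FROM network_metrics WHERE day = '$day' GROUP BY link_id"
    ),
}

def select_query(intent: str, **params) -> str:
    """Map an LLM-extracted intent plus parameters to a concrete Athena SQL
    statement, keeping numerical work in the SQL engine rather than the LLM."""
    return QUERY_TEMPLATES[intent].substitute(**params)

# The resulting SQL would then be executed serverlessly, e.g. via
#   boto3.client("athena").start_query_execution(QueryString=sql, ...)
```

Keeping aggregation inside Athena is what lets the agent stay accurate on thousands of rows that would overwhelm a context window.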

The evolution from initial Pandas or Spark data processing to direct SQL query execution through Amazon Bedrock Agents demonstrates the importance of finding the right balance between AI model interpretation and traditional data processing approaches. This hybrid approach facilitates both accuracy in calculations and richness in contextual responses, addressing a common challenge in LLMOps where pure LLM-based solutions may struggle with precise numerical computations.

Security and Compliance Implementation

The implementation showcases sophisticated approaches to data security and compliance in LLMOps, particularly relevant for telecommunications where data sovereignty requirements are stringent. The system implements comprehensive guardrails through Amazon Bedrock, including content filters that block harmful categories such as hate, insults, violence, and prompt-based threats like SQL injection. The security framework includes specific filters for sensitive telecommunications identifiers (IMSI, IMEI, MAC addresses, GPS coordinates) through manual word filters and regex-based pattern detection.
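A minimal sketch of such regex-based detection, with illustrative patterns rather than Swisscom's actual filters, could look like the following (IMSI values, which are 14–15 digits, would need a similar pattern plus context to disambiguate them from IMEIs):

```python
import re

# Illustrative patterns for sensitive telecom identifiers; not Swisscom's filters.
PATTERNS = {
    "MAC": re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b"),
    "IMEI": re.compile(r"\b\d{15}\b"),  # 15-digit device identifier
    "GPS": re.compile(r"\b-?\d{1,2}\.\d{3,},\s*-?\d{1,3}\.\d{3,}\b"),
}

def redact(text: str) -> str:
    """Replace sensitive identifiers with typed placeholders before the
    text reaches the model or is returned to the user."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text
```

In a production guardrail these patterns would run alongside the managed content filters, so that both harmful content and identifier leakage are blocked in one pass.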

The team conducted a thorough threat model evaluation following the STRIDE methodology (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege), creating detailed data flow diagrams and establishing trust boundaries within the application. This comprehensive security approach demonstrates best practices for LLMOps implementations in regulated industries, where data protection and compliance are paramount.

Performance and Scalability Considerations

The serverless architecture choice proves particularly beneficial for LLMOps deployment, minimizing compute resource management while providing automatic scaling capabilities. The pay-per-use model of AWS services helps maintain low operational costs while ensuring high performance, addressing common concerns about LLMOps cost management. The system integrates with Swisscom’s on-premises data lake through daily batch data ingestion, demonstrating how cloud-based LLMOps solutions can effectively interface with existing enterprise infrastructure.

Evaluation and Impact Assessment

The implementation includes robust evaluation mechanisms: contextual grounding and relevance checks verify that model responses are factually accurate and appropriate. The projected benefits include a 10% reduction in time spent on routine data retrieval and analysis tasks, translating to approximately 200 hours saved per engineer annually. The financial impact shows substantial cost savings per engineer, with operational costs at less than 1% of total value generated, demonstrating the strong ROI characteristics typical of successful LLMOps implementations.
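Bedrock guardrails expose these checks as a contextual grounding policy. A minimal sketch of such a configuration, with illustrative thresholds rather than Swisscom's actual settings, might look like:

```python
def grounding_policy(grounding_threshold: float, relevance_threshold: float) -> dict:
    """Build a contextual grounding policy: GROUNDING blocks responses not
    supported by the retrieved source material, RELEVANCE blocks responses
    that do not address the user's query."""
    return {
        "filtersConfig": [
            {"type": "GROUNDING", "threshold": grounding_threshold},
            {"type": "RELEVANCE", "threshold": relevance_threshold},
        ]
    }

# Passed as contextualGroundingPolicyConfig when creating the guardrail, e.g.:
#   boto3.client("bedrock").create_guardrail(name=..., ...,
#       contextualGroundingPolicyConfig=grounding_policy(0.75, 0.75))
```

Raising the thresholds trades coverage for precision: more responses are blocked, but those that pass are more reliably anchored in the retrieved network data.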

Deployment and Operational Considerations

The team’s adoption of infrastructure as code (IaC) principles through AWS CloudFormation demonstrates mature LLMOps practices, enabling automated and consistent deployments while providing version control of infrastructure components. This approach facilitates easier scaling and management of the Network Assistant solution as it grows, addressing common challenges in LLMOps around deployment consistency and scalability.
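As a hedged illustration of what such IaC might cover, the CloudFormation fragment below declares a data-lake bucket and a daily Glue crawler; all names and the schedule are assumptions for illustration, not Swisscom's actual template:

```yaml
Parameters:
  CrawlerRoleArn:
    Type: String            # IAM role ARN for the crawler (assumed to exist)

Resources:
  NetworkDataLake:
    Type: AWS::S3::Bucket   # data lake receiving the daily batch ingestion

  NetworkDataCrawler:
    Type: AWS::Glue::Crawler
    Properties:
      Role: !Ref CrawlerRoleArn
      DatabaseName: network_metrics            # hypothetical Glue database
      Targets:
        S3Targets:
          - Path: !Sub "s3://${NetworkDataLake}/daily/"
      Schedule:
        ScheduleExpression: "cron(0 3 * * ? *)"  # crawl once per day
```

Versioning fragments like this alongside the application code is what makes the deployments repeatable as the assistant's data sources grow.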

Future Enhancements and Lessons Learned

The roadmap includes implementing a network health tracker agent for proactive monitoring, integration with Amazon SNS for proactive alerting, and expansion of data sources and use cases. Key lessons learned include the importance of addressing data sovereignty requirements early in the design process, the need for hybrid approaches that combine AI model interpretation with traditional data processing for numerical accuracy, and the benefits of serverless architectures for LLMOps implementations.

The team’s experience highlights that complex calculations involving significant data volume management require different approaches than pure AI model interpretation, leading to their enhanced data processing pipeline that combines contextual understanding with direct database queries. This insight is particularly valuable for organizations implementing LLMOps in technical domains where precision and accuracy are critical.

Industry-Specific Considerations

The telecommunications industry context provides valuable insights into LLMOps implementation in regulated environments. The solution addresses sector-specific challenges around data classification, compliance with telecommunications regulations, and handling of sensitive network data. The threat modeling approach and comprehensive security framework serve as a model for other organizations operating in regulated industries considering LLMOps implementations.

The case study demonstrates how LLMOps can transform traditional engineering workflows while maintaining strict compliance and security requirements. The combination of Amazon Bedrock’s capabilities with careful attention to data security and accuracy shows how modern AI solutions can address real-world engineering challenges in highly regulated environments, providing a blueprint for similar implementations across the telecommunications sector and other infrastructure-intensive industries.
