ZenML

Journey Towards Autonomous Network Operations with AI/ML and Dark NOC

BT

BT is undertaking a major transformation of their network operations, moving from traditional telecom engineering to a software-driven approach with the goal of creating an autonomous "Dark NOC" (Network Operations Center). The initiative focuses on handling massive amounts of network data, implementing AI/ML for automated analysis and decision-making, and consolidating numerous specialized tools into a comprehensive intelligent system. The project involves significant organizational change, including upskilling teams and partnering with AWS to build data foundations and AI capabilities for predictive maintenance and autonomous network management.

Industry

Telecommunications

Overview

This case study presents a conversation between Ajay Ravindranathan, a Principal Solutions Architect at AWS’s Telecom Industries Business Unit, and Reza Rahnama, the Managing Director of Mobile Networks at British Telecom (BT). The discussion centers on BT’s ambitious vision called the “Dark NOC” — a journey toward autonomous network operations where the network can self-heal and self-optimize with minimal human intervention.

BT operates one of the largest mobile networks in the UK, encompassing thousands of radio sites, thousands of gateways, a massive IP network, and distributed core and IMS (IP Multimedia Subsystem) networks. The organization faces the classic telecommunications challenge: despite technological evolution from 2G through 5G, operational practices have remained largely unchanged. The same functional teams that managed HLR (Home Location Register) in 2G days now manage HSS (Home Subscriber Server) in 5G, with similar manual processes and siloed expertise.

The Problem Space

The core challenge BT faces is multi-dimensional. First, the network generates enormous amounts of data — described as “petabytes and petabytes” — flowing from various element managers into network management services. Currently, this data is monitored via basic SNMP polling and analyzed through manual syslog reviews by siloed teams with specialized expertise. This approach is neither scalable nor efficient for modern network operations.

Second, BT operates with “hundreds and hundreds of tools that are very bespoke for a function.” Each module or function in the network has its own dedicated monitoring tools, creating fragmentation and preventing holistic network intelligence. The lack of integration means that when issues occur, it’s extremely difficult to correlate events across different network domains to understand root causes.

Third, there’s a significant organizational and skills gap. The team needs to transform from a traditional network engineering organization into what Rahnama describes as “a software team that runs the mobile network.” This requires upskilling personnel to understand both the domain expertise (radio, core, transmission, IMS, devices) and new technology paradigms around data engineering, ML, and AI.

The Solution Approach and AI/ML Strategy

BT’s approach with AWS follows a phased methodology that prioritizes foundational work before advanced AI implementation. This is a notably pragmatic approach that contrasts with many organizations that rush to implement AI without proper data foundations.

Phase 1: Data Foundation

The first and most critical phase involves “fixing the data.” This means taking the raw data output from the network, cleaning it, and creating “a sensible data architecture.” As Rahnama emphasizes, this work happens “way before we predict anything” — a crucial acknowledgment that AI and ML are only as good as the data they’re trained on.

AWS is helping BT create what they call a “network graph” or “knowledge graph” — essentially a unified topology that represents the relationships between different network elements, supported by AWS services for topology, discovery, and analytics.
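The case study does not show BT's actual graph model, but the core idea — a unified topology that lets you trace which elements depend on which — can be sketched with a plain adjacency map and a breadth-first walk. All element names below (RAN-0042, GW-01, etc.) are hypothetical illustrations, not BT identifiers.

```python
from collections import deque

# Hypothetical topology: each element maps to its downstream dependents.
# A real network graph would be discovered automatically, not hand-written.
topology = {
    "RAN-0042": ["GW-01"],
    "GW-01": ["CORE-A"],
    "CORE-A": ["IMS-1", "IMS-2"],
    "IMS-1": [],
    "IMS-2": [],
}

def impacted_elements(root: str) -> set:
    """Breadth-first walk listing everything downstream of a faulty element."""
    seen, queue = set(), deque([root])
    while queue:
        node = queue.popleft()
        for dep in topology.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# A fault at a radio site propagates through the gateway, core, and IMS.
print(sorted(impacted_elements("RAN-0042")))
```

This kind of reachability query is what turns raw topology data into service-impact and root-cause answers: the same graph, walked in the opposite direction, points from a symptom back toward candidate causes.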

Phase 2: The AI Continuum and Agentic Framework

The AWS representative introduces an important concept: the “AI continuum.” This recognizes that different types of AI models serve different purposes, and that production systems need to orchestrate multiple approaches — from traditional ML models through to generative AI — within an agentic framework.

The philosophy articulated is to “use the right tool for the right job” — acknowledging that some problems are better solved with traditional ML while others benefit from generative AI capabilities. This is a mature LLMOps perspective that avoids the trap of treating Gen AI as a universal solution.
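A minimal sketch of the “right tool for the right job” idea, assuming a router that sends numeric telemetry to a classical statistical model and unstructured text to a generative model. The `llm_summarize` function here is a stand-in stub, not a real LLM call.

```python
def classical_anomaly_score(samples):
    """Classical ML stand-in: z-score of the latest sample vs. the series mean."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    std = var ** 0.5 or 1.0  # avoid division by zero on flat series
    return abs(samples[-1] - mean) / std

def llm_summarize(ticket_text):
    """Placeholder for a generative-AI call; a real system would invoke an LLM."""
    return "summary: " + ticket_text[:40]

def route(task):
    """Right tool for the right job: numeric series go to classical ML,
    unstructured text goes to the generative model."""
    if isinstance(task, list) and all(isinstance(x, (int, float)) for x in task):
        return ("classical", classical_anomaly_score(task))
    return ("generative", llm_summarize(str(task)))

print(route([100, 101, 99, 250]))          # dispatched to classical scoring
print(route("link flap observed on GW-01"))  # dispatched to the LLM stub
```

An agentic framework generalizes this dispatch step: an orchestrator decomposes a task and hands each sub-task to whichever model or tool suits it.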

Target Use Cases

The discussion outlines several specific applications for the AI system:

Automated Data Analysis: Currently, humans spend significant time “looking at huge amount of data and trying to figure out what’s wrong with the network.” The goal is to automate this analysis, having the system identify problems and potentially make decisions autonomously.

Root Cause Analysis: When network events occur, determining the underlying cause is currently extremely difficult given the network’s complexity. AI can help correlate events across network domains to pinpoint issues faster.

Service Impact Analysis: Understanding how network issues affect customer services and applications.

Anomaly Detection: Particularly important for both operational stability and cybersecurity. The example given involves detecting if “a container just got access through something that maybe a basic RBACs could stop it.”

Predictive Maintenance: The “holy grail” according to Rahnama — predicting failures before they occur and preventing them. This is explicitly tied to continuous service availability.

Network Optimization: Using predictive analysis to optimize the radio network performance.

Application-Aware Networking: An advanced vision where “the network adjusts itself for the applications that you use” — essentially dynamic network behavior based on workload requirements.
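Of these use cases, anomaly detection on KPI time series is the most mechanically concrete. A minimal sketch, assuming a trailing-window z-score test (the KPI values are invented for illustration):

```python
from statistics import mean, stdev

def detect_anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value deviates from the trailing window's mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        ref = series[i - window:i]
        mu, sigma = mean(ref), stdev(ref)
        if sigma == 0:
            sigma = 1e-9  # flat window: avoid division by zero
        if abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# A sudden spike in an otherwise stable counter is flagged.
kpi = [100, 101, 99, 100, 102, 100, 250, 101]
print(detect_anomalies(kpi))  # → [6]
```

Production systems layer seasonality handling, multivariate correlation, and alert deduplication on top of this kind of baseline, but the statistical core is the same.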

Infrastructure and Technology Context

Several technical aspects of BT’s infrastructure are relevant to the LLMOps context:

Containerized Core Network: BT has moved to a “huge containerized network” which enables capabilities like moving traffic from faulty nodes to working nodes. This modern infrastructure is more amenable to automated management and provides better observability for AI systems.

AWS Partnership: The cloud partnership provides both the data platform capabilities and AI/ML services. AWS is specifically mentioned as providing services around network graph, topology, discovery, and analytics.

Consolidation Strategy: Rather than building AI capabilities for each existing tool, the vision is to create “a bigger network monitoring system” that consolidates hundreds of tools into unified intelligence.

Organizational Transformation

A critical but often overlooked aspect of LLMOps is the organizational change required. BT’s approach explicitly addresses this:

Skills Development: The team needs to understand “how all these new technology works” — specifically combining domain expertise (radio engineering, core engineering) with data and software skills.

Process Automation: Before applying AI, basic processes need to be cleaned and automated. Once processes are well-defined, some decisions can be made with very basic rule-based logic rather than ML at all.

Human-Machine Division: The vision clarifies that machines handle immediate decisions “to keep the network going” while humans “go figure out why it failed, was it a design issue or not.” This is a thoughtful approach to human-AI collaboration.
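The human-machine division described above can be illustrated as a simple triage rule: the machine takes the immediate stabilizing action, and anything requiring diagnosis is routed to people. The alarm fields and action names here are hypothetical.

```python
def immediate_action(alarm):
    """Rule-based triage: the machine keeps the network going;
    humans investigate the root cause afterwards."""
    if alarm.get("severity") == "critical" and alarm.get("standby_available"):
        return "failover"        # move traffic to a working node now
    if alarm.get("severity") == "critical":
        return "page_oncall"     # no standby path: escalate to a human immediately
    return "queue_for_review"    # non-urgent: humans analyse later

print(immediate_action({"severity": "critical", "standby_available": True}))
```

The point is that the automated path only ever takes reversible, pre-approved actions; the "why did it fail, was it a design issue" question stays with engineers.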

Balanced Assessment

It’s important to note that this case study represents a vision and early-stage work rather than a deployed production system, and the discussion includes several honest acknowledgments to that effect.

The partnership with AWS positions this as a vendor-customer relationship where AWS is promoting its services, so the positive framing should be considered in that context. However, the phased approach and emphasis on data foundations before AI implementation reflects industry best practices.

Key LLMOps Lessons

Several themes from this case study are relevant for LLMOps practitioners:

Data Quality First: The importance of data quality and accessibility cannot be overstated — even the most sophisticated Gen AI systems require clean, governed, accessible data.

The AI Continuum: Production AI systems often need multiple model types working together, not just LLMs.

Agentic Orchestration: The agentic framework approach suggests orchestrating different AI capabilities rather than relying on single models.

Human-in-the-Loop Design: This remains crucial even in “autonomous” visions — the goal is augmentation, not complete replacement.

Organizational Transformation: Skills, processes, and culture are just as important as technology choices for successful AI deployment at scale.
