**Company:** Thomson Reuters
**Title:** Agentic Platform Engineering Hub for Cloud Operations Automation
**Industry:** Tech
**Year:** 2026
**Summary (short):**
Thomson Reuters' Platform Engineering team transformed their manual, labor-intensive operational processes into an automated agentic system to address challenges in providing self-service cloud infrastructure and enablement services at scale. Using Amazon Bedrock AgentCore as the foundational orchestration layer, they built "Aether," a custom multi-agent system featuring specialized agents for cloud account provisioning, database patching, network configuration, and architecture review, coordinated through a central orchestrator agent. The solution delivered a 15-fold productivity gain, achieved 70% automation rate at launch, and freed engineering teams from repetitive tasks to focus on higher-value innovation work while maintaining security and compliance standards through human-in-the-loop validation.
## Overview

Thomson Reuters, a leading AI and technology company serving the legal, tax, accounting, risk, trade, and media sectors, faced significant operational challenges in its Platform Engineering team. The team supports multiple lines of business by providing essential cloud infrastructure and enablement services, but manual processes and repeated coordination between teams created substantial delays and inefficiencies. Engineers spent considerable time answering the same questions and executing identical processes across different teams, particularly for operational tasks such as database management, information security and risk management (ISRM) operations, landing zone maintenance, infrastructure provisioning, secrets management, CI/CD pipeline orchestration, and compliance automation.

The Platform Engineering team built an autonomous agentic solution called "Aether" using Amazon Bedrock AgentCore as the foundational orchestration layer. The system features multiple specialized agents handling different service domains and is designed to let non-technical users interact with complex automation through natural language requests. The solution was published in January 2026 as a case study demonstrating production use of agentic AI for enterprise operations.

## Technical Architecture and Design Principles

The solution architecture reflects three core principles: scalability, extensibility, and security. The system consists of four primary components working together to deliver autonomous operational capabilities. First, a custom web portal built with React and hosted on Amazon S3 provides secure agent interactions through enterprise single sign-on authentication. Second, a central orchestrator agent named "Aether" routes requests and manages interactions between the different specialized agents. Third, multiple service-specific agents handle specialized tasks, including AWS account provisioning, database patching, network service configuration, and architecture review. Fourth, a human-in-the-loop validation service called "Aether Greenlight" provides oversight for sensitive operations.

The team chose Amazon Bedrock AgentCore specifically because it provides the complete foundational infrastructure for building, deploying, and operating enterprise-grade AI agents at scale without requiring teams to build that infrastructure from scratch. This decision gave the Platform Engineering team the flexibility to innovate with their preferred frameworks while ensuring agents operate with enterprise-level security, reliability, and scalability, which are critical requirements for managing production operational workflows.

## Orchestrator Agent Implementation

The orchestrator service, Aether, was designed as a modular system using the LangGraph framework. The orchestrator's core responsibility is routing incoming requests to the appropriate specialized agents while maintaining conversation context and security. The system retrieves context from a custom-built agent registry to determine which agent should handle each situation. When an agent's actions are required, the orchestrator makes a tool call that programmatically populates data from the registry, which helps prevent prompt injection attacks and facilitates more secure communication between endpoints.
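To make the pattern concrete, here is a minimal sketch of registry-driven routing in LangGraph. The registry contents, agent names, and dispatch step are hypothetical placeholders, not Thomson Reuters' actual implementation; the point it mirrors is that endpoint data comes from a registry record rather than from free-form model output.

```python
"""Minimal sketch of registry-driven request routing with LangGraph.

Hypothetical illustration only: the registry contents, agent names, and
dispatch logic are placeholders, not the Aether orchestrator's code.
"""
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

# Stand-in for the custom agent registry (in the case study this is a
# DynamoDB-backed A2A registry queried at runtime).
AGENT_REGISTRY = {
    "account_provisioning": {"endpoint": "https://example.internal/provision"},
    "database_patching": {"endpoint": "https://example.internal/db-patch"},
}


class OrchestratorState(TypedDict):
    request: str
    target_agent: str
    response: str


def route_request(state: OrchestratorState) -> OrchestratorState:
    """Decide which specialized agent should handle the request.

    A production orchestrator would ask an LLM to classify the request;
    a keyword match stands in here so the sketch runs offline.
    """
    target = (
        "database_patching"
        if "patch" in state["request"].lower()
        else "account_provisioning"
    )
    return {**state, "target_agent": target}


def dispatch(state: OrchestratorState) -> OrchestratorState:
    """Call the chosen agent using data taken from the registry record,
    not from model-generated text, which limits prompt-injection exposure."""
    entry = AGENT_REGISTRY[state["target_agent"]]
    # In production this would be an authenticated HTTP/A2A call.
    return {**state, "response": f"dispatched to {entry['endpoint']}"}


graph = StateGraph(OrchestratorState)
graph.add_node("route", route_request)
graph.add_node("dispatch", dispatch)
graph.add_edge(START, "route")
graph.add_edge("route", "dispatch")
graph.add_edge("dispatch", END)
orchestrator = graph.compile()

if __name__ == "__main__":
    result = orchestrator.invoke({"request": "Please patch the billing database"})
    print(result["target_agent"], "->", result["response"])
```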
A particularly notable aspect of the orchestrator design is its integration with the AgentCore Memory service. The system implements both short-term and long-term memory to maintain conversation context while keeping the underlying system stateless. Short-term memory maintains context within individual conversations, enabling natural multi-turn interactions in which the agent understands references to previous statements. Long-term memory tracks user preferences and interaction patterns over time, allowing the system to learn from past interactions and avoid repeating previous mistakes. This dual-memory design is a sophisticated way to manage conversational state in a production agentic system.

## Agent Development Framework and Deployment

Thomson Reuters developed an internal framework called TR-AgentCore-Kit (TRACK) to simplify and standardize agent deployment across the organization. TRACK is a customized version of the Bedrock AgentCore Starter Toolkit, modified to meet Thomson Reuters' compliance requirements, including asset identification and resource tagging standards. The framework abstracts away infrastructure concerns, automatically handling the connection to AgentCore Runtime, tool management, AgentCore Gateway connectivity, and baseline agent setup, so developers can focus on implementing business logic rather than infrastructure plumbing.

TRACK also manages the registration of service agents into the Aether environment by deploying agent cards into the custom-built agent-to-agent (A2A) registry. This flow lets developers deploy agents to AWS and register them with the custom-built services in one unified package, so an agent built by a service team can be onboarded smoothly and made available through the overarching orchestrator. The abstraction significantly lowers the barrier to entry for teams wanting to build new specialized agents.

The use of AgentCore Gateway is particularly relevant from an LLMOps perspective, as it provides a straightforward and secure way for developers to build, deploy, discover, and connect to tools at scale. This addresses one of the key challenges in production LLM systems: managing the integration between language models and external tools or APIs in a secure, discoverable, and maintainable way.

## Agent Discovery and Registration System

To enable seamless agent discovery and communication across the multi-agent system, Thomson Reuters implemented a custom A2A registry using Amazon DynamoDB and Amazon API Gateway. The registry supports cross-account agent calls, which was essential for a modular architecture in which different agents may be owned and operated by different teams within the organization. Registration happens through the TRACK framework, allowing teams to register their agents directly with the orchestrator service without manual intervention.

The A2A registry maintains a comprehensive history of agent versions for auditing purposes, which is critical for compliance and debugging in production environments. Importantly, the registry requires human validation before a new agent is allowed into the production environment. This governance model keeps the system aligned with Thomson Reuters' information security and risk management standards while leaving room for future expansion. It is a pragmatic way to balance innovation velocity with enterprise governance requirements, a common challenge when deploying LLM-based systems in regulated or security-conscious environments.
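As a rough sketch of what a versioned agent-card entry in such a registry might look like, the snippet below writes and reads a card in DynamoDB with boto3. The table name, key schema, and card fields are illustrative assumptions; the case study does not publish the actual schema.

```python
"""Hypothetical agent-card registration against a DynamoDB-backed A2A registry.

The table name, key layout, and card attributes are illustrative assumptions,
not the schema Thomson Reuters actually uses.
"""
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("a2a-agent-registry")  # hypothetical table name


def register_agent_card(agent_id: str, version: str, endpoint: str, owner_team: str) -> None:
    """Store a versioned agent card; keeping every version preserves an audit trail."""
    table.put_item(
        Item={
            "agent_id": agent_id,   # partition key (assumed)
            "version": version,     # sort key (assumed), enables version history
            "endpoint": endpoint,
            "owner_team": owner_team,
            "approved": False,      # flipped only after human validation
        }
    )


def get_agent_card(agent_id: str, version: str) -> dict:
    """Fetch the card the orchestrator uses to populate its tool call."""
    response = table.get_item(Key={"agent_id": agent_id, "version": version})
    return response.get("Item", {})


if __name__ == "__main__":
    register_agent_card(
        agent_id="database_patching",
        version="1.2.0",
        endpoint="https://example.internal/db-patch",
        owner_team="platform-engineering",
    )
    print(get_agent_card("database_patching", "1.2.0"))
```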
## User Interface and Access Control

The team developed a web portal using React to provide a secure and intuitive interface for agent interactions, addressing the challenge that raw conversational interfaces can be difficult for users to navigate effectively. The portal authenticates users against Thomson Reuters' enterprise single sign-on system and exposes agent flows based on user permissions, ensuring that sensitive operations such as AWS account provisioning or database patching are only accessible to authorized personnel.

This authentication and authorization layer is crucial from an LLMOps perspective because it shows how conversational AI systems need to be integrated into existing enterprise security frameworks rather than operating as standalone systems. The portal provides a user experience layer on top of the agentic system while enforcing access control policies that would be difficult to implement through purely conversational means.

## Human-in-the-Loop Validation

The Aether Greenlight validation service takes a sophisticated approach to maintaining human oversight of critical operations while still achieving significant automation gains. The service extends beyond basic requester approval, allowing team members outside the initial conversation to participate in validation. For example, a database patching operation might be requested through a conversation with the agent, but the actual execution requires approval from a database administrator who was not part of the original conversation. The system maintains a complete audit trail of approvals and actions, supporting Thomson Reuters' compliance requirements.

This addresses a key challenge in deploying autonomous agents for sensitive operations: how to maintain appropriate human oversight without undermining the efficiency gains from automation. The implementation suggests that the agents plan and prepare operations autonomously, but execution is gated by human approval for high-risk actions, a pragmatic middle ground between full automation and manual execution.

## Multi-Agent Coordination and Tool Calling

The architecture uses specialized agents focused on specific domains rather than a single monolithic agent. The cloud account provisioning agent automates the creation and configuration of new cloud accounts according to internal standards, handling tasks such as setting up organizational units, applying security policies, and configuring baseline networking. The database patching agent manages the end-to-end database patching lifecycle and version upgrades. Network service agents handle network configuration requests such as VPC setup, subnet allocation, and connectivity establishment between environments. Architecture review agents evaluate proposed architectures against best practices, security requirements, and compliance standards, providing automated feedback and recommendations.

AgentCore serves as the foundational orchestration layer for these agents, providing core agentic capabilities including intelligent decision-making, natural language understanding, tool calling, and agent-to-agent communication. The use of tool calling is particularly important here: the agents are not just generating text responses but are actually invoking APIs and tools to perform real operational work. This represents genuine production use of LLMs, where the model's outputs directly trigger infrastructure changes rather than simply providing information to human operators.
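The snippet below sketches, under stated assumptions, how an approval-gated tool of this kind could look: a patching tool an agent can call, but whose execution is blocked until a separate approver signs off. The approval store, tool wiring, and function names are hypothetical and are not taken from the Aether Greenlight implementation.

```python
"""Sketch of a human-approval gate in front of an agent-callable tool.

Everything here (the in-memory approval store, function names, and the way
the gate is enforced) is a hypothetical illustration, not Aether Greenlight.
"""
from langchain_core.tools import tool

# Stand-in for a persistent approval store; a real system would back this
# with a database and an approval UI for the responsible administrator.
PENDING_APPROVALS: dict[str, bool] = {}


def request_approval(operation_id: str) -> None:
    """Record that an operation is awaiting sign-off from an approver
    who may not have been part of the original conversation."""
    PENDING_APPROVALS[operation_id] = False


def grant_approval(operation_id: str) -> None:
    """Called by the human approver (for example, a DBA) out of band."""
    PENDING_APPROVALS[operation_id] = True


@tool
def patch_database(operation_id: str, database_name: str) -> str:
    """Apply pending patches to a database, but only after human approval."""
    if not PENDING_APPROVALS.get(operation_id, False):
        # The agent can plan and prepare the operation, but execution is
        # blocked until the gate opens; the refusal is auditable.
        return f"Operation {operation_id} is awaiting human approval."
    # Placeholder for the real patching call (e.g., an RDS maintenance API).
    return f"Patching started for {database_name} under operation {operation_id}."


if __name__ == "__main__":
    request_approval("op-123")
    print(patch_database.invoke({"operation_id": "op-123", "database_name": "billing-db"}))
    grant_approval("op-123")
    print(patch_database.invoke({"operation_id": "op-123", "database_name": "billing-db"}))
```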
## Development and Deployment Workflow

The team followed a three-phase approach to develop and deploy the system. The discovery and architecture planning phase evaluated existing AWS resources and codebases to design a comprehensive solution incorporating AgentCore, focusing on service objectives and integration requirements. The core development and migration phase took a dual-track approach, migrating existing solutions to AgentCore while simultaneously building TRACK as a deployment engine to enable rapid agent creation; this allowed the team to demonstrate value quickly with migrated solutions while laying the foundation for future scalability. The system enhancement and deployment phase refined orchestrator functionality, developed the web user experience, and onboarded teams to the new agentic system.

This phased approach reflects practical LLMOps considerations around minimizing risk during the migration to AI-powered systems while building capabilities for long-term scalability. Rather than attempting a big-bang replacement of all manual processes, the team incrementally built capability and migrated workloads.

## Production Outcomes and Metrics

The reported outcomes are impressive but should be read with appropriate skepticism given that this is an AWS-sponsored case study. The team reports a 15-fold productivity gain through intelligent automation of routine tasks and a 70% automation rate at first launch. They emphasize continuous reliability, with repeatable runbooks executed by agents around the clock, which speaks to one of the key value propositions of automation: consistency and availability beyond human working hours.

On speed and agility, the team reports faster time to value through accelerated product delivery by automating environment setup, policy enforcement, and day-to-day operations, with self-service workflows giving teams clear standards and paved-road tooling. Security and compliance benefits include a stronger security posture through guardrails and database patching applied by default, with human-in-the-loop approvals maintaining oversight while the verification of changes is automated. Cost and resource benefits include improved cost efficiency through automated optimization of infrastructure usage, strategic talent allocation by freeing engineering teams to focus on high-priority, high-value work, and reduced operational toil by removing repetitive tasks and reducing variance through standardization. Developer experience improvements include higher satisfaction through streamlined, intuitive self-service workflows and consistent standards through established repeatable patterns.

## LLMOps Considerations and Tradeoffs

While the case study presents an overwhelmingly positive picture, several LLMOps considerations and potential tradeoffs are worth noting. First, the solution introduces a new layer of complexity into the operational stack: teams now need to understand how agents work, how to debug agent failures, and how to maintain agent code alongside traditional infrastructure code.
The TRACK framework helps abstract some of this complexity, but it also creates a proprietary layer that teams must learn and maintain. Second, the reliance on Amazon Bedrock AgentCore creates vendor lock-in and dependency on AWS services. While AgentCore provided valuable accelerators, organizations considering similar implementations should weigh the benefits of managed services against the flexibility of open-source alternatives; the team's choice to use LangGraph within AgentCore suggests they valued having some framework flexibility even within the managed service.

Third, the multi-agent architecture with agent-to-agent communication introduces failure modes that do not exist in simpler systems. If the orchestrator misroutes a request or communication between agents fails, debugging may be more complex than debugging a traditional API call. The comprehensive audit trail and versioning in the agent registry help mitigate these risks but do not eliminate them.

Fourth, the human-in-the-loop validation system, while addressing security and compliance concerns, potentially limits the speed benefits of automation. If approvals take hours or days, the 15-fold productivity gain may only apply to the agent's work rather than to end-to-end cycle time. The case study does not provide metrics on approval latency or on what percentage of operations require human validation.

Fifth, the case study does not discuss model evaluation, testing strategies, or how the team handles model updates and potential regressions, all of which are critical LLMOps concerns for production systems. How do they test agents before deploying updates? How do they monitor agent performance in production? How do they handle scenarios where the LLM misunderstands a request or generates incorrect tool calls? The absence of discussion around these topics suggests the practices may still be evolving, or that significant issues simply have not surfaced yet.

## Scaling and Organizational Impact

The case study emphasizes that the agentic system establishes a replicable pattern that teams across the organization can adopt for similar automation capabilities, creating a multiplier effect for operational excellence. The TRACK framework and agent registry infrastructure enable teams beyond Platform Engineering to build their own specialized agents following established patterns. This is a thoughtful approach to scaling AI capabilities across an enterprise: building reusable infrastructure and patterns rather than one-off solutions.

The team's vision is for Aether to "enhance the experience of engineers by removing the need for manual execution of tasks that could be automated to support further innovation and creative thinking" and to help establish Thomson Reuters "as a front-runner in the age of artificial intelligence." While these statements reflect aspirational positioning, the technical foundation the team has built does enable broader adoption if the initial implementation proves successful.

## Critical Assessment

This case study represents a substantive implementation of agentic AI for production operations, not just a proof of concept or limited pilot. The technical architecture demonstrates thoughtful design around security, governance, scalability, and developer experience. The use of memory management, an agent registry with versioning, human-in-the-loop validation, and a custom development framework shows maturity beyond simply connecting an LLM to some APIs.
However, as an AWS-sponsored case study published on the AWS blog, readers should maintain appropriate skepticism about the reported metrics and benefits. The 15-fold productivity gain is impressive if true, but the methodology behind the metric is not explained. Does it account for the engineering time required to build and maintain the system? Does it measure actual cycle-time reduction for end users or just agent execution time? The 70% automation rate at launch is more credible and verifiable, assuming it means that 70% of eligible operational tasks can now be completed by agents without human intervention.

The solution addresses real LLMOps challenges around tool integration, access control, governance, and multi-agent coordination. The technical choices (LangGraph for orchestration, DynamoDB and API Gateway for the agent registry, AgentCore Memory for state management, and a custom framework for standardization) are reasonable architectural decisions for an enterprise context. The phased development approach and the emphasis on reusable patterns suggest an organization thinking about long-term operationalization rather than just demonstrating AI capabilities.

The most significant gap in the case study is the lack of discussion around monitoring, evaluation, and continuous improvement of agent performance. Production LLM systems require ongoing attention to quality, accuracy, and reliability, and how Thomson Reuters addresses these concerns will ultimately determine whether this system delivers sustained value or becomes a maintenance burden. The strong emphasis on security, compliance, and governance suggests the team understands the risks of deploying autonomous systems, but the absence of detailed operational practices leaves open questions about how mature their LLMOps practices truly are.
