London Stock Exchange Group developed a client services assistant application using Amazon Q Business to enhance their post-trade customer support. The solution leverages RAG techniques to provide accurate and quick responses to complex member queries by accessing internal documents and public rulebooks. The system includes a robust validation process using Claude v2 to ensure response accuracy against a golden answer dataset, delivering responses within seconds and improving both customer experience and staff productivity.
London Stock Exchange Group (LSEG), through its London Clearing House (LCH) division, developed a generative AI-powered assistant to support their client services team in handling B2B queries related to post-trade products and services. The LCH Group operates as a multi-asset class clearing house providing risk management across OTC and listed interest rates, fixed income, FX, credit default swaps, equities, and commodities. This case study represents an interesting example of deploying LLMs in a highly regulated financial services environment where accuracy and trust are paramount.
The core business problem was straightforward: as LCH’s business grew, their client services team needed to answer complex member queries requiring reference to detailed service documentation, policy documents, and rulebooks. Questions such as “What is the eligible collateral at LCH?” or “Can members clear NIBOR IRS at LCH?” required agents to search through multiple documentation sources, which was time-consuming and potentially error-prone. The goal was to improve both customer experience and employee productivity by enabling faster, more accurate information retrieval.
The LCH team went through a structured evaluation process, conducting cross-functional workshops to examine different LLM approaches including prompt engineering, RAG, and custom model fine-tuning. They evaluated options such as Amazon SageMaker and SageMaker JumpStart, weighing trade-offs between development effort and model customization capabilities. Amazon Q Business was ultimately selected, largely because its fully managed approach minimized development effort while keeping the system straightforward to maintain and update.
This selection process highlights an important LLMOps consideration: sometimes managed services that abstract away LLM complexity can be more appropriate than custom deployments, particularly when the organization prioritizes maintainability and quick updates over deep customization.
The solution draws on multiple data sources through Amazon Q Business connectors, including internal service documentation, policy documents, and the public rulebooks that member queries routinely reference.
This multi-source approach using RAG allows the system to provide comprehensive answers by combining information from different sources while maintaining the ability to cite specific sources for each response. The citation capability is particularly important in financial services where verifiability of information is essential for regulatory compliance and building user trust.
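The citation pattern described above can be sketched in code. The `chat_sync` call below follows the boto3 Amazon Q Business API, but the exact response field names (`systemMessage`, `sourceAttributions`, `citationNumber`) should be treated as illustrative assumptions rather than a verified contract; the formatting helper is hypothetical.

```python
# Sketch: querying an Amazon Q Business application and surfacing the
# source citations alongside the answer. Field names are assumptions
# based on the qbusiness ChatSync response shape.

def format_answer_with_citations(response: dict) -> str:
    """Render an answer followed by a numbered list of its sources."""
    lines = [response.get("systemMessage", "")]
    for attribution in response.get("sourceAttributions", []):
        number = attribution.get("citationNumber", "?")
        title = attribution.get("title", "untitled source")
        url = attribution.get("url", "")
        lines.append(f"[{number}] {title} {url}".rstrip())
    return "\n".join(lines)


def ask_q_business(client, application_id: str, question: str) -> str:
    """Send one question to an Amazon Q Business application."""
    response = client.chat_sync(
        applicationId=application_id,
        userMessage=question,
    )
    return format_answer_with_citations(response)
```

In practice `client` would be `boto3.client("qbusiness")`; keeping the formatting step as a pure function makes it easy to unit-test without AWS access.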
Rather than using Amazon Q Business’s built-in web experience, LCH opted for a custom frontend application. This decision gave them more control over the user experience and allowed integration with their existing workflows.
The identity and access management layer uses SAML 2.0 IAM federation with a third-party identity provider, allowing LCH users to authenticate using their existing enterprise credentials while maintaining granular control over who can access the Amazon Q Business application.
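A SAML 2.0 federation setup of this kind typically hinges on an IAM role whose trust policy accepts assertions from the registered identity provider. A minimal sketch of such a trust policy follows; the account ID and provider name are placeholders, not values from the case study.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:saml-provider/EnterpriseIdP"
      },
      "Action": "sts:AssumeRoleWithSAML",
      "Condition": {
        "StringEquals": {
          "SAML:aud": "https://signin.aws.amazon.com/saml"
        }
      }
    }
  ]
}
```

The `SAML:aud` condition restricts the role to assertions issued for the AWS sign-in endpoint, which is the usual guardrail in this pattern.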
Perhaps the most interesting LLMOps aspect of this implementation is the validation system designed to ensure response accuracy: generated responses are checked against a curated golden answer dataset, with Claude v2 scoring each answer against its reference. This represents a thoughtful approach to building trust in AI-generated content in a regulated environment.
This approach to evaluation is notable because it uses one LLM (Claude v2) to validate the outputs of another system (Amazon Q Business), creating a form of automated quality assurance. While this introduces its own considerations about LLM-as-judge reliability, it provides a scalable way to monitor system accuracy over time.
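The evaluation loop described above can be sketched as a golden-answer benchmark: each generated answer is scored by a judge model (the role Claude v2 plays in LCH's setup) against a curated reference answer. All function names below are hypothetical; the judge is passed in as a plain callable so the harness stays model-agnostic.

```python
# Sketch of a golden-answer evaluation loop: score each generated
# answer against a curated reference and flag low-scoring responses.
from typing import Callable

def evaluate_against_golden_set(
    answer_fn: Callable[[str], str],
    golden_set: list[dict],          # [{"question": ..., "golden_answer": ...}]
    judge: Callable[[str, str, str], float],  # returns a score in [0, 1]
    threshold: float = 0.8,
) -> dict:
    """Return the aggregate pass rate plus the questions that failed review."""
    failures = []
    for item in golden_set:
        generated = answer_fn(item["question"])
        score = judge(item["question"], item["golden_answer"], generated)
        if score < threshold:
            failures.append({**item, "generated": generated, "score": score})
    total = len(golden_set)
    return {
        "pass_rate": (total - len(failures)) / total if total else 0.0,
        "failures": failures,
    }
```

Running this on a schedule against the production system turns the golden dataset into a regression suite, which is the scalable monitoring property the case study highlights.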
The case study reports that the Amazon Q Business application returns answers within a few seconds per question. Testing verified high factual accuracy of responses, though specific accuracy percentages are not provided. The expectation is that quick, correct responses save live agents time on every question they handle.
It’s worth noting that this is an AWS blog post co-authored with LSEG employees, so there may be some optimistic framing of results. The case study doesn’t provide detailed metrics, such as measured accuracy percentages or quantified time savings per agent.
LCH followed a phased rollout approach rather than a big-bang deployment, which is a prudent strategy for AI systems in regulated environments. This allowed for thorough testing and quality verification before broader exposure.
Future plans mentioned in the case study center on further integrations.
The phased approach and integration roadmap suggest this is positioned as part of a broader enterprise AI strategy rather than an isolated point solution.
This case study illustrates several important LLMOps patterns:
The use of managed RAG services can significantly reduce operational complexity while still providing enterprise-grade functionality. By choosing Amazon Q Business, LCH avoided the need to directly manage LLM deployments, embedding models, vector databases, and retrieval pipelines.
Citation and source attribution are critical for building trust in LLM outputs, particularly in regulated industries. The ability to trace answers back to specific documents allows users to verify information and builds confidence in the system.
Evaluation systems using golden answer benchmarks provide a scalable way to monitor LLM system accuracy. Using another LLM for scoring creates an automated quality assurance loop, though organizations should consider the limitations of LLM-as-judge approaches.
Identity management and access control integration is essential for enterprise deployments. The SAML 2.0 federation approach allows seamless integration with existing enterprise identity infrastructure.
Custom frontends provide flexibility for workflow integration even when using managed backend services. This allows organizations to tailor the user experience while leveraging managed AI capabilities.
The regulated nature of financial services introduces unique requirements around verifiability, auditability, and accuracy that shape the entire implementation approach, from technology selection through deployment strategy.
TP ICAP faced the challenge of extracting actionable insights from tens of thousands of vendor meeting notes stored in their Salesforce CRM system, where business users spent hours manually searching through records. Using Amazon Bedrock, their Innovation Lab built ClientIQ, a production-ready solution that combines Retrieval Augmented Generation (RAG) and text-to-SQL approaches to transform hours of manual analysis into seconds. The solution uses Amazon Bedrock Knowledge Bases for unstructured data queries, automated evaluations for quality assurance, and maintains enterprise-grade security through permission-based access controls. Since launch with 20 initial users, ClientIQ has driven a 75% reduction in time spent on research tasks and improved insight quality with more comprehensive and contextual information being surfaced.
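The hybrid pattern at the heart of ClientIQ — RAG over unstructured meeting notes alongside text-to-SQL over structured CRM fields — implies a routing step that decides which pipeline handles each question. The sketch below uses a keyword heuristic as a stand-in for the LLM-based classification a real system would use; all names are hypothetical.

```python
# Illustrative router for a hybrid RAG / text-to-SQL assistant:
# analytical questions over structured fields go to text-to-SQL,
# open-ended questions over meeting notes go to RAG.

AGGREGATE_HINTS = ("how many", "count of", "average", "total", "per month", "trend")

def route_query(question: str) -> str:
    """Pick a pipeline for a question: 'text_to_sql' or 'rag'."""
    q = question.lower()
    if any(hint in q for hint in AGGREGATE_HINTS):
        return "text_to_sql"   # aggregate/analytical intent
    return "rag"               # narrative retrieval intent
```

In a production system this decision would itself typically be made by a model call, with the heuristic kept only as a cheap fallback.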
Prudential Financial, in partnership with AWS GenAI Innovation Center, built a scalable multi-agent platform to support 100,000+ financial advisors across insurance and financial services. The system addresses fragmented workflows where advisors previously had to navigate dozens of disconnected IT systems for client engagement, underwriting, product information, and servicing. The solution features an orchestration agent that routes requests to specialized sub-agents (quick quote, forms, product, illustration, book of business) while maintaining context and enforcing governance. The platform-based microservices architecture reduced time-to-value from 6-8 weeks to 3-4 weeks for new agent deployments, enabled cross-business reusability, and provided standardized frameworks for authentication, LLM gateway access, knowledge management, and observability while handling the complexity of scaling multi-agent systems in a regulated financial services environment.
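The orchestration layer described above — a central agent routing requests to specialized sub-agents while carrying shared context — can be sketched as a simple registry-and-dispatch structure. Sub-agent names mirror those in the case study; the dispatch logic itself is an illustrative assumption, not Prudential's implementation.

```python
# Minimal sketch of an orchestration agent: sub-agents register under
# an intent, and the orchestrator routes each request to the matching
# agent while passing along shared context (session, user, governance).
from typing import Callable

class Orchestrator:
    def __init__(self) -> None:
        self._agents: dict[str, Callable[[dict, dict], str]] = {}

    def register(self, intent: str, agent: Callable[[dict, dict], str]) -> None:
        self._agents[intent] = agent

    def handle(self, intent: str, request: dict, context: dict) -> str:
        if intent not in self._agents:
            raise ValueError(f"no sub-agent registered for intent '{intent}'")
        return self._agents[intent](request, context)

# Example registration of two of the sub-agents named in the case study:
orchestrator = Orchestrator()
orchestrator.register("quick_quote", lambda req, ctx: f"quote for {req['product']}")
orchestrator.register("forms", lambda req, ctx: f"form bundle for {req['product']}")
```

The platform value comes from what sits around this dispatch loop — shared authentication, an LLM gateway, and observability — so that each new sub-agent only has to implement its domain logic.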
Siteimprove, a SaaS platform provider for digital accessibility, analytics, SEO, and content strategy, embarked on a journey from generative AI to production-scale agentic AI systems. The company faced the challenge of processing up to 100 million pages per month for accessibility compliance while maintaining trust, speed, and adoption. By leveraging Amazon Bedrock, Amazon Nova models, and developing a custom AI accelerator architecture, Siteimprove built a multi-agent system supporting batch processing, conversational remediation, and contextual image analysis. The solution achieved 75% cost reduction on certain workloads, enabled autonomous multi-agent orchestration across accessibility, analytics, SEO, and content domains, and was recognized as a leader in Forrester's digital accessibility platforms assessment. The implementation demonstrated how systematic progression through human-in-the-loop, human-on-the-loop, and autonomous stages can bridge the prototype-to-production chasm while delivering measurable business value.