Rocket Close, a Detroit-based title agency within Rocket Companies, faced bottlenecks in title operations due to time-intensive state-specific examinations, manual research across fragmented systems, and complex local requirements that slowed mortgage processing. To address these challenges, they built Supercharger, an agentic AI solution developed in collaboration with AWS and powered by Strands Agents and Amazon Bedrock, which centralizes knowledge and automates research-heavy tasks through natural language interactions. The solution reportedly delivered a 30% reduction in contact center inquiries, improved state exam accuracy through real-time insights, higher client satisfaction through automation of routine tasks, and 3x latency improvements through architectural optimization.
Rocket Close, the Detroit-based title agency and appraisal management company operating within Rocket Companies, developed Supercharger—an agentic AI solution designed to address critical bottlenecks in title operations that were impeding the mortgage lending and homebuying process. The company provides title insurance, property valuation, and settlement services, and as demand for mortgages and loans expanded, their title examination operations became increasingly strained by the complexity and time-intensive nature of state-specific title verification work.
The core problem stemmed from the fragmented nature of title examination workflows. Title examiners must verify data from disparate sources including multiple internal systems, state-specific guides, county-level requirements, and local rules around probate, tax identification, and recording procedures. A single title examiner attempting to understand county-specific recording requirements might spend hours navigating through these various sources to locate the relevant information. This manual, research-heavy approach created significant friction in the lending process and made it difficult for teams to scale operations alongside their expanding client base.
Supercharger represents a sophisticated production deployment of agentic AI built on AWS infrastructure. At its core, the solution leverages Strands Agents, an open source agent harness SDK developed by AWS specifically for building agents using large language models. The team chose Strands Agents because it provides a model-driven approach that scales from straightforward to complex use cases and supports both local development and production deployment. Importantly, while the initial implementation uses Anthropic’s Claude LLM through Amazon Bedrock, the architecture maintains flexibility to swap different LLMs as requirements evolve—a critical design decision for long-term maintainability.
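The model-swapping flexibility described above can be illustrated with a minimal sketch. It does not use the Strands Agents SDK; `ChatModel`, `EchoModel`, and `TitleAgent` are hypothetical names introduced only to show the idea of an agent that depends on a small model interface rather than on one vendor's client.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface the agent needs from any LLM backend."""

    def complete(self, system_prompt: str, user_input: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in backend for local development; makes no network calls."""

    name: str

    def complete(self, system_prompt: str, user_input: str) -> str:
        # A real backend would call an LLM; this one just tags the input.
        return f"[{self.name}] {user_input}"


@dataclass
class TitleAgent:
    """Agent that depends only on the ChatModel interface, not a vendor SDK."""

    model: ChatModel
    system_prompt: str

    def ask(self, user_input: str) -> str:
        return self.model.complete(self.system_prompt, user_input)


agent = TitleAgent(model=EchoModel("model-a"), system_prompt="You assist title examiners.")
first = agent.ask("What are the recording requirements?")

# Swapping the backend is a one-line change; the agent code is untouched.
agent.model = EchoModel("model-b")
second = agent.ask("What are the recording requirements?")
```

Because the agent only sees the interface, a production backend and a local stub can be exchanged without touching prompts or tool wiring.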
The architecture uses WebSocket-based streaming to give users real-time feedback. When an operations team member poses a question, the request flows through a multi-stage pipeline. First, a WebSocket connection is established, with JWT token validation handled by an identity provider and Istio. The Strands agent is then invoked and starts the agentic workflow based on the system prompt and user input. The agent queries Amazon Bedrock Knowledge Bases for relevant policies and procedures, decides which functions to invoke and with which parameters, and executes Model Context Protocol (MCP) tools to retrieve order information from internal APIs. Finally, it combines the knowledge base context with the order-specific details and streams the response back over the WebSocket for progressive rendering in the chat UI.
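The request flow can be sketched as an async generator that validates the caller, gathers context, and yields the answer in chunks. This is an illustration only: the token set, `search_knowledge_base`, and `fetch_order` are hypothetical stand-ins for the identity provider, the knowledge base query, and the MCP tool call, and no real WebSocket or LLM is involved.

```python
import asyncio
from typing import AsyncIterator, List

VALID_TOKENS = {"demo-token"}  # hypothetical; real validation sits with the IdP and Istio


def validate_token(token: str) -> None:
    """Reject the request before any agent work starts."""
    if token not in VALID_TOKENS:
        raise PermissionError("invalid token")


async def search_knowledge_base(question: str) -> str:
    """Stand-in for a knowledge base lookup of policies and procedures."""
    await asyncio.sleep(0)
    return "placeholder policy text"


async def fetch_order(order_id: str) -> dict:
    """Stand-in for an MCP tool call against an internal order API."""
    await asyncio.sleep(0)
    return {"order_id": order_id, "state": "MI", "status": "in_exam"}


async def answer_stream(token: str, question: str, order_id: str) -> AsyncIterator[str]:
    """Yield response chunks as they become available."""
    validate_token(token)
    policy = await search_knowledge_base(question)
    order = await fetch_order(order_id)
    # Each yield corresponds to one message pushed to the chat UI.
    yield f"Order {order['order_id']} ({order['state']}) is {order['status']}. "
    yield f"Guidance: {policy}"


async def collect(token: str, question: str, order_id: str) -> List[str]:
    return [chunk async for chunk in answer_stream(token, question, order_id)]


chunks = asyncio.run(collect("demo-token", "How do I record a deed?", "A-100"))
```

In a real deployment each yielded chunk would be sent over the open WebSocket instead of being collected into a list.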
A particularly noteworthy architectural decision involves the implementation of an MCP tool-based architecture where each data source is exposed as a distinct tool that Strands Agents can invoke. This design choice delivers three significant advantages that align with production LLMOps best practices. First, it provides extensibility—new data sources can be added as additional tools without requiring restructuring of the core architecture, which the team explicitly designed for to accommodate future expansion. Second, it enforces separation of concerns where the logic for interacting with each system is encapsulated in its own tool, making the overall architecture more maintainable and testable. Third, it enables flexibility where the Strands agent dynamically selects which tools to use based on each query, supporting workflows that span multiple data sources.
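A minimal sketch of the one-tool-per-data-source idea follows. It is not the MCP protocol or the Strands Agents API; the `tool` decorator, the registry, and both tool functions are hypothetical and return placeholder data. The point is that adding a data source means registering one more function, while dispatch stays unchanged.

```python
from typing import Callable, Dict

# Registry mapping tool names to callables; stands in for the set of tools
# an agent framework would expose to the model.
TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(func: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function as a tool under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_order_details(order_id: str) -> dict:
    """Return order information from the (hypothetical) orders API."""
    return {"order_id": order_id, "state": "MI"}


@tool
def get_state_exam_checklist(state: str) -> dict:
    """Return the title exam checklist for a state (placeholder data)."""
    return {"state": state, "items": ["placeholder item 1", "placeholder item 2"]}


def describe_tools() -> Dict[str, str]:
    """Name-to-docstring map: the text a model would read when choosing a tool."""
    return {name: (fn.__doc__ or "").strip() for name, fn in TOOLS.items()}


def invoke(name: str, **kwargs) -> dict:
    """Dispatch a tool call selected by the agent."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


order = invoke("get_order_details", order_id="A-100")
checklist = invoke("get_state_exam_checklist", state=order["state"])
```

The last two lines show a workflow that spans two data sources: the result of one tool supplies the argument for the next.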
The team learned through implementation that efficient data retrieval is central to performance. They designed the MCP tools to retrieve all necessary order information in a single call and then rely on LLM synthesis to extract the relevant details, which removes the need for multiple database queries. The same philosophy extends to keeping a clear separation of concerns between Strands Agents and MCP tools, creating a flexible foundation that can evolve alongside changing requirements.
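The single-call pattern can be sketched as follows, under the assumption that one request returns the whole order record. `OrderStore` and `extract_fields` are hypothetical; in the system described, the extraction step is done by LLM synthesis, which a plain field selection stands in for here.

```python
from typing import Dict, Iterable


class OrderStore:
    """Hypothetical order backend that counts how often it is queried."""

    def __init__(self, records: Dict[str, dict]) -> None:
        self._records = records
        self.calls = 0

    def get_order(self, order_id: str) -> dict:
        """Return the full order record in one round trip."""
        self.calls += 1
        return dict(self._records[order_id])


def extract_fields(order: dict, fields: Iterable[str]) -> dict:
    """Stand-in for LLM synthesis: pick the details relevant to the question."""
    return {name: order[name] for name in fields if name in order}


store = OrderStore({"A-100": {"state": "MI", "status": "in_exam", "county": "Wayne"}})

# One retrieval, then any number of extractions from the in-memory record.
order = store.get_order("A-100")
location = extract_fields(order, ["state", "county"])
progress = extract_fields(order, ["status"])
```

Both answers come from the same record, so the backend is queried once regardless of how many details the response needs.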
Supercharger implements six interconnected capabilities that work together to streamline the homeownership process. Conversation analytics uses natural language processing to understand context and intent across multi-turn conversations, so interactions feel intuitive rather than rigid and transactional. State-level title examination assistance provides checklists and guidance tailored to specific title examination requirements, giving teams the right information at the right moment. API-based integration connects with existing systems, including Rocket Close operational databases that hold order information, standard procedures, and policies for state-level title exams; this maintains data consistency, avoids manual data entry, and reduces errors. Guardrails and response accuracy mechanisms check that responses meet quality standards and comply with regulatory requirements, using Amazon Bedrock Guardrails combined with row-level data entitlements that help prevent accidental access to customer-sensitive data. Comprehensive logging and monitoring provide visibility into system performance and user interactions, with full audit trails to meet compliance requirements. Finally, unified access to multiple data sources maintains complete context for decision-making by pulling together information that previously required checking multiple systems, giving operations teams a single experience for navigating complex title workflows.
The production deployment demonstrates sophisticated attention to security and compliance requirements critical in financial services. The solution combines Amazon Bedrock Guardrails with row-level data entitlements to implement intelligent access controls that prevent unauthorized access to customer-sensitive data. All conversations are logged with complete audit trails to meet regulatory compliance requirements—a non-negotiable aspect of operating in the mortgage industry. The team learned that offloading security enforcement to session attributes, rather than embedding it in business logic or step-by-step agent prompts, provides cleaner and more consistent access control. This architectural decision reflects mature LLMOps thinking about separating security concerns from business logic and agent orchestration.
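One way to picture enforcement through session attributes is a data-access layer that filters rows by what the session is entitled to see, so neither the prompt nor the tool logic carries access rules. This is a sketch under assumptions: the `Session` fields, the state-based entitlement, and the sample orders are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Session:
    """Hypothetical session attributes set at login, outside the agent."""

    user_id: str
    allowed_states: FrozenSet[str]


ORDERS: List[dict] = [
    {"order_id": "A-1", "state": "MI", "borrower": "placeholder-1"},
    {"order_id": "B-2", "state": "OH", "borrower": "placeholder-2"},
]


def visible_orders(session: Session) -> List[dict]:
    """Row-level filter: only rows the session is entitled to see."""
    return [row for row in ORDERS if row["state"] in session.allowed_states]


def get_order(session: Session, order_id: str) -> dict:
    """Tool entry point; it never sees rows outside the entitlement."""
    for row in visible_orders(session):
        if row["order_id"] == order_id:
            return row
    # Same error for "missing" and "not entitled" to avoid leaking existence.
    raise PermissionError(f"order {order_id} is not accessible in this session")


session = Session(user_id="u1", allowed_states=frozenset({"MI"}))
allowed = get_order(session, "A-1")
```

Because the filter runs before any tool logic, a prompt change or a new tool cannot widen access by accident.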
The team discovered several critical lessons around effective prompt engineering and agent orchestration in production. They found that effective LLM prompting focuses on describing what the agent should accomplish rather than prescribing how it should accomplish tasks. Removing deterministic steps allowed the agent to orchestrate dynamically using its inherent capabilities, proving more adaptable than custom approaches that attempted to micromanage the agent’s behavior. This represents a shift from traditional programming paradigms where developers specify exact control flow to a more declarative approach that leverages the LLM’s reasoning capabilities.
Additional insights emerged around metadata filtering in knowledge bases to enhance retrieval precision, demonstrating the importance of retrieval-augmented generation (RAG) optimization in production systems. The team also recognized the critical importance of descriptive tool naming and coherent docstrings, which serve as natural language interfaces for agent reasoning. This reflects a unique aspect of LLMOps where API design must consider not just programmatic interfaces but also how an LLM will interpret and reason about available tools based on their descriptions.
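Metadata filtering can be sketched with an in-memory retriever that narrows candidates by exact metadata match before ranking. This is not the Amazon Bedrock Knowledge Bases API; the `Chunk` type, the word-overlap score, and the sample texts are hypothetical placeholders chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    """A retrievable passage plus the metadata attached at ingestion time."""

    text: str
    metadata: Dict[str, str]


def retrieve(chunks: List[Chunk], query: str, filters: Dict[str, str], top_k: int = 3) -> List[Chunk]:
    """Filter on metadata first, then rank the survivors by word overlap."""
    candidates = [
        chunk for chunk in chunks
        if all(chunk.metadata.get(key) == value for key, value in filters.items())
    ]
    terms = set(query.lower().split())
    ranked = sorted(
        candidates,
        key=lambda chunk: len(terms & set(chunk.text.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


CORPUS = [
    Chunk("placeholder guidance on recording deeds", {"state": "MI", "topic": "recording"}),
    Chunk("placeholder guidance on recording deeds", {"state": "OH", "topic": "recording"}),
    Chunk("placeholder guidance on probate estates", {"state": "MI", "topic": "probate"}),
]

results = retrieve(CORPUS, "recording deeds requirements", {"state": "MI"}, top_k=2)
```

Without the state filter, the two identical recording passages would score the same and the wrong state's guidance could be returned; with it, only passages for the requested state are ever ranked.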
The production deployment improved performance through iterative refinement. The team reports 3x latency improvements and reduced costs, achieved through architectural refinement and better prompting techniques that reduced the number of calls the agent made to the LLM. The WebSocket-based streaming approach delivered immediate user feedback, improving perceived performance even for complex queries, which is an important consideration for user experience in production AI applications.
The team’s experience highlights that designing solutions to take advantage of the agent’s inherent intelligence rather than constraining it made Supercharger both more powerful and maintainable in the long term. This philosophical approach to agentic AI development represents mature thinking about how to build production systems that can evolve and improve over time.
The source material presents these results as achievements, and given its promotional nature they should be read with appropriate skepticism. Even so, the claimed business impacts show how the organization measures success for its LLMOps deployment. The solution reportedly delivered immediate operational efficiency gains for operations and client relations teams, reducing incoming calls and emails to the contact center by 30% through its question-answering capability. State exam accuracy improved through real-time insights about orders within existing workflows, which reduced cognitive load, minimized research time, and increased accuracy in decision-making. Client satisfaction was enhanced through automation of routine tasks, execution of order-level processes, and drafting communications on behalf of clients. Operational consistency improved with AI-guided state-level exam assistance. The performance optimization achieved 3x latency improvements alongside cost reductions.
These metrics reflect a mature approach to measuring LLMOps impact across multiple dimensions: operational efficiency (contact center reduction), quality (exam accuracy), user experience (client satisfaction), consistency (standardized guidance), and technical performance (latency and cost). The breadth of measurement suggests the organization has thoughtfully considered how to evaluate their AI investment beyond simple cost savings.
The team explicitly recognized that executive sponsorship and change management proved crucial for timely delivery, leading them to collaborate closely with AWS. This acknowledgment reflects an important reality of production LLMOps deployments—technical excellence alone is insufficient without organizational alignment, stakeholder buy-in, and effective change management to drive adoption. The hands-on collaboration between business and technology teams represents a best practice for ensuring AI solutions address actual business needs rather than being technology in search of a problem.
The documented future phases of Supercharger include expansion for bankers to address loan-specific questions and the creation of fast-start templates to guide multiple domain teams in building agentic solutions for their business problems. This roadmap indicates the organization views Supercharger not just as a point solution but as a foundation for broader agentic AI deployment across the enterprise. The emphasis on templates and reusability reflects mature platform thinking about how to scale LLMOps capabilities across an organization.
While the case study presents an optimistic view of the deployment, several aspects warrant balanced consideration. The source material originates from AWS marketing content and likely emphasizes positive outcomes while minimizing challenges or limitations. The claimed 30% reduction in contact center inquiries and 3x latency improvements represent significant achievements if accurate, but we lack details about the measurement methodology, baseline conditions, or confounding factors that might have influenced these results.
The architectural decisions around MCP tools and Strands Agents appear sound from an engineering perspective, but the case study doesn’t discuss challenges encountered during implementation, limitations discovered in production, or edge cases where the solution performs poorly. The emphasis on flexibility and extensibility suggests anticipation of evolution and change, which is positive, but we don’t learn about specific architectural tradeoffs or technical debt incurred during rapid development.
The security and compliance architecture appears robust with audit trails, guardrails, and row-level access controls, which are appropriate for financial services. However, the case study doesn’t address how the team handles hallucinations, incorrect information retrieval, or situations where the agent makes inappropriate tool selections—all common challenges in production agentic AI systems.
The lesson about focusing on what the agent should accomplish rather than how represents valuable insight, but this approach can also introduce unpredictability in production systems where consistent behavior is crucial for regulated industries. The case study doesn’t discuss how the team balances agentic flexibility with the need for deterministic, auditable decision-making in critical workflows.
Overall, the Rocket Close Supercharger deployment represents a sophisticated production implementation of agentic AI that demonstrates mature thinking about architecture, security, performance, and organizational change. The technical decisions around Strands Agents, MCP tools, streaming architecture, and separation of concerns align with emerging best practices in LLMOps. However, readers should approach the claimed results with appropriate skepticism given the promotional nature of the content and recognize that successful production AI deployments typically involve more complexity, challenges, and tradeoffs than presented in case studies designed to showcase success.
Prudential Financial, in partnership with AWS GenAI Innovation Center, built a scalable multi-agent platform to support 100,000+ financial advisors across insurance and financial services. The system addresses fragmented workflows where advisors previously had to navigate dozens of disconnected IT systems for client engagement, underwriting, product information, and servicing. The solution features an orchestration agent that routes requests to specialized sub-agents (quick quote, forms, product, illustration, book of business) while maintaining context and enforcing governance. The platform-based microservices architecture reduced time-to-value from 6-8 weeks to 3-4 weeks for new agent deployments, enabled cross-business reusability, and provided standardized frameworks for authentication, LLM gateway access, knowledge management, and observability while handling the complexity of scaling multi-agent systems in a regulated financial services environment.
Two organizations operating in highly regulated industries, Sicoob, a Brazilian cooperative financial institution, and Holland Casino, a government-mandated Dutch gaming operator, share their approaches to deploying generative AI workloads while maintaining strict compliance requirements. Sicoob built a scalable infrastructure using Amazon EKS with GPU instances, leveraging open-source tools like Karpenter, KEDA, vLLM, and Open WebUI to run multiple open-source LLMs (Llama, Mistral, DeepSeek, Granite) for code generation, robotic process automation, investment advisory, and document interaction use cases, achieving cost efficiency through spot instances and auto-scaling. Holland Casino took a different path, using Anthropic's Claude models via Amazon Bedrock and developing lightweight AI agents using the Strands Agents framework, later deploying them through Amazon Bedrock AgentCore to provide management stakeholders with self-service access to cost, security, and operational insights. Both organizations emphasized the importance of security, governance, compliance frameworks (including ISO 42001 for AI), and responsible AI practices while demonstrating that regulatory requirements need not inhibit AI adoption when proper architectural patterns and AWS services are employed.
Digits, a company providing automated accounting services for startups and small businesses, implemented production-scale LLM agents to handle complex workflows including vendor hydration, client onboarding, and natural language queries about financial books. The company evolved from a simple 200-line agent implementation to a sophisticated production system incorporating LLM proxies, memory services, guardrails, observability tooling (Phoenix from Arize), and API-based tool integration using Kotlin and Golang backends. Their agents achieve a 96% acceptance rate on classification tasks with only 3% requiring human review, handling approximately 90% of requests asynchronously and 10% synchronously through a chat interface.