Company
Onity Group
Title
Intelligent Document Processing for Mortgage Servicing Using Amazon Bedrock and Multimodal AI
Industry
Finance
Year
2025
Summary (short)
Onity Group, a mortgage servicing company processing millions of pages annually across hundreds of document types, implemented an intelligent document processing solution using Amazon Bedrock foundation models to handle complex legal documents with verbose text, handwritten entries, and notarization verification. The solution combines Amazon Textract for basic OCR with Amazon Bedrock's multimodal models (Anthropic Claude Sonnet and Amazon Nova) for complex extraction tasks, using dynamic routing based on content complexity. This hybrid approach achieved a 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution, with some use cases like credit report processing achieving 85% accuracy.
## Overview

Onity Group's implementation represents a sophisticated LLMOps case study in the financial services sector, specifically addressing the complex document processing challenges inherent in mortgage servicing operations. Founded in 1988 and headquartered in West Palm Beach, Florida, Onity Group operates through its primary subsidiary, PHH Mortgage Corporation, and the Liberty Reverse Mortgage brand, providing comprehensive mortgage servicing and origination solutions. The company processes millions of pages across hundreds of document types annually, making it an ideal candidate for advanced AI-powered document processing.

The case study is particularly valuable from an LLMOps perspective because it demonstrates the practical use of foundation models in a production environment where accuracy, cost efficiency, and regulatory compliance are paramount. The solution addresses real-world challenges that traditional OCR and machine learning models struggled to handle, including verbose legal documents, inconsistent handwritten text, and complex visual elements such as notary seals and legal stamps.

## Technical Architecture and LLMOps Implementation

The production system leverages multiple AWS services in an orchestrated workflow and reflects several key LLMOps principles: model selection, dynamic routing, and cost optimization. Document ingestion begins with uploads to Amazon S3, which trigger automated processing workflows, a classic pattern in production LLMOps deployments. The preprocessing stage applies image enhancement, noise reduction, and layout analysis, critical steps that directly affect downstream foundation model performance.

The classification workflow is a three-step intelligent process. First, Amazon Textract extracts the document contents, which are then processed by Onity's custom AI model. If the confidence score meets predetermined thresholds, classification is complete. When a document isn't recognized, typically because the custom model lacks training data for that document type, the system automatically routes it to Anthropic's Claude Sonnet via Amazon Bedrock. This dual-model approach exemplifies intelligent model orchestration, balancing the cost efficiency of a purpose-built model with the speed-to-market advantages of foundation models.

The extraction component employs an algorithm-driven approach that queries internal databases to retrieve specific extraction rules for each document type and data element. The system then dynamically routes extraction tasks between Amazon Textract and Amazon Bedrock foundation models based on content complexity, using the most appropriate model for each task to balance cost, performance, and accuracy.
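The case study does not publish code, but the confidence-gated classification cascade described above can be illustrated with a short Python sketch using boto3. The confidence threshold, the custom-model stub, and the Bedrock model ID below are illustrative assumptions, not details from Onity's production system:

```python
import boto3

textract = boto3.client("textract")
bedrock = boto3.client("bedrock-runtime")

CONFIDENCE_THRESHOLD = 0.85  # assumed value; Onity's actual threshold is not public
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # illustrative model ID


def custom_model_classify(text: str) -> tuple[str, float]:
    """Stand-in for Onity's proprietary classifier: returns (label, confidence)."""
    return "unknown", 0.0


def classify_document(image_bytes: bytes) -> str:
    # Step 1: basic OCR with Amazon Textract
    ocr = textract.detect_document_text(Document={"Bytes": image_bytes})
    text = "\n".join(b["Text"] for b in ocr["Blocks"] if b["BlockType"] == "LINE")

    # Step 2: try the cheap in-house classifier first
    label, confidence = custom_model_classify(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label

    # Step 3: fall back to a foundation model via the Bedrock Converse API
    prompt = (
        "Classify this mortgage servicing document into one known type "
        "(e.g. deed of trust, rider, appraisal, credit report). "
        f"Return only the type name.\n\nDocument text:\n{text[:8000]}"
    )
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 50, "temperature": 0},
    )
    return response["output"]["message"]["content"][0]["text"].strip()
```

The design point is that the foundation model call sits behind the confidence gate, so its higher per-call cost is incurred only when the cheaper custom model cannot classify the document.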
## Foundation Model Integration and Prompt Engineering

The production implementation demonstrates mature prompt engineering practices tailored to specific document processing use cases. For notarization information extraction, the system passes document images to foundation models with carefully crafted prompts that specify exactly what to extract: state, county, notary date, notary expiry date, presence of a notary seal, the person who signed before the notary, and the notary public's name. The prompts include instructions for edge cases; for example, when printed text has been manually crossed out or modified, the model is told to prioritize the handwritten modification over the printed text.

Rider information extraction showcases the multimodal capabilities of the foundation models, which process documents combining text and checkbox elements. The system extracts both the checked riders and the other riders listed on a document, demonstrating the models' ability to understand complex visual layouts and interpret textual and graphical elements simultaneously.

Home appraisal documents represent one of the most complex extraction tasks in the solution. The system processes grid layouts of rows and columns, verifying room count consistency between the subject property and comparables and checking that square footages fall within specified percentage ranges. The foundation models not only extract the required information but also provide detailed justifications for their findings, demonstrating reasoning capabilities that go beyond simple data extraction. A sketch of this extraction call pattern follows below.
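Onity's actual prompts are not published, but the notarization extraction described above might look roughly like the following sketch, which sends an image to a multimodal model through the Bedrock Converse API. The prompt wording, JSON field names, and model ID are assumptions reconstructed from the fields listed in the case study:

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # illustrative model ID

NOTARY_PROMPT = """You are extracting notarization details from a scanned mortgage document.
Return a JSON object with exactly these keys: state, county, notary_date,
notary_expiry_date, notary_seal_present (true/false),
person_signed_before_notary, notary_public_name.
If printed text has been manually crossed out or modified, prefer the
handwritten modification over the printed text. Use null for missing values.
Return only the JSON object."""


def extract_notarization(image_bytes: bytes) -> dict:
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{
            "role": "user",
            "content": [
                {"text": NOTARY_PROMPT},
                # The Converse API accepts raw image bytes alongside text
                {"image": {"format": "png", "source": {"bytes": image_bytes}}},
            ],
        }],
        inferenceConfig={"maxTokens": 512, "temperature": 0},
    )
    return json.loads(response["output"]["message"]["content"][0]["text"])
```

The same pattern extends to the rider and appraisal use cases by swapping in a different prompt and expected output schema.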
## Production Performance and Cost Optimization

The production metrics validate the architectural choices: a 50% reduction in document extraction costs combined with a 20% improvement in overall accuracy compared with the previous OCR and AI/ML solution. Specific use cases performed even better; credit report processing, for instance, reached 85% accuracy, demonstrating the solution's ability to handle complex, multi-format documents.

The cost optimization strategy is particularly noteworthy from an LLMOps perspective. By dynamically routing between processing approaches, Amazon Textract for simpler extractions and Amazon Bedrock foundation models for complex tasks, Onity achieved an optimal cost-performance balance. This approach recognizes that not every document processing task requires the full capabilities of a large foundation model, and that intelligent routing can significantly reduce operational costs while maintaining high accuracy.

The Amazon Bedrock API also lets Onity experiment with different foundation models and compare their performance on metrics such as accuracy, robustness, and cost using Amazon Bedrock model evaluation. This flexibility enables continuous optimization of the production system as new models become available or processing requirements evolve.

## Security and Compliance in Production LLMOps

The implementation demonstrates the enterprise-grade security practices essential for financial services deployments. Data is encrypted at rest with AWS Key Management Service and in transit with TLS, and access is managed through granular AWS Identity and Access Management policies. The solution follows the AWS Financial Services Industry Lens architectural best practices and implements the Security Pillar guidelines of the AWS Well-Architected Framework. Amazon Bedrock's enterprise-grade security features, including data processing within an Amazon VPC, built-in guardrails for responsible AI use, and comprehensive data protection capabilities, are particularly important when handling sensitive financial documents in a regulated industry.

## Model Selection and Evaluation Strategy

The case study demonstrates model selection strategies that are core to effective LLMOps. The solution leverages multiple foundation models, including Anthropic's Claude Sonnet and Amazon Nova, choosing the model best suited to each use case. The flexible model selection provided by Amazon Bedrock enables organizations to evolve their AI capabilities over time, striking the right balance between performance, accuracy, and cost for each workload. The recommendation to use the Amazon Bedrock model playground for experimentation and Amazon Bedrock model evaluation for comparing models across metrics reflects best practice, enabling informed decisions about which models offer the best balance of performance and cost-effectiveness for a given task.

## Production Deployment and Scalability

The solution handles millions of pages across hundreds of document types annually, and its ability to scale while maintaining consistent performance and accuracy represents successful LLMOps at enterprise scale. The automated workflows triggered by document uploads to Amazon S3 follow the event-driven architecture patterns essential for scalable deployments (a minimal sketch of such a trigger closes this article). The system handles diverse document types, from verbose legal documents to structured credit reports, while maintaining high accuracy, and its dynamic routing lets it absorb new document types and evolving requirements without extensive retraining or reconfiguration.

## Lessons for LLMOps Implementation

This case study offers several insights for LLMOps practitioners. The hybrid approach of combining traditional AI/ML models with foundation models shows that optimal solutions often layer multiple AI technologies rather than replacing existing systems outright, and the dynamic routing strategy shows how intelligent orchestration can optimize cost and performance simultaneously.

The emphasis on preprocessing and prompt engineering highlights the importance of these often-overlooked aspects of LLMOps. The prompt engineering that handles edge cases and spells out detailed extraction instructions illustrates the level of engineering rigor required for production-ready foundation model applications.

Finally, the cost optimization achieved through intelligent model selection and routing provides a blueprint for deploying foundation models cost-effectively at scale: a 50% cost reduction alongside improved accuracy shows that a well-architected LLMOps solution can deliver operational efficiency and better outcomes at the same time.
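To close, here is a minimal sketch of the event-driven ingestion pattern referenced above: an AWS Lambda handler fired by an S3 object-created notification, handing documents to the classification sketch from earlier. The bucket wiring and handoff calls are hypothetical, not taken from the case study:

```python
import urllib.parse

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Triggered by an S3 ObjectCreated notification on the intake bucket."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 event keys are URL-encoded (spaces arrive as '+')
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        document_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        # Hand off to the pipeline sketched earlier: classification first.
        doc_type = classify_document(document_bytes)  # from the earlier sketch
        print(f"{key}: classified as {doc_type}")
        # Extraction rules for the recognized type would then be fetched from
        # an internal rules store and routed to Textract or Bedrock from here.
```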
