AI-Driven Contract Analysis and Extraction at Scale

PricewaterhouseCoopers / PwC 2026

PwC developed AIDA (AI-driven annotation), a solution built on AWS that addresses the challenge of extracting structured insights from lengthy, unstructured contracts that traditionally require significant manual review time from legal, compliance, and procurement teams. The solution combines rule-based extraction with LLM-powered natural language query capabilities, leveraging Amazon Bedrock and AWS services to process contracts at scale. In customer implementations, AIDA has demonstrated the ability to reduce manual contract review time by up to 90%; one major film and TV studio achieved a 90% reduction in rights research time. These results enable faster retrieval of key information and shorter review cycles across industries including Media & Entertainment and Real Estate.

Industry

Legal

Overview

PwC’s AIDA (AI-driven annotation) represents a production-scale implementation of LLM technology designed to transform unstructured contract documents into structured, actionable insights. The system specifically targets the challenges faced by legal, compliance, and procurement teams who must process growing volumes of complex contracts where critical information is buried in lengthy agreements. The solution has been deployed in multiple customer implementations, including a notable deployment with a major film and TV studio that achieved a 90% reduction in rights research time.

The system operates across multiple industries, with particular emphasis on Media & Entertainment and Real Estate sectors, as well as supporting Procurement, Legal, and Compliance functions. In the M&E sector specifically, AIDA helps content producers and distributors extract rights information from license agreements, including broadcast, streaming, theatrical, and derivative rights, enabling faster decision-making on spin-offs, sequels, and global distribution strategies.

Architecture and LLMOps Infrastructure

AIDA’s architecture demonstrates a comprehensive approach to deploying LLMs in production environments with enterprise-grade security, scalability, and observability requirements. The system is built entirely on AWS cloud-native services, providing a reference architecture for organizations implementing similar document intelligence capabilities.

Edge Security and Access Layer: The system implements multi-layered security beginning at the edge with AWS WAF for threat filtering, followed by Network Load Balancer routing to NGINX reverse proxy servers that manage SSL termination, routing, and policy enforcement before forwarding to Amazon ECS. All data in transit is encrypted using TLS 1.2 or higher, covering both user connections through HTTPS and internal service-to-service communication across Amazon ECS, Amazon RDS, Amazon S3, Amazon Bedrock, and other AWS services. Authentication is handled through Amazon Cognito, integrated with enterprise identity providers including Microsoft Entra ID and Okta, enabling secure access at scale. The system implements fine-grained access control through both application-level and project-level roles, allowing administrators to manage user access and define granular permissions within projects.
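
The case study does not show implementation details for this access layer, but application-level validation of a Cognito-issued ID token might look roughly like the following sketch using PyJWT against the user pool's JWKS endpoint. The region, pool ID, and app client ID are hypothetical placeholders.

```python
# Minimal sketch of validating a Cognito-issued ID token at the
# application layer with PyJWT. Region, pool ID, and app client ID
# are hypothetical placeholders, not AIDA's actual configuration.
import jwt
from jwt import PyJWKClient

REGION = "us-east-1"                     # hypothetical
USER_POOL_ID = "us-east-1_Example123"    # hypothetical
APP_CLIENT_ID = "example-app-client-id"  # hypothetical
ISSUER = f"https://cognito-idp.{REGION}.amazonaws.com/{USER_POOL_ID}"

def verify_id_token(token: str) -> dict:
    """Check signature, audience, and issuer; return the token claims."""
    jwks = PyJWKClient(f"{ISSUER}/.well-known/jwks.json")
    signing_key = jwks.get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=APP_CLIENT_ID,
        issuer=ISSUER,
    )
```

Project-level roles would then be derived from the returned claims before any document access is granted.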

Data Storage and Persistence Layer: The storage architecture separates concerns between unstructured and structured data. Amazon S3 stores uploaded documents, OCR outputs, and associated metadata, providing durable and cost-effective management of large contract volumes. All S3 buckets implement encryption at rest using SSE-S3, with Block Public Access enabled at the bucket level and access logging enabled for security analysis and audit purposes. Amazon RDS persists structured data, configurations, and extracted insights, with instances encrypted at rest using AWS KMS-managed keys. This dual-storage approach enables efficient querying and retrieval of insights for analytics and integration while maintaining appropriate security controls for potentially sensitive contractual data.
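
As a rough illustration of the write side of this dual-storage design, the sketch below uploads a contract to S3 with SSE-S3 encryption and attaches object metadata; extracted fields would be persisted to RDS separately. The bucket name and metadata keys are assumptions, not AIDA's actual schema.

```python
# Hedged sketch of the unstructured-data write path: the raw contract
# lands in S3 with SSE-S3 encryption at rest. Names are hypothetical.
import boto3

s3 = boto3.client("s3")

def store_contract(path: str, bucket: str, key: str, contract_type: str):
    """Upload a contract document with encryption and basic metadata."""
    with open(path, "rb") as f:
        s3.put_object(
            Bucket=bucket,
            Key=key,
            Body=f,
            ServerSideEncryption="AES256",               # SSE-S3 at rest
            Metadata={"contract-type": contract_type},   # hypothetical key
        )
```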

Asynchronous Processing Architecture: A key aspect of the LLMOps implementation is the asynchronous processing design that enables parallel processing of large contract volumes without blocking user interactions. OCR and extraction workflows run on Amazon ECS using AWS Fargate, with tasks coordinated through Amazon SQS. This architecture pattern allows the system to scale horizontally to process thousands of contracts in parallel while maintaining consistent accuracy. Extraction rules guide how relevant content is identified and sent to foundation models hosted on Amazon Bedrock, where LLMs interpret contract text and extract structured values. Results are written back to Amazon RDS, making them available for review, dashboards, and downstream integrations.
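
A minimal sketch of such a worker follows, assuming a hypothetical queue URL, message schema, and model choice (the case study does not name the foundation models used):

```python
# Sketch of an asynchronous extraction worker: long-poll SQS for tasks,
# call a Bedrock-hosted model, and delete the message on success.
# Queue URL, message schema, and model ID are hypothetical.
import json
import boto3

sqs = boto3.client("sqs")
bedrock = boto3.client("bedrock-runtime")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/aida-extract"  # hypothetical

def worker_loop():
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            task = json.loads(msg["Body"])  # e.g. {"text": ..., "labels": [...]}
            result = bedrock.converse(
                modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumption
                messages=[{
                    "role": "user",
                    "content": [{"text": f"Extract {task['labels']} from:\n{task['text']}"}],
                }],
            )
            answer = result["output"]["message"]["content"][0]["text"]
            # ... write `answer` back to RDS here ...
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

Because each message is processed independently, Fargate can scale the number of workers horizontally with queue depth.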

RAG Implementation and Knowledge Management

The system implements a sophisticated Retrieval Augmented Generation pipeline that serves as the foundation for AIDA’s question-answering capabilities. This RAG implementation addresses the critical requirement that AI-generated answers must be accurate, traceable, and grounded in actual contract content rather than relying solely on the model’s parametric knowledge.

Semantic Indexing Pipeline: Documents stored in Amazon S3 are processed through an embedding pipeline using embedding models hosted on Amazon Bedrock. The resulting vector representations are indexed in Amazon OpenSearch Serverless, creating a semantic search layer that supports both document-level and global queries. This serverless approach to vector storage provides scalability without requiring manual capacity management, though organizations implementing similar systems should carefully consider the cost implications of vector storage at scale and evaluate whether serverless or provisioned capacity makes more sense for their specific workload patterns.
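
A simplified sketch of this indexing step follows, assuming a Titan embedding model and hypothetical endpoint, index, and field names (the case study does not specify which embedding model AIDA uses):

```python
# Sketch: embed a chunk with a Bedrock embedding model and index it
# into OpenSearch Serverless as a vector document. All names assumed.
import json
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

bedrock = boto3.client("bedrock-runtime")
creds = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(creds, "us-east-1", "aoss")  # "aoss" = OpenSearch Serverless

client = OpenSearch(
    hosts=[{"host": "example.us-east-1.aoss.amazonaws.com", "port": 443}],  # hypothetical
    http_auth=auth, use_ssl=True, connection_class=RequestsHttpConnection,
)

def index_chunk(doc_id: str, text: str, page: int, contract_type: str):
    """Embed one passage and store it with metadata for later filtering."""
    emb = json.loads(bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # assumption
        body=json.dumps({"inputText": text}),
    )["body"].read())["embedding"]
    client.index(index="contracts", body={
        "embedding": emb, "text": text,
        "doc_id": doc_id, "page": page, "contract_type": contract_type,
    })
```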

Hybrid Retrieval Strategy: During inference, the system employs both implicit and explicit filtering strategies. Implicit filtering relies on semantic similarity between queries and contract content to surface contextually relevant sections. Explicit filtering applies metadata constraints such as contract type, creation date, business unit, or jurisdiction to narrow results to the most relevant subset. This hybrid approach is particularly important in production legal applications where precision is critical—overly broad retrieval can introduce irrelevant context that confuses the LLM, while overly narrow retrieval risks missing critical information.
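
The explicit-filter half of this strategy can be expressed as a kNN query with a metadata filter, as in the sketch below; it reuses the illustrative field names from the indexing sketch and takes the OpenSearch client as a parameter. All names are assumptions.

```python
# Sketch of hybrid retrieval: semantic (implicit) kNN search constrained
# by an explicit metadata filter. `client` is an opensearch-py client
# like the one built in the indexing sketch above.
def hybrid_search(client, query_embedding: list[float],
                  contract_type: str, k: int = 5):
    """kNN retrieval narrowed to a contract subset via a term filter."""
    return client.search(index="contracts", body={
        "size": k,
        "query": {"knn": {"embedding": {
            "vector": query_embedding,
            "k": k,
            # explicit filtering on metadata alongside implicit
            # (semantic) filtering via vector similarity
            "filter": {"term": {"contract_type": contract_type}},
        }}},
    })
```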

Context Construction and Citation Linking: Retrieved passages are combined with user queries and provided to LLMs hosted on Amazon Bedrock. The system maintains careful tracking of source documents and page references, enabling every extracted insight or answer to include clickable citations that link back to the exact source text within the original contract. This citation capability is not just a nice-to-have feature but rather a fundamental requirement for any LLM system deployed in legal or compliance contexts where users must be able to verify AI-generated outputs against source documents.
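
A minimal sketch of citation-aware context assembly, assuming hit structures like those returned by the retrieval sketch above:

```python
# Sketch: number each retrieved passage and record its doc/page so the
# generated answer can cite [n] references that link back to sources.
def build_cited_context(hits: list[dict]) -> tuple[str, list[dict]]:
    parts, citations = [], []
    for i, hit in enumerate(hits, start=1):
        src = hit["_source"]
        parts.append(f"[{i}] (doc {src['doc_id']}, page {src['page']})\n{src['text']}")
        citations.append({"ref": i, "doc_id": src["doc_id"], "page": src["page"]})
    return "\n\n".join(parts), citations
```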

Production LLM Orchestration

AIDA demonstrates several important patterns for orchestrating LLMs in production environments, particularly in domains requiring high accuracy and explainability.

Template-Based Extraction: The system supports user-defined templates that group together labels representing key contract elements such as termination notice periods, renewal terms, or rights clauses. Each template defines extraction logic once, which can then be applied consistently across multiple documents. Behind the scenes, AIDA processes contracts using a structured representation that preserves page and section context. Extraction rules guide how relevant content is identified, and LLMs interpret that context to extract correct values. This template-based approach represents a pragmatic middle ground between fully manual extraction and fully autonomous LLM processing—it leverages LLM capabilities for interpretation and extraction while maintaining human-defined rules and structures that ensure consistency and accuracy across large document volumes.
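
The case study does not publish AIDA's template schema, but a template of this kind might be modeled roughly as follows; the field names and prompt wording are hypothetical.

```python
# Sketch of a user-defined extraction template: labels are declared once
# and reused across documents. Schema and wording are illustrative.
from dataclasses import dataclass, field

@dataclass
class ExtractionLabel:
    name: str         # e.g. "termination_notice_period"
    description: str  # guidance passed to the LLM

@dataclass
class ExtractionTemplate:
    name: str
    labels: list[ExtractionLabel] = field(default_factory=list)

    def to_prompt(self, contract_text: str) -> str:
        """Render one consistent extraction prompt for any contract."""
        fields = "\n".join(f"- {l.name}: {l.description}" for l in self.labels)
        return (
            "Extract the following fields from the contract below. "
            "Return JSON with one key per field; use null if absent.\n"
            f"{fields}\n\nContract:\n{contract_text}"
        )

template = ExtractionTemplate("licensing", [
    ExtractionLabel("renewal_term", "Length of any automatic renewal period"),
    ExtractionLabel("streaming_rights", "Whether streaming rights are granted"),
])
```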

Document-Level Conversational Interface: The document-level chat capability allows users to ask natural language questions about single contracts and receive answers grounded directly in that document. When questions are submitted, AIDA identifies the most relevant sections by comparing queries against semantic representations of document content. These sections are then provided as context to Amazon Bedrock-hosted LLMs, which generate responses based on the contract text. This implementation pattern—semantic retrieval followed by LLM generation with explicit context—represents best practice for production document intelligence systems where hallucination risks must be minimized.

Global Chat Across Document Collections: The global chat feature extends document-level capabilities to support queries across multiple contracts within a project. This can operate in two modes: project-wide queries evaluated across all contracts to provide consolidated views, or scoped queries targeting selected agreement subsets. The technical challenge here involves managing context windows effectively—with potentially thousands of contracts in a project, the system cannot simply concatenate all content and pass it to the LLM. Instead, the semantic search layer must effectively identify the most relevant passages across the document collection, requiring careful tuning of retrieval parameters and potentially implementing re-ranking strategies to ensure the most relevant context surfaces to the top.
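
One illustrative way to bound context across a large collection is to over-fetch candidates and then select a capped, score-ranked subset, as sketched below. The scoring here is the raw retrieval score, where a production system might substitute a cross-encoder re-ranker; the per-document cap and passage budget are arbitrary assumptions.

```python
# Sketch of a simple selection pass for global chat: keep the top-scored
# passages while capping passages per document so one contract cannot
# crowd out the rest of the collection. Thresholds are illustrative.
def select_global_context(hits: list[dict], max_passages: int = 20,
                          per_doc_cap: int = 3) -> list[dict]:
    ranked = sorted(hits, key=lambda h: h["_score"], reverse=True)
    selected = []
    for hit in ranked:
        doc_id = hit["_source"]["doc_id"]
        if sum(1 for s in selected if s["_source"]["doc_id"] == doc_id) < per_doc_cap:
            selected.append(hit)
        if len(selected) == max_passages:
            break
    return selected
```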

Safety, Governance, and Human-in-the-Loop

The system incorporates multiple layers of safety controls and governance mechanisms appropriate for production deployment in regulated environments.

Content Filtering and PII Protection: AIDA uses Amazon Bedrock Guardrails to apply content filtering, sensitive information (PII) protection, and prompt safety controls. While the marketing material claims this “confirms that responses remain secure and aligned with enterprise and legal standards,” organizations should recognize that guardrails represent one layer of a defense-in-depth strategy. In practice, production legal AI systems should implement multiple overlapping controls including input validation, output filtering, access controls, and audit logging. The system’s handling of potentially sensitive contractual data requires appropriate safeguards, and PwC appropriately notes that customers remain responsible for configuring and operating the solution to meet their specific compliance obligations.
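
Mechanically, attaching a guardrail to a Bedrock Converse call looks roughly like the sketch below; the guardrail identifier, version, and model ID are placeholders.

```python
# Sketch of a guarded generation call: the guardrailConfig makes Bedrock
# apply content filters and PII controls to input and output. IDs assumed.
import boto3

bedrock = boto3.client("bedrock-runtime")

def guarded_answer(question: str, context: str) -> str:
    resp = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumption
        messages=[{"role": "user",
                   "content": [{"text": f"Context:\n{context}\n\nQuestion: {question}"}]}],
        guardrailConfig={
            "guardrailIdentifier": "gr-example-id",  # hypothetical
            "guardrailVersion": "1",
        },
    )
    return resp["output"]["message"]["content"][0]["text"]
```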

Human-in-the-Loop Review: The system includes a configurable human-in-the-loop review queue that can validate and approve extracted outputs before they are forwarded to downstream systems. This represents critical best practice for high-stakes applications: even if AIDA reduces manual review time by 90%, the remaining human review serves as a crucial quality gate. Organizations implementing similar systems should carefully consider where human review should be mandatory versus optional, and how to design review interfaces that enable reviewers to efficiently validate AI outputs rather than simply rubber-stamping them.

Auditability and Traceability: The system implements comprehensive logging and audit capabilities through AWS CloudTrail, enabling access logging for data operations. Combined with the citation linking capabilities described earlier, this creates an audit trail from user queries through retrieval and generation to final outputs, with clear links back to source documents. This level of traceability is essential for legal applications where AI-generated insights may inform high-stakes business decisions or legal positions.

System Integration and Operational Monitoring

AIDA demonstrates mature practices around system integration and operational observability that are essential for production LLM deployments.

Downstream System Integration: The solution integrates with contract lifecycle management tools, ERP systems, CRMs, and data warehouses using AWS Lambda, Amazon EventBridge, and Amazon SQS. This integration architecture enables AIDA to push structured contract data into systems already in use within the organization, reducing manual data handling and enabling reuse of contract insights across compliance, reporting, and analytics workflows. The event-driven integration pattern using EventBridge provides loose coupling between AIDA and downstream systems, enabling the solution to scale independently and integrate with new systems without requiring architectural changes.
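
An event-publication step in this pattern might look like the following sketch; the event bus name, source, and detail schema are hypothetical.

```python
# Sketch of the event-driven integration: once an extraction is approved,
# publish a domain event to EventBridge so downstream CLM/ERP consumers
# stay loosely coupled. All names are illustrative assumptions.
import json
import boto3

events = boto3.client("events")

def publish_extraction_event(doc_id: str, fields: dict):
    events.put_events(Entries=[{
        "Source": "aida.extraction",             # hypothetical source
        "DetailType": "ContractFieldsExtracted",
        "Detail": json.dumps({"doc_id": doc_id, "fields": fields}),
        "EventBusName": "aida-integration-bus",  # hypothetical bus
    }])
```

New downstream systems then subscribe via EventBridge rules without any change to the publisher.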

Operational Dashboard: The system includes an operational dashboard built on Amazon QuickSight that visualizes metrics such as document volumes, OCR accuracy, extraction throughput, and processing status. This dashboard provides visibility into system performance and helps identify bottlenecks or opportunities to improve efficiency. For LLMOps specifically, this type of operational visibility is crucial—organizations need to monitor not just traditional infrastructure metrics but also AI-specific metrics like extraction accuracy, retrieval relevance, and user satisfaction with generated outputs. The case study mentions tracking “extraction accuracy,” though it does not detail how this metric is calculated or whether it involves comparison against human-annotated ground truth.

Monitoring and Observability: Amazon CloudWatch and AWS X-Ray provide monitoring and tracing capabilities across the distributed system. For LLM applications specifically, effective monitoring must track both infrastructure-level metrics (latency, throughput, error rates) and application-level metrics (extraction accuracy, retrieval quality, generation quality). Organizations implementing similar systems should consider implementing custom metrics that track LLM-specific concerns such as prompt token usage, completion token usage, guardrail rejection rates, and user feedback on output quality.
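
A sketch of emitting such custom metrics via CloudWatch, with a hypothetical namespace and dimensions:

```python
# Sketch: publish LLM-specific custom metrics (token usage, guardrail
# interventions) alongside standard infrastructure metrics. Names assumed.
import boto3

cloudwatch = boto3.client("cloudwatch")

def emit_llm_metrics(model_id: str, input_tokens: int, output_tokens: int,
                     guardrail_intervened: bool):
    dims = [{"Name": "ModelId", "Value": model_id}]
    cloudwatch.put_metric_data(
        Namespace="AIDA/LLM",  # hypothetical namespace
        MetricData=[
            {"MetricName": "InputTokens", "Value": input_tokens,
             "Unit": "Count", "Dimensions": dims},
            {"MetricName": "OutputTokens", "Value": output_tokens,
             "Unit": "Count", "Dimensions": dims},
            {"MetricName": "GuardrailInterventions",
             "Value": 1 if guardrail_intervened else 0,
             "Unit": "Count", "Dimensions": dims},
        ],
    )
```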

Deployment and DevOps Practices

The system implements continuous deployment capabilities through AWS CodeBuild and AWS CodePipeline, enabling automated building, testing, and deployment of updates. This CI/CD infrastructure is essential for production LLM systems that require frequent iteration on prompts, extraction rules, and model configurations. However, the case study does not provide detail on how prompt changes are tested and validated before deployment, or how the system handles versioning of extraction templates and rules across deployments.
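
One way to close that gap is a golden-set regression test that runs in the pipeline on every prompt or template change; the sketch below assumes a hypothetical fixture file and extraction entry point, neither of which is described in the case study.

```python
# Sketch of a pytest-style golden-set regression test for prompt and
# template changes. The fixture path and run_extraction entry point are
# hypothetical stand-ins for AIDA's actual pipeline.
import json

def run_extraction(contract_text: str, template_name: str) -> dict:
    """Stand-in for the deployed extraction pipeline (hypothetical)."""
    raise NotImplementedError

def test_extraction_regression():
    with open("tests/golden_extractions.json") as f:  # hypothetical fixture
        golden_cases = json.load(f)
    for case in golden_cases:
        result = run_extraction(case["contract_text"], case["template"])
        for label, expected in case["expected"].items():
            assert result.get(label) == expected, (
                f"{label}: expected {expected!r}, got {result.get(label)!r}"
            )
```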

Infrastructure as Code and Security: AWS IAM and AWS KMS manage access and encryption, with IAM policies implemented following the principle of least privilege. This security-first approach is appropriate for systems handling potentially sensitive contractual data, though organizations should recognize that implementing least-privilege IAM correctly requires ongoing maintenance as the system evolves and new integration requirements emerge.

Critical Assessment and Considerations

While the case study presents AIDA as a successful production deployment, several areas warrant careful consideration for organizations evaluating similar approaches:

Claimed Efficiency Gains: The 90% reduction in manual contract review time is impressive if accurate, though the case study provides limited detail on how this metric was calculated or what baseline was used for comparison. Organizations should recognize that actual efficiency gains will vary significantly based on contract types, complexity, existing processes, and user proficiency with AI tools. The case study also mentions one major film and TV studio achieving 90% reduction in rights research time—this appears to be the same 90% figure cited more generally, raising questions about whether this represents a single deployment or multiple validations of the claimed benefit.

Model Selection and Customization: The case study does not specify which foundation models from Amazon Bedrock are used, or whether different models are used for different tasks (embedding vs. extraction vs. question-answering). In production deployments, model selection significantly impacts accuracy, latency, and cost. Organizations should carefully evaluate whether to use general-purpose models or whether domain-specific fine-tuning would improve accuracy for their specific contract types and legal domains.

OCR and Document Processing: The system includes OCR processing, though details on OCR accuracy and error handling are limited. In practice, OCR errors can significantly impact downstream extraction accuracy, particularly for complex contract layouts, poor-quality scans, or non-standard formatting. Production systems should implement OCR quality checks and potentially human review of OCR outputs for critical documents.

Extraction Rule Maintenance: The template-based extraction approach requires ongoing maintenance of extraction rules as contract formats evolve and new extraction requirements emerge. The case study does not detail how rule updates are managed, tested, and deployed, or how conflicts between rules are resolved. Organizations should plan for dedicated resources to maintain and improve extraction templates over time.

Cost Considerations: While the case study emphasizes the efficiency benefits of AIDA, it does not address the infrastructure and AI model costs associated with running the system. Organizations should carefully model the total cost of ownership including AWS infrastructure costs, Amazon Bedrock API costs (which can be significant for large document volumes), and ongoing operational costs for monitoring, maintenance, and rule updates.

Vendor Lock-in: The solution is tightly integrated with AWS services, particularly Amazon Bedrock. While this integration enables rapid development and deployment, it also creates vendor lock-in. Organizations concerned about portability should consider abstracting LLM interactions behind provider-agnostic interfaces that would facilitate migration to alternative providers or models in the future.
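
A provider-agnostic interface of the kind suggested above might look roughly like this sketch; the class and method names are illustrative, not part of AIDA.

```python
# Sketch of abstracting LLM calls behind a provider-agnostic interface,
# one way to reduce the lock-in risk noted above. Names are illustrative.
from typing import Protocol
import boto3

class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...

class BedrockClient:
    """Adapter binding the generic interface to Amazon Bedrock."""
    def __init__(self, model_id: str):
        self._client = boto3.client("bedrock-runtime")
        self._model_id = model_id

    def complete(self, prompt: str) -> str:
        resp = self._client.converse(
            modelId=self._model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
        )
        return resp["output"]["message"]["content"][0]["text"]

# Application code depends only on LLMClient, so switching providers
# means writing another adapter rather than rewriting call sites.
```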

Overall, AIDA represents a well-architected production deployment of LLM technology for contract intelligence, demonstrating important patterns around RAG implementation, asynchronous processing, security controls, and system integration. However, as with any vendor-provided case study, organizations should validate claimed benefits through proof-of-concept testing with their own contract types and workflows before committing to full deployment.
