Company: WhyHow
Title: Graph RAG and Multi-Agent Systems for Legal Case Discovery and Document Analysis
Industry: Legal
Year: 2025

Summary (short): WhyHow.ai, a legal technology company, developed a system that combines graph databases, multi-agent architectures, and retrieval-augmented generation (RAG) to identify class action and mass tort cases before competitors by scraping web data, structuring it into knowledge graphs, and generating personalized reports for law firms. The company claims to find potential cases within 15 minutes, compared to the industry standard of 8-9 months, using a pipeline that processes complaints from various online sources, applies lawyer-specific filtering schemas, and generates actionable legal intelligence through automated multi-agent workflows backed by graph-structured knowledge representation.
WhyHow.ai represents an interesting case study in applying LLMOps principles to the legal industry, specifically focusing on class action and mass tort case discovery. The company was founded by a technical founder with a decade of experience in graph technologies, working alongside a legal co-founder to bridge the gap between technical capabilities and legal domain expertise.

## Company Overview and Use Case

WhyHow.ai operates in the legal services sector, specializing in finding class action and mass tort cases before traditional law firms identify them. Their primary focus is on pharmaceutical liability cases where multiple individuals have been harmed by a product, requiring scientific evidence of harm and collective legal action against pharmaceutical companies. The company positions itself as a technical service provider that supports law firms rather than engaging in actual litigation.

The core business problem they address is the traditional inefficiency of legal case discovery: law firms typically find cases 8-9 months after people begin complaining about an issue. WhyHow.ai claims to identify these cases within 15 minutes through automated systems, though they acknowledge it takes about a month to build confidence in signal quality.

## Technical Architecture and LLMOps Implementation

The company's technical approach combines three main components: knowledge graphs, multi-agent systems, and retrieval-augmented generation. Their architecture reflects a sophisticated understanding of the challenges of deploying LLMs in production, particularly in a domain where accuracy is paramount.

### Knowledge Graph Foundation

WhyHow.ai uses knowledge graphs as the backbone of their system, defining graphs primarily as "relations" that provide visual understanding, explicit connections, and scalable analytics capabilities. Their graph schema captures complex relationships between individuals, products, ingredients, concentrations, and regulatory identifiers.
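WhyHow.ai has not published its actual schema, but a minimal sketch of this kind of graph design, with hypothetical entity labels, relation names, and a simple schema-consistency check, might look like:

```python
from dataclasses import dataclass

# Hypothetical node and edge types for a pharma-liability graph;
# labels, relations, and properties are illustrative, not WhyHow's schema.
@dataclass(frozen=True)
class Node:
    label: str   # e.g. "Product", "Ingredient", "Individual"
    key: str     # unique identifier within that label

@dataclass(frozen=True)
class Edge:
    source: Node
    relation: str          # e.g. "CONTAINS", "CONSUMED_BY"
    target: Node
    props: tuple = ()      # immutable key-value pairs, e.g. concentration

product = Node("Product", "acme-supplement")
ingredient = Node("Ingredient", "compound-x")
person = Node("Individual", "claimant-001")

edges = [
    Edge(product, "CONTAINS", ingredient, (("concentration_mg", 50),)),
    Edge(product, "CONSUMED_BY", person, (("period", "2023-Q2"),)),
]

# Schema consistency: only relations declared for a (source, target) label
# pair are allowed, which keeps extraction output uniform across cases.
ALLOWED = {
    ("Product", "Ingredient"): {"CONTAINS"},
    ("Product", "Individual"): {"CONSUMED_BY"},
}

def valid(edge: Edge) -> bool:
    return edge.relation in ALLOWED.get((edge.source.label, edge.target.label), set())

assert all(valid(e) for e in edges)
```

Enforcing the allowed-relation table at write time is one way to get the schema consistency and extensibility the company emphasizes: new entity types extend `ALLOWED` without invalidating existing data.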
This structured approach allows them to represent legal cases as interconnected entities where, for example, pharmaceutical ingredients at certain concentrations become problematic when consumed by specific individuals during particular time periods.

The company emphasizes the importance of schema consistency and extensibility in their graph design. Each law firm client has a personalized graph structure that reflects their specific filtering criteria, case preferences, and reporting requirements. This customization extends to jurisdictional preferences, case value thresholds, and specific product categories.

### Multi-Agent System Design

The multi-agent architecture breaks down complex legal workflows into discrete, testable components. Rather than treating agents as highly autonomous entities, WhyHow.ai adopts a more controlled approach where agents represent specific workflow steps with defined inputs, outputs, and state management. This design philosophy stems from their recognition that legal applications require exceptional accuracy and auditability.

The company acknowledges significant challenges with agent reliability, noting that even with 95% accuracy per agent, chaining five agents sequentially results in only about 77% expected end-to-end accuracy. To address this, they implement extensive guardrails, human-in-the-loop verification, and state management systems. Their agents maintain both immediate and episodic memory, with careful attention to state capture, expansion, pruning, and querying.

### Production Deployment Challenges

The deployment of LLMs in legal contexts presents unique challenges that WhyHow.ai addresses through several production strategies. Legal professionals have extremely low tolerance for errors, requiring systems that can provide precise, auditable results in proper legal language. The company's approach involves treating LLMs not as standalone solutions but as components that enable previously impossible system integrations.
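The compounding-error arithmetic behind the ~77% figure cited above is easy to verify: if each of five agents independently succeeds 95% of the time, the whole chain succeeds only when every step does, so the per-step accuracies multiply.

```python
# Expected end-to-end success rate of a chain of independent agent steps:
# the chain succeeds only if every step succeeds, so accuracies multiply.
def chained_accuracy(per_step: float, steps: int) -> float:
    return per_step ** steps

print(round(chained_accuracy(0.95, 5), 3))  # 0.774 — the ~77% figure
```

The same formula shows why guardrails pay off: lifting each step to 99% raises the five-step chain to roughly 95%.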
They emphasize that their system is "not an LLM filtered system" but rather "an ML filtered system that LLMs have allowed us to pipe together." This distinction reflects a hybrid approach where traditional machine learning handles precise filtering and classification tasks, while LLMs provide the flexibility to connect disparate components and enable natural language interfaces.

### Data Pipeline and Web Scraping

The company's data acquisition strategy involves comprehensive web scraping across various platforms including government websites, specialized forums, Reddit communities, and complaint databases. They process this raw data through qualification pipelines that filter content down to signals relevant to specific legal cases.

The qualification process applies lawyer-specific schemas that reflect individual preferences for case types, jurisdictions, monetary thresholds, and other criteria. This personalization ensures that each law firm receives highly targeted intelligence rather than generic case alerts.

### RAG Implementation for Legal Discovery

WhyHow.ai's RAG implementation focuses heavily on legal discovery processes, where law firms must review massive document collections (often 500GB+ of emails and documents) provided by opposing parties. Their system extracts information from these documents and structures it into consistent graph representations, enabling lawyers to quickly identify relevant information while dismissing irrelevant materials.

The RAG system doesn't primarily operate as a conversational interface but rather as a report generation engine. The system queries relevant subgraphs from their broader knowledge structure and composes them into formatted reports that match lawyers' preferred working styles and document formats.

### Case Study: Automotive Fire Defects

The company provided a specific example involving automotive defect cases where vehicles catch fire unexpectedly.
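The lawyer-specific qualification step described above can be sketched as a rule filter over scraped complaints. The field names, thresholds, and complaint format here are assumptions for illustration, not WhyHow's actual schema:

```python
# Illustrative lawyer-specific qualification filter; the record fields and
# firm schema below are hypothetical, not WhyHow's data model.
complaints = [
    {"product": "acme-supplement", "jurisdiction": "CA", "est_value": 250_000},
    {"product": "acme-supplement", "jurisdiction": "TX", "est_value": 40_000},
]

firm_schema = {
    "jurisdictions": {"CA", "NY"},        # where this firm practices
    "min_case_value": 100_000,            # monetary threshold
    "case_types": {"acme-supplement"},    # products this firm tracks
}

def qualifies(complaint: dict, schema: dict) -> bool:
    return (
        complaint["jurisdiction"] in schema["jurisdictions"]
        and complaint["est_value"] >= schema["min_case_value"]
        and complaint["product"] in schema["case_types"]
    )

qualified = [c for c in complaints if qualifies(c, firm_schema)]
# Only the CA complaint passes this firm's schema.
```

Because each firm supplies its own `firm_schema`, the same scraped stream yields different, targeted alerts per client — the mass-customization pattern the case study describes.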
Their system monitors complaints across various platforms, tracking both the density of complaints (total volume) and their velocity (complaints per month) for specific vehicle models and years. When these metrics exceed certain thresholds, the system generates alerts and detailed reports for interested law firms.

This case study demonstrates their ability to identify patterns across distributed data sources and provide early warning of emerging legal issues. The automated analysis considers factors like make, model, year, jurisdiction, and potential case value to match opportunities with appropriate law firms.

## Critical Assessment and Limitations

While WhyHow.ai presents an innovative approach to legal technology, several aspects of their claims warrant careful consideration. The company's assertion of finding cases within 15 minutes, against an industry standard of 8-9 months, is impressive but lacks detailed validation metrics or independent verification.

The legal industry's conservative nature and emphasis on precedent may create adoption challenges for automated case discovery systems. Legal professionals typically rely on established networks, referrals, and traditional research methods, making them potentially resistant to algorithmic recommendations.

The company's acknowledgment of agent reliability issues (roughly 77% end-to-end accuracy for five chained 95%-accurate agents) highlights ongoing challenges in production LLM deployment. While they implement guardrails and human oversight, these reliability concerns could limit system adoption in risk-averse legal environments. Their hybrid approach of combining traditional ML with LLMs suggests recognition that current generative AI capabilities alone are insufficient for legal applications requiring high precision and auditability.

## Production Insights and Lessons

WhyHow.ai's experience offers several valuable insights for LLMOps practitioners.
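The density/velocity thresholding in the automotive example can be sketched as follows; the complaint records and both thresholds are hypothetical, chosen only to show the mechanism:

```python
from collections import Counter

# Hypothetical complaint log: (model, year, month) tuples scraped from
# forums and complaint databases. Thresholds are illustrative.
complaints = [
    ("roadster-x", 2022, "2025-01"),
    ("roadster-x", 2022, "2025-02"),
    ("roadster-x", 2022, "2025-02"),
    ("roadster-x", 2022, "2025-03"),
    ("sedan-y", 2021, "2025-03"),
]

DENSITY_THRESHOLD = 4    # total complaints for a model/year
VELOCITY_THRESHOLD = 2   # complaints within a single month

def alerts(records):
    density = Counter((model, year) for model, year, _ in records)
    velocity = Counter(records)  # counts per (model, year, month)
    flagged = {key for key, n in density.items() if n >= DENSITY_THRESHOLD}
    flagged |= {(m, y) for (m, y, _), n in velocity.items() if n >= VELOCITY_THRESHOLD}
    return flagged

print(alerts(complaints))  # {('roadster-x', 2022)}
```

Tracking both counters matters: density catches slow-burning issues that accumulate over time, while velocity catches sudden spikes before total volume is large.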
Their emphasis on schema-driven development, extensive guardrails, and hybrid architectures reflects practical approaches to deploying LLMs in high-stakes environments. The company's focus on state management, particularly capturing, expanding, pruning, and querying state through graph structures, demonstrates sophisticated thinking about long-term system maintenance and scalability.

Their personalization approach, where each client receives customized schemas and workflows, illustrates how LLMOps systems can provide mass customization while maintaining underlying technical coherence. The integration of traditional ML for precise tasks with LLMs for flexible connectivity represents a pragmatic approach to production deployment that leverages the strengths of both paradigms while mitigating their respective weaknesses.
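The capture/expand/prune/query lifecycle mentioned above can be sketched as a minimal episodic store. The structure and the eviction policy (keep the most recent N facts per topic) are assumptions for illustration, not WhyHow's implementation:

```python
from collections import defaultdict, deque

# Minimal episodic state store with capture, expand, prune, and query.
# The keep-most-recent-N pruning policy is an illustrative assumption.
class EpisodicState:
    def __init__(self, max_per_topic: int = 3):
        # deque(maxlen=...) prunes oldest facts automatically on capture
        self.store = defaultdict(lambda: deque(maxlen=max_per_topic))

    def capture(self, topic: str, fact: str) -> None:
        self.store[topic].append(fact)

    def expand(self, topic: str, facts: list) -> None:
        for fact in facts:
            self.capture(topic, fact)

    def query(self, topic: str) -> list:
        return list(self.store[topic])

state = EpisodicState(max_per_topic=2)
state.expand("case-123", ["complaint filed", "ingredient identified", "firm notified"])
print(state.query("case-123"))  # ['ingredient identified', 'firm notified']
```

Even this toy version shows why explicit pruning matters: without it, agent state grows without bound and stale facts crowd out the signal that downstream agents query.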
