Company
ClimateAligned
Title
RAG-Based System for Climate Finance Document Analysis
Industry
Finance
Year
2023
Summary (short)
ClimateAligned, an early-stage startup, developed a RAG-based system to analyze climate-related financial documents and assess their "greenness." Starting with a small team of 2-3 engineers, they built a solution that combines LLMs, hybrid search, and human-in-the-loop processes to achieve 99% accuracy in document analysis. The system reduced analysis time from 2 hours to 20 minutes per company, even with human verification, and successfully evolved from a proof-of-concept to serving their first users while maintaining high accuracy standards.
ClimateAligned is tackling the challenge of analyzing climate-related financial documents at scale to help major financial institutions make informed investment decisions in climate initiatives. This case study provides a detailed look at how a small startup team approached building and deploying LLMs in production, with particular attention to maintaining high accuracy while scaling operations.

The core problem they addressed was analyzing financial instruments and company documents to assess their climate-related aspects. This traditionally required experts spending hours manually reviewing documents, making the process expensive and slow. Their solution combines RAG (Retrieval Augmented Generation) with human expertise to create a scalable, accurate system.

## Technical Architecture and Evolution

The system architecture evolved through several stages, starting with a basic RAG implementation and gradually adding more sophisticated components:

* Document Processing: They built an in-house document ingestion and management system to handle varied, unstructured document formats from different companies.
* Search System: They implemented a hybrid search combining multiple approaches (sketched below):
  * Vector similarity search for semantic matching
  * BM25 keyword search
  * Reciprocal Rank Fusion (RRF) for reranking
  * Domain-specific heuristics to narrow search spaces

The team made a strategic decision to start with OpenAI's GPT-4 for their core LLM operations, acknowledging it as a reliable though expensive starting point. This choice allowed them to focus on proving value before investing in custom models or fine-tuning.
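The case study does not publish implementation details for the retrieval layer, so the following is only a minimal sketch of what a hybrid retriever of this shape could look like. The `vector_index` and `bm25_index` objects and their `search` methods are assumed interfaces, not from the source, standing in for whatever vector store and keyword index ClimateAligned actually uses; the `section_filter` argument stands in for their domain-specific heuristics. The fusion step itself follows the standard RRF formulation, summing 1 / (k + rank) across the ranked lists.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list it appears in; k=60 is the constant commonly used in the literature.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def hybrid_search(query, vector_index, bm25_index, top_k=10, section_filter=None):
    """Hypothetical hybrid retriever: semantic and keyword search fused with RRF.

    `vector_index.search` and `bm25_index.search` are assumed to return
    document IDs ordered by relevance; `section_filter` represents the
    domain-specific heuristics used to narrow the search space.
    """
    semantic_hits = vector_index.search(query, top_k=top_k * 3, filter=section_filter)
    keyword_hits = bm25_index.search(query, top_k=top_k * 3, filter=section_filter)
    fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
    return fused[:top_k]
```

Fusing on ranks rather than raw scores sidesteps the fact that cosine similarities and BM25 scores live on incompatible scales, which is a common reason teams reach for RRF over weighted score blending.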
## Human-in-the-Loop Integration

A key insight from their implementation was the effective use of human-in-the-loop processes. Their analysts achieved 99% accuracy, while the pure LLM system reached about 85%. Rather than viewing this gap as a limitation, they leveraged it as a strength:

* Initial System: Questions and answers were generated by the LLM but reviewed by human experts
* Efficiency Gain: Reduced analysis time from 2 hours to 20 minutes per company
* Data Collection: Built up a valuable dataset of correct and incorrect examples
* Classification System: After collecting ~10,000 examples, developed a classifier to identify which results needed human review (a sketch of such a routing classifier appears at the end of this write-up)

## Scaling and Performance Improvements

The team focused on incremental improvements while maintaining high accuracy:

* Search Quality: The biggest accuracy improvements came from enhancing search functionality
* Topic-Based Ranking: Added context-aware search using topic detection
* Document Context: Working on incorporating longer-range context in document analysis
* Model Selection: Moving from exclusively using GPT-4 to specialized models for different tasks

## Engineering Challenges and Solutions

Several key challenges emerged during development:

* Document Processing: Handling inconsistent document formats and structures
* Context Windows: Balancing chunk size for context while maintaining specificity
* API Limitations: Dealing with slow and sometimes unreliable OpenAI API responses
* Scale Requirements: Planning for growth from hundreds to thousands of documents

## Production Considerations

The team maintained several key principles for production deployment:

* Accuracy Tracking: Continuous monitoring of system accuracy
* Auditability: Maintaining source traceability for all answers
* Scalability: Building systems that could handle increasing document volumes
* Cost Management: Planning for a transition from expensive API models to specialized in-house solutions

## Future Directions

ClimateAligned is working on several improvements:

* Developing topic-specific fine-tuned models
* Improving document context handling
* Scaling to handle historical data across multiple industries
* Building comparison capabilities across companies and financial instruments

## Key Lessons

The case study highlights several important lessons for LLMOps:

* Starting with expensive but reliable tools (like GPT-4) can be a valid way to prove initial value
* Human-in-the-loop processes can bridge the gap between current AI capabilities and required accuracy
* Focusing on search quality can provide significant accuracy improvements
* Building domain-specific solutions can outperform general-purpose approaches
* Incremental improvement while maintaining accuracy is more valuable than rapid scaling with lower quality

The ClimateAligned case study demonstrates how a small team can successfully deploy LLMs in production by focusing on specific domain problems and gradually building up capabilities while maintaining high accuracy standards.
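As a companion to the human-in-the-loop discussion above, here is a rough sketch of what the review-routing classifier could look like. The case study does not describe the features or model family used, so this uses a TF-IDF plus logistic regression pipeline from scikit-learn purely as a stand-in, and assumes each of the ~10,000 reviewed answers is available as a text blob (question, retrieved context, and generated answer concatenated) with a correct/incorrect label.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: one row per previously reviewed LLM answer.
# `texts` concatenates question, retrieved context, and generated answer;
# `labels` is 1 if the analyst marked the answer correct, 0 otherwise.
def train_review_router(texts, labels):
    """Fit a simple correct/incorrect classifier on past human reviews."""
    model = make_pipeline(
        TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model


def needs_human_review(model, text, confidence_threshold=0.9):
    """Route an answer to an analyst unless the model is confident it is correct."""
    p_correct = model.predict_proba([text])[0][1]
    return p_correct < confidence_threshold
```

The confidence threshold is the operational lever here: raising it sends more answers to analysts, protecting the accuracy bar at the cost of throughput, while lowering it trades some review effort for speed.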
