Statista, a global data platform, developed and optimized a RAG-based AI search system to enhance their platform's search capabilities. Working with Urial Labs and Talent Formation, they transformed a basic prototype into a production-ready system that improved search quality by 140%, reduced costs by 65%, and decreased latency by 10%. The resulting Research AI product has seen growing adoption among paying customers and demonstrates superior performance compared to general-purpose LLMs for domain-specific queries.
This case study traces the journey of implementing and optimizing a production LLM system at Statista, a global data platform serving over 30,000 paying customers with millions of statistics across various industries.
# Context and Business Challenge
Statista faced a significant challenge in early 2023 with the emergence of ChatGPT and other LLMs. As a platform hosting millions of statistics and serving 23 million views per month, they needed to enhance their search and discovery capabilities while maintaining their position as a trusted data source. The challenge was particularly acute given that 66-80% of their traffic comes from organic search.
# Initial Approach and Development
The journey began with a methodical approach:
* Dedicated one engineer for two months to explore potential use cases
* Created an initial prototype to prove the concept
* Brought in external expertise (Urial Labs) for production optimization
# Technical Implementation Details
The system was implemented as a RAG (Retrieval-Augmented Generation) application with several key components, sketched in code after this list:
* Vector store for semantic search across millions of statistics
* Multi-stage retrieval and ranking system
* Answer generation using LLMs
* Quality rating system for answers
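To make the pipeline shape concrete, here is a minimal sketch of how these four components might compose. The `vector_store`, `rerank`, and `llm` interfaces are illustrative assumptions, not Statista's actual code:

```python
from dataclasses import dataclass

@dataclass
class Document:
    id: str
    text: str

def answer_query(query: str, vector_store, rerank, llm) -> dict:
    """End-to-end RAG flow: retrieve, rerank, generate, then rate."""
    # 1. Semantic retrieval over the statistics corpus (vector store)
    candidates: list[Document] = vector_store.search(query, top_k=40)

    # 2. Rerank so the most relevant statistics come first, keep the best few
    ranked = rerank(query, candidates)[:5]

    # 3. Generate an answer grounded only in the retrieved statistics
    context = "\n\n".join(doc.text for doc in ranked)
    answer = llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")

    # 4. Rate the answer so weak responses can be flagged or retried
    rating = llm(f"Rate this answer's faithfulness to the context from 1-5:\n{answer}")
    return {"answer": answer, "sources": [doc.id for doc in ranked], "rating": rating}
```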
The initial implementation had significant challenges (see the reranking sketch after this list):
* 42 LLM calls per request (40 for reranking, 1 for answering, 1 for rating)
* High latency (~30 seconds)
* High costs (~8 cents per query)
* Low answer quality (scoring 30% on internal metrics)
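Forty reranking calls per request is consistent with a pointwise LLM reranker that scores each retrieved candidate in its own request. The sketch below shows that common (and expensive) pattern under assumed interfaces; it is not the team's actual implementation:

```python
def pointwise_rerank(query: str, candidates: list, llm) -> list:
    """Scores every candidate with its own LLM call: with 40 candidates
    this alone costs 40 sequential requests per user query."""
    scored = []
    for doc in candidates:  # one network round-trip per document
        reply = llm(
            "On a scale of 0-10, how relevant is this statistic to the question?\n"
            f"Question: {query}\nStatistic: {doc.text}\nScore:"
        )
        scored.append((float(reply.strip()), doc))  # assumes a numeric reply
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]
```

Collapsing this into a single listwise prompt, a cheaper cross-encoder, or pure embedding similarity is the standard lever for cutting both the call count and the latency.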
# Optimization Process and Methodology
The team implemented a systematic optimization approach:
* Established comprehensive traceability to understand performance bottlenecks
* Defined clear metrics prioritizing quality, then cost, then latency
* Created a reference dataset with expert-validated answers
* Implemented automated testing infrastructure for rapid experimentation (sketched after this list)
* Conducted over 100 experiments to optimize performance
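One plausible shape for that testing infrastructure is a harness that replays the expert-validated reference set through a candidate pipeline and reports the three metrics in priority order. All names below are illustrative assumptions:

```python
import time
from statistics import mean

def judge(answer: str, expected: str) -> float:
    """Placeholder scorer; in practice an LLM-as-judge or expert rubric."""
    return float(expected.lower() in answer.lower())

def run_experiment(pipeline, reference_set: list[dict]) -> dict:
    """Replays expert-validated (question, expected_answer) pairs through a
    candidate pipeline and aggregates the three target metrics."""
    qualities, costs, latencies = [], [], []
    for case in reference_set:
        start = time.perf_counter()
        result = pipeline(case["question"])  # returns answer plus usage info
        latencies.append(time.perf_counter() - start)
        costs.append(result["cost_usd"])
        qualities.append(judge(result["answer"], case["expected_answer"]))
    return {
        "quality": mean(qualities),    # optimize this first
        "cost_usd": mean(costs),       # then this
        "latency_s": mean(latencies),  # then this
    }
```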
Key technical innovations included:
## Query Processing Improvements
* Implemented query rewriting for better semantic matching
* Developed multi-query approach to capture different aspects of complex questions
* Utilized the Hypothetical Document Embeddings (HyDE) technique to improve retrieval quality (both approaches are sketched below)
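HyDE works by having an LLM draft a hypothetical answer and embedding that draft instead of the raw question, since answer-shaped text tends to sit closer in embedding space to the documents being sought; the multi-query approach fans a complex question out into several rewrites and merges the results. A hedged sketch of both, assuming `llm`, `embed`, and `vector_store.search_by_vector` interfaces that are not from the source:

```python
def hyde_retrieve(question: str, llm, embed, vector_store, top_k: int = 10):
    """HyDE: embed a hypothetical answer rather than the question itself."""
    # Draft a plausible (possibly wrong) answer to the question
    hypothetical = llm(
        "Write a short passage, in the style of a statistics report, that "
        f"answers this question:\n{question}"
    )
    # Search with the draft's embedding, which tends to land nearer to
    # real answer documents than the terse question does
    return vector_store.search_by_vector(embed(hypothetical), top_k=top_k)

def multi_query_retrieve(question: str, llm, embed, vector_store, n: int = 3):
    """Multi-query: cover different facets of a complex question."""
    rewrites = llm(
        f"Rewrite this question {n} ways, one per line, each emphasizing "
        f"a different aspect:\n{question}"
    ).splitlines()
    seen, merged = set(), []
    for q in [question, *rewrites]:
        for doc in vector_store.search_by_vector(embed(q), top_k=5):
            if doc.id not in seen:  # deduplicate across sub-queries
                seen.add(doc.id)
                merged.append(doc)
    return merged
```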
## Model Selection and Optimization
* Conducted comprehensive model comparisons across different providers
* Evaluated trade-offs between quality, cost, and latency
* Implemented dynamic model selection based on query complexity (illustrated below)
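Dynamic model selection usually means classifying each incoming query and routing easy lookups to a small, cheap model while reserving the strongest model for complex analytical questions. A minimal illustrative router follows; the heuristic signals and the 0.6 threshold are assumptions, not Statista's configuration:

```python
def classify_complexity(query: str) -> float:
    """Toy stand-in for a learned classifier: long, multi-part,
    analytical questions are treated as harder."""
    signals = [
        len(query.split()) > 15,
        any(w in query.lower() for w in ("compare", "trend", "why", "forecast")),
        query.count(",") >= 2,
    ]
    return sum(signals) / len(signals)

def route(query: str) -> str:
    """Send simple lookups to a cheap model and reserve the strong
    (and expensive) model for complex questions."""
    return "large-model" if classify_complexity(query) > 0.6 else "small-model"
```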
# Results and Production Implementation
The optimization efforts yielded impressive results:
* 140% improvement in answer quality
* 65% reduction in costs
* 10% improvement in latency (after reinvesting some gains into quality improvements)
The production system includes several sophisticated features:
* Parallel retrieval pipelines (sketched after this list)
* Dynamic model selection
* Automated quality assessment
* Key fact extraction and visualization
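"Parallel retrieval pipelines" most naturally means firing the different retrieval strategies (plain query, rewrites, HyDE) concurrently, so total latency is bounded by the slowest strategy rather than the sum of all of them. A sketch with asyncio, assuming each pipeline is an async callable returning documents:

```python
import asyncio

async def parallel_retrieve(question: str, pipelines) -> list:
    """Runs every retrieval strategy concurrently; wall-clock time is
    bounded by the slowest pipeline, not the sum of all of them."""
    results = await asyncio.gather(*(p(question) for p in pipelines))
    seen, merged = set(), []
    for docs in results:
        for doc in docs:
            if doc.id not in seen:  # merge and deduplicate across pipelines
                seen.add(doc.id)
                merged.append(doc)
    return merged

# usage (pipeline functions assumed to be async):
# docs = asyncio.run(parallel_retrieve(q, [plain_search, hyde_search, multi_query_search]))
```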
# Business Impact and Adoption
The system, launched as "Research AI", has shown strong business results:
* Increasing usage among paying customers
* Low bounce rates indicating good user engagement
* Higher content interaction rates compared to traditional search
* Competitive performance against leading generative AI models
# Production Monitoring and Continuous Improvement
The team implemented:
* Continuous quality benchmarking against leading AI models (sketched after this list)
* Regular quality metric updates and calibration
* A/B testing for new features and integrations
* Usage monitoring and cost tracking
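Continuous benchmarking against leading models can be as simple as a scheduled job that replays a fixed question set through both the production system and a baseline model, then tracks the win rate over time. The sketch below assumes a `judge` callable (a human panel or LLM-as-judge) and is not the team's actual harness:

```python
def benchmark_win_rate(questions, research_ai, baseline_llm, judge) -> float:
    """Fraction of questions where the production system beats the
    baseline, as decided by the judge ("a" = production answer wins)."""
    wins = 0
    for q in questions:
        ours, theirs = research_ai(q), baseline_llm(q)
        wins += judge(question=q, answer_a=ours, answer_b=theirs) == "a"
    return wins / len(questions)
```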
# Innovation and Future Directions
The project has spawned additional innovations:
* Development of an AI Router product for optimizing model selection
* Exploration of new business models including data licensing for LLM training
* Integration possibilities with enterprise customers' internal AI systems
# Key Learnings
* Importance of systematic optimization methodology
* Value of comprehensive metrics and testing infrastructure
* Need for balanced approach to quality, cost, and latency
* Significance of production-ready monitoring and evaluation systems
The case study demonstrates how careful engineering, systematic optimization, and focus on production metrics can transform a proof-of-concept AI system into a valuable production service. The team's approach to balancing quality, cost, and performance while maintaining a focus on user value provides valuable insights for similar LLMOps initiatives.