Factiva, a Dow Jones business intelligence platform, implemented a secure, enterprise-scale LLM solution for its content aggregation service. It developed "Smart Summaries," a feature that allows natural language querying across its vast licensed content database of nearly 3 billion articles. The implementation required securing explicit GenAI licensing agreements from thousands of publishers, ensuring proper attribution and royalty tracking, and deploying secure cloud infrastructure using Google's Gemini model. The solution launched in November 2024 with nearly 4,000 publishers, growing to nearly 5,000 by early 2025.
Factiva is a business intelligence platform owned by Dow Jones (part of News Corporation) that has been aggregating and licensing content for nearly three decades. The platform provides curated business intelligence from thousands of content sources across 200 countries in over 30 languages, serving corporate clients, financial services firms, and professional researchers. In November 2024, Factiva launched “Smart Summaries,” an AI-powered feature that allows users to query the service using natural language and receive summarized responses with relevant sources—all built on content that has been explicitly licensed for generative AI use.
Tracy Mabery, General Manager of Factiva, discussed the implementation in a podcast interview, providing insight into how a legacy business intelligence platform has navigated the transition to generative AI while maintaining its core principles around content licensing, publisher relationships, and data security.
Before generative AI, Factiva was already built on machine learning and AI foundations. The platform was originally created as a joint venture to serve both qualitative and quantitative research needs in financial services. Core capabilities included advanced Boolean query operators, semantic filtering, rules management, and contextual search (distinguishing, for example, between “Apple” as a company versus “apple” as a fruit). The platform also employed human experts who partnered with corporate clients to build sophisticated search queries.
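The kind of contextual disambiguation described above can be illustrated with a toy sketch. This is not Factiva's implementation; the context-word lists and function name are hypothetical, standing in for what a real system would do with semantic filtering over a trained model.

```python
# Toy illustration of contextual search disambiguation, in the spirit of
# telling "Apple" the company apart from "apple" the fruit. The keyword
# lists and thresholds here are hypothetical, not Factiva's actual system.

COMPANY_CONTEXT = {"iphone", "shares", "earnings", "cupertino", "stock"}
FRUIT_CONTEXT = {"orchard", "pie", "harvest", "cider", "fruit"}

def classify_apple_mention(text: str) -> str:
    """Guess whether 'apple' in this text refers to the company or the
    fruit by counting co-occurring context words."""
    words = {w.strip(".,").lower() for w in text.split()}
    company_hits = len(words & COMPANY_CONTEXT)
    fruit_hits = len(words & FRUIT_CONTEXT)
    if company_hits > fruit_hits:
        return "company"
    if fruit_hits > company_hits:
        return "fruit"
    return "ambiguous"

print(classify_apple_mention("Apple shares rose after strong iPhone earnings"))  # company
print(classify_apple_mention("The apple harvest supplies local cider presses"))  # fruit
```

A production system would of course rely on learned embeddings rather than hand-built word lists, but the shape of the problem is the same: the surrounding context, not the term itself, determines relevance.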
This technical foundation proved crucial for the generative AI transition. The existing infrastructure for semantic search, content attribution, and royalty tracking provided the scaffolding upon which the new AI capabilities were built. Factiva’s corpus includes nearly three billion articles dating back to 1944, representing a massive archive that needed to be indexed and made searchable for the new AI features.
Factiva selected Google Gemini on Google Cloud as their foundation model and infrastructure partner. According to Mabery, the selection was driven primarily by security considerations, but also by Google’s ability to handle the scale of Factiva’s content corpus. The decision-making process included Google providing detailed network architecture diagrams showing exactly how the infrastructure would be built and secured.
The implementation uses a closed ecosystem approach where only content licensed for generative AI use feeds into the summarization engine. This is essentially a Retrieval-Augmented Generation (RAG) architecture where the retrieval is limited to the licensed corpus. The semantic search layer powers both the generative AI summarization features and the non-generative search capabilities, providing unified infrastructure.
When Factiva approached Google about indexing nearly three billion articles for their private cloud, it represented a significant technical undertaking. Mabery noted that even Google found the “three billion with a B” figure notable, suggesting the scale required careful infrastructure planning.
The company maintains an agnostic stance toward both models and cloud providers. While Google was selected for Smart Summaries, Dow Jones broadly aims to pick “best of breed” solutions for each specific use case. There was no explicit preference stated for proprietary versus open-source models—security, capability, and fit for purpose appear to be the driving factors.
Perhaps the most distinctive aspect of Factiva’s approach is their publisher-by-publisher content licensing strategy. Rather than following the approach of many frontier AI companies that trained on publicly available internet data, Factiva went to each of their thousands of publishers to secure explicit generative AI licensing rights.
This process involved significant education, as many publishers were unfamiliar with AI terminology and concepts in the early days. Terms like "RAG model," "retrieval augmented generation," "agents," and "hallucinations" were new to the publishing community, and the conversations focused on working through publishers' main concerns.
Factiva leveraged its existing royalty and attribution infrastructure, which has been tracking content usage and compensating publishers for nearly three decades. The concept of “royalty moments” was extended to generative AI, creating new compensation opportunities for publishers who opt into AI licensing.
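The "royalty moment" mechanic described above can be sketched as a simple ledger. The class and method names here are hypothetical; the point is only that each AI summary that cites a publisher's content generates a trackable compensation event, just as traditional article views did.

```python
# Toy ledger extending per-use "royalty moments" to generative AI:
# each publisher cited in an AI summary earns a credit. Names are
# illustrative, not Factiva's actual royalty system.
from collections import Counter

class RoyaltyLedger:
    def __init__(self) -> None:
        self.credits: Counter = Counter()

    def record_summary(self, cited_sources: list[str]) -> None:
        """Record one royalty moment per distinct publisher cited
        in a generated summary."""
        for publisher in set(cited_sources):
            self.credits[publisher] += 1

ledger = RoyaltyLedger()
ledger.record_summary(["Reuters", "WSJ", "Reuters"])  # Reuters counted once
ledger.record_summary(["WSJ"])
print(ledger.credits["WSJ"])  # 2
```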
The results have been impressive from a partnership perspective. At launch in November 2024, nearly 4,000 publishers had signed generative AI licensing agreements, up from just 2,000 sources six months prior. By the time of the interview (early 2025), the number had grown to nearly 5,000 sources.
The closed ecosystem approach serves as a primary defense against hallucinations. Because the generative AI can only draw from licensed content within Factiva’s corpus, it cannot hallucinate information from the broader internet or its training data. This is described as more of a “language task” than a “knowledge task”—the AI summarizes and synthesizes existing content rather than generating novel claims from parametric knowledge.
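Beyond restricting retrieval, a system like this can also verify grounding after generation. The sketch below, a crude word-overlap check with an assumed threshold, flags summary sentences that lack support in the source passages; real groundedness checks typically use entailment models, but the shape is the same.

```python
# Crude post-hoc grounding check: flag summary sentences whose word
# overlap with the licensed source passages is too low. Threshold and
# function name are illustrative assumptions.
def unsupported_sentences(summary: str, sources: list[str],
                          threshold: float = 0.5) -> list[str]:
    source_words = set(" ".join(sources).lower().split())
    flagged = []
    for sentence in filter(None, (s.strip() for s in summary.split("."))):
        words = set(sentence.lower().split())
        overlap = len(words & source_words) / max(len(words), 1)
        if overlap < threshold:
            flagged.append(sentence)  # likely not grounded in the sources
    return flagged

sources = ["revenue grew ten percent last quarter"]
summary = "Revenue grew ten percent. Aliens landed on Mars."
print(unsupported_sentences(summary, sources))  # ['Aliens landed on Mars']
```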
The system emphasizes three core search principles: relevancy, recency, and context. These guide how content is surfaced and summarized. Factiva provides explicit disclaimers noting that summaries are generative AI-initiated, along with detailed technical explanations of how the search algorithm surfaces information.
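The three principles can be combined into a single ranking score, as in the sketch below. The weights, the exponential recency decay, and the half-life are illustrative assumptions, not Factiva's published algorithm; the point is that recency is a continuous decay factor alongside relevance and context fit.

```python
# Illustrative ranking score combining relevancy, recency, and context.
# Weights and half-life decay are assumptions, not Factiva's algorithm.
from datetime import date

def rank_score(relevance: float, published: date, today: date,
               context_match: float, half_life_days: float = 30.0) -> float:
    """relevance and context_match are 0..1; recency decays
    exponentially, halving every `half_life_days` days."""
    age_days = (today - published).days
    recency = 0.5 ** (age_days / half_life_days)
    return 0.5 * relevance + 0.3 * recency + 0.2 * context_match
```

With equal relevance and context fit, a fresher article outranks a stale one, while a highly relevant older piece can still beat a marginally relevant new one.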
Mabery acknowledged that hallucinations remain an industry-wide challenge and that user tolerance for AI errors may be evolving. She noted the example of Apple Intelligence having to pull a feature due to errors, suggesting that egregious mistakes still require addressing even as users become more accustomed to AI limitations.
Factiva’s approach to generative AI is guided by three core principles:
The company explicitly states that they only create licensing deals they would sign themselves, suggesting alignment between their own publishing interests and what they ask of partners.
Several LLMOps-relevant observations emerge from the case study:
While specific details were not confirmed, the interview hinted at several future directions:
The trajectory suggests Factiva views generative AI as moving from “wish list” to “table stakes” for business intelligence platforms, with continued acceleration of AI capabilities expected.
While the case study presents a compelling model for responsible AI deployment in content aggregation, several considerations merit attention:
Nevertheless, Factiva’s approach represents a significant case study in building production AI systems with explicit attention to content rights, publisher relationships, and transparent compensation—areas where many AI deployments have faced criticism.
Snorkel developed a specialized benchmark dataset for evaluating AI agents in insurance underwriting, leveraging their expert network of Chartered Property and Casualty Underwriters (CPCUs). The benchmark simulates an AI copilot that assists junior underwriters by reasoning over proprietary knowledge, using multiple tools including databases and underwriting guidelines, and engaging in multi-turn conversations. The evaluation revealed significant performance variations across frontier models (single digits to ~80% accuracy), with notable error modes including tool use failures (36% of conversations) and hallucinations from pretrained domain knowledge, particularly from OpenAI models which hallucinated non-existent insurance products 15-45% of the time.
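Error-mode rates like the tool-use and hallucination figures above come from aggregating per-conversation flags across the benchmark. A minimal sketch of that aggregation, with hypothetical flag names and data shape:

```python
# Aggregate per-conversation error flags into benchmark-level rates.
# The flag names ("tool_use", "hallucination") and record shape are
# hypothetical, standing in for Snorkel's actual annotation schema.
from collections import Counter

def error_mode_rates(conversations: list[dict]) -> dict[str, float]:
    """Fraction of conversations exhibiting each error mode."""
    n = len(conversations)
    counts: Counter = Counter()
    for convo in conversations:
        for mode in set(convo["errors"]):  # count each mode once per convo
            counts[mode] += 1
    return {mode: c / n for mode, c in counts.items()}

convos = [
    {"errors": ["tool_use"]},
    {"errors": []},
    {"errors": ["tool_use", "hallucination"]},
    {"errors": []},
]
print(error_mode_rates(convos))  # {'tool_use': 0.5, 'hallucination': 0.25}
```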
Stripe, processing approximately 1.3% of global GDP, has evolved from traditional ML-based fraud detection to deploying transformer-based foundation models for payments that process every transaction in under 100ms. The company built a domain-specific foundation model treating charges as tokens and behavior sequences as context windows, ingesting tens of billions of transactions to power fraud detection, improving card-testing detection from 59% to 97% accuracy for large merchants. Stripe also launched the Agentic Commerce Protocol (ACP) jointly with OpenAI to standardize how agents discover and purchase from merchant catalogs, complemented by internal AI adoption reaching 8,500 employees daily using LLM tools, with 65-70% of engineers using AI coding assistants and achieving significant productivity gains like reducing payment method integrations from 2 months to 2 weeks.
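The "charges as tokens, behavior sequences as context windows" framing can be sketched as a simple tokenization step. The field names and bucketing below are hypothetical, not Stripe's actual feature schema; the idea is that each charge collapses to a discrete token so a card's history becomes a sequence a transformer can consume.

```python
# Sketch of treating charges as tokens and a card's recent behavior as a
# bounded context window. Field names and bucketing are assumptions.
def charge_to_token(charge: dict) -> str:
    """Collapse a charge's salient fields into one discrete token."""
    amount_bucket = "high" if charge["amount"] > 100 else "low"
    return (f"{charge['merchant_category']}|{charge['country']}|"
            f"{amount_bucket}|{charge['outcome']}")

def card_context_window(charges: list[dict], max_len: int = 512) -> list[str]:
    """The most recent charges on a card, as a token sequence."""
    return [charge_to_token(c) for c in charges[-max_len:]]

charges = [
    {"merchant_category": "5411", "country": "US", "amount": 50, "outcome": "success"},
    {"merchant_category": "5812", "country": "GB", "amount": 250, "outcome": "declined"},
]
print(card_context_window(charges))
# ['5411|US|low|success', '5812|GB|high|declined']
```

Patterns like card testing then show up as recognizable token subsequences (bursts of small declined charges), which is what a sequence model can learn to flag.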
Notion AI, serving over 100 million users with multiple AI features including meeting notes, enterprise search, and deep research tools, demonstrates how rigorous evaluation and observability practices are essential for scaling AI product development. The company uses Braintrust as its evaluation platform to manage the complexity of supporting multilingual workspaces, rapid model switching, and maintaining product polish while building at the speed of AI industry innovation. Their approach emphasizes that 90% of AI development time should be spent on evaluation and observability rather than prompting, with specialized data specialists creating targeted datasets and custom LLM-as-a-judge scoring functions to ensure consistent quality across their diverse AI product suite.
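An LLM-as-a-judge scoring function of the kind described above has a simple general shape: a prompt template around one quality criterion, plus a parser for the judge model's verdict. The sketch below uses a pluggable `call_llm` function and illustrative prompt wording; it is not Notion's actual scorer.

```python
# Sketch of an LLM-as-a-judge scorer for a single quality criterion.
# `call_llm` is any text-completion function; prompt wording is illustrative.
from typing import Callable

def make_judge(criterion: str,
               call_llm: Callable[[str], str]) -> Callable[[str, str], bool]:
    """Return a scorer that asks the judge model whether an output
    satisfies `criterion` for a given input."""
    def judge(input_text: str, output_text: str) -> bool:
        prompt = (
            f"Criterion: {criterion}\n"
            f"Input: {input_text}\n"
            f"Output: {output_text}\n"
            "Does the output satisfy the criterion? Answer PASS or FAIL."
        )
        return call_llm(prompt).strip().upper().startswith("PASS")
    return judge

# Usage with a stubbed judge model standing in for a real LLM call:
judge = make_judge("Response is in the same language as the input",
                   lambda prompt: "PASS")
print(judge("¿Qué es Notion?", "Notion es un espacio de trabajo."))  # True
```

In an evaluation platform, many such judges (one per criterion, such as language match or formatting) run over curated datasets, and their pass rates become the regression metrics watched during model switches.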