Factiva, a Dow Jones business intelligence platform, implemented a secure, enterprise-scale LLM solution for its content aggregation service. It developed "Smart Summaries," a feature that allows natural language querying across its vast licensed content database of nearly 3 billion articles. The implementation required securing explicit GenAI licensing agreements from thousands of publishers, ensuring proper attribution and royalty tracking, and deploying secure cloud infrastructure using Google's Gemini model. The solution launched in November 2024 with nearly 4,000 publishers, growing to nearly 5,000 by early 2025.
Factiva is a business intelligence platform owned by Dow Jones (part of News Corporation) that has been aggregating and licensing content for nearly three decades. The platform provides curated business intelligence from thousands of content sources across 200 countries in over 30 languages, serving corporate clients, financial services firms, and professional researchers. In November 2024, Factiva launched “Smart Summaries,” an AI-powered feature that allows users to query the service using natural language and receive summarized responses with relevant sources—all built on content that has been explicitly licensed for generative AI use.
Tracy Mabery, General Manager of Factiva, discussed the implementation in a podcast interview, providing insight into how a legacy business intelligence platform has navigated the transition to generative AI while maintaining its core principles around content licensing, publisher relationships, and data security.
Before generative AI, Factiva was already built on machine learning and AI foundations. The platform was originally created as a joint venture to serve both qualitative and quantitative research needs in financial services. Core capabilities included advanced Boolean query operators, semantic filtering, rules management, and contextual search (distinguishing, for example, between “Apple” as a company versus “apple” as a fruit). The platform also employed human experts who partnered with corporate clients to build sophisticated search queries.
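The kind of contextual disambiguation described above can be illustrated with a toy sketch. This is not Factiva's implementation; the context-word lists and function name are hypothetical, standing in for what a real system would do with semantic filtering over a trained model.

```python
# Toy illustration of contextual search disambiguation, in the spirit of
# telling "Apple" the company apart from "apple" the fruit. The keyword
# lists and thresholds here are hypothetical, not Factiva's actual system.

COMPANY_CONTEXT = {"iphone", "shares", "earnings", "cupertino", "stock"}
FRUIT_CONTEXT = {"orchard", "pie", "harvest", "cider", "fruit"}

def classify_apple_mention(text: str) -> str:
    """Guess whether 'apple' in this text refers to the company or the
    fruit by counting co-occurring context words."""
    words = {w.strip(".,").lower() for w in text.split()}
    company_hits = len(words & COMPANY_CONTEXT)
    fruit_hits = len(words & FRUIT_CONTEXT)
    if company_hits > fruit_hits:
        return "company"
    if fruit_hits > company_hits:
        return "fruit"
    return "ambiguous"

print(classify_apple_mention("Apple shares rose after strong iPhone earnings"))  # company
print(classify_apple_mention("The apple harvest supplies local cider presses"))  # fruit
```

A production system would of course rely on learned embeddings rather than hand-built word lists, but the shape of the problem is the same: the surrounding context, not the term itself, determines relevance.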
This technical foundation proved crucial for the generative AI transition. The existing infrastructure for semantic search, content attribution, and royalty tracking provided the scaffolding upon which the new AI capabilities were built. Factiva’s corpus includes nearly three billion articles dating back to 1944, representing a massive archive that needed to be indexed and made searchable for the new AI features.
Factiva selected Google Gemini on Google Cloud as their foundation model and infrastructure partner. According to Mabery, the selection was driven primarily by security considerations, but also by Google’s ability to handle the scale of Factiva’s content corpus. The decision-making process included Google providing detailed network architecture diagrams showing exactly how the infrastructure would be built and secured.
The implementation uses a closed ecosystem approach where only content licensed for generative AI use feeds into the summarization engine. This is essentially a Retrieval-Augmented Generation (RAG) architecture where the retrieval is limited to the licensed corpus. The semantic search layer powers both the generative AI summarization features and the non-generative search capabilities, providing unified infrastructure.
When Factiva approached Google about indexing nearly three billion articles for their private cloud, it represented a significant technical undertaking. Mabery noted that even Google found the “three billion with a B” figure notable, suggesting the scale required careful infrastructure planning.
The company maintains an agnostic stance toward both models and cloud providers. While Google was selected for Smart Summaries, Dow Jones broadly aims to pick “best of breed” solutions for each specific use case. There was no explicit preference stated for proprietary versus open-source models—security, capability, and fit for purpose appear to be the driving factors.
Perhaps the most distinctive aspect of Factiva’s approach is their publisher-by-publisher content licensing strategy. Rather than following the approach of many frontier AI companies that trained on publicly available internet data, Factiva went to each of their thousands of publishers to secure explicit generative AI licensing rights.
This process involved significant education, as many publishers were unfamiliar with AI terminology and concepts in the early days. Terms like "RAG model," "retrieval augmented generation," "agents," and "hallucinations" were new to the publishing community, and the conversations focused on working through publishers' main concerns.
Factiva leveraged its existing royalty and attribution infrastructure, which has been tracking content usage and compensating publishers for nearly three decades. The concept of “royalty moments” was extended to generative AI, creating new compensation opportunities for publishers who opt into AI licensing.
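The "royalty moment" mechanic described above can be sketched as a simple ledger. The class and method names here are hypothetical; the point is only that each AI summary that cites a publisher's content generates a trackable compensation event, just as traditional article views did.

```python
# Toy ledger extending per-use "royalty moments" to generative AI:
# each publisher cited in an AI summary earns a credit. Names are
# illustrative, not Factiva's actual royalty system.
from collections import Counter

class RoyaltyLedger:
    def __init__(self) -> None:
        self.credits: Counter = Counter()

    def record_summary(self, cited_sources: list[str]) -> None:
        """Record one royalty moment per distinct publisher cited
        in a generated summary."""
        for publisher in set(cited_sources):
            self.credits[publisher] += 1

ledger = RoyaltyLedger()
ledger.record_summary(["Reuters", "WSJ", "Reuters"])  # Reuters counted once
ledger.record_summary(["WSJ"])
print(ledger.credits["WSJ"])  # 2
```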
The results have been impressive from a partnership perspective. At launch in November 2024, nearly 4,000 publishers had signed generative AI licensing agreements, up from just 2,000 sources six months prior. By the time of the interview (early 2025), the number had grown to nearly 5,000 sources.
The closed ecosystem approach serves as a primary defense against hallucinations. Because the generative AI can only draw from licensed content within Factiva’s corpus, it cannot hallucinate information from the broader internet or its training data. This is described as more of a “language task” than a “knowledge task”—the AI summarizes and synthesizes existing content rather than generating novel claims from parametric knowledge.
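Beyond restricting retrieval, a system like this can also verify grounding after generation. The sketch below, a crude word-overlap check with an assumed threshold, flags summary sentences that lack support in the source passages; real groundedness checks typically use entailment models, but the shape is the same.

```python
# Crude post-hoc grounding check: flag summary sentences whose word
# overlap with the licensed source passages is too low. Threshold and
# function name are illustrative assumptions.
def unsupported_sentences(summary: str, sources: list[str],
                          threshold: float = 0.5) -> list[str]:
    source_words = set(" ".join(sources).lower().split())
    flagged = []
    for sentence in filter(None, (s.strip() for s in summary.split("."))):
        words = set(sentence.lower().split())
        overlap = len(words & source_words) / max(len(words), 1)
        if overlap < threshold:
            flagged.append(sentence)  # likely not grounded in the sources
    return flagged

sources = ["revenue grew ten percent last quarter"]
summary = "Revenue grew ten percent. Aliens landed on Mars."
print(unsupported_sentences(summary, sources))  # ['Aliens landed on Mars']
```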
The system emphasizes three core search principles: relevancy, recency, and context. These guide how content is surfaced and summarized. Factiva provides explicit disclaimers noting that summaries are generative AI-initiated, along with detailed technical explanations of how the search algorithm surfaces information.
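The three principles can be combined into a single ranking score, as in the sketch below. The weights, the exponential recency decay, and the half-life are illustrative assumptions, not Factiva's published algorithm; the point is that recency is a continuous decay factor alongside relevance and context fit.

```python
# Illustrative ranking score combining relevancy, recency, and context.
# Weights and half-life decay are assumptions, not Factiva's algorithm.
from datetime import date

def rank_score(relevance: float, published: date, today: date,
               context_match: float, half_life_days: float = 30.0) -> float:
    """relevance and context_match are 0..1; recency decays
    exponentially, halving every `half_life_days` days."""
    age_days = (today - published).days
    recency = 0.5 ** (age_days / half_life_days)
    return 0.5 * relevance + 0.3 * recency + 0.2 * context_match
```

With equal relevance and context fit, a fresher article outranks a stale one, while a highly relevant older piece can still beat a marginally relevant new one.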
Mabery acknowledged that hallucinations remain an industry-wide challenge and that user tolerance for AI errors may be evolving. She noted the example of Apple Intelligence having to pull a feature due to errors, suggesting that egregious mistakes still require addressing even as users become more accustomed to AI limitations.
Factiva’s approach to generative AI is guided by three core principles:
The company explicitly states that they only create licensing deals they would sign themselves, suggesting alignment between their own publishing interests and what they ask of partners.
Several LLMOps-relevant observations emerge from the case study:
While specific details were not confirmed, the interview hinted at several future directions:
The trajectory suggests Factiva views generative AI as moving from “wish list” to “table stakes” for business intelligence platforms, with continued acceleration of AI capabilities expected.
While the case study presents a compelling model for responsible AI deployment in content aggregation, several considerations merit attention:
Nevertheless, Factiva’s approach represents a significant case study in building production AI systems with explicit attention to content rights, publisher relationships, and transparent compensation—areas where many AI deployments have faced criticism.
Snorkel developed a specialized benchmark dataset for evaluating AI agents in insurance underwriting, leveraging their expert network of Chartered Property and Casualty Underwriters (CPCUs). The benchmark simulates an AI copilot that assists junior underwriters by reasoning over proprietary knowledge, using multiple tools including databases and underwriting guidelines, and engaging in multi-turn conversations. The evaluation revealed significant performance variations across frontier models (single digits to ~80% accuracy), with notable error modes including tool use failures (36% of conversations) and hallucinations from pretrained domain knowledge, particularly from OpenAI models which hallucinated non-existent insurance products 15-45% of the time.
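Error-mode rates like the tool-use and hallucination figures above come from aggregating per-conversation flags across the benchmark. A minimal sketch of that aggregation, with hypothetical flag names and data shape:

```python
# Aggregate per-conversation error flags into benchmark-level rates.
# The flag names ("tool_use", "hallucination") and record shape are
# hypothetical, standing in for Snorkel's actual annotation schema.
from collections import Counter

def error_mode_rates(conversations: list[dict]) -> dict[str, float]:
    """Fraction of conversations exhibiting each error mode."""
    n = len(conversations)
    counts: Counter = Counter()
    for convo in conversations:
        for mode in set(convo["errors"]):  # count each mode once per convo
            counts[mode] += 1
    return {mode: c / n for mode, c in counts.items()}

convos = [
    {"errors": ["tool_use"]},
    {"errors": []},
    {"errors": ["tool_use", "hallucination"]},
    {"errors": []},
]
print(error_mode_rates(convos))  # {'tool_use': 0.5, 'hallucination': 0.25}
```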
Stripe, processing approximately 1.3% of global GDP, has evolved from traditional ML-based fraud detection to deploying transformer-based foundation models for payments that process every transaction in under 100ms. The company built a domain-specific foundation model treating charges as tokens and behavior sequences as context windows, ingesting tens of billions of transactions to power fraud detection, improving card-testing detection from 59% to 97% accuracy for large merchants. Stripe also launched the Agentic Commerce Protocol (ACP) jointly with OpenAI to standardize how agents discover and purchase from merchant catalogs, complemented by internal AI adoption reaching 8,500 employees daily using LLM tools, with 65-70% of engineers using AI coding assistants and achieving significant productivity gains like reducing payment method integrations from 2 months to 2 weeks.
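The "charges as tokens, behavior sequences as context windows" framing can be sketched as a simple tokenization step. The field names and bucketing below are hypothetical, not Stripe's actual feature schema; the idea is that each charge collapses to a discrete token so a card's history becomes a sequence a transformer can consume.

```python
# Sketch of treating charges as tokens and a card's recent behavior as a
# bounded context window. Field names and bucketing are assumptions.
def charge_to_token(charge: dict) -> str:
    """Collapse a charge's salient fields into one discrete token."""
    amount_bucket = "high" if charge["amount"] > 100 else "low"
    return (f"{charge['merchant_category']}|{charge['country']}|"
            f"{amount_bucket}|{charge['outcome']}")

def card_context_window(charges: list[dict], max_len: int = 512) -> list[str]:
    """The most recent charges on a card, as a token sequence."""
    return [charge_to_token(c) for c in charges[-max_len:]]

charges = [
    {"merchant_category": "5411", "country": "US", "amount": 50, "outcome": "success"},
    {"merchant_category": "5812", "country": "GB", "amount": 250, "outcome": "declined"},
]
print(card_context_window(charges))
# ['5411|US|low|success', '5812|GB|high|declined']
```

Patterns like card testing then show up as recognizable token subsequences (bursts of small declined charges), which is what a sequence model can learn to flag.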
Notion AI, serving over 100 million users with multiple AI features including meeting notes, enterprise search, and deep research tools, demonstrates how rigorous evaluation and observability practices are essential for scaling AI product development. The company uses Braintrust as its evaluation platform to manage the complexity of supporting multilingual workspaces, rapid model switching, and maintaining product polish while building at the speed of AI industry innovation. Their approach emphasizes that 90% of AI development time should be spent on evaluation and observability rather than prompting, with specialized data specialists creating targeted datasets and custom LLM-as-a-judge scoring functions to ensure consistent quality across their diverse AI product suite.
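An LLM-as-a-judge scoring function of the kind described above has a simple general shape: a prompt template around one quality criterion, plus a parser for the judge model's verdict. The sketch below uses a pluggable `call_llm` function and illustrative prompt wording; it is not Notion's actual scorer.

```python
# Sketch of an LLM-as-a-judge scorer for a single quality criterion.
# `call_llm` is any text-completion function; prompt wording is illustrative.
from typing import Callable

def make_judge(criterion: str,
               call_llm: Callable[[str], str]) -> Callable[[str, str], bool]:
    """Return a scorer that asks the judge model whether an output
    satisfies `criterion` for a given input."""
    def judge(input_text: str, output_text: str) -> bool:
        prompt = (
            f"Criterion: {criterion}\n"
            f"Input: {input_text}\n"
            f"Output: {output_text}\n"
            "Does the output satisfy the criterion? Answer PASS or FAIL."
        )
        return call_llm(prompt).strip().upper().startswith("PASS")
    return judge

# Usage with a stubbed judge model standing in for a real LLM call:
judge = make_judge("Response is in the same language as the input",
                   lambda prompt: "PASS")
print(judge("¿Qué es Notion?", "Notion es un espacio de trabajo."))  # True
```

In an evaluation platform, many such judges (one per criterion, such as language match or formatting) run over curated datasets, and their pass rates become the regression metrics watched during model switches.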