## Overview
This case study, presented at an NLP Summit by Devananda Kumar from Philips, describes a GenAI-powered application designed to automate competitor intelligence gathering and analysis within a procurement and supply chain context. The presentation focuses specifically on how LLMs can be deployed in production to support business analysts and procurement teams in the medical device industry, though the approach appears applicable across manufacturing sectors.
The core problem being addressed is the labor-intensive nature of competitive intelligence gathering in procurement. Traditional methods require business analysts to manually access multiple data sources including PDFs, Word documents, SharePoint repositories, internal databases, and external websites to compile competitive insights. For example, when comparing a medical device analyzer across 25 features against 8 competitors, analysts would need to manually fill in approximately 200 data fields—a time-consuming task where information gaps are common due to the difficulty of finding relevant data on the internet.
## Use Case Context
The presentation outlines three main categories of competitor landscape analysis where GenAI can provide value:
- **Market Analysis**: Understanding market share, regional performance, and trend analysis. The system connects to internal sales databases and external sources to answer questions like "What is the market share of our analyzer device in North America in Q2 2024 compared to competitors?"
- **Reviews and Ratings**: Analyzing end-user perspectives on product features and performance through sentiment analysis and feature extraction from customer feedback.
- **Price Analysis and Procurement**: The primary focus of this case study—evaluating competitor procurement strategies, supplier networks, and cost structures to optimize procurement decisions.
The goal for procurement teams is to identify potential new suppliers, benchmark procurement costs, understand competitor supplier relationships, and find cost-saving opportunities without compromising quality.
## Technical Architecture
The solution, dubbed the "Smart Business Analyst," employs a multi-agent architecture to maximize information retrieval from limited and niche data sources. The system recognizes that competitive intelligence information is not abundantly available, so it must be thorough in its search approach.
### Multi-Agent Pipeline
The data flow consists of three primary agents:
- **Subquery Generation Agent**: Since competitive intelligence queries often require information from diverse sources, the system splits a single business query into multiple sub-queries to maximize information retrieval from the internet. This approach acknowledges that niche competitive information requires casting a wide net across various data sources.
- **Search Agents**: These agents scrape and store the top-ranked URLs from the web into a vector database. This represents a RAG (Retrieval-Augmented Generation) approach where external web content is indexed and made available for the LLM to reference when generating responses.
- **Report Generation Agent**: This agent synthesizes the collected information and formats it into a display-ready state, providing prescriptive or forward-looking insights in a readable format.
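The three-agent flow above can be sketched as a simple function chain. The presentation does not disclose implementation details, so every agent body below is a stub: in the real system each stub would call an LLM, a web scraper, and a vector store. All function names and the hard-coded "aspects" are illustrative assumptions.

```python
# Minimal sketch of the three-agent pipeline; agent internals are stubbed.

def subquery_agent(query: str) -> list[str]:
    """Split one business query into targeted sub-queries (stubbed).
    A production version would prompt an LLM to decompose the query."""
    aspects = ["pricing", "suppliers", "market share"]
    return [f"{query} ({aspect})" for aspect in aspects]

def search_agent(subquery: str) -> list[dict]:
    """Fetch top-ranked pages for a sub-query (stubbed).
    A production version would scrape URLs and index chunks in a vector DB."""
    return [{"url": f"https://example.com/{abs(hash(subquery)) % 100}",
             "text": f"snippet about: {subquery}"}]

def report_agent(query: str, documents: list[dict]) -> str:
    """Synthesize retrieved content into a display-ready report (stubbed)."""
    sources = ", ".join(d["url"] for d in documents)
    return f"Report for '{query}' based on {len(documents)} sources: {sources}"

def smart_business_analyst(query: str) -> str:
    subqueries = subquery_agent(query)
    documents = [doc for sq in subqueries for doc in search_agent(sq)]
    return report_agent(query, documents)

print(smart_business_analyst("Competitor X analyzer cost structure"))
```

The fan-out/fan-in shape (one query, many sub-queries, one report) is the part the presentation actually describes; everything inside the stubs is a placeholder.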
### Data Source Integration
The system is designed to integrate with multiple data source types:
- Internal databases (sales data, procurement records)
- Subscription-based journals and magazines
- Company annual reports
- News feeds and daily updates
- Government and regulatory data
- Trade associations
- Financial databases
- Academic research publications
- Dedicated websites pre-configured as primary sources
The presentation includes a data source mapping matrix showing which sources are relevant for different analysis types (design benchmarking, cost benchmarking, etc.), allowing the system to prioritize sources based on the query type.
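A source mapping matrix of this kind can be encoded as a simple lookup from analysis type to prioritized source categories. The specific mappings below are hypothetical; the presentation shows the matrix exists but does not disclose its contents.

```python
# Illustrative encoding of the data source mapping matrix: which source
# categories to consult, most relevant first, for each analysis type.
# The actual assignments are assumptions, not from the presentation.

SOURCE_MATRIX = {
    "cost_benchmarking": ["internal databases", "financial databases",
                          "company annual reports"],
    "design_benchmarking": ["academic research publications",
                            "trade associations", "dedicated websites"],
    "market_analysis": ["news feeds", "subscription journals",
                        "government and regulatory data"],
}

def prioritized_sources(analysis_type: str) -> list[str]:
    """Return source categories for a query type, with a generic fallback."""
    return SOURCE_MATRIX.get(analysis_type, ["dedicated websites"])

print(prioritized_sources("cost_benchmarking"))
```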
### Real-Time Information Retrieval
A key architectural decision highlighted is the integration of a Google Search API wrapper to address LLM knowledge cutoff limitations. Since the underlying models are trained only on data up to a fixed cutoff, the system uses the search API to fetch the most current information from the internet, ensuring responses include recent developments, news, and market updates that occurred after the base model's training cutoff.
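The cutoff workaround amounts to retrieval-augmented prompting: fetch current results, then place them in the prompt ahead of the question. The search call below is a stub standing in for the Google Search API wrapper; the prompt wording is an assumption.

```python
# Sketch of augmenting an LLM prompt with fresh web-search results to
# work around the model's training cutoff. The search call is stubbed;
# the real system routes it through a Google Search API wrapper.

def search_stub(query: str, num_results: int = 3) -> list[dict]:
    """Stand-in for the search wrapper: returns title/snippet results."""
    return [{"title": f"Result {i} for {query}",
             "snippet": f"Recent update #{i} relevant to: {query}"}
            for i in range(1, num_results + 1)]

def build_augmented_prompt(question: str) -> str:
    """Prepend current search snippets so the LLM sees post-cutoff facts."""
    results = search_stub(question)
    context = "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
    return (f"Use only the current information below to answer.\n"
            f"{context}\n\nQuestion: {question}")

print(build_augmented_prompt("analyzer market share Q2 2024"))
```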
## Differentiation from Generic ChatGPT
The presenter explicitly addresses how this production system differs from using ChatGPT directly:
- **Precision in Numerical Data**: When evaluating medical device features, generic ChatGPT tends to provide qualitative comparisons (e.g., "your analyzer device is better than your competitor"). The Smart Business Analyst is trained to provide exact numerical values (e.g., "rotation degrees: 35° for your company vs. 33° for competitor").
- **Domain-Specific Training**: The system appears to incorporate domain-specific knowledge about medical devices, procurement terminology, and competitive intelligence frameworks that generic models may lack.
- **Structured Output**: The system can populate structured comparison matrices automatically, which is more suitable for business workflows than free-form text responses.
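The structured-output point can be made concrete: rather than asking for prose, the model is constrained to return JSON that slots directly into a feature-comparison matrix. The extraction call is stubbed below, and the field names and the 35°/33° values (borrowed from the rotation-degrees example above) are illustrative.

```python
# Sketch of structured output: JSON extraction populating a
# features x companies comparison matrix. The LLM call is stubbed.

import json

def llm_extract_stub(feature: str, company: str) -> str:
    """Stand-in for an LLM call constrained to return JSON."""
    return json.dumps({"feature": feature, "company": company,
                       "value": "35°" if company == "Philips" else "33°"})

def build_comparison_matrix(features: list[str],
                            companies: list[str]) -> dict:
    """Fill the matrix with exact extracted values, one cell at a time."""
    matrix = {}
    for feature in features:
        matrix[feature] = {
            c: json.loads(llm_extract_stub(feature, c))["value"]
            for c in companies}
    return matrix

matrix = build_comparison_matrix(["rotation degrees"],
                                 ["Philips", "Competitor A"])
print(matrix)  # → {'rotation degrees': {'Philips': '35°', 'Competitor A': '33°'}}
```

For the 25-feature, 8-competitor scenario from the overview, the same loop would fill all 200 cells.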
## Production Features
Several production-ready features are mentioned:
- **Visualization Charts**: The system can generate visual representations of data, not just textual responses.
- **Conversational Memory**: The application maintains context across a conversation session, allowing users to ask follow-up questions without restating the original context. This mirrors the ChatGPT experience but within a controlled, enterprise environment.
- **Multilingual Support**: End users can pose questions in any language, expanding accessibility for global teams.
- **Pluggable Data Sources**: The architecture allows additional data sources to be connected to the system as needed, providing flexibility for expanding coverage.
- **Source Attribution**: The interface displays the multiple websites from which information was extracted, providing transparency and allowing users to verify sources.
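Of the features above, conversational memory is the most mechanical to illustrate: prior turns are replayed into each new prompt so follow-ups keep their context. The session class below is a minimal sketch under that assumption, with the LLM call stubbed.

```python
# Minimal sketch of conversational memory: the session replays prior
# turns into each new prompt. The LLM call is stubbed; in production
# the history would be session-scoped per user.

class ConversationSession:
    def __init__(self) -> None:
        self.history: list[tuple[str, str]] = []

    def ask(self, question: str) -> str:
        # Build the prompt from all prior turns plus the new question.
        transcript = "\n".join(f"User: {q}\nAssistant: {a}"
                               for q, a in self.history)
        prompt = f"{transcript}\nUser: {question}" if transcript else question
        answer = self._llm_stub(prompt)
        self.history.append((question, answer))
        return answer

    @staticmethod
    def _llm_stub(prompt: str) -> str:
        """Stand-in LLM; a real call would also return source URLs."""
        return f"answer (context length: {len(prompt)} chars)"

session = ConversationSession()
session.ask("What is competitor X's supplier network?")
print(session.ask("And their cost structure?"))  # follow-up reuses context
```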
## Quantitative vs. Qualitative Analysis
The presentation distinguishes between two types of analysis the system supports:
**Quantitative Analysis** relies on procurement data aggregators and third-party data providers. Examples of KPIs include:
- Supplier Performance Risk (comparing organization vs. competitor weighted averages)
- Inventory Coverage (days a manufacturing plant can operate without new inventory)
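The two KPIs above can be computed with standard formulas. The definitions below are common industry conventions assumed here, since the presentation names the KPIs without giving their formulas; all input figures are made up.

```python
# Worked sketches of the two KPIs, using assumed standard definitions.

def supplier_performance_risk(scores: dict[str, float],
                              weights: dict[str, float]) -> float:
    """Weighted-average risk score across suppliers (lower is better)."""
    total_weight = sum(weights.values())
    return sum(scores[s] * weights[s] for s in scores) / total_weight

def inventory_coverage_days(on_hand_units: float,
                            daily_consumption_units: float) -> float:
    """Days a plant can operate on current inventory with no replenishment."""
    return on_hand_units / daily_consumption_units

print(supplier_performance_risk({"A": 0.2, "B": 0.6}, {"A": 3, "B": 1}))
print(inventory_coverage_days(12_000, 400))  # → 30.0
```

Comparing the organization's weighted average against a competitor's is then a matter of running the same function on both sets of inputs.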
The presenter acknowledges data limitations in quantitative analysis—competitor component costs, special discounts, and supplier quality metrics may not be readily available. In such cases, industry standards are used to derive conclusions.
**Qualitative Analysis** is the primary focus of the GenAI application, gathering insights from secondary research sources like news, blogs, company annual reports, and specialized publications. This is where the LLM excels at synthesizing unstructured text into actionable intelligence.
## Process Flow Example
A practical example is provided: A procurement manager queries "What is the cost reduction strategy that my competitor is obtaining?" The system:
- Identifies relevant secondary data sources (journals, news reports)
- Extracts keywords from the query
- Uses the pre-trained model to extract relevant information from indexed sources
- Presents insights in a readable format (prescriptive or forward-looking)
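The four steps can be sketched end to end for the sample query. Each stage is stubbed: keyword extraction here is a naive stop-word filter, whereas the real system presumably delegates it to the LLM, and the source routing rule is an assumption.

```python
# Sketch of the process flow for the sample procurement query.
# All stage implementations are illustrative stubs.

STOP_WORDS = {"what", "is", "the", "that", "my"}

def extract_keywords(query: str) -> list[str]:
    """Naive keyword extraction: lowercase, strip '?', drop stop words."""
    return [w for w in query.lower().rstrip("?").split()
            if w not in STOP_WORDS]

def select_sources(keywords: list[str]) -> list[str]:
    """Stub: route cost/procurement queries to secondary research sources."""
    return ["journals", "news reports"] if "cost" in keywords else ["web"]

def answer(query: str) -> dict:
    keywords = extract_keywords(query)
    return {"keywords": keywords,
            "sources": select_sources(keywords),
            "insight": f"forward-looking insight on: {' '.join(keywords)}"}

print(answer("What is the cost reduction strategy "
             "that my competitor is obtaining?"))
```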
## Critical Assessment
While the presentation demonstrates a thoughtful application of LLMs to a real business problem, several aspects warrant consideration:
- **Accuracy Verification**: The claim that the system provides "exact" numerical values for product features needs validation. Web-scraped data can be outdated or inaccurate, and the presentation doesn't detail how accuracy is verified.
- **Hallucination Mitigation**: No specific mention is made of hallucination detection or mitigation strategies, which is particularly important when providing numerical data that could influence procurement decisions.
- **Scalability and Maintenance**: The multi-agent architecture with web scraping components may face challenges with website changes, rate limiting, and data quality over time.
- **Competitive Data Ethics**: Gathering competitor intelligence through web scraping raises questions about data usage rights and competitive ethics that aren't addressed.
- **Quantified Results**: While the presentation mentions time savings ("filled up within a few seconds"), no specific metrics on accuracy improvements, user adoption, or business impact are provided.
The presentation appears to be more of a concept demonstration than a fully-deployed production case study, though the architecture described is sound and represents a practical approach to applying LLMs in an enterprise procurement context. The multi-agent design, source attribution, and real-time search integration represent solid LLMOps practices for production deployments.