## Overview
This case study, presented at Google Cloud Next 2024, showcases how Cox 2M, the commercial IoT division of Cox Communications, partnered with ThoughtSpot to deploy a production-ready, enterprise-grade generative AI analytics solution. The presentation featured speakers from Google Cloud (Jin George, Solutions Architect), Cox 2M (Josh Horton, Data Strategy and Analytics lead), and ThoughtSpot (Angeli Kumari, VP of Product), providing perspectives from both the technology provider and customer sides.
Cox 2M operates in three primary segments: Automotive lot management (Lot Vision solution), small business fleet management, and industrial/supply chain IoT. The scale of their operations is significant, with 11.4 billion assets tracked over their lifespan, over 29,000 miles of asset movements measured every hour, and more than 200 IoT sensor messages processed each hour. This data intensity made their analytics challenges particularly acute.
## The Business Problem
The core challenge Cox 2M faced was a classic analytics bottleneck that many organizations encounter. As Josh Horton explained, they operated with a very lean analytics team that was spending most of its time serving internal ad hoc requests rather than focusing on high-value, customer-facing outcomes. The numbers were stark: it took upwards of a week to produce a single ad hoc request—essentially one answer to one question from the business.
This created two critical problems. First, the cost-to-serve model was unsustainable and not scalable, especially for a small team. Second, and perhaps more importantly, insights delivered a week later were often no longer relevant to the business decisions they were meant to inform. The analytics team recognized they needed to fundamentally change their approach rather than simply scaling up resources.
## The LLM-Powered Solution Architecture
The solution involved multiple integration layers leveraging Google Cloud's ecosystem and ThoughtSpot's AI-driven analytics platform, with Gemini serving as the LLM backbone for natural language capabilities.
### Data Infrastructure Integration
ThoughtSpot integrates deeply with Google's data infrastructure stack. The platform can be hosted on Google Cloud and purchased directly through Google Cloud Marketplace. At the data layer, it connects to Google BigQuery, Cloud SQL, and AlloyDB for live queries, and to Looker Modeler for semantic layer definitions. This allows business users to ask natural language questions that execute as live queries against these data platforms, enabling real-time insights rather than pre-aggregated reports.
The integration with Looker Modeler is particularly notable because it allows organizations to define their semantic layer in one place and leverage it across ThoughtSpot capabilities, maintaining consistency in business definitions and metrics.
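The presentation did not include code, but the "live query" pattern is simple to picture: a user's question ultimately resolves to SQL that runs directly against the warehouse rather than against a cached extract. A minimal sketch against BigQuery follows; the dataset, table, and column names (`telecom.roaming_calls`, `state`, `call_minutes`) are illustrative and not from the talk.

```python
# Minimal sketch of a "live query" against BigQuery rather than a pre-aggregated extract.
# Dataset and column names (telecom.roaming_calls, state, call_minutes) are illustrative.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

sql = """
    SELECT state, SUM(call_minutes) AS roaming_minutes
    FROM `my-project.telecom.roaming_calls`
    WHERE call_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY state
    ORDER BY roaming_minutes DESC
    LIMIT 10
"""

# Executes against live data; results reflect whatever is in the warehouse right now.
for row in client.query(sql).result():
    print(f"{row.state}: {row.roaming_minutes:,.0f} minutes")
```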
### Gemini LLM Integration for Production Analytics
The Gemini integration powers four core capabilities in production:
**Natural Language Narrations**: Gemini translates data insights into business-friendly natural language. Rather than forcing business users to interpret charts and numbers, the system converts analytical findings into prose that matches how business users actually communicate. This bridges the gap between data-speak and business-speak.
**Change Analysis**: This is perhaps the most powerful production feature demonstrated. Business users are typically accountable for a set of KPIs. ThoughtSpot's KPI charts provide automatic comparisons with prior periods (such as the same time last year) and identify the top contributors to changes, with Gemini powering the natural language explanations of those changes. In the demo, a 100%+ increase in roaming call minutes was automatically analyzed, with Gemini explaining which states (California, Texas) and which plans contributed most to the change. The system also generates additional insights about how other attributes contributed to the change.
**Liveboard Summarization**: Rather than requiring users to analyze every visualization on a dashboard individually, Gemini can summarize an entire dashboard's key insights, highlighting both expected and unexpected changes. This dramatically reduces the cognitive load on business users who need to quickly understand what's happening in their data.
**Sage Natural Language Querying**: The Sage feature allows users to ask data questions in natural language. The demo showed queries like "which states have the most roaming calls" and more complex questions like "top three customers contributing most in California and Texas." The system converts natural language to SQL, executes against the live data, and returns visualizations.
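ThoughtSpot has not published how the change-analysis capability above is implemented internally, but the described pattern can be approximated in two steps: compute the period-over-period delta and per-dimension contributors deterministically, then hand the structured result to Gemini to narrate. The sketch below is a rough illustration under those assumptions; the column names (`state`, `plan`, `roaming_minutes`), the local parquet file standing in for a warehouse query, and the model id are all illustrative.

```python
# Rough sketch of the change-analysis pattern: compute the KPI delta and top
# contributors deterministically, then let Gemini narrate the structured result.
# Data shapes, column names, and the model id are illustrative assumptions.
import pandas as pd
import google.generativeai as genai

def top_contributors(df: pd.DataFrame, dim: str, metric: str, n: int = 3) -> pd.DataFrame:
    """Per-dimension change between the two periods, largest increase first."""
    pivot = df.pivot_table(index=dim, columns="period", values=metric, aggfunc="sum").fillna(0)
    pivot["change"] = pivot["current"] - pivot["prior"]
    return pivot.sort_values("change", ascending=False).head(n)

# df has columns: period ("prior" | "current"), state, plan, roaming_minutes
df = pd.read_parquet("roaming_minutes_by_period.parquet")  # placeholder data source

total_prior = df.loc[df.period == "prior", "roaming_minutes"].sum()
total_current = df.loc[df.period == "current", "roaming_minutes"].sum()
by_state = top_contributors(df, "state", "roaming_minutes")
by_plan = top_contributors(df, "plan", "roaming_minutes")

prompt = f"""You are writing a short business narration of a KPI change.
Roaming call minutes went from {total_prior:,.0f} to {total_current:,.0f} versus the prior period.
Top contributing states:
{by_state.to_string()}
Top contributing plans:
{by_plan.to_string()}
Explain the change in three or four plain-English sentences for a non-technical reader."""

genai.configure(api_key="YOUR_API_KEY")  # or route through Vertex AI in production
model = genai.GenerativeModel("gemini-1.5-pro")
print(model.generate_content(prompt).text)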
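The Sage flow can be sketched at a similarly high level: the user's question plus the table schema go to Gemini, and the generated SQL runs live against BigQuery. This is only an outline under assumed names; it omits the grounding, ranking, and guardrail logic a production system like Sage would need.

```python
# Illustrative sketch of the natural-language-to-SQL flow: question plus schema go to
# Gemini, the generated SQL runs live against BigQuery. Table, column, and model names
# are assumptions; ThoughtSpot's actual Sage pipeline is more involved.
import google.generativeai as genai
from google.cloud import bigquery

SCHEMA = """Table `my-project.telecom.roaming_calls`
Columns: state STRING, plan STRING, customer STRING, call_minutes FLOAT64, call_date DATE"""

def nl_to_sql(question: str) -> str:
    prompt = (
        "Translate the question into a single BigQuery Standard SQL query.\n"
        f"{SCHEMA}\n"
        f"Question: {question}\n"
        "Return only the SQL, no explanation."
    )
    model = genai.GenerativeModel("gemini-1.5-pro")
    sql = model.generate_content(prompt).text.strip().strip("`")
    return sql.removeprefix("sql\n")  # drop a leading ```sql fence marker if present

genai.configure(api_key="YOUR_API_KEY")
client = bigquery.Client()

sql = nl_to_sql("which states have the most roaming calls")
print(sql)  # show the generated query so a user can sanity-check it
for row in client.query(sql).result():
    print(dict(row.items()))
```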
### Production Feedback Loops and Learning
A critical aspect of the LLMOps implementation is the feedback mechanism built into Sage. When users ask questions that the model doesn't perfectly understand (like "most popular plans" when that exact term isn't in the data), the LLM intelligently interprets the intent (converting it to "subscribers"). Users can provide feedback when results aren't quite right—for example, if they wanted top 3 results instead of top 10.
The feedback process works by breaking questions into smaller fragments and mapping those fragments to actual data tokens (column names). These mappings are stored in ThoughtSpot's system (notably, they are not sent back to the LLM) and are applied to subsequent queries. Importantly, the learned fragments apply to a whole "class of problems": once "most popular" has been clarified, that understanding carries over to future questions that use similar phrasing. This creates a continuous improvement loop without requiring model retraining.
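The talk described this mechanism conceptually rather than in code. A rough approximation is sketched below, with a local JSON file standing in for ThoughtSpot's mapping store: a user-confirmed fragment-to-token mapping is persisted and applied to future questions before they ever reach the LLM. Everything here (file name, mapping format, example phrases) is an assumption for illustration.

```python
# Hedged sketch of the described feedback loop: user-confirmed mappings from question
# fragments to data tokens are kept locally (not sent back to the LLM) and applied to
# future questions that use the same phrasing.
import json
from pathlib import Path

MAPPING_FILE = Path("learned_fragments.json")  # illustrative local store

def load_mappings() -> dict[str, str]:
    return json.loads(MAPPING_FILE.read_text()) if MAPPING_FILE.exists() else {}

def record_feedback(fragment: str, token: str) -> None:
    """Persist a user correction, e.g. 'most popular plans' -> 'plans ranked by subscribers'."""
    mappings = load_mappings()
    mappings[fragment.lower()] = token
    MAPPING_FILE.write_text(json.dumps(mappings, indent=2))

def apply_mappings(question: str) -> str:
    """Rewrite known fragments into data tokens before the question reaches the LLM."""
    rewritten = question.lower()
    for fragment, token in load_mappings().items():
        rewritten = rewritten.replace(fragment, token)
    return rewritten

record_feedback("most popular plans", "plans ranked by subscribers")
print(apply_mappings("what are the most popular plans in Texas"))
# -> "what are the plans ranked by subscribers in texas"
```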
### Validation and Trust Mechanisms
The solution includes mechanisms for users to validate that the LLM has correctly interpreted their questions. The interface displays "tokens" (column names) that show exactly which data fields the system used to answer the query. This transparency allows business users to verify that the natural language to SQL translation was accurate, building trust in the AI-generated insights.
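One way to surface such tokens (not necessarily ThoughtSpot's) is to parse the generated SQL and list the column references it touches. The snippet below uses the sqlglot library purely for illustration; the query text is a made-up example.

```python
# Illustrative way to surface the "tokens" (column references) behind a generated query
# so a user can confirm the NL-to-SQL translation touched the fields they expected.
# Uses sqlglot for parsing; ThoughtSpot's actual mechanism is not public.
from sqlglot import parse_one, exp

sql = """
SELECT state, SUM(call_minutes) AS roaming_minutes
FROM roaming_calls
WHERE call_date >= '2024-01-01'
GROUP BY state
"""

columns = sorted({col.name for col in parse_one(sql).find_all(exp.Column)})
print("Columns used to answer the question:", columns)
# -> ['call_date', 'call_minutes', 'state']
```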
## Results and Impact
The quantified results are impressive: Cox 2M reduced time-to-insight by 88%, going from over a week to under an hour for ad hoc requests. This wasn't just about speed—it fundamentally changed what the analytics team could focus on. Instead of being consumed by internal request fulfillment, they could now concentrate on high-value, customer-facing analytics outcomes.
The solution also transformed the concept of "self-serve analytics." As Josh Horton noted, before generative AI, self-serve was really "self-serve for BI and analytics teams," not for actual end business users. Now, non-technical business users can interact with data in natural language and receive accurate answers, with guardrails that guide them toward the right questions when they ask "bad questions."
## Future Production Use Cases
Cox 2M shared two compelling future use cases that illustrate the expanding scope of their LLM-powered analytics:
**Small Business Fleet Management**: Enabling construction business owners to log into their app and ask questions like "How many of my fleet drivers used vehicles outside of business hours last week?" This bypasses traditional feature development cycles where such capabilities would need to be planned, prioritized, and built as specific application features.
**Supply Chain IoT**: Allowing project managers to ask about assets in transit on overseas container ships, such as "Where are all my assets and are any at risk of loss?" This addresses a critical pain point in supply chain management where loss in transit is a significant concern.
## Critical Assessment
While the presentation makes strong claims about production readiness, several aspects warrant careful consideration:
The case study comes from a vendor presentation at Google Cloud Next, so the perspective naturally emphasizes success. The 88% reduction in time-to-insight is a compelling metric, but the baseline (one week for a single ad hoc request) suggests the prior state may have been particularly problematic rather than typical.
The feedback loop mechanism for improving LLM accuracy appears well-designed, storing corrections locally rather than sending them to the LLM, which addresses both privacy and operational concerns. However, the long-term maintenance of these learned fragments and potential for accumulated errors wasn't discussed.
The integration with production data systems (BigQuery, AlloyDB) for live querying is a genuine production consideration, though questions about query performance, cost management, and handling of complex analytical patterns weren't addressed in detail.
The solution's handling of "bad questions" was mentioned as a feature, but the specific guardrails, error handling, and edge case management weren't fully explored. In production LLM systems, these edge cases often create the most significant operational challenges.
Overall, this represents a solid example of deploying LLMs for enterprise analytics in a production environment, with thoughtful consideration of user feedback loops, validation mechanisms, and integration with existing data infrastructure. The emphasis on making analytics "invisible" while outcomes remain "visible" reflects a mature understanding of how AI should serve business users rather than requiring them to become AI experts.