Company
GitHub
Title
AI-Powered Customer Feedback Analysis at Scale
Industry
Tech
Year
2024
Summary (short)
GitHub faced the challenge of manually processing vast amounts of customer feedback from support tickets, with data scientists spending approximately 80% of their time on data collection and organization tasks. To address this, GitHub's Customer Success Engineering team developed an internal AI analytics tool that combines open-source machine learning models (BERTopic with BERT embeddings and HDBSCAN clustering) to identify patterns in feedback with GPT-4 to generate human-readable summaries of customer pain points. This system transformed their feedback analysis from manual classification to automated trend identification, enabling faster surfacing of common issues, improved feature prioritization, data-driven decision making, and the discovery of self-service opportunities for customers.
## Overview

GitHub's Customer Success Engineering team developed an internal AI-powered analytics platform to systematically analyze and interpret customer feedback from support tickets at scale. The challenge they faced is common across many organizations: overwhelming volumes of textual feedback that are impossible to process effectively by hand. Research cited in the case study indicates that data scientists typically spend about 80% of their time on data collection and organization tasks, including manual classification, which creates bottlenecks and delays insight discovery.

This inefficiency motivated GitHub to build a custom solution that adheres to their strict security and privacy requirements while incorporating tailored business metrics specific to their product areas. The team's mission was clear: honor the insights from their vast user base and let developer voices guide feature prioritization decisions. Program manager Mariana Borges and staff software engineer Steven Solomon assembled a team to create an AI analytics tool that could present relevant and actionable trends with business context tailored to GitHub's various product areas.

## Technical Architecture and Model Selection

GitHub's approach demonstrates thoughtful consideration of the LLMOps stack, particularly around model selection and the balance between open-source and commercial solutions. The architecture consists of two primary AI components working in tandem.

**BERTopic for Topic Modeling and Clustering**: The foundation of the system is BERTopic, an open-source topic modeling framework hosted on GitHub's own platform. BERTopic leverages BERT (Bidirectional Encoder Representations from Transformers) embeddings to create dynamic and interpretable topics. BERT is particularly valuable because it resolves ambiguous language by using surrounding text to establish context. BERTopic combines BERT's document embedding capabilities with HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) to group similar documents together. Topics are then derived by extracting and aggregating the most representative words from each cluster (see the sketch at the end of this section).

A critical capability that influenced model selection was BERT's multilingual proficiency. Because BERT is trained on diverse datasets that include text in many languages, it can effectively analyze feedback from GitHub's global user base regardless of the language used. This ensures comprehensive coverage of their international community without requiring separate models or processing pipelines for different languages.

**GPT-4 for Summarization**: While BERTopic successfully identified clusters and generated representative words, the team recognized that a collection of keywords differs significantly from actionable insights that humans can readily understand and act upon. This led to the second phase of the architecture: using GPT-4 to transform topic clusters into human-readable summaries that clearly articulate customer pain points.

An important distinction highlighted in the case study is that GitHub does not train any models on customer feedback from support tickets. Instead, they apply pre-trained models to analyze the feedback text and generate insights. From an LLMOps perspective, this approach has several advantages: it reduces infrastructure requirements, avoids the need to maintain training pipelines, respects customer data privacy, and allows them to leverage continuously improving foundation models.
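The case study does not publish GitHub's code, but the pipeline it describes maps closely onto BERTopic's public API. Below is a minimal sketch assuming the `bertopic`, `sentence-transformers`, and `hdbscan` packages; the embedding model name, clustering hyperparameters, and the `feedback` list are illustrative assumptions, not GitHub's actual configuration.

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer
from hdbscan import HDBSCAN

# Raw feedback text pulled from support tickets (thousands in practice;
# a multilingual embedding model handles mixed-language input directly).
feedback = [
    "Actions runner keeps timing out on large builds",
    "Los permisos del repositorio son confusos",
    "Cannot filter notifications by organization",
    # ...
]

# Stand-in for the BERT embeddings described in the case study.
embedding_model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# HDBSCAN groups similar ticket embeddings into dense clusters;
# these hyperparameters are assumptions, not GitHub's settings.
hdbscan_model = HDBSCAN(min_cluster_size=15, metric="euclidean",
                        cluster_selection_method="eom", prediction_data=True)

topic_model = BERTopic(embedding_model=embedding_model,
                       hdbscan_model=hdbscan_model)

# Assigns each ticket to a topic and extracts the most representative
# words per cluster.
topics, probs = topic_model.fit_transform(feedback)
print(topic_model.get_topic_info())  # one row per topic with top keywords
```

The keyword lists and representative documents that BERTopic produces per cluster are exactly the raw material the next stage hands to GPT-4.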
## Prompt Engineering and Model Optimization

GitHub's work with GPT-4 demonstrates practical prompt engineering and model optimization techniques in production. Rather than fine-tuning or retraining GPT-4, they optimized it through three key approaches:

**Prompt Optimization**: The team crafted and refined prompts to guide the model in generating relevant summaries of topic clusters. This iterative process of prompt development is crucial for achieving reliable, high-quality outputs in production environments.

**Parameter Tuning**: They adjusted various GPT-4 parameters to control the model's output characteristics, including temperature (controlling randomness), max tokens (limiting response length), top-p (nucleus sampling), and frequency and presence penalties (controlling repetition). These parameter adjustments allow fine-grained control over output quality without requiring model retraining.

**Iterative Feedback and Testing**: The team implemented continuous improvement through human feedback and A/B testing. This feedback loop is essential for production LLM systems, allowing them to refine prompts and parameters based on real-world performance and user needs.

This optimization approach represents a pragmatic LLMOps strategy: leveraging powerful foundation models while customizing their behavior through prompt engineering and parameter tuning rather than undertaking costly and complex fine-tuning or retraining efforts.
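The case study does not publish GitHub's prompts or parameter values. The following minimal sketch, using the OpenAI Python SDK, shows what a topic-summarization call with the tuned parameters named above might look like; the prompt wording, parameter values, and the `summarize_topic` helper are illustrative assumptions rather than GitHub's implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def summarize_topic(keywords: list[str], example_docs: list[str]) -> str:
    """Turn a BERTopic keyword cluster into a human-readable pain point."""
    prompt = (
        "You are analyzing customer support feedback for a developer platform.\n"
        f"Topic keywords: {', '.join(keywords)}\n"
        "Representative feedback excerpts:\n"
        + "\n".join(f"- {doc}" for doc in example_docs)
        + "\n\nSummarize the underlying customer pain point in 2-3 sentences."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,        # low randomness for consistent summaries
        max_tokens=150,         # keep summaries short and scannable
        top_p=0.9,              # nucleus sampling
        frequency_penalty=0.5,  # discourage repetitive phrasing
        presence_penalty=0.0,
    )
    return response.choices[0].message.content
```

Because all of these knobs live in the request rather than in model weights, prompt and parameter variants can be A/B tested and refined continuously without any retraining.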
## Data Visualization and User Interface Development

A significant portion of GitHub's LLMOps journey involved not just generating insights but effectively communicating them to internal stakeholders. The team recognized that generating useful AI insights and telling compelling stories with those insights are distinct challenges requiring different expertise.

Initially, they experimented with Azure Data Explorer (ADX) dashboards to rapidly prototype various visualizations. This "ship to learn" approach, deeply rooted in GitHub's culture, emphasizes learning from rapid iteration and viewing failures as stepping stones to success. By sharing multiple dashboard variants across the company, they collected feedback to identify which visualizations provided genuine value and which fell short.

Through this process, they learned that generic data visualizations were insufficient. They needed a tailored tool incorporating business-specific context that could tell stories like "Here are the top 10 customer pain points in support tickets for X product/feature." This realization led them to develop a custom internal web application with advanced filtering capabilities to navigate feedback insights effectively and connect them with insights generated by their other internal systems for better action prioritization.

The development of this interface demonstrates an important LLMOps principle: the value of AI-generated insights depends heavily on how they are presented to end users. The team focused on directing attention through careful use of position, color, and size to highlight the most important information. They followed a minimum viable product (MVP) approach, launching early and iterating based on feedback from product teams.

## Production Impact and Outcomes

The integration of AI into GitHub's feedback analysis process has delivered several concrete outcomes:

**Scalability Through Automation**: The transition from manual classification to automated trend identification significantly enhanced their ability to scale data analysis efforts. This automation saves substantial time while increasing precision in understanding and responding to developer feedback.

**Faster Problem Identification**: Clustering feedback enables quicker identification of recurring problems, allowing teams to address issues more efficiently and minimize user disruption. The speed improvement is particularly valuable for maintaining platform reliability and user productivity.

**Improved Feature Prioritization**: Understanding what the developer community needs most allows GitHub to focus development efforts on features that provide the greatest benefit. This data-driven prioritization helps optimize resource allocation across product teams.

**Enhanced Decision Making**: The clear, summarized insights enable internal teams to make more informed decisions aligned with actual user needs rather than assumptions or incomplete data samples.

**Self-Service Enablement**: The insights revealed opportunities to create self-help resources that empower customers to resolve issues independently. This approach both expedites problem resolution and builds user capability to handle future issues without direct support intervention.

## LLMOps Considerations and Lessons

Several important LLMOps themes emerge from GitHub's experience:

**Security and Privacy**: GitHub explicitly sought solutions adhering to strict security and privacy regulations, which led them to build an internal tool rather than use off-the-shelf commercial analytics platforms. This decision reflects the reality that organizations handling sensitive customer data must carefully consider where and how that data is processed when using AI systems.

**Model Selection Strategy**: Their choice to combine open-source models (BERTopic/BERT) with commercial models (GPT-4) demonstrates a pragmatic approach to the build-versus-buy decision. They leveraged open source for specialized tasks (topic modeling and clustering) while using a powerful commercial LLM for natural language generation, where it excels.

**No Training on Customer Data**: By explicitly not training models on customer support data, GitHub avoided complex data governance challenges, reduced infrastructure requirements, and maintained the flexibility to upgrade to newer model versions without retraining overhead.

**Iterative Development and User Feedback**: The "ship to learn" philosophy meant launching early versions to collect feedback and iteratively improving based on actual usage. This agile approach to AI product development helps ensure the system meets real user needs rather than hypothetical requirements.

**Context and Business Metrics**: The team learned that raw AI outputs, while technically accurate, required significant contextualization with business-specific metrics and knowledge to be actionable. This integration of domain knowledge with AI capabilities is often the difference between an experimental project and a production system driving business value.

**Multilingual Capability**: BERT's inherent multilingual abilities eliminated the need for separate processing pipelines for different languages, simplifying the architecture while ensuring comprehensive coverage of a global user base.

The case study represents a mature approach to LLMOps in which the focus extends beyond model selection and deployment to encompass the entire value chain: data ingestion, model application, insight generation, visualization, user interface design, and continuous improvement based on stakeholder feedback.
GitHub's emphasis on making insights actionable and ensuring they drive actual decision-making demonstrates understanding that successful LLMOps requires not just technical excellence but also effective change management and stakeholder engagement within the organization.
