## Overview
Linear, a project management and issue tracking platform, developed an AI-powered Similar Issues matching feature to address duplicate issue management across large team workflows. The case study, published in November 2023, describes how they leveraged large language models and vector embeddings to identify semantically similar issues in production. The company formed an internal "AI skunkworks" team during summer 2023 to experiment with machine learning technologies, focusing specifically on automation that reduces repetitive work rather than "flashy AI features."
The core business problem was straightforward but persistent: in organizations with large backlogs and multiple team members, duplicate issues inevitably accumulate. Team members often have an intuition that an issue might already exist but lack effective tools to verify this suspicion. The consequences range from mild inefficiency (slightly larger backlogs) to significant waste (multiple engineers working on the same bug unknowingly). Linear sought to intercept this problem at multiple stages of the workflow rather than just addressing it after issues were created.
## Technical Architecture and Tooling Decisions
The technical foundation of Linear's solution rests on vector embeddings, which encode semantic meaning as vectors of floating point numbers. This approach allows the system to understand that terms like "bug," "problem," and "broken" are conceptually similar within the context of issue tracking, even though they're distinct keywords. The system uses cosine similarity queries to compute relatedness, where scores closer to 1 indicate high similarity and scores closer to -1 indicate opposing concepts.
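To make the mechanics concrete, a minimal sketch of cosine similarity over two embedding vectors follows; the four-dimensional vectors are purely illustrative, since real embeddings typically have hundreds or thousands of dimensions.

```typescript
// Cosine similarity between two embedding vectors: values near 1 indicate
// high similarity, values near -1 indicate opposing directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical four-dimensional embeddings for two closely related issue titles.
const issueA = [0.12, -0.48, 0.33, 0.81];
const issueB = [0.1, -0.51, 0.29, 0.84];
console.log(cosineSimilarity(issueA, issueB)); // close to 1 for near-identical vectors
```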
Linear's technology selection process involved evaluating multiple approaches and platforms, which reflects a pragmatic LLMOps methodology. For generating embeddings, they opted to use LLM APIs, taking advantage of the resurgence in embedding quality and accessibility that accompanied recent advances in large language models. The case study notes that while the mathematical concepts behind vector embeddings aren't particularly new—Google has been innovating in this space for years—the new generation of LLMs can generate more semantically accurate embeddings at very low cost through simple API calls.
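Linear does not name the provider or model it uses. As a hedged illustration of the "simple API call" pattern, the sketch below uses OpenAI's embeddings endpoint with an arbitrary model choice; concatenating the title and description is an assumption about the input format, not a detail from the case study.

```typescript
// Illustrative only: the provider, model, and input format are assumptions,
// not details disclosed in the case study.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function embedIssue(title: string, description: string): Promise<number[]> {
  const response = await client.embeddings.create({
    model: "text-embedding-3-small", // hypothetical model choice
    input: `${title}\n\n${description}`,
  });
  return response.data[0].embedding; // a vector of floating point numbers
}
```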
The vector storage decision proved more complex and illustrates important tradeoffs in production LLMOps deployments. During the initial proof of concept, Linear stored vector embeddings as blobs directly in their primary data store. This pragmatic choice enabled rapid iteration while the team evaluated long-term options. The approach had reasonable performance, though the team learned to ensure that the large blob columns weren't unnecessarily selected in queries, as vectors can be substantially larger than typical data types.
After experimenting with several vector-specific databases, Linear encountered various operational challenges including downtime during scaling, high costs, and increased operational complexity. The team ultimately selected PostgreSQL with the pgvector extension hosted on Google Cloud Platform. This choice reflects a key LLMOps principle: favoring known, maintainable technologies over specialized solutions when feasible. PostgreSQL provided reasonable response times for cosine similarity queries while remaining "a known quantity that our small engineering team can maintain with confidence." The timing was fortuitous—GCP launched support for the pgvector extension just as Linear needed it.
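A similarity lookup against pgvector might look like the sketch below; the table name, columns, and use of the node-postgres client are assumptions rather than details from the case study. Note that only lightweight columns are returned, reflecting the earlier lesson about not selecting large vector values unnecessarily.

```typescript
// Sketch of a pgvector cosine similarity query against a hypothetical
// issue_embeddings table. The <=> operator returns cosine distance,
// so similarity = 1 - distance.
import { Client } from "pg";

async function findSimilarIssues(
  db: Client,
  workspaceId: string,
  embedding: number[],
  limit = 5
): Promise<{ issue_id: string; status: string; similarity: number }[]> {
  const { rows } = await db.query(
    `SELECT issue_id, status, 1 - (embedding <=> $1::vector) AS similarity
       FROM issue_embeddings
      WHERE workspace_id = $2
      ORDER BY embedding <=> $1::vector
      LIMIT $3`,
    [`[${embedding.join(",")}]`, workspaceId, limit]
  );
  return rows;
}
```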
## Data Pipeline and Backfilling Strategy
Preparing the system for production required generating embeddings for all existing issues across all workspaces and populating the database with both vector embeddings and relevant metadata (status, workspace, and team identifiers). Linear leveraged an existing internal framework for data backfills, using task runners to process jobs in parallel on a Kubernetes cluster. The backfill process involved iterating through issues, concatenating their content, and converting the resulting text to vectors through API requests.
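The backfill code itself is not published; the following is a minimal sketch of the described pattern (page through issues, concatenate content, embed via API, persist vectors with metadata), with all helper names and types being hypothetical stand-ins for Linear's internal framework.

```typescript
// Hypothetical shapes and helpers; Linear's actual backfill framework is not public.
interface IssueRecord { id: string; teamId: string; status: string; title: string; description?: string }
interface IssuePage { issues: IssueRecord[]; nextCursor?: string }
declare function fetchIssuePage(workspaceId: string, cursor?: string): Promise<IssuePage>;
declare function embedBatch(texts: string[]): Promise<number[][]>;
declare function insertEmbeddings(rows: object[]): Promise<void>;

// One backfill task: many such tasks would run in parallel, e.g. as jobs on a
// Kubernetes cluster, each responsible for a slice of workspaces.
async function backfillWorkspace(workspaceId: string): Promise<void> {
  let cursor: string | undefined;
  do {
    const page = await fetchIssuePage(workspaceId, cursor);

    // Concatenate each issue's textual content and convert it to vectors
    // through batched API requests.
    const texts = page.issues.map((i) => `${i.title}\n\n${i.description ?? ""}`);
    const vectors = await embedBatch(texts);

    // Store each embedding alongside the metadata needed for later filtering.
    await insertEmbeddings(
      page.issues.map((issue, idx) => ({
        issueId: issue.id,
        workspaceId,
        teamId: issue.teamId,
        status: issue.status,
        embedding: vectors[idx],
      }))
    );

    cursor = page.nextCursor;
  } while (cursor);
}
```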
The parallel processing architecture on Kubernetes demonstrates mature infrastructure practices for handling large-scale batch operations in LLMOps contexts. The team characterized this portion as "a breeze," suggesting that having established patterns for distributed data processing significantly reduced the implementation burden.
## Performance Optimization and Indexing
Performance optimization represented a significant technical challenge, particularly given Linear's scale. Without proper indexing, cosine similarity queries across the vector database were "very slow," with initial testing showing queries regularly timing out or taking multiple seconds to complete. This performance issue is particularly critical for a feature meant to provide real-time feedback during issue creation.
The scale of Linear's existing deployment—tens of millions of issues from the start—made index generation complex. Initial attempts at creating indexes failed even when providing database servers with hundreds of gigabytes of memory. The successful approach involved partitioning the embeddings table by workspace ID across several hundred partitions and creating indexes on each partition separately. This partitioning strategy effectively reduced the search space for individual queries while making index creation more manageable.
For index parameters, the team largely followed pgvector's recommendations regarding settings like list size, testing different values to maintain sufficient search accuracy. This balance between performance and accuracy is a critical consideration in production vector search systems—overly aggressive optimization can degrade the quality of similarity matches.
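A hedged sketch of the partition-then-index approach follows; the hash partitioning scheme, partition count, vector dimension, and lists value are illustrative assumptions rather than Linear's actual settings.

```typescript
// Illustration of partitioning an embeddings table by workspace and indexing
// each partition separately with pgvector's ivfflat index. All specific values
// (256 partitions, 1536 dimensions, lists = 100) are assumptions.
import { Client } from "pg";

async function createPartitionedEmbeddings(db: Client, partitions = 256): Promise<void> {
  await db.query(`CREATE EXTENSION IF NOT EXISTS vector`);
  await db.query(`
    CREATE TABLE IF NOT EXISTS issue_embeddings (
      issue_id     uuid NOT NULL,
      workspace_id uuid NOT NULL,
      team_id      uuid NOT NULL,
      status       text NOT NULL,
      embedding    vector(1536) NOT NULL,
      PRIMARY KEY (workspace_id, issue_id)
    ) PARTITION BY HASH (workspace_id)`);

  for (let i = 0; i < partitions; i++) {
    await db.query(`
      CREATE TABLE IF NOT EXISTS issue_embeddings_p${i}
        PARTITION OF issue_embeddings
        FOR VALUES WITH (MODULUS ${partitions}, REMAINDER ${i})`);

    // Building the index per partition keeps memory requirements manageable;
    // the lists parameter trades search accuracy against query speed.
    await db.query(`
      CREATE INDEX IF NOT EXISTS issue_embeddings_p${i}_cosine_idx
        ON issue_embeddings_p${i}
        USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)`);
  }
}
```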
## Integration Points and User Experience
Linear's implementation strategy demonstrates thoughtful consideration of where AI-assisted detection provides the most value. Rather than building a single-purpose feature, they integrated similar issue detection at three key touchpoints:
The first integration point occurs during issue creation itself, where the system suggests potentially related or duplicate issues in real-time. This represents the most proactive intervention, helping prevent duplicates before they enter the system.
The second integration appears in Linear's Triage functionality, which handles issues from external sources. When working through the Triage inbox, users see existing similar issues prominently displayed with easy options to mark them as duplicates. This helps manage issues that have already been created but haven't yet been fully processed.
The third and perhaps most innovative integration pushes detection even earlier in the workflow by surfacing similar issues directly within support integrations. When customers email about bugs or problems through tools like Intercom, support teams can immediately see related existing issues and their status without switching between tools. This integration bridges the gap between customer support and engineering workflows, potentially improving response times and consistency.
## Production Results and Team Impact
The case study provides limited quantitative metrics but offers qualitative evidence of impact. Linear's customer experience team has found value in using the feature to consolidate support issues in Intercom with reduced manual aggregation effort. Community feedback indicated that teams are successfully using the feature to manage backlogs across both engineering and support contexts.
The social media excerpts included in the case study show positive reception, with users noting they've "made use of it already a few times" and that it's "a great addition to keep the backlog sorted." While these anecdotes don't constitute rigorous evaluation, they suggest the feature achieved product-market fit for at least some user segments.
## Critical Assessment and Limitations
From an LLMOps perspective, several aspects of this case study warrant balanced consideration. The text originates from Linear's own blog and naturally emphasizes positive aspects while minimizing challenges or failures. The team mentions "much consternation" and moving "development embeddings between providers on more than one occasion," suggesting the path to production involved more difficulty than the relatively smooth narrative implies.
The case study lacks detailed discussion of several important LLMOps considerations. There's no mention of embedding model versioning or strategies for handling model updates. If Linear needs to switch embedding providers or models in the future, they would presumably need to regenerate all embeddings—a potentially complex migration given their scale. The text doesn't address how they handle this operational concern.
Similarly, the case study provides limited information about accuracy evaluation or quality monitoring. How does Linear measure whether the similar issues being surfaced are genuinely useful? What false positive or false negative rates do they observe? How do they monitor for degradation over time? These questions remain unanswered, though they may address them through internal processes not described in this public-facing content.
The cost implications of the solution also receive minimal attention. While the text mentions that LLM-generated embeddings are available "at a very low cost," there's no discussion of ongoing operational expenses for API calls, vector storage, or computational resources for similarity queries at scale. For organizations considering similar implementations, understanding these cost tradeoffs would be valuable.
## Future Directions and Improvements
Linear outlines several planned improvements that provide insight into their iterative approach to LLMOps. They plan to expand the feature to more integrations, extending the touchpoints where similar issue detection provides value. They're considering incorporating additional properties like labels into the embeddings, which would require rethinking their embedding generation process to combine structured metadata with textual content.
The team has identified a specific issue with templates: issues created from templates may unduly influence similarity scores, presumably because template text creates spurious matches between otherwise unrelated issues. Addressing this would require filtering or weighting strategies in either the embedding generation or similarity scoring phases.
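Linear does not say how it intends to solve this; one possible mitigation, sketched below purely as an illustration, is to strip known template boilerplate from an issue's text before generating its embedding.

```typescript
// Hypothetical mitigation: remove shared template boilerplate before embedding
// so that it does not dominate the similarity score. The template registry and
// its contents are illustrative assumptions.
const templateBoilerplate: Record<string, string[]> = {
  "bug-report-template": ["Steps to reproduce:", "Expected behavior:", "Actual behavior:"],
};

function stripTemplateText(templateId: string | undefined, body: string): string {
  const phrases = templateId ? templateBoilerplate[templateId] : undefined;
  if (!phrases) return body;
  let cleaned = body;
  for (const phrase of phrases) {
    cleaned = cleaned.split(phrase).join(" ");
  }
  return cleaned.replace(/\s+/g, " ").trim();
}
```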
Perhaps most interestingly, Linear intends to use the vector embedding index as a signal for their main search functionality, which currently runs on Elasticsearch. This suggests a hybrid approach combining traditional keyword search with semantic similarity, a common pattern in production search systems that want to balance precision and recall across different query types.
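The case study does not describe how the two signals would be combined. Reciprocal rank fusion, sketched below with hypothetical inputs, is one widely used way to merge a keyword ranking with a vector similarity ranking.

```typescript
// Reciprocal rank fusion (RRF): merge ranked result lists from keyword search
// and vector similarity search. A generic illustration, not Linear's implementation.
function reciprocalRankFusion(resultLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const results of resultLists) {
    results.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical usage: issues ranked highly by both systems surface first.
const merged = reciprocalRankFusion([
  ["ISS-12", "ISS-7", "ISS-99"], // keyword search ranking
  ["ISS-7", "ISS-44", "ISS-12"], // vector similarity ranking
]);
console.log(merged);
```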
## Broader LLMOps Lessons
This case study illustrates several important principles for production LLMOps deployments. First, starting with pragmatic proof-of-concept approaches (like storing vectors as blobs in the primary database) can accelerate learning without committing to complex infrastructure prematurely. Second, favoring familiar, maintainable technology stacks (PostgreSQL rather than specialized vector databases) can reduce operational burden for small teams, even if specialized solutions might offer theoretical advantages.
Third, the importance of integration strategy stands out clearly. Linear didn't just build a standalone similarity detection feature—they thoughtfully integrated it at multiple workflow stages where it provides distinct value. This integration-first mindset maximizes the utility of the underlying AI capabilities.
Finally, the case study demonstrates the value of having established infrastructure for common LLMOps tasks. Linear's existing framework for parallel data processing on Kubernetes made the embedding backfill "a breeze," while teams lacking such infrastructure might find this a significant hurdle. Building reusable patterns for tasks like batch processing, job orchestration, and data backfills pays dividends across multiple AI initiatives.
The partitioning strategy Linear employed to handle scale represents a practical solution to a common challenge in production vector search systems. Rather than attempting to maintain a single monolithic index, breaking the problem down by workspace (a natural boundary in their data model) allowed them to achieve acceptable performance while managing index creation complexity. This approach trades some cross-workspace search capabilities for improved performance and maintainability within workspaces, a reasonable tradeoff given their use case.
Overall, Linear's Similar Issues feature represents a well-executed example of applying LLMs and vector embeddings to a concrete business problem in production. The implementation reflects pragmatic engineering choices, thoughtful integration strategy, and realistic acknowledgment of scale challenges. While the case study is naturally promotional and lacks some technical details that would be valuable for complete assessment, it provides useful insights into the practical considerations of deploying semantic similarity search at scale in a production SaaS environment.