Accolade, facing challenges with fragmented healthcare data across multiple platforms, implemented a Retrieval Augmented Generation (RAG) solution using Databricks' DBRX model to improve their internal search capabilities and customer service. By consolidating their data in a lakehouse architecture and leveraging LLMs, they enabled their teams to quickly access accurate information and better understand customer commitments, resulting in improved response times and more personalized care delivery.
Accolade is a healthcare technology company focused on helping members lead healthier lives by providing timely access to appropriate care. Their technology-enabled solutions combine virtual primary care, mental health support, expert medical opinions, and care navigation. This case study describes how Accolade addressed significant data fragmentation challenges to enable generative AI capabilities that improved their internal team productivity and customer service quality.
The core problem Accolade faced was that their data was siloed across multiple platforms, including AWS Redshift and Snowflake, which created inefficiencies and prevented them from leveraging AI effectively. Their Vice President of Enterprise Data and Clinical Platform, Kapil Ashar, noted that fragmented data “hampered efficiency and ability to provide timely, personalized care,” with internal teams struggling to quickly access necessary information. This led to slower response times and diminished ability to assist members effectively.
Before implementing any LLM-based solutions, Accolade recognized that they needed to fix their underlying data infrastructure. This is a critical lesson in LLMOps: generative AI initiatives depend heavily on having a solid data foundation. The company transitioned from a traditional BI infrastructure to an AI-driven framework using Databricks’ lakehouse architecture.
The lakehouse architecture enabled the combination of data storage and management in a unified environment, facilitating easier access and analysis. This approach broke down silos between different types of data and systems. As Ashar stated, “It’s not just about having data; it’s about making it actionable and efficient, which Databricks enabled us to do effectively at a scale we previously couldn’t manage.”
For real-time analytics, the platform utilized Apache Spark for streaming data capabilities, enabling continuous data ingestion from various sources. This streaming capability is particularly important in healthcare contexts where timely data access can directly impact care delivery outcomes.
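Spark Structured Streaming handles this micro-batch pattern at scale; the underlying idea can be sketched in plain Python (all names and the event shape here are illustrative, not Accolade's actual pipeline):

```python
from typing import Dict, Iterable, List

def micro_batches(events: Iterable[Dict], batch_size: int) -> Iterable[List[Dict]]:
    """Group an unbounded event stream into fixed-size micro-batches,
    mirroring how Spark Structured Streaming processes incoming data."""
    batch: List[Dict] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

def ingest(batch: List[Dict]) -> int:
    """Placeholder sink: in a lakehouse this would append to a Delta table."""
    return len(batch)

# Simulated stream of events arriving from multiple source systems.
stream = ({"source": "snowflake" if i % 2 else "redshift", "id": i} for i in range(7))
counts = [ingest(b) for b in micro_batches(stream, batch_size=3)]
print(counts)  # → [3, 3, 1]
```

The same loop, run continuously against a message bus or change-data feed rather than a finite generator, is what "continuous ingestion" amounts to in practice.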
A particularly noteworthy aspect of this implementation is the attention to data governance, which is critical in healthcare due to HIPAA compliance requirements. Databricks Unity Catalog was implemented for management and governance, providing stringent access controls and detailed data lineage tracking. This unified governance approach allowed Accolade to manage structured and unstructured data, machine learning models, notebooks, dashboards, and files across any cloud or platform.
The case study notes that this governance framework “accelerated data and AI initiatives, simplified regulatory compliance and allowed data teams to access data and collaborate securely.” This highlights an important LLMOps consideration: in regulated industries like healthcare, governance cannot be an afterthought—it must be built into the architecture from the beginning.
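Unity Catalog enforces access control and lineage at the platform level; as a toy illustration of what "stringent access controls plus an audit trail" means mechanically (this is a simplified sketch, not the Unity Catalog API):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class GovernedCatalog:
    """Toy stand-in for a governed catalog: per-table access-control
    lists plus an audit trail of reads (a crude form of lineage)."""
    acl: Dict[str, Set[str]]                       # table -> roles allowed to read
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def read(self, table: str, role: str) -> str:
        if role not in self.acl.get(table, set()):
            self.audit_log.append(("denied", role, table))
            raise PermissionError(f"{role} may not read {table}")
        self.audit_log.append(("read", role, table))
        return f"rows from {table}"

catalog = GovernedCatalog(acl={
    "claims": {"clinical_analyst"},                # PHI: tightly scoped
    "contracts": {"clinical_analyst", "support"},
})

print(catalog.read("contracts", "support"))        # allowed
try:
    catalog.read("claims", "support")              # PHI table: denied
except PermissionError as exc:
    print("blocked:", exc)
```

Every access attempt, allowed or denied, lands in the audit trail, which is the property HIPAA-style compliance reviews depend on.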
The core GenAI implementation centered on retrieval augmented generation (RAG), which the case study describes as “a generative AI workflow that uses custom data and documents to provide context for LLMs.” This is a standard but effective approach for enterprise applications where domain-specific knowledge and proprietary data need to inform LLM responses.
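The RAG workflow is simple to sketch: retrieve the documents most relevant to a query, then pass them to the LLM as grounding context. A minimal keyword-overlap version follows; Accolade's system would use embedding-based retrieval via the Mosaic AI Agent Framework, and the documents here are invented examples:

```python
from typing import List

DOCS = [
    "Customer contract: Acme commits to 24-hour response on urgent cases.",
    "Benefits guide: virtual primary care visits are covered at no cost.",
    "Clinical policy: expert medical opinions require member consent.",
]

def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Rank documents by naive keyword overlap with the query.
    A production system would rank by embedding similarity instead."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: List[str]) -> str:
    """Ground the LLM's answer in the retrieved company documents."""
    context = "\n".join(docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

query = "What response time did Acme commit to?"
top = retrieve(query, DOCS)
prompt = build_prompt(query, top)
```

The prompt, not the model weights, carries the proprietary knowledge, which is why RAG suits domains where the source documents change frequently.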
Accolade used the Mosaic AI Agent Framework to develop their RAG solution, specifically tailored to improve the efficiency and effectiveness of their internal teams. The solution allowed power users to access a diverse set of data sources, including customer contracts and related internal documents.
This diversity of data sources is typical in enterprise RAG implementations and presents challenges around document parsing, chunking strategies, and embedding generation, though the case study does not provide specific details on how these challenges were addressed.
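Of those challenges, chunking is the easiest to make concrete: long documents must be split into retrievable pieces without losing sentences at the boundaries. A common fixed-size-with-overlap strategy, sketched with deliberately tiny sizes (this is a generic illustration, not Accolade's disclosed approach):

```python
from typing import List

def chunk(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split a document into overlapping character chunks so that text
    cut at one boundary still appears intact in the neighboring chunk.
    Production systems usually chunk by tokens or sentences instead."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "Accolade contract terms: members receive navigation support and virtual care."
pieces = chunk(doc)
```

The overlap means the last ten characters of each chunk reappear at the start of the next, trading some index size for retrieval recall at the seams.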
The RAG solution was particularly focused on helping internal teams “answer questions confidently and correctly” and enabling them to “better understand and fulfill customer commitments” by analyzing customer contracts and other relevant data. This use case—augmenting human workers rather than replacing them—represents a practical and relatively low-risk approach to enterprise GenAI adoption.
Accolade chose DBRX, an open-source LLM developed by Databricks, as the foundation for their solution. The case study indicates DBRX was used to enhance internal search functions and enable internal users to retrieve accurate information quickly. The choice of an open-source model is notable as it provides greater control over the model and potentially better compliance posture compared to using third-party API-based models in a healthcare context.
For deployment, Databricks Model Serving was used to deploy the model as a RESTful API, enabling real-time predictions that could be integrated directly into Accolade's decision systems. The case study highlights the operational benefits of this approach, such as managed scaling and versioning.
This deployment pattern—exposing models via RESTful APIs with managed infrastructure handling scaling and versioning—represents a common and proven approach in production LLMOps.
While the case study presents a compelling narrative, there are several points that warrant balanced consideration:
The case study is published by Databricks and features their products extensively, so it naturally emphasizes the benefits of their platform. Specific quantitative results (e.g., percentage improvements in response times, cost savings, or accuracy metrics) are notably absent, making it difficult to objectively assess the impact of the implementation.
The claim of "major productivity gains" is vague and not substantiated by specific metrics. Similarly, phrases like "greater transparency and control" are qualitative and not backed by concrete examples or measurements.
The case study also does not discuss the challenges typically encountered during such implementations, for example around retrieval quality, evaluation, and cost.
These are important operational considerations that would provide more actionable insights for other organizations looking to implement similar solutions.
Despite the limitations in specific detail, several valuable LLMOps lessons can be extracted:
The importance of data foundation cannot be overstated. Accolade explicitly noted that their AI efforts were delayed due to data fragmentation, and fixing the data infrastructure was a prerequisite for their GenAI initiatives. This reinforces that successful LLM implementations often require significant investment in underlying data architecture.
Governance must be built in from the start, especially in regulated industries. The integration of Unity Catalog for data lineage and access controls demonstrates how compliance requirements shape architectural decisions in production LLM systems.
Starting with internal use cases (augmenting internal teams) rather than customer-facing applications represents a lower-risk approach to enterprise GenAI adoption. It allows organizations to iterate and learn while limiting potential exposure from errors.
Using managed services for model deployment (like Databricks Model Serving) can simplify operational overhead around scaling, versioning, and integration. This trade-off between control and operational simplicity is a common consideration in LLMOps.
The RAG pattern remains a dominant approach for enterprise GenAI use cases where proprietary or domain-specific knowledge needs to enhance LLM capabilities. The ability to ground responses in company-specific documents and data addresses common concerns about LLM accuracy and relevance.
The case study concludes by noting that Accolade anticipates “the combination of human and artificial intelligence will remain core to their mission.” This human-AI collaboration model, where AI tools support rather than replace human workers, is presented as central to their ongoing strategy. They describe their approach as a “compound AI system” that is “democratizing access to high-quality insights” and “empowering employees to perform their roles with greater efficacy.”
The emphasis on continued innovation and collaboration within a consolidated platform suggests an iterative approach to expanding GenAI capabilities, though specific future plans are not detailed in the case study.