## Overview
This case study presents Dust's perspective on enterprise AI agent deployment based on their experience working with over 1,000 companies. While the text is promotional in nature (written by Dust to advocate for their platform), it provides valuable insights into real-world LLMOps challenges through multiple customer examples spanning consulting, healthcare technology, insurance, private equity, and other sectors. The core argument centers on the build-versus-buy decision for AI agent infrastructure, with detailed examination of the technical, organizational, and economic factors that influence production AI deployments at scale.
The case study is particularly valuable for understanding the operational realities of deploying LLMs in production environments, including the often-underestimated complexity of enterprise-grade AI infrastructure. However, readers should note that the examples are curated to support Dust's position, and the study doesn't deeply explore scenarios where building custom infrastructure might provide strategic advantages beyond the brief acknowledgment of edge cases.
## The Core Build vs. Buy Problem in LLMOps
The fundamental challenge presented is familiar from enterprise software decisions generally but carries unique dimensions in the AI agent context. Companies face a decision that appears straightforward on the surface—leverage internal engineering talent to build custom AI infrastructure or purchase a platform—but the analysis reveals layers of hidden complexity specific to production LLM deployments.
The key insight is that estimating build costs typically focuses on obvious factors like engineer salaries, infrastructure costs, and model API expenses, but misses the substantial ongoing investment required to maintain production-grade AI systems. This mirrors broader LLMOps challenges where the initial model development represents a fraction of the total cost of ownership.
## Technical Infrastructure Requirements for Production AI Agents
The case study outlines several critical technical components required for production AI agent deployments that are frequently underestimated:
**Universal Context Access and Integrations**: Production AI agents require connectivity across the entire enterprise technology stack—Slack, Notion, GitHub, Salesforce, and potentially dozens of other tools. Each integration requires custom development, ongoing maintenance as APIs evolve, and robust error handling. This represents a significant engineering investment that compounds over time as new tools are adopted and existing APIs change. The integration challenge is particularly acute in LLMOps because agents need real-time or near-real-time access to current data, not just historical snapshots.
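To make the maintenance burden concrete, the sketch below shows a minimal connector abstraction in Python. The `Connector` interface, `Document` shape, and retry helper are illustrative assumptions rather than Dust's actual architecture; they simply show the surface area (incremental sync, error handling, backoff) that every integration has to cover and keep working as upstream APIs change.

```python
"""Hypothetical sketch of a connector abstraction for enterprise integrations.

None of these classes come from Dust's codebase; they illustrate the surface
area (auth, incremental sync, retries) that every integration must cover.
"""
from __future__ import annotations

import time
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Document:
    source: str          # e.g. "slack", "notion", "salesforce"
    external_id: str     # ID in the source system
    text: str            # extracted content, embedded downstream
    updated_at: float    # used for incremental sync


class Connector(ABC):
    """Each enterprise tool needs its own implementation and ongoing upkeep."""

    @abstractmethod
    def fetch_since(self, timestamp: float) -> list[Document]:
        """Return documents changed since `timestamp` (incremental sync)."""


def sync_with_retries(connector: Connector, since: float, attempts: int = 3) -> list[Document]:
    """Robust error handling: upstream APIs rate-limit and change over time."""
    for attempt in range(1, attempts + 1):
        try:
            return connector.fetch_since(since)
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff
    return []
```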
**Permission Systems for AI Agents**: Traditional role-based access control (RBAC) systems break down when AI agents need to operate across permission boundaries. The case study highlights the example of an HR agent that needs access to documents general employees cannot see directly. This requires designing entirely new permission models that understand context, intent, and appropriate boundaries for AI-mediated access. This is a genuinely complex LLMOps challenge—the agent must respect data governance while providing utility, and the permission system must be explainable and auditable for compliance purposes.
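The sketch below illustrates, under assumed names and labels, how an agent-aware check might differ from plain RBAC: the agent can consult a document the requesting user cannot open directly, but only within a declared scope and purpose, and every decision is recorded so it can be audited and explained. This is not Dust's permission model, just a minimal illustration of the pattern.

```python
"""Hypothetical sketch of an agent-aware permission check (not Dust's model)."""
from dataclasses import dataclass, field


@dataclass
class AccessDecision:
    allowed: bool
    reason: str


@dataclass
class AgentPermissionPolicy:
    # Document classes this agent may read on behalf of users (assumed labels).
    agent_scopes: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def check(self, user_roles: set[str], doc_label: str, doc_roles: set[str],
              purpose: str) -> AccessDecision:
        if user_roles & doc_roles:
            decision = AccessDecision(True, "user has direct access")
        elif doc_label in self.agent_scopes:
            # AI-mediated access: permitted for the agent's declared purpose,
            # even though the user cannot open the document directly.
            decision = AccessDecision(True, f"agent scope '{doc_label}' for purpose '{purpose}'")
        else:
            decision = AccessDecision(False, "outside user roles and agent scopes")
        self.audit_log.append({"doc_label": doc_label, "purpose": purpose,
                               "allowed": decision.allowed, "reason": decision.reason})
        return decision


# Usage: an HR agent answering a policy question for a regular employee.
policy = AgentPermissionPolicy(agent_scopes={"hr_policy"})
print(policy.check(user_roles={"employee"}, doc_label="hr_policy",
                   doc_roles={"hr_admin"}, purpose="answer leave question"))
```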
**Semantic Search Infrastructure**: Unlike human users who can navigate organizational knowledge through intuition and institutional memory, AI agents require pre-embedding of content and semantic search capabilities to effectively retrieve relevant information. This involves maintaining vector databases, implementing embedding pipelines, managing index updates as content changes, and optimizing retrieval for both relevance and latency. The case study notes this as foundational infrastructure that Dust built over two years with a dedicated team, suggesting significant complexity in production deployments.
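A minimal illustration of the retrieval side of this infrastructure appears below. The toy hashing `embed` function stands in for a real embedding model and the in-memory dictionary stands in for a vector database; a production pipeline would also need re-embedding on content changes, relevance tuning, and latency optimization.

```python
"""Minimal sketch of a semantic search index (toy embedder, in-memory store)."""
import hashlib
import math


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy deterministic embedding: hash tokens into a fixed-size unit vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class SemanticIndex:
    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        """Called whenever a document is added or edited (index updates)."""
        self._vectors[doc_id] = embed(text)

    def search(self, query: str, k: int = 3) -> list[tuple[str, float]]:
        q = embed(query)
        scored = [(doc_id, sum(a * b for a, b in zip(q, v)))
                  for doc_id, v in self._vectors.items()]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


index = SemanticIndex()
index.upsert("policy-1", "employee leave policy and vacation days")
index.upsert("runbook-7", "how to rotate production database credentials")
print(index.search("how many vacation days do I get"))
```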
**Model Flexibility and Provider Management**: The rapidly evolving AI landscape requires infrastructure that supports multiple model providers and can adapt as new capabilities emerge. This is a critical LLMOps consideration—production systems cannot be tightly coupled to a single model or provider without creating strategic risk. However, supporting multiple providers requires abstraction layers, provider-specific optimization, cost management across different pricing models, and governance frameworks for model selection decisions.
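The sketch below shows one plausible shape for such an abstraction layer: a single completion interface, per-provider cost accounting, and a routing hook where governance rules can plug in. Provider names are real, but the stub clients and prices are placeholder assumptions, not actual rates or SDK calls.

```python
"""Hypothetical sketch of a multi-provider abstraction layer."""
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelProvider:
    name: str
    cost_per_1k_tokens: float          # placeholder pricing, not actual rates
    complete: Callable[[str], str]     # provider-specific client call goes here


class ModelRouter:
    def __init__(self, providers: list[ModelProvider]) -> None:
        self.providers = {p.name: p for p in providers}
        self.spend: dict[str, float] = {p.name: 0.0 for p in providers}

    def complete(self, prompt: str, preferred: str) -> str:
        provider = self.providers.get(preferred) or next(iter(self.providers.values()))
        # Very rough token estimate, used for cost tracking across pricing models.
        self.spend[provider.name] += len(prompt.split()) / 1000 * provider.cost_per_1k_tokens
        return provider.complete(prompt)


# Stub clients stand in for real SDK calls (openai, anthropic, etc.).
router = ModelRouter([
    ModelProvider("openai", 0.010, lambda p: f"[openai draft] {p[:40]}"),
    ModelProvider("anthropic", 0.012, lambda p: f"[anthropic draft] {p[:40]}"),
])
print(router.complete("Summarize this contract clause ...", preferred="anthropic"))
print(router.spend)
```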
## Real-World Deployment Examples and Time-to-Value Analysis
The case study provides several concrete examples that illustrate the time-to-value differential between building and buying:
**Doctolib's Experience**: Doctolib, a healthcare technology company with 3,000 employees, initially built DoctoGPT, an internal ChatGPT-based solution that gained 800 active users. However, success created unexpected challenges. Within days of launch the team was overwhelmed with feature requests: users expected out-of-the-box connectors, native plugins for platforms like Zendesk, and other advanced features. VP of Data & AI Nacim Rahal recognized that maintaining this internally would make his team "a permanent bottleneck requiring resources across multiple teams." The supporting infrastructure extended far beyond the AI functionality itself to include connector development, access rights management, security features, audit logs, compliance frameworks, and ongoing maintenance. Doctolib's guiding principle became "build what's in our core business, buy what will be a side project," reflecting the opportunity cost of diverting resources from their healthcare mission.
**Wakam's Infrastructure Journey**: Wakam, a European insurtech company, provides a particularly instructive example of the build-to-buy transition. Their five-person data science team began building a custom AI chatbot with RAG capabilities in late 2023 after encountering GPT-4. While they had the technical expertise to create a working prototype, they discovered that maintaining it required constant engineering effort, with every new feature requiring weeks of development. More critically, they found themselves rediscovering fundamental challenges that AI platform companies had already solved—implementing effective RAG, managing vector databases, handling model orchestration, and creating user interfaces. The team was spending time on "undifferentiated technical problems rather than creating business value." After pivoting to Dust, Wakam achieved 70% employee adoption within two months and deployed 136 AI agents across the organization. Their legal team cut contract analysis time by 50%, and data teams enabled self-service intelligence that dramatically reduced processing time. This example illustrates both the technical depth required for production AI systems and the opportunity cost of building versus buying.
**Ardabelle's Rapid Deployment**: Ardabelle, a private equity fund, recognized that their competitive advantage was deal analysis and portfolio management, not AI infrastructure. Within 90 days of choosing to buy rather than build, they achieved adoption across the entire investment team, 150+ queries per analyst per week, and evaluated 50% more deals in the same timeframe. The case study estimates that those 90 days would have extended to 9-12 months if building internally. This dramatic difference in time-to-value represents a critical LLMOps consideration—in rapidly evolving markets, the opportunity cost of delayed deployment can exceed the direct costs of platform purchases.
**November Five's 20-Minute Deployment**: Dario Prskalo from November Five was able to set up and deploy functional AI agents in just 20 minutes—not to a proof of concept, but to agents that his team could actually use. This represents the extreme end of the time-to-value spectrum and suggests that well-designed platforms can dramatically reduce deployment friction.
**CMI Strategies' ROI Model**: CMI Strategies, a 100-person consulting firm, achieved 95% adoption across all consultants, 60-70% time savings on commercial proposals, and 50% faster executive summary production. Bastien Hontebeyrie explained the economics: "Monthly costs of 30-40 euros per consultant generate multiple hours of weekly time savings, creating obvious ROI before considering quality improvements." This provides a concrete cost-benefit framework for evaluating platform decisions in LLMOps contexts.
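A back-of-the-envelope version of that calculation, using the quoted per-seat cost and assumed values for hours saved and loaded consultant cost (neither of which is given in the case study), looks like this:

```python
# Back-of-the-envelope ROI using the quoted platform cost and assumed inputs.
# Hours saved and hourly cost are illustrative assumptions, not CMI's figures.
platform_cost_per_month = 35        # euros per consultant (quoted range: 30-40)
hours_saved_per_week = 2            # assumed: "multiple hours of weekly time savings"
loaded_hourly_cost = 60             # euros, assumed fully loaded consultant cost
weeks_per_month = 4.33

monthly_value = hours_saved_per_week * weeks_per_month * loaded_hourly_cost
roi_multiple = monthly_value / platform_cost_per_month
print(f"Value recovered: ~{monthly_value:.0f} EUR/month, ~{roi_multiple:.0f}x the platform cost")
# => roughly 520 EUR/month against 35 EUR, i.e. ~15x, before any quality gains.
```

Even with these conservative assumptions, the recovered time is an order of magnitude larger than the per-seat cost, which matches the "obvious ROI" framing in the quote.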
## Enterprise Security and Compliance Considerations
The case study emphasizes enterprise-grade security as a major differentiator for buying versus building, though it is worth noting that security requirements vary significantly by organization and use case:
**Zero Data Retention Guarantees**: Platform providers negotiate zero data retention agreements with model providers, which individual companies may struggle to obtain. This is particularly important for organizations handling sensitive data where model training on customer data would violate privacy commitments or regulations.
**Fine-Grained Permissions for AI Agents**: As discussed earlier, traditional permission systems require rethinking for AI agents. Production platforms invest in permission models specifically designed for AI-mediated access.
**Regulatory Compliance**: Compliance with GDPR, SOC 2, and industry-specific regulations requires ongoing investment. The case study highlights Anne-Claire from Mirakl's perspective: "We wanted a solution that would fit our security requirements from the start." For companies handling sensitive transactional data, security cannot be an afterthought. Building this level of security infrastructure is estimated to require another 6-12 months of development and ongoing compliance work.
**Shadow AI Risk**: The case study introduces an often-overlooked LLMOps consideration—the risk of employees using personal ChatGPT accounts with sensitive company data when no approved platform is available. This creates uncontrolled security vulnerabilities across organizations. Providing an approved, secure platform addresses this risk, regardless of whether it's built or bought, though buying typically offers faster time-to-secure-deployment.
## The Maintenance Burden and Continuous Improvement Challenge
A critical LLMOps insight from the case study is that the real cost isn't building version one—it's maintaining and improving the system as the AI landscape evolves. This maintenance burden manifests in several ways:
**Feature Velocity**: The AI landscape changes monthly, with new model capabilities, improved techniques, and evolving best practices. Companies that build internally must maintain their own upgrade path indefinitely. Doctolib's experience with being "overwhelmed with requests within days of launch" illustrates how user expectations quickly outpace initial implementations.
**Technical Debt Accumulation**: As models evolve and best practices change, custom-built infrastructure accumulates technical debt. RAG implementations that worked well with GPT-3.5 may need substantial rearchitecting for optimal performance with GPT-4 or Claude. Embedding strategies, chunking approaches, and retrieval techniques all require ongoing optimization.
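One way to see why this rework is unavoidable is to treat chunking and retrieval parameters as configuration coupled to the generation model in use, as in the hypothetical sketch below (model names and values are illustrative): upgrading the model typically means re-tuning chunk sizes and re-embedding the corpus rather than a drop-in swap.

```python
"""Hypothetical sketch: chunking parameters as model-dependent configuration."""
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    embedding_model: str
    chunk_size_tokens: int
    chunk_overlap_tokens: int
    top_k: int


# Smaller-context-era setup vs. a larger-context model: different tradeoffs,
# and switching between them requires re-chunking and re-embedding the corpus.
legacy_config = RagConfig("embed-small-v1", chunk_size_tokens=256,
                          chunk_overlap_tokens=32, top_k=4)
current_config = RagConfig("embed-large-v2", chunk_size_tokens=1024,
                           chunk_overlap_tokens=128, top_k=8)


def chunk(text: str, cfg: RagConfig) -> list[str]:
    """Naive whitespace chunking; real pipelines split on structure and tokens."""
    words = text.split()
    step = cfg.chunk_size_tokens - cfg.chunk_overlap_tokens
    return [" ".join(words[i:i + cfg.chunk_size_tokens]) for i in range(0, len(words), step)]
```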
**Integration Maintenance**: APIs for connected systems evolve, requiring ongoing maintenance of integration code. This is particularly challenging in LLMOps contexts because AI agents may interact with APIs in ways not anticipated by the API providers, leading to edge cases and unexpected breaking changes.
**Model Provider Management**: As new model providers emerge and existing providers update their offerings, production systems must continuously evaluate and potentially integrate new options. This requires not just technical integration work but also governance frameworks for deciding when and how to adopt new models.
The case study notes that Dust has built their platform over two years with a dedicated team, and companies choosing to build typically underestimate timelines by 6-12 months. This suggests that even with accurate initial estimates, the ongoing investment required for production AI systems is substantial.
## The Integration Trap: Middle Path Challenges
The case study warns against a "middle path" approach where companies buy a basic AI platform and build custom integrations and workflows on top. This approach is characterized as potentially combining "the worst of both worlds":
**Continued Engineering Resource Requirements**: Custom integration development still requires dedicated engineering resources, negating some of the benefits of buying.
**Platform API Limitations**: Custom code is constrained by the platform's API capabilities, which may not expose all functionality or may impose performance constraints.
**Maintenance Burden from Platform Updates**: Custom code breaks when the platform updates, creating ongoing maintenance overhead that compounds over time.
**Missed Platform Capabilities**: Building custom integrations may bypass built-in platform capabilities that would provide better functionality with less effort.
The case study contrasts this with Watershed's approach, which embraced Dust's full platform and focused on high-value use cases rather than custom integrations. Jonathan Coveney from Watershed explained that "the biggest success has been as a general-purpose tool that many people have been able to use across different functions," suggesting that platform adoption without heavy customization can deliver broad value.
This represents an important LLMOps consideration—the tension between customization and standardization. While custom integrations may seem to offer the best of both worlds, they often create maintenance burdens that reduce the net benefit of platform adoption.
## When Building Makes Sense: Edge Cases and Strategic Considerations
Despite the overall advocacy for buying, the case study acknowledges scenarios where building custom AI infrastructure may be justified:
**Truly Unique Requirements**: Building makes sense when requirements are fundamentally unique and cannot be addressed by existing platforms. The case study emphasizes "truly unique"—not just "we want it to work slightly differently"—suggesting that many perceived unique requirements can actually be met by configurable platforms.
**AI Infrastructure as Core Product**: If a company is selling AI capabilities to customers, building custom infrastructure may be strategic. However, the case study notes that "even then, many successful AI companies use platforms like Dust internally while building their customer-facing products," suggesting that internal tooling and customer-facing products may warrant different build-versus-buy decisions.
**Dedicated Resources and Realistic Timelines**: Building may work if a company can commit a team of 5-10 engineers for 12-18 months without impacting core product roadmap, has in-house AI expertise, and maintains "brutal honesty" about timeline and resource requirements. This last point is critical—the case study suggests that many build decisions fail because of unrealistic estimates rather than fundamental infeasibility.
From an LLMOps perspective, these criteria are stringent and likely exclude most companies from the "building makes sense" category. The emphasis on "brutal honesty" about timelines and resources suggests that many failed build attempts stem from underestimation rather than strategic misjudgment.
## Adoption Patterns and Success Factors
The case study identifies common patterns among companies that succeed with AI agents, providing insights into operational LLMOps beyond the build-versus-buy decision:
**Speed of Deployment**: Successful companies move fast—Ardabelle achieved full investment-team adoption within 90 days, Wakam hit 70% adoption within two months, and CMI Strategies achieved measurable ROI within weeks. The case study argues that "speed matters because the AI landscape is changing rapidly," suggesting that early deployment provides learning opportunities and competitive advantages that compound over time.
**Focus on Adoption Over Features**: Mirakl's goal was to transform 75% of employees from users into builders, which required a platform that made agent creation accessible rather than maximizing technical sophistication. This represents an important LLMOps insight—the value of AI systems depends critically on adoption, and adoption depends on accessibility and ease of use, not just technical capability.
**AI as Strategic Capability, Not Technical Project**: At Creative Force, the People team led their AI transformation, not IT. This suggests that successful AI deployments require treating AI as a change management challenge rather than purely a technical implementation. From an LLMOps perspective, this implies that production AI systems must be designed with organizational and human factors in mind, not just technical performance.
**Maintaining Core Business Focus**: Every company that chose to buy recognized that their competitive advantage wasn't AI infrastructure but rather their product, customers, or domain expertise. This reflects a broader strategic principle in LLMOps—technology investments should align with and support competitive differentiation rather than creating new areas of undifferentiated infrastructure development.
## Critical Assessment and Limitations
While the case study provides valuable insights, several considerations warrant caution:
**Selection Bias**: All examples are Dust customers who chose to buy their platform, creating inherent selection bias. Companies that successfully built custom AI infrastructure would not appear in this analysis. A truly balanced assessment would require examining successful build approaches as well.
**Cost Transparency**: The case study emphasizes hidden costs of building but doesn't provide comprehensive cost breakdowns for platform purchases, including licensing costs at scale, professional services, customization needs, and migration costs from existing systems.
**Technical Depth**: While the case study mentions important technical components (RAG, vector databases, embeddings, model orchestration), it doesn't deeply explore the technical tradeoffs involved in different implementation approaches. For example, companies with existing vector database expertise and infrastructure might find building on their existing investments more efficient than adopting a new platform.
**Vendor Lock-in**: The case study doesn't address potential vendor lock-in risks when adopting a platform, including data portability, API dependencies, and the ability to migrate to alternative solutions if business needs change.
**Organizational Context**: The success stories span diverse industries and company sizes, but the case study doesn't deeply explore how organizational context—existing technical capabilities, regulatory environment, cultural factors—should influence build-versus-buy decisions.
**Platform Limitations**: While the case study emphasizes platform benefits, it doesn't explore scenarios where platform constraints might limit capabilities or force suboptimal architectural decisions.
## LLMOps Insights and Broader Implications
Beyond the specific build-versus-buy argument, this case study provides several valuable LLMOps insights:
**Time-to-Value Criticality**: In rapidly evolving AI landscapes, time-to-value becomes a critical factor that may outweigh traditional cost considerations. The difference between 20 minutes and 12 months for deployment represents not just time savings but also organizational learning, competitive positioning, and ability to adapt to evolving capabilities.
**Complexity of Production AI Systems**: The technical infrastructure requirements outlined—universal context access, agent-specific permissions, semantic search, model flexibility—illustrate that production AI systems are substantially more complex than prototype or research implementations. This complexity is frequently underestimated in initial build assessments.
**Maintenance as Primary Cost Driver**: The emphasis on ongoing maintenance rather than initial development costs reflects broader LLMOps realities. AI systems require continuous investment to remain effective as models evolve, data changes, and user expectations increase.
**Adoption as Success Metric**: The focus on adoption rates (70%, 95%) and usage patterns (150+ queries per analyst per week) rather than purely technical metrics reflects the reality that production AI value depends on organizational uptake, not just technical capability.
**Strategic Resource Allocation**: The core argument—that engineering talent should focus on competitive differentiation rather than undifferentiated infrastructure—applies broadly in LLMOps. Even companies with strong technical capabilities must evaluate opportunity costs of internal AI infrastructure development.
**Security and Compliance Complexity**: The emphasis on enterprise-grade security, zero data retention, and compliance frameworks illustrates that production AI deployments in regulated industries require substantial governance investment beyond core functionality.
This case study, despite its promotional nature, provides valuable insights into the operational realities of deploying AI agents at enterprise scale, the hidden complexities of production LLM systems, and the strategic considerations that should inform platform decisions in LLMOps contexts.