Company
11X
Title
Building an AI Sales Development Representative with Advanced RAG Knowledge Base
Industry
Tech
Year
2025
Summary (short)
11X developed Alice, an AI Sales Development Representative (SDR) that automates lead generation and email outreach at scale. The key innovation was replacing a manual product library with a knowledge base that uses advanced RAG (Retrieval Augmented Generation) techniques to automatically ingest and understand seller information from sources including documents, websites, and videos. The system processes each resource type through a specialized parsing vendor, chunks the content, stores embeddings in the Pinecone vector database, and uses deep research agents for context retrieval. The result is an AI agent that sends roughly 50,000 personalized emails daily, compared to 20-50 for a human SDR, while serving 300+ business organizations with contextually relevant outreach.
## Company Overview and Use Case

11X builds digital workers for go-to-market organizations. Its flagship product, Alice, is an AI Sales Development Representative (SDR) and a production deployment of LLMs in sales automation, processing and generating tens of thousands of personalized sales emails daily. The company also develops Julian, a voice agent, as part of a broader commitment to AI-powered business process automation.

The core challenge 11X faced was building an AI system that could represent sellers' businesses effectively while engaging prospects at scale. Human SDRs handle lead sourcing, multi-channel engagement, and meeting booking, with success measured by positive reply rates and meetings booked. Alice needed to replicate these capabilities at much higher volume: approximately 50,000 emails per day, compared with the 20-50 emails a human SDR might send.

## Technical Architecture and LLMOps Implementation

The technical foundation of Alice is a knowledge base system built from several specialized components with a clear separation of concerns.

Users upload resources of various types through a client interface. The resources are stored in S3 buckets, and the backend creates database records and initiates processing jobs based on the resource type and the selected vendor. The parsing vendors work asynchronously and send webhooks on completion, which trigger the ingestion process. The parsed content is stored in the local database and, at the same time, embedded and upserted to the Pinecone vector database, so that the AI agent can query it during email generation.
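
The upload, parse, and ingest flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not 11X's implementation: the database and the vector index are plain in-memory dictionaries, and `VENDOR_BY_TYPE`, `Resource`, `KnowledgeBase`, `handle_webhook`, the webhook payload shape, and `toy_embed` are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical routing table: which parsing vendor handles which resource type.
VENDOR_BY_TYPE: Dict[str, str] = {
    "document": "document_parser",
    "website": "website_crawler",
    "video": "video_parser",
}


@dataclass
class Resource:
    resource_id: str
    resource_type: str
    storage_key: str      # where the uploaded file lives in blob storage
    vendor: str = ""
    status: str = "pending"
    markdown: str = ""


def toy_embed(text: str) -> List[float]:
    """Placeholder embedding; a real system would call an embedding model."""
    return [float(len(text)), float(text.count(" "))]


@dataclass
class KnowledgeBase:
    embed: Callable[[str], List[float]]
    # In-memory stand-ins for the database and the vector store.
    resources: Dict[str, Resource] = field(default_factory=dict)
    vector_index: Dict[str, List[float]] = field(default_factory=dict)

    def create_resource(self, resource_id: str, resource_type: str,
                        storage_key: str) -> Resource:
        """Create the database record and start an asynchronous parsing job."""
        vendor = VENDOR_BY_TYPE[resource_type]  # KeyError for unsupported types
        resource = Resource(resource_id, resource_type, storage_key, vendor=vendor)
        resource.status = "processing"  # the vendor now works in the background
        self.resources[resource_id] = resource
        return resource

    def handle_webhook(self, payload: Dict[str, str]) -> None:
        """Called when the vendor reports completion; triggers ingestion."""
        resource = self.resources[payload["resource_id"]]
        resource.markdown = payload["markdown"]  # store parsed content
        # Embed and upsert one vector per paragraph (naive split for brevity).
        paragraphs = [p for p in resource.markdown.split("\n\n") if p.strip()]
        for i, paragraph in enumerate(paragraphs):
            self.vector_index[f"{resource.resource_id}#{i}"] = self.embed(paragraph)
        resource.status = "ingested"
```

Calling `create_resource` and later `handle_webhook` with the same `resource_id` moves a record from `processing` to `ingested`. Keeping the webhook handler as the only place that writes to the vector index means a slow or failed parsing job never leaves partial vectors behind.
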
### Parsing Pipeline and Vendor Integration

One of the most sophisticated aspects of Alice's knowledge base is its multi-modal parsing capability. The system handles five distinct resource categories: documents, images, websites, audio, and video. Rather than building parsing capabilities in-house, 11X selected vendors based on production requirements.

For documents and images, they selected LlamaIndex's LlamaParse, primarily for its extensive file type support and responsive customer support. The selection process prioritized practical deployment concerns over theoretical accuracy metrics, focusing on markdown output support, webhook capabilities, and comprehensive file format coverage.

Website parsing uses Firecrawl, chosen over competitors such as Tavily because the team was familiar with it from previous projects and because Tavily's crawl endpoint was still in development during the evaluation period. This decision highlights the importance of vendor maturity and feature availability in production LLMOps deployments.

For audio and video processing, 11X selected CloudGlue, a newer vendor that provides both audio transcription and visual content extraction. The choice shows a willingness to work with emerging vendors when they offer a capability others lack, here the ability to extract information from video content beyond simple audio transcription.

### Chunking Strategy and Content Processing

The chunking strategy is designed around semantic preservation and retrieval quality. The system processes parsed markdown content through a waterfall chunking approach that balances semantic coherence with embedding effectiveness.

The chunking pipeline begins with markdown header-based splitting to preserve document structure and hierarchy.
This is followed by sentence-level splitting when chunks exceed the target token count, and finally by token-based splitting as a fallback. This multi-layered approach keeps semantic units intact wherever possible while maintaining chunk sizes appropriate for embedding and retrieval.

Preserving markdown structure is particularly important because formatting carries semantic meaning: it distinguishes titles, paragraphs, and other content types, which influences how information should be weighted and retrieved.

### Vector Database Implementation and Storage

The storage layer uses Pinecone as the primary vector database, chosen for practical production considerations rather than purely technical specifications. The selection criteria included cloud hosting to reduce infrastructure management overhead, comprehensive SDK support, bundled embedding models to simplify the technology stack, and strong customer support for troubleshooting production issues.

The embedding strategy relies on Pinecone's built-in embedding models, which removes the need to manage separate embedding model infrastructure. This decision reflects a pragmatic approach to LLMOps in which reducing operational complexity can outweigh the potential performance gains of a custom embedding solution.

### Advanced Retrieval with Deep Research Agents

Alice's retrieval system has evolved beyond traditional RAG to incorporate deep research agents. The system progressed from traditional RAG to agentic RAG, and finally to deep research RAG using Letta, a cloud agent provider.

The deep research agent receives lead information and dynamically creates a retrieval plan that may involve multiple context-gathering steps.
This approach allows for both broad and deep information gathering, depending on the context a given lead requires. The agent can evaluate whether additional retrieval steps are needed, enabling more contextually appropriate email generation.

The integration with Letta reflects an emerging pattern of using specialized agent platforms rather than building agent orchestration from scratch. This vendor-assisted approach lets the core engineering team focus on domain-specific logic while relying on a specialized platform for agent coordination and execution.

### Production Monitoring and Quality Assurance

The knowledge base includes visualization capabilities that serve both operational and customer confidence purposes. The system projects high-dimensional vectors from Pinecone into three-dimensional space, creating an interactive 3D visualization of the knowledge base contents. Users can click on nodes to view the associated content chunks, providing transparency into what Alice "knows" about their business.

This visualization serves several purposes in production: it lets sales and support teams demonstrate Alice's knowledge, it helps with debugging retrieval behavior, and it builds customer confidence by making the AI's knowledge base tangible and inspectable.

### User Experience and Workflow Integration

The knowledge base is integrated into the campaign creation workflow, where it automatically surfaces relevant information as Q&A pairs with their associated source chunks. This makes the system's intelligence visible and actionable to end users rather than hidden behind automated processes.

The transition from the manual library system to the automated knowledge base represents a fundamental shift in user experience philosophy.
Instead of requiring users to manually catalog their products and services, the system now automatically ingests and organizes information from ordinary business documents and resources.

## Production Lessons and LLMOps Insights

The development of Alice's knowledge base yielded several lessons for production LLMOps implementations. The team emphasized that RAG systems are significantly more complex than initially anticipated, involving numerous micro-decisions about parsing, chunking, storage, and retrieval strategies. This complexity multiplies when supporting multiple resource types and file formats.

The team recommends reaching production before doing extensive benchmarking. Rather than getting paralyzed by vendor evaluations and performance comparisons, they prioritized getting a working system into production to establish real usage patterns and performance baselines. This enables data-driven optimization based on actual production behavior rather than theoretical benchmarks.

The heavy reliance on specialized vendors rather than in-house capabilities is a deliberate choice. By using vendors for parsing, vector storage, and agent orchestration, the team could focus on core domain logic while benefiting from specialized expertise in each component area.

## Future Technical Direction

The roadmap for Alice's knowledge base includes several further LLMOps initiatives. Hallucination tracking and mitigation are a critical production concern for systems generating customer-facing content. The plan to evaluate parsing vendors on accuracy and completeness metrics marks a move toward more rigorous vendor management and quality assurance.

The exploration of hybrid RAG approaches, potentially incorporating graph databases alongside vector storage, demonstrates continued innovation in retrieval architecture.
This hybrid approach could enable more sophisticated relationship modeling and contextual understanding beyond simple semantic similarity.

Cost optimization across the entire pipeline reflects the practical concerns of scaling LLMOps systems in production. As usage grows, the accumulated costs of parsing vendors, vector storage, and API calls become significant operational considerations requiring systematic optimization efforts.

The Alice knowledge base case study represents a sophisticated production implementation of modern LLMOps practices, demonstrating how multiple specialized technologies can be orchestrated to create intelligent business automation systems. The emphasis on vendor partnerships, pragmatic technology choices, and user experience integration provides valuable insights for other organizations building production LLM systems.
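
As a concrete illustration of the waterfall chunking strategy described under "Chunking Strategy and Content Processing", the following Python sketch applies the three stages in order: markdown header splitting, then sentence splitting for oversized sections, then token splitting as a last resort. It is a sketch under stated assumptions, not the production code: the function names are hypothetical, tokens are approximated by whitespace-separated words rather than a real tokenizer, and `max_tokens=200` is an arbitrary default.

```python
import re
from typing import List


def count_tokens(text: str) -> int:
    """Approximate token count; a real system would use the model's tokenizer."""
    return len(text.split())


def split_by_headers(markdown: str) -> List[str]:
    """Stage 1: start a new section at every markdown header line."""
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [p.strip() for p in parts if p.strip()]


def split_by_tokens(text: str, max_tokens: int) -> List[str]:
    """Stage 3 (fallback): cut into fixed-size windows of tokens."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]


def split_by_sentences(text: str, max_tokens: int) -> List[str]:
    """Stage 2: pack whole sentences into chunks of at most max_tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        if count_tokens(sentence) > max_tokens:
            # A single sentence is too long: flush, then fall back to tokens.
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.extend(split_by_tokens(sentence, max_tokens))
            continue
        if current and count_tokens(" ".join(current + [sentence])) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


def waterfall_chunk(markdown: str, max_tokens: int = 200) -> List[str]:
    """Apply the stages in order, descending only when a chunk is too large."""
    chunks: List[str] = []
    for section in split_by_headers(markdown):
        if count_tokens(section) <= max_tokens:
            chunks.append(section)
        else:
            chunks.extend(split_by_sentences(section, max_tokens))
    return chunks
```

Each stage runs only when the previous one leaves a chunk above the limit, so short sections keep their header and body together, and token-level cuts, which can break a sentence mid-thought, are reserved for sentences that are too long on their own.
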
