Neople: AI-Powered Digital Co-Workers for Customer Support and Business Process Automation

Company

Neople

Title

AI-Powered Digital Co-Workers for Customer Support and Business Process Automation

Industry

E-commerce

Link

https://www.youtube.com/watch?v=DUBF5huCVBo

Year

2025

Summary (short)

Neople, a European startup founded almost three years ago, has developed AI-powered "digital co-workers" (called Neeles) primarily targeting customer success and service teams in e-commerce companies across Europe. The problem they address is the repetitive, high-volume work that customer service agents face, which reduces job satisfaction and efficiency. Their solution evolved from providing AI-generated response suggestions to human agents, to fully automated ticket responses, to executing actions across multiple systems, and finally to enabling non-technical users to build custom workflows conversationally. The system now serves approximately 200 customers, with AI agents handling repetitive tasks autonomously while human agents focus on complex cases. Results include dramatic improvements in first response rates (from 10% to 70% in some cases), reduced resolution times, and expanded use cases beyond customer service into finance, operations, and marketing departments.

## Overview

Neople is a European AI startup that has built a platform for creating AI-powered "digital co-workers" (branded as "Neeles") that assist primarily with customer service and support functions, with expanding use cases across business operations. The company was founded nearly three years ago by four co-founders including Yelp (CTO), and the team includes Sena (CPO with background in mathematics and psychology) and Christos (Lead Design Engineer with creative technology background). With approximately 200 customers primarily across Europe, Neople focuses heavily on e-commerce and customer service sectors, though they're expanding into other operational areas.

The core philosophy behind Neople is making AI accessible to non-technical users by creating a conversational, human-centered interface rather than exposing technical complexity. The product embodies this by personifying the AI as a character with a name, face, and personality, which customers can brand and customize. Many customers reportedly have life-size cutouts of their Neele in their offices, demonstrating the human connection the company aims to create.

## Evolution of the Product

Neople's journey represents a fascinating case study in how to build production AI systems as the underlying technology rapidly evolves. The team describes having to "reinvent themselves five times over" as large language models evolved. Their development can be broken into four major phases:

**Phase 1: Suggestions (approximately 2 years)** - When they started, chat models didn't exist and context windows were limited to about 2K tokens. The team initially tried cramming everything into one massive prompt, which didn't work reliably. They quickly realized they couldn't trust the AI to send responses directly to customers, so they built a system that provided suggestions to human customer service agents within contact centers and chat applications.
This allowed agents to see AI-generated response options while maintaining full control over what went to customers.

**Phase 2: Automated Responses** - As LLM capabilities improved, Neople added simple buttons to their suggestions interface: "copy message" and "send straight to customer." This small feature change revealed that customers were willing to let the AI handle repetitive workflows directly. The team then worked closely with alpha customers to validate quality by running the AI against historical tickets and getting feedback on whether customers would have been satisfied with automated responses. This led to full automation of straightforward customer service responses, though customers could configure topic-based guardrails to prevent automation of sensitive issues like legal questions or serious complaints.

**Phase 3: Actions Within Customer Service** - The team recognized that truly valuable automation required more than just writing responses: it needed to perform actions in external systems. For example, if a customer requests an address change, the AI needs to check if the order status allows changes, update the address in the order management system, and then confirm the change with the customer. This required extensive system integrations and agentic capabilities to execute multi-step workflows.

**Phase 4: Actions Beyond Customer Service** - Once the infrastructure for cross-system actions was built, customers naturally began using their Neeles for operational tasks beyond customer service: invoice reconciliation, meeting summaries, action item tracking, email drafting, and even marketing content creation (leveraging the carefully tuned tone of voice). One notable example involved a customer's entire marketing department using their Neele to draft blog posts and headers because the AI had been trained to embody the company's voice.

## Technical Architecture

### Core Response Generation Pipeline

The system follows a structured but flexible pipeline when handling customer service tickets:

**Data Gathering** - When a ticket arrives (initially just a ticket number), deterministic code collects all relevant metadata including tags, assignees, conversation history, and customer information. This step doesn't involve LLMs since it's a straightforward data collection task.

**Knowledge Retrieval (Agentic RAG)** - This is one of the most sophisticated components. An AI agent with a singular goal, finding the right documents to answer the question, has access to multiple search tools:

- Semantic search for conceptual matching
- Keyword search for exact term matching (especially useful for product SKUs)
- Hybrid search combining both approaches
- Document type selection (FAQ, product feeds, policies, etc.)

The agent iterates for up to five turns, reviewing its results and refining queries until satisfied or cut off. Crucially, Neople discovered that different document types require different embedding strategies. For FAQs, they embed only the question (not question+answer) to maximize matching accuracy. The system relies on confidence scores from the vector database to assess match quality. The agent decides which search strategy and document types to use based on the query context, demonstrating true agentic behavior within constrained parameters.

**External System Integration** - In parallel with knowledge retrieval, the system queries relevant external systems like order management systems, warehouse management systems, and HR software. For systems without clean APIs, Neople employs browser automation, giving the AI its own Chrome browser to navigate web interfaces. The team notes browser automation technology has gone through approximately 12-13 significant version improvements in the past six months.
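
The bounded retrieval loop from the Knowledge Retrieval step can be sketched roughly as follows. This is an illustration under stated assumptions, not Neople's implementation: `scripted_policy` is a hypothetical stand-in for the LLM that picks a tool and query each turn, `semantic_search` uses token overlap in place of real embeddings, and the SKU pattern, corpus, and `min_score` threshold are invented for the example. Only the five-turn cap and the keyword/semantic tool split come from the description above.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

MAX_TURNS = 5  # the source states the agent iterates for up to five turns
SKU_PATTERN = re.compile(r"\b[A-Z]{2,}-\d+\b")  # hypothetical SKU format


@dataclass(frozen=True)
class Doc:
    doc_type: str  # e.g. "faq", "product_feed", "policy"
    text: str


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def keyword_search(query: str, docs: list[Doc]) -> list[tuple[float, Doc]]:
    """Exact-term matching: a hit needs every query token in the document."""
    q = tokens(query)
    return [(1.0, d) for d in docs if q and q <= tokens(d.text)]


def semantic_search(query: str, docs: list[Doc]) -> list[tuple[float, Doc]]:
    """Toy stand-in for embedding similarity: Jaccard overlap of token sets."""
    q = tokens(query)
    scored = []
    for d in docs:
        t = tokens(d.text)
        overlap = len(q & t)
        if overlap:
            scored.append((overlap / len(q | t), d))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


TOOLS = {"keyword": keyword_search, "semantic": semantic_search}
Trace = list[tuple[str, str, int]]  # (tool, query, number of hits) per turn
Policy = Callable[[str, Trace], Optional[tuple[str, str]]]


def scripted_policy(question: str, trace: Trace) -> Optional[tuple[str, str]]:
    """Hypothetical planner; in production an LLM would choose tool and query."""
    tried = {tool for tool, _, _ in trace}
    sku = SKU_PATTERN.search(question)
    if sku and "keyword" not in tried:
        return "keyword", sku.group(0)
    if "semantic" not in tried:
        return "semantic", question
    return None  # nothing left to try


def retrieve(question: str, docs: list[Doc], policy: Policy, min_score: float = 0.3):
    """Run up to MAX_TURNS search turns; stop early once a confident hit appears."""
    trace: Trace = []
    for _ in range(MAX_TURNS):
        step = policy(question, trace)
        if step is None:
            break
        tool, query = step
        hits = TOOLS[tool](query, docs)
        trace.append((tool, query, len(hits)))
        good = [d for score, d in hits if score >= min_score]
        if good:
            return good, trace
    return [], trace


CORPUS = [
    Doc("product_feed", "SKU AB-1234 blue running shoe size 42"),
    Doc("faq", "How long does delivery take"),
    Doc("policy", "Returns are accepted within 30 days"),
]
```

Calling `retrieve("Is AB-1234 available in size 42?", CORPUS, scripted_policy)` resolves on the first turn through keyword search, while a question with no SKU falls through to the semantic tool. Returning the trace alongside the documents keeps each turn inspectable.
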

**Response Generation** - With all context gathered (ticket info, relevant knowledge, external system data), the system generates a response that may include policy references and information from various sources.

**Evaluation Pipeline** - This is where Neople invests heavily, with approximately half of all LLM calls dedicated to evaluation before any response goes to a customer. Key evaluations include:

- Refusal detection: Does the AI inappropriately say it can't help?
- Hallucination detection: Is every statement grounded in provided knowledge? This is critical because Neople explicitly doesn't want the LLM using its pre-trained knowledge, only information from the customer's knowledge base.
- Action implication checking: Does the response imply an action that wasn't actually performed?
- General quality and safety checks

If all evaluations pass, the response is sent automatically. If any fail, it's provided as an internal suggestion to human agents with context about what failed.

### RAG Implementation Details

Neople's RAG system demonstrates sophisticated understanding of retrieval challenges:

**Multi-strategy Search** - The system dynamically selects between semantic, keyword, and hybrid search based on query characteristics. Product lookups (with specific SKUs) benefit from keyword search, while conceptual questions work better with semantic search.

**Document Type Specialization** - Different document types are embedded and retrieved differently. FAQ questions are embedded separately from answers to maximize matching precision. Product feeds prioritize keyword matching over semantic similarity.
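
To illustrate why embedding only the FAQ question can help matching, here is a small sketch. It assumes a toy bag-of-words `embed` function in place of a real embedding model, and the record fields (`question`, `answer`, `text`) and the sample FAQ are hypothetical. With real embeddings the size of the effect will differ, but the mechanism is the same: answer text dilutes the vector that a customer's question is compared against.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def text_to_embed(doc_type: str, record: dict) -> str:
    """Choose which part of a record is embedded, depending on document type."""
    if doc_type == "faq":
        return record["question"]  # question only, as the source describes
    return record["text"]  # assumption: other types embed their full text


def index_entry(doc_type: str, record: dict) -> dict:
    # The full record stays in the payload, so the answer is still retrievable.
    return {"vector": embed(text_to_embed(doc_type, record)), "payload": record}


faq = {
    "question": "How do I return an item?",
    "answer": "You can send it back within 30 days using the prepaid label in your parcel.",
}
entry = index_entry("faq", faq)
query = embed("how can I return an item")

question_only = cosine(query, entry["vector"])
question_plus_answer = cosine(query, embed(faq["question"] + " " + faq["answer"]))
```

In this toy setup `question_only` scores higher than `question_plus_answer` for the same customer query, while the answer remains available through `entry["payload"]`.
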

**Knowledge Source Diversity** - The system ingests multiple knowledge sources:

- Customer websites (crawled automatically, providing reliable FAQ and product information)
- Uploaded documents and knowledge bases
- Product feeds (structured data)
- Internal policies and procedures

The team recognized early that website crawling was crucial because not all customers had well-organized internal knowledge bases, but e-commerce companies always maintain high-quality websites.

**Language Handling** - An early discovery was that semantic search fails when questions and source documents are in different languages, leading to additional handling for multilingual scenarios.

### Workflow and Action System

Beyond simple response generation, Neople enables customers to define complex workflows through a conversational interface:

**Template-Based Approach** - Neople provides templates for common use cases (order cancellations, address changes, refunds, etc.) that encode the general workflow structure. Customers select a template and then customize it conversationally.

**Conversational Workflow Building** - Non-technical users can explain workflows to their Neele as they would to a human colleague. The AI asks follow-up questions to understand:

- What systems are involved?
- What conditions trigger the workflow?
- What guardrails or business rules apply?
- What actions need to be performed and in what sequence?

**Incremental Complexity** - Neople guides customers to start with simple, specific workflows rather than trying to handle all edge cases immediately. For example, instead of automating all order cancellations (which might have 40+ edge cases), they start with the simplest scenario (order not yet shipped) and expand from there. This teaches customers valuable decomposition skills.
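
As a rough sketch of what the simplest cancellation scenario might look like once captured as data, the example below models a workflow as a trigger, a list of conditions, and an ordered list of actions. All names here (`Workflow`, `not_yet_shipped`, the order fields, the `handoff_to_human` fallback) are hypothetical; the source does not describe how Neople represents workflows internally.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Order = dict  # hypothetical order record, e.g. {"id": "1001", "status": "not_shipped"}


@dataclass
class Workflow:
    name: str
    trigger: str                                # intent that starts the workflow
    conditions: list[Callable[[Order], bool]]   # guardrails that must all hold
    actions: list[Callable[[Order], str]]       # executed in order

    def run(self, intent: str, order: Order) -> Optional[list[str]]:
        if intent != self.trigger:
            return None  # this workflow does not apply to the request
        if not all(condition(order) for condition in self.conditions):
            return ["handoff_to_human"]  # outside the automated scenario
        return [action(order) for action in self.actions]


def not_yet_shipped(order: Order) -> bool:
    return order["status"] == "not_shipped"


def cancel_order(order: Order) -> str:
    order["status"] = "cancelled"  # stands in for a call to an order system
    return f"cancelled order {order['id']}"


def confirm_to_customer(order: Order) -> str:
    return f"sent cancellation confirmation for order {order['id']}"


# Start with the simplest scenario only: the order has not shipped yet.
cancel_unshipped = Workflow(
    name="Cancel unshipped order",
    trigger="cancel_order",
    conditions=[not_yet_shipped],
    actions=[cancel_order, confirm_to_customer],
)
```

Every case outside the single supported condition falls back to a human, which mirrors the incremental approach: further edge cases would be added as extra workflows or conditions once the first one is trusted.
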

**Visual Workflow Representation** - The system presents workflows not as traditional node-based flow diagrams but as conversational "bubbles" showing triggers, conditions, rules, and actions in an accessible format for non-technical users.

## Evaluation and Quality Management

Neople's approach to evaluation is particularly sophisticated, operating at multiple levels:

**Production Inline Evaluation** - Before any automated response reaches an end customer, it passes through multiple LLM-based evaluations checking for hallucinations, refusals, action consistency, and other quality signals. This is feasible because email responses aren't latency-sensitive.

**Offline Evaluation** - The team maintains offline evaluation suites for critical components like the agentic RAG system and the evaluation system itself (meta-evaluation).

**Customer-Driven Evaluation** - Neople has architected their product so customers become part of the evaluation loop:

- End customers provide feedback through CSAT scores and ticket ratings
- Human agents provide implicit feedback through editing suggestions or choosing to send them
- Customers have full visibility into the AI's reasoning trace, allowing them to diagnose issues
- Metrics like first response rate and resolution time provide aggregate quality signals

**Transparent Reasoning** - Every response includes a complete trace showing:

- What knowledge sources were consulted and what was extracted
- What external systems were queried
- What evaluations were performed
- Why decisions were made at each step

This transparency enables non-technical customers to identify problems (like outdated knowledge) and fix them independently. For example, a customer in the entertainment business realized their Neele was using 2023 information because they hadn't updated their knowledge base after events changed.
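
The inline gate (send automatically only if every evaluation passes, otherwise hand the draft to a human with the failures attached) can be sketched as below. The three checks are deliberately naive, hypothetical heuristics standing in for the LLM-based evaluators the source describes, and the `Draft` and `Decision` types and the `update_address` action name are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    text: str
    knowledge: list[str]          # snippets retrieved for this ticket
    actions_performed: list[str]  # actions actually executed


@dataclass
class Decision:
    route: str               # "send" or "suggest_to_agent"
    failed: list[str]        # names of evaluations that did not pass
    trace: dict[str, bool]   # full per-evaluation record for transparency


def no_refusal(draft: Draft) -> bool:
    return not re.search(r"\bi (can't|cannot) help\b", draft.text.lower())


def numbers_grounded(draft: Draft) -> bool:
    """Naive grounding proxy: every number in the draft appears in the knowledge."""
    known = set(re.findall(r"\d+", " ".join(draft.knowledge)))
    return all(number in known for number in re.findall(r"\d+", draft.text))


def no_unperformed_action(draft: Draft) -> bool:
    claims_update = "has been updated" in draft.text.lower()
    return not claims_update or "update_address" in draft.actions_performed


EVALUATORS: dict[str, Callable[[Draft], bool]] = {
    "no_refusal": no_refusal,
    "numbers_grounded": numbers_grounded,
    "no_unperformed_action": no_unperformed_action,
}


def gate(draft: Draft, evaluators: dict[str, Callable[[Draft], bool]] = EVALUATORS) -> Decision:
    trace = {name: check(draft) for name, check in evaluators.items()}
    failed = [name for name, passed in trace.items() if not passed]
    route = "send" if not failed else "suggest_to_agent"
    return Decision(route=route, failed=failed, trace=trace)


POLICY = ["Returns are accepted within 30 days of delivery."]
ok = gate(Draft("You can return items within 30 days.", POLICY, []))
bad = gate(Draft("Your address has been updated. Returns are accepted within 60 days.", POLICY, []))
```

Here `ok.route` is `"send"`, while `bad` is routed to a human agent with both the ungrounded number and the unperformed address update listed in `bad.failed`. Keeping the full `trace` on the decision is what lets a non-technical reviewer see which check blocked the response.
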

**Iterative Quality Improvement** - When customers notice quality issues, they can immediately update knowledge sources, and the Neele applies new information to subsequent tickets. This creates a tight feedback loop for continuous improvement.

## Integration Strategy

Neople faced the seemingly infinite problem of integrating with every customer's unique tech stack. Their solutions include:

**Customer-Driven Integration Priorities** - Rather than dictating which tools they support, Neople follows the principle of "work where you work." They build integrations based on what customers actually use, letting demand drive prioritization.

**Integration Partnership** - To scale beyond their team's capacity, Neople partners with an external integration platform that provides access to additional systems.

**Browser Automation Fallback** - For systems without APIs or where integration isn't yet available, customers can use browser automation, allowing the AI to interact with web interfaces directly. While acknowledging browser automation has limitations, the team notes rapid improvement in this technology.

**Standard Protocol Support** - The system connects to common customer service platforms (contact centers), communication tools (Slack, Teams), and business systems (ERP, WMS, finance tools, HR systems).

## Human-AI Collaboration Design

Neople's approach to human-AI collaboration reflects sophisticated thinking about trust, control, and accessibility:

**Progressive Autonomy** - The system started with full human control (suggestions only) and progressively automated more as reliability improved and customer trust grew. Customers can still configure topic-based rules to ensure human oversight for sensitive issues.

**Approachable Personality** - By personifying the AI as a character with a name, face, and personality (customized per customer, often matching company branding), Neople reduces anxiety around AI adoption.
The team explicitly wanted to address fears about AI taking jobs by framing it as a helpful co-worker.

**Non-Technical Interface** - Unlike workflow builders requiring API knowledge or technical skills, Neople's conversational interface allows customer service managers, operations staff, and other non-technical users to configure and customize their AI co-workers without IT involvement.

**Controllability and Visibility** - Users can see exactly what their Neele knows, how it makes decisions, and why it produced specific outputs. They can update knowledge, modify workflows, and adjust guardrails conversationally.

**Template-Guided Discovery** - Templates provide scaffolding for common use cases while still allowing full customization, balancing ease of use with flexibility.

## Production Deployment Considerations

**Multi-Tenancy** - With approximately 200 customers, each with custom knowledge bases, workflows, and system integrations, Neople operates a complex multi-tenant architecture.

**Knowledge Base Per Customer** - Each customer has their own vector database with custom embedding strategies based on their document types and content.

**Reliability Requirements** - Customer service is mission-critical, requiring high availability and consistent quality. The extensive evaluation pipeline reflects this requirement.

**Scalability** - The system handles high-volume ticket flows, with some customers seeing thousands of repetitive queries daily. Automation rates in the 60-70% range for repetitive workflows significantly impact agent workload.

**Continuous Model Evolution** - The team regularly updates their approach as new LLM capabilities emerge. They've moved from 2K context windows to models with 100M+ tokens, from no chat models to sophisticated multi-agent systems, and from basic prompting to complex agentic workflows.

## Challenges and Learnings

**LLM Limitations in Early Days** - Initial limitations (small context windows, no chat models, reliability issues) forced the team to start conservatively with suggestions rather than automation.

**Oscillation Between LLM-First and Code-First** - The team went through a phase of disillusionment with LLMs and reverted to traditional code, before finding the right balance of deterministic code for known steps and AI for flexible reasoning.

**Hallucination Management** - Teaching LLMs to say "I don't know" when information isn't in the knowledge base (rather than using pre-trained knowledge) remains a core challenge, addressed through extensive evaluation.

**Customer Knowledge Quality** - System output quality depends heavily on input knowledge quality. By providing visibility into reasoning traces, Neople enables customers to identify and fix knowledge gaps themselves.

**Workflow Complexity** - Customers often want to automate complex processes with dozens of edge cases. Teaching decomposition (starting simple and expanding incrementally) has been crucial for success.

**Integration Breadth** - Every customer uses different tools, creating an infinite integration problem partially solved through partnerships and browser automation.

## Results and Impact

**Customer Metrics** - Customers report significant improvements in key metrics:

- First response rate improvements from 10% to 70%
- Reduced resolution times
- High CSAT scores maintained or improved

**Agent Experience** - Human agents can focus on complex, engaging work rather than repetitive questions, improving job satisfaction.

**Business Expansion** - What started as a customer service tool has expanded into finance (invoice reconciliation), operations (process automation), marketing (content creation), and general productivity (meeting notes, action tracking).

**Adoption Beyond Tech Stack** - One customer's entire marketing department adopted their Neele for content creation, demonstrating how well-tuned tone of voice creates value beyond the original use case.

**Customer Autonomy** - Non-technical teams can now configure AI automation independently, without waiting for technical resources, significantly accelerating deployment and iteration.

## Future Direction

Neople is focusing on enabling customers to expand AI usage throughout their organizations, moving beyond customer service into operational and analytical workflows. They're exploring how to better support complex workflows that may require human reasoning for certain steps while still automating the repetitive components. The team continues to incorporate new AI capabilities as they emerge, recently noting rapid improvements in browser automation and the potential of emerging technologies like Model Context Protocol (MCP) for integrations.

The case study exemplifies several LLMOps best practices: progressive deployment from suggestions to full automation, extensive inline evaluation, customer-driven quality feedback loops, transparent reasoning for debugging and trust, and careful balance between deterministic code and flexible AI components. Neople's journey also illustrates how production AI systems must continuously evolve as the underlying models improve, requiring teams to repeatedly rethink their architecture and capabilities.
