Wesco: Enterprise-Scale GenAI and Agentic AI Deployment in B2B Supply Chain Operations

Company

Wesco

Title

Enterprise-Scale GenAI and Agentic AI Deployment in B2B Supply Chain Operations

Industry

E-commerce

Link

https://www.youtube.com/watch?v=dohpC7DRWeI

Year

2025

Summary (short)

Wesco, a B2B supply chain and industrial distribution company, presents a comprehensive case study on deploying enterprise-grade AI applications at scale, moving from POC to production. The company faced challenges in transitioning from traditional predictive analytics to cognitive intelligence using generative AI and agentic systems. Their solution involved building a composable AI platform with proper governance, MLOps/LLMOps pipelines, and multi-agent architectures for use cases ranging from document processing and knowledge retrieval to fraud detection and inventory management. Results include deployment of 50+ use cases, significant improvements in employee productivity through "everyday AI" applications, and quantifiable ROI through transformational AI initiatives in supply chain optimization, with emphasis on proper observability, compliance, and change management to drive adoption.

## Overview Wesco, a B2B supply chain and industrial distribution company serving over 50 countries, presents an extensive case study on deploying enterprise-grade AI and generative AI applications at production scale. Arjun Srinivasan, the Director of Data Science at Wesco, shares the organization's journey from traditional business intelligence and predictive analytics to cognitive AI systems powered by LLMs and agentic frameworks. The presentation emphasizes practical challenges in moving beyond proof-of-concepts to realize tangible ROI and enterprise value. The company's evolution follows a maturity curve from "insight" (retrospective analytics) to "foresight" (predictive modeling) and finally to "action" (cognitive intelligence with reasoning and autonomous actions). This progression mirrors the industry-wide shift post-ChatGPT toward generative AI adoption and more recently toward agentic AI systems that can take autonomous actions with minimal human intervention. ## Strategic AI Roadmap and Maturity Stages Wesco's AI roadmap is structured around three key stages, though the speaker emphasizes these are not strictly linear and can be intertwined: **Foundation Stage - Awareness and Governance**: The company began by establishing robust governance frameworks for both data and AI, implementing proper stakeholder management, and creating guardrails for experimentation. They adopted a hybrid approach combining build and buy strategies, recognizing that not every AI capability needs to be built in-house. A critical element was creating an air-gapped sandbox environment that allows rapid experimentation while maintaining security and compliance. **Operational Stage - Everyday AI**: This phase focuses on low-effort, medium-to-high-impact use cases that improve employee productivity and efficiency. The "everyday AI" umbrella includes knowledge search and retrieval systems, document processing using LLM-based approaches (transitioning from traditional OCR), text summarization for meetings and emails, and coding assistants. The company reports that software development teams are experiencing significant productivity gains from AI-assisted coding, aligning with industry trends where companies like Google and Microsoft report over 25% of their code being AI-generated or AI-assisted. **Transformational Stage - High ROI Use Cases**: The final stage involves AI becoming core to the company's DNA through high-effort, high-ROI initiatives. These include synthetic data generation for model training, advanced recommendation systems (moving from traditional ML to LLM-enhanced approaches), and inventory forecasting for demand planning. These use cases directly impact quantifiable metrics like revenue improvement, margin enhancement, and cost reduction. Throughout this journey, Wesco emphasizes three critical pillars: people (workforce planning and upskilling), process (change management and business champion programs), and technology (tech stack refresh and platform development). ## Production Use Cases and LLMOps Implementation **Everyday AI Applications in Production:** Wesco has deployed numerous generative AI applications focused on operational efficiency. Their content creation and personalization capabilities serve marketing, sales, IT, and digital teams with automated generation of marketing materials, blog posts, social media content, and email campaigns. The company measures success not just through technical metrics like BLEU scores but through business KPIs such as click-through rates, user engagement improvements, and time/cost savings measured through A/B testing pre- and post-deployment. Persona-based chatbots and agents have been deployed across individual business functions, enabling function-specific knowledge access and task automation. Data enrichment workflows use AI assistants to populate product information management (PIM) systems with accurate weights, dimensions, and product specifications, directly improving e-commerce data quality. The intelligent document processing system represents a significant production deployment, handling unstructured documents including contracts, bills of material, purchase orders, RFPs, and RFQs. This capability is transitioning from a point solution to an enterprise-wide product, demonstrating the company's approach to scaling successful AI implementations. Language translation and localization services leverage generative AI with heavy human-in-the-loop oversight and reinforcement learning to improve translation quality and speed. This is critical for a company serving 50+ countries, covering both web content and product catalogs. An "AI for BI" initiative is underway to transform static business intelligence into dynamic, AI-powered insight generation with automated data analysis capabilities. **Transformational AI Applications:** On the high-value end, Wesco is implementing agentic AI systems within the supply chain domain for inventory management, representing a significant investment in autonomous decision-making capabilities. Recommendation systems for both products and pricing leverage traditional ML models as the foundation ("the cake") with LLM-based serving layers as "icing on the cake," demonstrating the company's pragmatic view that sound ML principles remain essential even in the era of generative AI. The company is also working on simulation and optimization capabilities built on operations research principles combined with data-driven decision making, representing the convergence of classical optimization with modern AI techniques. ## Multi-Agent Architecture: Fraud Detection Case Study Wesco provides a detailed technical example of their multi-agent AI system for fraud detection in accounts payable, built using LangGraph state graphs. This system demonstrates sophisticated LLMOps practices including agent orchestration, memory management, and explainability. The architecture consists of a planner and executor agent that coordinate five specialized agents: **Receipt Extractor Agent**: Uses combined OCR and LLM capabilities with RAG (retrieval-augmented generation) and in-memory caching to extract information from invoices and receipts. The in-memory cache enables comparison across submissions from the same vendor or over time periods to identify anomalies. **Entity Resolver Agent**: Handles deduplication through business rule-based logic or ML/LLM approaches, performing normalization to ensure consistent entity representation across the system. **Anomaly Detection Agent**: Employs methods ranging from traditional ML (isolation forest algorithms) to sophisticated LLM-based anomaly detection, identifying potential fraudulent transactions or duplicates based on company-specific policy rules. **Decision-Making Agent**: Uses router logic to determine whether transactions should be auto-approved or escalated to human reviewers (users or auditors). This agent represents the critical handoff point between autonomous and human-supervised decision-making. **Investigator Agent**: Provides explainability through long-term memory stores and chain-of-thought reasoning. When auditors query why certain decisions were made (days, weeks, or months later), this agent retrieves the decision trail from structured JSON content stored in the backend, enabling full auditability. The system is designed with a "North Star" goal of progressively minimizing human-in-the-loop requirements as the models mature through reinforcement learning from auditor feedback. This demonstrates the company's thoughtful approach to autonomous AI that balances efficiency gains with appropriate oversight and explainability requirements. ## LLMOps Infrastructure and Technical Stack **Platform Architecture:** Wesco has built a proprietary composable AI platform with YAML-based configuration files that enable rapid deployment of chatbots and agents for both internal and external customers. This platform approach allows the company to standardize LLMOps practices while enabling customization for specific use cases. The composable design suggests a microservices-oriented architecture where components can be mixed and matched based on use case requirements. The platform sits atop mature data and analytics layers, following a clear architectural hierarchy: data management and analytics as foundation, traditional ML as the next layer, and generative AI/agentic AI as higher abstraction layers. This reflects an understanding that LLM applications require solid data infrastructure to be effective. **MLOps and LLMOps Practices:** The company has implemented comprehensive MLOps pipelines extended to support LLMOps requirements. Key components include: - **Prompt Engineering and Management**: Wesco maintains prompt banks with curated prompt templates for each use case, enabling systematic prompt engineering practices with version control and iterative refinement. - **Model Management**: Existing ML model management capabilities (tracking experiments, versioning, deployment) have been extended to support generative AI and agentic models. This includes managing both fine-tuned models and integrating third-party LLMs. - **CI/CD/CT Pipelines**: The company implements continuous integration, continuous deployment, and critically, continuous training (CT) pipelines. The CT component is essential for LLMOps as models need retraining or fine-tuning as data distributions shift or new patterns emerge. - **Hybrid Model Strategy**: Wesco employs a multi-sourced approach including open-source models accessed through hyperscaler partners, managed LLM services from commercial providers, and self-hosted models. This flexibility allows optimization for cost, performance, and data sensitivity requirements across different use cases. - **Fine-Tuning Approaches**: The company is building domain-specific and specialized language models using LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) techniques. This allows them to create models tuned to supply chain and industrial distribution contexts without the computational expense of full model retraining. The focus on "right-sized models" demonstrates attention to cost-to-serve economics at scale. **Observability and Monitoring:** Wesco emphasizes multi-layered observability as critical for production LLM systems: - **Infrastructure Monitoring**: Traditional monitoring of compute resources, latency, and availability. - **Data Observability**: Tracking data quality, completeness, and drift in input data feeding LLM applications. - **Model Observability**: For traditional ML models, monitoring both data drift and model drift. For LLMs, this extends to tracking prompt-response patterns, token usage, and output quality metrics. - **LLM/Agent-Specific Observability**: Fine-grained tracing of agent behaviors, decision paths, and reasoning chains. The company specifically mentions using LangFuse, an open-source observability tool, to trace agentic AI decisions with PostgreSQL backend for storing trace data. This enables auditors to investigate decisions made autonomously by agents days, weeks, or months after the fact. - **Human-in-the-Loop Feedback**: Reinforcement learning from human feedback (RLHF) is systematically incorporated, with feedback loops informing model improvements and prompt refinements. The monitoring infrastructure includes tracking technical metrics like BLEU scores for text generation tasks, but the emphasis is on translating these to business KPIs such as click-through rates, engagement metrics, time savings, and cost reductions. ## Enterprise Governance Framework Wesco has implemented a comprehensive six-pillar governance framework specifically designed for enterprise AI at scale: **Use Case Prioritization and Intake**: A formal intake process driven by business stakeholders ensures AI initiatives align with strategic priorities. The company uses an effort-impact matrix to prioritize use cases, categorizing them into exploration (low effort, low impact for learning), personalization (low effort, high impact for quick wins), foundational capabilities (high effort, low immediate impact but enabling future capabilities), and transformational opportunities (high effort, high impact with quantifiable ROI). **Technology Stack and Architecture**: Beyond selecting tools, the governance framework includes involvement from enterprise architecture and cloud center of excellence (CoE) teams. Reference architectures provide templates for common patterns (RAG systems, agent frameworks, etc.), ensuring consistency and reducing time-to-deployment for new use cases. **Risk Management and Security**: Enterprise security is baked in by design with the platform running within corporate VPN, implementing role-based access control (RBAC) and attribute-based access control (ABAC) to ensure appropriate user access. The company has implemented guardrails for both prompts (preventing prompt injection, ensuring appropriate queries) and actions (constraining what autonomous agents can do). For highly regulated use cases, the governance framework tracks evolving standards including NIST AI Risk Management Framework, ISO 42001 (AI management systems), and EU AI Act requirements for European customers. **Steering Committees**: Oversight bodies review both pro-code AI (custom-built models and systems) and low-code AI (citizen developer tools) to ensure alignment with governance policies and strategic objectives. **Third-Party Risk Assessment**: Recognizing that not all AI needs to be built internally, Wesco has established processes for evaluating and onboarding third-party AI SaaS solutions. This is particularly important for customer-facing companies that need to maintain healthy vendor ecosystems while ensuring security and compliance. **Strategic Partnerships and External Engagement**: The company maintains close collaboration with hyperscalers (AWS, Azure, GCP), data platform providers, and specialized AI vendors. Active participation in industry conferences and external engagement helps the organization stay current with emerging best practices and technologies. **Workforce Development**: Comprehensive AI literacy programs span the organization, targeting both technical teams building AI capabilities and business users consuming them. Training is delivered through vendor partner networks and internal programs, ensuring broad understanding of AI capabilities and limitations. ## ROI Measurement and Business Value Translation A significant portion of Wesco's LLMOps maturity involves systematically translating technical metrics into business value. The speaker emphasizes that data scientists and AI engineers naturally gravitate toward technical metrics (accuracy, error rates, BLEU scores, etc.), but enterprise success requires mapping these to business KPIs. **ROI Categories:** - **Tangible, Direct ROI**: Revenue improvement, margin enhancement, cost reduction. Examples include inventory forecasting that reduces carrying costs, pricing optimization that improves margins, and sales forecasting that enables better resource allocation. - **Efficiency and Productivity**: Team-level force multiplier effects where AI doesn't just improve individual productivity but enhances knowledge transfer and sharing across teams. Previously siloed knowledge embedded in experienced employees' expertise becomes accessible through chatbots and agents trained on enterprise knowledge bases. - **Time-to-Value**: Reducing time to deliver customer value, which improves customer experience and creates competitive differentiation within the industry vertical. - **Cost Avoidance**: Preventing errors, reducing rework, and automating manual processes that would otherwise require additional headcount. The company conducts A/B testing to measure impact, comparing pre-deployment baselines with post-deployment metrics. For the marketing content generation example, they track not just technical text quality scores but downstream business metrics like click-through rates and user engagement. ## Scaling Challenges and Lessons Learned **From POC to Production Barriers:** Wesco identifies several critical challenges in scaling AI from proof-of-concept to production systems: - **Software Engineering Rigor**: Moving from notebook-based experimentation to production-grade software with proper testing, versioning, and deployment pipelines. - **Data Quality and Access**: Ensuring sufficient, clean, contextual data to feed AI systems, with proper data governance and lineage tracking. - **Stakeholder Communication**: Bridging the gap between technical teams and business stakeholders, maintaining alignment throughout development and deployment. - **Architecture for Scale**: Reference architectures and infrastructure that can handle production loads, with considerations for latency, throughput, and cost at scale. - **Change Management**: Perhaps the most underestimated challenge - driving actual adoption of AI capabilities by end users. Technical excellence means nothing if users don't adopt the tools or change their workflows. **Scaling Strategies:** The company's approach to scaling includes creating the air-gapped sandbox for rapid iteration while maintaining strict gates for promotion to production. This allows experimentation velocity without compromising production stability or security. Integration is emphasized as critical - AI applications that exist in silos provide limited value. For business process automation workflows, AI needs to integrate with source systems (ERP, CRM) and target systems where decisions are actioned. The composable platform architecture facilitates these integrations through standardized interfaces. Compliance and trust are designed in from the start rather than bolted on later. This includes responsible AI guardrails, region-specific policy adherence for international operations, and comprehensive model risk management practices. ## Future Directions and Emerging Trends Wesco is actively preparing for several emerging trends in the LLMOps landscape: **Quantum Computing**: While still largely in R&D, quantum breakthroughs promise exponentially faster training and inference, which would fundamentally change the economics of running large-scale AI systems. **Domain-Specific Models**: The company is investing in specialized language models tailored to supply chain and industrial distribution contexts. This addresses the limitation of general-purpose models (GPT-4, Claude, Gemini) that lack enterprise-specific context and terminology. **No-Code/Low-Code AI Platforms**: As AI becomes increasingly abstracted, Wesco expects more capability to be placed in the hands of non-technical users. This democratization requires appropriate guardrails and governance but can accelerate value realization. **Reinforcement Learning Maturity**: RLHF and broader reinforcement learning approaches are becoming more mainstream, enabling systems that improve continuously from user interactions and feedback. ## Critical Success Factors The presentation concludes with four essential principles for enterprise LLMOps success: **Strategic Alignment**: Every AI initiative must tightly align with broader business strategy to maximize ROI and impact. This prevents the common pitfall of "AI for AI's sake" and ensures resources focus on high-value opportunities. **Iterative Implementation**: Recognizing that initial deployments represent a "cold start" that requires continuous learning, feedback integration, and refinement. The maturity of AI systems grows over time through reinforcement learning and systematic improvement processes. **Measure What Matters**: Focusing on KPIs and metrics that directly reflect business value ensures monitoring and evaluation efforts are effective and actionable rather than vanity metrics that don't drive decisions. **Change Management**: Technical capability without user adoption equals zero business value. Comprehensive change management programs including awareness campaigns, business champions, regular office hours, and incentive structures are essential for realizing AI benefits. ## Assessment and Balanced Perspective While Wesco's presentation showcases impressive breadth and sophistication in their LLMOps practices, several considerations merit attention: The case study represents aspirational best practices from a well-resourced enterprise with mature data infrastructure. Smaller organizations may find the comprehensive governance framework and multi-layered architecture challenging to replicate without similar resources. The presentation emphasizes successes and lessons learned but provides limited detail on specific failures, deployment timelines, or quantitative ROI figures for most use cases. The fraud detection multi-agent example is architecturally interesting but lacks discussion of accuracy rates, false positive/negative trade-offs, or actual cost savings realized. The claim of "50+ use cases" deployed deserves context - the distinction between production-grade systems serving critical business processes versus experimental pilots with limited user adoption is unclear. The emphasis on "everyday AI" for productivity suggests many use cases may fall into the latter category. The company's pragmatic approach of building a composable platform rather than bespoke solutions for each use case is sound engineering practice. However, the YAML-based configuration approach for rapidly spinning up chatbots and agents may abstract away important customization needs for complex use cases, potentially creating limitations as requirements evolve. The focus on domain-specific model development through fine-tuning is well-justified but resource-intensive. The balance between fine-tuning efforts and prompt engineering with general-purpose models isn't fully explored, though recent research suggests prompt engineering can often achieve comparable results at lower cost. Overall, Wesco's case study represents a mature, thoughtful approach to enterprise LLMOps with appropriate emphasis on governance, business value, and change management alongside technical implementation. The multi-agent fraud detection example and observability infrastructure demonstrate sophisticated technical capability. However, prospective practitioners should recognize this represents an ideal end-state that requires significant organizational maturity, resources, and time to achieve rather than a quick path to production LLM systems.

Start deploying reproducible AI workflows today