ADP, a major HR and payroll services provider, is developing ADP Assist, a generative AI initiative to make their platforms more interactive and user-friendly while maintaining security and quality. They're implementing a comprehensive AI strategy through their "One AI" and "One Data" platforms, partnering with Databricks to address key challenges in quality assurance, IP protection, data structuring, and cost control. The solution employs RAG and various MLOps tools to ensure reliable, secure, and cost-effective AI deployment across their global operations serving over 41 million workers.
ADP, one of the world’s largest Human Capital Management (HCM) companies, presented their generative AI strategy at a Databricks event. ADP provides HR, payroll, benefits, talent management, and time tracking solutions to over one million clients across 140 countries, with their systems processing payments for more than 41 million global workers. Notably, one in six US workers is employed by a company using ADP systems, giving the company access to an extraordinarily large and valuable dataset about the world of work.
The presentation was delivered by Fernando Schwarz, VP of AI at ADP and head of the Core AI team, who discussed how the company is leveraging Databricks as a strategic partner in their generative AI initiatives. It’s important to note that this presentation represents ADP’s ongoing journey rather than a completed project with measured results—the company acknowledges they are “still sort of working on this” and are in the process of scaling their capabilities.
The flagship generative AI product being developed is called “ADP Assist,” designed to make ADP’s platforms more interactive and user-friendly. The company’s guiding philosophy follows a pyramid framework with three tiers: Easy (baseline usability), Smart (intelligent features), and Human (natural, human-like interactions). The goal is to progress through these tiers, though Schwarz acknowledges they “haven’t made it human yet but it’s getting there.”
Key objectives for ADP Assist follow that pyramid framework: making interactions easy first, then smart, and ultimately human.
Schwarz outlined four major challenges that ADP faces in deploying generative AI at enterprise scale, which represent common concerns in the LLMOps space:
Given that ADP’s products deal with critical areas like payroll and tax advice, ensuring high-quality, accurate responses is paramount. Schwarz explicitly referenced recent media coverage of companies deploying ChatGPT-like tools that have made headlines “for the wrong reasons” due to hallucinations and inaccurate outputs. The company is deeply focused on implementing proper guardrails to prevent such issues. They are working with Databricks’ Mosaic team specifically on quality measurement and achievement, though specific metrics or methodologies were not detailed in the presentation.
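ADP did not detail its guardrails technically, but a common baseline for this kind of system is to validate a model’s answer against the source passages it was generated from before releasing it to the user. The following is a minimal sketch of that idea, not ADP’s implementation; the term-overlap heuristic, threshold, and refusal message are all illustrative assumptions (production systems typically use an LLM-based or NLI-based groundedness check instead).

```python
# Minimal output guardrail: release an answer only if enough of its
# content is supported by the retrieved source passages; otherwise refuse.
# The overlap heuristic and 0.75 threshold are illustrative assumptions.

REFUSAL = "I can't verify that answer against our documentation."

def support_ratio(answer: str, sources: list[str]) -> float:
    """Fraction of substantive answer terms found in at least one source."""
    terms = [t for t in answer.lower().split() if len(t) > 3]
    if not terms:
        return 0.0
    source_text = " ".join(sources).lower()
    return sum(t in source_text for t in terms) / len(terms)

def guarded_answer(answer: str, sources: list[str],
                   threshold: float = 0.75) -> str:
    """Pass the model's answer through only if sufficiently grounded."""
    return answer if support_ratio(answer, sources) >= threshold else REFUSAL

sources = ["W-2 forms are mailed to employees by January 31 each year."]
grounded = guarded_answer("W-2 forms are mailed by January 31.", sources)
ungrounded = guarded_answer("W-2 forms are mailed by March 15.", sources)
```

For payroll and tax content, a refusal is generally preferable to an unverifiable answer, which is why the check fails closed.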
With a million clients trusting ADP with their data, IP protection is a critical concern. The company must navigate the complex challenge of exposing data to AI systems while ensuring there are no IP issues. This is particularly challenging in generative AI contexts where training data and retrieved content could potentially leak or be misattributed.
ADP is building RAG (Retrieval Augmented Generation) systems that require accessing and structuring unstructured content. The challenge involves formatting data appropriately, creating the right datasets, and structuring information in ways that LLMs can effectively utilize to produce quality results. This represents a significant data engineering challenge at ADP’s scale.
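The RAG pattern described here can be sketched in a few lines: chunk unstructured documents, retrieve the chunks most relevant to a query, and assemble a prompt grounded in that context. This is a minimal illustration rather than ADP’s pipeline; the chunk size, the keyword-overlap scoring (standing in for embedding similarity against a vector index), and the prompt template are all assumptions.

```python
# Minimal RAG sketch: chunk documents, retrieve by term overlap, and build
# a grounded prompt. Keyword overlap stands in for embedding similarity
# purely for illustration; real systems use a vector index.

def chunk(text: str, size: int = 40) -> list[str]:
    """Split unstructured text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by the number of terms shared with the query."""
    terms = set(query.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(terms & set(c.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Constrain the LLM to retrieved context to reduce hallucination."""
    joined = "\n---\n".join(context)
    return (f"Answer using ONLY the context below.\n\n"
            f"Context:\n{joined}\n\nQuestion: {query}")

docs = ["Payroll is processed on the last business day of each month.",
        "Tax forms are distributed to employees in January."]
all_chunks = [c for doc in docs for c in chunk(doc)]
prompt = build_prompt("When is payroll processed?",
                      retrieve("When is payroll processed?", all_chunks))
```

The data engineering challenge ADP describes lives mostly in the first step: deciding how to segment, format, and enrich unstructured content so that the retrieved chunks are actually useful to the model.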
Cost management has become increasingly important as the company moves from proof-of-concept to production. Schwarz candidly noted that about a year prior to the presentation, executives were more focused on proving capabilities worked, with cost being a secondary concern. Now that generative AI has proven viable, the organization is taking “deep breaths” when seeing bills and focusing on cost containment. The natural progression has been from enthusiasm and POC development to making implementations financially viable.
ADP’s AI strategy centers on building two foundational platforms: “One Data” and “One AI.”
“One Data,” the central data platform, leverages Databricks capabilities extensively.
“One AI,” the central AI platform, includes multiple components, many tied to Databricks capabilities.
One notable advantage Schwarz highlighted is that enterprise platforms like Databricks help solve the painful permissioning challenges that plague enterprise AI deployments. Having a unified platform that handles permissions makes scaling significantly easier.
Governance is described as “top of mind” for ADP’s clients and leadership. A key concern is preventing “rogue teams building their own thing” that becomes difficult to manage. Unity Catalog is specifically mentioned as providing a solution for governing vector indexes and ensuring consistent oversight across the organization.
The company is also leveraging the Mosaic team at Databricks for fine-tuning capabilities. This includes the potential to fine-tune open-source models in-house, though Schwarz noted they are “just getting started” with this approach.
The transition from experimentation to production has brought cost optimization to the forefront. ADP’s approach to controlling costs centers on moving beyond expensive commercial APIs toward fine-tuned, smaller, and self-hosted models where the use case allows.
This represents a common maturation pattern in enterprise LLMOps: initial experimentation with powerful but expensive commercial APIs, followed by optimization through fine-tuned, smaller, or self-hosted models once value is proven.
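One common way to operationalize this maturation pattern is a model router: simple requests go to a cheap fine-tuned model, and only hard ones escalate to a frontier API. The sketch below is illustrative only; the model names, per-token prices, and the word-count complexity heuristic are invented for the example (real routers often use a trained classifier or a confidence signal).

```python
# Toy cost-aware router: send simple queries to a cheap model and escalate
# complex ones. Model names and prices are illustrative, not real rates.

PRICES_PER_1K_TOKENS = {"small-finetuned": 0.0002, "frontier-api": 0.01}

def complexity(query: str) -> int:
    """Crude proxy: longer, multi-clause questions count as harder."""
    return (len(query.split())
            + 5 * query.count("?")
            + 5 * query.count(" and "))

def route(query: str, threshold: int = 25) -> str:
    """Pick the cheapest model expected to handle the query."""
    return "small-finetuned" if complexity(query) < threshold else "frontier-api"

def estimated_cost(query: str, output_tokens: int = 200) -> float:
    """Rough per-request cost estimate under the chosen model."""
    tokens = len(query.split()) + output_tokens
    return tokens / 1000 * PRICES_PER_1K_TOKENS[route(query)]

simple = route("When is payday?")
hard = route("Compare state and federal withholding rules and explain "
             "how each applies to remote employees?")
```

The economic point is that most traffic in an assistant like ADP Assist is likely to be simple, so routing the bulk of requests to a small model can cut spend substantially while reserving frontier-model quality for the queries that need it.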
Beyond internal operations, ADP has a broader data strategy built on four pillars, among them the commercialization of aggregated, anonymized workforce data.
The commercial data angle is particularly interesting—ADP sees significant value in the aggregated, anonymized patterns they can derive from processing payments and HR data for such a large portion of the workforce.
While ADP’s presentation provides valuable insights into enterprise LLMOps challenges and strategies, it’s important to note several limitations:
The presentation is primarily forward-looking and strategic rather than retrospective. No concrete metrics, success rates, or quantified results were shared. The initiatives described are works in progress, with phrases like “still sort of working on this” and “just getting started” appearing throughout.
The presentation also serves as a Databricks partner testimonial, so the framing naturally emphasizes the value of Databricks tools. While the use cases and challenges described appear genuine, the specific tool recommendations should be evaluated in that context.
Additionally, the quality assurance and hallucination prevention approaches mentioned are acknowledged as priorities but not detailed technically. For an organization dealing with payroll and tax advice, the specific guardrails and validation mechanisms would be critical implementation details not covered here.
ADP has established a Core AI team led by Fernando Schwarz (VP of AI) that is building the One AI platform. This team operates as a Center of Excellence, partnering with various business units across the organization. The team is actively scaling and hiring, indicating significant organizational investment in these capabilities.
The approach of having a central AI platform team that partners with business units while maintaining governance and shared infrastructure represents a common and generally effective enterprise AI organizational pattern, balancing innovation velocity with control and consistency.