Company
Nvidia
Title
Scaling AI Development with DGX Cloud: ServiceNow and SLB Production Deployments
Industry
Tech
Year
2025
Summary (short)
ServiceNow and SLB (formerly Schlumberger) leveraged Nvidia DGX Cloud on AWS to develop and deploy foundation models for their respective industries. ServiceNow focused on building efficient small language models (5B-15B parameters) for enterprise process automation and agentic systems that match frontier model performance at a fraction of the cost and size, achieving nearly 100% GPU utilization through Run:ai orchestration. SLB developed domain-specific multi-modal foundation models for seismic and petrophysical data to assist geoscientists and engineers in the energy sector, accelerating time-to-market for two major product releases over two years. Both organizations benefited from the fully optimized, turnkey infrastructure stack combining high-performance GPUs, networking, Lustre storage, EKS optimization, and enterprise-grade support, enabling them to focus on model development rather than infrastructure management while achieving zero or near-zero downtime.
## Overview

This case study presents two distinct production deployments of large language models and foundation models using Nvidia DGX Cloud on AWS infrastructure. The session features presentations from ServiceNow and SLB (formerly Schlumberger), representing the enterprise technology and energy sectors, respectively. Both organizations demonstrate sophisticated approaches to building, training, and deploying foundation models at scale while addressing different business challenges and technical requirements.

The case study is particularly valuable because it provides insights into how two very different industries—enterprise workflow automation and energy exploration—leverage similar infrastructure and tooling to solve fundamentally different problems. ServiceNow focuses on building efficient small language models for text-based enterprise automation, while SLB develops multi-modal models for scientific data synthesis in geophysics and petroleum engineering.

## ServiceNow: Efficient Small Language Models for Enterprise Automation

ServiceNow's Foundation Model Lab is led by a principal scientist responsible for mid-training and post-training of models. The organization's primary goal is to deliver what they call "frontier-level reasoning performance on a single GPU scale": small models (5B-15B parameters) that perform comparably to much larger frontier models like GPT or Claude.

The business case for this approach is compelling. ServiceNow hosts enterprise data across ITSM, ITOM, ITAM, CSM, HR, legal compliance, and other domains. Their platform serves numerous processes that benefit from automation, and using frontier models for every task would be prohibitively expensive. By building efficient smaller models, they enable customers to reserve expensive frontier model API calls for only the most complex use cases while defaulting to ServiceNow's models for the majority of workloads.

ServiceNow has released three open-source models in their Apriel family: a 5 billion parameter model, a 15 billion parameter reasoning model, and an upgraded 15 billion parameter multimodal model that consumes images and text and outputs text. The speaker emphasized that their 15B parameter model performs on par with models 30 times larger, comparing favorably to DeepSeek R1 (600B parameters), Qwen (235B parameters), and Gemini Flash on benchmarks. The models have received approximately 100,000 downloads, indicating strong community adoption.

The open-source strategy is deliberate: ServiceNow recognizes that the model alone cannot deliver value without agentic orchestrators, tools, and data. By open-sourcing the models, they increase awareness while maintaining competitive advantage through their integrated platform capabilities. The models support agentic systems that can call appropriate tools and functions in the right sequence, inspired by concepts like Anthropic's Model Context Protocol (MCP).
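To make the agentic pattern concrete, below is a minimal sketch of the kind of tool-calling loop such scaffolding implements. Everything in it (the tool names, the fixed plan) is hypothetical; in a real system the model chooses each call step by step, for example via MCP, and nothing here reflects ServiceNow's actual orchestrator.

```python
# Minimal tool-calling loop illustrating agentic scaffolding.
# All tool names and the fixed plan are hypothetical; a production
# agent would have the model choose each call (e.g. via MCP) rather
# than follow a hard-coded plan.
from typing import Callable

def lookup_incident(incident_id: str) -> str:
    """Hypothetical tool: fetch an ITSM incident record."""
    return f"Incident {incident_id}: VPN outage, priority P2"

def notify_owner(summary: str) -> str:
    """Hypothetical tool: message the assignment group."""
    return f"Notified owner: {summary!r}"

# Registry the orchestrator exposes to the model as callable tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_incident": lookup_incident,
    "notify_owner": notify_owner,
}

def run_agent(plan: list[tuple[str, str]]) -> list[str]:
    """Execute tool calls in sequence, collecting each result."""
    transcript = []
    for tool_name, argument in plan:
        transcript.append(TOOLS[tool_name](argument))
    return transcript

if __name__ == "__main__":
    steps = [("lookup_incident", "INC0012345"),
             ("notify_owner", "VPN outage under investigation")]
    for line in run_agent(steps):
        print(line)
```

The value ServiceNow describes lies in the pieces around this loop: the policies, triggers, and guardrails that customers layer onto the same scaffolding.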
From an LLMOps perspective, ServiceNow's approach addresses two critical challenges: model capability and cost efficiency. They design custom post-training recipes inspired by their use cases, though their fundamental training approach doesn't differ significantly from other open models. The emphasis on building models that run efficiently on single GPUs reflects a production-first mindset focused on deployment economics and scalability.

Regarding customization, ServiceNow primarily sees customers succeeding with out-of-the-box agentic scaffolding rather than model fine-tuning. The speaker noted that most enterprise processes (flight booking, food ordering, etc.) are fundamentally similar across customers, with differences emerging in policies, triggers, and tools rather than core model capabilities. ServiceNow allows customers to customize agentic scaffolding, define custom policies, and implement guardrails. While they remain open to providing custom models if demand emerges, they find this is generally unnecessary for most use cases, suggesting that proper orchestration and RAG approaches often suffice for customization needs.

## SLB: Domain-Specific Foundation Models for Energy

SLB's approach represents a different LLMOps challenge altogether. As the business line director for their internal data and AI platform explains, SLB is a global technology company in the energy industry operating in over 120 countries, dealing with both traditional oil and gas production and new energy solutions. Their business is heavily science and engineering focused, with specialist scientists and engineers across different divisions.

The core challenge is that SLB acquires, at great expense, vast amounts of specialized data—seismic data, geophysical data, petrophysical data from well bores—that characterizes subsurface formations. Geologists and geophysicists use this data to build models of underground structures, which carry inherent uncertainty. For 30-40 years, SLB has used complex deterministic physics and mathematics to predict subsurface structure and simulate fluid flow in porous media. More recently, they've incorporated machine learning and data-driven techniques, and now they're leveraging foundation models.

However, SLB's use case is fundamentally different from text-to-text generation. They need foundation models for seismic and time series data, and for petrophysical data that can predict scientific responses as if the data had been acquired directly. These are multi-modal models handling text-to-images, images-to-text, and ultimately generating complex models of the earth. The speaker emphasized this is "almost a creative process" they're trying to support among scientists, not a transactional workflow.

SLB has released two major generative AI products in the last two years: Generative AI and Energene Assistant. These leverage domain foundation models that sit alongside their existing portfolio of physics-based simulators, algorithms, and traditional machine learning models for tasks like automating fault and horizon interpretation (which can take months). The foundation models can reconstruct missing data in acquisitions within certain constraints, work that would normally take humans days, weeks, or even months to complete.
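As a concrete illustration of the class of technique behind such reconstruction, the sketch below runs a single masked-reconstruction training step on a synthetic 2D array standing in for a seismic section. SLB's actual architectures and data are proprietary; the toy convolutional model, tensor shapes, and mask region are all assumptions made for demonstration.

```python
# One masked-reconstruction training step: hide a block of traces in a
# synthetic 2D "seismic section" and train a model to fill it back in.
# The tiny conv net is a placeholder; real domain foundation models and
# their training data are proprietary to SLB.
import torch
import torch.nn as nn

sections = torch.randn(8, 1, 64, 64)   # (batch, channel, depth, traces)

masked = sections.clone()
masked[:, :, :, 24:40] = 0.0           # simulate a gap in acquisition

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

prediction = model(masked)
# Score the loss only on the hidden region, as in masked modeling.
loss = nn.functional.mse_loss(prediction[:, :, :, 24:40],
                              sections[:, :, :, 24:40])
loss.backward()
optimizer.step()
print(f"reconstruction loss on masked region: {loss.item():.4f}")
```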
The trust dimension is critical in SLB's LLMOps approach. Scientists with PhD-level expertise need to trust these tools enough to use them. SLB started training models with public data to achieve initial capability, then augmented with proprietary data from their own acquisitions. Crucially, they work with customers to fine-tune models with customer-specific data. Every major customer either takes the off-the-shelf model or, more commonly, combines it with their own data to create fine-tuned models representing the specific basins and geological settings where they operate. This builds the trust necessary for accurate predictions in their specific context.

SLB's approach to customization differs significantly from ServiceNow's. For general corporate memory and documents—even historical documents dating back to the 19th century—they use RAG approaches. They create pipelines that ingest historical documents, use advanced techniques to capture semantic meaning from old documents (where language has changed significantly over 100+ years), and automatically vectorize content for out-of-the-box RAG. However, for specialty domains like petrophysics, geophysics, and seismic data, they fine-tune the domain foundation models to create competitive advantage.
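For orientation, here is a minimal sketch of the ingest, vectorize, and retrieve loop that any such RAG pipeline implements. The bag-of-words "embedding" and in-memory index are deliberately toy stand-ins; SLB's actual embedding models, chunking strategy, and vector store are not public.

```python
# Minimal RAG ingestion-and-retrieval loop. The toy bag-of-words
# "embedding" stands in for a real embedding model, and the in-memory
# list stands in for a vector database.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# "Ingest": chunk documents and index their vectors.
documents = [
    "1898 field report: shallow gas shows observed at 400 feet",
    "2021 petrophysical log summary for well A-12",
]
index = [(doc, embed(doc)) for doc in documents]

# "Retrieve": rank chunks against a query, then hand the top hit
# to the language model as grounding context.
query = "historical gas shows"
best = max(index, key=lambda item: cosine(embed(query), item[1]))
print("context for the LLM:", best[0])
```

The hard part SLB describes, capturing semantic meaning from century-old language, lives inside the embedding step that this toy version trivializes.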
The speaker described their journey from rudimentary models to 2D, 2.5D, and eventually 3D models over a two-year period, representing a steep learning curve. They've now reached a point where foundation models produce "really plausible generative results" that can be incorporated into workflows alongside existing machine learning and physics-based approaches, becoming an important part of their toolkit.

## Infrastructure and LLMOps Platform: DGX Cloud on AWS

Both organizations converged on Nvidia DGX Cloud on AWS as their infrastructure solution, though for different reasons and with different usage patterns. The DGX Cloud offering represents a co-engineered solution between Nvidia and AWS, combining:

- **Infrastructure layer**: Nvidia GB200 GPUs (with the newer B300 mentioned as upcoming), high-performance networking, and Lustre storage
- **Management layer**: Optimized Amazon EKS (Elastic Kubernetes Service)
- **Orchestration layer**: Run:ai for GPU orchestration
- **Software layer**: Enterprise-grade AI software including Nvidia AI Enterprise, DGX Hub, Omniverse, and NIM (Nvidia Inference Microservices)
- **Support layer**: Technical account managers and 24/7 enterprise support

For ServiceNow, the transition from on-premises infrastructure to DGX Cloud addressed several pain points. They previously purchased Nvidia hardware and managed it in-house with their own orchestration frameworks and job management. While functional, maintaining extremely large clusters is resource-intensive. DGX Cloud offered greater reliability and the flexibility to switch to newer hardware as it becomes available rather than being locked into purchased hardware.

The speaker particularly emphasized Run:ai's value for GPU orchestration. ServiceNow previously developed custom job schedulers to maximize compute utilization during downtime. Run:ai simplified this through its job prioritization capabilities, enabling them to achieve "almost 100% utilization" across training models, synthesizing data, and evaluation tasks. Over the past year, they experienced "close to zero downtime," with only a few hours lost during one cluster upgrade. The speaker stated they now prefer not to use clusters without Run:ai, highlighting how critical efficient GPU utilization is for production LLMOps.
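The toy scheduler below illustrates the idea behind that prioritization: a high-priority training job claims GPUs first, and preemptible work such as data synthesis and evaluation backfills whatever is left idle. It reflects nothing of Run:ai's internals or API; all job names and sizes are invented.

```python
# Toy priority scheduler: high-priority jobs claim GPUs first, and
# lower-priority work backfills idle capacity. Purely illustrative;
# this is not Run:ai's algorithm or API.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int                       # lower value = scheduled first
    name: str = field(compare=False)
    gpus: int = field(compare=False)

def schedule(jobs: list[Job], total_gpus: int) -> None:
    heapq.heapify(jobs)
    free = total_gpus
    while jobs and free > 0:
        job = heapq.heappop(jobs)
        if job.gpus <= free:            # backfill whatever still fits
            free -= job.gpus
            print(f"running {job.name} on {job.gpus} GPUs")
        else:
            print(f"skipped {job.name} (needs {job.gpus}, {free} free)")
    for job in jobs:
        print(f"waiting: {job.name}")
    print(f"idle GPUs: {free}")

schedule(
    [Job(0, "train-15b", 6), Job(1, "synthesize-data", 1),
     Job(1, "run-evals", 1), Job(2, "ablation-sweep", 2)],
    total_gpus=8,
)
```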
For SLB, DGX Cloud provided the velocity needed for early-stage product development and rapid iteration. Coming from a background of on-premises infrastructure for workflows, simulation, and seismic processing, they had some A100 infrastructure that got them started. However, given the rapidly moving AI landscape and their ambitious goal of two major product releases in two years, they needed to accelerate development. The turnkey stack where "everything worked beautifully out of the box" allowed their team to focus on what mattered—building domain-specific models—rather than infrastructure management.

As SLB progressed from 2D to 2.5D to 3D seismic foundation models, they were able to optimize and increase throughput from the same hardware through the support partnership. This demonstrates how the platform supports not just initial development but continued optimization as models become more sophisticated.

The networking and storage aspects are particularly important for both organizations. High-performance networking minimizes training time losses, which directly impacts both development velocity and GPU utilization economics. Lustre storage provides the throughput needed for large-scale data operations, which is especially critical for SLB's massive seismic datasets and ServiceNow's data synthesis workloads.

## Deployment Strategies and Future Directions

The two organizations have different deployment strategies reflecting their distinct business models and customer relationships.

ServiceNow expressed commitment to continuing with both Nvidia DGX infrastructure and AWS services. The speaker noted they "want both" and appreciate having them combined effectively. They specifically called out Run:ai on AWS and Lustre from AWS as encouraging them to explore deeper AWS integration. The implication is a hybrid approach leveraging DGX Cloud for training and development, and broader AWS infrastructure for deployment and inference at scale.

SLB articulated a more nuanced multi-environment strategy. DGX Cloud and their AWS/Nvidia partnership was "fantastic for that early stage product development" and provided necessary velocity. However, as they scale globally, they'll use various techniques. Some customers are extremely sensitive about proprietary data and require on-premises solutions. For global service delivery, they'll leverage commodity infrastructure and AWS services. The speaker emphasized they're "really at the early stages of this revolution" and need to remain responsive to customer learning and market discovery, making DGX Cloud's flexibility particularly valuable during this dynamic period. As patterns establish and mature, they can shift toward more commodity infrastructure, but they expect to need diverse infrastructure support for the foreseeable future.

This difference reflects fundamentally different business models: ServiceNow provides a multi-tenant SaaS platform where they control the deployment environment, while SLB sells both turnkey applications and platforms to customers who may have varying infrastructure preferences and data sensitivity requirements.

## AI Scaling Laws and Production Considerations

The session opened with context about three emerging scaling laws for AI that inform both organizations' strategies:

- **Pre-training scaling**: Teaching models from internet data, the foundation layer
- **Post-training scaling**: Where models "go to school and learn thinking," the refinement layer
- **Test-time scaling**: Long thinking and reasoning before responding, the inference optimization layer

Both organizations focus heavily on post-training scaling to adapt foundation capabilities to their specific domains and use cases. ServiceNow's custom post-training recipes and SLB's domain-specific fine-tuning on seismic and petrophysical data exemplify this approach. Neither organization is building foundation models from scratch; instead, they're leveraging foundation model architectures and focusing their LLMOps efforts on adaptation, optimization, and production deployment.

The emphasis on cost-effective inference pervades both use cases. ServiceNow explicitly targets single-GPU deployment to make their agentic systems economically viable at enterprise scale. SLB needs models that can be deployed globally across customer environments with varying infrastructure capabilities. This production-first thinking influences architecture decisions, model size choices, and optimization priorities throughout the development lifecycle.

## Evaluation and Trust

While not extensively detailed, both organizations emphasize evaluation and trust differently based on their domains.

ServiceNow measures their models against frontier model benchmarks, demonstrating competitive performance despite 30x size differences. Their evaluation framework appears to focus on task completion capability and cost efficiency, with the ultimate validation being customer adoption and open-source community downloads (100,000+ downloads suggests strong validation).

SLB faces a more challenging evaluation problem because their outputs aren't easily verified against ground truth—the subsurface is inherently uncertain and unknowable without expensive data acquisition. Their evaluation strategy centers on building trust with domain experts (geologists, geophysicists, and engineers with PhDs) by training on data they recognize, providing explainability around predictions, and enabling fine-tuning with customer-specific data. The fact that models must sit alongside traditional physics-based and mathematical approaches provides implicit validation opportunities through consistency checking across multiple methodologies.

## Critical Assessment

While both presentations provide valuable insights into production LLMOps, several claims warrant careful consideration.

The assertion that ServiceNow's 15B parameter model performs "on par with" models 30x larger is impressive but requires context. The specific benchmarks used matter significantly—performance on enterprise workflow tasks may not generalize to other domains. The comparison may also reflect the relatively narrow task distribution ServiceNow optimizes for compared to general-purpose frontier models.

Both organizations emphasize near-zero downtime and high reliability, which is certainly achievable with managed infrastructure but represents best-case scenarios. Organizations considering similar approaches should plan for learning curves, integration challenges, and operational complexities that may not surface in promotional presentations.

SLB's two-year timeline for two major product releases is notable but doesn't necessarily indicate the models are fully mature or validated in production at scale. Early-stage deployment and long-term production stability represent different challenges, particularly in an industry where errors can have significant safety and financial consequences.

The cost discussions focus primarily on training infrastructure efficiency but give limited insight into total cost of ownership, including data preparation, evaluation, human-in-the-loop validation, and ongoing model maintenance and updates. The true LLMOps cost picture extends well beyond GPU utilization rates.

Despite these caveats, both case studies demonstrate sophisticated production deployments with clear business value propositions, thoughtful technical architectures, and realistic approaches to the challenges of operationalizing foundation models in complex enterprise and scientific domains.
