Company
Various
Title
Federal Government AI Platform Adoption and Scalability Initiatives
Industry
Government
Year
2023
Summary (short)
U.S. federal government agencies are working to move AI applications from pilots to production, with a focus on scalable and responsible deployment. The Department of Energy (DOE) has implemented Energy GPT using open models deployed in its own environment, while the Department of State is using LLMs to summarize diplomatic cables. The U.S. Navy's Project AMMO showcases a successful MLOps implementation, reducing model retraining time from six months to one week for underwater vehicle operations. Agencies are addressing challenges around budgeting, security compliance, and governance while ensuring user-friendly AI implementations.
## Overview

This panel discussion, featuring representatives from the Department of Energy, the White House Office of Science and Technology Policy, and Domino Data Lab, provides a comprehensive look at how U.S. federal government agencies are navigating the transition from AI pilots to production-scale deployments. The conversation is particularly valuable because it brings together perspectives from policymakers, practitioners within a large federated agency (DOE), and a commercial MLOps platform vendor, offering a multi-dimensional view of the LLMOps landscape in government.

The discussion centers on a critical inflection point: after years of building out cloud infrastructure and data management capabilities, federal agencies now face the harder challenge of operationalizing AI at scale while meeting the governance, compliance, and security requirements unique to government operations.

## Key Challenges in Moving from Pilot to Production

### Budget Cycle Misalignment

One of the most significant operational challenges highlighted is the disconnect between the federal budget cycle (typically planned two years in advance) and the rapid emergence of generative AI capabilities. Agencies did not budget for these tools years ago, and now face consumption-based cost models that become harder to predict as usage scales. This creates tension between the desire to move quickly and demonstrate value and the need to wait for budget allocation processes to catch up.

### Scaling Access Equitably

A recurring theme is the challenge of providing equal access to AI tools across large, federated organizations. When pilot programs demonstrate productivity gains, there is immediate pressure to scale access to all employees who could benefit. This creates infrastructure, licensing, and support challenges that were not anticipated during initial pilot phases.

### Cybersecurity and Compliance at Different Data Levels

The transition from pilot to production introduces new compliance requirements around records management and data handling. Pilots often operate with sanitized or synthetic data, but production systems must handle real operational data at various classification levels. This requires different security postures and authorization processes that can significantly slow deployment timelines.

### Talent Acquisition and Retention

The discussion highlights creative approaches agencies are taking to address AI talent gaps, including the DHS AI Corps hiring effort that has brought on over 25 AI experts. However, talent remains a bottleneck, with agencies often turning to vendors for supplementary services or innovation units to pilot new technologies.

## Governance Frameworks and Policy as Enablers

### White House Policy as Implementation Roadmap

A key insight from the White House representative is that the administration's AI policies are designed not just as regulatory constraints but as practical roadmaps for implementation. The policy for government use of AI provides specific steps for public engagement from ideation through production, including requirements for stakeholder engagement, user notice and explanation when interacting with AI systems, and mechanisms for opt-outs and recourse.

The DOE perspective validates this approach, noting that having prescriptive guidance from the White House provides "backing" when having difficult conversations about compliance requirements. Rather than appearing as bureaucratic overhead, the policies can be positioned as protection for both the agency and users.

### Chief AI Officers and Governance Councils

The establishment of agency Chief AI Officers and the Chief AI Officers Council represents a significant organizational innovation for coordinating AI activities across the federal government. These structures provide both the technical leadership needed to drive innovation and the authority to manage risks appropriately.

### Building on Existing Governance Structures

The DOE example demonstrates a pragmatic approach: rather than creating entirely new governance processes for AI, the agency integrated AI governance into existing IT and OT governance structures. When the executive order came out, they "piggybacked on an existing meeting" with established relationships and decision-making authorities. This approach reduces friction and leverages institutional knowledge rather than starting from scratch.

## Technical Approaches to Auditability and Scalability

### Platform-Based Auditability

A strong emphasis is placed on the need for systematic, platform-level auditability rather than manual processes. The Domino Data Lab representative argues that effective auditability at scale must be "built in" to the platform, not achieved through spreadsheets or documents. The complexity of the model development lifecycle, which involves multiple data sources, multiple data scientists and teams, potentially multiple organizations, plus testing, evaluation, deployment, and monitoring, means that errors or biases at any stage can compromise model effectiveness.

This architectural requirement for built-in auditability represents a key consideration for government agencies evaluating MLOps platforms: the tooling must support comprehensive audit trails as a first-class feature, not an afterthought.

### Governance as Acceleration, Not Burden

An important reframing offered in the discussion is that smart governance actually accelerates innovation rather than impeding it. When program managers and end users clearly understand what is allowable, they have a "green light" to proceed confidently. Unclear or absent guidance creates hesitation and delays. The recommendation is to develop policies, create implementation plans, and then programmatically encode those policies into the platform so compliance becomes routine and integrated into daily workflows.
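The panel does not describe a concrete implementation of "programmatically encoding policies into the platform," but a minimal sketch of the idea is a pre-deployment gate that evaluates a candidate model release against codified rules and appends a tamper-evident audit record. The policy checks, metadata fields, and registry layout below are illustrative assumptions, not any agency's or vendor's actual API.

```python
import json
import hashlib
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class ModelRelease:
    """Metadata a platform might capture for each candidate deployment (illustrative fields)."""
    model_name: str
    version: str
    data_classification: str          # e.g. "public", "cui"
    evaluation_completed: bool
    bias_review_completed: bool
    user_notice_text: str             # notice shown to users interacting with the AI system
    approvers: list = field(default_factory=list)

# Hypothetical policy rules echoing the kinds of requirements discussed:
# evaluation, bias review, user notice, sign-off, and data-handling limits.
POLICY_CHECKS = [
    ("evaluation completed", lambda r: r.evaluation_completed),
    ("bias review completed", lambda r: r.bias_review_completed),
    ("user notice provided", lambda r: bool(r.user_notice_text.strip())),
    ("two-person approval", lambda r: len(r.approvers) >= 2),
    ("classification allowed", lambda r: r.data_classification in {"public", "cui"}),
]

def review_release(release: ModelRelease, audit_path: str = "audit_log.jsonl") -> bool:
    """Run every policy check, append an audit record, and return the deploy decision."""
    results = {name: check(release) for name, check in POLICY_CHECKS}
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "release": asdict(release),
        "checks": results,
        "approved": all(results.values()),
    }
    # Fingerprint the record so later tampering is detectable.
    record["fingerprint"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    with open(audit_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record["approved"]

if __name__ == "__main__":
    candidate = ModelRelease(
        model_name="doc-summarizer", version="1.3.0",
        data_classification="cui", evaluation_completed=True,
        bias_review_completed=True, user_notice_text="Responses are AI-generated.",
        approvers=["program_manager", "chief_ai_officer"],
    )
    print("Deploy allowed:", review_release(candidate))
```

The point is the pattern rather than these particular checks: compliance requirements become executable gates in the platform, and every decision leaves a queryable trail instead of living in spreadsheets or documents.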
### Avoiding Vendor Lock-in

The White House representative emphasizes that administration policies include provisions to prevent vendor lock-in, specifically requiring agencies to retain rights over their data, any data manipulated or labeled by vendors, and model artifacts. This is critical for long-term operational flexibility and reflects hard-learned lessons from earlier generations of government IT procurement.

## Specific LLM Use Cases in Production

### Department of Energy: Energy GPT

DOE has deployed an internal chatbot called "Energy GPT" built on an open foundation model but deployed within DOE's own environment. This architecture allows the agency to incorporate DOE-specific data without contributing to public model training, a critical requirement for government data handling. Use cases include productivity assistance (writing improvement), document summarization, and navigation of internal organizational knowledge.

### NEPA Document Analysis

DOE has implemented AI for policy document analysis, specifically around National Environmental Policy Act (NEPA) documentation. This use case demonstrates the value of building once and sharing across agencies, as climate and environmental policy is not unique to DOE.
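The discussion does not go deeper than "open model, agency environment, agency data," so the sketch below shows one common shape of that pattern: an OpenAI-compatible client pointed at a self-hosted inference endpoint, with a toy retrieval step grounding answers in internal documents. The endpoint URL, model name, and document list are hypothetical placeholders, and the keyword retriever stands in for whatever index a real deployment would use.

```python
from openai import OpenAI

# Self-hosted, OpenAI-compatible inference server (for example a vLLM deployment or
# an internal gateway) running inside the agency boundary; URL and model are placeholders.
client = OpenAI(base_url="https://llm.internal.example.gov/v1", api_key="internal-token")
MODEL = "open-weights-chat-model"

# Stand-in for an internal document index; a production system would use a vector store.
INTERNAL_DOCS = [
    "Travel requests must be approved by the program office two weeks in advance.",
    "Records created with AI tools remain subject to standard records-management schedules.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Toy keyword-overlap retrieval to ground the model in agency content."""
    q_terms = set(question.lower().split())
    ranked = sorted(
        INTERNAL_DOCS,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def ask(question: str) -> str:
    """Answer a question using only retrieved internal context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "Answer using only the provided agency context. "
                                          "Say so if the context is insufficient."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask("How far in advance do travel requests need approval?"))
```

Because the model is served inside the agency boundary, prompts and retrieved content never leave the environment, which is the property DOE emphasizes for Energy GPT; the grounding details here are only illustrative.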
### OPM: COBOL Code Modernization with LLMs

The Office of Personnel Management is exploring using LLMs to modernize legacy COBOL code by translating it into modern programming languages. This represents a high-value use case for foundation models in government, addressing the well-documented challenge of maintaining aging but critical legacy systems when COBOL expertise is increasingly rare.

### State Department: Diplomatic Cable Summarization

The Department of State received Technology Modernization Fund (TMF) funding to use LLMs and foundation models to summarize diplomatic cables from abroad and surface key insights. This addresses the information overload problem in diplomatic communications and represents a classic LLM use case in document processing.

### Navy Project AMMO: Autonomous Vehicle Model Retraining

Perhaps the most operationally impressive example discussed is the Navy's Project AMMO for uncrewed undersea vehicles (UUVs). Using Domino's platform deployed on AWS GovCloud with integrated partner tools in an end-to-end MLOps pipeline, the Navy has reduced model retraining time from six months to one week. The workflow involves building and testing models in the platform, deploying them to autonomous vehicles, collecting data from field operations, and then retraining models based on operational experience. This represents a complete MLOps lifecycle operating at production scale with significant measurable impact.
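The panel describes Project AMMO's loop (build and test, deploy, collect field data, retrain) but not its internals, so the following is a minimal sketch of the generic retrain-on-field-data pattern: score the currently deployed model on newly collected operational data and only retrain and register a new version when performance drops past a threshold. The registry layout, metric, threshold, and synthetic data are assumptions for illustration, not the Navy's or Domino's actual pipeline.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

REGISTRY = Path("model_registry")   # stand-in for a platform model registry
F1_FLOOR = 0.85                     # illustrative performance threshold

def evaluate(model, X: np.ndarray, y: np.ndarray) -> float:
    """Score a model on newly collected field data."""
    return f1_score(y, model.predict(X))

def retrain_if_degraded(current_version: str, X_field: np.ndarray, y_field: np.ndarray) -> str:
    """Retrain and register a new version only when field performance has dropped."""
    model = joblib.load(REGISTRY / f"{current_version}.joblib")
    score = evaluate(model, X_field, y_field)
    if score >= F1_FLOOR:
        return current_version  # still healthy, keep serving it

    # Retrain on the new operational data (a real pipeline would merge it with history).
    new_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_field, y_field)
    new_version = datetime.now(timezone.utc).strftime("v%Y%m%d%H%M%S")
    joblib.dump(new_model, REGISTRY / f"{new_version}.joblib")

    # Record the decision so the retraining event itself is auditable.
    (REGISTRY / f"{new_version}.json").write_text(json.dumps({
        "previous_version": current_version,
        "previous_f1_on_field_data": score,
        "new_f1_on_field_data": evaluate(new_model, X_field, y_field),
        "trained_at": datetime.now(timezone.utc).isoformat(),
    }, indent=2))
    return new_version

if __name__ == "__main__":
    REGISTRY.mkdir(exist_ok=True)
    rng = np.random.default_rng(0)
    # Seed an initial model, then simulate field data whose label relationship has shifted.
    X0, y0 = rng.normal(size=(200, 4)), rng.integers(0, 2, 200)
    joblib.dump(RandomForestClassifier(random_state=0).fit(X0, y0), REGISTRY / "v0.joblib")
    X1 = rng.normal(size=(200, 4))
    y1 = (X1[:, 0] > 0).astype(int)
    print("Serving version:", retrain_if_degraded("v0", X1, y1))
```

The reported six-months-to-one-week improvement comes from automating a loop of this shape end to end on a governed platform; the threshold-triggered retraining above is only the skeleton of that idea.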
## Strategic Recommendations for AI Leadership

### Listen to Agency Business Needs

The DOE DCIO emphasizes the importance of understanding actual business cases rather than pursuing AI for its own sake. The excitement around AI can sometimes "overmask" the actual business objective, leading to solutions that could have been simpler or didn't require AI at all.

### Think Big with Agency AI Strategies

Looking ahead, Chief AI Officers will be creating agency AI strategies, and the recommendation is to think ambitiously about five- to ten-year horizons. Starting from the desired end state and working backwards helps identify the necessary building blocks: cloud infrastructure, data infrastructure, technology modernization, and employee training.

### Get Quick Wins to Build Momentum

From the vendor perspective, the advice is to "put points on the board" with quick wins that demonstrate value. Making change in large government bureaucracies is inherently slow, and early successes help build belief from both leadership and front-line workers. This momentum aids budget requests, tool adoption, and overall organizational change management.

## Critical Assessment

While this discussion provides valuable insights into government AI operations, it should be noted that one of the panelists represents Domino Data Lab, a commercial vendor with an obvious interest in promoting its platform. The specific claims about Navy Project AMMO's time savings (six months to one week for model retraining) are impressive but presented without independent verification. Additionally, the conversation focuses heavily on governance and policy frameworks but provides limited technical detail on actual implementation challenges: how agencies handle model versioning, A/B testing, prompt management, or retrieval-augmented generation architectures in practice. The success stories mentioned are presented at a high level, without discussion of failures, setbacks, or lessons learned from unsuccessful deployments.

That said, the multi-stakeholder perspective and the emphasis on governance as an enabler rather than an obstacle represent mature thinking about enterprise AI operations that is applicable beyond government contexts.
