Company
Georgia-Pacific
Title
Scaling Generative AI for Manufacturing Operations with RAG and Multi-Model Architecture
Industry
Other
Year
2025
Summary (short)
Georgia-Pacific, a forest products manufacturing company with 30,000+ employees and 140+ facilities, deployed generative AI to address critical knowledge transfer challenges as experienced workers retire and new employees struggle with complex equipment. The company developed an "Operator Assistant" chatbot using AWS Bedrock, RAG architecture, and vector databases to provide real-time troubleshooting guidance to factory operators. Starting with a 6-8 week MVP deployment in December 2023, they scaled to 45 use cases across multiple facilities within 7-8 months, serving 500+ users daily with improved operational efficiency and reduced waste.
Georgia-Pacific's generative AI implementation is a comprehensive case study in scaling LLM deployments for manufacturing operations. The company, a forest products manufacturer with over 30,000 employees globally across 140+ facilities, faced critical knowledge-transfer challenges as experienced workers retired and new employees struggled to operate complex machinery, some of it decades old. Their manufacturing equipment, described as "the size of a football field" with thousands of interconnected parts, requires years of experience to operate effectively, creating a significant knowledge gap.

The company's AI journey began roughly 7-8 years ago with traditional predictive models and statistical anomaly detection, but the advent of generative AI opened new opportunities to bridge the experience gap. Led by VP of IT and AI Architecture Delivery Manish SA, the team initiated their generative AI project in December 2023, approximately one year after ChatGPT's release, focusing on real-world production problems rather than experimental use cases.

**Technical Architecture and Implementation**

The core solution, internally called "Operator Assistant," employs a RAG (Retrieval Augmented Generation) architecture built on AWS infrastructure. The system integrates structured time-series data from PI historians (capturing IoT sensor data every second or millisecond) with unstructured documentation, including standard operating procedures, maintenance records, and equipment manuals. This hybrid approach addresses a fundamental challenge of manufacturing environments: critical operational knowledge lives in disparate formats across siloed systems.

The architecture uses AWS Bedrock as the primary LLM platform, with the flexibility to swap models as new versions become available; the team specifically highlighted upgrading from Claude Sonnet 1 to 3.7 to 4.0 as new models were released, demonstrating the value of flexible, non-monolithic architectures in production LLM deployments. Vector databases serve as the core retrieval mechanism, with the team transitioning from initial experiments on Amazon Kendra to Aurora vector stores for better performance and cost efficiency.
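The presentation does not show code, but the retrieve-then-generate loop at the heart of such a system can be sketched as follows. This is a minimal illustration assuming an Aurora PostgreSQL store with the pgvector extension (one way to realize an "Aurora vector store") and the Bedrock Converse API; the table name, connection string, and prompt format are hypothetical rather than Georgia-Pacific's actual implementation.

```python
# Minimal retrieve-then-generate sketch: fetch similar chunks from a
# pgvector-backed store, then ask a Bedrock-hosted Claude model to
# answer with that context. Table, DSN, and prompt are illustrative.
import json

import boto3
import psycopg2

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> list[float]:
    # Titan Text Embeddings is one embedding option available on Bedrock.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def retrieve(question: str, k: int = 5) -> list[str]:
    # pgvector's <=> operator orders rows by distance to the query vector.
    conn = psycopg2.connect("dbname=operator_kb")  # hypothetical database
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT chunk_text FROM doc_chunks "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(embed(question)), k),
        )
        return [row[0] for row in cur.fetchall()]

def answer(question: str) -> str:
    context = "\n\n".join(retrieve(question))
    # Depending on account/region, a cross-region inference profile ID
    # may be required instead of the bare model ID.
    resp = bedrock.converse(
        modelId="anthropic.claude-3-7-sonnet-20250219-v1:0",
        messages=[{
            "role": "user",
            "content": [{"text": f"Context:\n{context}\n\nQuestion: {question}"}],
        }],
    )
    return resp["output"]["message"]["content"][0]["text"]
```

Note that swapping the modelId string is essentially all the model-upgrade flexibility this loop needs, which is the point of the non-monolithic design the team emphasizes.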
**Data Preparation and Knowledge Capture**

A critical innovation in Georgia-Pacific's approach was addressing the challenge of undocumented institutional knowledge. Recognizing that much operational expertise existed only in employees' heads, the team developed a custom tool called "Docgen" to systematically capture this tacit knowledge. The application presents structured questionnaires to experienced operators, asking scenario-based questions such as "What do you do when power consumption goes up?" or "When a safety alert shows up, what do you do?" Responses are captured via voice or text, transcribed, and converted into structured text suitable for LLM consumption.

This knowledge-capture process is a pragmatic answer to a common problem in enterprise LLM deployments: insufficient training data. Rather than relying solely on existing documentation, the system actively solicits and structures human expertise, creating a more comprehensive knowledge base for the RAG system.
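The talk does not describe Docgen's internals; as a rough illustration of the data shape involved, captured Q&A could be structured and flattened into embeddable text along these lines (all names and the sample answer are invented):

```python
# Illustrative shape of a Docgen-style capture record: scenario-based
# questions are posed to experienced operators, and their transcribed
# answers are rendered into plain text that the RAG pipeline can chunk
# and embed alongside SOPs and manuals. The real tool is internal.
from dataclasses import dataclass

@dataclass
class CapturedAnswer:
    machine: str          # which asset the guidance applies to
    scenario: str         # the question shown to the operator
    answer: str           # transcribed voice or typed response
    operator_years: int   # experience level, useful context for readers

def to_document(entries: list[CapturedAnswer]) -> str:
    # Flatten Q&A pairs into text suitable for chunking and embedding.
    lines = []
    for e in entries:
        lines.append(f"Machine: {e.machine}")
        lines.append(f"Scenario: {e.scenario}")
        lines.append(f"Guidance ({e.operator_years} yrs experience): {e.answer}")
        lines.append("")
    return "\n".join(lines)

print(to_document([CapturedAnswer(
    machine="paper machine 4",
    scenario="What do you do when power consumption goes up?",
    answer="Check recent setpoint changes and upstream equipment load first.",
    operator_years=22,
)]))
```

Structuring answers this way keeps provenance (machine, scenario, experience level) attached to each chunk when it is embedded.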
**Production Deployment and Scaling Challenges**

The initial deployment timeline was aggressive, with the team moving from concept to production in 6-8 weeks. This speed came with acknowledged technical debt: the early implementation used "non-standard databases," lacked proper CI/CD pipelines, and wasn't optimized for efficiency. The focus during this phase was rapid user feedback and proof of concept rather than production-ready architecture.

The scaling phase revealed several critical LLMOps challenges. Performance requirements evolved significantly as the system expanded beyond the initial single-machine deployment: one-minute response times were acceptable during experimentation, but production users demanded sub-5-second responses as the system scaled to 400+ machines. This necessitated optimization across the entire stack, including vector database performance, Lambda caching strategies, and Bedrock agent response times. The team worked closely with AWS support to analyze bottlenecks across the whole pipeline, from data retrieval through vector similarity search to final response generation; this kind of holistic, component-by-component latency work is characteristic of successful production LLM deployments.
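The Lambda caching strategies are not detailed in the talk; one plausible pattern is caching retrieval results in module-level state that survives across warm Lambda invocations, sketched here reusing the retrieve() helper from the earlier sketch (generate_answer() is a hypothetical stand-in for the Bedrock call):

```python
# Illustrative warm-container caching for a Lambda-based RAG handler.
# Module-level state survives across invocations of a warm Lambda, so
# repeated questions skip the expensive embedding and vector-search
# calls. All names are hypothetical.
import functools

@functools.lru_cache(maxsize=512)
def cached_chunks(question: str) -> tuple[str, ...]:
    # lru_cache requires hashable arguments (strings qualify) and keeps
    # the result for the lifetime of the warm container.
    return tuple(retrieve(question))

def handler(event, context):
    question = event["question"]
    context_text = "\n\n".join(cached_chunks(question))
    return {"answer": generate_answer(question, context_text)}
```

In a real deployment, cache invalidation would need to track document updates; an in-process LRU like this is only the simplest variant.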
**Operational Excellence and Monitoring**

Georgia-Pacific's deployment underscores the importance of comprehensive monitoring and observability in production LLM systems. Working with AWS Countdown Premium support, they implemented monitoring to catch throttling issues, model drift, and performance degradation. The system includes real-time alerting that notifies operators of equipment problems while providing contextual troubleshooting guidance.

The user interface design reflects practical operational considerations, moving beyond simple text-based chat to guided interactions: the system proactively surfaces equipment issues as red alerts and suggests relevant questions, reducing the cognitive load on users who may be dealing with urgent production problems. This guided approach reflects a mature understanding of how LLMs should be integrated into operational workflows.

**Cost Optimization and Resource Management**

The case study highlights several cost considerations for production LLM deployments. The team emphasizes the impact of architectural decisions on long-term operational costs, particularly around model selection and infrastructure choices, and implemented financial controls using AWS Budgets to provide alerts and automated responses to cost overruns. Token management emerged as a key cost lever, with the team investing in prompt engineering to minimize input tokens while maintaining response quality; the chunking strategy for the vector databases was tuned to balance retrieval accuracy against computational cost, a core cost-performance tradeoff in RAG systems.

**Security and Governance Considerations**

While not extensively detailed in the presentation, the case study touches on several security considerations relevant to production LLM deployments. The system implements role-based access controls to ensure appropriate data access, and the team worked with AWS support to implement guardrails against common vulnerabilities such as prompt injection and data poisoning. The governance framework includes data retention policies to address compliance requirements, which is particularly important in manufacturing environments with regulatory oversight. The system also validates outputs before sending them to downstream systems, an additional layer of defense against malicious modifications or inappropriate responses.
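The specific guardrail setup is not described; one concrete way to validate outputs before they reach downstream systems is Amazon Bedrock Guardrails' standalone ApplyGuardrail API, sketched below with placeholder identifiers:

```python
# Sketch: validating model output with Amazon Bedrock Guardrails before
# it is forwarded downstream. The guardrail ID and version are
# placeholders for a guardrail created ahead of time (e.g., with
# content filters and prompt-attack detection enabled).
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def validate_output(model_text: str) -> str | None:
    resp = runtime.apply_guardrail(
        guardrailIdentifier="gr-0123456789ab",  # placeholder
        guardrailVersion="1",
        source="OUTPUT",  # screen model responses rather than user input
        content=[{"text": {"text": model_text}}],
    )
    if resp["action"] == "GUARDRAIL_INTERVENED":
        return None  # block rather than forward a flagged response
    return model_text
```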
**Advanced Capabilities and Future Developments**

The evolution toward agentic AI represents a significant advancement in Georgia-Pacific's LLM deployment. The team recently deployed their first autonomous agent capable of taking actions rather than just providing recommendations. This agent focuses on work order automation, handling the complex process of planning maintenance activities that typically involve 100+ items, including parts procurement and repair coordination. The agentic workflow combines RAG with graph databases and AWS Bedrock Agents to create a comprehensive automated planning system, a natural evolution from passive information retrieval to active process automation that demonstrates the maturation of their LLMOps capabilities.
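The agent's configuration is not shown in the talk; as a minimal sketch under that caveat, a Bedrock Agent is invoked through the bedrock-agent-runtime API and streams its completion back, with all IDs below as placeholders:

```python
# Minimal sketch of invoking an Amazon Bedrock Agent for work-order
# planning via the bedrock-agent-runtime API. A real agent would be
# configured with action groups and knowledge bases (e.g., the
# graph-backed planning data mentioned in the case study).
import uuid

import boto3

agents = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

def plan_work_order(request: str) -> str:
    resp = agents.invoke_agent(
        agentId="AGENT123456",        # placeholder
        agentAliasId="ALIAS12345",    # placeholder
        sessionId=str(uuid.uuid4()),  # one session per planning request
        inputText=request,
    )
    # The completion comes back as an event stream of text chunks.
    parts = []
    for event in resp["completion"]:
        if "chunk" in event:
            parts.append(event["chunk"]["bytes"].decode("utf-8"))
    return "".join(parts)
```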
**Results and Impact**

The deployment has reached significant scale, with 500+ users and 30-40 concurrent users at any given time across 45 sites. The system has demonstrated measurable improvements in operational efficiency, production output, and waste reduction, with plans to extend impact measurement to finished-product quality metrics. That a relatively small team of 4-5 people manages a deployment of this scale suggests effective operational practices and tooling.

**Lessons Learned and Best Practices**

Several key lessons emerge from Georgia-Pacific's experience. Flexible, non-monolithic architectures enable rapid model upgrades and technology evolution. Dedicated support relationships, particularly AWS Countdown Premium, provided context-aware assistance that reduced resolution times and enabled deeper technical collaboration. A focus on real operational problems rather than experimental use cases drove rapid adoption and demonstrated clear business value. Finally, the case study highlights how quickly user expectations evolve from experimental tolerance to production-level performance requirements; the team's systematic approach to identifying and resolving bottlenecks across the entire stack is a practice other organizations can emulate.