Agoda: GPT Integration for SQL Stored Procedure Optimization in CI/CD Pipeline

LLMOps Database

E-commerce

Agoda

Company

Agoda

Title

GPT Integration for SQL Stored Procedure Optimization in CI/CD Pipeline

Industry

E-commerce

Link

https://medium.com/agoda-engineering/how-agoda-uses-gpt-to-optimize-sql-stored-procedures-in-ci-cd-29caf730c46c

Year

2024

Summary (short)

Agoda integrated GPT into their CI/CD pipeline to automate SQL stored procedure optimization, addressing a significant operational bottleneck where database developers were spending 366 man-days annually on manual optimization tasks. The system provides automated analysis and suggestions for query improvements, index recommendations, and performance optimizations, leading to reduced manual review time and improved merge request processing. While achieving approximately 25% accuracy, the solution demonstrates practical benefits in streamlining database development workflows despite some limitations in handling complex stored procedures.

Tags

data_analysis

data_cleaning

legacy_system_integration

This case study from Agoda, a major e-commerce travel platform, showcases a practical implementation of LLMs in production for optimizing database operations. The company faced significant challenges with their database development workflow, particularly in the optimization of SQL stored procedures (SPs), which was consuming substantial developer resources and causing delays in their development pipeline. The core problem Agoda addressed was the time-intensive nature of SP optimization, with database developers spending approximately 366 man-days annually on this task. The merge request (MR) approval process was particularly affected, with 90th percentile cases taking up to 4.1 hours. This represented a clear operational bottleneck that needed addressing. The LLMOps implementation focuses on integrating GPT into their existing CI/CD pipeline, creating a systematic approach to automated SP optimization. The system architecture comprises three main components: Context Preparation: The system feeds GPT with comprehensive contextual information including: * The SQL code of the stored procedure requiring optimization * Database schema information including table structures and existing indexes * Performance test reports highlighting execution metrics and inefficiencies This context preparation phase is crucial for enabling GPT to make informed and relevant optimization suggestions. Analysis and Recommendation Generation: The system leverages GPT to perform three types of analysis: * Performance Analysis: Evaluating test reports to identify specific performance issues * Query Refinement: Suggesting optimized versions of SPs with improved efficiency * Index Recommendations: Analyzing query patterns to suggest optimal indexing strategies Integration with Developer Workflow: The GPT system is fully integrated into the CI/CD pipeline, providing: * Automated analysis as part of the merge request process * Side-by-side performance comparisons between original and optimized versions * Click-to-apply functionality for implementing suggestions directly in GitLab The implementation includes several noteworthy technical aspects regarding LLM deployment in production: Prompt Engineering and System Design: * The system requires carefully structured prompts to ensure GPT receives appropriate context * The architecture includes mechanisms to handle different types of SP complexity levels * Integration with existing CI/CD tools and GitLab infrastructure Quality Control and Verification: * Performance comparisons are automatically generated for each optimization suggestion * The system includes automated logic verification to ensure optimized SPs maintain identical functionality * Current accuracy rates are around 25%, showing room for improvement but demonstrating practical utility Limitations and Challenges: * Variability in optimization scope: GPT sometimes makes minimal changes and other times suggests complete logic rewrites * Need for automated verification: The team is developing additional tools for automatic logic verification * Accuracy improvements needed: While 25% accuracy is useful, there's significant room for enhancement Future Development Plans: * Development of a direct database environment integration * Creation of a new test framework for easier test case addition * Implementation of production monitoring for high-CPU stored procedures * Enhanced prompt tuning capabilities The case study provides valuable insights into practical LLMOps implementation: * The importance of integrating LLM capabilities into existing workflows rather than creating separate systems * The need for robust verification and testing mechanisms when using LLMs for code modification * The value of providing developers with self-service tools and easy-to-use interfaces * The benefit of starting with well-defined, specific use cases rather than attempting to solve all problems at once Looking at the implementation critically, the 25% accuracy rate might seem low, but in the context of database optimization, where any improvement can have significant impact, this represents a valuable addition to the development toolkit. The system's ability to reduce manual review time and provide automated suggestions, even if not all are implemented, demonstrates the practical value of LLMs in production environments. The approach taken by Agoda shows a pragmatic balance between automation and human oversight, where GPT serves as an assistive tool rather than a complete replacement for human expertise. This aligns well with current best practices in LLMOps, where systems are designed to augment rather than replace human decision-making, especially in critical infrastructure components like database operations.

Start deploying reproducible AI workflows today

Enterprise-grade MLOps platform trusted by thousands of companies in production.

Book a Demo

Use Open Source