Company
Mercado Libre
Title
Building a Scalable LLM Gateway for E-commerce Recommendations
Industry
E-commerce
Year
2023
Summary (short)
Mercado Libre developed a centralized LLM gateway to handle large-scale generative AI deployments across their organization. The gateway manages multiple LLM providers, handles security, monitoring, and billing, while supporting 50,000+ employees. A key implementation was a product recommendation system that uses LLMs to generate personalized recommendations based on user interactions, supporting multiple languages across Latin America.
## Overview

Mercado Libre, Latin America's largest e-commerce and fintech company with a mission to democratize commerce and financial services, embarked on an ambitious initiative to make generative AI available at enterprise scale. With over 50,000 employees, many without technical backgrounds, the company faced significant challenges in rolling out LLM capabilities across the organization.

The solution presented by Lina Chaparro, a Machine Learning Project Leader at Mercado Libre, centers on the development of a centralized LLM Gateway that provides management, monitoring, and control for consuming generative AI services. The presentation was delivered as part of what appears to be a technical conference or meetup, where the speaker walked through both the architectural decisions and a specific production use case involving product recommendations and push notifications.

## The Challenge of Enterprise-Scale LLM Adoption

The presentation highlights several critical challenges that organizations face when attempting to deploy generative AI at scale:

**Scale and Rate Limiting**: Operating at Mercado Libre's scale means handling massive volumes of requests per minute, often exceeding the rate limits of underlying LLM providers. This is a common challenge in production LLM deployments where API quotas from providers like OpenAI, Anthropic, or Google can quickly become bottlenecks.

**Democratization Across Non-Technical Users**: With 50,000+ employees who don't necessarily have programming skills, the company needed to find ways to make LLM capabilities accessible to everyone, not just engineers. This speaks to the broader challenge of building user-friendly interfaces and tools that abstract away technical complexity.

**Rapidly Evolving Model Landscape**: The LLM market changes rapidly, with new models and providers emerging constantly. Organizations need infrastructure that can adapt to these changes without requiring significant rearchitecting.

**Observability and Trust**: Understanding how LLMs are being used, monitoring quality of responses, tracking response times, and managing costs are all critical for enterprise adoption. The speaker emphasized the need for metrics visible at different levels to understand impact.

**Security and Information Protection**: As a company handling sensitive financial and commerce data, information security was highlighted as a top priority.

## The Gateway Architecture Solution

Mercado Libre's solution was to build an LLM Gateway, a centralized system that acts as a single entry point for all generative AI consumption across the organization. The implementation was accelerated by leveraging Fury, the company's internal platform, which allowed rapid scaling without infrastructure concerns.

The gateway architecture provides several key benefits:

**Centralized Management and Control**: In a complex ecosystem like Mercado Libre's, having a single point of control for LLM communications is essential. This allows for consistent policies, monitoring, and governance across all use cases.

**Multi-Provider Integration**: The gateway currently integrates four major LLM providers, providing a unified interface regardless of which underlying model is being used. This abstraction layer is crucial for avoiding vendor lock-in and enabling experimentation with different models.

**Fallback System**: The gateway implements logic to guarantee continued service availability. If one provider experiences issues, the system can automatically route requests to alternative providers, ensuring high availability for production workloads.

**Security Layer**: The gateway acts as a security barrier, providing encryption, authentication, and authorization functions. This is particularly important when dealing with sensitive customer data in an e-commerce and financial services context.
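
The multi-provider integration and fallback behavior described above can be illustrated with a minimal sketch. This is not Mercado Libre's implementation: the `Gateway` class, the `ProviderError` exception, the provider names, and the two stub provider functions are all hypothetical, and real providers would be network clients rather than local callables. The sketch only shows the idea of a single entry point that tries providers in priority order and records usage centrally.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ProviderError(Exception):
    """Raised by a provider callable when a request cannot be served."""


@dataclass
class Gateway:
    # Hypothetical: maps a provider name to a callable (prompt -> text).
    providers: Dict[str, Callable[[str], str]]
    # Provider names in the order they should be tried.
    priority: List[str]
    # Per-provider request counts, a stand-in for centralized usage tracking.
    usage: Dict[str, int] = field(default_factory=dict)

    def complete(self, prompt: str) -> str:
        """Try each provider in priority order and return the first success."""
        errors = []
        for name in self.priority:
            try:
                text = self.providers[name](prompt)
            except ProviderError as exc:
                # Record the failure and fall back to the next provider.
                errors.append(f"{name}: {exc}")
                continue
            self.usage[name] = self.usage.get(name, 0) + 1
            return text
        raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_provider(prompt: str) -> str:
    # Stub that simulates a provider hitting its rate limit.
    raise ProviderError("rate limit exceeded")


def stable_provider(prompt: str) -> str:
    # Stub that simulates a healthy provider.
    return f"echo: {prompt}"


gateway = Gateway(
    providers={"primary": flaky_provider, "secondary": stable_provider},
    priority=["primary", "secondary"],
)
result = gateway.complete("hello")
print(result)         # served by the secondary provider
print(gateway.usage)  # only successful calls are counted
```

Keeping the provider interface to a plain callable is one way to make adding or swapping providers cheap for callers, which is the property the abstraction layer is meant to provide.
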

**Performance Optimization**: The gateway handles intelligent routing and rate limiting to reduce latency and improve response times. This is essential for real-time applications where user experience depends on fast responses.

**Simplified Architecture**: By acting as a single entry point, the gateway reduces complexity and direct dependencies between systems, making the overall architecture easier to maintain and scale.

**Centralized Billing and Cost Management**: Given the different pricing models across providers and models, the gateway centralizes consumption tracking and associated costs, providing visibility into GenAI spend.

## Developer and User Tools

Beyond the gateway itself, Mercado Libre built complementary tools to facilitate adoption:

**Playground**: The team developed an internal playground tool that centralizes GenAI-based solutions accessible to any employee. By 2023, this playground had more than 16,000 unique users, demonstrating significant internal adoption.

**SDK**: For developers building programmatic integrations, an SDK was developed to simplify the user experience and abstract away the complexity of interacting with the gateway.

## Production Use Case: Personalized Product Recommendations

The presentation includes a detailed production use case that demonstrates the practical application of the gateway architecture:

**Problem Statement**: The goal was to drive recommendations for products related to user interests, enhancing the user experience by creating customized bookings with high-shipping products and improving the notification algorithm to increase engagement in the marketplace.

**How It Works**: When a user interacts with a product (through views, questions, or favorites), the system uses LLMs to generate personalized push notifications encouraging purchase. These notifications lead to landing pages with AI-generated recommendations.
The LLM essentially answers the question "what are the top 10 items this user would be interested in buying based on their preferences?"

**Technical Challenges Addressed**:

- **Real-Time Latency**: As a real-time application, the system needs to account for LLM generation time while still meeting user experience requirements. Managing latency and rate limits in production is a significant challenge.
- **Multi-Language and Dialect Support**: Mercado Libre operates across Latin America, requiring support for multiple Spanish accents and dialects (neutral Spanish, regional variations) as well as Brazilian Portuguese. This adds complexity to prompt engineering and output quality assurance.
- **Dynamic Prompt Management**: The team implemented prompt versioning to enable experimentation with new prompts and communication styles. This is a crucial capability for iterating on LLM applications without code deployments.
- **Dynamic Benefits Integration**: The system can dynamically incorporate benefits like free shipping, interest-free installments, or new loyalty program features into generated content.

## Results and Scale

The presentation provides some concrete metrics around adoption and impact:

- Over 16,000 unique users interacting with the playground in 2023
- More than 150 use cases leveraging the gateway
- Improvements observed in recommendation quality
- Expected positive impact on NPS (Net Promoter Score) metrics

It's worth noting that some of these results are described in terms of expectations rather than measured outcomes, so the full impact may not yet be quantified.

## MLOps Integration

The speaker explicitly mentions that the gateway architecture aligns with Mercado Libre's internal MLOps practices, providing tools that improve quality, performance, scalability, and security of model use. This suggests the LLM gateway is part of a broader ML infrastructure strategy rather than a standalone initiative.
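
The prompt versioning and dynamic benefits integration described in the use case above can be sketched as a small versioned prompt registry. Everything here is an assumption made for illustration: the `PROMPTS` registry, the `ACTIVE_VERSION` mapping, the `render_prompt` helper, the prompt name, and the template wording are hypothetical and do not come from the presentation. The sketch uses Python's standard `string.Template` for substitution.

```python
from string import Template
from typing import Dict, List, Optional, Tuple

# Hypothetical registry: (prompt name, version) -> template text.
PROMPTS: Dict[Tuple[str, str], Template] = {
    ("push_notification", "v1"): Template(
        "Write a push notification in $language about $product. "
        "Mention these benefits: $benefits."
    ),
    ("push_notification", "v2"): Template(
        "In a friendly tone and in $language, invite the user to revisit "
        "$product. Highlight: $benefits."
    ),
}

# Hypothetical pointer to the live version of each prompt. If this mapping
# lived in configuration, switching versions would not need a code deployment.
ACTIVE_VERSION: Dict[str, str] = {"push_notification": "v2"}


def render_prompt(
    name: str,
    language: str,
    product: str,
    benefits: List[str],
    version: Optional[str] = None,
) -> str:
    """Render a prompt, using the active version unless one is pinned."""
    chosen = version or ACTIVE_VERSION[name]
    template = PROMPTS[(name, chosen)]
    return template.substitute(
        language=language,
        product=product,
        benefits=", ".join(benefits),  # benefits are injected at request time
    )


prompt = render_prompt(
    "push_notification",
    language="Brazilian Portuguese",
    product="running shoes",
    benefits=["free shipping", "interest-free installments"],
)
print(prompt)
```

Pinning `version="v1"` in a call is one way an experiment could compare an older communication style against the active one while both remain in the registry.
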
## Critical Assessment

The presentation makes a compelling case for the gateway architecture approach to enterprise LLM deployment. However, a few observations are worth noting:

The metrics shared (16,000 users, 150 use cases) demonstrate adoption but don't provide deep insight into business impact or ROI. The expected improvements in NPS are mentioned but appear to still be in the measurement phase.

The multi-provider integration with fallback capabilities is a strong architectural choice that addresses availability concerns and reduces vendor dependency, though the specific providers aren't named.

The emphasis on democratizing AI access to non-technical employees through the playground is commendable, though the presentation doesn't detail what safeguards are in place to prevent misuse or ensure appropriate use of the technology.

Overall, this case study represents a thoughtful, infrastructure-first approach to enterprise LLM adoption that addresses many of the practical challenges organizations face when moving from experimentation to production-scale deployment.
