ZenML
Blog
Alex Strick van Linschoten

71 posts
Newsletter 17: What Teams Need to Ship AI Agents

We're expanding ZenML beyond its original MLOps focus into the LLMOps space, recognizing the same fragmentation patterns that once plagued traditional machine learning operations. We're developing three core capabilities. The first is native LLM components that provide unified APIs and management across providers like OpenAI and Anthropic, along with standardized prompt versioning and evaluation tools. The second applies established MLOps principles to agent development, bringing systematic versioning, evaluation, and observability to what's currently a "build it and pray" approach. The third enhances orchestration to support both LLM framework integration and direct LLM calls within workflows. Central to our philosophy is the principle of starting simple before going autonomous: we emphasize controlled workflows over fully autonomous agents for enterprise production environments. We're also actively seeking community input through a survey to guide our development priorities, because today's infrastructure decisions will determine which organizations can successfully scale AI deployment and which remain stuck in pilot phases.

How to Simplify Authentication in Machine Learning Pipelines (Without Compromising Security)

Discover how ZenML's Service Connectors solve one of MLOps' most frustrating challenges: credential management. This deep dive explores how Service Connectors eliminate security risks and save engineer time by providing a unified authentication layer across cloud providers (AWS, GCP, Azure). Learn how this approach improves developer experience with reduced boilerplate, enforces security best practices with short-lived tokens, and enables true multi-cloud ML workflows without credential headaches. Compare ZenML's solution with alternatives from Kubeflow, Airflow, and cloud-native platforms to understand why proper credential abstraction is the unsung hero of efficient MLOps.

8 Alternatives to Kubeflow for ML Workflow Orchestration (and Why You Might Switch)

We look at 8 practical alternatives to Kubeflow that address its common challenges of complexity and operational overhead. From Argo Workflows' lightweight Kubernetes approach to ZenML's developer-friendly experience, we analyze each tool's strengths across infrastructure needs, developer experience, and ML-specific capabilities, helping you find an orchestration solution that removes barriers from your ML workflows rather than creating them.

LLMOps Is About People Too: The Human Element in AI Engineering

This piece explores how successful LLMOps implementation depends on human factors, not just technical solutions. It addresses common challenges like misaligned executive expectations, siloed teams, and subject-matter expert resistance that often derail AI initiatives. It also offers practical strategies for creating effective team structures (hub-and-spoke, horizontal teams, cross-functional squads), improving communication, and integrating domain experts early. With actionable insights from companies like TomTom, Uber, and Zalando, readers will learn how to balance technical excellence with organizational change management to unlock the full potential of generative AI deployments.

Chat With Your ML Pipelines: Introducing the ZenML MCP Server

Discover the new ZenML MCP Server that brings conversational AI to ML pipelines. Learn how this implementation of the Model Context Protocol allows natural language interaction with your infrastructure, enabling query capabilities, pipeline analytics, and run management through simple conversation. Explore current features, engineering decisions, and future roadmap for this timely addition to the rapidly evolving MCP ecosystem.

Understanding the AI Act: February 2025 Updates and Implications

The EU AI Act, partially in effect as of February 2025, introduces comprehensive regulations for artificial intelligence systems with significant implications for global AI development. This landmark legislation categorizes AI systems by risk level, from prohibited applications to high-risk and limited-risk systems, and establishes strict requirements for transparency, accountability, and compliance. The Act imposes substantial penalties for violations, up to €35 million or 7% of global turnover, and provides a clear timeline for implementation through 2027. Organizations must take immediate action to audit their AI systems, implement robust governance infrastructure, and enhance development practices to ensure compliance, with tools like ZenML offering technical solutions for meeting these regulatory requirements.

LLMOps in Production: 457 Case Studies of What Actually Works

A comprehensive overview of lessons learned from the world's largest database of LLMOps case studies (457 entries as of January 2025), examining how companies implement and deploy LLMs in production. Through nine thematic blog posts covering everything from RAG implementations to security concerns, this article synthesizes key patterns and anti-patterns in production GenAI deployments, offering practical insights for technical teams building LLM-powered applications.

Production LLM Security: Real-world Strategies from Industry Leaders 🔐

Learn how leading companies like Dropbox, NVIDIA, and Slack tackle LLM security in production. This comprehensive guide covers practical strategies for preventing prompt injection, securing RAG systems, and implementing multi-layered defenses, based on real-world case studies from the LLMOps database. Discover battle-tested approaches to input validation, data privacy, and monitoring for building secure AI applications.

Optimizing LLM Performance and Cost: Squeezing Every Drop of Value

This comprehensive guide explores strategies for optimizing Large Language Model (LLM) deployments in production environments, focusing on maximizing performance while minimizing costs. Drawing from real-world examples and the LLMOps database, it examines three key areas: model selection and optimization techniques like knowledge distillation and quantization, inference optimization through caching and hardware acceleration, and cost optimization strategies including prompt engineering and self-hosting decisions. The article provides practical insights for technical professionals looking to balance the power of LLMs with operational efficiency.

The Evaluation Playbook: Making LLMs Production-Ready

A comprehensive exploration of real-world lessons in LLM evaluation and quality assurance, examining how industry leaders tackle the challenges of assessing language models in production. Through diverse case studies, the post covers the transition from traditional ML evaluation, establishing clear metrics, combining automated and human evaluation strategies, and implementing continuous improvement cycles to ensure reliable LLM applications at scale.

Prompt Engineering & Management in Production: Practical Lessons from the LLMOps Database

This post shares practical lessons on prompt engineering in production settings, drawn from real LLMOps case studies. It covers key aspects like designing structured prompts (demonstrated by Canva's incident review system), implementing iterative refinement processes (shown by Fiddler's documentation chatbot), optimizing prompts for scale and efficiency (exemplified by Assembled's test generation system), and building robust management infrastructure (as seen in Weights & Biases' versioning setup). Throughout these examples, the focus remains on systematic improvement through testing, human feedback, and error analysis, while balancing performance with operational costs and complexity.

LLM Agents in Production: Architectures, Challenges, and Best Practices

An in-depth exploration of LLM agents in production environments, covering key architectures, practical challenges, and best practices. Drawing from real-world case studies in the LLMOps Database, this article examines the current state of AI agent deployment, infrastructure requirements, and critical considerations for organizations looking to implement these systems safely and effectively.

LLMOps Lessons Learned: Navigating the Wild West of Production LLMs 🚀

Explore key insights and patterns from 300+ real-world LLM deployments, revealing how companies are successfully implementing AI in production. This comprehensive analysis covers agent architectures, deployment strategies, data infrastructure, and technical challenges, drawing from ZenML's LLMOps Database to highlight practical solutions in areas like RAG, fine-tuning, cost optimization, and evaluation frameworks.

Everything you ever wanted to know about LLMOps Maturity Models

As organizations rush to adopt generative AI, several major tech companies have proposed maturity models to guide this journey. While these frameworks offer useful vocabulary for discussing organizational progress, they should be viewed as descriptive rather than prescriptive guides. Rather than rigidly following these models, organizations are better served by focusing on solving real problems while maintaining strong engineering practices, building on proven DevOps and MLOps principles while adapting to the unique challenges of GenAI implementation.

Podcast: ML Monitoring with Emeli Dral

This week I spoke with Emeli Dral, co-founder and CTO of Evidently, an open-source tool for monitoring machine learning models and data. We discussed the challenges of building a tool that is straightforward to use yet also customizable and powerful.

Podcast: Practical Production ML with Emmanuel Ameisen

This week I spoke with Emmanuel Ameisen, a data scientist and ML engineer currently based at Stripe. Emmanuel also wrote an excellent O'Reilly book called 'Building Machine Learning Powered Applications', a book I find myself often returning to for inspiration and that I was pleased to get the chance to reread in preparation for our discussion.

Richify that CLI!

We recently reworked a number of parts of our CLI. Here are some quick wins we implemented along the way, using the popular open-source library rich, that can help you improve how users interact with your own CLI.