Deploy models in production-grade settings with KServe on Kubernetes.

Deploy Models to Production with KServe

KServe is a Kubernetes-based model inference platform built for highly scalable deployment use cases. It provides a standardized inference protocol across ML frameworks and supports a serverless architecture with autoscaling, including scale-to-zero on GPUs. KServe uses a simple, pluggable architecture for production ML serving that covers prediction, pre-/post-processing, monitoring, and explainability. ZenML's KServe integration provides a set of functionalities and steps that convert models from various ML frameworks into the format KServe expects, so you can deploy your models to KServe without writing any custom code or configuration.
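Under the hood, KServe represents each deployed model as an `InferenceService` Kubernetes resource, which the ZenML integration creates and manages for you. As a rough illustration of what such a resource looks like, here is a minimal manifest based on KServe's standard scikit-learn example (the name and `storageUri` are placeholders, not values produced by ZenML):

```yaml
# Minimal KServe InferenceService for a scikit-learn model.
# The predictor framework (sklearn) tells KServe which model
# server to run; storageUri points at the serialized model.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris            # placeholder service name
spec:
  predictor:
    sklearn:
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

Applying a manifest like this with `kubectl apply -f` gives you a served model endpoint that speaks KServe's standardized inference protocol; the ZenML integration generates the equivalent resource from your pipeline's model artifact so you never have to author it by hand.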