BentoML

Seamlessly Deploy Models to Production with ZenML and BentoML

Simplify your model deployment process by integrating BentoML with ZenML. This powerful combination allows you to package models into production-ready Bentos and serve them locally or in the cloud with ease, streamlining the transition from development to production.

Features with ZenML

  • Streamlined Model Packaging: Effortlessly package trained models into Bentos using ZenML's built-in BentoML steps.
  • Local Model Serving: Deploy and serve models locally for development and testing with the BentoML Model Deployer.
  • Container-based Model Serving: ZenML's built-in steps convert your Bento into a Docker image and automatically push it to your stack's container registry, from where you can deploy it anywhere.
  • Cloud Deployment Ready: Bentos are versioned and tracked, and you can fetch them from ZenML for seamless deployment to various cloud platforms using bentoctl or yatai.
  • Standardized Deployment Workflow: Establish a consistent and reproducible model deployment process across your organization.

Main Features

  • Framework-agnostic model packaging and serving
  • Supports local, cloud, and Kubernetes deployments
  • Easy-to-use Python API for defining prediction services
  • Automatic generation of OpenAPI specifications
  • Built-in monitoring and logging capabilities

How to use ZenML with BentoML

First, install the ZenML BentoML integration, then register the BentoML Model Deployer and add it to your active stack:

zenml integration install bentoml -y
zenml model-deployer register bentoml_deployer --flavor=bentoml
zenml stack update -d bentoml_deployer

Next, define a BentoML Service in a service.py file containing the logic to serve your model. It could look like the following:

from typing import Annotated

import bentoml
import numpy as np
import torch
from bentoml.validators import DType, Shape

SERVICE_NAME = "mnist_service"  # example service name
MODEL_NAME = "pytorch_mnist"  # name the model was saved under


def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    return tensor.detach().cpu().numpy()


@bentoml.service(
    name=SERVICE_NAME,
)
class MNISTService:
    def __init__(self):
        # load the model from the BentoML model store and set it
        # to evaluation mode
        self.model = bentoml.pytorch.load_model(MODEL_NAME)
        self.model.eval()

    @bentoml.api()
    async def predict_ndarray(
        self,
        inp: Annotated[np.ndarray, DType("float32"), Shape((28, 28))],
    ) -> np.ndarray:
        # add batch and channel dimensions: (28, 28) -> (1, 1, 28, 28)
        inp = np.expand_dims(inp, (0, 1))
        output_tensor = self.model(torch.tensor(inp))
        return to_numpy(output_tensor)
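
Once the service is running (for local testing you can start it with bentoml serve service:MNISTService), you can call it over HTTP. Below is a minimal client sketch using BentoML's Python client, which exposes each endpoint as a method named after it; the URL and port are assumptions matching the deployer step further down:

import bentoml
import numpy as np

# minimal client sketch; assumes MNISTService is reachable at this
# URL (e.g. on port 3001, as configured in the deployer step below)
client = bentoml.SyncHTTPClient("http://localhost:3001")
try:
    sample = np.random.rand(28, 28).astype("float32")  # dummy image
    prediction = client.predict_ndarray(inp=sample)
    print(prediction)
finally:
    client.close()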

You can then define your pipeline as follows:

from zenml import pipeline, step
from zenml.integrations.bentoml.steps import bento_builder_step
from zenml.integrations.bentoml.steps import bentoml_model_deployer_step


@pipeline
def bento_builder_pipeline():
    # model_training_step is a regular @step, defined elsewhere, that
    # returns your trained model
    model = model_training_step()
    bento = bento_builder_step(
        model=model,
        model_name="pytorch_mnist",  # name of the model
        model_type="pytorch",  # model framework (pytorch, tensorflow, sklearn, xgboost, ...)
        service="service.py:MNISTService",  # path to the service file within the ZenML repo
    )
    deployed_model = bentoml_model_deployer_step(
        bento=bento,
        model_name="pytorch_mnist",  # name of the model
        port=3001,  # port used by the HTTP server
        deployment_type="container",  # deployment type: either "local" or "container"
    )
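
After the pipeline runs, you can look up the model server that the deployer step started. The following is a sketch using ZenML's client API; it assumes the BentoML Model Deployer registered above is part of your active stack, and attribute names such as find_model_server and prediction_url may vary slightly between ZenML versions:

from zenml.client import Client

# build the Bento and deploy it by running the pipeline
bento_builder_pipeline()

# query the active stack's model deployer for running BentoML servers
model_deployer = Client().active_stack.model_deployer
services = model_deployer.find_model_server(model_name="pytorch_mnist")
if services:
    print(f"Prediction endpoint: {services[0].prediction_url}")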

Connect Your ML Pipelines to a World of Tools

Expand your ML pipelines with more than 50 ZenML Integrations
