
Hugging Face (Inference Endpoints)

Effortlessly deploy Hugging Face models to production with ZenML


Integrate Hugging Face Inference Endpoints with ZenML to streamline the deployment of transformers, sentence-transformers, and diffusers models. This integration allows you to leverage Hugging Face's secure, scalable infrastructure for hosting models, while managing the deployment process within your ZenML pipelines.

Features with ZenML

  • Seamless deployment of Hugging Face models directly from ZenML pipelines
  • Simplified management of inference endpoints within the ZenML ecosystem
  • Automatic scaling of deployments based on demand using Hugging Face's infrastructure
  • Centralized registry of deployed models for easy tracking and monitoring
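
The centralized deployment registry can be inspected from the ZenML CLI. As a sketch (assuming a Hugging Face model deployer is registered in your active stack; the UUID is a placeholder you would copy from the list output):

```shell
# List all model servers managed by the active model deployer
zenml model-deployer models list

# Show the status and prediction URL of one deployment (UUID is illustrative)
zenml model-deployer models describe <DEPLOYMENT_UUID>
```
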


Main Features

  • Secure model hosting on dedicated Hugging Face infrastructure
  • Autoscaling capabilities to handle variable inference workloads
  • Support for a wide range of model types and frameworks
  • Pay-per-use pricing for cost-effective deployments
  • Enterprise-grade security features like VPC deployment

How to use ZenML with Hugging Face (Inference Endpoints)


from typing import Annotated

from zenml import pipeline, step
from zenml.integrations.huggingface.services.huggingface_deployment import (
    HuggingFaceDeploymentService,
)
from zenml.integrations.huggingface.steps import huggingface_model_deployer_step


@step
def predictor(
    service: HuggingFaceDeploymentService,
) -> Annotated[str, "predictions"]:
    # Run an inference request against the deployed prediction service
    data = load_live_data()  # replace with your own data-loading logic
    prediction = service.predict(data)
    return prediction


@pipeline
def deploy_and_infer():
    # Deploy the model to a Hugging Face Inference Endpoint and
    # pass the resulting service on to the inference step
    service = huggingface_model_deployer_step(
        model_name="text-classification-model",
        accelerator="gpu",
        hf_repository="myorg/text-classifier",
        task="text-classification",
    )
    predictor(service)

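Before running the pipeline above, the Hugging Face model deployer must be part of your active ZenML stack. A minimal setup might look like the following (the deployer name `hf-endpoints` is a placeholder, and the token/namespace values come from your Hugging Face account; check the ZenML docs for the exact flags in your version):

```shell
# Install the integration and register the model deployer (names are illustrative)
zenml integration install huggingface -y
zenml model-deployer register hf-endpoints \
    --flavor=huggingface \
    --token=<YOUR_HF_TOKEN> \
    --namespace=<YOUR_HF_NAMESPACE>

# Add the deployer to the active stack
zenml stack update -d hf-endpoints
```
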
