WhyLabs whylogs
Maintain data quality and detect drift with WhyLabs whylogs in ZenML pipelines

The WhyLabs whylogs integration with ZenML lets you add data and model profiling to your ML pipelines. With whylogs profiles, you can monitor data quality, detect data and model drift, and trigger automated corrective actions to keep your production models reliable and performant.

Features with ZenML

  • Seamless data profiling in ZenML pipelines
    Easily generate whylogs data profiles directly within your ZenML pipeline steps for any pandas DataFrame.
  • Flexible integration options
    Use the standard WhylogsProfilerStep, custom steps with the WhylogsDataValidator, or call the whylogs library directly.
  • Automated data validation
    Implement data quality checks and corrective actions based on the generated whylogs profiles.
  • Effortless visualization of profiles
    View interactive whylogs profile visualizations directly in the ZenML dashboard or Jupyter notebooks.
  • Easy WhyLabs platform logging
    Upload profiles to WhyLabs’ cloud platform for centralized tracking, analysis and documentation of data and models.
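
The automated data validation idea above can be sketched as a small check over per-column summary metrics. This is a minimal sketch, not part of the ZenML or whylogs API: the function name `find_columns_over_null_limit` and the 5% threshold are hypothetical, and the input is assumed to be a plain dict keyed by column name that carries `counts/n` and `counts/null` values, such as one might build from the summary table of a whylogs profile view.

```python
from typing import List, Mapping


def find_columns_over_null_limit(
    summary: Mapping[str, Mapping[str, float]],
    max_null_ratio: float = 0.05,  # hypothetical threshold, tune per dataset
) -> List[str]:
    """Return the columns whose share of null values exceeds the limit."""
    failing = []
    for column, metrics in summary.items():
        total = metrics.get("counts/n", 0)
        nulls = metrics.get("counts/null", 0)
        # Skip empty columns so the ratio is always well defined.
        if total > 0 and nulls / total > max_null_ratio:
            failing.append(column)
    return sorted(failing)


# Hypothetical summary values, for illustration only.
example_summary = {
    "age": {"counts/n": 100, "counts/null": 0},
    "bmi": {"counts/n": 100, "counts/null": 12},
}
failing_columns = find_columns_over_null_limit(example_summary)
```

A pipeline step could use the returned list to decide on a corrective action, for example raising an error or skipping training when it is non-empty.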

Main Features

  • Statistical data profiling and summarization
  • Data quality validation
  • Data drift detection
  • Model drift and performance degradation detection
  • Support for tabular data in pandas DataFrames
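
To make the drift detection item more concrete, the sketch below compares the mean of each column in a current summary against a reference summary and expresses the shift in reference standard deviations. This is a deliberately simple heuristic chosen for illustration, not the drift algorithm used by whylogs or WhyLabs. The function name `mean_shift_scores`, the threshold of 1.0, and the dict layout with `distribution/mean` and `distribution/stddev` keys are all assumptions.

```python
from typing import Dict, Mapping


def mean_shift_scores(
    reference: Mapping[str, Mapping[str, float]],
    current: Mapping[str, Mapping[str, float]],
) -> Dict[str, float]:
    """Score each shared column by |mean shift| / reference stddev."""
    scores = {}
    for column, ref in reference.items():
        if column not in current:
            continue
        ref_std = ref["distribution/stddev"]
        if ref_std == 0:
            continue  # Skip constant columns to avoid division by zero.
        shift = abs(current[column]["distribution/mean"] - ref["distribution/mean"])
        scores[column] = shift / ref_std
    return scores


# Hypothetical summary values, for illustration only.
reference_summary = {"bmi": {"distribution/mean": 25.0, "distribution/stddev": 2.0}}
current_summary = {"bmi": {"distribution/mean": 28.0, "distribution/stddev": 2.5}}

drift_scores = mean_shift_scores(reference_summary, current_summary)
# Flag columns whose mean moved by more than one reference stddev (assumed cutoff).
drifted = [column for column, score in drift_scores.items() if score > 1.0]
```

In practice the reference summary would come from a training-time profile and the current summary from a later batch; the cutoff is a judgment call that depends on the data.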

How to use ZenML with WhyLabs whylogs

# Shell setup (run once before executing this script):
# zenml integration install whylogs -y
# zenml data-validator register whylogs_data_validator --flavor=whylogs
# zenml stack register custom_stack -dv whylogs_data_validator -o default -a default --set

from typing import Annotated, Tuple
import pandas as pd
import whylogs as why
from sklearn import datasets
from whylogs.core import DatasetProfileView

from zenml.integrations.whylogs.flavors.whylogs_data_validator_flavor import (
    WhylogsDataValidatorSettings,
)
from zenml import step, pipeline


# Enable WhyLabs logging for this step and set the dataset ID.
@step(
    settings={
        "data_validator.whylogs": WhylogsDataValidatorSettings(
            enable_whylabs=True, dataset_id="model-1"
        )
    }
)
def data_loader() -> Tuple[
    Annotated[pd.DataFrame, "data"],
    Annotated[DatasetProfileView, "profile"]
]:
    """Load the diabetes dataset."""
    X, y = datasets.load_diabetes(return_X_y=True, as_frame=True)

    # merge X and y together
    df = pd.merge(X, y, left_index=True, right_index=True)

    profile = why.log(pandas=df).profile().view()
    return df, profile

@pipeline(enable_cache=False)
def my_pipeline():
    data, profile = data_loader()
    # ... do something with the data

if __name__ == "__main__":
    my_pipeline()

This snippet shows how whylogs, a data profiling and validation library, plugs into a ZenML pipeline. The data_loader step loads the diabetes dataset with scikit-learn, merges the features (X) and target (y) into a single DataFrame, and generates a whylogs profile of the data. The step's settings enable WhyLabs logging and specify a dataset ID. The pipeline my_pipeline contains only the data_loader step and can be extended with further steps that use the data and the profile. The if __name__ == "__main__": block runs the pipeline when the script is executed directly.

Additional Resources

  • Model Serving & Monitoring with BentoML + WhyLabs
  • Whylogs integration SDK docs
  • WhyLabs whylogs Documentation
