MLOps Pipeline

User Guide

The User Guide explains how a machine learning engineer (MLE) can use the pipeline to train a model while benefiting from its visualization features and reduced overhead.

Developer Guide

The Developer Guide is for MLOps engineers who want to set up the pipeline themselves.

MLOps Resources

The following introductory materials on MLOps will help you get started. They cover various aspects of Machine Learning Operations, from data pipelines and model development to deployment in production.

Books with Code Samples

  1. Data Pipelines With Apache Airflow (Authors: Bas P. Harenslak, Julian Rutger De Ruiter)

  2. Practical Deep Learning at Scale with MLflow (Author: Yong Liu)

  3. Designing Machine Learning Systems (Author: Chip Huyen)

Tutorials and Courses

  1. Made With ML by Goku Mohandas

  2. Stanford's ML Systems Design Course

  3. Deploying Machine Learning Models in Production (Coursera)

  4. Full Stack Deep Learning

Goals

The pipeline was originally designed for the certAInty project but is meant to be general enough for other projects as well. Its design goals are:

Robustness: Data should persist even if the pipeline fails.

Reproducibility: The pipeline should be reproducible.

Scalability: The pipeline should be able to scale to multiple machines.

Flexibility: The pipeline should be able to run on different machines.

Monitoring: The pipeline should be able to monitor itself.

Logging: The pipeline should be able to log itself.

Parity: The pipeline should be able to run in production.

Visualization: The pipeline should be able to visualize itself.