PyData London 2023

MLflow workshop
06-02, 15:30–17:00 (Europe/London), Salisbury

In this tutorial, we will learn the basics of MLflow. After introducing the library and the problems it solves, we will implement an end-to-end machine learning lifecycle using MLflow.


Bringing a machine learning project to a successful conclusion is always more complex than originally planned. It requires a lot of experimentation with the data, algorithms, and parameters. Productionizing the final model comes with another set of challenges: not only do you have to deploy it, but you also have to monitor it over time to ensure its health. All of this requires its own toolbox.

This tutorial introduces MLflow, an open-source project that simplifies the entire ML lifecycle without locking you into a specific library or framework. MLflow offers a simple abstraction to track, package, and manage ML projects. We will focus in particular on MLflow Recipes, a new framework that enables data scientists to quickly develop high-quality models and deploy them to production.

At the end of the tutorial you will be familiar with:
* The key abstractions and components of the library
* How to keep track of ML parameters and results
* How to package the training code in a reproducible format
* How to deploy the model for batch or real-time inference

The tutorial aims to provide a broad rather than deep introduction. It gives the audience hands-on experience so that they can later dive deeper into the specific submodules.

GitHub repository

Please try to complete the setup before the tutorial by following the instructions in the README!


Prior Knowledge Expected

No previous knowledge expected

Theodore Meynard is a data scientist at GetYourGuide. He works on the company's ranking algorithm to help customers find the best activities to book and locations to explore. He is one of the co-organisers of the PyData Berlin meetup. When he is not programming, he loves riding his bike in search of the best bakery-patisserie in town.
