PyData London 2023

Steve Goodman

Steve has 20 years of experience in data analytics and data science, mostly in the fields of marketing, consulting and financial services. He is currently a Data Science Lead at Tide, a financial services platform based in London. He holds a PhD in Applied Statistics and an MBA.


From correlations to causality in machine learning – a gentle guide to causal inference
Steve Goodman

Today, most conventional ML systems exploit correlations in data to draw inferences. However, as we learned back in school statistics class, correlation is not causation. So when you need to know the ‘why’ behind a particular prediction, or why A outperforms B in an experiment, relying on correlations is insufficient. Furthermore, some ML models are built purely for explainability and insight rather than prediction, in order to understand how the world works so that we could potentially make some kind of policy change. For example: what if we had chosen a different strategy or tactic? Would the outcome have been different, and if so, by how much? To answer these kinds of questions, you need to delve into the world of causality.

This talk is a gentle (and occasionally entertaining) introduction to the interdisciplinary field of causality and how it is starting to impact machine learning. You will learn what kinds of questions causal inference can answer and how, under certain conditions, it can address some of the limitations of current explainable ML methods. I draw on use cases from financial services and marketing, and I will show a short practical example of how combining human domain knowledge (expressed intuitively via Graphical Causal Models) with your data can sometimes unlock insights not recoverable by purely data-driven approaches.
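
To give a flavour of that last point, here is a minimal sketch in Python. It is not the example from the talk. It uses simulated data and a hypothetical causal graph in which a confounder Z drives both a treatment T and an outcome Y. The variable names, the coefficients, and the helper ols_coefficients are all assumptions made for illustration. A purely correlational fit of Y on T overstates the effect, while adjusting for Z, as the assumed graph suggests, should recover something close to the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Assumed (hypothetical) causal graph: Z -> T, Z -> Y, T -> Y
z = rng.normal(size=n)                      # confounder
t = 1.5 * z + rng.normal(size=n)            # treatment is partly driven by Z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)  # true causal effect of T on Y is 2.0


def ols_coefficients(features, target):
    """Least-squares fit; returns coefficients with the intercept first."""
    X = np.column_stack([np.ones(len(target)), *features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef


# Purely data-driven view: regress Y on T alone and ignore Z.
naive_effect = ols_coefficients([t], y)[1]

# Domain knowledge says Z confounds T and Y, so adjust for it.
adjusted_effect = ols_coefficients([t, z], y)[1]

print(f"naive estimate:    {naive_effect:.2f}")
print(f"adjusted estimate: {adjusted_effect:.2f}")
```

In this toy setup the data alone cannot say which estimate to trust. It is the assumed graph that indicates Z should be adjusted for. Real problems are rarely this clean, and the adjustment is only valid if the graph is right.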