To kick off this article, I’d like to explain the interpretability of a machine learning (ML) model.

According to Merriam-Webster, interpretability describes the process of making something plain or understandable. In the context of ML, interpretability provides us with an understandable explanation of how a model behaves. Basically, it helps us figure out what’s behind model predictions and how these models work. Miller and Tim’s “Explanation in Artificial Intelligence: Insights from the Social Sciences” states that “Interpretability is the degree to which a human can understand the cause of a decision.” By utilizing ML interpretability methods, we increase this degree and allow humans to consistently predict the model's behavior.

Moreover, fairness and unbiasedness have recently become important auxiliary criteria for model optimization. ML interpretability is an essential tool to check these properties for ML systems.

In this work, I'll let you in on major methods of tackling the interpretability of ML models using Python and explain how to build a resilient machine learning infrastructure that serves as a foundation for artificial intelligence, accelerating the time to market for AI-powered projects.

Classes of Interpretability Methods

According to Christoph Molnar, author of the amazing Interpretable Machine Learning, all methods of ML interpretability can be broken down into the following classes:

Relative to the moment the interpretation task was set

First, the methods of interpretability can be classified relative to the moment a data scientist decides to interpret a model:

The first type of method is reduced to solving problems using relatively simple models whose structure is easy to interpret—sparse linear models, logistic regression, or a single decision tree. The second type of method is suitable when a ready-to-use model needs interpretation.

Model-Specific vs. Model-Agnostic

Secondly, methods of interpretability can also be broken down based on their area

of application: