Model Class Reliance for Variable Importance (MCR)

Understanding phenomena via modelling and post-hoc variable importance is currently flawed.

This is because shared information in the input features leads to multiple “best”-performing models that use the input variables to different degrees (i.e. they expose different, equally valid explanations of the phenomenon). Considering a single, arbitrary model (explanation) is misleading at best.

This project builds on the original work on Model Class Reliance (Fisher et al. 2019) to address this, contributing novel algorithms and methods.

Funder: EPSRC (Grant EP/T003928/1), ESRC (Grant ES/T01010X/1)
Duration: Oct 2019 – Mar 2023
Investigators: Gavin Smith, James Goulding, Roberto Mansilla
Partners: N/A

Project Description

Short Summary

Understanding phenomena via modelling and post-hoc variable importance is currently flawed.

When the measured variables are not independent, multiple equally performant models exist. Without knowledge of the causal process (often what is being investigated in the first place), machine learning algorithms will arbitrarily build one of them. Considering the variable importance (or SHAP, or LIME) of such a model is at best arbitrary, and more likely misleading or wrong.

This project aims to address these issues, providing novel methods and algorithms for correctly examining phenomena by acknowledging and explicitly accounting for the fact that measured features share information regarding the output. In many cases the approaches can be used as direct drop-in replacements within machine learning pipelines for variable importance and other methods such as SHAP. For instance, our work on MCR for Random Forests provides the approaches via Python estimators which are drop-in replacements for their sklearn counterparts.

How Variable Importance Goes Wrong

Consider two different, equally performant predictive mechanisms.

Fitting a Random Forest Classifier and then computing traditional (unconditional) permutation importance or SHAP produces the following:

The values for RF-Uncond are 50%, 25%, 25% (left to right).

Why? Consider the mean decrease in accuracy (MDA) reported by the permutation importance. In this instance an equal number of trees were trained: on average half used B and half used C. When B is permuted, its value changes for about half of the instances (assuming a balanced binary variable). For those instances the half of the trees that use B get the prediction wrong, the vote is tied, and the forest in effect selects at random between an output of 1 or 0. By chance it gets those predictions right or wrong 50% of the time, so it is wrong on about a quarter of all instances, leading to an MDA of 25%. The same goes for C.

However, we know that this result does not tell the full story, i.e. it is misleading. We know that the contribution of A, when measured as MDA, is 50%, while that of B and C can be anywhere between 0% and 50%. In contrast, Model Class Reliance (MCR) provides lower and upper bounds, i.e. we expect the variable contribution bounds of B and C to be [0%, 50%] and those of A to be [50%, 50%]. This is exactly what MCR returns for this toy example. The graphs below show this based on empirical experiments (the left graph compares MCR- for Random Forests with other existing methods which can be motivated as achieving somewhat similar goals, while the right graph does the same for MCR+ for Random Forests). See:

G. SMITH, R. MANSILLA and J. GOULDING, 2020. Model Class Reliance for Random Forests. In 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
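
The toy argument above can be reproduced with a small simulation. The sketch below is illustrative only and is not the RF-MCR algorithm from the paper: it assumes a hypothetical data-generating process (A and B are fair coin flips, C is an exact copy of B, and the output is A XOR B), two hand-written mechanisms that are both perfectly accurate, and a two-member voting ensemble with random tie-breaking as a stand-in for a forest whose trees split evenly between B and C. All function and variable names are made up for this example.

```python
import random

VARS = ("A", "B", "C")


def make_data(n, rng):
    """Assumed toy data: A, B are fair coin flips, C copies B, y = A XOR B."""
    rows = []
    for _ in range(n):
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        rows.append({"A": a, "B": b, "C": b, "y": a ^ b})
    return rows


def uses_a_and_b(row, rng):
    return row["A"] ^ row["B"]


def uses_a_and_c(row, rng):
    return row["A"] ^ row["C"]


def voting_ensemble(row, rng):
    """Stand-in for a forest: one voter uses B, one uses C, ties broken at random."""
    p1, p2 = uses_a_and_b(row, rng), uses_a_and_c(row, rng)
    return p1 if p1 == p2 else rng.randint(0, 1)


def accuracy(model, rows, rng):
    return sum(model(r, rng) == r["y"] for r in rows) / len(rows)


def mda(model, rows, var, rng):
    """Mean decrease in accuracy after permuting a single column."""
    shuffled = [r[var] for r in rows]
    rng.shuffle(shuffled)
    permuted = [{**r, var: v} for r, v in zip(rows, shuffled)]
    return accuracy(model, rows, rng) - accuracy(model, permuted, rng)


def run_toy(n=20000, seed=0):
    rng = random.Random(seed)
    rows = make_data(n, rng)
    models = {
        "uses_A_and_B": uses_a_and_b,
        "uses_A_and_C": uses_a_and_c,
        "ensemble": voting_ensemble,
    }
    results = {
        name: {v: mda(model, rows, v, rng) for v in VARS}
        for name, model in models.items()
    }
    # Lowest and highest reliance across the two perfectly accurate mechanisms.
    perfect = ("uses_A_and_B", "uses_A_and_C")
    bounds = {
        v: (min(results[m][v] for m in perfect), max(results[m][v] for m in perfect))
        for v in VARS
    }
    return results, bounds


results, bounds = run_toy()
for name, scores in results.items():
    print(name, {v: round(s, 2) for v, s in scores.items()})
print("bounds", {v: (round(lo, 2), round(hi, 2)) for v, (lo, hi) in bounds.items()})
```

Under these assumptions the ensemble reports roughly 50%, 25%, 25% for A, B, C, while the two individual mechanisms span roughly [0%, 50%] for B and C and stay at about 50% for A, which is the range MCR is designed to report.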

Problem addressed and Method

Machine learning models highlight the best predictive relationship between the inputs and outputs.

Variable importance details this relationship. But often there is more than one best predictive relationship. This set of models is known as the Rashomon Set.

When building and explaining a model, current analysis ignores all models other than the one built, and so do any subsequent attempts to understand the phenomenon on that basis. Unfortunately, the one model considered is arbitrary, chosen (typically silently) by the machine learning algorithm.

Model Class Reliance (MCR) acknowledges and addresses this issue, providing upper and lower bounds on (input) feature importance by considering the whole Rashomon Set (typically implicitly, for computational reasons). This communicates to the practitioner the most and the least a variable may be used by any predictive model that is able to achieve maximal predictive accuracy. When considering the importance of a variable to the phenomenon, this correctly indicates to practitioners the uncertainty in the variable's causal nature in predicting the phenomenon, which is inherent because (1) the data is purely observational and contains shared information and (2) no external causal knowledge has been provided.
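
The bounds described above can be written compactly. The notation here is an assumption introduced for illustration, in the spirit of Fisher et al. (2019): F is the model class, L(f) the expected loss of model f, f* a best-performing reference model, ε ≥ 0 a tolerance, and MR_j(f) some measure of how much f relies on variable j (for example, the MDA when variable j is permuted).

```latex
\mathcal{R}(\epsilon) = \{\, f \in \mathcal{F} : L(f) \le L(f^{*}) + \epsilon \,\}

\mathrm{MCR}^{-}_{j}(\epsilon) = \min_{f \in \mathcal{R}(\epsilon)} \mathrm{MR}_{j}(f),
\qquad
\mathrm{MCR}^{+}_{j}(\epsilon) = \max_{f \in \mathcal{R}(\epsilon)} \mathrm{MR}_{j}(f)
```

With ε = 0 and reliance measured as MDA, the toy example gives the interval [0%, 50%] for B and C and [50%, 50%] for A.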

This research extends the state of the art in MCR, providing new algorithms to (tractably) compute MCR for Random Forests (classification and regression), along with variants and extensions to aid practitioners and researchers in understanding phenomena through learning (non)linear models. Recently completed work includes grouped-MCR and MCR-SHAP, with other variants and extensions currently under development.

Contribution and Results

In 2019 an MCR method for (regularised) Linear and Kernel Regression under squared loss was proposed in the seminal work by Fisher et al. [1].

Stage 1 of this work: the first MCR method for Classification and Regression Random Forests (published in [2]). A Python implementation is available.

We provided proofs of correctness in the limit, empirical evidence of fast convergence, and a linearithmic-runtime implementation (vs. polynomial in [1]).

We compared to SVM-MCR [1] (for regression) and a baseline of 2 other approaches best representing MCR- and MCR+ (for classification), showing:

Multiple “best models” do exist. Different predictive mechanisms lead to different possible variable importance scores (wide MCR ranges).
RF-MCR performs correctly and is able to identify these ranges.

Left: MCR bounds for COMPAS – Recidivism Modelling (a regression task)
Right: Breast Cancer Wisconsin dataset (a classification task, hence the method from [1] is not compared)

See Smith et al. 2020 [2] for full details.

[1] Fisher, Rudin, and Dominici. “All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously.” JMLR 20.177 (2019): 1-81.

[2] Smith, Mansilla, and Goulding. “Model Class Reliance for Random Forests”. In 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.

Associated Publications

Model Class Reliance for Random Forests
Variable Importance (VI) has traditionally been cast as the process of estimating each variable’s contribution to a predictive model’s overall performance. Analysis of a single model instance, however, guarantees no insight into a variable’s relevance to underlying generative processes. Recent research has…

34th Conference on Neural Information Processing Systems (NeurIPS 2020). Gavin Smith, Roberto Mansilla, James Goulding

Group-MCR for RF-MCR is introduced in:
Ljevar, V., Goulding, J., Smith, G. and Spence, A. “Using Model Class Reliance to measure group effect on adherence to asthma medication”. IEEE International Conference on Big Data (Big Data). IEEE, 2021.


Resources

GitHub repository containing a pip-installable, sklearn-compatible Python implementation of MCR for Random Forests.

Media, Blogs and News Stories

Extended presentation given at TU Wien, slide deck available here.