interpretable random forests
The “variable importance” plot that shows the mean decrease in accuracy or node impurity per predictor is a classic metric used to interpret random forest models.
This metric aggregates mean decrease in accuracy across all trees in the forest. I wanted to watch a forest “grow” tree by tree, alongside the cumulative variable importance:
What does an interpretable random forest (RF) 🌲🌳 #datavis look like? Out-of-the-box 📦RF in #rstats and #python3 computes variable importance over *all* trees, but how do we get there? Here's a RF of 300 trees, tree-by-tree, showing cumulative variable importance. #DataScience pic.twitter.com/ODyrYRfUya
(Rich Pauloo, @RichPauloo, May 4, 2019)
Animations like the one above may help visualize an important behavior of machine learning models: the stability of random forest variable importance rankings. Ranking stability is critical to scientific pursuits, which require interpretable models. A model whose importance rankings shift from one fit to the next is harder to interpret than one whose rankings hold steady. To the best of my knowledge, variable importance ranking stability has been studied in the context of remote sensing¹ and bioinformatics².
In the example random forest model above, one predictor was much more predictive than the others, and the variable importance ranking was relatively stable. With different data, however, it's conceivable that a few predictors might trade places at the top of the ranking, because each tree is grown on a random sample of the training data and each split considers only a random subset of the predictors. This may happen within a single forest as it grows, or between different fully-grown forests.
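To make the "between different fully-grown forests" case concrete, here is a minimal sketch that refits the same forest under different seeds and tallies which predictor ends up on top. It is not taken from the analysis above. It assumes the `randomForest` package, uses the built-in `mtcars` data as a stand-in, and the number of repeats (`n_forests`) is an arbitrary choice for illustration.

```r
library(randomForest)

n_forests <- 20                      # hypothetical number of repeated fits
top_var <- character(n_forests)

for (s in seq_len(n_forests)) {
  set.seed(s)                        # a different seed gives a different forest
  rf <- randomForest(mpg ~ ., data = mtcars, ntree = 300)

  # Node-impurity importance (IncNodePurity), one value per predictor
  imp <- importance(rf, type = 2)[, 1]

  # Record the top-ranked predictor for this forest
  top_var[s] <- names(imp)[which.max(imp)]
}

# How often does each predictor hold the top rank across forests?
table(top_var)
```

If a single predictor wins every time, the top rank is stable for this data. A split tally suggests the ranking depends on the random draw.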
Below is a minimal example in R to reproduce an animation like the one above. In a next iteration, it would be edifying to add a plot of MSE against the number of trees. I expect that as the number of trees increases and the out-of-bag error stabilizes, the variable importance ranking also stabilizes.
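What follows is a hedged sketch of one way to compute tree-by-tree cumulative importance, not necessarily the code behind the animation above. It assumes the `randomForest` package and uses the built-in `mtcars` data as a stand-in. Because the package reports importance only for a whole forest, the sketch grows the forest as a sequence of single-tree fits and averages their node-impurity importance (`IncNodePurity`). That workaround is an assumption of this example.

```r
library(randomForest)

set.seed(1)
n_trees <- 300
predictors <- setdiff(names(mtcars), "mpg")

# Grow the forest one tree at a time: each call fits a single bootstrapped
# tree and returns that tree's node-impurity importance per predictor.
per_tree <- t(sapply(seq_len(n_trees), function(i) {
  tree <- randomForest(mpg ~ ., data = mtcars, ntree = 1)
  importance(tree, type = 2)[predictors, 1]
}))

# Cumulative importance after k trees = mean of the first k per-tree values.
cum_imp <- apply(per_tree, 2, cumsum) / seq_len(n_trees)

# Top-ranked predictor after each tree, and the last tree at which it changed.
top <- colnames(cum_imp)[apply(cum_imp, 1, which.max)]
last_change <- max(c(1, which(top[-1] != top[-n_trees]) + 1))

# One frame of the animation: cumulative importance after k trees.
k <- n_trees
barplot(sort(cum_imp[k, ]), horiz = TRUE, las = 1,
        main = paste("Trees:", k), xlab = "Mean IncNodePurity")
```

Looping `k` from 1 to `n_trees` and saving each bar plot gives the frames for an animation. `last_change` is a rough indicator of when the top rank settles.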
1. Behnamian, Amir, et al. “A systematic approach for variable selection with random forests: achieving stable variable importance values.” IEEE Geoscience and Remote Sensing Letters 14.11 (2017): 1988-1992.