Papers on Explainable Artificial Intelligence

This is an ongoing attempt to consolidate interesting efforts in the area of understanding / interpreting / explaining / visualizing machine learning models.

0. GUI tools

Deep Visualization Toolbox code | pdf

1. Surveys

Methods for Interpreting and Understanding Deep Neural Networks pdf
The Mythos of Model Interpretability pdf
Towards A Rigorous Science of Interpretable Machine Learning pdf
Visualizations of Deep Neural Networks in Computer Vision: A Survey (Seifert et al. 2017) pdf
How convolutional neural network see the world - A survey of convolutional neural network visualization methods (Qin et al. 2018) pdf
A brief survey of visualization methods for deep learning models from the perspective of Explainable AI (Chalkiadakis 2018) pdf

2. Feature Visualization / Activation Maximization

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks code | pdf
Plug and Play Generative Networks pdf | code

3. Heatmap / Attribution

Learning how to explain neural networks: PatternNet and PatternAttribution pdf
A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations pdf
A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks pdf

Layer-wise Backpropagation

Beyond saliency: understanding convolutional neural networks from saliency prediction on layer-wise relevance propagation pdf
Explaining NonLinear Classification Decisions With Deep Taylor Decomposition pdf

4. Inverting Neural Networks

Understanding Deep Image Representations by Inverting Them pdf
Inverting Visual Representations with Convolutional Networks pdf
Neural network inversion beyond gradient descent pdf

5. Bayesian approaches

Yang, S. C. H., & Shafto, P. Explainable Artificial Intelligence via Bayesian Teaching. NIPS 2017 pdf

6. Distilling DNNs into more interpretable models

Interpreting CNNs via Decision Trees pdf
Distilling a Neural Network Into a Soft Decision Tree pdf

7. DNNs that learn to explain

Deep Learning for Case-Based Reasoning through Prototypes pdf
Unsupervised Learning of Neural Networks to Explain Neural Networks pdf

8. Others

Understanding Deep Architectures by Interpretable Visual Summaries pdf
