# Papers on Explainable Artificial Intelligence
This is an ongoing attempt to consolidate interesting efforts in understanding, interpreting, explaining, and visualizing machine learning models.
## GUI tools

## Interpretability frameworks
- https://github.com/utkuozbulak/pytorch-cnn-visualizations (activation maximization)
- https://github.com/albermax/innvestigate (heatmaps)
- https://github.com/tensorflow/lucid (activation maximization, heatmaps)
## Surveys
- Methods for Interpreting and Understanding Deep Neural Networks. Montavon et al. 2017 pdf
- The Mythos of Model Interpretability. Lipton 2016 pdf
- Towards A Rigorous Science of Interpretable Machine Learning. Doshi-Velez & Kim 2017 pdf
- Visualizations of Deep Neural Networks in Computer Vision: A Survey. Seifert et al. 2017 pdf
- How convolutional neural network see the world - A survey of convolutional neural network visualization methods. Qin et al. 2018 pdf
- A brief survey of visualization methods for deep learning models from the perspective of Explainable AI. Chalkiadakis 2018 pdf
- A Survey Of Methods For Explaining Black Box Models. Guidotti et al. 2018 pdf
## Visualizing Preferred Stimuli

### Activation Maximization
- AM: Visualizing higher-layer features of a deep network. Erhan et al. 2009 pdf
- DeepVis: Understanding Neural Networks through Deep Visualization. Yosinski et al. 2015 pdf | url
- MFV: Multifaceted Feature Visualization: Uncovering the different types of features learned by each neuron in deep neural networks. Nguyen et al. 2016 pdf | code
- DGN-AM: Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. Nguyen et al. 2016 pdf | code
- PPGN: Plug and Play Generative Networks. Nguyen et al. 2017 pdf | code
- Feature Visualization. Olah et al. 2017 url
- Diverse feature visualizations reveal invariances in early layers of deep neural networks. Cadena et al. 2018 pdf
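The recipe shared by the activation-maximization papers above is gradient ascent on the input: synthesize an input that maximally activates a chosen unit, with a regularizer to keep the result well-behaved. A minimal sketch using a toy linear "neuron" in place of a real network (all names and constants here are illustrative, not from any of the papers):

```python
import numpy as np

# Toy activation maximization: ascend the activation a(x) = w @ x of a
# hypothetical linear "neuron", with L2 weight decay as the regularizer.
rng = np.random.default_rng(0)
w = rng.normal(size=8)              # the unit's weights (stand-in for a real neuron)
x = np.zeros(8)                     # start from a blank input
lr, weight_decay = 0.1, 0.01

for _ in range(100):
    grad = w                        # d a(x) / d x for the linear toy unit
    x += lr * (grad - weight_decay * x)   # ascend activation, shrink the norm

# The optimized input aligns with the unit's preferred direction.
cosine = x @ w / (np.linalg.norm(x) * np.linalg.norm(w))
print(round(cosine, 3))             # prints 1.0
```

The methods above differ mainly in the regularizer or prior: DGN-AM and PPGN, for instance, replace the simple decay term with a learned generator that keeps the synthesized input on the natural-image manifold.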
### Real images / Segmentation Masks
- Object Detectors Emerge in Deep Scene CNNs. Zhou et al. 2015 pdf
- Network Dissection: Quantifying Interpretability of Deep Visual Representations. Bau et al. 2017 url | pdf
- Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks. Fong & Vedaldi 2018 pdf
## Heatmaps / Attribution

### White-box
- Learning how to explain neural networks: PatternNet and PatternAttribution pdf
- A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations. Nie et al. 2018 pdf
- A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks pdf
- CAM: Learning Deep Features for Discriminative Localization. Zhou et al. 2016 code | web
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Selvaraju et al. 2017 pdf
- Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. Chattopadhyay et al. 2017 pdf | code
- LRP: Beyond saliency: understanding convolutional neural networks from saliency prediction on layer-wise relevance propagation pdf
- DTD: Explaining NonLinear Classification Decisions With Deep Taylor Decomposition pdf
- Regional Multi-scale Approach for Visually Pleasing Explanations of Deep Neural Networks. Seo et al. 2018 pdf
- The (Un)reliability of saliency methods. Kindermans et al. 2018 pdf
- Sanity Checks for Saliency Maps. Adebayo et al. 2018 pdf
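Several of the white-box methods above (CAM, Grad-CAM, Grad-CAM++) build their heatmap by combining a convolutional layer's feature maps with gradient information. A hedged sketch of the Grad-CAM combination step on random stand-in tensors (not the authors' code; a real model would supply the activations and gradients):

```python
import numpy as np

# Grad-CAM's core step: weight each feature map by its average gradient
# w.r.t. the class score, sum the weighted maps, then ReLU to keep only
# regions with a positive influence on the class.
rng = np.random.default_rng(1)
feature_maps = rng.normal(size=(4, 7, 7))  # C x H x W conv activations (stand-in)
gradients = rng.normal(size=(4, 7, 7))     # d(class score)/d(activations) (stand-in)

alphas = gradients.mean(axis=(1, 2))       # global-average-pool the gradients
cam = np.maximum(np.tensordot(alphas, feature_maps, axes=1), 0.0)  # ReLU(sum_k a_k A^k)

print(cam.shape)                           # (7, 7): coarse map, later upsampled
```

CAM is the special case where the network ends in global average pooling, so the weights come directly from the final layer; Grad-CAM++ refines how the gradient weights are computed.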
### Black-box
- RISE: Randomized Input Sampling for Explanation of Black-box Models. Petsiuk et al. 2018 pdf
- LIME: "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Ribeiro et al. 2016 pdf | blog
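Black-box methods like RISE need only the model's outputs: they perturb the input with random masks and accumulate each mask weighted by the score the masked input receives. A minimal RISE-style sketch, with a hypothetical scorer standing in for the classifier:

```python
import numpy as np

# RISE-style saliency: average random binary masks, each weighted by the
# black-box score of the masked input. Pixels whose presence raises the
# score accumulate more weight.
rng = np.random.default_rng(2)
important = np.zeros((8, 8))
important[2:5, 2:5] = 1.0              # the region the toy model actually uses

def black_box_score(img):
    # stand-in for any classifier's class probability
    return (img * important).sum() / important.sum()

image = np.ones((8, 8))
saliency = np.zeros((8, 8))
n_masks = 2000
for _ in range(n_masks):
    mask = (rng.random((8, 8)) > 0.5).astype(float)
    saliency += black_box_score(image * mask) * mask
saliency /= n_masks

# Pixels inside the model's important region score higher on average.
inside = saliency[2:5, 2:5].mean()
outside = saliency[important == 0].mean()
print(inside > outside)
```

LIME takes a related perturbation-based route but fits a local interpretable (e.g. sparse linear) model to the perturbed samples instead of averaging masks.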
## Inverting Neural Networks
- Understanding Deep Image Representations by Inverting Them pdf
- Inverting Visual Representations with Convolutional Networks pdf
- Neural network inversion beyond gradient descent pdf
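The inversion papers above recover an input from its feature code by optimization: find an x whose representation phi(x) matches a target code. A toy sketch with a linear feature extractor standing in for the network (real methods backpropagate through the model and add image priors; everything here is illustrative):

```python
import numpy as np

# Representation inversion as optimization: minimize ||phi(x) - phi0||^2
# by gradient descent, with phi(x) = W @ x as a stand-in for the network.
rng = np.random.default_rng(3)
W = rng.normal(size=(16, 8))           # toy feature extractor
x_true = rng.normal(size=8)
phi0 = W @ x_true                      # feature code of the (unknown) original input

x = np.zeros(8)
lr = 0.01
for _ in range(5000):
    residual = W @ x - phi0
    x -= lr * 2 * W.T @ residual       # gradient of the feature-matching loss

# With a well-conditioned phi the original input is recovered.
print(round(float(np.linalg.norm(x - x_true)), 3))
```

For deep networks the objective is non-convex and features discard information, which is why these papers add natural-image priors or train a dedicated inversion network.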
## Bayesian approaches
- Explainable Artificial Intelligence via Bayesian Teaching. Yang & Shafto. NIPS 2017 pdf
## Distilling DNNs into more interpretable models
- Interpreting CNNs via Decision Trees pdf
- Distilling a Neural Network Into a Soft Decision Tree pdf
- Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation. Tan et al. 2018 pdf
- Improving the Interpretability of Deep Neural Networks with Knowledge Distillation. Liu et al. 2018 pdf
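The common idea in the distillation papers above is to fit a transparent student to a teacher's soft outputs, then inspect the student. A hedged sketch with a linear student fitted by least squares to a hypothetical teacher's logits (the papers use richer students such as soft decision trees; names and numbers here are illustrative):

```python
import numpy as np

# Distillation into a transparent model: regress an interpretable linear
# student onto a black-box teacher's logits; the student's weights can
# then be read off as feature importances.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))

def teacher_logit(X):
    # stand-in black-box teacher: mostly linear with a small nonlinearity
    return X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.1 * X[:, 0] ** 2

soft_targets = teacher_logit(X)
student_w, *_ = np.linalg.lstsq(X, soft_targets, rcond=None)

# The student's weights approximate the teacher's main effects, so the
# dominant features are directly inspectable.
print(np.round(student_w, 2))
```

How faithfully the student mimics the teacher (its fidelity) is itself a diagnostic: a large gap, as in Distill-and-Compare, flags behavior the transparent model cannot express.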
## Learning to explain
- Deep Learning for Case-Based Reasoning through Prototypes pdf
- Unsupervised Learning of Neural Networks to Explain Neural Networks pdf
## Machine rationalization
- Automated Rationale Generation: A Technique for Explainable AI and its Effects on Human Perceptions pdf
- Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations pdf
## Understanding via Mathematical and Statistical tools
- Understanding Deep Architectures by Interpretable Visual Summaries pdf
- A Peek Into the Hidden Layers of a Convolutional Neural Network Through a Factorization Lens. Saini et al. 2018 pdf
- How Important Is a Neuron? Dhamdhere et al. 2018 pdf
## Applications
- Explainable AI for Designers: A Human-Centered Perspective on Mixed-Initiative Co-Creation pdf
- ICADx: Interpretable computer aided diagnosis of breast masses. Kim et al. 2018 pdf