Added Efficient Explanations from Empirical Explainers

This commit is contained in:
Anh M. Nguyen 2021-08-26 16:30:04 -05:00 committed by GitHub
parent b808548846
commit f022b0c921
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -217,19 +217,22 @@ This is an on-going attempt to consolidate interesting efforts in the area of un
* Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations. _Ross et al. IJCAI 2017_ [pdf](https://www.ijcai.org/Proceedings/2017/0371.pdf) * Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations. _Ross et al. IJCAI 2017_ [pdf](https://www.ijcai.org/Proceedings/2017/0371.pdf)
* Learning Explainable Models Using Attribution Priors. _Erion et al. 2019_ [pdf](https://arxiv.org/abs/1906.10670) * Learning Explainable Models Using Attribution Priors. _Erion et al. 2019_ [pdf](https://arxiv.org/abs/1906.10670)
* Interpretations are useful: penalizing explanations to align neural networks with prior knowledge. _Rieger et al. 2019_ [pdf](https://arxiv.org/pdf/1909.13584.pdf) * Interpretations are useful: penalizing explanations to align neural networks with prior knowledge. _Rieger et al. 2019_ [pdf](https://arxiv.org/pdf/1909.13584.pdf)
* L2E: Learning to Explain: Generating Stable Explanations Fast. _Situ et al. ACL 2021_ [pdf](https://aclanthology.org/2021.acl-long.415.pdf "Training neural networks to mimic a black-box attribution methods e.g. Occlusion, LIME, SHAP produces a faster and more stable explanation method.") | [code](https://github.com/situsnow/L2E)
### B2.2 Explaining by examples (prototypes) ### B2.2 Training deep nets to approximate expensive, posthoc attribution methods
* L2E: Learning to Explain: Generating Stable Explanations Fast. _Situ et al. ACL 2021_ [pdf](https://aclanthology.org/2021.acl-long.415.pdf "Training neural networks to mimic a black-box attribution methods e.g. Occlusion, LIME, SHAP produces a faster and more stable explanation method.") | [code](https://github.com/situsnow/L2E)
* Efficient Explanations from Empirical Explainers. _Schwarzenberg et al. 2021_ [pdf](https://arxiv.org/abs/2103.15429 "Training deep nets to approximate Integrated Gradient and Shapley methods")
### B2.3 Explaining by examples (prototypes)
* This Looks Like That: Deep Learning for Interpretable Image Recognition. _Chen et al. NeurIPS 2019_ [pdf](https://arxiv.org/abs/1806.10574) | [code](https://github.com/cfchen-duke/ProtoPNet) * This Looks Like That: Deep Learning for Interpretable Image Recognition. _Chen et al. NeurIPS 2019_ [pdf](https://arxiv.org/abs/1806.10574) | [code](https://github.com/cfchen-duke/ProtoPNet)
* ProtoPNet: This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition. _Nauta et al. 2020_ [pdf](https://arxiv.org/pdf/2011.02863.pdf) * ProtoPNet: This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition. _Nauta et al. 2020_ [pdf](https://arxiv.org/pdf/2011.02863.pdf)
* NP-ProtoPNet: These do not Look Like Those. _Singh et al. 2021_ [pdf](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9373404 "ProtoPNet with negative prototypes and applied to chest x-rays") * NP-ProtoPNet: These do not Look Like Those. _Singh et al. 2021_ [pdf](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9373404 "ProtoPNet with negative prototypes and applied to chest x-rays")
### B2.3 Adversarial attacks on XAI systems with humans in the loop ### B2.4 Adversarial attacks on XAI systems with humans in the loop
* When and How to Fool Explainable Models (and Humans) with Adversarial Examples. _Vadilo et al. 2021_ [pdf](https://arxiv.org/abs/2107.01943 "A framework of scenarios, assumptions, and humans in an XAI system under adversarial attacks") * When and How to Fool Explainable Models (and Humans) with Adversarial Examples. _Vadilo et al. 2021_ [pdf](https://arxiv.org/abs/2107.01943 "A framework of scenarios, assumptions, and humans in an XAI system under adversarial attacks")
* The effectiveness of feature attribution methods and its correlation with automatic evaluation scores. _Nguyen, Kim, Nguyen 2021_ [pdf](http://anhnguyen.me/project/feature-attribution-effectiveness/ "On image classification, feature attribution maps are less effective in improving human-AI team compared to a simple nearest-neighbor method. The effectiveness of heatmaps also does not correlate with their localization performance.") * The effectiveness of feature attribution methods and its correlation with automatic evaluation scores. _Nguyen, Kim, Nguyen 2021_ [pdf](http://anhnguyen.me/project/feature-attribution-effectiveness/ "On image classification, feature attribution maps are less effective in improving human-AI team compared to a simple nearest-neighbor method. The effectiveness of heatmaps also does not correlate with their localization performance.")
### B2.4 Others ### B2.5 Others
* Learning how to explain neural networks: PatternNet and PatternAttribution [pdf](https://arxiv.org/abs/1705.05598) * Learning how to explain neural networks: PatternNet and PatternAttribution [pdf](https://arxiv.org/abs/1705.05598)
* Deep Learning for Case-Based Reasoning through Prototypes [pdf](https://arxiv.org/pdf/1710.04806.pdf) * Deep Learning for Case-Based Reasoning through Prototypes [pdf](https://arxiv.org/pdf/1710.04806.pdf)
* Unsupervised Learning of Neural Networks to Explain Neural Networks [pdf](https://arxiv.org/abs/1805.07468) * Unsupervised Learning of Neural Networks to Explain Neural Networks [pdf](https://arxiv.org/abs/1805.07468)