mirror of https://github.com/JoergFranke/ADNC.git synced 2024-11-17 13:58:03 +08:00

Advanced Differentiable Neural Computer (ADNC) with application to bAbI task and CNN RC task.

babi deep-learning dnc mann nlp question-answering

Go to file

Joerg Franke 9d9bfd8f75 add inference scripts		2018-07-04 23:59:24 +02:00
adnc	update babi task	2018-07-04 23:48:02 +02:00
experiments/pre_trained	update babi task	2018-07-04 23:48:02 +02:00
images	add images for README	2018-07-02 23:06:15 +02:00
scripts	add inference scripts	2018-07-04 23:59:24 +02:00
test	add bucket for preparing variables for plotting	2018-06-26 23:01:08 +02:00
.gitattributes	update gitignore and add gitattributes	2018-06-25 13:19:54 +02:00
.gitignore	update gitignore and add gitattributes	2018-06-25 13:19:54 +02:00
.travis.yml	update requirements, add gpu requirements and add setup.py	2018-06-19 23:33:07 +02:00
LICENSE	update README and add folder structure	2018-06-17 11:21:17 +02:00
README.md	update README	2018-07-02 23:12:09 +02:00
requirements-gpu.txt	update boilerplate	2018-06-20 23:55:28 +02:00
requirements.txt	update boilerplate	2018-06-20 23:55:28 +02:00
setup.py	update setup.py	2018-07-02 23:07:42 +02:00

README.md

Advanced Differentiable Neural Computer

THIS REPOSITORY IS IN CONSTRUCTION, NOT EVERYTHING IS WORKING FINE YET

This repository contains a implementation of a Differentiable Neural Computer (DNC) with advancements for a more robust and scalable usage in Question Answering. It is applied to:

20 bAbI QA tasks with state-of-the-art results
CNN Reading Comprehension Task with passable results without any adaptation.

This repository is the groundwork for the MRQA 2018 paper submission "Robust and Scalable Differentiable Neural Computer for Question Answering". It contains a modular and fully configurable DNC with the following advancements:


Bypass Dropout	DNC Normalization	Content Based Memory Unit	Bidirectional Controller
Dropout to reduce the bypass connectivity Forces an earlier memory usage during training	Normalizes the memory unit's input %like layer normalization Increases the model stability during training	Memory Unit without temporal linkage mechanism Reduces memory consumption by up to 70	Bidirectional DNC Architecture Allows to handle variable requests and rich information extraction

The plot below shows the impact of the different advancements in the word error rate with the bAbI task 1.

Furthermore, it contains a set of rich analysis tools to get a deeper insight in the functionality of the ADNC. For example that the advancements lead to a more meaningful gate usage of the memory cell.

DNC	ADNC

How to use:

Setup ADNC

To install ADNC and setup an virtual environment:

git clone https://github.com/joergfranke/ADNC.git
cd ADNC/
python3 -m venv venv
source venv/bin/activate
pip install -e .

Inference

For bAbI inference, choose pre-trained model (DNC, ADNC, BiADNC) in scripts/inference_babi_task.py and run:

python scripts/inference_babi_task.py

For CNN inference, choose pre-trained model (ADNC, BiADNC) in scripts/inference_cnn_task.py and run:

python scripts/inference_babi_task.py

Training

t.b.a.

Plots

t.b.a.

Repository Structure

t.b.a.

bAbI Results

Task	DNC	EntNet	SDNC	ADNC	BiADNC	BiADNC +aug16
1: 1 supporting fact	9.0 ± 12.6	0.0 ± 0.1	0.0 ± 0.0	0.1 ± 0.0	0.1 ± 0.1	0.1 ± 0.0
2: 2 supporting facts	39.2 ± 20.5	15.3 ± 15.7	7.1 ± 14.6	0.8 ± 0.5	0.8 ± 0.2	0.5 ± 0.2
3: 3 supporting facts	39.6 ± 16.4	29.3 ± 26.3	9.4 ± 16.7	6.5 ± 4.6	2.4 ± 0.6	1.6 ± 0.8
4: 2 argument relations	0.4 ± 0.7	0.1 ± 0.1	0.1 ± 0.1	0.0 ± 0.0	0.0 ± 0.0	0.0 ± 0.0
5: 3 argument relations	1.5 ± 1.0	0.4 ± 0.3	0.9 ± 0.3	1.0 ± 0.4	0.7 ± 0.1	0.8 ± 0.4
6: yes/no questions	6.9 ± 7.5	0.6 ± 0.8	0.1 ± 0.2	0.0 ± 0.1	0.0 ± 0.0	0.0 ± 0.0
7: counting	9.8 ± 7.0	1.8 ± 1.1	1.6 ± 0.9	1.0 ± 0.7	1.0 ± 0.5	1.0 ± 0.7
8: lists/sets	5.5 ± 5.9	1.5 ± 1.2	0.5 ± 0.4	0.2 ± 0.2	0.5 ± 0.3	0.6 ± 0.3
9: simple negation	7.7 ± 8.3	0.0 ± 0.1	0.0 ± 0.1	0.0 ± 0.0	0.1 ± 0.2	0.0 ± 0.0
10: indefinite knowledge	9.6 ± 11.4	0.1 ± 0.2	0.3 ± 0.2	0.1 ± 0.2	0.0 ± 0.0	0.0 ± 0.1
11: basic coreference	3.3 ± 5.7	0.2 ± 0.2	0.0 ± 0.0	0.0 ± 0.0	0.0 ± 0.0	0.0 ± 0.0
12: conjunction	5 ± 6.3	0.0 ± 0.0	0.2 ± 0.3	0.0 ± 0.0	0.0 ± 0.1	0.0 ± 0.0
13: compound coreference	3.1 ± 3.6	0.0 ± 0.1	0.1 ± 0.1	0.0 ± 0.0	0.0 ± 0.0	0.0 ± 0.0
14: time reasoning	11 ± 7.5	7.3 ± 4.5	5.6 ± 2.9	0.2 ± 0.1	0.8 ± 0.7	0.3 ± 0.1
15: basic deduction	27.2 ± 20.1	3.6 ± 8.1	3.6 ± 10.3	0.1 ± 0.1	0.1 ± 0.1	0.1 ± 0.1
16: basic induction	53.6 ± 1.9	53.3 ± 1.2	53.0 ± 1.3	52.1 ± 0.9	52.6 ± 1.6	0.0 ± 0.0
17: positional reasoning	32.4 ± 8	8.8 ± 3.8	12.4 ± 5.9	18.5 ± 8.8	4.8 ± 4.8	1.5 ± 1.8
18: size reasoning	4.2 ± 1.8	1.3 ± 0.9	1.6 ± 1.1	1.1 ± 0.5	0.4 ± 0.4	0.9 ± 0.5
19: path finding	64.6 ± 37.4	70.4 ± 6.1	30.8 ± 24.2	43.3 ± 36.7	0.0 ± 0.0	0.1 ± 0.1
20: agent’s motivation	0.0 ± 0.1	0.0 ± 0.0	0.0 ± 0.0	0.1 ± 0.1	0.1 ± 0.1	0.1 ± 0.1
Mean WER:	16.7 ± 7.6	9.7 ± 2.6	6.4 ± 2.5	6.3 ± 2.7	3.2 ± 0.5	0.4 ± 0.3
Failed Tasks (<5%):	11.2 ± 5.4	5.0 ± 1.2	4.1 ± 1.6	3.2 ± 0.8	1.4 ± 0.5	0.0 ± 0.0

README.md Unescape Escape

Advanced Differentiable Neural Computer

How to use:

Setup ADNC

Inference

Training

Plots

Repository Structure

bAbI Results

README.md