mirror of https://github.com/JoergFranke/ADNC.git synced 2024-11-17 13:58:03 +08:00

Advanced Differentiable Neural Computer (ADNC) with application to bAbI task and CNN RC task.



Joerg Franke 7c3421f603 update babi plot functions		2018-07-06 09:01:58 +02:00


Advanced Differentiable Neural Computer

This repository is under construction and some comments and functions are missing.

This repository contains an implementation of a Differentiable Neural Computer (DNC) with advancements for more robust and scalable use in question answering. It is applied to:

20 bAbI QA tasks with state-of-the-art results
CNN Reading Comprehension Task with passable results without any adaptation or hyper-parameter tuning.

This repository is the groundwork for the MRQA 2018 paper submission "Robust and Scalable Differentiable Neural Computer for Question Answering". It contains a modular and fully configurable DNC with the following advancements:


Advancement	Description	Effect
Bypass Dropout	Dropout to reduce the bypass connectivity	Forces earlier memory usage during training
DNC Normalization	Normalizes the memory unit's input	Increases model stability during training
Content Based Memory Unit	Memory unit without temporal linkage mechanism	Reduces memory consumption by up to 70%
Bidirectional Controller	Bidirectional DNC architecture	Handles variable requests and allows rich information extraction
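A memory unit without temporal linkage has to find memory rows by content alone. The following is a minimal sketch of content-based addressing as it is commonly described for DNCs (cosine similarity between a key and each memory row, sharpened by a strength factor and normalized with a softmax). It is an illustration in plain Python, not the code used in this repository, and the names `cosine_similarity`, `content_weighting` and `read` are hypothetical.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity of two equally long vectors (eps avoids division by zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def content_weighting(memory, key, strength):
    """Softmax over the strength-scaled similarity between the key and each memory row."""
    scores = [strength * cosine_similarity(row, key) for row in memory]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


# Toy memory with three rows of width two (illustrative values only).
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights = content_weighting(memory, key=[1.0, 0.0], strength=10.0)
read_vector = read(memory, weights)
```

In this toy example the first row matches the key best and therefore receives the largest weight. No link matrix between rows is kept, which is where the memory saving of a content-only unit would come from.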

The plot below shows the impact of the different advancements on the word error rate for bAbI task 1.

Furthermore, it contains a set of rich analysis tools for deeper insight into how the ADNC works. For example, the following plots show that the advancements lead to a more meaningful gate usage of the memory cell:

How to use:

Setup ADNC

To set up a virtual environment and install ADNC:

git clone https://github.com/joergfranke/ADNC.git
cd ADNC/
python3 -m venv venv
source venv/bin/activate
pip install -e .
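After the editable install, a quick way to confirm that the package is visible inside the activated environment is to ask Python whether it can locate it. This is a small sketch, assuming the import name is `adnc` (taken from the `adnc` folder of the repository). The helper `is_installed` is hypothetical and not part of ADNC.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the package without importing it."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the package installed by `pip install -e .` is importable as "adnc".
    print("adnc importable:", is_installed("adnc"))
```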

Inference

The repository contains several pre-trained models in the experiments folder. For bAbI inference, choose a pre-trained model, e.g. adnc, and run:

python scripts/inference_babi_task.py adnc

Possible models are dnc, adnc and biadnc for bAbI task 1, and biadnc-all and biadnc-aug16-all for all bAbI tasks, without and with augmentation of task 16 respectively.
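To evaluate several pre-trained models in one go, the command above can be wrapped in a small driver. This is a sketch only: it assumes the script takes the model name as its single positional argument, as in the example command, and that it is started from the repository root. `BABI_MODELS`, `babi_inference_command` and `run_all` are hypothetical helpers, not part of ADNC.

```python
import subprocess
import sys

# Model names as listed above.
BABI_MODELS = ("dnc", "adnc", "biadnc", "biadnc-all", "biadnc-aug16-all")


def babi_inference_command(model, script="scripts/inference_babi_task.py"):
    """Build the command line for one model, rejecting unknown model names."""
    if model not in BABI_MODELS:
        raise ValueError(f"unknown model {model!r}, expected one of {BABI_MODELS}")
    return [sys.executable, script, model]


def run_all(models=BABI_MODELS):
    """Run inference for each model in turn and stop at the first failure."""
    for model in models:
        subprocess.run(babi_inference_command(model), check=True)
```

Calling `run_all()` from the repository root would then run the models one after another. Using `sys.executable` keeps the call inside the active virtual environment.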

For CNN inference of pre-trained ADNC run:

python scripts/inference_babi_task.py

Training

The configuration file scripts/config.yml contains the full config of the ADNC training. The training script can be run with:

python scripts/start/training.py

It starts a bAbI training run and produces a function plot every epoch so that the training progress can be monitored.

Plots

To create a function plot for the bAbI task, choose a pre-trained model, e.g. adnc, and run:

python scripts/plot_function_babi_task.py