mirror of https://github.com/JoergFranke/ADNC.git synced 2024-11-17 13:58:03 +08:00

Advanced Differentiable Neural Computer (ADNC) with application to bAbI task and CNN RC task.

babi deep-learning dnc mann nlp question-answering

Advanced Differentiable Neural Computer

This repository is under construction and some comments and functions are missing.

This repository contains an implementation of a Differentiable Neural Computer (DNC) with advancements for more robust and scalable usage in question answering. The work was published at the MRQA workshop at ACL 2018.

MRQA 2018 paper submission: Robust and Scalable Differentiable Neural Computer for Question Answering
More detailed master's thesis about the Advanced DNC for Question Answering

The ADNC is applied to:

20 bAbI QA tasks with state-of-the-art results
CNN Reading Comprehension Task with passable results without any adaptation or hyper-parameter tuning.

The implementation provides the following features:

Multi-read and multi-write head memory units
Key implementations (memory unit, controller, etc.) have tests
Modular implementation of memory unit and controller
Fully configurable via a YAML file
Pre-trained models on bAbI task and CNN RC task
Plot of the memory unit functionality during inference

The following advancements are implemented:


Bypass Dropout: Dropout to reduce the bypass connectivity. Forces earlier memory usage during training.
DNC Normalization: Normalizes the memory unit's input. Increases model stability during training.
Content Based Memory Unit: Memory unit without the temporal linkage mechanism. Reduces memory consumption by up to 70%.
Bidirectional Controller: Bidirectional DNC architecture. Allows handling of variable requests and rich information extraction.
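To make the idea behind the Content Based Memory Unit more concrete, here is a minimal sketch of content-based addressing as it is usually formulated for the DNC: each memory row is scored by its cosine similarity to a lookup key, the scores are scaled by a key strength, and a softmax turns them into read weights. This is plain Python for illustration only and is not the repository's implementation; the function names, the `strength` parameter name, and the toy memory are assumptions made for this example.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)  # eps avoids division by zero


def content_weighting(memory, key, strength=1.0):
    """Softmax over strength-scaled cosine similarities of key to each memory row."""
    scores = [strength * cosine_similarity(row, key) for row in memory]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


# Toy memory with three rows of width two (hypothetical values).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = content_weighting(memory, key=[1.0, 0.0], strength=10.0)
read_vector = read(memory, weights)
```

With a larger `strength`, the weighting concentrates on the row most similar to the key; without temporal linkage, reads of this kind are the only way to retrieve stored content.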

The plot below shows the impact of the different advancements on the word error rate for bAbI task 1.

Furthermore, the repository contains a set of rich analysis tools for gaining deeper insight into the functionality of the ADNC. For example, the advancements lead to a more meaningful gate usage of the memory cell, as the following plots show:

How to use:

Setup ADNC

To set up a virtual environment and install ADNC:

git clone https://github.com/joergfranke/ADNC.git
cd ADNC/
python3 -m venv venv
source venv/bin/activate
pip install -e .
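
As a quick sanity check after installation, a short script along the following lines can confirm that the package is visible to the active interpreter. It assumes the package is importable under the name `adnc`; that import name and the helper `is_installed` are assumptions for this sketch, not something the setup steps above guarantee.

```python
import importlib.util


def is_installed(package_name="adnc"):
    """Return True if the package can be found on the current Python path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Run inside the activated virtual environment.
    print("adnc importable:", is_installed())
```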

Inference

The repository contains several pre-trained models in the experiments folder. For bAbI inference, choose a pre-trained model, e.g. adnc, and run:

python scripts/inference_babi_task.py adnc

Possible models are dnc, adnc, and biadnc for bAbI task 1, and biadnc-all and biadnc-aug16-all for all bAbI tasks, without and with augmentation of task 16, respectively.
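
To run inference for every listed model in one go, a small wrapper like the following could be used. It is only a sketch: it assumes it is started from the repository root with the virtual environment active, and the helpers `inference_command` and `run_all` as well as the `dry_run` switch are hypothetical names introduced here, not part of the repository.

```python
import subprocess
import sys

# Model names as listed in this README.
BABI_MODELS = ["dnc", "adnc", "biadnc", "biadnc-all", "biadnc-aug16-all"]


def inference_command(model, script="scripts/inference_babi_task.py"):
    """Build the command line for one pre-trained model."""
    return [sys.executable, script, model]


def run_all(models=BABI_MODELS, dry_run=True):
    """Collect the inference command for each model and, unless dry_run, execute it."""
    commands = [inference_command(m) for m in models]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


if __name__ == "__main__":
    # Print the commands only; pass dry_run=False to actually run inference.
    for cmd in run_all(dry_run=True):
        print(" ".join(cmd))
```

The dry run is the default so that the commands can be inspected before any model is loaded.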

For CNN inference with the pre-trained ADNC, run:

python scripts/inference_cnn_task.py

Training

The configuration file scripts/config.yml contains the full config of the ADNC training. The training script can be run with:

python scripts/start/training.py

This starts a bAbI training run and produces a function plot after every epoch so that the training progress can be monitored.

Plots

To create a function plot for the bAbI task, choose a pre-trained model, e.g. adnc, and run:

python scripts/plot_function_babi_task.py