not reliable advice

ixaxaar 2017-11-01 15:08:17 +05:30
parent af1a77ca7f
commit 9e1b0a2983

@@ -94,7 +94,6 @@ The visdom dashboard shows memory as a heatmap for batch 0 every `-summarize_fre
## General noteworthy stuff
1. DNCs converge with Adam and RMSProp learning rules, SGD generally causes them to diverge.
2. Using a large batch size (> 100, recommended 1000) prevents gradients from becoming `NaN`.
Repos referred to for creation of this repo: