Modify the dataset

joe 2019-07-01 10:22:40 +08:00
parent 6730f0067f
commit 91b0cc6b60
4 changed files with 120006 additions and 102479 deletions


@@ -2,10 +2,13 @@
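This project further simplifies Google's open-source [BERT](https://github.com/google-research/bert) code, making it easier to generate sentence vectors and perform text classification.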
---
***** New July 1st, 2019 *****
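+ Changed how the sentence-vector `graph` file is generated, to speed up sentence-vector startup. The graph is no longer written to a temporary file on every run: the first run of extract_feature.py creates `tmp/result/graph`, and later runs read that file directly. If the contents of `args.py` are modified, the `tmp/result/graph` file must be deleted.
+ Fixed a bug where the code raised an error when two processes were started at the same time to generate sentence vectors.
+ Switched the text-matching dataset to QA_corpus, which is more authoritative than the Ant Financial data.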
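
The build-once, reuse-afterwards behaviour described for `tmp/result/graph` can be sketched roughly as follows. This is only an illustration, not the code in extract_feature.py: `build_graph_bytes` and `get_graph_path` are hypothetical names, and the write-then-rename step is an assumption about one way to keep a second process from reading a half-written file.

```python
import os
import tempfile

# Path named in the changelog entry above.
GRAPH_PATH = os.path.join("tmp", "result", "graph")


def build_graph_bytes():
    # Hypothetical stand-in for the expensive graph-generation step.
    return b"serialized-graph"


def get_graph_path(path=GRAPH_PATH, builder=build_graph_bytes):
    """Return the cached graph path, building the file only if it is missing."""
    if os.path.exists(path):
        # Later runs: reuse the existing file without rebuilding.
        return path

    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)

    # Assumption: write to a temporary file in the same directory, then
    # rename it into place, so readers never see a partially written graph.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(builder())
    os.replace(tmp_path, path)
    return path
```

Because the cache is keyed only on the file's existence, a change to `args.py` is not detected automatically, which is why the file has to be deleted by hand after such a change.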
---

42479
data/dev.csv

File diff suppressed because it is too large

10001
data/test.csv Normal file

File diff suppressed because it is too large
