Modify the dataset
parent 6730f0067f
commit 91b0cc6b60
@@ -2,10 +2,13 @@
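This project further simplifies Google's open-source [BERT](https://github.com/google-research/bert) code to make it easier to generate sentence vectors and perform text classification.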
---
***** New July 1st, 2019 *****
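+ Changed how the sentence-vector `graph` file is generated, which speeds up sentence-vector startup. The file is no longer generated as a temporary file on every run: the first run of extract_feature.py creates `tmp/result/graph`,
and later runs read that file directly. If the contents of `args.py` have been modified, the `tmp/result/graph` file must be deleted.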
+ Fixed a bug where the code raised an error when two processes were started at the same time to generate sentence vectors.
+ Changed the text-matching dataset to QA_corpus, which is more authoritative than the Ant Financial data.
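
The caching behaviour described in the first item above can be sketched roughly as follows. This is not the repository's code: `load_or_build` and `build_fn` are hypothetical names, and the write-then-rename step is only one common way to keep two simultaneous processes from reading a half-written file. The commit does not say how the concurrency bug was actually fixed.

```python
import os
import tempfile


def load_or_build(cache_path, build_fn):
    """Return the cached bytes at cache_path, building them on first use.

    Hypothetical helper: build_fn stands in for the expensive step
    (for example, serializing a graph) and must return bytes.
    """
    # Later runs: the file already exists, so read it and skip the build.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()

    # First run: build the data and make sure the target directory exists.
    data = build_fn()
    cache_dir = os.path.dirname(cache_path) or "."
    os.makedirs(cache_dir, exist_ok=True)

    # Assumption: write to a temporary file in the same directory, then
    # rename it into place, so another process never sees a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, cache_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return data
```

Under this sketch, deleting the cached file (as the note requires after editing `args.py`) simply forces the next call to rebuild it, for example `load_or_build("tmp/result/graph", build_fn)`.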
---
42479
data/dev.csv
File diff suppressed because it is too large
10001
data/test.csv
Normal file
File diff suppressed because it is too large
170002
data/train.csv
File diff suppressed because it is too large