Commit Graph

61 Commits

Author SHA1 Message Date
Hai Liang Wang
1fdfaaea40 Ignore synonyms/data/words.vector 2020-09-21 13:19:15 +08:00
Hai Liang Wang
9de17fa1a2 publish new release and copyrights 2020-09-11 09:07:40 +08:00
corey@cn
6f6abdc877 nearby, display add default param: size(=10) 2019-06-11 14:51:20 +08:00
Hai Liang Wang
fe7450d51d Closed #77 fix package version for better compatibility 2019-04-21 09:19:31 +08:00
Hai Liang Wang
b23e1c3ba7 Compute edit distance, remove stop words 2018-10-25 11:29:41 +08:00
Hai Liang Wang
9979984773 Expose sentence to vector fn 2018-09-21 21:58:02 +08:00
Hai Liang Wang
68234bd47c add synonyms.v fn 2018-09-21 21:42:43 +08:00
Chen Jiali
1c996d76d3 change import path of utils in word2vec.py to local path 2018-07-09 19:03:11 +08:00
Hai Liang Wang
4a44eff88f #60 compare supports swapping sentences 2018-05-28 11:44:36 +08:00
huyingxi
c580b3d82d
Update synonyms.py
update _nearby_levenshtein_distance
which is calculated by considering both first and second sentence's nearby words
2018-04-28 17:20:49 +08:00
Hai Liang Wang
86e24d59f2 Merge branch 'master' of github.com:huyingxi/Synonyms 2018-04-11 11:13:30 +08:00
Hai Liang Wang
ae69e67517 #59 lower the semantic distance score 2018-04-11 11:13:14 +08:00
Sun Zhi
241c089ed6 set ignore parameter for function synonyms.compare to ignore unconcerned words 2018-03-20 11:36:04 +08:00
Hai Liang Wang
ec30739893 Fix invalid cosine value 2018-03-09 20:50:37 +08:00
ubuntu
4ade9bd136 Fixed #51 return edit distance with empty vectors 2018-03-09 16:29:29 +08:00
Hai Liang Wang
a9e86a4f3b Fixed #51 return edit distance with empty vectors 2018-03-09 09:32:35 +08:00
Hai Liang Wang
81c580054a Fix install error in Linux 2018-03-08 11:19:36 +08:00
Hai Liang Wang
cbd18d54e2 #49 modify default dict path in jieba 2018-03-07 15:16:42 +08:00
Hai Liang Wang
1dff0f01b5 Use glogging module for logs 2018-03-07 15:08:42 +08:00
Hai Liang Wang
c1ac5a3d9c [compare] update formula 2018-03-05 14:54:18 +08:00
Hai Liang Wang
1f79bfdf4d Add VALUATION.md 2018-03-05 14:30:45 +08:00
Hai Liang Wang
989b44f4b3 Fixed #46 use jieba source codes, modify word, tag loading 2018-03-05 11:19:50 +08:00
Hai Liang Wang
dfd23c1b97 [seg] load dict 2018-03-05 09:17:34 +08:00
Hai Liang Wang
4c97dfb7ce Check string for chinese words 2018-03-04 23:11:12 +08:00
Hai Liang Wang
7bfee1403a [compare] update score calculation 2018-03-04 22:56:13 +08:00
Hai Liang Wang
ddd2a97720 [nearby] Check eq before computing sim 2018-03-04 17:49:40 +08:00
Hai Liang Wang
e3d91154e6 export seg method 2018-03-04 10:27:18 +08:00
Hai Liang Wang
97b676ee82 Closed #45 enable cache for nearby words 2018-03-04 09:52:44 +08:00
Hai Liang Wang
d4f20e94a9 Refine vocab and nearby 2018-03-04 00:02:28 +08:00
Hai Liang Wang
a1af98a5a9 Support customized dict and model 2018-03-03 21:18:43 +08:00
Hai Liang Wang
4450ba836c refine format 2018-03-02 13:19:03 +08:00
Hai Liang Wang
0e5794cfff Leverage distance computing algorithm in compare API 2018-03-02 11:08:54 +08:00
Hai Liang Wang
dac98aa891 Closed #43 smoothing scores in compare API 2018-03-01 23:29:49 +08:00
AlexSun1995
129fe23cc8 Replace the all-zero vector for words missing from the model with a random vector 2018-02-02 21:04:52 +08:00
Hain Wang
043a8ced04
Merge pull request #34 from weiyunfei/master
Resolve the ImportError when there's "utils" model.
2018-01-29 11:36:20 +08:00
wei
2db34e1479 Resolve the ImportError when there's "utils" model. 2018-01-25 16:13:44 +08:00
Hai Liang Wang
9356e62c78 #31 remove synonyms/data.py 2018-01-23 10:33:10 +08:00
cclauss
f8a2e30a17 import shutil and from six import string_types, u 2018-01-18 11:31:57 +01:00
Hai Liang Wang
e73f2a0dd3 Refine distance params, upgrade to v2 2017-12-31 19:01:05 +08:00
huyingxi
54a978e63e
Update __init__.py
update similarity_distance weights
2017-12-30 23:22:35 +08:00
bobbercheng
cdb85530f4
Update __init__.py
Use _levenshtein_distance() to replace _unigram_overlap. We still need to adjust the weights for better results. Right now the following two sentences are still regarded as the same.

synonyms.compare('目前你用什么方法来保护朋友',  '目前你用什么方法来保护家人')
2017-11-14 09:53:28 -06:00
inuyasha2012
c41ee7397d fix python3 open stopwords file UnicodeDecodeError bug 2017-11-08 09:37:48 +08:00
huyingxi
de9c7ec748
Update __init__.py
update sentence similarity threshold
2017-11-04 10:45:56 -05:00
Hai Liang Wang
a1283d9926 #6 move useless code into data.py 2017-10-31 17:17:01 +08:00
Hai Liang Wang
312eae0067 #6 simplify code and support py2,3 2017-10-31 16:54:55 +08:00
huyingxi
86c188bb57 Update __init__.py 2017-10-30 22:18:54 -05:00
huyingxi
c0a4775551 enhance sentence similarity 2017-10-31 10:54:31 +08:00
Hai Liang Wang
4575f3e7c6 use jieba as tokenizer 2017-10-28 10:06:11 +08:00
jiangbo
55be4367b7 chore: reattach stdout to sys 2017-10-21 22:16:00 +08:00
Hai Liang Wang
07de9b2aef refactor with pkl, add benchmark 2017-10-21 11:45:15 +08:00