Hai Liang Wang
1fdfaaea40
Ignore synonyms/data/words.vector
2020-09-21 13:19:15 +08:00
Hai Liang Wang
9de17fa1a2
publish new release and copyrights
2020-09-11 09:07:40 +08:00
corey@cn
6f6abdc877
nearby, display: add default param size (=10)
2019-06-11 14:51:20 +08:00
Hai Liang Wang
fe7450d51d
Closed #77 fix package version for better compatibility
2019-04-21 09:19:31 +08:00
Hai Liang Wang
b23e1c3ba7
Compute edit distance, remove stop words
2018-10-25 11:29:41 +08:00
Hai Liang Wang
9979984773
Expose sentence to vector fn
2018-09-21 21:58:02 +08:00
Hai Liang Wang
68234bd47c
add synonyms.v fn
2018-09-21 21:42:43 +08:00
Chen Jiali
1c996d76d3
change import path of utils in word2vec.py to local path
2018-07-09 19:03:11 +08:00
Hai Liang Wang
4a44eff88f
#60 compare: support swapping the sentences
2018-05-28 11:44:36 +08:00
huyingxi
c580b3d82d
Update synonyms.py
update _nearby_levenshtein_distance
which is calculated by considering both first and second sentence's nearby words
2018-04-28 17:20:49 +08:00
Hai Liang Wang
86e24d59f2
Merge branch 'master' of github.com:huyingxi/Synonyms
2018-04-11 11:13:30 +08:00
Hai Liang Wang
ae69e67517
#59 lower the semantic distance score
2018-04-11 11:13:14 +08:00
Sun Zhi
241c089ed6
set ignore parameter for function synonyms.compare to ignore unconcerned words
2018-03-20 11:36:04 +08:00
Hai Liang Wang
ec30739893
Fix invalid cosine value
2018-03-09 20:50:37 +08:00
ubuntu
4ade9bd136
Fixed #51 return edit distance with empty vectors
2018-03-09 16:29:29 +08:00
Hai Liang Wang
a9e86a4f3b
Fixed #51 return edit distance with empty vectors
2018-03-09 09:32:35 +08:00
Hai Liang Wang
81c580054a
Fix install error in Linux
2018-03-08 11:19:36 +08:00
Hai Liang Wang
cbd18d54e2
#49 modify default dict path in jieba
2018-03-07 15:16:42 +08:00
Hai Liang Wang
1dff0f01b5
Use glogging module for logs
2018-03-07 15:08:42 +08:00
Hai Liang Wang
c1ac5a3d9c
[compare] update formula
2018-03-05 14:54:18 +08:00
Hai Liang Wang
1f79bfdf4d
Add VALUATION.md
2018-03-05 14:30:45 +08:00
Hai Liang Wang
989b44f4b3
Fixed #46 use jieba source codes, modify word, tag loading
2018-03-05 11:19:50 +08:00
Hai Liang Wang
dfd23c1b97
[seg] load dict
2018-03-05 09:17:34 +08:00
Hai Liang Wang
4c97dfb7ce
Check string for chinese words
2018-03-04 23:11:12 +08:00
Hai Liang Wang
7bfee1403a
[compare] update score calculation
2018-03-04 22:56:13 +08:00
Hai Liang Wang
ddd2a97720
[nearby] Check eq before computing sim
2018-03-04 17:49:40 +08:00
Hai Liang Wang
e3d91154e6
export seg method
2018-03-04 10:27:18 +08:00
Hai Liang Wang
97b676ee82
Closed #45 enable cache for nearby words
2018-03-04 09:52:44 +08:00
Hai Liang Wang
d4f20e94a9
Refine vocab and nearby
2018-03-04 00:02:28 +08:00
Hai Liang Wang
a1af98a5a9
Support customized dict and model
2018-03-03 21:18:43 +08:00
Hai Liang Wang
4450ba836c
refine format
2018-03-02 13:19:03 +08:00
Hai Liang Wang
0e5794cfff
Leverage distance computing algorithm in compare API
2018-03-02 11:08:54 +08:00
Hai Liang Wang
dac98aa891
Closed #43 smoothing scores in compare API
2018-03-01 23:29:49 +08:00
AlexSun1995
129fe23cc8
Replace the all-zero vector used for words missing from the model with a random vector
2018-02-02 21:04:52 +08:00
Hain Wang
043a8ced04
Merge pull request #34 from weiyunfei/master
Resolve the ImportError when there's a "utils" module.
2018-01-29 11:36:20 +08:00
wei
2db34e1479
Resolve the ImportError when there's a "utils" module.
2018-01-25 16:13:44 +08:00
Hai Liang Wang
9356e62c78
#31 remove synonyms/data.py
2018-01-23 10:33:10 +08:00
cclauss
f8a2e30a17
import shutil and from six import string_types, u
2018-01-18 11:31:57 +01:00
Hai Liang Wang
e73f2a0dd3
Refine distance params, upgrade to v2
2017-12-31 19:01:05 +08:00
huyingxi
54a978e63e
Update __init__.py
update similarity_distance weights
2017-12-30 23:22:35 +08:00
bobbercheng
cdb85530f4
Update __init__.py
Use _levenshtein_distance() to replace _unigram_overlap. We still need to adjust the weights for better results. Right now the following two sentences are still regarded as the same:
synonyms.compare('目前你用什么方法来保护朋友', '目前你用什么方法来保护家人')
2017-11-14 09:53:28 -06:00
inuyasha2012
c41ee7397d
fix python3 open stopwords file UnicodeDecodeError bug
2017-11-08 09:37:48 +08:00
huyingxi
de9c7ec748
Update __init__.py
update sentence similarity threshold
2017-11-04 10:45:56 -05:00
Hai Liang Wang
a1283d9926
#6 move useless code into data.py
2017-10-31 17:17:01 +08:00
Hai Liang Wang
312eae0067
#6 simplify code and support py2,3
2017-10-31 16:54:55 +08:00
huyingxi
86c188bb57
Update __init__.py
2017-10-30 22:18:54 -05:00
huyingxi
c0a4775551
enhance sentence similarity
2017-10-31 10:54:31 +08:00
Hai Liang Wang
4575f3e7c6
use jieba as tokenizer
2017-10-28 10:06:11 +08:00
jiangbo
55be4367b7
chore: reattach stdout to sys
2017-10-21 22:16:00 +08:00
Hai Liang Wang
07de9b2aef
refactor with pkl, add benchmark
2017-10-21 11:45:15 +08:00