update table of contents & add big model--nlp
parent
ad44f4dc4c
commit
d6e129b3d1
35
README.md
@ -10,9 +10,14 @@
|
|||||||
</center>
|
</center>
|
||||||
<br><br><br><br>
|
<br><br><br><br>
|
||||||
|
|
||||||
|
[![](https://img.shields.io/github/stars/fighting41love/funnlp?style=social)](https://github.com/fighting41love/funnlp)
|
||||||
[![](https://img.shields.io/badge/dynamic/json?color=blue&label=%E7%9F%A5%E4%B9%8E%E5%85%B3%E6%B3%A8&query=%24.data.totalSubs&url=https%3A%2F%2Fapi.spencerwoo.com%2Fsubstats%2F%3Fsource%3Dzhihu%26queryKey%3Dmountain-blue-64)](https://www.zhihu.com/people/mountain-blue-64)
|
[![](https://img.shields.io/badge/dynamic/json?color=blue&label=%E7%9F%A5%E4%B9%8E%E5%85%B3%E6%B3%A8&query=%24.data.totalSubs&url=https%3A%2F%2Fapi.spencerwoo.com%2Fsubstats%2F%3Fsource%3Dzhihu%26queryKey%3Dmountain-blue-64)](https://www.zhihu.com/people/mountain-blue-64)
|
||||||
[![](https://img.shields.io/badge/dynamic/json?color=blueviolet&label=github%20followers&query=%24.data.totalSubs&url=https%3A%2F%2Fapi.spencerwoo.com%2Fsubstats%2F%3Fsource%3Dgithub%26queryKey%3Dfighting41love)](https://github.com/fighting41love)
|
[![](data/.logo图片/.捐赠图片/.Citations-487-red.svg)](https://scholar.google.com/citations?hl=en&user=aqZdfDUAAAAJ)
|
||||||
[![](data/.logo图片/.捐赠图片/.PaperCitations-467-red.svg)](https://scholar.google.com/citations?hl=en&user=aqZdfDUAAAAJ)
|
|
||||||
|
[![](data/.logo图片/.捐赠图片/.Home-%E4%BA%BA%E7%94%9F%E6%B5%AA%E8%B4%B9%E6%8C%87%E5%8D%97-brightgreen.svg)](https://scholar.google.com/citations?hl=en&user=aqZdfDUAAAAJ)
|
||||||
|
[![](data/.logo图片/.捐赠图片/.%E7%8C%8E%E9%80%81%E9%97%A8-CV-orange.svg)](http://fighting41love.github.io/)
|
||||||
|
<!-- [![](https://img.shields.io/badge/dynamic/json?color=blueviolet&label=github%20followers&query=%24.data.totalSubs&url=https%3A%2F%2Fapi.spencerwoo.com%2Fsubstats%2F%3Fsource%3Dgithub%26queryKey%3Dfighting41love)](https://github.com/fighting41love) -->
|
||||||
|
<!-- [![](https://img.shields.io/badge/Homepage-%E4%BA%BA%E7%94%9F%E6%B5%AA%E8%B4%B9%E6%8C%87%E5%8D%97-brightgreen)](http://fighting41love.github.io/archives/) -->
|
||||||
|
|
||||||
### The Most Powerful NLP-Weapon Arsenal
|
### The Most Powerful NLP-Weapon Arsenal
|
||||||
|
|
||||||
@ -24,6 +29,15 @@
|
|||||||
|
|
||||||
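Updated irregularly over the long term; watching and forking are welcome! :heart::heart::heart::heart::heart:
|
Updated irregularly over the long term; watching and forking are welcome! :heart::heart::heart::heart::heart: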
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
| 目录 | Table of contents |
|
||||||
|
| ---- | ---- |
|
||||||
|
| :strawberry: | * [Corpora](#语料库) <br> * [Lexicons and lexical tools](#词库及词法工具) <br> * [Pretrained language models](#预训练语言模型) <br> * [Extraction](#抽取) <br> * [Knowledge graphs](#知识图谱) <br> * [Text generation](#文本生成) <br> * [Text summarization](#文本摘要) <br> * [Question answering](#智能问答) <br> * [Text correction](#文本纠错) |
|
||||||
|
| :sunflower: | * [Document processing](#文档处理) <br> * [Table processing](#表格处理) <br> * [Text matching](#文本匹配) <br> * [Text data augmentation](#文本数据增强) <br> * [Text retrieval](#文本检索) <br> * [Reading comprehension](#阅读理解) <br> * [Sentiment analysis](#情感分析) <br> * [Common regular expressions](#常用正则表达式) <br> * [Speech processing](#语音处理) |
|
||||||
|
| :sparkler: | * [Event extraction](#事件抽取) <br> * [Machine translation](#机器翻译) <br> * [Number conversion](#数字转换) <br> * [Coreference resolution](#指代消解) <br> * [Text clustering](#文本聚类) <br> * [Text classification](#文本分类) <br> * [Knowledge reasoning](#知识推理) <br> * [Explainable NLP](#可解释自然语言处理) <br> * [Adversarial text attacks](#文本对抗攻击) |
|
||||||
|
| :100: | * [Text visualization](#文本可视化) <br> * [Text annotation tools](#文本标注工具) <br> * [General-purpose tools](#综合工具) <br> * [Fun and humorous tools](#有趣搞笑工具) <br> * [Courses, reports, interviews, etc.](#课程报告面试等) <br> * [Competitions](#比赛) <br> * [Financial NLP](#金融自然语言处理) <br> * [Medical NLP](#医疗自然语言处理) <br> * [Legal NLP](#法律自然语言处理) <br> * [Other](#其他) |
|
||||||
|
|
||||||
目录(Table of contents)
|
目录(Table of contents)
|
||||||
=================
|
=================
|
||||||
<table border="0">
|
<table border="0">
|
||||||
@ -254,6 +268,8 @@
|
|||||||
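| Multilingual sentence embedding package | | [github](https://github.com/yannvgn/laserembeddings) |
|
| Multilingual sentence embedding package | | [github](https://github.com/yannvgn/laserembeddings) |
|
||||||
|Language Model as a Service (LMaaS)|Language models offered as a service|[github](https://github.com/txsun1997/LMaaS-Papers)|
|
|Language Model as a Service (LMaaS)|Language models offered as a service|[github](https://github.com/txsun1997/LMaaS-Papers)|
|
||||||
|Open-source language model GPT-NeoX-20B|20 billion parameters; described at the time of listing as the largest publicly accessible pretrained general-purpose autoregressive language model|[github](https://github.com/EleutherAI/gpt-neox)|
|
|Open-source language model GPT-NeoX-20B|20 billion parameters; described at the time of listing as the largest publicly accessible pretrained general-purpose autoregressive language model|[github](https://github.com/EleutherAI/gpt-neox)|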
|
||||||
|
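|Chinese Scientific Literature dataset (CSL)|Contains metadata (title, abstract, keywords, discipline, category) for 396,209 papers from core Chinese journals. The CSL dataset can serve as a pretraining corpus and can also be used to build many NLP tasks, such as text summarization (title prediction), keyword generation, and text classification.|[github](https://github.com/ydli-ai/CSL)|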
|
||||||
|
|Powerful tool for large-model development (ColossalAI)||[github](https://github.com/hpcaitech/ColossalAI)|
|
||||||
|
|
||||||
# 抽取
|
# 抽取
|
||||||
|
|
||||||
@ -322,7 +338,8 @@
|
|||||||
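| Attribute extraction from Baidu Baike person entries | Uses BERT-based fine-tuning and feature extraction for knowledge graphs | [github](https://github.com/sakuranew/BERT-AttributeExtraction)|
|
| Attribute extraction from Baidu Baike person entries | Uses BERT-based fine-tuning and feature extraction for knowledge graphs | [github](https://github.com/sakuranew/BERT-AttributeExtraction)|
|
||||||
| COVID-19 related data | Chinese medical dialogue dataset covering COVID-19 and other types of pneumonia; open data sources from Tsinghua University and other institutions (COVID-19) | [github](https://www.aminer.cn/data-covid19/)<br> [github](https://github.com/UCSD-AI4H/COVID-Dialogue) |
|
| COVID-19 related data | Chinese medical dialogue dataset covering COVID-19 and other types of pneumonia; open data sources from Tsinghua University and other institutions (COVID-19) | [github](https://www.aminer.cn/data-covid19/)<br> [github](https://github.com/UCSD-AI4H/COVID-Dialogue) |
|
||||||
| DGL-KE graph embedding representation learning algorithms | | [github](https://github.com/awslabs/dgl-ke) |
|
| DGL-KE graph embedding representation learning algorithms | | [github](https://github.com/awslabs/dgl-ke) |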
|
||||||
|
|Causality graph||[method](https://github.com/liuhuanyong/CausalityEventExtraction) [data](https://github.com/fighting41love/CausalDataset)|
|
||||||
|
|Causal event pairs based on multi-domain text datasets||[link](http://thuctc.thunlp.org/)|
|
||||||
|
|
||||||
# 文本生成
|
# 文本生成
|
||||||
|
|
||||||
@ -405,6 +422,8 @@
|
|||||||
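| GitHub Typo Corpus: a large-scale multilingual dataset of spelling and grammatical errors from GitHub | | [github](https://github.com/mhagiwara/github-typo-corpus) |
|
| GitHub Typo Corpus: a large-scale multilingual dataset of spelling and grammatical errors from GitHub | | [github](https://github.com/mhagiwara/github-typo-corpus) |
|
||||||
| BertPunc: a state-of-the-art BERT-based punctuation restoration model | | [github](https://github.com/nkrnrnk/BertPunc) |
|
| BertPunc: a state-of-the-art BERT-based punctuation restoration model | | [github](https://github.com/nkrnrnk/BertPunc) |
|
||||||
| Chinese writing proofreading tool | | [github](https://xiezuocat.com/#/) |
|
| Chinese writing proofreading tool | | [github](https://xiezuocat.com/#/) |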
|
||||||
|
|Text correction paper list|Chinese Spell Checking (CSC) and Grammatical Error Correction (GEC)|[github](https://github.com/nghuyong/text-correction-papers)|
|
||||||
|
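|Winning solution of the intelligent text proofreading competition|Already deployed in production; from a team at Soochow University and DAMO Academy|[link](https://mp.weixin.qq.com/s/2TjpmoYnt2BUTQVLi26AFA)|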
|
||||||
|
|
||||||
|
|
||||||
# 多模态
|
# 多模态
|
||||||
@ -534,7 +553,7 @@
|
|||||||
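| DeepMatch: a deep matching model library for recommendation, advertising, and search | | [github](https://github.com/shenweichen/DeepMatch) |
|
| DeepMatch: a deep matching model library for recommendation, advertising, and search | | [github](https://github.com/shenweichen/DeepMatch) |
|
||||||
| wwsearch: a full-text search engine developed in-house for the WeCom (企业微信) backend | | [github](https://github.com/Tencent/wwsearch) |
|
| wwsearch: a full-text search engine developed in-house for the WeCom (企业微信) backend | | [github](https://github.com/Tencent/wwsearch) |
|
||||||
| aili - the fastest in-memory index in the East (the fastest concurrent index in the Eastern Hemisphere) | | [github](https://github.com/UncP/aili) |
|
| aili - the fastest in-memory index in the East (the fastest concurrent index in the Eastern Hemisphere) | | [github](https://github.com/UncP/aili) |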
|
||||||
|
|RapidFuzz, an efficient string matching tool|A fast string matching library for Python and C++ that uses the string similarity calculations from FuzzyWuzzy|[github](https://github.com/maxbachmann/rapidfuzz)|
|
||||||
|
|
||||||
# 阅读理解
|
# 阅读理解
|
||||||
|
|
||||||
@ -681,7 +700,8 @@
|
|||||||
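| NLPGNN: a graph neural network toolbox for natural language processing | | [github](https://github.com/kyzhouhzau/NLPGNN) |
|
| NLPGNN: a graph neural network toolbox for natural language processing | | [github](https://github.com/kyzhouhzau/NLPGNN) |
|
||||||
| Macadam | An NLP toolkit built on TensorFlow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction | [github](https://github.com/yongzhuo/Macadam) |
|
| Macadam | An NLP toolkit built on TensorFlow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction | [github](https://github.com/yongzhuo/Macadam) |
|
||||||
| LineFlow: an efficient NLP data loader for all deep learning frameworks | | [github](https://github.com/tofunlp/lineflow) |
|
| LineFlow: an efficient NLP data loader for all deep learning frameworks | | [github](https://github.com/tofunlp/lineflow) |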
|
||||||
|
|Arabica: a Python toolkit for exploratory analysis of text data||[github](https://github.com/PetrKorab/Arabica)|
|
||||||
|
|SMSBoom: a Python stress-testing tool||[github](https://github.com/WhaleFell/SMSBoom)|
|
||||||
|
|
||||||
# 有趣搞笑工具
|
# 有趣搞笑工具
|
||||||
|
|
||||||
@ -698,6 +718,9 @@
|
|||||||
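| COPE - an editor for classical Chinese regulated verse | | [github](https://github.com/LingDong-/cope) |
|
| COPE - an editor for classical Chinese regulated verse | | [github](https://github.com/LingDong-/cope) |
|
||||||
|Paper2GUI | An AI desktop app toolbox for ordinary users; no installation needed and ready to use out of the box. Supports 18+ AI models covering speech synthesis, video frame interpolation, video super-resolution, object detection, image stylization, OCR, and more | [github](https://github.com/Baiyuetribe/paper2gui) |
|
|Paper2GUI | An AI desktop app toolbox for ordinary users; no installation needed and ready to use out of the box. Supports 18+ AI models covering speech synthesis, video frame interpolation, video super-resolution, object detection, image stylization, OCR, and more | [github](https://github.com/Baiyuetribe/paper2gui) |
|
||||||
|Politeness estimator (trained on Sina Weibo data)|| [github](https://github.com/tslmy/politeness-estimator) [paper](https://dl.acm.org/doi/abs/10.1145/3415190)|
|
|Politeness estimator (trained on Sina Weibo data)|| [github](https://github.com/tslmy/politeness-estimator) [paper](https://dl.acm.org/doi/abs/10.1145/3415190)|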
|
||||||
|
|Getting-started guide for Grasspy (草蟒, a Chinese-language version of Python)|Chinese programming language|[homepage](https://www.grasspy.cn/zwdocs/grasspy-start/day1/) [gitee](https://gitee.com/laowu2019_admin/zwdocs)|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# 课程报告面试等
|
# 课程报告面试等
|
||||||
|
|
||||||
@ -742,7 +765,7 @@
|
|||||||
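| Open-source financial investment data extraction tool | | [github](https://github.com/PKUJohnson/OpenData) |
|
| Open-source financial investment data extraction tool | | [github](https://github.com/PKUJohnson/OpenData) |
|
||||||
| Big list of NLP research resources for the financial domain | | [github](https://github.com/icoxfog417/awesome-financial-nlp) |
|
| Big list of NLP research resources for the financial domain | | [github](https://github.com/icoxfog417/awesome-financial-nlp) |
|
||||||
| Chatbot for the finance and legal domains (with chit-chat capability) | | [github](https://github.com/charlesXu86/Chatbot_CN) |
|
| Chatbot for the finance and legal domains (with chit-chat capability) | | [github](https://github.com/charlesXu86/Chatbot_CN) |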
|
||||||
|
|Demo of the construction workflow for a small financial knowledge graph| |[github](https://github.com/jm199504/Financial-Knowledge-Graphs)|
|
||||||
|
|
||||||
# 医疗自然语言处理
|
# 医疗自然语言处理
|
||||||
|
|
||||||
|
1
data/.logo图片/.捐赠图片/.Citations-487-blueviolet.svg
Normal file
@ -0,0 +1 @@
|
|||||||
|
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="Citations: 487"><title>Citations: 487</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="90" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="59" height="20" fill="#555"/><rect x="59" width="31" height="20" fill="blueviolet"/><rect width="90" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="305" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="490">Citations</text><text x="305" y="140" transform="scale(.1)" fill="#fff" textLength="490">Citations</text><text aria-hidden="true" x="735" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="210">487</text><text x="735" y="140" transform="scale(.1)" fill="#fff" textLength="210">487</text></g></svg>
|
After: Size 1.1 KiB
@ -1 +1 @@
|
|||||||
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="124" height="20" role="img" aria-label="Paper Citations: 467"><title>Paper Citations: 467</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="124" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="93" height="20" fill="#555"/><rect x="93" width="31" height="20" fill="#e05d44"/><rect width="124" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="475" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="830">Paper Citations</text><text x="475" y="140" transform="scale(.1)" fill="#fff" textLength="830">Paper Citations</text><text aria-hidden="true" x="1075" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="210">467</text><text x="1075" y="140" transform="scale(.1)" fill="#fff" textLength="210">467</text></g></svg>
|
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="124" height="20" role="img" aria-label="Paper Citations: 487"><title>Paper Citations: 487</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="124" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="93" height="20" fill="#555"/><rect x="93" width="31" height="20" fill="#e05d44"/><rect width="124" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="475" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="830">Paper Citations</text><text x="475" y="140" transform="scale(.1)" fill="#fff" textLength="830">Paper Citations</text><text aria-hidden="true" x="1075" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="210">487</text><text x="1075" y="140" transform="scale(.1)" fill="#fff" textLength="210">487</text></g></svg>
|
Before: Size 1.1 KiB; After: Size 1.1 KiB
1
data/.logo图片/.捐赠图片/.Home-人生浪费指南-brightgreen.svg
Normal file
@ -0,0 +1 @@
|
|||||||
|
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="120" height="20" role="img" aria-label="Home: 人生浪费指南"><title>Home: 人生浪费指南</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="120" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="77" height="20" fill="#4c1"/><rect width="120" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Home</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Home</text><text aria-hidden="true" x="805" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="670">人生浪费指南</text><text x="805" y="140" transform="scale(.1)" fill="#fff" textLength="670">人生浪费指南</text></g></svg>
|
After: Size 1.2 KiB
1
data/.logo图片/.捐赠图片/.猎送门-CV-orange.svg
Normal file
@ -0,0 +1 @@
|
|||||||
|
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="68" height="20" role="img" aria-label="猎送门: CV"><title>猎送门: CV</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="68" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="25" height="20" fill="#fe7d37"/><rect width="68" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">猎送门</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">猎送门</text><text aria-hidden="true" x="545" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="150">CV</text><text x="545" y="140" transform="scale(.1)" fill="#fff" textLength="150">CV</text></g></svg>
|
After: Size 1.1 KiB