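A summary of recent event extraction methods, covering Chinese event extraction, open-domain event extraction, event data generation, cross-lingual event extraction, few-shot event extraction, and zero-shot event extraction, along with methods such as DMCNN, FrameNet, DLRNN, DBRNN, GCN, DAG-GRU, JMEE, and PLMEE.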

Event-Extraction: a survey and summary of event extraction resources (being updated...)

Definition of Event Extraction

Closed-domain

Closed-domain event extraction uses a predefined event schema to discover and extract desired events of a particular type from text. An event schema contains several event types and their corresponding event structures. We use the ACE terminology to introduce an event structure as follows:

Event mention

a phrase or sentence describing an event, including a trigger and several arguments.

Event trigger

the main word that most clearly expresses an event occurrence, typically a verb or a noun.

Event argument

an entity mention, temporal expression or value that serves as a participant or attribute with a specific role in an event.

Argument role

the relationship between an argument and the event in which it participates.

D. Ahn (The Stages of Event Extraction) first proposed dividing the ACE event extraction task into four subtasks: trigger detection, event/trigger type identification, event argument detection, and argument role identification.

Event mention

a phrase or sentence within which an event is described, including a trigger and arguments.

Event trigger

the main word that most clearly expresses the occurrence of an event (An ACE event trigger is typically a verb or a noun).

Event argument

an entity mention, temporal expression or value (e.g. Job-Title) that is involved in an event (viz., participants).

Argument role

the relationship between an argument and the event in which it participates.
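
The four terms above can be read as a small data structure. The following is a minimal sketch, not part of ACE or any toolkit: the class names `Argument` and `EventMention`, the helper `roles_of`, the example sentence, and the type and role labels are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Argument:
    text: str  # entity mention, temporal expression, or value
    role: str  # argument role: how the argument relates to the event


@dataclass
class EventMention:
    sentence: str    # phrase or sentence in which the event is described
    trigger: str     # word that most clearly expresses the event occurrence
    event_type: str  # type assigned to the trigger (illustrative label)
    arguments: List[Argument] = field(default_factory=list)


def roles_of(mention: EventMention) -> Dict[str, str]:
    """Map each argument role to the text of the argument that fills it."""
    return {arg.role: arg.text for arg in mention.arguments}


# Hypothetical example; the labels are for illustration only.
mention = EventMention(
    sentence="Three soldiers were killed in Baghdad on Monday.",
    trigger="killed",
    event_type="Life.Die",
    arguments=[
        Argument("Three soldiers", "Victim"),
        Argument("Baghdad", "Place"),
        Argument("Monday", "Time"),
    ],
)
print(roles_of(mention))
```

In this view, trigger detection and type identification fill `trigger` and `event_type`, while argument detection and role identification fill the `arguments` list.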

Open domain

Without predefined event schemas, open-domain event extraction aims to detect events in text and, in most cases, also to cluster similar events via extracted event keywords. Event keywords are the words or phrases that best describe an event; sometimes keywords are further divided into triggers and arguments.

Story segmentation

detecting the boundaries of a story from news articles.

First story detection

detecting the story that discusses a new topic in the stream of news.

Topic detection

grouping the stories based on the topics they discuss.

Topic tracking

detecting stories that discuss a previously known topic.

Story link detection

deciding whether a pair of stories discuss the same topic.

The first two tasks mainly focus on event detection, and the remaining three tasks are for event clustering. While the relation between the five tasks is evident, each requires a distinct evaluation process and encourages different approaches to address the particular problem.
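
As a rough illustration of two of the tasks listed above, story link detection can be sketched as thresholding a similarity score between two stories, and first story detection as checking that no earlier story is linked to the new one. This is only a toy baseline under stated assumptions: the bag-of-words representation, the cosine measure, the threshold value of 0.3, and all function names are choices made for this sketch, not something prescribed by the tasks themselves.

```python
import math
import re
from collections import Counter
from typing import Iterable


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def same_topic(story_a: str, story_b: str, threshold: float = 0.3) -> bool:
    """Story link detection: do the two stories discuss the same topic?"""
    return cosine(bag_of_words(story_a), bag_of_words(story_b)) >= threshold


def is_first_story(story: str, earlier: Iterable[str], threshold: float = 0.3) -> bool:
    """First story detection: no earlier story is linked to this one."""
    return not any(same_topic(story, old, threshold) for old in earlier)


quake_1 = "Earthquake strikes coastal city, hundreds evacuated"
quake_2 = "Hundreds evacuated after earthquake strikes city"
bank = "Central bank raises interest rates again"

print(same_topic(quake_1, quake_2))             # True
print(is_first_story(bank, [quake_1, quake_2]))  # True
```

Practical systems typically replace the raw counts with weighted or learned representations; the threshold would need tuning on held-out data.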

Surveys

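Event Extraction Surveys

Survey on Meta-Event Extraction (元事件抽取研究综述), 2019, by GAO Li-zheng, ZHOU Gang, LUO Jun-yong, LAN Ming-jing

Event extraction is an important research direction in information extraction, with wide applications in intelligence gathering, knowledge extraction, document summarization, and question answering. This paper surveys meta-event extraction, the most studied part of current event extraction research. It first briefly introduces the basic concepts of meta-events and meta-event extraction and the main implementation approaches. It then focuses on the main tasks of meta-event extraction, describes the meta-event detection process in detail, and gives an overview of other related tasks. Finally, it summarizes the problems facing meta-event extraction and, on that basis, discusses future trends.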

An Overview of Event Extraction from Text, 2019, by Frederik Hogenboom, Flavius Frasincar, Uzay Kaymak, Franciska de Jong:

One common application of text mining is event extraction, which encompasses deducing specific knowledge concerning incidents referred to in texts. Event extraction can be applied to various types of written text, e.g., (online) news messages, blogs, and manuscripts. This literature survey reviews text mining techniques that are employed for various event extraction purposes. It provides general guidelines on how to choose a particular event extraction technique depending on the user, the available content, and the scenario of use.

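A Survey of Event Extraction from Text, 2019, by Wei Xiang, Bang Wang

Covers the task definition, data sources, and performance evaluation of event extraction, and provides a taxonomy of solution approaches. Within each solution group, it gives a detailed analysis of the most representative methods, in particular their origins, fundamentals, strengths, and weaknesses. Finally, it offers an outlook on future research directions.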

A Survey of Textual Event Extraction from Social Networks, 2017, by Mohamed Mejri, Jalel Akaichi

A Survey of event extraction methods from text for decision support systems, 2016, by Frederik Hogenboom, Flavius Frasincar, Uzay Kaymak, Franciska de Jong, Emiel Caron

A Survey of Textual Event Extraction from Social Networks, 2014, by Vera Danilova, Mikhail Alexandrov, Xavier Blanco

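Event Detection Surveys

A survey on multi-modal social event detection, 2020, by Han Zhou, Hongpeng Yin, Hengyi Zheng, Yanxia Li

Reviews two lines of research: event feature learning and event inference. Learning event features is a necessary prerequisite, because it converts social media data into a computer-friendly numerical form. Event inference aims to decide whether a sample belongs to a social event. The paper then introduces several public datasets used by the community and reports comparison results. It closes with a general discussion of how to advance multi-modal social event detection.

Review on event detection techniques in social multimedia, 2016, by Han Zhou, Hongpeng Yin, Hengyi Zheng, Yanxia Li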

Deep Learning Models

Event Extraction

2020

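Event Extraction as Definition Comprehension, arXiv 2020 (Github)

Motivation: Proposes a novel event extraction approach that provides the model with bleached statements. Bleached statements are machine-readable natural-language sentences, based on the annotation guidelines, that describe the general circumstances under which an event occurs. Experimental results show that the model can extract events under a closed ontology and can generalize to unseen event types simply by reading new bleached statements.

Main idea: Proposes a new event extraction approach that takes the annotation guidelines into account through bleached statements; proposes a multi-span selection model that demonstrates the feasibility of the approach, including in zero-shot and few-shot settings.

Dataset: ACE 2005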

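Open-domain Event Extraction and Embedding for Natural Gas Market Prediction, arXiv 2020 (Github)

Motivation: Most previous methods treat price as an extrapolatable time series. Those that analyze the relationship between prices and news either align their price data with public news datasets, manually annotate headlines, or use off-the-shelf tools. Compared with off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the attribution and characteristics of changes, using public sources.

Main idea: Relying on headlines from public news APIs, we propose a method to filter out irrelevant headlines and perform preliminary event extraction. Prices and text are both fed into a 3D convolutional neural network to learn the correlation between events and market movements.

Dataset: NYTf, FT, TG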

2019

One for All: Neural Joint Modeling of Entities and Events, AAAI 2019 (Github)

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction (Github)

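Task: Unlike other work, the task is defined as event frame filling, i.e., argument detection plus identification.

Differences: no trigger detection is required; extraction is document-level; arguments may overlap.

Motivation: Arguments must be decoded in a certain order, because earlier decisions affect later ones.

Main idea: Releases a dataset characterized by arguments-scattering and multi-event; first predicts whether each event type is triggered, then decodes the arguments one by one in a fixed order.

Dataset: ten years (2008-2018) of Chinese financial announcements (ChFinAnn), crawled from http://www.cninfo.com.cn/new/index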

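HMEAE: Hierarchical Modular Event Argument Extraction, EMNLP 2019 short (Github)

Task: event argument role classification

Motivation: The type of an argument (e.g., PERSON) influences the correlations among arguments.

Dataset: ACE 2005, TAC KBP 2016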

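Joint Event Extraction Based on Hierarchical Event Schemas From FrameNet, EMNLP 2019 short (Github)

Motivation: Event extraction is useful for many practical applications, such as news summarization and information retrieval. However, the popular Automatic Content Extraction (ACE) event extraction program defines only very limited and coarse event schemas, which may not suit practical applications. FrameNet is a linguistic corpus that defines complete semantic frames and frame-to-frame relations. Since frames in FrameNet share a highly similar structure with event schemas in ACE, and many frames actually express events, we propose to redefine the event schemas based on FrameNet.

Main idea: (1) Extract all frames in FrameNet that express events, and use the frame-to-frame relations to build a hierarchy of event schemas. (2) Make appropriate use of global information, such as inter-event relations, and of the local features essential to event extraction, such as part-of-speech tags and dependency labels. We also implement a graph-ranking-based unsupervised extractive multi-document summarization method that uses the event extraction results.

Dataset: ACE 2005, FrameNet 1.7 corpus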

2018

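Zero-Shot Transfer Learning for Event Extraction, ACL 2018 by Lifu Huang, Heng Ji, Kyunghyun Cho, Ido Dagan, Sebastian Riedel, Clare R. Voss (Github)

Motivation: Most previous supervised event extraction methods rely on features derived from manual annotations, so they cannot handle new event types without additional annotation effort. We design a new framework to address this problem.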

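Main idea: Each event mention has a structure consisting of a candidate trigger and arguments, and this structure has predefined names and labels consistent with the event type and its arguments. We add semantic representations for event types and event mentions, and decide the type of a mention by the semantic similarity between the event types defined in the target ontology and the event mention.

Dataset: ACE 2005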

Keywords: Zero-Shot Transfer
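
The similarity-based type decision described in the main idea above can be illustrated with a toy nearest-type lookup. This is a sketch of the general idea only, not the paper's model: the three-dimensional vectors, the type names, and the function `closest_event_type` are made-up assumptions, and a real system would learn the representations of mentions and types.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def closest_event_type(mention_vec: List[float],
                       type_vecs: Dict[str, List[float]]) -> str:
    """Return the ontology type whose representation is most similar."""
    return max(type_vecs, key=lambda name: cosine(mention_vec, type_vecs[name]))


# Hypothetical type representations; in practice these would be learned.
type_vecs = {
    "Attack": [0.9, 0.1, 0.0],
    "Transport": [0.1, 0.9, 0.1],
    "Elect": [0.0, 0.2, 0.9],
}

# Hypothetical representation of an event mention.
mention_vec = [0.8, 0.3, 0.1]
print(closest_event_type(mention_vec, type_vecs))  # Attack
```

Because the decision is a lookup over type representations, a new event type can be handled by adding one more entry to `type_vecs`, without new annotated mentions for that type.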

2016

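RBPB: Regularization-Based Pattern Balancing Method for Event Extraction, ACL 2016 by Sha, Lei and Liu, Jing and Lin, Chin-Yew and Li, Sujian and Chang, Baobao and Sui, Zhifang (Github)

Motivation: In recent work, when determining event types (trigger classification), most methods are either purely pattern-based or purely feature-based. In addition, previous work ignores the relationships among arguments when identifying and classifying them, and considers each candidate argument in isolation.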

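Main idea: In this paper, we use both patterns and features to identify and classify event triggers. In addition, we use a regularization method to model the relationships among candidate arguments, which improves argument identification. We call our method Regularization-Based Pattern Balancing (RBPB).

Dataset: ACE 2005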

Keywords: Embedding & Pattern features, Regularization method

Few-shot or zero-shot

2020

Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection, WSDM 2020 (Github)

Shumin Deng, Ningyu Zhang, Jiaojian Kang, Yichi Zhang, Wei Zhang, Huajun Chen

Exploiting the Matching Information in the Support Set for Few Shot Event Classification, PAKDD 2020 (Github)

Viet Lai, Franck Dernoncourt, Thien Huu Nguyen

Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach, 2020 (Github)

2018

Zero-Shot Transfer Learning for Event Extraction, ACL 2018 (Github)

Lifu Huang, Heng Ji, Kyunghyun Cho, Ido Dagan, Sebastian Riedel, Clare R. Voss

Chinese Event Extraction

2019

Cross-lingual Structure Transfer for Relation and Event Extraction, EMNLP 2019 (Github)

Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss

A Hybrid Character Representation for Chinese Event Detection, IJCNLP 2019 (Github)

Xi Xiangyu, Zhang Tong, Ye Wei, Zhang Jinglei, Xie Rui, Zhang Shikun

Event Detection with Trigger-Aware Lattice Neural Network, EMNLP 2019 (Github)

Ning Ding, Ziran Li, Zhiyuan Liu, Haitao Zheng, Zibo Lin

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction, EMNLP 2019 (Github)

Shun Zheng, Wei Cao, Wei Xu, Jiang Bian

2018

DCFFE: A Document-level Chinese Financial Event Extraction System based on Automatically Labelled Training Data, ACL 2018 (Github)

Yang, Hang and Chen, Yubo and Liu, Kang and Xiao, Yang and Zhao, Jun

Nugget Proposal Networks for Chinese Event Detection, ACL 2018 (Github)

Lin, Hongyu and Lu, Yaojie and Han, Xianpei and Sun, Le

2016

A Convolution BiLSTM Neural Network Model for Chinese Event Extraction, NLPCC 2016 (Github)

Semi-supervised / Distantly Supervised Event Extraction

2019

Neural Cross-Lingual Event Detection with Minimal Parallel Resources, EMNLP 2019 (Github)

The scarcity in annotated data poses a great challenge for event detection (ED). Cross-lingual ED aims to tackle this challenge by transferring knowledge between different languages to boost performance. However, previous cross-lingual methods for ED demonstrated a heavy dependency on parallel resources, which might limit their applicability. In this paper, we propose a new method for cross-lingual ED, demonstrating a minimal dependency on parallel resources. Specifically, to construct a lexical mapping between different languages, we devise a context-dependent translation method; to treat the word order difference problem, we propose a shared syntactic order event detector for multilingual co-training. The efficiency of our method is studied through extensive experiments on two standard datasets. Empirical results indicate that our method is effective in 1) performing cross-lingual transfer concerning different directions and 2) tackling the extremely annotation-poor scenario.

Open-domain Event Extraction

Multilingual Event Extraction

2019

Neural Cross-Lingual Event Detection with Minimal Parallel Resources, EMNLP 2019 (Github)

Data Generation

Event Extraction as Reading Comprehension

Shallow Learning Models

2017

Data

English Corpus

ACE2005 English Corpus

ACE 2005 Multilingual Training Corpus contains the complete set of English, Arabic and Chinese training data for the 2005 Automatic Content Extraction (ACE) technology evaluation. The corpus consists of data of various types annotated for entities, relations and events by the Linguistic Data Consortium (LDC) with support from the ACE Program and additional assistance from LDC.

Future Research Challenges

Data level

Model level

Performance evaluation level

Tools and Repos

NeuralClassifier

Tencent's open-source NLP project