
Two Papers from the Lab Accepted by IEEE TASLP

Date: November 11, 2022 | Category: News

Recently, two papers from the lab were accepted by the IEEE Transactions on Audio, Speech and Language Processing (TASLP): master's student Kang Xu's work on aspect-sentiment-opinion triplet extraction, and master's student Jiang Liu's work on discontinuous named entity recognition. TASLP is ranked in Zone 1 of the upgraded Chinese Academy of Sciences (CAS) journal partition, is a CCF-recommended Class B journal, and is a Class A journal on Tsinghua University's recommended list for computer science.

Title: Revisiting Aspect-Sentiment-Opinion Triplet Extraction: Detailed Analyses Towards a Simple and Effective Span-Based Model

Abstract: Aspect-based sentiment information extraction has attracted increasing attention in the natural language processing research community. Various methods, such as sequence tagging, sequence-to-sequence generation and span-based extraction, have been proposed, each with its own advantages and disadvantages. In this article, we revisit the span-based method for aspect-sentiment-opinion triplet extraction by designing and analyzing a simple yet effective Span-based Model, called SMTFASTE. Our model leverages a tidy three-layer architecture, consisting of a BERT-based encoding layer, a span representation layer and an aspect-sentiment-opinion prediction layer. In experiments on two widely used benchmarks (ASTE-V2 and ASOTE-V2), we find that our model outperforms a number of complicated state-of-the-art models on most evaluation metrics. We therefore conduct detailed analyses of our model, such as ablation studies of its core components and the benefit of explicitly using context information, and obtain some insightful findings and conclusions. Through this study, we show that a simple span-based model is able to achieve competitive results without much feature or architecture engineering. Our model is easy to follow, and we have released our code to facilitate related research.

Link: https://ieeexplore.ieee.org/document/9868116
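
To make the three-layer design described in the abstract concrete, here is a minimal sketch, assuming PyTorch and the Hugging Face transformers library. The class name SpanTripletSketch, the endpoint-concatenation span representation, and the label sizes are illustrative assumptions for exposition, not the authors' released implementation (see the link above for the paper and its code).

```python
# A minimal, hypothetical sketch of the three-layer design described in the
# abstract: a BERT-based encoding layer, a span representation layer, and an
# aspect-sentiment-opinion prediction layer. All names and design details
# here are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn
from transformers import BertModel

class SpanTripletSketch(nn.Module):
    def __init__(self, bert_name="bert-base-uncased", hidden=768, num_sentiments=4):
        super().__init__()
        # Layer 1: BERT-based encoding layer.
        self.encoder = BertModel.from_pretrained(bert_name)
        # Layer 2: span representation built from the span's endpoint states.
        self.span_ffn = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        # Layer 3: aspect/opinion span classifiers plus a pairwise sentiment
        # classifier over (aspect span, opinion span) pairs.
        self.aspect_cls = nn.Linear(hidden, 2)
        self.opinion_cls = nn.Linear(hidden, 2)
        self.sentiment_cls = nn.Linear(2 * hidden, num_sentiments)

    def forward(self, input_ids, attention_mask, spans):
        # spans: (num_spans, 2) tensor of [start, end] token indices for a
        # single sentence (batch size 1, for brevity).
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state[0]
        reps = self.span_ffn(torch.cat([states[spans[:, 0]], states[spans[:, 1]]], dim=-1))
        n = reps.size(0)
        # Score every (candidate aspect, candidate opinion) span pair.
        pairs = torch.cat([reps.unsqueeze(1).expand(n, n, -1),
                           reps.unsqueeze(0).expand(n, n, -1)], dim=-1)
        return self.aspect_cls(reps), self.opinion_cls(reps), self.sentiment_cls(pairs)
```

Span-based extractors of this kind typically enumerate candidate spans only up to a maximum width and prune low-scoring ones before pairwise sentiment scoring, which keeps the quadratic pairing step tractable.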





Title: TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags

Abstract: Discontinuous named entity recognition (NER) has received increasing research attention, and many related methods have emerged, such as hypergraph-based methods, span-based methods and sequence-to-sequence (Seq2Seq) methods. However, these methods more or less suffer from problems such as decoding ambiguity and inefficiency, which limit their performance. Recently, grid-tagging methods, which benefit from the flexible design of tagging systems and model architectures, have shown their superiority in adapting to various information extraction tasks. In this paper, we follow this line of work and propose a competitive grid-tagging model for discontinuous NER. We call our model TOE because we incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model that casts the NER problem as word-word relationship prediction. First, we design a Tag Representation Embedding Module (TREM) to force our model to consider not only word-word relationships but also word-tag and tag-tag relationships. Concretely, we construct tag representations and embed them into TREM, so that TREM can treat tag and word representations as queries/keys/values and utilize self-attention to model their relationships. Second, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word (THW) tags in the SOTA model, we add two new symmetric tags, namely Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more fine-grained word-word relationships and alleviate error propagation from tag prediction. In experiments on three benchmark datasets, namely CADEC, ShARe13 and ShARe14, our TOE model improves the SOTA results by 0.83%, 0.05% and 0.66% in F1, respectively, demonstrating its effectiveness.

Link: https://ieeexplore.ieee.org/document/9944897
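
As a rough illustration of the TREM mechanism described in the abstract, the sketch below appends learnable tag embeddings to the word sequence so that a single self-attention pass models word-word, word-tag and tag-tag relations, then scores every (word_i, word_j) grid cell. It assumes PyTorch; the five-tag inventory shown, the bilinear pair scorer and all dimensions are illustrative assumptions rather than the authors' implementation.

```python
# A hypothetical sketch of the TREM idea: tag embeddings join the word
# sequence as extra attention positions, so self-attention jointly models
# word-word, word-tag and tag-tag relations before grid-cell tagging.
import torch
import torch.nn as nn

# Grid tags named in the abstract: NNW/THW from the base model plus the two
# new symmetric tags PNW/HTW, and a NONE tag (an assumed inventory).
TAGS = ["NONE", "NNW", "PNW", "THW", "HTW"]

class TREMSketch(nn.Module):
    def __init__(self, hidden=256, num_heads=4):
        super().__init__()
        self.tag_emb = nn.Embedding(len(TAGS), hidden)  # learnable tag representations
        self.attn = nn.MultiheadAttention(hidden, num_heads, batch_first=True)
        self.pair_scorer = nn.Bilinear(hidden, hidden, len(TAGS))

    def forward(self, word_states):
        # word_states: (batch, seq_len, hidden) contextual word representations,
        # e.g. from a pretrained encoder.
        batch, seq_len, _ = word_states.shape
        tags = self.tag_emb.weight.unsqueeze(0).expand(batch, -1, -1)
        # Words and tags attend to each other as queries/keys/values.
        mixed = torch.cat([word_states, tags], dim=1)
        attended, _ = self.attn(mixed, mixed, mixed)
        words = attended[:, :seq_len]  # keep only the word positions
        # Tag logits for every (word_i, word_j) cell of the grid.
        left = words.unsqueeze(2).expand(-1, -1, seq_len, -1).contiguous()
        right = words.unsqueeze(1).expand(-1, seq_len, -1, -1).contiguous()
        return self.pair_scorer(left, right)
```

A full grid-tagging system would then decode entity spans, including discontinuous ones, from the predicted NNW/PNW links and THW/HTW boundary tags; that decoding step is omitted here for brevity.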

