
Four Papers from the Lab Accepted by Top CCF Conferences and Journals

Date: December 10, 2024 | Category: News

The paper "Multi-Granular Multimodal Clue Fusion for Meme Understanding" by lab member Zheng Li has been accepted by AAAI 2025 (CCF A), a top international conference in artificial intelligence. The abstract follows.

With the continuous emergence of various social media platforms frequently used in daily life, the multimodal meme understanding (MMU) task has been garnering increasing attention. MMU aims to explore and comprehend the meanings of memes from various perspectives by performing tasks such as metaphor recognition, sentiment analysis, intention detection, and offensiveness detection. Despite making progress, limitations persist due to the loss of fine-grained metaphorical visual clues and the neglect of weak multimodal text-image correlation. To overcome these limitations, we propose a multi-granular multimodal clue fusion model (MGMCF) to advance MMU. Firstly, we design an object-level semantic mining module to extract object-level image feature clues, achieving fine-grained feature clue extraction and enhancing the model's ability to capture metaphorical details and semantics. Secondly, we propose a brand-new global-local cross-modal interaction model to address the weak correlation between text and images. This model facilitates effective interaction between global multimodal contextual clues and local unimodal feature clues, strengthening their representations through a bidirectional cross-modal attention mechanism. Finally, we devise a dual-semantic guided training strategy to enhance the model's understanding and alignment of multimodal representations in the semantic space. Experiments conducted on the widely-used MET-MEME bilingual dataset demonstrate significant improvements over state-of-the-art baselines. Specifically, there is an 8.14% increase in precision for the offensiveness detection task, and respective accuracy enhancements of 3.53%, 3.89%, and 3.52% for the metaphor recognition, sentiment analysis, and intention detection tasks. These results, underpinned by in-depth analyses, underscore the effectiveness and potential of our approach for advancing MMU.
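The bidirectional cross-modal attention mechanism described above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: the dimensions, residual connection, and single-head formulation are assumptions for clarity, and the feature matrices are random stand-ins for real text and object-level image clues.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context):
    # Scaled dot-product attention from one modality into the other
    scores = query @ context.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ context

def bidirectional_cross_modal(text_feats, image_feats):
    # Each modality queries the other; a residual keeps the original clues
    text_out = text_feats + cross_attention(text_feats, image_feats)
    image_out = image_feats + cross_attention(image_feats, text_feats)
    return text_out, image_out

rng = np.random.default_rng(0)
text = rng.standard_normal((5, 64))   # e.g., 5 local textual clue vectors
image = rng.standard_normal((3, 64))  # e.g., 3 object-level image clue vectors
t_out, i_out = bidirectional_cross_modal(text, image)
print(t_out.shape, i_out.shape)  # (5, 64) (3, 64)
```

Each direction of attention enriches one modality's local clues with context from the other, which is one plausible way to compensate for weak text-image correlation.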


The paper "TKDP: Threefold Knowledge-Enriched Deep Prompt Tuning for Few-Shot Named Entity Recognition" by lab member Liu Jiang has been accepted by the top journal TKDE (CCF A). The abstract follows.

Few-shot named entity recognition (NER) exploits limited annotated instances to identify named mentions. Effectively transferring internal or external resources thus becomes the key to few-shot NER. While existing prompt tuning methods have shown remarkable few-shot performance, they still fail to make full use of knowledge. In this work, we investigate the integration of rich knowledge into prompt tuning for stronger few-shot NER. We propose incorporating the deep prompt tuning framework with threefold knowledge (namely TKDP), including the internal 1) context knowledge and the external 2) label knowledge and 3) sememe knowledge. TKDP encodes the three feature sources and incorporates them into soft prompt embeddings, which are further injected into an existing pre-trained language model to facilitate predictions. On five benchmark datasets, the performance of our knowledge-enriched model was boosted by at most 11.53% F1 over the raw deep prompt method, and it significantly outperforms 9 strong-performing baseline systems in 5-/10-/20-shot settings, showing great potential in few-shot NER. Our TKDP framework can be broadly adapted to other few-shot tasks without much effort.
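The core idea of encoding knowledge sources into soft prompt embeddings and injecting them into a language model's input can be sketched as below. This is a simplified assumption-laden illustration, not the TKDP implementation: the shared projection matrix, the prepending strategy, and all dimensions are hypothetical.

```python
import numpy as np

def build_soft_prompts(context_emb, label_emb, sememe_emb, proj):
    # Stack the three knowledge embeddings and project them into the
    # model's embedding space to obtain three soft prompt vectors
    knowledge = np.stack([context_emb, label_emb, sememe_emb])  # (3, d_k)
    return knowledge @ proj                                     # (3, d_model)

def inject_prompts(token_embs, soft_prompts):
    # Prepend the soft prompt vectors to the token embedding sequence,
    # so the (frozen) language model attends to them as extra context
    return np.concatenate([soft_prompts, token_embs], axis=0)

rng = np.random.default_rng(1)
d_k, d_model, seq_len = 32, 64, 10
proj = rng.standard_normal((d_k, d_model)) * 0.02
prompts = build_soft_prompts(rng.standard_normal(d_k),
                             rng.standard_normal(d_k),
                             rng.standard_normal(d_k), proj)
sequence = inject_prompts(rng.standard_normal((seq_len, d_model)), prompts)
print(sequence.shape)  # (13, 64): 3 soft prompts + 10 tokens
```

In deep prompt tuning, such learned prompt vectors are typically inserted at every transformer layer rather than only at the input; the sketch shows just the input-level case.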


The paper "Self-Adaptive Fine-grained Multi-modal Data Augmentation for Semi-supervised Multi-modal Coreference Resolution" by lab member Zheng Li has been accepted by ACM MM 2025 (CCF A), a top international conference on multimedia. The abstract follows.

Coreference resolution, an essential task in natural language processing, is particularly challenging in multi-modal scenarios where data comes in various forms and modalities. Despite advancements, limitations due to scarce labeled data and underleveraged unlabeled data persist. We address these issues with a self-adaptive fine-grained multi-modal data augmentation framework for semi-supervised multi-modal coreference resolution (MCR), focusing on enriching training data from labeled datasets and tapping into the untapped potential of unlabeled data. Regarding the former issue, we first leverage text coreference resolution datasets and diffusion models to perform fine-grained text-to-image generation with aligned text entities and image bounding boxes. We then introduce a self-adaptive selection strategy, meticulously curating the augmented data to enhance the diversity and volume of the training set without compromising its quality. For the latter issue, we design a self-adaptive threshold strategy that dynamically adjusts the confidence threshold based on the model's learning status and performance, enabling effective utilization of valuable information from unlabeled data. Additionally, we incorporate a distance smoothing term, which smooths distances between positive and negative samples, enhancing the discriminative power of the model's feature representations and addressing noise and uncertainty in the unlabeled data. Our experiments on the widely-used CIN dataset show that our framework significantly outperforms state-of-the-art baselines by at least 9.57% on MUC F1 score and 4.92% on CoNLL F1 score. Remarkably, against weakly-supervised baselines, our framework achieves a 22.24% enhancement in MUC F1 score. These results, underpinned by in-depth analyses, underscore the effectiveness and potential of our approach for advancing MCR tasks.
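A self-adaptive confidence threshold for selecting pseudo-labels from unlabeled data can be sketched as below. This is a hedged illustration of the general idea only: the linear schedule, the `val_accuracy` signal, and the bounds are hypothetical choices, not the paper's actual strategy.

```python
def adaptive_threshold(base, val_accuracy, floor=0.5, ceil=0.99):
    # Raise the confidence bar as the model improves: early in training a
    # permissive threshold admits more pseudo-labels; later, a stricter
    # one filters noise. The linear interpolation here is illustrative.
    t = base + (ceil - base) * val_accuracy
    return min(max(t, floor), ceil)

def select_pseudo_labels(confidences, threshold):
    # Keep only unlabeled examples whose predicted confidence clears the bar
    return [i for i, c in enumerate(confidences) if c >= threshold]

t_early = adaptive_threshold(0.7, 0.2)  # weak model -> lower threshold
t_late = adaptive_threshold(0.7, 0.9)   # strong model -> higher threshold
kept = select_pseudo_labels([0.95, 0.60, 0.85], t_early)
print(t_early, t_late, kept)
```

The key property is that the threshold tracks the model's learning status, so pseudo-label quality and quantity are traded off dynamically rather than fixed in advance.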


The paper "Heuristic personality recognition based on fusing multiple conversations and utterance-level affection" by lab member He Haijun has been accepted by the top journal IPM (SCI Q1). The abstract follows.

Personality Recognition in Conversations (PRC) is a task of significant interest and practical value. Existing studies on the PRC task utilize conversations inadequately and neglect affective information. Because the way these studies process information is not yet well aligned with the concept of personality, we propose the SAH-GCN model for the PRC task in this study. This model initially processes the original conversation input to extract the central speaker feature. Leveraging contrastive learning, it continuously adjusts the embedding of each utterance by incorporating affective information to cope with semantic similarity. Subsequently, the model employs Graph Convolutional Networks to simulate the conversation dynamics, ensuring comprehensive interaction between the central speaker feature and other relevant features. Lastly, it heuristically fuses central speaker features from multiple conversations involving the same speaker into one comprehensive feature, facilitating personality recognition. We conduct experiments using the recently released CPED dataset, a personality dataset encompassing affection labels and conversation details. Our results demonstrate that SAH-GCN achieves superior accuracy (+1.88%) compared to prior works on the PRC task. Further analysis verifies the efficacy of our scheme that fuses multiple conversations and incorporates affective information for personality recognition.
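The final step above, fusing per-conversation central-speaker features for the same speaker into one representation, can be sketched as a weighted pooling. This is a minimal sketch under assumptions: the weighting signal (here, conversation length) and the averaging heuristic are illustrative, not the paper's actual fusion rule.

```python
import numpy as np

def fuse_speaker_features(features, weights):
    # Fuse one speaker's per-conversation feature vectors into a single
    # comprehensive feature; longer (or more informative) conversations
    # contribute more via normalized weights.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (np.stack(features) * w[:, None]).sum(axis=0)

# Two conversations for the same speaker, 4-dim toy features;
# weights proportional to conversation length (30 vs 10 utterances)
fused = fuse_speaker_features([np.ones(4), np.zeros(4)], [30, 10])
print(fused)  # [0.75 0.75 0.75 0.75]
```

Pooling across conversations lets evidence about a stable trait accumulate, which matches the intuition that personality is a cross-conversation property rather than something visible in a single dialogue.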


Copyright © 2020 - 2025 Language and Cognition Computing Laboratory, Wuhan University. All Rights Reserved
Address: Room C402, Xinjia Building, National Cybersecurity Talent Cultivation and Innovation Base, Dongxihu District, Wuhan, Hubei Province