Computer Science (计算机科学), 2022, Vol. 49, Issue 6: 313-318. doi: 10.11896/jsjkx.210400101
GUO Yu-xin1, CHEN Xiu-hong2
Abstract: Automatic text summarization helps people quickly filter and discern information, grasp the key content of news, and alleviate information overload. Mainstream abstractive summarization models are largely built on the encoder-decoder architecture. To address the facts that the decoder does not fully exploit the topic information of the text when predicting target words, and that traditional static Word2Vec embeddings cannot resolve polysemy, this paper proposes an abstractive summarization model for short Chinese news that fuses BERT word-embedding representations with topic-information enhancement. On the encoder side, an unsupervised algorithm extracts the topic information of the text and injects it into the attention mechanism, improving decoding quality. On the decoder side, BERT sentence vectors extracted by the pretrained BERT language model serve as supplementary features to capture richer semantics; a pointer mechanism is introduced to handle out-of-vocabulary words, and a coverage mechanism effectively suppresses repetition. During training, to avoid the exposure-bias problem, a reinforcement learning method is used to optimize the model directly on the non-differentiable ROUGE metric. Multiple comparative experiments on two Chinese short-news summarization datasets show that the model achieves significant gains on the ROUGE metrics, effectively fuses topic information, and generates fluent, concise summaries.
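The pointer and coverage mechanisms mentioned in the abstract can be illustrated with a minimal NumPy sketch of one decoding step. This is a toy illustration, not the paper's implementation: the dot-product attention scoring, the coverage penalty weight, the sigmoid form of the generation probability, and the placeholder uniform vocabulary distribution are all assumptions made for self-containedness.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_generator_step(enc_states, dec_state, coverage, src_ids, vocab_size):
    """One toy decoding step of a pointer-generator with coverage.

    enc_states: (T, H) encoder hidden states
    dec_state:  (H,)   current decoder hidden state
    coverage:   (T,)   running sum of past attention distributions
    src_ids:    (T,)   source token ids in the extended vocabulary
                       (ids >= vocab_size are out-of-vocabulary words)
    """
    # Attention scores, penalized by coverage so already-attended
    # source positions are discouraged (suppresses repetition).
    scores = enc_states @ dec_state - 1.0 * coverage
    attn = softmax(scores)

    # Coverage loss term sum_t min(attn_t, coverage_t), computed
    # against the coverage *before* this step's update.
    cov_loss = np.minimum(attn, coverage).sum()
    coverage = coverage + attn

    context = attn @ enc_states                     # (H,) context vector
    p_gen = 1.0 / (1.0 + np.exp(-(context @ dec_state)))  # generate vs. copy

    # Placeholder vocabulary distribution (a real model predicts this).
    p_vocab = softmax(np.ones(vocab_size))

    # Final distribution over the extended vocabulary: generation mass on
    # the fixed vocabulary plus copy mass scattered onto source token ids,
    # so OOV source words become predictable targets.
    ext_size = vocab_size + max(0, int(src_ids.max()) + 1 - vocab_size)
    final = np.zeros(ext_size)
    final[:vocab_size] = p_gen * p_vocab
    np.add.at(final, src_ids, (1.0 - p_gen) * attn)
    return final, coverage, cov_loss
```

Because the copy distribution is scattered with `np.add.at`, repeated source tokens accumulate probability correctly, and an OOV id (e.g. `vocab_size` itself) receives non-zero mass even though it is outside the fixed vocabulary.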
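The reinforcement-learning step the abstract describes, optimizing directly on the non-differentiable ROUGE metric, is commonly realized as self-critical sequence training (Rennie et al.): the reward of a sampled summary is baselined by the reward of the model's own greedy decode. A hedged sketch with a simplified ROUGE-1 F1 scorer (an assumption; the paper's exact reward and loss form are not specified here):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Toy ROUGE-1 F1: unigram multiset overlap between token lists."""
    if not candidate or not reference:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    p = overlap / len(candidate)
    r = overlap / len(reference)
    return 2 * p * r / (p + r)

def self_critical_loss(sample_tokens, sample_logprob, greedy_tokens, reference):
    """Self-critical policy-gradient loss for one example.

    advantage = R(sample) - R(greedy); minimizing -advantage * logprob
    raises the log-probability of samples that beat the model's own
    greedy decode on the metric, with no label-forcing at each step,
    which is how exposure bias is avoided.
    """
    advantage = rouge1_f(sample_tokens, reference) - rouge1_f(greedy_tokens, reference)
    return -advantage * sample_logprob
```

When the sampled summary and the greedy summary score identically, the advantage is zero and the example contributes no gradient; only samples that outperform (or underperform) the greedy baseline move the model.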