DOI: 10.1145/3477495.3531990
research-article

HTKG: Deep Keyphrase Generation with Neural Hierarchical Topic Guidance

Published: 07 July 2022

Abstract

Keyphrases concisely describe the high-level topics discussed in a document, and documents usually have hierarchical topic structures. It is therefore crucial to understand these hierarchical topic structures and use them to guide keyphrase identification. However, integrating hierarchical topic information into a deep keyphrase generation model remains unexplored. In this paper, we focus on how to effectively exploit hierarchical topics to improve keyphrase generation performance (HTKG). Specifically, we propose a novel hierarchical topic-guided variational neural sequence generation method for keyphrase generation, which consists of two major modules: a neural hierarchical topic model that learns the latent topic tree across the whole corpus of documents, and a variational neural keyphrase generation model that generates keyphrases under hierarchical topic guidance. The two modules are jointly trained so that they learn complementary information from each other. To the best of our knowledge, this is the first attempt to leverage neural hierarchical topics to guide keyphrase generation. Experimental results demonstrate that our method significantly outperforms existing state-of-the-art methods across five benchmark datasets.
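As a rough illustration of what joint training of two variational modules can look like, the sketch below adds the negative evidence lower bound (reconstruction or generation loss plus a KL term) of a topic model to that of a generation model. This is not the objective from the paper, which the abstract does not specify: the function names, the diagonal-Gaussian posterior, and the `topic_weight` balancing factor are all assumptions made for this example.

```python
import math
from typing import Sequence


def gaussian_kl(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), the standard VAE regularizer.

    Assumption: a diagonal Gaussian posterior; the paper may use a different
    family for its latent variables.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def joint_loss(
    topic_nll: float,
    topic_kl: float,
    gen_nll: float,
    gen_kl: float,
    topic_weight: float = 1.0,
) -> float:
    """Combine two negative ELBOs into one scalar training loss.

    topic_nll / topic_kl: document reconstruction loss and KL term of the
        (hypothetical) topic module.
    gen_nll / gen_kl: keyphrase generation loss and KL term of the
        (hypothetical) generation module.
    topic_weight: hypothetical factor balancing the two modules.
    """
    topic_loss = topic_nll + topic_kl
    gen_loss = gen_nll + gen_kl
    return gen_loss + topic_weight * topic_loss


if __name__ == "__main__":
    kl = gaussian_kl(mu=[1.0], log_var=[0.0])
    print(joint_loss(topic_nll=2.0, topic_kl=kl, gen_nll=3.0, gen_kl=0.25))
```

Minimizing a single summed loss lets gradients from the generation term reach parameters shared with the topic module, which is one common way for two modules to exchange information during training.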

    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
    ISBN:9781450387323
    DOI:10.1145/3477495
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep keyphrase generation
    2. neural hierarchical topic model
    3. variational neural generation model

    Qualifiers

    • Research-article

    Conference

    SIGIR '22
    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 53
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 08 Feb 2025

    Cited By

    • (2024) DKPE. Neurocomputing 572:C. DOI: 10.1016/j.neucom.2023.127177. Online publication date: 12-Apr-2024
    • (2024) Voice of the Professional: Acquiring competitive intelligence from large-scale professional generated contents. Journal of Business Research 180 (114719). DOI: 10.1016/j.jbusres.2024.114719. Online publication date: Jul-2024
    • (2024) A survey on neural topic models: methods, applications, and challenges. Artificial Intelligence Review 57:2. DOI: 10.1007/s10462-023-10661-7. Online publication date: 25-Jan-2024
    • (2024) Multi-level Contrastive Learning for Keyphrase Generation. Advanced Intelligent Computing Technology and Applications (238-249). DOI: 10.1007/978-981-97-5669-8_20. Online publication date: 3-Aug-2024
    • (2023) rHDP: An Aspect Sharing-Enhanced Hierarchical Topic Model for Multi-Domain Corpus. ACM Transactions on Information Systems 42:3 (1-31). DOI: 10.1145/3631352. Online publication date: 29-Dec-2023
    • (2023) From statistical methods to deep learning, automatic keyphrase prediction. Information Processing and Management: an International Journal 60:4. DOI: 10.1016/j.ipm.2023.103382. Online publication date: 1-Jul-2023
    • (2022) Keyword extraction as sequence labeling with classification algorithms. Neural Computing and Applications 35:4 (3413-3422). DOI: 10.1007/s00521-022-07906-x. Online publication date: 11-Oct-2022
