Abstract
Keyphrase extraction is essential for many IR and NLP tasks. Existing methods usually treat the phrases of a document separately, without distinguishing the potential semantic correlations among them, or rely on other statistical features from knowledge bases such as WordNet and Wikipedia. However, the mutual semantic information between phrases is also important, and exploiting these correlations may help extract keyphrases more effectively. Generally, phrases in the title are more likely to be keyphrases that reflect the document topics, while phrases in the body are usually used to describe those topics. We regard the relation between a title phrase and a body phrase as a description relation. To this end, this paper proposes a novel keyphrase extraction approach that exploits massive description relations. To make use of the semantic information provided by the description relations, we organize the phrases of a document as a description graph and employ various graph-based ranking algorithms to rank the candidates. Experimental results on a real dataset demonstrate the effectiveness of the proposed approach in keyphrase extraction.
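The abstract does not spell out how the description graph is built or which ranking algorithms are used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that description relations are counted as (body phrase, title phrase) co-occurrences over a corpus, that edges point from the describing body phrase to the described title phrase, and that a weighted PageRank serves as one possible graph-based ranker. All function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import product


def collect_description_relations(corpus):
    """Count (body_phrase, title_phrase) pairs over a corpus.

    `corpus` is an iterable of (title_phrases, body_phrases) pairs.
    Assumption: every body phrase of a document describes every title phrase.
    """
    counts = defaultdict(int)
    for title_phrases, body_phrases in corpus:
        for body, title in product(set(body_phrases), set(title_phrases)):
            if body != title:
                counts[(body, title)] += 1
    return counts


def build_description_graph(candidates, relation_counts):
    """Keep only the relations whose two endpoints are both candidates."""
    graph = {phrase: {} for phrase in candidates}
    for (body, title), weight in relation_counts.items():
        if body in graph and title in graph:
            # Assumed edge direction: describing phrase -> described phrase.
            graph[body][title] = weight
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    The score of nodes without outgoing edges is spread uniformly.
    """
    nodes = list(graph)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(scores[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n if n else 0.0
        new_scores = {v: base for v in nodes}
        for source, edges in graph.items():
            total = sum(edges.values())
            for target, weight in edges.items():
                new_scores[target] += damping * scores[source] * weight / total
        scores = new_scores
    return scores


def rank_candidates(candidates, relation_counts):
    """Return the candidates sorted by descending PageRank score."""
    graph = build_description_graph(candidates, relation_counts)
    scores = pagerank(graph)
    return sorted(scores, key=scores.get, reverse=True)
```

In this sketch a phrase that many other candidates describe accumulates score, which mirrors the intuition that title-like phrases are more likely to be keyphrases. Any other graph-based ranker could be substituted for `pagerank` without changing the graph construction.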
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, F., Wang, Z., Wang, S., Li, Z. (2014). Exploiting Description Knowledge for Keyphrase Extraction. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_11
DOI: https://doi.org/10.1007/978-3-319-13560-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)