TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network

Authors:

Jiawei Han

WWW '20: Proceedings of The Web Conference 2020

Pages 486 - 497

https://doi.org/10.1145/3366423.3380132

Published: 20 April 2020

Abstract

Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon and eBay) use taxonomies for product recommendation, and web search engines (e.g., Google and Bing) leverage taxonomies to enhance query understanding. Enormous efforts have been made on constructing taxonomies either manually or semi-automatically. However, with the fast-growing volume of web content, existing taxonomies will become outdated and fail to capture emerging knowledge. Therefore, in many applications, dynamic expansions of an existing taxonomy are in great demand. In this paper, we study how to expand an existing taxonomy by adding a set of new concepts. We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of ⟨query concept, anchor concept⟩ pairs from the existing taxonomy as training data. Using such self-supervision data, TaxoExpan learns a model to predict whether a query concept is the direct hyponym of an anchor concept. We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data. Extensive experiments on three large-scale datasets from different domains demonstrate both the effectiveness and the efficiency of TaxoExpan for taxonomy expansion.
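
To make the self-supervision idea concrete, the following is a minimal sketch of how ⟨query concept, anchor concept⟩ training pairs might be derived from an existing taxonomy. It assumes the taxonomy is available as a list of (parent, child) edges. The function name `make_training_pairs`, the toy taxonomy, and the uniform negative-sampling scheme are illustrative assumptions, not details stated in the abstract.

```python
import random
from collections import defaultdict


def make_training_pairs(edges, num_negatives=2, seed=0):
    """Build (query, anchor, label) triples from (parent, child) edges.

    Hypothetical helper: every existing child concept is treated as a
    "query" and its true parent as the positive "anchor". Negative anchors
    are sampled uniformly from the remaining concepts (an assumption made
    for illustration only).
    """
    rng = random.Random(seed)
    parents = defaultdict(set)  # child -> set of direct parents
    nodes = set()
    for parent, child in edges:
        parents[child].add(parent)
        nodes.update((parent, child))

    pairs = []
    for child in sorted(parents):
        # Positive pairs: the query is a direct hyponym of the anchor.
        for parent in sorted(parents[child]):
            pairs.append((child, parent, 1))
        # Negative pairs: any concept that is neither the query nor its parent.
        candidates = sorted(nodes - parents[child] - {child})
        k = min(num_negatives, len(candidates))
        for anchor in rng.sample(candidates, k):
            pairs.append((child, anchor, 0))
    return pairs


# Toy taxonomy, invented for this example.
toy_edges = [
    ("science", "computer science"),
    ("science", "biology"),
    ("computer science", "machine learning"),
    ("computer science", "databases"),
]
pairs = make_training_pairs(toy_edges)
for query, anchor, label in pairs:
    print(f"{query!r} -> {anchor!r}: {label}")
```

Negatives sampled this way can include concepts that are still plausible parents (for example, an ancestor of the query), which may be one source of the label noise that the noise-robust training objective is meant to tolerate.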

Information & Contributors

Information

Published In

WWW '20: Proceedings of The Web Conference 2020

April 2020

3143 pages

ISBN:9781450370233

DOI:10.1145/3366423

Editors:
Yennun Huang, Academia Sinica, Taiwan
Irwin King, The Chinese University of Hong Kong, Hong Kong
Tie-Yan Liu, Microsoft Research Asia, China
Maarten van Steen, University of Twente, Netherlands

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 April 2020

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '20

Sponsor:

SIGWEB

WWW '20: The Web Conference 2020

April 20 - 24, 2020

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total Citations: 36
Total Downloads: 1,035
Downloads (Last 12 months): 93
Downloads (Last 6 weeks): 7

Reflects downloads up to 11 Jan 2025

Cited By

Jiang S, Yao Q, Wang Q, Sun Y, Larson K (2024). A single vector is not enough. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 8421-8426. DOI: 10.24963/ijcai.2024/934. Online publication date: 3-Aug-2024.
https://dl.acm.org/doi/10.24963/ijcai.2024/934
Niu Y, Xu H, Liu C, Wen Y, Yuan X, Larson K (2024). Contrastive representation learning for self-supervised taxonomy completion. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 6442-6450. DOI: 10.24963/ijcai.2024/712. Online publication date: 3-Aug-2024.
https://dl.acm.org/doi/10.24963/ijcai.2024/712
Zhou Y, Jin D, Wei J, He D, Yu Z, Zhang W, Larson K (2024). Generalized taxonomy-guided graph neural networks. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 2616-2624. DOI: 10.24963/ijcai.2024/289. Online publication date: 3-Aug-2024.
https://dl.acm.org/doi/10.24963/ijcai.2024/289
Zhang F, Shi S, Zhu Y, Chen B, Cen Y, Yu J, Chen Y, Wang L, Zhao Q, Cheng Y, Han T, An Y, Zhang D, Tam W, Cao K, Pang Y, Guan X, Yuan H, Song J, Li X, Dong Y, Tang J, Baeza-Yates R, Bonchi F (2024). OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 6214-6225. DOI: 10.1145/3637528.3672354. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3672354
Komarlu T, Jiang M, Wang X, Han J, Baeza-Yates R, Bonchi F (2024). OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 1407-1417. DOI: 10.1145/3637528.3671745. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671745
Zhang Y, Zhong M, Ouyang S, Jiao Y, Zhou S, Ding L, Han J, Baeza-Yates R, Bonchi F (2024). Automated Mining of Structured Knowledge from Text in the Era of Large Language Models. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 6644-6654. DOI: 10.1145/3637528.3671469. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671469
Kumar A, Shalghar A, Chauhan H, Ganesan B, Chaudhuri R, Kannan A (2024). Document structure aware Relation Extraction for Semantic Automation. Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), pp. 232-236. DOI: 10.1145/3632410.3632466. Online publication date: 4-Jan-2024.
https://dl.acm.org/doi/10.1145/3632410.3632466
Huang S, Ma S, Li Y, Li Y, Zheng H, Serra E, Spezzano F (2024). From Retrieval to Generation: Efficient and Effective Entity Set Expansion. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 921-931. DOI: 10.1145/3627673.3679837. Online publication date: 21-Oct-2024.
https://dl.acm.org/doi/10.1145/3627673.3679837
Shi J, Dong H, Chen J, Wu Z, Horrocks I, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Taxonomy Completion via Implicit Concept Insertion. Proceedings of the ACM Web Conference 2024, pp. 2159-2169. DOI: 10.1145/3589334.3645584. Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589334.3645584
Massai L (2024). Evaluation of semantic relations impact in query expansion-based retrieval systems. Knowledge-Based Systems 283:C. DOI: 10.1016/j.knosys.2023.111183. Online publication date: 11-Jan-2024.
https://dl.acm.org/doi/10.1016/j.knosys.2023.111183
