Unsupervised Semantic Association Learning with Latent Label Inference

Authors:

Yongyi Mao

WWW '21: Proceedings of the Web Conference 2021

Pages 4010 - 4019

https://doi.org/10.1145/3442381.3450132

Published: 03 June 2021

Abstract

In this paper, we unify a diverse set of learning tasks in NLP, semantic retrieval, and related areas under a common umbrella, which we call unsupervised semantic association learning (USAL). Examples of this generic task include word sense disambiguation, answer selection, and question retrieval. We then present a novel modeling framework to tackle such tasks. Under the deep learning paradigm, the framework introduces a latent label that indexes the true target in the candidate target set. An EM algorithm, grounded in variational techniques and noise contrastive estimation, is then developed for learning the deep model and inferring the latent variables. We apply the model and algorithm to several semantic retrieval benchmark tasks, and empirical studies demonstrate the superior performance of the proposed approach.
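To make the idea of a latent label indexing the true target more concrete, the following is a minimal sketch of EM on a toy problem. It is not the paper's model: the deep network, the variational treatment, and noise contrastive estimation are all replaced by a linear-Gaussian stand-in, where each query is assumed to be generated as `W @ t_z` plus unit-variance Gaussian noise, with `z` drawn uniformly from the candidate set. The function names (`e_step`, `m_step`, `log_likelihood`, `fit`) and the matrix `W` are hypothetical and chosen only for illustration.

```python
import numpy as np


def e_step(x, cands, W):
    """Posterior over the latent label z for every query.

    x: (n, d) queries; cands: (n, K, m) candidate targets; W: (d, m).
    Assumes a uniform prior over the K candidates (toy assumption).
    """
    sq = ((x[:, None, :] - cands @ W.T) ** 2).sum(-1)  # (n, K) squared errors
    logit = -0.5 * sq
    logit -= logit.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(logit)
    return post / post.sum(axis=1, keepdims=True)


def m_step(x, cands, post):
    """Weighted least squares: minimize sum_ik post_ik * ||x_i - W t_ik||^2."""
    A = np.einsum("nk,nkp,nkq->pq", post, cands, cands)  # (m, m)
    B = np.einsum("nk,nd,nkp->dp", post, x, cands)       # (d, m)
    # Solve W A = B; A is symmetric, so solve A W^T = B^T instead.
    return np.linalg.lstsq(A, B.T, rcond=None)[0].T


def log_likelihood(x, cands, W):
    """Marginal log-likelihood with the latent label summed out."""
    _, K, _ = cands.shape
    d = x.shape[1]
    a = -0.5 * ((x[:, None, :] - cands @ W.T) ** 2).sum(-1)
    mx = a.max(axis=1, keepdims=True)
    lse = mx[:, 0] + np.log(np.exp(a - mx).sum(axis=1))
    return float((lse - np.log(K) - 0.5 * d * np.log(2 * np.pi)).sum())


def fit(x, cands, n_iter=20, seed=0):
    """Alternate E and M steps; return W and the log-likelihood trace."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(x.shape[1], cands.shape[2]))
    history = []
    for _ in range(n_iter):
        post = e_step(x, cands, W)   # infer the latent labels
        W = m_step(x, cands, post)   # refit the model given those beliefs
        history.append(log_likelihood(x, cands, W))
    return W, history
```

After fitting, `e_step(x, cands, W).argmax(axis=1)` gives the inferred index of the target for each query. Because the M-step here is exact, the log-likelihood trace returned by `fit` should not decrease from one iteration to the next, which is a convenient sanity check for this kind of sketch.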


Index terms: Computing methodologies > Artificial intelligence


Published In

WWW '21: Proceedings of the Web Conference 2021

April 2021

4054 pages

ISBN: 9781450383127

DOI: 10.1145/3442381

Editors: Jure Leskovec (Stanford), Marko Grobelnik (Jožef Stefan Institute), Marc Najork (Google), Jie Tang (Tsinghua University), Leila Zia (Wikimedia Foundation)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '21

Sponsor:

SIGWEB

WWW '21: The Web Conference 2021

April 19 - 23, 2021

Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

