Research article, DOI: 10.1145/3308558.3313707

Semantic Text Matching for Long-Form Documents

Published: 13 May 2019

Abstract

Semantic text matching is one of the most important research problems in many domains, including, but not limited to, information retrieval, question answering, and recommendation. Among the different types of semantic text matching, long-document-to-long-document text matching has many applications, but has rarely been studied. Most existing approaches for semantic text matching have limited success in this setting, due to their inability to capture and distill the main ideas and topics from long-form text.
In this paper, we propose a novel Siamese multi-depth attention-based hierarchical recurrent neural network (SMASH RNN) that learns long-form semantics and enables semantic text matching over long-form documents. In addition to word information, SMASH RNN uses the document structure to improve the representation of long-form documents. Specifically, SMASH RNN synthesizes information from different document structure levels, including paragraphs, sentences, and words. An attention-based hierarchical RNN derives a representation for each document structure level. The representations learned from the different levels are then aggregated into a more comprehensive semantic representation of the entire document. For semantic text matching, a Siamese structure couples the representations of a pair of documents and infers a probabilistic score as their similarity.
We conduct an extensive empirical evaluation of SMASH RNN on three practical applications: email attachment suggestion, related article recommendation, and citation recommendation. Experimental results on public data sets demonstrate that SMASH RNN significantly outperforms competitive baseline methods across various classification and ranking scenarios in the context of semantic matching of long-form documents.
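
As a rough illustration of the architecture described in the abstract, the following Python sketch (standard library only) encodes a document at three depths (words only, words to sentences, and words to sentences to paragraphs) with an attention-pooled recurrent encoder at each level, concatenates the three results, and scores a document pair through a shared (Siamese) encoder. It is not the authors' implementation: the plain tanh recurrent cell, the hidden size `DIM`, the random untrained weights, the logistic output layer, and all class and function names (`AttnRNN`, `DocEncoder`, `SiameseMatcher`, `random_doc`) are assumptions made for illustration.

```python
import math
import random

DIM = 4  # hypothetical embedding/hidden size, chosen only for illustration


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class AttnRNN:
    """One hierarchy level: a plain tanh RNN followed by attention pooling.

    Assumption: a simple tanh cell stands in for whatever recurrent unit
    the paper uses; weights are random and untrained.
    """

    def __init__(self, rng):
        self.w_in = rand_matrix(DIM, DIM, rng)
        self.w_rec = rand_matrix(DIM, DIM, rng)
        self.context = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]

    def encode(self, vectors):
        hidden = [0.0] * DIM
        states = []
        for x in vectors:
            a = matvec(self.w_in, x)
            b = matvec(self.w_rec, hidden)
            hidden = [math.tanh(p + q) for p, q in zip(a, b)]
            states.append(hidden)
        # Attention weights: softmax over dot(state, context vector).
        scores = [sum(s * c for s, c in zip(st, self.context)) for st in states]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of hidden states is the representation of this unit.
        return [sum(w * st[i] for w, st in zip(weights, states)) for i in range(DIM)]


class DocEncoder:
    """Encodes a document at three depths and concatenates the results."""

    def __init__(self, rng):
        # One stack of level encoders per depth (1, 2, and 3 levels).
        self.stacks = [[AttnRNN(rng) for _ in range(depth)] for depth in (1, 2, 3)]

    def encode(self, doc):
        # doc: list of paragraphs; paragraph: list of sentences;
        # sentence: list of DIM-sized word vectors.
        sentences = [s for para in doc for s in para]
        words = [w for s in sentences for w in s]
        d1 = self.stacks[0][0].encode(words)
        word2, sent2 = self.stacks[1]
        d2 = sent2.encode([word2.encode(s) for s in sentences])
        word3, sent3, para3 = self.stacks[2]
        d3 = para3.encode(
            [sent3.encode([word3.encode(s) for s in para]) for para in doc]
        )
        return d1 + d2 + d3  # list concatenation, length 3 * DIM


class SiameseMatcher:
    """Applies the same encoder to both documents, then a logistic layer."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.encoder = DocEncoder(rng)  # shared weights: the Siamese part
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(2 * 3 * DIM)]
        self.bias = 0.0

    def score(self, doc_a, doc_b):
        pair = self.encoder.encode(doc_a) + self.encoder.encode(doc_b)
        logit = sum(w * x for w, x in zip(self.w_out, pair)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))  # similarity in (0, 1)


def random_doc(rng, shape):
    """Builds a toy document; shape lists sentence lengths per paragraph."""
    return [
        [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n)] for n in para]
        for para in shape
    ]
```

For example, `SiameseMatcher().score(random_doc(rng, [[3, 2], [4]]), random_doc(rng, [[2], [1, 5]]))` with `rng = random.Random(1)` returns a value strictly between 0 and 1. Because the weights are untrained, that value carries no meaning; in a trained model it would be fitted against matching and non-matching document pairs.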

Information & Contributors

Information

Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Semantic text matching
  2. attention mechanism
  3. hierarchical document structures
  4. long documents
  5. recurrent neural networks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 72
  • Downloads (last 6 weeks): 7
Reflects downloads up to 25 Dec 2024

Cited By

  • (2024) Revolutionizing Duplicate Question Detection: A Deep Learning Approach for Stack Overflow. IgMin Research 2:1 (001-005). DOI: 10.61927/igmin135. Online publication date: 9-Jan-2024
  • (2024) Ontology and its applications in skills matching in job recruitment. Applied Ontology 19:3 (287-306). DOI: 10.3233/AO-240019. Online publication date: 22-Oct-2024
  • (2024) Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (5398-5408). DOI: 10.1145/3637528.3671559. Online publication date: 25-Aug-2024
  • (2024) TRGNN: Text-Rich Graph Neural Network for Few-Shot Document Filtering. 2024 International Joint Conference on Neural Networks (IJCNN) (1-9). DOI: 10.1109/IJCNN60899.2024.10650066. Online publication date: 30-Jun-2024
  • (2024) Similarity-Based Source Code Vulnerability Detection Leveraging Transformer Architecture: Harnessing Cross-Attention for Hierarchical Analysis. IEEE Access 12 (150295-150307). DOI: 10.1109/ACCESS.2024.3474857. Online publication date: 2024
  • (2024) CODE-SMASH: Source-Code Vulnerability Detection Using Siamese and Multi-Level Neural Architecture. IEEE Access 12 (102492-102504). DOI: 10.1109/ACCESS.2024.3432323. Online publication date: 2024
  • (2024) Hierarchical and Multiple-Perspective Interaction Network for Long Text Matching. IEEE Access 12 (11135-11146). DOI: 10.1109/ACCESS.2024.3354723. Online publication date: 2024
  • (2024) Match-Unity: Long-Form Text Matching With Knowledge Complementarity. IEEE Access 12 (3629-3637). DOI: 10.1109/ACCESS.2023.3349089. Online publication date: 2024
  • (2024) bjEnet: a fast and accurate software bug localization method in natural language semantic space. Software Quality Journal 32:4 (1515-1538). DOI: 10.1007/s11219-024-09693-1. Online publication date: 22-Jul-2024
  • (2024) HyperMatch: long-form text matching via hypergraph convolutional networks. Knowledge and Information Systems 66:11 (6597-6616). DOI: 10.1007/s10115-024-02173-9. Online publication date: 12-Jul-2024
