DOI: 10.1145/3366423.3380126
research-article

Generating Clarifying Questions for Information Retrieval

Published: 20 April 2020

Abstract

Search queries are often short, and the underlying user intent may be ambiguous. This makes it challenging for search engines to predict possible intents, only one of which may pertain to the current user. To address this issue, search engines often diversify the result list and present documents relevant to multiple intents of the query. An alternative approach is to ask the user a question to clarify her information need. Asking clarifying questions is particularly important for scenarios with “limited bandwidth” interfaces, such as speech-only and small-screen devices. In addition, our user studies and large-scale online experiments show that asking clarifying questions is also useful in web search. Although some recent studies have pointed out the importance of asking clarifying questions, generating them for open-domain search tasks remains unstudied and is the focus of this paper. Lack of training data even within major search engines for this task makes it challenging. To mitigate this issue, we first identify a taxonomy of clarification for open-domain search queries by analyzing large-scale query reformulation data sampled from Bing search logs. This taxonomy leads us to a set of question templates and a simple yet effective slot filling algorithm. We further use this model as a source of weak supervision to automatically generate clarifying questions for training. Furthermore, we propose supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data. We also investigate methods for generating candidate answers for each clarifying question, so users can select from a set of pre-defined answers. Human evaluation of the clarifying questions and candidate answers for hundreds of search queries demonstrates the effectiveness of the proposed solutions.
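To make the template-and-slot-filling idea concrete, here is a minimal sketch. It is not the authors' algorithm: the two templates, the `generate_clarification` function, the `Clarification` type, and the assumption that an entity type and a list of query aspects are already available (for example, mined from query reformulations) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical templates for illustration only; the abstract does not list
# the templates derived from the paper's taxonomy.
TYPED_TEMPLATE = "What {entity_type} are you looking for?"
GENERIC_TEMPLATE = "What would you like to know about {query}?"


@dataclass
class Clarification:
    question: str
    candidate_answers: List[str]


def generate_clarification(
    query: str,
    aspects: List[str],
    entity_type: Optional[str] = None,
    max_answers: int = 5,
) -> Clarification:
    """Fill a question template and pick candidate answers (illustrative)."""
    # Use the more specific template only when the type slot can be filled.
    if entity_type:
        question = TYPED_TEMPLATE.format(entity_type=entity_type)
    else:
        question = GENERIC_TEMPLATE.format(query=query)

    # Deduplicate aspects case-insensitively while preserving their order.
    seen = set()
    answers: List[str] = []
    for aspect in aspects:
        key = aspect.strip().lower()
        if key and key not in seen:
            seen.add(key)
            answers.append(aspect.strip())
    return Clarification(question, answers[:max_answers])
```

In a weak-supervision setup like the one the abstract describes, question and answer pairs produced by a rule-based generator of this kind could serve as noisy training targets for a learned model. The sketch above covers only the rule-based step.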

Published In

WWW '20: Proceedings of The Web Conference 2020
April 2020
3143 pages
ISBN: 9781450370233
DOI: 10.1145/3366423

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

WWW '20: The Web Conference 2020
April 20 - 24, 2020
Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%