research-article

Generating Clarifying Questions for Information Retrieval

Authors:

Gord Lueck

WWW '20: Proceedings of The Web Conference 2020

Pages 418 - 428

https://doi.org/10.1145/3366423.3380126

Published: 20 April 2020

Abstract

Search queries are often short, and the underlying user intent may be ambiguous. This makes it challenging for search engines to predict possible intents, only one of which may pertain to the current user. To address this issue, search engines often diversify the result list and present documents relevant to multiple intents of the query. An alternative approach is to ask the user a question to clarify her information need. Asking clarifying questions is particularly important for scenarios with “limited bandwidth” interfaces, such as speech-only and small-screen devices. In addition, our user studies and large-scale online experiments show that asking clarifying questions is also useful in web search. Although some recent studies have pointed out the importance of asking clarifying questions, generating them for open-domain search tasks remains unstudied and is the focus of this paper. Lack of training data even within major search engines for this task makes it challenging. To mitigate this issue, we first identify a taxonomy of clarification for open-domain search queries by analyzing large-scale query reformulation data sampled from Bing search logs. This taxonomy leads us to a set of question templates and a simple yet effective slot filling algorithm. We further use this model as a source of weak supervision to automatically generate clarifying questions for training. Furthermore, we propose supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data. We also investigate methods for generating candidate answers for each clarifying question, so users can select from a set of pre-defined answers. Human evaluation of the clarifying questions and candidate answers for hundreds of search queries demonstrates the effectiveness of the proposed solutions.
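
The template-and-slot-filling idea mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template wording, the `fill_template` and `candidate_answers` helpers, and the choice to derive candidate answers from terms that users append in reformulated queries. It is not the taxonomy, the templates, or the algorithm from the paper.

```python
from collections import Counter

# Hypothetical question templates; the abstract does not list the real ones.
TEMPLATES = {
    "generic": "What would you like to know about {query}?",
    "typed": "Which {entity_type} are you looking for?",
}


def fill_template(query, entity_type=None):
    """Fill the slots of a question template.

    Uses the typed template when an entity type is known and
    falls back to the generic template otherwise.
    """
    if entity_type:
        return TEMPLATES["typed"].format(entity_type=entity_type)
    return TEMPLATES["generic"].format(query=query)


def candidate_answers(query, reformulations, k=3):
    """Return up to k candidate answers for a query (illustrative heuristic).

    Counts the suffixes that appear after the original query in the
    given reformulations and returns the most frequent ones.
    """
    prefix = query.lower() + " "
    counts = Counter()
    for reformulation in reformulations:
        text = reformulation.lower()
        if text.startswith(prefix):
            counts[text[len(prefix):]] += 1
    return [term for term, _ in counts.most_common(k)]


if __name__ == "__main__":
    logs = ["jaguar car", "jaguar animal", "jaguar car", "Jaguar price", "tiger"]
    print(fill_template("jaguar"))
    print(candidate_answers("jaguar", logs, k=2))
```

In this sketch the reformulation list stands in for the query reformulation data the abstract describes; a real system would need to handle reformulations that do not simply extend the original query.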

Index Terms

Generating Clarifying Questions for Information Retrieval
1. Information systems
  1. Information retrieval

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

WWW '20: Proceedings of The Web Conference 2020

April 2020

3143 pages

ISBN:9781450370233

DOI:10.1145/3366423

Editors:
Yennun Huang (Academia Sinica, Taiwan),
Irwin King (The Chinese University of Hong Kong, Hong Kong),
Tie-Yan Liu (Microsoft Research Asia, China),
Maarten van Steen (University of Twente, Netherlands)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '20

Sponsor:

SIGWEB

WWW '20: The Web Conference 2020

April 20 - 24, 2020

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 106
Total Downloads: 2,233
Downloads (Last 12 months): 371
Downloads (Last 6 weeks): 35

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Deng Y, Liao L, Lei W, Yang G, Lam W, Chua T (2025). Proactive Conversational AI: A Comprehensive Survey of Advancements and Opportunities. ACM Transactions on Information Systems. DOI: 10.1145/3715097. Online publication date: 24-Jan-2025.
https://doi.org/10.1145/3715097
Salamat S, Arabzadeh N, Seyedsalehi S, Bigdeli A, Zihayat M, Bagheri E (2025). A contrastive neural disentanglement approach for query performance prediction. Machine Learning 114:4. DOI: 10.1007/s10994-025-06752-x. Online publication date: 25-Feb-2025.
https://doi.org/10.1007/s10994-025-06752-x
Saleminezhad A, Arabzadeh N, Rad R, Beheshti S, Bagheri E (2025). Robust query performance prediction for dense retrievers via adaptive disturbance generation. Machine Learning 114:3. DOI: 10.1007/s10994-024-06659-z. Online publication date: 6-Feb-2025.
https://doi.org/10.1007/s10994-024-06659-z
Tavakoli L, Trippas J, Zamani H, Scholer F, Sanderson M (2024). Online and Offline Evaluation in Search Clarification. ACM Transactions on Information Systems 43:1 (1-30). DOI: 10.1145/3681786. Online publication date: 4-Nov-2024.
https://dl.acm.org/doi/10.1145/3681786
Sekulić I, Lu L, Bedi N, Crestani F, Sakai T, Ishita E, Ohshima H, Hasibi F, Mao J, Jose J (2024). Simulating Conversational Search Users with Parameterized Behavior. Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (72-81). DOI: 10.1145/3673791.3698425. Online publication date: 8-Dec-2024.
https://dl.acm.org/doi/10.1145/3673791.3698425
Yanagi R, Togo R, Ogawa T, Haseyama M, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). DQG: Database Question Generation for Exact Text-based Image Retrieval. Proceedings of the 32nd ACM International Conference on Multimedia (7424-7433). DOI: 10.1145/3664647.3681469. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681469
White R (2024). Advancing the Search Frontier with AI Agents. Communications of the ACM. DOI: 10.1145/3655615. Online publication date: 20-Aug-2024.
https://doi.org/10.1145/3655615
Sekulić I, Alinannejadi M, Crestani F (2024). Analysing Utterances in LLM-Based User Simulation for Conversational Search. ACM Transactions on Intelligent Systems and Technology 15:3 (1-22). DOI: 10.1145/3650041. Online publication date: 5-Mar-2024.
https://dl.acm.org/doi/10.1145/3650041
Ye D, Liu J, Fan J, Tian B, Zhou T, Chen X, Ma J, Baeza-Yates R, Bonchi F (2024). Enhancing Asymmetric Web Search through Question-Answer Generation and Ranking. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (6127-6136). DOI: 10.1145/3637528.3671517. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671517
Zhao Z, Dou Z, Zhou Y, Serra E, Spezzano F (2024). Generating Intent-aware Clarifying Questions in Conversational Information Retrieval Systems. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (3384-3394). DOI: 10.1145/3627673.3679851. Online publication date: 21-Oct-2024.
https://dl.acm.org/doi/10.1145/3627673.3679851
