Joint question clustering and relevance prediction for open domain non-factoid question answering

Authors: Snigdha Chaturvedi, Vittorio Castelli, Ramesh M. Nallapati, Hema Raghavan

WWW '14: Proceedings of the 23rd international conference on World wide web

Pages 503 - 514

https://doi.org/10.1145/2566486.2567999

Published: 07 April 2014

Abstract

Web searches are increasingly formulated as natural language questions, rather than keyword queries. Retrieving answers to such questions requires a degree of understanding of user expectations. An important step in this direction is to automatically infer the type of answer implied by the question, e.g., factoids, statements on a topic, instructions, reviews, etc. Answer Type taxonomies currently exist for factoid-style questions, but not for open-domain questions. Building taxonomies for non-factoid questions is a harder problem since these questions can come from a very broad semantic space. A few attempts have been made to develop taxonomies for non-factoid questions, but these tend to be too narrow or domain specific. In this paper, we address this problem by modeling the Answer Type as a latent variable that is learned in a data-driven fashion, allowing the model to be more adaptive to new domains and data sets. We propose approaches that detect the relevance of candidate answers to a user question by jointly 'clustering' questions according to the hidden variable, and modeling relevance conditioned on this hidden variable.

In this paper we propose three new models that automatically learn question clusters and cluster-specific relevance models: (a) Logistic Regression Mixture (LRM), (b) Glocal Logistic Regression Mixture (G-LRM), and (c) Mixture Glocal Logistic Regression Mixture (MG-LRM). On a newsgroups dataset, all three models perform better than a baseline relevance model that uses explicit Answer Type categories predicted by a supervised Answer-Type classifier. On a blogs dataset, our models also perform better than a baseline relevance model that does not use any answer-type information.
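The abstract does not give the model equations, so the following is only a minimal sketch of the general idea it describes: a latent cluster per question, with a separate logistic relevance model per cluster, fitted jointly. It assumes a plain mixture of logistic regressions trained with EM, where relevance is scored as p(y=1 | x) = sum_k pi_k * sigmoid(w_k . x). The function names (`fit_lr_mixture`, `predict_proba`), the feature layout, the constant cluster priors, and the gradient-ascent M-step are all illustrative assumptions, not the authors' LRM, G-LRM, or MG-LRM formulations.

```python
import numpy as np

def sigmoid(t):
    # Clip the argument to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))

def fit_lr_mixture(X, y, qid, K=2, n_iter=50, lr=0.5, grad_steps=20, seed=0):
    """EM for K logistic regressions with one latent cluster per question.

    X:   (N, D) features of question-answer pairs (hypothetical layout)
    y:   (N,)   0/1 relevance labels
    qid: (N,)   integer question index in [0, Q)
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Q = int(qid.max()) + 1
    W = rng.normal(scale=0.1, size=(K, D))  # one weight vector per cluster
    pi = np.full(K, 1.0 / K)                # cluster priors
    yc = y[:, None]
    for _ in range(n_iter):
        # E-step: responsibility of cluster k for question q is proportional to
        # pi_k times the likelihood of all labelled answers of q under w_k.
        P = sigmoid(X @ W.T)                                   # (N, K)
        loglik = yc * np.log(P + 1e-12) + (1 - yc) * np.log(1 - P + 1e-12)
        log_r = np.tile(np.log(pi), (Q, 1))                    # (Q, K)
        np.add.at(log_r, qid, loglik)                          # sum per question
        log_r -= log_r.max(axis=1, keepdims=True)              # numerical stability
        R = np.exp(log_r)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: update priors, then a few weighted gradient-ascent steps.
        pi = np.clip(R.mean(axis=0), 1e-8, None)
        pi /= pi.sum()
        Rn = R[qid]                                            # (N, K) pair weights
        for _ in range(grad_steps):
            P = sigmoid(X @ W.T)
            W += lr * ((Rn * (yc - P)).T @ X) / N              # (K, D) gradient
    return W, pi, R

def predict_proba(X, W, pi):
    # Relevance of unseen pairs, marginalising the cluster under the prior.
    return sigmoid(X @ W.T) @ pi
```

In this sketch the rows of `R` act as soft question clusters and the rows of `W` as cluster-specific relevance models. Using constant priors at prediction time is a simplification; a gating function over question features would be one way to make the cluster assignment depend on the question itself.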

Index Terms

Joint question clustering and relevance prediction for open domain non-factoid question answering
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Information & Contributors

Information

Published In

WWW '14: Proceedings of the 23rd international conference on World wide web

April 2014

926 pages

ISBN: 9781450327442

DOI: 10.1145/2566486

General Chair: Chin-Wan Chung (Korea Advanced Institute of Science and Technology, Korea)

Program Chairs: Andrei Broder (Google Inc., USA), Kyuseok Shim (Seoul National University, Korea), Torsten Suel (New York University, USA)

Copyright © 2014. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW '14

Sponsor:

IW3C2

WWW '14: 23rd International World Wide Web Conference

April 7 - 11, 2014

Seoul, Korea

Acceptance Rates

WWW '14 Paper Acceptance Rate 84 of 645 submissions, 13%;

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 7
Total Downloads: 478
Downloads (Last 12 months): 8
Downloads (Last 6 weeks): 1

Reflects downloads up to 12 Dec 2024

Citations

Cited By

Ahmed M, Khan H, Munir E (2024). Conversational AI: An Explication of Few-Shot Learning Problem in Transformers-Based Chatbot Systems. IEEE Transactions on Computational Social Systems 11:2 (1888-1906). Online publication date: Apr-2024
https://doi.org/10.1109/TCSS.2023.3281492
Uwoghiren E, Oladipupo O, Oyelade J (2024). Advancements and Trends in Non-Factoid Question Answering: A Comprehensive Systematic Literature Review. 2024 International Conference on Science, Engineering and Business for Driving Sustainable Development Goals (SEB4SDG) (1-17). Online publication date: 2-Apr-2024
https://doi.org/10.1109/SEB4SDG60871.2024.10629871
Bolotova V, Blinov V, Scholer F, Croft W, Sanderson M, Amigo E, Castells P, Gonzalo J, Carterette B, Culpepper J, Kazai G (2022). A Non-Factoid Question-Answering Taxonomy. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (1196-1207). Online publication date: 6-Jul-2022
https://dl.acm.org/doi/10.1145/3477495.3531926
Nur N, Park N, Dorodchi M, Dou W, Mahzoon M, Niu X, Maher M (2019). Student Network Analysis: A Novel Way to Predict Delayed Graduation in Higher Education. Artificial Intelligence in Education (370-382). Online publication date: 21-Jun-2019
https://doi.org/10.1007/978-3-030-23204-7_31
Sharma A, Harithas C (2018). Inner Attention Based bi-LSTMs with Indexing for non-Factoid Question Answering. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) (1-7). Online publication date: Dec-2018
https://doi.org/10.1109/ICMLA.2018.00009
Song J, Zhang Y, Duan K, Shamim Hossain M, Rahman S (2017). TOLA: Topic-oriented learning assistance based on cyber-physical system and big data. Future Generation Computer Systems 75 (200-205). Online publication date: Oct-2017
https://doi.org/10.1016/j.future.2016.05.040
Srivastava S, Chaturvedi S, Mitchell T (2016). Inferring interpersonal relations in narrative summaries. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2807-2813). Online publication date: 12-Feb-2016
https://dl.acm.org/doi/10.5555/3016100.3016294
Sun H, Ma H, Yih W, Tsai C, Liu J, Chang M, Gangemi A, Leonardi S, Panconesi A (2015). Open Domain Question Answering via Semantic Enrichment. Proceedings of the 24th International Conference on World Wide Web (1045-1055). Online publication date: 18-May-2015
https://dl.acm.org/doi/10.1145/2736277.2741651
