research-article

Query expansion based on clustered results

Published: 01 March 2011

Abstract

Query expansion is a search engine feature that suggests a set of related queries for a user-issued keyword query. Typical corpus-driven keyword query expansion approaches return popular words in the results as expanded queries. With these approaches, the expanded queries may cover only a subset of the possible query semantics and thus miss relevant results. To handle ambiguous and exploratory queries, whose result relevance is difficult to judge, we propose a new framework for keyword query expansion: we first cluster the results according to a user-specified granularity, and then generate one expanded query for each cluster, such that the result set of the expanded query is ideally the corresponding cluster. We formalize this problem and show its APX-hardness. We then propose two efficient algorithms, iterative single-keyword refinement and partial elimination based convergence, which effectively generate a set of expanded queries from clustered results that provides a classification of the original query results. We believe our study of generating an optimal query based on the ground truth of the query results not only has applications in query expansion, but also has significance for studying keyword search quality in general.
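To make the framework concrete, the following is a minimal sketch of generating one expanded query per cluster. It is not the authors' algorithm: it is a simple greedy loop, loosely in the spirit of adding one keyword at a time. It assumes AND semantics for keyword queries and assumes the F-measure between a query's result set and the target cluster as the quality objective. The names retrieve, f_measure, expand_for_cluster, RESULTS, and CLUSTERS, as well as the toy data, are all hypothetical.

```python
from typing import Dict, Set

# Toy data (hypothetical): each result is modelled as a set of keywords.
RESULTS: Dict[str, Set[str]] = {
    "r1": {"java", "island", "indonesia"},
    "r2": {"java", "island", "travel"},
    "r3": {"java", "programming", "language"},
    "r4": {"java", "programming", "jvm"},
}
# Assumed output of a prior clustering step over the results of "java".
CLUSTERS = [{"r1", "r2"}, {"r3", "r4"}]


def retrieve(query: Set[str], results: Dict[str, Set[str]]) -> Set[str]:
    """Ids of results that contain every query keyword (AND semantics)."""
    return {rid for rid, words in results.items() if query <= words}


def f_measure(retrieved: Set[str], cluster: Set[str]) -> float:
    """Harmonic mean of precision and recall of retrieved against cluster."""
    hits = len(retrieved & cluster)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(cluster)
    return 2 * precision * recall / (precision + recall)


def expand_for_cluster(
    user_query: Set[str], cluster: Set[str], results: Dict[str, Set[str]]
) -> Set[str]:
    """Greedily add the keyword that most improves the F-measure."""
    query = set(user_query)
    best = f_measure(retrieve(query, results), cluster)
    # Candidate keywords are those that occur in the cluster's results.
    candidates = set().union(*(results[rid] for rid in cluster))
    while True:
        scored = [
            (f_measure(retrieve(query | {w}, results), cluster), w)
            for w in sorted(candidates - query)  # sorted for determinism
        ]
        if not scored:
            break
        score, word = max(scored, key=lambda pair: pair[0])
        if score <= best:  # no single keyword helps any more
            break
        query.add(word)
        best = score
    return query


if __name__ == "__main__":
    for cluster in CLUSTERS:
        print(sorted(expand_for_cluster({"java"}, cluster, RESULTS)))
```

On this toy data the ambiguous query "java" is expanded to {java, island} for the first cluster and {java, programming} for the second, so each expanded query retrieves exactly its cluster. A greedy loop like this can stop at a local optimum, which is consistent with the abstract's point that the underlying problem is hard to solve optimally.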



Published In

Proceedings of the VLDB Endowment, Volume 4, Issue 6
March 2011
71 pages

Publisher

VLDB Endowment

Publication History

Published: 01 March 2011
Published in PVLDB Volume 4, Issue 6

Qualifiers

  • Research-article

Cited By

  • (2023) A semantic transfer approach to keyword suggestion for search engine advertising. Electronic Commerce Research 23:2 (921-947). DOI: 10.1007/s10660-021-09496-7. Online publication date: 1-Jun-2023
  • (2019) Opportunities and challenges in enhancing access to metadata of cultural heritage collections: a survey. Artificial Intelligence Review 53:5 (3621-3646). DOI: 10.1007/s10462-019-09773-w. Online publication date: 9-Oct-2019
  • (2019) A survey of statistical approaches for query expansion. Knowledge and Information Systems 61:1 (1-25). DOI: 10.1007/s10115-018-1269-8. Online publication date: 1-Oct-2019
  • (2018) Klustree. Proceedings of the ACM India Joint International Conference on Data Science and Management of Data (265-272). DOI: 10.1145/3152494.3152509. Online publication date: 11-Jan-2018
  • (2016) FluxQuery. Proceedings of the 2016 International Conference on Management of Data (1333-1345). DOI: 10.1145/2882903.2882945. Online publication date: 26-Jun-2016
  • (2016) A query term re-weighting approach using document similarity. Information Processing and Management: an International Journal 52:3 (478-489). DOI: 10.1016/j.ipm.2015.09.002. Online publication date: 1-May-2016
  • (2013) Summarizing answer graphs induced by keyword queries. Proceedings of the VLDB Endowment 6:14 (1774-1785). DOI: 10.14778/2556549.2556561. Online publication date: 1-Sep-2013
  • (2013) Diversifying Query Suggestions by Using Topics from Wikipedia. Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 01 (139-146). DOI: 10.1109/WI-IAT.2013.21. Online publication date: 17-Nov-2013
  • (2012) Using Google™ facets as implicit feedback for query expansion in database searching. Proceedings of the 5th International Conference on PErvasive Technologies Related to Assistive Environments (1-8). DOI: 10.1145/2413097.2413138. Online publication date: 6-Jun-2012
  • (2012) Exploiting and Maintaining Materialized Views for XML Keyword Queries. ACM Transactions on Internet Technology 12:2 (1-27). DOI: 10.1145/2390209.2390212. Online publication date: 1-Dec-2012
