DOI: 10.1145/1989323.1989403
research-article

Context-sensitive ranking for document retrieval

Published: 12 June 2011

Abstract

We study the problem of context-sensitive ranking for document retrieval, where a context is defined as a sub-collection of documents and is specified by queries provided by domain-interested users. The motivation for context-sensitive search is that the ranking of the same keyword query generally depends on the context, because the underlying keyword statistics differ significantly from one context to another. The query evaluation challenge is the computation of keyword statistics at runtime, which involves expensive online aggregations. We leverage and extend materialized view research to deliver algorithms and data structures that evaluate context-sensitive queries efficiently. Specifically, a number of views are selected and materialized, each corresponding to one or more large contexts. At query time, the materialized views are used to compute the statistics from which ranking scores are derived. Experimental results show that context-sensitive ranking generally improves ranking quality, while our materialized view-based technique improves query efficiency.
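
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy corpus, the tag-based context predicate, the TF-IDF scoring formula, and the names `DOCS`, `VIEWS`, `compute_stats`, `stats_for`, and `rank` are all assumptions made for illustration. The abstract does not name a ranking model or a view layout. The sketch shows only that keyword statistics depend on the context, and that precomputing them for a large context avoids the online aggregation.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each document carries context tags and text.
DOCS = {
    "d1": {"tags": {"biology"}, "text": "cell growth and cell division"},
    "d2": {"tags": {"biology"}, "text": "protein growth in the cell"},
    "d3": {"tags": {"economics"}, "text": "economic growth and market growth"},
    "d4": {"tags": {"economics"}, "text": "market cell phone sales"},
}


def context_docs(tag):
    """Return ids of documents in the sub-collection selected by `tag`."""
    return {d for d, doc in DOCS.items() if tag in doc["tags"]}


def compute_stats(doc_ids):
    """Aggregate keyword statistics online: context size and document frequencies."""
    df = Counter()
    for d in doc_ids:
        df.update(set(DOCS[d]["text"].split()))
    return {"n": len(doc_ids), "df": df}


# Assumption: a "materialized view" is modeled as the precomputed statistics
# of one context. Only the "biology" context is materialized here.
VIEWS = {tag: compute_stats(context_docs(tag)) for tag in ("biology",)}


def stats_for(tag):
    """Use a materialized view if one exists, else aggregate at query time."""
    return VIEWS.get(tag) or compute_stats(context_docs(tag))


def rank(query, tag):
    """Rank the context's documents by TF-IDF using context-specific statistics."""
    stats = stats_for(tag)
    scores = {}
    for d in context_docs(tag):
        tf = Counter(DOCS[d]["text"].split())
        score = 0.0
        for term in query.split():
            df = stats["df"].get(term, 0)
            if df:
                score += tf[term] * math.log(1 + stats["n"] / df)
        scores[d] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this toy data the keyword "cell" appears in both biology documents but in only one economics document, so its inverse document frequency, and therefore its weight in the score, differs between the two contexts. Calling `rank("cell", "biology")` reads its statistics from the precomputed view, while `rank("cell", "economics")` falls back to aggregating at query time.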

Published In

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011
1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. context-sensitive ranking
  2. materialized views
  3. view selection

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 13 Dec 2024

Cited By

  • (2021) Cluster Analysis of Influencing Factors of Regional Economic Growth Based on Random Walk Model. 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1243-1246. DOI: 10.1109/ICECA52323.2021.9675884. Online publication date: 2-Dec-2021
  • (2017) Adaptive query relaxation and top-k result ranking over autonomous web databases. Knowledge and Information Systems 51:2, pp. 395-433. DOI: 10.1007/s10115-016-0982-4. Online publication date: 1-May-2017
  • (2014) Designing an information retrieval system for the STT/SC. 2014 IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom), pp. 500-505. DOI: 10.1109/HealthCom.2014.7001893. Online publication date: Oct-2014
  • (2013) Context-Sensitive Ranking Using Cross-Domain Knowledge for Chemical Digital Libraries. Research and Advanced Technology for Digital Libraries, pp. 285-296. DOI: 10.1007/978-3-642-40501-3_29. Online publication date: 2013
  • (2012) Context-aware document recommendation by mining sequential access data. Proceedings of the 1st International Workshop on Context Discovery and Data Mining, pp. 1-7. DOI: 10.1145/2346604.2346612. Online publication date: 12-Aug-2012
