More Web Proxy on the site http://driver.im/

research-article

Optimal Aggregation Policy for Reducing Tail Latency of Web Search

Authors:

Sameh Elnikety,

Shaolei RenAuthors Info & Claims

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 63 - 72

https://doi.org/10.1145/2766462.2767708

Published: 09 August 2015 Publication History

Abstract

A web search engine often employs partition-aggregate architecture, where an aggregator propagates a user query to all index serving nodes (ISNs) and collects the responses from them. An aggregation policy determines how long the aggregators wait for the ISNs before returning aggregated results to users, crucially affecting both query latency and quality. Designing an aggregation policy is, however, challenging: Response latency among queries and among ISNs varies significantly, and aggregators lack of knowledge about when ISNs will respond. In this paper, we propose aggregation policies that minimize tail latency of search queries subject to search quality service level agreements (SLAs), combining data-driven offline analysis with online processing. Beginning with a single aggregator, we formally prove the optimality of our policy: It achieves the offline optimal result without knowing future responses of ISNs. We extend our policy for commonly-used hierarchical levels of aggregators and prove its optimality when messaging times between aggregators are known. We also present an empirically-effective policy to address unknown messaging time. We use production traces from a commercial search engine, a commercial advertisement engine, and synthetic workloads to evaluate the aggregation policy. The results show that compared to prior work, the policy reduces tail latency by up to 40% while satisfying same quality SLAs.

References

[1]

M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. Data center tcp (dctcp). SIGCOMM Comput. Commun. Rev., 41(4):--, Aug. 2010.

Digital Library

[2]

I. S. Altingovde, R. Blanco, B. B. Cambazoglu, R. Ozcan, E. Sarigil, and O. Ulusoy. Characterizing web search queries that match very few or no results. In CIKM, 2012.

Digital Library

[3]

V. N. Anh, O. de Kretser, and A. Moffat. Vector-space ranking with effective early termination. In SIGIR, 2001.

Digital Library

[4]

I. Arapakis, X. Bai, and B. B. Cambazoglu. Impact of response latency on user behavior in web search. In SIGIR, 2014.

Digital Library

[5]

C. S. Badue, R. Baeza-Yates, B. Ribeiro-Neto, A. Ziviani, and N. Ziviani. Analyzing imbalance among homogeneous index servers in a web search system. Inf. Process. Manage., 43(3), 2007.

Digital Library

[6]

R. Baeza-Yates, A. Gionis, F. Junqueira, V. Murdock, V. Plachouras, and F. Silvestri. The impact of caching on search engines. In SIGIR, 2007.

Digital Library

[7]

L. A. Barroso, J. Dean, and U. Hölzle. Web search for a planet: The google cluster architecture. IEEE Micro, 23(2):22--28, 2003.

Digital Library

[8]

A. Broder and M. Mitzenmacher. Optimal plans for aggregation. In PODC, 2002.

Digital Library

[9]

B. B. Cambazoglu, I. S. Altingovde, R. Ozcan, and Ö. Ulusoy. Cache-based query processing for search engines. ACM Transactions on the Web, 6(4):14, 2012.

Digital Library

[10]

B. B. Cambazoglu, F. P. Junqueira, V. Plachouras, S. Banachowski, B. Cui, S. Lim, and B. Bridge. A refreshing perspective of search engine caching. In WWW, 2010.

Digital Library

[11]

B. B. Cambazoglu, V. Plachouras, and R. Baeza-Yates. Quantifying performance and quality gains in distributed web search engines. In SIGIR, 2009.

Digital Library

[12]

J. Dean. Challenges in building large-scale information retrieval systems (Invited talk). In WSDM, 2009.

Digital Library

[13]

J. Dean and L. A. Barroso. The tail at scale. Commun. ACM, 56(2):74--80, Feb. 2013.

Digital Library

[14]

S. Ding, J. He, H. Yan, and T. Suel. Using graphics processors for high performance ir query processing. In WWW, 2009.

Digital Library

[15]

E. Frachtenberg. Reducing query latencies in web search using fine-grained parallelism. In WWW, 2009.

Digital Library

[16]

Q. Gan and T. Suel. Improved techniques for result caching in web search engines. In WWW, 2009.

Digital Library

[17]

J. Hamilton. The cost of latency. http://perspectives.mvdirona.com/2009/10/31/TheCostOfLatency.aspx, 2009.

[18]

E. Hatcher and O. Gospodnetic. Lucene in Action. Manning Publications Co., 2004.

Digital Library

[19]

Y. He, S. Elnikety, J. Larus, and C. Yan. Zeta: Scheduling interactive services with partial execution. In SOCC, 2012.

Digital Library

[20]

V. Jalaparti, P. Bodik, S. Kandula, I. Menache, M. Rybalkin, and C. Yan. Speeding up distributed request-response workflows. In SIGCOMM '13, 2013.

Digital Library

[21]

M. Jeon, Y. He, S. Elnikety, A. L. Cox, and S. Rixner. Adaptive parallelism for web search. In EuroSys, 2013.

Digital Library

[22]

M. Jeon, S. Kim, S.-W. Hwang, Y. He, A. L. Cox, and S. Rixner. Taming tail latencies in web search. In SIGIR, 2014.

Digital Library

[23]

S. Jonassen, B. B. Cambazoglu, and F. Silvestri. Prefetching query results and its impact on search engines. In SIGIR, 2012.

Digital Library

[24]

J. R. Lorch and A. J. Smith. Improving dynamic voltage scaling algorithms with PACE. In SIGMETRICS, 2001.

Digital Library

[25]

C. Macdonald, N. Tonellotto, and I. Ounis. Learning to predict response times for online query scheduling. In SIGIR, 2012.

Digital Library

[26]

R. Ozcan, I. S. Altingovde, and Ö. Ulusoy. Cost-aware strategies for query result caching in web search engines. ACM Transactions on the Web, 5(2):9, 2011.

Digital Library

[27]

E. Schurman and J. Brutlag. The user and business impact of server delays, additional bytes, and http chunking in web search. In Velocity, 2009.

[28]

G. Upadhyaya, V. S. Pai, and S. P. Midkiff. Expressing and exploiting concurrency in networked applications with aspen. In PPoPP, 2007.

Digital Library

[29]

A. Vulimiri, O. Michel, P. B. Godfrey, and S. Shenker. More is less: Reducing latency via redundancy. In HotNets, 2012.

Digital Library

[30]

Z. Ye, A. A. Abouzeid, and J. Ai. Optimal stochastic policies for distributed data aggregation in wireless sensor networks. IEEE/ACM Trans. Netw., 17(5):1494--1507, Oct. 2009.

Digital Library

[31]

J. Yi, F. Maghoul, and J. Pedersen. Deciphering mobile search patterns: A study of Yahoo! mobile search queries. In WWW, 2008.

Digital Library

[32]

J.-M. Yun, Y. He, S. Elnikety, and S. Ren. Optimal aggregation policy for reducing tail latency of web search. Technical Report MSR-TR-2015--39, Microsoft Research, 2015. http://research.microsoft.com/apps/pubs/?id=246273.

[33]

F. Zhang, S. Shi, H. Yan, and J.-R. Wen. Revisiting globally sorted indexes for efficient document retrieval. In WSDM, 2010.

Digital Library

[34]

J. Zhang, X. Long, and T. Suel. Performance of compressed inverted list caching in search engines. In WWW, 2008.

Digital Library

[35]

S. Zhu, A. Potapova, M. Alabduljalil, X. Liu, and T. Yang. Clustering and load balancing optimization for redundant content removal. In WWW, 2012.

Digital Library

Cited By

Widmoser MKocher DAugsten N(2024)Scalable Distributed Inverted List Indexes in Disaggregated MemoryProceedings of the ACM on Management of Data10.1145/36549742:3(1-27)Online publication date: 30-May-2024
https://dl.acm.org/doi/10.1145/3654974
Wen SHan RLiu CChen L(2023)Fast DRL-based scheduler configuration tuning for reducing tail latency in edge-cloud jobsJournal of Cloud Computing10.1186/s13677-023-00465-z12:1Online publication date: 17-Jun-2023
https://doi.org/10.1186/s13677-023-00465-z
Kong XGao XPan SZhou YYang YZhao LQi H(2022)TailCmp - A Tail Latency Evaluation Solution of Public Cloud and Labeled von Neumann Architecture based Cloud Prototype2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00118(888-895)Online publication date: Dec-2022
https://doi.org/10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00118
Show More Cited By

Index Terms

Optimal Aggregation Policy for Reducing Tail Latency of Web Search
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Recommendations

Random walk based rank aggregation to improving web search

In Web search, with the aid of related query recommendation, Web users can revise their initial queries in several serial rounds in pursuit of finding needed Web pages. In this paper, we address the Web search problem on aggregating search results of ...
Identifying popular search goals behind search queries to improve web search ranking
AIRS'11: Proceedings of the 7th Asia conference on Information Retrieval Technology

Web users usually have a certain search goal before they submit a search query. However, many laypersons can't transform their search goals into suitable queries. Thus, understanding original search goals behind a query is very important for search ...
Improving Ranking Consistency for Web Search by Leveraging a Knowledge Base and Search Logs
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

In this paper, we propose a new idea called ranking consistency in web search. Relevance ranking is one of the biggest problems in creating an effective web search system. Given some queries with similar search intents, conventional approaches typically ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval

August 2015

1198 pages

ISBN:9781450336215

DOI:10.1145/2766462

General Chair:
Ricardo Baeza-Yates
Yahoo Labs, USA
,
Program Chairs:
Mounia Lalmas
Yahoo Labs, UK
,
Alistair Moffat
University of Melbourne, Australia
,
Berthier Ribeiro-Neto
Google, Brazil, and UFMG, Brazil

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 August 2015

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

U.S. NSF

Conference

SIGIR '15

Sponsor:

SIGIR

SIGIR '15: The 38th International ACM SIGIR conference on research and development in Information Retrieval

August 9 - 13, 2015

Santiago, Chile

Acceptance Rates

SIGIR '15 Paper Acceptance Rate 70 of 351 submissions, 20%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

20
Total Citations
View Citations
458
Total Downloads

Downloads (Last 12 months)9
Downloads (Last 6 weeks)0

Reflects downloads up to 26 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Widmoser MKocher DAugsten N(2024)Scalable Distributed Inverted List Indexes in Disaggregated MemoryProceedings of the ACM on Management of Data10.1145/36549742:3(1-27)Online publication date: 30-May-2024
https://dl.acm.org/doi/10.1145/3654974
Wen SHan RLiu CChen L(2023)Fast DRL-based scheduler configuration tuning for reducing tail latency in edge-cloud jobsJournal of Cloud Computing10.1186/s13677-023-00465-z12:1Online publication date: 17-Jun-2023
https://doi.org/10.1186/s13677-023-00465-z
Kong XGao XPan SZhou YYang YZhao LQi H(2022)TailCmp - A Tail Latency Evaluation Solution of Public Cloud and Labeled von Neumann Architecture based Cloud Prototype2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00118(888-895)Online publication date: Dec-2022
https://doi.org/10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00118
Wang ZXiao WZhou BHan J(2022)Qos-Sure: Qos-Assurance AutoScaling of Sharing GPU For DNN Inference in Multi-Task2022 10th International Conference on Information Systems and Computing Technology (ISCTech)10.1109/ISCTech58360.2022.00064(367-372)Online publication date: Dec-2022
https://doi.org/10.1109/ISCTech58360.2022.00064
Zhou LBhuyan LRamakrishnan K(2022)Cottage: Coordinated Time Budget Assignment for Latency, Quality and Power Optimization in Web Search2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)10.1109/HPCA53966.2022.00017(113-125)Online publication date: Apr-2022
https://doi.org/10.1109/HPCA53966.2022.00017
Yao CLiu WTang WHu S(2022)EAISFuture Generation Computer Systems10.1016/j.future.2022.01.004130:C(253-268)Online publication date: 1-May-2022
https://dl.acm.org/doi/10.1016/j.future.2022.01.004
Wang CJin LGong LXu CHu YTan LZhou X(2021)Tinker: A Middleware for Deploying Multiple NN-Based Applications on a Single MachineIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2020.301998140:7(1495-1499)Online publication date: Jul-2021
https://doi.org/10.1109/TCAD.2020.3019981
Gunasekaran J(2020)Minimizing Cost and Maximizing Performance for Cloud PlatformsProceedings of the 21st International Middleware Conference Doctoral Symposium10.1145/3429351.3431747(29-34)Online publication date: 7-Dec-2020
https://dl.acm.org/doi/10.1145/3429351.3431747
Zhou LBhuyan LRamakrishnan K(2020)Gemini: Learning to Manage CPU Power for Latency-Critical Search Engines2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO50266.2020.00059(637-349)Online publication date: Oct-2020
https://doi.org/10.1109/MICRO50266.2020.00059
Zois VTsotras VNajjar W(2019)Efficient main-memory top-K selection for multicore architecturesProceedings of the VLDB Endowment10.14778/3364324.336432713:2(114-127)Online publication date: 1-Oct-2019
https://dl.acm.org/doi/10.14778/3364324.3364327
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten