research-article

Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search

Authors:

Seung-won Hwang,

Sameh Elnikety,

Seungjin Choi

WSDM '15: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining

Pages 7 - 16

https://doi.org/10.1145/2684822.2685289

Published: 02 February 2015

Abstract

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel, otherwise it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th-percentile), which we call extreme tail latency.
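The aggregation argument above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that server latencies are independent and that a query fans out to 100 servers (both assumptions), and the function names are hypothetical.

```python
def fraction_within_target(per_server_quantile: float, num_servers: int) -> float:
    """Fraction of queries for which every server meets the latency target.

    Assumes (for illustration) that server latencies are independent, so the
    per-server probabilities multiply.
    """
    return per_server_quantile ** num_servers


def required_per_server_quantile(query_quantile: float, num_servers: int) -> float:
    """Per-server quantile needed to reach a query-level quantile.

    This inverts fraction_within_target under the same independence assumption.
    """
    return query_quantile ** (1.0 / num_servers)
```

Under these assumptions, `fraction_within_target(0.99, 100)` is roughly 0.37, so controlling only the 99th percentile on each server leaves most aggregated queries waiting on at least one slow server. `required_per_server_quantile(0.99, 100)` is roughly 0.9999, which corresponds to the 99.99th-percentile example given in the abstract.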

We propose a prediction framework to reduce the extreme tail latency of search servers. The framework has a unique set of characteristics to predict long-running queries with high recall and improved precision. Specifically, prediction is delayed by a short duration to allow many short-running queries to complete without parallelization, and to allow the predictor to collect a set of dynamic features using runtime information. These features estimate query execution time with high accuracy. We also use them to estimate the prediction errors to override an uncertain prediction by selectively accelerating the query for a higher recall.
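One way to read the three characteristics (delayed, dynamic, selective) is as a short decision procedure. The following is a minimal sketch under stated assumptions, not the authors' implementation: the thresholds `delay_ms`, `long_ms`, and `max_error_ms`, the `Estimate` type, and the `predictor` callable are all hypothetical, and the true execution time is passed in only because a simulation would know it.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

Features = Mapping[str, float]


@dataclass(frozen=True)
class Estimate:
    time_ms: float   # predicted execution time of the query
    error_ms: float  # predicted error of that time estimate


def dds_decide(
    true_time_ms: Optional[float],
    dynamic_features: Features,
    predictor: Callable[[Features], Estimate],
    delay_ms: float = 10.0,      # hypothetical delay before predicting
    long_ms: float = 80.0,       # hypothetical "long-running" threshold
    max_error_ms: float = 40.0,  # hypothetical uncertainty threshold
) -> str:
    """Return a label describing how one query would be handled."""
    # Delayed: a query that finishes within the delay is never predicted on.
    if true_time_ms is not None and true_time_ms <= delay_ms:
        return "completed_in_delay"

    # Dynamic: the predictor sees features gathered at runtime during the delay.
    estimate = predictor(dynamic_features)
    if estimate.time_ms >= long_ms:
        return "accelerate_predicted_long"

    # Selective: a "short" prediction that is itself uncertain is overridden,
    # trading some precision for higher recall.
    if estimate.error_ms >= max_error_ms:
        return "accelerate_uncertain"

    return "sequential"
```

For example, `dds_decide(200.0, {}, lambda f: Estimate(time_ms=30.0, error_ms=60.0))` returns `"accelerate_uncertain"`: the time estimate alone would leave the query sequential, but the large predicted error triggers acceleration.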

In a simulation study, we evaluate how the proposed prediction framework improves search engine performance in two scenarios: (1) query parallelization on a multicore processor, and (2) query scheduling on a heterogeneous processor. The results show that, in both scenarios, the proposed framework reduces the extreme tail latency more effectively than a state-of-the-art predictor because of its higher recall, and that it improves server throughput by more than 70% because of its improved precision.
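The link between recall, precision, and the two reported effects can be illustrated with the standard definitions. The sketch below is not the paper's evaluation code. It assumes each query has a ground-truth label (truly long-running or not) and a decision (accelerated or not), and it returns 1.0 for an empty denominator as a convention chosen here.

```python
from typing import Sequence, Tuple


def precision_recall(
    actual_long: Sequence[bool], accelerated: Sequence[bool]
) -> Tuple[float, float]:
    """Precision and recall of a long-running-query predictor.

    Recall: share of truly long queries that were accelerated (missed ones
    stay slow and inflate the tail).
    Precision: share of accelerated queries that were truly long (false
    alarms spend resources on short queries and cost throughput).
    """
    if len(actual_long) != len(accelerated):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(actual_long, accelerated))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    # Convention assumed here: an empty denominator yields 1.0.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 1.0
    return precision, recall
```

With `actual_long = [True, True, False, False, False]` and `accelerated = [True, True, True, False, False]`, every long query is caught (recall 1.0) while one of the three accelerated queries is a false alarm (precision 2/3).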

Index Terms

Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Published In

WSDM '15: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining

February 2015

482 pages

ISBN:9781450333177

DOI:10.1145/2684822

General Chairs: Xueqi Cheng (ICT, Chinese Academy of Sciences, China), Hang Li (Huawei Technologies, China)

Program Chairs: Evgeniy Gabrilovich (Google, USA), Jie Tang (Tsinghua University, China)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WSDM 2015

WSDM 2015: Eighth ACM International Conference on Web Search and Data Mining

February 2 - 6, 2015

Shanghai, China

Acceptance Rates

WSDM '15 Paper Acceptance Rate 39 of 238 submissions, 16%;

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Bibliometrics

Total Citations: 36

Total Downloads: 443

Downloads (Last 12 months): 10

Downloads (Last 6 weeks): 0

Reflects downloads up to 25 Jan 2025

Cited By

Guo J, Hong Y, Wu Y, Liu Y, Yang T, Cui B, Singh A, Sun Y, Akoglu L, Gunopulos D, Yan X, Kumar R, Ozcan F, Ye J (2023). SketchPolymer: Estimate Per-item Tail Quantile Using One Sketch. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 590-601. Online publication date: 6-Aug-2023.
https://dl.acm.org/doi/10.1145/3580305.3599505

Singer K, Agrawal K, Lee I, Agrawal K, Shun J (2023). An Efficient Scheduler for Task-Parallel Interactive Applications. Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 27-38. Online publication date: 17-Jun-2023.
https://doi.org/10.1145/3558481.3591092

Savasci M, Ali-Eldin A, Eker J, Robertsson A, Shenoy P (2023). DDPC: Automated Data-Driven Power-Performance Controller Design on-the-fly for Latency-sensitive Web Services. Proceedings of the ACM Web Conference 2023, 3067-3076. Online publication date: 30-Apr-2023.
https://dl.acm.org/doi/10.1145/3543507.3583437

Lv Z, Shang K, Huo H, Liu X, Peng Y, Wang X, Tan Y (2023). RASK: Range Spatial Keyword Queries on Massive Encrypted Geo-Textual Data. IEEE Transactions on Services Computing 16:5, 3621-3635. Online publication date: Sep-2023.
https://doi.org/10.1109/TSC.2023.3289654

Liu X, Pan Y, Li Y, Wang G, Liu X (2022). An NVM SSD-based High Performance Query Processing Framework for Search Engines. IEEE Transactions on Knowledge and Data Engineering, 1-1. Online publication date: 2022.
https://doi.org/10.1109/TKDE.2022.3160557

Chen S, Jin A, Delimitrou C, Martinez J (2022). ReTail: Opting for Learning Simplicity to Enable QoS-Aware Power Management in the Cloud. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 155-168. Online publication date: Apr-2022.
https://doi.org/10.1109/HPCA53966.2022.00020

Zhou L, Bhuyan L, Ramakrishnan K (2022). Cottage: Coordinated Time Budget Assignment for Latency, Quality and Power Optimization in Web Search. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 113-125. Online publication date: Apr-2022.
https://doi.org/10.1109/HPCA53966.2022.00017

Rojas O, Gil-Costa V, Marin M (2021). A DFT-Based Running Time Prediction Algorithm for Web Queries. Future Internet 13:8, 204. Online publication date: 4-Aug-2021.
https://doi.org/10.3390/fi13080204

Wang Y, Arya K, Kogias M, Vanga M, Bhandari A, Yadwadkar N, Sen S, Elnikety S, Kozyrakis C, Bianchini R, Barbalace A, Bhatotia P, Alvisi L, Cadar C (2021). SmartHarvest. Proceedings of the Sixteenth European Conference on Computer Systems, 1-16. Online publication date: 21-Apr-2021.
https://dl.acm.org/doi/10.1145/3447786.3456225

Tonellotto N, Macdonald C (2020). Using an Inverted Index Synopsis for Query Latency and Performance Prediction. ACM Transactions on Information Systems 38:3, 1-33. Online publication date: 18-May-2020.
https://dl.acm.org/doi/10.1145/3389795
