Abstract
Modern cluster processors have been steadily increasing the number of cores able to execute concurrent threads. Web search engines rely critically on multi-threading to process user queries efficiently and to handle document insertions in support of real-time search. This requires synchronizing readers and writers, which, for a large number of threads, raises the question of which concurrency control strategies can scale to hundreds of cores and beyond. This paper presents a comparative study of several such strategies. To this end, we focus on developing suitable simulation models for evaluating the performance of search algorithms on dedicated single-purpose multi-threaded processors. We validate our model against actual implementations of the multi-threading strategies and then use it to study performance on very large processors. We conclude that intra-query parallelism scales up more efficiently than inter-query parallelism.
Partially funded by research grants DICYT 061319BC and FONDEF CA12i10314.
The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-319-02432-5_33
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonacic, C., Marin, M. (2013). Simulation Study of Multi-threading in Web Search Engine Processors. In: Kurland, O., Lewenstein, M., Porat, E. (eds) String Processing and Information Retrieval. SPIRE 2013. Lecture Notes in Computer Science, vol 8214. Springer, Cham. https://doi.org/10.1007/978-3-319-02432-5_8
DOI: https://doi.org/10.1007/978-3-319-02432-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02431-8
Online ISBN: 978-3-319-02432-5
eBook Packages: Computer Science, Computer Science (R0)