More Web Proxy on the site http://driver.im/

research-article

ESTIMA: extrapolating scalability of in-memory applications

Authors:

Georgios Chatzopoulos,

Aleksandar Dragojević,

Rachid GuerraouiAuthors Info & Claims

ACM SIGPLAN Notices, Volume 51, Issue 8

Article No.: 27, Pages 1 - 11

https://doi.org/10.1145/3016078.2851159

Published: 27 February 2016 Publication History

Abstract

This paper presents ESTIMA, an easy-to-use tool for extrapolating the scalability of in-memory applications. ESTIMA is designed to perform a simple, yet important task: given the performance of an application on a small machine with a handful of cores, ESTIMA extrapolates its scalability to a larger machine with more cores, while requiring minimum input from the user. The key idea underlying ESTIMA is the use of stalled cycles (e.g. cycles that the processor spends waiting for various events, such as cache misses or waiting on a lock). ESTIMA measures stalled cycles on a few cores and extrapolates them to more cores, estimating the amount of waiting in the system. ESTIMA can be effectively used to predict the scalability of in-memory applications. For instance, using measurements of memcached and SQLite on a desktop machine, we obtain accurate predictions of their scalability on a server. Our extensive evaluation on a large number of in-memory benchmarks shows that ESTIMA has generally low prediction errors.

References

[1]

N. I. Akhiezer. Theory of Approximation. Dover Publications, 1992.

[2]

A. R. Alameldeen and D. A. Wood. Ipc considered harmful for multiprocessor workloads. IEEE Micro, 26(4), 2006.

Digital Library

[3]

AMD. BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors, 2010.

[4]

B. J. Barnes, B. Rountree, D. K. Lowenthal, J. Reeves, B. de Supinski, and M. Schulz. A regression-based approach to scalability prediction. ICS '08. ACM, 2008.

Digital Library

[5]

C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: characterization and architectural implications. In PACT, 2008.

Digital Library

[6]

L. Carrington, A. Snavely, and N. Wolter. A performance prediction framework for scientific applications. Future Generation Computer Systems, 22(3), 2006.

Digital Library

[7]

C. Coarfa, J. Mellor-Crummey, N. Froyd, and Y. Dotsenko. Scalability analysis of spmd codes using expectations. ICS '07. ACM, 2007.

Digital Library

[8]

M. E. Crovella and T. J. LeBlanc. Parallel performance prediction using lost cycles analysis. IEEE Computer Society Press, 1994.

[9]

C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. Hekaton: Sql server's memory-optimized oltp engine. SIGMOD '13. ACM, 2013.

Digital Library

[10]

A. Dragojević, P. Felber, V. Gramoli, and R. Guerraoui. Why stm can be more than a research toy. Commun. ACM, 54(4), Apr. 2011.

Digital Library

[11]

A. Dragojević, R. Guerraoui, and M. Kapalka. Stretching transactional memory. In PLDI '09, New York, NY, USA, 2009. ACM.

Digital Library

[12]

B. Fan, D. G. Andersen, and M. Kaminsky. Memc3: Compact and concurrent memcache with dumber caching and smarter hashing. NSDI'13. USENIX Association, 2013.

Digital Library

[13]

M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki, and B. Falsafi. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In ASPLOS '12. ACM, 2012.

Digital Library

[14]

B. Haagdorens, T. Vermeiren, and M. Goossens. Improving the performance of signature-based network intrusion detection sensors by multi-threading. In WISA'04, 2005.

Digital Library

[15]

A. Heindl and G. Pokam. An analytic framework for performance modeling of software transactional memory. Computer Networks, 53(8), 2009.

Digital Library

[16]

A. Heindl and G. Pokam. An analytic model for optimistic stm with lazy locking. In Analytical and Stochastic Modeling Techniques and Applications. Springer, 2009.

Digital Library

[17]

A. Heindl, G. Pokam, and A.-R. Adl-Tabatabai. An analytic model of optimistic software transactional memory. In ISPASS 2009. IEEE, 2009.

[18]

M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support for lock-free data structures. In ISCA '93. ACM, 1993.

Digital Library

[19]

A. Hoisie, O. Lubeck, and H. Wasserman. Performance and scalability analysis of teraflop-scale parallel architectures using multidimensional wavefront applications. International Journal of High Performance Computing Applications, 14(4), 2000.

Digital Library

[20]

I. Intel. and ia-32 architectures software developers manual volume 3b: System programming guide. Part, 1, 2007.

[21]

E. Ipek, B. R. De Supinski, M. Schulz, and S. A. McKee. An approach to performance prediction for parallel applications. In Euro-Par 2005 Parallel Processing. Springer, 2005.

Digital Library

[22]

R. Jain. The Art of Computer Systems Performance Analysis: techniques for experimental design, measurement, simulation, and modeling. Wiley, 1991.

[23]

V. Jiménez, F. J. Cazorla, R. Gioiosa, M. Valero, C. Boneti, E. Kursun, C.-Y. Cher, C. Isci, A. Buyuktosunoglu, and P. Bose. Power and thermal characterization of power6 system. In PACT '10. ACM, 2010.

Digital Library

[24]

D. J. Kerbyson, H. J. Alme, A. Hoisie, F. Petrini, H. J. Wasserman, and M. Gittings. Predictive performance and scalability modeling of a large-scale application. In SC2001. ACM, 2001.

Digital Library

[25]

B. C. Lee, D. M. Brooks, B. R. De Supinski, M. Schulz, K. Singh, and S. A. McKee. Methods of inference and learning for performance modeling of parallel applications. In PPoPP '07. ACM, 2007.

Digital Library

[26]

H. Lim, D. Han, D. G. Andersen, and M. Kaminsky. Mica: A holistic approach to fast in-memory key-value storage. NSDI'14. USENIX Association, 2014.

Digital Library

[27]

G. Marin and J. Mellor-Crummey. Cross-architecture performance predictions for scientific applications using parameterized models. In ACM SIGMETRICS Performance Evaluation Review, volume 32. ACM, 2004.

Digital Library

[28]

C. C. Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford transactional applications for multi-processing. In IISWC, 2008.

[29]

R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee, H. C. Li, R. McElroy, M. Paleczny, D. Peek, P. Saab, D. Stafford, T. Tung, and V. Venkataramani. Scaling memcache at facebook. NSDI'13. USENIX Association, 2013.

Digital Library

[30]

G. R. Nudd, D. J. Kerbyson, E. Papaefstathiou, S. C. Perry, J. S. Harper, and D. V. Wilcox. PACEa toolset for the performance prediction of parallel and distributed systems. International Journal of High Performance Computing Applications, 14(3), 2000.

Digital Library

[31]

C. R. M. Olschanowsky. Hpc Application Address Stream Compression, Replay and Scaling. PhD thesis, La Jolla, CA, USA, 2011.

Digital Library

[32]

C. A. Petri. Communication with automata, new york: Griffiss air force base. Technical report, Tech. Rep. RADC-TR-65-377, 1966.

[33]

J. R. Phillips. Zunzun.com. http://www.zunzun.com.

[34]

D. E. Porter and E. Witchel. Understanding transactional memory performance. In ISPASS 2010. IEEE, 2010.

[35]

J. Ruppert. A delaunay refinement algorithm for quality 2-dimensional mesh generation. Journal of algorithms, 18(3), 1995.

Digital Library

[36]

N. Shavit and D. Touitou. Software transactional memory. In PODC '95. ACM, 1995.

Digital Library

[37]

K. Singh, M. Bhadauria, and S. A. McKee. Real time power estimation and thread scheduling via performance counters. SIGARCH Comput. Archit. News, 37(2), July 2009.

Digital Library

[38]

J. Torrellas, Y. Solihin, and V. Lam. Scal-tool: Pinpointing and quantifying scalability bottlenecks in dsm multiprocessors. In Supercomputing, ACM/IEEE 1999 Conference, Nov 1999.

Digital Library

[39]

T. Usui, R. Behrends, J. Evans, and Y. Smaragdakis. Adaptive locks: Combining transactions and locks for efficient concurrency. In PACT '09. IEEE Computer Society, 2009.

Digital Library

[40]

A. Vega, A. Buyuktosunoglu, and P. Bose. Smt-centric power-aware thread placement in chip multiprocessors. In PACT '13. IEEE Press, 2013.

Digital Library

[41]

R. West, P. Zaroo, C. A. Waldspurger, and X. Zhang. Online cache modeling for commodity multicore processors. In PACT '10. ACM, 2010.

Digital Library

[42]

L. T. Yang, X. Ma, and F. Mueller. Cross-platform performance prediction of parallel applications using partial execution. In SC2005. IEEE, 2005.

Digital Library

[43]

A. Yasin. A top-down method for performance analysis and counters architecture. In Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on, March 2014.

[44]

J. Zhai, W. Chen, and W. Zheng. PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node. In ACM Sigplan Notices, volume 45. ACM, 2010.

Digital Library

Cited By

Reisert PCalotoiu AShudler SWolf F(2017)Following the Blind Seer – Creating Better Performance Models Using Less InformationEuro-Par 2017: Parallel Processing10.1007/978-3-319-64203-1_8(106-118)Online publication date: 1-Aug-2017
https://doi.org/10.1007/978-3-319-64203-1_8
Wang YFurman SHardy NEllis MBack GHong YCameron K(2024)A Detailed Historical and Statistical Analysis of the Influence of Hardware Artifacts on SPEC Integer Benchmark PerformanceIEEE Transactions on Computers10.1109/TC.2024.336594173:5(1262-1274)Online publication date: May-2024
https://doi.org/10.1109/TC.2024.3365941
Rank JHerget JHein AKrcmar H(2023)Evaluating Task-Level CPU Efficiency for Distributed Stream Processing SystemsBig Data and Cognitive Computing10.3390/bdcc70100497:1(49)Online publication date: 10-Mar-2023
https://doi.org/10.3390/bdcc7010049
Show More Cited By

Recommendations

ESTIMA: Extrapolating ScalabiliTy of In-Memory Applications
Special Issue: Invited papers from PPoPP 2016, Part 2

This article presents estima, an easy-to-use tool for extrapolating the scalability of in-memory applications. estima is designed to perform a simple yet important task: Given the performance of an application on a small machine with a handful of cores, ...
ESTIMA: extrapolating scalability of in-memory applications
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

This paper presents ESTIMA, an easy-to-use tool for extrapolating the scalability of in-memory applications. ESTIMA is designed to perform a simple, yet important task: given the performance of an application on a small machine with a handful of cores, ...
Design and Optimization of Large Size and Low Overhead Off-Chip Caches

Large off-chip L3 caches can significantly improve the performance of memory-intensive applications. However, conventional L3 SRAM caches are facing two issues as those applications require increasingly large caches. First, an SRAM cache has a limited ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices

ACM SIGPLAN Notices Volume 51, Issue 8

PPoPP '16

August 2016

405 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/3016078

Editor:
Matthew Fluet

Issue’s Table of Contents

PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2016
420 pages
ISBN:9781450340922
DOI:10.1145/2851141
General Chair:
Rafael Asenjo
University of Málaga, Spain
,
Program Chair:
Tim Harris
Oracle Labs, Cambridge, UK

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 February 2016

Published in SIGPLAN Volume 51, Issue 8

Check for updates

Qualifiers

Research-article

Funding Sources

European Research Council

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

7
Total Citations
View Citations
372
Total Downloads

Downloads (Last 12 months)12
Downloads (Last 6 weeks)1

Reflects downloads up to 13 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Reisert PCalotoiu AShudler SWolf F(2017)Following the Blind Seer – Creating Better Performance Models Using Less InformationEuro-Par 2017: Parallel Processing10.1007/978-3-319-64203-1_8(106-118)Online publication date: 1-Aug-2017
https://doi.org/10.1007/978-3-319-64203-1_8
Wang YFurman SHardy NEllis MBack GHong YCameron K(2024)A Detailed Historical and Statistical Analysis of the Influence of Hardware Artifacts on SPEC Integer Benchmark PerformanceIEEE Transactions on Computers10.1109/TC.2024.336594173:5(1262-1274)Online publication date: May-2024
https://doi.org/10.1109/TC.2024.3365941
Rank JHerget JHein AKrcmar H(2023)Evaluating Task-Level CPU Efficiency for Distributed Stream Processing SystemsBig Data and Cognitive Computing10.3390/bdcc70100497:1(49)Online publication date: 10-Mar-2023
https://doi.org/10.3390/bdcc7010049
Srikanthan SChakraborti SFerro PDwarkadas S(2022)MAPPER: Managing Application Performance via Parallel Efficiency Regulation*ACM Transactions on Architecture and Code Optimization10.1145/350176719:2(1-26)Online publication date: 24-Mar-2022
https://dl.acm.org/doi/10.1145/3501767
Valkov PKazmina KAntonov A(2019)Using Empirical Data for Scalability Analysis of Parallel ApplicationsParallel Computational Technologies10.1007/978-3-030-28163-2_5(58-73)Online publication date: 2-Aug-2019
https://doi.org/10.1007/978-3-030-28163-2_5
Purohith DMohan JChidambaram V(2017)The Dangers and Complexities of SQLite BenchmarkingProceedings of the 8th Asia-Pacific Workshop on Systems10.1145/3124680.3124719(1-6)Online publication date: 2-Sep-2017
https://dl.acm.org/doi/10.1145/3124680.3124719
Goodman DVaristeas GHarris T(2017)PandiaProceedings of the Twelfth European Conference on Computer Systems10.1145/3064176.3064177(254-269)Online publication date: 23-Apr-2017
https://dl.acm.org/doi/10.1145/3064176.3064177

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents