Article

Efficient Characterization of Hidden Processor Memory Hierarchies

Authors:

Xiaoran Xu

Computational Science – ICCS 2018: 18th International Conference, Wuxi, China, June 11–13, 2018 Proceedings, Part III

Pages 335 - 349

https://doi.org/10.1007/978-3-319-93713-7_27

Published: 11 June 2018

Abstract

A processor’s memory hierarchy has a major impact on the performance of running code. However, computing platforms that hide the actual hardware characteristics from both the end user and the tools that mediate execution, such as compilers, JITs, and runtime systems, are increasingly common, for example when performing large-scale computation in cloud and cluster environments. Worse, in such environments a single computation may use a collection of processors with dissimilar characteristics. Without knowledge of the performance-critical parameters of the underlying system, it is difficult to improve performance by optimizing the code or adjusting runtime-system behavior, and application performance becomes harder to understand.

To address this problem, we have developed a suite of portable tools that can efficiently derive many of the parameters of processor memory hierarchies, such as the number of levels and the effective capacity and latency of caches and TLBs, in a matter of seconds. The tools use a series of carefully considered experiments to produce and analyze cache response curves automatically. The tools are inexpensive enough to be used in a variety of contexts, including install-time, compile-time, or runtime adaptation, and performance-understanding tools.
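As a rough illustration of what analyzing a cache response curve can look like, the sketch below scans a list of (working-set size, average access latency) points and reports the last size before each sharp rise in latency. This is not the analysis used by the paper's tools: the function name `find_capacity_steps`, the `jump_ratio` threshold, and the sample data are all hypothetical, chosen only to show the idea that a step in the curve suggests a capacity boundary.

```python
from typing import List, Tuple


def find_capacity_steps(
    sizes_kib: List[int],
    latencies_ns: List[float],
    jump_ratio: float = 1.5,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[int, float]]:
    """Return (size, latency) for the last point before each sharp latency rise.

    A point counts as a sharp rise when its latency exceeds the previous
    point's latency by more than `jump_ratio`.
    """
    if len(sizes_kib) != len(latencies_ns):
        raise ValueError("sizes and latencies must have the same length")

    steps = []
    for i in range(1, len(sizes_kib)):
        if latencies_ns[i] > jump_ratio * latencies_ns[i - 1]:
            # The previous working-set size still fit in the faster level.
            steps.append((sizes_kib[i - 1], latencies_ns[i - 1]))
    return steps


# Synthetic response curve (made-up numbers, not measurements).
sizes = [4, 8, 16, 32, 64, 128, 256, 512, 1024]
latencies = [1.0, 1.0, 1.0, 1.1, 4.0, 4.1, 4.2, 12.0, 12.5]

steps = find_capacity_steps(sizes, latencies)
print(steps)
```

On this synthetic curve the sketch reports steps after 32 KiB and 256 KiB. Measured curves are usually noisier and often rise gradually across several points, so a single fixed threshold like this one would likely need smoothing or a more careful test in practice.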

Published In

Computational Science – ICCS 2018: 18th International Conference, Wuxi, China, June 11–13, 2018 Proceedings, Part III

Jun 2018

843 pages

ISBN: 978-3-319-93712-0

DOI: 10.1007/978-3-319-93713-7

Editors:
Yong Shi, Chinese Academy of Sciences, Beijing, China
Haohuan Fu, National Supercomputing Center in Wuxi, Wuxi, China
Yingjie Tian, Chinese Academy of Sciences, Beijing, China
Valeria V. Krzhizhanovskaya, University of Amsterdam, Amsterdam, The Netherlands
Michael Harold Lees, University of Amsterdam, Amsterdam, The Netherlands
Jack Dongarra, University of Tennessee, Knoxville, Tennessee, USA
Peter M. A. Sloot, University of Amsterdam, Amsterdam, The Netherlands

© Springer International Publishing AG, part of Springer Nature 2018.

Publisher

Springer-Verlag, Berlin, Heidelberg
