DOI: 10.1007/978-3-319-93713-7_27
Article

Efficient Characterization of Hidden Processor Memory Hierarchies

Published: 11 June 2018

Abstract

A processor’s memory hierarchy has a major impact on the performance of running code. However, computing platforms in which the actual hardware characteristics are hidden from both the end user and the tools that mediate execution (such as a compiler, a JIT, and a runtime system) are increasingly common, for example when performing large-scale computation on clouds and clusters. Even worse, in such environments a single computation may use a collection of processors with dissimilar characteristics. Ignorance of the performance-critical parameters of the underlying system makes it difficult to improve performance by optimizing the code or adjusting runtime-system behavior; it also makes application performance harder to understand.
To address this problem, we have developed a suite of portable tools that can efficiently derive many of the parameters of processor memory hierarchies, such as the number of levels and the effective capacity and latency of caches and TLBs, in a matter of seconds. The tools use a series of carefully considered experiments to produce and analyze cache response curves automatically. The tools are inexpensive enough to be used in a variety of contexts, including install-time, compile-time, or runtime adaptation, as well as performance-understanding tools.
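To make the idea of analyzing a cache response curve concrete, the following is a minimal sketch, not the authors' method. It assumes a curve has already been measured as pairs of working-set size and average access latency, and it reports a level boundary wherever latency jumps by more than a chosen ratio between consecutive sizes. The function name `find_level_boundaries` and the `jump_ratio` parameter are hypothetical, introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def find_level_boundaries(
    sizes: Sequence[int],
    latencies: Sequence[float],
    jump_ratio: float = 1.5,
) -> List[Tuple[int, float]]:
    """Estimate cache levels from a measured response curve.

    sizes      -- working-set sizes in bytes, in increasing order
    latencies  -- average access latency measured at each size
    jump_ratio -- hypothetical threshold: a boundary is reported when
                  latency grows by more than this factor in one step

    Returns one (estimated capacity, mean plateau latency) pair per
    detected level. The last plateau (usually main memory) is not
    reported because no jump follows it.
    """
    if len(sizes) != len(latencies):
        raise ValueError("sizes and latencies must have the same length")

    levels: List[Tuple[int, float]] = []
    plateau_start = 0
    for i in range(1, len(sizes)):
        if latencies[i] > jump_ratio * latencies[i - 1]:
            # The plateau that just ended spans plateau_start .. i-1.
            plateau = latencies[plateau_start:i]
            mean_latency = sum(plateau) / len(plateau)
            # The largest size still on the plateau approximates the
            # effective capacity of that level.
            levels.append((sizes[i - 1], mean_latency))
            plateau_start = i
    return levels
```

Applied to a synthetic step-shaped curve, the function returns one entry per plateau that is followed by a jump. Real curves typically have gradual transitions and measurement noise, so a practical tool would need smoothing and a more careful boundary criterion than this single-step ratio test.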


Published In

Computational Science – ICCS 2018: 18th International Conference, Wuxi, China, June 11–13, 2018, Proceedings, Part III
Jun 2018
843 pages
ISBN: 978-3-319-93712-0
DOI: 10.1007/978-3-319-93713-7

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Efficient characterization
2. Hidden memory hierarchies
3. Code performance
4. Portable tool
