DOI: 10.1109/SC.2005.33
Article

How Well Can Simple Metrics Represent the Performance of HPC Applications?

Published: 12 November 2005

Abstract

This paper presents a systematic study of how the complexity of a prediction methodology affects its accuracy, for a set of real applications on a variety of HPC systems. The results indicate that any single, simple synthetic metric does an inadequate job of predicting performance, and that a linear combination of these simple metrics with optimized weights also performs poorly. Methodologies that convolve an application "transfer function", derived from tracing information, with system performance data measured by simple benchmarks do better. In the current work, this latter methodology predicts performance with an average accuracy of 80%.
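
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the operation classes, and every number below are hypothetical, and the "convolution" is reduced to its simplest form: a sum, over operation classes, of the traced operation count divided by the benchmark-measured rate for that class.

```python
from typing import Dict


def predict_single_metric(base_time: float, base_rate: float, target_rate: float) -> float:
    """Scale a known runtime by the ratio of one benchmark rate (e.g. peak flops)."""
    return base_time * base_rate / target_rate


def predict_convolution(signature: Dict[str, float], machine_rates: Dict[str, float]) -> float:
    """Combine an application signature with a machine profile.

    signature:     operation counts per class, as a tracer might report them
    machine_rates: operations per second per class, from simple benchmarks
    """
    return sum(count / machine_rates[op] for op, count in signature.items())


# Hypothetical application signature (operation counts).
signature = {"flops": 4.0e9, "stride1_loads": 2.0e9, "random_loads": 1.0e8}

# Hypothetical machine profiles (operations per second).
base = {"flops": 2.0e9, "stride1_loads": 1.0e9, "random_loads": 1.0e8}
target = {"flops": 4.0e9, "stride1_loads": 1.0e9, "random_loads": 5.0e7}

base_time = predict_convolution(signature, base)
target_time = predict_convolution(signature, target)
single_metric_time = predict_single_metric(base_time, base["flops"], target["flops"])

print(f"base machine:            {base_time:.2f} s")
print(f"target, convolution:     {target_time:.2f} s")
print(f"target, flops-only rule: {single_metric_time:.2f} s")
```

In this made-up case the target machine has twice the floating-point rate but half the random-access rate. The flops-only rule predicts a 2x speedup, while the per-class sum predicts no speedup at all. That is the kind of gap a single simple metric cannot capture.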


Published In

SC '05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing
November 2005
829 pages
ISBN: 1595930612

Publisher

IEEE Computer Society

United States

Acceptance Rates

SC '05 Paper Acceptance Rate 62 of 260 submissions, 24%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2015) Making the Most of SMT in HPC. ACM Transactions on Architecture and Code Optimization 11:4 (1-26). DOI: 10.1145/2687651. Online publication date: 9-Jan-2015
  • (2014) SMiTe. Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (406-418). DOI: 10.1109/MICRO.2014.53. Online publication date: 13-Dec-2014
  • (2013) Time-bound analytic tasks on large datasets through dynamic configuration of workflows. Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science (88-97). DOI: 10.1145/2534248.2534257. Online publication date: 17-Nov-2013
  • (2012) Top500 versus sustained performance. Proceedings of the 21st international conference on Parallel architectures and compilation techniques (223-230). DOI: 10.1145/2370816.2370850. Online publication date: 19-Sep-2012
  • (2011) A similarity measure for time, frequency, and dependencies in large-scale workloads. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (1-11). DOI: 10.1145/2063384.2063441. Online publication date: 12-Nov-2011
  • (2011) An idiom-finding tool for increasing productivity of accelerators. Proceedings of the international conference on Supercomputing (202-212). DOI: 10.1145/1995896.1995928. Online publication date: 31-May-2011
  • (2011) Reliable performance prediction for multigrid software on distributed memory systems. Advances in Engineering Software 42:5 (247-258). DOI: 10.1016/j.advengsoft.2010.10.005. Online publication date: 1-May-2011
  • (2008) Accurate memory signatures and synthetic address traces for HPC applications. Proceedings of the 22nd annual international conference on Supercomputing (36-45). DOI: 10.1145/1375527.1375536. Online publication date: 7-Jun-2008
  • (2007) Parallel performance prediction for multigrid codes on distributed memory architectures. Proceedings of the Third international conference on High Performance Computing and Communications (647-658). DOI: 10.5555/2401945.2402018. Online publication date: 26-Sep-2007
  • (2007) Causal analysis for performance modeling of computer programs. Scientific Programming 15:3 (121-136). DOI: 10.1155/2007/916861. Online publication date: 1-Aug-2007
