An Empirical Study of Computation-Intensive Loops for Identifying and Classifying Loop Kernels: Full Research Paper

Published: 17 April 2017
DOI: 10.1145/3030207.3030217

Abstract

The process of performance tuning is time-consuming and costly even if it is carried out automatically. It is crucial to learn from the experience of experts. Our long-term goal is to construct a database of facts extracted from specific performance tuning histories of computation-intensive applications such that we can search the database for promising optimization patterns that fit a given kernel.
In this study, as a significant step toward our goal, we explored a thousand computation-intensive applications in terms of the distribution of kernel classes, each of which is related to expected efficiency and specific tuning patterns. To statistically estimate the distribution of the kernel classes, 100 loops were randomly sampled and then manually classified by experienced performance engineers. The results indicate that 50-70% of the kernels are memory-bound and hence difficult to run efficiently on modern scalar processors. In addition, based on the classification results, we constructed experimental classifiers for identifying loop kernels and for predicting kernel classes, which achieved cross-validated classification accuracies of 81% and 65%, respectively.
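
As a rough illustration of the second task (predicting a kernel class from loop-level features), the sketch below trains a support-vector classifier on synthetic per-loop features and reports cross-validated accuracy, mirroring the evaluation protocol described above. The feature set, class labels, and use of scikit-learn are assumptions made for illustration; the paper's actual features and classifier implementation are not reproduced here.

```python
# Hedged sketch: cross-validated prediction of kernel classes from
# hypothetical per-loop features (NOT the authors' actual feature set).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Each row is one sampled loop kernel; columns are illustrative static
# features, e.g. [array references, FP operations, loop nest depth,
# stride-1 accesses, indirect accesses]. Labels stand in for the
# expert-assigned kernel classes, e.g. 0 = memory-bound, 1 = compute-bound,
# 2 = other.
rng = np.random.default_rng(0)
X = rng.integers(0, 50, size=(100, 5)).astype(float)  # 100 sampled loops
y = rng.integers(0, 3, size=100)                       # manually assigned classes

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)              # 5-fold cross-validation
print(f"cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```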

Cited By

  • Double-Precision FPUs in High-Performance Computing: An Embarrassment of Riches? In Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 78-88, May 2019. DOI: 10.1109/IPDPS.2019.00019
  • Japanese Autotuning Research: Autotuning Languages and FFT. Proceedings of the IEEE, 106(11):2056-2067, November 2018. DOI: 10.1109/JPROC.2018.2870284

Published In

ICPE '17: Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering
April 2017
450 pages
ISBN:9781450344043
DOI:10.1145/3030207

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computation-intensive application
  2. fortran parser
  3. kernel classification
  4. kernel prediction
  5. software performance tuning

Qualifiers

  • Research-article

Funding Sources

  • Japan Society for the Promotion of Science (JSPS)

Conference

ICPE '17

Acceptance Rates

ICPE '17 Paper Acceptance Rate: 27 of 83 submissions, 33%
Overall Acceptance Rate: 252 of 851 submissions, 30%
