DOI: 10.5555/2190025.2190057

Automated programmable control and parameterization of compiler optimizations

Published: 02 April 2011

Abstract

We present a framework that combines programmable control by developers, advanced optimization by compilers, and flexible parameterization of optimizations to achieve portable high performance. We have extended ROSE, a C/C++/Fortran source-to-source optimizing compiler, to automatically analyze scientific applications and discover optimization opportunities. Instead of directly generating optimized code, our optimizer produces parameterized scripts in POET, an interpreted program transformation language, so that developers can freely modify the compiler's optimization decisions and add their own domain-specific optimizations if necessary. The auto-generated POET scripts support extra optimizations beyond those available in the ROSE optimizer. Additionally, all the optimizations are parameterized at an extremely fine granularity, so the scripts can be ported together with their input code and automatically tuned for different architectures. Our results show that this approach is highly effective: code optimized by the auto-generated POET scripts can significantly outperform code optimized using the ROSE optimizer alone.
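The idea of a parameterized optimization that is tuned empirically per machine can be illustrated with a minimal sketch. This is not POET syntax and not the authors' implementation: the function names (`matmul_blocked`, `tune_block_size`, `make_matrix`), the choice of loop blocking as the example transformation, and the candidate block sizes are all assumptions made for illustration. The sketch exposes the block size as a tuning parameter and picks the fastest candidate by timing.

```python
import time


def matmul_naive(a, b):
    """Reference triple loop: c = a * b for list-of-lists matrices."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c


def matmul_blocked(a, b, block):
    """Loop-blocked variant; `block` is the exposed tuning parameter (hypothetical)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # Work on one block x block tile at a time.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + block, p)):
                            c[i][j] += aik * b[k][j]
    return c


def tune_block_size(a, b, candidates):
    """Empirical search: time each candidate block size and keep the fastest."""
    best_block, best_time = None, float("inf")
    for block in candidates:
        start = time.perf_counter()
        matmul_blocked(a, b, block)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_block, best_time = block, elapsed
    return best_block


def make_matrix(rows, cols, seed):
    """Deterministic small-integer test matrix (illustrative data only)."""
    return [[float((i * cols + j + seed) % 7) for j in range(cols)]
            for i in range(rows)]


if __name__ == "__main__":
    a = make_matrix(24, 20, 1)
    b = make_matrix(20, 28, 3)
    print("selected block size:", tune_block_size(a, b, [2, 4, 8, 16]))
```

Because the transformation and its parameter are kept separate from the input code, the same search can be rerun on a different machine and may select a different block size, which is the portability argument the abstract makes.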

Information

Published In

CGO '11: Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
April 2011
324 pages
ISBN: 9781612843568

Publisher

IEEE Computer Society, United States

Acceptance Rates

CGO '11 Paper Acceptance Rate: 28 of 105 submissions, 27%
Overall Acceptance Rate: 312 of 1,061 submissions, 29%

Cited By

  • (2018) SCP. ACM Transactions on Architecture and Code Optimization 15:4, pages 1-21. DOI: 10.1145/3274654. Online publication date: 10-Oct-2018
  • (2017) Automatic generation of fast BLAS3-GEMM: a portable compiler approach. Proceedings of the 2017 International Symposium on Code Generation and Optimization, pages 122-133. DOI: 10.5555/3049832.3049846. Online publication date: 4-Feb-2017
  • (2017) Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts. ACM Transactions on Architecture and Code Optimization 14:4, pages 1-19. DOI: 10.1145/3162016. Online publication date: 18-Dec-2017
  • (2014) Specializing Compiler Optimizations through Programmable Composition for Dense Matrix Computations. Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pages 596-608. DOI: 10.1109/MICRO.2014.14. Online publication date: 13-Dec-2014
  • (2013) AUGEM. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pages 1-12. DOI: 10.1145/2503210.2503219. Online publication date: 17-Nov-2013
  • (2013) Layout-oblivious compiler optimization for matrix computations. ACM Transactions on Architecture and Code Optimization (TACO) 9:4, pages 1-20. DOI: 10.1145/2400682.2400694. Online publication date: 20-Jan-2013
  • (2013) Defensive loop tiling for shared cache. Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 1-11. DOI: 10.1109/CGO.2013.6495008. Online publication date: 23-Feb-2013
  • (2012) Portable section-level tuning of compiler parallelized applications. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pages 1-11. DOI: 10.5555/2388996.2389009. Online publication date: 10-Nov-2012
  • (2012) Studying the impact of application-level optimizations on the power consumption of multi-core architectures. Proceedings of the 9th conference on Computing Frontiers, pages 123-132. DOI: 10.1145/2212908.2212927. Online publication date: 15-May-2012
