[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2642937.2642975acmconferencesArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
research-article
Free access

Understanding performance stairs: elucidating heuristics

Published: 15 September 2014 Publication History

Abstract

How do experts navigate the huge space of implementations for a given specification to find an efficient choice with minimal searching? Answer: They use "heuristics" -- rules of thumb that are more street wisdom than scientific fact. We provide a scientific justification for Dense Linear Algebra (DLA) heuristics by showing that only a few decisions (out of many possible) are critical to performance; once these decisions are made, the die is cast and only relatively minor performance improvements are possible. The (implementation x performance) space of DLA is stair-stepped. Each stair is a set of implementations with very similar performance and (surprisingly) share key design decision(s). High-performance stairs align with heuristics that prescribe certain decisions in a particular context. Stairs also tell us how to tailor the search engine of a DLA code generator to reduce the time it needs to find implementations that are as good or better than those crafted by experts.

References

[1]
S. Amarel. Program synthesis as a theory formation task: Problem representations and solution methods. In Machine Learning: An Artificial Intelligence Approach: Volume II, pages 499--569. Kaufmann, Los Altos, CA, 1986.
[2]
A. Auer, G. Baumgartner, D. Bernholdt, A. Bibireata, V. Choppella, D. C. and X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, S. Lam, Q. Lu, M. Nooijen, R. P. añd J. Ramanujam, P. Sadayappan, and A. Sibiryakov. Automatic code generation for many-body electronic structure methods: The Tensor Contractio n Engine. Molecular Physics, 2005.
[3]
I. D. Baxter. Design Maintenance Systems. CACM, April 1992.
[4]
G. Belter, E. Jessup, I. Karlin, and J. G. Siek. Automating the generation of composed linear algebra kernels. In SC, 2009.
[5]
P. Bientinesi et al. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw., 35(1):1--22, 2008.
[6]
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK Users' Guide. SIAM, 1997.
[7]
E. Chan, M. Heimlich, A. Purkayastha, and R. van de Geijn. Collective communication: theory, practice, and experience: Research articles. Concurrency and Computation: Practice & Experience, 19(13):1749--1783, Sept. 2007.
[8]
J. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Softw., 16(1):1--17, Mar. 1990.
[9]
J. Feigenspan, D. S. Batory, and T. L. Riché. Is the derivation of a model easier to understand than the model itself? In ICPC, pages 47--52, 2012.
[10]
D. S. Frankel. Model Driven Architecture: Applying MDA to Enterprise Computing. John Wiley & Sons, Inc., 2003.
[11]
R. C. Gonçalves, D. Batory, and J. Sobral. ReFlO: An interactive tool for pipe-and-filter domain specification and program generation. submitted, 2013.
[12]
J. Guo, K. Czarnecki, S. Apel, N. Siegmund, and A. Wasowski. Variability-aware performance prediction: A statistical learning approach. In Proceedings of IEEE/ACM International Conference on Automated Software Engineering (ASE), Nov 2013.
[13]
A. Kleppe, J. Warmer, and W. Bast. MDA Explained: The Model-Driven Architecture. Addison-Wesley, Boston, MA, 2003.
[14]
R. E. Korf. Artificial intelligence search algorithms. In In Algorithms and Theory of Computation Handbook. CRC Press, 1996.
[15]
G. M. Lohman. Grammar-like functional rules for representing query optimization alternatives. In ACM SIGMOD, 1988.
[16]
B. R. Mandelbrot. The Fractal Geometry of Nature. Macmillian, Philadelphia, PA, USA, 1983.
[17]
B. Marker, D. Batory, and R. van de Geijn. A case study in mechanically deriving dense linear algebra code. International Journal of High Performance Computing Applications, 27(4):439--452, 2013.
[18]
B. Marker, D. Batory, and R. A. van de Geijn. Code generation and optimization of distributed-memory dense linear algebra kernels. In ICCS, 2013.
[19]
B. Marker, J. Poulson, D. Batory, and R. A. van de Geijn. Designing linear algebra algorithms by transformation: Mechanizing the expert developer. In VECPAR, 2012.
[20]
T. Meng Low, B. Marker, and R. van de Geijn. FLAME Working Note #64. Theory and practice of fusing loops when optimizing parallel dense linear algebra operations. Technical Report TR-12-18, The University of Texas at Austin, Department of Computer Sciences, 2012.
[21]
T. Menzies, D. Owen, and J. Richardson. The strangest thing about software. Computer, 40(1):54--60, January 2007.
[22]
S. Nedunuri, D. R. Smith, and W. R. Cook. Synthesis of greedy algorithms using dominance relations. In NASA Formal Methods, pages 97--108, 2010.
[23]
S. Nedunuri, D. R. Smith, and W. R. Cook. Theory and techniques for synthesizing efficient breadth-first search algorithms. In FM, pages 308--325, 2012.
[24]
J. Poulson, B. Marker, R. A. van de Geijn, J. R. Hammond, and N. A. Romero. Elemental: A new framework for distributed memory dense matrix computations. ACM Trans. Math. Softw., 39(2):13:1--13:24, 2013.
[25]
J. Poulson, R. van de Geijn, and J. Bennighof. (Parallel) algorithms for reducing the generalized hermitian-definite eigenvalue problem. ACM Trans. on Math. Softw., 2012. submitted.
[26]
M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation", 93(2):232--275, 2005.
[27]
T. Riché, R. Goncalves, B. Marker, and D. Batory. Pushouts in Software Architecture Design. In GPCE, 2012.
[28]
N. Siegmund, S. S. Kolesnikov, C. Kästner, S. Apel, D. Batory, M. Rosenmüller, and G. Saake. Predicting performance via automated feature-interaction detection. In Proceedings of the 34th International Conference on Software Engineering, ICSE '12, pages 167--177, Piscataway, NJ, USA, 2012. IEEE Press.
[29]
M. Snir, S. W. Otto, S. Huss-Lederman, D. W. Walker, and J. Dongarra. MPI: The Complete Reference. The MIT Press, 1996.
[30]
R. A. van de Geijn and E. S. Quintana-Ortí. The Science of Programming Matrix Computations. www.lulu.com, 2008.
[31]
F. Van Zee and R. van de Geijn. BLIS: A framework for rapid instantiation of blas functionality. ACM Trans. Math. Softw. accepted.
[32]
R. C. Whaley and J. J. Dongarra. Automatically tuned linear algebra software. In Proceedings of SC'98, 1998.
[33]
R. Williams, C. P. Gomes, and B. Selman. Backdoors to typical case complexity. In Proceedings of the 18th International Joint Conference on Artificial Intelligence, IJCAI'03, pages 1173--1178, San Francisco, CA, USA, 2003. Morgan Kaufmann Publishers Inc.

Cited By

View all
  • (2023)Finding Near-optimal Configurations in Colossal Spaces with Statistical GuaranteesACM Transactions on Software Engineering and Methodology10.1145/361166333:1(1-36)Online publication date: 23-Nov-2023
  • (2021)Product Optimization in Stepwise DesignLogic, Computation and Rigorous Methods10.1007/978-3-030-76020-5_4(63-81)Online publication date: 4-Jun-2021
  • (2017)Finding near-optimal configurations in product lines by random samplingProceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering10.1145/3106237.3106273(61-71)Online publication date: 21-Aug-2017
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASE '14: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering
September 2014
934 pages
ISBN:9781450330138
DOI:10.1145/2642937
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 September 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dense linear algebra
  2. distributed-memory computing
  3. high-performance software
  4. model driven engineering
  5. program generation

Qualifiers

  • Research-article

Funding Sources

Conference

ASE '14
Sponsor:

Acceptance Rates

ASE '14 Paper Acceptance Rate 82 of 337 submissions, 24%;
Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)61
  • Downloads (Last 6 weeks)9
Reflects downloads up to 12 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Finding Near-optimal Configurations in Colossal Spaces with Statistical GuaranteesACM Transactions on Software Engineering and Methodology10.1145/361166333:1(1-36)Online publication date: 23-Nov-2023
  • (2021)Product Optimization in Stepwise DesignLogic, Computation and Rigorous Methods10.1007/978-3-030-76020-5_4(63-81)Online publication date: 4-Jun-2021
  • (2017)Finding near-optimal configurations in product lines by random samplingProceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering10.1145/3106237.3106273(61-71)Online publication date: 21-Aug-2017
  • (2017)From software extensions to product lines of dataflow programsSoftware and Systems Modeling (SoSyM)10.1007/s10270-015-0495-816:4(929-947)Online publication date: 1-Oct-2017
  • (2015)A theory of modularity for automated software development (keynote)Companion Proceedings of the 14th International Conference on Modularity10.1145/2735386.2735843(1-10)Online publication date: 16-Mar-2015

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media