[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2908080.2908117acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article
Public Access

Verified lifting of stencil computations

Published: 02 June 2016 Publication History

Abstract

This paper demonstrates a novel combination of program synthesis and verification to lift stencil computations from low-level Fortran code to a high-level summary expressed using a predicate language. The technique is sound and mostly automated, and leverages counter-example guided inductive synthesis (CEGIS) to find provably correct translations. Lifting existing code to a high-performance description language has a number of benefits, including maintainability and performance portability. For example, our experiments show that the lifted summaries can enable domain specific compilers to do a better job of parallelization as compared to an off-the-shelf compiler working on the original code, and can even support fully automatic migration to hardware accelerators such as GPUs. We have implemented verified lifting in a system called STNG and have evaluated it using microbenchmarks, mini-apps, and real-world applications. We demonstrate the benefits of verified lifting by first automatically summarizing Fortran source code into a high-level predicate language, and subsequently translating the lifted summaries into Halide, with the translated code achieving median performance speedups of 4.1X and up to 24X for non-trivial stencils as compared to the original implementation.

References

[1]
Apache Hive. http://hive.apache.org.
[2]
Syntax-Guided Synthesis Competition. http://www.sygus. org.
[3]
Rajeev Alur, Rastislav Bod´ık, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pages 1–17, 2013.
[4]
Saman P Amarasinghe, Jennifer-Ann M Anderson, Monica S Lam, and Chau-Wen Tseng. An overview of the SUIF compiler for scalable parallel machines. In PPSC, pages 662–667, 1995.
[5]
Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O’Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, PACT ’14, pages 303–316, New York, NY, USA, 2014. ACM.
[6]
D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga. The NAS parallel benchmarks— summary and preliminary results. In Proceedings of the 1991 ACM/IEEE Conference on Supercomputing, Supercomputing ’91, pages 158–165, New York, NY, USA, 1991. ACM.
[7]
Gilles Barthe, Juan Manuel Crespo, Sumit Gulwani, Cesar Kunz, and Mark Marron. From relational verification to SIMD loop synthesis. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’13, pages 123–134, 2013.
[8]
John R. Baumgardner. Three-dimensional treatment of convective flow in the earth’s mantle. Journal of Statistical Physics, 39(5-6):501–511, 1985.
[9]
Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, and Jim Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In Proceedings of the 11th International Conference on Supercomputing, ICS ’97, pages 340–347, New York, NY, USA, 1997. ACM.
[10]
Aaron R. Bradley. SAT-based model checking without unrolling. In Proceedings of the 12th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI’11, pages 70–87, 2011.
[11]
Aaron R. Bradley, Zohar Manna, and Henny B. Sipma. What’s decidable about arrays? In Proceedings of the 7th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI’06, pages 427–442, Berlin, Heidelberg, 2006. Springer-Verlag.
[12]
Peter E Bulychev, Egor V Kostylev, and Vladimir A Zakharov. Anti-unification algorithms and their applications in program analysis. In Perspectives of Systems Informatics, pages 413– 423. Springer, 2010.
[13]
Bryan Catanzaro, Shoaib Kamil, Yunsup Lee, Krste Asanovic, James Demmel, Kurt Keutzer, John Shalf, Kathy Yelick, and Armando Fox. SEJITS: Getting productivity and performance with selective embedded JIT specialization. In Workshop on Programmable Models for Emerging Architecture (PMEA), 2009.
[14]
B.L. Chamberlain, D. Callahan, and H.P. Zima. Parallel programmability and the Chapel language. International Journal High Performance Computing Applications, 21(3):291– 312, 2007.
[15]
Alvin Cheung, Armando Solar-Lezama, and Samuel Madden. Optimizing database-backed applications with query synthesis. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 3–14, New York, NY, USA, 2013. ACM.
[16]
M. Christen, O. Schenk, and H. Burkhart. PATUS: A code generation and autotuning framework for parallel iterative stencil computations on modern microarchitectures. In International Parallel Distributed Processing Symposium (IPDPS), pages 676–687, May 2011.
[17]
Alessandro Cimatti and Alberto Griggio. Software model checking via IC3. In Proceedings of the 24th International Conference on Computer Aided Verification, CAV’12, pages 277–293, 2012.
[18]
Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David Patterson, John Shalf, and Katherine Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC ’08, pages 4:1–4:12, Piscataway, NJ, USA, 2008. IEEE Press.
[19]
Kei Davis and Daniel J. Quinlan. ROSE: An optimizing transformation system for C++ array-class libraries. In Workshop on Object-Oriented Technology, ECOOP ’98, pages 452–453, London, UK, UK, 1998. Springer-Verlag.
[20]
Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS’08/ETAPS’08, pages 337–340, Berlin, Heidelberg, 2008. Springer-Verlag.
[21]
Michael D. Ernst, Jake Cockrell, William G. Griswold, and David Notkin. Dynamically discovering likely program invariants to support program evolution. In Proceedings of the 21st International Conference on Software Engineering, ICSE ’99, pages 213–224, New York, NY, USA, 1999. ACM.
[22]
Paul Feautrier. Dataflow analysis of array and scalar references. International Journal of Parallel Programming, 20(1):23–53, 1991.
[23]
Cormac Flanagan and Shaz Qadeer. Predicate abstraction for software verification. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pages 191–202, New York, NY, USA, 2002. ACM.
[24]
Matteo Frigo and Volker Strumpen. Cache oblivious stencil computations. In Proceedings of the 19th Annual International Conference on Supercomputing, ICS ’05, pages 361–366, New York, NY, USA, 2005. ACM.
[25]
Pranav Garg, Christof Löding, P. Madhusudan, and Daniel Neider. ICE: A robust framework for learning invariants. In Armin Biere and Roderick Bloem, editors, Computer Aided Verification, volume 8559 of Lecture Notes in Computer Science, pages 69–87. Springer International Publishing, 2014.
[26]
Tobias Grosser, Albert Cohen, Justin Holewinski, P. Sadayappan, and Sven Verdoolaege. Hybrid hexagonal/classical tiling for gpus. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’14, pages 66:66–66:75, New York, NY, USA, 2014. ACM.
[27]
Tobias Grosser, Sven Verdoolaege, Albert Cohen, and P. Sadayappan. The relation between diamond tiling and hexagonal tiling. Parallel Processing Letters, 24(03):1441002, 2014.
[28]
Michael A. Heroux, Douglas W. Doerfler, Paul S. Crozier, James M. Willenbring, H. Carter Edwards, Alan Williams, Mahesh Rajan, Eric R. Keiter, Heidi K. Thornquist, and Robert W. Numrich. Improving performance via mini-applications. Technical report, Sandia National Laboratory, 2009.
[29]
C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12(10):576–580, October 1969.
[30]
Intel. The X10 parallel programming language. http:// x10-lang.org.
[31]
Francois Irigoin and Remi Triolet. Supernode partitioning. In Symposium on Principles of Programming Languages (POPL’88), pages 319–328, San Diego, CA, January 1988.
[32]
Ming-Yee Iu and Willy Zwaenepoel. HadoopToSQL: A MapReduce query optimizer. In Proceedings of the 5th European Conference on Computer Systems, EuroSys ’10, pages 251–264, New York, NY, USA, 2010. ACM.
[33]
Jinseong Jeon, Xiaokang Qiu, Armando Solar-Lezama, and Jeffrey S. Foster. Adaptive concretization for parallel program synthesis. In Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part II, pages 377–394, 2015.
[34]
Rajeev Joshi, Greg Nelson, and Keith Randall. Denali: A goal-directed superoptimizer. pages 304–314, 2002.
[35]
David Joyner, Ondˇrej ˇ Cert´ık, Aaron Meurer, and Brian E. Granger. Open source computer algebra systems: SymPy. ACM Communications in Computer Algebra, 45(3/4):225–234, January 2012.
[36]
Shoaib Kamil. A cross-domain stencil benchmark suite. In Workshop on Optimizing Stencil Computations, 2013.
[37]
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, and Samuel Williams. An auto-tuning framework for parallel multicore stencil computations. In International Parallel and Distributed Processing Symposium (IPDPS), pages 1–12, 2010.
[38]
Aleksandr Karbyshev, Nikolaj Bjørner, Shachar Itzhaky, Noam Rinetzky, and Sharon Shoham. Property-directed inference of universal invariants or proving their absence. In Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I, pages 583–602, 2015.
[39]
Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-No¨el Pouchet, and P. Sadayappan. When polyhedral transformations meet SIMD code generation. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 127– 138, New York, NY, USA, 2013. ACM.
[40]
A.C. Mallinson, D.A. Beckingsale, W.P. Gaudin, J.A. Herdman, J.M. Levesque, and S.A. Jarvis. Cloverleaf: Preparing hydrodynamics codes for exascale. In Cray User Group, 2013.
[41]
Henry Massalin. Superoptimizer: A look at the smallest program. In Proceedings of the Second International Conference on Architectual Support for Programming Languages and Operating Systems, ASPLOS II, pages 122–126, Los Alamitos, CA, USA, 1987. IEEE Computer Society Press.
[42]
John Matthews, J. Strother Moore, Sandip Ray, and Daron Vroon. Verification condition generation via theorem proving. In Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, LPAR’06, pages 362–376, 2006.
[43]
Charith Mendis, Jeffrey Bosboom, Kevin Wu, Shoaib Kamil, Jonathan Ragan-Kelley, Sylvain Paris, Qin Zhao, and Saman Amarasinghe. Helium: Lifting high-performance stencil kernels from stripped x86 binaries to Halide DSL code. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, June 2015.
[44]
Ravi Teja Mullapudi, Vinay Vasista, and Uday Bondhugula. PolyMage: Automatic optimization for image processing pipelines. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’15, pages 429–443, New York, NY, USA, 2015. ACM.
[45]
Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah Chasins, and Rastislav Bodik. Chlorophyll: Synthesis-aided compiler for low-power spatial architectures. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, pages 396–407, New York, NY, USA, 2014. ACM.
[46]
Gordon D Plotkin. A note on inductive generalization. Machine Intelligence, 5(1):153–163, 1970.
[47]
Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, 2013.
[48]
John C. Reynolds. Transformational systems and the algebraic structure of atomic formulas. In Bernard Meltzer and Donald Michie, editors, Machine Intelligence 5, pages 135–151. Edinburgh University Press, Edinburgh, Scotland, 1969.
[49]
Gabriel Rivera and Chau-Wen Tseng. Tiling optimizations for 3D scientific computations. In Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, SC ’00, Washington, DC, USA, 2000. IEEE Computer Society.
[50]
Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic superoptimization. SIGPLAN Notices, 48(4):305–316, March 2013.
[51]
Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic optimization of floating-point programs with tunable precision. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, page 9, 2014.
[52]
Armando Solar Lezama. Program Synthesis By Sketching. PhD thesis, EECS Department, University of California, Berkeley, Dec 2008.
[53]
Armando Solar-Lezama, Gilad Arnold, Liviu Tancau, Rastislav Bodk, Vijay A. Saraswat, and Sanjit A. Seshia. Sketching stencils. In Jeanne Ferrante and Kathryn S. McKinley, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 167–178. ACM, 2007.
[54]
Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. Combinatorial sketching for finite programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XII, pages 404–415, New York, NY, USA, 2006. ACM.
[55]
Arvind K. Sujeeth, Kevin J. Brown, Hyoukjoong Lee, Tiark Rompf, Hassan Chafi, Martin Odersky, and Kunle Olukotun. Delite: A compiler architecture for performance-oriented embedded domain-specific languages. ACM Transactions in Embedded Computer Systems, 13(4s):134:1–134:25, April 2014.
[56]
Yuan Tang, Rezaul Chowdhury, Bradley C. Kuszmaul, Chi keung Luk, and Charles E. Leiserson. The Pochoir stencil compiler. In ACM Symposium on Parallelism in Algorithms and Architectures, (SPAA), 2011.
[57]
Abhishek Udupa, Arun Raghavan, Jyotirmoy V. Deshmukh, Sela Mador-Haim, Milo M. K. Martin, and Rajeev Alur. TRANSIT: specifying protocols with concolic snippets. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, Seattle, WA, USA, June 16-19, 2013, pages 287–296, 2013.
[58]
Diego Vasco, Nelson Moraga, and Gundolf Haase. Parallel finite volume method simulation of three-dimensional fluid flow and convective heat transfer for viscoplastic non-newtonian fluids. J. Numerical Heat Transfer, Part A: Applications, 66(2):990–1019, 2014.
[59]
Diego Vasco, Nelson Moraga, Aurel Neic, and Gundolf Haase. OpenMP parallel acceleration of a 3D finite volume method based code for simulation of non-Newtonian fluid flows. In Sergio Gutiérrez, Daniel Hurtado, and Esteban Sáez, editors, Cuadernos de Mecánica Computacional, pages 108–117, Concepci´on, Chile, 2011. Sociedad Chilena de Mecánica Computacional.
[60]
Ben Wiedermann, Ali Ibrahim, and William R. Cook. Interprocedural query extraction for transparent persistence. In Proceedings of the 23rd ACM SIGPLAN Conference on Objectoriented Programming Systems Languages and Applications, pages 19–36, 2008.
[61]
Glynn Winskel. The Formal Semantics of Programming Languages: An Introduction. MIT Press, Cambridge, MA, USA, 1993.
[62]
Christoph M. Wintersteiger, Youssef Hamadi, and Leonardo Mendonc¸a de Moura. Efficiently solving quantified bit-vector formulas. In Proceedings of 10th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2010, Lugano, Switzerland, October 20-23, pages 239–246, 2010.
[63]
David Wonnacott. Achieving scalable locality with time skewing. International Journal of Parallel Programming, 30(3):181–221, 2002.
[64]
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI’12, 2012.
[65]
Fubo Zhang and Erik H. D’Hollander. Using hammock graphs to structure programs. IEEE Transactions on Software Engineering, 30(4):231–245, 2004.

Cited By

View all
  • (2024)HiPy: Extracting High-Level Semantics from Python Code for Data ProcessingProceedings of the ACM on Programming Languages10.1145/36897378:OOPSLA2(736-762)Online publication date: 8-Oct-2024
  • (2024)Equivalence by Canonicalization for Synthesis-Backed RefactoringProceedings of the ACM on Programming Languages10.1145/36564538:PLDI(1879-1904)Online publication date: 20-Jun-2024
  • (2024)SpEQ: Translation of Sparse Codes using EquivalencesProceedings of the ACM on Programming Languages10.1145/36564458:PLDI(1680-1703)Online publication date: 20-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2016
726 pages
ISBN:9781450342612
DOI:10.1145/2908080
  • General Chair:
  • Chandra Krintz,
  • Program Chair:
  • Emery Berger
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 51, Issue 6
    PLDI '16
    June 2016
    726 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2980983
    • Editor:
    • Andy Gill
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 June 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Domain-specific Languages
  2. Program Synthesis
  3. Verified Lifting

Qualifiers

  • Research-article

Funding Sources

Conference

PLDI '16
Sponsor:

Acceptance Rates

Overall Acceptance Rate 378 of 1,962 submissions, 19%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)336
  • Downloads (Last 6 weeks)45
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)HiPy: Extracting High-Level Semantics from Python Code for Data ProcessingProceedings of the ACM on Programming Languages10.1145/36897378:OOPSLA2(736-762)Online publication date: 8-Oct-2024
  • (2024)Equivalence by Canonicalization for Synthesis-Backed RefactoringProceedings of the ACM on Programming Languages10.1145/36564538:PLDI(1879-1904)Online publication date: 20-Jun-2024
  • (2024)SpEQ: Translation of Sparse Codes using EquivalencesProceedings of the ACM on Programming Languages10.1145/36564458:PLDI(1680-1703)Online publication date: 20-Jun-2024
  • (2024)Lifting Micro-Update Models from RTL for Formal Security AnalysisProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640418(631-648)Online publication date: 27-Apr-2024
  • (2023)Decompiling x86 deep neural network executablesProceedings of the 32nd USENIX Conference on Security Symposium10.5555/3620237.3620649(7357-7374)Online publication date: 9-Aug-2023
  • (2023)Towards Auto-Generated Data SystemsProceedings of the VLDB Endowment10.14778/3611540.361163516:12(4116-4129)Online publication date: 1-Aug-2023
  • (2023)Fast Instruction Selection for Fast Digital Signal ProcessingProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 410.1145/3623278.3624768(125-137)Online publication date: 25-Mar-2023
  • (2023)User-Customizable Transpilation of Scripting LanguagesProceedings of the ACM on Programming Languages10.1145/35860347:OOPSLA1(201-229)Online publication date: 6-Apr-2023
  • (2023)Source Matching and Rewriting for MLIR Using String-Based AutomataACM Transactions on Architecture and Code Optimization10.1145/357128320:2(1-26)Online publication date: 1-Mar-2023
  • (2022)Are machine programming systems using right source-code measures to select code repositories?Proceedings of the 6th International Workshop on Machine Learning Techniques for Software Quality Evaluation10.1145/3549034.3561176(11-16)Online publication date: 7-Nov-2022
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media