DOI: 10.1145/3624007.3624053
Research article

C2TACO: Lifting Tensor Code to TACO

Published: 22 October 2023

Abstract

Domain-specific languages (DSLs) promise a significant performance and portability advantage over traditional languages. DSLs are designed to be high-level and platform-independent, allowing an optimizing compiler significant leeway when targeting a particular device. Such languages are particularly popular with emerging tensor algebra workloads. However, DSLs present their own challenge: they require programmers to learn new programming languages and put in significant effort to migrate legacy code.
We present C2TACO, a tool for synthesizing TACO, a well-known tensor DSL, from C code. We develop a smart, enumerative synthesizer that uses automatically generated IO examples and source-code analysis to generate code efficiently. C2TACO synthesizes 95% of the programs in a tensor benchmark suite, outperforming an alternative neural machine translation technique, and demonstrates substantially higher accuracy when evaluated against two state-of-the-art existing schemes, TF-Coder and ChatGPT. Our synthesized TACO programs are, by design, portable, achieving significant performance improvements when evaluated on multi-core and GPU platforms.
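To make the lifting task concrete, here is an illustrative sketch (not taken from the paper): a plain C matrix-vector multiply of the kind such tools target, with its TACO index-notation equivalent shown in a comment. The function name `matvec` and the dimensions are our own, chosen for illustration.

```c
/* A C loop nest implementing matrix-vector multiplication.
   In TACO's tensor index notation, this entire nest is the
   one-line expression  y(i) = A(i,j) * x(j)  -- summation
   over the repeated index j is implicit, and the compiler
   is then free to retarget it to multi-core or GPU code. */
void matvec(int m, int n, const double A[m][n],
            const double x[n], double y[m]) {
    for (int i = 0; i < m; i++) {
        y[i] = 0.0;
        for (int j = 0; j < n; j++)
            y[i] += A[i][j] * x[j];
    }
}
```

Lifting replaces the loop-level detail (iteration order, accumulation) with the declarative tensor expression, which is what gives the DSL compiler its optimization leeway.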




Published In

GPCE 2023: Proceedings of the 22nd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences
October 2023
152 pages
ISBN: 9798400704062
DOI: 10.1145/3624007


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Program Lifting
  2. Synthesis
  3. TACO
  4. Tensor Algebra


Conference

GPCE '23

Acceptance Rates

Overall Acceptance Rate 56 of 180 submissions, 31%

