DOI: 10.1109/CGO51591.2021.9370308
research-article

MLIR: scaling compiler infrastructure for domain specific computation

Published: 17 September 2021

Abstract

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR addresses software fragmentation and compilation for heterogeneous hardware, significantly reduces the cost of building domain-specific compilers, and helps connect existing compilers.
MLIR facilitates the design and implementation of code generators, translators, and optimizers at different levels of abstraction and across application domains, hardware targets, and execution environments. The contributions of this work include (1) a discussion of MLIR as a research artifact built for extension and evolution, identifying the challenges and opportunities posed by this novel design, semantics, optimization specification, system, and engineering; and (2) an evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture. The paper also presents the rationale for MLIR, its original design principles, structures, and semantics.


    Information & Contributors

    Information

    Published In

    CGO '21: Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization
    February 2021
    395 pages
    ISBN:9781728186139
    General Chair: Jae W. Lee

    In-Cooperation

    • IEEE CS

    Publisher

    IEEE Press

    Qualifiers

    • Research-article

    Conference

    CGO '21
    CGO '21: 19th ACM/IEEE International Symposium on Code Generation and Optimization
    February 27 - March 3, 2021
    Virtual Event, Republic of Korea

    Acceptance Rates

    Overall Acceptance Rate 312 of 1,061 submissions, 29%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 174
    • Downloads (last 6 weeks): 12
    Reflects downloads up to 12 Jan 2025

    Citations

    Cited By

    • (2025) MimIR: An Extensible and Type-Safe Intermediate Representation for the DSL Age. Proceedings of the ACM on Programming Languages 9:POPL (95-125). DOI: 10.1145/3704840. Online publication date: 9-Jan-2025
    • (2024) DaCapo. Proceedings of the 33rd USENIX Conference on Security Symposium (6993-7010). DOI: 10.5555/3698900.3699291. Online publication date: 14-Aug-2024
    • (2024) Practical performance guarantees for pipelined DNN inference. Proceedings of the 41st International Conference on Machine Learning (1655-1671). DOI: 10.5555/3692070.3692137. Online publication date: 21-Jul-2024
    • (2024) CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization. ACM Transactions on Reconfigurable Technology and Systems 18:1 (1-32). DOI: 10.1145/3702005. Online publication date: 26-Oct-2024
    • (2024) A Survey on Architectures, Hardware Acceleration and Challenges for In-Network Computing. ACM Transactions on Reconfigurable Technology and Systems 18:1 (1-34). DOI: 10.1145/3699514. Online publication date: 10-Oct-2024
    • (2024) SilvanForge: A Schedule-Guided Retargetable Compiler for Decision Tree Inference. Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles (488-504). DOI: 10.1145/3694715.3695958. Online publication date: 4-Nov-2024
    • (2024) UFront: Toward A Unified MLIR Frontend for Deep Learning. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (255-267). DOI: 10.1145/3691620.3695002. Online publication date: 27-Oct-2024
    • (2024) Unifying Static and Dynamic Intermediate Languages for Accelerator Generators. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2242-2267). DOI: 10.1145/3689790. Online publication date: 8-Oct-2024
    • (2024) A Typed Multi-level Datalog IR and Its Compiler Framework. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (1586-1614). DOI: 10.1145/3689767. Online publication date: 8-Oct-2024
    • (2024) HiPy: Extracting High-Level Semantics from Python Code for Data Processing. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (736-762). DOI: 10.1145/3689737. Online publication date: 8-Oct-2024