[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/3155562.3155683guideproceedingsArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
Article
Free access

taco: a tool to generate tensor algebra kernels

Published: 30 October 2017 Publication History

Abstract

Tensor algebra is an important computational abstraction that is increasingly used in data analytics, machine learning, engineering, and the physical sciences. However, the number of tensor expressions is unbounded, which makes it hard to develop and optimize libraries. Furthermore, the tensors are often sparse (most components are zero), which means the code has to traverse compressed formats. To support programmers we have developed taco, a code generation tool that generates dense, sparse, and mixed kernels from tensor algebra expressions. This paper describes the taco web and command-line tools and discusses the benefits of a code generator over a traditional library. See also the demo video at tensor-compiler.org/ase2017.

References

[1]
J. McAuley and J. Leskovec, “Hidden factors and hidden topics: understanding rating dimensions with review text,” in Proceedings of the 7th ACM conference on Recommender systems. ACM, 2013, pp. 165–172.
[2]
N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers et al., “In-datacenter performance analysis of a tensor processing unit,” arXiv preprint arXiv:1704.04760, 2017.
[3]
H. Abelson, G. J. Sussman, and J. Sussman, Structure and interpretation of computer programs. Justin Kelly, 1996.
[4]
E. S. Raymond, The art of Unix programming. Addison-Wesley Professional, 2003.
[5]
F. Kjolstad, S. Kamil, S. Chou, D. Lugato, and S. Amarasinghe, “The Tensor Algebra Compiler,” Proceedings of the ACM on Programming Languages, vol. 1, no. OOPSLA, October 2017.
[6]
G. Ricci-Curbastro and T. Levi-Civita, “Méthodes de calcul différentiel absolu et leurs applications,” Mathematische Annalen, vol. 54, 1901.
[7]
A. Einstein, “The foundation of the General Theory of Relativity,” Annalen der Physik, vol. 354, pp. 769–822, 1916.
[8]
D. R. Kincaid, T. C. Oppe, and D. M. Young, ITPACKV 2D User’s Guide, 1989.
[9]
W. F. Tinney and J. W. Walker, “Direct solutions of sparse network equations by optimally ordered triangular factorization,” Proceedings of the IEEE, vol. 55, no. 11, pp. 1801–1809, 1967.
[10]
S. Smith and G. Karypis, “Tensor-matrix products with a compressed sparse tensor,” in Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms. ACM, 2015, p. 5.
[11]
M. Baskaran, B. Meister, N. Vasilache, and R. Lethin, “Efficient and scalable computations with sparse tensors,” in High Performance Extreme Computing (HPEC), 2012 IEEE Conference on. IEEE, 2012, pp. 1–6.
[12]
S. Smith, J. W. Choi, J. Li, R. Vuduc, J. Park, X. Liu, and G. Karypis. (2017) FROSTT: The formidable repository of open sparse tensors and tools. {Online}. Available: http://frostt.io/
[13]
R. F. Boisvert, R. Pozo, and K. A. Remington, “The matrix market exchange formats: Initial design,” National Institute of Standards and Technology Internal Report, NISTIR, vol. 5935, 1996.
[14]
I. S. Duff, R. G. Grimes, and J. G. Lewis, “Users’ guide for the harwellboeing sparse matrix collection (release i),” 1992.
[15]
B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi, “On the evolution of user interaction in facebook,” in Proceedings of the 2nd ACM SIGCOMM Workshop on Social Networks (WOSN’09), August 2009.
[16]
R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou, “Cilk: An efficient multithreaded runtime system,” Journal of parallel and distributed computing, vol. 37, no. 1, pp. 55–69, 1996.
[17]
L. Dagum and R. Menon, “Openmp: an industry standard api for sharedmemory programming,” IEEE computational science and engineering, vol. 5, no. 1, pp. 46–55, 1998.
[18]
B. W. Bader and T. G. Kolda, “Efficient matlab computations with sparse and factored tensors,” SIAM Journal on Scientific Computing, vol. 30, no. 1, pp. 205–231, 2007.
[19]
G. Guennebaud, B. Jacob et al., “Eigen v3,” http://eigen.tuxfamily.org, 2010.
[20]
S. Smith, N. Ravindran, N. Sidiropoulos, and G. Karypis, “Splatt: Efficient and parallel sparse tensor-matrix multiplication,” in 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2015, pp. 61–70.
[21]
J. Bezanson, S. Karpinski, V. B. Shah, and A. Edelman, “Julia: A fast dynamic language for technical computing,” 2012.
[22]
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, “Tensorflow: A system for large-scale machine learning,” in Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, ser. OSDI’16. Berkeley, CA, USA: USENIX Association, 2016, pp. 265–283. {Online}. Available: http://dl.acm.org/citation.cfm?id=3026877.3026899
[23]
C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh, “Basic linear algebra subprograms for Fortran usage,” ACM Transactions on Mathematical Software (TOMS), vol. 5, no. 3, pp. 308–323, 1979.
[24]
R. Vuduc, J. W. Demmel, and K. A. Yelick, “OSKI: A library of automatically tuned sparse matrix kernels,” Journal of Physics: Conference Series, vol. 16, no. 1, pp. 521+, 2005.
[25]
A. A. Auer, G. Baumgartner, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, C.-C. Lam, Q. Lu, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan, and A. Sibiryakov, “Automatic code generation for many-body electronic structure methods: the Tensor Contraction Engine,” Molecular Physics, vol. 104, no. 2, pp. 211–228, 2006.

Cited By

View all
  • (2024)Cyclebite: Extracting Task Graphs From Unstructured Compute-ProgramsIEEE Transactions on Computers10.1109/TC.2023.332750473:1(221-234)Online publication date: 1-Jan-2024
  • (2023)Flexagon: A Multi-dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN ProcessingProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3582016.3582069(252-265)Online publication date: 25-Mar-2023
  • (2021)Understanding and bridging the gaps in current GNN performance optimizationsProceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3437801.3441585(119-132)Online publication date: 17-Feb-2021
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering
October 2017
1033 pages
ISBN:9781538626849

Sponsors

Publisher

IEEE Press

Publication History

Published: 30 October 2017

Author Tags

  1. Tensor algebra
  2. compiler
  3. linear algebra
  4. sparse

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)96
  • Downloads (Last 6 weeks)12
Reflects downloads up to 03 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Cyclebite: Extracting Task Graphs From Unstructured Compute-ProgramsIEEE Transactions on Computers10.1109/TC.2023.332750473:1(221-234)Online publication date: 1-Jan-2024
  • (2023)Flexagon: A Multi-dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN ProcessingProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3582016.3582069(252-265)Online publication date: 25-Mar-2023
  • (2021)Understanding and bridging the gaps in current GNN performance optimizationsProceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3437801.3441585(119-132)Online publication date: 17-Feb-2021
  • (2021)Program Lifting using Gray-Box BehaviorProceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques10.1109/PACT52795.2021.00012(60-74)Online publication date: 26-Sep-2021
  • (2019)Tensor algebra compilation with workspacesProceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization10.5555/3314872.3314894(180-192)Online publication date: 16-Feb-2019

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media