MAGMA: : Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures
Abstract
References
Index Terms
- MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures
Recommendations
Kokkos
The manycore revolution can be characterized by increasing thread counts, decreasing memory per thread, and diversity of continually evolving manycore architectures. High performance computing (HPC) applications and libraries must exploit increasingly ...
Evaluation of a performance portable lattice Boltzmann code using OpenCL
IWOCL '14: Proceedings of the International Workshop on OpenCL 2013 & 2014With the advent of many-core computer architectures such as GPGPUs from NVIDIA and AMD, and more recently Intel's Xeon Phi, ensuring performance portability of HPC codes is potentially becoming more complex. In this work we have focused on one important ...
High-performance Cholesky factorization for GPU-only execution
GPGPU-10: Proceedings of the General Purpose GPUsWe present our performance analysis, algorithm designs, and the optimizations needed for the development of high-performance GPU-only algorithms, and in particular, for the dense Cholesky factorization. In contrast to currently promoted designs that ...
Comments
Please enable JavaScript to view thecomments powered by Disqus.Information & Contributors
Information
Published In
Publisher
Sage Publications, Inc.
United States
Publication History
Author Tags
Qualifiers
- Research-article
Contributors
Other Metrics
Bibliometrics & Citations
Bibliometrics
Article Metrics
- 0Total Citations
- 0Total Downloads
- Downloads (Last 12 months)0
- Downloads (Last 6 weeks)0
Other Metrics
Citations
View Options
View options
Login options
Check if you have access through your login credentials or your institution to get full access on this article.
Sign in