[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/IPDPS.2011.70guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures

Published: 16 May 2011 Publication History

Abstract

Stencil calculations comprise an important class of kernels in many scientific computing applications ranging from simple PDE solvers to constituent kernels in multigrid methods as well as image processing applications. In such types of solvers, stencil kernels are often the dominant part of the computation, and an efficient parallel implementation of the kernel is therefore crucial in order to reduce the time to solution. However, in the current complex hardware micro architectures, meticulous architecture-specific tuning is required to elicit the machine's full compute power. We present a code generation and auto-tuning framework \textsc{Patus} for stencil computations targeted at multi- and many core processors, such as multicore CPUs and graphics processing units, which makes it possible to generate compute kernels from a specification of the stencil operation and a parallelization and optimization strategy, and leverages the auto tuning methodology to optimize strategy-dependent parameters for the given hardware architecture.

Cited By

View all
  • (2024)Across Time and Space: Senju’s Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAsACM Transactions on Reconfigurable Technology and Systems10.1145/363492017:2(1-33)Online publication date: 30-Apr-2024
  • (2024)A shared compilation stack for distributed-memory parallelism in stencil DSLsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3620666.3651344(38-56)Online publication date: 27-Apr-2024
  • (2023)Performance Portability Evaluation of Blocked Stencil Computations on GPUsProceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis10.1145/3624062.3624177(1007-1018)Online publication date: 12-Nov-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
IPDPS '11: Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
May 2011
1285 pages
ISBN:9780769543857

Publisher

IEEE Computer Society

United States

Publication History

Published: 16 May 2011

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Across Time and Space: Senju’s Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAsACM Transactions on Reconfigurable Technology and Systems10.1145/363492017:2(1-33)Online publication date: 30-Apr-2024
  • (2024)A shared compilation stack for distributed-memory parallelism in stencil DSLsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3620666.3651344(38-56)Online publication date: 27-Apr-2024
  • (2023)Performance Portability Evaluation of Blocked Stencil Computations on GPUsProceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis10.1145/3624062.3624177(1007-1018)Online publication date: 12-Nov-2023
  • (2023)A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier TransformsACM Transactions on Parallel Computing10.1145/360633810:4(1-34)Online publication date: 24-Jul-2023
  • (2022)Toward accelerated stencil computation by adapting tensor core unit on GPUProceedings of the 36th ACM International Conference on Supercomputing10.1145/3524059.3532392(1-12)Online publication date: 28-Jun-2022
  • (2022)Scalable communication for high-order stencil computations using CUDA-aware MPIParallel Computing10.1016/j.parco.2022.102904111:COnline publication date: 1-Jul-2022
  • (2021)Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core ProcessorsProceedings of the 50th International Conference on Parallel Processing10.1145/3472456.3473517(1-12)Online publication date: 9-Aug-2021
  • (2021)Enhancing the Scalability of Multi-FPGA Stencil Computations via Highly Optimized HDL ComponentsACM Transactions on Reconfigurable Technology and Systems10.1145/346147814:3(1-33)Online publication date: 12-Aug-2021
  • (2021)Reducing redundancy in data organization and arithmetic calculation for stencil computationsProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3458817.3476154(1-15)Online publication date: 14-Nov-2021
  • (2021)Bliss: auto-tuning complex applications using a pool of diverse lightweight learning modelsProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454109(1280-1295)Online publication date: 19-Jun-2021
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media