Article

PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures

Authors: Matthias Christen, Olaf Schenk, Helmar Burkhart

IPDPS '11: Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium

Pages 676 - 687

https://doi.org/10.1109/IPDPS.2011.70

Published: 16 May 2011

Abstract

Stencil calculations comprise an important class of kernels in many scientific computing applications, ranging from simple PDE solvers to constituent kernels in multigrid methods and image processing applications. In such solvers, stencil kernels are often the dominant part of the computation, so an efficient parallel implementation of the kernel is crucial to reducing the time to solution. However, on today's complex hardware microarchitectures, meticulous architecture-specific tuning is required to elicit the machine's full compute power. We present PATUS, a code generation and auto-tuning framework for stencil computations targeted at multi- and many-core processors, such as multicore CPUs and graphics processing units. PATUS makes it possible to generate compute kernels from a specification of the stencil operation and a parallelization and optimization strategy, and it leverages the auto-tuning methodology to optimize strategy-dependent parameters for the given hardware architecture.
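To make the two ideas in the abstract concrete, the following is a minimal sketch, not PATUS itself: it uses neither the PATUS stencil specification language nor its generated code. It assumes a 2D 5-point Jacobi stencil as the "stencil operation", loop blocking as the "strategy", and the block sizes as the "strategy-dependent parameters". The names `jacobi_blocked`, `autotune`, and `make_grid` are hypothetical, and the exhaustive search is a simple stand-in for whatever search method an auto-tuner would actually use.

```python
import itertools
import time


def jacobi_blocked(u, block_y, block_x):
    """One 5-point Jacobi sweep over the interior of grid u, traversed block by block."""
    ny, nx = len(u), len(u[0])
    out = [row[:] for row in u]  # boundary values are copied unchanged
    for y0 in range(1, ny - 1, block_y):
        for x0 in range(1, nx - 1, block_x):
            # Update every interior point inside the current block.
            for y in range(y0, min(y0 + block_y, ny - 1)):
                for x in range(x0, min(x0 + block_x, nx - 1)):
                    out[y][x] = 0.25 * (u[y - 1][x] + u[y + 1][x]
                                        + u[y][x - 1] + u[y][x + 1])
    return out


def autotune(u, candidates, repeats=3):
    """Exhaustive search: time every (block_y, block_x) pair and keep the fastest."""
    best, best_time = None, float("inf")
    for by, bx in itertools.product(candidates, repeat=2):
        start = time.perf_counter()
        for _ in range(repeats):
            jacobi_blocked(u, by, bx)
        elapsed = (time.perf_counter() - start) / repeats
        if elapsed < best_time:
            best, best_time = (by, bx), elapsed
    return best, best_time


def make_grid(n):
    """Hypothetical test input: top boundary row held at 1.0, zero elsewhere."""
    return [[1.0 if y == 0 else 0.0 for _ in range(n)] for y in range(n)]


grid = make_grid(64)
best_blocks, seconds = autotune(grid, candidates=[4, 16, 64])
```

Because a Jacobi sweep reads only from the old grid and writes only to the new one, every block size yields the same result; only the traversal order, and hence the run time on a given machine, changes. That separation between what is computed and how it is traversed is what makes the block sizes safe to tune automatically.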


Recommendations

Patus for convenient high-performance stencils: evaluation in earthquake simulations
SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Patus is a code generation and auto-tuning framework for stencil computations targeting modern multi and many-core processors. The goals of the framework are productivity and portability for achieving high performance on the target platform. Its stencil ...
Patus for convenient high-performance stencils: Evaluation in earthquake simulations
SC '12: Proceedings of the 2012 International Conference for High Performance Computing, Networking, Storage and Analysis

PATUS is a code generation and auto-tuning framework for stencil computations targeting modern multi and many-core processors. The goals of the framework are productivity and portability for achieving high performance on the target platform. Its stencil ...
Evaluation of Rodinia Codes on Intel Xeon Phi
ISMS '13: Proceedings of the 2013 4th International Conference on Intelligent Systems, Modelling and Simulation

High performance computing (HPC) is a niche area where various parallel benchmarks are constantly used to explore and evaluate the performance of Heterogeneous computing systems on the horizon. The Rodinia benchmark suite, a collection of parallel ...


Information

Published In

IPDPS '11: Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium

May 2011, 1285 pages

ISBN: 9780769543857

Publisher

IEEE Computer Society, United States


Bibliometrics

Total Citations: 113
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 13 Dec 2024

Cited By

Del Sozzo E, Conficconi D, Sano K (2024). Across Time and Space: Senju’s Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAs. ACM Transactions on Reconfigurable Technology and Systems 17:2 (1-33). Online publication date: 30-Apr-2024
https://dl.acm.org/doi/10.1145/3634920

Bisbas G, Lydike A, Bauer E, Brown N, Fehr M, Mitchell L, Rodriguez-Canal G, Jamieson M, Kelly P, Steuwer M, Grosser T, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). A shared compilation stack for distributed-memory parallelism in stencil DSLs. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (38-56). Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3620666.3651344

Antepara O, Williams S, Johansen H, Zhao T, Hirsch S, Goyal P, Hall M (2023). Performance Portability Evaluation of Blocked Stencil Computations on GPUs. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (1007-1018). Online publication date: 12-Nov-2023
https://dl.acm.org/doi/10.1145/3624062.3624177

Ahmad Z, Chowdhury R, Das R, Ganapathi P, Gregory A, Zhu Y (2023). A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier Transforms. ACM Transactions on Parallel Computing 10:4 (1-34). Online publication date: 24-Jul-2023
https://dl.acm.org/doi/10.1145/3606338

Liu X, Liu Y, Yang H, Liao J, Li M, Luan Z, Qian D, Rauchwerger L, Cameron K, Nikolopoulos D, Pnevmatikatos D (2022). Toward accelerated stencil computation by adapting tensor core unit on GPU. Proceedings of the 36th ACM International Conference on Supercomputing (1-12). Online publication date: 28-Jun-2022
https://dl.acm.org/doi/10.1145/3524059.3532392

Pekkilä J, Väisälä M, Käpylä M, Rheinhardt M, Lappi O (2022). Scalable communication for high-order stencil computations using CUDA-aware MPI. Parallel Computing 111:C. Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.1016/j.parco.2022.102904

Li M, Liu Y, Yang H, Hu Y, Sun Q, Chen B, You X, Liu X, Luan Z, Qian D (2021). Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors. Proceedings of the 50th International Conference on Parallel Processing (1-12). Online publication date: 9-Aug-2021
https://dl.acm.org/doi/10.1145/3472456.3473517

Reggiani E, Del Sozzo E, Conficconi D, Natale G, Moroni C, Santambrogio M (2021). Enhancing the Scalability of Multi-FPGA Stencil Computations via Highly Optimized HDL Components. ACM Transactions on Reconfigurable Technology and Systems 14:3 (1-33). Online publication date: 12-Aug-2021
https://dl.acm.org/doi/10.1145/3461478

Li K, Yuan L, Zhang Y, Yue Y, de Supinski B, Hall M, Gamblin T (2021). Reducing redundancy in data organization and arithmetic calculation for stencil computations. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-15). Online publication date: 14-Nov-2021
https://dl.acm.org/doi/10.1145/3458817.3476154

Roy R, Patel T, Gadepally V, Tiwari D, Freund S, Yahav E (2021). Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (1280-1295). Online publication date: 19-Jun-2021
https://dl.acm.org/doi/10.1145/3453483.3454109
