DOI: 10.1145/3295500.3356204
research-article

Moment representation in the lattice Boltzmann method on massively parallel hardware

Published: 17 November 2019

Abstract

The widely used lattice Boltzmann method (LBM) for computational fluid dynamics is highly scalable, but it is also strongly memory-bandwidth-bound on current architectures. This paper presents a new regularized LBM implementation that reduces the memory footprint by storing only macroscopic, moment-based data. We show that the amount of data that must be stored in memory during a simulation is reduced by up to 47%. We also present a technique for cache-aware data reuse and show that optimizing cache utilization to limit data motion yields a similar improvement in time to solution. These new algorithms are implemented in the hemodynamics solver HARVEY and demonstrated on both idealized and realistic biological geometries. We develop a performance model for the moment representation algorithm and evaluate its performance on Summit.
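
To make the idea of storing only moment-based data concrete, the following is a minimal sketch and not the HARVEY implementation. It assumes a D2Q9 lattice for brevity and a standard second-order Hermite (regularized) reconstruction; the function names `moments`, `reconstruct`, and `storage_saving` are hypothetical. It projects the distributions onto density, momentum, and the symmetric second-order moment, rebuilds the distributions from those moments, and counts how many values per lattice node each representation needs.

```python
# Assumption: D2Q9 lattice with its standard velocities and weights.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
CS2 = 1 / 3  # lattice speed of sound squared


def moments(f):
    """Project 9 distributions onto 6 moments: rho, (jx, jy), (pxx, pxy, pyy)."""
    rho = sum(f)
    jx = sum(fi * cx for fi, (cx, cy) in zip(f, C))
    jy = sum(fi * cy for fi, (cx, cy) in zip(f, C))
    # Second-order Hermite moments: sum_i f_i (c_i c_i - cs^2 I)
    pxx = sum(fi * (cx * cx - CS2) for fi, (cx, cy) in zip(f, C))
    pxy = sum(fi * cx * cy for fi, (cx, cy) in zip(f, C))
    pyy = sum(fi * (cy * cy - CS2) for fi, (cx, cy) in zip(f, C))
    return rho, (jx, jy), (pxx, pxy, pyy)


def reconstruct(rho, j, p):
    """Rebuild regularized distributions from the stored moments."""
    jx, jy = j
    pxx, pxy, pyy = p
    f = []
    for w, (cx, cy) in zip(W, C):
        first = (cx * jx + cy * jy) / CS2
        second = ((cx * cx - CS2) * pxx
                  + 2 * cx * cy * pxy
                  + (cy * cy - CS2) * pyy) / (2 * CS2 ** 2)
        f.append(w * (rho + first + second))
    return f


def storage_saving(q, dim):
    """Fraction of per-node storage saved by keeping moments instead of q distributions.

    Assumes the stored moments are density (1), momentum (dim), and the
    independent components of a symmetric second-order tensor.
    """
    n_moments = 1 + dim + dim * (dim + 1) // 2
    return 1 - n_moments / q


if __name__ == "__main__":
    m = (1.0, (0.05, -0.02), (0.01, 0.003, -0.004))
    f = reconstruct(*m)
    print(len(f), moments(f))      # 9 distributions map back to the same 6 moments
    print(storage_saving(19, 3))   # 10 of 19 values kept for a D3Q19 lattice
```

Under these assumptions, a three-dimensional lattice with 19 velocities needs 10 stored values per node instead of 19, a saving of roughly 47%, which is consistent with the figure quoted in the abstract if a D3Q19 lattice is used. The round trip through `reconstruct` and `moments` returns the same moments, which is what allows the distributions to be discarded between time steps.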



Published In

SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2019
1921 pages
ISBN: 9781450362290
DOI: 10.1145/3295500

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bandwidth
  2. lattice boltzmann method
  3. memory
  4. moment representation

Qualifiers

  • Research-article

Funding Sources

  • NIH

Conference

SC '19

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 50
  • Downloads (last 6 weeks): 4
Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) An efficient flux-reconstructed lattice Boltzmann flux solver for flow interaction of multi-structure with curved boundary. Engineering Analysis with Boundary Elements 169 (105958). DOI: 10.1016/j.enganabound.2024.105958. Online publication date: Dec-2024
  • (2023) Moment Representation of Regularized Lattice Boltzmann Methods on NVIDIA and AMD GPUs. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (1697-1704). DOI: 10.1145/3624062.3624250. Online publication date: 12-Nov-2023
  • (2023) High-Order Moment-Encoded Kinetic Simulation of Turbulent Flows. ACM Transactions on Graphics 42:6 (1-13). DOI: 10.1145/3618341. Online publication date: 5-Dec-2023
  • (2023) Improving the Performance of Lattice Boltzmann Method with Pipelined Algorithm on A Heterogeneous Multi-zone Processor. Parallel and Distributed Computing, Applications and Technologies (28-41). DOI: 10.1007/978-3-031-29927-8_3. Online publication date: 8-Apr-2023
  • (2023) A graphic processing unit implementation for the moment representation of the lattice Boltzmann method. International Journal for Numerical Methods in Fluids 95:7 (1076-1089). DOI: 10.1002/fld.5185. Online publication date: 20-Feb-2023
  • (2022) Parallel Scheme for Multi-Layer Refinement Non-Uniform Grid Lattice Boltzmann Method Based on Load Balancing. Energies 15:21 (7884). DOI: 10.3390/en15217884. Online publication date: 24-Oct-2022
  • (2022) Propagation Pattern for Moment Representation of the Lattice Boltzmann Method. IEEE Transactions on Parallel and Distributed Systems 33:3 (642-653). DOI: 10.1109/TPDS.2021.3098456. Online publication date: 1-Mar-2022
  • (2021) Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems. Euro-Par 2021: Parallel Processing (519-535). DOI: 10.1007/978-3-030-85665-6_32. Online publication date: 25-Aug-2021
  • (2021) Functionally Arranged Data for Algorithms with Space-Time Wavefront. Parallel Computational Technologies (134-148). DOI: 10.1007/978-3-030-81691-9_10. Online publication date: 9-Jul-2021
