DOI: 10.1109/SC.2004.26
Article

GPU Cluster for High Performance Computing

Published: 06 November 2004

Abstract

Inspired by the attractive Flops/dollar ratio and the incredible growth in the speed of modern graphics processing units (GPUs), we propose to use a cluster of GPUs for high performance scientific computing. As an example application, we have developed a parallel flow simulation using the lattice Boltzmann model (LBM) on a GPU cluster and have simulated the dispersion of airborne contaminants in the Times Square area of New York City. Using 30 GPU nodes, our simulation can compute a 480x400x80 LBM lattice in 0.31 seconds per step, which is 4.6 times faster than our CPU cluster implementation. Besides the LBM, we also discuss other potential applications of the GPU cluster, such as cellular automata, PDE solvers, and FEM.
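
To put the abstract's performance figures in perspective, the following sketch derives the lattice size and throughput they imply. Only the four constants come from the abstract. The even split of cells across nodes, and the reading of "4.6 times faster" as a ratio of per-step times, are assumptions made here for illustration. The function names are hypothetical.

```python
# Figures quoted in the abstract.
NX, NY, NZ = 480, 400, 80        # lattice dimensions
GPU_SECONDS_PER_STEP = 0.31      # time per LBM step on the GPU cluster
SPEEDUP_OVER_CPU = 4.6           # GPU cluster vs. CPU cluster
GPU_NODES = 30                   # number of GPU nodes used


def lattice_cells(nx: int, ny: int, nz: int) -> int:
    """Total number of lattice cells in an nx x ny x nz grid."""
    return nx * ny * nz


def cells_per_second(cells: int, seconds_per_step: float) -> float:
    """Cell updates per second for one full-lattice step."""
    return cells / seconds_per_step


cells = lattice_cells(NX, NY, NZ)
throughput = cells_per_second(cells, GPU_SECONDS_PER_STEP)

# Assumption: cells are divided evenly across nodes (the abstract does
# not describe the decomposition).
cells_per_node = cells / GPU_NODES

# Assumption: "4.6 times faster" means the CPU cluster's per-step time
# is 4.6 times the GPU cluster's per-step time.
cpu_seconds_per_step = GPU_SECONDS_PER_STEP * SPEEDUP_OVER_CPU

print(f"lattice cells:         {cells:,}")
print(f"GPU cluster rate:      {throughput / 1e6:.1f} million cells/s")
print(f"cells per node (even): {cells_per_node:,.0f}")
print(f"implied CPU step time: {cpu_seconds_per_step:.3f} s")
```

Under these assumptions the lattice holds about 15.4 million cells, so the GPU cluster updates roughly 50 million cells per second in aggregate.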

Published In

SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing
November 2004
724 pages
ISBN: 0769521533

Publisher

IEEE Computer Society, United States

Author Tags

  1. GPU cluster
  2. computational fluid dynamics
  3. data intensive computing
  4. lattice Boltzmann model
  5. urban airborne dispersion

Qualifiers

  • Article

Conference

SC '04

Acceptance Rates

SC '04 Paper Acceptance Rate 60 of 200 submissions, 30%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2023) Checkpoint/Restart for CUDA Kernels. Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1729-1737. DOI: 10.1145/3624062.3624254. Online publication date: 12-Nov-2023
  • (2023) ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-15. DOI: 10.1145/3581784.3607077. Online publication date: 12-Nov-2023
  • (2018) Exposing hidden performance opportunities in high performance GPU applications. Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 301-310. DOI: 10.1109/CCGRID.2018.00045. Online publication date: 1-May-2018
  • (2018) Physically based visual simulation of the Lattice Boltzmann method on the GPU. The Journal of Supercomputing, 74:7, pp. 3441-3467. DOI: 10.1007/s11227-018-2392-8. Online publication date: 1-Jul-2018
  • (2016) A novel GPU resources management and scheduling system based on virtual machines. International Journal of High Performance Computing and Networking, 9:5-6, pp. 423-430. DOI: 10.1504/ijhpcn.2016.080415. Online publication date: 1-Jan-2016
  • (2016) Multidisciplinary simulation acceleration using multiple shared memory graphical processing units. International Journal of High Performance Computing Applications, 30:4, pp. 486-508. DOI: 10.1177/1094342016639114. Online publication date: 1-Nov-2016
  • (2016) Software pipelining for graphic processing unit acceleration. International Journal of High Performance Computing Applications, 30:2, pp. 169-185. DOI: 10.1177/1094342015585845. Online publication date: 1-May-2016
  • (2016) Contextual abstraction in a type system for component-based high performance computing platforms. Science of Computer Programming, 132:P1, pp. 96-128. DOI: 10.1016/j.scico.2016.07.005. Online publication date: 15-Dec-2016
  • (2015) Scaling soft matter physics to thousands of graphics processing units in parallel. International Journal of High Performance Computing Applications, 29:3, pp. 274-283. DOI: 10.1177/1094342015576848. Online publication date: 1-Aug-2015
  • (2015) A case study of data transfer efficiency optimization for GPU- and infiniband-based clusters. Proceedings of the 2015 Conference on research in adaptive and convergent systems, pp. 247-250. DOI: 10.1145/2811411.2811468. Online publication date: 9-Oct-2015