DOI: 10.1145/2063384.2063482
research-article

Hardware/software co-design for energy-efficient seismic modeling

Published: 12 November 2011

Abstract

Reverse Time Migration (RTM) has become the standard for high-quality imaging in the seismic industry. RTM relies on PDE solutions using stencils that are 8th order or larger, which require large-scale HPC clusters to meet the computational demands. However, the rising power consumption of conventional cluster technology has prompted investigation of architectural alternatives that offer higher computational efficiency. In this work, we compare the performance and energy efficiency of three architectural alternatives: the Intel Nehalem X5530 multicore processor, the NVIDIA Tesla C2050 GPU, and a general-purpose manycore chip design optimized for high-order wave equations called "Green Wave." We have developed an FPGA-accelerated architectural simulation platform to accurately model the power and performance of the Green Wave design. Results show that across highly tuned high-order RTM stencils, the Green Wave implementation can offer up to 8x and 3.5x better energy efficiency per node than the Nehalem and GPU platforms, respectively. These results point to the enormous potential energy advantages of our hardware/software co-design methodology.
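
To make the phrase "stencils that are 8th order or larger" concrete, the following is a minimal sketch of an 8th-order central-difference stencil for the second derivative in one dimension. It is an illustration only, not the tuned RTM kernels evaluated in the paper: the function name `second_derivative`, the list-based grid, and the zero-filled boundaries are assumptions made for this example, and the coefficients are the standard textbook values for this order of accuracy.

```python
# Standard 8th-order central-difference coefficients for the second
# derivative, for offsets 0, +/-1, +/-2, +/-3, +/-4 (illustrative sketch,
# not the paper's implementation).
COEFFS = [-205.0 / 72.0, 8.0 / 5.0, -1.0 / 5.0, 8.0 / 315.0, -1.0 / 560.0]
RADIUS = len(COEFFS) - 1  # 4 neighbors on each side of the center point


def second_derivative(u, h):
    """Apply the 9-point stencil to the interior points of the 1D list u.

    h is the uniform grid spacing. The first and last RADIUS entries of
    the result are left at 0.0 because the stencil does not fit there.
    """
    n = len(u)
    out = [0.0] * n
    for i in range(RADIUS, n - RADIUS):
        acc = COEFFS[0] * u[i]
        for k in range(1, RADIUS + 1):
            # Symmetric neighbors share one coefficient.
            acc += COEFFS[k] * (u[i - k] + u[i + k])
        out[i] = acc / (h * h)
    return out


if __name__ == "__main__":
    h = 0.1
    xs = [i * h for i in range(21)]
    u = [x ** 4 for x in xs]          # exact second derivative is 12 * x**2
    d2 = second_derivative(u, h)
    print(d2[10], 12 * xs[10] ** 2)   # numerical vs. exact value at the midpoint
```

In three dimensions the same radius-4 stencil applied along each axis reads 25 grid points per output point (8 neighbors per axis plus the center), which gives a sense of why memory traffic and data reuse matter so much for these kernels.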

Published In

SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
November 2011
866 pages
ISBN:9781450307710
DOI:10.1145/2063384

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GPU
  2. RTM
  3. co-design
  4. manycore
  5. seismic
  6. stencil

Qualifiers

  • Research-article

Conference

SC '11

Acceptance Rates

SC '11 Paper Acceptance Rate 74 of 352 submissions, 21%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 16
  • Downloads (last 6 weeks): 1

Reflects downloads up to 02 Mar 2025.

Cited By

  • (2024) Key Technologies and Design Aspects for Wafer Level Packaging of High Performance Computing Modules. 2024 IEEE 74th Electronic Components and Technology Conference (ECTC), pages 433-440. DOI: 10.1109/ECTC51529.2024.00340. Online publication date: 28-May-2024
  • (2024) Expert and operator perspectives on barriers to energy efficiency in data centers. Energy Efficiency, 17:6. DOI: 10.1007/s12053-024-10244-7. Online publication date: 17-Jul-2024
  • (2023) FAWS: FPGA Acceleration of Large-Scale Wave Simulations. 2023 IEEE 34th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pages 76-84. DOI: 10.1109/ASAP57973.2023.00025. Online publication date: Jul-2023
  • (2022) Scalable distributed high-order stencil computations. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pages 1-13. DOI: 10.5555/3571885.3571924. Online publication date: 13-Nov-2022
  • (2022) Scalable Distributed High-Order Stencil Computations. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-13. DOI: 10.1109/SC41404.2022.00035. Online publication date: Nov-2022
  • (2021) A 3D complexity-adaptive approach to explore sparsity in elastic wave propagation. GEOPHYSICS, 86:5, pages T321-T335. DOI: 10.1190/geo2020-0490.1. Online publication date: 16-Aug-2021
  • (2020) Design of Efficient, Dependable SoCs Based on a Cross-Layer-Reliability Approach with Emphasis on Wireless Communication as Application and DRAM Memories. Dependable Embedded Systems, pages 435-455. DOI: 10.1007/978-3-030-52017-5_18. Online publication date: 10-Dec-2020
  • (2019) NTX: An Energy-efficient Streaming Accelerator for Floating-point Generalized Reduction Workloads in 22 nm FD-SOI. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 662-667. DOI: 10.23919/DATE.2019.8715007. Online publication date: Mar-2019
  • (2018) Towards compilation of an imperative language for FPGAs. Proceedings of the 10th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, pages 47-56. DOI: 10.1145/3281287.3281291. Online publication date: 4-Nov-2018
  • (2017) Integrating DRAM power-down modes in gem5 and quantifying their impact. Proceedings of the International Symposium on Memory Systems, pages 86-95. DOI: 10.1145/3132402.3132444. Online publication date: 2-Oct-2017
