DOI: 10.1145/1572769.1572792
Research article

Understanding the efficiency of ray traversal on GPUs

Published: 01 August 2009

Abstract

We discuss the mapping of elementary ray tracing operations (acceleration structure traversal and primitive intersection) onto wide SIMD/SIMT machines. Our focus is on NVIDIA GPUs, but some of the observations should be valid for other wide machines as well. While several fast GPU tracing methods have been published, very little is actually understood about their performance. It is not known whether the methods are anywhere near the theoretically obtainable limits and, if not, what might be causing the discrepancy. We study this question by comparing measurements against a simulator that gives an upper bound on the performance of a given kernel. We observe that previously known methods are a factor of 1.5 to 2.5 off from the theoretical optimum, and that most of the gap is explained not by memory bandwidth but by previously unidentified inefficiencies in hardware work distribution. We then propose a simple solution that significantly narrows the gap between simulation and measurement. This results in the fastest GPU ray tracer to date. We provide results for primary, ambient occlusion, and diffuse interreflection rays.
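
To make the wide-SIMD concern concrete, the following is a minimal sketch of how lane utilization can be estimated when rays that need different numbers of traversal steps share a warp. It is an illustration only and not the simulator described in the abstract: the function name `simd_efficiency`, the per-ray step counts, and the assumption that rays are packed into warps in input order are all hypothetical. The warp width of 32 matches NVIDIA hardware.

```python
from typing import List

WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def simd_efficiency(steps_per_ray: List[int], warp_size: int = WARP_SIZE) -> float:
    """Fraction of issued lane-slots that do useful work.

    Assumption (for illustration): rays are packed into warps in input
    order, and a warp keeps issuing until its slowest ray has finished,
    so every lane is charged max(steps) slots.
    """
    useful = 0  # lane-slots in which a ray actually advanced
    issued = 0  # lane-slots the hardware had to issue
    for start in range(0, len(steps_per_ray), warp_size):
        warp = steps_per_ray[start:start + warp_size]
        useful += sum(warp)
        # Lanes without a ray (partial warp) still occupy issue slots.
        issued += max(warp) * warp_size
    return useful / issued if issued else 1.0


if __name__ == "__main__":
    coherent = [10] * 64           # every ray needs the same number of steps
    divergent = [1] * 31 + [32]    # one long ray stalls 31 short ones
    print(simd_efficiency(coherent))   # 1.0
    print(simd_efficiency(divergent))  # 63/1024, about 0.06
```

A model of this kind only bounds the arithmetic side of the problem. The abstract attributes most of the measured gap to hardware work distribution, which this sketch does not attempt to capture.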



Published In

HPG '09: Proceedings of the Conference on High Performance Graphics 2009
August 2009
185 pages
ISBN:9781605586038
DOI:10.1145/1572769
Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. SIMD
  2. SIMT
  3. ray tracing

Qualifiers

  • Research-article

Conference

HPG 2009: High Performance Graphics
August 1-3, 2009
New Orleans, Louisiana

Acceptance Rates

Overall Acceptance Rate 15 of 44 submissions, 34%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 105
  • Downloads (last 6 weeks): 16
Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) Simulation and Visualisation of Traditional Craft Actions. Heritage 7:12 (7083-7114). DOI: 10.3390/heritage7120328. Online publication date: 12-Dec-2024
  • (2024) GPU Coroutines for Flexible Splitting and Scheduling of Rendering Tasks. ACM Transactions on Graphics 43:6 (1-24). DOI: 10.1145/3687766. Online publication date: 19-Dec-2024
  • (2024) High-Throughput Batch Rendering for Embodied AI. SIGGRAPH Asia 2024 Conference Papers (1-9). DOI: 10.1145/3680528.3687629. Online publication date: 3-Dec-2024
  • (2024) SAH-Optimized k-DOP Hierarchies for Ray Tracing. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7:3 (1-16). DOI: 10.1145/3675391. Online publication date: 9-Aug-2024
  • (2024) HIPRT: A Ray Tracing Framework in HIP. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7:3 (1-18). DOI: 10.1145/3675378. Online publication date: 9-Aug-2024
  • (2024) Real-Time Procedural Generation with GPU Work Graphs. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7:3 (1-16). DOI: 10.1145/3675376. Online publication date: 9-Aug-2024
  • (2024) Neural Bounding. ACM SIGGRAPH 2024 Conference Papers (1-10). DOI: 10.1145/3641519.3657442. Online publication date: 13-Jul-2024
  • (2024) Faster Ray Tracing through Hierarchy Cut Code. Computer Graphics Forum 43:7. DOI: 10.1111/cgf.15226. Online publication date: 24-Oct-2024
  • (2024) Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (1041-1057). DOI: 10.1109/MICRO61859.2024.00080. Online publication date: 2-Nov-2024
  • (2023) A no-API approach to massive-parallel architectures. Keldysh Institute Preprints (1-54). DOI: 10.20948/prepr-2023-58. Online publication date: 2023
