DOI: 10.1145/2492045.2492058
research-article

An energy and bandwidth efficient ray tracing architecture

Published: 19 July 2013

Abstract

We propose two hardware mechanisms to decrease energy consumption on massively parallel graphics processors for ray tracing while keeping performance high. First, we use a streaming data model and configure part of the L2 cache into a ray stream memory to enable efficient data processing through ray reordering. This increases the L1 hit rate and reduces off-chip memory accesses substantially. Second, we employ reconfigurable special-purpose pipelines that are constructed dynamically under program control. These pipelines use shared execution units (XUs) that can be configured to support the common compute kernels that are the foundation of the ray tracing algorithm, such as acceleration structure traversal and triangle intersection. This reduces the overhead incurred by memory and register accesses. These two synergistic features yield a ray tracing architecture that significantly reduces both power consumption and off-chip memory traffic when compared to a more traditional cache-only approach.
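As a rough illustration of why grouping rays into streams can raise the cache hit rate, the following toy model compares a small LRU cache under two access orders: rays processed in arrival order, and the same rays regrouped by the scene-data block they touch. This is a software sketch under stated assumptions, not the hardware design described in the abstract. The LRU policy, the cache size, the workload, and the names `count_hits` and `reorder_into_streams` are all hypothetical.

```python
from collections import OrderedDict, defaultdict


def count_hits(block_ids, cache_lines):
    """Count hits for a tiny LRU cache that holds `cache_lines` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in block_ids:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used block
    return hits


def reorder_into_streams(rays):
    """Group ray ids by the scene-data block each ray will touch next."""
    streams = defaultdict(list)
    for ray_id, block in rays:
        streams[block].append(ray_id)
    return streams


# Hypothetical workload: 12 rays cycling over 4 data blocks (assumed values).
rays = [(ray_id, ray_id % 4) for ray_id in range(12)]

# Access order without reordering: blocks are touched as the rays arrive.
arrival_order = [block for _, block in rays]

# Access order with reordering: each stream is drained before the next one.
streams = reorder_into_streams(rays)
stream_order = [block for block, ids in streams.items() for _ in ids]

unordered_hits = count_hits(arrival_order, cache_lines=2)
reordered_hits = count_hits(stream_order, cache_lines=2)
print(unordered_hits, reordered_hits)
```

In this contrived workload the arrival order thrashes the two-line cache, while the stream order misses only on the first ray of each stream. Real scenes are far less regular, so the sketch shows the direction of the effect rather than its magnitude.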



Published In
HPG '13: Proceedings of the 5th High-Performance Graphics Conference
July 2013
149 pages
ISBN: 9781450321358
DOI: 10.1145/2492045

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bandwidth reduction
  2. energy reduction
  3. persistent pipelines
  4. ray tracing
  5. streaming

Qualifiers

  • Research-article

Conference

HPG '13: High Performance Graphics
July 19 - 21, 2013
Anaheim, California

Acceptance Rates

HPG '13 Paper Acceptance Rate 15 of 44 submissions, 34%;
Overall Acceptance Rate 15 of 44 submissions, 34%


Article Metrics

  • Downloads (Last 12 months): 19
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 11 Dec 2024


Cited By

  • (2022) Mach-RT: A Many Chip Architecture for High Performance Ray Tracing. IEEE Transactions on Visualization and Computer Graphics 28:3 (1585-1596). DOI: 10.1109/TVCG.2020.3021048. Online publication date: 1-Mar-2022
  • (2021) A Survey on Bounding Volume Hierarchies for Ray Tracing. Computer Graphics Forum 40:2 (683-712). DOI: 10.1111/cgf.142662. Online publication date: 4-Jun-2021
  • (2020) Hardware-Accelerated Dual-Split Trees. Proceedings of the ACM on Computer Graphics and Interactive Techniques 3:2 (1-21). DOI: 10.1145/3406185. Online publication date: 26-Aug-2020
  • (2019) Examination of the Nvidia RTX. GraphiCon'2019 Proceedings. Volume 2 (7-12). DOI: 10.30987/graphicon-2019-2-7-12. Online publication date: 5-Nov-2019
  • (2019) Mach-RT. Proceedings of the Conference on High-Performance Graphics (1-6). DOI: 10.2312/hpg.20191188. Online publication date: 8-Jul-2019
  • (2018) A detailed study of ray tracing performance. The Visual Computer: International Journal of Computer Graphics 34:6-8 (875-885). DOI: 10.1007/s00371-018-1532-8. Online publication date: 1-Jun-2018
  • (2017) Unleashing the power of GPU for physically-based rendering via dynamic ray shuffling. Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (560-573). DOI: 10.1145/3123939.3124532. Online publication date: 14-Oct-2017
  • (2017) Dual streaming for hardware-accelerated ray tracing. Proceedings of High Performance Graphics (1-11). DOI: 10.1145/3105762.3105771. Online publication date: 28-Jul-2017
  • (2017) Toward Real-Time Ray Tracing. ACM Computing Surveys 50:4 (1-41). DOI: 10.1145/3104067. Online publication date: 30-Aug-2017
  • (2017) Power and energy implications of misunderstanding DRAM. 2017 Eighth International Green and Sustainable Computing Conference (IGSC) (1-6). DOI: 10.1109/IGCC.2017.8323589. Online publication date: Oct-2017
