Kim et al., 2012 - Google Patents
A reconfigurable SIMT processor for mobile ray tracing with contention reduction in shared memory
- Document ID
- 2278130565138510048
- Author
- Kim H
- Kim Y
- Oh J
- Kim L
- Publication year
- 2012
- Publication venue
- IEEE Transactions on Circuits and Systems I: Regular Papers
Snippet
In this paper, we present a reconfigurable SIMT multi-core processor with a shared memory for mobile ray tracing. The proposed processor addresses two issues of SIMT architecture: branch divergence of concurrently executed threads and contention in a shared memory …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G06F9/3891—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/50—Lighting effects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/06—Ray-tracing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kim et al. | | A reconfigurable SIMT processor for mobile ray tracing with contention reduction in shared memory |
Zaruba et al. | | Snitch: A tiny pseudo dual-issue processor for area and energy efficient execution of floating-point intensive workloads |
Tine et al. | | Vortex: Extending the RISC-V ISA for GPGPU and 3D-graphics |
Lindholm et al. | | NVIDIA Tesla: A unified graphics and computing architecture |
US10409319B2 (en) | | System, apparatus and method for providing a local clock signal for a memory array |
Wong et al. | | Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor |
CN112907711A (en) | | Using a single Instruction Set Architecture (ISA) instruction for vector normalization |
US12039001B2 (en) | | Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs |
CN113094298A (en) | | Mechanism to partition shared local memory |
CN112233010A (en) | | Partial write management in a multi-block graphics engine |
US20210365402A1 (en) | | Computing efficient cross channel operations in parallel computing machines using systolic arrays |
US11914438B2 (en) | | Repeating graphics render pattern detection |
Kopta et al. | | Memory considerations for low energy ray tracing |
Spjut et al. | | TRaX: A multicore hardware architecture for real-time ray tracing |
Bouvier et al. | | Kabini: An AMD accelerated processing unit system on a chip |
Mahesri et al. | | Tradeoffs in designing accelerator architectures for visual computing |
Bush et al. | | Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads |
Kim et al. | | MRTP: Mobile ray tracing processor with reconfigurable stream multi-processors for high datapath utilization |
Kopta et al. | | Efficient MIMD architectures for high-performance ray tracing |
JP2021099779A (en) | | Page table mapping mechanism |
US11669490B2 (en) | | Computing efficient cross channel operations in parallel computing machines using systolic arrays |
US20220164504A1 (en) | | Technologies for circuit design |
Sanchez-Elez et al. | | Algorithm optimizations and mapping scheme for interactive ray tracing on a reconfigurable architecture |
Lee et al. | | Real-time ray tracing on future mobile computing platform |
Ramani et al. | | Streamray: a stream filtering architecture for coherent ray tracing |