GB2465812A - Distributed processing for rendering 3D images - Google Patents
Distributed processing for rendering 3D images
- Publication number
- GB2465812A
- Authority
- GB
- United Kingdom
- Prior art keywords
- processing units
- graphics processing
- image
- graphics
- rectangular area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Image Generation (AREA)
Abstract
A method and apparatus are provided for rendering a three dimensional computer graphics image. The image is sub-divided into a plurality of rectangular areas each associated with a rectangular portion of a display. Graphics image data relating to objects to be rendered is provided and assigned to respective ones of object lists associated with each respective rectangular area. The object lists for each rectangular area are passed to a distribution means coupled to a plurality of graphics processing units. The distribution means determines which graphics processing units are able to receive data for processing and passes the object lists to individual ones of the processing units in dependence on the result of the determination.
Description
Multi-Core Rasterisation in a Tile Based Rendering System
This invention relates to a three-dimensional computer graphics rendering system and in particular to methods and apparatus which may be used for combining multiple independent graphics processing cores for the purpose of increasing rasterisation performance.
Background to the Invention
It is desirable to offer computer graphics processing cores at many different performance points. However, the complexity of modern computer graphics makes it difficult to do this in either a timely or cost effective manner. As such, it is desirable to have a method of combining multiple independent processing cores such that performance may be increased without developing a whole new core.
Tile based rendering systems are well-known. These subdivide an image into a plurality of rectangular blocks or tiles. Figure 1 illustrates an example of a tile based rendering system. A primitive/command fetch unit 101 retrieves command and (graphics) primitive data from memory and passes this to a geometry processing unit 102. This transforms the primitive and command data into screen space using well-known methods. This data is then supplied to a tiling unit 103 which inserts object data from the screen space geometry into object lists for each of a set of defined rectangular regions or tiles into which screen space is divided. An object list for each tile contains primitives that exist wholly or partially in that tile. An object list exists for every tile on the screen, although some object lists may have no data in them. These object lists are fetched by a tile parameter fetch unit 105 which supplies them tile by tile to a hidden surface removal unit (HSR) 106 which removes the primitives of surfaces which will not contribute to the final scene (usually because they are obscured by another surface). The HSR unit processes each primitive in the tile to determine which pixels are visible and passes only data for visible pixels to a testing and shading unit (TSU) 108.
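The tiling step described above can be illustrated with a minimal sketch. It assumes triangles given as screen space vertex tuples and bins each one by its bounding box, which is a conservative simplification rather than the method of tiling unit 103; the function name `build_object_lists` and all parameters are hypothetical.

```python
import math

def build_object_lists(primitives, screen_w, screen_h, tile_w, tile_h):
    """Return {(tx, ty): [primitive indices]} with one list per tile on screen.

    Assumption: a primitive is binned into every tile its bounding box
    touches, so a list may name a primitive that only nearly overlaps.
    """
    tiles_x = (screen_w + tile_w - 1) // tile_w
    tiles_y = (screen_h + tile_h - 1) // tile_h
    # Every tile gets a list, even if it ends up empty.
    object_lists = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for index, vertices in enumerate(primitives):
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        # Clamp the bounding box to the tile grid; off-screen primitives
        # produce an empty range and are skipped.
        min_tx = max(0, math.floor(min(xs) / tile_w))
        max_tx = min(tiles_x - 1, math.floor(max(xs) / tile_w))
        min_ty = max(0, math.floor(min(ys) / tile_h))
        max_ty = min(tiles_y - 1, math.floor(max(ys) / tile_h))
        for ty in range(min_ty, max_ty + 1):
            for tx in range(min_tx, max_tx + 1):
                object_lists[(tx, ty)].append(index)
    return object_lists

triangles = [
    [(2, 2), (10, 2), (2, 10)],       # lies inside the top-left tile
    [(20, 20), (40, 20), (20, 40)],   # bounding box spans all four tiles
]
lists = build_object_lists(triangles, 64, 64, 32, 32)
```

Because the lists are keyed by non-overlapping tiles, each one can later be handed to a rasterisation core independently of the others.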
The TSU takes the data from the HSR and uses it to fetch textures and apply shading to each pixel within a visible object using well-known techniques. The TSU then supplies the textured and shaded data to an alpha test/fogging/alpha blending unit 110. This is able to apply degrees of transparency/opacity to the surfaces again using well-known techniques. Alpha blending is performed using an on chip tile buffer 112 thereby eliminating the requirement to access external memory for this operation. It should be noted that the TSU and alpha test/fogging/alpha blend units may be fully programmable in nature.
Once each tile has been completed, the pixel processing unit 114 performs any necessary backend processing such as packing and anti-alias filtering before writing the resulting data to a rendered scene buffer 116, ready for display.
British Patent No. GB2343598 (the contents of which are incorporated herein by reference) describes scaling rasterisation performance within a tile based rendering environment by distributing workload across cores by rasterising alternate tiles on alternate cores, for example in a chequer board pattern.
Although this approach minimises the effects of uneven distribution of work load across the tiles that make up the scene, it doesn't allow for all circumstances. For example, consider the image in Figure 2. Triangles 1 and 2 (200, 210) in tile 0 require a total of 600 clocks of processing, while the triangles overlapping each of the remaining tiles, T3, T4 and T5 (220, 230 and 240), each require 200 clocks of pixel processing. Given two cores processing the tiles in a chequer board arrangement as shown, core 1 will execute a total of 800 clocks of pixel processing and core 2 will execute a total of 400 clocks of pixel processing. As a result, core 2 will remain idle for 400 clocks. This is a significant imbalance in processing load between the two processing cores.
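The arithmetic of this example can be checked with a short sketch. It assumes the four tiles are laid out as a 2x2 grid numbered row by row (Figure 2 is not reproduced here, so the layout is an assumption), and it indexes the two cores from 0, so index 0 corresponds to core 1 in the text. The function name `chequer_board_loads` is hypothetical.

```python
# Pixel-processing cost in clocks for tiles 0..3, taken from the example above.
TILE_COSTS = [600, 200, 200, 200]
TILES_PER_ROW = 2  # assumed 2x2 layout

def chequer_board_loads(tile_costs, tiles_per_row, num_cores=2):
    """Sum the clocks each core executes under a fixed chequer board assignment."""
    loads = [0] * num_cores
    for tile, cost in enumerate(tile_costs):
        tx, ty = tile % tiles_per_row, tile // tiles_per_row
        # Alternate cores along both axes, independent of the work per tile.
        loads[(tx + ty) % num_cores] += cost
    return loads

loads = chequer_board_loads(TILE_COSTS, TILES_PER_ROW)
idle = max(loads) - min(loads)
print(loads, idle)  # [800, 400] 400
```

The fixed pattern sends tiles 0 and 3 to one core and tiles 1 and 2 to the other, which reproduces the 800 and 400 clock totals and the 400 idle clocks.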
Summary of the Invention
Preferred embodiments of the present invention provide a method and apparatus that allow a tile based rendering system to scale rasterisation performance in a linear fashion and that minimise the loading differences across a plurality of cores. This is accomplished by the addition of separate region fetch and distribution units that distribute regions to be processed across multiple cores based on work load within each core.
Brief Description of the Drawings
Preferred embodiments of the invention will now be described in detail by way of example with reference to the accompanying drawings in which: Figure 1 illustrates an example of a prior art tile based rendering system as described above; Figure 2 illustrates the load balancing problem that occurs with a fixed assignment of cores to regions as described above; and Figure 3 illustrates a system embodying the invention.
Detailed Description of Preferred Embodiments
The output of the tiling process in a graphics rasterisation system is a set of object lists for non-overlapping regions, each of which contains references to all geometry that overlaps its respective region. As there is no spatial overlap between regions, it is possible to rasterise the regions in any order. Given this, it is possible to distribute regions across processing cores for rasterisation in an order that is dictated by loading on the individual processing cores instead of some predetermined spatial arrangement as described in our previous United Kingdom Patent No. GB2343598.
Figure 3 illustrates a proposed arrangement embodying the invention. A region fetch unit 300 reads region header data (including object list data) from memory and passes it to a region distribution unit 310, which passes the region data for processing to the least busy processing core within an array of available cores 340. Each core within the array of cores receives a signal or set of signals 330 from the region distribution unit 310 including data about the region to be processed by the processing core. Each processing core produces a return signal 320 that indicates whether or not the core can take a new region for processing at that time.
In the scenario shown in Figure 2 the system, when equipped with two processing cores, will operate as follows. The region fetch unit 300 fetches region data for tile 0 and passes it to the region distribution unit 310. As all processing cores will initially be idle, the region distribution unit will distribute data for the regions to the available cores in a round robin fashion, starting with core 0 for tile 0, core 1 for tile 1, and so on. The region distribution unit will then need to wait for one of the processing cores to be able to accept another region before proceeding to distribute tile 2. As core 1 will only have 200 clocks of processing for tile 1, it will become free first, so the region distribution unit will pass tile 2 to it for processing, and again tile 3 when tile 2 is complete.
Therefore, initial distribution takes place sequentially, but as different processing cores take different times to perform rasterisation, the system monitors availability of cores and distributes data in dependence on this.
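The availability-driven distribution can be sketched as a small simulation. This is a software model under stated assumptions, not the hardware of Figure 3: each core is assumed to hold one region at a time, the return signal is modelled as the clock at which a core next becomes free, and ties go to the lowest-numbered core, which gives the initial round robin order. The function name `distribute_regions` is hypothetical.

```python
import heapq

def distribute_regions(region_costs, num_cores):
    """Give each region, in order, to the core that becomes free first.

    Returns (regions handled per core, clock at which the scene completes).
    """
    # Heap entries are (clock at which the core is free, core index).
    free_at = [(0, core) for core in range(num_cores)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(num_cores)]
    for region, cost in enumerate(region_costs):
        clock, core = heapq.heappop(free_at)   # earliest available core
        assignment[core].append(region)
        heapq.heappush(free_at, (clock + cost, core))
    # The scene is complete once no regions remain and every core is idle.
    finish = max(clock for clock, _ in free_at)
    return assignment, finish

# Tile costs from the Figure 2 example: tile 0 needs 600 clocks, the rest 200.
assignment, finish = distribute_regions([600, 200, 200, 200], num_cores=2)
print(assignment, finish)  # [[0], [1, 2, 3]] 600
```

Under these assumptions both cores finish at 600 clocks, compared with 800 clocks for the fixed chequer board assignment of the same tiles.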
It should be noted that completion of rasterisation of the whole scene will be signalled by the region distribution unit when there are no more regions to be processed and all processing cores signal that they are idle.
In a modification to the invention it is proposed that the region fetch unit and the region distribution unit allow subsequent renders for the next field or frame to be executed in the case where one or more of the processing cores has become idle and there are no more regions to be rasterised in the current scene.
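The effect of this modification can be illustrated by comparing two policies in a simplified model: one that waits for every core to become idle before starting the next frame, and one that lets an idle core start on the next frame straight away. The per-region costs, the one-region-per-core assumption and the function name `render_frames` are all hypothetical and chosen only for illustration.

```python
import heapq

def render_frames(frames, num_cores, overlap):
    """Return the clock at which the last frame finishes.

    frames  : list of per-frame lists of region costs in clocks
    overlap : if True, idle cores may start the next frame immediately;
              if False, the next frame waits until all cores are idle
    """
    free_at = [(0, core) for core in range(num_cores)]
    heapq.heapify(free_at)
    for costs in frames:
        for cost in costs:
            clock, core = heapq.heappop(free_at)
            heapq.heappush(free_at, (clock + cost, core))
        if not overlap:
            # Barrier: no core may proceed until the whole frame is done.
            done = max(clock for clock, _ in free_at)
            free_at = [(done, core) for _, core in free_at]
            heapq.heapify(free_at)
    return max(clock for clock, _ in free_at)

frames = [[600, 200, 200], [300, 300, 300]]
print(render_frames(frames, 2, overlap=False))  # 1200
print(render_frames(frames, 2, overlap=True))   # 1000
```

In this example the second core is idle from clock 400 while the first is still busy with a 600 clock region, so letting it begin the next frame saves 200 clocks overall.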
Claims (8)
- 1. A method for rendering a three dimensional computer graphics image comprising the steps of subdividing an image into a plurality of rectangular areas, each associated with a rectangular portion of a display; providing graphics image data relating to objects in the image to be rendered; for each rectangular area, assigning data associated with each object to an object list associated with the rectangular area; passing the object lists for each rectangular area to a distribution means coupled to a plurality of graphics processing units; determining at the distribution means which graphics processing units are available to receive data for processing; and passing object lists to individual ones of the graphics processing units in dependence on the result of the determination.
- 2. A method according to claim 1 including the step of providing a return signal from each graphics processing unit indicating its availability to accept data, for use in the determining step.
- 3. A method according to any preceding claim including the step of commencing rendering of a subsequent field or frame when one or more graphics processing units become idle.
- 4. A method according to any preceding claim in which the distribution means determines whether or not graphics processing units are idle.
- 5. A system for rendering a three dimensional computer graphics image comprising: means for sub-dividing an image into a plurality of rectangular areas, each associated with a rectangular portion of a display; means for providing graphics image data relating to objects in the image to be rendered; means for assigning data associated with each object for each rectangular area to an object list associated with the rectangular area; means for passing the object lists for each rectangular area to a distribution means coupled to a plurality of graphics processing units; means for determining at the distribution means which graphics processing units are able to receive data for processing; and means for passing object lists to individual ones of the graphics processing units in dependence on the result of the determination.
- 6. A system according to claim 5 including means for providing a return signal from each graphics processing unit indicating its availability to accept data for use in the determining step.
- 7. A system according to claim 5 or 6 including means for commencing rendering of a subsequent frame when one or more graphics processing units become idle.
- 8. A system according to any of claims 5 to 7 in which the distribution means determines whether or not a graphics processing unit is idle.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0821938A GB2465812A (en) | 2008-12-01 | 2008-12-01 | Distributed processing for rendering 3D images |
Publications (2)
Publication Number | Publication Date |
---|---|
GB0821938D0 (en) | 2009-01-07 |
GB2465812A (en) | 2010-06-02 |
Family
ID=40262492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB0821938A (Withdrawn) GB2465812A (en) | Distributed processing for rendering 3D images | 2008-12-01 | 2008-12-01 |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2465812A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0394384A (en) * | 1989-09-07 | 1991-04-19 | Matsushita Electric Ind Co Ltd | Multiprocessor type image processor |
US5655120A (en) * | 1993-09-24 | 1997-08-05 | Siemens Aktiengesellschaft | Method for load balancing in a multi-processor system where arising jobs are processed by a plurality of processors under real-time conditions |
US20030160794A1 (en) * | 2002-02-28 | 2003-08-28 | Pascual Mark E. | Arbitration scheme for efficient parallel processing |
WO2005050557A2 (en) * | 2003-11-19 | 2005-06-02 | Lucid Information Technology Ltd. | Method and system for multiple 3-d graphic pipeline over a pc bus |
WO2006083045A2 (en) * | 2005-02-07 | 2006-08-10 | Sony Computer Entertainment Inc. | Method and apparatus for particle manipulation using graphics processing |
Also Published As
Publication number | Publication date |
---|---|
GB0821938D0 (en) | 2009-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2227781B1 (en) | Multi-core geometry processing in a tile based rendering system | |
US11017589B2 (en) | Untransformed display lists in a tile based rendering system | |
EP3286738B1 (en) | Apparatus and method for non-uniform frame buffer rasterization | |
WO2009068893A1 (en) | Multi-core rasterisation in a tile based rendering system | |
JP4489806B2 (en) | Scalable shader architecture | |
JP4480895B2 (en) | Image processing device | |
EP2596491B1 (en) | Displaying compressed supertile images | |
US20080211805A1 (en) | Method and System for Minimizing an Amount of Data Needed to Test Data Against Subarea Boundaries in Spatially Composited Digital Video | |
US9105125B2 (en) | Load balancing for optimal tessellation performance | |
JP2012178158A (en) | Parallel array architecture for graphics processor | |
US8928679B2 (en) | Work distribution for higher primitive rates | |
TWI601096B (en) | Method and apparatus for direct and interactive ray tracing of a subdivision surface | |
CN106575440A (en) | Constant buffer size multi-sampled anti-aliasing depth compression | |
US12051154B2 (en) | Systems and methods for distributed rendering using two-level binning | |
GB2465812A (en) | Distributed processing for rendering 3D images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |