Parallel rendering

Parallel rendering (or distributed rendering) is a technique for improving the performance of computer graphics software by dividing the rendering work across multiple processors or machines. Rendering complex data sets, such as those found in medical visualization, iso-surface generation, and some CAD applications, requires massive computational resources, and traditional methods such as ray tracing or 3D-texture-based volume rendering can be prohibitively slow on a single machine. Parallel rendering is also used by virtual reality and visual simulation applications that render to multiple display systems concurrently.

Subdivision of work

Parallel rendering divides the rendering work into pieces that are processed concurrently. For example, a serial ray-casting application casts rays one by one through all the pixels of the view frustum. A parallel version instead divides the frustum into some number x of tiles and runs x threads or processes, each casting the rays for its own tile. The tiles can also be distributed across a cluster of machines, and the partial results composited into the final image. This is parallel rendering.
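As a rough illustration, the following Python sketch applies this tile-based decomposition using worker processes in place of cluster nodes; cast_ray and render_tile are hypothetical placeholders rather than part of any particular renderer.

    # Minimal sketch of tile-based parallel rendering: the image is split into
    # horizontal tiles, each tile is ray-cast in its own worker process, and
    # the disjoint tiles are then composited (concatenated) into the final
    # image. cast_ray is a trivial stand-in for a real ray caster.
    from concurrent.futures import ProcessPoolExecutor

    WIDTH, HEIGHT, NUM_TILES = 640, 480, 4

    def cast_ray(x, y):
        """Placeholder for a real ray caster; returns one grey value per pixel."""
        return (x ^ y) & 0xFF

    def render_tile(tile_index):
        """Render one horizontal strip of the image."""
        rows_per_tile = HEIGHT // NUM_TILES
        y0 = tile_index * rows_per_tile
        y1 = HEIGHT if tile_index == NUM_TILES - 1 else y0 + rows_per_tile
        return [[cast_ray(x, y) for x in range(WIDTH)] for y in range(y0, y1)]

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=NUM_TILES) as pool:
            tiles = list(pool.map(render_tile, range(NUM_TILES)))
        image = [row for tile in tiles for row in tile]
        print(len(image), "rows rendered")

Because the tiles are disjoint, the compositing step here is a simple concatenation; the schemes described below differ mainly in how the work is divided and how the results are recombined.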

Non-interactive parallel rendering

Non-interactive (offline) rendering is a classic example of an embarrassingly parallel workload: the frames to be rendered are distributed among the available compute nodes, with each node rendering complete frames independently and no communication needed between nodes. A more tightly coupled approach instead distributes the work for a single frame across multiple nodes; this requires cross-communication between the nodes, but it can shorten the time to produce each frame considerably, so that a rendering job consisting of many frames can be reviewed and revised much more quickly, allowing designers to iterate on their work faster.
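A minimal sketch of the embarrassingly parallel case, assuming a hypothetical render_frame function, simply hands whole frames to a pool of workers:

    # Each animation frame is rendered independently, so frames can be handed
    # out to worker processes (standing in for render-farm nodes) without any
    # communication between them.
    from concurrent.futures import ProcessPoolExecutor

    NUM_FRAMES = 240
    NUM_NODES = 8   # stand-in for the number of compute nodes

    def render_frame(frame_number):
        """Placeholder for rendering one full frame; returns an output name."""
        return "frame_%04d.png" % frame_number

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=NUM_NODES) as pool:
            outputs = list(pool.map(render_frame, range(NUM_FRAMES)))
        print("rendered", len(outputs), "frames")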

Interactive parallel rendering

In interactive parallel rendering, there are several approaches to distributing the rendering work, each with its own advantages and disadvantages.

Sort-first

Sort-first rendering decomposes the final view in screen space, that is, each contributor renders a 2D tile of the final view.[1] This mode has limited scalability because of the parallel overhead caused by primitives that overlap several tiles and therefore have to be processed by more than one contributor.

A typical example of sort-first rendering is a video wall. Each computer in the wall renders the portion of the viewing volume, or viewing frustum, that corresponds to its own display, and the final image is the mosaic formed by the monitors that make up the wall. The speedup comes from the fact that graphics libraries (OpenGL, for example) clip away geometry that falls outside the viewing volume. Because this happens early in the graphics pipeline, rasterization and later per-fragment processing are avoided for primitives that would not appear on a given tile anyway.
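The screen-space decomposition amounts to giving each node an asymmetric sub-frustum of the full view. The following sketch computes the sub-frustum bounds for the tiles of a purely illustrative 2x2 wall; the function name and layout are assumptions made for the example, not a standard API.

    # Compute the (left, right, bottom, top) near-plane bounds of one tile's
    # sub-frustum, suitable for an asymmetric projection such as glFrustum.
    def tile_frustum(left, right, bottom, top, col, row, cols, rows):
        tile_w = (right - left) / cols
        tile_h = (top - bottom) / rows
        return (left + col * tile_w,
                left + (col + 1) * tile_w,
                bottom + row * tile_h,
                bottom + (row + 1) * tile_h)

    # Full frustum at the near plane, split across a 2x2 wall of displays.
    full = (-1.0, 1.0, -0.75, 0.75)
    for row in range(2):
        for col in range(2):
            print((col, row), tile_frustum(*full, col, row, 2, 2))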

Sort-last

Sort-last rendering, on the other hand, decomposes the rendered database (the scene data) across all rendering units and recombines the partially rendered frames. This mode scales the rendering itself very well, but the recomposition step is expensive due to the amount of pixel data that has to be processed.

In a typical sort-last configuration, one computer acts as the master. It receives the partial images produced by the other computers, composites them into the final image, and displays the result on its own monitor.
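For opaque geometry, the recomposition usually keeps, for every pixel, the fragment with the smallest depth value. The following sketch performs such a depth composite on tiny hand-written "images"; the data and function name are illustrative only.

    # Composite full-resolution (colour, depth) images from several nodes by
    # keeping the nearest fragment per pixel, as in sort-last rendering of
    # opaque geometry.
    def depth_composite(partials):
        """partials: list of (color_image, depth_image) pairs of equal size."""
        first_color, first_depth = partials[0]
        out_color = [row[:] for row in first_color]
        out_depth = [row[:] for row in first_depth]
        for c_img, d_img in partials[1:]:
            for y, row in enumerate(d_img):
                for x, d in enumerate(row):
                    if d < out_depth[y][x]:        # nearer fragment wins
                        out_depth[y][x] = d
                        out_color[y][x] = c_img[y][x]
        return out_color

    node_a = ([["red", "red"], ["red", "red"]], [[0.5, 0.5], [0.9, 0.9]])
    node_b = ([["blue", "blue"], ["blue", "blue"]], [[0.7, 0.3], [0.4, 1.0]])
    print(depth_composite([node_a, node_b]))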

Pixel decompositions

Pixel decompositions divide the pixels of the final view evenly among the rendering resources, either at the level of full pixels or of sub-pixels. A full-pixel decomposition "squeezes" the view frustum so that each resource produces a subset of the final pixels, whereas a sub-pixel decomposition renders the same scene several times with slightly modified camera positions, for example for full-screen anti-aliasing or depth-of-field effects. Full-pixel decompositions composite the pixels side by side, while sub-pixel decompositions blend all sub-pixel samples to compute each final pixel.

In contrast to sort-first rendering, no sorting of primitives takes place, since all rendering resources render more or less the same view. Pixel decompositions are inherently load-balanced and are well suited to purely fill-limited applications such as ray tracing and 3D volume rendering.
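A minimal sketch of the sub-pixel case, assuming a hypothetical render_with_jitter function, renders the scene once per resource with a small camera jitter and averages the results:

    # Every rendering resource draws the whole view with a slightly jittered
    # camera; the final anti-aliased pixel is the blend (average) of all
    # sub-pixel samples.
    import random

    WIDTH, HEIGHT, NUM_SAMPLES = 4, 3, 4

    def render_with_jitter(dx, dy):
        """Placeholder: render the scene with the camera offset by (dx, dy) pixels."""
        return [[(x + dx) * (y + dy) for x in range(WIDTH)] for y in range(HEIGHT)]

    offsets = [(random.random() - 0.5, random.random() - 0.5)
               for _ in range(NUM_SAMPLES)]
    samples = [render_with_jitter(dx, dy) for dx, dy in offsets]

    final = [[sum(s[y][x] for s in samples) / NUM_SAMPLES for x in range(WIDTH)]
             for y in range(HEIGHT)]
    print(final[0])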

Others

DPlex rendering distributes full, alternating frames to the individual rendering nodes. It scales very well but increases the latency between user input and final display, which is often irritating for the user. Stereo decomposition is used for immersive applications, where the passes for the individual eyes are rendered by different rendering units; passive stereo systems are a typical example of this mode.
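In a DPlex decomposition the frame-to-node assignment is a simple round-robin, which also makes the source of the added latency easy to see: with N nodes, roughly N frames are in flight before the frame corresponding to the latest user input reaches the screen. A trivial sketch of the assignment:

    # DPlex (time-multiplex) decomposition: frame i is rendered by node i % N,
    # so roughly N frames are in flight at any time, which adds input latency.
    NUM_NODES = 4

    def node_for_frame(frame_index):
        return frame_index % NUM_NODES

    for frame in range(8):
        print("frame", frame, "-> node", node_for_frame(frame))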

Parallel rendering thus allows graphics-intensive applications to visualize their data more efficiently simply by adding rendering resources, such as additional machines.

Open source applications

The open source software package Chromium (http://chromium.sourceforge.net) provides a parallel rendering mechanism for existing applications. It intercepts OpenGL calls and processes them, typically to send them to multiple rendering units driving a display wall.

Equalizer (http://www.equalizergraphics.com) is an open source rendering framework and resource management system for multipipe applications. Equalizer provides an API to write parallel, scalable visualization applications which are configured at run-time by a resource server.

OpenSG (http://opensg.vrsource.org/trac) is an open source scenegraph system that provides parallel rendering capabilities, especially on clusters. It hides the complexity of parallel multi-threaded and clustered applications and supports sort-first as well as sort-last rendering.

References

  1. ^ Molnar, S., M. Cox, D. Ellsworth, and H. Fuchs. "A Sorting Classification of Parallel Rendering." IEEE Computer Graphics and Applications, pages 23–32, July 1994.
