High Performance Volume Rendering
-
Motivation:
Implement a high-performance volume rendering system on a PC. The main
focus of this project is to mask I/O time with computation time.
-
Principal Investigator: Tzi-cker Chiueh
-
People Involved: Chuan-kai Yang
-
Description: The goal of this project is to make the best use of
modern general-purpose processors, such as the Pentium family, to
do volume rendering as fast as possible. The underlying operating system
is Linux, whose newer features, such as user threading, contribute
significantly to our performance improvement. Depending on the data size
relative to the memory size, there are at least two scenarios to consider:
-
Memory-Based volume rendering: How fast can rendering be when the entire
data set is memory-resident? Tricks include replacing floating-point
calculations with integer ones wherever possible, simplifying the
algorithm, and using MMX instructions where applicable. Currently we can
render a 128x128 grey-scale image from a 128x128x128 data set within one
second on average on a 300 MHz Pentium II machine with 128 MB of memory.
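
As an illustration of the integer-for-floating-point trick, here is a
minimal sketch of front-to-back ray compositing done entirely in 8-bit
fixed-point arithmetic, next to a floating-point reference. The function
names, the 0..255 sample encoding, and the early-termination test are
assumptions made for this example, not details of the project's code, and
MMX is not shown.

```python
def composite_ray_fixed(samples):
    """Front-to-back compositing using only integer arithmetic.

    samples: iterable of (intensity, opacity) pairs, each an int in 0..255
             (hypothetical encoding chosen for this sketch).
    Returns the accumulated grey level as an int in 0..255.
    """
    color = 0            # accumulated intensity, 0..255
    transparency = 255   # remaining transparency, 255 means fully clear
    for intensity, opacity in samples:
        # intensity * opacity * transparency / 255^2, rounded to nearest
        color += (intensity * opacity * transparency + 32512) // 65025
        # transparency * (1 - opacity), rounded to nearest
        transparency = (transparency * (255 - opacity) + 127) // 255
        if transparency == 0:   # nothing behind this sample is visible
            break
    return min(color, 255)


def composite_ray_float(samples):
    """Floating-point reference for the same compositing rule."""
    color = 0.0
    transparency = 1.0
    for intensity, opacity in samples:
        alpha = opacity / 255.0
        color += (intensity / 255.0) * alpha * transparency
        transparency *= 1.0 - alpha
    return color * 255.0
```

On a short ray such as [(200, 64), (100, 128), (255, 255)], the two
versions agree to within rounding error of a grey level or two, which is
usually acceptable for an 8-bit grey-scale image.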
-
Disk-Based volume rendering: Once the data set is too large to fit in
memory, what we can usually do is decouple the I/O part from the rendering
part, so that computation and data loading overlap as much as possible.
The data set is divided into subblocks so that the I/O process can run
ahead and feed the rendering process that follows. Threads are used in
place of processes to reduce context-switching overhead. We experiment
with various subblock sizes to explore the trade-off between random-I/O
overhead and parallelism. To make full use of a loaded subblock, a group
of rays is cast through it, instead of casting a single ray all the way
through the volume. The ray group size offers another dimension to
explore: working-set size vs. parallelism. Because subblock size and
hard disk speed vary, the latency to read the first subblock from disk
into memory may differ. This makes it hard to estimate that latency
accurately with a traditional programming style. To resolve this
difficulty we employ an I/O-driven approach, in which a rendering process
is woken up only when subblocks "arrive" in memory from disk.
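
The I/O-driven idea can be sketched with two threads and a blocking
queue: the loader announces each subblock as it arrives, and the renderer
sleeps until there is something to render, so no latency estimate is
needed. This is only a minimal sketch; `read_block`, `render_block`, and
`max_in_flight` are hypothetical names introduced for the example, and the
project's actual threading mechanism is not described here.

```python
import queue
import threading


def load_subblocks(block_ids, read_block, arrivals):
    """I/O thread: read each subblock and announce its arrival."""
    for block_id in block_ids:
        arrivals.put((block_id, read_block(block_id)))
    arrivals.put(None)  # sentinel: no more subblocks will arrive


def render_on_arrival(arrivals, render_block):
    """Rendering side: sleeps in get() until a subblock has arrived."""
    results = []
    while True:
        item = arrivals.get()  # blocks; woken only when data is available
        if item is None:
            break
        block_id, data = item
        results.append(render_block(block_id, data))
    return results


def run_pipeline(block_ids, read_block, render_block, max_in_flight=4):
    """Overlap loading and rendering; the bounded queue caps memory use."""
    arrivals = queue.Queue(maxsize=max_in_flight)
    loader = threading.Thread(
        target=load_subblocks, args=(block_ids, read_block, arrivals)
    )
    loader.start()
    results = render_on_arrival(arrivals, render_block)
    loader.join()
    return results
```

For instance, `run_pipeline(range(5), lambda i: [i] * 3, lambda i, d: sum(d))`
renders five toy subblocks in arrival order. The bounded queue also keeps
the loader from running arbitrarily far ahead of the renderer.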
-
This approach can easily be extended to handle out-of-core rendering as
well. Subblocks are sorted according to their distance to the image plane
and brought into memory for rendering in that order, so each subblock is
touched only once. The sorting cost can be avoided, since there are only
six possible orders across all viewing directions.
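
The ordering step can be sketched as follows, assuming an orthographic
projection and uniform cubic subblocks, so that projecting a subblock's
grid index onto the viewing direction gives its distance to the image
plane up to a constant offset and scale. The function name and arguments
are hypothetical. The sketch sorts explicitly and does not attempt to
reproduce the precomputed six orders mentioned above, whose enumeration
is not spelled out here.

```python
from itertools import product


def front_to_back_order(grid_shape, view_dir):
    """Return subblock indices sorted by distance to the image plane.

    grid_shape: (nx, ny, nz), the number of subblocks along each axis.
    view_dir:   (dx, dy, dz), pointing from the image plane into the
                volume (orthographic projection assumed).
    """
    def depth(index):
        # Projection of the block index onto the viewing direction.
        return sum(i * d for i, d in zip(index, view_dir))

    blocks = product(*(range(n) for n in grid_shape))
    return sorted(blocks, key=depth)
```

Iterating over the returned list and loading each subblock as it comes up
touches every subblock exactly once, nearest first.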
-
Publications:
Chuan-kai Yang, Tzi-cker Chiueh, "I/O-Conscious Volume Rendering,"
in VisSym '01, Joint Eurographics - IEEE TCVG Symposium on Visualization,
Ascona, Switzerland, May 2001.
-
Related links: