ParVox: A Parallel Volume Rendering System for
Scientific Visualization
Objective:
Develop a portable and scalable parallel volume rendering system for
teraflop supercomputers to support the distributed visualization needs of
HPCC Grand Challenge applications and the general science community.
The rendering system can visualize large volumes of 4-D simulation and modeling
data that exceed what existing workstations and network bandwidth
can handle.
Approach:
ParVox was developed under the HPCC Program ESS Project over the past four
years. The ParVox system consists of three major components:
-
A parallel volume rendering API for time-varying 3-D scalar fields in structured
grids.
-
An X Window-based GUI with multiple control panels for interactive control
of visualization parameters and viewing positions.
-
A network interface connecting the renderer to the GUI and supporting image
compression and various output devices.
The ParVox API was first implemented using Cray's Shmem library and later
ported to MPI 2.0 using its one-sided communication API. The Shmem version
runs on the Cray T3D and T3E machines, and the MPI version runs on the HP
Exemplar system. The ParVox GUI runs on Unix workstations from
Sun, SGI, and HP and on PCs running Linux.
Accomplishments:
ParVox 1.0 (Parallel Voxel Renderer), a parallel and distributed
volume rendering system, was released in Feb. 1999 and is available at the
JPL HPCC Software Repository. ParVox 1.0 includes (1) a parallel rendering
library for structured grid 4D datasets running on the Cray T3D and T3E,
(2) a parallel input/output library supporting files in NetCDF format or
in raw binary format, (3) a parallel wavelet compression library, (4) a
network interface program that works together with the ParVox GUI, and (5)
an interactive GUI that runs on SGI, Sun, HP, and PC/Linux.
In FY99, we concentrated our effort on both bug fixes and functional
enhancements to ParVox 1.0. Three major milestones were accomplished:
-
Added a functional pipeline and out-of-core rendering capability.
The ParVox functions were separated into three functional modules:
the input module, the render module, and the output module.
The three modules can be placed on different partitions of a single parallel
machine or on three different machines, and they are interconnected using
MPI communication calls (see Figure 1). The functional pipeline has
three advantages: 1) it increases parallelism and performance
by pipelining the input, rendering, and compression operations, 2) it makes
it easy to plug in additional input or render modules for new data formats or new
rendering algorithms, and 3) it allows distributed rendering in heterogeneous
environments. Out-of-core rendering capability was added to
ParVox after the functional pipeline was in place. For time-sequence
animation, a specified subset of the data can be prefetched into memory
at initialization. When the renderer is ready to render a dataset that is not
currently resident in memory, resident data is swapped out and the new
data is read in. This out-of-core rendering capability
allows ParVox to render 4D datasets whose total size exceeds the physical memory
of the machine. The current implementation has one limitation:
each time step has to be loaded into memory as a whole. If a single time
step does not fit into the physical memory of the parallel machine,
ParVox cannot handle it.
Figure 1. The ParVox Functional Pipeline
-
Ported a cell-projection renderer for unstructured grid datasets.
A new parallel volume rendering algorithm for 4D unstructured grid
datasets was added to the ParVox visualization system. The algorithm
is based on Ma's cell-projection algorithm for the IBM SP-2. It was implemented
using the MPI 1.2 communication API and was ported to the ParVox functional
pipeline as an additional render module. A new input module was also
written for multiple-variable unstructured grid datasets in NetCDF format.
The algorithm was modified and tuned to improve the performance of input,
rendering, and output operations. Figure 2 is a rendered image of
an unstructured grid finite element dataset provided by Tom Cwik.
Figure 2. Unstructured Grid Data Rendering
-
Accomplished the HPCC Level 1 milestone "Demonstrate portable scalable distributed
visualization of multi-terabyte 4D data sets on TeraFLOPS scalable systems".
The data used in this demo consisted of 245 time steps of 1/6 degree global ocean
circulation datasets generated by the COSIM
(Climate, Ocean and Sea Ice Modeling) project at Los Alamos National
Laboratory. A 240-node HP Exemplar machine at Caltech
was used to run the demo. Neither the machine nor the dataset
is at the teraflop or terabyte level, but they were the biggest dataset and
the largest accessible machine available at the time of the demo.
The data was stored on a remote data server and retrieved using the Storage
Resource Broker (SRB) in real time. The rendered images were sent
back and displayed on an SGI Onyx at JPL. A full
report on the milestone demonstration was prepared. Three animation
movies were generated during the demo: velocity (1.8 Mbyte),
temperature (800 Kbyte), and salinity (850 Kbyte).
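The functional pipeline and out-of-core behavior described in the first
milestone above can be illustrated with a minimal Python sketch. It is not
ParVox code: threads and in-process queues stand in for the machine partitions
and MPI calls, the reader `load_step` and the toy maximum-intensity `render`
are hypothetical placeholders, and the least-recently-used eviction policy is
an assumption, since the text only says that resident data is swapped out when
a non-resident time step is needed. As in ParVox, each time step is loaded as
a whole.

```python
import queue
import threading
from collections import OrderedDict

DONE = object()  # sentinel telling the next stage to shut down


class ResidentSteps:
    """Keeps at most `capacity` whole time steps in memory.

    LRU eviction is an assumption made for this sketch.
    """

    def __init__(self, load_step, capacity, prefetch=()):
        self.load_step = load_step
        self.capacity = capacity
        self.resident = OrderedDict()
        self.loads = 0
        for step in prefetch:  # prefetch a chosen subset at initialization
            self.get(step)

    def get(self, step):
        if step in self.resident:
            self.resident.move_to_end(step)  # mark as most recently used
            return self.resident[step]
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # swap out the oldest step
        self.resident[step] = self.load_step(step)
        self.loads += 1
        return self.resident[step]


def load_step(step):
    """Hypothetical reader: fabricates a tiny 2x2x2 scalar volume [z][y][x]."""
    return [[[step + x + y + z for x in range(2)] for y in range(2)]
            for z in range(2)]


def render(volume):
    """Toy maximum-intensity projection along z, standing in for the renderer."""
    height, width = len(volume[0]), len(volume[0][0])
    return [[max(plane[y][x] for plane in volume) for x in range(width)]
            for y in range(height)]


def input_module(steps, cache, out_q):
    for step in steps:
        out_q.put((step, cache.get(step)))
    out_q.put(DONE)


def render_module(in_q, out_q):
    while True:
        item = in_q.get()
        if item is DONE:
            break
        step, volume = item
        out_q.put((step, render(volume)))
    out_q.put(DONE)


def output_module(in_q, frames):
    while True:
        item = in_q.get()
        if item is DONE:
            break
        frames.append(item)


def run_pipeline(steps, capacity=2, prefetch=()):
    cache = ResidentSteps(load_step, capacity, prefetch)
    q1, q2 = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
    frames = []
    threads = [
        threading.Thread(target=input_module, args=(steps, cache, q1)),
        threading.Thread(target=render_module, args=(q1, q2)),
        threading.Thread(target=output_module, args=(q2, frames)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return frames, cache
```

For example, `run_pipeline([0, 1, 2, 0], capacity=2, prefetch=[0, 1])` renders
four frames in order while calling the hypothetical reader four times: two
prefetches, then one load each for step 2 and for the second visit to step 0,
which was swapped out to make room for step 2.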
Significance:
The demand for parallel supercomputing in interactive scientific visualization
is growing as the ability of the machines to produce large output datasets
has increased dramatically. The ParVox system provides a solution
for distributed visualization of large time-varying datasets on a scientist's
desktop, even over a low-speed network and with a low-end workstation.
Status/Plans:
We are still working on several known problems with the unstructured grid
renderer, including memory scalability, I/O performance,
and isosurface rendering. We are also looking for a large
4D unstructured grid dataset for testing purposes. In addition,
there are two major milestones we would like to accomplish in FY2000:
-
Port ParVox to SGI Origin and Beowulf. MPI 2.0 has not been
widely adopted by the MPI community, and currently few vendors
support MPI 2.0 on their hardware. Therefore, although ParVox has been ported
to MPI 2.0, the MPI version can only run on the HP Exemplar system. We plan to
port ParVox to MPI 1.2 point-to-point communication and test it on the SGI
Origin and Beowulf PC clusters. The JPL 128-node SGI Origin system
and the 32-node Beowulf system owned by the JPL High Performance Computing
Group will be used as test machines.
-
Deliver ParVox 2.0 to the JPL HPCC Software Repository.
The new release will include:
-
An MPI 1.2 version of ParVox running on HP Exemplar, SGI Origin, and PC clusters
-
Functional pipeline code supporting out-of-core rendering
-
Bug fixes and various functional enhancements over ParVox 1.0
-
Complete documentation
Point of Contact:
P. Peggy Li
Jet Propulsion Laboratory
P.P.Li@jpl.nasa.gov
(818)354-1341
http://pat.jpl.nasa.gov/public/ParVox/