Scientific Visualization of Very Large Data Sets
Introduction
The challenges for scientific visualization on massive data sets
are two-fold:
- How do you handle the scale of large data? How do you load a petabyte of data (or an exabyte!) and apply an algorithm to it? What are the right processing techniques? How can you prepare data to expedite processing?
- How do you reduce the complexity of the resulting visualizations? How do you simplify the results so that the human brain's visual processing system can understand them? After all, if you process an exabyte of data but produce the equivalent of random noise on the screen, you haven't really accomplished anything meaningful.
Selected Current Research Projects
Exploration at the Exascale
Power constraints from exascale
computing (10^18 floating point operations per second) will preclude the
traditional visualization workflow, where simulations save data to disk and
analysts later explore this data with visualization tools.
Instead, we will need to embed routines
into the simulation code to massively reduce the data.
However, this reduction must be carried out intelligently: if it is too
aggressive, the analyst will not feel confident in the integrity of the data
and will disregard any resulting analyses. In short, this research
tackles how to balance the tension between reduction and integrity.
The specific research informing this tension involves areas such as
uncertainty visualization, wavelet compression, and massive concurrency.
More information about this project can be found here.
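To make the tension between reduction and integrity concrete, here is a minimal sketch of wavelet-based data reduction paired with a simple integrity check. It assumes NumPy and PyWavelets, uses a synthetic 2D field in place of simulation output, and applies a quantile thresholding rule chosen purely for illustration; it is not the project's actual in situ pipeline.

    # Illustrative sketch: wavelet-based data reduction with an integrity check.
    # Assumes PyWavelets (pywt) and NumPy; not the project's actual in situ code.
    import numpy as np
    import pywt

    def compress(field, wavelet="db4", keep=0.05):
        """Keep only the largest `keep` fraction of wavelet coefficients."""
        coeffs = pywt.wavedec2(field, wavelet)
        arr, slices = pywt.coeffs_to_array(coeffs)
        threshold = np.quantile(np.abs(arr), 1.0 - keep)
        arr[np.abs(arr) < threshold] = 0.0          # the aggressive reduction step
        return arr, slices, wavelet

    def reconstruct(arr, slices, wavelet):
        coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
        return pywt.waverec2(coeffs, wavelet)

    # A synthetic 2D field standing in for simulation output.
    x, y = np.meshgrid(np.linspace(0, 4 * np.pi, 256), np.linspace(0, 4 * np.pi, 256))
    field = np.sin(x) * np.cos(y) + 0.1 * np.random.randn(256, 256)

    arr, slices, wavelet = compress(field, keep=0.05)
    approx = reconstruct(arr, slices, wavelet)

    # Integrity metric: relative L2 error of the reconstruction.
    rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
    print(f"kept 5% of coefficients, relative error = {rel_err:.3f}")

The relative error reported at the end is the kind of quantity an uncertainty visualization could expose to the analyst alongside the reduced data.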
Efficient Parallel Algorithms
Many visualization algorithms are difficult to parallelize and
even more difficult to run efficiently in parallel.
Our group has recently published new
advances for stream surfaces (below, left) and techniques for
dealing with complex inputs, specifically adaptive mesh refinement (below,
right).
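As a point of reference for what "efficient in parallel" involves, below is a minimal sketch of the simplest strategy for integral-curve problems: parallelize over seed points with MPI, integrating each streamline independently. The analytic vector field, mpi4py, and the seed assignment are assumptions for illustration; the group's published stream surface and AMR techniques address the harder issues (distributed data blocks, load imbalance, communication) that this sketch ignores.

    # Illustrative sketch: "parallelize over seeds" streamline integration with MPI.
    # Assumes mpi4py and an analytic vector field; not the group's published algorithm.
    import numpy as np
    from mpi4py import MPI

    def velocity(p):
        """Analytic 2D vector field standing in for simulation data."""
        x, y = p
        return np.array([-y, x])            # simple rotational flow

    def integrate_streamline(seed, h=0.01, steps=500):
        """Fourth-order Runge-Kutta integration from one seed point."""
        pts = [np.asarray(seed, dtype=float)]
        for _ in range(steps):
            p = pts[-1]
            k1 = velocity(p)
            k2 = velocity(p + 0.5 * h * k1)
            k3 = velocity(p + 0.5 * h * k2)
            k4 = velocity(p + h * k3)
            pts.append(p + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4))
        return np.array(pts)

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Seeds along a line; each rank takes every size-th seed (round-robin assignment).
    seeds = [(1.0 + 0.1 * i, 0.0) for i in range(64)]
    my_seeds = seeds[rank::size]
    my_lines = [integrate_streamline(s) for s in my_seeds]

    # Gather the curves on rank 0, e.g., to stitch them into a stream surface.
    all_lines = comm.gather(my_lines, root=0)
    if rank == 0:
        total = sum(len(lines) for lines in all_lines)
        print(f"integrated {total} streamlines on {size} ranks")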
Heterogeneous Algorithms
As compute nodes increasingly include accelerators (such as GPUs),
we have to explore the best way to map visualization algorithms onto them and
to evaluate their efficacy.
Further, although these accelerators provide increased computational
power, that power comes at the cost of increased latencies.
In a distributed-memory setting -- where
accelerators must coordinate their activities via a network -- the optimal way
to design an algorithm is rarely the naive one.
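To illustrate why the naive design rarely wins, here is a back-of-the-envelope pipeline model comparing a design that serializes GPU compute, host-device transfer, and network exchange against one that overlaps communication with the next round of computation. The timings are hypothetical numbers chosen only to show the shape of the trade-off, not measurements from any of our systems.

    # Back-of-the-envelope sketch of why naive designs suffer on accelerators:
    # overlapping computation with transfer/network traffic hides latency.
    # All timings below are hypothetical, chosen only to illustrate the point.

    def naive_time(compute_s, transfer_s, network_s, rounds):
        """Serialize compute, device transfer, and network exchange each round."""
        return rounds * (compute_s + transfer_s + network_s)

    def overlapped_time(compute_s, transfer_s, network_s, rounds):
        """Pipeline rounds so communication of round i overlaps compute of round i+1."""
        comm = transfer_s + network_s
        # The first round cannot overlap; afterwards the slower stage dominates.
        return compute_s + comm + (rounds - 1) * max(compute_s, comm)

    rounds = 100
    compute_s, transfer_s, network_s = 2.0e-3, 1.5e-3, 0.8e-3   # hypothetical values

    print("naive     :", naive_time(compute_s, transfer_s, network_s, rounds))
    print("overlapped:", overlapped_time(compute_s, transfer_s, network_s, rounds))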
Creating Insight from Large Scientific Data Sets
Our group has extensive contacts with end users and
performs application-oriented research to help them better understand
their data. This often means
developing new techniques or applying techniques in new ways, such as the
Finite-Time Lyapunov Exponent (FTLE)-based analysis
of oil dispersion in the Gulf of Mexico (below, left),
nuclear reactor design (below, middle-top),
identification and analysis of features in turbulent flow (below, middle-bottom),
and explosions of stars (below, right).
More information about the analysis of oil
dispersion in the Gulf of Mexico can be found here.
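For readers unfamiliar with FTLE, the sketch below walks through the standard computation on the classic double-gyre test flow: advect a grid of particles to obtain the flow map, differentiate it, form the right Cauchy-Green tensor C = J^T J, and report FTLE = ln(sqrt(lambda_max(C))) / |T|. The test flow, explicit Euler integrator, and grid resolution are illustrative assumptions; the Gulf of Mexico analysis operates on ocean simulation data, not this toy field.

    # Illustrative sketch of a Finite-Time Lyapunov Exponent (FTLE) computation
    # on the double-gyre test flow (an assumption for illustration only).
    import numpy as np

    def velocity(x, y, t, A=0.1, eps=0.25, omega=2 * np.pi / 10):
        """Classic double-gyre test flow."""
        a = eps * np.sin(omega * t)
        b = 1 - 2 * eps * np.sin(omega * t)
        f = a * x**2 + b * x
        dfdx = 2 * a * x + b
        u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
        v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
        return u, v

    def flow_map(x, y, t0, T, steps=200):
        """Advect particles over [t0, t0+T]; explicit Euler kept for brevity."""
        dt = T / steps
        t = t0
        for _ in range(steps):
            u, v = velocity(x, y, t)
            x, y, t = x + dt * u, y + dt * v, t + dt
        return x, y

    # Seed a regular grid of particles and compute the flow map.
    xs = np.linspace(0, 2, 201)
    ys = np.linspace(0, 1, 101)
    X, Y = np.meshgrid(xs, ys)
    PX, PY = flow_map(X, Y, t0=0.0, T=10.0)

    # Gradient of the flow map via finite differences, then FTLE per grid point.
    dX_dx = np.gradient(PX, xs, axis=1); dX_dy = np.gradient(PX, ys, axis=0)
    dY_dx = np.gradient(PY, xs, axis=1); dY_dy = np.gradient(PY, ys, axis=0)

    T = 10.0
    ftle = np.empty_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            J = np.array([[dX_dx[i, j], dX_dy[i, j]],
                          [dY_dx[i, j], dY_dy[i, j]]])
            C = J.T @ J                                  # right Cauchy-Green tensor
            lam_max = max(np.linalg.eigvalsh(C)[-1], 1e-30)  # guard against zero
            ftle[i, j] = np.log(np.sqrt(lam_max)) / abs(T)

    print("FTLE range:", ftle.min(), ftle.max())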
Collaborations
We collaborate with a number of individuals and research
programs:
- The Visualization Group at Lawrence Berkeley National Laboratory (LBNL).
- The Institute for Data Analysis and Visualization (IDAV), a UC Davis-supported Institute that houses a number of visualization and computer graphics researchers.
- The Department of Energy's Institute for Scalable Data Management, Analysis, and Visualization (SDAV), which includes six DOE laboratories, seven universities, and one private company.
- We also collaborate with the Technical University of Kaiserslautern (Germany), with the possibility for students to spend time at both universities.
- In addition to LBNL, we also collaborate with Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, and collaborations with Pacific Northwest National Laboratory and Idaho National Laboratory are forming. These collaborations frequently lead to our students performing their research on the leading supercomputers in the world today.
- We collaborate with National Science Foundation supercomputing centers as well, primarily with the Texas Advanced Computing Center (TACC) and the San Diego Supercomputer Center (SDSC).
Faculty
Hank Childs, Assistant Professor
Selected Publications
Here are a few recent publications from the group:
- E. Wes Bethel, Hank Childs, and Charles Hansen, editors. High Performance Visualization-Enabling Extreme-Scale Scientific Insight. Chapman & Hall, CRC Computational Science. CRC Press/Francis-Taylor Group, Boca Raton, FL, USA, Nov. 2012. (Textbook)
- T. M. Ozgokmen, A. C. Poje, P. F. Fischer, H. Childs, H. Krishnan, C. Garth, A. C. Haza, and E. Ryan. On multi-scale dispersion under the influence of surface mixed layer instabilities. Ocean Modelling, 56:16-30, Oct. 2012.
- K. P. Gaither, H. Childs, K. Schulz, C. Harrison, B. Barth, D. Donzis, and P. Yeung. Using Visualization and Data Analysis to Understand Critical Structures in Massive Time Varying Turbulent Flow Simulations. IEEE Computer Graphics and Applications, 32(4):34-45, July/Aug 2012.
- P. Navratil, D. Fussell, C. Lin, and H. Childs. Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing, Proceedings of EuroGraphics Symposium on Parallel Graphics and Visualization (EGPGV), pages 61-70, May 2012. Best paper winner.
- Mark Howison, E. Wes Bethel, and Hank Childs. Hybrid Parallelism for Volume Rendering on Large, Multi- and Many-core Systems. IEEE Transactions on Visualization and Computer Graphics, 18(1):17-29, Jan. 2012.
- David Camp, Christoph Garth, Hank Childs, David Pugmire, and Kenneth Joy. Streamline Integration using MPI-Hybrid Parallelism on Large Multi-Core Architecture. IEEE Transactions on Visualization and Computer Graphics, 17(11):1702-1713, Nov. 2011.
- Hank Childs, David Pugmire, Sean Ahern, Brad Whitlock, Mark Howison, Prabhat, Gunther Weber, and E. Wes Bethel. Extreme Scaling of Production Visualization Software on Diverse Architectures. IEEE Computer Graphics and Applications, 30(3):22-31, May/June 2010.