SciDAC Review, DOE Office of Science
SCIDAC-2 AWARDS
SciDAC-2: The Next Phase of Discovery
Most of the projects funded under the initial SciDAC program have officially ended, and the resulting success stories underscore the value of multidisciplinary collaborations. With the September 2006 announcement of SciDAC-2 projects, it is clear that the Department of Energy and the nation’s scientific computing community are up to the task of taking scientific computing to the petascale.
DOE’s supercomputers have enabled researchers to make significant scientific advances in Office of Science mission areas as diverse as developing future energy sources, studying global climate change, and improving our understanding of the physics both of tiny particles and of massive supernova explosions. Many of these breakthroughs would not have been possible through theoretical or laboratory studies alone. High-performance computer simulations help ensure that the U.S. maintains a leadership role in science and technology.
With scientific advances, however, come new computational challenges. As science programs in DOE become increasingly large and multidisciplinary, there is a need for ever more advanced modeling and simulation capabilities. At the same time, the dawning of the era of petascale computers able to perform quadrillions of calculations per second necessitates the development of effective means of capturing, storing, transmitting, sharing, and analyzing large-scale experimental and theoretical data, as well as data from simulations.

Enter SciDAC-2
Building on the success of the initial round of SciDAC projects, SciDAC-2 will bring together some of the nation’s top researchers at DOE’s national laboratories and U.S. universities in a variety of projects aimed at creating the software (figure 1) and infrastructure needed to help scientists get the most out of the next generation of supercomputers.
“Advanced computing is a critical element of President Bush’s American Competitiveness Initiative and these projects represent an important path to scientific discovery,” Dr. Raymond Orbach, DOE Under Secretary for Science, said. “We anticipate that they will develop and improve software for simulating scientific problems and help reduce the time-to-market for new technologies.”
SciDAC-2 will award approximately $60 million annually to 30 computational science projects over the next three to five years. These projects, involving 70 institutions and hundreds of researchers and students, were selected from 240 proposals. In this round of awards, the National Science Foundation (NSF) and DOE’s National Nuclear Security Administration (NNSA) join SciDAC as new funding partners. The NSF will contribute nearly $3 million a year to the Open Science Grid that supports the large, international physics collaborations supported by DOE and the NSF. The NNSA will contribute nearly $3 million a year for physics and materials research.
SciDAC-2 consists of three different components. Seventeen Science Application projects (sidebar "Science Applications," p18) will receive approximately $26.1 million annually to conduct research in areas ranging from quarks to genomes to astrophysics. Approximately $24.3 million in annual awards will be used to establish nine Centers for Enabling Technologies (CETs; see sidebar "Centers for Enabling Technologies," p27). These multidisciplinary teams, led by national laboratories and universities, will focus on meeting the specific needs of SciDAC Science Applications researchers as they move toward petascale computing. SciDAC will also increase its presence in the academic community by creating four university-led SciDAC Institutes (sidebar "SciDAC Institutes," p25), in which thirteen universities will participate. The Institutes, annually receiving approximately $8.2 million, will help a broad range of researchers prepare to take advantage of the increasing capabilities of supercomputer centers around the country, while fostering the next generation of computational scientists.
Figure 1. SciDAC projects create software applications that make scientific discovery possible. This image comes from members of the Visualization and Analytics Center for Enabling Technologies (VACET), a new SciDAC-2 project. VACET leverages established, production-quality, parallel-capable visualization and analytics infrastructure to deliver new capabilities to SciDAC Science Applications, including accelerator modeling, astrophysics, climate modeling, combustion, and fusion. One of the major delivery vehicles for VACET is VisIt, a tool that originated in the Advanced Simulation and Computing (ASC) program and is now developed jointly by ASC and SciDAC. This image, created by Hank Childs of LLNL, is a visualization of a data set from a Rayleigh-Taylor simulation with 27 billion grid cells, which models the turbulent mixing of fluids. The simulation was done by MIRANDA, which also originated in ASC and is now being developed jointly with the SciDAC Physics research project “Simulations of Turbulent Flows with Strong Shocks and Density Variations.”

Science Applications and Partnerships
Science Applications projects in six domain areas—Physics, Climate Research, Groundwater Research, Fusion Science, Life Sciences, and Materials Science and Chemistry—are included in this round of awards (see table). Progress in simulation-based science depends on the interaction of application domains, computer science, and applied mathematics. Science Application Partnerships (SAPs; see sidebar "Science Application Partnerships," p21) offer support for these multidisciplinary enterprises. SAPs enable applied mathematics and computer science research to enhance targeted Science Application projects in domains across all DOE Office of Science core programs.
Four Physics projects will study fundamental forces and elementary particles in an effort to improve our understanding of the nature of matter, energy, space, and time. Astrophysicist Dr. Stan Woosley and his team will explore what happens “when good stars go bad.” Using supernova simulation codes developed in SciDAC-1, the team will simulate different types of supernovae (with an emphasis on Type Ia) and gamma-ray bursts in order to learn more about star evolution and nucleosynthesis, and to aid in the investigation of the greatest mystery in high energy physics and astrophysics today, the dark energy that makes up the majority of our universe. “How certain dying stars produce supernovae, some of the biggest explosions in the universe, is not understood despite more than 50 years of effort,” says Dr. Woosley. “The confluence of petascale computers and SciDAC expertise offers, for the first time, a real opportunity to change that situation.” This project includes a Science Application Partnership concentrating on adaptive algorithms for computational astrophysics. Figure 2 (p20) depicts images from a three-dimensional study related to this project.
A team led by Dr. Robert Sugar will develop a national computational infrastructure for lattice quantum chromodynamics—the theory of quarks and gluons formulated on a space-time lattice—in order to perform simulations that address problems at the heart of the DOE’s large experimental programs in high energy and nuclear physics. “The United States Lattice QCD Community’s SciDAC-2 project will develop software for the study of Quantum Chromodynamics (QCD) on petascale computers,” explains Dr. Sugar. “QCD is the theory of the strong interactions of subatomic physics, and petascale calculations of QCD will address critical questions in high energy and nuclear physics regarding the basic building blocks of matter and the fundamental forces of nature.” The major goals of this project will be to verify the Standard Model or discover its limits, to determine the properties of strongly interacting matter under extreme conditions, and to understand the structure of nucleons and other strongly interacting particles.
SCIDAC-2 SCIENCE APPLICATION PROJECTS AWARDED IN SEPTEMBER 2006

| PROJECT TITLE | AREA | PRINCIPAL INVESTIGATOR | PI AFFILIATION | ANNUAL BUDGET | DURATION |
|---|---|---|---|---|---|
| Computational Astrophysics Consortium: Supernovae, Gamma Ray Bursts, and Nucleosynthesis | Physics (Astro) | Dr. Stan Woosley | University of California-Santa Cruz | $1.9 million | 5 years |
| National Computational Infrastructure for Lattice Gauge Theory | Physics (QCD) | Dr. Robert Sugar | University of California-Santa Barbara | $2.2 million | 5 years |
| Simulations of Turbulent Flows with Strong Shocks and Density Variations | Physics (Turbulence) | Dr. Sanjiva Lele | Stanford University | $0.8 million | 5 years |
| Sustaining and Extending the Open Science Grid: Science Innovation on a Petascale Nationwide Facility | Physics (Petabytes) | Dr. Miron Livny | University of Wisconsin | $6.1 million | 5 years |
| A Scalable and Extensible Earth System Model for Climate Change Science | Climate | Dr. John B. Drake | Oak Ridge National Laboratory | $4.8 million | 5 years |
| Design and Testing of a Global Cloud-Resolving Model | Climate | Dr. David Randall | Colorado State University | $1.2 million | 5 years |
| A Data Domain to Model Domain Conversion Package (DMCP) for Sparse Climate Related Process Measurements | Climate | Dr. Rao Kotamarthi | Argonne National Laboratory | $0.25 million | 5 years |
| Modeling Multiscale-Multiphase-Multicomponent Subsurface Reactive Flows using Advanced Computing | Groundwater | Dr. Peter C. Lichtner | Los Alamos National Laboratory | $0.8 million | 5 years |
| Hybrid Numerical Methods for Multiscale Simulations of Subsurface Biogeochemical Processes | Groundwater | Dr. Timothy D. Scheibe | Pacific Northwest National Laboratory | $1.1 million | 4 years |
| Framework Application for Core-Edge Transport Simulations (FACETS) | Fusion | Dr. John R. Cary | Tech-X Corporation | $2.2 million | 5 years |
| Robust and Precise Gene Function Predictions on a Genomic Scale | Life Sciences | Dr. Steven Brenner | Lawrence Berkeley National Laboratory | $0.3 million | 3 years |
| Filling Knowledge Gaps in Biological Networks: Integrated Global Approaches to Understand H2 Metabolism in Chlamydomonas reinhardtii | Life Sciences | Dr. Michael Seibert | National Renewable Energy Laboratory | $0.7 million | 3 years |
| Advanced Mathematics for Electronic Structure | Materials & Chemistry (SAP) | Dr. George Fann | Oak Ridge National Laboratory | $0.3 million | 3 years |
| Chemistry Framework using Common Component Architecture | Materials & Chemistry (SAP) | Dr. Mark Gordon | Ames Laboratory | $0.5 million | 3 years |
| Quantum Simulations of Material and Nanostructures (Q-SIMAN) | Materials & Chemistry | Dr. Giulia Galli | University of California-Davis | $1.2 million | 5 years |
| Next Generation Multi-Scale Quantum Simulation Software for Strongly Correlated Materials | Materials & Chemistry | Dr. Mark Jarrell | University of Cincinnati | $0.6 million | 5 years |
| Hierarchical Petascale Simulation Framework for Stress Corrosion Cracking | Materials & Chemistry | Dr. Priya Vashishta | University of Southern California | $1.1 million | 5 years |
Dr. Sanjiva K. Lele is the PI of a project that will use supercomputer simulations to revolutionize our understanding of shock-turbulence interactions and multimaterial mixing in complex flows. Bringing together a team with expertise in numerical simulations of turbulence and turbulence physics, computational gas dynamics and shock wave physics, numerical analysis and nonlinear dynamics, and massively parallel computing, this project will address problems central to inertial confinement fusion applications and supernovae astrophysics, as well as to the broader Stockpile Stewardship mission of DOE.
An ensemble of domain scientists, software developers, and providers of computing resources led by Dr. Miron Livny will stimulate new discoveries by sustaining and extending the Open Science Grid, a national distributed computational facility. By establishing a reliable computing and network infrastructure that can manage and analyze petabytes of data from the next generation of physics accelerators and detectors, this project will serve thousands of users at universities and DOE laboratories throughout the country, and will help maximize U.S. investments in high energy and nuclear physics experiments.
Figure 2. A thermonuclear runaway moves through a white dwarf star 1.38 times the mass of the Sun and with a radius of 1,500 km. The conflagration begins as a burning floating bubble that assumes a toroidal geometry (upper left). Eventually the burning erupts on the far side and the ashes sweep around the star in about a second, colliding on the near side. Earlier 2D studies by others suggested a detonation might ignite at the collision point, incinerating the remaining carbon and oxygen into heavier elements. This 3D study failed to produce a detonation. With the help of combustion scientists and a new generation of codes, other promising alternatives for the explosion mechanism are being pursued. Calculations were done on Jaguar at ORNL with support from SciDAC.
SciDAC-2 includes three awards for Science Application projects in the field of Climate Research. These projects will dramatically increase both the accuracy and throughput of computer model-based predictions of future global climate system response to environmental change. Dr. John B. Drake and his team will create a first-generation Earth system model that fully simulates the interrelation of physical, chemical, and biogeochemical processes. By improving the representation of carbon and chemical processes, particularly with regard to greenhouse gas emissions and aerosol feedbacks, the project will provide policymakers with more accurate climate data and models to help in the determination of safe levels of greenhouse gases in the Earth’s atmosphere. This project includes two Science Application Partnerships, focusing on statistical approaches to aerosol dynamics, and on performance engineering in climate system modeling.
Another Climate application team, led by Dr. David A. Randall, will develop and test a global cloud-resolving model capable of simulating the circulations associated with large convective clouds, thus paving the way for the future production of realistic climate simulations and predictions. The project headed by Dr. Rao Kotamarthi will also help to improve global climate models by developing a uniform set of software tools suitable for the evaluation of high-end climate models, using the latest available statistical modeling tools and a knowledge of the relevant physical and chemical processes.
Groundwater Research is a new SciDAC area with the aim of providing more advanced models of subsurface contamination. Two projects in this area will aid in environmental remediation efforts at DOE facilities as well as around existing and future radionuclide waste disposal and storage sites. Dr. Peter C. Lichtner is the PI of a project which will help predict the movement of subsurface contaminants by developing the next-generation massively parallel, multiscale-multiphase-multicomponent reactive flow and transport code based on the successful prototype code PFLOTRAN. “The purpose of this project is to develop a powerful new massively parallel computer model called PFLOTRAN for use in studying subsurface processes over a wide range of spatial and temporal scales,” says Dr. Lichtner. “The model will enable geoscience researchers to obtain more accurate predictive capabilities for underground contaminant transport and carbon dioxide sequestration.” A pH profile graph from this work is shown in figure 3 (p22).
Working on another groundwater project, Dr. Timothy D. Scheibe and his team will integrate existing multiscale hybrid modeling tools into a coherent multiscale modeling framework. It will utilize high-performance computational facilities in order to provide more accurate field-scale simulations of the biogeochemical processes that control the mobility of subsurface contaminant metals and radionuclides. Figure 4 (p22) displays some pore-scale simulation results relevant to the new project.
In Fusion Science, improved simulation and modeling of fusion systems using terascale and petascale computers will be essential to achieving the scientific understanding necessary to make fusion a viable energy source in the future. The project led by Dr. John R. Cary will provide a highly flexible, multiphysics, parallel framework application (FACETS). Taking advantage of the largest supercomputer hardware, FACETS will enable whole-device modeling for the U.S. fusion program and will provide the modeling infrastructure needed for the international endeavor ITER, the next-step magnetic-fusion plasma confinement device. The project will concentrate in particular on the development of core-edge transport simulations. FACETS includes a Science Application Partnership focusing on a steady-state gyrokinetic transport code.
Projects in Life Sciences, another new SciDAC area, will focus on developing new methods for modeling complex biological systems including molecular complexes, metabolic and signaling pathways, individual cells, and, ultimately, interacting organisms and ecosystems. Dr. Steven Brenner and his team will improve various aspects of the protein function-predicting program, Statistical Inference of Function Through Evolutionary Relationships (SIFTER), including increasing its scalability and range, enhancing its prediction reliability, and integrating SIFTER into the automated microbial protein annotation system at DOE’s Joint Genome Institute. Dr. Michael Seibert is the PI of a project focused on advancing the understanding of hydrogen-producing metabolism in green algae. Utilizing both biological models, such as the alga Chlamydomonas reinhardtii (figure 5), and computer models, the project will address problems critical to renewable energy research.
Figure 3. The pH profile after 200 years resulting from a density instability as carbon dioxide at the upper boundary diffuses downward into the domain, increasing the density of the fluid in the sandstone pores. The pH ranges from the initial value of 8 to approximately 4.8 at the center of the high carbon dioxide lobes.
Figure 4. A computer simulation of mineral precipitation at a solute mixing interface. The black circles represent grains of the porous medium, such as sand, and the red and blue represent the fluid phase, with red being dissolved calcium and blue being dissolved carbonate. The green part represents areas where the mineral calcium carbonate has formed by reaction of the red and blue dissolved chemicals. Pore-scale modeling such as this, when coupled with other simulation scales, helps researchers make more accurate predictions of the movement and fate of contaminants in groundwater. PNNL and its collaborators will work to develop a robust computer model that will join different models at multiple scales into a single hybrid model for enhanced predictability.
In Materials and Chemistry, SciDAC-2 projects will increase our understanding of the reactions and interactions that determine material properties. Dr. George I. Fann will head up a Science Application Partnership that will develop and implement advanced mathematical methods and software for petascale computational chemistry. Through a close collaboration of chemists, mathematicians, and computer scientists, this project will lead to radical advances in the capabilities of quantum chemical methods to describe, efficiently and with controlled precision, the electronic structure of atoms, molecules, and nanoscale chemical systems. A second Science Application Partnership, led by Dr. Mark S. Gordon, will focus on creating a flexible, community-based computational chemistry framework that will allow scientists to collaborate in the development of chemical simulation software. The project will employ the infrastructure of the Common Component Architecture (CCA) to produce interfaces among three of the world’s most important computational chemistry codes: General Atomic and Molecular Electronic Structure System (GAMESS), the Massively Parallel Quantum Chemistry program (MPQC), and Northwest Chem (NWChem).
Another project will address a major challenge in materials science and chemistry: the prediction and design of molecular and materials properties with controllable accuracy from first principles, that is, from the fundamental laws of quantum mechanics. Dr. Giulia Galli and her team will transform existing quantum simulation techniques into predictive design and discovery tools by improving accuracy, robustness, efficiency, and software performance and scalability. These tools will be available to theorists, computational scientists, and experimentalists alike, and will enable large-scale quantum simulations to be carried out for a wide range of materials and nanostructures.
Figure 5. The photosynthetic green alga, Chlamydomonas reinhardtii, shown here in both light (left) and scanning electron (right) micrographs. This organism offers a biological paradigm for the conversion of light energy directly into hydrogen. However, the complexity of the associated metabolism demands a numerical model by which to integrate observed outcomes with many disparate physiological test conditions. To address unknown kinetic parameter spaces and future engineering of coupled metabolic networks, this project will combine petascale computing and experimental systems biology to develop a quantitative understanding of hydrogen production in C. reinhardtii.
Dr. Mark S. Jarrell will lead an interdisciplinary team of computational physicists and applied mathematicians in developing a massively parallel multiscale method for the study of strongly correlated materials such as magnets and superconductors. This project will advance our understanding of these materials as well as allow for the simulation and design of magnetic materials and superconductors for basic research, energy, and national security applications.
Dr. Priya Vashishta and a multidisciplinary team of computational materials scientists, applied mathematicians, and computer scientists will address the complex technological and economic problem of stress corrosion cracking, which can severely limit the performance and lifetime of materials used in nuclear technology and advanced power generation technologies such as turbines, combustors, and fuel cells. By developing a computational framework consisting of modeling techniques, algorithms, analytical underpinnings, and release-quality software, this project will bring quantum-level accuracy to multimillion-atom, nanosecond petascale simulations of stress corrosion cracking.
Features relevant to Dr. Drake’s climate project (p44) and Dr. Fann’s computational chemistry research (p54) appear in this issue. Future issues of SciDAC Review will feature articles on other Science Application projects. Furthermore, another Science Application project, "Building a Universal Nuclear Energy Density Functional," (sidebar "Universal Nuclear Energy Density Functional," p32) was added to SciDAC-2 shortly before the publication of this article.
The following summaries provide details about the four Institutes and nine Centers for Enabling Technologies most recently funded under the SciDAC program.



INSTITUTES

Petascale Data Storage Institute

Dr. Garth Gibson, Carnegie Mellon University

With the advent of new experimental facilities and more powerful supercomputers, researchers are now faced with the task of managing, sharing, and analyzing petabytes of data. Petascale computing infrastructures for scientific discovery make enormous demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. The last decade has shown that parallel file systems can barely keep pace with high-performance computing in these areas. When petascale requirements are considered, this poses a critical challenge, one that the Petascale Data Storage Institute intends to meet.
The Petascale Data Storage Institute brings together data storage and management expertise to address the high-performance data storage requirements of today’s DOE terascale computational science, as well as to identify, resolve, and initiate solutions for problems arising from the petascale computing infrastructures of tomorrow. Special attention will be given to issues such as interoperability, community buy-in, and shared tools.
The Institute will educate the computing community on the best practices for the use of existing and forthcoming large-scale storage systems. To engage the scientific computing community in the emerging problems of petascale storage system performance, the Institute will develop and chair an annual petascale storage workshop in conjunction with a major scientific computing conference, such as the annual SC conference. Workshops will also be presented to the academic computer science community at conferences such as the USENIX Conference on File and Storage Technologies and the Institute of Electrical and Electronics Engineers (IEEE) Conference on Mass Storage Systems and Technologies.
The Institute will also develop and conduct various tutorials for the broader communities of scientific computing, academic computer science, and industrial storage systems development. These tutorials will address techniques, mechanisms, programming practices, and tools, and will include advice to scientific discovery application developers on strategies for maximizing the effectiveness of petascale storage access. Tutorials will be offered at a variety of conferences in order to make this resource accessible to a broad range of users.
Figure 6. Four Institutes covering three areas are included under SciDAC-2.
In order to prepare personnel to design, operate, and manage the petascale systems of the next decade, the Institute will create educational materials on petascale data storage for use in at least the graduate programs of the Institute’s three university members (Carnegie Mellon University, University of California–Santa Cruz, and the University of Michigan). Possible courses will include advanced operating and distributed systems, advanced storage systems, security systems, and advanced scientific algorithms.
Building on its members’ experience in applications and expertise in diverse file and storage systems, the Institute will enable researchers to collaborate extensively on developing requirements, standards, algorithms, and development and performance tools for petascale computing infrastructures. Its results will be made available to the petascale computing community as a whole.

Performance Engineering Research Institute
Dr. Robert F. Lucas, University of Southern California

While current terascale and planned petascale supercomputers offer unprecedented capabilities for creating detailed scientific simulations and analyzing massive amounts of data, making the most efficient and effective use of such systems requires that the systems and applications be optimized for highest performance. This will grow ever more difficult due to the enormous scale and increasing complexity of system architecture and applications. At the same time, users will want to focus on science and not be burdened by the need to optimize code performance. The ideal performance tool will therefore analyze a scientific application (both as source code and during execution), generate a space of tuning options, and search for a near-optimal performance solution.
Attaining this ideal will involve many challenges, including the enhancement of automatic code manipulation tools, automatic run-time parameter selection, automatic communication optimization, and intelligent heuristics to control the combinatorial explosion of tuning possibilities. To address these challenges, the Performance Engineering Research Institute (PERI) will focus on education and outreach, performance modeling and prediction, automatic performance optimization, and performance engineering of high profile applications.
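The analyze, generate, and search loop described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not PERI software: the toy kernel `blocked_sum`, its two tuning parameters (`block_size` and `unroll`), and the use of exhaustive search, which stands in for the intelligent heuristics a real tool would need once the tuning space becomes too large to enumerate.

```python
import itertools
import time


def blocked_sum(data, block_size, unroll):
    """Hypothetical toy kernel: the parameters change its speed, never its result."""
    total = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        i = 0
        # Process `unroll` elements per pass through the loop.
        while i + unroll <= len(block):
            for x in block[i:i + unroll]:
                total += x
            i += unroll
        # Handle the elements left over at the end of the block.
        for x in block[i:]:
            total += x
    return total


def tuning_space(options):
    """Generate every combination of the candidate parameter values."""
    names = sorted(options)
    for values in itertools.product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def measure(kernel, data, config, repeats=3):
    """Return the best wall-clock time over several runs of one configuration."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(data, **config)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(kernel, data, options):
    """Exhaustively time every configuration and return the fastest one."""
    timings = {}
    for config in tuning_space(options):
        timings[tuple(sorted(config.items()))] = measure(kernel, data, config)
    best_key = min(timings, key=timings.get)
    return dict(best_key), timings


OPTIONS = {"block_size": [64, 512, 4096], "unroll": [1, 4, 8]}
DATA = list(range(20_000))
best_config, timings = autotune(blocked_sum, DATA, OPTIONS)
print("fastest configuration found:", best_config)
```

Even this toy space has nine configurations from two parameters. Each added parameter multiplies the number of runs, which is why exhaustive search does not scale and heuristics are listed among the challenges.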
The Institute will work on improving application performance through tutorials and workshops with SciDAC Application teams and the research community. These will typically be associated with major conferences, and will help disseminate results and maximize impact on the performance of SciDAC and other high priority DOE scientific codes. Because experience has shown that short-term intensive workshops can be effective in overcoming the tool learning curve, obtaining initial results, developing collaborative working relationships, and determining future directions for productive performance optimization efforts, more formal short courses or summer courses will also be developed.
To enhance the performance of a broad set of application projects, the Institute will collaborate with other SciDAC Centers and Institutes as well as DOE scientific computing facilities. Each year, the Institute will target a particular code or set of codes within a given discipline for performance analysis and optimization to achieve specific short-term performance goals. Working with the Institute, DOE scientific computing facilities will add in-depth questions to their regular user surveys about performance problems and the use of performance tools. The Institute will also survey application developers about current application performance and future performance goals.
The Institute will develop and refine performance models, significantly reducing the cost of collecting the data on which the models are based and increasing model fidelity, speed, and generality. Emphasis will also be placed on researching automatic performance optimization tuning software and on performance engineering of high-profile SciDAC Applications. To ensure that important performance goals are achieved in the near term, the Institute will allocate resources and assign personnel to work directly with individual Science Application projects as needed. “PERI’s goal is to help DOE scientists maximize the performance of their software on petascale systems,” explains Dr. Lucas. “In the near-term we’ll work directly with SciDAC code teams. Our long-term goal is to automate performance tuning to minimize the tuning burden on the end user.”

Combinatorial Scientific Computing and Petascale Simulations Institute
Dr. Alex Pothen, Old Dominion University

Petascale machines of the near future are likely to have hundreds of thousands of processors, complex memory hierarchies, and relatively poor interconnecting network performance. The applications that will run on these machines will involve complex multiscale or multiphase physics, adaptive meshes, and/or sophisticated numerical methods. A key challenge for scientific computing will be obtaining high performance for these advanced applications on such complicated computers.

To address this challenge, the Combinatorial Scientific Computing and Petascale Simulations Institute (CSCAPES) will accelerate the development and deployment of fundamental enabling technologies in high-performance computing. CSCAPES will focus on providing advanced new capabilities in load balancing and parallelization toolkits for petascale computers, accelerating the development of new automatic differentiation capabilities for complex SciDAC applications, and advancing the state of the art in sparse matrix software tools. These seemingly disparate areas are unified by a common set of abstractions and algorithms based on combinatorics, graphs, and hypergraphs.
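One well-known example of how a graph abstraction ties sparse matrices to derivative computation is column coloring. Columns of a sparse Jacobian that never share a nonzero row can be evaluated together in a single derivative pass, so a coloring of the columns in which conflicting columns get different colors gives the number of passes needed. The sketch below is a minimal illustration under stated assumptions, not CSCAPES software: the function names, the tridiagonal test pattern, and the simple greedy ordering are all chosen here for the example.

```python
def color_columns(pattern, n_cols):
    """Greedily color the columns of a sparse matrix.

    `pattern` is a list of rows, each a set of the column indices that hold
    a nonzero. Two columns conflict when they share a nonzero row.
    """
    # Build the conflict graph between columns.
    conflicts = {c: set() for c in range(n_cols)}
    for row in pattern:
        for c in row:
            conflicts[c].update(row - {c})

    # Give each column the smallest color not used by a colored neighbor.
    colors = {}
    for c in range(n_cols):
        used = {colors[n] for n in conflicts[c] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[c] = color
    return colors


def tridiagonal_pattern(n):
    """Sparsity pattern of an n x n tridiagonal matrix (illustrative input)."""
    return [{c for c in (i - 1, i, i + 1) if 0 <= c < n} for i in range(n)]


pattern = tridiagonal_pattern(6)
colors = color_columns(pattern, 6)
n_groups = len(set(colors.values()))
print("column colors:", colors)
print("derivative passes needed:", n_groups, "instead of", 6)
```

For the tridiagonal pattern, three colors suffice regardless of the matrix size, so the full Jacobian can be recovered from three passes rather than one per column.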
“Combinatorial scientific computing,” says Dr. Pothen, “lies at the border of two distinct provinces: continuous and discrete mathematics. The CSCAPES Institute strives to catalyze interactions at this boundary to enable breakthroughs in petascale science.”
Researchers from SciDAC as well as academia, national labs, and industrial partners will be invited to participate in training workshops every year of the project. Participants will learn about the Institute’s combinatorial scientific computing software tools for their applications, and the Institute will learn about the researchers’ applications and user problems so that its software tools can be made more responsive to their needs. CSCAPES will also offer short courses at SC and other parallel processing conferences. Through presentations at the biennial Society for Industrial and Applied Mathematics (SIAM) workshop on combinatorial scientific computing, the Institute will extend its outreach to an international community of researchers.
CSCAPES will work with other SciDAC research groups as well as other interested researchers from both the computational and domain sciences to integrate software tools into other application codes. All software created under this project will be available under an open-source public license.
The Institute will also focus on educating the next generation of researchers in the application of combinatorial techniques to scientific computing. Selected students will be trained in the methods of combinatorial scientific computing and multidisciplinary research. A DOE laboratory scientist will serve on each student’s thesis committee. The students will also work one summer at a DOE lab, helping them gain valuable experience in one domain science in addition to the computational sciences.
By addressing combinatorial scientific computing issues with algorithmic and software solutions, along with education and outreach, CSCAPES will enable new discoveries in applications such as accelerator design, biological remediation, groundwater flow modeling, radiation transport, and computational biology. These applications require the solution of discretized partial differential equations, numerical optimization, eigenvalue computations, and management of massive data sets. An example of a solution created by the tools at the CSCAPES Institute is shown in figure 7.
Figure 7. Shown here is an image of a mesh partitioned by one of the load balancing tools at the CSCAPES Institute.


Institute For Ultrascale Visualization
Dr. Kwan-Liu Ma, University of California–Davis

Understanding the science behind ultrascale simulations and high-throughput experiments requires scientists to extract meaning from massive datasets containing hundreds of terabytes or more. Parallel visualization is the most plausible path to understanding data at this scale. Existing parallel visualization tools, however, have limited functionality, are not portable or scalable to the largest systems, or are not readily adapted to new applications.
By bringing together leading experts from visualization, high-performance computing, and Science Application areas, the Institute for Ultrascale Visualization will make parallel visualization technology a commodity for SciDAC scientists as well as for the broader community. In order to enable scientific discovery at the petascale, the Institute will assemble a comprehensive parallel visualization suite that is portable across platforms. More specifically, it will develop high-performance visualization strategies on diverse platforms including general-purpose clusters, dedicated visualization clusters, and, especially, high-end computing systems, leveraging existing technologies when appropriate. The Institute will instruct application scientists on how best to use these tools, producing benchmarks of parallel visualization tools as guides and as methods to assess the capabilities of these tools on new ultrascale systems. It will also make recommendations to industry for revising hardware and software architectures and protocols to support large-scale visualization calculations.
Within the broader visualization community, the Institute will encourage collaborations through a visitor program. This program will expand existing ties in the Institute by bringing in additional expertise on various topics and will also be an important conduit for distributing the tools, results, and expertise of Institute members. Workshops will be another important aspect of the Institute’s outreach program, providing hands-on transfer of major visualization codes and analysis tools to the community. These workshops will also provide a forum for focused, intensive activities between researchers.
Complementing the outreach program will be a comprehensive education program. The Institute will present tutorials and demonstrations at major conferences. Summer schools will be organized both to educate potential users of advanced visualization technology and to reach university students who are participating in the development of high-performance visualization technology. These education programs will help disseminate knowledge and new technologies both quickly and widely.
The Institute’s activities will be strengthened by its connections to SciDAC Science Applications, which serve as early adopters of technology, as well as by its connections to computer science partners, whose tools will combine with those developed by the Institute to aid in the scientific process of ultrascale visualization.

CENTERS FOR ENABLING TECHNOLOGIES

Visualization and Analytics Center for Enabling Technologies
Dr. E. Wes Bethel, LBNL

Scientific visualization—seeing the unseeable—plays an important role in the scientific process, as the most visible element of scientific research and as the visual component of day-to-day diagnostic and exploration tools. Its aim is to help scientists gain insight into structures, relationships, and anomalies hidden within data. Its impact can be seen in science, medicine, engineering, finance, security, and safety.
Since the advent of computing, the world has experienced an information big bang in the form of an explosion of data. Information is being created at an exponential rate, such that new information generated annually exceeds the information contained in all previously created documents. Furthermore, digital information now makes up more than 90% of all information produced, vastly exceeding that generated on paper and film. One of the greatest scientific and engineering challenges of the 21st century will be to understand and make use of this growing wealth of information to scientific advantage. Software plays a central role in providing the means to manage and understand relationships, anomalies, trends, and features contained in today’s abundant scientific data.
The Visualization and Analytics Center for Enabling Technologies (VACET) will focus on the creation and deployment of scientific visualization and analytics software technology to increase scientific productivity and create new opportunities for scientific insight. A number of visualization examples from VACET appear within this issue (figure 1, p17; figure 9, p29; figure 14, p35; front cover). No single visualization technology solution exists that is responsive to the broad set of challenges facing the scientific research community. Instead, effective solutions will require the careful adaptation and deployment of technologies from many sources.
Figure 8. SciDAC-2 grant awards support nine Centers for Enabling Technologies. Each Center consists of a multidisciplinary team that will focus on meeting the needs of SciDAC Science Application researchers as the program moves towards petascale computing.
VACET will consist of a team of international leaders in scientific visualization and analysis with a strong record of creating and deploying visualization software and of collaborating effectively with application stakeholders. VACET will draw from a diverse set of visualization technologies ranging from production-quality applications and application frameworks to state-of-the-art algorithms for visualization, analysis, analytics, data manipulation, and data management. The Center’s goal will be to respond to the urgent needs of the scientific community by providing significant, production-quality technology to aid in data understanding. VACET will adapt, extend, create when necessary, and deploy visual data analysis solutions that are responsive to the needs of DOE’s computational and experimental scientists for use in DOE’s large open computing facilities.
Figure 9. One use for comparative visualization and analysis is to study the effects of parameter settings on the resulting output. These examples, created by Hank Childs of LLNL, show the effect of varying two parameters (the coefficients of turbulent viscosity and buoyancy) on the velocity magnitude computed by a Rayleigh–Taylor instability simulation. For each coefficient, five parameter values were selected, and twenty-five runs were then executed, one for each combination of parameter values. VACET uses comparative analysis to show visually how these parameters affect velocity magnitude. The upper left image shows velocity magnitude from one run. In the upper right, grid points are colored by the index of the simulation having the maximum velocity at that point. That image shows that no one simulation dominates. In the lower left, grid points are colored by the buoyancy coefficient of the simulation having the maximum velocity. In the lower right, each grid point is colored by the turbulent viscosity coefficient of the simulation having the maximum velocity. This final image shows that most of the high speeds come from either very low or very high values of the turbulent viscosity coefficient.
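The comparison described in the figure 9 caption amounts to a per-grid-point "which run wins" reduction across an ensemble. The sketch below is a minimal illustration of that idea in plain Python, not the software used to produce the figure; the coefficient values, the tiny three-run demo data, and the function names `winning_run` and `color_by_parameter` are hypothetical.

```python
import itertools


def winning_run(fields):
    """For each grid point, return the index of the run with the largest value.

    fields[r][p] is the velocity magnitude of run r at grid point p.
    Ties go to the lowest run index.
    """
    npoints = len(fields[0])
    nruns = len(fields)
    return [max(range(nruns), key=lambda r: fields[r][p])
            for p in range(npoints)]


def color_by_parameter(winners, run_params, which):
    """Map each grid point to one parameter of its winning run.

    which = 0 selects the first parameter of the pair, 1 the second.
    """
    return [run_params[r][which] for r in winners]


# Hypothetical coefficient values: five per parameter, giving 25 runs
viscosity = [0.1, 0.2, 0.3, 0.4, 0.5]
buoyancy = [1.0, 2.0, 3.0, 4.0, 5.0]
all_runs = list(itertools.product(viscosity, buoyancy))
print(len(all_runs), "parameter pairs")

# Tiny made-up demo: three of those runs on a three-point grid
demo_params = [all_runs[0], all_runs[12], all_runs[24]]
demo_fields = [
    [1.0, 5.0, 2.0],   # run 0
    [3.0, 4.0, 2.5],   # run 1
    [2.0, 1.0, 9.0],   # run 2
]

winners = winning_run(demo_fields)
print("winning run per point:      ", winners)
print("viscosity of winning run:   ", color_by_parameter(winners, demo_params, 0))
print("buoyancy of winning run:    ", color_by_parameter(winners, demo_params, 1))
```

Each of the three derived lists corresponds to one of the colored panels described in the caption: the winning run index, and the two parameter values belonging to that run.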

The Applied Partial Differential Equations Center for Enabling Technologies
Dr. Phillip Colella, LBNL

The Applied Partial Differential Equations Center for Enabling Technologies (APDEC) will develop simulation tools based on finite-difference and finite-volume methods on logically rectangular structured grids combined with block-structured adaptive mesh refinement to represent multiscale behavior. These tools will be used for solving multiscale and multiphysics problems.
APDEC will build on the accomplishments of the first round of SciDAC investments. The Center will primarily support computational scientists who wish to use multiresolution tools to solve scientific problems in support of DOE missions. Potential collaborations will include astrophysics, combustion, and magnetic fusion. Other Center goals will be to obtain uniformly high performance for APDEC software, including parallel scalability from thousands to hundreds of thousands of processors, and to develop new adaptive mesh refinement capabilities required by application stakeholders, such as fourth-order finite-volume methods, adaptive mesh refinement for Maxwell’s equations, high-performance solvers for anisotropic problems, new methods for treating complex geometries, and particle and hybrid particle/continuum methods.
In all of these areas, there are one or more SciDAC projects that have specific requirements that can be met by APDEC. APDEC will interact with Science Application developers of structured-grid adaptive methods on a variety of issues, including model analysis, algorithm design, domain-specific modifications to the algorithms and software, and debugging.

Center for Interoperable Technologies for Advanced Petascale Simulations
Dr. Lori Diachin, LLNL
SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. The Center for Interoperable Technologies for Advanced Petascale Simulations (ITAPS) will deliver interoperable and interchangeable mesh, geometry, and field manipulation services of direct use to SciDAC applications with minimal intrusion into application codes.
The initial round of SciDAC investments developed and deployed a number of advanced technologies that were widely used by application scientists, including front tracking, mesh quality improvement via smoothing and swapping, and adaptive mesh refinement. The Center will continue to develop the most promising of those technologies and invest in critical new areas identified by SciDAC Application teams. Specifically, ITAPS will develop new geometry, mesh, and field services that support partial differential equation-constrained design optimization on deforming geometries, mesh alignment, adaptive mesh refinement, front tracking, verification, solution transfer operations, dynamic partitioning, and other parallel tools for petascale simulations. Underlying these services are the common interfaces that provide data-structure-neutral access to mesh, geometry, and field information. These interfaces are the key to providing uniform access to all ITAPS tools and to creating interoperability among ITAPS technologies. Using these technologies, ITAPS will work with application scientists to develop the next generation of petascale simulation codes.
DOE has made considerable investments in petascale computing to enable the advanced modeling and numerical simulation of systems relevant to its mission needs. So that numerical simulations can be constructed more efficiently using the latest technologies, the Center will develop interoperable and interchangeable mesh, geometry, and field manipulation tools that are of direct use to Science Applications including accelerator modeling and design, fusion energy science, groundwater reactive transport modeling and simulation, and nuclear energy.

Towards Optimal Petascale Simulations Center
Dr. David E. Keyes, Columbia University

Multiscale, multirate scientific and engineering applications in the SciDAC portfolio possess resolution requirements that are practically inexhaustible and demand execution on the highest-capability computers available, which will soon reach the petascale. Just as the variety of applications is enormous, so are their needs for mathematical software infrastructure. The chief bottleneck is often the solver. At their current scalability limits, many applications devote a vast majority of their operations to solvers because of solver algorithmic complexity that is superlinear in the problem size, whereas other phases scale linearly. Furthermore, the solver may be the phase of the simulation with the poorest parallel scalability due to intrinsic global dependencies.
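The scaling argument above can be made concrete with a toy operation-count model. The sketch assumes, purely for illustration, that solver work grows as N^1.5 while all other phases grow linearly in the problem size N; the exponent and both cost coefficients are hypothetical and are not measurements from any SciDAC code.

```python
def solver_fraction(n, alpha=1.5, solver_coeff=1.0, other_coeff=100.0):
    """Fraction of total operations spent in the solver for problem size n.

    Toy model (all constants are assumptions):
      solver operations = solver_coeff * n**alpha, with alpha > 1 (superlinear)
      other operations  = other_coeff * n          (linear)
    """
    solver_ops = solver_coeff * n ** alpha
    other_ops = other_coeff * n
    return solver_ops / (solver_ops + other_ops)


# With these constants the two costs are equal at n = 10**4;
# beyond that size the solver share keeps growing toward 100%.
for n in (10**2, 10**4, 10**6, 10**8):
    print(f"N = {n:>11,d}   solver share = {solver_fraction(n):.3f}")
```

Even though the solver starts out as the cheaper phase in this model, its superlinear growth means it accounts for nearly all operations at the largest sizes, which is why an optimal (near-linear) solver matters more as problems grow.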
The Towards Optimal Petascale Simulations (TOPS) Center will bring together the providers of some of the world’s most widely distributed, freely available, scalable solver software for the purpose of relieving this bottleneck for many specific SciDAC applications. Solver software that will be directly supported under TOPS includes hypre, PETSc, SUNDIALS, SuperLU, TAO, and Trilinos. Transparent access will also be provided to other solver software through the TOPS interface.
The Center’s primary goals will be the development, testing, and dissemination of solver software, especially for systems governed by partial differential equations. Upon discretization, these systems possess mathematical structure that must be exploited for optimal scalability; application-targeted algorithmic research will therefore be included. TOPS software development will give attention to high performance as well as interoperability among the solver components. The Center will also directly support the integration of TOPS solvers into SciDAC applications.

“Passage to the petascale is an unforgiving journey for an application code,” says Dr. Keyes. “Tiny imbalances must be redressed. Suboptimal features that can be hidden at 1,000 processors dominate at 100,000. TOPS has been to 100,000 processors with its solvers and is ready to help users take those next one or two steps in orders of magnitude. Our own codes will continue to improve in the process.”
Figure 10. Material temperature contours for a model Marshak wave problem in flux-limited radiation diffusion in a test mesh with 1.2 million tetrahedral elements, solved fully implicitly with PETSc’s Jacobian-free Newton-Krylov method on 256 processors of BlueGene.
DOE’s “A Science-Based Case for Large-Scale Simulation” (the “SCaLeS” report) identifies a diverse body of science poised for breakthroughs, given the ability to resolve more scales, sample larger ensembles, and/or couple together more phenomena simultaneously. However, for every model that is today ready for predictive simulation at the petascale (such as lattice gauge theory), there is another that will most rapidly achieve predictive power through petascale experimentation (such as subsurface hydrology in biology). There is a daunting dichotomy between the steady improvements in the capabilities and price performance of computer hardware and the difficulty of its full exploitation by the majority of practicing computational scientists, whose expertise lies, appropriately, in their science. The TOPS Center will directly respond to this dichotomy by developing, demonstrating, and disseminating scalable solver software, particularly in the areas of Accelerator Modeling and Design, Subsurface Reactive Transport, and Quantum Chromodynamics (QCD). An algorithmic demonstration from the TOPS project appears in figure 10.

Center for Technology for Advanced Scientific Component Software
Dr. David E. Bernholdt, ORNL

Computational scientists face ever-increasing challenges in creating, managing, and applying simulation software to scientific discovery. These challenges, arising from the growing complexity of the scientific problems and the rapid advances and increasing diversity in hardware platforms, impact researchers’ productivity throughout the lifecycle of their scientific software.
The next generation of scientific applications will be larger and more complex and will require contributions from more diverse groups of developers. Coupling simulations across multiple time and length scales will become the norm rather than the exception, and these simulations will run on more complicated and diverse hardware platforms. DOE’s Office of Science plans for a 100-fold increase in scientific computing capabilities between 2004 and 2007, and for full petaflop capabilities for open science by 2009. Systems with 10,000 to 100,000 processors are in the process of being deployed.
Earlier SciDAC investments developed the Common Component Architecture (CCA) and brought the benefits of component-based software engineering to high-performance scientific software. Scientific teams who have adopted the CCA are now realizing the advantages of this extensible environment, which facilitates software interoperability within and across scientific domains while addressing issues in programming language interoperability, domain-specific common interfaces, and dynamic composability. Teams increasingly report that the CCA has become integral to the future of their science. Figure 11 shows results from a CCA-based simulation.
Figure 11. CCA-based simulation of OH concentration in an advective-diffusive-reactive simulation using a fourth-order Runge–Kutta–Chebyshev integrator on four levels of adaptively refined mesh.
The Center for Technology for Advanced Scientific Component Software (TASCS) will transform component technology from a useful tool for forward-thinking software developers into an indispensable strategy across the entire spectrum of computational science. TASCS will extend the software component methodology in close collaboration with a number of key application projects, through an interlinked series of activities, leveraging the component environment to develop powerful new capabilities.
“TASCS leads the development of the Common Component Architecture, which brings component software technology to high-performance scientific computing,” explains Dr. Bernholdt. “This approach, already well established outside of HPC, will enable the creation of SciDAC-scale software for scientific discovery.”
The Center’s activities will focus on coupling parallel simulations, supporting emerging hardware and software paradigms for petascale computing, enhancing software quality and robustness, and dynamically adapting applications. TASCS will continue to enhance the core Common Component Architecture software environment, with an emphasis on improving usability, and will build a component ecosystem to provide more off-the-shelf components. Outreach activities will include tutorials and other educational activities as well as collaborations with numerous applications, Centers, and Institutes.

Center for Enabling Distributed Petascale Science
Dr. Ian Foster, ANL

DOE computational and experimental facilities will soon be producing petabytes of data per year in fields as diverse as astrophysics, biology, chemistry, combustion, fusion, high energy physics, nanoscience, and nuclear physics. However, these data will be useful only if application communities, which are often large and distributed, are able to access them effectively and thus translate them into knowledge. For this to occur, the data must be moved to where they are needed, or analysis must be enabled to take place near the data. These tasks are challenging in a petascale environment not only because of the sheer size of the data, but also because of the need to coordinate numerous shared resources, including CPUs, storage, and networks.
The Center for Enabling Distributed Petascale Science (CEDPS) will address these challenging tasks in close consultation with leading DOE Science Application groups. Specifically, the Center will design and develop technical innovations to allow for rapid and dependable data placement within a distributed high-performance environment, for the convenient construction of scalable services that provide reliable and high-performance processing of computation and data analysis requests from many remote clients, and for the troubleshooting of ultra-high-performance distributed activities from the perspective of both performance and functionality.
CEDPS will deploy and evaluate these powerful services and tools for data placement and science service construction in close collaboration with major DOE projects in high energy and nuclear physics, combustion, astrophysics, fusion, biology, and other sciences.

Center for Scalable Application Development Software for Advanced Architectures
Dr. Ken Kennedy, Rice University

Making effective use of petascale computers for DOE’s mission-critical applications will require state-of-the-art software. It will also require cooperation and coordination among academia, industry, national laboratories, and other government agencies. Through close collaborations with major DOE software and computing resource centers (such as those at ORNL, ANL, and LBNL) and other SciDAC Institutes and Centers, as well as interactions with computing industry system vendors and independent software developers, the Center for Scalable Application Development Software for Advanced Architectures (CSADS) will serve as a resource for scientific application teams. It will enable these teams to focus on application challenges and not be diverted by inadequate system software. “We are excited about the opportunity to conduct research on tools and transformation systems to help automate the steps of scaling from an abstract algorithm in a high-level notation to a high-performance implementation on different petascale platforms,” principal investigator Dr. Ken Kennedy explained. A diagram summarizing the Center’s approach is shown in figure 12.
Figure 12. The Center for Scalable Application Development Software for Advanced Architectures (CSADS) conducts research on tools and transformation systems to help automate the steps of scaling from an abstract algorithm in a high-level notation to a high-performance implementation on different petascale platforms. This diagram summarizes the CSADS approach to accomplishing these goals.
The Center will focus on software tools for increasing the productivity of scientific application development on high-end computer systems. It will conduct research on the technical challenges associated with the effective utilization of emerging multicore architectures by applications. Compilation methodology for existing languages such as Fortran 90 and emerging partitioned global address space languages including Co-Array Fortran will be emphasized, as well as auto-tuning techniques, rapid prototyping systems, and scalable performance tools. The Center will also develop and maintain open-source shared software infrastructures that enable the research and development community to construct, incrementally, programming support technologies that are portable across a broad range of high-end computer architectures. In particular, attention will be paid to the Open64 compiler infrastructure that supports a number of projects within DOE. Finally, the Center will establish a program of summer workshops and other activities to bring researchers in programming systems and tools for scalable computing together with technology consumers (that is, developers of applications, tools, and systems) in order to exchange information, discuss problems, and build community consensus in areas including parallel programming models, runtime libraries, instrumentation, and tools.

The Scientific Data Management Center for Enabling Technologies
Dr. Arie Shoshani, LBNL

With the increasing volume and complexity of data produced by ultrascale simulations and high-throughput experiments, scientists can end up spending more time managing their data than studying their results. Not surprisingly, the management of scientific data has been identified as one of the most important emerging needs of the scientific community. Effectively generating, managing, and analyzing this information requires a comprehensive, end-to-end approach to data management that encompasses all of the stages from the initial data acquisition to the final analysis and visualization of the data. The data management problems encountered across many DOE scientific domains share common technical challenges and can benefit from shared technology solutions.
The initial SciDAC investments succeeded in bringing a first set of advanced data management technologies to DOE application scientists in astrophysics, climate, fusion, and biology. Equally important was the establishment of collaborations with these scientists, which led to a better understanding of their science as well as of their forthcoming data management and data analytics challenges.
Building on these early successes, the Scientific Data Management Center will improve the scientific data management framework in order to address the needs of petascale science. Specifically, the Center will enhance and extend existing tools to allow for more interactivity and fault tolerance when managing scientists’ workflows, for better parallelism and feature extraction capabilities in their data analytics operations, and for greater efficiency and functionality in users’ interactions with local parallel file systems and remote storage. These improvements, which will be complemented by targeted data management efforts through partnerships with application and computer scientists, will prepare the scientific data management framework for the scalability and complexity challenges presented by hardware and applications at the petascale.

Scaling the Earth System Grid to Petascale Data Center for Enabling Technologies
Dr. Dean N. Williams, LLNL

Current efforts in climate modeling and climate science are generating massive amounts of data distributed around the globe. Under SciDAC-1, the Earth System Grid (ESG) was developed and deployed to make climate simulation data easily accessible to the climate modeling community. ESG currently has 2,300 registered users and manages 140 terabytes of data. It is estimated that more than 200 scientific publications are under way from analysis of ESG-delivered data in the past year alone. Despite these successes, ESG will face significant challenges in coming years as the size, complexity, and number of climate datasets grow dramatically.
Ames Laboratory, Ames, IA
Argonne National Laboratory, Argonne, IL
Binghamton University, Binghamton, NY
Boston University, Boston, MA
Brookhaven National Laboratory, Upton, NY
California Institute of Technology, Pasadena, CA
California State University, Northridge, CA
Carnegie Institution, Washington, DC
Carnegie Mellon University, Pittsburgh, PA
Central Michigan University, Mount Pleasant, MI
Colorado School of Mines, Golden, CO
Colorado State University, Fort Collins, CO
Columbia University, New York, NY
Cornell University, Ithaca, NY
DePaul University, Chicago, IL
Fermi National Accelerator Laboratory, Batavia, IL
General Atomics, San Diego, CA
Harvard University, Cambridge, MA
Idaho National Laboratory, Idaho Falls, ID
Illinois Institute of Technology, Chicago, IL
Indiana University, Bloomington, IN
Iowa State University, Ames, IA
Johns Hopkins University, Baltimore, MD
Lawrence Berkeley National Laboratory, Berkeley, CA
Lawrence Livermore National Laboratory, Livermore, CA
Los Alamos National Laboratory, Los Alamos, NM
Massachusetts Institute of Technology, Cambridge, MA
Michigan State University, East Lansing, MI
NASA Ames Research Center, Moffett Field, CA
National Center for Atmospheric Research, Boulder, CO
National Oceanic and Atmospheric Administration, Washington, DC
National Renewable Energy Laboratory, Golden, CO
North Carolina State University, Raleigh, NC
Northwestern University, Evanston, IL
Oak Ridge National Laboratory, Oak Ridge, TN
Ohio State University, Columbus, OH
Old Dominion University, Norfolk, VA
Pacific Northwest National Laboratory, Richland, WA
ParaTools Inc., Eugene, OR
Princeton Plasma Physics Laboratory, Princeton, NJ
Purdue University, West Lafayette, IN
Rensselaer Polytechnic Institute, Troy, NY
Rice University, Houston, TX
Sandia National Laboratories, Albuquerque, NM
San Diego State University, San Diego, CA
Stanford Linear Accelerator Center, Menlo Park, CA
Stanford University, Palo Alto, CA
State University of New York, Buffalo, NY
State University of New York, Stony Brook, NY
Tech-X Corporation, Boulder, CO
Thomas Jefferson National Accelerator Facility, Newport News, VA
University of Arizona, Tucson, AZ
University of British Columbia, Vancouver, British Columbia
University of California, Berkeley, CA
University of California, Davis, CA
University of California, Los Angeles, CA
University of California, San Diego, CA
University of California, Santa Cruz, CA
University of California, Santa Barbara, CA
University of Chicago, Chicago, IL
University of Cincinnati, Cincinnati, OH
University of Colorado, Boulder, CO
University of Florida, Gainesville, FL
University of Illinois, Champaign, IL
University of Iowa, Iowa City, IA
University of Maryland, College Park, MD
University of Michigan, Ann Arbor, MI
University of North Carolina, Chapel Hill, NC
University of Southern California, Los Angeles, CA
University of Tennessee, Knoxville, TN
University of Texas, Austin, TX
University of Utah, Salt Lake City, UT
University of Virginia, Charlottesville, VA
University of Washington, Seattle, WA
University of Wisconsin, Madison, WI
Vanderbilt University, Nashville, TN
Figure 13. This map illustrates the distribution of institutions participating in the thirty projects announced in September 2006. Over 75 laboratories, universities, and corporations are participating in the SciDAC-2 projects.
The Scaling the Earth System Grid to Petascale Data Center for Enabling Technologies has several goals. In addition to sustaining the existing ESG system, the Center will address projected scientific needs for data management and analysis, and will extend ESG to support the major Intergovernmental Panel on Climate Change assessment in 2010. The Center will also support the Climate Science Computational End Station at the DOE Leadership Computing Facility at ORNL, as well as the climate model evaluation activities under the proposed SciDAC-2 climate application. To achieve these goals, the Center will broaden ESG to support multiple types of model and observational data, provide more powerful (client-side) ESG access and analysis services, enhance interoperability between common climate analysis tools and ESG, and enable end-to-end simulation and analysis workflows.
The Center’s work will be relevant to efforts to deliver improved climate data and models to policy makers who must determine safe levels of greenhouse gases in the Earth’s atmosphere. It will also be relevant to efforts to reduce differences between observed and model-simulated temperatures at subcontinental scales, based on the use of several decades of recent data. The Center will improve access to and utility of datasets in support of DOE’s contribution to the Intergovernmental Panel on Climate Change assessment in 2010.
Figure 14. A visualization of magnetically unstable cylindrical Couette flow, this image shows the enstrophy and regions of high hydrodynamic dissipation. Simulations were conducted by Dr. Fausto Cattaneo (ANL and University of Chicago), Dr. Paul Fischer (ANL), and Dr. Aleksandr Obabko (University of Chicago).
Contributors:
Many of the investigators involved in SciDAC-2 projects have contributed text and images for this article; Debra Hershkowitz assisted with the writing and editing process.