SciDAC Review
SIMULATION SCALE
In HPC Simulations, How Much is ENOUGH?
Running a recently developed, high-performance molecular dynamics code on BlueGene/L—the IBM machine that ranks number one on the list of the world's Top500 supercomputers—researchers in the Modeling and Simulations Group at Lawrence Livermore National Laboratory are closing in on answers to fundamental issues of simulation scale: how large a simulation is large enough, and how small a scale is small enough?

With the advent of petaflop computing—and with exaflop computing on the horizon—simulations can now be performed on what were previously unthinkable scales. By exploring the continuum from microscopic to macroscopic simulations, researchers can come to grips with one of the most vexing limitations on earlier calculations: the presence of size effects. For example, using atomistic methods (figure 1) researchers can readily perform simulations that clearly resolve such intricate and emergent phenomena as the dynamic formation of a shock front or the mixing behavior at an interface in a sheared fluid flow. These methods, however, fail to capture the largest structures that may develop, as such structures in the simulation are necessarily limited by the overall system size. At the other end of the spectrum, simulations of fluid dynamics typically involve models of large-scale flow solving hydrodynamics equations that fail to resolve the finest structures, as these methods are limited in resolution by the grid size. As ever larger and more capable computers lead to simulations that are more complex and better resolved, the questions "how large is large enough?" and "how small is small enough?" become relevant and urgent. Researchers in the Modeling and Simulations Group at Lawrence Livermore National Laboratory (LLNL) have begun to investigate such issues of scale by making use of capability computers such as BlueGene/L.
Figure 1. Clusters of solid are seen forming spontaneously out of a molten tantalum sea as the system is compressed to progressively higher pressures. The nucleation and growth process is notoriously difficult to model accurately using entirely atomistic methods due to the length and time scales involved—here the large system size (greater than 16 million atoms) allows the process to proceed naturally. In this image, the atoms are shaded according to the value of their local order parameter: red indicates solid-like order; yellow indicates partial ordering (typically at a grain boundary or surface); and blue indicates liquid-like disorder. The deepest blue (most disordered) atoms are not shown for clarity.
Funded by the National Nuclear Security Administration (NNSA) Office of Advanced Simulation and Computation (ASC), capability computing in the national interest employs ever more powerful supercomputers to solve the largest and most demanding physics and mathematical problems, with the intention of minimizing the time to solution. A capability computer—in contrast to its better-known cousin, the capacity computer—is dedicated to running one problem, or at most a few problems, at a time. In this way, the powerful computer provides the capability to solve problems that would otherwise be prohibitively large. IBM's BlueGene/L (figure 2, p12) at LLNL solves multiscale, multiphysics problems for the Stockpile Stewardship Program, an effort that verifies the safety and reliability of the nation's nuclear weapons. The Stockpile Stewardship Program integrates the activities of the U.S. nuclear weapons complex, which includes three national laboratories, four production sites, and the Nevada Test Site. As nuclear weapons in the national stockpile continue to age, national laboratory scientists and engineers must ensure their performance and refurbish them as necessary, all without conducting nuclear tests.
By exploring the continuum from microscopic to macroscopic simulations, researchers can come to grips with one of the most vexing limitations on earlier calculations: the presence of size effects.
Capability machines give investigators the computational power they need to model the behavior of materials under extreme conditions. Under some conditions to be simulated, the required accuracy demands larger and larger simulations; in other scenarios, researchers must look at finer and finer scales. Consider two examples involving the behavior of molten metals under extreme conditions: pressure-induced solidification, and the development of hydrodynamic instabilities at a sheared interface. In both cases, large-scale or macroscopic structures develop from microscopic origins—the trick is to capture details at the "hard" length scale (large for solidification, small for instability) without losing detail at the "easy" length scale (small for solidification, large for instability). These dual needs challenge researchers to leverage the nation's capability computing resources to capture the physics results they require to accurately analyze the stockpile under the full range of conditions.

How Big is Big Enough?
Solidification, also known as crystallization, is a process that is ubiquitous in nature but poorly understood. Whether it is the result of a quench process (such as water becoming ice in a freezer) or the result of compression (such as occurs to iron in the Earth's core), solidification describes the creation of a more ordered state from a less ordered one. Initially, the atoms in molten metal have no long-range order—the atoms are positioned essentially at random. As the metal solidifies under the application of pressure, microscopic crystallites begin to form in the molten sea (figure 1, p11). At first these crystallites are few and far between; however, they grow rapidly and eventually coalesce into a jumbled mass of crystals. The details of this evolving microstructure affect the material properties of the metal.
Figure 2. BlueGene/L is the world's fastest computer. BlueGene/L technology builds a supercomputer one dual-processor chip at a time. Chips are aggregated into compute cards, which are then assembled into node cards. Each rack holds 32 node cards, and the full machine comprises 64 racks. BlueGene/L has been the number one machine on the Top500 list since November 2004. All this performance fits into a footprint of approximately 2,500 square feet, about 80% smaller than more conventional systems. Compare BlueGene/L's footprint with ASCI White's at about 10,000 square feet, ASCI Q's at about 14,000 square feet, and the Earth Simulator's at about 34,000 square feet. Power consumption also is a key issue in such large-scale computers, therefore the FPUs and CPUs are designed for low power consumption. Incorporated techniques range from the use of transistors with low leakage current, to local clock gating, to the ability to put the FPU or CPU/FPU pair to sleep. Furthermore, idle computational units are isolated from changing data so as to avoid unnecessary toggling.
The behavior of metals under extremes of pressure and temperature is of particular interest to weapons scientists. "To predict the performance and reliability of our aging nuclear weapons, we need detailed information on the solidification history of the various components," says Dr. Fred Streitz, Group Leader for the Modeling and Simulations Group. He explains, "Scientists want to understand exactly what happens to a metal as it transitions from the molten liquid into an ordered solid. The more we understand this process, the better models (figure 3) we can derive to describe the strength and failure of metals under a variety of conditions."
Solidification, also known as crystallization, is a process that is ubiquitous in nature but poorly understood.
Experimentally observing the initial stages of metal solidification is an extremely difficult task under the best of circumstances, and it is impossible under the conditions that would be present during a nuclear explosion. To better understand this process, scientists must resort to computer simulations to model the formation and growth of clusters. Although some of the earliest computer simulations ever performed dealt with crystallization, the process remains a very difficult one to model, primarily due to the issue of length scale. The phenomenon necessarily begins at the atomic scale, and the initial physics is dominated by the motion of individual atoms. Very quickly, however, the relevant length scale grows from microscopic to macroscopic as incipient grains continue to nucleate, grow, and coalesce. Scientists have debated for decades the size of simulation necessary to model this process correctly—it is only with the availability of capability computing at the level of BlueGene/L that the question can be answered.
Figure 3. Microstructure resulting from coalescence following solidification in a 32 million-atom simulation. Color indicates the degree of local order, ranging from blue (liquid-like order), to yellow (intermediate order) to red (solid-like order).
To model pressure-induced solidification in molten tantalum, LLNL researchers developed the world's fastest classical molecular dynamics code, ddcMD (sidebar "The Need for Speed," p14) and set about investigating the nucleation and growth process. Their simulation sizes ranged from 64,000 atoms to 524 million atoms, and revealed a size dependence on the nature of the results (figure 5, p15). Small simulations (for these purposes, a 64,000-atom simulation must be considered small!) were unable to capture the large-scale structure apparent in the larger simulations (as demonstrated with a 16,384,000-atom simulation). Simulations of 128,000 and 256,000 atoms produced results similar to the 64,000-atom case—artificial grain structures with maximum grain sizes limited by the size of the simulation cell. However, there is a true size scale defined by the material that should emerge during the solidification process, and a sufficiently large simulation would capture it. By measuring the characteristic size at coalescence in a series of simulations, the scientists were able to show that for all simulations of fewer than about 8 million atoms, the characteristic size followed standard scaling laws and was thus being determined by the size of the simulation, whereas for larger simulations the characteristic size was independent of simulation size (figure 6).
"These simulations allowed us to examine, for the first time, the process of solid formation at high temperature and pressure from the atomistic level. We can actually watch, atom-by-atom, as macroscopic grains grow out of the liquid and form structures," notes Dr. Streitz. "This helps us to better understand the properties of these metals and has important implications for the development of stronger metals, such as those that might be used for aircraft or automobile components as well as other applications."
For now, the answer to the question "how large is large enough?" appears to be millions of atoms for the solidification process in tantalum, although that is not the end of the story. "Our current understanding is that this size scale is set by the competition between the nucleation rate and the subsequent growth rate," explains staff physicist Dr. James Glosli. "But the nucleation rate is a material property that's a very sensitive function of the local environment, and any changes in material, or pressure or temperature will change this characteristic size. There are systems for which 8 million atoms isn't nearly enough."


Although some of the earliest computer simulations ever performed dealt with crystallization, the process remains a very difficult one to model, primarily due to the issue of length scale.
How Small is Small Enough?
At the other end of the size spectrum, the continuum mechanical response to outside physical influences of an assembly of macroscopic quantities of material (solid or liquid) is often modeled using hydrodynamics codes, or hydrocodes. Such hydrocodes are fundamental to understanding whether devices in the stockpile will function as designed, should they ever be called into use. In a continuum hydrodynamics simulation, a material's macroscopic properties (such as pressure, temperature, and flow fields) are defined as the collective properties of a sufficiently large ensemble of atoms. Continuum equations such as the Navier-Stokes equation are derived from conservation laws assuming that field gradients are small and material properties change slowly compared to atomic length and time scales. Although the Navier-Stokes equation provides a very powerful description of the continuum limit, it is predicated on this asymptotic analysis. There is no explicit notion of atoms or atomic motion in a hydrocode; the smallest unit is a small volume of material whose size is determined by the grid size of the simulation. The trend toward smaller and smaller length scales in both experiments and continuum modeling raises questions concerning the applicability of the hydrodynamic approximation as atomic lengths are approached. Does it make sense to use the Navier-Stokes equations for volume elements that are approaching the atomic size scale? At what length scale do the approximations inherent in hydrocodes break down?
Figure 5. About 16 million atoms tell the tale: these two sequences are taken as snapshots from simulations of solidification in tantalum. The top sequence displays (a) nucleation and (b) growth occurring in a 16,372,000-atom simulation, resulting in (c) a realistic distribution of grains and grain boundaries. The same process, but modeled using only 64,000 atoms (d-f) produced the artificial (and incorrect) final structure shown in (f).
"This question is not just the subject of arcane academic debate," Dr. Robert Rudd, staff physicist at LLNL, explains. "Investigators at sites such as the National Ignition Facility are producing gradients on extremely short time and spatial scales, while nanotechnologists are studying flow through carbon nanotubes and other systems in which the fluids are but a few atoms thick. Can these phenomena be understood using continuum analysis? What would be the signature that these fluids are discrete? No one has a good answer."
"These simulations allowed us to examine, for the first time, the process of solid formation at high temperature and pressure from the atomistic level."

DR. FRED STREIZ
LLNL
The greatest challenge is to identify when the simulation scale has become so small that it is dominated by the effects of the discrete nature of the atoms in the fluid. How does the researcher characterize that effect, and thus accurately predict whether to use hydrocodes or molecular dynamics (MD)? More generally, how can the experimenter match the correct code architecture and power with the appropriate level of refinement in the code? With the latest advances in capacity computing, investigators are in a position to learn how to answer these questions, lines of inquiry that only now can be clearly framed in the research.
Figure 6. Characteristic size, M, plotted as a function of simulation size, N. For simulations smaller than approximately eight million atoms, characteristic size was determined by simulation size. Simulations larger than about eight million atoms no longer exhibit this scaling, producing instead results that are system-size independent.
A team composed of scientists at LLNL and at IBM is starting to tackle such questions by investigating the formation and evolution of a model fluid instability—the Kelvin-Helmholtz (KH) instability—using molecular dynamics techniques. The KH instability (figures 7 and 8), occurs when fluid layers undergo shear flow, and is responsible for the striking forms seen on a windblown ocean or sand dune, in the Great Red Spot and other vortices in Jupiter's atmosphere, and as cloud billows in Earth's atmosphere. Although the development of the KH instability and the transition from smooth to turbulent flow have been studied extensively, never before has the evolution been modeled using a purely atomistic model such as MD.
For now, the answer to the question "how large is large enough?" appears to be millions of atoms for the solidification process in tantalum, although that is not the end of the story.
In contrast to hydrodynamics, MD utilizes no direct continuum-level input or equation. Instead of a continuum constitutive law, MD is based on an interatomic force law. Properties such as the equation of state (density as a function of temperature and pressure), transport coefficients such as the diffusivity and the viscosity, and interfacial properties such as the surface tension arise naturally from these underlying atomic forces. Interdiffusion at interfaces is physical, not a numerical artifact. "MD is fully resolved by nature—there is no mesh size to adjust, and no gradient is too steep," points out Dr. Rudd. "Numerically, MD simulation is the gold standard." Unlike a hydrodynamic simulation, where the challenge is to add sufficient degrees of freedom to obtain a converged result, the challenge for MD has always been to overcome the constraints on system size and simulation length imposed by computational resources. As Dr. Glosli explains, "What separates an actual science calculation from a 'toy' simulation is the product of simulation size and time. To truly reach the hydro time and length scale an MD simulation must model microns of material for nanoseconds of time." Using the ddcMD code on BlueGene/L, the research team did exactly that.
Figure 7. Parameters used in the setup of a Kelvin-Helmholtz (KH) simulation. The third dimension is eight atoms thick, or about 2 nm. The metal layers are liquids in thermal equilibrium at 5,500 K, and the interface is initially flat, apart from thermal fluctuations. The sample contained 2,013,345,000 atoms, split equally between the two metals. Figure 8. Detail from a two billion-atom simulation of the KH instability at a sheared copper-aluminum interface. Colors represent mass density, from low-density aluminum (blue) to high-density copper (red). The well-defined interface layer arises from the effects of atomic-scale diffusivity.
The researchers created the initial configuration by simulating one billion aluminum atoms and one billion copper atoms placed into a box measuring 2 nm × 5 µm × 2.9 µm. Periodic boundary conditions were used in the x and y directions; that is, the thin direction into the page and the horizontal flow direction in figure 9. The third dimension, z, was not periodic: atoms were confined through the use of a one-body potential. The use of the confining potential permitted the system to have a single interface between the two kinds of fluid, gaining a factor of two in computational efficiency over a two-interface fully periodic system. The initial velocity of each atom was selected at random from a Boltzmann thermal distribution to give a temperature of about 5,500 K in the rest frame of its fluid, and the two fluids were given an initial relative velocity of 2,000 ms-1. This step-function initial velocity profile sets up a strong shear layer at the flat interface between the two metals, which is unstable against the growth of small perturbations.
The simulation ran for one week on 131,072 processors of BlueGene/L, amassing an astounding 2.8 CPU-millennia in compute time. A single CPU would have taken almost 3,000 years to perform the same simulation. During this time the system advanced 750,000 time steps, corresponding to more than 1.5 nanoseconds of real time. Over the course of the simulation, the initially smooth interface is seen to develop a classic KH instability (figure 10, p18). The atoms are not visible in the snapshots shown in figure 10. To appreciate the scale of the simulation, consider that each pixel in the images represents, on average, about 250 atoms.
The greatest challenge is to identify when the simulation scale has become so small that it is dominated by the effects of the discrete nature of the atoms in the fluid.
The effect of the KH instability on the interface is quite pronounced. Initially, atomic-level interdiffusion leads to a slow broadening of the interface. This interdiffusion layer maintains a flat interface on average, but atomic-level thermal fluctuations perturb the interface and trigger the growth of wave structures. These structures grow in amplitude and eventually crest to form micron-scale vortices. As the KH instability grows vertically, the characteristic wavelength of the waves and vortices grows in the direction of the flow as well. Initially short-wavelength modes grow, but ultimately the larger structures grow at the expense of the small ones. This ripening, also known as coarsening, may be seen in figure 9, in which the evolution of the interface density with time is plotted. The initial short-wavelength structure evolves into the much larger vortex structures as evident from the broad bands, with fine structure arising from the ligaments and other processes that transport material in the vortices. A quantitative analysis of the growth of interface width and vorticity (sidebar "From Atoms to Hydrodynamics," p19) reveals the essence of the simulation; after an initial period that is dominated by motion at the atomic scale, the sheared layers develop a classical hydrodynamic KH instability.
Figure 9. A micron-scale atomistic simulation of the KH instability, with interface growth (y) plotted against time (t). This y–t diagram shows the time evolution of the mass of the interface distribution in y, colored such that yellow and red indicate a high mass (more copper) and the blue a lower mass (more aluminum). This roughly corresponds to the location of the interface as shown in figure 8, with red (blue) indicating a higher (lower) interface. The large swaths of red and yellow indicate large KH waves, whereas the smaller streaks of yellow at a lower angle indicate material being swept from the KH waves. The increase in structure size with time is quite evident as the short-wavelength modes at earlier times (bottom of figure) evolve into longer-wavelength modes.
These simulations are the first to describe the evolution of a KH instability from the atomic scale to continuum hydrodynamic length scales. More importantly, they reveal the microscopic origins of the instability—a result unobtainable with a hydrodynamic simulation. The ability to perform atomistic dynamics on the micron scale has opened the door to many possibilities for studying physical processes associated with the early stages of KH and other instabilities at appropriate length and time scales. However, the team is quick to point out that much that remains to be done. "From a scientific point of view, there are a lot of additional questions we have to ask," cautions Dr. Glosli. "We have just generated an enormous amount of data on BlueGene/L that we are now in the process of going through."

A Look Ahead
Fast-forward 20 years: What might researchers accomplish with the computers of 2028? Assuming the uninterrupted validity of Moore's Law (which projects the doubling of computer power every two years) we might imagine a machine approximately 1,000 times more powerful than today's capability computers—an exaflop computer.
The success of codes such as ddcMD on machines similar to BlueGene/L might suggest that with an exaflop computer, issues of size scale would be a thing of the past. This is not true, however. Even given the anticipated advances in computational horsepower, some problems cannot be addressed by atomistic techniques due to the length or time scales involved. One example of such a physical process is the evolution of grain structure occurring in a solidifying material immediately after the nucleation and growth process described earlier. The slow growth of larger grains at the expense of the smaller grains (ripening) happens in real materials on time frames ranging from microseconds to hours, and can encompass size scales from tens of nanometers to hundreds of microns (figure 11, p18). Attempting to simulate this process using purely atomistic techniques could require twenty to thirty orders of magnitude greater computational power than what is currently available. For such problems, even exaflop computing would not make the correct size and time scale accessible.
For a given amount of computing resource, any increase in the computational cost of the simulation requires a matching decrease in the system size that can be modeled in the same time.
Figure 10. Simulating the KH instability. The KH instability occurs when fluid layers in contact with one another are subjected to a shear flow. A characteristic wave pattern develops similar to that seen at a large scale on a wind-blown ocean or sand dune. Each of the four pairs of images represents a further development of perturbations at the interface. The box at the right of each large image is an enlarged detail of the area bounded by the white box in the adjacent image.
Nevertheless, as researchers at LLNL and IBM have shown, for certain classes of problems the simulation-size effects no longer present a barrier. For these problems, the availability of greater computer power enables instead higher-fidelity models and greater complexity in the simulations. One example of such complexity is in the choice of interatomic potentials (or force laws) that are at the heart of any MD simulation. Whereas simple interatomic potentials (such as the Embedded Atom-type potentials used for the fluid instability study) might require a few thousand floating-point operations (flops) per particle to evaluate a force, more complicated potentials (such as the Model Generalized Pseudopotential Theory potentials used in the tantalum solidification study) require millions of flops per particle. The difference in cost generally translates to improved accuracy or fidelity. "To be able to model truly realistic materials—one of the holy grails of capability computing—we need potentials that can describe multicomponent alloys, impurities, and defects," observes Dr. Streitz. "This is one of our next big challenges."
Figure 11. By using an adaptive mesh to describe the grain structure, researchers replace millions of atomic coordinates with thousands of grid coordinates, reducing the number of degrees of freedom by many orders of magnitude.
For a given amount of computing resource, any increase in the computational cost of the simulation requires a matching decrease in the system size that can be modeled in the same time. For instance, if a given simulation with a simple potential takes one week to run, then replacing the potential with a more sophisticated one that requires four times as much computation capacity will either take one month to run or will require a sample one-fourth the size. If a computer four times as powerful were available, however, then the more accurate potential could be used "for free." For simulations where size scale is no longer an issue, any increase in computer power translates into possible increases in complexity, greatly enhancing the fidelity of simulations. Perhaps the most valuable benefit that an exaflop computer will provide is not the thousand-fold increase in system size, but the thousand-fold increase in model complexity that it enables.
One current specific challenge that might be tackled by such a future machine involves the full modeling of a human cell or group of cells. This would allow researchers to demonstrate the connection of physics (at the atomic scale), to chemistry (at the molecular scale), to biology (at the cellular scale). This continuum could provide insights into solving such perplexing problems as correctly charactering specific effects within the renegade cells that proliferate unimpeded to form cancerous growths.
Contributor: Alane L. Alchorn, LLNL