SciDAC Review, DOE Office of Science
COMPUTATIONAL EPIDEMIOLOGY
A Cyber Environment to Support Pandemic Planning and Response
With an increasingly urbanized and more mobile population, the likelihood of a worldwide epidemic is increasing. By leveraging advances in high-performance computing, algorithmic science, computational social science, and network science, computational epidemiologists are now able to construct large-scale, high-fidelity models that will provide policy makers and public health officials with the information they need to be able to prevent and control such pandemics.
 
According to the World Health Organization, "An influenza pandemic occurs when a new influenza virus appears against which the human population has no immunity, resulting in several simultaneous epidemics worldwide with enormous numbers of deaths and illness." While many infectious disease outbreaks—for example, severe acute respiratory syndrome, Ebola, human immunodeficiency virus, or West Nile virus—can cause devastation, these infections are typically limited in their spread to either at-risk populations or localized areas. By contrast, pandemics—or worldwide simultaneous epidemics—pose disease control challenges unmatched by any other infectious disease event, natural or intentional.
Pandemic influenza, in particular, has the potential to be a rapidly developing global event in which most, if not all, populations worldwide are at risk of infection and illness. Influenza viruses have demonstrated their ability to spread worldwide within months or even weeks and to cause infections in all age groups. Without adequate planning and preparations, an influenza pandemic in the 21st century has the potential to overwhelm current public health and medical care capacities at all levels, despite the vast improvements made in medical technology in recent years.
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 1. The impact of network structure on vulnerability and criticality of individuals. Consider a simple network consisting of two complete graphs denoting closely knit communities joined by a set of simple chains. Informally, the vulnerability of a node denotes the probability that the node gets infected when the outbreak starts at a random node in the network. The criticality of a node denotes the reduction in average epidemic size when the node is vaccinated and hence cannot transmit the disease. In general, nodes that are highly vulnerable are not necessarily highly critical. Moreover, criticality and vulnerability of a node do not simply depend on the number of contacts (degree) of the node. Both these facts are illustrated by considering the blue nodes in the figure; these nodes are highly critical but not highly vulnerable.
Certain modern trends could actually increase the potential for pandemics to cause more illnesses and deaths than occurred in earlier pandemics. First, the global population is larger and increasingly urbanized, allowing viruses to be transmitted within populations more easily. Second, the human population is more mobile today than in the past, increasing exposure rates and allowing viruses to spread quickly. Third, partly because of advances in medical technology and availability of medical services, populations in many countries have increasing numbers of elderly persons and those with chronic medical conditions, thus increasing the number of individuals especially vulnerable in an influenza pandemic.
As the global landscape of pandemics changes, governments must adjust public policies and their implementations in an emergency in order to successfully stem the effects of pandemic infectious disease. The field of computational epidemiology has arisen as a new branch of epidemiology to support the study of health policy aimed at preventing and remedying pandemics. Computational epidemiology provides computer models of populations and diseases that allow public health officials to examine diverse scenarios and study the best interventions to use for different cases. By running various experiments and comparing different intervention strategies with such a model, in silico, public health officials can plan for real events and improve the response to pandemics. Appropriate and timely actions by these officials can save the lives of hundreds of thousands of people.
 
Network-Based Social Contact Models
Progress in the field of computational epidemiology has been toward better models used to study infectious disease with ever-greater accuracy, resolution, and predictive power. Achievements have been made in all aspects of the field: its theoretical underpinnings, its greater understanding of large systems, and its ability to model increasingly complex systems through advances in computing.
For nearly a century the models used by epidemiologists were based on ordinary differential equations and aggregate models. A significant breakthrough in the study of disease and population dynamics, these models were sufficiently rich to describe many infectious diseases. They were also useful for obtaining analytical expressions for a number of interesting parameters, such as the numbers of individuals in a population who were sick, infected, and recovered. However, these modeling approaches were limited in their ability to capture the complexity of human interaction that underlies disease transmission. The models often assumed that a population was partitioned into just a few subpopulations (by age, for example) and that there was a regular interaction structure within and between subpopulations. Additionally, the number of different subpopulation types considered by these models was small, and parameters such as mixing rate and reproductive number were either unknown or hard to observe.
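To make the aggregate approach concrete, here is a minimal sketch of the classic susceptible-infected-recovered (SIR) compartmental model, integrated with simple forward-Euler steps. The function name, the parameter values, and the step size are illustrative assumptions, not taken from any study described in this article. Note that the whole population is described by three numbers and a single mixing rate, which is exactly the limitation discussed above.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward-Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    Returns a list of (day, S, I, R) tuples sampled once per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = int(round(1 / dt))
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only: mixing rate beta and recovery rate gamma
    # give a basic reproductive number of beta / gamma = 2.
    curve = simulate_sir(population=100_000, initial_infected=10,
                         beta=0.5, gamma=0.25, days=200)
    peak_day, _, peak_infected, _ = max(curve, key=lambda row: row[2])
    print(f"peak on day {peak_day} with about {peak_infected:.0f} infected")
```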
Research published in the past few years has shown that these models are often inadequate for developing realistic public policies to control epidemics. Researchers now generally agree that network-based modeling is crucial to computational epidemiology. But there is an ongoing debate in the scientific community about how to construct these networks and what features of these networks are most important for understanding disease dynamics. Initially, mathematical results were proved by using variants of the well-studied Erdős-Rényi random graph models. Early research using this approach showed the importance of network heterogeneity and marked an important advance over the completely mixed models typical of an earlier generation.
Network-based models for studying epidemics may qualitatively change the kind of scientific questions that can be studied. The insights these models can provide are often fundamentally different from the insights provided by aggregate models. For example, the public health goal of "reducing the number of contacts by 30%" can be implemented in a network several different ways, such as reducing everyone's contact by 30% or removing a randomly chosen 30% of the contacts of the entire population, and the outcomes are different for these two implementations.
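The two implementations of a 30% contact reduction can be sketched on a toy network. In this hypothetical example, each contact is an edge weighted by contact time; the ring-shaped network, the function names, and the weights are assumptions made purely for illustration. Both functions remove the same total amount of contact, yet they leave structurally different networks behind: one keeps every edge but weakens it, while the other deletes some edges outright.

```python
import random


def make_ring_network(n):
    """Toy network: each person contacts the next two people around a ring."""
    edges = {}
    for i in range(n):
        for step in (1, 2):
            pair = tuple(sorted((i, (i + step) % n)))
            edges[pair] = 1.0  # one unit of contact time per edge
    return edges


def scale_all_contacts(edges, reduction):
    """Reduce everyone's contact: keep all edges, shrink each weight."""
    return {pair: w * (1.0 - reduction) for pair, w in edges.items()}


def remove_random_contacts(edges, reduction, rng):
    """Remove a random fraction of contacts; the rest keep full weight."""
    pairs = sorted(edges)
    n_remove = round(len(pairs) * reduction)
    removed = set(rng.sample(pairs, n_remove))
    return {pair: w for pair, w in edges.items() if pair not in removed}


def total_contact(edges):
    return sum(edges.values())


if __name__ == "__main__":
    network = make_ring_network(10)
    scaled = scale_all_contacts(network, 0.30)
    pruned = remove_random_contacts(network, 0.30, random.Random(0))
    for name, net in (("scaled", scaled), ("pruned", pruned)):
        print(name, "edges:", len(net), "total contact:", round(total_contact(net), 2))
```

Running a disease simulation over each resulting network is what reveals the different outcomes; the sketch only shows that the same stated goal maps to different network changes.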
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 2. A plot showing the difference in epidemic curves as one changes the network structure. The red curve represents the epi-curve using a synthetic representation of the contact network of New River Valley, Virginia. The green curve is obtained by randomly swapping 20% of the edges and leaving everything else the same. In particular, the degree of every vertex remains the same in the new network, and the probability of transmission also remains unchanged. Notice the change caused by disrupting the community structure of the network.
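The kind of rewiring described in the Figure 2 caption, in which edges are swapped while every vertex keeps its degree, is commonly done with a double-edge swap. The sketch below is a generic version of that technique under assumed names and a toy two-community network; it is not the procedure used to produce the figure.

```python
import random


def degree_preserving_rewire(edges, fraction, rng, max_tries=10_000):
    """Rewire roughly `fraction` of the edges using double-edge swaps.

    Each swap replaces (a, b) and (c, d) with (a, d) and (c, b), so every
    node keeps its degree while community structure is scrambled.
    """
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    target_swaps = int(len(edge_list) * fraction / 2)  # a swap changes two edges
    swaps = tries = 0
    while swaps < target_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop or change nothing
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        swaps += 1
    return edge_list


def degrees(edges):
    """Count how many edges touch each node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


def two_communities():
    """Toy network: two 5-person cliques joined by a single bridge edge."""
    edges = [(a, b) for a in range(5) for b in range(a + 1, 5)]
    edges += [(a, b) for a in range(5, 10) for b in range(a + 1, 10)]
    edges.append((4, 5))
    return edges


if __name__ == "__main__":
    original = two_communities()
    rewired = degree_preserving_rewire(original, 0.20, random.Random(1))
    print("degrees unchanged:", degrees(original) == degrees(rewired))
```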
But Dr. Madhav V. Marathe, deputy director of the Network Dynamics and Simulation Science Laboratory at Virginia Tech, adds a cautionary note. "Just like aggregate models, simple network models lack the structural variability present in realistic urban social networks," he says. "We thus seek first-principles methods for constructing realistic but synthetic social contact networks."
 
Computational Challenges of Network-Based Modeling
The Network Dynamics and Simulation Science Laboratory is using an endogenous representation of individual agents, such as representations of individual people, together with explicit interactions between these agents to generate and capture the disease spread across social networks. Social networks define the interrelationships between the individual members of the population; other aspects of the model, such as each individual's daily activities, are combined with the social network to provide specific occasions during which individuals may make contact with a contagious person and be likely to contract an infectious disease (sidebar "Methodology" p40).
These far more complex network-based models present a new set of computational challenges that require the use of high-performance computing. One reason is that the contact network is extremely large, irregular, and dynamic. The lack of symmetry and the constantly changing structure rule out the possibility of model reduction techniques often used for physical systems. Second, because of the stochastic nature of the models, a typical experimental design used to support a public policy question requires a large number of runs. Moreover, the size of the design often requires computational steering of experiments. Third, unlike physical systems, the diversity among agents is crucial in understanding the spatial and temporal spread of the disease.
Four simple examples illustrate the role of social network (both structure and individual attributes) on epidemics. Figure 1 (p37) shows that network structure can affect the vulnerability and criticality of a node. Figure 2 shows, in a real example from a study of the social contact network for the city of Chicago, how the epidemic varies with the structure of connectivity within the network. Figure 3 shows how vaccinating individuals in various age groups affects individuals in a specific age group. Figure 4 illustrates how working with realistic networks can shape the policies used to control epidemics.
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 3. The impact of network structure and individual heterogeneity on epidemics. This plot shows the effect of vaccinating various groups of individuals on young adults aged 16-20. The x-axis denotes the percentage of individuals (for each group) vaccinated, and the y-axis denotes the number of infected individuals in age group 16-20.
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 4. The role of networks in policy planning, illustrating how policy planning questions can be shaped and qualitatively different insights can be obtained by using network models. Consider a complete network in the left panel. Reducing the contacts by 50% can be done in three ways: (1) dividing the network into two isolated pieces, (2) removing some nodes completely, or (3) reducing the strength of the contacts. Each results in a different impact on the final disease outbreak. The efficacy of the policy depends on the transmission probability. For example, if transmission probability is high, then splitting the network into two groups will prevent the individuals in one of the groups from being infected, assuming the disease started at a single node.

Network Structure, Scale, and Detail
An article in the Spring 2008 issue of SciDAC Review ("In HPC Simulations, How Much Is Enough?", SciDAC Review 7, Spring 2008, p10) raised important questions in the context of molecular dynamics: how big is big, and how small is small? A slight variant of this question is pertinent when investigating network models for computational epidemiology, as well. For instance, the resolution of the model used at Virginia Tech is at the level of the individual person and individual location in a large population. Individual-based models allow the capture of individual attributes, behaviors, and activities and are appropriate for computational epidemiology. Such models are important for at least three reasons.
First, the structure of a social contact network is defined by the representation of these individual elements and their relationships with each other. The network structure, in turn, is crucial for understanding how disease might propagate. Second, realistic and implementable policies often cannot be designed and studied without individual-level representations. This statement sounds a bit counterintuitive at first. After all, policies are broad guidelines one uses to control disease spread. One does not usually decide that a specific person should not go to work, for example. The problem is that these general guidelines are applied over arbitrary subsets of the population that have some common defining characteristics, and it is not known which of these subsets should be targeted. A representation that is individual-based can be used to develop policies by aggregating the results at the required level. Third, just as in physical systems, certain phenomena of interest cannot be observed at higher levels of aggregation. For instance, closing a single school or investigating the effect of a family structure on disease propagation is simply not possible if coarse representations are used.
Furthermore, the needed resolution is likely to be disease-specific. For example, for infectious diseases such as human immunodeficiency virus, disease evolution occurs within the host. Hence, this characteristic of the infectious disease may need a substantially more detailed representation than at the resolution of the individual person, as it is currently set. The development of vaccines and drug-resistant strains might be motivating factors to adjust the resolution to capture important host-disease dynamics.
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 5. A schematic diagram for Step 1 describing how a synthetic social contact network is constructed by integrating each person's activities and the locations where these activities take place. The colors show where contacts occur.
The structure of the interaction network also greatly influences an outbreak of an infectious disease. For diseases such as influenza and smallpox, the network is a social proximity network; for vector-borne diseases such as malaria, the network consists of both people and the disease vector organism, the mosquito. Recent research shows that even for diseases such as obesity and diabetes, appropriate interaction networks might play an important role: it appears that individuals forming social networks influence each other's behavior, and such actions in turn have an important impact on the disease outcome. Moreover, the problem of infectious diseases is not limited to humans; appropriate interaction networks also define how diseases are transmitted between animals or plants or across species. Aggregate models are insufficient because they do not capture these interaction structures.
Constructing social contact networks with sufficient accuracy to model disease spread within a city such as Los Angeles or New York is challenging. Such networks cannot be constructed by using extensive measurements, except in very simple, restricted situations. Doing so would require knowledge about every individual's demographics, activities, and locations, which would be both technologically impossible and ethically undesirable in terms of individual privacy. So how does one accurately represent a city's populace? The networks have to be constructed synthetically by integrating or fusing available datasets with simulation-based generative methods.
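The core idea of deriving a contact network from who is where and when can be sketched in a few lines. The example below is a deliberately tiny illustration, assuming each synthetic person already has a daily schedule of location visits; the four people, their locations, and the function name are all made up, and the real generative methods that fuse census, activity, and location datasets are far more elaborate.

```python
from collections import defaultdict
from itertools import combinations


def build_contact_network(schedules):
    """Return {(person_a, person_b): hours_together} from activity schedules.

    `schedules` maps each person to a list of (location, start_hour, end_hour)
    visits. Two people are in contact whenever their visits to the same
    location overlap in time.
    """
    visits_by_location = defaultdict(list)
    for person, visits in schedules.items():
        for location, start, end in visits:
            visits_by_location[location].append((person, start, end))

    contacts = defaultdict(float)
    for visits in visits_by_location.values():
        for (p, s1, e1), (q, s2, e2) in combinations(visits, 2):
            overlap = min(e1, e2) - max(s1, s2)
            if p != q and overlap > 0:
                contacts[tuple(sorted((p, q)))] += overlap
    return dict(contacts)


if __name__ == "__main__":
    # Hypothetical four-person population; names and times are invented.
    schedules = {
        "ann":  [("home_1", 0, 8), ("office", 9, 17), ("home_1", 18, 24)],
        "bob":  [("home_1", 0, 7), ("school", 8, 15), ("home_1", 16, 24)],
        "cara": [("home_2", 0, 8), ("office", 10, 16), ("home_2", 17, 24)],
        "dev":  [("home_2", 0, 8), ("school", 8, 12), ("home_2", 13, 24)],
    }
    for pair, hours in sorted(build_contact_network(schedules).items()):
        print(pair, hours)
```

Edge weights here are hours of co-location, which a disease model could then convert into a transmission probability.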
Source: San Diego Supercomputer Center
Figure 6. A slice of the complete social network obtained by integrating individual social networks as depicted in figure 5. (Simulated social contacts for an individual in Chicago.)
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 7. A schematic diagram of the scalable cyberinfrastructure being developed at Virginia Tech.
 
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 8. A picture showing a visual analytic tool to support epidemiologist analysis of large-scale experiments. Top left, an egocentric network centered on a specific individual. Numbers in dark depict neighbors that are already infected. Bottom left and right, panels showing the kinds of information an epidemiologist cares about, including the overall epidemic curve, the home location of the individual, and various statistics.
Simulating virtual epidemics in a synthetic population the size of the U.S. population (currently around 300 million individuals) is even more challenging if one uses individual-based models with a sufficient level of detail to enable the exploration of interesting policy questions. But the advantage of simulations in such large geographic regions is clear: insights can be gained into phenomena that cannot be observed and understood when studying small neighborhoods. A good example of new phenomena observed in this manner is a phase transition point where small outbreaks take off and become epidemics.
Ideally, to study pandemics, one should consider the entire global population. The current state of the art is not yet at this level. Nevertheless, several groups are on the threshold of being able to model the entire global population, and researchers expect to see the first results within the next two years.
 
Developing an Integrated Modeling Environment
Simdemics is an integrated modeling environment developed by the team at Virginia Tech to support federal, state, and local government response to pandemics and epidemics. Simdemics uses network-based models at the resolution of individual people within a population. It yields detailed information concerning the demographic and geographic distributions of disease and provides decision makers with information about the consequences of a biological attack or natural disease outbreak, the resulting demand for health services, and the feasibility and effectiveness of response options. It leverages many recent advances in high-performance computing, theory of algorithms, social and behavioral sciences, and complex networks to provide new insights into understanding and controlling epidemics. In return, Simdemics has motivated novel research questions in complex networks, high-performance computing, and simulation science. Early versions of Simdemics concentrated on methods for detailed construction of social networks and simulating the disease dynamics. Initially, studies using Simdemics were done to evaluate simple epidemic intervention strategies. In response to continued interactions with analysts, however, Simdemics has been extended to improve its usability, scalability, and accuracy. These extensions are discussed below.
 
Scalable Models of Disease Propagation
Representing and analyzing disease dynamics over large, unstructured, and time-varying social contact networks required new work in high-performance computing. To address the scaling problem typical in algorithms implemented for high-performance computing, the group at Virginia Tech has developed and implemented three parallel algorithms over the past 10 years: EpiSims, EpiSimdemics, and EpiFast. These differ in terms of the tradeoff they provide between computation speed and model realism and sophistication. All three algorithms can be executed on a distributed-memory cluster.
EpiSims is very general but substantially slower than the other two. EpiSimdemics falls in the middle in its tradeoff between speed and model generality. It can represent virtually all the existing models of between-host disease propagation. It supports fully dynamic social networks, and it also has the ability to represent a large collection of behavioral specifications. EpiFast takes a different approach to the problem of simulating disease spread. It capitalizes on research that showed a connection between percolation processes and finding connected components in a graph. This result was extended to obtain the first combinatorial algorithm for simulating disease spread after showing how time-varying progression disease dynamics can be mapped to the shortest path problem on a suitably defined weighted graph. EpiFast is extremely fast, completing one run on social contact networks with 10 million nodes in about two minutes using ten 2.5 GHz processors. However, it is not as general as EpiSimdemics. It works with a specific class of models that capture between-host disease transmission. Other generalizations are possible but result in increased running times.
"Each method thus is valuable," says Dr. Marathe. "We generally use EpiFast for short turnaround studies and use EpiSimdemics for studies that require more sophisticated behavioral and disease transmission models."
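The connection between percolation and connected components can be illustrated with a small sketch. Assuming the simplest case, in which each contact transmits independently with a fixed probability, one outbreak sample amounts to keeping each edge with that probability and then collecting everything reachable from the first case. This is a textbook illustration with assumed names, not the EpiFast algorithm itself, which also handles time-varying disease progression.

```python
import random
from collections import deque


def percolation_outbreak(nodes, edges, transmission_prob, seed_node, rng):
    """Return the set of nodes infected in one percolation sample."""
    neighbors = {n: [] for n in nodes}
    for a, b in edges:
        # An "open" edge is one along which the disease would be transmitted.
        if rng.random() < transmission_prob:
            neighbors[a].append(b)
            neighbors[b].append(a)

    # The outbreak is the connected component containing the seed,
    # found here with a breadth-first search over the open edges.
    infected = {seed_node}
    queue = deque([seed_node])
    while queue:
        current = queue.popleft()
        for other in neighbors[current]:
            if other not in infected:
                infected.add(other)
                queue.append(other)
    return infected


if __name__ == "__main__":
    nodes = range(6)
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
    rng = random.Random(42)
    sizes = [len(percolation_outbreak(nodes, edges, 0.5, 0, rng))
             for _ in range(1000)]
    print("mean outbreak size:", sum(sizes) / len(sizes))
```

Because the model is stochastic, a single sample says little; averaging over many samples, as in the usage lines, is what yields an expected outbreak size.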
 
Representing Coevolution of Interventions, Social Networks, and Disease Dynamics
The primary goal of an epidemiologist is to control the spread of infectious disease through the application of interventions, guided by public policy. These interventions induce a behavioral change in individuals at the same time as individuals self-impose behavioral changes in response to their perception of how the disease is evolving. Both of these factors imply that the underlying social network is constantly changing. Indeed, individual behaviors, disease dynamics, and the social contact networks that they generate interact and coevolve as individuals try to avoid infection and public health interventions have their effects. For example, consider a potentially lethal disease such as avian influenza. People may be expected to change their contact patterns when they perceive a potential threat due to the onset of avian influenza. This change in turn will alter the epidemic dynamics.
In addition to the coevolution of the epidemic and the social contact network, another component affects the dynamic—public policy. Policies such as vaccinations, flu shots, and school closures put in place by public health authorities cause significant changes to the social network. Policy planning has been a central focus of epidemiological research over the years.
An example that illustrates this issue is policy planning in the workplace. Individuals who are suffering the symptoms of a cold or flu have to ask themselves, "Should I go to work today even though I am coming down with the flu?" This is a small decision, but it is faced simultaneously by millions of people throughout the country every day during cold and flu season. Either decision, to go to work or to stay home, impacts the disease and the economy: the immediate economic impact of absenteeism as a result of colds and influenza in the United States in 1980 was estimated to have been $6.5 billion. While a fraction of these infections arises from exposure outside the workplace, many infections occur because a coworker decided the consequences of possibly transmitting the disease were less important than the certain consequences of staying home. The term presenteeism has been coined to describe this problem.
 
Simulating Coevolving Networks
Although interventions and individual behavioral changes were studied in the past, their implementations were carried out in an ad hoc manner. Computational efforts concentrated first on developing fast methods for simulating disease progression. Yet the factors leading to coevolution within the underlying system should make it apparent that fast methods for disease progression that do not consider the interaction with policies and individual behaviors are insufficient. Also needed is a modeling environment in which simulation of disease spread is carried out in lock step with the interventions that are instituted and their resulting effect on the network structure and individual attributes.
This is where the classical analogy between percolation and epidemics breaks down. To date, most of the research efforts in building large-scale models have represented this coevolution in an ad hoc manner. Recent advances in artificial intelligence and operations research are likely to be useful in representing and analyzing this aspect of the model. Two kinds of mathematical models are at play: graphical coevolving discrete dynamical systems, or GCDDS (sidebar "Methodology" p40), form the basis for simulating the coevolving dynamics, while partially observable Markov decision processes (POMDPs) and n-way games are suitable for representing and reasoning about interventions and individual behaviors. These require computations over the configuration space of these dynamical systems. Note that the configuration space over which POMDPs have to reason is exponentially larger than their representations. Simulations that implement a GCDDS are computationally as well as conceptually much harder than simulations that simply implement basic percolation processes. In general, simulations implementing a GCDDS comprise three steps that are repeated: (1) simulate a step of disease progression in the network; (2) evaluate the state of the disease and test whether one or more triggering conditions hold; and (3) apply applicable pharmaceutical or nonpharmaceutical interventions that change the social network structure or individual disease model. The model in Simdemics facilitates this type of representation and this multistep process. The triggering condition can be based on an individual or on a subpopulation and, in general, can involve evaluating a complicated function. An intervention likewise applies either to individuals or to a subset of individuals.
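The repeated three-step loop can be sketched as follows. This is a minimal illustration under stated assumptions: a discrete-time susceptible-infectious-recovered process, edges tagged with the setting where contact occurs, a trigger expressed as a function of the population state, and an intervention that rewrites the edge set. All names are hypothetical, and the sketch reflects neither the GCDDS formalism nor the Simdemics implementation.

```python
import random


def run_coevolving_epidemic(nodes, edges, seed_node, transmission_prob,
                            infectious_days, trigger, intervention, days, rng):
    """Repeat: (1) disease step, (2) check trigger, (3) apply intervention.

    `edges` is a collection of (person_a, person_b, setting) triples.
    Returns the final state of every node and the final edge set.
    """
    state = {n: "S" for n in nodes}
    state[seed_node] = "I"
    days_left = {seed_node: infectious_days}
    edges = set(edges)
    intervened = False
    for _ in range(days):
        # Step 1: one day of disease progression over the current network.
        newly_infected = set()
        for a, b, _setting in sorted(edges):
            for src, dst in ((a, b), (b, a)):
                if state[src] == "I" and state[dst] == "S":
                    if rng.random() < transmission_prob:
                        newly_infected.add(dst)
        for n in list(days_left):
            days_left[n] -= 1
            if days_left[n] == 0:
                state[n] = "R"
                del days_left[n]
        for n in newly_infected:
            state[n] = "I"
            days_left[n] = infectious_days
        # Step 2: evaluate the epidemic state against the triggering condition.
        if not intervened and trigger(state):
            # Step 3: the intervention changes the contact network itself.
            edges = intervention(edges)
            intervened = True
    return state, edges


def close_schools(edges):
    """Hypothetical intervention: remove every contact that occurs at school."""
    return {e for e in edges if e[2] != "school"}


if __name__ == "__main__":
    nodes = range(6)
    edges = [(0, 1, "school"), (1, 2, "school"), (2, 3, "home"),
             (3, 4, "school"), (4, 5, "school")]
    final_state, remaining = run_coevolving_epidemic(
        nodes, edges, seed_node=0, transmission_prob=0.8, infectious_days=2,
        trigger=lambda s: sum(v == "I" for v in s.values()) >= 2,
        intervention=close_schools, days=10, rng=random.Random(7))
    print(final_state, remaining)
```

The point of the structure is that the disease step always runs over whatever network the previous interventions left behind, which is what distinguishes it from a plain percolation calculation on a fixed graph.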
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 9. A possible analysis from Didactic. Shown here are four of the possible sets of epi-curves for the simulations chosen for analysis.
 
Building a Scalable Cyberinfrastructure
In addition to building parallel, scalable representations of the model within Simdemics, the team at Virginia Tech is working on ways to make Simdemics seamlessly accessible to users via today's web technology, thereby changing how such analytical tools are used to support studies leading to new public health policies. The objective is to connect researchers, educators, and experts worldwide by making an array of advanced computational resources available through high-speed networks and the Internet. To this end, the various components—software services, tools, data, grid computing, resource discovery services, and interfaces—are provided through a web interface, linking them to an underlying framework that runs the necessary software on high-performance computers. The user accesses the modeling environment in a familiar and comfortable way, simply by specifying the experimental design. The cyberinfrastructure (figure 7, p42) includes a data management environment and a visual analytics environment to support decision-making and consequence analysis. The results and analysis are returned to the user after the required computations are completed. The user is not expected to be an expert in high-performance computing and therefore is spared exposure to the machines these computations run on, the way the models are updated, the software systems used, and so forth. The viewpoint of "software as a service," so prevalent in today's information technology industry, is the one embraced here.
Just as the advent of search engines such as Google radically altered research and analysis of technical subjects across the board, the goal of the Simdemics cyberinfrastructure development is to make resources seamless, invisible, and indispensable in routine analytical efforts. The team at Virginia Tech has just completed the first working prototype of the system and delivered it to a user. This system, called Didactic, provides convenient, Internet-like access to high-performance, grid-based computational resources to support policy analysis, planning, course-of-action analysis, incident management, and training for preparation and protection of operational military forces faced with the threat of pandemic influenza (figure 11). As the technology continues to improve, it will provide public health analysts with unprecedented Internet-based access to data and models pertaining to large social organizations (sidebar "Didactic" p44).
The models are both compute- and memory-intensive. Experience in supporting pandemic planning studies for federal sponsors suggests the need for a mechanism by which users can have short-term access to a large number of high-performance computing resources. The TeraGrid is a natural choice, but it currently is not configured to support such requests because of its focus on research computing. Cloud computing might be an alternative, but it is still evolving. Environments such as SPRUCE that are designed to allow urgent computing tasks to be scheduled and completed rapidly on distributed high-performance computers in an emergency will become increasingly important (http://spruce.teragrid.org/).
Source: K. Bisset and M. Marathe, Virginia Tech Illustration: A. Tovey
Figure 10. "Standard Plot," one of the Didactic analysis types. The final average attack rates for the eight simulations chosen for analysis are shown. Here closing schools (sds) is the most effective intervention.
Practical Usefulness
Over the past 12 years Simdemics has been developed and used in a number of user-defined studies. These studies have guided its continued evolution and, equally important, helped identify new research questions at the interface of multi-agent modeling, data mining, network science, and high-performance computing. Recently, as part of the MIDAS project funded by the National Institutes of Health, Simdemics was used to analyze combinations of strategies for responding to a potential influenza pandemic. Results of the MIDAS analysis were reviewed in a letter report by the Institute of Medicine, Modeling Community Containment for Pandemic Influenza, and were published in the Proceedings of the National Academy of Sciences.
The MIDAS study considered both pharmaceutical and nonpharmaceutical interventions (NPIs) targeted at those parts of the population where they might most effectively control the spread of disease. NPIs aim to alter human social behaviors so as to mitigate an outbreak; interventions include closing schools or reducing contacts at work and in the community. In the course of the MIDAS study, the Virginia Tech researchers found that their overall methodology was suitable for estimating normal social contact patterns as well as changes in patterns resulting from NPIs. Arguably, it is difficult to generalize observations about transmission in observed outbreaks to hypothetical circumstances without such a generative, structurally calibrated model of social networks. For example, one can estimate from historical outbreaks that roughly 35% of influenza transmission occurs within a household and 65% occurs in the community (for example, at work or at school). However, if the proportion of transmission occurring in different contexts is a parameter of the model, it becomes impossible to say how this parameter might change as people's behaviors change. Moreover, since it is difficult to find out what spontaneous changes in behavior were happening during the historical outbreaks, it is possible that the effects of NPIs on overall transmission patterns are already included in transmission parameter estimates. The methods used in Simdemics infer the proportions given a social network from assumptions about relative transmission rates between people with different demographics, in effect separating the problem of estimating the social network from the problem of estimating transmission over that network.
 
Expanding the Cyber Environment and Its Application Domains
The Simdemics cyber environment comprises three components: (1) fast parallel simulations of disease propagation, (2) an environment for expressing and evaluating interventions and social network changes in conjunction with disease propagation, and (3) a scalable web-based architecture for analysts to set up, run, and analyze computer experiments. Researchers at Virginia Tech are currently adding data management middleware for efficiently storing the outputs of the simulations and allowing analysts to efficiently search and reason about them. This middleware will include a digital library, a backend database, and graphics and data-mining tools. In the next year or two researchers expect that it will become computationally feasible to develop simulations of global pandemics. The use of petascale computing platforms will play a crucial role in achieving this goal.
K. Bisset and M. Marathe, Virginia Tech
Figure 11. A screen shot from Didactic. The system provides users with convenient, Internet-like access to distributed computing resources for policy planning and analysis of evolving epidemics.
The cyberinfrastructure, in conjunction with the high-resolution modeling tools being developed, is designed to be useful in application domains beyond pandemic planning: urban multimodal transit planning; integrated telecommunication systems; and the interaction of commodity markets and social networks, such as those arising in the study of collective behavior, opinion dynamics, and economic transactions. It is part of a broader program in policy informatics and computational socio-technical science that our group is pursuing. The program seeks to build high-performance computing-based models and associated service-oriented architectures for analyzing coevolving, interdependent societal infrastructures.
 
Petaflop/s Computing and Future Simulations
On a machine with one petaflop/s of sustained performance, it is expected that it would take 7.5 seconds to run one national-scale simulation. By itself, a single run is not very useful. It takes many replicates of a run to capture the stochastic nature of the disease transmission process. In addition, typical case studies seek to understand the efficacy of various combinations of pharmaceutical and nonpharmaceutical interventions at different compliance rates. A set of experiments comparing combinations of four interventions, each with four levels, leads to a design with 256 cells, each requiring 50 replicates. Based on estimates, this will take 27 hours on a one petaflop/s machine.
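The 27-hour estimate follows directly from the figures quoted above. The short calculation below reproduces it, assuming the runs execute one after another with no scheduling or I/O overhead; the variable names are illustrative.

```python
# Figures quoted in the text.
SECONDS_PER_RUN = 7.5          # one national-scale simulation at 1 petaflop/s
INTERVENTIONS = 4              # interventions being compared
LEVELS_PER_INTERVENTION = 4    # levels tested for each intervention
REPLICATES_PER_CELL = 50       # repeated runs to capture stochastic variation

# Full factorial design: every combination of levels across interventions.
cells = LEVELS_PER_INTERVENTION ** INTERVENTIONS
total_runs = cells * REPLICATES_PER_CELL

# Assumption: runs are executed back to back with no overhead.
total_seconds = total_runs * SECONDS_PER_RUN
total_hours = total_seconds / 3600

print(f"{cells} cells, {total_runs} runs, {total_hours:.1f} hours")
```

The design has 4^4 = 256 cells and 12,800 runs in total, which at 7.5 seconds each comes to 96,000 seconds, or roughly 27 hours.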
Future computations will likely consider more complex intervention strategies. These interventions often result in individual behavior change—for example, individuals refraining from going to work if they predict that a number of their colleagues are likely to be sick and yet report to work. Simdemics can represent such behavioral adaptations at an individual level, but the overall run time is impacted. Thus, the increased complexity will need to be balanced by increases in hardware speed and software efficiency.
 
Contributors
Dr. Keith Bisset and Dr. Madhav Marathe, Virginia Tech
 
Acknowledgments
This project is the joint work of the faculty, staff, and students of the Network Dynamics and Simulation Science Laboratory, Virginia Bioinformatics Institute, Virginia Tech. We thank our sponsors at the National Science Foundation, National Institutes of Health, Centers for Disease Control and Prevention, Virginia Tech, and the Department of Defense for their support.
 
Further Reading
The Network Dynamics and Simulation Science Laboratory
http://ndssl.vbi.vt.edu/index.php

C. Barrett et al. 2008. EpiSimdemics: an efficient and scalable framework for simulating the spread of infectious disease on large social networks. In Supercomputing 08: International Conference for High Performance Computing, Networking, Storage and Analysis, article 37.

S. Eubank et al. 2004. Modeling disease outbreaks in realistic urban social networks. Nature 429: 180-184.

N. Ferguson et al. 2006. Strategies for mitigating an influenza pandemic. Nature 442: 448-452.

T. C. Germann et al. 2006. Mitigation strategies for pandemic influenza in the United States. Proc. Nat. Acad. Sci. U.S.A. 103(15): 5935-5940.

M. E. Halloran et al. 2008. Modeling targeted layered containment of an influenza pandemic in the United States. Proc. Nat. Acad. Sci. U.S.A. 105: 4639-4644.

H. Rahmandad and J. Sterman. 2008. Heterogeneity and network structure in the dynamics of diffusion: Comparing agent-based and differential equation models. Management Science 54(5): 998-1014.