Outbreak Science | Chapter 13: Mathematical Modeling of Infectious Disease by Operation Outbreak

FIGURE 13.0 | A phase portrait of a Susceptible-Infected-Removed epidemiological model.

CHAPTER 13

Mathematical Modeling of Infectious Disease

Selected Key Terms

Here are a few essential terms used in the science of mathematical modeling. By the end of this chapter, you should be able to apply these terms and understand how they relate to other critical concepts.

Derivatives

Deterministic

Differential Equations

Distribution

Infected

Mathematical Modeling

Random Event

State Variables

Removed

Susceptible

Susceptible-Infected-Removed (SIR) Model

Slope

Stochastic

Variables

Big Concepts

13.1: Introduction to Mathematical Modeling

Mathematical models use equations to simplify complex processes and predict possible outcomes from starting conditions. In epidemiology, we can model a set of features of a population and disease to understand and predict outbreak outcomes. Historically, mathematical modeling has been extremely helpful in creating short- and long-term forecasts of epidemics, which has helped leaders create policies to limit the spread of the pathogen.

13.2: Types of Mathematical Models

Different mathematical models help us answer different questions. Building a model means identifying the elements of a system (called variables) that are most relevant to the phenomenon we want to understand. We then characterize the interaction between these variables using mathematical expressions. Different models may emphasize different aspects of a system; for example, some models capture the randomness often found in real-world events. One system can have multiple valid models. Models are not expected to be perfect; the main goal of a model is to generate a reasonable prediction of what might occur given a specified set of initial conditions.

13.3: A Primer on Differential Equations

Differential equations model how one variable changes with respect to another. Differential equations make it possible to mathematically describe systems that change over time, which is helpful for understanding and predicting a complex system’s behavior. Since outbreaks are dynamic situations with many interacting elements, differential equations are commonly used in outbreak modeling. A key step in writing a differential equation is identifying the processes that increase or decrease a specific quantity.

13.4: The Susceptible-Infected-Removed (SIR) Model

The SIR Model is one of the most straightforward and powerful models in outbreak science. It divides a population into three compartments, which are also referred to as groups: Susceptible, Infected, and Removed. It then uses differential equations to model transitions between these compartments. Variations of the SIR model can be applied to different outbreak situations. The SIR model and other compartmentalized models are great tools for efficiently investigating, predicting, and controlling the spread of disease.

Vick’s Video Corner

Watch “Vick’s Video Corner” as an entry point for this chapter.


Mathematical Models

After reading this chapter, you will understand the fundamental concepts underlying mathematical models. You will closely examine one of the most well-known models used to predict the course of infectious disease outbreaks and the most common applications for outbreak response. Finally, you’ll recognize that many concepts discussed in this textbook are based on and play a role in mathematical modeling.

The year is 1760, and you are a mathematician living through the peak of a smallpox outbreak in Europe. Sickness is everywhere: about one out of every ten deaths is due to the deadly disease. Currently, the most effective form of smallpox prevention is inoculation, which exposes healthy individuals to small pieces of someone else’s smallpox scabs to cause a relatively controlled infection, leading to lifelong immunity. However, one of your colleagues isn’t fully convinced that inoculating the public will affect the course of the outbreak—yet.

You have many well-known mathematicians in your family, and as a child you developed a strong interest in the subject. At the behest of your father, an early pioneer of calculus, you studied medicine and business for a chance at a more profitable career. With experience in both mathematics and medicine, you now get to work on proving the benefit of inoculation.

Aiming to uncover inoculation’s impact on mortality, you start to collect data on the number of susceptible individuals who can become infected, the number currently infected, and the number of those who have recovered or died from the infection. You use equations to describe how these numbers change over time, building a mathematical model that forecasts how often individuals become infected and ultimately recover or die from the illness.

Using your equations, you not only predict the number of individuals who would survive each year if smallpox were eradicated, but also show a substantial increase in life expectancy if the population were inoculated against smallpox. Ultimately, your mathematical process demonstrates the compelling benefit of universal inoculation against smallpox. Congratulations! You have helped to convince your critics and created epidemiology’s first credited mathematical approach to outbreak forecasting.

“What I cannot create, I do not understand.”

—Richard Feynman, American Physicist

You are Daniel Bernoulli, a revolutionary mathematician and physicist (Figure 13.1). Your breakthrough epidemiological model brought a new perspective to the field of epidemiology that continues to have an impact on the way we tackle outbreaks today.

13.1: Introduction to Mathematical Modeling

Models are applied in many disciplines to help break down even the most complicated natural phenomena into their most basic elements. A model is a simplified representation of a real-world object or process, whether that representation is conceptual, mathematical, or physical. You’re already familiar with various types of models from reading earlier chapters in this textbook—as well as from your everyday experiences. Think back on the last time that you visited a new place; you probably consulted a map, which can be thought of as a model of a specific geographical location. The map showed you a simplified representation of your surroundings—highlighting major landmarks while omitting less important details like the cracks on the sidewalk—that helped you navigate the area. As another example, in Chapter 5: Biology of Infectious Agents, you saw many diagrams of pathogens, all of which are visual models of how these microbes are organized. In real life, microbes are much more variable and dynamic, but these model diagrams help us understand and think about their most important features.

Mathematical modeling takes the modeling approach a step further by using mathematical equations to describe real-world phenomena. These models describe complex processes with a defined set of equations that can be used, for example, to predict the likely outcomes of an event of interest. Mathematical modeling allows us to extend our predictive abilities beyond what would be possible with words and diagrams alone. Additionally, mathematical models help scientists explore and make predictions about situations that would be too complex, time-consuming, or costly to test in the real world through controlled laboratory experiments.

FIGURE 13.1 | A portrait of Daniel Bernoulli in 1750. Bernoulli was a Swiss-Dutch mathematician who pioneered the use of mathematical modeling for studying epidemiological phenomena.

You can think of models as “virtual simulators” where you can tweak conditions and see how things might play out without physically testing each combination. While the results from these simulated experiments cannot be taken as real-world data, they can still inform future experiments or give us a sense of the limits of the model. These capabilities become extremely valuable during an infectious disease outbreak, where many complex interactions are at play and the stakes are high.

Mathematical models use equations based on relationships between variables, which are components or characteristics of the system that change depending on the circumstances. During outbreaks, variables are those factors impacting the outcome of an outbreak such as the population’s vaccination rate or how infectious the disease is. These and other variables, along with their relationships, can provide numerical data that informs scientists about how an outbreak may run its course. Defining the relationships between multiple variables helps us predict outcomes for events that have not yet happened, as they help us anticipate the effects of changes to the system. In other words, when we provide numerical inputs, we’re able to generate informational outputs that better describe our system—in this case, the outbreak—as a whole.

This might all sound intimidating, but you’re likely already familiar with variables and data in mathematical modeling. In sports, for example, analysts use data from previous games to establish models that predict how teams and players will perform in future matches. In video games, models of fluid dynamics, collisions, and other physical phenomena are used to create virtual worlds that respond realistically to player actions. Generally, mathematical models are built from real-world data, and can be used to simulate the course of events that have not occurred before.

As you learned at the beginning of this chapter, the first mathematical model of epidemiological phenomena is credited to Bernoulli in the 1760s. While groundbreaking at its time, Bernoulli’s approach was relatively simple and specific to studying the impact of inoculation on smallpox outcomes. Since then, the field has undergone incredible growth and advancement, with generalizable and complex mathematical models that can involve a system of equations. In the field of infectious disease, we commonly construct mathematical models using epidemiological data to mathematically evaluate and predict disease outcomes.

For example, during a smallpox outbreak in 1838, British physician William Farr noted an interesting trend: case numbers rose and then fell following a similar pattern. His observation of this trend and his later work applying it to subsequent outbreaks led to the formalization of “Farr’s law,” which predicted how many newly infected people we would expect to see in any given outbreak over time.

Interested in artifacts? Scan this QR code or click on this link to see William Farr’s original 28-page abstract from 1840. A footnote on page 97 contains the following hypothesis: “...the causes of epidemics are generations of minute insects transmitted from one individual to another, through the medium of the atmosphere.”

In 1906, former British Medical Officer of Health Sir William Hamer proposed that the spread of infection was dependent on several factors, including how many people were susceptible as well as factors influencing how many people would be infected (e.g., contact rate, probability of transmission, and the duration of infectiousness). Then he, along with public health physician Sir Ronald Ross, developed a model for the spread of a pathogen through a population that would eventually become known as the Susceptible-Infected-Removed (SIR) model.

To build their model, Hamer and Ross divided the population into three compartments. The names of these, while connected to the epidemiological definitions, differ very slightly. We define these compartments in relation to mathematical modeling here:

1. Susceptible, which are people who could become infected.

2. Infected, which are people who are infected.

3. Removed, which are people who have either died of the disease or survived and gained immunity against it (in both cases, these individuals cannot be infected again).

The model describes how the number of susceptible, infected, and removed individuals changes over time, thus tracking the spread of the pathogen through a population (Figure 13.2). We will describe the underlying mathematics of the model in Section 13.4: The Susceptible-Infected-Removed (SIR) Model. For now, we want to introduce you to the basic concepts.

Hamer and Ross’ equations were further refined in 1927 by Scottish chemist William Kermack and Scottish physician Anderson Gray McKendrick. Their model, called the Kermack-McKendrick model, was able to incorporate more complex variables of an outbreak situation. One of their chief takeaways was that not all events are entirely predictable, highlighting the different impacts that additional variables such as birth, migration, and imperfect immunity might have.

Mathematical models have become a mainstay in our response to outbreaks, providing useful insight into situations that are difficult to predict and navigate using intuition alone. They have also been extremely helpful in creating short- and long-term forecasts of epidemics. Each time a new outbreak emerges, such as H1N1 in 2009, Ebola in 2014, and COVID-19 in 2020, the use of models has helped leaders understand complex situations, evaluate mitigation measures (such as vaccination, mask-wearing, and stay-at-home orders), and create policies to help mitigate or limit the spread of the virus. Even if you don’t yet understand all of the mathematical techniques behind these concepts, we hope you can appreciate just how much they inform the design and utility of some of our means of defense against infectious threats.

FIGURE 13.2 | Diagram of the SIR model. The SIR model is based on the concept that susceptible individuals can become infected, and infected individuals are eventually removed by

Guiding Features of Mathematical Models

All mathematical modeling begins with the observation of a phenomenon that you would like to understand better. Model creators start by defining a “problem” or the phenomenon they want to model. Once the problem is defined, the modeler will map out the variables that need to be included. In the example of the SIR model, the data include the initial numbers of people who are susceptible, infected, or removed.

The modeler must also define parameters, which are constant numerical values that define and influence a model’s output and function. In a mathematical model, a parameter is like a setting that helps define how the model behaves. It’s a number that doesn’t change while the model is running. For example, in the SIR model, a parameter beta (β) represents the transmission rate of the disease. We will learn more about β in Section 13.4: The Susceptible-Infected-Removed (SIR) Model. Once they have a good sense of their data and parameters, modelers assess the data available to them and gather the data that they will need to build the model.

In epidemiological modeling of infectious diseases, certain information is crucial to predicting the overall public health impact of an outbreak. Trends in previous outbreaks of a similar pathogen, data from current epidemiological surveillance initiatives, and social and biological information about the affected communities are particularly key to building effective models (Figure 13.3).

Previous outbreaks data

You might have heard the saying that the best predictor of the future is the past. This is also true in mathematical modeling. Data and other findings from previous outbreaks of a similar pathogen serve as a good starting point for building models. For instance, airborne pathogens spread through populations very differently from foodborne pathogens. As such, data from a previous influenza outbreak are more helpful in predicting potential spread of SARS-CoV-2 than data obtained about an E. coli outbreak at a salad bar.

FIGURE 13.3 | Data sources that contribute to infectious disease model construction. The data that are used to develop models can come from a variety of sources. This figure illustrates data plotted in three different charts that are used to develop mathematical models of outbreaks, including data from 1) previous related outbreaks (data, models, findings), 2) current outbreak epidemiological surveillance (cases, hospitalizations, genetic data), and 3) social and biological information collected about the population of interest (survey data, severity information, mobility), among many others that can be a good source depending on the research goals.

Epidemiological surveillance data

As you previously learned in Chapter 2: Epidemiology, surveillance data can include information about the demographics of a population, the number of cases of known diseases, the prevalence of behavioral risk factors among a population, and the risk factors and outcomes associated with certain diseases. Much of this information can be critical in creating and informing mathematical models. For example, in an Ebola outbreak, it would be very helpful to know how many people in a community are currently hospitalized for the disease, and how many others are sick at home, as this will give insight into both the case burden and severity of the disease. Overall, epidemiological surveillance data can serve as a natural comparison to predictive models, help mathematicians understand the general trends of diseases of interest, and attune their models to current observations.

Social and biological information

Mathematicians can also collect social and biological information from social surveys to inform their models. Social surveys are research methods that are used to collect information about individuals’ attitudes, behaviors, opinions, or characteristics within a specific population. These surveys are typically conducted through structured questionnaires or interviews with a representative sample of the population, and give insight into what people are likely to do during an outbreak. Finally, in epidemiological modeling of infectious diseases, “mobility” refers to the movement of individuals within and between geographic areas. It encompasses various aspects of human travel patterns, such as commuting, travel for work or leisure, migration, and transportation infrastructure. In practice, mobility data for epidemiological modeling is often obtained from sources such as transportation records, surveys, and census data. As technology develops, geospatial, mobility, and proximity data (currently anonymized for privacy) collected on mobile phones, cars, and other devices can also contribute to the modeling data. Advanced modeling techniques, including network-based models and spatial simulations, use this data to simulate and predict the spread of infectious diseases more accurately.

Most real-world phenomena are highly complex, depending on many interconnected factors. Because some of these components and their interactions may be unknown, it can be difficult to build a model that will be able to represent the system in its totality. Additionally, limitations in the feasibility of collecting data and available computational power can constrain how extensive a model can be.

In order to build a manageable model of a complex real-world scenario, we have to make critical decisions about what to include in the model and how to include it. When we choose which pieces of information are truly relevant to a model, we are making assumptions, which are conditions accepted as true for the purpose of developing a streamlined and relatively simple model. For example, when attempting to determine the number of infected individuals in a community, we might assume a constant population size (also known as a fixed population size), meaning that the number of individuals in a population remains the same over time, outside of the effects of the infectious disease. However, we recognize that, in reality, populations may shift slightly through births, deaths, and migration. These assumptions can be challenging to make, but thankfully, we can apply a few overarching principles to guide us as we make them. In Table 13.1, we show a few guiding questions that researchers might consider when designing a model.

Table 13.1: Guiding questions researchers consider to design a model

Modeling guiding questions | How does this help?

What are the critical components of the system that must be addressed by the model? | We can decide which variables to include in our model to answer a specific question.

What assumptions am I making when setting up this model, and what information is being lost as a result? | We can understand what variables the model does not capture, and assess what their absence might mean for the results it produces.

Is the model simple enough to be usable? | We can decide what computational resources we would need to implement this model and determine if we can use it efficiently in a real-world scenario.

Once we define our variables and make the necessary assumptions, we can start to build the mathematical equations that best describe the system.

No Model Is Perfect

At this point, you may be wondering: how great can a model be if we have to leave information out or make potentially inaccurate assumptions? In most real-life cases, there is no such thing as a perfect model. It is impossible to include every variable that may affect a real-world phenomenon we are interested in, but an imperfect model can still be useful when focused on a specific problem. Furthermore, different models have different key priorities and applications, allowing for one system to be represented by multiple models, each targeting distinct phenomena.

It’s also important to note that every scientist who develops a model runs the risk of introducing their own biases to the model. The people who build models will have their own viewpoints, blind spots, and goals for their research, which, unless carefully accounted for, might influence the construction and skew the outcomes. Just as an AI or machine learning model can be biased by the human data it is trained on, the data that mathematicians use might similarly be reflective of specific groups of people. This means that they might be modeling outcomes for a much smaller subset of the population than intended. In these ways, mathematical modeling has both social and empirical influences.

Let’s go over an example of an imperfect but useful mathematical model: the weather forecast (Figure 13.4). At first glance, weather may appear to be one of the most random occurrences in daily life. So, how are we able to predict what will happen with any accuracy at all? Numerical Weather Prediction (NWP) models simplify hundreds of potential variables (such as humidity, wind, precipitation, etc.) and the relationships among them into a series of equations. Due to the inherent chaos of our atmospheric system, the difficulty of gathering data in every location, and other challenges, these models can sometimes make inaccurate predictions. We continually use weather models because we recognize that they still have utility despite these limitations. If your model provides useful information that can answer your question with reasonable accuracy, then it has value. We note that weather models are typically much more complicated than the epidemiological models that we discuss in this chapter, but they still serve as useful examples of how to represent real-world events mathematically.

Importantly, model-building is an iterative process, in which the predictions of a model are compared to real-world outcomes and used to guide further model development. Consistencies between the two highlight areas where the model performs well, while discrepancies between the two can provide clues about which aspects of a model could be improved, or areas where more data need to be gathered to include in the model. The dialogue between modeling and data-gathering allows both practices to be refined and improved.

FIGURE 13.4 | Mathematical models help predict the weather. Different variables in the atmosphere, including humidity, winds, precipitation, etc., affect the weather. The mathematical models that we use to predict the weather take into account some of these variables and the relationships between them.

Stop to Think

1. What is a variable, and how does it relate to mathematical modeling?

2. What is a mathematical model?

3. What is the importance of making assumptions in mathematical models of real-world phenomena?

Figure 13.4 panel labels: Differences between columns of air; Differences between vertical layers.

13.2: Types of Mathematical Models

Now that you have been introduced to what mathematical models are, as well as their underlying principles, limitations, and utility, let’s dive into how we devise a mathematical model.

The Growth Of A Bacterial Population: A Simple Model

Let’s begin with a simple example of a mathematical model: determining how fast a population of bacteria grows.

In constructing a simple model for bacterial population growth, we need to start by asking ourselves: what factors contribute to the size of a bacterial population? One of the most important elements you may have considered is reproduction. As you might recall, in bacterial reproduction, each bacterial cell typically divides to create two new cells. Each of these two cells divides to produce two more, creating a total of four cells. Each of these four cells goes on to divide again, creating eight bacterial cells. In other words, the bacterial population doubles every generation. This pattern of increase is known as exponential growth (Figure 13.5).

Exponential growth can be expressed using the following function, which is a mathematical description of a relationship between variables: $P(n) = 2^n$.

In this equation, $n$ represents the number of generations, and $P(n)$ represents the number of bacteria present at generation $n$. This basic model can describe the growth of the bacterial population from a single bacterial cell. If we want to know the number of bacteria present at any generation, we plug the number of the generation into the equation. For example, at the start of the time period, when $n = 0$, $P(n)$ is $2^0$, which is equal to 1. In other words, there is one bacterial cell at the start of the time period, which is consistent with what we know to be true. In the next generation, $n = 1$, since our original cell has divided in two, we have $2^1 = 2$ total cells. Just as we predicted in Figure 13.5, this pattern continues across the next generations. We can further visualize and develop general insights about the system by using a chart like the one in Figure 13.6, which follows bacterial growth over time.
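The doubling rule, in which the population at generation n equals 2 raised to the power n, can be checked with a few lines of Python. This is only a minimal sketch: the function name `population` is our own choice for illustration and does not come from the chapter.

```python
def population(n):
    """Number of bacteria at generation n, starting from a single cell."""
    return 2 ** n

# Print the generation number and population for generations 0 through 4.
for generation in range(5):
    print(generation, population(generation))
```

Running the loop lists one cell at generation 0, two at generation 1, four at generation 2, and so on, matching the doubling pattern described in the text.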

This chart gives us a lot of useful information beyond the number of bacteria present in any given generation. For example, researchers studying wastewater contamination might be interested to know how many bacteria were present the day before, even if they hadn’t collected a sample on that day. To determine how many more cells there are in one generation compared to the previous one, we can use the equation $P(n) - P(n-1) = 2^n - 2^{n-1}$. These sorts of relationships can help us identify values for previously unknown variables and determine a more holistic picture of the system over time.

FIGURE 13.5 | Exponential growth of bacterial cells. In this figure, we are referring to the first bacterial cell as generation 0 and as each cell divides, the number of generations goes up. On the right side of the figure, you can see a visual representation of each cell division, adding two new cells, resulting in a doubling of the cell population every generation.

FIGURE 13.6 | Exponential growth of bacterial cells over time. The number of bacterial cells P(n) as a function of the number of generations n follows a classic pattern of exponential growth as shown on this chart.
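The generation-to-generation difference can also be sketched in Python. The snippet below assumes the single-cell doubling model (the population at generation n is 2 raised to the power n); the helper names `population` and `new_cells` are hypothetical and chosen only for this illustration.

```python
def population(n):
    """Number of bacteria at generation n, starting from a single cell."""
    return 2 ** n

def new_cells(n):
    """How many more cells exist at generation n than at generation n - 1."""
    return population(n) - population(n - 1)

# Compare the increase at each generation with the size of the previous generation.
for n in range(1, 6):
    print(n, new_cells(n), population(n - 1))
```

In this model the increase always equals the size of the previous generation, because 2 to the power n minus 2 to the power (n - 1) simplifies to 2 to the power (n - 1). Knowing one generation therefore lets us work backward to the one before it.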

As it stands, this model describes the growth of a bacterial population that starts with just one reproducing cell. But what about populations that started at different sizes? To account for different-sized starting populations, we simply multiply the equation by the number of bacterial cells that are able to replicate in the initial population; for simplicity, we will assume that all bacteria within the initial population, called $p_0$, are able to replicate. Now, our model accounts for the number of bacteria able to replicate at the beginning of the time period. The equation becomes

$P(n) = p_0 \cdot 2^n$,

where $P(n)$ is the population size at generation $n$.

Have you noticed any assumptions in the model so far? A significant one is that we are assuming that none of the bacteria die. Let’s make the model more realistic by introducing a new variable: a survival rate, s, which describes what proportion of bacteria from a given generation survive into the next. So, in our revised model, we have

$P(n) = p_0 \cdot (s \cdot 2)^n$,

where $s$ is the proportion of the population that survives in each generation, and $p_0$ is the initial population (the population at time 0).
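To see how the survival rate changes the picture, the revised model (initial population times the quantity s times 2, raised to the power n) can be written as a short Python function. This is a sketch under stated assumptions: the starting population of 100 cells and the survival rates of 0.75 and 0.5 are made-up illustration values, not data from the chapter.

```python
def population(n, p0=1, s=1.0):
    """Population at generation n: p0 * (s * 2) ** n.

    p0 is the initial population and s is the proportion of each
    generation that survives into the next.
    """
    return p0 * (s * 2) ** n

# Hypothetical example: 100 starting cells, 75% survival per generation.
for n in range(4):
    print(n, population(n, p0=100, s=0.75))

# With 50% survival, each generation only replaces itself, so the
# population stays at its starting size.
print(population(5, p0=100, s=0.5))
```

With s = 0.75 the population multiplies by 1.5 each generation instead of 2, and with s = 0.5 growth stops entirely. Setting p0 = 1 and s = 1 recovers the original single-cell doubling model.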

Now, we have a model that shows how the number of bacteria in a population might change over time while accounting for the survival rate in each generation. It’s important to also note that there are other potential factors influencing bacterial growth that our model does not consider, such as temperature, pH, or nutrients available. As you might imagine, our model would change if we were considering a bacterial population that resides in your acidic small intestine versus in a test tube in the lab. As we’ve discussed at length, a key part of model-building is deciding which variables are relevant, and which are not. Models should strive to represent a system accurately while still incorporating enough simplicity to be useful. Mathematical models can be as complex as the modeler chooses, but importantly, even a simple model like this gives a fairly accurate representation of bacterial growth and suits our specific purposes well. To model more complex events, we need to use other tools, which we’ll dive into below.

Types Of Models Used In Epidemiology

Epidemiologists and other outbreak responders employ a number of different mathematical models to predict and curtail the spread of disease. These models can be categorized in various ways, with one of the most common groupings being static vs. dynamic. Static models represent a system at a specific point in time, while dynamic models describe how a system evolves over time. Static models show a moment frozen in time, like a snapshot of a population taken by a camera. In the context of infectious disease, a static model might show the number of individuals in a community that are infected with COVID-19 at the height of the pandemic, without considering changes over time. On the other hand, a dynamic model would behave more like a short film, giving information on how various crucial metrics change over time. For example, a dynamic model might track changes in the population impacted by COVID-19 over different periods of time, such as before and after mask mandates or school closings.

Epidemiological models can also be categorized as linear or nonlinear. Linear models refer to simple equations in which the input and output have a direct, constant mathematical relationship that can be modeled with a straight line, while nonlinear models refer to equations in which the input and output have a nonconstant mathematical relationship that cannot be modeled with a straight line. We might use a linear model when measuring something that we know has a fixed rate of change or an otherwise linear relationship, like the number of sick patients an urgent care clinic might receive over an eight-hour day if we know that they always have a constant patient admittance rate of 15 patients per hour. However, when predicting the number of people who visit the zoo on a hot day, you might see more visitors per hour in the morning when the temperature is cooler and fewer visitors per hour during the afternoon when temperatures are hotter. Here, we would rely on a nonlinear model that factors in these non-constant visitor rates so that we more accurately capture the differences in the rates of change. We will discuss linear and nonlinear approaches in more depth in Section 13.3: A Primer on Differential Equations.

There are a number of other ways to categorize the models that are useful for epidemiological investigation. We will describe two of these in greater depth here: deterministic vs. stochastic, and compartmentalized vs. noncompartmentalized.

Deterministic vs. Stochastic

There are certain scenarios in which a given input will always generate the same output, meaning that a model of the scenario will always generate the same result when provided with the same values for its variables. Such models are said to be deterministic. For example, the area of a rectangle can always be calculated as Area = Length × Width, providing a direct and predictable result based on these two measurements. No matter how many times you ‘run’ this area model, every time you put in a length of 2 inches and a width of 4 inches, your output will be an area of 8 square inches. Similarly, a car’s speedometer shows its speed based on the rotation rate of the wheels, directly correlating wheel rotation with the car’s speed in a predictable manner. If your car’s speed is calculated as Speed = Rotation Rate × π × Tire Diameter, when driving with tires 0.5 meters in diameter and a rotation rate of 10 rotations per second, you can always expect a speed of about 15.7 meters per second (m/s), since 10 × π × 0.5 ≈ 15.7.
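
To make the idea of "running" a deterministic model concrete, here is a minimal Python sketch of the two formulas above. The function names are our own, chosen for illustration; the point is only that the same inputs always return the same output.

```python
import math


def rectangle_area(length, width):
    """Deterministic model: the same length and width always give the same area."""
    return length * width


def car_speed(rotations_per_second, tire_diameter_m):
    """Speed in m/s: rotation rate times the tire's circumference (pi * diameter)."""
    return rotations_per_second * math.pi * tire_diameter_m


# Running a deterministic model twice with the same inputs gives identical outputs.
first_run = rectangle_area(2, 4)
second_run = rectangle_area(2, 4)
print(first_run, second_run)          # 8 8
print(round(car_speed(10, 0.5), 1))   # 15.7
```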

But many processes have inherent randomness or variation, and they don’t always produce the same results twice, even with the same input. These events are said to show stochastic behavior. The predictability of these events cannot be precisely measured but can be estimated with probabilities, which you might already know from your math classes are the estimated likelihood for a given event to happen. So how can we account for these types of variations in outcome in our model?

Let’s first consider an everyday example of a stochastic process: flipping a coin. Have you ever flipped a coin to make a decision? If you have, you know that you can never be certain what the outcome will be. However, if you flipped a coin several times and recorded the results, you would likely see that the first 10 flips have a different breakdown of heads and tails compared to your next 10 flips: you might first get 7 heads and 3 tails, and then 4 heads and 6 tails. At the same time, you may already know that the probability of landing on either heads or tails is 50%. How can both of these things be true? To answer this question, let’s learn about the nature of random events.

A flip of a coin can be described as a random event because it is an event that may occur with some degree of uncertainty; one cannot know for sure what the outcome will be. Assuming the coin is fair and the flip is unbiased, each possible outcome—landing on either heads or tails—is also equally likely. This unpredictability and the equal probability of all outcomes are what make a coin flip a classic example of a random event. However, as we look at more occurrences of the random event (many flips of the coin), random differences can begin to cancel out and a noticeable trend emerges.

For example, let’s say we flip a coin 10 times. In this set of 10, since each outcome is a random event, we don’t know exactly how many times heads will come up versus tails, and with so few flips it often is not exactly 50/50. At this point, rather than physically flipping a coin many more times to generate a larger and more reliable sample size, let’s use a computer model to explore what happens during a longer sequence of coin flips. In other words, let’s use a model to illustrate a real-world phenomenon. In Figure 13.7, you can see an example for a series of random coin flips generated by a computer model. If you look at the first row, we find that 30% of the flips were heads and 70% were tails. On the next row, we find 90% heads and 10% tails. Neither of these is particularly close to our expected 50/50 split. However, all 100 coin flips taken together resulted in 55% heads and 45% tails. This division is much closer to the expected 50/50 proportion, and would likely grow even closer to 50/50 if we further increased the number of flips.

FIGURE 13.7 | Series of random coin flips generated by a computer model. This figure shows how repeated coin flips allow an expected value to emerge. This figure is generated from an actual computer simulation of random coin flips. The first ten coin flips result in 30% heads and 70% tails. The next ten flips result in 90% heads and 10% tails. After 100 coin flips, we find a proportion closer to 50/50, with 55% heads and 45% tails.

Using a computer to simulate the possible outcomes of many, many sets of coin flips will show us the distribution of the outcome. A distribution is the set of all possible values a variable could have and the probabilities or frequencies associated with each one. This set allows us to paint a comprehensive picture of not only every possible outcome we might expect but also how likely each one is to occur, which is crucial information for any system. Once we know the distribution, we can determine the expected value of our random variable by averaging its outcomes over repeated measurements. If we run such a stochastic model repeatedly, we can get a sense of this randomness and what the average scenario might look like, as well as the distribution of possible outcomes. For example, if we flipped a coin many times, we would uncover an expected value of 50% of the flips resulting in heads. This is one example of how useful it is to collect as much data as possible when constructing a model for stochastic processes. The expected value and distribution of the number of “heads” observed within 100 consecutive flips across 10,000 simulations is shown in Figure 13.8.
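
If you would like to try this yourself, the following Python sketch runs a coin-flip simulation of the same kind. It is not the program used to make the figures in this chapter: the seed, the function name, and the variable names are our own assumptions, so your percentages will differ from those in Figure 13.7.

```python
import random
from collections import Counter

random.seed(13)  # arbitrary seed so the sketch gives the same output each time


def count_heads(n_flips):
    """Flip a fair coin n_flips times and return how many flips were heads."""
    return sum(random.random() < 0.5 for _ in range(n_flips))


# A few small sets of 10 flips: the share of heads can stray far from 50%.
for _ in range(3):
    print(f"{count_heads(10) / 10:.0%} heads in 10 flips")

# 10,000 simulations of 100 flips each.
results = [count_heads(100) for _ in range(10_000)]
distribution = Counter(results)                # heads count -> number of simulations
expected_heads = sum(results) / len(results)   # average over repeated runs

print(f"average number of heads: {expected_heads:.2f}")
print("most common outcomes:", distribution.most_common(3))
```

Removing the `random.seed` line makes every run different, which is exactly the behavior that separates a stochastic model from a deterministic one.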

How is randomness built into stochastic models?

Stochastic models incorporate an element of randomness that produces an output that cannot be precisely predicted. Good stochastic models combine previously collected data with random components to produce a range of results based on the same inputs.

You might be wondering why we would want to build randomness into our models. After all, if we’re running a model to get a specific answer, wouldn’t having multiple different outputs from the same data be unhelpful? This variety is actually what makes stochastic models so effective at modeling real-world events, as naturally occurring events always have unpredictable elements. In the context of infectious disease, the random elements could be things like the unpredictable behavior of individuals in a population or the random mutation of a virus that makes it more transmissible. Both of these random events introduce variability and uncertainty.

Importantly, stochastic models can sometimes give a very wide range of possible outcomes. For example, a model might predict that 10 thousand to 50 thousand people will become infected with a given disease. While these numbers are very different, the fact that the model can provide a reasonable approximate range is very useful to outbreak responders nonetheless. Furthermore, a wider range may be more informative for outbreak professionals than a single number on its own. While a singular, specific output may appear ideal, the full range communicates a degree of uncertainty that allows health professionals to prepare for all likely possibilities.

FIGURE 13.8 | Coin flips distribution: number of heads observed within 100 consecutive flips in 10,000 simulations. Note that the highest number of simulations showed a value of around 50 heads, or 50% of the 100 flips. (The peak does not occur at exactly 50 due to the element of randomness.)

In practice, both deterministic and stochastic models can provide valuable insights into the dynamics of infectious disease outbreaks. Deterministic models are often used for making general predictions under assumed conditions, while stochastic models are more suitable for capturing the randomness and uncertainties inherent in real-world scenarios. Together, they allow outbreak scientists to predict key details regarding a disease’s spread as well as our responses to it.

Compartmentalized vs. Noncompartmentalized

Epidemiologists also use both compartmentalized and noncompartmentalized models in outbreak response. Compartmentalized models divide the population into smaller groups, called compartments, where each group represents a unique state or condition in the system. In epidemiology, an example of a compartment would be a disease status, such as “infected.” These models then describe the transitions between these groups. Hamer and Ross’s SIR model was the first compartmentalized model widely used in outbreak response, dividing the population into susceptible, infected, and removed groups.

Compartmentalized models use state variables, which represent the quantity or status of a specific group or compartment at any given time. These models often involve the concept of rates of movement or transition from one state to another. For example, this would include the rate at which infected people recover or pass away, moving them from the infected group to the removed group. Compartmentalized models can use a type of equation called a differential equation, which we will discuss in the next section, to model how the numbers of people in each of these compartments change over time. As we discussed, models can be described in many ways. For example, compartmental models can themselves be either deterministic, where the flows are precisely determined by the variables of the model, or stochastic, where randomness is included in the flows.
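
The difference between a deterministic flow and a stochastic flow can be sketched in a few lines of Python. This is only an illustration under assumptions we made up for the purpose: the 20% daily removal fraction and the function names do not come from any real outbreak or from the models later in this chapter.

```python
import random

REMOVAL_FRACTION = 0.2   # hypothetical: 20% of infected people are removed per day


def deterministic_step(infected, removed):
    """The flow is fixed by the model's variables: exactly 20% of I moves to R."""
    flow = REMOVAL_FRACTION * infected
    return infected - flow, removed + flow


def stochastic_step(infected, removed, rng):
    """Each infected person is removed with probability 20%, so the flow varies."""
    flow = sum(rng.random() < REMOVAL_FRACTION for _ in range(infected))
    return infected - flow, removed + flow


print(deterministic_step(100, 0))   # the same result on every run
rng = random.Random()               # unseeded: results differ between runs
for _ in range(3):
    print(stochastic_step(100, 0, rng))
```

In both versions the two compartments always add up to the same total; only the size of the flow between them is handled differently.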

There are also a number of noncompartmentalized models that have utility in outbreaks. For instance, network models simulate how disease spreads through contact networks across the entire population. Similarly, statistical models can identify factors associated with disease spread without necessarily dividing the population into compartments. They might look at the overall rate of infection in relation to variables like population density, vaccination rates, or social behaviors.

Now that you know more about the types of mathematical models and their components, let’s learn more about the mathematical equations that allow models to be run: differential equations.

Stop to Think

1. Name three important considerations when building a mathematical model.

2. What is the key characteristic of stochastic events?

3. Classify each of the following scenarios as stochastic or deterministic events:

a. Each time you sell a chocolate bar, you earn one dollar.

b. You flip a coin 20 times and count the number of heads.

c. You squeeze food coloring into a glass of water and observe the food coloring diffuse.

d. For every hour you do community service in the library, you receive one extra credit point.

13.3: A Primer on Differential Equations

Mathematical models can be built using various tools and techniques, such as algebraic equations and statistical methods, depending on the nature of the problem and the aspects of reality they are designed to represent. In physics and biology, including the study of disease transmission, many models involve studying quantities that change at a certain rate in relation to another variable, such as over time. To help understand and mathematically model these systems, we need to use differential equations, which are mathematical statements that describe how a system changes. Differential equations are constructed using derivatives, which measure the rate of change of a variable.

If you haven’t taken calculus, these concepts may be new to you, but don’t worry: for now, we just want you to get a sense of what they tell us and how they can be used. Derivatives are powerful mathematical tools with many real-world applications, and you will learn much more about them in advanced math classes.

The Anatomy Of A Differential Equation

All differential equations follow a basic form that describes how one variable changes with respect to another. The dependent variable is the variable being measured in an experiment that is expected to change in response to the independent variable. The independent variable is your input; rather than being influenced by other variables in the equation, it is the only variable that you are changing. A common independent variable in mathematical modeling is time.

FIGURE 13.9 | A linear function describing the relationship between two variables. In this figure, a straight line represents the relationship between two variables, t and y, where y is the dependent variable, as it changes in response to changes in the variable t.

Let’s consider a simple system containing variables t (representing time) and y, where y changes with respect to t (Figure 13.9). We consider y to be the dependent variable since it depends on t. Meanwhile, t is an independent variable. The relationship between y and t is shown in the graph in Figure 13.9, where the points labeled (t1, y1) and (t2, y2) show the output of y values at two separate times: t1 and t2.

We can mathematically describe y’s dependence on t using a concept that may be familiar to you from algebra: the line’s slope. Slope refers to the steepness of a function on a graph and is calculated as rise over run, where ‘rise’ represents the change in y-values and ‘run’ represents the change in values along the horizontal axis (here, the t-values).

Slope = (y2 - y1) / (t2 - t1).

We use the Greek letter Δ (pronounced “delta”) as a shorthand to signify the change in value of each of the two variables, allowing us to represent these changes in t and y values as

Δy = y2 - y1 and Δt = t2 - t1.

We can now rewrite our equation for slope as

Slope = Δy / Δt.

FIGURE 13.10 | A nonlinear function: a parabola, where y = t². For a nonlinear function, the slope is different at varying points. We can approximate the slope at the red point (t1, y1) by selecting a nearby point (shown in blue) and calculating the slope between the two points. As the blue point moves closer to the red point, our approximation gets more and more accurate.

With a straight line, calculating the slope is easy. But when many real-world processes follow more complicated patterns than straight lines, what can we do to characterize the behaviors of these more complicated nonlinear functions? As an example, let’s say we have the curve in Figure 13.10 and want to calculate the slope at the red point marked (t1, y1).

We can approximate the slope by selecting a nearby point and repeating our previous calculation. As Δt (that is, t2 - t1) gets smaller and smaller, the points get closer and closer, and our approximation of slope at the precise point (t1, y1) gets better and better. This approximation based on smaller and smaller distances between the point of interest and a nearby point is known as a limit. The limit of Δy / Δt as the change in t values (Δt) approaches zero is a derivative. This is now denoted as dy / dt rather than Δy / Δt, since d is often used to represent an infinitesimally small change in mathematics. This derivative can be used to precisely identify the slope at our point of interest.
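
You can watch this limit take shape numerically. The short Python sketch below uses the parabola y = t² from Figure 13.10 and computes the slope between a point and a nearby point for smaller and smaller values of Δt. The point t1 = 3 and the function names are our own choices for illustration.

```python
def y(t):
    """The parabola from Figure 13.10: y = t^2."""
    return t ** 2


def approximate_slope(t1, delta_t):
    """Slope between (t1, y(t1)) and a nearby point delta_t further along the curve."""
    t2 = t1 + delta_t
    return (y(t2) - y(t1)) / (t2 - t1)


t1 = 3.0  # hypothetical point of interest
for delta_t in [1.0, 0.1, 0.01, 0.001]:
    print(f"delta_t = {delta_t:<6} slope is about {approximate_slope(t1, delta_t):.4f}")

# For y = t^2, calculus gives the exact derivative dy/dt = 2t, which is 6 at t = 3.
```

As Δt shrinks, the printed slopes move toward 6, the exact value of the derivative at that point.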

Let’s review a simple real-life example of a differential equation in detail. Imagine that you have a bucket with a small hole at the bottom. You turn on a faucet, and water begins pouring into the bucket at a rate of 50 milliliters (mL) per second. At the same time, because of the hole, water escapes from the bucket at a rate of 20 mL per second. If the bucket can hold 1000 mL of water and we start with an empty bucket, when will it fill up?

Does this image of the flow of water in Figure 13.11 look familiar? It may remind you of our dynamic bathtub system in Figure 2.8 of Chapter 2: Epidemiology. That figure depicts a dynamic model of prevalence and incidence as individuals become infected, and then either recover (evaporation) or die (pour out) at different measurable rates. As you will see below, this system can also be represented with a differential equation.

The rate at which water flows into and leaves the bucket shown in Figure 13.11 can be modeled with derivatives. Since derivatives measure the instantaneous rate of change of a system, we can write two derivatives to express two rates with respect to time: 1) the rate at which water enters the bucket and 2) the rate at which water leaves the bucket:

Δwater in / Δt = 50 mL/second

Δwater out / Δt = 20 mL/second

As in previous examples, Δ represents change, and the letter t represents time, so Δt represents a change in time. For simplicity, we will set Δt to one second in both equations. The first equation tells us that we have 50 mL of water entering the bucket per second, and the second equation tells us that we have 20 mL leaving per second.

We can now write a differential equation, which represents the quantity we care about, the amount of water in the bucket, as a value that changes with respect to time:

Δwater in bucket / Δt = (Δwater in / Δt) - (Δwater out / Δt)

Plugging in the values given above, we can determine the rate at which the bucket is filled:

Δwater in bucket / Δt = 50 mL/second - 20 mL/second = 30 mL/second

Our differential equation demonstrates that the bucket fills at a net rate of 30 milliliters per second. Using a simple linear equation, we can now answer the question of how long it takes for the empty bucket, which can hold 1000 mL, to become full:

1000 mL / (30 mL/second) ≈ 33.3 seconds
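
The same answer can be reached by letting a computer step through time, which is how differential equations are often solved in practice when no simple formula is available. The Python sketch below is one possible way to do this; the step size of 0.01 seconds and the function names are our own assumptions.

```python
INFLOW_ML_PER_S = 50      # water entering the bucket
OUTFLOW_ML_PER_S = 20     # water escaping through the hole
CAPACITY_ML = 1000


def net_rate():
    """Net change in the bucket's water per second (inflow minus outflow)."""
    return INFLOW_ML_PER_S - OUTFLOW_ML_PER_S


def time_to_fill(step_s=0.01):
    """Step forward in time, adding net_rate() * step_s of water at each step."""
    water_ml, elapsed_s = 0.0, 0.0
    while water_ml < CAPACITY_ML:
        water_ml += net_rate() * step_s
        elapsed_s += step_s
    return elapsed_s


print(f"net rate: {net_rate()} mL/second")
print(f"direct calculation: {CAPACITY_ML / net_rate():.1f} seconds")
print(f"step-by-step simulation: {time_to_fill():.1f} seconds")
```

Because the net rate here is constant, the simulation and the direct calculation agree. The step-by-step approach becomes more valuable when the rate itself changes over time.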

Let’s consider an example of a nonlinear differential equation. Suppose that you are investigating the growth of a bacterial population on a Petri dish. Each day, you count the number of colonies on the plate. You plot these values on a graph. You want to know what the growth rate of your population is and whether that rate is changing over time. Given that the only pieces of data you have are the number of colonies and the time, how can you find the rate of growth?

Let’s now look at the number of colonies on your Petri dish over time. In Figure 13.12, we show a typical bacterial growth curve, which we can use to find the slope between any two points and determine the population’s growth rate between those points. The slope gradually increases as the bacterial population reaches a state of rapid cell growth and division, where the slope is the most steep. The bacteria populations ultimately reach a carrying capacity, at which their resources are depleted, and their environments are no longer able to accommodate growth, so reproduction is limited and slope declines. Ultimately, as the conditions no longer can accommodate the existing bacteria and more bacteria die off than are produced, the slope becomes negative. A differential equation could be used to model the changing population size over time throughout the experiment, taking into account the derivative of population size from this graph. In other words, you can calculate the growth rate between any two time points, which will vary depending on where on the graph you are, and predict the different population sizes at all points in time.

FIGURE 13.11 | Filling a leaky bucket. This bucket is both being filled and emptied at different rates. We can use a differential equation to determine when the bucket will eventually be fully filled.

FIGURE 13.12 | Bacterial growth curve. If we measure the number of bacteria in a population over time, we can plot these measurements on a curve, as shown here. The x-axis represents time, and the y-axis represents the number of bacteria. As the bacterial population grows, it experiences new growth constraints, such as environmental capacity, lifespan, and more. These constraints are represented on our graph by the changes in the slope of the curve over time (the derivative).

Stop to Think

1. How does a differential equation describe the rate of change of a quantity over time?

2. Name a process that can be modeled with differential equations. There are many possible answers to this question. (Hint: Think rates of change!)

13.4: The SIR Model

Now that we have discussed various types of models and the role that differential equations play in them, it is time to take a closer look at one of the most commonly used models for studying disease outbreaks, the SIR model.

The SIR Model: Conceptual Breakdown

Let’s start by discussing the susceptible, infected, and removed compartments in an outbreak setting, the three compartments we will use to build our model.

In creating an SIR model, we begin with a population of primarily healthy individuals, with only one (or, in reality, a few) individuals infected by the pathogen in question. We assume the majority of the population at the beginning of an outbreak is susceptible to the illness, meaning most individuals can become ill.

For infection to spread, there must be an initial condition in which one or more individuals in the population are infected with the disease. The infected individual(s) can then transmit the infection to the healthy, susceptible population around them, causing the infected population to increase and the susceptible population to decrease.

As time goes on, certain infected individuals will either recover or die from the illness. These individuals become part of the removed compartment, meaning they cannot be susceptible or infected. It might be intuitive that those who die are then considered part of the removed compartment: they are clearly no longer able to be infected. But what about those who recover? When someone gains immunity to a disease after having it, they cannot be reinfected, so they also become part of the removed population. In this way, both death and recovery increase the removed population and decrease the infected population. If individuals do not gain immunity after disease, they will re-enter the susceptible population, but we assume this is not the case in the SIR model.

The general progression of the compartments of the SIR model is modeled in Figure 13.13. Since the exchanges of proportions between the three populations occur continuously and at measurable rates, we can mathematically describe the movement of individuals between the compartments and create a model that uses differential equations to chart the likely course of the outbreak.

This compartmentalized model can help us predict the length of an outbreak, the expected number of infections, when the peak of an outbreak will be, and more. It can be updated to show the effects of different public health measures and therefore work as a guide for public health interventions.

The SIR Model: Assumptions And Limitations

Like all models, SIR models of outbreaks make some assumptions. We have already talked about one major assumption: recovery from infection provides permanent immunity. This is not always the case in real life, and other types of models account for the possibility that recovered individuals will eventually become susceptible again. We also assume that at the outset everyone in the population is susceptible. However, we now know that for most, if not all, infectious diseases, some proportion of the population may have a preexisting immunological or genetic resistance to infection or disease. Another consideration of our SIR model is that we assume the disease is brought into the population by a single index case (patient zero, or the first identified case), and that the rest of the population begins as susceptible. In a simple SIR model, we also assume that the rates at which people become infected, recover, or die from the disease are constant for the whole duration of the epidemic; there are no ‘surge’ infection rates or periods of swifter recoveries like you might observe in real life. More complex models can include rates that vary over time or based on given environmental or behavioral changes.

FIGURE 13.13 | Movement of individuals through the compartments of the SIR model. This stacked bar chart shows one way of visualizing the movement between the compartments of the SIR model over time. The size of the segments within each bar represents the number of people in each compartment at a given point in time. At the start of this particular outbreak, most of the individuals are in the susceptible compartment, with one or a few in the infected compartment. As time goes on and the infection spreads, the number of infected people increases as the number of susceptible people decreases. Finally, the number of removed people grows as individuals either die or gain immunity. Note that the total size of the bars, representing the total number of individuals, remains constant over time.

There are also a number of broader assumptions used in many epidemiological models. We previously noted that we often assume a constant population size in which there are no births, no deaths other than those caused by the illness, and no migrations in or out of the population. This is never truly the case in reality, particularly during an outbreak, when people may try to flee the affected area. This assumption also doesn’t account for the fact that there might be newcomers that enter the population during the course of the outbreak. Some of these new arrivals might already be sick and could even run the risk of introducing a new variant to the population.

We also assume that every individual in the population is equally likely to come in contact with an infected individual. A population in which every individual is equally likely to come in contact with each other is known as a well-mixed population. However, in reality, you’re a lot more likely to interact with people within your household, school, or workplace: those with the same backgrounds, interests, or ages as you. As you may recall from Chapter 10: Social Determinants of Health, many social drivers like socioeconomic status affect our interactions and, therefore, our risk of exposure. Finally, as you already know, not all of the deaths attributable to an outbreak are due to the disease itself, as hospitals become overwhelmed and other aspects of society change. Outbreaks put increased stress on every part of not only our healthcare systems but our social systems as well, often leading to adverse effects within these other areas.

Assumptions, while necessary to the SIR model, introduce limitations that should not be ignored. For example, the limitations we describe here hinder our ability to account for disparities in infection rates between different groups. To address these factors and many others, more complicated models can be developed. Even with the limitations we’ve discussed, the SIR model is one of the most useful tools at our disposal in an outbreak.

The SIR Model: Equation Breakdown

Now that you understand the assumptions and limitations of the SIR model, let’s have a look at its creation.

You already know the three main compartments represented in the SIR model: susceptible, infected, and removed individuals. These are our three dependent variables, S, I, and R, as they will all vary with respect to time t, our independent variable. We can represent these compartments at varying times as S(t), which is the number of healthy, susceptible individuals at time t who can be infected; I(t), which is the number of infected individuals at time t who have the capacity to transmit the illness to other members of the total population; and R(t), or the number of removed individuals at time t. These variables change over time, but it is important to note that they add up to the total population, N, at any given moment:

N = S(t) + I(t) + R(t)

Transition Rates

The interactions between our three compartments are largely summarized by two key transition rates: from susceptible to infected and from infected to removed. The first rate, known as the rate of infection, describes the transition of the susceptible S(t) population into the infected I(t) population over time, while the second rate, the rate of removal, describes the transition of the I(t) population into the removed R(t) population over time (Figure 13.14).

FIGURE 13.14 | A simple representation of the SIR model. The SIR model studies the interactions between susceptible, infected, and removed variables. The first rate is defined as the rate of infection and describes the transition of the susceptible to the infected populations. The second rate is known as the rate of removal and describes the transition of the infected population into the removed population.

Infection rate

To determine the rate of infection, we must ask ourselves how individuals become infected in the first place. As you have already learned from previous chapters, there are different means of disease transmission. The SIR model can be adapted to model a variety of transmission mechanisms, including person-to-person contact, airborne transmission, vector-borne transmission (like malaria spread by mosquitoes), or even waterborne and foodborne spread. Here we will discuss an SIR model for person-to-person transmission.

In a person-to-person transmission scenario, when a susceptible individual comes into contact with an infected individual, there is a likelihood that they will contract the illness. However, people interact with each other to varying degrees, and the probability of transmitting a disease is never 100%. To capture this mathematically, we use a parameter: in this instance, a quantity called the effective contact rate (beta, denoted as ß). We define ß as the product of two other quantities, k and b, where k is the average rate of contact between the susceptible and infected individuals and b is the probability of transmission during an interaction:

ß = kb

From here, we can develop an understanding of infection rates that can mathematically inform our model. Infection requires an interaction between a susceptible individual and an infected individual, and so the rate of infection depends on both the number of susceptible individuals, S(t), and the proportion of the population that is infected, I(t)/N. We use ß to scale these interactions by how often contact occurs and how likely each contact is to transmit the disease. Therefore, the equation to estimate the rate of infection with respect to time is

Rate of infection = ß S(t) I(t) / N

Note that both k and b are complicated quantities specific to each individual pathogenic outbreak that are determined from extensive epidemiological studies. These parameters are highly dependent on unique properties of both the pathogen and the population itself. For the purposes of this textbook, we won’t break down k or b; for now, we just want you to have an intuitive sense for what they represent.

From examining this equation, we can see that the parameter ß can be used to help determine how quickly people are likely to become infected and move from the susceptible group to the infected group. A large ß reflects an extremely contagious disease and high levels of interaction between individuals. If there is limited interaction (small k) and/or the disease is less contagious (small b), ß will be smaller. Importantly, both k and b can vary over the course of an outbreak.
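
To build intuition for how ß shapes the rate of infection, here is a small Python sketch of the two relationships described in this section, ß = kb and a rate of infection equal to ß S(t) I(t) / N. All of the numbers (k, b, and the compartment sizes) are hypothetical values chosen only for illustration, and the function names are our own.

```python
def effective_contact_rate(k, b):
    """beta (written as ß in the text) = k * b: contact rate times transmission probability."""
    return k * b


def infection_rate(beta, susceptible, infected, population):
    """New infections per unit time: beta * S(t) * I(t) / N."""
    return beta * susceptible * infected / population


# Hypothetical values chosen only for illustration, not taken from any real outbreak.
N = 1000
beta_high = effective_contact_rate(k=10, b=0.05)   # many contacts per day
beta_low = effective_contact_rate(k=4, b=0.05)     # fewer contacts, same pathogen

for beta in (beta_high, beta_low):
    rate = infection_rate(beta, susceptible=990, infected=10, population=N)
    print(f"beta = {beta:.2f}: about {rate:.2f} new infections per day")
```

Lowering k while leaving b unchanged, as in the second case, is one way to picture what happens when people reduce their contacts during an outbreak.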

Removal rate

Determining the rate of removal follows the same process as finding the rate of infection. First, we must ask how individuals get into the removed population. They can only be counted as removed after recovering or dying from the disease. With a larger I(t), more individuals will either recover or die, showing that the rate of removal is proportional to I(t). Since S, the number of susceptible individuals, has no impact on the likelihood of an individual recovering from infection, the rate of removal is only proportional to I(t).

ß = kb

The parameter known as gamma (denoted as γ) acts similarly to ß, but governs movement out of the infected compartment rather than into it. Functioning as a multiplier, it describes how quickly individuals recover (or die) from the infection. γ is inversely related to the duration of infection, which we refer to as d (which is different from the d used to describe derivatives). Therefore, γ = 1/d. Intuitively, we can think about the duration of infection as the time until removal by recovery or death. The longer the time until removal (greater d), the lower the rate of removal (smaller γ). We can now model our rate of removal as follows:

$$\text{Rate}_{\text{removal}}(t) = \gamma I(t)$$
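The inverse relationship between γ and the duration of infection d can be illustrated with a short Python sketch. The function name and the durations used here are hypothetical, chosen only to show the arithmetic.

```python
def removal_rate(d, I):
    """Return the rate of removal, gamma * I, where gamma = 1 / d.

    d: duration of infection (must be positive)
    I: current number of infected individuals
    """
    if d <= 0:
        raise ValueError("duration of infection must be positive")
    gamma = 1.0 / d
    return gamma * I


# Hypothetical durations for illustration only.
short_illness = removal_rate(d=5, I=100)    # gamma = 1/5
long_illness = removal_rate(d=10, I=100)    # gamma = 1/10

if __name__ == "__main__":
    print(f"removal rate with d=5:  {short_illness:.1f} per unit time")
    print(f"removal rate with d=10: {long_illness:.1f} per unit time")
```

With the same number of infected individuals, doubling the duration of infection halves the rate of removal.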

These equations are added to our existing visual representation found in Figure 13.14 and reflected in Figure 13.15.

$$\text{Rate}_{\text{infection}}(t) = \beta \frac{S(t)\,I(t)}{N} \qquad \text{Rate}_{\text{removal}}(t) = \gamma I(t)$$

FIGURE 13.15 | Rates of infection and removal. Here we see our previous image of the susceptible, infected, and removed compartments, now with equations that represent the rates of infection and removal.

In Figure 13.15, you can see how the rates of infection and removal fit into the SIR model. Now that we have all of the essential components of the SIR model, we can form our system of differential equations to model the disease. Our change in susceptible, infected, and removed individuals over time will be written as $\frac{dS}{dt}$, $\frac{dI}{dt}$, and $\frac{dR}{dt}$, respectively, since our independent variable is time t. Let’s go through each compartment of individuals and assemble the corresponding equations, using the rates we determined above.

As we mentioned before, the susceptible population loses individuals when they become infected with the disease. Thus, $\frac{dS}{dt}$ must include the negative of the rate of infection, $-\beta \frac{S(t)\,I(t)}{N}$. In this example, there are no additional terms: we are assuming that an individual cannot become susceptible for a second time in this outbreak, so infected individuals cannot move back into the susceptible compartment. Therefore,

$$\frac{dS}{dt} = -\beta \frac{S(t)\,I(t)}{N}$$

The infected population gains individuals when the susceptible population becomes infected, and loses individuals when the infected individuals recover or die. Thus, $\frac{dI}{dt}$ includes the rate of infection. Movement out of the infected compartment is expressed by the term -γ I(t), which is the negative rate of removal Rate_removal(t). Thus,

$$\frac{dI}{dt} = \beta \frac{S(t)\,I(t)}{N} - \gamma I(t)$$

Finally, the removed population gains individuals when infected individuals either recover or die. So, $\frac{dR}{dt}$ accounts for movement into the removed population, expressed by the term γ I(t) (the rate of removal Rate_removal(t)). Our initial assumption is that once they have recovered, individuals cannot enter any other compartment, so there is no need to account for movement out of the removed population. Therefore,

$$\frac{dR}{dt} = \gamma I(t)$$

Now that we have the three equations that can estimate the change in susceptible, infected, and removed populations, we can solve these equations to model the course of an outbreak. The computational techniques needed to calculate the numerical solutions to these equations are beyond the scope of this textbook, but we want you to see how they are graphically represented.
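For readers curious about how such curves can be produced, here is a minimal Python sketch. It uses the forward Euler method, a simple stepping technique that is not described in this chapter and is swapped in here purely for illustration. The function name, the step size dt, and the parameter values are all assumptions.

```python
def simulate_sir(beta, gamma, N, I0=1, days=100, dt=0.1):
    """Approximate the SIR equations with forward Euler steps of size dt.

    Returns a list of (t, S, I, R) tuples, one per time step.
    """
    S, I, R = float(N - I0), float(I0), 0.0
    history = [(0.0, S, I, R)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * S * I / N * dt   # flow from S to I in this step
        new_removals = gamma * I * dt            # flow from I to R in this step
        S -= new_infections
        I += new_infections - new_removals
        R += new_removals
        history.append((step * dt, S, I, R))
    return history


# Hypothetical parameters for illustration only.
history = simulate_sir(beta=0.5, gamma=0.1, N=1000)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])

if __name__ == "__main__":
    print(f"I(t) peaks near day {peak_time:.1f} with about {peak_infected:.0f} infected")
```

Each step moves a small number of individuals from S to I and from I to R, so the three compartments always add up to N. Smaller values of dt give a closer approximation at the cost of more steps.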

The SIR Model: Visualizing The Course Of An Outbreak

These three equations are added in Figure 13.16.

Let’s take a look at the outputs of the SIR equations and how these can be visualized on a graph. Although exact values can vary from model to model, the overall shape of the SIR model is consistent and generalizable. As you can see, almost all of the population is in the Susceptible compartment at day 0 of the outbreak, and as the outbreak runs its course, nearly the full population in this example eventually moves to the Removed compartment (Figure 13.17).

FIGURE 13.16 | Differential equations for movement between populations. Here we see our previous image of the susceptible, infected, and removed compartments with equations that represent the rates of infection and removal, now with equations that represent the movement between the populations of each of the compartments within the SIR model.

In a population where most individuals begin as susceptible to the disease, S(t) will be at its maximum value, or peak position on the graph, when time is 0. S(t) will also have a negative slope for the entirety of the graph, meaning that the number of susceptible people is continually decreasing, until it reaches a minimum value. This is due to our model’s assumption of a fixed population where people cannot be reinfected with the pathogen.

At time 0 of an outbreak, we typically assume the number of infected individuals I(t) equals one, corresponding to the outbreak’s patient zero. (In reality, there may be more than one infected individual at the start, but one is often used for simplicity.) As individuals become infected, the number of infected individuals I(t) increases as the number of susceptible individuals S(t) decreases; early in the outbreak, when few individuals have been removed, the two change at nearly the same rate. Accordingly, we see the greatest positive slope of infected individuals occur at about the same time as the greatest negative slope of susceptible individuals. This is because the movement into and out of the infected population both depend on I(t), the size of the infected population.

There are also 0 individuals in the removed population at time 0. The R(t) curve rises slowly at first because individuals must first be infected before they can be removed from the infected population, and few individuals are infected early on. Since the rate of change of R(t) is proportional to I(t), the removed population increases most rapidly around the time the infected population reaches its peak. In this example, R(t) levels off near the total number of individuals in the population, since by the end of our time period nearly everyone has either recovered or died from the illness according to our model.
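The timing relationships described above can be read off the differential equations directly. The short derivation below is a sketch that assumes the three SIR equations as written in this chapter; it is not taken from the original text.

```latex
\frac{dI}{dt} = \beta \frac{S(t)\,I(t)}{N} - \gamma I(t)
             = I(t)\left(\beta \frac{S(t)}{N} - \gamma\right)
% While I(t) > 0, the infected curve peaks (dI/dt = 0) when
S(t) = \frac{\gamma N}{\beta}
% and, because dR/dt = \gamma I(t), R(t) rises fastest at that same moment.
```

In words: I(t) grows while S(t) is above γN/ß, peaks when S(t) falls to that value, and declines afterward.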

FIGURE 13.17 | Graphic representation of SIR populations. This graphic representation shows the movement between each of the compartments of the SIR model. The number of people in each of the SIR groups changes as the outbreak progresses; this model predicts that most of the population will move from the susceptible group to the removed group within around 65 days.

SIR MODEL

The independent variables, dependent variables, parameters, and functions of the SIR Model. Let’s put together what we learned about the different components of mathematical models in the context of the SIR Model.

Independent Variables

• Time (t): Represents the progression through time as the model simulates the dynamics of the disease spread.

Dependent Variables

• Susceptible (S): The number of individuals susceptible to the infection at any given time.

• Infected (I): The number of individuals currently infected and capable of transmitting the disease.

• Removed (R): The number of individuals who have either died from the disease or have recovered from the disease and are assumed to be immune.

Parameters

• Beta (ß): The transmission rate, the product of the average contact rate (k) and the probability of transmitting disease between an infected and a susceptible individual per contact (b).

• Gamma (γ): The removal rate, denoting the fraction of infected individuals who recover or die per unit of time. The inverse of gamma (1/γ) indicates the average duration an individual remains infectious.

Functions (Differential Equations)

• Rate of change in the susceptible group: the rate of decrease in the susceptible population due to becoming infected, given by $\frac{dS}{dt} = -\beta \frac{S(t)\,I(t)}{N}$

• Rate of change in the infected group: the rate of change in the infected population, accounting for new infections and removals, given by $\frac{dI}{dt} = \beta \frac{S(t)\,I(t)}{N} - \gamma I(t)$

• Rate of change of the removed group: the rate of increase in the removed population as infected individuals recover or die, given by $\frac{dR}{dt} = \gamma I(t)$

• Rate of Infection: This is the negative of the rate of change of the susceptible group and is given by $\text{Rate}_{\text{infection}}(t) = \beta \frac{S(t)\,I(t)}{N}$

• Rate of Removal: This is the same as the rate of change of the removed group and is given by $\text{Rate}_{\text{removal}}(t) = \gamma I(t)$
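As a quick consistency check on the three differential equations (a worked sum added for illustration, not a derivation taken from the chapter), adding them shows that the total population stays fixed, which matches the model’s fixed-population assumption:

```latex
\begin{aligned}
\frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt}
  &= -\beta \frac{S(t)\,I(t)}{N}
     + \left(\beta \frac{S(t)\,I(t)}{N} - \gamma I(t)\right)
     + \gamma I(t) \\
  &= 0
\end{aligned}
```

Because the rates of change sum to zero, S(t) + I(t) + R(t) stays equal to N at every time t: individuals only move between compartments.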

Infection Rates And The Reproduction Number

As you might expect, because the SIR model describes how an infected individual spreads the disease to susceptible people, there is an important relationship between the SIR model and the reproduction number R0. Recall from Figure 2.8 in Chapter 2: Epidemiology that R0 is a constant that represents the average number of people an infectious individual will infect, showing us how infectious a disease is at the onset of an outbreak with a fully susceptible population. Let’s take a look at how R0 is determined by three quantities we have seen before in this chapter: b, k, and d, where b is the infection’s probability of transmission, k is the average rate of contact between the susceptible and infected individuals, and d is the duration of infection.

We have defined a numerical multiplier ß (infection rate) as the product of k and b (ß = kb). To find the value of R0, we have to multiply the duration of infection, d, by our numerical multiplier ß, since the longer a patient is infected, the greater their probability of transmitting the infection. This results in the following equation:

$$R_0 = bkd = \beta d$$

R0 can also be expressed in terms of γ, the removal rate parameter, since d and γ are inversely related:

$$R_0 = \frac{\beta}{\gamma}$$
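The two ways of writing R0 can be checked against each other with a short Python sketch. The function names and the values of b, k, and d are hypothetical and chosen only to illustrate the arithmetic.

```python
def r0_from_components(b, k, d):
    """R0 = b * k * d (transmission probability x contact rate x duration)."""
    return b * k * d


def r0_from_rates(beta, gamma):
    """R0 = beta / gamma, using beta = k * b and gamma = 1 / d."""
    return beta / gamma


# Hypothetical values for illustration only.
b, k, d = 0.05, 10, 5
beta = k * b       # infection rate multiplier
gamma = 1 / d      # removal rate parameter

r0_a = r0_from_components(b, k, d)
r0_b = r0_from_rates(beta, gamma)

if __name__ == "__main__":
    print(f"R0 from b, k, d:       {r0_a:.2f}")
    print(f"R0 from beta and gamma: {r0_b:.2f}")
```

Both routes give the same value (up to floating-point rounding), because dividing by γ is the same as multiplying by d.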

R0 is useful in epidemiology because it is linked to movement between SIR model compartments, which can help us understand whether an outbreak will take off when a pathogen is introduced into a population. In short, when R0 > 1, the rate of change of the infected compartment is positive, and the pathogen is expected to spread through the population. Let’s walk through the math behind this concept.

Connecting the reproduction number to the SIR model

For an outbreak to start, the rate of change of individuals in the infected compartment must be positive. In other words:

$$\frac{dI}{dt} > 0$$

In the previous section, we defined

$$\frac{dI}{dt} = \beta \frac{S(t)\,I(t)}{N} - \gamma I(t)$$

For this model, at the very beginning of an outbreak, we assume that only one individual is infected. Therefore S(t), or the number of susceptible individuals in a population, can be rounded to the population size, N. Thus $\beta \frac{S(t)\,I(t)}{N}$ becomes $\beta \frac{N\,I(t)}{N}$, or $\beta I(t)$.

Therefore, we can substitute as follows:

$$\frac{dI}{dt} = \beta I(t) - \gamma I(t)$$

Our expression now describes an increase in the infected population in terms of SIR model variables:

$$\beta I(t) - \gamma I(t) > 0$$

Adding γI(t) to both sides yields

$$\beta I(t) > \gamma I(t)$$

Dividing both sides by γI(t), which is positive, yields

$$\frac{\beta}{\gamma} > 1$$

We now have $\frac{\beta}{\gamma}$ on the left side of the inequality, which we previously defined as R0. Since we are considering the beginning of an outbreak, we can rewrite the inequality in terms of R0 such that

$$R_0 > 1$$

For the infected population to grow and for an outbreak to begin ($\frac{dI}{dt} > 0$), it therefore must be true that R0 > 1.

As you can see, R0 is a useful and mathematically simple parameter that can be used to understand how individuals move between the key SIR model populations in outbreaks.
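The threshold at R0 = 1 can also be checked numerically. The sketch below evaluates the rate of change of I(t) at the start of an outbreak, using the approximation that S(t) is equal to N. The function name and the parameter pairs are hypothetical.

```python
def initial_growth(beta, gamma, I0=1):
    """dI/dt at the start of an outbreak, using the approximation S(t) = N."""
    return beta * I0 - gamma * I0


# Hypothetical (beta, gamma) pairs: above, at, and below the threshold.
scenarios = [(0.5, 0.2), (0.2, 0.2), (0.1, 0.2)]

# Each entry: (beta, gamma, R0, dI/dt at the start of the outbreak)
results = [
    (beta, gamma, beta / gamma, initial_growth(beta, gamma))
    for beta, gamma in scenarios
]

if __name__ == "__main__":
    for beta, gamma, r0, growth in results:
        trend = "grows" if growth > 0 else "does not grow"
        print(f"beta={beta}, gamma={gamma}: R0={r0:.2f}, infected population {trend}")
```

In every scenario, the infected population grows at the start exactly when R0 is greater than 1.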

Using Models To Examine The Impact Of Mitigation Strategies

For diseases like COVID-19 and many other common and highly transmissible diseases, protective measures such as personal protective equipment (PPE), general hygiene practices, and social distancing, among others, can all lower infection rates and the corresponding reproduction number. On our graphical representation of the SIR model, decreased infection rates would result in a decrease in the slope of I(t). This is what it means to flatten the curve: the infected population rises more slowly and reaches a lower peak. You may recall this concept from Chapter 2: Epidemiology, which describes the aim of trying to avoid a steep peak in prevalence so that hospitals and other resources are not overwhelmed by the outbreak.

Using the equations we have discussed for this model, we can explore all sorts of questions related to outbreak mitigation, such as how effective mask-wearing is for reducing transmission.

We can use the SIR model to predict the effects of different outbreak mitigation strategies. For example, we know that masks decrease a disease’s transmissibility, so wearing masks can considerably reduce infection rates. We can also consider the transmissibility in a range of scenarios. For COVID-19, we understand the transmissibility to be highest when no one is wearing a mask, reduced when the infected individual is wearing a mask, and lowest when all individuals are protected.
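A mitigation scenario can be explored by running the SIR equations with two different values of ß and comparing the peaks. The Python sketch below uses the forward Euler method (a simple stepping technique not covered in this chapter). The function name and both ß values are hypothetical; in particular, the lower value is not a measured effect of mask-wearing.

```python
def peak_infected(beta, gamma, N, I0=1, days=300, dt=0.1):
    """Return (peak size, time of peak) of I(t) from a forward Euler SIR run."""
    S, I = float(N - I0), float(I0)
    peak, peak_time = I, 0.0
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * S * I / N * dt   # flow from S to I in this step
        new_removals = gamma * I * dt            # flow from I to R in this step
        S -= new_infections
        I += new_infections - new_removals
        if I > peak:
            peak, peak_time = I, step * dt
    return peak, peak_time


# Hypothetical scenarios: the lower beta stands in for a mitigation measure.
no_mitigation = peak_infected(beta=0.5, gamma=0.1, N=1000)
with_mitigation = peak_infected(beta=0.3, gamma=0.1, N=1000)

if __name__ == "__main__":
    for label, (peak, when) in [("no mitigation", no_mitigation),
                                ("with mitigation", with_mitigation)]:
        print(f"{label}: peak of about {peak:.0f} infected near day {when:.0f}")
```

With the lower ß, the peak is both smaller and later, which is the flattened curve described in the text.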


Other Compartmental Models

While the SIR model is a good choice for modeling many infectious diseases, various compartmental models are at our disposal. Depending on the characteristics of the pathogen, it might sometimes make sense to choose a slightly different model. Here, we’ll introduce a popular alternative: the SIS model.

Imagine an infectious disease that you can contract again and again, receiving practically no additional protection each time you recover. The common cold is an example of such a disease. In this case, the removed state (R in SIR) doesn’t apply since recovered patients are once again susceptible to reinfection. We might instead choose the Susceptible-Infectious-Susceptible (SIS) model for a more accurate model (Figure 13.18). Instead of modeling a rate of change from the infectious compartment to the removed compartment, the SIS system models the rate of change from the infectious compartment back into the susceptible compartment, which results in changes to the model’s behavior.
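The SIS equations are not written out in this chapter. The Python sketch below assumes the standard form, in which the removal term γI(t) flows back into the susceptible compartment instead of into a removed compartment, and steps it forward with the forward Euler method. The function name and parameter values are hypothetical.

```python
def simulate_sis(beta, gamma, N, I0=1, days=400, dt=0.1):
    """Forward Euler run of an SIS model; returns the final (S, I).

    Assumed equations (standard SIS form, not given in the chapter):
        dS/dt = -beta * S * I / N + gamma * I
        dI/dt =  beta * S * I / N - gamma * I
    """
    S, I = float(N - I0), float(I0)
    for _ in range(int(round(days / dt))):
        new_infections = beta * S * I / N * dt   # flow from S to I
        recoveries = gamma * I * dt              # flow from I back to S
        S += recoveries - new_infections
        I += new_infections - recoveries
    return S, I


# Hypothetical parameters for illustration only.
beta, gamma, N = 0.5, 0.1, 1000
S_end, I_end = simulate_sis(beta, gamma, N)

# Long-run infected count obtained by setting dI/dt = 0 with I > 0.
expected_I = N * (1 - gamma / beta)

if __name__ == "__main__":
    print(f"infected after 400 days: {I_end:.1f} (steady state: {expected_I:.1f})")
```

Under these assumptions, when ß is greater than γ the infected population settles at a steady level set by the ratio of γ to ß, rather than dying out as it does in the SIR model.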

As you can see, there are many different types of compartmentalized mathematical models. While the equations for each differ, all of them require high-quality, pertinent data to be useful, and they describe some kind of movement between compartments. This means that the core principles that you learned here – identification of key variables, exponential growth, derivatives, and more – will apply to each new compartmentalized model you encounter.

Of course, there are also many more models with varying degrees of complexity that incorporate other features such as population behavior, public health interventions, and treatments, which can all have an impact on an outbreak as it unfolds. The power and scope of these models are immense; whether we employ SIS models, SIR models, basic linear equations, or anything in between, we hope you’re able to better understand the mathematical bases behind these crucial tools and their use in our everyday lives.

FIGURE 13.18 | Movement between the Susceptible and Infected compartments of an SIS model. A) This model describes a continuous loop, as there is no removed compartment. B) This model implies that there will eventually be a state where there is always a fraction of the population that is infected and the remainder are susceptible, with the fraction determined by the ratio of γ to ß.

Stop to Think

1. What does the SIR model depict?

2. What kinds of things can we learn about an outbreak when we apply the SIR model?

3. How does the SIS model differ from the SIR model?

4. Name three applications of mathematical modeling in outbreak science.


Stop to Think Answers

13.1: Introduction to Mathematical Modeling

1. Variables are components or characteristics of a system that change depending on the circumstances. Mathematical models are used to study the relationship between variables and to predict outcomes.

2. A mathematical model is a set of equations that quantifies and analyzes the relationships and behaviors within a real-world system.

3. Models make assumptions to make complex systems more tractable, such as a constant population size.

13.2: Types of Mathematical Models

1. Answers will vary but should include:

a. Select the most relevant information of the system: the critical components of the system that must be addressed by the model.

b. Establish assumptions and understand how they can impact your results.

c. Ensure that the model can be feasibly run.

d. Determine what information about the system we are losing with the model.

e. Check that the model is factually correct.

2. Stochastic events are characterized by their inherent randomness, exhibiting variability and unpredictability in their outcomes.

3. Answers for each question:

a. Deterministic. The outcome, earning one dollar, is consistently the same and directly related to the action of selling a chocolate bar without any randomness involved.

b. Stochastic. Each flip is random and has an unpredictable outcome, making the total count of heads uncertain until you actually do all the flips.

c. Stochastic. The movement of the coloring molecules is random and unpredictable, influenced by their random collisions with water molecules.

d. Deterministic. There is a direct, predictable relationship between the hours worked and the points earned, with no element of chance involved.

13.3: A Primer on Differential Equations

1. A differential equation is a mathematical equation that describes how a quantity in a dynamic system changes over time, taking into account the processes that increase or decrease that quantity.

2. Answers may vary, but could include: weather, bacterial growth, spread of an outbreak, traffic flow, ice cream melting.

13.4: The SIR Model

1. The SIR model depicts a population divided into three compartments—Susceptible, Infected, and Removed—and shows their dynamics amidst an infectious disease outbreak over time.

2. During an outbreak, we can track the movement between susceptible, infected, and removed compartments and the rates at which these changes in compartments happen. We can use the resulting model to predict the approximate peak of an outbreak, the rate of spread, and the number of people who will likely be infected.

3. The SIS model differs from the SIR model in that individuals who recover from the infection in the SIS model become susceptible again, whereas in the SIR model, the assumption is that individuals gain immunity after recovery, and do not become susceptible again.

4. Answers may vary; here are some possible answers:

a. Predicting how the number of infected people will rise and/or fall over time.

b. Understanding how contact rate affects the spread of infection.

c. Predicting how mitigation measures will affect the course of an outbreak.

Glossary

Assumptions: Within the context of a mathematical model, the conditions or premises accepted as true to build a manageable and predictive representation of a complex real-world scenario.

Compartmentalized Models: Models that divide the population into smaller groups, called compartments, where each group represents a unique state or condition in the system. In epidemiology, an example of a compartment would be a disease status, such as “infected.”

Dependent Variable: A variable being tested and measured in an experiment, as it is expected to change in response to variations in the independent variable. In the context of mathematical modeling, a dependent variable is a variable whose value is determined by the values of one or more independent variables within the model.

Derivative: A mathematical measure of how a quantity changes in response to a change in another quantity; used in differential equations.

Deterministic: A model where knowing the value of an input produces an output with certainty.

Differential Equations: A mathematical statement that describes how a system changes using derivatives, which measure the rate of change (a “differential”) of a variable.

Distribution: In the context of a mathematical model, the set of all possible outcomes or results that the model could produce, along with the probabilities or frequencies associated with each outcome.

Expected Value: A measure that represents the average outcome of a random variable over a large number of trials, providing a central or typical value based on the probabilities of all possible outcomes.

Exponential Growth: A form of growth where the quantity increases at a rate proportional to its current value; as the quantity grows, the rate of growth becomes faster.

Flatten the Curve: The aim of trying to avoid a steep peak in prevalence so that hospitals and other resources are not overwhelmed by the outbreak.

Function: A mathematical description of a relationship between variables. For every input value, there is a corresponding unique output value, determined by the function’s rule or expression.

Infected: Within the context of an SIR model, individuals who are infected with a disease-causing pathogen.

Independent Variable: A variable manipulated or controlled in an experiment to test its effects on dependent variables. In the context of mathematical modeling, an independent variable is a variable that is manipulated or changed to observe its effect on dependent variables within the model.

Mathematical Modeling: Process of representing real-world phenomena in terms of mathematical equations that can analyze situations in order to provide insight and predict outcomes.

Parameters: In the context of mathematical modeling, a constant numerical value that defines and influences the behavior and outcomes of the model’s equations.

Random Event: In the context of mathematical modeling, an event whose outcome cannot be predicted with absolute certainty and that varies each time the model is run. These random events incorporate unpredictability into the model, reflecting the real-world variability and uncertainty of disease transmission among a population.

Rate of Infection: Within the context of an SIR model, the transition of the susceptible population into the infected population.

State Variables: In the context of a compartmentalized model, the variables that represent the quantity or status of a specific group or compartment at any given time.

Susceptible: Within the context of an SIR model, individuals who could become infected.

Susceptible-Infected-Removed (SIR) Model: A mathematical model that divides a population into three compartments—Susceptible, Infected, and Removed—to study the dynamics of an infectious disease outbreak over time.

Susceptible-Infectious-Susceptible (SIS) Model: A mathematical model that divides a population into two compartments—Susceptible and Infected—and allows individuals to become reinfected after recovery, modeling diseases where immunity does not occur.

Slope: The steepness of a function on a graph.

Rate of Removal: Within the context of an SIR model, the transition of the infected population into the removed population.

Removed: Within the context of an SIR model, individuals who either recovered and gained immunity or died from an infectious disease.

Stochastic: Systems or processes that have a probability distribution that has at least one random element and therefore cannot be predicted precisely.

Variables: Components or characteristics of a system that change depending on the circumstances, affecting the outcomes of a model.