Pandemics and the SIR Model

Now that the H1N1 flu virus has been about a month, it's interesting to discuss the SIR mathematical model for pandemics.

Yes, S stand for the Susceptible population. Those who are not yet infected, but may become infected.
I stands for the Infected population ... those who have it and can transmit the disease.

>And some of them die, right?
They either die or recover. In any case, they're Removed from the infected population and can not transmit the disease.

The "standard" differential equations that descibe the progress of the disease goes like this:

[1]         dS/dt = - a S(t) I(t)
This describes an S-population which is always decreasing as more and more members become infected.
The rate at which they become infected depends upon the number of interactions between the S- and I-populations.

>And S-guy meets and I-guy and becomes infected, eh?
Yes. The number of interactions will depend upon the product S I. (For example, it'll be half as much if S decreases by half and will be twice as much if I doubles).

Further, the I-population will increase by 1 for every member of the S-population that becomes infected.
That implies that dI/dt should increase at the same rate as S decreases ... so dS/dt = a S(t) I(t).
However, if there were no members of the S-population we'd nevertheless expect the I-population to decrease at a rate propoetional to the S-population.
Combining the two notions we get:

[2]         dI/dt = a S(t) I(t) - b I(t)

Although [1] and [2] are sufficient to generate charts for S(t) and I(t), it's usual to add an equation for the R-population.
Note that it will increase by 1 for every member of the I-population that recovers or dies, hence:

[3]         dR/dt = b I(t)


  1. The total population, S + I + R is constant, since d/dt(S + I + R) = 0.
    (This also explains why [3] has that particular expression for dR/dt.)
  2. S(t) is decreasing so long as there are members of the S and I populations.
  3. R(t) is increasing so long as there are infected members (that haven't died or recovered).
  4. I(t) is increasing until S(t) decreases to a value less than b/a ... then I(t) will decrease.
  5. We will assume that, for pandemics, b/a < 1 !!
Here's a "typical" set of charts:

>Huh? S(t) starts out at a value of 1?
We can consider the numbers as fractions of the total population. Then S(0) = 1 just means that the entire population is susceptible.

>So it looks like I(t) has a maximum of about 9%.
Yes, but that'll depend upon your choices of parameters a and b. In the above example I chose A = 0.5 and B = 0.3.

>I assume that these differential equations are hard to solve.


Dividing [2] by [1] gives:

[4]         dI/dS = - 1 + k/S ... where k = b/a.
The solution is:

[5]         I(t) = I(0) - (S(t) - S(0)) + k log(S(t)/S(0))

If we assume that S(0) = 1 (meaning that 100% of the population is Susceptible), then we have:
[A]         I(t) = I(0) - (S(t) - 1) + k log(S(t))       ... where k = b/a < 1
That'll give a graph like this (for a = 0.5, b = 0.3 and I(0) = 0.0000001%):
It shows S(t) decreasing from 100% to 32% and I(t) reaching a maximum of 9.3% then decreasing to 0.

>I(0) is pretty small don't you think?
If there are 6 billion people on the planet and 6 are currently infected, then that'd give I(0) = 0.0000001%.

>Then you're saying that, eventually, 9.3% of the world population will become infected?
Did I mention that it'll depend upon your choices of parameters a and b?
Further, I(t) gives the number of people "currently" infected.
Some may have recovered or died ... and gone to the R-population.
More importantly, it'll depend upon how well the SIR mathematical description matches the actual evolution of the disease.

From [A], if we put I(0) ≈ 0 and S(t) = 1 - x with x ≈ 0 (so we're talking about the initial spreading of the disease), then we get:

[6]         I(t) ≈ x + k log(1-x)   or   I(t) ≈ (1-k)x +k x2/2
For our example above, k = b/a = 0.3/0.5 = 0.6.
The approximation to the differential equation [6] compares favourably with the actual solution [A]:

>For the start of the pandemic, right?
Right ... and we might use that to gauge the values of a and b which give the "best" fit to the actual data.
Of course, we just have a theoretical expression relating I to R ... not I to t.
To get I(t) vs t we need to solve [2], namely dI/dt = a S(t) I(t) - b I(t), using [A].

We could also try to solve [1] using [A]. That'd give:

[7]         dS/dt = - a S(t) { I(0) - (S(t) - 1) + k log(S(t)) }   ... where k = b/a.

Alas, I can't do that, so I just did a numerical integration of [1] and [2] (using a Runke-Kutta routine).

Here's a spreadsheet to play with. Click on the picture to download the spreadsheet.


  • The parameter a is a time scale. If the time variable t is replaced by θ = at, then the differential equations becomes:
    [1A]         dS/dθ = - S(t) I(t)
    [2A]         dI/dθ = S(t) I(t) - k I(t)
    [3A]         dR/dθ = k I(t)
  • Increasing a then makes the disease evolution speed up. Maxima are reached soooner. The disease terminates sooner.
  • Since the maximum of I(t) occurs when S(t) has fallen to the value k = b/a, changing k then varies when that occurs.
  • I can't find any "good" parameter values !! (YOU try.)
Maybe I'll play with the SEIR model ...