Emily Oster

Technical Notes

5 minute read

I’ll be back early this week with some more baby-and-kid related content, but I have spent a lot of the last week thinking about the virus, and wanted to share some thoughts out of the non-parenting side of my brain.

In my academic life a lot of my work overlaps with epidemiology and public health. I thought it might be useful here to talk (briefly) about why, in my view, it is so challenging to predict what is going to happen over the next few weeks.

As I see it, there are two main sources of uncertainty. First, how many people will get the virus. Second, what share of them will need to be hospitalized and use ventilators. The second one is really important because one of the very real fears in this epidemic is hospitals getting overwhelmed (as has happened in Italy). If this happens, people will die who wouldn’t have died if care was available. This is the logic behind the “flatten the curve” idea – if we can slow the flow into hospitals, that will ensure people can get the care they need to have the best chance.

So, how can we figure out the infection rates and hospitalization rates?

Predicting infection rates relies on epidemiological models, the simplest of which are “SIR” models (susceptible, infected, recovered) which use information on how infectious the virus is and length of illness to predict the path. There are many of these models floating around for coronavirus. The conclusions of such models are extremely sensitive to the inputs you put in them. In particular, they are really, really sensitive to the rate of spread.
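To make the sensitivity concrete, here is a minimal sketch of a discrete-time SIR model. Every number in it (the population, the recovery rate `gamma`, the three values of the spread rate `beta`) is an illustrative assumption, not an estimate for coronavirus.

```python
def simulate_sir(beta, gamma=0.1, population=1_000_000, initial_infected=10, days=120):
    """Discrete-time SIR model with daily steps; returns the daily infected counts.

    beta  = new infections caused per infected person per day (assumed value)
    gamma = share of infected people who recover each day (assumed value)
    """
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # infected
    r = 0.0                                   # recovered
    infected = [i]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected.append(i)
    return infected


# Three hypothetical spread rates, all other inputs held fixed.
for beta in (0.15, 0.25, 0.35):
    path = simulate_sir(beta)
    peak_day = max(range(len(path)), key=path.__getitem__)
    print(f"beta={beta:.2f}  peak infected={max(path):,.0f}  on day {peak_day}")
```

Running it with the three values of `beta` shows how far apart the infection paths end up when only that one input changes.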

This rate, in turn, is sensitive to how much interaction people have (this is the point of social distancing). In order to figure out what is going on, you really need to be able to evaluate the performance of the model in real time. This is basically impossible to do.

Why? Well, even in the best of circumstances it would be hard. If you could test everyone every day for the virus, you might be able to validate your models quickly. You’d need to be able to compare what the model predicts to what you actually see each day. You could then modify the parameters to fit what you see.
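As a sketch of what that validation loop could look like in the ideal case, the example below generates synthetic "daily testing" data from a known spread rate and then recovers that rate by picking the candidate whose predictions are closest to the observations. The data are made up, noise-free and fully observed, which is exactly the situation we do not have. The function names and parameter values are hypothetical.

```python
def simulate_infected(beta, gamma=0.1, population=1_000_000, initial_infected=10, days=30):
    """Daily infected counts from a simple discrete-time SIR model (assumed parameters)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    path = [i]
    for _ in range(days):
        new_infections = beta * s * i / population
        s -= new_infections
        i += new_infections - gamma * i
        path.append(i)
    return path


def fit_beta(observed, candidates):
    """Return the candidate spread rate whose predicted path best matches the observed one."""
    def squared_error(beta):
        predicted = simulate_infected(beta, days=len(observed) - 1)
        return sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return min(candidates, key=squared_error)


# Synthetic observations: pretend everyone is tested every day and the true rate is 0.27.
true_beta = 0.27
observed = simulate_infected(true_beta)

# Candidate spread rates from 0.10 to 0.40 in steps of 0.01.
candidates = [round(0.10 + 0.01 * k, 2) for k in range(31)]
print("best-fitting beta:", fit_beta(observed, candidates))
```

With real data the observations would be incomplete, delayed and noisy, so the fit would be far less clean than this toy version suggests.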

But we are not in the best of circumstances in the US. We have very limited testing even among sick people. Forget about healthy people. Test results take a long time to come back. We might be able to do better by validating the models against deaths, which are observed, but this introduces even more variables. Plus, infection and death are widely separated in time, meaning we’d be weeks behind in our model evaluation.

The best we can say from these models is:

  1. In many (possible) scenarios a huge, huge share of people will be infected very quickly. In others, things look more optimistic.
  2. If we lower the reproductive rate – through social distancing – the infection will grow more slowly.

Attempting to predict anything more specific given our current data seems to me impossible.

The second issue is predicting use of hospitals. Right now, about 20% of people who are known to be infected need to be hospitalized. The models which look at hospital usage (the NY Times published one) use a version of this assumption. These models typically assume that 20% of all infections will need to be hospitalized.

But this is wrong. For the reasons I talked about above, our lack of testing means we are missing a lot of infections. The actual share of ALL infections who will be hospitalized is lower than 20%. We could say that 20% of detected infections will be hospitalized (this matches the data) but then we also need to make a guess about the share of infections which will be detected.

People are working to incorporate this into their models – it is not difficult to do – but then you need to make an assumption about the share of infections detected. For which you need… testing.
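The adjustment itself is simple arithmetic. The sketch below assumes that every hospitalized case is also a detected case, so the share of all infections hospitalized is the 20% figure multiplied by the share of infections detected. The detection shares in the loop are hypothetical placeholders, since the true value is exactly what we do not know.

```python
def hospitalized_share_of_all_infections(hospitalized_share_of_detected, detected_share_of_infections):
    """Share of ALL infections that end up hospitalized.

    Assumes every hospitalized case is also a detected case, so that
    hospitalized / all = (hospitalized / detected) * (detected / all).
    """
    if not 0.0 <= hospitalized_share_of_detected <= 1.0:
        raise ValueError("hospitalized_share_of_detected must be between 0 and 1")
    if not 0.0 < detected_share_of_infections <= 1.0:
        raise ValueError("detected_share_of_infections must be in (0, 1]")
    return hospitalized_share_of_detected * detected_share_of_infections


# 20% of detected infections are hospitalized (from the text);
# the detection shares below are hypothetical guesses.
for detected_share in (1.0, 0.5, 0.25, 0.1):
    share = hospitalized_share_of_all_infections(0.20, detected_share)
    print(f"{detected_share:.0%} of infections detected -> {share:.1%} of all infections hospitalized")
```

The spread of results across plausible detection shares is the point: without testing data there is no way to pick among them.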

The Bottom Line

First: social distance. We don’t know how much it will matter, but it will matter some.

Beyond that: Unless we improve our testing and surveillance, it will be virtually impossible to predict where we are headed with the epidemic. Improving models is something we can do, but since we cannot validate them without data, it’s not clear what we get from improved models.

Testing data would be ideal, but even comprehensive hospitalization data or symptom tracking might help some. The places that have more of a handle on the epidemic without completely shutting down – South Korea, Israel – did it with surveillance.

I know some people have thrown up their hands about getting this done, and it may be true that we cannot do it in the next couple of weeks. BUT: investing in this is worthwhile both for possible later phases of this epidemic and for future ones.
