Blog | Data 2 the People

The folly of COVID-19 forecasts

Apr 21, 2020
5 min read

Updated: Apr 22, 2020

With the proliferation of COVID-19 forecasts, the average person and government official alike is likely feeling overwhelmed. Is everyone already infected? How many deaths will we really see from this? Will we need to shelter in place for two weeks, or for six months? What will happen to my business or to my children’s schools? Predictions of what lies ahead have, arguably, never been given more attention, nor have the stakes been higher.

As data scientists involved in modeling this pandemic, we know how forecasts help us navigate the unknown. Many of us look to them for a shred of certainty in the midst of so much that is out of our control. But forecasts are not as certain as we could hope. Even amongst experts, the accuracy of different forecasts is hotly debated. However, nearly all of these forecasts are missing something critical -- and lives depend on this missing consideration.

What’s missing is this: the goal of a COVID-19 forecast is not to be accurate. It’s to save lives.

What we need to realize is that in this dynamic, unprecedented situation, all forecasts have short shelf lives. We must think of forecasts not as inevitable futures, but as guideposts that inform us about the imperative to act and care for one another today.

With this new goal in mind, we’d like to offer 3 practical tips to help us make the most of forecasts.

Beware of any forecast that tells you an exact number. Look for a range to prepare your expectations. If you are making a forecast, share that range.
Consider the ethics of a forecast that is too high or too low. If you are seeing lots of different forecasts, make sure you pay attention to the high numbers too. Forecasters: do not discount high predictions.
Account for how people will react to the forecast. As soon as people’s actions change, many predicted scenarios become less likely. A wise forecaster sets expectations that the outcome will change based on how people respond.

Let’s dive into each of these.

First, know that all data related to COVID-19 are uncertain. The number of COVID-19 cases is subject to how many tests are available and how those tests are administered. Are tests given to only the very sick? To anyone who wants one? Even data on the number of deaths, while likely more accurate, are uncertain. Was everyone who died from the virus tested? Were there delays in the testing results? (See this article for more.) It is impossible to have certainty in a forecast given this shaky data foundation.

All statistical models, even with rock solid data, produce probabilities, not certainties. This uncertainty is typically communicated by statisticians with a “confidence interval” which estimates an upper and lower bound for the forecast.

The problem is that many COVID-19 forecasts being shared and actively used by the public and by policy makers are not using confidence intervals. Any time data is expressed as an exact number, it gives a false sense of certainty.

So projecting 300 hospital admissions in Washington, D.C., on May 14 (as the Penn CHIME model does) is not helpful. Saying “9 days till the projected peak” (as the IHME model cited by the White House does) is similarly irresponsible.

Screenshot of the Penn CHIME model on April 20th, where the forecasts show exact numbers.

I believe the forecasters sharing exact numbers mean well; perhaps they think it’s simpler for people to interpret an exact number rather than a range, or perhaps they aren’t able to estimate the uncertainty. Yet scientists are also educators, and as such, they should share the range of potential outcomes. If they do not, ask for that range. Compare multiple sources of models. The discrepancy between forecasts is not a scary thing; it’s a realistic reflection that there is a range of potential outcomes.

Recommendation for the forecaster: Present uncertainty with a range of numbers such as a “worst case” and a “best case,” which is clear language that the average person can understand.
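
As a minimal sketch of this recommendation, the snippet below turns a set of simulated outcomes into a "best case" and "worst case" range instead of a single number. The function name `forecast_range`, the 5th and 95th percentile cutoffs, and the simulated growth rates are all our own illustrative assumptions; they are not taken from the CHIME or IHME models.

```python
import random
import statistics


def forecast_range(simulated_outcomes, lower_pct=5, upper_pct=95):
    """Summarize simulated outcomes as a best case / median / worst case range.

    lower_pct and upper_pct are illustrative choices, not a standard.
    """
    # With n=100, statistics.quantiles returns the 99 cut points
    # for percentiles 1 through 99.
    cuts = statistics.quantiles(simulated_outcomes, n=100, method="inclusive")
    return {
        "best_case": cuts[lower_pct - 1],
        "median": statistics.median(simulated_outcomes),
        "worst_case": cuts[upper_pct - 1],
    }


# Hypothetical simulation: 100 admissions today, growing for 14 days at an
# uncertain daily rate between 2% and 10%. These numbers are made up.
rng = random.Random(42)
simulated = [100 * (1 + rng.uniform(0.02, 0.10)) ** 14 for _ in range(1000)]

summary = forecast_range(simulated)
print(
    f"Projected admissions: {summary['best_case']:.0f} (best case) "
    f"to {summary['worst_case']:.0f} (worst case), "
    f"median {summary['median']:.0f}"
)
```

Reporting the two ends alongside the median keeps the single "headline" number from standing alone.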

Second, make sure to pay attention to the high estimates. It may feel good to look at the low end of a range, but we argue there is a greater risk in not paying attention to the higher estimates. In addition, low estimates often are created assuming that we will take preventive measures.

We should ask ourselves:

If we act on a forecast that is too low, what will happen? We may not prioritize changing behavior, changing policy, or investing in research. We will not be adequately prepared to prevent fatalities. Associated economic suffering in the long term will be enormous as well. If a forecast is too low, significantly higher fatalities are likely.

If we act on a forecast that is too high, what will happen? Governments may over-prepare or misallocate resources. Given the limited preparation taken by most governments so far, over-preparing seems to be a lower risk. The economic impact of over-preparing could be greater in the short term (arguably not as high in the long term though, because the economic impact will last much longer if the virus spreads widely). We can compensate for a loss of a job. We can’t compensate for loss of life.

As an example of how this can go wrong, on March 2nd, scientists in the UK predicted that there could be an astonishing 500k deaths from COVID-19. Yet, the forecasts were said to be unlikely even by the scientists themselves. The result: the UK waited too long to act. Of course, this isn’t an occasion to cherry-pick high numbers. The model choices need to be reasonable. The point is that it is not a good idea to downplay high numbers just because they “seem too high.”

Recommendation for the forecaster: Ensure that the high numbers are not overlooked, using language such as “the number could be as high as X”.
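
The asymmetry between acting on a forecast that is too low and one that is too high can be sketched with a simple expected-cost calculation. Everything numeric here is hypothetical: the scenario probabilities, the case counts, and especially the assumption that being under-prepared is ten times as costly per case as being over-prepared. The point is only to show how such a weighting pushes the sensible plan toward the high end of the range.

```python
def preparation_cost(prepared_for, actual, shortfall_cost=10.0, surplus_cost=1.0):
    """Cost of preparing for `prepared_for` cases when `actual` cases occur.

    shortfall_cost and surplus_cost are hypothetical per-case weights.
    """
    if actual > prepared_for:
        # Under-prepared: each unmet case is weighted heavily.
        return (actual - prepared_for) * shortfall_cost
    # Over-prepared (or exactly right): unused capacity is weighted lightly.
    return (prepared_for - actual) * surplus_cost


def expected_cost(prepared_for, scenarios):
    """Probability-weighted cost over (probability, actual_cases) scenarios."""
    return sum(p * preparation_cost(prepared_for, actual) for p, actual in scenarios)


# Hypothetical scenarios: (probability, number of cases).
scenarios = [(0.25, 1000), (0.50, 3000), (0.25, 9000)]

for plan in (1000, 3000, 9000):
    print(f"Prepare for {plan} cases: expected cost {expected_cost(plan, scenarios):.0f}")
```

Under these made-up weights, preparing for the high scenario has the lowest expected cost even though that scenario is not the most likely one.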

Finally, let’s address the broader point of the impact of sharing a forecast on the outcome. The prophet’s dilemma warns that telling people a prediction can cause a reaction that either contradicts or fulfills that prediction--also known as self-defeating or self-fulfilling prophecies.

This became especially salient in 2016, when Nate Silver and most political forecasters predicted Hillary Clinton would win, and she didn’t. Research suggests that, because of widely shared forecasts of her win, many individuals may not have voted, thinking their vote was not needed, which may in turn have cost Clinton the election. In this case, the forecast became a self-defeating prophecy: it led to its own inaccuracy.

The same is true for COVID-19 forecasts. Any projection that is widely shared can potentially change the outcome, for better or for worse, depending on how people respond to it. When we see a forecast, we should make sure we look into how the forecast has accounted for the impact of behaviors changing. If no changes are accounted for, we should look at a forecast as “what will happen if we continue behaving the same way we have behaved before.”

Recommendation for the forecaster: Make it clear what assumptions you make about human behavior changing when producing the forecast.

Thus, the goal of producing or reading a forecast shouldn’t be to find the most accurate prediction, but to save lives. What ends up happening will be highly dependent on actions we take, and forecasts directly influence this.

As consumers of forecasts, we can hold our statisticians accountable. The role of the scientist and statistician during this pandemic is not simply to share numbers but to take responsibility for the outcome of sharing those numbers. At times like these, when we are in a crisis and faced with immense consequences of decisions, this is even more urgent. Together, we can make the high forecasts wrong.

Thanks for reading, and if there is something we missed here, please let us know in the comments!

#covid19 #forecasting

If your state were holding elections this week, would you vote in person?

Apr 8, 2020
4 min read

Sarah McGowan, Siyu Song

This week, Wisconsin held its primary election as planned despite burgeoning Covid-19 infections and a state-wide stay-at-home order, because Supreme Court rulings denied the postponement of in-person and absentee voting.

While we’ll have to wait for official data to see the exact impact of Wisconsin’s decision on turnout and public health, we can assume that the impact will be sizable.

But assumptions aren’t helpful when making tough decisions. We at Data 2 the People prefer to use data. And when the perfect data is not available, we go to the next best data that is available. This is why we were so excited to see the results of recent surveys from 1Q. The surveys aimed to assess public behavior and sentiment in the context of Covid-19, and included questions on Americans’ preferred election format in this unprecedented time.

We dug into this data and share our findings in the interest of preserving the health and safety of both our democracy and its constituents. Our hope is that public officials across the country can use this data to help inform equitable decisions that optimize for both public health and election integrity.

Some notes before we share the results:

This data reflects snapshots in time. Surveys reflect the opinions, knowledge, and trends at the time they are conducted. The Covid-19 situation is evolving rapidly, and sentiment expressed in these surveys may likewise change with time. The survey we primarily reference was conducted on April 8th, and we use data from a survey on March 22nd to look at trends.
Respondents are only a sample of the US population. The April 8th survey does a great job of census-balancing the demographics of the respondents. Still, the sample size is small so there is uncertainty in how the results generalize.
What people say is not necessarily what they will do. This is always the case, but it is especially important to keep in mind during a period that is stressful and emotional for many.
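
To give a rough sense of the sampling uncertainty mentioned in the notes above, here is a minimal sketch of the textbook margin of error for a survey proportion. The sample size of 500 is hypothetical (the actual number of respondents is not stated here), and the formula assumes a simple random sample, so it only approximates the uncertainty in a census-balanced survey.

```python
import math


def proportion_margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n);
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


# n = 500 is a hypothetical sample size, used only for illustration.
# p = 0.5 gives the widest (most conservative) margin for a given n.
moe = proportion_margin_of_error(0.5, 500)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
```

Because the margin shrinks only with the square root of the sample size, quadrupling the number of respondents halves it.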

Here is what the data shows:

1. Respondents expressed low likelihood of voting in person.

When asked their likelihood of voting in person at a voting location if their state were holding elections next week, only 27% of respondents said they would be “likely” to vote. The majority of respondents - 55% - indicated they would not be comfortable voting in person.

2. Respondents were generally happy with mail and online voting options.

57% of respondents expressed likelihood to vote by mail and 72% expressed likelihood to vote through a secure online link.

Notably, vote by mail would make voting more accessible for about a third of respondents. 34% of respondents are “likely” to vote by mail where they were “unlikely” to vote in person.
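
A share like this one can be computed directly from respondent-level answers. The sketch below is illustrative only: the field names `in_person` and `mail`, the answer labels, and the five sample records are hypothetical and do not reflect the actual 1Q survey format or data.

```python
def incremental_mail_share(responses):
    """Share of respondents unlikely to vote in person but likely to vote by mail."""
    incremental = sum(
        1
        for r in responses
        if r["in_person"] == "unlikely" and r["mail"] == "likely"
    )
    return incremental / len(responses)


# Hypothetical respondent records, for illustration only.
sample_responses = [
    {"in_person": "unlikely", "mail": "likely"},
    {"in_person": "likely", "mail": "likely"},
    {"in_person": "unlikely", "mail": "unlikely"},
    {"in_person": "unlikely", "mail": "likely"},
    {"in_person": "neutral", "mail": "likely"},
]

share = incremental_mail_share(sample_responses)
print(f"Incremental mail-in voters: {share:.0%}")
```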

Some states already have 100% vote by mail - Colorado, Hawaii, Oregon, Washington and Utah. We are unaware, however, of any state using or considering online voting methods in this primary season for the general population (some states have considered online methods for absentee or military voting), so the online option is purely hypothetical. (But interesting for the future!)

3. The interest in voting by mail appears consistent across demographics.

We did not see meaningful differences in likelihood to vote through different channels when looking across gender, race, education level, and income. This suggests that encouraging vote by mail options could be an equitable option during this time when in-person voting presents a health risk to many.

4. Interest in voting by mail does differ by political party, but this may be due to recent politicization of the topic.

Results from the most recent survey conducted on April 8th indicate that Republicans are less interested in voting by mail than Democrats, though the groups have similar rates of aversion to voting in person. This is a change from the survey on March 22nd, however, when Democrats had a slightly stronger aversion to voting by mail.

On April 8th, 22% of Republicans reported they were unlikely to vote by mail in the next week, while only 13% of Democrats reported the same. This difference was statistically significant: a gap this large would arise by chance only about 4.3% of the time if the two groups truly had the same rate. This trend was also reflected in the “incremental” mail-in voter population - the group unlikely to vote in person, but likely to vote by mail. 39% of Democrats are “incremental” mail-in voters, vs. 35% of Republicans.
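
For readers curious how a difference between two groups like this can be tested, below is a minimal sketch of a two-proportion z-test. We are not claiming this is the exact test behind the figure above. The counts are hypothetical, chosen only so the rates come out to 22% and 13%; because the real group sizes are not stated here, the p-value it prints will not match the reported 4.3%.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value) using the pooled standard error.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 44 of 200 Republicans (22%) and 26 of 200 Democrats (13%)
# unlikely to vote by mail. The group sizes of 200 are made up.
z, p_value = two_proportion_z_test(44, 200, 26, 200)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

The same rates with smaller groups would give a larger p-value, which is why the sample size matters as much as the percentages.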

What is particularly interesting is that results differ from a similar survey conducted on March 22nd which showed a dramatically less polarized environment. At that time, Democrats actually had a stronger aversion to voting by mail than Republicans. It’s possible that the partisan dialog and news coverage surrounding this topic in the past few weeks swayed responses. We’ll note that the March 22nd poll surveyed a different population (and a smaller population), but we believe the trend is worth noting.

We'd like to emphasize the universal benefit mail-in voting provides. Roughly a third of Democrats and Republicans report that they are unlikely to vote in person, but would vote by mail. As a reference point, only 23% of Americans cast a mail-in vote in 2016. (Only 17.7% were absentee ballots.) By supporting mail-in options during this pandemic, we will include large legions of voters on both sides of the aisle.

We hope this data will help states decide quickly what changes to make to voting, so they can begin the required operations and communication processes as soon as possible.

As always, our goal is to help current and future public leaders leverage the best data available to make the best possible decisions. For further details and analysis, please see the appendix file, here. If you have any questions or would like to know more about our analysis, please reach out to us at data2thepeople.org.

For further information on how you can use mobile surveys for rapid data collection, please reach out to 1Q.com. Many thanks to 1Q for sharing their survey data.

#covid19 #ElectionIntegrity #1Q #dataforgood

Welcome!

Apr 6, 2020
1 min read

Updated: Jun 1, 2020

We're excited you have come to our blog. We at D2P are here to help.

Our values:

Take action on climate change, transform our economy with new energy sources and business opportunities, fairly utilized.
Protect a woman’s right to choose and fight for justice for all.
Champion innovative public education, so our citizens can have the opportunity to learn and grow and be change-makers too.
Foster a government which enables us as a community to take care of those who lack basic human rights such as shelter, food, safety, healthcare, and connection.
Understand the existence and impact of racial, gender, class, and religious inequalities and implement policies to address and reverse those.
Support bi-partisan redistricting and election reform.

Principles:

Move fast and have fun. We are pragmatic and creative, focused on our goal of creating positive change.
Stand for Democracy. Community oriented, including many to achieve the goal.
Inclusive teams for inclusive data. We are diverse, optimistic, and empathetic.

Mission:

Above all, we look at the data, and we strive to understand what is really happening. We believe that data can surprise us and change how we think about the world and change what we do as a result. That search for truth is at the core of our mission.

We care about our country and the people in it, and we are committed to using our skills for the interest of the public good. This is our act of citizenship.

- The People of Data 2 the People

#data2thepeople #datascience #dataforgood