The central, unanswered COVID-19 policy question is whether governments around the world were justified in imposing, reimposing, and, in many cases, maintaining for months or years a massive and unprecedented suite of social distancing measures. Pre-COVID pandemic plans rated the evidence in support of nearly the whole range of non-pharmaceutical interventions (NPIs) as very low, and none of the respondents contests that. Furthermore, these plans warned that mandatory stay-at-home orders, business, school, and border closures, and other NPIs would be very costly. Our critics to some degree concede this point as well.
So we ask: Did the unprecedented imposition of NPIs achieve aims valuable enough to justify their profound costs? Adam Gaffney and Adam Kucharski argue that NPIs reduced rates of coronavirus transmission and then conclude from this—without evidence—that the NPIs must have reduced mortality during the pandemic. We will explain again why we disagree. (Serious illness matters too, of course, but relevant data on this score is even harder to obtain, and we lack evidence that NPIs reduced its incidence; long COVID, for its part, is concerning but poorly understood.)
For their part, Cailin O’Connor and James Owen Weatherall assert that “robust and healthy debate took place around the world,” while Jonathan White suggests that for some progressives there was a silver lining in the lockdowns. We disagree profoundly with both claims.
A common misconstrual of our position renders it as, “the answers are clear cut, and . . . the benefits of NPIs were not worth the costs,” in the words of O’Connor and Weatherall. We do not conclude or claim that NPIs accomplished nothing. Instead, we point to the lack of clear evidence that NPIs reduced overall mortality. To put the matter in conventional social science terminology, our findings, along with those from other studies, leave us unable to reject the null hypothesis, so the benefits of these measures remain unproven. Reviewing the available data, we simply do not know which NPIs, if any, made a meaningful difference in saving lives during the pandemic.
Gaffney observes that “quantitative study of the effects of individual non-pharmaceutical interventions using observational data is difficult due to enormous regional variation,” including in “the timing and magnitude of outbreaks and policy responses.” We agree. But the absence of evidence that NPIs lowered mortality is striking. One cannot attribute this lack of a relationship to a simple endogenous process in which harder-hit states imposed tougher restrictions. For one thing, across the fifty U.S. states, there was no relationship between the use of lockdown restrictions and successive virus waves. One January 2021 survey reported that “students are more likely to be attending school in person where Covid is spreading more rapidly.” Throughout the crisis, and regardless of disease severity at any given time, Democratic states always maintained more restrictions than Republican states.
Likewise, political scientists Michael Hartney and Leslie Finger find that “political factors best explain the degree to which [school] districts reopened in-person.” The factors most closely associated with school re-openings are higher electoral support for Donald Trump and weaker teachers’ unions. In comparison to these political factors, the effect of the “intensity” of the pandemic was “substantively trivial.”
Gaffney suggests that we “made a critical difference” by “keeping COVID-19 at the gates as long as possible.” We recognize that this is what the NPIs attempted to accomplish, but we lack evidence that the more stringent NPIs in Democratic states yielded lower rates of mortality. After all, it is hardly unusual for new medical or policy interventions to fail to achieve their theorized benefits. Our analysis and that of the Lancet study we cited in our essay show that COVID-19 deaths in Republican states diverged upward from Democratic states only after vaccines were made available. In addition to these aggregate analyses, an individual-level study of Democratic and Republican voters in Florida and Ohio, adjusted for age and other factors, finds evidence “of higher excess mortality for Republican voters compared with Democratic voters . . . after, but not before, COVID-19 vaccines were available to all adults.”
O’Connor and Weatherall confusingly run together pre- and post-vaccine eras when they observe that “in 2021, the five states with the highest age-adjusted fatality rates were consistently Republican-leaning states, whereas the five states with the lowest death rates were Democratic-leaning states.” But 2021 straddles the two eras. Vaccine hesitancy is clearly associated with greater COVID-19 mortality in the post-vaccine period. But no similar evidence links laxer NPIs to higher mortality in the pre-vaccine period.
Do we control for confounding variables “such as differences in demographics, compliance,” and other factors that “may make NPIs more or less effective,” as O’Connor and Weatherall ask? Yes: as we stated, “our analyses of both policy effects and vaccine effects control for differences in state populations, including age . . . obesity, urbanization, and insurance coverage,” the last being a proxy for the quality of state health care systems. The Lancet study also controls for demographic variation—including age and the prevalence of major comorbidities—and finds no relationship between pandemic outcomes and the use of more stringent NPIs. The failure to find benefits from NPIs is especially notable because polling and cell phone mobility data indicate that Democrats complied better with stay-at-home orders than Republicans did.
The COVID-19 policies of twenty-nine European countries varied even more than those in U.S. states, with Sweden declining to lock down at all, some countries imposing only one lockdown, and others issuing lockdowns three or more times. Mirroring our findings, a Lancet Regional Health study on disparities in Europe finds no evidence that European countries employing more stringent NPIs experienced lower mortality but clear evidence of lower mortality post-vaccine in countries where vaccine uptake was higher.
Kucharski raises the issue of timing and opines that Democratic states like New York and Massachusetts did not fare better than Florida and Texas because those Democratic states “got hit hard early on,” while the latter Republican pair “got hit later.” We considered timing: throw out the blue states hit earliest and hardest and the absence of evidence for the effectiveness of NPIs remains. Even more telling, some states that were not hit hard in the first wave imposed early and stringent lockdowns but did not fare systematically better than less stringent states over the course of the pandemic. California and Ohio closed early and hard but did not achieve significantly better outcomes than Florida and Texas.
The plethora of studies and conflicting claims about these issues can be confusing. Our book explains why we think excess mortality during the pandemic—as compared with pre-pandemic years—is a better measure than raw counts of COVID-19 deaths. For one thing, it accounts for other demographic factors affecting population health as well as differences in how countries counted deaths. The European study noted above uses a full ten-year average prior to the pandemic to establish a baseline for each European country and finds no relationship between excess mortality and the stringency of NPIs.
A 2024 study by Eran Bendavid and Chirag Patel in Science Advances draws on the largest, most credible datasets tracking the pandemic around the world and aims to give an authoritative assessment of how the imposition and lifting of restrictions affected outcomes within those jurisdictions. The authors subjected panel data for 181 countries covering 2020 through 2021 to a “multiverse” analysis, testing for relationships using nearly 100,000 different models. (These include fixed effects for countries and time in order to control for time-invariant differences between countries and for temporal trends shared among all countries.) Summarizing their findings in STAT, Bendavid and Patel write: “No matter how we approached these questions, the primary finding was lack of definitive patterns that could support claims about governmental policy impacts” on pandemic outcomes.
We understand that it can be difficult to accept the absence of good evidence for the effectiveness of costly measures that we all endured trying to curb the pandemic. Like it or not, the most rigorous studies available are consistent with “no effect” on COVID-19 mortality.
Cherry-picked cases, modeled predictions, and the assumption that lower transmission means lives saved are all apt to mislead. Kucharski cites estimates of how much NPIs curbed transmission and chides us for failing “to mention the experience of Vietnam, Australia, Japan, South Korea, or even Sweden,” which he says all “brought transmission down with a combination of behavioral change and control measures.” His collection of examples is odd. Sweden was deemed a “pariah state” for its failure to impose more stringent NPIs, but its outcome was the best in Europe according to the study noted above. Japan never imposed a lockdown or widespread business closures, nor did it make much use of testing and contact tracing, yet its outcomes were excellent by global standards.
Kucharski and Gaffney muddle the question of NPI effectiveness by arguing that a combination of NPIs and behavioral changes reduced viral transmission. Kucharski does this by running together Japan and Sweden with Australia and other countries, but Japan and Sweden had light restrictions and fared well. Gaffney does this by noting that “some of the hardest things we did were largely voluntary”—indeed, people were going to change their behavior in a pandemic with or without mandated NPIs.
Moreover, Gaffney says that NPIs “must . . . have at least incrementally reduced coronavirus transmission,” pointing out that “other respiratory viruses that spread like the novel coronavirus” such as influenza and RSV “basically disappeared.” However, while “other respiratory viruses” did in fact dwindle during the pandemic, coronavirus did not, so Gaffney’s “must have” claim for the effectiveness of NPIs is unproven. One alternative theory is that a novel virus may interfere with and reduce the transmission of other viruses, an explanation that undercuts the role of NPIs in reducing the spread of endemic disease. Another possibility is that the force of infection is much greater for a pandemic virus; NPIs may make a difference with respect to endemic disease but fail to have similar effects on much more contagious pandemic viruses. The point is that we just don’t know what to conclude here.
Kucharski agrees with our claim that “drastic control measures” to suppress coronavirus transmission had “a huge social and psychological toll.” Yet he also insists that these “unprecedented interventions” worked in Wuhan and elsewhere by reducing transmission. Which leaves us with the question: What would be the point of enduring the “huge social and psychological toll” of the lockdowns if mortality was not measurably reduced?
“Good debate requires good evidence,” says Kucharski, who cites evidence showing that people in lower income neighborhoods and populations died at higher rates from COVID-19. Granted, such neighborhoods have worse outcomes from virtually all health threats. Where is the data showing that NPIs reduced COVID-19 mortality in those neighborhoods? Gaffney guardedly suggests that “every sacrifice that broke a chain of transmission meant that someone, somewhere may have avoided a terrible death.” Perhaps, but was the overall death toll actually lowered? Our critics’ focus on transmission highlights the absence of clear evidence that NPIs reduced coronavirus deaths. If that evidence were available, you can be sure that respondents would cite it, and so would we. In its absence, how can anyone confidently claim that NPIs worked?
O’Connor and Weatherall doubt that the decisions “public health authorities made” were “bad choices at the time.” They even claim that “robust and healthy debate took place around the world” and that by the “summer of 2020, leaders around the world” made “more informed decisions, based on better data, more experience, and, most important, input from their constituents.” Nothing we know about the U.S. experience supports their picture of “robust and healthy debate,” even leaving aside (as we have) the widespread suppression of the possibility—now likelihood—that the virus spilled out of a lab.
Debate on some matters was more robust and evidence-driven in Europe than in the United States. David Zweig’s excellent book An Abundance of Caution (2025) shows that much was known very early in the pandemic about the relative safety of reopening schools but that evidence was either ignored or distorted by major media in the United States. By May 17, 2020, children had been back in school for three to four weeks in twenty-two European countries, and it was announced on an EU education ministers’ conference call that “there was no evidence of a significant increase in COVID infections or a negative impact from reopening schools.” As Zweig further notes, researchers from Dartmouth College and Brown University found that over seven months in 2020, “Ninety one percent of stories by U.S. major media outlets are negative in tone versus fifty four percent for non-U.S. major sources.” Citizens’ preferences were formed in an environment of extremely one-sided media coverage that stoked fears and played down the costs of restrictions.
Our critics’ essays demonstrate how persistent misconceptions about NPI effectiveness remain. A “robust and healthy” debate would have yielded better understanding.
Finally, we agree with White that, under COVID-19, “legislative debate about tradeoffs and choices was repeatedly bypassed.” He is likely correct, as well, that some of “those concerned about climate change and runaway capitalism” viewed the lockdowns as hopeful glimpses of better alternatives to “business-as-usual.” But a great many people suffered tremendously as a consequence of radical and untested policy interventions that have failed (so far) to produce measurable benefits. The moralized tribalism that so degraded elite institutions under COVID-19 prefigures a future that none of us should wish for.