This article was published on March 18, 2021

Beware of biased vaccine distribution algorithms

Here’s how to remedy them


Image by: CDC from Pexels

As the sheer logistical challenge of distributing vaccines to over 300 million Americans looms large, institutions are furiously developing algorithms to assist the rollout. The promise: that technology will enable us to allocate a limited number of doses efficiently — to the highest priority groups, and free of human error. 

Vaccine distribution algorithms have already been deployed in many places. In December 2020, researchers at Stanford University rolled out a system that ranked individuals in its 20,000-plus-strong community by priority. Around the same time, the U.S. Department of Health and Human Services revealed it had partnered with the data-analysis firm Palantir on "Tiberius," an algorithm meant to allocate doses efficiently to the hardest-hit areas. And at the state level, Arizona, South Carolina, Tennessee, and at least four others are building proprietary technologies to manage their vaccine rollouts.

But all this could go horribly wrong. 

Imagine an algorithm for vaccine distribution that under-supplies majority-black counties. Or one that disadvantages female patients compared to male ones. Or yet another that favors the top one percent.

These possibilities seem outlandish, even dystopian. But they could become reality in the space of the next few months.

At the heart of these frightening prospects is the issue of algorithmic bias. Rule-based systems, like the one powering Stanford's rollout, can deliver discriminatory outcomes if programmers fail to capture every relevant variable in painstaking detail. Machine learning systems, such as the one most likely behind Palantir's tool, may seem to escape this problem because they learn from data with minimal human input. But in their case, bias arises when the data they are fed underrepresents certain groups, whether by class, gender, or race. The algorithms then reproduce the bias encoded in that data.
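
To see how a rule-based system can go wrong in this way, consider a minimal, hypothetical sketch. It is not Stanford's actual code, which has not been published in full; the field names and weights below are invented for illustration. The score rewards age and a static job title but includes no variable for day-to-day patient exposure, so anyone whose risk lives in that missing variable, such as residents rotating through COVID wards, sinks to the bottom of the ranking.

```python
# Hypothetical sketch of a rule-based priority score with a missing variable.
# Field names, weights, and categories are illustrative only.

def priority_score(person: dict) -> float:
    """Rank vaccine priority from age and a static job category alone."""
    score = person["age"] * 0.5                   # older staff score higher
    if person["job_category"] == "attending":     # fixed, senior roles get a bonus
        score += 40
    elif person["job_category"] == "nurse":
        score += 30
    # Note what is absent: nothing here measures time actually spent with
    # COVID-19 patients, so rotating residents get no credit for exposure.
    return score

staff = [
    {"name": "attending, 58, mostly remote", "age": 58, "job_category": "attending"},
    {"name": "nurse, 45, COVID ward",        "age": 45, "job_category": "nurse"},
    {"name": "resident, 29, COVID ward",     "age": 29, "job_category": "resident"},
]

for person in sorted(staff, key=priority_score, reverse=True):
    print(f"{priority_score(person):5.1f}  {person['name']}")
# The resident with the highest real exposure ranks last, because the rule
# set never encoded the variable that mattered most.
```

The fix is conceptually simple (add the missing variable), but someone has to notice it is missing before the doses go out.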

Already, there are warning signs that vaccine distribution algorithms may allocate doses inequitably. Stanford's system, for instance, caused a debacle when it selected only seven of the institution's 1,300 medical residents to receive doses. Researchers traced the error to the programmers' failure to account for residents' actual exposure to the virus.

The university then had to issue a public apology and revise its rollout plans; some vials were even apportioned by hand. Moreover, a number of studies have exposed bias in machine learning systems, giving reason to doubt their reliability for decisions like these.

In 2018, a test by the American Civil Liberties Union (ACLU) revealed that Amazon's Rekognition facial recognition system falsely matched 28 members of Congress with mugshots of people who had been arrested. A disproportionate share of those falsely matched were people of color, including six members of the Congressional Black Caucus, among them the late civil rights leader Rep. John Lewis. There's a danger that algorithms deployed with the intention of distributing vaccines ethically will, ironically, distribute them unethically.

This problem points to a central tension. Technology is probably the best tool we have today for massive logistical challenges like delivering groceries, meals, and packages. At the same time, algorithms are not yet sophisticated enough to make ethical decisions, and their failures could do great harm, especially to vulnerable populations. So how can we resolve this tension, not just in the context of vaccine distribution, but more broadly?

One way is to remedy the technology. Rule-based systems can be extensively evaluated before deployment. Machine learning systems are harder to correct, though research suggests that improving the quality of training data does reduce bias in outcomes. Even then, building perfectly unbiased data sets might be a bit like a band-aid for a terminal illness. Such sets often contain millions or even billions of samples, and for this approach to succeed, humans would need to sort through and label them all appropriately, which is impractical at any real scale. The central limitation, as the pioneering computer scientist Judea Pearl has pointed out, persists: machine learning is statistical in nature, powered by reams of data rather than by causal reasoning.

But this limitation also hints at avenues for improvement. The psychologist Gary Marcus has proposed one: a "hybrid" approach that combines statistical learning with elements of deductive reasoning. Under this approach, programmers would explicitly teach algorithms about concepts like "race" and "gender" and encode rules that prevent discrepancies in outcomes based on those categories; the algorithms would then learn from reams of relevant data. Hybrid approaches would, in this way, offer the best of both worlds: the benefits of statistical methods alongside an explicit handle on issues like bias.
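
What that layering might look like in practice is easiest to see in a toy sketch. The code below is not Marcus's proposal or any deployed system; the function names, attributes, and thresholds are assumptions made for illustration. A statistical score (standing in for a trained model) produces a ranking, and an explicit, human-readable rule then checks the resulting allocation rates across a protected category, rejecting any plan whose disparity exceeds a set tolerance.

```python
# Toy sketch of a "hybrid" setup: a statistical score wrapped in an explicit,
# auditable fairness rule. All names, weights, and thresholds are illustrative.

from collections import defaultdict

def learned_risk_score(person: dict) -> float:
    # Stand-in for a model trained on historical data (the statistical part).
    return 0.6 * person["age_norm"] + 0.4 * person["exposure_norm"]

def allocation_rates(selected, population, attribute):
    """Share of each group (by the given attribute) that received a dose."""
    totals, chosen = defaultdict(int), defaultdict(int)
    for p in population:
        totals[p[attribute]] += 1
    for p in selected:
        chosen[p[attribute]] += 1
    return {group: chosen[group] / totals[group] for group in totals}

def allocate(population, doses, attribute="race", max_gap=0.10):
    """Rank by the learned score, then enforce a hand-written parity rule."""
    ranked = sorted(population, key=learned_risk_score, reverse=True)
    selected = ranked[:doses]
    rates = allocation_rates(selected, population, attribute)
    if max(rates.values()) - min(rates.values()) > max_gap:
        # The deductive part: an explicit rule a human can read and contest.
        raise ValueError(
            f"Plan rejected: allocation rates {rates} diverge by more than "
            f"{max_gap:.0%} across '{attribute}'"
        )
    return selected
```

Rejecting a plan outright is only one possible response; a real system might rebalance the ranking instead. The point is that the constraint is stated as a rule people can inspect and debate, rather than left implicit in whatever patterns the training data happens to contain.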

A third solution to the tension is to designate domains where algorithmic approaches should, for now, remain experimental rather than decisive. Perhaps it's acceptable to harness algorithms for well-defined tasks where the risk of failure is low (like running transportation networks, logistics platforms, and energy distribution systems), but not yet for those that involve messy ethical dilemmas, like distributing vaccines.

This point of view leaves open the possibility that some more advanced future system could be entrusted with such tasks, while protecting us from harm in the meantime.

As technology becomes increasingly potent and pervasive, it will serve us well to stay on guard against the harms algorithms may cause. We will need suitable cures in place, or we may eventually find ourselves letting the machines allocate them.
