IBM examines best methods to reduce bias around AI in healthcare

Biases in medical AI algorithms can have critical implications for minority patients, which is why IBM Research and Watson Health researchers have launched a new study to examine the best methods for addressing this problem.

The study, recently published in JAMA Network Open, examined the impact of various AI algorithms on a common condition affecting women after childbirth. The researchers analyzed postpartum depression and mental health service use among nearly 600,000 women covered by Medicaid, looking for the presence of algorithmic bias. They then introduced and assessed methods to combat those biases.

The researchers first looked for bias in the training data and then applied two debiasing methods, called reweighing and Prejudice Remover, to mitigate it. They then compared those two models against another debiasing method that removes race from the data entirely, called Fairness Through Unawareness (FTU).
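To make the reweighing idea concrete, here is a minimal sketch in plain Python of the standard Kamiran–Calders formulation of reweighing, which IBM also ships in its AI Fairness 360 toolkit. The function name and the toy data are illustrative, not taken from the study: each training sample with protected-group value g and label c gets the weight P(g)·P(c) / P(g, c), so combinations that are rarer than statistical independence would predict are up-weighted before model training.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Kamiran-Calders reweighing (illustrative sketch).

    Each sample with protected-group value g and label c gets
    w(g, c) = P(g) * P(c) / P(g, c), computed from empirical counts,
    so under-represented (group, label) combinations are up-weighted.
    """
    n = len(groups)
    n_g = Counter(groups)                 # counts per protected group
    n_c = Counter(labels)                 # counts per label
    n_gc = Counter(zip(groups, labels))   # joint (group, label) counts
    return [
        (n_g[g] * n_c[c]) / (n * n_gc[(g, c)])
        for g, c in zip(groups, labels)
    ]

# Toy example: group 0 is diagnosed (label 1) less often than group 1,
# so its positive examples receive a weight above 1.
groups = [0, 0, 1, 1]
labels = [1, 0, 1, 1]
print(reweighing_weights(groups, labels))  # [1.5, 0.5, 0.75, 0.75]
```

The resulting weights would typically be passed as per-sample weights to whatever classifier is trained downstream, which is what lets the model compensate for the imbalance without removing the race variable itself.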

Unsurprisingly, the study revealed that AI algorithms trained on biased data can produce unfair outcomes for patients in underrepresented groups. White women, who made up 55% of the cohort, were more likely to be diagnosed with postpartum depression and had higher rates of mental health service use.

The finding runs counter to the medical literature, which reports higher rates of PPD among minority women. That points to underlying disparities “in timely evaluation, screening, and symptom detection among Black women,” wrote first author Yoonyoung Park, ScD, of the Center for Computational Health, IBM Research, and colleagues.

The machine learning models likewise predicted unequal outcomes, favoring white women, while Black women, already at a disadvantage for diagnosis and treatment, had worse health status even when predicted to be at similarly high risk.

Disregarding race in the models was “inferior,” the study authors said, while the two debiasing methods would actually allocate more resources toward Black women compared with the baseline or FTU model. In other words, the debiasing methods would help produce fairer outcomes for patients.