‘The nature of the error matters’: Why considering AI’s failures is important, too

A multitude of recent studies and success stories suggest artificial intelligence is on its way to topping doctors in accurately diagnosing diseases from asthma to breast cancer—seemingly a step in the right direction. But does the hype surrounding AI’s victories eclipse its shortcomings?

In a March 6 editorial, a Wired reporter argued that might be the case, examining the fallout of a New York Times article that suggested AI could serve as a physician aid for diagnosing some conditions. The Times article drew from a Nature paper that found a certain AI software package was more accurate than physicians in diagnosing asthma and gastrointestinal disease, but it failed to mention where the technology fell short.

While the Times piece noted AI's triumph over five groups of physicians in identifying asthma and GI conditions, it didn't mention the gap between physicians' accuracy in diagnosing encephalitis (more than 95 percent) and the AI's accuracy on the same task (83.7 percent). Encephalitis, an inflammation of the brain, can be deadly.

“In other words, the human doctors beat the AI system when it came to correctly diagnosing a more serious illness,” the Wired article read.

The reporter warned that presenting this kind of data selectively can mislead, implying AI need not be held accountable for catching life-threatening diseases so long as it can identify a greater number of less-acute conditions. Gregory La Blanc, a distinguished teaching fellow at UC-Berkeley's Haas School of Business, told the writer it's important to consider all the facts.

“I am a huge optimist about the application of AI in medicine, but we need to look beyond accuracy measures as presented in this (Times) article,” La Blanc said. “The most accurate Ebola test ever invented is the one that always says no. The nature of the error matters.”
