Commentary: AI systems should be evaluated in clinical settings

Though artificial intelligence (AI)-based systems have proven their worth when it comes to diagnosing various conditions from medical images, there’s still a need to prove their capabilities in a clinical setting, according to a recent invited commentary piece published in JAMA.

In the piece, author Elaine O. Nsoesie, PhD, an assistant professor of global health at the University of Washington, said AI-based systems have proven useful and are widely accepted when it comes to image-based medical diagnostics. However, she also argued that findings in several studies show a need for “rigorous evaluation in clinical settings.”

Most deep-learning algorithms need large data sets for training, and those data sets usually require thousands of images and can be expensive, Nsoesie said. That cost alone may lead people developing AI diagnostic tools to rely only on whatever data is available for their initial results.

Without evaluations in a clinical setting, deficiencies in AI diagnostic tools may go undetected, because the data sets used for training are "carefully curated" and exclude imperfect data samples.

“Problems identified can be corrected prior to deployment. Findings from these evaluations should also be published in peer reviewed literature to monitor progress and allow for comparison of different systems,” Nsoesie wrote.

Evaluating AI diagnostic tools in clinical settings will also allow researchers and clinicians to assess the tools' potential effect on patient outcomes and healthcare decisions, she stated.

“Although multiple studies have demonstrated that AI can perform on par with clinical experts in disease diagnosis, most of these tools have not been evaluated in controlled clinical studies to assess their effect on health care decisions and patient outcomes,” Nsoesie concluded. “While AI tools have the potential to improve disease diagnosis and care, premature deployment can lead to increased strain on the health care system, undue stress to patients, and possibly death owing to misdiagnosis.”