Call made for more rigorous evaluation of AI aimed at guiding providers, patients

Only two of 34 representative studies evaluating the use of AI for real-world shared clinical decisionmaking from 2014 to 2020 included external validation of the models under consideration.

Further, most of the reviewed studies used just one algorithm for training, testing and internal validation, with only eight putting multiple algorithms through their paces.
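To make the single-algorithm critique concrete, here is a minimal sketch (not drawn from the review itself, and using synthetic data) of what "putting multiple algorithms through their paces" can look like: several candidate models evaluated under one shared cross-validation protocol, rather than a single model developed in isolation.

```python
# Illustrative sketch only: comparing several candidate algorithms
# under the same cross-validation protocol, using synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a clinical development cohort.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# The same 5-fold cross-validation is applied to every candidate,
# so their discrimination (AUC) can be compared on equal footing.
for name, model in candidates.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.2f}")
```

The point is not the particular algorithms chosen but the shared evaluation harness: every model sees the same folds and the same metric.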

So report industry researchers who conducted a systematic literature review to gauge the robustness of studies focused on the use of machine learning to assist in joint patient-provider decisions.

BMC Medical Informatics and Decision Making published the study Feb. 15.

Lisa Hess, PhD, and Alan Brnabic of Eli Lilly state, in so many words, that their review revealed an unruly assortment of methods, statistical software and validation strategies.

Commenting on the array of approaches as well as the relative thinness of many published studies, Hess and Brnabic call for clinical AI researchers to make sure “multiple modeling approaches are employed in the development of machine learning-based models for patient care, which requires the highest research standards to reliably support shared evidence-based decisionmaking.”

Going forward, experimental machine learning models should be sized up with both internal and external validation before the models are proposed for real-world patient care, the authors suggest.
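The distinction between internal and external validation can be sketched in a few lines. This is an illustrative example with synthetic data, not the authors' method: internal validation reuses the development cohort (here via cross-validation), while external validation tests the fitted model on a separately sourced cohort, standing in for data from another site or time period.

```python
# Illustrative sketch only: internal vs. external validation of a
# prediction model, using synthetic stand-in cohorts.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

# Synthetic "development" cohort (the data the model is built on).
X_dev, y_dev = make_classification(n_samples=1000, n_features=10, random_state=0)

# Internal validation: cross-validation within the development cohort.
model = LogisticRegression(max_iter=1000)
internal_auc = cross_val_score(model, X_dev, y_dev, cv=5, scoring="roc_auc").mean()

# External validation: a separately generated cohort with extra label
# noise (flip_y), standing in for an independent population.
X_ext, y_ext = make_classification(n_samples=500, n_features=10,
                                   flip_y=0.1, random_state=42)
model.fit(X_dev, y_dev)
external_auc = roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1])

print(f"internal AUC: {internal_auc:.2f}, external AUC: {external_auc:.2f}")
```

A model that looks strong internally can degrade on the external cohort, which is why the authors argue both checks belong in the evidence base before real-world use.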

“Few studies have yet to reach that bar of evidence to inform patient-provider decisionmaking,” Hess and Brnabic comment.

The study is available in full for free.