Electronic health records (EHRs) are more than just a place to store and share health information. In a recent study published in Circulation: Cardiovascular Quality and Outcomes, scientists from IBM and Sutter Health used artificial intelligence to develop methods for predicting heart failure from information stored in EHRs.
Although physicians document patient symptoms well before a diagnosis, predicting the onset of heart failure is difficult. In response, the scientists used artificial intelligence to create a model capable of detecting pre-diagnostic heart failure from EHR data. They examined how the model's predictions were affected by prediction window length, observation window length, the number of different data domains, the number of patient records in the training data set and the density of patient encounters.
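To make the two window terms concrete, here is a minimal sketch in Python. It assumes the common convention that the prediction window is the gap between the end of the observed data and the diagnosis date, and that the observation window is the stretch of record immediately before that gap. The article does not spell out the study's exact definitions, and the function names, dates and window lengths below are hypothetical illustrations, not the researchers' code.

```python
from datetime import date, timedelta

def observation_bounds(index_date, prediction_days, observation_days):
    """Return (start, end) of the observation window.

    Assumed convention: the observation window ends `prediction_days`
    before the index (diagnosis) date and spans `observation_days`.
    """
    obs_end = index_date - timedelta(days=prediction_days)
    obs_start = obs_end - timedelta(days=observation_days)
    return obs_start, obs_end

def encounters_in_window(encounter_dates, index_date, prediction_days, observation_days):
    """Keep only encounters dated inside the observation window (start inclusive, end exclusive)."""
    obs_start, obs_end = observation_bounds(index_date, prediction_days, observation_days)
    return [d for d in encounter_dates if obs_start <= d < obs_end]

# Hypothetical patient: diagnosis on 1 Jan 2015, 180-day prediction window,
# 365-day observation window.
index_date = date(2015, 1, 1)
encounters = [date(2013, 1, 10), date(2013, 9, 1), date(2014, 3, 15), date(2014, 10, 1)]

usable = encounters_in_window(encounters, index_date, 180, 365)

# One simple way to express encounter density: encounters per observed day.
density = len(usable) / 365
```

In this sketch, the first encounter falls before the observation window and the last falls inside the prediction window, so only the two middle encounters would feed the model. A longer observation window admits more of the record, while a longer prediction window pushes the usable data further from the diagnosis.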
“In our analysis, model performance was most strongly influenced by the diversity of data, basic feature construction and the length of the observation window,” wrote Kenny Ng, research staff member in the Center for Computational Health and manager of the Health Analytics Research Group at IBM Thomas J. Watson Research Center and first author on the study. “In raw form, EHR data are highly diverse, represented by thousands of variants for disease coding, medication orders, laboratory measures, and other data types. It seems obvious that some level of feature construction that relies on well-established ontologies should improve model performance.”
When evaluating the data of 1,684 heart failure cases, the model revealed that the types of data previously used to predict heart failure were outdated. For example, only six of the 28 risk factors within the Framingham Heart Failure Signs and Symptoms were accurate predictors of a future diagnosis of heart failure.
“The quantity and diversity of available EHR data are highly heterogeneous among patients, and this poses potential methodological challenges in using EHR data for predictive modeling purposes,” wrote Ng and colleagues. “In contrast to epidemiological data that are collected using a structured protocol and time schedule, EHR data accrue in an unscheduled manner and with a modest amount of structure defined by clinical protocols. Although we do not have evidence of the generalizability of this finding to other areas of predictive modeling, we can conclude that there is likely to be a strong trade-off in model performance and the size of the training set and that smaller practices may benefit from pooling of data.”
The scientists hope to refine the model to improve the accuracy of its predictions and to test models using different data.