CNN, transfer learning automate EMR data input

Researchers at the University of Washington have developed a method for streamlining electronic medical record (EMR) data entry using convolutional neural networks (CNNs) and transfer learning, according to a paper published April 12 in Artificial Intelligence in Medicine.

The team’s work is part of a larger effort to automate EMR input, which can be both time-consuming and error-prone. Authors Anthony Rios, PhD, and Ramakanth Kavuluru, PhD, said that while EMRs are a necessity in the clinic, manually logging International Classification of Diseases (ICD) codes and processing essay-length records presents a challenge for physicians already strapped for time.

“Annotating EMRs with ICD codes is important for medical billing,” Rios and Kavuluru wrote. “If a diagnosis code cannot be justified, then the doctor/hospital may not be paid by the insurers, or worse, cause unfair financial burden to the patient. Therefore, developing automated medical coding systems and tools for human coders to become more efficient and accurate is vital.”

Automating ICD coding using CNNs isn’t a new idea, but Rios and Kavuluru said there’s one major roadblock to a successful system: the fact that some disease codes occur very infrequently, creating a paucity of reliable data for training a CNN. The authors tried to circumvent that by applying transfer learning—the process of transferring knowledge acquired from one task to a different task—to their workflow.
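The core idea can be illustrated with a minimal sketch: weights learned on a data-rich source task are used to initialize the model for a data-poor target task, rather than starting from random values. This toy example uses a plain logistic model in place of a CNN, and all data and names here are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear(X, y, W, epochs=200, lr=0.1):
    """One-layer logistic model trained with plain gradient descent."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ W))      # sigmoid predictions
        W -= lr * X.T @ (p - y) / len(y)      # gradient step
    return W

# Source task: plentiful labeled data (stands in for PubMed abstracts).
X_src = rng.normal(size=(1000, 20))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(float)

# Target task: only a handful of examples (stands in for rare ICD codes).
X_tgt = rng.normal(size=(20, 20))
y_tgt = (X_tgt[:, 0] + X_tgt[:, 1] > 0).astype(float)

W_src = train_linear(X_src, y_src, np.zeros(20))   # pretrain on source
W_cold = train_linear(X_tgt, y_tgt, np.zeros(20))  # train from scratch
W_warm = train_linear(X_tgt, y_tgt, W_src.copy())  # transfer: warm start

# Compare generalization on held-out target-task data.
X_test = rng.normal(size=(500, 20))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(float)
acc = lambda W: ((X_test @ W > 0) == y_test).mean()
print(f"from scratch: {acc(W_cold):.2f}  transferred: {acc(W_warm):.2f}")
```

The warm-started model benefits from structure learned on the larger source dataset, which is the effect the authors exploit for infrequent diagnosis codes.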

Rios and Kavuluru first trained a CNN to predict medical subject headings using 1.6 million indexed biomedical abstracts pulled from PubMed, then trained a CNN on 71,463 real-world EMRs from the University of Kentucky Medical Center to predict ICD diagnosis codes.

The authors found their model’s micro and macro F-scores, both standard measures of classification performance, improved by more than 8% when they applied the transfer learning approach. Their method also outperformed other transfer learning methods, yielding an almost 2% improvement in the macro F-score.
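The distinction between the two scores matters for rare codes, and is easy to see from their definitions: micro-F1 pools true/false positive counts across all labels, so frequent codes dominate it, while macro-F1 averages the per-label F1 values, so each rare code counts as much as a common one. A small hand-rolled example (the labels here are invented for illustration):

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0 when there are no true positives."""
    if tp == 0:
        return 0.0
    p, r = tp / (tp + fp), tp / (tp + fn)
    return 2 * p * r / (p + r)

# Toy gold/predicted label sets: "common" is frequent, "rare" is missed once.
gold = [{"common"}, {"common"}, {"common"}, {"common", "rare"}]
pred = [{"common"}, {"common"}, {"common"}, {"common"}]

labels = {"common", "rare"}
counts = {l: Counter() for l in labels}
for g, p in zip(gold, pred):
    for l in labels:
        if l in p and l in g:
            counts[l]["tp"] += 1
        elif l in p:
            counts[l]["fp"] += 1
        elif l in g:
            counts[l]["fn"] += 1

# Micro: sum counts over labels first, then compute one F1.
micro = f1(sum(c["tp"] for c in counts.values()),
           sum(c["fp"] for c in counts.values()),
           sum(c["fn"] for c in counts.values()))
# Macro: compute F1 per label, then average.
macro = sum(f1(c["tp"], c["fp"], c["fn"]) for c in counts.values()) / len(labels)
print(f"micro-F1 = {micro:.2f}, macro-F1 = {macro:.2f}")
```

Here a single missed rare label barely dents micro-F1 but halves macro-F1, which is why the authors report both when evaluating performance on infrequent ICD codes.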

“While calculating the macro F-score over all labels gives some insight about how our method performs on infrequent labels, if the frequent and infrequent codes are jointly compared, then it confounds its interpretation,” Rios and Kavuluru wrote. “We find that our proposed method improves infrequent label performance by 5%.”

The team said they’re looking to expand the method by incorporating more PubMed data and exploring hospital-to-hospital transfer learning. While their results were encouraging, they acknowledged there’s room for improvement.

“The major weakness of this line of work is similar to the weaknesses of other transfer learning methodologies—we must train our model on two different datasets,” they wrote. “However, we believe this is an acceptable weakness, because only the training time is increased.”