How AI, Twitter can improve our understanding of the opioid epidemic

Machine learning, natural language processing and social media can help researchers track the ongoing opioid epidemic, according to new findings published in JAMA Network Open.

The National Institute on Drug Abuse reported that opioids were responsible for more than 47,000 deaths in the United States in 2017, up from more than 18,000 deaths in 2007. According to the CDC, the numbers are even higher: 68,000 overdose deaths were reported in 2018. Tracking the epidemic’s progression poses a significant challenge for researchers.

“Studies have suggested that the state-by-state variations in opioid overdose–related deaths are multifactorial but may be associated with differences in state-level policies and laws regarding opioid prescribing practices and population-level awareness or education regarding the risks and benefits of opioid use,” wrote lead author Abeed Sarker, PhD, Perelman School of Medicine at the University of Pennsylvania in Philadelphia, and colleagues. “Although the geographic variation is now known, strategies for monitoring the crisis are grossly inadequate. Current monitoring strategies have a substantial time lag, meaning that the outcomes of recent policy changes, efforts and implementations cannot be assessed close to real time.”

Sarker et al. developed a text-processing tool for analyzing social media posts related to opioids. The tool was developed using Twitter posts published from January 2012 to October 2015. All of the posts, more than 9,000 overall, originated from Pennsylvania. They were located by searching Twitter for keywords, including certain street names for a variety of drugs.

After the Twitter posts were manually categorized into four classes, the team trained and evaluated “several supervised learning algorithms.” A total of six classifiers were examined: naive Bayes, decision tree, k-nearest neighbors, random forest, support vector machine and a deep convolutional neural network (CNN). More than 6,000 posts were used for training, another 900 for validation and more than 1,800 for testing. The precision, recall and F1 scores of all six classifiers were calculated.
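
For readers unfamiliar with these metrics, the sketch below shows how precision, recall and F1 can be computed per class from a list of manual labels and a list of classifier predictions. It is an illustration only and not the authors' code: the function name, the four class names and the toy labels are all assumptions made up for this example.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen.

    Hypothetical helper for illustration; not taken from the study.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class was wrong
            fn[truth] += 1   # true class was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # posts assigned to this class
        actual = tp[label] + fn[label]     # posts truly in this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy data with assumed class names; these are not the study's labels or results.
y_true = ["abuse", "abuse", "information", "unrelated", "unrelated", "non_english"]
y_pred = ["abuse", "unrelated", "information", "unrelated", "abuse", "non_english"]

scores = per_class_scores(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it is only high when precision and recall are both high, which makes it a common single-number summary when classes are imbalanced, as they are in this dataset.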

Overall, 19.4% of the annotated Twitter posts were related to opioid abuse, 22.2% included information about opioids, 4.7% were in a language other than English and 53.6% were not actually related to opioids. The team’s highest F1 score was 0.726—“comparable to human agreement”—and the authors noted that “automatic processing of social media data, combined with geospatial and temporal information, may provide close to real-time insights into the status and trajectory of the opioid epidemic.”

“Big data derived from social media such as Twitter present the opportunity to perform localized monitoring of the opioid crisis in near real time,” Sarker and colleagues concluded. “In this cross-sectional study, we presented the building blocks for such social media–based monitoring by proposing data collection and classification strategies that employ natural language processing and machine learning.”

Michael Walter, Managing Editor

Michael has more than 16 years of experience as a professional writer and editor. He has written at length about cardiology, radiology, artificial intelligence and other key healthcare topics.