Harvard researcher creates machine learning model to treat drug-resistant tuberculosis

A Harvard undergrad has created a computer program that can improve the treatment of tuberculosis, an infectious disease with unique challenges thanks to its shapeshifting ability to resist drugs.

The disease is common: about 10 million new cases are diagnosed globally each year, and 4% of those cases are multidrug-resistant TB, meaning resistant to at least two drugs. Among the drug-resistant infections, 1 in 10 are resistant to multiple medications.

Current drug resistance testing methods are slow, with some lab tests taking up to six weeks after an initial diagnosis to reveal drug sensitivity. In many places across the world where drug sensitivity testing is more difficult to conduct, TB treatment can come down to guesswork, which leaves a significant opening for machine learning to improve on the status quo. In addition, other testing methods are flawed: some cannot detect second-line drug resistance, and others are blind to interactions between genetic mutations.

The Harvard computer program predicts the resistance of a TB strain to 10 first- and second-line drugs in a tenth of a second with greater precision than similar models. In a clinical setting, the computer program, which will be available online as a feature on Harvard Medical School’s genTB tool, could speed up and improve the accuracy of detection of TB drug resistance.

“Drug-resistant forms of TB are hard to detect, hard to treat and portend poor outcomes for patients,” Maha Farhat, senior study author, assistant professor of biomedical informatics at Harvard Medical School and a pulmonary medicine specialist at Massachusetts General Hospital, said in a statement. “The ability to rapidly detect the full profile of resistance upon diagnosis is critical both to improving individual patient outcomes and in reducing the spread of the infection to others.”

The Harvard scientists set out specifically to overcome the flaws of current TB drug resistance testing models and exposed the computer programs to a data set with a wide range of genetic mutations. The models were trained on 3,601 TB strains resistant to first- and second-line drugs, including 1,288 multidrug-resistant strains. In testing, the models were challenged to predict resistance in 792 fully sequenced TB genomes that the models were not trained on.

Out of five computational models designed and tested, two stood out for their accuracy: a statistical model and a neural network. The final model is a diagnostic tool that can determine drug resistance by assessing all available information against prior knowledge. The wide and deep neural network was better for predicting second-line therapy resistance, making it the more accurate model.
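
For readers curious what a "wide and deep" layout looks like in practice, here is a minimal Python sketch of the general idea, not the study's actual code. It assumes each strain is encoded as a list of 0/1 flags marking which genetic variants are present, and it uses untrained random weights purely for illustration. The names wide_and_deep_predict and random_params, the layer sizes and the six-variant example strain are all hypothetical. The "wide" path feeds the raw variant flags straight to the output, the "deep" path passes them through hidden layers that can capture combinations of variants, and the two are joined to produce one resistance probability per drug.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def wide_and_deep_predict(variants, params):
    """Return one resistance probability per drug for a single strain.

    variants: list of 0/1 flags (assumed encoding, one flag per genetic variant).
    """
    # Deep path: hidden layers can learn interactions between variants.
    hidden = variants
    for weights, biases in params["deep_layers"]:
        hidden = dense(hidden, weights, biases, relu)
    # Wide path: the raw flags bypass the hidden layers and are joined
    # with the deep features just before the output layer.
    combined = list(variants) + hidden
    out_weights, out_biases = params["output"]
    return dense(combined, out_weights, out_biases, sigmoid)


def random_params(n_variants, hidden_sizes, n_drugs, seed=0):
    """Untrained placeholder weights, for illustration only."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    layers = []
    size = n_variants
    for h in hidden_sizes:
        layers.append((matrix(h, size), [0.0] * h))
        size = h
    output = (matrix(n_drugs, n_variants + size), [0.0] * n_drugs)
    return {"deep_layers": layers, "output": output}


# Hypothetical strain with six variant flags; 10 outputs mirror the
# 10 first- and second-line drugs mentioned in the article.
params = random_params(n_variants=6, hidden_sizes=[4, 4], n_drugs=10)
strain = [1, 0, 0, 1, 0, 1]
probabilities = wide_and_deep_predict(strain, params)
```

Because the weights here are random, the probabilities carry no meaning. A real model would learn its weights from sequenced strains with known resistance profiles.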

“The wide and deep neural network interlaces two forms of machine learning to identify the combined effects of genetic variants on antibiotic resistance,” Michael Chen, study first author who started writing the program as a freshman at Harvard, said in a statement.

The neural network model predicted resistance to first-line drugs with 94 percent accuracy and achieved 90 percent accuracy with second-line drugs, on average. By comparison, the statistical model recorded 94 percent accuracy in predicting resistance to first-line drugs and 88 percent accuracy for second-line drugs.

The speed and accuracy underscore the growing capabilities of AI in the healthcare space when it comes to treating and diagnosing infectious diseases.

“Our model highlights the role of artificial intelligence in the case of TB, but its importance goes well beyond TB,” said Andy Beam, study co-senior author, research associate in biomedical informatics at HMS and a visiting lecturer at the Harvard T.H. Chan School of Public Health. “AI will help guide clinical decision-making by rapidly synthesizing large amounts of data to help clinicians make the most informed decision in many scenarios and for many other diseases.”