Deep learning digs deep to understand mutations in DNA ‘junk’

Researchers have developed a deep-learning framework that can show how mutations in “noncoding DNA” (the parts of the genome that do not code for proteins) contribute to autism. They believe the approach generalizes, giving clinical researchers a way to study the role of noncoding mutations in just about any disease.

Senior author Olga Troyanskaya, PhD, of Princeton and colleagues published the work online May 27 in Nature Genetics.

The team applied the framework to 1,790 families in which one child has autism spectrum disorder and the siblings do not.

They developed a deep-learning technique in which an algorithm trains itself to home in on sections of DNA, including those in noncoding “junk” segments, and to determine whether each plays a role in any of the hundreds of protein processes known to affect gene regulation.

As explained in lay terms in a news item from Princeton’s school of engineering and applied science, the algorithm “slides along the genome,” analyzing each chemical pair of DNA units in the context of the 1,000 chemical pairs around it. It performs this task until it has scanned all mutations.
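The sliding step described above can be pictured with a short Python sketch. It is only an illustration, not the team's code: the function name `context_windows`, the even split of 500 units on each side, and the "N" padding at the ends of the sequence are all assumptions made here for clarity.

```python
def context_windows(sequence, flank=500):
    """Yield (position, base, window) for every position in `sequence`.

    `flank` units are taken on each side of the position, so flank=500
    gives roughly the 1,000 surrounding units mentioned in the article.
    The even split and the "N" padding are assumptions of this sketch.
    """
    # Pad both ends so that positions near the edges still get a full window.
    padded = "N" * flank + sequence + "N" * flank
    for pos, base in enumerate(sequence):
        # Position `pos` of the original sits at index pos + flank in `padded`,
        # so the slice below is centered on it.
        window = padded[pos : pos + 2 * flank + 1]
        yield pos, base, window


# Tiny demonstration with a 4-letter sequence and one unit of context per side.
demo = list(context_windows("ACGT", flank=1))
```

In a real pipeline each window would be handed to a trained model rather than collected in a list; the sketch only shows how the context around each position is assembled.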

The result is an algorithmically prioritized list of DNA sequences that are likely to regulate genes, along with the mutations that are likely to interfere with that regulation.
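One way to picture such a prioritized list is to score a stretch of DNA with and without each mutation and sort by how much the score changes. The article does not say how the authors compute their ranking, so the Python sketch below is an assumption throughout: `toy_score` is a stand-in (it simply measures the fraction of G and C letters) for whatever a trained model would predict, and `rank_mutations` is a hypothetical helper name.

```python
def toy_score(sequence):
    """Placeholder for a trained model's prediction (assumption).

    Returns the fraction of G and C letters; it has no biological meaning
    here and only makes the ranking logic runnable.
    """
    return sum(base in "GC" for base in sequence) / len(sequence)


def rank_mutations(sequence, mutations, score=toy_score):
    """Rank single-letter mutations by how much they change the score.

    `mutations` is a list of (position, new_base) pairs. The result is a
    list of (effect, position, new_base) tuples, largest effect first.
    """
    reference_score = score(sequence)
    ranked = []
    for pos, new_base in mutations:
        # Build the mutated sequence by swapping in one letter.
        mutated = sequence[:pos] + new_base + sequence[pos + 1:]
        effect = abs(score(mutated) - reference_score)
        ranked.append((effect, pos, new_base))
    return sorted(ranked, reverse=True)


# Two hypothetical mutations in a toy sequence.
ranked = rank_mutations("AAAA", [(0, "T"), (1, "G")])
```

With a real model in place of `toy_score`, the mutations at the top of the list would be the ones predicted to disturb regulation the most.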

Troyanskaya and co-authors found that, within the study cohort, fewer than 30% of individuals affected by autism spectrum disorder had a previously identified genetic cause.

The team said the newly found mutations will probably increase that fraction “significantly.”

“This method provides a framework for doing this analysis with any disease,” Troyanskaya added. “This transforms the way we need to think about the possible causes of those diseases.”

The team is currently applying its deep-learning framework to search for the genetic causes of various cancers, heart diseases and other health disorders.