Google DeepMind AI Tool Predicts Whether Genetic Mutations Are Likely to Cause Harm

Researchers at Google DeepMind, the tech giant’s artificial intelligence unit, on Tuesday unveiled a tool that can predict whether genetic mutations are likely to cause harm, a breakthrough that could aid research into rare diseases. Pushmeet Kohli, vice president of research at Google DeepMind, said the findings “are another step toward understanding the impact of artificial intelligence on the natural sciences.”

The tool focuses on so-called “missense” mutations, in which a single letter of the genetic code is changed. A typical human genome carries about 9,000 such mutations; they may be harmless, or they may cause diseases such as cystic fibrosis or cancer, or impair brain development. To date, 4 million such mutations have been observed in humans, but only 2% of them have been classified as either pathogenic or benign.

In total, there are 71 million possible missense mutations. Google DeepMind's tool, AlphaMissense, reviewed all of them and was able to classify 89% with 90% accuracy. Each mutation is assigned a score that represents its risk of causing disease (that is, of being pathogenic).

Of the 71 million, 57% were classified as probably benign and 32% as probably pathogenic; the remainder were indeterminate. The database is public and available to scientists, and the study was published Tuesday in the journal Science.
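As a rough illustration of how a per-mutation score can be turned into the three categories described above, here is a minimal Python sketch. The cutoff values, the assumption that scores run from 0 to 1, and the function names are all hypothetical and chosen only for this example; the article does not state the thresholds AlphaMissense actually uses.

```python
from collections import Counter

# Hypothetical cutoffs for illustration only; not taken from the article.
BENIGN_BELOW = 0.3
PATHOGENIC_ABOVE = 0.6


def classify(score: float) -> str:
    """Map a score (assumed to lie between 0 and 1) to one of three labels."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is assumed to lie between 0 and 1")
    if score < BENIGN_BELOW:
        return "probably benign"
    if score > PATHOGENIC_ABOVE:
        return "probably pathogenic"
    return "indeterminate"


def tally(scores):
    """Count how many scores fall into each category."""
    return Counter(classify(s) for s in scores)


def indeterminate_percent(benign_pct: int, pathogenic_pct: int) -> int:
    """Share left over once the benign and pathogenic shares are removed."""
    return 100 - benign_pct - pathogenic_pct
```

With the article's figures, `indeterminate_percent(57, 32)` gives 11, so roughly 7.8 million of the 71 million possible mutations remain indeterminate.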

Experts Joseph Marsh and Sarah Teichmann wrote in an article published in Science that AlphaMissense demonstrated “superior performance” over previously available tools.

Jun Cheng of Google DeepMind cautioned against relying on the tool alone. “We should emphasize that these predictions were never really trained and were never really intended to be used solely for clinical diagnosis,” he said. “However, we do think that our predictions may help improve the diagnosis rate of rare diseases and may also help us discover new disease-causing genes,” Cheng added. Researchers say this could indirectly lead to the development of new treatments.

The tool was trained on DNA from humans and closely related primates, allowing it to identify which genetic mutations are widespread. The training allowed the tool to take in “millions of protein sequences and learn what regular protein sequences look like,” Cheng said.

It can then flag mutations and estimate their potential for harm. Cheng likened the process to learning a language. “If we replace a word in an English sentence, someone familiar with English can immediately see whether this word substitution changes the meaning of the sentence.”
