Google DeepMind and Sanger Institute Partner on AI Genomics Consortium

A new collaboration among DeepMind, Google's artificial intelligence company; the Wellcome Sanger Institute in the UK; and Google.org aims to create high-quality genomic datasets for training AI models. The partnership will develop these datasets over a five-year period.

The consortium is backed by $5 million in annual funding from Google.org and DeepMind, an investment that reflects how much AI research in genomics depends on reliable data.

DeepMind has already made notable contributions to the field with its AlphaFold software, which predicts protein structures. The company also released AlphaGenome in January, a publicly available model that predicts the function of DNA sequences. According to Žiga Avsec, lead author and a researcher at Google DeepMind, the tool goes beyond basic expression predictions to cover more detailed aspects such as DNA accessibility and transcription-factor binding.

Another AI platform developed by DeepMind is Co-Scientist, which was introduced in May as a multiagent system capable of scanning existing literature to generate hypotheses. While these tools rely on open-access datasets, not all areas of the life sciences have suitably indexed resources for training new AI algorithms.

Julia Wilson, the Sanger Institute's chief innovation and impact officer, says the consortium seeks to create widely shareable resources that will facilitate transformative scientific discoveries across the life sciences.