Gene Manipulation Enters the AI Era

In recent years, advances in gene editing have given scientists the tools to rewrite genes. However, precisely regulating genes in specific cell types remains a major challenge. The study discussed here (Gosai et al., 2024) uses machine learning to design and synthesize cis-regulatory elements (CREs), the DNA sequences that control gene expression and help define the thousands of distinct cell types in the body.

The researchers developed a modular platform called Computational Optimization of DNA Activity (CODA). CODA offers an innovative way to target gene expression to specific cells: it generates CREs that activate genes only in the target cells, which can significantly reduce off-target effects.

How CODA Works

  • Massively parallel reporter assays (MPRAs) let researchers test the activity of hundreds of thousands of CRE sequences across different cell types simultaneously, building a large dataset of sequence-activity relationships.
  • The researchers trained a deep learning model, Malinois, on this dataset. This neural network predicts the activity of any given DNA sequence in each of the cell types it was trained on, capturing the sequence features that drive cell-type-specific expression. Malinois predictions correlate strongly with measured CRE activity.
  • Malinois predictions also agree with orthogonal measurements of regulatory activity, such as STARR-seq, DHS-seq and H3K27ac ChIP-seq.
  • CODA uses several algorithms (evolutionary, probabilistic, and gradient-based) to iteratively generate and optimize CRE sequences. Each algorithm uses Malinois predictions as its objective, steering sequences toward the desired cell type specificity.
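
The iterative design loop described above can be illustrated with a minimal sketch of the evolutionary variant. This is not the CODA implementation: `predict_activity` is a hypothetical stand-in for a trained predictor such as Malinois (here it only counts made-up motifs), and the cell type names, motifs, objective, and hyperparameters are all assumptions chosen for illustration.

```python
import random

BASES = "ACGT"

# Hypothetical motifs standing in for a trained model; not real biology.
TOY_MOTIFS = {"cell_A": "GATA", "cell_B": "CACG", "cell_C": "TTGC"}


def predict_activity(seq):
    """Toy predictor: activity per cell type = number of motif occurrences."""
    return {cell: float(seq.count(motif)) for cell, motif in TOY_MOTIFS.items()}


def specificity(seq, target):
    """Assumed objective: target activity minus the highest off-target activity."""
    scores = predict_activity(seq)
    off_target = max(v for cell, v in scores.items() if cell != target)
    return scores[target] - off_target


def mutate(seq, rng):
    """Replace one randomly chosen position with a random base."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(BASES) + seq[pos + 1:]


def evolve(target, length=40, population=50, generations=30, seed=0):
    """Keep the top fifth of the pool each round and refill it with mutants."""
    rng = random.Random(seed)
    pool = ["".join(rng.choice(BASES) for _ in range(length))
            for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=lambda s: specificity(s, target), reverse=True)
        parents = pool[: population // 5]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population - len(parents))]
        pool = parents + children
    return max(pool, key=lambda s: specificity(s, target))


best = evolve("cell_A")
print(best, specificity(best, "cell_A"))
```

Because the best sequences are carried over unchanged each round, the top score never decreases. A gradient-based or probabilistic optimizer would replace `mutate` and the selection step but could reuse the same objective.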

Fig. 1. MPRAs test CRE sequence activity (a) and the Malinois model predicts it (b). (Gosai S J.; et al. 2024)

Research Results

To validate CODA, the researchers rigorously tested the synthetic CREs it generated. They measured CRE activity in vitro using MPRAs, and also validated the CREs in mice and zebrafish to confirm their function and cell type specificity in relevant tissues.

The study showed that CODA-designed synthetic CREs significantly outperformed naturally occurring sequences in cell type specificity, achieving a clear separation between target and off-target activity. The synthetic sequences exhibited higher activity in target cell types and stronger repression in non-target cell types. This demonstrates that machine learning models can not only predict existing patterns but also actively design novel, high-performance CREs.
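
One simple way to put a number on "separation between target and off-target activity" is to subtract the highest off-target activity from the target activity for each sequence, then compare groups. The sketch below assumes that definition. The activity values are placeholders invented purely for illustration and are not data from the study, and the keys `target`, `off_1`, and `off_2` are hypothetical cell type labels.

```python
from statistics import median


def separation(activity, target):
    """Target activity minus the highest activity in any other cell type."""
    off_target = [v for cell, v in activity.items() if cell != target]
    return activity[target] - max(off_target)


def median_separation(group, target="target"):
    """Median separation score across a group of sequences."""
    return median(separation(activity, target) for activity in group)


# Placeholder measurements, invented for illustration only.
natural = [
    {"target": 2.0, "off_1": 1.5, "off_2": 1.0},
    {"target": 1.0, "off_1": 1.25, "off_2": 0.5},
]
synthetic = [
    {"target": 3.0, "off_1": 0.5, "off_2": 0.25},
    {"target": 2.5, "off_1": -0.5, "off_2": 0.0},
]

print(median_separation(natural), median_separation(synthetic))
```

A positive score means the sequence is more active in the target cell type than in every off-target one, and a larger score means a wider margin.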

This research provides a powerful new tool for designing gene therapies and other biotechnology applications that require precise genetic control. CODA opens up vast prospects for the treatment of inherited diseases and the development of novel diagnostic tools through the creation of customized CREs capable of controlling gene expression with unprecedented precision.

AI is bringing unprecedented hope and possibilities to mankind, and the emergence of CODA represents the latest direction in the application of AI in the life sciences. Our company has industry-leading AI tools and is committed to working with our customers in the biopharmaceutical industry to solve the major challenges facing the therapeutic space. If you are interested in our AI technology, please contact us to discuss how we can move your project forward.

Original Article:

Gosai S J.; et al. (2024). Machine-guided design of cell-type-targeting cis-regulatory elements. Nature. 2024: 1-10.
