Targeted Design of Artificial Enhancers for Specific Tissues in the Drosophila Embryo

Enhancers control gene expression and play a crucial role in development and homeostasis. However, the targeted design of new enhancers with tissue-specific activities remains a challenge. Here, we combine deep learning and transfer learning to design tissue-specific enhancers for five tissues in the Drosophila melanogaster embryo: the central nervous system (CNS), the epidermis, the gut, the muscles, and the brain. We first train convolutional neural networks (CNNs) on genome-wide scATAC-seq datasets and then fine-tune them on smaller-scale data from in vivo enhancer activity experiments, yielding models with positive predictive values ranging from 25% to 75% in cross-validation. We then designed 40 artificial enhancers (eight per tissue) and evaluated them in vivo; 31 (78%) were active and 27 (68%) functioned in the targeted tissue (100% for CNS and muscles). The strategy of combining genome-wide functional datasets with smaller datasets through transfer learning is generally applicable and should enable the design of tissue- and cell-type-specific enhancers in any system.

Introduction

Enhancers are central to gene regulation during development and homeostasis, yet designing new enhancers with a chosen tissue-specific activity is still difficult. In this study, we integrated deep learning and transfer learning to design tissue-specific enhancers in Drosophila melanogaster embryos. Convolutional neural networks were first trained on genome-wide scATAC-seq datasets and then fine-tuned on smaller-scale data from in vivo enhancer activity experiments. We designed 40 artificial enhancers and evaluated them in vivo: 31 were active, and 27 functioned in the targeted tissue. The strategy is broadly applicable and may enable the design of tissue- and cell-type-specific enhancers in any system.

Targeted Design of Artificial Enhancers

This study aims to design tissue-specific artificial enhancers for five tissues in the Drosophila melanogaster embryo: the central nervous system (CNS), the epidermis, the gut, the muscles, and the brain. The approach combines deep learning with transfer learning. Convolutional neural networks were first trained on genome-wide scATAC-seq datasets to learn the DNA sequence features associated with chromatin accessibility in different tissues. These networks were then fine-tuned on smaller-scale data from in vivo enhancer activity experiments to improve the prediction of enhancer activity in the targeted tissues. A total of 40 artificial enhancers (eight per tissue) were designed and evaluated in vivo. Of these, 31 were active and 27 functioned in the targeted tissue. The same strategy can be used to design tissue-, cell-type-, and cell-state-specific enhancers in any system.
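The pretrain-then-fine-tune idea described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the study's model: the sequences and labels are synthetic, and the motif `GATAAG`, the helper names (`init_model`, `scan`, `predict`, `train`, `make_data`), the layer sizes, and the learning rates are all hypothetical. Freezing the convolutional filters during fine-tuning is one common transfer-learning choice; the summary above does not say whether the authors did this.

```python
import copy
import math
import random

BASES = "ACGT"

def init_model(n_filters, width, rng):
    """Tiny CNN: n_filters motif-like filters, max pooling, logistic output."""
    return {
        "filters": [[[rng.uniform(-0.1, 0.1) for _ in BASES]
                     for _ in range(width)] for _ in range(n_filters)],
        "weights": [rng.uniform(-0.1, 0.1) for _ in range(n_filters)],
        "bias": 0.0,
    }

def scan(seq, filt):
    """Slide one filter along the sequence; return (max score, start of max)."""
    width = len(filt)
    best_score, best_pos = -math.inf, 0
    for start in range(len(seq) - width + 1):
        score = sum(filt[j][BASES.index(seq[start + j])] for j in range(width))
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(model, seq):
    """Predicted probability that the sequence is a positive example."""
    feats = [scan(seq, f)[0] for f in model["filters"]]
    return sigmoid(model["bias"] + sum(w * x for w, x in zip(model["weights"], feats)))

def train(model, data, lr, epochs, train_filters=True):
    """Stochastic gradient descent on binary cross-entropy.

    With train_filters=False the convolutional filters are frozen and only
    the output layer is updated (the fine-tuning step in this sketch).
    """
    for _ in range(epochs):
        for seq, label in data:
            hits = [scan(seq, f) for f in model["filters"]]
            logit = model["bias"] + sum(w * s for w, (s, _) in zip(model["weights"], hits))
            err = sigmoid(logit) - label  # dLoss/dlogit
            for k, (score, pos) in enumerate(hits):
                w_k = model["weights"][k]
                model["weights"][k] -= lr * err * score
                if train_filters:
                    # Gradient reaches only the window that produced the max.
                    for j in range(len(model["filters"][k])):
                        base = BASES.index(seq[pos + j])
                        model["filters"][k][j][base] -= lr * err * w_k
            model["bias"] -= lr * err

def make_data(n, motif, rng, length=30):
    """Synthetic data (hypothetical): positives carry a planted motif."""
    data = []
    for i in range(n):
        seq = "".join(rng.choice(BASES) for _ in range(length))
        label = i % 2
        if label:
            pos = rng.randrange(length - len(motif) + 1)
            seq = seq[:pos] + motif + seq[pos + len(motif):]
        data.append((seq, label))
    return data

rng = random.Random(0)
model = init_model(n_filters=4, width=6, rng=rng)

# Step 1: pretrain every layer on a large dataset (stand-in for scATAC-seq labels).
train(model, make_data(200, "GATAAG", rng), lr=0.05, epochs=5)

# Step 2: fine-tune only the output layer on a small dataset
# (stand-in for in vivo enhancer activity labels).
filters_before_finetune = copy.deepcopy(model["filters"])
weights_before_finetune = list(model["weights"])
train(model, make_data(20, "GATAAG", rng), lr=0.01, epochs=5, train_filters=False)

print(predict(model, "ACGT" * 8))
```

The point of the two-step schedule is that the small dataset only has to adjust a handful of output parameters, while the sequence features come from the much larger genome-wide dataset.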

Conclusions

The study shows that combining deep learning with transfer learning can support the design of tissue-specific artificial enhancers in Drosophila melanogaster embryos. Convolutional neural networks were trained on genome-wide scATAC-seq datasets and fine-tuned on smaller-scale data from in vivo enhancer activity experiments. Of the 40 artificial enhancers designed and evaluated in vivo, 31 were active and 27 functioned in the targeted tissue. The same strategy can be used to design tissue- and cell-type-specific enhancers in any system.
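As a quick check on the arithmetic, the reported counts can be turned into rates with a few lines of Python. The function name `positive_predictive_value` is an illustrative choice, and treating every designed enhancer as a "predicted positive" for its target tissue is an assumption made here for the example; the cross-validated positive predictive values quoted for the models were computed by the authors on their own held-out data.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): the share of predicted positives that are real."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    return true_positives / predicted_positive

DESIGNED = 40   # artificial enhancers tested in vivo (eight per tissue)
ACTIVE = 31     # showed any enhancer activity
ON_TARGET = 27  # active in the tissue they were designed for

active_rate = ACTIVE / DESIGNED
on_target_ppv = positive_predictive_value(ON_TARGET, DESIGNED - ON_TARGET)
# CNS and muscle designs: all eight of eight were on target.
cns_ppv = positive_predictive_value(8, 0)

print(f"active: {active_rate:.1%}")        # 77.5%, reported as 78%
print(f"on target: {on_target_ppv:.1%}")   # 67.5%, reported as 68%
print(f"CNS or muscle: {cns_ppv:.0%}")     # 100%
```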

Source:

https://www.nature.com/articles/s41586-023-06905-9
