Series GSE183936
Status Public on Feb 24, 2022
Title DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers [Drosophila genome-wide UMI-STARR-seq]
Organisms Drosophila melanogaster; synthetic construct
Experiment type Other
Summary Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood and enhancer de novo design is considered impossible. Here we built a deep learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally non-equivalent instances of the same TF motif that are determined by motif-flanking sequence and inter-motif distances. We validated these rules experimentally and demonstrated their conservation in human by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.
 
Overall design Genome-wide UMI-STARR-seq was performed in S2 cells using two core promoters, one representing the housekeeping and one the developmental transcriptional program. All experiments were performed in two biological replicates.
 
Contributor(s) de Almeida BP, Reiter F, Pagani M, Stark A
Citation(s) 35551305
Submission date Sep 10, 2021
Last update date May 26, 2022
Contact name Bernardo P de Almeida
E-mail(s) bernardo.almeida@imp.ac.at
Organization name Research Institute of Molecular Pathology (IMP)
Lab Stark Lab
Street address Campus-Vienna-Biocenter 1
City Wien
ZIP/Postal code 1030
Country Austria
 
Platforms (3)
GPL19604 Illumina HiSeq 2500 (synthetic construct)
GPL22106 NextSeq 550 (Drosophila melanogaster)
GPL27609 NextSeq 550 (synthetic construct)
Samples (6)
GSM5574549 S2_dev_STARRseq_input
GSM5574550 S2_dev_STARRseq_rep1
GSM5574551 S2_dev_STARRseq_rep2
This SubSeries is part of SuperSeries:
GSE183939 DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers
Relations
BioProject PRJNA762363
SRA SRP336576

Download family                    Format
SOFT formatted family file(s)      SOFT
MINiML formatted family file(s)    MINiML
Series Matrix File(s)              TXT

Supplementary file                               Size      Download        File type/resource
GSE183936_RAW.tar                                555.4 Mb  (http)(custom)  TAR (of BW)
GSE183936_S2_dev_STARRseq_merged.bw              51.8 Mb   (ftp)(http)     BW
GSE183936_S2_dev_STARRseq_merged.peaks.txt.gz    178.7 Kb  (ftp)(http)     TXT
GSE183936_S2_hk_STARRseq_merged.bw               50.7 Mb   (ftp)(http)     BW
GSE183936_S2_hk_STARRseq_merged.peaks.txt.gz     143.3 Kb  (ftp)(http)     TXT
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
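
The supplementary files listed for this Series can usually be fetched by direct URL. The following is a minimal sketch, assuming the standard GEO FTP layout in which a Series directory sits under a "stub" folder formed by replacing the last three digits of the accession with "nnn". The function names `geo_series_stub` and `geo_suppl_url` are hypothetical helpers introduced for illustration; they are not part of any NCBI tool, and the resulting URL should be checked against the download links on the record itself.

```python
import re

# Assumed HTTPS mirror of the GEO FTP site.
GEO_HTTPS_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_stub(accession: str) -> str:
    """Return the stub directory for a Series accession, e.g. GSE183936 -> GSE183nnn."""
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    digits = match.group(1)
    # Drop the last three digits and append the literal 'nnn'.
    return "GSE" + digits[:-3] + "nnn"


def geo_suppl_url(accession: str, filename: str) -> str:
    """Build the assumed URL of a supplementary file attached to a Series."""
    stub = geo_series_stub(accession)
    return f"{GEO_HTTPS_ROOT}/{stub}/{accession}/suppl/{filename}"


if __name__ == "__main__":
    # File name taken from the supplementary file table of this record.
    print(geo_suppl_url("GSE183936", "GSE183936_S2_dev_STARRseq_merged.peaks.txt.gz"))
```

The sketch only builds the URL string and performs no network access, so the download step (for example with `urllib.request.urlretrieve`) is left to the reader.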
