Series GSE269036
Status Public on Jun 14, 2024
Title Iterative deep learning-design of human enhancers exploits condensed sequence grammar to achieve cell type-specificity [RNA-Seq]
Organisms Homo sapiens; synthetic construct
Experiment type Expression profiling by high throughput sequencing; Other
Summary An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show that enhancers as short as 50 bp can maintain specificity.
 
Overall design We designed three plasmid libraries (R1-MPRA, R1-DHS, R2) of synthetic candidate enhancer sequences, each placed upstream of a minimal promoter (minP) and a unique barcode sequence, and transfected them into two human cell lines, HepG2 and K562. We quantified enhancer strength by sequencing the mRNA of transfected cells and the DNA of the plasmid libraries, then calculating the ratio of RNA barcode counts to DNA barcode counts. Two biological replicates were obtained for each cell line measurement.
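
The RNA-to-DNA barcode ratio described in the overall design can be illustrated with a minimal sketch. The function name, the input dictionaries, the sequencing-depth normalization, and the `min_dna` filter are all assumptions made for illustration; the record does not state how the submitters normalized or filtered counts.

```python
def rna_dna_ratios(rna_counts, dna_counts, min_dna=1):
    """Return an RNA/DNA ratio per barcode (hypothetical helper).

    Assumption: each library is scaled by its total read count before
    the ratio is taken, so differing sequencing depths do not dominate.
    Assumption: barcodes with fewer than `min_dna` DNA reads are dropped.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both libraries need at least one read")

    ratios = {}
    for barcode, dna in dna_counts.items():
        if dna < min_dna:
            continue  # too little plasmid signal to form a ratio
        rna = rna_counts.get(barcode, 0)  # barcodes unseen in RNA count as 0
        ratios[barcode] = (rna / rna_total) / (dna / dna_total)
    return ratios


# Toy counts, not taken from the dataset.
rna = {"AAAC": 30, "CCGT": 10}
dna = {"AAAC": 10, "CCGT": 10, "GGTA": 0}
print(rna_dna_ratios(rna, dna))  # {'AAAC': 1.5, 'CCGT': 0.5}
```

With two biological replicates per cell line, one plausible follow-up is to compute the ratio per replicate and compare or average the results; whether the submitters did so is not stated here.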
 
Contributor(s) Christopher Y, Sebastian C, Gun Woo B, Peter B, Wouter M, Georg S
Citation missing
Submission date Jun 04, 2024
Last update date Jun 14, 2024
Contact name Christopher Yin
E-mail(s) c8yin@uw.edu
Organization name University of Washington
Department Electrical & Computer Engineering
Lab Seelig
Street address 3946 W Stevens Wy NE
City Seattle
State/province WA
ZIP/Postal code 98105
Country USA
 
Platforms (2)
GPL21697 NextSeq 550 (Homo sapiens)
GPL27609 NextSeq 550 (synthetic construct)
Samples (16)
GSM8305400 R1-DHS_DNA
GSM8305401 R1-DHS_HepG2_replicate1
GSM8305402 R1-DHS_HepG2_replicate2
Relations
BioProject PRJNA1120004

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size File type/resource
GSE269036_R1-DHS_processed_data.csv.gz 62.3 Kb CSV
GSE269036_R1-MPRA_processed_data.csv.gz 99.3 Kb CSV
GSE269036_R2_processed_data.csv.gz 99.4 Kb CSV
Raw data are available in SRA
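
The processed supplementary files are gzip-compressed CSV, so they can be read without decompressing them to disk first. The sketch below uses only the Python standard library; the function name is hypothetical, and because the column names are not listed in this record, the rows are returned keyed by whatever header the file contains.

```python
import csv
import gzip


def read_processed_table(path):
    """Read a gzip-compressed CSV into a list of dicts (hypothetical helper).

    Assumption: the first row of the file is a header row. Values are
    returned as strings; numeric conversion is left to the caller.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))
```

For example, after downloading a file, `read_processed_table("GSE269036_R2_processed_data.csv.gz")` would return one dictionary per data row.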
