Sample GSM2668125
Status Public on Jan 27, 2018
Title GM12878_E15.5_Forebrain_scATAC
Sample type SRA
Source name Mixture of nuclei from E15.5 mouse forebrain and lymphoblastoid cell line (GM12878)
Organisms Homo sapiens; Mus musculus
Characteristics strain: C57BL/6NCrl
developmental stage: embryonic day 15.5
cell type: mixture of mouse embryonic forebrain and human GM12878 immortalized lymphoblastoid cell line
barcodes: p5:1-8, p7:1-12 i5: S502-S511; i7: S701-S715
Biomaterial provider Coriell Cell Repositories (GM12878)
Extracted molecule genomic DNA
Extraction protocol Nuclei were isolated from snap-frozen forebrain tissues or GM12878 cells. Permeabilized nuclei were distributed to 96 wells and tagmentation was carried out to introduce a first barcode combination. After pooling, 25 nuclei were sorted into 384 or 96 wells and a second barcode was introduced by PCR.
Permeabilized nuclei were distributed to 96 wells and tagmentation was carried out to introduce a first barcode combination (p7 and p5 barcodes). After pooling, 25 nuclei were sorted into 384 or 96 wells and a second barcode (i7 and i5 barcodes) was introduced by PCR. In general, libraries were sequenced on a HiSeq 2500 with the following read lengths: 50 + 43 + 37 + 50 (Read1 + Index1 + Index2 + Read2). The human-mouse mixture was sequenced on a MiSeq with these read lengths: 44 + 43 + 37 + 44 (Read1 + Index1 + Index2 + Read2). The first 8 bp of Index1 correspond to the p7 barcode and the last 8 bp to the i7 barcode. The first 8 bp of Index2 correspond to the i5 barcode and the last 8 bp to the p5 barcode. Please note that the barcode information was integrated into the read name.
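The index layout described above can be illustrated with a minimal sketch. The function name `split_index_reads` and the dictionary output are hypothetical and are not part of the submitters' pipeline; the sketch only assumes that the index reads are available as plain strings and that each barcode is 8 bp long, as stated in the record.

```python
def split_index_reads(index1: str, index2: str, barcode_len: int = 8) -> dict:
    """Extract the four barcodes from the two index reads.

    Layout taken from the record:
      Index1: first 8 bp = p7 barcode, last 8 bp = i7 barcode
      Index2: first 8 bp = i5 barcode, last 8 bp = p5 barcode
    """
    # Each index read must be long enough to hold two non-overlapping barcodes.
    for name, read in (("Index1", index1), ("Index2", index2)):
        if len(read) < 2 * barcode_len:
            raise ValueError(f"{name} is shorter than two barcodes: {len(read)} bp")
    return {
        "p7": index1[:barcode_len],
        "i7": index1[-barcode_len:],
        "i5": index2[:barcode_len],
        "p5": index2[-barcode_len:],
    }


# Synthetic example reads with the lengths given in the record (43 bp and 37 bp).
example_index1 = "A" * 8 + "N" * 27 + "C" * 8
example_index2 = "G" * 8 + "N" * 21 + "T" * 8
example_barcodes = split_index_reads(example_index1, example_index2)
```

The bases between the two barcodes in each index read are ignored here, since the record does not describe them.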
combinatorial ATAC-seq
Library strategy ATAC-seq
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2500
Data processing Alignment: reads were mapped to the mm10 genome assembly using BWA in paired-end mode with default parameters.
Alignment filtering: non-uniquely mapped (MAPQ < 10) and improperly paired (flag = 1804) alignments were removed.
Barcode correction: every barcode combination has 4 indexes (i7, p7, p5, i5), each 8 bp long. Reads whose barcode combination contained more than 1 mismatch in any index were removed. Barcodes within the allowed mismatch were corrected to their closest barcode.
Split reads: reads were separated into individual cells based on the barcode combination.
Mark and remove PCR duplicates: for each cell, reads were sorted by genomic coordinate using “samtools sort”, then PCR duplicates were marked and removed using Picard tools (MarkDuplicates).
Remove mitochondrial reads: for each cell, reads mapped to the mitochondrial sequence were removed.
Tn5 insertion position adjustment: all reads aligning to the + strand were offset by +4 bp, and all reads aligning to the - strand were offset by -5 bp.
Coverage calculation: coverage of active promoters (promoters overlapping aggregate scATAC-seq peaks), coverage of distal elements, and the “reads in peaks” ratio were calculated for each cell.
Cell selection: we kept cells that passed our thresholds: 1) active promoter coverage > 5%; 2) number of reads > 1,000; 3) reads-in-peaks ratio greater than the corresponding bulk ATAC-seq level. Selected cells were separated into two replicates based on barcode combination.
Insert size distribution was estimated with Picard (CollectInsertSizeMetrics).
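The barcode correction and Tn5 offset steps can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the function names, the whitelist arguments, and the rule of discarding an index that is equally close to two whitelist barcodes are all hypothetical, and the record does not say which read coordinate the offset is applied to.

```python
from typing import Dict, Iterable, Optional


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_index(observed: str, whitelist: Iterable[str],
                  max_mismatch: int = 1) -> Optional[str]:
    """Return the closest whitelist barcode within max_mismatch, else None.

    Assumption: an index that is equally close to two whitelist barcodes
    is treated as uncorrectable (the record does not specify tie handling).
    """
    best, best_dist, tie = None, max_mismatch + 1, False
    for candidate in whitelist:
        if len(candidate) != len(observed):
            continue
        dist = hamming(observed, candidate)
        if dist < best_dist:
            best, best_dist, tie = candidate, dist, False
        elif dist == best_dist and dist <= max_mismatch:
            tie = True
    return None if tie else best


def correct_combination(observed: Dict[str, str],
                        whitelists: Dict[str, Iterable[str]]) -> Optional[Dict[str, str]]:
    """Correct all four indexes; drop the read (None) if any index fails."""
    corrected = {}
    for name in ("i7", "p7", "p5", "i5"):
        fixed = correct_index(observed[name], whitelists[name])
        if fixed is None:
            return None
        corrected[name] = fixed
    return corrected


def shift_tn5(position: int, strand: str) -> int:
    """Apply the Tn5 offset from the record: +4 bp on '+', -5 bp on '-'."""
    if strand == "+":
        return position + 4
    if strand == "-":
        return position - 5
    raise ValueError(f"unknown strand: {strand!r}")
```

A read is kept only when every one of its four indexes can be corrected, which mirrors the rule that more than 1 mismatch in any index removes the read.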
Genome_build: mm10 or hg19
Supplementary_files_format_and_content: .mat: binary accessible matrix; .xgi: row names for the matrix representing cell barcodes; .ygi: column names for the matrix representing genomic elements (features); .qc: quality metrics for cells that passed quality filtering (column 1: cell barcodes, column 2: number of reads, column 3: promoter coverage, column 4: fraction of reads in peaks).
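As an illustration of the .qc column layout, the sketch below parses such rows and re-applies the cell selection thresholds from the data processing description. Several details are assumptions rather than facts from the record: whitespace-delimited columns, no header line, promoter coverage stored as a fraction between 0 and 1, and the `bulk_frip` value, which stands in for the bulk ATAC-seq reads-in-peaks level that the record does not give. All names are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class CellQC(NamedTuple):
    barcode: str              # column 1: cell barcode
    n_reads: int              # column 2: number of reads
    promoter_coverage: float  # column 3: promoter coverage (assumed fraction, 0-1)
    frip: float               # column 4: fraction of reads in peaks


def parse_qc_lines(lines: Iterable[str]) -> List[CellQC]:
    """Parse whitespace-delimited .qc rows (delimiter is an assumption)."""
    cells = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        cells.append(CellQC(fields[0], int(fields[1]),
                            float(fields[2]), float(fields[3])))
    return cells


def passes_thresholds(cell: CellQC, bulk_frip: float,
                      min_promoter_coverage: float = 0.05,
                      min_reads: int = 1000) -> bool:
    """Thresholds from the record: promoter coverage > 5%, reads > 1,000,
    reads-in-peaks ratio above the bulk level (bulk_frip is hypothetical)."""
    return (cell.promoter_coverage > min_promoter_coverage
            and cell.n_reads > min_reads
            and cell.frip > bulk_frip)


# Synthetic rows for illustration only; these are not values from the dataset.
example_rows = [
    "AAAA 2500 0.12 0.40",
    "CCCC 800 0.12 0.40",
    "GGGG 2500 0.03 0.40",
    "",
]
example_cells = parse_qc_lines(example_rows)
example_kept = [c.barcode for c in example_cells
                if passes_thresholds(c, bulk_frip=0.2)]
```

Because the deposited .qc files list only cells that already passed filtering, re-applying the thresholds to real files would be expected to keep every row if the assumptions above hold.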
Submission date Jun 14, 2017
Last update date Aug 15, 2019
Contact name Rongxin Fang
Organization name Ludwig Institute for Cancer Research
Lab Ren Lab
Street address 9500 Gilman Dr #3080
City La Jolla
State/province CA
ZIP/Postal code 92093
Country USA
Platform ID GPL22245
Series (1)
GSE100033 Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation
BioSample SAMN07237042
SRA SRX3739254

Supplementary file Size File type/resource
GSM2668125_Undetermined_S0_L001.barcode_freq.hg19.mm9.txt.gz 15.3 Kb TXT
Raw data are available in SRA
Processed data provided as supplementary file
