Sample GSM2668120
Status Public on Jan 27, 2018
Title E14.5_Forebrain_scATAC_Rep1_Rep2
Sample type SRA
Source name mouse forebrain
Organism Mus musculus
Characteristics strain: C57BL/6NCrl x C57BL/6NTac
developmental stage: embryonic day 14.5
barcodes: Set_1 for Replicate 1; Set_2 for Replicate 2
Extracted molecule genomic DNA
Extraction protocol Nuclei were isolated from snap-frozen forebrain tissues or GM12878 cells. Permeabilized nuclei were distributed to 96 wells and tagmentation was carried out to introduce a first barcode combination. After pooling, 25 nuclei were sorted into 384 or 96 wells and a second barcode was introduced by PCR.
Permeabilized nuclei were distributed to 96 wells and tagmentation was carried out to introduce a first barcode combination (p7 and p5 barcodes). After pooling, 25 nuclei were sorted into 384 or 96 wells and a second barcode combination (i7 and i5 barcodes) was introduced by PCR. In general, libraries were sequenced on a HiSeq 2500 with the following read lengths: 50 + 43 + 37 + 50 (Read1 + Index1 + Index2 + Read2). The human-mouse mixture was sequenced on a MiSeq with the following read lengths: 44 + 43 + 37 + 44 (Read1 + Index1 + Index2 + Read2). The first 8 bp of Index1 correspond to the p7 barcode and the last 8 bp to the i7 barcode. The first 8 bp of Index2 correspond to the i5 barcode and the last 8 bp to the p5 barcode. Please note that the barcode information was integrated into the read name.
combinatorial ATAC-seq
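The index layout described above (p7 and i7 at the two ends of Index1, i5 and p5 at the two ends of Index2) can be illustrated with a minimal sketch. It assumes the two index reads are available as plain strings; the names `Barcodes` and `split_barcodes` are hypothetical and are not part of the submitters' pipeline.

```python
from typing import NamedTuple

BARCODE_LEN = 8  # each of the four barcodes is 8 bp, per the protocol text


class Barcodes(NamedTuple):
    """The four barcodes of one read pair (hypothetical container)."""
    p7: str
    i7: str
    i5: str
    p5: str


def split_barcodes(index1: str, index2: str) -> Barcodes:
    """Split the two index reads into four 8-bp barcodes.

    Index1: first 8 bp = p7, last 8 bp = i7.
    Index2: first 8 bp = i5, last 8 bp = p5.
    """
    if len(index1) < 2 * BARCODE_LEN or len(index2) < 2 * BARCODE_LEN:
        raise ValueError("index reads are too short to hold two 8-bp barcodes")
    return Barcodes(
        p7=index1[:BARCODE_LEN],
        i7=index1[-BARCODE_LEN:],
        i5=index2[:BARCODE_LEN],
        p5=index2[-BARCODE_LEN:],
    )
```

With the read lengths given above, `index1` would be 43 bp and `index2` 37 bp; the bases between the two barcodes of each index read are ignored in this sketch.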
Library strategy ATAC-seq
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2500
Data processing Alignment: reads were mapped to the mm10 genome assembly using BWA in paired-end mode with default parameters.
Alignment filtering: non-uniquely mapped (MAPQ < 10) and improperly paired (flag = 1804) alignments were filtered.
Barcode correction: every barcode combination has 4 indexes (i7, p7, p5, i5), each 8 bp long. Reads whose barcode combination contained more than 1 mismatch in any index were filtered. Barcodes with an allowed number of mismatches were changed to their closest barcode.
Split reads: reads were separated into individual cells based on the barcode combination.
Mark and remove PCR duplicates: for each individual cell, we sorted reads by genomic coordinate using “samtools sort”, then marked and removed PCR duplicates using Picard tools (MarkDuplicates).
Remove mitochondrial reads: for each cell, reads mapped to the mitochondrial sequence were filtered.
Tn5 insertion position adjustment: all reads aligning to the + strand were offset by +4 bp, and all reads aligning to the - strand were offset by -5 bp.
Coverage of active promoters (promoters overlapping with aggregate scATAC-seq peaks), coverage of distal elements and the “reads in peaks” ratio were calculated for each cell.
Cell selection: we kept cells that passed the following thresholds: 1) active promoter coverage > 5%; 2) number of reads > 1,000; 3) “reads in peaks” ratio greater than the corresponding bulk ATAC-seq level.
Selected cells were separated into two replicates based on barcode combination.
Insert size distribution was estimated by Picard (CollectInsertSizeMetrics).
Genome_build: mm10 or hg19
Supplementary_files_format_and_content: .mat: binary accessibility matrix; .xgi: row names for the matrix, representing cell barcodes; .ygi: column names for the matrix, representing genomic elements (features); .qc: quality metrics for cells that passed quality filtering (column 1: cell barcodes, column 2: number of reads, column 3: promoter coverage, column 4: fraction of reads in peaks).
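The barcode correction and Tn5 offset steps described under Data processing can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the function names and the whitelist structure are hypothetical, ambiguous matches (two whitelisted barcodes at the same distance) are discarded because the record does not say how ties were handled, and the record does not state which coordinate of each read the offset was applied to.

```python
from typing import Dict, Iterable, Optional


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_index(observed: str, whitelist: Iterable[str],
                  max_mismatch: int = 1) -> Optional[str]:
    """Return the closest whitelisted barcode within max_mismatch, else None.

    Assumption: a tie between equally close barcodes is treated as
    uncorrectable and returns None.
    """
    best, best_dist, tie = None, max_mismatch + 1, False
    for candidate in whitelist:
        dist = hamming(observed, candidate)
        if dist < best_dist:
            best, best_dist, tie = candidate, dist, False
        elif dist == best_dist and dist <= max_mismatch:
            tie = True
    return None if tie else best


def correct_combination(observed: Dict[str, str],
                        whitelists: Dict[str, Iterable[str]]
                        ) -> Optional[Dict[str, str]]:
    """Correct all indexes (i7, p7, p5, i5); None means the read is filtered."""
    corrected = {}
    for name, seq in observed.items():
        fixed = correct_index(seq, whitelists[name])
        if fixed is None:
            return None  # more than 1 mismatch (or a tie) in this index
        corrected[name] = fixed
    return corrected


def shift_tn5(position: int, strand: str) -> int:
    """Apply the Tn5 offset: +4 bp on the + strand, -5 bp on the - strand."""
    if strand == "+":
        return position + 4
    if strand == "-":
        return position - 5
    raise ValueError("strand must be '+' or '-'")
```

A read is kept only if every one of its four indexes can be corrected, which mirrors the "more than 1 mismatch for any index" rule above.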
Submission date Jun 14, 2017
Last update date Aug 15, 2019
Contact name Rongxin Fang
Organization name Ludwig Institute for Cancer Research
Lab Ren Lab
Street address 9500 Gilman Dr #3080
City La Jolla
State/province CA
ZIP/Postal code 92093
Country USA
Platform ID GPL17021
Series (1)
GSE100033 Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation
BioSample SAMN07237049
SRA SRX3739249

Supplementary file Size Download File type/resource
GSM2668120_e14.5.nchrM.merge.sel_cell.mat.gz 5.1 Mb (ftp)(http) MAT
GSM2668120_e14.5.nchrM.merge.sel_cell.qc.txt.gz 40.0 Kb (ftp)(http) TXT
GSM2668120_e14.5.nchrM.merge.sel_cell.xgi.txt.gz 7.6 Kb (ftp)(http) TXT
GSM2668120_e14.5.nchrM.merge.sel_cell.ygi.txt.gz 1.1 Mb (ftp)(http) TXT
Raw data are available in SRA
Processed data provided as supplementary file
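The four columns of the .qc supplementary file map directly onto the cell-selection criteria listed under Data processing. The sketch below shows one way to read the file and re-check those criteria. It rests on several assumptions that the record does not confirm: the file is whitespace-delimited with no header line, promoter coverage is stored as a fraction (so 5% is 0.05), and the bulk ATAC-seq "reads in peaks" level, which is not given in this record, must be supplied by the user as `bulk_frip`. All names are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class CellQC(NamedTuple):
    barcode: str                # column 1: cell barcode
    n_reads: int                # column 2: number of reads
    promoter_coverage: float    # column 3: promoter coverage
    frac_reads_in_peaks: float  # column 4: fraction of reads in peaks


def parse_qc(lines: Iterable[str]) -> Iterator[CellQC]:
    """Parse .qc rows; assumes whitespace-delimited columns and no header."""
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        barcode, n_reads, coverage, frip = fields[:4]
        yield CellQC(barcode, int(float(n_reads)), float(coverage), float(frip))


def passes_thresholds(cell: CellQC, bulk_frip: float,
                      min_promoter_coverage: float = 0.05,
                      min_reads: int = 1000) -> bool:
    """Apply the three cell-selection criteria from the Data processing field."""
    return (cell.promoter_coverage > min_promoter_coverage
            and cell.n_reads > min_reads
            and cell.frac_reads_in_peaks > bulk_frip)


def read_qc(path: str) -> List[CellQC]:
    """Read a gzip-compressed .qc file from a local path."""
    with gzip.open(path, "rt") as handle:
        return list(parse_qc(handle))
```

Because the deposited .qc file contains only cells that already passed filtering, running `passes_thresholds` over it serves as a consistency check rather than a new selection step.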

