|
Status |
Public on Mar 14, 2022 |
Title |
Hi-C H9 replicate C |
Sample type |
SRA |
|
|
Source name |
H9
|
Organism |
Homo sapiens |
Characteristics |
cell type: embryonic stem cells
cell line: H9
treatment: no treat
protocol: Hi-C
restriction enzyme: HindIII
|
Growth protocol |
H9 cells were cultured on the hESC-qualified Matrigel (Corning, #354277) coated plates in mTeSR1 medium (StemCell Technologies, #05850).
|
Extracted molecule |
genomic DNA |
Extraction protocol |
For Hi-C, nuclei were extracted after fixation using a cell lysis buffer. All Hi-C libraries were constructed following the Illumina instructions accompanying the TruSeq sample preparation kit.
|
|
|
Library strategy |
Hi-C |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina NovaSeq 6000 |
|
|
Description |
There are 10 replicates of H9 Hi-C raw data. "H9.cool" is the processed file that combines the 10 replicates.
|
Data processing |
The paired-end Hi-C reads were mapped to the human genome (hg19) using Bowtie. Only the first 36 bases were used for mapping when reads were longer. The two reads of each pair were mapped independently and then merged into pairs using an in-house script. Duplicated read pairs from the same biological library were removed.
Because we focus on cis-interactions, we kept only Hi-C paired-end reads with both ends mapped to the same chromosome. Of these intra-chromosomal read pairs, we also discarded those with both ends mapped to the same HindIII fragment. Because of size selection, cut-and-ligation events are expected to generate reads within 500 bp of HindIII cutting sites (“+” strand reads should be within 500 bp upstream of a HindIII site, and “-” strand reads should be within 500 bp downstream of a HindIII site), so we kept only read pairs with both ends satisfying this criterion. We next split these read pairs into three classes based on their strand orientations (“same-strand”, “inward”, or “outward”) and generated the resulting lists of fragment pairs (with Hi-C read counts) for each class.
There are ~840k HindIII fragments in total in the human genome. They are assembled into ~335k anchors after short fragments (<5 kb) are merged into neighboring anchors. For every anchor, we first count Hi-C reads from the anchor to every fragment within a 2 Mb range. We estimated a background frequency between any two fragments from the average read count of all fragment pairs with similar lengths, gap distance, GC content, and visibility. The fragment-to-fragment data can then be converted to anchor-to-anchor data by summing the read counts and background frequencies according to the assignment of fragments to anchors. P values for the enrichment of Hi-C reads over the background frequency can be calculated using a negative binomial model. The statistical method has been described in a previous publication (Jin, F. et al. Nature 2013, 503:290-294).
Genome_build: hg19
Supplementary_files_format_and_content: For each Hi-C library, the fragment-to-fragment frequencies are provided in a txt file.
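The read-pair filtering and strand classification described above can be illustrated with a minimal Python sketch. This is not the in-house script, which is not part of this record. The names `Read`, `near_site`, `fragment_index`, and `classify_pair` are hypothetical, `pos` is assumed to be a single mapped coordinate per read, and the definitions of "inward" and "outward" follow the common convention (upstream read on "+" and downstream read on "-" is inward), which the record itself does not spell out.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, NamedTuple, Optional

MAX_DIST = 500  # bp window around a HindIII site, as stated in the record


class Read(NamedTuple):
    chrom: str
    pos: int     # mapped coordinate of the read (assumed convention)
    strand: str  # "+" or "-"


def near_site(read: Read, sites: List[int]) -> bool:
    """True if the read is within MAX_DIST bp of a cut site on the expected side."""
    if read.strand == "+":
        # "+" read must sit upstream of a site: look at the next site to the right.
        i = bisect_left(sites, read.pos)
        return i < len(sites) and sites[i] - read.pos <= MAX_DIST
    # "-" read must sit downstream of a site: look at the closest site to the left.
    i = bisect_right(sites, read.pos) - 1
    return i >= 0 and read.pos - sites[i] <= MAX_DIST


def fragment_index(pos: int, sites: List[int]) -> int:
    """Index of the restriction fragment containing pos (sites must be sorted)."""
    return bisect_right(sites, pos)


def classify_pair(a: Read, b: Read,
                  sites_by_chrom: Dict[str, List[int]]) -> Optional[str]:
    """Return the strand class of a kept pair, or None if the pair is discarded."""
    if a.chrom != b.chrom:
        return None  # keep cis pairs only
    sites = sites_by_chrom[a.chrom]
    if fragment_index(a.pos, sites) == fragment_index(b.pos, sites):
        return None  # both ends on the same HindIII fragment
    if not (near_site(a, sites) and near_site(b, sites)):
        return None  # at least one end is too far from a cut site
    if a.strand == b.strand:
        return "same-strand"
    left, _right = sorted((a, b), key=lambda r: r.pos)
    # Assumed convention: upstream read on "+" means the reads point at each other.
    return "inward" if left.strand == "+" else "outward"


# Toy cut-site coordinates, for illustration only.
toy_sites = {"chr1": [1000, 5000, 9000]}
print(classify_pair(Read("chr1", 800, "+"), Read("chr1", 5200, "-"), toy_sites))
```

With the toy cut sites above, the printed class is `inward`. A pair such as `Read("chr1", 1100, "-")` and `Read("chr1", 4900, "+")` falls on a single fragment and is discarded, which matches the same-fragment filter in the description.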
|
|
|
Submission date |
Aug 24, 2021 |
Last update date |
Mar 14, 2022 |
Contact name |
Shanshan Zhang |
Organization name |
Case Western Reserve University
|
Department |
Department of Genome and Genetics
|
Lab |
Jin lab
|
Street address |
10900 Euclid Avenue
|
City |
Cleveland |
State/province |
OH |
ZIP/Postal code |
44106 |
Country |
USA |
|
|
Platform ID |
GPL24676 |
Series (1) |
GSE167200 |
DeepLoop enables robust mapping of DNA loops from low-depth single cell or allele-resolved Hi-C data at high resolution |
|
Relations |
BioSample |
SAMN20963393 |
SRA |
SRX11898130 |
Supplementary data files not provided |
SRA Run Selector |
Raw data are available in SRA |
Processed data are available on Series record |