GEO Accession viewer

Sample GSM5534883

Status

Public on Mar 14, 2022

Title

Hi-C H9 replicate C

Sample type

SRA

Source name

Organism

Homo sapiens

Characteristics

cell type: embryonic stem cells
cell line: H9
treatment: no treat
protocol: Hi-C
restriction enzyme: HindIII

Growth protocol

H9 cells were cultured on the hESC-qualified Matrigel (Corning, #354277) coated plates in mTeSR1 medium (StemCell Technologies, #05850).

Extracted molecule

genomic DNA

Extraction protocol

For Hi-C, nuclei were extracted after fixation using a cell lysis buffer.
All Hi-C libraries were constructed following the Illumina instructions accompanying the TruSeq sample preparation kit.

Library strategy

Hi-C

Library source

genomic

Library selection

other

Instrument model

Illumina NovaSeq 6000

Description

There are 10 replicates of H9 Hi-C raw data. "H9.cool" is the processed file that combines the 10 replicates.

Data processing

The paired-end Hi-C reads were mapped to the human genome (hg19) using Bowtie. For reads longer than 36 bases, only the first 36 bases were used for mapping. The two reads of each pair were mapped independently and then merged into pairs using an in-house script. Duplicate read pairs from the same biological library were removed.
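The trimming and duplicate-removal steps can be illustrated with a minimal Python sketch. This is not the in-house script, which is not part of this record. The function names are hypothetical, and the sketch assumes a duplicate is a pair whose two ends share identical chromosome, position, and strand, regardless of which end is listed first.

```python
def trim_read(seq, max_len=36):
    """Keep only the first max_len bases of a read (shorter reads are unchanged)."""
    return seq[:max_len]


def dedupe_pairs(pairs):
    """Remove duplicate read pairs, keeping the first occurrence.

    Each pair is ((chrom, pos, strand), (chrom, pos, strand)).
    Assumption: two pairs are duplicates when both ends match exactly,
    independent of the order in which the two ends are listed.
    """
    seen = set()
    unique = []
    for pair in pairs:
        key = tuple(sorted(pair))  # order-independent key for the pair
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique
```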
For Hi-C, we focus on cis-interactions and therefore kept only paired-end reads with both ends mapped to the same chromosome. Of these intra-chromosomal read pairs, we also discarded those with both ends mapped to the same HindIII fragment. Because of size selection, cut-and-ligation events are expected to generate reads within 500 bp of a HindIII cutting site (“+” strand reads should be within 500 bp upstream of a HindIII site, and “-” strand reads should be within 500 bp downstream of a HindIII site), so we kept only read pairs with both ends satisfying this criterion.
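The three filters described above (cis only, different fragments, both ends within 500 bp of a HindIII site) can be sketched as follows. The function names and the data layout are assumptions made for illustration: cut sites are given as a sorted list of positions per chromosome, and a fragment is identified by how many cut sites lie at or before the read position.

```python
import bisect


def fragment_index(sites, pos):
    """Index of the restriction fragment containing pos (sites must be sorted)."""
    return bisect.bisect_right(sites, pos)


def near_cut_site(sites, pos, strand, max_dist=500):
    """True if the read lies within max_dist of a cut site in its reading direction.

    "+" reads must have a site at most max_dist bp ahead (read is upstream of the site);
    "-" reads must have a site at most max_dist bp behind (read is downstream of the site).
    """
    if strand == "+":
        i = bisect.bisect_left(sites, pos)
        return i < len(sites) and sites[i] - pos <= max_dist
    i = bisect.bisect_right(sites, pos) - 1
    return i >= 0 and pos - sites[i] <= max_dist


def keep_pair(end1, end2, sites_by_chrom, max_dist=500):
    """Apply the cis, different-fragment, and cut-site-distance filters to one pair."""
    chrom1, pos1, strand1 = end1
    chrom2, pos2, strand2 = end2
    if chrom1 != chrom2:
        return False  # trans pair: ends on different chromosomes
    sites = sites_by_chrom[chrom1]
    if fragment_index(sites, pos1) == fragment_index(sites, pos2):
        return False  # both ends on the same HindIII fragment
    return (near_cut_site(sites, pos1, strand1, max_dist)
            and near_cut_site(sites, pos2, strand2, max_dist))
```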
We next split all these reads into three classes based on their strand orientations (“same-strand”, “inward”, or “outward”), and generated the resulting lists of fragment pairs (with Hi-C read counts) from each class of reads.
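A minimal sketch of the strand-orientation split is shown below. The record does not define the three classes, so the sketch assumes the common convention: “inward” means the lower-coordinate end is on the “+” strand and the higher-coordinate end on the “-” strand, “outward” is the reverse, and “same-strand” means both ends share a strand. The function names and the record layout are hypothetical.

```python
from collections import Counter


def orientation_class(pos1, strand1, pos2, strand2):
    """Classify a cis read pair as same-strand, inward, or outward."""
    if strand1 == strand2:
        return "same-strand"
    # Strand of the end with the lower genomic coordinate
    left_strand = strand1 if pos1 <= pos2 else strand2
    return "inward" if left_strand == "+" else "outward"


def count_fragment_pairs(records):
    """Count reads per fragment pair, separately for each orientation class.

    Each record is (frag1, pos1, strand1, frag2, pos2, strand2).
    """
    counts = {"same-strand": Counter(), "inward": Counter(), "outward": Counter()}
    for frag1, pos1, strand1, frag2, pos2, strand2 in records:
        cls = orientation_class(pos1, strand1, pos2, strand2)
        counts[cls][tuple(sorted((frag1, frag2)))] += 1
    return counts
```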
There are ~840k fragments in total in the human genome. They are assembled into ~335k anchors after short fragments (<5 kb) are merged into neighboring anchors. For every anchor, we first count Hi-C reads from the anchor to every fragment within a 2 Mb range.
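The exact merging rule is not given in this record. The sketch below shows one plausible scheme, stated purely as an assumption: consecutive fragments along a chromosome are accumulated left to right until the running length reaches 5 kb, and a short remainder at the end is folded into the previous anchor. The function name is hypothetical.

```python
def assign_anchors(fragment_lengths, min_len=5000):
    """Assign each fragment on one chromosome to an anchor index.

    Assumed greedy rule (not taken from the record): keep adding consecutive
    fragments to the current anchor until its total length is at least min_len.
    """
    anchor_ids = []
    current = 0   # index of the anchor being built
    running = 0   # accumulated length of the anchor being built
    for length in fragment_lengths:
        anchor_ids.append(current)
        running += length
        if running >= min_len:
            current += 1
            running = 0
    # A short trailing remainder is merged into the previous anchor, if any
    if running > 0 and current > 0:
        anchor_ids = [min(a, current - 1) for a in anchor_ids]
    return anchor_ids
```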
We estimated a background frequency between any two fragments based on the average read count of all fragment pairs with similar lengths, gap distance, GC content, and visibility. The fragment-to-fragment data can then be converted to anchor-to-anchor data by summing the read counts and background frequencies according to the assignment of fragments to anchors. P values for the enrichment of Hi-C reads over the background frequency can be calculated using a negative binomial model. The statistical method has been described in a previous publication (Jin, F. et al. Nature 2013, 503:290-294).
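The aggregation and enrichment test can be sketched as below. This is an illustration under stated assumptions, not the published method: the negative binomial is parameterized by its mean (the summed background frequency, which must be greater than 0) and a dispersion parameter `size` whose value is not given in this record, and the P value is taken as the upper tail P(X >= observed). All function names are hypothetical.

```python
import math
from collections import defaultdict


def aggregate_to_anchors(fragment_pairs, fragment_to_anchor):
    """Sum read counts and background frequencies from fragment pairs to anchor pairs.

    fragment_pairs maps (frag1, frag2) -> (reads, background).
    """
    totals = defaultdict(lambda: [0, 0.0])
    for (f1, f2), (reads, background) in fragment_pairs.items():
        key = tuple(sorted((fragment_to_anchor[f1], fragment_to_anchor[f2])))
        totals[key][0] += reads
        totals[key][1] += background
    return {key: (reads, bg) for key, (reads, bg) in totals.items()}


def nb_pmf(k, mu, size):
    """Negative binomial probability of k counts, given mean mu > 0 and dispersion size > 0."""
    p = size / (size + mu)
    log_pmf = (math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
               + size * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def nb_enrichment_pvalue(observed, mu, size):
    """Upper-tail P value P(X >= observed) under the negative binomial background."""
    lower_tail = sum(nb_pmf(i, mu, size) for i in range(observed))
    return max(0.0, 1.0 - lower_tail)
```

With `size=1` the distribution reduces to a geometric distribution, which gives a convenient sanity check for the tail sum.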
Genome_build: hg19
Supplementary_files_format_and_content: For each Hi-C sample, the fragment-to-fragment frequencies are provided in a txt file.

Submission date

Aug 24, 2021

Last update date

Mar 14, 2022

Contact name

Shanshan Zhang

Organization name

Case Western Reserve University

Department

Department of Genome and Genetics

Lab

Jin lab