|
|
GEO help: Mouse over screen elements for information. |
|
Status |
Public on Nov 03, 2013 |
Title |
Hi-C, F123 mouse ES cells, replicate one |
Sample type |
SRA |
|
|
Source name |
F123 mouse embyronic stem cells
|
Organism |
Mus musculus |
Characteristics |
cell line: F123 mouse ES cell line
|
Treatment protocol |
None
|
Growth protocol |
F123 cells were grown in KnockOut Serum Replacement containing mouse ES cell media: DMEM 85%, 15% KnockOut Serum Replacement (Invitrogen), penicillin/streptomycin, 1X Non-essential amino acids (Gibco), 1X GlutaMax, 1000U/mL LIF (Millipore), 0.4mM b-mercaptoethanol. F123 mouse ES cells were initially cultured on 0.1% Gelatin-coated plates with mitomycin-C treated mouse embryonic fibroblasts (Millipore). Cells were passaged twice on 0.1% gelatin-coated feeder free plates before harvesting. see samples section
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Hi-C experiments were conducted using HindIII according to previous publication (Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-93 (2009).). Sequencing libraries were constructed according to previous publication (Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-93 (2009).).
|
|
|
Library strategy |
OTHER |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
castx129_variants.vcf F123.haps
|
Data processing |
fastq: Illumina's HiSeq Control Software For Hi-C read alignment, we aligned Hi-C reads to the mm9 (mouse) or the hg18 (human) genome. In each case, we masked any bases in the genome that were genotyped as SNPs in either Mus musculus castaneus or S129/SvJae (for mouse) or GM12878 (for humans). These bases were masked to āNā in order to reduce reference bias mapping artifacts. Hi-C reads were aligned iteratively as single end reads using Novoalign. Specifically, for iterative alignment, we first aligned the entire sequencing read to either the mouse or human genome. Unmapped reads are then trimmed by 5 base pairs and realigned. This process is repeated until the read successfully aligns to the genome or until the trimmed read is less than 25 base pairs long. After iterative mapping was finished, read pairs were re-constructed from single reads using an in house pipeline. Unmapped reads were filtered out and PCR duplicate reads were removed. Final alignment files were then processed using the GATK pipeline, specifically using Indel Realignment and Variant Recalibration Haplotypes were generated from the final aligned bam file after merging the two biological replicats using the HapCUT algorithm. The details of HapCUT are described previously (Bansal and Bafna, Bioinformatics 24, i153-159, 2008). Genome_build: mm9 Genome_build: hg18 Supplementary_files_format_and_content: The castx129_variants.vcf and GM12878_depristoeal.vcf are VCF format files of the variants used for input into the haplotyping algorithm. Both of these files are derived from publicly available datasets. WIth regards to the "publicly available datasets", the castx129_variants.vcf file is derived from data downloaded from the ENA (ERP000042) and the SRA (SRX037820). The GM12878_depristoeal.vcf is downloaded from the 1000 genomes project. Supplementary_files_format_and_content: The F123.haps and GM12878_seed.haps are modified bed format files. In this files, the first column is the chromosome, and the second column is the location of the variant. The third and fourth column are the phased variants in the "A" and "B" haplotypes. The choice of "A" and "B" is arbitrary, and it should be noted that the "A" haplotype from one chromosome is not necessarily derived from the same parent as the "A" haplotype from a different chromosome. Supplementary_files_format_and_content: The GM12878_lcp.vcf file is a VCF format file from after local conditional phasing of variants in the seed haplotype
|
|
|
Submission date |
Jul 08, 2013 |
Last update date |
Feb 22, 2021 |
Contact name |
Jesse R Dixon |
E-mail(s) |
jedixon@salk.edu
|
Organization name |
Salk Institute for Biological Studies
|
Lab |
PBL-D
|
Street address |
10010 N. Torrey Pines Rd.
|
City |
La Jolla |
State/province |
CA |
ZIP/Postal code |
92037 |
Country |
USA |
|
|
Platform ID |
GPL13112 |
Series (1) |
GSE48592 |
Whole-genome Haplotype Reconstruction using Proximity-ligation and Shotgun Sequencing |
|
Relations |
Reanalyzed by |
GSE119805 |
Reanalyzed by |
GSE167200 |
BioSample |
SAMN02228118 |
SRA |
SRX318774 |
Supplementary data files not provided |
SRA Run Selector |
Processed data are available on Series record |
Raw data are available in SRA |
|
|
|
|
|