Series GSE137879
Status Public on Mar 18, 2020
Title Whole Genome Bisulfite sequencing: Allele-specific DNA methylation is increased in cancers and its dense mapping in normal plus neoplastic cells increases the yield of disease-associated regulatory SNPs
Organism Homo sapiens
Experiment type Methylation profiling by high throughput sequencing
Summary Background: Mapping of allele-specific DNA methylation (ASM) can be a post-GWAS strategy for localizing regulatory sequence polymorphisms (rSNPs). However, the advantages of this approach, and the mechanisms underlying ASM in normal and neoplastic cells, remain to be clarified. Results: We performed whole genome methyl-seq on diverse normal cells and tissues and three types of cancers (multiple myeloma, lymphoma, glioblastoma multiforme). After excluding imprinting, the data pinpointed 15,114 high-confidence ASM differentially methylated regions (DMRs), of which 1,842 contained SNPs in strong linkage disequilibrium or coinciding with GWAS peaks. ASM frequencies were increased 5 to 9-fold in cancers vs. matched normal tissues, due to widespread allele-specific hypomethylation and focal allele-specific hypermethylation in poised chromatin. Cancers showed increased allele switching at ASM loci, but destructive SNPs in specific classes of CTCF and transcription factor (TF) binding motifs were similarly correlated with ASM in cancer and non-cancer. Rare somatic mutations in these same motif classes tracked with de novo ASM in the cancers. Allele-specific TF binding from ChIP-seq was enriched among ASM loci, but most ASM DMRs lacked such annotations, and some were found in otherwise uninformative “chromatin deserts”. Conclusions: ASM is increased in cancers but occurs by a shared mechanism involving rSNPs in CTCF and TF binding sites in normal and neoplastic cells. Dense ASM mapping in normal plus cancer samples reveals candidate rSNPs that are difficult to find by other approaches. Together with GWAS data, these rSNPs can nominate specific transcriptional pathways in susceptibility to autoimmune, neuropsychiatric, and neoplastic diseases.
Overall design To analyze complete methylomes in 65 primary non-neoplastic and 16 primary neoplastic samples, plus the GM12878 LCL, WGBS was performed at the New York Genome Center (NYGC), MNG Genetics (MNG) and the Genomics Shared Resource of the Roswell Park Cancer Institute (RPCI). The NYGC used a modified Nextera transposase-based library approach. Briefly, genomic DNA was first tagmented using the Nextera XT transposome, and end repair was performed using 5mC. After bisulfite conversion, Illumina adapters and custom bisulfite-converted adapters were attached by limited-cycle PCR. Two separate libraries were prepared and pooled for each sample to limit the duplication rate, and were sequenced using an Illumina X system (150 bp paired-end). WGBS performed at MNG used the Illumina TruSeq DNA Methylation Kit for library construction according to the manufacturer’s instructions and generated 150 bp paired-end reads on an Illumina NovaSeq machine. WGBS performed at RPCI used the ACCEL-NGS Methyl-Seq DNA Library kit (Swift Biosciences) for library construction and generated 150 bp paired-end reads on an Illumina NovaSeq. After trimming low-quality bases (Phred score < 30) and removing reads shorter than 40 bp with TrimGalore, the reads were aligned to the human genome (GRCh37) using Bismark in paired-end mode with default settings. Duplicate reads were removed using Picard tools, and reads with more than 10% unconverted CHG or CHH cytosines were filtered out. SNP calling was performed with BisSNP using default settings, except for the maximum coverage filter set at 200 to encompass deep sequencing, and quality score recalibration. SNP calling was carried out using human genome GRCh37 and dbSNP147 as references. For ASM calling, we filtered out heterozygous SNPs with fewer than 5 reads per allele. 
In addition, SNPs with multiple mapping positions were filtered out, as were SNPs with more than one minor allele with allele frequency > 0.05 and SNPs that deviated significantly from Hardy-Weinberg equilibrium based on exact tests corrected for multiple testing (FDR < 0.05 by the HardyWeinberg R package). C/T and G/A SNPs were assessed after filtering out reads mapping to the C/T strand. ASM calling was performed after separating the SNP-containing reads by allele. After the Bismark methylation extractor was applied, CpG methylation calls by allele were retrieved using allele-tagged read IDs. Paired reads with ambiguous SNP calling (i.e., called as REF allele on one paired end and ALT allele on the other) were discarded. For Nextera WGBS, the fill-in reaction using 5mC following DNA tagmentation affects the first 10 base pairs (bp) at the 5’ end of read 2, so methylation calls for Cs mapping to these positions were not considered. In addition, a slight methylation bias due to random priming and specific to each library kit was observed within the last 2 bp at the 3’ end of both paired ends for Nextera WGBS, within the first 10 bp at the 5’ end of both paired ends and the last 2 bp at the 3’ end of read 2 for TruSeq WGBS, and within the first 10 bp at the 5’ end of read 2 for ACCEL-NGS WGBS. Therefore, methylation calls in these windows were ignored.
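The kit-specific ignore windows described above can be summarized as a small lookup table. The following Python sketch is illustrative only and is not the submitters' pipeline: the names `IGNORE_WINDOWS` and `keep_call` are hypothetical, positions are assumed to be 1-based in sequencing orientation (5’ to 3’), and the 3’ window is assumed to be counted from the end of the read as it exists after trimming.

```python
from typing import Dict, Tuple

# Hypothetical table: kit -> read number -> (bp ignored at 5' end, bp ignored at 3' end).
# Values are taken from the Overall design text; the names are illustrative.
IGNORE_WINDOWS: Dict[str, Dict[int, Tuple[int, int]]] = {
    # 5mC fill-in on first 10 bp of read 2; bias in last 2 bp of both reads
    "nextera": {1: (0, 2), 2: (10, 2)},
    # bias in first 10 bp of both reads and last 2 bp of read 2
    "truseq": {1: (10, 0), 2: (10, 2)},
    # bias in first 10 bp of read 2 only
    "accel_ngs": {1: (0, 0), 2: (10, 0)},
}


def keep_call(kit: str, read_number: int, read_pos: int, read_length: int) -> bool:
    """Return True if a methylation call at read_pos should be kept.

    read_pos is assumed to be 1-based and counted from the 5' end of the read;
    read_length is the length of that read after trimming (an assumption).
    """
    ignore_5p, ignore_3p = IGNORE_WINDOWS[kit][read_number]
    if read_pos <= ignore_5p:
        return False  # inside the ignored 5' window
    if read_pos > read_length - ignore_3p:
        return False  # inside the ignored 3' window
    return True


if __name__ == "__main__":
    # Example: position 10 of read 2 in a Nextera library falls in the fill-in window.
    print(keep_call("nextera", 2, 10, 150))
    print(keep_call("nextera", 2, 11, 150))
```

Bismark's methylation extractor provides `--ignore`, `--ignore_r2`, `--ignore_3prime` and `--ignore_3prime_r2` options that serve the same purpose; the record does not state whether the windows were applied that way or in a later filtering step.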
Contributor(s) Do C, Dumont E, Salas M, Castano A, Mujahed H, Maldonado L, Singh A, Bhagat G, Lehman S, Christiano AM, Madhavan S, Nagy PL, Green PR, Ilsey N, Feinman R, Trimble C, Marder K, Honig L, Monk C, Goy A, Chow K, Goldlust S, Kaptain G, Siegel D, Tycko B
Citation(s) 32594908
Submission date Sep 23, 2019
Last update date Feb 27, 2023
Contact name Benjamin Tycko
Phone 5519963595
Organization name HUMC
Department Epigenetics
Street address 40 Prospect Avenue
City Hackensack
State/province NJ
ZIP/Postal code 07601
Country USA
Platforms (2)
GPL20795 HiSeq X Ten (Homo sapiens)
GPL24676 Illumina NovaSeq 6000 (Homo sapiens)
Samples (85)
GSM4090863 sample_1_breast
GSM4090864 sample_10_B_cells
GSM4090865 sample_100_myeloma
This SubSeries is part of SuperSeries:
GSE137880 Whole genome bisulfite sequencing and Genome-wide targeted methyl-seq: Allele-specific DNA methylation is increased in cancers and its dense mapping in normal plus neoplastic cells increases the yield of disease-associated regulatory SNPs
BioProject PRJNA574550
SRA SRP223612

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE137879_RAW.tar 3.3 Mb (http)(custom) TAR (of TXT)
Raw data are available in SRA
Processed data provided as supplementary file
