Status |
Public on Feb 15, 2017 |
Title |
S_WGB_2 |
Sample type |
SRA |
|
|
Source name |
whole genome bisulfite sequencing replicate 2
|
Organism |
Homo sapiens |
Characteristics |
tissue: Postmortem brain tissue region: Brodmann area 10
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Human brain tissue was homogenized in Buffer RLT (Qiagen, # 79216) with a 1 mL syringe fitted with a 20G needle. Genomic DNA was extracted from the homogenate using the AllPrep DNA/RNA/miRNA Universal Kit (Qiagen, # 80224).
Bisulfite sequencing
DNA fragmentation: Genomic DNA was sonicated to 550 bp using a Covaris S2 ultra-sonicator (1 cycle, 2 cycles per burst, 45 seconds, 10% duty cycle, intensity 2.0). Multiple sonications were performed for each sample (2 μg DNA maximum per 100 μL fragmentation). The fragmented aliquots from each sample were pooled to meet the input needs of both WGB and TAB prior to any downstream assays.
Bisulfite and TET1 treatment: For whole genome bisulfite (WGB) sequencing, the fragmented DNA was bisulfite treated using reduced conversion times with the EZ DNA Methylation-Lightning™ Kit (Zymo Research) to convert non-methylated cytosines to uracil. For Tet-assisted bisulfite (TAB(44)) sequencing, we used the WiseGene kit to treat the fragmented DNA with T4 β-glucosyltransferase (16 hours at 30°C) to protect hmCs from oxidation by installation of a glucose moiety. Treatment with TET1 then oxidized only mCs to 5-carboxylcytosine. Finally, the TET1-oxidized DNA was bisulfite treated using the EZ DNA Methylation-Lightning™ Kit (Zymo Research) to convert non-hydroxymethylated cytosines to uracil. To assess conversion and protection rates, fragmented genomic DNA was spiked with commercial methylated and hydroxymethylated non-mammalian control DNAs (WiseGene). The mC control consisted of SssI-treated lambda DNA, which should theoretically contain 100% methylated CpGs and non-methylated non-CpG cytosines. In practice, we observed a 96.3% methylation level for CpGs in the mC control, likely the result of incomplete methylation during production of the control. The hmC control consisted of a 1.64 kb amplicon from pUC19, intended to contain 100% hydroxymethylcytosines at all locations.
The observed hydroxymethylation level for the hmC control was also reduced, at 91.1%, which is not unexpected considering past evidence of contamination of commercial 5-hmCTP with dCTP(45-46). Conversion and protection rates are presented in Supplementary Table 1.
Sequencing: We used the Accel-NGS™ Methyl-Seq DNA Library Kit (Swift Biosciences) to create indexed Illumina libraries directly from the bisulfite and TET1-oxidized/bisulfite converted DNA. Libraries were pooled with PhiX (15%) to compensate for lowered cytosine signal and sequenced with 2x150 bp paired-end reads on an Illumina NextSeq 500 using v2 chemistry.
Enrichment panel
DNA fragmentation: Genomic DNA was sonicated to 100-150 bp using a Covaris S2 ultra-sonicator (10 cycles, 1 cycle per burst, 60 seconds, 10% duty cycle, intensity 5.0). Multiple sonications were performed for each sample (2 μg DNA maximum per 100 μL fragmentation). The fragmented aliquots from each sample were pooled prior to any downstream assays.
MBD: We used the MethylMiner™ Kit (Invitrogen) to enrich for mCG via affinity purification with a methyl-CG binding domain protein (MBD2). For each sample, 15 μL of stock Dynabeads® M-280 Streptavidin (10 μL of beads per 1 μg of DNA input) was coupled to 5.25 μg of MBD-Biotin Protein according to the vendor-supplied protocol. Prepared MBD-beads were added to each input sample of 1.5 μg of fragmented DNA and adjusted to a final volume of 200 μL in 1x Bind/Wash Buffer. The MBD-beads were incubated with DNA on an orbital shaker at 650 rpm for 1 hour at room temperature in a 96-well 1.2 mL square well plate. Following incubation, the plate was placed on a magnetic rack and the supernatant containing non-captured, mCG-depleted DNA was retained for MBD-DIP. Beads were then washed three times by addition of 200 μL of 1x Bind/Wash Buffer followed by 3 minute incubations at room temperature on an orbital shaker at 650 rpm.
Methylated DNA was eluted from the beads with three treatments of 200 μL low salt elution buffer (12.5% High Salt Elution Buffer + 87.5% Low Salt Elution Buffer [v/v]; 500 mM NaCl final), following the same incubation paradigm as the wash steps. The combined 600 μL mCG-enriched eluate was purified by ethanol precipitation. After optimization, we have empirically determined that a low salt elution improves the specificity of the assay for loci with moderate numbers of CpG sites, giving better methylome-wide representation(41). Additionally, shearing genomic DNA to fragments as small as the downstream sequencing protocol allows (100-150 bp) further increases both the resolution and sensitivity of the MBD enrichment assay (i.e. small, sparsely methylated fragments require less energy to be captured than larger fragments).
MBD-DIP: The mCG-depleted DNA fraction was retained after MBD enrichment and purified by ethanol precipitation. Methylated DNA immunoprecipitation (MeDIP) was used to enrich for mCH from the mCG-depleted fraction of each sample. We used the MagMeDIP kit (Diagenode, # C02010021) with 1.0 μg of mCG-depleted DNA as input. Briefly, input DNA was heat denatured and incubated with anti-mC antibody and magnetic beads overnight at 4°C with end-over-end rotation. After incubation, beads were washed and mCH-enriched DNA was liberated with proteinase K treatment. Reagent volumes and concentrations were prepared according to the vendor-supplied protocol. The crude mCH-enriched fractions were cleaned by column purification (ChIP DNA Clean & Concentrator, Zymo, # D5201) prior to library generation.
hMe-Seal: Selective chemical labeling and enrichment of hmC (hMe-Seal)(37) was performed using components of the Hydroxymethyl Collector™ kit (Active Motif). We substituted the enzyme in the kit with T4 β-glucosyltransferase from New England Biolabs (# M0357) to improve labeling performance.
In our experience, using high-quality and high-activity enzyme is critical for successful hmC enrichment. 1.5 μg of 150 bp fragmented gDNA was used as input for each assay. Briefly, T4 β-glucosyltransferase (10 U per sample, 16 hour incubation at 30°C) was used to selectively label hmC residues with 150 μM final UDP-azide-glucose. Each azide-glucosylated hmC was next biotinylated via dibenzocyclooctyl click chemistry by addition of the provided Biotin Conjugate Solution (likely DBCO-S-S-PEG3-Biotin at 150 μM final) with a 2 hour incubation at 37°C. After biotinylation, the labeled DNA was column purified, enriched with paramagnetic streptavidin beads, washed, and eluted following the vendor-supplied protocol, except that end-over-end rotation was replaced by agitation on an orbital shaker at 700 rpm in a 96-well 1.2 mL square well plate. In an independent study(38) this chemical capture-based approach compared favorably to hmC enrichment methods using proteins or antibodies.
MeDIP: MeDIP was performed with 1.0 μg fragmented gDNA using the MagMeDIP Kit (Diagenode, # C02010021). Capture protocol, purification, library preparation, and sequencing were carried out identically to those for the MBD-DIP samples.
Sequencing: The MBD and hMe-Seal enriched fractions were used to create indexed sequencing libraries using the TruSeq Nano DNA HT Library Prep Kit (Illumina). For MBD-DIP and MeDIP enriched fractions, the Accel-NGS Methyl-Seq DNA Library Kit (Swift Biosciences) was used to generate indexed libraries directly from single-stranded DNA. Libraries were size-selected using SPRI beads to obtain a mean insert size of 150 bp. Libraries were pooled and sequenced with 75-cycle single-end reads on an Illumina NextSeq 500 using v2 chemistry.
|
|
|
Library strategy |
Bisulfite-Seq |
Library source |
genomic |
Library selection |
RANDOM |
Instrument model |
Illumina NextSeq 500 |
|
|
Data processing |
library strategy: WGB-seq
Bisulfite sequencing and enrichment sequencing
Bisulfite data: BS-Seeker2 (Guo, Fiziev, Yan, Cokus, Sun, Zhang, Chen, and Pellegrini 2013) was used to align data and call methylation levels. BS-Seeker2 converts both the reads and the reference (build hg19/GRCh37) and aligns the reads in a 3-letter base space using Bowtie2 (Langmead and Salzberg 2012). We used local (i.e. removal of terminal mismatches from the reads) and gapped (i.e. allowing small indels) alignment to increase mapping rates (Guo et al. 2013). The local alignment algorithm also automatically truncates adapter sequences. The maximum number of mismatches allowed per read was three. Only successfully mapped read-pairs were used to call methylation levels, with correction for overlaps between read-pairs. To avoid unreliable estimates, for the bisulfite data we used only methylation calls that were based on five or more sequencing reads.
*Guo, W., P. Fiziev, W. Yan, S. Cokus, X. Sun, M. Q. Zhang, P. Y. Chen, and M. Pellegrini. 2013. "BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data." BMC Genomics 14:774.
*Langmead, B. and S. L. Salzberg. 2012. "Fast gapped-read alignment with Bowtie 2." Nat Methods 9:357-9.
Enrichment data: Reads were aligned (build hg19/GRCh37) with Bowtie2 (Langmead and Salzberg 2012) using a seed-and-extend approach combined with local alignment while allowing for gaps. Specifically, we used a 20 bp seed with zero mismatches. Rather than considering the entire read, local alignment was used to improve sensitivity by finding the maximum similarity score between the reference sequence and a substring of the extension that may be "trimmed" at both ends. Gaps were allowed to account for small indels. We quality controlled (QC) multi- and duplicate-reads. Reads often map to multiple genomic locations, but in most cases a single alignment can be selected because it is clearly better than the others.
In the case of multi-reads, multiple alignments are about equally good. When Bowtie2 encounters a set of equally good alignments, it uses a pseudo-random number to select one primary alignment. Duplicate-reads are reads that start at the same nucleotide position. When sequencing a whole genome, duplicate-reads often arise from template preparation or amplification artifacts. In our context of sequencing an enriched genomic fraction, duplicate-reads are more likely to occur because reads align to a much smaller fraction of the genome. In instances where >3 (duplicate) reads start at the same position, we reset the read count to 1, implicitly assuming that these reads all tagged a single clonal fragment.
A natural way to quantify enrichment is to count the number of fragments covering a cytosine. In contrast to bisulfite sequencing, where precision depends on the number of sequenced bases covering each putative methylation site, the number of sequenced fragments determines the precision of enrichment methods. For this reason, single-end libraries are in principle more cost-effective than paired-end libraries. However, with single-end libraries the fragment sizes are not observed. Counting the number of reads covering the putative methylation site instead seriously underestimates coverage, as the fragment that generated the read is usually longer than the read. To remedy this, we estimated the fragment size distribution from the empirical sequencing data using isolated cytosines (van den Oord, Bukszar, Rudolf, Nerella, McClay, Xie, and Aberg 2013). To evaluate this approach, we showed that methylation estimates obtained from paired-end data (where the fragment size distribution is observed) are almost identical to those obtained from single-end data with our estimator (correlation was 0.999). The estimated fragment size distribution was used to calculate the probability that a sequenced fragment will cover the cytosine under consideration.
Coverage for each cytosine was then calculated by taking the sum of probabilities for all fragments aligning within proximity of the cytosine. For example, this probability is 1.0 for fragments with reads starting within one read-length of the cytosine, but is ≤1.0 for fragments with reads starting more than one read-length away. Coverage is also affected by the total number of reads used per sample, which is a function of sequencer loading and output rather than of differences in methylation. Therefore, coverage estimates were standardized using the total number of reads that remained after quality control.
*van den Oord, E. J., J. Bukszar, G. Rudolf, S. Nerella, J. L. McClay, L. Y. Xie, and K. A. Aberg. 2013. "Estimation of CpG coverage in whole methylome next-generation sequencing studies." BMC Bioinformatics 14:50.
Genome_build: build hg19/GRCh37
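The duplicate-read rule, the fragment-based coverage estimate, and the standardization step described under Enrichment data can be illustrated with a minimal Python sketch. This is not the submitters' code. All function names, the toy fragment size distribution, the forward-strand-only simplification, and the per-million scaling are assumptions made for illustration; the record states only that coverage was standardized by the number of reads remaining after QC.

```python
def collapse_duplicates(start_counts, max_dup=3):
    """Reset the read count to 1 where more than `max_dup` reads share a start position."""
    return {pos: (1 if n > max_dup else n) for pos, n in start_counts.items()}


def cover_probability(distance, read_length, frag_size_pmf):
    """Probability that a fragment reaches a cytosine `distance` bases past its read start.

    Assumes forward-strand reads only, and that `frag_size_pmf` maps
    fragment length (bp) to probability (hypothetical input format).
    """
    if distance < 0:
        return 0.0  # cytosine lies upstream of the read start
    if distance < read_length:
        return 1.0  # the read itself covers the cytosine
    # Otherwise the fragment must be longer than the distance to the cytosine.
    return sum(p for size, p in frag_size_pmf.items() if size > distance)


def estimated_coverage(cytosine_pos, start_counts, read_length, frag_size_pmf):
    """Sum the covering probabilities over all reads near the cytosine."""
    return sum(
        n * cover_probability(cytosine_pos - start, read_length, frag_size_pmf)
        for start, n in start_counts.items()
    )


def standardize(coverage, total_qc_reads, scale=1_000_000):
    """Express coverage per `scale` QC-passed reads (the scale is an assumption)."""
    return coverage * scale / total_qc_reads


# Toy data: read start position -> number of reads starting there.
frag_size_pmf = {100: 0.25, 150: 0.5, 200: 0.25}
start_counts = {880: 2, 950: 5, 1000: 1, 1100: 1}

deduped = collapse_duplicates(start_counts)  # the 5 reads at 950 become 1
coverage = estimated_coverage(1010, deduped, read_length=75, frag_size_pmf=frag_size_pmf)
per_million = standardize(coverage, total_qc_reads=2_000_000)
print(coverage, per_million)  # prints: 3.5 1.75
```

In the toy example, the two reads starting 130 bp upstream each contribute 0.75 (the probability that a fragment is longer than 130 bp), while reads starting within one read length contribute 1.0 each.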
Supplementary_files_format_and_content:
CGmap.tar.gz: BS-Seeker2 CGmap files
mCG.csv.gz: Panel mCG (chr, coordinate, enrichment coverage, bisulfite estimated methylation sum)
hmCG.csv.gz: Panel hmCG (chr, coordinate, enrichment coverage, bisulfite estimated methylation sum)
mCH.csv.gz: Panel mCH (chr, coordinate, enrichment coverage, bisulfite estimated methylation sum)
MeDIP_mCG.csv.gz: MeDIP mCG (chr, coordinate, enrichment coverage, bisulfite estimated methylation sum)
MeDIP_mCH.csv.gz: MeDIP mCH (chr, coordinate, enrichment coverage, bisulfite estimated methylation sum)
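As a hedged illustration of the "five or more sequencing reads" rule applied to the bisulfite calls, the sketch below filters CGmap records by read depth. It assumes the tab-separated, 8-column CGmap layout as commonly documented for BS-Seeker2 (chromosome, nucleotide, position, context, dinucleotide context, methylation level, methylated count, total count); verify this against the actual files before use. The function name and the example records are hypothetical.

```python
import io


def read_cgmap(handle, min_reads=5):
    """Yield (chrom, pos, context, methylation_level, total_reads) for sites
    covered by at least `min_reads` reads.

    Assumes an 8-column, tab-separated CGmap layout (see the note above).
    """
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        chrom, _nuc, pos, context, _dinuc, _level, n_meth, n_total = fields[:8]
        n_meth, n_total = int(n_meth), int(n_total)
        if n_total >= min_reads:
            # Recompute the level from the counts rather than trusting rounded text.
            yield chrom, int(pos), context, n_meth / n_total, n_total


# Hypothetical records for illustration only.
example = io.StringIO(
    "chr1\tC\t10542\tCG\tCG\t0.83\t5\t6\n"
    "chr1\tG\t10563\tCG\tCG\t1.00\t3\t3\n"   # only 3 reads: dropped
    "chr1\tC\t10571\tCHH\tCT\t0.00\t0\t8\n"
)
calls = list(read_cgmap(example))
for call in calls:
    print(call)
```

For a real file, the same function could be passed a handle from `gzip.open(path, "rt")` once the archive has been unpacked.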
|
|
|
Submission date |
Feb 14, 2017 |
Last update date |
May 15, 2019 |
Contact name |
Edwin van den Oord |
E-mail(s) |
ejvandenoord@vcu.edu
|
Organization name |
Virginia Commonwealth University
|
Department |
Center Biomarker Research and Precision Med
|
Street address |
1112 East Clay Street
|
City |
Richmond |
ZIP/Postal code |
23298 |
Country |
USA |
|
|
Platform ID |
GPL18573 |
Series (1) |
GSE94866 |
Enrichment methods provide a feasible approach to comprehensive and adequately powered investigations of the methylome |
|
Relations |
BioSample |
SAMN06330135 |
SRA |
SRX2558809 |
Supplementary file | Size | Download | File type/resource |
GSM2486674_S_WGB_2_CGmap.tar.gz | 4.4 Gb | (ftp)(http) | TAR |
Raw data are available in SRA |
Processed data provided as supplementary file |