Series GSE43414
Status Public on Jan 24, 2013
Title A data-driven approach to preprocessing Illumina 450K methylation array data
Organism Homo sapiens
Experiment type Methylation profiling by array
Summary Background: As the most stable and experimentally accessible epigenetic mark, DNA methylation is of great interest to the research community. The landscape of DNA methylation across tissues, through development and in disease pathogenesis is not yet well characterised. Thus there is a need for rapid and cost effective methods for assessing genome-wide levels of DNA methylation. The Illumina Infinium HumanMethylation450 (450K) BeadChip is a very useful addition to the available methods but its complex design, incorporating two different assay methods, requires careful consideration. Accordingly, several normalization schemes have been published. We have taken advantage of known DNA methylation patterns associated with genomic imprinting and X-chromosome inactivation (XCI), in addition to the performance of SNP genotyping assays present on the array, to derive three independent metrics which we use to test alternative schemes of correction and normalization. These metrics also have potential utility as quality scores for datasets.
Results: The standard index of DNA methylation at any specific CpG site is β = M/(M + U + 100) where M and U are methylated and unmethylated signal intensities. Betas calculated from raw signal intensities (the default GenomeStudio behaviour) perform well, but using 11 methylomic datasets we demonstrate that quantile normalization methods produce marked improvement, even in highly consistent data, by all three metrics. The commonly used procedure of normalizing betas is inferior to the separate normalization of M and U, and it is also advantageous to normalize Type I and Type II assays separately. More elaborate manipulation of quantiles proves to be counterproductive.
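The beta definition and the idea of normalizing M and U separately, per assay type, can be illustrated with a minimal Python sketch. This is not the wateRmelon implementation: the function names (`beta_values`, `quantile_normalize`, `normalize_by_type`) and the boolean probe-type mask are hypothetical, the quantile normalization is a plain textbook version that breaks ties by sort order, and the arrays are made-up intensities rather than data from this series.

```python
import numpy as np

def beta_values(M, U, offset=100.0):
    """beta = M / (M + U + offset), with offset = 100 as in the summary."""
    M = np.asarray(M, dtype=float)
    U = np.asarray(U, dtype=float)
    return M / (M + U + offset)

def quantile_normalize(X):
    """Quantile-normalize the columns (samples) of a probes x samples matrix.

    Each column is mapped onto the mean of the sorted columns.
    Ties are broken by sort order rather than averaged (a simplification).
    """
    X = np.asarray(X, dtype=float)
    order = np.argsort(X, axis=0)        # row indices that sort each column
    ranks = np.argsort(order, axis=0)    # rank of each value within its column
    reference = np.sort(X, axis=0).mean(axis=1)  # mean distribution across samples
    return reference[ranks]

def normalize_by_type(M, U, is_type_one):
    """Normalize M and U separately, and Type I / Type II probes separately,
    then compute betas from the normalized intensities.

    is_type_one is a hypothetical boolean mask over probes (rows).
    """
    M = np.asarray(M, dtype=float)
    U = np.asarray(U, dtype=float)
    is_type_one = np.asarray(is_type_one, dtype=bool)
    M_norm = np.empty_like(M)
    U_norm = np.empty_like(U)
    for mask in (is_type_one, ~is_type_one):
        M_norm[mask] = quantile_normalize(M[mask])
        U_norm[mask] = quantile_normalize(U[mask])
    return beta_values(M_norm, U_norm)

if __name__ == "__main__":
    # Illustrative intensities: 4 probes x 2 samples (not real data).
    M = np.array([[1000.0, 2000.0], [300.0, 500.0], [4000.0, 3500.0], [50.0, 80.0]])
    U = np.array([[200.0, 100.0], [900.0, 1200.0], [100.0, 300.0], [2500.0, 2000.0]])
    type_one = np.array([True, True, False, False])
    print(normalize_by_type(M, U, type_one))
```

Normalizing betas directly would instead apply `quantile_normalize` to the output of `beta_values`; the summary reports that this performs worse than normalizing M and U separately.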
Conclusions: Careful selection of preprocessing steps can minimise variance and thus improve statistical power, especially for the detection of the small absolute DNA methylation changes likely associated with complex disease phenotypes. For the convenience of the research community we have created an R software package called wateRmelon, compatible with the existing methylumi, minfi and IMA packages, that allows others to utilise the same normalization methods and data quality tests on 450K data.
Overall design Bisulfite-converted DNA from the 11 cohorts (N=695, including 36 technical replicates) was hybridised to the Illumina Infinium 450k Human Methylation Beadchip v1.2.
Contributor(s) Pidsley R, Wong CC, Volta M, Lunnon K, Mill J, Schalkwyk LC
Citation(s) 23631413
Submission date Jan 10, 2013
Last update date Mar 22, 2019
Contact name Chloe Wong
Organization name King's College London
Street address MRC SGDP Centre, Institute of Psychiatry,
City London
ZIP/Postal code SE5 8AF
Country United Kingdom
Platforms (1)
GPL13534 Illumina HumanMethylation450 BeadChip (HumanMethylation450_15017482)
Samples (696)
GSM1068821 genomic DNA from cerebellum_6057825008_R01C01_Cohort 1Ai
GSM1068822 genomic DNA from cerebellum_6057825008_R01C02_Cohort 1Ai
GSM1068823 genomic DNA from cerebellum_6057825008_R02C01_Cohort 1Ai
BioProject PRJNA187197

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE43414_RAW.tar 183.1 Mb (http)(custom) TAR
GSE43414_betaqn_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_danen_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_danes_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_danet_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_dasen_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_daten1_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_daten2_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_fuks_geo_all_cohorts.csv.gz 2.2 Gb (ftp)(http) CSV
GSE43414_nanes_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_nanet_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_nasen_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_naten_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
GSE43414_signal_intensities.csv.gz 1.5 Gb (ftp)(http) CSV
GSE43414_swan_geo_all_cohorts.csv.gz 2.5 Gb (ftp)(http) CSV
GSE43414_tost_geo_all_cohorts.csv.gz 2.6 Gb (ftp)(http) CSV
Processed data are available on Series record
