
Study Description

The Northern Finland Birth Cohorts program (NFBC) was initiated in the 1960s in the two northernmost provinces of Finland to study risk factors involved in pre-term birth and intrauterine growth retardation, and the consequences of these early adverse events on subsequent morbidity and mortality. The NFBCs are unique in that the cohort data were obtained from early fetal life (including maternal health during pregnancy) through adulthood. The NFBC1966 includes 12,058 live births to mothers in the two northernmost provinces of Finland. Two decades later, a second cohort of 9,432 births was obtained (NFBC1986). In NFBC1966, pregnancies were followed prospectively from the first antenatal contact (10th-16th week). After birth, the offspring were examined and then underwent further clinical evaluation at ages 1y, 7y, 14-16y and 31y. At each visit, a wide range of phenotypic, lifestyle and demographic data were gathered by questionnaires and clinical examinations. For the most part, NFBC1986 has undergone similar evaluations to NFBC1966. Linkage to national registries, including hospitalization, death, education, medication, and pension records, provides up-to-date demographic and clinical information for members of both cohorts. DNA samples were obtained from 5,923 subjects from NFBC1966 and 6,688 subjects from NFBC1986. Data coverage, 96% of all births in 1966 and 99% in 1986, is highly representative of the whole population. The NFBC program comprises more than 20 different projects coordinated by the Center of Lifecourse Disease studies in Northern Finland (COLD) at Oulu University. The prospective data collected from the NFBCs form a unique resource, allowing the study of disease emergence and of the importance of genetic, biological, social and behavioral risk factors.

The genome-wide association (GWA) study sponsored through the STAMPEED program of NHLBI employed genomic DNA samples previously collected by the NFBC1966 study and stored in the DNA repository of the National Institute for Health and Welfare, Finland. This NHLBI-sponsored R01 project aimed to identify genetic variants contributing to metabolic and cardiovascular diseases (CVD). In addition to de-identified genome-wide genotypic data, a selected list of phenotypic data related to CVD, including weight, height, BMI, HDL, LDL, total cholesterol, triglycerides, glucose, insulin and fasting status, is also available in dbGaP. A summary of the GWAS for the NFBC1966 cardiovascular risk traits can be found in Sabatti et al., Nature Genetics 41: 35-46, 2009, PMID: 19060910.

The version 2 release of this study contains sequence data from seventeen loci associated with levels of triglyceride, HDL-C, LDL-C, total cholesterol, fasting plasma glucose, and fasting plasma insulin (Kathiresan et al. 2008, Willer et al. 2008, Sabatti et al. 2009, Dupuis et al. 2010, Teslovich et al. 2010). At each locus, the protein-coding regions and the 5' and 3' untranslated regions of the genes nearest to single nucleotide polymorphisms showing genome-wide significant association with metabolic syndrome-related traits were sequenced. Targeted Illumina sequencing of 78 genes (~270kb) using 150bp probes was performed on 4943 subjects of the Northern Finland Birth Cohort 1966 (NFBC1966). Whole exome sequencing on the Illumina platform was carried out on 586 of those participants.

The sequencing study is part of a larger project that is funded by the National Human Genome Research Institute's Allelic Spectrum in Common Disease Initiative, and comprises sequence data from more than 7000 individuals in two Finnish cohorts: NFBC1966 and the Finland-United States Investigation of NIDDM Genetics (FUSION) study.

Study Inclusion/Exclusion Criteria

A detailed description of the design of the NFBC1966 study can be found in: Sabatti C, Service SK, Hartikainen AL, Pouta A, Ripatti S, Brodsky J, Jones CG, Zaitlen NA, Varilo T, Kaakinen M, Sovio U, Ruokonen A, Laitinen J, Jakkula E, Coin L, Hoggart C, Collins A, Turunen H, Gabriel S, Elliott P, McCarthy MI, Daly MJ, Järvelin MR, Freimer NB, Peltonen L. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat Genet. 2009 Jan;41(1):35-46, PMID: 19060910.

Specific exclusion criteria used in the above analyses include:
Subjects were excluded from analysis of specific phenotypes on the basis of criteria that were established separately for each phenotype. Individuals were excluded from analysis of lipid phenotypes (TG, HDL, LDL) if the blood sample was not collected after fasting, or if they were diabetic. Individuals were excluded from analysis of GLU and INS if the blood sample was non-fasting, if they were diabetic, on diabetic medication, pregnant, or if their glucose/insulin measurement (after correction for sex, oral contraceptive use, and pregnancy status) was in excess of three standard deviations from the mean. Individuals were excluded from analysis of BMI if their weight was self-reported, or if they were pregnant. No exclusion criteria were applied to CRP or to SBP/DBP.
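The phenotype-specific rules above can be sketched in a few lines of Python. This is only an illustration, not the study's actual pipeline: the record field names (`fasting`, `diabetic`, `on_diabetic_medication`, `pregnant`, `weight_self_reported`) are hypothetical, and the outlier helper assumes its input values have already been corrected for sex, oral contraceptive use, and pregnancy status.

```python
from statistics import mean, stdev


def lipid_eligible(subject):
    """TG/HDL/LDL: require a fasting sample and no diabetes."""
    return subject["fasting"] and not subject["diabetic"]


def glycemic_eligible(subject):
    """GLU/INS: fasting, not diabetic, not on diabetic medication, not pregnant."""
    return (
        subject["fasting"]
        and not subject["diabetic"]
        and not subject["on_diabetic_medication"]
        and not subject["pregnant"]
    )


def bmi_eligible(subject):
    """BMI: weight must be measured (not self-reported) and subject not pregnant."""
    return not subject["weight_self_reported"] and not subject["pregnant"]


def drop_outliers(adjusted, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean.

    `adjusted` maps a subject id to a covariate-adjusted measurement
    (the adjustment itself is assumed to have been done upstream).
    """
    centre = mean(adjusted.values())
    spread = stdev(adjusted.values())
    return {
        sid: value
        for sid, value in adjusted.items()
        if abs(value - centre) <= n_sd * spread
    }
```

Whether the mean and standard deviation should be computed before or after the eligibility filters is not stated in the text; the sketch leaves that choice to the caller.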

Any individual with a genotyping call rate <95% was excluded from analysis. Subjects whose reported sex disagreed with the sex determined from the X chromosome were excluded from analysis. We employed the identity-by-descent (IBD) analysis option of PLINK (Purcell et al. 2007, PMID: 17701901) to determine possible relatedness among our sample subjects, and to identify sample duplications and sample contamination (the latter identified as subjects who appeared to be related to nearly everyone in the sample). If a sample duplication could not be resolved by external means, both samples were excluded. All apparently contaminated samples were excluded. Individuals related at the level of half-sibs or closer were identified with the IBD analysis, and one subject was excluded from each pair (the subject with less complete genotyping). Subsequent to this overall exclusion, subjects may be excluded from analysis of specific phenotypes as detailed above.
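The sample-level exclusions can be illustrated with a short Python sketch. It is a simplified reading of the paragraph above, under several assumptions: the field names are hypothetical, the related pairs are taken as already identified (for example from PLINK's IBD output), duplicate and contamination handling is omitted, and a pair is skipped when one of its members has already been excluded for another reason.

```python
def qc_exclusions(samples, related_pairs, min_call_rate=0.95):
    """Return a dict mapping excluded sample ids to the reason for exclusion.

    samples: id -> {"call_rate": float, "reported_sex": str, "genetic_sex": str}
    related_pairs: iterable of (id_a, id_b) related at half-sib level or closer
    """
    excluded = {}

    # Per-sample checks: call rate first, then reported vs. genetic sex.
    for sid, info in samples.items():
        if info["call_rate"] < min_call_rate:
            excluded[sid] = "call_rate"
        elif info["reported_sex"] != info["genetic_sex"]:
            excluded[sid] = "sex_mismatch"

    # Pairwise check: drop the member with the less complete genotyping.
    for a, b in related_pairs:
        if a in excluded or b in excluded:
            continue  # assumption: the pair is already broken up
        # On a tie in call rate, the second member of the pair is dropped.
        drop = a if samples[a]["call_rate"] < samples[b]["call_rate"] else b
        excluded[drop] = "related"

    return excluded
```

Returning the reason alongside each id makes it easy to tabulate how many samples were lost at each step.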

Molecular Data
| Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment |
|---|---|---|---|---|---|
| Whole Genome Genotyping | Illumina | HumanCNV370-Quadv3_C | 373339 | 1049349 | Version 1 Data |
| Whole Exome and Targeted Sequencing | Illumina | HiSeq 2000 | N/A | N/A | Version 2 Data |
| Whole Exome and Targeted Sequencing | Illumina | Genome Analyzer IIX | N/A | N/A | Version 2 Data |
| Targeted Capture Sequencing | Agilent | Custom | N/A | N/A | Version 2 Data |

Study History

The NFBC study was started in 1965 in the two northernmost provinces of Finland (Oulu and Lapland). Data on the individuals born into this cohort (with expected date of birth in 1966), on their mothers and, to a lesser extent, on their fathers have been collected since the 24th gestational week (NFBC1966). NFBC 1966 comprises 12068 deliveries and 12231 children.

In 1985, the collection of a younger cohort (with expected date of birth in July 1985-June 1986) began in the same region with the special purpose of preventing mental and physical handicap (NFBC1986). NFBC1986 comprises 9362 mothers and 9479 children.

For both cohorts, interviews and postal questionnaires were completed/returned from the 24th gestational week onwards (data since 12-16th gestational week). The course of pregnancy and delivery, including complications, was confirmed from patient records, as was the neonatal outcome. The children were followed up at the ages of 6-12 months, 7-8 years (NFBC 1986), 14-16 years (NFBC 1966, 1986) and at the age of 31 (NFBC 1966). Follow-up of the NFBC 1986 with clinical data collection, similar to that for the NFBC 1966 at age 31, was carried out during 2001-2003. The data have been supplemented by various hospital records and statistical register data.

A detailed description of the history and initial design of the NFBC1966 study can be found in Rantakallio P. Groups at risk in low birth weight infants and perinatal mortality. Acta Paediatr Scand. 1969;193:43, PMID: 4911003.

Study Attribution
  • Principal Investigator
    • Leena Peltonen (deceased). Wellcome Trust Sanger Institute, Cambridge, UK; Institute for Molecular Medicine Finland and National Institute for Health and Welfare, Helsinki, Finland; Broad Institute, Cambridge, MA, USA.
  • Co-Principal Investigators
    • Nelson Freimer. Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, UCLA, Los Angeles, CA, USA.
    • Aarno Palotie. Wellcome Trust Sanger Institute, Cambridge, UK; Institute for Molecular Medicine Finland and National Institute for Health and Welfare, Helsinki, Finland; Broad Institute, Cambridge, MA, USA.
  • Co-Investigators
    • Joel Hirschhorn. Broad Institute, Cambridge, MA, USA.
    • Mark Daly. Broad Institute, Cambridge, MA, USA.
    • Chiara Sabatti. UCLA, Los Angeles, CA, USA.
    • Marjo-Riitta Järvelin. Imperial College London, UK and Oulu University, Oulu, Finland.
    • Paul Elliott. Imperial College, London, UK.
    • Mark McCarthy. Oxford University, Oxford, UK.
  • Director, Human Genetics Platform
    • Stacey Gabriel. Broad Institute, Cambridge, MA, USA.
  • DNA Repository
    • Finnish Biobanks. National Institute for Health and Welfare (THL), Finland.
  • Genotyping Center
    • Broad Institute Biological Sample Repository (BSP). Broad Institute, Cambridge, MA, USA.
  • Funding Source
    • R01-HL087679-01. National Heart, Lung, and Blood Institute, Bethesda, MD, USA.
    • STAMPEED Program. National Heart, Lung, and Blood Institute, Bethesda, MD, USA.
    • Project grant 104781. Academy of Finland, Finland.
    • Project grant 120315. Academy of Finland, Finland.
    • Project grant 132797. Academy of Finland, Finland.
    • Center of Excellence in Complex Disease Genetics. Academy of Finland, Finland.
    • University Hospital Oulu, Finland.
    • University of Oulu, Finland.
    • Biocenter Finland, Finland.
  • Sequencing Center
    • The Genome Institute, Richard K. Wilson, Director. Washington University School of Medicine, St. Louis, MO, USA.
  • Funding Source for Sequencing
    • U54HG003079 (Center for Large-Scale Genome Sequencing and Analysis). National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.