Incorporating genotyping uncertainty in haplotype frequency estimation in pedigree studies

Hum Hered. 2007;64(3):172-81. doi: 10.1159/000102990. Epub 2007 May 25.

Abstract

Aims: Haplotype frequency estimation is indispensable in haplotype-based studies of human genetics, since such studies are likely to yield more information than those based on single SNP markers. However, most existing algorithms estimate haplotype frequencies under the assumption that all genotype data are correct. In practice, nearly all large genotype data sets contain errors, and studies have demonstrated that even a small number of genotyping errors can have an enormous impact on haplotype frequency estimation.

Methods: The GenoSpectrum (GS)-EM algorithm, which estimates haplotype frequencies while incorporating genotyping uncertainty, was presented recently [1], but it is suitable only for independent individuals, not for dependent pedigree data. In this paper, we describe a new EM algorithm, called GS-PEM, that calculates maximum likelihood estimates (MLEs) of haplotype frequencies based on all possible multilocus genotypes (the GenoSpectrum) of each pedigree member, using the dependence information among relatives.
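
To make the idea of estimating from a GenoSpectrum concrete, the following is a minimal Python sketch of an EM iteration for unrelated individuals only. It is not the GS-PEM algorithm described above and does not use pedigree information. Several things are assumptions made for illustration: the function name `em_haplotype_freqs`, the restriction to biallelic loci, the encoding of a multilocus genotype as a tuple of per-locus allele counts (0, 1 or 2), and the treatment of each GenoSpectrum probability as a weight on that genotype in the E-step.

```python
from itertools import product


def em_haplotype_freqs(spectra, n_loci=2, n_iter=200, tol=1e-10):
    """Illustrative EM for haplotype frequencies in UNRELATED individuals.

    spectra: list of dicts, one per individual, mapping a candidate
             multilocus genotype (tuple of allele counts 0/1/2 per locus)
             to its weight in that individual's genotype spectrum.
    This encoding is a hypothetical choice, not taken from the paper.
    """
    haps = list(product((0, 1), repeat=n_loci))
    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform starting point

    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}

        # E-step: expected haplotype counts given current frequencies.
        for spectrum in spectra:
            weights = {}
            for h1, h2 in product(haps, repeat=2):  # ordered pairs
                genotype = tuple(a + b for a, b in zip(h1, h2))
                w = spectrum.get(genotype, 0.0)
                if w > 0.0:
                    weights[(h1, h2)] = w * freqs[h1] * freqs[h2]
            total = sum(weights.values())
            if total == 0.0:
                continue  # no compatible haplotype pair; skip individual
            for (h1, h2), w in weights.items():
                counts[h1] += w / total
                counts[h2] += w / total

        n = sum(counts.values())
        if n == 0.0:
            raise ValueError("no individual has a usable genotype spectrum")

        # M-step: renormalise expected counts into frequencies.
        new_freqs = {h: c / n for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break

    return freqs


if __name__ == "__main__":
    # Second individual is uncertain at locus 2 (hypothetical weights).
    example = [
        {(0, 0): 1.0},
        {(1, 1): 0.9, (1, 2): 0.1},
        {(2, 2): 1.0},
    ]
    for hap, f in em_haplotype_freqs(example).items():
        print(hap, round(f, 4))
```

A pedigree version would additionally have to restrict the E-step to haplotype configurations consistent with transmission from parents to offspring, which this sketch does not attempt.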

Results and conclusion: We evaluate the performance of GS-PEM in simulation studies and find that it can reduce the impact of genotyping errors on haplotype frequency estimation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Frequency*
  • Genotype
  • Haplotypes*
  • Humans
  • Likelihood Functions*
  • Pedigree*