CLUMPHAP: a simple tool for performing haplotype-based association analysis

Genet Epidemiol. 2008 Sep;32(6):539-45. doi: 10.1002/gepi.20327.

Abstract

The completion of the HapMap Project and the development of high-throughput single nucleotide polymorphism (SNP) genotyping technologies have greatly enhanced the prospects of identifying and characterizing the genetic variants that influence complex traits. In principle, association analysis of haplotypes rather than individual SNPs may better capture an underlying causal variant, but the large number of haplotypes that must be tested (and corrected for) can reduce statistical power. This paper presents a novel method that addresses this issue by clustering similar haplotypes. The method, implemented in the CLUMPHAP program, is an extension of the CLUMP program designed for the analysis of multi-allelic markers (Sham and Curtis [1995] Ann. Hum. Genet. 59(Pt1):97-105). CLUMPHAP performs a hierarchical clustering of the haplotypes and then computes the χ² statistic between each haplotype cluster and disease; the statistical significance of the largest of the χ² statistics is obtained by permutation testing. A significant result suggests that a haplotype cluster carrying a disease-causing variant is over-represented in cases. Using simulation studies, we compared CLUMPHAP with more widely used approaches in terms of statistical power to identify an untyped susceptibility locus. Our results show that CLUMPHAP tends to have greater power than the omnibus haplotype test and is comparable in power to multiple regression locus-coding approaches.
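
The procedure described in the abstract (cluster the haplotypes, test each cluster against disease status, then assess the largest statistic by permutation) can be illustrated with a minimal Python sketch. This is not the CLUMPHAP implementation. The abstract does not state the distance measure or linkage rule, so the Hamming distance and average linkage used here are assumptions, as are the function names, the encoding of haplotypes as phased allele strings, and the 0/1 coding of control/case status.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of marker positions at which two haplotype strings differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_haplotypes(haps):
    """Average-linkage agglomerative clustering of the distinct haplotypes
    (assumed rule). Returns every cluster seen while building the tree,
    singletons included, except the root."""
    active = [frozenset([h]) for h in sorted(set(haps))]
    seen = list(active)
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                d = sum(hamming(a, b) for a in active[i] for b in active[j])
                d /= len(active[i]) * len(active[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = active[i] | active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append(merged)
        seen.append(merged)
    # The root holds every haplotype, so it gives no cluster-vs-rest contrast.
    return seen[:-1]


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]; returns 0.0 when a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def max_chi2(haps, labels, clusters):
    """Largest cluster-vs-rest chi-square; labels are 1 (case) or 0 (control)."""
    cases = Counter(h for h, y in zip(haps, labels) if y == 1)
    ctrls = Counter(h for h, y in zip(haps, labels) if y == 0)
    n_case, n_ctrl = sum(cases.values()), sum(ctrls.values())
    best = 0.0
    for cl in clusters:
        a = sum(cases[h] for h in cl)  # cases carrying a haplotype in the cluster
        b = sum(ctrls[h] for h in cl)  # controls carrying one
        best = max(best, chi2_2x2(a, b, n_case - a, n_ctrl - b))
    return best


def permutation_test(haps, labels, n_perm=999, seed=0):
    """Observed max chi-square and its permutation p-value."""
    clusters = cluster_haplotypes(haps)  # clustering ignores disease status
    observed = max_chi2(haps, labels, clusters)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_chi2(haps, shuffled, clusters) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the clustering depends only on the haplotypes, the sketch builds the tree once and reuses it for every permutation of the case/control labels. Taking the maximum over all clusters inside each permutation is what accounts for the multiple clusters being tested.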

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Gene Frequency
  • Genetic Markers
  • Genetic Predisposition to Disease
  • Haplotypes*
  • Humans
  • Logistic Models
  • Models, Genetic*
  • Models, Statistical*
  • Polymorphism, Single Nucleotide
  • Software*

Substances

  • Genetic Markers