MaGIC: a program to generate targeted marker sets for genome-wide association studies

Biotechniques. 2004 Dec;37(6):996-9. doi: 10.2144/04376BIN03.

Abstract

High-throughput genotyping technologies such as DNA pooling and DNA microarrays mean that whole-genome screens are now practical for complex disease gene discovery using association studies. Because it is currently impractical to use all available markers, a subset is typically selected on the basis of the required saturation density. Restricting markers to those within annotated genomic features of interest (e.g., genes or exons) or within feature-rich regions reduces workload and cost while retaining much of the information. We have designed a program (MaGIC) that exploits genome assembly data to create lists of markers correlated with other genomic features. Marker lists are generated at a user-defined spacing and can target features with a user-defined density. Maps are in base pairs or in linkage disequilibrium units (LDUs) derived from the International HapMap data, which is useful for association studies and fine-mapping. Markers may be selected on the basis of heterozygosity and source database, and single nucleotide polymorphism (SNP) markers may additionally be selected on the basis of validation status. The import function allows the method to be used for any genomic features, such as housekeeping genes, long interspersed elements (LINEs), or Alu repeats in humans, and for other species with equivalent data. The program and source code are freely available at http://cogent.iop.kcl.ac.uk/MaGIC.cogx.
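
The selection steps described above (filter by heterozygosity and validation status, restrict to markers inside features, then thin to a minimum spacing) can be illustrated with a minimal Python sketch. This is not MaGIC's actual code or algorithm, which the abstract does not specify. The `Marker` class, the `select_markers` function, and all parameter names are hypothetical, and a simple greedy left-to-right pass is assumed for the spacing step. Positions may be read as base pairs or as LDUs, provided markers and features use the same map.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Marker:
    """Hypothetical marker record; field names are illustrative assumptions."""
    name: str
    position: float        # map position (bp or LDU, consistently)
    heterozygosity: float
    validated: bool = True


def in_any_feature(position: float, features: List[Tuple[float, float]]) -> bool:
    """Return True if the position lies inside any (start, end) interval."""
    return any(start <= position <= end for start, end in features)


def select_markers(
    markers: List[Marker],
    features: List[Tuple[float, float]],
    spacing: float,
    min_het: float = 0.0,
    validated_only: bool = False,
) -> List[Marker]:
    """Greedy sketch: keep qualifying markers at least `spacing` apart."""
    # Step 1: filter on marker quality and on overlap with target features.
    candidates = sorted(
        (
            m for m in markers
            if m.heterozygosity >= min_het
            and (m.validated or not validated_only)
            and in_any_feature(m.position, features)
        ),
        key=lambda m: m.position,
    )
    # Step 2: walk along the map, keeping a marker only if it is far
    # enough from the previously kept one.
    chosen: List[Marker] = []
    for m in candidates:
        if not chosen or m.position - chosen[-1].position >= spacing:
            chosen.append(m)
    return chosen
```

A greedy pass is only one possible design: it is simple and deterministic, but it does not prefer the most informative marker within each interval, which a real tool might do.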

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • Gene Targeting / methods*
  • Genetic Markers / genetics*
  • Genome, Human
  • Humans
  • Linkage Disequilibrium / genetics
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Alignment / methods
  • Sequence Analysis, DNA / methods*
  • Software*
  • User-Computer Interface*

Substances

  • Genetic Markers