Improving SNP prioritization and pleiotropic architecture estimation by incorporating prior knowledge using graph-GPA

Bioinformatics. 2018 Jun 15;34(12):2139-2141. doi: 10.1093/bioinformatics/bty061.

Abstract

Summary: Integration of genetic studies for multiple phenotypes is a powerful approach to improving the identification of genetic variants associated with complex traits. Although leveraging the shared genetic basis among phenotypes, namely pleiotropy, has been shown to increase statistical power to identify risk variants, it remains challenging to effectively integrate genome-wide association study (GWAS) datasets for a large number of phenotypes. We previously developed graph-GPA, a Bayesian hierarchical model that integrates multiple GWAS datasets to boost statistical power for the identification of risk variants and to estimate pleiotropic architecture within a unified framework. Here we propose a novel improvement of graph-GPA that incorporates external knowledge about phenotype-phenotype relationships to guide the estimation of genetic correlation and the association mapping. The application of graph-GPA to GWAS datasets for 12 complex diseases, with a prior disease graph obtained from text mining of biomedical literature, illustrates its power to improve the identification of genetic risk variants and to facilitate understanding of genetic relationships among complex diseases.

Availability and implementation: graph-GPA is implemented as an R package, 'GGPA', which is publicly available at http://dongjunchung.github.io/GGPA/. DDNet, a web interface for querying diseases of interest and downloading a prior disease graph obtained from text mining of biomedical literature, is publicly available at http://www.chunglab.io/ddnet/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Computational Biology / methods
  • Data Mining
  • Data Visualization
  • Genome-Wide Association Study / methods*
  • Polymorphism, Single Nucleotide*
  • Software*