A unifying framework for evaluating the predictive power of genetic variants based on the level of heritability explained

PLoS Genet. 2010 Dec 2;6(12):e1001230. doi: 10.1371/journal.pgen.1001230.

Abstract

An increasing number of genetic variants have been identified for many complex diseases. However, whether risk prediction based on genomic profiles will be clinically useful remains controversial. Appropriate statistical measures are required to evaluate the performance of genetic risk prediction models. Previous studies have mainly used the area under the receiver operating characteristic (ROC) curve, or AUC, to judge the predictive value of genetic tests. However, the AUC has limitations and should be complemented by other measures. In this study, we develop a novel unifying statistical framework that connects a wide range of predictive indices. We show that, given the overall disease probability and the proportion of variance in total liability (or heritability) explained by the genetic variants, we can estimate analytically a wide range of prediction metrics, including the AUC, the mean risk difference between cases and non-cases, the net reclassification improvement (the ability to reclassify people into high- and low-risk categories), the proportion of cases accounted for by a given percentile of the population at highest risk, the variance of predicted risks, and the risk at any percentile. We also demonstrate how to construct graphs that visualize the performance of risk models, such as the ROC curve, the density of risks, and the predictiveness curve (disease risk plotted against risk percentile). The results from simulations match our theoretical estimates very well. Finally, we apply the methodology to nine complex diseases, evaluating the predictive power of genetic tests based on known susceptibility variants for each trait.
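
To make the two inputs concrete, the relationship between disease probability, heritability explained, and several of the metrics listed above can be illustrated by simulation. The following is a minimal sketch only, not the analytic method of the paper: it assumes a standard liability-threshold model (total liability with unit variance, disease when liability exceeds a threshold set by the disease probability), and the function name, parameter names, and returned keys are hypothetical choices made for this illustration.

```python
import math
from statistics import NormalDist

import numpy as np


def simulate_prediction_metrics(prevalence, h2_explained, n=200_000, seed=0):
    """Monte Carlo sketch under an ASSUMED liability-threshold model.

    prevalence   : overall disease probability, strictly between 0 and 1
    h2_explained : share of liability variance explained by the variants,
                   strictly between 0 and 1
    """
    rng = np.random.default_rng(seed)
    # Liability threshold T such that P(liability > T) = prevalence.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)

    # Total liability = genetic score + residual, with total variance 1.
    genetic = rng.normal(0.0, math.sqrt(h2_explained), n)
    residual_sd = math.sqrt(1.0 - h2_explained)
    residual = rng.normal(0.0, residual_sd, n)
    is_case = (genetic + residual) > threshold

    # Predicted risk given the genetic score g:
    # P(disease | g) = 1 - Phi((T - g) / residual_sd) = 0.5 * erfc(z),
    # with z = (T - g) / (residual_sd * sqrt(2)).
    z = (threshold - genetic) / (residual_sd * math.sqrt(2.0))
    risk = 0.5 * np.vectorize(math.erfc)(z)

    # Empirical AUC via the Mann-Whitney rank statistic. Risk is a monotone
    # function of the genetic score, so ranking the score gives the same AUC.
    ranks = np.empty(n)
    ranks[np.argsort(genetic)] = np.arange(1, n + 1)
    n_cases = int(is_case.sum())
    n_controls = n - n_cases
    auc = (ranks[is_case].sum() - n_cases * (n_cases + 1) / 2) / (
        n_cases * n_controls
    )

    # Share of all cases found in the 10% of people at highest predicted risk.
    cutoff = np.quantile(genetic, 0.9)
    cases_in_top_decile = is_case[genetic >= cutoff].sum() / n_cases

    return {
        "auc": float(auc),
        "mean_risk": float(risk.mean()),
        "risk_difference": float(risk[is_case].mean() - risk[~is_case].mean()),
        "risk_variance": float(risk.var()),
        "cases_in_top_decile": float(cases_in_top_decile),
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not values taken from the paper.
    for h2 in (0.05, 0.2, 0.5):
        print(h2, simulate_prediction_metrics(prevalence=0.1, h2_explained=h2))
```

Under these assumptions, increasing `h2_explained` at a fixed disease probability should raise the AUC, the mean risk difference, and the variance of predicted risks, while the mean predicted risk stays close to the disease probability.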

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Biostatistics
  • Genetic Predisposition to Disease*
  • Genetic Testing / statistics & numerical data*
  • Genetic Variation*
  • Humans
  • Models, Statistical*
  • Predictive Value of Tests
  • ROC Curve
  • Risk Factors