Semiparametric prognosis models in genomic studies

Brief Bioinform. 2010 Jul;11(4):385-93. doi: 10.1093/bib/bbp070. Epub 2010 Feb 1.

Abstract

The development of high-throughput technologies makes it possible to survey the whole genome. Genomic studies have been conducted extensively to search for markers with predictive power for the prognosis of complex diseases such as cancer, diabetes and obesity. Most existing statistical analyses focus on developing marker selection techniques, while little attention is paid to the underlying prognosis models. In this article, we review three commonly used prognosis models, namely the Cox, additive risk and accelerated failure time models. We conduct simulations and show that gene identification can be unsatisfactory under model misspecification. We analyze three cancer prognosis studies under the three models, and show that the gene identification results, the prediction performance of all identified genes combined, and the reproducibility of each identified gene are model-dependent. We suggest that in practical data analysis, more attention should be paid to model assumptions, and multiple models may need to be considered.
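
For orientation, the three models named above are commonly written in the following textbook forms. The notation is an assumption introduced here for illustration and is not taken from the abstract: T is the survival time, Z the vector of gene expression covariates, beta the regression coefficients, lambda_0(t) an unspecified baseline hazard, and epsilon an error term with unspecified distribution.

```latex
% Standard textbook forms; the symbols are illustrative assumptions, not the article's own notation.
\begin{align*}
\text{Cox (proportional hazards):} \quad & \lambda(t \mid Z) = \lambda_0(t)\,\exp(\beta^\top Z) \\
\text{Additive risk:} \quad & \lambda(t \mid Z) = \lambda_0(t) + \beta^\top Z \\
\text{Accelerated failure time:} \quad & \log T = \alpha + \beta^\top Z + \varepsilon
\end{align*}
```

In these forms, the covariates act multiplicatively on the hazard under the Cox model, additively on the hazard under the additive risk model, and directly on the log survival time under the accelerated failure time model. This difference is one way to see why a gene selected under one model need not be selected under another.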

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genomics*
  • Humans
  • Models, Theoretical*
  • Prognosis