Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality

Clin Epigenetics. 2018 Dec 13;10(1):155. doi: 10.1186/s13148-018-0591-z.

Abstract

Background: The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. The application of a machine learning approach to the analysis of CpG methylation in the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we now report that a set of smoking-associated DNA-methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population.

Results: We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E-07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample employing 408,583 CpGs. A set of 698 CpGs was selected that predicted high HIV frailty in a testing sample (area under the curve [AUC] = 0.73, 95% CI 0.63 to 0.83), a result that was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46, 95% CI 1.06 to 2.02, p = 0.02). Interestingly, the 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E-05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion.
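
For readers unfamiliar with how an AUC and its 95% confidence interval are typically obtained, the following is a minimal illustrative sketch in Python. It is not the authors' pipeline: the abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, and the function names (auc, bootstrap_ci), the sample size, and the synthetic scores and labels are all hypothetical.

```python
import random


def auc(scores, labels):
    """AUC via the rank (Mann-Whitney) formulation: the probability that a
    randomly chosen positive scores higher than a randomly chosen negative,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    chosen only for illustration)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for model scores on a test sample (hypothetical data):
# label 1 = high frailty, label 0 = low frailty.
rng = random.Random(42)
labels = [1] * 60 + [0] * 60
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

demo_auc = auc(scores, labels)
ci_low, ci_high = bootstrap_ci(scores, labels)
print(f"AUC = {demo_auc:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

The pairwise formulation is quadratic in sample size but keeps the definition of AUC explicit; for large samples a rank-based or library implementation would be the more practical choice.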

Conclusion: We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV.

Keywords: DNA methylation; Ensemble machine learning; HIV frailty; Mortality; Tobacco smoking.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • CpG Islands
  • DNA Methylation*
  • Epigenesis, Genetic
  • Female
  • Frailty / genetics*
  • Genome-Wide Association Study / methods*
  • HIV Infections / genetics
  • HIV Infections / mortality*
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Mortality
  • Prognosis
  • Signal Transduction
  • Smoking / genetics*