Multiple network-constrained regressions expand insights into influenza vaccination responses

Bioinformatics. 2017 Jul 15;33(14):i208-i216. doi: 10.1093/bioinformatics/btx260.

Abstract

Motivation: Systems immunology leverages recent technological advancements that enable broad profiling of the immune system to better understand the response to infection and vaccination, as well as the dysregulation that occurs in disease. An increasingly common approach to gain insights from these large-scale profiling experiments involves the application of statistical learning methods to predict disease states or the immune response to perturbations. However, the goal of many systems studies is not to maximize accuracy, but rather to gain biological insights. The predictors identified using current approaches can be biologically uninterpretable or present only one of many equally predictive models, leading to a narrow understanding of the underlying biology.

Results: Here we show that incorporating prior biological knowledge within a logistic modeling framework by using network-level constraints on transcriptional profiling data significantly improves interpretability. Moreover, incorporating different types of biological knowledge produces models that highlight distinct aspects of the underlying biology, while maintaining predictive accuracy. We propose a new framework, Logistic Multiple Network-constrained Regression (LogMiNeR), and apply it to understand the mechanisms underlying differential responses to influenza vaccination. Although standard logistic regression approaches were predictive, they were minimally interpretable. Incorporating prior knowledge using LogMiNeR led to models that were equally predictive yet highly interpretable. In this context, B cell-specific genes and mTOR signaling were associated with an effective vaccination response in young adults. Overall, our results demonstrate a new paradigm for analyzing high-dimensional immune profiling data in which multiple networks encoding prior knowledge are incorporated to improve model interpretability.
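Illustration (not part of the published abstract): the abstract does not state the exact form of the network-level constraint. The sketch below assumes one common formulation, a logistic loss with an L1 sparsity penalty plus a graph-Laplacian smoothness penalty, in which genes connected in a prior-knowledge network are encouraged to have similar coefficients. It is written in Python with NumPy rather than the authors' R code, and the function names (normalized_laplacian, fit_network_logistic) and parameters (lam1, lam2, n_iter) are hypothetical. It should not be read as the LogMiNeR implementation.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian of an undirected gene network.

    adj is a (p, p) symmetric adjacency matrix. Isolated genes get a zero
    row and column, so they carry no network penalty.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    scaled = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.diag(connected.astype(float)) - scaled


def fit_network_logistic(X, y, lap, lam1=0.01, lam2=0.1, n_iter=2000):
    """Proximal gradient descent for an assumed objective:

        mean logistic loss + lam1 * ||beta||_1 + lam2 * beta^T L beta

    X: (n, p) expression matrix, y: (n,) labels in {0, 1}, lap: (p, p) Laplacian.
    Returns (intercept, beta).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    intercept = 0.0

    # Step size from a Lipschitz bound on the gradient of the smooth terms.
    X_aug = np.hstack([np.ones((n, 1)), X])
    lipschitz = 0.25 * np.linalg.norm(X_aug, 2) ** 2 / n
    lipschitz += 2.0 * lam2 * np.linalg.norm(lap, 2)
    step = 1.0 / lipschitz

    for _ in range(n_iter):
        linear = np.clip(X @ beta + intercept, -30.0, 30.0)
        prob = 1.0 / (1.0 + np.exp(-linear))
        resid = prob - y
        # Gradient of the logistic loss plus the Laplacian penalty.
        grad = X.T @ resid / n + 2.0 * lam2 * (lap @ beta)
        intercept -= step * resid.mean()
        # Soft-thresholding is the proximal step for the L1 penalty.
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam1, 0.0)
    return intercept, beta
```

Under these assumptions, fitting the same expression data once per prior-knowledge network (each with its own Laplacian) gives one model per network, which is the sense in which multiple networks can highlight different aspects of the biology.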

Availability and implementation: The R source code described in this article is publicly available at https://bitbucket.org/kleinstein/logminer.

Contact: steven.kleinstein@yale.edu or stefan.avey@yale.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Computational Biology / methods*
  • Gene Expression Regulation
  • Humans
  • Immune System
  • Influenza, Human / genetics
  • Influenza, Human / metabolism
  • Influenza, Human / prevention & control*
  • Models, Biological*
  • Signal Transduction
  • Transcriptome
  • Vaccination*