Approximate inference of gene regulatory network models from RNA-Seq time series data

BMC Bioinformatics. 2018 Apr 11;19(1):127. doi: 10.1186/s12859-018-2125-2.

Abstract

Background: Inference of gene regulatory network structures from RNA-Seq data is challenging due to the nature of the data, as measurements take the form of counts of reads mapped to a given gene. Here we present a model for RNA-Seq time series data that applies a negative binomial distribution for the observations, and uses sparse regression with a horseshoe prior to learn a dynamic Bayesian network of interactions between genes. We use a variational inference scheme to learn approximate posterior distributions for the model parameters.
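
As a rough illustration of the ingredients named above, the following sketch forward-simulates count data from a small dynamic network with horseshoe-distributed weights and negative binomial observations. It is not the authors' model or inference code: the function name simulate_nb_dbn, the log-link on log-transformed previous counts, the fixed global scale tau, the baseline of 50 reads, and the clipping used to keep the simulation stable are all assumptions made for illustration only.

```python
import numpy as np


def simulate_nb_dbn(n_genes=5, n_times=20, dispersion=0.5, tau=0.1, seed=0):
    """Simulate counts from a toy first-order dynamic network (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Horseshoe prior: w_ij ~ Normal(0, tau^2 * lambda_ij^2),
    # with local scales lambda_ij ~ HalfCauchy(0, 1).
    # Most weights shrink towards zero; a few can be large.
    local_scale = np.abs(rng.standard_cauchy(size=(n_genes, n_genes)))
    weights = rng.normal(size=(n_genes, n_genes)) * tau * local_scale
    weights = np.clip(weights, -1.0, 1.0)  # assumption: bound weights for stability

    baseline = np.log(50.0)  # assumption: baseline expression of about 50 reads
    counts = np.zeros((n_times, n_genes), dtype=np.int64)
    counts[0] = rng.poisson(50.0, size=n_genes)

    n = 1.0 / dispersion  # NumPy's "n" parameter for a given dispersion
    for t in range(1, n_times):
        # Parents at time t-1 drive the log-mean at time t (assumed link).
        x_prev = np.log1p(counts[t - 1]) - baseline
        log_mean = np.clip(baseline + weights @ x_prev, -10.0, 10.0)
        mean = np.exp(log_mean)
        # Negative binomial with mean mu and variance mu + dispersion * mu^2:
        # in NumPy's parameterisation, p = n / (n + mu).
        p = n / (n + mean)
        counts[t] = rng.negative_binomial(n, p)
    return weights, counts


if __name__ == "__main__":
    w, y = simulate_nb_dbn()
    print(y.shape)  # (time points, genes)
```

Learning the network would mean inverting this process, that is, estimating the weights from the counts; the paper does so with variational inference, which this sketch does not attempt.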

Results: The methodology is benchmarked on synthetic data designed to replicate the distribution of real-world RNA-Seq data. We compare our method to other sparse regression approaches and find improved performance in learning directed networks. We also apply our method to a publicly available RNA-Seq time series data set of human neuronal stem cell differentiation to infer the underlying network structure.
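
The abstract does not say how performance on directed networks is scored. Since the MeSH terms for this record include ROC Curve and Area Under Curve, one plausible reading is that predicted edge scores are compared against the known synthetic network by area under the ROC curve. The sketch below shows that calculation under this assumption; the function name edge_auroc and the convention that entry (i, j) refers to a directed edge between genes i and j are hypothetical.

```python
import numpy as np


def edge_auroc(true_adjacency, edge_scores):
    """Area under the ROC curve for directed edge recovery (illustrative only).

    true_adjacency: square 0/1 array, nonzero where a directed edge exists.
    edge_scores: array of the same shape; a larger score means a more confident edge.
    """
    truth = np.asarray(true_adjacency, dtype=bool).ravel()
    scores = np.asarray(edge_scores, dtype=float).ravel()
    if truth.shape != scores.shape:
        raise ValueError("adjacency and scores must have the same shape")

    positives = scores[truth]
    negatives = scores[~truth]
    if positives.size == 0 or negatives.size == 0:
        raise ValueError("need at least one true edge and one non-edge")

    # AUROC equals the probability that a randomly chosen true edge is scored
    # higher than a randomly chosen non-edge, counting ties as one half.
    diff = positives[:, None] - negatives[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)


if __name__ == "__main__":
    truth = np.array([[0, 1], [0, 0]])
    scores = np.array([[0.1, 0.9], [0.2, 0.3]])
    print(edge_auroc(truth, scores))
```

Because the adjacency matrix is not symmetrised, an edge predicted in the wrong direction counts as an error, which is what distinguishes directed from undirected evaluation.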

Conclusions: Our method improves performance on synthetic data by explicitly modelling the statistical distribution of the data when learning networks from RNA-Seq time series. By applying approximate inference techniques, we can learn network structures quickly with only moderate computing resources.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Bayes Theorem
  • Cell Differentiation / genetics
  • Gene Regulatory Networks*
  • Humans
  • Models, Genetic*
  • Neural Stem Cells / cytology*
  • ROC Curve
  • Saccharomyces cerevisiae / genetics*
  • Sequence Analysis, RNA / methods*
  • Time Factors