Predicting regulatory variants with composite statistic

Bioinformatics. 2016 Sep 15;32(18):2729-36. doi: 10.1093/bioinformatics/btw288. Epub 2016 Jun 6.

Abstract

Motivation: Prediction and prioritization of human non-coding regulatory variants are critical for understanding the regulatory mechanisms of disease pathogenesis and promoting personalized medicine. Existing tools use functional genomics data and evolutionary information to evaluate the pathogenicity or regulatory functions of non-coding variants. However, different algorithms yield inconsistent and even conflicting predictions. Combining multiple methods may increase accuracy in regulatory variant prediction.

Results: Here, we compiled an integrative resource of predictions from eight different tools for functional annotation of non-coding variants. We further developed a composite strategy to integrate multiple predictions and compute the composite likelihood of a given variant being a regulatory variant. Benchmarking against multiple independent datasets of causal variants, we demonstrated that our composite model significantly improves prediction performance.
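
The abstract does not give the statistical form of the composite model, so the following is only an illustrative sketch of the general idea of combining per-tool scores into one probability. It is not the PRVCS implementation. It assumes, purely for illustration, that each tool's score is Gaussian within each class and that tools are conditionally independent given the class (a naive composite likelihood). All function names, the three-tool toy data, and the prior are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_gaussian(values):
    """Return (mean, std) of one tool's scores within one class."""
    # Floor the std so a constant column cannot cause division by zero.
    return mean(values), max(stdev(values), 1e-6)


def log_normal_pdf(x, mu, sigma):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)


def fit_composite(regulatory_rows, background_rows):
    """Fit one Gaussian per tool and per class.

    Each row holds one variant's scores, one entry per tool.
    """
    n_tools = len(regulatory_rows[0])
    model = []
    for j in range(n_tools):
        reg = fit_gaussian([row[j] for row in regulatory_rows])
        bg = fit_gaussian([row[j] for row in background_rows])
        model.append((reg, bg))
    return model


def composite_probability(scores, model, prior=0.5):
    """Posterior probability that a variant is regulatory.

    Sums per-tool log-likelihood ratios (independence assumption) and
    adds the prior log-odds. A score of None means the tool made no
    prediction for this variant, so that tool is skipped.
    """
    log_odds = math.log(prior / (1 - prior))
    for x, (reg, bg) in zip(scores, model):
        if x is None:
            continue
        log_odds += log_normal_pdf(x, *reg) - log_normal_pdf(x, *bg)
    return sigmoid(log_odds)


# Toy training data for three hypothetical tools (made-up numbers).
regulatory_rows = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.7, 0.9], [0.85, 0.6, 0.8]]
background_rows = [[0.2, 0.3, 0.1], [0.1, 0.2, 0.3], [0.3, 0.1, 0.2], [0.25, 0.4, 0.15]]

model = fit_composite(regulatory_rows, background_rows)
print(composite_probability([0.85, 0.8, 0.75], model))
print(composite_probability([0.15, None, 0.2], model))
```

Summing log-likelihood ratios lets tools that agree reinforce each other, while a tool with no prediction for a variant simply contributes nothing. The independence assumption is a simplification: tools built on overlapping annotations produce correlated scores, which a real composite model would need to account for.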

Availability and implementation: We implemented our model and scoring procedure as a tool named PRVCS, which is freely available for academic and non-profit use at http://jjwanglab.org/PRVCS

Contact: wang.junwen@mayo.edu, jliu@stat.harvard.edu, or limx54@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms*
  • Biological Evolution
  • Genetic Variation
  • Humans
  • Models, Theoretical*
  • Molecular Sequence Annotation*
  • RNA, Untranslated
  • Software*

Substances

  • RNA, Untranslated