A STATISTICAL FRAMEWORK FOR DATA INTEGRATION THROUGH GRAPHICAL MODELS WITH APPLICATION TO CANCER GENOMICS

Ann Appl Stat. 2017 Mar;11(1):161-184. doi: 10.1214/16-AOAS998. Epub 2017 Apr 8.

Abstract

Recent advances in high-throughput biotechnologies have generated various types of genetic, genomic, epigenetic, transcriptomic and proteomic data across different biological conditions. It is likely that integrating data from diverse experiments may lead to a more unified and global view of biological systems and complex diseases. We present a coherent statistical framework for integrating various types of data from distinct but related biological conditions through graphical models. Specifically, our statistical framework is designed for modeling multiple networks with shared regulatory mechanisms from heterogeneous high-dimensional datasets. The performance of our approach is illustrated through simulations and its applications to cancer genomics.

Keywords: Cancer genomics; data integration; graphical models.