Improving Gene-Set Enrichment Analysis of RNA-Seq Data with Small Replicates

PLoS One. 2016 Nov 9;11(11):e0165919. doi: 10.1371/journal.pone.0165919. eCollection 2016.

Abstract

Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq datasets to date have only a small number of replicates. This forces the use of the gene-permuting GSEA method (or preranked GSEA), which produces a large number of false positives because of the inter-gene correlation within each gene-set. We demonstrate that incorporating the absolute gene statistic in one-tailed GSEA considerably improves the false-positive control and the overall discriminatory ability of the gene-permuting GSEA methods for RNA-seq data. To test the performance, we developed a new simulation method that generates correlated read counts within a gene-set, and compared a dozen currently available RNA-seq enrichment analysis methods; the proposed methods outperformed the others that do not account for the inter-gene correlation. Analysis of real RNA-seq data also supported the proposed methods in terms of false-positive control, ranks of true positives, and biological relevance. An efficient R package (AbsFilterGSEA) coded with C++ (Rcpp) is available from CRAN.
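The idea of a one-tailed, gene-permuting GSEA on absolute gene statistics can be illustrated with a minimal Python sketch. This is not the AbsFilterGSEA implementation: the function names, the weighted running-sum score, and the permutation scheme (drawing random gene-sets of the same size) are assumptions chosen for illustration only.

```python
import random


def enrichment_score(scores, gene_set):
    """One-tailed, weighted running-sum score computed on |gene statistic|.

    scores   : dict mapping gene name -> signed gene statistic (assumed input)
    gene_set : iterable of gene names
    Returns the maximum positive deviation of the running sum.
    """
    gene_set = set(gene_set)
    abs_scores = {g: abs(s) for g, s in scores.items()}
    # Rank genes by absolute statistic, largest first, so that up- and
    # down-regulated genes both land at the top of the list.
    ranked = sorted(abs_scores, key=abs_scores.get, reverse=True)
    hits = [g in gene_set for g in ranked]
    n_hit = sum(hits)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    hit_total = sum(abs_scores[g] for g, h in zip(ranked, hits) if h)
    if hit_total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g, h in zip(ranked, hits):
        if h:
            running += abs_scores[g] / hit_total  # weighted step up on a hit
        else:
            running -= 1.0 / n_miss               # uniform step down on a miss
        best = max(best, running)                 # one-tailed: positive side only
    return best


def gene_permutation_pvalue(scores, gene_set, n_perm=1000, seed=0):
    """Gene-permuting null: compare against random gene-sets of equal size."""
    rng = random.Random(seed)
    genes = list(scores)
    members = set(gene_set) & set(genes)
    observed = enrichment_score(scores, members)
    exceed = 0
    for _ in range(n_perm):
        random_set = rng.sample(genes, len(members))
        if enrichment_score(scores, random_set) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data: 190 background genes plus a 10-gene set whose members are
    # strongly changed in both directions (mixed signs).
    toy = {f"bg{i}": rng.gauss(0.0, 1.0) for i in range(190)}
    toy.update({f"set{i}": (4.0 if i % 2 == 0 else -4.0) for i in range(10)})
    es, p = gene_permutation_pvalue(toy, [f"set{i}" for i in range(10)], n_perm=200)
    print(f"ES = {es:.3f}, p = {p:.4f}")
```

Because genes are ranked by the absolute statistic, a gene-set whose members change in opposite directions still accumulates at the top of the ranking, and only the positive side of the running sum is tested. The sketch does not model inter-gene correlation and is not meant to reproduce the comparisons reported in the paper.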

MeSH terms

  • Algorithms*
  • Animals
  • Computational Biology / methods*
  • Computer Simulation
  • Data Mining / methods
  • Databases, Genetic*
  • Gene Expression Profiling / methods*
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods
  • Reproducibility of Results
  • Sequence Analysis, RNA / methods*
  • Transcriptome*

Grants and funding

This work was supported by the genomics programs (NRF-2014M3C9A3068554 and 2014M3C9A3068555) of the National Research Foundation (NRF, http://www.nrf.re.kr/) of Korea, funded by the Ministry of Science, ICT, and Future Planning; by the KRIBB Research Initiative; and by the Basic Science Research Program through a National Research Foundation (NRF) grant funded by the Korean government (MOE) (2014R1A1A2056353). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.