Adjacent nucleotide dependence in ncRNA and order-1 SCFG for ncRNA identification

PLoS One. 2010 Sep 28;5(9):e12848. doi: 10.1371/journal.pone.0012848.

Abstract

Background: Non-coding RNAs (ncRNAs) are known to be involved in many critical biological processes, and identifying ncRNAs is an important task in biological research. The popular software package Infernal is the most successful prediction tool and exhibits high sensitivity. Its application has mainly focused on small suspected regions. We applied Infernal at the chromosome level; the results show high sensitivity, yet contain many false positives. Further enhancing Infernal for chromosome-level or genome-wide studies is therefore desirable.

Methodology: Based on the conjecture that adjacent nucleotide dependence affects the stability of the secondary structure of an ncRNA, we first conduct a systematic study on human ncRNAs and find that adjacent nucleotide dependence in human ncRNA should be useful for identifying ncRNAs. We then incorporate this dependence in the SCFG model and develop a new order-1 SCFG model for identifying ncRNAs.
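As an illustration of what "adjacent nucleotide dependence" can mean in practice, the following minimal sketch estimates the mutual information between each nucleotide and its right-hand neighbour in a sequence. It is not the authors' procedure or the order-1 SCFG itself; the function name `adjacent_dependence` and the choice of mutual information as the statistic are assumptions made here for illustration only.

```python
from collections import Counter
from math import log2


def adjacent_dependence(seq: str) -> float:
    """Mutual information (in bits) between a nucleotide and its right neighbour.

    Hypothetical helper for illustration; not taken from the paper.
    A value near 0 means adjacent positions look independent; larger
    values mean the neighbour carries information about the nucleotide.
    """
    # Normalise DNA-style input to the RNA alphabet.
    seq = seq.upper().replace("T", "U")
    # Keep only adjacent pairs made of unambiguous nucleotides.
    pairs = [(a, b) for a, b in zip(seq, seq[1:]) if a in "ACGU" and b in "ACGU"]
    if not pairs:
        return 0.0

    n = len(pairs)
    joint = Counter(pairs)                 # counts of (left, right) pairs
    left = Counter(a for a, _ in pairs)    # marginal counts of the left base
    right = Counter(b for _, b in pairs)   # marginal counts of the right base

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions.
        mi += p_ab * log2(p_ab * n * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    # A strictly alternating sequence has strong neighbour dependence,
    # while a homopolymer has none.
    print(adjacent_dependence("ACACACAC"))
    print(adjacent_dependence("AAAA"))
```

An order-0 model would treat each position as drawn from a single nucleotide distribution; a statistic like this one only indicates whether conditioning on the neighbouring nucleotide, as an order-1 model does, has any signal to exploit.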

Conclusions: In our experiments on human chromosomes, the proposed new model eliminates more than 50% of the false positives reported by Infernal while maintaining the same sensitivity. The executable and the source code of the programs are freely available at http://i.cs.hku.hk/~kfwong/order1scfg.

Publication types

  • Evaluation Study

MeSH terms

  • Base Sequence
  • Computational Biology / methods*
  • Humans
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Nucleotides / chemistry*
  • Nucleotides / genetics
  • RNA, Untranslated / chemistry*
  • RNA, Untranslated / genetics

Substances

  • Nucleotides
  • RNA, Untranslated