Incipient de novo genes can evolve from frozen accidents that escaped rapid transcript turnover

Nat Ecol Evol. 2018 Oct;2(10):1626-1632. doi: 10.1038/s41559-018-0639-7. Epub 2018 Sep 10.

Abstract

A recent surge of studies has suggested that many novel genes arise de novo from previously noncoding DNA rather than by duplication. However, most studies concentrated on longer evolutionary time scales and rarely considered protein structural properties. Therefore, it remains unclear how these properties are shaped by evolution, depend on genetic mechanisms and influence gene survival. Here we compare open reading frames (ORFs) from high-coverage transcriptomes of mouse and four other mammals, covering 160 million years of evolution. We find that novel ORFs pervasively emerge from noncoding regions but are rapidly lost again, whereas relatively few arise from the divergence of coding sequences, and these are retained much longer. We also find that a subset (14%) of the mouse-specific ORFs bind ribosomes and are potentially translated, showing that such ORFs can be the starting points of gene emergence. Surprisingly, disorder and other protein properties of young ORFs hardly change with gene age over short time frames. Only length and nucleotide composition change significantly. Thus, some transcribed de novo genes resemble 'frozen accidents' of randomly emerged ORFs that survived initial purging. This perspective is consistent with very recent studies indicating that some neutrally evolving transcripts containing random protein sequences may be translated and be viable starting points of de novo gene emergence.

MeSH terms

  • Animals
  • Dipodomys / genetics
  • Evolution, Molecular*
  • Humans
  • Mammals / genetics*
  • Mice / genetics
  • Monodelphis / genetics
  • Open Reading Frames / genetics*
  • Rats / genetics
  • Transcriptome / genetics*