Functionally enigmatic genes: a case study of the brain ignorome

PLoS One. 2014 Feb 11;9(2):e88889. doi: 10.1371/journal.pone.0088889. eCollection 2014.

Abstract

What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases--ELMOD1, TMEM88B, and DZANK1--we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Brain / metabolism*
  • Computational Biology
  • Databases, Genetic*
  • Female
  • Gene Expression Profiling / methods
  • Genome
  • Genomics
  • Humans
  • Male
  • Mice
  • Mice, Inbred C57BL
  • Mice, Inbred DBA
  • Neurosciences / trends
  • Phenotype
  • Polymorphism, Genetic
  • Research Design