The origin of conserved protein domains and amino acid repeats via adaptive competition for control over amino acid residues

J Mol Evol. 2010 Jan;70(1):29-43. doi: 10.1007/s00239-009-9305-7. Epub 2009 Dec 19.

Abstract

Some proteins, such as homeodomain transcription factors, contain highly conserved regions of sequence. It has recently been suggested that multiple functional domains overlap in the homeodomain, together explaining this high conservation. However, the question remains why so many functional domains cluster together in one relatively small and constrained region of the protein. Here we have modeled an evolutionary mechanism that can produce this kind of clustering: conserved functional domains are displaced from the parts of the molecule that are undergoing adaptive evolution because novel functions generally out-compete conserved functions for control over the identity of amino acid residues. We call this model COAA, for Competition Over Amino Acids. We also studied the evolution of amino acid repeats (a.k.a. homopeptides), which are especially prevalent in transcription factors. Repeats that are encoded by non-homogeneous mixtures of synonymous codons cannot be explained by replication slippage alone. Our model provides two explanations for their origin, maintenance, and over-representation in highly conserved proteins. We demonstrate that either competition between multiple functional domains for space within a sequence, or reuse of a sequence for many functions over time, can cause the evolution of amino acid repeats. Both of these processes are characteristic of multifunctional proteins such as homeodomain transcription factors. We conclude that the COAA model can explain two widely recognized features of transcription factor proteins: conserved domains and a tendency to accumulate homopeptides.
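The distinction between repeats encoded by a single codon and repeats encoded by a mixture of synonymous codons can be illustrated with a minimal sketch. This is not the authors' method; the function names (`split_codons`, `codon_usage`, `is_mixed_codon_run`) and the two example sequences are hypothetical, chosen only to show how one might tell the two cases apart for an in-frame alanine run.

```python
from collections import Counter

def split_codons(dna):
    """Split an in-frame DNA string into codons, dropping any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]

def codon_usage(run_dna):
    """Count how often each codon appears in the DNA encoding a homopeptide run."""
    return Counter(split_codons(run_dna))

def is_mixed_codon_run(run_dna):
    """Return True if the run is encoded by more than one distinct codon."""
    return len(codon_usage(run_dna)) > 1

# Hypothetical examples: both encode six alanines (GCN codons).
pure_run = "GCA" * 6                 # one codon repeated, as tandem duplication of a codon would give
mixed_run = "GCAGCCGCGGCTGCAGCC"     # four different synonymous alanine codons

print(is_mixed_codon_run(pure_run), dict(codon_usage(pure_run)))
print(is_mixed_codon_run(mixed_run), dict(codon_usage(mixed_run)))
```

A run like `pure_run` is the pattern expected when a codon tract expands by slippage, whereas a run like `mixed_run` is the kind of repeat the abstract says slippage alone cannot account for.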

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / genetics*
  • Animals
  • Computer Simulation
  • Conserved Sequence*
  • Evolution, Molecular*
  • Homeodomain Proteins / genetics
  • Protein Structure, Tertiary
  • Repetitive Sequences, Amino Acid*
  • Zebrafish / genetics

Substances

  • Amino Acids
  • Homeodomain Proteins
  • homeobox protein HOXA13