The phylogenetic informativeness of nucleotide and amino acid sequences for reconstructing the vertebrate tree

J Mol Evol. 2008 Nov;67(5):437-47. doi: 10.1007/s00239-008-9142-0. Epub 2008 Aug 12.

Abstract

To aid in future efforts to accurately reconstruct the vertebrate tree, a quantitative measure of phylogenetic informativeness was applied to nucleotide and amino acid sequences for a set of 11 genes. We identified orthologues and assembled published fossil-calibrated divergence times between taxa that had been sequenced for each gene. Rates of molecular evolution for each site were estimated to characterize the molecular evolutionary pattern of genes and to calculate the phylogenetic informativeness. The fast-evolving gene albumin yielded the highest informativeness over the period from 60 million years ago to 500 million years ago. In contrast, calmodulin yielded the lowest informativeness, presumably because functional constraint minimized substitutions in the amino acid sequence. The gene c-myc showed an intermediate level of informativeness. The nucleotide sequence of cytochrome b showed extremely high utility for recent epochs, but low utility for times before 100 million years ago. We ranked nine other genes for their utility during the epochs of the divergence of the muroid rodents, early placental mammals, early vertebrates, and early metazoa, yielding results consistent with, but more precise than, previous studies. Interestingly, DNA sequence exceeded amino acid sequence in informativeness over all time scales, yet support values were at best moderately higher. For epochs not subject to strong phylogenetic conflict due to convergence, we advocate exploiting the additional power of the threefold greater number of characters in DNA sequences rather than resorting to the less noisy but less informative amino acid sequences.
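The step from per-site rates to an informativeness profile can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the per-site measure is the one from Townsend's earlier work on profiling phylogenetic informativeness, 16 λ² T exp(-4 λ T) for a site with rate λ at time depth T, summed over sites. The abstract itself does not state the formula, and the function names and example rates below are hypothetical and are not data from the paper.

```python
import math


def site_informativeness(rate, t):
    """Informativeness of one site with substitution rate `rate` at depth `t`.

    Assumed form (not stated in the abstract): 16 * rate^2 * t * exp(-4 * rate * t).
    `rate` and `t` must use compatible units (e.g. substitutions per site per
    million years, and million years).
    """
    return 16.0 * rate ** 2 * t * math.exp(-4.0 * rate * t)


def gene_profile(rates, times):
    """Sum the per-site informativeness over all sites at each time in `times`."""
    return [sum(site_informativeness(r, t) for r in rates) for t in times]


def peak_time(rate):
    """Time depth at which a single site of the given rate is most informative.

    Setting the derivative of the assumed formula to zero gives t = 1 / (4 * rate).
    """
    return 1.0 / (4.0 * rate)


if __name__ == "__main__":
    # Hypothetical per-site rates (substitutions/site/Myr); illustrative only.
    slow_gene = [0.0002, 0.0003, 0.0005]
    fast_gene = [0.0020, 0.0030, 0.0050]
    times = [10, 50, 100, 250, 500]  # million years ago

    for name, rates in (("slow", slow_gene), ("fast", fast_gene)):
        profile = gene_profile(rates, times)
        print(name, [round(v, 5) for v in profile])
```

Under this assumed form, a fast site peaks at a shallow time depth and decays quickly, while a slow site peaks deeper in the tree, which is the qualitative pattern the abstract describes for cytochrome b nucleotides versus more slowly evolving sequences.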

MeSH terms

  • Algorithms
  • Animals
  • Calmodulin / genetics
  • Cytochromes b / genetics
  • Databases, Nucleic Acid*
  • Databases, Protein*
  • Evolution, Molecular*
  • Genetic Variation
  • Phylogeny*
  • Proto-Oncogene Proteins c-myc / genetics
  • Serum Albumin / genetics
  • Vertebrates / classification
  • Vertebrates / genetics*

Substances

  • Calmodulin
  • Proto-Oncogene Proteins c-myc
  • Serum Albumin
  • Cytochromes b