Evolution of the mammalian transcription factor binding repertoire via transposable elements

Genome Res. 2008 Nov;18(11):1752-62. doi: 10.1101/gr.080663.108. Epub 2008 Aug 5.

Abstract

Identification of lineage-specific innovations in genomic control elements is critical for understanding transcriptional regulatory networks and phenotypic heterogeneity. We analyzed, from an evolutionary perspective, the binding regions of seven mammalian transcription factors (ESR1, TP53, MYC, RELA, POU5F1, SOX2, and CTCF) identified on a genome-wide scale by different chromatin immunoprecipitation approaches and found that only a minority of sites appear to be conserved at the sequence level. Instead, we uncovered a pervasive association with genomic repeats by showing that a large fraction of the bona fide binding sites for five of the seven transcription factors (ESR1, TP53, POU5F1, SOX2, and CTCF) are embedded in distinctive families of transposable elements. Using the age of the repeats, we established that these repeat-associated binding sites (RABS) have been associated with significant regulatory expansions throughout the mammalian phylogeny. We validated the functional significance of these RABS by showing that they are over-represented in proximity to regulated genes and that the binding motifs within these repeats have undergone evolutionary selection. Our results demonstrate that transcriptional regulatory networks are highly dynamic in eukaryotic genomes and that transposable elements play an important role in expanding the repertoire of binding sites.
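
The association between binding sites and a repeat family described in the abstract can be illustrated with a small overlap count followed by an enrichment test. The sketch below is not the authors' pipeline: the interval data are toy values, the names `count_repeat_associated` and `binomial_upper_tail` are hypothetical, and the null model (each site falls in the repeat family with probability equal to an assumed genome fraction covered by that family) is an assumption chosen for simplicity.

```python
from math import comb

def overlaps(site, repeat):
    """True if two half-open intervals (chrom, start, end) overlap."""
    return site[0] == repeat[0] and site[1] < repeat[2] and repeat[1] < site[2]

def count_repeat_associated(sites, repeats):
    """Count binding sites that overlap at least one repeat interval."""
    return sum(any(overlaps(s, r) for r in repeats) for s in sites)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): one-sided enrichment p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Toy binding sites and repeat annotations (illustrative values only).
sites = [("chr1", 100, 120), ("chr1", 500, 520),
         ("chr1", 900, 920), ("chr2", 50, 70)]
repeats = [("chr1", 90, 400), ("chr1", 880, 1000), ("chr2", 300, 600)]

# Assumed fraction of the genome covered by this repeat family.
genome_fraction = 0.1

k = count_repeat_associated(sites, repeats)
p_value = binomial_upper_tail(k, len(sites), genome_fraction)
print(f"{k} of {len(sites)} sites are repeat-associated, p = {p_value:.4f}")
```

A real analysis would use an interval index rather than the quadratic scan shown here, and would likely need a background that accounts for mappability and the sequence composition of the assayed regions.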

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Binding Sites / genetics
  • Conserved Sequence
  • DNA / genetics
  • DNA / metabolism
  • DNA Transposable Elements / genetics*
  • Evolution, Molecular*
  • Humans
  • Mice
  • Repetitive Sequences, Nucleic Acid
  • Sequence Homology, Nucleic Acid
  • Transcription Factors / metabolism*

Substances

  • DNA Transposable Elements
  • Transcription Factors
  • DNA