NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Alberts B, Bray D, Lewis J, et al. Molecular Biology of the Cell. 3rd edition. New York: Garland Science; 1994.

The Organization and Evolution of the Nuclear Genome

Introduction

Much of evolutionary history is recorded in the genomes of present-day organisms and can be deciphered from a careful analysis of their DNA sequences. Tens of millions of DNA nucleotides have been sequenced thus far, and we can now see in outline how the genes coding for certain proteins have evolved over hundreds of millions of years. Studies of the occasional changes that occur in present-day chromosomes provide additional clues to the mechanisms that have brought about evolutionary change in the past. In this section we consider some of the general principles that have emerged from such molecular genetic studies, with emphasis on the organization and evolution of the nuclear genome in higher eucaryotes.

Genomes Are Fine-tuned by Point Mutation and Radically Remodeled or Enlarged by Genetic Recombination 45

DNA nucleotide sequences must be accurately replicated and conserved. In Chapter 6 we discussed the elaborate DNA-replication and DNA-repair mechanisms that enable DNA sequences to be inherited with extraordinary fidelity: only about one nucleotide pair in a thousand is randomly changed every 200,000 years. Even so, in a population of 10,000 individuals, every possible nucleotide substitution will have been "tried out" on about 50 occasions in the course of a million years, which is a short span of time in relation to the evolution of species. Much of the variation created in this way will be disadvantageous to the organism and will be selected against in the population. When a rare variant sequence is advantageous, however, it will be rapidly propagated by natural selection. Consequently, it can be expected that in any given species the functions of most genes will have been optimized by random point mutation and selection.
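The arithmetic behind the figure of about 50 trials can be checked directly. The sketch below is only a back-of-envelope calculation using the numbers quoted above; it assumes (as a simplification not stated in the text) a constant population in which each individual counts as one independent lineage and every site mutates at the same average rate.

```python
# Figures quoted in the text
changed_fraction = 1 / 1000   # about one nucleotide pair in a thousand changes...
interval_years = 200_000      # ...every 200,000 years
population = 10_000           # individuals in the population
span_years = 1_000_000        # period considered

# Average rate of change at a single site in a single lineage
rate_per_site_per_year = changed_fraction / interval_years

# Expected number of changes at one site, summed over the whole population
# (ASSUMPTION: each individual is treated as one independent lineage)
trials_per_site = rate_per_site_per_year * span_years * population

print(f"rate per site per year: {rate_per_site_per_year:.1e}")
print(f"expected changes per site in the population: {trials_per_site:.0f}")
```

Under these assumptions each site is changed roughly 50 times in the population over a million years, which is the order of magnitude given in the text.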

While point mutation is an efficient mechanism for fine-tuning the genome, evolutionary progress in the long term must depend on more radical types of genetic change. Genetic recombination causes major rearrangements of the genome with surprising frequency: the genome can expand or contract by duplication or deletion, and its parts can be transposed from one region to another to create new combinations. Component parts of genes - their individual exons and regulatory elements - can be shuffled as separate modules to create proteins that have entirely new roles. In addition, duplicated copies of genes tend to diverge by further mutation and become specialized and individually optimized for subtly different functions. By these means the genome as a whole can evolve to become increasingly complex and sophisticated. In a mammal, for example, multiple variant forms of almost every gene exist - different actin genes for the different types of contractile cells, different opsin genes for the perception of lights of different colors, different collagen genes for the different types of connective tissues, and so on. The expression of each gene is regulated according to its own precise and specific rules. Moreover, DNA sequencing studies reveal that many genes share related modular segments but are otherwise very different: common sequence motifs are frequently found in otherwise unrelated proteins.

Genetic recombination, whereby one chromosome exchanges genetic material with another, is fundamental to the creation of such families of genes and gene segments. In Chapter 6 we discussed the molecular mechanisms of both general recombination and site-specific recombination. Here we consider some of their effects on the genome.

Tandemly Repeated DNA Sequences Tend to Remain the Same 46

Gene duplications are usually attributed to rare accidents catalyzed by some of the enzymes that mediate normal recombination processes. Higher eucaryotes, however, contain an efficient enzymatic system that joins the two ends of a broken DNA molecule together, so that duplications (as well as inversions, deletions, and translocations of DNA segments) can also arise as a consequence of the erratic rejoining of fragments of chromosomes that have somehow become broken in more than one place. When duplicated DNA sequences are joined head to tail, they are said to be tandemly repeated. Once a single tandem repeat appears, it can be extended readily into a long series of tandem repeats by unequal crossover events between two sister chromosomes, inasmuch as the large amount of matching sequence provides an ideal substrate for general recombination (Figure 8-73). DNA duplication followed by sequential unequal crossing-over underlies DNA amplification, a process that often contributes to the formation of cancer by increasing the number of copies of genes (proto-oncogenes) that promote cancer (see Figure 24-27).

Figure 8-73. A family of tandemly repeated genes frequently loses and gains gene copies due to unequal crossing-over between sister chromosomes containing the genes. This type of event is frequent because the long regions of homologous DNA sequence are good substrates ...

Tandemly repeated genes both increase and decrease in number due to unequal crossing-over (see Figure 8-73). They therefore would be expected to be maintained by natural selection in large numbers only if the extra copies were beneficial to the organism. We have already discussed the hundreds of tandemly repeated genes that code for the vertebrate large ribosomal RNA precursor; these are needed to keep up with a growing cell's demand for new ribosomes. Similarly, vertebrates have clusters of tandemly repeated genes that encode other structural RNAs, including 5S rRNA and the U1 and U2 snRNAs, as well as clusters of repeated histone genes, which produce the large amounts of histones required during each S phase.

One might expect that in the course of evolution the sequences of the genes in a tandem array - and of the nontranscribed spacer DNA between them - would drift apart. With many copies of the same gene there should be little selection against random mutations that alter one or a few of the copies, and most nucleotide changes in the long nontranscribed spacer regions would have no functional consequence. In fact, however, the sequences of the tandemly repeated genes and their spacer DNAs are generally almost identical. Two mechanisms are thought to account for this. First, recurring unequal crossing-over events will cause the continued expansion and contraction of tandem arrays, and computer simulations show that this will tend to keep the sequences the same (Figure 8-74A). Second, related DNA sequences can become homogenized through gene conversion, the process whereby a portion of the DNA sequence is changed by copying a closely similar sequence present at a different site in the genome, as described in Chapter 6 (Figure 8-74B). Although gene conversion does not require that the genes be tandemly repeated, in higher eucaryotes it seems to occur mainly between genes that are close to each other.
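The first mechanism can be illustrated with a toy simulation. The sketch below is not the simulation referred to in the text; it is a minimal model under stated assumptions: each gene copy carries a label identifying its variant, each unequal exchange duplicates or deletes a block of one or two adjacent copies, and a fixed size window (a hypothetical stand-in for selection on copy number) rejects arrays that become too short or too long. New mutations are not modeled. The function names and parameters are illustrative only.

```python
import random

def unequal_crossover(array, rng, max_shift=2):
    """Return one product of an unequal exchange between two sister copies.

    The sisters are misaligned by `shift` repeat units, so the exchange
    yields either a product with `shift` duplicated units or one with
    `shift` deleted units; one of the two is returned at random.
    """
    n = len(array)
    shift = rng.randint(1, max_shift)
    i = rng.randint(0, n - shift)              # left edge of the affected block
    if rng.random() < 0.5:
        return array[:i + shift] + array[i:]   # block i..i+shift-1 duplicated
    return array[:i] + array[i + shift:]       # block i..i+shift-1 deleted

def simulate(n_copies=20, events=20_000, min_len=10, max_len=30, seed=0):
    rng = random.Random(seed)
    array = list(range(n_copies))   # every copy starts as a distinct variant
    for _ in range(events):
        candidate = unequal_crossover(array, rng)
        # ASSUMPTION: stand-in for selection on copy number,
        # rejecting arrays outside the allowed size window.
        if min_len <= len(candidate) <= max_len:
            array = candidate
    return array

final = simulate()
print("copies:", len(final), "distinct variants:", len(set(final)))
```

Because a deleted variant can never return while a surviving one can be duplicated again, the number of distinct variants in the array can only fall in this model, so repeated expansion and contraction tends to leave an array descended from few of the original copies. The exact outcome depends on the seed and on the chosen parameters.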

Figure 8-74. Two types of events that help to keep all DNA sequences in a tandem array very similar to one another. (A) The continual expansion and contraction of the number of gene copies in a tandem array caused by unequal crossing-over (see Figure 8-73) tends to ...

The movement of one gene copy in a tandem array to a new chromosomal location will protect it from both of the above homogenizing influences. Thus, in higher eucaryotes accidental gene translocation is an important step in the evolution of new genes: it allows the translocated DNA sequence to begin to evolve independently, so that it can acquire new functions that might benefit the organism.

The Evolution of the Globin Gene Family Shows How Random DNA Duplications Contribute to the Evolution of Organisms 47

In addition to generating a number of sets of tandemly repeated genes, DNA duplications have played a more important general role in the evolution of new proteins. The globin gene family provides a good example because its evolutionary history has been worked out particularly well. The unmistakable homologies in amino acid sequence and structure among the present-day globin genes indicate that they all must derive from a common ancestral gene, even though some now occupy widely separated locations in the mammalian genome.

We can reconstruct some of the past events that produced the various types of oxygen-carrying hemoglobin molecules by considering the different forms of the protein in organisms at different levels on the phylogenetic scale. A molecule like hemoglobin was necessary to allow multicellular animals to grow to a large size, since large animals could no longer rely on the simple diffusion of oxygen through the body surface to oxygenate their tissues adequately. Consequently, hemoglobin-like molecules are found in all vertebrates and in many invertebrates. The most primitive oxygen-carrying molecule in animals is a globin polypeptide chain of about 150 amino acids, which is found in many marine worms, insects, and primitive fish. The hemoglobin molecule in higher vertebrates, however, is composed of two kinds of globin chains. It appears that about 500 million years ago, during the evolution of higher fish, a series of gene mutations and duplications occurred. These events established two slightly different globin genes, coding for the α- and β-globin chains in the genome of each individual. In modern higher vertebrates each hemoglobin molecule is a complex of two α chains and two β chains (Figure 8-75). The four oxygen binding sites in the α₂β₂ molecule interact, allowing a cooperative allosteric change in the molecule as it binds and releases oxygen, which enables hemoglobin to take up and to release oxygen more efficiently than the single-chain version.

Figure 8-75. A comparison of the structure of one-chain and four-chain globins. The four-chain globin shown is hemoglobin, which is a complex of two α- and two β-globin chains. The one-chain globin in some primitive vertebrates forms a dimer that dissociates ...

Still later, during the evolution of mammals, the β-chain gene apparently underwent mutation and duplication to give rise to a second β-like chain that is synthesized specifically in the fetus. The resulting hemoglobin molecule has a higher affinity for oxygen than adult hemoglobin and thus helps in the transfer of oxygen from the mother to the fetus. The gene for the new β-like chain subsequently mutated and duplicated again to produce two new genes, ε and γ, the ε chain being produced earlier in development (to form α₂ε₂) than the fetal γ chain, which forms α₂γ₂ (see Figure 9-52). A duplication of the adult β-chain gene occurred still later, during primate evolution, to give rise to a δ-globin gene and thus to a minor form of hemoglobin (α₂δ₂) found only in adult primates (Figure 8-76). Each of these duplicated genes has been modified by point mutations that affect the properties of the final hemoglobin molecule, as well as by changes in regulatory regions that determine the timing and level of expression of the gene.

Figure 8-76. An evolutionary scheme for the globin chains that carry oxygen in the blood of animals. The scheme emphasizes the β-like globin gene family. A relatively recent gene duplication of the γ-chain gene produced γG and γA, ...

The end result of the gene duplication processes that have given rise to the diversity of globin chains is seen clearly in the genes that arose from the original β gene, which are arranged as a series of homologous DNA sequences located within 50,000 nucleotide pairs of one another. A similar cluster of α-globin genes is located on a separate human chromosome. Because the α- and β-globin gene clusters are on separate chromosomes in birds and mammals but are together in the frog Xenopus, it is believed that a translocation event separated the two genes about 300 million years ago (see Figure 8-76). As previously discussed, such translocations probably help stabilize duplicated genes with distinct functions by protecting them from the homogenizing processes that act on closely linked genes of similar DNA sequence (see Figure 8-74).

There are several duplicated globin DNA sequences in the α- and β-globin gene clusters that are not functional genes. They are examples of pseudogenes. These have a close homology to the functional genes but have been disabled by mutations that prevent their expression. The existence of such pseudogenes should not be surprising since not every DNA duplication would be expected to lead to a new functional gene. Moreover, nonfunctional DNA sequences are not rapidly discarded, as indicated by the large excess of noncoding DNA in mammalian genomes, discussed previously.

A great deal of our evolutionary history will be discernible in our chromosomes once the DNA sequences of many gene families have been compared in different animals (see also Figure 7-4).

Genes Encoding New Proteins Can Be Created by the Recombination of Exons 45

The role of DNA duplication in evolution is not confined to the generation of gene families. It can also be important in generating new single genes. The proteins encoded by genes generated in this way can be recognized by the presence of repeating, similar protein domains, which are covalently linked to one another in series. The immunoglobulins (Figure 8-77) and albumins, for example, as well as most fibrous proteins (such as the collagens) are encoded by genes that have evolved by repeated duplications of a primordial DNA sequence.

Figure 8-77. Schematic view of an antibody (immunoglobulin) molecule. This molecule is a complex of two identical heavy chains and two identical light chains. Each heavy chain contains four similar, covalently linked domains, while each light chain contains two such ...

In genes that have evolved in this way, as well as in many other genes, each separate exon often encodes an individual protein folding unit, or domain. It is believed that the organization of DNA coding sequences as a series of such exons separated by long introns has greatly facilitated the evolution of new proteins. The duplications necessary to form a single gene coding for a protein with repeating domains, for example, can occur by breaking and rejoining the DNA anywhere in the long introns on either side of an exon encoding a useful protein domain; without introns there would be only a few sites in the original gene at which a recombinational exchange between sister DNA molecules could duplicate the domain. By enabling the duplication to occur at many potential recombination sites rather than at just a few, introns increase the probability of a favorable duplication event.

For the same reason, the presence of introns greatly increases the probability that a chance recombination event will generate a functional hybrid gene by joining two initially separated DNA sequences that code for different protein domains in such a way that both domains are preserved in the new protein that the hybrid gene encodes (see Figure 8-81, for example). The presumed results of such recombinations are seen in many present-day proteins (see Figure 3-43). Thus the large separation between the exons encoding individual domains in higher eucaryotes is thought to accelerate the process by which random genetic-recombination events generate useful new proteins. This could help to explain the successful evolution of these very complex organisms.

Figure 8-81. An example of the exon shuffling that can be caused by transposable elements. When two elements of the same type (red DNA) happen to insert near each other in a chromosome, the transposition mechanism may occasionally use the ends of two different elements ...

Most Proteins Probably Originated from Highly Split Genes 48

The discovery in 1977 of genes split up by introns was unexpected. Previously all genes analyzed in detail were bacterial genes, which lack introns. Bacteria also lack nuclei and internal membranes and have smaller genomes than eucaryotic cells, and traditionally they were considered to resemble the simpler cells from which eucaryotic cells must have been derived. Not surprisingly, most biologists initially assumed that introns were a bizarre and late evolutionary addition to the eucaryotic line. It now seems likely, however, that split genes are the ancient condition and that bacteria lost their introns only after most of their proteins had evolved.

The idea that introns are very old is consistent with current concepts of protein evolution by the trial-and-error recombination of separate exons that encode distinct protein domains. Moreover, evidence for the ancient origin of introns has been obtained by examination of the gene that encodes the ubiquitous enzyme triosephosphate isomerase. Triosephosphate isomerase has an essential role in the metabolism of all cells, catalyzing the interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate - a central step in glycolysis and gluconeogenesis (see Figure 2-21). By comparing the amino acid sequence of this enzyme in various organisms, it is possible to deduce that the enzyme evolved before the divergence of procaryotes and eucaryotes from a common ancestor; the human and bacterial amino acid sequences are 46% identical. The gene encoding the enzyme contains six introns in vertebrates (chickens and humans), and five of these are in precisely the same positions in maize. This implies that these five introns were present in the gene before plants and animals diverged in the eucaryotic lineage, an estimated 10⁹ years ago (Figure 8-78).

Figure 8-78. The ancient origin of split genes. (A) A comparison of the exon structure of the triosephosphate isomerase gene in plants and animals. The intron positions that are identical in maize (corn) and vertebrates are marked with green arrowheads, while the intron ...

In general, small unicellular organisms are under a strong selection pressure to reproduce by cell division at the maximum rate permitted by the levels of nutrients in the environment. To do this, they must minimize the amount of unnecessary DNA that they have to synthesize in each cell-division cycle. For larger organisms that live by predation, where size is an advantage, and for multicellular organisms in general, where rates of cell division are constrained by other requirements, there will not be such strong selection pressure to eliminate superfluous DNA from the genome. This argument may help to explain why bacteria should have lost their introns while eucaryotes have retained them. It also helps to explain why the multicellular fungus Aspergillus has five introns in its triosephosphate isomerase gene, whereas its unicellular relative, the yeast Saccharomyces, has none.

What is the mechanism by which introns are lost? Precise loss of introns would occur only rarely by piecemeal random deletions of short segments of DNA, yet precise and selective loss of introns seems not uncommon in eucaryotic cells (and perhaps was also frequent in the ancestors of bacteria). Whereas most vertebrates contain only a single insulin gene with two intron sequences, rats, for example, contain a second, neighboring insulin gene with only one intron. The second gene apparently arose by gene duplication relatively recently and subsequently lost one of its introns. Because intron loss requires the exact rejoining of DNA coding sequences, the most likely source of the information needed for such an event is an mRNA transcript of the original gene, from which the intron sequences will have been precisely removed. We know that messenger RNAs may be copied back into DNA through the activity of reverse transcriptases (see p. 282), and it is thought that recombination enzymes on occasion allow these DNA copies to become paired with the original sequence, which is then "corrected" to an intronless form by a gene-conversion type of event. This pathway of intron loss has been demonstrated in the laboratory using the powerful genetic tools available in the yeast S. cerevisiae.

Reverse transcriptases are not needed for the central genetic pathways, but they are produced in cells by specific transposable elements (see Table 8-4) as well as by all retroviruses. The generation of DNA copies of segments of the genome by reverse transcription has contributed in several ways to the evolution of the genomes of higher organisms, as we discuss later.

Table 8-4. Three Major Families of Transposable Elements.

A Major Fraction of the DNA of Higher Eucaryotes Consists of Repeated, Noncoding Nucleotide Sequences 49

Eucaryotic genomes contain not only introns but also large numbers of copies of other seemingly nonessential DNA sequences that do not code for protein. The presence of such repeated DNA sequences in higher eucaryotes was first revealed by a hybridization technique that measures the number of gene copies. In this procedure the genome is broken mechanically into short fragments of DNA double helix about 1000 nucleotide pairs long, and the fragments are then denatured to produce DNA single strands. The speed with which the single-stranded fragments in the mixture reanneal under conditions in which the double-helical conformation is stable depends on how long it takes each fragment to find a complementary fragment to pair with, which in turn depends on the concentration of suitable fragments in the mixture. For the most part, the reaction is very slow. The haploid genome of a mammalian cell, for example, is represented by about 6 million different 1000-nucleotide-long DNA fragments, and any fragment whose sequence is present in only one copy must randomly collide with 6 million noncomplementary strands for every complementary partner strand that it happens to find.

When the DNA from a human cell is analyzed in this way under conditions that require near perfect matching (high stringency conditions, see Figure 7-17), about 70% of the DNA strands reanneal as slowly as one would expect for a large collection of unique (nonrepeated) DNA sequences, requiring days for complete annealing. But most of the remaining 30% of the DNA strands anneal much more quickly. These strands contain sequences that are repeated many times in the genome, and they thus collide with a complementary partner relatively rapidly. Most of these highly repeated DNA sequences do not encode proteins, and they are of two types: about one-third are the tandemly repeated satellite DNAs, to be discussed next; the rest are interspersed repeated DNAs. As we shall see, most of the latter DNAs derive from a few transposable DNA sequences that have multiplied to especially high copy numbers in the human genome.
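The dependence of reannealing speed on copy number can be made concrete with a simple kinetic sketch. The text gives no rate law; the code below assumes ideal second-order kinetics, in which the fraction of a sequence remaining single-stranded after time t is 1 / (1 + k·c·t), where c is the concentration of strands carrying that sequence. The rate constant, the concentration units, and the copy number of 1000 are hypothetical values chosen only for illustration.

```python
def fraction_single_stranded(t, k, conc):
    """Ideal second-order reannealing (ASSUMED model): 1 / (1 + k*conc*t)."""
    return 1.0 / (1.0 + k * conc * t)

def half_time(k, conc):
    """Time at which half of the strands of a sequence have reannealed."""
    return 1.0 / (k * conc)

k = 1.0                # hypothetical rate constant, arbitrary units
unique_conc = 1.0      # relative concentration of a single-copy sequence
repeat_copies = 1000   # hypothetical copy number of a repeated sequence
repeat_conc = unique_conc * repeat_copies

t_unique = half_time(k, unique_conc)
t_repeat = half_time(k, repeat_conc)

print("half-time, single-copy sequence:", t_unique)
print("half-time, repeated sequence:   ", t_repeat)
print("speed-up factor:", t_unique / t_repeat)
```

In this idealized model the half-time is inversely proportional to copy number, which is why the repeated fraction of the genome separates so clearly from the unique fraction in a reannealing experiment.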

Satellite DNA Sequences Have No Known Function 50

The most rapidly annealing DNA strands in an experiment of the type just described usually consist of very long tandem series of repetitions of a short nucleotide sequence (Figure 8-79). The repeat unit in a sequence of this type may be composed of only one or two nucleotides, but most repeats are longer, and in mammals they are typically composed of variants of a short sequence organized into a repeat of a few hundred nucleotides. These tandem repeats of a simple sequence are called satellite DNAs because the first DNAs of this type to be discovered had an unusual ratio of nucleotides that made it possible to separate them by density-gradient centrifugation from the bulk of the cell's DNA as a minor component (or "satellite"). Satellite DNA sequences generally are not transcribed and are located most often in the heterochromatin associated with the centromeric regions of chromosomes. In some mammals a single type of satellite DNA sequence constitutes 10% or more of the DNA and may even occupy a whole chromosome arm, so that the cell contains millions of copies of the basic repeated sequence.

Figure 8-79. Satellite DNA. A simple satellite DNA sequence from Drosophila is shown. It consists of many serially arranged repetitions of a sequence seven nucleotide pairs long, and it occurs millions of times in the Drosophila haploid genome.

Satellite DNA sequences seem to have changed unusually rapidly and even to have shifted their positions on chromosomes in the course of evolution. When two homologous mitotic chromosomes of any human are compared, for example, some of the satellite DNA sequences usually are found arranged in a strikingly different manner on the two chromosomes. Moreover, in contrast to the high degree of conservation of DNA sequences elsewhere in the genome, generally there are marked differences in the satellite DNA sequences of two closely related species. No function has yet been found for satellite DNA sequences: tests designed to demonstrate a role in chromosome pairing or nuclear organization have failed thus far to reveal any evidence for such a role. It has therefore been suggested that they are an extreme form of "selfish DNA" sequences, whose properties ensure their own retention in the genome but which do nothing to help the survival of the cells containing them. Other sequences that are commonly viewed as selfish are the transposable elements, which we discuss next.

The Evolution of Genomes Has Been Accelerated by Transposable Elements 51

Genomes generally contain many varieties of transposable elements. These segments of DNA were first discovered in maize, where several have been sequenced and characterized. Eucaryotic transposable elements have been studied most extensively in Drosophila, where more than 30 varieties are known, varying in length between 2000 and 10,000 nucleotide pairs; most are present in 5 to 10 copies per diploid cell.

At least three broad classes of transposable elements can be distinguished by the peculiarities of their sequence organization (Table 8-4). Some elements move from place to place within chromosomes directly as DNA, while many others move via an RNA intermediate, as described in Chapter 6. In either case they can multiply and spread from one site in a genome to a multitude of other sites, sometimes behaving as disruptive parasites.

Transposable elements seem to make up at least 10% of higher eucaryotic genomes. Although most of these elements move only very rarely, there are so many elements that their movement has a major effect on the variability of a species. More than half of the spontaneous mutations examined in Drosophila, for example, are due to the insertion of a transposable element in or near the mutant gene.

Mutations can occur either when an element inserts into a gene or when it exits to move elsewhere. All known transposable elements cause a short "target-site duplication" because of their mechanism of insertion (see Figure 6-70); when they exit, they generally leave behind part of this duplication - often with other local sequence changes as well (Figure 8-80). Thus, as transposable elements move in and out of chromosomes, they cause a variety of short additions and deletions of nucleotide sequences.
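The net effect of insertion and excision on the chromosomal sequence can be sketched with plain strings. This is a toy model, not a description of the enzymatic mechanism: it treats the chromosome as a single-stranded string, and the function names, the example sequences, and the `keep` parameter (the number of extra bases left behind on excision) are all hypothetical.

```python
def insert_element(chromosome, element, pos, tsd_len):
    """Insert `element` at `pos`, so that the tsd_len-base target site
    ends up present on both sides of the element."""
    target = chromosome[pos:pos + tsd_len]
    return (chromosome[:pos] + target + element + target
            + chromosome[pos + tsd_len:])

def excise_element(inserted, element, tsd_len, keep):
    """Remove `element`, leaving `keep` bases of the second target-site
    copy behind as a footprint (keep=0 restores the original sequence)."""
    start = inserted.index(element)
    left = inserted[:start]                   # ends with the first target copy
    right = inserted[start + len(element):]   # begins with the second copy
    return left + right[tsd_len - keep:]

chromosome = "AAGGCTTACCGG"   # hypothetical host sequence (uppercase)
element = "acgtacgtacgt"      # hypothetical element (lowercase, for contrast)

inserted = insert_element(chromosome, element, pos=4, tsd_len=5)
print(inserted)      # CTTAC now flanks the element on both sides

footprint = excise_element(inserted, element, tsd_len=5, keep=2)
print(footprint)     # original sequence plus two extra bases
```

With these example values the excised chromosome is two bases longer than the original, a short addition of the kind described above; choosing `keep=0` models a precise excision.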

Figure 8-80. Some changes in chromosomal DNA sequences caused by transposable elements. The insertion of a transposable element always produces a short target-site duplication of the chromosomal sequence, which is generally 3 to 12 nucleotide pairs in length depending ...

Transposable elements have also contributed to genome diversity in another way. When two transposable elements that are recognized by the same site-specific recombination enzyme (transposase) integrate into neighboring chromosomal sites, the DNA between them can become subject to transposition by the transposase. Because this provides a particularly effective pathway for the duplication and movement of exons (exon shuffling), these elements can help to create new genes (Figure 8-81).

Transposable Elements Often Affect Gene Regulation 52

A DNA sequence rearrangement caused by a transposable element is often observed to alter the timing, level, or spatial pattern of expression of a nearby gene without affecting the sequence of the protein or RNA molecule that the gene encodes. This can change a subtle aspect of animal or plant development, such as the shape of an eye or a flower. While most of these changes in gene regulation would be expected to be detrimental to an organism, some of them will bring benefits and therefore tend to spread through the population by natural selection.

Effects on gene regulation are common, partly because the movement of a transposable element will generally bring with it new sequences that act as binding sites for sequence-specific DNA-binding proteins, including a transposase and the proteins that regulate the transcription of the transposable element DNA. These sequences can thereby act as regulatory sequences called enhancers (see p. 422) to affect the transcription of nearby genes. Similar effects commonly contribute to the evolution of cancer cells, where oncogenes can be created by the transposition of such regulatory sequences into the neighborhood of a proto-oncogene, as we discuss in Chapter 24.

The organization of higher eucaryotic genomes, with long noncoding DNA sequences interspersed with comparatively short coding sequences, provides an accommodating "playground" for the integration and excision of mobile DNA sequences. Because gene transcription can be regulated from distances that are tens of thousands of nucleotide pairs away from a promoter, many of the resulting changes in the genome would be expected to affect gene expression; by contrast, relatively few would be expected to disrupt the short exons that contain the coding sequences.

Might the vast excess of noncoding DNA in higher eucaryotes have been favored by selection during evolution because of the regulatory flexibility that it has provided to organisms with a large variety of transposable elements? What is known about the regulatory systems that control higher eucaryotic genes is consistent with this possibility. Enhancers, like exons, seem to function as separate modules, and the activity of a gene depends on a summation of the influences received at its promoter from a set of enhancers (see Figure 9-44). Transposable elements, by moving such enhancer modules around in a genome, may allow gene regulation to be optimized for the long-term survival of the organism.

Transposition Bursts Cause Cataclysmic Changes in Genomes and Increase Biological Diversity 53

Another unique feature that distinguishes transposable elements as mutagens is their tendency to undergo long quiescent periods, during which they remain fixed in their chromosomal positions, followed by a period of intense movement. Their transposition, and therefore their mutagenic action, is activated from time to time in a few individuals in a population of organisms. Such cataclysmic changes in genomes, called transposition bursts, can involve near simultaneous transpositions of several types of transposable elements. Transposition bursts were first observed in developing maize plants that were subjected to repeated chromosome breakage. They also are observed in crosses between certain strains of flies - a phenomenon known as hybrid dysgenesis. When they occur in the germ line, they induce multiple changes in the genome of an individual progeny fly or plant.

By simultaneously changing several properties of an organism, transposition bursts increase the probability that two new traits that are useful together but of no selective value by themselves will appear in a single individual in a population. In several types of plants there is evidence that transposition bursts can be activated by a severe environmental stress, generating a variety of randomly modified progeny organisms, some of which may be better suited than the parent to survive in the new conditions. It seems that, at least in these plants, a mechanism has evolved to activate transposable elements to serve as mutagens that produce an enhanced range of variant organisms when this variation is most needed. Thus transposable elements are not necessarily just disruptive parasites; rather, they may on occasion act as useful symbionts that aid the long-term survival of the species whose genomes they inhabit.

About 10% of the Human Genome Consists of Two Families of Transposable Elements 54

Primate DNA is unusual in at least one respect: it contains a remarkably large number of copies of two transposable DNA sequences that seem to have overrun our chromosomes. Both of these sequences move by an RNA-mediated process that requires a reverse transcriptase. One is the L1 transposable element, which resembles the F element in Drosophila and the Cin4 element in maize and encodes a reverse transcriptase (see Table 8-4, p. 392). Transposable elements have generally evolved with feedback control systems that severely limit their numbers in each cell (thereby saving the cell from potential disaster); the L1 element in humans, however, constitutes about 4% of the mass of the genome.

Even more abundant is the Alu sequence, which is very short (about 300 nucleotide pairs) and moves like a transposable element, creating target-site duplications when it inserts. It was derived, however, from an internally deleted host-cell RNA gene (7SL), which encodes the RNA component of the signal-recognition particle (SRP) that functions in protein synthesis (see Figure 12-39); it is therefore not clear whether the Alu sequence should be considered a transposable element or an unusually mobile pseudogene. It is present in about 500,000 copies in the haploid genome and constitutes about 5% of human DNA; thus it is present on average about once every 5000 nucleotide pairs. The Alu DNA is transcribed from the 7SL RNA promoter, a polymerase-III promoter that is internal to the transcript, so that it carries the information necessary for its own transcription wherever it moves. It needs a reverse transcriptase encoded elsewhere, however, to transpose.
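The copy number, length, and genome fraction quoted above for the Alu sequence can be checked against one another with simple arithmetic. The sketch below does this in Python. It assumes a haploid human genome of about 3 × 10^9 nucleotide pairs, a figure that is not given in this section; the constant and function names are illustrative only.

```python
# Back-of-the-envelope check of the Alu figures quoted in the text.
# ASSUMPTION: a haploid human genome of about 3e9 nucleotide pairs
# (this value is not stated in this section).
HAPLOID_GENOME_BP = 3_000_000_000  # assumed genome size
ALU_COPIES = 500_000               # copies per haploid genome (from the text)
ALU_LENGTH_BP = 300                # approximate Alu length (from the text)
L1_FRACTION = 0.04                 # L1 share of the genome (from the text)


def alu_fraction(copies=ALU_COPIES, length_bp=ALU_LENGTH_BP,
                 genome_bp=HAPLOID_GENOME_BP):
    """Fraction of the genome occupied by Alu sequences."""
    return copies * length_bp / genome_bp


def mean_spacing_bp(copies=ALU_COPIES, genome_bp=HAPLOID_GENOME_BP):
    """Average distance between Alu copies if they were evenly spread."""
    return genome_bp / copies


if __name__ == "__main__":
    frac = alu_fraction()
    print(f"Alu fraction of genome: {frac:.1%}")
    print(f"Mean spacing: {mean_spacing_bp():,.0f} nucleotide pairs")
    print(f"L1 + Alu combined: {frac + L1_FRACTION:.0%}")
```

Under the assumed genome size, 500,000 copies of 300 nucleotide pairs come to 5% of the genome, and the average spacing works out to 6000 nucleotide pairs, the same order of magnitude as the rounded "about once every 5000" given above. Adding the 4% contributed by L1 gives roughly the 10% named in the section heading.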

Comparisons of the sequence and locations of the L1- and Alu-like sequences in different mammals suggest that these sequences have multiplied to high copy numbers relatively recently (Figure 8-82). It is hard to imagine that these highly abundant sequences scattered throughout our genome have not had major effects on the expression of many nearby genes. How many of our uniquely human qualities, for example, do we owe to these parasitic elements?

Figure 8-82. The proposed pattern of evolution of the abundant Alu sequence found in the human genome. A related transposable element, B1, is found in the mouse genome. Both of these transposable DNA sequences are thought to have evolved from the essential 7SL RNA ...

Summary

The functional DNA sequences in the genomes of higher eucaryotes appear to be constructed from small genetic modules of at least two kinds. Modules of coding sequence are combined in many ways to produce proteins, whereas modules of regulatory sequences are scattered throughout long stretches of noncoding sequences and regulate the expression of genes. Both the coding sequences and the regulatory sequences are typically present in modules that are less than a few hundred nucleotide pairs long, which together account for only a small proportion of the total DNA.

A variety of genetic-recombination processes occur in genomes, causing the random duplication and translocation of DNA sequences. Some of these changes create duplicates of entire genes, which can then evolve new functions. Others produce new proteins by shuffling exons or alter the expression of old genes by exposing them to new regulatory sequences. This type of DNA sequence shuffling, which is of great importance for the evolution of organisms, is greatly facilitated by the split structure of higher eucaryotic genes and by the fact that these genes are often controlled by distant regulatory sequences.

Many types of transposable elements are present in genomes. Collectively, they constitute more than 10% of the mass of both Drosophila and vertebrate genomes. Occasionally, transposition bursts occur in germ cells and cause many heritable changes in gene expression in the same individual. Transposable elements are thought to have had a special evolutionary role in the generation of organismal diversity.


Copyright © 1994, Bruce Alberts, Dennis Bray, Julian Lewis, Martin Raff, Keith Roberts, and James D Watson.
Bookshelf ID: NBK28308