Open Access: This content is Open Access under the Creative Commons license CC-BY-NC-ND.
Mattick J, Amaral P. RNA, the Epicenter of Genetic Information: A new understanding of molecular biology. Abingdon (UK): CRC Press; 2022 Sep 20. doi: 10.1201/9781003109242-2
The transition from inorganic to organic chemistry in the late 19th century added proteins and nucleic acids, alongside sugars, starches, woods and fats, to the portfolio of components of living organisms. The disciplines of biochemistry and molecular biology emerged in the 20th century with the elucidation of metabolic and biosynthetic pathways and the demonstration that proteins are enzymes. Microbial genetics focused on metabolic enzymes led to the conclusion that genes are synonymous with proteins. Proteins were thought to be the central molecules of life and the active components of the chromosomes that transmitted genetic information; the other main component, DNA, seemed too simple and was thought to act simply as a scaffold. Genetic mapping developed along with the concept of ‘genes’ as discrete entities. Theoretical mathematical biology flourished, influenced by microbial genetics and the ‘lethality’ of mutations. Darwinian evolution and Mendelian inheritance were integrated in the ‘Modern Synthesis’, which asserted that mutations are random and that Lamarckian inheritance of experience does not occur. DNA and RNA were initially confused, but the former was found to be mainly located in the nucleus and the latter in the cytoplasm. DNA was eventually shown to contain the genetic information by transforming the phenotype of bacteria, but this finding was slow to be accepted until the elucidation of the structure of DNA with its evident ability to be replicated according to the rules of base pairing and to encode information in the sequence of its nucleotides.
For most of human history, the nature of matter, and life, was a subject of speculation. Famously, in the 4th century BCE, the Greek philosopher Aristotle cemented the ruminations of his predecessor Empedocles to assert that all matter – organic and inorganic – was composed of four elements – fire, air, water and earth – the ratio of which determined its properties. Aristotle also asserted that these elements are interchangeable but did not accept the deduction of his predecessors, Leucippus and Democritus, that all matter could be reduced to indivisible particles, ‘atomos’ or atoms. Aristotle’s views held sway for two millennia.
In 1773, Joseph Priestley showed that heating of mercury ‘calx’ (mercuric oxide) not only produced liquid mercury, as had been known for thousands of years, but also a gas that caused candles to burn brightly. It was also known that treating metals with acids produced another, highly flammable, gas, termed ‘phlogiston’. In 1778, Antoine Lavoisier mixed these two gases, added a lighted match and observed that they combined to form water. He named Priestley’s gas ‘oxygen’ (from the Greek, meaning acid-maker) and re-named phlogiston ‘hydrogen’ (water-maker). He also burned other substances such as phosphorus and sulfur and showed that they combined with air to make new materials, ‘compounds’, gaining weight in the process, then in 1789 published a list of 33 chemical elements, grouping them into gases, metals, nonmetals and earths. 1
In 1794, Joseph Proust proposed that compounds have defined chemical formulas, 2 and in 1803 John Dalton proposed the atomic theory of matter, 3 leading in the following decades to the identification and prediction of many other atomic ‘elements’, the development of simple nomenclature (H for hydrogen, O for oxygen, C for carbon, N for nitrogen, S for sulfur, P for phosphorus and so on), the distinction by Amedeo Avogadro in 1811 of atoms and combinations thereof (‘molecules’), 4 , 5 and the development of the Periodic Table in 1869 independently by Dmitri Mendeleev and Julius Meyer. 6
Such was the time that chemists started to explore the nature of matter in biology. The geneticists and physicists came later.
Sugars and Fats
Sugars, starches, oils and fats derived from plants and animals had of course been known for eons and used for nutrition, cooking and other practical applications.
In 1789, François Poulletier de La Salle and Michel Chevreul described a substance that could be extracted by alcohol from bile stones and named it ‘cholesterine’ (cholesterol) from the Greek ‘chole’, meaning bile and ‘stereos’, meaning solid. 7 In 1815, Henri Braconnot classified fats into ‘suifs’ (greases) and ‘huiles’ (fluid oils). 8 In the next decade, Chevreul developed a more detailed classification, encompassing greases, tallow, waxes, resins, oils and volatile oils, among others. 9 In 1847, Theodore Gobley isolated phospholipids from brains and egg yolks. 10 The identification of more complex forms, such as glycolipids and sphingolipids, came later as did the generic term ‘lipid’, coined in the 1920s by Gabriel Bertrand from the Greek ‘lipos’ (fat). 11
Also in 1789, Lavoisier determined that sugar is composed of carbon, hydrogen and oxygen and that the fermentation of sugar by yeast produced ethanol and carbon dioxide, long exploited in brewing and baking. 1 , 12 In 1833, Anselme Payen and Jean-François Persoz discovered the first enzyme activity (‘diastase’), 13 and in 1839 Payen coined the term ‘cellulose’ from the French word ‘cellule’ for cell. 14 , 15
In 1857, Claude Bernard isolated and introduced the term ‘glycogen’ for the starch-like substance stored in the livers of mammals. 16 The term ‘carbohydrate’ (French ‘hydrate de carbone’) also originated around this time to describe high molecular weight chains of simple sugars such as glucose, whose composition could be expressed generally as Cn(H2O)n.
Proteins: ‘The Locus of Life’
In 1828, Friedrich Wöhler produced urea from ammonium cyanate, which showed that inorganic molecules can be converted into organic compounds 17 and demolished the widely held belief that the latter could only be produced through some sort of ‘vitalism’. 18 In 1839, Gerhardus Mulder described substances (fibrin, egg albumin and gelatin) that contain large amounts of C, H, O and N, and also S and P, which he considered ‘the most essential substances of the animal kingdom’. 19 , 20 He shared the results with the chemist Jöns Berzelius, who suggested that these substances be called ‘proteins’, from the Greek πρωτειος (‘primarium’ or first). 19 , 21 , 22
Mulder’s work also showed that proteins from animals and plants (which played a ‘principal role in their economy’) shared a similar, but varied, atomic composition. 20 Their molecular nature, however, remained nebulous for decades. 23 , 24 Although proteins initially represented more a concept than a defined chemical entity, it became gradually accepted that, as major constituents of all organisms, they were central to the processes of life. Indeed, the ‘colloidal nature’ a of the substances of living tissues, at the time popularly described as the ‘protoplasm’, 20 exhibited many of the properties that were associated with proteins, and was in fact explicitly regarded as a “proteinaceous substance”. 25
This idea was prevalent among the early proponents of Darwinian evolutionary theory. One of the most prominent, Thomas Huxley, defined the protoplasm in 1868 as the “locus of life” and postulated that the physical basis of life (including heredity) lay in this universal biological substance. 20 , 26 Huxley remarked: “It may be truly said, that the acts of all living things are fundamentally one” and that “all protoplasm is proteinaceous.” 27
Similarly, in 1871, Charles Darwin cautiously speculated about the role of proteins in the origin of life, while stressing the common ancestry of species:
It is often said that all the conditions for the first production of a living organism are now present, which could ever have been present. But if (and oh! what a big if!) we could conceive in some warm little pond, with all sorts of ammonia and phosphoric salts, light, heat, electricity, &c., present, that a proteine compound [emphasis added] was chemically formed ready to undergo still more complex changes, at the present day such matter would be instantly devoured or absorbed, which would not have been the case before living creatures were formed. 28
During the second half of the 19th century, the diversity of proteins became apparent and the empirical observations made were key to framing the concepts of enzymes and ‘biological specificity’. 20 , 29 During this time, Louis Pasteur and others demonstrated the nexus between enzymatic activity and ‘life’ in fermentation, 12 by then a well-established paradigm of physiological chemistry.
Pasteur also showed, using swan-necked flasks, which allowed air exchange but limited microbial contamination, that life did not arise spontaneously, and noted that organic molecules are chirally left-handed, b a fundamental step forward in biochemistry, underpinning, among other things, modern drug development. Also studying fermentation, in 1894 Emil Fischer promoted the importance of stereochemical rules (popularized as the ‘Lock and Key Model’) to explain the interaction of substrate with enzyme, highlighting the central role and exquisite specificity of the “so-called enzymes” in biological processes. 31 In 1897, Eduard Buchner demonstrated that yeast extracts ferment alcohol from sugar, showing that biochemical processes do not necessarily require living cells, but are catalyzed by the enzymes formed in cells. 32
Between 1899 and 1908, Fischer and others, notably Ernest Fourneau, Franz Hofmeister and Albrecht Kossel, made important advances in the understanding of the chemistry of proteins, sugars and nucleic acids, including the description of the peptide bond. The latter was a crucial shift in understanding how animal substances arise, from the traditional view that proteins are acquired from plants to the realization that they are synthesized from a set of constituent parts (amino acids c ) incorporated into peptide chains: the ‘peptide theory’, largely promoted by the work of Kossel. 20 , 23 , 34 , 35
It was not until 1926 that James Sumner first purified an enzyme (urease), 36 but proteins were already regarded as the molecules that underlie life processes, the ‘Protein World’. 37 Aleksandr Oparin proposed in 1924 that life originated on Earth through gradual chemical evolution of carbon-based molecules in a “primordial soup”, 38 with John (‘J. B. S.’) Haldane independently advancing a similar theory 5 years later. 20 , 38–40 Both suggested in the 1920s that polypeptides were the initial particles of ‘colloidal size’, whereas nucleic acids, which had been discovered 50 years earlier and known to be major components of chromosomes (see below), were not mentioned. d The idea that proteins were the primordial molecules was reinforced by the famous experiment by Stanley Miller and Harold Urey in 1953, which demonstrated the formation of amino acids e from inorganic molecules in the simulated reducing and highly electrified atmospheric conditions presumed to exist in the primitive Earth. 45 , 46
Biochemistry became a recognized scientific discipline in the 1930s and 1940s with the advent of a better understanding of the structure-function relationships of proteins, driven by physicists using techniques such as X-ray crystallography and isotopic labeling. 20 These advances coincided with the emergence and the coining of the term ‘molecular biology’, 47 whose focus is to understand the molecular basis of genetic material and how it determines cellular and organismal phenotypes.
In this sense, molecular biology is distinct from biochemistry, although there is considerable overlap and many biochemists, but perhaps not molecular geneticists or cell and developmental biologists, would assert that the terms are synonymous. f In any case both new disciplines were centered on the study of proteins, which heavily influenced the conceptions of genetic information and the mechanisms of heredity and development.
The period from 1930 to 1970 also saw great progress in the characterization of intermediary metabolism, the other heart of biochemistry. 20 The achievements included specification of the glycolytic (fermentation) pathway, the urea/ornithine, citric acid and glyoxylate cycles, 49 the pathways for lipid synthesis and degradation, the synthesis of amino acids and complex carbohydrates, and the nucleoside and pentose phosphate pathways for the synthesis of nucleotides and enzymatic cofactors, notably by Hans Krebs, Gustav Embden, Otto Meyerhof (who also discovered the universal energy currency, adenosine triphosphate, ATP), Jakub Karol Parnas, Otto Warburg, Horace Barker, David Green, Peter Mitchell, Ephraim Racker and Salih Wakil, among many others. 20 , 50–52
Nucleic Acids and Chromosomes
The history of nucleic acids g began with Friedrich Miescher’s discovery in 1869 of “nuclein”. This was a substance distinct from proteins, isolated from the nuclei of pus cells and characterized by resistance to protease digestion, a high content of phosphorus and the absence of sulfur. 54–56 Miescher made this discovery in the laboratory of Felix Hoppe-Seyler, who had the experiments repeated and 2 years later published Miescher’s paper together with others describing the isolation of nuclein from various sources. Miescher floated the idea that nuclein might be the genetic material, presumably because it was enriched in sperm, but vacillated on the issue. 57 Nuclein was later found to have an acidic nature and was named ‘nucleic acid’ (German ‘nucleïnsäure’) by his student Richard Altmann in 1889. 56 , 58 , 59
Although Miescher’s nuclein was later shown to be deoxyribonucleic acid (DNA), the ‘nuclein’ isolated from yeast by Hoppe-Seyler was not (mainly) DNA but the first description of what would later become known as ribonucleic acid (RNA). 55
In 1882, Walther Flemming described fibrous structures in the nucleus and their separation in mitosis, which he visualized by staining and termed ‘chromatin’ (stainable material). 60–62 Along with Edouard Van Beneden, he also described centrosomes 63 , 64 (see Chapter 15), a term introduced by Theodor Boveri in 1888. 65 , 66 Later that year, Heinrich Waldeyer termed the nuclear fibers ‘chromosomes’ (stainable bodies) (Figure 2.1). 62 , 67
In the years leading up to the turn of the 20th century, Albrecht Kossel showed that nucleic acids are an inherent component of chromatin and identified their constituent nucleoside bases: the ‘purines’ h guanine (G) and adenine (A), which have a double ring structure; and the ‘pyrimidines’ cytosine (C), thymine (T) and (what would be later recognized as its RNA counterpart) uracil (U), which have a single ring structure. 54 , 56 , 68 , 69
Chromosomes as the Mediators of Genetic Inheritance
Gregor Mendel’s principles of inheritance involving binary combinations of simple genetic traits were re-discovered around 1900 by the botanists Hugo DeVries, Carl Correns, and brothers Armin and Erich von Tschermak-Seysenegg, 70–73 and promulgated by William Bateson, who translated Mendel’s 1865 paper in 1901. 74 Bateson also coined the word ‘genetics’ (from the Greek gennō, γεννώ; ‘to give birth’) and many other genetic terms that are still in use, such as ‘homozygote’ and ‘heterozygote’. 75
Around the turn of the century, Carl Rabl and Boveri i described chromosome territories, nuclear transplantation in sea urchins and abnormal chromosomes in cancer, leading them to conclude that developmental differentiation is a consequence of regulatory structures within the hereditary material. 76–79
Based on these and other observations, William Sutton proposed in 1903 that chromosomes “constitute the physical basis of the Mendelian law of heredity” and that “one of the most characteristic features of chromatin is a large percentage content of highly complex and variable chemical compounds, the nucleo-proteids”, 80 which had been first described by William Halliburton in 1895. 81 Shortly thereafter, Wilhelm Johannsen coined the terms ‘gene’, ‘genotype’ and ‘phenotype’, although he did not speculate about the nature of the gene. 82 , 83
Direct evidence for the Chromosome Theory of Inheritance was first provided by the discovery of ‘sex chromosomes’ independently by Nettie Stevens and Edmund Wilson in 1905. 84 , 85 In 1914, chromosomes were conclusively demonstrated to be the entities that carry genes j by Calvin Bridges 86 , 87 working in the laboratory of Thomas Hunt Morgan, alongside Alfred Sturtevant and Hermann Muller, using the powerful fruitfly (Drosophila melanogaster) genetic system k that they had developed. 88
The previously conceptual genes had found a physical home, a ‘locus’. Morgan, Bridges, Muller and Sturtevant proposed that genes are linearly arranged on chromosomes like beads on a string and suggested that new combinations arise by ‘crossing-over’ (which was observed microscopically) and exchange of genetic material between pairs of chromosomes at meiosis, termed ‘recombination’. 88–91 On the other hand, Muller’s studies identified many loci (‘allelomorphs’) 92 that later turned out to encode regulatory elements, including transposon insertions (Chapter 10) (Figure 2.2).
Analysis of the physical relationships between genetic markers (such as eye color or developmental mutations) relied on measuring the frequency of their co-inheritance in genetic crosses. Co-segregation of markers occurs in 50% of the progeny if the markers are on different chromosomes (or far apart on the same chromosome), but at a higher frequency if the loci are located near each other and separated only by occasional recombination events.
These inheritance patterns gave rise to the concepts of gene ‘linkage’ l and ‘linkage groups’ (i.e., genes connected on the same chromosome), and ‘genetic distance’ (the frequency of recombination between linked genes, measured in ‘centimorgans’, or cM m ). It also led to the description of phenotypic differences due to ‘cis’ interactions between genes located nearby on one chromosome; ‘trans’ interactions between genes located distally or on different chromosomes; homozygous and heterozygous variants, where the same or different variants (‘alleles’) are present on the parental chromosomes; ‘recessive’ mutations, where one copy of the ‘wildtype’ allele is sufficient for function and masks the presence of the damaged variant; ‘dominant’ mutations, where the mutant allele overrides the ‘normal’ gene; and partial ‘penetrance’ where the frequency of a phenotype differs from normal Mendelian ratios, due to ‘haploinsufficiency’ or the influence of ‘modifier’ genes.
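The relationship between co-inheritance, recombination frequency and genetic distance described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the text: the function names and the progeny counts are hypothetical, and the conversion assumes the usual convention that 1 cM corresponds to roughly 1% recombination, which only holds approximately for closely linked markers.

```python
def recombination_fraction(parental: int, recombinant: int) -> float:
    """Fraction of scored progeny that carry a recombinant marker combination."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


def map_distance_cm(fraction: float) -> float:
    """Approximate genetic distance in centimorgans (assumes 1 cM ~ 1% recombination).

    The linear conversion is an assumption that is only reasonable for
    closely linked markers, where multiple crossovers are rare.
    """
    return 100.0 * fraction


# Hypothetical progeny counts from two crosses (illustrative values only).
linked = recombination_fraction(parental=830, recombinant=170)
unlinked = recombination_fraction(parental=500, recombinant=500)

print(f"linked markers:   r = {linked:.2f}, about {map_distance_cm(linked):.0f} cM")
print(f"unlinked markers: r = {unlinked:.2f} (markers assort independently)")
```

In the first hypothetical cross the markers stay together more often than not, so the recombination fraction falls well below 0.5. In the second they co-segregate in only half of the progeny, as expected for markers on different chromosomes or far apart on the same one.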
The genetic maps resulting from such crosses led to the conclusion that genes are discrete objects with exclusive borders – a perception that reflects the low resolution of these early studies (and many others to this day). It also led to the idea of the ‘gene for …’, not just physical traits like eye color or, later, human genetic disorders such as cystic fibrosis (Chapter 11), but ultimately also for psychosocial traits, as if all genetic influences could be viewed in binary terms (genes -> traits), which underpinned the subsequent ‘one gene – one enzyme (protein) – one function’ assumption in biochemistry. 94 , 95
The view of genes as particulate entities was reinforced by the work of Nikolay Timoféeff-Ressovsky, who attempted to measure the “radius” of genes. 96 This work heavily influenced Max Delbrück, who co-authored their ‘Classical Green Pamphlet’ in 1935. 97 The pamphlet was “the starting point” for Erwin Schrödinger’s subsequent ruminations on the physical nature of genetic information (see below) and the “keystone in the formation of molecular genetics”. 98
However, Muller had shown by X-ray mutagenesis in 1930 that rearrangements that moved active genes near to heterochromatic regions (Chapter 4) of Drosophila chromosomes resulted in changes in the pattern of expression of these genes, 92 called ‘position effect variegation’ (PEV). 99 This was also observed by Barbara McClintock and others in maize, in that case with transposition underlying PEV (Chapter 5).
These observations led Muller and Richard Goldschmidt to challenge the conception of genes as distinct entities. 100–102 Nonetheless, the view of the gene as a discrete unit persisted even in the face of later discoveries that many genes (i.e., DNA segments that express RNA products) overlap or reside within other genes, 103–110 which has led to difficulties in genome annotation and a barrier to understanding the genome as an information continuum (Chapters 13–16).
Goldschmidt also coined the term ‘phenocopy’ to describe morphological changes in Drosophila that could be induced by imposition of stress during development, insisting on this basis, to little avail, that gene function had to be considered in developmental context 111 (Chapter 5). Goldschmidt was considered a heretic by the neo-Darwinists. 102 , 112
The Modern Synthesis
During the first four decades of the 20th century, evolutionary biologists and geneticists, notably Bateson, Haldane, R. A. (Ronald) Fisher, Sewall Wright, Ernst Mayr, G. Ledyard Stebbins, Theodosius Dobzhansky n and Julian Huxley, among other theoreticians of the time – collectively referred to as the ‘neo-Darwinists’ – brought together Mendelian inheritance, Darwinian gradualism and selection, and statistical genetics into what is known as the ‘Modern Synthesis’, after the title of Huxley’s 1942 book. 115 These pioneers of theoretical population genetics introduced important concepts such as the relevance of population size, genetic drift and the strength of selection, in parallel developing statistical methods and models that have found widespread applications to this day. 116 , 117
A tenet was that all evolutionary phenomena and species diversity can be explained in a way consistent with known genetic mechanisms, o with the unifying theme that natural and artificial selection operates on heritable variation arising by random mutation. p Biometric population approaches (which showed continuous trait variation) and Mendelian inheritance were reconciled by Fisher and others by invoking polymorphic multi-factorial traits and the ‘infinitesimal model’ of variation and selection. 113 , 115 , 122–124
A corollary of the supposition that mutation occurs randomly is that experience cannot influence the characteristics of the next generation, contrary to the proposal of Jean-Baptiste Lamarck a century before Darwin. q As evidence against ‘Lamarckian evolution’, evolutionary biologists recalled the 19th-century work of August Weismann, 127 who asserted that somatic cells only received a subset of the information contained in the germline, and that information does not flow in reverse from somatic to germ cells and (therefore) cannot be transmitted to the next generation. r The latter came to be known as the ‘Weismann Barrier’, 128 but was challenged in the 1940s and 1950s by Conrad Waddington (the father of epigenetics, Chapter 14), who provided evidence of the inheritance (‘genetic assimilation’) of characteristics acquired in response to environmental perturbation 129–131 (Chapter 5).
The ideas introduced by the Modern Synthesis had a lasting influence on the conceptual landscape of molecular biology and the interpretations of experimental observations. These included the later gene-centric models of genome variation, ‘fitness’ and evolution, and the invocation of ‘junk’ DNA (Chapter 7), as well as the concept that the sequences of important genes are ‘conserved’ during evolution, which has frequently been assumed in efforts to discriminate functional and non-functional regions of genomes (Chapter 11).
There was considerable cross-fertilization in the 1940s and 1950s between theoretical evolutionary biologists and microbial geneticists, notably Delbrück s and Joshua and Esther Lederberg, assessing lethal/competence phenotypes in bacteria and their viruses, which would lend support to the spontaneous and random nature of mutations as opposed to ‘pre-adaptations’ or ‘directed mutations’ 134 – a debate at the time.
Equally if not more importantly for the understanding of genetic programming in the following decades, the emphasis on lethal loss-of-function mutations overlooked the differences between essential genes and variations in regulatory (non-protein-coding) sequences that may have no effect on the viability of plants and animals but are the major drivers of quantitative trait variation, adaptive evolution, survival and reproductive success (Chapters 7 and 11).
These assumptions – that inheritance occurs entirely through Mendelian genes and mutations occur randomly – became entrenched before virtually anything was known about the nature of genetic variations and their phenotypic impact, especially in relation to the ‘complex traits’ that have large environmental components, and resulted, in part, in the lack of appreciation of Barbara McClintock’s later work on mobile genetic elements (Chapter 5). They also underpinned the dichotomy of the classical and historical ‘Nature versus Nurture’ t debates that raged for decades, not realizing or even considering that such complex phenotypes may be the integrated outcomes of overlapping genetic and epigenetic processes.
The concept of genetic determinism overflowed into sociological and political arenas, notably the common and at the time fashionable idea u (extending from observations of selective breeding in agriculture and of domesticated animals by Darwin and others) that there are superior and inferior human characteristics (especially intelligence) that vary between individuals and ‘races’ 137 and might be improved by selective breeding (promoted by Francis Galton, who coined the term ‘eugenics’) or, worse, by sterilization, as occurred in the USA, and by genocide in Nazi Germany. Genetics was also politicized in Stalinist Russia by Trofim Lysenko, who favored Lamarckian evolution in line with communist ideology, which led to the abolishment of population genetics and the persecution of geneticists and other scientists in the USSR. 138 , 139
Distinguishing DNA and RNA
During the same period, research on nucleic acids was surprisingly limited, despite the finding that they are a major component of chromosomes, the repositories of genetic information. 140 Indeed, nucleic acids barely feature in the history of biochemistry before 1940, 20 a consequence of the prevailing view that proteins have greater chemical diversity than the assumed monotony of nucleic acids, thought to be merely structural or metabolic entities.
This view was strengthened by the ‘tetranucleotide hypothesis’ put forward in 1909 by the chemist Phoebus Levene, who identified ribose sugars in ‘yeast nucleic acid’ 141 , 142 and deoxyribose sugars in calf thymus gland. 143–145 Levene proposed that phosphate-sugar-base units formed circular tetramers containing each of the four ‘nucleotides’, based on the perception that they are present in equimolar proportions. 54 , 142 , 145–147
Levene’s work in identifying the five common nucleotides (A, G, C, T/U) and the phosphodiester bonds between them was a major advance, 145 , 147 but the tetranucleotide hypothesis implied that nucleic acids could not carry complex information. However, during the 1920s and 1930s, Einar Hammarsten showed that the molecule later recognized as DNA has a very high molecular weight, using a gentler method for its isolation that avoided harsh treatment with alkali. 148 Robert Feulgen (who discovered the eponymous DNA stain, see below), as well as Hammarsten and his student Erik Jorpes, then demonstrated that nucleic acid (RNA) from pancreas contained more guanine than other bases. 54 , 149
These observations did not fit with a simple equimolar tetranucleotide structure (although Jorpes tried to rationalize the guanine excess within a pentanucleotide structure 149 ), but were not recognized and Levene’s hypothesis held sway for decades. 140 , 145 , 147 In addition, there was a perception that the different types of nucleic acids do not occur in all organisms, which argued against universal biological roles. One form, the ‘thymonucleic acid’ or ‘zoonucleic acid’ had been isolated from animal glands, pus cell nuclei and sperm and was initially thought to be present exclusively in animals; the other, known as ‘pentose nucleic acid’, ‘zymonucleic acid’, ‘yeast nucleic acid’ or (plant) ‘phytonucleic acid’, was thought to be absent from animal cells until the early 1940s. 54 , 150 , 151 Thus, there was confusion about the distribution and functions of what later became known as DNA and RNA.
It was also shown during the same period that DNA and RNA have differences in their nucleoside base composition (with DNA containing adenine, guanine, cytosine, thymine, whereas RNA contains adenine, guanine, cytosine and uracil, the demethylated analog of thymine) and in the sugars in their sugar-phosphate backbone, deoxyribose in DNA and ribose in RNA, hence ‘deoxyribonucleic acid’ and ‘ribonucleic acid’. 152
Erwin Chargaff, who later observed the purine-pyrimidine equivalences that underpinned the base pairing critical to the elucidation of the structure of DNA and the templating of messenger RNA, pointed out in 1950 that
Although only two nucleic acids, the deoxyribose nucleic acid of calf thymus and the ribose nucleic acid of yeast, had been examined analytically in some detail, all conclusions derived from the study of these substances were immediately extended to the entire realm of nature; a jump of a boldness that should astound a circus acrobat. 153
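The contrast between the equimolar base proportions assumed by the tetranucleotide hypothesis and the purine-pyrimidine equivalences later observed by Chargaff can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the strand below is a hypothetical, deliberately GC-rich sequence, the helper names are invented for this example, and the 2% tolerance used to judge "equimolar" is arbitrary.

```python
from collections import Counter

# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def base_composition(*strands: str) -> dict:
    """Fraction of each base across all the strands supplied."""
    counts = Counter("".join(strands))
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}


def is_equimolar(composition: dict, tolerance: float = 0.02) -> bool:
    """True if every base is within `tolerance` of 25% (arbitrary cutoff)."""
    return all(abs(fraction - 0.25) <= tolerance for fraction in composition.values())


# Hypothetical GC-rich strand (illustrative, not real sequence data).
strand = "GGCGCCATGCGGCCGTAGCGCGGC"
duplex = base_composition(strand, reverse_complement(strand))

print("A equals T:", duplex["A"] == duplex["T"])
print("G equals C:", duplex["G"] == duplex["C"])
print("equimolar :", is_equimolar(duplex))
```

Because every base on one strand is paired with its complement on the other, the duplex always has as much A as T and as much G as C, while the four bases need not be present in equal amounts.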
In 1924, Feulgen and Heinrich Rossenbeck developed a sensitive histochemical reaction that intensely stained DNA. 154 It demonstrated that DNA is present in the nucleus, but not the cytoplasm, v of plants and animals, and constituted important evidence for DNA as the possible genetic material. 155 Jean Brachet, perhaps the most important of the early RNA biochemists, developed new methods to differentiate DNA and RNA in the 1930s and early 1940s, using basic dyes and cytochemical staining, later combined with ribonuclease treatments (Figure 2.3). 156–160 His work showed that both DNA and RNA are universal constituents of animal and plant cells and that RNA, unlike DNA, is mainly located in the cytoplasm. Later, it was shown in bacteria that, while the DNA composition varies between different species, both DNA and RNA are always present. 161 , 162
Although DNA was linked with chromosomes, the function of RNA was for a long time a mystery. It was speculated that RNA might serve as an energy repository or a cytoplasmic precursor converted into DNA during cell division (the ‘conversion hypothesis’). 160 It was also unclear what form RNA might take, since the 2′ hydroxyl group of ribose represented a potential bifurcation point, such that RNAs could exist as branched molecules. Only later, in the 1950s, was this idea dispensed with, when studies by Alexander Todd and others demonstrated that RNAs are linear polymers w whose nucleotides, as in DNA, are linked by 3′-5′ phosphodiester bonds. 165 , 166
Confirmation of the role of RNA as an information molecule emerged from the study of plant and animal viruses that contained RNA but not DNA. In 1937, Frederick Bawden and Norman Pirie identified RNA in tobacco mosaic virus (TMV), 167 despite the previous suggestion by Wendell Stanley, who had crystallized TMV, that protein was the active ‘auto-catalytic’ agent of the virus. 168 Later experiments revealed that TMV and other plant viruses, as well as foot-and-mouth disease virus and other animal viruses, contained RNA as the sole nucleic acid, which indicated that RNA must act as the template for viral replication and protein synthesis. 169 , 170
One Gene-One Protein and the ‘Nature of Mutations’
The association between ‘genes’ and proteins dates back to 1902, when Archibald Garrod provided evidence for the inheritance of human disorders and phenotypes that reflected enzymatic deficiencies, such as alkaptonuria. 171 The causes of the enzyme deficiencies in such ‘inborn errors of metabolism’ (which included other congenital disorders, such as albinism) were still only vaguely defined, 172 as they preceded the conceptual terms ‘gene’ and ‘genome’. x
Indeed, even George Beadle and Edward Tatum’s influential 1941 ‘one gene – one enzyme’ principle, 174 based on studies of mutants of the filamentous fungus Neurospora crassa (red bread mold) (Figure 2.4), y was not well accepted before the nature of hereditary material was understood. 178 The phrase subsequently morphed into ‘one gene – one protein’ when it was realized that not all proteins are enzymes, and even this changed when it was found much later that, in the higher eukaryotes, one gene can produce variations of the same protein by alternative splicing (Chapter 7).
Beadle and Tatum’s work, nevertheless, united genetics and biochemistry. 94 , 181 It also popularized the use of microbes as model organisms. 182 In 1946, Tatum and one of his students, Joshua Lederberg (with whom he and Beadle later shared the Nobel Prize), z demonstrated genetic recombination in the enteric bacterium Escherichia coli, 184 which consequently became widely used as a model organism. Moreover, most genetic studies were focused on lethal, conditionally lethal (such as in auxotrophy aa or temperature-sensitivity) and/or phenotypically severe mutations, especially in haploid cells (bacteria and haploid Neurospora and yeast), which are overwhelmingly biased to protein-coding sequences (Chapter 7).
At the time, proteins were thought to comprise genes, rather than be the products of them. DNA was considered to have only peripheral functions, such as serving as an “intra-nuclear buffer”, 160 or acting as a scaffold during gene replication. The latter was championed by the small community of scientists then engaged in nucleic acids research, such as Hammarsten and Torbjörn Caspersson. 140 Caspersson, whose work was also instrumental in establishing the involvement of RNA in protein synthesis, proposed a metabolic relationship between nucleic acids and “gene reproduction”, noting that “synthesis of nucleic acid is closely connected with gene reproduction”, and that “it may be that the property of a protein which allows it to reproduce itself is its ability to synthesize nucleic acid”. 185 However, his conception was that the “structure-forming properties” of DNA (alluding to the high molecular weight DNA polymers earlier demonstrated by Hammarsten) were simply auxiliary to the basic proteins of the nucleus. 140 , 185
Research on genetic material up until World War II, therefore, concentrated on proteins as the prime candidate for the genetic material, the “protein version of the central dogma”. 54 Even the first reported infectious agents of bacteria, characterized as “filter-passing viruses” (later termed bacteriophages or ‘phages’ for short, soon to play a major role in the understanding of genes, Chapter 3), were considered to be “enzymes with power of growth”. 186
DNA Is the Genetic Material
In 1944, the physicist Erwin Schrödinger wrote a book entitled What Is Life?, in which he made the logical deduction that the genetic material would be comprised of an “aperiodic crystal” – that is, a molecule of regular structure with information embedded in its fine-scale variations – a “miniature code”, the first use of the term ‘code’ in relation to biology. 187
Schrödinger’s prediction was borne out in the same year bb by Oswald Avery, Colin MacLeod and Maclyn McCarty, who built on the studies of bacterial ‘transformation’ by Fred Griffith in the late 1920s 189 and showed that the change from a benign to a virulent form of the bacterium Streptococcus pneumoniae could be effected by DNA but not by protein. 54 , 190 Their finding was confirmed in E. coli a year later by André Boivin. 191
However, such was the entrenchment of prior expectations 140 that it took almost 10 years for the conclusion to be widely accepted, 192 and “even Avery himself was reluctant to accept it until he had buttressed his experiments with the most rigorous controls”. 144 Arguing that a small amount of contaminating protein could be present in Avery’s preparations, not only Hammarsten, but also other eminent scientists, especially Alfred Mirsky, were unconvinced that DNA was the genetic material. 140 , 144 , 193 James Watson said, in retrospect, that (unfortunately) “most people didn’t take him [Avery] seriously”. 194 Moreover, as Gunther Stent observed later, 192 Avery’s finding was “premature”. cc General acceptance only came after the 1952 experiments by Alfred Hershey and Martha Chase that showed the uptake of ³²P-labeled DNA, but not ³⁵S-labeled proteins, into bacteria after infection with bacteriophage T2, 190 , 195 and the elucidation of the structure of DNA in 1953.
Levene’s tetranucleotide structure was finally and convincingly refuted by Chargaff’s demonstrations in the late 1940s that DNA “formed extremely viscous solutions in water” (confirming Hammarsten’s earlier observations), implying a structure much larger than a tetranucleotide, and by paper chromatography dd that its constituent bases were not present in equimolar proportions. 153 Instead, Chargaff showed that the pyrimidine (T, C) and purine bases (A, G) are present in equal amounts in DNA, and that A and T, as well as G and C, occur in the same proportions. ee That is, %G=%C and %A=%T. He also showed that the ratio of these pairs of nucleotides, (%G+%C)/(%A+%T), is the same in different tissues of the same organism but varies between organisms. 207–209
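Chargaff's parity relationships can be illustrated with a short calculation. The following is a minimal sketch in Python, not a reconstruction of Chargaff's chromatographic method: the two sequences are invented for illustration, and the function names (`duplex_composition`, `gc_to_at_ratio`) are hypothetical. It counts bases over both strands of a base-paired duplex and shows that %A=%T and %G=%C follow automatically, whereas the (%G+%C)/(%A+%T) ratio depends on the sequence.

```python
from collections import Counter

# Watson-Crick pairing partners (A with T, G with C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the partner strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_composition(strand):
    """Percentage of each base, counted over both strands of the duplex."""
    counts = Counter(strand) + Counter(reverse_complement(strand))
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_to_at_ratio(composition):
    """The (%G+%C)/(%A+%T) ratio, which varies between organisms."""
    gc = composition["G"] + composition["C"]
    at = composition["A"] + composition["T"]
    return gc / at


# Two made-up sequences standing in for an AT-rich and a GC-rich genome.
at_rich = duplex_composition("AAGATAACAATAAG")
gc_rich = duplex_composition("GGCGGAGGCGTGGG")

for name, comp in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, comp, "GC/AT ratio:", round(gc_to_at_ratio(comp), 3))
```

Each single strand here is deliberately unbalanced (the first has far more A than T), so the equalities emerge only when both strands are counted, which is the situation in the bulk DNA that Chargaff analyzed.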
The Double Helix – Icon of the Coming Age
Chargaff’s data was crucial for Watson and Crick’s interpretation of the X-ray diffraction patterns of DNA fibers obtained by Rosalind Franklin ff and Raymond Gosling, building on work by others, gg notably Bill Astbury and Florence Bell who more than a decade earlier had determined the planar structure of, and the ~3.4 angstrom spacing between, the bases (stacked like “a pile of plates”), 210–213 which led to the elucidation of its double-helical structure. 214–216
As is well known, and was enormously compelling at the time, 217 although it took a few years for its significance to be widely appreciated, 218 this structure is governed by nucleotide base (purine-pyrimidine) pairing rules, which immediately suggested a mechanism for the duplication of genetic information. 214 This was subsequently demonstrated in 1958 by Matthew Meselson and Franklin Stahl using labeling with heavy isotopes of nitrogen to distinguish the template from newly synthesized DNA strands in buoyant density gradients. 219 An enzyme mediating the synthesis of new, complementary strands was discovered by Arthur Kornberg and colleagues in 1956 and termed DNA-dependent DNA polymerase. 220–222
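The logic of the Meselson and Stahl result can be sketched as a small calculation. The Python example below is an idealized model, not their analysis: it assumes cells with fully heavy-labeled DNA are shifted to light medium and that every duplex replicates once per generation, and the function name `semiconservative_bands` is hypothetical. Under semiconservative replication each old strand pairs with one newly made light strand, so the expected proportions of heavy, hybrid and light duplexes in the density gradient follow directly.

```python
from fractions import Fraction


def semiconservative_bands(generations):
    """Expected fractions of heavy, hybrid and light duplexes.

    Assumes all DNA starts fully heavy and that growth then continues
    in light medium, with one round of replication per generation.
    """
    heavy, hybrid, light = Fraction(1), Fraction(0), Fraction(0)
    for _ in range(generations):
        # Each duplex separates and every strand templates a new light strand:
        #   heavy/heavy -> two hybrid duplexes
        #   heavy/light -> one hybrid and one light duplex
        #   light/light -> two light duplexes
        # Dividing by two keeps the fractions normalized to a total of 1.
        heavy, hybrid, light = Fraction(0), heavy + hybrid / 2, hybrid / 2 + light
    return {"heavy": heavy, "hybrid": hybrid, "light": light}


for generation in range(4):
    bands = semiconservative_bands(generation)
    print(generation, {name: str(value) for name, value in bands.items()})
```

In this model a single hybrid band appears after one generation and is halved in each generation thereafter, with no heavy band reappearing; a conservative mechanism would instead retain a heavy band and never produce a hybrid one.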
Not so well known is that John Masson Gulland, Denis Jordan and colleagues had shown in 1947 that DNA is held together by hydrogen bonds, 223 and that their PhD student James Creeth had, in his 1948 PhD thesis, proposed a model for the structure of DNA comprising two chains with a sugar-phosphate backbone on the exterior and hydrogen bonds between the nucleotide bases of opposite chains in the interior (Figure 2.5). 224 , 225
Acceptance of DNA as the genetic material and the significance of its double-helical structure in genetic inheritance and gene expression came only after concerns about the ability of its strands to unwind had been resolved by the Meselson and Stahl experiment, and a plausible mechanism for the expression of genetic information (i.e., RNA-templated protein synthesis) had been established 218 , 227 (Chapter 3).
A subtle but important feature of the structure of DNA hh is that its strands are antiparallel, with both strands going from 5′ to 3′ (with respect to the phosphate linkages that connect the sugar-phosphate backbone) but in the opposite direction to each other, with DNA replication, RNA transcription and protein synthesis all proceeding from 5′ to 3′ in relation to the sugar-phosphate linkages. The linear arrangement of the bases along the backbone of the helix was fundamental to the logical deductions and experimental approaches to deciphering the genetic code.
The principles of copying/reading by base pairing and the directionality of the information were critical for understanding the synthesis and roles of RNA. Similarly, the subsequent demonstration by Julius Marmur, Paul Doty and colleagues that complementary strands of DNA (and RNA transcribed from the DNA) could recognize each other by base pairing 233 , 234 played an important part in the identification of messenger RNA and in the first analyses of genomic sequence composition and complexity (Chapter 3).
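The consequences of antiparallel pairing and 5′ to 3′ directionality can be made concrete with a brief sketch. The Python example below is illustrative only: the nine-base sequence is invented, and the function names (`reverse_complement`, `transcribe`, `can_anneal`) are hypothetical. All strands are written 5′ to 3′, so the partner of a strand is its reverse complement, and an RNA copied from the template strand has the same sequence as the opposite (coding) strand, with U in place of T.

```python
# Pairing partners for DNA-DNA and for DNA template -> RNA copy.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5to3):
    """Partner DNA strand, also written 5' to 3' (hence the reversal)."""
    return "".join(DNA_PAIR[base] for base in reversed(strand_5to3))


def transcribe(template_5to3):
    """RNA written 5' to 3', copied from a DNA template written 5' to 3'."""
    return "".join(RNA_PAIR[base] for base in reversed(template_5to3))


def can_anneal(strand_a_5to3, strand_b_5to3):
    """True if the two DNA strands are perfectly complementary and antiparallel."""
    return strand_b_5to3 == reverse_complement(strand_a_5to3)


coding = "ATGGCCTAA"                   # invented example sequence
template = reverse_complement(coding)  # the antiparallel partner strand
rna = transcribe(template)

print("coding   5'-" + coding + "-3'")
print("template 5'-" + template + "-3'")
print("RNA      5'-" + rna + "-3'")
print("strands anneal:", can_anneal(coding, template))
```

The same complementarity test, applied to separated strands in solution, is the principle behind the hybridization experiments described above, although real annealing tolerates mismatches and depends on temperature and salt, which this sketch ignores.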
Further Reading
- Cairns J. (2008) The foundations of molecular biology: A 50th anniversary. Current Biology 18: R234–36. [PubMed: 18364220]
- Cobb M. (2014) Oswald Avery, DNA, and the transformation of biology. Current Biology 24: R55–R60. [PubMed: 24456972]
- Crow J.F. (2005) Hermann Joseph Muller, evolutionist. Nature Reviews Genetics 6: 941–945. [PubMed: 16341074]
- Harman O.S. (2005) Cyril Dean Darlington: The man who ‘invented’ the chromosome. Nature Reviews Genetics 6: 79–85. [PubMed: 15630424]
- Horder T.J. (2006) Gavin Rylands de Beer: How embryology foreshadowed the dilemmas of the genome. Nature Reviews Genetics 7: 892–898. [PubMed: 17047688]
- Hunter G.K. (2000) Vital Forces: The Discovery of the Molecular Basis of Life (Academic Press, New York).
- Huxley J. (1942) Evolution: The Modern Synthesis (George Allen & Unwin Ltd, New York).
- Kohler R.E. (1975) The history of biochemistry: A survey. Journal of the History of Biology 8: 275–318. [PubMed: 11609896]
- Morgan T.H. (1922) Croonian Lecture:—On the mechanism of heredity. Proceedings of the Royal Society B: Biological Sciences 94: 162–197.
- Nature conference: Thirty years of DNA. Nature 302: 651–654 (1983). https://doi.org/10.1038/302651a0. [PubMed: 6835401]
- Nilsson E.E., Maamar M.B. and Skinner M.K. (2020) Environmentally induced epigenetic transgenerational inheritance and the Weismann Barrier: The dawn of Neo-Lamarckian theory. Journal of Developmental Biology 8: 28. [PMC free article: PMC7768451] [PubMed: 33291540]
- Olby R. (1994) The Path to the Double Helix: The Discovery of DNA (Dover Publications, New York).
- Satzinger H. (2008) Theodor and Marcella Boveri: Chromosomes and cytoplasm in heredity and development. Nature Reviews Genetics 9: 231–238. [PubMed: 18268510]
- Stent G.S. (1972) Prematurity and uniqueness in scientific discovery. Scientific American 227: 84–93. [PubMed: 4564019]
- Strauss B.S. (2016) Biochemical genetics and molecular biology: The contributions of George Beadle and Edward Tatum. Genetics 203: 13–20. [PMC free article: PMC4858768] [PubMed: 27183563]
- Teich M. and Needham D. (1992) A Documentary History of Biochemistry 1770–1940 (Leicester University Press, New York).
- Williams G. (2019) Unravelling the Double Helix: The Lost Heroes of DNA (Weidenfeld & Nicolson, New York).
Footnotes
- a
We now know that RNA can nucleate colloidal domains and does so in many contexts (Chapter 16).
- b
Possibly due to the chiral mutation bias of cosmic radiation. 30
- c
Interestingly, some amino acids were named after the plant and animal source from which they were initially isolated: the first discovered amino acid (in 1806) was asparagine, isolated from asparagus; later glutamate from gluten; serine from silk (from the Latin for silk, ‘sericum’); tyrosine (the crystals in aged cheese) from the Greek for cheese ‘tyrós’; valine from the roots of the valerian plant; glycine from sugarcane and gelatin (the Greek ‘γλυκύς’, sweet tasting); etc. 33
- d
- e
It was recently shown that similar conditions (involving hydrogen cyanide, hydrogen sulfide as the reductant, ultraviolet light as the energy source and copper photoredox and wet-dry cycling) can also produce ribonucleosides and lipid precursors. 41–44
- f
Alan Turing, the breaker of the Nazi Enigma Code and the father of modern computing, posited in his influential paper in 1952, ‘The Chemical Basis of Morphogenesis’, that “a system of chemical substances, called morphogens … is adequate to account for the main phenomena of morphogenesis”, noting that (in contrast) “the function of genes is presumed to be purely catalytic”. 48
- g
The first component of nucleic acids described was in fact the RNA nucleoside, inosine, by Justus von Liebig in 1847, which he isolated from beef broth and named ‘inosinic acid’. The ‘umami’ savory (‘brothy’ or ‘meaty’) taste (one of the five basic culinary tastes, the others being sweet, salty, sour and bitter) derives from a combination of the amino acid L-glutamate and ribonucleotides such as guanosine monophosphate and inosine monophosphate. 53
- h
The first purines discovered were caffeine and theobromine (found in chocolate). Purines, pyrimidines and related metabolites are remarkably versatile compounds, used widely in biology, not just as nucleic acid constituents but also as energy currency (such as ATP, GTP (guanosine triphosphate) and NAD (nicotinamide adenine dinucleotide)), regulatory/signaling molecules (such as cyclic AMP) and protein modifiers (such as ADP-ribose).
- i
Theodor and Marcella Boveri also observed that the cytoplasm played a role in hereditary processes and proposed that it was the interaction of cytoplasm and chromosomes that determined the development of an organism, although this interaction was not subjected to in-depth genetic analysis for nearly a century. 76
- j
Initially, an X chromosome-linked recessive white eye mutation that affected males.
- k
Drosophila provided an ideal experimental system for genetic analysis, as it is easily maintained in the laboratory and has a short generation time. Its importance as a model for animal development throughout the 20th century cannot be overstated, despite the initial skepticism of medical researchers about its relevance to human biology, which largely evaporated later when gene sequencing revealed that the genes controlling development and neural function in Drosophila have equivalents in humans. 88 Many genes involved in Drosophila development are also involved in cancers (Chapters 6 and 14).
- l
Linkage between genes (‘partial coupling’) was first observed by Bateson and Reginald Punnett in sweet peas in 1904 (see 93 ).
- m
A unit corresponding to a recombination frequency between linked loci of 1% per generation.
- n
- o
Some, notably Cyril Darlington, recognized the limitations of the classical gene and argued that the conception of genotypes as the sum of genes could not explain variation in animal and plant populations, which must be dependent on interactions among genes and between the genotype, the cellular machinery, the reproductive habit and environment of the organism. 118 , 119
- p
In the 1920s, Haldane analyzed the famous textbook example of the appearance of dark pigmentation in peppered moths during the Industrial Revolution and established that evolution could occur even faster than contemporaries such as Fisher had assumed. 120 It was much later shown that this occurred by a transposon insertion that altered gene expression 121 (see Chapters 5 and 10).
- q
Darwin himself was agnostic about the origin of the variations upon which he posited selection to act and was not antagonistic to Lamarck’s ideas. In Chapter V of The Origin of Species, he remarked (making an interesting distinction between artificial selection and natural evolution): “I have hitherto sometimes spoken as if the variations—so common and multiform in organic beings under domestication, and to a lesser degree in those in a state of nature—had been due to chance. This, of course, is a wholly incorrect expression, but it serves to acknowledge plainly our ignorance of the cause of particular variation.” 125 As pointed out by Devon Fitzgerald and Susan Rosenberg: “He also described multiple instances in which the degree and types of observable variation change in response to environmental exposures … Darwinian evolution, however, requires only two things: heritable variation (usually genetic changes) and selection imposed by the environment. Any of many possible modes of mutation—purely ‘chance’ or highly biased, regulated mechanisms—are compatible with evolution by variation and selection.” 126
- r
Part of the evidence was Weismann’s peculiar experiment in 1868 wherein he severed the tails of mice for five generations and showed that this experience had no effect on the presence or length of tails in the descendants. 127
- s
- t
An expression also coined by Galton, based on studies of twins, 135 a common approach used to this day to assess the relative contributions of inheritance and environment to complex traits.
- u
Although scientific racism and eugenics were fashionable in many societies and intellectual circles, important thinkers and scientists opposed those ideas, including Dobzhansky and Alfred Russel Wallace, the co-discoverer of evolution by natural selection, who stated that Galton’s eugenics was ‘impractical, ineffective or immoral’. 136
- v
Leaving aside the small amount of DNA later shown to be present in mitochondria and chloroplasts.
- w
- x
The word “Genom” was coined before the nature of the genetic material was known, by the botanist Hans Winkler in 1920: “I propose the expression Genom for the haploid chromosome set, which, together with the pertinent protoplasm, specifies the material foundations of the species …” (in his book Verbreitung und Ursache der Parthenogenesis; see 173 ).
- y
The concept had earlier roots. Lucien Cuénot showed at the turn of the 20th century by his studies of mouse coat color variation that Mendelian inheritance occurred in animals as well as plants. He concluded that “mnemons” (genes) are responsible for the production of enzymes and is credited with the first enunciation of the gene = enzyme concept. 175–178 Among other important findings, he also described the first alleles and lethal mutation in mice, the recessive yellow agouti allele. This same locus was used a century later by Emma Whitelaw and David Martin to show the epigenetic inheritance of metastable alleles determined by transposable elements 179 , 180 (Chapters 5, 14 and 17).
- z
Apparently overlooking the contributions of his wife, Esther. 183
- aa
The inability of an organism to synthesize a particular organic compound required for its growth in minimal media.
- bb
And later by the structure of DNA, with Schrödinger’s prediction explicitly recognized by Francis Crick (see 188 ).
- cc
Stent said “A discovery is premature if its implications cannot be connected by a series of simple logical steps to canonical, or generally accepted, knowledge.” 192
- dd
While first reported in 1925, 196 paper and ion-exchange chromatography was also detecting modified bases in DNA and RNA, 197 , 198 although the import of these modifications was not appreciated until much later (Chapters 14 and 17).
- ee
Chargaff later showed that %G=%C and %A=%T also hold in single-stranded bacterial DNA. 199 Chargaff’s Second Parity Rule has since been shown to hold for all double-stranded genomes, except mitochondrial DNA, over a scale of kilobases (E. coli) and megabases (human), due to abundant inverse symmetries. These symmetries are thought to reflect the distribution of repetitive elements, 200–206 but may instead reflect the preservation of RNA secondary structure in the encoded transcripts (including in repetitive elements) (Chapter 16).
- ff
Franklin was initially skeptical that DNA had a helical structure, as evidenced by her comments in a satirical note, in the style of an in-memoriam card, sent to Maurice Wilkins in July 1952: “It is with great regret that we have to announce the death, on Friday 18th July 1952 of DNA helix … A memorial service will be held next Monday or Tuesday.” The card is reproduced in Brenda Maddox’s book ‘Rosalind Franklin: The Dark Lady of DNA’, p. 185. 210
- gg
The considerable back story of the application of X-ray crystallography to understanding the structure of DNA (and proteins) is laid out in Gareth Williams’ book ‘Unravelling the Double Helix: The Lost Heroes of DNA’. 211
- hh
Another feature of DNA, often overlooked, is that there are alternative forms of base pairing, notably Hoogsteen pairing, which delayed acceptance of the Watson-Crick model. 228 DNA can also exist in different structural forms, specifically the A- and B-forms as shown by Franklin (the B-form being the classic double helix), the Z-form discovered by Alex Rich, and others such as G-quadruplexes and I-motifs, which exist naturally in vivo. 229–231 It was also later shown that the base stacking and helical dimensions vary according to nucleotide sequence. 232