NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Varki A, Cummings RD, Esko JD, et al., editors. Essentials of Glycobiology [Internet]. 4th edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2022. doi: 10.1101/glycobiology.4e.20

Chapter 20. Evolution of Glycan Diversity

This chapter provides an overview of glycosylation patterns across biological taxa and discusses glycan complexity and diversity from an evolutionary perspective. As much of the currently available information concerns vertebrates, this chapter emphasizes comparisons between vertebrate glycans and those of other taxa. Evolutionary processes that likely determine generation of glycan diversity are briefly considered, including intrinsic host glycan-binding protein functions and interactions of hosts with extrinsic pathogens or symbionts.

RELATIVELY LITTLE IS KNOWN ABOUT GLYCAN DIVERSITY IN NATURE

The genetic code is shared by all known organisms, and core functions such as gene transcription and energy generation are conserved across taxa. Complex glycans are found in all organisms in nature, and some have argued that polysaccharides were the original macromolecules contributing to the origin of life itself. Regardless of their origins, they vary immensely in structure and expression (abundance and patterns of distribution in and on cells, secretions, and extracellular matrices) both within and between evolutionary lineages. Partly because of inherent difficulties in elucidating their structures, our knowledge about this diversity remains limited, and there are few comprehensive data sets. For many taxa, there is a lack of any information on glycan profiles. Sufficient data are available to indicate that even though all living cells require a glycocalyx (a dense and complex array of cell-surface glycans), there is no evidence for a universal “glycan code,” akin to the genetic code.

Importantly, glycan structures are not directly encoded in the genome. They are synthesized and modified by a network of enzymes in a template-independent manner, and glycophenotypes represent the outcome of co-expressed gene networks and nutrients. Glycans expressed by most free-living Bacteria (Eubacteria) and Archaea have relatively little in common with those of eukaryotes. They contain a much larger number of monosaccharide types and include many glycans exclusive to such microbes. In contrast, most major glycan classes in animal cells seem to be represented in some related form among other eukaryotes, and sometimes in Archaea. Figure 20.1 shows a circular depiction of the phylogeny of cellular life on earth. The rich glycan diversity encountered in the best-studied vertebrate species suggests similar diversity in other groups of organisms, and existing information points to complicated patterns. On the one hand, glycan patterns can form “trends” and characterize entire phylogenetic lineages, wherein one encounters further biochemical variation with subsets unique to certain sublineages. Conversely, many glycans show discontinuous distribution across phyla and distantly related organisms can produce surprisingly similar glycans, using either shared, ancient pathways or convergently (independently) evolved mechanisms.

FIGURE 20.1. Circular depiction of phylogeny of cellular forms of life on earth. The lines inside the circle represent all 2.3 million species that have been named. However, biologists have genomic sequences for only ∼5% of them; as more sequences become available, …

EVOLUTIONARY VARIATIONS IN GLYCANS

N-Glycans

The broadest base of evolutionary information concerns asparagine–N-linked glycans, a “general” glycosylation system found in all domains of life (Chapter 9). In prokaryotes, protein N-glycosylation takes place in the periplasm at the plasma membrane, whereas in eukaryotes, covalent attachment of an oligosaccharide takes place intracellularly at the endoplasmic reticulum (ER) membrane, with the protein-bound glycan being further processed in the ER and Golgi. Detailed studies of N-glycosylation systems in model organisms reveal that the covalent modification of proteins at asparagine side chains within the N-X-S/T sequon is a homologous process in all taxa, characterized by certain common properties: Nucleotide-activated monosaccharides serve as building blocks for assembly of an oligosaccharide on an isoprenoid lipid carrier in the cytoplasm. The lipid-linked oligosaccharide is translocated across the ER membrane (eukaryotes) or the plasma membrane (prokaryotes). In most eukaryotes, the lipid-linked oligosaccharide is extended further before transfer to protein. Translocated proteins with N-X-S/T consensus sequences can serve as acceptors for the oligosaccharyltransferase (OST), which catalyzes the en bloc transfer from the lipid-linked precursor to the amido group of asparagine. The structural diversity of the initially transferred oligosaccharide is highest in the archaeal domain and lowest in eukaryotes. However, the protein-bound glycan is further processed in the ER and in the Golgi compartments of eukaryotes, generating greater structural diversity of glycans exposed at the cell surface.
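
As a rough illustration of how short the N-X-S/T sequon is, the following Python sketch scans a protein sequence for candidate N-glycosylation sites. The function name `find_sequons` and the example peptide are hypothetical. The exclusion of proline at the X position is a commonly stated rule that this chapter does not itself spell out, so it is treated here as an assumption. A matching sequon marks only a candidate site; whether OST actually modifies it depends on context that the sketch ignores.

```python
import re

# Assumption: X is any residue except proline (not stated in this chapter).
# The lookahead lets overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, tripeptide) for each candidate sequon."""
    seq = protein.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide, not taken from any real protein.
    example = "MKNGSAANPTLLNNSSQ"
    for pos, tripeptide in find_sequons(example):
        print(pos, tripeptide)
```

Because the pattern is only three residues long, even a short peptide can contain several candidate sites, which is consistent with the point that N-glycosylation acts as a general modification system affecting many proteins.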

Analysis of N-glycosylation in model organisms from all three domains of life offers the opportunity to visualize evolutionary trends and to propose selective forces at work. Structural diversity of surface-exposed N-linked glycans between populations and species is common, likely driven by an evolutionary “arms race” caused by exploitation of host glycans by parasites and pathogens. The intracellular attachment of the oligosaccharide to proteins that are in the process of folding is the basis for the role of highly defined glycan structures in the modulation and the quality control of protein folding in the ER (Chapter 39). This intracellular function of N-linked glycans is reflected in the high degree of conservation of the ER pathway in eukaryotes: only some phylogenetically ancient protists are known to transfer truncated forms of otherwise similar lipid-linked oligosaccharides.

In addition to structural diversity, significant quantitative evolution of N-glycosylation is apparent. Because the N-X-S/T sequon that marks a polypeptide as an OST substrate is so short, a general modification system has evolved that affects many proteins, some with multiple N-glycosylation sites. Analysis of the N-glycome reveals a correlation: There are far fewer N-glycoproteins in unicellular than in multicellular organisms. Thus, N-glycan-mediated intrinsic cell–cell interaction likely forms a selective force that leads to increased N-glycoprotein diversity with multicellularity.

Newer analytical techniques (Chapter 50) reveal ever-increasing structural diversity of N-glycans in eukaryotes, arising from remodeling in the ER and Golgi via variation in trimming, extension, and branching by different building blocks. In addition, N-glycans can be modified by phosphorylation, methylation, etc. There are evolutionary trends of N-glycan processing in eukaryotes (Figure 20.2). In fungi, trimming is restricted to the quality control process of glycoprotein folding, and the diversity of building blocks used for extension is limited. In contrast, trimming to Man3GlcNAc2 is a characteristic of plants and animals. Interestingly, each unit of this pentasaccharide can serve as a substrate for branching and extension, but only animals seem to have branching from terminal mannose residues, resulting in multiantennary complex-type N-glycans. In contrast, modification of the β-linked mannose by xylose is a characteristic of plants.

FIGURE 20.2. Characteristic pathways of N-glycan processing among different eukaryotic taxa. See text for discussion.

Lineage-specific N-glycosylation pathways are characterized by the presence or absence of functional glycosyltransferases resulting in defined glycan structures (e.g., LacdiNAc structures) and/or building blocks (e.g., sialic acids). At the organismal level, N-glycan structures can be organ-, cell type–, or sex-specific, because of differential expression of processing enzymes. Their structures can also vary at different developmental stages, during senescence, and in response to diseases and pathogens. The common outer-chain Galβ1-4GlcNAcβ1- (N-acetyllactosamine or “LacNAc”) is a prime example of phylogenetic variation of glycosyltransferase machinery. The structure, common in invertebrates (Chapter 26), is also found in plants. Some plants even add outer-chain Fucα1-3 residues to the GlcNAc residues of LacNAc units, generating Lewis x–like structures identical to those in animal cells (Chapter 24). In some taxa, such as mollusks, an outer GalNAcβ1-4GlcNAcβ1- structure (the so-called “LacdiNAc” or LDN) tends to dominate, in place of the typical LacNAc structure more commonly seen in vertebrates. The SO4-4-GalNAcβ1-4GlcNAcβ1- terminal units of pituitary glycoprotein hormones (Chapter 14) are conserved throughout vertebrate evolution, suggesting importance for biological activity.

With respect to monosaccharide building blocks, some appear restricted to certain evolutionary lineages. Arabinose, rhamnose, and xylose are typical in plants, but helminths share some of these monosaccharides. Many bacteria produce monosaccharides bearing unique modifications absent from animals, which in turn secrete defensive lectins (e.g., intelectin-1 or RegIIIa) specific for these microbial glycans. Interestingly, mammalian glycans are assembled from a small number of different monosaccharide units, whereas microbial glycans consist of more than 700 different monosaccharide building blocks. Although more data are needed to be certain, there appears to be a general trend toward reduction of monosaccharide complexity in more recently evolved multicellular taxa with multiple internal organ systems and even more so in complex multicellular organisms that have evolved adaptive immune systems—perhaps because the evolutionary pressure to change the cell surface as a response to viruses is reduced. On the other hand, the much deeper evolutionary history of microbes may contribute to this difference as well. In this regard, the evolution of N-glycosylation apparently took a different route in giant viruses, such as Chloroviruses. They have evolved to use atypical sequons and rely on their own unique glycosylation machinery rather than hijacking host glycosylation. However, the evolutionary forces that brought about these differences remain unclear.

Sialic Acids

Sialic acids are prominent at the outer termini of N-glycans, O-glycans, and glycosphingolipids of deuterostomes (Chapter 15). They were once thought to represent an evolutionary innovation unique to this lineage, which originated during the Cambrian Expansion (500 myr ago), and other scattered reports of sialic acids in a few other taxa were thought to reflect lateral gene transfer and/or convergent evolution (i.e., independent evolution of sialic acid synthesis in these taxa). However, although lateral transfer mechanisms exist and may explain the presence of sialic acids in some bacterial taxa, sialic acids are also reported in some fungi and mollusks. Genome sequences reveal the presence of a set of genes for sialic acid production and addition in some protostomes (e.g., in insects such as Drosophila or mollusks such as Octopus). Drosophila sialyltransferase and CMP-sialic acid synthase genes were found to express functional enzymes structurally and functionally similar to vertebrate counterparts, clearly indicating an earlier evolutionary origin of sialylation. Caenorhabditis elegans, the free-living nematode, on the other hand, does not contain genes for synthesizing or metabolizing sialic acid. Earlier claims for sialic acids in plants are probably due to environmental contamination and/or incorrect identification of the chemically related sugar Kdo (2-keto-3-deoxy-octulosonic acid). However, sialic acid biosynthetic genes of some bacterial species share homology with those of vertebrates. It is also now evident that sialic acids are an invention derived from genes of more ancient pathways for nonulosonic acid (NulO) synthesis (Chapter 15). In this scenario, NulOs were differentially exploited during evolution, becoming prominent as sialic acids only in the deuterostome lineage, while being abandoned or substantially reduced in complexity and/or biological importance in other animal and fungal taxa.
Sialic acids also appear to contribute to self-associated molecular patterns (SAMPs) in vertebrates, which have immune-modulating intrinsic sialic acid–binding lectins (like Siglecs) and/or recruit blood plasma factor H and thereby dampen complement activation. Meanwhile, a variety of bacteria evade immunity by synthesizing sialic acid–like molecules via convergent evolution of the ancestral NulO biosynthetic pathway (Chapter 15). The ability to produce sialylated glycans is under positive selection in pathogenic microbes that commonly decorate their surface with sialic acids to evade the vertebrate host's immune responses. It is curious that invertebrates such as echinoderms (sea urchins and starfish) have the highest sialic acid diversity in deuterostomes and that the simplest profiles are found in humans. Thus, sialic acids seem to have evolved in many possible directions: disappearing altogether, or undergoing diversification or simplification of their structures. Although there is a tendency for some types of sialic acids to be dominant in certain mammalian species (e.g., N-glycolylneuraminic acid in pigs and 4-O-acetylated sialic acids in horses), careful investigation reveals lower quantities of such sialic acids in many other species. Notably, humans are “knockout” primates for CMP–Neu5Ac hydroxylase (CMAH). The CMAH gene was inactivated in the hominid lineage ∼3 myr ago. Thus, unlike the closely related great apes, humans cannot synthesize the sialic acid Neu5Gc, which, however, can still be incorporated in human glycans from dietary sources (Chapter 15). Independent loss of function of CMAH has also occurred in other mammalian lineages, including New World primates, ferrets and other Mustelids, seals and sea lions (Pinnipeds), and two lineages of microbats. CMAH inactivation happened at least eight times during mammalian evolution, with some events as early as 200 myr ago (such as inactivation in platypus). Sauropsids (birds and reptiles, descendants of dinosaurs) appear to represent another lineage that lost Neu5Gc.

O-Glycans

O-Glycans encompass a large class of glycoconjugates, which comprise the monosaccharide core structures O-GalNAc, O-Man, O-Fuc, O-Glc, O-Gal, and O-GlcNAc. Homologs of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases (ppGalNAcTs) initiating synthesis of mucin-type O-GalNAc, the most common O-glycan class in vertebrates, have been found throughout the animal kingdom (Chapter 10). Multiple ppGalNAcTs with different polypeptide substrate specificity operate in metazoans. The common Core-1 Galβ1-3GalNAcα1-O-Ser/Thr structure of vertebrates exists in insects, where it also forms part of a mucin-like protective layer in the gut. Heavily O-glycosylated gel-forming mucins have been recruited and diversified in metazoans to mediate lubrication and protection of hydrated epithelia in direct contact with the environment. O-Glycosylation of mucin-like domains is also common among protists, in which the glycans are usually initiated by ppGlcNAcTs, evolutionary precursors of the ppGalNAcTs. O-Linked mannose represents another example of Ser/Thr glycosylation that is conserved in prokaryotes and eukaryotes, from yeast to mammals. O-Mannosylation has diverse functions, supporting cell structural stability, facilitating protein folding, and affecting cell adhesion. In vertebrates, O-mannosylation is mediated by several families of glycosyltransferases, including distantly related protein O-mannosyltransferases (POMT1-2) and transmembrane and tetratricopeptide repeat containing proteins (TMTC1-4). These enzymes evolved to recognize distinct substrates, including unstructured regions (POMTs) and specific protein folds, such as cadherin domains (TMTCs). On α-dystroglycan, POMTs initiate the biosynthesis of the matriglycan, one of the most elaborate and highly specialized vertebrate glycans that connects muscles with the extracellular matrix (Chapters 13, 27, 45). In contrast, plants do not appear to have O-GalNAc and O-mannose. Instead, they express arabinose O-linked to hydroxyproline and galactose O-linked to serine and threonine (Chapter 24). Far less is known about O-glycosylation in prokaryotes, although O-glycans, such as O-mannose and novel Galβ1-O-Tyr modifications, can be found in bacteria (Chapter 21).

Glycosphingolipids

Glucosylceramide is found in both plants and animals (Chapter 11). However, the most common core structure of vertebrate glycosphingolipids (Galβ1-4Glc-Cer) is varied in other organisms (e.g., Manβ1-4Glc-Cer and GlcNAcβ1-4Glc-Cer in certain invertebrates). Another variation is inositol-1-O-phosphorylceramide (e.g., mannosyldiinositolphosphorylceramide, the most abundant sphingolipid of yeast) and GlcNAcα1-4GlcAα1-2-myo-inositol-1-O-phosphorylceramide, found in tobacco leaves. Galactosylceramide and its derivatives seem to be limited to the nervous system of the deuterostome lineage of animals. In contrast, protostome nerves contain mainly glucosylceramides. An evolutionary trend is suggested: A transition from gluco- to galactoceramides corresponds with changes in the nervous system from loosely structured to highly structured myelin. Regarding complex gangliosides of the deuterostome nervous system, some general trends are seen in comparing reptiles and fish to mammals: an increase in sialic acid content, a decrease in complexity, and a decrease in “alkali-labile” gangliosides (bearing O-acetylated sialic acids). The lower the temperature, the more polar the composition of brain gangliosides; poikilothermic (cold-blooded) animals tend to express many polysialylated gangliosides in the brain.

Glycosaminoglycans

Heparan and chondroitin sulfate are found in many animal taxa, including insects (Chapter 26) and mollusks. The most widely distributed and evolutionarily ancient class appears to be chondroitin chains, which are not always sulfated (e.g., in C. elegans) (Chapter 25). The more highly sulfated and epimerized forms of heparin and dermatan sulfate tend to be found primarily in more recently evolved animal species of the deuterostome lineage. The same is true of hyaluronan, a secreted free glycosaminoglycan that is synthesized at the cell membrane and likely evolved from its precursor chitin to facilitate the movement of metazoan cells and shaping of organs during development and normal physiology. Echinoderms such as the sea cucumber make typical chondroitin chains, but some glucuronic acids have branches containing fucose sulfate. Simpler multicellular animals such as sponges can have unusual glycosaminoglycans that include uronic acids but do not have the typical repeat units of chondroitin and heparan sulfate. Plants do not have typical animal glycosaminoglycans. Instead, they have secreted acidic pectin polysaccharides not attached to protein cores (only hyaluronan is unattached among animal glycosaminoglycans), characterized by galacturonic acid and its methyl ester derivative (Chapter 24). Most bacterial polysaccharides are completely distinct (Chapter 21), although certain pathogenic strains can mimic mammalian glycosaminoglycan chains (see below).

Nuclear and Cytoplasmic Glycans

The O-β-GlcNAc modification common on cytoplasmic, mitochondrial, and nuclear proteins (Chapter 19) is widely expressed in “higher” animals and in plants, and its occurrence on histone tails implicates this modification in epigenetic regulation mechanisms. The connections between nutrient state, UDP-GlcNAc levels, and O-GlcNAcylation mean that gene regulation has an important metabolic dimension. Conserved homologs of the responsible O-GlcNAc transferase have been found in many eukaryotic taxa and in a wide range of bacteria. An independent clade of enzymes evidently originating in bacteria mediates O-fucosylation instead of O-GlcNAcylation in plants and numerous protists (Chapter 18). Most interestingly, in some animal pathogenic members of Pasteurellaceae and Enterobacteriaceae, homologs of the O-GlcNAc transferase catalyze a cytoplasmic N-glycosylation within N-X-S/T sequons of adhesins. Molecular mimicry of eukaryotic N-glycoproteins might be the driving force for this convergent evolution of an N-glycosylation system. Surprisingly, O-GlcNAcylation is apparently absent in yeast. However, nucleocytoplasmic yeast proteins can carry O-mannose in the same conserved regions that are modified with O-GlcNAc in mammals, suggesting that a cytoplasmic O-mannosylation evolved in yeast by convergent evolution to function similarly to O-GlcNAcylation.

Structural Glycans

The most abundant biopolymers in nature include secreted polysaccharides such as cellulose, hemicellulose, chitin, and glycosaminoglycans. These large, chemically stable molecules provide crucial structural support for capsules, cell walls, exoskeletons, and extracellular matrices of countless organisms. The repeated β1-4 glycosidic linkages of cellulose and chitin also represent chemically extremely resistant sequestrations of accumulated glycans, given the inability of most organisms to hydrolyze these robust linkages. Compared to other types of glycans, structural polysaccharides show relatively little variation in evolution, which may indicate a more stringent selection for chemical and mechanical properties required for their functions. Interestingly, some of these structures, such as glycosaminoglycans, can also play nonstructural roles in other biological contexts, including cell signaling, which adds additional selection forces driving their evolution.

VIRUSES HIJACK HOST GLYCOSYLATION

Viruses typically have minimalist genomes and use host-cell machinery for replication. Thus, glycosylation of enveloped viruses reflects that of the host. However, there are exceptions. Giant viruses such as the algae-targeting chlorella virus and members of the amoeba-targeting Mimiviridae family have genomes large enough that they can express their own glycosylation machineries. On a smaller scale, specific viruses express glycosyltransferases as virulence factors. For instance, a baculovirus-encoded glucosyltransferase glycosylates insect host ecdysteroid hormones to block molting, and bacteriophage-derived glucosyltransferases modify 5-hydroxymethyl cytosine bases in the phage DNA to protect it from bacterial restriction enzymes. Host-derived glycosylation in enveloped viruses is typically extensive and the resulting “glycan shield” protects the virus from immune reactions against the underlying polypeptide. In this regard, it has been suggested that the high frequency of heterozygous states for human congenital disorders of glycosylation (Chapter 44) may reflect selection for genomes that limit glycosylation of invading viruses. Host lectins may also be “hijacked” by glycans on viral surface glycoproteins. For example, Sialoadhesin (Siglec-1; Chapter 15) is used by the heavily sialylated porcine reproductive and respiratory syndrome virus (PRRSV) to gain entry into macrophages. Viral protein sequence can also influence host glycosylation (e.g., by structural constraints on access for glycosyltransferases or hydrolases), in a manner that favors viral antigenicity, such as the high-mannose N-glycans on HIV-1 envelope gp120 trimers.

VAST DIVERSITY IN BACTERIAL AND ARCHAEAL GLYCOSYLATION

Despite the enormous potential for structural diversity, a rather limited subset of all possible monosaccharides and their possible linkages and modifications is found in eukaryotic cells. Why one encounters only such a limited subset of the possible glycan structures is a puzzling question. Could there be trade-offs between the variety of glycan structures and the additional resources, such as energy and biosynthetic pathways, that cells/organisms need in order to maintain that variety? Or maybe the variety of structures is limited by a combination of positive and negative selection due to both its intrinsic functions and interactions with symbionts and pathogens? Regardless, this limited subset has allowed better elucidation of structures of eukaryotic glycans. In contrast, Bacteria and Archaea have had several billion additional years to respond to selective pressure from pathogens, in particular phages. These organisms also have short generation times and can exchange genetic material across vast phylogenetic distances via plasmid-mediated horizontal gene flow. Glycosylation in Bacteria and Archaea is far more diverse, both in terms of the range of monosaccharides used or synthesized, and in the types of linkages and modifications (Chapters 21 and 22). In addition, prokaryotic cell–cell interactions both within and between species are often mediated by glycans. However, most work to date has focused on the glycans of pathogens, and we may have barely scratched the surface of prokaryotic glycan diversity.

MOLECULAR MIMICRY OF HOST GLYCANS BY PATHOGENS

Despite great differences between pathways generating glycan structures of bacteria and those of vertebrates, occasional microbial surface structures are strikingly similar to those of mammalian cells. Interestingly, most such examples of “molecular mimicry” occur in pathogenic/symbiotic microorganisms, apparently adapting them for better survival in the host by avoiding, reducing, or manipulating host immunity. A few examples include Escherichia coli K1 and Meningococcus group B (polysialic acid), E. coli K5 (heparosan, heparan sulfate backbone), Group A Streptococcus (hyaluronan), Group B Streptococcus (sialylated N-acetyllactosamines), and Campylobacter jejuni (ganglioside-like glycans). Initially, it was thought that the responsible microbial genes arose via lateral gene transfer from eukaryotes. However, in all instances in which genetic information is available, evidence points toward convergent evolution rather than gene transfer. For example, genes synthesizing sialic acids in bacteria seem to have been derived from preexisting prokaryotic pathways for nonulosonic acids, an ancestral family of monosaccharides with a structural resemblance. In contrast, bacterial sialyltransferases bear little resemblance to those of eukaryotes, and the vast sequence differences between different bacterial sialyltransferases indicate that these have even been reinvented on several separate occasions. Of course, lateral gene transfer is common among Bacteria and Archaea, facilitating rapid phylogenetic dissemination of such enzymatic “inventions.”

INTERSPECIES AND INTRASPECIES DIFFERENCES IN GLYCOSYLATION

Why do closely related species differ with regard to the presence or absence of certain glycans? Does the same glycoprotein have the same type of glycosylation in distinct but related species? Relatively little data exist regarding these issues, but examples of both extreme conservation and diversification are found. A reasonable explanation is that conservation of glycan structure reflects specific functional constraints for the glycans in question. In other instances, considerable evolutionary drift in the details of glycan structure might be tolerated, as long as the underlying protein is able to carry out its primary functions (changes with no consequences for survival or reproduction, i.e., those that are selectively neutral). Even in the absence of important endogenous functions, glycans can have key roles in mediating interactions with symbionts and pathogens. The evolution of diversity and microheterogeneity (across tissues and cell types) in glycosylation could well be of value to the organisms in providing additional obstacles to pathogens that use host glycans for attachment and entry. Free glycans (e.g., milk oligosaccharides) can also have important roles in attracting and feeding symbiont microbial communities needed for internal functions (e.g., immune maturation) and in accommodating or restricting these to particular areas of the host.

There can even be significant variation in glycosylation among members of the same species, particularly in terminal glycan sequences. The classic example is the ABH(O) histo-blood group system (Chapter 14), a polymorphism found in all human populations, which has also persisted for tens of millions of years of primate evolution and has even been independently rederived in some instances. Despite its clinical importance for blood transfusion, this polymorphism appears to cause no major differences in the intrinsic biology of individuals of the species (Chapter 14) beyond conferring a variable susceptibility to viruses, such as noroviruses, that use ABO glycans as receptors. Like other blood groups, the ABO polymorphism is accompanied by production of natural antibodies against the variants absent from an individual. These antibodies may be protective by causing complement-mediated lysis of enveloped viruses generated within other individuals who express the target structure. Thus, an enveloped virus generated in a B blood group individual might be susceptible to complement-mediated lysis on contact with an A or O blood group individual who has circulating anti-B antibodies. The glycosyltransferase gene ABO responsible for synthesizing ABO antigens is one of the few loci in the human genome shown to be under balancing (frequency-dependent) selection.

This latter mechanism may also provide an explanation for interspecies diversity via selection exerted by pathogens recognizing glycans as targets for attachment and entry into cells. This mechanism is likely operative in generating the extreme diversity of sialic acid types and linkages (Chapter 15). Recent analyses have tried to combine the two mechanisms: the antibody-mediated protection from intracellular but enveloped viruses and possible frequency-dependent protection from glycan-exploiting extracellular pathogens, such as Noro- and Rotaviruses and Plasmodium falciparum (the causative agent of malignant malaria). Modeling approaches have successfully generated observed frequencies of ABO by incorporating these two simultaneous selection pressures. The evolutionary persistence of the ABO system still needs further explanation.
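
The idea of balancing (negative frequency-dependent) selection can be made concrete with a toy model. The sketch below is not one of the published ABO models mentioned above. It assumes just two glycan variants in a haploid population and a pathogen that penalizes each variant in proportion to its current frequency. The function name, the linear fitness form `1 - s * frequency`, and the value of `s` are all illustrative assumptions.

```python
def simulate_frequency_dependence(p0: float, s: float = 0.5,
                                  generations: int = 200) -> list[float]:
    """Track the frequency of variant A under negative frequency dependence.

    Assumed toy fitness: w = 1 - s * (own frequency), so the commoner
    variant is the better target for a pathogen and pays a larger cost.
    """
    if not 0.0 < p0 < 1.0 or not 0.0 <= s < 1.0:
        raise ValueError("need 0 < p0 < 1 and 0 <= s < 1")
    p = p0
    trajectory = [p]
    for _ in range(generations):
        w_a = 1.0 - s * p          # fitness of variant A
        w_b = 1.0 - s * (1.0 - p)  # fitness of variant B
        mean_w = p * w_a + (1.0 - p) * w_b
        p = p * w_a / mean_w       # standard haploid selection update
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    for start in (0.05, 0.9):
        final = simulate_frequency_dependence(start)[-1]
        print(f"start={start:.2f} -> final={final:.4f}")
```

In this toy setting, both a rare and a common starting variant move toward an intermediate frequency, so neither is lost, which is the qualitative signature of balancing selection. Real ABO dynamics involve three alleles, diploid genetics, and antibody-mediated effects that are not represented here.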

Another unexplained phenomenon is genetic inactivation in Old World primates of the ability to synthesize the otherwise very common terminal Galα1-3Galβ1-4GlcNAc-R structure (Chapter 14). This variation is also associated with spontaneously appearing and persistently circulating antibodies against the missing glycan determinant, thus forming a kind of “interspecies blood group.” This glycan difference may also be protective for the primate lineage that lost the “αGal” structure and has a high-titer circulating antibody, as it is now better protected against infection by enveloped viruses emanating from other mammals. Independent losses of the vertebrate-specific sialic acid Neu5Gc in humans and some other mammalian clades are a further example of glycan evolution by loss of function (Chapter 15). In the process, these lineages have lost a potent signal of self, resulting in protection from microbes that synthesize this glycan.

Regardless of the mechanisms maintaining these types of polymorphisms, such intra- and interspecies diversity might also provide for “herd immunity,” a phenomenon whereby one glycan variant–resistant individual can indirectly protect other susceptible individuals by restricting the spread of a pathogen through the population. Such proposed protective functions of glycan diversity are only apparent at the level of populations. This complicates their study in model organisms in which the focus is classically on the individual. It is important to point out here that evolution itself is a process that occurs at the level of populations.

Future studies will have to test precisely how much of the extant inter- and intraspecies glycan variation is directly driven by host–pathogen interactions. Although glycan variation forms an important determinant of host susceptibility, variation on both target tissues and defensive secretions (mucins) must be considered when trying to understand disease, especially epidemics or zoonoses involving different host species and their interactions (e.g., influenza A) (Chapter 15). Finally, recent evidence suggests that antibodies against glycans absent in a subset of females in an extant species might aid in speciation via the killing of sperm from the remaining males in the population still bearing the glycan.

USE OF MODEL ORGANISMS TO STUDY GLYCAN DIVERSITY

For obvious reasons, the most detailed information about glycan structures is available for various popular model organisms as well as certain well-studied pathogens, and useful comparative knowledge can be gleaned from the relevant chapters that follow in this part of the book. But we must be careful about extrapolating data from organisms long maintained under optimal laboratory conditions to the overall taxa that they represent. The realization that rodents are among the closest evolutionary cousins to primates has provided added justification for their use to understand human disease. However, the late Nobel laureate Sydney Brenner suggested there is now enough information about humans to consider ourselves to be the “next model organism.” Indeed, studies that combine tractable questions about the pathobiology of naturally occurring mutations affecting glycans in humans with mechanistic studies in suitable model organisms tend to provide deeper insights into glycan functions.

WHY DO WIDELY EXPRESSED GLYCOSYLTRANSFERASES SOMETIMES HAVE LIMITED INTRINSIC FUNCTIONS?

It was once popular to suggest that every glycan on every cell type must have a critical intrinsic host function. Analysis of data on glycosyltransferase-deficient mice suggests that this is not the case. For example, ST6Gal-I α2-6 sialyltransferase is the main enzyme that produces Siaα2-6Galβ1-4GlcNAcβ1- termini on vertebrate glycans. Although this glycan serves as a specific ligand for the B-cell regulatory molecule CD22 (Siglec-2; Chapter 35), it is also found on many other cell types, as well as on many soluble secreted glycoproteins. Furthermore, ST6Gal-I mRNA varies markedly among cell types, and ST6Gal-I transcription is regulated by several cell type–specific promoters, which are in turn modulated by hormones and cytokines. Despite these data, the prominent consequences of eliminating its expression in mice so far seem to be restricted to the immune and hematopoietic systems and some cell adhesion, apoptosis, and oncogenic pathways. If the specific intrinsic functions of the ST6Gal-I glycan product are in fact restricted to some systems, why is it expressed in so many other locations? And why up-regulate its expression so markedly in the liver and endothelium during a so-called “acute phase” inflammatory response? Besides feedback on the immune system, could it be that scattered expression of this structure in other locations functions as a “smoke screen” or temporary “firewall,” restricting intra-organismal spread of an invading pathogen? Could it also be that heavily glycosylated nonnucleated cells like mammalian erythrocytes act as “decoy traps” for viral pathogens that require nucleated cells for replication? Answers to these questions must take into account the evolutionary selection pressures (both intrinsic and extrinsic recognition phenomena such as host–pathogen interactions and innate immune contributions) on glycosyltransferase products. 
Many effects may also not be apparent in inbred, genetically modified mice living in hygienic vivaria; revealing them may require studies in a natural, pathogen-rich environment. It is also possible that other gene products compensate for the loss, masking phenotypes in these model systems. Furthermore, it is likely that we have neither examined such genetically modified mice closely enough nor applied the relevant environmental pressures to elicit phenotypes.

EVOLUTIONARY FORCES DRIVING GLYCAN DIVERSIFICATION IN NATURE

Based on available data, it is reasonable to suggest that the evolution of glycan diversification in complex multicellular organisms has been driven by selection pressures of both intrinsic and extrinsic origin relative to the organism under study (Chapter 7). Glycans are particularly susceptible to the “Red Queen” effect, in which host glycans must keep changing to stay ahead of the pathogens, which have more rapid evolutionary rates because of their short generation times, high mutation rates, and rampant horizontal gene transfer. Glycan evolution is expected to be under selective constraints—that is, opposing selective forces influencing their course of evolution. Given their important role in definition of the molecular frontier of cells and organisms, the same glycan can come under opposing selective pressures at different times in the life of a cell or organism. Glycans favoring cell motility (e.g., polysialic acid) will be favored for development but become detrimental when accidentally exploited by malignant cells. Glycans on reproductive tract secretions favoring survival of sperm (e.g., Glycodelin S) might be counter-selected in females who benefit from a different glycan form (Glycodelin A), which challenges male gametes as part of female quality control. Unique glycans evolved as reliable SAMPs can become a liability if exploited by pathogens through molecular mimicry. Given the rapid evolution of extrinsic pathogens and their frequent use of glycans as targets for host recognition, it seems likely that a significant portion of the overall diversity in vertebrate cell-surface glycan structure reflects such pathogen-mediated selection processes. Meanwhile, even one critical intrinsic role of a glycan could disallow its elimination as a mechanism to evade pathogens. Thus, glycan expression patterns may represent trade-offs between evading pathogens (or accommodating symbionts) and preserving intrinsic functions.

More gene disruption studies in intact animals could help differentiate intrinsic and extrinsic glycan functions. More systematic comparative glycobiology could also contribute, making predictions about intrinsic glycan function—that is, the consistent (conserved) expression of the same structure in the same cell type across several taxa would imply a critical intrinsic role. Such work might also help define the rate of glycan diversification during evolution, better define the relative roles of the intrinsic and extrinsic selective forces, and eventually lead to a better understanding of the functional significance of glycan diversification during evolution. The possibility that pathogen-driven glycan diversification might even favor the process of sympatric speciation (via reproductive isolation) also needs to be further explored.

ACKNOWLEDGMENTS

The authors appreciate helpful comments and suggestions from Cristina De Castro and Christopher Mark West.

Copyright © 2022 The Consortium of Glycobiology Editors, La Jolla, California; published by Cold Spring Harbor Laboratory Press; doi:10.1101/glycobiology.4e.20. All rights reserved.

The content of this book is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 Unported license. To view the terms and conditions of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK579955; PMID: 35536960; DOI: 10.1101/glycobiology.4e.20
