Protein-Linked Glycans

Three main types of protein-attached glycans are recognized. N-linked glycans are attached to asparagine in an Asn-Xxx-Ser (or Thr) consensus sequence, where Xxx is any amino acid except proline. Recently, a few glycoproteins have been reported in which cysteine replaces serine or threonine in this consensus sequence [26]. O-linked glycans are generally smaller and more varied in structure [25] and are attached to either serine or threonine, with no directing consensus sequence. A third type comprises glycans that form part of a glycosylphosphatidylinositol lipid anchor.

O-linked glycans from mammalian systems are generally classified according to their core structures, as outlined in Table 3.1. Several O-linked glycosylation sites are commonly found in close proximity, so that proteolysis rarely produces fragments containing a single site, as it does with N-glycans. Consequently, peptides containing several glycans are often examined either as intact molecules or after stripping the glycans to their attached GalNAc residue. A database of O-glycan structures is available [28].

N-glycans, although often larger than O-linked structures, have well-defined overall structures: all contain a trimannosyl-chitobiose pentasaccharide core that is attached to the protein through an amide link from the reducing-terminal GlcNAc residue.

Table 3.1 Structures of the cores of O-glycans [27].

Core      Structure
Core 1    Galβ1→3GalNAcα-Protein
Core 2    GlcNAcβ1→6(Galβ1→3)GalNAcα-Protein
Core 3    GlcNAcβ1→3GalNAcα-Protein
Core 4    GlcNAcβ1→6(GlcNAcβ1→3)GalNAcα-Protein
Core 5    GalNAcα1→3GalNAcα-Protein
Core 6    GlcNAcβ1→6GalNAcα-Protein
Core 7    GalNAcα1→6GalNAcα-Protein
Core 8    Galα1→3GalNAcα-Protein

Source: Adapted from Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W., Etzler, M. E.: Essentials of Glycobiology, Second Edition. Cold Spring Harbor Laboratory Press (2008).

Because of the complexity of many of these glycans, it is customary in this field to depict the structures in a picture format. Symbols such as circles or squares are chosen to represent the individual monosaccharide constituents, which are then linked in the sequence of the glycan. In former times, most laboratories chose their own symbols, but a more consistent system was introduced by the Consortium for Functional Glycomics (CFG) and is outlined in Figure 3.1.

In the CFG system, monosaccharides of different mass are differentiated by shape, and the different types of, for example, hexoses are differentiated by color. Linkage is shown by written notes on the connecting bonds. Although widely adopted, this system suffers from major drawbacks, not least of which is the manner of indicating linkage. An alternative system (the Oxford system), proposed in 2009, overcomes this problem by indicating the linkage positions by the angle of the lines connecting the monosaccharide symbols, with full lines denoting β-bonds and broken lines showing α-bonds. The requirement for color is overcome by using shapes to show the different isobaric monosaccharides, such as mannose (circle), galactose (diamond), and glucose (square), with various additions such as a solid fill to show the presence of an N-acetylamino group (as in GlcNAc (filled square)) or inclusion of a dot to show the absence of an OH group (e.g., a diamond with a dot to indicate fucose (deoxy-L-galactose)). This system has the added advantage that it can be extended to include other monosaccharides, such as tyvelose (dideoxy-mannose (circle with two dots)), without the need to invent entirely new symbols [29]. To make this system more familiar to users of the CFG system, the CFG colors have recently been incorporated for depicting N-linked glycans. This system is used in this chapter.

Figure 3.1 Methods for drawing N-glycans using symbols. Two systems are in common use, the Consortium for Functional Glycomics (CFG) and the Oxford systems. The CFG system uses color to differentiate hexoses, whereas the Oxford system uses shapes. Furthermore, the Oxford system identifies modifications to the basic monosaccharides by simply adding features such as a fill (N-acetyl) or dot (deoxy), whereas the CFG system often needs additional symbols. Linkage is indicated by text on the bonds linking the symbols in the CFG system, whereas the Oxford system uses the angle of the bonds to denote the linkage position. A new version of the CFG system will incorporate this latter feature as an option. The figure shows some of the symbols and examples of structures drawn with the different systems. Two versions of the Oxford system are shown: black and white and the same structures drawn with inclusion of the CFG colors.

The biosynthesis of N-glycans [30] is outlined in Figure 3.2. Briefly, the glycan Glc3Man9GlcNAc2 (1; Figure 3.2) is attached to the protein during translation in the endoplasmic reticulum and then degraded by enzymatic removal of the three glucose residues (2). The intermediates in this process are used by the cell to ensure correct folding of the protein. The central (d2) mannose is then removed (3), and the glycoprotein is transferred to the Golgi apparatus, where the two α-linked mannose residues on the 3-antenna (d1 arm) and that on the 6-branch of the 6-antenna are removed to leave Man5GlcNAc2 (4). These glycans are all referred to as “high-mannose” glycans. A GlcNAc residue is then attached at the 2-position of the 3-linked mannose (7), after which several pathways diverge. Galactose, followed (8) by sialic acid, linked either α2→3 or α2→6, can be attached to the 4-position of the GlcNAc to give what is known as a hybrid glycan. Alternatively, the two remaining mannose residues attached to the 6-linked core mannose residue are removed (6), and then, typically for mammalian systems, either of the core mannoses can have one or two Gal-GlcNAc (9) and finally Neu5Ac-Gal-GlcNAc antennae attached to the 2- or 4-positions of the 3-mannose or at the 2- and 6-positions of the 6-mannose (9, 10, 11). Glycans with two such antennae (9) are known as biantennary complex glycans, those with three antennae (10) are triantennary glycans, and those with four antennae (11) are tetraantennary glycans. Furthermore, these antennae can be extended with addition of further Gal-GlcNAc groups. All of these glycans are known as “complex glycans.”

Figure 3.2 Biosynthesis of N-linked glycans. The structures (drawn in the Oxford format) that are shown are only a few of those possible. Arrows are illustrative and do not necessarily depict intact biochemical pathways. The antennae of the complex glycans usually terminate in N-acetylneuraminic acid in either α2-3- or α2-6-linkage.

Other common modifications to these structures are the addition of fucose to the core GlcNAc (12, 13) or to the galactose or GlcNAc residues (14) of the antennae, the addition of GlcNAc to the 4-position of the core mannose (known as a “bisecting” GlcNAc; 12), or the addition of further sialic acid residues to the ends of the antennae (15). Different types of N-glycan characterize various species: N-glycans from birds tend to have antennae that lack galactose, and those from fungi are mainly high-mannose glycans, whereas plants and insects produce truncated glycans known as “paucimannosidic” glycans, often containing xylose, with fucose attached to the 3-position of the reducing-terminal GlcNAc rather than to the 6-position that is common in mammalian glycans. The N-glycans in all of these species contain the same trimannosyl-chitobiose core, and only archaea appear to synthesize different types of N-glycan.
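The early trimming steps of the biosynthetic pathway, from Glc3Man9GlcNAc2 (1) to Man5GlcNAc2 (4), can be followed as changes in monosaccharide composition. Because glucose and mannose are isobaric, each removal is simply the loss of one hexose residue. The sketch below is illustrative only: the residue masses are the standard monoisotopic values rather than figures taken from this chapter, and the names `glycan_mass` and `TRIMMING_STEPS` are hypothetical.

```python
# Monoisotopic residue masses (standard values, an assumption of this
# sketch rather than data from the text).
HEX = 162.052823     # hexose residue (Glc, Man, Gal), C6H10O5
HEXNAC = 203.079373  # N-acetylhexosamine residue (GlcNAc), C8H13NO5
WATER = 18.010565    # added once for the free reducing-terminal glycan


def glycan_mass(hex_count, hexnac_count):
    """Monoisotopic mass of a neutral, underivatized Hex/HexNAc glycan."""
    return hex_count * HEX + hexnac_count * HEXNAC + WATER


# (label, hexose count, HexNAc count); numbers in parentheses follow
# the structure numbering of Figure 3.2.
TRIMMING_STEPS = [
    ("Glc3Man9GlcNAc2 (1)", 12, 2),
    ("Man9GlcNAc2 (2)", 9, 2),
    ("Man8GlcNAc2 (3)", 8, 2),
    ("Man5GlcNAc2 (4)", 5, 2),
]

for label, n_hex, n_hexnac in TRIMMING_STEPS:
    mass = glycan_mass(n_hex, n_hexnac)
    print(f"{label:22s} Hex{n_hex}HexNAc{n_hexnac}  {mass:.4f}")
```

The masses refer to the free neutral glycan; adducts and derivatization, which depend on the experiment, are deliberately left out.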

Glycoproteins can contain one or many sites for glycan attachment. Across the population of molecules, each of these sites may be fully occupied by one or many different glycans (over 100 in some cases), or it may be partially occupied or vacant. Occupancy of several sites by many glycans can result in large numbers of individual glycoprotein species (known as glycoforms). It has been estimated, for example, that Desmodus rotundus salivary plasminogen activator, with its two N- and four O-linked sites, contains in excess of 330,000 individual molecular species if the glycans are randomly distributed [31, 32].
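The way glycoform numbers grow can be seen with a short calculation. If the glycans are distributed independently across sites, the number of possible glycoforms is the product, over all sites, of the number of states each site can adopt. The sketch below uses hypothetical per-site glycan counts chosen only for illustration; it does not reproduce the published estimate cited above, and the function name `count_glycoforms` is an assumption.

```python
from math import prod


def count_glycoforms(glycans_per_site, allow_vacant=True):
    """Number of distinct glycoforms, assuming sites vary independently.

    glycans_per_site: number of different glycans observed at each site.
    allow_vacant: if True, an unoccupied site counts as one extra state.
    """
    extra = 1 if allow_vacant else 0
    return prod(n + extra for n in glycans_per_site)


# Hypothetical protein: two N-linked sites with 10 glycans each and
# four O-linked sites with 3 glycans each.
sites = [10, 10, 3, 3, 3, 3]
print(count_glycoforms(sites))                      # 30976
print(count_glycoforms(sites, allow_vacant=False))  # 8100
```

Even these modest per-site numbers give tens of thousands of glycoforms, which is why random distribution over six sites can yield very large totals.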

Structural determination of these compounds, therefore, requires not only the structure of the glycans but also identification of the site(s) to which they are attached. Not all possible sites are glycosylated and some may be partially glycosylated. Thus, the degree of glycosylation (site occupancy) is also a feature to be determined. Where molecules, such as IgG, contain several glycosylated protein chains, it is also important to determine the relationship of glycosylation between the chains.
