Abstract
The current pandemic of coronavirus disease 2019 (COVID-19) is caused by a novel coronavirus (SARS-CoV-2) that triggers an acute respiratory disease. The emergence of SARS-CoV-2 is the third large-scale epidemic caused by a highly pathogenic coronavirus in the human population in recent years. It follows the severe acute respiratory syndrome coronavirus (SARS-CoV) in 2003 and the Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012. Like SARS-CoV, SARS-CoV-2 employs the angiotensin-converting enzyme 2 (ACE2) receptor and spreads principally in the respiratory tract. The viral spike (S) protein of coronaviruses facilitates attachment to the cellular receptor, entry, and membrane fusion. The S protein is a glycoprotein and is critical to elicit an immune response. Glycosylation is a biologically significant post-translational modification in virus surface proteins. These glycans play important roles in the viral life cycle, structure, immune evasion, and cell infection. However, more information is needed about viral behavior and the host's immunological response after SARS-CoV-2 infection. The present review discusses the implications of SARS-CoV-2 S protein glycosylation in the SARS-CoV-2/ACE2 interaction and the immunological response. Elucidation of the glycan repertoire on the spike protein can propel research for the development of an appropriate vaccine.
Introduction
A new infectious respiratory disease caused by SARS-CoV-2 emerged in December 2019. A significant number of patients develop acute respiratory distress syndrome (ARDS), presenting with cough, fever, and dyspnea. Numerous members of the family Coronaviridae regularly circulate among the human population and frequently cause moderate respiratory disease (67,74). Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) cause acute lung injury (ALI) and ARDS, which lead to pulmonary failure and fatality (43). This new virus is a member of the β group of coronaviruses. The International Committee on Taxonomy of Viruses (ICTV) named the virus SARS-CoV-2; the disease it causes, COVID-19, is a serious global public health concern (74).
In 2003, SARS-CoV-1 infected 8,098 individuals, with a mortality rate of 9%; SARS-CoV-2 has infected 55,659,785 individuals, with 1,338,769 deaths around the world so far, according to the World Health Organization (WHO). The transmission rate of SARS-CoV-2 is higher than that of SARS-CoV-1, probably because of features of the S glycoprotein in the receptor-binding domain (RBD) region that enhance its transmission capacity. SARS-CoV-1 and SARS-CoV-2 share ∼76% amino acid identity. The spike S glycoprotein of coronaviruses facilitates the binding to target cells. SARS-CoV-2 appears to be optimized to bind easily to the angiotensin-converting enzyme 2 (ACE2) receptor for entry and uses the cellular serine protease TMPRSS2 for S protein priming. The efficiency of binding at the SARS-CoV-2/ACE2 receptor-binding site is a determinant of SARS-CoV-2 transmissibility (19,27,39).
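As a rough illustration of how the counts above relate to a mortality figure, the following minimal sketch computes a crude case fatality ratio (deaths divided by reported cases) from the SARS-CoV-2 numbers quoted in the text. The function name `crude_cfr` is hypothetical, and a crude ratio of this kind ignores reporting delays and undetected infections, so it is not directly comparable to a carefully estimated mortality rate.

```python
def crude_cfr(deaths: int, cases: int) -> float:
    """Return the crude case fatality ratio as a percentage (deaths / cases * 100)."""
    if cases <= 0:
        raise ValueError("cases must be a positive integer")
    return 100.0 * deaths / cases


# Cumulative SARS-CoV-2 counts quoted in the text (WHO figures at the time of writing).
sars_cov_2_cfr = crude_cfr(deaths=1_338_769, cases=55_659_785)
print(f"Crude SARS-CoV-2 case fatality ratio: {sars_cov_2_cfr:.1f}%")
```

With these counts the crude ratio is about 2.4%, well below the 9% quoted for SARS-CoV-1, even though the absolute number of deaths is far larger.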
Viral infections are detected through pattern recognition receptors (PRRs), which recognize pathogen-associated molecular patterns (PAMPs). PAMPs include carbohydrates, proteins, lipids, lipoproteins, and nucleic acids from viral, parasitic, bacterial, and fungal origins, and are recognized by Toll-like receptors (TLRs) (30). Nevertheless, the viral envelope protein is regularly modified by the addition of complex glycan structures that represent half of the molecular weight. The post-translational modification of these antigens by glycosylation helps the pathogens to evade the host immune system and impairs the capacity of the host to raise an effective adaptive immune response (62,63).
The cryo-EM structure of the SARS-CoV-2 S glycoprotein suggests that the protein is highly glycosylated, with a pattern of glycosylation similar to that of the SARS-CoV-1 S glycoprotein (27,62,64). In this review, we discuss the implications of SARS-CoV-2 S protein glycosylation in the SARS-CoV-2/ACE2 interaction and the immunological response.
Molecular Structure of SARS-CoV-2
SARS-CoV-2 is an enveloped virus that belongs to the subfamily Orthocoronavirinae in the family Coronaviridae, order Nidovirales. Its size ranges from 65 to 125 nm in diameter, and it contains a positive-sense single-stranded RNA (ssRNA) genome (26–32 kb) (64). The coronavirus family has been classified into four subgroups: alpha, beta, gamma, and delta coronaviruses (13,31). Phylogenetic study of coronavirus genomes has shown that SARS-CoV-2 is a beta coronavirus, a group that includes SARS-CoV and MERS-CoV. The virus that causes COVID-19 is a SARS-like coronavirus of a kind that had previously been reported in bats in China. SARS-CoV-2 seems to be closely related to the bat coronavirus RaTG13, sharing >93.1% sequence identity in the spike (S) gene (18,48).
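Sequence identity figures such as the one quoted above are derived from aligned sequences. The sketch below shows one common way to compute percent identity for two sequences that have already been aligned (gaps written as `-`). The function name `percent_identity` and the toy sequences are illustrative assumptions, not data from the cited studies; published values also depend on the alignment method and on how gapped columns are counted.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy pre-aligned nucleotide fragments (hypothetical, for illustration only).
toy_a = "ATGTTTGT-TT"
toy_b = "ATGTTCGTATT"
print(percent_identity(toy_a, toy_b))  # 9 matches over 10 ungapped columns -> 90.0
```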
The SARS-CoV-2 genome was submitted to the National Center for Biotechnology Information (NCBI) with ID NC_045512; it comprises 29,903 nucleotides of ssRNA. SARS-CoV-2 has been reported to share >80% identity with a previous coronavirus (SARS-like bat CoV) and contains 10 open reading frames (ORFs). The first ORF (ORF1a/b) constitutes two-thirds of the viral RNA and is translated into two polyproteins (10,25,66). In SARS-CoV and MERS-CoV, the processing of the polyproteins, pp1a and pp1ab, results in 16 nonstructural proteins (nsp1-nsp16), which form the viral replicase–transcriptase complex. The nsps rearrange membranes that originate from the rough endoplasmic reticulum (ER) into double-membrane vesicles, where viral replication and transcription occur.
The remaining SARS-CoV-2 ORFs, located in the last one-third of the genome, encode four main structural proteins, the spike (S), envelope (E), nucleocapsid (N), and membrane (M) proteins, as well as accessory proteins (3a, 3b, p6, 7a, 7b, 8a, 8b, 8c, 9b, and orf14), which do not contribute to viral replication (Fig. 1A) (66). The SARS-CoV-2 S glycoprotein has also been reported to be modified by homologous recombination; that is, it is a mixture between the bat SARS-CoV and an unknown beta-CoV (27,52). Phylogenetic network analysis of SARS-CoV-2 genomes reveals variants that differ by point mutations causing amino acid changes; these variants have been named A, B, and C. Type A is related to the ancestral type, according to the bat coronavirus outgroup. Phylogenetic patterns describe the early stage of an epidemic before it is potentially affected by subsequent migration and mutation. Phylogenetic classification can be used to rule out or confirm these effects when designing treatments and, eventually, vaccines (14).

Fig. 1. Introduction to coronaviruses and the S glycoprotein.
The SARS-CoV-2 S Glycoprotein
The S glycoprotein of SARS-CoV-2 plays a relevant role in infecting the host cells and in its transmission capacity (27,70). The S glycoprotein is a type I transmembrane protein of 1273 amino acids, which is a trimer in the prefusion and postfusion conformations; the cryo-EM analysis has confirmed this structure for both conformations (Fig. 1B). This glycoprotein comprises two subunits responsible for host cell receptor binding (S1 subunit) and fusion of the viral and cellular membranes (S2 subunit) (19,35,60,70). The SARS-CoV-2 S glycoprotein contains 22 N-linked glycosylation sequons per protomer; the map of oligosaccharides has been resolved by cryo-EM for 16 of the sites and, experimentally, it has been confirmed that ∼19 of them are glycosylated (61,67,70). Twenty of the 22 N-linked glycosylation sequons of SARS-CoV-2 S are conserved in SARS-CoV-1 S. Specifically, 9 of 13 glycans in the S1 subunit and all 9 glycans in the S2 subunit are conserved between SARS-CoV-2 S and SARS-CoV-1 S (Fig. 1C). S2 subunit N-linked glycosylation is mainly conserved in SARS-CoV S glycoproteins, indicating that the accessibility of the viral fusion machinery is comparable between these viruses. Recent evidence shows low levels of O-glycosylation in the S protein (46,47,60–62,72). These oligosaccharides contribute to S protein folding, impinge on priming by host proteases, and regulate antibody recognition (46,60,70). N-glycosylation is characterized by the attachment of GlcNAc to the amide nitrogen of the Asn residue in the Asn-X-Ser/Thr consensus sequence, in which “X” represents any amino acid except Pro. Mucin-type O-glycosylation is characterized by GalNAc linked to the hydroxyl of Ser or Thr residues. Mucins are a class of glycoproteins that contain a great number of O-GalNAc glycans (4).
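The N-glycosylation consensus described above (Asn-X-Ser/Thr, with X being any residue except Pro) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch; the function name `find_sequons` and the toy peptide are illustrative assumptions. A matching sequon only marks a potential site, since occupancy has to be confirmed experimentally, as noted above for the ∼19 confirmed sites out of 22 sequons.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps the match zero-width after N, so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon (X != Pro)."""
    sequence = protein.upper()
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Toy peptide (hypothetical, not the spike sequence): N-G-T and N-N-T/N-T-S match, N-P-S does not.
toy_peptide = "MNGTANPSQNNTS"
print(find_sequons(toy_peptide))  # [2, 10, 11]
```

Applied to a full-length spike sequence in one-letter code, the same function would list the candidate N-linked sites per protomer, which could then be compared with the experimentally mapped glycans.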
The binding affinity of the S glycoprotein of SARS-CoV-2 for the ACE2 receptor is 10- to 20-fold higher than that of SARS-CoV-1 (16,27,70). After viral–cell membrane fusion occurs, the viral RNA genome is released into the cytoplasm, and the RNA is translated into the polyproteins pp1a and pp1ab, which encode nonstructural proteins and form replication–transcription complexes (RTCs) in double-membrane vesicles (Fig. 2). The RTCs continuously replicate the genome and synthesize subgenomic RNAs, which encode accessory proteins and structural proteins. The newly synthesized proteins are post-translationally modified in the ER and the Golgi apparatus, after which nucleocapsid proteins and envelope glycoproteins assemble to form viral particles. The formation of new viral particles and their release are driven by the membrane protein (M), the envelope protein (E), and the nucleocapsid protein (N); interactions with the M protein might facilitate S protein incorporation into particles. The trimers of the S protein projecting from the viral envelope give virions a crown-like (Lat. corona) appearance, from which the name “coronavirus” originates. In the last stage, virion-containing vesicles fuse with the plasma membrane to release the virus (9,18,48,66). Therefore, understanding the structure and function of the S protein can help develop monoclonal antibodies, drugs, and vaccines.

Fig. 2. Schematic representation of the life cycle of SARS-CoV-2 in host cells. The virus begins its life cycle when the S glycoprotein binds to the extracellular ACE2 receptor. After receptor binding, a conformational change in the S protein helps viral envelope fusion with the cell membrane.
The Interplay Between the SARS-CoV-2 S Glycoprotein and the Receptor ACE2
Viral infection is initiated by the binding of the virus to the appropriate host cells (38). Glycans play multifaceted roles on the surfaces of host cells and viruses. These glycans participate in viral entry, proteolytic cleavage of viral proteins, and recognition and neutralization of the virus by the host immune system (4,21). Although several virus–host binding mechanisms directly involve protein–protein interaction, carbohydrate molecules can similarly serve as primary receptors or coreceptors that contribute to the cell tropism and host restriction of the virus (41). Thus, the interactions between surfaces enriched in complex glycans and lectins, also known as glycan-binding proteins (GBPs), play a substantial role in infection by several viruses (32,34,42,53,58).
Several virus–host bindings involve direct interactions with sialic acids (carbohydrate molecules) that may also serve as receptor-binding determinants (53). Sialoglycans contribute to the composition and complexity of the glycan chain, and are responsible for its structural variety. Sialic acids (Sia) are a complex group of 9-carbon sugars; the most frequent type is N-acetylneuraminic acid (Neu5Ac), in which the amino group at C5 is N-acetylated. Other Sia derivatives contain N-glycolylneuraminic acid (Neu5Gc) or O-acetylated groups (Neu-O-Ac) (34,53). Viral surface glycoproteins specifically recognize Sia as receptors; that is, complex glycans composed of α2,3- or α2,6-linked sialic acids (N-acetylneuraminic acid) (42). For example, in humans, the upper respiratory epithelial surface shows mainly sialylated glycan receptors that include α2,6-linked sialic acid recognized by the hemagglutinin (HA) of influenza. Other viruses recognize linear sulfated glycosaminoglycans (such as heparan sulfate), which act as coreceptors for a wide variety of viruses, including dengue and hepatitis C (53,54). This occurs by interactions between the negative charges of heparan sulfate proteoglycans and the basic amino acids of viral surface proteins (8).
As mentioned before, the glycans displayed on the viral surface are added during replication inside the host. In addition, glycosylation of the virus is critical for maintaining the stability of these proteins and the viral particles, as suggested for flaviviruses, including dengue and Zika, and for surface glycoproteins from influenza A virus, coronavirus (SARS-CoV), and Ebola virus, among others (4,21,41). In most cases, these glycans maintain the stability of certain viruses, such as dengue and Ebola, by specific interactions with GBPs (such as C-type lectins) displayed on the host surface (41). The S glycoprotein of coronavirus is a trimeric protein that specifically recognizes different cell surface receptor glycoproteins and governs the entry of the virus into the host cell. The S glycoprotein binds to its cellular receptor: ACE2 for SARS-CoV and SARS-CoV-2; CD209L, which is a C-type lectin (also called L-SIGN), for SARS-CoV; and DPP4 (dipeptidyl peptidase 4) for MERS-CoV (15,21,34,71).
Specific N-glycosides have been recognized in MERS-CoV and SARS-CoV, which are necessary for trafficking and viral particle egress (63). Moreover, MERS-CoV has a Sia-binding site located inside domain A of the S1 subunit. This domain recognizes short, sulfated, α2,3-linked sialosaccharides, and long, branched, α2,3 di-Sia and tri-Sia glycans with 3 Galβ1–4GlcNAcβ1–3 (LacNAc) tandem repeats. MERS-CoV S1A preferentially binds α2,3-linked sialosides; its binding affinity for α2,6-linked sialosides is low. Neu5Gc, as well as 9-O-acetylation, inhibits MERS-CoV S1A binding. The distribution of glycans in the host, together with the receptor-binding specificity of the virus, delimits tissue tropism, pathogenesis, and transmissibility. Knowledge of the MERS-CoV S protein–DPP4 interaction has led to the understanding of these aspects of the virus biology and its cross-species epidemiology (26,29,45,55).
The SARS-CoV-2 S glycoprotein is highly glycosylated, with 22 predicted N-linked glycosylation sites and three O-glycosylation sites (60,62,70). Shajahan et al. observed high-mannose, hybrid, and complex-type glycans, differing in branching, fucosylation, and sialylation, across the N-linked glycosylation sites (Fig. 1C). Of the 22 sites of N-linked glycosylation on the S glycoprotein, 8 sites contain oligomannose-type glycans, which play important roles in proper protein folding and priming by host proteases, and 14 sites are glycosylated by complex-type glycans (60–62). The highly sialylated glycans act as determinants in viral binding to the ACE2 receptor (19,46,55). Zhao et al. revealed six sites of N-linked glycosylation on ACE2, occupied principally by complex-type glycans, with low levels of high-mannose and hybrid glycans. Sulfated N-linked glycans could not be detected. In these glycoproteins, the O-glycans were present at very low levels of occupancy (72). The abundance of complex-type N-glycans provides substantial protection of the peptide backbone and a steric hindrance to processing enzymes. The sulfated N-linked glycans might be important in immune regulation and receptor binding; however, they were not observed in ACE2. The glycans at each site of the immunogen appeared to be slightly more processed (9,72).
Many glycosylation sites in the S protein and ACE2 are heterogeneous and can carry several different glycan structures, generating diversity in site-specific glycosylation. Glycoproteins with a high density of glycans can facilitate the camouflaging of immunogenic epitopes and promote immune evasion (20,33,40,56,60). Specifically, the complex-type glycans are a crucial element to be considered in immunogen engineering. The epitopes of SARS-CoV-2 S glycoprotein recognized by the neutralizing antibodies can contain fucosylated glycans. Of the N-linked glycans of the S glycoprotein, 52% are fucosylated and 15% contain at least one sialic acid residue. Watanabe et al. reported that these glycoproteins are highly fucosylated; 98% of the detected glycans contain fucose residues (62).
Moreover, low levels of O-linked glycosylation have been detected, suggesting that O-glycans of this region are insignificant when the structure is native-like. The presence of O-glycans in some viral proteins suggests an important role in biological activity. In SARS-CoV-2 S1, O-type glycosylation by O-GalNAc and O-GlcNAc seems to be involved in protein stability and function (47,62,63). The S glycoprotein is a target in vaccine design; changes in the glycosylation of viral spikes can disclose important elements of viral biology and facilitate vaccine design strategies.
Viral glycosylation determines protein-mediated folding and stability (5,17,47,51). Cryo-EM studies showed that the interaction between the S glycoprotein and the ACE2 receptor induces dissociation of the S1 subunit from ACE2, and prompts the S2 subunit to become more stable in the postfusion state, which is essential for membrane fusion (27,61,65). In vitro binding measurements, biochemical interaction studies, and crystal structure analysis revealed that the SARS-CoV-2 RBD binds to human ACE2 with a high affinity in the nanomolar range (27,60). Wang et al. (61) described that glutamine 394 in the SARS-CoV-2 RBD region corresponds to residue 479 of SARS-CoV-1, and that it is recognized by lysine 31 on the human ACE2 receptor, indicating that the SARS-CoV-2 S glycoprotein has a high affinity for the human ACE2 receptor and that SARS-CoV-2 spreads among people more efficiently than SARS-CoV (60). In addition, Zhao et al. demonstrated a direct glycan–glycan interaction between the S glycoprotein and the ACE2 receptor, which adds complexity to interpreting the glycosylation diversity that is responsible for viral infection (72). The glycosylation variations in the S protein and its interaction with the ACE2 receptor are crucial to understanding their influence on the immunological response and the efficiency of neutralizing antibodies (9,40,56).
In addition, it is now known that CD209L (C-type lectin that binds to high-mannose glycans) is an alternative receptor for SARS-CoV-2. Thus, the interaction between CD209L and the high-mannose N-glycan structure in the S glycoprotein of SARS-CoV-2 could be mediating the endocytosis of viruses (22,57). In summary, the S glycoprotein of SARS-CoV-2 binds to the ACE2 receptor and CD209L, facilitating virus entry and replication in the host cell.
Immune Response in SARS-CoV-2 Infection
N-linked glycosylation is known to be important for the localization, structure, progeny development, and infectivity of several viruses, but its role in the immune response is less well understood (54,59). Viral entry into the host cell triggers the innate immune response that develops the inflammatory process (Fig. 3). The carbohydrate structures on the S glycoprotein and the released viral RNAs might, therefore, represent a unique class of PAMPs. The PAMPs are recognized by the host PRRs, such as C-type lectins, collectins, TLR3, TLR4, TLR7, TLR8, and TLR9 (1,2,24). Specifically, the receptors TLR3 and TLR4 recognize SARS-CoV, causing an inflammatory response through both MyD88- and TRIF-mediated pathways; the same process may be hypothesized for SARS-CoV-2 (30,59). Moreover, in the cytoplasm, the viral RNA receptor retinoic-acid-inducible gene I (RIG-I), the cytosolic receptor melanoma differentiation-associated gene 5 (MDA5), and the nucleotidyltransferase cyclic GMP-AMP synthase (cGAS) recognize the viral RNA and DNA (23,68,69). The recent evaluation of COVID-19 patients revealed an increase in the activity of the inflammasome and the IL-1β pathway induced by SARS-CoV; these processes play a critical role in its pathogenesis (11,12,49).

Fig. 3. The immune response after SARS-CoV-2 infection. The binding of SARS-CoV-2 to the ACE2 receptor in the host cell through the S protein leads to the release of genomic RNA in the cytoplasm. TLR-3 receptors induce an immune response to dsRNA generated during SARS-CoV-2 replication, and cascades of signaling pathways (IRFs and NF-κB activation, respectively) are activated to produce type I IFNs and proinflammatory cytokines. The expression of type I IFN is important to increase the release of antiviral proteins for the protection of noninfected cells. Accessory proteins of SARS-CoV-2 can interfere with TLR-3 signaling and bind the dsRNA of SARS-CoV-2 during replication to prevent TLR-3 activation and evade the immune response. TLR-4 may recognize the S protein and lead to the activation of proinflammatory cytokines through the MyD88 signaling pathways. Virus–cell interactions contribute to the strong production of immune mediators. The secretion of large amounts of cytokines and chemokines (IL-1, IL-6, IL-8, IL-21, TNF-β, and MCP-1) is promoted in infected cells in response to SARS-CoV-2 infection. All these chemokines recruit lymphocytes to the infection site.
The mechanism of antigen presentation in SARS-CoV-2 infection remains unknown, but some information can be drawn from previous research on SARS-CoV and MERS-CoV. The antigen-presenting cells are responsible for presenting the viral antigen through the major histocompatibility complex (MHC), where it is recognized by cytotoxic T lymphocytes (30,37,44).
Particularly, the MHC I molecules participate in the antigen presentation of SARS-CoV, but MHC II also contributes to its presentation. Moreover, the risk of SARS-CoV infection is associated with the gene polymorphisms of mannose-binding lectin that are related to antigen presentation (18,50). The high density of N-glycosylation (mainly, high-mannose N-glycans) in the S glycoprotein facilitates viral escape by interfering with proteolytic processing of envelope peptides for presentation by the MHC. Consequently, the antigen presentation stimulates the humoral and cellular immunity mediated by specific B and T cells (18,50).
Glycoconjugates are present in the cellular membrane; for this reason, they are critical for immune recognition. They are T-cell independent antigens that fail to induce immunological memory and immunoglobulin class-switching. With carbohydrate-based vaccines, IgM antibody production dominates the immunological response, with low IgG production. Similarly, COVID-19 presents an antibody profile against the SARS-CoV virus with a typical pattern of IgM and IgG production. At ≥10 days after the onset of symptoms, high levels of IgG and IgM against NP or RBD of SARS-CoV-2 have been reported (3,28,33). The IgG antibodies play a protective role, but SARS-specific IgM antibodies disappear at the end of week 12. Moreover, the enhancement of IgA antibodies in the mucosa could be important for preventing SARS-CoV infections (3,28,33). In the acute phase, patients with SARS-CoV present a decline of CD4+ T and CD8+ T cells. However, in SARS-CoV-recovered patients, CD4+ and CD8+ memory T cells can stimulate T cell proliferation and the production of IFN-γ even in the absence of antigen (6,73,75).
During viral infection, the equilibrium between the pro- and anti-inflammatory responses is decisive for the clinical outcome. Most reports concentrate on severe cases and adaptive immune responses. However, the innate immune response, the acute-phase reactants, and the cytokine storm are poorly understood. The cytokine storm is the principal factor for high mortality, multiorgan failure, ARDS, and disseminated intravascular coagulation (3,7,36,44,50,73). The report of Zhu et al., in Lancet, showed that ARDS is the primary cause of death from COVID-19 (74). The activation of the NOD-like receptor family pyrin domain-containing-3 (NLRP3) inflammasome is associated with the virulence and pathogenicity of SARS-CoV-2. In macrophages, epithelial cells, and endothelial cells, the activation of the inflammasome induces an increase of the proinflammatory cytokines IL-1β and IL-18, which contribute to the inflammation and severity of symptoms of COVID-19 (3,30). SARS-CoV and MERS-CoV employ several strategies to avoid immune responses and survive in host cells. These viruses stimulate the production of double-membrane vesicles that are deficient in PRRs, then replicate within these vesicles, and thus prevent the host from detecting their dsRNA (36,50,73,75). Therefore, antigen presentation is essential for gene expression in the immunological response and elimination of SARS-CoV and MERS-CoV after infection. Understanding the structure and mechanism of viral infection by SARS-CoV-2 is necessary for the development of specific drugs for the clinical treatment of COVID-19.
Conclusion
The present review discusses the mechanisms of SARS-CoV-2 binding to the ACE2 receptor. The SARS-CoV-2 S glycoprotein has a high affinity for the ACE2 receptor and participates in viral entry into host cells and in the spread of the virus among people. The SARS-CoV-2 S protein contains 22 N-linked glycosylation sequons per protomer. N-linked glycosylation has an important role in protein folding and stability, and is responsible for viral tropism. The N-glycans in viral particles are necessary for trafficking to the surface and egress. Knowledge of glycan structures, recognition mechanisms, and their functionality has similarly resulted in the development of several therapeutic alternatives to treat SARS-CoV-2 infection. Varying the glycosylation of the S protein surface is, therefore, a mechanism by which new virus strains could evade the host immune response and diminish the efficacy of vaccines.
Footnotes
Author Disclosure Statement
The authors have no conflicts of interest to declare.
Funding Information
This study was financed by the DGAPA of the Universidad Nacional Autónoma de México, through the postdoctoral fellowship program to ERH, and the PAPIIT (IN213818) program.
