Abstract
Circoviruses represent a rapidly expanding group of viruses that infect both vertebrate and invertebrate hosts. Members are responsible for diseases of veterinary and economic importance, including postweaning multisystemic wasting syndrome in pigs, and beak and feather disease (BFD) in birds. These viruses are associated with lymphoid depletion and immunosuppressive conditions in infected animals leading to systemic illness. Circoviruses are small nonenveloped DNA viruses containing a single-stranded circular genome, encoding two major proteins: the capsid-associated protein (Cap), comprising the entirety of the viral capsid, and the replication-associated protein (Rep). Cap is the only protein component of the virion and plays crucial roles throughout the virus replication cycle, including viral attachment, cell entry, genome uncoating, and packaging of newly formed viral particles. Rep mediates recognition of replication origin motifs in the viral genome sequence and is responsible for endonuclease activity enabling nicking of the circular DNA and initiation of rolling-circle replication (RCR). The capsid of porcine circovirus 2 (PCV2) was the first circovirus capsid structure to be solved at atomic resolution using X-ray crystallography. The structure revealed an assembly of 60 monomeric subunits that form virus-like particles. Each Cap monomer harbors a canonical viral jelly roll domain composed of two four-stranded antiparallel β-sheets. Crystal structures of two distinct macromolecular assemblies from BFD virus Cap were also resolved at high resolution. In these structures, the exposure of the N-terminal arginine-rich motif, responsible for DNA binding and nuclear localization, is reversed. Additional structural investigations have also elucidated a PCV2 type-specific neutralizing epitope, and interaction between the PCV2 capsid and polymers such as heparin. In this review, we provide a snapshot of the structural and functional aspects of circovirus proteins.
Introduction
Viruses of the genus Circovirus are members of the small circular single-stranded DNA (ssDNA) virus family Circoviridae. The genus currently includes 39 virus species infecting a diverse range of vertebrate and invertebrate hosts (Virus Taxonomy: 2018b Release). Several circoviruses infect avian hosts, including beak and feather disease virus (BFDV) in Psittaciformes (4), canary circovirus (CaCV) (67), columbid circovirus (CoCV) (50), goose circovirus (GoCV) (87), finch circovirus (FiCV) (81), gull circovirus (GuCV) (86), raven circovirus (83), and starling circovirus (StCV) (18). In mammals, three viruses infect pigs: porcine circoviruses types 1 (PCV1), 2 (PCV2), and 3 (PCV3) (74). There are also canine circovirus in dogs (36), bat-associated circoviruses 1-8 (1), and mink circovirus (44). Circoviruses infecting fish include the barbel (34) and catfish circoviruses. In addition, circovirus-like viruses have been identified in starfish (21) and in crustaceans such as shrimp (66).
Circoviruses are among the smallest and simplest of all known viruses to replicate autonomously in animal cells (3,85). These viruses are common and infect a wide range of hosts (48,66). PCV and BFDV are thus far the most studied circoviruses, likely due to their significant impact in agricultural and ecological systems, respectively (2,24). All circoviruses have a small circular ssDNA genome of ∼1.7–2.0 kb in length, encapsidated into a nonenveloped, icosahedral virion ranging from 12 to 32 nm in diameter (1,17,59). Circoviruses are a model of biological efficiency, encoding as few as two proteins within their viral genome to facilitate replication (84). The two major proteins common to circoviruses are the replication protein (Rep), which is associated with viral rolling circle genomic replication, and the capsid protein (Cap), which is the structural element of the viral capsid. The capsid shell is simple, consisting of a single capsid protein that self-assembles into icosahedral virus-like particles (VLPs) of 60 capsid protein molecules (37,78).
Transmission of circoviruses among vertebrates is predominantly through direct contact with virus shed through feces, feathers, and nesting materials (43,53). Circoviruses display marked cell and tissue tropism. BFDV replication occurs in several tissues, including liver, skin, gastrointestinal tract, and bursa of Fabricius (71), whereas the Cap antigen has been identified in the spleen, thymus, thyroid, parathyroid, and bone marrow (41). However, viral attachment and entry into host cells may not necessarily translate into viral replication. Hence, some cells containing viral Cap particles may not represent permissive sites for circoviruses. For example, in vivo, PCV2 has been found to replicate in the heart, liver, lymphoid organs, and lungs, with the heart being the most important site for pathogenesis (75). In vitro, cells derived from several other tissues, such as epithelial, kidney, and monocyte lines, are permissive to PCV2 even though these tissues do not participate heavily in the natural progression of pathogenesis (55,57).
This tissue tropism of circoviruses has direct consequences for virus-induced disease. Infections are associated with lymphoid depletion and immunosuppressive conditions in infected vertebrate animals, leading to systemic illness such as postweaning multisystemic wasting syndrome in pigs caused by PCV2, and psittacine beak and feather disease caused by BFDV infection in birds (41,69). To understand the pathogenesis of economically important circovirus-related diseases, it is important to appreciate the relationships between circovirus protein structures and their functions in the replication cycle. In this review, we summarize the current understanding of circovirus protein structures, placing this within the context of the viral genome layout and the replication cycle.
The Replication Cycle of Circoviruses
Circovirus infection is initiated when virus particles bind to the surface of susceptible cells. The extracellular matrix component heparan sulfate (HS) has been shown to be bound by circovirus capsids (19), with this proteoglycan likely acting as a key cell receptor, as blocking of the Cap:HS interaction with the small molecule epigallocatechin gallate prevents PCV2 uptake (42). Following attachment to the cell surface, circovirus virions are internalized, with evidence suggesting that this is mediated by an ATP- and caveolin-dependent process for BFDV and by a clathrin-dependent or an actin- and small GTPase-dependent process for PCV2 (12,57). Although the details of the endocytosis and vesicular transport of circoviruses remain elusive, it is known that these viruses are able to cross bilayer membranes to enter the cytoplasm, presumably through the action of a cell-penetrating peptide (97). From the cytoplasm, viral Cap proteins interact with the host importin receptors through nuclear localization signal (NLS) motifs for transport into the nuclear lumen (12,97). Within the nucleus, circovirus ssDNA genomes dissociate from viral capsids, interact with host polymerase factors, and enter rolling circle replication (RCR) (57). Newly formed viral Cap proteins enter the nucleus, associate with the nascent genomic DNA through a process that is yet to be fully revealed, and assemble as virus particles (37,78).
Genome Biology of Circoviruses
Circoviruses all possess a circular, ssDNA genome of ∼2,000 nucleotides. The genome contains ambisense genes, that is, they require proteins encoded by both the viral (ORF V) and complementary (ORF C) strands of the genome to replicate (58). Circoviruses are dependent on their host's DNA polymerase to replicate (87), and the intergenic region of the genome contains a putative stem-loop structure with a conserved nonanucleotide sequence motif (TAGTATTAC) (Fig. 1), with ssDNA synthesis initiated by Rep through an RCR strategy (5). Circoviruses are unusually diverse in genome sequence composition, displaying high rates of mutation (approaching that observed for RNA viruses), despite the constraints on genome size and phenotypic innovations due to the structural limitations of encapsidation and the need to preserve the multiple functions of the capsid protein (such as virion attachment, DNA binding and assembly, nuclear localization, and nucleocytoplasmic shuttling) (29,78). However, the most detailed genetic analyses in large-scale datasets have generally been confined to PCV2 and BFDV (24,70). All circovirus genomes encode a Rep protein, encoded by ORF V1, and a Cap protein, encoded by ORF C2.
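Because the genome is circular, a search for the conserved nonanucleotide motif has to allow for a match that spans the arbitrary start and end of a linearized sequence record. The following is a minimal sketch of such a search, not a published method: the function names and the toy sequence are hypothetical, and only the motif itself (TAGTATTAC) comes from the text.

```python
# Minimal sketch: find the conserved nonanucleotide motif in a circular genome.
# The helper names and the toy sequence below are illustrative assumptions.

MOTIF = "TAGTATTAC"  # nonanucleotide motif quoted in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif_circular(genome: str, motif: str = MOTIF) -> list[tuple[int, str]]:
    """Return (start, strand) for every motif occurrence on a circular genome.

    Start positions are 0-based on the given strand as written; "-" means the
    reverse complement of the motif was found in the given sequence.
    """
    genome = genome.upper()
    # Append the first len(motif) - 1 bases so that matches spanning the
    # junction of the linearized record are also detected.
    extended = genome + genome[: len(motif) - 1]
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = extended.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = extended.find(pattern, start + 1)
    return sorted(hits)


# Toy circular sequence (hypothetical) in which the motif wraps around the end.
toy_genome = "TATTACGGCCAAGGTAG"
print(find_motif_circular(toy_genome))
```

In the toy sequence the motif begins three bases before the end of the record and continues at its start, which a plain linear search would miss.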

The full-length BFDV genome has been characterized extensively from many species of psittacine birds from different continents (27,76,88). The BFDV genome length ranges from 1.7 to 2.0 kb, and contains seven putative open reading frames (ORFs) (5). However, only ORF V1, located on the viral strand, and ORF C2, located on the complementary strand, were found to be homologous to the Rep and Cap genes, respectively. In contrast, the PCV2 genome possesses 11 putative ORFs, many of which are overlapping (26). ORFs V1, V5, V7, and V10 are located on the viral positive strand, whereas ORFs C2, C3, C4, C6, C8, C9, and C11 are encoded by the complementary strand (26). Only five proteins encoded by these ORFs have been characterized in detail. Moreover (and similar to BFDV), ORF V1 and ORF C2 of PCV2 are the two major ORFs, encoding the Rep and Cap proteins, respectively. The Rep gene of PCV2 is capable of producing two functional proteins: the full-length Rep protein (314 aa) and the spliced, frame-shifted Rep′ protein (178 aa), both of which are essential for initiating PCV2 replication (23). ORF C2 encodes the structural capsid protein and has been well characterized. ORF C3 and ORF C4 are embedded within ORF V1, where ORF C3 products are known to induce apoptosis in virus-infected cell lines (45). The ORF C4 product may play a role in restricting ORF C3 transcription, thereby preventing virus-induced apoptosis (60). However, whether the ORF C3 and ORF C4 proteins truly contribute to the pathogenesis of PCV2 remains unclear.
Protein Biology of Circoviruses
Cap protein
Cap is the only circovirus protein incorporated into the virion and exposed to the extracellular milieu, functioning to encapsidate the viral genome but also representing the major antigenic determinant. In addition to its role in genome packaging, the capsid protein also plays key roles during viral replication, including attachment and cell entry, genome release, and translocation into the nucleus (13,68). Ritchie et al. (73) first elucidated the 26.3 kDa Rep and 23.7 kDa Cap of BFDV, and following this, several studies have tested the expression and antigenic character of full-length and truncated Cap protein for use in diagnostic assays as well as vaccine development. Candidate vaccines have been developed using baculovirus expression vectors in insect cell systems (29,83), as well as expression in bacterial-based systems (65,77). Low endogenous expression, low solubility, and poor stability were common obstacles during early attempts at synthesis for atomic-level structural analyses of viral proteins (22,77). Expression was conspicuously higher when the first 40 amino acids were deleted from the N-terminus of the Cap sequence (29). Sarker et al. (77) expressed and purified full-length recombinant BFDV Cap in sufficient quantity and purity for crystallographic studies.
Circovirus capsid proteins can exist in multiple conformational assemblies during replication, including small intracytoplasmic nonmembrane-bound crystalline inclusion assemblies of 0.1–0.5 μm, larger membrane-bound inclusion bodies of 0.5–5.0 μm, intranuclear inclusion bodies composed of circular virus complexes of 10–12 nm, as well as fully mature, single icosahedral VLPs of 17 nm diameter (78).
Antigenic variation and genotype shifts have been well studied in relation to circovirus surface topology (80,98). Particularly for PCV2, the surface pattern of the icosahedral fivefold axes, decorated with Loops BC, HI, and DE, was found to be distinct from that of PCV1. A conserved tyrosine phosphorylation motif in Loop HI is present only in PCV2, along with a canonical PXXP motif for binding and activation of an SH3-domain-containing tyrosine kinase in host cells. These unique patterns of the PCV2 capsid surface, which are absent in PCV1 isolates, have been considered crucial for cell entry, virus function, and pathogenesis (92). A recent phylogenetic study on the PCV2 strains circulating in Belgium from 2009 to 2018 highlighted that three residues (positions 59, 131, and 191) located on the outside of the Cap protein have been under positive selection, driving genotype shifts from PCV2a to PCV2b and later from PCV2d-1 to PCV2d-2 (94).
The N-terminus of the Cap protein contains NLSs for mediating transfer of this protein and the associated genomic DNA into the host cell nucleus (29). Both the NLSs, and antagonizing nuclear export signals, have been identified on the Cap protein of BFDV, and shown to be critical for nuclear entry and exit, respectively, in vitro (12). The NLS region is also responsible for binding the ssDNA genome. In insect cells, coexpression of BFDV Cap with the Rep protein led to localization of Rep within the nucleus, whereas expression of Rep alone was localized to the cell cytoplasm (29). This suggested a potential Cap/Rep interaction that accelerates movement of Rep into the nucleus for commencement of viral genomic replication (29).
The C-terminus (CT) of the PCV2 Cap is also known to play critical roles in the evolution, pathogenesis, and proliferation of this virus. A recent study found that a PXXP motif in the C-terminal loop of the PCV2 Cap plays a critical role in self-assembly of VLPs in vitro. Using site-directed mutagenesis, the authors demonstrated that a strictly conserved lysine in the CT loop (examined through the K227A substitution) is essential for VLP entry into PK15 cells, and that mutation of this residue abrogates infectivity compared with wild-type viruses (99).
Rep protein
The Rep proteins of circoviruses possess two functional domains: an endonuclease domain and a P-loop (a putative helicase domain) (14) (Fig. 2). The endonuclease domain of Rep is located at the N-terminus and is responsible for recognition of the origin of replication motif in the viral genome sequence, as well as for the endonuclease activity that enables nicking of the circular DNA and initiation of RCR (62). Recombinant Rep protein exhibits dual ATPase and GTPase activity (31). Rep proteins typically contain a helicase domain for DNA unwinding (33), and while helicase activity of Rep has been demonstrated for two different geminiviruses (a family of viruses distinct from circoviruses yet with similar circular ssDNA genomes and replication strategies) (16), solid evidence remains to be established for helicase activity in circoviruses.

Fig. 2. Domain organization of circovirus Rep proteins. The N-terminal endonuclease domains have been solved by NMR (90) and crystallography (49). Dimerization is mediated through two interfaces, highlighted in transparent blue circles. The helicase domain remains to be solved. Color images are available online.
To date, a three-dimensional NMR solution structure of the endonuclease domain (90) and more recently, a crystal structure of the dimerized N-terminus from the Rep protein of PCV2 have been resolved (Fig. 2) (49). Dimerization is mediated through two interfaces (Fig. 2), with Leu35 and Ile37 located in interface I shown to be highly critical (49). The structurally resolved endonuclease domain of the PCV2 Rep highlighted tyrosine- or serine-rich motifs (motif I, II, and III) as important for serving in metal ion activation, DNA binding, and ssDNA cleavage (90). Structure-based sequence analysis revealed that these three motifs, located near the N-terminus of the endonuclease domain, are conserved across several circoviruses, geminiviruses, and nanoviruses (90). A specific mutation in motif II of BFDV Rep has been proposed to contribute to an increase in infectivity (40). Divalent metal ions (such as Mn2+ or Mg2+) were identified as cofactors for endonuclease and DNA cleavage functions of Rep (33,90).
The C-terminal domain of Rep possesses a P-loop/helicase motif, is conserved in circoviruses (33,58), and is indispensable for PCV replication (14). ATPase activity of this domain was detected in PCV, which is dependent on the presence of Mg2+ ions as a cofactor (82). A consensus ATP/GTP-binding motif was detected in geminiviruses and is seemingly essential for viral genome replication (8). A study demonstrated that the dual ATPase and GTPase activity of the C-terminal domain of the BFDV Rep is regulated by Walker A and B motifs typical of ATP-binding proteins, as well as a novel GYDG motif that is encoded between them (31).
In contrast to other circoviruses, which use a single multifunctional transcript of Rep, PCV1 and PCV2 have two differentially spliced transcripts (Rep and Rep′) for motif recognition, replication initiation, and helicase activity (89). These Rep and Rep′ proteins are able to associate as homodimers or heterodimers, and the quantities of these complexes vary during replication, which may suggest they enact different roles (14,51).
Interaction of Rep and Cap proteins may also play a role in circovirus biology, such as subcellular localization of Rep to the nucleus for initiating viral genome replication using host polymerase (29). Despite containing two NLS motifs predicted by sequence analysis (accession number: Q9YUD3), BFDV Rep protein expressed alone in insect cells showed a predominantly cytoplasmic localization. Coexpression with Cap protein shifted Rep localization toward the nucleus (29). Rep protein has thus been suggested to bind Cap and to be cotranslocated across the nuclear envelope. The ATPase and GTPase activity of the BFDV Rep was also upregulated in the presence of Cap protein, which suggests that the interaction between the Rep and Cap proteins may lead to a conformational and/or functional change in the Rep protein (31). However, since no structural data are available for these complexes, a detailed understanding of this interaction and its possible biological function is limited.
Circovirus Capsid Structures
Cap is the main antigenic protein of circoviruses and forms an icosahedral virion structure with 60 repeating subunits. The sequences among different species of circoviruses are diverse (64), yet the overall three-dimensional structure is conserved over large evolutionary distances (Fig. 3) (39). Attempts have been made to model the putative structures of different circoviruses detected in birds, fish, and mammalian hosts based on the Cap crystal structure data from PCV2. These models suggest a high degree of structural conservation (64); however, the accuracy of such homology models relies heavily on the amino acid sequence similarity with the template. When confronted with low sequence similarity (for example, less than 30% identity), computational prediction of target protein structure suffers from several challenges and shortcomings (10). This is exemplified in recent reports that important structural information within circoviruses cannot always be obtained from sequence data analysis alone. For example, the significant structural differences observed between PCV2a, b, and d would not be expected based on the high similarity of the sequences (38). Moreover, differences in the positioning of loop regions are proposed to represent a genotype shift between PCV2a and b, which would not be possible to ascertain from sequence analysis (6,63).

Fig. 3. Structures of the circovirus VLPs from PCV2 (PDB 3ROR) and BFDV (PDB 5J36) based on crystallography. Each VLP comprises 60 capsid proteins, arranged as 12 pentamers. One pentamer is shown in the center of each VLP. The sequence identity is 29.6%. BFDV, beak and feather disease virus; VLP, virus-like particle. Color images are available online.
Early structural insights from Crowther et al. (17) detailed the 3D structures of circovirus virions using electron micrographs for the first time. They found that members of the genus Circovirus, such as PCV2 and BFDV, have a similar capsid structure that forms multisubunit virions with T = 1 icosahedral symmetry. The first high-resolution atomic structure of PCV2 Cap was solved in 2011 using X-ray crystallography (accompanied by lower resolution cryo-EM), where it was shown that the Cap subunits comprise a single canonical jelly roll domain (37) (Fig. 4). The jelly roll domain comprises two β-sheets, each containing four strands. Shorter loops (ranging from 4 to 9 amino acids) that connect the β-strands decorate the surface of the capsid, whereas the longer loops (ranging from 21 to 36 amino acids) predominantly mediate interactions between capsid proteins (38).

Fig. 4. The circovirus structure is built up from 60 capsid proteins. The monomeric unit is a jelly-roll domain, and 60 copies form the capsid, arranged as 12 pentamers. Color images are available online.
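The subunit counts quoted for the T = 1 capsid (60 proteins, 12 pentamers) follow from the standard Caspar–Klug description of icosahedral capsids, which the text relies on but does not spell out. As a hedged illustration, the sketch below computes the expected composition for a given triangulation number; the function name is hypothetical, and the input is not checked against the allowed series T = h² + hk + k².

```python
# Minimal sketch of Caspar-Klug bookkeeping for icosahedral capsids.
# The function name is an illustrative assumption, not from the source.

def capsid_composition(t_number: int) -> dict[str, int]:
    """Return subunit and capsomer counts for triangulation number T."""
    if t_number < 1:
        raise ValueError("triangulation number must be >= 1")
    return {
        "subunits": 60 * t_number,         # 60T protein copies per shell
        "pentamers": 12,                   # always 12, one per fivefold vertex
        "hexamers": 10 * (t_number - 1),   # none for T = 1
    }


for t in (1, 3):
    print(t, capsid_composition(t))
```

For T = 1 this gives 60 subunits in 12 pentamers and no hexamers, matching the circovirus capsid described above; T = 3, which is mentioned later for cucumber necrosis virus, gives 180 subunits.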
More recently, crystal structures of BFDV Cap were resolved in two different macromolecular assemblies (78). These included a 10 nm assembly, comprising two face-to-face Cap pentamers, and a 17 nm assembly comprising 60 Cap monomers arranged as 12 pentamers (78) (Fig. 5). These assemblies exhibited inverted morphologies with respect to the positively charged N-terminal arginine-rich motif (ARM) domain, which is positioned on the exterior of the 10 nm assemblies and on the interior of the 17 nm assemblies (Fig. 6) (78). The influence of ssDNA on the interconversion between these two species was described, along with the possible biological roles. In the absence of ssDNA, the 10 nm assembly is prevalent, and the N-terminal ARM domain is exposed. In contrast, in the presence of ssDNA, the 17 nm assembly predominates, and here the N-terminal ARM domain is interior, where its function is to package DNA and create a neutrally charged virion. A structure of BFDV in the presence of ssDNA also provided possible DNA-binding sites on the inner capsid shell, distinct from the ARM domain (Fig. 7). Many of these binding sites were conserved in a recent structure of PCV2d (38).

Fig. 5. Beak and feather disease virus Cap forms at least two different macromolecular assemblies. A decameric species comprising two pentamers is ∼10 nm in diameter. The full VLP comprises 60 capsid molecules (or 12 pentamers) and is 17 nm in diameter. Color images are available online.

Fig. 6. Inversion of the Cap ARM domains (shown in blue) between the 10 nm (left) and 17 nm (right) assemblies. Color images are available online.

Fig. 7. Structures of the BFDV (PDB 5J37) and PCV2d (PDB 6OLA) VLPs, shown with yellow beta sheets and green loops. Bound ssDNA is shown as purple spheres. ssDNA, single-stranded DNA. Color images are available online.
Although structures of PCV2b and d, and BFDV VLPs have been resolved by X-ray crystallography and cryo-EM, the N-terminal ARM domains could not be resolved in these structures and were predicted to be structurally flexible (37,47). An N-terminal truncation of PCV2 capsid protein showed compromised immunogenicity and capsid assembly (25). The N-terminal region of PCV2 capsid protein contains several clusters of arginine residues and is predicted to form an α-helix. Recently, Mo et al. (56) reported the structural roles of the N-terminal fragment of PCV2 and pinpointed one PCV2 type-specific neutralizing epitope. Density was observed in the cryo-EM structure for the N-terminal ARM region 15PRSHLGQILRRRP27, including an α-helix, which interacts with the arginine-rich NLS-B (33RHRYRWRRKNG43) fragment located proximal to the β-barrel. However, the authors noted that the structure was not deposited to the PDB due to low resolution and structural flexibility. Nonetheless, these interactions were proposed to stabilize PCV2 VLPs in solution and were supported by truncation fragments of the ARM domain. The positioning of the N-terminal ARM domain was distinct from that presented in a recent structure of PCV2d by Khayat et al. (38). For example, Khayat et al. model residues 36–42 of PCV2d on the inner side of the capsid near the three-fold axes of symmetry. In contrast, in the structure reported by Mo et al., residues 33–42 are modeled near the five-fold axes of symmetry. These discrepancies have been suggested to be due to differences in expression systems (mammalian vs. Escherichia coli).
Additionally, Mo et al. (56) determined the cryo-EM structure of the full-length PCV2 VLP in complex with the Fab fragment of a PCV2 type-specific neutralizing monoclonal antibody (mAb). The structure shows that the complementarity-determining regions of the mAb-3H11 Fab bind to a protruding EF-loop region (134KATALT139) located on the PCV2 VLP surface. This EF-loop region could serve as a PCV2 type-specific neutralizing epitope. In a more recent study, the neutralizing mAb 3A5, which is capable of binding the PCV2a, b, and d genotypes, was investigated. Cryo-EM showed the antibody bound at the five-fold symmetry axes (30). Significantly, two adjacent capsid proteins were recognized by this antibody.
The presence of a positively charged ARM domain at the N-terminus of the structural Cap protein is a distinctive and highly conserved feature of the Circoviridae (Fig. 8). These motifs are positioned in a similar fashion in Cap sequences across all circoviruses, although the number of arginine residues and sequence composition may differ (29). The concentrated clustering of positively charged residues often constitutes an NLS (95). Heath et al. (29) first recorded three partially overlapping bipartite NLSs with a high percentage of arginine residues in BFDV Cap. These NLSs were shown to be pivotal for nuclear localization of the capsid protein in both BFDV and PCV2 (29). Arginine-rich NLSs also contain a DNA-binding region, which facilitates the binding of Cap to both single and double-stranded DNA in a cooperative manner (29). Whether the NLS and the DNA-binding region are functionally coupled or act independently is yet to be determined. In addition to classical nuclear import pathways, PCV2 Cap has also been reported to recruit dynein for transport to the nucleus through the dynein/microtubule machinery (12). In this model PCV2 Cap subunits might act as a direct ligand of the cytoplasmic dynein IC1 subunit and an inducer of microtubule α-tubulin acetylation, which enhances intracellular transport (12).

Fig. 8. Sequence alignment of circovirus capsid ARM domains. Positively charged arginine (R) residues are highlighted in red. ARM, arginine-rich motif. Color images are available online.
The ARM sequences of circoviruses also play an important role in regulating genome encapsidation (15,35). In some RNA virus capsid proteins, the ARM domains are related and exhibit a specific recognition sequence for viral RNA encapsidation and assembly (72). For example in brome mosaic virus, the ARM interacts specifically with the 5′-terminus of RNA1 and enhances stability of RNA1-containing particles as well as assisting in packaging of sgRNA4 (15). Similarly, in Flock House virus (FHV), the N-terminal ARM domain of the capsid protein is required for specific packaging of RNA2 and also regulates the efficiency of RNA1 packaging (52).
Deletion mutation experiments within the N-termini of capsid proteins have facilitated our understanding of the residues critical for encapsidation, assembly, and stability of the virion in ssRNA viruses (35,79). In cucumber necrosis virus, such approaches have identified the highly basic KGKKGK sequence in the R domain of the capsid protein as essential for encapsidation of the full-length genome and formation of polymorphic virions (35). Complete truncation of arginine-rich sequences from the Cap protein resulted in dramatic shifts in the assembly equilibrium from T = 3 symmetry VLPs to T = 1 symmetry and intermediate-sized particles (35).
Similarly, complete deletion of the ARM domains abolished the RNA encapsidation capability of Sesbania mosaic virus (79). In Sesbania mosaic virus, a minimum of three arginine residues were found to be essential for encapsidation of RNA, whereas experimental mutation of these residues resulted in complete reversal of RNA packaging ability (79). Similarly, in cucumber necrosis virus, site-directed mutagenesis revealed the influence of positively charged lysine residues on packaging of RNA within VLPs (35). Several ssRNA/ssDNA viruses neutralize the negative charge of the genomic material with positively charged domains of the capsid protein (93). Belyi and Muthukumar (9) showed that the packaged genome length and conformation are dominated by the nonspecific electrostatic interactions exerted by the net charge on the flexible peptide ARM domains. They postulated that genome length is in direct linear proportion to the net positive charge on capsid peptide arms, irrespective of the actual amino acid sequence, with a proportionality coefficient of 1.61 ± 0.03. This ratio is conserved across all ssRNA/ssDNA viruses with highly basic peptide arms, and is different from the one-to-one charge balance expected of specific binding (9).
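The linear relationship described by Belyi and Muthukumar can be illustrated with a short calculation. The sketch below is a rough illustration under stated assumptions rather than their method: net charge is estimated by simply counting Arg/Lys as +1 and Asp/Glu as −1 (histidine and chain termini are ignored), the total arm charge is taken as the per-subunit charge multiplied by the number of subunits, and the per-arm charge of +20 used in the example is a hypothetical value, not a measurement for any circovirus.

```python
# Minimal sketch of the charge-to-genome-length proportionality quoted above.
# All function names and the example arm charge are illustrative assumptions.

CHARGE_TO_LENGTH = 1.61  # nucleotides per unit of net positive arm charge (9)


def net_charge(peptide: str) -> int:
    """Crude net charge: +1 per Arg/Lys, -1 per Asp/Glu (His, termini ignored)."""
    peptide = peptide.upper()
    positive = sum(peptide.count(res) for res in "RK")
    negative = sum(peptide.count(res) for res in "DE")
    return positive - negative


def predicted_genome_length(charge_per_arm: float, n_subunits: int = 60) -> float:
    """Genome length implied by the linear relation, assuming the total arm
    charge is charge_per_arm * n_subunits."""
    return CHARGE_TO_LENGTH * charge_per_arm * n_subunits


# Net charge of the NLS-B fragment quoted earlier in the text (residues 33-43).
print(net_charge("RHRYRWRRKNG"))

# Hypothetical arm charge of +20 on each of the 60 subunits of a T = 1 capsid.
print(round(predicted_genome_length(20)))
```

With the hypothetical +20 per arm, the relation implies a genome of roughly 1,900 nucleotides, which is of the same order as the ∼2,000-nucleotide circovirus genome; this is meant only to show how the proportionality is applied, not to assert the actual ARM charge of any virus.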
The positively charged NLSs within the ARM domain remain buried in the internal surface of the PCV2 capsid and might externalize during viral “breathing” (37). Additionally, the exposure of the ARM domain in different Cap assemblies may aid in morphogenesis to higher order macromolecular VLPs in the presence of ssDNA (78), as described earlier. Mo et al. (56) suggest that the PCV2 N-terminal fragment of up to 40 amino acid residues could be structurally flexible, which is partially confirmed by the N-terminal truncated PCV2 structures published in the literature (37,47). Expression of full-length PCV2 capsid protein in soluble form was very difficult in numerous protein expression systems (96), suggesting that the N-terminal ARM domain of PCV2 capsid protein could play a role in protein folding and/or capsid formation (46). Guo et al. (25) showed that N-terminal truncated PCV2 capsid protein might not be able to self-assemble. More recent structural data suggest that PCV2 capsid protein could self-assemble in solution into PCV2 VLPs without an N-terminal fragment. However, a VLP formed from N-terminal-deleted PCV2 capsid protein lacks the interactions between the α-helix in the NLS of one capsid protein and the NLS-B fragment of an adjacent capsid protein, and is unstable and easily disassembled. Therefore, the N-terminus of the PCV2 capsid protein, including the NLS fragment, plays a pivotal role in PCV2 VLP stabilization rather than PCV2 VLP formation.
Cellular Attachment and Endosomal Membrane Disruption of Circoviruses
Among the diverse species of circoviruses, only PCV2 and BFDV have been studied for virion attachment, internalization, and cellular and nuclear transport, since in vitro propagation systems have not been established for other circoviruses. The PCV2 virion binds to HS and chondroitin sulfate B glycosaminoglycans on the cell surface (54), before entering the cell through clathrin-mediated endocytosis in monocytic cells (55) and dendritic cells (91), whereas BFDV entry is caveolin- and ATP-dependent (12). PCV2 gains entry into the epithelial cell lines PK15, SK, and ST through caveolae-, clathrin-, and dynamin-independent, small GTPase-regulated pathways (57). Endocytosed macromolecules are either recycled to the surface of the cell or degraded by the cell as endosomes acidify to lysosomes in what is referred to as the endosome–lysosome system (32). Thus, PCV2 must escape from endosomes and enter the cytoplasm for infection to proceed. It was hypothesized that the heparan sulfate-binding motif XBBXBX (where B represents a basic amino acid and X represents a neutral/hydrophobic amino acid), located in the N-terminus of PCV2 Cap, regulates such attachment (54), until elucidation of a high-resolution crystal structure of PCV2 Cap demonstrated that these motifs localize internally in the capsid shell, and thus are not exposed on the surface (37).
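A sequence scan for the XBBXBX pattern can be expressed as a regular expression. The sketch below is only illustrative: treating K, R, and H as basic (B) and any residue other than K, R, H, D, or E as neutral/hydrophobic (X) is an assumption about how to read the motif definition, and the toy sequence is made up. As the structural data cited above show, a sequence match alone says nothing about whether the motif is exposed on the capsid surface.

```python
# Minimal sketch: scan a protein sequence for the XBBXBX motif.
# Residue classes and the toy sequence are illustrative assumptions.
import re

# B = K, R or H (assumed basic set); X = anything except K, R, H, D, E.
# A lookahead is used so that overlapping matches are also reported.
XBBXBX = re.compile(r"(?=([^KRHDE][KRH]{2}[^KRHDE][KRH][^KRHDE]))")


def find_xbbxbx(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexapeptide) for each XBBXBX match."""
    return [(m.start(), m.group(1)) for m in XBBXBX.finditer(sequence.upper())]


# Hypothetical toy sequence containing one match (AKRGKA at index 2).
print(find_xbbxbx("MTAKRGKAWW"))
```

Narrowing or widening the residue classes (for example, excluding histidine from B) only requires editing the character sets in the pattern.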
Recent structural studies have revealed detailed interactions between the PCV2 capsid and heparin (19). Heparin bound one of five binding sites per capsid subunit and the interaction did not adhere to the capsid's icosahedral symmetry (19), where increasing lengths of heparin exhibit a greater affinity toward PCV2. This suggests that polymers high in sulfate content are capable of competing with the PCV2-HS interaction and, thus, have the potential to inhibit PCV2 infection (19).
Nonenveloped viruses use a variety of mechanisms for membrane disruption. FHV and Nudaurelia capensis ω virus possess amphipathic helices at the C-termini of their capsid proteins that are able to disrupt membranes. The C-termini are located inside the capsid, but autocatalytic cleavage at the C-termini generates amphipathic peptides (referred to as γ peptide) that can become exposed to the exterior through viral breathing (11). Sequence analysis of the PCV2 capsid protein does not identify amphipathic helices, or a myristoylated N-terminus; moreover, hydrophobic loops are not observed in the PCV2 VLP structures (28,37). However, PCV2 possesses an ARM at the N-terminus of the capsid protein that parallels the sequence properties of arginine-rich cell-penetrating peptides (CPP) (7). Arginine-rich CPPs are positively charged peptides with few anionic or hydrophobic residues. These short peptides possess the ability to traverse plasma and endosomal membranes (7). Indeed, such CPPs have successfully been used to transport 120-kDa proteins into the cellular cytoplasm (61). In vitro membrane disruption assays have shown that PCV2 VLP, unassembled capsid, and ARM peptide possess the ability to disrupt endosomal-like membranes, whereas VLP lacking the ARM sequence does not possess this capability (20). Membrane disruption by the VLP is insensitive to pH, but unassembled capsid protein and ARM peptide exhibit diminished activity at low pH. Liposome disruption assays, circular dichroism, and intrinsic tryptophan fluorescence assays have demonstrated PCV2 endosomal membrane interaction, wherein the ARM peptide externalizes from the capsid, its C-terminus (amino acids 28–40) anchors into the membrane, and the arginine-rich N-terminus (amino acids 1–27) drives membrane disruption (20).
Conclusion
In this review, we discuss our current knowledge of the Cap and Rep proteins of circoviruses based on functional and structural data. The capsid structures of PCV2 and BFDV have provided important insights into receptor binding, genome packaging, and cell penetration. The structural similarities between Cap proteins (despite low sequence conservation) suggest that these biological mechanisms are relatively well conserved. The ability of circoviruses to replicate and cause significant pathogenesis with so few proteins is intriguing, and these viruses present excellent models of biological efficiency and for understanding virus replication and assembly.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
