Abstract
Viral sequence integration into the mammalian genome has long been perceived as a health risk. In some cases, integration translates to chronic viral infection, and in other instances, oncogenic gene mutations occur. However, research also shows that animal cells can benefit from integrated viral sequences (e.g., to support host cell development or to silence foreign invaders). Here we propose that, comparable with the clustered regularly interspaced short palindromic repeats (CRISPR) that provide bacteria with adaptive immunity against invasive bacteriophages, animal cells may co-opt integrated viral sequences to support immune memory. We hypothesize that host cells express viral peptides from open reading frames in integrated sequences to boost adaptive B cell and T cell responses long after replicating viruses are cleared. In support of this hypothesis, we examine previous literature describing (1) viruses that infect acutely (e.g., vaccinia viruses and orthomyxoviruses) followed by unexplained, long-term persistence of viral nucleotide sequences, viral peptides, and virus-specific adaptive immunity, (2) the high frequency of endogenous viral genetic elements found in animal genomes, and (3) mechanisms by which animal host machinery supports foreign sequence integration.
Introduction
Why Does the Virus-Specific Immune Response Persist for Months or Years After an Acute Virus Infection?
B cell and T cell immune responses are remarkably long lasting after an acute virus infection or after vaccination with a replication-competent vaccine in mammals. Even when a virus is typically cleared within a 5–10 day period postinfection, B cells and T cells with specificity for that virus can be sustained for months or years. In some cases, robust immune responses remain (in the bone marrow, blood, gut, and respiratory tract) for decades after the virus is cleared and without evidence of subsequent virus exposure (e.g., after vaccination with vaccinia virus, the smallpox vaccine) (4,34,40,72,73).
Debates continue as researchers attempt to explain mechanisms for the long-term maintenance of antiviral B cell and T cell activities to sustain sterilizing and nonsterilizing adaptive immune responses. Debates are complex, given the diverse requirements for virus recognition and diverse antiviral functions of B cell and T cell populations (e.g., recognition of free virus versus peptide–major histocompatibility complexes; virus neutralization versus lysis of virus-infected cells) (30,55,63). It can be argued that persistent antigen is not always required for the maintenance of an antigen-specific lymphocyte response. In fact, activated B cells can persist even when antibody genes are mutated, rendering the B cells unable to recognize their priming antigen (34). As a counter-argument, some researchers suggest that even though antigen is not absolutely required to retain antibody expression, antigen persistence may bolster immune response durability; perhaps replication-competent virus is sustained for months or years, but evades detection in standard assays due to masking by neutralizing antibodies (34,46). Yet another explanation is that there are permanent depots for viral antigens. Precisely what cells or structures hold these depots is unclear. One final explanation is that antibody-producing B cells and T cells are exposed to cross-reactive, irrelevant antigens or inflammatory environments that nonspecifically sustain their activities (2,24,34,54,65).
Hypothesis
Recent discoveries in microbiology have highlighted the importance of CRISPR-Cas (CRISPR-associated) systems in providing adaptive immunity against viruses in bacteria and archaea (7). In these systems, sequences from bacteriophages and other mobile genetic elements, termed spacers, are integrated into the CRISPR array, which is then transcribed to produce small CRISPR RNAs (crRNAs) that guide Cas proteins to silence invading nucleic acids in a sequence-specific manner (7,11,13,22,35,44,61,83). Here we hypothesize that a similar principle is used by animals and that the incorporation of viral sequences into DNA may actively promote antiviral immunity. Not only might invading pathogen nucleic acids be silenced (using sequence-specific targeting tools such as antiviral interfering RNA) (33,66,74), but a sequence that is integrated into mammalian DNA may also be used for transcription and translation of viral peptides. Peptides may then reactivate B cell and T cell responses in the animal's periphery. This mechanism would explain, at least in part, the long-lasting presence of viral antigens after an acute virus infection (22,35) and the consequent long-lasting, protective, virus-specific immune response.
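The hypothesis depends on integrated viral sequences retaining open reading frames that host cells could translate. As a rough illustration only, the following Python sketch scans a DNA string for simple ATG-to-stop open reading frames on both strands. The function names, the `min_codons` threshold, and the example sequence are hypothetical choices made for this sketch; it ignores splicing, alternative start codons, and regulatory context, and is not a method described in the literature reviewed here.

```python
from typing import List, Tuple

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[str, int, str]]:
    """Find simple ATG...stop open reading frames in all six frames.

    Returns (strand, start, orf) tuples. `start` is the 0-based index on the
    scanned strand and `orf` includes the stop codon. `min_codons` counts
    codons from ATG up to, but not including, the stop (hypothetical cutoff).
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the currently open ATG, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    # Toy sequence, not a real viral insert.
    for strand, start, orf in find_orfs("CCATGAAACCCGGGTAGCC"):
        print(strand, start, orf)
```

A scan like this only indicates coding potential; whether a peptide is actually produced and presented to lymphocytes would have to be shown experimentally.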
How Frequent Is Viral Sequence Integration into Animal Host DNA?
The presence of viral sequences in animal DNA has long been recognized. Both RNA and DNA viral elements appear in animal DNA, accounting for a significant percentage of the genome. Sequences are from retroviruses, herpesviruses (e.g., Epstein-Barr virus), adenoviruses, parvoviruses, polyomaviruses, arenaviruses, bornaviruses, filoviruses, rhabdoviruses, and hepadnaviruses (e.g., hepatitis B virus), to name a few (8,29,32,38,39,45,47–49,52,59,70,82,85).
The concept that an integrated sequence may serve as a vaccine is not new (32,47,85). Decades ago, Klenerman et al. identified nonretroviral RNA virus (lymphocytic choriomeningitis virus) sequences in mouse splenocyte DNA 70 days after a virus infection (47). How many other viral sequences can be found in animal DNA? Systems-wide approaches support ever increasing observations of RNA and DNA viral sequence integration (45) into the genome of the animal host.
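To make the idea of screening host DNA for viral sequences concrete, the sketch below flags positions in a host sequence that share exact k-mers with a viral reference. It is a deliberately simplified stand-in for the alignment-based pipelines that systems-wide studies would actually use. The function names, the default `k`, and the toy sequences are assumptions for illustration, and the sketch checks only one strand and tolerates no mismatches.

```python
from typing import List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_positions(host: str, viral: str, k: int = 8) -> List[int]:
    """Return 0-based host positions where a k-mer exactly matches the viral reference."""
    viral_kmers = kmers(viral, k)
    host = host.upper()
    return [
        i for i in range(len(host) - k + 1)
        if host[i:i + k] in viral_kmers
    ]


if __name__ == "__main__":
    # Toy sequences, not real genomic or viral data.
    host_dna = "TTTTACGTTGCATTTT"
    viral_ref = "ACGTTGCA"
    print(shared_kmer_positions(host_dna, viral_ref, k=6))
```

Runs of consecutive hit positions suggest a candidate viral-like segment that would then need confirmation by proper alignment and by sequencing across the host-virus junctions.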
How Does Sequence Integration Occur?
In recent years, much has been learned of the adaptation phase of bacterial CRISPR-Cas immunity, wherein spacers derived from viruses and mobile genetic elements are incorporated into the CRISPR array (5,41,76). Cas1 and Cas2 proteins encoded by the cas operon are thought to be universally involved in adaptation, although additional proteins are known to participate in some CRISPR-Cas types (5,41,76). As researchers query CRISPR-Cas systems and other mechanisms by which viral sequences integrate into bacterial genomes (12,14,21), similar questions about the animal cell can be posed.
When viruses carry their own machinery for sequence integration
HIV-1 provides a simple example of a virus that carries its own integration machinery, including reverse transcriptase. Double-stranded viral DNA is produced and integrated into host DNA, supported by the virus-encoded integrase (37). But the virus is not on its own during the integration process. Preintegration complexes comprise linear viral DNA, viral proteins, and host proteins (e.g., high-mobility group protein), each of which can influence the integration process (27). Once viral DNA is integrated, transcription factor–promoter interactions, again modified by both virus and host, instruct transcription and viral protein production (36). Although newly produced HIV-1 proteins will clearly prime and boost the host's B cell and T cell immune response, they will also support virus replication in sites hidden from the immune system, often to the detriment of the host.
Further reliance on host machinery
Adeno-associated viruses (AAVs) provide insights into host-facilitated mechanisms of foreign sequence integration. AAVs can persist as episomes in mammalian nuclei, but vector integration is also observed (23,64). AAV often integrates within a particular region of chromosome 19 in the human genome. This site-specific integration is supported by the AAV Rep proteins, but AAV vectors will also integrate in a Rep-independent manner into nonhomologous sites. In this case, AAV vectors appear to rely heavily on host factors. Nonhomologous integration of AAV vectors is observed in ribosomal DNA repeats and in CpG islands near transcription start sites. Segmental duplications, satellite DNAs, and palindromes define additional hotspots for AAV sequence integration, and these sites may similarly define hotspots for other proviral sequences.
Viral sequence integration opportunities during DNA damage and repair
It is estimated that mammalian cells incur thousands of DNA-damaging events per day (3,58), providing ample opportunity for viral sequence integration (43). Complex host repair mechanisms routinely support base excision repair (78), mismatch repair, nucleotide excision repair (68), homologous recombination, and nonhomologous end joining (16,17,42,51,53,56,58,69). The host's DNA damage response proteins are recruited to DNA breaks (56,67), a response that can be exploited by viruses to enhance viral replication (1). The hijacking of host repair mechanisms during a virus infection may further increase opportunities for viral sequence incorporation into host DNA.
Specialized processes for DNA recombination in lymphocytes
The specialized host sequence recombination functions of lymphocytes may uniquely support integration of foreign sequences. In developing B cells and T cells, dedicated DNA recombination mechanisms assemble V, D, and J gene segments to create antibody and T cell receptor sequences in host DNA. These processes involve a multisubunit complex containing the recombination-activating gene 1 and 2 (RAG1 and RAG2) enzymes, which rearrange genomic sequences while accommodating the incorporation of nongenomic nucleotides into sequence junctions. In mature B cells that are activated by foreign antigens, activation-induced deaminase mediates class switch recombination (CSR) events and somatic hypermutation (SHM) in immunoglobulin heavy and light chain loci (62,75). Perhaps the mechanisms dedicated to CSR and SHM additionally support viral element integration when lymphocytes are activated at the site of a virus infection.
How Do Viral Nucleic Acids Gain Nuclear Entry?
In the animal system, there is no lack of potential mechanisms by which viral nucleic acids in the cytoplasm may be shuttled into the cell nucleus (19,57). Whole viral genomes often enter the host nucleus to support virus amplification (e.g., retroviruses and influenza viruses). Influenza virus, for example, uses viral proteins with nuclear localization signals that bind to cellular nuclear import machinery (e.g., importins) to escort RNA through nuclear pores (10). Other viral genomes may enter the nucleus during mitosis when the nuclear envelope is disassembled, or by virus-induced transient disruption of the nuclear envelope and entry through resulting gaps (19,28). Even when viral replication occurs in the cytoplasm, viral components can be shuttled into the host's nucleus (60). Nonviral microvesicles, proteins, and lipids might protect viral proteins and nucleic acids (as is the case for DNA, mRNA, long noncoding RNA (lncRNA), and microRNA (miRNA) elements), while escorting viral elements within the cell or even from cell to cell (15,50,80,81). For the complementary synthesis of DNA from RNA sequences, retrovirus reverse transcriptases might be hijacked, and telomerases or retrotransposons such as long interspersed elements might be employed (20,26,71,84).
Discovering New Mechanisms for Antiviral Immunity in the Animal Host: Remaining Questions
In Drosophila, macrophage-like hemocytes have been found to support many of the processes already described to benefit antiviral immune responses. Hemocytes can (1) take up viral RNA from virally infected cells, (2) produce virus-derived complementary DNA and secondary viral siRNA, and (3) secrete viral siRNAs through exosome-like vesicles. Researchers have also found that exosomes containing viral siRNAs can be purified and transferred to naive hosts to confer virus-specific protective immunity (79). Experiments are now warranted to determine whether mammalian cells share some or all of these functions.
The hypothesis that viral sequence integration may be beneficial to support long-term B cell and T cell immune responses in the mammalian host poses unique questions. How often are viral sequence elements integrated into animal host DNA during an active virus infection? Does the frequency of the event depend on the type of virus and target tissue? Exactly what mechanisms are involved? How often are sequences duplicated in the genome (29)? Is integration usually confined to the site of infection (e.g., confined to the respiratory tract during a respiratory virus infection)? What cell types are most often affected? Are the target cells durable and migratory? Where in the host genome is integration most likely to occur, and what is the average length of an integrated viral sequence? How often do open reading frames exist, and how often are viral peptides or proteins produced? Is peptide production controlled or constitutive? What other mechanisms preserve viral sequences within the cell without sequence integration? Answers to each of these questions will bring us closer to an understanding, not just of the immune system of bacteria and of exciting new laboratory and clinical tools (18,22,44), but also of sequence-integrating mechanisms that may naturally boost the lymphocyte response and thereby protect mammals from infectious disease.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
