Abstract
Tick-borne encephalitis virus (TBEV) is a flavivirus with major impact on global health. The geographical distribution of TBEV is expanding, making it pivotal to further characterize natural virus populations. In this study, we completed the earlier partial sequencing of a TBEV recovered from a pool of RNA extracted from 115 ticks collected on Torö in the Stockholm archipelago. The total RNA was sufficient to sequence a complete TBEV genome (Torö-2003) without conventional enrichment procedures such as cell culturing or suckling mice amplification. To our knowledge, this is the first time that the genome of TBEV has been sequenced directly from an arthropod reservoir. The Torö-2003 sequence has been characterized and compared with other TBE viruses. In silico analyses of secondary RNA structures formed by the two untranslated regions revealed a temperature-sensitive structural shift between a closed replicative form and an open AUG-accessible form, analogous to a recently described bacterial thermoswitch. Additionally, novel phylogenetically conserved structures were identified in the variable part of the 3′-untranslated region, and their sequence and structure similarity to earlier identified structures suggests an enhancing function on virus replication and translation. We propose that the thermoswitch mechanism may explain the low TBEV prevalence often observed in environmentally sampled ticks. Finally, by comparing RNA folding dynamics between strains with different mammalian cell passage histories, we detected variations that help in understanding virus adaptations to varied environmental temperatures and mammalian hosts.
Introduction
TBEV contains an ∼11-kb positive single-stranded RNA genome, which is capped but not polyadenylated. The RNA molecule encodes a single polyprotein that is cleaved and processed by a combination of virus and host proteins. The open reading frame (ORF) is flanked by 5′- and 3′-untranslated regions (UTRs), which carry both sequence and structural motifs that are important for flavivirus lifecycle processes such as translation, replication, and possibly assembly (Gritsun and Gould 2006, 2007). The evolutionarily conserved RNA structures in both UTRs have been established by in silico folding. By convention, these structures are numbered from the 5′- and 3′-ends inward as they appear in the genome (Gritsun and Gould 2006, 2007). The 5′-region of the TBEV RNA molecule contains stem loop (SL) structures 5′-SL1–4, with SL1–2 located within the 5′-UTR, whereas SL3–4 are situated in the N-terminal region of the ORF (Khromykh et al. 2001, Kofler et al. 2006).
The 3′-UTR of TBEV can be divided into a variable part (V3′-UTR) and a highly conserved core element (C3′-UTR) (Wallner et al. 1995, Gritsun et al. 1997). The C3′-UTR contains the structural formations SL1–5. Progressive deletion experiments have demonstrated the essential role of SL1–5 for maintaining virus viability (Mandl et al. 1998, Pletnev 2001). Moreover, this region has been proposed to act as a virus promoter (Gritsun and Gould 2007). Upstream, three additional nonessential structures (SL6–8) have been suggested to function as enhancers for virus replication and translation (Gritsun et al. 1997, Proutski et al. 1999, Gritsun and Gould 2007). The V3′-UTR region is located between the end of the polyprotein and the C3′-UTR and it is highly heterogenic, both in nucleotide sequence and in length between strains. The variability in length has been postulated to be linked to the number of laboratory passages the virus encounters (Wallner et al. 1995, Mandl et al. 1998). Further, experimental passage of the virus with truncated V3′-UTR has demonstrated that deletions in the region could be dispensable during laboratory conditions (Mandl et al. 1998). Because of this heterogeneity, phylogenetic prediction of secondary RNA folding has proved difficult and the region between the stop of NS5 and the SL8 has been interpreted as unfolded for TBEV, that is, lacking any conserved RNA structures (Thurner et al. 2004, Gritsun and Gould 2007).
An alternative, albeit nonexclusive, interpretation rests on the well-established observation that arthropod-borne viruses are genetically constrained as they require a functional life cycle in two highly different cellular environments. Because the variable region is present in strains freshly isolated from ticks, and multiple passages in vertebrate cells trigger spontaneous losses of significant fragments within it, some have suggested that the V3′-UTR region probably plays a role during the viral lifecycle in tick hosts (Wallner et al. 1995, Hayasaka et al. 2001, Melik et al. 2007).
Long-range interactions between 5′- and 3′-UTRs occur via cyclization sequences (CS) recently reviewed by Villordo and Gamarnik (2009). CS interactions were first identified for the dengue virus (Hahn et al. 1987), and analogous sequences have been also identified for TBEV: CSA (Mandl et al. 1993), CSB (Khromykh et al. 2001), CSb-2 (Thurner et al. 2004), and CSb-1 (Kofler et al. 2006). The CSA is essential for functional replicase assembly (Khromykh et al. 2001), whereas the CSB elements seem dispensable for functional TBEV replication (Kofler et al. 2006).
Our contribution aims to interpret the biological significance of the V3′-UTR for differential replication/translation dynamics in mammalian and tick cells. This functional exploration was performed through the comparison of a newly sequenced genome, directly extracted from tick tissues, to other TBEV with different mammalian cells passage histories. Computer-predicted RNA foldings of 5′- and 3′-UTRs show that the secondary structures alternate between open form allowing translation and closed form for which the RNA conformation denies the cellular enzymatic complex access to the start codon of the TBEV polyprotein. Our analysis brings new evidence that the V3′-UTR region also carries RNA motifs that might have a role in the adaptation of the viral cycle to different environmental conditions. We further argue that this adaptation is performed through a temperature-sensitive trans-acting riboswitch, similar to a temperature-regulated translation mechanism that was recently found in a bacterial model system (Neupert et al. 2008).
Materials and Methods
Genomic TBEV sequencing
Sampling of ticks and extraction of total RNA have been previously described (Melik et al. 2007). A pool of Ixodes ricinus (9 adults and 106 nymphs) collected in September 2003 on the island of Torö in the Stockholm archipelago (N 58° 80–92′, O 17° 82–53′) was used for the extraction of total RNA and partial sequencing of a TBEV genome (32% of strain Torö-2003) without any enrichment by cultivation (Melik et al. 2007). To complete the sequencing, additional nested RT–polymerase chain reaction (PCR) primer sets were designed (Table 1). The outer and inner primers at the very ends of the genome were based on W-TBEV strain Neudorfl (Wallner et al. 1996). Viral RNA (1–5 μL of total tick RNA) was reverse transcribed with 2 pmol outer reverse primer (Table 1), using the Superscript III First strand Synthesis System for RT-PCR (Invitrogen) according to the manufacturer's instructions. The Expand high-fidelity PCR system (Roche) was used to amplify 2 μL of each cDNA reaction according to the manufacturer's instructions. The nested outer and inner primer pairs were each added to a final concentration of 0.4 μM. The primer sequences and corresponding nucleotide genome positions according to W-TBEV strain Neudorfl are shown in Table 1. A PC-960G gradient thermal cycler (Corbett Research) was used with the nested cycling conditions described in Table 1. PCR products were confirmed by gel electrophoresis, purified with the QiaQuick Gel extraction kit (Qiagen), and sequenced without any cloning steps. Analysis of nucleotide and deduced amino acid sequences was performed using BioEdit Sequence Alignment Editor (
TBEV RNA structures in the 3′-UTR
To identify thermodynamically conserved and hence putatively functional RNA secondary structures of Torö-2003, the last 721 bases of the genome were used as a BLAST query for related sequences. The five TBEV strains with the highest BLAST score were retrieved from GenBank: TBEV 235 (EF113082), TBEV 274 (EF113084), TBEV 280 (EF113086), TBEV 282 (EF116595), and TBEV 433 (EF116597). The sequences were aligned with CLUSTAL W and imported to RNAalifold (Mathews et al. 1999, Bernhart et al. 2008) and RNAz (Gruber et al. 2007) servers from the “Vienna RNA Package” available at
Temperature-dependent analysis of RNA structures
Long-range temperature-dependent structural rearrangements between 5′- and 3′-UTRs with potentially regulatory functions were identified by iterative in silico RNA folding in 1°C increments between 18°C and 42°C. This range was selected to cover the temperatures the virus may encounter, from the cellular environment of the questing tick to mammalian/avian cells (Randolph 2004, Anufriev et al. 2010, Gilbert 2010). The whole-genome TBEV strains described in the previous section were used. For each strain, the first 319 nucleotides (the 5′-UTR and part of the C-protein) were connected with a spacer of 20 adenosines to the last 377 nucleotides of the NS5 gene and the entire 3′-UTR. The RNAs were folded on the mfold web server (version 2.3) (Zuker 2003) with free energies from the Turner group. Enthalpies from the same parameter set are used to extrapolate the free energies and to fold at temperatures other than 37°C. The following parameters were set (all others kept at default): T = 18°C–42°C; p = 5; w = 11; n = 50. Two structurally different forms were distinguished: one with a closed CSA, corresponding to the replicative form, and an open form, for which we observed the structures 5′-SL1–4 and 3′-SL1, the latter with an exposed CACAG motif. These forms are hereafter designated as closed and open structures, respectively. The percentage of each form within the 50 most stable structures was plotted against the folding temperature for each strain under investigation. For the statistical analysis of the four W-TBEV strains, the proportion of closed structures was used as the response variable in generalized linear models assuming binomial error distributions. These strains were examined further with a polynomial regression that included temperature (T), temperature² (T²), and temperature³ (T³) as explanatory variables, to test for linear, parabolic, and S-shaped relationships between temperature and the proportion of closed structures.
In a first model, all four strains were included to establish whether they differed in their response to temperature, and then each strain was analyzed separately. If curvilinear effects were found, we went on to estimate the breakpoint temperatures in the distribution of the two conformations using segmented regression. Statistical analyses were performed with R 2.10.1 (R Development Core Team 2009). Segmented regression used the add-on package segmented (Muggeo 2009), and plotting of error bars used the package gplots (Warnes 2009).
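The construction of the fused folding input and the temperature series described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function names and the `utr3_length` argument are hypothetical, and the extrapolation function only shows the standard linear relation ΔG(T) = ΔH − T·ΔS, with ΔS derived from the 37°C values, under the assumption that ΔH and ΔS do not vary with temperature.

```python
T37_KELVIN = 310.15  # 37 °C, the reference temperature of the energy parameters


def build_fused_construct(genome, utr3_length, five_prime_len=319,
                          ns5_tail_len=377, spacer_len=20):
    """Join the 5' window and the 3' window of a genome with an adenosine spacer.

    utr3_length is strain specific and must be supplied by the caller
    (hypothetical argument, not a value taken from the paper).
    """
    three_prime_len = ns5_tail_len + utr3_length
    if five_prime_len + three_prime_len > len(genome):
        raise ValueError("genome too short for the requested windows")
    five_prime = genome[:five_prime_len]        # 5'-UTR plus start of the C gene
    three_prime = genome[-three_prime_len:]     # end of NS5 plus the whole 3'-UTR
    return five_prime + "A" * spacer_len + three_prime


def folding_temperatures(start=18, stop=42, step=1):
    """Temperatures (°C) at which the construct is folded, inclusive of both ends."""
    return list(range(start, stop + 1, step))


def extrapolate_free_energy(delta_g37, delta_h, temp_celsius):
    """Extrapolate a free energy from 37 °C to another temperature.

    Uses delta_G(T) = delta_H - T * delta_S with
    delta_S = (delta_H - delta_G37) / 310.15 K.
    """
    temp_kelvin = temp_celsius + 273.15
    delta_s = (delta_h - delta_g37) / T37_KELVIN
    return delta_h - temp_kelvin * delta_s
```

With exothermic helices (negative ΔH), the extrapolated ΔG becomes more negative as the temperature drops, which is the behavior the closed/open analysis relies on.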
Results
Sequencing and characterization of the TBEV Torö-2003 genome
All publicly available TBEV genomes have been generated from strains handled in laboratory conditions, predominantly in a mammalian cell environment. To generate a genomic TBEV sequence straight from the tick reservoir in a natural focus, we completed the sequencing of the TBEV Torö-2003 (Melik et al. 2007). Primers based on sequences from conserved regions of W-TBEV covering C-prM and the remaining NS regions were used in nested RT-PCR of the Torö-2003 total RNA extract (Table 1). The PCR products were sequenced without any cloning procedures and the sequences were carefully scrutinized for double peaks. The new sequences (68% of the total genome) were assembled with the previously established sequences into a continuous genome sequence (DQ401140). This virus sequence is to our knowledge the first sequenced genome of a Swedish TBEV.
We compared our newly generated genome with the ORF of different TBEV subtypes (Table 2), which showed that Torö-2003 has the highest level of identity with TBEV 263 TS (97.4%) at the nucleotide level. A polyprotein alignment of different TBEV subtypes (data not shown) was then screened for unique features of Torö-2003. Most differences had only minor effects at the amino-acid level; however, five unique substitutions could be discerned. Phe524 within the helicase domain of NS3 and Ser167 in NS4A are tyrosine and alanine residues in Torö-2003, respectively. As the highly conserved NS5 protein is essential for the virus replication and also represents a putative virulence factor because of several described host–target interactions (Johansson et al. 2001, Brooks et al. 2002, Best et al. 2005, Lin et al. 2006, Park et al. 2007, Pryor et al. 2007, Werme et al. 2008, Ashour et al. 2009, Ellencrona et al. 2009, Laurent-Rolle et al. 2010, Wigerius et al. 2010), three unique residues in NS5 of Torö-2003 were of special interest. Residues Gln279 and Glu297 were substituted in Torö-2003 by arginine and alanine residues, respectively. Ser422 localized within the α8 helix downstream of the L3 loop was replaced by glycine in Torö-2003. The L3 loop of NS5 connects and mobilizes the α7 and α8 helices, which forms an entrance for the ssRNA substrate to the template tunnel during viral replication (Yap et al. 2007). Interestingly, all three NS5 substitutions are in rather close proximity within the finger region of the RNA-dependent RNA polymerase domain, suggesting that future studies on the replication kinetics of this virus should be performed.
Novel RNA structures detected in the variable region of the 3′-UTR
As Torö-2003 was sequenced directly from the natural reservoir, it provides a wild-type sequence well suited for further genetic characterization. The alignment, consisting of Torö-2003 together with the most similar W-TBEV 3′-UTRs, was analyzed by the RNAz and RNAalifold algorithms, which predict functional RNA structures based on both structural conservation and thermodynamic stability (Mathews et al. 1999, Gruber et al. 2007). The output of the computation is presented as a mountain plot, which depicts the conserved secondary structures between the W-TBEV 3′-UTRs (Fig. 1A). Apart from the previously described structures, SL1, SL2, Y-shaped SL3–4, SL5, SL6, and Y-shaped SL7–8 (Gritsun et al. 1997, Kofler et al. 2006, Gritsun and Gould 2007), novel secondary structures were identified within the variable region (SL9–14) (Fig. 1A). These structures SL9, SL10, Y-shaped SL11–12, SL13, and SL14 localize within the enhancer region and were visualized on mFold (Fig. 1B). An alignment of the 3′-UTR part of the full-genome TBEV strains was manually screened for sequences corresponding to structures SL9–14 (data not shown). Structures SL9–12 were found in all tick-borne flaviviruses investigated here except for OMSK, whereas SL13 and 14 were only found in the full-length 3′-UTRs of W-TBEVs, that is, Neudorfl, 263 TS, and Torö-2003, although the SL14 is disrupted by the extended polyA sequences present in Neudorfl and 263 TS (Fig. 1B). Further, the TBEV strains HYPR, Vasilchenko, Sofjin, LIV, and OMSK completely lack the sequences required for the SL13–14 formation. Truncation experiments have established that the region carrying our newly characterized SL structures is dispensable during mammalian infection (Mandl et al. 1998, Pletnev 2001, Hoenninger et al. 2008), but to establish if they have functional significance for the viral life cycle within the tick cell remains to be evaluated experimentally. 
The Y-shaped SL11–12 has a high level of sequence identity with the Y-shaped SL7–8 within both the stem and loop regions (Fig. 1B). Interestingly, the exposed sequences GCAGC, UGGUCG, and GAGAG are identical between these structures, which strengthens the predicted folding (Fig. 1B). Further, the similar Y-shaped SL3–4 structure present in the 3′-UTR promoter region (Fig. 1C) also includes the GAGAG sequence in SL4, implying an analogous association with RNA or proteins for these structures. A schematic overview of the novel (SL9–14) and previously known W-TBEV UTR structures and the CS motifs is presented in Figure 1C.

The RNA folding of the TBEV UTRs suggests a temperature-sensitive riboswitch alternating between an open and a closed structure
It is well established that TBEV replication relies on RNA structures formed between the 5′- and 3′-UTRs via the CSA motif (Kofler et al. 2006); however, whether genomic cyclization has an effect on virus propagation at low ambient temperature in the ectothermic host remains to be established. Indeed, initiation of viral translation depends on eukaryotic initiation factors that facilitate ribosome assembly and scanning until the initiation codon is reached; however, base pairing of the 5′ and 3′ CS generates a closed helix conformation that blocks access to the AUG (Henkin 2008). The temperature-dependent hybridization of CSA/CSb could thus act as a localized thermosensor that, under the influence of the V3′-UTR, switches between a low-temperature translation OFF setting and a high-temperature translation ON setting (Narberhaus et al. 2006, Henkin 2008). To test this hypothesis, the 5′- and 3′-UTRs were fused and folded in silico and then used to identify putative conserved structures at different temperatures (Fig. 1C). At each temperature, 50 RNA structures, which are part of the pool of the most stable arrangements of the TBEV UTRs, were retrieved. The originality of our approach consists in basing our reasoning not on a single most stable fold for each temperature but on a population view, which reflects the fact that different optimal folding patterns may be calculated if the folding energies are changed even slightly. It is then reasonable to assume that, at a given temperature, the viral pool of identical sequences simultaneously displays several optimal and suboptimal structures within a small percentage of the minimum energy and that each individual sequence alternates between these different folds. Typical folds for Torö-2003 are presented in Figure 2. Further, all the tick-borne flaviviruses tested showed a temperature dependency between closed and open conformations in the RNA populations (data not shown).
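The population view can be illustrated with a short Python sketch that scores each predicted structure as closed or open and reports the closed fraction. It assumes the structures are available in dot-bracket notation and that the 5′ and 3′ cyclization windows are known as 0-based, half-open intervals in the fused construct. The function names and the 80% pairing threshold are hypothetical choices for illustration, not the classification criterion used in the study.

```python
def pair_table(dot_bracket):
    """Map every paired position (0-based) to its partner position."""
    stack, pairs = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def is_closed(dot_bracket, cs5, cs3, min_fraction=0.8):
    """True when most of the 5' CS window is paired with the 3' CS window.

    cs5 and cs3 are (start, end) intervals; min_fraction is an assumed cutoff.
    """
    pairs = pair_table(dot_bracket)
    positions = range(cs5[0], cs5[1])
    hits = sum(1 for i in positions
               if i in pairs and cs3[0] <= pairs[i] < cs3[1])
    return hits / len(positions) >= min_fraction


def closed_proportion(structures, cs5, cs3):
    """Fraction of structures in the pool that are scored as closed."""
    flags = [is_closed(s, cs5, cs3) for s in structures]
    return sum(flags) / len(flags)


# Toy structures (made up): the 5' window is positions 2-5, the 3' window 10-13.
toy_closed = "..((((....)))).."
toy_open = ".(..)......(..)."
print(closed_proportion([toy_closed] * 3 + [toy_open], (2, 6), (10, 14)))
```

Applied to the 50 structures retrieved at each temperature, the same tally would give one closed proportion per temperature and strain.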

Figure 2. Secondary RNA structure predictions by the mfold server (version 2.3), representing the structural RNA switch between a closed replicative form and an open AUG-accessible form. Start and stop codons are shown in red; 5′ and 3′ CSA/CSb-1/CSb-2 are highlighted in green. Cyclization motifs are numbered according to the 5′-UTR sequences and their complementary 3′-UTR sequences in Neudorfl. Interactions occur between nucleotides 110–129 and 11058–11077 for CSA; 157–166 and 10772–10781 for CSb-1; 163–176 and 10948–10959 for CSb-2. A schematic view of the fused 5′- and 3′-UTRs of the Swedish TBEV Torö-2003 is presented in Fig. 1C.
At lower environmental temperatures (<25°C), the dominant variants correspond to the closed structures CSA/CSb-1 (Fig. 2A) and CSA/CSb-2 (Fig. 2B), whereas open folds optimal for efficient virus translation were abundant at higher temperatures (>35°C) (Fig. 2C). The Y-shaped SL11–12 and SL7–8 structures were only present on the closed CSA/CSb-1 formation (Fig. 2A), which could suggest a role for these structures in initiation of replication at lower temperature. SL13–14 was found at all temperatures for both the open and closed conformations (Fig. 2). SL14 is of particular interest as being the site where several TBEV strains lose their variable region or have polyadenosines incorporated. It is bordered by sequences that are fully single stranded at higher temperatures (Fig. 2C). The SL9 is immediately preceded by the GUCAG sequence (Fig. 1B) recently proposed, in combination with secondary structure, to render an exonuclease-resistant subgenomic 3′-UTR molecule important for flavivirus pathogenesis (Pijlman et al. 2008).
The effect of temperature on the proportion of open and closed conformations was quantified. Over 90% of the structures retrieved in the interval of 18°C–27°C were in the closed conformation for Torö-2003 (Fig. 3), which would imply that TBEV translation is restricted in the ectothermic host at low ambient temperatures. Further quantification showed a differential response to temperature between W-TBEV strains. In particular, when genome cyclization was compared across strains in relation to temperature, the interaction between strain and temperature was always significant (p < 0.001 for T and T² and p < 0.01 for T³). The effect of temperature on the proportion of closed structures was significantly parabolic for Neudorfl (T and T² had p < 0.001), Torö-2003 showed a pronounced S-shaped relationship (T³, p < 0.001), and 263 TS was marginally significantly S-shaped (p = 0.02), whereas for HYPR only the direct linear relationship with temperature was significant (for T, p < 0.001). Based on results from the generalized linear models, we completed the analysis with segmented regressions to identify the breakpoints in the temperature effect, which correspond to the temperatures where the proportion of the closed structure starts to decrease. No such temperature was identified for HYPR (Fig. 3D), whereas our analysis demonstrates the presence of a single breakpoint for Neudorfl (Fig. 3A) and of two breakpoints for Torö-2003 and 263 TS (Fig. 3B, C). The lower temperature breakpoint, where the hitherto stable proportion of closed structures starts to diminish, is the most pronounced. The second breakpoints occur at higher temperatures that are relevant during the late blood meal of the tick or in mammalian cells. The lower temperature breakpoint was 30.0°C (95% confidence interval: 28.2–31.8) for Neudorfl and 26.8°C (25.4–28.1) for Torö-2003, suggesting a putative adaptation of the virus present in the colder climate of Sweden.
The temperature-sensitive strain 263 TS had a lower breakpoint temperature of 23.6°C (20.6–26.6), in marked contrast to the virulent strain HYPR, which had a continuously dominant proportion of closed forms even at higher temperatures (Fig. 3). This indicates that the laboratory-handled HYPR strain has undergone genetic adaptation in response to a fully endothermic mammalian cell environment.
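The idea behind a breakpoint estimate can be illustrated with a simplified stand-in for segmented regression. The Python sketch below is not the iterative estimator of the R package segmented and does not use a binomial error model. It fits an ordinary least-squares "flat, then linear" hinge for every candidate breakpoint on a grid and keeps the candidate with the smallest residual sum of squares. The function names are hypothetical and the data are synthetic toy values, not results from this study.

```python
def fit_hinge(temps, props, knot):
    """Least-squares fit of props = a + b * max(0, T - knot).

    Returns (sse, intercept, slope), or None if the hinge term is constant.
    """
    xs = [max(0.0, t - knot) for t in temps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(props) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, props)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, props))
    return sse, intercept, slope


def find_breakpoint(temps, props, step=0.1):
    """Grid search for the single breakpoint that minimizes the residual sum of squares."""
    low, high = min(temps), max(temps)
    n_steps = int(round((high - low) / step))
    best = None
    for k in range(1, n_steps):
        knot = round(low + k * step, 10)  # rounding avoids float drift on the grid
        fit = fit_hinge(temps, props, knot)
        if fit is not None and (best is None or fit[0] < best[1]):
            best = (knot, fit[0], fit[2])
    if best is None:
        raise ValueError("no valid breakpoint candidate")
    return best[0], best[2]  # breakpoint temperature, slope above the breakpoint


# Synthetic toy data: closed proportion is flat up to 27 °C, then declines.
toy_temps = list(range(18, 43))
toy_props = [0.95 if t <= 27 else 0.95 - 0.05 * (t - 27) for t in toy_temps]
print(find_breakpoint(toy_temps, toy_props))
```

A grid search is easy to verify by eye but yields no confidence interval; the intervals reported in the text come from the segmented regression itself.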

Proportions of closed RNA structures of four W-TBEV strains; (
Discussion
The dominance of closed forms at low temperature in our wild-type TBEV genome entails low translation levels in ectothermic tick cells. This provides a putative explanation to the fact that most ticks collected in high-incidence endemic TBEV areas have low or undetectable virus levels (Hudson et al. 2001, Han et al. 2002, Suss et al. 2004, Han et al. 2005, D'Agaro et al. 2009, Klaus et al. 2009). This observation relates to the fact that the sensitivity of our RT-PCR strategies is often too low for direct TBEV detection from naturally sampled ticks. In fact, of 4497 ticks sampled between 2003 and 2008 (data not shown), only the Torö-2003 pool generated virus amounts sufficient for genomic TBEV sequencing. Torö-2003 was sampled in September (Melik et al. 2007), and capitalizing on our study of the translational virus response to temperature, we propose that high temperature due to the warm season or high incidence of recently metamorphosed ticks could be factors explaining the accumulation of virus load at the collection time.
Strong base pairing by CS formation interferes with RNA unwinding and ribosomal scanning toward the start codon and thereby prevents ribosomal initiation at the 5′-terminus (Khromykh et al. 2001, Chiu et al. 2005). The previously published ΔG values for CS interactions of TBEV were obtained with sequences folded at 37°C (Mandl et al. 1993, Khromykh et al. 2001, Thurner et al. 2004, Kofler et al. 2006). The individual ΔG values (kcal) for TBEV CSA (−24.4) and CSb-1 (−20.8), giving a total of −45.2 kcal as determined by Kofler et al. (2006), indicate that these structures are likely to have some impact on viral translation. The strength of the CSA/CSb-1 hybridization can be compared with the ΔG of −34.4 kcal of a synthetic SL that completely abolished translation when inserted before the start codon of the Barley yellow dwarf luteovirus (Guo et al. 2001). Ectothermic tick cells at ambient temperature seldom exceed 20°C in regions where TBEV is endemic. For comparison, the respective ΔG values at 18°C for the CS of Torö-2003 were −37.21 (for CSA) and −21.23 (for CSb-1), giving a total of −58.44 kcal; at this temperature, the lower ΔG of CSA alone possibly inhibits ribosomal scanning toward the start codon, thereby restricting polyprotein synthesis (Kozak 1986).
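The comparison of hybridization energies above reduces to simple arithmetic, which the following Python sketch makes explicit. The ΔG values are those quoted in the text. The helper names are hypothetical, and treating the −34.4 kcal synthetic stem loop as a reference point is only an illustrative reading of the comparison, not an established inhibition threshold.

```python
# ΔG values (kcal) quoted in the text for the TBEV cyclization sequences.
CS_ENERGIES = {
    "37C": {"CSA": -24.4, "CSb-1": -20.8},    # Kofler et al. (2006)
    "18C": {"CSA": -37.21, "CSb-1": -21.23},  # Torö-2003, this study
}

# Synthetic stem loop that abolished translation (Guo et al. 2001).
SYNTHETIC_SL_DG = -34.4


def total_dg(parts):
    """Sum the individual CS energies, rounded to two decimals."""
    return round(sum(parts.values()), 2)


def at_least_as_stable(dg, reference=SYNTHETIC_SL_DG):
    """True when dg is as negative as, or more negative than, the reference."""
    return dg <= reference


for temperature, parts in CS_ENERGIES.items():
    print(temperature,
          "total:", total_dg(parts),
          "| total vs reference:", at_least_as_stable(total_dg(parts)),
          "| CSA alone vs reference:", at_least_as_stable(parts["CSA"]))
```

The combined CSA/CSb-1 energy is more negative than the reference at both temperatures, whereas CSA on its own passes the reference only at 18°C.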
The novel SL structures detected in the genome of the tick-sequenced Torö-2003 strain occur in a region previously described as unfolded (Thurner et al. 2004, Gritsun and Gould 2007). The incongruence between our findings and previously published structures comes from the analysis of 3′-UTR with different levels of sequence similarity, partly due to different passage histories. The proposed structural RNA map of the full-length 3′-UTR as presented in Figure 1C brings functional clues to the earlier reported repeated elements (R2–3) (Wallner et al. 1995) as the periodicity of the structures becomes apparent. The importance of the function of Y-shaped SL11–12 is further supported by the existence of identical nucleotides at the loops and similarity of the stems compared with the Y-shaped SL7–8. Virus propagation during environmental temperature conditions in the tick host is poorly investigated, and a low efficiency of replication and translation is probably relying on multiple enhancing structures within the variable region.
Temperature-induced changes in the structural RNA conformation regulating translation have been recently tested in bacterial systems (Neupert et al. 2008). The trans-acting cyclization of the TBEV 5′- and 3′-UTRs prevents translation by masking ribosomal access to the AUG. Moreover, this mechanism is temperature dependent, and the melting of CSA/CSb-1 or CSb-2 acts as a thermosensitive riboswitch in conjunction with the unfolding of individual low-temperature secondary structural enhancers such as the Y-shaped SL7–8 and SL11–12. Altogether we propose that temperature-dependent structural RNA rearrangement between open and closed conformations acts as a thermosensitive riboswitch for the on/off setting of TBEV translation in the questing tick under environmental conditions. More genome sequences obtained without enrichment should in the future shed new light on TBEV sequence variation and genomic folding. Finally, the putative connection between RNA folding and high virulence in the case of the HYPR strain calls for more research on adaptations that could alter the riboswitch mechanism. In conclusion, more studies on TBEV genome cyclization dynamics will shed light on important TBE epidemiological issues.
Acknowledgments
The authors are grateful to Dr. A. Brooks for useful comments and suggestions on the manuscript. This work was supported by grants to M.J. by the Baltic Sea Foundation.
Disclosure Statement
No competing financial interests exist.
