Abstract
Ovine adenovirus 287 (OAdV287) has emerged as one of the most promising gene vectors owing to its unique biological characteristics. To obtain more detailed knowledge of the codon usage of OAdV287, a comparative study of the codon usage of OAdV287 and the prototypes of human adenovirus serotypes 2 and 5 (HAdV2/5) was carried out. Commonly used indices of codon usage patterns, including the effective number of codons and relative synonymous codon usage, were adopted together with statistical methods. Overall, OAdV287 had a more biased and conservative codon usage pattern than HAdV2/5. Both mutation pressure and natural selection played important roles in shaping the codon usage patterns of these three adenoviruses. All the preference codons of OAdV287 had A/U ends and were totally different from those of sheep and humans; however, the preference codons of HAdV2/5 mostly had G/C ends and mostly coincided with those of sheep and humans. The codon usage analysis in this study supplies some clues for further understanding the unique biological characteristics of OAdV287 as a gene vector.
Introduction
Seven serotypes of adenovirus have been isolated from sheep and are referred to as OAdV serotypes 1–7. Serotypes 1–6 were assigned to the genus Mastadenovirus, and because serotype 7 showed sharp phylogenetic divergence from the mastadenoviruses, it was proposed as the prototype of the newly recognized genus Atadenovirus (Bangari and Mittal, 2006). OAdV287 is the representative strain of serotype 7 and was first isolated in Western Australia (Peet et al., 1983). Thirty open reading frames (ORFs) of OAdV287 in two transcriptional directions, including structural genes, nonstructural genes, and some genes of uncertain function, have been identified (Fig. 1). For HAdV2/5, the genome structures have been well studied and 36 ORFs have been identified. OAdV287 can efficiently transduce a variety of human and other animal cells in culture, but replicates abortively in the wide range of nonovine cell lines tested (Both, 2004). Currently, OAdV287 vectors are still at the laboratory stage because their biocompatibility, stability, and biosafety are not yet fully understood and urgently need to be clarified.

Fig. 1. Map of the genome structure of OAdV287. Thirty putative open reading frames (ORFs) have been identified in two transcriptional directions, including the ORFs in the left-hand (LH) region, core region, early 4 (E4) region, and right-hand (RH) region. OAdV287, ovine adenovirus 287; ORF, open reading frame.
Analysis of codon usage has proven to be an effective way to understand the molecular evolution of viruses and their individual genes. It can also be used to better understand the interaction between viruses and hosts (Karlin et al., 1990). The relation between the codon usage patterns of viruses and hosts is an interesting and complicated phenomenon. The codon usage pattern of poliovirus is mostly coincident with that of its host, whereas the codon usage pattern of Hepatitis A virus (HAV) is antagonistic to that of its host, and the codon usage pattern of Enterovirus 71 is a mixture of coincidence with and antagonism to that of its host (Sanchez et al., 2003; Mueller et al., 2006; Liu et al., 2011). A detailed knowledge of the codon and nucleotide preferences of AdVs is necessary to develop an efficient gene expression system, and synonymous codon usage has been analyzed across all fully sequenced adenoviral genomes (Das et al., 2006). To our knowledge, the comparison of codon usage between AdVs and their hosts has not been elaborated. In the present study, we compared the nucleotide bias and codon usage of OAdV287 and HAdV2/5, and also analyzed the codon usage relations between the viruses (OAdV287 and HAdV2/5) and their hosts (sheep and humans), with the aim of gaining more detailed knowledge of their potential as gene vectors.
Materials and Methods
Sequence data
The three DNA genomic sequences of AdVs in the present study were downloaded from GenBank (
Nucleotide composition and codon usage analyses
The nucleotide contents of OAdV287 and HAdV2/5 were calculated by DNAStar7.0. The effective number of codons (ENC) was used to quantify how far the codon usage of a gene departed from equal usage of synonymous codons. ENC values range from 20 (only one codon is used for each amino acid) to 61 (all codons are used equally), and the codon usage of a gene can be considered biased when its ENC value is <35 (Wright, 1990; Fu, 2010).
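As an illustration of the ENC index, the following is a minimal Python sketch of Wright's (1990) formula, ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the mean codon homozygosity of the amino acids in one degeneracy class. It is not the EMBOSS implementation used in this study; the function name, the choice to raise an error when a degeneracy class cannot be estimated, and the cap at 61 are assumptions made for the sketch.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def effective_number_of_codons(orf):
    """ENC of one in-frame ORF (hypothetical helper, Wright 1990 formula)."""
    seq = orf.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Codon counts grouped by amino acid (stop codons ignored).
    family_counts = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            family_counts[aa].append(counts[codon])

    # Homozygosity F of each amino acid, grouped by degeneracy class k.
    homozygosity = defaultdict(list)
    for xs in family_counts.values():
        k, n = len(xs), sum(xs)
        if k > 1 and n > 1:  # Met and Trp (k = 1) contribute the constant 2
            f = (n * sum((x / n) ** 2 for x in xs) - 1) / (n - 1)
            homozygosity[k].append(f)

    enc = 2.0
    # Number of amino acids in the 2-, 3-, 4-, and 6-fold classes.
    for k, n_families in ((2, 9), (3, 1), (4, 5), (6, 3)):
        fs = homozygosity.get(k)
        if not fs or sum(fs) <= 0:
            # Assumption: short ORFs that miss a class are rejected here.
            raise ValueError(f"cannot estimate F for {k}-fold families")
        enc += n_families / (sum(fs) / len(fs))
    return min(enc, 61.0)  # small-sample estimates can exceed 61
```

Short ORFs often lack some amino acids entirely, so a production tool needs a fallback for missing degeneracy classes; the sketch only reports the problem.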
Relative synonymous codon usage (RSCU) of each codon in each ORF of OAdV287 and HAdV2/5 was calculated to investigate the characteristics of synonymous codon usage without the confounding influence of the amino acid composition of different gene samples (Sharp and Li, 1986). The comparison of codon usage between the viruses (OAdV287 and HAdV2/5) and their hosts (sheep and humans) was conducted based on their preference codons, and the codon usage data of humans and sheep were obtained from the codon usage database online (
To investigate whether nucleotide composition or mutation pressure can explain their codon usage patterns, ENC was plotted against GC3 (GC content at the third codon position) for each ORF of OAdV287 and HAdV2/5. The plot contains a reference curve, which is an approximate upper limit for the value of ENC and represents the expected position of genes whose codon usage is determined only by variation in GC3 content. Genes lying well below the curve indicate that nucleotide composition explains only some of the variation in codon usage (Wright, 1990).
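The reference curve can be sketched in Python as follows, assuming Wright's (1990) approximation ENC = 2 + s + 29/(s² + (1 − s)²), where s is GC3 expressed as a fraction. The function names are hypothetical, and for simplicity `gc3` counts the third position of every codon in the ORF, which may differ slightly from the convention of the software used in this study.

```python
def gc3(orf):
    """Fraction of G or C at the third position of each codon in an in-frame ORF."""
    seq = orf.upper().replace("U", "T")
    thirds = [seq[i + 2] for i in range(0, len(seq) - 2, 3)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def expected_enc(s):
    """Expected ENC when codon usage is shaped only by GC3 composition (s in [0, 1])."""
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Points for drawing the reference curve across the full GC3 range.
reference_curve = [(i / 100, expected_enc(i / 100)) for i in range(101)]
```

An ORF whose observed ENC falls clearly below `expected_enc(gc3(orf))` is one whose codon bias is not accounted for by GC3 composition alone.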
The RSCU and ENC values were calculated by DNAStar7.0 and EMBOSS 6.1.0.1, respectively.
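For readers who wish to reproduce the RSCU values without the packages named above, the definition of Sharp and Li (1986) can be sketched in Python: the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The function names are hypothetical, and omitting amino acids that are absent from the ORF is an assumption of the sketch.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(orf):
    """RSCU of each degenerate sense codon in one in-frame ORF."""
    seq = orf.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if len(codons) == 1 or total == 0:
            continue  # skip Met, Trp, and amino acids absent from the ORF
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def preference_codons(rscu_values):
    """Codon with the highest RSCU for each amino acid."""
    best = {}
    for codon, value in rscu_values.items():
        aa = CODON_TABLE[codon]
        if aa not in best or value > rscu_values[best[aa]]:
            best[aa] = codon
    return best
```

An RSCU of 1 means the codon is used exactly as often as expected under equal usage; values above 1 mark codons used more often than expected.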
Statistical analyses
Principal component analysis was carried out to analyze the major trend in codon usage pattern among different genes (Mardia et al., 1979; Jolliffe, 2002). Each ORF of OAdV287 and HAdV2/5 was represented as a 59-dimensional vector, in which each dimension corresponded to the RSCU value of one sense codon; only codons belonging to amino acids with several synonymous codons were included, so AUG, UGG, and the stop codons were excluded. Principal component analysis can compress the high-dimensional information into a two-dimensional map, which provides a more convenient way to visualize differences in codon preferences between different genes. We set up a two-dimensional map made up of the first principal component f 1 and the second principal component f 2. Both f 1 and f 2 were generated from principal component analysis on RSCU values.
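The projection onto the first two principal components can be sketched in Python with NumPy, assuming a matrix with one row per ORF and one column per codon RSCU value. This is an illustrative covariance-based PCA through singular value decomposition, not the SPSS procedure used in this study, and the function name is hypothetical.

```python
import numpy as np


def pca_first_two(rscu_matrix):
    """Scores on the first two principal axes and their explained-variance fractions.

    rscu_matrix: rows are ORFs, columns are codon RSCU values (59 in this study).
    """
    x = np.asarray(rscu_matrix, dtype=float)
    centered = x - x.mean(axis=0)  # remove the per-codon mean
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :2] * s[:2]  # coordinates of each ORF on axes 1 and 2
    explained = s ** 2 / np.sum(s ** 2)  # fraction of total variance per axis
    return scores, explained[:2]
```

The two columns of `scores` give the coordinates used for a two-dimensional map of ORFs, and `explained` gives the share of total variability carried by each axis.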
Linear correlation analysis was conducted to study the correlation between synonymous codon usage bias and nucleotide composition, wherein the f 1 and f 2 values were compared with the nucleotide compositions at the third (synonymous) codon position of all ORFs of OAdV287 and HAdV2/5. The analyses were performed based on the Pearson's rank (two-tailed) correlation analysis. All statistical analyses were carried out with the statistical software SPSS11.7 for Windows.
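A correlation coefficient of this kind can be sketched in plain Python, assuming the Pearson product-moment coefficient is the statistic intended. The function name is hypothetical, and the sketch returns only the coefficient; the two-tailed p value would still have to come from a statistics package.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold, for example, the f 1 value of each ORF and `ys` the C3 content of the same ORFs, in the same order.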
Results
Nucleotide composition bias
The GC contents (%) for different ORFs of OAdV287 fluctuated from 27.95 to 42.22 with a mean value of 33.48±3.26, indicating that nucleotides A and T were the major elements in the OAdV287 genome. The GC3 contents (%) for different ORFs of OAdV287 varied from 16.77 to 46.67, with a mean value of 25.49±6.16 (Table 2). For HAdV2/5, the mean values of GC contents (%) were 54.42±7.77 and 54.55±7.47, and the mean values of GC3 contents (%) were 60.61±11.97 and 61.10±11.27, respectively. Both the GC and GC3 contents (%) of HAdV2/5 were obviously higher than those of OAdV287 (Supplementary Tables S1 and S2; Supplementary Data are available online at
ENC, effective number of codons.
Synonymous codon usage
For different ORFs of OAdV287, the ENC values ranged from 33.90 to 53.04 with a mean value of 42.52±4.40. The counterparts in HAdV2/5 were 48.04±5.82 and 47.80±6.70, with a range of 38.17 to 60.51 and 30.24 to 60.63, respectively. Among all the ENC values, only one ORF (U exon) had high codon bias (ENC <35), indicating that neither OAdV287 nor HAdV2/5 had particularly significant codon usage bias (Table 2).
HAdV2 and HAdV5 had the same preference codons, which were very different from those of OAdV287 (Table 3). For OAdV287, all preferentially used codons had A/U ends, with U-ended codons accounting for 65% of all preference codons. UUA, GGA, AGA, GAA, AAA, and CAA were the preference codons for Leu, Gly, Arg, Glu, Lys, and Gln, respectively; the preference codons for the rest of the amino acids had U ends. In contrast to OAdV287, nearly all the preference codons of HAdV2/5 had G/C ends, except for Ile and Phe, whose preference codons were AUU and UUU, respectively. The preference codons of humans and sheep mostly had G/C ends, and the two hosts shared the same preference codon for each amino acid except His and Arg, for which the preference codons were CAC and CGC in humans but CAU and AGG in sheep. Interestingly, the codons most abundantly used in HAdV2/5 were broadly consistent with those of humans and sheep, whereas the preference codons of OAdV287 were totally different from those of humans and sheep.
The RSCU values of preference codons for each amino acid are shown in bold.
AA is the abbreviation of amino acid.
RSCU values are mean values.
The preference codons of human (
The preference codons of sheep (
OAdV287, ovine adenovirus 287; RSCU, relative synonymous codon usage.
In the ENC-GC3 plot of all the ORFs of OAdV287 and HAdV2/5, the GC3 contents of OAdV287 were lower than those of HAdV2/5, and most of the ENC values of the OAdV287 ORFs were smaller than those of HAdV2/5 (Fig. 2). In addition, a majority of ORFs lay below the reference curve, whereas only a few ORFs lay on or above the curve, implying that nucleotide composition explained only some of the variation in their codon usage and that other factors also shaped the codon usage pattern of these genes.

Fig. 2. The ENC-GC3 plot of each ORF of OAdV287 and HAdV2/5. The curve indicates the expected ENC values when GC3 composition is the only factor that influences codon usage bias. A majority of ORFs lay below the reference curve, whereas only a few ORFs lay on or above the curve. HAdV2/5, human adenovirus serotypes 2 and 5; ENC, effective number of codons.
Correlation analyses
The f 1 and f 2 values accounted for 30.607% and 8.025% of the total variability, respectively, and thus explain a substantial amount of the variation in codon usage trends. Interestingly, the ORFs of HAdV2 and HAdV5 had a similar distribution, which was clearly different from that of OAdV287; a more interesting result was that the ORFs of HAdV2/5 scattered more broadly than those of OAdV287 (Fig. 3), implying that the codon usage of the OAdV287 ORFs was more consistent, in terms of co-evolution, than that of the HAdV2/5 ORFs.

Fig. 3. The plot of the first and second axis values of each ORF of OAdV287 and HAdV2/5 generated from principal component analysis on relative synonymous codon usage values. ORFs of HAdV2/5 had a similar distribution and scattered more broadly than ORFs of OAdV287.
The f 1 value was much larger than the f 2 value, so the f 1 value had greater explanatory power for the total variability in codon usage. The f 2 value had a very significant negative correlation with C3 content, whereas the f 1 value had a very significant positive correlation with C3 content, so the latter was considered persuasive. For OAdV287, the f 1 value was highly significantly correlated with the nucleotide contents at the third codon position except for A3. For HAdV2/5, the f 1 value was highly significantly correlated with all the nucleotide contents at the third codon position. Overall, significant correlations were observed between the nucleotide contents and the f 1 (f 2) values (Table 4), implying that compositional constraint played an important role in determining the variation of synonymous codon usage among all ORFs of the three AdVs.
NS means nonsignificant (p>0.05).
p<0.01.
0.01<p<0.05.
Discussion
In contrast to HAdV2/5, the OAdV287 genome has a high AT content and a correspondingly low GC content. Correlation analysis between the first two axis values of the principal component analysis and the nucleotide composition suggested that compositional constraint played an important role in synonymous codon usage in all three AdVs. Two main factors have been proposed to explain codon bias, mutational bias and natural selection, acting alone or together (Lavner and Kotlar, 2005). For viruses, the forces driving their evolution mainly include the mutational bias arising from their own nucleotide composition and the selection pressure from their hosts (Liu et al., 2011). In our research, the ENC-GC3 plot suggested that both mutational bias and selection pressure accounted for codon usage bias in OAdV287 and HAdV2/5. However, compared with HAdV2/5, the average ENC value of OAdV287 was smaller, so OAdV287 was more biased in codon usage, suggesting that the selection pressure from hosts may have played a more powerful role in shaping the codon usage of OAdV287 than that of HAdV2/5.
The extent of codon bias in relation to the level of expression of virus genes in a host–parasite system has been previously described, and nucleotide composition and codon usage have been reported to be active driving forces during the recent evolutionary history of the Astroviridae and of small DNA viruses (Berkhout et al., 2002; Sewatanon et al., 2007; van Hemert et al., 2007). These studies suggest that some relationship between the codon usage of viruses and that of their hosts does exist. The codon usage of HAV is complementary to that of human cells, never adopting the codons abundant in the host cells as its own preference codons; this has been interpreted as a subtle strategy to avoid competition for cellular tRNAs in the absence of a precise mechanism for inducing shutoff of cellular protein synthesis, and has been recognized as a strategy for escaping the host's immunologic response (Sanchez et al., 2003; Pinto et al., 2007). Recent reports have indicated that codon usage can change the protein folding process and protein structure by controlling the translation rate (Weygand-Durasevic and Ibba, 2010; Saunders and Deane, 2010), and the roles of rare codons in the control of translation speed have been extensively reported (Sørensen et al., 1989; Chou and Lakatos, 2004). Accumulation of rarely used codons may lead to a slow translation rate, and too low a translation rate probably results in insufficient synthesis of, or structural changes in, the proteases that are responsible for arousing antiviral responses, so viruses can escape from host defenses and grow in a quiescent way (Mueller et al., 2006; Thacker et al., 2009). The biased nucleotide composition of OAdV287 was accompanied by a preference for A/U-ended codons, whereas most preference codons of HAdV2/5 have G/C ends; that is, similar to HAV, the codon usage pattern of OAdV287 is mostly antagonistic to that of humans and of its natural host, sheep.
Thus, the contrast in codon usage between OAdV287 and its hosts may cause less impact on host protein synthesis than HAdV2/5 does and likely induces a very low translation rate of OAdV287, which might even go undetected by host cell defenses, bringing some biosafety risks to its application as a gene vector.
OAdV287 was reported to replicate efficiently, to a high titer, in ovine fetal lung (CSL503) and skin fibroblastic (HVO156) cell lines (Boyle et al., 1994; Both, 2004). To date, there are no reports that OAdV287 can successfully replicate in nonovine cells. In HepG2 human liver carcinoma cells, all the viral promoters are inactive; in Michigan Cancer Foundation-7 (MCF-7) human breast cancer cells, only some early promoters are active, the major late promoters are inactive, and only minimal DNA replication occurs; in IMR90 human lung fibroblasts, DNA replication occurs and late proteins are synthesized, but incorrect protein processing prevents the formation of infectious virus particles (Khatri et al., 1997; Kumin et al., 2002). The replication of OAdV287 can thus be blocked at different stages depending on cell type. It may be the long-term co-evolution and co-adaptation between OAdV287 and sheep that allow OAdV287 to replicate successfully in sheep rather than in humans: tissues differ in codon usage in mammals, and codon-mediated translational control may play an important role in the differentiation and regulation of tissue-specific gene products in humans; moreover, the tissue tropism of OAdV287 differs from that of HAdV2/5, as the liver is not the dominant target for OAdV287 (Plotkin et al., 2004; Bangari and Mittal, 2006). Therefore, although the abortive replication of OAdV287 in the nonovine cell lines tested after transfection should in theory improve its biological safety as a gene vector, the potential safety risk cannot be ignored from the perspective of codon usage pattern. Based on serum cross-neutralization tests, seven different serotypes of ovine AdVs can be identified, and RFLP analysis has suggested that ovine AdVs 1–6 are fairly similar to each other and to mastadenoviruses (Barbezange et al., 2000).
Although OAdV287 is also an ovine adenovirus, it shows dramatic phylogenetic divergence from the mastadenoviruses because of the high AT content of its genome and has been defined as the prototype of the new Atadenovirus genus (Both, 2004). The analysis in the present study therefore cannot represent all the ovine AdVs, considering that OAdV287 is an atypical ovine adenovirus whose genome is sharply distinct from those of the other six serotypes of ovine adenoviruses.
Footnotes
Acknowledgments
This work was supported in part by grants from National Science and Technology Key Project (2009ZX08007-006B), International Science and Technology Cooperation Program of China (No. 2010DFA32640), and Science and Technology Key Project of Gansu Province (No. 0801NKDA034). This study was also supported by National Natural Science foundation of China (No. 30700597 and No. 31072143).
Disclosure Statement
No competing financial interests exist.
References
