Abstract
Rhabdoviridae is one of the most diverse families of RNA viruses, and its members infect a wide range of plants, animals, and arthropods. The family comprises 13 genera and more than 150 unassigned viruses. Here, we sequenced the complete genome of a rhabdovirus belonging to the Hart Park serogroup, Kamese virus (KAMV), isolated in 1977 from Culex pruina in the Central African Republic. The genomic sequence showed an organization typical of rhabdoviruses, with additional genes in the P-M and G-L intergenic regions, as already reported for the Hart Park serogroup. Our Kamese strain (ArB9074) had 98% and 78.8% nucleotide sequence similarity with the prototypes of KAMV and Mossuril virus, which were isolated in Uganda and Mozambique, respectively, from two different Culex species. Moreover, the protein sequences had 98–100% amino acid similarity with those of the KAMV prototype, except for the product of an additional gene (U3), which showed 6% divergence. These molecular data show that our KAMV strain is genetically close to the Culex annulioris strain that was circulating in Uganda in 1967. However, this study also highlights the need to improve our knowledge of KAMV to better understand its behavior, its life cycle, and its potential reservoirs.
Introduction
Kamese virus (KAMV) was first identified in female Culex annulioris mosquitoes caught in the Kamese forest, Mawokota County, Uganda, in 1967 (1970). Like many other viruses, the newly identified virus was classified as a rhabdovirus on the basis of its bullet-shaped morphology and antigenic relationships. In addition to the prototype strain (MP6186, Uganda, 1967), several strains have been isolated and identified as KAMV by seroneutralization assay but have not been characterized at the molecular level. These strains came from different sylvatic Culex species and from Aedes africanus mosquitoes (World Health Organization Collaborating Center for Reference and Research on Arboviruses [CRORA], Institut Pasteur de Dakar,
Materials and Methods
Virus isolation
The viral strain described in this study was isolated from Culex pruina mosquitoes caught in 1977 at Lobé (18°03′E, 03°38′N) in the Central African Republic (CAR). The mosquitoes were collected in a semimountainous equatorial forest, identified, and grouped into pools of 30 individuals. Viruses were isolated and amplified by four serial passages in the brains of suckling mice (Saluzzo, et al. 1980). The brain suspensions were then lyophilized and stored in sealed glass vials at room temperature until the beginning of our experiment. The lyophilized virus was resuspended in phosphate-buffered saline and inoculated into 1- to 3-day-old newborn mice for virus amplification.
Nucleic acid sequencing
For molecular characterization, RNA was extracted from the brain tissue of a moribund newborn mouse using the QIAamp Viral RNA Mini Kit (Qiagen) according to the manufacturer's instructions. Extracted total RNA was treated with Turbo DNase (Life Technologies) to remove Mus musculus genomic DNA and then reverse transcribed into cDNA with SuperScript III reverse transcriptase and random hexamer primers (Life Technologies). The generated cDNA was amplified with the Phi29 enzyme, as described previously (Berthet, et al. 2008). Amplified DNA was quantified with the Quant-iT assay (Invitrogen), and a fixed amount of amplified DNA was fragmented using a Covaris M220 ultrasonicator according to the manufacturer's instructions. The 450 bp DNA fragments were used to construct a genomic library with the NEBNext® Ultra DNA Library Prep kit for Illumina® (New England Biolabs) according to the manufacturer's recommendations. Illumina sequencing was performed on the MiSeq instrument with the MiSeq sequencing kit v2 (Illumina) to generate 150 bp paired-end reads. A total of 5,142,844 paired-end reads were obtained. All raw reads were filtered according to quality, and reads matching the mouse genome were removed with Bowtie 2.0 software using the M. musculus mm10 sequence as a reference. Viral reads corresponding to the KAMV genome were selected by a similarity approach with the BLASTN and BLASTX search tools, based on the only complete KAMV sequence (KM204989) available in GenBank. For each selected read, only the region that matched the viral genome was retained, as previously described. All reads were assembled with SPAdes software (version 3.5) to obtain the full-length viral genome in one step. The obtained KAMV genome was 13,206 nucleotides long, with an average coverage of 13,648×. The KAMV sequence was deposited in GenBank under accession number KX497133.
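As a rough consistency check on the figures above, the following sketch relates average coverage to the number of aligned bases. It assumes, since the text does not state how coverage was computed, that average coverage is the total number of aligned bases divided by the genome length. Because only the matching region of each read was kept, reads may be shorter than the nominal 150 nt, so the read count derived here is only a lower bound. The function and variable names are illustrative.

```python
# Figures reported in the text.
GENOME_LENGTH_NT = 13_206   # assembled KAMV genome
MEAN_COVERAGE = 13_648      # reported average coverage
READ_LENGTH_NT = 150        # nominal read length before trimming
TOTAL_READS = 5_142_844     # all reads obtained from the run


def mean_coverage(aligned_bases: int, genome_length: int) -> float:
    """Assumed definition: aligned bases divided by genome length."""
    return aligned_bases / genome_length


def implied_aligned_bases(coverage: float, genome_length: int) -> int:
    """Invert the definition above to get the aligned bases it implies."""
    return round(coverage * genome_length)


aligned_bases = implied_aligned_bases(MEAN_COVERAGE, GENOME_LENGTH_NT)

# Lower bound on viral reads: every read contributes at most 150 nt.
min_viral_reads = aligned_bases / READ_LENGTH_NT

print(f"implied aligned bases: {aligned_bases:,}")
print(f"lower bound on viral reads: {min_viral_reads:,.0f}")
print(f"as a fraction of all reads: {min_viral_reads / TOTAL_READS:.1%}")
```

Under these assumptions, the reported coverage corresponds to roughly 180 million aligned bases, which is plausible for a run of about 5.1 million reads.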
Results and Discussion
The genome showed the typical rhabdovirus organization, with the main genes in the following order: 5′-N-P-M-G-L-3′. As already observed in many rhabdoviruses belonging to the Hart Park serogroup, additional genes with unknown functions (UN) were found between the P and M genes and between the G and L genes (Gubala, et al. 2008, Walker, et al. 2011). Five additional complete ORFs, ranging in size from approximately 300 to 500 nt, were characterized in these two regions. Four ORFs with unknown function are found in the P-M intergenic region. U1, U2, and U4 are at positions 2210 to 2695, 2717 to 3217, and 3286 to 3718, respectively. U3 is found at positions 2844 to 3144, within the U2 ORF. The fifth ORF (U5) is found at positions 6441 to 6752. Such intergenic ORFs have already been described for several Hart Park serogroup members, such as FLAV, HPV, NGAV, and WONV (Gubala, et al. 2008, Gubala, et al. 2010, Allison, et al. 2014).
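The ORF positions listed above can be checked with a few lines of arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive, which the text does not state explicitly. It computes the length of each ORF and confirms which ORF lies entirely within another. The helper names are illustrative.

```python
# ORF coordinates as given in the text (assumed 1-based, inclusive).
ORFS = {
    "U1": (2210, 2695),
    "U2": (2717, 3217),
    "U3": (2844, 3144),
    "U4": (3286, 3718),
    "U5": (6441, 6752),
}


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of an interval with inclusive endpoints."""
    return end - start + 1


def is_nested(inner: tuple[int, int], outer: tuple[int, int]) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


lengths = {name: orf_length(*coords) for name, coords in ORFS.items()}

# Every ordered pair (a, b) where ORF a is contained in ORF b.
nested = [
    (a, b)
    for a in ORFS
    for b in ORFS
    if a != b and is_nested(ORFS[a], ORFS[b])
]

for name, length in lengths.items():
    print(f"{name}: {length} nt")
print("nested pairs:", nested)
```

With inclusive coordinates the lengths fall between 301 and 501 nt, in line with the stated range of about 300 to 500 nt, and U3 is the only ORF contained in another (U2).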
A selected set of rhabdovirus sequences belonging to the Hart Park serogroup was used to determine the phylogenetic relationships of our KAMV variant with other rhabdoviruses. The L protein amino acid sequences were aligned in Geneious software (version 9.1.4) and edited manually to improve the quality of the alignment. The phylogenetic tree was constructed using the maximum likelihood method, and the statistical support for the tree was estimated by bootstrapping with 1000 replicates. Phylogenetic analysis indicated that KAMV ArB9074 clustered with other KAMVs isolated from C. annulioris (Walker, et al. 2015) (Fig. 1).
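To illustrate the bootstrapping mentioned above, the sketch below shows only the resampling step: each replicate draws alignment columns with replacement to build a pseudo-alignment of the same width, from which a tree would then be inferred. It is not the Geneious implementation, whose internals are not described here. The toy alignment and sequence names are hypothetical, and tree inference itself is not shown.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap replicate by sampling columns with replacement."""
    widths = {len(seq) for seq in alignment.values()}
    if len(widths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = widths.pop()
    # The same sampled column indices are applied to every sequence,
    # so each resampled column remains a real column of the alignment.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seq_a": "MKLV-TG",
    "seq_b": "MKIVATG",
    "seq_c": "MRLV-SG",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "replicates; first:", replicates[0])
```

The bootstrap support of a clade is then the percentage of replicate trees in which that clade appears.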

Maximum likelihood phylogenetic tree of 16 rhabdovirus L protein sequences. Analysis at the amino acid level was based on sequences available in GenBank. The tree was generated using Geneious software (Geneious R9 version 9.1.4;
As shown in Table 1, the full-length genome of the KAMV strain isolated in the CAR in 1977 had 98% nucleotide sequence similarity with the prototype KAMV sequence (KM204989), which was isolated in Uganda in 1967. The conserved proteins (N, P, M, G, and L) and all proteins of unknown function except U3 showed high amino acid similarity, ranging from 98.0% to 100%, whereas the U3 protein showed 6% divergence from the prototype KAMV sequence. Comparison of the KAMV genome with that of Mossuril virus, the most closely related virus, showed variable divergence at the amino acid and nucleotide levels depending on the gene considered. Similarity at the amino acid level ranged from 79% to 99.3%. Although the genes encoding the N protein and the unknown function proteins U1 and U2 showed high nucleotide divergence (7.4% to 12.8%), their protein sequences remained conserved (96.9–99.3% similarity). In contrast, the G protein and the unknown function proteins U3 and U5 were more divergent (16–22.2% divergence). This high divergence may result from the natural evolution of the strain within a different Culex species and from interactions between external viral proteins and host proteins.
FLAV, Flanders virus; G, glycoprotein; HPV, Hart Park virus; KAMV, Kamese virus; L, RNA-dependent RNA polymerase; M, matrix; MOSV, Mossuril virus; N, nucleoprotein; nucl, nucleic acid; P, phosphoprotein; prot, protein; UN, unknown protein.
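The similarity and divergence percentages discussed above come from pairwise comparisons of aligned sequences. The sketch below shows one common way to compute such values, as percent identity over aligned columns, with divergence taken as 100 minus identity. This is an assumption for illustration: the exact similarity measure used to build Table 1 is not specified in the text, and the sequences in the example are made up.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two aligned sequences.

    Columns where both sequences have a gap ('-') are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def percent_divergence(a: str, b: str) -> float:
    """Divergence defined here as the complement of percent identity."""
    return 100.0 - percent_identity(a, b)


# Made-up aligned sequences, for illustration only.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTACGTAA"
print(percent_identity(seq_1, seq_2))    # 90.0
print(percent_divergence(seq_1, seq_2))  # 10.0
```

Under this definition, the 6% divergence reported for U3 corresponds to 94% identity with the prototype sequence.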
Although KAMV has been identified only in the CAR and Uganda, C. pruina and C. annulioris are sympatric, hematophagous mosquitoes found in forested Central African countries (Mutebi, et al. 2012). Their presence in the same environment suggests that they may become infected by feeding on the same vertebrate reservoir, although this hypothesis remains to be tested.
Conclusion
The complete genomic sequence of this KAMV strain clearly shows that it is genetically close to the strain found in C. annulioris in Uganda in 1967. However, the external viral proteins of our strain (e.g., the G protein) showed large amino acid divergence from those of other closely related viruses belonging to the Hart Park group. This study demonstrates the usefulness of combining classical virology tools, namely the isolation of a viral strain, with high-throughput sequencing to obtain the whole genome sequence of a specific viral variant. In conclusion, our molecular data point to the need to improve our knowledge of KAMV diversity and to better understand KAMV behavior and life cycle.
Footnotes
Acknowledgments
We thank Heïdi Lançon and Dr. Engel-Gautier for revising the English of the article. This study was supported by the Institut Pasteur de Bangui, CAR, and the Institut Pasteur, Paris, France (Programme Transversal de Recherche CEVACAR no. 385). The CIRMF is supported by the government of Gabon, Total-Fina-Elf Gabon, and the Ministère de la Coopération Française. The funders had no role in study design, data analysis, or preparation of the article.
Authors' Contributions
H.D.S.T., E.N., M.K., and N.B. designed and planned the experiments. H.D.S.T., B.S., and N.B. performed the experiments and the bioinformatics analysis. H.D.S.T., N.B., E.N., A.G., J.C.M., and M.K. analyzed the data. H.D.S.T. and N.B. wrote the article. All authors approved the final version of the article.
Author Disclosure Statement
No competing financial interests exist.
