Genetic Characterization of 13 Subtype CRF01_AE Near Full-Length Genomes in Guangxi, China

Abstract

Guangxi is an important transit area for HIV transmission in South China. Characterization of the full-length genome of HIV-1 prevalent in the area is important for phylogenetic analysis and vaccine development. CRF01_AE is one of the most rapidly spreading subtypes in Guangxi. In this study, we report 13 near full-length CRF01_AE genomes from Guangxi, China. The near full-length genome was reverse transcribed and amplified in two halves with a 1-kb overlap region. The PCR products were sequenced directly. Sequence analysis showed that all 13 strains were CRF01_AE recombinants. The sequences formed two clusters, which grouped separately with sequences from Vietnam and from Fujian, China, strongly suggesting multiple introductions of CRF01_AE strains into Guangxi province. The results will improve our understanding of the phylogenetic relationships of CRF01_AE strains in South China and also help in the development of a successful HIV vaccine.

In China, since the first reported cases of HIV in 1985, the estimated number of HIV-infected people has increased alarmingly to 840,000, with the virus spreading to every province and autonomous region.¹ The main prevailing strains are subtype B, CRF07_BC, CRF08_BC, and CRF01_AE, which together accounted for about 94.9% of infections in China.² Guangxi is one of the provinces most affected by the HIV/AIDS epidemic. Bordering Vietnam in the south and Yunnan province in the west, Guangxi is considered one of the major transit areas for HIV-1 transmission in southwest China.

CRF01_AE, along with CRF08_BC, is the dominant strain in Guangxi province.³ However, most genetic analyses of CRF01_AE in Guangxi have been based on partial genome sequences, which may not provide comprehensive genetic information on the entire viral genome. The only two full-length CRF01_AE genomic sequences from Guangxi, reported in 2000, came from viruses isolated in 1997 and may differ substantially from the strains prevalent now.⁴ In this article, we describe 13 near full-length HIV-1 CRF01_AE sequences derived from plasma samples collected in 2005 and 2006 in Guangxi, China. These sequences will contribute to molecular epidemiology studies and will be used for the development of an HIV-1 vaccine.

All 13 samples were collected between 2005 and 2006, with informed consent, from patients residing in Guangxi, China who had previously been diagnosed as HIV-1 infected by antibody detection. The epidemiological data are listed in Table 1. Virus was concentrated from 500 μl of HIV-1-positive plasma by centrifugation at 23,000 × g for 1 h, and RNA was extracted using a High Pure Viral RNA Kit (Roche, USA). RNA was reverse transcribed into cDNA and the near full-length genome was amplified in two halves with a 1-kb overlap region. In brief, Superscript III (Invitrogen, USA) was used for reverse transcription with primers 1.R3.B3R (5′-ACTACTTGAAGCACTCAAGGCAAGCTTTATTG-3′) for the 3′ half and vpu4 (5′-TTAATTTTACACATGGCTTTAGRCTTT-3′) for the 5′ half. Nested polymerase chain reaction (PCR) was performed using High Fidelity Taq (Invitrogen, USA) according to the manufacturer's instructions, with separate primer sets for the 3′ and 5′ halves. For the 5′ half, MSF12b (5′-AAATCTCTAGCAGTGGCGCCCGAACAG-3′) and vif4 (5′-TTGCCACTGTCTTCTGCTCTTTC-3′) were used for first-round amplification, and F2NST (5′-GCGGAGGCTAGAAGGAGAGAGATGG-3′) and vif3 (5′-TCGCTGTCTTCGCTTCTTCCTGCCAT-3′) were used for second-round amplification. For the 3′ half, 07For7 (5′-CAAATTAYAAAAATTCAAAATTTTCGGGTTTATTACAG-3′) and 2.R3.B6R (5′-TGAAGCACTCAAGGCAAGCTTTATTGAGGC-3′) were used for first-round amplification, and VIF1 (5′-GGGTTTATTACAGGGACAGCAGAG-3′) and Low2c (5′-TGAGGCTTAAGCAGTGGGTTCC-3′) were used for second-round amplification. The same conditions were used in both rounds of amplification for both halves: 94°C for 2 min, then three cycles (94°C for 30 s, 60°C for 1 min, 68°C for 5 min 30 s), then 32 cycles (94°C for 15 s, 60°C for 30 s, and 68°C for 5 min), followed by 68°C for 10 min. All amplifications were set up in a clean room using dedicated supplies and pipettes only.
Positive PCR products were purified and then sequenced by Huada Genomics Company (China) with a variety of specific internal primers (available on request).
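
Two of the primers listed above contain IUPAC degenerate bases (R in vpu4, Y in 07For7), so each is in practice a mixture of two oligonucleotides. The following is a minimal Python sketch, not part of the original protocol, that expands such a primer into its concrete variants. The function name `expand_degenerate` is hypothetical, and only the codes R (A or G) and Y (C or T) are handled.

```python
from itertools import product

# Subset of the standard IUPAC nucleotide codes needed for these primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine: A or G
    "Y": "CT",  # pyrimidine: C or T
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences copied from the text above.
vpu4 = "TTAATTTTACACATGGCTTTAGRCTTT"
for07 = "CAAATTAYAAAAATTCAAAATTTTCGGGTTTATTACAG"

vpu4_variants = expand_degenerate(vpu4)
for07_variants = expand_degenerate(for07)

for seq in vpu4_variants + for07_variants:
    print(seq)
```

Each primer carries a single degenerate position, so each expands to exactly two sequences.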

Table 1.

Epidemiological and Clinical Data of the Thirteen Participants in Guangxi, China

Participants	Age (yr)	Gender	Transmission route	Year of collection	Place of infection (city)	GenBank accession number
05GX001	37	Female	Heterosexual	2005	Hezhou	GU564221
05GX002	28	Male	Heterosexual	2005	Liuzhou	GU564222
05GX012	59	Male	Heterosexual	2005	Laibin	GU564223
05GX013	28	Female	IDU^a	2005	Liujiang	GU564224
05GX014	23	Female	Heterosexual	2005	Liuzhou	GU564225
05GX034	28	Female	Heterosexual	2005	Liuzhou	GQ845124
05GX079	28	Male	Heterosexual	2005	Liuzhou	GQ845125
05GX128	45	Male	Heterosexual	2005	Liuming	GQ845126
05GX136	44	Male	Heterosexual	2005	Liuming	GU564226
05GX142	53	Female	Heterosexual	2005	Liuming	GU564227
05GX156	41	Male	IDU	2005	Liuming	GU564228
05GX162	31	Male	Heterosexual	2005	Liuming	GU564229
06GX239	51	Female	Heterosexual	2006	Hezhou	GU564230

^a IDU, injecting drug user.

Both overlapping subgenomic fragments were successfully obtained from all 13 samples. The sequenced fragments were edited and assembled into contiguous sequences, requiring a minimum overlap of 30 bp and a minimum match of 99–100%, with ContigExpress software, a component of Vector NTI Suite 6.0. Thirteen near full-length genomes were obtained. BLAST searches against the HIV-1 sequence database and among the sequences themselves were used to check for contamination,⁵ and no evidence of sample contamination was observed. Sequence analysis showed that the gene structure of all near full-length genome (NFLG) sequences was normal, with all nine open reading frames (ORFs) intact and open, except in 05GX136. The many G-to-A mutations in strain 05GX136, which created stop codons in many genes, suggested hypermutation. This was confirmed by the Hypermut program (http://www.hiv.lanl.gov) with the HIV-1 CRF01_AE strain CM240 as the reference sequence (p < 0.01).
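
The idea behind a hypermutation test of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the Hypermut program itself. It assumes that G-to-A changes are compared between a target context (G followed by RD in the query) and a control context (G followed by YN or RC) with a one-sided Fisher exact test. The function names and the toy sequences are hypothetical.

```python
from math import comb

def count_g_to_a(reference, query):
    """Count G-to-A changes at reference G sites, split by downstream context.

    Sequences must be aligned and of equal length. Context is read from the
    query: G followed by RD is the assumed target context; G followed by YN
    or RC is the assumed control context (R = A/G, Y = C/T, D = not C).
    """
    target_mut = target_total = control_mut = control_total = 0
    for i in range(len(reference) - 2):
        if reference[i] != "G" or query[i] not in ("G", "A"):
            continue
        nxt, nxt2 = query[i + 1], query[i + 2]
        mutated = query[i] == "A"
        if nxt in ("A", "G") and nxt2 in ("A", "G", "T"):
            target_total += 1
            target_mut += mutated
        elif nxt in ("C", "T") or (nxt in ("A", "G") and nxt2 == "C"):
            control_total += 1
            control_mut += mutated
    return target_mut, target_total, control_mut, control_total

def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] (hypergeometric null)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

def hypermut_p_value(reference, query):
    """One-sided p-value for enrichment of G-to-A changes in the target context."""
    t_mut, t_tot, c_mut, c_tot = count_g_to_a(reference, query)
    return fisher_one_sided(t_mut, t_tot - t_mut, c_mut, c_tot - c_mut)

# Toy aligned sequences (not real HIV-1 data).
toy_reference = "GGA" * 4 + "GCT" * 4
toy_query = "AGA" * 4 + "GCT" * 4
print(count_g_to_a(toy_reference, toy_query))
print(hypermut_p_value(toy_reference, toy_query))
```

A small p-value indicates that G-to-A changes are concentrated in the target context, which is the pattern expected for hypermutated sequences.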

To determine the subtype, all 13 sequences were aligned with reference sequences representing subtypes A–D, F–H, J, K, and CRF01_AE (http://www.hiv.lanl.gov) using Clustal W software. Alignments were converted into Fasta format and manually edited using BioEdit (version 7.0.0; T. Hall, North Carolina State University, Raleigh, NC). A neighbor-joining tree was constructed using the Kimura two-parameter method, including both transitions and transversions, with MEGA3.1 (Fig. 1). The reliability of the topologies was estimated by bootstrap analysis with 500 replicates. The results indicated that all sequences were CRF01_AE strains (Fig. 2). This was confirmed by bootscan and Simplot (version 3.2; S. Ray, Johns Hopkins University) analyses using reference sequences of each subtype from A to J and the recombinant CRF01_AE (represented by the 05GX001 sequence in Figs. 3 and 4). No new intersubtype recombination was found in any of the 13 sequences, and all showed a genomic structure similar to that of the CM240 strain. Interestingly, the phylogenetic analysis showed two distinct clusters among the 13 near full-length HIV-1 CRF01_AE genomic sequences. Three strains (05GX001, 05GX079, and 06GX239) clustered together (cluster 2), and the remaining 10 sequences formed a separate cluster (cluster 1). IDU strains did not cluster separately from heterosexually transmitted strains. Furthermore, subclustering was not associated with the patients' sociodemographic features such as age, sex, religion, residence, and ethnic group or with clinical signs and symptoms of the disease.
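
For reference, the Kimura two-parameter distance between two aligned sequences is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. The Python sketch below illustrates the calculation. It is not the MEGA implementation; the function name `k2p_distance` is hypothetical, and skipping sites with gaps or ambiguous bases is an assumption made here.

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * log(1 - 2 * p - q) - 0.25 * log(1 - 2 * q)

# Toy example: one transition (A/G) and one transversion (A/C) in 10 sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(round(d, 4))
```

A matrix of such pairwise distances is the input to neighbor-joining tree construction.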

FIG. 1.

Phylogenetic tree analysis. A neighbor-joining tree was created with the 13 near full-length HIV-1 sequences from Guangxi (marked with gray circles) and the reference sequences of subtypes A–D, F–H, J, K, CRF01_AE, and group O (http://hiv-web.lanl.gov/). Each reference sequence is labeled with the HIV-1 subtype, followed by the accession number. The bootstrap probability (more than 70%) indicated at the corresponding nodes of the tree represents the percentage of 500 bootstrap replicates. The scale bar represents 5% genetic distance (0.05 substitutions per site). (Color images available online at www.liebertonline.com/aid).

FIG. 2.

Phylogenetic tree of HIV-1 subtype CRF01_AE based on the near full-length genome using the neighbor-joining approach. Previously published near full-length genomic sequences representing subtype CRF01_AE from several countries or areas, labeled with subtype, source, and name, were obtained from the database (http://hiv-web.lanl.gov/) and used for comparison with the 13 Guangxi subtype CRF01_AE sequences generated in this study (marked with gray circles). The bootstrap probability (more than 70%, 500 replicates) is indicated at the corresponding nodes of the tree. HIV-1 subtype J was used as an outgroup. The scale bar represents 2% genetic distance (0.02 substitutions per site).

FIG. 3.

Similarity plots of the 05GX001 strain against subtype reference strains from the Los Alamos database, including A (92UG037), B (HXB2), C (95IN21068), D (ELI), F (93BR020), G (92NG083), H (056), J (SE7887), K (MP535), and CRF01_AE (CM240). Simplot was performed using a sliding window of 500 bp and a step size of 30 bp with replicates of the dataset used. The alignment was gap-stripped before being analyzed. (Color images available online at www.liebertonline.com/aid).
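
The sliding-window calculation behind a similarity plot can be sketched as follows. This is an illustration only, not the Simplot program. It assumes plain percent identity as the similarity measure, and the function names `gap_strip` and `similarity_plot` are hypothetical. The default window and step sizes are those given in the caption.

```python
def gap_strip(seqs):
    """Drop every alignment column that contains a gap in any sequence."""
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]

def similarity_plot(query, reference, window=500, step=30):
    """Return (window midpoint, percent identity) pairs along the alignment."""
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b)
        points.append((start + window // 2, 100.0 * matches / window))
    return points

# Toy example with a small window so the output is easy to check by hand.
toy_points = similarity_plot("ACGTACGTAC", "ACGTTCGTAC", window=4, step=2)
print(toy_points)
```

Plotting one such curve per reference strain against the same query gives the kind of figure described in the caption.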

FIG. 4.

Bootscan analysis of the 05GX001 strain. The bootscan analysis was done using reference sequences of subtype A (92UG037), B (HXB2), C (95IN21068), D (ELI), F (93BR020), G (92NG083), H (056), J (SE7887), K (MP535), and CRF01_AE (CM240). The bootscan window was 500 bp with a step size of 30 bp. The x-axis indicates alignment; the y-axis indicates the percentage supporting the clustering with reference sequences. (Color images available online at www.liebertonline.com/aid).

To further investigate the genetic relationships between our strains and those from other countries or areas, we performed additional phylogenetic analyses in which all our strains were aligned with subtype CRF01_AE full-length genomes from different countries or areas obtained from the Los Alamos database. The tree confirmed the two distinct clusters. Cluster 1 strains were closely related to some strains from Fujian province and Japan. Cluster 2, together with the two variants isolated in 1997 in Guangxi, grouped with some strains prevalent in Vietnam (Fig. 2), which supports the close epidemiological relationship between the CRF01_AE strains in Guangxi and Vietnam reported previously.⁶⁻⁹

Fujian is another province in China with CRF01_AE as the main prevalent subtype. To date, 13 near full-length genomes of variants from Fujian have been reported in the Los Alamos database.¹⁰ Unlike the strains in Guangxi, where two obvious subgroups were found, CRF01_AE sequences from Fujian were dispersed among CRF01_AE strains from different countries in the phylogenetic tree, which suggests a greater complexity of the HIV-1 CRF01_AE strains prevalent in Fujian than in Guangxi. The intersequence nucleotide distances of our sequences were computed using MEGA3.1 software and compared with those of CRF01_AE reference strains from Fujian province, China. A low degree of interisolate diversity was found among our 13 strains, with a mean overall distance of 4.5% ± 0.1%, whereas a higher degree of interisolate diversity was found in Fujian, with a mean overall distance of 7.2% ± 0.2%.
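
A mean overall distance of this kind is the average of the distances over all sequence pairs. The Python sketch below shows one way such a figure and its standard error could be obtained. The text does not state the substitution model or the standard-error method used, so the uncorrected p-distance, the bootstrap over alignment columns, the replicate count, and the function names are all assumptions made for illustration.

```python
import random
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned, gap-free sequences."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def mean_pairwise_distance(seqs):
    """Average distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def bootstrap_se(seqs, replicates=500, seed=0):
    """Standard error of the mean distance from resampled alignment columns."""
    rng = random.Random(seed)
    n = len(seqs[0])
    means = []
    for _ in range(replicates):
        cols = [rng.randrange(n) for _ in range(n)]
        resampled = ["".join(s[i] for i in cols) for s in seqs]
        means.append(mean_pairwise_distance(resampled))
    avg = sum(means) / len(means)
    return (sum((m - avg) ** 2 for m in means) / (len(means) - 1)) ** 0.5

# Toy alignment (not real data): pairwise distances are 0.25, 0.5, and 0.25.
toy = ["AAAA", "AAAT", "AATT"]
toy_mean = mean_pairwise_distance(toy)
toy_se = bootstrap_se(toy, replicates=50)
print(toy_mean, toy_se)
```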

CRF01_AE was first identified in Thailand, in Southeast Asia,¹¹,¹² where it caused a serious AIDS problem and an explosive HIV-1 epidemic. In China, CRF01_AE was first reported in Yunnan in late 1994.¹³ In Guangxi, CRF01_AE is believed to have been imported from Vietnam through Pingxiang City in the south of Guangxi, following the heroin trafficking route.⁶⁻⁹ In this study, we characterized the near full-length genomes of 13 HIV-1 strains circulating in Guangxi and observed two clusters among the CRF01_AE strains prevalent there, which may reflect more than one introduction of HIV-1 CRF01_AE into the region. The characterization of the near full-length genome will contribute to our understanding of the evolution of the CRF01_AE epidemic in Guangxi and will also be helpful for the development of an effective vaccine.

Sequence Data

The nucleotide sequences have been submitted to GenBank and assigned accession numbers GQ845124–GQ845126 and GU564221–GU564230.

Acknowledgments

This work was supported by the National Key S&T Special Projects on Major Infectious Diseases (Grants 2008ZX10001-004 and 2008ZX10001-012) and the National Natural Science Foundation of China (Grant 30700706). We also thank Dr. Feng Gao of the Duke Human Vaccine Institute, Duke University Medical Center, for providing technical support for reverse transcription and nested PCR.

Author Disclosure Statement

No competing financial interests exist.

References

1. Xing, Liang, Wan, et al. Distribution of recombinant human immunodeficiency virus type-1 CRF01_AE strains in China and its sequence variation in the env V3-C3 region. Chin J Prev Med (Chin), 2004; 38:300–304.

2. Xing, Liang, Hong, et al. The potential relationship between variation in the env V3-V4 region of HIV-1 predominant strains and virus biological feature. Chin J Microbiol Immunol (Chin), 2005; 25:185–189.

3. Laeyendecker, Zhang, Quinn, et al. Molecular epidemiology of HIV-1 subtypes in southern China. J Acquir Immune Defic Syndr, 2005; 38:356–362.

4. Piyasirisilp, McCutchan, Carr, et al. A recent outbreak of human immunodeficiency virus type 1 infection in southern China was initiated by two highly homogeneous, geographically separated strains, circulating recombinant form AE and a novel BC recombinant. J Virol, 2000; 74:11286–11295.

5. Korber, Learn, Mullins, et al. Protecting HIV sequence databases. Nature, 1995; 378:242–243.

6. […], Chen, Shao, et al. Emerging HIV infections with distinct subtypes of HIV-1 infection among injection drug users from geographically separate locations in Guangxi Province, China. J Acquir Immune Defic Syndr, 1999; 22:180–188.

7. Quan, Chung, Long, et al. HIV in Vietnam: The evolving epidemic and the prevention response, 1996 through 1999. J Acquir Immune Defic Syndr, 2000; 25:360–369.

8. Kato, Shiino, Kusagawa, et al. Genetic similarity of HIV type 1 subtype E in a recent outbreak among injecting drug users in northern Vietnam to strains in Guangxi Province of southern China. AIDS Res Hum Retroviruses, 1999; 15:1157–1168.

9. Kato, Kusagawa, Motomura, et al. Closely related HIV-1 CRF01_AE variant among injecting drug users in northern Vietnam: Evidence of HIV spread across the Vietnam-China border. AIDS Res Hum Retroviruses, 2001; 17:113–123.

10. Huang, Zheng, Yan, et al. Genetic characterization of CRF01_AE full-length human immunodeficiency virus type 1 sequences from Fujian, China. AIDS Res Hum Retroviruses, 2007; 23:569–574.

11. McCutchan, Hegerich, Brennan, et al. Genetic-variants of HIV-1 in Thailand. AIDS Res Hum Retroviruses, 1992; 8:1887–1895.

12. […], Takebe, Luo, et al. Wide distribution of 2 subtypes of HIV-1 in Thailand. AIDS Res Hum Retroviruses, 1992; 8:1471–1472.

13. Cheng, Zhang, Capizzi, et al. HIV-1 subtype-E in Yunnan, China. Lancet, 1994; 344:953–954.