Abstract
We examined sequence variation in the HIV-1 gag p6 region from 27 individuals infected with HIV-1 CRF07_BC. An additional 269 gag p6 sequences of CRF07_BC from the Los Alamos National Laboratory database were also analyzed. A unique deletion of seven amino acids (aa) (p6Δ7) (aa 30–36, PIDKELY, in the HXB2 genome) was observed exclusively in CRF07_BC. Overall, 54.1% (160/296) of the CRF07_BC sequences contained the p6Δ7 mutation. The prevalence of the p6Δ7 mutation was 37.2% (29/78) and 92.3% (48/52) in CRF07_BC-infected intravenous drug users and men who have sex with men (MSM), respectively. Our results demonstrate that the p6Δ7 mutation dominates in MSM infected by HIV-1 CRF07_BC in China and suggest that this deletion could serve as a useful marker for monitoring HIV-1 evolution and the epidemic. In future studies, it will be of interest to determine whether such genotypic variation influences viral replication capacity and disease progression.
CRF07_BC was first described in 2000 and is one of the dominant strains in China, 1 and may have originated in western Yunnan, China, in the early 1990s. 2 In 2004, CRF07_BC was transported to Taiwan through drug trafficking and caused an outbreak of HIV-1 CRF07_BC infection in intravenous drug users (IDUs). 3 Recently, CRF07_BC was also identified in Shan State, Myanmar, which borders Yunnan, China, 4 representing the first CRF07_BC virus identified outside of China and indicating expansion of the HIV-1 CRF07_BC epidemic.
In 2007, Lin et al. described a 7–11 amino acid (aa) deletion, in particular a unique 7aa deletion (p6Δ7), in the p6 domain of HIV-1 gag (p6Gag). The p6Δ7 variant exists only in CRF07_BC isolates and not in any other HIV-1 genotype. 5 Since then, several studies have shown that about 25%–30% of the IDUs infected with CRF07_BC carry the signature p6Δ7 mutation. 6,7 Jiang et al. also reported that in Yunnan, China, 27.3% of the CRF07_BC-infected men who have sex with men (MSM) were p6Δ7 positive. 8 Furthermore, it has been reported that infection with the CRF07_BC p6Δ7 variant is associated with significantly lower viral loads, slower immunological progression, lower replication capacity, and decreased virus growth kinetics, 6,8–10 although Meng and colleagues reported that the p6Δ7 mutation resulted in a rapid increase of HIV-1 viral loads. 7 These results suggest that the novel CRF07_BC virus with the p6Δ7 mutation may have different replication capacities, which in turn could impact disease progression and therapy.
p6Gag is required for virus assembly and budding through the interaction between the PTAP motif near the N terminus of p6Gag and host factor Tsg101 and the YPXnL motif (residues 32–46) in the C-terminal region for binding with host factor AIP1 (also known as Alix), which couples HIV-1 p6 to the late-acting endosomal sorting complex ESCRT-III. 11 Sequence variation in the p6Gag has been reported, including aa insertion, deletion, and duplication of PTAP or KQE motif, and may influence viral replication capacity and drug susceptibility. 12
In this study, we investigated the frequency of the p6Δ7 variant in 6 IDUs and 21 MSM infected with HIV-1 CRF07_BC in Guangzhou, China. The plasma samples were collected from January to June 2013. Written informed consent was obtained from the individuals enrolled in the study, with ethical approval from the Ethics Committees of Guangdong Provincial Skin Diseases and STD Control Center (Guangzhou, China, protocol number 2013-H-01) and Southern Medical University.
HIV-1 genotypes for the plasma samples have previously been determined to be CRF07_BC. 13
To further pursue this approach, the HIV-1 gag gene was amplified by reverse transcriptase polymerase chain reaction (RT-PCR) with the primer sets and conditions listed in Supplementary Table S1 (Supplementary Data are available online at
Interestingly, the aa sequence alignment indicated the existence of a specific 7aa (aa 30–36, PIDKELY) deletion (p6Δ7) between the Tsg101 binding motif PTAP and the AIP1 binding domain YPXnL, and one subject (IDU15) had a 13aa (aa 25–37, PQKQEPIDKELYP) deletion (Fig. 1). Furthermore, the p6Δ7 variant was found in all 21 MSM samples and 66.7% (4/6) of the IDU samples tested (Table 1). The high frequency of the p6Δ7 variant observed in our study prompted us to further investigate the p6Δ7 prevalence in all CRF07_BC sequences collected nationwide in China and to explore the prevalence difference observed between our study and those reported previously.

Alignment of HIV-1 gag p6 aa sequences. Twenty-seven sequences obtained from 6 HIV-1-infected IDUs and 21 MSM in this study were aligned with the HXB2 sequence using MEGA6.0 14 and BioEdit 7.1.9 for data output. The numbering of aa sequences is based on the HXB2 engine. 15 Periods (.) denote identity with the reference sequence. Dashes indicate the deletion of aas in comparison with the reference prototype HXB2 (B.FR.K03455). Asterisks indicate stop codons. The Tsg101- and AIP1-binding domains are marked with solid lines. aa, amino acid; IDU, intravenous drug user; MSM, men who have sex with men.
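As an illustration of how the deletions described in this alignment could be called programmatically, the following minimal sketch walks a pairwise-aligned reference and query and reports the HXB2 positions at which the query carries a gap. It is not the procedure used in this study (the alignment was made with MEGA6.0 and BioEdit). The function names are hypothetical, and the HXB2 p6Gag sequence is written from memory and should be checked against the K03455 record before use.

```python
# HXB2 p6Gag amino-acid sequence. ASSUMPTION: typed from memory for
# illustration; verify against GenBank K03455 before relying on it.
HXB2_P6 = "LQSRPEPTAPPEESFRSGVETTTPPQKQEPIDKELYPLTSLRSLFGNDPSSQ"


def deleted_positions(ref_aln, query_aln):
    """Return 1-based reference positions where the aligned query has a gap."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    deleted = []
    ref_pos = 0
    for ref_char, query_char in zip(ref_aln, query_aln):
        if ref_char == "-":
            # Insertion relative to the reference: no reference position used.
            continue
        ref_pos += 1
        if query_char == "-":
            deleted.append(ref_pos)
    return deleted


def is_p6_delta7(ref_aln, query_aln):
    """True only if the query lacks exactly reference positions 30-36."""
    return deleted_positions(ref_aln, query_aln) == list(range(30, 37))


# Toy queries built from the reference itself (hypothetical, not study data).
delta7 = HXB2_P6[:29] + "-" * 7 + HXB2_P6[36:]    # lacks aa 30-36 (PIDKELY)
delta13 = HXB2_P6[:24] + "-" * 13 + HXB2_P6[37:]  # lacks aa 25-37

print(is_p6_delta7(HXB2_P6, delta7))        # True
print(deleted_positions(HXB2_P6, delta13))  # positions 25 through 37
```

The strict equality test means that a sequence carrying the 7aa deletion plus any other gap is not counted as p6Δ7. Real alignments may also place the gap a few residues to either side, so a production version would likely need a tolerance for that.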
The prevalence of p6Δ7 in MSM is higher than in IDUs (Fisher's exact test, p = .043 < .05).
aa, amino acid; IDU, intravenous drug user; MSM, men who have sex with men.
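The Table 1 comparison can be checked from the counts given in the text (p6Δ7 in 21 of 21 MSM and 4 of 6 IDUs). The sketch below is a plain-Python two-sided Fisher's exact test that sums the hypergeometric probabilities of all tables no more likely than the observed one. The function name is hypothetical, and the authors' statistical software and its two-sided convention are not stated, so this is only one reasonable reading.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as unlikely as the observed one; the small
    # relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: MSM, IDU. Columns: p6Δ7 present, p6Δ7 absent.
p_value = fisher_exact_two_sided(21, 0, 4, 2)
print(round(p_value, 3))  # 0.043
```

With these margins the only table as extreme as the observed one is the observed table itself, so the p value is 15/351 ≈ .043, in agreement with the reported figure.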
We then searched the Los Alamos National Laboratory (LANL) HIV-1 sequence database and found 269 CRF07_BC sequences with the p6Gag sequence, including 36 near full-length genome (NFLG) sequences (Supplementary Table S2). Consistent with the previous results, we found that the p6Δ7 mutation is unique to CRF07_BC viruses and is not present in any other HIV-1 genotypes or CRFs, including CRF08_BC. 5,7 Among the 269 sequences downloaded from the LANL database, 62.8% (169/269) had various deletions of 1–15 aas between the KQE and LKLSF motifs in p6Gag. It is also worth noting that 50.2% (135/269) of these CRF07_BC sequences contained the p6Δ7 variation (Table 2). Furthermore, 35.2% (25/71) of the IDUs and 87.1% (27/31) of the MSM were infected with the p6Δ7 variant (Table 2). These results indicate that the p6Δ7 variant is much more common than previously reported, particularly in the population of MSM infected with HIV-1 CRF07_BC.
Others, without information on source of infection.
The prevalence of p6Δ7 among subjects is significantly different (Fisher's exact test, p < .05).
LANL, Los Alamos National Laboratory.
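The pooled prevalences quoted in the abstract follow from adding this study's counts to the LANL counts. A short sketch of that arithmetic is given here for the total and MSM groups. The dictionary layout and function name are illustrative only, and the study total of 25/27 is derived from the 21 MSM and 4 of 6 IDUs reported as p6Δ7 positive.

```python
def prevalence(positive, total):
    """Prevalence as a percentage rounded to one decimal place."""
    return round(100 * positive / total, 1)


# (p6Δ7-positive, total) counts taken from the text.
study = {"all": (25, 27), "MSM": (21, 21)}    # this study (Guangzhou)
lanl = {"all": (135, 269), "MSM": (27, 31)}   # LANL database sequences

pooled = {}
for group in study:
    positive = study[group][0] + lanl[group][0]
    total = study[group][1] + lanl[group][1]
    pooled[group] = (positive, total, prevalence(positive, total))

print(pooled["all"])  # (160, 296, 54.1)
print(pooled["MSM"])  # (48, 52, 92.3)
```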
Since the previous studies focused on the analysis of CRF07_BC with NFLG sequences, we further analyzed 36 NFLG sequences of CRF07_BC available in the LANL HIV-1 database. Among them, 6 and 23 NFLGs came from MSM and IDUs, respectively (Supplementary Table S2). We found that 21.7% (5/23) of the CRF07_BC NFLGs from HIV-1-infected IDUs carry the p6Δ7 mutation (Table 3), which is similar to the prevalence reported previously. 6,7 However, all six CRF07_BC NFLGs from HIV-1-infected MSM carry the p6Δ7 mutation (Table 3), indicating an unusually high frequency of the p6Δ7 variant in the CRF07_BC-infected MSM population.
Others, without information on the source of infection.
The prevalence of p6Δ7 among the subject population is significantly different (Fisher's exact test, p = .012 < .05).
According to the LANL HIV-1 database, the first CRF07_BC sequence available was sampled in 1997, while the CRF07_BC p6Δ7 variant was first identified in 2001 in IDUs and in 2007 in MSM (Supplementary Table S2). Interestingly, for the CRF07_BC sequences analyzed, the prevalence of the p6Δ7 variant increased over time from 23.5% (4/17) during 1997–2002 to 46.0% (40/87) during 2003–2007 and 61.1% (116/190) during 2008–2013 (Table 4). The difference in prevalence over time is statistically significant (Pearson chi-square test, χ2 = 12.410, p < .05). A similar, statistically significant change in p6Δ7 prevalence over time was also observed in the IDUs (Fisher's exact test, p = .046 < .05). However, the p6Δ7 prevalence did not change over time in the MSM (Fisher's exact test, p = .544 > .05, Table 4).
Others, those without information on the source of infection.
The prevalence of p6Δ7 in total sequences analyzed is significantly different over time (Pearson chi-square test, χ2 = 12.410, p < .05).
The prevalence of p6Δ7 in IDUs is significantly different over time (Fisher's exact test, p = .046 < .05).
The prevalence of p6Δ7 in MSM is not different over time (Fisher's exact test, p = .544 > .05).
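The reported chi-square statistic for the change in overall prevalence over time can be reproduced from the three period counts given in the text (4/17, 40/87, and 116/190). The following sketch computes the Pearson statistic for the resulting 3×2 table in plain Python. The function name is hypothetical, and the closed-form p value used here is valid only because the table has two degrees of freedom.

```python
from math import exp


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# (p6Δ7-positive, total) per period: 1997-2002, 2003-2007, 2008-2013.
periods = [(4, 17), (40, 87), (116, 190)]
table = [[positive, total - positive] for positive, total in periods]

chi2, dof = pearson_chi_square(table)
# For exactly 2 degrees of freedom the chi-square survival function is
# exp(-x / 2); this shortcut does not hold for other dof values.
p_value = exp(-chi2 / 2)

print(round(chi2, 2), dof)  # 12.41 2
```

The statistic agrees with the reported χ2 = 12.410, and the corresponding p value is well below .05.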
We then further investigated the prevalence of p6Δ7 as a function of geographic region in China and found that, across all of the sequences analyzed, the prevalence of p6Δ7 differed significantly among regions (Pearson chi-square test, χ2 = 30.891, p < .05), with the highest prevalence of 70.7% (70/99) in Southern China, including Guangdong and Guangxi provinces and Taiwan, and the lowest prevalence (32.0%, 31/97) in Western China, for example, in the Yunnan and Sichuan provinces (Table 5 and Fig. 2). In Northern China, including Xinjiang, Liaoning, Beijing, and Henan, the average prevalence of the p6Δ7 variant was 58.7% (Table 5 and Fig. 2). A similar trend was also observed in the p6Δ7-infected IDUs, and the prevalence of p6Δ7 in IDUs was significantly different among the three regions (Pearson chi-square test, χ2 = 12.191, p < .05). In contrast, for the MSM, the prevalence of the p6Δ7 variant was fairly similar among the regions investigated, ranging from 85.7% to 100% (Table 5 and Fig. 2), indicating stable infection by the CRF07_BC p6Δ7 variant throughout the whole of China.

Geographical distribution of CRF07_BC gag p6Δ7 variants. Maps are shown for the general population, IDUs, and MSM; each province is colored according to the prevalence of HIV-1 CRF07_BC gag p6Δ7 variants. The colors representing the different prevalences of HIV-1 p6Δ7 are indicated in the legend at the bottom left of the figure.
Those without information on the source of HIV-1 infection.
Those without information on sampling location.
The prevalence of p6Δ7 in total sequences analyzed is significantly different among three regions (Pearson chi-square test, χ2 = 30.891, p < .05).
The prevalence of p6Δ7 in IDUs is significantly different in subjects from three regions (Pearson chi-square test, χ2 = 12.191, p < .05).
The prevalence of p6Δ7 in MSM does not differ between northern and southern China (Fisher's exact test, p = .125 > .05).
The 945-bp fragment (nt1348-2292 relative to the HXB2 genome) covering the 3′ portion of the HIV-1 gag gene was used to build the phylogenetic tree using the GTR + I +G model within the PhyML program (

Phylogenetic tree of HIV-1 gag. Phylogenetic analysis for the 133 CRF07_BC sequences containing the 945-bp fragments of the 3′ portion of the HIV-1 gag gene was performed on the basis of PhyML (
Thus, our study further confirms that the p6Δ7 mutation is unique to HIV-1 CRF07_BC and occurs in epidemic proportions, with a prevalence of 54.1% (160/296) in CRF07_BC-infected individuals regardless of the source of infection. The CRF07_BC p6Δ7 variant can be transmitted through multiple routes, primarily intravenous drug injection and homosexual or heterosexual contact. The p6Δ7 mutation dominates in CRF07_BC-infected MSM, with a prevalence of 92.3% (48/52), and is a signature mutation of the virus in this population in China. To our knowledge, our study is the first to comprehensively analyze the frequency of the p6Δ7 variant in MSM, and it shows a dramatically higher rate of the p6Δ7 mutation in CRF07_BC-infected MSM than that reported by Jiang et al., in which only 27.3% (18/66) of the CRF07_BC sequences from HIV-1-infected MSM contained the p6Δ7 mutation. 8 Considering that transmission among MSM has become a major route of HIV infection and that CRF07_BC is one of the dominant viruses infecting MSM in China, our data strongly suggest that CRF07_BC with the p6Δ7 mutation should be a target of HIV-1 molecular monitoring studies.
Furthermore, it has been reported that both the virus isolates and the infectious clone of CRF07_BC with the p6Δ7 mutation displayed relatively lower replication capacity and slower replication kinetics than HIV subtype B or Thai B′. 8–10,17,18 In a case-control study in which possible confounding factors, such as gender, HIV-1 subtype, initial CD4 cell count, and antiretroviral therapy, were matched, the viral loads of patients infected with CRF07_BC p6Δ7 variants were consistently lower than those of patients infected with subtype B, 10 indicating a much slower disease progression rate in patients infected by the p6Δ7 variants. However, the possible molecular mechanisms remain to be elucidated.
One possibility is the impact of p6Δ7 mutation on virus assembly and budding. The p6Δ7 deletion overlaps with the AIP1 binding domain YPXnL at residue 36Y and may affect the interaction between HIV-1 gag and host factor AIP1, which in turn interferes with HIV assembly, budding, and maturation. 9,10 A similar defect has been observed for the Y36A mutation, supporting the critical role of residue 36Y in the interaction between HIV-1 gag and host factor AIP1 and virus budding. 19 Furthermore, due to the overlap of the open reading frames for the p6 and pol proteins, the 7aa deletion also affects the activity of HIV-1 protease (PR), which in turn interferes with HIV-1 Pr55gag/Pr160Gag-Pol processing and incorporation of PR and Vpr. 9,10 It is thought that these defects correlate with the delayed HIV-1 replication, since they can be improved by transfection with wild-type p6. 9 However, further study is needed to characterize the p6Δ7 variants and to elucidate the effect of the p6Δ7 mutation on HIV-1 infection and phenotypes.
In summary, our work demonstrates that monitoring the genetic evolution of HIV-1 and investigating the novel properties of these newly emerging CRF07_BC p6Δ7 variants will provide vital insights into our understanding of the dynamics and complexity of the HIV-1 epidemic in China. This, in turn, will provide critical information about HIV-1 replication, rational design of optimal therapeutic regimens for HIV-1-infected patients, and future vaccine development in China.
Sequence Data
The sequences of CRF07_BC gag reported in this study have been deposited in GenBank under accession numbers KY818028–KY818054.
Acknowledgments
This research is supported by the Bureau of Science and Information Technology of Guangzhou Municipality (no. 201508020018, 201604020011) and the Guangdong Province Public Welfare Research and Capacity Building Project, sponsored by the Guangdong Provincial Department of Science and Technology (no. 2014B020212007).
Author Disclosure Statement
No competing financial interests exist.
