Abstract
The global fight against human immunodeficiency virus (HIV) is complicated by its extensive genetic diversity, which arises from high mutation rates, rapid replication, and frequent recombination events. These factors lead to the emergence of numerous recombinant forms of HIV-1, contributing to the virus’s adaptability and complicating prevention and treatment efforts. In this study, we identified two novel unique recombinant forms (URFs) of HIV-1, CRF01_AE/CRF79_0107 and CRF01_AE/CRF07_BC, through near full-length genome sequence analysis. These URFs were detected in two individuals within the student men who have sex with men (MSM) population of Shijiazhuang, Hebei Province, China. Both have CRF01_AE as the backbone. PQ585802 is a second-generation recombinant composed of CRF01_AE and CRF79_0107, a recombination pattern identified here for the first time. PQ585803 is a second-generation recombinant composed of CRF01_AE and CRF07_BC and is distinct from previously identified recombinant forms. This study underscores the urgent need for targeted public health measures focusing on high-risk populations, such as MSM and students, to curb the spread of HIV-1. Tailored education, enhanced access to prevention services, and strategies addressing risky behaviors are critical in reducing HIV-1 prevalence and mitigating the challenges posed by recombinant forms.
Human immunodeficiency virus (HIV) demonstrates extensive genetic variability, driven by its elevated mutation frequency, rapid replication, and propensity for recombination. 1,2 As the virus undergoes evolutionary and spatial dissemination, diverse HIV-1 subtypes and mutations have been observed across distinct geographic regions. 3 Furthermore, 159 circulating recombinant forms (CRFs) and a multitude of unique recombinant forms (URFs) have been identified. 4
Unsafe sexual practices, multiple sexual partners, lack of condom use, and substance abuse are strongly associated with HIV prevalence among students. A study analyzing data from 2010 to 2019 reported 141,557 new HIV cases among students aged 15–24. The annual number of cases increased from 9,373 in 2010 to 15,790 in 2019. Notably, 80% of these infections were attributed to homosexual transmission. 5
Since the first AIDS case was reported in Hebei Province in 1989, a total of 17,891 new HIV/AIDS cases had been documented by 2020. Initially, the predominant transmission route was bloodborne. However, since 2005, homosexual transmission has been the most common route of HIV-1 transmission in Hebei Province. The proportion of men who have sex with men (MSM) among newly diagnosed cases in Hebei Province (more than 60.0%) has been significantly higher than the national level (25.6%). 6 Changes in HIV transmission routes are often accompanied by shifts in the prevalence of HIV-1 subtypes. In the MSM population of Hebei Province, the dominant strains are CRF01_AE and CRF07_BC, comprising 51.9% and 30.4% of cases, respectively. 7 The co-circulation of these strains has facilitated the emergence of novel recombinants, for example, CRF79_0107. 8
In this study, we detected two novel URFs of HIV-1 (CRF01_AE/CRF79_0107 and CRF01_AE/CRF07_BC) by near full-length genome (NFLG) sequence analysis; both differ from previously reported recombinants. HB010404, a 24-year-old male, received an HIV diagnosis in February 2022, with an initial CD4+ T cell count of 492 cells/µL. Similarly, HB010942, another 24-year-old male, was confirmed HIV positive in May 2023, presenting a baseline CD4+ T cell count of 406 cells/µL. Both individuals are unmarried students who acquired the infection through homosexual transmission.
The NFLG of both samples was amplified and sequenced. Viral RNA was extracted from 200 µL plasma samples obtained from the two patients using the QIAamp Viral RNA Mini Kit (QIAGEN, Cat. No. 52904) according to the manufacturer’s instructions. The PrimeScript™ One Step RT-PCR Kit Ver.2 (Takara, Code No. RR055A) and Premix Taq™ (Ex-Taq™ Version 2.0 plus dye) (Takara, Code No. RR902A) were then used to generate cDNA and perform nested polymerase chain reaction (PCR), amplifying the 5′ and 3′ half-genome regions of the NFLG with an overlap of 1 kb. The reaction conditions were as follows: 5 minutes at 94°C; 30 cycles of 30 seconds at 94°C, 30 seconds at 60°C, and 7 minutes at 72°C; and a final 10-minute extension at 72°C. The PCR products were then verified by gel electrophoresis, purified with the Wizard® SV Gel and PCR Clean-Up System (Promega, A9281), and submitted to SinoGenoMax (China) for Sanger sequencing.
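As a rough planning aid, the cycling program above can be written down as data and its total hold time added up. The sketch below is illustrative only: the variable and function names are hypothetical, and the result ignores ramp times between temperatures, which depend on the thermocycler.

```python
# Hold times in seconds, taken from the cycling conditions described above.
INITIAL_DENATURATION = 5 * 60            # 5 min at 94 C
CYCLE_STEPS = [
    ("denature, 94 C", 30),
    ("anneal, 60 C", 30),
    ("extend, 72 C", 7 * 60),
]
CYCLES = 30
FINAL_EXTENSION = 10 * 60                # 10 min at 72 C


def total_hold_seconds(initial, cycle_steps, cycles, final):
    """Sum of programmed hold times; ramping between steps is not included."""
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial + cycles * per_cycle + final


if __name__ == "__main__":
    total = total_hold_seconds(
        INITIAL_DENATURATION, CYCLE_STEPS, CYCLES, FINAL_EXTENSION
    )
    print(f"Programmed hold time: {total // 60} min")
```

The 7-minute extension step dominates the run time, as expected for amplicons that each cover roughly half of the genome.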
Sequence trimming and assembly were performed using Sequencher (v5.4.5) and BioEdit (v7.2.5.0), and sequences with ≥5% ambiguous bases were excluded from further analysis. Sequences were aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/), and subtypes were assigned using the COMET online tool. All reference sequences (A-D, F-H, J, K, O, CRF01_AE, and all CRFs associated with B/C, 01/BC, and 01/B recombination) were downloaded from the Los Alamos HIV Sequence Database (http://www.hiv.lanl.gov/content/index). FastTree (v2.1.10) was used to construct a maximum likelihood (ML) tree of all sequences with the best nucleotide substitution model, GTR + CAT, SPR = 4. Statistical support for branches was assessed using local support values (Shimodaira-Hasegawa-like test). The resulting phylogenetic trees were visualized and annotated with Interactive Tree of Life (iTOL, v6.0) (https://itol.embl.de). 9 To confirm the recombination patterns of the URFs, recombination breakpoints were initially determined using the online tools jpHMM-HIV (http://jphmm.gobics.de/submission_hiv.html) and RIP 3.0 (http://www.hiv.lanl.gov/content/sequence/RIP/RIP.html). Finally, the recombination patterns were visualized using the Recombinant HIV-1 Drawing Tool (https://www.hiv.lanl.gov/content/sequence/DRAW_CRF/recom_mapper.html).
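The ≥5% ambiguous-base exclusion criterion can be illustrated with a short script. This is a minimal sketch and not the procedure used in the study, where trimming and assembly were done in Sequencher and BioEdit. The function names are hypothetical, and ignoring alignment gaps when computing the fraction is an assumption.

```python
# Characters that stand for exactly one nucleotide.
UNAMBIGUOUS = set("ACGTU")
# Assumption: gap characters are ignored rather than counted as ambiguous.
GAP_CHARS = set("-.")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of non-gap positions that are not A, C, G, T or U."""
    bases = [b for b in seq.upper() if b not in GAP_CHARS]
    if not bases:
        return 1.0  # an empty sequence is treated as unusable
    ambiguous = sum(1 for b in bases if b not in UNAMBIGUOUS)
    return ambiguous / len(bases)


def filter_sequences(records: dict, max_fraction: float = 0.05) -> dict:
    """Keep records whose ambiguous fraction is strictly below max_fraction."""
    return {
        name: seq
        for name, seq in records.items()
        if ambiguous_fraction(seq) < max_fraction
    }


if __name__ == "__main__":
    demo = {
        "clean": "ACGT" * 25,            # no ambiguous bases
        "edge": "A" * 95 + "N" * 5,      # exactly 5%, excluded
        "noisy": "ACGT" * 20 + "N" * 20,
    }
    print(sorted(filter_sequences(demo)))
```

The strict comparison mirrors the wording of the criterion: a sequence with exactly 5% ambiguous positions is dropped.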
NFLG phylogenetic tree analysis showed that the HB010404 and HB010942 sequences each formed a unique monophyletic cluster independent of other subtypes and CRFs, suggesting that they represent novel recombinant forms (Fig. 1). Further analysis of these two sequences using the Recombinant Identification Program (RIP) and jpHMM revealed that both have CRF01_AE as the backbone: HB010404 carries one CRF79_0107 fragment inserted into CRF01_AE, and HB010942 carries two CRF07_BC fragments inserted into CRF01_AE. The mosaic structures of the two sequences are as follows. HB010404 (Fig. 2A): I, CRF01_AE (HXB2, 491–6,029 nt); II, CRF79_0107 (HXB2, 6,030–8,531 nt); III, CRF01_AE (HXB2, 8,532–9,662 nt). HB010942 (Fig. 2B): I, CRF01_AE (HXB2, 633–4,232 nt); II, CRF07_BC (HXB2, 4,233–4,515 nt); III, CRF01_AE (HXB2, 4,516–6,563 nt); IV, CRF07_BC (HXB2, 6,564–7,692 nt); V, CRF01_AE (HXB2, 7,693–9,616 nt).
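The mosaic structures listed above can be encoded as lists of segments to check that the breakpoints tile each genome without gaps or overlaps and to total the span contributed by each parental strain. The sketch below is illustrative: the data structure and function names are hypothetical, and the spans are HXB2-numbered positions, which need not equal the number of bases in the sample sequence when insertions or deletions are present.

```python
from collections import defaultdict

# (start, end, parent) in HXB2 coordinates, inclusive, as listed in the text.
MOSAICS = {
    "HB010404": [
        (491, 6029, "CRF01_AE"),
        (6030, 8531, "CRF79_0107"),
        (8532, 9662, "CRF01_AE"),
    ],
    "HB010942": [
        (633, 4232, "CRF01_AE"),
        (4233, 4515, "CRF07_BC"),
        (4516, 6563, "CRF01_AE"),
        (6564, 7692, "CRF07_BC"),
        (7693, 9616, "CRF01_AE"),
    ],
}


def is_contiguous(segments):
    """True if every segment starts one position after the previous one ends."""
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(segments, segments[1:]))


def span_by_parent(segments):
    """Total number of HXB2 positions assigned to each parental strain."""
    totals = defaultdict(int)
    for start, end, parent in segments:
        totals[parent] += end - start + 1
    return dict(totals)


if __name__ == "__main__":
    for name, segments in MOSAICS.items():
        spans = span_by_parent(segments)
        print(name, is_contiguous(segments), spans, sum(spans.values()))
```

With these coordinates, the segments of HB010404 and HB010942 span 9,172 and 8,984 positions, respectively, with CRF01_AE accounting for 6,670 and 7,572 of them.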

Phylogenetic tree analysis. An NFLG phylogenetic tree of HB010404 (9,172 bp, red background) and HB010942 (8,984 bp, red background) was constructed. All reference strains were retrieved from the Los Alamos National Laboratory HIV Sequence Database and are shown on different background colors by subtype. Bootstrap tests with 1,000 replicates assessed the stability of each node; values below 0.75 are not labeled, and values between 0.75 and 1.0 are shown as blue circles of five sizes.

Genomic maps of HB010404
Subregion tree analysis revealed that the CRF01_AE regions of HB010404 (I, II) and HB010942 (I) originated from CRF01_AE cluster 5, whereas those of HB010942 (III, V) originated from CRF01_AE cluster 4 (Fig. 3). HB010404 is the first recombinant of CRF01_AE and CRF79_0107 to be identified. The CRF07_BC regions of HB010942 (II, IV) were both derived from CRF07_BC-N (Fig. 3B).

Subregions phylogenetic analysis of HB010404
In conclusion, two novel URFs were identified in this study, both detected within the student MSM population in Shijiazhuang, Hebei Province, China. The backbone of both URFs is CRF01_AE, one of the predominant subtypes driving the HIV-1 epidemic in China. CRF01_AE exhibits significant genetic diversity, having diverged into multiple clusters along distinct evolutionary trajectories. 10 Among these clusters, CRF01_AE clusters 4 and 5 are widely distributed among MSM populations. Compared with those infected with cluster 5, individuals infected with cluster 4 exhibit lower CD4+ T cell counts and experience prolonged immune recovery. 11 HB010942 is a recombinant of CRF01_AE cluster 5, CRF07_BC-N, and CRF01_AE cluster 4. CRF07_BC-N, a cluster derived from CRF07_BC-O, is predominantly reported among MSM populations. In China, CRF07_BC-N demonstrates a higher transmission-specific rate and a greater risk of transmission compared to CRF07_BC-O, with its prevalence concentrated in economically developed provinces. 12 HB010404, in contrast, is a recombinant of CRF01_AE cluster 5 and CRF79_0107. CRF79_0107, initially identified in Shanxi Province, represents the first reported recombinant form between CRF01_AE and CRF07_BC in China. Its discoverers speculated that it circulates primarily within MSM populations. 8 The emergence of this URF suggests frequent cross-regional transmission of HIV-1, particularly among MSM populations. The increasing generation of recombinant forms adds to the genomic complexity of HIV-1 in China, complicating prevention and control efforts that rely on genotype-specific strategies. Therefore, targeted public health initiatives, including enhanced education and outreach programs, should focus on MSM and student populations to implement effective, tailored preventive measures aimed at curbing the further spread of HIV-1.
With the ongoing global spread of HIV, an increasing number of recombinant viruses are being identified. In China, CRF01_AE and the new CRFs play an increasing role in the HIV pandemic. 13–15 Most URFs have been identified in the MSM population, and the number of new infections among students is increasing every year, suggesting that MSM and students, as high-risk groups for HIV-1 transmission, should be the focus of HIV-1 prevention resources. Implementing targeted education, improving access to HIV prevention services, and addressing risky lifestyle factors in high-risk groups will contribute to reducing HIV-1 prevalence.
Sequence Data
The nucleotide sequences of HB010404 and HB010942 were deposited in the NCBI GenBank with the accession numbers PQ585802 and PQ585803, respectively.
Footnotes
Authors’ Contributions
K.S.: Article writing, experimental operation, data analysis; Y.F.: Experimental operation, data analysis, article correction; C.W.: RNA extraction; J.H.: Provide HIV reference sequence; Y.L.: Article correction; L.J.: Sample information collection and sorting; B.Z.: Sample storage and transport; X.W.: Conducting HIV confirmatory tests; J.L.: Article correction, laboratory procedure; Z.L.: RNA extraction; E.D.: Provide experimental samples; H.L.: Provide experimental conditions, experimental design; L.L.: Provide experimental conditions; H.Y.: Article correction, experimental design.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This study was supported by the National Key Research and Development Program of China (grant numbers 2022YFC2304403, 2022YFC2305202, and 2022YFC2304903), the National Natural Science Foundation of China (grant number 82173583), and the Science and Technology Program of Shijiazhuang (231200103A).
