Abstract
The tight bottleneck during HIV-1 transmission generally results in only a single virus variant being transmitted. Investigation of the HIV-1 envelope glycoprotein (Env) can identify vulnerabilities of transmitting viruses that can be targeted by vaccines designed to elicit protection against global HIV-1. This study generated an HIV-1 subtype C consensus transmitted and early founder virus Env (EnvFVC) after detailed sequence analysis of 1,894 env genes obtained from 80 acutely infected individuals from South Africa, Malawi, and Zambia. The inferred EnvFVC sequence incorporates characteristics of transmitted and early founder viruses and results in the expression of a functional and conformationally intact Env. Overall, the “subtype-based” or “region-based” EnvFVC described here can be used in the development of a useful immunogen for novel vaccine design.
It is unknown whether transmitted viruses carry an advantage over variants that fail to transmit. Sequence analysis has shown that transmitted variants tend to carry Env with shorter, more compact variable loops and fewer glycosylation sites than viruses present during later stages of chronic infection. The more compact Env structures of founder virus variants are thought to increase fitness in a transmission setting through greater exposure of the CD4 binding site, although this also increases Env sensitivity to neutralizing antibodies. 2,3 As infection progresses, escape from neutralizing antibodies becomes the main selective force driving sequence diversity and glycosylation in the Env variable loops. 8 Founder virus populations also have an increased affinity for the α4β7 integrin, which is proposed to enhance the transmission fitness of these variants through improved cell-to-cell transmission or specific infection of lymphocytes that home to gut-associated lymphoid tissue (GALT). 9,10
A greater understanding of transmitted viruses may help identify Env vulnerabilities that can be targeted by vaccines designed to elicit protective antibodies against clinically relevant HIV-1. In addition, the extreme genetic bottleneck selects for an HIV-1 genome that encodes more relevant sequences for vaccine design than viruses that are the basis for current vaccines. HIV-1 subtype C has emerged as the most prevalent strain worldwide, and may be associated with a greater propensity for transmission than other HIV-1 group M subtypes. 11 Thus, this study aimed to evaluate all the available HIV-1 subtype C founder virus Env sequences and to generate and characterize a transmitted/early founder virus consensus Env sequence for incorporation into a vaccine immunogen construct.
Previously published sequence data, obtained by single genome amplification of HIV-1 Env sequences from patients acutely infected with HIV-1 subtype C, were used to generate an HIV-1 subtype C founder virus Env sequence. The patient cohort comprised 80 individuals from South Africa (40), 1 Malawi (29), 1 and Zambia (11), 4 all acutely infected with HIV-1 subtype C following heterosexual transmission and staged according to the Fiebig classification system. Twenty-four individuals were in Fiebig stage I/II (5–10 days postinfection), 20 were in stage III/IV (13–19 days postinfection), and 36 were in the latter stages of acute infection, Fiebig stages V/VI (88 days onward postinfection). Overall, in 70% of the patients across these three cohorts, productive infection resulted from a single transmitted viral variant. In the remaining 30% of patients, productive infection resulted from simultaneous transmission of multiple variants (mean 2.4, range 2–5).
Of the published 1,927 env sequences isolated from these patients, 1,894 sequences were available and were downloaded as fasta files from the Los Alamos National Laboratory HIV sequence database (
The individual env sequences for each patient were initially aligned using Clustal X 2.0.11 (
Overall, an HIV-1 subtype C founder virus consensus Env sequence was successfully generated after detailed analysis of 1,894 env sequences derived from the 80 acutely infected patients (Fig. 1). Phylogenetic tree analysis revealed that EnvFVC was more closely related to the inferred ancestral and consensus C Env sequences than to those from circulating viruses (results not shown). The amino acid length of the variable loops and the number of potential N-linked glycosylation sites (PNGS) were calculated. Compared with the averages found in early transmitted subtype C viruses, the EnvFVC sequence was generated with shorter V1/V2, V4, and V5 loop sequences, one fewer PNGS in the V1/V2 loop, and additional PNGS in the C2 region and gp41 ectodomain. These differences did not exclude any relevant known functional epitopes. Importantly, the EnvFVC sequence contained the relevant residues for CD4 binding, including loop D of constant region 2 (C2) (all numbering according to HXBc2: residues 275–283), the CD4 binding loop in the C3 region (residues 364–374), the bridging sheet (β20/21 hairpin) (residues 425–430) and β23 (residues 455–461) in the C4 region, and the β24-α5 connection in the C5 region (residues 469–475; Fig. 1, red boxes with critical residues shaded red). Residues within these regions form the epitopes of CD4bs-directed broadly neutralizing antibodies such as IgG1b12, VRC01, PGV04, and 3BNC117. 12–15 Glycan-dependent epitopes such as those required for the monoclonal antibodies PG9/PG16, 16 CH01-04, 17 and PGT141-145 18 (residues N156 and N160) and the PGT family (PGT 125–128, 130, and 131 18) of bNAbs (residues N301 and N332) were present (Fig. 1, green boxes). Interestingly, if PNGS at positions 160 and 332 are not present during primary HIV-1 infection, they are subsequently introduced as a direct response to neutralizing antibody-mediated pressure. The relevance of PNGS at positions 160 and 332 in the context of immunogen design should be elucidated further. In addition, the linear epitopes of the α4β7 integrin binding site (LDI/V; residues 179–181) and of the MPER-directed bNAb 4E10 (NWFDIT; residues 671–677) were intact (Fig. 1, black background, with pink or yellow text, respectively).
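The PNGS counts discussed above rest on the standard N-linked glycosylation sequon, N-X-S/T, where X is any residue except proline. The sketch below shows one way such sites could be counted in a protein sequence. It is an illustration only and not the tool used in this study: the function name `find_pngs` and the toy fragment are hypothetical, and the positions it reports are relative to the ungapped input sequence, not HXBc2 numbering.

```python
import re

# N-X-[S/T] sequon, where X is any residue except proline.
# The zero-width lookahead lets overlapping sequons (e.g. "NNTT") each be counted.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_pngs(aligned_seq):
    """Return 1-based positions of the Asn of each sequon.

    Alignment gaps ("-") are removed first, so positions refer to the
    ungapped sequence (NOT to HXBc2 numbering).
    """
    ungapped = aligned_seq.replace("-", "").upper()
    return [m.start() + 1 for m in SEQUON.finditer(ungapped)]


if __name__ == "__main__":
    # Hypothetical toy fragment for illustration; not taken from EnvFVC.
    toy = "CTNVSN-ATNPSKNNTT"
    sites = find_pngs(toy)
    print("PNGS positions:", sites)
    print("PNGS count:", len(sites))
```

Counting sites per region (for example within V1/V2) would additionally require region boundaries mapped onto the alignment, which this sketch does not attempt.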

FIG. 1. Alignment of the envelope glycoprotein (Env) amino acid consensus sequences derived from HIV-1 subtype C acutely infected individuals from Malawi (MW_consen), South Africa (ZA_consen), and Zambia (ZM_consen) used to infer a subtype C founder virus Env consensus (Founder_Vi) sequence. The variable loop structures, signal peptide, and Env precursor sites are indicated above the sequence alignment. Potential N-linked glycosylation sites (PNGS) are shaded in blue, α4β7 integrin binding site tripeptide motifs (LDI/V) are indicated in pink, the tetrapeptide crown of the V3 loop structure (GPGQ/R) is shown in green, and the linear epitopes of the two broadly neutralizing antibodies (bNAbs) targeting the gp41 MPER region, 2F5 (ELDKWA) and 4E10 (NWFDIT), are shown with a black background and are highlighted green and yellow, respectively. Regions boxed in red contain the residues important for CD4 binding and the epitopes of CD4 binding site-directed bNAbs; critical residues are shaded red. Glycan-dependent epitopes targeted by bNAbs PG9/PG16 and PGT family antibodies are boxed in green. The (•) indicates identity to the consensus sequence and (-) indicates insertions/deletions at a particular position.
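To make the alignment notation in the caption concrete, the sketch below derives a consensus from a few aligned sequences and prints each row in dot-identity form. It assumes a simple per-column majority rule, which may differ from the procedure actually used to infer EnvFVC; the function names `majority_consensus` and `dot_row`, the `_toy` labels, and the short sequences are all hypothetical and serve only as an illustration.

```python
from collections import Counter


def majority_consensus(aligned):
    """Per-column majority character across equal-length aligned sequences.

    Assumption: simple majority rule. Ties go to the character seen first
    in the column (the ordering Counter.most_common gives equal counts).
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def dot_row(consensus, seq):
    """Show identity to the consensus as a dot; keep gaps and differences."""
    return "".join(
        "•" if (a == b and b != "-") else b for a, b in zip(consensus, seq)
    )


if __name__ == "__main__":
    # Hypothetical toy rows; not the real country consensus sequences.
    rows = {
        "MW_toy": "MRVKG-QRN",
        "ZA_toy": "MRVRGIQRN",
        "ZM_toy": "MRVKGIQKN",
    }
    cons = majority_consensus(list(rows.values()))
    print("consensus", cons)
    for name, seq in rows.items():
        print(name.ljust(9), dot_row(cons, seq))
```

With only three input rows a single sequence can decide a column, so any real application would need an explicit policy for ties and gap-dominated columns.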
It is widely accepted that, during transmission, there is also strong selection for viral variants that use the CCR5 coreceptor. 19 As expected, the EnvFVC V3 loop was predicted to use CCR5 for cell entry, using both the C-PSSMsinsi (
Lastly, the native HIV-1 subtype C consensus founder virus gp160 nucleotide sequence (normal HIV codon use) was generated at GeneArt (Life Technologies), cloned into the mammalian expression vector pcDNA3.1 (Invitrogen, Life Technologies), and used to create an EnvFVC pseudotyped virus as per standard protocols. In vitro phenotypic inhibition assays using 200 TCID50 of the EnvFVC pseudovirus revealed an IC50 of 0.05 μg/ml for the entry inhibitor enfuvirtide, well within previously reported ranges. 20 Thus, the generation of an infectious pseudovirus indicates that the inferred HIV-1 subtype C consensus founder virus Env sequence encodes a functional EnvFVC. Codon-optimized (humanized) gp120 and gp140 constructs were also generated at GeneArt (Life Technologies) and successfully used to express and purify functional and conformationally intact EnvFVC monomers and trimers, respectively.
Overall, our analysis confirmed that an HIV-1 subtype C consensus founder virus Env sequence with characteristics of transmitted and early founder viruses can be used to express a functional and conformationally intact Env. Previous research has shown that using a consensus Env gene approach improves the breadth of antibody and cell-mediated responses, although the level of improvement is still somewhat limited. 21 –23 The “subtype-based” or “region-based” EnvFVC described here can be used in the development of a useful immunogen for novel vaccine design.
Sequence Data
The full length HIV-1 subtype C founder virus consensus gp160 nucleotide sequence was submitted to GenBank using Sequin V13.05 (
Acknowledgments
Research funding from the South African HIV/AIDS Research and Innovation Platform (SHARP) is gratefully acknowledged.
Author Disclosure Statement
No competing financial interests exist.
