Abstract
The viral infectivity factor (Vif) is an HIV accessory protein that counteracts host antiviral proteins of the APOBEC3 family. Accumulating evidence highlights the pivotal role that accessory HIV proteins play in disease pathogenesis, a fact that has made them targets of interest for novel therapeutic and preventive strategies. Little is known about Vif sequence diversity outside of African or white populations. Mexico is home to the Americas' third largest HIV-affected population and Mexican Hispanics represent an ever-increasing U.S. minority. This study provides a detailed analysis of the diversity seen in 77 Mexican Vif protein sequences. Phylogenetic analysis shows that most sequences cluster with HIV-1 subtype B, while less than 10% exhibit greater similarity to subtypes D and A. Although most functional motifs are conserved among the Mexican sequences, substantial diversity was seen in some APOBEC binding sites, the nuclear localization inhibitory signal, and the CBFβ interaction sites.
Introduction
The viral infectivity factor (Vif) is a 192-amino acid (23-kDa) HIV accessory protein essential for viral replication. The HIV-1 Vif protein counteracts host antiviral proteins of the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC3 or A3) family. APOBEC3 family members are zinc-dependent deaminases with the unique ability to mutate cytidine to uridine in both DNA and RNA molecules, an activity known as nucleic acid editing. The APOBEC3 family has seven members grouped into two different classes: those bearing a single Zn binding domain (APOBEC3A, APOBEC3C, and A3H) and those having two such domains (APOBEC3B, APOBEC3G, APOBEC3D, and APOBEC3F). These proteins are known for interfering with the replication and propagation of HIV, hepatitis C virus, hepatitis B virus, and retrotransposons. 2
APOBEC3G was discovered in 2002 in the search for host cell suppressors of the HIV-1 accessory Vif protein and is by far the best characterized member of this family. 3 APOBEC3G forms stable complexes with the viral core, which are then encapsidated into budding virions. 4 APOBEC3G hypermutates minus-strand HIV DNA, fixing G to A mutations on plus-strand synthesis during the second round of viral replication and leading to the production of nonfunctional virions. Recent findings support the notion that APOBEC3G and APOBEC3F may exert additional antiviral activity by mechanisms other than cytidine deamination such as interfering with plus-strand transfer, reverse transcription, viral DNA synthesis initiation and priming, viral DNA elongation, and proviral integration. 4 Vif associates APOBEC3G with an elonginB (EloB)-elonginC (EloC)-Cullin5 (Cul5) E3 ligase complex and induces its proteasomal degradation through ubiquitination. 4 Vif also interferes with APOBEC3G translation, directly blocks APOBEC3G incorporation into budding virions, and blocks APOBEC3G catalytic activity. 5
The multifunctional Vif protein allows HIV to evade host innate mechanisms that would otherwise protect cells from exogenous viruses and endogenous mobile retroelements. This accessory viral protein has recently become a candidate target for both therapeutic and preventive interventions in HIV disease pathogenesis. Nevertheless, little is known about the genetic diversity of the Vif encoding region and of its structural and biological relevance, particularly among HIV strains circulating in less studied regions of the world such as Latin America. In this study we describe the molecular analysis of 77 HIV-1 Vif sequences derived from 68 Mexican mestizo isolates from San Luis Potosí, Mexico. Eight patients were sampled on two (MX071, 110, 207, 381, 385, and 399) or three (MX350 and 384) different occasions (each more than 3 months apart) in order to assess the degree of quasispecies diversification. HIV-1 RNA and proviral DNA were extracted from peripheral blood obtained from antiretroviral therapy (ARV)-naive HIV-1-infected individuals referred to the state's public HIV/AIDS clinic “Centro Ambulatorio de Prevención y Atención en SIDA e ITS” from 2009 to 2014 following informed consent. Ethics approval for the study was granted by the corresponding Institutional Review Boards (Facultad de Medicina UASLP and the state's public health authority “Servicios de Salud del Estado de San Luis Potosí”).
Proviral DNA sequences encoding the Vif protein were amplified using a nested-PCR approach. First round PCR generated a 1,430-bp fragment using primers VIF-FO (5′-TAA-RAG-AAR-AGG-GGG-GAT-TG-3′) and VIF-RO (5′-TCA-TTK-CCA-CTR-TCT-GAT-TG-3′) bordering the Vif/Vpr encoding region (spanning HxB2 positions 4792 to 6222). Second round PCR used previously published primers to amplify a 764-bp fragment encompassing the Vif encoding region (spanning HxB2 positions 4957 to 5721). 6 Final PCR component concentrations included 1.5 mM MgCl2, 200 μM dNTPs, 0.04 IU/μl Taq DNA Polymerase (Vivantis Technologies Sdn. Bhd. Malaysia), 800 nM/400 nM oligonucleotide primers (first and second round PCR, respectively), 200 ng of DNA (in the first PCR), and 2 μl of a 1:5 dilution of the first PCR product (in the second PCR) in 12.5 μl final volumes. Thermal cycling consisted of an initial denaturing step at 94°C for 2 min followed by 30 cycles of 94°C for 20 s, 56°C for 30 s, and 72°C for 30 s followed by a final extension step at 72°C for 5 min for both programs.
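The outer primers contain IUPAC ambiguity codes (R and K), so each one is a pool of several concrete oligonucleotides. The following minimal sketch, which is an illustration and not part of the original protocol, enumerates that pool using the standard IUPAC definitions. The function name expand_primer is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer.

    Hyphens (used in the text only to group bases in triplets) are ignored.
    """
    seq = primer.replace("-", "").upper()
    choices = [IUPAC[base] for base in seq]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as printed in the text (5' to 3').
VIF_FO = "TAA-RAG-AAR-AGG-GGG-GAT-TG"
VIF_RO = "TCA-TTK-CCA-CTR-TCT-GAT-TG"

for name, primer in (("VIF-FO", VIF_FO), ("VIF-RO", VIF_RO)):
    variants = expand_primer(primer)
    print(name, "encodes", len(variants), "sequences")
    for variant in variants:
        print("  ", variant)
```

Each primer has two degenerate positions with two options each, so each expands to four 20-nt sequences.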
Direct two-strand sequencing of PCR products was carried out at the Laboratorio Nacional de Biotecnología Agrícola, Médica y Ambiental of the Instituto Potosíno de Investigación Científica y Tecnológica (LANBAMA-IPICYT) using the inner primers. Raw nucleotide sequences for 77 of our local samples were manually trimmed down to 579 bp corresponding to the Vif-encoding region (spanning HxB2 positions 5041 to 5619). Vif-encoding consensus nucleotide sequences for group M (subtypes A, B, C, D, F1, F2, G, and H) and group O (as an outlier) were retrieved from the Los Alamos National Laboratory HIV database (
Consensus amino acid frequency and Shannon entropy levels were calculated for each site as a measure of variation in the Mexican Vif protein sequence alignments (
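As an illustration of the per-site measures described here, the sketch below computes the most frequent residue, its frequency, and the Shannon entropy for each column of a protein alignment. The natural logarithm and the treatment of gap characters as ordinary symbols are assumptions, since the text does not state which conventions were used. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_stats(column):
    """Return (consensus residue, its frequency, Shannon entropy) for one column."""
    counts = Counter(column)
    total = len(column)
    residue, top = counts.most_common(1)[0]
    # H = sum over residues of p * ln(1 / p); natural log is an assumption.
    entropy = sum((n / total) * math.log(total / n) for n in counts.values())
    return residue, top / total, entropy


def alignment_stats(sequences):
    """Apply column_stats to every column of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_stats(col) for col in zip(*sequences)]


# Toy alignment for illustration only; not data from the study.
toy = ["MEN", "MEK", "MDK", "MEK"]
for site, (residue, freq, entropy) in enumerate(alignment_stats(toy), start=1):
    print(f"site {site}: {residue} freq={freq:.2f} entropy={entropy:.3f}")
```

A fully conserved column has an entropy of 0, and the value rises as more residues share the site more evenly.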
Sequence analysis revealed complete open reading frames with no frameshift mutations suggesting the presence of functional Vif genes in all but two sequences (go to
Sequence phylogeny was reconstructed using a Markov chain Monte Carlo algorithm-based Bayesian analysis suite (BEAUti and BEAST) employing a general time reversible substitution model and strict clock. 7 The 10 million sample trees produced were simplified using a 10% burn-in to generate a single maximum clade credibility tree (see Fig. 1). This phylogenetic tree shows that most of the Mexican samples cluster with the subtype B consensus. Sequences derived from four patients clustered more closely with the subtype D consensus sequence (MX141, 396, 395, and 352). In addition, the three sequences obtained on different dates for patient MX384 (accession numbers KP874176–KP874178) are highly related to the CRF18_cpx circulating recombinant form reported to be circulating in Cuba. This patient had spent 20 years living in the northeastern United States before being diagnosed as HIV+ in Mexico. With only the Vif gene sequence it is not possible to confirm that this patient is infected with CRF18_cpx virus, but this seems the most likely explanation. 8 Sequences clustering with the subtype D consensus possessed polymorphisms uncommon in other patient samples such as GGA60-GAA61, TTG109, and CAA157.

Fig. 1. Mexican Vif encoding nucleotide sequence phylogeny. Maximum clade credibility phylogeny of 77 Mexican Vif encoding nucleotide sequences and nine consensus sequences. A colored version is publicly available for download from
Nucleotide sequences were translated into their corresponding protein sequences, aligned, and reformatted to HxB2 numbering for further analysis. A protein sequence alignment composed of the 77 Mexican HIV sequences, the HxB2 reference sequence, and three consensus sequences is available for download from
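The translation step, together with a check for internal stop codons, can be sketched as follows. This is an illustrative sketch using the standard genetic code and does not reproduce the tool actually used in the study. The function names and the short test sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def premature_stop(protein):
    """Return the 1-based position of the first internal stop codon, or None."""
    position = protein.find("*")
    if position == -1 or position == len(protein) - 1:
        return None  # no stop, or only the terminal stop
    return position + 1


# Toy fragments for illustration only; not sequences from the study.
intact = translate("ATGGAAAACAGATGGCAGGTGTAG")
truncated = translate("ATGGAATAGAGATGG")
print(intact, premature_stop(intact))
print(truncated, premature_stop(truncated))
```

A 579-bp open reading frame yields 193 codons, that is, 192 amino acids plus the terminal stop, which matches the Vif length given in the introduction.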

Fig. 2. Mexican Vif protein sequence alignment summary. Protein domains, functional regions, and sites are shaded. These include tryptophans (W) involved in APOBEC3G binding, APOBEC3-protein family binding domains, the nuclear localization inhibitory signal, the MAPK phosphorylation sites, CBFβ interaction sites, the Cullin5 zinc binding domain, the Elongin B/C (BC) box, the protease processing site, known phosphorylation sites, the Cullin5-box, and the dimerization site.
As mentioned previously, the MX101 and MX464 sequences had a premature stop codon at positions 93 and 38, respectively. Both truncated proteins lack the central zinc binding region along with the C-terminal region, suggesting the presence of a nonfunctional Vif. Integrated proviruses bearing inactivating stop codons can be “rescued” through dual infection by other HIV quasispecies through complementation of function. In addition, reverse-transcriptase template switching recombination is another means by which nonfunctional viral genomes may be “reactivated” or continue to play an important evolutionary role as integrated viral genomes. 9
The N-terminal tryptophan stretch known for its important role in APOBEC3G/F binding (W5, W11, W21, W38, and W79, shown as empty boxes in Fig. 2) was conserved in all sequences except two. In MX071_2, W11 was replaced by an arginine residue, while in MX186, W21 was replaced by a cysteine. Different tryptophans mediate selective Vif binding and suppression of APOBEC3 family members. 10,11 Despite the lack of functional data, it is possible that these substitutions might interfere with APOBEC3G binding as they involve a change from a nonpolar residue to a basic and a polar residue, respectively. With the exception of 171EDRWN175, the APOBEC3G binding motifs are found in the N-terminal region of the Vif protein.
Different motifs are used selectively by Vif to bind the different APOBEC3 family members. 9 The first motif (14DRMR17), used for binding APOBEC3C/D/F, had limited conservative substitutions in the Mexican samples. This was also the case for the fourth (40YRHHY44, which binds APOBEC3G), fifth (69YWxL72, which binds APOBEC3F/G), sixth (74TGERxW79, which binds APOBEC3F), and eighth (171EDRWN175, which binds APOBEC3F) motifs, which also exhibited limited polymorphism and mostly conservative substitutions. However, for the second (21WKSLVK26, which binds APOBEC3G), third (39F,H48, which bind A3H), and seventh (81LGQGVSIEW89, which binds APOBEC3F/G) motifs, a substantial number of sequences had nonconservative substitutions, which result in physicochemical property changes.
Examples of such substitutions are K22N present in 37.7% of the Mexican sequences (second motif), and F39S in 20.8%, F39C in 7.8%, H48N in 42.9% (third motif), and Q83H in 13% (seventh motif) of the Mexican sequences. In all, nonconservative amino acid substitutions resulting in physicochemical changes within these motifs were observed in 27 (35%) of our sequences. Only seven sequences (9.1%) had nonconservative substitutions in more than one of these APOBEC3 binding motifs (MX033, 089, 158, 168, 395, 399, and 542). The second and third APOBEC3 binding sites were most affected by nonconservative substitutions. Contrary to the APOBEC3D/F/G binding sites, Vif amino acids that bind A3H are known for being less conserved among HIV subtypes given the low reported frequency of active A3H alleles.
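The distinction between conservative and nonconservative substitutions can be made concrete with a simple residue classification. The sketch below assumes a common four-class scheme (nonpolar, polar uncharged, basic, acidic); the text does not state the exact scheme used, so the grouping and the function name classify are illustrative assumptions.

```python
import re

# Assumed physicochemical classes; one common textbook grouping.
CLASSES = {
    "nonpolar": "GAVLIMFWP",
    "polar": "STCYNQ",
    "basic": "KRH",
    "acidic": "DE",
}
RESIDUE_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def classify(substitution):
    """Classify a substitution written as <old residue><position><new residue>."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"cannot parse substitution: {substitution!r}")
    old_class = RESIDUE_CLASS[match.group(1)]
    new_class = RESIDUE_CLASS[match.group(3)]
    return {
        "position": int(match.group(2)),
        "from": old_class,
        "to": new_class,
        "conservative": old_class == new_class,
    }


# Substitutions named in the text.
for sub in ("K22N", "F39S", "F39C", "H48N", "Q83H", "L163F"):
    result = classify(sub)
    label = "conservative" if result["conservative"] else "nonconservative"
    print(f"{sub}: {result['from']} -> {result['to']} ({label})")
```

Under this scheme the five motif substitutions listed above change class, whereas L163F stays within the nonpolar class.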
Amino acids 63–70 and 86–89 form β-strands that are required to maintain stable Vif expression levels and virus infectivity. 10,12 Interestingly, the first β-strand region exhibited great polymorphism and had substitutions in positions R63 and T67 that were mainly nonconservative in nature (25% and 42.8%, respectively). Both E88 and W89, present in the second β-strand region and near the charged central hydrophilic nuclear localization inhibitory signal (NLIS), were well conserved in all Mexican sequences. Hydrophobic interactions allow the Core Binding Factor β (CBFβ) to be bound by the Vif α/β domain, the remaining unburied surface of the Vif α/β domain being responsible for binding APOBEC3 through electrostatic interactions. 9 CBFβ is the regulatory non-DNA binding chain, which forms heterodimeric complexes with the DNA binding CBFα chain to act as a transcription factor. The N-terminal CBFβ interaction site (84GVSIEW89) of most Mexican isolates was highly conserved, with most substitutions being conservative. The C-terminal CBFβ interaction site (102LADQLI107) was less conserved, with most substitutions being nonconservative in nature.
The nuclear localization inhibition signal 90RKKR93 present in the HxB2 (subtype B) reference is composed entirely of basic amino acids. 10 Although only five of our sequences (6.5%) had identical RKKR signals, the great majority of those having amino acid substitutions (n = 51, 66.2%) conserved its physicochemical properties. The remaining 25 sequences (32.5%) had nonconservative amino acid substitutions resulting in a change of physicochemical properties from basic to acidic. Most of these changes involved glutamic acid as a substituent, while only two sequences had an aspartic acid substituent (MX066 and 474). Most of these 25 nonconservative substitutions occurred in the second and third positions of the signal sequence (amino acids 91 and 92); only one occurred in the last amino acid position (amino acid 93) and none in the first position (amino acid 90). This signal sequence ensures cytoplasmic localization of the Vif protein through interference with the nuclear import pathway, possibly by binding importins.
It has been suggested that the arginine-rich NLIS present in subtype C sequences allows nucleic acid interaction and abrogation of APOBEC3 binding. 10 Interestingly, nearly 52% of our sequences (n = 40) have at least one additional arginine residue in this region. Further studies seeking to investigate the clinical and functional relevance of this finding in the Mexican population are warranted, as the NLIS is known for exhibiting geographic-dependent variations, even within group M subtypes.
Human casein kinase II (CKII) phosphorylates Vif residues 95ST96. T96 is also known to be phosphorylated by the mitogen-activated protein kinases 1 and 2 (MAPK or ERK2/1), which play an important role as extracellular signal-regulated kinases. 10 Approximately 44% of our sequences (n = 38) had nonconservative substitutions affecting phosphorylation susceptibility, all falling in position S95. Position T96 was conserved in all of the Mexican sequences, in agreement with the critical impact that mutations at this position have on Vif activity. Phosphorylation of viral proteins by cellular kinases regulates viral infectivity. As such, the substitution of nonphosphorylatable amino acids at the other known phosphorylation sites (S144, T155, S165, T170, and T188) has the potential to alter the corresponding signaling pathways. In Mexican HIV isolates, positions S144, T170, and T188 were conserved; these three sites are typical of subtype B sequences and are usually phosphorylated by serine/threonine protein kinases. However, nearly 24% (n = 18) of the sequences had a K155 instead of the group M prototypical T155. This substitution is seen in most subtype C sequences and BC circulating recombinant forms (CRF) and is not likely to be phosphorylated. 10
For Vif to exert its suppressive action it must bind APOBEC3 and mimic the human suppressor of cytokine signaling-2 (SOCS2) protein to subsequently act as the substrate binding subunit of a cullin RING ligase-5 (CRL5) E3 ligase complex. 9 The Zn binding H108C114C133H139 motif stabilizes a discrete Vif protein alpha-helical domain, which contains a cluster of hydrophobic conserved residues and enhances Cul5 binding directly (120IRxxL124). The four Zn-coordinating residues are highly conserved among group M and O subtypes as well as in our Mexican isolates. Whether the presence of nonconservative amino acid substitutions, which abrogate Zn binding properties in this region (in positions 114, 133, and 139), leads to Cul5 binding deficiencies and better clinical outcome remains unknown. Vif recognition and binding of APOBEC3 through Cul5 require the formation of a heterodimeric complex with EloB and EloC, which in turn stabilizes the complex and allows CBFβ recruitment.
The interaction between Vif and EloC is mediated by the 144SLQ(Y/F)LA149 motif present in the viral Elongin B/C-box. 9,13 This domain is perhaps the most critical Vif region determining APOBEC3 protein suppression. Previous reports have shown that the short side chain of Ala149 plays a crucial role in EloC binding. Accordingly, this residue was conserved in all Mexican isolates. Only two nonconservative substitutions occurring at two different positions (146QY147) of this box were observed in Mexican sequences, the remaining part of the domain being conserved. This suggests that most Mexican HIV isolates are capable of targeting APOBEC3G for proteasomal degradation via the EloC/B-Cul5-SOCS box E3 ubiquitin ligase pathway.
After infecting a susceptible cell, HIV-1 protease gradually processes full-length Vif proteins into shorter C-terminal fragments through the L150 processing site. 10 This site is highly conserved among all HIV group M subtypes as mutations of this site severely affect protease sensitivity and Vif function. HIV-1 infectivity and processing are regulated by the multimerization of Vif proteins through the proline-rich self-association domain located in the C-terminus (161PPLP164). 10,14 Vif dimerization through its C-terminus is thought to render Vif inactive, releasing Pr55 Gag, which is normally bound by this same region. 15 This mechanism possibly functions as a feedback loop system that allows Gag processing to continue after adequate APOBEC3G degradation. The PPLP sequence is highly conserved among the different group M subtypes and is equally conserved among our Mexican sequences. A single isolate possessed an amino acid substitution (L163F), which in theory should not interfere with Vif dimerization given its conservative (hydrophobic) nature.
Further analysis reveals that Mexican Vif protein sequences are subjected to selective pressures similar to other HIV-1 subtypes. Sliding window sequence analysis of Mexican Vif protein sequences reveals extremely low entropy levels (<0.2) in three important regions: (1) the first 18 N-terminal amino acid residues of the protein, (2) a small stretch of amino acids located in the fourth Structurally Conserved Region (SCR) spanning 52SSEVHIP58 (see Fig. 3), 16 and (3) the C-terminal APOBEC3F binding site (168KLTEDRW174). The N-terminal region of Vif proteins is known for being highly conserved among HIV subtypes; however, the functional relevance and interactions that take place at this site remain unknown (although crystallographic analysis suggests that it might be responsible for interactions with CBFβ).
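A sliding window analysis of this kind can be sketched as a moving average over the per-site entropy values, followed by merging of adjacent windows that fall below a threshold. The window size, the use of the window mean, and the merging rule are assumptions made for illustration; the text gives only the thresholds. The function names and the toy entropy profile are hypothetical.

```python
def sliding_mean(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def low_entropy_regions(entropy, window, threshold):
    """Return 1-based inclusive (start, end) site ranges whose windowed
    mean entropy is below `threshold`; overlapping or adjacent windows
    are merged into one region."""
    regions = []
    for i, mean in enumerate(sliding_mean(entropy, window)):
        if mean >= threshold:
            continue
        start, end = i + 1, i + window
        if regions and start <= regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Toy per-site entropy profile for illustration only; not study data.
profile = [0.0, 0.1, 0.05, 0.9, 0.8, 0.7, 0.1, 0.0, 0.1, 0.0]
print(low_entropy_regions(profile, window=3, threshold=0.2))
```

With the toy profile, a window of 3 and a threshold of 0.2, the call reports two conserved stretches, sites 1 to 3 and sites 7 to 10.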

Fig. 3. Mexican Vif protein sequence entropy and amino acid frequency map. The frequency of the most common amino acid observed at each site in the 77 Mexican HIV-1 Vif sequences is shown as a continuous line, while the Shannon entropy level at each site is shown as bars. Boxes indicate protein regions exhibiting low entropy levels.
Several SCRs have been described, and the conservative nature of most of them can be explained through the type of critical interactions they host (by binding APOBEC3 proteins, CBFβ, EloB/EloC) or the role they play (Vif dimerization motif). 16 However, this is not the case for 52SSEVHIP58, which is not known to host any particular interaction or play a functional role (other than, perhaps, stabilizing other interactions). Four additional regions of low entropy (<0.4 but higher than 0.2) were also observed in regions of known relevance: (1) that located between positions 68 and 76 responsible for binding several APOBEC3 members, (2) the CBFβ binding 86SIEWR90 motif, (3) the first half of the EloB/C binding box (141KVGSLQYLAL150), and (4) the Vif dimerization site located in 160KPPLPSV167. This same analysis shows that at least for the Mexican isolates, other regions of known functional importance are not as conserved or subjected to the same selective forces as those previously mentioned. This is the case of some APOBEC3 binding sites (22KSLVK26, 40YRHHY44, and that spanning 77R-V85), the NLIS (90RKRR93), the N-terminal Cul5 binding region (120IRNAIL125), and the C-terminal part of the EloB/C box (151ALITPKK158).
Analysis of viral quasispecies evolution in samples obtained from the same patient at different times (1 to 4 years apart) revealed low intraindividual variation (<7.5%) relative to the observed interindividual variation. These results are very similar to those published previously for a Ugandan population and reflect the pivotal role that conserved and functional Vif domains play in viral fitness.
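One simple way to express such variation is the percentage of differing sites between two aligned sequences, averaged over all pairs in a group. The text does not state how variation was measured, so this uncorrected pairwise distance is an assumption used only to illustrate the comparison. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b, gap="-"):
    """Percentage of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def mean_pairwise_difference(sequences):
    """Mean percent difference over all pairs in a group of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(percent_difference(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments for illustration only; not study data.
same_patient = ["MENRWQVMIV", "MENRWQVLIV"]
different_patients = ["MENRWQVMIV", "MEKRWQVLIA", "MENKWQVLIV"]
print(mean_pairwise_difference(same_patient))
print(mean_pairwise_difference(different_patients))
```

Applied to real data, the first call would be run per patient over that patient's serial samples and the second over sequences drawn from different patients.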
The Vif protein of HIV-1 protects the virus against antiviral activity and plays an important role in determining infectivity. The Vif protein and APOBEC3F/G cellular host interactions have been studied extensively. Knowledge of the global genetic diversity of HIV-1 accessory genes is important as Vif and other retroviral accessory genes are candidate targets for novel preventive and therapeutic anti-HIV strategies. Our analysis of 77 Mexican isolate-derived HIV-1 Vif sequences furthers our knowledge of the global diversity of accessory proteins and sheds light on geospecific features. The diversity of Mexican HIV-1 Vif sequences is greater than that described for other human populations. Whether this is the result of population-specific immune selective pressures or the migratory influences to which Mexico is subjected remains unknown. 10
Acknowledgments
This project was funded in part by the Mexican government National Science and Technology Council through grants CONACYT I0017(CB-2011-01) #167374, SSA/IMSS/ISSSTE-CONACYT (FONSALUD-2009-01) #115226, and SSA/IMSS/ISSSTE-CONACYT (FONSALUD-2008-01) #87340. The authors would like to thank the patients attended by CAPASITS San Luis Potosí for providing the samples and understanding the scope of this study. Very specially, the authors would like to thank all of the state health workers who went out of their way in terms of time, effort, and expense to make this study possible.
Author Disclosure Statement
The authors declare that they have no competing personal or financial interests to disclose.
