Abstract
HIV-1 subtype C is associated with more than half of infections in southern Brazil and has been increasing in other regions of the country. In a previous study carried out in northeastern Brazil, we found a prevalence of 4.1% of subtype C. This work investigates the origin of subtype C in the state of Bahia based on five new viral sequences. The phylogenetic analysis showed that subtype C viruses found in Bahia descend from the main lineage that circulates in other Brazilian regions.
HIV-1
HIV-1 subtype C (HIV-1C) is the dominant genotype in the southern region (>50%), and the prevalence rates in the population decrease in the northernmost regions: <20% in the Central West and <10% in the Southeast, North, and Northeast. 2,3 Although the genotypes B and BF are prevalent in the Northeast, the subtype C accounts for 5% of infections in this region 4 and some studies highlight that the increase in C strain over time is discreet. 2,3 This increase may be related to their biological properties, causing long periods without the onset of symptoms, with slow disease progression leading to greater chances of transmission. 2
The subtype C prevalence in Bahia state was 1.7% in 20115 and 5% in 2017. 4 Samples presented in this study were collected in 2019 from HIV-1-infected patients, followed up at Professor Edgard Santos University Hospital in Salvador (Bahia/Brazil). This study analyzed five new HIV-1C sequences (Table 1) from a population study (data not published) of the prevalence of HIV-1 subtypes in Bahia. These sequences were obtained as previously described. 6 One of them (011) is the first subtype C Near Full Length Genome (NFLG) sequenced in the state. For the other four samples, it was possible to amplify only partial genome fragments (Fig. 1).

Sequenced genomic regions of HIV-1C viral isolates found in Bahia state, Brazil. HIV-1C, HIV-1 subtype C.
Sociodemographic and Clinical Characteristics Data from HIV-1 Subtype C Individuals from Bahia
A, Alanine; D, Aspartic acid; DRM, drug resistance mutation; E, Glutamic acid; K, Lisine; N, asparagine; NA, not amplified; T, threonine; V, valine.
Two subtype C reference datasets were used to investigate the phylogenetic origin of the Bahia subtype C sequences, the first one containing 570 (460 from Brazil and 110 from world) sequences, based on the pol gene fragment (nucleotides 2301–3200 relative to HXB2 reference strain),
7
and a second dataset, including 2,105 NFLG worldwide, available in the Los Alamos database was created to analyze the subtype C NFLG sequence. Duplicated entries were removed and 1,915 sequences (Supplementary Data S1) were used for phylogenetic reconstruction with the NFLG sample (011) obtained in this study. Alignment was performed using the MAFFT online platform.
8
Maximum likelihood (ML) phylogenies were reconstructed using IQ-TREE.
9
HIVdb (
Figure 2a shows the pol phylogenetic relationships of four new HIV-1C sequences from Bahia (007, 011, 136, and 224) plus reference sequences. All samples from Bahia grouped inside the largest Brazilian monophyletic cluster [Bootstrap (BS) = 98 and 99, respectively], which represents the main subtype C lineage that circulates in this country. Two sequences (007 and 011) clustered with samples from southern and southeastern Brazilian regions (states of Paraná, Rio de Janeiro, and Santa Catarina) (BS 94 and 97, respectively).

ML analyses showing phylogenetic relationships of HIV-1C sequences from Bahia with sequences from Brazil and worldwide. The trees were rooted by the midpoint and the color of the branches represents the geographic origin of sequences. The statistical support is indicated only at key nodes as bootstrap values. Trees were built under the evolutionary model GTR+I+G+F and visualized in the Figtree v1.4.4 software. Horizontal branch lengths are drawn to scale with the bar at the bottom, indicating 0.02 nucleotide substitutions per site.
One sample (136) clustered with sequences from four Brazilian regions (South, Southeast, Central-West, and Northeast), including one sequence from Pernambuco, another state from the Northeast region (BS 97). Another sample (224) clustered in a large group containing samples from three regions of Brazil: South, Southeast, and Central-West (BS 92) (Fig. 2a). Finally, sample 112 could only be amplified in a different genomic region (positions 4171–5196 relative to HXB2). Thus, it was not only phylogenetically analyzed separately (Fig. 2b) but also clustered inside the main Brazilian subtype C cluster (aLRT/BS = 84/81), again, close to a strain from Pernambuco.
In the NFLG dataset containing 1,916 (including sample 011 obtained in this study) sequences, 26 sequences were described in Brazil, one in Argentina, and one in Uruguay. The tree topology based on NFLG reveals that all South American C samples formed a monophyletic group (BS 100) separate from other subtype C sequences (Supplementary Data S2). Among the Brazilian subtype C sequences, the 011 was related to strains from the southern and southeastern regions, similar to what was found in the analysis of the pol region (Fig. 2).
Four (007, 011, 136, and 224) of five patients were being treated with antiretroviral therapy and four major drug resistance mutations (DRMs) were found in three of them: D30N (protease), K103T, E138K/A (reverse transcriptase), and V151L (integrase) (Table 1).
In Brazil, HIV-1C was first detected in 1992 in the city of Porto Alegre, the capital of Rio Grande do Sul, a southern Brazilian state. 10 Subsequent phylogeographic analysis estimated that this strain has been circulating in the south region since the 1970s 11 and afterward spread throughout the other four regions of the country: Central-West and Southeast in the early 1980s, 12 Northeast in the mid-1980s, 7 and North at the end of the 1980s. 13 Despite this, there is evidence of multiple introductions of HIV-1C in Brazil. 7,11
Viruses found in northernmost Brazilian states were phylogenetically linked to those found in the southern region (the main Brazilian lineage). 7,11,12 Our previous analyses have shown that the C subtype strains found in the Northeast region are descendant of this lineage. 7 Accordingly, despite the evidence of additional introductions of subtype C in Brazil, the new sequences obtained in this study were also inside the main cluster, related to strains from the southern, southeastern, and central-western regions (Fig. 2).
The ML analysis based on the NFLG (Supplementary Data) shows a specific cluster composed of sequences from South America and that sample 011 from Bahia is also related to viral sequences from Brazilian southern and southeastern regions, like in the pol ML tree (Fig. 2).
In the DRM analysis, the D30N substitution was found in one sequence (224) conferring resistance to nelfinavir, a protease Inhibitor. The D30N decreases the enzymatic activity of the mutated structure and consequently reduces the replicative capacity of the virus. 13,14 In addition, the substitution E138A found in 224 and 011 reduces the effectiveness of Etravirine and Rilpivirine, which are non-nucleoside reverse transcriptase inhibitor (NNRTI) drugs, as well as K103T in sample 136 (patient using nevirapine). In the analysis of mutations associated with resistance to integrase inhibitors, the V151L, found in one sequence (136), is a rare substitution that confers resistance to raltegravir and elvitegravir. In DRM studies, it is common to expect mutations that reduce the action of NNRTIs, especially in K103X, since this class represents the first regimen for initial treatment. 14
In summary, this study analyzes five new HIV-1C viruses from Bahia state, which seem to descend from the main lineage of this subtype that reached the southern region of Brazil in the mid-70s. 11 One of them comprises the first NFLG of subtype C obtained in the state. Besides, in agreement with our research group previous result, two new subtype C sequences grouped with another Northeastern sample, providing new evidence of the intraregional spreading.
Footnotes
Acknowledgments
R.C.O. and J.S.M.S. are grateful to Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB) for the Phd. scholarship. R.C.O., J.S.M.S., and J.P.M.-C. are grateful to C.B. and M.L.G. for advanced support to the research.
Authors' Contributions
R.C.O. conceptualization, investigation, and writing—original draft preparation; J.S.M.S. methodology and writing—review and editing; L.C.J.A. resources; M.L.G. resources and writing—review and editing; C.B. resources and writing—review and editing; J.P.M.-C. conceptualization, validation, investigation, writing—original draft preparation, and supervision.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
R.C.O. and J.S.M.S., were supported by scholarships (FAPESB). This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Supplementary Material
Supplementary Data S1
Supplementary Data S2
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
