Abstract
The Northeastern Brazilian region has experienced a constant increase in the number of newly reported AIDS cases over the last decade, but the genetic diversity of HIV-1 strains currently disseminated in this region remains poorly explored. HIV-1 pol sequences were obtained from 140 patients followed at outpatient clinics from four Northeastern Brazilian states (Alagoas, Bahia, Ceará, and Piauí) between 2014 and 2015. Subtype B was the most prevalent HIV-1 clade (72%) detected in the Northeastern region, followed by subtypes F1 (6%), C (5%), and D (1%). The remaining strains (16%) displayed a recombinant structure and were classified as follows: BF1 (11%), BC (4%), BCF1 (1%), and CRF02_AG-like (1%). The 20 HIV-1 BF1 and BC recombinant sequences detected were distributed among 11 lineages classified as follows: CRF28/29_BF-like (n = 5), CRF39_BF-like (n = 1), URF_BF (n = 9), and URF_BC (n = 5). Non-B subtypes were detected in all Northeastern Brazilian states, but with variable prevalence, ranging from 16% in Ceará to 55% in Alagoas. Phylogenetic analyses support that the subtype D and CRF02_AG strains detected in the Northeastern region resulted from the expansion of autochthonous transmission networks, rather than from exogenous introductions from other countries. These results reveal that the HIV-1 epidemic spreading in the Northeastern Brazilian region comprises multiple subtypes and recombinant strains, and that the molecular epidemiologic pattern in this region is much more complex than originally estimated.
Introduction
The Northeastern Brazilian region comprises nine states (Alagoas, Bahia, Ceará, Maranhão, Paraíba, Pernambuco, Piauí, Rio Grande do Norte, and Sergipe), which together account for 15% of the total AIDS cases identified in the country. 1 Most AIDS cases in the Northeastern Brazilian region were identified in the heterosexual population (54%), followed by men having sex with men (MSM, 44%) and intravenous drug users (IDUs, 2%). 1 The incidence of newly reported AIDS cases in this region increased by 37% over the last decade, rising from 11.2 cases per 100,000 inhabitants in 2006 to 15.3 cases per 100,000 inhabitants in 2016. 1 The increase in incidence was observed in all Northeastern Brazilian states, but in variable proportions, ranging from 8% in Pernambuco to 83% in Maranhão. 1 These data reveal the complex dynamic nature and the geographical heterogeneity of the AIDS epidemic in Brazil.
Most molecular epidemiologic surveys conducted in the Northeastern Brazilian region analyzed individuals sampled before 2010 and support an HIV-1 epidemic mostly driven by subtype B (70%–90%), followed by BF1 recombinants (4%–20%) and subtype F1 (5%–10%). 2 –11 Although some sequence analyses of the pol genomic region indicated a remarkably high prevalence (20%–30%) of subtype F1, 2,8,11 most of those sequences were reclassified as BF1 variants after near full-length genome characterization. 10 In general, these studies consistently described a low prevalence of subtype C (<2%) and BC recombinants (<0.5%), and there is almost no description of other HIV-1 clades circulating in the Northeastern region, with the exception of one subtype D 2 and one A/G recombinant. 11
More recently, a few studies analyzed individuals sampled between 2010 and 2013 in two states of the Northeastern region and support a much higher prevalence of subtype C and BC recombinants than previously estimated. 12,13 According to those studies, subtype C and BC recombinants together comprise nearly 5%–6% of HIV-1 sequences from Maranhão and Piauí states. These results suggest that the subtype composition of the HIV-1 epidemic in the Northeastern Brazilian region could be changing over time, with a trend toward an increasing prevalence of subtype C and BC recombinants. To test this hypothesis, we analyzed a total of 140 HIV-1 pol gene sequences obtained from individuals sampled between 2014 and 2015 in four different Northeastern Brazilian states (Alagoas, Bahia, Ceará, and Piauí).
Materials and Methods
Study patients
We included in this study all HIV-1 pol sequences obtained from patients failing antiretroviral therapy who were followed at outpatient clinics of the Public Health System in the Northeastern Brazilian region and whose samples were sent to the Laboratory of AIDS and Molecular Immunology (Fundação Oswaldo Cruz-Rio de Janeiro) for HIV drug resistance testing between 2014 and 2015. Through this convenience sampling approach, we included 140 pol sequences from four Northeastern states: Alagoas, Bahia, Ceará, and Piauí. The HIV-1 pol sequences, encompassing the whole protease (PR) and part of the reverse transcriptase (RT) regions of the pol gene, were amplified and sequenced from plasma viral RNA by using the TruGene HIV Genotyping Kit (Siemens) following the manufacturer's protocol (HXB2 genomic positions 2262–3290) or an in-house genotyping method (HXB2 genomic positions 2253–3554). In the in-house genotyping protocol, viral RNA was extracted using the QIAamp Viral RNA Mini Kit (QIAGEN) and subjected to cDNA synthesis using the SuperScript® III RT enzyme (Life Technologies). The PR/RT fragment was amplified by nested polymerase chain reaction employing the Platinum® Taq DNA Polymerase (Invitrogen) and sequenced using the ABI BigDye Terminator v.3.1 reaction kit (Applied Biosystems). Primers and cycling conditions are described elsewhere. 14 The study was approved by the Oswaldo Cruz Institute Ethics Committee (CAAE 03925112.0.0000.5248).
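As a rough check on the size of the sequenced region, the fragment lengths implied by the HXB2 coordinates above can be computed directly. The following is only an illustrative sketch: the function name is hypothetical, and the HXB2 genome length of 9,719 nt (GenBank K03455) is an assumption that is not stated in the text.

```python
# HXB2 coordinates (1-based, inclusive) of the two pol fragments described above.
FRAGMENTS = {
    "TruGene": (2262, 3290),
    "in-house": (2253, 3554),
}

# Assumption: full length of the HXB2 reference genome (GenBank K03455), in nt.
HXB2_GENOME_LENGTH = 9719


def fragment_length(start: int, end: int) -> int:
    """Length of a fragment given 1-based inclusive coordinates."""
    return end - start + 1


for name, (start, end) in FRAGMENTS.items():
    length = fragment_length(start, end)
    share = 100 * length / HXB2_GENOME_LENGTH
    print(f"{name}: {length} nt ({share:.1f}% of HXB2)")
```

Under these assumptions, the TruGene fragment spans 1,029 nt and the in-house fragment 1,302 nt, that is, roughly a tenth of the viral genome.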
HIV-1 subtyping
HIV-1 subtypes were initially determined with the REGA HIV-1 Subtyping Tool 3.0 software 15 and later confirmed by phylogenetic and bootscanning analyses with HIV-1 reference sequences of subtypes A–D, F–H, J, and K, and of Brazilian Circulating Recombinant Forms (CRFs), retrieved from the Los Alamos HIV Database (
HIV-1 reference datasets
HIV-1 subtype B pol Northeastern Brazilian sequences were aligned with subtype B reference sequences representative of the BPANDEMIC (n = 300) and the BCAR (n = 200) clades described previously. 20,21 HIV-1 subtype D and CRF02_AG pol sequences from Northeastern Brazil were aligned with all subtype D and CRF02_AG reference sequences, respectively, from Africa and Brazil available at Los Alamos HIV Database. The non-CRF-like recombinant sequences identified in this study were aligned with all URF_BF1/BC/BCF1 Brazilian recombinants available at Los Alamos HIV Database. The clustering pattern of Brazilian sequences was investigated by performing ML phylogenetic analyses as described above.
Results
In this study, we analyzed 140 HIV-1 pol sequences from patients living in four Northeastern Brazilian states: Alagoas (n = 22, 16%), Bahia (n = 56, 40%), Ceará (n = 49, 35%), and Piauí (n = 13, 9%). These states comprise 55% of all HIV/AIDS cases detected in the Northeastern Brazilian region over the last 10 years, and the sequence distribution of our study roughly matches the distribution of HIV cases across the four states in that period: Alagoas (10%), Bahia (52%), Ceará (33%), and Piauí (5%). 1
Subtype B was the most prevalent HIV-1 clade in our sample (n = 101, 72%), followed by BF1 recombinants (n = 15, 11%), subtype F1 (n = 9, 6%), subtype C (n = 7, 5%), BC recombinants (n = 5, 4%), BCF1 recombinant (n = 1, 1%), subtype D (n = 1, 1%), and AG recombinant (n = 1, 1%) (Fig. 1 and Supplementary Fig. S1; Supplementary Data are available online at

Map of the Brazilian Northeastern region depicting the estimated prevalence of HIV-1 subtypes and intersubtype recombinants among HIV-infected individuals from different states. The total number of HIV-1 sequences analyzed in each locality is indicated below the pie charts. AL: Alagoas; BA: Bahia; CE: Ceará; MA: Maranhão; PB: Paraíba; PE: Pernambuco; PI: Piauí; RN: Rio Grande do Norte; SE: Sergipe.
The recombinant sequences detected in this study were distributed in 13 independent lineages, including nine BF1, two BC, one BCF1, and one AG (Fig. 2 and Supplementary Fig. S1). Five BF1 sequences sampled in Alagoas, Bahia, and Ceará states displayed the same mosaic structure (Fig. 2) and branched together (aLRT = 0.87) with the CRF28/29_BF reference sequences, and were thus classified as CRF28/29_BF-like recombinants (Supplementary Fig. S1). One BF1 sequence from Piauí and the AG recombinant sequence from Ceará displayed the same mosaic structures as (Fig. 2), and branched with high support (aLRT ≥ 0.95) with (Supplementary Fig. S1), the CRF39_BF and CRF02_AG reference sequences, respectively, and were thus classified as CRF39_BF-like and CRF02_AG-like recombinants. The remaining Northeastern recombinant lineages displayed mosaic structures different from those previously described in Brazilian CRFs (Fig. 2 and Supplementary Fig. S1) and were thus classified as URFs. Most of these URF lineages comprise only one or two sequences each, with the exception of lineage BC-I, which comprises four sequences sampled in two different states (Alagoas and Piauí).

Schematic representation of the intersubtype recombinant mosaic patterns of HIV-1 pol sequences from Northeastern Brazil. Segments are shaded according to the subtype assignment as described in the legend. Fragments not sequenced are represented in white. Dotted lines delimit the protease (pro), reverse transcriptase (rt), and integrase (int) regions.
All non-CRF-like recombinant sequences from the Northeastern region identified in this study were combined with all Brazilian URF sequences available in the Los Alamos HIV Database. The ML phylogenetic analyses confirmed that lineages BF1-III to BF1-VII were not related to other Brazilian URFs (Fig. 3A). By contrast, lineage BF1-I branched with high support (aLRT = 0.96) with a URF_BF1 sequence of unknown state of origin, while lineage BF1-II branched (aLRT = 0.95) with a URF_BF1 sequence from São Paulo (Fig. 3A). Similarly, one URF_BC sequence from Mato Grosso branched with high support (aLRT = 1) within the Northeastern lineage BC-I, while lineage BC-II branched with high support (aLRT = 1) with three other URF_BC sequences without a described state of origin (Fig. 3B). The BCF1 recombinant sequence did not branch with high support with other previously described Brazilian URF_BCF1 sequences (Fig. 3C).

Maximum likelihood phylogenetic trees of HIV-1 URF pol sequences from Brazil. Sequences from this study (indicated with an asterisk) classified as URF_BF1
To better understand the putative origin of the subtype D and the CRF02_AG-like recombinant sequences detected in the Northeastern region, these sequences were aligned with African and Brazilian subtype D and CRF02_AG sequences already published. The ML phylogenetic analyses revealed that both the subtype D and the CRF02_AG-like sequences circulating in the Northeastern region branched within Brazilian clades previously described. The subtype D sequence from Alagoas branched (aLRT = 0.80) within a Brazilian subtype D lineage (DBR-I) that also comprises 14 subtype D sequences from Rio de Janeiro, São Paulo, and Mato Grosso do Sul states (Fig. 4). The CRF02_AG-like sequence detected in Ceará branched (aLRT = 0.91) within a Brazilian CRF02_AG lineage (CRF02BR-III) that comprises sequences from Amapá and São Paulo (Fig. 5).

Maximum likelihood phylogenetic tree of HIV-1 subtype D pol sequences from Brazil and African countries. Sequences from Africa, Brazil, and from this study are colored black, blue, and red, respectively, following the legend at top left. The gray boxes highlight the position of the Brazilian subtype D lineages. The aLRT support values are indicated only at key nodes. The tree was rooted using HIV-1 subtype B sequences (gray). Horizontal branch lengths are drawn to scale with the bar at the bottom indicating nucleotide substitutions per site.

Maximum likelihood phylogenetic tree of HIV-1 CRF02_AG pol sequences from Brazil and African countries. Sequences from Africa, Brazil, and from this study are colored black, blue, and red, respectively, following the legend at top left. The gray boxes highlight the position of the Brazilian CRF02_AG lineages. The aLRT support values are indicated only at key nodes. The tree was rooted using HIV-1 subtype G sequences (gray). Horizontal branch lengths are drawn to scale with the bar at the bottom indicating nucleotide substitutions per site.
Discussion
The characterization of the molecular epidemiologic profile of the HIV-1 epidemic in four different Northeastern Brazilian states reveals a complex scenario characterized by the cocirculation of subtypes B, F1, and C and diverse recombinant forms among those subtypes, as well as dissemination of other HIV-1 clades (subtype D and CRF02_AG) rarely detected among Brazilian HIV-1-infected individuals.
Subtype B was the predominant HIV-1 clade in the Northeastern Brazilian region (72%), followed by BF1 recombinants (11%), subtype F1 (6%), subtype C (5%), BC recombinants (4%), and other genetic variants (3%). Initial molecular epidemiologic surveys described an extremely low prevalence of subtype C (<2%) and BC recombinants (<0.5%) in the Northeastern Brazilian region. 2 –11 The combined prevalence of subtype C and BC strains estimated in this study (9%) and in other recent studies performed in Maranhão and Piauí states (5%–6%) 12,13 with patients sampled after 2010 supports a trend toward an increasing prevalence of these genetic variants in the Northeastern region. An increasing prevalence of subtype C has also been described in the Southeastern, 22 Central-Western, 23 and Northern 19 regions, consistent with the notion that HIV-1 subtype C has experienced a steady northward expansion from its epicenter in the southernmost Brazilian states. 24
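The main percentages quoted in this paragraph can be reproduced from the sequence counts given in the Results. The sketch below is a minimal illustration of that arithmetic; the dictionary layout and function name are hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
# Sequence counts per HIV-1 clade as reported in the Results (total n = 140).
COUNTS = {
    "B": 101,
    "BF1": 15,
    "F1": 9,
    "C": 7,
    "BC": 5,
    "BCF1": 1,
    "D": 1,
    "CRF02_AG-like": 1,
}


def prevalence(counts):
    """Percentage of each clade, rounded to the nearest whole percent."""
    total = sum(counts.values())
    return {clade: round(100 * n / total) for clade, n in counts.items()}


PCT = prevalence(COUNTS)

# Combined prevalence of subtype C and BC recombinants, from the raw counts.
TOTAL = sum(COUNTS.values())
COMBINED_C_BC = round(100 * (COUNTS["C"] + COUNTS["BC"]) / TOTAL)

for clade, pct in PCT.items():
    print(f"{clade}: {pct}%")
print(f"C + BC combined: {COMBINED_C_BC}%")
```

Computing the combined figure from the raw counts (12 of 140 sequences) rather than by adding the rounded percentages avoids compounding rounding error.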
Although subtype B was the most prevalent HIV-1 lineage in the Northeastern Brazilian region, its prevalence varies greatly among states, ranging from 45% in Alagoas to 84% in Ceará. The subtype B prevalence observed in Alagoas is the lowest described to date among Brazilian states outside the Southern region. Despite the limited number of HIV-1 pol sequences from Alagoas analyzed in this study (n = 22), we detected a remarkable viral diversity in this Brazilian state, with cocirculation of subtype B (45%), BF1 (18%), subtype C (14%), BC (8%), subtype F1 (5%), and subtype D (5%). Alagoas displays an HIV-1 epidemic with only 6,295 AIDS cases described until June 2016, but an increasing incidence, with the number of newly reported AIDS cases rising by 48% over recent years (from 8.6 cases per 100,000 inhabitants in 2006 to 12.8 cases per 100,000 inhabitants in 2015). 1 Future studies with more robust sampling schemes will be of paramount importance to understand the relative contribution of exogenous introductions and local expansion of different viral strains as driving forces of the HIV-1 epidemic in Alagoas.
A previous study conducted by our group revealed that nonpandemic HIV-1 subtype B variants (BCAR) account for a relatively large fraction (14%) of subtype B infections in the state of Maranhão, but a very low proportion (<1.5%) in Piauí and Pernambuco, where the epidemic is mostly dominated by the BPANDEMIC clade. 25 Our ML phylogenetic analysis showed that all subtype B variants in this study, detected in Alagoas, Bahia, and Ceará, branched within the BPANDEMIC clade, consistent with the pattern observed in Piauí and Pernambuco states. This result confirms that the HIV-1 BPANDEMIC clade accounts for nearly all subtype B infections in the Northeastern Brazilian region, closely matching the pattern observed in the Southeastern, Central-Western, and Southern regions. The only exception is Maranhão state, where the molecular pattern of the HIV-1 subtype B epidemic is closely related to that observed in some states from the Northern region. 25
Our results indicate that HIV-1 BF1 and BC recombinant strains comprise a substantial fraction (15%) of all infections in the Northeastern Brazilian region, and this proportion may be underestimated because we analyzed a small fraction (∼10%) of the HIV-1 genome. Some of the BF1 recombinants detected in this study displayed the same recombinant patterns as CRFs_BF circulating in the states of São Paulo (CRF28/29_BF) 26 and Rio de Janeiro (CRF39_BF), 27 two Brazilian states strongly connected to the Northeastern region. In this work, we also detected the circulation of two potentially new CRF_BF (called BF1-I and BF1-II in this study) and two potentially new CRF_BC (called BC-I and BC-II in this study) lineages comprising between three and five recombinant sequences with the same mosaic pattern. Full-length HIV-1 genome analyses will be necessary to determine whether those recombinant lineages represent new CRFs or just URFs with similar recombination breakpoints in the pol gene.
In addition to the identification of HIV-1 subtypes B, C, and F1 and recombinant forms among them, we also detected the circulation of subtype D and CRF02_AG clades in the Northeastern region. A few cases of subtype D 8,10,28 –35 and CRF02_AG 14,22,30,36 –42 infections have been previously identified in Brazil, and there are at least two CRF02_AG strains that were locally disseminated throughout Rio de Janeiro state. 43,44 The ML analysis performed in this study showed that the subtype D sequence from Alagoas state branched within a clade that only comprises Brazilian sequences from Rio de Janeiro, São Paulo, and Mato Grosso do Sul states, demonstrating for the first time the existence of an autochthonous subtype D transmission network circulating in the Southeastern, Central-Western, and Northeastern Brazilian regions. Similarly, the CRF02_AG sequence detected in Ceará branched with Brazilian sequences from Amapá and São Paulo states, thus revealing an autochthonous CRF02_AG transmission network circulating in the Northern, Northeastern, and Southeastern Brazilian regions.
The convenience sampling and the absence of epidemiological data about HIV-infected subjects were major limitations of our study. A more systematic sampling, covering individuals from all Northeastern Brazilian states and from different transmission risk groups, would certainly provide a more accurate picture of the HIV diversity of this region. However, the four states sampled in this study account for the majority of HIV/AIDS cases detected in the Northeastern Brazilian region over the last 10 years, 1 which may reduce the negative effect of the sampling bias. Another potential constraint of our study was the relatively small genomic fragment used for genotyping, which may lead to misclassifications of the HIV-1 genetic diversity due to uncharacterized recombination outside the analyzed parts of the genome. The pol gene alone, however, has been described as one of the most recombinogenic regions of the HIV-1 genome 45 and a powerful target to monitor HIV-1 diversity. 46
In summary, this study strongly supports that the continuous expansion of the AIDS epidemic in the Northeastern Brazilian region was accompanied by the introduction and dissemination of multiple HIV-1 strains. Pandemic HIV-1 subtype B, subtype F1, and BF1 recombinants were the predominant clades circulating in Northeastern Brazil, but a relatively high prevalence of subtype C and BC recombinants was also detected, mainly in Alagoas state. Our study also described the introduction and circulation of subtype D and CRF02_AG strains in the Northeastern region that were most probably disseminated from other Brazilian regions, rather than from other countries.
Footnotes
Acknowledgments
The authors thank the Department of AIDS, STD and Viral Hepatitis, Ministry of Health, and Oswaldo Cruz Institute-IOC/FIOCRUZ. E.D. is funded by a fellowship from “Programa Nacional de Pós-Doutorado (CAPES-Brazil).”
Sequence Data
Sequences were deposited in GenBank under accession numbers KY581384-KY581523.
Author Disclosure Statement
No competing financial interests exist.
