Abstract
C–C chemokine receptor (CCR) 5 (CCR5) is the main HIV-1 coreceptor involved in virus entry and cell-to-cell spread during acute and chronic infections: such CCR5 and T cell tropic viruses are adapted to and replicate in CD4+ memory T cells. Polymorphisms in CCR5 regulate CCR5 expression, which, in turn, influences HIV infection acquisition and subsequent disease progression. Among these polymorphisms, a 32-bp deletion in the CCR5 open reading frame (CCR5 Δ32) and a single nucleotide polymorphism (SNP) in the promoter (−2459G/A) are the best characterized. CCR5 Δ32 provides partial to full protection against HIV infection and, therefore, serves as a basis for gene deletion studies attempting to achieve a permanent HIV cure. Recent studies have discovered that certain SNPs in the CCR region, not within CCR5, also affect CCR5 expression, HIV infection, and disease progression. Although these studies provide further valuable information regarding the role of human genetic variation in HIV/AIDS, they did not incorporate −2459G/A. In this article, the author summarizes the knowledge gained through the discovery of these new SNPs and suggests that, by not incorporating −2459G/A, these studies may have reached less comprehensive conclusions. Until a strategy that delivers a cure to the millions is found, every piece of information that may help curtail the HIV/AIDS threat to public health should be considered useful.
Human Genetic Variation and HIV/AIDS
Early on in the HIV epidemic, significant differences in the rate of progression to AIDS were noticed among longitudinally followed HIV-infected persons. The role of the human leukocyte antigen (HLA) system genes in determining the course of disease was established by measuring the CD4+ T cell counts and/or the length of time between HIV infection and development of AIDS. 1,2 Several HLA genes or haplotypes appear to influence disease progression, although the effects are complex and may depend on interactions with other host and viral genes. 3 Before the discovery of the role of C–C chemokine receptor (CCR) gene polymorphism in HIV infection and disease progression, only genes of the HLA system were thought to influence HIV disease progression. In the 1990s, studies confirmed the protective role of homozygosity for a 32-bp deletion in CCR5 open reading frame (ORF, CCR5 Δ32) against HIV infection. 1–3 The CCR5 Δ32 allele is predominantly found in European populations, with no or rare occurrences in Asians and native populations from Africa, the Americas, and Oceania. 4 The presence of one copy of the deleted CCR5 gene also influences the course of disease, as the onset of AIDS occurs later for some heterozygous persons than for those homozygous for the wild-type (wt) CCR5 gene. 1–3 The discovery of the role of CCR5 alleles has prompted studies of the possible role of many other host genes in HIV infection and disease progression. 5–7
A Recent Genome-Wide Association Analysis Uncovered rs1015164
A recent study by McLaren et al. 8 tested for association between ∼8 million common variants and set-point viral load (spVL) in 6,315 individuals of European ancestry. In this analysis, they found that the top single nucleotide polymorphism (SNP) on chromosome 3 was rs1015164G/A (p = 1.5 × 10⁻¹⁹). The SNP rs1015164 lies near an antisense transcribed sequence, RP11–24F11.2, that overlaps CCR5, and it is only weakly correlated with CCR5 Δ32 (D′ = 0.89, r² = 0.03). Fine mapping of the 1.5 Mb CCR region association signals in the subset of 5,559 individuals, for whom the CCR5 Δ32 genotype data were available and CCR5 haplotype P1 (Hap-P1, described hereunder) carriage could be determined, showed that another SNP, rs4317138T/C, was the top SNP associated with spVL (p = 7.7 × 10⁻²²). Interestingly, this SNP was highly correlated with the top SNP identified in the analysis of the full sample, rs1015164G/A (D′ = 1, r² = 0.97). Using conditional association analysis, rs1015164G/A remained associated with spVL when conditioning on both CCR5 Δ32 and Hap-P1 (conditional p = 5.2 × 10⁻⁴). McLaren et al. 8 stated, “These SNPs are located within/near an antisense transcribed sequence that overlaps CCR5 and thus may play a role in regulating its expression. Demonstration of causality of these variants and/or a silencing effect of the antisense transcribed sequence will require functional studies.”
New Knowledge About rs1015164, CCR5 Expression, and HIV Outcome
A more recent study by Kulkarni et al. 9 has further substantiated that human genetic variation affects HIV infection and disease progression. It has also shown that the role of human genetic variation in HIV/AIDS is not straightforward. The major findings of this study can be summarized as follows. rs1015164G/A 8 lies in genomic proximity to RP11–24F11.2, an antisense long noncoding RNA (lncRNA) gene that overlaps CCR5, and marks expression of the transcript encoded by this gene, which Kulkarni et al. 9 termed CCR5AS. Their data reveal that higher expression levels of CCR5AS, as a consequence of a variant in an activating transcription factor 1 (ATF1) binding site that rs1015164A marks, enhance CCR5 messenger RNA (mRNA) stability, thereby increasing CCR5 mRNA and cell surface expression. The rs1015164 SNP associates with HIV outcome (viral loads and CD4+ T cell counts) after infection. Thus, the complex interplay among rs1015164A, CCR5AS, and CCR5 provides the functional basis for the association between rs1015164A and lack of HIV control. Furthermore, these data represent a rare determination of the functional importance of a genome-wide disease association where expression of an lncRNA affects HIV infection and disease progression.
A CCR5 Promoter Polymorphism with Known Functional Significance
An important “old” player missing from this complex interplay is the CCR5 promoter polymorphism −2459G/A (rs1799987), also known as 59029G/A and 303G/A. In a variety of studies conducted in the 1990s, the −2459A allele, compared with the −2459G allele, was shown to be associated with accelerated HIV disease progression. 10,11 A number of studies then showed that the −2459A allele was associated with significantly higher in vitro promoter activity, CCR5 expression, and HIV propagation than the −2459G allele. 12–14 Recent studies have provided molecular mechanisms regarding the association between CCR5 promoter polymorphisms and transcriptional regulation of the promoter, and how this association correlates with CCR5 cell surface expression as well as HIV disease phenotype. 15–17
The CCR5 haplotype nomenclature system consists of a total of nine polymorphisms, which include CCR5 ORF wt/Δ32 and −2459G/A. CCR5 haplotypes are organized into nine evolutionarily distinct human haplogroups (HH) designated HHA, -B, -C, -D, -E, -F*1, -F*2, -G*1, and -G*2. 18,19 Haplotypes HHA to HHD carry the −2459G allele, whereas haplotypes HHE to HHG*2 carry the −2459A allele. 18,19 Only HHG*2 carries the Δ32 allele. HHE, HHF*1, and HHG*1 are grouped together as Hap-P1 10 (see Ref. 18 for further discussion). In the same way that the ORF wt/Δ32 and −2459G/A alleles show differences in phenotypic effects in vitro as well as in HIV/AIDS cohorts, different CCR5 haplotypes influence HIV infection and disease outcomes differently. 18,20 Among all nine CCR5 haplotypes (HHA–HHG*2), the consistency of the association of −2459A allele-carrying HHE homozygosity (E/E genotype) with an unfavorable outcome across diverse populations is noteworthy, which suggests that the HHE haplotype confers similar phenotypic effects against distinct genetic backgrounds. 20–24
Given this already known functional significance of −2459G/A (hereafter referred to as rs1799987G/A), a likely reason for not including this promoter polymorphism in the key analyses by Kulkarni et al. 9 could be that the rs1015164 SNP was found to have a genome-wide effect independent of Hap-P1 (carrying rs1799987A) by McLaren et al. 8 The analyses by Kulkarni et al. have found that (a) rs1015164A/G variation associates with HIV-1 viral load and CD4+ T cell counts across distinct populations (fig. 1 and Supplementary fig. 1) 9 ; (b) rs1015164 genotypes (AA/AG and GG) show a significant correlation with CCR5 cell surface expression in bulk memory CD4+ T cells and effector memory CD4+ T cells (fig. 2d) 9 ; (c) CCR5AS enhances CCR5 mRNA and cell surface expression (fig. 3a–c) 9 ; and (d) primary CD4+ T cells from rs1015164AA/AG donors show a considerable increase in infection with R5 tropic virus as compared with CD4+ T cells from donors with the rs1015164GG genotype (fig. 7b). 9 In these analyses, by incorporating rs1799987G/A genotype information, one could have gained an important complementary, more comprehensive insight regarding the outcome. For example, what happens to CCR5 expression, HIV-1 viral loads, and CD4+ T cell counts in individuals who carry both rs1015164A and rs1799987A alleles? Such individuals may have much higher CCR5 expression, higher viral loads, and decreased CD4+ T cell counts than those who carry either the rs1015164A or the rs1799987A allele. If that is the case, will they be considered as genetically “higher risk” individuals? Therefore, an examination of how rs1799987G/A polymorphism may have influenced those conclusions seems warranted.
A Population Genetics View
Allele frequencies and linkage disequilibrium
From a population genetics angle, one may ask how probable it is that the rs1015164A and rs1799987A alleles occur together by chance, and hence whether meaningful analyses can be performed. Using the 1000 Genomes populations, the allele frequencies (Table 1) and linkage disequilibrium (LD; Table 2) values suggest that it is reasonably probable in certain populations. Note that the rs1799987A allele is highly prevalent worldwide (frequencies 32%–66%), whereas the rs1015164A allele has frequencies of 14%–37% in most populations and 2%–8% in those from Africa (Table 1). In the Japanese (JPT, n = 104; rs1015164A, 24%; rs1799987A, 52%) and European American (CEU, n = 99; rs1015164A, 32%; rs1799987A, 53%) populations, the two alleles are expected to occur together in 12% and 17% of the individuals, respectively. These populations were well represented in the study by Kulkarni et al. 9 (JPT, n = 504) and in the previous study by McLaren et al. 8 (Europeans, n = 5,559), where the rs1015164A allele frequencies were 27% and 31%, respectively, highly similar to those reported for the 1000 Genomes populations. Assuming that the rs1799987A allele frequencies in those populations 8,9 are the same as in the 1000 Genomes populations (JPT, 52%; CEU, 53%), the two alleles are expected to occur together in 14% (n = 71) and 16% (n = 889) of the individuals, respectively.
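The co-occurrence estimates above follow from simple arithmetic, which the following minimal Python sketch reproduces. It assumes, as the text appears to, that the expected fraction is the product of the two allele frequencies (that is, the alleles are treated as independent) and that the percentage is rounded to a whole number before being scaled to the cohort size. The function names are illustrative and are not taken from the cited studies.

```python
def expected_joint_percent(freq_a: float, freq_b: float) -> int:
    """Whole-number percentage expected to carry both alleles.

    Assumption: the two alleles are independent, so the joint
    frequency is the product of the two allele frequencies.
    """
    return round(freq_a * freq_b * 100)


def expected_carriers(freq_a: float, freq_b: float, n: int) -> int:
    """Expected number of individuals with both alleles in a cohort of n."""
    return round(expected_joint_percent(freq_a, freq_b) / 100 * n)


# Allele frequencies quoted in the text: (rs1015164A, rs1799987A, cohort size)
cohorts = {
    "JPT, 1000 Genomes": (0.24, 0.52, 104),
    "CEU, 1000 Genomes": (0.32, 0.53, 99),
    "JPT, Kulkarni et al.": (0.27, 0.52, 504),
    "Europeans, McLaren et al.": (0.31, 0.53, 5559),
}

for label, (f_1015164, f_1799987, n) in cohorts.items():
    pct = expected_joint_percent(f_1015164, f_1799987)
    count = expected_carriers(f_1015164, f_1799987, n)
    print(f"{label}: {pct}% expected, about {count} of {n} individuals")
```

Because LD between the two SNPs is reported for these populations, the independence product should be read as a rough approximation rather than an exact expectation.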
Table 1. Characteristics of Polymorphisms Within and Around CCR5 in 1000 Genomes Populations
[Table body not available in this text; table notes follow.]
Phase 3 populations (26 populations, 2504 samples).
Ancestral allele/mutant allele.
GRCh38 coordinate.
Mutant allele.
rs1799987 is also known as −2459G/A, 59029G/A, and 303G/A.
Table 2. Linkage Disequilibrium Between Polymorphisms Within and Around CCR5 in 1000 Genomes Populations
[Table body not available in this text; table note follows.]
—: data not available.
In the study by Kulkarni et al., 9 although the rs1015164A allele was less frequent in the African American patients (n = 992), and AA homozygous individuals were rare, patients carrying at least one rs1015164A allele (AA/AG) also had considerably higher viral load and decreased CD4+ T cell counts, pointing to a uniform deleterious effect of rs1015164A in HIV-1 infection across distinct populations. Considering the rs1015164A (2%–8%) and rs1799987A (32%–49%) allele frequencies in the 1000 Genomes African populations, these two alleles are expected to occur together only in 1%–4% of the individuals, requiring a very large sample size to study their combined effects.
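To give a sense of scale for the sample sizes involved, the sketch below reproduces the 1%–4% range from the allele frequency ranges quoted above and then estimates how large a cohort would be needed to expect a given number of individuals carrying both alleles. The independence assumption and the target of 100 such individuals are hypothetical choices for illustration only.

```python
import math


def joint_percent(freq_a: float, freq_b: float) -> int:
    """Whole-number percentage expected to carry both alleles (independence assumed)."""
    return round(freq_a * freq_b * 100)


def cohort_size_needed(target_carriers: int, percent: int) -> int:
    """Smallest cohort in which `target_carriers` double carriers are expected."""
    return math.ceil(target_carriers * 100 / percent)


# Frequency ranges quoted for the 1000 Genomes African populations
low = joint_percent(0.02, 0.32)   # lowest rs1015164A x lowest rs1799987A
high = joint_percent(0.08, 0.49)  # highest rs1015164A x highest rs1799987A

TARGET = 100  # hypothetical number of double carriers wanted for analysis
print(f"Expected co-occurrence: {low}%-{high}%")
print(f"Cohort needed at {low}%: {cohort_size_needed(TARGET, low)}")
print(f"Cohort needed at {high}%: {cohort_size_needed(TARGET, high)}")
```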
Haplotype frequencies
The LD between rs1015164 and rs1799987 SNPs suggests that they are not independent of one another in all populations (Table 2). To further substantiate that the rs1015164A and rs1799987A alleles occur together, frequencies of the haplotype containing both variant alleles were calculated using sample genotype data from the 1000 Genomes populations. 25 Similar to the expected allele frequencies, the haplotype rs1015164A_rs1799987A is estimated to occur in the JPT, CEU, and African populations at frequencies 24%, 32%, and 5%, respectively.
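The relationship between allele frequencies, haplotype frequency, and LD can be made concrete with the standard definitions D = p_AB − p_A·p_B, D′ = |D|/D_max, and r² = D²/[p_A(1 − p_A)·p_B(1 − p_B)]. The Python sketch below applies them to the rounded JPT and CEU frequencies quoted in the text. The function name is illustrative, and because the inputs are rounded percentages, the results are indicative only and are not a reproduction of Table 2.

```python
def ld_stats(p_a: float, p_b: float, p_ab: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic sites.

    p_a, p_b: frequencies of the two alleles of interest
    p_ab:     frequency of the haplotype carrying both alleles
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Rounded values from the text: (rs1015164A, rs1799987A, A_A haplotype)
populations = {
    "JPT": (0.24, 0.52, 0.24),
    "CEU": (0.32, 0.53, 0.32),
}

for name, (p_1015164, p_1799987, p_hap) in populations.items():
    d, d_prime, r2 = ld_stats(p_1015164, p_1799987, p_hap)
    print(f"{name}: D = {d:.4f}, D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

With these rounded inputs, the haplotype frequency equals the rs1015164A allele frequency in both populations, which gives D′ close to 1, while r² stays well below 1 (roughly 0.29 for JPT and 0.42 for CEU) because the two alleles differ in frequency.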
Conclusions
With the new knowledge about rs1015164G/A, including its high correlation with rs4317138T/C, 8,26 an important question is whether the regulation of CCR5 expression is based on parallel or divergent mechanisms. Where does rs1799987G/A fit into this picture? Note that the LD patterns of the rs1015164-rs1799987 and rs4317138-rs1799987 SNP pairs are highly similar (Table 2). Given that these variants are linked to some degree in many populations, could it be that the original reports of rs1799987G/A were tagging the stronger signal at rs1015164G/A? To answer these questions, studies such as that by Kulkarni et al. 9 need to incorporate rs1799987G/A (−2459G/A) information, because of the mechanistic or constitutive explanation that this SNP already provides. The incorporation of this polymorphism is also important to test the hypothesis that individuals who carry both rs1015164A and rs1799987A alleles (the A_A haplotype) have much higher CCR5 expression, higher viral loads, and decreased CD4+ T cell counts than those who carry either the rs1015164A or the rs1799987A allele. Since large human genomic association data sets are available, it seems possible to test this hypothesis; this analysis is beyond the scope of this article. Finally, considering the status of these polymorphisms may also be important in studies wherein new immunologic 27,28 and chemotherapeutic 29,30 strategies are evaluated. In those studies, knowing whether an individual receiving such an immunologic or chemotherapeutic intervention carries two, one, or no copies of the A_A haplotype would enable a better understanding of the response to the intervention. In this era of seeking an HIV cure, where all the emphasis has been on the 32-bp deletion, it seems that the well-described promoter polymorphism, −2459G/A, has been forgotten or ignored. 20 Until we find a strategy that delivers a cure to the millions, we should make use of every piece of information that may help curtail the HIV/AIDS threat to public health.
Footnotes
Acknowledgments
This article is the result of inspiration and guidance from the Higher Self, and is dedicated to the loving memories of the late Dr. Anil Ghosh, a fellow researcher and a dear friend. The author is grateful to Jasmine Olvany, Dr. Ricky Chan, Dr. Carolyn Myers, Marlin Linger, and Quentin Watson for commenting on the article.
Author Contributions
R.K.M. wrote the article.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
