Abstract
Background:
The vaginal microbiome (VMB) plays an important role in the persistence of human papillomavirus (HPV) infection and differs by race and among women with cervical intraepithelial neoplasia (CIN).
Materials and Methods:
We explored these relationships using 16S rRNA VMB taxonomic profiles of 3,050 predominantly Black women. VMB profiles were assigned to three subgroups based on taxonomic markers indicative of vaginal wellness: optimal (Lactobacillus crispatus, L. gasseri, and L. jensenii), moderate (L. iners), and suboptimal (Gardnerella vaginalis, Atopobium vaginae, Ca. Lachnocurva vaginae, and others). Multivariable Firth logistic regression models were adjusted for age, smoking, VMB, HPV, and pregnancy status.
Results:
VMB prevalence by subgroup was 18%, 30%, and 51% for the optimal, moderate, and suboptimal groups, respectively. In fully adjusted models, the risk of CIN grade 3 (CIN3) among non-Latina (nL) Blacks was twice that of nL Whites (odds ratio [OR] = 2.0, 95% confidence interval [CI]: 1.1, 3.9, p = 0.02). The VMB modified this association (p = 0.04) such that the risk of CIN3 was significantly higher for nL Blacks than for nL Whites only among women with optimal VMBs (OR = 7.8, 95% CI: 1.7, 74.5, p = 0.007). Within racial groups, the risk of CIN3 was only elevated among nL White women with suboptimal VMBs (OR = 6.0, 95% CI: 1.3, 56.9, p = 0.02) compared with their racial counterparts with optimal VMBs.
Conclusions:
Our findings suggest that race is a modifier of the VMB in HPV carcinogenesis. An optimal VMB does not appear to be protective for nL Black women compared with nL White women.
Introduction
Recent studies support the key role of the vaginal microbiome (VMB) in the persistence of high-risk human papillomavirus (HPV) infection and subsequent progression to cervical intraepithelial neoplasia (CIN). 1 There is evidence for a link between HPV and the VMB; HPV-positive women exhibit higher VMB diversity and Lactobacillus depletion than uninfected women. 2 While it is accepted that the VMB of women with CIN differs from that of healthy women, the VMB composition before CIN development has not been well studied.
Such an inquiry into the VMB before CIN development might be important, particularly since the VMB is complex and dynamic, both spatially and temporally. 3 If an association exists between the composition of the VMB at the time of screening and CIN development, this information could help guide clinical decisions regarding treatment.
To our knowledge, no previous studies have evaluated the impact of self-reported race (a proxy measure for extrinsic/environmental factors) on the relationship between the VMB and risk of CIN development despite reports that the VMB of non-Latina (nL) Black women exhibits increased microbial biodiversity (associated with HPV tumorigenesis). 4–6
In nL Black women, the optimal (Lactobacillus crispatus-predominant) VMB, which has been reported to have an inverse association with the severity of cervical abnormalities, is less prevalent than in their nL White counterparts. 7 These racial differences in VMBs are likely to be driven by both environmental and host genetic factors. 8 In the United States, nL Black women are more frequently screened for cervical cancer (CCa) than their nL White counterparts (86% vs. 80%, respectively) and therefore might be more likely to be overdiagnosed and overtreated for indolent lesions.
With the second highest CCa screening rate in the country (91%), nL Black women in Virginia could be disproportionately affected by the unintentional harmful effects of overdetection and overtreatment of cervical lesions. 9 In addition, nL Black women generally have a higher HPV prevalence (39% vs. 24%) and more persistent (longer) HPV infections (601 days vs. 316 days) than nL Whites. 10–13 The VMB may help explain the differential persistence of HPV infections by race.
If the VMB plays a role in CIN development, then elucidating its role may lead to identification of clinically relevant predictive and/or prognostic biomarkers and perhaps explain racial disparities in CIN development. In this study, we explored, for the first time, the role of race and the VMB before CIN diagnosis in the risk of severe dysplasia (CIN grade 3, CIN3).
Materials and Methods
Eligibility
Comprehensive inclusion and exclusion criteria for the original Vaginal Human Microbiome Project cohort, generated as part of the NIH Human Microbiome Project (UH2/UH3AI083263), have been published previously. 14 Briefly, the original study recruited 4,851 nonincarcerated consenting women aged 18 years and older from outpatient women's health clinics at the Virginia Commonwealth University Health System, who were willing or already scheduled to undergo a vaginal examination using a speculum between August 2009 and November 2013. 15,16
Of the 4,851 women, 3,242 consented to access to their electronic medical records (EMRs) at the time of the original study. We further excluded 42 women with a documented HIV-positive diagnosis and 150 women with missing or conflicting information on race or VMB, resulting in a study sample of 3,050 women. We used self-reported health histories abstracted from study questionnaires and confirmed by electronic health records to obtain HPV status and other relevant sociodemographic characteristics.
Women with HIV-positive self-reports were excluded from the analysis. HPV vaccination status was considered in the models, but the difference was not statistically significant (p = 0.7).
Vaginal microbiome
VMB profiles were derived from taxonomic analysis of the amplified V1–V3 regions of 16S rRNA in resident bacterial taxa. Optimal microbiomes are considered to have certain predominant Lactobacillus species such as L. crispatus, L. gasseri, and L. jensenii, known for their anti-inflammatory properties and lower microbial diversity, whereas suboptimal microbiomes are those associated with dysbiosis and greater microbial diversity. 17
Based on this information and published work from the Vaginal Microbiome Consortium, vagitypes were dichotomized for analysis into optimal (enriched for L. crispatus, L. gasseri, L. iners, and L. jensenii) and suboptimal (BVAB1, Gardnerella vaginalis, NA, NoType, OtherVT, and Atopobium vaginae) groups. 18 The dichotomized variable was used for regression analyses, but a trichotomous version separating L. iners (considered a transitional state) was used during univariate and bivariate analyses.
The 16S rRNA feature table of the VMB was normalized by rarefaction to the depth of the lowest number of reads in a sample for use in diversity analysis. Alpha diversity was quantified by calculating the Shannon index using the vegan package in R. The Kruskal–Wallis test was used to test for significance. Beta diversity was quantified using the Bray–Curtis distance and visualized using t-distributed stochastic neighbor embedding (t-SNE) plots.
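As a minimal illustration of the quantities named above, the following Python sketch rarefies two toy samples to the depth of the shallowest one and then computes the Shannon index and the Bray–Curtis distance. The study itself used the vegan package in R; the function names and read counts here are hypothetical, and the natural-log base is an assumption chosen to match vegan's default.

```python
import math
import random


def rarefy(counts, depth, rng):
    """Subsample reads without replacement so the sample totals `depth`."""
    pool = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    rarefied = [0] * len(counts)
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] += 1
    return rarefied


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Hypothetical read counts for four taxa in two samples (not study data).
sample_a = [900, 50, 30, 20]     # dominated by one taxon
sample_b = [200, 150, 100, 50]   # more even community

rng = random.Random(0)
depth = min(sum(sample_a), sum(sample_b))  # depth of the shallowest sample
rare_a = rarefy(sample_a, depth, rng)
rare_b = rarefy(sample_b, depth, rng)

print("Shannon A:", round(shannon(rare_a), 3))
print("Shannon B:", round(shannon(rare_b), 3))
print("Bray-Curtis A vs B:", round(bray_curtis(rare_a, rare_b), 3))
```

In this toy example the more even sample has the higher Shannon index, which is the sense in which a polymicrobial, Lactobacillus-depleted profile is described as more diverse.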
Differences in beta diversity were tested using the adonis2 function in the vegan package. 19 For the distance-based redundancy analysis (dbRDA), the 16S rRNA feature table of the VMB was normalized by Hellinger pretransformation using the decostand function in the vegan package. The dbRDA was performed using the capscale function in the vegan package in R to test the influence of race and CIN3 on the composition of microbiomes. The Bray–Curtis distance was used to calculate redundancy.
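The Hellinger pretransformation mentioned here replaces each count with the square root of its relative abundance within the sample. A short Python sketch of that calculation follows; the function name and counts are hypothetical, and the analysis in the study was run with decostand in R rather than with this code.

```python
import math


def hellinger(counts):
    """Hellinger transformation: square root of each taxon's relative abundance."""
    total = sum(counts)
    return [math.sqrt(n / total) for n in counts]


# Hypothetical read counts for one sample (not study data).
sample = [900, 50, 30, 20]
transformed = hellinger(sample)
print([round(v, 3) for v in transformed])
```

Because the squared values sum to 1 for every sample, the transformation removes differences in sequencing depth and dampens the influence of a single dominant taxon before ordination.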
For differential relative abundance analysis, the 16S rRNA feature table of VMBs was normalized by a centered log-ratio transformation, and the significance of relative abundance differences was analyzed by the Kruskal–Wallis test in the ALDEx2 package in R using the paired Wilcoxon rank sum test value. 20 Adjusted p-values were generated using the Benjamini–Hochberg correction in the ALDEx2 package. Relative abundance difference was calculated using the aldex.effect function in the ALDEx2 package and evaluated by the per-feature median difference between two conditions.
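The two generic steps in this paragraph, the centered log-ratio (CLR) transformation and the Benjamini–Hochberg adjustment, can be sketched in a few lines of Python. This is a simplified illustration and not a reimplementation of ALDEx2: that package handles zero counts by Monte Carlo sampling from a Dirichlet distribution, whereas the fixed pseudocount below is an assumption made for brevity, and all names and numbers are hypothetical.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the sample's mean log value."""
    logs = [math.log(n + pseudocount) for n in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for one sample and raw p-values for four taxa.
sample = [10, 0, 5, 85]
raw_p = [0.01, 0.04, 0.03, 0.005]

print([round(v, 3) for v in clr(sample)])
print([round(p, 3) for p in benjamini_hochberg(raw_p)])
```

CLR values sum to zero within a sample, so each value expresses a taxon's abundance relative to the sample's geometric mean rather than to the sequencing depth.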
Cervical intraepithelial neoplasia
The outcome of interest was high-risk CIN, defined as any CIN3 diagnosis on the first Pap test (anchor pap), occurring on or after the VMB sample collection date. CIN3 diagnosis data were sourced from billing claims using ICD-9 and ICD-10 diagnosis codes. To avoid double counting the same diagnosis from billing data, only a CIN coded within 45 days of the anchor pap was considered related to that anchor pap. Once identified from billing data, CIN3 was confirmed from pathology reports.
During follow-up through October 2020, 327 patients (11%) had evidence of CIN (grades 1–3) that developed after VMB sample collection. Of these, 234 had CIN grade 1 or 2 and 93 had CIN3 confirmed pathologically or cytologically using concordant cytology and ICD codes. The mean time to diagnosis was 2,380 days and ranged from 31 to 4,003 days. Three of the 93 CIN3 cases were diagnosed within 45 days of the anchor pap. Seven were diagnosed between 252 and 998 days of the anchor pap and the rest (n = 83) were diagnosed after more than 1,000 days.
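The counts reported in the Eligibility section and in the paragraph above can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text; variable names are illustrative.
emr_consented = 3242
excluded_hiv = 42
excluded_missing = 150
analytic_sample = emr_consented - excluded_hiv - excluded_missing

any_cin = 327
cin1_or_2 = 234
cin3 = 93
cin3_timing = {
    "within 45 days": 3,
    "252 to 998 days": 7,
    "more than 1,000 days": 83,
}

print("Analytic sample:", analytic_sample)                                # 3050
print("Any CIN, % of sample:", round(100 * any_cin / analytic_sample))     # 11
print("CIN3, % of CIN cases:", round(100 * cin3 / any_cin))                # 28
print("Grades add up:", cin1_or_2 + cin3 == any_cin)                       # True
print("Timing adds up:", sum(cin3_timing.values()) == cin3)                # True
```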
Analysis
Epidemiological analysis
Chi-square and t-tests were used for univariate analysis and Firth logistic regression was used for multivariable analysis. 21 We used manual backward selection of relevant demographic and clinical variables (type 3 analysis), using an alpha of 0.10 as a selection criterion to retain a variable in the model. The final models were adjusted for age, smoking status, VMB, HPV, and pregnancy status.
The analysis was completed in two steps. The first step examined the additive effects of race (defined as nL Black and nL White) and, separately, of the VMB on CIN3. We reported unadjusted and adjusted odds ratios (ORs) and their corresponding 95% confidence intervals (CIs). The second step examined the VMB as a moderator of the effect of race on the risk of CIN3. To test this hypothesis, interaction terms between self-reported race and VMB were added to the model.
The interaction model was tested even if statistically significant associations were not found in the first step as interaction can exist even if first-order effects are nonsignificant. We report the odds ratios and 95% confidence intervals (CIs) for the corresponding statistically significant effects, stratified by race and VMB.
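The two-step strategy can be written compactly. The display below is a sketch under assumed coding, not the authors' stated parameterization: B is an indicator for nL Black race (nL White as reference), S is an indicator for a suboptimal VMB (optimal as reference), and Z collects the remaining adjustment covariates (age, smoking, HPV, and pregnancy status). The source does not report reference levels or coefficient names, so every symbol here is an assumption. The last line gives the standard Firth penalized log-likelihood, where I(β) is the Fisher information matrix.

```latex
\begin{align}
\text{Step 1:}\quad
\operatorname{logit}\Pr(\mathrm{CIN3}=1)
  &= \beta_0 + \beta_1 B + \beta_2 S + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
\text{Step 2:}\quad
\operatorname{logit}\Pr(\mathrm{CIN3}=1)
  &= \beta_0 + \beta_1 B + \beta_2 S + \beta_3 (B \times S)
     + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
\mathrm{OR}_{\text{race}\mid S=0} &= e^{\beta_1},
  \qquad
\mathrm{OR}_{\text{race}\mid S=1} = e^{\beta_1+\beta_3} \\
\mathrm{OR}_{\text{VMB}\mid B=0} &= e^{\beta_2},
  \qquad
\mathrm{OR}_{\text{VMB}\mid B=1} = e^{\beta_2+\beta_3} \\
\ell^{*}(\boldsymbol{\beta})
  &= \ell(\boldsymbol{\beta})
     + \tfrac{1}{2}\log\bigl\lvert I(\boldsymbol{\beta})\bigr\rvert
\end{align}
```

Under this coding, a nonzero interaction coefficient means the race OR differs between VMB strata and, equivalently, the VMB OR differs between racial groups, which is why stratified estimates are reported in both directions.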
Analyses were completed using STATA statistical software, version 12 (Stata Corp, College Station, TX, USA), or SAS 9.4 statistical software using an alpha of 0.05, except for interaction terms, which used an alpha of 0.2, and the Breslow–Day homogeneity test. All p-values were two-sided.
Microbiome analysis
Alpha diversity was quantified by calculating the Shannon index using the vegan package in R. The Kruskal–Wallis test was used to test for significance. Beta diversity was quantified using the Bray–Curtis distance and visualized using a t-SNE plot.
Differences in beta diversity were tested using the adonis2 function in the vegan package. 19 dbRDA was performed using the capscale function in the vegan package in R to test the influence of race and CIN3 on the composition of microbiomes. The Bray–Curtis distance was used to calculate redundancy.
Results
Descriptive results
Samples used in this study were collected from 3,050 women attending VCU Health Center women's outpatient clinics, who agreed to participate in the NIH-supported Vaginal Human Microbiome Project (VaHMP) between August 2009 and November 2013 (Tables 1 and 2). 7 Participants were predominantly nL Black (70%), married (71%), and nonsmokers (56%), with a mean age of 34 years.
Table 1. Distribution of Self-Reported Behavioral and Clinical Characteristics by Race
Table 2. Distribution of Self-Reported Behavioral and Clinical Characteristics by Cervical Intraepithelial Neoplasia Grade 3 Status
Table footnote (suboptimal VMB): Gardnerella vaginalis, Atopobium vaginae, Ca. Lachnocurva vaginae, and others often associated with bacterial vaginosis.
VMB profiles were assigned to three subgroups based on taxonomic markers indicative of vaginal wellness 22: optimal, including Lactobacillus crispatus, L. gasseri, and L. jensenii; moderate, including only L. iners (considered a transitional state); and suboptimal, including G. vaginalis, A. vaginae, and Ca. Lachnocurva vaginae. The prevalence of these subgroups in the sample was 18%, 30%, and 51% for optimal, moderate, and suboptimal subgroups, respectively. Twelve percent of women reported having a previous HPV diagnosis.
Compared with nL Whites, nL Black women were younger (33 vs. 36 years, p < 0.0001), more likely to be married (83% vs. 46%, p < 0.001), and more likely to be current smokers (43% vs. 23%, p < 0.001). They also had a higher prevalence of suboptimal VMBs (58% vs. 38%) and a lower prevalence of optimal VMBs (12% vs. 33%) than nL Whites. They were less likely to report ever having an HPV-positive diagnosis (9% vs. 20%, p < 0.001) but twice as likely to have a CIN3 lesion (4% vs. 2%, p = 0.003).
Patients were followed forward in time through 2020 and 11% (n = 327) went on to develop CIN, of which 28% (n = 93) were CIN3 cases. Women who developed CIN3 did not differ in age (p = 0.14), but were more likely to be current smokers (42% vs. 35%, p = 0.003), report an HPV-positive infection (18% vs. 12%, p = 0.011), and have the suboptimal VMB subtype at baseline (63% vs. 50%, p = 0.011).
Associative results
Figure 1a and b show the associations identified. When unadjusted, the risk of a CIN3 lesion for nL Black women was twice that for nL Whites (OR = 2.3, 95% CI: 1.3–4.0). When adjusted for age, smoking, VMB, HPV, and pregnancy status, the risk of CIN3 among nL Black women compared with nL White women remained similar to the unadjusted estimate (OR = 2.0, 95% CI: 1.1–3.9). However, the VMB modified the relationship between race and risk of CIN3 (p = 0.04), such that when stratified by VMB subtype, only nL Black women with optimal VMBs experienced an increased risk of CIN3 compared with nL Whites with comparable VMBs (OR = 7.8, 95% CI: 1.7–74.5, p = 0.007, Fig. 1a).

No significant excess risk of CIN3 was identified within any other VMB subtype. Within racial groups, the risk of CIN3 was elevated only among nL Whites with suboptimal VMBs compared with their (nL White) racial counterparts with the optimal VMB subtype (OR = 6.0, 95% CI: 1.3–56.9, p = 0.02, Fig. 1b). In contrast, there was no differential risk of CIN3 by VMB subtype among nL Black women.
Notably, the crude risk of CIN3 was nearly three times as high for women with the suboptimal VMB subtype as for women with optimal VMBs (OR = 2.91, 95% CI: 1.38–6.12).
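A Wald confidence interval is symmetric around the point estimate on the log-odds scale, so the geometric mean of its limits should reproduce the OR. The Python sketch below applies that check to the estimates reported above. The helper names are hypothetical, and the source does not state how its intervals were computed, so the output is a plausibility check rather than a reconstruction of the authors' method.

```python
import math


def log_midpoint(lower, upper):
    """Geometric mean of the CI limits, i.e. the midpoint on the log-odds scale."""
    return math.sqrt(lower * upper)


def wald_se(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Estimates as reported in the text: (OR, lower limit, upper limit).
reported = {
    "crude, suboptimal vs optimal VMB": (2.91, 1.38, 6.12),
    "nL Black vs nL White, optimal VMB": (7.8, 1.7, 74.5),
    "nL White, suboptimal vs optimal VMB": (6.0, 1.3, 56.9),
}

for label, (odds_ratio, lower, upper) in reported.items():
    mid = log_midpoint(lower, upper)
    se = wald_se(lower, upper)
    print(f"{label}: OR={odds_ratio}, log-scale midpoint={mid:.2f}, implied SE={se:.2f}")
```

For the crude estimate the midpoint lands close to 2.91, as expected for a Wald interval. For the two stratified estimates the midpoint sits well above the reported OR, which would be consistent with an asymmetric method such as the profile penalized-likelihood intervals often paired with Firth regression, although that is an inference and not something the text states. The wide intervals also signal that the stratified estimates rest on few CIN3 cases.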
VMB diversity
Our analysis showed no overall difference in VMB alpha diversity between women who developed CIN3 and those who did not (Fig. 2a). When stratified by race, however, the baseline VMB diversity of nL White women who went on to develop CIN3 was higher than that of their racial counterparts who did not develop CIN3 (p = 0.09, Fig. 2c).

Fig. 2. Alpha diversity of the vaginal microbiome associated with race and CIN3. Comparison of alpha diversity of the vaginal microbiome quantified by the Shannon index among all participants
In contrast, VMB diversity did not differ among nL Black women according to CIN3 status (Fig. 2b). When comparing races among women who did not develop CIN3, the VMB of nL Black women was much more diverse than that of their nL White counterparts (Fig. 2d). However, the alpha diversity of the baseline VMB for women who developed CIN3 did not differ by race (Fig. 2e).
Beta diversity was visualized using a t-SNE plot and a dbRDA plot and tested using the Adonis test (Fig. 3). The t-SNE plot and Adonis test indicated that both race and CIN3 state were significantly associated with the composition of the VMB (Fig. 3a). However, the VMB was associated with CIN3 state only in nL White women and not in nL Black women.

Fig. 3. Beta diversity and composition of the VMB associated with race and CIN3.
This finding seems to be consistent with observations in epidemiological and alpha diversity analyses and illustrates that the association between the VMB and CIN3 seems race dependent. Additionally, race seems to have a stronger impact on the VMB than CIN3 because the association between the VMB and CIN3 was less significant in the Adonis test (Fig. 3a) and the main dimension (CAP1) in the dbRDA plot was mainly associated with race (Fig. 3b).
Consistent with the Adonis test, the VMB in participants with CIN3 did not overlap with controls in the dbRDA plot, suggesting that CIN3 is associated with the composition of the VMB (Fig. 3b).
Owing to the limited number of CIN3 cases, differential abundance analysis identified only L. crispatus as significantly depleted in the VMB of participants with CIN3, and among women who developed CIN3, L. crispatus was more depleted in nL White than in nL Black women (Supplementary Data S1). Dysbiosis-associated taxa considered part of the suboptimal VMB category (e.g., G. vaginalis, A. vaginae, and Sneathia spp.) were enriched in CIN3-positive participants, but not significantly (Supplementary Data S1).
Discussion
In this study, we identified the VMB as an effect modifier in the relationship between race and risk of CIN3: a racial disparity in the risk of CIN3 was present only among women with optimal VMBs (predominantly comprising L. crispatus) and was absent otherwise. Within the context of this optimal VMB, nL Black women had nearly eight times the risk of CIN3 compared with their nL White counterparts.
Within racial groups, a significant excess risk was only identified among nL White women with suboptimal VMBs compared with their racial counterparts with optimal VMBs. The risk of CIN3 did not differ by VMB subtype among nL Black women. Aligned with these findings, nL Black women had similar microbial biodiversity regardless of CIN3 diagnosis, whereas in nL Whites, microbial diversity was higher among those who developed CIN3 than among those who did not.
This work largely supports previously published data regarding higher rates of CIN3 in nL Black women and women with suboptimal VMBs. Specifically, we observed an increased risk of CIN3 among women with suboptimal VMBs, representing a Lactobacillus-depleted polymicrobial phenotype that is generally associated with suboptimal vaginal health, compared with those with optimal VMBs, which is consistent with previous reports. 23–25
An optimal VMB with Lactobacillus dominance has been shown to be protective against HPV, HIV, and HSV infections. Conversely, the suboptimal VMB has been implicated in numerous poor outcomes in women's health, including increased risk of sexually transmitted infections, inflammation, a higher likelihood of persistent HPV, preterm birth, miscarriage, and infertility. 26–28 A suboptimal VMB is an indicator of poor health regardless of race, ethnicity, or other social factors, 29 and this study extends this observation to cervical dysplasia risk.
In agreement with existing literature, we confirmed an independent relationship between race and risk of CIN3, with nL Black women having an increased risk of CIN3 compared with their nL White counterparts. However, to our knowledge, this is the first exploration of the VMB as a potential contributor to racial disparities in CIN3 risk. The differential relationship of VMB with race may suggest diverging etiologic mechanisms; the VMB potentially contributes to HPV carcinogenesis differentially by race.
Surprisingly, nL Black women with optimal VMBs had a significantly higher risk of CIN3 than nL White women with optimal VMBs. The fact that the VMB is not equally protective might pinpoint a missing etiologic event that could explain the disparities seen in CIN among women with optimal VMBs. When considering racial differences, a causal mechanism could be that nL Black women in the United States have higher rates of cumulative effects of stress than nL White women. 30
The role of chronic stress and metabolic dysfunction in immune dysregulation leading to poor health outcomes is well known, especially in cancer care and carcinogenesis. 31 Stress is a possible contributing factor in CIN development, both by directly influencing inflammation and immune functions and by indirectly influencing carcinogenesis by altering the VMB, highlighting the VMB as a likely mediator in the causal pathway of CIN development. 32
Exposure to proinflammatory stimuli, such as chronic stress, deserves further investigation in nL Black women with optimal VMBs and development of CIN3. There is evidence of a link between the microbiome and immune cell functions. For example, some bacterial taxa have been suggested to influence the HPV vaccine response. 33
The key strengths of our study include a large biologically characterized cohort with a robust representation of nL Black women. Additionally, only premenopausal women were included, eliminating confounding microbiome changes during menopause. 34
Among some key limitations is the lack of information on environmental exposures of women evaluated in this study. Increasing evidence suggests that racial differences in VMBs are driven by environmental and host genetic factors. 8 For example, smoking, number of sexual partners, hormone replacement therapy and hormone contraception use, timing of menstrual cycles, diet, and exercise are some factors that can impact the VMB. 35 While some of this information was available for some participants, there were incomplete or missing data, which might have led to residual confounding.
The HPV genotype was not available; however, prior work indicated similar patterns of VMB when assessing those with only high-risk subtypes and those with any HPV subtype. 36 With the increasing complexity of CIN management and declining hysterectomy rates, exploring additional biomarkers, such as the VMB for persistent HPV and carcinogenesis, is increasingly important in providing personalized care for women.
In future studies, we aim to corroborate whether the associations found here persist and to explore in detail the role of the HPV genotype in the relationship between race and VMB. To our knowledge, this is the first report on the absence of a protective effect of an optimal VMB against CIN. If corroborated, this might explain, in part, the differential risk of CIN by race.
Additionally, it could be a rationale for including VMB profiling as a decision-aid tool in management of women at risk of CIN.
Footnotes
Authors' Contributions
K.Y.T. was involved in conceptualization, project administration, and writing – original draft preparation; K.Y.T., B.Z., R.A.P., M.G.S., and G.A.B. were involved in methodology; K.Y.T., B.Z., and R.A.P. were involved in formal analysis; K.Y.T., B.Z., M.G.S., and S.S. were involved in data curation; K.Y.T., B.Z., M.G.S., S.S., J.F.S., G.A.B., and V.L.S. were involved in writing – review and editing; K.Y.T. and B.Z. were involved in visualization; G.A.B. and R.A.W. were involved in supervision; and K.Y.T., G.A.B., and R.A.W. were involved in funding acquisition. All the authors have read and agreed to the published version of the article.
Ethics Approval
Institutional Review Board Statement: The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (or Ethics Committee) of Virginia Commonwealth University (protocol codes and dates are as follows: HM15527 [last app April 6, 2022]; HM15528 [last March 16, 2022]; HM12169 [November 24, 2021]; and HM20010007 [September 23, 2021]).
Informed Consent Statement
Informed consent was obtained from all the subjects involved in the study.
Data Availability Statement
Data are available upon request from the corresponding author.
Author Disclosure Statement
G.A.B. reports an invention disclosure entitled “Vaginal microbiome markers for the prediction and prevention of preterm birth and other adverse pregnancy outcomes.” G.A.B. is a member of a Scientific Advisory Committee for Juno, LTD., a startup biotech firm focused on using the vaginal microbiome to address issues related to women's gynecologic and reproductive health. The authors declare that they have no conflicts of interest.
Funding Information
This research was funded by an American Cancer Society Institutional Research Grant, IRG-21-134-46, and National Institutes of Health grants: UH2/UH3AI083263, U54HD080784, R01 HD092415, and R01HD092415 05S1.
Supplementary Material
Supplementary Data S1
References
