Abstract
The Tshepo study was the first clinical trial to evaluate outcomes of adults receiving nevirapine (NVP)-based versus efavirenz (EFV)-based combination antiretroviral therapy (cART) in Botswana. This was a 3 year study (n=650) comparing the efficacy and tolerability of various first-line cART regimens, stratified by baseline CD4+: <200 (low) vs. 201-350 (high). Using targeted maximum likelihood estimation (TMLE), we retrospectively evaluated the causal effect of assigned NNRTI on time to virologic failure or death [intent-to-treat (ITT)] and time to minimum of virologic failure, death, or treatment modifying toxicity [time to loss of virological response (TLOVR)] by sex and baseline CD4+. Sex did significantly modify the effect of EFV versus NVP for both the ITT and TLOVR outcomes with risk differences in the probability of survival of males versus the females of approximately 6% (p=0.015) and 12% (p=0.001), respectively. Baseline CD4+ also modified the effect of EFV versus NVP for the TLOVR outcome, with a mean difference in survival probability of approximately 12% (p=0.023) in the high versus low CD4+ cell count group. TMLE appears to be an efficient technique that allows for the clinically meaningful delineation and interpretation of the causal effect of NNRTI treatment and effect modification by sex and baseline CD4+ cell count strata in this study. EFV-treated women and NVP-treated men had more favorable cART outcomes. In addition, adults initiating EFV-based cART at higher baseline CD4+ cell count values had more favorable outcomes compared to those initiating NVP-based cART.
Introduction
A significant proportion of the 6.6 million persons receiving combination antiretroviral therapy (cART) in low- and middle-income countries of the world reside in sub-Saharan Africa. 1 Large numbers of national initiatives offering public nonnucleoside reverse transcriptase inhibitor (NNRTI)-based cART have commenced in the region with preliminary data documenting impressive efficacy outcomes among the vast majority of cART-treated adults. 2 –6 In resource-rich settings, based on available data from numerous clinical trials, 7 –13 efavirenz (EFV) is the NNRTI of choice, and is “preferred” for first-line cART, along with the NRTIs tenofovir (TDF) and emtricitabine (FTC). 7 This recommendation is based on efficacy and more favorable tolerability data. 7 –13 In resource-limited settings, however, the majority of cART-treated adults are female 2 –6,14 –18 and have been prescribed nevirapine (NVP)-based cART regimens due to the potential teratogenic effects of EFV. Family planning considerations in sub-Saharan Africa also strongly influence the choice of NNRTI, especially as pregnancy rates among cART-treated women are high and EFV is limited to women committing to using at least two reliable contraceptive methods.
The 2NN trial, 19 a large adult randomized trial, compared 1216 adults receiving stavudine (d4T) plus lamivudine (3TC) with either NVP or EFV in North and South America, Australia, Europe, South Africa, and Thailand. The investigators found non-inferiority (NVP vs. EFV) in their primary outcome of virologic failure. Additional 2NN analyses, 19 however, showed an association between NVP and higher rates of serious toxicity. The CPCRA 058 and INSIGHT study 20 team, reporting randomized clinical trial data from NNRTI-treated adults in the United States and Western Europe, however, did show inferiority in the primary endpoint, namely higher rates of virologic failure with and without resistance among NVP-treated vs. EFV-treated patients. Wester et al. 14 performed an analysis of the Adult Antiretroviral Treatment and Drug Resistance (“Tshepo”) study in Botswana using Cox proportional hazards analysis and concluded that there was no significant difference by assigned NNRTI in time to death or virologic failure. Women receiving NVP-based cART, however, trended toward higher virological failure rates when compared to EFV-treated women, Holm-corrected log-rank p-value=0.072. 14 There were no differences among men. 14 Furthermore, they concluded that individuals treated with NVP had significantly shorter times to treatment modifying toxicity when compared to those receiving EFV-based cART. 14
Current methods used to evaluate effect modification (often referred to as statistical interaction) in time to event data, such as the Cox proportional hazards model, posit highly restrictive parametric models and attempt to estimate parameters that are specific to the model proposed. 21 These methods tend to be biased and force providers to estimate parameters out of convenience rather than what they are actually interested in. The targeted maximum likelihood estimation (TMLE) methodology, originally proposed by van der Laan and Rubin, 21,22 and applied to time to event outcomes by Moore and van der Laan, 23 improves on the currently implemented methods in both robustness (its ability to provide unbiased estimates) and flexibility (allowing providers to estimate parameters of direct interest to them).
Evaluating data from the recently completed Tshepo study, a clinical trial comparing the efficacy and tolerability of various first-line cART regimens in Botswana, we compared TMLE results to results obtained from conventional Cox proportional hazards analyses. In doing so, we aimed to definitively evaluate the causal effect of NNRTI treatment and effect modification by sex and baseline CD4+ cell count on the time to virologic failure or death [intent-to-treat (ITT)] and the time to minimum of virologic failure, death, or treatment modifying toxicity [time to loss of virological response (TLOVR)], the preferred FDA outcome in this unique sub-Saharan African clinical trial population.
Materials and Methods
Study design
Utilizing outcomes data from the recently completed Adult Antiretroviral Treatment and Drug Resistance (“Tshepo”) study, a clinical trial comparing the efficacy and tolerability of various first-line cART regimens in Botswana in which patients were stratified by baseline CD4+ cell count [<200 cells/mm3 (low) versus 201–350 cells/mm3 (high)], we retrospectively compared TMLE results to results obtained from conventional Cox proportional hazards analyses.
The Tshepo study was an open-label, randomized, 3×2×2 factorial design study conducted in Gaborone, Botswana to evaluate the efficacy, tolerability, and development of drug resistance of six different first-line cART regimens: zidovudine (ZDV), lamivudine (3TC), and nevirapine (NVP) (Arm A); zidovudine (ZDV), lamivudine (3TC), and efavirenz (EFV) (Arm B); zidovudine (ZDV), didanosine (ddI), and nevirapine (NVP) (Arm C); zidovudine (ZDV), didanosine (ddI), and efavirenz (EFV) (Arm D); stavudine (d4T), lamivudine (3TC), and nevirapine (NVP) (Arm E); and stavudine (d4T), lamivudine (3TC), and efavirenz (EFV) (Arm F). The study also compared two different adherence strategies: standard-of-care (SOC) versus SOC plus community-based supervision (Com-DOT) to determine the optimal means of promoting adherence among adults receiving first-line cART.
Participants were assigned in equal proportions (in an open-label, unblinded fashion) to one of six initial treatment arms and one of two adherence arms using permuted block randomization. Randomization was stratified by CD4+ cell count (less than 200 cells/mm3, 201–350 cells/mm3) and by whether the participant had an adherence assistant. Half of the participants were enrolled in each CD4+ cell count stratum, but there were no restrictions on whether they had an adherence assistant prior to study enrollment.
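As an illustration of the allocation scheme described above, the following is a minimal Python sketch of stratified permuted block randomization. It is not the study's actual randomization procedure: the block size of 12 (each of the six regimens paired once with each of the two adherence strategies), the stratum labels, and the seed are assumptions made only for this example.

```python
import random
from itertools import product

NNRTI_ARMS = ["A", "B", "C", "D", "E", "F"]   # six cART regimens named in the text
ADHERENCE_ARMS = ["SOC", "Com-DOT"]           # two adherence strategies named in the text


def permuted_block(rng):
    """Return one shuffled block holding every regimen/adherence pair once.

    A block of 12 is an assumption; the block size used in the study is not stated.
    """
    block = list(product(NNRTI_ARMS, ADHERENCE_ARMS))
    rng.shuffle(block)
    return block


def make_allocator(seed):
    """Build an allocator that keeps a separate block sequence per stratum."""
    rng = random.Random(seed)
    pending = {}  # stratum -> assignments still unused in the current block

    def assign(stratum):
        if not pending.get(stratum):      # no block yet, or block exhausted
            pending[stratum] = permuted_block(rng)
        return pending[stratum].pop()

    return assign


assign = make_allocator(seed=2002)        # arbitrary seed for reproducibility
# Hypothetical stratum label: (CD4+ stratum, has an adherence assistant)
first_block = [assign(("low", False)) for _ in range(12)]
```

Within any completed block every arm combination appears exactly once, which is what keeps the arms balanced inside each stratum as enrollment proceeds.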
The primary endpoints of the study were development of virologic failure with genotypic drug resistance and development of treatment-related toxicity, as defined by first incidence of a grade 3 or higher adverse event. Secondary endpoints were death for any reason and time to nonadherence, as estimated by an adherence rate of less than 90%. For additional study details, please refer to the previously published article by Wester et al. 14
The study was approved by the institutional review boards of the Botswana Ministry of Health (Health Research Development Committee) and the Harvard School of Public Health (Human Subjects Committee) and written informed consent was obtained from all participants.
Study population
Adult (≥18 years of age), HIV-1 infected, cART-naive Botswana citizens who attended one of the five ART screening clinics in Gaborone were approached for possible enrollment. All potentially eligible adults had to qualify for cART based on existing Botswana national ARV treatment guidelines 24 –26 of having an AIDS-defining illness and/or CD4+ cell count ≤200 cells/mm3 or meet the study's eligibility criteria of a CD4+ cell count between 201 and 350 cells/mm3 with a plasma HIV-1 RNA level greater than 55,000 copies/ml. Inclusion criteria were hemoglobin value >8.0 g/dl; absolute neutrophil count ≥1.0×10^3/mm3; aminotransferase levels less than five times the upper limit of normal; and for women of child-bearing potential, a willingness to maintain active contraception throughout the duration of the study and a negative urine pregnancy test within 14 days of study enrollment. Exclusion criteria were poor Karnofsky performance score (40 or below); an AIDS-related malignancy other than mucocutaneous Kaposi's sarcoma; grade 2 or higher peripheral neuropathy; major psychiatric illness; and for women, actively breastfeeding or less than 6 months postpartum.
Data collection and follow-up
Clinical and adherence assessments were done monthly at the study clinic. To monitor treatment efficacy, CD4+ cell counts (FACS Calibur flow cytometer, Becton Dickinson, San Jose, CA) and plasma HIV-1 RNA levels (Amplicor HIV-1 Monitor test, version 1.5 Roche Diagnostics Systems, Branchburg, NJ) were obtained at enrollment and then every 2 months for the duration of the study. Laboratory safety monitoring included comprehensive chemistry and full blood count specimens at study enrollment, then every month for the first 6 months of the study, every 2 months during months 6–12 of study participation, and every 4 months during the remainder of participation. In addition, all patients had lipid chemistries performed at baseline and then every 6 months. Comprehensive care for study participants was provided in accordance with existing national policy and was free of charge. 25,26
Statistical considerations
Conventional time-to-event results, performed utilizing Cox proportional hazards models, were compared to results obtained utilizing the TMLE methodology. 21
The TMLE estimates that are presented represent the mean difference in the marginal additive risk in the probability of survival and the mean difference in marginal log relative hazard between levels of the possible effect modifier, each taken over the first 34 months following randomization. 21 The difference in the additive risk at the final time point, 34 months, was also calculated (data not shown). For all computations, the treatment variable A was set as for analyzing the effect of NVP versus EFV treatment. The effect modifier variable, V, equaled 1 for males and 0 for females. Thus, a negative mean difference in the marginal difference in the log relative hazard and a positive mean difference in the additive risk indicated that EFV was having a larger beneficial causal effect for females. 21
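To make the two effect-modification summaries concrete, the sketch below shows how they could be formed once treatment-specific survival curves are available for each level of V. It does not implement TMLE itself. The definition of the marginal log relative hazard as log[log S1(t) / log S0(t)] is an assumption about the parameter described in reference 21, the function names are hypothetical, and the survival probabilities are invented purely for illustration.

```python
import math


def additive_risk(s_treated, s_control):
    """Difference in survival probability at each time point."""
    return [s1 - s0 for s1, s0 in zip(s_treated, s_control)]


def log_relative_hazard(s_treated, s_control):
    """log(log S1(t) / log S0(t)) at each time point (assumed definition)."""
    return [math.log(math.log(s1) / math.log(s0))
            for s1, s0 in zip(s_treated, s_control)]


def effect_modification(curves_v1, curves_v0, effect):
    """Mean over time of effect(V=1) minus effect(V=0).

    Each curves_* argument is a (treated, control) pair of survival curves
    evaluated on the same grid of time points.
    """
    e1 = effect(*curves_v1)
    e0 = effect(*curves_v0)
    diffs = [a - b for a, b in zip(e1, e0)]
    return sum(diffs) / len(diffs)


# Invented survival probabilities at three time points (not study data).
v1_curves = ([0.95, 0.90, 0.85], [0.96, 0.93, 0.90])   # V = 1
v0_curves = ([0.97, 0.95, 0.93], [0.94, 0.89, 0.84])   # V = 0

risk_diff = effect_modification(v1_curves, v0_curves, additive_risk)
log_rh_diff = effect_modification(v1_curves, v0_curves, log_relative_hazard)
```

With these made-up curves the mean difference in additive risk is about −0.09 while the mean difference in log relative hazard is positive, illustrating why the two summaries take opposite signs when they point to the same direction of effect modification.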
Causal parameters were estimated for numerous outcomes of interest. For purposes of these analyses, we focused on the following two outcomes: (1) the time to minimum of virologic failure or death, censored by the end of study [which we will refer to as the “intent-to-treat” (ITT) outcome] and (2) the time to minimum of virologic failure, death, or treatment modification, censored by the end of study. The latter is the preferred FDA outcome for evaluating the efficacy and safety of a particular treatment; it is commonly referred to as the “time to loss in virologic response,” and we will refer to it as the “TLOVR” outcome. For each outcome, we addressed the following causal questions: (1) Was the effect of NNRTI-based cART different by sex? and (2) Was there causal effect modification of NNRTI-based cART by baseline CD4+ cell count?
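The two composite outcomes defined above can be sketched in Python as follows. This is only an illustration of the definitions, not the study's data-processing code: the function names are hypothetical, times are assumed to be recorded in months since randomization, and `None` is assumed to mean that the component event never occurred.

```python
def time_to_first_event(event_times, end_of_study):
    """Return (time, observed) for the earliest event, censored at end_of_study.

    event_times holds one entry per component event; None means the event
    did not occur during follow-up.
    """
    occurred = [t for t in event_times if t is not None and t <= end_of_study]
    if occurred:
        return min(occurred), True
    return end_of_study, False


def itt_outcome(virologic_failure, death, end_of_study):
    """ITT: minimum of virologic failure or death."""
    return time_to_first_event([virologic_failure, death], end_of_study)


def tlovr_outcome(virologic_failure, death, treatment_modification, end_of_study):
    """TLOVR: minimum of virologic failure, death, or treatment modification."""
    return time_to_first_event(
        [virologic_failure, death, treatment_modification], end_of_study)


# Hypothetical participant: regimen modified at month 3, virologic failure at month 20.
itt = itt_outcome(virologic_failure=20, death=None, end_of_study=34)
tlovr = tlovr_outcome(virologic_failure=20, death=None,
                      treatment_modification=3, end_of_study=34)
```

For this hypothetical participant the ITT event time is month 20 and the TLOVR event time is month 3, which shows how early treatment modification can drive the TLOVR outcome while leaving the ITT outcome unchanged.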
All statistical analyses were conducted using R statistical software.
Results
Baseline characteristics
Overall, 650 adults were enrolled, 451 (69.4%) of whom were female. The median age was 33.3 years [IQR 28.9–38.7]. Forty-three percent had advanced WHO clinical disease (Stage 3 or 4) at the time of enrollment. Of patients, 330 (50.9%) were enrolled in the lower CD4+ stratum with a median CD4+ of 137 cells/mm3, and 320 (49.1%) were enrolled in the upper CD4+ cell count stratum (CD4+ cell count value between 201 and 350 cells/mm3 and plasma HIV-1 RNA >55,000 copies/ml) with a median CD4+ of 252 cells/mm3. Baseline characteristics of patients in the NVP vs. EFV arms were evenly balanced at entry, with 325 patients randomized to each NNRTI arm (Table 1). Females differed significantly from enrolled males in that they were younger, had lower body weights (and body mass indexes), as well as lower hemoglobin values, as expected (Table 2). Baseline characteristics of the entire study population by baseline CD4+ cell count strata (low versus high) are shown in Table 3.
Table 1. Baseline Characteristics of Study Population by NNRTI Assignment [Nevirapine (NVP) Versus Efavirenz (EFV)-Based Combination Antiretroviral Therapy]
BMI, body mass index; IQR, interquartile range; TB, tuberculosis.
Table 2. Baseline Characteristics of Study Population by Sex
Categorical characteristics were compared by sex using chi-square tests; continuous variables were compared by sex using a two-sample rank sum test.
Continuous variables are reported as medians (interquartile range).
Table 3. Baseline Characteristics of Study Population by Baseline CD4+ Cell Count Strata [Namely, Low (CD4+ Cell Count ≤200) versus High (CD4+ Cell Count 201–350)]
Categorical characteristics were compared by CD4+ cell count stratum using chi-square tests; continuous variables were compared by stratum using a two-sample rank sum test.
Continuous variables are reported as medians (interquartile range).
Causal effect modification by sex
Sex did significantly modify the effect of NVP versus EFV for the ITT and TLOVR outcomes.
For the ITT outcome, namely, the time to minimum of virologic failure or death censored by the end of study, the Cox proportional hazards estimate was −1.16, with a corresponding standard error of 0.45; p-value=0.011; and using TMLE, the mean marginal log relative hazard was −1.13 (standard error=0.49; p-value=0.021) and risk difference in probability of survival=0.062 (standard error=0.025; p-value=0.015).
For the TLOVR outcome, namely, the time to minimum of virologic failure, death, or treatment modification censored by the end of study, the Cox proportional hazards estimate was −0.95, with a corresponding standard error of 0.35; p-value=0.007; and using TMLE, the mean marginal log relative hazard was −0.82 (standard error=0.33; p-value=0.013) and risk difference in probability of survival=0.116 or approximately 12% (standard error=0.035; p-value=0.001).
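The relationship between the estimates, standard errors, and p-values reported above can be checked with a two-sided Wald test under a normal approximation. The sketch below assumes that this is how the p-values were obtained, which the text does not state; because the published estimates and standard errors are rounded, recomputed p-values may differ from the reported ones in the third decimal place.

```python
import math


def wald_p_value(estimate, standard_error):
    """Two-sided p-value for H0: parameter = 0 under a normal approximation."""
    z = estimate / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))


# (estimate, standard error) pairs taken from the effect modification by sex results.
reported = {
    "ITT, TMLE log relative hazard": (-1.13, 0.49),
    "ITT, TMLE risk difference": (0.062, 0.025),
    "TLOVR, Cox": (-0.95, 0.35),
    "TLOVR, TMLE log relative hazard": (-0.82, 0.33),
    "TLOVR, TMLE risk difference": (0.116, 0.035),
}

recomputed = {name: wald_p_value(est, se) for name, (est, se) in reported.items()}
```

Printing `recomputed` allows a side-by-side comparison with the published p-values for each estimate.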
Causal effect modification by baseline CD4+ cell count
We then sought to determine whether there was causal effect modification due to baseline CD4+ cell count level [low (<200 cells/mm3) versus high (201–350 cells/mm3)] for our two specified outcomes.
Evaluating the ITT outcome, the Cox proportional hazards estimate was 0.403, with a corresponding standard error of 0.402; p-value=0.320; and using TMLE, the mean difference in marginal log relative hazard was 0.859 (standard error=0.468; p-value=0.066) and mean difference in additive risk in probability of survival=–0.053 (standard error=0.028; p-value=0.089) (Table 2).
For the TLOVR outcome, the Cox proportional hazards estimate was 0.675, with a corresponding standard error of 0.317; p-value=0.033; and using TMLE, the mean difference in marginal log relative hazard was 0.829 (standard error=0.356; p-value=0.020) and mean difference in additive risk of probability of survival=–0.115 (standard error=0.051; p-value=0.023). The Cox proportional hazards estimate was thus significantly different from zero, but the TMLE estimate had a smaller p-value.
Figure 1 depicts the significant effect modification by baseline CD4+ cell count (high vs. low) observed for the TLOVR outcome. Not only is the effect of treatment, EFV vs. NVP, among randomized study subjects having high baseline CD4+ cell counts (201–350 cells/mm3) different from the effect seen among randomized study subjects having low (<201 cells/mm3) baseline CD4+ cell counts, the effects are in opposite directions. Adults with high baseline CD4+ cell counts who receive NVP-based cART have a higher probability of experiencing virologic failure, death, and/or treatment modification (TLOVR outcome). Conversely, adults with low baseline CD4+ cell counts who receive EFV-based cART have a higher probability of experiencing these same adverse outcomes. Furthermore, these figures depict the double robustness of the TMLE methodology. Figure 1A used data adaptive methods to fit the initial hazard regression for all combinations of baseline CD4+ and treatment levels, while Fig. 1B used an intentionally misspecified initial estimate that was the same for all four groups. Yet Fig. 1B shows that the effect modification was still detected, since the censoring mechanism and propensity scores were known to be correctly specified.

Causal effect modification by baseline CD4+ cell count: low (CD4+ cell count <201 cells/mm3) versus high (CD4+ cell count 201–350 cells/mm3) for the TLOVR Outcome (Time to Virologic Failure, Death, or Treatment Modifying Toxicity). The survival curve in
Discussion
Our study compared the effectiveness of NVP-based versus EFV-based cART among a large group of adults in Botswana. Published outcomes using conventional Cox proportional hazards showed that females receiving NVP-based cART trended toward having higher virologic failure rates compared to EFV-treated women. 14 This was most likely due to the fact that NVP-treated adults tended to modify treatment sooner than EFV-treated adults due to the significantly higher treatment-modifying toxicity rates among NVP-treated (vs. EFV-treated) patients. 14
Using TMLE, we further evaluated study outcomes, specifically evaluating the causal effect of assigned NNRTI on the time to virologic failure or death (ITT outcome) and the time to minimum of virologic failure, death, or treatment modifying toxicity (TLOVR outcome) by sex and baseline CD4+ cell count [high (201–350 cells/mm3) versus low (<201 cells/mm3)].
Evaluating for possible effect modification by sex, we found that sex does modify the effect of EFV versus NVP on the time to death and the time until TLOVR. EFV-treated women tended to have more favorable outcomes, whereas among males it was the NVP-treated who had more favorable outcomes. For the TLOVR outcome, the average causal risk difference between the effect in males versus the effect in females was 12% (p-value=0.001). These results substantiate the borderline statistically significant results obtained among NNRTI-treated females with virologic failure using conventional Cox proportional hazards techniques. The main reason for this was that NVP-treated females tended to modify their cART regimens at significantly higher rates than the other treated groups. The major difference in treatment modification rates tended to occur almost immediately after initiating NVP-based cART. However, treatment modification does not explain the entire effect modification that was observed, because the effect modification by sex was also statistically significant for the outcome that included only virologic failure and death. In fact, for that outcome, the average risk difference between the effect in males versus females was approximately 6% (p-value=0.015).
Evaluating for possible effect modification by baseline CD4+ cell count, we found that baseline CD4+ modified the effect of NVP versus EFV on the time until death and the TLOVR outcome. The effect modification for the outcome that included both virologic failure and death was close to significant. EFV tended to be favorable compared to NVP among cART-treated adults having higher (201–350 cells/mm3) baseline CD4+ cell count values. Among randomized patients having low baseline CD4+ cell count values (<200 cells/mm3), there was no significant difference in the treatment survival curves. For the TLOVR outcome, the average risk difference between the effect in the high CD4+ cell count stratum versus the low CD4+ stratum was 12% (p-value=0.023).
One possible explanation for this is that NVP-treated adults have higher overall rates of treatment modifying toxicities, which occur along a full continuum of CD4+ cell count ranges, particularly high (>250 cells/mm3) CD4+ cell count values, some of which may be life-threatening (i.e., hepatotoxicity and/or cutaneous hypersensitivity reactions). Such toxicities may be particularly problematic among persons with higher CD4+ cell counts, resulting in higher rates of treatment modifying toxicities as such persons generally feel healthier and experience far fewer comorbidities and are therefore more likely to report earlier grade toxicities and have lower thresholds to request a change in their treatment.
Limitations include the fact that there were very few deaths among EFV-treated adults having high baseline CD4+ cell count values. NVP-treated patients having less pronounced immunosuppression, namely, having high baseline CD4+ cell counts (201–350 cells/mm3), however, tended to die at a higher frequency and sooner compared to those having more advanced baseline immunosuppression (i.e., having low baseline CD4+ cell counts, in the less than 201 cells/mm3 range).
Conclusions
This is the first study to date comparing time to event parameter estimates obtained using the conventional Cox proportional hazards model to TMLE among clinical trial participants in sub-Saharan Africa. As we previously indicated, TMLE is an efficient and robust method for estimating causal effect parameters that directly answer a specific scientific question of interest. TMLE allows for the clinically meaningful delineation and interpretation of the causal effect of NNRTI treatment and effect modification by sex and baseline CD4+ cell count strata in this recently completed study. As the majority of cART-treated adults in resource-limited settings are female, which is in sharp contrast to male predominant U.S. and Western European cohorts, it will be of paramount importance to continue to evaluate for possible effect modification by relying on a statistical methodology that provides desired and more importantly readily interpretable parameter estimates.
In our urban Botswana setting, EFV-treated women and NVP-treated men had more favorable cART outcomes. In addition, adults initiating EFV-based cART at higher baseline CD4+ cell count values had more favorable outcomes compared to those initiating NVP-based cART. Based on these findings, policymakers in sub-Saharan Africa may want to consider restricting NVP use to adult males having baseline CD4+ cell counts less than 200 cells/mm3. Additional studies including longer-term follow-up of larger numbers of patients are clearly warranted, as such information will greatly inform such potential policy decisions.
Acknowledgments
We would like to formally acknowledge the Botswana Ministry of Health, the Princess Marina Hospital administration, outpatient adult Infectious Disease Care Clinic (IDCC), and inpatient Medical Ward teams, the entire Adult Antiretroviral Treatment and Drug Resistance (Tshepo) study team, and our funder, the Bristol-Myers Squibb foundation for their support of this research initiative. We also want to formally acknowledge and thank all adult study participants.
The project described was also supported by the following research grants from the National Institute of Allergy and Infectious Diseases, K23AI073141 (PI: C. William Wester, MD, MPH) and P30AI 060354 (PI: C. William Wester, MD, MPH), Harvard Center for AIDS Research (CFAR) grant. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institutes of Health. Portions of the work in this article were also supported in part by the Russell M. Grossman Endowment, NIEHS Grant ES015493, and NIH Grants R01 AI074345-04 and R01 AI51164.
Author Disclosure Statement
No competing financial interests exist.
