Abstract
Objective:
To evaluate the responsiveness of the Endometriosis Health Profile-30 (EHP-30) and ascertain score changes that are indicative of response to treatment. This was a post hoc analysis of two Phase III, double-blind, placebo-controlled, randomized clinical trials among women with moderate-to-severe endometriosis-associated pain (Elaris Endometriosis I and II [EM-I and EM-II]).
Materials and Methods:
EHP-30 core items and sexual relationship module were administered at day 1, month 3 (M3), and month 6 (M6) to monitor patient-reported impacts of endometriosis-related pain. A seven-response level Patient Global Impression of Change (PGIC) was administered at M3 and M6. Dysmenorrhea (DYS), nonmenstrual pelvic pain (NMPP), and dyspareunia (DYSP) were collected using a daily diary. Three psychometric approaches were triangulated to suggest responder thresholds for the EHP-30 domains: anchor-based analyses, distribution-based analyses, and clinically relevant indicators (DYS, NMPP, DYSP).
Results:
EM-I and EM-II enrolled 871 and 815 women, respectively. All EHP-30 domains improved during the trials (M3, M6). Differences (p < 0.001) for all EHP-30 domains were found among the PGIC responses at M3 and M6, indicating greater change was associated with greater EHP-30 improvements. Large effect sizes were noted for all EHP-30 domains (EM-I range −0.59 to −1.80; EM-II range −0.52 to −1.59). EHP-30 thresholds of meaningful change ranged from −20 to −35, with greater changes indicating greater improvement in health status.
Conclusion:
Responder thresholds by EHP-30 domain are recommended to evaluate treatment efficacy. Clinicians can individualize goals of treatment by EHP-30 domain and track changes using the EHP-30.
Introduction
Endometriosis is a condition in which endometrial cells from the lining of the uterus grow outside the uterus. It is characterized by cardinal symptoms that include dysmenorrhea (DYS), dyspareunia (DYSP), chronic nonmenstrual pelvic pain (NMPP), and infertility. 1 This chronic gynecological disorder affects 6% to 10% of reproductive-age women and is present in 38% of women with infertility and up to 87% of women experiencing chronic pelvic pain. 2 There is no cure for endometriosis, but it can be treated with medication for pain management (nonsteroidal anti-inflammatories, opioids), hormonal drugs (oral contraceptives, progestins, danazol, gonadotropin-releasing hormone [GnRH] agonists), a nonpeptide GnRH antagonist, or the surgical removal of areas with endometriosis. 2 –4
Endometriosis is associated with a substantial impact on women's lives, including daily tasks, marital/sexual relationships, social life, and employment, as well as physical and psychological aspects of life. 5 –9 Additionally, as endometriosis symptom severity and the number of symptoms experienced increase, health-related quality of life (HRQL) decreases. 9
A recently published integrative review of the HRQL burden associated with endometriosis confirmed the HRQL impacts and burden of disease and noted both disease-specific and generic instruments that have been used to characterize the effect of endometriosis on HRQL. 10 One such disease-specific HRQL measure is the Endometriosis Health Profile-30 (EHP-30). The EHP-30 is a patient-reported outcome (PRO) measure that represents the patient's perspective about her experiences with the impacts of endometriosis. The reliability and validity of the EHP-30 have been assessed and affirmed. 11,12 However, criticisms of the EHP-30 evaluation include the small sample size used to evaluate the responsiveness and scoring thresholds, and that the social support domain failed to show responsiveness. 11 Responsiveness is the instrument's ability to detect change, that is, its ability to measure gains and losses within a concept. 13 The responder threshold is the score change that indicates a meaningful change.
Objectives of the post hoc analysis reported herein were motivated by the noted criticisms of the EHP-30. Hence, one objective was to evaluate the responsiveness of the EHP-30 using two elagolix Phase III clinical trial data sets. 4 A second objective was to identify the score change for the EHP-30 domains that would represent a meaningful HRQL change indicative of treatment benefit (“HRQL responder threshold”) through the triangulation of three psychometric approaches.
Materials and Methods
Data source and sample
Elagolix, a new class of medication to treat moderate-to-severe endometriosis-associated pain, was studied in two Phase III clinical trials and the EHP-30 was used to assess HRQL among the women. 4 Data were utilized from these two Phase III randomized, double-blind, placebo-controlled clinical trials among women with moderate-to-severe endometriosis-associated pain (Elaris Endometriosis I and II [EM-I and EM-II]; NCT01620528 and NCT01931670) for this post hoc analysis; sample sizes for these post hoc analyses are noted in the results tables. 4
The EM-I trial included 871 participants randomized and treated in one of three parallel dose groups in a 3:2:2 ratio to receive daily doses of either placebo (n = 374), elagolix 150 mg once daily (n = 249), or elagolix 200 mg twice a day (n = 248) for 6 months. The EM-II trial included 815 participants randomized and treated in one of three parallel dose groups in a 3:2:2 ratio to receive daily doses of either placebo (n = 360), elagolix 150 mg once daily (n = 226), or elagolix 200 mg twice a day (n = 229) for 6 months. In both trials, the primary endpoints were evaluated after 3 months of treatment but treatment continued for 6 months total; the trials included a post-treatment follow-up period for 12 months. Coprimary efficacy endpoints were the proportion of responders at month 3 based upon the mutually exclusive scales for daily assessment of DYS and NMPP measured by the endometriosis daily pain impact electronic diary (e-Diary). 14 Both trials included the EHP-30 among additional PRO measures to document and represent the patient's perspective and experience.
PRO measures
The EHP-30, a 30-item disease-specific HRQL instrument, was developed with input from patients obtained through qualitative interviews, and has undergone evaluations of reliability, validity, and responsiveness. 12 Minimally important EHP-30 score changes have been explored among 66 women in the United Kingdom with endometriosis who had undergone medical and surgical interventions, although complete data were available for only 40 women. 11 The EHP-30 core items include the following domains: pain; control and powerlessness; emotional well-being; social support; and self-image. The recall period is the last 4 weeks and the response options are Never, Rarely, Sometimes, Often, and Always. The score for each domain ranges from 0 to 100. Six supplementary modules were developed alongside the core items and can be administered with the EHP-30; one of these is the five-item sexual relationship module, which was included in EM-I and EM-II. The response options for the sexual relationship module additionally allow participants to select Not Relevant. The EHP-30 and sexual relationship module were administered at baseline and at months 1, 3, and 6 for EM-I and EM-II.
DYS and NMPP were assessed through patient report on a daily basis. The questions were mutually exclusive and based on whether the woman reported having her “period in the last 24 hours.” The questions were, “Choose the item that best describes your pain during the last 24 hours when you had your period” for DYS and “Choose the item that best describes your pain during the last 24 hours without your period” for NMPP. The response options and scoring were the same for DYS and NMPP. The response options included None (no discomfort), Mild (mild discomfort but I was easily able to do the things I usually do), Moderate (moderate discomfort or pain, I had some difficulty doing the things I usually do), and Severe (severe pain, I had great difficulty doing the things I usually do), and were assigned a score of 0, 1, 2, and 3, respectively. Baseline scoring was averaged over the 35 calendar days before and including the first study drug dose date. Subsequent monthly average pain scores for DYS and NMPP were averaged over the number of days when the participant reported DYS or NMPP pain within each respective time window.
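The diary scoring described above can be illustrated with a minimal Python sketch. The function name, the input layout (one entry per diary day, with `None` marking days on which the item did not apply), and the example data are hypothetical and are not taken from the trials. The sketch also assumes that every answered day, including a response of None (no discomfort), contributes to the average, which is one possible reading of the averaging rule.

```python
from statistics import mean

# Scores as described in the text: None = 0, Mild = 1, Moderate = 2, Severe = 3.
PAIN_SCORES = {"None": 0, "Mild": 1, "Moderate": 2, "Severe": 3}


def monthly_average_pain(daily_responses):
    """Average the scored diary entries within one time window.

    `daily_responses` is a hypothetical list with one entry per diary day:
    a response label, or Python None when the item did not apply that day
    (for example, the DYS item on a day without a period).
    Returns None when no applicable entries exist in the window.
    """
    scores = [PAIN_SCORES[r] for r in daily_responses if r is not None]
    return mean(scores) if scores else None


# Illustrative (made-up) week of DYS diary entries.
dys_window = ["Severe", "Moderate", None, None, "Mild", None, "Moderate"]
print(monthly_average_pain(dys_window))
```

The same function would apply to NMPP entries, because the response options and scoring are identical for the two items.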
The DYSP Diary was a single item that instructed participants to “Choose the item that best describes your pain during sexual intercourse during the last 24 hours.” The response options included Not Applicable (I was not sexually active for reasons other than my endometriosis or did not have sexual intercourse), Absent (no discomfort during sexual intercourse), Mild (I was able to tolerate the discomfort during sexual intercourse), Moderate (intercourse was interrupted due to pain), and Severe (I avoided sexual intercourse because of pain), and were assigned a score of 0 to 3 for absent to severe, respectively. Women who selected “Not Applicable” for 35 days before baseline were excluded from DYSP analyses.
The Patient Global Impression of Change (PGIC) item used in EM-I and EM-II stated, “Since I started taking the study medication, my endometriosis-related pain has”: very much improved, much improved, minimally improved, no change, minimally worse, much worse, very much worse. The item was scored 1 to 7 accordingly. The PGIC was administered monthly from month 1 to 6. This type of patient-reported rating of change has been recommended for use in endometriosis research, 15 and has been implemented in clinical research to evaluate the participant's perspective of change. 16 –18
Analysis
Descriptive statistics (mean, standard deviation [SD], range, frequencies for categorical data) were generated for sociodemographics, basic clinical characteristics, and PRO data at baseline (day 1). The responsiveness analysis and responder threshold determination included anchor-based, distribution-based, and clinical outcome analyses. The use of these multiple methodologic approaches, also called triangulation, 19,20 contributes to the cumulative evidence about a measure's responsiveness, and can be evaluated for a threshold, or score change that indicates a successful intervention.
Anchor-based analyses, using the PGIC values at month 3 as the anchor, were developed using general linear modeling. Pairwise comparisons between least square means were performed using Scheffe's test adjusting for multiple comparisons. Three different approaches were used to categorize the seven-level response options of the PGIC. In one model, all seven response options were evaluated separately. In another model, three categories were created: improved (very much improved; much improved; minimally improved); no change; and worsened (minimally worse; much worse; very much worse). Finally, values were stratified by two categories: improved (very much improved; much improved; minimally improved); and no change or worsened (no change; minimally worse; much worse; very much worse).
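The three PGIC categorizations can be sketched in Python as follows. The function names are hypothetical; the 1 to 7 coding follows the PGIC scoring described in the PRO measures section, and the two-level grouping places "no change" with the worsened responses, as in the Table 2 footnote. This is an illustration of the grouping logic only, not the SAS code used for the analysis.

```python
# PGIC coding as described in the text: 1 = very much improved ... 7 = very much worse.
PGIC_LABELS = {
    1: "very much improved",
    2: "much improved",
    3: "minimally improved",
    4: "no change",
    5: "minimally worse",
    6: "much worse",
    7: "very much worse",
}


def pgic_three_level(score):
    """Collapse a 1-7 PGIC score into improved / no change / worsened."""
    if score not in PGIC_LABELS:
        raise ValueError(f"PGIC score must be an integer from 1 to 7, got {score!r}")
    if score <= 3:
        return "improved"
    if score == 4:
        return "no change"
    return "worsened"


def pgic_two_level(score):
    """Collapse a 1-7 PGIC score into improved / no change or worsened."""
    if pgic_three_level(score) == "improved":
        return "improved"
    return "no change or worsened"


for s in (2, 4, 6):
    print(s, PGIC_LABELS[s], "|", pgic_three_level(s), "|", pgic_two_level(s))
```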
Distribution-based analyses involved two approaches using month 3 data. The first approach consisted of calculating 0.5 SD units at baseline for the EHP-30 domains. It has been suggested that one-half SD of a measure represents a clinically meaningful change. 21 The second strategy estimated the HRQL responder definition threshold as one standard error of measurement (SEM = baseline SD × √[1 − reliability]). 22 Change beyond one SEM has demonstrated correspondence with an important change in several other chronic diseases. 22 –25 The difference between day 1 and month 3 mean scores by domain was evaluated using Cohen's d. The distribution-based parameters were compared with anchor-based estimates to provide confidence in the HRQL responder threshold, with the distribution-based results considered as supportive to the anchor-based results.
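For illustration, the three distribution-based quantities can be computed as in the Python sketch below. The function names and the example scores are hypothetical; the reliability value of 0.91 is an arbitrary placeholder rather than a Cronbach's alpha from the trials. The effect size follows the Table 3 footnote (mean change score divided by the SD of the day 1 score), and the sample SD (n − 1 denominator) is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def half_sd(baseline_scores):
    """One-half of the baseline standard deviation."""
    return 0.5 * stdev(baseline_scores)


def standard_error_of_measurement(baseline_scores, reliability):
    """SEM = baseline SD x sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return stdev(baseline_scores) * sqrt(1.0 - reliability)


def effect_size(baseline_scores, followup_scores):
    """Mean change (follow-up minus baseline) divided by the baseline SD."""
    changes = [f - b for b, f in zip(baseline_scores, followup_scores)]
    return mean(changes) / stdev(baseline_scores)


# Made-up domain scores on the 0-100 scale for five participants.
day1 = [60, 70, 50, 80, 40]
month3 = [30, 40, 35, 45, 20]

print(round(half_sd(day1), 2))
print(round(standard_error_of_measurement(day1, reliability=0.91), 2))
print(round(effect_size(day1, month3), 2))
```

Because lower EHP-30 scores indicate better health status, an improvement yields a negative effect size in this sketch, matching the sign convention of the reported results.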
To assess the clinical relevance of EHP-30 score changes, general linear models with pairwise comparison between least square means were performed using Scheffe's test adjusting for multiple comparisons to evaluate the differences in EHP-30 domain scores by clinical outcome status. Specifically, participants were analyzed by their responder status for DYS, NMPP, and DYSP based on changes from baseline to month 3. 4 The relationship of the EHP-30 domain scores to the change in DYS and NMPP (day 1 to month 3) was assessed using Spearman's correlations.
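Spearman's correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal, dependency-free Python sketch is shown below; the helper names and the example values are hypothetical, and the sketch does not compute p-values. The analysis itself was run in SAS.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up change scores: EHP-30 pain domain versus mean DYS score.
ehp_pain_change = [-40, -25, -10, -35, 0]
dys_change = [-1.5, -1.0, -0.2, -0.9, 0.1]
print(round(spearman(ehp_pain_change, dys_change), 2))
```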
Through triangulation of these analytic approaches (that is, examining multiple results to determine a single estimate), a responder threshold for each EHP-30 domain and the sexual relationship module was ascertained. 19,20,26 All analyses were completed using SAS™ version 9.4. 27
Results
A total of 871 women participated in the EM-I trial and 815 in the EM-II trial. Sociodemographics, basic clinical characteristics, and PRO data at baseline are shown in Table 1. The detailed clinical characteristics are described elsewhere. 4 Large improvements in all EHP-30 domains were observed from baseline to month 3 and sustained through month 6.
Table 1. Sociodemographic Characteristics at Day 1, EM-I and EM-II
Comparisons based on chi-squared test for categorical variables, one-way ANOVA for continuous variables (p-values shown are from the overall F).
Description of pain due to endometriosis over the last 28 days when patient had her period (mutually exclusive with NMPP).
Description of pain due to endometriosis over the last 28 days when patient did not have her period (mutually exclusive with DYS).
Description of sexual intercourse pain in the past 35 days before the study visit. Response options include: Not Applicable; None; Mild; Moderate; and Severe.
BID, twice daily; DYS, dysmenorrhea; EM, endometriosis; NMPP, nonmenstrual pelvic pain; NSAID, nonsteroidal anti-inflammatory drug; QD, daily; SD, standard deviation.
The PGIC was used as an anchor to assess mean score changes in the EHP-30 domains and sexual relationship module. When using the seven response levels of the PGIC, the observed trend in score changes indicated that greater improvement in the EHP-30 domains and sexual relationship module was experienced by the patients who reported greater improvements on the PGIC. Of note, the sample sizes for some subgroups were small (e.g., only n = 4 for “very much worse” for the sexual relationship module in the EM-I dataset and n = 3 in the EM-II dataset), and their pairwise comparisons were not statistically significant (data not shown).
When a three-level grouping was used for the PGIC (improved, no change, and worsened), the same trends were identified; greater improvement in the PGIC also represented greater improvements in the EHP-30. To illustrate this, Figure 1 displays the cumulative distribution function plot for the change in EHP-30 pain domain score from day 1 to month 3 (EM-I and EM-II, Fig. 1a, b, respectively). The EHP-30 showed statistically significant differences (p < 0.0001) between the improved and no change groups and the improved and worsened groups for all domains in both trials (data not shown). However, there were no statistically significant differences between the no change and worsened groups for any of the domains.

Figure 1. Cumulative distribution function of the change in EHP-30 pain domain score from day 1 to month 3, by PGIC category: (a) EM-I; (b) EM-II.
For the two-level PGIC analysis, women were grouped into those reporting improvement (very much improved; much improved; minimally improved) and those reporting no change or worsening (no change; minimally worse; much worse; very much worse). The mean score changes are shown in Table 2. The greatest score changes were observed in the control and powerlessness domain (both trials), while the smallest score change in the two trials was in the self-image domain. Score changes ranged from −20.3 to −40.8 in EM-I and from −17.8 to −36.0 in EM-II.
Table 2. Endometriosis Health Profile-30 Domain Change Scores by Improved Versus No Change or Worsened Patient Global Impression of Change at Month 3
Change in PRO score calculated as month 3 score minus day 1 score.
The PGIC is a 7-point scale, which is categorized for these analyses. The sample distribution was assessed and the data were categorized into two groups: Improved (very much improved; much improved; minimally improved) and No Change or Worsened (no change; minimally worse; much worse; very much worse). Between-group comparisons were significantly different, p < 0.0001, for all domains and for both trials.
EHP-30, Endometriosis Health Profile-30; EM, endometriosis; LS, least square; PGIC, Patient Global Impression of Change; PRO, patient-reported outcome; SE, standard error.
The distribution-based estimates of responder definition are shown in Table 3. Large effect sizes were found for all EHP-30 domains (EM-I range −0.6 to −1.8; EM-II range −0.5 to −1.6). The SEM estimates for the six EHP-30 domains ranged from 4.9-point change on the pain domain (EM-I data) to 10.9-point change on the self-image domain (EM-II data). Baseline ½ SD values by domain were calculated and ranged from a 7.1-point change on the pain domain (EM-I data) to 14.2-point change on the self-image domain (EM-II data), which translates to a 7%–14% change in score. These estimates are smaller than those generated through anchor-based methods. While these smaller score changes would be easier to meet, they may not fully reflect the change needed for a participant to achieve a meaningful change, and are based entirely on statistical interpretation. As such, patient-based anchors are more suitable, as they are anchored with patient perception of change.
Table 3. Distribution-Based Estimates of Endometriosis Health Profile-30 Domain Change Scores
Calculated as month 3 score minus day 1 score.
Paired t-test comparing responses at day 1 and month 3.
Calculated as mean change score ÷ SD of day 1 score.
Calculated as SD PRO score at day 1 * SQRT (1 − reliability coefficient [Cronbach's alpha] of PRO at day 1 visit). The value of 1 SEM is considered to be a meaningful change using the one-SEM criterion.
Each domain has a 0–100 scale range, where 0 indicates the best health status.
SEM, standard error of measurement.
The correlations between all six of the EHP-30 domains and the change in DYS and NMPP from day 1 to month 3 were moderate and statistically significant (p < 0.0001). The EHP-30 domain correlation with the change in DYS was lowest on the self-image domain (EM-I 0.29; EM-II 0.17) and highest on the pain domain (EM-I 0.50; EM-II 0.41). The EHP-30 domain correlation with the change in NMPP was also lowest on the self-image domain (EM-I 0.34; EM-II 0.22) and highest on the pain domain (EM-I 0.56; EM-II 0.50).
The score changes from baseline to month 3 for each domain of the EHP-30 and the sexual relationship module were assessed by each of the DYS, NMPP, and DYSP responder groups. The assessment of DYS, NMPP, and DYSP responder groups showed that the responders, compared with nonresponders, consistently had a greater score change (i.e., greater improvement) for all EHP-30 domains and the sexual relationship module (p < 0.0001). The EM-I and EM-II score changes for DYS responders were pain −36.8, −34.5; control and powerlessness −46.4, −38.9; emotional well-being −25.7, −20.1; social support −27.7, −23.3; self-image −24.0, −19.7; and sexual relationship −27.5, −24.7. The NMPP and DYSP responder groups had similar EHP-30 domain and sexual relationship module change score values to the DYS responders (data not shown).
Triangulation of the anchor-based, distribution-based, and clinical relevance results led to the following recommended responder definition thresholds: pain −30, control and powerlessness −35, emotional well-being −20, social support −20, self-image −20, and sexual relationship −20 (Table 4).
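As an illustration of how these thresholds might be applied to an individual patient, the Python sketch below classifies a domain score change against the recommended values. The function name and example scores are hypothetical, and treating a change equal to the threshold as a response is an assumption, because the handling of boundary values is not specified here.

```python
# Recommended responder thresholds (change from baseline on the 0-100 scale).
# Lower scores indicate better health status, so improvement is negative.
RESPONDER_THRESHOLDS = {
    "pain": -30,
    "control and powerlessness": -35,
    "emotional well-being": -20,
    "social support": -20,
    "self-image": -20,
    "sexual relationship": -20,
}


def is_responder(domain, baseline_score, followup_score):
    """Return True when the score change meets or exceeds the domain threshold.

    Assumption: a change exactly equal to the threshold counts as a response.
    """
    change = followup_score - baseline_score
    return change <= RESPONDER_THRESHOLDS[domain]


# Made-up scores for one patient at day 1 and month 3.
print(is_responder("pain", baseline_score=70, followup_score=40))
print(is_responder("control and powerlessness", baseline_score=80, followup_score=50))
```

Keeping the thresholds in a per-domain mapping reflects the recommendation that responder definitions be domain specific rather than a single value for the whole instrument.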
Table 4. Proposed Endometriosis Health Profile-30 Responder Thresholds
Discussion
This analysis demonstrated the responsiveness of the EHP-30 associated with treatment with elagolix for moderate-to-severe endometriosis-associated pain. The analysis was conducted using patient-reported data on their perception of change at the 3-month time point. The 3-month time point was the trial program's primary endpoint, but data were also viewed at the 6-month time point, and the trends continued. In addition, this post hoc analysis used a triangulation approach to assess responder thresholds for each domain of the EHP-30 and the sexual relationship module.
There has been limited research assessing the responsiveness of the EHP-30 and modules, 11,28,29 and no study had yet examined the responder definition thresholds in a population with moderate-to-severe endometriosis-associated pain similar to the participants in EM-I and EM-II. The Dutch version of the EHP-30 was evaluated for responsiveness and meaningful score changes in a prospective study of 228 women with endometriosis and was found to be responsive to change. 28 The Dutch study used a global change in general health status item that grouped the five-point response options into three categories for analysis (much better and somewhat better = improved; about the same = no change; somewhat worse and much worse = deterioration). After 6 months, women reporting improvement (n = 80) had an improvement in their EHP-30 scores. The global change in general health item was used as an anchor to determine the minimal score changes that indicate improvement, which were much lower than those suggested in our research (e.g., they suggested EHP-30 change scores of: pain 11.5, control and powerlessness 12.5, emotional well-being 5.5, social support 10, self-image 3.2, and sexual relationship 17.5 as meaningful). Importantly, their research included women receiving a variety of endometriosis treatments (e.g., starting/stopping oral contraceptives; starting/stopping progestins; starting/stopping GnRH-analogs; coagulation; oophorectomy, etc.), who are not comparable to our study population with a standardized treatment regimen. 28
One strength of the study reported herein is that the sample size was substantially larger than that of Jones et al., 11 which only included 40 patients with complete data. Another strength was that the data were from two Phase III randomized, double-blind, placebo-controlled clinical trials, 4 as opposed to a convenience sample. 11 Unlike Jones et al., 11 this analysis demonstrated the responsiveness of all EHP-30 domains, including social support. Importantly, the interpretation of the responder threshold values is limited to moderate-to-severe endometriosis-associated pain, and different thresholds may apply to women with mild endometriosis-associated pain. One might wonder about the application of the threshold values in the EM-I and EM-II studies. The trials reported improvements in EHP-30 scores between baseline and month 3 with statistically significant mean differences between elagolix and placebo for three of six domains (EM-I) and four of six domains (EM-II). We have utilized the responder thresholds for the specific EHP-30 domains to compare treatment versus placebo groups; these results will be presented and discussed in a separate article. 4,30
Given the heterogeneity of the domain scores, it is recommended to have domain-specific responder threshold values rather than a “one number fits all” value. While most domains of the EHP-30 have a change score near 20, the pain domain suggests a higher score threshold of about 30 and the control and powerlessness domain even higher at about a 35-point decrease. These higher thresholds for the pain and the control and powerlessness domains are attributed to the fact that these domains are where women experienced the greatest improvements (and greatest change) with treatment, which is also reflected in the effect sizes and overall change scores.
Importantly, there is no single domain of the EHP-30 that represents greater value over the other domains from a clinical perspective; nor should there be. The value of each domain should be determined by the patient's perspective with the clinician's guidance on treatment options. Considering the patient's own biological and nonbiological characteristics to individualize treatment approaches has been explored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The NIDDK organized a series of meetings to explore patient outcomes and ways to individualize treatment for urinary incontinence. 31 Perhaps the NIDDK's efforts can be replicated among the gynecologic community. Similar to urinary incontinence, endometriosis has biological and nonbiological factors that influence treatment success. As there is no single endometriosis treatment, treatment should be individualized to the combination of these factors and the patient's unique beliefs, values, and goals. Clinicians can use the EHP-30 as a tool to mark progress toward the patient's individualized goals. Patient preference for the area of improvement can vary based on each woman's specific values and experiences. Therefore, the focus cannot be placed on the performance of one domain over others.
One aspect of patient-focused drug development is collecting the patient's experiences, including impacts of the condition during the clinical intervention. Having data directly from the patient on how they feel and function is critical. Guidance on the development of such measures has been put forth by the Food and Drug Administration and industry professional societies. 13,32,33 Personalized medicine has traditionally been focused on disease efficacy and balancing potential side effects. 34 But what about focusing on the outcomes that are important to women with the disease? There has been a call for more attention to psychological aspects of endometriosis care because pain intensity and preoccupation with pain were each found to be independently associated with HRQL. 35 The preoccupation with pain, or pain cognition, was measured using three patient-reported questionnaires: the Pain Catastrophizing Scale; the Pain Vigilance and Awareness Questionnaire (PVAQ); and the Pain Anxiety Symptoms Scale (PASS). Endometriosis treatment should include personalized goals for women and the EHP-30 can be used as a tool to measure progression toward patient goals of endometriosis treatment. Patients can identify the domain that is most important to them, and the clinician can monitor the progress toward improvement using the EHP-30.
This study defines EHP-30 domain score changes that indicate a clinically meaningful change in a patient's HRQL. The responsiveness analysis demonstrates that the EHP-30 domain scores can be used to monitor a patient's movement toward treatment goals, supporting a clinician's personalized medicine approach when setting treatment goals. Additionally, the findings of this research support the recommendation that the EHP-30 responder thresholds should be unique to each domain. The responder thresholds are indicative of treatment efficacy on impacts of endometriosis-associated pain, which has value in both clinical trials and community-based practice.
Author Contributions
All authors listed meet the requirements for authorship, had involvement in the conceptualization and writing of this article, believe that this article adheres to ICMJE requirements, and have read and approved this final version of the article and the authorship list.
Footnotes
Acknowledgments
Medical writing services were provided by Rebecca Speck, PhD, MPH. Financial support for medical writing was provided by AbbVie, Inc. to Evidera.
Compliance with Ethical Standards
The studies were conducted in accordance with the Declaration of Helsinki, local independent Ethics Committee/Institutional Review Board requirements, and good clinical practice guidelines. Informed consent was obtained from all individual participants included in the study. AbbVie is committed to responsible data sharing regarding the clinical trials we sponsor. This includes access to anonymized, individual, and trial-level data (analysis data sets), as well as other information (e.g., protocols and Clinical Study Reports), as long as the trials are not part of an ongoing or planned regulatory submission. This includes requests for clinical trial data for unlicensed products and indications. These clinical trial data can be requested by any qualified researchers who engage in rigorous, independent scientific research, and will be provided following review and approval of a research proposal and Statistical Analysis Plan (SAP) and execution of a Data Sharing Agreement (DSA). Data requests can be submitted at any time and the data will be accessible for 12 months, with possible extensions considered. For more information on the process, or to submit a request, visit the following link:
Author Disclosure Statement
A.M.S. and M.C.S. are employees of and own stock/stock options in AbbVie. R.P., K.S.C., and J.C. are employees of Evidera, who were paid consultants to AbbVie in connection with this study. H.S.T. received grant support and consulting fees from Pfizer and Ovascience, and consulting fees from Bayer, AbbVie, and Perrigo. Evidera received funding from AbbVie to conduct the study and develop this article. Publication of study results was not contingent on the sponsor's approval or censorship of the article.
Funding Information
AbbVie, Inc. funded the study and participated in the study design, research, analysis, data collection, interpretation of data, reviewing, and approval of the publication.
