The Cancer-Specific Health Economic Measure QLU-C10D is Valid and Responsive for Assessing Health Utility in Patients with Thyroid Cancer

Abstract

Background:

Health economic appraisals often rely on the assessment of health utilities using preference-based measures (PBM). The cancer-specific PBM, European Organisation for Research and Treatment of Cancer Quality of Life Utility — Core 10 Dimensions (EORTC QLU-C10D), was developed recently, and now needs to be validated in various clinical populations.

Methods:

In a multicenter, multinational prospective cohort study, we longitudinally collected EORTC QLQ-C30 and EQ-5D-5L data from patients with thyroid cancer. We applied seven country-specific value sets to the QLQ-C30 data to derive country-specific utility values and used the EQ-5D-5L as a comparator PBM. Criterion validity was assessed by correlating index scores and inspecting Bland–Altman plots. Construct validity was investigated by correlating domain scores. Known-group comparisons and responsiveness were assessed using external clinical criteria.

Results:

A total of 186 patients with thyroid cancer from nine countries (three continents) provided analyzable data. Patients were included if they had differentiated, medullary, or anaplastic thyroid cancer. Mean utility values of both instruments were generally lower than general population norms. No floor or ceiling effects were present for the QLU-C10D. The intra-class correlation for EQ-5D-5L and QLU-C10D index values ranged from 0.761 to 0.901 across the measurement timepoints, supporting criterion validity. Spearman’s correlation coefficients ranged from 0.289 to 0.716 for theoretically corresponding domain pairs. The QLU-C10D detected differences in 9 of 15 known-group comparisons, supporting sensitivity. Clinically important changes were detected by all QLU-C10D country-specific value sets, supporting responsiveness. Further, the QLU-C10D had higher statistical efficiency than the EQ-5D-5L in 74.7% of comparisons.

Conclusions:

The QLU-C10D is a valid PBM for health economic evaluations in thyroid cancer studies. We recommend its use to estimate health utilities in economic evaluations of thyroid cancer therapies.

Introduction

Health care system decision-makers endeavor to examine health technology performance to ensure evidence-based practices and improve the overall quality of care.¹ An increasingly important tool in this endeavor is cost-utility analysis (CUA).² CUAs provide policymakers with direct comparisons of the clinical benefits and the financial impact of alternative healthcare strategies.² This frequently includes the assessment of health-related quality of life (HRQoL). CUAs typically attempt to achieve this using quality-adjusted life years (QALYs) as the outcome measure. QALYs merge the complementary concepts of length and quality of life, the latter captured as health-state utilities with values ranging from 1 (representing full health) to values below 0 (where 0 represents being dead).³ In essence, health state utilities are the quality adjustment metric in QALYs. They are commonly derived by using preference-based measures (PBMs), which capture health states and the preference for certain health states in a given population.⁴ Comparing QALYs and costs of different therapeutic options in CUAs can provide guidance for healthcare systems around which treatment strategies to prioritize and to reimburse.² Especially in oncology, where treatment strategies are rapidly advancing but treatment costs can be high, a reliable evaluation of their costs and benefits is needed.⁵

In this work we focus on the estimation of health state utilities for patients with thyroid cancer. Globally, the annual incidence of thyroid cancer is estimated at 2.83 cases per 100,000 individuals, with mortality of 0.59 per 100,000.⁶ However, survival differs significantly depending on the type of thyroid cancer. Notably, patients with papillary or follicular subtypes have a 5-year survival of well over 95%.⁷ For patients with medullary thyroid cancer, the 5-year survival lies between 75% and 85%.⁸ For the most aggressive variant, anaplastic thyroid cancer, median survival time is less than 6 months.^9,10 While thyroid cancer often carries a favorable prognosis, it is still a malignancy that can entail a loss of life years and a reduction of health-related quality of life (HRQoL). Patients with thyroid cancer report compromised HRQoL compared to the general population or healthy controls prior to treatment,¹¹ while receiving treatment,¹² after treatment,¹³ and once classified as cancer survivors.¹⁴ In light of this, several new treatment strategies and follow-up regimens¹⁵ are currently being developed and tested for the various types of thyroid cancer.¹⁶ These new treatment options and adjunct modalities ought to be evaluated in terms of clinical benefit and affordability.

To aid the evaluation of cancer treatments from a health economic perspective, the Multi-Attribute Utility in Cancer Consortium^17,18 and the European Organisation for Research and Treatment of Cancer (EORTC)^{19–27} developed a cancer-specific preference-based measure, the EORTC QLU-C10D. It relies on the structure and content of the widely used EORTC Quality of Life Core Questionnaire, the EORTC QLQ-C30.^28,29 Additionally, country-specific value sets^{18–27} are available that enable calculation of health state utilities that respect country-specific preferences for the health states described by the QLU-C10D. Hence, the QLU-C10D provides a scoring algorithm for the EORTC QLQ-C30, allowing the prospective and retrospective calculation of health state utilities from EORTC QLQ-C30 data. In contrast to generic utility instruments such as the EQ-5D,^30,31 the HUI,³² or the SF-6D,³³ the QLU-C10D is specifically designed to estimate health utilities from cancer patients only. As a last step in its development, the QLU-C10D requires psychometric validation in various patient populations and countries.

Quality criteria for health status questionnaires demand the validation of health status instruments before recommending their use in a specific population and setting.³⁴ The aim of the present study was to assess the construct validity and responsiveness of the QLU-C10D for use in the thyroid cancer patient population following the criteria for valid and ready-to-use health status questionnaires.³⁴ For the purpose of this study, health utilities estimated by the EQ-5D-5L³¹ served as a benchmark, as this instrument is the successor of the most frequently used generic utility instrument, the EQ-5D-3L.³⁰

Patients and Methods

Patients

Data for this analysis were collected during phase IV of the development and validation of the EORTC thyroid cancer module, the EORTC QLQ-THY34.³⁵ Patients participating in the phase IV EORTC QLQ-THY34 development study were included in this nested study if they additionally provided responses to the EQ-5D-5L. The study was designed as a prospective cohort study. Patients were assessed in a total of 13 centers in 9 countries across three continents. These were collaborating centers in the phase IV module development of the EORTC QLQ-THY34 study. Patients were included if they had a confirmed diagnosis of thyroid cancer (ICD-10, C73), were 16 years of age or older, provided written informed consent, and were proficient in the language of the questionnaire. A sampling matrix was deployed to ensure the inclusion of patients with differentiated, medullary, and anaplastic thyroid cancer. Treatment modalities entailed surgery, non-surgical interventions (RAI, tyrosine kinase inhibitors, radiofrequency ablation, radiation therapy), or other local or systemic anti-cancer treatment. No power calculation was performed for this nested study. Detailed eligibility criteria and further information on the enrolment process are described elsewhere.³⁵

Data collection

Clinical and sociodemographic data were obtained from clinical records. The questionnaires were self-administered by the patients, either via an online assessment tool or as paper-and-pencil assessments. Data were collected at three timepoints: baseline (t1), defined as up to 4 weeks before the start of treatment; follow-up (t2), 6 weeks after the first day of treatment; and the last visit (t3), 6 months after t2. Patients of collaborating centers who agreed to participate filled out the EORTC QLQ-C30 and the EQ-5D-5L at each timepoint. The Subjective Significance Questionnaire (SSQ)²⁸ was presented at t2 and t3. Patients with missing questionnaire data at baseline were excluded from the current analyses. The process of data selection is shown in Figure 1. Bias analyses were performed by comparing included and excluded cases with regard to sociodemographic (sex, age) and clinical variables (histology and planned treatment at baseline) as outlined in the sampling plan for the validation of the EORTC QLQ-THY34 (for results, see Supplementary Table S10).³⁵

FIG. 1.

Flow chart of patient selection.

Instruments

EORTC QLQ-C30

The EORTC QLQ-C30 comprises 30 questions that are collated into 15 scales. Of these, five are functioning scales (physical, role, emotional, social, and cognitive), nine are symptom scales (fatigue, nausea and vomiting, pain, dyspnea, insomnia, appetite loss, constipation, diarrhea, and financial difficulties), and one is a global health status (GHS) scale. The GHS assesses self-reported overall health and QoL. The Physical Functioning scale does not have a recall period; all other scales have a recall period of 1 week. Responses are rated on a four-point Likert scale, ranging from ‘not at all’ to ‘very much’, for all questions except for GHS items, which are rated on a scale of 1 to 7, with 1 being ‘very poor’ and 7 being ‘excellent’.

QLU-C10D

The QLU-C10D uses 13 out of the 30 items from the EORTC QLQ-C30, which form ten dimensions: Physical Functioning (PF), represented by the long and short walk items of the QLQ-C30; Role Functioning (RF), represented by the work and daily activities item; Social Functioning (SF), represented by the social and family life items; Emotional Functioning (EF), represented by the depression item; Pain (PA), represented by the pain item; Fatigue (FA), represented by the tired item; Sleep (SL), represented by the trouble sleeping item; Appetite (AP), represented by the lack of appetite item; Nausea (NA), represented by the nausea item; and Bowel Problems (BP) represented by the diarrhea and constipation items. The selection of the 13 items forming these 10 domains is based on previous work by King et al.¹⁷ It is advised that the QLQ-C30 is administered and the QLU-C10D health state utility is calculated from the respective health state classification system.

A single utility index score per patient is calculated from these 13 items as a weighted sum of the component QLQ-C30 items, using a utility scoring algorithm that incorporates the utility weights that were generated in a valuation study to derive the health state preferences of the general population of a particular country. A number of country-specific scoring algorithms are available.^{18–27,36} While a standardized discrete-choice experiment methodology has been used across all countries, and the general form of the algorithm is the same, the actual values of the algorithm’s weights differ between countries. The weighting aspect of the algorithm imposes a disutility (utility decrement) for any loss of function or symptom experience worse than the full health level. All these algorithms yield a maximum value of 1 representing full health, i.e., no problems with any aspects of functioning and no symptom experience. Each step away from full health leads to a lower utility score. The minimum possible value is for the so-called ‘pits’ state, i.e., the health state with the worst level on all domains, which will vary between countries, depending on each country’s worst level utility weights. Values below 0 are possible and are associated with health states that are interpreted as worse than being dead.
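The general form of the scoring described above (full health = 1, minus a decrement for every dimension that is worse than its best level) can be illustrated with a minimal Python sketch. The decrement values, the restriction to three dimensions, and the assumption of four levels per dimension are hypothetical and chosen only for illustration; the published country-specific value sets should be used for any real calculation and may contain terms not shown here.

```python
# Hypothetical utility decrements per dimension level (level 1 = no problems).
# These numbers are invented for illustration and are NOT a published value set.
HYPOTHETICAL_DECREMENTS = {
    "PF": [0.0, 0.05, 0.10, 0.20],
    "PA": [0.0, 0.03, 0.07, 0.15],
    "FA": [0.0, 0.02, 0.04, 0.08],
}


def utility(levels, decrements):
    """Return 1 minus the summed decrements for the given dimension levels."""
    total_decrement = 0.0
    for dimension, level in levels.items():
        # Levels are 1-based, list positions are 0-based.
        total_decrement += decrements[dimension][level - 1]
    return 1.0 - total_decrement


def pits_value(decrements):
    """Utility of the 'pits' state, i.e., the worst level on every dimension."""
    return 1.0 - sum(weights[-1] for weights in decrements.values())


full_health = utility({"PF": 1, "PA": 1, "FA": 1}, HYPOTHETICAL_DECREMENTS)
some_problems = utility({"PF": 2, "PA": 1, "FA": 3}, HYPOTHETICAL_DECREMENTS)
worst_state = pits_value(HYPOTHETICAL_DECREMENTS)
```

Because the decrements differ between countries, the same health state maps to a different utility, and a different pits value, under each value set.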

EQ-5D-5L

The EQ-5D-5L is a generic health status questionnaire that is frequently used in health economic studies.³⁷ The questionnaire assesses five aspects of general health: Mobility (MO), Self-Care (SC), Usual Activities (UA), Pain/Discomfort (PD), and Anxiety/Depression (AD). Each item has five response options: ‘no problems’, ‘slight problems’, ‘moderate problems’, ‘severe problems’, and ‘extreme problems’. Additionally, patients rate their overall health on a visual analogue scale (VAS) ranging from 0 (worst imaginable health) to 100 (best imaginable health). Valuation studies for the EQ-5D-5L rely on time trade-off or discrete-choice experiments in the general population.³⁸ National value sets are available for a large range of countries and can be found on the EuroQol website (https://euroqol.org/information-and-support/resources/value-sets/).

Subjective significance questionnaire (SSQ)

The SSQ utilized for this study was based on Osoba et al. 1998.³⁹ It was adapted for the present study to assess patients’ perceived changes in health domains which had been rated as most relevant by patients with thyroid cancer in phases I⁴⁰ and III⁴¹ of the development of QLQ-THY34. The SSQ for this study comprised five items: (i) fatigue, (ii) voice, (iii) tingling sensations, (iv) temperature tolerance, and (v) overall QoL. The item wordings were “Since the last time I filled out the questionnaire, my [symptom/QoL] is … compared to the last assessment” and the seven response options were ‘very much worse’, ‘moderately worse’, ‘a little worse’, ‘same’, ‘a little better’, ‘moderately better’, and ‘very much better’.

Statistical analysis

Sample characteristics are presented as absolute numbers, means, standard deviations (SD), and valid percentages. Health state utilities for the QLU-C10D and the EQ-5D-5L were calculated by using the country-specific value sets of seven different countries [Australia (AUS), Canada (CAN), Germany (GER), Spain (ESP), Italy (ITA), Japan (JPN), and the United Kingdom (UK)]^{18,19,21,22,24,27,42–48} because patients from most of these countries are included in the current study. We also aimed to validate the Canadian country value set despite not having Canadian cancer patients in the sample. While the method of applying domestic value sets to external data is common practice for some jurisdictions,⁴⁹ the use of a Canadian country value set is explicitly requested by the Canadian Drug and Health Technology Agency.⁵⁰ Hence, the validation of the Canadian value set enables a specific use-case for health technology assessments in Canada.

Construct validity

QLU-C10D and EQ-5D-5L general population utility norms^{51–58} were used to put the utilities derived in this study in relation to expected utilities from the general population. Floor and ceiling effects for index scales and domain scores were investigated as the valid percentage of patients reaching the lowest/highest possible score. For utility index scores, the highest possible score was 1, and the lowest possible score was that of the ‘pits’ state, the value of which varied between countries, depending on each country’s worst level utility weights. If more than 15% of patients reached the highest or lowest level, this was considered relevant for validity because it limits the discriminatory ability of the measure.³⁴ The Spearman correlation coefficient (r) was calculated to analyze convergent and divergent validity of QLU-C10D and EQ-5D-5L domains. Theoretically corresponding domains (PF and MO/UA, RF and UA, SF and UA, PA and PD, EF and AD) were hypothesized to show moderate to strong correlations. Cancer-specific domains were expected to show weak to moderate correlations with all domains of the generic measure. Standard convention was used for interpreting r (<0.50 weak, <0.70 moderate and ≥0.70 strong correlations).⁵⁹
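As a minimal sketch of the two decision rules in this paragraph, the Python snippet below computes the share of patients at an extreme score, applies the 15% criterion, and labels a correlation coefficient according to the stated convention. The function names and the example data are hypothetical; only the thresholds come from the text.

```python
def extreme_score_share(scores, extreme, tol=1e-9):
    """Return (count, percent) of scores equal to the given extreme value."""
    hits = sum(1 for score in scores if abs(score - extreme) <= tol)
    return hits, 100.0 * hits / len(scores)


def has_relevant_effect(percent, threshold=15.0):
    """A floor/ceiling effect is flagged if more than 15% sit at the extreme."""
    return percent > threshold


def interpret_correlation(r):
    """Label |r| as weak (<0.50), moderate (<0.70) or strong (>=0.70)."""
    magnitude = abs(r)
    if magnitude < 0.50:
        return "weak"
    if magnitude < 0.70:
        return "moderate"
    return "strong"


# Hypothetical index scores: 29 of 186 patients at full health (utility = 1).
example_scores = [1.0] * 29 + [0.8] * 157
count_at_ceiling, percent_at_ceiling = extreme_score_share(example_scores, 1.0)
ceiling_flag = has_relevant_effect(percent_at_ceiling)
```

For floor effects, the same function would be called with the country-specific pits value as the `extreme` argument.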

As both measures, QLU-C10D and EQ-5D-5L, are designed to measure a similar construct (health-state utilities), we expected an overall high agreement between both index scales. Intra-class correlations (ICC) for index scales were calculated based on pairs of observations. ICC values for absolute agreement ≥0.75 indicate excellent agreement.⁶⁰
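The text does not state which ICC form was used beyond 'absolute agreement'. As an illustration only, the sketch below implements one common choice, the two-way, absolute-agreement, single-measure ICC, from the usual ANOVA mean squares. The function name and example pairs are hypothetical.

```python
def icc_absolute_agreement(pairs):
    """Two-way, absolute-agreement, single-measure ICC.

    `pairs` holds one row per patient and one column per instrument,
    e.g. (QLU-C10D utility, EQ-5D-5L utility).
    """
    n = len(pairs)
    k = len(pairs[0])
    grand = sum(sum(row) for row in pairs) / (n * k)
    row_means = [sum(row) / k for row in pairs]
    col_means = [sum(row[j] for row in pairs) / n for j in range(k)]

    # Sums of squares for patients (rows), instruments (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in pairs for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# Identical scores agree perfectly; a constant offset lowers absolute agreement.
icc_identical = icc_absolute_agreement([(0.5, 0.5), (0.7, 0.7), (0.9, 0.9)])
icc_offset = icc_absolute_agreement([(0.5, 0.7), (0.7, 0.9), (0.9, 1.1)])
```

The second example shows why absolute agreement is a stricter criterion than correlation: the two columns are perfectly correlated, yet the systematic offset reduces the ICC.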

Bland–Altman plots were generated for visual inspection of agreement between measures, separately for each country-specific value set. Differences (QLU-C10D utility minus EQ-5D-5L utility) were plotted on the y-axis and means on the x-axis of the scatter plot. Minimal important differences (MID) reported for the EQ-5D index score in a UK oncology population⁶¹ were used as a crude measure to pre-define acceptable levels of systematic measurement difference of ±0.08; other MIDs reported for the EQ-5D-5L range from 0.072 for the Malaysian value set to 0.101 for the Taiwanese value set.⁶² Limits of agreement (LOA) were drawn at ±1.96 × SD around the mean difference.
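The quantities behind such a plot can be sketched in a few lines of Python. The function names and the example utilities are hypothetical; the difference direction (QLU-C10D minus EQ-5D-5L), the ±1.96 × SD limits, and the ±0.08 threshold follow the description above.

```python
from statistics import mean, stdev


def bland_altman(qlu_utilities, eq5d_utilities):
    """Return the mean difference, its +/-1.96 SD limits and the plot points."""
    differences = [q - e for q, e in zip(qlu_utilities, eq5d_utilities)]
    pair_means = [(q + e) / 2 for q, e in zip(qlu_utilities, eq5d_utilities)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return {
        "bias": bias,
        "lower": bias - spread,
        "upper": bias + spread,
        # (x, y) coordinates: mean of both utilities vs. their difference.
        "points": list(zip(pair_means, differences)),
    }


def exceeds_acceptable_bias(bias, mid=0.08):
    """Compare the absolute mean difference with the pre-defined MID."""
    return abs(bias) > mid


# Hypothetical paired utilities for three patients.
result = bland_altman([0.6, 0.7, 0.8], [0.7, 0.8, 0.9])
flagged = exceeds_acceptable_bias(result["bias"])
```

A negative `bias` indicates that QLU-C10D utilities are on average lower than EQ-5D-5L utilities.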

Sensitivity

The sensitivity of the QLU-C10D was assessed by its ability to discriminate between ‘known groups’, i.e., the QLU-C10D’s ability to detect statistically significant differences in utility values between specific patient subgroups which were expected to differ in health status in clinically important ways. Fifteen known-group comparisons were defined: (a) 13 based on clinical variables and (b) two based on patient self-reported variables, namely:

Clinical variables at baseline: (1) Tumor histology (differentiated vs. medullary, anaplastic, other); (2) Hypoparathyroidism (no vs. yes); (3) Karnofsky clinician-rated performance status (≥90 vs. <90); (4) Lymphatic invasion (no vs. yes); (5) Psychiatric comorbidity (no vs. yes); (6) Status of Disease (no evidence of disease vs. structural disease/incomplete and indeterminate response); (7) Vocal Cord Impairment (no vs. yes); (8) Current hormone withdrawal (no vs. yes); (9) Central and/or lateral neck dissection (no vs. yes); (10) Resection status (R0 vs. R1 and R2); (11) Thyroidectomy (none vs. total); (12) Treatment scheme (monotherapy vs. bi-/multimodal); (13) UICC (I + II vs. III + IV).

Patient self-reported variables: (1) VAS score (>50 vs. ≤50), and (2) GHS (>50 vs. ≤50).

For these sub-group comparisons, independent t-tests were calculated. Cohen’s d was used as the effect size measure, calculated as the mean difference divided by the pooled standard deviation (SD).
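A minimal sketch of this effect size in Python, assuming the usual pooled SD weighted by degrees of freedom (the text does not spell out the pooling formula); the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical scores for two known groups.
effect_size = cohens_d([2, 4, 6], [1, 3, 5])
```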

Responsiveness

Responsiveness of the QLU-C10D to clinically important changes in health state was calculated for three periods: (i) from t1 to t2, (ii) from t1 to t3, and (iii) from t2 to t3.

External criteria for a change in disease experience (i.e., subjective change of quality of life) were used: (i) Change in VAS score [<10 points change (=stable) vs. ≥10 points deterioration or improvement],⁶³ (ii) Change in GHS score [<10 points change (=stable) vs. ≥10 points deterioration or improvement],³⁹ which were previously defined as meaningful group difference. Additionally, we relied on (iii) the QoL item of the SSQ,³⁹ recoded to “improved”, “stable”, and “deteriorated”. The SSQ category ‘improved’ was assigned for patients that indicated a little-, moderately-, or very much better QoL. Patients that indicated little-, moderately-, or very much worse QoL were categorized as ‘deteriorated’. Patients that reported the same QoL compared to the previous assessment were categorized as ‘stable’.
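The grouping rules for the three external criteria can be written down as a short Python sketch. The function and dictionary names are hypothetical; the 10-point threshold and the SSQ response options are taken from the text. The sketch assumes that higher VAS and GHS scores indicate better health, so a positive change is labeled as improvement.

```python
def classify_score_change(delta, threshold=10.0):
    """Classify a VAS or GHS change score (follow-up minus earlier score)."""
    if delta >= threshold:
        return "improved"
    if delta <= -threshold:
        return "deteriorated"
    return "stable"


# Recoding of the seven SSQ response options into three categories.
SSQ_RECODE = {
    "very much worse": "deteriorated",
    "moderately worse": "deteriorated",
    "a little worse": "deteriorated",
    "same": "stable",
    "a little better": "improved",
    "moderately better": "improved",
    "very much better": "improved",
}

vas_category = classify_score_change(75 - 60)
ssq_category = SSQ_RECODE["a little worse"]
```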

The ability to detect change was analyzed using paired t-tests for within-group change and ANOVA for differences in change between groups. The Responsiveness Index (RI) was calculated as the effect size for change (=mean change within the improvement or deterioration group divided by the SD of change of the stable group). For responsiveness comparisons with the EQ-5D-5L, the Difference in Responsiveness Index (DRI) was calculated (RI of QLU-C10D minus RI of EQ-5D-5L).
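The two indices defined above translate directly into code. The following Python sketch uses hypothetical function names and invented change scores; only the formulas are taken from the text.

```python
from statistics import mean, stdev


def responsiveness_index(changed_group_changes, stable_group_changes):
    """Mean change in the changed group divided by the SD of change in the stable group."""
    return mean(changed_group_changes) / stdev(stable_group_changes)


def difference_in_responsiveness(ri_qlu, ri_eq5d):
    """DRI: RI of the QLU-C10D minus RI of the EQ-5D-5L."""
    return ri_qlu - ri_eq5d


# Hypothetical utility change scores for an improved and a stable group.
ri_qlu = responsiveness_index([0.1, 0.2, 0.3], [-0.1, 0.0, 0.1])
ri_eq5d = responsiveness_index([0.05, 0.10, 0.15], [-0.1, 0.0, 0.1])
dri = difference_in_responsiveness(ri_qlu, ri_eq5d)
```

A positive DRI would indicate that the QLU-C10D registered the change more strongly, relative to the noise in the stable group, than the EQ-5D-5L.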

Where statistically significant differences or changes were detected, relative efficiency (RE) was calculated as the quotient of the t-values of each pair of country-specific value sets (t-value QLU-C10D / t-value EQ-5D-5L).⁶⁴ RE was calculated to directly compare the efficiency of the QLU-C10D in detecting group differences or change in patients with thyroid cancer with that of the EQ-5D-5L:^65,66 RE values smaller than 1 indicate higher efficiency of the EQ-5D-5L, and, conversely, RE values higher than 1 indicate higher efficiency of the QLU-C10D. Higher efficiency translates into smaller required sample sizes. To account for multiple testing, p-values were Bonferroni–Holm corrected so as not to exceed an overall alpha level of 5%.⁶⁷
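A minimal Python sketch of both steps follows. The RE function mirrors the quotient given in the text. The correction is implemented as Holm's step-down procedure, which is an assumption about what the Bonferroni-type correction named above refers to. Function names and example values are hypothetical.

```python
def relative_efficiency(t_qlu, t_eq5d):
    """RE > 1 favors the QLU-C10D, RE < 1 favors the EQ-5D-5L."""
    return t_qlu / t_eq5d


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


re_example = relative_efficiency(2.4, 2.0)
adjusted_p = holm_adjust([0.01, 0.04, 0.03])
significant = [p <= 0.05 for p in adjusted_p]
```

In this invented example only the smallest p-value remains below the 5% level after adjustment.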

The sensitivity and responsiveness analyses rely on the assumption of clinically important differences and changes. While the sensitivity analysis relies on a priori defined clinical groups, the responsiveness analysis relies on estimates of change on the GHS, VAS, and SSQ which were previously deemed clinically meaningful.^39,63

Results

Patient characteristics

A total of 186 patients enrolled in this study completed both the EORTC QLQ-C30 and the EQ-5D-5L at least once at baseline. Data selection and reasons for exclusion are presented in Figure 1. Patients were assessed in a total of 13 centers in 9 countries. In our sample 128 (68.8%) of the respondents were female, and the mean age was 50.98 years (SD 16.4). A total of 145 patients (78.0%) had differentiated thyroid cancer, 20 patients (10.8%) had the medullary form, and 21 patients (11.3%) had anaplastic/other thyroid cancer. Further sociodemographic and clinical characteristics are presented in Table 1.

Table 1.

Patient Characteristics at Baseline^88,89

Patients assessed at baseline (N)	186
Age (mean / SD)	50.98 / 16.4
Sex n (%)
Male	58 (31.2%)
Female	128 (68.8 %)
Formal Education (primary, secondary, and tertiary) n (%)
<10 Years	51 (27.4%)
>10 Years	135 (72.6 %)
Partnership n (%)
has a partner	139 (74.7 %)
has no partner/ unknown	47 (25.3%)
Cohabitation n (%)
lives with partner	131 (70.4 %)
lives alone / other / unknown	55 (29.6%)
Days to t2 after baseline [mean (SD); min–max]^a	51 (20); 25–181
Days to t3 after baseline [mean (SD); min–max]^a	229 (28); 168–323
Patients assessed per country N (%)
Italy	62 (33.3 %)
Switzerland	37 (19.9 %)
Austria	19 (10.2 %)
Germany	19 (10.2 %)
Greece	16 (8.6 %)
Brazil	10 (5.4 %)
Japan	10 (5.4 %)
Spain	7 (3.8 %)
Cyprus	6 (3.2 %)
Status of disease at baseline n (%)
Structural disease	98 (52.7 %)
Biochemically incomplete resp.	7 (3.8 %)
Indeterminate response	15 (8.1 %)
No evidence, unknown	66 (35.4 %)
Vocal Cord Impairment n (%)
No	121 (65.1 %)
Yes	40 (21.5 %)
Unknown / not applicable	25 (13.4 %)
Karnofsky Performance Status n (%) (1)
≥90	120 (64.5 %)
<90	66 (35.5 %)
UICC stage 7th version n (%) (2)
I	76 (40.9 %)
> I	69 (37.1 %)
unknown	41 (22.0 %)
Treatment Scheme n (%)^b
Monotherapy	89 (47.8 %)
Bi-, multimodal therapy	39 (21.0 %)
Unknown / not applicable	58 (31.2 %)
Histology n (%)
Differentiated	145 (78.0 %)
Medullary	20 (10.8 %)
Anaplastic, other	21 (11.3 %)

^a Baseline (t1) defined as up to 4 weeks before start of treatment. Follow-up (t2) at 6 weeks after the first day of treatment and the last visit (t3) 6 months after t2.

^b Monotherapy was assigned if patients received a single anti-cancer treatment; bi-/multimodal therapy was assigned if patients received two or more anti-cancer treatments [surgery, non-surgical interventions (RAI, tyrosine kinase inhibitors, radiofrequency ablation, radiation therapy), or other local or systemic anti-cancer treatment].

UICC, Union for International Cancer Control.

Bias analysis, comparing the full sample of the thyroid cancer module EORTC QLQ-THY34 development study with the sub-sample for this analysis, found no significant differences which would indicate selection bias (Supplementary Table S10).

Overall, health state utilities for the patients with thyroid cancer were generally lower than the population norms for both measures and all country-specific value sets. The exception was that at t3, the QLU-C10D utility values for Germany and the UK exceeded the population norms. The smallest differences between population norms and patients with thyroid cancer for the QLU-C10D were found for the Canadian (t3) and the UK (t1) country value sets (Δ0.008), while the largest difference was observed for the Italian value set at t2 (Δ0.086). The smallest difference between population norms and patients with thyroid cancer for the EQ-5D-5L was found for the German country value set at t3 (Δ0.01), whereas the largest difference was observed for the Spanish value set at t1 (Δ0.199). Details are given in Table 2.

Table 2.

Mean Health State Utility Values

	t1 (n = 186), mean / SD		t2 (n = 173), mean / SD		t3 (n = 155), mean / SD		Normative values (reported elsewhere^{54–61})
Country value sets	QLU-C10D	EQ-5D-5L	QLU-C10D	EQ-5D-5L	QLU-C10D	EQ-5D-5L	QLU-C10D	EQ-5D-5L
AUS	0.639/0.296	0.794/0.303	0.628/0.279	0.823/0.271	0.684/0.283	0.880/0.187		0.910
CAN	0.692/0.248	0.747/0.251	0.687/0.233	0.758/0.230	0.735/0.231	0.812/0.201	0.743	0.864
GER	0.745/0.220	0.803/0.247	0.738/0.208	0.816/0.224	0.781/0.199	0.870/0.168	0.763	0.88
ESP	0.710/0.238	0.764/0.225	0.702/0.221	0.776/0.203	0.747/0.217	0.827/0.172		0.963
ITA	0.764/0.224	0.780/0.268	0.757/0.213	0.793/0.242	0.800/0.206	0.850/0.189	0.843	0.930
JPN	0.673/0.254	0.812/0.170	0.664/0.238	0.818/0.155	0.712/0.232	0.855/0.141		0.921^a
UK	0.716/0.229	0.780/0.222	0.710/0.212	0.791/0.200	0.754/0.210	0.838/0.170	0.724	0.820

^a Sample of female participants, 45–54 years.

SD, Standard Deviation; t1, baseline, t2 and t3, subsequent timepoints after Baseline; Country value sets: AUS, Australia, CAN, Canada, GER, Germany, ESP, Spain, ITA, Italy, JPN, Japan, UK, United Kingdom.

Across all country-specific value sets and assessment timepoints, there were no relevant floor or ceiling effects present for the QLU-C10D index scores. Across all country value sets and timepoints, merely one patient (≈0.6% of patients) reached the lowest possible score, while 12.3% was the highest percentage of patients reaching the maximal score for the QLU-C10D (Australian value set at t3). For the EQ-5D-5L index scores, no relevant floor effects were found but ceiling effects were present for all country-specific value sets and most timepoints (Table 3). Details regarding the floor and ceiling effects of the domains of each instrument are provided in the supplementary material (Supplementary Table S1).

Table 3.

Floor Effects (at Pit States) and Ceiling Effects (at 1 = Full Health) QLU-C10D and EQ-5D-5L Index Scales

Index		QLU-C10D		EQ-5D-5L
Country value sets	Time-point	Floor n / %	Ceiling n / %	Floor n / %	Ceiling n / %	N
AUS	t1	0/0	14/7.5	0/0	29/15.6	186
	t2	0/0	10/5.8	0/0	26/15.0	173
	t3	0/0	19/12.3	0/0	36/23.2	155
CAN	t1	0/0	7/3.8	0/0	28/15.1	186
	t2	0/0	8/4.6	0/0	24/13.9	173
	t3	0/0	10/6.5	0/0	34/21.9	155
GER	t1	0/0	9/4.8	0/0	28/15.1	186
	t2	0/0	9/5.2	0/0	24/13.9	173
	t3	0/0	11/7.1	0/0	34/21.9	155
ESP	t1	0/0	9/4.8	0/0	28/15.1	186
	t2	0/0	10/5.8	0/0	24/13.9	173
	t3	0/0	11/7.1	0/0	34/21.9	155
ITA	t1	0/0	9/4.8	0/0	28/15.1	186
	t2	0/0	9/5.2	0/0	24/13.9	173
	t3	0/0	11/7.1	0/0	34/21.9	155
JPN	t1	0/0	7/3.8	0/0	28/15.1	186
	t2	0/0	8/4.6	0/0	24/13.9	173
	t3	0/0	10/6.5	0/0	34/21.9	155
UK	t1	0/0	7/3.8	0/0	28/15.1	186
	t2	0/0	8/4.6	0/0	24/13.9	173
	t3	0/0	10/6.5	0/0	34/21.9	155

Bold percentages indicate the existence of a ceiling/floor effect.

Ceiling effects were assessed for the highest possible score, i.e., 1 (full health).

Floor effects were assessed for the lowest possible score, i.e., ‘the pits state’ which varied depending on each country’s worst level utility weights. Pits states: (QLU-C10D / EQ-5D-5L): AUS (−0.1 / −0.3), CAN (−0.15 / −0.15), GER (−0.01 / −0.66), ESP (−0.05 / −0.42), ITA (0.03 / −0.57), JPN (−0.22 / −0.03), UK (−0.08 / −0.29).

t1, Baseline, t2 and t3, subsequent assessments after baseline; AUS, Australia, CAN, Canada, GER, Germany, ESP, Spain, ITA, Italy, JPN, Japan, UK, United Kingdom.

ICCs between the QLU-C10D and the EQ-5D-5L index scores were high across all time points (ICC range 0.761 to 0.901; all ICCs are reported in Supplementary Table S9). At the domain level, moderate to strong Spearman correlations were observed for theoretically corresponding domains. The domain pair Pain and Pain/Discomfort showed the highest correlation (rho 0.601 for all country-specific value sets). Weak to moderate correlations were observed for theoretically distant domain pairs. The QLU-C10D domain Bowel Problems had the lowest correlation with each of the EQ-5D-5L domains (Table 4).

Table 4.

Summary of Correlations QLU-C10D and EQ-5D-5L at Index and Domain Level, for Baseline, t2, and t3

		Domains (Spearman’s rho)
		1^a		2^a	3^a	4^a	5^a	6^a	7^b	8^b	9^b	10^b
QLU-C10D	Index Scales (ICC)^a	Physical Functioning	Physical Functioning	Role Functioning	Social Functioning	Emotional Function.	Pain	Fatigue	Sleep disturbances	Appetite loss	Nausea	Bowel Problems
EQ-5D-5L	Index Scales (ICC)^a	Mobility	Usual Activities	Usual Activities	Usual Activities	Anxiety / Depression	Pain / Discomfort	Usual Activities	All	All	All	All
Expected Correlation	Strong ≥0.70	Strong, moderate ≥0.50	Strong, moderate ≥0.50	Moderate ≥0.50	Moderate ≥0.50	Strong ≥0.70	Strong ≥0.70	Weak, moderate <0.70	Weak <0.50	Weak <0.50	Weak <0.50	Weak <0.50
Min^a	0.761	0.536	0.403	0.436	0.289	0.455	0.601	0.306	0.113	0.067	0.148	0.015
Max^b	0.901	0.639	0.584	0.694	0.468	0.615	0.716	0.516	0.462	0.444	0.354	0.348
AUS	0.761	0.536	0.403	0.436	0.289	0.575	0.601	0.306	0.385	0.437	0.345	0.348
CAN	0.890	0.536	0.552	0.591	0.364	0.569	0.601	0.419	0.401	0.444	0.349	0.341
GER	0.869	0.536	0.552	0.591	0.364	0.455	0.601	0.446	0.401	0.438	0.349	0.325
ESP	0.879	0.536	0.552	0.591	0.362	0.569	0.601	0.446	0.462	0.438	0.354	0.279
ITA	0.871	0.536	0.552	0.591	0.364	0.455	0.601	0.446	0.401	0.412	0.349	0.341
JPN	0.819	0.536	0.552	0.591	0.364	0.569	0.601	0.446	0.388	0.444	0.349	0.341
UK	0.883	0.536	0.552	0.591	0.364	0.569	0.601	0.446	0.424	0.412	0.349	0.341

^a Minimal correlations across the three measurement timepoints.

^b Maximal correlations across the three measurement timepoints. All correlations >0.196 are statistically significant with p ≤ 0.005.

ICC, Intra Class Correlations; t1, Baseline, t2 and t3, subsequent timepoints after Baseline; QLU-C10D and EQ-5D-5L Country value sets: AUS, Australia, CAN, Canada, GER, Germany, ESP, Spain, ITA, Italy, JPN, Japan, UK, United Kingdom.

Detailed correlations are reported in the Supplementary Data S1.

To further investigate the agreement between the QLU-C10D and EQ-5D-5L utility values, Bland–Altman plots display the difference between the utility values plotted against their mean. The mean difference ranged from −0.015 for the Italian value set to −0.186 for the Australian value set. Mean QLU-C10D utilities were always lower than those derived from the EQ-5D-5L. Mean differences for the Australian (−0.186) and the Japanese (−0.139) value sets exceeded the pre-defined threshold (MID) of acceptable systematic measurement difference. For all country-specific value sets, in the upper measurement continuum of the instruments (above 0.8), the discrepancy between the measures appears to become smaller, indicating proportional bias. Figure 2A displays the range of the limits of agreement across the various country-specific value sets.

FIG. 2.

Bland–Altman plots for the QLU-C10D and EQ-5D-5L country-specific value sets at baseline. Figure 2A shows an overview of the minimal and maximal limits of agreement above and below the mean difference area of all country-specific value sets. The mean difference area of all country-specific value sets is plotted along with the zero-difference line, indicating that all mean differences are below zero, and hence QLU-C10D mean utilities are lower than EQ-5D-5L utilities in all comparisons. Dotted line: y = 0.00 (indicating no difference between both measures). LOA, limits of agreement.

Known-group comparisons showed that 9 of 15 hypothesized known-group variables yielded statistically significant health state differences. For hormone withdrawal, neck dissection status, resection status, thyroidectomy status, treatment scheme, and UICC stage, neither instrument detected significant differences.

In total, 54 RE indices were calculated from the t-values of group differences for histology results (RE >1.17), hypoparathyroidism status (RE >0.93), Karnofsky groups (RE >0.87), lymphatic invasion (RE >1.13), psychological disorders (RE >0.97), disease status (RE >0.98), vocal cord impairment (RE >0.75), VAS score (RE >1.15), and GHS score (RE >1.23). Thirty-nine of these (72.2%) favored the QLU-C10D, i.e., this measure showed higher efficiency in detecting group differences in most comparisons. Groups based on vocal cord impairment and Karnofsky performance status, as well as the Japanese and Canadian country value sets, frequently favored the EQ-5D-5L (see Fig. 3A). Differences in known-group comparisons frequently showed large divergence for the Australian and Japanese country value set pairs, while the Canadian, German, and Italian pairs showed small divergence. Detailed results are presented in Supplementary Table S2.
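To make the RE index concrete, the sketch below assumes the conventional definition of relative efficiency as the squared ratio of the two instruments' t-statistics for the same group comparison. The function name and the t-values are hypothetical, and the exact formula applied in this study is the one given in the Methods.

```python
def relative_efficiency(t_qlu, t_eq5d):
    """Squared ratio of t-statistics; values above 1 favor the first instrument."""
    if t_eq5d == 0:
        raise ValueError("comparator t-statistic must be non-zero")
    return (t_qlu / t_eq5d) ** 2

# Hypothetical t-values for one known-group comparison (not study data)
re_value = relative_efficiency(3.2, 2.9)
favors_qlu = re_value > 1
```

Under this definition, an RE of 1 means both instruments separate the groups equally well, and squaring the ratio makes the index independent of the sign of the group difference.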

FIG. 3.

Head-to-Head Comparison of the Relative Efficiency (RE). Sensitivity (A) and Responsiveness analysis (B–D). Horizontal graphs represent the distance between the highest and lowest RE values of the country-specific value set pairs. Icons at each endpoint of the graphs indicate the country-specific value sets with the lowest and highest RE value. RE >1 favors QLU-C10D, RE <1 favors EQ-5D-5L. Values exceeding the x-axis scale are marked with a grey icon “>”. Numbers in parentheses = number of REs favoring the QLU-C10D / number of REs calculated. Abbreviations: GHS, Global Health Scale (EORTC QLQ-C30); VAS, Visual Analogue Score (EQ-5D); SSQ 5, Subjective Significance Questionnaire QoL item; RE, Relative Efficiency; AUS, Australia, CAN, Canada, GER, Germany, ESP, Spain, ITA, Italy, JAP, Japan, UK, United Kingdom.

Responsiveness for the total sample over three time periods yielded no significant results. Differences in change between VAS groups, GHS groups, and QoL groups were significant at all timepoints with all country-specific value sets. RE favored the QLU-C10D in 46 of 56 cases (82.1%). RE indices for responsiveness to HRQoL improvement favored the EQ-5D-5L for VAS between t1 and t2 and between t2 and t3, and for QoL groups between t2 and t3. Responsiveness for GHS improvement favored the QLU-C10D at all three timepoints. For HRQoL improvement, RE favored the QLU-C10D in 25 of 49 comparisons (51.0%). Groups with decreasing HRQoL yielded no significant results for VAS between t2 and t3, or for GHS between t1 and t3. RE indices for decreasing HRQoL favored the QLU-C10D in all cases (35 of 35), i.e., VAS score (t1 and t2, t1 and t3), GHS score (t1 and t2, t2 and t3), and QoL. Together, RE favored the QLU-C10D in 106 of 140 (75.7%) responsiveness comparisons. AUS and GER frequently yielded the highest RE, while JAP and CAN often showed the smallest RE. Effect size differences (DRI) ranged between 0.036 and 0.519 in the HRQoL decrease groups, and between 0.005 and 0.336 in the improved groups. Large differences were frequently observed for the QLU-C10D-EQ-5D-5L pairs for the value sets AUS and ITA, while smaller DRI occurred for the JAP value set.
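The reported tallies can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in the Results (39 of 54 sensitivity comparisons; 46 of 56, 25 of 49, and 35 of 35 responsiveness comparisons); the dictionary labels and the helper `share` are illustrative names, not terms from the study.

```python
# (comparisons favoring the QLU-C10D, total comparisons), as reported in the text
responsiveness = {
    "difference in change between groups": (46, 56),
    "HRQoL improvement": (25, 49),
    "HRQoL deterioration": (35, 35),
}
sensitivity = (39, 54)

def share(favoring, total):
    """Percentage of comparisons favoring the QLU-C10D, rounded to one decimal."""
    return round(100 * favoring / total, 1)

# Pool the three responsiveness analyses
resp_fav = sum(f for f, _ in responsiveness.values())
resp_tot = sum(t for _, t in responsiveness.values())

# Pool sensitivity and responsiveness comparisons
overall_fav = resp_fav + sensitivity[0]
overall_tot = resp_tot + sensitivity[1]
overall_share = share(overall_fav, overall_tot)
```

Pooling the responsiveness analyses reproduces the 106 of 140 (75.7%) stated above, and adding the 54 sensitivity comparisons gives 145 of 194, which corresponds to the 74.7% reported in the Abstract.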

Discussion

We performed a validation of the QLU-C10D according to the validity criteria for health status questionnaires,³⁴ whereby the psychometric approach deployed was in accordance with previous publications by the EORTC Quality of Life Group.^68–70 This is the first study that we are aware of that has investigated the clinical validity of the QLU-C10D in patients with thyroid cancer, and is amongst the first studies to compare the QLU-C10D with the EQ-5D-5L in a prospectively collected sample.^71–74

From our results we conclude that the QLU-C10D is a suitable measure to estimate health state utilities in patients with thyroid cancer. The derived utility values are commonly lower compared to the utility values of the general population, which is in accordance with widely reported HRQoL data of this cancer population.^11–14 The only exception was the utility values for Germany and the UK at the last timepoint, which exceeded those of the general population, echoing a similar finding previously reported by Thiagarajan et al.⁷⁵ when comparing HRQoL data of a thyroid cancer population to the German general population.

The QLU-C10D did not show any floor effects (at the pits state values) or ceiling effects (at the full health value of 1); thus, the scale structure at the upper and lower ends of the measurement continuum appears sufficient. The high correlation of the QLU-C10D and the EQ-5D-5L index scores suggests good criterion validity of the measure. At the domain level, the pattern of correlations suggests good construct validity of the QLU-C10D. Theoretically corresponding domain pairs, such as ‘Physical Functioning-Mobility’, ‘Role Functioning-Usual Activities’, ‘Social Functioning-Usual Activities’, ‘Emotional Functioning-Anxiety/Depression’, and ‘Pain-Pain/Discomfort’ showed moderate to high correlations. Conversely, theoretically distant domain pairs showed lower correlation coefficients. Still, the only theoretically corresponding domain pair that exceeded the threshold for high correlation was the ‘Pain-Pain/Discomfort’ pair. The lower-than-expected correlation of the domain pair ‘Emotional Functioning-Anxiety/Depression’ may be explained by the higher prevalence of anxiety (included in the EQ-5D-5L) compared to depression⁷⁶ (included in both the EQ-5D-5L and the QLU-C10D) in patients with thyroid cancer. Despite this less-than-ideal psychometric finding, the content validity (as prerequisite of construct validity) of the QLU-C10D was recently judged to be more relevant to patients with cancer in comparison to generic PBMs.⁷⁷
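A floor or ceiling check of the kind referred to above can be sketched as follows. The 15% cut-off follows the criterion commonly attributed to the quality criteria for health status questionnaires;³⁴ whether exactly this cut-off was applied here is an assumption, and the utilities and the pits value of −0.10 are hypothetical placeholders (the actual pits value differs by country-specific value set).

```python
def floor_ceiling_share(utilities, pits_value, full_health=1.0, tol=1e-9):
    """Return the proportions of respondents at the pits value and at full health."""
    if not utilities:
        raise ValueError("utilities must not be empty")
    n = len(utilities)
    at_floor = sum(abs(u - pits_value) <= tol for u in utilities) / n
    at_ceiling = sum(abs(u - full_health) <= tol for u in utilities) / n
    return at_floor, at_ceiling

# Hypothetical utilities for ten patients (not study data)
utilities = [0.45, 0.71, 0.83, 0.92, 1.0, 0.88, 0.67, 0.95, 0.79, 0.99]
at_floor, at_ceiling = floor_ceiling_share(utilities, pits_value=-0.10)

# Assumed rule of thumb: an effect is flagged when more than 15% of
# respondents score at either extreme of the scale.
has_effect = at_floor > 0.15 or at_ceiling > 0.15
```

In this toy sample one of ten respondents reports full health and none is at the pits value, so neither effect would be flagged.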

In line with previous reports,^68,70 the QLU-C10D consistently estimated lower utility values compared to the EuroQol measurement system. This may be because the QLU-C10D has more dimensions and hence more utility decrements than the EQ-5D-5L, but other factors such as differing valuation methods may also have been at play. Whatever the reason, the systematic difference must be taken into consideration when comparing CUAs in which different PBMs were relied upon to assess treatment strategies. In such cases, methods to map, crosswalk, or link utility scores could possibly be applied.⁷⁸ Still, the presence of systematic differences across the measurement continuum limits the comparability of EQ-5D and QLU-C10D scores, such that their scores are not interchangeable.

Most important for the psychometric validation of the instrument, the QLU-C10D was able to detect differences between clinical groups and health state changes over time in accordance with clinical data and expectations. In doing so, the QLU-C10D was more efficient than the EQ-5D-5L in most cases. As reported previously,⁷⁴ the QLU-C10D, in comparison to the EQ-5D-5L, appears to capture deterioration of health states (all country-specific value sets showed higher REs for the QLU-C10D in the “deterioration” group) more effectively than improvement of health states (some country-specific value sets showed favorable REs for the EQ-5D-5L). Previous reports also indicate a limited responsiveness of the EQ-5D-5L in papillary thyroid cancer.⁷⁹

To date, the QLU-C10D’s psychometric properties have been studied in diverse patient groups and for various country-specific value sets. Investigated populations include patients with gastric cancer,⁷³ myelodysplastic syndrome,⁶⁸ metastatic melanoma,⁸⁰ Barrett’s esophagus and esophageal cancer,⁷¹ patients undergoing esophagectomy,⁸¹ patients with breast cancer,⁸² neuroendocrine tumors,⁸³ and patients treated with nivolumab.⁷⁴ In most cases, comparisons of the QLU-C10D were made against the EuroQol measurement system,^{68,71,73,74,80,81,83} with additional studies using the SF-6D⁷¹ and the PROMIS preference score⁸² as comparator measures. Moreover, many country-specific value sets (Australia, Austria, Canada, France, Germany, Italy, the Netherlands, Poland, UK, and USA) were employed for these psychometric evaluations.^{68,71,73,74,80–83} In summary, these findings indicate that the QLU-C10D is able to detect differences in health state utilities across various clinical known groups such as therapeutic groups⁷³ and treatment modalities,⁸² ECOG status, age, and comorbidities,⁶⁸ UICC stages,⁸² and for patients with or without disease progression.⁸³ Furthermore, the QLU-C10D showed good efficiency in detecting these differences^68,74 and health state changes over time.^74,81

The discussion regarding the use of generic and disease-specific PBMs is not new.^84,85 While the advantage of generic PBMs lies in their applicability across diseases, the potential of disease-specific instruments (such as the cancer-specific PBM, QLU-C10D) must not be disregarded. Further, QLU-C10D domains capture aspects of health which are highly relevant to cancer patients,¹⁷ thus being able to account for health state changes relevant to both patients and their managing clinicians. Moreover, the measure appears to have satisfactory measurement properties, specifically demonstrated here for patients with thyroid cancer. Lastly, this current validation, relying on the 5-level version of the EQ-5D, has shown that the QLU-C10D detects known-group differences and health state changes more efficiently than the generic measurement system, despite the latter having more answer categories than the QLU-C10D. This suggests that the measurement precision of the QLU-C10D may arise from a superior content validity⁷⁷ of this instrument for the cancer patient population.

We therefore argue that the QLU-C10D may be a suitable supplement, or in selected cases (within-disease assessments, clear contraindication of generic PBMs, scoping assessments of health state utilities in phase II trials) a substitute, for other generic PBMs. Deploying an instrument (if used as primary outcome) with enhanced efficiency in detecting health state differences/changes has the advantage of reducing the required sample size.⁶⁶ The use of validated instruments in cancer clinical trials may ultimately influence the accuracy of study outcomes and thus impact clinical guidelines and the provision of certain medicines in daily clinical practice. Currently, the QLU-C10D is used as a measure of cost-effectiveness in various clinical cancer trials. These trials currently concern patients with skin cancer (NCT06163820), older cancer patients (NCT05797727), or patients with brain metastasis (NCT06163820). Following this validation of the QLU-C10D in patients with thyroid cancer, we suggest that this instrument may be used in future clinical trials in patients with thyroid cancer.
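The link between efficiency and sample size can be illustrated with a rough rule of thumb. Assuming RE is the squared ratio of t-statistics and that a t-statistic grows with the square root of the sample size, the sample size needed with the more efficient instrument is approximately the comparator's sample size divided by RE. The function name and the numbers below are hypothetical, and a formal power calculation would still be required for trial planning.

```python
import math

def equivalent_sample_size(n_comparator, re):
    """Approximate n needed with the more efficient instrument for equal power.

    Assumes RE = (t_instrument / t_comparator) ** 2 and t proportional to sqrt(n).
    """
    if re <= 0:
        raise ValueError("RE must be positive")
    return math.ceil(n_comparator / re)

# Hypothetical example: comparator needs 200 patients, RE of 1.25 favors the QLU-C10D
n_needed = equivalent_sample_size(200, 1.25)
```

Under these assumptions, an RE of 1.25 would reduce a requirement of 200 patients to about 160, whereas an RE below 1 would increase it.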

Limitations

In this analysis we mainly rely on the assessments of European patients with thyroid cancer; only 20 patients included in this study were from non-European countries (Brazil and Japan). Further, while this study thoroughly investigated the psychometric properties of the QLU-C10D according to well-established criteria, QALYs were not calculated and a CUA was not performed. Estimating QALYs with the QLU-C10D and comparing them to the QALYs estimated by other PBMs has been done previously,^74,80,86,87 and informs the health economic field regarding the ultimate effect of using different PBMs when estimating utility values.

Conclusion

The QLU-C10D is a valid and fit-for-purpose PBM for health economic evaluations in patients with thyroid cancer. The instrument displayed good psychometric properties and was more efficient in detecting health state differences and changes than the EQ-5D-5L in about 75% of comparisons. Hence, the QLU-C10D may help to evaluate novel therapeutic strategies and adjunct technologies to reduce treatment-related morbidities in patients with thyroid cancer in terms of their cost-utility.

Footnotes

Authors’ Contributions

Conceptualisation: E.M.G., M.T.K., R.N., S.Se., M.J.P. Acquisition of data: S.Si., G.I., G.P.S., J.I.A., O.H., I.I., G.F., D.F., J.I., N.K., L.D.L., M.P., R.R.G. Analysis and interpretation of data: S.Se., M.J.P., E.M.G., M.T.K., S.Si. Drafting of the article: M.J.P., S.Se., E.M.G. Critical revision of the article: S.Si., G.I., G.P.S., J.I.A., O.H., I.I., G.F., D.F., J.I., N.K., L.D.L., M.P., R.R.G., M.T.K., R.N. Statistical analysis: S.Se. Provision of study materials or patients: S.Si., G.I., G.P.S., J.I.A., O.H., I.I., G.F., D.F., J.I., N.K., L.D.L., M.P., R.R.G. Obtaining funding: E.M.G. Administrative, technical, or logistic support: M.T.K., R.N., S.Si. Supervision: E.M.G., S.Sl. All authors have read and approved the article.

Author Disclosure Statement

The authors have no conflict of interest to declare.

Funding Information

This work was supported by an EORTC grant (#12/2016). The EORTC Quality of Life Group business model involves charges for commercial companies using EORTC instruments. Academic use of EORTC instruments is free of charge.

Supplementary Material

Supplementary Data S1

Supplementary Table S1

Supplementary Table S2

Supplementary Table S3

Supplementary Table S4

Supplementary Table S5

Supplementary Table S6

Supplementary Table S7

Supplementary Table S8

Supplementary Table S9

Supplementary Table S10

References

1. Roberts SLE, Healey, Sevdalis. Use of health economic evaluation in the implementation and improvement science fields-a systematic literature review. Implement Sci, 2019; 14(1):72; doi: 10.1186/s13012-019-0901-7

2. Neumann, Thorat, Shi, et al. The changing face of the cost-utility literature, 1990-2012. Value Health, 2015; 18(2):271–277; doi: 10.1016/j.jval.2014.12.002

3. Roudijk, Donders ART, Stalmeier PFM. Setting dead at zero: Applying scale properties to the QALY Model. Med Decis Making, 2018; 38(6):627–634; doi: 10.1177/0272989X18765184

4. Brazier, Akehurst, Brennan, et al. Should patients have a greater role in valuing health states? Appl Health Econ Health Policy, 2005; 4(4):201–208; doi: 10.2165/00148365-200504040-00002

5. Zoratti, Zhou, Chan, et al. Health Utility Book (HUB)-Cancer: Protocol for a systematic literature review of health state utility values in cancer. MDM Policy Pract, 2019; 4(2):2381468319852594; doi: 10.1177/2381468319852594

6. Schuster-Bruce, Jani, Goodall, et al. A Comparison of the burden of thyroid cancer among the European Union 15+ Countries, 1990-2019: Estimates From the Global Burden of Disease Study. JAMA Otolaryngol Head Neck Surg, 2022; 148(4):350–359; doi: 10.1001/jamaoto.2021.4549

7. Surveillance Research Program, National Cancer Institute. SEER*Explorer: An interactive website for SEER cancer statistics. Available from: https://seer.cancer.gov/statistics-network/explorer/

8. , Lu, Hu, et al. A prediction model for the 5-year, 10-year and 20-year mortality of medullary thyroid carcinoma patients based on lymph node ratio and other predictors. Front Surg, 2022; 9:1044971; doi: 10.3389/fsurg.2022.1044971

9. Yau, Lo, Epstein, et al. Treatment outcomes in anaplastic thyroid carcinoma: Survival improvement in young patients with localized disease treated by combination of surgery and radiotherapy. Ann Surg Oncol, 2008; 15(9):2500–2505; doi: 10.1245/s10434-008-0005-0

10. Sugitani, Miyauchi, Sugino, et al. Prognostic factors and treatment outcomes for anaplastic thyroid carcinoma: ATC Research Consortium of Japan cohort study of 677 patients. World J Surg, 2012; 36(6):1247–1254; doi: 10.1007/s00268-012-1437-z

11. van Velsen EFS, Massolt, Heersema, et al. Longitudinal analysis of quality of life in patients treated for differentiated thyroid cancer. Eur J Endocrinol, 2019; 181(6):671–679; doi: 10.1530/EJE-19-0550

12. Hoftijzer, Heemstra, Corssmit EPM, et al. Quality of life in cured patients with differentiated thyroid carcinoma. J Clin Endocrinol Metab, 2008; 93(1):200–203; doi: 10.1210/jc.2007-1203

13. Büttner, Hinz, Singer, et al. Quality of life of patients more than 1 year after surgery for thyroid cancer. Hormones (Athens), 2020; 19(2):233–243; doi: 10.1007/s42000-020-00186-x

14. Tan LGL, Nan, Thumboo, et al. Health-related quality of life in thyroid cancer survivors. Laryngoscope, 2007; 117(3):507–510; doi: 10.1097/MLG.0b013e31802e3739

15. Lamartina, Grani, Durante, et al. Follow-up of differentiated thyroid cancer - what should (and what should not) be done. Nat Rev Endocrinol, 2018; 14(9):538–551; doi: 10.1038/s41574-018-0068-3

16. Colombo, Giancola, Fugazzola. Personalized treatment for differentiated thyroid cancer: Current data and new perspectives. Minerva Endocrinol (Torino), 2021; 46(1):62–89; doi: 10.23736/S2724-6507.20.03342-8

17. King, Costa DSJ, Aaronson, et al. QLU-C10D: A health state classification system for a multi-attribute utility measure based on the EORTC QLQ-C30. Qual Life Res, 2016; 25(3):625–636; doi: 10.1007/s11136-015-1217-y

18. King, Viney, Simon Pickard, et al. MAUCa Consortium. Australian utility weights for the EORTC QLU-C10D, a multi-attribute utility instrument derived from the Cancer-Specific Quality of Life Questionnaire, EORTC QLQ-C30. Pharmacoeconomics, 2018; 36(2):225–238; doi: 10.1007/s40273-017-0582-5

19. Gamper, King, Norman, et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group. EORTC QLU-C10D value sets for Austria, Italy, and Poland. Qual Life Res, 2020; 29(9):2485–2495; doi: 10.1007/s11136-020-02536-z

20. Jansen, Verdonck-de Leeuw, Gamper, et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group. Dutch utility weights for the EORTC cancer-specific utility instrument: The Dutch EORTC QLU-C10D. Qual Life Res, 2021; 30(7):2009–2019; doi: 10.1007/s11136-021-02767-8

21. Kemmler, Gamper, Nerich, et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group. German value sets for the EORTC QLU-C10D, a cancer-specific utility instrument based on the EORTC QLQ-C30. Qual Life Res, 2019; 28(12):3197–3211; doi: 10.1007/s11136-019-02283-w

22. McTaggart-Cowan, King, Norman, et al. The EORTC QLU-C10D: The Canadian Valuation Study and algorithm to derive cancer-specific utilities from the EORTC QLQ-C30. MDM Policy Pract, 2019; 4(1):2381468319842532; doi: 10.1177/2381468319842532

23. Nerich, Gamper, Norman, et al. French value-set of the QLU-C10D, a cancer-specific utility measure derived from the QLQ-C30. Appl Health Econ Health Policy, 2021; 19(2):191–202; doi: 10.1007/s40258-020-00598-1

24. Norman, Mercieca-Bebber, Rowen, et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group and the MAUCa Consortium. U.K. utility weights for the EORTC QLU-C10D. Health Econ, 2019; 28(12):1385–1401; doi: 10.1002/hec.3950

25. Revicki, King, Viney, et al. United States utility algorithm for the EORTC QLU-C10D, a multiattribute utility instrument based on a cancer-specific Quality-of-Life Instrument. Med Decis Making, 2021; 41(4):485–501; doi: 10.1177/0272989X211003569

26. , Wong EL-Y, Luo, et al. EORTC QLG. The EORTC QLU-C10D: The Hong Kong valuation study. Eur J Health Econ, 2024; 25(5):889–901; doi: 10.1007/s10198-023-01632-4

27. Finch, Gamper, Norman, et al. EORTC Quality of Life Group. Estimation of an EORTC QLU-C10 value set for Spain using a discrete choice experiment. Pharmacoeconomics, 2021; 39(9):1085–1098; doi: 10.1007/s40273-021-01058-x

28. Giesinger, Efficace, Aaronson, et al. Past and current practice of patient-reported outcome measurement in randomized cancer clinical trials: A systematic review. Value Health, 2021; 24(4):585–591; doi: 10.1016/j.jval.2020.11.004

29. Aaronson, Ahmedzai, Bergman, et al. The European Organization for Research and Treatment of Cancer QLQ-C30: A quality-of-life instrument for use in international clinical trials in oncology. J Natl Cancer Inst, 1993; 85(5):365–376; doi: 10.1093/jnci/85.5.365

30. Rabin, de Charro F. EQ-5D: A measure of health status from the EuroQol Group. Ann Med, 2001; 33(5):337–343; doi: 10.3109/07853890109002087

31. Herdman, Gudex, Lloyd, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res, 2011; 20(10):1727–1736; doi: 10.1007/s11136-011-9903-x

32. Feeny, Furlong, Boyle, et al. Multi-attribute health status classification systems. Health Utilities Index. Pharmacoeconomics, 1995; 7(6):490–502; doi: 10.2165/00019053-199507060-00004

33. Brazier, Usherwood, Harper, et al. Deriving a preference-based single index from the UK SF-36 Health Survey. J Clin Epidemiol, 1998; 51(11):1115–1128; doi: 10.1016/s0895-4356(98)00103-6

34. Terwee, Bot SDM, Boer MRd, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol, 2007; 60(1):34–42; doi: 10.1016/j.jclinepi.2006.03.012

35. Singer, Al-Ibraheem, Pinto, et al. International Phase IV field study for the reliability and validity of the European Organisation for Research and Treatment of Cancer thyroid cancer module EORTC QLQ-THY34. Thyroid, 2023; 33(9):1078–1089; doi: 10.1089/thy.2023.0221

36. EORTC Quality of Life Group. EORTC Quality of Life Utility – Core 10 Dimensions (QLU-C10D): Overview of available value sets for the QLU-C10D. Available from: https://qol.eortc.org/eortc-qlu-c10d/ [Last accessed: April 22, 2024].

37. Zhou, Guan, Wang, et al. Health-related quality of life in patients with different diseases measured with the EQ-5D-5L: A systematic review. Front Public Health, 2021; 9:675523; doi: 10.3389/fpubh.2021.675523

38. Stolk, Ramos-Goñi, Ludwig, et al. The development and strengthening of methods for valuing EQ-5D-5L – An overview. In: Value Sets for EQ-5D-5L: A Compendium, Comparative Review & User Guide. (Devlin, Roudijk, Ludwig, eds.) Springer; 2022: pp. 13–27.

39. Osoba, Rodrigues, Myles, et al. Interpreting the significance of changes in health-related quality-of-life scores. J Clin Oncol, 1998; 16(1):139–144; doi: 10.1200/JCO.1998.16.1.139

40. Singer, Husson, Tomaszewska, et al. Quality-of-Life priorities in patients with thyroid cancer: A multinational European Organisation for Research and Treatment of Cancer Phase I Study. Thyroid, 2016; 26(11):1605–1613; doi: 10.1089/thy.2015.0640

41. Singer, Jordan, Locati, et al. EORTC Quality of Life Group, the EORTC Head and Neck Cancer Group, and the EORTC Endocrine Task Force. The EORTC module for quality of life in patients with thyroid cancer: Phase III. Endocr Relat Cancer, 2017; 24(4):197–207; doi: 10.1530/ERC-16-0530

42. Devlin, Shah, Feng, et al. Valuing health-related quality of life: An EQ-5D-5L value set for England. Health Econ, 2018; 27(1):7–22; doi: 10.1002/hec.3564

43. Finch, Meregaglia, Ciani, et al. An EQ-5D-5L value set for Italy using videoconferencing interviews and feasibility of a new mode of administration. Soc Sci Med, 2022; 292:114519; doi: 10.1016/j.socscimed.2021.114519

44. Ludwig, Graf von der Schulenburg J-M, Greiner. German value set for the EQ-5D-5L. Pharmacoeconomics, 2018; 36(6):663–674; doi: 10.1007/s40273-018-0615-8

45. Norman, Mulhern, Lancsar, et al. The use of a discrete choice experiment including both duration and dead for the development of an EQ-5D-5L value set for Australia. Pharmacoeconomics, 2023; 41(4):427–438; doi: 10.1007/s40273-023-01243-0

46. Ramos-Goñi, Craig, Oppe, et al. Handling data quality issues to estimate the Spanish EQ-5D-5L value set using a hybrid interval regression approach. Value Health, 2018; 21(5):596–604; doi: 10.1016/j.jval.2017.10.023

47. Shiroiwa, Ikeda, Noto, et al. Comparison of value set based on DCE and/or TTO data: Scoring for EQ-5D-5L health states in Japan. Value Health, 2016; 19(5):648–654; doi: 10.1016/j.jval.2016.03.1834

48. Xie, Pullenayegum, Gaebel, et al. Canadian EQ-5D-5L Valuation Study Group. A time trade-off-derived value set of the EQ-5D-5L for Canada. Med Care, 2016; 54(1):98–105; doi: 10.1097/MLR.0000000000000447

49. Knies, Evers SMAA, Candel MJJM, et al. Utilities of the EQ-5D. Pharmacoeconomics, 2009; 27(9):767–779; doi: 10.2165/11314120-000000000-00000

50. Canada’s Drug and Health Technology Agency. Guidelines for the Economic Evaluation of Health Technologies: Canada – 4th Edition. Available from: https://www.cadth.ca/guidelines-economic-evaluation-health-technologies-canada-4th-edition [Last accessed: February 6, 2024].

51. Pilz, Nolte, Liegl, et al. EORTC Quality of Life Group. The European Organisation for Research and Treatment of Cancer Quality of Life Utility-Core 10 Dimensions: Development and investigation of general population utility norms for Canada, France, Germany, Italy, Poland, and the United Kingdom. Value Health, 2023; 26(5):760–767; doi: 10.1016/j.jval.2022.12.009

52. Grochtdreis, Dams, König H-H, et al. Health-related quality of life measured with the EQ-5D-5L: Estimation of normative index values based on a representative German population sample and value set. Eur J Health Econ, 2019; 20(6):933–944; doi: 10.1007/s10198-019-01054-1

53. Hernandez, Garin, Pardo, et al. Validity of the EQ-5D-5L and reference norms for the Spanish population. Qual Life Res, 2018; 27(9):2337–2348; doi: 10.1007/s11136-018-1877-5

54. Klapproth, Sidey-Gibbons, Valderas, et al. Comparison of the PROMIS Preference Score (PROPr) and EQ-5D-5L Index Value in General Population Samples in the United Kingdom, France, and Germany. Value Health, 2022; 25(5):824–834; doi: 10.1016/j.jval.2021.10.012

55. McCaffrey, Kaambwa, Currow, et al. Health-related quality of life measured using the EQ-5D-5L: South Australian population norms. Health Qual Life Outcomes, 2016; 14(1):133; doi: 10.1186/s12955-016-0537-0

56. Meregaglia, Malandrini, Finch, et al. EQ-5D-5L Population Norms for Italy. Appl Health Econ Health Policy, 2023; 21(2):289–303; doi: 10.1007/s40258-022-00772-7

57. Shiroiwa, Fukuda, Ikeda, et al. Japanese population norms for preference-based measures: EQ-5D-3L, EQ-5D-5L, and SF-6D. Qual Life Res, 2016; 25(3):707–719; doi: 10.1007/s11136-015-1108-2

58. Yan, Xie, Johnson, et al. Canada population norms for the EQ-5D-5L. Eur J Health Econ, 2024; 25(1):147–155; doi: 10.1007/s10198-023-01570-1

59. Akoglu. User’s guide to correlation coefficients. Turk J Emerg Med, 2018; 18(3):91–93; doi: 10.1016/j.tjem.2018.08.001

60. Cicchetti. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 1994; 6(4):284–290.

61. Coretti, Ruggeri, McNamee. The minimum clinically important difference for EQ-5D index: A critical review. Expert Rev Pharmacoecon Outcomes Res, 2014; 14(2):221–233; doi: 10.1586/14737167.2014.894462

62. Henry, Barry, Hobbins, et al. Estimation of an instrument-defined minimally important difference in EQ-5D-5L index scores based on scoring algorithms derived using the EQ-VT Version 2 valuation protocols. Value Health, 2020; 23(7):936–944; doi: 10.1016/j.jval.2020.03.003

63. Pickard, Neary, Cella. Estimation of minimally important differences in EQ-5D utility and VAS scores in cancer. Health Qual Life Outcomes, 2007; 5:70; doi: 10.1186/1477-7525-5-70

64. Fayers, Machin, eds. Quality of Life: The Assessment, Analysis and Interpretation of Patient-Reported Outcomes. John Wiley & Sons, Ltd; 2007.

65. Liang, Larson, Cullen, et al. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum, 1985; 28(5):542–547; doi: 10.1002/art.1780280513

66. King, Bell, Costa, et al. The Quality of Life Questionnaire Core 30 (QLQ-C30) and Functional Assessment of Cancer-General (FACT-G) differ in responsiveness, relative efficiency, and therefore required sample size. J Clin Epidemiol, 2014; 67(1):100–107; doi: 10.1016/j.jclinepi.2013.02.019

67. Holm. A Simple sequentially rejective multiple test procedure. Scand J Stat, 1979; 6(2):65–70. http://www.jstor.org/stable/4615733

68. Gamper, Cottone, Sommer, et al. The EORTC QLU-C10D was more efficient in detecting clinical known group differences in myelodysplastic syndromes than the EQ-5D-3L. J Clin Epidemiol, 2021; 137:31–44; doi: 10.1016/j.jclinepi.2021.03.015

69. Pilz, Seyringer, Hallsson, et al. The EORTC QLU-C10D is a valid cancer-specific preference-based measure for cost-utility and health technology assessment in the Netherlands. Eur J Health Econ, 2024; doi: 10.1007/s10198-024-01670-6

70. Pilz, Seyringer, Al-Naesan, et al. EORTC Quality of Life Group. Cancer-specific health utilities: Evaluation of core measurement properties of the EORTC QLU-C10D in lung cancer patients-data from four multicentre lux-lung trials, applying six country tariffs. Pharmacoecon Open, 2024; 8(4):627–640; doi: 10.1007/s41669-024-00484-9

71. Bulamu, Chen, Ratcliffe, et al. Health-Related Quality of life associated with Barrett’s esophagus and cancer. World J Surg, 2019; 43(6):1554–1562; doi: 10.1007/s00268-019-04936-w

72. Bulamu, Chen, McGrane, et al. Health utility assessments in individuals undergoing diagnostic and surveillance colonoscopy: Improved discrimination with a cancer-specific scale. Cancer Causes Control, 2024; 35(2):347–357; doi: 10.1007/s10552-023-01789-6

73. Pan C-W, He J-Y, Zhu Y-B, et al. Comparison of EQ-5D-5L and EORTC QLU-C10D utilities in gastric cancer patients. Eur J Health Econ, 2023; 24(6):885–893; doi: 10.1007/s10198-022-01523-0

74. Shaw, Bennett, Trigg, et al. A Comparison of generic and condition-specific preference-based measures using data from nivolumab trials: EQ-5D-3L, Mapping to the EQ-5D-5L, and European Organisation for Research and Treatment of Cancer Quality of Life Utility Measure-Core 10 Dimensions. Value Health, 2021; 24(11):1651–1659; doi: 10.1016/j.jval.2021.05.022

75. Thiagarajan, Fatehi, Menon, et al. Assessment of quality of life in thyroid cancer patients using the EORTC thyroid-specific questionnaire: A prospective cross-sectional study. Eur Arch Otorhinolaryngol, 2024; 281(4):1953–1960; doi: 10.1007/s00405-024-08471-w

76. Tagay, Herpertz, Langkafel, et al. Health-related Quality of Life, depression and anxiety in thyroid cancer patients. Qual Life Res, 2006; 15(4):695–703; doi: 10.1007/s11136-005-3689-7

77. Gibson AEJ, Longworth, Bennett, et al. Assessing the content validity of preference-based measures in cancer. Value Health, 2024; 27(1):70–78; doi: 10.1016/j.jval.2023.10.006

78. Wolowacz. New ISPOR recommendations – mapping methods for estimation of health state utility. Value Health, 2017; 20(1):28–29; doi: 10.1016/j.jval.2016.11.026

79. Lubitz, Gregorio, Fingeret, et al. Measurement and variation in estimation of quality of life effects of patients undergoing treatment for papillary thyroid carcinoma. Thyroid, 2017; 27(2):197–206; doi: 10.1089/thy.2016.0260

80. Kim, Cook, Goodall, et al. Comparison of EQ-5D-3L with QLU-C10D in metastatic melanoma using cost-utility analysis. Pharmacoecon Open, 2021; 5(3):459–467; doi: 10.1007/s41669-021-00265-8

81. Bulamu, Vissapragada, Chen, et al. Australian Immunonutrition Study Group. Responsiveness and convergent validity of QLU-C10D and EQ-5D-3L in assessing short-term quality of life following esophagectomy. Health Qual Life Outcomes, 2021; 19(1):233; doi: 10.1186/s12955-021-01867-w

82. Klapproth, Fischer, Rose, et al. Health state utility differed systematically in breast cancer patients between the EORTC QLU-C10D and the PROMIS Preference Score. J Clin Epidemiol, 2022; 152:101–109; doi: 10.1016/j.jclinepi.2022.09.010

83. Soare I-A, Leeuwenkamp, Longworth. Estimation of health-related utilities for (177)Lu-DOTATATE in GEP-NET patients using utilities mapped from EORTC QLQ-C30 to EQ-5D-3L and QLU-C10D utilities. Pharmacoecon Open, 2021; 5(4):715–725; doi: 10.1007/s41669-021-00280-9

84. Lorgelly, Doble, Rowen, et al. Cancer 2015 investigators. Condition-specific or generic preference-based measures in oncology? A comparison of the EORTC-8D and the EQ-5D-3L. Qual Life Res, 2017; 26(5):1163–1176; doi: 10.1007/s11136-016-1443-y

85. Versteegh, Leunis, Uyl-de Groot, et al. Condition-specific preference-based measures: Benefit or burden? Value Health, 2012; 15(3):504–513; doi: 10.1016/j.jval.2011.12.003

86. Yang, Vioix, Hook, et al. Health utility analysis of tepotinib in patients with non-small cell lung cancer harboring MET Exon 14 skipping. Value Health, 2023; 26(8):1155–1163; doi: 10.1016/j.jval.2023.02.007

87. Jansen, Coupé VMH, Eerenstein SEJ, et al. Cost-utility and cost-effectiveness of a guided self-help head and neck exercise program for patients treated with total laryngectomy: Results of a multi-center randomized controlled trial. Oral Oncol, 2021; 117:105306; doi: 10.1016/j.oraloncology.2021.105306

88. Péus, Newcomb, Hofer. Appraisal of the Karnofsky Performance Status and proposal of a simple algorithmic system for its evaluation. BMC Med Inform Decis Mak, 2013; 13:72; doi: 10.1186/1472-6947-13-72

89. Kim, Kim, Lee, et al. Comparison of long-term prognosis for differentiated thyroid cancer according to the 7th and 8th editions of the AJCC/UICC TNM staging system. Ther Adv Endocrinol Metab, 2020; 11:2042018820921019; doi: 10.1177/2042018820921019