Abstract
Background and objective:
Missing data are a common problem in palliative care research. Often the most impaired patients are unable to participate in studies. This may result in biased findings. We investigated whether observed patient reported outcomes should be adjusted for bias resulting from nonparticipation.
Methods:
Of 791 patients with cancer admitted to palliative care, 304 (38%) participated by answering the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 questionnaire. For the 15 symptoms and problems measured by the EORTC QLQ-C30, mean scores based on observed responses only were compared with mean scores from two methods that included imputed (estimated) scores, based on patient characteristics, for the missing data.
Results:
All mean differences between scores based on observed responses only and the two methods including imputed scores for the missing data were less than 5 on a 0–100 scale. For 4 of the 30 comparisons a significant underestimation of symptomatology was found when using observed responses only.
Conclusions:
We did not find indications that using observed responses only resulted in clinically important underestimation of palliative care patients' symptomatology. Either nonparticipants' scores did not differ significantly from participants' scores or the variables used to describe the nonparticipants were insufficient predictors of the patients' scores. In any case, the study indicated that imputation of scores of nonparticipants in palliative care may not be worthwhile unless very good predictors are available.
Introduction
Usually the risk of biased findings resulting from nonparticipation is evaluated by comparing participating and nonparticipating patients with regard to background variables such as age, gender, and disease site. But to our knowledge no studies in palliative care have tried to quantify the possible bias resulting from nonparticipation or have investigated whether results based on the observed health-related quality of life (HRQOL) scores from participants only should be adjusted for such bias to better reflect the level of symptomatology of the total patient sample (participants and nonparticipants).
As an alternative to patient self-assessment, proxy assessment could be used to assess the patients' HRQOL. However, poor agreement between staff ratings and patient ratings has been found in palliative care14–16 and it has been recommended to avoid staff ratings as substitutes for patient self-assessments.14,16 Alternatively, family caregivers or significant others may be used as proxies. However, poor agreement has also been observed for significant others, particularly for more subjective aspects of the patient's HRQOL (e.g., pain and feelings).17 Furthermore, some significant others may be too emotionally affected by their relative's condition to be able and willing to participate as proxies and therefore, even if proxies are included, complete data cannot be expected. Based on these considerations we did not investigate the use of proxy ratings in the present study.
The aim of the present study was to investigate whether nonparticipation/missing data in palliative care studies leads to biased estimates of the patients' HRQOL. We focused on the patients' first contact with the palliative care department. That is, we investigated how to describe the symptomatology of all patients admitted to a palliative care department. We compared HRQOL scores from participants only to results where estimated scores for nonparticipants were added. We acknowledge that there is no way of knowing the true scores of nonparticipants, but the study elucidates whether there is evidence for significant bias resulting from nonparticipation. If such bias is found adjustments of scores may be required to better reflect the HRQOL of the total patient population.
Methods
Study population
From June 1998 to August 2003, a longitudinal study evaluating the symptomatology of palliative care patients using patient self-assessment questionnaires was carried out in the Department of Palliative Medicine, Bispebjerg Hospital, Copenhagen, Denmark. The study was approved by the local ethics committee. Inclusion criteria were admission to the department, ability to speak Danish, age of at least 18 years, and informed consent.18 In this period, 791 eligible patients were admitted to the department. These 791 patients constituted the total patient sample we wanted to evaluate. Evaluations based on data collected in the first 2 years18–21 and the last 2 years,14,22 respectively, have previously been reported.
Study design and questionnaire
On the day of first contact with the department the patients were informed about the study unless the staff considered this inappropriate due to the patient's situation. Consenting patients received a set of questionnaires including the EORTC QLQ-C30 (version 3.0).3 The EORTC QLQ-C30 is one of the most widely used cancer-specific HRQOL questionnaires.23,24 It consists of 30 items measuring 15 HRQOL domains: 9 multi-item scales (6 functional and 3 symptom scales) and 6 single-item symptom measures. The scale scores are constructed by summation of item responses and transformation to a 0–100 scale.25 For functional scales higher scores reflect better functioning, whereas for symptom measures higher scores reflect more severe symptomatology. The possible effect of nonparticipation was investigated for these 15 HRQOL domains.
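The transformation to a 0–100 scale can be illustrated with a minimal sketch. It assumes the standard linear transformation described in the EORTC scoring manual (raw score = mean of the answered items, rescaled by the item range); the function name and the `reverse` flag are illustrative, and the manual's rules for scales with unanswered items are not handled here.

```python
from statistics import mean


def qlq_c30_scale_score(item_responses, item_range=3, reverse=False):
    """Linearly transform the raw score of one scale to 0-100.

    item_responses: answered items of the scale (1-4 for most items,
        1-7 for the two global quality of life items).
    item_range: maximum minus minimum response value
        (3 for 1-4 items, 6 for 1-7 items).
    reverse: True for the functioning scales whose items are worded so
        that a high response means more problems; the score is then
        reversed so that a higher score means better functioning.
    """
    raw_score = mean(item_responses)          # mean of the answered items
    fraction = (raw_score - 1) / item_range   # 0 = lowest, 1 = highest response
    if reverse:
        fraction = 1 - fraction
    return fraction * 100


# A two-item symptom scale answered "2" and "3" on the 1-4 response scale.
example_score = qlq_c30_scale_score([2, 3])
```

In this sketch a single-item symptom measure can only take the values 0, 33.3, 66.7, and 100, which is why the domain scores are later treated as ordinal outcomes.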
Participants
We used two definitions of a participant: (1) For the comparisons of participants and nonparticipants a participant was defined as a patient who gave informed consent and actively participated in the study by answering at least one item from the EORTC QLQ-C30 and (2) For the comparison of methods for measuring quality of life we defined the participants for each of the 15 QLQ-C30 domains as the patients who had a valid score for the particular domain. That is, the number of participants may vary across the HRQOL domains.
Statistical analyses
Comparison of participants and nonparticipants
To assess how representative the participating patients were for the total sample of patients we compared the participating and nonparticipating patients with regard to: age, gender, affiliation to the department (inpatient, outpatient, home care, or nursing home), cancer site (breast, colorectal, gynecologic, head and neck, lung, or others), Karnofsky Performance Status (KPS), and time to death after admission. The t test (age and KPS), χ2 test (gender, affiliation, and site), and log-rank test (time to death) were used for univariate comparisons. Furthermore, multiple logistic regression analysis26 with participation/nonparticipation as outcome and the six abovementioned variables as explanatory variables was conducted. Finally, using multiple logistic regression for ordinal outcomes we investigated whether variables significantly associated with participation also were significant predictors for the scores on the 15 HRQOL domains. If a variable was a significant predictor both for participation and for the score on an HRQOL domain, the participants might not be a representative sample, i.e., results based on participants only might be biased for that domain.
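As an illustration of one of the univariate comparisons, the following sketch computes a Pearson χ2 test for a 2 × 2 table such as gender by participation. It is not the analysis code used in the study: the function name is illustrative, the counts are hypothetical and not taken from Table 1, and no continuity correction is applied.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table: [[a, b], [c, d]], e.g. rows = participants / nonparticipants
        and columns = women / men.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts for illustration only.
toy_table = [[20, 10], [10, 20]]
statistic, p_value = chi_square_2x2(toy_table)
```

The closed-form p value only holds for 1 degree of freedom; the comparisons of affiliation and cancer site involve larger tables and would need the general χ2 distribution.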
Comparison of methods for measuring quality of life
Three analytic procedures for assessing HRQOL were compared: the standard approach using scores from the participating patients only (i.e., missing data were ignored) and two procedures using observed scores from the participating patients but additionally using imputed (estimated) scores for the missing data. The imputed scores were predicted from the characteristics of the nonparticipating patients. Assuming that the imputed scores are reasonable approximations of the true but unobserved scores for the nonparticipants a comparison of the mean scores based on observed scores only and mean scores based on the observed plus imputed scores will indicate whether using observed scores only leads to biased findings. For each HRQOL domain the mean score obtained using the observed data only was compared to the mean scores based on the observed plus imputed data using t tests.
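The core of this comparison can be sketched in a few lines. The function name and the scores are hypothetical, and the t test used in the study is not reproduced here.

```python
from statistics import mean


def compare_means(observed_scores, imputed_scores):
    """Return (mean of observed scores, mean of observed plus imputed
    scores, difference between the two) for one HRQOL domain."""
    observed_mean = mean(observed_scores)
    combined_mean = mean(list(observed_scores) + list(imputed_scores))
    return observed_mean, combined_mean, combined_mean - observed_mean


# Hypothetical symptom scores on the 0-100 scale.
observed = [40, 60, 50]   # participants
imputed = [70, 80]        # estimated scores for nonparticipants
observed_mean, combined_mean, difference = compare_means(observed, imputed)
```

The difference equals the proportion of patients with imputed scores multiplied by the gap between the mean imputed score and the mean observed score (here 2/5 × 25 = 10). A small difference therefore means that the imputed scores were, on average, close to the observed ones.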
The following describes the three procedures for assessing the HRQOL of the palliative care patients.
Observed scores from participants. For each of the 15 QLQ-C30 scales we calculated the mean score and standard deviation based on the responses from the participating patients only. This is the usual procedure for evaluating the quality of life of a population of palliative care patients. Here it is implicitly assumed that the participants are a representative (randomly selected) subsample of the original target population (here the total sample of patients admitted to the department). If this is not the case, the procedure leads to biased findings.
Observed scores combined with imputed scores using logistic regression. If a patient had an observed score this score was used. If not, a score was imputed using ordinal logistic regression.27–29 For each HRQOL domain we estimated an ordinal logistic regression model with the domain score as the dependent variable. As explanatory variables we used: age, gender, affiliation to the department, cancer site, KPS, and time to death. For each patient with a missing score on a domain we used the expected score calculated from the estimated regression model as the imputed score, i.e., the average score we would expect for patients with that particular combination of characteristics.
Observed scores combined with imputed scores using the “closest neighbor” method. Again the observed score was used when available. If not available, an imputed score was calculated using a variant of the so-called closest or nearest neighbor method.29,30 The idea of this method is that patients with similar characteristics (same age, gender, etc.) are more likely to have similar HRQOL scores. Therefore, for each patient with a missing score the closest neighbor(s) among the patients with observed scores was identified and the mean of these patients' observed scores was then used as the imputed score. To find the closest neighbors we used the patients' age, gender, affiliation to the department, cancer site, KPS score, and time to death. Here we grouped age into less than 50, 50–60, 60–70, 70–75, greater than 75 years, KPS into 0–30, 40, 50, 60, 70–100, and time to death into 1–10, 11–20, 21–30, 31–50, more than 50 days. For each patient with a missing score we found all participating patients with the same characteristics, calculated their mean score on the HRQOL domain and used this mean score as the imputed score. For example, if an inpatient above 75 years with breast cancer and a KPS score of 40 who died less than 10 days after admission, did not have a physical functioning score we found all patients with the same characteristics, calculated their mean physical functioning score and used this as the imputed score.
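The closest neighbor procedure can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the dictionary keys and function names are hypothetical, ages are assumed to be whole years, the handling of overlapping age boundaries (e.g., whether 60 belongs to 50–60 or 60–70) is a guess, and the source does not state what was done when no participant shared a nonparticipant's characteristics, so such cases are left unresolved (`None`).

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean

# Inclusive upper bounds of each band; the last band is open-ended.
AGE_UPPER = [49, 59, 69, 75]    # <50, 50-59, 60-69, 70-75, >75 (assumed cut points)
KPS_UPPER = [30, 40, 50, 60]    # 0-30, 40, 50, 60, 70-100
DAYS_UPPER = [10, 20, 30, 50]   # 1-10, 11-20, 21-30, 31-50, >50


def band(value, upper_bounds):
    """Index of the first band whose inclusive upper bound is >= value."""
    return bisect_left(upper_bounds, value)


def profile(patient):
    """Combination of grouped characteristics used for exact matching."""
    return (
        band(patient["age"], AGE_UPPER),
        patient["gender"],
        patient["affiliation"],
        patient["site"],
        band(patient["kps"], KPS_UPPER),
        band(patient["days_to_death"], DAYS_UPPER),
    )


def impute_closest_neighbor(patients):
    """Return one score per patient for a single HRQOL domain.

    Observed scores are kept; a missing score (None) is replaced by the
    mean observed score of participants with the same profile, or left
    as None when no such participant exists.
    """
    scores_by_profile = defaultdict(list)
    for patient in patients:
        if patient["score"] is not None:
            scores_by_profile[profile(patient)].append(patient["score"])

    completed = []
    for patient in patients:
        if patient["score"] is not None:
            completed.append(patient["score"])
        else:
            donors = scores_by_profile.get(profile(patient))
            completed.append(mean(donors) if donors else None)
    return completed
```

Exact matching on six grouped variables produces many small cells, so in practice some rule for widening the match would be needed when a cell contains no participants.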
We used two different imputation methods because there was no clear choice for the best method and using two methods would elucidate whether the choice of imputation method significantly affected the findings. If the two methods resulted in the same findings this would strengthen the conclusions.
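For the regression-based method, the expected score is the probability-weighted average of the possible domain scores. The sketch below assumes a cumulative logit (proportional odds) parameterization, P(score ≤ level k) = 1 / (1 + exp(−(θk − η))), where η is the linear predictor formed from the patient's characteristics; the source does not specify the parameterization, and the thresholds and function name here are hypothetical.

```python
from math import exp


def expected_score(score_levels, thresholds, linear_predictor):
    """Expected domain score under a cumulative logit model.

    score_levels: possible scores in ascending order,
        e.g. [0, 100/3, 200/3, 100] for a single-item measure.
    thresholds: ascending cut points, one fewer than score_levels.
    linear_predictor: value of the linear predictor for one patient;
        with this sign convention a larger value shifts probability
        toward higher scores.
    """
    cumulative = [1 / (1 + exp(-(t - linear_predictor))) for t in thresholds]
    cumulative.append(1.0)  # the highest level has cumulative probability 1
    expected = 0.0
    previous = 0.0
    for level, cum in zip(score_levels, cumulative):
        expected += level * (cum - previous)  # score times category probability
        previous = cum
    return expected


levels = [0, 100 / 3, 200 / 3, 100]
toy_thresholds = [-1.0, 0.0, 1.0]   # hypothetical values, not estimates from the study
imputed = expected_score(levels, toy_thresholds, linear_predictor=0.0)
```

Because the imputed value is an average over all score levels, it will generally not coincide with any of the scores a patient could actually report.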
For all analyses conducted the level of significance was set at p = 0.05.
Results
Comparison of participants and nonparticipants
Table 1 shows the characteristics of the participating and the nonparticipating patients. In all, 304 patients (38%) participated by giving informed consent and answering at least one item from the EORTC QLQ-C30 (206 patients (68%) had answered all 30 items, 75 (25%) had answered 25–29 of the 30 items while 23 (8%) had answered 5–24 items). The remaining 487 patients were the nonparticipants for whom we had no direct information about their HRQOL. The univariate comparisons indicated that participants had significantly better performance status (KPS) and lived longer. This was confirmed in the multivariate logistic regression analysis. Furthermore, the regression analysis revealed that inpatients participated significantly more often than other patients (Table 2).
Total n = 791, 304 participating (38%) and 487 nonparticipating (62%) patients.
Participants: patients responding to one or more of the EORTC QLQ-C30 items; nonparticipants: patients with no HRQOL assessments.
Test statistics: t test (age, KPS), χ2 (gender, affiliation, and site), and, log-rank (time to death).
KPS, Karnofsky Performance Status; EORTC, European Organization for Research and Treatment of Cancer; HRQOL, health-related quality of life.
Odds ratio for participating compared to not participating.
KPS, Karnofsky Performance Status.
Logistic regression analyses showed that physical functioning depended significantly on affiliation to the department and time to death; role functioning, cognitive functioning, and global quality of life depended on KPS and time to death; nausea/vomiting depended on affiliation; emotional functioning, fatigue, and dyspnea depended on time to death; social functioning and constipation depended on KPS. The level of pain, insomnia, appetite loss, diarrhea, and financial difficulties did not depend on affiliation, KPS, or time to death (Table 3).
p values <0.05 are in
KPS, Karnofsky Performance Status.
Comparison of methods for measuring quality of life
Table 4 shows the mean scores for the 15 HRQOL domains using each of the three measurement procedures: the mean scores based on the observed responses only and mean scores including imputed scores for patients with missing scores using logistic regression and closest neighbor methods, respectively. Generally, the differences in mean scores between the standard method using observed scores only and the two methods including imputed scores for the missing data were small across all 15 HRQOL domains. All mean score differences were less than 5 on a 0–100 scale. Only 5 mean differences were larger than 2 (physical functioning, social functioning, global quality of life, dyspnea, and appetite loss). For half of the 30 comparisons the mean difference was less than 1. The differences between the observed mean scores and the mean scores based on observed plus imputed scores were nonsignificant for 26 of the 30 comparisons (87%). The 4 significant differences indicated an underestimation of problems if using the observed scores only.
Patient reported scores only.
Patient reported scores if available otherwise imputed scores using logistic regression prediction.
Patient reported scores if available otherwise imputed scores using closest neighbor method.
p values for comparing the observed scores and the ‘observed + imputed’ scores based on the particular imputation method. p values <0.05 are in
Discussion
In this study the nonparticipating patients had poorer performance status, survived for a shorter time after admission to the department, and were less likely to be inpatients. These clinical characteristics were also associated with several of the HRQOL domains. One would expect that if patient characteristics are related both to participation and to HRQOL scores then nonparticipation results in bias.
Therefore, it is surprising that we did not find any major differences between mean scores based on participants only and mean scores including imputed values for nonparticipants. All mean differences were less than 5 points on a 0–100 scale. Only four of the 30 comparisons showed a significant difference, and of these only three differences were larger than 2 points. A difference of 10 points or more on a 0–100 scale is often regarded as a clinically relevant difference but it has been suggested that differences as small as 2–5 points may be relevant.31,32 That is, even if we regard 2 points as a clinically relevant difference, only 3 of the 30 comparisons (i.e., 10%) indicated that using the observed scores only resulted in a statistically and clinically significant underestimation of the patients' problems and symptoms. Hence, except for a few borderline cases all differences were so small that they probably have no clinical relevance.
Our finding of only minor differences suggests one of two things. Either the participants were representative, i.e., results based on the observed patient responses can be generalized to the whole patient population admitted to palliative care. Or the imputed scores lacked precision, resulting in imprecise estimates of the nonparticipants' HRQOL. The patient characteristics used explained less than one third of the variation in the participating patients' HRQOL scores. Whether this is insufficient for precise estimation of the nonparticipants' HRQOL cannot be known; this would require that we knew the values of the missing scores. The variables we used to estimate the imputed scores are those most commonly available. If other important factors not accounted for in this study could be identified and included in future studies, more precise imputed scores might be obtained. For example, measures of the patients' socioeconomic status may be candidates for inclusion in future studies. But unless substantial auxiliary information about the nonparticipants and about the HRQOL domains can be obtained, imputing scores for missing data may not improve measurement precision noticeably.
The imputation methods used here have some limitations. First, if the missing data are so-called nonignorable or missing not at random, the methods may fail to recover the missing values.29,30 This could be the case if the missing values are poorly predicted from the available patient characteristics and participation depends on the HRQOL scores we want to measure, e.g., if patients with severe symptoms participate less often. Methods for imputing scores for this type of missing data exist but depend on assumptions that cannot be verified;29 if the assumptions are wrong, such imputed scores may result in even less precision than using observed scores only. Second, all nonparticipants with the same characteristics get the same imputed scores. This results in an underestimation of the variation in the sample, which in turn may result in underestimation of p values. This can be avoided if multiple imputation methods are used.29,30 But multiple imputation is considerably more complex than the methods used here and would not give more precise estimates of the population mean scores, only of the variation in the sample, and hence p values.
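The second limitation can be illustrated with a toy example. It treats all patients as sharing one set of characteristics, so every missing score is replaced by the same mean; the scores and the function name are hypothetical.

```python
from statistics import mean, stdev


def complete_with_mean(observed_scores, n_missing):
    """Append n_missing copies of the observed mean to the observed scores."""
    imputed_value = mean(observed_scores)
    return list(observed_scores) + [imputed_value] * n_missing


observed = [20, 40, 60, 80]                      # hypothetical observed scores
completed = complete_with_mean(observed, n_missing=4)

mean_unchanged = mean(completed) == mean(observed)
sd_observed = stdev(observed)                    # about 25.8
sd_completed = stdev(completed)                  # about 16.9
```

The mean is unchanged, but the standard deviation shrinks and the apparent sample size doubles, so standard errors computed from the completed data are too small and the resulting p values too low.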
We used real data (observed and missing) to investigate the effect of missing data and the imputation methods. Alternatively, simulation could be used by replacing some of the observed scores with imputed scores and then comparing imputed and observed scores. However, the results of such a simulation depend heavily on how the missing data are generated. If the simulated missing data differ substantially from the actual missing data mechanism in the real patient population, the conclusions of such a simulation study may be misleading. The inherent problem in such a study is that we cannot know whether this is the case since this would require that we knew the missing data mechanism, which we do not. Therefore, we chose to use real missing data instead of simulated data.
In summary, this study did not indicate that using data from participants only leads to underestimation of the true symptomatology of palliative care patients: when comparing mean HRQOL scores based on participants' scores only with mean scores where imputed scores were added for nonparticipants, only a few small differences were found. This means one of two things. Either the mean scores of participants were close estimates of the average symptomatology of the total patient population and thus nonparticipation did not lead to bias. Or the applied clinical characteristics were poor predictors of the nonparticipants' HRQOL. In any case, imputation of scores of nonparticipants in palliative care did not significantly change results, and unless variables resulting in a precise characterization of nonparticipants' HRQOL can be obtained, such imputation does not seem to be worthwhile.
Footnotes
Acknowledgments
The study was funded by the Danish Cancer Society grant no. PP 03 013. No contribution was made by the funding source to any aspects of the study.
Author Disclosure Statement
No competing financial interests exist.
