Meta-analysis: complex relationships between patient satisfaction, age and item-level response rate

Abstract

Causality of the relationship between the objective quality outcomes of care and patient satisfaction has been questioned in many studies. Consequently, it is highly important to study potential confounders in order to improve the reliability and validity of patient satisfaction surveys and enable comparisons between objective and subjective outcomes. This study aimed to test the effect of item-level response rate on the results of patient satisfaction surveys and its interaction with another potential confounding factor, patient age. The data included 39 surveys with balanced Likert-scale items. The surveys were systematically gathered from PubMed and had been published between 2005 and 2014. The relationship between item-level patient satisfaction and item-level response rate was almost without exception positive when the overall patient satisfaction was >4.2 on a traditional 1–5 scale and patients were middle-aged or older. The meta-analysis demonstrated that the relationship between item-level patient satisfaction and item-level response rate is situational, and generalisations regarding the size of the correlation should be made with caution. Controlling for item-level response rate and patient age simultaneously is necessary to improve the validity of patient satisfaction surveys. The present study calls for novel age-specific approaches to deal with missing data.

Keywords

confounding factors, item-level response rate, Likert-scale, meta-analysis, patient satisfaction

Introduction

Patient satisfaction is one of the most intensively studied issues and commonly used measures in health care (Voutilainen et al., 2015b) and, typically, it has been expected to reflect the quality of care (Wolosin et al., 2012; Zusman, 2012). Recently, however, studies in which patient satisfaction has been associated with empirical objective quality outcomes have become more common, and the causality of the relationship between quality and satisfaction has been questioned (Godil et al., 2013). For example, high patient satisfaction does not appear to be a valid proxy of low morbidity and mortality (Fenton et al., 2012; Godil et al., 2013). Consequently, it is highly important to study methodological aspects and potential confounders to improve the reliability and validity of patient satisfaction surveys and thus better enable comparisons between subjective and objective quality outcomes. Patient satisfaction is not a worthless measure, but if its relationship with potential confounders cannot be controlled for, it inevitably remains more or less vague and disconnected from more objective measures of quality, such as morbidity.

Nonresponse is one of the main potential confounders in survey research. Traditionally, nonresponse in survey research is considered a statistical nuisance, something which needs to be eliminated and, indeed, many sophisticated methods have been generated to handle the problem (Little and Rubin, 2002). Much less attention has been paid to the possibility that nonresponse could be used as an additional source of information. Perhaps nonresponse reveals something valuable about survey respondents and it should be included as an explanatory variable in the survey analysis (cf. Voutilainen et al., 2014), not substituted with an approximation of the actual variable. Ignoring nonresponse may lead to biased survey results and further misleading conclusions. For example, if the average item-level response rate is 90%, that is, 10% of answers are missing, and the average patient satisfaction result is 4 on a 1–5 scale, the actual satisfaction ranges from 3.7 (all missing answers are 1) to 4.1 (all missing answers are 5). This ostensibly minor difference can be crucial, as many patient satisfaction surveys in general result in high satisfaction ratings and absolute differences in the ratings between surveys are typically small (Voutilainen et al., 2015b).
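The arithmetic behind these bounds can be sketched in a few lines of Python. The function name and its parameters are illustrative assumptions; the sketch simply treats the mean over all patients as a response-rate-weighted mix of the observed mean and one value assigned to every missing answer.

```python
def satisfaction_bounds(observed_mean, response_rate, scale_min=1, scale_max=5):
    """Return the lowest and highest possible mean satisfaction.

    observed_mean: mean rating among patients who answered the item(s)
    response_rate: share of answers that are present (0 < response_rate <= 1)
    The bounds assume every missing answer equals scale_min or scale_max.
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    missing = 1.0 - response_rate
    low = response_rate * observed_mean + missing * scale_min
    high = response_rate * observed_mean + missing * scale_max
    return low, high


# Example from the text: 90% item-level response rate, mean rating 4 on a 1-5 scale.
low, high = satisfaction_bounds(observed_mean=4.0, response_rate=0.90)
print(round(low, 2), round(high, 2))  # 3.7 4.1
```

The width of the interval grows linearly with the share of missing answers, which is why even a modest nonresponse rate matters when differences between surveys are small.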

Several patient-related confounding variables can affect patient satisfaction surveys and the most well-known of them is patient age (Hekkert et al., 2009; Kvist et al., 2014; Rahmqvist, 2001; Voutilainen et al., 2014, 2015b, 2016). Older patients in general seem to have a tendency to give higher ratings, which inevitably leads to the commonly demonstrated outcome that older patients are more satisfied with their care than younger patients (e.g. Rahmqvist and Bara, 2010; Schoenfelder et al., 2011; Shirley and Sanders, 2013). The outcome per se may be a fact, that is, older patients may truly be more satisfied, but it may also be at least partly due to older patients’ tendency to skip items rather than give low ratings (Voutilainen et al., 2014). In the latter case, ignoring missing answers causes overestimation of satisfaction ratings. The effect of patient age on the relationship between item-level patient satisfaction and response rate has not been studied, and no meta-analyses of the relationship had been performed prior to the present research.

Aim

The purpose of the present analysis was to study the effect of item-level response rate on the results of patient satisfaction surveys at two levels: at the survey level, across sporadic surveys, in which case the item-level response rate refers to the average item-level response rate per survey, and at the level of questionnaire items within single surveys. It was hypothesised that the relationships among satisfaction, age and item-level response rate are positive and linear regardless of the level of analysis, survey or item. The hypothesis was based on earlier studies, which showed that patient satisfaction, item-level response rate and patient age can all correlate with each other when values pooled over items and respondents are compared across surveys (Voutilainen et al., 2015b) as well as when item-specific values are compared between respondents within the same survey (Voutilainen et al., 2014, 2016). The overall response rate, which refers to the proportion of patients who participated in the study, for instance by returning the survey questionnaire, is not dealt with in this study.

Methods

Data

Data were systematically gathered from PubMed (National Center for Biotechnology Information, US National Library of Medicine) in March 2015. Sampling was restricted to surveys published between 2005 and 2014. The search phrase ‘patient* satisf* AND care’, applied to the paper’s title and/or abstract, returned 9824 full-text, English-language, peer-reviewed scientific journal papers, excluding reviews. These papers were screened to find articles in which (i) patient satisfaction with care was measured with three or more balanced Likert-scale items and (ii) the item-level response rate was reported. This first selection resulted in 60 articles. Those 60 articles were read through carefully and 21 of them were excluded for the following reasons: results were pooled over scale points resulting in an asymmetric scale, such as extremely satisfied versus all other scale points (n = 6); results were pooled over items, that is, only domain-level findings were reported (n = 5); no patient age was supplied (n = 5); the majority of patients were < 18 years old (n = 2); no unprocessed satisfaction ratings per item were reported (n = 1); patients were able to select ‘not applicable’ but the number of those who did so was not reported (n = 1); the paper dealt with an interview, not a self-reported survey (n = 1). If patient satisfaction was evaluated using different scales in the same study, the scale with the most items was included. The final data for the present meta-analysis included 39 articles (see Table 1 in the online Supplementary Material).

Variables

The Likert scale was chosen as the target scale type because it is the most commonly used scale type in patient satisfaction surveys. Only surveys using the balanced Likert-scale were included because the balanced and unbalanced Likert-scales cannot be straightforwardly compared with each other (see Voutilainen et al., 2015b). The unbalanced Likert-scale has been generated to diminish a ceiling effect (Hessling et al., 2004) by stretching the positive end of the scale, which in turn results in lower absolute levels of satisfaction (Voutilainen et al., 2015b). Patient age was chosen as a control variable to help estimate the size of the effect of item-level response rate on survey results and because the balanced Likert scale, specifically, has been found to be vulnerable to confounding effects of patient age (Voutilainen et al., 2016).

In the present study, the overall patient satisfaction (OPS) is the satisfaction score pooled over all items and respondents per survey. The item-level patient satisfaction (IPS) is the satisfaction score per single item pooled over all patients who responded to the item in question. The item-level response rate (IRR) is the percentage of patients who responded to the item in question. The average IRR (AIRR) is the mean of all IRRs per survey. Consequently, AIRR is not the overall response rate, typically abbreviated as ORR, which refers to the proportion of patients who participated in the study, for instance by returning the survey questionnaire. Patient age is the mean age of patients who responded to the survey. For the analyses, satisfaction scores were linearly converted to a 0–100 scale to enable comparisons across surveys with different scale lengths. The length of the Likert scale per se seems to have no effect on the results of patient satisfaction surveys. The number of items per scale, in contrast, correlates negatively with the overall satisfaction score, so that longer questionnaires tend to result in lower satisfaction ratings than shorter questionnaires (Voutilainen et al., 2015b).
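The linear conversion to a 0–100 scale can be illustrated as follows. The source does not spell out the formula, so this is a minimal sketch that assumes ordinary min-max rescaling; that assumption is consistent with the correspondences given elsewhere in the text (4.2 on a 1–5 scale equals 80, and 4 equals 75). The function name is hypothetical.

```python
def to_percent_scale(score, scale_min, scale_max):
    """Linearly map a score from [scale_min, scale_max] onto [0, 100].

    Assumption: plain min-max rescaling, so scale_min maps to 0 and
    scale_max maps to 100.
    """
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return (score - scale_min) / (scale_max - scale_min) * 100.0


print(round(to_percent_scale(4.2, 1, 5), 1))  # 80.0
print(round(to_percent_scale(4.0, 1, 5), 1))  # 75.0
print(round(to_percent_scale(5.5, 1, 7), 1))  # 75.0, a 7-point scale on the same footing
```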

Statistical analyses

First, OPS, patient age and AIRR were compared with each other to reveal survey-level relationships among the three variables. Three linear regression models were fitted: OPS explained by AIRR, OPS by age, and AIRR by age. Second, Pearson’s correlation coefficients between IPS and IRR, one coefficient per survey, were related to OPS, patient age and AIRR to find out whether the direction of the relationship between IPS and IRR changes with OPS, age and AIRR. Three linear regression models were fitted: correlation coefficients explained by OPS, by age and by AIRR. The models were fitted with IBM® SPSS® Statistics 21.
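The item-level quantity used in the second step, Pearson’s correlation between IPS and IRR within one survey, can be sketched as below. The item values are invented purely for illustration and do not come from any of the 39 surveys; the analyses in the study itself were run in SPSS, not Python.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical survey with five items: IPS on the 0-100 scale, IRR in percent.
ips = [82.0, 85.5, 79.0, 88.0, 90.5]
irr = [93.0, 95.0, 91.0, 97.0, 98.0]

r = pearson_r(ips, irr)      # one coefficient per survey
airr = sum(irr) / len(irr)   # average item-level response rate of the survey
print(round(r, 3), round(airr, 1))
```

Each survey contributes one such coefficient, and these coefficients are then the dependent variable in the regressions on OPS, age and AIRR.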

Third, a meta-analysis was performed in R, a free software environment for statistical computing, using the package ‘meta’ (Schwarzer, 2010) to reveal item-level relationships between satisfaction and response rate. Pearson’s correlation coefficient between IPS and IRR was used as the outcome variable in the meta-analysis. As the studies included in the meta-analysis were functionally nonidentical, that is, factors possibly affecting the outcome differed across the studies, a random-effects model was chosen for the meta-analysis. I², Cochran’s Q, Tau² and the approximated range of effects based on Tau² were calculated to quantify heterogeneity in the meta-analysis (Borenstein et al., 2009; Higgins, 2008; Higgins et al., 2003).
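To make the heterogeneity statistics concrete, the following is a minimal pure-Python sketch of a random-effects meta-analysis of correlation coefficients. It is not the code used in the study, which relied on the R package ‘meta’. The sketch assumes Fisher’s z transformation, sampling variance 1/(n − 3) and the DerSimonian–Laird estimator of Tau², none of which is stated in the text, and the input values are hypothetical.

```python
import math


def random_effects_correlation(rs, ns):
    """Pool correlations with a DerSimonian-Laird random-effects model.

    rs: correlation coefficients, one per study (each strictly between -1 and 1)
    ns: number of observations behind each coefficient (each > 3)
    Returns the pooled r plus Cochran's Q, Tau^2 (on the Fisher z scale) and I^2 (%).
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching rs and ns")
    if any(abs(r) >= 1 for r in rs) or any(n <= 3 for n in ns):
        raise ValueError("require |r| < 1 and n > 3 for every study")

    zs = [math.atanh(r) for r in rs]             # Fisher z transformation
    variances = [1.0 / (n - 3) for n in ns]      # sampling variance of z
    w = [1.0 / v for v in variances]             # fixed-effect weights

    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    df = len(rs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return {"r": math.tanh(z_re), "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical input: three studies with very different correlations.
result = random_effects_correlation(rs=[-0.4, 0.1, 0.8], ns=[50, 50, 50])
print({key: round(value, 3) for key, value in result.items()})
```

When the study-level correlations differ far more than sampling error alone would allow, Q greatly exceeds its degrees of freedom, Tau² is positive and I² approaches 100%, which is the pattern reported for the present data.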

Results

Survey-level relationships

The relationship between OPS and AIRR was statistically nonsignificant (explanatory power 5.4%, F_1,37 = 2.105, p = 0.155), whereas the relationship between OPS and patient age was significant (explanatory power 21.6%, F_1,37 = 10.210, p = 0.003) (Figure 1(a) and (b)). Two studies reporting the two lowest OPS scores resembled outliers and, if they were removed, AIRR and patient age explained 20.3% and 13.1% of the original variation in OPS, respectively (F_1,35 = 8.930, p = 0.005 and F_1,35 = 5.263, p = 0.028). The relationship between OPS and AIRR as well as that between OPS and patient age were positive so that OPS increased together with AIRR and age.
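The reported F statistics can be cross-checked against the explanatory powers, assuming that ‘explanatory power’ denotes the coefficient of determination (R²) of a simple linear regression with one predictor. Under that assumption F = R² / (1 − R²) × (n − 2), where n is the number of surveys. The helper below is a hypothetical illustration, not part of the original analysis.

```python
def f_from_r2(r_squared, n_surveys):
    """F statistic of a one-predictor linear regression, from R^2 and sample size."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    if n_surveys < 3:
        raise ValueError("need at least three observations")
    return r_squared / (1.0 - r_squared) * (n_surveys - 2)


# (label, reported R^2, number of surveys, reported F)
checks = [
    ("OPS ~ AIRR, all surveys", 0.054, 39, 2.105),
    ("OPS ~ age, all surveys", 0.216, 39, 10.210),
    ("OPS ~ AIRR, two outliers removed", 0.203, 37, 8.930),
    ("OPS ~ age, two outliers removed", 0.131, 37, 5.263),
]
for label, r2, n, reported_f in checks:
    print(f"{label}: recomputed F = {f_from_r2(r2, n):.3f}, reported F = {reported_f:.3f}")
```

Small discrepancies are expected because the explanatory powers are reported to one decimal place only.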

Figure 1.

Relationships between the overall patient satisfaction (OPS), average item-level response rate (AIRR) and patient age. Studies indicated with open circles are potential outliers.

The relationship between AIRR and patient age was more complex. Although patient age did not seem to explain the variation in AIRR (explanatory power 1.9%, F_1,37 = 0.731, p = 0.398), nearly all of the lowest response rates were detected in surveys in which patients were younger than the patient median age (48 years) in the data (Figure 1c). When seven studies with the lowest AIRRs (cut-off point of 91%) were excluded from the data, the relationship turned statistically significant and the model explained 20.7% of the variation in AIRR (F_1,30 = 7.850, p = 0.009). In other words, the finding suggested that AIRR may be nonlinearly related to patient age.

Associations between item- and survey-level variables

OPS explained 20.7% of the variation in correlation coefficients between IPS and IRR (F_1,37 = 9.655, p = 0.004) (Figure 2a). The correlation tended to be positive more often than negative when OPS was high, which in turn produced the observed positive linear relationship. Patient age explained only 7.2% of the variation in correlation coefficients between IPS and IRR, and the linear regression model was statistically nonsignificant (F_1,37 = 2.853, p = 0.100) (Figure 2b). Interestingly, in each case, when the correlation between IPS and IRR was <–0.17, the patient mean age in the survey was lower than the patient median age (48 years) in the data. AIRR did not explain the variation in correlation coefficients between IPS and IRR (explanatory power <1%, F_1,37 = 0.043, p = 0.837) (Figure 2c).

Figure 2.

Relationship between the item-level patient satisfaction (IPS) and response rate (IRR) explained by the overall patient satisfaction (OPS), patient age and average item-level response rate (AIRR). The vertical dotted line in (b) indicates the median age of patients: 48 years.

Item-level relationships

The meta-analysis rather convincingly showed that the relationship between IPS and IRR is in most cases positive, that is, IPS increases together with IRR. The weighted average correlation coefficient between IPS and IRR was 0.343 (Figure 3). However, the range of the weighted correlation was very wide, from −0.376 to 1, which in practice means that no useful generalisations regarding the size of the effect can be made on the basis of the present results. The proportion of true heterogeneity (I²) was also very large, nearly 100%, indicating that the relationship between IPS and IRR is strongly situational and thus, most probably, affected by cooperative actions of many indirect factors.

Figure 3.

Meta-analysis of the relationship between item-level patient satisfaction and response rate.

Discussion

The main message of the present study was that patient satisfaction, age and item-level response rates do associate with each other. In essence, the stated hypothesis was accepted, although the relationships among satisfaction, age and item-level response rate were even more complex than expected. The meta-analysis showed that the relationship between IPS and IRR is strongly situational and, thus, most probably affected by cooperative actions of many indirect factors, which, in the present study, remain unknown. Moreover, the direction of the relationship between IPS and IRR was not constant but to some extent depended on the average age of patients. The relationship was almost without exception positive (IPS increased together with IRR) when OPS was >80 on a 0–100 scale, corresponding to 4.2 on a traditional 1–5 scale. This information is potentially valuable and affects many surveys, as, in general, patient satisfaction surveys result in high satisfaction ratings and patients are older than the median age of the population (Voutilainen et al., 2015b). The nonlinear associations between IPS, IRR and age signify that IPS cannot be estimated solely from IRR or age; both IRR and age have to be taken into account simultaneously. The survey-level findings supported the item-level findings. The associations across OPS, age and AIRR were in fact rather straightforward when AIRR was high: age correlated positively with satisfaction and negatively with AIRR. The survey-level associations are important to recognise, but they are only helpful when the purpose is to compare results across sporadic surveys. Survey-level results cannot be generalised to single surveys, items or individual patients; doing so leads to the ecological fallacy (Portnov et al., 2007).

In patient satisfaction surveys, less satisfied patients seem to skip more items than highly satisfied patients (Voutilainen et al., 2014). In practice, this means that skipped items should be taken into account, especially when calculating item-specific average scores, and skipped items should be replaced with respondent-specific average scores, not with item-specific average scores. It is illogical to assume that less satisfied patients skip items because they are unwilling to give high scores. But how should highly satisfied patients who do not answer all items be assessed? In some cases at least, those patients are older than other patients (Voutilainen et al., 2014), corresponding to the present item-level finding that positive correlations between IPS and IRR are more often observed in surveys of older than of younger adults. If older adults truly skip items because they are unwilling to give low scores (Bowling, 2002), this has the potential to affect the results of a vast majority of patient satisfaction surveys. Perhaps a missing answer should be dealt with as an expression of dissatisfaction weighted by the patient’s age?
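The difference between ignoring skipped items and replacing them with respondent-specific average scores can be shown with a toy example. The data are invented, the function names are hypothetical, and the sketch covers only the simple respondent-mean replacement mentioned above, not the age-weighted idea raised in the closing question.

```python
def impute_respondent_mean(responses):
    """Replace each skipped item (None) with that respondent's own mean rating.

    responses: list of rows, one row per respondent, one value per item.
    Rows with no answers at all are returned unchanged.
    """
    completed = []
    for row in responses:
        answered = [value for value in row if value is not None]
        if not answered:
            completed.append(list(row))
            continue
        respondent_mean = sum(answered) / len(answered)
        completed.append([respondent_mean if value is None else value for value in row])
    return completed


def item_means(matrix):
    """Mean rating per item (column), ignoring answers that are still missing."""
    means = []
    for column in zip(*matrix):
        answered = [value for value in column if value is not None]
        means.append(sum(answered) / len(answered) if answered else None)
    return means


# Three respondents, three items on a 1-5 scale; the least satisfied
# respondent skipped the second item.
responses = [
    [5, 5, 5],
    [4, 5, 4],
    [2, None, 1],
]

print(item_means(responses))                          # item 2 looks perfect: 5.0
print(item_means(impute_respondent_mean(responses)))  # item 2 drops to about 3.83
```

In this toy case, ignoring the skipped answer overstates satisfaction with the second item, because the only respondent who skipped it rated everything else low.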

How to deal with item-missing data without replacing or eliminating items?

One possible procedure is to apply item response theory (IRT). IRT is based on the idea that the weight of survey items is unequal and the weight of each item can be estimated on the basis of earlier surveys carried out with the instrument in question by means of the items’ relative difficulty and discrimination (DeMars, 2010). IRT has been successfully applied to reduce the bias caused by ignoring the missing data (Holman and Glas, 2005). It has also been introduced to health care in the form of the Patient-Reported Outcomes Measurement Information System (PROMIS) (Cella et al., 2010) and, more recently, applied to patient satisfaction surveys to reveal patients’ latent satisfaction level and thus produce more reliable and valid estimates of satisfaction (Carle et al., 2014; Jean-Pierre et al., 2014). PROMIS provides banks of survey items, that is, items arranged into pools so that all items belonging to the same pool are reliable estimates of the same component of self-reported health and, in principle, only one item per component needs to be answered. Consequently, PROMIS based on IRT enables reliable measurements with fewer items as well as the use of interchangeable items.
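The roles of item difficulty and discrimination can be illustrated with the two-parameter logistic (2PL) model, one of the basic IRT models. This is a simplified sketch: the 2PL model applies to dichotomous items, whereas Likert items would normally call for a polytomous model, and the cited studies may have used different models. The parameter values below are arbitrary.

```python
import math


def irf_2pl(theta, discrimination, difficulty):
    """Two-parameter logistic item response function.

    theta: latent trait of the respondent (for example, latent satisfaction)
    discrimination: how sharply the item separates low from high theta
    difficulty: theta at which a positive response has probability 0.5
    Returns the probability of a positive response to the item.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# An 'easy' and a 'hard' item with equal discrimination, evaluated over a
# range of latent satisfaction levels.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    easy = irf_2pl(theta, discrimination=1.5, difficulty=-1.0)
    hard = irf_2pl(theta, discrimination=1.5, difficulty=1.0)
    print(f"theta={theta:+.1f}  easy item: {easy:.2f}  hard item: {hard:.2f}")
```

Because each item carries its own parameters, the latent trait can in principle be estimated from whichever items a respondent did answer, which is what makes IRT attractive for item-missing data.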

Another possible method to deal with item-missing data is to apply artificial neural networks, such as Kohonen’s self-organising map (SOM). The SOM is an unsupervised artificial neural network especially suitable for exploratory data mining, that is, discovering patterns in large multi-dimensional datasets (Kohonen, 2013, 2014; Wang, 2003). The SOM has been used mainly for data classification, data compression, pattern recognition and diagnostic purposes in a wide variety of fields of science, including patient satisfaction surveys (Voutilainen et al., 2014). The SOM is highly flexible: its results can be further analysed with other methods (Voutilainen et al., 2014) and combined with results provided by other methods (Voutilainen et al., 2015a). In the basic SOM, missing items are replaced with the generic term ‘unknown’, but there is also a SOM-based fuzzy map model for clustering of incomplete Likert-scaled datasets. In the fuzzy SOM, missing items are given values weighted by the possibility of each possible value (Wang, 2003). The possibility ranges between 0 and 1 and can be based on general knowledge and/or distributions of the available data. Fuzzy maps in general are based on fuzzy logic dealing with partial truth, that is, variables can be completely true, completely false, or something in between.

Limitations

The present results and conclusions can only be generalised to patient satisfaction surveys that use the balanced Likert scale, owing to the ceiling effect (cf. Voutilainen et al., 2016), and that involve adult respondents. Research on surveys with unbalanced Likert scales, as well as on those reporting low patient-evaluated satisfaction (<75 on a 0–100 scale, corresponding to 4 on a 1–5 scale), needs to be carried out in the near future in order to draw wider conclusions. High heterogeneity among the studies used in the meta-analysis needs to be taken into account in the interpretation of the results.

Conclusions

The relationships among patient satisfaction, age and IRR appear to be multifaceted, including both linear and nonlinear combinations, and strongly situational, and thus are most probably affected by cooperative actions of many indirect factors. If patients’ satisfaction is associated both with their tendency to skip items and with age, as it seems, controlling for IRR and age, simultaneously, is necessary to improve validity of patient satisfaction surveys. The present findings strengthen the view of a complex phenomenon which cannot be ignored or easily controlled for (Elliott et al., 2005).

Key points for policy, practice and/or research

It is highly important to study methodological aspects and potential confounders to improve reliability and validity of patient satisfaction surveys and enable comparisons between subjective and objective quality outcomes.

The relationships among patient satisfaction, age and item-level response rate appear to be predictable, especially when the overall satisfaction is high and patients are middle-aged or older.

Complex relationships among patient satisfaction, age and item-level response rate cannot be ignored or easily controlled for. Taking into account both patient age and item-level response rate, simultaneously, is necessary in order to improve validity of patient satisfaction surveys.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.

Supplemental Material

The online appendices are available with the online version of this article.

Ari Voutilainen (PhD, MHSc, RN, EMT) is a researcher at the Department of Nursing Science, University of Eastern Finland, Kuopio, Finland. Over the last five years he has focused on handling large complicated datasets with sophisticated statistical methods, such as artificial neural networks and spatial models. Since defending his PhD in 2009, he has worked as a postdoctoral researcher on projects funded by the Academy of Finland and the University of Eastern Finland’s Innovative Research Initiatives, and as a senior researcher on a project funded by Kuopio University Hospital with a governmental grant. Between May 2013 and April 2016 he led his own research project.

References

Borenstein, Hedges, Higgins JPT (2009) Introduction to Meta-analysis, Chichester: John Wiley & Sons Ltd.

Bowling (2002) An “inverse satisfaction law”? Why don’t older patients criticize health services? Journal of Epidemiology & Community Health 56(7): 482.

Carle, Jean-Pierre, Winters (2014) Psychometric evaluation of the patient satisfaction with logistic aspects of navigation (PSN-L) scale using item response theory. Medical Care 52(4): 354–361.

Cella, Riley, Stone (2010) The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. Journal of Clinical Epidemiology 63(11): 1179–1194.

DeMars (2010) Item Response Theory, New York: Oxford University Press.

Elliott, Edwards, Angeles (2005) Patterns of unit and item nonresponse in the CAHPS® Hospital Survey. Health Services Research 40(6 Pt 2): 2096–2119.

Fenton, Jerant, Bertakis (2012) The cost of satisfaction: A national study of patient satisfaction, health care utilization, expenditures, and mortality. Archives of Internal Medicine 172(5): 405–411.

Godil, Parker, Zuckerman (2013) Determining the quality and effectiveness of surgical spine care: Patient satisfaction is not a valid proxy. The Spine Journal 13(9): 1006–1012.

Hekkert, Cihangir, Kleefstra (2009) Patient satisfaction revisited: A multilevel approach. Social Science & Medicine 69(1): 68–75.

Hessling, Traxel, Schmidt (2004) Ceiling effect. In: Lewis-Beck, Bryman, Liao (eds) Encyclopedia of Social Science Research Methods, Thousand Oaks, CA: SAGE Publications, pp. 106–107.

Higgins JPT (2008) Commentary: Heterogeneity in meta-analysis should be expected and appropriately quantified. International Journal of Epidemiology 37(5): 1158–1160.

Higgins JPT, Thompson, Deeks (2003) Measuring inconsistency in meta-analyses. British Medical Journal 327(7414): 557–560.

Holman, Glas CAW (2005) Modelling non-ignorable missing-data mechanisms with item response theory models. British Journal of Mathematical and Statistical Psychology 58(Pt 1): 1–17.

Jean-Pierre, Cheng, Paskett (2014) Item response theory analysis of the patient satisfaction with cancer-related care measure: A psychometric investigation in a multicultural sample of 1,296 participants. Supportive Care in Cancer 22(8): 2229–2240.

Kohonen (2013) Essentials of the self-organizing map. Neural Networks 37: 52–65.

Kohonen (2014) MATLAB Implementations and Applications of the Self-organizing Map, Helsinki: Unigrafia Oy.

Kvist, Voutilainen, Mäntynen (2014) The relationship between patients’ perceptions of care quality and three factors: Nursing staff job satisfaction, organizational characteristics and patient age. BMC Health Services Research 14: 466.

Little RJA, Rubin (2002) Statistical Analysis with Missing Data, 2nd edn. Hoboken, NJ: Wiley.

Portnov, Dubnov, Barchana (2007) On ecological fallacy, assessment errors stemming from misguided variable selection, and the effect of aggregation on the outcome of epidemiological study. Journal of Exposure Science and Environmental Epidemiology 17(1): 106–121.

Rahmqvist (2001) Patient satisfaction in relation to age, health status and other background factors: A model for comparisons of care units. International Journal for Quality in Health Care 13(5): 385–390.

Rahmqvist, Bara (2010) Patient characteristics and quality dimensions related to patient satisfaction. International Journal for Quality in Health Care 22(2): 86–92.

Schoenfelder, Klewer, Kugler (2011) Determinants of patient satisfaction: A study among 39 hospitals in an in-patient setting in Germany. International Journal for Quality in Health Care 23(5): 503–509.

Schwarzer G (2010) meta: Meta-Analysis with R. R package version 1.6–1. Available at: http://CRAN.R-project.org/package=meta (accessed 7 January 2016).

Shirley, Sanders (2013) Patient satisfaction: Implications and predictors of success. The Journal of Bone and Joint Surgery (American Volume) 95(10): e69.

Voutilainen, Hartikainen, Sherwood (2015a) Associations across spatial patterns of disease incidences, socio-demographics, and land use in Finland 1991–2010. Scandinavian Journal of Public Health 43(4): 356–363.

Voutilainen, Kvist, Sherwood (2014) A new look at patient satisfaction: Learning from self-organizing maps. Nursing Research 63(5): 333–345.

Voutilainen, Pitkäaho, Kvist (2016) How to ask about patient satisfaction? The visual analogue scale is less vulnerable to confounding factors and ceiling effect than a symmetric Likert scale. Journal of Advanced Nursing 72(4): 946–957.

Voutilainen, Pitkäaho, Vehviläinen-Julkunen (2015b) Meta-analysis: Methodological confounders in measuring patient satisfaction. Journal of Research in Nursing 20(8): 698–714.

Wang (2003) Application of self-organising maps for data mining with incomplete data sets. Neural Computing and Applications 12(1): 42–48.

Wolosin, Ayala, Fulton (2012) Nursing care, inpatient satisfaction, and value-based purchasing: Vital connections. Journal of Nursing Administration 42(6): 321–325.

Zusman (2012) HCAHPS replaces Press Ganey survey as quality measure for patient hospital experience. Neurosurgery 71(2): N21–N24.
