Abstract
Regularly reported patient surveys are an important dimension of hospital quality management. This study investigates whether providing hospital staff with interim feedback on patient survey results following a best practices workshop can help hospitals improve patient centeredness. Standardized surveys with consecutive patient samples were administered in accredited breast cancer center (BCC) hospitals in one German state (18 million inhabitants), over a 6-month period, in 2012. Two studies were conducted by applying a combination of regression point displacement (RPD) and interrupted time series (ITS) designs. In Study 1, 2 of the 26 hospitals that had previously participated in a best practices workshop to discuss patient-centeredness issues were randomly chosen and were provided interim feedback of patient survey results and workshop minutes. In Study 2, 4 randomly chosen hospitals of 33 that had not participated in the workshop also received interim feedback but no workshop minutes. Control hospitals in both studies received neither feedback nor workshop minutes. The impact of interim feedback was evaluated by applying graphical assessments and multiple regression analyses. Both graphical assessments (locally weighted scatterplot smoothing (LOESS) lines, RPD plots) suggested an effect of interim feedback. Multiple regression results did not unambiguously support these findings. The suggested design approach may prove particularly useful to assess effects in pilot studies, when resources are not available to conduct a randomized study or when its conduct is contingent on initial, positive evidence.
Systematically reported patient surveys of hospital experience provide a vital dimension of hospital quality management. While the ultimate purpose of these surveys is to assess aspects of quality of care as perceived by patients and to develop measures to improve quality, there has been a long debate as to whether and under which circumstances they are able to fulfill this purpose (e.g., Anhang Price, Elliott, Cleary, Zaslavsky, & Hays, 2014; Cheraghi-Sohi & Bower, 2008; Cleary, 1999; Gleeson et al., 2016). Despite psychometric reservations, surveys are commonly utilized to assess quality and to guide policy decisions in hospitals throughout the world. In the United States alone, such surveys annually include hundreds of thousands of patients (e.g., Goldstein, Farquhar, Crofton, Darby, & Garfinkel, 2005); they not only measure patient satisfaction in a general sense but also focus on ratings of patient centeredness (Cleary et al., 1998).
Despite the widespread use of surveys, few evaluation studies have thoroughly assessed the potential of patient surveys to actually improve patient centeredness as a consequence of timely feedback from patient survey results (Gleeson et al., 2016). However, studies on the utilization of patient surveys have shown that patient survey reports are often discussed among staff and routinely used to initiate quality improvement efforts (Elliott et al., 2010; Greenhalgh & Meadows, 1999; Iversen, Bjertnaes, Groven, & Bukholm, 2010; Reeves & Seccombe, 2008; Riiskjaer, Ammentorp, Nielsen, & Kofoed, 2010; Vingerhoets, Wensing, & Grol, 2001). Barriers to use are common (e.g., perception of limited actionability, lack of incentives to incorporate feedback at the managerial or frontline worker level), and lack of knowledge regarding the best ways to implement patient-based feedback is widely prevalent (Coulter, Locock, Ziebland, & Calabrese, 2014; Davies & Cleary, 2005; Friedberg, SteelFisher, Karp, & Schneider, 2011). The identification of these barriers, the development of aids to assist providers in implementing lessons from these reports, and the creation of more efficient processes aimed at improving quality of care all represent formidable future tasks for patient-centeredness research.
Further complicating the assessment of hospital-based surveys, previous measures, interventions, settings, and study designs have been quite heterogeneous (Alexander & Hearld, 2009, 2011), making it difficult to identify the mechanisms of change underlying interventions aimed at hospital quality improvement. Indeed, only a small number of research studies evaluate interventions that aim to assist hospitals in distributing patient survey results and, in turn, assess their impact on patient centeredness (Gleeson et al., 2016).
This article evaluates a feedback-based intervention to assist hospitals in reacting in a more timely fashion to patient survey results intended to improve patient centeredness. The intervention was a brief, written feedback condition given to hospitals that either (self-selectively) participated in a feedback workshop (Study 1) or those that did not (Study 2). The intervention was implemented within an existing patient survey reporting system, thus allowing for generation of “practice-based” evidence (Green & Glasgow, 2006).
As is often true with such a patient survey reporting system, resources were scarce, so no randomized controlled trial could realistically be conducted. Instead, a variant of the regression point displacement (RPD) design was applied (Trochim & Campbell, Unpublished). To maximize inferential quality, the RPD design was combined with an interrupted time series (ITS) design.
The primary purpose of this article is to provide a methodologically sound context upon which to evaluate the common, survey-based approach used to inform hospital policy. We implement a design strategy in which two designs were simultaneously utilized within the same research context to enable us to further enhance the inferential quality of the research.
Method
Survey
Standardized surveys with consecutively sampled patients were administered in BCC hospitals certified according to criteria set out by the German state of North Rhine-Westphalia (approximately 18 million inhabitants). Adopting the definition of patient centeredness given by the Institute of Medicine (“health care that establishes a partnership among practitioners, patients, and their families (when appropriate) to ensure that decisions respect patients’ wants, needs, and preferences and that patients have the education and support they need to make decisions and participate in their own care”; Hurtado, Swift, & Corrigan, 2001, p. 7), the survey instrument was developed throughout the 2000s and used in various patient surveys in Germany (Pfaff, Freise, Mager, & Schrappe, 2003). The survey included 21 scales that were made up of 98 items and that represented different dimensions of patient centeredness including the provision of disease-specific information, physician empathy, and support by nurses and physicians. These 98 items were combined into one single score that served as our measure for patient centeredness (see “Statistical Analysis” section for details).
Hospital participation in the administration of patient surveys was part of certification requirements and has occurred annually since 2006, each February through July (for details, see Kowalski, Würstlein, Steffen, Harbeck, & Pfaff, 2011). As a result of this stipulation, hospitals provided relevant data and patient reporting occurred at high rates. Patient survey results were reported annually to the hospitals, in late November or early December (about 2 months before the beginning of the new survey period). This research study was based on the 2012 patient cohort, applying the same survey procedure as in previous years. Hospitals were required to include all patients who had undergone inpatient surgery for newly diagnosed breast cancer between February 1 and July 31, 2012, and who had at least one malignancy as well as a postoperative histology. Hospitals were allowed to additionally include patients throughout January to mid-August.
Shortly before discharge, patients were asked for informed consent by hospital staff and, once they agreed, provided clinical data on forms provided by staff (cancer stage, type of surgery, etc.). Completed forms were sent to the surveying institution, the Institute of Medical Sociology, Health Services Research, and Rehabilitation Science (IMVR) of the University of Cologne, Germany. Consenting patients then received a mailed questionnaire with two subsequent reminders. The survey was approved by the Medical Ethics Committee of the University of Cologne. Eighty-seven of the 88 eligible hospitals participated in the survey (99% hospital response rate). The patient response rate was 87.1% (4,234 of the 4,863 consenting patients returned questionnaires).
Intervention and Treatment Assignment
Figure 1 presents a flowchart of the intervention and treatment assignment described in the following sections.

Best practices workshop.
At the beginning of the survey period 2012, best practices workshops were offered to all BCC hospitals that participated in the 2011 survey. The primary goal of these workshops was for BCC hospitals to exchange “best practices” with regard to patient-centered care. Representatives from 26 hospitals that fulfilled the inclusion criteria participated in the best practices workshops. Participants were presented results of the previous patient survey, listened to a presentation on staff–patient interaction, and took part in a focused, 75-min group discussion on one of the three specified topics. Talks and group discussions were summarized and distributed to all participants at their home hospitals, after the workshop. These workshop participant hospitals made up the sample for Study 1. We expected these hospitals to be substantially different from nonparticipant hospitals (Study 2) in that administrators were more committed to quality improvement due to their decision to send staff to a workshop.
Intervention: Interim Feedback Report
The interim feedback report was provided by the surveying institution and sent out to the designated hospital employee, most often the so-called BCC coordinator and, in Study 1, additionally to staff members who participated in the workshop (see below). The designated hospital employee was chosen because he or she monitored the annual survey and thus made certain interim reports were examined and distributed. Intervention hospitals received interim feedback based on patient survey results after the first 14 (or 20) weeks of the survey period. Results of each of the 21 indicators that were also presented in the annual report were listed on a single sheet of paper and compared with results of the previous year and with results from the other hospitals. Substantial improvements and deteriorations as compared to the previous year were marked in green (improvement) or red (deterioration).
To ensure that each selected hospital would receive interim feedback in a timely manner, information was sent by both mail and e-mail. The mailed version of the interim report consisted of multiple copies prepared for the hospital staff. The interim report package was accompanied by a cover letter with instructions regarding how to use the interim feedback to improve its effect (Hysong, 2009). In the report, a suggestion was made to distribute the interim report sheet to the wards/clinics and to discuss results with employees. In addition, the letter explained the rationale for the interim feedback.
Treatment Assignment and Study Designs
Randomized experiments (REs) are well known to provide the best evidence upon which to base policy decisions. However, REs are not only expensive but also introduce a substantial delay in evidence creation for hospital administrators who are motivated to establish more immediate policies. In such instances, decision makers often create policies using the best information available based on quasi-experimental designs rather than experimental designs (Shadish, Cook, & Campbell, 2002).
These practical shortcomings are consistent with the common use of relatively weak observational studies or hospital-based reports that utilize descriptive, correlational data. However, practical concerns must be balanced with the long-standing need for methodological rigor at the organizational level in health care. For example, Issel (2014) recently pointed out that descriptive research in health-care administration is no longer (and probably has never been) sufficient for high-quality program evaluations. As a practical remedy, we apply a research approach that combines two existing designs, arriving at a strategy that is more rigorous than a purely observational approach but is easier to apply, timelier, and less costly than a randomized controlled trial (RCT). This combination of two designs within the same data set has historical precedent in the social sciences (e.g., Boruch, 1975). These so-called hybrid designs allow researchers to use the stronger design to eliminate validity threats that occur with the weaker design and to corroborate estimates found using the weaker design with those of the stronger.
In this evaluation, we applied the underutilized RPD design. In an unpublished article, Trochim and Campbell (Unpublished) systematically present applications of RPD from several different disciplines. Below, we discuss the logic of the RPD design in more detail, but here we first introduce its straightforward structure. At its most basic, the RPD design determines the distance of a dot from a line (the discontinuity); the larger the discontinuity, the greater the likelihood that the effect is due to the intervention. Each dot is established by the pre–post result in each hospital chosen to receive the intervention. We then calculated the departure of every pre–post result (dots) in an intervention hospital from the regression line made up of the set of pre–post results in the control hospitals that did not receive the intervention.
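The displacement logic described above can be illustrated with a minimal sketch. The function names and the hospital-level scores below are hypothetical and are not taken from the study; the sketch only assumes that each hospital contributes one aggregated pretest and one aggregated posttest score.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares line y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def displacement(control_pre, control_post, treated_pre, treated_post):
    """Vertical distance of the treated unit from the control regression line."""
    a, b = fit_line(control_pre, control_post)
    expected_post = a + b * treated_pre
    return treated_post - expected_post

# Hypothetical hospital-level mean scores on a 0-100 scale (not study data).
control_pre = [60.0, 65.0, 70.0, 75.0, 80.0]
control_post = [61.0, 66.0, 71.0, 76.0, 81.0]

d = displacement(control_pre, control_post, treated_pre=68.0, treated_post=73.0)
print(f"displacement from control line: {d:.2f}")
```

A positive value means the intervention hospital scored higher at posttest than its pretest would predict from the control hospitals alone.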
Hospitals were included in the study and eligible for placement into the intervention and the control group if they had at least 6 patients responding to the survey as of April 30, 2012 (i.e., after half the survey period had elapsed), and if at least 18 patients were included in the 2011 survey period. This decision was made to fulfill data privacy standards and to ensure that small samples had less impact on our study’s results. After the exclusion of 28 hospitals because of an insufficient number of patients in 2011 or during the first 3 months of 2012, 59 hospitals were eligible for placement.
To test the impact of interim feedback, two studies were conducted. In Study 1, of 26 hospitals that participated in the workshop and had at least six patients included in the survey by April 30, 2012, 1 was randomly drawn from hospitals above the median based on an overall composite score in the previous (2011) survey and 1 was randomly drawn from hospitals in the lower half. This approach was taken to reduce the possibility that both hospitals chosen were at one tail of the pretest distribution (which might limit the degree of possible change in the dependent variable; Ivers et al., 2012). These two intervention hospitals received interim results together with a copy of the workshop minutes to reinforce the impact of feedback. One hospital received the intervention 14 weeks after the beginning of the survey period (May 8, tx 1, “first feedback”) and one 20 weeks after the beginning of the survey period (June 16, tx 2, “second feedback”). The intervention feedback was sent to workshop participants by e-mail and mail.
In Study 2, of 33 hospitals that had not participated in the workshop and had at least six patients included in the survey by April 30, 2012, 2 were randomly drawn from the 16 hospitals in the upper half of the baseline composite score in the 2011 survey and 2 were randomly drawn from the 17 hospitals from the lower half of the previous survey. These four hospitals received interim results but no copy of the workshop minutes. Two hospitals (one randomly chosen from the top and one randomly drawn from the bottom half) received the intervention 14 weeks after the beginning of the survey period (May 8, tx 1, “first feedback”) and two 20 weeks after the beginning of the survey period (June 18, tx 2, “second feedback”). The intervention was distributed by e-mail and mail to contact persons at the four hospitals.
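The stratified draw described for both studies can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name, the hospital identifiers, the baseline scores, and the seed are all hypothetical, and with an odd number of hospitals the extra hospital is placed in the lower half (matching the 16/17 split reported for Study 2).

```python
import random

def draw_stratified(baseline, n_per_half, seed=None):
    """Randomly draw n_per_half hospitals from each half of the baseline ranking.

    baseline: dict mapping hospital id -> composite score from the previous survey.
    Returns (upper_picks, lower_picks).
    """
    rng = random.Random(seed)
    ranked = sorted(baseline, key=baseline.get)   # ascending by baseline score
    cut = len(ranked) - len(ranked) // 2          # 33 hospitals -> 17 lower, 16 upper
    lower, upper = ranked[:cut], ranked[cut:]
    return rng.sample(upper, n_per_half), rng.sample(lower, n_per_half)

# Hypothetical baseline scores for 33 nonworkshop hospitals (not study data).
baseline = {f"H{i:02d}": 50.0 + i for i in range(33)}

upper_picks, lower_picks = draw_stratified(baseline, n_per_half=2, seed=2012)
print("upper half:", upper_picks)
print("lower half:", lower_picks)
```

Drawing separately from each half guarantees that the selected hospitals do not all sit at one tail of the pretest distribution, which is the stated motivation for the procedure.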
Two studies were conducted because hospitals that participated in the workshop differed from nonparticipating hospitals in that we expected them to be generally more motivated to work on patient-centeredness issues. In addition, since the workshop took place in only one location, more distant hospitals might feel that the costs of participation were too high. These concerns were consistent with previous research evidence reflecting better results in patient surveys of hospitals that had previously participated in the workshop (Kowalski, Yeaton, & Pfaff, 2013). Therefore, we expected lower pretest scores and, thus, more definitive results in the group of nonworkshop participating hospitals. To additionally strengthen the intervention for Study 1 hospitals, we distributed workshop minutes (reminder or “booster”), making it a “multifaceted intervention” from which we might expect a somewhat larger effect (Jamtvedt, Young, Kristoffersen, O’Brien, & Oxman, 2006).
Using design notation developed by Campbell and his colleagues (e.g., Shadish et al., 2002), the assignment procedure for both studies is schematically illustrated in Box 1, with R indicating random assignment of hospitals, “o” indicating observations that are done by continuously collecting monthly patient survey results, and “tx” indicating the implementation of the intervention (or “treatment”) at two different time points (1 and 2).
Intervention Assignment
Thus, this treatment assignment combines two distinct research designs: the RPD design, illustrated by Linden, Trochim, and Adams (2006) using simulated data and recently applied in the field of criminology (Sundt, Salisbury, & Harmon, 2016), as well as the interrupted time-series design with “switching replications” (Shadish et al., 2002, p. 192; hereafter, termed ITS design). Both designs have specific weaknesses and strengths (e.g., eliminate different validity threats), but the combination of these two designs helps to eliminate a larger number of validity threats and to minimize inferential limitations.
Trochim and Campbell describe the RPD framework as “a pretest-posttest quasi-experimental design that usually involves a single treated group and multiple control groups” (Trochim & Campbell, Unpublished, p. 2). Random assignment in the RPD framework applied here was not done to eliminate selection bias but rather to ensure that individual hospitals were not chosen because their characteristics were likely to produce a discontinuity or because the political climate for change was more favorable in those hospitals.
Thus, the RPD with random allocation of treatment units is not a “mini-RCT.” In the RPD, randomization is not used to establish equality between treatment and control groups in potential confounding variables, and there typically are only one or a few intervention units. These units are aggregates, usually “sites” like schools, companies, or hospitals in which pre- and posttest observations are averaged across individuals. The design's name derives from the graphical illustration of the design and the potential treatment effect: the more substantial the deviation (displacement) of an intervention unit from a regression line fit to the posttest observations in the control units as a function of their pretest observations, the bigger the intervention effect is considered to be (see “Statistical Analysis” section).
Trochim and Campbell recommend the application of the RPD to test “demonstration programs and pilot projects” and emphasize that it “remains very much a quasi-experimental design, for which many of the common threats to validity must be examined on the basis of contextual information not included in the statistical analysis” (p. 4). One important reason for applying this variant of the RPD instead of an RCT, in which approximately half of the units would be assigned to the treatment, is its much lower cost: the preparation of interim feedback requires staff work that must be completed in a short time and for which no additional funding was available. Thus, instead of not evaluating interim feedback, the relatively low-cost RPD approach was taken, bearing in mind that such an assessment can be viewed as a preliminary study which may then lead to an RCT.
An RPD is vulnerable to the threat of what is called “history” (Shadish et al., 2002), which arises when some extraneous event, like new hospital guidelines, happens to coincide with the implementation of interim feedback. To mitigate this threat, an ITS design was added (feedback was given at different points in time to different hospitals, allowing the researcher to determine whether change was coincident with each of these time points). Thus, the RPD is buttressed with the multiple replication feature of our ITS design (feedback was implemented at different times in groups initially serving as controls).
An ITS design (e.g., Goldberg et al., 2000) typically tracks units of analysis over time and compares their pre- and postintervention scores (i.e., multiple observations before and after the intervention). The “staggering” element of the design refers to the delayed onset of the intervention in half of the intervention hospitals, a feature which also enhances temporal generalizability (it ensures that an effect is not due to particularities of the time period in which the intervention was first introduced). As a second kind of counterfactual, no change is expected in the control series at times when change occurs in the treatment series. By combining the positive features of these two designs, our hybrid approach also allowed a replication within the same data set using the different analytic approaches of two different research designs. This ITS design focus is quite different from that of the RPD design, where we utilized only the aggregated pre- and posttest scores (but no “in-between” scores) when establishing the counterfactual regression line.
Statistical Analysis
After the exclusion of 387 patients because of missing values on the date of surgery variable, which was necessary to establish whether the patient was treated before or after feedback was sent out, our sample consisted of 2,985 patients treated in the 59 study hospitals. Based on the 98 items which were the basis of the 21 indicators that are reported in the annual reports each hospital received, a composite score was calculated that allowed ranking the hospitals, with higher scores indicative of a more patient-centered experience. In the annual reports, these 98 items were also the basis for the overall ranking of hospitals and, thus, important for hospital comparisons and certification (Certification Body “ÄKZERT”, 2014). The overall score was used instead of analyses of all 21 indicators for two reasons, one substantive and one methodological. First, since the interim feedback included all 21 indicators and the overall ranking of the hospitals was based on all 21 indicators, we decided that this was the soundest solution (the overall score being the most important value for the hospitals). Second, we wanted to avoid “fishing” or needing to explain why some indicators improved while others did not.
However, combining many measures may have had the unintended consequence of making the composite a relatively insensitive measure of effects, which may have contributed to nonsignificant findings. Means of these 98 individual items were calculated to yield one value for each patient, and items were subsequently summed and transformed to a scale ranging from 0 to 100. Missing values for these 98 items were imputed using multiple imputation methods with SPSS 22.0 with all 98 items being both predicted and predicting variables to create data sets. Since the later regression model was calculated using hospital-aggregated scores, individual-level regression coefficient standard errors were not affected by the imputation. Instead, individual item scores were averaged.
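One way to read the scoring step is sketched below. It is only an illustration: the text does not state the response range of the items, so the range of 1 to 4 is an assumption, as are the function names and the example values. The sketch averages a patient's item values, rescales the mean linearly to 0 to 100, and then averages patients within hospitals.

```python
from collections import defaultdict
from statistics import mean

def composite_score(item_values, item_min=1.0, item_max=4.0):
    """Rescale a patient's mean item value to 0-100 (item range is an assumption)."""
    return 100.0 * (mean(item_values) - item_min) / (item_max - item_min)

def hospital_means(patient_scores):
    """Average patient composites by hospital.

    patient_scores: iterable of (hospital_id, composite_score) pairs.
    """
    grouped = defaultdict(list)
    for hospital_id, score in patient_scores:
        grouped[hospital_id].append(score)
    return {hospital_id: mean(scores) for hospital_id, scores in grouped.items()}

# Hypothetical patients with three items each (the survey used 98 items).
patients = [
    ("H01", composite_score([4, 4, 4])),
    ("H01", composite_score([1, 4, 1])),
    ("H02", composite_score([1, 1, 1])),
]
print(hospital_means(patients))
```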
In our first graphical analyses, LOESS (local regression) lines were plotted based on individual patient scores as a function of the date of surgery, separately, for both studies. LOESS lines with an Epanechnikov weighting kernel function and 50% adaptation were used in each figure, for each of the three conditions (i.e., hospitals that received no interim feedback, hospitals that first received feedback, and hospitals that received second feedback).
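A minimal sketch of such a smoother follows. It is not the SPSS routine used in the study: it assumes a local linear fit, reads “50% adaptation” as the share of observations used in each local fit, and measures time as a numeric value such as days since the start of the survey period. All names and data are hypothetical.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| < 1, zero otherwise."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def loess_point(x0, xs, ys, span=0.5):
    """Locally weighted linear fit evaluated at x0, using the nearest span * n points."""
    n = len(xs)
    k = min(n, max(3, math.ceil(span * n)))
    dists = [abs(x - x0) for x in xs]
    # Bandwidth: distance to the k-th nearest point, inflated slightly so that
    # this point keeps a small positive weight instead of exactly zero.
    h = sorted(dists)[k - 1] * (1.0 + 1e-9)
    if h == 0.0:  # all k nearest points share x0: return their mean
        tied = [y for d, y in zip(dists, ys) if d == 0.0]
        return sum(tied) / len(tied)
    w = [epanechnikov(d / h) for d in dists]
    dx = [x - x0 for x in xs]  # centre on x0 so the intercept is the fitted value
    sw = sum(w)
    swx = sum(wi * di for wi, di in zip(w, dx))
    swy = sum(wi * yi for wi, yi in zip(w, ys))
    swxx = sum(wi * di * di for wi, di in zip(w, dx))
    swxy = sum(wi * di * yi for wi, di, yi in zip(w, dx, ys))
    denom = sw * swxx - swx * swx
    if abs(denom) < 1e-12:  # degenerate neighbourhood: fall back to weighted mean
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    return (swy - slope * swx) / sw

# Hypothetical data: day of surgery (days since start) and composite score.
days = list(range(20))
scores = [2.0 * d + 1.0 for d in days]
smoothed = [loess_point(d, days, scores) for d in days]
print([round(s, 2) for s in smoothed[:5]])
```

In practice the smoother would be run separately for each condition (control, first feedback, second feedback) so that the three lines can be compared around the two feedback dates.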
Second, the “classic” RPD visualization graphic is shown. Individual values were aggregated at the hospital level into two groups: one group included patients treated before the intervention (pretest) and one group included patients treated after the intervention (posttest). We considered early and late interim feedback to be two distinct interventions and plotted four RPD visualizations: two interventions for Study 1 and two for Study 2, with one figure each depicting scores after early feedback as a function of scores before early feedback and one figure depicting scores after late feedback as a function of scores before late feedback. Changes in patient centeredness are thus assessed by comparing pretreatment composite score results with posttreatment composite score results. Nonintervention hospitals were plotted as circles and intervention hospitals as Xs. To visually assess deviations of posttest scores from the scores expected given the pretest, a regression line of control hospital pre–post values was created. Intervention hospital scores above the regression line for control hospitals suggest a positive effect of the intervention.
In addition, statistical tests were performed. As suggested by Trochim and Campbell (Unpublished), a multiple regression analysis was calculated using the aggregated scores for each RPD. Following the approach used by Linden et al. (2006), the statistical model was:
$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$$
with $Y_i$ being the dependent variable “posttest score,” $X_i$ the pretest score, $\beta_0$ the y-intercept, $\beta_1$ the pretest coefficient, $\beta_2$ the estimated treatment effect (discontinuity), $Z_i$ the dichotomous assignment variable for being or not being an intervention hospital, and $e_i$ the error term. No statistical adjustments were made for possible case mix differences in the hospitals (i.e., variation in characteristics of patients treated in different hospitals, like age, ethnicity, or socioeconomic status (SES)). This decision was supported by results from an earlier paper using a patient cohort from the same population of hospitals in which these differences were minimal (Kowalski, Kuhr, Scholten, & Pfaff, 2013). Cohen’s f² values measuring local effect sizes of $\beta_1$ and $\beta_2$ within the multiple regression models were calculated as described in Selya, Rose, Dierker, Hedeker, and Mermelstein (2012). Effect sizes of 0.02, 0.15, and 0.35 are termed small, medium, and large, respectively (Cohen, 1988).
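The regression of posttest on pretest and the treatment indicator, together with the local effect size, can be sketched as follows. The code is an illustration with hypothetical hospital-level data and hypothetical function names; it uses the standard definition of Cohen's local f² as (R² of the full model minus R² of the model without the predictor of interest) divided by (1 minus R² of the full model), and solves the least-squares problem with plain normal equations rather than a statistics package.

```python
def ols(X, y):
    """Least-squares coefficients via normal equations and Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def r_squared(X, y, beta):
    """Coefficient of determination for fitted coefficients beta."""
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
                 for row, yi in zip(X, y))
    return 1.0 - ss_res / ss_tot

def rpd_model(pre, post, treated):
    """Fit post = b0 + b1 * pre + b2 * treated; return (beta, local f2 for b2)."""
    X_full = [[1.0, x, float(z)] for x, z in zip(pre, treated)]
    X_reduced = [[1.0, x] for x in pre]  # model without the treatment indicator
    beta = ols(X_full, post)
    r2_full = r_squared(X_full, post, beta)
    r2_reduced = r_squared(X_reduced, post, ols(X_reduced, post))
    f2 = (r2_full - r2_reduced) / (1.0 - r2_full)
    return beta, f2

# Hypothetical hospital-level scores: five controls and one intervention hospital.
pre = [60.0, 65.0, 70.0, 75.0, 80.0, 68.0]
post = [62.0, 64.0, 72.0, 75.0, 82.0, 74.0]
treated = [0, 0, 0, 0, 0, 1]

beta, f2 = rpd_model(pre, post, treated)
print("b0, b1, b2:", [round(b, 3) for b in beta])
print("local f2 for treatment:", round(f2, 3))
```

With a single intervention hospital, the treatment coefficient equals that hospital's vertical displacement from the regression line fitted to the control hospitals alone, which is why the regression and the RPD plot tell the same story.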
Results
In Figures 2a and 2b, we plot the LOESS-based results for the composite scores of patients from control hospitals and from hospitals in which interim feedback was given. In Study 1 (Figure 2a: hospitals that took part in the workshop), baseline data in the hospital that first received interim feedback reflected a decreasing trend prior to intervention. While that downtrend continued during the month immediately after the intervention began, the slope of the LOESS scores then flattened. Baseline data in the hospital that received later interim feedback were more variable to begin, with an upward trend before feedback was initiated. Subsequent to treatment initiation, both the level and the slope of the composite scores continued to increase substantially, and the average composite score, compared to the control series, was larger. During the survey period, a small, general uptrend was visible in the set of 24 control hospitals.

Study 1, impact of feedback for hospitals that took part in the workshop (a). The solid vertical line marks the point in time when feedback was first given, and the solid LOESS (local regression) line is for the hospitals that received first feedback. Graphs were derived from individual patient scores (overall n = 1,355: 1,254 [control]; 74 [May]; 27 [June]). The dot–dashed vertical line marks the time point of the second feedback condition, while the dot–dashed LOESS line is for the hospitals that received second feedback. The minus–dashed LOESS line is for the control hospitals. Study 2, impact of feedback for hospitals that did not take part in the workshop (b). The solid vertical line marks the distribution of the first feedback, and the solid LOESS (local regression) line is for the hospitals that received first feedback. Graphs were derived from individual patient scores (overall n = 1,630: 1,449 [control]; 97 [May]; 84 [June]). The dot–dashed vertical line marks the distribution of the second feedback, and the dot–dashed LOESS line is for the hospitals that received second feedback. The minus–dashed LOESS line is for the control hospitals.
In Study 2 (Figure 2b: hospitals that did not take part in the workshop), for the two hospitals that received earlier interim feedback, an uptrend that was observed prior to intervention at first disappeared but then continued. From an average pretest versus average posttest point of view, the treatment appeared to have had an impact. But the general trend of the composite scores during the multiple data points before treatment generally continued during the posttreatment period. However, it is important to note that the distance between the 2 treatment hospitals and the 29 control hospitals substantially widened in the posttreatment phase. There was an increase in composite score level for the two hospitals that received feedback at a later time and that increase, while small, was maintained at the high end of the range of composite scores (and was consistently higher than in the control series). The average LOESS result for control hospitals was essentially flat during the duration of the second study.
With regard to the RPD visualizations, all but one of the intervention hospitals showed a displacement from the regression line in the expected direction, though the displacements were small (in the regression results, the pretest-adjusted increases were 4.45, 2.06, and 1.33). The single hospital without displacement in the expected direction was slightly below the regression line (Figure 3a).

RPD of Study 1 (workshop participating hospitals), early feedback (a). Circles represent control hospitals, X represents the hospital that received interim feedback on May 8. Regression line was fitted based on control hospitals. RPD of Study 1 (workshop participating hospitals), late feedback (b). Circles represent control hospitals, X represents the hospital that received interim feedback on June 16. Regression line was fitted based on control hospitals.
The regression results for Studies 1 and 2 (Tables 1–4) showed higher posttest scores in intervention hospitals in three of the four statistical models compared to control hospitals when adjusting for pretest scores and were in line with the RPD visualizations. None of these effects was statistically significant at the p < .05 level, and all were small using Cohen’s effect size criteria. For six hospitals, the chance that exactly five of them are above the regression line (assuming the chance is .50 that a hospital will be above or below the regression line) is .09, and the chance of getting exactly six of the six hospitals above the regression line is .02. Thus, a binomial test of the likelihood that five or six intervention hospitals scored above rather than below their respective regression lines yielded a one-sided p value of .11.
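The binomial figures reported above can be checked with a short calculation. The function names are illustrative; the only inputs are the six intervention hospitals and the assumed probability of .50 of falling above the regression line.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n trials (one-sided p value)."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

p_five = binom_pmf(5, 6)        # exactly five of six above the line: 6/64
p_six = binom_pmf(6, 6)         # all six above the line: 1/64
p_one_sided = upper_tail(5, 6)  # five or six above the line: 7/64

print(round(p_five, 2), round(p_six, 2), round(p_one_sided, 2))
```

The three values round to .09, .02, and .11, matching the text.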
Results of the Multiple Regression Study 1, Early Feedback.
Note. Model fit: p = .212; adj. R² = .052.
Results of the Multiple Regression Study 1, Late Feedback.
Note. Model fit: p = .375; adj. R² = .002.
Results of the Multiple Regression Study 2, Early Feedback.
Note. Model fit: p < .001; adj. R² = .512.
Results of the Multiple Regression Study 2, Late Feedback.
Note. Model fit: p < .001; adj. R² = .502.
Discussion
We utilized a novel design approach to assess the effect of interim feedback on subsequent patient survey results based on data from German BCC hospitals. Beneficial effects of feedback were modest but consistent across five of the six intervention hospitals. To address some of the weaknesses of the RPD design, we combined it with a stronger quasi-experimental design, the ITS design. Our findings took advantage of commonly collected, retrospective survey data with a high response rate.
This is one of the first studies to have utilized the RPD design based on actual data taken from an applied setting. The RPD design showed displacements from the regression line in the expected direction for five of the six hospitals (all four nonworkshop hospitals and one of the two workshop hospitals). While a single hospital in Study 1 had slightly lower posttest scores than expected, that hospital’s pretest score was at the high end of the composite scale and may have been susceptible to a ceiling effect. Thus, the graphical analyses based on the RPD part of the design suggested a relatively clear indication of a treatment effect, while inference was less conclusive for the LOESS-based ITS design. The regression analysis of the aggregated scores showed effects in the expected direction; however, they were not statistically significant. Bearing in mind the small case numbers (27 and 32 hospitals in Studies 1 and 2, respectively), this finding is not surprising and hints at a generally critical issue when applying multiple regression in RPD analyses, as suggested by Trochim and Campbell (viz., the small chance of producing statistically significant results due to the oftentimes small samples in “demonstration programs”).
Two previous studies that applied RPD either used simulated data (Linden, Trochim, & Adams, 2006) or used RPD as a graphical and analytical framework for a nonequivalent control group design without randomly assigning intervention groups (Kowalski, Yeaton, et al., 2013). The absence of coincident change in the control series and the replication of favorable benefits in treatment groups removed history as a threat to the validity of causal inference.
Strengths and Weaknesses of the Current Study
To enhance external validity, we conducted two studies in two distinct samples that differed in whether or not they had volunteered to participate in a previous workshop. As expected, results were more conclusive in the nonworkshop hospitals. These hospitals had lower pretest scores initially and, thus, more room for improvement. The slope of their regression line was steeper, reflecting lower initial scores and a relatively higher increase. Trochim and Campbell suggested the RPD for “demonstration programs and pilot projects,” which accurately characterizes the context of our study, as the effect of interim feedback has seldom been evaluated with a strong research design.
Since it was not realistic to conduct a more rigorous evaluation of the intervention due to limited resources and staff priorities, the RPD design provided a practical opportunity to assess the intervention’s impact. Results suggested that feedback was effective, though neither the visualizations nor the regression results were entirely convincing. Nevertheless, this tentative conclusion provides a sound rationale for conducting a subsequent RCT. When RCTs are difficult to conduct, this incremental approach may be reasonable for a number of research settings (i.e., examine the initial results of quasi-experiments first, then move to an RCT when preliminary results are promising). By acting in empirically validated stages, researchers and hospitals can avoid wasting valuable time and financial resources.
At both the hospital and the patient level, we had very high rates of participation, thus minimizing differential attrition as a threat to internal validity. However, even with these high participation rates, the ITS element of the hybrid design did not by itself allow strong causal inference. The ITS evidence was only suggestive of beneficial change after the introduction of feedback, and during some periods of hospital enrollment, there were too few patients to meaningfully cluster average responses into 2- or 4-week sequences to reliably assess “blips.” This shortcoming was particularly problematic when there were only a small number of postintervention, follow-up data points with which to judge the impact of feedback in the LOESS-line analysis. In addition, the overall small numbers of intervention and control hospitals and the large variability about the regression lines of the control hospitals made it difficult to detect a small intervention effect. On a positive note, differential attrition, a common threat to internal validity in other studies, was not an issue here, largely because participation in the survey was necessary for the hospitals to be accredited.
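The difficulty of clustering responses into fixed time windows can be made concrete with a minimal sketch. It assumes that each patient response is a (date, composite score) pair, groups responses into 2-week windows, and reports a window mean only when a minimum number of patients is available. The threshold `min_n`, the function name, and all data are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def window_means(responses, start, window_days=14, min_n=5):
    """Group (date, score) pairs into consecutive windows counted from `start`.

    Returns a list of (window_index, n_patients, mean_or_None); the mean is
    None when a window holds fewer than `min_n` responses.
    """
    buckets = defaultdict(list)
    for day, score in responses:
        buckets[(day - start).days // window_days].append(score)
    summary = []
    for index in sorted(buckets):
        scores = buckets[index]
        average = mean(scores) if len(scores) >= min_n else None
        summary.append((index, len(scores), average))
    return summary


# Hypothetical responses for one hospital (illustration only).
responses = [
    (date(2012, 2, 1), 80.0),
    (date(2012, 2, 3), 70.0),
    (date(2012, 2, 10), 90.0),
    (date(2012, 2, 20), 60.0),
]
summary = window_means(responses, start=date(2012, 2, 1), min_n=3)
for index, n_patients, average in summary:
    print(index, n_patients, average)
```

In this illustration, the second window contains a single response and therefore yields no usable mean, which mirrors the sparse-window problem described above.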
When imagining other health-related settings in which a hybrid design approach makes sense, one should consider that organization- (or “site”-) based interventions are necessarily difficult to evaluate, paradoxically because institutions willing to participate are well functioning to begin with. In other words, uniformly high-functioning health-care institutions have small between-site differences, which makes it difficult to assess between-institution change (beneficial treatment effects). The specialized subset of German hospitals we investigated consisted of certified, high-quality provider sites that treated a relatively homogeneous subset of patients (i.e., mostly women suffering from breast cancer with a median age of 60 years). Simply put, large between-group effect sizes are unlikely in such contexts. Yet, should researchers choose a subset of hospitals with substantial problems, the external validity of findings will be quite limited.
Future Research
Future research might include the development of electronic tools to report patient feedback more quickly. However, data security protection and lessons from survey research would demand at least some delay or mechanism to avoid the possibility that hospitals might deduce which patient gave a specific evaluation (i.e., the need to produce secure data that cannot be linked to individuals within a given hospital) and also to avoid socially desirable responses. We made no systematic assessment of what was done with the interim feedback we sent to individual hospitals, but we received encouraging feedback from two addressees asking us to send out these reports on a regular basis. These staff noted that they “discussed” results together with their team.
Future research should systematically assess what is done with the feedback on site. Increasing the strength of the treatment may also lead to larger and more clinically significant displacements (e.g., give feedback more often, give feedback to more people, and require that all staff read feedback). In addition, we would like to emphasize that although the research presented here concerns patient centeredness, the design can also be applied elsewhere in health-care improvement. The example of Sundt et al. (2016) shows its broad applicability.
To evaluate the ITS part within a hybrid design more rigorously, a much larger data set would be necessary. This modification would enable researchers to track changes in more units than in the present design, which would likely avoid ceiling effects. For a better understanding of the general utility of the RPD framework, more evaluation studies are necessary to determine its usefulness for different research disciplines. Since clusters of participants were used as the assignment unit, smaller numbers of treatment units on the RPD-to-RCT continuum lead to smaller aggregated study costs. While reduced costs come at the price of reduced causal inference, the RPD design nonetheless provides evaluators with a quality assessment of impact that, if promising, provides a sound rationale for conducting a “full” RCT.
Footnotes
Acknowledgments
We would like to thank the patients who participated in the survey and the breast cancer centers that supported this study. We confirm that all patient identifiers have been removed or disguised, so that the patients described are not identifiable. The patient survey was requested and initiated by the North Rhine-Westphalian Ministry of Work, Health, and Social Affairs (MAGS NRW). The hospitals provided patients’ addresses and clinical information, as reported. Costs were borne by the participating hospitals as part of the BCC certification and benchmarking process. No additional funding was received for the study. Neither the ministry nor the hospitals were involved in the analysis and interpretation of results or preparation of this manuscript. We are indebted to many anonymous reviewers and an acting editor who, in extensive discussions, improved the manuscript and pointed us to aspects of the research we had previously overlooked.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
