Abstract
Objectives
Root cause analysis (RCA) is a framework for structured investigations of safety incidents. Our aim was to identify the barriers to successful learning in health care and to make recommendations for service development.
Methods
A qualitative study that ‘tracked’ the investigation procedures and practices for ten patient safety incidents in two National Health Service (NHS) hospitals. Non-participant observations of the complete investigation process were made in various managerial and administrative settings, together with semi-structured qualitative interviews with those involved in the process following the completion of the final report.
Results
There are several challenges to undertaking root cause analysis in health care. These are associated with forming and leading the investigation team; gathering and analysing supporting evidence; and formulating and implementing service improvements. Undertaking root cause analysis remains a complex non-linear task which entails balancing a multiplicity of concerns and expectations. Supporting enhanced incident investigation requires keeping in focus the instrumental aim of triggering sustainable service improvement, rather than allowing the investigation to become an end in itself.
Conclusions
Health services leaders need to provide open endorsement of root cause analysis and of the staff carrying it out; enhance staff participation within learning activities and new analytic tools; and develop capabilities in change management.
Introduction
Health care institutions across the world have adopted root cause analysis (RCA) to inform the structured investigation of patient safety incidents. 1 Originally developed to analyse industrial incidents,2,3 RCA represents a family of approaches that have strong links with engineering, psychology and ‘human factors’. In particular, RCA directs analytical attention to the latent factors and corresponding ‘error chains’ that condition, enable or exacerbate the potential for active error. 4 In practice, it fits within a model of organizational learning that usually involves the identification of safety events through incident reporting, the stratification of incidents to determine their relative priority, structured investigations to identify the underlying causes, and finally the formulation and implementation of recommendations for safety improvements. The application of RCA at the centre of this process reflects the belief that organizational learning can occur through a rational, robust and rigorous process. 3
RCA is often described as a stepwise, linear process, 5 effectively summarized by Amo (Box 1). 6 There is a broad consensus that RCA is a toolbox rather than a single method. 3 Woloshynowych et al. 7 suggest more than 40 techniques are available, such as timelines, cause-effect charts, ‘five whys’, fault trees, and fishbone diagrams. These offer different methods for identifying, mapping and understanding latent or root cause factors. As with many health care systems, the English National Health Service (NHS) has endorsed RCA as the main tool for incident investigation. By 2000, it became mandatory for all incidents leading to permanent injury or death to be investigated in this way, and following the creation of the National Patient Safety Agency (NPSA) in 2002, more than 8000 NHS staff have been trained in RCA. 8 Although the NPSA does not mandate a particular process, its training highlights the London protocol. 9 Like other approaches, this recommends that investigations should be undertaken by a small operational team; that team members should agree the appropriate terms of reference and methods for gathering evidence; that team members should be involved in the processes of interrogating, analysing and discussing evidence based upon the various RCA tools; and that they should participate in the drafting of recommendations for service improvements.
Box 1. The seven steps for root cause analysis (RCA)
1. Identify the incident to be analysed
2. Organize a team to carry out the RCA
3. Study the work processes
4. Collect the facts
5. Search for causes
6. Take action
7. Evaluate the actions taken
Evidence from other industries suggests that RCA has the potential to inform learning in the aftermath of safety events, yet a growing body of research testifies to the variable and often complex application of RCA in health care. 8 In Australia, research shows how the success of RCA can be influenced by time constraints, lack of expertise and difficulties with inter-professional working. 10 Similar issues are also shown to curtail the translation of RCA recommendations into practice. 11 Offering a more critical interpretation, organizational scholars 12 suggest that RCA fosters forms of critical reflection among clinicians that draw them into new forms of organizational control. Research also testifies to the role of more informal and backstage interactions for both sharing knowledge about patient safety and providing a focus for learning. 13 These often benefit from greater levels of trust, shared understanding and relevance to clinical practice. More broadly, research also shows how the implementation of safety procedures, including RCA but also incident reporting and checklists, is often contingent upon cultural differences among professional groups. 14 For instance, managers and clinicians often have different understandings of how patient safety can be understood and improved, leading to tensions in the introduction of new learning systems. 15 Such research shows how the realities of organizational life can make problematic the translation of management practices developed in other industries into health care. Of particular significance are the diverse demands placed upon leaders to achieve service improvements in the face of limited resources and competing priorities, and the underlying tensions between occupational groups, especially clinicians, on the relative importance of service improvements.
The aim of this study was to investigate further how RCA practices and procedures are undertaken, with the intention of identifying the barriers to successful learning.
Methods
Approach
The research design followed the ethnographic tradition. 16 Ethnography is well-established in the field of patient safety research 17 and provides the opportunity to observe first-hand how patient safety events transpire, are communicated and are analysed. 18
Research sites
The study was undertaken in two English NHS hospitals (trusts) located in different regions. Trust A was a medium-sized general hospital with 6000 staff treating 500,000 patients per year; trust B was a large teaching hospital with 12,000 staff treating over one million patients per year. The selection of two trusts enabled comparison of the similarities and differences in investigation procedures arising from factors such as hospital size and workload. The study was undertaken over 12 months (2008–09) with research being split equally between both sites. The research received ethical approval through the National Research Ethics Service and local research governance with each trust.
Data collection
After an initial period identifying the organizational arrangements for patient safety within each trust, the primary focus of research was the tracking of ten incident investigations from start to completion. By arrangement with each trust's risk management department, we were contacted when an RCA investigation was to be initiated. At this point, one of the research team commenced observations alongside the lead investigator, shadowing their meetings and work activities. For all cases we observed how the investigation team was selected and formed; how evidence was collected; the application of RCA tools; the identification of error chains and root cause factors; and the drafting and dissemination of the final report. Observations were carried out in a variety of settings and with different groups, including each trust's risk management department where we observed the work of risk managers in gathering evidence, contacting participants, making telephone calls and undertaking paperwork. We also observed large meetings where incidents were formally analysed by an expert team, together with small conversations and interviews that occurred before and after these meetings. We were also able to record how investigation reports were drafted and collated. In total, we conducted 960 hours of observations. During this period we conducted 102 ethnographic or in situ interviews to clarify events or issues. We also completed 34 semi-structured interviews with those involved in investigations. These followed a thematic guide that centred on the experiences of participating in an investigation, the use of RCA tools, and the perceived contributions and barriers to organizational learning. These interviews were undertaken between four and six weeks after the completion of the investigation to determine the perceived effect of the process on service delivery.
Analysis
Our findings were recorded in individual field journals, following a shared thematic template that was developed from the literature and findings that emerged through the observation process. The complete set of observational records and verbatim interview transcripts was shared across the research team for close reading and preliminary analysis. Through weekly workshops, the research team mapped the ten investigation processes with the aim of identifying and explaining common actions, interactions and occurrences. From this, we elaborated common and cross-cutting issues, as described in the literature, as well as themes that emerged from the data. These were then used to further analyse and code the data to describe and explain the challenges of undertaking RCA. In line with our objectives, our findings elaborate the challenges of conducting RCA, presented along three themes: forming the investigation team and gathering evidence; conducting the analysis and identifying root causes; and formulating and implementing service improvements (Table 1).
Table 1. The challenges of conducting root cause analysis (RCA)
Results
Forming the investigation team and gathering evidence
The formation of a knowledgeable, skilled and respected investigation team is integral to undertaking a robust and inclusive review of the evidence and for generating legitimate recommendations for service improvement. Teams are ideally composed of representatives of the various health care groups contributing to the delivery of care but who were not directly involved in the incident. This process is often complex and protracted, and the formation of this team remains challenging.
A preliminary obstacle, especially for the lead investigator (usually a hospital or departmental risk manager), was ensuring the participation of what they regarded as the most appropriate individuals. Frequently it was observed that these individuals, especially those who had a prominent position in the clinical area where the incident occurred, were difficult to recruit and bring together. As well as delaying proceedings, this also created frustration among the rest of the team.
I think the rambling in this meeting had to do with the fact that we did not know certain facts. And we did not know them because the relevant people were not present. (Consultant)
Multiple reasons were repeatedly mentioned for this lack of attendance. While some lacked confidence in the usefulness of RCA, others feared the potential to be ‘convicted’ or blamed (‘…an investigation is still an investigation’), and still others appeared to refrain from participating to safeguard their reputation within the organization. However, many simply could not participate because they experienced ‘diary conflicts’ or in some cases had left the trust (high employee turnover). Accordingly, risk managers often had to decide between the completeness of the team and the timeliness of the investigation.
Similar challenges were experienced during the collection of evidence relating to an incident. During several investigations, the team realized in the course of reviewing evidence in the RCA meeting that key pieces of information were missing.
There were about four different sheets with different people recording his level of pain over the same time period, none of which matched up. We would have needed the log from the PCA machine, which would have told us how many times he pressed the button in order to get the analgesia. (Clinical Director)
The quote sheds light on a further problem related to the quality of information provided in case notes and statements. Patient and departmental records, while providing detailed diagnostic and treatment information, typically contained insufficient detail for the purpose of system analysis. As such, lead investigators were often required to ‘trawl’ additional information sources, such as computer systems, staff rotas, maintenance records and other routine hospital data. Data collected often represented a patchwork of sources that needed to be carefully woven together to provide different perspectives of the event. This proved particularly difficult when cases were ambiguous and clinical judgments varied across the various specialties involved.
In sum, investigators faced several challenges, from securing the open and willing participation of staff, to being confident of the quality of information contained within records and statements and dealing with the ambiguity of clinical ‘facts’. This in turn severely affected the quality of evidence and the accuracy of subsequent analysis and recommendations. In many cases, these initial challenges stem from the fact that the success of RCA depends in part on organizational conditions and a positive safety culture that the tool itself aims at producing.
Conducting the analysis and identifying root causes
Several small and informal meetings occurred throughout each investigation, such as ad hoc conversations when gathering evidence or reviewing case records. However, each investigation centred on a facilitated ‘RCA meeting’ in which the investigation team collectively scrutinized the evidence, applied RCA tools and determined the root causes. These meetings followed a similar format that comprised: reviewing the initial incident report; describing the methods of data collection; presenting an overview of the findings based upon a timeline of events; analysis of the timelines; and identification of causal connections and root causes. In all cases these activities were facilitated by the investigation lead but involved other team members based upon their respective expertise and experience. For example, in one investigation relating to surgical swabs the chief nurse and theatre sister were frequently required to describe current checking policies and talk about the pressures that could undermine staff compliance.
One of the most significant challenges to this process related to the influence of professional status. During the meetings, turn-taking tended to follow a hierarchical pattern with doctors often speaking first and most, senior nurses and managers having some voice, and junior staff talking only when asked. In many cases, the review of incident data was dominated by one or two professional or managerial representatives who assumed greater authority, especially clinical groups who appeared to assume and justify their dominance based upon their clinical expertise within the particular area. As such, we found groups such as managers tended to have a greater voice for more procedural investigation matters, including when forms should be completed and the writing of the report. However, variability in participation between groups not only reflected status differences but also differences in interpretation of RCA. Despite being promoted as addressing the systemic and organizational factors that frame patient safety, the analysis of reports and interviews showed how in many cases the investigative focus remained on local clinical practices and context, especially when dominated or led by clinicians.
Significantly, these difficulties applied more often when the facilitators did not have a clinical background and were less prominent when a clinician or an expert nurse was in charge. It appeared at times that the authority of the facilitators and their need to ask repeatedly ‘what do you do normally’ put them in a weak position. This raises questions about the robustness of the process and its capacity to grant specialized RCA investigators a status comparable to that of the most authoritative practitioners.
Most of the investigation teams made few attempts to utilize the available techniques from the RCA toolkit, such as the ‘five whys’ or ‘fishbone’ tools. Across all incidents the timeline was the only method used to systematically analyse evidence. On reflection, this time-orientated style of analysis appeared to align better with the prevailing mindset of clinicians than the tools drawn from an engineering tradition. The critical question here is whether this derives from a lack of training in RCA, from resistance to the use of its tools, or from the fact that these tools do not speak to the nature of the practice they are supposed to investigate. While most RCA tools are aimed at providing a helicopter view, health-care practitioners seem to favour tools that help them reconstruct the event in its temporal unfolding and from the fictitious perspective of being there, following a narrative rather than an argumentative style of communication. The question is therefore whether only more and better training in exploiting the existing techniques is needed or whether new and more appropriate analytical tools are needed that are attuned to the nature of clinical activity.
A further significant finding was that many potential root causes were discounted in the analytical processes, often based upon the assumption that such latent factors were not easily resolved (e.g. lack of resources) or because the complexity and ambiguity of the issue would not allow it to be resolved with a single, clearly containable countermeasure. For example, during one observation it clearly emerged that in liability-prone activities such as childbirth, minor troubles tended to absorb all available medical attention, resulting in major potential problems elsewhere. However, the analysis stopped short at asking whether the current liability pressure was one of the hidden causes in this case (instead of making the hospital safer by making doctors more careful, the risk of legal consequences simply pushed the risk elsewhere). The discussion remained instead on smaller and more manageable issues, such as the working climate and the training of young doctors.
This issue was often compounded by the fact that investigation teams were usually under pressure to produce quick and tangible results or that some of the critical constituencies involved in an incident were not represented at the meeting. Accordingly, analysis tended to centre on either the identification of factors that could easily be fixed or the allocation of responsibility to organizational groups not represented in the process.
Finally, little effort was made to explore connections between current incidents and other occurrences. In most of the meetings, reported incidents were examined in sequence following the order they were filed. Patterns of incidents were rarely discussed. In most cases participants vaguely referred to them with expressions such as ‘it's another one of those’, ‘it's the old same story’. At most, they would share some anecdotes of similar incidents that occurred in the past, ‘This reminds me of another case we had recently’. However, systematic analysis rarely followed. Trends were only addressed when several serious incidents of the same type occurred very closely in time, so that the pattern was so evident that it could not be ignored. In all other cases the unit of analysis remained mostly at the level of the single incident and the investigations often failed to build on previous experience and exploit previous learning.
The combination of the above factors meant that while those directly involved in the RCA benefitted from the investigations, the capacity of the process to generate collective and organizational learning was limited. Because of its localized nature, the learning very often was not shared across the organization, let alone more widely.
Dealing with emotions and blame
While most of the policy literature presents RCA as a rational or technical endeavour, little or no attention is paid to emotions. Our study found, however, that anxiety, fear and shame significantly affect the RCA process and its outcomes.
First, RCA meetings were often emotionally charged, especially when investigation teams were composed of staff directly involved in the incident. The RCA process seemed to help the participants to cope with the stress and unspoken sense of guilt associated with an adverse event. 19
It was a lengthy meeting and, obviously, very distressing for the staff involved. They were very distressed because the lady had died … it was upsetting. (Nursing manager)
However, the RCA process was scarcely sensitive (if not insensitive) towards such emotions. Moreover, these were often perceived as conflicting with the objective view of the RCA. As a result, emotions were not legitimized, properly expressed or managed. Conflicts, anxiety and anger were thus rarely brought to the surface, guilt denied and uncertainty (and ignorance) downplayed. While some individual facilitators appeared able to establish an environment where emotions could be aired and elaborated, others appeared intimidated by the occasional burst of anger and conflict. In these cases, the facilitators tended to rigidly follow the procedure, inviting participants to stick to the facts and ‘evidence’. In such instances, there was a decline in the quality of interaction: people would go quiet; they would show signs of disengagement; they were keen to finish quickly. The discussion would swiftly revert to general, abstract and safe themes and the focus moved to how things should be done. When this happened it could be anticipated that the result of the investigation was sub-optimal and that the nature of the interaction had got in the way of the effort to progress learning:
[The RCA participant] was very distressed, so it was useful to go back, sit and have a quieter chat with her, to be able to go through it a bit more supportively … she cried and we had a cuddle, but actually to talk through it. The other staff nurse who wasn't involved was able to say ‘I'll completely change my practice now’ … I think the process was too structured … it didn't give them a chance, really, to say anything. (Nursing manager)
In the best of cases, as in the example above, the shortcomings of the process were dealt with on an ad hoc basis thanks to the personal sensitivity of some of the participants. In many other cases, not actively engaging with emotions became a barrier to learning from the adverse event.
Second, blame and fear of being blamed were present during the RCA process. While the idea of no blame is at the core of current approaches to learning from adverse events, it remains a distant ideal. Blame is still in circulation, yet by imposing a strict, politically correct way of speaking, issues are left undiscussed. This in turn produces blind spots both in the discussion and analysis of incidents. In one case, after a not very successful discussion, a second, informal and quite lengthy corridor meeting took place. For 45 minutes, participants spoke out about their concerns, talked openly about blame, and aired the emotions repressed during the formal meeting. While such a clear-cut distinction between formal and informal encounters was observed only occasionally, off-the-record conversations where people spoke about their emotions and aired their concerns occurred in almost all RCAs.
Handling emotions and blame constitutes a further challenge in conducting RCAs. The fact that current policy tends to ignore this aspect and that a procedure-oriented approach is adopted during the training creates potential problems, as emotions, anxiety and blame represent an important aspect of dealing with the aftermath of adverse events.
Formulating and implementing change
Given the issues outlined above, the formulation and implementation of change, especially through the investigation report, faced several challenges. Reports are both the end product of the investigation and the tool through which recommendations are circulated for future implementation. However, the production of the final report seemed at times to become an end in itself without necessarily contributing to change. Lead investigators often pointed out that their work would be evaluated on the basis of the quality of the report sent to the management and often forwarded to the regional office. Producing a well-presented, competent looking document took precedence over making change happen.
At the same time, given that reports are presumed to capture the real causes, dissent and conflict were frequently neglected and ignored. For example, during one RCA meeting several participants raised concerns about the suggested attribution of responsibility emerging from the investigation. Different interpretations emerged during the process, each supporting a very different reconstruction of the event and attribution of causes. In the final report, however, such variation and discrepancy in the interpretation of evidence was completely neglected. Only rarely did reports capture the controversies that had emerged or mention that the underlying cause remained ambiguous. Related to this, the narratives presented in the final investigation report usually presented a linear account of the events leading up to a safety incident, often neglecting the complex details of how factors related or combined. Even when discussing the latent factors of an incident, the official accounts remained closely bound to the temporal unfolding of the incident and rarely presented a more analytical argumentation.
A third challenge relates to the need to produce a convincing, acceptable and workable action plan, which compounded the tendency to discuss issues in terms of available solutions rather than root causes. The causal factors identified were thus phrased in terms of: ‘what would have helped here …’, or ‘what we do in obstetrics in such occasions is to …’ The formulation of service improvements was therefore limited by a lack of attention to the complexity of causal factors which meant that recommendations for change tended to centre on small improvements that could be delivered at a departmental level. This tended to include disciplinary action or additional training for staff who were seen as at fault, but neglected attention to wider organizational change that might involve more substantial resource implications, or indeed were seen as beyond the gift of the investigation team.
Finally, and somewhat surprisingly, there was little evidence that the investigation teams had a coherent orientation towards managing change. Although risk managers had received training and perfected their skills in diagnosing patient safety problems, they were totally unprepared to address the challenges of turning recommendations into sustainable changes. They limited themselves to designating a clinician to be in charge of implementing the actions proposed in the action plan. Occasionally, they would also ‘follow the actions through’ a couple of months down the line to verify that they had been put in place. They saw themselves mainly as friendly and collaborative investigators and controllers, and change was someone else's responsibility. Accordingly, the issues of how to facilitate change, how to address the likely resistance to change, and how to fit the changes needed to prevent the re-occurrence of incidents with competing agendas and initiatives were scarcely if ever considered.
Discussion
Although RCA provides a powerful tool for informing learning in many high-reliability organizations, its use in health care remains potentially uneven and challenging. Despite being promoted as a rational and linear device for learning, inclusive of a comprehensive conceptual model and toolbox of methods, it is unclear whether this model or the tools are having the desired effect. Research from the UK and Australia highlights how resource and time constraints, and a lack of expertise can hinder investigations and constrain service improvement,11,20 usually followed by calls for more advanced or thorough training. 8 Our findings deepen our understanding of the challenges in forming the investigation team, gathering evidence, analysing data, and producing recommendations for service improvements. This highlights the difficulties of articulating the principles of patient safety, specifically RCA, to the health-care workforce, with the consequence that only pockets of awareness and expertise exist, often limited to designated risk managers. Moreover, analytical tools appear to have limited resonance or utility to staff, suggesting a potential difference of emphasis between occupational groups. 21 It also suggests that there are difficulties in co-ordinating the involvement of staff groups, with status differentials restricting an open and thorough analysis of findings. Related to this, issues of blame still permeate these processes, with RCA being used as a technique to rationally and legitimately allocate responsibility to particular organizational groups. Finally, RCAs are beleaguered by the cultural differences, inter-occupational politics and organizational pressures that often undermine forms of service improvement.
There are many ways that the contribution of RCAs to organizational learning might be enhanced (Table 2). First, participation in an investigation needs to be more thoroughly acknowledged and rewarded in staff appraisal or departmental improvement plans, so as to encourage participation and safeguard the time and contribution of staff. Second, it is important to enhance the leadership capabilities and skills of investigation leads to direct and oversee the process, negotiate with senior organizational groups and manage the potential hierarchy and status differentials within health care, as well as the psychological aspects of the process. Although most investigation leads are trained in the techniques of RCA, other health-care leads and managers would benefit from similar training to improve participation and understanding. Third, the RCA toolkit needs to be revised and simplified. While the principle that incidents should be systematically investigated should be reinforced at all levels, the tools used to shed light on adverse events should be commensurate with the constraints of life on the frontline. A reasonable balance needs, therefore, to be struck between length and depth of investigations and return in terms of learning. Simplified tools capturing the essential features of the event would offer a better return on investment. Finally, more thought should be given to the process that follows the formal conclusion of investigations. Currently, RCA recommendations appear not to be widely shared and not connected with practice, with few explicit procedures to review and audit change. There also seems to be a lack of attention to the need to actively and competently manage implementation. The risk is that RCA facilitators increasingly see themselves as forensic investigators while their role should be that of change agents. This aspect needs to be addressed to prevent investigations becoming yet another box to be ticked.
Table 2. Summary of lessons learned
The study has several limitations. First, it included only two health-care providers. The time and resource constraints of ethnographic research, especially the processes of shadowing staff, attending meetings and interviewing study participants, placed a constraint on the feasible number of sites. Second, the study is focused exclusively on the English NHS and the hospital context and makes no claims about the experiences of undertaking RCA investigations in other settings or systems. Reflecting upon the limitations of the study and the findings, further lines of research can be identified, especially around how to deepen analysis in the RCA process and the translation of findings into practice. The findings also provide a basis for further large-scale quantitative work, including the development of a survey that could be circulated nation-wide with the findings associated with levels of reporting and possible performance improvements.
Footnotes
Acknowledgements
The research team also included Jacky Swan and Peter Spurgeon whose invaluable contribution is gratefully acknowledged. The research was funded by the Warwick International Manufacturing Research Centre (WIMRC), Project ‘Improving the capacity of healthcare organisations to act on evidence in patient safety’ (PTOC21). The WIMRC is funded by the UK EPSRC with supplementary support from collaborating partners.
