Abstract
Dynamic risk and protective factors serve to assess the violence risk level of (forensic) psychiatric patients and offer guidance to clinical interventions. Risk assessment scores on Historical Clinical Risk Management–20 (HCR-20) risk factors and Structured Assessment of Protective Factors for violence risk (SAPROF) protective factors at different treatment stages were compared with violent incidents during treatment for 399 multidisciplinary coded assessments on 185 male and female forensic psychiatric patients. At later stages of treatment, fewer risk factors and more protective factors were observed, and predictive validities were higher. The HCR-20 and SAPROF scores showed good overall predictive validity for inpatient violence. The combination of risk factors and protective factors was a good predictor of incidents of aggressive behavior for different groups of patients, such as patients with violent or sexual offending histories, patients with major mental illnesses or personality disorders, and patients with a high score on psychopathy. Implications of these findings and recommendations for future research are discussed.
Violence risk assessment has emerged as an increasingly valuable specialty to assist clinicians in forecasting the likelihood of violence, understanding its causes, and preventing its (re)occurrence (Skeem & Monahan, 2011). Thorough assessment of risk and protective factors for violence is vital for risk appraisal and violence prevention in clinical practice. To improve the short- to medium-term prediction of aggression within forensic inpatient settings, it is viable to use risk assessment measures that are sensitive to important clinical changes (Chu, Thomas, Ogloff, & Daffern, 2013). Well-established structured professional judgment (SPJ) tools generally incorporate potentially dynamic factors, aiming to enable risk evaluations that are sensitive to personal and situational change. The inclusion of potentially changeable or dynamic factors (hereafter called “dynamic factors”) empirically related to violence risk brings forth another possible use of risk assessment: the guidance of clinical interventions.
Changeable factors provide opportunities for violence prevention as they may inform treatment plans and risk management strategies intended to diminish violence risk (de Ruiter & Nicholls, 2011; Douglas & Skeem, 2005). After determining which dynamic risk factors are present and which protective factors could potentially be developed, mental health care professionals may use or target these specific factors in treatment. Although surprisingly little evidence is available regarding the possibility of specific treatment interventions to generate changes in presumably dynamic factors, many treatment efforts aim to do just that. Periodic re-assessment can provide more insight into the achieved changes in, and progress on, risk and protection. Thus, dynamic risk assessment may serve three purposes: (a) assessment of current risk level, (b) provision of guidance to treatment interventions, and (c) evaluation of change.
Mixed results have been found regarding the value of dynamic factors. A meta-analysis of different risk assessment tools by Campbell, French, and Gendreau (2009) found better results for static than for dynamic risk factors in terms of predicting inpatient violence. In contrast, a more recent study by Wilson, Desmarais, Nicholls, Hart, and Brink (2013) on repeated risk assessments of forensic psychiatric inpatients demonstrated that dynamic risk factors significantly predicted institutional violence, even after controlling for static risk factors. The dynamic Clinical (C) and Risk Management (R) factors of the Historical Clinical Risk Management–20 (HCR-20 Version 2; Webster, Douglas, Eaves, & Hart, 1997) have been demonstrated to be useful in clinical practice for the assessment of violence risk and guidance of treatment for many different samples of patients (see de Vogel & de Ruiter, 2006; Douglas, Blanchard, Guy, Reeves, & Weir, 2010; Guy, Packer, & Warnken, 2012; O’Shea, Mitchell, Picchioni, & Dickens, 2013).
Several studies have shown the changeability of the dynamic clinical and risk management factors during treatment, the relationship between changing risk assessment scores and treatment progress, and the positive effect of changes in risk factors on reduced violent behavior. Müller-Isberner, Webster, and Gretenkord (2007) demonstrated an orderly correspondence between decreased dynamic risk factor scores and lower security levels in forensic inpatient psychiatry and concluded that the C and R scales were useful measures to gauge progress in forensic psychiatric inpatient treatment. Tengström and colleagues (2006) also found that HCR-20 scores were positively correlated with patients’ level of security. In addition, a study by Dernevik, Grann, and Johansson (2002) found that the predictive validity of the HCR-20 was better for lower security management conditions than for the high-security stage. In an HCR-20 study with forensic psychiatric inpatients by Belfrage and Douglas (2002), significant changes in C and R scores were demonstrated between three repeated assessments with 6-month time intervals in between. In a follow-up of this study, Douglas, Strand, and Belfrage (2011) found significant changes in C scores between four consecutive assessments. They differentiated between subgroups of patients that showed different patterns in C scores over time and demonstrated a clear correspondence between the improvements in each group and variations in violent behavior. In a prospective multicentre study of discharged patients with schizophrenia living in the community, Michel and colleagues (2013) demonstrated the changeability of all dynamic risk factors in the HCR-20 between repeated assessments at five different time-points. They found that changes on three of the C items and on three of the R items were related to differences in aggressive behavior.
A relatively understudied aspect of risk assessment is that of protective factors for violence risk (de Ruiter & Nicholls, 2011; Fougere & Daffern, 2011). It has been argued that protective factors offer balance in risk evaluations and are essential for risk assessment and treatment guidance (Miller, 2006; Rogers, 2000). Protective factors are defined as factors that contribute directly or indirectly to the prevention of violence risk. They may be personal, social, or situational factors. To date, much remains unknown about protective factors and their contribution to the assessment of violence risk. It could be argued that some risk and protective factors lie at the opposing end of the same risk domain (e.g., lack of personal support vs. supportive social network), whereas other factors do not have a risk or protective counterpart (e.g., the risk factor personality disorder or the protective factor professional care). However, others may not be convinced yet of the uniqueness of protective factors and may view them as merely the opposite of risk factors.
Theoretically, the presence of a protective factor appears different from the absence of a risk factor. Although the absence of a risk factor entails no further increase in risk, the presence of strength factors actually provides protection against risk factors that are present in a certain situation, thus lowering the risk of a violent outcome. Similarly, the absence of protective factors (e.g., having no supportive intimate relationship, not participating in structured leisure time activities, not having clear life goals, not taking medication, or not having a court order) should in itself not lead to an increase in violence risk. However, the presence of these factors could provide a significant violence risk reduction when certain risk factors are present (e.g., medication may benefit a psychotic patient, an intimate partner may assist someone in staying away from drugs or alcohol). Although the theoretical assumptions behind protective factors have not yet been settled, at the very least, the positive approach of protective factors seems to have clinical value, inspiring patients and clinicians to strive for improvement. Studies on risk assessment in adolescent samples have reported good results for the clinical utility and predictive validity of protective factors for violence prevention (Lodewijks, de Ruiter, & Doreleijers, 2010; Lösel & Farrington, 2012; Rennie & Dolan, 2010). Studies on strength factors in adult samples have found good predictive value for inpatient aggression (Desmarais, Nicholls, Wilson, & Brink, 2012; Wilson, Desmarais, Nicholls, & Brink, 2010) and for violent recidivism in discharged prisoners (Ullrich & Coid, 2011).
Given the focus on risk factors in the HCR-20, an additional risk assessment tool was more recently developed that specifically assesses protective factors for violence risk in adults: the Structured Assessment of Protective Factors for violence risk (SAPROF; de Vogel, de Ruiter, Bouman, & de Vries Robbé, 2009, 2012). The 17 protective factors in the SAPROF are predominantly dynamic and have been shown to be good predictors of inpatient violence and self-harm (Abidin et al., 2013), discharge from psychiatric treatment (Davoren et al., 2013), and violent recidivism after treatment for violent offenders (de Vries Robbé, de Vogel, & de Spa, 2011), as well as for sexual offenders (de Vries Robbé, de Vogel, Koster, & Bogaerts, 2015). The protective factors of the SAPROF were shown to be predictive of violent recidivism after discharge, even when controlling for risk factor scores. Moreover, the combined use of the HCR-20 risk factors and SAPROF protective factors has demonstrated incremental predictive validity over risk factors alone (de Vries Robbé et al., 2011; de Vries Robbé, de Vogel, & Douglas, 2013). When protective factor scores were deducted from risk factor scores, the resulting risk level corrected for available protection (HCR-20 score − SAPROF score) predicted violent re-convictions significantly better than the HCR-20 alone.
A further clinically valuable finding has been that improvements on protective factors during treatment were shown to be related to reduced levels of violent behavior after treatment. In a recent retrospective multiphase file study on a patient sample similar to that of the current study, the within-patient changeability of the dynamic HCR-20 and SAPROF factors was demonstrated by comparing ratings at the start and at the end of forensic psychiatric treatment (de Vries Robbé, de Vogel, Douglas, & Nijman, 2015). It was found that improvements on risk and protective factor scores during treatment were indeed in themselves predictive of less violent recidivism after discharge from treatment. Although it could not be studied whether the observed changes were in fact due to specific treatment interventions, this finding provided more insight into the potentially changeable nature of the dynamic factors in both tools and into the relationship between assessed improvements and reduced violent recidivism. The more patients changed for the better on their risk and protective factors during treatment, the more successful their rehabilitation appeared to be.
Present Study
This study set out to provide a prospective clinical evaluation of the relationship between a combined assessment of risks and strengths and treatment phasing in forensic psychiatric practice. Furthermore, this study investigated the predictive validity of risk factors and protective factors for aggressive incidents in clinical practice across different groups of patients. More insight into these topics is vital for evaluating the usefulness of the HCR-20 and the SAPROF for guiding treatment initiatives and risk management decision making. To our knowledge, a comparison of the predictive validity of dynamic risk and protective factor scores at different stages of treatment and for different types of patients within the same inpatient psychiatric setting has not yet been reported.
The aim of the present study was twofold. First, we aimed to assess the differences in dynamic risk and protective factor scores between different stages in forensic psychiatric treatment. It was expected that as patients progressed through the various stages of treatment, dynamic risk and protective factors would improve accordingly. That is, fewer risk factors and more protective factors would be present in later stages of treatment. The second aim was to assess the predictive validity of the HCR-20 and the SAPROF for aggressive incidents during clinical treatment, differentiating between different treatment stages and different groups of patients. It was hypothesized that the HCR-20 and the SAPROF would be related to violence during all stages of treatment and that this relationship would be present for different groups of patients.
Method
Setting
The study was carried out at the Van der Hoeven Kliniek in Utrecht, a forensic psychiatric hospital in the Netherlands. This hospital treats patients convicted of violent or sexually violent offending, for which the court found them not fully responsible due to their psychopathology (i.e., major mental illness or personality disorder). Generally, patients are admitted after a period of imprisonment. Patients are considered at high risk of reoffending and are therefore sentenced to mandatory inpatient treatment (terbeschikkingstelling—tbs). The main goal of treatment is to reduce violence risk. The Van der Hoeven Kliniek follows a cognitive behavioral and relapse prevention model through an eclectic approach based on personal and social responsibility. Among the many aspects of treatment are psychiatric support, individual psychotherapy, group-based interventions, (psycho-)education, social network involvement, work skills development, and engagement in structured leisure activities. All activities aim to assist with a safe and successful reintegration into society.
Treatment consists of four main stages, of which the first two are more internally hospital focused and the last two are more externally society focused: (a) Intramural treatment without leaves, (b) Intramural treatment with supervised leaves to the community, (c) Intramural treatment with unsupervised leaves to the community, and (d) Transmural treatment—living in private or hospital housing in the community while supervised by a hospital community treatment team. The court order is in effect for as long as deemed necessary by the court, with the aim to rehabilitate patients safely back into society. The current average treatment length at the hospital from admission to discharge is about 7 years. The necessity of prolonged treatment is periodically being re-evaluated by the hospital by means of a thorough evaluation of treatment progress and risk of violence, which is communicated to the court. Violence risk assessment is carried out with the HCR-20 (risk factors) and the SAPROF (protective factors) when advice to the court is required or when re-assessment is deemed necessary by treatment staff to evaluate the present level of risk or the feasibility of risk management plans. Assessment outcomes are used to guide decision making regarding treatment phasing. Risk assessments are carried out at least once a year; however, in practice, the time to re-assessment generally varies between 6 and 12 months, depending on when a new assessment is needed.
Participants
This study included 399 assessments of 185 forensic psychiatric patients. The majority of the sample was male (79%, n = 146). Mean age at assessment was 41 years (SD = 9.71 years, range = 21-73). Of the 185 patients, 70% (n = 130) had a history of general (non-sexual) violent offending, while 30% (n = 55) had a history of (predominantly) sexually violent offending. Of the sexual offenders, 60% (n = 33) also had a history of general violent offending. Most of the patients (89%, n = 165) were diagnosed with at least one personality disorder (particularly Cluster B), while 53% (n = 98) of the patients were diagnosed with a major mental illness (primarily psychotic disorders, such as schizophrenia). Of the patients diagnosed with a major mental illness, 81% (n = 79) were also diagnosed with at least one personality disorder. Comorbidity with a history of serious problems with substance use was present in 69% (n = 128) of the cases. A high score (≥30) on the Psychopathy Checklist–Revised (PCL-R; Hare, 2003) was present for 33 patients (18%). Of the 399 assessments, 51 (13%) were carried out for the Intramural situation without leaves, 116 (29%) for the Supervised Leave situation, 79 (20%) for the Unsupervised Leave situation, and 153 (38%) for the Transmural situation—living outside the hospital.
In most international forensic psychiatric settings (e.g., in the United Kingdom or in North America), patients predominantly suffer from major mental illnesses (see de Ruiter & Trestman, 2007), and most clinical studies on HCR-20 have been carried out on populations with predominantly psychotic disorders. The present study compared predictive accuracy of risk assessment results for patients with a primary diagnosis of a personality disorder (a score of 2 on HCR-20 item H9, in the absence of a score of 2 on HCR-20 item H6) and patients with a primary diagnosis of a major mental illness (a score of 2 on HCR-20 item H6, regardless of H9 score). In addition, comparisons were made for patients scoring high and low on psychopathy, for patients with a history of violent offending versus those with a history of sexual offending, and for men versus women. Finally, separate analyses were carried out for patients in the more internally focused stages of treatment and for those in the more externally focused stages of treatment.
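The grouping rule for primary diagnosis described above can be written out as a small function. The following is a minimal sketch only; the function name and the returned labels are illustrative assumptions and are not part of the HCR-20 or of the study's own procedures.

```python
from typing import Optional


def primary_diagnosis_group(h6: int, h9: int) -> Optional[str]:
    """Classify one assessment by primary diagnosis from two HCR-20 items.

    h6: score on item H6 (0, 1, or 2)
    h9: score on item H9 (0, 1, or 2)
    The labels "MMI" and "PD" are hypothetical names chosen for this sketch.
    """
    for score in (h6, h9):
        if score not in (0, 1, 2):
            raise ValueError("HCR-20 items are scored 0, 1, or 2")
    if h6 == 2:
        # A score of 2 on H6 defines the major mental illness group,
        # regardless of the H9 score.
        return "MMI"
    if h9 == 2:
        # A score of 2 on H9 without a score of 2 on H6 defines
        # the personality disorder group.
        return "PD"
    # Neither criterion is met; the assessment falls outside both groups.
    return None
```

Under this rule, a patient scoring 2 on both items is counted in the major mental illness group, which matches the description that H6 takes precedence.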
Measures
HCR-20
The HCR-20 (Webster et al., 1997) is the most widely used SPJ tool for the structured assessment of violence risk in forensic psychiatric practice (Singh et al., 2013). The HCR-20 contains 20 risk factors: 10 Historical (H) factors, five C factors, and five R factors (see Table 1 for a list of the risk factors). The items are scored on a 3-point scale (0-2), with higher scores reflecting the presence of a risk factor. PCL-R scores were available to code item H7, Psychopathy. The 10 H factors of the HCR-20 retain their high scores once coded as present at any time, while the 10 dynamic C and R factors are presumed to be changeable and expected to decrease as treatment progresses.
Table 1. Descriptive Statistics HCR-20 (N = 399 Assessments)
Note. HCR-20 = Historical Clinical Risk Management–20.
SAPROF
The SAPROF (de Vogel et al., 2009; de Vogel, de Ruiter, et al., 2012) is an SPJ tool specifically for the assessment of protective factors for violence risk. The tool is intended to be used in conjunction with a risk-focused assessment tool, such as the HCR-20. It contains 17 protective factors organized within three scales: five internal factors (e.g., Intelligence, Coping, and Self-Control), seven motivational factors (e.g., Work, Leisure Activities, and Life Goals), and five external factors (e.g., Social Network, Professional Care, and External Control; see Table 2 for an overview of all protective factors in the SAPROF). The factors are rated on a 3-point scale (0-2), with higher scores indicating a protective factor is present. The first two internal SAPROF factors are static and generally do not change during treatment. The other factors are all dynamic and thus potentially changeable. Factors 3 through 14 are dynamic factors that are expected to increase during treatment as these are mostly internal, motivational, and social network factors that may benefit from (psychotherapeutic) interventions. The last three factors concern protection from the treatment team, housing supervision, and the court order, which are vital protective factors that are present for all patients during mandatory clinical treatment and are not expected to change until the end of treatment, when they actually decrease as the mandatory treatment is dropped. After the items have been rated, the assessor has the option to indicate items as particularly important for the individual. Items that provide vital protection for the individual may be marked as key factors, while items that are considered to be most relevant as treatment goals for the near future may be marked as goal factors.
Table 2. Descriptive Statistics SAPROF (N = 399 Assessments)
Note. SAPROF = Structured Assessment of Protective Factors for violence risk; HCR-20 = Historical Clinical Risk Management–20.
In addition to rating the presence of the 20 HCR-20 risk factors and the 17 SAPROF protective factors, a concluding Final Risk Judgment is made by integrating and combining the protective factors (including key/goal indications) and risk factors that are present for the individual in his or her situation. In the current study, the Final Risk Judgment was made on a 5-point scale: low, low-moderate, moderate, moderate-high, or high. For comparison purposes, total scores were computed for the HCR-20, the SAPROF, and their subscales. In addition, an overall total score was constructed, consisting of the total risk factor score minus the total protective factor score, which was labeled the HCR–SAPROF index. This index, the risk score corrected for available protection, is seen as the closest total score equivalent to the Final Risk Judgment as it reflects risk level while taking into account the level of protection that is present. The calculation of total scores is done purely for research purposes; in clinical practice, only final judgments are made.
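The arithmetic behind the HCR–SAPROF index is simple enough to sketch. The snippet below assumes that item scores are supplied as plain lists of integers; the function names are hypothetical and serve illustration only, since in clinical practice only the Final Risk Judgment is used.

```python
from typing import Sequence


def total_score(item_scores: Sequence[int]) -> int:
    """Sum item scores after checking that each lies on the 0-2 scale."""
    for score in item_scores:
        if score not in (0, 1, 2):
            raise ValueError("items are rated 0, 1, or 2")
    return sum(item_scores)


def hcr_saprof_index(hcr_items: Sequence[int], saprof_items: Sequence[int]) -> int:
    """Total risk factor score minus total protective factor score."""
    if len(hcr_items) != 20:
        raise ValueError("the HCR-20 has 20 risk factors")
    if len(saprof_items) != 17:
        raise ValueError("the SAPROF has 17 protective factors")
    return total_score(hcr_items) - total_score(saprof_items)
```

With 20 risk items and 17 protective items, each scored 0 to 2, the index can range from −34 (no risk factors, all protective factors fully present) to 40 (all risk factors fully present, no protection).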
Procedure
The HCR-20 and the SAPROF were coded prospectively in clinical practice for routine assessment purposes. In addition to the knowledge about patients from daily interaction, hospital files were consulted in the assessment process. Hospital files consist of biographical information, psychological and psychiatric assessment reports, court reports on treatment progress, and case notes on treatment plans and treatment evaluations. Ratings of the HCR-20 and the SAPROF were carried out in multidisciplinary teams that each consisted of three evaluators: (a) a researcher/psychodiagnostic worker, (b) a treatment supervisor responsible for the patient, and (c) a sociotherapist working as group leader on the patient’s ward. Each assessment is first carried out individually and independently by each of the three evaluators. Subsequently, the evaluators discuss the assessment in a consensus meeting of 1 to 1.5 hr. During this meeting, consensus scores are agreed on through discussion for all risk and protective factors in the tools and for the Final Risk Judgment. In addition, at the consensus meeting, different risk scenarios and risk management plans are contemplated. Prior to being allowed to take part in these clinical risk assessments, all treatment staff are trained in the use of the HCR-20 and the SAPROF in 1-day workshops. Through intensively discussing the assessment of each case with multiple raters, consensus meetings provide a continuous feedback loop and therewith a constant training in the exact meaning of the different risk and protective factors in the tools. Good results were previously found for the interrater reliability of the Dutch HCR-20 and SAPROF (see de Vries Robbé et al., 2013). The consensus ratings are used for all analyses in the present study.
Statistical Analyses
To determine the correlations between the HCR-20 and the SAPROF, Pearson’s correlation analysis was used. Spearman’s rho correlation analysis with Bonferroni correction was applied for the item-level correlations. Pearson’s point-biserial correlation analysis was used to examine the correlations between the scores on the different tools and incidents of aggression during the year following the assessment. To evaluate the overall difference in scores between the four treatment stages, an analysis of variance (ANOVA) was carried out, followed by post hoc comparisons with Tukey HSD (honest significant difference) correction for the differences in scores between the four consecutive treatment stages. For all pairwise comparisons, Cohen’s d effect sizes were calculated. Critical values for Cohen’s d are d ≥ .80 = large, and .50 ≤ d < .80 = moderate (Cohen, 1988). To be able to compare results between the more internally focused part of treatment (the Intramural stage and the Supervised Leaves stage) and the more externally focused part (the Unsupervised Leaves stage and the Transmural stage), in further analyses all assessments from the first two treatment phases were combined (n = 167, 42%) and compared with the combined assessments from the last two treatment phases (n = 232, 58%).
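As an illustration of the effect size thresholds, the sketch below computes Cohen’s d for two independent groups and attaches the labels given above. It assumes the common pooled standard deviation formulation, which the text does not specify; the function names and the label for values below .50 are likewise assumptions.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d with a pooled standard deviation (an assumed formulation)."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect_size(d: float) -> str:
    """Apply the thresholds from the text: >= .80 large, .50 to < .80 moderate."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    # The text names no category below .50; this label is a placeholder.
    return "below moderate"
```

Applied to the effect sizes reported for the HCR–SAPROF index, `label_effect_size(1.43)` gives "large" and `label_effect_size(0.59)` gives "moderate".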
To assess the predictive validity for (no) violent incidents of the HCR-20 and the SAPROF individually (as well as that of their scale and item scores) and of the combined HCR–SAPROF index and the Final Risk Judgment, receiver operating characteristics (ROC; Mossman, 1994; Rice & Harris, 2005) analyses were conducted, resulting in area under the curve (AUC) values. AUC values of .71 and above are considered large (Rice & Harris, 2005). To test for significant differences between AUC values, comparative analyses were carried out using the ROCTools statistical software for the analysis of ROC curves (Allaire & Cismaru, 2007). This program applies the DeLong, DeLong, and Clarke-Pearson (1988) method for comparing correlated ROC curves (different tools, same sample) and the Hanley-McNeil Z-statistic method (Hanley & McNeil, 1983) for comparing independent ROC curves (same tools, different samples).
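For readers unfamiliar with these statistics, the following sketch shows one way to obtain an AUC value and to compare two AUCs from independent samples. It is not the ROCTools implementation: the AUC is computed as the proportion of correctly ordered incident/no-incident pairs, the standard error uses the approximation published by Hanley and McNeil, and the Z statistic assumes zero correlation between the two curves. All function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def auc(scores_incident: Sequence[float], scores_no_incident: Sequence[float]) -> float:
    """Probability that a random incident case scores higher than a random
    non-incident case; ties count as one half."""
    wins = 0.0
    for pos in scores_incident:
        for neg in scores_no_incident:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_incident) * len(scores_no_incident))


def hanley_mcneil_se(a: float, n_incident: int, n_no_incident: int) -> float:
    """Approximate standard error of an AUC value (Hanley-McNeil formula)."""
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    var = (
        a * (1.0 - a)
        + (n_incident - 1) * (q1 - a * a)
        + (n_no_incident - 1) * (q2 - a * a)
    ) / (n_incident * n_no_incident)
    return sqrt(var)


def z_independent_aucs(a1: float, se1: float, a2: float, se2: float) -> float:
    """Z statistic for two AUCs from independent samples (no shared cases)."""
    return (a1 - a2) / sqrt(se1 ** 2 + se2 ** 2)
```

For a protective tool such as the SAPROF, where higher scores are expected to go with the absence of incidents, the two arguments to `auc` would be swapped so that the value reflects prediction of no violent incidents.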
The 399 risk assessments included in the present study concerned 185 individual patients. Multiple assessments were available for 66% (n = 122) of the patients (range = 1-4 assessments per patient). For the comparative analyses between risk assessment scores at different stages of treatment, only one assessment per patient was used for each treatment stage to prevent bias from repeated assessments within the treatment stages. This resulted in 249 assessments (out of the available 399 assessments) over the four different stages of treatment: 42 for Intramural (17%), 72 for Supervised Leave (29%), 47 for Unsupervised Leave (19%), and 88 for Transmural (35%). For the ROC analysis, initially only one assessment per patient, the first available assessment, was included (n = 185) to prevent bias by possible inflated AUC values caused by the inclusion of multiple assessments per patient. However, when comparison was made with results from the ROC analysis on the full sample including multiple assessments per patient (N = 399), no significant differences in predictive accuracy were observed (see “Results” section). Therefore, in all subsequent ROC analyses, the full sample was included.
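The selection of one assessment per patient can be sketched as a small filtering step. The record layout (dictionaries with the keys `patient_id`, `stage`, and `date`) is a hypothetical assumption, as is the choice of the earliest assessment within each treatment stage; the text specifies the first available assessment only for the initial ROC analysis.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def first_per_key(assessments: Iterable[Dict], key_fields: Sequence[str]) -> List[Dict]:
    """Keep the earliest assessment for each combination of key_fields.

    Each record is a dict with hypothetical keys 'patient_id', 'stage',
    and 'date' (ISO 8601 strings, which sort chronologically).
    """
    selected: Dict[Tuple, Dict] = {}
    for record in sorted(assessments, key=lambda r: r["date"]):
        key = tuple(record[field] for field in key_fields)
        # setdefault stores only the first (earliest) record per key.
        selected.setdefault(key, record)
    return list(selected.values())


# One assessment per patient within each treatment stage (stage comparisons):
#   first_per_key(records, ("patient_id", "stage"))
# One assessment per patient overall (initial ROC analysis):
#   first_per_key(records, ("patient_id",))
```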
Outcome
Data on incidents of aggression were collected from daily hospital reports up until 12 months after the assessment. To ensure similar follow-up times for all patients, only assessments with sufficient follow-up time within the treatment stage for which the assessment was carried out were included in the present study. The minimum follow-up time was set at 10 months. Because the follow-up time after many assessments was less than 10 months (i.e., a new assessment was carried out within 10 months after the initial assessment or the patient had been discharged or moved up to a new treatment stage), only about half of the available clinical assessments were usable for the current study. This unfortunately implied that a longitudinal, within-patient comparison of changes in assessment scores over time was not possible. Instead, assessments carried out for different treatment stages were compared, regardless of the subject of the assessment.
Results
Tables 1 and 2 show the mean total, scale, and item scores on the HCR-20, the SAPROF and the HCR–SAPROF index. The correlation between the HCR-20 total score and the SAPROF total score was r = −.69 (p < .001). The highest inter-item correlations between the HCR-20 and the SAPROF were between risk item R3 Lack of Personal Support and protective item 13 Social Network (rS = −.75, p < .001) and between risk item C5 Unresponsive to Treatment and protective item 9 Motivation for Treatment (rS = −.61, p < .001).
Comparison of Assessment at Different Treatment Stages
In Table 3, the average total scores are presented for the HCR-20, the SAPROF, and the HCR–SAPROF index for each of the four treatment stages. HCR-20 scores were lower for patients in the further stages of treatment, F(3, 245) = 20.79, p < .001, while SAPROF scores were higher for further stages of treatment, F(3, 245) = 40.69, p < .001. As HCR-20 scores decreased and SAPROF scores increased, violence risk scores for the HCR–SAPROF index consequently also decreased across the stages of treatment, F(3, 245) = 35.70, p < .001. Comparison of the consecutive Intramural and Supervised Leave stages showed a significant decrease in HCR–SAPROF index scores, t(112) = −6.99, d = 1.43, p < .001. In turn, scores for the Unsupervised Leave stage were significantly lower than those for the Supervised Leave stage, t(117) = −3.09, d = 0.59, p = .014. Differences between the final two stages were not significant. Figure 1 presents the Final Risk Judgments per treatment stage, in which a similar decreasing risk pattern is observable.
Table 3. Mean HCR-20 and SAPROF Scores for Different Treatment Stages (n = 249)
Note. HCR-20 = Historical Clinical Risk Management–20; SAPROF = Structured Assessment of Protective Factors for violence risk.

Figure 1. Final Risk Judgments at Different Stages of Treatment in Percentages (n = 249)
Predictive Validity
All incidents of physical aggression (e.g., hitting, pushing) or threatening verbal aggression (e.g., comments such as “Next time I will kill you” or “You better watch out or I will hurt someone”) that resulted in confinement in a recovery or seclusion room for some period of time and/or which resulted in criminal charges (very few) were included as violent outcome. Although sexual violence was also included in this definition, sexual incidents rarely occurred. The overall observed aggressive incident rate, during the year following each assessment, was 11% (n = 44, out of 399 assessments), concerning roughly 39% physical violence (n = 17) and 61% verbal threats (n = 27). Aggressive incident rates were higher for the initial stages of treatment (Intramural 27%, n = 14; Supervised Leaves 15%, n = 17) than for the further stages of treatment (Unsupervised Leaves 10%, n = 8; Transmural 3%, n = 5). Psychopathic patients had the highest incident rate (21%, n = 15).
The point-biserial correlation between the total scores on the tools and incidents of violence was rpb = .31 (p < .001) for the HCR-20, rpb = −.27 (p < .001) for the SAPROF, and rpb = .32 (p < .001) for the HCR–SAPROF index. Table 4 shows the results from the ROC analyses for the total and subscale scores on the HCR-20 and the SAPROF, as well as for the HCR–SAPROF index and the Final Risk Judgment. The predicted outcome was aggressive incidents during the year following the assessment. The first analysis concerned one assessment for each patient, which showed good predictive validity for both the HCR-20 and the SAPROF (AUC = .77 and .76, respectively). Next, these results were compared with those from an analysis with multiple assessments per patient. As illustrated in Table 4, the results for the multiple assessment analysis were as good as those for the analysis that included only one assessment per patient (AUC = .79 and .75, respectively); no significant differences were observed.
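A point-biserial correlation is a Pearson correlation in which one variable is dichotomous, here coded as 1 for an aggressive incident and 0 for none. The sketch below illustrates the calculation under that coding; the function name and the coding are assumptions made for illustration and do not reproduce the study's data.

```python
from math import sqrt
from typing import Sequence


def point_biserial(scores: Sequence[float], incidents: Sequence[int]) -> float:
    """Pearson correlation between a continuous score and a 0/1 outcome."""
    if len(scores) != len(incidents):
        raise ValueError("scores and incidents must have the same length")
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(incidents) / n
    # Sums of cross-products and squared deviations; the shared 1/n factors
    # cancel in the ratio, so they are omitted.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, incidents))
    ss_x = sum((x - mean_x) ** 2 for x in scores)
    ss_y = sum((y - mean_y) ** 2 for y in incidents)
    return cross / sqrt(ss_x * ss_y)
```

A positive value indicates that higher scores go together with incidents, as for the HCR-20 and the HCR–SAPROF index, whereas a negative value indicates that higher scores go together with the absence of incidents, as for the SAPROF.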
Table 4. Predictive Validity (AUC Values) for Violent Incidents During Treatment (N = 399 Assessments, 1-Year Follow-Up)
Note. The values for the HCR-20, the HCR–SAPROF index, and the Final Risk Judgments concern violent incidents; the values for the SAPROF concern no incidents of violence. Final judgments are made on a 5-point scale. One assessment = One assessment per patient. AUC = area under the curve; HCR-20 = Historical Clinical Risk Management–20; SAPROF = Structured Assessment of Protective Factors for violence risk.
*p < .05. **p < .01. ***p ≤ .001 (two-tailed).
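To make the reported statistics concrete, the following minimal sketch computes an AUC and a point-biserial correlation by hand. It also illustrates the table note: an AUC for a protective score predicting no incident equals one minus its AUC for predicting an incident. The data are hypothetical toy values, not study data, and the function names are illustrative rather than the software used by the authors.

```python
from math import sqrt

def auc(scores, outcomes):
    """AUC as the probability that a randomly chosen case with outcome 1
    scores higher than a randomly chosen case with outcome 0 (ties count 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def point_biserial(scores, outcomes):
    """Pearson correlation between a continuous score and a 0/1 outcome."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(outcomes) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, outcomes))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / sqrt(var_s * var_y)

# Hypothetical toy data (NOT from the study): one row per assessment.
risk_scores = [12, 18, 25, 30, 22, 15, 28, 10]        # e.g., a risk total score
protective_scores = [20, 14, 9, 6, 12, 18, 8, 22]     # e.g., a protective total score
incident = [0, 0, 1, 1, 1, 0, 0, 0]                   # 1 = incident during follow-up

# Risk scores are evaluated against incidents.
auc_risk = auc(risk_scores, incident)

# Protective scores are evaluated against the absence of incidents.
no_incident = [1 - y for y in incident]
auc_protective = auc(protective_scores, no_incident)

r_risk = point_biserial(risk_scores, incident)
r_protective = point_biserial(protective_scores, incident)

print(auc_risk, auc_protective, r_risk, r_protective)
```

In this toy example the risk score correlates positively and the protective score negatively with incidents, mirroring the signs of the correlations reported above.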
Subsequently, the predictive accuracy was compared for the first two internally focused treatment stages (Intramural and Supervised Leave) and the last two externally focused treatment stages (Unsupervised Leave and Transmural treatment). The predictive validities of assessments for patients in the later stages of treatment were higher (AUC = .78-.85) than those of assessments in the earlier stages (AUC = .66-.68). Comparative analyses on the AUC values between the first and the last treatment stages showed a significant difference in predictive validity between treatment stages for the HCR-20 (Z = 2.40, p < .05) and for the HCR–SAPROF index (Z = 2.03, p < .05) but not for the SAPROF and the Final Risk Judgment. In particular, the HCR-20 H items (Z = 3.51, p < .001) and R items (Z = 2.11, p < .05) performed better during the later treatment stages.
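The text does not state which procedure produced the Z statistics. One common way to compare AUCs from two independent groups is to estimate each standard error with the Hanley and McNeil (1982) approximation and divide the AUC difference by the pooled standard error. The sketch below follows that approach as an assumption, not as the authors' method. The group sizes are derived from the counts in this section (31 incidents in 167 earlier-stage assessments and 13 incidents in 232 later-stage assessments), whereas the two AUC values are illustrative placeholders taken from within the reported ranges.

```python
from math import sqrt

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982).
    n_pos = cases with the outcome, n_neg = cases without it."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(variance)

def z_independent_aucs(auc_a, n_pos_a, n_neg_a, auc_b, n_pos_b, n_neg_b):
    """Z statistic for the difference between two AUCs from independent samples."""
    se_a = hanley_mcneil_se(auc_a, n_pos_a, n_neg_a)
    se_b = hanley_mcneil_se(auc_b, n_pos_b, n_neg_b)
    return (auc_a - auc_b) / sqrt(se_a ** 2 + se_b ** 2)

# Later stages: 13 incidents among 232 assessments.
# Earlier stages: 31 incidents among 167 assessments.
# The AUC values .82 and .67 are illustrative placeholders, not reported results.
z = z_independent_aucs(0.82, 13, 232 - 13, 0.67, 31, 167 - 31)
print(round(z, 2))
```

Because this sketch treats every assessment as independent and uses placeholder AUCs, its output should not be expected to reproduce the Z values reported above.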
No significant differences in predictive validity were observed between the HCR-20 and the SAPROF for any of the comparisons. Overall, the total HCR–SAPROF index score had the highest predictive accuracies (AUC = .70-.85). Comparative analyses on the AUC values showed significantly better predictive accuracy for the HCR–SAPROF index than for the SAPROF score for both the total sample and the externally focused treatment stage group, χ²(1, N = 399) = 9.40, p < .01 and χ²(1, n = 232) = 3.98, p < .05, respectively. No significant differences between the HCR–SAPROF index and the HCR-20 or the Final Risk Judgment were found.
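This section does not restate how the HCR–SAPROF index is computed. The sketch below assumes the definition used in earlier SAPROF studies: the SAPROF total score subtracted from the HCR-20 total score, with each of the 20 risk items and 17 protective items coded 0, 1, or 2. The function name and the validation checks are illustrative.

```python
def hcr_saprof_index(hcr20_items, saprof_items):
    """Assumed index: HCR-20 total minus SAPROF total.
    Higher values indicate more risk relative to protection."""
    if len(hcr20_items) != 20 or len(saprof_items) != 17:
        raise ValueError("expected 20 HCR-20 items and 17 SAPROF items")
    if any(v not in (0, 1, 2) for v in list(hcr20_items) + list(saprof_items)):
        raise ValueError("items must be coded 0, 1, or 2")
    return sum(hcr20_items) - sum(saprof_items)

# Hypothetical codings, not study data.
example_hcr20 = [1] * 20     # total of 20
example_saprof = [1] * 17    # total of 17
example_index = hcr_saprof_index(example_hcr20, example_saprof)
print(example_index)
```

Under this assumed definition the index can range from −34 (no risk factors, all protective factors fully present) to 40 (all risk factors fully present, no protective factors).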
Results are presented separately in Table 5 for different groups of patients: male patients with a history of violent offending (MV), male patients with a history of sexual offending (MS), female patients (F), patients with a major mental illness (MMI), patients with a personality disorder (PD), and patients with a high score (≥30) on the PCL-R. Overall, results for the different tools were fairly comparable across patient groups. The best predictive validities were found for the MS group, and the lowest predictive validities were found for the female patients. No significant differences were found between patients with a primary diagnosis of MMI and patients with a primary diagnosis of PD. Nor were significant differences found between patients with high scores on the PCL-R and those without, despite the fact that several of the HCR-20 and SAPROF subscale scores were not significant predictors for the high-psychopathy group. Comparison of total score AUC values for the tools revealed no significant differences in predictive accuracy between male and female patients. However, the HCR-20 R scale showed a significantly lower predictive value for females (Z = 2.27, p < .05). When comparing the predictive accuracy of the tools for male patients with a history of violent offending versus those with a history of (also) sexual offending, a significant difference was found for the HCR-20: the total score performed significantly better for the sexual offender group (Z = 2.21, p < .05). The predictive accuracy of the SAPROF, the HCR–SAPROF index, and the Final Risk Judgment did not differ significantly between any of the compared groups. For most groups of patients, the total score on the HCR–SAPROF index showed slightly higher AUC values than the Final Risk Judgment; however, these differences were not significant for any of the groups. Additional analyses were carried out for each group with only one rating per patient. Findings were highly similar to those described above (subgroup results are available from the authors on request).
Table 5. Predictive Validity (AUC Values) for Violent Incidents During Treatment (N = 399 Assessments, 1-Year Follow-Up)
Note. The values for the HCR-20, the HCR–SAPROF index, and the Final Risk Judgments concern violent incidents; the values for the SAPROF concern no incidents of violence. Final judgments are made on a 5-point scale. High psychopathy = PCL-R score ≥ 30. AUC = area under the curve; MMI = major mental illness; PD = personality disorder; HCR-20 = Historical Clinical Risk Management–20; SAPROF = Structured Assessment of Protective Factors for violence risk.
*p < .05. **p < .01. ***p ≤ .001 (two-tailed).
Predictors of Violence at the Item Level
Of the individual HCR-20 factors (see Table 1), 11 showed significant predictive validity for aggressive incidents (AUCs ranging from .59 to .72): four historical factors and seven dynamic factors. The significant historical factors were H2 Young Age at First Violent Incident, H5 Substance Use Problems, H7 Psychopathy, and H10 Prior Supervision Failure. The significant dynamic factors were C1 Lack of Insight, C2 Negative Attitudes, C4 Impulsivity, C5 Unresponsive to Treatment, R1 Plans Lack Feasibility, R2 Exposure to Destabilizers, and R4 Noncompliance With Remediation Attempts. Overall, the HCR-20 items C2 Negative Attitudes, C4 Impulsivity, and R4 Noncompliance With Remediation Attempts were the best predictors of violent incidents during treatment. Risk items C2 Negative Attitudes and C4 Impulsivity were strong predictors for most groups. In addition, H2 Young Age at First Violent Incident was a good predictor for the female patients, H7 Psychopathy for the male patients, H6 Major Mental Illness for the sexual offenders, and H5 Substance Use Problems for the patients with a personality disorder. For those patients scoring high on psychopathy, C1 Lack of Insight and R4 Noncompliance With Remediation Attempts were the best risk predictors.
For the SAPROF, the last three factors barely differentiated between patients because they are always coded as present during mandatory inpatient clinical treatment. Eight of the remaining 14 protective factors (see Table 2) proved to be significant predictors of inpatient aggression (AUCs ranging from .60 to .72). The individual dynamic factors significantly predictive of violent incidents were 4 Coping, 5 Self-Control, 6 Work, 7 Leisure Activities, 8 Financial Management, 9 Motivation for Treatment, 10 Attitudes Toward Authority, and 12 Medication. Overall, the factors 4 Coping, 5 Self-Control, 6 Work, and 10 Attitudes Toward Authority were the best predictors. The protective factors 5 Self-Control and 6 Work generally performed well across groups, while item 1 Intelligence was most predictive for women, item 4 Coping for the sexual offender group, and item 10 Attitudes Toward Authority for the MMI group.
Discussion
This study aimed to investigate how risk assessment scores, and their predictive validity for aggressive incidents, differ between stages of forensic psychiatric treatment. Furthermore, the study compared the predictive validity of the HCR-20 and the SAPROF across different groups of patients. Overall, assessment scores were found to be more positive and more predictive at later stages of treatment. Predictive validity results were found to be fairly consistent across the various patient groups.
Differences Between Stages of Treatment
As was expected, on average, the HCR-20 total risk scores decreased with further treatment stages, whereas the total scores on the SAPROF protective factors increased as treatment progressed. In the final Transmural treatment stage, protective factors slightly decreased again. As freedom and independence became much greater in this stage, internal protective factors such as Coping and Self-Control were more difficult to maintain, while the external protection from the supervised Living circumstances was decreased. Altogether, the HCR–SAPROF index showed a clear pattern of reduction through the different stages of treatment. These results were not surprising, given that risk assessment outcome is actually used as input for decisions on treatment phasing in clinical practice. Thus, it appears logical that patients in further stages of treatment have lower risk factor scores and higher protective factor scores. Similar patterns of decreasing risk factors and increasing protective factors during treatment were found in previous studies (e.g., de Vries Robbé, de Vogel, Douglas, & Nijman, 2015; Müller-Isberner et al., 2007). Although the current study does not have a longitudinal prospective repeated assessment design and thus true within-patient changeability of the factors could not be demonstrated, the differences in group-level assessment scores between treatment stages can be viewed as an indicator of the dynamic abilities of the HCR-20 and the SAPROF.
Predictive Accuracy for Inpatient Aggression
The current study shows the predictive value of a combined risk and protection assessment of violence for incidents of interpersonal aggression during treatment. Both the HCR-20 risk factors and the SAPROF protective factors showed good predictive validity. Four historical risk factors, seven dynamic risk factors, and eight dynamic protective factors demonstrated significant individual predictive validity for aggressive incidents during treatment. Interestingly, the prospective study by de Vogel and de Ruiter (2006) that previously looked at the predictive validity of the HCR-20 in a comparable inpatient Dutch forensic psychiatric sample revealed virtually the same risk factors as significant predictors of inpatient physical violence. The prospective study by Abidin and colleagues (2013) found similar results for the predictive validity of the SAPROF items. All eight significantly predictive SAPROF items in the current study were also found to be predictive of inpatient aggression in the Abidin study. In addition, they found four other protective factors that were significant predictors. They also looked at the HCR-20 and found that their best-predicting risk factors overlapped with those in the current study.
Previous retrospective file studies found a significant improvement in predictive accuracy when the HCR-20 was combined with the SAPROF, for both violent (de Vries Robbé et al., 2011) and sexual offenders (de Vries Robbé, de Vogel, Koster, & Bogaerts, 2015), compared with predictions with the HCR-20 score alone. For most patient groups in the current prospective clinical study, the combined HCR–SAPROF index scores showed the best predictive values; however, these were not significantly better than the HCR-20 scores alone. Although in previous studies on violent recidivism in the community after discharge the combined use of the tools outperformed the HCR-20, in the current study on violent incidents during treatment, the combined use of the tools outperformed the SAPROF. Taken together, these results suggest that using a risk-focused tool combined with a protection-focused tool could be of potential value for violence risk assessment both in clinical practice and in release decision making. An alternative explanation for the observed additional value of using both risk and protective factors could be that the gain does not stem from the supposedly different approach of the two types of factors, but merely results from including more factors in the assessment, thus providing a more comprehensive evaluation. Either way, using both tools appears beneficial to a well-informed assessment of violence risk in clinical practice.
In general, the combined total score of both tools performed slightly, although not significantly, better than the Final Risk Judgment. This result differs from the finding reported in the meta-analysis by O’Shea and colleagues (2013), including 20 independent studies on the predictive efficacy of the HCR-20 for aggression in psychiatric facilities, that overall, the Final Risk Judgment had the highest mean effect size for the prediction of inpatient aggression. In addition, they reported that studies did not appear to have equal efficacy across different patient groups: Effect sizes were generally greater in samples suffering from psychiatric disorders compared with samples that included more patients with personality disorders. When comparing MMI and PD samples in the present study, a similar trend was found; however, differences in predictive accuracy between the two groups were not significant. Neither were significant differences found between patients with high-psychopathy scores and patients with lower scores. For patients with high-psychopathy scores, the HCR-20 H scale and R scale as well as the SAPROF Internal scale were not significantly predictive of inpatient aggression. However, because the HCR-20 C scale and the SAPROF Motivational scale were good predictors, the overall rating of the HCR–SAPROF index showed good predictive validity for the high-psychopathy group.
The female sample showed slightly lower predictive values than the male sample in the present study. Although the HCR-20 H and C scales performed well, the R items performed poorly for the female patients and significantly worse than for the men. The SAPROF Internal scale was also not a significant predictor for the female sample, although SAPROF item 1 Intelligence was the best predictor for women. Nevertheless, the total scores of the HCR-20 and the SAPROF were significant predictors of inpatient aggression for female as well as male patients, and no significant differences were found in the predictive validities of the total tool scores between the male and the female sample.
To our surprise, in the current male sample, both the HCR-20 and the SAPROF were more accurate predictors of general inpatient aggression for the sexual offenders than for the violent offenders. For the HCR-20, this difference was significant. This is an interesting finding as generally (additional) specific sexual offender tools are used to assess violence risk in sexual offenders, such as the Sexual Violence Risk-20 (SVR-20; Boer, Hart, Kropp, & Webster, 1997) or the STABLE (Fernandez, Harris, Hanson, & Sparks, 2012). Although these sexual offending tools are highly valuable for specifically assessing sexual violence risk and for guiding interventions aiming to reduce sexual violence risk (see, for example, Hanson & Morton-Bourgon, 2009), in the current study, the HCR-20 and the SAPROF proved valuable for the assessment of general violence during treatment of sexual offenders. Previous studies on the SAPROF and the HCR-20 in discharged samples of forensic psychiatric patients found equally strong predictive accuracy for general violent recidivism after discharge for patients convicted of sexual offending as for patients convicted of violent offending (see de Vries Robbé et al., 2013). In addition, the SAPROF protective factors demonstrated good predictive validity for sexually violent recidivism as well as general violent recidivism in those who previously sexually offended (de Vries Robbé, de Vogel, Koster, & Bogaerts, 2015). Therefore, inclusion of protective factors in the assessment and treatment guidance of sexual offenders may also offer a meaningful addition to the risk assessment of sexual offending.
The current study demonstrates mixed results in terms of static versus dynamic predictors. Overall, the dynamic scales showed better predictive values than the static historical scale of the HCR-20. In general, all scales performed better for the later stages of treatment, indicating that clinicians in this study were better able to distinguish between those patients more likely to become violent and those less likely at the later treatment stages. In particular, for the HCR-20 H and R items, predictive accuracy was better for the Unsupervised Leaves and Transmural stages. The H items were insufficiently able to differentiate between violent and non-violent patients during the earlier stages of treatment. However, these historical vulnerabilities had a greater impact during the later stages. At the same time, raters were unable to meaningfully assess the R items during the first treatment stages, but performed well later on. When patients have reached the later treatment stages, perhaps clinicians have come to know them better and have learned from their past behavior during earlier treatment stages. As a result, they may be better able to accurately rate the dynamic items and predict who might become violent again. It may also be the case that in a more restricted and controlled inpatient setting, risk factors are less likely to result in actual aggression, because supervision and risk management are more intensive and access to triggers such as substances and external bad influences is limited. Thus, it is possible that certain (historical) risk factors may have a smaller effect on aggressive behavior during the earlier treatment stages, but become more manifest again when risk management is less stringent. Although differences were not significant for the protective factors, these also had a stronger effect during the further stages of treatment. Because the Unsupervised Leaves and Transmural treatment phases are the first real independent re-entry into the community and thus offer a delicate practice ground for rehabilitation, the finding that the risk assessment tools work well for these stages is important for risk management and community reintegration strategies.
It has been argued that when risk assessment tools are used in clinical practice to guide treatment, this inevitably leads to lower predictive validities (Pedersen, Rasmussen, & Elsass, 2012). The general aim of treatment is to prevent violence from occurring; therefore, risk management strategies generally become more stringent when risk levels increase (Hart, 1998). As a result, the predicted violent outcome is less likely to happen, and thus, the predictive accuracy of the risk assessment is negatively affected. In the treatment setting of the present study, risk assessment plays an important part in hospital decision making regarding treatment phasing and risk management planning. In general, preventive risk management strategies (protective factors) seem to be quite effective, given the low overall aggressive incident base rate of 11% in this high-risk forensic psychiatric population, compared with the incident rates described in other studies (see, for example, Nicholls, Brink, Greaves, Lussier, & Verdun-Jones, 2009). In addition, the aggressive behavior that was observed in the present study rarely resulted in serious physical injury. Despite the seemingly effective treatment efforts, the predictive values of the HCR-20 and the SAPROF for violent incidents were still good, especially for the patients in the further stages of treatment. This leads us to believe that perhaps treatment should consider risk assessment results even more carefully and possibly adjust risk management strategies accordingly.
Limitations
As this was a true prospective clinical validation study, only the risk assessments that were available from clinical practice could be used. Because time between repeated assessments was variable and for many patients successive assessments with sufficient follow-up were not available, in the present study, it was not possible to analyze within-patient treatment changes. Instead, assessments for different stages of treatment were studied on a group level. When multiple assessments of different patients are viewed together, on a group level, the decrease of risk factors and increase of protective factors appears to take place gradually as treatment progresses. However, this movement in scores is not always as gradual as it seems. In fact, in clinical practice, instead of a continuous improvement during treatment, often ups and downs are observed. In addition, when patients move into a new treatment phase, they initially face more challenges related to their increased freedom and independence, which may temporarily put a strain on risk and protective factors. Regardless, on average, risk factor scores were lower and protective factor scores were higher in the further stages of treatment. This finding is likely confounded by the fact that clinical decision making regarding treatment phasing is generally informed by risk assessment outcome. Thus, the observation that risk levels are lower in the further treatment stages should merely be viewed as a confirmation of what forensic clinical treatment attempts to accomplish.
The fact that only assessments were included that were carried out in one hospital with a fairly homogeneous high-risk forensic psychiatric patient population poses a limitation in terms of generalizability. After an initial analysis including only one assessment per patient, comparison was made with results from a subsequent analysis on the larger sample, including multiple assessments for some patients. Because results for the “clean” one assessment per patient sample were virtually identical to those for the full sample, it was concluded that including multiple assessments per patient had little effect on the findings in this study. Therefore, it was decided to include the full sample in all subsequent analyses for the different groups of patients and consider each assessment as independent. However, the use of multiple assessments for some patients may have affected the results from the different patient group analyses to some degree. The smaller the subsample, the greater the potential influence of repeated measures for one individual. Regardless, it was decided that for the present study, the increased sample size advantage outweighed this possible limitation. Nevertheless, even with the inclusion of multiple assessments, the subsamples of some patient groups were quite small. Especially, the results concerning the female patients and those concerning the patients high on psychopathy should therefore be interpreted with caution.
A further limitation is that the means of gathering outcome data in the present study may not have been entirely free of bias. Because it is general policy that all aggressive incidents that are followed by a sanction of seclusion are reported in the daily hospital bulletin, we can be reasonably sure that most serious incidents were analyzed in the current study. However, in clinical practice, it sometimes remains somewhat arbitrary at which point an incident of aggressive behavior reaches the threshold that makes clinicians decide on the necessity of seclusion as an intervention. Some patients may be sanctioned to seclusion more easily than others for similar types of behavior. In addition, the more freedom a patient has to move outside the hospital without supervision, the more often aggressive behavior may go unnoticed. It is therefore expected that the reported incident rate of patients in the Transmural phase is by definition lower than that of the patients who are unable to leave the hospital grounds. However, even during the Transmural stage, supervision is quite intensive through regular contact with the patient, network meetings, and unannounced house-calls. Therefore, most fairly serious incidents are likely to be reported at some point, either by the police, by the patient’s network, or by the patient him- or herself.
Finally, in terms of research design, it could be argued that the high number of different raters involved in the many risk assessments in this study poses a limitation. However, this is how these tools are used in clinical practice, and the inclusion of different raters from various disciplines could in fact also be a strength of the study rather than a limitation. The advantage of multidisciplinary consensus ratings is that although all essential patient information should be documented in the files, raters often have different knowledge about a patient from their own perspective and relation to the patient. De Vogel and de Ruiter (2006) demonstrated that multidisciplinary consensus ratings showed better predictive values compared with individual ratings. In addition, consensus meetings provide a valuable platform for generating treatment plans and sharpening risk management strategies. Moreover, they provide an opportunity for ongoing feedback and training on how the different factors in the tools are intended to be coded. Therefore, multidisciplinary assessments are considered best-practice in managing violence risk (Haque & Webster, 2012). Additional studies on the present data will focus more on the differences between the individual assessments by raters from different disciplines.
Future Research
To be able to conclude that improvements in risk assessment scores during treatment indeed positively affect reductions in violent behavior, future prospective clinical studies should include repeated assessments of risk and protective factors carried out at set time intervals and compare the observed within-patient treatment progress to violent outcome. It is advised to include risk as well as protective factors in such studies to be able to evaluate the effects of changes in both. Ideally, prospective research in clinical practice should also attempt to routinely document detailed descriptions of specific treatment efforts that take place with specific groups of patients, to be able to draw conclusions about what interventions successfully targeted specific risk or protective factors for specific groups of patients, and whether this resulted in reduced levels of violent behavior. Although this type of research is not easily accomplished, it is a necessary step that needs to be taken to further improve the clinical utility of these promising tools for the assessment of violence risk and guidance of treatment in forensic clinical psychiatry.
In addition, in future clinical studies, it is recommended to apply the revised HCR-20 Version 3 (HCR-20V3; Douglas, Hart, Webster, & Belfrage, 2013). The aim of this revision of the HCR-20 was to make the tool more clinically applicable and increasingly valuable for treatment guidance. Future studies should also aim to include larger female samples to be able to draw more solid conclusions regarding the differences in psychometric properties of risk and protective factors between men and women. An additional tool especially for the assessment of female-specific risk factors could be used in conjunction with the HCR-20: the Female Additional Manual (FAM; de Vogel, de Vries Robbé, van Kalmthout, & Place, 2012). A tool for female-specific protective factors has not yet been developed. It is advised that future HCR-20 and SAPROF validation studies with female samples also aim to investigate female-specific risk and protective factors to evaluate whether these could increase the predictive accuracy of risk assessments for women.
Conclusion
The current study shows that a combined violence risk assessment, including both HCR-20 risk factors and SAPROF protective factors, predicts inpatient aggression with good accuracy for various groups of patients. Differences in dynamic risk and protective factor scores were observed between stages of treatment. In clinical practice, balancing risk assessment through the combined use of risks and strengths seems to provide a more well-rounded view of current functioning. Moreover, in addition to aiming to reduce risk factors, focusing on strengthening protective factors and setting positive treatment goals offers hope and optimism to patients and clinicians and, as a result, may enhance motivation and ultimately treatment success.
Footnotes
Acknowledgements
The authors wish to thank all clinicians and researchers at the Van der Hoeven Kliniek in the Netherlands who participated in the data collection for this study. Michiel de Vries Robbé and Vivienne de Vogel are both authors of the SAPROF, Kevin S. Douglas is author of the HCR-20. None of the authors receive financial benefits from either tool.
