Abstract
The present study examined the predictive properties of Violence Risk Scale–Sexual Offender version (VRS-SO) risk and change scores among Aboriginal and non-Aboriginal sexual offenders in a combined sample of 1,063 Canadian federally incarcerated men. All men participated in sexual offender treatment programming through the Correctional Service of Canada (CSC) at sites across its five regions. The Static-99R was also examined for comparison purposes. In total, 393 of the men were identified as Aboriginal (i.e., First Nations, Métis, Circumpolar) while 670 were non-Aboriginal and primarily White. Aboriginal men scored significantly higher on the Static-99R and VRS-SO and had higher rates of sexual and violent recidivism; however, there were no significant differences between Aboriginal and non-Aboriginal groups on treatment change, with both groups demonstrating close to a half-standard deviation of change from pre- to posttreatment. VRS-SO risk and change scores significantly predicted sexual and violent recidivism over fixed 5- and 10-year follow-ups for both racial/ancestral groups. Cox regression survival analyses also demonstrated positive treatment changes to be significantly associated with reductions in sexual and violent recidivism among Aboriginal and non-Aboriginal men after controlling for baseline risk. A series of follow-up Cox regression analyses demonstrated that risk and change score information accounted for much of the observed differences between Aboriginal and non-Aboriginal men in rates of sexual recidivism; however, marked group differences persisted in rates of general violent recidivism even after controlling for these covariates. The results support the predictive properties of VRS-SO risk and change scores with treated Canadian Aboriginal sexual offenders.
In a recent legal case, Ewert v. Canada, the application of clinical forensic risk assessment tools with Aboriginal inmates under federal custody was subjected to heavy scrutiny. At issue was the appropriateness of using structured risk assessment tools with Aboriginal offenders. Having served 30 years of two life sentences for murder (a sexual homicide) and attempted murder (a sexual assault leaving the victim permanently brain injured), the plaintiff had previously launched similar lawsuits against the Correctional Service of Canada (CSC) about the prejudicial application of these instruments, as well as grievances within the Service (see decision by Judge Michel Beaudry in Ewert v. Canada, 2007, for a comprehensive review of these matters). In the current matter, several instruments were identified as having been used with this man and were argued to be inappropriate in their application to him, owing to the lack of compelling data supporting their psychometric properties with Aboriginal offenders. In his September 18, 2015, decision on the Ewert matter, Judge Michael Phelan concurred and strongly encouraged CSC to reconsider applications of such tools to Aboriginal offenders, or to the plaintiff more specifically, absent further compelling data.
Issues in the Forensic Assessment of Diverse Offender Populations
The instruments under consideration in the Ewert v. Canada (2015) decision included Hare’s Psychopathy Checklist–Revised (PCL-R; Hare, 1991, 2003), the Sex Offender Risk Appraisal Guide (SORAG; Quinsey, Rice, & Harris, 1995), Violence Risk Appraisal Guide (VRAG; Harris, Rice, & Quinsey, 1993), Static-99 (Hanson & Thornton, 1999), and the Violence Risk Scale–Sexual Offender version (VRS-SO; Wong, Olver, Nicholaichuk, & Gordon, 2003). Research conducted thus far on these and related tools with Aboriginal and non-Aboriginal offenders has focused primarily on their predictive properties for various recidivism outcomes. One study has been published on a CSC sample providing support for the predictive validity of PCL-R scores with Aboriginal and non-Aboriginal offenders and the stability of the four-factor structure of the tool across both ancestral groups (Olver, Neumann, Wong, & Hare, 2013). A meta-analysis of five independent Canadian samples further demonstrated that the Static-99, Static-2002, and their revisions each significantly predicted sexual violence among Aboriginal and non-Aboriginal sexual offenders (Babchishin, Blais, & Helmus, 2012). Other lines of research have similarly examined a family of general risk-needs assessment measures not mentioned in the decision: the Level of Service Inventory (LSI) scales and its variants. Findings from meta-analysis support the predictive accuracy of the LSI scales among Canadian Aboriginal offenders (Wilson & Gutierrez, 2014), North American Aboriginal, Black, Hispanic, and ethnic minority offenders in general (Olver, Stockdale, & Wormith, 2014), as well as major risk-need domains (e.g., antisocial attitudes, associates) among Canadian Aboriginal offenders (Gutierrez, Wilson, Rugge, & Bonta, 2013). The Ewert v. Canada (2015) written decision seems to suggest that the judge was not persuaded by the findings of the Aboriginal PCL-R report while the Babchishin et al. (2012) meta-analysis did not receive mention. 
The LSI measures were also not identified in the Ewert decision, possibly because there was no indication that this brand of tools had been administered to the plaintiff.
The overrepresentation of Canada’s Aboriginal peoples in federal correctional custody is a matter of national concern. Although representing only 3% of the general population, Aboriginal men and women comprise 22.6% of the federal prison population (Public Safety Canada, 2014). Reviews elsewhere further document the overrepresentation of ethnic and racial minorities in criminal justice settings around the world including New Zealand (Anaya, 2015), Australia (Krieg, 2006), United States (Travis, Western, & Redburn, 2014), and the United Kingdom (Ministry of Justice, 2013) to list but a few. Ethnic and racial minorities in North American countries and the Western English speaking world, including Aboriginal persons, often have a history of colonialization, and for decades or even centuries in some instances, have faced an erosion of traditional culture, beliefs, and language, family upheaval, poverty, social plight, intergenerational trauma, and human rights violations (Brzozowski, Taylor-Butts, & Johnson, 2006; Perreault, 2011; Serin, 2010).
In a review of risk assessment measures with Canadian Aboriginal individuals, Rugge (2006) noted that Aboriginal offenders tend to score higher on risk assessment measures, more frequently score as high risk, and have higher rates of recidivism. In each of the aforementioned studies, ethnic minorities in general and Aboriginal offenders, in particular, do score higher on conventional risk tools and they do have higher rates of all recidivism outcomes. However, the predictive accuracy of these tools is often somewhat lower among Aboriginal offenders, depending on the tool and sample. Although higher scores still are associated with increased frequencies of recidivism, this situation indicates that other unmeasured variables not captured by these tools contribute to observed differences in base rates of recidivism. This situation speaks to the need to exercise appropriate cautions, sensitivity to cultural context, and professional discretion in applications of forensic clinical measures with diverse populations.
VRS-SO
In his 2015 decision, Judge Phelan aptly described the VRS-SO as follows:
[22] This test is a rating scale designed to assess risk and predict sexual recidivism, to measure and link treatment changes to sexual recidivism and to inform the delivery of sexual offender treatment. The VRS-SO comprises static and dynamic factors and generates both qualitative and quantitative assessments of inmates. The VRS-SO is used following sex offender treatment to assess the success of that treatment.
The VRS-SO is a sex offender risk assessment and treatment planning tool designed to assess sexual violence risk, identify targets for sexual violence risk management, and assess changes in risk from treatment or other change agents. Research on three nonoverlapping CSC samples of treated sexual offenders (Olver, Nicholaichuk, Kingston, & Wong, 2014; Olver, Wong, Nicholaichuk, & Gordon, 2007; Sowden & Olver, 2017), a New Zealand sample of treated child molesters (Beggs & Grace, 2010, 2011), and an Austrian sample of pedophiles (Eher, Olver, Heurix, Schilling, & Rettenberger, 2015) supports the predictive accuracy of VRS-SO scores for sexual and other forms of recidivism (e.g., general violence). Four of the aforementioned samples, which feature treated sexual offenders, also demonstrated VRS-SO change scores, representing risk reduction, to be associated with decreases in sexual, violent, and/or general recidivism. The three Canadian samples had high proportions of Aboriginal offenders, ranging from about one third to approximately one half, while approximately 20% of the New Zealand sample consisted of men of Maori descent. Although it stands to reason that the predictive and change properties of the VRS-SO would translate to the demographic and cultural subgroups within these samples, this remains an untested assumption.
Context of the Present Study
Although many of the tools mentioned in Ewert v. Canada, including the VRS-SO, were developed on samples that included Aboriginal offenders, as the previous review illustrates, relatively few studies have examined the psychometric properties of these tools with Aboriginal persons specifically. This situation also has implications for the application of risk assessment tools and the evaluation of change with offenders from diverse backgrounds in general. There is an understandable and important motivation to prevent possible harms from being incurred through the application of tools that could misrepresent their clientele, for instance, through misclassifying an individual as high risk or otherwise artificially inflating risk level if the tool is somehow biased or inappropriate for use with the population in question. But what about the potential for risk assessment to help not only decision-making authorities such as judges and parole boards, but also offenders themselves? For instance, what if certain dynamic tools are able to capture changes in risk in a valid and reliable manner, such as positive gains made in a treatment program? As the VRS-SO’s relevance to Aboriginal offenders was under scrutiny in this legal matter, its predictive properties are the focus of investigation in a large CSC-based sample of Aboriginal and non-Aboriginal sexual offenders.
Method
Participants and Study Samples
Participants included 1,063 adult male federally incarcerated sexual offenders who attended sexual offender treatment programming in CSC institutions in one of five geographic regions (Pacific, Prairie, Ontario, Quebec, Atlantic). In all, 393 of the men identified as Aboriginal, specifically, First Nations, Métis, or Circumpolar peoples of the Arctic, while 670 men identified as non-Aboriginal, most of whom were White. The present study sample is an amalgamation of three independent and nonoverlapping CSC samples from which Static-99R and pre- and posttreatment VRS-SO ratings were obtained. These three studies each received CSC and University of Saskatchewan Behavioural Research Ethics Board (REB) approval, as well as REB approval for data linkage and secondary analysis of these data to conduct Aboriginal and non-Aboriginal comparisons.
Sample 1 (Olver et al., 2007; see also Olver, Beggs Christofferson, Grace, & Wong, 2014) comprised 321 treated sexual offenders with retrospective archival VRS-SO and Static-99R ratings obtained from comprehensive institutional file information. The men had attended the Clearwater High Intensity Sex Offender Program between 1983 and 1997 at the Regional Psychiatric Centre (RPC) in Saskatoon, Saskatchewan, Canada. Sample 2 (Olver, Nicholaichuk, et al., 2014) included 562 treated sexual offenders who had received treatment services through the National Sex Offender Program (NaSOP), a comprehensive cognitive behavioral sexual offender program offered in low, moderate, and high intensities, between 2000 and 2008. This sample also included admissions to the Clearwater Program from 2001 to 2008 who were subsequently released. Prospective ratings on the VRS-SO dynamic items and Static-99R were completed by psychologists or program facilitators and extracted. Archival ratings of the VRS-SO static factors were completed on the basis of institutional file and criminal record information (Olver et al., 2016). Sample 3 (Sowden & Olver, 2017) comprised 180 treated sexual offenders who had attended the Clearwater Program between 1997 and 2001. As in Sample 1, this was a retrospective study in which archival VRS-SO and Static-99R ratings were made on the basis of comprehensive institutional file information.
Comparative analyses for Aboriginal and non-Aboriginal offenders had not been conducted heretofore within these three individual samples owing to insufficient power from small cell sizes that result from separating the sample into ancestral subgroups. Thus, combining the three samples in this manner has important advantages of providing a larger N to permit subgroup analyses and to detect possible effects that may be overlooked within smaller samples.
Measures
Static-99R
Static-99R (Helmus, Thornton, Hanson, & Babchishin, 2012) is a 10-item static actuarial sexual offender risk assessment tool. The item content comprises information regarding criminal history and offender and victim demographics. Possible scores range from −3 to 12. Meta-analytic research (k = 24, N = 8,390) supports the predictive accuracy of Static-99R for 5-year (area under the curve [AUC] = .72) and 10-year (AUC = .71) rates of sexual recidivism (Helmus et al., 2012). Interrater reliability data were available for Sample 1 for Static-99 (the age item was recoded to generate Static-99R scores) and Sample 3 for Static-99R; as Sample 2 was a prospective evaluation of in-time Static-99R and VRS-SO ratings, interrater reliability information was not available. Interrater reliability via intraclass correlation coefficient (ICC; single measure, consistency, two-way mixed effects) was reported in Sample 1 (Olver et al., 2007) on 35 randomly selected cases (ICC = .82) and Sample 3 (Sowden & Olver, 2017) on 21 randomly selected cases (ICC = .97).
VRS-SO
The VRS-SO (Wong et al., 2003) is a 24-item risk assessment and treatment planning tool designed to assess sexual violence risk, identify targets to be prioritized for intervention, and to assess changes in risk from treatment or other change agents. The VRS-SO consists of seven static (historical, generally unchanging) and 17 dynamic (potentially changeable) items. Items are scored on a 4-point (0, 1, 2, 3) ordinal scale in which higher scores indicate greater risk for sexual violence. Total possible scores range from 0 to 72. Change is operationalized through a modified application of the stages of change (SOC) model (Prochaska, DiClemente, & Norcross, 1992), which posits that individuals move along a continuum of five stages involving cognitive, experiential, and behavioral changes as they attempt to remediate problem areas. Each of the stages (Precontemplation, Contemplation, Preparation, Action, Maintenance) is operationalized for each dynamic item. Items receiving a 2 or 3 rating are considered criminogenic and are given a baseline SOC rating at pretreatment; the SOC is then rerated on each criminogenic item to track treatment progress and change. Progression from one stage to the next is associated with a 0.5-point change denoting risk reduction; the lone exception is progression from Precontemplation to Contemplation, which does not receive a point deduction as there are no behavioral changes relevant to risk. Deterioration from one stage to the next is coded as a 0.5 increase as appropriate. The change ratings are summed across the dynamic items at posttreatment to yield a total change score. The change score can be deducted from the pretreatment dynamic total to yield a posttreatment score.
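The change-scoring rule described above can be illustrated with a minimal sketch. It is not the official VRS-SO scoring procedure; the function names and the tuple layout are hypothetical, and two simplifying assumptions are made: each stage crossed counts as one 0.5-point step, and movement between Precontemplation and Contemplation earns no credit in either direction.

```python
# Stage-of-change labels in the order given in the text.
STAGES = ["Precontemplation", "Contemplation", "Preparation", "Action", "Maintenance"]


def stage_credit(stage):
    """Cumulative 0.5-point steps associated with a stage.

    Precontemplation and Contemplation both map to 0 because progression
    between them receives no point deduction (assumption: the same holds
    for deterioration between these two stages).
    """
    index = STAGES.index(stage)
    return 0.5 * max(index - 1, 0)


def item_change(pre_stage, post_stage):
    """Change credit for one item: positive = risk reduction, negative = deterioration."""
    return stage_credit(post_stage) - stage_credit(pre_stage)


def score_dynamic_items(items):
    """Score a list of (pretreatment rating 0-3, pre stage, post stage) tuples.

    Only items rated 2 or 3 (criminogenic) carry stage ratings and contribute
    to change. Returns (pretreatment total, change score, posttreatment total).
    """
    pre_total = sum(rating for rating, _, _ in items)
    change = sum(
        item_change(pre, post) for rating, pre, post in items if rating >= 2
    )
    return pre_total, change, pre_total - change


# Hypothetical four-item illustration (the actual scale has 17 dynamic items).
example_items = [
    (3, "Contemplation", "Action"),            # two credited steps: 1.0
    (2, "Precontemplation", "Contemplation"),  # no deduction: 0.0
    (1, None, None),                           # not criminogenic, no stage rating
    (2, "Preparation", "Preparation"),         # no movement: 0.0
]
pre_total, change, post_total = score_dynamic_items(example_items)
print(pre_total, change, post_total)
```

In this illustration, the only credited movement is the progression from Contemplation to Action on the first item, so the posttreatment total is the pretreatment total minus 1.0.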
Interrater reliability data were available for Samples 1 and 3 as described earlier. For Sample 1 (Olver et al., 2007) interrater reliability via ICC (single measure, consistency, two-way mixed effects) was reported on 35 randomly selected cases with interrater correlation values of .74 (pretreatment dynamic), .79 (posttreatment dynamic), and .64 (change scores) obtained. Interrater reliability for Sample 3 (Sowden & Olver, 2017) was reported on 21 randomly selected cases with ICC values of .73 (.86 with outlier removed; pretreatment dynamic), .74 (.87 with outlier removed; posttreatment dynamic), and .83 (.84 with outlier removed; change scores) obtained.
Recidivism variables
Recidivism information was obtained from the Canadian Police Information Centre (CPIC). The outcome data for Samples 1 and 3 were updated most recently in 2011 and for Sample 2 in 2015. Sexual recidivism was defined as any criminal code violation for a sexually motivated offense (e.g., sexual assault), including child pornography offenses. Offenses adjudicated as nonsexual crimes that were determined to be sexual in nature, when such information was available, were also coded as sexual offenses. Such instances usually could only be determined when the new nonsexual crime involved a return to federal custody, as the Criminal Profile Report could be retrieved electronically and reviewed for offense details. Violent recidivism was defined as any criminal code violation for an offense against the person, whether it was sexual or nonsexual (e.g., murder, robbery, nonsexual assault) in nature. Samples 1 and 3 used conviction as the criterion for recidivism while Sample 2 used charges in addition to convictions. All recidivism criteria were coded in a binary manner (1 = recidivated, 0 = did not recidivate). Offense date information was obtained for new sexual and violent offenses to perform survival analyses; when the exact date was available, this time was used. Time served in custody prior to the next sexual or violent offense was subtracted from the total survival time to yield a closer approximation of time in the community prior to recidivism.
Data Analytic Plan
The three samples were combined into a single sample to boost sample size, increase statistical power, and to expand the range of potential analyses (that might be sensitive to low power) to separately examine the predictive properties of VRS-SO scores in Aboriginal and non-Aboriginal offenders. The analyses proceeded in several phases. First, group comparisons between Aboriginal and non-Aboriginal men were made on Static-99R and VRS-SO static, dynamic, total, and change score ratings via t tests with Cohen’s d computed to provide a measure of effect size in standard deviation units. Chi-square was also used to examine group differences in 5-year and 10-year rates of sexual and violent recidivism with Cohen’s d computed using mean and standard deviation information.
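As an illustration of the effect size computations in this first phase, the following sketch computes Cohen's d from group means and standard deviations using a pooled standard deviation. The extension to recidivism rates treats the binary outcome as a 0/1 variable whose mean is the group rate; this is one plausible reading of "mean and standard deviation information" and is an assumption, as are the function names.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def binary_sd(rate, n):
    """Sample standard deviation of a 0/1 variable with the given rate."""
    return math.sqrt(rate * (1 - rate) * n / (n - 1))


def d_from_rates(rate1, n1, rate2, n2):
    """Cohen's d for a difference in binary outcome rates (0/1 coding)."""
    return cohens_d(rate1, binary_sd(rate1, n1), n1,
                    rate2, binary_sd(rate2, n2), n2)


# Hypothetical values for illustration only; not taken from Table 1.
print(round(cohens_d(10.0, 2.0, 50, 9.0, 2.0, 50), 2))
```

With equal standard deviations of 2.0 and a 1-point mean difference, the hypothetical example corresponds to a d of one half of a standard deviation.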
Second, the predictive accuracy of Static-99R and VRS-SO static, dynamic, total, and change scores was examined for fixed 5- and 10-year rates of sexual and violent recidivism via receiver operating characteristic (ROC) curve analysis among Aboriginal and non-Aboriginal subgroups. ROCs generate an AUC statistic ranging from 0 to 1.0 representing the likelihood that a randomly selected recidivist has a more deviant score on the measure than a randomly selected nonrecidivist. For the change score analyses, because positive change score values represent decreased risk, the outcome variable predicted for those AUC analyses was binary nonrecidivism, such that higher AUC values for change scores corresponded to associations of positive change to decreased recidivism (or increased nonrecidivism). In addition to utilizing raw change score information, we also computed residualized change scores by regressing the change score on the pretreatment dynamic score and saving the residual for analysis. Because higher risk men with higher scores have more room to change and thus higher change scores (yet are still higher risk than lower scoring men, who have less room to change and thus lower change scores), the residualized change score represents the amount of change unconstrained by pretreatment score (Beggs & Grace, 2011). The AUC magnitudes of Aboriginal and non-Aboriginal subgroups were compared on each measure in the prediction of each outcome, and the differences were examined for significance using MedCalc for Windows 12.5 (MedCalc Software, Ostend, Belgium), which tests areas under independent curves using the procedures outlined in Hanley and McNeil (1982, 1983).
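The quantities in this second phase can be sketched in a few short functions. This is not the MedCalc implementation; it is a minimal illustration assuming the Hanley and McNeil (1982) standard error formula and a normal-theory z test for two independent AUCs, with all function names hypothetical. The AUC is computed directly from its definition as the proportion of recidivist and nonrecidivist pairs in which the recidivist has the higher score (ties counted as one half), and the residualized change score is the residual from a simple least squares regression of change on pretreatment score.

```python
import math


def auc(recidivist_scores, nonrecidivist_scores):
    """Probability that a random recidivist outscores a random nonrecidivist."""
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))


def hanley_mcneil_se(a, n_pos, n_neg):
    """Standard error of an AUC following Hanley and McNeil (1982)."""
    q1 = a / (2 - a)
    q2 = 2 * a ** 2 / (1 + a)
    var = (a * (1 - a)
           + (n_pos - 1) * (q1 - a ** 2)
           + (n_neg - 1) * (q2 - a ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)


def compare_independent_aucs(a1, n_pos1, n_neg1, a2, n_pos2, n_neg2):
    """z statistic and two-sided p value for two AUCs from independent groups."""
    se1 = hanley_mcneil_se(a1, n_pos1, n_neg1)
    se2 = hanley_mcneil_se(a2, n_pos2, n_neg2)
    z = (a1 - a2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def residualized_change(pre_scores, change_scores):
    """Residuals from regressing change scores on pretreatment scores."""
    n = len(pre_scores)
    mean_x = sum(pre_scores) / n
    mean_y = sum(change_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in pre_scores)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(pre_scores, change_scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(pre_scores, change_scores)]


# Hypothetical toy scores for illustration only.
print(round(auc([3, 4, 5], [1, 2, 3]), 3))
```

For change scores, the same `auc` function would be called with nonrecidivists in the first argument, mirroring the reversal of the outcome variable described above.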
Third, we conducted Cox regression survival analyses to examine the unique associations of change to sexual and violent recidivism after controlling for baseline risk and individual differences in follow-up time. Again, these analyses were performed separately among Aboriginal and non-Aboriginal groups. Two sets of regressions were conducted for each ancestral group for each outcome: (a) VRS-SO pretreatment total score (i.e., static + dynamic) entered in the first step, followed by VRS-SO dynamic change score, entered in the second step, and (b) Static-99R and dynamic pretreatment score entered in the first step, followed by dynamic change scores in the second step. These analyses are intended to examine the extent to which positive treatment change as captured by the VRS-SO translates into risk reduction and decreased sexual and violent recidivism for both Aboriginal and non-Aboriginal offenders controlling for initial risk.
Fourth, we repeated these Cox regression analyses on the entire sample, this time entering the binary ancestry variable as a covariate in the first step, followed by risk and change score information, as outlined above, in the second step. These analyses were performed for sexual and violent recidivism outcomes and were intended to examine to what extent Aboriginal group membership is associated with higher observed rates of sexual and violent recidivism after controlling for risk and change information. Of note, all Cox regression survival analyses were performed with and without employing sample as a strata variable to account for possible differences among the samples in rates of recidivism and predictor associations with the recidivism criteria. The results of the analyses employing the sample strata variable are reported given that this serves as a more rigorous test of the prediction models to increase confidence in the results.
Finally, we conducted a series of calibration analyses using logistic regression modeling examining rates of recidivism among Aboriginal and non-Aboriginal offenders as a function of VRS-SO pretreatment score and change. For the sake of space considerations, we centered our analyses on fixed 5-year follow-ups, as the 5-year outcome is a very common criterion employed in sex offender risk assessment research and information provided for instrument norms. These analyses permitted examination of recidivism trajectories between the two ancestral groups as a function of increasing pretreatment score at different thresholds of change.
Results
Aboriginal and Non-Aboriginal Differences on Risk, Change, and Outcome Measures
As seen in Table 1, Aboriginal offenders scored significantly higher on the Static-99R and VRS-SO static, dynamic, and total scores than non-Aboriginal men. The differences were most pronounced on static measures (nearly half a standard deviation), but smaller on dynamic scores (closer to one fifth of a standard deviation). Of note, there were no significant differences between Aboriginal and non-Aboriginal men on dynamic treatment-related changes as measured by the VRS-SO. Rather, both groups made approximately one half of a standard deviation of change from pre- to posttreatment (ds = .44 and .45, p < .001 for Aboriginal and non-Aboriginal men, respectively) and thus significant treatment gains from CSC-based sexual offender programming. Aboriginal offenders also had significantly higher 5-year and 10-year rates of sexual and violent recidivism than non-Aboriginal offenders. The differences in regard to sexual recidivism would be considered small in magnitude while the differences for violent recidivism would be considered at least moderate in magnitude (Cohen, 1992).
Table 1. Aboriginal and Non-Aboriginal Descriptive Statistics and Group Comparisons on Risk and Change Measures.
Note. VRS-SO = Violence Risk Scale–Sexual Offender version; ns = not significant.
*p < .05. **p < .01. ***p < .001.
Predictive Accuracy of the VRS-SO as a Function of Aboriginal Ancestry
The sample was followed up for a mean of 12.26 years (SD = 4.77) postrelease. Over the time period, 17.7% of the sample was charged or convicted for a new sexual offense, and 36.7%, any new violent (including sexual) offense. Fixed 5- and 10-year follow-ups were also employed for sexual (11.7% and 19.4%, respectively) and violent (23.9% and 39.7%, respectively) recidivism variables as one means of controlling for follow-up time in univariate analyses.
As seen in Table 2, the Static-99R and VRS-SO static, dynamic, and total scores significantly predicted 5- and 10-year rates of sexual and violent recidivism in both Aboriginal and non-Aboriginal offenders. All AUC magnitudes for static, dynamic, and total risk scores were higher for non-Aboriginal men; however, VRS-SO change scores (i.e., raw scores and residualized change scores) demonstrated more equivalent predictive accuracy for sexual and violent recidivism between the groups, with more than half of the values being higher for Aboriginal men. In all, 25 out of 32 AUC values were higher for non-Aboriginal men, while five were higher for Aboriginal men, and two were the same. Given that it would be expected that 50% of values should be higher for non-Aboriginal offenders and 50% lower, the pattern of findings (25/32 AUCs) is significantly different from chance at .001. 1 The ratio of AUC values for change scores specifically (five higher for Aboriginal men, two lower, one the same) would be consistent with the expected split. Of note, the difference in prediction between the groups was not substantial for any outcome. Specifically, the 95% confidence intervals overlapped quite substantially between the racial/cultural groups and all but two of these differences in AUC magnitude were nonsignificant; the exceptions were in the prediction of 5-year general violence by the Static-99R and VRS-SO static scores (with higher magnitudes for non-Aboriginal men). Although the magnitude of the differences between the two groups was not large and generally not significant, there is a significant overall pattern of slightly lower AUCs for Aboriginal offenders for risk scores.
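The comparison of the observed split against a 50/50 expectation can be illustrated with an exact binomial calculation. The exact procedure used for the reported value is not described in this section, so the following is only a sketch under stated assumptions: the 32 AUC comparisons are treated as independent trials, ties are counted as not favoring non-Aboriginal men, and a one-tailed upper probability is computed. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Exact probability of observing at least `successes` out of `trials`."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 25 of 32 AUC values higher for non-Aboriginal men (all measures).
all_scores = binomial_upper_tail(25, 32)

# 5 of 8 change score AUC values higher for Aboriginal men.
change_scores = binomial_upper_tail(5, 8)

print(round(all_scores, 5), round(change_scores, 3))
```

Under these assumptions, the one-tailed probability for the 25/32 split is approximately .001, whereas the 5/8 split for change scores is well within what chance alone would produce (approximately .36).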
Table 2. Prediction of 5- and 10-Year Sexual and Violent Recidivism (Fixed Follow-Ups) by Static-99R and VRS-SO Risk and Change Scores as a Function of Aboriginal Versus Non-Aboriginal Ancestry.
Note. VRS-SO = Violence Risk Scale–Sexual Offender version; Non-Ab = non-Aboriginal; AUC = area under the curve; CI = confidence interval. Significant differences (p < .05) in AUC magnitude between Aboriginal and non-Aboriginal group found only for 5-year violence on Static-99R and VRS-SO static; all other AUC difference magnitudes between ancestral groups were nonsignificant.
*p < .05. **p < .01. ***p < .001.
By contrast, the magnitude of the association of positive change to decreased recidivism was actually slightly higher in Aboriginal offenders for three out of the four outcomes; however, again, given the highly overlapping confidence intervals in AUC change magnitudes, and as confirmed through significance testing, the results would generally suggest equal strength of prediction between treatment change and reduced recidivism for both groups of men. Of note, change scores were significantly associated with decreased sexual and violent recidivism for 12 out of the 16 associations examined. Of the nonsignificant AUC values for change scores, one was not significant for Aboriginal offenders (i.e., raw change score prediction of 5-year sexual recidivism), while three were not significant for non-Aboriginal offenders (i.e., raw and residualized change score prediction of 10-year sexual recidivism and raw change score prediction of violent recidivism). Overall, however, the substantive findings indicate that positive treatment changes were associated with risk reduction and decreased recidivism to an equal magnitude for treated Aboriginal and non-Aboriginal sexual offenders. 2
Association Between VRS-SO Measured Risk Change and Decreased Sexual and Violent Recidivism Among Aboriginal and Non-Aboriginal Offenders
A series of hierarchical Cox regression survival analyses were conducted to examine the association of positive treatment change to decreased sexual recidivism after comprehensive controls for baseline risk in the two racial/cultural groups (Table 3). In these analyses, risk assessment scores were entered first in Block 1, followed by the VRS-SO change score in Block 2. For space considerations, only Block 2 is shown. (As demonstrated in ROC analyses, risk assessment scores for all measures predicted each recidivism outcome.) Sample was also entered as a strata variable. As seen in regression Model 1, both VRS-SO pretreatment total scores and dynamic change scores were significantly uniquely associated with sexual recidivism. The hazard ratio (HR) magnitude of .906 would correspond to an approximate 9% to 10% decrease in the hazard for sexual violence for every 1-point increase in change score after controlling for baseline risk. Model 2, this time entering the Static-99R and VRS-SO pretreatment dynamic scores as covariates, generated the same findings broadly speaking. These analyses were repeated in Models 3 and 4, respectively, for non-Aboriginal offenders. Again, whereas risk scores uniquely predicted increased recidivism, change scores significantly uniquely predicted decreased recidivism. Of note, however, is that the HR magnitudes were not meaningfully different between the two racial/cultural groups, particularly for the change scores. Rather the HR values ranged from .900 to .911 with highly overlapping confidence intervals, demonstrating that the strength of the association between change scores and decreased recidivism, even after controlling for baseline risk, was equivalent for Aboriginal and non-Aboriginal men.
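The interpretation of the hazard ratio given above follows from simple arithmetic, sketched below. The helper name is hypothetical, and the multi-point extrapolation assumes the log-linear form of the Cox model, in which each additional point of change multiplies the hazard by the same ratio.

```python
import math


def percent_change_in_hazard(hazard_ratio, points=1):
    """Percentage change in the hazard for a given number of change score points."""
    return (hazard_ratio ** points - 1) * 100


hr = 0.906  # HR for the change score reported in the text (Model 1)

per_point = percent_change_in_hazard(hr)        # about -9.4
five_points = percent_change_in_hazard(hr, 5)   # about -39.0
coefficient = math.log(hr)                      # Cox regression coefficient (B)

print(round(per_point, 1), round(five_points, 1), round(coefficient, 3))
```

An HR of .906 thus corresponds to a 9.4% lower hazard per point of change, which falls within the approximate 9% to 10% range described above; under the stated assumption, 5 points of change would correspond to a roughly 39% lower hazard.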
Table 3. Cox Regression Survival Analysis: Unique Associations of Risk and VRS-SO Change Scores to Sexual Recidivism as a Function of Aboriginal Versus Non-Aboriginal Ancestry.
Note. VRS-SO = Violence Risk Scale–Sexual Offender version; HR = hazard ratio; CI = confidence interval; LL = lower limit; UL = upper limit.
Table 4 presents the results of hierarchical Cox regression analyses this time examining the unique associations between risk and change scores with violent (including sexual) recidivism. Again only Block 2 is presented for space considerations and the sample strata variable was employed as in previous analyses. Among Aboriginal men, again, risk and change scores each uniquely significantly predicted violent recidivism; in this instance, higher risk scores were associated with increased violence, whereas higher change scores were associated with decreased violence (Models 1 and 2). The HR magnitudes were slightly smaller than in the sexual recidivism analyses. Models 3 and 4, repeated in the non-Aboriginal subgroup, demonstrated the same pattern of findings. As with the sexual recidivism analyses presented in Table 3, again the HR values for the change score associations with decreased violence were particularly comparable in magnitude (.905-.915) between Aboriginal and non-Aboriginal men.
Table 4. Cox Regression Survival Analysis: Unique Associations of Risk and VRS-SO Change Scores to Violent (Including Sexual) Recidivism as a Function of Aboriginal Versus Non-Aboriginal Ancestry.
Note. VRS-SO = Violence Risk Scale–Sexual Offender version; HR = hazard ratio; CI = confidence interval; LL = lower limit; UL = upper limit.
Associations of Aboriginal Group Membership to Sexual and Violent Recidivism Controlling for Risk and Change Information
Next, the unique association between Aboriginal ancestry and increased sexual and violent recidivism was examined through Cox regression survival analysis controlling for baseline risk and VRS-SO measured treatment change (see Table 5). Aboriginal men had higher risk scores as well as higher rates of sexual and violent recidivism as demonstrated in previous analyses. If higher risk scores explained the higher observed rates of sexual and violent recidivism, then Aboriginal group membership should no longer be associated with higher observed rates of recidivism once risk and change scores are controlled or held constant. This was found to be the case for sexual recidivism; as seen in Block 1, as with univariate analyses, Aboriginal group membership was associated with higher rates of sexual recidivism. Once the risk and change variables were entered as covariates (Block 2, Models 1 and 2), however, Aboriginal group membership was no longer significantly associated with higher rates of sexual recidivism for Model 2. The magnitude of the association had decreased, although it was still significant, in Model 1; of note, this association only remained significant when the strata variable was employed. A different pattern was found in the prediction of violent recidivism. Aboriginal group membership was significantly associated with higher observed rates of general violence both on its own (Block 1) and after controlling for risk and change information (Block 2, Models 3 and 4). Of note, risk and change scores continued to uniquely significantly predict future violence; they just did not account for all of the observed differences in base rates of violence between Aboriginal and non-Aboriginal offenders.
Table 5. Cox Regression Survival Analysis: Unique Associations of Aboriginal Ancestry to Sexual and Violent Recidivism Controlling for Risk and VRS-SO Change Scores (N = 1,063).
Note. All analyses performed using sample as a strata variable. Aboriginal ancestry covariate in Model 1 Block 2 was not significant (p = .064) without employing the sample strata variable; all other findings remained consistent. VRS-SO = Violence Risk Scale–Sexual Offender version; HR = hazard ratio; CI = confidence interval; LL = lower limit; UL = upper limit.
Calibration Analyses: Logistic Regression Derived Estimates of Sexual and Violent Recidivism
The final set of analyses entailed the use of logistic regression modeling to estimate 5-year rates of sexual and violent recidivism among Aboriginal and non-Aboriginal offenders using VRS-SO pretreatment total score and change score information. Four sets of logistic regression analyses were conducted in which VRS-SO pretreatment and change scores were entered simultaneously into the regression equation to predict each outcome separately among Aboriginal and non-Aboriginal offenders. Pretreatment and change scores significantly uniquely predicted each 5-year outcome within each ancestral group (see Figure 1 note for coefficient values). The Hosmer−Lemeshow tests were all nonsignificant indicating that the data appropriately fit a logistic distribution.
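For readers unfamiliar with the Hosmer-Lemeshow test, the statistic compares observed and expected event counts within groups of cases ordered by predicted probability. The following is a minimal sketch of that computation, not the authors' procedure: the function name, the default of 10 groups, and the equal-size grouping rule are assumptions, since the source does not report them.

```python
def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic for binary outcomes (0/1).

    Cases are sorted by predicted probability and split into `groups`
    roughly equal bins; the statistic is referred to a chi-square
    distribution with (groups - 2) degrees of freedom.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        denom = expected * (1.0 - expected / size)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy data (hypothetical): two risk bands whose event rates match the predictions.
probs = [0.2] * 10 + [0.8] * 10
outcomes = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
print(hosmer_lemeshow_statistic(probs, outcomes, groups=2))
```

With 10 groups the statistic has 8 degrees of freedom, so values below roughly 15.51 are nonsignificant at the .05 level, which is the pattern the text reports.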

Figure 1. Calibration analyses: Logistic regression derived 5-year estimates of sexual and violent recidivism among Aboriginal, non-Aboriginal, and overall sample as a function of VRS-SO pretreatment total score and change.
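Specific VRS-SO pretreatment scores and change scores of 0 (no change), 3.5 (representing the mean amount of change), and 6 (representing 1 SD above the mean, or above average change) were then used to compute estimated rates of 5-year sexual and violent recidivism using the logistic regression formula, p = 1 / (1 + exp(−(B0 + B1 × pretreatment score + B2 × change score))), where B0 is the constant and B1 and B2 are the regression coefficients for the pretreatment and change scores.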
The results are illustrated in Figures 1A (change score of 0), 1B (3.5 points change), and 1C (6 points change). Inspection of the figures shows declines in projected rates of recidivism with successive increases in change irrespective of ancestry and the declines do not appear to be disproportionate. However, projected 5-year violent recidivism rates are consistently higher for Aboriginal men irrespective of change, while estimated rates of sexual recidivism are much closer for the two groups; at extreme scores, the projected rates of sexual recidivism are actually slightly lower for Aboriginal offenders when modeled through logistic regression. That is, for sexual recidivism, the scorewise recidivism estimates (in each change group) cross over between Aboriginal and non-Aboriginal offenders, such that recidivism estimates for Aboriginal offenders are somewhat higher than average for low scores but lower than average for high scores.
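The way such projected rates are obtained can be sketched with the standard logistic function. The coefficient values below are hypothetical placeholders chosen only to make the example run; the actual values are those reported in the Figure 1 note, which is not reproduced here, and the pretreatment score of 30 is likewise arbitrary.

```python
import math


def predicted_rate(pre_score, change, b0, b_pre, b_change):
    """Estimated recidivism probability from a two-predictor logistic model."""
    logit = b0 + b_pre * pre_score + b_change * change
    return 1.0 / (1.0 + math.exp(-logit))


# HYPOTHETICAL coefficients, for illustration only (not the study's values).
B0, B_PRE, B_CHANGE = -4.0, 0.06, -0.10

# Change scores used in the text: 0 (none), 3.5 (mean), 6 (1 SD above the mean).
for change in (0.0, 3.5, 6.0):
    rate = predicted_rate(30, change, B0, B_PRE, B_CHANGE)
    print(f"change = {change}: estimated 5-year rate = {rate:.3f}")
```

With a negative change coefficient, the estimated rate falls as the change score rises at any fixed pretreatment score, which is the pattern described across Figures 1A to 1C.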
Discussion
The present study examined the predictive properties of VRS-SO risk and change scores against the legal backdrop of the recent Ewert v. Canada decision, which ruled that the risk assessment and other structured forensic instruments used in the plaintiff’s case did not have satisfactory research supporting the psychometric properties of these tools with Aboriginal offenders. Drawing on a combined CSC-based sample of 393 Aboriginal and 670 non-Aboriginal treated sexual offenders, several analyses were conducted pertaining directly to the issues raised in this legal matter. The findings have implications not only for risk assessment with Aboriginal men in Canadian corrections, but also for the practice of risk assessment with diverse offender populations in general, and beyond North America.
One issue of prominence was whether such tools predicted their targeted recidivism outcomes among Aboriginal offenders, and whether they did so as well as with White, non-Aboriginal offenders. In the current investigation, Static-99R and VRS-SO scores significantly predicted all recidivism outcomes in both Aboriginal and non-Aboriginal offenders. The prediction magnitudes were slightly higher for non-Aboriginal men for more than three quarters of the effect sizes; however, given the overlapping confidence intervals, at best there is weak evidence for stronger prediction by risk scores among non-Aboriginal offenders. It is important to position this trend in some context. There is little reason to believe that established static and dynamic risk variables would not predict recidivism in different subgroups of offenders (White, Aboriginal, and otherwise), and so it should stand to reason that constellations of risk variables, combined in the form of risk scales, also predict recidivism outcomes. Risk instruments are imperfect measurement devices, whether they apply numeric scores in an actuarial manner or apply structured professional judgment to a collection of item ratings to arrive at a final risk appraisal. The use of a structured approach, whether this be via the VRS-SO or another measure, helps to apply risk information in a systematic manner, to reduce bias, and to increase the fairness and accuracy of decision making.
A second, related issue concerned normative scores on these tools. Men of Aboriginal descent did generate higher risk scores on the Static-99R and VRS-SO than non-Aboriginal men; however, the largest differences were found on static tools heavily weighted toward criminal history, whereas smaller differences were found on changeable dynamic factors. It is worth noting that higher scores for a subgroup may not necessarily indicate problems with the scale so long as they reflect actual differences; only to the extent that test scores are being artificially inflated or used for unfair purposes would higher scores become problematic. Of additional importance was that there were no differences in dynamic change scores representing improvements from sexual offender treatment. That is, both groups made the same amount of change overall, and that change was significant and nontrivial, corresponding to nearly one half of a standard deviation. Thus, although Aboriginal offenders are more likely to score higher on the VRS-SO, they demonstrated equal amounts of positive treatment change on this tool from CSC-based sexual offender programming as non-Aboriginal offenders. One implication of this finding is that evidence-informed programming, such as that following the principles of risk, need, and responsivity, has the potential to reduce risk in diverse offender populations. A further implication is that such risk changes can be captured in a valid and reliable manner.
A third issue relevant to this legal decision and beyond is to what extent use of these tools could cause harm or mischaracterize the risk of diverse offender populations and Aboriginal persons in particular. Importantly, positive treatment changes on the VRS-SO were associated with decreased recidivism among both racial/cultural groups in this sample. There was no evidence that change had any differential associations with outcome, or that either group was disadvantaged by the tool in terms of recognizing positive treatment changes or risk reduction made. Rather, the data indicate that the VRS-SO could capture positive risk-relevant changes in treated Aboriginal sexual offenders and that these changes translated into decreased recidivism to a similar degree (i.e., near identical HR magnitudes) as with non-Aboriginal offenders.
That Aboriginal men scored higher on the risk tools overall, and also had higher rates of sexual and violent recidivism, indicates that the higher risk scores likely account, in part, for these higher observed recidivism rates. This explanation seems to be most compelling for the prediction of sexual recidivism; the group differences tended to be smaller on this outcome and Aboriginal group membership did not consistently significantly predict increased rates of sexual violence once risk and change information had been accounted for. This is not the whole story, however. After controlling for risk and treatment change, Aboriginal men still had significantly higher rates of general violent recidivism postrelease than non-Aboriginal men, and this was evident through calibration analyses.
The results of calibration analyses demonstrated that recidivism estimates for sexual offending among Aboriginal offenders were somewhat higher than average for low scores, but lower than average for high scores. When applied to general violent recidivism, Aboriginal men had higher estimated rates of recidivism at both lower and higher scores. The calibration analyses have similarities to the results of Wilson and Gutierrez’s (2014) calibration analyses with the Level of Service/Case Management Inventory (LS/CMI; Andrews, Bonta, & Wormith, 2004). In this instance, they found higher general recidivism estimates for Aboriginal men that were most pronounced at lower scores on the tool, but with the disparities in recidivism estimates decreasing, and classification accuracy being more comparable, at higher scores on the tool.
What these findings demonstrate is that there are likely variables and contextual circumstances unique to Aboriginal group membership, and not necessarily tied to risk or treatment performance per se, that partly account for the higher rates of general violent recidivism observed in this group (see Wilson & Gutierrez, 2014, for a detailed analysis of possible considerations). Thus, risk and change information on its own does not explain all individual differences in recidivism rates at least for general violence. Canada’s Gladue provision (R v. Gladue, 1999) is an important sentencing tool to take into account historical, social, and cultural considerations unique to Aboriginal persons that may have contributed to the individual’s conflict with the law to inform sentencing and sanctions. As such, the Gladue provision permits consideration of other factors, at least at the time of sentencing that may have bearing on involvement in antisocial behavior.
In closing, in addition to demonstrating acceptable predictive accuracy for sexual and general violence among Aboriginal and non-Aboriginal offenders, significant associations demonstrated that higher change scores often amounted to decreased recidivism, regardless of ancestry. The results also indirectly speak to the integrity of the specialized sexual offender programs that these men participated in at the time within Canadian corrections to reduce risk and prevent recidivism. As these data focused squarely on the exact offender group represented by the plaintiff, federally incarcerated Aboriginal sexual offenders, these findings could be no less pertinent to Ewert v. Canada. At a more global level, the use of risk assessment instrumentation and applicability of mainstream correctional programs with diverse offender populations is an issue that is not without controversy in jurisdictions around the world. It is hoped the present study will stimulate further research on the predictive properties of risk and change scores of other risk tools with culturally, ethnically, and racially diverse offenders in other settings.
Any tool has the potential for misuse; the long history of atrocities featuring misapplications of IQ testing to brand children as feebleminded, deport newly landed immigrants, or to justify the reproductive sterilization of low scoring adolescents as recently as 45 years ago is a dark reminder. The problem, however, resides not so much within the assessment tools themselves as it does with the failure to use psychological tests in an ethical and responsible manner or for their intended purposes. Indeed, responsible use of IQ testing by trained personnel can be used to generate beneficial applications such as educational or vocational accommodations, access to specialized programs or services, or financial or other environmental supports. The same applies no less to risk assessment. Slavish adherence to numeric actuarial risk estimates, overreliance on purely static tools, or failure to use professional discretion in the integration of assessment data to inform risk appraisals can result in harmful decisions to any offender subgroup, regardless of culture, race, or ethnicity. The use of dynamic assessment tools, however, to track positive treatment progress and change can frankly inform the graduated release and supervision of individuals whose risk can be safely managed in the community and provide an opportunity for them to lead new and more satisfying lives free of sexual violence. To not use evidence informed tools with a certain subgroup when the science supports their psychometric properties arguably does a disservice to frontline staff, decision makers, and the offender clientele themselves, all of whom stand to benefit when such tools can inform the management of risk and prevention of recidivism.
Footnotes
Authors’ note
The views expressed in this article are solely those of the authors and do not necessarily reflect the views of the Correctional Service of Canada, University of Saskatchewan, Royal Ottawa Health Care Group, University of Ottawa, University of Canterbury, University of Nottingham, or Swinburne University of Technology.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Mark Olver, Stephen Wong, Terry Nicholaichuk, and Audrey Gordon are developers of the Violence Risk Scale-Sexual Offender version and receive remuneration for consultation and training services with the tool.
