Abstract
We examined the predictive properties of the Level of Service Inventory–Ontario Revision (LSI-OR) in a sample of 604 provincially incarcerated men with mental illness from a correctional mental health facility followed up nearly 2 years after release. Recidivism base rates and LSI-OR scores were relatively consistent across major mental disorder categories, but higher among individuals with personality disorder, substance use disorder, or dual diagnosis. LSI-OR scores predicted general and violent recidivism in the overall sample and among specific diagnostic groups. Calibration analyses were conducted to model 1-year recidivism estimates for the overall sample and among individual diagnostic groups associated with individual LSI-OR scores. Good correspondence was observed among the different diagnostic groups, with some difference in recidivism trajectories given the differences in base rate. The results support the predictive properties of the LSI-OR with correctional mental health samples and inform the recidivism estimates associated with LSI-OR scores in this population.
Assessments of risk for future recidivism are an essential component of the administration and management of offenders within correctional and forensic mental health systems around the globe. Risk assessment is a systematic process of collecting, aggregating, and integrating information from file and interview to inform the final appraisal and concordant recommendations. From the point of intake to case closure, recidivism risk assessments are used to inform sentencing, classification, security level, the intensity and foci of treatment programming, release decisions, and community supervision, monitoring, and restrictions (Andrews, Bonta, & Wormith, 2006).
A large number of risk instruments have been developed for different targeted outcomes and populations (e.g., sexual offending, general violence, intimate partner violence), based on theory and research supporting the presence of specific static and dynamic risk factors for such groups, with implications for risk assessment and management. However, general risk/need domains reflecting a propensity for rule violation or antisociality (Hanson & Morton-Bourgon, 2005) are common to all risk measures and predictive of recidivism across offender groups. General risk/need tools specifically tap this broad propensity and are designed to be used across a range of offender populations to inform case planning.
Level of Service (LS) Measures: A Brief Overview
The Level of Service Inventory (LSI) is one such general risk/need assessment tool; with more than a dozen variants and revisions (including youth, self-report, and screening versions), the LS family of tools is the most widely used collection of risk instruments in the world, with over 1 million administrations annually (Wormith, 2011). The LS scales are grounded in a General Personality and Cognitive Social Learning Theory (GPCSL), which identifies a core set of static and dynamic risk/need domains, termed the Central Eight, involved in the origin and maintenance of antisocial behavior. Within the GPCSL framework, the LS measures are designed to appraise risk, to identify targets based on the Central Eight for intervention, and to inform service delivery to manage risk. A recent comprehensive meta-analysis (Olver, Stockdale, & Wormith, 2014) of the predictive accuracy of the LS measures from 151 independent samples and 137,981 offenders found total scores on the tool to predict general criminal recidivism in male (rw = .30, k = 80, n = 77,920), female (rw = .31, k = 45, n = 17,802), White (rw = .29, k = 24, n = 40,989), non-White (rw = .27, k = 36, n = 25,780), adult (rw = .25 to .42, k = 4 to 55, n = 2,518-78,505), and youth (rw = .28, k = 30, n = 15,447) offender samples. The LS measures, and their criminogenic need domains, also predicted violent recidivism in these broad demographic subgroups, as well as other recidivism outcomes (e.g., institutional offending, half-way house failure, reincarceration).
Relevance of the Central Eight to Persons With Mental Illness
The Central Eight and the LS measures are highly relevant in risk appraisals of persons with mental illness. Indeed, the vast majority of studies have shown that the predictors of recidivism are largely shared between persons with and without mental illness (Kingston, Olver, Harris, Wong, & Bradford, 2015; Rezansoff, Moniruzzaman, Gress, & Somers, 2013; Skeem, Winter, Kennealy, Louden, & Tatar, 2014). In their meta-analysis of 96 forensic samples of persons with mental illness (n = 23,900), Bonta and his colleagues (Bonta, Blais, & Wilson, 2014; Bonta, Law, & Hanson, 1998) found that the Central Eight risk/need domains were significantly associated with general and violent recidivism.
The risk relevance of the Central Eight among persons with mental illness is not particularly contentious, but what is of greater debate is the role of mental health symptomatology in risk assessment. The Bonta et al. (2014) meta-analysis and an earlier quantitative review (see Bonta et al., 1998) found that clinical predictors such as major mental illness diagnoses (e.g., psychosis, mood disorder, anxiety disorder) and previous psychiatric hospitalizations were weak predictors of general and violent recidivism. A notable exception was that a diagnosis of personality disorder (PD), namely, antisocial PD, was a reliable predictor of recidivism, although the authors noted that this variable is highly consistent with several of the aforementioned risk/need factors.
Elsewhere, in a large epidemiological study of 31,104 offenders, Rezansoff et al. (2013) found that released offenders with a substance use disorder (SUD), occurring in isolation or comorbid with another mental disorder (dual diagnosis or DD), had twice the odds of reoffending within a 2-year period than those without such diagnoses. Individuals with nonsubstance-related mental disorders, by contrast, were at no greater risk of recidivism than individuals without any diagnosis. Subsequent studies in forensic samples of persons with mental illness have extended these findings (Kingston et al., 2016; Kingston et al., 2015). Of note, the Bonta et al. (2014) review did not examine active symptomatology and its association with violence. To this end, a major meta-analysis of psychosis and violence (Douglas, Guy, & Hart, 2009) found a significant link between the two, particularly for community and civil psychiatric samples, although the strength of the association was smaller for correctional samples and nonsignificant for forensic psychiatric samples.
Discrimination and Calibration Properties of the LS Measures
Discrimination and calibration are two ways of evaluating the predictive accuracy of a risk tool. The predictive accuracy findings of the LS measures reported from past research have pertained largely to the property of discrimination (relative risk), that is, to what extent higher risk scores are associated with higher rates of recidivism irrespective of actual rates of recidivism. A related property is the calibration of a risk instrument (absolute risk), that is, what recidivism rates are associated with risk scores and to what extent expected recidivism rates for a given score correspond to observed recidivism rates for that score. Important decisions are made on the basis of the calibration properties of risk assessment tools, such as the level of treatment intensity correctional cases are assigned, whether they are granted conditional release, how much supervision they receive in the community, their volume of release conditions/restrictions, amount of reporting required as part of community supervision, and so on. Calibration research on the Level of Service Inventory–Ontario Revision (LSI-OR; Andrews, Bonta, & Wormith, 1995) has shown lower rates of recidivism to be observed for adult female offenders than for adult male offenders with the same score, or falling in the same risk category (Hogg, 2011). Using the same LSI-OR sample, Wilson and Gutierrez (2014) found low-scoring Indigenous men had higher predicted rates of recidivism compared with low-scoring non-Indigenous men, but at higher scores, Indigenous and non-Indigenous recidivists were classified with similar accuracy.
In short, while existing research supports the calibration properties of the LS scales, demonstrating that increasing scores are associated with successively higher rates of recidivism, the same risk score or score category may also be associated with different rates of recidivism for a particular offender group. Calibration estimates have important and direct applications for offender classification and management. It is important to bear in mind, however, that calibration estimates can be influenced by a variety of factors that affect the outcome variable, such as variation in base rates, follow-up time, jurisdictional differences in the definition of the criterion variable (e.g., charges vs. convictions), and the comprehensiveness and reliability of the outcome data source. Clear definitions of the outcome variable, obtained from a reliable data source, controlling for individual differences in follow-up time can help increase the accuracy and consistency of calibration estimates for risk classification.
Present Study: Context and Rationale
The predictive properties of the LS measures for recidivism have been well established across a range of clinical groups that transcend age, gender, and ethnic–racial ancestry (Olver et al., 2014). The existing literature also supports the predictive accuracy of the Central Eight and the LS measures among persons with mental illness (Bonta et al., 2014; Skeem et al., 2014). Most of this work to date has examined the discrimination properties of the LS measures or collections of predictors. Moreover, there has yet to be a formal examination of the calibration properties of the LS measures in a correctional mental health sample to evaluate to what extent recidivism rates from the normative sample apply to persons with mental illness.
The present study endeavors to address this gap in the literature through examining the discrimination and calibration properties of the LSI-OR in a large provincially incarcerated adult male sample of persons with mental illness. The LSI-OR is used throughout provincial corrections and mental health sectors of Ontario, but of note, it is also known as the Level of Service/Case Management Inventory (LS/CMI; Andrews, Bonta, & Wormith, 2004; J. Stephen Wormith, personal communication, June 28, 2018). Therefore, results concerning the LSI-OR have direct implications for the LS/CMI, which employs Ontario provincial corrections norms and is used in jurisdictions around the world. Examining the recidivism rates associated with LSI-OR risk scores among persons with mental illness is an important priority and it has direct ramifications for applications of the LS/CMI with this population; specifically, do the rates of recidivism in the LS/CMI normative sample extend to those of a correctional mental health sample? Finally, calibration metrics, such as the expected/observed (E/O) index (see Method), have been applied to examine the calibration properties of risk tools (Hanson, 2017), but to our knowledge, this has yet to be done with the LS scales in general or on a variant of the tool in a correctional mental health sample.
Method
Sample
The sample consisted of 604 provincially incarcerated adult male offenders who were receiving treatment and stabilization services at an all-male wing of a Canadian provincial correctional mental health facility. As provincial offenders, the men were held criminally culpable for their offenses and sentenced to a duration of custody under 2 years (M = 1.0 years, SD = 0.5). Thus, the sample did not include individuals adjudicated Not Criminally Responsible on account of Mental Disorder (often referred to as Not Guilty by Reason of Insanity in other jurisdictions). Most of the sample was White (76.3%, n = 461), followed by Indigenous Canadian (8.4%, n = 51), Black (8.4%, n = 51), Asian (1.8%, n = 11), Middle Eastern (1.3%, n = 8), and the remaining being of other or unknown racial descent (3.6%, n = 22). Marital status information was available for approximately two thirds of the sample (65.1%, n = 393), most of whom were single/never married (62.6%, n = 246), followed by divorced/separated (22.4%, n = 88), married or equivalent (13.7%, n = 54), and widowed (1.0%, n = 4). The sample (n = 593) was 34.5 years (SD = 11.8) of age on average upon admission and 34.8 years (SD = 11.8) at release. In terms of their criminal histories, the men averaged 2.7 (SD = 3.6) prior charges and convictions for a violent offense, 21.8 (SD = 25.2) total prior charges and convictions, and 11.0 (SD = 12.8) prior sentencing dates.
All admissions had mental health concerns prompting a referral to the facility. Formal Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000) diagnoses were assigned at intake by a licensed forensic psychiatrist; however, these diagnostic data were available for only half of the sample (50.5%, n = 305). Owing to resource and time constraints, diagnostic data were not extracted for the most recent admissions (mid 2014-2015; 49.5%, n = 299) during the second wave of data collection for the present study. Of the 305 cases with diagnosis available, 31.5% (n = 96) had a diagnosis of schizophrenia or other psychosis, 41.3% (n = 126) mood disorder, 44.3% (n = 135) anxiety disorder, 36.7% (n = 112) PD, 81.6% (n = 249) SUD, and 78% (n = 238) a co-occurring SUD with another mental health diagnosis or DD. Comorbidity was high in this sample, with 55.4% of cases meeting symptom criteria for multiple diagnoses (excluding SUD) and 89.8% of cases when including SUD (see Table 1). As a result, the subgroup ns do not sum to 305.
Rates of Comorbidity of Psychiatric Diagnoses and Associations With 1-Year General and Violent Recidivism
Note. Comorbidity values represent percentage of cases with one diagnosis that also have a corresponding diagnosis (e.g., comorbid mood and anxiety disorder represented 46% of cases diagnosed with mood disorder and 43% of cases diagnosed with anxiety disorder). Associations with general and violent recidivism evaluated via odds ratios (OR) with values above 1.0 indicating the diagnosis to be associated with increased recidivism, and values below 1.0, decreased recidivism.
*p < .05. **p < .01. ***p < .001.
LSI-OR
The LSI-OR (Andrews et al., 1995) is a risk/need rating scale designed to appraise risk for general recidivism, to identify criminogenic needs to be targeted for risk management, and to inform case management from the point of intake through to case closure. It is rated by frontline service providers on the basis of case history/file information and interview. The LSI-OR is organized into eight sections. Section A assesses General Risk/Need Factors and comprises 43 items organized into the Central Eight. Section B assesses Specific Risk/Need Factors that include Personal Problems with Criminogenic Potential and History of Perpetration. Section C is a brief Risk/Need Summary (risks and strengths), whereas Section D assesses Institutional Factors. Section E constitutes the Risk/Need Profile in which Section A item scores are organized into one of five risk bands across the eight domains as well as the total score: Very Low (0-4), Low (5-10), Medium (11-19), High (20-29), and Very High (30-43). Section F assesses Other Client Issues (i.e., Social, Health, and Mental Health and Barrier to Release), Section G assesses Special Responsivity Issues (e.g., cultural issues), and Section H represents the Program/Placement Decision.
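The five risk bands described above can be expressed as a simple lookup. The following is a minimal sketch for illustration only; the cut-offs are those listed in the text, whereas the names `RISK_BANDS` and `risk_band` are hypothetical and not part of the LSI-OR materials.

```python
# Section A total-score cut-offs for the five LSI-OR risk bands, as listed in the text.
# The names RISK_BANDS and risk_band are illustrative, not part of the instrument.
RISK_BANDS = [
    ("Very Low", 0, 4),
    ("Low", 5, 10),
    ("Medium", 11, 19),
    ("High", 20, 29),
    ("Very High", 30, 43),
]


def risk_band(total_score: int) -> str:
    """Return the risk band label for a Section A total score (0-43)."""
    for label, low, high in RISK_BANDS:
        if low <= total_score <= high:
            return label
    raise ValueError(f"total score outside the 0-43 range: {total_score}")
```

For example, `risk_band(26)` returns `"High"`, the band containing the sample mean total score reported in the Results.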
Procedure
The LSI-OR was completed on every individual referred to the facility as part of the comprehensive referral package accompanying all offenders admitted to this institution. All LSI-OR raters underwent a 3-day training provided by the province of Ontario’s Correctional Service College. The LSI-ORs were most typically completed by social workers or rehabilitation officers at the institution. Each LSI-OR initial assessment included, at minimum, a face-to-face interview with the offender, an interview with a collateral contact, and a detailed review of file documentation, including information pertaining to the index offense, prior criminal record, and any other relevant file information, such as prior program participation. The Authorization section of the tool required raters to identify themselves as having completed the LSI-OR and their position.
Recidivism Criteria
Outcome data were retrieved on June 10, 2016, from the Offender Tracking and Information System (OTIS) used by the province of Ontario’s Ministry of Community Safety and Correctional Services, the same tracking system used for the LS/CMI normative sample. In this study, recidivism was defined as a return to provincial correctional supervision on a new charge or conviction within 2 years of the completion of a provincial sentence of incarceration. Two recidivism outcomes were examined: violent recidivism, which included any new charge or conviction for crimes against the person (e.g., assault, homicide, threats, sexual offenses), and general recidivism, which included any new criminal charge or conviction.
Planned Analyses
The analyses proceeded in several phases focusing on Sections A, B, and E of the LSI-OR. All cases were included in the analyses, regardless of the availability of diagnostic data, given that this represented a common sample referred and admitted under common criteria to the same facility. First, Cohen’s d was computed to examine the difference in LSI-OR need and total scores between binary diagnostic groups (i.e., diagnosis–no diagnosis) for the following categories: schizophrenia/psychosis, mood disorders, anxiety disorders, PD, SUD, and DD (i.e., comorbid SUD and mental health conditions). Significant positive d values would indicate that diagnosis was associated with higher scores (i.e., increasing risk), whereas significant negative d values would indicate a diagnostic category to be associated with lower scores (i.e., decreasing risk). The benefit of d is that it is not affected by fluctuating diagnostic base rates. Individuals were included in a given diagnostic subgroup for all such analyses if their symptomatology fulfilled the DSM-IV-TR criteria for a given diagnosis. This was due to the high level of comorbidity in the sample, which prohibited examining nonoverlapping groups arranged according to a single primary diagnosis due to power restrictions and consequent inflation of Type II error for some subgroups (i.e., mood and anxiety disorders). In addition, the association between diagnostic group membership and general and violent recidivism was examined via odds ratio (OR) to evaluate the risk relevance of certain diagnostic categories.
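The two effect sizes named in this phase can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the text does not specify the standard deviation used for d, so a pooled standard deviation is assumed here, and the function names and any input values are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(scores_with_dx, scores_without_dx):
    """Standardized mean difference (diagnosis minus no diagnosis).

    Assumes the pooled-SD form of Cohen's d; the source does not state
    which variant was used.
    """
    n1, n2 = len(scores_with_dx), len(scores_without_dx)
    pooled_var = (
        (n1 - 1) * variance(scores_with_dx) + (n2 - 1) * variance(scores_without_dx)
    ) / (n1 + n2 - 2)
    return (mean(scores_with_dx) - mean(scores_without_dx)) / math.sqrt(pooled_var)


def odds_ratio(recid_dx, nonrecid_dx, recid_no_dx, nonrecid_no_dx):
    """Odds of recidivism with the diagnosis divided by the odds without it."""
    return (recid_dx / nonrecid_dx) / (recid_no_dx / nonrecid_no_dx)
```

A positive d means the diagnosed group scored higher on the LSI-OR measure, and an OR above 1.0 means the diagnosis was associated with increased odds of recidivism.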
Second, the discrimination properties of LSI-OR total and need scores for general and violent recidivism were examined on the total sample and six diagnostic subgroups via receiver operating characteristic (ROC) analyses. ROC analyses generate an area under the curve (AUC) statistic ranging from 0 to 1.0, representing the probability that a randomly selected recidivist has higher scores than a randomly selected nonrecidivist. With values of .50 indicating chance-level accuracy, values of .556, .639, and .714 correspond to small, medium, and large effect sizes, respectively (Rice & Harris, 2005).
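The probabilistic reading of the AUC can be made concrete by comparing every recidivist with every nonrecidivist. The sketch below is illustrative only: the function name is hypothetical, and counting tied scores as half a "win" is the usual convention rather than something stated in the text.

```python
def auc(recidivist_scores, nonrecidivist_scores):
    """Probability that a randomly selected recidivist outscores a
    randomly selected nonrecidivist (ties count as one half)."""
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))
```

With this definition, a value of .50 arises when recidivists and nonrecidivists are indistinguishable on the score, and 1.0 when every recidivist scores higher than every nonrecidivist.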
Third, the calibration properties of LSI-OR total scores for general and violent recidivism were examined in similar fashion in the sample overall and among diagnostic subgroups. Helmus and Babchishin (2017) note, “A consensus on the appropriate statistics for calibration is not yet developed, largely because the emphasis on calibration has only begun in recent years” (p. 14). Calibration was examined through logistic regression modeling employing 1-year fixed follow-ups (see Hanson, Helmus, & Thornton, 2010, for similar applications with Static-2002R data). Logistic regression generates a constant (B0), the log odds of the recidivism base rate, while criterion–predictor associations are represented by logistic regression coefficients (B1), which represent the proportionate increase in rates of recidivism (on the log odds scale) between adjacent scores on the predictor variable (i.e., LSI-OR score). The following log linking function converts these terms into an estimated recidivism rate (p) for a given LSI-OR score:
$$p = \frac{e^{B_0 + B_1(\text{score})}}{1 + e^{B_0 + B_1(\text{score})}}$$
A direct test of calibration is to compare expected with observed recidivism rates, that is, to what extent observed rates of recidivism associated with a given score (or category of possible scores) align with the rates of recidivism for the same scores as estimated from logistic regression or otherwise expected from another sample. To do this, we computed the E/O index following the procedure set out by Hanson (2017) which involves computing the ratio of the expected (E) number of recidivists for a given score (or score category) to the number of recidivists directly observed (O) for the same score or score category. The expected number of recidivists may be obtained through logistic regression or represent the actual number of recidivists from a reference group. The index cannot be computed when there are zero observed recidivists as this would require dividing by zero. The E/O index was computed in two capacities. First, the E/O index was computed to examine calibration properties of the LSI-OR internal to the sample, that is, to what extent observed rates of 1-year violent and general recidivism lined up with 1-year rates estimated through logistic regression for the five risk bands. Second, a more rigorous test of calibration would be to compute the E/O index for 1-year reincarceration rates for the risk bands in the normative sample (Andrews et al., 2004) for provincially incarcerated offenders, applying their recidivism frequencies to the current sample’s cell ns to permit direct comparison. The latter test would enable conclusions to be drawn as to how well the LSI-OR is calibrated for a correctional sample of persons with mental illness.
E/O values of 1.0 indicate perfect calibration, whereas values above 1.0 indicate a higher number of recidivists estimated for a given risk band than that observed (i.e., overestimation of the number of recidivists by a given category), and values below 1.0 indicate a smaller number of recidivists estimated for a given risk band (i.e., underestimation of the number of recidivists for that category). The significance of the E/O index can be examined through computing 95% confidence intervals (CI) through the following formula (Rockhill, Byrne, Rosner, Louie, & Colditz, 2003; as taken from Hanson, 2017):
$$95\%\ \text{CI} = \frac{E}{O} \times \exp\left(\pm 1.96 \sqrt{\frac{1}{O}}\right)$$
When the bounds of the 95% CI do not overlap with 1.0, the E/O index represents significant differences between the expected and observed recidivism rates.
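The E/O computation can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the interval takes the form (E/O) × exp(±1.96 × √(1/O)), which is the form usually attributed to Rockhill et al. (2003), and the function names are hypothetical.

```python
import math


def eo_index(expected, observed, z=1.96):
    """Return (E/O, lower bound, upper bound) for one score or risk band.

    Assumes the interval (E/O) * exp(+/- z * sqrt(1/O)); the index is
    undefined when no recidivists were observed.
    """
    if observed == 0:
        raise ValueError("E/O index is undefined with zero observed recidivists")
    ratio = expected / observed
    half_width = z * math.sqrt(1.0 / observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


def differs_from_perfect_calibration(lower, upper):
    """True when the confidence interval excludes 1.0."""
    return not (lower <= 1.0 <= upper)
```

Because the interval width depends only on O, bands with few observed recidivists yield wide intervals, so even sizeable departures from 1.0 may be nonsignificant.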
Results
LSI-OR Comparisons as a Function of Mental Health Diagnosis
Table 2 presents basic descriptive statistics for the overall sample and six diagnostic subgroups. Broadly speaking, the sample was high risk/need with a mean score of 26.5 (SD = 7.7), which is significantly higher than the Canadian provincial male inmate norms of 22.1 (SD = 8.4), by approximately half of a standard deviation, d = 0.52, p < .001 (see Andrews et al., 2004). In terms of the risk level for the overall sample, less than 1% (n = 3, M = 3.0, SD = 1.0) were Very Low, 2.2% Low (n = 13, M = 9.2, SD = 1.1), 15.7% Medium (n = 95, M = 15.3, SD = 2.7), 42.9% High (n = 259, M = 24.9, SD = 2.8), and 38.7% Very High (n = 234, M = 34.0, SD = 7.7).
LSI-OR Descriptive Statistics and Associations With Diagnostic Group Membership
Note. The d values are standardized mean differences between binary diagnostic groups (yes–no) on an LSI-OR measure. Positive d values indicate diagnostic group membership is associated with higher scores on the LSI-OR measure; negative d values, lower scores. LSI-OR = Level of Service Inventory–Ontario Revision.
The broadly high-risk nature of the sample applied across each of the diagnostic groups, although binary comparisons (Cohen’s d) demonstrated that diagnoses of PD, substance disorder, or DD were associated with higher total risk scores and greater risk and need in all need domains except for Family/Marital and (for PD only) Leisure/Recreation and Procriminal Attitude. Conversely, diagnoses of mood or anxiety disorder were associated with lower risk/need, including the total LSI-OR score, Education/Employment, and (for mood disorder only) Leisure/Recreation and Substance Use, as well as (for anxiety disorder only) Procriminal Attitude and Antisocial Pattern (negative d values). Anxiety disorder diagnosis was associated with a greater number of strengths, while having a substance-related diagnosis was associated with decreased strengths. Finally, schizophrenia or other psychotic disorder diagnosis was not significantly associated with higher risk on most of the LSI-OR domains except for Leisure/Recreation and Procriminal Attitude.
Discrimination Analyses: Predictive Accuracy for Violent and General Recidivism
The sample was followed up for a mean 1.9 years (SD = 0.88, range = 3.8 months to 4.2 years) after release, during which 16.4% (n = 99/604) were charged or convicted for a violent offense and 47.7% (n = 288/604) for any new offense. One-year rates of violent and general recidivism were 12.6% (n = 61/483) and 36.2% (n = 175/483), respectively. As seen in Table 1, diagnoses of PD were associated with significantly increased odds of violent and general recidivism, diagnoses of SUD and DD were associated with significantly increased odds of general recidivism, and diagnoses of schizophrenia were associated with significantly decreased odds of general recidivism. No other diagnoses were significantly associated with recidivism.
The next set of analyses examined the predictive accuracy of LSI-OR scores for general and violent recidivism using unfixed and 1-year fixed follow-ups (Table 3). The fixed follow-ups yielded slightly higher AUC magnitudes notwithstanding the greater power afforded to the larger overall sample with unfixed follow-ups. LSI-OR total scores demonstrated moderate predictive accuracy for general recidivism and small but significant predictive accuracy for violent recidivism. In turn, the Central Eight risk domains significantly predicted general recidivism with small-to-moderate AUC magnitudes. Criminal History and Antisocial Pattern were the strongest predictors in terms of AUC magnitude, and these domains, along with Substance Abuse, significantly predicted violent recidivism. Five of the eight criminogenic need domains did not significantly predict violent recidivism. Finally, additional specific risk/need considerations significantly predicted general (but not violent) recidivism with small magnitude accuracy, while the strengths domain of the LSI-OR failed to significantly predict any recidivism criteria.
Predictive Accuracy of LSI-OR Scores for Violent and General Recidivism Over Fixed and Unfixed Follow-Ups
Note. Unfixed follow-up n = 602-604; fixed follow-up n = 480-482. LSI-OR = Level of Service Inventory–Ontario Revision; AUC = area under the curve; CI = confidence interval.
*p < .05. **p < .01. ***p < .001.
These analyses were repeated using fixed 1-year general recidivism outcome as the criterion variable among the six specific diagnostic subgroups (Table 4). First, LSI-OR total risk/need score demonstrated moderate predictive accuracy for general recidivism irrespective of diagnostic subgroup. AUCs tended to be somewhat lower in higher risk/need groups (personality, substance, and dual diagnoses), likely given that these analyses are incremental validity tests of sorts, as diagnostic subgroup status is controlled for; as such, LSI-OR scores were able to meaningfully discriminate recidivists from nonrecidivists, even among subgroups associated with elevated base rates of recidivism. In terms of the specific risk/need domains, Criminal History and Antisocial Pattern were consistent predictors of outcome across the diagnostic groups, followed by Companions (schizophrenia, mood, and anxiety diagnoses) and Substance Abuse (mood, anxiety, and substance disorder diagnoses). The largest volume of criminogenic needs demonstrated significant predictive accuracy among men diagnosed with mood disorders, with Education/Employment and Leisure/Recreation also significantly predicting this outcome in this subgroup. Specific risk/need scores and strengths did not significantly predict recidivism in any of the diagnostic subgroups.
Predictive Accuracy of LSI-OR Scores for General Recidivism (Fixed 1-Year Follow-Up) Among Diagnostic Groups
Note. LSI-OR = Level of Service Inventory–Ontario Revision; AUC = area under the curve; CI = confidence interval.
*p < .05. **p < .01. ***p ⩽ .001.
Calibration Analyses: Rates of Recidivism Associated With LSI-OR Scores
The next set of analyses examined the rates of recidivism associated with LSI-OR scores or clusters of scores within its five-tiered risk bands. The observed base rates for violent and general criminal recidivism supported the predictive accuracy of the bins in that higher observed rates of recidivism were associated with subsequent increases in risk level from one bin to the next, N = 483, χ2 = 42.91, p < .001. Figure 1 graphs the trajectories of 1-year violent and general recidivism using logistic regression modeling for all possible LSI-OR scores. Results from the Hosmer–Lemeshow test were nonsignificant, demonstrating that the recidivism data and their association with risk scores fit a logistic distribution. Results of logistic regression generated the following terms for violent (B0 = −2.746, B1 = 0.081, p = .001) and general (B0 = −3.421, B1 = 0.103, p < .001) recidivism. These values were applied to generate the estimated rates of recidivism graphed in Figure 1. The actual rates of recidivism associated with a given LSI-OR score for each outcome demonstrated a saw-toothed pattern, with the general trend of an upward increase in observed rates of recidivism associated with higher LSI-OR scores, reflecting the instability of actual recidivism rates. In part, this is owing to the small ns associated with some cells, particularly for extreme scores, and thus, the actual rates are quite variable. As demonstrated previously by Hanson et al. (2010), logistic regression modeling then served to smooth out the recidivism trajectories, reducing error and bias, to estimate rates of recidivism associated with a given LSI-OR score in this correctional mental health sample.
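To illustrate how logistic regression terms translate into estimated rates, the sketch below applies the standard logistic function to the terms reported above for general recidivism. It assumes the LSI-OR total score enters the model as a raw, uncentered value, which the text does not state explicitly; the constant and function names are hypothetical.

```python
import math

# Terms reported in the text for 1-year general recidivism.
B0_GENERAL = -3.421
B1_GENERAL = 0.103


def estimated_rate(score, b0, b1):
    """Estimated probability of recidivism for a given total score.

    Assumes the raw (uncentered) score is the predictor.
    """
    logit = b0 + b1 * score
    return 1.0 / (1.0 + math.exp(-logit))


def estimated_trajectory(b0, b1, max_score=43):
    """Estimated rates for every possible total score from 0 to max_score."""
    return [estimated_rate(s, b0, b1) for s in range(max_score + 1)]
```

Calling `estimated_trajectory(B0_GENERAL, B1_GENERAL)` yields a smooth, monotonically increasing curve across the 0-43 score range, in contrast to the saw-toothed pattern of the observed rates.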

LSI-OR Total Score Calibration Analyses of Overall Sample Comparing 1-Year Violent and General Recidivism Trajectories for Actual and Logistic Regression Estimated Rates (N = 483).
Table 5 reports the results of E/O index calculations. Given that there were no observed recidivists for the Very Low and Low categories, the E/O index could not be computed, although it could be computed for the remaining categories for both outcomes. In terms of calibration properties internal to and descriptive of the sample, rates of general and violent recidivism were slightly overestimated for the Medium and Very High categories, with a much smaller margin found for the estimation of general recidivism (8% and 2%, respectively) compared with violent recidivism (94% and 10%, respectively), bearing in mind for the latter this represented a very low base rate for the Medium category. For the High Risk category, the number of recidivists for general and violent recidivism was slightly underestimated (7% and 19%, respectively). Importantly, however, as the 95% CIs all overlapped with 1.0, none of these differences between observed and expected recidivism rates were statistically significant for either outcome or any of the risk categories.
Table 5. LSI-OR E/O Index Values for 1-Year Rates of General and Violent Recidivism
Note. For comparisons with the normative sample, observed n recidivists are computed using the percentages from the normative sample (Andrews et al., 2004) applied to the observed cell n for each risk band from the current sample. The E/O index is significant when confidence intervals (CIs) do not overlap with 1.0. LSI-OR = Level of Service Inventory–Ontario Revision.
To compare the recidivism rates observed in the present sample with those expected from the LS/CMI normative sample, a more rigorous test of calibration, we computed the E/O index using the number of recidivists based on the observed 1-year recidivism frequencies in the original normative sample for provincial inmates (Andrews et al., 2004), compared with the observed 1-year frequencies in the current sample. As seen at the bottom of Table 5, the LS/CMI significantly overestimated the number of recidivists for the Medium (i.e., predicting 113% more recidivists) and High (38% more) risk categories compared with the study sample and slightly overestimated the recidivism rates for the Very High risk group (20% more).
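The E/O comparisons described above can be sketched as follows. This is an illustration only: the text does not state how the 95% CIs were computed, so the snippet assumes a commonly used Poisson-based approximation, (E/O) × exp(±1.96 × √(1/O)). The function names and the example numbers are hypothetical and are not values from Table 5.

```python
import math

def expected_from_rate(reference_rate, cell_n):
    """Expected number of recidivists when a reference rate (e.g., from a
    normative sample) is applied to the cell n of a risk band."""
    return reference_rate * cell_n

def eo_index(expected, observed, z=1.96):
    """Return (E/O, lower, upper), or None when no recidivists were observed.

    ASSUMPTION: the CI uses the Poisson-based approximation
    (E/O) * exp(+/- z * sqrt(1 / O)); the source does not specify its method.
    """
    if observed <= 0:
        return None  # E/O is undefined with zero observed recidivists
    ratio = expected / observed
    half_width = z * math.sqrt(1.0 / observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)

def is_significant(result):
    """The index is significant when the CI does not overlap 1.0."""
    _, lower, upper = result
    return lower > 1.0 or upper < 1.0

# Hypothetical band: reference rate of 40%, 50 cases, 10 observed recidivists.
hypothetical = eo_index(expected_from_rate(0.40, 50), observed=10)
```

Values above 1.0 indicate overestimation (more recidivists expected than observed) and values below 1.0 indicate underestimation; returning `None` for a band with no observed recidivists mirrors the situation noted for the Very Low and Low categories.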
Figure 2 presents the results of calibration analyses of LSI-OR scores for 1-year general recidivism repeated for the six diagnostic subgroups. Given the small and fluctuating cell sizes of the diagnostic groups, the E/O index was not computed for these. However, the results demonstrated the relative stability of recidivism rates estimated from LSI-OR scores across the diagnostic subgroups. In addition, the schizophrenia subgroup showed a slightly shallower trajectory, and individuals with mood disorder had slightly higher recidivism estimates for extreme scores. Taken together, the results support the stability of 1-year general recidivism estimates of LSI-OR scores for specific diagnostic subgroups.

Figure 2. LSI-OR Calibration Analyses for Specific Diagnostic Subgroups (1-Year General Recidivism)
Discussion
In this study, we examined the discrimination and calibration properties of the LSI-OR (aka LS/CMI) on a fairly large correctional sample of persons with mental illness serving short provincial terms at a Canadian mental health correctional facility. The work builds on the existing corpus of research supporting the psychometric properties of the family of LS measures with diverse offender samples. The calibration properties of the tool had yet to be examined among persons with mental illness, with the current study being based within the same jurisdiction as the original LS/CMI normative sample to extend the generalizability of findings.
Preliminary analyses demonstrated that major mental disorder categories of schizophrenia/psychosis, anxiety disorders, and mood disorders on their own tended to be associated with a slightly lower density of criminogenic needs, as demonstrated by lower scores within these domains and overall risk on the LSI-OR. These groups also had lower rates of recidivism. By contrast, SUD, PD, and DD were associated with increased density of criminogenic needs and higher risk ratings overall; these diagnoses were also associated with increased odds of recidivism. This is consistent with previous research demonstrating that mental illness on its own in a correctional context tends not to be strongly associated with risk and need (Kingston et al., 2015, 2016) and that nonsubstance-related mental disorder has limited associations with community recidivism rates among offender samples (Rezansoff et al., 2013). The same lines of research, however, have tended to find the opposite for PD (especially antisocial personality) and substance-related conditions, which are in and of themselves criminogenic given that two of the Central Eight need domains pertain directly to these diagnoses; that is, PD and SUD have risk relevance.
Discrimination and Calibration Properties of the LSI-OR for Persons With Mental Illness
LSI-OR scores also showed good discrimination of recidivists from nonrecidivists in ROC analyses. The total score demonstrated moderate-magnitude accuracy for recidivism, and the individual Central Eight need domains demonstrated small-to-moderate accuracy for general recidivism; Criminal History, Substance Use, and Antisocial Personality Pattern specifically were also predictive of violent recidivism. These latter three domains were most frequently predictive of 1-year general recidivism within each mental disorder category, along with Companions and the total score. Several of the criminogenic needs that were significantly predictive in the overall sample (e.g., Procriminal Attitudes, Employment/Education, Leisure/Recreation) were not so among the diagnostic subgroups, in part likely owing to decreased power from declining cell sizes, although the AUC magnitudes tended to remain fairly stable. Even within diagnostic groups that have inherent criminogenic potential (e.g., SUD), in some instances, the concordant need domain (e.g., drug and alcohol problems) still predicted outcome. The total score was the most robust and stable entity, however, representing the sum total of risk and need used to inform service intensity, risk reduction, and decisions concerning supervision and release. In all, the findings support the discrimination properties of the LSI-OR and the Central Eight, and thus the risk relevance of these domains in a correctional mental health sample, extending previous meta-analytic findings (Bonta et al., 2014; Olver et al., 2014).
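For readers less familiar with the discrimination index, the AUC from an ROC analysis can be read as the probability that a randomly selected recidivist has a higher score than a randomly selected nonrecidivist, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the input lists are hypothetical, and this is not necessarily how the AUCs in this study were calculated.

```python
def auc_pairwise(recidivist_scores, nonrecidivist_scores):
    """AUC as the proportion of recidivist/nonrecidivist pairs in which the
    recidivist has the higher score (ties count as 0.5)."""
    if not recidivist_scores or not nonrecidivist_scores:
        raise ValueError("both groups must contain at least one score")
    favorable = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                favorable += 1.0
            elif r == n:
                favorable += 0.5
    return favorable / (len(recidivist_scores) * len(nonrecidivist_scores))
```

An AUC of .50 reflects chance-level discrimination and 1.0 reflects perfect separation of the two groups, which is why AUC magnitudes can remain stable across subgroups even when smaller cell sizes reduce statistical power.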
In terms of calibration, use of logistic regression and computation of the E/O index demonstrated that LSI-OR observed and expected rates of general and violent recidivism corresponded well internally to the sample and among diagnostic subgroups. However, calibration was considerably poorer when examined relative to the LS/CMI normative sample. Relative to the 1-year recidivism rates of the LS/CMI normative sample of provincially incarcerated offenders (Andrews et al., 2004), the LSI-OR overestimated general recidivism for the three upper risk bands in the present sample, particularly for the Medium risk group. These issues will each be discussed in turn.
First, examination of individual trajectories of general recidivism as a function of increasing LSI-OR total score demonstrated similar rates of recidivism and absolute increases among the diagnostic subgroups, with an exception being individuals diagnosed with schizophrenia/psychosis; in this group, lower estimated rates of recidivism for high LSI-OR scores (i.e., 20+) were observed compared with the other diagnostic groups, likely given that schizophrenia diagnoses were associated with lower rates of general recidivism in the present sample. A few studies have reported a weak relationship between psychosis and criminal outcomes, such as violence, in offender samples (Junginger, Claypoole, Laygo, & Cristiani, 2006; see also Skeem, Kennealy, Monahan, Peterson, & Appelbaum, 2016). In a large forensic psychiatric sample, Rice and Harris (1992) compared individuals with and without a diagnosis of schizophrenia who were matched on other relevant variables (e.g., age, severity of index offense) and found that offenders with schizophrenia exhibited a significantly lower rate of criminal recidivism and nonsignificantly lower rates of violent recidivism compared with persons not diagnosed with schizophrenia. Moreover, the types of offenses committed by offenders with schizophrenia were less serious than those committed by the comparison group. Rice and Harris surmised that individuals without schizophrenia (i.e., the comparison group) were more likely to exhibit alcohol problems than those with schizophrenia, which may have contributed to their elevated rates of recidivism. Another potential explanation they advanced was that the schizophrenia patients received more intensive supervision following release from the institution, which affected their ability to reoffend. In the current context, discharged men do receive aftercare, and even when their sentence expires, the possibility remains that they receive continued services within their local health authority. Although information was not available for the current study regarding access to community services after release, it remains a distinct possibility accounting for lower rates of recidivism among persons diagnosed with schizophrenia/psychosis in the present sample.
Second, in terms of LSI-OR calibration relative to the normative sample, LSI-OR scores overestimated 1-year rates of recidivism in the study sample for all but the very lowest and highest scores. We attribute this to two possibilities. One possibility is that certain diagnoses mitigated risk for recidivism (e.g., schizophrenia, anxiety disorders) and that this contributed to lower observed and estimated recidivism rates. The normative LS/CMI sample is a general provincially incarcerated sample of offenders, most of whom would not have major mental health diagnoses aside from PD and SUD. A second possibility is that the study sample was also a treated sample of offenders. These men were referred for correctional mental health treatment services, which included addressing criminogenic needs through correctional programming as well as treatment of mental health symptomatology through medical and psychosocial interventions, all of which have the potential to mitigate risk and reduce recidivism. By contrast, there is little guarantee that the men in the normative sample received much in the way of correctional or mental health programming, and if they did this would likely be quite variable.
Clinical Implications of LSI-OR Use With Persons With Mental Illness
The present results support the predictive accuracy and hence the discrimination properties of LSI-OR scores, particularly for general recidivism, in a correctional mental health sample and among specific diagnostic subgroups, comorbidity notwithstanding. The available evidence demonstrates that there is adequate ability of LSI-OR scores to differentiate recidivists from nonrecidivists to inform service intensity and release decision making. Individual need domains evinced smaller-magnitude prediction of outcomes than the total score, but these too were predictive in the sample as a whole and would support use of the tool to inform prioritization of treatment needs for correctional programming. That the LSI-OR demonstrated significant but smaller-magnitude predictive accuracy for violence specifically is not unexpected and is consistent with prior research (Olver et al., 2014); it indicates that using a violence-specific risk tool in tandem with the LSI-OR in violence risk evaluations may be warranted.
The calibration findings demonstrated that successively higher rates of recidivism were observed with increasing LSI-OR scores, and that LSI-OR-generated estimates of general recidivism were quite consistent across diagnostic subgroups. However, the LSI-OR considerably overestimated rates of general recidivism in this mental health correctional sample, which raises the question of which recidivism estimates to use. The findings support use of the LSI-OR for risk classification, but suggest that rates of recidivism in a correctional mental health sample may be lower with respect to a given score or risk band than those seen in a general correctional sample. The disparity was particularly high in the Medium risk band but smaller in the High and Very High risk bands.
Strengths, Limitations, and Conclusions
Perhaps the most significant limitation of the current study was the lack of diagnostic data for the entire sample. As such, the cell sizes were too small among the diagnostic groups to conduct more extensive tests of calibration among these subgroups. This is offset by the availability of a sufficiently large sample size to permit use of logistic regression to model 1-year rates of recidivism and to examine and compare recidivism trajectories between the diagnostic subgroups. Further research may benefit from extending and replicating these calibration analyses in larger, diverse samples. Second, given that the DSM-IV-TR diagnoses were made by a single mental health professional, the generalizability of study findings pertaining to diagnostic subgroups elsewhere depends, at least in part, on the accuracy of these diagnoses. A third potential limitation was the short and variable follow-up time, which seemed to have the greatest impact in terms of power for analyses of violent recidivism, given its particularly low base rate in this sample. Fourth, although a comprehensive and reputable source of official recidivism data was employed, there are shortcomings inherent in the use of official records, such as unreported offending or attrition not detectable by the criminal justice system (e.g., civil commitment). It is also possible that some men relocated and reoffended in another province, and that such events were not captured by this provincially based tracking system. A final potential limitation is that the sample may not be representative of other correctional mental health samples, given that it was particularly high risk and need, had high rates of comorbidity, and received treatment and stabilization services. These shortcomings notwithstanding, the large sample size and magnitude of prediction permitted detecting several significant effects for violent recidivism, and general recidivism as a broad criterion appeared to be largely unaffected.
The administration of well-validated risk assessment tools is a cornerstone of best practices in offender rehabilitation. Offender rehabilitation models such as Risk–Need–Responsivity (RNR; Bonta & Andrews, 2017) underscore the importance of the identification and modification of criminogenic needs that when changed are associated with subsequent reductions in recidivism. Traditionally, the focus of assessment and treatment in correctional mental health samples has been directed at acute and untreated serious mental illness. However, research has increasingly shown that nonsubstance-related mental illness is not directly related to recidivism for the majority of correctional clients with mental illness (see Peterson, Skeem, Kennealy, Bray, & Zvonkovic, 2014), but rather that general risk factors identified among offenders in general have particular salience. Risk assessment measures such as the LS tools warrant consideration as part of a comprehensive assessment protocol with correctional clients diagnosed with mental illness to promote accurate classification and effective risk management strategies.
Footnotes
Authors’ Note:
The views and opinions here are those of the authors and do not necessarily reflect those of the University of Saskatchewan, the HOPE Program, or the Royal’s Institute of Mental Health Research.
