Abstract
Consistent risk category placement of criminal justice clients across instruments will improve the communication of risk. Efforts coordinated by the Council of State Governments (CSG) Justice Center led to the development of a principled (i.e., a system based on a given set of procedures) method of developing risk assessment levels. An established risk assessment instrument (Level of Service Inventory–Revised [LSI-R]) was used to assess the risk-level concordance of the CSG Justice Center Five-Level system. Specifically, concordance was assessed by matching the defining characteristics of the data set with its distribution qualities and by the level/category similarity between the observed reoffending base rate and the statistical probability of reoffending. Support for the CSG Justice Center Five-Level system was found through a probation data set (N = 24,936) having a greater proportion of offenders in the lower risk levels than a parole/community data set (N = 36,303). The statistical probabilities of reoffending in each CSG Justice Center system risk level had greater concordance to the observed Five-Level base rates than the base rates from the LSI-R original categories. The concordance evidence for the CSG Justice Center Five-Level system demonstrates the ability of this system to place clients in appropriate risk levels.
Deriving meaning from an assessment score or category is at the core of conducting assessments. With forensic and criminal justice assessments, the assigning of meaning to a risk assessment score or category has multiple benefits. First, clients can be placed in groups and then differentiated from other groups of clients (i.e., lower vs. higher categories of severe problems). Second, having meaningful categories allows for theory testing of crime-related etiologies (Krauss, Sales, Becker, & Figueredo, 2000). Third, meaningful categories can guide practice. A specific category can suggest optimal levels of supervision or interventions. The purpose of this article is to introduce a general and principle-based framework for assigning client risk levels that are independent of a specific risk instrument’s categories. In this study, the utility of the Council of State Governments (CSG) Justice Center Five-Level system is demonstrated through a low-risk data set having greater frequencies in the low risk levels and greater concordance between observed and predicted recidivism at each level of risk.
In an applied risk assessment setting, multiple instruments can be available (i.e., one for parole, one for probation), or multiple instruments required (i.e., one for violence risk, one for general risk), or multiple instruments administered to ensure that no risk areas are unattended. Each instrument having instrument-based categories gives rise to a problem. The interpretation of one instrument’s results can differ from other instruments that are applied to the same client. Accommodating these different risk category interpretations is a challenging issue, especially under cross-examination. As noted by Heilbrun, Marczyk, DeMatteo, and Mack-Allen (2007), different interpretations may even affect the perceived accuracy of the information, stating, “Consistency across sources makes it more likely that the agreed-upon information is accurate” (p. 53).
When multiple risk assessment instruments are administered to the same client, the risk category concordance among the instruments can be poor. In a study using five sexual risk assessment instruments, the top 25 percentile for an instrument was used to identify high-risk cases (Barbaree, Langton, & Peacock, 2006). Similarly, the bottom 25 percentile rank on an instrument was used to identify low-risk cases. Only 3% of the sample was placed in the high-risk category by all five instruments, and only 4% of the sample was placed in the low-risk category by all five instruments. Even between two instruments (Static-2002R, Hanson & Thornton, 2003; Static-99R, Hanson & Thornton, 2003) with similar developmental strategies and items, the percent agreement for the risk categories was 62.9% (Jung, Pham, & Ennis, 2013). Risk categories were derived using the same percentile rank method as with the Barbaree study (Barbaree et al., 2006), except that the boundary points were the 38th percentile for low/moderate and the 91st percentile for moderate/high.
This lack of concordance among instruments can have implications for accurate estimates of reoffending. In an applied setting, the accurate prediction of violence may be compromised when a practitioner uses two risk instruments that lack concordance. To illustrate, one study used a sample of mostly violent offenders to compare four standardized risk assessment instruments (Mills & Kroner, 2006). Concordance was measured by standardizing (Z scores) the scale scores of each instrument and calculating an average difference score among the cases. This average score of concordance was used to place the offenders into congruent (66% of scores) and incongruent (33% of scores) groups. For the incongruent group, the ability of individual instruments to predict general and violent reoffending dropped. This drop occurred even though the individual instruments had adequate predictive validities. In contrast to Barbaree et al. (2006) and Jung et al.’s (2013) studies examining categorical concordance as the outcome, this study showed that the outcome of predictive accuracy can be threatened when there is a lack of concordance among risk assessment instruments.
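The standardized-difference idea described above can be sketched in a few lines. This is only an illustration of the general technique, not the procedure used by Mills and Kroner (2006): the function names, the use of absolute pairwise differences, and the toy scores are all assumptions introduced here.

```python
from itertools import combinations
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scale scores (population SD; assumes scores are not all equal)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def discordance_per_case(instrument_scores):
    """Mean absolute z-score difference across all instrument pairs, per case.

    instrument_scores: dict mapping a (hypothetical) instrument name to a list
    of raw scores, with cases in the same order for every instrument.
    Averaging absolute pairwise differences is an assumption of this sketch.
    """
    z = {name: z_scores(vals) for name, vals in instrument_scores.items()}
    pairs = list(combinations(z, 2))
    n_cases = len(next(iter(z.values())))
    return [
        mean(abs(z[a][i] - z[b][i]) for a, b in pairs)
        for i in range(n_cases)
    ]


# Toy data: instrument B ranks the four cases in the opposite order to A.
toy = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1]}
print(discordance_per_case(toy))
```

Cases with larger values would fall toward the incongruent group; where to draw that line is a separate decision that the sketch does not address.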
Category Concordance
In addition to the lack of concordance among instruments, the category concordance between the developmental sample and subsequent applications can be weak (Bourgon, Mugford, Hanson, & Coligado, 2018). This can occur for multiple reasons: differences in sample base rates, differences in mean scores and distribution characteristics, and differences in the endorsed content within the categories. For example, administration of the Youth Level of Service/Case Management Inventory (YLS/CMI) in young Japanese offenders has shown almost no endorsement of substance abuse items, whereas in Canadian contexts, endorsement has been high (Takahashi, Mori, & Kroner, 2013). In this Japanese study, only 3.1% of the sample was placed into the original categories of “High” and “Very High.” Having the top two categories underrepresented occurred even though the aggregate predictive statistic (zero order correlation = .34) was similar to many Canadian studies (random effect size = .34, Olver, Stockdale, & Wormith, 2014, Table 12).
Category concordance can also be examined via predictive validities. Using the Sex Offender Risk Appraisal Guide (SORAG; Quinsey, Harris, Rice, & Cormier, 2006), the predictive validities for each of the nine categories were compared between Canadian and Austrian samples (Rettenberger, Rice, Harris, & Eher, 2017, Table 6, 7-year violence recidivism follow-up). For the Canadian sample, there was a progressive increase of recidivism across the categories, with no reversals between categories. In contrast, the Austrian sample’s Category 6 had a 43.8% rate of observed recidivism and Category 7 had a 33.3% rate of observed recidivism. With a 3-year violent recidivism follow-up period, the Austrian sample had similar rates of observed recidivism within three pairs of adjacent categories (Rettenberger et al., 2017, Table 3S): Categories 1 (3.0%) and 2 (2.0%), Categories 4 (12.7%) and 5 (12.2%), and Categories 6 (21.8%) and 7 (21.0%).
CSG Justice Center Five-Level and Category Concordance
In response to this lack of concordance among and within instruments, the CSG Justice Center developed a strategy to provide a framework for developing general risk categories. A series of convenings of criminal justice researchers, managers, and practitioners between 2014 and 2016 resulted in a consensus on an underlying dimensional conceptualization of risk. Much of the discussion was informed by the Risk, Needs, and Responsivity (RNR) model (Andrews & Bonta, 2010). The decision for five levels was based on discussions of matching risk/need levels to appropriate supervision and services, accounting for considerations among the criminal justice researchers, managers, and practitioners (see Appendix). Initial discussions suggested between two and eleven levels, with serious consideration given to three, four, and five levels (CSG Justice Center, 2014).
Similar to the CSG Justice Center Five-Level system, statistically based methods have found the optimal number of categories to be between five and nine (Mills, Jones, & Kroner, 2005). The method used in that study was to determine the probability of reoffense for each category, and then change the boundary scores of the categories (and the number of categories) to maximize predictability. The potential category combinations were constrained by three parameters: (a) each category contained at least 10% of the possible range of scores, (b) categories contained at least 10% of the participants, and (c) each successive higher category was associated with an increased likelihood for reoffense. In the Mills study, five optimal categories were found for the Violence Risk Appraisal Guide (VRAG) and nine for the Level of Service Inventory–Revised (LSI-R). The disadvantage to this approach is that the optimal number of categories is largely influenced by a specific risk assessment instrument.
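The three constraints (a) to (c) can be expressed as a simple validity check on a candidate set of categories. The sketch below is a hypothetical illustration, not the search procedure from Mills et al. (2005): the function name, the inclusive score ranges, and counting the score range in whole score points are assumptions made here.

```python
def satisfies_constraints(categories, scores, reoffended, score_min, score_max):
    """Check a candidate category scheme against constraints (a)-(c).

    categories: list of (low, high) inclusive score ranges, ordered low to high.
    scores, reoffended: parallel lists (reoffended is 0 or 1 per participant).
    score_min, score_max: possible range of the instrument's total score.
    """
    span = score_max - score_min + 1   # assumption: range counted in score points
    n = len(scores)
    previous_rate = -1.0
    for low, high in categories:
        members = [r for s, r in zip(scores, reoffended) if low <= s <= high]
        if not members:
            return False
        if (high - low + 1) < 0.10 * span:   # (a) at least 10% of the score range
            return False
        if len(members) < 0.10 * n:          # (b) at least 10% of participants
            return False
        rate = sum(members) / len(members)
        if rate <= previous_rate:            # (c) reoffense rate must increase
            return False
        previous_rate = rate
    return True


# Toy data: 20 participants with scores 0-19; outcomes are made up.
toy_scores = list(range(20))
toy_outcomes = [1 if s == 3 or s >= 15 else 0 for s in toy_scores]
print(satisfies_constraints([(0, 9), (10, 19)], toy_scores, toy_outcomes, 0, 19))
```

A search over candidate boundary scores would call a check like this for every candidate and keep only the schemes that pass, before comparing their predictive accuracy.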
The CSG Justice Center Five-Level system has multiple steps for developing five risk levels (details in the “Method” section) that are not scale-specific. Any correctional risk assessment scale with nine or more items and moderate predictive accuracy (area under the curve [AUC] ~.70) can be formulated into the CSG Justice Center Five-Level system. Conceptually, the CSG Justice Center Five-Level system has been applied to sex offender measures (Hanson, Bourgon et al., 2017; Olver et al., 2018). There are two defining qualities that contribute to this being a principled approach. First, the CSG Justice Center Five-Level system is based on a set of given procedures involving normative and criterion-referenced psychometric principles. Second, this approach can be applied to most risk assessment instruments. This article uses the LSI-R to examine level/category concordance between two systems (CSG Justice Center Five-Level vs. LSI-R original categories).
Current Study
Given that the CSG Justice Center Five-Level system is a principled approach for developing risk levels, unique data sets should have different frequency distributions (i.e., less concordance) across the levels. For example, if the CSG Justice Center Five-Level system is applied to a high-risk data set, it would be expected that there would be fewer Level I cases (lower risk) than if the system is applied to a probation data set (i.e., expected greater number of Level I cases). This would demonstrate a higher level of concordance with the defining characteristics of the data set. A probation data set, by definition (i.e., overall lower risk cases), would have more Level I offenders.
To assess the level of concordance, two data sets are compared. Participants in the probation data set required less criminal justice involvement with certain types of offenses, and therefore were lower risk offenders. These offenses typically required anything from no or limited involvement (i.e., fines) to limited supervision (i.e., once a month for 6 months). Using this data set to develop the five levels, it would be expected that relatively more cases would be in Levels I and II than in the parole/community data set, which would include prison offenders that have supervision requirements. In addition to fewer Level I and Level II cases, a parole/community data set would have a greater proportion of cases assigned to the higher levels.
Research Question 1
Compared with a parole/community data set, the percentage of cases in Levels I and II should be higher in the current probation data set. The percentage of cases in the upper levels should be higher in the parole/community data set. Thus, it is anticipated that there will be a matching of the defining characteristics of the data set with its distributional qualities.
Research Question 2
The second strategy is to assess the concordance between the CSG Justice Center Five-Level system’s statistical probability of reoffending and the observed reoffending base rates. This comparison reflects the calibration of the levels/categories, which is defined as the proportion of similar offenders that are expected to recidivate (Hanson, 2017). Greater concordance is expected between the CSG Justice Center Five-Level statistical probability of reoffending and the CSG Justice Center Five-Level observed reoffending base rates for each level than between the CSG Justice Center Five-Level statistical probability of reoffending and the LSI-R original category observed reoffending base rates. The poorer level of concordance with the LSI-R original categories is based upon the LSI-R not directly considering base rates (statistical probability or observed) in determining the original categories.
The closer match between the statistical probability of reoffending and the observed reoffending base rates for each level reflects better calibration. These calculations use aggregate-type statistics (i.e., odds ratios) across the range of scale scores. From these statistics, five levels are created (computation described in the “Method” section). Predetermined statistical parameters from the calculation of the predicted statistical probability of reoffending are used for the boundary scores, which define the five levels. Concordance between the statistical probability of reoffending and the observed reoffending base rates will vary due to the statistical parameters (predictor indicators) that use the aggregated scale scores (LSI-R) in their calculations.
The original five categories from the LSI-R manual did not use statistical parameters or observed reoffending base rates in deriving the boundary scores. The LSI-R category boundary scores were based on distribution characteristics of the normative sample. Given that these boundary scores are not derived from reoffending rates, the difference between the statistical probability of reoffending means and the observed reoffending base rate means for each level (CSG Justice Center Five-Level system) should be smaller than the difference between the LSI-R original five-category base rate means and the CSG Justice Center Five-Level system base rate means. The greater concordance between the statistical probability of reoffending and observed reoffending base rate would provide support for the CSG Justice Center Five-Level system with applied data. The intent of using the LSI-R is to illustrate application of the CSG Justice Center Five-Level system; our intent is not to develop new categories for the LSI-R.
Method
Participants
The Kansas Probation Dataset consisted of 24,936 probationers from the state of Kansas between 2005 and 2015. Administration of the LSI-R began in 2003 but was not fully implemented until 2005. Of note, the cases during these 2 years did not have an outcome recorded. These cases were dropped, reducing this data set to 24,936. There were no significant differences between the current reduced data set and the original data set on most demographic characteristics. This increases the confidence that the reduced data set was not unduly biased and is representative of the total data set.
The mean age was 32.0 (SD = 11.2) years. Racial composition was 22.1% (n = 5,505) African American, 75.2% (n = 15,821) Caucasian, 1.5% (n = 378) American Indian/Alaskan Native, 0.8% (n = 206) Asian/Pacific Islander, and 0.4% (n = 90) Other. Ethnic composition was 11.8% (n = 2,936) Hispanic descent. Marital status was 60.1% (n = 14,985) single, 14.0% (n = 3,495) married, 3.1% (n = 783) separated, 12.8% (n = 3,190) divorced, 0.9% (n = 225) widowed, and 9.1% (n = 2,258) not reported. Types of index offenses were 44.2% (n = 11,020) drug, 19.1% (n = 4,756) nonsex person, 12.2% (n = 3,039) property, 2.6% (n = 654) sex, 12.3% (n = 3,078) other, and 9.6% (n = 2,389) not recorded. Gender was recorded for only 2,651 participants, of which 14.0% (n = 371) were female. The LSI-R was administered for probation sentencing purposes and was not considered until mid-2014 for prison sentencing procedures. With the follow-up period being 2 years, there were no cases that had the LSI-R administered for prison sentencing purposes. Also, Kansas is a sentencing guideline state, and therefore criminal history affects who is on probation and who is sent to prison. This would further reinforce the defining characteristics of this data set being lower risk.
The LSI-R Multi-State Parole/Community Dataset (N = 36,303) was used to compare to the current Kansas Probation Dataset. These data were gathered from Colorado, Oklahoma, Indiana, Vermont, and Washington (a part of the LS/CMI normative data set) from offenders living in the community or halfway houses. Other than the LSI-R scores, no other information was available for this data set.
Measures
The LSI-R consists of 54 items (total score range from 0 to 54) rationally grouped into 10 subscales: Criminal History, Education/Employment, Finances, Family/Marital, Accommodations, Leisure/Recreation, Companions, Alcohol/Drug, Emotional/Personal, and Attitude/Orientation. Internal consistency estimates (i.e., Cronbach’s alpha coefficients) for the total score range from .64 to .90 (Andrews & Bonta, 2001). Validity studies indicate that elevated LSI-R scores of offenders residing in halfway houses are indicative of parole violations and return to prison (Vose, Cullen, & Smith, 2008). Meta-analyses have indicated that the LSI-R total score is related to the likelihood of reoffending (Olver et al., 2014).
The LSI-R provides a criminal risk score based on personal history and social interactions. Andrews and Bonta (2010) suggested that the four areas of antisocial cognitions, antisocial associates, history of antisocial behavior, and antisocial personality patterns are the prominent areas responsible for criminal behavior. Each of these areas is covered in the LSI-R. This instrument was primarily developed with probationers and offenders with sentences less than 2 years, to aid in determining the level of supervision upon release.
Regarding the outcome measures, “observed reoffending base rate” referred to the actual base rate of reoffending. Reoffending was defined as a new conviction for any offense within 2 years of follow-up. “Statistical probability of reoffending” referred to the logistic regression logit (using the LSI-R as the predictor) transformed into a probability (Helmus & Hanson, 2011).
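The transformation from a logistic regression logit to a probability can be sketched as follows. The intercept and slope are hypothetical values chosen only for illustration; they are not the coefficients estimated from the Kansas Probation Dataset, and the function name is an assumption of this sketch.

```python
import math

# Hypothetical coefficients for illustration only (NOT the study's estimates).
B0 = -3.95   # intercept: log-odds of reoffending at an LSI-R total score of 0
B1 = 0.089   # slope: change in log-odds per one-point increase in total score


def reoffending_probability(score, b0=B0, b1=B1):
    """Convert a risk score to a predicted probability via the logistic function."""
    logit = b0 + b1 * score
    return 1.0 / (1.0 + math.exp(-logit))


# Predicted probabilities across the possible LSI-R range (0 to 54).
for s in (0, 18, 36, 54):
    print(s, round(reoffending_probability(s), 3))
```

In practice the coefficients would come from fitting a logistic regression of the 2-year reoffending outcome on the LSI-R total score.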
Calculation of the CSG Justice Center Five-Level system
Three steps are required to develop the CSG Justice Center Five-Level system (Table 1, also see Babchishin, Kroner, & Hanson, 2017). The first step is to identify the median score of the risk assessment instrument for the data set. This is the midpoint of Level III. The second step is to calculate the boundary scores that separate Level III from Level II below and Level IV above. To accomplish this, odds ratios are required. The odds ratio is the odds of reoffending in a high-risk group (number in the offending group [base rate] divided by number in the nonoffending group) divided by the odds of reoffending in a low-risk group (number in the offending group [base rate] divided by number in the nonoffending group). Using the odds ratio statistic, we identify the ratio that is associated with the expected reduction effect of treatment following the RNR model. Integrating this expected reduction effect of treatment into risk levels is a practical expression of the Risk principle. Based on meta-analyses of treatment outcome studies (e.g., Andrews et al., 1990; Hanson, Bourgon, Helmus, & Hodgson, 2009), the average effect of correctional treatment in real world settings can be expressed as r = .10, d = .20, or odds ratios of 0.70/1.43. The calculations below used the Kansas Probation Dataset. The median scale score of the LSI-R (raw score of 18) is associated with the odds ratio of “1.0.” The boundary scores for the third (Level III) level are defined by the average effect of treatment (between odds ratio of 0.70 and 1.43). The lower boundary score, defining the Level II/III boundary, is an odds ratio of 0.70 (raw LSI-R score of 14). The upper boundary score defining the Level III/IV boundary is an odds ratio of 1.43 (raw LSI-R score of 22). Specifically, the scores for the third risk level range from the LSI-R total score associated with an odds ratio of 0.70 to the total score associated with an odds ratio of 1.43.
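One way to read the second step is through a logistic regression slope: relative to the median score, each score has an odds ratio of exp(slope × (score − median)), so the Level III boundaries are the scores at which that ratio reaches 0.70 and 1.43. The sketch below assumes this reading. The median of 18 is from the text, but the slope is a hypothetical value picked so the toy output lands near the reported boundary scores; it is not the study's estimate, and the function names are illustrative.

```python
import math

MEDIAN = 18   # median LSI-R score reported for the Kansas Probation Dataset
B1 = 0.089    # hypothetical logistic slope (assumption, not from the study)


def odds_ratio_vs_median(score, median=MEDIAN, b1=B1):
    """Odds ratio of a score relative to the median under a logistic model."""
    return math.exp(b1 * (score - median))


def score_at_odds_ratio(target_or, median=MEDIAN, b1=B1):
    """Score (possibly fractional) whose odds ratio vs. the median is target_or."""
    return median + math.log(target_or) / b1


lower = score_at_odds_ratio(0.70)   # Level II/III boundary
upper = score_at_odds_ratio(1.43)   # Level III/IV boundary
print(round(lower, 2), round(upper, 2))

# 0.70 and 1.43 are (approximately) reciprocals: the same effect size
# expressed as a reduction and as an increase in the odds.
print(round(1 / 1.43, 2))
```

With this illustrative slope the boundaries fall close to raw scores of 14 and 22; a different slope would move both boundaries symmetrically on the log-odds scale around the median.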
Table 1. Three Steps Are Required to Develop the CSG Justice Center Five-Level System.
Note. Samples of
The third step is to calculate the boundary scores between the first (Level I) and second (Level II) levels and the fourth (Level IV) and fifth (Level V) levels. To accomplish this, statistical probabilities of reoffending (predicted probabilities) are used (Helmus & Hanson, 2011). The statistical probability of reoffending is derived from logistic regression modeling (2-year follow-up), for which the LSI-R total scores were used. The statistical probability of reoffending of ≤5% (raw LSI-R score of 11) provides the Level-I/II threshold. The lowest raw score of the LSI-R of “0” defined the lowest point of Level I. The statistical probability of reoffending of ≥85% (raw score of 50; there were no raw LSI-R scores above 50, and thus no cases in Level V for the Kansas Probation Dataset) provides the Level-IV/V threshold (Babchishin et al., 2017).
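Putting the three steps together, level assignment can be sketched as a single function. The thresholds (probability ≤5% and ≥85%, odds ratios of 0.70 and 1.43) and the median of 18 come from the text; the logistic coefficients are hypothetical values chosen so that the toy example roughly echoes the reported cut scores, and the order in which the rules are checked is an assumption of this sketch.

```python
import math

MEDIAN = 18   # median LSI-R score from the text
B0 = -3.95    # hypothetical intercept (assumption, not the study's estimate)
B1 = 0.089    # hypothetical slope (assumption, not the study's estimate)


def assign_level(score, median=MEDIAN, b0=B0, b1=B1):
    """Assign a five-level label to a raw score under an assumed logistic model."""
    probability = 1.0 / (1.0 + math.exp(-(b0 + b1 * score)))
    odds_ratio = math.exp(b1 * (score - median))
    if probability <= 0.05:    # Level I/II threshold
        return "I"
    if probability >= 0.85:    # Level IV/V threshold
        return "V"
    if odds_ratio < 0.70:      # below the Level II/III boundary
        return "II"
    if odds_ratio <= 1.43:     # within the average treatment effect of the median
        return "III"
    return "IV"


for s in (0, 11, 12, 14, 18, 22, 23, 54):
    print(s, assign_level(s))
```

Under these illustrative coefficients no score in the 0 to 54 range reaches the 85% threshold, so Level V stays empty, which is the kind of situation that motivates splitting Level IV.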
Using a lower risk data set for this study, Level IV was split into IVa and IVb levels. The threshold was an odds ratio of 1.5 for Level IVb over Level IVa, which is a technique used by Hanson, Babchishin, Helmus, Thornton, and Phenix (2017). Their argument was that there were no Level-V sex offenders. Splitting Level IV would create greater distinction at the upper risk levels. The use of an odds ratio for the Level IVa and Level IVb distinction is in keeping with a principled approach to developing risk levels and would allow for continuity among different risk assessment instruments. Similar to the Hanson strategy with too few Level-V offenders, this study with the Kansas Probation Dataset had no Level-V probationers.
Results
Applying the CSG Justice Center Five-Level system to specific data sets should result in lower or higher frequencies at each level according to the nature of the data set (i.e., high-risk data set has greater frequency in Level V). In Table 2, the Kansas Probation Dataset, which by criminal justice definitions would constitute a lower risk data set, had relatively greater frequencies in the lower risk levels. The LSI-R Multi-State Parole/Community Dataset had more than double the proportion of cases in Level IV (71.3% vs. 31.8%). The same trend occurred with Level IVa (31.3% vs. 23.2%) and Level IVb (40.0% vs. 8.6%). The inverse trend occurred at the lower risk levels (Level I, 6.7% LSI-R Multi-State Parole/Community Dataset vs. 18.9% Kansas Probation Dataset; Level II, 3.0% LSI-R Multi-State Parole/Community Dataset vs. 9.5% Kansas Probation Dataset). These expected frequency increases (upper levels of a higher risk data set) and frequency decreases (lower levels of a higher risk data set) lend support for the CSG Justice Center Five-Level system with applied data and provide support for Research Question 1.
Table 2. Percentage of Offenders and ROC at Each Level for the Kansas Probation Dataset (N = 24,936) and LSI-R Multi-State Parole/Community Dataset (N = 36,303).
Note. Levels are based on the Kansas Probation Dataset. The IVa- and IVb-level percentages are based on the total samples. ROC = receiver operating characteristic; LSI-R = Level of Service Inventory–Revised.
Table 3 contains the percentage of offenders according to the original LSI-R categories. Approximately 70% of the sample (Categories 1 and 2) in Table 3 (Kansas Probation column) is differentiated by two categories, whereas in Table 2 (Kansas Probation column), approximately 70% of the sample is differentiated by three levels. Thus, even with stronger receiver operating characteristics (ROCs), the majority of lower and mid-risk range cases in the Five-Level system will have greater differentiation for the targeted population.
Table 3. Percentage of Offenders and ROC at Each Original LSI-R Category for the Kansas Probation Dataset (N = 24,936) and LSI-R Multi-State Parole/Community Dataset (N = 36,303).
Note. Categories are based on the LSI-R normative sample. ROC = receiver operating characteristic; LSI-R = Level of Service Inventory–Revised.
Greater level concordance was expected between the CSG Justice Center Five-Level statistical probability of reoffending and the observed reoffending base rate, than between the CSG Justice Center Five-Level statistical probability of reoffending and the LSI-R observed base rate means. Using the Kansas Probation Dataset, the corresponding observed base rates for the LSI-R original categories were calculated (Table 4, far-right column) for each level. For Levels II, III, and IVa, the CSG Justice Center Five-Level system’s observed reoffending base rates were relatively close to the statistical probability of reoffending (Table 4, bold columns). The Level-IV statistical probability of reoffending was closer to the corresponding LSI-R categories, but the breakdown of this level showed that the Level-IVa statistical probability of reoffending mean was similar to the corresponding Level IVa observed reoffending base rate mean. The intra-class correlation (Dunn, 1989) between the CSG Justice Center Five-Level observed reoffending base rate and the statistical probability of reoffending (Table 4, bold columns) was .816. The intra-class correlation between the CSG Justice Center Five-Level statistical probability of reoffending and the LSI-R original categories’ observed reoffending base rate was .743. The intra-class correlation was .797 between the LSI-R original category statistical probability of reoffending and the LSI-R original categories’ observed reoffending base rate. Overall, these results provide support for slightly greater concordance (Research Question 2) between the CSG Justice Center Five-Level statistical probability of reoffending and the CSG Justice Center Five-Level system’s observed reoffending base rate, than between the CSG Justice Center Five-Level statistical probability of reoffending and the LSI-R original categories’ observed reoffending base rate.
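As a rough illustration of how an intra-class correlation summarizes agreement between predicted and observed rates, the sketch below computes a one-way random-effects ICC for paired values. The text does not state which ICC form was used (it cites Dunn, 1989), so the one-way form is an assumption here, and the paired values are made-up toy numbers rather than the Table 4 data.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for n targets with k = 2 measurements each.

    pairs: list of (predicted, observed) tuples, one per risk level.
    The choice of the one-way form is an assumption of this sketch.
    """
    n, k = len(pairs), 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    # Between-level and within-level mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, row_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy (predicted, observed) rates per level; NOT values from the study.
toy_pairs = [(0.04, 0.05), (0.08, 0.07), (0.12, 0.12), (0.22, 0.20)]
print(round(icc_oneway(toy_pairs), 3))
```

Values near 1 indicate that predicted and observed rates nearly coincide at every level, while systematic gaps between the two columns pull the coefficient down.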
Table 4. Observed Reoffending Base Rate and Statistical Probability of Reoffending for the CSG Justice Center Five-Level System and the LSI-R Original Categories (Kansas Probation Dataset, N = 24,936).
Note. Numbers in () are the LSI-R raw score ranges for the CSG Justice Center Five Levels. Numbers in [] are the LSI-R raw score ranges for the LSI-R original categories. Numbers in {} are the range of the statistical probability of reoffending at each level. Greater level concordance is expected between the two bold columns (CSG Justice Center Five-Level Observed Base Rate and Statistical Probability of Reoffending) than between CSG Justice Center Five-Level Probability of Reoffending and LSI-R observed base rate. The observed base rate is based on reoffending, defined as a new conviction for any offense within 2 years of follow-up. LSI-R original categories boundary points are based on distributional characteristics of the LSI-R normative data set. The Kansas Probation Dataset observed reoffending base rate is 11%. CSG = Council of State Governments; LSI-R = Level of Service Inventory–Revised.
Discussion
Support for the CSG Justice Center Five-Level system was accomplished through the conceptual concordance between two differing risk-level data sets and their respective frequency distributions. The anticipated differences were found, with the lower risk data set (Kansas Probation Dataset, N = 24,936) having more participants in the lower risk levels than the higher risk data set (LSI-R Multi-State Parole/Community Dataset, N = 36,303). Second, a stronger relationship was demonstrated between the probabilities of reoffending and observed base rate for each CSG Justice Center system risk level than the same concordances with the LSI-R original categories. These two types of concordance suggest that the CSG Justice Center Five-Level system can place clients in appropriate risk levels.
The present results highlight two ways of calculating risk categories. The LSI-R original categories used a norm-referenced approach, whereas the CSG Justice Center Five-Level system used norm-referenced and criterion-referenced procedures. Comparing these two ways of calculating risk categories demonstrates the benefits of a combined norm-referenced and criterion-referenced procedure. One benefit of this combined procedure is that the statistical probability of reoffending calculations for each of the five levels are closer to the observed level base rates (Table 4, bold columns) than the base rates from the LSI-R original categories. This comparison could be confounded to some extent, because the observed reoffending base rates assisted in the calculation of the five levels. Even so, the predicted probabilities and observed rates were closer.
A second benefit of a combined norm-referenced and criterion-referenced procedure is the reduced likelihood of one level having a greater observed base rate than the next higher level. The LSI-R original categories of 4 and 5 had a reversal of base rates. Category 4 had a greater base rate (32.0%) than Category 5 (29.3%; Table 4, column 5). With the Five-Level system, the criterion-referenced component assists in precluding such a reversal. Level IVa had an observed base rate of 19.9%, and Level IVb had an observed base rate of 29.2%. The likelihood of such a base rate reversal among categories increases without a principled approach to determining risk categories. A similar reversal has previously been demonstrated when an instrument’s original categories are applied to a different sample (see Mills et al., 2005, Table 2). Application of the CSG Justice Center Five-Level system will help to reduce level reversals of observed base rates.
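A base rate reversal is simply a place where the observed rate drops from one category to the next higher one, so it can be detected with a short monotonicity check. The sketch below uses only the four base rates quoted in the paragraph above; the function name is an illustrative assumption.

```python
def find_reversals(base_rates):
    """Return adjacent (lower, higher) category pairs whose base rate decreases.

    base_rates: dict mapping category labels, ordered from lowest to highest
    risk, to observed reoffending base rates (in percent).
    """
    labels = list(base_rates)
    return [
        (labels[i], labels[i + 1])
        for i in range(len(labels) - 1)
        if base_rates[labels[i + 1]] < base_rates[labels[i]]
    ]


# Base rates quoted in the text for the top LSI-R original categories
# and for the split Level IV of the Five-Level system.
print(find_reversals({"Category 4": 32.0, "Category 5": 29.3}))
print(find_reversals({"Level IVa": 19.9, "Level IVb": 29.2}))
```

The first call flags the Category 4 to Category 5 pair, and the second returns an empty list because the rate rises from Level IVa to Level IVb.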
Incorporating a principled approach (i.e., a system based on a given set of procedures) to determining risk levels will also reduce the inconsistent assignment of risk among risk assessment instruments. Commenting on the application of risk assessment instruments, Heilbrun et al. (2007) have noted that consistency of risk-level assignment will increase the perceived accuracy of the assessment. This could be partially based on the finding that consistent assignment of risk assessment instruments results in better statistical outcome prediction (Mills & Kroner, 2006). It is also argued that risk communication should be consistent across groups (i.e., probation officers, psychologists; Heilbrun, Newsham, & Pietruszka, 2016), but even within a single group (forensic clinicians) there can be substantial disagreement (Hilton, Carter, Harris, & Sharpe, 2008). The CSG Justice Center Five-Level system attempts to increase both the consistency of level assignment and the consistency of the meaning involved with level assignment. Thus, even when certain samples have no Level V cases, the use of Levels IVa and IVb would not drift into denoting Level IVb as the “high/highest risk” level.
Correct classification of criminal justice clients has had a long tradition (Austin, 1986; Clements, 1981). Correct placement into appropriate risk levels has implications for clients changing risk levels. As noted in previous research, the percentage of change among risk levels is predictive of recidivism (Cohen, Lowenkamp, & VanBenschoten, 2016; Vose, Lowenkamp, Smith, & Cullen, 2009). More importantly, in the Vose et al. study, the interaction between the percentage of change and the risk category at the first assessment was predictive of recidivism. Inappropriate placement into risk categories may restrict the occurrence of this interaction, which suggests a lower acknowledgment of clients cascading down over time. In fact, this interaction did not occur with the female clients in the Vose et al. study. In a two-jurisdiction study, the comparison of the same scale across two jurisdictions produced statistical concordance among risk categories, but sufficient lack of concordance that could result in guidelines for clinical practice (Rettenberger et al., 2017). To compensate for this situation, the authors argue for the necessity of local risk instrument norms. Alternatively, using a principled approach may reduce the low concordance among risk categories between jurisdictions. Application of the CSG Justice Center Five-Level system will assist in greater concordance among risk assessment instruments and allow professionals, stakeholders, and clients to better understand the risk of reoffending and its application.
Limitations
First, in this study, concordances served as an indicator of correct classification, which is only one type of support for the CSG Justice Center Five-Level system. Second, we did not have access to content scale or item-level data. Distinct profiles may occur at each level. Profile differences due to item scatter or domain elevations would have intervention implications. Using a latent class analysis, Taxman and Caudy (2015) found that four distinct profiles of criminogenic needs best accounted for their data (N = 17,252). The profile capturing high needs (i.e., substance abuse, criminal peers) and high destabilizers (i.e., employment, housing) had a 24.7% 2-year recidivism rate in the lowest risk category (Taxman & Caudy, 2015, Table 8). A traditional risk approach would exclude this group from intervention (see Reich, Picard-Fritsche, & Rempel, 2018, for a similar conclusion). Finally, the main requirement in applying the CSG Justice Center Five-Level system is a large normative sample. The identification of the median (i.e., the middle of the risk distribution) from such a sample is the beginning point of the calculations. Optimally, other data sets could then be compared with the normative data set to determine differences. This would allow for some definitive conclusions regarding LSI-R cut points. In this article, we compared only two data sets, which precludes commenting on specific cut points. Nevertheless, because of the characteristics of the two data sets, their comparison allowed for the conclusion that the CSG Justice Center Five-Level system appears to be viable.
Conclusion and Application
The CSG Justice Center Five-Level system is a principled approach that can be applied to most risk assessment instruments within criminal justice and forensic settings. This article used the LSI-R as an example, but other instruments used to predict criminal justice outcomes could also benefit from the CSG Justice Center Five-Level system. Making principled distinctions among levels will promote a higher level of concordance in the application of risk-related classification and decision rules.
Appendix
CSG Justice Center Five-Level System Table.
| Level | Criminogenic needs | Predicted 2-year reoffending rate without intervention |
|---|---|---|
| I | None or few. If any, mild and/or transitory. | 5% or less. |
| II | A few. Some mild and transitory, or possibly acute. | Higher than 5% but lower than III. |
| III | Multiple. Some severe. | Average: similar to the reoffending rate of offenders in the middle of the risk distribution. |
| IV | Multiple. Some chronic and severe. | Higher than average but lower than V. |
| V | Multiple. Chronic, severe, and entrenched, likely across psychological, interpersonal, and lifestyle domains. | 85% or greater. |
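As an illustration only, the reoffending-rate column of the table can be sketched as a simple decision rule in Python. The 5% and 85% anchors come from the table. The function name `assign_level`, the `average_rate` argument (the reoffending rate of clients in the middle of the normative risk distribution), and the `band` half-width that operationalizes "similar to average" are assumptions introduced here, because the table does not give numeric boundaries for Levels II through IV. The criminogenic needs column is not modeled.

```python
def assign_level(predicted_rate, average_rate, band=0.05):
    """Map a predicted 2-year reoffending probability to a level (I to V).

    predicted_rate: probability of reoffending without intervention (0 to 1).
    average_rate:   reoffending rate in the middle of the normative risk
                    distribution (supplied by the user; hypothetical input).
    band:           hypothetical half-width around average_rate that counts
                    as "similar to average" (Level III). Not from the table.
    """
    if not 0.0 <= predicted_rate <= 1.0:
        raise ValueError("predicted_rate must be between 0 and 1")
    if not 0.05 < average_rate < 0.85:
        raise ValueError("average_rate must lie between the I and V anchors")

    # Anchors taken directly from the table.
    if predicted_rate <= 0.05:
        return "I"
    if predicted_rate >= 0.85:
        return "V"

    # Assumed boundaries for the middle levels.
    if predicted_rate < average_rate - band:
        return "II"   # higher than 5% but lower than III
    if predicted_rate <= average_rate + band:
        return "III"  # similar to the average rate
    return "IV"       # higher than average but lower than V


if __name__ == "__main__":
    # Hypothetical normative average of 40%, chosen only for illustration.
    for rate in (0.03, 0.20, 0.42, 0.60, 0.90):
        print(f"{rate:.2f} -> Level {assign_level(rate, average_rate=0.40)}")
```

In practice, the width of the Level III band would need to be justified from a large normative sample rather than fixed at an arbitrary value, consistent with the limitation noted above regarding cut points.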
Acknowledgements
The authors thank Bree Derrick for her assistance with the data set.
Authors’ Note
Evan M. Lowder is now affiliated with George Mason University, Fairfax, USA.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
