Abstract
The assessment of inmate risk and need in prison poses a unique challenge to correctional policy makers because it is used for two purposes: classification and case management. Classification and case management require assessment instruments that are designed to predict two separate outcomes: institutional misconduct and community recidivism. The current research examines differences between a prison classification instrument developed to predict misconduct and a case management instrument developed to predict community recidivism using a sample of 414 inmates in Ohio. The results indicated substantial differences between assessment instruments and that separate risk and needs assessments should be conducted. A hybrid assessment system is suggested that seeks to maximize accuracy and efficiency by including select factors from each instrument.
The assessment of inmate risk and need plays a vital role in the management of correctional populations. By categorizing offenders into groups who have low, moderate, and high probabilities to offend, risk assessment instruments allow correctional administrators to efficiently allocate custody resources (Austin & Hardyman, 2004). Similarly, the assessment of criminogenic needs provides case managers with the capability to refer inmates to programs that are designed to reduce the likelihood of recidivism on release (Jones, 1996). The accurate assessment of the likelihood of institutional misconduct and community recidivism is important to public policy not only because it encourages the effective control of potentially dangerous inmates in prison but also because it encourages the successful use of programming to discourage recidivism on release to the community.
Although early first-generation assessments of risk and need were conducted using a clinical approach, research on risk assessment has found that actuarial approaches are much more likely to accurately classify offenders and identify criminogenic needs (Grove, Zald, Lebow, Snitz, & Nelson, 2000; Jones, 1996). Second-generation risk assessments, commonly used in prison to determine custody level, are composed of static items that primarily measure criminal history (Andrews & Bonta, 2010). Third-generation risk assessments measure static and dynamic factors and are suggested to provide a more accurate measure of risk to engage in antisocial behavior while at the same time identifying criminogenic needs (Andrews & Bonta, 2010; see also Bonta & Motiuk, 1992). Still, the use of third-generation risk assessment instruments in prison assumes that these instruments can validly predict institutional misconduct while at the same time identify criminogenic needs that are related to recidivism on release to the community. Thus, an important research question in the area of risk and needs assessment in prison is whether assessment instruments designed to measure criminogenic needs that are related to recidivism in the community can also be used to assess the likelihood of institutional misconduct (Weinrath & Coles, 2003). To do so, an assessment instrument must have predictive validity for institutional misconduct and community recidivism.
To answer the research question above, the current study compares two assessment instruments that were constructed based on separate outcomes: institutional misconduct and community recidivism. This is accomplished using a sample of male and female inmates from Ohio who were interviewed at prison intake and subsequently released into the community. After an examination of the predictive validity of the instruments, two reduced versions of these instruments are constructed and incorporated into a risk assessment system that provides an accurate and efficient means to assess inmate risk and criminogenic needs.
Primary Assessment Concerns in Prison
The assessment of offender risk and needs in prison is concerned with two major issues: security and programming. Effective assessment systems should help correctional administrators accurately estimate the likelihood of offending behavior as well as efficiently allocate treatment resources. As noted by Andrews, Bonta, and Hoge (1990), the principles of effective classification suggest that prison administration should utilize assessment instruments that (a) categorize inmates by their likelihood of engaging in offending behavior and (b) identify criminogenic needs related to recidivism that can be addressed with programming.
The Risk Principle of Effective Classification
Andrews et al. (1990) referred to the first issue as the risk principle of effective classification. This principle stresses the importance of understanding an offender’s risk to engage in antisocial behavior so that security and treatment resources can be allocated to those who pose the greatest risk to misbehave. In prison, the risk principle is best understood as concerned with two types of risk. First, the risk that the inmate poses to engage in institutional misconduct is of primary importance when applying the risk principle to allocate security resources. Thus, risk assessment in prison focuses on measuring the risk to engage in institutional misconduct to determine the appropriate custody level (Austin & Hardyman, 2004).
The second type of risk is concerned with the allocation of treatment resources. As Andrews and colleagues (1990) noted, the risk principle of effective classification suggests that high risk cases should receive the most intensive treatment (see also Lowenkamp, Latessa, & Holsinger, 2006). Because this aspect of the risk principle is concerned with programs that attempt to reduce recidivism, an outcome other than institutional misconduct is needed. Although reduced institutional misconduct is perhaps an additional benefit of prison programming (French & Gendreau, 2006), treatment in prison is considered justified because it discourages the likelihood of reoffense on release to society. As a result, assessments in prison must be able to identify criminogenic needs that are related to recidivism in the community. Thus, applying the risk principle in the prison context suggests that inmates who pose the greatest risk of misconduct should be given highest priority for security resources and that inmates who pose the greatest risk of community recidivism should be given highest priority for treatment resources.
The Needs Principle of Effective Classification
Another issue surrounding effective classification is what Andrews et al. (1990) referred to as the needs principle. The needs principle suggests that assessment systems should identify dynamic risk factors for two primary reasons. The first reason is that instruments that rely on static, unchanging factors do not accurately estimate the risk of offending. Static factors, which consist primarily of measures of criminal history, are criticized for failing to account for changes in an offender’s environment, associations, and thinking patterns that can contribute to the likelihood of offending (see Andrews, Bonta, & Wormith, 2006; Makarios, Steiner, & Travis, 2010). As a result, the inclusion of dynamic risk factors in risk assessment instruments provides a more accurate indication of risk because they measure risk factors that are able to fluctuate over time (Gendreau, Little, & Goggin, 1996). In prison, including dynamic risk factors should thus help to increase the accuracy of estimating the risk to engage in institutional misconduct. That is, the likelihood of institutional misconduct is most accurately estimated when assessment instruments consider dynamic environmental and personal factors that contribute to offending in prison.
There is a good deal of research that supports the proposition that the prediction of offending behavior is maximized when dynamic risk factors are included in risk assessment instruments. For example, Gendreau et al. (1996) conducted a meta-analysis of the predictors of recidivism and found that dynamic risk factors produced mean effect size estimates that were equal to those of criminal history. Furthermore, the Level of Service Inventory–Revised (LSI-R; Andrews & Bonta, 1995), a risk assessment instrument that includes dynamic risk factors, produced a larger mean effect size estimate than the Salient Factor Score (SFS; Hoffman, 1983), a risk assessment instrument that includes only static items. In the prison context, Wright, Salisbury, and Van Voorhis (2007) constructed a risk/needs assessment instrument for women in prison. They found that when compared with the static classification instrument, dynamic risk scales and gender responsive risk scales were more accurate in the prediction of misconducts.
The second argument for the inclusion of dynamic risk factors into risk assessment instruments revolves around case management and the effective treatment of offenders (Andrews et al., 1990). Although static risk assessment instruments may indicate which offenders are in most need of programming by identifying high risk offenders, they fail to tell case managers which types of programs are needed. The inclusion of dynamic risk factors into risk assessments identifies the risk factors that make a particular offender high risk to recidivate, which allows case managers to use this information to prioritize treatment needs by domain (see Latessa, Lemke, Makarios, Smith, & Lowenkamp, 2010).
Although the first reason for including dynamic risk factors into risk assessment instruments in prison revolves around the accurate prediction of institutional misconduct, the second reason is concerned with the inclusion of dynamic risk factors that are related to recidivism on release to the community. Similar to the risk principle, the needs principle thus requires the consideration of two types of risk to accurately assess the risk of institutional misconduct while identifying dynamic risk factors for case management. An important assumption of combined risk and need assessment systems in prison is that they display predictive validity of institutional misconduct and community recidivism. That is, it is assumed that these systems can be used to classify inmates based on the likelihood of misconduct while also working as case management tools by identifying the factors associated with community recidivism.
Applying the Risk and Needs Principles in Prison Classification
Research that examined the ability of combined risk/needs assessment instruments to predict institutional misconduct and community recidivism provides support for these instruments. For example, the LSI-R is a well-known risk/needs assessment instrument that has been shown to predict both outcomes. The LSI-R was developed to predict community recidivism, and a wealth of research supports its predictive validity. In their analysis of average effect sizes, Gendreau, Goggin, and Smith (2002) examined 33 studies of the predictive validity of the LSI-R and found an average correlation of .33 with general recidivism in the community. Research that has examined the ability of the LSI-R to predict institutional misconduct also indicates that it shows promise in this area. For example, in two separate studies, Holsinger, Lowenkamp, and Latessa (2004, 2006) found that the LSI-R and the youth version of the LSI-R (YLS/CMI) had correlations of .40 and .39 with institutional misconduct, respectively. Furthermore, Bonta and Motiuk (1992) examined the ability of the LSI-R to predict institutional misconduct and reincarceration. They found that the LSI-R had a correlation of .35 with recidivism and a correlation of .21 with their measure of institutional disciplinary problems.
Despite the promising evidence suggesting that combined risk and needs assessment instruments may have predictive validity for institutional misconduct and recidivism, there is research that brings these findings into question. Wright, Clear, and Dickenson (1984) cautioned against the use of universal risk assessment instruments because differences in populations may produce differences in the predictors of recidivism. Confirming these suggestions, different risk factors predict recidivism at different stages in the criminal justice system (Latessa et al., 2010; Urbaniok et al., 2007). Still, relatively little research has directly examined whether a risk and needs assessment instrument that was developed to predict community recidivism can perform as well as an instrument developed specifically to predict institutional misconduct.
Weinrath and Coles (2003) examined the predictive validity of two separate instruments, one developed for the purpose of community risk/needs assessment and one developed specifically for predicting institutional misconduct. Their findings are instructive because they use the same sample to compare the predictive validity of a combined risk/needs assessment instrument and an instrument previously developed specifically for institutional misconduct. They found that when predicting community recidivism and institutional misconduct, the institutional misconduct classification instrument outperformed the combined risk/needs assessment instrument. Their findings suggest that this was the case because the institutional misconduct instrument relied primarily on static factors, which displayed stronger relationships with both outcome measures. Weinrath and Coles concluded by developing an integrated scale that incorporated measures from instruments that were predictive of each outcome, suggesting that combined risk/needs assessment instruments should incorporate predictors of community recidivism and institutional misconduct if they are going to be effectively used in the prison setting.
Weinrath and Coles’s (2003) findings may be troubling to correctional policy makers because they suggest that third-generation risk and needs assessment tools may struggle to predict both institutional misconduct and community recidivism. If different dynamic risk factors are needed to accurately predict both outcomes, it suggests that correctional administrators should adopt assessment procedures that involve the administration of two separate third-generation instruments. Unfortunately, third-generation assessments require the administration of a structured interview and intensive training on the measurement of the items (see Andrews & Bonta, 2010). Each interview can take from 45 to 90 min to complete, and providing one interview for each assessment is likely a daunting, if not impossible, task for correctional administrators. Weinrath and Coles suggested a hybrid approach that creates one assessment system that includes items from each tool. Other options that could help to alleviate the costs involve reducing the number of dynamic items measured, thus reducing the length of the structured interviews, as well as using screening instruments to identify and screen out low risk cases from the full assessment.
Gender and Classification
Prior research on gender and risk assessment suggests that risk assessment instruments may vary in their ability to predict recidivism based on gender (e.g., Reisig, Holtfreter, & Morash, 2006). As Van Voorhis, Wright, Salisbury, and Bauman (2010) pointed out, although prior research examining traditional risk assessment instruments has found that they oftentimes are able to predict female offending (see Simourd & Andrews, 1994; Smith, Cullen, & Latessa, 2009), an emerging body of research suggests that gender may confound the relationship between risk factors and recidivism (Manchak, Skeem, Douglas, & Siranosian, 2009; Reisig et al., 2006; Uggen & Kruttschnitt, 1998). Taken together, the literature on gender and recidivism suggests that it is not an “either–or” question. Although it is likely many of the predictors of recidivism are similar for men and women (Makarios et al., 2010), it is also likely that other predictors are gender specific (see Zahn, Hawkins, Chiancone, & Whitworth, 2008). This suggests that an important aspect of the predictive validity of risk assessment instruments involves examining whether findings vary by gender. As a result, the current research disaggregates all analyses by gender to examine differences and similarities in the predictive validity of each instrument for women and men.
The current research seeks to examine differences in two assessment instruments that were developed to behaviorally predict either institutional misconduct or community recidivism. This was accomplished by constructing separate assessment instruments for classification and case management in a sample of male and female Ohio prison inmates. Differences and similarities in the predictive items are compared, followed by an examination of the predictive validity that each instrument maintains with both outcomes. Finally, statistics are presented on a hybrid instrument that was developed to build off of the advantages of each instrument and reduce some of the resource constraints often associated with third-generation assessments in prison.
Method
The data for the current research come from a larger study that was designed to develop a risk and needs assessment system across all stages in the Ohio criminal justice system. The Ohio Risk Assessment System (ORAS) involved sampling offenders at pretrial, during community supervision, at prison intake, and at release from prison. Potential predictors of recidivism were gathered from offenders in each sample during a structured interview and case file review. Outcome measures were gathered from an official database maintained by the state of Ohio. Separate risk assessment instruments were constructed at each stage of the criminal justice system based on the factors that were found to be related to recidivism for each sample. For a further description of the ORAS see Latessa et al. (2010). The data for this research come from inmates in the prison intake sample.
Participants
Data in this sample come from offenders admitted to eight correctional facilities in Ohio. Sites were selected using the following considerations: geographic representation, recommendations from stakeholders from the Ohio Department of Rehabilitation and Correction, and availability of the sites to participate (although no site declined to participate in the project). Interviews took place during site visits that occurred between June and November of 2007. The original, full intake sample consisted of all offenders who had entered prison within 6 months of the site visits, were not restricted by security constraints, and agreed to be interviewed (N = 804).
As this research sought to identify a group of offenders that were admitted to prison and subsequently released into the community, offenders were included in the current sample only if they were also scheduled to be released from prison within the next 6 months (n = 427 or 53% of the original sample). Although these requirements provided a group of offenders who had relatively short prison sentences (the average prison sentence was 5.4 months), it was necessary considering the time constraints involved in gathering an adequate follow-up in the community (i.e., inmates had to be tracked for a year after release from prison). It is worth noting that because of the sentencing structure in Ohio, commitments to prison for a year or less are common. This is because sentencing reform in Ohio created low level felonies that allowed for prison sentences of less than a year. In 2007, 57% of prison inmates were sentenced to prison for less than a year (Gonzalez & Bennie, 2009), which is similar to the percentage of offenders that the current sample comprises from the original full intake sample. Still, the short sentence length of the sample of offenders should be taken into account when interpreting the findings from this study. Inmates were also excluded from the current sample because of missing data (n = 4) or if they had prison sentences of less than 2 months (n = 9), leaving a total sample size of 414. Descriptive statistics for the sample are displayed in Table 1. The sample was 38% African American and 43% of the sample was in prison for a drug crime. The average age was 33 years old and 34% of the sample were female.
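The sample attrition described above can be verified with simple arithmetic. The short Python sketch below uses only the counts reported in this section; the variable names are illustrative and are not part of the original study.

```python
# Counts reported in the Participants section.
full_intake = 804              # original intake sample (N)
release_within_6_months = 427  # also scheduled for release within 6 months
missing_data = 4               # excluded for missing data
sentence_under_2_months = 9    # excluded for sentences shorter than 2 months

# Share of the intake sample retained by the release requirement.
share_of_intake = release_within_6_months / full_intake

# Final analytic sample after both exclusions.
final_sample = release_within_6_months - missing_data - sentence_under_2_months

print(f"Retained share of intake sample: {share_of_intake:.0%}")
print(f"Final sample size: {final_sample}")
```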
Descriptive Statistics for the Study Sample (n = 414)
Procedures
This research used a prospective design that involved the collection of data on potential risk factors first, then a subsequent follow-up of offenders to gather information on recidivism and institutional misconduct. Data collection for the risk factors involved structured interviews and file reviews that were conducted by trained researchers, as well as self-report surveys completed by offenders as they waited to be interviewed. In all, the structured interview and self-report survey gathered information on more than 200 potential risk factors and took approximately 45 to 90 min to complete. Each assessment instrument was constructed by including items that were significantly related to each respective outcome. That is, after the collection of data, risk factors that were related to misconduct were included in the classification assessment instrument and items that were related to recidivism in the community were included in the case management assessment instrument.
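The item selection step can be illustrated with a minimal sketch. This section does not state which significance test was used, so the example assumes a Pearson chi-square test on a 2 x 2 table of a dichotomous item against a dichotomous outcome, with the conventional .05 critical value for one degree of freedom. The item names and data are hypothetical.

```python
def chi_square_2x2(item, outcome):
    """Pearson chi-square statistic for two 0/1 variables (no continuity correction)."""
    a = sum(1 for i, o in zip(item, outcome) if i == 1 and o == 1)
    b = sum(1 for i, o in zip(item, outcome) if i == 1 and o == 0)
    c = sum(1 for i, o in zip(item, outcome) if i == 0 and o == 1)
    d = sum(1 for i, o in zip(item, outcome) if i == 0 and o == 0)
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # an empty row or column carries no information
    return n * (a * d - b * c) ** 2 / denominator

CRITICAL_VALUE_05_DF1 = 3.841  # chi-square critical value, alpha = .05, df = 1

def select_items(candidates, outcome):
    """Keep the names of candidate items whose statistic exceeds the critical value."""
    return [name for name, values in candidates.items()
            if chi_square_2x2(values, outcome) > CRITICAL_VALUE_05_DF1]

# Hypothetical data: 10 inmates with the outcome, then 10 without.
outcome = [1] * 10 + [0] * 10
candidates = {
    "prior_misconduct": [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8,
    "unrelated_item": [1, 0] * 10,
}
print(select_items(candidates, outcome))
```

With small samples such as this toy example, an exact test would usually be preferable; the chi-square statistic is used here only to keep the sketch short.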
Outcome Variables
The two primary outcomes for this research involve (a) whether an inmate was found guilty of an offense by the Rule Infraction Board (RIB) and (b) whether an inmate was arrested for a new crime after release from prison. A guilty disposition from the RIB was used in the current research because the RIB typically only handles more serious and/or habitual violations of the rules. An RIB infraction is an official measure of inmate misconduct and has been criticized because the detection of misconduct may be correlated with various institutional characteristics such as security level (Light, 1990). Still, research comparing self-reported and official misconduct in prison has found both to be valid measures of inmate adjustment (Simon, 1993; Van Voorhis, 1994). New arrest in the community was chosen because of the limited follow-up and problems associated with further processing into the criminal justice system (see Maltz, 1984). New arrests were measured by searching for arrest records on a law enforcement database that tracks arrest data for the state of Ohio.
Table 1 presents descriptive statistics for both outcome variables. Worth noting is the relatively short time at risk and the resulting low base rates of prison misconducts (5.4 months and 16%, respectively), especially for female offenders. As mentioned previously, these limitations should be taken into account, but the short period of follow-up is indicative of what happens to many Ohio inmates in practice given the state’s sentencing structure (Gonzalez & Bennie, 2009). Furthermore, the Receiver Operating Characteristics (ROC) analyses used in the current research have been used in prior research when base rates are low (see Lowenkamp, Lemke, & Latessa, 2008; Swets, Dawes, & Monahan, 2000). Still, only 10 women in the current sample engaged in a misconduct. Thus, caution should be taken when interpreting the results for female offenders that predict misconduct.
Measures
Classification Instrument
Table 2 displays information on the items included in the classification instrument as well as the distribution of inmates by risk level and risk score. The classification instrument includes 24 items from seven domains, including both static factors such as criminal history as well as dynamic factors such as criminal thinking errors. Most inmates classified by the instrument are considered either low (44%) or moderate risk (46%) for future misconducts, with relatively few inmates designated as high risk (10%). The instrument risk score correlated with misconduct (r = .39, p = .00) and had an alpha of .71.
Description of the Classification Instrument
Only 10 females experienced a misconduct so caution should be taken when interpreting the results for female offenders that predict misconduct.
*p < .05. **p < .01.
Case Management Instrument
Table 3 provides a description of the items included in the case management instrument. It contains a total of 31 items from 8 domains. Fifteen percent of the full sample is at high risk for a new arrest, with low and moderate risk offenders nearly splitting the remaining 85% equally. The case management risk score ranged from 3 to 29, had an alpha of .73, and a correlation of r = .37 (p = .00) with new arrest. All items in the case management instrument are listed in the appendix.
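The alpha values reported for the instruments are presumably Cronbach's alpha, a measure of internal consistency; that reading is an assumption, since the text reports only "alpha." Under that assumption, the statistic can be sketched as follows. The item scores are hypothetical and serve only to show the calculation.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item, ordered by respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score for each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    # Sum of the individual item variances (sample variance throughout).
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores for three items across four respondents.
items = [
    [0, 1, 1, 2],
    [0, 1, 2, 2],
    [1, 1, 2, 3],
]
print(round(cronbach_alpha(items), 4))
```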
Descriptive Statistics for the Case Management Instrument
*p < .05. **p < .01.
Hybrid Assessment System
The hybrid assessment system consists of a series of instruments: the reduced classification instrument, the case management screen, and the full case management instrument discussed above. Table 4 presents descriptive statistics for the reduced classification instrument that includes a subset of items from the full classification instrument.
Description of the Reduced Classification Instrument
Only 10 females experienced a misconduct so caution should be taken when interpreting the results for female offenders that predict misconduct.
*p < .05. **p < .01.
Reduced classification instrument
The reduced classification instrument was designed to classify inmates by their risk to engage in institutional misconduct with a smaller number of items that did not require the administration of a structured interview. Using some of the items that were strongly related to the outcome and that were likely to be found in an offender’s case file/presentence investigation report, the reduced classification instrument contains only six items. For the full sample, the reduced classification instrument provides a risk score that ranges from 0 to 9 and displays a moderate correlation with misconduct (r = .32; p = .00).
Case management screening instrument
This instrument was created to identify low risk cases that do not need to be given the full case management needs assessment and as a result avoid the lengthy structured interview. The inclusion of dynamic risk factors is necessary for case management because they identify criminogenic needs for treatment. Still, because research indicates that primarily moderate and high risk offenders should receive the most intensive treatment (see Lowenkamp & Latessa, 2004), a brief screening instrument that identifies low risk cases can be used to eliminate these cases from receiving the full assessment. Table 5 provides information regarding the case management screening instrument. Of interest, the screen only contains four items, two of which overlap with the reduced classification instrument. The case management screen produces a risk score that ranges from 0 to 5 for the full sample and displays a moderate correlation with rearrest (r = .30; p = .00).
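The flow of the hybrid system can be sketched as a simple triage step: the reduced classification score informs custody decisions, and the screen score determines whether the full case management interview is needed. The cutoff below is hypothetical, because this section does not report the score that separates low risk cases from the rest, and the function and field names are illustrative. The range checks use the score ranges observed in the full sample.

```python
from dataclasses import dataclass

# Hypothetical cutoff: screen scores at or below this value are treated as low risk.
SCREEN_LOW_RISK_MAX = 1

@dataclass
class IntakeDecision:
    misconduct_score: int        # reduced classification score, used for custody level
    needs_full_assessment: bool  # True when the full case management interview is warranted

def triage(classification_score, screen_score, low_risk_max=SCREEN_LOW_RISK_MAX):
    """Route an inmate through the hybrid system using the two short instruments."""
    if not 0 <= classification_score <= 9:
        raise ValueError("reduced classification score expected in 0-9")
    if not 0 <= screen_score <= 5:
        raise ValueError("case management screen score expected in 0-5")
    # Low risk cases on the screen skip the lengthy structured interview.
    return IntakeDecision(classification_score, screen_score > low_risk_max)

print(triage(classification_score=3, screen_score=0))
print(triage(classification_score=3, screen_score=4))
```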
Description of the Case Management Screening Instrument
*p < .05. **p < .01.
Analyses
This research seeks to examine differences in assessments that were constructed for either classification (i.e., predictive of institutional misconduct) or case management (i.e., predictive of community recidivism). The results section presents validation statistics for the instruments described above. Differences in misconduct and rearrest are presented by assessment instrument and gender. Correlation coefficients are presented to describe the strength of the relationship between each instrument and outcome. ROC analyses provide the area under the curve (AUC) statistic, which is used to assess the predictive accuracy of each assessment instrument.
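The two validation statistics can be illustrated with a minimal sketch. The AUC is computed here in its rank-based form, as the proportion of pairs in which a case with the outcome scores higher than a case without it (ties count as half), and the correlation is an ordinary Pearson r between a risk score and a 0/1 outcome. The data are hypothetical, and the study's own software and settings are not described in this section.

```python
from itertools import product
from math import sqrt

def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of cases with and without the outcome."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case in each outcome group")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(positives, negatives))
    return wins / (len(positives) * len(negatives))

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical risk scores and outcomes (1 = outcome occurred).
risk_scores = [1, 2, 3, 4, 5, 6]
outcomes = [0, 0, 1, 0, 1, 1]
print(f"AUC = {auc(risk_scores, outcomes):.2f}, r = {pearson_r(risk_scores, outcomes):.2f}")
```

An AUC of .50 corresponds to chance-level discrimination and 1.00 to perfect separation of the two groups.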
Results
The results section is divided into three subsections. The first examines the classification assessment instrument that was constructed to predict institutional misconduct. The second section reviews the case management assessment instrument that was constructed to predict community recidivism. The final subsection presents results from the hybrid approach that uses items from each instrument and attempts to develop an assessment system that would not only accurately predict each measure, but do so in a way that is as efficient as possible.
The Classification Instrument
Figure 1 displays several series of graphs that speak to differences in the likelihood of the outcome by the risk level of the classification instrument. The first series examines differences in misconduct by risk level for all inmates. The results indicate that there are substantively meaningful differences in the rates of misconduct by risk category. The ROC analysis produced an AUC statistic of .73, indicating that there are a relatively larger number of true positives than false positives. The r value of .37 (p = .00) indicates that the risk levels display a moderate to strong relationship with misconduct. The second and third series compare differences in the percentage of inmates who engage in misconduct at each risk level by gender. Although the statistics for female offenders are smaller than those for male offenders, the results indicate that the AUC does not fall outside of the original confidence interval, suggesting that the AUC for men and women is not significantly different and that the instrument displays a statistically similar level of predictive validity by gender.

Predictive Validity of the Classification Tool for Misconduct
Figure 2 displays results from analyses that sought to examine the predictive validity of the classification instrument when using new arrests in the community as an outcome. The results indicate that the classification assessment instrument displays substantially lower predictive validity when new arrest is used as an outcome. The difference in the rates of recidivism by risk level is not as large and the r value of .13 (p = .00) is substantially lower than that obtained when misconduct is the outcome (r = .37; p = .00). Furthermore, the arrest AUC value of .57 falls outside the 95% confidence interval of the misconduct AUC (.64-.78). This means that the instrument does a significantly poorer job at maximizing true positives and minimizing false positives when new arrest is the outcome. It suggests that the classification assessment instrument would perform poorly if used to identify treatment targets for case management. The next two series in Figure 2 display the predictive validity of the classification instrument on new arrests by gender. The results indicate that the classification instrument has lower correlation and AUC values for males, but the AUCs of .54 for males and .59 for females do not fall outside of the confidence interval for all inmates, again suggesting that the results are not significantly different.

Predictive Validity of the Classification Tool for Arrest
The Case Management Instrument
Figures 3 and 4 present a series of graphs that display the predictive validity of the case management instrument. The first series in Figure 3 compares the rates of new arrests by risk level for all inmates. It indicates substantial increases in the likelihood of a new arrest for each risk level. The significant AUC of .70 and r value of .38 (p = .00) suggest that the instrument performs well when predicting new arrests. The second and third series in Figure 3 measure the predictive validity by gender. The AUC and r values for males and females are statistically similar, suggesting the instrument performs equally well for both genders.

Predictive Validity of the Case Management Assessment Instrument for Arrest

Predictive Validity of the Case Management Assessment Instrument for Misconduct
Figure 4 compares differences in prison misconducts by case management risk level. The results in the first series for all inmates indicate substantially smaller differences in the rates of failure when prison misconduct is used as an outcome. The r value of .17 (p = .00) is also smaller than the correlation of the case management instrument with new arrest. Finally, the AUC of .62 is considered relatively low and falls outside of the AUC confidence interval when new arrest is used as an outcome. The second two series present the predictive validity of the case management instrument on misconducts by gender. The results suggest that the case management instrument performs significantly better for females than males. The AUC of .75 for female offenders is larger than that for males (AUC = .57) and falls outside of the confidence interval for all offenders. However, given the low base rate for female offenders, caution should be taken when interpreting these results.
In sum, the last two sections have presented validity results from two instruments that were designed for two separate purposes. The classification instrument was designed to accurately predict the likelihood of misconduct so that inmates could be assigned to appropriate classification levels while in prison. The case management instrument was designed to predict community recidivism with criminogenic needs that could be targeted with treatment.
To further illustrate the differences in predictive power of each instrument, Table 6 displays the results of several logistic regression equations that predict both outcomes with each assessment instrument. The results are consistent with the findings presented in the graphs and reveal that although both instruments are able to predict their respective outcomes, the instruments struggle to display adequate levels of predictive power with each other’s respective outcome. The pseudo R² statistics for all offenders indicate that the classification instrument explains more of the variation in the odds of misconduct than new arrest (misconduct Nagelkerke R² = .17; arrest Nagelkerke R² = .04). The statistics for the case management instrument also indicate that it performs better when predicting new arrest (Nagelkerke R² = .19) than institutional misconduct (Nagelkerke R² = .04). Furthermore, the instruments that were created to predict each outcome have relatively few items in common. The appendix provides a list of all of the items included in each assessment. Between the two instruments, there are a total of 48 items that predict either of these outcomes and only eight (17%) of these overlap. When examining gender differences, the results suggest similarities in the predictive power of the classification instrument for males and females, but differences in the predictive power of the case management instrument when predicting misconduct. Consistent with the results in Figure 4, the results suggest that the case management instrument performs better when predicting misconduct for females than for males (female exp(B) = 1.19; male exp(B) = 1.11). Again, the low base rate for female offenders with misconduct should encourage caution when interpreting these results.
Table 6. Logistic Regression Models Using Both Assessment Instruments to Predict Both Outcomes
Note. CI = confidence interval.
Only 10 females experienced a misconduct, so caution should be taken when interpreting the results for female offenders that predict misconduct.
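As a reading aid for the two statistics reported from the logistic regression models, the sketch below computes a Nagelkerke pseudo R² from model log-likelihoods and shows how an exp(B) value compounds across units of the predictor. The log-likelihood values are hypothetical (the article does not report them), and the sketch assumes that exp(B) refers to a one-unit increase in the predictor, which the text does not state explicitly. Only the sample size of 414 and the exp(B) values of 1.19 and 1.11 are taken from the article.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R-squared.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of cases
    """
    # Cox and Snell R-squared, which cannot reach 1 for a binary outcome.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell can take; dividing by it rescales to [0, 1].
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


def odds_multiplier(exp_b, units):
    """Factor by which the odds change after `units` one-unit increases."""
    return exp_b ** units


# Hypothetical log-likelihoods; n = 414 is the sample size given in the abstract.
example_r2 = nagelkerke_r2(ll_null=-250.0, ll_model=-230.0, n=414)

# exp(B) values reported in the text for the case management instrument and misconduct.
female_five_units = odds_multiplier(1.19, 5)
male_five_units = odds_multiplier(1.11, 5)
print(f"example Nagelkerke R2 = {example_r2:.3f}")
print(f"odds multiplier over 5 units: female {female_five_units:.2f}, male {male_five_units:.2f}")
```

Under these assumptions, a seemingly small gap between exp(B) values of 1.19 and 1.11 widens as it compounds over several units of the predictor.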
The Hybrid Assessment System
The previous two sections indicate that each instrument is important for its own purpose and that neither should be used for the purposes of the other. The hybrid assessment system was developed to address these issues by minimizing the amount of resources needed to assess inmates. This was accomplished by creating a reduced classification instrument and using a screening instrument to identify only moderate and high risk cases for the full case management assessment. Thus, three assessments are potentially given, but the first two are relatively short and do not require a structured interview. In addition, the second assessment is designed to screen out a large percentage of cases from the final full case management assessment that involves a structured interview.
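The flow of the hybrid system described above can be sketched in a few lines of Python. This is only an illustration of the decision logic: the function names, score ranges, and cut-points are hypothetical and are not the values used in the study. The one rule taken from the text is that the full case management assessment is reserved for cases the screener rates as moderate or high risk.

```python
def risk_level(score, moderate_cutoff, high_cutoff):
    """Map a numeric score to a risk level using two cut-points."""
    if score >= high_cutoff:
        return "high"
    if score >= moderate_cutoff:
        return "moderate"
    return "low"


def plan_assessments(reduced_classification_score, screen_score):
    """Decide which assessments one inmate receives under the hybrid system.

    The cut-points below are hypothetical placeholders.
    """
    custody_risk = risk_level(reduced_classification_score, moderate_cutoff=3, high_cutoff=6)
    screen_risk = risk_level(screen_score, moderate_cutoff=2, high_cutoff=4)
    return {
        "custody_risk": custody_risk,      # from the reduced classification instrument
        "screen_risk": screen_risk,        # from the case management screen
        # Only moderate- and high-risk screens go on to the structured interview.
        "needs_full_assessment": screen_risk != "low",
    }


# Hypothetical intake cases: (reduced classification score, screen score).
intake = [(1, 0), (4, 1), (7, 3), (2, 5), (5, 2)]
plans = [plan_assessments(c, s) for c, s in intake]
full_interviews = sum(p["needs_full_assessment"] for p in plans)
print(f"{full_interviews} of {len(plans)} cases receive the full structured interview")
```

Keeping the custody decision and the screening decision as separate fields reflects the article's argument that the two outcomes call for separate instruments.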
The Reduced Classification Instrument
Although the reduced classification instrument is less time-consuming than the full instrument, it is important that the reduced instrument maintains acceptable levels of predictive validity. The first series of charts in Figure 5 provides a comparison of the predictive validity of the reduced classification instrument. The reduced classification instrument has a somewhat weaker r value (recall from Figure 1 that the Pearson correlation between the full classification instrument and misconducts was r = .37, p = .00). Still, the AUC of .71 falls within the full classification confidence interval for misconduct (see Figure 1; AUC 95% confidence interval [CI] = [.64, .78]). This indicates that removing the other items from the full classification instrument did not greatly impact the predictive validity of the reduced classification instrument. The second two series from Figure 5 present the predictive validity of the reduced classification instrument by gender. The results suggest that although the Pearson correlation for females is smaller, the AUC is statistically similar. Nevertheless, the low base rates for female offenders should encourage caution when interpreting these results.

Figure 5. Predictive Validity of the Reduced Classification Tool Predicting Misconduct
The Case Management Screening Instrument
As noted in Table 5, the case management screen uses four variables to identify over 40% of offenders who do not need the full assessment (i.e., they are low risk) while maintaining a moderate relationship with new arrest. Figure 6 presents the predictive validity of the case management screen. Similar to the reduced classification instrument, the case management screen maintains a slightly lower r value with its outcome than the full instrument, but the AUC value of .67 falls within the confidence interval of the full assessment instrument (95% CI = [.65, .75]). The second two series present the results by gender and suggest that the instrument predicts similarly for males and females.

Figure 6. Predictive Validity of the Case Management Screening Instrument Predicting Arrest
The Case Management Instrument
The full case management assessment instrument that involves the structured interview (see Figure 3 and Table 3) is only administered to those cases that are deemed to be of moderate or high risk by the screener. Although the administration of this assessment is somewhat lengthy, the measurement of dynamic risk factors provides an accurate assessment of the risk the inmate poses on release to the community and provides case managers with the means to efficiently allocate treatment resources. The full case management instrument includes multiple items from six domains and as a result can categorize cases by priorities in treatment need based on the likelihood of recidivism. Table 7 presents statistics for each of the treatment domains contained in the full case management instrument by gender. As the table indicates, each treatment domain places inmates into groups that have increasing likelihoods of recidivating on return to the community. These groups allow case managers to prioritize treatment need by risk level and as a result efficiently allocate treatment resources. Worth noting is that there are similar correlations for men and women in the domains of education and finances, social support, substance abuse, and thinking errors. Substantive gender differences are found in the domains of criminal lifestyle (male r = .22, p = .00; female r = .11, p = .20) and anger management (male r = .15, p = .01; female r = .06, p = .45).
Table 7. Treatment Domains for the Case Management Assessment Tool
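The r values reported for the treatment domains are Pearson correlations. As a rough illustration, the sketch below computes a Pearson correlation between an ordinal risk level and a binary outcome. The coding of risk levels as 0, 1, and 2 and the toy data are assumptions made for the example and are not drawn from the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length with at least two cases")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical toy data: domain risk level (0 = low, 1 = moderate, 2 = high)
# and whether a new arrest occurred (1) or not (0).
toy_levels = [0, 0, 1, 1, 2, 2]
toy_arrests = [0, 0, 0, 1, 1, 1]
toy_r = pearson_r(toy_levels, toy_arrests)
print(f"toy r = {toy_r:.2f}")
```

With a binary outcome this is the point-biserial case of the Pearson correlation, so its maximum attainable value depends on the base rate of the outcome, which is one reason the article also reports AUC values.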
Discussion
In prison, the assessment of risk and needs poses a particular problem because instruments used to determine custody levels and treatment need are designed to predict two separate outcomes. This research used a sample of inmates admitted to Ohio prisons to examine similarities and differences in assessment instruments constructed for each purpose. The research found marked differences between assessment instruments that were constructed to predict institutional misconduct and community recidivism separately.
Furthermore, the results suggested that although each instrument displayed acceptable levels of predictive validity with its respective outcome, each failed to adequately predict the opposing outcome. For example, adopting the classification instrument would not provide an accurate assessment of criminogenic needs that are related to arrest in the community. Likewise, the case management instrument would perform poorly if used to classify offenders by risk of institutional misconduct. At first glance, the inability of each full assessment instrument to predict both outcomes is troublesome for correctional policy makers because it seems to imply that third-generation assessments for custody risk and treatment need should be administered to all inmates at intake. On the contrary, the current research provides a hybrid approach to assessing inmate risk and need that involves minimizing the number of items used for classification and screening out the number of inmates that receive the full case management assessment. In doing so, the hybrid instrument provides a means to accurately and efficiently assess the likelihood of institutional misconduct as well as measure dynamic risk factors that can be used for case management.
An important implication from this research is that it calls into question the validity of a “one size fits all” approach to risk assessment in prison. As Wright, Clear, and Dickinson (1984) noted, assessment instruments developed for one population may not readily transfer to another population. The findings presented here are consistent with Latessa et al. (2010) and Urbaniok et al. (2007) in that they found substantive differences in the predictors of outcomes at different stages in the criminal justice system. The findings from the current study also refute the “one size fits all” approach to risk assessment as marked differences were found between instruments constructed on predicting the outcomes of inmate misconduct and community recidivism.
The removal of dynamic risk factors for the use of classification warrants discussion. Although the estimates were modestly smaller, the reduced instruments provided estimates that fell within the confidence intervals for their counterparts that included static and dynamic items. This conflicts with other research (e.g., Bonta & Motiuk, 1992; Van Voorhis et al., 2010), but is consistent with Weinrath and Coles (2003), who found that institutional misconduct was best predicted by static factors. It is possible that traditional dynamic risk factors may not be predictive in the prison context. For example, research on prison adjustment suggests that although there is some similarity in the causes of prison misconduct, there are also differences (for a review, see Gendreau, Goggin, & Law, 1997). Thus, the assessment of dynamic risk factors in prison may require context-specific measures that operationalize concepts that are related to poor institutional adjustment. Future research on prison classification should examine how the inclusion of dynamic risk factors that are specific to the prison context affects the predictive validity of instruments designed to gauge the likelihood of prison misconduct.
This research also found gender differences and similarities. In all, most analyses produced similarities in the predictive power of the risk assessment instruments. These findings are consistent with Makarios et al. (2010) and indicate that there are a substantial number of risk factors that predict recidivism for men and women. Still, some key differences are worth noting. The results indicate that the case management assessment instrument maintained a stronger relationship with misconduct for women. Furthermore, the analyses of treatment domains from the case management instrument suggest that the domains of anger management and criminal lifestyle did not perform as well for females.
Thus, although the case management instrument performed at acceptable levels for females overall, some of the specific treatment domains may differ for females. This is important, because as Van Voorhis et al. (2010) noted, the use of gender neutral risk assessment instruments for case management may fail to link women offenders with the appropriate gender responsive programming. Given the findings, the implementation of the hybrid assessment system presented in the current research may benefit from a gender responsive trailer for women offenders that focuses on factors such as victimization, relationship problems, and parental issues (for a full discussion, see Van Voorhis et al., 2010). Still, the findings of similarities and differences observed in the current research are consistent with Zahn et al. (2008) and suggest that although many of the factors that predict recidivism do not vary by gender, there are important differences that reflect the gendered context of offending that should be taken into account (see also Agnew, 2009).
The limited generalizability of the current sample is worth noting. Although the current findings produced marked differences in the construction of assessment instruments developed to predict prison misconduct or community arrest in a sample of inmates with short-term sentences from Ohio, caution should be taken when attempting to generalize this to other populations. Replication and revalidation of any assessment system is always important, from both a research perspective and an agency perspective. Of course, social scientists should take caution when speaking about the implications from any single study, as replication and extensions of research help scholars to understand the generalizability of different findings. From an agency perspective, it is always important to produce quality assurance mechanisms to ensure that an assessment system is performing correctly. As Flores, Lowenkamp, Holsinger, and Latessa (2006) noted, training and implementation are vital aspects of third-generation risk assessment systems because the objective measurement of dynamic risk factors requires a good deal of clinical expertise. Beyond initial training, it is also important to establish oversight procedures such as interrater reliability checks and supervisor reviews. Furthermore, MacKenzie (2000) indicated that evidence-based corrections encourages not only the implementation of practices that have been shown to be effective but also the evaluation of practices once implemented to ensure fidelity. Validation efforts of risk assessment systems at the agency level ensure that the assessment system is working as desired and can result in revisions to both the items that are measured and the cut-points for each instrument. Given these concerns, it is clear that replications, extensions, and validations of this research are needed before the current results are strongly considered for policy implications.
Another limitation of this study is that it used official data as outcomes. As noted previously, official data are limited because they tend to underestimate the prevalence of antisocial behavior and because they can be influenced by external factors such as the level of supervision and criminal justice actor characteristics that influence the discretionary decision to invoke the law (Hindelang, Hirschi, & Weis, 1981; Light, 1990). For example, some may argue that arrests may be recorded at higher rates in communities that have more police presence. Thus, more surveillance in these communities may make people more likely to be arrested, not because they are engaging in more crime, but because the police are more likely to detect crimes. This argument is perhaps more pronounced in prison, where inmates are placed in institutions with different custody levels and thus institutions that are designed to have different levels of supervision and surveillance. Despite this limitation, research has found that self-report data also have limitations in regard to reliability and validity (see Hindelang et al., 1981; Thornberry & Krohn, 2002). Furthermore, research comparing the validity of self-report and official data generally concludes that both are valid measures of criminal behavior (Hindelang et al., 1981; Kirk, 2006; Thornberry & Krohn, 2002) and inmate misconduct (Simon, 1993; Van Voorhis, 1994). Still, the current research was only able to gain access to official measures of arrest and misconduct and could not compare differences in findings between self-report and official data. This limitation should be kept in mind when interpreting the results of this study.
As noted throughout this study, the low base rate for female offenders with misconduct is another limitation of this study. This is a common problem with research that seeks to examine differences between men and women because there are often fewer female offenders and female offenders usually offend at lower rates than men (Belknap & Holsinger, 1998). This research used ROC analyses, which are less susceptible to low base rates, and found that the appropriate instruments predicted acceptably for female offenders. Yet, given that only 10 women engaged in misconducts in the current study, caution should be taken in the interpretation of the analyses that predicted misconduct with the female subsample. Future research should seek to gather larger sample sizes of female inmates with longer follow-up periods to examine whether the results found here can be trusted to be generalized to other female inmate populations.
Keeping these limitations in mind, this research found that the assessment of inmate risk and need requires the development of an assessment system that acknowledges differences in the predictors of institutional misconduct and community recidivism. Although it suggests that separate instruments should be developed for classification and case management, it also provides a means to accurately and efficiently achieve these goals. Efficiency is achieved by eliminating the need for a structured interview to assess the risk of institutional misconduct and by screening out low risk cases for the structured interview to assess treatment needs. Although there were some reductions in predictive validity when dynamic items were excluded, the results indicated that these reductions were kept to a minimum. In doing so, this research provides a hybrid approach to the risk and needs assessment of prison inmates that correctional policy makers can utilize to minimize the costs of assessment while maximizing the effectiveness of custody and programming resources.
Appendix
Items in the Assessment Instruments
| Items | Class. Inst. | Red. Class. Inst. | Case Man. Inst. | Case Man. Screen |
|---|---|---|---|---|
| Age at assessment | X | X | X | X |
| Age at first arrest | X | X | ||
| Most serious arrest under 18 | X | |||
| Prior adult felony convictions | X | |||
| Official misconduct | X | X | X | |
| Prior probation | ||||
| Community supervision revoked | X | X | ||
| Committed to DYS | X | X | X | X |
| Arrest for a violent offense | X | X | X | |
| Prior commitment prison | X | X | ||
| Escape | X | X | ||
| Suspended/expelled | X | X | ||
| Used arrest | X | |||
| Employed prior to incarceration | X | |||
| Better use of time | X | |||
| Frequently unemployed | X | |||
| Attitude toward boss | X | |||
| Length employment | X | |||
| Marital status | X | |||
| Support available | X | |||
| Level satisfaction with support | X | |||
| Stabilized residence | X | |||
| Living situation | X | |||
| Drug neighborhood | X | |||
| Age drugs | X | X | ||
| Enrolled in drug program | X | |||
| Abstinence from alcohol | X | |||
| Drug employment | X | |||
| Drug health | X | |||
| Diagnosed mental health | X | |||
| Criminal friends | X | |||
| Gang membership | X | X | ||
| Criminal activities | X | |||
| Control anger | X | |||
| Anger intimidation | X | X | ||
| Anger expression | X | |||
| Walk away from fight | X | |||
| Punch in face | X | |||
| Gets upset when told what to do | X | |||
| Problem solving skill | X | |||
| Acts impulsively | X | |||
| Feels lack of control over events | X | |||
| Gotten “the short end of the stick” | X | |||
| “Have to do what it takes to get ahead” | X | |||
| “People who get conned deserve it” | X | |||
| No regrets | X | |||
| Self image | X | |||
| Envy | X | |||
| Total | 24 | 6 | 31 | 4 |
Note. Class. Inst. = classification instrument; Red. Class. Inst. = reduced classification instrument; Case Man. Inst. = case management instrument; Case Man. Screen = case management screening instrument; DYS = Department of Youth Services.
