Abstract
This article describes the development and initial validation of the Juvenile Sexual Offense Recidivism Risk Assessment Tool–II (JSORRAT-II). Potential predictor variables were extracted from case file information for an exhaustive sample of 636 juveniles in Utah who sexually offended between 1990 and 1992. Simultaneous and hierarchical logistic regression analyses were used to identify the group of variables that was most predictive of subsequent juvenile sexual recidivism. A simple categorical scoring system was applied to these variables without meaningful loss of accuracy in the development sample for any sexual (area under the curve [AUC] = .89) and sexually violent (AUC = .89) juvenile recidivism. The JSORRAT-II was cross-validated on an exhaustive sample of 566 juveniles who had sexually offended in Utah in 1996 and 1997. Reliability of scoring the tool across five coders was quite high (intraclass correlation coefficient [ICC] = .96). Relative to the development sample, however, there was considerable shrinkage in the indices of predictive accuracy for any sexual (AUC = .65) and sexually violent (AUC = .65) juvenile recidivism. The reduced level of accuracy was not explained by severity of the index sexual offense, time at risk, or missing data. Capitalization on chance and other explanations for the possible reduction in predictive accuracy are explored, and potential uses and limitations of the tool are discussed.
The general public and government officials have demonstrated increasing concern about sexual violence over the past 15 years, partially due to the large number of sexual offenses reported each year. For instance, the Federal Bureau of Investigation (2009) reported 102,498 arrests for forcible rape and other sexual offenses in 2008. Public concern has focused on repeat sexual offenders despite the fairly low meta-analytic estimates of average 5- to 6-year detected sexual recidivism, 11.5% to 13.7% (Hanson & Morton-Bourgon, 2005, 2009).
Numerous state and federal statutes intended to prevent future offenses by known sexual offenders have resulted. The most widely known statutes require sexual offenders to register with local authorities, provide for community notification regarding their release, specify where they may not reside, and/or permit post-sentence involuntary confinement of some offenders deemed to be at high risk to reoffend because of a mental disorder or behavioral abnormality.
Although originally intended for adult sexual offenders, such legislation has increasingly been applied to juveniles in a number of states. Nationally, the Adam Walsh Child Protection and Safety Act of 2006 was passed by the federal legislature to unify state standards for registration and community notification of sexual offenders. Under that act, juveniles 14 years old or older with specified criminal histories are classified in Tier III and required to register for their lifetime with local authorities. Registrants provide a host of information, including their name, aliases, social security number, address, name and address of employers, license plate numbers, fingerprints, photographs, and DNA samples.
Application of these laws to juveniles is problematic in that possible increases in community safety must be balanced against possible detrimental effects to the juvenile, which may include stigma, isolation, alienation, vigilantism, and lost opportunities, all limiting the ability of juveniles who have sexually offended (JSOs) to reintegrate successfully into their community (Trivits & Reppucci, 2002). Another important potential detrimental impact is the possibility for contagion effects (e.g., Boxer, Guerra, Huesmann, & Morales, 2005). One-size-fits-all legislation and/or interventions tend to aggregate juvenile sex offenders of various risk levels together, increasing the likelihood of lower risk youth becoming higher risk through such exposure. Individually and collectively, these potential detrimental effects may actually increase the risk of impacted juveniles and the threat they pose to community safety.
In the larger context of mixed results, there are a few studies that suggest that some of these statutes, as applied to adults, may lower danger to the public (e.g., Duwe & Donnay, 2008). However, we can identify no evidence suggesting that these laws reduce sexual recidivism for juveniles. Absent such evidence, it is difficult to justify the application of adult statutes to juveniles, particularly given the many potential detrimental effects on impacted JSOs that may increase, rather than decrease, the threat they pose to community safety.
To guard against possible detrimental effects to juveniles and to assist with a multitude of other decisions (e.g., resource allocation, treatment, placement), researchers have invested considerable effort in identifying factors that increase risk of reoffending sexually. One goal of this research was the development of risk assessment tools that could be used to empirically identify offenders who are more versus less likely to reoffend, enabling differential placement, programming, and intervention. In addition, the development of these tools was a reaction to the general inability of unguided clinical judgment to produce reliable and accurate predictions of future sexual violence (e.g., Hanson & Bussière, 1998; Hanson & Morton-Bourgon, 2009).
To date, several empirically guided sexual offense risk assessment tools have been developed for juveniles. These include the Estimate of Risk of Adolescent Sexual Offense Recidivism (ERASOR; Worling & Curwen, 2001), the Juvenile Sex Offender Assessment Protocol–II (JSOAP-II; Prentky & Righthand, 2003), the Multiplex Empirically Guided Inventory of Ecological Aggregates for Assessing Sexually Abusive Adolescents and Children (MEGA; Miccio-Fonseca, 2010), the Risk Assessment Matrix (RAM; Christodoulides, Richardson, Graham, Kennedy, & Kelly, 2005), the Juvenile Risk Assessment Tool (J-RAT; Rich, 2001), and its variants (e.g., Interim Modified Risk Assessment Tool [IM-RAT]; the Cognitively Impaired Juvenile Risk Assessment Tool [CI/J-RAT]). Others have attempted to develop scales that tap into protective factors that minimize the risk of sexual reoffense (e.g., Protective Factors Scale; Bremer, 2001).
All of the previously mentioned tools would be considered to be empirically guided or structured clinical judgment risk assessment tools. Although such tools draw from the empirical literature in selecting items, fairly extensive clinical judgment can be required to score the items and to weight and combine the items into an overall risk estimate. The introduction of clinical judgment in the scoring, weighting, and combining of items likely introduces an additional source of error into the final risk estimate.
Fully actuarial, or statistically derived, risk assessment tools have been developed for adults to address these problems. Actuarial tools often derive items from historical risk indicators (e.g., number of prior sexual offenses), as opposed to dynamic risk factors. Risk indicators are generally more easily and reliably scored, and they are assumed to be the logical outcome of the presence and interactions of underlying dynamic risk factors. Finally, the scoring, weighting, and combining of items into an overall estimate of risk is empirically determined from development and validation research.
Examples of fully actuarial adult sexual offense risk assessment tools include the Static-99 (Hanson & Thornton, 1999), the Minnesota Sex Offender Screening Tool–Revised (MnSOST-R), and the Sex Offender Risk Appraisal Guide (SORAG; Quinsey, Harris, Rice, & Cormier, 1998). These empirically validated, actuarial tools substantially improved predictive accuracy over unguided clinical judgment, and they are generally more accurate and consistent than empirically guided tools in predicting sexual recidivism (Hanson & Morton-Bourgon, 2009). These actuarial risk assessment tools commonly are used to inform a variety of release-related decisions with adult populations (Doren, 2002), and none have emerged as more accurate than the others (Hanson & Morton-Bourgon, 2005).
Except in very rare circumstances, these established actuarial risk assessment tools for adult sexual offenders could not be used with juveniles because the tools were not developed on or for juveniles, and they had not been adequately validated with juveniles. Many experts argued that JSOs require risk assessment tools specific to that population because risk for juveniles is more fluid than for adults because of the incomplete and dynamic state of development, social structure, and education in adolescence (e.g., Prentky & Righthand, 2003).
Despite the documented improvement in predictive accuracy associated with actuarial risk assessment with adults, actuarial methods for juveniles were not pursued or embraced by the field for a number of reasons. One group of reasons focused on the difficulty of the task given the low base rates of juvenile sexual recidivism and given the inconsistent findings of past research on risk factors for adolescents, which was presumed to result from the dynamic and incomplete level of development in adolescence (e.g., Caldwell, 2002). These concerns, though plausible, are ultimately empirical questions that can only be answered through additional research. For example, much of the past research on risk factors for JSOs has been based on very small samples of convenience. It was certainly possible that results might be more consistent with larger and more representative samples of JSOs.
A second set of reasons focused on possible undesirable effects of developing and validating risk assessment tools for JSOs, even if it was possible to do so. One commonly asserted concern was that risk assessment could unduly stigmatize juveniles because risk can be significantly reduced through treatment and maturation. In other words, there was a danger that the label would persist even though risk had changed. We, too, saw this as a very legitimate concern, but one that could be greatly mitigated through the appropriate use of risk assessment and carefully written reports that clearly note the time frame of the assessment (when the assessment expires) as well as the limitations of risk assessment with juveniles.
A related concern was that the development and validation of risk assessment tools for JSOs would hasten the application of adult statutes, policies, and procedures to juveniles. We agreed that there were many reasons to oppose such applications to juveniles, and we did not begin work on this project until it was clear that state and federal governments were extending registration, community notification, and even civil commitment to juveniles, with or without risk assessment tools.
With that increased push to apply sexual offender registration, community notification, and, in some rare instances, post-sentence commitment to JSOs, despite the disparity between legal protections in the adult and juvenile courts, we believed that risk assessment with juveniles became even more important. Accurate risk assessment with juveniles promised the usual benefits of more efficient and strategic resource allocations, more informed programming and treatment assignments, and, at least theoretically, better outcomes, particularly given JSOs’ young age and incomplete maturation. In addition, accurate risk assessment with juveniles held the potential to keep low risk juveniles low risk through early identification and segregation from higher risk juveniles to avoid contagion effects and to provide compelling data from systematic, larger scale research that would argue against the extension of adult policies and procedures to at least the large majority of JSOs.
Development of the Juvenile Sexual Offense Recidivism Risk Assessment Tool–II (JSORRAT-II)
Within this context, work on the JSORRAT-II began. The authors chose to use an actuarial approach to identify optimal indicators of risk that are past products and behavioral proxies of underlying dynamic risk factors and their interactions (Epperson & Ralston, in press). This decision was driven by the success of actuarial methods in developing adult tools, by the fact that the literature on dynamic risk factors with juveniles was still in its infancy, and by the obvious difficulty of directly measuring dynamic risk factors.
Method
Participants
Juvenile Justice case files representing a total of 636 male juveniles who were adjudicated guilty for a sexual offense between 1990 and 1992 in the state of Utah were used for the development sample. These cases represented an exhaustive sample of all juveniles for whom we could locate records and who were adjudicated for a sexual offense within the specified window of time. The vast majority of juveniles were between the ages of 12 and 17.99 years of age at the time of intake for their index sexual offense; however, because of some flexibility in Utah’s juvenile court jurisdiction, four juveniles were 11 years old and 10 were 18 years old at the time of their intake. The average age was 15.18 (SD = 1.57). Consistent with Utah’s demographic makeup, the majority of juveniles in this sample were White (76.4%). The remainder of the sample was Hispanic/Latino (7.7%), African American/Black (2.2%), Asian American (1.6%), Native American (1.4%), Multiethnic or Other (1.1%), and unspecified (9.6%).
This sample was advantageous for two reasons. First, because this was an exhaustive sample, its members represented the full spectrum of juvenile sexual offenses in Utah. Thus, no attempt was made to limit the sample to those offenders with specific characteristics or with particular types of sexual offenses or specific geographic regions within the state. Consequently, such a broad examination allowed for greater generalizability across all types of juveniles who commit all types of sexual offenses in the state. Second, the sample was sufficiently large to provide power to detect moderate and strong predictors of sexual reoffense.
Materials
Case files
Utah Juvenile Court and the Utah Division of Juvenile Justice Services staff located and copied all juvenile judicial and youth corrections case files for the 636 JSOs in the sample. Our intent was to have all case files edited to appear as they did on the date the JSO left the juvenile justice system for their last sexual offense in the 1990 to 1992 window. The task of editing the files was undertaken by the same two agencies, as well as by volunteers from Utah’s Network on Juveniles Offending Sexually (NOJOS), a statewide organization of caseworkers, treatment providers, prevention specialists, and victim advocates who work with JSOs. After all files were edited, they were shipped to the authors for review and data extraction.
Because editors erred on the side of inclusion, data for some recidivating offenses were still included in a number of files. This was not problematic, however, because this was a development study in which the coders were blind to the time frame that was used to define index and recidivating offenses. In addition, as described later, eliminating information on recidivating offenses from the data set prior to item-selection analyses was a simple matter.
The content of case files varied somewhat. However, the vast majority contained several standard pieces that were used for data extraction. All case files included a record of criminal involvement with the juvenile justice system. These records included arrest, investigation, court, and youth corrections reports. The majority of cases also contained additional reports describing past sexual perpetrations and the index sexual offense, including information about the nature of the offense, events leading up to and following the offense, and the victims. Many files also contained probation reports, psychological evaluations, and documents describing educational history, social functioning, substance abuse and mental health issues, and treatment histories. Lastly, some cases included documents relevant to familial involvement with the courts, such as Department of Human Services reports regarding abuse and neglect.
Procedures
Data extraction and entry
Eight research assistants, who had no prior knowledge of the juveniles or the time frames that would be used to define index or recidivating sexual offenses, reviewed and extracted information from the files into two primary codebooks. The first codebook included background information, such as demographic data and information about caregiving structure, family relationships, child abuse and neglect, academic performance, school behavior, consenting sexual history, substance use, mental health issues, therapy (substance abuse, mental health, sexual offender specific), and criminal offenses (charges, adjudications, and sentences). The second codebook captured variables related to specific offense characteristics, such as information about the victim (e.g., gender, age, relationship), pre-offense behaviors (e.g., stalking, grooming), methods for achieving compliance (e.g., threat, force, bribery), offense location (e.g., school, victim’s home), specific sexual acts involved in the offense (e.g., fondling, penetration), the use of weapons (e.g., real, feigned), the role of the offender in the offense (e.g., sole perpetrator, member of a group), and post-offense behaviors (e.g., threatened additional harm, turned self in to authority figures). A separate offense-specific codebook was filled out for each victim of the juvenile offender.
Research assistants were trained to accurately read and extract data from case files over the course of several meetings of 1 to 2 hr each. Detailed instructions were provided for each variable in the two codebooks in the first two meetings. Research assistants were then paired and given identical practice cases to do individually. After completion of these cases, the assistants met with the lead researcher to review the file and discuss discrepancies in coding. This process was repeated until the assistants could produce consistent results according to the prescribed protocols. A log book containing clarifications of how to code emerging idiosyncratic situations was maintained in the lab for assistants to consult throughout the study.
Each completed codebook was entered into SPSS data files twice by two different assistants, and the double entries were compared to detect data entry errors. When discrepancies were encountered, the original codebook was consulted to correct the data entry error.
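The double-entry check described above can be illustrated with a minimal sketch. The data layout (a dictionary of case identifiers mapping to variable/value pairs), the function name, and the example values are hypothetical and are not drawn from the study's SPSS files; the sketch only shows the general idea of comparing two independent entries and flagging every mismatch for review against the original codebook.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def find_entry_discrepancies(
    first_pass: Dict[str, Record],
    second_pass: Dict[str, Record],
) -> List[Tuple[str, str, object, object]]:
    """Return (case_id, variable, first_value, second_value) for every mismatch."""
    discrepancies = []
    # Consider every case and variable that appears in either entry pass.
    for case_id in sorted(set(first_pass) | set(second_pass)):
        rec_a = first_pass.get(case_id, {})
        rec_b = second_pass.get(case_id, {})
        for variable in sorted(set(rec_a) | set(rec_b)):
            value_a = rec_a.get(variable)
            value_b = rec_b.get(variable)
            if value_a != value_b:
                discrepancies.append((case_id, variable, value_a, value_b))
    return discrepancies


# Hypothetical entries keyed by a made-up case identifier.
entry_a = {
    "case_001": {"age_at_intake": 15, "n_victims": 2},
    "case_002": {"age_at_intake": 14, "n_victims": 1},
}
entry_b = {
    "case_001": {"age_at_intake": 15, "n_victims": 3},  # keying error in one pass
    "case_002": {"age_at_intake": 14, "n_victims": 1},
}

flagged = find_entry_discrepancies(entry_a, entry_b)
for case_id, variable, value_a, value_b in flagged:
    print(f"{case_id}: {variable} entered as {value_a!r} and {value_b!r}")
```

Each flagged item would then be resolved by consulting the original codebook, as described in the text.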
Juvenile sexual recidivism data
After all files had been reviewed and entered into the SPSS data set, the index sexual offense was established and sexual recidivism status was determined. For the present study, juvenile sexual recidivism was defined as any charge for a new sexual offense prior to age 18.
Recidivism data were obtained through a search of a statewide electronic database maintained by the Utah Division of Juvenile Justice Services and through data extracted from the files. In total, 84 of the 636 juveniles (13.2%) sexually recidivated prior to age 18. Furthermore, 61 of the 84 (72.6%) juvenile sexual recidivists were charged with a sexually violent offense.
Because researchers had created offense characteristics codebooks for each sexual offense documented in the file, any offense characteristics codebook for a recidivating offense was deleted from the data set before item-selection analyses were performed. This ensured that only offense characteristics from prior and index offenses were considered as potential predictors of juvenile sexual recidivism.
Data analysis
Grouping variables
Prior to analyses, all coded potential predictor variables were organized hierarchically into families, groups, and subgroups of variables based on conceptual similarity. As an example, child abuse was specified as one family of variables. Within the child abuse family, variables were separated into one of four groups based on type of abuse (e.g., sexual, physical, emotional, neglect). Within each of these groups several subgroups of variables were created to look at the types of abuse in different ways. For example, in the sexual abuse group, one subgroup included the different subtypes of sexual abuse, and another subgroup included variables looking at the frequency of sexual abuse.
A total of 10 families were created and evaluated: history of sexual offending, sexual offense characteristics, sexual offender treatment, child abuse, special education, discipline problems at school, family instability, mental health diagnoses, mental health treatment, and history of non-sexual offending.
Item-selection analyses
The dependent variable in all item-selection analyses was juvenile sexual recidivism, as defined by a new charge for a sexual offense subsequent to the index sexual offense and prior to the juvenile’s 18th birth date. The specified procedures were designed to identify the set of variables that optimally discriminated juvenile sexual recidivists from non-recidivists. During the initial phases of the analyses, continuous variables were used whenever possible. This allowed for testing of both linear and curvilinear relations between predictor variables and juvenile sexual recidivism.
Four distinct steps were used to select optimally discriminating variables. In Step 1, chi-square, biserial correlation, and logistic regression analyses were used to identify all subgroup variables that were significantly associated with sexual recidivism (p < .05). Subgroups that did not yield a single significant variable were eliminated from further analysis.
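As an illustration of the kind of bivariate screening used in Step 1, the following sketch computes a Pearson chi-square test for a dichotomous predictor crossed with recidivism status. It assumes a 2 x 2 table with no continuity correction (the article does not state which variant was used), and the cell counts are hypothetical; only the totals of 636 juveniles and 84 recidivists come from the study.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows: predictor present / absent. Columns: recidivist / non-recidivist.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the chi-square survival function equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical split of the 84 recidivists and 552 non-recidivists by one predictor.
chi2, p = chi_square_2x2(a=40, b=110, c=44, d=442)
retain_for_step_2 = p < 0.05  # screening threshold stated in the text
print(f"chi2 = {chi2:.2f}, p = {p:.4g}, retain = {retain_for_step_2}")
```

Continuous predictors would instead be screened with correlation or logistic regression analyses, which this sketch does not cover.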
Step 2 identified the best marker variable(s) within each subgroup. When only one subgroup variable was significant after the first step, it was retained for further analyses. When more than one subgroup variable was significant, logistic regression analysis was utilized to select the optimal predictor(s) and eliminate redundant predictors within those subgroups. Specifically, all variables within a subgroup were entered simultaneously into a logistic regression. If all Wald chi-square statistics were significant, indicating each was making a unique contribution to the prediction of sexual recidivism status, then all variables in that subgroup were retained for later analyses. When not all variables were significant, one of three strategies was used to identify the best marker variable(s). In the case of suppression effects with only two variables, the stronger of the two variables was retained. In the case of three or more variables, hierarchical logistic regression analysis was utilized to assess variables in different orders to determine the optimal set of predictors. If no pattern emerged, then the variables were collapsed into a single variable. For example, several types of sexual abuse (e.g., history of being fondled, history of being penetrated anally) were correlated with each other, and no clear pattern emerged suggesting one abuse behavior was more predictive of sexual reoffense than others. In addition, the presence of any one of these types of sexual abuse histories resulted in a similar increase in juvenile sexual recidivism rates. In that instance, all sexual abuse history variables of a “hands-on” nature were collapsed into a new variable representing the presence or absence of “hands-on” sexual abuse.
Step 3 of the analyses occurred at the group level. Variables were analyzed within groups using a “drill-down” procedure. This procedure entailed entering variables at the more general group level into the first step of several hierarchical logistic regression analyses followed by the more specific subgroup variables in the second step. This procedure was utilized to determine whether specificity of the variables enhanced the predictive accuracy. Using the sexual abuse group as an example, the variable representing whether the juvenile had ever been the victim of sexual abuse was entered into the first block of the logistic regression followed by the variable representing the presence or absence of “hands-on” sexual abuse, a more specific variable. The results indicated that “hands-on” sexual abuse added to the prediction of sexual reoffense status, but the reverse was not true. Consequently, the more general presence or absence of any sexual abuse variable was dropped from further analyses. To further “drill-down,” the presence of “hands-on” sexual abuse variable was entered into the first step of another logistic regression analysis, followed by the more specific frequency of “hands-on” sexual abuse variable. The results indicated that the frequency variable added to the prediction above the more general presence or absence variable, and the reverse was not true. Thus, only the frequency of “hands-on” sexual abuse variable was retained for the fourth step of analyses.
During the fourth step of the analyses, all variables within families were analyzed to identify the best markers within each of the 10 families. This step also used hierarchical logistic regression and utilized the “drill-down” procedure, but at the family level rather than the group level. One additional element was added, however. Because the best predictor of future behavior is often past behavior, particularly in the area of sexual offending (e.g., Hanson & Morton-Bourgon, 2005; Långström, 2002), we wanted to ensure that variables within families were retained for later analyses only if they both made a unique contribution within their family and significantly predicted sexual recidivism over and above the variables representing past sexual offending. Thus, as a first step, we used hierarchical logistic regression analysis to determine the variables that optimally predicted future sexual recidivism from the sexual offending history family. Once those variables were determined, the variables from each of the remaining families were entered into the second block of separate hierarchical logistic regression analyses, one for each family. Any family of variables that did not add significantly to the prediction of sexual reoffense was dropped from further analyses.
During the final step of analyses, we endeavored to determine the optimal number of families using hierarchical logistic regression analysis. The families were entered in the following order: sexual offending history, sexual offense characteristics, child abuse, sexual offender treatment, special education, school discipline, mental health diagnoses, mental health treatment, family instability, and non-sexual offending history. For a family to be retained, it was required to add significantly to the prediction of juvenile sexual recidivism at the p < .05 level. In addition, variables within significant families were retained only if the associated Wald χ2 was significant at the p < .10 level. When a variable was retained, it remained in the model regardless of how it performed as additional variables were entered.
Although the p value required for Wald χ2 tests associated with the individual items was relaxed in hierarchical logistic regression, each of these items had already passed through several stages of analysis by predicting significant variance at each stage. A variable might be incorrectly identified at the bivariate level, but the likelihood that it would have been retained at the final step is relatively small given the sequential analytic procedures used.
Results
Because of space limitations, none of the individual logistic regression results are reported. In addition, only the strongest variables within each of the 10 families that survived through the fourth step are summarized in Table 1. Those variables with one asterisk made a unique contribution to the prediction of juvenile sexual recidivism relative to other variables within that same family, and variables with two asterisks also made a unique contribution over and above sexual offending history.
Table 1. Selected Variables From 10 Families and Their Bivariate Relations With Juvenile Sexual Recidivism (N = 636).
Note. Variables without an asterisk did not make a unique contribution to the prediction of juvenile sexual recidivism beyond the other variables in the same family. Variables with one asterisk made a unique contribution to the prediction of juvenile sexual recidivism relative to other variables in the same family. Variables with two asterisks made a unique contribution to the prediction of juvenile sexual recidivism relative to other variables in the same family and also contributed uniquely beyond the sexual offending history variable family (or were in that family). ADD = Attention Deficit Disorder; ADHD = Attention Deficit Hyperactivity Disorder; PTSD = Posttraumatic Stress Disorder; JSO = Juvenile who sexually offended.
A summary of the hierarchical logistic regression analyses in the fifth step of analyses to determine the optimal number of families and variables is provided in Table 2. This table lists the seven families that contributed to the prediction of juvenile sexual recidivism at the p < .05 level. It also lists the individual component variables within each family that contributed to the prediction of juvenile sexual recidivism at the p < .10 level.
Table 2. Results of Hierarchical Logistic Regression Analyses With Final Variable Families.
Note. The Block χ2 is the test of the additive contribution of the current block relative to previous blocks in the prediction of juvenile sexual recidivism. The Wald χ2 statistic is the test of the unique contribution to the prediction of juvenile sexual recidivism for each variable relative to other variables in the same block and those in previous blocks.
In total, 12 variables from seven families emerged as optimal predictors of juvenile sexual recidivism. No variables from the mental health diagnoses, mental health treatment, and family instability families were retained based on the rules previously specified for Step 5 of the analyses. All 12 linear effects were significant at the p < .05 level. Of the final 12 variables, one (number of victims) had quadratic and cubic effects that were significant at the p < .10 level.
Performance of the model in the development sample
The Area Under the Receiver Operating Characteristic curve (AUC-ROC) statistic was used to assess the accuracy of the model with the development sample. Because different probability cut-scores can lead to different estimates of accuracy, it is often more desirable to assess the global accuracy of a model using a statistic that considers all possible cut-point levels. The AUC from ROC analysis is one such statistic. A detailed description of this statistic can be found elsewhere (e.g., Quinsey et al., 1998; Swets, 1996; Swets, Dawes, & Monahan, 2000).
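For readers unfamiliar with the statistic, the AUC can be understood as the probability that a randomly selected recidivist receives a higher score than a randomly selected non-recidivist, with ties counted as one half. The sketch below computes it by direct pairwise comparison. This is an illustrative calculation on made-up scores, not the SPSS procedure used in the study, and the function name is hypothetical.

```python
from typing import Sequence


def auc_from_scores(
    recidivist_scores: Sequence[float],
    nonrecidivist_scores: Sequence[float],
) -> float:
    """AUC as the proportion of recidivist/non-recidivist pairs ranked correctly."""
    if not recidivist_scores or not nonrecidivist_scores:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                concordant += 1.0   # recidivist correctly ranked higher
            elif r == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(recidivist_scores) * len(nonrecidivist_scores))


# Made-up risk scores for three recidivists and four non-recidivists.
example_auc = auc_from_scores([3, 5, 7], [1, 2, 3, 4])
print(example_auc)
```

Because every possible pair is compared, the result does not depend on any single cut-score; a value of .50 corresponds to chance-level discrimination and 1.0 to perfect separation.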
The ROC curve for the full model was generated using SPSS. The AUC for this model was .91 (95% confidence interval [CI] = [.87, .94]). Because .50 falls well below the lower bound of the confidence interval, the level of overall accuracy was clearly statistically significant. In addition, this level of accuracy would generally be considered to be very strong. Using the equations and tables found in Rice and Harris (2005), an analogous Cohen’s d effect size for this AUC would be 1.9. However, this level was achieved using the development sample, for which the tool was custom designed. Thus, one would expect shrinkage in this value in subsequent validation studies due to some degree of capitalization on chance relationships in the development sample that do not fully generalize to other samples.
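The correspondence between an AUC and Cohen's d can be approximated under the assumption that scores in the two groups are normally distributed with equal variances, in which case d = √2 · Φ⁻¹(AUC), where Φ⁻¹ is the inverse standard normal distribution function. The sketch below applies that formula; it is offered as one way to approximate the tabled value under that stated assumption, not as the exact calculation used in the article.

```python
import math
from statistics import NormalDist


def auc_to_cohens_d(auc: float) -> float:
    """Approximate Cohen's d from an AUC, assuming equal-variance normal groups."""
    if not 0.0 < auc < 1.0:
        raise ValueError("AUC must lie strictly between 0 and 1")
    return math.sqrt(2.0) * NormalDist().inv_cdf(auc)


d_full_model = auc_to_cohens_d(0.91)  # AUC reported for the full model
print(round(d_full_model, 1))
```

Under this assumption, an AUC of .91 yields a d of approximately 1.9, in line with the value reported above.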
Simplification of the model
The ultimate goal of this project was to develop an empirically based risk assessment tool. The beta-weights of the logistic regression analysis can be used for such purposes. However, such a tool would be difficult for many to understand and implement because of its complexity. Consequently, we explored a simpler categorical scoring system for the final variables in the model and assessed the trade-off between simplicity and predictive accuracy.
The categorical scoring system resulted from an examination of the juvenile sexual recidivism rates associated with the levels of each of the 12 selected items. The first step involved assigning a score of 0 to the level of the variable associated with the lowest rate of juvenile sexual recidivism. For example, juveniles with a history of only one sexual offense adjudication sexually recidivated at a rate of 6.2%. Because this was the lowest recidivism rate associated with any level of that variable, a score of 0 was assigned to juveniles whose index offense was their only sexual offense. Second, we assigned a score of 1 to the next level of the variable that represented a meaningful increase in rate of recidivism. In the example of sexual offense adjudications, juveniles with two adjudications recidivated at a rate of 26.3%, so juveniles with two adjudications received a score of 1 on that variable. Third, an additional score of 1 was associated with each additional, meaningful increase in recidivism rate. In the event that no meaningful increases were apparent with successive levels of the variable, those levels were collapsed together into one scoring category. Continuing the example, juveniles with three adjudications for sexual offenses reoffended at a rate of 35.1%, so all juveniles with three sexual offense adjudications received a score of 2. The distribution beyond four sexual offense adjudications thinned considerably, and JSOs with four sexual offenses had rates of reoffense similar to those with five, six, and seven or more sexual offense adjudications. Consequently, those levels were collapsed into a “four or more” category, in which juveniles reoffended at a 41.4% rate. All juveniles in that last category received a score of 3. With the exception of one variable, all of the final variables had at least 25 juveniles in each level of the variable. See Table 3 for the scoring values associated with all 12 variables.
Table 3. Categorical Scoring for the Final 12 JSORRAT-II Variables.
Note. JSORRAT-II = Juvenile Sexual Offense Recidivism Risk Assessment Tool–II.
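The worked example in the text (number of sexual offense adjudications) can be expressed as a small lookup. The sketch below covers only that one item, using the cut-points and development-sample rates given above; the function and variable names are hypothetical, and the remaining 11 items would need their own cut-points from Table 3.

```python
# Development-sample recidivism rates (%) by item score, as given in the text.
RATE_BY_SCORE = {0: 6.2, 1: 26.3, 2: 35.1, 3: 41.4}


def score_sexual_offense_adjudications(n_adjudications):
    """Map the number of sexual offense adjudications to an item score.

    1 adjudication -> 0, 2 -> 1, 3 -> 2, 4 or more -> 3 (collapsed category).
    """
    if n_adjudications < 1:
        raise ValueError("every case has at least the index adjudication")
    return min(n_adjudications, 4) - 1


for n in (1, 2, 3, 4, 7):
    item_score = score_sexual_offense_adjudications(n)
    print(n, item_score, RATE_BY_SCORE[item_score])
```

Collapsing the sparse upper levels into one category keeps each scoring level populated by enough cases to estimate its recidivism rate.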
Performance of the simplified model
Using total scores from the simplified, categorical scoring system to predict juvenile sexual recidivism in the development sample yielded an AUC of .89 (95% CI = [.85, .92]), which reflected a very minimal decrease in accuracy. Thus, the final 12 variables using categorical scoring in the final model comprised the JSORRAT-II.
The potential range of scores on the JSORRAT-II is 0 to 21. The actual range of scores in the development sample was 0 to 15, and the distribution was positively skewed. As documented in Table 4, roughly two thirds (69.5%, 442/636) of the sample scored at the lower end of the distribution between 0 and 4. The associated rate of juvenile sexual recidivism for this group was 2.7%, demonstrating the ability of the tool to identify a very large number of youth in the development sample who presented very low risk. The other third of the sample scored between 5 and 15 and had an associated juvenile sexual recidivism rate of 37.1%. The predicted probabilities of juvenile sexual recidivism, derived from logistic regression, associated with each score in the development sample are also listed in Table 4.
Table 4. Score Distributions and Predicted Probabilities in the Development Sample.
Although the JSORRAT-II was optimized to predict future juvenile sexual offenses of any kind, it performed equally well in predicting just sexually violent juvenile recidivism. The AUC for predicting sexually violent recidivism was .89 (95% CI = [.86, .93]), identical to the predictive accuracy of the JSORRAT-II in predicting all types of juvenile sexual recidivism.
One concern was the presence of some juveniles in the development sample who were relatively older at the time of their index offense. These JSOs had little time to reoffend prior to leaving the juvenile justice system, potentially reducing the accuracy of the tool and the associated juvenile sexual recidivism rates. To investigate this possibility, selection ratios and rates of sexual recidivism were calculated separately for juveniles under the age of 17 at the time of their index offense, and again for juveniles under the age of 16 at the time of their index offense. These rates and their patterns were very similar across all three subsamples. 1 In addition, the AUC values for each of the three groups were nearly identical (full sample = .89, 17 and younger sample = .88, 16 and younger sample = .88).
Initial Validation of the JSORRAT-II
Results from the development study indicated that the JSORRAT-II was a promising juvenile sexual recidivism risk assessment tool that warranted an investment of additional time and resources to conduct a validation study with an independent sample. The primary purpose of the second study, then, was to assess the predictive validity of the JSORRAT-II with a new, large, and representative sample of juveniles who offended sexually in the state of Utah. Because of jurisdictional and demographic similarities between the development and validation samples, we hypothesized that the JSORRAT-II would exceed chance-level predictive accuracy for sexual recidivism. However, because the items were tailor-made for the development sample, some shrinkage in the indices of predictive validity was expected.
Method
Participants
The validation study utilized juvenile justice case files from 566 male JSOs aged 11 to 18 years who were adjudicated guilty for a sexual offense in Utah in 1996 and 1997 (index offense). The case files represented an exhaustive sample of male JSOs from the state of Utah whose index offense fell within that window. Four case files represented JSOs who were also in the JSORRAT-II development sample (i.e., they also had a sexual offense between 1990 and 1992). They were not excluded from the present sample because they had a new index offense and therefore constituted a new situation and prediction.
At the time of their index sexual offense, JSOs ranged in age from 11.0 to 17.9 years of age. The mean age was 15.0 (s = 1.6). Like the development sample, the validation sample was predominantly White (76.0%). The remaining JSOs were Latino (12.4%), African American/Black (1.4%), Asian/Pacific Islander (1.1%), Multiracial (3.9%), or from some other racial-ethnic background (1.8%). A total of 3.5% of JSOs did not have a listed racial-ethnic background. Unlike the development sample, we were not able to access expunged cases for this cohort. Because of the somewhat unique demographics of the state, we attempted to code for religious affiliation. However, no affiliation was listed in the file for most JSOs (81.8%).
Materials
Juvenile judicial and corrections case files
Juvenile justice case files for all JSOs in the study were located and copied by the staff of the Utah Juvenile Court and the Utah Division of Juvenile Justice Services. All files were transported to the researchers, where they were prepared for scoring. To emulate a prospective study, all case files were arranged chronologically by two undergraduate research assistants. After chronological arrangement, one of the authors removed all information after one of two time periods. First, if the JSO did not recidivate sexually after their 1996 to 1997 index sexual offense, all information found in the case file dated January 1, 2000 or later was removed. Second, if the JSO was identified as having a recidivating offense, all information was removed from the first mention of that offense onward. If the recidivating offense occurred in 2000 or later, all information dated January 1, 2000 or later was removed. These two steps were instituted to ensure sufficient information to code the JSORRAT-II, while ensuring that the coders were blind to the JSO’s recidivism status. The case files varied in content, but the majority contained nearly identical types of forms and content found in case files used in the JSORRAT-II development study.
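The two removal rules can be summarized as a single cutoff date per case file. The sketch below is an illustration only: it assumes that the "first mention" of a recidivating offense can be represented as a date, and it takes the earlier of that date and January 1, 2000, which reproduces both rules as long as an offense is never mentioned before it occurs. All names and the example file are hypothetical.

```python
from datetime import date

# From the text: nothing dated January 1, 2000 or later is retained.
FOLLOW_UP_CUTOFF = date(2000, 1, 1)


def redaction_cutoff(first_mention=None):
    """Date from which file contents are removed.

    first_mention: date the recidivating offense first appears in the file,
    or None for a non-recidivist (hypothetical parameter).
    """
    if first_mention is None:
        return FOLLOW_UP_CUTOFF
    return min(first_mention, FOLLOW_UP_CUTOFF)


def redact(documents, first_mention=None):
    """Keep only (date, label) documents dated before the cutoff."""
    cutoff = redaction_cutoff(first_mention)
    return [doc for doc in sorted(documents) if doc[0] < cutoff]


# Hypothetical case file, already reduced to (date, label) pairs.
case_file = [
    (date(1996, 12, 10), "adjudication order"),
    (date(1997, 1, 20), "probation report"),
    (date(1997, 2, 14), "police report, new offense"),
    (date(2000, 3, 1), "review hearing"),
]
print(len(redact(case_file)))                                    # non-recidivist
print(len(redact(case_file, first_mention=date(1997, 2, 14))))   # recidivist
```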
Sexual recidivism data
The Utah Juvenile Justice Services conducted an electronic search of the statewide juvenile court/juvenile justice services database to generate a list of charges, adjudications, offense dates, charge dates, and adjudication dates for each JSO up to July of 2006. Juvenile sexual recidivists, defined as those JSOs with a formal charge for a new sexual offense prior to age 18, were identified from this list.
To determine which JSOs sexually recidivated, the JSO’s index sexual offense had to be identified from the list. If a JSO had only one sexual offense in the 1996 through 1997 window, that offense was identified as the index sexual offense. In the event that there were two or more sexual offenses within the window, the first of these offenses was identified as the index sexual offense. Sexual recidivism was then defined as any new charge for a sexual offense occurring both after sanction for the index sexual offense and prior to age 18.
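The identification rules above can be sketched as two small functions. This is a hedged illustration, not the procedure actually run against the state database: the record layout is hypothetical, and it assumes that the relevant date for a new charge is the date of the charged offense.

```python
from datetime import date

# Index-offense window from the text: 1996 through 1997.
WINDOW_START = date(1996, 1, 1)
WINDOW_END = date(1997, 12, 31)


def index_offense(sexual_offense_dates):
    """First sexual offense falling inside the 1996-1997 window, or None."""
    in_window = sorted(d for d in sexual_offense_dates
                       if WINDOW_START <= d <= WINDOW_END)
    return in_window[0] if in_window else None


def eighteenth_birthday(birth_date):
    """Date on which the juvenile turns 18."""
    try:
        return birth_date.replace(year=birth_date.year + 18)
    except ValueError:
        # Born on February 29; the year 18 years later is never a leap year.
        return date(birth_date.year + 18, 3, 1)


def is_juvenile_sexual_recidivist(charged_offense_dates, sanction_date, birth_date):
    """True if any newly charged sexual offense occurred after the sanction
    for the index offense and before the 18th birthday."""
    limit = eighteenth_birthday(birth_date)
    return any(sanction_date < d < limit for d in charged_offense_dates)


# Hypothetical juvenile: two offenses in the window, one later charged offense.
offenses = [date(1997, 4, 2), date(1996, 3, 1)]
print(index_offense(offenses))
print(is_juvenile_sexual_recidivist([date(1998, 2, 1)],
                                    sanction_date=date(1996, 5, 1),
                                    birth_date=date(1983, 6, 15)))
```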
A total of 72 (12.7%) JSOs were identified as having a new, recidivating juvenile sexual offense. The proportion of recidivists in the validation and development samples was not significantly different, χ2(1) = 0.06, p > .05. Forty-six of 72 juvenile sexual recidivists (66.7%) recidivated with a sexually violent offense. Of those offenders who were under age 17 (n = 494) at the time of the index offense, 70 (14.2%) sexually recidivated, and of those under age 16 (n = 389) at the time of the index offense, 65 (16.7%) recidivated sexually.
Procedure
Data extraction
Five undergraduate research assistants, with no knowledge of the recidivism status of the JSOs, were trained to extract data from the files to a coding form that included the 12 JSORRAT-II items and about 30 research variables of continuing interest. Outside the substitution of the reduced coding form for the longer coding books, the training procedures were identical to those from the development study.
Reliability cases
A total of 16 cases were identified for reliability purposes. Identification of these cases followed one of two strategies. Initially, four cases were identified at random for the research assistants to score over the first 4 weeks. Then, after approximately 100 cases had been scored, the data were entered into an SPSS datasheet and JSORRAT-II total scores were calculated. From these scores, an additional 12 cases were selected to ensure that the distribution of possible scores was represented in the reliability cases.
Approximately once per week, thereafter, each coder was instructed to score one of these reliability cases. The coders were instructed not to discuss these cases with other coders and to place their coding forms in a separate secure location where the other coders would not have access to their responses. These coding form responses were used to assess interrater reliability.
Data entry
As in the development study, each coding form was double-entered into an SPSS database to identify and correct data entry errors.
Data analysis
Interrater reliability was calculated as an intraclass correlation coefficient (ICC) for absolute agreement. Overall predictive accuracy of the total JSORRAT-II score was assessed in two ways. First, a one-tailed, independent-samples t test was used to determine whether there was a significant difference in the total scores of recidivists and non-recidivists. Second, the AUC-ROC statistic was calculated to determine the predictive accuracy of the JSORRAT-II for juvenile sexual recidivism.
Results
Reliability analyses
All five research assistants coded the same 16 cases over the course of the project. After any data entry errors were resolved, JSORRAT-II total scores were calculated. The median and modal total scores for each individual case ranged from 0 to 11, with the mean total scores ranging from 0 to 11.2. The overall mean score across all reliability cases was 4.26 (s = 3.47).
The singular intraclass correlation (ICC) for absolute agreement, using a two-way mixed effect model, was calculated for the JSORRAT-II total score. This statistic counts baseline differences between raters as error, so it is a conservative and appropriate index of reliability for a risk assessment tool where one is interested in absolute and not just relative agreement.
The singular ICC for absolute agreement for JSORRAT-II total scores in this study was .96 (95% CI = [.92, .98]). As expected, this reliability coefficient is quite high. Very high reliability was expected because each research assistant received extensive didactic and experiential training at the beginning of the project, and they received additional corrective feedback throughout the course of the project. Because of the intensity and duration of training for the coders, this coefficient of reliability cannot be viewed as representative of “real-world” scoring based on a typical 1-day training workshop. However, as noted earlier in this article, the JSORRAT-II is relatively easy to score with appropriate training, leading to the possibility of high “real-world,” interrater reliabilities. Evidence for this possibility exists from an unpublished study of state evaluators who performed assessments in Utah following a 1-day training workshop. In that study of seven evaluators who scored the same 17 cases, the singular ICC for absolute agreement was .91 (Epperson & Ralston, 2006).
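To make concrete why an absolute-agreement ICC treats baseline differences between raters as error, the single-rater coefficient can be computed from the mean squares of a two-way (cases by raters) analysis of variance. The sketch below uses the standard single-measure, absolute-agreement formula and assumes it matches the coefficient reported here; the function name and the toy ratings are hypothetical.

```python
def icc_absolute_single(ratings):
    """Single-measure ICC for absolute agreement.

    ratings: list of rows, one row per case, one column per rater.
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between cases
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                   # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy data: the second rater always scores one point higher than the first.
offset_ratings = [[1, 2], [2, 3], [3, 4]]
print(round(icc_absolute_single(offset_ratings), 3))
```

In the toy data the two raters rank the cases identically, yet the coefficient falls to about .67 because the constant one-point offset is counted as disagreement. A consistency-type ICC would ignore that offset, which is why the absolute-agreement form is the more conservative choice.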
Predictive validity analyses
After double-entry errors were resolved, JSORRAT-II total scores were calculated. Total scores ranged from 0 to 16 in the full sample, with a mean score of 3.56 (SD = 3.26). Similar to the development sample, scores were skewed in a positive direction, with nearly half again scoring 0 to 2 and sexually recidivating at a 7.8% rate and a little more than two thirds of the sample again scoring between 0 and 4 and recidivating at a 9.8% rate. The remaining 30% of the sample scored between 5 and 16, with an overall sexual recidivism rate of 20.8%. The score distribution of the validation sample is presented in Table 5.
Table 5. Score Distributions and Predicted Probabilities in the Validation Sample.
Overall predictive validity
Overall predictive accuracy of the total JSORRAT-II score was assessed in two ways. First, a one-tailed, independent-samples t test confirmed a significant difference between the total scores of recidivists (5.06) and non-recidivists (3.34), t(87.1) = 3.79, p < .05. Cohen’s d for the difference was .53, which is a moderate effect (Cohen, 1988).
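The fractional degrees of freedom reported for the t test suggest that an unequal-variance (Welch) correction was applied, although the text does not state this explicitly. Under that assumption, the test statistic, its degrees of freedom, and a pooled-SD Cohen's d can be sketched as follows; the function names and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical toy scores, not study data.
recidivist_scores = [2, 4, 6, 8]
non_recidivist_scores = [1, 2, 3, 4]
t_stat, df = welch_t(recidivist_scores, non_recidivist_scores)
print(round(t_stat, 3), round(df, 2),
      round(cohens_d(recidivist_scores, non_recidivist_scores), 3))
```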
The AUC-ROC statistic was the second method used to assess overall accuracy of the JSORRAT-II in predicting juvenile sexual recidivism. Using total scores from all 566 JSOs to predict any juvenile sexual recidivism yielded an AUC value of .65 (95% CI = [.59, .72]). This value represented a significant improvement over chance-level prediction and is roughly equivalent to a Cohen’s d of .5 (Rice & Harris, 2005). Total scores predicted sexually violent recidivism at the same level of overall accuracy (AUC = .65, 95% CI = [.57, .73]).
Although the predictive accuracy of the tool with the validation sample was clearly statistically significant, this level of accuracy was substantially reduced compared with the AUC value of .89 found in the development sample. The impact of the lower overall accuracy is also reflected in the predicted probabilities associated with each JSORRAT-II score in the development and validation samples. As summarized in Table 5, the predicted probabilities in the validation sample ranged from .08 to .42. Although this is a significant and helpful spread, it represents about half the range obtained in the development sample (.01-.93). Several post hoc analyses were performed to explore potential explanations for the amount of shrinkage observed.
Exploration of potential impact on accuracy of time since index offense and severity of index offense
Because time from the index offense to age 18 had the potential to influence the predictive accuracy of the JSORRAT-II, the AUC-ROC statistic was recalculated for only those JSOs who were under age 17 at the time of their index offense. The resulting AUC value of .65 (95% CI = [.58, .71]) for those 494 JSOs did not reflect any improvement over the previous estimate for the total sample. Similarly, the AUC value for only the 389 JSOs who were under age 16 at the time of their index offense was .65 (95% CI = [.58, .71]) and did not represent an improvement in prediction over the initial estimate for the full sample.
“Hands-off” sex offenses were excluded during the development and validation of some adult sexual offender risk assessment tools, so differential accuracy for the JSORRAT-II based on severity of the index sexual offense was explored. Severity of the index sexual offense was defined in terms of the level of the charge, misdemeanor or felony. Three AUC-ROC values were calculated: (a) one for JSOs with exclusively misdemeanor sexual offenses, (b) one for JSOs with exclusively felony sexual offenses, and (c) one for those who had at least one felony sexual offense. The AUC values were .63, .65, and .64, respectively, indicating that the charge level of the index sexual offenses did not affect the predictive validity of the JSORRAT-II.
Exploration of the potential impact of missing data on predictive accuracy
According to the JSORRAT-II scoring manual (Epperson, Ralston, Fowers, DeWitt, & Gore, 2006), missing data are to be scored as zero on all items. However, missing data, when scored as a zero, have the potential to deflate the predictive validity of tools such as the JSORRAT-II, and this problem is likely to have more of an impact on recidivists. For example, if the JSORRAT-II is a valid assessment of risk, non-recidivists should score high on only a few items, whereas recidivists should score high on a greater number of items. Thus, any missing data would likely disproportionately impact recidivists, artificially deflating their scores. For this study, legitimate scores of zero were differentiated from scores of zero resulting from missing data.
Of the 566 JSOs in the sample, 400 (70.7%) had complete data for all 12 items of the JSORRAT-II. An additional 122 (21.6%) JSOs were missing data for just one item, 28 (4.9%) were missing data for two items, and only 16 (2.8%) JSOs had missing data for three or more items. The potential impact of missing data on overall predictive accuracy was assessed by calculating ROC statistics for the subsample of JSOs for whom complete data were available and comparing the result with that obtained for the total sample. The resulting AUC = .65 (95% CI = [.58, .73]) demonstrated no improvement in accuracy for the complete data sample. Thus, overall accuracy was not affected by missing data, largely eliminating missing data as a potential explanation of the shrinkage in the index of predictive accuracy. Of course, missing data could still modestly impact the risk estimates associated with JSORRAT-II scores.
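The distinction between a legitimate zero and a zero produced by missing data can be kept explicit by recording missing items separately and converting them to zero only when the total is computed. The following is a minimal sketch of that bookkeeping, using None as a hypothetical marker for a missing item; it is not the coding scheme actually used in the study.

```python
def total_score(item_scores):
    """Total score with missing items (None) scored as zero, per the manual."""
    return sum(0 if s is None else s for s in item_scores)


def n_missing(item_scores):
    """Number of items with no codable information."""
    return sum(s is None for s in item_scores)


def complete_cases(cases):
    """Subsample with all items coded, for a complete-data sensitivity check."""
    return [c for c in cases if n_missing(c) == 0]


# Hypothetical 12-item score vectors; None marks a missing item.
cases = [
    [0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 1, 0],         # complete, genuine zeros
    [1, None, 0, 2, 0, 0, 1, 0, None, 0, 1, 0],   # two items missing
]
print([total_score(c) for c in cases])
print(len(complete_cases(cases)))
```

Recomputing the AUC on the complete-case subsample, as described above, then amounts to running the same accuracy analysis on the output of `complete_cases`.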
Discussion
The purpose of this project was to empirically develop and validate a tool that could adequately assess the risk of juvenile sexual recidivism with male JSOs. The 12 JSORRAT-II items that emerged from the item-selection process can be scored through review of information typically found in most juvenile court and corrections files. The items are generally behaviorally anchored, requiring relatively little interpretation on the part of the evaluator. In addition, the categorical scoring system makes the JSORRAT-II uncomplicated and intuitive to use. Finally, because of the exhaustive sampling procedures, the JSORRAT-II can theoretically be used with the entire spectrum of sexual offending by juvenile males presuming, of course, successful validation and replication.
The JSORRAT-II predicted juvenile sexual recidivism well above chance level in the development sample. In addition, JSORRAT-II scores were positively skewed and yielded a wide range of associated juvenile sexual recidivism rates (predicted probabilities from .01 to .93). This suggested that the majority of JSOs were very low risk and that this group could be identified. For example, 305 JSOs (48% of the sample) scored between 0 and 2 and only 3 of those JSOs sexually recidivated as juveniles (1% rate). An additional 22% of the sample scored between 3 and 4, with an associated juvenile sexual recidivism rate just under 7%. At the other end of the spectrum, the JSORRAT-II identified approximately 30% of the sample with significantly elevated rates of reoffense (24%-82%). This level of promise clearly justified the investment of additional resources in reliability and validation studies.
The results of reliability studies confirmed that the JSORRAT-II can be reliably scored with appropriate training. Our lab workers, who were very highly trained, achieved excellent reliability, as reflected in the singular ICC = .96 for absolute agreement. In addition, state evaluators with a 1-day workshop also demonstrated high reliability, as reflected in a singular ICC = .91 for absolute agreement (Epperson & Ralston, 2006).
The predictive accuracy achieved by the JSORRAT-II with the Utah validation sample was statistically significant and demonstrated a moderate effect size. Thus, the validation study was successful. However, the index of predictive accuracy (AUC = .65) demonstrated a reduction of about 25% relative to performance in the development sample (AUC = .89). The distribution of scores in the validation sample was markedly similar to that of the development sample, with about 48% of the sample scoring 0 to 2 and another 22% scoring 3 to 4. In both samples, the associated recidivism rates of these two groups were well below the base rate. Consistent with the reduction in overall accuracy, the differences in the validation sample were not as great, and the range of predicted probabilities in the validation sample was 8% to 42%. Although significant and meaningful, this reflects a substantial reduction in range in comparison with the development sample.
There is always some capitalization on chance relationships in development samples that results in shrinkage in the index of predictive accuracy. However, it does not follow that all the shrinkage is due to capitalization on chance, particularly when comparing indices of predictive accuracy for a single validation sample with those from the development sample. We were able to explore some potential systemic explanations for the shrinkage. By comparing the predictive accuracy of the JSORRAT-II in carefully constructed subsamples, we were able to rule out missing data, severity of index sexual offense, and time from index offense to age 18 as contributors to the reduced level of accuracy in the validation sample because the AUCs for each subsample were essentially unchanged relative to the AUC for the total sample.
Although there clearly was considerable capitalization on chance relationships in the development sample, it is likely that two additional methodological factors and several juvenile justice factors also contributed to the observed shrinkage. The first potential methodological contributor relates to the exclusion of expunged cases in the validation sample. Unlike many states, Utah does not officially expunge juvenile case files when JSOs turn 18. Instead, the juvenile must petition the court and demonstrate both exemplary behavior throughout their time under court jurisdiction and a low risk of reoffending in any way.
In contrast to the development sample, the validation sample did not include expunged cases because the authors were not granted access to them. However, some information about these offenders is known. Using the recidivism database, which included recidivism data for all juveniles adjudicated within the specified time period, the authors identified 40 JSOs whose cases were officially expunged. None of these JSOs sexually recidivated prior to age 18, similar to the development sample. Given that these cases would likely score low on a tool such as the JSORRAT-II, as was the case for those in the development sample, their inclusion would most likely improve overall validity indices.
The second potential methodological contributor was the conservative strategy used to ensure that files contained no information about recidivating offenses. As mentioned earlier, after case files were chronologically ordered, one of the authors removed information based on one of two rules. First, if the JSO was a non-recidivist, all information after December 31, 1999, was removed, and second, if the JSO was a recidivist, all information from the first mention of the recidivating offense was removed. If the JSO’s recidivating sexual offense occurred after December 31, 1999, then all information was removed per the first rule. Although this strategy was essential to keep the research assistants blind to the recidivism status of the JSOs, it may also have resulted in the removal of information that was relevant to the scoring of the JSORRAT-II items. For example, if a JSO was adjudicated for his index offense in December of 1996 and had a recidivating offense in February of 1997, only a few months separated the adjudication and the first mention of the second offense. Typically, information relevant to many background items, as well as some offense-related items, was found in psychological, treatment, and probation reports that originated weeks to several months after the adjudication. Some of that information was, therefore, unavailable for such an offender, effectively deflating the recidivist’s scores on those items. Conversely, a 12-year-old may have committed an index sexual offense in November 1997 and been under court supervision until a recidivating offense in 2002, but there would be no records after December 31, 1999, with similar potential effects on scoring.
There are also some possible juvenile justice system contributions to the observed shrinkage. In the development of the JSORRAT-II, an effort was made to utilize what should be the most consistently available and reliable information in files. For offense-related information, that meant that we relied on officially charged and/or adjudicated offenses. Consequently, Items 1 to 6 on the JSORRAT-II are dependent on charging and adjudication practices, so it is possible for scores on these items to vary with changes in charging and adjudication practices.
The development sample utilized JSOs adjudicated guilty for a sexual offense from 1990 through 1992, while the validation sample utilized cases from JSOs adjudicated guilty for a sexual offense in 1996 and 1997. Between those two time periods, state and federal governments passed laws (e.g., The Jacob Wetterling Crimes Against Children Act of 1994) requiring states to set up systems to track sexual offenders through registration and community notification. It is possible that these laws and other temporal changes to policies and procedures could have increased or decreased the frequency with which juveniles were charged and/or adjudicated for sexual offenses.
Given the number of JSOs in the development and validation samples, it appears that there was an increase in juveniles adjudicated for sexual offenses. The development sample included 636 JSOs adjudicated guilty of a sexual offense over a 3-year window, an average of 212 per year. In contrast, the validation sample included 566 JSOs adjudicated guilty over a 2-year window, an average of 283 per year. This represents an increase of roughly one third.
In response to an open-ended query, state officials confirmed that charging and adjudication practices changed rather sharply in 1995. Specifically, more minor sexual offenses that historically would have been handled outside the juvenile justice system were charged and handled through the juvenile court beginning in 1995. The primary reason cited for this change was the expiration of a state grant that had funded treatment and programming options for non-adjudicated youth. Beginning in 1995, funded treatment and programming was more likely to require adjudication.
The passage of registration and community notification laws may have also impacted sexual recidivism rates in yet another way—by stimulating more effective risk management practices that reduce the threat posed by high risk offenders to the public (see Epperson & Ralston, in press). Adult sexual recidivism rates have declined substantially in the last 15 to 20 years (Helmus, Hanson, & Thornton, 2009), and at least some studies suggest that effective risk management has played a significant role in those declines (e.g., Duwe & Donnay, 2008). Thus, it is possible that the greater number of high-scoring non-recidivists in the validation sample was, in part, due to increased secure placements, supervision, and attention that higher risk JSOs received in the post-1995 world. If this was the case, then it is possible that the danger to the community posed by some of the higher scoring youth was lowered through the imposition of external controls, effectively deterring them from recidivating despite higher risk scores. In the absence of the increased supervision and attention, it is possible that the rate of recidivism for high-scoring non-recidivists would have approximated pre-1994 levels, which would have had the impact of increasing the indices of predictive validity in this sample. For sexual recidivism base rates to be so similar across the two samples, under this theory, additional recidivists would have to come disproportionately from the lower risk ranks because of less monitoring and supervision.
All of the previous explanations are plausible, though speculative. It is probably not the case that one explanation encompasses all of the reasons for the general validity shrinkage. Instead, it is more likely that some combination of two or more of the explanations was at work in the present study, in addition to capitalization on chance.
Whatever the reason or reasons, the reduced predictive accuracy of the JSORRAT-II has important implications for the prediction of risk with JSO populations. Specifically, although the level of accuracy achieved with the validation sample is significant and may be sufficient to inform a range of placement, programming, and treatment decisions that impact the juvenile for a finite and relatively short period of time, it suggests that it may be very difficult to achieve the level of longer term (into adulthood) predictive accuracy required to inform longer term actions, such as those mandated by many current laws (e.g., registration and community notification for 15-25 years, civil commitment for an indeterminate period of time, etc.).
Being subjected to registration and community notification requirements has the potential to carry detrimental effects for adults, including loss of opportunity to engage in pro-social activities, personal safety, and hope (Levenson & Cotter, 2005; Tewksbury, 2005). Given the differences in cognitive, emotional, and social structures of juveniles, it is not a big leap to assume that these effects may have a much greater negative impact on juveniles (Caldwell, 2007). As a result, decisions about imposing registration and community notification requirements on juveniles must carry with them a high degree of confidence, so as to minimize the number of false positives and the detrimental effects to those JSOs who are least likely to reoffend sexually. At this time, neither the JSORRAT-II nor any other juvenile risk assessment tool has demonstrated the level of longer term accuracy required to inform such decisions.
However, the problem of reduced accuracy has less impact for decisions that carry fewer penalties for making false positive predictions. Decisions about programming, supervision, and treatment may not require the same level and duration of predictive accuracy because the consequences of making a false positive prediction are much less than the long-term stigmatization, harassment, and lost opportunity during critical periods of development and maturation that are likely to occur from being subjected to community notification and registration (Levenson & Cotter, 2005; Tewksbury, 2005). This is not to say that false positive predictions for these types of decisions carry no detrimental effects (e.g., increased financial responsibility by the state, possibility for contagion effects, deprivation of freedom), but these effects are less intrusive and at least partially mitigated by the moderate predictive accuracy of tools like the JSORRAT-II.
Strengths, Limitations, and Future Research
The present study utilized two large, exhaustive samples of JSOs adjudicated for a sexual offense in Utah at two different points in time. The advantage of using such samples is that they are representative of the full spectrum of JSOs in an entire state, unlike smaller samples of convenience used in many studies reported in the literature. Because smaller samples of convenience (e.g., JSOs in a specific treatment program or placement setting) are often very homogeneous, they may not reveal significant risk factors because a given characteristic is shared by nearly everyone in the sample. The resulting restricted range would attenuate metrics used to assess association with sexual recidivism. In addition, any effects found would be very narrowly generalizable to that population. However, the results of this study, though geographically bounded, are more likely to generalize to the full spectrum of juvenile sexual offending because the sample included JSOs adjudicated for all types of sexual offenses, placed or not placed in secure facilities, mandated or not mandated for sexual offender specific treatment, and so on.
Yet, some concerns about the generalizability of the results do remain. The study utilized JSOs exclusively from the state of Utah. The two samples were predominantly White (76.2%), whereas the U.S. population is approximately 69% non-Hispanic White (U.S. Census Bureau, 2001). Although the religious affiliation of JSOs could not be determined in most instances, the validation sample clearly included a larger proportion of JSOs subscribing to the Mormon/Latter-Day Saints (LDS) religion than found in the remainder of the United States, given that 14.3% of the sample clearly identified as LDS. The American Religious Identification Survey (Kosmin, Mayer, & Keysar, 2001) conducted by the U.S. Census Bureau found that approximately 1.34% of Americans self-identify as LDS. Given the geographic and related characteristics of the sample, along with the dependence of some items on educational and juvenile justice practices, one cannot assume generalizability of the results to samples from other geographic areas with different compositions. Given that the sample included only male JSOs, by design, the results clearly cannot be assumed to be relevant to female JSOs.
A second strength involves the methodology of this particular study. Extensive measures were taken to keep research assistants blind to the recidivism status of the JSOs. Furthermore, research assistants received extensive training, and data entry error was eliminated through a double-entry process. The result of these methodological conditions was a high level of reliability in scoring the case files. However, despite the reliability of scoring, the predictive accuracy of the tool in the validation attempt may have been compromised through the elimination of key information for some recidivists, as described earlier.
Another limitation is that the present studies did not fully account for time at risk. Although we could examine differences in age at index offense, and therefore time until age 18, this does not truly mean that all JSOs were equally at risk during the years between the index offense and when they turned 18. Some JSOs may have been in the community that entire time, others may have been in a secure placement the entire time, and others may have spent some time in the community and some time in secure placements. Although it is certainly possible to sexually offend in such facilities, the opportunity to reoffend is likely much reduced. One might hypothesize that accounting for true time at risk, outside of secure facilities, may provide a more accurate index of predictive accuracy.
The last limitation pertains to the underreported nature of sexual offenses in general. Results from recidivism studies are often underestimates because of the nature of these types of crimes (Hanson & Bussière, 1998). As such, these results must be interpreted knowing that not all first-time offenders were detected initially and not all recidivists were detected after entering the system for their index offense.
With these strengths and limitations in mind, several future research directions seem warranted. First, to assess the predictive validity of the JSORRAT-II, studies must be conducted in other states that have different geographic locations, racial and ethnic compositions, and dominant religious affiliations. This process is underway. Second, it is possible to follow the JSOs in this study for longer times at risk, into adulthood. Consequently, a future study will seek to assess the performance of the JSORRAT-II for longer term predictions of risk. Although we are not optimistic about the ability of the tool to make accurate longer term predictions, this is an empirical question. Third, future studies will account for time at risk outside of secure facilities.
Conclusions and Recommendations
Because of its successful validation, the JSORRAT-II remains a promising juvenile sexual recidivism risk assessment tool. Certainly within Utah, this tool, along with psychological and needs assessment, can productively inform a range of shorter term clinical decisions, such as placement, programming, and treatment decisions. Its use to inform similar decisions outside of states where it has been validated should be considered experimental at this time. Additional planned studies will help clarify the predictive accuracy of the JSORRAT-II in other jurisdictions and potentially expand its usefulness. All uses of the JSORRAT-II, experimental or otherwise, require that the tool be scored accurately and reliably. Extraordinarily high levels of reliability were achieved through the procedures used in our laboratory. However, field-workers also demonstrated high reliability following a day-long scoring workshop with the authors.
The JSORRAT-II has not been validated as a predictor of adult behavior, so all risk assessments based on the JSORRAT-II necessarily expire no later than the 18th birthday. Accordingly, it would not be appropriate or informative to use JSORRAT-II scores to justify the longer term consequences of some contemporary sexual offender laws. However, the aggregated research data in the two samples seem to support arguments for exempting juveniles, or at least the vast majority of juveniles, from laws with longer term consequences.
Footnotes
Acknowledgements
Special thanks to Dave Fowers and John Dewitt for coordinating the identification and transportation of case file information.
Authors’ Note
This research was completed in collaboration with the Utah Juvenile Court, the Utah Division of Juvenile Justice Services, and Utah’s Network on Juveniles Offending Sexually statewide organization. Validation evidence reported in this article derives from Christopher A. Ralston’s doctoral dissertation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by funding from the Utah State Juvenile Justice Services Division.
