Abstract
Assessment of sexual and nonsexual recidivism risk is important for juveniles who have offended sexually (JSOs). It is unclear whether clinicians who assess risk for both types of recidivism should use a JSO-specific measure alone or in combination with an assessment of other potential risk factors, such as psychopathy. Using a sample of 72 JSOs, this study examined the reliability and validity of a scale (Scale P) intended to assess psychopathic traits comprised of seven items from the Juvenile Sex Offender Assessment Protocol–Revised (J-SOAP-II). Scale P demonstrated adequate internal consistency and was significantly correlated with the Hare Psychopathy Checklist–Youth Version (PCL:YV). In addition, Scale P significantly predicted nonsexual and sexual recidivism as well as the PCL:YV and was a significantly stronger predictor of nonsexual recidivism than several of the preexisting J-SOAP-II scales. These preliminary findings suggest that Scale P may enhance the clinical utility of the J-SOAP-II.
Recidivism risk assessment is integral to the juvenile justice system (Hempel, Buck, Cima, & van Marle, 2013). Risk assessment is particularly important for one group of juvenile offenders: those who have offended sexually (JSOs). Although relatively few JSOs sexually reoffend, with recent research demonstrating a significant decrease in the sexual recidivism rate among JSOs over the past 15 years (2.75%; Caldwell, 2016), public misperceptions about the risk JSOs pose to community safety have led to the imposition of increasingly punitive sanctions. In some jurisdictions, JSOs may be civilly committed or required to register in a public sex offender registry (Letourneau & Caldwell, 2013). Because of the severity of these consequences, clinicians frequently are called upon to assess JSOs’ recidivism risk before treatment, placement, and discharge determinations are made.
Over the past two decades, a handful of measures have been developed to assist clinicians in assessing recidivism risk among JSOs (Hempel et al., 2013). One widely used JSO risk assessment measure is the Juvenile Sex Offender Assessment Protocol–Revised (J-SOAP-II; Prentky & Righthand, 2003). Designed for use with male JSOs 12 to 18 years old, the J-SOAP-II is comprised of 28 static and dynamic risk factors for sexual and nonsexual recidivism. Scores on these items are summed to generate a total score, four scale scores (Scale I: Sexual Drive/Preoccupation; Scale II: Impulsive/Antisocial Behavior; Scale III: Intervention; Scale IV: Community Stability/Adjustment), and two summary scale scores (Static, comprised of Scales I and II, and Dynamic, comprised of Scales III and IV).
Despite the widespread use of the J-SOAP-II, evidence of its predictive validity is equivocal. Some studies have found that the total score significantly predicts both sexual and nonsexual recidivism (Martinez, Flores, & Rosenfeld, 2007; Prentky et al., 2010; Rajlic & Gretton, 2010), whereas others have found that the total score predicts only nonsexual recidivism (Aebi, Plattner, Steinhausen, & Bessler, 2011; Chu, Ng, Fong, & Teoh, 2012). Support for the predictive validity of the scale scores is also mixed, with some studies finding that only Scale I (Sexual Drive/Preoccupation) predicts sexual recidivism (Chu et al., 2012; Viljoen et al., 2008), and others finding that Scales II (Impulsive/Antisocial Behavior) and IV (Community Stability/Adjustment) are significant predictors as well (Aebi et al., 2011; Petersen, 2011; Prentky et al., 2010). Still other studies have found that none of the J-SOAP-II scales significantly predict sexual recidivism (see Viljoen, Mordell, & Beneteau, 2012, for review).
The findings of a recent meta-analysis suggest that the inconsistency in the JSO risk assessment literature is not limited to the J-SOAP-II. Viljoen and colleagues (2012) analyzed 33 studies that examined the predictive validity of the J-SOAP-II and two other commonly used JSO-specific instruments, the Estimate of Risk of Adolescent Sexual Offense Recidivism (ERASOR; Worling & Curwen, 2001) and the Juvenile Sexual Offense Recidivism Risk Assessment Tool–II (J-SORRAT-II; Epperson, Ralston, Fowers, DeWitt, & Gore, 2006). The authors found that these tools were only moderate predictors of both sexual and nonsexual recidivism, with mean area under the curve (AUC) estimates ranging from .64 to .70. They also found significant heterogeneity in AUC estimates across studies for all three instruments, indicating that there is inconsistent support for the use of these instruments in assessing recidivism risk.
Because of the mixed support for JSO-specific risk assessment tools, practice standards for JSO risk assessment recommend that clinicians use these instruments as only one component of a comprehensive assessment process that incorporates information from a variety of sources and measures (Association for the Treatment of Sexual Abusers, 2012; Vitacco, Caldwell, Ryba, Malesky, & Kurus, 2009). Consequently, clinicians often assess other potential risk factors for recidivism not included in JSO risk assessment tools, including psychopathy (Viljoen, MacDougall, Gagnon, & Douglas, 2010; Viljoen, McLachlan, & Vincent, 2010). Psychopathy is a personality disorder characterized by a constellation of interpersonal, affective, and behavioral traits (Hare, 1991, 2003) that is a well-documented risk factor for both nonsexual and sexual recidivism in adult offenders (Gendreau, Goggin, & Smith, 2002; Gendreau, Little, & Goggin, 1996; Hanson & Morton-Bourgon, 2005; Hawes, Boccaccini, & Murrie, 2013; Hemphill, Hare, & Wong, 1998; Leistico, Salekin, DeCoster, & Rogers, 2008; Salekin, Rogers, & Sewell, 1996). Although some debate exists regarding the optimal way to measure psychopathy (e.g., Cooke, Hart, Logan, & Michie, 2012; Skeem & Cooke, 2010), Hare’s conceptualization of psychopathy, operationalized by the Hare Psychopathy Checklist–Revised (PCL-R; Hare, 1991, 2003) for adults and the Hare Psychopathy Checklist–Youth Version (PCL:YV; Forth, Kosson, & Hare, 2003) for juveniles, is the most widely accepted (Kotler & McMahon, 2010). The PCL instruments contain 20 items that assess interpersonal characteristics (e.g., superficial charm), affective traits (e.g., lack of empathy), and behavioral patterns (e.g., criminal versatility). These instruments yield a dimensional score representing the number and severity of psychopathic traits an individual displays.
Although clinicians often assess psychopathic traits when evaluating recidivism risk in juvenile offenders, there is controversy regarding the propriety of psychopathy assessment in youth. Some contend that psychopathy cannot be validly assessed in juveniles because their personalities have yet to crystallize, citing the fluctuation of psychopathic traits during normal adolescent development (e.g., Edens, Skeem, Cruise, & Cauffman, 2001; Edens & Vincent, 2008; Seagrave & Grisso, 2002). Others raise concerns about the potentially stigmatizing effect of psychopathy assessment for youth, citing research suggesting that labeling youth as psychopathic adversely impacts how judges, clinicians, and lay people view them (Chauhan, Reppucci, & Burnette, 2007; Edens, Guy, & Fernandez, 2003; Murrie, Boccaccini, McCoy, & Cornell, 2007). Taken together, these concerns have led some to advise clinicians to exercise significant caution when evaluating psychopathic traits in youth, particularly those involved in the juvenile justice system (Edens et al., 2001; Edens & Vincent, 2008; Seagrave & Grisso, 2002).
The controversy regarding psychopathy assessment in youth notwithstanding, there is a small but growing body of evidence that psychopathic traits remain moderately stable during adolescence and the transition to adulthood (Blonigen, Hicks, Krueger, Patrick, & Iacono, 2006; Frick, Kimonis, Dandreaux, & Farell, 2003; Loney, Taylor, Butler, & Iacono, 2007; Lynam, Caspi, Moffitt, Loeber, & Stouthamer-Loeber, 2007), providing some support for the validity of psychopathy ratings in adolescents. There is also evidence that psychopathy is associated with recidivism in youth, with a handful of meta-analyses showing that psychopathy significantly predicts general recidivism in this population (Asscher et al., 2011; Edens, Campbell, & Weir, 2007; Olver, Stockdale, & Wormith, 2009). The relationship between psychopathic traits and recidivism among JSOs is less clear. Although most studies that have examined this relationship revealed that psychopathy significantly predicts nonsexual recidivism in this population (Caldwell, Ziemke, & Vitacco, 2008; Gretton, Hare, & Catchpole, 2004; Gretton, McBride, Hare, O’Shaughnessy, & Kumka, 2001; Langström & Grann, 2000; Viljoen, Elkovitch, Scalora, & Ullman, 2009), no firm conclusions regarding this relationship can be drawn given the dearth of research in this area. Moreover, the few studies that have examined the association between psychopathic traits and sexual recidivism in JSOs have yielded mixed results. The majority of these studies have found that psychopathy is not significantly associated with sexual recidivism (Gretton et al., 2004; Gretton et al., 2001; Langström & Grann, 2000; Viljoen et al., 2009). However, two studies to date have found a significant relationship (Caldwell et al., 2008; Parks, 2004), and one study actually demonstrated the opposite association (Auslander, 1998). 
Given the well-documented association between psychopathy and recidivism among adult sex offenders and the controversy surrounding psychopathy assessment in youth, additional research on the association between psychopathic traits and recidivism in JSOs is needed.
In sum, the JSO risk assessment literature leaves a number of questions unanswered. As Viljoen et al. (2009) have observed, JSOs are far more likely to reoffend nonsexually than sexually (see Caldwell, 2010; McCann & Lussier, 2008), highlighting the importance of assessing risk for both types of recidivism in this population. Yet, it remains unclear how clinicians should structure their risk assessments to maximize predictive accuracy for sexual and nonsexual recidivism alike. A number of JSO-specific risk assessment tools have been developed to aid clinicians in assessing recidivism risk, but support for the predictive accuracy of these tools is mixed. In addition, although psychopathy may hold promise as a risk factor for recidivism in JSOs by virtue of its association with recidivism in adult sex offenders, very little research has examined this relationship. Risk assessments can have serious, long-term consequences, from influencing a youth’s course of treatment to determining whether he should stand trial in adult court (Vitacco, Salekin, & Rogers, 2010). Consequently, further study of the predictive accuracy of both JSO-specific risk assessment tools and psychopathy for JSOs is warranted.
The present study sought to extend the JSO risk assessment literature by examining whether psychopathic traits can be assessed using items drawn from a commonly used JSO risk assessment tool, the J-SOAP-II. Because clinicians working in the juvenile justice system face time and resource constraints, and given the well-established association between psychopathy and recidivism in adult sex offenders, it may be possible to maximize the clinical utility of the J-SOAP-II by using it to assess psychopathic traits. More generally, this study also sought to examine the relationship between psychopathy and recidivism, both nonsexual and sexual, in JSOs. We hypothesized that a scale comprised of J-SOAP-II items that have face validity in assessing psychopathic traits (Scale P) would demonstrate adequate internal consistency and interrater reliability, and would be significantly correlated with the PCL:YV, a widely used measure of psychopathic traits in youth. Moreover, we hypothesized that psychopathy, as measured by Scale P, would predict sexual and nonsexual recidivism as accurately as the PCL:YV total score.
Method
Participants
This study utilized a retrospective file review of 72 male JSOs who had been discharged from one of two New Jersey Juvenile Justice Commission (NJJJC) facilities, the Pinelands Residential Community Home (Pinelands; n = 32) and the New Jersey Training School (NJTS; n = 40), between 1998 and 2008. At discharge, participants’ ages ranged from 14 to 18 years, with a mean age of 17.28 years (SD = 0.96). The majority of participants were African American (65.3%, n = 47); 19.4% (n = 14) were Caucasian, 13.9% (n = 10) were Latino, and one participant (1.4%) was classified as having an “other” racial/ethnic background. Participants spent an average of 14.44 months in NJJJC custody (SD = 6.77). More than half were discharged to their family or a foster home (65.3%, n = 47), with 15.3% (n = 11) discharged to a residential treatment program and 6.9% (n = 5) discharged to another correctional facility; discharge location for the remaining 12.5% (n = 9) of participants was unavailable.
The majority of the sample had committed a sexual offense (63.9%, n = 46), but 36.1% (n = 26) were incarcerated for a nonsexual offense and were placed in sex offender treatment because of prior sexual offenses. The most common sexual offense charge was aggravated sexual assault (e.g., sexual assault involving penetration and a child under the age of 13, the use of a weapon, or the use of physical force or coercion and consequent severe personal injury; 40.3%, n = 29), followed by sexual assault (27.8%, n = 20), sexual contact (15.3%, n = 11), and noncontact offenses (e.g., lewdness, harassment, endangering the welfare of a minor; 16.7%, n = 12).
Measures
J-SOAP-II
The J-SOAP-II is an empirically informed 28-item checklist of static and dynamic risk factors for recidivism. Each item is scored on a 0- to 2-point scale, with a 0-point score reflecting the absence of a risk factor, a 1-point score reflecting some evidence of a risk factor, and a 2-point score reflecting the clear presence of a risk factor. The J-SOAP-II generates seven scores: a total score, four scale scores (Scale I: Sexual Drive/Preoccupation, Scale II: Impulsive/Antisocial Behavior, Scale III: Intervention, and Scale IV: Community Stability/Adjustment), and two summary scale scores (Static and Dynamic). As all risk factors are weighted equally, these scores are generated by summing the relevant items. No cutoff scores for risk-level classifications have been established (Prentky & Righthand, 2003). Consistent with prior research (e.g., Martinez et al., 2007; Parks & Bard, 2006), the J-SOAP-II total score and scales displayed good interrater reliability in the present sample, intraclass correlation coefficient (ICC; k = 2, n = 25) = .88, .87, .70, .90, and .74 for the total score, Static summary scale, and Scales I, II, and III, respectively. Internal consistency was adequate for the total score (α = .71) and the scales (α = .66, .80, .70, .92, for the Static summary scale and Scales I, II, and III, respectively).
The authors of the current study created a scale within the J-SOAP-II to assess psychopathic traits (Scale P). Scale P is comprised of seven items: Item 10 (Pervasive Anger), Item 11 (School Behavior Problems), Item 13 (Juvenile Antisocial Behavior), Item 15 (Multiple Types of Offenses), Item 17 (Accepting Responsibility for Offenses), Item 20 (Empathy), and Item 21 (Remorse and Guilt). These items were drawn from two of the four scales, Scale II (Impulsive/Antisocial Behavior) and Scale III (Intervention), and were selected based on their face validity in assessing traits or behaviors consistent with psychopathy. In the present sample, Scale P had good interrater reliability, ICC (k = 2, n = 9) = .79, and strong internal consistency (α = .83).
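The scoring and internal-consistency calculations described above can be sketched in Python. This is an illustration only, not the authors' scoring procedure: the function names, the dictionary layout for item ratings, and the toy ratings are all assumptions introduced here. Only the seven item numbers and the 0- to 2-point rating scale come from the text.

```python
from statistics import pvariance

# Item numbers the text assigns to Scale P; each J-SOAP-II item is rated 0, 1, or 2.
SCALE_P_ITEMS = (10, 11, 13, 15, 17, 20, 21)


def scale_p_score(ratings):
    """Sum the seven Scale P item ratings for one youth.

    `ratings` is a hypothetical dict mapping J-SOAP-II item number -> rating.
    """
    for item in SCALE_P_ITEMS:
        if ratings[item] not in (0, 1, 2):
            raise ValueError(f"item {item} must be rated 0, 1, or 2")
    return sum(ratings[item] for item in SCALE_P_ITEMS)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-person lists of item scores.

    Population variances are used throughout; the n/(n-1) correction
    cancels in the ratio, so sample variances give the same alpha.
    Undefined (division by zero) if every person has the same total.
    """
    k = len(rows[0])
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (not from the study): one youth's ratings on all 28 items.
ratings = {item: 0 for item in range(1, 29)}
ratings.update({1: 2, 10: 2, 20: 1})  # item 1 is not a Scale P item
example_score = scale_p_score(ratings)

# Toy item matrix in which the items agree perfectly across three people.
example_alpha = cronbach_alpha([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
```

Because the seven items are weighted equally, Scale P scores can range from 0 to 14.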
PCL:YV
The PCL:YV is a 20-item rating scale intended to assess psychopathy in male and female adolescents between 12 and 18 years old (Forth et al., 2003). Items on the PCL:YV have been divided into four facets (Hare, 2003): interpersonal (Facet 1; four items), affective (Facet 2; four items), lifestyle (Facet 3; five items), and antisocial (Facet 4; five items); two items are excluded from this model (impersonal sexual behavior, unstable interpersonal relationships). These facets comprise two factors, with Factor 1 (Interpersonal/Affective) being comprised of Facets 1 and 2, and Factor 2 (Lifestyle/Behavioral) being comprised of Facet 3 and four items from Facet 4; criminal versatility, a Facet 4 item, is excluded from this model.
In the present sample, internal consistency was good for the PCL:YV total score (α = .83), but highly variable for the facet and factor scores. Facets 1 and 2 had extremely poor internal consistency (α = .11 and α = .28, respectively), whereas internal consistency was somewhat better for Facets 3 and 4 (α = .61 and α = .51, respectively). Factor 1 also had very poor internal consistency (α = .22), whereas Factor 2 demonstrated good internal consistency (α = .77). Interrater reliability, in contrast, was strong for the total, factor, and facet scores, ranging from ICC (k = 2, n = 20) = .82 (Facet 1 and Factor 2) to ICC = .90 (Facet 3).
Procedures
The procedures used in this study were approved by the Fordham University Institutional Review Board and the NJJJC Office of Policy, Research, and Planning.
J-SOAP-II Ratings
The present study utilized data from a previous study (Martinez, Rosenfeld, Cruise, & Martin, 2015) that completed J-SOAP-II ratings based on a file review of clinical and legal system records available at discharge for each participant. A second rater, a master’s-level clinician, rated a random subset of these files (n = 9) to permit assessment of interrater reliability. Both the primary and secondary raters had more than 10 years of experience assessing and treating JSOs, and the primary rater had previously published research regarding the J-SOAP-II. Both raters were blind to recidivism status, but had access to information regarding treatment completion by virtue of reviewing participants’ files. The extent of records available differed across cases, but typically included demographic information, legal documents (e.g., victim and witness statements, police reports, pleadings), and treatment information (e.g., incident reports, progress notes, quarterly treatment summaries). Ratings were completed in accordance with the J-SOAP-II manual (Prentky & Righthand, 2003), and were used to generate J-SOAP-II Scale P scores as described above. Scale IV (Community Stability/Adjustment) was omitted, as all participants were incarcerated in a correctional or treatment facility when the J-SOAP-II was rated (see Prentky & Righthand, 2003). Consequently, the Dynamic summary scale could not be calculated, and the total J-SOAP-II score was calculated by summing Scales I, II, and III, as has been done in prior research using samples of incarcerated youth (e.g., Caldwell et al., 2008; Parks & Bard, 2006).
PCL:YV Ratings
For the present study, PCL:YV ratings were completed based on an independent, retrospective review of participants’ files by the first author, who had not participated in and was blind to the original J-SOAP-II ratings. A second rater rated a random subset of these files (n = 20) to permit assessment of interrater reliability. Both raters were doctoral students with formal PCL-R training/coursework and experience rating the PCL instruments in clinical practice. As with the J-SOAP-II raters, both PCL raters were blind to recidivism status, but had access to information regarding treatment completion by virtue of reviewing participants’ files. In accordance with the PCL:YV manual, prorated scores were calculated for participants with four or fewer missing items (Forth et al., 2003). Because of changes in NJJJC policy that occurred after the previous study was completed, only intake psychological assessments, treatment progress reports, and discharge summaries were available for review. Detailed chronological treatment notes were unavailable for review.
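The prorating rule mentioned above can be illustrated with a short Python sketch. It assumes simple proportional proration (the sum of the scored items rescaled to the full 20-item length); the PCL:YV manual's own proration tables are the authoritative source and may differ in rounding. The function name and the use of `None` for a missing item are assumptions made for this illustration.

```python
def prorate_pclyv_total(item_scores, n_items=20, max_missing=4):
    """Return a prorated total, or None if too many items are missing.

    `item_scores` holds one entry per item: 0, 1, or 2, or None when the
    item could not be rated. Proportional proration is an assumption here.
    """
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} item entries")
    scored = [s for s in item_scores if s is not None]
    missing = n_items - len(scored)
    if missing > max_missing:
        return None  # too little information to prorate under the stated rule
    # Rescale the observed sum to the full item count.
    return sum(scored) * n_items / len(scored)


# Toy example (not study data): 18 items rated 1, two items missing.
example_total = prorate_pclyv_total([1] * 18 + [None] * 2)
```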
Outcome Data
Outcome data were collected from official criminal justice records, the New Jersey Judiciary’s Family Automated Case Tracking System (FACTS) and Promis/Gavel, to determine whether participants had reoffended after discharge. FACTS and Promis/Gavel are caseload management and recordkeeping systems that contain information regarding juvenile delinquency and adult criminal cases in New Jersey, respectively. On average, outcome data were available for 63.7 months after discharge (SD = 30.4, range: 8.9–130.9). Rearrest data were coded into three dichotomous outcome variables indicating whether participants were rearrested for a new nonsexual offense, a new violent nonsexual offense, and/or a new sexual offense. Participants who were rearrested for both nonsexual and sexual offenses were included in only the sexual recidivism group, as it could not be determined whether their nonsexual arrests were from a separate incident (all participants rearrested for a sexual offense also had nonsexual charges). Forty youth (55.6%) committed a nonsexual reoffense during the follow-up period, 19 of whom (47.5% of nonsexual reoffenders; 26.4% of sample) committed a violent reoffense. Only three youth in the sample (4.2%) were identified as having sexually reoffended.
Statistical Analysis
Pearson correlation coefficients were used to estimate the association between J-SOAP-II Scale P; the PCL:YV Total, Facet, and Factor scores; and the preexisting J-SOAP-II scales. The magnitude of the associations between Scale P and the PCL:YV and those of the preexisting J-SOAP-II scales and the PCL:YV were compared using 95% confidence intervals (CIs). The predictive accuracy of Scale P, the PCL:YV, and the preexisting J-SOAP-II scales was assessed using receiver operating characteristic (ROC) curve analysis. ROC curve analysis is common in risk assessment research because it is only minimally affected by low base rates (Mossman, 1994; Singh, 2013). Because of the modest size of the sample used in some of these analyses, statistical significance was determined on the basis of both p values and 95% CIs, with CIs that did not include .50 considered as evidence of significance (see Gardner & Altman, 1986). AUC estimates of the predictive accuracy of J-SOAP-II Scale P were compared with those of the PCL:YV Total score and the preexisting J-SOAP-II scales using the U-statistic method for comparing ROC curves from the same sample described by DeLong, DeLong, and Clarke-Pearson (1988). The predictive accuracy of Scale P and PCL:YV for time to reoffense was also assessed using Cox regression, a type of survival analysis that controls for variation in time at risk across participants (Cox, 1972).
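For readers less familiar with ROC analysis, the AUC can be understood as the probability that a randomly chosen recidivist scores higher on the measure than a randomly chosen nonrecidivist, with ties counted as one half. The Python sketch below computes the AUC directly from that definition. It is a minimal illustration with made-up scores, not the software used in the study, and the function name is an assumption.

```python
def auc(scores, reoffended):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    `scores` are risk scores; `reoffended` are matching 0/1 outcomes.
    """
    pos = [s for s, y in zip(scores, reoffended) if y]
    neg = [s for s, y in zip(scores, reoffended) if not y]
    if not pos or not neg:
        raise ValueError("need at least one recidivist and one nonrecidivist")
    # Each recidivist/nonrecidivist pair contributes 1, 0.5 (tie), or 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (not from the study): four youths, two of whom reoffended.
example_auc = auc([1, 2, 3, 4], [0, 1, 0, 1])
```

An AUC of .50 corresponds to chance-level prediction, which is why a 95% CI that excludes .50 is treated as evidence of significance.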
In addition, hierarchical logistic regression was used to examine whether the PCL:YV had incremental validity over and above Scale P in predicting recidivism and, conversely, whether Scale P had incremental validity over and above the PCL:YV. Scale P was first entered in Block 1, and the PCL:YV Total score was added in Block 2. The order was then reversed (i.e., PCL:YV in Block 1, with Scale P added in Block 2), and changes in model fit were examined to determine whether Scale P and/or the PCL:YV accounted for unique variance in reoffending. Hierarchical logistic regression analyses were also used to examine the incremental validity of Scale P and the preexisting J-SOAP-II scales using the same process described above.
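The change-in-fit test for an added block can be sketched as follows. When one predictor is added, the drop in –2 log likelihood between the two models is referred to a chi-square distribution with 1 degree of freedom. The –2 log likelihood values below are hypothetical and chosen only so that the difference equals 4.28, a block chi-square of the size reported in the Results; the function names are assumptions.

```python
import math


def block_chi_square(neg2ll_block1, neg2ll_block2):
    """Improvement in fit when Block 2 is added (difference in -2 log likelihood)."""
    return neg2ll_block1 - neg2ll_block2


def p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom.

    With 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))


# Hypothetical -2 log likelihoods for the Block 1 and Block 2 models.
change = block_chi_square(90.00, 85.72)
example_p = p_value_df1(change)
```

Under these assumptions, a block chi-square of 4.28 corresponds to p = .039 and one of 0.09 to p = .76.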
The predictive accuracy of age at release, length of custody, race, placement after discharge from NJJJC custody, and NJJJC facility for nonsexual, violent nonsexual, and sexual recidivism was examined to identify covariates for inclusion in the hierarchical logistic regression incremental validity analyses. Age at release, race, and placement after discharge did not significantly predict nonsexual, violent nonsexual, or sexual recidivism and, thus, were not included as covariates. However, length of custody significantly predicted nonsexual recidivism, β = −0.08, SE = .04, OR = 0.93, Wald = 4.23, p = .040. In addition, NJJJC facility significantly predicted nonsexual recidivism, β = −1.11, SE = .49, OR = 0.33, Wald = 5.06, p = .024, and violent nonsexual recidivism, β = −1.86, SE = .69, OR = 0.16, Wald = 7.36, p = .007. Consequently, length of custody was included as a covariate in Block 1 for nonsexual recidivism, and NJJJC facility was included as a covariate in Block 1 for both nonsexual and violent nonsexual recidivism.
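The relationship among the reported coefficients can be checked with a few lines of Python. In logistic regression the odds ratio is exp(β), and the Wald statistic is (β/SE)². Because the coefficients and standard errors above are rounded to two decimals, values recomputed from them agree with the reported statistics only approximately. The function names are assumptions made for this illustration.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def wald_statistic(beta, se):
    """Wald chi-square for a single coefficient."""
    return (beta / se) ** 2


# Coefficients as reported in the text (rounded values).
or_facility_nonsexual = odds_ratio(-1.11)          # reported OR = 0.33
or_facility_violent = odds_ratio(-1.86)            # reported OR = 0.16
wald_facility_nonsexual = wald_statistic(-1.11, 0.49)  # reported Wald = 5.06
```

An odds ratio below 1 indicates lower odds of rearrest as the predictor increases; for length of custody, the reported OR of 0.93 corresponds to roughly 7% lower odds of nonsexual rearrest per additional unit of custody time.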
Results
Properties of the J-SOAP-II and the PCL:YV
Table 1 displays descriptive statistics for the J-SOAP-II and the PCL:YV. The mean total score on the J-SOAP-II (22.28) was similar to that in prior studies of JSOs in correctional facilities or residential treatment that omitted Scale IV, calculating the total score using Scales I, II, and III (e.g., Parks & Bard, 2006). The mean total score on the PCL:YV (12.30) was somewhat lower than that in other studies utilizing similar samples (e.g., Viljoen et al., 2009), and no participants scored above 30, the cutoff for identifying psychopathy on the PCL-R.
Table 1. J-SOAP-II and PCL:YV Descriptive Data
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; Scale P = Psychopathy; Scale I = Sexual Drive/Preoccupation; Scale II = Impulsive/Antisocial Behavior; Scale III = Intervention; Facet 1 = Interpersonal; Facet 2 = Affective; Facet 3 = Lifestyle; Facet 4 = Behavioral; Factor 1 = Interpersonal/Affective; Factor 2 = Lifestyle/Behavioral.
Because Scale IV (Community Stability/Adjustment) was omitted, the Dynamic summary scale score could not be calculated, and the total score was calculated by adding Scales I, II, and III.
Correlations Between J-SOAP-II Scale P and the PCL:YV
Table 2 displays the correlation matrix for the J-SOAP-II and the PCL:YV. There was a strong, significant correlation between J-SOAP-II Scale P and the PCL:YV total score, r(72) = .68, 95% CI = [.53, .79], p < .001. The magnitude of this association exceeded that of the associations between the PCL:YV total score and all of the preexisting J-SOAP-II scales (see Table 2). However, comparison of 95% CIs for these correlations indicates that only the correlation between Scale I and the PCL:YV total score, r(72) = .01, 95% CI = [–.22, .24], p = .97, was significantly smaller than the correlation between Scale P and the PCL:YV total score.
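The text does not state how the 95% CIs for the correlations were computed. A common approach is the Fisher z transformation, and the Python sketch below, which assumes that approach and uses a hypothetical function name, yields intervals that match the ones reported above for N = 72.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)              # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


scale_p_ci = pearson_ci(0.68, 72)   # Scale P with PCL:YV total
scale_i_ci = pearson_ci(0.01, 72)   # Scale I with PCL:YV total

# The two correlations differ significantly under the CI-comparison rule
# when the intervals do not overlap.
intervals_overlap = scale_i_ci[1] >= scale_p_ci[0]
```

Here the upper bound for Scale I falls below the lower bound for Scale P, so the intervals do not overlap.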
Table 2. Correlations Between the J-SOAP-II and the PCL:YV
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; J-SOAP-II Scale P = Psychopathy; J-SOAP-II Scale I = Sexual Drive/Preoccupation; J-SOAP-II Scale II = Impulsive/Antisocial Behavior; J-SOAP-II Scale III = Intervention; PCL:YV Facet 1 = Interpersonal; PCL:YV Facet 2 = Affective; PCL:YV Facet 3 = Lifestyle; PCL:YV Facet 4 = Behavioral; PCL:YV Factor 1 = Interpersonal/Affective; Factor 2 = Lifestyle/Behavioral.
Because Scale IV (Community Stability/Adjustment) was omitted, the Dynamic summary scale score could not be calculated, and the total score was calculated by adding Scales I, II, and III.
*p < .05. **p < .001.
There were also significant correlations between Scale P and the PCL:YV Facet 2, Facet 3, Facet 4, Factor 1, and Factor 2 scores, but no significant correlation between Scale P and the Facet 1 score (see Table 2). A similar pattern emerged for the preexisting J-SOAP-II scales. Of note, Scale P was more strongly correlated with all of the PCL:YV total, facet, and factor scores than Scale III, one of the J-SOAP-II scales from which its items were drawn. In addition, Scale P was more strongly correlated with Facets 2 and 3 and Factor 1 than Scale II, the second J-SOAP-II scale from which Scale P items were drawn. However, comparison of 95% CIs for these associations indicates that none of these differences were significant.
Predictive Accuracy of J-SOAP-II Scale P and the PCL:YV
Table 3 presents AUC estimates for the predictive accuracy of the J-SOAP-II and the PCL:YV for general nonsexual, violent nonsexual, and sexual recidivism. Both Scale P and the PCL:YV were significant predictors of general nonsexual recidivism. After controlling for variation in time at risk using Cox regression, both measures were also significant predictors of time to general nonsexual reoffense: Wald = 13.12, OR = 1.19, –2 log likelihood (LL) = 299.75, χ2(1, N = 72) = 13.81, p < .001, for Scale P; and Wald = 8.44, OR = 1.11, –2 LL = 306.09, χ2(1, N = 72) = 8.71, p = .003, for the PCL:YV total score.
Table 3. Predictive Validity of the J-SOAP-II and the PCL:YV
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; Scale P = Psychopathy; Scale I = Sexual Drive/Preoccupation; Scale II = Impulsive/Antisocial Behavior; Scale III = Intervention; CI = confidence interval.
ᵃBecause Scale IV (Community Stability/Adjustment) was omitted, the Dynamic summary scale score could not be calculated, and the total score was calculated by adding Scales I, II, and III. ᵇInterpreted as significant based on 95% CI above .50.
According to the U-statistic method of comparison (DeLong et al., 1988), the predictive accuracy of Scale P for general nonsexual recidivism was significantly stronger than that of the PCL:YV total score, χ2(1, N = 72) = 4.09, p = .043. Scale P also was a significantly stronger predictor than the J-SOAP-II total score, χ2(1, N = 72) = 6.49, p = .011; the Static summary scale, χ2(1, N = 72) = 5.81, p = .016; Scale I, χ2(1, N = 72) = 14.36, p < .001; and Scale III, χ2(1, N = 72) = 6.91, p = .009. There was no significant difference in predictive accuracy between Scale P and J-SOAP-II Scale II, χ2(1, N = 72) = 0.02, p = .89.
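The logic of comparing two AUCs estimated on the same participants can be sketched in Python. The code below follows the structural-components formulation of DeLong et al. (1988) for two measures and returns a 1-df chi-square for the difference. It is a simplified illustration with made-up scores, not the software used in the study; the function names are assumptions, and each outcome group needs at least two cases.

```python
from statistics import mean


def _psi(x, y):
    """Pairwise kernel: 1 if the recidivist scores higher, 0.5 if tied, else 0."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(scores, outcomes):
    """Structural components: one value per recidivist and per nonrecidivist."""
    pos = [s for s, o in zip(scores, outcomes) if o]
    neg = [s for s, o in zip(scores, outcomes) if not o]
    v10 = [mean(_psi(x, y) for y in neg) for x in pos]
    v01 = [mean(_psi(x, y) for x in pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance (n - 1 denominator)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_chi_square(scores_a, scores_b, outcomes):
    """Return (AUC_a, AUC_b, chi-square with 1 df) for two correlated AUCs."""
    v10a, v01a = _components(scores_a, outcomes)
    v10b, v01b = _components(scores_b, outcomes)
    m, n = len(v10a), len(v01a)
    # Variance of the AUC difference, allowing for the shared sample.
    var_diff = (
        (_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / m
        + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / n
    )
    if var_diff == 0:
        raise ValueError("difference has zero estimated variance")
    auc_a, auc_b = mean(v10a), mean(v10b)
    return auc_a, auc_b, (auc_a - auc_b) ** 2 / var_diff


# Toy data (not from the study): three recidivists, three nonrecidivists.
outcomes = [1, 1, 1, 0, 0, 0]
measure_a = [3, 4, 5, 1, 2, 3.5]
measure_b = [1, 5, 2, 3, 0, 4]
auc_a, auc_b, chi_sq = delong_chi_square(measure_a, measure_b, outcomes)
```

Because both measures are scored on the same youths, the covariance terms matter: ignoring them would treat the two AUCs as independent and misstate the variance of their difference.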
Scale P was a significant predictor of violent nonsexual recidivism (see Table 3) and predicted time to violent nonsexual reoffense, Wald = 6.37, OR = 1.19, –2 LL = 146.18, χ2(1, N = 72) = 6.79, p = .009. In contrast, the PCL:YV total score did not perform significantly better than chance and did not predict time to reoffense, Wald = 0.58, OR = 1.04, –2 LL = 153.01, χ2(1, N = 72) = 0.58, p = .45. These estimates differed significantly, χ2(1, N = 72) = 6.86, p = .009. Scale P also was a significantly stronger predictor of violent nonsexual recidivism than the J-SOAP-II total score, χ2(1, N = 72) = 11.03, p < .001; Static summary scale, χ2(1, N = 72) = 18.79, p < .001; and Scale I, χ2(1, N = 72) = 27.66, p < .001. No significant differences were found between the predictive accuracy of Scale P for violent nonsexual recidivism and that of Scales II and III, χ2(1, N = 72) = 2.01, p = .16, and χ2(1, N = 72) = 0.062, p = .80, respectively.
Findings with respect to the predictive validity of the J-SOAP-II and the PCL:YV for sexual recidivism must be interpreted cautiously given the small number of youth in the sample who sexually reoffended (n = 3). Both Scale P and the PCL:YV total score were moderate, significant (based on the 95% CIs) predictors of sexual recidivism (see Table 3). However, neither measure significantly predicted time to sexual reoffense: Wald = 1.57, OR = 1.31, –2 LL = 23.45, χ2(1, N = 72) = 1.78, p = .18, for Scale P, and Wald = 1.63, OR = 1.19, –2 LL = 23.78, χ2(1, N = 72) = 1.79, p = .18, for the PCL:YV total score.
The predictive accuracy of Scale P for sexual recidivism was not significantly different from that of the PCL:YV total score, χ2(1, N = 72) = 0.23, p = .63. In addition, no significant differences were found between the predictive accuracy of Scale P and that of the J-SOAP-II total score, χ2(1, N = 72) = 0.002, p = .96; Static summary scale, χ2(1, N = 72) = 2.95, p = .09; or Scale I, χ2(1, N = 72) = 2.03, p = .15. Scale III was a significantly stronger predictor than Scale P, χ2(1, N = 72) = 22.16, p < .001. However, Scale P was a significantly stronger predictor of sexual recidivism than J-SOAP-II Scale II, χ2(1, N = 72) = 11.93, p < .001.
Tables 4, 5, and 6 display the results of hierarchical logistic regression analyses examining the incremental validity of Scale P and the PCL:YV for general nonsexual, violent nonsexual, and sexual recidivism. After controlling for NJJJC facility and length of time in custody, Scale P significantly predicted general nonsexual recidivism (see Table 4) and provided incremental validity over and above the PCL:YV total score, χ2(1, N = 72) = 4.28, p = .039. However, the PCL:YV total score provided no incremental validity over and above Scale P, χ2(1, N = 72) = 0.09, p = .76. Neither Scale P nor the PCL:YV was a significant predictor of violent nonsexual recidivism in the logistic regression model after controlling for NJJJC facility (see Table 5), and neither measure significantly predicted sexual recidivism (see Table 6).
Table 4. Incremental Validity of J-SOAP-II Scale P and the PCL:YV: General Nonsexual Recidivism
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; Scale P = J-SOAP-II Psychopathy scale; PCL:YV = PCL:YV total score; OR = odds ratio.
Table 5. Incremental Validity of J-SOAP-II Scale P and the PCL:YV: Violent Nonsexual Recidivism
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; Scale P = J-SOAP-II Psychopathy scale; PCL:YV = PCL:YV total score; OR = odds ratio.
Table 6. Incremental Validity of J-SOAP-II Scale P and the PCL:YV: Sexual Recidivism
Note. J-SOAP-II = Juvenile Sex Offender Assessment Protocol–Revised; PCL:YV = Hare Psychopathy Checklist–Youth Version; Scale P = J-SOAP-II Psychopathy scale; PCL:YV = PCL:YV total score; OR = odds ratio.
The incremental validity of Scale P and the preexisting J-SOAP-II scales was also examined using hierarchical logistic regression. With respect to general nonsexual recidivism, Scale P provided incremental validity over and above the J-SOAP-II total score, χ2(1, N = 72) = 4.44, p = .035; Static summary scale, χ2(1, N = 72) = 3.91, p = .048; Scale I, χ2(1, N = 72) = 7.46, p = .006; and Scale III, χ2(1, N = 72) = 11.37, p = .001. However, Scale P did not contribute incrementally to the prediction of general nonsexual recidivism over and above Scale II, χ2(1, N = 72) = 0.02, p = .89. With respect to violent nonsexual recidivism, Scale P provided incremental validity over and above the J-SOAP-II total score, χ2(1, N = 72) = 4.21, p = .040, and the Static summary scale, χ2(1, N = 72) = 3.99, p = .046, but not Scale I, χ2(1, N = 72) = 1.71, p = .19; Scale II, χ2(1, N = 72) = 0.40, p = .53; or Scale III, χ2(1, N = 72) = 1.48, p = .22. Scale P did not provide incremental validity for the prediction of sexual recidivism over and above any of the preexisting J-SOAP-II scales, χ2(1, N = 72) = 1.37, p = .24, for the total score; χ2(1, N = 72) = 2.91, p = .09, for the Static summary scale; χ2(1, N = 72) = 1.49, p = .22, for Scale I; χ2(1, N = 72) = 2.84, p = .09, for Scale II; and χ2(1, N = 72) = 0.65, p = .42, for Scale III. None of the preexisting J-SOAP-II scales provided incremental validity over and above Scale P in the prediction of general nonsexual, violent nonsexual, or sexual recidivism.
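The p values attached to the 1-df chi-square statistics above can be checked directly from the statistics themselves. The sketch below is a reader's check rather than the authors' procedure: it assumes each reported value is an ordinary chi-square statistic with 1 degree of freedom, and the function name is hypothetical. It relies on the identity P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with 1 df, so only the standard library is needed.

```python
import math

def chi2_p_value_df1(chi2):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Statistics reported for Scale P in the prediction of general nonsexual recidivism
reported = [
    ("over the J-SOAP-II total score", 4.44),
    ("over the Static summary scale", 3.91),
    ("over Scale II", 0.02),
]
for label, chi2 in reported:
    print(f"{label}: chi2(1) = {chi2:.2f}, p = {chi2_p_value_df1(chi2):.3f}")
```

Small discrepancies in the last decimal place are to be expected, because the published statistics are rounded to two decimals.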
Discussion
Recidivism risk assessment is an important component of disposition, treatment, and discharge planning for JSOs. Several instruments have been designed to assist clinicians in assessing recidivism risk in this population, but support for the predictive accuracy of these tools is mixed. In addition, although clinicians commonly assess psychopathy when evaluating recidivism risk among JSOs, the relationship between psychopathy and recidivism in this population remains unclear. To address these gaps in the literature, this study examined (a) whether items from a commonly used JSO-specific risk assessment tool, the J-SOAP-II, can be used to measure psychopathic traits in youth reliably, and (b) whether the resulting scale is a significant predictor of recidivism among JSOs.
Consistent with expectations, a seven-item psychopathy scale derived from the J-SOAP-II (Scale P) demonstrated good reliability. Both interrater reliability and internal consistency for Scale P were comparable with those of the preexisting J-SOAP-II scales. Scale P also was strongly correlated with scores on the PCL:YV, a widely used and well-validated measure of psychopathic traits in youth. Scores on Scale P were highly correlated with all PCL:YV factor and facet scores except Facet 1, and these correlations were stronger than those found in prior research examining the association between different clinician-rated juvenile psychopathy measures (e.g., Murrie & Cornell, 2002). The correlations between Scale P and the PCL:YV were also stronger (albeit nonsignificantly) than those between several of the preexisting J-SOAP-II scales and the PCL:YV.
The lack of variability in scores on PCL:YV Facet 1, relative to the other facet and factor scores, may explain the lack of association between this facet and Scale P. So too could the composition of Scale P. Four of the seven items comprising Scale P (Item 10: Pervasive Anger; Item 11: School Behavior Problems; Item 13: Juvenile Antisocial Behavior; Item 15: Multiple Types of Offenses) assess behaviors or traits included in Facet 4 of the PCL:YV, and the remaining three (Item 17: Accepting Responsibility for Offenses; Item 20: Empathy; Item 21: Remorse and Guilt) assess traits included in Facet 2. Although impression management, a Facet 1 trait, is considered when scoring Item 20 (see Prentky & Righthand, 2003), none of the Scale P items directly assess Facet 1 traits. Consequently, it is likely that Scale P assesses a narrower range of psychopathic traits than the PCL:YV does. Given the controversy regarding the relative importance of interpersonal traits (i.e., Facet 1) and antisocial/criminal behavior (i.e., Facet 4) to the construct of psychopathy (see Hare, 2010; Patrick, Fowles, & Krueger, 2009; Skeem & Cooke, 2010), further examination of the association between Scale P, the PCL:YV, and other measures of psychopathic traits in youth is necessary before any firm conclusions regarding the construct validity of Scale P can be drawn.
This limitation notwithstanding, Scale P may have utility in the prediction of nonsexual recidivism risk in JSOs. Scale P significantly predicted general and violent nonsexual recidivism, whereas the PCL:YV total score predicted only general nonsexual recidivism. In addition, Scale P was a significantly stronger predictor of both general and violent nonsexual recidivism than the PCL:YV and provided incremental validity over and above the PCL:YV in the prediction of general nonsexual recidivism. These findings suggest that Scale P may be a better predictor of nonsexual recidivism among JSOs than the PCL:YV, and it may be possible for clinicians to use Scale P as an embedded measure of psychopathic traits within the J-SOAP-II. More generally, these results also indicate that psychopathic traits are a risk factor for nonsexual recidivism among JSOs. Of note, the mean total score on the PCL:YV in the present sample (12.30) was relatively low, and no study participants received a score of 30 or greater, suggesting that even low levels of psychopathic traits may increase recidivism risk in this population.
When compared with the preexisting J-SOAP-II scales, Scale P was a significantly stronger predictor of general nonsexual recidivism than the J-SOAP-II total, Static summary scale, Scale I, and Scale III scores, and was as strong a predictor as Scale II. A similar pattern emerged for violent nonsexual recidivism, with Scale P being a significantly better predictor than the J-SOAP-II total, Static summary scale, and Scale I scores. Scale P also provided incremental validity over and above all of the preexisting J-SOAP-II scales except Scale II in the prediction of general nonsexual recidivism, and over and above the J-SOAP-II total score and Static summary scale in the prediction of violent nonsexual recidivism. Taken together, these findings suggest that Scale P may have greater utility for the prediction and management of nonsexual recidivism than the preexisting J-SOAP-II scales. However, additional research is necessary with respect to the comparison between Scale P and Scale II, given that they were similarly strong predictors of general and violent nonsexual recidivism in the current sample. It is possible that Scale P may better facilitate risk formulation than Scale II by highlighting not only past problematic behaviors but also current affective characteristics that are most salient to a JSO’s future risk. Future research examining risk formulation using the J-SOAP-II is warranted.
Scale P also significantly predicted sexual recidivism in the present sample, as did the PCL:YV, suggesting that psychopathic traits may increase sexual recidivism risk among JSOs. However, these findings must be interpreted cautiously given the small number of sexual recidivists in the sample (n = 3). Although ROC curve analyses are only minimally affected by base rates (Singh, 2013), analyses involving so few recidivists are problematic, as evidenced by the wide 95% CIs and the inconsistency between 95% CIs and p values for the AUC estimates in the sexual recidivism analyses. The small number of sexual recidivists in the present sample highlights the relative infrequency of sexual recidivism among JSOs, one of the many challenges that researchers and clinicians alike face in predicting and managing risk in this population. Future research utilizing larger samples is essential before any firm conclusions regarding the association between psychopathic traits and sexual recidivism in JSOs can be drawn.
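To illustrate why so few recidivists produce wide confidence intervals around an AUC, the sketch below applies the Hanley and McNeil (1982) approximation to the standard error of an AUC. It is an illustration under stated assumptions, not a reproduction of this study's analyses: the AUC of .80 is a hypothetical value chosen for the example, the function names are hypothetical, and the software used in the study may have computed its intervals differently. Only the group sizes (3 recidivists and 69 nonrecidivists out of N = 72) are taken from the text.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) approximation to the standard error of an AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

def auc_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation 95% CI for an AUC, clipped to the range [0, 1]."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

HYPOTHETICAL_AUC = 0.80  # illustrative only, not a value from this study

# Same total sample size, very different numbers of recidivists
print("3 of 72 recidivists: ", auc_ci(HYPOTHETICAL_AUC, n_pos=3, n_neg=69))
print("36 of 72 recidivists:", auc_ci(HYPOTHETICAL_AUC, n_pos=36, n_neg=36))
```

With only three recidivists, the approximate interval for the same hypothetical AUC is several times wider than it would be with a balanced split, even though the total sample size is unchanged.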
The results of this study should be viewed in light of several methodological limitations, in addition to the modest sample size and the small number of sexual recidivists in the sample. First, both the J-SOAP-II and the PCL:YV are typically rated in clinical practice on the basis of both an interview and review of relevant records. However, both tools were scored based on a file review in the present study, and although interrater reliability for both measures was good, variability in file content across participants may have impacted the accuracy of ratings. Inclusion of a clinical interview would likely have enhanced the accuracy of ratings on both the PCL:YV and the J-SOAP-II. Second, it is possible that Scale P emerged as a stronger predictor of nonsexual recidivism than the PCL:YV because of differences in the file material that was used to score the J-SOAP-II and the PCL:YV. Due to changes in NJJJC policy that occurred after the J-SOAP-II was rated, only intake assessments, treatment progress reports, and discharge summaries were available for review to score the PCL:YV. More accurate PCL:YV ratings may have been generated had all of the file content on which J-SOAP-II ratings were based been available for review. Third, as the low internal consistency of PCL:YV Factor 1 suggests, it may be difficult to score interpersonal and affective PCL:YV items, intended to assess personality traits that are stable across time and context (e.g., lack of remorse, callous/lack of empathy; Forth et al., 2003), when the available information about the youth primarily relates to his attitudes toward his offense and his behavior while in placement. Fourth, using only New Jersey criminal justice databases to collect recidivism data may have led to an underestimation of recidivism in this sample, as participants may have committed crimes in other states, or crimes that did not result in arrest. 
Future research should gather recidivism data from multiple sources, including self-report and collateral informants. Fifth, the J-SOAP-II is intended for use with youth with a history of contact sexual offenses or sexually coercive behavior, but the sample used in the present study included a small number of youth adjudicated for noncontact sexual offenses. It is possible that the J-SOAP-II would have demonstrated better predictive accuracy had the sample been comprised solely of JSOs with a history of contact offenses. Sixth, because the J-SOAP-II was developed for use with male JSOs, and because an all-male sample was used, the results of the present study likely do not generalize to female youth with a history of sexual offenses. Indeed, almost all research relating to juvenile sexual offending has used male samples, and little is known regarding offending patterns and recidivism risk assessment in female JSOs (Oliver & Holmes, 2015; Wijkman, Bijleveld, & Hendriks, 2014). Research in this area is needed. And seventh, an independent, prospective evaluation of Scale P is necessary before clinical use of this scale is recommended.
These limitations aside, the findings of this study have potentially important clinical implications. First, our findings suggest that Scale P may increase clinician efficiency. Although Scale P assesses a narrower range of psychopathic traits than the PCL:YV does, Scale P was strongly correlated with the PCL:YV, was a significant predictor of both general and violent nonsexual recidivism, and provided incremental validity over and above the PCL:YV in the prediction of general nonsexual recidivism. Scale P thus has the potential to serve as an embedded measure of psychopathic traits within the J-SOAP-II, obviating the need for a separate psychopathy measure. In addition, because Scale P provided incremental validity in the prediction of general nonsexual recidivism over and above nearly all of the preexisting J-SOAP-II scales, it may also obviate the need for a separate general recidivism risk measure. Relatively little research has examined the predictive accuracy of general recidivism risk assessment tools in JSOs (see Viljoen et al., 2009). Future research should compare the predictive accuracy of both Scale P and the preexisting J-SOAP-II scales for nonsexual recidivism among JSOs to that of general recidivism risk assessment measures.
Second, our results indicate that clinicians may be able to use Scale P to improve the clinical utility of the J-SOAP-II for the prediction and management of nonsexual recidivism risk. Because JSOs are far more likely to commit a nonsexual reoffense than a sexual reoffense, reduction of nonsexual recidivism risk is an important treatment target for this population. Pursuant to the Risk–Needs–Responsivity model, treatment and risk management strategies are most effective when they (a) are proportionate to the offender’s risk level; (b) address the offender’s unique criminogenic needs, or dynamic risk factors that contribute to his or her risk; and (c) account for the offender’s characteristics that may impact how he or she responds to treatment, such as his or her learning style (Andrews & Bonta, 2003; Andrews & Dowden, 2007). Cuadra, Viljoen, and Cruise (2010) asserted that risk assessment has the potential to improve treatment and risk management strategies for JSOs by not only identifying risk level but also identifying criminogenic needs that may be targeted through treatment.
According to this perspective, Scale P holds promise for risk management and treatment planning by serving as an indicator of risk. Clinicians may be able to use Scale P to identify youth at elevated risk for nonsexual recidivism more accurately than they would be able to using the preexisting J-SOAP-II scales. Scale P also may assist clinicians in gauging the nature and severity of potential future nonsexual reoffending, as a number of studies have found that youth with psychopathic traits are more likely to engage in premeditated or instrumental aggression that results in greater harm to victims than youth without psychopathic traits (e.g., Frick, Cornell, Barry, Bodin, & Dane, 2003; Murrie, Cornell, Kaplan, McConville, & Levy-Elkon, 2004).
Scale P also holds promise for risk management and treatment planning by serving as a measure of criminogenic needs. More specifically, Scale P has the potential to identify JSOs for whom specialized interventions are warranted. Although research regarding the treatment of psychopathy in youth is nascent, there is evidence that youth with psychopathic traits are less responsive to treatment than other youth (Manders, Deković, Asscher, van der Laan, & Prins, 2013; O’Neill, Lidz, & Heilbrun, 2003). Consequently, researchers have begun to examine the effectiveness of specialized treatments that target traits and behaviors consistent with psychopathy. Caldwell, McCormick, Umstead, and Van Rybroek (2007) found that incarcerated youth who received long-term treatment targeting interpersonal skills and behavioral management were less likely to reoffend than youth who received standard treatment. Subsequently, they found that youth who received this specialized treatment displayed significantly lower levels of interpersonal and affective psychopathic traits after treatment, and decreases in these traits were associated with improvements in institutional behavior (Caldwell, McCormick, Wolfe, & Umstead, 2012). However, it remains unclear whether treatment reduces psychopathic traits and recidivism risk among JSOs. The results of the current study underscore the need for additional research in this area.
Acknowledgements
The authors thank the New Jersey Juvenile Justice Commission for providing access to its records and facilitating data collection.
