Abstract
The presence of callous-unemotional (CU) traits delineates a subgroup of youth with severe antisocial behavior. However, debate surrounds the best method to assess CU traits. This study examined the factor structure of the parent-reported Inventory of Callous-Unemotional Traits (ICU) among high-risk 9-year-olds (N = 540) and its predictive validity over 1 year. Confirmatory factor analysis showed support for a three-factor bifactor model and revised two-factor model using a shortened ICU. Within a three-factor bifactor framework the general CU traits factor and specific uncaring factor scores were related to higher externalizing and lower internalizing behavior problems at ages 9.5 and 10.5. Findings were replicated using teacher-reported outcomes. However, results also suggest the need for item refinement and highlight the utility of a two-factor solution using a shortened ICU. In particular, the meaning of the unemotional items is discussed in relation to the conceptualization of CU traits.
In the past 20 years, research has examined callous-unemotional (CU) traits among antisocial youth as a theoretical downward extension of the affective features of adult psychopathy (Frick, O’Brien, Wootton, & McBurnett, 1994). Measures of CU traits assess behaviors, such as deficits in empathic concern, shallow affect, and lack of guilt. In recent years, several reviews have summarized the literature showing that the presence of CU traits is related to more severe antisocial behavior in childhood and adolescence and that these traits identify a homogeneous subgroup of children with specific risk factors who may benefit from tailored interventions (Frick, Ray, Thornton, & Kahn, 2014; Frick & White, 2008). In recognition of the growing body of research that has demonstrated the utility of assessing CU traits during middle childhood and adolescence, a specifier for the diagnosis of Conduct Disorder, based on conceptualizations of CU traits, was added to the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5; American Psychiatric Association, 2013) and termed “with limited prosocial emotions.” Measurement of CU traits thus remains a significant research focus with important clinical implications. However, gaps remain in our knowledge of the underlying construct of CU behavior and how best to measure it, especially among school-aged children. In response to the shortcomings of previous measures, the Inventory of Callous-Unemotional Traits (ICU; Frick, 2004) was developed to comprehensively assess CU behavior, but remains at the center of debate, particularly in relation to its psychometric properties and the extent to which these properties inform conceptualizations of CU behavior (see Lahey, 2014). In the current study, we examined the factor structure and construct validity of the parent-reported ICU.
In particular, we examined the widely used three-factor bifactor (3FBF) and newly proposed two-factor (2F) solutions, and tested the predictive validity of ICU scores in a large, high-risk sample of 9.5-year-olds followed longitudinally over a year.
Factor Structure of the Inventory of Callous-Unemotional Traits
The ICU comprises 12 positively and 12 negatively worded items, including 4 items referenced in the “limited prosocial emotions” DSM-5 specifier that indexes CU behavior (“I care about how well I do at school,” “I feel guilty when I do something wrong,” “I do not show emotions,” and “I am concerned about the feelings of others”). In terms of factor structure, the best fit for the 24-item ICU has typically been obtained by models specifying a 3FBF solution (Essau, Sasagawa, & Frick, 2006). Bifactor models represent an appealing way to model multidimensionality, specifying a general factor that captures shared variance across all items, while simultaneously modeling specific variance of separate dimensions within subsets of items. Bifactor models are common in the intelligence literature (e.g., Carroll, 1993), where conceptualizations of the structure of mental ability comprise both general and specific skills. Bifactor models have also been applied to antisocial behavior, including among studies of adult psychopathy (e.g., Patrick, Hicks, Nichol, & Krueger, 2007) and to model different forms of aggression among antisocial youth (e.g., Tackett, Daoud, De Bolle, & Burt, 2013). These studies have demonstrated an overarching general factor (e.g., psychopathy or aggression) with underlying specific factors (e.g., interpersonal, affective, or lifestyle traits).
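To make the bifactor idea concrete, the following minimal Python sketch computes model-implied correlations for a toy bifactor structure. It assumes standardized items and mutually orthogonal general and specific factors. The six items, their loadings, and the names GENERAL, SPECIFIC, GROUP, and implied_corr are hypothetical illustrations, not ICU items or estimates from any study.

```python
# Toy bifactor structure: six hypothetical items, one general factor,
# and three specific factors (two items each). All loadings are made up.
GENERAL = [0.6, 0.5, 0.7, 0.6, 0.5, 0.4]   # loading of each item on the general factor
SPECIFIC = [0.4, 0.4, 0.5, 0.4, 0.3, 0.5]  # loading of each item on its own specific factor
GROUP = [0, 0, 1, 1, 2, 2]                 # index of the specific factor each item belongs to


def implied_corr(i, j):
    """Model-implied correlation between standardized items i and j,
    assuming the general and specific factors are orthogonal."""
    if i == j:
        return 1.0
    # Every pair of items shares variance through the general factor.
    r = GENERAL[i] * GENERAL[j]
    # Items in the same subset share additional variance through their specific factor.
    if GROUP[i] == GROUP[j]:
        r += SPECIFIC[i] * SPECIFIC[j]
    return r


same_group = implied_corr(0, 1)       # general + specific contribution
different_group = implied_corr(0, 2)  # general contribution only
```

In this sketch, items from the same subset correlate through both the general factor and their shared specific factor, whereas items from different subsets correlate only through the general factor.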
In the ICU 3FBF model, items load onto three specific factors (callousness, uncaring, and unemotional), while simultaneously loading onto a general CU behavior factor. A 3FBF solution has been replicated in studies of the self-reported ICU among forensic (N = 248, ages 12-20 years, Kimonis et al., 2008) and community samples that represent different age periods and countries (Table 1; e.g., N = 347, ages 12-18 years, Fanti, Frick, & Georgiou, 2009; N = 455, ages 14-20 years, Roose, Bijttebier, Decoene, Claes, & Frick, 2010; N = 540, ages 10-14 years, Ciucci, Baroncelli, Franchi, Golmaryami, & Frick, 2014). Nevertheless, because much research on the ICU has focused either on healthy community samples recruited from schools or on forensic/clinical samples, there is a need for studies to assess dimensional samples that include a full range of antisocial behavior and CU behavior.
Table 1. Summary of sample descriptives and model fit statistics reported in previous studies examining ICU factor structure among youth.
Note. Com = community (i.e., normative, healthy, or school-based sample); Foren = forensic; Strat = stratified. Degrees of freedom vary across previous studies depending on the item set being used (e.g., 22 items instead of 24 items).
Limitations of 3FBF ICU Models
Despite advances in research examining models, there are also a number of limitations associated with the 3FBF for the ICU, including poor-to-acceptable model fit indices (see Table 1), marginally acceptable internal consistency of the unemotional subscale, the need to remove items, and error terms being specified to correlate according to modification indices. Furthermore, the callousness subfactor is largely composed of negatively worded items, whereas the uncaring factor is composed of positively worded items, suggesting that the 3FBF structure may be driven by method variance related to response styles. Finally, an increasing number of studies have not replicated a 3FBF (see Table 1). Across a range of samples using the youth-reported ICU, studies have reported solutions for models with five (N = 383; ages 8-18 years, Feilhauer, Cima, & Arntz, 2012), two (N = 268, ages 7-13 years, Houghton, Hunter, & Crow, 2013), and three factors (N = 620, ages 3-4 years, Ezpeleta, Osa, Granero, Penelo, & Domènech, 2013), as well as a recently proposed two-factor solution using a 12-item version of the parent-reported ICU (N = 250, ages 6-12 years, Hawes et al., 2014).
Despite these alternative model solutions and the limitations outlined above, studies supporting a 3FBF model have been used to justify the use of summed subscale or total ICU scores in subsequent analyses. Notably, studies are also using the parent-reported ICU to assess CU behavior in young children (e.g., Somech & Elizur, 2012), despite the fact that only two previous studies have examined the factor structure of the parent-reported ICU, and both reported inadequate fit for a 3FBF model (Hawes et al., 2014; Roose et al., 2010; Table 1). It is troublesome that the field is moving forward on less than solid psychometric grounds, particularly in the context of the new DSM-5 specifier, which highlights the need for reliable measurement of CU behavior for diagnosis and classification. In particular, questions surround the use of the ICU for diagnosis of limited prosocial emotions and which version of the ICU (e.g., 24-item or 12-item; Hawes et al., 2014) best assesses the CU behavior construct, particularly using parent report.
Construct Validity of the ICU: Externalizing and Internalizing Behavior Outcomes
Apart from factor structure, an important aspect of psychometric work relating to the ICU surrounds its construct validity. Previous studies have supported the utility of total summed ICU scores, which typically exhibit high internal consistencies. Expected positive correlations between summed ICU total scores and externalizing outcomes have been reported for both the self-reported (e.g., Fanti et al., 2009; Kimonis et al., 2008) and parent-reported ICU (e.g., at-risk adolescents, ages 13-17 years, N = 70, Berg et al., 2013; detained adolescents, ages 12-18 years, N = 94; White, Cruise, & Frick, 2009). In line with findings reported for total ICU scores, callous and uncaring subscale scores have also been shown to correlate positively with externalizing outcomes. However, even though the construct is “callous-unemotional,” unemotional subscale scores have typically demonstrated inconsistent correlation with measures of externalizing behavior (e.g., Berg et al., 2013). Recently Hawes and colleagues proposed a revised 2F model for the ICU, composed of only callous and uncaring subfactors (unemotional items were dropped, with the exception of Item 6, “does not show emotions”), based on an examination of item–total correlations and item-response theory. Their 2F solution demonstrated good model fit, high internal consistency, acceptable test–retest reliability, convergent validity, and replication in an independent sample. This initial revision of the ICU highlights the need for further examination of the validity of “unemotional” items within the context of youth CU behavior.
CU behavior has also been examined in relation to internalizing symptoms. Broadly, results mirror those from theory in adult psychopathy that CU behavior is related to fearlessness, low anxiety, and low internalizing (Lykken, 1995). For example, one study found that CU behavior was related to fewer internalizing symptoms over time (N = 1862, ages 5-8 at baseline, Pardini, Stepp, Hipwell, Stouthamer-Loeber, & Loeber, 2012). However, ICU scores have also been shown to predict higher levels of internalizing problems (e.g., Berg et al., 2013; Essau et al., 2006). One explanation for these discrepant findings may derive from the fact that externalizing behavior problems are often strongly associated with internalizing symptoms. Thus, the direction of the association between CU behavior and anxiety may be positive, until concurrent externalizing behavior is accounted for, when it becomes negative (i.e., because of cooperative suppression; see Frick et al., 2014; Lilienfeld, 2003). However, the prediction of internalizing when controlling or not controlling for externalizing is yet to be addressed in longitudinal analyses using the parent-reported ICU.
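The sign reversal described above can be illustrated with the standard formula for a standardized regression coefficient when two predictors are in the model. The sketch below is a worked example only: the three correlations are hypothetical values chosen for illustration, not estimates from this or any cited study, and the function name beta_controlling is an illustrative choice.

```python
def beta_controlling(r_xy, r_xz, r_zy):
    """Standardized coefficient for x predicting y when z is also a predictor:
    (r_xy - r_xz * r_zy) / (1 - r_xz ** 2)."""
    return (r_xy - r_xz * r_zy) / (1 - r_xz ** 2)


# Hypothetical zero-order correlations (illustration only).
r_cu_int = 0.10   # CU behavior with internalizing: small and positive
r_cu_ext = 0.50   # CU behavior with externalizing
r_ext_int = 0.50  # externalizing with internalizing

# Association between CU behavior and internalizing once externalizing is controlled.
beta_cu = beta_controlling(r_cu_int, r_cu_ext, r_ext_int)
```

With these assumed inputs, the zero-order association is positive (.10), but the coefficient controlling for externalizing is negative (about −.20), because the variance that CU behavior shares with externalizing accounts for more than the whole of its positive association with internalizing.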
Implications of a 3FBF Model
Finally, studies have rarely considered the meaning of the 3FBF solution in comparison with using summed total or subscale scores. In particular, the meaning of specific factors may be different when partialling out variance in a general CU behavior factor. Moreover, it is unclear what the general factor means in terms of capturing correlation between items once unique variance relating to specific subfactors has been accounted for. Somewhat surprisingly, no studies to date have examined unique associations between specific factors and externalizing or internalizing outcomes, accounting for variance explained by a CU general factor. Rather, studies have solely examined associations between summed subscale scores based on the 3FBF model, even though the 3FBF model implies a need for general and specific factor scores (see Lahey, 2014). Indeed, when summed total ICU scores are used, potentially unique variance captured by specific factors is lost. However, harnessing a 3FBF may produce more precise associations with relevant behavioral outcomes. It is thus important to examine whether a 3FBF predicts criterion variables differently than summed scores as this would have important practice implications in disseminating this type of factor analytic work.
Gaps in the Literature
A number of gaps thus emerge in the ICU literature. First, although the parent-reported ICU is already widely in use, there remains a need for its validation, particularly among high-risk samples of youth who exhibit a range of scores on measures of antisocial behavior. Second, the majority of previous studies that have examined the ICU have assessed samples with wide age ranges (Table 1), making it difficult to draw conclusions about its factor structure during specific developmental periods. Third, no studies to date have examined the longitudinal and predictive validity of the parent-reported ICU across informants. From a prevention perspective, this is an important question, especially if we can clarify whether ICU scores identify children at risk of developing more entrenched behavior problems before they reach clinical levels, particularly in youth already at high risk for later psychopathology. The few longitudinal studies that have been conducted have typically focused on treatment or forensic samples, among which ICU scores predicted poorer outcomes (e.g., White, Frick, Lawing, & Bauer, 2013). Fourth, although results from studies that have examined externalizing problems have generally been consistent, studies are needed that examine associations between the ICU and internalizing problems while taking into account any overlap between internalizing and externalizing symptoms. Finally, no previous studies have considered the predictive validity of the 3FBF scales versus summed ICU scores.
Aims of Current Study
The current study seeks to clarify the factor structure of the ICU and its construct validity in a number of ways. First, we examined the parent-reported version of the ICU in a large sample (N = 540; 50% female) of 9.5-year-olds at risk for conduct problems. We used reports from both primary and alternative caregivers. For primary caregiver (i.e., mother) reports on the ICU, we compared four models reported in previous studies (one factor, three correlated factors, 3FBF, and revised 2F model) using confirmatory factor analysis (CFA) and replicated findings with alternative caregiver–reported ICU data (e.g., co-parent, father, grandmother). Within a 3FBF framework, we examined associations between ICU scores and primary caregiver– versus teacher-reported outcomes at age 9.5 and predictions to 10.5. We focused on narrow symptom outcomes for externalizing (aggressive vs. rule-breaking) and internalizing (anxious-depressed vs. withdrawn-depressed) to enable a more precise examination of the associations. We computed models for internalizing behaviors while controlling for the overlap between internalizing and externalizing symptoms. We also compared findings using summed scores based on the 3FBF versus 2F solutions specifically with externalizing behavior outcomes to evaluate the utility of a shortened form of the ICU. The major goal of the current study was to comprehensively examine the factor structure of the ICU and its predictive validity over time in a way that would inform our understanding of the psychometrics of the measure, the use of summed versus factor scores, proposed revisions to the measure, and our understanding of the CU behavior construct.
Method
Participants
Participants included 731 mother–child dyads recruited between 2002 and 2003 from Women, Infants, and Children Nutritional Supplement Programs in the metropolitan areas of Pittsburgh, Pennsylvania and Eugene, Oregon, and in and outside Charlottesville, Virginia (Dishion et al., 2008). Participants were originally recruited to be part of a randomized controlled trial of the Family Check-Up, a preventative intervention for use in high-risk environments to address normative challenges facing parents from toddlerhood onward (see Dishion et al., 2008). Families were invited to participate if they had a son or daughter between age 2 years 0 months and 2 years 11 months. Recruitment risk criteria were defined as 1 SD above normative means or established clinical cut-points on screening measures in at least two of the following three domains: (a) child behavior problems (e.g., conduct problems on the Eyberg Child Behavior Inventory; Robinson, Eyberg, & Ross, 1980), (b) primary caregiver problems (e.g., maternal depression, daily parenting stress, or self-reported substance use), and (c) sociodemographic risk (low education or low family income). Thus, children in the study were selected as “high risk” based on established risk factors for later conduct problems. Specifically, apart from socioeconomic or family risk, families qualified for the original study if children scored in the clinical range on the Intensity or Problem Scales of the Eyberg Child Behavior Inventory, which comprised 44% of the sample at recruitment, making the sample community-based but enriched/oversampled for those with early conduct problems (see Dishion et al., 2008; Robinson et al., 1980). However, because the sample was a community (vs. clinical) sample and not all children met inclusion criteria based on this definition of clinically meaningful frequencies of conduct problems, there was variability in the frequency of child conduct problems.
Of the 1,666 families who had children of the appropriate age and who were contacted across study sites, 879 met the eligibility requirements (52% in Pittsburgh, 57% in Eugene, and 49% in Charlottesville), and 731 (83.2%) consented to participate. The children in the sample had a mean age of 29.9 months (SD = 3.2) at the age 2 assessment (approximately 2.5 years old). Across sites, primary caregivers self-identified as belonging to the following ethnic groups: 28% African American, 50% European American, 13% biracial, and 9% other groups. During screening, more than 66% of enrolled families had an annual income < $20,000, and the average number of family members per household was 4.5 (SD = 1.63). Forty-one percent of the sample had a high school or general education diploma. Following the baseline assessment, half the sample was randomly assigned to receive the Family Check-Up intervention (see Dishion et al., 2008); thus, intervention status is used as a covariate in all analyses. Of 731 families who initially participated, we had ICU data for 540 (74%) at age 9.5. Of the 540 children with ICU data at age 9.5, we had alternative caregiver reports on the ICU for 401 (74%) children, primary caregiver–reported data for 404 children (75%) at age 10.5, teacher-reported data for 358 children at age 9.5 (66%), and teacher-reported data for 318 children at age 10.5 (59%). Selective attrition analyses conducted via chi-square tests or analyses of variance indicated that there were no differences between children for whom we did and did not have ICU data according to intervention status (p = .40), race (p = .19), family income (p = .19), and baseline levels of child problem behavior as reported by either primary (p > .70) or alternative (p > .90) caregivers. However, parent education was lower among those families for whom we did not obtain age 9.5 ICU data; those lost to follow-up were less likely to have at least a high school education (p < .001).
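The retention percentages above can be re-derived from the reported counts. The short Python check below uses only counts stated in this section; the helper name pct and the variable names are illustrative choices.

```python
def pct(part, whole, digits=0):
    """Percentage of `whole` represented by `part`, rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


consented = pct(731, 879, 1)      # consenting families among the 879 eligible
icu_age_9_5 = pct(540, 731)       # families with ICU data at age 9.5
alt_caregiver = pct(401, 540)     # alternative caregiver ICU reports
teacher_age_9_5 = pct(358, 540)   # teacher reports at age 9.5
teacher_age_10_5 = pct(318, 540)  # teacher reports at age 10.5
```

Each value reproduces the percentage reported in the text (83.2%, 74%, 74%, 66%, and 59%, respectively).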
Measures
Recruitment began when children were age 2 and annual assessments (with the exception of age 6) were conducted at family homes using a variety of questionnaires, interviews, assessor impressions, and videotaped observations. From age 7.5 years onward, we also collected data from teachers. The current study uses questionnaire data collected from homes (primary caregiver and alternative caregiver reports) and schools (teacher reports) at ages 9.5 and 10.5 years old. The majority of primary caregivers at age 9.5 were biological mothers (90%). Alternative caregivers were typically a biological father (45%), the romantic partner of the child’s mother (11%), a grandparent (10%), or an aunt/uncle (4%).
Demographics Questionnaire: Covariates
Primary caregivers completed a demographics questionnaire at age 2 (Dishion et al., 2008). Consistent with past studies in this sample, child gender was coded as female = 0 (n = 271; 50.2%); male = 1 (n = 269; 49.8%). Child’s race was coded as “Caucasian/other” = 0 (n = 304; 56.3%); “Black African American/biracial” = 1 (n = 236; 43.7%). Ethnicity was coded as “non-Hispanic” = 0 (n = 474; 87.8%); “Hispanic” = 1 (n = 64; 11.9%). Also consistent with past studies in this sample, parent education was coded as “less than high school” = 0 (n = 113; 20.9%) and “high school and beyond” = 1 (n = 428; 79.1%). Gross annual family income was coded as ≤$14,999 = 0 (n = 268; 49.6%); ≥$15,000 = 1 (n = 272; 50.4%). Finally, as data were collected from multiple sites which differed with respect to the urbanicity and ethnic/racial composition of participants, location was included as a covariate to account for these potential differences. Furthermore, the cut-points reported represent meaningful differences between groups within our relatively high-risk sample. Note that the pattern of findings was unchanged if we included quasi-continuous parent education and family income variables.
Callous-Unemotional Traits (Age 9.5 Years)
We assessed CU traits at age 9.5 via primary and alternative caregiver reports on the 24-item ICU (Frick, 2004). Each item is rated on a 4-point scale (0 = not true; 1 = somewhat true; 2 = very true; 3 = definitely true). Self-reported ICU data were not collected.
Externalizing and Internalizing Problem Behavior (Ages 9.5 and 10.5 Years)
Primary caregivers completed the Child Behavior Checklist at ages 9.5 and 10.5 (CBCL; Achenbach, 1991a) and teachers completed the Teacher Report Form of the CBCL (Achenbach, 1991b). Both questionnaires include an externalizing (33 items for the CBCL and 34 for the Teacher Report Form of the CBCL) and an internalizing (31 items for the CBCL and 35 for the Teacher Report Form of the CBCL) problem behavior scale. Within the externalizing scale, we focused on the aggressive (e.g., defiant and talks back, disrupts class discipline) and rule-breaking (e.g., steals, fights) subscales. We also examined two internalizing subscales, withdrawn-depressed (e.g., likes to be alone, withdrawn) and anxious-depressed (e.g., fears mistakes, needs to be perfect). We focused on these narrower symptom subscales rather than the broadband externalizing and internalizing scales to test more precise associations with ICU scores.
Analytic Strategy
Aim 1: To Examine the Factor Structure of the ICU
First, we computed inter-item polychoric correlations for the ICU using primary and alternative caregiver reports. We then used CFA in Mplus Version 7.2 (Muthén & Muthén, 2014) to compare model fit for one-factor, correlated three-factor, 3FBF, and revised 2F solutions for the primary caregiver–reported ICU. We also tested the 3FBF and revised 2F models using alternative caregiver reports, enabling corroboration of model fit within our sample. Models were estimated with mean- and variance-adjusted weighted least squares estimation for use with ordinal items (Flora & Curran, 2004). We considered model fit to be adequate if the root mean square error of approximation (RMSEA) and comparative fit index (CFI) values met guidelines (i.e., RMSEA < .06 and CFI > .95; Hu & Bentler, 1999). Because we used mean- and variance-adjusted weighted least squares estimation, we carried out corrected chi-square difference tests with DIFFTEST in Mplus (Muthén & Muthén, 2014). We examined descriptive statistics and zero-order correlations between summed ICU total and subscale scores for the 3FBF and revised 2F solutions. Internal consistencies of summed ICU scores were assessed using Cronbach’s alpha.
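Two of the quantities named above can be sketched in a few lines of Python: Cronbach's alpha for a set of summed items, and the fit cutoffs stated in this section (RMSEA < .06 and CFI > .95). This is an illustration only and not the Mplus analysis used in the study. The toy responses are hypothetical, and the function names cronbach_alpha and adequate_fit are illustrative choices.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of respondents,
    each a list of k item scores (no missing data assumed)."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def adequate_fit(rmsea, cfi):
    """Apply the cutoffs stated in the text: RMSEA < .06 and CFI > .95."""
    return rmsea < 0.06 and cfi > 0.95


# Hypothetical responses from four raters on three items scored 0-3.
toy_responses = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
alpha = cronbach_alpha(toy_responses)
```

For these toy responses, the sum of the item variances is 12.5/3 and the variance of the total score is 10, so alpha works out to 1.5 × (1 − 12.5/30) = .875.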
Aim 2: To Test Cross-Sectional and Longitudinal Construct Validity of the ICU
We computed cross-sectional and longitudinal zero-order correlations between age 9.5 summed total scores and subscale primary caregiver–reported ICU scores and both primary caregiver– and teacher-reported externalizing and internalizing scores at ages 9.5 and 10.5. Next, we computed a series of path models to examine the prediction of primary caregiver– and teacher-reported externalizing and internalizing problem behaviors by 3FBF general and specific factor scores. We computed separate models to examine the prediction of aggressive versus rule-breaking behavior and anxious-depressed versus withdrawn-depressed behavior. We examined cross-sectional associations at age 9.5 and longitudinal associations with age 10.5 scores (controlling for autoregressive effects). We were thus able to examine the pattern of findings for a general CU factor when variance in specific factors was accounted for and vice versa. However, for purposes of enabling comparison with studies that have computed summed total and subscale scores and for practical translation, we also examined associations with aggressive versus rule-breaking behavior within regression models using summed scores. As before, to be transparent with the data, we computed separate models for cross-sectional and longitudinal associations and examined both within- and across-informant associations. Finally, we compared findings for regression models examining associations between ICU summed scores and aggressive versus rule-breaking behavior when summed scores were based on the 3FBF solution versus the revised 2F solution. In all models examining cross-sectional and longitudinal associations with externalizing and internalizing subscales, we controlled for intervention status, project location, child gender, race, ethnicity, parent education, and family income, although results were similar when not partialling for covariates.
Results
Factor Structure of the ICU
We computed polychoric correlations among items of the primary caregiver–reported ICU (Table 2). There were modest-to-moderate correlations among items. Consistent with previous studies, we dropped Item 10, “does not let feelings control him or her” (e.g., Ciucci et al., 2014), as higher endorsement of this item was not related to endorsement of other ICU items. It thus appeared that raters were interpreting Item 10 as indexing a desirable behavior. In addition, Items 15 (“always tries his or her best”) and 23 (“works hard on everything”), which have similar item content, were highly related (r = .75). We dropped Item 23 as it caused difficulties in the model estimation stage, which appeared to be related to an issue of multicollinearity. We found similar associations among items using alternative caregiver reports (not shown for brevity, but available on request). We thus computed all models using 22 of the original 24 ICU items.
Table 2. Polychoric Correlations Between Items (Parent-Reported Version of ICU).
Note. ICU = Inventory of Callous-Unemotional Traits. Item 10 inversely related to other ICU items and was excluded from subsequent analyses. High overlap between Items 23 and 15 caused problems in model estimation stage; Item 23 excluded from subsequent analyses.
We examined one-factor, three-correlated-factor, 3FBF, 3FBF with correlated residuals, and revised 2F models for the primary caregiver–reported ICU (Table 3 and Figure 1). The one-factor model showed poor fit to the data although the moderate loadings of all 22 items onto a general factor were notable and hinted at shared variance among items. The three-correlated-factor model fit the data better than the one-factor model, Δχ²(3) = 254.76, p < .001; however, the fit was still poor. The 3FBF model fit the data better than a three-correlated-factor model, Δχ²(19) = 224.24, p < .001, but the fit was only just acceptable and a number of items on the uncaring subscale loaded in the opposite direction and several were nonsignificant (Table 3). Because of this poor fit, we examined modification indices. We allowed error terms of five pairs of items to correlate, although we only specified correlations when there was overlap in item content (Figure 1). However, only one of five item pairs was the same as specified in a previous study (see Fanti et al., 2009, who specified 16 pairs of items to correlate, including one overlapping pair with us, Items 15 and 20). It is noteworthy, however, that other studies incorporating modification indices have not consistently specified how many items (e.g., Essau et al., 2006) or which items were specified to correlate (e.g., Ciucci et al., 2014). Based on items having negative item loadings on their specific factors, we specified four items (Items 8, 3, 5, and 13) to only have loadings on the general factor. This 3FBF model with modification indices showed good fit to the data (Table 3). We also examined the fit of this 3FBF model specifying the same correlated residuals for the alternative caregiver–reported ICU. Loadings were similar and the fit was good, demonstrating corroboration of our 3FBF model across informants (Table 3). We also compared model fit for males versus females. We conducted multigroup analyses comparing model fit when factor loadings and intercepts were fixed versus freed using the DIFFTEST procedure. We found that the fixed model showed significantly better fit, suggesting that loadings were similar across males and females. Finally, we examined a revised 2F model with 12 ICU items (Hawes et al., 2014). This newly proposed two-factor solution showed good model fit for both primary caregiver (see Table 3) and alternative caregiver reports, χ²(53) = 126.98, p < .001; CFI = .98; RMSEA = .05 (details available on request). However, because the 2F model is based on a different item set than the 3FBF, we could not compare model fit directly.
Table 3. Factor Loadings and Model Fit Statistics for One-Factor, Three-Correlated-Factor, 3FBF, and Revised 2F Solutions for Parent-Reported ICU.
Note. ICU = Inventory of Callous-Unemotional Traits; CU = callous unemotional; MI = modification indices; 2F = two-factor; 3FBF = three-factor bifactor; df = degrees of freedom; CFI = comparative fit index; RMSEA = root mean square error of approximation. Items paraphrased for brevity. All parent-reported ICU except if stated. In the 3FBF model with MI, error terms of the following items were allowed to correlate: Item 15 with Item 7, Item 15 with Item 11, Item 22 with Item 12, Item 20 with Item 11, and Item 20 with Item 15. In addition, Items 8, 3, 5, and 13 were specified to only have general CU factor variance (no specific factor variance). Note that a 2BF (i.e., general CU factor and two specific factors of callous and uncaring) ran into estimation problems as the uncaring items loaded strongly (all loadings > .70) onto the general factor. Thus, the two-correlated-factor model appears to offer a more parsimonious method to express the variance in these two factors.
*p < .05. **p < .01. ***p < .001.
Source. Frick (2004).

Figure 1. Final three-factor bifactor model for parent- and alternative caregiver–reported Inventory of Callous-Unemotional Traits (ICU).
Total and Subscale Summed Scores
Based on our 3FBF solution, we created a 22-item 3FBF total score, a 9-item callous subscale score, a 4-item uncaring subscale score, and a 5-item unemotional subscale score. Based on the 2F model, we created a 12-item 2F total score, a 7-item callous subscale score, and a 5-item uncaring subscale score. There were moderate intercorrelations among unemotional, callous, and uncaring subscales within 3FBF summed scores (range, r = .31-.52, p < .001) and between callous and uncaring 2F scores (r = .49, p < .001; Table 4). There was high internal consistency for total ICU scores (3FBF, α = .87; 2F, α = .84), and the callous (3FBF, α = .78; 2F, α = .76) and uncaring subscales (3FBF, α = .81; 2F, α = .84), which were similar across the two model solutions. There was acceptable internal consistency for the 3FBF unemotional subscale (α = .65). The total 2F score featured 12 of the 22 items included in the total 3FBF score, and not surprisingly, they were highly related (r = .94, p < .001). The 2F callous subscale score featured six of the nine items included in the 3FBF callous subscale and an additional item that appeared in the 3FBF unemotional scale; these scales were strongly associated (r = .93, p < .001). Similarly, the 2F uncaring scale featured three of the four 3FBF uncaring items and two different items (one from the 3FBF general factor with no specific variance, and one from the 3FBF callous scale), and they were also strongly related (r = .90, p < .001). Based on bivariate associations, it thus appears that the refined ICU produces scale scores that are similar to those derived from the full version of the measure. Nevertheless, the utility of dropping the unemotional items (and having no unemotional subscale) required further investigation via an examination of correlates.
We focused our results on models computed within a 3FBF framework so we could examine associations with externalizing and internalizing outcomes for a general factor controlling for variance explained by specific factors and vice versa. However, we also computed total and subscale summed scores and examined associations with externalizing outcomes, which enabled comparability of our findings with previous studies and allowed us to compare the pattern of findings for total and subscale scores derived from the 3FBF versus 2F solutions.
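As an illustration of how internal consistency values such as those reported above are typically obtained, the following minimal sketch computes Cronbach's alpha from item-level responses. The function name `cronbach_alpha` and the small response matrix are hypothetical and are not the study data; the sketch assumes complete responses with every item scored in the same direction.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete data and that reverse-scored items were already recoded.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer the same k items")

    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of the summed scale score.
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 4 items (illustrative values only).
demo_rows = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 2, 3, 3],
    [1, 0, 1, 2],
]
print(round(cronbach_alpha(demo_rows), 2))
```

Because alpha depends on both the number of items and their intercorrelations, a shorter subscale (such as the 5-item unemotional subscale) can show a lower alpha than a longer one even when its items correlate to a similar degree.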
Descriptives, Interscale and Intersubscale Bivariate Correlations, and Internal Consistencies for Summed Scores Derived From Three-Factor Bifactor and Two-Factor Solutions.
Note. ICU = Inventory of Callous-Unemotional Traits; SD = standard deviation.
*p < .05. **p < .01. ***p < .001.
Cross-Sectional and Longitudinal Construct Validity of the ICU
Cross-Sectional and Longitudinal Bivariate Correlations: Summed Scores
There were strong associations within primary caregiver and teacher reports of externalizing problem behavior from ages 9.5 to 10.5, and moderate associations between primary caregiver and teacher reports (see Table 5), suggesting convergence across time and informants. There were modest-to-strong zero-order associations between primary caregiver–reported ICU summed scores for the 3FBF and 2F solutions and primary caregiver–reported externalizing symptoms, including the aggressive and rule-breaking subscales (range, r = .16-.60, p < .01). The magnitude of associations was greater for total, callous, and uncaring scores than for unemotional scores. There were modest-to-moderate zero-order associations between 3FBF and 2F primary caregiver–reported summed ICU total scores and callous and uncaring subscale scores and teacher-reported externalizing scores (range, r = .10-.25, p < .10). Associations with 3FBF unemotional scores were smaller in magnitude and less likely to be significant (range, r = .09, ns, to .13, p < .13). For internalizing symptoms, zero-order associations within primary caregiver reports were high over time (range, r = .41-.71, p < .001). However, associations between primary caregiver and teacher reports of internalizing problem behavior were lower in magnitude (range, r = .05, ns, to .29, p < .001; Table 6). Finally, we found moderate positive zero-order associations between primary caregiver–reported ICU and internalizing scores (range, r = .05, ns, to .44, p < .001). Associations between primary caregiver–reported ICU and teacher-reported internalizing scores were smaller and less likely to be significant, although they still tended to be positive in direction.
Zero-Order Cross-Sectional and Longitudinal Correlations Between Summed Total and Subscale Scores for the Primary Caregiver–Reported ICU (Three-Factor Bifactor and Two-Factor Solutions) and Parent- and Teacher-Reported Externalizing Behavior Problems (Aggression and Rule-Breaking) at Ages 9.5 and 10.5 Years.
Note. ICU = Inventory of Callous-Unemotional Traits; Ext = Externalizing; Agg = Aggression; RuleB = Rule-breaking; PC = primary caregiver–reported.
p < .10. *p < .05. **p < .01. ***p < .001.
Zero-Order Cross-Sectional and Longitudinal Correlations Between Summed Total and Subscale Scores for the Primary Caregiver–Reported ICU (Three-Factor Bifactor and Two-Factor Solutions) and Parent- and Teacher-Reported Internalizing Behavior Problems (Anxious and Withdrawn) at Ages 9.5 and 10.5 Years.
Note. ICU = Inventory of Callous-Unemotional Traits; Anx = Anxious-depressed; With = Withdrawn-depressed; PC = primary caregiver–reported.
p< .10. *p< .05. **p< .01. ***p< .001.
Cross-Sectional and Longitudinal Construct Validity: 3FBF Latent Model
In cross-sectional models at age 9.5, higher ICU general factor scores were associated with higher levels of both primary caregiver– and teacher-reported aggressive and rule-breaking behaviors (Table 7). Higher callous and uncaring specific scores were also related to higher primary caregiver–reported aggressive and rule-breaking behaviors. However, the unemotional specific factor predicted fewer primary caregiver–reported aggressive and rule-breaking behaviors, and was unrelated to teacher-reported outcomes in cross-sectional models. In longitudinal autoregressive models that controlled for earlier externalizing behavior problems, the ICU general factor significantly predicted increases in both teacher- and primary caregiver–reported rule-breaking. General ICU factor scores also predicted increases in teacher-reported aggressive behavior at age 10.5 (prediction of primary caregiver–reported aggressive behavior was a trend). It is noteworthy that the effect of the ICU general factor was greater in magnitude for the prediction of rule-breaking (primary caregiver–reported, β = .16, p < .01; teacher-reported, β = .22, p < .001) than for the prediction of aggression (primary caregiver–reported, β = .09, p < .10; teacher-reported, β = .16, p < .01). None of the specific factors accounted for unique variance in primary caregiver–reported aggressive or rule-breaking behavior from ages 9.5 to 10.5 years.
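For readers less familiar with autoregressive models, the longitudinal analyses described above can be written schematically as follows. This is a sketch under stated assumptions rather than the exact specification estimated in the study: it assumes that the general and the three specific factor scores enter the model simultaneously, and the symbols are introduced here for illustration only.

```latex
\[
Y_{i,10.5} \;=\; \beta_0 \;+\; \beta_1\, Y_{i,9.5}
\;+\; \beta_2\, G_i \;+\; \beta_3\, C_i \;+\; \beta_4\, U_i \;+\; \beta_5\, E_i
\;+\; \boldsymbol{\gamma}^{\top}\mathbf{z}_i \;+\; \varepsilon_i
\]
% Y_{i,10.5}: outcome for child i at age 10.5 (aggressive or rule-breaking behavior)
% Y_{i,9.5} : the same outcome at age 9.5 (the autoregressive term)
% G_i       : ICU general factor score
% C_i, U_i, E_i : callous, uncaring, and unemotional specific factor scores
% z_i       : covariates (intervention group, gender, race, ethnicity, project site)
```

Because the age 9.5 outcome is held constant through the autoregressive term, a positive coefficient on the general factor is read as predicting an increase in the outcome between ages 9.5 and 10.5 rather than its absolute level.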
Regression Models Showing 3FBF ICU Model General and Specific Factor Scores Predicting Primary Caregiver– and Teacher-Reported Age 9.5 and 10.5 Externalizing Behavior.
Note. ICU = Inventory of Callous-Unemotional Traits; CU = callous unemotional. All models controlled for intervention group, gender, race, ethnicity, and project site. Models predicting age 10.5 outcomes controlled for age 9.5 primary caregiver–reported aggressive or rule-breaking behavior as relevant. Controlling for concurrent internalizing disorder did not change the pattern of findings—results are not shown for brevity.
p < .10. *p < .05. **p < .01. ***p < .001.
When we examined associations with internalizing outcomes (anxious-depressed vs. withdrawn-depressed), we compared the pattern of effects when we did and did not control for concurrent externalizing behavior. In cross-sectional models, we found that a higher ICU general factor score was related to lower primary caregiver–reported anxious-depressed scores at age 9.5 (Table 8), but only after accounting for concurrent externalizing behavior. After accounting for variance explained by the general CU behavior factor, however, we found that unemotional and callous specific factors were related to higher withdrawn-depressed scores, and unemotional scores were also related to higher anxious-depressed scores. There were no significant associations between the ICU general or specific factor scores and teacher-reported outcomes in cross-sectional models at age 9.5 (Table 8). In longitudinal autoregressive models (Table 8), we found that ICU general factor scores were related to decreases in primary caregiver– and teacher-reported anxious-depressed scores and primary caregiver–reported withdrawn-depressed scores from ages 9.5 to 10.5, but again, only after controlling for concurrent externalizing behavior at age 10.5. Scores on the callous specific factor were also related to decreases in primary caregiver–reported anxious-depressed behavior. In line with cross-sectional models, we found that the unemotional specific factor was related to increases in primary caregiver– and teacher-reported withdrawn-depressed scores. Likewise, the callous specific factor predicted increases in teacher-reported withdrawn-depressed scores. We also report estimates when we did not control for concurrent externalizing behavior (in italics and parentheses; Table 8), the results of which reinforce the importance of considering cooperative suppression effects between externalizing and internalizing behavior in relation to associations with CU behavior.
Specifically, the direction of effects between the general ICU factor and primary caregiver–reported anxious-depressed score reversed when taking account of comorbid externalizing behavior symptoms (see the “Discussion” section).
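To make the reversal described above concrete, the following minimal sketch uses simulated data (not the study data) to show how a predictor can correlate positively with an outcome at zero order yet carry a negative standardized coefficient once a correlated covariate is controlled. The variable names, the generating coefficients, and the seed are all assumptions chosen for illustration; the two-predictor standardized beta is computed from the three pairwise correlations.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a regression of y on two predictors."""
    denom = 1 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


rng = random.Random(0)  # fixed seed so the illustration is reproducible
n = 2000

# Simulated scores: "cu" and "ext" are positively correlated; "anx" rises
# with "ext" but falls with "cu" once "ext" is held constant.
cu = [rng.gauss(0, 1) for _ in range(n)]
ext = [0.7 * c + sqrt(1 - 0.49) * rng.gauss(0, 1) for c in cu]
anx = [0.6 * e - 0.2 * c + 0.8 * rng.gauss(0, 1) for c, e in zip(cu, ext)]

r_cu_anx = pearson_r(cu, anx)
r_ext_anx = pearson_r(ext, anx)
r_cu_ext = pearson_r(cu, ext)
beta_cu, beta_ext = standardized_betas(r_cu_anx, r_ext_anx, r_cu_ext)

print(f"zero-order r(cu, anx)       = {r_cu_anx:.2f}")
print(f"beta for cu controlling ext = {beta_cu:.2f}")
```

In this simulation the positive zero-order correlation arises only because the simulated CU score travels with externalizing behavior, which is itself positively related to the simulated anxiety score; removing that shared variance exposes the negative direct association.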
Regression Models Showing 3FBF ICU Model General and Specific Factor Scores Predicting Primary Caregiver– and Teacher-Reported Age 9.5 and 10.5 Internalizing Behavior.
Note. ICU = Inventory of Callous-Unemotional Traits. All models controlled for intervention group, gender, race, ethnicity, and project site. Models predicting age 10.5 outcomes controlled for age 9.5 primary caregiver–reported aggressive or rule-breaking behavior as relevant. Models predicting internalizing symptoms include controlling for externalizing disorder. However, estimates when concurrent externalizing behavior was not controlled for are shown in italics and parentheses. These results highlight cooperative suppression effects of externalizing behavior in relation to associations between ICU scores and internalizing symptoms.
p< .10. *p< .05. **p< .01. ***p< .001.
Cross-Sectional and Longitudinal Construct Validity: Summed Scores
We re-examined associations with externalizing and internalizing outcomes using summed total and subscale scores within regression analyses. The pattern of findings broadly mirrored that obtained when associations were examined within a 3FBF framework. For brevity, we thus only present results from examining associations with externalizing outcomes (Table 9; results of models examining associations with internalizing outcomes available on request from authors). We also examined associations using summed scores based on the 2F model to test whether this more parsimonious set of items performed similarly to the 22-item set. Both the 3FBF and 2F produced summed total scores and callous and uncaring subscale scores that were cross-sectionally related to higher primary caregiver– and teacher-reported aggressive and rule-breaking behavior. In longitudinal autoregressive models, summed total scores of the 3FBF and 2F models predicted increases in primary caregiver and teacher reports of rule-breaking at age 10.5. In longitudinal autoregressive models, the uncaring subscale of both model solutions was related to increases in rule-breaking behavior across informants. As with the 3FBF analyses, however, we found that unemotional summed scores were cross-sectionally related to lower primary caregiver–reported aggressive and rule-breaking behavior, accounting for overlap with other subscales.
Regression Models Showing Summed ICU Total and Subscale Scores Predicting Primary Caregiver– and Teacher-Reported Ages 9.5 and 10.5 Externalizing Behavior.
Note. ICU = Inventory of Callous-Unemotional Traits; Ext = Externalizing; Agg = Aggression; RuleB = Rule-breaking; PC = primary caregiver–reported. All models controlled for intervention group, gender, race, ethnicity, and project site. Models predicting age 10.5 outcomes controlled for age 9.5 primary caregiver–reported aggressive or rule-breaking behaviors as relevant. Controlling for concurrent internalizing disorder did not change the pattern of findings—results are not shown for brevity.
p< .10. *p< .05. **p< .01. ***p< .001.
Discussion
In the current study, we addressed a number of questions surrounding the parent-reported ICU. We found acceptable model fit for a 3FBF model with correlated residuals for both primary and alternative caregiver reports, thus providing some corroboration of this structure among a sample of high-risk children aged 9.5 years. We also found good model fit for a revised 2F model using a reduced 12-item pool, providing support for the proposal by Hawes et al. (2014) to focus only on callous and uncaring and trim the unemotional item content. Total and subscale scores from the 3FBF and 2F models showed acceptable-to-high internal consistencies. An examination of the cross-sectional and longitudinal construct validity of scores within a 3FBF framework suggested that the ICU provides predictive validity particularly in relation to covert forms of antisocial behavior indexed via a measure of rule-breaking behavior, although effect sizes of this prediction were modest in magnitude within autoregressive models. Our results speak both to the assessment of CU behavior using this measure and to the construct itself.
Aim 1: To Examine the Factor Structure of the ICU
We found that a 3FBF model, with modification indices guiding correlation of five pairs of items, showed the best model fit for primary and alternative caregiver reports on the ICU. It is unclear how similar our use of modification indices is to those used in previous studies, which have not always reported the correlations specified among item residuals (e.g., Ciucci et al., 2014; Essau et al., 2006). The need to use modification indices is a limitation associated with the ICU that we had wanted to avoid. However, the necessity for this approach in both the current study and previous studies suggests overlapping item content within and between factors that may be compounded further by similar semantic item structure (e.g., eight items begin with “does not”). Furthermore, it is noteworthy that the factor loadings of the uncaring specific factor seemed to be the most affected within the bifactor model, suggesting that “uncaring” may be most closely aligned with a general CU factor (Pardini, Hawes, Burke, & Loeber, 2014). Thus, although we replicated the most commonly reported factor structure (i.e., 3FBF), the psychometric properties of the ICU continue to appear far from robust. Importantly, from a modeling perspective, the 2F solution of Hawes et al. (2014) had good model fit and appears to offer a more parsimonious assessment of a central callous and uncaring construct than the 3FBF model. This 2F model could also prove to be more stable across samples as modification indices and other specific changes were not needed. The 2F solution is supported by another recent study that assessed children aged 7 to 12 years, where confirmatory factor analysis showed that a 2F model comprising callous and uncaring dimensions fit the data best (Houghton et al., 2013).
Aim 2: To Test Cross-Sectional and Longitudinal Construct Validity of the ICU
Externalizing Problem Behavior
Within a 3FBF framework, the ICU general factor was cross-sectionally related to higher aggressive and rule-breaking behavior. The general factor also predicted increases in rule-breaking behavior from ages 9.5 to 10.5 years across informants and settings. This finding highlights that by leveraging shared variance among ICU items, and controlling for unique variance of specific factors, we tapped a construct with incremental validity in relation to the development of behavior problems, particularly covert antisocial behavior. Nevertheless, the practical utility of a “general bifactor” model may be limited given the inability to model a meta-factor with individual client data. Interestingly however, summed total and uncaring subscale scores implied by both the 3FBF and 2F models also predicted increases in rule-breaking from ages 9.5 to 10.5, after accounting for earlier rule-breaking. It was surprising that despite the widespread use of the ICU and the predominance of the 3FBF, no previous studies had examined associations of ICU scores with relevant outcome variables within a 3FBF (see Lahey, 2014). The fact that we found a similar pattern of findings when analyses were carried out using summed scores or within the 3FBF framework is striking. Ultimately, assessment is a practical enterprise, and our results provide justification for future studies to use a summed total score using either 22 ICU items or the revised 12-item ICU to create a total score with predictive utility. Furthermore, the results suggest that uncaring items may be particularly pertinent for identifying those youth at risk for displaying increases in rule-breaking behavior among high-risk children. At the same time, effect sizes were modest in magnitude, especially for longitudinal autoregressive models, which may have been due to high stability of externalizing behavior. 
Nevertheless, the incremental validity of the ICU scores suggests that knowing about CU behavior, as indexed by the ICU, may be helpful for tailoring specific intervention or treatment components to different subgroups of youth (e.g., Dadds et al., 2014).
Whereas total and uncaring scores were reliably related to more externalizing problems across informants, we found that unemotional subfactor scores were related to lower scores when we controlled for variance in general CU behavior within a 3FBF framework. This effect was particularly notable for cross-sectional parent-reported aggressive and rule-breaking symptoms. As few studies have examined the longitudinal predictive validity of ICU within a 3FBF framework, we interpret our findings with caution. One possibility is that once variance in general CU behavior and specific uncaring and callousness factors is partialled, the variance remaining in unemotionality actually relates to emotional resiliency or a lack of externalizing problem behavior, and thus represents a marker of positive mental health. We return to an evaluation of the unemotional part of the CU behavior construct later in this discussion.
Internalizing Problem Behavior
Within a 3FBF, we obtained expected negative associations between the general CU behavior factor and anxious-depressed scores. Notably, the general factor predicted lower anxious-depressed scores at age 10.5, controlling for autoregressive effects, and across primary caregiver– and teacher-reported outcomes. Thus, our results support the notion that the shared general variance within ICU items is related to lower anxiety, in line with the defining characteristics of affective aspects of psychopathy among adult samples (Frick et al., 2014; Lykken, 1995). Findings were replicated using summed scores for the total ICU and uncaring scores implied by both the 3FBF and 2F model for primary caregiver–reported anxious-depressed scores. However, this pattern of effects only emerged when we controlled for comorbid externalizing symptoms; otherwise ICU general scores were related to higher anxious-depressed scores. This reversal in the direction of associations is notable, and represents possible cooperative suppression (see Frick et al., 2014). By partialling the variance in this way, we appeared to be accounting for comorbidity between externalizing and internalizing symptoms, which may be underpinned by some higher order dimension, such as behavioral dysregulation or negative emotionality (for further discussion of this issue, see Lilienfeld, 2003; Hyde, Byrd, Votruba-Drzal, Hariri, & Manuck, 2014). Only when this overlapping variance was accounted for were we able to obtain expected negative associations between CU behavior and anxiety (also, see Loney, Frick, Clements, Ellis, & Kerlin, 2003). At the same time, our results highlight that while partialing variance is a useful approach to demonstrate construct validity within the context of statistical modeling, the reality may be more complex with many children presenting with problems across multiple domains of functioning.
In contrast to the negative association between ICU total scores and anxious-depressed symptoms, we found callousness to be related to higher withdrawn-depressed scores even after controlling for concurrent externalizing problems. It may be that this finding emerged as a statistical artifact within a 3FBF framework. However, a similar pattern of findings was found using summed scores in regression models. Furthermore, Hawes et al. (2014) also reported a positive association between ICU callous scores and measures of internalizing behavior problems. Thus, when variance in other subscales or a general CU behavior factor is partialled, it appears that callousness may relate to parents or teachers endorsing children as being socially withdrawn, isolated, or low in mood. This finding highlights the need for careful consideration of the wording of callous items. In particular, it is noteworthy that the uncaring subscale is composed of positively worded items, whereas the callous subscale is composed of negatively worded items. Future studies are needed to examine whether the same pattern of associations is achieved when callousness is assessed with positively worded items and uncaring with negatively worded items. Moreover, it may indicate that the suppression effects discussed above do not generalize to all types of internalizing, only those focused on symptoms of anxiety (vs. depression).
The Meaning of “Unemotional”
We found that when callous, uncaring, or general factor scores were controlled for, unemotional behavior was related to more withdrawn-depressed symptoms. In conjunction with the finding that unemotional scores were related to lower aggressive and rule-breaking scores (see earlier), our results thus provide support for the conclusions of Hawes et al. (2014) that consideration needs to be given to conceptualizations of “unemotional” among children and adolescents. In particular, the ICU unemotional items may not be doing a good job of capturing “unemotionality” as it relates to the nomological network of CU behavior. Interestingly, in a previous article assessing this sample in the preschool years (Hyde, Shaw, Gardner, et al., 2013), we developed a “home-grown” measure of CU-like behavior and found that traditional “unemotional” items did not load with items indexing callousness and uncaring.
Alternatively, it may be that “unemotionality” as indexed by the ICU is interpreted by informants as withdrawn, anhedonic, or shy behavior, which differs somewhat from conceptualizations of the unemotional component of CU behavior indexing reduced anxiety or fearlessness (cf. low fear in psychopathy; Lykken, 1995). In support of this notion, Ezpeleta et al. (2013) reported a positive association between teacher-reported unemotionality and anxiety among preschoolers. A review of individual cases suggested that teachers did appear to be rating “unemotional” children as shy, socially phobic, and unable to express feelings (Ezpeleta et al., 2013, p. 102). At the same time, it is possible that once variance in callous and uncaring is partialled, unemotionality predicts covert aggression not captured in either our aggressive or rule-breaking subscales (e.g., lying that remains undetected; relational aggression).
Another explanation is that the general CU behavior factor acts as a suppressor variable, obscuring the relationship of the unemotional subfactor with criterion-related variables when considered in the 3FBF framework. Indeed, in zero-order correlations, unemotional scores were associated with higher externalizing, albeit with effect sizes that were smaller in magnitude than those for callous or uncaring scores. However, this conclusion is difficult to reconcile with the positive associations we found between unemotional and internalizing problem behaviors. Furthermore, while a lack of emotional responsivity has been documented at a neurobiological level for antisocial youth with CU behavior (i.e., reduced amygdala responsivity to others’ fear and reduced responsivity to punishment; see Hyde, Shaw, & Hariri, 2013), it is less clear that children with high CU behavior are, in fact, less emotionally expressive, as is implied by the ICU unemotional items (e.g., “hides feelings from others”). For example, previous studies have shown that adolescents with high CU behavior display significant negative emotionality, including anger, anxiety, and depression (e.g., Kimonis, Frick, Cauffman, Goldweber, & Skeem, 2012).
As such, there is a need for refinement of items to capture more accurately what is meant by “unemotional.” For example, some youth may be callous/uncaring but display fearlessness and low anxiety, whereas others may be callous/uncaring and show high levels of anxious or fearful behavior as a result of early environmental risk (Kimonis et al., 2012; Waller et al., 2013). Indeed, classic descriptions of psychopathy typically focus on low fear rather than a lack of emotionality in general (Lykken, 1995). Thus, revised items are needed to capture distinctions in CU behavior in the context of negative emotional responses (including anger or anxiety) versus low levels of fear. Incorporating assessment of temperamental dimensions, such as negative emotionality or prosociality, may help differentiate between antisocial subgroups of youth with and without CU behavior (Lahey, 2014). At the same time, one item of the original unemotional scale (“does not show emotions”) was retained in the shortened ICU, albeit as part of the callous factor (Hawes et al., 2014). Thus, it appears that continued consideration of the emotional displays of children (or lack thereof) is warranted to provide the affective context for any callous or uncaring behavior (see Rowe, 2014). However, our results relating to unemotionality highlight that the CU behavior construct, at least as assessed by the ICU, is relatively narrow and indexes only some of the personality traits linked to adult psychopathy. Thus, the CU behavior construct should not be equated with psychopathy.
Implications for Assessment of the CU Behavior Construct
There are a number of implications for assessment of the CU behavior construct. First, the results provide justification for practitioners to assess CU behavior using a sum of 22 ICU items, as implied by our 3FBF, or using the revised 12-item set proposed by Hawes et al. (2014), both of which added unique variance in relation to the prediction of future behavior problems across informants and settings. Use of these ICU summed scores may help in the diagnosis of the DSM-5 “limited prosocial emotions” specifier. Second, the 3FBF implies separable components of the CU behavior construct. However, we also found unexpected correlations with criterion-related variables that undermine conceptualizations of the subfactors, particularly the unemotional subfactor. The “unemotional” items do not appear to operate as intended in the nomological basis of CU behavior and thus may not be useful clinically or conceptually. Taken together, these points suggest that in using the ICU for assessment, the most meaningful and reliable predictive validity is derived via use of a latent general or summed total score. However, our results also highlight the need for alternative self-, parent-, or teacher-reported measures of CU behavior with stronger psychometric properties. Furthermore, the need for alternative methodologies is implied, especially in relation to how items are interpreted (i.e., callousness as withdrawal or anhedonia). One new assessment approach, the “Clinical Assessment of Prosocial Emotions” (Frick, 2013), is currently under development, and is designed to assess CU behavior via semi-structured and self-reported interviews.
Finally, our analyses were conducted within a dimensional framework, and as such, we cannot speak to the application of the ICU within person-centered analyses or for categorical “diagnosis.” However, evidence from neuroimaging studies has demonstrated divergent patterns of amygdala reactivity among youth with conduct disorder based on their level of CU behavior (e.g., amygdala hyporeactivity to threat for high CU behavior vs. amygdala hyperreactivity to threat for low CU behavior; Viding et al., 2012), which represents a strong, objective test of the discriminant validity of ICU scores. In this example, a cutoff score was created using a median split on ICU total scores (Viding et al., 2012, p. 1110). However, future studies are needed to establish the validity of cutoff scores using the ICU or other measures of CU behavior, including the Clinical Assessment of Prosocial Emotions (Frick, 2013).
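For illustration, a median split of the kind mentioned above can be sketched as follows. The function name `median_split`, the example scores, and the rule that scores equal to the median fall in the "low" group are assumptions made here; the source does not state how ties were handled in the cited study.

```python
from statistics import median


def median_split(scores):
    """Return the sample median and a 'high'/'low' label for each score.

    Assumption: scores exactly at the median are labeled 'low'.
    """
    cut = median(scores)
    labels = ["high" if s > cut else "low" for s in scores]
    return cut, labels


# Hypothetical summed total scores for six children (illustrative only).
demo_scores = [3, 10, 7, 15, 22, 5]
demo_cut, demo_labels = median_split(demo_scores)
print(demo_cut, demo_labels)
```

A cutoff derived this way depends entirely on the sample in which it is computed, which is one reason a median split in one study does not by itself establish a cutoff that is valid in other samples.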
Strengths and Limitations
There were a number of strengths to the study, including a fairly large sample size, having children all assessed at the same age, use of a prospective longitudinal design, use of multiple informants, and corroboration of findings across informants. However, there were several limitations. First, we focused on low-income children with risk factors across multiple domains, including sociodemographic risk and early child problem behavior. Thus, it is unclear whether our results would generalize to children from higher income families with fewer risk factors. At the same time, our findings represent a useful complement to previous studies that have examined the ICU factor structure and that have focused on adolescents (e.g., Fanti et al., 2009), community or school samples (e.g., Ezpeleta et al., 2013), and incarcerated youth (Kimonis et al., 2008). Results from the current study likely bridge the distribution across these studies in having some children who were low and others very high on externalizing. Second, while we collected data from teachers for externalizing and internalizing outcomes, we only had primary and alternative caregiver reports for the ICU. Nevertheless, it remains unclear who the best informant is for assessing youth CU behavior and our results suggest a need to investigate alternative ways to find out about these emerging aspects of personality, especially in relation to concerns regarding the validity of self-reported measures of psychopathy (e.g., Lilienfeld, 1994). Finally, there was some attrition in our sample by 10.5 years old. We used FIML to accommodate missing data for models testing prediction by ICU scores for the 540 families for whom we had ICU data at age 9.5, although it is unclear whether the prediction of outcomes by primary caregiver–reported ICU scores would have differed among those lost to follow-up.
Conclusions
We found support for use of a general CU behavior score based on 22 items of the ICU, which was related to future rule-breaking behavior and lower anxious-depressed behavior problems over a 1-year period across informants and settings. Thus, our study supports the use of the ICU among high-risk children to identify those at risk for developing covert forms of antisocial behavior. However, the effect size of predictions was not large, and future work is indicated for better assessment and identification of youth with particularly accelerating forms of externalizing behavior problems. In particular, our study presents a number of questions moving forward for this relatively young field. Although there appears to be utility in using a total score that captures shared variance across this set of items, more psychometric work is needed to examine the item wording and specific correlates of subfactors, particularly the unemotional subfactor. Indeed, total summed ICU scores derived from a refined 12-item version of the measure showed comparable longitudinal predictive validity to the original item set. Furthermore, the model fit for the 2F solution for these 12 items was very good, and this model appears to offer a parsimonious method of assessing callous and uncaring behaviors with more robust psychometric support.
Acknowledgements
We thank the families and staff of the Early Steps Multisite Study. We also thank Dustin Pardini and Samuel Hawes for helpful comments on an earlier draft of the article. Finally, we are very grateful to three anonymous reviewers and the editor for valuable suggestions and feedback during the review process.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Grants 5R01 DA16110 and 5R01 DA16110-02 from the National Institutes of Health, awarded to Thomas J. Dishion, Daniel S. Shaw, Melvin N. Wilson, and Frances Gardner. Aidan G. C. Wright’s efforts were supported by Grant F32 MH097325 from the National Institute of Mental Health.
