Abstract
Psychopathy refers to a range of complex behaviors and personality traits, including callousness and antisocial behavior, typically studied in criminal populations. Recent studies have used self-reports to examine psychopathic traits among noncriminal samples. The goal of the current study was to examine the underlying factor structure of the Self-Report of Psychopathy Scale–Short Form (SRP-SF) across complementary samples and examine the impact of gender on factor structure. We examined the structure of the SRP-SF among 2,554 young adults from three undergraduate samples and a high-risk young adult sample. Using confirmatory factor analysis, a four-correlated factor model and a four-bifactor model showed good fit to the data. Evidence of weak invariance was found for both models across gender. These findings highlight that the SRP-SF is a useful measure of low-level psychopathic traits in noncriminal samples, although the underlying factor structure may not fully translate across men and women.
Psychopathy is defined by a constellation of complex personality and behavioral traits, including callousness, irresponsibility, impulsivity, and antisocial behavior. To better parse the complexity within the psychopathy construct, varying models have been proposed that use different factor structures to measure an array of interpersonal and affective personality features, as well as deviant lifestyle and antisocial behavior characteristics (e.g., Cooke & Michie, 2001a; Hare & Neumann, 2008; Neumann, Hare, & Pardini, 2014). These models are based on conceptualizations of psychopathy as having: (1) two meta-factors containing a personality factor and a behavioral factor; (2) three specific factors that focus only on the personality factors and exclude antisocial behavior as a separate construct; or (3) a four “facet” structure in which the personality and behavioral factors are each split into two facets that tap interpersonal versus affective personality traits and deviant lifestyle versus antisocial behavior traits (Hare & Neumann, 2005). Though many studies focus on two-factor or four facet approaches, one underlying debate in the field is the extent to which the psychopathy construct taps a unitary construct with underlying factors; that is, the extent to which psychopathy can be considered a multidimensional versus unidimensional construct (Patrick, Hicks, Nichol, & Krueger, 2007). To address this debate, highly studied correlated factor solutions need to be compared with newer models that examine hierarchical or general-specific models. Beyond meta-structure, psychopathy has been studied among forensic samples, but research has highlighted the utility of examining psychopathy dimensionally within normative samples (Babiak, Neumann, & Hare, 2010; Neumann & Pardini, 2014; Neumann, Schmitt, Carter, Embley, & Hare, 2012; Skeem, Polaschek, Patrick, & Lilienfeld, 2011; Welker, Lozoya, Campbell, Neumann, & Carré, 2014). 
Thus, recent studies have also begun to examine the factor structure of psychopathy in community samples, using self-reported data (e.g., Mahmut, Menictas, Stevenson, & Homewood, 2011; Williams, Paulhus, & Hare, 2007), which can be helpful for understanding psychopathic traits dimensionally in individuals with lower levels of these traits (Lilienfeld, Fowler, & Patrick, 2006). Moreover, many studies of forensic populations have focused exclusively on men, meaning that studies are needed that include women (e.g., community samples) to examine the extent to which factor structure may be moderated by gender.
In the current study, we examined traditional and more novel factor structures of the Self-Report of Psychopathy–Short Form (SRP-SF; Paulhus, Neumann, & Hare, 2015) as well as the generalizability of these factor structures across gender. To examine the meta-structure of self-reported psychopathy with this relatively new measure, we tested traditional factor models (i.e., 1-4 correlated factors) and compared model fit to hierarchical factor and bifactor models (i.e., a model that specifies a general “g” factor and four specific factors that are uncorrelated with the general factor). We also examined configural and scalar invariance of the best-fitting solutions across gender. Thus, our overarching goal was to examine the SRP-SF to inform knowledge of its underlying factor structure and to provide a better understanding of the broader conceptualizations and structure of the psychopathy construct in a self-reported measure among community samples.
Development of the Self-Report Psychopathy (SRP) Scale
Derived from the original Psychopathy Checklist (PCL; Hare, 1985), the SRP was developed for use in nonforensic populations as a practical and brief method to assess psychopathic traits. Several versions of the SRP, including the SRP-II (Hare, Harpur, & Hemphill, 1989) and the SRP-III (Paulhus & Hemphill, 2006), have been examined among both forensic and community samples. In general, while a two-factor solution, consisting of “affective-interpersonal” and “social deviance” factors, has not consistently demonstrated acceptable fit (e.g., Lester, Salekin, & Sellbom, 2013; Williams & Paulhus, 2004), the four-factor solution that has been applied to the Psychopathy Checklist–Revised (PCL-R, Hare, 1999; e.g., Hare & Neumann, 2005, 2008; Hill, Neumann, & Rogers, 2004; Vitacco, Neumann, & Jackson, 2005) has shown good fit when applied to the SRP (e.g., Lester et al., 2013; Mahmut et al., 2011; Neal & Sellbom, 2012; Neumann et al., 2012; Neumann et al., 2014; Neumann & Pardini, 2014; Seibert, Miller, Few, Zeichner, & Lynam, 2011; Visser, Ashton, & Pozzebon, 2012; Welker et al., 2014; Williams, Nathanson, & Paulhus, 2003). Based on support for a four-factor solution, theorists have argued that these solutions denote two meta-factors including two facets each, which all combine to form the psychopathy construct (Hare & Neumann, 2005, 2008). However, as solutions containing multiple factors and facets gain support, a question that emerges is whether variance in items of these facets contributes to a higher-order or “general” psychopathy factor (Patrick et al., 2007), particularly among community samples where levels of these traits are lower and factors may be less distinct.
Recently, Neumann and colleagues have developed a shortened version of the SRP-IV-SF, which contains only 29 items and may provide a more efficient method of measuring psychopathic traits among larger samples (Neumann et al., 2014; Neumann & Pardini, 2014). The SRP-SF has been examined using confirmatory factor analysis (CFA) in seven studies to date (see Table 1). These studies have not provided as consistent an account of the structure of the SRP-SF, finding support for two-factor (Foulkes, Seara-Cardoso, Neumann, Rogers, & Viding, 2013; Seara-Cardoso, Dolberg, Neumann, Roiser, & Viding, 2013; Seara-Cardoso, Neumann, Roiser, McCrory, & Viding, 2012), three-factor (Neumann et al., 2012), and four-factor (Carré, Hyde, Neumann, Viding, & Hariri, 2013; Declercq, Carter, & Neumann, 2015; Foulkes et al., 2013; Neumann et al., 2014; Welker et al., 2014) structures. The mixed findings from these previous factor analyses highlight that continued search for the best way to conceptualize the underlying structure of the SRP-SF is needed. Moreover, a limitation is that only two previous studies have statistically compared the model fit for different factor solutions of the SRP-SF. Although different factor solutions may provide good fit to the data, direct statistical comparison of solutions can better determine which model best explains the underlying factor structure of the SRP-SF. A second limitation is that previous studies have typically only focused on examining the SRP-SF within gender (e.g., male or female), as opposed to across groups, which limits the generalizability of findings across sample type (although see Neumann et al., 2014, for an exception). Taken together, these limitations highlight that additional research is needed to examine the factor structure of the SRP-SF between models including novel approaches, ideally with direct statistical comparisons, and among samples that include men and women.
Table 1. Summary of Sample Descriptives and Model Fit Statistics Reported in Previous Studies Examining SRP-SF Factor Structure.
Note. df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation.
(a) This model did not include an antisocial factor. (b) Factors were modeled as predictors of external correlates. (c) Fit better than 1 factor (χ2[1] = 6.65, p < .05). (d) 2 factor fit better than a 1 factor (χ2[1] = 4.42, p < .05). (e) 4 factor fit better than 2 factor (Δχ2[5] = 78.47, p < .001). YPI = Youth Psychopathic Inventory.
Hierarchical and Bifactor Models of Psychopathy
Another issue to consider is whether a traditional correlated factor structure represents the best approach to modeling psychopathy using SRP-SF data. Indeed, alternative structures within the psychopathy literature that emphasize both unidimensionality and multidimensionality within constructs include hierarchical and bifactor models. Hierarchical factor structures conceptualize psychopathy as a second-order construct driven by first-order “specific” factors, which are allowed to correlate. In these models, the variance in the traditional two-factor or four-factor solutions contributes to a meta-factor that represents psychopathy through two factors modeling the latent covariance between factors; thus, an overarching construct of psychopathy is characterized as the commonality among the specific factors (e.g., affective, lifestyle). Although hierarchical models have been used occasionally in the literature (Cooke & Michie, 2001a), further research is needed to test the validity of applying such structures to the psychopathy construct, particularly given the central place this solution has had in models of the structure of psychopathy (Hare & Neumann, 2005, 2006).
Beyond hierarchical approaches, bifactor models are more novel to the study of psychopathy, and include a general factor that captures shared variance across all items, while simultaneously modeling the variance captured by specific factors within subsets of items. In this model, items contribute variance both to a general factor and to one of several specific factors. In contrast to both correlated and hierarchical models, bifactor models allow for separation of the unidimensional component (i.e., general psychopathy) from the multidimensional components of a construct (e.g., unique affective or interpersonal components that are unrelated to unique antisocial behavior components). Bifactor models have been used in the intelligence literature (e.g., Carroll, 1993) and personality assessment (e.g., Reise, Moore, & Haviland, 2010) in delineating overarching, unitary constructs (e.g., intelligence), as well as separable, unique orthogonal components (e.g., verbal and spatial intelligence). Similarly, a bifactor model may better account for psychopathic traits in noncriminal populations, in which an overarching psychopathy factor can be distinguished from more specific factors that may tap less inherently harmful traits once overall psychopathy is parsed (e.g., once variance related to deviance is parsed from items tapping low affect, these items may actually signal well-being or emotional stability, describing individuals with calm or unflinching demeanors). Bifactor solutions have been applied to the PCL-R (Flores-Mendoza, Alvarenga, Herrero, & Abad, 2008; Patrick et al., 2007) and the extended SRP-III (e.g., Debowska, Boduszek, Kola, & Hyland, 2014). However, no previous studies have examined whether a bifactor model provides the best factor solution for the SRP-SF. A bifactor model appears particularly appealing in relation to community data, where there is often limited variability in responses (Williams et al., 2007). 
Thus, a second aim of the current study was to examine the fit of a bifactor model to the SRP-SF data.
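The three model families contrasted above can be written compactly. The notation below is a conventional textbook sketch rather than anything specified for the SRP-SF: x_i is the (latent) response to item i, k(i) is the facet that item i belongs to, P and G denote higher-order and general psychopathy factors, and every symbol is an assumption introduced for illustration.

```latex
\begin{align*}
\text{Correlated factors:}\quad & x_i = \lambda_i F_{k(i)} + \varepsilon_i,
  & \operatorname{Cov}(F_k, F_l) &= \phi_{kl} \\
\text{Hierarchical:}\quad & x_i = \lambda_i F_{k(i)} + \varepsilon_i,
  & F_k &= \gamma_k P + \zeta_k \\
\text{Bifactor:}\quad & x_i = \lambda_i^{(g)} G + \lambda_i^{(s)} S_{k(i)} + \varepsilon_i,
  & \operatorname{Cov}(G, S_k) &= 0
\end{align*}
```

In the hierarchical form, items relate to P only through their facet, whereas in the bifactor form each item loads directly on G and, separately, on one specific factor.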
Gender as a Consideration for Measurement Invariance
In addition to an assessment of model fit, an important test of the validity of the SRP-SF factor structure is to examine the generalizability of different factor solutions and to determine whether the measure assesses the psychopathy construct in the same way across different populations, particularly between genders. Despite the existence of historical cases of female psychopathy (Cleckley, 1941), due to limited research on psychopathy within women (though see Verona & Vitale, 2006), it is unclear if proposed conceptualizations of psychopathy are applicable to both genders, particularly given demonstrated higher prevalence rates (Vitale, Smith, Brinkley, & Newman, 2002) and higher scores on psychopathy measures in males (Rogstad & Rogers, 2008). If we are to understand potential gender differences in the etiology or nomological network of psychopathy, we must first know that the measure tapping psychopathy is measuring the same construct across genders. This assumption may be problematic given that some previous research suggests that the measurement of psychopathic traits differs by gender (Cale & Lilienfeld, 2002; Verona & Vitale, 2006; although see Neumann et al., 2012, for an exception).
Indeed, because assessments of psychopathy were primarily developed in male populations, item and factor scores may assess fundamentally different constructs in men and women, particularly at the extremes of the spectrum. As an illustration of this point, Cooke and Michie (2001b) found that within a group diagnosed as highly psychopathic, women had lower overall scores on the PCL-SV. Thus, the total PCL-SV score (and cutoffs for diagnosis) may not entail the equivalent level of severity in women as men with the same diagnosis (Forouzan & Cooke, 2005). Additionally, symptoms of psychopathy may not factor together in women in the same way as they do in men. For example, in a previous factor analysis of the PCL-R in women, some items loaded on differential factors or did not load onto any factor (Salekin, Rogers, & Sewell, 1997). An assessment of measurement invariance could help to determine the extent to which items and factors of a measure are equivalent across men and women. However, very few studies have examined invariance of psychopathy measures across gender (Kosson et al., 2013; Neumann, Kosson, Forth, & Hare, 2006) and no previous studies have tested measurement invariance of the SRP-SF across gender.
Beyond simply testing for invariance across men and women, studies have rarely differentiated between types of invariance. Configural invariance refers to the similarity of factors obtained within a factor structure (i.e., whether the same factors exist across samples; see Sass, 2011); that is, whether psychopathy includes the same pattern of clusters of traits (i.e., interpersonal, affective, lifestyle, antisocial) in both genders. In contrast, the more stringent scalar invariance indicates that for the same score on a factor, men and women have the same “intercept,” or baseline level of endorsement for each item within the latent variable. Differences in scalar invariance across gender could reflect gender differences in the rate of the given behavior or in the interpretation and endorsement of the item. Scalar invariance is particularly important because items indexing psychopathy may load together as “factors” similarly across populations, but still demonstrate mean item-level differences. For example, an antisocial specific factor could exist in both an offender and a community sample, but the item “I have committed a serious crime” will always result in higher levels of endorsement in offender populations, regardless of whether or not offenders have high levels of psychopathic traits.
Previous research has demonstrated configural invariance across gender for different measures of psychopathy, including the PCL-R (e.g., Bolt, Hare, Vitale, & Newman, 2004), PCL-YV (Dillard, Salekin, Barker, & Grimes, 2013), and earlier versions of the SRP (e.g., Neumann et al., 2012). However, studies using the PCL-R and PCL-YV have also identified item-level (i.e., scalar) differences by gender, including increased endorsement of antisocial items in men compared with women (e.g., Bolt et al., 2004), and boys compared with girls (e.g., Dillard et al., 2013). Thus, an examination of the validity of the SRP-SF would require comparison between configural invariance and scalar invariance to develop a more precise understanding of how to best identify the construct across gender.
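The distinction between configural and scalar invariance can be stated as constraints on a multiple-group measurement model. The following is a generic sketch for ordered-categorical items, where thresholds take the place of intercepts. The symbols (group index g, latent response x*, loading λ, threshold τ, factor η) are assumptions introduced for illustration and are not drawn from the studies cited.

```latex
\begin{align*}
x^{*}_{ig} &= \lambda_{ig}\,\eta_{g} + \varepsilon_{ig}, \\
x_{ig} &= c \quad \text{if } \tau_{ic,g} < x^{*}_{ig} \le \tau_{i(c+1),g}, \\
\text{Configural:}\quad & \text{same item-to-factor pattern in each group; }
  \lambda_{ig},\ \tau_{ic,g} \text{ free}, \\
\text{Scalar:}\quad & \lambda_{ig} = \lambda_{i}, \qquad \tau_{ic,g} = \tau_{ic}
  \quad \text{for all groups } g .
\end{align*}
```

Under the scalar constraints, two respondents with the same factor score have the same expected item response regardless of group, which is the condition described above.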
Present Study
The current study examined the factor structure and measurement invariance of the SRP-SF across four complementary samples. First, we used CFA to examine the model fit of six common model solutions reported within the psychopathy literature (1-factor, 2-correlated factor, 3-correlated factor, 4-correlated factor, 2 hierarchical factor, and 4 hierarchical factor), collapsing data across four independent samples (N = 2,377; 49% female) that represented a balance of regions across the United States (i.e., Midwest vs. the South), type of university (i.e., public vs. private), and type of sample (i.e., undergraduate vs. low-income community). Based on recent research that has indicated the potential value of bifactor models of psychopathy, we also examined model fit of a two-factor-bifactor and four-factor-bifactor solution. Second, we investigated both configural and scalar measurement invariance of the best fitting models across gender. 1 We expected to demonstrate configural invariance across gender, given previous research indicating the validity of these factor solutions of psychopathy using similar measures. However, we did not make specific hypotheses regarding scalar invariance, as we were aware of only one study that examined scalar invariance by gender in addition to configural invariance of a psychopathy measure (Neumann et al., 2012).
Method
Our goal was to use a large sample size to validate a newly developed measure. Additionally, we included a higher risk sample to cover a wider distribution of psychopathic traits.
Sample 1: Midwestern Public Undergraduate Students
Sample 1 consisted of 384 (65.1% female) college students from a large, public Midwestern university. The mean age in the sample was 19.33 years (SD = 1.67), ranging from 18 to 34 years. The sample consisted primarily of students who identified as European American (n = 273; 71.1%), but also included 61 (15.9%) who identified as Asian American, 13 (3.4%) as African American, and 17 (4.4%) as biracial or multiracial. Additionally, 51 (3.9%) students identified their race as “other” and 16 (4.2%) reported their ethnicity to be Hispanic American. Participants gave written informed consent for participating in the study and voluntarily completed questionnaire measures as part of credit for taking a Psychology course. Participants completed a basic demographics questionnaire (assessing self-reported age, gender, race) and the SRP-SF (Paulhus et al., 2015) during a computer session.
Sample 2: Southern Public Undergraduate Students
Sample 2 consisted of 848 (45.8% female) college students from a large, public Southern university. The mean age in the sample was 20.71 years (SD = 4.24), ranging from 18 to 57 years. The sample consisted primarily of participants who identified as European American (n = 380; 44.8%), but also included 89 (10.5%) who identified as African American, 43 (5.1%) as Asian American, 2 (0.2%) as Native American, and 74 (8.7%) as multiracial. Additionally, 56 (6.6%) participants reported their ethnicity to be Hispanic American. Participants gave written informed consent for participating in the study and voluntarily completed questionnaire measures for course credit. Participants completed paper versions of a basic demographics questionnaire (assessing self-reported age, gender, race) and the SRP-SF during a group testing session at the university.
Sample 3: Southern Private Undergraduate Students
Sample 3 consisted of 1,012 (57.3% female) college students recruited from a larger study at a private Southern university. The mean age in the sample was 19.67 years (SD = 1.25), ranging from 18 to 22 years. The full sample consisted primarily of individuals who identified as European American (n = 497; 49.1%), but also included 285 (28.2%) who identified as Asian American, 124 (12.4%) as African American, 3 (0.3%) as Native American, and 73 (7.2%) as biracial. Additionally, 29 (2.9%) students identified their race as “other.” Participants gave written informed consent for participating in the study and voluntarily completed questionnaire measures. Participants completed the study during a computer session where the participants responded to a battery of questionnaires assessing a wide variety of personality traits.
Sample 4: Urban, Low-Income Community Sample of Boys
Sample 4 was drawn from the Pitt Mother & Child Project, an ongoing longitudinal study of child vulnerability and resiliency in low-income families (Shaw, Gilliom, Ingoldsby, & Nagin, 2003). In 1991 and 1992, 310 infant boys and their mothers were recruited from Allegheny County Women, Infant, and Children Nutrition Supplement Clinics when the boys were between 6 and 17 months old. Many boys in this study were considered at elevated risk for antisocial outcomes because of their families’ SES at the time of recruitment, with mean per capita family income at $241 per month ($2,892 per year), and were followed almost annually from age 1.5 to age 22. The SRP-SF was completed by youth at age 22 via self-report. Based on parent-reported race of the child (now adult), the full sample consisted primarily of men who were identified by their parents as European American (n = 161; 51%) and as African American (n = 123; 39%). Additionally, parent-reported race for 29 (9.2%) men was “other” and 1 (0.3%) male was reported to be Hispanic American.
Institutional review boards at each respective university gave approval for each study.
Measures
Assessment of Psychopathic Traits
Psychopathic traits were assessed using the 29-item Self-Report Psychopathy Short-Form (SRP-SF; Neumann & Pardini, 2014; Paulhus et al., 2015), a self-report measure of psychopathy derived from and shown to correlate highly with the Psychopathy Checklist-Revised (Neumann et al., 2014; Paulhus et al., 2015). The items are grouped into four dimensions of psychopathy: affective callousness (e.g., “I never feel guilty over hurting others”), interpersonal manipulation (e.g., “I think I can beat a lie detector”), antisociality (e.g., “I have tried to hit someone with a vehicle”), and erratic lifestyle (e.g., “I’ve often done dangerous things just for the thrill of it”; Neumann & Hare, 2008). Participants rated these items based on the extent to which they thought the statements reflected their own beliefs using a 5-point Likert-type scale (1 = disagree strongly to 5 = agree strongly). Each factor of the SRP-SF showed high internal consistency in the current study (interpersonal, α = .95; affective, α = .92; lifestyle, α = .87; antisocial, α = .88; total scores, α = .98), similar to previous studies of the SRP-SF (e.g., Neal & Sellbom, 2012; Neumann & Pardini, 2014).
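For readers unfamiliar with the internal consistency coefficient reported above, the following is a minimal sketch of how Cronbach's α is computed from item-level responses. The function name and the toy responses are hypothetical and are not the study data. The sketch assumes complete data with no missing values.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes complete data: every respondent answered every item.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from four people to three 5-point items.
toy = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [5, 4, 5]]
print(round(cronbach_alpha(toy), 3))
```

Alpha rises when the variance of the total score is large relative to the sum of the item variances, that is, when items covary strongly.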
Analytic Strategy
Aim 1: To Examine the Factor Structure of the SRP-SF
First, we used CFA in Mplus version 7.2 (Muthén & Muthén, 2014) to compare model fit for eight solutions for the SRP-SF: one-factor, two-correlated factor, three-correlated factor, four-correlated factor, two-hierarchical factor, four-hierarchical factor, two-factor-bifactor, and four-factor-bifactor. In both the four-correlated and four-bifactor models, items of the SRP-SF were specified to load on the four dimensions as previously specified (Paulhus et al., 2015): interpersonal (Items 7, 9, 10, 15, 19, 23, and 26), affective (Items 3, 8, 13, 16, 18, 24, and 28), lifestyle (Items 1, 4, 11, 14, 17, 21, and 27), and antisocial (Items 20, 5, 6, 12, 22, 25, and 29). Additionally, in the two- and four-bifactor models, all items were specified to load onto one general factor, which was specified not to correlate with the specific factors (i.e., both bifactor models had a single “general” factor and then either 2 or 4 specific factors).
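As a check on the specification above, the degrees of freedom of the four-correlated and four-bifactor models can be counted by hand. The sketch below rests on several assumptions that are not stated in the text: factor variances are fixed at 1 for identification, the items are treated as ordinal so that only the p(p − 1)/2 polychoric correlations count as sample information, and the specific factors in the bifactor model are uncorrelated with each other as well as with the general factor. The item lists are copied as given, and the function names are hypothetical.

```python
from math import comb

# Item-to-facet assignment exactly as listed in the text.
FACETS = {
    "interpersonal": [7, 9, 10, 15, 19, 23, 26],
    "affective": [3, 8, 13, 16, 18, 24, 28],
    "lifestyle": [1, 4, 11, 14, 17, 21, 27],
    "antisocial": [20, 5, 6, 12, 22, 25, 29],
}


def n_items(facets):
    """Number of distinct items assigned to any facet."""
    return len({item for items in facets.values() for item in items})


def df_correlated(facets):
    """df = correlations - (one loading per item + factor correlations)."""
    p, k = n_items(facets), len(facets)
    return comb(p, 2) - (p + comb(k, 2))


def df_bifactor(facets):
    """df = correlations - (general loading + specific loading per item)."""
    p = n_items(facets)
    return comb(p, 2) - 2 * p


print(n_items(FACETS), df_correlated(FACETS), df_bifactor(FACETS))
```

Under these assumptions the counts come to 344 for the four-correlated model and 322 for the four-bifactor model, based on 28 scored items.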
Models were estimated with mean and variance adjusted weighted least squares estimation (WLSMV), most appropriate for use with ordinal items in Mplus (Flora & Curran, 2004). 2 Consistent with standard Mplus v. 7.2 procedures when using WLSMV estimation, in all analyses, the covariance matrix (calculated from the polychoric correlation matrix with variances on the diagonals) was used. Model fit was evaluated using both absolute and relative fit indices. The former included the χ2 statistic, and the root mean square error of approximation (RMSEA). The χ2 statistic assesses the difference between expected and observed covariance matrices, with a nonsignificant result indicating a smaller difference between the matrices, and thus better model fit (Hu & Bentler, 1999). RMSEA assesses how well a model with optimally chosen parameter estimates would fit the population data, with smaller values indicating better fit (Hu & Bentler, 1999). Relative fit indices included the comparative fit index (CFI) and the Tucker–Lewis index (TLI). CFI is a comparison of fit between the target model to that of an independent model in which the variables are assumed to be uncorrelated (Bentler, 1990). TLI is a comparison of fit between the χ2 of the target model and the χ2 of the independent model. For both indices, values range from 0 to 1, and values approaching 1 indicate better fit (Bentler, 1990). Because of the large sample size in the study and well-known limitations in the χ2 statistic when using large samples, we did not use a specific χ2 cutoff for model fit. However, similar to previous factor analyses of the SRP-SF (Neumann & Pardini, 2014), RMSEA values less than or equal to .08, CFI values greater than .90, and TLI values greater than or equal to .90 were used to indicate a good fit to the data. 
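The fit indices described above can be illustrated with their common textbook formulas. This is a hedged sketch, not the Mplus implementation: Mplus applies these ideas to a scaled χ2 under WLSMV and may differ in details such as using N rather than N − 1. The function names are hypothetical, and the baseline (independence) model values used to illustrate CFI and TLI are made up.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (textbook form, N - 1)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_t, df_t, chi2_b, df_b):
    """Comparative fit index: target model (t) vs. baseline model (b)."""
    num = max(chi2_t - df_t, 0.0)
    den = max(chi2_t - df_t, chi2_b - df_b, 0.0)
    return 1.0 if den == 0 else 1.0 - num / den


def tli(chi2_t, df_t, chi2_b, df_b):
    """Tucker-Lewis index from the chi-square/df ratios of both models."""
    ratio_b, ratio_t = chi2_b / df_b, chi2_t / df_t
    return (ratio_b - ratio_t) / (ratio_b - 1.0)


def meets_cutoffs(rmsea_value, cfi_value, tli_value):
    """Cutoffs as stated in the text: RMSEA <= .08, CFI > .90, TLI >= .90."""
    return rmsea_value <= 0.08 and cfi_value > 0.90 and tli_value >= 0.90


# Hypothetical target and baseline chi-square values, for illustration only.
print(cfi(100, 50, 1050, 50), tli(100, 50, 1050, 50))
```

Because RMSEA divides the excess of χ2 over its degrees of freedom by the sample size, a model can have a highly significant χ2 in a large sample and still show an acceptable RMSEA.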
When possible, we carried out corrected χ2 difference tests to directly compare model fit of models that were nested within each other using the DIFFTEST procedure in Mplus, which is most appropriate for use within WLSMV estimation (Muthén & Muthén, 2014). However, the three-correlated factor model was not nested within the other models tested, and thus could not be directly compared with other models using DIFFTEST. Additionally, the bifactor models (2 vs. 4) were not nested within each other, and thus could also not be compared with each other with DIFFTEST.
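Once a corrected difference statistic and its degrees of freedom are available, the p value is the upper tail of a χ2 distribution. The sketch below does not reproduce the DIFFTEST correction itself, which requires the model estimates. It only converts a reported statistic into a p value, using a standard-library recurrence for the upper incomplete gamma function. The function name is hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting point: Q(1, h) for even df, Q(1/2, h) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


# A difference statistic of 11.49 on 2 df, for illustration.
print(chi2_sf(11.49, 2) < 0.01)
```

Simply subtracting two WLSMV χ2 values and passing the result to this function would not be valid, because that raw difference is not χ2 distributed. Only the corrected statistic should be used.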
Aim 2: Analyze Measurement Invariance of the SRP-SF Across Gender
We also examined configural and scalar measurement invariance for the best-fitting models examined in Aim 1. We followed well-established steps for measurement invariance testing (Sass, 2011; Widaman & Reise, 1997). First, we tested model fit separately for men and women using the combined sample. Second, configural invariance was examined by testing the best fitting factor solutions in the combined dataset, grouped by gender. Configural invariance requires the same underlying factor structure to fit the data from both men and women, while factor loadings and thresholds are allowed to differ. Third, after establishing configural invariance, scalar invariance was examined by constraining the loadings and thresholds of items to be equal across gender (Muthén & Muthén, 2014). As our study used categorical data, consistent with Mplus guidelines, in our analysis of scalar invariance, thresholds and factor loadings were freed and constrained in tandem, as the item probability curve is influenced by both parameters (Muthén & Asparouhov, 2002; Muthén & Muthén, 2014). Typically in analyses using continuous data, configural invariance testing is followed by metric invariance testing, which involves only fixed loadings (and not thresholds). However, with categorical data, the loadings and thresholds must be fixed simultaneously, thus not allowing for this intermediate metric invariance testing within our data (Muthén & Asparouhov, 2002). The fit of the models was assessed using both absolute and relative fit indices, as described in Aim 1. The fit of the constrained model was then compared with that of the earlier model. We examined differences in the competing models in two ways. We first computed a χ2 difference test with nested models using the DIFFTEST procedure. 
However, due to the sensitivity of the χ2 difference test to sample size (Brannick, 1995), we also considered changes in CFI equal to or less than .01 and changes in RMSEA equal to or less than .015 as evidence of invariance (Chen, 2007; Cheung & Rensvold, 2002).
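The change-in-fit criteria can be expressed as a simple decision rule. The sketch below is one reading of that rule: it treats "change" as a worsening of fit in the constrained model (a drop in CFI, a rise in RMSEA), which is an assumption. The function name and the example values are hypothetical. Differences are rounded before comparison so that floating-point noise does not push a difference of exactly .01 over the cutoff.

```python
def supports_invariance(cfi_free, cfi_constrained,
                        rmsea_free, rmsea_constrained,
                        max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """True if the constrained model's fit worsens by no more than the cutoffs."""
    # Round to avoid artifacts such as 0.95 - 0.94 == 0.010000000000000009.
    cfi_drop = round(cfi_free - cfi_constrained, 6)
    rmsea_rise = round(rmsea_constrained - rmsea_free, 6)
    return cfi_drop <= max_cfi_drop and rmsea_rise <= max_rmsea_rise


# Hypothetical fit values for a configural and a scalar model.
print(supports_invariance(0.95, 0.94, 0.06, 0.07))
print(supports_invariance(0.95, 0.92, 0.06, 0.065))
```

Requiring both criteria is a conservative choice. A rule that accepts invariance when either criterion is met would be more lenient.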
Results
Aim 1: The Factor Structure of the SRP-SF
To examine the factor structure of the SRP-SF, we sequentially tested a series of six different model solutions across all four samples collapsed into one dataset, comparing their model fit each time. 3 First we tested a one-factor model, which fit the data poorly (see Table 2). Next, we tested a two-correlated factor model. While the two-correlated factor model fit the data better than the one factor model (Δχ2 = 422.92, df = 1, p < .001), it still showed poor fit. Similarly, the three-correlated factor model showed poor overall fit to the data; and as it was not nested within our other models, we continued our comparisons using the two-correlated factor model. The four-correlated factor model fit the data better than the two-correlated factor model (Δχ2 = 571.21, df = 5, p < .001), though the TLI and CFI values were slightly below our standards of good fit (see Figure 1 and Table 3).
Table 2. Fit Statistics of All Factor Solutions.
Note. All analyses performed using WLSMV in MPlus. All χ2 statistics were significant at p < .001. df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation.

Figure 1. Four-correlated factor model.
Table 3. Factor Loadings and Model Fit Statistics: Four-Correlated Factor Model.
Note. df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation. The full items could not be reproduced here, because they are copyrighted by Multi-Health Systems, Inc. Instead, we refer to item numbers and provide a paraphrased indication of the item content within parentheses.
p < .001.
We next moved to bifactor models. The two-bifactor model fit the data better than the two-correlated factor model (Δχ2 = 2125.31, df = 27, p < .001) and had good fit overall (see Figure 2 and Table 2). Finally, the four-bifactor model demonstrated good model fit (χ2 = 2950.88, df = 322, p < .001; CFI = .95, TLI = .94; RMSEA = .06). Because of estimation constraints and errors within the DIFFTEST procedure necessary when using the WLSMV estimator, 4 we were unable to statistically compare the four-bifactor model fit with the four-correlated factor solution. However, the four-bifactor model appeared to have a better fit based on the fit indices (CFI = .95 vs. .89; TLI = .94 vs. .88; RMSEA = .06 vs. .08). Additionally, when we ran a traditional chi-square difference test using the maximum likelihood (ML) estimator, 5 the four-bifactor model did have significantly better fit. We also modeled a four factor hierarchical model (Table 3). The TLI and CFI values were slightly below our standards of good fit, and the model had significantly worse fit than the four-correlated factor model (Δχ2 = 11.49, df = 2, p < .01), as well as the four-bifactor model (Δχ2 = 2157.21, df = 24, p < .001). Hence, further analyses were conducted using only the correlated factor and bifactor models. Within the four-bifactor model, there were moderate to high and significant loadings of all SRP items on the general “g” factor (Table 4; range, β = .52-.78). Generally, the items also showed moderate loadings on respective specific facets, with the exception of Items 9 (“get kick out of scamming”) and 21 (“getting in trouble for same things”). Interestingly, some items loaded negatively and in the opposite direction as predicted on specific facets. 
Specifically, four of seven items had negative loadings onto the affective facet, including “cold-heartedness” and “does not feel bad about hurting others,” highlighting that removing the variance shared with the general psychopathy disposition may change the meaning of the affective construct. Overall, consistent with our hypothesis, the four-correlated factor and four-bifactor models showed the best fit to the data in the whole sample, but the two could not be directly compared statistically.
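As a rough check on how the reported fit indices relate to one another, the RMSEA of the four-bifactor model can be approximated from its χ2 and degrees of freedom. The sketch below uses the conventional single-group formula, RMSEA = sqrt(max(χ2 − df, 0) / (df × (N − 1))). It assumes that the analytic N equals the total sample of 2,554 reported in the abstract and that this formula is close to what the software computed under WLSMV; both are assumptions, not details stated in the text.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA for a single-group model.

    Uses the conventional formula; chi2 values below df are truncated to zero.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Four-bifactor model in the whole sample (values from the text).
# n = 2554 is the total sample size from the abstract; treating it as the
# analytic N is an assumption.
four_bifactor_rmsea = rmsea(chi2=2950.88, df=322, n=2554)
print(round(four_bifactor_rmsea, 2))
```

Under these assumptions the result rounds to the reported value of .06.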

Four-bifactor model.
Factor Loadings and Model Fit Statistics: Four-Bifactor Model.
Note. g = general psychopathy factor; df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation. The full items could not be reproduced here, because they are copyrighted by Multi-Health Systems, Inc. Instead, we refer to item numbers and provide a paraphrased indication of the item content within parentheses.
p < .10. *p < .05. **p < .01. ***p < .001.
Aim 2: Comparisons of the Factor Structure of the SRP-SF Between Genders
Model Fit Within Each Gender
Based on previous findings that psychopathic traits may have different configurations in men versus women, we examined measurement invariance of the best fitting models (i.e., the four-correlated factor model and four-bifactor model). First, in line with traditional sequences of invariance testing (Widaman & Reise, 1997), we examined model fit individually for each gender. When evaluating the four-correlated factor model, all indices of fit were slightly below our standards of good fit in men (χ2 = 2821.74; df = 344; p < .001; TLI = .87; CFI = .88; RMSEA = .09; see Table 5), but the model had good fit in women (χ2 = 2153.15; df = 344; p < .001; TLI = .90; CFI = .91; RMSEA = .07). The four-bifactor model had good fit for men (χ2 = 1661.83; df = 322; p < .001; TLI = .93; CFI = .94; RMSEA = .07) and women (χ2 = 1133.22; df = 297; p < .001; TLI = .95; CFI = .96; RMSEA = .05). Similar to the results from Aim 1, the four-bifactor model appeared to have slightly better fit than the four-correlated factor model when comparing across gender (e.g., Average CFI & RMSEA of four-correlated factor model: .90 and .08, respectively; Average CFI & RMSEA of four-bifactor model: .95 and .06, respectively), but could not be compared directly statistically.
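The averaged indices quoted above follow directly from the per-gender values. A minimal sketch of that arithmetic is given below; the dictionary layout and variable names are illustrative only, and the values are the CFI and RMSEA figures reported for men and women, in that order.

```python
from statistics import mean

# Per-gender fit indices from the text, ordered (men, women).
fit_by_gender = {
    "four-correlated": {"CFI": (0.88, 0.91), "RMSEA": (0.09, 0.07)},
    "four-bifactor": {"CFI": (0.94, 0.96), "RMSEA": (0.07, 0.05)},
}

# Unweighted mean of each index across the two genders.
averages = {
    model: {index: mean(values) for index, values in indices.items()}
    for model, indices in fit_by_gender.items()
}

for model, indices in averages.items():
    print(model, {index: f"{value:.3f}" for index, value in indices.items()})
```

The unweighted CFI mean for the four-correlated factor model is .895, which corresponds to the reported .90 after rounding; the remaining averages match the reported values directly.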
Fit Indices of Measurement Invariance Testing by Gender for Four-Correlated and Four-Bifactor Models.
Note. All analyses were performed using WLSMV in Mplus. All χ2 statistics were significant at p < .001. df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation. The steps of measurement invariance are presented using gender: fit indices within each gender (Step 1 of measurement invariance testing), followed by the fit statistics when using configural (Step 2) and scalar (Step 3) solutions in the entire sample of men and women, and finally χ2 difference tests to determine whether the fit of the scalar model differs significantly from that of the configural model (Step 4).
SRP 8 was removed from the model to resolve model errors.
Configural Invariance Across Gender
Next, we examined configural invariance. Identical models (either four-correlated factor or four-bifactor models) were computed for each gender simultaneously, while the factor loadings and thresholds were allowed to differ (see Table 5). In the four-correlated factor model, the TLI and CFI values were slightly below our standards of good fit (χ2 = 5005.17, df = 688, p < .001; TLI = .88; CFI = .89; RMSEA = .08). For the four-bifactor model, the fit was good (χ2 = 2742.33, df = 644, p < .001; TLI = .94; CFI = .95; RMSEA = .06). However, whereas factor loadings were similar in magnitude and direction across gender for the four-correlated factor model, there were clear differences for the four-bifactor model across men versus women (see Table 6). Importantly, an item (#8: enjoy watching fights) had to be excluded from the model in women due to errors in model specification, suggesting that this item may not fit into the model for women. Among women, the majority of interpersonal items had weak loadings (#9, #15), negative loadings (#9, #10), and/or nonsignificant loadings (#7, #10) on the specific interpersonal factor, even though all items had significant positive loadings for men. Furthermore, several affective items had differential loadings on the specific affective factor both in magnitude (#3, #8, #13, #16, #18, #24) and direction (#3) across gender. In contrast, item loadings on the general psychopathy factor were similar for both men and women. Thus, whereas the factors in the four-correlated factor model appeared to identify the same four specific factors for men and women, the four-bifactor model identified gender differences in the structure of the specific interpersonal and affective facets when controlling for general psychopathy as assessed by the SRP-SF, with interpersonal and affective item loadings on the specific factors appearing weaker in women.
Factor Loadings of Four-Correlated Model in Men Versus Women.
p < .001.
Scalar Invariance Across Gender
Finally, we examined scalar invariance across gender, in which both factor loadings and thresholds were constrained to be equal. The fit was good for both the four-correlated factor and four-bifactor models (see Table 7). χ2 difference tests revealed that this more constrained scalar model had worse fit compared with the configurally invariant model (where factor loadings and thresholds were freed) for both the four-correlated factor (Δχ2 = 709.08, df = 103, p < .001) and four-bifactor solutions (Δχ2 = 355.04, df = 124, p < .001). However, using alternate indices of invariance (i.e., changes in CFI equal to or less than .01 and changes in RMSEA equal to or less than .015), the scalar models for both the four-correlated factor and four-bifactor model had better fit than the configural models. The good fit of the models testing for configural invariance confirmed the presence of four specific factors of psychopathy and a general factor for the bifactor model in men and women. However, given that there was some evidence that scalar invariant models had worse fit than the configural invariant models, it appeared that some individual items on each of the specific factors varied in level of endorsement (e.g., thresholds or mean scores) across gender. 6 To determine which item scores differed between men and women, we performed independent samples t tests in SPSS. We ran analyses with and without the high-risk sample to avoid confounding gender with level of endorsement (i.e., the high-risk sample was all men). In the analyses using only the college samples, men generally had significantly higher mean scores than women across all items (Table 8). Thus, overall scalar invariance may be undermined by differences in overall item endorsement, with men demonstrating significantly higher endorsement of items than women.
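The significance levels attached to the χ2 difference tests can be reproduced by referring each reported Δχ2 to a chi-square distribution with the reported degrees of freedom. The sketch below is illustrative only: it uses a closed-form upper-tail probability that is exact only for even degrees of freedom (so it covers the four-bifactor comparison, df = 124, but not the four-correlated comparison, df = 103), and it takes the reported Δχ2 values at face value rather than recomputing the DIFFTEST adjustment used with WLSMV. The function name is hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variate.

    Exact closed form, valid only for positive even df:
        P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Work in log space so large statistics do not overflow or underflow early.
    log_terms = [
        i * math.log(half) - math.lgamma(i + 1) - half for i in range(df // 2)
    ]
    largest = max(log_terms)
    return math.exp(largest) * sum(math.exp(t - largest) for t in log_terms)


# Scalar vs. configural comparison for the four-bifactor model (from the text).
p_bifactor = chi2_sf_even_df(355.04, 124)
print(f"p = {p_bifactor:.3g}")
```

The resulting probability falls far below .001, in line with the reported significance level. For odd degrees of freedom, a general chi-square routine from a statistics library would be needed instead.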
Factor Loadings of Four-Bifactor Model in Men Versus Women.
Note. g = general psychopathy factor.
SRP 8 removed from model in women to resolve model errors.
p < .10. *p < .05. **p < .01. ***p < .001.
Mean Scores of SRP-SF Items by Gender.
Note. Women n = 1,153; Men n = 726. SD = standard deviation. The high-risk sample was not included in these analyses. However, analyses were repeated including all samples, and results were all in similar directions, with slight increases in significance levels. The full items could not be reproduced here, because they are copyrighted by Multi-Health Systems, Inc. Instead, we refer to item numbers and provide a paraphrased indication of the item content within parentheses.
p < .10. *p < .05. **p < .01. ***p < .001.
Discussion
This study provided a comprehensive examination of the factor structure of the SRP-SF and tested the equivalence of the best-fitting models across a large sample of men and women. We tested factor models previously examined in other studies and two novel bifactor models using SRP-SF data from three undergraduate samples and one high-risk community sample. Consistent with our hypotheses, the four-correlated factor structure fit the data significantly better than model solutions with fewer factors and better than hierarchical models. In addition, the four-bifactor model structure also showed good fit. Findings from invariance testing suggest a potential lack of scalar invariance that could be due to different endorsement rates of the SRP-SF items between men and women. However, in line with our hypotheses, overall the results suggested very similar patterns of factor loadings across gender for both solutions, particularly for the four-correlated factor solution (i.e., configural invariance). The results highlighted strengths and weaknesses of each model, with the four-bifactor model demonstrating superior overall fit, but proving more problematic than the four-correlated factor model when comparing across gender.
Viable Factor Structures of the SRP-SF
As found in previous studies using other measures of psychopathy (e.g., SRP-III) the four-correlated factor and four-bifactor model provided the best fit to the combined SRP-SF data when collapsing across gender (Mahmut et al., 2011; Neal & Sellbom, 2012; Neumann et al., 2012; Neumann et al., 2014; Seibert et al., 2011; Visser et al., 2012; Williams et al., 2003). The presence of four factors emphasizes significantly greater differentiation within the psychopathy construct relative to broader definitions that focus only on “personality” versus “disinhibited behavior” within two-factor models (Hare & Neumann, 2010). Our findings suggest that psychopathy may be better conceptualized as a construct with four separable underlying factors, particularly within community samples. In addition, our findings add to the burgeoning evidence highlighting the utility of bifactor models for conceptualizing psychopathology, particularly psychopathy (e.g., Debowska et al., 2014; Patrick et al., 2007; Waller et al., 2015). In particular, the bifactor model demonstrated that SRP items can be modeled as different and specific dimensions (affective, interpersonal, lifestyle, and antisocial specific factors), while simultaneously representing a distinct and broad construct of “general” psychopathy. Therefore, while some variance across all items is common to the “general” psychopathy factor, the remaining variance in items relates uniquely to specific and separable aspects of the psychopathy construct. Thus, the results support the idea that psychopathy has both a unidimensional and multidimensional nature, at least as measured by the SRP-SF within community samples.
The coherence of items and factors in the development and validation of psychopathy measures has previously been a focus in the field (Lilienfeld et al., 2006; Neumann, Uzieblo, Crombez, & Hare, 2013).
A shift toward the use of bifactor models could represent a novel approach to improve assessment and construct validation of psychopathy, emphasizing that both general SRP-SF and specific facet scores may have utility in the assessment of psychopathic traits within community samples. One issue when considering bifactor versus correlated factors models is that bifactor models will tend to produce better fit because so many model parameters are specified; such models are more saturated compared with correlated factor models. For instance, using 18 items of the PCL-R, a four-correlated factor model uses 42 free parameters to account for 171 variances/covariances, whereas a bifactor model requires 72 free parameters for the same covariance matrix. Thus, the good fit of the bifactor model found in this sample should be interpreted with caution, particularly since it could not be directly compared with that of the correlated factor model. Nevertheless, the results for both the four-correlated factor and four-bifactor models indicate that both the total SRP-SF score and individual facet scores denote viable representations of the psychopathic personality and personality factors in community samples.
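The parameter counts in the PCL-R example can be made concrete with a short calculation. The sketch below counts the unique variances and covariances among 18 items and shows one parameterization that yields the 42 free parameters cited for the four-correlated factor model (18 loadings, 18 residual variances, and 6 factor covariances, with factor variances fixed to 1). That decomposition is an assumption made for illustration. The bifactor count of 72 is taken directly from the text and is not rederived here, since it depends on details of the specification that are not given.

```python
def unique_moments(n_items: int) -> int:
    """Number of unique variances and covariances among n_items variables."""
    return n_items * (n_items + 1) // 2


n_items = 18
n_factors = 4

moments = unique_moments(n_items)

# Assumed parameterization for the four-correlated factor model:
# one loading and one residual variance per item, plus the covariances
# among the four factors (factor variances fixed to 1).
factor_covariances = n_factors * (n_factors - 1) // 2
correlated_params = n_items + n_items + factor_covariances

# Bifactor parameter count as cited in the text (not rederived).
bifactor_params = 72

df_correlated = moments - correlated_params
df_bifactor = moments - bifactor_params

print(moments, correlated_params, df_correlated, df_bifactor)
```

With these counts, the correlated factor model leaves 129 degrees of freedom and the bifactor model 99, which illustrates why the bifactor model is the more saturated of the two.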
Partial Measurement Invariance of the SRP-SF
Across gender, the four-bifactor and four-correlated factor models of the SRP-SF demonstrated configural invariance and partial evidence for scalar invariance. In particular, the same pattern of factors emerged for the four-correlated and four-bifactor models using the SRP-SF for men and women (i.e., configural invariance). However, across men and women there were significant differences in the endorsement of items (i.e., a lack of full scalar invariance). This study is the first to examine the invariance of the SRP-SF across gender and to differentiate between configural and scalar invariance. Consistent with previous studies that reported configural invariance across gender using the PCL-R (Bolt et al., 2004), PCL-YV (Kosson et al., 2013), and earlier versions of the SRP (Neumann et al., 2012), our findings support the generalizability of the four-correlated factor and four-bifactor structures across men and women.
As we only found limited evidence for scalar invariance, the generalizability of individual SRP-SF item intercepts within the different factor solutions was not fully supported. Overall men scored significantly higher than women across the majority of SRP-SF items. Moreover, in the overall four-bifactor structure for men, even when separating out the variance accounted for by the general psychopathy factor, affective items related to violence and pleasure derived from violence (Items #8 and #21) loaded positively onto the same factor as empathic traits (represented by the negative item loadings of callous Items #13, #16, #28). Thus, in men these items may reflect a more normative cluster of traits indexing emotional reactivity (i.e., high tolerance of aversive emotional stimuli, such as violence) that can be assessed separately from general psychopathy in the context of the bifactor model. In contrast, for women, high endorsement of violence items appeared to more narrowly indicate the presence of psychopathy, as these items only loaded on the general psychopathy factor, while the loadings on the specific affective factor were compromised. Furthermore, Item #8 had to be excluded from the four-bifactor structure in women, emphasizing that the affective factor may not translate fully across gender. These differences in specific factors need to be considered when generalizing results from studies that use bifactor solutions of the SRP-SF in mixed gender samples; in such cases, the general factor may be more reliable across gender. In contrast, specific factor loadings on the four-correlated factor model were similar for men and women, demonstrating partial scalar invariance of the SRP-SF.
Furthermore, the lack of scalar invariance evidenced by the χ2 difference tests could have also been due to our large sample size, possibly resulting in a higher degree of statistical power and increased likelihood of finding significant differences of model fit across groups. Of note, alternate indices of invariance did demonstrate scalar invariance across gender for both models. However, the use of these alternate indices of model fit has limitations in the current study, as they were developed for use with ML, not WLSMV, estimation (Sass, 2011).
Future Usage of the SRP-SF
Though our results support the replicated factor structure and reliability of the SRP-SF for use in college and community samples, the findings also present a number of issues for future research in psychopathy broadly and for studies explicitly employing the SRP-SF. Specifically, whereas the bifactor model showed good fit to the SRP-SF data, such models have seldom been used, and therefore are often poorly understood. In particular, there may be confusion of how to interpret “general” versus orthogonal “specific” aspects of psychopathy, and therefore how to incorporate such a model into real-world application (Reise et al., 2010). For example, it is not clear how clinicians could leverage a general versus specific model to provide information on a specific individual. However, one of the strengths of the bifactor model is the ability to identify the extent to which a construct is unidimensional versus multidimensional; that is, determining whether a construct can accurately be assessed by a sum score of all items (e.g., general factor) or whether it is necessary to create subscales (e.g., specific factors). Modeling bifactor structures for different measures could result in the creation of a brief measure of general psychopathy only using those items that most strongly contribute to the cohesive, general construct, rather than those with strong loadings only on the individual facets. 7
Finally, it is interesting to consider the very goal of having single measures of psychopathy across genders when, by definition, the construct and presentation of psychopathy might reasonably be expected to differ meaningfully among men versus women (Cleckley, 1941). For example, the gender differences we noted for the affective facet of our factor models have also been identified in studies of behavior (e.g., Montagne, Kessels, Frigerio, de Haan, & Perrett, 2005) and neural levels during emotion processing tasks (e.g., Stevens & Hamann, 2012). A recent meta-analysis of gender differences in brain activation to emotional stimuli found that women exhibited greater activation of the left amygdala and medial prefrontal cortex to negative emotion compared with men (Stevens & Hamann, 2012). Abnormalities in the activation of such brain regions during emotion processing have been consistently demonstrated in men with psychopathic traits (Blair, 2008). Furthermore, other research using the SRP-SF has shown associations with amygdala structure (Pardini, Raine, Erickson, & Loeber, 2014) and differential relationships with amygdala activation according to gender (Carré et al., 2013) in non-offender samples. However, researchers are just beginning to investigate links with neural correlates in female populations, and findings have been mixed (Rogstad & Rogers, 2008). The current study supports the notion that affective traits as assessed by the SRP-SF items may contribute differentially to the general psychopathy construct in men and women, which could reflect differential underlying emotional and physiological mechanisms. Thus, measures such as the SRP-SF that accurately reflect gender differences in the construct may be useful in identifying external correlates unique to male versus female psychopathy, differences that might be obscured by self-report measures that are not sensitive to differences in item endorsement rate or “meaning” of items in men versus women. 
Indeed, the SRP-SF has already demonstrated a host of associations with other correlates of psychopathy in a variety of populations (Hare, Neumann, & Mokros, in press; Neumann et al., 2014). Accordingly, future research is needed to examine with greater precision the meaning, interpretation, and endorsement of different items of the SRP-SF with respect to their links with various external correlates and the nature of their performance within specific sample types and gender.
Study Limitations
A significant strength of our study was our large sample size that included diverse sample types. Unfortunately, the incorporation of different samples also meant that we were unable to examine external correlates or test the nomological network of the SRP-SF as we lacked criterion variables that were common to all four included samples. Additionally, because of the differences in sample composition and data collection methods, we did not focus on sample differences in CFA structure, which may have influenced the results. Moreover, our high-risk sample was entirely male, which limited the range of endorsement of antisocial items by women. However, we believe that the use of a large combined sample and the diversity across these samples increases the validity of our factor structure analyses. Our findings thus set a strong foundation for future research to investigate relationships between factor solutions of the SRP-SF and other constructs commonly linked to psychopathy, such as substance abuse and criminality.
Additionally, while the present study was able to examine the factor structure of the SRP-SF in community and undergraduate samples, future studies should further investigate and compare factor structures of the SRP-SF among criminal and clinical samples or in samples that include the entire range of responses from normative to highly deviant individuals. Finally, our study encountered some methodological limitations in our analysis of measurement invariance. First, we were unable to fully address limitations in previous studies regarding direct statistical comparisons, as we were unable to directly compare the three-factor solution to the other models, or compare the two bifactor models to each other because they were not nested. Furthermore, we encountered computational errors when attempting to run direct statistical comparisons between the four-correlated and four-bifactor model. While DIFFTEST is the standard approach for model testing with ordinal data, at present it is unclear if its inability to compare model fit in these instances represents a methodological anomaly or substantive problem. Additionally, we had to exclude an item from the model in women, highlighting the instability issues that can arise while using bifactor models. Finally, as our study used categorical data, thresholds and factor loadings were freed and constrained in tandem. Hence, we were unable to determine with certainty which of these (i.e., thresholds vs. factor loadings) contributed to the lack of scalar invariance.
Conclusion
Our study supports the continued use of the SRP-SF to assess psychopathy among community samples. However, neither factor solution for the SRP-SF demonstrated strict scalar invariance, indicating that items may not be analogously representative of psychopathic traits with respect to endorsement rates across gender, which should be considered when attempting to compare findings using samples containing men and women. Even though there is continued debate about the use of self-report in the study of psychopathy, these findings demonstrate the usefulness of a bifactor model across multiple measures and sample types (e.g., Patrick et al., 2007; Waller et al., 2015). While the four-bifactor model demonstrated superior fit overall, the four-correlated factor model was less problematic when comparing across genders, highlighting that each solution may have costs and benefits. Overall, the present study emphasizes the importance of careful consideration of the tools and methods used to measure complex constructs such as psychopathic traits, especially in terms of their applicability to men versus women.
Footnotes
Acknowledgements
The authors would like to thank Martha A. Alves for comments on previous drafts of this article, and the staff and study families of the Pitt Mother and Child Project for making this research possible.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received the following financial support for the research, authorship, and/or publication of this article: Preparation of this article was supported by grants to the third author from The William H. Donner Foundation. The research reported in this article was also supported by grants to the fourth and fifth authors from the National Institute of Mental Health (MH 50907 and MH 01666), and L40-DA036468 for the corresponding author. The Duke Neurogenetics Study is supported by Duke University and NIDA Grants R01DA031579 and R01DA026222.
