Abstract
This study examined various psychometric properties of the items comprising the shame and guilt scales of the Test of Self-Conscious Affect–Adolescent. A total of 563 adolescents (321 females and 242 males) completed these scales, and also measures of depression and empathy. Confirmatory factor analysis provided support for an oblique two-factor model, with the originally proposed shame and guilt items comprising shame and guilt factors, respectively. Also, shame correlated with depression positively and had no relation with empathy. Guilt correlated with depression negatively and with empathy positively. Thus, there was support for the convergent and discriminant validity of the shame and guilt factors. Multiple-group confirmatory factor analysis comparing females and males, based on the chi-square difference test, supported full metric invariance, the intercept invariance of 26 of the 30 shame and guilt items, and higher latent mean scores among females for both shame and guilt. Comparisons based on the difference in root mean squared error of approximation values supported full measurement invariance and no gender difference for latent mean scores. The psychometric and practical implications of the findings are discussed.
The Test of Self-Conscious Affect–Adolescent (TOSCA-A; Tangney, Wagner, Gavlas, & Gramzow, 1991) is a theoretically driven self-report measure that has scales for measuring guilt proneness and shame proneness in adolescents. The current study examined the psychometric properties of these scales in the TOSCA-A. More specifically, it examined support for an oblique two-factor model (separate factors for shame and guilt, with item loadings on these factors as originally proposed), convergent and discriminant validity of the shame and guilt latent factors, and construct invariance across male and female adolescents.
According to Lewis (1971), shame involves a negative evaluation of the self, where the focus is on unworthiness of the self. In contrast, guilt involves a negative evaluation of a specific behavior or action, where the focus is on the wrongness of a particular controllable action. Extending this model, Tangney (1991, 1993) has proposed that shame and guilt are associated with different cognitions, motivations, evaluations, feelings, and behaviors (Niedenthal, Tangney, & Gavanski, 1994; Tangney, 1991, 1993; Tangney, Wagner, Fletcher, & Gramzow, 1992; for reviews, see Tangney & Dearing, 2002b; Tangney, Stuewig, & Mashek, 2007). More specifically, shame is speculated to involve a negative evaluation of the self, and is associated with maladaptive, avoidance, and concealing responses, whereas guilt is speculated to involve a negative evaluation of the transgressing behavior and is associated with adaptive and approach responses (reparation and apology) aimed at repairing the consequences of the transgressing behavior (Niedenthal et al., 1994; Tangney, 1993). Research suggests that while shame involves internal, stable, uncontrollable, and global attributions about the self, guilt involves internal, unstable, controllable, and specific attributions about the self (Tracy & Robins, 2006). Although there is now considerable support for Tangney’s theory (Baumeister, Stillwell, & Heatherton, 1994; Niedenthal et al., 1994; Tangney, 1991, 1993; for reviews, see Tangney & Dearing, 2002b; Tangney et al., 2007), alternative models of shame and guilt exist. For example, there are theories of shame and guilt defined in terms of the types of situations that invoke these responses, often referred to as public–private distinctions (Wolf, Cohen, Panter, & Insko, 2010), where shame is viewed as resulting from the public exposure of transgressions and guilt from the private commission of moral transgressions (Ausubel, 1955; Smith, Webster, Parrott, & Eyre, 2002).
In terms of developmental changes in shame and guilt, data indicate that both guilt and shame increase during adolescence (Tangney et al., 1991; see also Tangney & Dearing, 2002b). There are data showing that compared with children, adolescents show greater tendencies to attribute the cause of guilt to controllable behaviors rather than to accidents (Graham, Doubleday, & Guarino, 1984). Roos, Hodges, and Salmivalli (2014) have speculated that guilt and shame could begin to differentially influence an individual’s social behaviors during early adolescence since the differential attributions motivating behaviors begin to stabilize during this period. Both shame and guilt are considered moral emotions and are related to the self and interpersonal relations. As adolescents undergo significant changes in the development of morality (Kohlberg, 1984; Mitchell, 1975), the self (Damon & Hart, 1988; Harter, 2012), and peer relationships (B. Brown & Larson, 2009), it could be argued that the study of shame and guilt during adolescence would improve our understanding of developmental changes in morality, the self, and peer relationships during this period. Also, as shame is positively correlated with psychopathology, and guilt is often unrelated to psychopathology (Tangney & Dearing, 2002b), the study of shame and guilt in adolescents could have implications for understanding the development of emotional and psychological problems during adolescence. However, such studies first require measures of shame and guilt for adolescent use that have well-established psychometric properties.
The TOSCA-A was developed for use with adolescents to measure shame and guilt, consistent with the theoretical perspectives proposed by Tangney (1991, 1993). This is also the case for versions developed for adults (TOSCA; Tangney, Wagner, & Gramzow, 1989) and children (TOSCA-C; Tangney, Wagner, Burggraf, Gramzow, & Fletcher, 1990). All versions of the TOSCA, including the latest version for adults (TOSCA-3; Tangney, Dearing, Wagner, & Gramzow, 2000), consist of scenarios that measure shame proneness and guilt proneness, and also externalization, detachment/unconcern, alpha pride (pride in the entire self), and beta pride (pride from evaluation of a specific behavior).
At present, there is limited information on the psychometric properties of the guilt and shame scales of TOSCA-A. The limited data indicate a significant correlation between the shame and guilt factors, and support for the internal consistency reliability and the convergent and discriminant validity of the shame and guilt scales (Tangney, 1991; Tangney, Wagner, Hill-Barlow, Marschall, & Gramzow, 1996). For example, Tangney et al. have reported internal consistency reliability values of .77 and .81 for shame and guilt, respectively. In terms of concurrent and discriminant validity, they have reported that shame is associated negatively with empathy and positively with anger and aggression, whereas guilt is associated positively with empathy and adaptive anger management strategies.
An important psychometric property for a measure is support for its factor structure. Analyzing only the shame and guilt items of the TOSCA (adult version), Luyten, Fontaine, and Corveleyn (2002) found that principal components analysis supported separate factors for shame and guilt. Using confirmatory factor analysis (CFA) of the TOSCA shame, guilt, externalization, and detachment item parcels (total scores for two or more items used as indicators), Fontaine, Luyten, De Boeck, and Corveleyn (2001) found support for the expected four-factor model. This model was also supported in a study of TOSCA-C (Strömsten, Henningsson, Holm, & Sundbom, 2009). To date, no study has been published examining the factor structure of the TOSCA-A. The findings for TOSCA and TOSCA-C raise the possibility of support for a similar factor structure of TOSCA-A.
A robust finding with all versions of the TOSCA is that for both shame and guilt, females score higher than males (Silfver, Helkama, Lönnqvist, & Verkasalo, 2008; for a review, see Tangney & Dearing, 2002a). However, if this finding is to be accepted unconditionally, measurement invariance for males and females for the TOSCA has to be demonstrated in the first instance. Measurement invariance refers to groups reporting the same observed scores when they have the same level of the underlying trait (Reise, Widaman, & Pugh, 1993). Invariance means that for the groups being compared, the measure in question has the same measurement and scaling properties, and thus the observed scores for the groups can be directly compared. The issue of measurement invariance across males and females for the TOSCA-A is particularly important. This is because as society expects higher standards of interpersonal social behaviors from female than male adolescents (Bybee, 1998), it can be speculated that for the same response to an event, females are likely to interpret their responses in more shameful and guilty ways. Such responses raise the possibility that when they are asked to complete the shame and guilt scales of TOSCA-A, their responses to the same scenario could be differentially influenced by their gender status. To date, no study has examined measurement invariance across males and females for the TOSCA-A.
Although we have some support for the concurrent and discriminant validity for the shame and guilt scales of TOSCA-A, existing data are limited. It would be prudent to examine further the concurrent and discriminant validity of the shame and guilt scales of TOSCA-A so as to ensure the robustness of their validity. In this respect, it would be useful to examine this in terms of how shame and guilt are related to depression and empathy. This is because, as noted previously, while both guilt and shame are associated with internal attributions about the self, the attributions for shame have been speculated to be stable, uncontrollable, and global, whereas the attributions for guilt have been speculated to be unstable, controllable, and specific. Given that internal, stable, uncontrollable, and global attributions about the self have been associated with higher depression (Gotlib & Abramson, 1999), a relationship between shame and depression, but not between guilt and depression, can be expected (Tangney, Burggraf, & Wagner, 1995). Additionally, as guilt, but not shame, is associated with taking responsibility for the negative consequences of one’s behavior followed by reparation and attempts to restore interpersonal bonds, a relationship between guilt and empathy, but not shame and empathy, can be expected. Consistent with these expectations, Tilghman-Osborne, Cole, Felton, and Ciesla (2008) found a positive association for the TOSCA-A shame scale with depression, and a small relationship for TOSCA-A guilt with depression. These associations have also been shown in adult and child versions of the TOSCA. For instance, there are data showing that when guilt is controlled for in the analysis, shame has a positive relationship with depression, and that when shame is controlled for in the analysis, there is no relationship, or only a very weak one, between guilt and depression (Fontaine et al., 2001; Orth, Berking, & Burkhardt, 2006; Tangney, Wagner, & Gramzow, 1992).
Data also show that guilt, but not shame, predicts higher levels of empathy (Joireman, 2004; Leith & Baumeister, 1998; Roos et al., 2014; Tangney, 1995).
Given current gaps and limitations in what is known about the psychometric properties of TOSCA-A, this study examined further the psychometric properties of this measure for the shame and guilt items. More specifically, this study examined the fit for an oblique two-factor model, the concurrent and discriminant validity of the guilt and shame latent factors, measurement invariance across males and females for the two-factor model, and differences in latent mean scores across males and females, controlling for noninvariance in the TOSCA-A items. Based on past findings involving various versions of the TOSCA, we expected support for the two-factor model. We also expected support for the concurrent and discriminant validity of the guilt and shame latent factors. In this respect, we expected that empathy would be associated positively with guilt, and that depression would be associated positively with shame. We also expected that there would be some level of noninvariance across the ratings of males and females, and that females would have higher scores for both guilt and shame.
Method
Participants
The sample comprised 562 individuals, 320 females and 242 males, with age ranging from 12.01 years to 16.15 years. The mean age of this group of participants together was 13.41 years (SD = 0.92). The mean age of females (M = 13.51, SD = 0.92) and males (M = 13.27, SD = 0.91) differed significantly, t(560) = 3.02, p < .01, with females being only slightly older. Participants were from 14 primary and 9 secondary schools. Schools were selected from within areas chosen to reflect both geographic (based on local government areas) and socioeconomic well-being (based on the Socio-Economic Indexes for Areas 2001; Australian Bureau of Statistics, 2001) diversity within Melbourne, a large Australian city.
The socioeconomic status (SES) of parents was assessed using the Australian National University, Fourth edition (ANU4) socioeconomic index (Jones & McMillan, 2001). The ANU4 index ranges from 0 (low SES) to 100 (high SES), and has a normative mean of 45.1 (SD = 22.5). The ANU4 scores for the present sample were comparable to these normative scores, with means of 44.46 (SD = 23.98) and 40.82 (SD = 22.90) for Parents 1 and 2, respectively. Parental birthplace was diverse. While 44.5% of mothers and 38.6% of fathers were born in Australia, the remainder came from 70 different countries. When collapsed into major geographic regions, the most common areas of parental birthplace were Southeast Asia (17.4% of mothers, 17.4% of fathers), Southern and Eastern Europe (10.7% of mothers, 12.6% of fathers), and Southern and Central Asia (5.3% of mothers, 5.7% of fathers).
The study was conducted following approvals from the Monash University Human Ethics Committee, the Department of Education and Training (Victorian State Government), and school principals. Signed informed consent from parents and students was required for participation. Ethics approval stipulated that forms be distributed via the schools. Interested schools were given the requested number of explanatory statements and informed consent forms for distribution. Teachers or other school officials were responsible for distributing these forms, and students were responsible for taking forms home to their parents. As we have no information on how many of the requested forms were distributed by the schools, or on how many of the distributed forms were delivered to parents, it is not possible to determine the response rate accurately. Of the parent and adolescent consent forms that were returned, 80% of parents and their adolescents consented to participate.
Measures
The Test of Self-Conscious Affect–Adolescent (TOSCA-A; Tangney et al., 1991)
The TOSCA-A, developed for use with adolescents between 12 and 20 years, has 15 scenarios (10 negative and 5 positive) that would be likely events experienced by adolescents. Each scenario is followed by response items that assess guilt proneness, shame proneness, detachment, and externalization. Positive items also include responses that measure pride (alpha-pride and beta-pride). In the current study, only the response items assessing guilt and shame were used. These emotions were the focus of the study as they were considered to be most relevant for understanding the moral, self, and interpersonal behaviors of adolescents (as presented in the Introduction). An example of a scenario is “At lunchtime, you trip and spill your friend’s drink.” The shame response is “I would be thinking that everyone is watching me and laughing” and the guilt response is “I would feel very sorry. I should have watched where I was going.” For each scenario, adolescents rated the shame and guilt response items on a 5-point scale (1 = very unlike me, 2 = a little unlike me, 3 = maybe [half and half], 4 = a little like me, and 5 = very like me) to indicate their likelihood of responding in the manner depicted. For increased clarity, our labels differed slightly from the original labels (not at all likely, unlikely, maybe [half and half], likely, and very likely). Similarly, to ensure appropriateness for Australian adolescents, minor wording changes were made (e.g., “cafeteria” was replaced with “lunchtime” and “grade” was replaced with “mark”). In the current study, all 15 scenarios were used. The Cronbach’s alpha values for shame and guilt were .78 and .82, respectively.
Children’s Depression Inventory (CDI; Kovacs, 1992)
The CDI is a self-rating scale for depression, appropriate for children and adolescents (7-17 years). There are 27 items, and each item consists of three statements serving to reflect differences in symptom severity. Respondents are required to select the statement that describes them best for the past 2 weeks. A higher total score reflects higher depression. The CDI has demonstrated good test–retest reliability, internal consistency, and construct validity (Kovacs, 2003). To satisfy the university’s ethics requirements, the item assessing suicide ideation was not included. A meta-analysis of the CDI found that adjusting means from studies which excluded the suicide ideation item to the 27-item mean did not produce any changes in results (Twenge & Nolen-Hoeksema, 2002). In the current study, Cronbach’s alpha for the remaining 26 items was .89.
The Index of Empathy for Children and Adolescents (IECA; Bryant, 1982)
The IECA is a 22-item measure of cognitive and affective components of empathy. For the current study, participants were required to endorse the response that best applies to them on a 4-point scale ranging from 1 = strongly disagree to 4 = strongly agree. A higher total score reflects higher levels of empathy. The IECA has adequate internal consistency with Cronbach’s alpha coefficients of around .81, and has demonstrated good convergent and discriminant validity (Bryant, 1982). In the current study, Cronbach’s alpha was .77.
Procedure
Measures were administered by two PhD student researchers during school hours in quiet classrooms and in small groups of up to 30 students as part of a larger study involving additional measures. Participants were informed that participation was voluntary and that they were free to withdraw at any time. It was emphasized that there were no right or wrong answers, but that it was the answers most true for the respondent that we were interested in. One researcher read aloud all instructions and items as the students proceeded through the questionnaires, while a second researcher was on hand to assist participants where required. The order of questionnaire administration was counterbalanced between groups, and administration took between 30 and 45 minutes, depending largely on the age of the group.
Statistical Procedures
The oblique two-factor model for TOSCA-A was examined with CFA. The factors were shame and guilt factors, and the items loading on these factors were the originally nominated shame and guilt items. The concurrent and discriminant validity of shame and guilt was examined in terms of how they were correlated with depression and empathy. For this, the two-factor TOSCA-A model was extended to include the variables for depression and empathy, and these variables were correlated with the factors for shame and guilt.
Measurement invariance across males and females was tested using the multiple-group CFA invariance procedure proposed by others (e.g., T. Brown, 2006). This study tested for configural invariance (same overall factor structure), factor loadings or metric invariance (same strengths of the associations of items with the latent factors), and intercepts invariance (equivalency in item intercept values). Although it is also possible to test for error variances (equality for uniqueness), this was not tested here as most methodologists consider this test as overly stringent and unnecessary (T. Brown, 2006; Cheung, 2008). In the multiple-group CFA approach, the configural invariance model (M1) is the first to be examined. Configural invariance is supported if this model shows good fit. When the configural invariance model is supported, the invariance model for factor loadings or metric invariance (M2) can be tested. This metric model is examined by constraining the factor loadings of like items equal across the two groups. Metric invariance is inferred if M2 does not differ from M1. Following this test, the invariance for the intercepts (M3) can be examined. This model is tested by constraining the intercept values (in addition to factor loadings) of like items to be equal across the two groups. Support for invariance intercepts is inferred if M3 does not differ from M2.
It is to be noted that the sequence of analyses described above tests whether a given level of invariance is fully satisfied. When full metric invariance is not satisfied, partial invariance can be explored. The researcher can determine the source of the noninvariance by progressively freeing the constrained loadings in M2 across the two groups (using the modification index or MI), until a final partial metric invariance model is obtained. This final partial metric invariance model will have only those like items with equal loadings constrained equal across the two groups. The test for partial intercepts invariance is conducted by equating the intercepts of only those like items with equal factor loadings. If invariance is not supported, the source of the noninvariance can be explored, as explained for testing partial metric invariance.
All CFA and structural equation modeling analyses in the study were conducted using Mplus (Version 7.2) software (Muthén & Muthén, 2012). Since the ratings of adolescents had a hierarchical structure (as there were distinct groups of adolescents from different schools), we modeled this using the TYPE = COMPLEX option in Mplus. This study used maximum likelihood with robust estimation (MLRχ2) to ascertain statistical fit. As χ2 values are inflated by large sample sizes, the fit of the models was also examined using root mean squared error of approximation (RMSEA), the comparative fit index (CFI), and the standardized root mean square residual (SRMR). The guidelines suggested by Hu and Bentler (1998) are that RMSEA values close to .06 or below, CFI values close to .95 or above, and SRMR values close to .08 or below be used to infer good model-data fit. To determine differences between models at the statistical level, the difference in MLRχ2 values (computed using the scaling correction formula for MLR; Satorra & Bentler, 2001) was used. An alpha value of .01 was used to allow for more stringent Type I error control in the models compared. The differences between models at the practical level were also examined using the differences in the RMSEA values. Although this can also be done by comparing the CFI values of the models, this was not done in this study for reasons presented below. According to Chen (2007), an increase of .015 or more in the RMSEA value can be taken as indication of lack of invariance.
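The scaled difference test referred to above can be sketched as follows. This is a minimal illustration of the commonly documented Satorra-Bentler scaled chi-square difference computation, not the analysis code used in the study. The function name and the numeric inputs are hypothetical, and the scaling correction factors are assumed to be read from the software output for each model.

```python
def scaled_chi_square_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled difference for two nested MLR models.

    t0, c0, d0: chi-square, scaling correction factor, and df of the
                nested (more constrained) model.
    t1, c1, d1: the same quantities for the comparison (less constrained) model.
    Returns (scaled difference statistic, difference in df).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    # Scaled difference statistic.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values for illustration only (not taken from the study).
statistic, df_difference = scaled_chi_square_difference(
    t0=1000.0, c0=1.20, d0=110,
    t1=950.0, c1=1.25, d1=100,
)
print(statistic, df_difference)
```

The resulting statistic would be referred to a χ2 distribution with the returned df difference and evaluated against the chosen alpha level (.01 in this study).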
Results
Missing Values
Full information maximum likelihood estimation, available in Mplus, was used to deal with missing values. This procedure, which assumes that data are missing at random, is a widely accepted approach for handling missing data (Schafer & Graham, 2002).
Fit for the Two-Factor Model
The goodness-of-fit values for the two-factor model for all participants together were MLRχ2 (degrees of freedom [df] = 404) = 1289.48, p < .001; RMSEA = .062, 90% confidence interval (CI) [.058, .066]; CFI = .723, SRMR = .078. The RMSEA and SRMR values indicated good fit, whereas the CFI value indicated poor fit. The CFI is an incremental measure of fit that compares the theoretical model with the null model, or a model with zero correlation between all variables. Thus, when the theoretical model has low correlations among the variables, the discrepancy between the theoretical and the null model will be relatively low, thereby leading to a lower CFI value. According to David Kenny (2014), when the RMSEA values for null models are less than .158, the CFI values of theoretical models are not informative. In the current study, the average correlation among the TOSCA items was low at .24, and the RMSEA for the null model was less than .16 at .11 (90% CI [.105, .110]). Given this, the CFI can be taken as offering limited value for examining model fit in the current data set. Consequently, the fit for this (and all other models in this study) was based on the RMSEA and SRMR values. As aforementioned, the RMSEA and SRMR values for the two-factor model indicated good fit. In this model, the correlation between the factors for guilt and shame was .40 (p < .001).
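The 404 degrees of freedom reported for the two-factor model can be reproduced by counting observed moments and free parameters. The sketch below is only an illustration; the function name is hypothetical, and it assumes a simple-structure model (each item loads on exactly one factor, all factors correlated, no correlated residuals) with factor variances fixed for identification. Marker-variable identification yields the same count.

```python
def cfa_degrees_of_freedom(n_items, n_factors):
    """Degrees of freedom for a simple-structure CFA of a covariance matrix.

    Assumes one loading and one residual variance per item, freely
    correlated factors, and factor variances fixed to 1 (an assumption
    made for this illustration).
    """
    # Unique variances and covariances among the observed items.
    unique_moments = n_items * (n_items + 1) // 2
    loadings = n_items
    residual_variances = n_items
    factor_covariances = n_factors * (n_factors - 1) // 2
    free_parameters = loadings + residual_variances + factor_covariances
    return unique_moments - free_parameters


# 15 shame items + 15 guilt items, two correlated factors.
print(cfa_degrees_of_freedom(n_items=30, n_factors=2))
```

Adding a mean structure does not change this count, because the 30 item means are matched by 30 freely estimated intercepts.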
Table 1 shows the means, standard deviations, and standardized parameter estimates for the items of the two-factor model. All factor loadings were significant (p < .001). Based on Thurstone’s (1947) classic criterion for salience as a standardized loading greater than .3, the loadings for only two guilt items (5 and 9), and one shame item (4) were not salient. The factor loadings for the shame items ranged from .16 to .60, and the factor loadings for the guilt items ranged from .23 to .69. On average, the loadings for the guilt items (M = 0.51, SD = 0.12) were higher than those for the shame items (M = 0.42, SD = 0.13). The amount of variance explained by the guilt and shame factors was 22.10% and 16.07%, respectively.
Table 1. Mean (SD) and Standardized Parameter Estimates of the Items of the Two-Factor Model.
External Validity of Shame and Guilt Factors in the TOSCA-A
The correlations of shame and guilt with depression were .29 (p < .001) and −.11 (p < .05), respectively; and the correlations of shame and guilt with empathy were .07 (ns) and .40 (p < .001), respectively.
Multiple-Group CFA for Invariance Across Sex
According to T. Brown (2006), for reliable estimates from multiple-group CFA models, it would be desirable for the hypothesized model applied to the groups to have adequate fit. Given this, prior to multiple-group analyses, the goodness-of-fit values for the two-factor CFA models for male and female adolescents were examined separately. The values for males were MLRχ2(df = 404) = 820.85, p < .001; RMSEA = .065, 90% CI [.058, .071]; CFI = .716, and SRMR = .087. The values for females were MLRχ2 (df = 404) = 976.13, p < .001; RMSEA = .066, 90% CI [.061, .072]; CFI = .662; and SRMR = .086. Thus, for both groups, the RMSEA values were close to .06 and the SRMR values were close to .08, thereby indicating support for the two-factor CFA models for both male and female adolescents. The correlations between the factors for guilt and shame in males and females were .42 (p < .001) and .30 (p < .001), respectively.
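As a rough check on the fit values above, the RMSEA can be approximated from the χ2 value, its degrees of freedom, and the sample size. This is a sketch using the textbook single-group formula and is not the computation performed by the software used in the study; the function name is hypothetical. Software may use N rather than N − 1 and may apply adjustments for robust estimators, so small differences in the third decimal are possible.

```python
import math


def rmsea(chi_square, df, n):
    """Textbook single-group RMSEA point estimate.

    chi_square: model chi-square value
    df:         model degrees of freedom
    n:          sample size
    """
    # Excess misfit per degree of freedom, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))


# Values reported above for the male subsample (n = 242).
print(round(rmsea(820.85, 404, 242), 3))
```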
Table 2 shows the results of the analyses for invariance testing across gender, based on the χ2 difference test and the difference in RMSEA values. As shown, the RMSEA and SRMR indicated good fit for the configural model (M1), and thus support for configural invariance.
Table 2. Results of Tests for Invariance Across Gender.
Note. df = degrees of freedom; MLRχ2 = maximum likelihood with robust estimation; RMSEA = root mean square error of approximation; CFI = comparative fit index; CI = confidence interval; SRMR = standardized root mean square residual. All MLRχ2 values were significant (p < .001).
**p < .01. ***p < .001.
As shown in Table 2, for the analyses involving the χ2 difference test, there was no difference between the configural model (M1) and the metric invariance model (M2), thereby supporting the full metric invariance model. The full intercepts invariance model (M3) differed from the metric invariance model (M2). Additional analyses indicated noninvariance for the intercepts of Guilt Items 3 and 11; and Shame Items 5 and 3. Table 2 shows that after taking into account the lack of invariance in the intercepts of these four items, there was no support for equivalency for the mean scores for guilt and shame (M4), as this model differed from the partial intercepts invariance model (M3.4). Additional analyses indicated differences for both guilt (M4.1) and shame (M4.2). For both shame and guilt, the latent mean scores were higher for females.
Table 3 shows the estimates of the noninvariant intercepts in the final partial intercepts invariance model (M3.4) derived from the χ2 difference test analyses. As shown, the findings indicated that males had a higher intercept for Guilt Item 3, whereas females had a higher intercept for Guilt Item 11. Also, females had a higher intercept for Shame Item 5, and males had a higher intercept for Shame Item 11. However, as shown in Table 3, all differences were small in an absolute sense (Item 3, guilt: 4.33 for females and 4.76 for males; Item 11, guilt: 3.37 for females and 3.02 for males; Item 5, shame: 2.41 for females). Also, as shown in Table 3, the latent mean scores for guilt and shame for males were .29 and .19 units lower, respectively, than those for females. The variances for the latent factors for guilt and shame for males were .37 and .25, respectively. Thus, based on the approach proposed by Hancock (2001), the effect sizes for the difference between males and females for guilt and shame were .48 (.29 ÷ √.37) and .38 (.19 ÷ √.25), respectively.
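The effect size arithmetic above can be reproduced with a few lines of code. This is a minimal sketch of the calculation as described in the text (latent mean difference divided by the square root of the latent factor variance); the function name is hypothetical.

```python
import math


def latent_mean_effect_size(mean_difference, factor_variance):
    """Standardized latent mean difference: |difference| / sqrt(variance)."""
    if factor_variance <= 0:
        raise ValueError("factor variance must be positive")
    return abs(mean_difference) / math.sqrt(factor_variance)


# Values reported in the text: latent mean differences and the
# latent factor variances for males.
guilt_effect = latent_mean_effect_size(0.29, 0.37)
shame_effect = latent_mean_effect_size(0.19, 0.25)
print(round(guilt_effect, 2), round(shame_effect, 2))
```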
Table 3. Unstandardized Estimates in the Final Invariance Model for Males and Females Derived From the χ2 Difference Test.
As shown in Table 2, for the analyses involving the difference in RMSEA values, there was no difference (i.e., all differences <.015) between the configural model (M1P) and the metric invariance model (M2P); and the metric invariance model (M2P) and the intercepts invariance model (M3P), thereby supporting full measurement invariance (metric and intercepts invariance) for all TOSCA-A items. There was also no difference between the intercepts invariance model (M3P) and the equivalency for the mean scores model (M4P). Thus, unlike the analysis involving the χ2 difference test, the analyses involving the difference in RMSEA values indicated no gender difference for guilt and shame.
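The decision rule applied to the RMSEA differences can be written out explicitly. The sketch below encodes the Chen (2007) criterion as stated in the Statistical Procedures section (an increase of .015 or more indicates lack of invariance). The function name and the RMSEA values in the example are hypothetical and are not the values from Table 2.

```python
def rmsea_change_supports_invariance(rmsea_less_constrained,
                                     rmsea_more_constrained,
                                     threshold=0.015):
    """Return True when the RMSEA increase stays below the threshold.

    rmsea_less_constrained: RMSEA of the baseline model (e.g., configural)
    rmsea_more_constrained: RMSEA of the nested model (e.g., metric)
    threshold:              increase taken to indicate noninvariance
    """
    increase = rmsea_more_constrained - rmsea_less_constrained
    return increase < threshold


# Hypothetical RMSEA values for illustration only.
print(rmsea_change_supports_invariance(0.060, 0.062))  # small increase
print(rmsea_change_supports_invariance(0.060, 0.080))  # large increase
```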
Discussion
One aim of the current study was to use CFA to examine support for the oblique two-factor model for the shame and guilt items of TOSCA-A. As expected, the findings indicated good fit for this model. All the loadings on the shame and guilt factors were significant, and only two guilt items (5 and 9), and one shame item (4) lacked salience (<.30). It is worth noting that this is the first study to demonstrate this support for the TOSCA-A. Somewhat consistent with our findings, a previous study by Luyten et al. (2002) that used principal components analysis of shame and guilt items of the adult version of the TOSCA showed separate factors for most of the shame and guilt items. Also of relevance, using CFA with parcels as indicators for shame, guilt, externalization, and detachment latent factors, Fontaine et al. (2001) reported good support for the theorized four-factor model. This model, with the individual items as indicators, was also supported in a study that examined the TOSCA-C (Strömsten et al., 2009).
Another aim of the current study was to examine the convergent and discriminant validity of the guilt and shame factors of the TOSCA-A. As expected, the findings indicated that depression was correlated positively with shame and negatively with guilt. In contrast, empathy was correlated positively with guilt and had no relation with shame. Thus, there was support for the convergent and discriminant validity of the guilt and shame factors. The findings for empathy are consistent with existing data for the TOSCA-A (Tangney, 1991; Tangney et al., 1996), and the findings for depression are consistent with findings from studies with the adult version of the TOSCA (e.g., Tangney et al., 1992). As there are also existing data showing positive associations of the TOSCA-A shame scale with anger and aggression, and of the TOSCA-A guilt scale with adaptive anger management strategies (Tangney, 1991; Tangney et al., 1996), it can be argued that shame and guilt (at least as measured by the TOSCA-A) reflect risk and protective factors, respectively, for psychological symptoms and problem behaviors (Muris & Meesters, 2014).
The study also examined measurement invariance across male and female adolescents using differences in MLRχ2 and RMSEA values. For the analyses involving differences in MLRχ2 values, the findings showed support for the configural model and the full metric invariance model. There was no support for full intercepts invariance: two guilt items (apologizing because you throw a ball that hits a friend, and feeling you should be in trouble for talking in class) and two shame items (avoiding a friend because you broke something at his or her house and then hid it, and feeling stupid because you throw a ball that hits a friend) showed noninvariance. However, sex differences for the intercepts of all four items were negligible. Thus, the findings can be interpreted as providing sufficient support for invariance of the intercepts. For the analyses involving the difference in RMSEA values, the findings indicated support for full measurement invariance (metric invariance and intercepts invariance for all items). Overall, therefore, our findings indicated good support for measurement invariance for the TOSCA-A. This means that the TOSCA-A shame and guilt items function similarly across gender. It is to be noted that the current study is the first to report on measurement invariance for the TOSCA-A across males and females. Of relevance, a previous study that simultaneously tested measurement invariance across two time points and gender for the TOSCA-C, with parcels for shame and guilt as indicators, reported support for sex invariance (Roos et al., 2014).
For the analyses involving differences in MLRχ2 values, the findings in the current study showed that for both shame and guilt, the latent means were higher for females. The latent mean scores for guilt and shame for males were .29 and .19 units lower, respectively, than those for females. The effect sizes, comparable to Cohen’s (1992) d, for the difference between males and females for guilt and shame were .48 and .38, respectively. Since factors are error free, these effect size values can be assumed to be larger than effect size values from measured scores (Thompson & Green, 2006). Cohen’s recommended magnitudes for d effect sizes for measured scores are <.20 = negligible; ≥.20 and <.50 = small; ≥.50 and <.80 = medium; ≥.80 = large. Thus, although females had higher scores than males for both guilt and shame, the magnitude of the differences can be considered small. In further support of this argument, we found no sex difference for the shame and guilt factor mean scores when evaluated using the differences in RMSEA values. Although existing data indicate gender differences for shame and guilt (Tangney & Dearing, 2002a), we argue that, because our findings are based on latent scores that are free of error variance, they provide a better test of gender differences for shame and guilt.
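To make the interpretation of these effect sizes concrete, the bands listed above can be applied mechanically. The sketch below is illustrative only; the function name `cohen_magnitude` is hypothetical, the cutoffs are the ones quoted in the text for measured scores, and the two inputs are the effect sizes reported for guilt and shame.

```python
def cohen_magnitude(d: float) -> str:
    """Map an effect size to the bands quoted in the text:
    <.20 negligible; >=.20 and <.50 small; >=.50 and <.80 medium; >=.80 large.
    """
    size = abs(d)
    if size < 0.20:
        return "negligible"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Effect sizes reported in the text for the male-female latent mean differences.
reported = {"guilt": 0.48, "shame": 0.38}
labels = {factor: cohen_magnitude(d) for factor, d in reported.items()}

for factor, label in labels.items():
    print(f"{factor}: d = {reported[factor]:.2f} ({label})")
```

Under these cutoffs both values fall in the small band, with the guilt value sitting just below the boundary for a medium effect.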
Another finding in the current study worthy of some discussion is that for the two-factor model, the amounts of total variance explained by the guilt and shame factors were 22.1% and 16.1%, respectively. Thus, 77.9% and 83.9% of the total variance, respectively, was error variance. In a CFA model, error variance comprises variance from both random measurement error and uniqueness. According to Tangney et al. (1996), each item of a given scale in the different versions of the TOSCA shares common variance due to the psychological construct (guilt or shame) being measured, and has its own unique variance associated with its specific scenario. It is conceivable that the unique variance associated with specific scenarios is substantial. As this variance is part of the uniqueness of the items, it will be modeled as part of the items’ error variance in a CFA. This would explain the relatively low variance explained by the guilt and shame factors (or conversely, the high error variance) found in the current study.
The findings of this study have implications for the use of the shame and guilt scales of the TOSCA-A. The findings indicate good support for the two-factor model in terms of factor structure. Also, the convergent and discriminant validity analyses indicate that the shame and guilt items and scales of the TOSCA-A can provide valid information for research use. The support for measurement invariance across males and females indicates that observed scores derived from male and female adolescents for the shame and guilt scales of the TOSCA-A can be directly compared. Thus, mean and standard deviation scores can be developed and used with confidence for assessing shame and guilt among adolescents. Since there was no gender difference for the shame and guilt factor mean scores when evaluated using the differences in RMSEA values, it could be argued that, from a practical viewpoint, the same mean scores could be used for both gender groups. However, as there were statistically significant differences in mean scores for both shame and guilt when the χ2 difference test was used, gender-specific mean scores would be necessary if high precision is needed.
In concluding, the results of this study need to be viewed with several limitations in mind. First, we had no information on those who were invited but did not participate in the study. Thus, it is uncertain how this may have affected the results. Second, as this study involved adolescents from the general community, it is uncertain how the findings would apply to clinic-referred adolescents or adolescents with special needs. Third, like the shame and guilt scales of the TOSCA-A, the CDI and the IECA that were used to examine their convergent and discriminant validity were also self-report measures. Thus, it is possible that the findings in these analyses were confounded by shared common method variance. Fourth, the findings reported here are based on a single study. As a consequence, there is a need for cross-validation of the findings before they can be generalized. Fifth, although the TOSCA-A has been designed for adolescents up to 20 years, the sample in this study comprised younger adolescents. Sixth, Luyten et al. (2002) have argued that the TOSCA is biased, in that its guilt scale measures mild and adaptive forms of guilt and its shame scale measures maladaptive aspects associated with shame. If so, the findings reported for the relationships of shame and guilt with depression and empathy may have little substantive meaning. Seventh, although we have argued that the TOSCA-A can be useful for research on shame and guilt in adolescents, this may be more so for adolescents without psychological disorders than for those with psychological disorders. This is because there is some evidence of poor discriminant validity between shame and guilt in clinical groups when using the TOSCA (Rusch et al., 2007). It would be useful, therefore, to conduct more studies in this area, taking into consideration the limitations highlighted here.
Notwithstanding these limitations, the collective findings in the current study, as well as past studies, support the use of the TOSCA-A, especially in adolescents from the general community.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by an Australian Research Council Discovery Project Grant [ARC DP0343902].
