Abstract
This study examined the psychometric properties of the Ego Resiliency Scale-Revised (ER89-R). Though support exists for a multidimensional conceptualisation using classical test theory approaches (i.e., a higher-order model comprising Openness to Life Experiences and Optimal Regulation factors), this measure has not been subjected to Rasch analysis. Accordingly, this paper evaluated the higher-order model via confirmatory factor analysis (CFA) before assessing Openness to Life Experiences and Optimal Regulation components using Rasch analysis. CFA, using a general population sample (N = 2009), supported the higher-order factor structure. Openness to Life Experiences and Optimal Regulation scales met Rasch model assumptions; specifically, they showed good item/person fit and item/person reliability, and evidence of unidimensionality. Moreover, most items displayed gender invariance. Overall, findings supported the higher-order conceptualisation of the ER89-R, and indicated that the Openness to Life Experiences and Optimal Regulation scales are relatively useful measures of ego resiliency components in a general population sample.
Introduction
The Ego Resiliency Scale (ER89) (Block & Kremen, 1996) is a 14-item self-report measure of psychological resilience that examines the capacity to flexibly adjust responses in tandem with changing situational demands, especially during emotionally challenging conditions (Block, 2002). Accordingly, the ER89 evaluates adaptability, defined as the capacity to modify ego-control as a function of contextual demands to maintain or enhance equilibration (Maltby, Day, & Hall, 2015). Congruent with this classification, individuals high in ego resiliency demonstrate higher levels of adjustment and personal attainment across life stages (Block & Block, 1980; Fredrickson, Tugade, Waugh, & Larkin, 2003). In this context, scores on the ER89 reflect motivational control and resourceful adaptation. In terms of stability, consistent with the conceptualisation of resilience as a personality trait, Block and Kremen (1996) regard ego resiliency as relatively unchanging (Block & Block, 2006).
Block and Kremen (1996) further delimit resilience as an affect processing system comprising ego-control (EC) and ego-resilience (ER) (Farkas & Orosz, 2015). This distinction derives from psychoanalytic theory, which designates ego-control as inhibition or expression of impulses, and ego-resilience as the capacity to modify impulses according to the situation. EC denotes a meta-dimension of impulse inhibition/expression, and ER designates the dynamic capacity to modify level of control in response to situational demands (Letzring, Block, & Funder, 2005). Based on this perspective, resilient individuals avoid maladaptive coping strategies by adapting their level of ego-control dependent on context.
The ER89 emerged from a study examining conceptual connections and separateness between ER and intelligence (IQ) (see Block & Kremen, 1996). Items from the two constructs were interspersed within a single paper-and-pencil measure. Based on analysis, Block and Kremen (1996) advised that the ER89 measures one factor. Subsequent studies confirmed this unidimensional structure, and established that the scale was reliable and valid (Caprara, Steca, & De Leo, 2003; Letzring, Block, & Funder, 2005; Menesini & Fonzi, 2005; Tugade & Fredrickson, 2004). These findings indicated that the ER89 adequately measured EC and ER, and produced scores that were conceptually and coherently related to personality. Noting the potential importance of both factors, Tugade and Fredrickson (2004) encouraged researchers to consider the interaction between EC and ER since this provided a nuanced and deeper understanding of resilience.
Subsequently, the ER89 has become a well-established and widely used measure of psychological resilience. Consequently, researchers have translated the scale into different languages (e.g., Italian, Caprara, Steca, & De Leo, 2003; Chinese, Chen, He, & Fan, 2020; and Japanese, Ushio & Onodera, 2013) and revised the measure (ER89-R, see Alessandri, Vecchione, Caprara, & Letzring, 2012; Alessandri, Vecchio, Steca, Caprara, & Caprara, 2008).
The Ego Resiliency Scale-Revised (ER89-R)
The ER89-R modification resulted from a series of studies that subjected the ER89 to confirmatory factor analysis and reported a two-factor solution. This provided best fit and was stable across a range of samples (see Alessandri et al., 2008; Menesini & Fonzi, 2005; Vecchione, Alessandri, Barbaranelli, & Gerbino, 2010). Explicitly, Alessandri et al. (2008) found that four items possessed psychometrically inadequate properties. Removal of these resulted in the 10-item ER89-R (Alessandri et al., 2008). Structurally, the ER89-R depicts a higher-order model, where ego resiliency, a second-order factor, affects two first-order components (Openness to Life Experiences and Optimal Regulation), which affect responses to scale items. These factors are conceptually consistent with delineations of ego resiliency (Alessandri et al., 2012). Vecchione et al. (2010) found that this structure was stable and invariant from late adolescence to young adulthood. In a subsequent study, Alessandri et al. (2012) using multigroup confirmatory factor analysis demonstrated invariance (i.e., partial configural, metric and scalar) of the ER89-R across samples from Italy, Spain and the United States.
Despite support for the two-factor ER89-R solution, Farkas and Orosz (2015) propose an alternative three-component model using an 11-item version of the ER89. This comprises a hierarchical model with three factors: Active Engagement with the World (AEW), Repertoire of Problem-Solving Strategies (RPSS) and Integrated Performance under Stress (IPS). These components are distinct and demonstrate distinct relationship patterns with other constructs (e.g., subjective well-being, and state and trait anxiety). Based on this outcome, Farkas and Orosz (2015) advocated that resiliency was a double-faced construct, embracing two functions. The first maintains personality, whereas the other is adaptive and adjusts the personality system to the demands of the dynamically changing environment. From this perspective, stability (permeability) is represented by RPSS and IPS, and flexibility (elasticity or plasticity) by RPSS and AEW.
This approach to scale validation reflects classical test theory (CTT), which emphasises reliability (e.g., internal and test-retest) and validity (e.g., the application of factor analysis to establish internal conceptual coherence) (Mills, Young, Nicholas, Pallant, & Tennant, 2009). Moreover, CTT assumes that in the absence of measurement error, tests produce true scores. Error arises and is inevitable because measurement instruments are imperfect. Hence, observed scores differ as a function of both construct difference and measurement error. CTT advocates that this error is random and that the distribution of error is the same for all individuals (Magno, 2009). In contrast, item response theory (IRT or modern test theory) focuses on item scores. This accent is based on the supposition that items possess different levels of difficulty and that individual variations reflect differences in the latent trait or ability observed (Magno, 2009).
The Present Study
Noting that the development and validation of the ER89-R used CTT, the present study examined the instrument by conducting Rasch scaling in conjunction with CTT (i.e., confirmatory factor analysis). This was necessary since researchers have not previously conducted this form of analysis. In addition, a critical limitation of using CTT in previous research to validate the ER89-R is that observed scores of respondents are dependent on the measure (Magno, 2009). Explicitly, if ER89-R items are too challenging to endorse, then respondents’ mean scores can be reduced, and mean scores can be inflated if items are too easy. Without examining whether this bias exists, average scores of the ER89-R can be unreliable when utilised in studies. The ER89 and ER89-R are widely used in research and are examined in relation to significant constructs including mental health and psychological well-being (e.g., Edlina, Arif, Nilesh, & Sonia, 2020; Kubo, Sugawara, & Masuyama, 2021; Milioni et al., 2015; Sadziak, Wiliński, & Wieczorek, 2017; Spurr, Walker, Squires, & Redl, 2021; Tsirigotis & Łuczak, 2018). Therefore, assessment of relationships between the ER89-R and such constructs can be erroneous, which potentially impacts the conclusions formed concerning the role of ego resiliency. Accordingly, validation of the ER89-R using Rasch techniques that overcome this test-dependent limitation is important. Furthermore, Rasch scaling was particularly appropriate because ER89-R items were constructed over two decades ago, and their source of origin is untraceable (Farkas & Orosz, 2015).
Moreover, Alessandri et al. (2008) claim that the most appropriate conceptualisation of the ER89-R includes the factors of Openness to Life Experiences and Optimal Regulation. In practice, this means that the measure contains two distinct (albeit related) unidimensional scales. Combining Rasch with traditional CFA is a recommended approach for examining the validity of a measure (factorial structure, etc.; Lin & Pakpour, 2017) due to the explicit focus on unidimensionality among other benefits, including the ability to estimate person ability and item difficulty separately. Accordingly, CFA will involve testing one-factor against multidimensional (i.e., the higher-order) models. Failure to confirm a unifactorial solution would support Alessandri et al. (2008) and the presence of two distinct factors/scales. Rasch analysis would provide additional robust tests of unidimensionality.
The Rasch model outlines expected item responses if metric level measurement is achieved (Rasch, 1960). This applies for both dichotomous (Rasch, 1960) and polytomous (Andrich, 1978) responses (Pallant & Tennant, 2007). Rasch models accomplish this by assessing the degree to which observed response patterns match expected values using a probabilistic form of Guttman scaling (Guttman, 1950) and fit statistics (Smith, 2000). Within the model, the probability of a respondent endorsing an item is defined as a logistic function of the relative distance between the item and respondent locations on a linear scale. Thus, the Rasch model views responses as the product of the interaction between test taker scores, which reflect latent trait or ability level, and item difficulty. The higher the ability relative to item difficulty, the higher the probability of a correct item response. Consistent with this, when latent trait location is equal to item difficulty, there is a 50% probability of a correct response (Weller et al., 2013). Within this framework, Rasch analyses characterise a curve, which identifies the ability level at which the item maximally discriminates.
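As an illustration of the logistic relationship described above, the following minimal sketch computes the endorsement probability under the dichotomous Rasch model. The function name and the example person and item locations are illustrative assumptions and are not taken from the present analysis.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of endorsing a dichotomous item under the Rasch model.

    theta      -- person location (latent trait level) in logits
    difficulty -- item location (difficulty) in logits
    """
    # Logistic function of the distance between person and item locations.
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical locations (in logits), chosen only for illustration.
for theta in (-1.0, 0.0, 1.0):
    p = rasch_probability(theta, difficulty=0.0)
    print(f"theta = {theta:+.1f}, P(endorse) = {p:.3f}")
```

When the person location equals the item difficulty, the distance is zero and the function returns 0.5, in line with the 50% probability noted above.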
Noting the potential of the Rasch measurement model to convert ordinal observations to interval scaled measurement (Wright & Linacre, 1989), another advantage of the approach is the ability to test for differential item functioning (DIF) (item bias) (Tennant et al., 2004). DIF is important because it indicates that group membership (age, gender, etc.) rather than the underlying latent trait or ability level is influencing responses. Hence, DIF occurs when different groups at the same level demonstrate a different probability of item response (Chen & Revicki, 2014). This is problematic since results are no longer representative of the trait or ability, and therefore cannot be compared objectively (Kopf, Zeileis, & Strobl, 2015). Emerging concern with DIF stems from increased interest in IRT as an alternative to classical test theory. Acknowledging these issues, a Rasch scaling of the ER89-R was undertaken.
Method
Respondents
The sample comprised 2009 respondents (mean age, M = 39.81 years, SD = 14.71, range = 18–89). There were 549 males (27%; M = 44.52 years, SD = 16.07, range = 18–88) and 1460 females (73%; M = 38.04 years, SD = 13.75, range = 18–89). Respondent recruitment was via Qualtrics, an online multi-channel management platform for data collection.
The researchers requested a sample of UK-based, non-clinical respondents aged 18 years and over. These were the only selection criteria. Recruitment focused on the general public to validate the ER89-R for general use within research and sampled a spread of ages. As the emphasis was on obtaining general public responses, data collection was not restricted to any particular demography. Qualtrics obtains data from recruitment panels, which are derived from a pre-arranged pool of individuals who have consented to respond to surveys in research studies. Data accessed from respondent recruitment panels are generally more diverse and far reaching than traditional student samples. These advantages do not come at the expense of quality: panel data are commensurate with traditional samples in terms of demographics and responses to established surveys (Kees, Berry, Burton, & Sheehan, 2017).
Measures
The Ego Resiliency Scale-Revised (ER89-R; Alessandri et al., 2008) assesses the capacity to modify responses flexibly in response to changing situational demands, especially under emotionally challenging conditions. Items assess resilience both directly (e.g., ‘I quickly get over and recover from being startled’) and indirectly (e.g., ‘I like to take different paths to familiar places’). The ER89-R presents items as statements, and respondents rate agreement via a 4-point Likert-type scale ranging from 1 (does not apply at all) to 4 (applies very strongly). The ER89-R has demonstrated good reliability and validity (Alessandri et al., 2008), alongside high correlations with the original measure by Block and Kremen (1996).
Procedure and Ethics
Respondents retrieved study materials using a web-link. Preceding item presentation, respondents received general information about the research project. This explained the investigation and presented details about ethics. To advance, respondents provided informed consent. Respondents then supplied demographic information (i.e., age and preferred gender) before progressing to the items. Procedures asked respondents to thoroughly read and answer all questions, work at their own pace and respond openly and honestly. Furthermore, instructions reduced the potential for evaluation apprehension and social desirability effects by telling respondents that there were no right or wrong responses. Respondents worked through the items at their own pace until they reached the end of the survey, at which point they received the debrief. A typical survey took approximately 10 min to complete. Ethical approval was granted by the Manchester Metropolitan University Faculty of Health, Psychology and Social Care Ethics Committee.
Analysis
The authors employed confirmatory factor analysis (CFA) and Rasch analysis to assess the validity of the ER89-R. CFA (using AMOS 27) tested construct validity via four models: three one-factor models and a higher-order model. The one-factor models comprised a total scale model as a test of the original structure (Block & Kremen, 1996), followed by models with respective latent factors of Openness to Life Experiences and Optimal Regulation. The higher-order model was based on Alessandri et al. (2008) and included the two factors of Openness to Life Experiences and Optimal Regulation, alongside a higher-order construct of Ego Resiliency.
Indices of chi-square, Comparative Fit Index (CFI), Tucker–Lewis Index (TLI), standardised root-mean-square residual (SRMR) and root-mean-square error of approximation (RMSEA) evaluated model fit. Good fit thresholds are CFI ≥ 0.90, TLI ≥ 0.90, SRMR ≤ 0.08 and RMSEA ≤ 0.08 (Browne & Cudeck, 1993). Marginal fit thresholds are CFI ≥ 0.88, TLI ≥ 0.88, SRMR ≤ 0.10 and RMSEA ≤ 0.10 (Bong, Woo, & Shin, 2013). Additionally, for model comparison, analysis considered Akaike’s information criterion (AIC), with lower values indicative of superior fit. For interpretation, factor loadings ≥ .30 are satisfactory and representative of the factors (Gliner, Morgan, & Leech, 2016).
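The thresholds above can be expressed as a simple decision rule. The sketch below is illustrative only: the function name and the index values passed to it are hypothetical, it is not the procedure implemented in AMOS, and it assumes that all four indices must meet a threshold set for that label to apply.

```python
def classify_fit(cfi: float, tli: float, srmr: float, rmsea: float) -> str:
    """Label model fit using the good and marginal thresholds stated above."""
    # Good fit: CFI and TLI at or above 0.90, SRMR and RMSEA at or below 0.08.
    if cfi >= 0.90 and tli >= 0.90 and srmr <= 0.08 and rmsea <= 0.08:
        return "good"
    # Marginal fit: CFI and TLI at or above 0.88, SRMR and RMSEA at or below 0.10.
    if cfi >= 0.88 and tli >= 0.88 and srmr <= 0.10 and rmsea <= 0.10:
        return "marginal"
    return "unsatisfactory"


# Hypothetical index values, not results from this study.
print(classify_fit(cfi=0.93, tli=0.91, srmr=0.05, rmsea=0.07))
print(classify_fit(cfi=0.89, tli=0.88, srmr=0.09, rmsea=0.09))
print(classify_fit(cfi=0.80, tli=0.78, srmr=0.12, rmsea=0.13))
```

In practice the indices are usually weighed together with chi-square and AIC rather than applied mechanically, so a rule of this kind is best treated as a first screen.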
Rasch analysis provided follow-up measurement information at the person and item levels. Rasch models can be used ‘as confirmatory tests of the extent to which scales have been successfully developed according to explicit a priori measurement criteria’ (Ludlow, Enterline, & Cochran-Smith, 2008, p. 196). For this study, analysis used the Rasch Rating Scale Model (RRSM; Andrich, 1978). Winsteps software (Linacre, 2018) estimated the parameters for analysis using joint maximum likelihood estimation techniques. Similar to previous Rasch validation studies (e.g., Royal & Elahi, 2011; Wolfe & Smith, 2007), assessment of the ER89-R examined five criteria: rating scale effectiveness, dimensionality, reliability, differential item functioning and item hierarchy.
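To make the Rating Scale Model concrete, the following sketch computes category probabilities for a four-category item using the standard Andrich formulation. It does not reproduce the Winsteps estimation routine, and the person location, item difficulty and threshold values are hypothetical.

```python
import math
from typing import List, Sequence


def rsm_category_probabilities(
    theta: float, difficulty: float, thresholds: Sequence[float]
) -> List[float]:
    """Category probabilities under the Andrich Rating Scale Model.

    theta      -- person location in logits
    difficulty -- item location in logits
    thresholds -- threshold parameters shared by all items; a 4-point
                  response format has three thresholds
    """
    # Cumulative sums of (theta - difficulty - tau_j); the lowest category
    # corresponds to an empty sum, i.e. 0.0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - difficulty - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical parameter values, chosen only for illustration.
probabilities = rsm_category_probabilities(
    theta=0.8, difficulty=0.0, thresholds=[-1.5, 0.0, 1.5]
)
for category, p in enumerate(probabilities, start=1):
    print(f"category {category}: {p:.3f}")
```

Plotting these probabilities across a range of person locations yields the response category probability curves typically inspected when judging rating scale effectiveness.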
Results
Confirmatory factor analysis
Assessment of multivariate kurtosis via Mardia’s test indicated a significant departure from normality (i.e., 11.89 > 1.96). However, Mardia’s test is highly sensitive to sample size, and it is recommended to examine kurtosis among individual variables (Stelmack et al., 2009). Values > 3.00 indicate a variable is not normally distributed (Westfall & Henning, 2013). All variables were below 3.00 (i.e., ranged between 0.41 and 1.01).
Fit Indices for ER89-R Factor Models.
Note. **χ² significant at p < .001.
Factor loadings were significant and > 0.3, ranging from 0.36 to 0.75. Comparison of AIC between the higher-order and total scale models, in addition to the fit indices, suggested that the higher-order model was superior (i.e., lower AIC and better overall fit). Figure 1 displays standardised factor loadings for the higher-order model alongside error and R². Omega and alpha reliability for all scales were acceptable (total scale ω = 0.80, total scale α = 0.80; Openness to Life Experiences ω = 0.76, Openness to Life Experiences α = 0.75; and Optimal Regulation ω = 0.70, Optimal Regulation α = 0.70).
Figure 1. ER89-R higher-order model. Note. ER = Ego Resiliency total score; OL = Openness to Life Experiences; OR = Optimal Regulation.
Rasch analysis
Rating Scale Effectiveness.
Item Fit Statistics.
Note. The MNSQ acceptable limits to productive measurement were 0.6–1.4. Values beyond these limits are considered misfitting.
Figure 2 displays response curves for each category of the survey. This is informative insofar as it indicates level of endorsement. For example, the bell-shaped curve for ‘Applies somewhat’ peaks in the ability range of 0.5 to 1.3 for both Openness to Life Experiences and Optimal Regulation, signifying that individuals with Ego Resiliency scores between 0.5 and 1.3 are more likely to endorse this category. It appears from the pattern of responses (see Figure 2) that participants are using all four response categories.
Figure 2. Response category probability curves.
The PCA of the residuals examined dimensionality. For Openness to Life Experiences, the observed variance explained by the first extracted dimension was 49.4%, and the unexplained variance in the first contrast was 19.7% (eigenvalue: 1.6). For Optimal Regulation, the observed variance explained by the first component was 40.3%, and the first contrast accounted for 13.5% (eigenvalue: 1.4). An eigenvalue > 2 is suggested to reflect a component (Linacre, 2012). As both first-contrast eigenvalues fell below this value, the results indicated the presence of one Rasch dimension for each factor, and represented acceptable evidence for unidimensionality.
Reliability and separation estimates reflect the degree of reproducibility in the scores. For Openness to Life Experiences and Optimal Regulation, person reliability was 0.70 and 0.73, signifying acceptable internal consistency. Similarly, item reliability was 0.96 and 0.99, indicating high item reliability. Person separation estimates of 1.54 and 1.66 for Openness to Life Experiences and Optimal Regulation indicated reasonable spread, distinguishing high and low ability in the sample (Souza et al., 2017). Respective item separation estimates were 5.15 and 12.12, implying a good spread of items.
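Separation and reliability are two expressions of the same information. Assuming the conventional Rasch relationship between them (reliability = G² / (1 + G²), where G is the separation index), the sketch below converts between the two. The function names are illustrative; the separation values are those reported above.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Convert a separation index G to reliability: G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Convert reliability R to a separation index: sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


# Separation estimates reported in the text above.
reported_separation = {
    "Openness to Life Experiences (person)": 1.54,
    "Optimal Regulation (person)": 1.66,
    "Openness to Life Experiences (item)": 5.15,
    "Optimal Regulation (item)": 12.12,
}
for label, g in reported_separation.items():
    print(f"{label}: G = {g:.2f}, reliability = {reliability_from_separation(g):.2f}")
```

Under this assumption, the reported separation estimates correspond, to two decimal places, to the reported reliabilities (0.70, 0.73, 0.96 and 0.99).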
Differential item functioning examines stability in item difficulty in relation to subpopulations (e.g., gender). For Openness to Life Experiences, all items demonstrated acceptable DIF contrasts (i.e., below 0.5 logits; Linacre, 2012). For Optimal Regulation, item 1 displayed a noticeable difference between men and women, and was 0.67 logits more difficult for men (Mantel–Haenszel p < .001). This study examined how the structure of the ER89-R behaved when administered in this sample. Considering that modifying the scale structure was not the purpose, the authors hope this information will guide use in future research.
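The size criterion for DIF can be illustrated as the difference between group-specific item difficulty estimates, flagged when it reaches 0.5 logits. This sketch covers only the contrast size, not the accompanying Mantel–Haenszel significance test. The function names and the two group difficulty values are hypothetical, chosen so that the contrast matches the 0.67 logits reported for item 1.

```python
def dif_contrast(difficulty_group_a: float, difficulty_group_b: float) -> float:
    """Difference in item difficulty (in logits) between two groups."""
    return difficulty_group_a - difficulty_group_b


def is_noticeable_dif(contrast: float, threshold: float = 0.5) -> bool:
    """Flag a contrast whose absolute size reaches the threshold."""
    return abs(contrast) >= threshold


# Hypothetical group-specific difficulties for a single item (logits).
difficulty_men = 0.40
difficulty_women = -0.27

contrast = dif_contrast(difficulty_men, difficulty_women)
print(f"DIF contrast = {contrast:.2f} logits, noticeable = {is_noticeable_dif(contrast)}")
```

A positive contrast here means the item is harder to endorse for the first group at the same trait level.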
Item difficulty is interpretable from the person–item maps (Figure 3). Items positioned in the bottom section of the continuum are the most straightforward, and participants located near these items possess less ability to take part (i.e., less inclination to endorse items). Items positioned near the top of the continuum are the most challenging, and participants close to these possess greater ability. Item 4 was the easiest to agree with for Openness to Life Experiences, whereas item 5 was the most difficult to endorse. Item 1 was the easiest for Optimal Regulation, and items 9 and 10 were the most challenging.
Figure 3. Person–item maps of openness to experiences and optimal regulation. Note. M = mean persons’ ability or mean items’ difficulty; S = one standard deviation; T = two standard deviations.
Discussion
CFA findings supported the presence of a higher-order structure for the ER89-R, comprising two factors of Openness to Life Experiences and Optimal Regulation. Specifically, like Alessandri et al. (2008), a one-factor model was unsatisfactory, and a higher-order model fitted data more adequately. This is in contrast with Block and Kremen’s (1996) original conceptualisation of the construct/measure as unidimensional. Factor loadings for the two dimensions were satisfactory overall, and consistent with Alessandri et al. (2008), satisfactory to good internal consistency existed.
Rasch analysis demonstrated that the ER89-R is a valid measurement tool with a general population, which can adequately quantify Openness to Life Experiences and Optimal Regulation (as indices of Ego Resiliency). Items worked well in combination to form valid unidimensional interval scales. Specifically, Rasch is an effective technique permitting classification of rogue items (i.e., those not sensitive to the underlying construct) that add noise/randomness to measurement and should be discarded. All items were productive for measurement in this analysis, indicating that the scales measure single (albeit related) constructs. The PCA of the residuals furthermore supported unidimensionality, and consistent with the classical test results, the scales demonstrated satisfactory reliability (using the Rasch approach).
From the person–item map, mean endorsement score was in a similar location to the mean item difficulty in each instance, signifying that the items for the study sample were well targeted (Stelmack et al., 2004). However, the mean endorsement score was slightly greater than the mean item difficulty in each instance, indicating that items were overall ‘fairly’ easy to complete for the sample. Furthermore, items demonstrated a low spread on the continuum versus the person spread. This lack of spread may explain the person separation indexes (1.54 and 1.66), which are impacted by aspects including scale length and number of response categories per item (Linacre, 2012). Notably, the scales are relatively short (4 and 6 items) and include a fairly brief response range (1–4). Though the person separation indexes were sufficient to distinguish high and low ability, a greater value is desirable for improved classification of levels of ability among respondents. In this context, expanding the measure to include items that better differentiate low versus high ability would prove productive.
Analysis of DIF suggested men found one item more difficult than women (item 1, ‘I am generous with my friends’). The literature on the ER89-R designates that women often score higher than men (e.g., Alessandri et al., 2008; Caprara et al., 2003). Perhaps item difficulty may contribute to lower levels of positive endorsement. Indeed, if participants find items challenging/difficult to discern in terms of meaning, this can bias the response (Furnham, 1986). However, further work clarifying whether DIF significantly impacts overall score is necessary before suggesting any changes on this item.
Findings overall indicate that the ER89-R possessed relatively sound psychometric properties in a general population sample. Moreover, the results suggest that it is apposite to treat Openness to Life Experiences and Optimal Regulation as distinct unidimensional scales. In practice, these should be tallied separately when determining Ego Resiliency scores. Results furthermore suggest that including all items is appropriate when administering the ER89-R given these were all valid and fitted well with their respective latent construct.
The stability of the ER89-R’s psychometric properties across method (CFA and Rasch) and independent studies (Alessandri et al., 2008, 2012) highlights the potential usefulness of this measure for assessing resilience, which is an important concept in relation to psychological well-being (Tugade & Fredrickson, 2004). In addition, the brevity and ease of administration are appealing, and the measure could be highly useful to researchers investigating self-regulation and associated constructs.
Limitations
The current study included a higher proportion of women compared with men (73% vs. 27%). Given that women tend to report higher ER89-R scores (e.g., Alessandri et al., 2008; Caprara et al., 2003), it is possible that an overrepresentation of higher Ego Resiliency scores existed in the sample. Nonetheless, this slight bias is not likely to undermine the calibration of the ER89-R because Rasch analysis permits relatively sample-free measure standardisation in comparison with classical test theory (Tinsley & Dawis, 1975).
In addition, it may have been informative to examine alternative versions of the ER89 (i.e., the original 14-item and the 11-item versions), and compare factor solutions with the ER89-R. As it stands, conclusions can only be made on how the ER89-R performs in isolation, not in comparison with alternative measures that propose different latent structures. Moreover, development of the ER89-R via classical test approaches included cross-cultural comparisons (e.g., Alessandri et al., 2012). Current findings are, however, restricted to the English language version of the ER89-R. It is thus important for future research to replicate these findings in other cultural contexts and with other language versions.
Footnotes
Consent to Participate
Informed consent was obtained from all the participants in the study.
Ethical Approval
The study was approved by the Manchester Metropolitan University Faculty of Health, Psychology and Social Care Ethics Committee.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by an internal research excellence award from the Manchester Metropolitan University Faculty of Health, Psychology and Social Care (grant no. 294300).
