Abstract
This study analyzes the factor structure of the BFI-10 in older adults, taking item valence effects into account. It also estimates the internal consistency of the scale, assesses its nomological validity, and examines the association of the Big Five traits with age. A total of 75,078 participants (mean age = 68.27 years) from the 7th Wave of the SHARE study were included. Confirmatory Factor Analyses, omega coefficients and Pearson correlations were estimated. The best-fitting model was a five-factor structure with two valence effects; internal consistency ranged from .26 to .64; and the nomological network showed that loneliness was positively associated with neuroticism and negatively associated with the other four traits, whereas life satisfaction and quality of life showed the opposite pattern. Conscientiousness, Extraversion and Openness tended to decrease with age.
Introduction
Traditionally, the Big Five personality attributes model (namely extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience) has been one of the most used in the field of personality psychology (Thalmayer et al., 2011), especially to describe personality throughout adulthood (Brandt et al., 2020). Personality traits are difficult to study because they are largely constant throughout life yet also flexible: people change as they age (Damian et al., 2019), go through the psychosocial changes of adulthood, and gain life experience (Costa et al., 2019). Generally, longitudinal personality studies in older adults have found a tendency for neuroticism to increase and for the other personality traits to decrease as people age (Berg and Johansson, 2014; Kandler et al., 2015; Mottus et al., 2012; Wagner et al., 2016), although others support this age-related decline only for Conscientiousness, Extraversion and Openness (Atherton et al., 2022). Nevertheless, the Big Five personality traits have shown relative stability in adulthood (Terracciano et al., 2006; Wagner et al., 2019), meaning that people are relatively stable on each personality variable over time or change in similar ways (Olaru & Allemand, 2021).
The gold standard for assessing the Big Five personality dimensions has been the NEO-PI-R (Costa & McCrae, 1992). Despite its strong theoretical and psychometric properties, this instrument has the drawback of taking about 45 minutes to complete (Balgiu, 2018). Instruments with a reduced number of items avoid redundancy and long completion times (Guido et al., 2015). For this reason, researchers have developed alternative short personality instruments such as the 60-item NEO-FFI (McCrae & Costa, 2004), the 44-item Big Five Inventory (BFI-44, John & Srivastava, 1999), Goldberg’s (1992) set of 100 Trait Descriptive Adjectives (TDA), its compact version of 40 descriptors (Saucier, 1994), or the 60-item BFI-2 scale (Soto & John, 2017). There are also extra-short measures intended for large-scale surveys (Hofmans et al., 2008), such as the Ten-Item Personality Inventory (TIPI, Gosling et al., 2003), the Five-Item Personality Inventory (FIPI, Gosling et al., 2003), the Single-Item Measures of Personality (SIMP, Woods & Hampson, 2005), and the one discussed in this article, the Big Five Inventory-10 (BFI-10) designed by Rammstedt and John (2007).
The BFI-10 is a ten-item scale developed in two languages, English and German, with samples of students. It was designed to assess the Big Five dimensions of personality in a very short amount of time. The items of the scale were selected from the BFI-44, based on expert judgment and empirical item analyses. This scale is an asset for health surveys, where resource and time limitations require the administration of such short measures (Eisinga et al., 2013). In fact, in 2017 the BFI-10 was administered for the first time in the Survey of Health, Aging and Retirement in Europe (SHARE), a large, longitudinal household survey of representative samples of European people aged 50 years and over (Börsch-Supan, 2020).
Regarding the factor structure of the BFI-10, multiple studies have examined whether the original five-factor structure was replicated, and mixed results have been found. In the first validation study, the expected five-factor structure of the BFI-10 was displayed in both US and German student samples (Rammstedt & John, 2007). While this original structure has been replicated in Italian (Guido et al., 2015) and French (Courtois et al., 2020) samples, item cross-loadings have also been documented, mostly for agreeableness items (Carciofo et al., 2016; Rammstedt et al., 2013) and to some extent for extraversion (Balgiu, 2018), neuroticism (Carciofo et al., 2016) and openness (Rammstedt et al., 2013) items. Finally, John et al. (2019) tested the five-factor model in a sample of Indian students but did not find convergence.
Overall, validation studies seem to disagree on the underlying structure of the BFI-10. Two reasons might obscure the scale’s dimensionality. On the one hand, as noted by Kline (2016), problems in analysis are more common in factors with only two items. On the other hand, items can be framed in the direction of the construct under measurement, hence presenting a positive valence with respect to that construct, or in the reversed direction, with a negative valence. One way to create reversed items with negative valence is to phrase them using antonyms of the construct being measured (Suárez-Álvarez, et al., 2018). Reversed items are thought to prevent response bias (Salazar, 2015). However, some authors have argued that reversing items poses interpretation problems that can lead to inattention and confusion (van Sonderen et al., 2013). Other authors (Seng Kam & Meyer, 2015) have further demonstrated that a social desirability response style partially explains the method effects associated with item valence and item keying. In the BFI-10, only two items tap each dimension, one framed positively and the other framed in the opposite direction of the construct. Hence, the small number of items per dimension and their valence could account for the divergent factor structures found across validation studies.
Controversy regarding the factorial structure of Big Five scales goes beyond the BFI-10. Recent research has delved into bifactor models that try to capture substantive and method factors (e.g. Ashton et al., 2020; Biderman et al., 2018, 2019; Chang et al., 2012; Ober et al., 2021; Revelle & Wilt, 2013). Regarding measurement-related issues in Big Five assessment, Biderman et al. (2011) explored the factorial structure of two personality questionnaires (IPIP and NEO-FFI) including three method factors: (a) a general factor including all the items, (b) one including positively worded items, and (c) one including negatively worded items. The authors concluded that personality measures might be misspecified if method factors are not considered. Their results supported previous literature on self-assessment that indicates the relevance of including both method factors (positive and negative). Marsh et al. (2010) conducted a notable study on the wording effect using the Rosenberg Self Esteem scale, in which they found longitudinal support for the co-existence of both method factors.
Furthermore, a key reason why the structure of personality is frequently challenging to replicate in representative samples of the adult population is that older people show a wide range of cognitive abilities and different levels of education, factors closely related to method effects on the BFI-10 scale (Lechner & Rammstedt, 2015). For example, reverse-keyed items have been associated with several measurement problems (Menold, 2020) that could especially affect older adult samples, as they seem to require mental rotation of the rating scale to be processed, a difficult task that depends on cognitive functioning (Kail & Park, 1990). In addition, poorer cognitive functioning has also been associated with higher levels of acquiescence (Rammstedt & Farmer, 2013).
Estimates of internal consistency of the BFI-10 show great disparity among studies, with reported Cronbach’s alpha values ranging from .12 to .75 for Extraversion; .02–.78 for Agreeableness; .25–.46 for Conscientiousness; .33–.62 for Neuroticism; and .21–.76 for Openness (Carciofo et al., 2016; Hrebícková et al., 2016; John et al., 2019).
Although there are many studies that have tested the psychometric properties of the BFI-10, most of them focus on samples of students or young adults, whereas there is no evidence about the invariance of the measure across different age groups. A measure that does not show sufficient evidence of invariance and is administered to a group different from its standardized population may lead to invalid personality scores. Therefore, there is a strong need to study the psychometric properties of the BFI-10 also in older people.
Regarding nomological validity, some authors have reported evidence on the association between the Big Five personality traits and several variables. Among the available measures in SHARE, some accumulate sufficient evidence of clear associations with personality, which means that they can be used to test for nomological validity. Specifically, the variables were: life satisfaction (Anglim et al., 2020; Balgiu, 2018; Lachmann et al., 2018; Rammstedt et al., 2020) and quality of life (Anglim et al., 2020; Balgiu, 2018), which, in turn, show negative associations with neuroticism and a positive relationship with extraversion, conscientiousness, agreeableness, and openness to experience. Loneliness is also an important variable in the context of old age, due to its higher prevalence, and its strong relation with personality traits (Hensley et al., 2012). Loneliness is positively related to neuroticism and negatively associated with the other personality traits in a consistent way in older people (Itzick et al., 2020; Schutter et al., 2020; Wang & Dong, 2018).
Despite the number of psychometric studies of the BFI-10, there is a need to study the scale’s properties in samples of older people, especially considering the potential method effects associated with item valence. The aim of this study is to test the psychometric properties of the BFI-10 in a large representative sample of European and Israeli adults. More concretely, the aims of the present study are (a) to analyze potential item valence effects and establish the factor structure of the scale; (b) to estimate the internal consistency of the best-fitting model; (c) to assess the nomological validity of the personality dimensions; and (d) to test the association of the five personality dimensions with age.
Method
Sample and procedure
Data used in this study come from the Survey of Health, Aging and Retirement in Europe (SHARE; Börsch-Supan et al., 2013), in its 7th Wave (Börsch-Supan, 2020). The data are available at the SHARE Research Data Center to the entire research community free of charge (www.share-project.org) upon reasonable request and with permission of the SHARE project. For this study, the total sample comprised 77,263 participants, of whom 2,185 were not administered the BFI-10. Hence, the final sample consisted of the remaining 75,078 participants.
Of the final sample, 43,041 (57.3%) participants were women and 32,037 (42.7%) were men. Most of them were married and living together with their spouse (68.9%), while 15.2% were widows/widowers, 8.1% were divorced, and the remaining 7.8% were in other situations. Age ranged between 22 and 105 years (M = 68.27, SD = 9.90). Individuals younger than 50 years accounted for 1.1% of the sample. Although the SHARE target population is all people aged 50 years and over, partners living in the same household are also eligible respondents regardless of their age. The sampling strategy employed in SHARE is a probabilistic four-stage process. For more details on sample eligibility and the sampling procedures, please see Bergmann et al. (2019). Ethical approval for collecting the data used in this study was obtained by the SHARE project, whose procedures are guided by the Declaration of Helsinki.
Instruments
Personality traits were measured using the 10-item Big-Five Inventory (BFI-10; Rammstedt & John, 2007). This measure of personality contains two items tapping each of the following factors: neuroticism, extraversion, conscientiousness, agreeableness and openness to experience. For each factor, one item is phrased in the direction of the construct and the other item is phrased in the opposite direction. Responses were coded in a 5-point Likert scale ranging from 1 (disagree strongly) to 5 (agree strongly).
Loneliness was measured with the Three-Item Loneliness Scale (Hughes et al., 2004), a reduced version of the R-UCLA Loneliness Scale (Russell et al., 1980). The items of the scale assess the frequency of feelings of exclusion, isolation and abandonment. Responses are coded on a three-point Likert scale: 1 (hardly ever or never), 2 (some of the time), and 3 (often). Internal consistency, estimated with Cronbach’s alpha, was .76.
Quality of life was assessed using the Control, Autonomy, Self-realization and Pleasure-12 (CASP-12) scale, a reduced version of the CASP-19 (Hyde et al., 2003). Item responses are coded in a four-point Likert scale, ranging from 1 (never) to 4 (often). The 12-item version of the instrument was recently validated using data from SHARE Wave 7 (Oliver et al., 2021). Results from the validation study support its use as an overall score of subjective well-being. Scale’s alpha was .83 for the general factor of well-being.
Life satisfaction was measured with the question: “On a scale from 0 to 10 where 0 means completely dissatisfied and 10 means completely satisfied, how satisfied are you with your life?”
Statistical analyses
To address the first objective of the study, a set of Confirmatory Factor Analyses (CFAs) was estimated. Specifically, the four a priori models were: Model 1, the five-factor structure of the BFI-10 as established by Rammstedt & John (2007); Model 2, the five-factor substantive structure plus a method factor associated with the five negative valence items of the scale; Model 3, the five personality factors plus a method factor associated with the five positive valence items of the scale; and Model 4, the five substantive factors plus both method factors, one associated with the positive valence items and the other with the negative valence items.
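As an illustration of how the four a priori models differ, the following minimal Python sketch builds the loading pattern of each model. The item labels (for example `N_pos`, `N_neg`) and the method factor names `ME_pos` and `ME_neg` are hypothetical placeholders, not the BFI-10 item wording or the names used in the original Mplus syntax; the sketch only encodes which items load on which factors and does not estimate anything.

```python
# Hypothetical item labels: one positive and one negative valence item per trait.
TRAITS = ["N", "E", "C", "A", "O"]
ITEMS = [f"{trait}_{valence}" for trait in TRAITS for valence in ("pos", "neg")]


def loading_pattern(model):
    """Return {item: [factors the item loads on]} for Models 1 to 4."""
    pattern = {}
    for item in ITEMS:
        trait, valence = item.split("_")
        factors = [trait]  # every item loads on its substantive trait
        if valence == "neg" and model in (2, 4):
            factors.append("ME_neg")  # method factor for negative valence items
        if valence == "pos" and model in (3, 4):
            factors.append("ME_pos")  # method factor for positive valence items
        pattern[item] = factors
    return pattern


for model in (1, 2, 3, 4):
    pattern = loading_pattern(model)
    n_factors = len({f for factors in pattern.values() for f in factors})
    n_loadings = sum(len(factors) for factors in pattern.values())
    print(f"Model {model}: {n_factors} factors, {n_loadings} loadings")
```

Under these assumptions, Model 1 has ten loadings on five factors, whereas Model 4 has twenty loadings on seven factors, because each item loads on one trait and one method factor.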
Model fit was assessed with the recommended indices (Tanaka, 1993): the chi-square statistic (χ2), the Comparative Fit Index (CFI), the Root Mean Squared Error of Approximation (RMSEA) and the Standardized Root Mean Square Residual (SRMR). Fit is considered adequate when CFI is .90 or higher and RMSEA and SRMR are .08 or lower, and excellent when CFI is at least .95 and RMSEA and SRMR are .05 or less (Hu & Bentler, 1999). Our data were markedly non-normal and ordinal (five response alternatives); therefore, Weighted Least Squares Mean and Variance adjusted (WLSMV) estimation was chosen, given that it is the preferred method for handling deviations from multivariate normality in ordinal data (Kline, 2016).
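The cutoffs above can be expressed as a simple decision rule. The sketch below is only an illustration: the function name `classify_fit`, the label "poor" for models that meet neither set of cutoffs, and the example index values are assumptions, not results from this study.

```python
def classify_fit(cfi, rmsea, srmr):
    """Classify model fit using the cutoffs cited from Hu & Bentler (1999)."""
    if cfi >= 0.95 and rmsea <= 0.05 and srmr <= 0.05:
        return "excellent"
    if cfi >= 0.90 and rmsea <= 0.08 and srmr <= 0.08:
        return "adequate"
    return "poor"  # label assumed here for models meeting neither set of cutoffs


# Hypothetical index values, for illustration only.
print(classify_fit(cfi=0.97, rmsea=0.03, srmr=0.02))
print(classify_fit(cfi=0.92, rmsea=0.07, srmr=0.04))
print(classify_fit(cfi=0.85, rmsea=0.10, srmr=0.09))
```

Note that the rule requires all three indices to meet a cutoff at once; a model with an excellent CFI but an RMSEA of .07 is classified as adequate, not excellent.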
The second aim of the study was to estimate the internal consistency of the scale. For this, coefficient omega (ω, McDonald, 1999) was used, given that it overcomes the deficiencies of Cronbach’s alpha. Specifically, it is known that coefficient alpha underestimates the true reliability unless the items are tau-equivalent, whereas coefficient omega does not (Deng & Chan, 2017). The third research objective was approached by examining the relationships between the personality dimensions, measured using the BFI-10, and loneliness, life satisfaction and quality of life, using Pearson’s correlations. Finally, the fourth research objective was to analyze the association between age and the five personality traits using Pearson’s correlations. Analyses were done with SPSS 26 and Mplus 8.7 (Muthén & Muthén, 1998-2017–2021).
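To make the contrast between omega and alpha concrete, the sketch below computes both from standardized factor loadings under a single-factor model with uncorrelated residuals. This is a simplification: the omega values reported in this study come from a model that also includes method factors and may have been computed differently. The loadings are hypothetical, and standardized alpha is used here as a stand-in for Cronbach's alpha.

```python
from itertools import combinations


def omega(loadings):
    """Coefficient omega from standardized loadings on a single factor."""
    common = sum(loadings) ** 2                       # variance due to the factor
    unique = sum(1 - lam ** 2 for lam in loadings)    # residual variances
    return common / (common + unique)


def standardized_alpha(loadings):
    """Standardized alpha from the model-implied inter-item correlations."""
    k = len(loadings)
    pairs = list(combinations(loadings, 2))
    mean_r = sum(a * b for a, b in pairs) / len(pairs)  # implied r = product of loadings
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical two-item factors, for illustration only.
unequal = [0.7, 0.5]   # loadings differ: tau-equivalence does not hold
equal = [0.6, 0.6]     # equal loadings

print(round(omega(unequal), 3), round(standardized_alpha(unequal), 3))
print(round(omega(equal), 3), round(standardized_alpha(equal), 3))
```

With unequal loadings alpha falls below omega, and with equal loadings the two coincide, which is the property that motivates the choice of omega for two-item factors.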
Results
Factorial validity
Fit indices of the tested models.
M = Model; M.E. = Method effect.
Standardized factor loadings of the best-fitting model, Model 4, can be consulted in Figure 1. Both method factors were uncorrelated with each other and with the personality dimensions.
Best-fitting model structure and standardized factor loadings. All relationships statistically significant (p < .05) unless stated otherwise. Note: O = Openness; C = Conscientiousness; E = Extraversion; A = Agreeableness; N = Neuroticism; V. E. = Valence effect; ns = not significant.
Correlation matrix between the personality dimensions considering methods effects into the model and without method effects. Note: Estimates when method effects are considered in the left, and without method effects in the right; all correlations statistically significant (p < .05).
Internal consistency
Internal consistency was estimated using omega. Results were ω = .35 for Openness, ω = .56 for Conscientiousness, ω = .50 for Extraversion, ω = .26 for Agreeableness, and ω = .64 for Neuroticism.
Criterion-related validity
Correlations between the personality dimensions and loneliness, well-being and life satisfaction. All correlations statistically significant (p < .05).
Correlations between the Big Five traits and age
Extraversion (r = −.043, p < .001), Conscientiousness (r = −.015, p < .001), Openness (r = −.066, p < .001) and Neuroticism (r = −.010, p < .05) decreased with increasing age, while Agreeableness increased (r = .046, p < .001) as people got older. However, although statistically significant, all correlations were extremely low, indicating great stability of personality through old age.
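The gap between statistical significance and effect size in these correlations can be illustrated with a short calculation. The sketch below assumes that each correlation was computed on the full sample of 75,078 participants (the actual pairwise sample sizes are not reported here and may be smaller) and uses a normal approximation to the t distribution, which is reasonable at this sample size.

```python
import math

N = 75_078  # assumed sample size for every correlation (full final sample)


def r_test(r, n=N):
    """Return (t statistic, two-sided p value, shared variance) for Pearson's r."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    p = math.erfc(abs(t) / math.sqrt(2))  # normal approximation to the t distribution
    return t, p, r ** 2


correlations = {
    "Extraversion": -0.043,
    "Conscientiousness": -0.015,
    "Openness": -0.066,
    "Neuroticism": -0.010,
    "Agreeableness": 0.046,
}

for trait, r in correlations.items():
    t, p, r2 = r_test(r)
    print(f"{trait}: t = {t:.2f}, p = {p:.2g}, shared variance = {100 * r2:.2f}%")
```

Under these assumptions, even the largest correlation (Openness) corresponds to less than half a percent of variance shared with age, which is why significant results at this sample size are compatible with substantial stability.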
Discussion
The present study examined the psychometric properties of the BFI-10 (Rammstedt & John, 2007) using data from SHARE. Four aspects were addressed. Firstly, evidence for its structure was examined by testing a series of CFAs, in which special attention was paid to potential method effects associated with item valence. Secondly, the internal consistency of the resulting factors was estimated using omega coefficients. Thirdly, criterion-related validity was studied using life satisfaction, loneliness and quality of life as criteria. Fourthly, the associations between age and personality traits were evaluated to see whether they were similar to those found in previous studies.
Regarding factorial validity, among the four tested models, the best-fitting model was the one including two method effects associated with positive and negative item valence. Although method factors have been evidenced in personality assessment (e.g. Biderman et al., 2011; 2018), this study is the first to acknowledge such valence effects within the five-factor structure of the BFI-10. Some of the previous literature had found the simple five-factor structure of the Big-Five personality traits to fit the data (Courtois et al., 2020; Guido et al., 2015; Rammstedt & John, 2007), but other studies had found item cross-loadings (Balgiu, 2018; Carciofo et al., 2016; Rammstedt et al., 2013). One study was even unable to achieve model convergence for the five-factor structure (John et al., 2019). Nevertheless, some of these investigations consider that possible problems in the factorial structure may be due to the translation of the items, item interpretation, or cultural differences in personality structure (Carciofo et al., 2016). This should be considered especially in non-Western countries, where these cultural differences in personality may be more marked (John et al., 2019).
The correlations among the personality traits in the best-fitting model are generally as expected and as reported in the literature on the Big-Five factors of personality, except for the one between agreeableness and openness to experience. In our study, these two dimensions correlated negatively. This result differs from previous studies, which found near-zero estimates for this relationship (Carciofo et al., 2016; Rammstedt & John, 2007) at the observable level, while John et al. (2019) found a positive association.
One possibility is that valence effects were obscuring the scale’s dimensionality, as it is well known that mixing items with positive and negative wording creates problems in interpreting item content (van Sonderen et al., 2013). However, it could also be that just two items per dimension do not suffice to fully represent the construct (Marsh et al., 1998) or to provide an adequate analytical solution. As stated by Kline (2016), convergence and estimation problems are more frequently encountered in factors with few items.
Along these lines, internal consistency problems have also been documented, as it is difficult to obtain consistent responses to items belonging to the same construct when these are scant (Eisinga et al., 2013). This is one possible reason why so much disparity in the internal consistency of the BFI-10 has been found. In the present study, internal consistency estimates using coefficient omega were low, as in all previous studies looking at the scale’s reliability, whether by means of Cronbach’s alpha (Balgiu, 2018; Carciofo et al., 2016; John et al., 2019), the Spearman-Brown coefficient (Guido et al., 2015; Hrebícková et al., 2016), or test-retest (Courtois et al., 2020). When considering these low estimates of internal consistency, we must be aware of the natural tension between internal consistency and content validity that is almost inherent to short scales that try to tap constructs with different facets. Indeed, as acknowledged by Little et al. (1999), accurate relationships among constructs can occur even with indicators of poor reliability, provided that they are well selected across the construct’s domain, variability is ensured, and confirmatory factor analysis supports the factor structure.
Regarding nomological validity, results from this study replicated those of previous studies. Loneliness was positively associated with neuroticism and negatively related to the other four personality traits. As in all previous studies looking at this relationship in old age samples (Itzick et al., 2020; Schutter et al., 2020; Wang & Dong, 2018), the relationship of loneliness with neuroticism was stronger than its relationship with the other personality dimensions. Life satisfaction and quality of life, on the other hand, showed negative associations with neuroticism and positive ones with extraversion, agreeableness, conscientiousness and openness. This is in line with previous literature (Anglim et al., 2020; Balgiu, 2018; Lachmann et al., 2018; Rammstedt et al., 2020). In general, as in the aforementioned studies, neuroticism showed the strongest association with the criteria, and openness the weakest. Hence, the results of this study suggest that the BFI-10 displays adequate criterion-related validity.
Finally, the relationship found between the Big Five and age was as expected for Extraversion, Conscientiousness and Openness. However, an increase in agreeableness and a decrease in neuroticism were found with increasing age. A possible explanation is that the sample also includes people in middle adulthood, and during these years agreeableness tends to increase and neuroticism to decrease due to maturation and increased responsibilities (Marsh et al., 2013). In any case, these correlations were relatively small, showing that personality traits are fundamentally stable.
All in all, results show that responses to the BFI-10 seem to be affected by item valence, and that internal consistency is lacking, probably due to the small number of items per dimension. Seng Kam and Meyer (2015) argue that item valence ought to be controlled for using Multi-Trait Multi-Method (MTMM) CFA models, while deletion of negative valence items is not recommended, as these authors demonstrated that item valence influences nomological network analyses. Be that as it may, once item valence is acknowledged within the factor model, nomological validity can be established. Additionally, short measures are commonly used in large-scale surveys in which resources and time are limited (Eisinga et al., 2013). SHARE is one of the most complete surveys in Europe in terms of sample representativeness, types of measures covered (social, psychological, economic, physical, etc.) and quality of research design. Although short measures do not represent the best assessment scenario, the balance between quality and scope of research ought to be considered. In this sense, SHARE provides an excellent middle ground between the quality of assessment instruments and methodology.
Among the contributions of this research, this is the first validation study of the BFI-10 in a representative sample of adults, which allows for application of the instrument beyond student populations. Moreover, this study expands the knowledge of valence effects in the BFI-10 scale and emphasizes the need to consider them when the scale is used in samples of older people, as our data show that method effects are present and are not negligible. Given the importance of these effects, an alternative worth considering is constrained Factor Mixture Analysis (FMA), as proposed by Arias et al. (2020) and Steinmann et al. (2021), to screen out inconsistent respondents to mixed-worded scales. Findings should, however, be interpreted considering some limitations. Our study does not include concurrent and discriminant validity analyses, as the SHARE project only includes the BFI-10 personality measure. Although the BFI-10 scale has been correlated with other personality instruments in previous studies (Carciofo et al., 2016; Courtois et al., 2020; Guido et al., 2015; Rammstedt & John, 2007) and has shown evidence of concurrent validity, this short scale ought to be accompanied by other personality scales (Balgiu, 2018) whenever possible; otherwise, results should be interpreted with caution.
Conclusions
This study contributes to knowledge about the validity of the BFI-10 in a representative sample of older adults. It highlights possible method effects associated with item valence, which should be taken into account to improve the interpretation of the scale’s psychometric properties, particularly when studying personality in older people.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The SHARE data collection has been funded by the European Commission, DG RTD through FP5 (QLK6-CT-2001-00360), FP6 (SHARE-I3: RII-CT-2006-062193, COMPARE: CIT5-CT-2005-028857, SHARELIFE: CIT4-CT-2006-028812), FP7 (SHARE-PREP: GA N°211909, SHARE-LEAP: GA N°227822, SHARE M4: GA N°261982, DASISH: GA N°283646) and Horizon 2020 (SHARE-DEV3: GA N°676536, SHARE-COHESION: GA N°870628, SERISS: GA N°654221, SSHOC: GA N°823782, SHARE-COVID19: GA N°101015924) and by DG Employment, Social Affairs and Inclusion through VS 2015/0195, VS 2016/0135, VS 2018/0285, VS 2019/0332, and VS 2020/0313. Additional funding from the German Ministry of Education and Research, the Max Planck Society for the Advancement of Science, the U.S. National Institute on Aging (U01_AG09740-13S2, P01_AG005842, P01_AG08291, P30_AG12815, R21_AG025169, Y1-AG-4553-01, IAG_BSR06-11, OGHA_04-064, HHSN271201300071C, RAG052527A) and from various national funding sources is gratefully acknowledged (see
). The data used for this article can be accessed at the SHARE Research Data Center to the entire research community free of charge (www.share-project.org). Data are available from the authors upon reasonable request and with permission of the SHARE project. This research is framed in project PID2021-124418OB-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”. Irene Fernández is the recipient of grant PRE2019-089021 funded by MCIN/AEI/10.13039/501100011033 and by “ESF Investing in your future”. Zaira Torres is a researcher beneficiary of the FPU program from the Spanish Ministry of Universities (FPU20/02482) and Sara Martínez-Gregorio is a researcher beneficiary of the FPU program from the Spanish Ministry of Sciences, Innovation and Universities (FPU18/03710).
