Assessment of the Validity and Reliability of the Arabic Version of the Ten-Item Personality Inventory (TIPI) Among Undergraduate Nursing Students and Interns in Saudi Arabia

Abstract

Background

The Ten-Item Personality Inventory is widely recognised for measuring personality traits. It has neither been translated into Arabic nor psychometrically tested in an Arabic context.

Aim

To assess the validity and reliability of the translated Arabic version of the Ten-Item Personality Inventory (TIPI) among undergraduate nursing students and interns in Saudi Arabia.

Methods

A descriptive, cross-sectional survey with translation, back-translation and expert panel validation was used. A convenience sample of 283 undergraduate nursing students and interns was recruited from three higher education institutions in Saudi Arabia. The sample was randomly split, and exploratory and confirmatory factor analyses were conducted on the first and second sub-samples, respectively. Internal consistency was determined by Cronbach’s alpha coefficient, corrected item-total correlation, and mean inter-item correlation. To assess model fit, the following index thresholds were used: χ²/df ≤ 3, TLI > 0.90, CFI > 0.90, IFI > 0.90, NFI > 0.85, RMSEA < 0.08 and SRMR < 0.80. Composite reliability, convergent validity, and discriminant validity were assessed for the final model.

Results

The translated tool demonstrated excellent content validity. Exploratory factor analysis produced a seven-item, two-factor model accounting for 36% of the total variance. The McDonald’s Omega and Cronbach’s alpha coefficients for the overall scale were 0.71 and 0.64, respectively, and the mean inter-item correlation was 0.20, suggesting acceptable internal consistency. Confirmatory factor analysis of the seven-item, two-factor model showed mixed goodness of fit (χ²/df: 2.6, TLI: 0.85, CFI: 0.91, IFI: 0.91, NFI: 0.86, RMSEA: 0.29, SRMR: 0.08), and demonstrated both discriminant validity and acceptable composite reliability. However, convergent validity was only partially met.

Conclusion

The final seven-item scale contributes toward establishing an Arabic version of the Ten-Item Personality Inventory. Given its psychometric limitations, a revised version might be needed to identify the Arabic version that best captures the underlying constructs of the Ten-Item Personality Inventory.

Keywords

Ten-Item Personality Inventory; validation; Arabic; undergraduate nursing students; interns

Background

Personality traits provide an explanation for behavior and they need to be understood in the context of the wider system of individual functioning (Costa and McCrae, 2008). Early studies in personality science have mainly focused on identifying and describing personality characteristics (Conley, 1985; Hastie and Kumar, 1979), before shifting toward studying personality traits and predicting future behaviors based on these traits (Kumaranayake, 2017). Researchers first identified the common specific traits that are representative of the individual’s personality. Evidence from the literature has consistently presented five of them, which are commonly called the “Big Five” model. These are: Extraversion, Agreeableness, Conscientiousness, Openness to Experience, and Neuroticism (Goldberg, 1993). Capturing these “Big Five” traits among health care professionals is bound to provide a critical insight into their psychological trends and well-being. Therefore, it is imperative that a psychometrically valid and reliable survey tool be developed to accurately capture these important traits.

Review of Literature

The personality traits and wellbeing of the current and future nursing workforce have increasingly been investigated and have become a particular concern. For undergraduate nursing students and interns, specific traits such as Agreeableness and Conscientiousness have been widely identified as crucial predictors of effective clinical decision-making, professional engagement, and lower anxiety and stress (Xu et al., 2023; Kućar et al., 2025). Furthermore, personality is recognised as a factor in the development of professional identity, which is essential for the transition from student to qualified nurse (Wu et al., 2024). Recent studies (Farag et al., 2025; Tsiara et al., 2025) examined how the Big Five traits (measured with the TIPI) correlate with nursing students’ attitudes towards artificial intelligence (AI). Results showed that extraversion and openness are positively correlated with perceptions of AI.

The “Big Five” Model has become widely recognized for describing the individual’s personality traits, and several research tools have been developed to measure the “Big Five” traits. One of the most comprehensive tools to examine the dimensions of the Big Five is the Revised Neuroticism-Extraversion-Openness Personality Inventory (NEO-PI-R) (Costa and McCrae, 2012). This 240-item tool demonstrated excellent reliability and validity; however, the lengthy time needed to complete it presents a challenge in certain clinical and large-scale research settings.

Due to the need for a significantly shorter instrument that saves the participants’ time while still capturing the main dimensions of the “Big Five” personality traits, Gosling et al. (2003) developed the Ten-Item Personality Inventory (TIPI), which takes less than five minutes to complete. Gosling et al. (2003) reported that this tool has acceptable psychometric properties, such as convergent and discriminant validity and test-retest reliability. Berdida and Grande (2023) examined the interrelationships of personality traits, sleep quality, social media addiction, and academic performance among Filipino nursing students and found that nursing students reported high levels of extraversion, but low levels of emotional stability and openness to experiences. In contrast, a high level of openness was reported among undergraduate nursing students in Egypt (Ibrahim and Elhabashy, 2025).

This tool has been translated, adapted, and validated into various languages such as Spanish (Renau et al., 2013), South African (Metzer et al., 2014), Croatian (Vorkapić, 2016), Polish (Laguna et al., 2014), Italian (Chiorri et al., 2015), and Indonesian (Akhtar, 2018). Most of these studies reported satisfactory levels of construct validity for the Big Five Inventory (BFI). Other studies, however, suggested that the TIPI did not meet the minimum criteria of a reliability coefficient, which is 0.70 (Atak et al., 2013). The original scale (TIPI) (Gosling et al., 2003) demonstrated low-to-moderate Cronbach’s alphas (0.40–0.68), which is usually reported for short scales since the number of items is small (Ziegler et al., 2014). Gosling et al. (2003) acknowledged that it is almost impossible to achieve high levels of alpha coefficients and good fit indices in a short instrument such as the TIPI, which has only two items in each dimension. Nonetheless, the TIPI continues to be widely used among the psychology research community, where the researchers can accept somewhat diminished psychometric properties in favour of a more convenient shorter tool (Thørrisen et al., 2021; Akhtar et al., 2018). Whilst the TIPI has been adapted into other languages, it has not been adapted into the Arabic language. Researchers can benefit from using an Arabic version of the TIPI among Arabic-speaking participants.

Aim

The aim of this research was to assess the validity and reliability of the translated Arabic version of the Ten-Item Personality Inventory (TIPI) among undergraduate nursing students and interns in Saudi Arabia.

Methods

Research Design

A descriptive, cross-sectional design was utilised. An online survey was distributed to three undergraduate nursing education providers in Saudi Arabia. One nursing education provider was a publicly funded university in the Eastern Province, and the other two were private colleges that offer undergraduate nursing education in Riyadh (Central Province) and Jeddah (Western Province). This paper is part of a larger project that examined the psychometric properties of two scales from the psychology domain. The first paper, a psychometric evaluation of the Arabic version of the Irish Assertiveness Scale among Saudi undergraduate nursing students and interns, has been published in PLoS Journal (Mansour et al., 2021). This is the second paper from the project, which explains the delay in its publication.

Instrument

The Ten-Item Personality Inventory (TIPI) was used in this study (Gosling et al., 2003). The TIPI examines five dimensions of personality traits: Conscientiousness, Neuroticism, Extraversion, Openness, and Agreeableness. Each dimension in the TIPI is measured by two bipolar items, each representing a positive or a negative aspect of the personality. The participants were asked to rate how well each item applied to their personality on a seven-point Likert scale (from 1 = strongly disagree to 7 = strongly agree). Previous validation research has typically reported low Cronbach’s alpha values for the TIPI, ranging from 0.31 to 0.73 (Gosling et al., 2003; Hanif, 2018; Muck et al., 2007). Validation studies have aimed to preserve the construct validity of their newly translated TIPI tools, but the results were not always conclusive, with researchers sometimes reporting constructs that differed from those of the original TIPI (Hanif, 2018).

Participants, Sampling, and Sample

A total of 570 undergraduate nursing students and nurse interns were invited to participate in this study using a convenience sampling technique in 2019. Evidence from the literature suggests no absolute rules for the sample size required for validation studies. The suggested rule of thumb is based on the respondent-to-item ratio, ranging from 5:1 (Spielberger et al., 1983) to as large as 30:1 (Pedhazur, 1997). This study adopted Kline’s (2011) suggestion of a 10:1 ratio; for the ten-item scale, we therefore aimed to recruit at least 100 respondents. The inclusion criteria were being a third- or fourth-year undergraduate nursing student or a nursing intern, and being enrolled in an undergraduate nursing program in one of the selected universities/colleges. Nursing students undergo academic and clinical training as part of their studies, while nursing interns have completed the required theoretical and clinical courses and are undergoing a mandatory, full-time, 12-month practical training period as a prerequisite for becoming fully licensed registered nurses (Moreljwab et al., 2025). Compared with first- and second-year students, third- and fourth-year nursing students and nursing interns have reasonable clinical exposure (albeit to different degrees), which enables them to reflect more thoroughly on their clinical experience when answering the survey questions. All three academic institutions offered undergraduate nursing education, and two of them offered postgraduate nursing programs. The number of nursing students in each of the three academic institutions ranged between 250 and 450 across all programs.
All three academic institutions operated a male-female segregation policy, whereby male and female nursing students were taught on separate campuses, and one of them had campuses in eight cities across Saudi Arabia. All BSc nursing programs in the selected settings were accredited by the Saudi Education and Training Evaluation Commission (ETEC). To graduate as a fully licensed registered nurse in Saudi Arabia, students had to successfully complete four years of undergraduate nursing education, which is mainly based at a university/college, before completing a fifth year in a hospital-based internship program. Finally, students must pass the Saudi Commission for Health Specialties (SCHS) licensure exam before becoming fully licensed registered nurses.

Translation and Cultural Adaptation

Open permission to use the TIPI is posted on the instrument’s website (Gosling et al., 2003). The framework of Beaton et al. (2000) was adopted to guide the translation and cultural adaptation of the TIPI from English into Arabic. Firstly, the tool was translated into Arabic by two independent translators who were both fluent in Arabic and English. One of those translators was a nursing lecturer, and the other was a professor of the English language. A third translator compiled the two translations into one version (Arabic version), which was then back-translated into English by two independent translators who were fluent in both Arabic and English. The two back-translations were synthesised into one by an independent translator (English version). Finally, an eight-member panel of experts was invited to assess the adequacy of the translation process. The expert panel consisted of all five translators involved in the translation and back-translation process, one language expert with a PhD degree, one research methods expert with a PhD degree, and one health expert with a PhD degree. Although there is no consensus on the size of an expert panel, between 5 and 10 members are frequently cited in the literature (Almanasreh et al., 2019). The expert panel reviewed the Arabic and English versions and agreed on a pre-final Arabic version of the scale for pilot testing. In doing so, each panel member rated the relevance of each item of the pre-final version of the scale: 1 = not relevant, 2 = somewhat relevant, 3 = quite relevant, and 4 = highly relevant. The panel members also assessed whether the translated phrases reflected the same ideas as those expressed in the original version of the scale. This was to ensure that each item was not only translated correctly but also relevant to the cultural context of the new setting.
None of the panel members (including the translators) was financially reimbursed.

The newly-translated scale was piloted on 30 participants across the three research sites. The participants were probed to provide their views on the acceptability and overall understanding of the newly translated scale, and no major problems were reported.

Procedure

Following the IRB approval, and with the permission of the senior academic management in the selected colleges of nursing, an email invitation, which included an electronic link to the survey, was sent by administrative staff to all third- and fourth-year students, as well as all nursing interns, in each participating college (n = 570). A follow-up email reminder was sent to all potential participants after one week to enhance the response rate (Sammut et al., 2021). The research team aimed to maintain the operational equivalence of the instrument whenever possible by using a similar questionnaire format, mode of administration, and measurement methods in the target population as were used in the original setting (Gjersing et al., 2010). Although this study utilised an online survey, which was somewhat different from the paper-based surveys used in the original setting, the other research parameters were overall similar to those originally published by Gosling and his colleagues (2003). Data collection lasted from March 2019 to May 2019.

Data Analysis

The participants’ responses were collected online using QuestionPro software and analysed using the Statistical Package for Social Sciences (SPSS) version 26. Before statistical analysis commenced, five negatively worded items were reverse-coded as stipulated by Gosling et al. (2003) (items no. 1, 3, 5, 7, and 9). Descriptive statistics, including frequencies, means, and standard deviations, were used to analyze the participants’ demographic data. A mean score for each of the five dimensions was calculated to allow the reader to assess the participants’ responses on each personality trait dimension. Both the Item-level Content Validity Index (I-CVI) and the Content Validity Index for the whole instrument (Instrument level-CVI) were calculated. The I-CVI was calculated by counting the number of expert panel members who rated the item as 3 or 4 and dividing that number by the total number of experts (i.e. the extent of the experts’ agreement on each item) (Almanasreh et al., 2019). The Instrument level-CVI was calculated using the averaging method (Instrument-CVI/Ave) by summing the I-CVI values of all items and dividing by the number of items on the scale. For a panel of eight expert raters, an I-CVI of 0.78 and an Instrument-CVI/Ave of 0.90 were considered the minimum acceptable indices (Polit and Beck, 2022).

The reliability of the new scale was initially examined using a Cronbach’s alpha coefficient of 0.7 and a mean inter-item correlation (Mean IIC) of 0.2 as cut-off points for acceptable scale reliability (Field, 2024a, 2024b). The Mean IIC has been reported to be a more meaningful measure of a scale’s reliability when the scale has fewer than 10 items, with an optimum Mean IIC value between 0.2 and 0.4 (Briggs and Cheek, 1986; Pallant, 2020). The Mean IIC was therefore reported along with the Cronbach’s alpha coefficient once the number of remaining items in the scale fell below 10. McDonald’s Omega was also used to further consolidate the reliability findings, as it is considered superior to Cronbach’s alpha and allows items to have different factor loadings on the construct (Kalkbrenner, 2024).
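The three internal-consistency statistics named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study: the function names are hypothetical, the data are simulated, and NumPy stands in for SPSS purely for illustration.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    sum_item_variances = items.var(axis=0, ddof=1).sum()
    total_score_variance = items.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1 - sum_item_variances / total_score_variance))

def mean_inter_item_correlation(items: np.ndarray) -> float:
    """Mean of the off-diagonal entries of the item correlation matrix."""
    corr = np.corrcoef(items, rowvar=False)
    upper = corr[np.triu_indices(corr.shape[0], k=1)]
    return float(upper.mean())

def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated (hypothetical) data: 200 respondents, 5 items sharing one latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
scores = latent + rng.normal(size=(200, 5))

alpha = cronbach_alpha(scores)
mean_iic = mean_inter_item_correlation(scores)
citc = corrected_item_total(scores)
print(f"alpha={alpha:.2f}, mean IIC={mean_iic:.2f}, corrected item-total={np.round(citc, 2)}")
```

For a two-item subscale, the corrected item-total correlation of each item is simply the correlation between the two items, so it coincides with the Mean IIC. Under the additional assumption of equal item variances, alpha for k items reduces to k·r / (1 + (k − 1)·r), where r is the Mean IIC; this is why a two-item subscale with r = 0.25 yields an alpha of only 0.40.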

Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are two factor analysis methods that allow the researcher to evaluate how well a group of observed variables (e.g., items on a questionnaire) collectively reflect an unobserved (or latent) construct (Dumi et al., 2025). This study first used EFA to examine the scale’s construct validity because it is a data-driven method that freely explores the structure of the data without imposing any a priori knowledge or theory (Field, 2024a, 2024b). CFA, on the other hand, requires pre-specification of a measurement model (i.e., a representation of the factors and their relationships to the observed items), which allows the researcher to assess how well a hypothesized factor structure fits the observed data; hence, the model that emerged from the EFA was then evaluated by CFA. CFA was conducted using AMOS 26.0 software (Chicago, IL, USA) to assess the model’s goodness of fit (Field, 2024a, 2024b), with a standardized factor loading of 0.3 as the minimum acceptable cut-off (Hair et al., 2014). The following index thresholds were adopted when assessing model fit: χ²/df ≤ 3, TLI > 0.90, CFI > 0.90, IFI > 0.90, NFI > 0.85, RMSEA < 0.08 and SRMR < 0.80 (Hu and Bentler, 1999). Convergent validity, discriminant validity, and composite reliability were all assessed using Fornell and Larcker’s (1981) criterion. Convergent validity was considered supported when the average variance extracted (AVE) was at least 0.5 for each factor. Discriminant validity was considered met when the AVE was greater than the maximum shared squared variance (MSV) for each factor. Composite reliability was considered established at a minimum value of 0.6.
To allow for cross-validation of the final model, the study’s sample was randomly split into two sub-samples using the SPSS Random Split function (Thompson, 2004). EFA was conducted on the first sub-sample, and CFA was conducted on the second.
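As a rough illustration of how the AVE and composite reliability described above are commonly derived from standardized factor loadings, the sketch below uses the usual formulas: AVE is the mean squared loading, and CR is the squared sum of loadings divided by that quantity plus the summed error variances. The function names and the loadings are hypothetical; they are not the loadings from this study, and this is not the AMOS-based procedure the authors used.

```python
import numpy as np

def average_variance_extracted(loadings) -> float:
    """AVE: mean of the squared standardized loadings of one factor."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

def composite_reliability(loadings) -> float:
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    lam = np.asarray(loadings, dtype=float)
    squared_sum = lam.sum() ** 2
    error_variance = (1.0 - lam ** 2).sum()  # assumes standardized loadings
    return float(squared_sum / (squared_sum + error_variance))

# Hypothetical standardized loadings for a three-item factor (illustration only).
hypothetical_loadings = [0.8, 0.7, 0.6]

ave = average_variance_extracted(hypothetical_loadings)
cr = composite_reliability(hypothetical_loadings)
print(f"AVE={ave:.3f}, CR={cr:.3f}")
```

Because AVE averages squared loadings, a factor can reach an acceptable CR while its AVE stays just below 0.5, which is one reason the two criteria are reported separately.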

Ethical Considerations

The Participant Information Sheet (PIS) was posted at the start of the online survey. It stated that by completing and submitting the survey, the participants implied their consent to participate in this study. As members of the research team may have been involved in teaching and academically assessing the potential participants, the participants might have felt vulnerable or coerced. Therefore, the PIS emphasised the voluntary nature of participation and stressed that no person-identifying information would be sought in the survey. Hence, the participants’ identities will always remain anonymous, and their confidentiality will always be preserved. Institutional Review Board (IRB) approval was obtained from each of the three selected research sites.

Results

Demographics

Almost 50% of the circulated online questionnaires were completed (n = 283). Most of the participants (93%, n = 259) were in the 20–24 years age group, which is typical for third- and fourth-year students and nursing interns. More than two-thirds of the participants were female (70%, n = 194). Moreover, 53.4% of the participants (n = 151) came from one academic institution in the Eastern Province. The remainder were recruited from nursing colleges in the Western Province (Jeddah; 33.4%, n = 94) and the Central Province (Riyadh; 13.4%, n = 38). The largest group of participants were third-year nursing students (40%, n = 112), with fourth-year students and nursing interns representing 28% (n = 80) and 32.2% (n = 92) of the sample, respectively. The participants reported the highest score on the Conscientiousness dimension (M = 5.44, SD = 1.2), followed by Openness to Experiences (M = 5.1, SD = 1.2), Agreeableness (M = 4.9, SD = 1.2), Emotional Stability (M = 4.8, SD = 1.2) and Extraversion (M = 4.2, SD = 0.99). Floor and ceiling effects were assessed to examine whether the participants’ responses were excessively clustered at the lower or upper end of the scale, which can reduce the sensitivity of the measure. A threshold of 15% of responses clustered at either extreme of the response scale (strongly disagree or strongly agree) was taken to indicate the presence of a floor or ceiling effect (Fitzpatrick et al., 1998). A ceiling effect was noticed in item 3, where 38.1% of the participants answered “strongly agree” to the “Dependable, self-disciplined” item, and a floor effect was noticed in item 8, where 29.7% of the participants answered “strongly disagree” to the “Disorganized, careless” item. The remaining eight items on the TIPI did not show any floor or ceiling effects, which implies that the scale still captures variability in the participants’ responses and can reasonably support meaningful analysis.
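The floor and ceiling check described above amounts to computing the share of responses at each end of the seven-point scale and comparing it with the 15% threshold. The following Python sketch shows that logic on made-up responses; the function names are hypothetical, and treating the threshold as inclusive (at least 15%) is an assumption, since the text does not say whether the cut-off is strict.

```python
import numpy as np

def extreme_response_shares(responses, scale_min=1, scale_max=7):
    """Return the share of responses at the lowest and highest scale points."""
    values = np.asarray(responses)
    floor_share = float(np.mean(values == scale_min))
    ceiling_share = float(np.mean(values == scale_max))
    return floor_share, ceiling_share

def flag_floor_ceiling(responses, threshold=0.15, scale_min=1, scale_max=7):
    """Flag a floor/ceiling effect when an extreme holds at least `threshold` of responses."""
    floor_share, ceiling_share = extreme_response_shares(responses, scale_min, scale_max)
    return {"floor": floor_share >= threshold, "ceiling": ceiling_share >= threshold}

# Hypothetical responses to a single item (not data from this study).
item_responses = [7, 7, 6, 5, 7, 4, 7, 6, 3, 7, 2, 5, 6, 7, 6, 5, 4, 6, 5, 1]

shares = extreme_response_shares(item_responses)
flags = flag_floor_ceiling(item_responses)
print(shares, flags)
```

Applying the same rule to the reported percentages, 38.1% at “strongly agree” for item 3 and 29.7% at “strongly disagree” for item 8 both exceed 15%, which matches the effects noted above.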

Content Validity

All items on the TIPI had an I-CVI value of 1 except two items (items no. 1 and 7, which both had an I-CVI value of 0.875). This suggests a strong consensus among the expert panel on the relevance of each item. The Instrument level-CVI for the total scale was 0.98, which is considered excellent. Polit and Beck (2022) recommended that perfect agreement among panel experts is required only when there are three to four panel members. Table 1 shows the content validity panel outcomes for the translated Ten-Item Personal Trait Scale.

Table 1.

Raters Voting and Content Validity for the Translated Ten-Item Personal Trait Scale

Item no.	Rater 1	Rater 2	Rater 3	Rater 4	Rater 5	Rater 6	Rater 7	Rater 8	Number of agreements	I-CVI
1	4	4	4	4	4	4	4	1	7	0.875
2	3	4	4	4	4	4	4	4	8	1
3	4	4	4	4	4	4	4	4	8	1
4	4	4	4	4	4	4	4	4	8	1
5	4	4	4	4	4	3	4	3	8	1
6	4	4	4	4	4	4	4	4	8	1
7	4	4	3	4	4	4	4	2	7	0.875
8	4	4	4	4	4	4	4	4	8	1
9	4	4	4	4	4	4	4	3	8	1
10	4	4	4	4	4	4	4	4	8	1

S-CVI/Av 0.975.

S-CVI/UA 0.8.
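The content validity indices in Table 1 can be reproduced from the raters’ scores with a few lines of Python. This is only an illustrative sketch: the ratings are copied from Table 1, while the variable names and the use of NumPy are assumptions rather than the authors’ procedure.

```python
import numpy as np

# Rows are items 1-10, columns are raters 1-8 (ratings copied from Table 1).
ratings = np.array([
    [4, 4, 4, 4, 4, 4, 4, 1],
    [3, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 4, 4, 4, 3, 4, 3],
    [4, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 3, 4, 4, 4, 4, 2],
    [4, 4, 4, 4, 4, 4, 4, 4],
    [4, 4, 4, 4, 4, 4, 4, 3],
    [4, 4, 4, 4, 4, 4, 4, 4],
])

# A rating of 3 or 4 counts as agreement that the item is relevant.
agreements = (ratings >= 3).sum(axis=1)
i_cvi = agreements / ratings.shape[1]

# S-CVI/Ave: mean of the item-level indices.
s_cvi_ave = float(i_cvi.mean())
# S-CVI/UA: proportion of items on which all raters agreed.
s_cvi_ua = float(np.mean(agreements == ratings.shape[1]))

print(i_cvi, round(s_cvi_ave, 3), s_cvi_ua)
```

The only two items that fall short of universal agreement are items 1 and 7, each with one rater scoring below 3, which gives an I-CVI of 7/8 = 0.875 for both.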

Exploratory Factor Analysis (EFA)

To ascertain the construct validity of the TIPI, EFA was initially conducted on the first randomly split sample (n = 157). All ten items on the TIPI were entered into the EFA, using Maximum Likelihood as the extraction method, Varimax as the rotation method, and an eigenvalue cut-off of 1 (Ellis, 2017). Only items with a factor loading of 0.4 or higher were retained. The Kaiser-Meyer-Olkin (KMO) measure was 0.69, and Bartlett’s test of sphericity was significant (p < 0.01), both of which support the factorability of the sample for EFA (Field, 2024a, 2024b). The EFA revealed a two-factor, seven-item solution accounting for 36% of the total variance. Examination of the scree plot showed that the descending curve levelled off after the second factor, supporting a two-factor structure. Three items (items no. 5, 7, and 9) had to be deleted because they loaded poorly on every factor. The five items that loaded on the first factor, Avoidant, Low Functionality, had corrected item-total correlations between 0.31 and 0.65, indicating good coherence among those items. The Cronbach’s alpha coefficient was 0.72, which is considered good, and the Mean IIC was 0.34, further supporting the internal consistency of this subscale (Field, 2024a, 2024b). The second factor, Constructive, High Functionality, had two items loading on it (1R and 6), with a Cronbach’s alpha coefficient of 0.4, which is considered inadequate, most likely due to the small number of items on the subscale (Gosling et al., 2003; Iwasa & Yoshida, 2018; Nunes et al., 2018; Storme et al., 2016). However, the Mean IIC for the second factor was 0.25, and the corrected item-total correlation for both items was also 0.25, which is acceptable and provides a more accurate reliability measure than the Cronbach’s alpha of 0.4 (Table 2). McDonald’s Omega and the Mean IIC for the preliminary seven-item scale were 0.71 and 0.2, respectively, implying acceptable internal consistency.
The Cronbach’s alpha for this two-factor solution was 0.64, suggesting marginally satisfactory reliability on what is a less accurate measure for short scales. As the whole scale had fewer than 10 items, the Mean IIC of 0.20 and McDonald’s Omega of 0.71 are considered to represent the overall scale’s reliability more accurately (Briggs & Cheek, 1986; Pallant, 2020).

Table 2.

Exploratory Factor Analysis of the Seven-Item Arabic Version of the Ten–Item Personality Inventory

Items	Factor loading: Avoidant, Low Functionality	Factor loading: Constructive, High Functionality	Corrected Item-Total Correlation	Alpha (α)	Mean Inter-item Correlation (Mean IIC)
Q8 Disorganized, careless	0.79		0.65	0.72	0.34
Q2 Critical, quarrelsome	0.70		0.54		
Q10 Conventional, uncreative	0.63		0.51		
Q4 Anxious, easily upset	0.50		0.44		
Q3 R Dependable, self-disciplined	0.40		0.31		
Q6 Open to new experiences, complex		0.51	0.25	0.4	0.25
Q1 R Extraverted, enthusiastic		0.44	0.25		

*Overall McDonald’s Omega coefficient is 0.71.

*Overall Cronbach Alpha is 0.64 and Mean IIC 0.20.

**Goodness-of-fit Test: Chi-Square’s p value =0.05, indicating an acceptable fit for the two-factor solution.

Confirmatory Factor Analysis (CFA)

CFA was carried out on the second sample split (n = 126) using AMOS 26.0 (Chicago, IL, USA) to examine the goodness of fit of the revised seven-item, two-factor solution (Ellis, 2017). The CFA showed mixed model fit (χ²/df: 2.6, TLI: 0.85, CFI: 0.91, IFI: 0.91, NFI: 0.86, RMSEA: 0.29, SRMR: 0.08). Standardised factor loadings for all items exceeded the 0.3 threshold (Figure 1).
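To make the "mixed" verdict concrete, the reported indices can be compared one by one with the thresholds listed in the Data Analysis section. The Python sketch below does only that comparison; the dictionary names are hypothetical, and the thresholds are entered exactly as stated in this paper (including SRMR < 0.80) rather than taken from any other source.

```python
import operator

# Thresholds exactly as stated in the Data Analysis section of this paper.
THRESHOLDS = {
    "chi2/df": (operator.le, 3.0),
    "TLI": (operator.gt, 0.90),
    "CFI": (operator.gt, 0.90),
    "IFI": (operator.gt, 0.90),
    "NFI": (operator.gt, 0.85),
    "RMSEA": (operator.lt, 0.08),
    "SRMR": (operator.lt, 0.80),
}

# Fit indices reported for the seven-item, two-factor CFA model.
REPORTED = {
    "chi2/df": 2.6,
    "TLI": 0.85,
    "CFI": 0.91,
    "IFI": 0.91,
    "NFI": 0.86,
    "RMSEA": 0.29,
    "SRMR": 0.08,
}

def evaluate_fit(reported, thresholds):
    """Return a mapping of index name -> True if the stated threshold is met."""
    return {
        name: compare(reported[name], cutoff)
        for name, (compare, cutoff) in thresholds.items()
    }

results = evaluate_fit(REPORTED, THRESHOLDS)
failed = [name for name, met in results.items() if not met]
print("Indices not meeting their threshold:", failed)
```

Under these stated thresholds, the TLI and RMSEA are the indices that fall outside the acceptable range.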

Figure 1.

Confirmatory Factor Analysis (CFA) of the Seven-item Arabic version of the Ten–Item Personality Inventory (TIPI)

The composite reliability was good for the first factor (0.79) but fell below the 0.6 minimum for the second factor (0.36). The AVE was adequate for the first factor (AVE = 0.5), but inadequate for the second factor (AVE = 0.23), so convergent validity was only partially supported. The AVE values for both factors were greater than their corresponding MSV, confirming discriminant validity (Table 3).

Table 3.

Convergent Validity, Discriminant Validity, and Composite Reliability of the Final CFA Model of the Arabic Version of the Ten-Item Personality Inventory (Seven-Item Scale)

Latent variables	CR	AVE	MSV
Avoidant, Low Functionality	0.79	0.5	0.141
Constructive, High Functionality	0.36	0.23	0.141

CR: Composite Reliability. AVE: Average Variance Extracted. MSV: Maximum Shared Variance.
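The decision rules applied to Table 3 can be written out explicitly. The sketch below encodes the three criteria stated in the Data Analysis section (AVE of at least 0.5, AVE greater than MSV, and CR of at least 0.6) and applies them to the tabulated values; the class and method names are hypothetical and serve only to illustrate the logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FactorValidity:
    """CR, AVE and MSV for one latent factor, as reported in Table 3."""
    name: str
    cr: float
    ave: float
    msv: float

    def convergent(self, min_ave: float = 0.5) -> bool:
        # Convergent validity: AVE of at least 0.5.
        return self.ave >= min_ave

    def discriminant(self) -> bool:
        # Discriminant validity: AVE greater than MSV.
        return self.ave > self.msv

    def reliable(self, min_cr: float = 0.6) -> bool:
        # Composite reliability: CR of at least 0.6.
        return self.cr >= min_cr

factors = [
    FactorValidity("Avoidant, Low Functionality", cr=0.79, ave=0.5, msv=0.141),
    FactorValidity("Constructive, High Functionality", cr=0.36, ave=0.23, msv=0.141),
]

summary = {
    f.name: {
        "convergent": f.convergent(),
        "discriminant": f.discriminant(),
        "reliable": f.reliable(),
    }
    for f in factors
}
for name, checks in summary.items():
    print(name, checks)
```

Applied to Table 3, both factors satisfy the discriminant criterion, while only the first factor meets the AVE and CR minimums.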

Discussion

Despite the psychometric challenges associated with developing a very short scale, the TIPI remains one of the most widely used instruments for examining personality in published research. Several studies have translated the TIPI into different languages and examined its psychometric properties, many of which have reported satisfactory properties (Iwasa & Yoshida, 2018; Nunes et al., 2018). However, it is important that the reader reflects on the methodological trade-offs associated with these published studies. Gosling and his colleagues (2003) acknowledged that the TIPI is somewhat inferior to the original multi-item instrument, yet they seemed prepared to trade off a certain level of the psychometric properties of this short scale (i.e. the Cronbach’s alpha reliability coefficient) in favor of preserving the construct validity of the scale and keeping it convenient and user-friendly. In this study, we adopted a balanced approach by aiming to preserve both construct validity and reliability, bearing in mind that longer scales tend to have better psychometric properties (Field, 2024a, 2024b; Ziegler et al., 2014). This proved challenging, particularly when using a short scale with fewer than ten items.

In this study, the demographic characteristics were largely consistent with those of nursing student populations reported in previous studies: the majority were young adults (20–24 years), female, and in their third or fourth year of study. These demographic characteristics may have influenced the results, as previous research reported that younger nursing students often score higher on traits such as openness and extraversion, while female students tend to report higher agreeableness and conscientiousness compared to male peers (Durmaz & Tastan, 2022; Salem et al., 2024). Given that most participants were female and at an early stage in their professional development, the observed two-factor solution may reflect not only linguistic and cultural adaptation issues but also the developmental context of the sample. This aligns with a previous study emphasising the role of age, gender, and educational stage in shaping personal expression among nursing students (Berdida & Grande, 2023).

It is clear that the new two-factor solution in this study did not match the original five-factor construct developed by Gosling et al. (2003), and in this respect, the Arabic version has some limitations related to its internal structure. When it comes to assessing the structural validity of the TIPI, the literature provides mixed results. Several studies have failed to establish the five-factor solution in Chinese (Shi et al., 2022), Dutch (Hofmans et al., 2008) and Croatian (Vorkapić, 2016), which is consistent with the misfit with the five-factor solution found in this study. Several other studies, however, have reported five-factor solutions of the TIPI (albeit at an acceptable level), such as in Bangla (Islam, 2019), Portuguese (Nunes et al., 2018) and Norwegian (Thørrisen et al., 2021). In the context of this study, one explanation for such a mismatch may be that the characteristics of the original sample in the Gosling et al. (2003) study differ markedly from those of the sample used in this study. Another explanation may be related to the translation process, which may have allowed the phraseology used in some items to be lexically related to other scale dimensions. However, the expert panel comprised eight raters, and the I-CVI and Instrument-CVI values were relatively high, which makes the translation process less likely to be the underlying reason, although it cannot be completely ruled out. Previous validation studies have reported similar challenges. For example, Hofmans et al. (2008) found that when the TIPI was translated into Dutch (TIPI-d), the EFA revealed a three-factor solution, mismatching the intended five-factor structure. When five descriptors in the first version were adjusted, a five-factor solution resulted, but even that did not fully capture the five-factor scale scores, because the correlations between the openness scale scores for the Dutch TIPI (version 2) and the respective facets as measured by the NEO-PI-R were negative and moderate in magnitude. A subsequent study that examined the psychometric properties of the Portuguese version of the TIPI reported a five-factor structure, but the item “Disorganised, careless” did not load sufficiently on the Agreeableness factor as advocated by the original TIPI (Nunes et al., 2018).

The reliability of the proposed scale was acceptable. Short scales are commonly reported to have low reliability values (Ziegler et al., 2014). We therefore used a previously advocated measure, the mean inter-item correlation (Mean IIC), which provides an alternative method for assessing the internal consistency of short scales (i.e. ten items or fewer). The authors have used this measure successfully in a previously published psychometric evaluation (Mansour et al., 2021), and other TIPI validation studies have also used the Mean IIC to examine reliability (Hanif, 2018). The authors of the original TIPI stated very clearly that the goal of developing the TIPI was to create a very short instrument with optimised validity (including content validity), not an instrument with high alphas and good CFA fits; the scale’s reliability was traded off in favour of its validity (Gosling et al., 2003). This paper adopted a pragmatic approach by attempting to present an instrument with “balanced” validity and reliability measures, measuring reliability using the Mean IIC while acknowledging the challenges associated with the model’s goodness of fit. More specifically, the confirmatory factor analysis in this study showed mixed fit indices: while most indices indicated acceptable fit (i.e. χ²/df, CFI, IFI, SRMR), the TLI and RMSEA values did not meet the recommended thresholds, suggesting inadequate fit. Notably, while the original authors focused on other measures, the Mean IIC provides valuable alternative evidence of reliability for short scales, particularly when it is not feasible to verify the reliability of the tool using a test-retest procedure, as was the case in this study.
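The relationship between the Mean IIC and alpha for a short scale can be illustrated with a minimal sketch. It assumes item responses are stored as one list of scores per item; the function names and the toy data are hypothetical and are not taken from the study. The last function is the standardised (Spearman-Brown) form of alpha, which shows the alpha implied by a given mean inter-item correlation and number of items.

```python
from itertools import combinations
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def mean_inter_item_correlation(items):
    """Average Pearson correlation over all distinct pairs of items."""
    return mean(pearson(a, b) for a, b in combinations(items, 2))


def cronbach_alpha(items):
    """Cronbach's alpha from a list of item score columns."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sums
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def standardized_alpha(k, mean_iic):
    """Alpha implied by k items with a given mean inter-item correlation."""
    return k * mean_iic / (1 + (k - 1) * mean_iic)


# Hypothetical responses: three items, six respondents (illustration only).
toy_items = [
    [5, 6, 4, 7, 3, 5],
    [4, 6, 5, 6, 2, 5],
    [6, 5, 3, 7, 4, 4],
]
print(round(mean_inter_item_correlation(toy_items), 2))
print(round(cronbach_alpha(toy_items), 2))

# Seven items with a mean IIC of 0.20, as reported for this scale.
print(round(standardized_alpha(7, 0.20), 2))  # 0.64
```

For seven items with a mean IIC of 0.20, the standardised formula gives approximately 0.64, which is in line with the alpha reported for the overall scale and illustrates why a short scale with modest inter-item correlations is unlikely to reach a high alpha.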

The convergent validity proved to be inadequate for the second factor. This can be explained by the relatively low factor loadings for items 1R and 3R (0.38 and 0.34, respectively), although both exceeded the standard threshold of 0.3. Standardised factor loadings below 0.5 in CFA can influence the average variance extracted (AVE) and, consequently, convergent and discriminant validity (Hair et al., 2014). A recent scoping review examined the psychometric properties of the TIPI in terms of its validity (convergent and structural) and two aspects of reliability (internal consistency and test–retest reliability) across different languages, and found that the TIPI was characterised by certain psychometric shortcomings, demonstrating mixed results for convergent and structural validity and inadequate internal consistency. However, the review also emphasised the acute need to trade off some psychometric properties against survey length (Thørrisen & Sadeghi, 2023).
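The effect of low loadings on the AVE can be seen in a short sketch using the conventional formulas (AVE as the mean of the squared standardised loadings, composite reliability as the squared sum of loadings over itself plus the summed error variances). The loadings 0.38 and 0.34 are those reported above; the third loading of 0.80 and the function names are hypothetical and included only for illustration, as is the commonly used AVE cut-off of 0.5.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardised loadings of one factor."""
    return sum(lam ** 2 for lam in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_variance)


# 0.38 and 0.34 are the loadings reported for items 1R and 3R;
# 0.80 is a hypothetical third loading added only for illustration.
factor_two = [0.38, 0.34, 0.80]

ave = average_variance_extracted(factor_two)
cr = composite_reliability(factor_two)
print(f"AVE = {ave:.2f}, CR = {cr:.2f}")

# Assumed conventional cut-off: AVE of at least 0.5 for convergent validity.
print("AVE meets 0.5" if ave >= 0.5 else "AVE below 0.5")
```

Even with one strong hypothetical loading, the two low loadings pull the AVE down to about 0.30, well below 0.5, because each item contributes only its squared loading (about 0.14 and 0.12) to the average.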

Strengths and Limitations

Previous studies have used the correlation coefficient between the TIPI and well-known personality scales that measure a similar construct, such as the NEO-PI-R, to report convergent validity (Halama et al., 2020; Iwasa & Yoshida, 2018). Along with the EFA, this procedure can provide further evidence of the underlying construct validity of the scale, where simple correlation coefficients convey both common factors and unique variances. Test-retest reliability has often been used to establish scale reliability in previous TIPI validation studies. Due to logistical and time constraints, neither correlation of the new scale with another personality scale nor test-retest reliability was used in this study. Future validation studies of the Arabic version of the TIPI may need to consider using correlation coefficients with the NEO-PI-R and test-retest reliability.

In the final model, three of the five negatively worded items eventually had to be deleted. This implies that the wording of the scale may need to be reconsidered to better reflect the underlying construct; the deletion of three negatively worded items might also suggest cultural or linguistic nuances in how “reverse-scored” traits are perceived in Arabic. Therefore, an alternative translation framework with different sample demographic characteristics may also need to be considered, to examine whether the lexical components of the translation affect the dimensionality of the scale. The data were collected in 2019 and are therefore over 5 years old. In addition, the study relied on students’ recollections, which may be imperfect due to the passage of time. However, little research has examined the TIPI, and personality traits are arguably relatively stable over time, which helps mitigate the 5-year gap. Therefore, the findings from this study still serve as a platform for other researchers to utilise and build upon. Moreover, the sample was drawn from three academic institutions, which may have limited the generalisability of the findings. However, these institutions were in three geographically diverse locations in Saudi Arabia (East, Middle and West regions), which helped to provide diverse insights from a range of participants.

Implications for Practice

Evaluating the psychological well-being of nursing staff and wider health care professionals has increasingly become a critical component of building a resilient nursing workforce. Researchers and practitioners can now draw on the findings of this study to further test, measure and interpret the construct validity and reliability of the Arabic version of the TIPI, thus informing decision-making and monitoring practice outcomes over an extended period of time. Moreover, the TIPI can have a valuable clinical application in nursing practice, for example by providing a platform for rapid screening to understand both patient responses and staff dynamics. This can help build a personality profile of target individuals with specific health problems, leading to more tailored nursing care delivery.

Conclusion

The TIPI remains the most widely used short scale to examine the Big Five personality traits. Although the ability of the scale to fully capture the intended psychological facets has been disputed, it is arguably a very convenient research tool when the focus of the research is not the personality traits per se. Our newly proposed tool showed some limitations in terms of convergent and structural validity compared with the original TIPI, with an acceptable reliability measure. An additional validation study is needed to further consolidate the psychometric properties of the Arabic version of the TIPI, taking into consideration the balanced approach to examining validity and reliability adopted in this study.

Supplemental Material

Supplemental Material - Assessment of the Validity and Reliability of the Arabic Version of the Ten-Item Personality Inventory (TIPI) Among Undergraduate Nursing Students and Interns in Saudi Arabia

Supplemental Material for Assessment of the Validity and Reliability of the Arabic Version of the Ten-Item Personality Inventory (TIPI) Among Undergraduate Nursing Students and Interns in Saudi Arabia by Mansour Mansour, Ahmad Alafafsheh, Abd Alhadi Hasan and Afnan Alswyan in Sage Open Nursing.

Footnotes

Acknowledgment

We would like to thank the nurses and the expert panel members who participated in the study.

ORCID iDs

Mansour Mansour

Ahmad Alafafsheh

Abd Alhadi Hasan

Ethical Considerations

All relevant Institutional Review Board (IRB) approvals were secured prior to the commencement of data collection. All methods were performed in accordance with the relevant guidelines and regulations (e.g. the Declaration of Helsinki). Three IRB permissions were secured before data collection commenced at each site: Imam Abdulrahman Bin Faisal University’s Institutional Review Board Committee (IRB No: IRB -2018- 04–319); Al-Ghad International Colleges for Applied Medical Sciences’ Institutional Review Board Committee, Riyadh; and Fakeeh College for Medical Sciences’ Institutional Review Board Committee, Jeddah (IRB No: 24/IRB/201).

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request, subject to the approval of the relevant Research Ethics Committees. Requests can be made to the corresponding author, Dr. Mansour Mansour. Contact email: mansour.mansour@actvet.gov.ae.

Supplemental Material

Supplemental material for this article is available online.

References

Akhtar (2018). Translation and validation of the Ten-Item Personality Inventory (TIPI) into Bahasa Indonesia. International Journal of Research Studies in Psychology, 7(2), 59–69. https://doi.org/10.5861/ijrsp.2018.3009

Akhtar, Thyagaraj, & Das (2018). The impact of social influence on the relationship between personality traits and perceived investment performance of individual investors. International Journal of Managerial Finance, 14(1), 130–148. https://doi.org/10.1108/IJMF-05-2016-0102

Almanasreh, Moles, & Chen (2019). Evaluation of methods used for estimating content validity. Research in Social and Administrative Pharmacy, 15(2), 214–221. https://doi.org/10.1016/j.sapharm.2018.03.066

Atak, Kapçı, & Çok (2013). Evaluation of the Turkish version of the multi-measure agentic personality scale. Dusunen Adam The Journal of Psychiatry and Neurological Sciences, 1(26), 36–45. https://doi.org/10.5350/DAJPN2013260104

Beaton, Bombardier, Guillemin, & Ferraz (2000). Guidelines for the process of cross-cultural adaptation of self-report measures. Spine, 25(24), 3186–3191. https://doi.org/10.1097/00007632-200012150-00014

Berdida & Grande (2023). Nursing students’ nomophobia, social media use, attention, motivation, and academic performance: A structural equation modelling approach. Nurse Education in Practice, 70(70), 103645. https://doi.org/10.1016/j.nepr.2023.103645

Briggs & Cheek (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54(1), 106–148. https://doi.org/10.1111/j.1467-6494.1986.tb00391.x

Chiorri, Bracco, Piccinno, Modafferi, & Battini (2015). Psychometric properties of a revised version of the Ten Item Personality Inventory. European Journal of Psychological Assessment, 31(2), 109–119. https://doi.org/10.1027/1015-5759/a000215

Conley (1985). Longitudinal stability of personality traits: A multitrait–multimethod–multioccasion analysis. Journal of Personality and Social Psychology, 49(5), 1266–1282. https://doi.org/10.1037//0022-3514.49.5.1266

Costa, P. T., & McCrae (2008). The revised NEO Personality Inventory (NEO-PI-R). The SAGE handbook of personality theory and assessment, 2(2), 179–198.

Costa, P. T., & McCrae (2012). Major contributions to the psychology of personality. In Hans Eysenck: Consensus and Controversy (pp. 65–74). Routledge.

Dumi, O’Neill, Daskalopoulou, Keeley, Rhoten, Sauriyal, & Fromy (2025). The impact of different data handling strategies in exploratory and confirmatory factor analysis of diary measures: an evaluation using simulated and real-world asthma nighttime symptoms diary data. Journal of Biopharmaceutical Statistics, 35(5), 944–968. https://doi.org/10.1080/10543406.2024.2310312

Durmaz & Tastan (2022). Analyzing the relationship between the personality traits of nursing students and their attitudes toward people with mental illnesses. Perspectives in Psychiatric Care, 58(4), 2481–2488. https://doi.org/10.1111/ppc.13083

Ellis, J. L. (2017). Factor analysis and item analysis (pp. 11-59). Available from https://www.applyingstatisticsinbehaviouralresearch.com/documenten/factor_analysis_and_item_analysis_version_11_.pdf

Farag, Abde El-Tawab, Fayed, & Abdel Mageed (2025). Perspectives, Attitudes, and Personality Traits of Maternity Nursing Students Toward the Use of Artificial Intelligence in Education. Egyptian Journal of Health Care, 16(2), 770–781. https://doi.org/10.21608/ejhc.2025.435427

Field (2024a). Discovering statistics using IBM SPSS statistics. Sage Publications Limited.

Field (2024b). Discovering statistics using IBM SPSS statistics. Sage Publications.

Fitzpatrick, Davey, Buxton, M. J., & Jones, D. R. (1998). Evaluating patient-based outcome measures for use in clinical trials. Health Technology Assessment, 2(14). https://doi.org/10.3310/hta2140

Fornell & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50. https://doi.org/10.1177/002224378101800104

Gjersing, Caplehorn, J. R., & Clausen (2010). Cross-cultural adaptation of research instruments: language, setting, time and statistical considerations. BMC Medical Research Methodology, 10(1), 1–10. https://doi.org/10.1186/1471-2288-10-13

Goldberg (1993). The structure of phenotypic personality traits. American Psychologist, 48(1), 26–34. https://doi.org/10.1037/0003-066x.48.1.26

Gosling, Rentfrow, & Swann (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37(6), 504–528. https://doi.org/10.1016/S0092-6566(03)00046-1

Hair, Black, W. C., Babin, & Anderson (2014). Multivariate data analysis (7th ed.). Pearson.

Halama, Kohút, Soto, & John (2020). Slovak adaptation of the Big Five Inventory (BFI-2): Psychometric properties and initial validation. Studia Psychologica, 62(1), 74–87. https://doi.org/10.31577/sp.2020.01.79

Hanif (2018). Translation and validation of the ten-item personality inventory (TIPI) into Bahasa Indonesia. International Journal of Research, 7(2), 59–69.

Hastie & Kumar, P. A. (1979). Person memory: Personality traits as organizing principles in memory for behaviors. Journal of Personality and Social Psychology, 37(1), 25–38. https://doi.org/10.1037/0022-3514.37.1.25

Hofmans, Kuppens, & Allik (2008). Is short in length short in content? An examination of the domain representation of the Ten Item Personality Inventory scales in Dutch language. Personality and Individual Differences, 45(8), 750–755. https://doi.org/10.1016/j.paid.2008.08.004

Hu & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118

Ibrahim & Elhabashy (2025). Relationship between academic performance, personality traits, and anxiety level among Egyptian undergraduate nursing students: a correlational research study. BMC Nursing, 24(1), 115. https://doi.org/10.1186/s12912-025-02697-7

Islam (2019). The Big Five model of personality in Bangladesh: examining the ten-item personality inventory. Psihologija, 52(4), 395–412. https://doi.org/10.2298/psi181221013i

Iwasa & Yoshida (2018). Psychometric evaluation of the Japanese version of Ten-Item Personality Inventory (TIPI-J) among middle-aged, and elderly adults: Concurrent validity, internal consistency and test–retest reliability. Cogent Psychology, 5(1), 1426256. https://doi.org/10.1080/23311908.2018.1426256

Kalkbrenner, M. T. (2024). Choosing between Cronbach’s coefficient alpha, McDonald’s coefficient omega, and coefficient H: Confidence intervals and the advantages and drawbacks of interpretive guidelines. Measurement and Evaluation in Counselling and Development, 57(2), 93–105. https://doi.org/10.1080/07481756.2023.2283637

Kline (2011). Convergence of structural equation modeling and multilevel modeling. In Williams & Vogt (Eds.), The SAGE handbook of innovation in social research methods (pp. 562–589). Sage. https://doi.org/10.4135/9781446268261.n31

Kumaranayake (2017). Review of the current status of the studies on personality traits. International Journal of Applied Research, 3(11), 38–45.

Laguna, Bak, Purc, Mielniczuk, & Oles (2014). Short measure of personality TIPI-P in a Polish sample. Roczniki Psychologiczne, 17(2), 421–437.

Mansour, Hasan, A. A., & Alafafsheh (2021). Psychometric evaluation of the Arabic version of the Irish Assertiveness Scale among Saudi undergraduate nursing students and interns. PLOS One, 16(8), e0255159. https://doi.org/10.1371/journal.pone.0255159

Metzer, De Bruin, & Adams (2014). Examining the construct validity of the Basic Traits Inventory and the Ten-Item Personality Inventory in the South African context. SA Journal of Industrial Psychology, 40(1), 1–9. https://doi.org/10.4102/sajip.v40i1.1005

Moreljwab, Mokhtar, Idress, Mohamed, Alanazi, Hassan, & Adam, K. M. (2025). Challenges and Difficulties During the Nursing Internship Program Using 5 Domains: A Cross-Sectional Study. Advances in Medical Education and Practice, 16, 341–355. https://doi.org/10.2147/AMEP.S466735

Muck, Hell, & Gosling (2007). Construct validation of a short five-factor model instrument. European Journal of Psychological Assessment, 23(3), 166–175. https://doi.org/10.1037/t07017-000

Nunes, Limpo, Lima, C. F., & Castro, S. L. (2018). Short scales for the assessment of personality traits: Development and validation of the Portuguese Ten-Item Personality Inventory (TIPI). Frontiers in Psychology, 9, 461. https://doi.org/10.3389/fpsyg.2018.00461

Pallant (2020). SPSS survival manual (7th ed.). McGraw-Hill Education (UK).

Pedhazur (1997). Multiple regression in behavioral research (3rd ed.). Harcourt Brace.

Polit, D. F., & Beck, C. T. (2022). Essentials of Nursing Research: Appraising Evidence for Nursing Practice. Wolters Kluwer.

Renau, Oberst, Gosling, Rusiñol, & Chamarro (2013). Translation and validation of the ten-item-personality inventory into Spanish and Catalan. Aloma: Revista de Psicologia, Ciències de l’Educació i de l’Esport, 31(2), 85–97.

Salem, El-Gazar, Mahdy, Alharbi, & Zoromba (2024). Nursing Students’ Personality Traits and Their Attitude toward Artificial Intelligence: A Multicenter Cross‐Sectional Study. Journal of Nursing Management, 2024(1), 6992824. https://doi.org/10.1155/2024/6992824

Sammut, Griscti, & Norman (2021). Strategies to improve response rates to web surveys: a literature review. International Journal of Nursing Studies, 123, 104058. https://doi.org/10.1016/j.ijnurstu.2021.104058

Shi & Chen (2022). Assessing the psychometric properties of the Chinese version of the ten-item personality inventory (TIPI) among medical college students. Psychology Research and Behavior Management, 15, 1247–1258. https://doi.org/10.2147/PRBM.S357913

Spielberger, Gorsuch, Lushene, Vagg, & Jacobs (1983). Manual for the State-Trait Anxiety Inventory. Consulting Psychologists Press.

Storme, Tavani, & Myszkowski (2016). Psychometric properties of the French ten-item personality inventory (TIPI). Journal of Individual Differences, 37(2), 81–87. https://doi.org/10.1027/1614-0001/a000204

Thompson (2004). Exploratory and confirmatory factor analysis. American Psychological Association.

Thørrisen, M. M., & Sadeghi (2023). The Ten-Item Personality Inventory (TIPI): a scoping review of versions, translations and psychometric properties. Frontiers in Psychology, 14, 1202953. https://doi.org/10.3389/fpsyg.2023.1202953

Thørrisen, M. M., Sadeghi, & Wiers-Jenssen (2021). Internal consistency and structural validity of the Norwegian translation of the ten-item personality inventory. Frontiers in Psychology, 12, 723852. https://doi.org/10.3389/fpsyg.2021.723852

Tsiara, Bakalis, V. I., Toska, Zyga, Stathoulis, J. D., Albani, E. N., Fradelos, E. C., Togas, & Agraniotis (2025). The Role of Personality Traits in Nursing Students’ Attitudes Toward Artificial Intelligence. Cureus, 17(2), Article e78847. https://doi.org/10.7759/cureus.78847

Vorkapić (2016). Ten Item Personality Inventory: A validation study on a Croatian adult sample. The European Proceedings of Social & Behavioral Sciences, 4, 192–202. https://doi.org/10.15405/epsbs.2016.05.20

Ziegler, Kemper, & Kruyen (2014). Short scales–Five misunderstandings and ways to overcome them. Journal of Individual Differences, 35(4), 185–189. https://doi.org/10.1027/1614-0001/a00014
