Abstract
Background:
The Geriatric Depression Scale (GDS) is widely used to screen for depression in older adults, yet culturally adapted brief forms require validation.
Objective:
This study aimed to evaluate the validity and reliability of the Persian Five-Item Geriatric Depression Scale (GDS-5) to ensure its suitability for accurate and efficient depression screening among Iranian older adults.
Methods:
This cross-sectional study evaluated the psychometric properties of the Persian Five-Item GDS (GDS-5) in 200 Iranian older adults. Structural validity was examined using Exploratory Factor Analysis (EFA). Concurrent validity, test–retest reliability, and diagnostic accuracy were assessed using standard correlation and ROC analyses.
Results:
EFA supported a coherent one-factor structure. The GDS-5 showed strong concurrent validity with the CES-D (r = .71, p < .001) and excellent test–retest reliability (ICC = 0.809, p < .001). ROC analysis demonstrated high diagnostic accuracy (AUC = 0.924). At the optimal cutoff, sensitivity was 0.98 and specificity was 0.77.
Conclusion:
Overall, the present study demonstrates that the Persian version of the GDS-5 is a valid, reliable, and diagnostically accurate screening instrument for depressive symptoms among Iranian older adults. Future research should focus on validating the instrument in diverse settings (e.g., primary care clinics, long-term care facilities) and utilizing a wider sample with diverse sociodemographic characteristics to improve generalizability.
This study provides the first validation of the Persian Five-Item Geriatric Depression Scale (GDS-5), supporting its psychometric robustness for use among Iranian older adults.
Findings confirm that the GDS-5 demonstrates a unidimensional structure, strong concurrent validity with established depression measures, and high reliability over time.
The study contributes to cross-cultural gerontological research by establishing an efficient and culturally adapted tool for depression screening in non-Western older populations.
Applications of study findings:
The validated Persian GDS-5 enables rapid, accurate identification of depressive symptoms in older adults within clinical, community, and primary care settings.
Policymakers and practitioners can incorporate the tool into national mental health screening programs to enhance early detection and intervention among aging populations.
Background
According to the WHO, 14% of people aged 60 years and older suffer from a mental disorder, with depression being one of the most common conditions (World Health Organization, 2023). As the global population of older adults is projected to increase by 40% from 2020 to 2030, the burden of mental health issues in this age group is expected to rise substantially (Fei et al., 2025; Jafari et al., 2021; World Health Organization, 2023). Depression is often underdiagnosed and undertreated due to social stigma, misconceptions, unusual manifestations, and overlap with other medical conditions. Undiagnosed and untreated depression may result in impaired cognitive and physical functioning, increased risk of suicide, diminished quality of life, and increased mortality rates in older adults (Rapaport et al., 2005; Wang et al., 2020). Therefore, the early detection of depression plays a crucial role in promoting mental well-being in older adults.
Various screening tools, such as the Patient Health Questionnaire (PHQ) and Beck Depression Inventory (BDI), have been validated for detecting depression in older adults (Smarr & Keefer, 2011). Among these, the Geriatric Depression Scale (GDS) has gained prominence in recent years due to its specificity for the older population (Jongenelis et al., 2005; Krishnamoorthy et al., 2020).
The original 30-item GDS has been translated into numerous languages, and its strong psychometric properties have been consistently confirmed across different cultures (Benedetti et al., 2018; Thomas et al., 2021). Nevertheless, lengthy questionnaires—particularly when administered alongside comprehensive surveys or cognitive assessments—can be burdensome for older individuals. This underscores the importance of brief screening instruments that maintain acceptable levels of validity and reliability. In response to this need, multiple abbreviated versions of the GDS have been developed and psychometrically evaluated.
The Five-Item Geriatric Depression Scale (GDS-5), a condensed version derived from the GDS-15, offers a practical solution for efficient and rapid depression screening in older adults (Hoyl et al., 1999). The validity and reliability of this tool have been supported by several studies (Gokcekuyu et al., 2022; Li et al., 2021); however, cultural and demographic factors may influence both the expression of depressive symptoms and their interpretation, highlighting the necessity for localized validation efforts.
In Iran, the validity and reliability of the 15- and 11-item Persian versions of the GDS have been confirmed and are widely utilized in research (Malakouti et al., 2006). However, the short 5-item Persian version—comprising items 1, 4, 8, 9, and 12—has not yet undergone formal psychometric evaluation.
Given high patient volumes and limited mental health resources in primary care, a concise screening tool is vital for early detection of depression in older adults. While the full GDS remains necessary for definitive diagnosis, short forms efficiently identify those needing further evaluation. This study aims to assess the validity and reliability of the Persian GDS-5 among Iranian older adults.
Methods
Participants
This validation study employed a cross-sectional design and was conducted among older adults residing in Tehran, Iran. The inclusion criteria for all participants were: an age of 60 years or older, residency in Tehran, fluency in Persian, and the cognitive ability to understand and reliably complete self-report questionnaires.
A total of 200 older adults were recruited from the general population using convenience sampling. To achieve a diverse representation of the urban community, recruitment was distributed across five distinct geographical areas of Tehran, targeting accessible public venues such as parks and community centers.
The entire evaluation process was conducted through face-to-face interviews. A trained research team member administered the study instruments to ensure consistency. To standardize the procedures and maximize the reliability of all psychometric measures, the interviewer underwent a standardized training session led by the study psychiatrist prior to data collection. Coordination with the study psychiatrist was essential for the clinical group: following the diagnostic evaluation and confirmation via SCID-I, eligible patients who agreed to participate were referred to the research team.
Ethical Considerations
The study adhered to the ethical guidelines outlined in the Declaration of Helsinki for non-interventional clinical research. All participants were fully informed about the aims, procedures, and methods of the study and provided written informed consent before enrollment. Confidentiality of personal information was maintained throughout the study. Depressed individuals who required treatment were referred to mental health professionals. The study was conducted in accordance with ethical standards and was approved by the institutional ethics committee in Iran.
Measurements
Geriatric Depression Scale (GDS)
The GDS is one of the most widely used tools for assessing depression in older adults. The original version consists of 30 yes/no items and has been reported to have a Cronbach’s alpha of .94 and a test-retest reliability of .85. Its concurrent validity with the Zung and Hamilton depression scales has been reported as 0.84 and 0.95, respectively (Yesavage et al., 1982). Shorter versions of the GDS (with 4, 5, 10, 12, and 15 items), and even a single-item version, have also been developed and validated.
Yesavage and Sheikh first published the 15-item version in 1986, and it showed high correlation with the original scale (Jafari et al., 2021). Malakouti et al. translated it into Persian, and they confirmed its validity and reliability among Iranian older adults.
Hoyl et al. developed the 5-item version (GDS-5) in 1999. It demonstrated a sensitivity of 0.97 (compared to 0.94 for the GDS-15), specificity of 0.85 (vs. 0.83), positive predictive value of 0.85 (vs. 0.82), negative predictive value of 0.97 (vs. 0.94), and overall accuracy of 0.90 (vs. 0.88) in predicting depression (Hoyl et al., 1999).
CES-D Questionnaire
The CES-D is a 20-item self-report questionnaire that evaluates an individual’s psychological state over the past week. Responses are scored as yes/no. Because older respondents can find the full questionnaire difficult to complete, a 10-item version has been proposed. This shorter version includes three items for depression, four for somatic complaints, two for well-being, and one for irritability.
The Persian version of the CES-D-10 has demonstrated acceptable psychometric properties, with internal consistency, split-half reliability, and test-retest reliability reported as 0.85, 0.65, and 0.49, respectively, and sensitivity and specificity as 0.82 and 0.70 (Malakouti et al., 2015).
Structured Clinical Interview for DSM-IV-TR (SCID-IV)
The SCID-IV is a structured clinical interview based on the DSM-IV-TR criteria, which is used to evaluate different psychiatric disorders. The Persian version of the SCID-IV has demonstrated acceptable psychometric properties in previous studies for clinical and research uses (Sharifi et al., 2004). It demonstrated a sensitivity of 0.64 and a specificity of 0.89. Internal consistency and test-retest reliability of depression diagnoses were α = .99 and r = .79, respectively, which indicates good reliability.
Procedure
Following the initial screening, demographic data were collected from each participant, including age, sex, marital status, educational level, and living situation.
Depression was assessed using items 1, 4, 8, 9, and 12 of the translated 15-item Geriatric Depression Scale (GDS; Malakouti et al., 2015). All participants completed both the GDS-5 and the CES-D questionnaires to assess the concurrent validity of the GDS-5.
The SCID-IV diagnosis served as the diagnostic gold standard for evaluating the discriminative validity of the GDS-5 through receiver operating characteristic (ROC) analysis. Construct validity was evaluated using the SCID-IV questionnaire. The optimal cutoff point for the GDS-5 was determined using ROC curve analysis, identifying the score with the highest sensitivity and specificity for distinguishing between depressed and non-depressed individuals. To assess reliability, 30 elderly individuals from the general population completed the GDS-5 twice, with a 2-week interval. The tool’s internal consistency was evaluated using Cronbach’s alpha.
Statistical Analyses
Descriptive statistics were computed, presenting quantitative variables as mean ± standard deviation (SD) and categorical variables as frequencies and percentages. The structural validity of the Persian version of the GDS-5 was assessed using Exploratory Factor Analysis (EFA; Adawi et al., 2018). Concurrent validity was evaluated using the Pearson correlation coefficient (r) between the GDS-5 and CES-D scores (Mehdizadeh et al., 2025). Correlation strength was interpreted as weak (r < .4), moderate (r = .4–.7), or strong (r > .7; Schober et al., 2018).
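The correlation step and the interpretation bands above can be illustrated with a minimal sketch. The scores below are hypothetical and are not study data, and applying the bands to the absolute value of r is an assumption made here for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_strength(r):
    """Label r with the bands used in the text: weak < .4, moderate .4-.7, strong > .7.
    Using abs(r) is an assumption so that negative correlations are labelled too."""
    size = abs(r)
    if size < 0.4:
        return "weak"
    if size <= 0.7:
        return "moderate"
    return "strong"

# Hypothetical GDS-5 and CES-D totals for six respondents (illustration only).
gds5_scores = [0, 1, 1, 2, 4, 5]
cesd_scores = [1, 2, 3, 3, 7, 9]
r = pearson_r(gds5_scores, cesd_scores)
print(round(r, 2), correlation_strength(r))

# A coefficient of .71 falls in the "strong" band under these cut points.
print(correlation_strength(0.71))
```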
The internal consistency of the GDS-5 was assessed using Cronbach’s alpha (α), with values greater than .70 considered indicative of acceptable reliability (Tavakol & Dennick, 2011). Test–retest reliability was assessed using a paired-sample t-test and the Intraclass Correlation Coefficient (ICC; Ayosanmi et al., 2022). The ICC was estimated using a two-way random effects model with a 95% confidence interval (Terwee et al., 2007), and values above 0.70 were considered to indicate high reliability. The standard error of measurement (SEM) was also calculated as SEM = SDpooled × √(1 − ICC).
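For readers less familiar with Cronbach's alpha, the following sketch shows the standard computation from item-level responses. The response matrix is hypothetical and is not study data, and the function name is illustrative; the authors ran their analyses in SPSS.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical yes/no (1/0) answers of six respondents to five items.
responses = [
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
# Compare against the .70 threshold for acceptable reliability cited in the text.
print(round(alpha, 2), "acceptable" if alpha > 0.70 else "below threshold")
```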
To evaluate diagnostic accuracy in distinguishing participants with and without depression (based on DSM-IV criteria), a Receiver Operating Characteristic (ROC) analysis was performed (Soltaninejad et al., 2025). The Area Under the Curve (AUC) and its 95% confidence interval were calculated, and Youden’s index (J) was applied to determine the optimal cutoff value maximizing the balance between sensitivity and specificity (Nahm, 2022). All statistical analyses were conducted using SPSS version 22, with significance set at p < .05.
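The cutoff selection described above can be sketched as follows. The scores and diagnostic labels are hypothetical, and treating a score at or above the cutoff as a positive screen is an assumption made for this illustration.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, J) maximizing J = sens + spec - 1.
    labels: 1 = depressed per the reference standard, 0 = not depressed.
    A score >= cutoff is treated as a positive screen (assumption)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best

# Hypothetical screening totals (0-5) and reference diagnoses for ten people.
scores = [0, 0, 1, 1, 1, 2, 2, 3, 4, 5]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
cutoff, sens, spec, j = youden_optimal_cutoff(scores, labels)
print(cutoff, round(sens, 2), round(spec, 2), round(j, 2))
```

When several cutoffs tie on J, this sketch keeps the lowest one; a real analysis might instead weigh sensitivity more heavily for a screening instrument.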
Results
Two hundred older adults with a mean age of 69.06 ± 6.41 years participated in the study. The majority of participants were male (53%), married (71.5%), had obtained an education diploma or higher (56.5%), and lived with their families (68.5%). Demographic characteristics of the study population are presented in Table 1.
Table 1. Demographic characteristics of the elderly participants.
Factor Analysis
The suitability of the data for factor analysis was assessed using the Kaiser-Meyer-Olkin (KMO) index and Bartlett’s test of sphericity.
The KMO index value was 0.755, confirming the adequacy of the sample. Additionally, Bartlett’s test was statistically significant (p < .001), indicating that the correlation matrix was appropriate for factor analysis.
Principal component analysis extracted one factor with an eigenvalue greater than one (Eigenvalue = 2.555). This single factor accounted for 51.09% of the total variance. As shown in Table 2, all items loaded on this factor, with loadings ranging from 0.459 to 0.838. These findings support the unidimensional structure of the GDS-5 in this population.
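The link between the eigenvalue and the share of variance explained can be checked with a short calculation. This sketch assumes the analysis was run on the correlation matrix of the five items, so that total variance equals the number of items; the source does not state this explicitly.

```python
def percent_variance_explained(eigenvalue, n_items):
    """Share of total variance for one component, assuming standardized items
    (total variance = number of items)."""
    return 100 * eigenvalue / n_items

eigenvalue = 2.555   # reported eigenvalue of the extracted factor
n_items = 5          # the GDS-5 has five items

retained = eigenvalue > 1.0  # Kaiser criterion: keep components with eigenvalue > 1
print(retained, round(percent_variance_explained(eigenvalue, n_items), 1))
```

The result is about 51.1%, in line with the reported 51.09%; the small gap is presumably due to rounding of the eigenvalue.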
Table 2. Psychometric properties of the Persian version of the GDS-5.
Correlation and Validity Analysis
Concurrent Validity
To assess concurrent validity, the Pearson correlation coefficient was calculated between the total scores of the GDS-5 and the Center for Epidemiologic Studies Depression Scale (CES-D). The results indicated a strong, significant positive correlation (r = .71, p < .001), suggesting good concurrent validity for the GDS-5 (Table 2).
Mean Scores by Diagnostic Group
Participants classified as depressed scored an average of 3 (SD = 1) on the GDS-5, compared to an average score of 1 (SD = 1) among non-depressed participants. For the GDS-15, the mean score among depressed individuals was 11 (SD = 2), while non-depressed individuals had a mean score of 3 (SD = 3).
Reliability Analysis
Internal Consistency and Inter-Item Correlation
The internal consistency of the GDS-5 was analyzed using Cronbach’s alpha coefficient. A coefficient of 0.75 was obtained, which exceeds the .70 threshold and indicates acceptable internal consistency for the instrument. Inter-item correlations were also examined and consistently exceeded the minimum acceptable threshold of r = .20, supporting the coherence of the items.
Test-Retest Reliability (ICC and SEM)
The ICC, a measure of test-retest reliability, was 0.809. This value exceeds the 0.70 threshold, indicating high test-retest reliability for the instrument. The paired-sample t-test showed no statistically significant difference between the test and retest scores, which further supports the stability of the measurements.
Given a pooled standard deviation (SDpooled) of 0.19 and an ICC of 0.809, the SEM value was found to be 0.084. This value is considerably less than the acceptable maximum threshold of 1.2 × SDpooled, indicating a low level of random error in repeated measurements. Collectively, the results of the ICC and SEM analysis confirm the appropriate reliability of the GDS-5 for use in this population (Table 2).
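The SEM figure can be approximately reproduced from the two reported inputs. The sketch below assumes the conventional formula SEM = SD × √(1 − ICC); the function name is illustrative.

```python
from math import sqrt

def standard_error_of_measurement(sd_pooled, icc):
    """Conventional SEM formula (assumed here): SD * sqrt(1 - ICC)."""
    return sd_pooled * sqrt(1 - icc)

sd_pooled = 0.19   # reported pooled standard deviation
icc = 0.809        # reported intraclass correlation coefficient

sem = standard_error_of_measurement(sd_pooled, icc)
threshold = 1.2 * sd_pooled  # acceptability threshold stated in the text
print(round(sem, 3), round(threshold, 3), sem < threshold)
```

With the rounded inputs this gives roughly 0.083, close to the reported 0.084; the difference presumably reflects rounding of the SD and ICC before publication.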
Diagnostic Accuracy
The Area Under the ROC Curve (AUC) for the GDS-5 was calculated to be 0.924 (95% CI [0.887, 0.960]), which indicates high diagnostic accuracy. At the optimal cutoff point, the GDS-5 demonstrated a sensitivity of 0.98 and a specificity of 0.77. This performance highlights the strong ability of the GDS-5 to accurately identify depression, particularly due to its high sensitivity (Figure 1).

Figure 1. ROC curve for the GDS-5.
The diagnostic utility of the GDS-5 was further assessed using likelihood ratios. The Positive Likelihood Ratio (LR+) was 4.26, meaning that a positive GDS-5 result is 4.26 times more likely in an older adult with depression than in one without depression. Conversely, the Negative Likelihood Ratio (LR−) was 0.03. This low value suggests that a negative GDS-5 result is highly effective at ruling out depression, because the odds of depression fall to roughly 3% of their pre-test value following a negative result.
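Both likelihood ratios follow directly from the reported sensitivity and specificity, as the sketch below shows. The pre-test prevalence of 20% is a hypothetical value chosen only to illustrate how a likelihood ratio converts pre-test odds into post-test odds; it is not a study estimate.

```python
def likelihood_ratios(sensitivity, specificity):
    """LR+ = sens / (1 - spec); LR- = (1 - sens) / spec."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Reported sensitivity and specificity at the optimal cutoff.
lr_pos, lr_neg = likelihood_ratios(0.98, 0.77)
print(round(lr_pos, 2), round(lr_neg, 2))

# Hypothetical pre-test prevalence (assumption for illustration only).
assumed_prevalence = 0.20
print(round(post_test_probability(assumed_prevalence, lr_pos), 2))
print(round(post_test_probability(assumed_prevalence, lr_neg), 3))
```

Under that assumed prevalence, a positive screen raises the probability of depression to roughly one half, while a negative screen lowers it to under 1%, which is why the instrument is better suited to ruling out depression than to confirming it.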
Discussion
To the best of our knowledge, this study provides the first assessment of the validity and reliability of the GDS-5 for depression screening among older adults in Iran. Our primary finding demonstrated that the Persian version of the GDS-5 is a valid and reliable tool for this population. The comprehensive psychometric analysis, including a unidimensional factor structure, acceptable internal consistency (α = .75), and high test-retest reliability (ICC = 0.809), supports the instrument’s scientific soundness.
The diagnostic accuracy of the GDS-5 was confirmed by a high AUC of 0.924. At the optimal cutoff point, the tool demonstrated excellent diagnostic utility with a sensitivity of 0.98 and a specificity of 0.77. The exceptionally high sensitivity suggests the GDS-5 is highly effective at correctly identifying older adults who do have depression, thus minimizing the risk of false negatives.
We further established concurrent validity through a strong positive correlation (r = .71) between the GDS-5 and the CES-D. This finding aligns with previous research confirming both the GDS-5 and CES-D as robust measures for geriatric depression screening (Acosta Quiroz et al., 2021; Sheikh & Yesavage, 1986). Gokcekuyu et al. (2022) also reported a significant correlation between the GDS-5 and DSM-5 diagnostic criteria for depression.
The high diagnostic performance observed in our study is consistent with the established utility of the GDS in other cultural contexts. For example, studies assessing various GDS versions have also reported satisfactory validity and reliability (Acosta Quiroz et al., 2021; Brañez-Condorena et al., 2021; Krishnamoorthy et al., 2020). However, our results show even higher sensitivity and AUC values compared to some international studies. Eriksen et al., in evaluating the Norwegian GDS-5, reported a lower sensitivity and specificity of 73.2% with an AUC of 0.81 (Eriksen et al., 2019). The superior performance of the GDS-5 observed in the current Iranian sample highlights its specific effectiveness and cultural relevance.
A critical aspect of this study was determining the optimal cutoff point for clinical application. Our ROC analysis suggested an optimal cutoff score of 1/2 for the GDS-5 (and 7/8 for the GDS-15). This differs slightly from findings in other populations, such as the Norwegian study by Eriksen et al. (2019) and the Turkish study by Gokcekuyu et al. (2022), both of which identified an optimal GDS-5 cutoff score of 2. This slight variance underscores the importance of validating cutoff points within specific cultural and linguistic groups to maximize diagnostic accuracy.
The overall strong performance of the GDS-5, particularly its short, five-item format, offers significant practical advantages. Lengthy questionnaires can often be burdensome for older adults, potentially affecting compliance and data quality. The GDS-5 addresses this by providing a rapid and practical screening tool that can be easily administered by clinicians and researchers in primary care settings, nursing homes, and hospitals. It efficiently distinguishes between depressed and non-depressed participants, regardless of initial symptom severity (Weeks et al., 2003).
The validated psychometric properties of the brief Persian GDS-5 carry significant weight for public health policy and resource management within Iran’s aging population, primarily through its diagnostic utility. Specifically, the calculated Negative Likelihood Ratio (LR−) of 0.03 is a crucial policy asset: this extremely low value means that a negative screening result reduces an older adult’s odds of depression to about 3% of their pre-test odds, making the GDS-5 highly effective at ruling out the diagnosis. This directly supports an efficient policy of resource conservation by preventing unnecessary, costly, and time-consuming diagnostic follow-ups by specialists for individuals who are genuinely not depressed. Coupled with a Positive Likelihood Ratio (LR+) of 4.26 (meaning a positive result is about four times more likely in a depressed than in a non-depressed older adult, and therefore raises the probability of depression), the GDS-5 is confirmed as an excellent screening tool. It provides clear, evidence-based guidance for policymakers to rapidly screen large populations, confidently exclude the majority who do not need further resources, and efficiently flag the smaller group that requires specialist attention. This allows for the integration of highly efficient mental health screening into routine geriatric care, improving patient flow, optimizing the use of limited specialist resources, and supporting the overarching policy goal of early, targeted intervention.
Strengths and Limitations of the Study
Despite the robust psychometric findings, our study has several limitations. First, the use of convenience sampling and a limited sample size may restrict the generalizability of our findings. The participant demographic was also skewed, with most participants being married, which could have influenced the overall depression prevalence and scores. Consequently, our results may not be directly applicable to all older adult populations, such as hospitalized individuals or residents of care facilities.
Future research should focus on validating the instrument in diverse settings (e.g., primary care clinics, long-term care facilities) and utilizing a wider sample with diverse sociodemographic characteristics to improve generalizability. Research should also focus on assessing the feasibility and effectiveness of implementing the GDS-5 by primary care providers.
Conclusion and Implications
The reliability and validity findings of the Persian version of the GDS-5 indicate that this questionnaire is an effective and efficient screening tool for detecting depression in Iranian older adults. We recommend the adoption of the GDS-5 as a rapid and practical initial assessment measure in clinical and epidemiological studies. Its ease of use and high diagnostic accuracy can facilitate the timely identification and initiation of treatment for depression in this vulnerable population.
Footnotes
Acknowledgements
The Iran University of Medical Sciences supported this study. The corresponding author confirms that generative artificial intelligence tools, specifically Gemini, were used solely to enhance the language and improve the fluency of the manuscript. The authors alone were responsible for all original research concepts, the study’s design, data collection, analysis, and interpretation.
Ethical Considerations
We declare that all ethical guidelines for authors have been followed by all authors. The study adhered to the ethical guidelines outlined in the Declaration of Helsinki for non-interventional clinical research. Confidentiality of personal information was maintained throughout the study. Depressed individuals who required treatment were referred to mental health professionals. The study was conducted in accordance with ethical standards and was approved by the Ethics Committee of Iran University of Medical Sciences (Ethics Code: IR.IUMS.REC.1401.999).
Consent to Participate
All participants were fully informed about the aims, procedures, and methods of the study and provided written informed consent before enrollment.
Author Contributions
Seyede Saleheh Mortazavi: Conceptualization (lead), Project administration (lead). Mohsen Shati: Conceptualization (supporting); Supervision (lead); Validation (lead). Ensiyeh Najjari: Design (supporting), Writing – review and editing (supporting). Mobin Fathi: Methodology (lead), Data acquisition (lead), Analysis and interpretation of data (supporting). Fariba Tabrizi: Analysis and interpretation of the data (lead). Ahmad Ashouri: Conceptualization (supporting), Project administration (supporting). Roghie Bagheri: Writing – original draft (lead). Sahar Sarmadi: Writing – review and editing (supporting). Sadaf Agahi: Writing – review and editing (lead), interpretation of data (supporting). All authors confirm accountability for the entire content and integrity of the work.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
