Abstract
Background
Competency-based medical education (CBME) requires robust alignment between theoretical knowledge assessments and clinical performance evaluations. In resource-constrained settings like Iran, conducting separate high-stakes examinations strains limited faculty, infrastructure, and budgets. This study examined the predictive relationship between pre-internship written examination scores and Objective Structured Clinical Examination (OSCE) performance, with implications for sustainable assessment redesign.
Methods
Retrospective analytical study of 312 sixth-year medical students (208 female, 104 male) from OSCE sessions 17-20 (2021-2024) at Ahvaz Jundishapur University. Pre-internship scores (200-item MCQ, 0-100 scale) and OSCE scores (12 stations: 6 theoretical, 6 practical) were analyzed using t-tests, Pearson correlations, multiple regression, and quartile analysis (SPSS 26.0).
Results
No significant gender differences emerged (pre-internship: female 66.2±7.9 vs male 64.8±8.7, P=0.18; OSCE: female 69.4±9.8 vs male 67.1±10.9, P=0.07). A strong positive correlation existed between assessments (r=0.55, 95% CI: 0.45-0.63, P<0.001, R²=0.30). Correlations persisted for theoretical stations (r=0.52) and practical stations (r=0.48) and were consistent across genders. Regression confirmed the pre-internship score as a significant predictor (β=0.55, adjusted R²=0.29). Of students in the top pre-internship quartile, 38% achieved the top OSCE quartile, versus 9% of those in the lowest quartile (χ²=32.1, P<0.001).
Conclusion
Theoretical knowledge explains ∼30% of OSCE variance, supporting integrated curricula while highlighting needs for enhanced practical training. In Iran’s resource-limited context, consolidated hybrid or programmatic assessment models could optimize efficiency, reduce costs, maintain validity, and advance sustainable CBME implementation.
Introduction
Contemporary medical education has undergone a paradigm shift from predominantly knowledge-focused approaches toward competency-based medical education (CBME), which emphasizes integrated development of knowledge, clinical skills, communication abilities, professionalism, and lifelong learning.1 Miller’s pyramid provides a widely accepted hierarchical framework for conceptualizing clinical competence, progressing from foundational knowledge (“knows”) through applied understanding (“knows how”) and demonstrated performance (“shows how”) to authentic practice behavior (“does”).2 In the context of Iran’s medical education system, this theoretical framework has direct implications for the design of high-stakes assessments.
In Iran’s six-year undergraduate medical education system, two major comprehensive national assessments serve as critical gateways: the pre-internship examination (a high-stakes written multiple-choice test administered after completion of clinical sciences coursework) and the Clinical Competence Assessment, which employs the Objective Structured Clinical Examination (OSCE) format during or following clinical clerkships.3 The pre-internship examination evaluates theoretical knowledge, clinical reasoning, and diagnostic decision-making through 200 standardized items covering internal medicine, surgery, pediatrics, obstetrics-gynecology, psychiatry, and other clinical disciplines.4 The OSCE, recognized internationally as the gold standard for clinical skills assessment,5 utilizes multiple standardized stations to evaluate practical competencies, patient interaction, communication skills, physical examination techniques, procedural proficiency, and professional behaviors.6
Previous research examining correlations between written knowledge assessments and OSCE performance demonstrates considerable variability, with reported correlations ranging from weak (r=0.20-0.30) to moderate-strong (r=0.50-0.60), influenced by factors including curricular structure, teaching methodologies, assessment design, and contextual characteristics.7-9 Iranian studies have been relatively limited, with most focusing on individual examination performance rather than inter-assessment relationships, and few incorporating recent curricular innovations or post-pandemic adaptations.10-12
In low- and middle-income countries (LMICs) including Iran, persistent resource constraints pose substantial challenges to comprehensive medical education assessment. These constraints include chronic faculty shortages, limited simulation infrastructure and equipment, high costs associated with standardized patient recruitment and training, inadequate examiner training programs, bureaucratic inefficiencies, economic sanctions restricting access to educational technologies, and overall underfunding of medical education systems.13-15 The COVID-19 pandemic further exacerbated these challenges while simultaneously demonstrating potential efficiencies through hybrid and virtual adaptations.16
Conducting entirely separate high-stakes pre-internship written examinations and multi-station OSCE assessments demands substantial institutional resources. In resource-constrained environments, this parallel approach may represent inefficient allocation, potentially diverting resources from educational enhancements such as faculty development, simulation laboratories, or curricular innovations.17 Understanding the predictive relationship between these assessments could inform more sustainable, integrated models that maintain validity while optimizing resource utilization.
This study addresses these gaps by investigating: (1) the strength and nature of correlations between pre-internship examination and OSCE scores; (2) differential relationships with theoretical versus practical OSCE components; (3) gender-based variations; (4) predictive modeling of OSCE performance; and (5) implications for resource-efficient assessment redesign in Iran’s medical education context. We hypothesized moderate-to-strong positive correlations consistent with hierarchical competence models, with stronger relationships for theoretical than for practical stations.
Given mixed international findings on gender differences in assessment performance, we also examined gender-based variations in scores and correlations. To account for such differences and for any session-specific variations in OSCE administration, gender and assessment session were included as covariates in the regression analysis. Predictive modeling of OSCE performance from pre-internship scores was included to provide evidence-based guidance for resource-efficient assessment redesign in Iran’s constrained educational environment. These additional objectives were exploratory in nature, without pre-specified directional hypotheses.
Methods
Study Design and Setting
This retrospective descriptive-analytical study utilized de-identified examination records from Ahvaz Jundishapur University of Medical Sciences, a major medical education institution in southwestern Iran. The study analyzed data from Clinical Competence Assessment sessions 17-20, conducted between March 2021 and September 2024. The reporting of this study conforms to the STROBE statement for observational studies.18 The completed STROBE checklist is provided as Supplementary File 1. Ethical approval was obtained from the Institutional Review Board and Research Ethics Committee (Registration Number: B-20/045, Approval Date: January 15, 2021). Given the retrospective nature and use of existing de-identified educational data, the requirement for informed consent was waived in accordance with institutional guidelines and national research ethics regulations.
Participants
Census sampling was employed: all sixth-year medical students who participated in OSCE sessions 17-20 and had complete, verified data for both the pre-internship examination and the Clinical Competence Assessment (OSCE) were included. Students with incomplete data for either assessment, or who did not participate in the specified OSCE sessions, were excluded. The final analytical sample comprised 312 students: 208 females (66.7%) and 104 males (33.3%), with a mean age of 25.1±1.5 years (range: 23-29 years). The sample size provided statistical power exceeding 0.95 to detect moderate correlations (r≥0.30) at the conventional significance level (α=0.05). Students were distributed relatively evenly across the four assessment sessions: Session 17 (n=76, 24.4%), Session 18 (n=81, 26.0%), Session 19 (n=78, 25.0%), and Session 20 (n=77, 24.7%).
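The power statement above can be checked with a short sketch. The method used for the original power calculation is not described, so the Fisher z approximation below is an assumption, and `correlation_power` is a hypothetical helper name introduced for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a population correlation r
    with n paired observations, using the Fisher z approximation."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Fisher-transformed effect divided by its standard error 1/sqrt(n - 3)
    shift = atanh(r) * sqrt(n - 3)
    # Probability that the test statistic lands beyond either critical value
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


power = correlation_power(r=0.30, n=312)
print(f"Approximate power: {power:.4f}")
```

Under this approximation the power for r=0.30 with 312 students is well above the 0.95 threshold quoted in the text.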
Assessment Instruments
Pre-Internship Examination: This comprehensive written assessment consists of 200 multiple-choice questions with four response options each, systematically sampling across all major clinical disciplines in proportion to curricular emphasis. Content domains include internal medicine, general surgery, pediatrics, obstetrics and gynecology, psychiatry, emergency medicine, and other clinical specialties. Items assess not only factual knowledge recall but also higher-order cognitive skills including clinical reasoning, diagnostic formulation, treatment planning, and interpretation of clinical data. Questions undergo rigorous development and review processes by content experts and educational measurement specialists. The examination is administered under standardized proctored conditions with a 4-hour duration. Scoring is conducted electronically, with final scores reported on a 0-100 scale.
Clinical Competence Assessment (hereafter referred to as OSCE): The institutional OSCE follows internationally recognized standards adapted to local context. It comprises 12 standardized stations, each lasting 10 minutes, with 2-minute inter-station intervals for student transition. Six stations are designated as “theoretical stations” assessing clinical reasoning, diagnostic interpretation, management decision-making, and clinical judgment through written case scenarios, data interpretation tasks, and structured oral responses. The remaining six “practical stations” evaluate hands-on clinical skills including history-taking, physical examination techniques, procedural competencies (e.g., suturing, catheterization), patient communication, and professional behaviors through interactions with standardized patients (trained actors portraying specific clinical scenarios) and clinical simulation equipment.
The pre-internship examination is typically administered 4–8 months prior to the OSCE, following completion of clinical sciences coursework and before the start of internship. Pre-internship scores were calculated as the percentage of correct answers out of 200 items, scaled to a 0–100 range (each item equally weighted at 0.5 points). For the OSCE, each station was scored on a 0–100 checklist-based scale by trained examiners. Theoretical and practical subscores were computed as the mean of the 6 respective station scores, and the total OSCE score as the mean of all 12 stations, all scaled to 0–100. Internal consistency reliability for the total OSCE was Cronbach’s α = 0.88. Subscale reliabilities were α = 0.83 for theoretical stations and α = 0.79 for practical stations.
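The scoring rules and the reliability coefficient described above can be sketched as follows. The function names and all student values are hypothetical and made up for illustration, and the Cronbach's α formula is the standard variance-based one, which is assumed (not confirmed by the text) to match what the statistical software computed.

```python
from statistics import mean, variance


def pre_internship_score(n_correct, n_items=200):
    """Percentage correct on a 0-100 scale (0.5 points per item for 200 items)."""
    return 100 * n_correct / n_items


def osce_scores(theoretical, practical):
    """Return (theoretical subscore, practical subscore, total score),
    each computed as the mean of the relevant station scores."""
    return mean(theoretical), mean(practical), mean(theoretical + practical)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-student lists of station scores."""
    k = len(rows[0])  # number of stations
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical student: 133 of 200 items correct, plus made-up station scores
theory_stations = [70, 80, 65, 75, 60, 70]
practical_stations = [60, 65, 70, 55, 75, 65]
written = pre_internship_score(133)
subscores = osce_scores(theory_stations, practical_stations)
print(written, subscores)
```

Passing a students-by-stations score matrix to `cronbach_alpha` would give the kind of internal consistency estimate reported above; the real station-level data are not available here, so no attempt is made to reproduce the reported coefficients.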
Statistical Analysis
All statistical analyses were conducted using IBM SPSS Statistics version 26.0 (IBM Corporation, Armonk, NY, USA). Descriptive statistics including means, standard deviations, medians, ranges, frequencies, and percentages were calculated to characterize the sample and examination performances. Normality of continuous variable distributions was assessed using the Kolmogorov-Smirnov test supplemented by visual inspection of histograms and Q-Q plots.
Independent samples t-tests were employed to compare mean pre-internship and OSCE scores between male and female students. Effect sizes were calculated using Cohen’s d to supplement significance testing. Pearson correlation coefficients were computed to examine bivariate relationships between pre-internship examination scores and OSCE scores (total, theoretical subscore, practical subscore). Correlations were calculated for the total sample and separately for male and female subgroups. Correlation magnitude was interpreted using conventional criteria: weak (r=0.10-0.29), moderate (r=0.30-0.49), strong (r=0.50-0.69), very strong (r≥0.70). Multiple linear regression analysis was conducted with OSCE total score as the dependent variable and pre-internship score as the primary predictor, controlling for gender and assessment session as covariates. Standardized regression coefficients (β), adjusted R² values, and 95% confidence intervals were calculated. Model assumptions including linearity, homoscedasticity, and normality of residuals were verified.
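As an illustration of the correlation step and the interpretation criteria listed above, the following minimal sketch computes a Pearson coefficient and maps it to a magnitude label. The helper names and the toy scores are hypothetical, and the "negligible" label for |r| below 0.10 is an assumption, since the stated criteria start at 0.10.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def interpret_r(r):
    """Magnitude label following the criteria stated in the text."""
    size = abs(r)
    if size >= 0.70:
        return "very strong"
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "weak"
    return "negligible"  # assumed label; the text gives no category below 0.10


# Made-up scores for five hypothetical students
written_scores = [58.0, 62.5, 66.0, 71.5, 80.0]
osce_totals = [61.0, 59.5, 70.0, 68.5, 79.0]
r = pearson_r(written_scores, osce_totals)
print(round(r, 2), interpret_r(r))
```

Applied to the coefficients reported in this study, `interpret_r(0.55)` returns "strong" and `interpret_r(0.48)` returns "moderate".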
To examine clinical and educational significance beyond statistical correlation, students were stratified into performance quartiles based on pre-internship examination scores. Chi-square tests assessed whether distribution of students across OSCE performance quartiles differed significantly among pre-internship quartiles. Statistical significance was established at α=0.05 (two-tailed) for all analyses.
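The quartile analysis described above can be sketched as a rank-based quartile assignment followed by a chi-square statistic on the resulting 4×4 table. All names and scores below are hypothetical, ties are broken by sort order (an assumption), and the eight made-up students are far too few for a real test; the example only shows the mechanics.

```python
def quartile_labels(scores):
    """Assign each score a quartile index 0-3 by rank (0 = lowest quarter)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // len(scores)
    return labels


def contingency(row_labels, col_labels, k=4):
    """Cross-tabulate two lists of quartile indices into a k x k count table."""
    table = [[0] * k for _ in range(k)]
    for r, c in zip(row_labels, col_labels):
        table[r][c] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


# Made-up data in which written and OSCE rankings agree perfectly
written = [50, 55, 60, 62, 66, 70, 75, 82]
osce = [48, 58, 61, 66, 69, 73, 80, 90]
table = contingency(quartile_labels(written), quartile_labels(osce))
stat, df = chi_square(table)
print(table, stat, df)
```

A P-value would then be obtained from the chi-square distribution with `df` degrees of freedom, which the Python standard library does not provide and is left out of this sketch.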
Results
Examination Scores Stratified by Gender
Note. Data presented as mean ± standard deviation. P-values derived from independent samples t-tests comparing male and female students. OSCE = Objective Structured Clinical Examination.
Correlation Matrix: Pre-internship and OSCE Component Scores
Note. Values represent Pearson correlation coefficients. ***P < 0.001. N=312 for all correlations. OSCE = Objective Structured Clinical Examination.
Gender Comparisons
Independent samples t-tests revealed no statistically significant differences between male and female students on any examination measure. Female students achieved a mean pre-internship score of 66.2±7.9 compared to 64.8±8.7 for males (t=1.35, P=0.18, Cohen’s d=0.17), representing a negligible effect size. Similarly, OSCE total scores showed no significant gender difference (female: 69.4±9.8; male: 67.1±10.9; t=1.82, P=0.07, Cohen’s d=0.22), despite a tendency toward higher female performance that approached but did not reach statistical significance. Theoretical stations (female: 70.8±10.3; male: 68.7±11.4; t=1.59, P=0.10) and practical stations (female: 68.0±11.3; male: 65.5±12.6; t=1.71, P=0.14) showed similar non-significant patterns. All effect sizes were negligible to small (d<0.30), suggesting gender equity in both theoretical knowledge and clinical competence at this institution.
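The effect sizes above can be approximated directly from the reported means, standard deviations, and group sizes. The sketch below uses the pooled-standard-deviation form of Cohen's d, which is assumed (not confirmed by the text) to be the variant used; `cohens_d` is a hypothetical helper name.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from summary statistics, using the pooled standard deviation."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_variance)


# Reported summary statistics: 208 female and 104 male students
d_pre = cohens_d(66.2, 7.9, 208, 64.8, 8.7, 104)    # pre-internship examination
d_osce = cohens_d(69.4, 9.8, 208, 67.1, 10.9, 104)  # OSCE total score
print(round(d_pre, 2), round(d_osce, 2))
```

This gives roughly 0.17 and 0.23, close to the reported 0.17 and 0.22; a small gap of this kind is expected when working from rounded means and standard deviations rather than the raw scores.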
Correlation Analyses
Pearson correlation analysis demonstrated a statistically significant strong positive correlation between pre-internship examination scores and OSCE total scores (r=0.55, 95% CI: 0.45-0.63, P<0.001). The coefficient of determination (R²=0.30) indicates that pre-internship performance explains approximately 30% of the variance in OSCE scores, while the remaining 70% reflects other factors contributing to clinical competence.
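The link between the correlation, its confidence interval, and the share of variance explained can be illustrated with a short calculation. The Fisher z transformation used here is an assumption, since the method behind the reported interval is not stated, and `fisher_ci` is a hypothetical helper name.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z transform."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    center = atanh(r)
    half_width = z_crit / sqrt(n - 3)  # standard error of z is 1/sqrt(n - 3)
    return tanh(center - half_width), tanh(center + half_width)


r, n = 0.55, 312
lower, upper = fisher_ci(r, n)
r_squared = r ** 2  # proportion of OSCE variance shared with the written score
print(f"95% CI: {lower:.2f} to {upper:.2f}; R^2 = {r_squared:.2f}")
```

Under this approximation the interval is about 0.47 to 0.62, close to the reported 0.45-0.63, and squaring r gives the roughly 30% of shared variance quoted in the text. Small differences from the reported interval may reflect a different interval method or rounding.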
Although scores did not span the full 0–100 scale (pre-internship range: 45.0–87.0; OSCE range: 42.0–93.5) and quartiles covered approximately five scale points, these differences represent clinically meaningful performance variations within this cohort, as evidenced by the significant chi-square association (χ²=32.1, P<0.001).
Discussion
Principal Findings and Theoretical Implications
This study of 312 medical students demonstrates a statistically significant strong positive correlation (r=0.55, P<0.001) between pre-internship theoretical examination performance and Clinical Competence Assessment (OSCE) outcomes, with pre-internship scores explaining approximately 30% of variance in OSCE performance. This finding aligns with hierarchical models of clinical competence, particularly Miller’s pyramid,2 which conceptualizes knowledge as foundational but not sufficient for clinical expertise. The magnitude of the correlation supports the proposition that theoretical knowledge provides necessary scaffolding for clinical performance, while also confirming that clinical competence encompasses substantial additional dimensions (including psychomotor skills, interpersonal communication, clinical reasoning under uncertainty, professional behaviors, and contextual adaptation capabilities) that extend beyond theoretical understanding.7-9
The slightly stronger association between pre-internship scores and theoretical OSCE stations (r=0.52) compared to practical stations (r=0.48), though not reaching statistical significance, follows intuitive logic: theoretical OSCE stations emphasizing clinical reasoning and diagnostic interpretation likely draw more heavily upon the same cognitive knowledge base evaluated in written examinations, whereas practical stations requiring hands-on skills, patient interaction, and real-time performance engage additional psychomotor and affective competencies.19 Nevertheless, the substantial correlation with practical stations (r=0.48, at the upper end of the moderate range) demonstrates that theoretical knowledge contributes meaningfully even to hands-on clinical performance, possibly by providing the conceptual framework guiding skill execution and clinical decision-making during practical tasks.
Contrary to our initial hypothesis of stronger associations with theoretical stations, correlations with theoretical (r=0.52) and practical stations (r=0.48) were comparable. This may reflect substantial overlap in the underlying constructs assessed or the fact that practical stations still require robust theoretical knowledge as a foundation. Importantly, OSCE examiners were blinded to pre-internship scores, minimizing potential bias in ratings.
The quartile analysis further illustrates the predictive value while highlighting individual variability: although students in the highest pre-internship quartile were approximately four times more likely to achieve top OSCE performance, 25–30% of students shifted across quartiles in either direction. This underscores that theoretical knowledge is necessary but not sufficient for clinical competence and supports the continued need for independent clinical skills assessment even in resource-limited settings.
Although this study was conducted at a single institution, the findings offer valuable preliminary insights for other medical schools in Iran and similar LMIC contexts. Multi-center studies are warranted to confirm generalizability given institutional variations in resources and curricula.
This study has several limitations that should be considered. First, it was conducted at a single medical university, which may limit the generalizability of the findings to other institutions in Iran with different resources, curricula, or student populations. Second, the retrospective design, while appropriate for the research question, precludes causal inferences. Third, although the score ranges were typical for high-stakes medical examinations, the relatively restricted range of scores may have slightly attenuated the observed correlations. Finally, unmeasured confounders such as student motivation, quality of clinical exposure, and specific teaching methods may have influenced the relationship between theoretical knowledge and clinical performance.
Directions for Future Research
Several promising research directions emerge from this work. First, multi-institutional studies incorporating diverse medical schools across Iran (including universities in different geographic regions, with varying resource levels, and employing different curricular models) would clarify generalizability and identify institutional or contextual moderators of the theory-practice relationship. Such research could reveal whether integrated curricula strengthen correlations, whether resource constraints weaken them, or whether findings are relatively stable across contexts. Second, prospective longitudinal research following students from pre-internship examinations through OSCE, into internship rotations, through residency training, and ultimately into independent practice would provide a comprehensive understanding of how theoretical knowledge relates to clinical competence development across the full professional trajectory. Third, experimental or quasi-experimental studies evaluating proposed assessment redesigns would provide crucial evidence for educational innovation. Fourth, qualitative research exploring students’ and educators’ perspectives on the relationship between theoretical knowledge and clinical competence would provide valuable complementary insights to our quantitative findings. Finally, health services research examining relationships between medical student assessment performance and subsequent patient care quality outcomes would provide the ultimate validation.
Conclusion
This study provides empirical evidence that pre-internship theoretical examination performance is a strong predictor of OSCE-assessed clinical competence among Iranian medical students (r=0.55, R²≈0.30). This finding is consistent with hierarchical models of clinical competence while highlighting that clinical expertise encompasses substantial dimensions beyond theoretical knowledge, including psychomotor proficiency, communication skills, clinical reasoning under uncertainty, professional behaviors, and contextual adaptation capabilities.
In Iran’s resource-constrained context, these results support exploring consolidated hybrid or programmatic assessment models to optimize efficiency without compromising validity. Although derived from a single institution, the findings have practical implications for similar settings across Iran and other LMICs, with the important caveat that multi-center validation is needed. Gender equity was observed, reinforcing inclusive educational practices.
Future multi-institutional, longitudinal, and intervention studies will further guide evidence-based assessment redesign in competency-based medical education.
Supplemental Material
Supplemental material for Association Between Pre-Internship Examination Scores and OSCE Performance: Implications for Resource-Efficient Competency-Based Medical Education at an Iranian Medical University by Ali Delirrooy Fard, Arash Forouzan, Shahrzad Mirnasouri, Mehdi Sayyah in Journal of Medical Education and Curricular Development
Footnotes
Acknowledgments
The authors gratefully acknowledge the Medical Education Development Center and Office of Educational Affairs at Ahvaz Jundishapur University of Medical Sciences for facilitating access to examination records and providing institutional support. We thank all faculty members, standardized patients, and administrative staff who contributed to examination administration. Most importantly, we thank the medical students whose examination performance data made this research possible. Their commitment to learning and professional development continues to inspire medical education research and improvement efforts.
Ethical Considerations
This study received full ethical approval from the Research Ethics Committee of Ahvaz Jundishapur University of Medical Sciences (Ethics Approval Code: B-20/045, Approval Date: January 15, 2021). The study was conducted in accordance with ethical principles outlined in the Declaration of Helsinki and national research ethics guidelines of the Islamic Republic of Iran. Given the retrospective nature of this study utilizing existing de-identified educational records collected as part of routine institutional assessment procedures, the ethics committee granted a waiver of informed consent in accordance with applicable regulations. All data were anonymized prior to analysis and no individual student performance data are reported in this manuscript.
Consent for Publication
Not applicable. This manuscript contains no individual person’s data in any form.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and analyzed during the current study are not publicly available due to institutional privacy policies protecting student educational records, but anonymized data supporting the findings are available from the corresponding author upon reasonable request, subject to institutional review board approval and execution of appropriate data use agreements.
Supplemental Material
Supplemental material for this article is available online.
References
