Abstract
Background
Deep vein thrombosis (DVT) remains a frequent and potentially life-threatening complication among hospitalized patients, necessitating timely diagnosis. The Wells score is widely used for assessing DVT probability; however, its performance in inpatient populations remains uncertain. This study aimed to evaluate the diagnostic accuracy of the Wells criteria for lower extremity DVT among hospitalized patients.
Methods
In this case–control study conducted at two teaching hospitals between 2017 and 2020, 240 patients with confirmed DVT were compared with 240 age- and sex-matched controls without DVT. All participants underwent standardized clinical evaluation and duplex ultrasonography within 24 h of admission. Wells scores were calculated based on predefined clinical parameters. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were determined.
Results
Of 480 participants (mean age 51.9 ± 11.6 years; 54.4% female), DVT was confirmed in 240 (50%). A Wells score ≥2 classified patients as likely DVT. This threshold yielded a sensitivity of 86.3%, specificity of 70.0%, PPV of 74.2%, and NPV of 83.6%. Significant predictors included recent surgery or prolonged hospitalization (p < 0.001), calf swelling >3 cm (p < 0.001), and pitting edema confined to the symptomatic leg (p < 0.001).
Conclusion
The Wells criteria demonstrated good sensitivity and moderate specificity for diagnosing DVT in hospitalized patients, supporting their role as an initial clinical assessment tool. However, they should not replace confirmatory testing such as ultrasonography or D-dimer assays. Larger multicenter studies are warranted to further validate these findings.
Introduction
Deep vein thrombosis (DVT) is a common and potentially life-threatening condition in hospitalized patients, where immobility, comorbidities, and surgical interventions create a high-risk environment for venous thromboembolism (VTE). 1 The clinical consequences of DVT extend beyond acute morbidity, as complications such as pulmonary embolism (PE) and post-thrombotic syndrome can lead to significant mortality and long-term disability. 2 Because of these risks, timely and accurate identification of patients at risk for DVT is critical for guiding diagnostic evaluation, initiating appropriate therapy, and improving patient outcomes. 3 Clinical diagnosis of DVT remains challenging, as symptoms such as swelling, pain, and erythema are non-specific and can overlap with other conditions. 4 To improve diagnostic accuracy and reduce reliance on indiscriminate imaging, several clinical prediction tools have been developed. 5 Among these, the Wells score is the most widely used and validated. By integrating clinical signs, symptoms, and risk factors into a structured scoring system, the Wells score stratifies patients into probability categories, supporting the rational use of confirmatory testing, such as duplex ultrasonography and D-dimer assays.6,7 Its simplicity and utility have made it a cornerstone of DVT risk assessment, particularly in outpatient and emergency department settings. 8
Despite its widespread adoption, questions remain regarding the performance of the Wells score in hospitalized populations. 9 Medical and surgical inpatients represent a heterogeneous group with distinct risk factors: medical patients often suffer from chronic illnesses or acute medical conditions predisposing them to hypercoagulability, while surgical patients face additional risks related to perioperative immobility, endothelial injury, and inflammatory responses to tissue trauma.10,11 These differing risk profiles may limit the generalizability of the Wells score, which was originally developed in ambulatory cohorts. 12 Previous studies evaluating its inpatient utility have yielded conflicting results, with variations in reported sensitivity, specificity, and predictive values.
Understanding the accuracy and limitations of the Wells score in inpatient settings is essential, as misclassification carries important clinical implications. 13 Underestimation of risk may delay diagnosis and increase the likelihood of thromboembolic complications, whereas overestimation can lead to unnecessary imaging or anticoagulation, exposing patients to avoidable harm and resource use.
Identifying a reliable low-risk subgroup among hospitalized patients is clinically important because it can reduce unnecessary diagnostic imaging and D-dimer testing, limit delays in care, and conserve hospital resources. 14 However, inpatients often have higher baseline risk and competing causes of leg symptoms, which may increase the chance of missed disease if outpatient-derived decision rules are applied without caution. This highlights the importance of examining not only the diagnostic accuracy in high-risk patients, but also the failure rate within the low-risk category when applying the Wells score in the inpatient setting. 15 This study, therefore, aims to evaluate the performance of the Wells score in predicting DVT among medical and surgical hospitalized patients. By comparing its diagnostic accuracy across these distinct cohorts, we seek to clarify its clinical utility, inform evidence-based decision-making, and guide the development of tailored diagnostic strategies for inpatient populations.
Methods
Study design
The present investigation was a case–control study conducted to determine the sensitivity and specificity of the Wells criteria for diagnosing deep vein thrombosis (DVT) of the lower extremities among patients presenting to two affiliated teaching and research hospitals. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the University of Medical Sciences and the National Ethics Committee. The current study is reported in accordance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines. 16
Patients and participants
Cases were defined as patients newly diagnosed with DVT at the two teaching hospitals between 2017 and 2020. Controls were matched 1:1 with cases based on age and sex and were randomly selected from individuals undergoing routine health check-ups at the same institutions during the same period. A total of 240 patients with DVT and 240 matched controls were included in the analysis, with controls drawn from the overall cohort of health check-up participants (Figure 1).
Figure 1. Comparison of Wells criteria components between DVT and non-DVT groups.
Inclusion criteria
Patients aged ≥18 years with a clinical suspicion of DVT.
Exclusion criteria
Use of therapeutic-dose anticoagulants for more than 24 h prior to enrollment. Hemodynamic instability.
Clinical evaluation
In this study, a comprehensive medical history (including the presence of malignancy, limb immobility or paralysis, and previous history of DVT) and physical examination findings (limb swelling, asymmetry in limb circumference, tenderness, and presence of superficial collateral veins) were assessed according to a standardized evaluation form. For each patient, the Wells score was calculated individually. Within the first 24 h after admission, all participants underwent compression ultrasonography (CUS) and duplex sonography of the lower limbs, performed by an experienced sonologist.
Ultrasound examinations were conducted using a Toshiba Aplio XG system (Milwaukee, USA) following a standardized protocol, employing real-time B-mode ultrasonography with a linear probe (6–11 MHz).
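The scoring step described above can be sketched in a few lines. The item weights below are not listed in this article; they are assumed from the commonly published modified Wells rule (+1 for each positive item, −2 when an alternative diagnosis is at least as likely as DVT), and the class and function names are hypothetical. Only the ≥2 threshold for "likely DVT" is taken from this study.

```python
from dataclasses import dataclass, fields


@dataclass
class WellsFindings:
    """Clinical findings for one patient (field names are illustrative)."""
    active_cancer: bool = False
    paralysis_or_immobilization: bool = False
    bedridden_or_major_surgery: bool = False      # >=3 days or surgery within 12 weeks
    localized_tenderness: bool = False            # along the deep venous system
    entire_leg_swollen: bool = False
    calf_swelling_over_3cm: bool = False          # compared with the other leg
    pitting_edema: bool = False                   # confined to the symptomatic leg
    collateral_superficial_veins: bool = False    # non-varicose
    previous_dvt: bool = False
    alternative_diagnosis_as_likely: bool = False


def wells_score(findings: WellsFindings) -> int:
    """Sum the item weights: +1 per positive item, -2 for an alternative diagnosis.

    Weights are an assumption based on the standard modified Wells rule.
    """
    score = 0
    for item in fields(findings):
        if getattr(findings, item.name):
            score += -2 if item.name == "alternative_diagnosis_as_likely" else 1
    return score


def classify(score: int, threshold: int = 2) -> str:
    """Apply the two-level cutoff used in this study (>=2 = likely DVT)."""
    return "likely DVT" if score >= threshold else "unlikely DVT"


patient = WellsFindings(calf_swelling_over_3cm=True, pitting_edema=True)
print(wells_score(patient), classify(wells_score(patient)))
```

Keeping the findings in a single record and deriving the score from it means the threshold can be varied (for example, to ≥1 or ≥3, as in some of the studies discussed later) without touching the scoring logic.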
Data collection
Demographic information, medical history, smoking status, body mass index (BMI), and conventional cardiovascular risk factors (including obstructive sleep apnea–hypopnea syndrome [OSAHS]) were obtained from medical records for both cases and controls. All participants or their primary caregivers were interviewed by trained research staff using a structured questionnaire to confirm and supplement the extracted data. Medical history variables included the presence of hypertension, diabetes mellitus, and, if applicable, a history of trauma (with details regarding time and mechanism of injury). Hypertension was defined as a prior diagnosis with ongoing use of antihypertensive medication, or repeated blood pressure readings ≥140/90 mmHg on at least three separate occasions. Because smoking history was not consistently recorded in medical records, this information was obtained directly through participant interviews. Smoking was defined as the consumption of at least one cigarette per day for more than 6 months, and smoking status was categorized as ever-smoker or never-smoker.
Statistical analysis
Qualitative variables were described using frequencies, and quantitative variables were reported as mean ± standard deviation. The Chi-square test (χ²) was employed to investigate the association between the components of the Wells criteria and the presence of DVT. The positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity were also reported. A two-tailed p-value of <0.05 was considered statistically significant. All analyses were performed using IBM SPSS Statistics version 26 (IBM Corp., Armonk, NY).
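For readers who wish to reproduce this kind of test outside SPSS, the following is a minimal sketch of a Pearson Chi-square test for a 2×2 table, without continuity correction. The function name is hypothetical, and the choice of the uncorrected statistic is an assumption, since the article does not state which variant was used. Because component-level counts are not given in the text, the example applies the test, purely for illustration, to the cross-classification of Wells category by DVT status reported in the Results (207 and 72 in the likely group, 33 and 168 in the unlikely group).

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its two-sided p-value (1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: Wells >=2, Wells <=1; columns: DVT, no DVT (counts from the Results).
chi2, p_value = pearson_chi2_2x2(207, 72, 33, 168)
print(f"chi2 = {chi2:.2f}, significant at 0.05: {p_value < 0.05}")
```

With these counts the statistic is roughly 156, far beyond the 0.05 significance threshold stated above.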
Results
Baseline characteristics
Baseline demographic and anthropometric characteristics.
Past medical and habitual history
Past medical and habitual history.
Distribution of Wells criteria components by DVT status
The frequency and percentage distribution of Wells criteria components among participants, categorized by their DVT status, are shown in Figure 1. Statistically significant differences were observed between the DVT and non-DVT groups for active cancer (25.3% vs 8.8%, p = 0.001), recent paralysis or immobilization (18.8% vs 7.5%, p = 0.02), hospitalization ≥3 days or major surgery within 12 weeks (78.8% vs 13.8%, p < 0.001), calf swelling >3 cm compared with the other leg (96.3% vs 46.3%, p < 0.001), and pitting edema confined to the symptomatic leg (90% vs 15%, p < 0.001). No significant differences were found for localized tenderness along the deep venous system (68.8% vs 53.8%, p = 0.07), entire leg swelling (11.3% vs 2.5%, p = 0.06), and non-varicose superficial collateral veins (16.3% vs 10%, p = 0.35).
Radiographic findings
Ultrasound findings among patients diagnosed with DVT revealed unilateral involvement in 97.5% of cases, vein dilation in 90%, absent blood flow in 88.8%, non-compressible veins in 90%, and direct visualization of thrombus in 65%. No ultrasound features suggestive of DVT were identified in the non-DVT group (Figure 2).
Figure 2. Frequency of Doppler ultrasound findings in patients with DVT.
Using the predefined threshold (Wells score ≥2 = likely DVT; ≤1 = unlikely DVT), a total of 201/480 (41.9%) participants were classified as low risk (Wells ≤1) and 279/480 (58.1%) as likely DVT (Wells ≥2). Of the 201 participants in the low-risk group, 33 (16.4%) were confirmed to have DVT on ultrasonography. The remaining 168/201 (83.6%) low-risk participants did not have DVT. Among participants classified as likely DVT, 207/279 (74.2%) had confirmed DVT and 72/279 (25.8%) did not.
Diagnostic accuracy of Wells score for DVT prediction
Diagnostic accuracy of Wells criteria for DVT.
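The accuracy measures follow directly from the 2×2 counts reported above (207 true positives, 72 false positives, 33 false negatives, 168 true negatives). The short sketch below recomputes them; the function name and dictionary keys are illustrative and not part of the original analysis.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard accuracy measures for a binary test against a reference standard."""
    return {
        "sensitivity": tp / (tp + fn),            # DVT cases with Wells >=2
        "specificity": tn / (tn + fp),            # non-DVT with Wells <=1
        "ppv": tp / (tp + fp),                    # DVT among Wells >=2
        "npv": tn / (tn + fn),                    # no DVT among Wells <=1
        "low_risk_failure_rate": fn / (fn + tn),  # DVT among Wells <=1 (1 - NPV)
    }


# Counts taken from the Results: Wells >=2 vs <=1 against ultrasonography.
metrics = diagnostic_metrics(tp=207, fp=72, fn=33, tn=168)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

These counts give a sensitivity of 86.25%, specificity of 70.00%, PPV of 74.19%, and NPV of 83.58%, with a low-risk failure rate of 16.42%. Because cases and controls were enrolled in equal numbers, the PPV and NPV reflect the 50% DVT prevalence of this sample.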
Discussion
Hospitalization is recognized as one of the most significant risk factors for venous thromboembolism (VTE). 17 Since the clinical signs and symptoms of deep vein thrombosis (DVT) are often nonspecific, physicians tend to maintain a low threshold for performing lower limb ultrasonography to rule out DVT. 18 To reduce unnecessary ultrasonographic evaluations, Wells and colleagues developed a clinical pretest probability model for assessing patients suspected of having DVT, which has been widely used in clinical practice for several years. 19 The present study aimed to evaluate the predictive value of the Wells criteria in hospitalized patients diagnosed with DVT. Our findings demonstrated that the sensitivity and specificity of the Wells criteria for diagnosing DVT were 86.25% and 70.0%, respectively, indicating good diagnostic sensitivity. The positive and negative predictive values were 74.19% and 83.58%, respectively. These results suggest that while the Wells criteria have an acceptable diagnostic performance, they are not sufficient as a standalone diagnostic tool and should instead be considered a useful method for establishing clinical suspicion. Globally, numerous studies have assessed the diagnostic accuracy of the Wells criteria in various patient populations.20,21
Sartori et al. 22 reported that a Wells score of ≥1 had a sensitivity of 81% and a specificity of 43%, demonstrating a similar sensitivity but lower specificity compared to our findings. Unlike our study, they used one point as the cutoff for probable DVT, whereas our cutoff was two points, which may explain the lower specificity observed in their results. In a study by Modi et al. 6 involving trauma patients, the sensitivity and specificity of the Wells criteria were 67% and 90%, respectively, showing lower sensitivity but higher specificity compared with our findings. These differences may be attributed to variations in study populations. In contrast, Haenssle et al. 23 reported that the frequency of DVT did not differ significantly between patients with high and low Wells scores, a finding inconsistent with ours. They found a sensitivity of 25% and a specificity of 70%, the latter being consistent with our results. Their low sensitivity may be explained by the fact that all patients were hospitalized for dermatological conditions, in which clinical manifestations similar to DVT (e.g., erysipelas) may confound diagnosis. Similarly, Luksameearunothai et al. 24 reported that a Wells score of ≥2 had a sensitivity of 46.7% and a specificity of 80.5%, that is, lower sensitivity but higher specificity than in our study. They also found that a score of ≥3 resulted in a sensitivity of 13.3% but a specificity of 98.7%, indicating that increasing the Wells score threshold reduces sensitivity but improves specificity. This suggests that ultrasonography is necessary in patients with high Wells scores.
Wang et al. 7 found a sensitivity of 62.3% and a specificity of 72%, showing lower sensitivity but comparable specificity to our study. In a meta-analysis by Novielli et al., 25 the sensitivity and specificity of the Wells criteria alone were 87.9% and 49.7%, respectively—showing similar sensitivity but lower specificity. Subramaniam et al. 26 also reported lower diagnostic performance, with a sensitivity of 75% and specificity of 55% for the modified Wells criteria. Taken together, the heterogeneity in reported sensitivity and specificity across studies suggests that diagnostic performance varies by patient population. Certain diseases can mimic DVT symptoms, leading to potential diagnostic confusion. Moreover, several studies have shown that the Wells criteria are more accurate for proximal DVT than for distal DVT. 27 Differences in sample size across studies may also contribute to the variability of results.
One approach to enhance the diagnostic performance of the Wells criteria is to combine them with D-dimer testing, as supported by multiple studies. It has been well established that DVT cannot be reliably excluded in patients with a low Wells score without a D-dimer test. 28 Sartori et al. 22 similarly found that a low Wells score could not completely rule out DVT. Goodacre et al. 29 noted that performing ultrasonography in all patients was not an efficient use of healthcare resources, and recommended combining clinical probability (Wells score) with D-dimer testing to improve diagnostic sensitivity. In the meta-analysis by Novielli et al., 25 combining a positive D-dimer test with a high Wells score yielded a 99.2% sensitivity for DVT detection—higher than either test alone, albeit with lower specificity.
Our analysis also showed that, apart from localized tenderness along the deep veins, whole-leg swelling, and presence of collateral superficial veins, all other individual components of the Wells criteria were significantly more prevalent in DVT patients. Collateral veins often develop as a result of chronic DVT, explaining why this sign was less frequent in acute cases. 30 Furthermore, tenderness and leg swelling can also occur in other conditions such as lower-limb trauma, potentially mimicking DVT symptoms.
In our study, the low-risk category (Wells ≤1) demonstrated a failure rate of 16.4%, with 33 of 201 low-risk patients ultimately diagnosed with DVT. This finding is clinically important because the Wells score was originally designed to help identify a group in whom DVT could be safely excluded without immediate imaging. However, inpatient studies consistently show higher failure rates in the low-risk subgroup compared to outpatient populations. For example, Silveira et al. 15 reported substantial misclassification when outpatient-derived decision rules were applied to hospitalized patients, and the accompanying commentary emphasized the potential dangers of relying solely on clinical probability in this setting. Our findings align with these observations and reinforce that the Wells score should not be used as a stand-alone tool to rule out DVT in hospitalized patients. Instead, integration with D-dimer testing or imaging is necessary to ensure diagnostic safety in inpatient populations.
Conclusion
The present study demonstrated that the sensitivity, specificity, positive predictive value, and negative predictive value of the Wells criteria for DVT diagnosis were 86.25%, 70.0%, 74.19%, and 83.58%, respectively. These findings indicate that in hospitalized patients, the Wells criteria are useful for establishing clinical suspicion of DVT but are not sufficient for definitive diagnosis or exclusion. Therefore, additional laboratory or imaging tests remain necessary for accurate diagnosis.
Limitations
This study is not without limitations. As a two-center investigation, its findings may not be fully generalizable to the broader population. Hence, larger multicenter studies with more diverse populations are recommended to validate these results.
Footnotes
Acknowledgements
We thank all patients and their families and Mazandaran University of Medical Sciences for their support.
Ethical considerations
The Ethics Committee of the University of Medical Sciences approved this study.
Consent to participate
Informed consent was obtained from all patients and their legal guardians to participate in this study.
Consent for publication
Written informed consent was obtained from the patients for publication. Copies of the written consent are available for review by the Editor-in-Chief of this journal on request.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
Contributorship
S.K and S.N were involved in the interpretation and collection of data. S.K, S.N, M.A and F.G were involved in writing, and editing of the manuscript. All the authors reviewed the paper and approved the final version of the manuscript.
Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this work, the authors used ChatGPT4 in order to increase the reliability of this document. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Provenance and peer review
Not commissioned, externally peer-reviewed.
Guarantor
Dr Farnaz Godazandeh. Assistant Professor of Radiology, Department of Radiology and Nuclear Medicine, School of Medicine, Sari Imam Khomeini Hospital, Mazandaran University of Medical Sciences. Email:
