Abstract
Background
Peripheral artery disease (PAD) outcomes often rely on the expertise of individual vascular units, introducing potential subjectivity into disease staging. This retrospective, multicenter cohort study aimed to demonstrate the ability of artificial intelligence (AI) to provide disease staging based on inter-institutional expertise by predicting limb outcomes in post-interventional pedal angiograms of PAD patients, specifically in comparison to the inframalleolar modifier in the Global Limb Anatomic Staging System (IM GLASS).
Methods
We used computer vision (CV) based on the MobileNetV2 model, implemented via the TensorFlow.js library, for transfer learning and feature extraction from 518 pedal angiograms of PAD patients with known 3-month limb outcomes: 218 salvaged limbs, 140 minor amputations, and 160 major amputations.
Results
After 43 epochs of training with a learning rate of 0.001 and a batch size of 16, the model achieved a validation accuracy of 95% and a test accuracy of 93% in differentiating salvaged limbs from amputations. In manual testing with 45 angiograms excluded from the training, validation, and test processes, the AI predicted mean limb salvage probabilities of 96% for actual salvaged limbs, 27% for minor amputations, and 17% for major amputations (p-value < .001). The correlation coefficient between the CV model-predicted outcome and the actual outcome for these 45 angiograms was 0.7, nearly five times higher than that between the IM GLASS pattern and the actual outcome (0.14).
Conclusion
Computer vision can analyze angiograms and predict disease outcomes, demonstrating a significant correlation between predicted and actual limb salvage rates and outperforming IM GLASS segmentation by a vascular specialist. It has the potential to provide immediate and precise feedback on treatment results during vascular interventions, tailored to (inter)institutional expertise, and to enhance individualized decision-making.
Introduction
Accurately predicting disease outcomes based on evidence not only guides clinicians in decision-making but also serves as a tool for communicating with patients and building a foundation for clinical research. This can optimize disease outcomes, reduce disparities and discrimination among specific patient cohorts, and lower treatment costs.
Assessment of tissue perfusion and peripheral circulation is fundamental in managing patients with peripheral artery disease (PAD), serving as a key predictor of outcomes, particularly when considering revascularization or amputation surgery. 1 Current evaluations of distal circulation and tissue perfusion in PAD predominantly rely on measurements of peripheral intravascular pressure, oxygen saturation, and blood flow characteristics. The Wound, Ischemia, and foot Infection (WIfI) classification, part of the evidence-based revascularization (EBR) framework, is the primary tool used for predicting limb salvage in cases of chronic limb-threatening ischemia (CLTI). 1 Although the WIfI classification is well established in both clinical research and daily practice, recent studies highlight certain limitations that must be critically considered. One of the key limitations is the non-linear correlation between early WIfI stages and amputation probability, likely due to the predominant influence of subjectively assessed wound severity on outcomes, 2 which may be influenced by the individual expertise of the particular vascular unit.3,4
Another validated method for predicting limb outcomes in PAD is peripheral angiography. Angiography aids in selecting optimal revascularization strategies, assessing their necessity, and predicting outcomes using various scoring systems. One such system, recommended by the EBR framework, is the Global Limb Anatomic Staging System (GLASS). 1 GLASS is used to assess the femoropopliteal (FP) and infrapopliteal (IP) levels to predict reocclusion rates, and the inframalleolar (IM) levels to predict limb salvage. 1
IM GLASS segmentation divides sagittal pedal angiograms into P0, P1, and P2 disease patterns, serving as a critical indicator of peripheral circulation and a predictor of amputation probability. 1 Previous studies have established a link between poor outcomes and diminished perfusion (P2 pattern).5,6 However, discrepancies in amputation predictions using the IM GLASS system 7 suggest potential subjectivity, likely due to the complex angiographic anatomy and technical challenges such as contrast volume, angulation, and fluoroscopy duration.
Hypothesis
Current peripheral artery disease (PAD) staging systems are prone to subjective interpretation, as clinicians may vary in their assessment of wound severity in the WIfI classification and angiographic patterns in IM GLASS segmentation, depending on the expertise of their vascular units. These limitations emphasize the need for more objective PAD staging algorithms tailored to (inter)institutional expertise.
Artificial Intelligence (AI) encompasses various techniques for image analysis, including the specialized field of computer vision (CV). Computer vision enables tasks such as image classification, object recognition (including tracking), image segmentation (including enhancement and automatic measurements), and image generation (including 3D reconstruction and image-text processing) based on previously collected data. 8 With its pixel-by-pixel memory from the training dataset, CV can objectively assess image similarity. This capability makes CV particularly useful for analyzing complex medical images, which may be difficult for human interpretation. Specifically for PAD, a CV model built on previously collected angiographic data could predict disease outcomes while minimizing the influence of subjective interpretation.9,10 Such a model built on (inter)institutional data can serve as a baseline, adjusting the experience of a particular vascular unit to the staging systems currently recommended in the guidelines. 1
Aim
This study aimed to demonstrate the ability of CV to identify patterns associated with limb salvage and amputation in pedal angiograms of patients with PAD, with the goal of enhancing existing staging systems and adapting them to (inter)institutional expertise.
Material and methods
This study was designed as a multicenter research project, with retrospective data collection conducted at two university institutions in Germany, each housing separate vascular surgery units. The primary study center was the University Hospital Leipzig, which obtained ethical approval for the study (YR), and the second center was Charité-Universitätsmedizin Berlin. The methodology encompassed data collection, preprocessing, model development, and testing.
Data collection
Anonymized angiograms of the lower limbs were collected from PAD patients at stages ranging from claudication to CLTI, across all WIfI stages. The angiograms were sourced from the vascular surgery departments mentioned above. Specifically, digital subtraction angiograms (DSAs) of the pedal arteries in the sagittal plane, corresponding to IM GLASS disease patterns, were utilized. Both institutions involved in this study used C-Arms equipped with DSA capability for vascular diagnostics, ensuring a consistent baseline across the dataset. These angiograms were obtained following endovascular, open, or hybrid arterial blood flow restoration procedures, excluding deep venous arterialization procedures. The time between angiography and the recorded outcome (limb salvage or amputation) did not exceed 3 months. In our dataset, a prioritization system was applied: major amputation had priority over minor amputation and limb salvage, while minor amputation was prioritized over limb salvage. The cases with resolved vascular events were included for model training. This implies that while it is possible that some patients may have experienced limb loss beyond the 3-month period (e.g., if a patient transferred to another vascular unit without notifying the study unit), such instances should not be associated with the specific angiograms used for training.
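The prioritization rule described above can be sketched as a small labeling helper. This is an illustrative sketch only: the label strings and the function name `resolveOutcome` are hypothetical and are not taken from the study's code.

```javascript
// Outcome labels ordered by priority: a higher rank overrides a lower one.
// The label strings are illustrative names, not identifiers from the study.
const OUTCOME_RANK = { salvage: 0, minor_amputation: 1, major_amputation: 2 };

// Returns the single label for one angiogram, given every outcome event
// recorded for that limb within the 3-month window.
function resolveOutcome(events) {
  let label = "salvage"; // no recorded amputation means the limb was salvaged
  for (const event of events) {
    if (!Object.prototype.hasOwnProperty.call(OUTCOME_RANK, event)) {
      throw new Error(`Unknown outcome event: ${event}`);
    }
    if (OUTCOME_RANK[event] > OUTCOME_RANK[label]) {
      label = event;
    }
  }
  return label;
}
```

With this rule, a limb with both a minor and a later major amputation inside the window is labeled as a major amputation.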
Other clinical predictors, such as patient characteristics and disease patterns, were intentionally not incorporated into this study to specifically evaluate the ability of AI to predict disease outcomes using solely anonymized angiographic data, without the influence of additional clinical variables. By focusing exclusively on pedal angiograms with only one known outcome (limb salvage or amputation within 3 months after the last revascularization), we aimed to determine whether AI could identify pertinent patterns in imaging data alone. Furthermore, ethical considerations and data protection regulations, particularly those governing medical AI model development, 11 necessitated the use of anonymized data, which was only possible by reducing the inclusion of clinical predictors to a minimum. It is noteworthy that the number of claudicants and early-stage WIfI cases was higher in the salvaged limb group.
Data collection and labeling were performed by YR and VL, both vascular surgeons with over 10 years of experience. Angiograms were categorized based on IM GLASS disease patterns and divided into three outcome groups:
- Salvaged limbs.
- Minor amputations (limited to the foot).
- Major amputations (above the ankle).
Still images free from textual content, movement, and foreign body artifacts, and with the highest distribution of contrast medium in the vessels—therefore most likely to indicate limb salvage—were used for model training. Cine images were not utilized for model training due to variable frame rates and a higher incidence of artifacts, which could introduce bias into CV image interpretation.
Data preprocessing
The angiograms were manually cropped to an aspect ratio between 0.667 and 1.5 (between 1:1.5 and 1.5:1) and augmented by horizontal flipping. This augmentation was applied to increase the diversity of the dataset, ensuring that the model could better generalize across different anatomical variations and imaging orientations. Each angiogram was automatically bilinearly resized to 224 × 224 pixels during input for model training and prediction in the image classification machine (Figure 1). The data was normalized by dividing all pixel values by 255 to ensure consistent input scaling for the model.
Figure 1. Bilinearly resized angiograms excluded from the training, validation, and test processes, showing prediction results during manual testing of the model. Left: true salvaged limb; right: true major amputation outcome.
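The preprocessing steps can be illustrated with a few dependency-free helpers. This is a minimal sketch, assuming a single-channel image stored as a row-major array of 8-bit intensities. The function names are hypothetical, and the bilinear resize to 224 × 224 is left out because the study delegated it to the library.

```javascript
// Aspect-ratio limits quoted in the text (width / height).
const MIN_ASPECT = 2 / 3;
const MAX_ASPECT = 3 / 2;

// True when a crop of the given size lies within the allowed aspect ratios.
function hasAllowedAspectRatio(width, height) {
  const ratio = width / height;
  return ratio >= MIN_ASPECT && ratio <= MAX_ASPECT;
}

// Mirror a row-major, single-channel image left to right (augmentation step).
function flipHorizontal(pixels, width, height) {
  const flipped = new Array(pixels.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      flipped[y * width + x] = pixels[y * width + (width - 1 - x)];
    }
  }
  return flipped;
}

// Scale 8-bit intensities (0..255) into the range 0..1.
function normalize(pixels) {
  return pixels.map((value) => value / 255);
}
```

Flipping doubles the number of images per class, which is why the limitations section treats augmentation as a possible source of optimistic test metrics.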
CV model development
We used the full MobileNetV2 layer model 12 for transfer learning, implemented via the TensorFlow.js 13 library. MobileNetV2 is particularly well-suited for this task due to its efficiency and ability to work effectively with smaller datasets while still achieving high performance in image classification. Its lightweight architecture enables faster processing without compromising accuracy, making it highly suitable for medical image analysis. The model’s performance has been validated in previous medical research.14–17 Additionally, this approach allowed the model to be trained and deployed directly on the user’s side, ensuring that no data was transferred to third-party servers, thereby maintaining complete data privacy.
For the feature extraction phase, the global average pooling layer of MobileNetV2 was used as the base. From there, we developed a custom model to classify the angiograms. This model used a feedforward neural network, which consists of multiple layers that process the data step by step to help the model recognize patterns in the images. These layers had progressively smaller units (1024, 512, 256, and 128), and additional techniques such as activation functions and dropout were applied to help the model learn effectively and avoid overfitting. The model was compiled using the Adam optimizer. The final layer used a softmax activation function to classify the angiograms into different categories.
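To give a sense of the size of such a classification head, the sketch below counts the trainable parameters of the dense layers and shows a numerically stable softmax. Two values are assumptions that the text does not state: a pooled feature vector of 1280 values (the output size of the full-width MobileNetV2) and three output classes. The function names are hypothetical.

```javascript
// Assumptions (not stated in the text): the pooled MobileNetV2 feature vector
// has 1280 values and the head ends in 3 output classes.
const FEATURE_SIZE = 1280;
const HIDDEN_UNITS = [1024, 512, 256, 128]; // layer sizes given in the text
const NUM_CLASSES = 3; // hypothetical

// Weights plus biases of one fully connected layer.
function denseParams(inputs, units) {
  return inputs * units + units;
}

// Total trainable parameters of the dense head.
function headParams(featureSize, hiddenUnits, numClasses) {
  let total = 0;
  let inputs = featureSize;
  for (const units of [...hiddenUnits, numClasses]) {
    total += denseParams(inputs, units);
    inputs = units;
  }
  return total;
}

// Numerically stable softmax: turns raw scores into probabilities summing to 1.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

Under these assumptions the head has about two million trainable parameters, most of them in the first dense layer. That is large relative to a dataset of several hundred images and helps explain the use of dropout and early stopping.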
To handle the uneven distribution of images across different categories, we adjusted the class weights, ensuring that the model trained in a balanced way, regardless of how many images were in each class.
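One common way to derive such weights is inverse class frequency. The sketch below uses that "balanced" heuristic as an assumption, since the text does not give the formula that was applied; the function name and class labels are hypothetical, while the counts are those reported for the collected dataset.

```javascript
// Inverse-frequency weights: weight_c = total / (numClasses * count_c).
// This "balanced" heuristic is an assumption; the text does not give a formula.
function balancedClassWeights(counts) {
  const labels = Object.keys(counts);
  const total = labels.reduce((sum, label) => sum + counts[label], 0);
  const weights = {};
  for (const label of labels) {
    weights[label] = total / (labels.length * counts[label]);
  }
  return weights;
}

// Class counts reported for the 518 collected angiograms.
const weights = balancedClassWeights({ salvage: 218, minor: 140, major: 160 });
console.log(weights);
```

For these counts the weights are approximately 0.79 (salvage), 1.23 (minor amputation), and 1.08 (major amputation), so errors on the smallest class contribute the most to the loss.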
We used a standard data split to train, validate, and test the model: 80% for training, 10% for validation, and 10% for testing. The validation dataset was used to track the cross-entropy loss and accuracy per epoch, preventing overfitting through an automatic early stopping mechanism, which halted training if no improvement was observed after 10 epochs. The test dataset was used to assess the performance of the trained model using metrics such as accuracy, loss, precision, recall, ROC AUC, and the confusion matrix. The preset hyperparameters for training were a batch size of 16 and a learning rate of 0.001, with a scheduler that reduced the learning rate by 5% each epoch.
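The learning-rate schedule and the early stopping rule can be sketched without any library. Two points are assumptions: the 5% reduction is read as a multiplicative decay of 0.95 per epoch, and "improvement" is taken to mean a strictly lower validation loss. The function names are hypothetical.

```javascript
// Learning rate after `epoch` completed epochs with a 5% reduction per epoch.
function learningRateAt(epoch, initialRate = 0.001, decay = 0.95) {
  return initialRate * Math.pow(decay, epoch);
}

// Returns the index of the epoch after which training stops, or -1 if the
// patience is never exhausted. "Improvement" means a strictly lower
// validation loss, which is an assumption.
function earlyStoppingEpoch(valLosses, patience = 10) {
  let best = Infinity;
  let epochsWithoutImprovement = 0;
  for (let epoch = 0; epoch < valLosses.length; epoch++) {
    if (valLosses[epoch] < best) {
      best = valLosses[epoch];
      epochsWithoutImprovement = 0;
    } else {
      epochsWithoutImprovement += 1;
      if (epochsWithoutImprovement >= patience) {
        return epoch;
      }
    }
  }
  return -1;
}
```

Under this reading, the learning rate after 43 epochs would have fallen to roughly 0.00011, about one ninth of its starting value.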
Manual test
Before training, we extracted 45 randomly selected angiograms (10% of the standard test data split) from the main dataset. These angiograms were excluded from the training, validation, and test processes. The manual test involved comparing the predicted limb salvage probabilities to the actual outcomes for each subgroup using these 45 angiograms. The difference in mean limb salvage probabilities between subgroups was assessed using an unpaired two-tailed t-test. Additionally, we compared the correlation coefficient between the actual limb outcomes and the CV-predicted outcomes with the correlation coefficient between the limb outcomes and IM GLASS disease patterns, segmented by a vascular specialist.
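The correlation comparison can be reproduced with a plain Pearson coefficient. The sketch below is illustrative. The numeric coding of the variables (for example 1 for limb salvage and 0 for amputation, or 0, 1, 2 for P0, P1, P2) is an assumption, as the text does not state how the outcomes and patterns were encoded, and the sample values are made up.

```javascript
// Pearson correlation coefficient between two equally long numeric arrays.
function pearson(x, y) {
  if (x.length !== y.length || x.length < 2) {
    throw new Error("Inputs must have the same length of at least 2.");
  }
  const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;
  const meanX = mean(x);
  const meanY = mean(y);
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < x.length; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Made-up example: actual outcome (1 = salvage, 0 = amputation) against
// predicted salvage probability for six hypothetical angiograms.
const actual = [1, 1, 1, 0, 0, 0];
const predicted = [0.95, 0.9, 0.4, 0.3, 0.6, 0.1];
console.log(pearson(actual, predicted));
```

The same function can be applied to the outcome and the specialist-assigned pattern score, which allows the two coefficients to be compared on the same held-out images.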
To evaluate potential biases in the model and improve angiogram selection, we incorporated a random image dataset from the Kaggle platform 18 as an additional “bias” class (negative or out-of-distribution class). This exposed the model to non-angiographic images, ensuring it could differentiate between valid pedal angiograms and irrelevant or low-quality images. The effectiveness of this function was assessed by calculating the mean probability of bias in 15 angiograms containing artifacts, such as textual information, movement artifacts, or additional image details.
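In practice, a "bias" class of this kind can act as a gate in front of the outcome prediction. The following sketch shows one way to do that. The 0.5 threshold, the property names, and the function name are hypothetical, since the text reports bias probabilities but no decision cut-off.

```javascript
// Hypothetical cut-off: the text reports bias probabilities, not a threshold.
const BIAS_THRESHOLD = 0.5;

// Takes class probabilities for one image, for example
// { salvage: 0.1, amputation: 0.1, bias: 0.8 }, and decides whether the
// image should be rejected before its outcome prediction is read.
function triagePrediction(probabilities, threshold = BIAS_THRESHOLD) {
  if (probabilities.bias >= threshold) {
    return { accepted: false, reason: "possible artifact or non-angiographic image" };
  }
  // Drop the bias entry and report the most probable remaining class.
  const { bias, ...outcomes } = probabilities;
  const [label, probability] = Object.entries(outcomes).sort((a, b) => b[1] - a[1])[0];
  return { accepted: true, label, probability };
}
```

An image flagged by the gate would be recropped or replaced rather than interpreted, which matches the stated purpose of improving angiogram selection.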
Statistical analysis
Numerical data collection and statistical analysis, including the t-test and correlation coefficient calculation, were performed using Excel (Microsoft Office 2016, USA). A p-value of < 0.05 was considered statistically significant.
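For readers who want to reproduce the group comparison outside a spreadsheet, the test statistic of an unpaired t-test can be computed as below. This sketch uses Welch's unequal-variance form, which is an assumption because the text does not say which variance option was selected. It returns only the statistic and degrees of freedom, since converting them to a p-value requires the t-distribution and is left to a statistics package.

```javascript
// Arithmetic mean of a numeric array.
function sampleMean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(values) {
  const m = sampleMean(values);
  return values.reduce((acc, v) => acc + (v - m) ** 2, 0) / (values.length - 1);
}

// Welch's unpaired t statistic and its approximate degrees of freedom
// (Welch-Satterthwaite equation).
function welchT(a, b) {
  const va = sampleVariance(a) / a.length;
  const vb = sampleVariance(b) / b.length;
  const t = (sampleMean(a) - sampleMean(b)) / Math.sqrt(va + vb);
  const df = (va + vb) ** 2 / (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}
```

Applied to the predicted salvage probabilities of two outcome subgroups, a large absolute `t` corresponds to the small p-values reported for the manual test.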
Results
To develop and test the model, we collected 518 pedal angiograms: 218 from salvaged limbs, 140 from minor amputations, and 160 from major amputations. The distribution of IM GLASS disease patterns among the groups was as follows:
- Salvaged limbs: Mean IM GLASS 0.72 (SD 0.6). P0 = 79, P1 = 120, P2 = 19 angiograms.
- Minor amputations: Mean IM GLASS 1.05 (SD 0.73). P0 = 34, P1 = 65, P2 = 41 angiograms.
- Major amputations: Mean IM GLASS 1.4 (SD 0.72). P0 = 22, P1 = 45, P2 = 93 angiograms.
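The reported means and standard deviations follow from the pattern counts if the patterns are scored as P0 = 0, P1 = 1, and P2 = 2. That scoring is an assumption, although it agrees with the published values after rounding. The short check below recomputes them; the function name is hypothetical.

```javascript
// Mean and sample standard deviation of pattern scores from a count table,
// where counts[k] is the number of angiograms with pattern Pk (scored as k).
function summarize(counts) {
  const n = counts.reduce((a, b) => a + b, 0);
  const mean = counts.reduce((acc, c, score) => acc + c * score, 0) / n;
  const squared = counts.reduce((acc, c, score) => acc + c * (score - mean) ** 2, 0);
  return { n, mean, sd: Math.sqrt(squared / (n - 1)) };
}

// Counts of P0, P1, and P2 patterns as reported for each outcome group.
const groups = {
  salvaged: summarize([79, 120, 19]),
  minor: summarize([34, 65, 41]),
  major: summarize([22, 45, 93]),
};
console.log(groups);
```

The group sizes come out as 218, 140, and 160, and the means as roughly 0.72, 1.05, and 1.44. The pattern score therefore rises with outcome severity, but the three distributions overlap considerably.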
Validation and test
Table: Performance metrics of the main model.
Figure 2. Confusion matrix of the main model’s training.
Figure 3. Accuracy and loss per epoch during the main model’s training.
According to the confusion matrix (Figure 2), the main model performed slightly better in identifying true salvaged limbs. In the test split, the model misclassified only 3 out of 40 actual salvaged limb angiograms as amputations. Of 54 actual amputations, 6 were incorrectly classified as salvaged limbs. A potential cause of these misclassifications could be the specific patterns of minor amputation angiograms. In additional training with separate subgroups for major amputations, minor amputations, and salvaged limbs, 10 out of 14 misclassified angiograms were associated with minor amputation patterns (Figure 4). No bias was detected during the validation process, indicating the dataset’s suitability for model development (Figure 2).
Figure 4. Training results of the model with separate classes for major amputation, minor amputation, and limb salvage, shown in the confusion matrix. Amp: amputation.
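The counts quoted for the two outcome classes can be turned into per-class rates as follows. This is a worked illustration that uses only those four numbers and treats "salvaged" as the positive class by convention. It is not meant to reproduce the overall metrics reported for the full test split.

```javascript
// Counts quoted in the text for the test split (two outcome classes only).
const salvaged = { total: 40, misclassified: 3 };
const amputated = { total: 54, misclassified: 6 };

// Proportion of correct cases among all cases in a category.
function rate(correct, total) {
  return correct / total;
}

// Treating "salvaged" as the positive class (an arbitrary convention here).
const truePositives = salvaged.total - salvaged.misclassified;   // 37
const falseNegatives = salvaged.misclassified;                   // 3
const trueNegatives = amputated.total - amputated.misclassified; // 48
const falsePositives = amputated.misclassified;                  // 6

const recall = rate(truePositives, truePositives + falseNegatives);
const precision = rate(truePositives, truePositives + falsePositives);
const specificity = rate(trueNegatives, trueNegatives + falsePositives);
console.log({ recall, precision, specificity });
```

From these counts, recall for salvaged limbs is 0.925 and specificity is about 0.889. This is consistent with the observation that the model identified salvaged limbs slightly better than amputations.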
Manual test
The main model was tested using 45 angiograms, with the distribution among limb outcome classes as follows:
- 15 salvaged limbs: Mean IM GLASS 0.93 (SD 0.8).
- 15 minor amputations: Mean IM GLASS 1.07 (SD 0.8).
- 15 major amputations: Mean IM GLASS 1.2 (SD 0.86).
The model demonstrated significant predictive ability (p-value < .001) in differentiating amputations from salvaged limbs in the test angiograms. The mean probability of limb salvage was 96% for actual salvaged limbs, 27% for minor amputations, and 17% for major amputations. The model correctly classified 14 angiograms in the salvaged limb group, 11 in the minor amputation group, and 13 in the major amputation group. Of the 6 angiograms from amputated limbs that the model incorrectly predicted as salvaged, 3 corresponded to IM GLASS P0 patterns and 3 to P2 patterns. The single angiogram of a salvaged limb incorrectly predicted as an amputation corresponded to the IM GLASS P2 pattern. The correlation coefficient between the CV model-predicted outcome and the actual outcome for these 45 angiograms was 0.7, nearly five times higher than the correlation between the IM GLASS pattern and the actual outcome (0.14).
In an additional bias test involving 15 angiograms with artifacts, the model’s bias function was activated in all 15 cases, with a mean bias probability of 90% (SD 18.2). This demonstrated the model’s ability to detect poor-quality angiograms. The presence of bias negatively impacted the model’s performance, reducing its ability to accurately predict salvage and amputation outcomes.
Examples of the model’s predictive abilities are shown in Figure 1.
Discussion
In this study, we demonstrated the capability of CV to identify angiographic patterns associated with amputation events in patients with PAD, using only anonymized pedal angiograms without additional clinical data. The model successfully classified previously unseen angiograms and accurately predicted limb outcomes, showing a significant correlation between predicted probabilities and actual limb salvage rates, outperforming IM GLASS segmentation by a vascular specialist. The features extracted by the model from the angiograms extended beyond the standard interpretation of IM GLASS pedal loop characteristics. The model evaluated various parameters, such as the collateral network, vessel structure, their proportions within the field of view, and additional image characteristics that may not be easily explained by human analysis.
The challenges in pattern interpretation were mainly associated with the minor amputation subgroup, due to its similarity with both the amputation and salvage classes. Eliminating this subgroup could potentially enhance the model’s predictive abilities, but at the cost of reduced generalization. Although the model was not specifically trained to distinguish between major and minor amputations, the predicted probability of amputation differed between the two subgroups, with a slightly higher probability observed in cases of actual major amputations.
Practical standpoint
Such a CV model has the potential to be (re)trainable at the (inter)institutional level. Modern ML libraries allow these models to be deployed as web applications that operate entirely client side (Figure 1).10,19 This means that the model can run in a privacy-preserving manner on any device that supports WebGL, 20 including smartphones and laptops, allowing for broad accessibility without the need for integration into the existing imaging infrastructure of a particular vascular unit. Users can utilize the Web AI model through image capture or even a photo taken with a smartphone. 10
This approach offers several potential applications for clinical practice:
- Decision Support: The model can assist clinicians in decision-making, complementing the current EBR framework. 1 It could be particularly useful in borderline cases, especially for patients with limited life expectancy, where prolonged wound healing could deplete the patient’s reserves and lead to complications or poor outcomes, as well as higher treatment costs.
- Strengthening Patient Involvement in Decision-Making: The tool enables patients to objectively interpret their angiograms without requiring additional medical knowledge, thereby enhancing collaboration with vascular specialists and improving adherence to therapy.
- Institutional Expertise: A CV model built on institutional data can reflect an institution’s specific experience and expertise in treating PAD, allowing for the inclusion of individual pedal circulation in disease staging. Comparing models from different institutions can help identify centers where specific disease patterns may yield better outcomes.
- Quality Indicator: The model could also serve as a quality indicator. A reduction in the predicted amputation probability at the conclusion of a vascular intervention could indicate a successful procedure and provide a measurable outcome for quality assurance.
- Patient Follow-up: The model could be used to follow up with PAD patients. For example, an increased probability of amputation, particularly in early WIfI stages or claudicants predicted as high-risk, would suggest that these patients require more frequent monitoring and possibly more aggressive treatment strategies.
- Research and Standardization: A larger, inter-institutional model built on big data could serve as a valuable research tool. Such a model could establish a baseline for patient outcomes, with improvements over the predicted outcomes indicating the effectiveness of new treatments. Such a model could be built through the centralization of data by a dedicated medical society, medical insurance company, or as a decentralized research tool.
Limitations
- The primary limitation of this research is the use of a holistic image classification algorithm. Segmentation of an appropriate pattern could potentially outperform our model and deliver better results. However, the challenge lies in choosing the right pattern, as this pattern is not yet well defined. The previously observed discrepancy in clinical outcomes with the IM GLASS pattern 7 has led us to develop a model based on the memory of “stuff” rather than “things.” This semantic approach highlights the need for a large database and interinstitutional collaborations to create a powerful open-source model.
- Second, resizing the angiograms could influence the model’s predictive performance. Proper preprocessing of angiograms is essential for both prediction and training.
- Third, augmenting the dataset with horizontally flipped angiograms may lead to an overestimation of the model’s performance. However, this does not affect the separate manual testing with angiograms that were excluded from training, validation, and test.
- Fourth, the outcome data were limited to a 3-month period. While the model was trained using cases with resolved outcome events, some pedal wounds may not fully heal within this timeframe. A longer follow-up period could provide a more comprehensive reflection of long-term outcomes and should be explored in future investigations.
- Fifth, due to the anonymized nature of the dataset, detailed clinical parameters were not included in this study. This exclusion was intentional for two main reasons: first, to focus on assessing the predictive value of isolated pedal angiograms without additional biases, and second, to protect patient privacy, particularly when building an AI model using their data. Despite these constraints, the foundational model was able to predict outcomes effectively without additional clinical data. Future studies should explore designs that incorporate clinical data to assess the potential benefits of a hybrid approach combining computer vision and numerical machine learning.
- Sixth, this study used only still images to build the model. While dynamic imaging could potentially enhance prediction accuracy, video processing across different institutions and specialists presents challenges such as varying frame rates, magnification differences, and artifacts that could significantly impact model performance. Properly cropped static images help to mitigate these biases and can still be utilized to assess dynamic series by selecting the frame with the highest probability of limb salvage. Future research is warranted to evaluate the potential benefits of integrating dynamic imaging techniques.
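The frame selection idea mentioned in the last limitation can be sketched as a simple maximum search over per-frame predictions. The input shape, property names, and function name below are hypothetical. The probabilities are assumed to come from running a classifier on each frame of a dynamic series.

```javascript
// Picks the frame whose predicted limb-salvage probability is highest.
// `frames` is a hypothetical array of { index, salvageProbability } objects,
// for example produced by classifying every frame of a DSA series.
function selectBestFrame(frames) {
  if (frames.length === 0) {
    throw new Error("The series contains no frames.");
  }
  return frames.reduce((best, frame) =>
    frame.salvageProbability > best.salvageProbability ? frame : best
  );
}
```

When several frames share the highest probability, this version keeps the earliest one. A different tie-break rule could be chosen if later frames are preferred.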
Conclusion
Computer vision can analyze angiograms and predict disease outcomes, demonstrating a significant correlation between predicted and actual limb salvage rates and outperforming IM GLASS segmentation by a vascular specialist. It has the potential to provide immediate and precise feedback on treatment results during vascular interventions, tailored to (inter)institutional expertise, and to enhance individualized decision-making.
Footnotes
Acknowledgements
Special thanks to Jason Mayes for providing the opportunity to explore TensorFlow.js.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical statement
Guarantor
YR.
Contributorship
YR: idea, building of the model and web application, draft of the manuscript. YR, VL, MD: study protocol. YR, VL: data collection and labeling. All authors: literature search, interpretation of data, editing of the manuscript, final approval of the manuscript.
