Abstract
Background:
The most prevalent extrathyroidal manifestation of Graves' disease (GD) is Graves' ophthalmopathy (GO). However, only a few methods allow for predictions of GO occurrence or progression in patients with GD.
Methods:
We retrospectively analyzed 1,074 patients with new-onset GD, and divided them into a derivation and a validation cohort based on the date of their GD diagnosis. We then separately analyzed clinical risk factors affecting the occurrence and progression of GO using multivariable regression analysis and created a predictive model based on the factors we identified as significant.
Results:
Of the 853 GD patients included in the derivation cohort, 101 (11.8%) developed GO. Those who developed GO were more likely to be smokers (25.7% vs. 8.5%, p < 0.001), were younger at the time of their GD diagnosis (35.0 years vs. 42.0 years, p < 0.001), more commonly had a family history of GD (27.7% vs. 17.2%, p = 0.015), and had higher thyrotropin-binding inhibitor immunoglobulin (TBII) levels at the time of their diagnosis (13.5 IU/L vs. 10.0 IU/L, p = 0.020) than those who did not develop GO. Of the 101 GO patients in the derivation cohort, after excluding 8 who initially had active and moderate-to-severe GO, 11 of the remaining 93 had progressed to more active or severe GO. GO patients with confirmed progression had a higher proportion of those older than 45 years (54.5% vs. 19.8%, p = 0.031), and they had a different initial clinical activity score distribution. The multivariable regression analysis identified age at GD diagnosis, sex, smoking history, family history of GD, total cholesterol level, and TBII level at the time of the diagnosis as significant risk factors of GO occurrence, and a predictive model including these risk factors was built to create a nomogram.
Conclusions:
The predictors of GO occurrence in patients with new-onset GD were female sex, positive smoking history, young age, family history of GD, high cholesterol level, and high TBII level. The predictive nomogram developed in this study may be useful in patient counseling and facilitating informed treatment decision-making.
Introduction
Graves' ophthalmopathy (GO) is the most common extrathyroidal symptom of Graves' disease (GD). 1 GO can occur even when thyroid function is normal or reduced, but it is typically accompanied by GD. GO is reported in ∼30–50% of GD patients. 2 Symptoms such as dry eyes, foreign body sensations, excessive tearing, double vision, and eye pain reduce the quality of life of patients with GO. In addition, since exophthalmos is caused by hypertrophy of the extraocular muscles and increased orbital adipose tissue, patients may experience depression or a lack of confidence owing to changes in their appearance. 3 Even when the thyroid function is normalized, GO symptoms can still progress or worsen in some patients. Conservative treatment is sufficient if GO is mild, but active treatment such as steroid therapy or radiation therapy is required if the disease progresses to severe ophthalmopathy. In rare cases, when GO is accompanied by symptoms such as compression of the optic nerve, emergency surgery is warranted.
Therefore, predicting the occurrence of GO in patients with GD and investigating whether there is a risk of developing severe GO are important. However, there are few tools for predicting GO that can easily be used in clinical practice settings.
The risk factors known to affect the occurrence and severity of GO include smoking, 4,5 high thyrotropin receptor antibody levels, 6 and radioactive iodine (RAI) treatment. 7 Recently, high plasma cholesterol levels have also been identified as a risk factor for GO. 8 However, because GO often develops in nonsmokers, patients with low autoantibody levels, and those who have not undergone radioiodine treatment, these risk factors are not predictive of GO in all patients.
The purpose of this study was to retrospectively identify risk factors affecting the occurrence and progression of GO in patients who had been treated for GD and followed up for a median of ∼5 years. In addition, we aimed to establish a model to predict the occurrence or progression of GO based on these risk factors.
Methods
Study design and subjects
The protocol for this retrospective cohort study was approved by the Institutional Review Board of Chung-Ang University Hospital (IRB No. 2104-013-19365), and informed consent was waived owing to the study's retrospective design. This study adheres to the guidelines of the Declaration of Helsinki and followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis reporting guidelines.
We included patients who were diagnosed with hyperthyroidism and prescribed antithyroid drugs at least twice from January 2011 to June 2021 at Chung-Ang University Hospital. A total of 3,375 patients were initially included, among whom 806 (23.9%) were diagnosed with GO. We first excluded 2,301 patients who were diagnosed with GO three months before the diagnosis of GD, those with an unknown GD or GO diagnosis date, and those who did not have new-onset GD. Of the remaining patients, 1,074 who were initially diagnosed with GD at our hospital and whose baseline characteristics were available for analysis were finally included in this study (Fig. 1). We divided these patients into derivation and validation cohorts based on the date of their diagnosis of hyperthyroidism. The 2011–2017 cohort (consisting of 853 patients) was used as the derivation cohort to develop a predictive model, and the 2018–2021 cohort (221 patients) was used as the validation cohort.
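The temporal split described above can be sketched as follows. This is only an illustration in Python: the function name `assign_cohort` and the exact calendar cut-offs (1 January 2011, 31 December 2017, 30 June 2021) are assumptions inferred from the stated enrollment window and cohort years, not code from the study.

```python
from datetime import date

# Assumed boundaries, inferred from "January 2011 to June 2021" and the
# 2011-2017 / 2018-2021 cohort description.
STUDY_START = date(2011, 1, 1)
DERIVATION_END = date(2017, 12, 31)
STUDY_END = date(2021, 6, 30)


def assign_cohort(diagnosis_date: date) -> str:
    """Return 'derivation' or 'validation' for a given diagnosis date."""
    if not (STUDY_START <= diagnosis_date <= STUDY_END):
        raise ValueError("diagnosis date outside the study window")
    return "derivation" if diagnosis_date <= DERIVATION_END else "validation"
```

Splitting by calendar time rather than at random means the validation cohort consists of patients diagnosed after every patient used to build the model.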

Fig. 1. Flow chart of the study population.
We reviewed the patients' electronic medical records and collected the following items: date of birth, sex, smoking history, family history, time of diagnosis of GD and GO, GD remission, prescription and duration of treatment with antithyroid drugs, history of thyroid surgery or RAI treatment for GD, activity and severity of GO, and the following laboratory results: free thyroxine (fT4) level, thyrotropin-binding inhibitor immunoglobulin (TBII) level, total cholesterol level, total white blood cell (WBC) count, neutrophil count, and lymphocyte count.
Laboratory measurements
Serum fT4 (reference range = 0.89–1.76 ng/dL) levels were measured using a chemiluminescence immunoassay (Siemens Advia Centaur XP; Siemens Healthcare Diagnostics, Inc., Tarrytown, NY, USA). TBII (reference range = 0–1.75 IU/L) levels were evaluated using an automated electrochemiluminescence immunoassay kit (Elecsys® Anti-TSHR; Roche Diagnostics, Mannheim, Germany). Plasma total cholesterol levels were measured with an automatic analyzer (Labospect; Hitachi, Tokyo, Japan). WBC, neutrophil, and lymphocyte counts were measured using the Sysmex-XN analyzer (Sysmex, Seoul, Korea). The neutrophil–lymphocyte ratio (NLR) was calculated as the absolute neutrophil count divided by the absolute lymphocyte count.
Ophthalmic assessment
An ophthalmologist evaluated and diagnosed GO based on criteria from Bartley and Gorman in GD patients who exhibited typical symptoms (dry eyes, tears, irritation) or eye abnormalities such as eyelid edema, retraction, or exophthalmos. 9 One ophthalmologist (J.K.L.) performed all ocular examinations. A modified clinical activity score (CAS) 10 was calculated by allocating one point for each of the seven items (spontaneous retrobulbar pain, pain with eye movement, redness of the eyelids, conjunctival injection, inflammation of the caruncle or plica, swelling of the eyelids, and chemosis) 11 at the initial visit, or assessed for 10 items by adding three more (increase in proptosis, decrease in eye movement, decrease in visual acuity) during a follow-up visit. Patients with a CAS of 3 or more at the first visit or a CAS of 4 or more at the follow-up visit were considered to have active GO. The severity of GO was assessed in accordance with the standardized criteria recommended by the European Group on Graves' Orbitopathy (EUGOGO). 12
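The scoring rule above can be written out as a short sketch. The item identifiers and function names below are hypothetical labels chosen for illustration; only the item lists and the activity thresholds (CAS of 3 or more at the first visit, 4 or more at follow-up) come from the text.

```python
from typing import Iterable

# The seven items scored at the initial visit.
INITIAL_ITEMS = (
    "spontaneous_retrobulbar_pain",
    "pain_with_eye_movement",
    "eyelid_redness",
    "conjunctival_injection",
    "caruncle_or_plica_inflammation",
    "eyelid_swelling",
    "chemosis",
)
# Three further items that are scored only at follow-up visits.
FOLLOW_UP_ITEMS = (
    "increase_in_proptosis",
    "decrease_in_eye_movement",
    "decrease_in_visual_acuity",
)


def clinical_activity_score(findings: Iterable[str], follow_up: bool) -> int:
    """Count one point per positive item that applies to this visit type."""
    applicable = INITIAL_ITEMS + FOLLOW_UP_ITEMS if follow_up else INITIAL_ITEMS
    positive = set(findings)
    unknown = positive - set(INITIAL_ITEMS + FOLLOW_UP_ITEMS)
    if unknown:
        raise ValueError(f"unrecognised CAS items: {sorted(unknown)}")
    return sum(1 for item in applicable if item in positive)


def is_active_go(cas: int, follow_up: bool) -> bool:
    """Active GO: CAS >= 3 at the first visit, CAS >= 4 at follow-up."""
    return cas >= (4 if follow_up else 3)
```

Note that the same three positive findings count as active disease at the first visit (3 of 7) but not at a follow-up visit, where the threshold is 4 of 10.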
Definition of GO progression
We defined the progression of GO as follows: (1) deterioration from mild to moderate-to-severe GO according to the EUGOGO classification or (2) progression from inactive to active GO according to the patient's CAS. Cases with initially moderate-to-severe and active GO at the same time were excluded from the progression analysis.
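As a minimal sketch, the progression definition and its exclusion rule could be encoded as below. The function name, the string labels for severity, and the use of `None` to mark excluded cases are assumptions made for illustration.

```python
from typing import Optional

MILD = "mild"
MODERATE_TO_SEVERE = "moderate-to-severe"


def go_progressed(
    initial_severity: str,
    initially_active: bool,
    later_severity: str,
    later_active: bool,
) -> Optional[bool]:
    """Return True/False for progression, or None if the case is excluded."""
    # Cases that start as both moderate-to-severe and active are excluded
    # from the progression analysis.
    if initial_severity == MODERATE_TO_SEVERE and initially_active:
        return None
    # Criterion 1: mild -> moderate-to-severe.
    worsened_severity = (
        initial_severity == MILD and later_severity == MODERATE_TO_SEVERE
    )
    # Criterion 2: inactive -> active.
    became_active = (not initially_active) and later_active
    return worsened_severity or became_active
```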
Statistical analysis
Continuous variables are presented as the mean with the standard deviation or as the median with the interquartile range, and group comparisons were made using either Student's t-test or the Mann–Whitney U test. Categorical variables are presented as numbers and percentages and were compared using chi-square tests or Fisher's exact tests.
Multiple imputation using chained equations was used to handle missing data. A total of 10 imputed datasets were generated with predictive mean matching for continuous data and logistic regression for binary data.
A predictive model to identify risk factors associated with occurrence of GO was constructed in the derivation cohort using a multivariable logistic regression model; predictors including demographic and laboratory data measured at baseline were considered for variable selection. A backward elimination method with a p-value threshold of 0.157 was implemented for each imputed dataset, and the pooled sampling variance (D1) method was used to identify predictive variables. 13 The estimated parameters were then pooled using Rubin's rule for the final model. 14 A nomogram was developed based on the predictive factors included in the final model (Fig. 2).
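The pooling step can be illustrated with a short sketch of Rubin's rule for a single coefficient. It assumes that each imputed dataset yields one estimate and its squared standard error; the function name `pool_rubin` is hypothetical, and the study's actual code may differ.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)
```

The between-imputation term widens the standard error to reflect uncertainty about the missing values, which a single imputed dataset would ignore.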

Fig. 2. Graphic nomogram based on a multivariable model for prediction of GO in patients with new-onset Graves' disease. Each model variable is assigned a score. The probability of GO occurrence for patients with a given set of characteristics may be estimated by drawing a vertical line from each predictor value to the score scale at the top and summing the scores. The resulting “total points” correspond to the probability of developing GO. GO, Graves' ophthalmopathy.
We applied the final model to the validation cohort. Model performance was evaluated using the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), the Hosmer–Lemeshow test, and the Brier score as performance metrics. We pooled all performance metrics using Rubin's rule. Additional details of our statistical methods are provided in the Supplementary Appendix SA1. All analyses were conducted using R software (v4.1.0) and Python (v3.9.6).
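Two of the metrics named above are simple enough to sketch directly. The functions below are a plain-Python illustration, not the study's code: the AUC is computed as the proportion of case/non-case pairs in which the case receives the higher predicted probability (ties count one half), and the Brier score is the mean squared difference between predicted probability and outcome.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random case is ranked above a random non-case."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))
```

The pairwise AUC is quadratic in sample size, which is acceptable for cohorts of this size; a rank-based formula gives the same value more efficiently.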
Results
Baseline characteristics and risk factors affecting the occurrence of GO in the derivation cohort
Of the 853 patients included in this study, 101 (11.8%) developed GO. As given in Table 1, patients who developed GO were more likely to be smokers (25.7% vs. 8.5%, p < 0.001), were younger at the time of their GD diagnosis (35.0 years vs. 42.0 years, p < 0.001), and were more likely to have a first-degree family history of GD (27.7% vs. 17.2%, p = 0.015) than those who did not develop GO. Remission of GD, which is defined as the discontinuation of antithyroid drugs for more than a year, occurred in 29.3% of patients without GO, but there were no cases of remission occurring in patients with GO.
Table 1. Baseline Characteristics and Risk Factors Affecting the Occurrence of Graves' Ophthalmopathy in the Derivation Cohort
NLR was calculated as the ratio of the neutrophil count to the lymphocyte count.
Values in bold denote statistical significance (p < 0.05).
GD, Graves' disease; GO, Graves' ophthalmopathy; IQR, interquartile range; RAI, radioactive iodine; SD, standard deviation; fT4, free thyroxine; TBII, thyrotropin binding inhibitor immunoglobulin; WBC, white blood cell.
The proportion of patients with GO who underwent RAI treatment or thyroidectomy before the diagnosis of GO did not differ from that of patients with GD who did not develop GO. Levels of the thyroid autoantibody TBII were higher at the time of the initial diagnosis in patients who developed GO (13.5 IU/L vs. 10.0 IU/L, p = 0.020). When TBII level improvement was defined as a decrease in the TBII level of ≥25% from baseline 2–4 months after initiation of antithyroid drugs, patients with GO showed less improvement than patients without GO (49.3% vs. 61.5%, p = 0.066). The basal total cholesterol levels and WBC counts did not differ between the two groups. At the time of their GD diagnosis, 3.7% of the patients who did not develop GO—and none of the patients who developed GO—were taking statins. The total prescription duration of antithyroid drugs was longer in the GO group (586.0 days vs. 412.0 days, p = 0.002).
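The TBII improvement criterion used above (a decrease of at least 25% from baseline) can be expressed as a one-line rule. The function name and argument names are hypothetical; the example values in the usage note are illustrative inputs, not patient data.

```python
def tbii_improved(baseline: float, follow_up: float, threshold: float = 0.25) -> bool:
    """True if TBII fell by at least `threshold` (as a fraction) from baseline."""
    if baseline <= 0:
        raise ValueError("baseline TBII must be positive")
    relative_decrease = (baseline - follow_up) / baseline
    return relative_decrease >= threshold
```

For example, a fall from 10.0 to 7.5 IU/L meets the criterion exactly, whereas a fall from 10.0 to 8.0 IU/L (a 20% decrease) does not.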
The baseline characteristics and risk factors for GO occurrence in the validation cohort are given in the Supplementary Appendix SA2.
Baseline characteristics and risk factors for progressing to active or moderate-to-severe GO in the derivation cohort
Next, we compared patients with and without progression in GO activity or severity (Table 2). Of all 101 patients who developed GO, 73 initially had mild GO and 28 had moderate-to-severe GO. Ten of the mild GO patients progressed to moderate-to-severe GO during their follow-up. Of all 101 GO patients, 93 were analyzed, excluding 8 patients with initially active and moderate-to-severe GO. Eleven of the inactive or mild GO patients progressed to active or moderate-to-severe GO during their follow-up. GO patients with confirmed progression had a higher proportion of those older than 45 years (54.5% vs. 19.8%, p = 0.031), and they had a different initial CAS distribution (p = 0.009).
Table 2. Baseline Characteristics and Risk Factors for Progressing to Active or Moderate-to-Severe Graves' Ophthalmopathy in the Derivation Cohort
NLR was calculated as the ratio of the neutrophil count to the lymphocyte count.
Bold type denotes statistical significance (p < 0.05).
CAS, clinical activity score.
Predictive factors for the occurrence and progression of GO
The multivariable logistic regression analysis identified female sex (odds ratio [OR] = 2.23; 95% confidence interval [CI] = 1.22–4.06; p = 0.009), positive smoking status (OR = 5.96; 95% CI = 3.09–11.46; p < 0.001), age at GD onset (OR = 0.98; 95% CI = 0.96–0.99; p = 0.002), first-degree family history of GD (OR = 1.59; 95% CI = 0.97–2.61; p = 0.066), total cholesterol level (OR = 1.01; 95% CI = 1.00–1.01; p = 0.1), and TBII level (OR = 1.28; 95% CI = 1.02–1.60; p = 0.030) as predictive factors for the occurrence of GO (Table 3). A nomogram was constructed based on these six predictive factors to predict the risk of GO (Fig. 2). The dynamic nomogram based on the final model is available on a web server for clinicians (
Table 3. Multivariable Logistic Regression Analysis in Predicting the Development of Ophthalmopathy in Patients with Graves' Disease
TBII* is log-transformed TBII.
CI, confidence interval.
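To show how the reported odds ratios combine, the sketch below converts each OR to a log-odds coefficient and compares the odds of GO between two patient profiles. It makes several assumptions: the ORs are the rounded values reported in the text, each continuous OR applies per one unit of the predictor as entered in the model (TBII on the log-transformed scale), and the dictionary keys and function name are hypothetical. Because the model intercept is not reported in the text, the sketch yields relative odds only, not absolute probabilities.

```python
from math import exp, log
from typing import Mapping

# Rounded odds ratios as reported in the text (assumed per one-unit change).
ODDS_RATIOS = {
    "female": 2.23,
    "smoker": 5.96,
    "age_years": 0.98,
    "family_history": 1.59,
    "total_cholesterol": 1.01,
    "log_tbii": 1.28,
}
# Logistic regression coefficients are the natural logs of the odds ratios.
COEFFICIENTS = {name: log(value) for name, value in ODDS_RATIOS.items()}


def relative_odds(profile_a: Mapping[str, float], profile_b: Mapping[str, float]) -> float:
    """Odds of GO for profile_a divided by the odds for profile_b.

    Predictors missing from a profile are treated as 0, so both profiles
    should specify every predictor on which they differ.
    """
    difference = sum(
        COEFFICIENTS[name] * (profile_a.get(name, 0.0) - profile_b.get(name, 0.0))
        for name in COEFFICIENTS
    )
    return exp(difference)
```

For instance, `relative_odds({"female": 1, "smoker": 1}, {})` multiplies the two corresponding odds ratios, and `relative_odds({"age_years": 50}, {"age_years": 40})` gives the odds for a patient 10 years older with otherwise identical predictors.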
Performance of our predictive model for GO
The performance evaluation of our final model showed acceptable discrimination, with an AUC of 0.71 (95% CI = 0.65–0.76) in the derivation cohort and 0.71 (95% CI = 0.61–0.79) in the validation cohort. The ROC curves from 10 imputed validation datasets and the mean ROC curve are given in Figure 3 and demonstrate the robustness of the imputation model. The p-value of the Hosmer–Lemeshow test was 0.944 and 0.053 in the derivation and validation cohorts, respectively, indicating good calibration. The Brier score of 0.075 in both the derivation and validation cohorts showed a good overall performance.
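For context on the Brier score, a useful reference point is a non-informative model that assigns every patient the same probability, equal to the observed event rate p; its Brier score is p(1 − p). The sketch below computes this reference from event counts. The function name is hypothetical, and the comparison is an illustration rather than an analysis reported in the study.

```python
def reference_brier(events: int, total: int) -> float:
    """Brier score of a constant prediction equal to the observed event rate."""
    if total <= 0 or not (0 <= events <= total):
        raise ValueError("need 0 <= events <= total and total > 0")
    prevalence = events / total
    return prevalence * (1 - prevalence)
```

With the derivation cohort counts (101 GO cases among 853 patients), `reference_brier(101, 853)` is about 0.104, against which the reported Brier score of 0.075 can be read.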

Fig. 3. ROC curves across 10 imputed validation datasets with the AUC. The blue line shows the mean of 10 ROC curves. AUC, area under the ROC curve; ROC, receiver operating characteristic.
Discussion
In this retrospective cohort study, we aimed to identify predictive factors for the occurrence and progression of GO in patients with newly diagnosed GD. Female sex, positive smoking status, young age, first-degree family history of GD, high cholesterol level, and high TBII level were identified as predictive factors for GO development. In addition, age >45 years at the diagnosis of GD and high initial CAS were risk factors for GO progression.
Various studies have discussed risk factors affecting the development or progression of GO. However, no study has separately analyzed whether each risk factor affects the occurrence or progression of GO. To address this issue, we conducted a separate analysis to determine whether well-known risk factors affect the occurrence or progression of GO. We were able to confirm that many known risk factors mainly affect the occurrence rather than the progression of GO. Using these results, we created a predictive model using a nomogram to predict GO occurrence using clinical patient data such as sex, smoking status, family history of GD, age, total cholesterol level, and TBII level at the time of GD diagnosis.
Wiersinga et al previously developed the PREDIctor of Graves' Orbitopathy (PREDIGO) model to identify predictive scores related to the development of GO in patients with GD by recruiting 348 patients from eight European countries. 4 The PREDIGO model included the initial CAS, TBII level, duration of hyperthyroid symptoms, and smoking as risk factors. This model predicts the occurrence of GO if a patient's score exceeds 6 of 15 points. The PREDIGO study had an advantage over our study in that it adopted a prospective design. Although our study represents a retrospective analysis, its strength is that we included >1,000 patients with newly developed GD as well as validation procedures. To our knowledge, to date, no predictive model for GO has been developed other than the PREDIGO model and our nomogram.
Unfortunately, we could not identify any risk factors other than age older than 45 years and a high initial CAS for predicting the progression of GO; this prevented us from developing a predictive model for GO progression. Previous studies have confirmed that older age and smoking can affect the severity of GO. 15 –17 In these studies, the risk of developing severe GO was high, especially in men aged 50–60 years. In one Chinese study, the severity of GO was significantly associated with male sex, old age, smoking, family history of thyroid disease, and degree of proptosis. 18
Regarding risk factors that affect the activity of GO, a recently published meta-analysis identified age, male sex, a short duration of thyroid disease, and current smoking status as predictors of active GO. 19 In our study, we assume that smoking did not show statistical significance as a risk factor because the number of patients exhibiting GO progression was too small. However, combining the results of the abovementioned studies with our findings suggests that old age, male sex, smoking status, and a high initial CAS may be clinical risk factors of GO progression.
RAI treatment is a well-known risk factor for the development of GO. 20 In our study, only 6% of RAI-treated patients developed GO, and RAI treatment was not a significant risk factor. Because steroid treatment was routinely combined with RAI treatment, the risk of GO development might have been reduced.
Considering that the mechanism underlying GO is an inflammatory process involving inflammatory cells such as B and T cells and various cytokines, we evaluated whether the NLR, which is used as a simple parameter of inflammation, is related to the occurrence or activity and severity of GO. Previous studies found that patients with GO exhibited a significant difference in the NLR compared with control groups 21 and that patients with active GO had a higher NLR than those with inactive GO. 22 However, in our study, the NLR was not a statistically significant risk factor. A large-scale prospective study is needed to further clarify the significance of the NLR as a risk factor for GO.
Sabini et al reported that serum cholesterol levels were a novel risk factor for GO in a case–control study. 10 In that study, especially when the duration of hyperthyroidism was shorter than 44 months, patients with GO had significantly higher total and low-density lipoprotein (LDL) cholesterol levels than patients without GO. In another study conducted in Italy, serum total and LDL cholesterol levels were high in patients with GO, but a relationship between these factors and disease severity and activity was not identified. 8 In our study, LDL cholesterol was not measured at the time of the initial diagnosis of GD; therefore, we only analyzed total cholesterol levels. A simple comparison between the groups revealed no differences according to the presence of GO, but total cholesterol was confirmed as a risk factor for GO development through our regression analysis.
However, total cholesterol was positively correlated with age at GD diagnosis (r = 0.13; p < 0.001), and the two variables had opposite effects: higher total cholesterol was associated with GO development, whereas older age at GD diagnosis was associated with a lower risk of GO. Therefore, the association between total cholesterol and GO development should be examined while controlling for the age at GD diagnosis. In addition, no statins were prescribed in the GO group (Table 1), which may suggest a protective effect of statins on GO development; however, this difference was not statistically significant.
Of interest, GD remission did not occur in patients with GO, and the duration of antithyroid treatment was longer in patients with GO than in those without. These findings suggest that patients with GO are not well controlled with antithyroid drugs. According to Eckstein et al, patients with severe GO have a higher rate of recurrence of GD and a lower rate of remission. 23
In addition, the low rate of improvement in TBII levels at 2–4 months after antithyroid drug treatment in patients with GO is consistent with the possibility that this treatment is not effective for patients with GO. Considering the results of previous studies that found that surgery and thyroid ablation contributed to a reduction in TBII levels, 24 it may be necessary to consider definitive treatment more aggressively than antithyroid drug treatment in patients with GO.
Our study has some limitations. As this is a retrospective observational study, not all patients with GD underwent ophthalmologic examinations; therefore, the number of patients with GO may have been underestimated. The proportion of patients with GO among all patients with new-onset GD was 12.6%, which is low considering that the known prevalence rate is ∼50%. In fact, the proportion of patients with GO among all patients with GD at our hospital was 23.9%, but it may have been lower here, because we only included patients with newly diagnosed GD to establish a predictive model. However, in other recently published cohort or prospective studies on GO, the reported proportions of patients with GO were 8% 25 and 15% 4 ; therefore, it is possible that the known prevalence is too high. Another limitation is that our validation procedures were conducted within our study; further external validation is required to evaluate the generalizability of our prediction model.
The strength of our study is that we included a relatively large number of patients with new-onset GD and GO—over 1,000—compared with a previous study that developed a predictive model based on only 348 patients with GD. 4 We only included patients whose clinical data at the time of the initial GD diagnosis were available, to compensate for the retrospective design. We imputed missing data to increase the efficiency and employed an internal validation method. In addition, unlike other studies, our study allowed us to create a predictive model related to GO occurrence by separately analyzing risk factors for the occurrence and progression of GO.
In conclusion, the predictors of GO occurrence in patients with new-onset GD were female sex, positive smoking status, young age, first-degree family history of GD, high cholesterol, and high TBII level. We developed a model to predict GO occurrence by combining these factors. The predictive nomogram developed in this study may be useful in patient counseling and facilitating informed treatment decision-making.
Footnotes
Authors' Contributions
H.Y.A. was involved in the study design, data collection, interpretation, drafting of the article, and approval of the final version; J.L. was involved in the analysis, drafting of the article, and approval of the final version; J.K.L. was involved in the study design, interpretation, critical review of the article, and approval of the final version.
Author Disclosure Statement
H.Y.A., J.L., and J.K.L. declare that they have no competing interests.
Funding Information
This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT) (NRF-2021R1A2C1011351). The funding organization had no role in the design or conduct of this research.
Supplementary Material
Supplementary Appendix SA1
Supplementary Appendix SA2
