Abstract
Background:
Treatment for Graves’ hyperthyroidism (GH) in patients with Graves’ orbitopathy (GO) remains a topic of debate. This study aimed to investigate the outcome of GO following glucocorticoids, depending on the chosen thyroid treatment.
Methods:
This retrospective cohort study included 49 consecutive patients with GH and moderate-to-severe, active GO, as defined by the European Group on Graves' Orbitopathy guidelines. Twenty-four patients were treated with radioactive iodine (RAI) and 25 with methimazole (MMI). All patients were administered intravenous methylprednisolone. Follow-up visits occurred at weeks 24, 48, and 72. The primary endpoint was the overall outcome of GO at week 24. Response was defined as a change in at least two of the following eye features: reduction ≥1 point in clinical activity score; proptosis reduction ≥2 mm; eyelid aperture reduction ≥2 mm; increase in eye ductions ≥8 degrees.
Results:
Median follow-up duration was 72 weeks in both groups (interquartile range 66–72 for RAI and 48–72 for MMI). The proportion of responders for the overall GO outcome at week 24 was greater in RAI (54.1% vs. 16%; odds ratio [OR] 6.2 [confidence interval (CI): 1.6–23.6], p = 0.0075), but it increased in MMI at weeks 48 and 72, with no differences between groups. There was a trend indicating a better response in RAI regarding individual eye features. Improvement in the GO-specific quality of life questionnaire at week 24 tended to be more pronounced in RAI (responders 50% vs. 28% in MMI; OR = 2.5 [CI: 0.7–8.4], p = 0.11), although results were similar in both groups at later time points. At week 24, only one patient (4%) in RAI and three (12%) in MMI experienced worsening of GO. Fifty-nine adverse events were recorded among 36 patients, with no differences between groups, except for infections, which were more frequent in RAI (53.8% vs. 15.3% in MMI; OR = 6.41 [CI: 1.7–23.9], p = 0.0056).
Conclusions:
RAI appears to be associated with an earlier response of GO to intravenous glucocorticoids. In the long term, a conservative approach also seems to be effective. RAI appears to be relatively safe when patients are concurrently treated with glucocorticoids. However, randomized clinical trials are necessary to confirm these findings.
Introduction
Background and rationale
Graves’ orbitopathy (GO) is typically associated with Graves’ hyperthyroidism (GH), although it can occur in association with autoimmune hypothyroidism or subclinical thyroid autoimmunity. 1–3 While most patients experience mild GO, some develop moderate-to-severe forms requiring treatment. According to recent guidelines, the first-line treatment for moderate-to-severe, active GO is intravenous glucocorticoids (ivGC). 4 In countries where it is approved, the anti-insulin-like growth factor-1 receptor antibody teprotumumab is also an option. 5
GO reflects autoimmunity against autoantigens expressed by thyroid cells and orbital fibroblasts, with the thyrotropin (TSH) receptor being the most significant. 6 There exists a close relationship between thyroid function, thyroid treatment, and GO. 7 However, treatment for GH in patients with GO remains a topic of debate. While some advocate for a conservative approach (antithyroid medications), 4 others, including ourselves, favor an ablative strategy (thyroidectomy or radioactive iodine [RAI]). 7 This preference is based on the principle that eliminating thyroid antigens can lead to a reduction in the immune response against the same antigens in the orbit.
Randomized clinical trials indicate that RAI may negatively affect mild GO by causing its de novo appearance or progression, due to release of thyroid antigens and activation of the immune system against these antigens in orbital tissues. 8–10 Supporting this, treatment with RAI is associated with an increase in serum anti-TSH receptor autoantibodies (TRAbs). 11 However, appearance or progression of mild GO following RAI can be prevented by administering oral glucocorticoids at relatively high doses. 8,9
To our knowledge, no studies have investigated how thyroid treatment affects moderate-to-severe, active GO, except for a few studies showing a beneficial effect of total thyroid ablation. 12–14 Therefore, this study aimed to evaluate the effects of an ablative (RAI) versus a conservative (antithyroid medications) approach to thyroid treatment in consecutive patients with moderate-to-severe, active GO, as defined by the European Group on Graves’ Orbitopathy (EUGOGO). 4 Following ivGC, the primary endpoint was the overall outcome of GO at 24 weeks, evaluated using the composite assessment proposed by EUGOGO. 4
Methods
Study design
As depicted in Figure 1a, this retrospective cohort study evaluated data from patients with GH and moderate-to-severe, active GO, to compare the outcome of GO following ivGC, depending on whether GH was managed with an ablative (RAI or thyroidectomy) or a conservative approach (antithyroid medications). Patients who underwent thyroidectomy were excluded for reasons outlined below. All procedures (see below) are part of the standard of care at our institution. The study was not preplanned; data analysis was conducted post hoc.

Figure 1. Study design and study profile.
Setting
The study was conducted at a tertiary referral center, the University Hospital of Pisa. Patients were enrolled from January 14, 2020 to December 19, 2022, through consecutive sampling.
Ethics approval and consent
The study received approval from the local ethics committee (Comitato Etico Area Vasta Nord-Ovest, approval no. 24205_MARINO). Signed informed consent was obtained from all patients.
Participants
The inclusion criteria for the study were as follows: (1) informed consent; (2) presence of GH; (3) GH duration ≤18 months; (4) moderate-to-severe GO, defined by the presence of at least two of the following in the most affected eye: (i) proptosis (exophthalmos) ≥3 mm above normal values for sex and race; (ii) lid retraction ≥2 mm; and (iii) intermittent to constant diplopia; (5) active GO, namely a clinical activity score (CAS) ≥3 out of 7 points; (6) GO duration ≤18 months; and (7) age 18–75 years. The exclusion criteria were: (1) optic neuropathy or sight-threatening GO; (2) previous RAI or thyroidectomy; (3) use of glucocorticoids or other immunosuppressive treatments within the last three months; (4) previous surgical treatment and/or radiotherapy for GO; (5) GO improvement between screening and baseline; (6) contraindications to glucocorticoids; (7) history of cancer; and (8) mental illness preventing patients from understanding and signing informed consent.
Variables
The primary objective was to assess the overall outcome of GO at week 24, as determined by the composite evaluation proposed by EUGOGO. 4 A response was defined as an improvement in at least two of the following measures in the most affected eye, without deterioration in either eye: (1) reduction ≥1 point in five-item CAS (spontaneous and gaze-evoked pain excluded); (2) proptosis reduction ≥2 mm; (3) eyelid aperture reduction ≥2 mm; and (4) increase in eye ductions ≥8 degrees.
Secondary objectives included: (1) overall outcome of GO at weeks 48 and 72; (2) outcomes of individual eye features; (3) quality of life (QoL), assessed using a GO-QoL questionnaire; (4) GO worsening at week 24, defined as a change in at least two of the following measures in any eye: (i) increase ≥1 point in five-item CAS; (ii) increase in proptosis ≥2 mm; (iii) increase in eyelid aperture ≥2 mm; (iv) reduction in eye ductions ≥8 degrees; and (v) reduction in visual acuity ≥2 decimal places; (5) GO relapse at weeks 48 and 72, specifically in responders from week 24; (6) need for additional treatments; and (7) occurrence of adverse events.
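The composite response rule can be written out as a short Python sketch. This is only an illustration of the definition given above, not the procedure used for the study's analysis. The `EyeChange` class, its field names, and the `deteriorated_either_eye` flag are hypothetical; changes are assumed to be expressed as follow-up minus baseline in the most affected eye, and deterioration is passed in as a pre-assessed yes/no value because its exact criteria are not restated here.

```python
from dataclasses import dataclass


@dataclass
class EyeChange:
    """Change from baseline (follow-up minus baseline) in the most affected eye.

    Field names are illustrative assumptions, not taken from the study database.
    """
    cas: int                    # five-item CAS, points
    proptosis_mm: float         # Hertel exophthalmometry, mm
    eyelid_aperture_mm: float   # mm
    ductions_deg: float         # degrees


def is_responder(change: EyeChange, deteriorated_either_eye: bool) -> bool:
    """Return True if at least two of the four criteria improved
    and neither eye deteriorated."""
    criteria_met = [
        change.cas <= -1,                 # CAS reduced by >= 1 point
        change.proptosis_mm <= -2,        # proptosis reduced by >= 2 mm
        change.eyelid_aperture_mm <= -2,  # eyelid aperture reduced by >= 2 mm
        change.ductions_deg >= 8,         # ductions increased by >= 8 degrees
    ]
    return sum(criteria_met) >= 2 and not deteriorated_either_eye


# Hypothetical patient: CAS down 2 points and proptosis down 2 mm
print(is_responder(EyeChange(-2, -2.0, 0.0, 0.0), deteriorated_either_eye=False))
```

A patient who meets two criteria in the most affected eye but deteriorates in the other eye is classified as a non-responder under this rule.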
Procedures
In accordance with the guidelines for treatment of hyperthyroidism, 15,16 patients were offered an ablative therapy (RAI, or thyroidectomy if the ultrasound thyroid volume was ≥30 mL) after receiving detailed information during the screening process, which was performed approximately 2 weeks prior. Alternatively, patients could opt for a conservative approach, namely antithyroid medications. Patients who underwent RAI or thyroidectomy were prescribed L-thyroxine at replacement doses (1.6 µg/kg). The baseline visit (4 weeks after screening) corresponded to the first administration of methylprednisolone, which was given according to the previously described protocol: 500 mg weekly for 6 weeks, followed by 250 mg weekly for the subsequent 6 weeks (cumulative dose: 4.5 g). 4 Patients treated with RAI started methylprednisolone 2 weeks later and did not receive oral glucocorticoids before methylprednisolone initiation. This practice is standard at our institution due to Italian law (patients cannot access public services for 2 weeks after RAI). To our knowledge, it remains unclear whether this delay could be detrimental. Conversely, if patients had received oral glucocorticoids immediately after RAI, this would have introduced a bias, making the two groups incomparable. During ivGC, patients were administered omeprazole 20 mg daily to prevent gastrointestinal adverse events. Additionally, postmenopausal women received bisphosphonates (i.e., alendronate 70 mg/week) and vitamin D (i.e., cholecalciferol 25,000 IU/month) to mitigate bone loss.
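As a quick arithmetic check of the cumulative methylprednisolone dose stated above, the weekly schedule can be summed in a few lines of Python. The `SCHEDULE` list and the function name are illustrative only.

```python
# (weekly dose in mg, number of weeks), as stated in the protocol above
SCHEDULE = [(500, 6), (250, 6)]


def cumulative_dose_mg(schedule):
    """Sum dose x weeks over all phases of the schedule."""
    return sum(dose_mg * weeks for dose_mg, weeks in schedule)


total_mg = cumulative_dose_mg(SCHEDULE)
print(f"{total_mg} mg = {total_mg / 1000} g")  # 4500 mg = 4.5 g
```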
Sources of data and measurements
Patients underwent an ophthalmological assessment at all visits, which included (1) exophthalmometry (Hertel exophthalmometer); (2) measurement of eyelid aperture; (3) CAS; (4) eye ductions; (5) corneal examination; (6) fundus examination; and (7) visual acuity (Snellen chart). Ophthalmologists were not blinded to the patient’s treatment. Additionally, patients underwent a thyroid ultrasound at screening and thyroid volume was calculated using the ellipsoid formula.
The following blood tests were performed at screening, baseline, and every 4 weeks up to week 72: free triiodothyronine (Vitros Immunodiagnostics, Raritan, NJ) and TSH (Immulite 2000, Siemens Healthcare, Gwynedd, UK). TRAbs (ElisaRSRTM TRAb 3rd Generation, Cardiff, UK) were measured at baseline, weeks 24, 48, and 72. Additionally, routine blood tests, urine analysis, and urine culture were performed at screening, baseline, and every 2 weeks up to week 24.
The GO-QoL questionnaire was used to assess QoL. 4,17 The questionnaire comprises two subscales: (1) visual functioning (eight questions) and (2) appearance (eight questions). Each question is scored as follows: severely limited (one point), a little limited (two points), or not limited (three points). Scores are converted into percentages according to the following formula: (total points × 100)/(number of questions answered × 3). A response is defined as an increase of at least 6% from the baseline measurement. 18
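The scoring rule above can be illustrated with a minimal Python sketch. It implements the formula exactly as stated in this section; the function names are hypothetical, unanswered questions are assumed to be recorded as `None`, and the 6% threshold is assumed to mean an absolute increase of 6 percentage points.

```python
def goqol_score(answers):
    """Convert one subscale's answers (1, 2, 3, or None if unanswered)
    into a percentage: (total points x 100) / (questions answered x 3)."""
    answered = [a for a in answers if a is not None]
    if not answered:
        raise ValueError("no questions answered")
    return sum(answered) * 100 / (len(answered) * 3)


def is_qol_responder(baseline_score, follow_up_score, threshold=6.0):
    """Assumption: response means an absolute increase of >= 6 percentage points."""
    return follow_up_score - baseline_score >= threshold


# Hypothetical patient: all eight items "a little limited" at baseline,
# two items improving to "not limited" at follow-up
baseline = goqol_score([2] * 8)
follow_up = goqol_score([3, 3, 2, 2, 2, 2, 2, 2])
print(round(baseline, 1), round(follow_up, 1), is_qol_responder(baseline, follow_up))
```

Dividing by the number of questions actually answered keeps the score comparable when a patient skips an item.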
Adverse events were documented and graded according to the National Cancer Institute’s Common Terminology Criteria for Adverse Events, Version 5.0. Data were collected by GC, GL, SC, MNM, CP, DC, and MM and recorded in a database. Database validation procedures were implemented to ensure data integrity.
Bias
The study was retrospective; to address this limitation, we adopted consecutive sampling. Additionally, the ophthalmologists were not blinded, although this should not have affected the findings given the retrospective nature of the study.
Sample size
The sample size was calculated based on the knowledge that in patients receiving methylprednisolone, treatment with antithyroid medications is associated with a response rate of ∼70% for the primary outcome, 19,20 whereas thyroidectomy is associated with a response rate of ∼30%. 12 Given the limited studies on the effects of RAI in patients with moderate-to-severe, active GO, we assumed that the response rate would be similar to that of thyroidectomy. Thus, we estimated that a total of 52 patients (26 per group) would be sufficient to detect this difference for the primary outcome measure (p ≤ 0.05 by Fisher’s exact test) with a statistical power of 0.8, allowing for a 10% dropout rate.
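The sample size reasoning above can be explored with a small Python sketch that computes the exact power of a two-sided Fisher's exact test by enumerating every possible pair of outcomes. This is not the calculation used by the authors, whose method and software are not described; the function names are hypothetical, the two-sided p value is assumed to sum all tables no more probable than the observed one, and the group sizes in the usage lines (26 enrolled, roughly 23 evaluable after a 10% dropout) are assumptions.

```python
from math import comb


def fisher_two_sided_p(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of each table with the same margins
    probs = {k: comb(row1, k) * comb(row2, col1 - k) / total
             for k in range(lo, hi + 1)}
    observed = probs[a]
    # Sum tables that are no more likely than the observed one
    p = sum(q for q in probs.values() if q <= observed * (1 + 1e-7))
    return min(1.0, p)


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_power(n_per_group, p1, p2, alpha=0.05):
    """Probability that Fisher's test gives p <= alpha when the true
    response rates are p1 and p2, with equal group sizes."""
    power = 0.0
    for x1 in range(n_per_group + 1):
        for x2 in range(n_per_group + 1):
            p = fisher_two_sided_p(x1, n_per_group - x1, x2, n_per_group - x2)
            if p <= alpha:
                power += binom_pmf(x1, n_per_group, p1) * binom_pmf(x2, n_per_group, p2)
    return power


for n in (23, 26):
    print(n, round(fisher_power(n, 0.7, 0.3), 3))
```

Exact enumeration is feasible here because the groups are small; for larger trials a closed-form approximation or simulation would be more practical.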
Quantitative variables
Continuous variables are presented as means (SD) or medians (interquartile range [IQR]). Categorical variables are expressed as percentages.
Statistical analyses
Differences were assessed using analysis of variance with Bonferroni’s correction, Mann–Whitney U test, two-tailed Fisher’s exact test, or chi-square test, as appropriate. Analyses were performed using SPSS, Version 21.0 (IBM, New York, NY). Outcome measures were analyzed for patients who attended all visits or at least the week 24 visit. Data from patients who did not attend week 48 and/or week 72 visits were included in the analysis using the last-observation-carried-forward method. Data from patients who withdrew from study treatments or were non-compliant but attended follow-up visits were retained in the analysis. Statistical analyses were conducted by MM and reviewed by a statistician.
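For readers who wish to check a between-group comparison of proportions, the following Python sketch computes an odds ratio with a Wald (Woolf) 95% confidence interval and a two-sided Fisher's exact p value. It is not the SPSS procedure used in the study. The function names are hypothetical, the Wald interval is an assumption about how the confidence intervals were obtained, and the counts in the usage lines (13 of 24 versus 4 of 25 responders) are back-calculated from the reported week 24 percentages rather than taken from the study data.

```python
from math import comb, exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for [[a, b], [c, d]].
    Requires all four cells to be non-zero."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


def fisher_two_sided_p(a, b, c, d):
    """Two-sided Fisher's exact p value: sum of all tables with the same
    margins that are no more probable than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {k: comb(row1, k) * comb(row2, col1 - k) / total
             for k in range(lo, hi + 1)}
    observed = probs[a]
    return min(1.0, sum(q for q in probs.values() if q <= observed * (1 + 1e-7)))


# Assumed counts: responders / non-responders in RAI (13, 11) and MMI (4, 21)
or_, lo, hi = odds_ratio_ci(13, 11, 4, 21)
p = fisher_two_sided_p(13, 11, 4, 21)
print(f"OR = {or_:.1f} [{lo:.1f}-{hi:.1f}], p = {p:.4f}")
```

Under these assumptions the odds ratio and interval round to 6.2 [1.6–23.6], in line with the values reported for the primary outcome; the p value can be compared with the reported one in the same way.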
Results
Participants
As shown in Figure 1b, of 63 patients screened, 11 were deemed ineligible. Thus, the first 26 consecutive patients treated with RAI or thyroidectomy and 26 administered antithyroid medications (methimazole [MMI] in all cases) were included. Among the 26 patients in the first group, 24 were treated with RAI, while only two underwent thyroidectomy. Therefore, the analysis was restricted to patients treated with RAI (RAI group) versus patients treated with MMI. One patient in MMI who started methylprednisolone did not attend the week 24 visit, leading to exclusion from data analysis. Eight of the remaining 49 patients (four in each group) did not attend the week 48 visit or underwent additional treatments between week 24 and week 48. Five patients (two in RAI and three in MMI) who attended weeks 24 and 48 visits did not attend the week 72 visit or underwent additional treatments between week 48 and week 72. Considering patients who did not attend weeks 48 and 72 visits, the median follow-up duration was 72 weeks in both groups (IQR 66–72 for RAI and 48–72 for MMI). Demographic and clinical data at baseline are reported in Table 1. The two groups did not differ on any of the variables considered.
Table 1. Demographic and Clinical Features of Patients with Graves’ Orbitopathy at Baseline
Values are number (%), mean (SD), or median (IQR).
GO, Graves’ orbitopathy; GO-QoL, quality of life score; MD, mean difference; MMI, methimazole group; NV, normal values; RAI, radioactive iodine group; TRAbs, anti-TSH receptor antibodies.
Overall outcome of GO
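The proportion of week 24 responders was greater in RAI: 54.1% versus 16% in MMI (odds ratio [OR] = 6.2 [confidence interval (CI): 1.6–23.6], p = 0.0075). The proportion of responders increased in MMI at weeks 48 and 72, with no differences between groups.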

Figure 2. GO response to treatment and QoL.
Quality of life
The proportion of responders for total GO-QoL score at week 24 tended to be greater in RAI (50% vs. 28% in MMI; OR = 2.5 [CI: 0.7–8.4], p = 0.11) (Fig. 2b), but was similar in both groups at week 48 (RAI 41.6%; MMI 40%) and week 72 (RAI 41.6%; MMI 52%), which is consistent with the overall GO outcomes. Additionally, the proportion of responders in the functional (Fig. 2c) and appearance (Fig. 2d) subscales did not differ between groups.
Overall outcomes of individual eye features
There was a trend indicating a greater proportion of responders in RAI for individual eye features, namely proptosis (Fig. 3a), CAS (Fig. 3b), eyelid aperture (Fig. 3c), and eye ductions (Fig. 3d). However, no statistical differences were observed, except for eye ductions at week 48 (RAI 43.4%; MMI 14.2%; OR = 4.61 [CI: 1–20.1], p = 0.042). The proportion of responders increased over time in both groups, especially in MMI, which paralleled the overall GO outcome.

Figure 3. Response of individual eye features in patients with GO treated with methylprednisolone.
TRAbs
As expected, TRAbs increased in RAI at week 24 (14.4 mU/l [2.1–58.9] vs. 6.8 [2.2–12.8] at baseline) and week 48 (15.1 mU/l [2.5–3.4]), and then declined at week 72 (4.4 mU/l [1.5–8.9]). 11 In MMI, TRAbs decreased at week 24 (1.8 mU/l [1.1–4.2] vs. 6.6 [3.8–10] at baseline) and remained stable at week 48 (1.7 mU/l [0.9–3.9]) and week 72 (0.4 mU/l [0.9–3.2]).
GO worsening, relapse, and need for additional treatments
At week 24, one patient in RAI (4.1%) and three in MMI (12%) experienced worsening compared with baseline, with no significant difference (OR = 0.3 [CI: 0.03–3.3], p = 0.33). The proportion of week 24 responders who relapsed at week 48 (RAI 2 [8.3%]; MMI 4 [16%]) and week 72 (RAI 1 [4.1%]; MMI 2 [8%]) was similar in both groups. As an exploratory outcome, we evaluated worsening at week 48 and week 72 compared with baseline, regardless of week 24 outcome, which was similar in both groups (week 48: RAI two patients [8.3%], MMI four patients [16%]; week 72: RAI one patient [4.1%], MMI two patients [8%]). Four patients in RAI (16.6%) and one in MMI (4%) required additional treatments: one orbital decompression per group for optic neuropathy; one patient in RAI required RAI retreatment for hyperthyroidism relapse; and two patients in RAI required an additional methylprednisolone course. There was no statistically significant difference between groups (OR = 2.3 [CI: 0.3–13.9], p = 0.18). Notably, patients received additional treatments after the week 24 visit; therefore, these additional treatments were unrelated to the primary endpoint.
Adverse events
The safety population included all 52 enrolled patients. A total of 59 adverse events were recorded across 36 patients (Table 2). Among these, five were considered serious: two optic neuropathies (one per group); one intestinal perforation in MMI, which was associated with ulcerative colitis diagnosed during methylprednisolone treatment and required treatment discontinuation; one Bartholin’s cyst abscess in RAI; one ovarian cancer in RAI; one parkinsonism in RAI. The remaining adverse events were mild and did not require treatment discontinuation. No significant differences were observed between the two groups, except for infections, which were more frequent in RAI (53.8% vs. 15.3% in MMI; OR = 6.41 [CI: 1.7–23.9], p = 0.0056).
Table 2. Adverse Events in the Safety Population (52 Patients)
Data are n (%). Serious adverse events are underlined.
Discussion
The present retrospective cohort study arose from the lack of knowledge regarding the optimal treatment for GH in the presence of moderate-to-severe, active GO. 4 We report data obtained from consecutive patients treated with RAI (24 patients, RAI group) or MMI (25 patients, MMI group), all of whom received a course of methylprednisolone. Various outcome measures were assessed at weeks 24, 48, and 72.
At week 24, the proportion of responders in terms of overall GO outcome (the primary objective) was significantly greater in RAI (54.1% vs. 16%). This trend continued at weeks 48 and 72, although not to a statistically significant extent, due to an increase in responders within MMI. Although there was a trend toward a better outcome of individual eye features within RAI, only eye ductions at week 48 improved to a significantly greater extent. In parallel with the overall GO outcome, there was also a trend indicating a greater proportion of responders in the total GO-QoL score for RAI at week 24.
As expected, TRAbs increased at weeks 24 and 48 in RAI, followed by a decline at week 72. 11 Conversely, the MMI group exhibited a continuous decrease in TRAbs throughout the follow-up period. It may seem surprising that patients in RAI experienced a faster improvement in GO, given that TRAbs are known to correlate with GO severity and activity. 21,22 One possibility is related to the timing of the TRAb increase, which is known to peak 24 weeks after RAI. 11 This timing coincides with the observed improvement in GO outcomes in RAI compared with MMI. Thus, the timing of our observations may not have been sufficient for TRAbs to exert a deleterious effect, which could occur later.
An important issue is worsening of GO. As previously mentioned, mild GO progresses in ∼15% of patients following RAI. 9–11 However, it remains unclear whether this progression occurs in cases of moderate-to-severe GO and whether glucocorticoids have any impact. It is reassuring that only one patient in RAI (4%) worsened at week 24, compared with three patients (12%) in MMI. This trend was confirmed at weeks 48 and 72. Similarly, the relapse of GO did not differ between the two groups, nor was there a difference in the need for additional treatments.
The main limitation of our study is its retrospective nature. Therefore, prospective, randomized clinical trials are needed to confirm our observations. Another limitation is that the study was powered to assess GO outcomes at week 24, but not later. It is possible that findings at weeks 48 and 72 may differ in a larger population. Another concern is the response to methylprednisolone, which in MMI was lower than previously reported. 23,24 However, in those studies, patients had more severe GO, and outcomes were evaluated using different criteria. On the other hand, the response to methylprednisolone in MMI was similar to that reported in a recent study using comparable response criteria. 25 Nevertheless, in both groups, the response to methylprednisolone increased significantly at weeks 48 and 72, supporting the choice of this treatment type. 4 Finally, a concern is the notable number of adverse events. Fortunately, the majority were mild and did not require treatment discontinuation. The reason for the higher incidence of infections in RAI is unknown and requires further investigation.
In conclusion, we found that RAI appears to be associated with an earlier response of moderate-to-severe, active GO to ivGC, although a conservative approach also seems to be effective in the long term. Importantly, GO worsening was similar in patients given RAI or MMI, suggesting that RAI may be safely administered to patients with GO, provided that treatment is followed by glucocorticoids, as seen in mild GO. 9,10 Randomized clinical trials are needed to confirm our findings and reach firm conclusions.
Footnotes
Authors’ Contributions
G.C.: Conceptualization (equal), data collection (equal), software (equal), review and editing (equal); G.L.: Conceptualization (equal), data collection (equal), software (equal), review and editing (equal); S.C.: Conceptualization (equal), data collection (equal), software (equal), review and editing (equal); M.N.M.: Data collection (equal), review and editing (equal); C.P.: Data collection (equal), review and editing (equal); D.A.C.: Data collection (equal), review and editing (equal); F.M.: Conceptualization (equal), review and editing (equal); R.R.: Conceptualization (equal), review and editing (equal); F.L.: Conceptualization (equal), review and editing (equal); M.F.: Conceptualization (equal), review and editing (equal); F.S.: Conceptualization (equal), review and editing (equal); M.M.: Conceptualization (lead), methodology (lead), software (lead), formal analysis (lead), writing original draft (lead).
Disclosure Statement
G.C., G.L., S.C., M.N.M., C.P., D.A.C., F.M., R.R., F.L., M.F., F.S., and M.M. declare that they have no commercial association that might create a conflict of interest in connection with this article.
Funding Information
No funding was received for this study.
