Abstract
Background:
Thyroid dysfunction is among the most common adverse effects during anti-programmed cell death 1 (PD-1) immunotherapy, and alongside correlations with elevated anti-thyroid antibodies (ATAb), studies have found correlations with survival. However, the exact relations remain to be clarified. We, therefore, aimed at clarifying the relationship between thyroid dysfunction, ATAbs, and survival in anti-PD-1 treated cancer patients.
Methods:
We included 168 patients with nonsmall-cell lung carcinoma, renal cell carcinoma, and metastatic melanoma treated with nivolumab or pembrolizumab. Thyrotropin and free T4 (fT4) levels were measured before each anti-PD-1 infusion. ATAb levels (anti-thyroid peroxidase [TPO] and anti-thyroglobulin [Tg]) were measured at baseline and after two months of treatment. Although the vast majority of patients had detectable levels of ATAbs, only a few patients had positive ATAbs when using conventional cut-offs. To study the consequences of detectable ATAbs, the cut-off levels were a priori set at the median concentrations at baseline in the study population. Tumor progression was classified according to RECIST v1.1.
Results:
Patients who acquired overt thyroid dysfunction during treatment had significantly longer overall survival (OS) (hazard ratio [HR] = 0.18 [confidence interval (CI): 0.04–0.76]; p = 0.020) and progression-free survival (PFS) (HR = 0.39 [0.15–0.998]; p = 0.050) than patients without thyroid dysfunction, with 1-year OS rates of 94% vs. 59% and 1-year PFS rates of 64% vs. 34%. During treatment, patients with ATAb levels above the median had a longer OS (HR = 0.39 [0.21–0.72]; p = 0.003) and PFS (HR = 0.52 [0.33–0.81]; p = 0.004) than patients with ATAb levels below the median, with 1-year OS rates of 83% vs. 49% and PFS rates of 54% vs. 20%, respectively. When analyzing ATAb levels over time, patients with a persistent ATAb level above the median had a longer OS (HR = 0.41 [0.19–0.89]; p = 0.025) and PFS (HR = 0.54 [0.31–0.95]; p = 0.032) compared with patients with a persistent ATAb level below the median. Patients whose ATAb levels increased above the median during treatment had an improved PFS (HR = 0.24 [0.07–0.77]; p = 0.017) and a nonsignificant trend toward improved OS (HR = 0.27 [0.06–1.22]; p = 0.088) compared with patients whose ATAb levels decreased below the median.
Conclusions:
Acquired overt thyroid toxicity and above median ATAb levels during anti-PD-1 treatment are associated with improved PFS and OS. In addition, our results suggest that ATAb levels at baseline are of clinical relevance for PFS and OS.
Introduction
Immunotherapies against immune checkpoints that inhibit T cell activation (cytotoxic T lymphocyte antigen 4 [CTLA-4] and programmed cell death 1 [PD-1]) are rapidly emerging treatments in oncology. While these treatments have impressively improved survival for various metastatic malignancies, they are associated with many immune-related adverse events (1–3). Thyroid toxicity is among the most common (5–15%) immune-related adverse events during anti-PD-1 immunotherapy (1–3). In most cases, it presents itself as a transient thyrotoxicosis followed by hypothyroidism, thereby resembling the course of a classical thyroiditis, but its exact cause remains unclear (4,5). Previous studies have suggested that thyroid toxicity during anti-PD-1 treatment may be associated with improved overall survival (OS), but these studies were limited by small sample sizes, including only patients with nonsmall-cell lung carcinoma (NSCLC), and showed inconsistent effects on progression-free survival (PFS) (6,7). While thyroid autoimmunity, the presence of anti-thyroid antibodies (ATAbs), before immunotherapy has been related to the development of thyroid toxicity during immunotherapy (8,9), the ATAb status at baseline showed no relation with improved survival (10). We, therefore, hypothesized that the occurrence of thyroid toxicity during anti-PD-1 treatment and thyroid antibody status may predict treatment outcomes in anti-PD-1 treated patients. This hypothesis was investigated in patients with various cancer types in a large prospective cohort study.
Materials and Methods
A total of 168 patients with metastatic melanoma, NSCLC, and renal cell carcinoma (RCC) who started anti-PD-1 treatment with nivolumab (every 2 weeks) or pembrolizumab (every 3 weeks) between April 2016 and July 2017 at the Erasmus Medical Center (Rotterdam, The Netherlands) were eligible for the MULTOMAB-trial (11–13) (Dutch Trial Registry No. NL6828). The goal of the MULTOMAB-trial was to set up a biobank of prospectively collected blood samples for pharmacokinetic analyses of monoclonal antibodies and immunophenotyping. All adult patients beginning anti-PD-1 treatment were eligible for the MULTOMAB-trial, in which we serially collected serum samples and isolated peripheral blood mononuclear cells. Therapy was terminated in case of complete response, disease progression, severe side effects, or at the patient's request. The study was approved by the local ethics committee.
OS was defined as the period between the start of therapy until death, while PFS was calculated until tumor progression, based on standard Response Evaluation Criteria In Solid Tumors (RECIST) v1.1, or death. Patients who were lost to follow-up were censored. Serum was collected before each drug administration (every 2 or 3 weeks) to determine thyrotropin (TSH) and free T4 (fT4) levels. Serum TSH and fT4 levels were measured via immunoassays on Siemens Immulite 2000XPi (reference range: 0.4–4.3 mU/L) and Ortho Vitros ECiQ (reference range: 11–25 pmol/L), respectively. Thyroid toxicity was scored as “subclinical” when TSH levels were increased (subclinical hypothyroidism) or decreased (subclinical hyperthyroidism) and fT4 levels were normal. Thyroid toxicity was scored as “overt” when TSH levels were increased and fT4 levels decreased (overt hypothyroidism), or vice versa (overt hyperthyroidism). Patients who acquired overt thyroid toxicity after they had acquired subclinical thyroid toxicity were scored as “overt.” Patients were monitored for thyroid toxicity until tumor progression. This was done because this research investigates possible prospective markers for which events occurring after tumor progression are not of any predictive value. Patients with abnormal TSH levels or receiving thyroid medication within three months before the start of immunotherapy were categorized as having a pre-existing thyroid disorder.
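The scoring rules above can be written down as a small classifier. The following is only a minimal sketch, not the study's actual scoring procedure: the function names are hypothetical, values exactly on a reference limit are assumed to count as normal, and combinations that the definitions above do not cover (for example, an abnormal fT4 with a normal TSH) are returned as "unclassified" and ignored at the patient level.

```python
# Reference ranges as reported in the text.
TSH_RANGE = (0.4, 4.3)    # mU/L
FT4_RANGE = (11.0, 25.0)  # pmol/L


def _position(value, low, high):
    """Return -1 if below the range, 0 if within it, +1 if above it."""
    if value < low:
        return -1
    if value > high:
        return 1
    return 0


def classify_measurement(tsh, ft4):
    """Classify one paired TSH/fT4 measurement (hypothetical helper)."""
    tsh_pos = _position(tsh, *TSH_RANGE)
    ft4_pos = _position(ft4, *FT4_RANGE)
    if tsh_pos == 0 and ft4_pos == 0:
        return "none"
    if tsh_pos != 0 and ft4_pos == 0:
        return "subclinical"
    # TSH and fT4 deviate in opposite directions: overt dysfunction.
    if tsh_pos != 0 and tsh_pos == -ft4_pos:
        return "overt"
    # Assumption: patterns not defined in the text are left unclassified.
    return "unclassified"


def classify_patient(measurements):
    """Score a patient by the most severe category reached.

    `measurements` is an iterable of (tsh, ft4) pairs; "overt" overrides
    "subclinical", mirroring the rule stated in the text.
    """
    categories = {classify_measurement(tsh, ft4) for tsh, ft4 in measurements}
    for label in ("overt", "subclinical"):
        if label in categories:
            return label
    return "none"
```

For example, a patient whose serial measurements move from normal values to an isolated TSH elevation and then to an elevated TSH with a low fT4 would be scored as "overt" under this sketch.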
Antibodies directed against thyroid peroxidase (anti-TPO) and thyroglobulin (anti-Tg) were measured on a Phadia 250 at baseline and two months after start of treatment, or the last available sample in case of death. The lower limit of quantification for anti-TPO was 33 IU/mL and for anti-Tg it was 244 IU/mL, as reported by the supplier. The lower limit of detection for both values was 1 IU/mL. The vast majority of patients had detectable levels of ATAbs, and only 7 patients at baseline and 14 patients during follow-up had positive antibodies when using conventional cut-offs for anti-TPO and anti-Tg positivity as provided by the assay manufacturers (60 and 280 IU/mL, respectively). To avoid underpowered analyses and to study the consequences of detectable ATAbs as a reflection of an auto-immune reaction, the cut-off levels were a priori set at the median concentrations at baseline in the study population (anti-TPO: 3.05 IU/mL, anti-Tg: 22.35 IU/mL). Patients with any antibody level above these cut-offs were categorized as patients with “above median” ATAb levels, and patients with anti-TPO and anti-Tg levels below these cut-offs were categorized as patients with “below median” ATAb levels.
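As an illustration of the dichotomization described above, the sketch below applies the reported baseline median cut-offs to a pair of antibody concentrations. The function name is hypothetical, and treating a value exactly equal to a cut-off as "below median" is an assumption, since the text does not specify how ties were handled.

```python
# Baseline median cut-offs reported in the text (IU/mL).
ANTI_TPO_CUTOFF = 3.05
ANTI_TG_CUTOFF = 22.35


def atab_status(anti_tpo, anti_tg):
    """Return "above median" if either antibody exceeds its cut-off.

    Assumption: a value equal to the cut-off counts as "below median".
    """
    if anti_tpo > ANTI_TPO_CUTOFF or anti_tg > ANTI_TG_CUTOFF:
        return "above median"
    return "below median"
```

Under this rule, a patient with anti-TPO of 5.0 IU/mL and anti-Tg of 10.0 IU/mL is categorized as "above median", because a single antibody above its cut-off is sufficient.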
Categorical variables were tested by using χ² tests (or Fisher's exact test when χ² test assumptions were not valid). Age distribution between groups was analyzed by using one-way ANOVA. To study relationships between thyroid toxicity and PFS or OS, Cox regression analysis was used where thyroid toxicity was added to the model as a time-dependent covariate. This method takes into account the fact that thyroid toxicity emerges during the follow-up period and can only emerge when the patient survives long enough to develop the toxicity. Hence, false positive results due to immortal time bias are prevented. Conventional Cox regression was used for relationships between ATAb status and PFS or OS. Patients using glucocorticoids were excluded in sensitivity analyses, as these could interfere with both thyroid toxicity and ATAb.
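One common way to encode a time-dependent covariate is to split each patient's follow-up into start-stop intervals, so that time before the onset of toxicity is counted as unexposed. The sketch below illustrates that data layout only; it is not the authors' Stata code, and the `Interval` class, the `split_follow_up` function, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interval:
    patient_id: str
    start: float    # months since start of therapy
    stop: float
    toxicity: int   # 0 = before (or without) toxicity, 1 = after onset
    event: int      # 1 if death or progression ends this interval


def split_follow_up(
    patient_id: str,
    follow_up: float,
    event: int,
    toxicity_onset: Optional[float] = None,
) -> List[Interval]:
    """Split one patient's follow-up at the onset of thyroid toxicity.

    Time before onset contributes to the unexposed group, which is what
    avoids crediting the toxicity group with "immortal" survival time.
    """
    # No toxicity observed within follow-up: a single unexposed interval.
    if toxicity_onset is None or toxicity_onset >= follow_up:
        return [Interval(patient_id, 0.0, follow_up, 0, event)]
    # Toxicity present from the start: a single exposed interval.
    if toxicity_onset <= 0.0:
        return [Interval(patient_id, 0.0, follow_up, 1, event)]
    return [
        Interval(patient_id, 0.0, toxicity_onset, 0, 0),
        Interval(patient_id, toxicity_onset, follow_up, 1, event),
    ]
```

A patient who develops toxicity at 2.1 months and dies at 12 months would contribute one unexposed interval (0 to 2.1 months, no event) and one exposed interval (2.1 to 12 months, event). Rows in this layout can then be passed to any Cox regression routine that accepts start-stop data.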
All analyses were corrected for cancer type, as it was a confounder in the relationship between thyroid toxicity and survival. A p-value <0.05 was considered statistically significant, and STATA (v15.1 StataCorp.; College Station, TX) was used for all statistical analyses.
Results
Baseline characteristics of all patients (n = 168) are shown in Table 1. The median follow-up time of patients still alive was 14.9 (interquartile range [IQR]: 9.2–18.4) months. During the study period, 34 patients (20%) developed subclinical thyroid dysfunction and 20 patients (12%) developed overt thyroid dysfunction. Twenty-seven patients (16%) had pre-existing thyroid dysfunction, consisting of 9 cases with hyperthyroidism and 18 cases with hypothyroidism. Twenty-two of those patients only had subclinical thyroid disease. Eighty patients (48%) developed no thyroid dysfunction during the study period. For seven patients (4%) it was impossible to determine thyroid dysfunction due to missing TSH values. Median time to development of all types of thyroid dysfunction was 2.8 (IQR: 1.3–4.2) months, whereas for overt thyroid dysfunction it was 2.1 (IQR: 1.2–3.7) months. There were no differences in age, sex, cancer type, and drug administered between patients who developed thyroid dysfunction and those who did not (Table 2). Patients with above median ATAb levels at baseline (p = 0.374) or acquired above median ATAb levels during therapy (p = 0.349) did not have a different risk of acquired thyroid toxicity than patients with below median baseline ATAb levels.
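As a quick consistency check, the group sizes reported above can be summed and converted to percentages. This is a small arithmetic sketch; it assumes that the five groups are mutually exclusive, which is consistent with their sum equaling the cohort size.

```python
N_PATIENTS = 168

# Group sizes as reported in the text.
groups = {
    "acquired subclinical": 34,
    "acquired overt": 20,
    "pre-existing": 27,
    "no dysfunction": 80,
    "undetermined (missing TSH)": 7,
}

total = sum(groups.values())
percentages = {name: round(100 * n / N_PATIENTS) for name, n in groups.items()}

print(total)
print(percentages)
```

The groups add up to the full cohort of 168 patients, and the rounded percentages reproduce the values given in the text (20%, 12%, 16%, 48%, and 4%).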
Table 1. Baseline Characteristics of Study Cohort
Follow-up time for patients who were alive at end of study.
IQR, interquartile range.
Table 2. Patient Characteristics by Thyroid Toxicity Groups
Progression-free survival could not be calculated for three patients.
Patients who acquired overt thyroid dysfunction had longer OS (hazard ratio [HR] = 0.18 [confidence interval (CI): 0.04–0.76], p = 0.020) as well as PFS (HR = 0.39 [0.15–0.998], p = 0.050) than patients without thyroid dysfunction. No difference was found in OS (HR = 1.54 [0.79–2.98], p = 0.201) and PFS (HR = 0.99 [0.52–1.91], p = 0.998) in patients with subclinical thyroid dysfunction or with pre-existing thyroid dysfunction (OS: HR = 0.92 [0.44–1.90], p = 0.816; PFS: HR = 0.71 [0.40–1.26], p = 0.243) when compared with patients without thyroid dysfunction (Fig. 1). After excluding patients using glucocorticoids, the effect of overt thyroid dysfunction remained significant for PFS (p = 0.020), whereas for OS a trend toward significance was observed (p = 0.078).

Fig. 1. Thyroid toxicity and survival. Kaplan–Meier curves of patients with acquired overt thyroid toxicity (in green), acquired subclinical thyroid toxicity (in blue), pre-existing thyroid dysfunction (in black), or without thyroid dysfunction (in red).
Median time to the measurement of ATAbs during treatment was 2.1 (IQR: 2.0–2.3) months. ATAb-level distributions are described in Supplementary Table S1. Age and sex were evenly distributed across the ATAb status groups, in contrast to cancer type and drug administered (Table 3). Within this cohort, patients with NSCLC and RCC were predominantly treated with nivolumab, while patients with metastatic melanoma were predominantly treated with pembrolizumab. At baseline, patients with ATAb levels above the median did not differ in OS (HR = 0.97 [0.55–1.72], p = 0.930) and PFS (HR = 0.99 [0.64–1.54], p = 0.981) when compared with patients with ATAb levels below the median (Fig. 2). However, during treatment, patients with ATAb levels above the median had improved OS (HR = 0.39 [0.21–0.72], p = 0.003) and PFS (HR = 0.52 [0.33–0.81], p = 0.004) when compared with patients with ATAb levels below the median, with 1-year survival rates of 83% versus 49% for OS and 54% versus 20% for PFS, respectively. These effects remained significant after excluding patients treated with glucocorticoids for OS (p = 0.010) and PFS (p = 0.010).

Fig. 2. Anti-thyroid antibodies and survival. Kaplan–Meier curves of patients with anti-thyroid antibodies above median (in green) or below median (in red).
Table 3. Patient Characteristics by Anti-Thyroid Antibody Status Groups
Further, we analyzed the changes of ATAbs over time (Fig. 3). In 84% of patients, the ATAb status did not change from baseline until the end of treatment. We found that the 70 patients (48%) with a persistent ATAb level above the median had a significantly improved OS (HR = 0.41 [0.19–0.89], p = 0.025) and PFS (HR = 0.54 [0.31–0.95], p = 0.032) compared with the 53 patients (36%) with a persistent ATAb level below the median. The 15 patients (10%) whose ATAb levels increased above the median during treatment had a significantly improved PFS (HR = 0.24 [0.07–0.77], p = 0.017) and a nonsignificant trend toward improved OS (HR = 0.27 [0.06–1.22], p = 0.088) compared with the 9 patients (6%) whose ATAb levels decreased below the median (Fig. 3).
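The four trajectory groups described above follow from comparing the dichotomized ATAb status at baseline with the status during treatment. The sketch below shows that mapping and recomputes the reported proportions from the group sizes. The function name and group labels are hypothetical, and the denominator of 147 is simply the sum of the four reported group sizes.

```python
def change_category(baseline_above: bool, on_treatment_above: bool) -> str:
    """Map baseline and on-treatment ATAb status to a trajectory group."""
    if baseline_above and on_treatment_above:
        return "persistently above"
    if not baseline_above and not on_treatment_above:
        return "persistently below"
    return "increased above" if on_treatment_above else "decreased below"


# Group sizes as reported in the text.
counts = {
    "persistently above": 70,
    "persistently below": 53,
    "increased above": 15,
    "decreased below": 9,
}

total = sum(counts.values())
percentages = {name: round(100 * n / total) for name, n in counts.items()}

# Share of patients whose status did not change between the two measurements.
unchanged = round(
    100 * (counts["persistently above"] + counts["persistently below"]) / total
)

print(total, percentages, unchanged)
```

With these counts the two persistent groups together account for 84% of the patients, matching the proportion with an unchanged ATAb status stated in the text.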

Fig. 3. Changes in anti-thyroid antibody status and survival. Kaplan–Meier curves of patients with acquired anti-thyroid antibodies above median (in green), acquired anti-thyroid antibodies below median (in red), continued anti-thyroid antibodies above median (in blue), or continued anti-thyroid antibodies below median (in black).
Finally, Supplementary Figure S1 provides the thyroid toxicity courses in patients experiencing thyroid toxicity. This swimmer plot shows a variety of patterns. In many cases, a hyperthyroid phase was followed by a hypothyroid phase, which is classical for a thyroiditis, while a hypothyroid phase was only rarely followed by a hyperthyroid phase.
Discussion
Anti-PD-1 treated patients with metastatic melanoma, NSCLC, or RCC who acquired overt thyroid toxicity and/or had higher ATAb levels during treatment had a clinically relevant prolongation of OS and PFS compared with patients without thyroid toxicity and/or with low ATAb levels. We hypothesize that these patients represent a group with a higher susceptibility to autoimmunity, which, in turn, could be beneficial in the anti-cancer treatment via autoimmune-dependent pathways, leading to longer survival. This is supported by various studies in recent years showing that the occurrence of immune-related adverse events in anti-PD-1 treated patients is associated with improved response and survival (14–25). However, most of these studies were limited by their retrospective study design and/or small sample size, while in the current large prospective study we observed effect sizes for thyroid toxicity on OS that are much stronger compared with published associations with other immune-related adverse events (14–25).
Our finding that ATAb status at baseline is not correlated with OS or PFS is in line with the results of Toi et al. (10). However, we found that a very large proportion of patients did not change in ATAb status from baseline until the end of treatment, and significantly improved OS and PFS were observed for patients with persistent ATAb levels above the median when compared with patients with persistent ATAb levels below the median. Finally, patients whose ATAb levels rose above the median during treatment had significantly improved PFS, and a nonsignificant trend toward improved OS, compared with patients whose ATAb levels decreased below the median, leading to the observed difference in survival between patients with ATAb levels above and below the median during treatment. These results suggest that a predisposition for higher susceptibility to autoimmunity could already exist at baseline, but may be influenced during, and possibly by, the treatment, which is a new finding.
An important finding in our study is that only a few patients had positive ATAbs when using conventional cut-off thresholds. As these thresholds are specifically used for diagnosing Hashimoto's thyroiditis, rather than detecting autoimmunity, the cut-off levels in our study were a priori set at the median concentrations at baseline in the study population. The fact that these lower cut-off thresholds can serve as a strong predictive marker suggests that more subtle increases in ATAb levels are involved and relevant in this setting when compared with diagnosing Hashimoto's hypothyroidism.
Alternative models are available to describe response in patients treated with immunotherapy, such as iRECIST (immune-related response criteria). However, we used RECIST v1.1 since this is currently still standard in the oncology field for patients treated with immunotherapy (26,27). The scoring of thyroid toxicity in our study was purely based on the interpretation of serum TSH and fT4 levels to determine the presence of subclinical or overt thyroid dysfunction. This is because the decision to treat thyroid toxicity in practice is predominantly guided by this endocrine classification and therefore has clinical relevance. This is in contrast to the Common Terminology Criteria for Adverse Events (CTCAE), a common method used within oncology that describes the severity of organ toxicity for patients receiving cancer therapy, with a score ranging from 0 (no adverse event) up to 5 (death). Another important advantage of the current analysis is that we adjusted our results for time-dependent effects, while others did not, thereby preventing misinterpretation of the effects of thyroid toxicity on survival. It may be possible that patients with a longer follow-up time are more likely to develop thyroid toxicity. However, median duration to overt thyroid toxicity was 2.1 months, whereas the median follow-up time in our study was 14.9 months, well exceeding the median duration to thyroid toxicity.
The number of patients with pre-existing thyroid dysfunction in our cohort was high (16%), which is likely explained by our definition of thyroid dysfunction. Out of 27 patients with pre-existing thyroid disease, 22 patients were classified as having subclinical thyroid dysfunction, which we classified as such when having one aberrant TSH value prior to starting with immunotherapy. We intentionally used these very strict criteria to define pre-existing thyroid disease to avoid including any patient with pre-existing mild thyroid dysfunction that could interfere with our study aim.
The presence of nonthyroidal illness (NTI) may complicate the interpretation of thyroid function tests in severely ill patients. NTI is more common among intensive care patients, but none of our patients was hospitalized in the intensive care unit. Moreover, the most important associations in our cohort were found in patients with overt thyroid toxicity, the biochemical fingerprint of which does not resemble NTI; this was also supported by the observed thyroid toxicity courses (Supplementary Fig. S1). Therefore, it is unlikely that the potential presence of NTI among study participants substantially influenced the study results. When evaluating the course of thyroid dysfunction over time, a variety of patterns was observed, an important part of which resembled a thyroiditis. The fact that a hypothyroid phase was not always preceded by a hyperthyroid phase might be because the hyperthyroid phase was short and occurred between two consecutive thyroid function tests.
Although both ATAb status and thyroid toxicity correlated with OS, no associations were found between ATAb status and thyroid toxicity in our cohort. This could be explained by the fact that the number of patients with thyroid toxicity in the separate ATAb status groups was low. Therefore, further studies should investigate this relationship in larger groups of patients.
In conclusion, this study shows that acquired overt thyroid toxicity and higher ATAbs during treatment are strong predictive markers for response to anti-PD-1 treatment in the three cancer types studied. If validated in an independent study, these parameters may serve as novel predictive markers.
Footnotes
Author Disclosure Statement
A.A.M.V. declares having an advisory and consultancy position for Bristol-Myers Squibb and Merck Sharp & Dohme. All remaining authors have declared no conflict of interest.
Funding Information
The authors declare that no funding was received.
Supplementary Material
Supplementary Table S1
Supplementary Figure S1
