Abstract
Background:
Germline pathogenic variants in CHEK2 are associated with a moderate increase in the lifetime risk for breast cancer. Increased risk for other cancers, including non-medullary thyroid cancer (NMTC), has also been suggested. To date, data implicating CHEK2 variants in NMTC predisposition primarily derive from studies within Poland, driven by a splice site variant (c.444 + 1G>A) that is uncommon in other populations. In contrast, the predominant CHEK2 variants in non-Polish populations are c.1100del and c.470T>C/p.I157T, which together represent 61.1% and 63.8% of all CHEK2 pathogenic variants in two large U.S.-based commercial laboratory datasets, respectively. To further delineate the impact of common CHEK2 variants on thyroid cancer, we aimed to investigate the association of three CHEK2 founder variants (c.444 + 1G>A, c.1100del, and c.470T>C/p.I157T) with NMTC.
Methods:
The presence of three CHEK2 founder variants was assessed within three groups: (1) 1544 NMTC patients (and 1593 controls) from previously published genome-wide association study (GWAS) analyses, (2) 789 NMTC patients with germline exome sequencing (Oncology Research Information Exchange Network [ORIEN] Avatar), and (3) 499 NMTC patients with germline sequence data available in The Cancer Genome Atlas (TCGA). A case–control study design was utilized with odds ratios (ORs) calculated by comparison of all three groups with the Ohio State University GWAS control group.
Results:
The predominant Polish variant (c.444 + 1G>A) was present in only one case. The proportion of patients with c.1100del was 0.92% in the GWAS group, 1.65% in the ORIEN Avatar group, and 0.80% in the TCGA group. The ORs (with 95% confidence intervals [CIs]) for NMTC associated with c.1100del were 1.71 (0.73–4.29), 2.64 (0.95–7.63), and 2.5 (0.63–8.46), respectively. The proportion of patients with c.470T>C/p.I157T was 0.91% in the GWAS group, 0.76% in the ORIEN Avatar group, and 0.80% in the TCGA group, respectively. The ORs (with CIs) for NMTC associated with c.470T>C/p.I157T were 1.75 (0.74–4.39), 1.52 (0.42–4.96), and 2.31 (0.58–7.90), respectively.
Conclusions:
Our analyses of unselected patients with NMTC suggest that CHEK2 variants c.1100del and c.470T>C/p.I157T have only a modest impact on thyroid cancer risk. These results provide important information for providers regarding the relatively low magnitude of thyroid cancer risk associated with these CHEK2 variants.
Background
A complicating factor in interpreting the impact of CHEK2 variants on cancer susceptibility is the observation of differential risks based on the specific variant and/or variant type. Truncating or frameshift CHEK2 variants can result in loss of protein and have the strongest associations with cancer susceptibility. Missense variants predominantly affecting CHK2 functional domains are proposed to attenuate protein function, but not to result in complete loss. 12,13 Such missense variants are commonly referred to as being pathogenic, although data suggest the penetrance may be lower compared to truncating variants.
A recent study demonstrated that three specific missense variants (p.I157T, p.S428F, and p.T476M) have an attenuated association with breast cancer and were not associated with non-breast cancers in a large cohort of patients undergoing multigene testing at a large commercial laboratory, 14 which is consistent with other studies. 15–17 The p.I157T variant is particularly controversial: clinical laboratories vary in whether they report the variant and, if they do, in how they classify its pathogenicity (ranging from variant of uncertain significance to pathogenic with low penetrance). 18 Studies investigating the association between breast cancer risk and CHEK2 variants have been based primarily on truncating variants, such as the c.1100del variant. Thus, the National Comprehensive Cancer Network screening guidelines for individuals with pathogenic or likely pathogenic CHEK2 variants apply only to those with frameshift pathogenic/likely pathogenic variants. 19
Pathogenic variants in the CHEK2 gene have been suggested to cause an increased risk for thyroid cancer. 3,20 However, these data are primarily derived from studies within Poland, which are driven by a specific truncating splice site variant, c.444 + 1G>A (also known as IVS2 + 1G>A). Studies in Polish populations indicated that this specific CHEK2 variant has a strong association with papillary thyroid cancer (PTC) with odds ratios (ORs) ranging from 6.2 to 10.0.
In contrast, studies assessing the impact of CHEK2 variants on thyroid cancer risk in other populations have suggested weaker associations. For example, a large study from a commercial testing laboratory comparing individuals with pathogenic CHEK2 variants (excluding common missense variants that were deemed low risk) to individuals with negative multigene cancer testing found that the OR for developing thyroid cancer in the variant-positive group was 1.63 (95% confidence interval [CI] = 1.26–2.08; p < 0.001). 14
A possible explanation for the differing ORs may be variant-specific impact on thyroid cancer risk, given that the Polish founder variant was rarely present in the commercial laboratory datasets. Conversely, the two most common pathogenic variants in the United States are c.1100del and c.470T>C/p.I157T, representing 61.1% and 63.8% of all pathogenic CHEK2 variants detected in two large datasets of patients (2540 and 1101 patients, respectively) with commercial laboratory genetic testing. 15,21 Since data from commercial laboratories are likely impacted by ascertainment bias with an enrichment of individuals with a strong personal or family history of cancer, further research is needed regarding the impact of CHEK2 variants on thyroid cancer risk in unselected populations.
Thus far, studies of these CHEK2 variants have not been systematically performed in unselected non-medullary thyroid cancer (NMTC) populations, which are representative of most patients with this cancer. Therefore, we investigated the frequency of the predominant Polish founder variant (c.444 + 1G>A) and the two most common cancer-associated variants in the United States (c.1100del and c.470T>C/p.I157T) using three unselected, clinically annotated groups of patients with NMTC: (1) 1544 NMTC patients and 1593 matched controls from previously published genome-wide association studies (GWAS), (2) 789 unselected NMTC patients with germline exome sequencing from the Oncology Research Information Exchange Network (ORIEN), and (3) 499 NMTC patients with germline sequence data available in The Cancer Genome Atlas (TCGA).
Methods
Patient groups
GWAS group: We have previously reported GWAS analyses aimed at identifying loci associated with risk for NMTC, and the population studied here was included in those analyses. 22,23 Recruitment and analysis protocols were approved by the Ohio State University (OSU) institutional review board for both patients (2006C0047) and controls (2005H0249), and participant consent was obtained. Data available from the prior analyses were used to investigate the presence of specific variants in the CHEK2 gene in a series of 1544 well-annotated NMTC patients and 1593 noncancer controls from Ohio. The proportion of patients and controls with each variant was evaluated using imputed data.
ORIEN Avatar group: Germline exome analysis was performed on blood/normal tissue samples from 896 unselected NMTC patients who had previously consented to the Total Cancer Care protocol, a biorepository for patients seen at the OSU James Comprehensive Cancer Center or at one of 17 other centers nationwide participating in the ORIEN. All protocols were approved by the respective institutional review boards, including OSU protocol 2013H0197. Patients included in the GWAS group were excluded from analysis, resulting in 789 eligible cases. DNA was isolated and extracted using Qiagen (QIASymphony). Massively parallel sequencing was performed on the Nimblegen v3 platform using an Illumina HiSeq4000 sequencer according to the manufacturers' instructions. Target coverage was 100× (minimum 50×). Germline variants were called using the Haplotyper algorithm (sentieon_release_201911). Tumor and germline samples were assessed for a minimum single nucleotide polymorphism (SNP) concordance of 80%.
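As an illustration of the tumor-germline concordance check described above, the following minimal Python sketch computes the fraction of shared SNP sites with identical genotype calls and compares it with the 80% minimum. The site identifiers, the genotype encoding, and the function name `snp_concordance` are hypothetical. The source does not describe how concordance was computed, so this is one plausible definition and not necessarily the procedure that was used.

```python
# Hypothetical sketch: genotype concordance between paired tumor and germline samples.
# Genotypes are encoded as alternate-allele counts (0, 1, or 2); this encoding is an assumption.

MIN_CONCORDANCE = 0.80  # minimum SNP concordance stated in the text


def snp_concordance(germline, tumor):
    """Return the fraction of sites called in both samples whose genotypes match."""
    shared_sites = [site for site in germline if site in tumor]
    if not shared_sites:
        raise ValueError("no shared SNP sites to compare")
    matches = sum(1 for site in shared_sites if germline[site] == tumor[site])
    return matches / len(shared_sites)


# Toy data with made-up site names; "snp6" is absent from the tumor sample and is ignored.
germline_calls = {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 1, "snp5": 0, "snp6": 1}
tumor_calls = {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 2, "snp5": 0}

concordance = snp_concordance(germline_calls, tumor_calls)
passes = concordance >= MIN_CONCORDANCE
print(f"concordance = {concordance:.2f}, passes threshold: {passes}")
```

A threshold below 100% leaves room for genuine tumor-specific changes (for example, loss of heterozygosity) while still flagging likely sample mismatches.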
TCGA group: TCGA Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. 24 Two protected TCGA data sets were used in the analysis. Germline genotype calls for the three CHEK2 variants were obtained for 499 TCGA-NMTC patients. 25 Genome-wide SNP genotype calls, used to perform principal component analysis (PCA), were available for 460 of the 499 TCGA-NMTC patients and were acquired from the publication of Carrot-Zhang et al. 26
Data were obtained through an approved research request (
Statistical analysis
A case–control study design was utilized with analysis performed within each group for NMTC (regardless of histology) and specifically for PTC, including all nonanaplastic histological subtypes such as the follicular variant or tall cell variant of PTC. The Ohio GWAS control group was used as the comparison for all three groups, given that it resembles each group with respect to demographics, such as sex distribution, which is skewed toward female in thyroid cancer (Table 1).
Table 1. Demographic Information and Cancer Characteristics
Race categories reflect self-reported ancestry.
ATC, anaplastic thyroid cancer; FTC, follicular thyroid cancer; FVPTC, follicular variant of papillary thyroid cancer; GWAS, genome-wide association studies; NMTC, non-medullary thyroid cancer; NOS, not otherwise specified; ORIEN, Oncology Research Information Exchange Network; OTC, oncocytic thyroid cancer, previously known as Hurthle cell; PDTC, poorly differentiated thyroid cancer; PTC, papillary thyroid cancer; TCGA, The Cancer Genome Atlas.
Differences between each case group and the control group were assessed using t-tests for continuous variables and Fisher's exact tests for categorical variables. PCA was performed to identify variations in ethnic backgrounds for 460 of the 499 available samples; the remaining 39 samples were treated as having missing principal components. Logistic regression models adjusting for age, sex, and genetic ancestry (the first three principal components) were applied to compare cases and controls. ORs, CIs, and p-values are reported.
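To make the case–control comparison concrete, the sketch below computes an unadjusted odds ratio with a Woolf (log-scale) 95% CI and a two-sided Fisher's exact p-value from a 2×2 carrier table, using only the Python standard library. It is a simplified illustration: the analysis described above used logistic regression adjusted for age, sex, and principal components, which this sketch does not reproduce. The function names and the carrier counts in the example are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Unadjusted OR and Woolf CI for a 2x2 table.

    a = carrier cases, b = non-carrier cases,
    c = carrier controls, d = non-carrier controls (all must be > 0).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum of hypergeometric probabilities
    of all tables (with the same margins) no more likely than the observed one."""
    cases = a + b
    carriers = a + c
    total = a + b + c + d

    def pmf(x):
        return (math.comb(carriers, x) * math.comb(total - carriers, cases - x)
                / math.comb(total, cases))

    p_observed = pmf(a)
    x_min = max(0, cases - (total - carriers))
    x_max = min(cases, carriers)
    p_value = sum(pmf(x) for x in range(x_min, x_max + 1)
                  if pmf(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts for illustration only: 10/1000 carrier cases vs 5/1000 carrier controls.
or_, ci_low, ci_high = odds_ratio_woolf(10, 990, 5, 995)
p = fisher_exact_two_sided(10, 990, 5, 995)
print(f"OR = {or_:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), Fisher p = {p:.3f}")
```

With carrier frequencies near 1%, even a doubling of the odds produces a wide interval that spans 1, which is the pattern seen for rare variants in moderately sized groups.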
Results
Demographics: The demographics of the three NMTC groups and the control group are shown in Table 1. As expected, PTC was the predominant histology, women were more common than men, and TNM characteristics were similar in all case groups. The Ohio NMTC GWAS group was significantly younger than the GWAS controls (p < 0.001). The ORIEN Avatar group was significantly older than GWAS controls (p < 0.001), with more males (p < 0.001), and more self-reported racial diversity (p < 0.0001). The TCGA group also had more self-reported racial diversity (p < 0.0001) than the controls.
CHEK2 Genotype Associations: The proportion of patients in each of the three groups who were heterozygous for either c.1100del or c.470T>C/p.I157T is shown in Table 2. No homozygous patients were identified in any of the groups. The predominant Polish variant (c.444 + 1G>A) was present in only one individual from the ORIEN group and was absent in the other groups.
Table 2. CHEK2 Variant Frequencies and Odds Ratios
Indicates the total number of cases within each group with GWAS or exome data available. Variant proportion columns indicate the number of cases with each variant out of the total number with quality data at that locus. A small number of cases/controls were removed from the calculation of variant frequencies or odds ratios due to low-quality data at the variant locus.
PTC includes both PTC and FVPTC.
OR, odds ratio.
The proportion of patients with c.1100del was 0.92% in the GWAS group, 1.65% in the ORIEN group, and 0.80% in the TCGA group. The ORs (with CIs) for NMTC associated with c.1100del within each group were 1.71 (0.73–4.29), 2.64 (0.95–7.63), and 2.5 (0.63–8.46), respectively. The proportion of patients with c.470T>C/p.I157T was 0.91% in the GWAS group, 0.76% in the ORIEN group, and 0.80% in the TCGA group. The ORs (with CIs) for NMTC associated with c.470T>C/p.I157T within each group were 1.75 (0.74–4.39), 1.52 (0.42–4.96), and 2.31 (0.58–7.90), respectively. Because the TCGA group was predominantly PTC, we also analyzed the variant frequencies specifically in PTC patients in the other two groups. When analyzed in this manner, the variant impacts were similar to those in the overall NMTC populations in the groups (Table 2).
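The per-group NMTC estimates quoted above can be tabulated and checked programmatically. The short Python sketch below stores the reported ORs and 95% CIs, tests whether each interval excludes the null value of 1, and back-calculates an approximate standard error of the log OR. The back-calculation assumes a Wald-type interval that is symmetric on the log scale; the source does not state how the CIs were derived, so that step is a rough approximation, and the function names are hypothetical.

```python
import math

# Reported NMTC odds ratios and 95% CIs, keyed by (variant, group): (OR, lower, upper).
REPORTED = {
    ("c.1100del", "GWAS"): (1.71, 0.73, 4.29),
    ("c.1100del", "ORIEN"): (2.64, 0.95, 7.63),
    ("c.1100del", "TCGA"): (2.50, 0.63, 8.46),
    ("c.470T>C/p.I157T", "GWAS"): (1.75, 0.74, 4.39),
    ("c.470T>C/p.I157T", "ORIEN"): (1.52, 0.42, 4.96),
    ("c.470T>C/p.I157T", "TCGA"): (2.31, 0.58, 7.90),
}


def excludes_null(lower, upper, null=1.0):
    """True if the interval lies entirely above or entirely below the null value."""
    return lower > null or upper < null


def approx_se_log_or(lower, upper, z=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for (variant, group), (or_, lower, upper) in REPORTED.items():
    print(f"{variant:18s} {group:6s} OR={or_:.2f} "
          f"CI=({lower:.2f}, {upper:.2f}) "
          f"excludes 1: {excludes_null(lower, upper)} "
          f"approx SE(log OR)={approx_se_log_or(lower, upper):.2f}")
```

Every interval in this set spans 1, so none of these six NMTC estimates is individually significant at the 5% level, although all point estimates lie above 1.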
Analysis was also performed by combining all three NMTC groups (with removal of overlapping cases). In the combined group, there were 30/2798 (1.07%) NMTC cases with c.1100del, resulting in an OR of 1.94 (CI = 0.92–4.61, p = 0.103) when compared to GWAS controls, and 24/2822 (0.85%) NMTC cases with c.470T>C/p.I157T, resulting in an OR of 1.78 (CI = 0.83–4.25, p = 0.163) when compared to GWAS controls.
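The combined-group carrier proportions follow directly from the reported counts. The Python sketch below reproduces the two percentages and, as an added illustration, attaches a Wilson score 95% interval to each proportion. These intervals are not reported in the source and are shown only to convey how much uncertainty surrounds a carrier frequency near 1%; the function names are hypothetical.

```python
import math

# Combined-group counts from the text: (carriers, cases with quality data at the locus).
COMBINED = {
    "c.1100del": (30, 2798),
    "c.470T>C/p.I157T": (24, 2822),
}


def carrier_percent(carriers, total):
    """Carrier proportion expressed as a percentage."""
    return 100 * carriers / total


def wilson_interval(carriers, total, z=1.96):
    """Wilson score interval for a binomial proportion (returned as fractions)."""
    p = carriers / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


for variant, (carriers, total) in COMBINED.items():
    lower, upper = wilson_interval(carriers, total)
    print(f"{variant}: {carriers}/{total} = {carrier_percent(carriers, total):.2f}% "
          f"(Wilson 95% interval {100 * lower:.2f}%-{100 * upper:.2f}%)")
```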
Discussion
CHEK2 is often described as a thyroid cancer susceptibility gene in the scientific literature. 14,20 Estimating thyroid cancer risk in individuals with pathogenic CHEK2 variants is complicated for three main reasons: (1) Previous studies have shown that the magnitude of risk for various cancer types is not consistent between CHEK2 variants (especially between predicted truncating and missense variants), (2) allele frequencies can vary significantly between populations due to founder variants, and (3) risk estimates derived from commercial testing datasets may be influenced by ascertainment bias based on who is referred for such testing clinically, with some of these challenges being common in the field of germline cancer genetics.
The American College of Medical Genetics recently published a practice resource for the management of individuals with germline pathogenic/likely pathogenic variants in CHEK2, which acknowledges the fact that studies of thyroid cancer risk have been limited to the Polish population and states that further studies are required to confirm clinically actionable associations with several cancers, including thyroid cancer. 18
To address these concerns, we analyzed three independent populations of patients with NMTC who were not selected for familial disease or for referral to cancer genetics counseling or testing. Our analyses in these unselected groups of patients with NMTC support an association of CHEK2 variants with NMTC based on the ORs. However, these data suggest that the common CHEK2 variants (c.1100del and c.470T>C/p.I157T) likely have only a modest impact on NMTC risk. The ORs for NMTC/PTC for CHEK2 c.1100del ranged from 1.71 to 2.64 across groups. These results are consistent with the OR of 1.68 estimated by Bychkovsky et al., which was based on commercial genetic testing laboratory data.
Although studies in testing populations are likely influenced by ascertainment bias with a skew toward patients with stronger personal or family history of cancer, this may be less of a concern for thyroid cancer risk estimates since, given an absence of established testing criteria for NMTC, the history of thyroid cancer is less likely to be the primary testing indication. This may help explain the concordance between thyroid cancer ORs found in the clinical laboratory datasets 14 and our unselected groups. While our analysis showed only a modest impact of common CHEK2 variants on thyroid cancer risk, further studies are needed to determine if CHEK2-associated thyroid cancers are more aggressive or harbor unique somatic profiles.
Studies have suggested that common CHEK2 missense variants (such as c.470T>C/p.I157T) have less of an impact on breast cancer risks than truncating or frameshift variants 14,15; however, this may be specific to c.470T>C/p.I157T (and perhaps
Research to further delineate the magnitude of thyroid cancer risk associated with CHEK2 variants will likely require larger unselected and well-annotated patient groups of more diverse genetic ancestry. Even though our analysis involved relatively large numbers of patients with thyroid cancer and controls, only one calculated OR was statistically significant, likely due to the overall infrequency of these specific variants in the population.
A major limitation of our study is an overall lack of diversity in our groups. There was a statistically significant difference between the self-reported racial diversity of the ORIEN Avatar and TCGA groups when compared to the controls. However, the GWAS controls were deemed an appropriate control population due to the similarity in sex distribution to the cases, which is particularly important in the study of thyroid cancer risk, given the 3:1 female-to-male sex bias in thyroid cancer incidence. 28 Furthermore, potential impacts on results from differences in genetic ancestry were mitigated by using PCA components from SNP data in the logistic regression models.
Importantly, most data regarding the risk for thyroid cancer in individuals with pathogenic variants in CHEK2 come from Polish and/or commercial testing datasets (Table 3). The Polish studies are enriched with a specific founder variant (c.444 + 1G>A) that has shown high associations with thyroid cancer (ORs of 6.2–10.0), but is uncommon in North American populations. 3,20 In addition, thyroid cancer was reported with the highest frequency (5.8%) in c.444 + 1G>A heterozygotes compared to the other commonly reported CHEK2 pathogenic variants in a commercial testing dataset. 15 Only one c.444 + 1G>A heterozygote was identified in our analysis, indicating that this variant contributes little to NMTC in these populations. While this variant appears to have the largest impact on thyroid cancer risk, it appears to be largely confined to the Polish population and is rare in North America.
Table 3. Existing Literature Regarding CHEK2 Variants and Thyroid Cancer Risk
N/A, not applicable.
At least two studies have each reported the presence of the CHEK2 c.1100del variant in a family with familial NMTC, with 12 and 4 affected individuals, respectively. 29,30 Based on data from our analysis and studies outlined previously, it is unlikely that the c.1100del variant is the sole explanation for the diagnoses of thyroid cancer in these families. Rather, the variant, along with other genetic or environmental risk factors or modifiers, may be contributing to the development of thyroid cancer in these families. As such, family members should be managed and screened based on the family history rather than the presence or absence of the CHEK2 variant.
Our analyses in groups of unselected patients with NMTC suggest that the ORs for NMTC/PTC associated with CHEK2 c.1100del ranged from 1.71 to 2.64. Similarly, the ORs for NMTC/PTC associated with CHEK2 c.470T>C/p.I157T ranged from 1.52 to 2.33 across groups. These modest increases in risk are unlikely to warrant clinical evaluations beyond routine care, unless driven by family history. The impact of CHEK2 c.470T>C/p.I157T on NMTC risk remains controversial, but is likely to be similar or even more subtle than the risk associated with c.1100del. These results provide important information for providers regarding the relatively modest magnitude of thyroid cancer risk associated with these common CHEK2 variants and also point to the need for further studies in more diverse populations to confirm these findings.
Footnotes
Acknowledgments
The authors would like to thank the ORIEN member institutions for their contributions to this research. These institutions include Moffitt Cancer Center, The Ohio State University, University of Virginia Cancer Center, The University of Colorado Cancer Center, Rutgers Cancer Institute of New Jersey, University of Southern California Norris Comprehensive Cancer Center, John P. Murtha Cancer Center, University of Utah Huntsman Cancer Institute, University of Oklahoma Stephenson Cancer Center, University of Iowa Holden Comprehensive Cancer Center, Roswell Park Comprehensive Cancer Center, University of Kentucky Markey Cancer Center, and Indiana University Simon Comprehensive Cancer Center. This work was presented at the 2023 American Thyroid Association annual meeting (Abstract ID poster-263).
Authors' Contributions
All authors made significant contributions in the development of this article. P.B.: Conceptualization (lead); writing—original draft (lead); and writing—review and editing (lead). S.L.: Conceptualization (supporting); data curation (lead); formal analysis (lead); and writing—original draft (supporting). T.T.N.: Conceptualization (supporting); data curation (supporting); and writing—original draft (supporting). C.C.: Resources (supporting) and writing—review and editing (supporting). W.K.: Resources (supporting) and writing—review and editing (supporting). L.A.S.: Resources (supporting) and writing—review and editing (supporting). S.Y.: Resources (supporting) and writing—review and editing (supporting). A.L.G.: Resources (supporting) and writing—review and editing (supporting). K.E.J.: Resources (supporting) and writing—review and editing (supporting). J.M.K.: Resources (supporting) and writing—review and editing (supporting). B.S.: Resources (supporting) and writing—review and editing (supporting). P.G.: Resources (supporting) and writing—review and editing (supporting). J.K.H.: Resources (supporting) and writing—review and editing (supporting). M.D.R.: Conceptualization (supporting); data curation (supporting); writing—original draft (supporting); and project administration.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This work was supported by National Cancer Institute Grants P50CA168505, P30CA16058, and P30CA086862, and Jalmari and Rauha Ahokas Foundation (T.T.N.).
