Abstract
Background:
Thyroglobulin (Tg) is a 660 kDa iodoglycoprotein that serves as a scaffold for thyroid hormone synthesis. Although a twin study showed that variability of serum Tg levels has a substantial genetic basis, no genome-wide association study (GWAS) of serum/plasma Tg levels has been performed to date. The aim of this study was to identify genetic variants associated with plasma Tg levels among healthy individuals.
Methods:
A GWAS was conducted on two Croatian cohorts, and a combined analysis was performed. The analyses included 1094 individuals. A total of 7,597,379 variants, imputed using the 1000 Genomes reference panel, were analyzed for association. GWAS was performed under an additive model, controlling for age, sex, and relatedness within each data set. Combined analysis was conducted using the inverse-variance fixed-effects method.
Results:
Sixteen variants located on chromosome 3, within the ST6GAL1 gene, reached genome-wide significance. The lead SNP was rs4012172 (
Conclusions:
A highly biologically plausible locus was identified that could have a role in the regulation of plasma Tg levels in healthy individuals.
Introduction
Thyroglobulin (Tg) is a 660 kDa iodoglycoprotein, the largest and most abundant protein in the thyroid gland (1). The synthesis of Tg occurs in the thyrocytes and neosynthesized Tg is subjected to glycosylation in the rough endoplasmic reticulum and in the Golgi compartment, followed by iodination and hormone formation at the follicular lumen/apical membrane boundary (2). The mature, hormone-containing Tg is then transferred to the follicular lumen where it functions as a thyroid hormone (TH) store (2,3). Thus, Tg can be considered as a prohormone, as it is a precursor of thyroxine (T4) and triiodothyronine (T3) (4). When THs are required, Tg is internalized by endocytosis in the thyrocyte and subjected to proteolysis, followed by the release of THs at the basolateral membrane into the bloodstream. In healthy individuals, only some Tg vesicles bypass intrathyroidal lysosomal proteolysis and reach the basolateral surface. This process, called transcytosis, seems to be the main pathway involved in the release of Tg into the circulation (5,6).
Serum Tg levels show a great deal of individual variation in healthy individuals (7). Several studies found an inverse association between iodine intake and serum Tg (8,9). Moreover, there is an association between thyroid volume and serum Tg (10), and the levels of serum Tg significantly change with thyroid abnormalities and dysfunction such as thyroiditis, autoimmune thyroiditis, and thyroid cancer (11 –13).
Glycosylation is a key event in Tg maturation (6,14). The glycan part of the Tg molecule contains 8–10% of total carbohydrates—mainly galactose, mannose, fucose, N-acetyl glucosamine, and sialic acid residues (15). Glycosylation of Tg affects several thyroid functions, including iodination and hormone synthesis, specific interactions with microsomes and membrane receptors, protein folding and recycling, as well as Tg immunoreactivity and the directing of Tg to subcellular and extracellular compartments (12).
A twin study suggested that there is a strong genetic component to the variability of serum Tg (16). No candidate or genome-wide association studies (GWAS) investigating the genes associated with Tg level have been reported so far.
The aim of this study was to identify loci associated with the level of Tg in plasma. A GWAS of plasma Tg levels was performed in 1094 healthy individuals from two Croatian cohorts.
Methods
Study population
This study was performed on subjects originating from two Croatian cohorts: one from the city of Split (CROATIA_Split) and the other from the island of Korcula (CROATIA_Korcula), within the 10,001 Dalmatians project (17). Characteristics of the cohorts are shown in Table 1. All participants who could have any thyroid disease according to anamnestic data and detailed biochemical findings were excluded. Individuals taking thyroid medication or who had undergone thyroid surgery, who reported thyroid disorders, or who had Tg, thyrotropin (TSH), free T3 (fT3), free T4 (fT4), Tg autoantibodies (TgAb), or thyroid peroxidase autoantibodies (TPOAb) levels falling outside of the normal reference range were excluded, leaving 1094 participants for analysis. Information on the family history of thyroid diseases was not available. Of note, the Croatian population has a sufficient iodine intake through legally mandated salt iodization (18). The study was approved by the Research Ethics Committees in Croatia and Scotland, and all subjects gave written informed consent.
Characteristics of the Two Cohorts
BMI, body mass index; Tg, thyroglobulin.
Genotyping and imputation
Supplementary Table S1 shows cohort summary information on genotyping, imputation, and quality-control procedures. The final number of single nucleotide polymorphisms (SNPs) included in the analysis was 8,777,560 for the CROATIA_Split sample and 9,182,797 for the CROATIA_Korcula sample, with 7,597,379 overlapping SNPs present in both cohorts.
Biochemical measurements
Plasma concentrations of THs and antibodies were measured by immunoassay methods with the Liaison XL Biomedica Chemiluminescence Analyzer, using in vitro assays for quantitation of Tg (REF 311861), TSH (REF 311211), fT3 (REF 311531), fT4 (REF 311611), TgAb (REF 311711), and TPOAb (REF 311701). The following reference ranges were used for the study population: Tg 0.2–50 ng/mL, TSH 0.3–3.6 mIU/L, fT3 3.39–6.47 pmol/L, fT4 10.29–21.88 pmol/L, TgAb 5–100 IU/mL, and TPOAb levels 1–16 IU/mL (19). Biochemical measurements were performed in the Biochemistry Laboratory in the Department of Nuclear Medicine at the University Hospital Split.
Statistical analysis
A GWAS was performed within each data set, and then the results were combined.
Prior to analysis, Tg levels were adjusted for age and sex using linear regression analysis, and inverse normal transformation was used to achieve a normal distribution for the calculated residuals. GWAS was performed on transformed residuals using a linear mixed model under the assumption of additive genetic effects. An inverse normal transformation was used on Tg levels. Betas (β) represent the change in outcome for each additional effect allele under an additive model, expressed in standard deviation units/allele. To obtain more interpretable effect estimates, a linear regression analysis was also performed on the original scale of Tg levels. The association analysis for the CROATIA_Split sample was carried out using a combination of the R package “GenABEL” and “SNPTEST” software, while the R packages “GenABEL” and “VariABEL” were used for the CROATIA_Korcula sample (20 –22).
A combined analysis of the individual results was performed using a fixed-effect inverse-variance weighted model, as implemented in the R package “MetABEL” (23). To visualize the results, a Manhattan plot and a quantile–quantile (QQ) plot were generated using the R package “qqman,” regional association plots for genomic regions within 500 Kb of the top hit using Locus Zoom software, and a forest plot for the significant SNP association using “MetABEL” (24,25). Cluster plots were visually inspected for quality for all suggestively associated SNPs using the Illumina GenomeStudio software package.
To identify independently associated SNPs, conditional association analysis was performed on the lead SNP using “SNPTEST” and R software. Post hoc power calculations were performed using Quanto, assuming an additive genetic model and genome-wide significance level
Results
GWAS analysis was performed in the CROATIA_Split and CROATIA_Korcula populations separately. Four SNPs within the ST6 Beta-Galactoside Alpha-2,6-Sialyltransferase 1 (ST6GAL1) gene reached genome-wide significance in the CROATIA_Split population, and all of them were replicated in the CROATIA_Korcula population with a p-value of
Cohort-level results were combined. There was little evidence for population stratification at a cohort-level (
The combined analysis results are shown in Figure 1A. The QQ plot showed an excess of significant p-values (Fig. 1B). Sixteen SNPs in a region on chromosome 3 containing the ST6GAL1 gene reached genome-wide significance (

Manhattan plot and quantile–quantile (QQ) plot of combined analysis for thyroglobulin (Tg) levels. (

Regional association plot of ST6GAL1 region. The most significant SNP (rs4012172) is shown in purple. Otherwise, the colors of the circles denote their correlations (LD r 2) with the top SNP. Color images are available online.

Forest plot of the association between the ST6GAL1 gene polymorphism rs4012172 and Tg levels.
In addition to the genome-wide significant association identified for SNPs within ST6GAL1, several suggestive associations were also identified with a p-value of
Associations of Top Single Nucleotide Polymorphisms
SNP, single nucleotide polymorphism.
Discussion
In this GWAS of plasma Tg levels in healthy populations, the ST6GAL1 gene on chromosome 3 was identified as a novel significant locus. The strongest association was observed for the ST6GAL1 gene polymorphism rs4012172 (p = 1.29 × 10−10), with the C allele being associated with higher Tg levels. The identified SNP accounts for 3.19% of population variance in plasma Tg. ST6GAL1 belongs to the sialyltransferase family, Golgi membrane-bound enzymes that catalyze the biosynthesis of sialylated oligosaccharides, and therefore has a fundamental role in the synthesis of specific sialylated structures (27). The highest expression of ST6GAL1 is found in liver, thyroid, spleen, lymph node, and prostate tissue (28). According to the Human Splicing Finder (29), there is no information on the predictive effect of the rs4012172 SNP in ST6GAL1 protein expression or activity. Among the remaining 15 significant polymorphisms, the rs3872724 variant was predicted to create a donor splice site within the intron, while for other polymorphisms, there was no available prediction information. As specified in GTEx (30), the strongest eQTL association signals reported in thyroid tissue for this gene are for rs3821819 and rs967367 (p = 8.2 × 10−13). Both variants are also significantly associated with the level of Tg in this data set (p = 1.9 × 10−9 and p = 2.42 × 10−9, respectively), and are in moderate LD (r 2 = 0.74, 1000 Genomes Phase III) with the lead rs4012172 SNP.
Since all of the GWAS hits are located in non-coding sites, either untyped rare variants or more indirect mechanisms could be responsible for the variation of plasma Tg levels.
Besides iodine availability, sialylation of Tg plays an important role in TH synthesis (2). Sialylation of Tg is associated with three aspects of Tg biology: recycling, autoregulation, and immunoreactivity (12).
Tg stored in the follicular lumen is internalized in thyrocytes and broken down to release THs. A portion of the internalized Tg (with no or low hormone content) that is poorly sialylated and iodinated cannot be proteolyzed by lysosomes (6). The thyroid gland rather tries to protect and recycle the immature Tg through a two-step transport system: first, through the Golgi compartment where novel oligosaccharides are added, followed by transport back to the follicular lumen where additional iodination and hormone formation occurs (2,6). Another fraction of the immature Tg molecules is transported via transcytotic vesicles to the basolateral thyrocyte membrane and then released into the bloodstream (2,31).
One of the potential mechanisms of Tg autoregulation and endocytosis is placed at the apical membrane of thyrocytes where there is direct contact of signal transduction receptors with Tg from the follicular lumen. Up to now, three candidate signal transduction receptors have been described: the asialoglycoprotein (ASGP), N-acetylglucosamine receptor, and megalin. Sialylation of Tg is considered necessary for the release of Tg into thyrocytes via receptor-mediated endocytosis. The ASGP and N-acetylglucosamine receptors are probably involved in sorting immature Tg molecules for recycling to the follicular lumen. Poorly sialylated Tg located in the periphery of the follicular lumen has a greater affinity for the ASGP receptor than normally sialylated Tg (2,32). Sialylation could also have an important role in the variability of plasma Tg levels in healthy individuals, since it seems to play an important role in regulating Tg secretion into the bloodstream (33).
Sialic acids of native Tg have a protective role in the autoimmunization process. Several studies showed that desialylation may unmask the Tg isoantigen (34,35). There is now comprehensive evidence that the trigger for an autoimmune reaction is increased synthesis of desialylated Tg followed by the initiation of the autoimmunity process (35).
The polypeptide chain of the human Tg molecule contains 20 putative N-linked glycosylation sites, of which 16 are glycosylated in the mature Tg molecule (36,37). Sialylation is a late post-translational modification of human Tg. It has been reported that normal human Tg has a sialic acid content of 7.3 μg/mg (38). Sialyltransferases catalyze the transfer of sialic acid from cytidine monophosphosialic acid to the terminal position of carbohydrate chains of glycolipids and glycoproteins (39). ST6GAL1 encodes sialyltransferase 1, also known as β-galactiside α-2,6-sialyltransferase (39). Sialic acids are attached to the carbohydrate units of human Tg primarily with α-2,6 linkages established by the activity of sialyltransferase 1 (38). Alpha 2,6-bound sialic acids were also described on the Tg secreted by rat thyroid cell line FRTL-5 and porcine Tg (40,41). Furthermore, human thyroid tissue showed high mRNA levels of ST6GAL1 (39). Severely hyposialylated Tg was noticed in a patient with congenital goiter with hypothyroidism, and the authors observed that the α-2,6 sialyltransferase activity was deficient in that family (38). Therefore, based on this evidence and the present finding, it is suggested that ST6GAL1 variations may alter Tg sialylation and consequently influence the release of Tg into the bloodstream.
Genetic defects that affect the biochemical pathways of glycan synthesis can cause severe glycan deficiencies and lead to a variety of diseases (42). Several genome-wide studies have already revealed significant associations with variants in the ST6GAL1 gene and different traits and diseases. So far, the most significant association discovered with a variant in ST6GAL1 was identified for plasma immunoglobulin G (IgG) glycosylation, with rs11710456 as the leading SNP (p = 6.12 × 10−75). Lauc et al. discovered 37 genome-wide significant SNPs within ST6GAL1 that were associated with 14 different IgG glycosylation/sialylation traits in a meta-analysis of European populations (43). Highly sialylated IgG molecules are considered as one of the key elements in immune homeostasis, as well as in the prevention and regulation of autoimmune and inflammatory diseases (44).
In a Han Chinese population, the ST6GAL1 gene was associated with immunoglobulin A nephropathy (rs7634389; p = 7 × 10−10) (45). Drug-induced liver injury was also associated with the ST6GAL1 gene (rs10937275; p = 1.4 × 10−8) in Europeans (46). These GWAS indicate that ST6GAL1 is a plausible cofactor in the susceptibility to immuno-inflammatory events.
Kooner et al. found an association of the ST6GAL1 gene (rs16861329, p = 3 × 10−8) with type 2 diabetes in a South Asian population (47). A study by Santisteban et al. showed that insulin and insulin growth factor 1, along with TSH, increase Tg synthesis at the mRNA levels in FRTL-5 thyroid cells (48). Changes in rat hepatocyte cell surface sialic acid content have also been observed in insulin resistance associated with diabetes mellitus (49).
Sialylation regulates immune cell function, cell signaling, activation, and differentiation (50 –53), thus supporting the findings of GWAS on the pleiotropy of ST6GAL1. The results of the present study suggest that ST6GAL1 is also likely to have a role in the regulation of the plasma Tg level.
Although no signals other than the ST6GAL1 locus reached genome-wide significance, several candidate loci showed suggestive evidence of association. Of particular interest is the rs140183574 variant located on chromosome 15 (p = 1.9 × 10−6), within the GABRB3 and GABRA5 genes. GABRB3 and GABRA5 encode proteins that serve as the receptors for gamma-aminobutyric acid (GABA), a major inhibitory neurotransmitter of the nervous system. GABA plays a role in (43) the homeostasis of the normal thyroid and has an inhibitory role in TSH-stimulated TH secretion from thyroid follicular cells (54,55).
Another interesting finding is the suggestive association for a locus near the POM121L12 gene on chromosome 7 (p = 4 × 10−6). Variants near this gene were also suggestively associated with N-glycosylation of human IgG (p = 5 × 10−6) (43). Likewise, several autoimmune diseases (type 1 diabetes, type 2 diabetes, psoriasis, and rheumatoid arthritis) have shown suggestive associations (p = 10−5–10−7) within the same region (56 –58), indicating that the POM121L12 gene could also have a pleiotropic effect.
A major strength of this study is the careful selection of healthy individuals based on detailed thyroid function assessment (including measurements of Tg, TSH, fT3, fT4, TgAb, and TPOAb) and information on self-reported thyroid disease and thyroid medications used. Other strengths include the ethnically homogeneous sample of Croatian individuals and the comprehensive set of genetic variants examined. Although both cohorts consist predominantly of older subjects, all participants included in the study were healthy individuals with normal thyroid function. Additionally, since thyroid function decreases with age (59), Tg levels were adjusted for age effects in the analysis. The main limitation of the study is the modest sample size. Thus, the study was underpowered to detect small effects at genome-wide significance. However, there were sufficient data to detect an association for the identified ST6GAL1 locus confidently, since the combined analysis had a 91% power to detect the top associated SNP rs4012172 with an effect size of 0.30 and minor allele frequency of 0.35 at the genome-wide level of significance.
In conclusion, this study reports the first GWAS on plasma Tg levels and contributes to the understanding of genetic factors underlying Tg levels in healthy individuals. The novel genome-wide significant ST6GAL1 locus revealed by this study is highly biologically plausible and likely to have a role in the regulation of plasma Tg levels. ST6GAL1 encodes sialyltransferase 1 involved in the sialylation of Tg (37,39). It is known that only desialylated or poorly sialylated Tg can be transferred into the bloodstream. The biological role of sialyltransferase 1 suggests a possible influence of ST6GAL1 variations on plasma Tg levels.
Footnotes
Acknowledgments
The Croatian Science Foundation funded this work under the project “Identification of new genetic loci implicated in regulation of thyroid and parathyroid function” (grant no. 1498). The 10,001 Dalmatians project was funded by grants from the Medical Research Council (United Kingdom), European Commission Framework 6 project EUROSPAN (contract no. LSHG-CT-2006-018947), the Republic of Croatia Ministry of Science, Education and Sports research grant (216-1080315-0302), the Croatian Science Foundation (grant 8875), CEKOM (Ministry of Economy, Entrepreneurship and Crafts), and the Research Centre of Excellence in Personalized Medicine (Ministry of Science and Education). C.H. is supported by an MRC University Unit Program Grant MC_PC_U127592696.
We would like to thank all participants of this study and acknowledge the invaluable support of the local teams in Zagreb and Split, especially that of the Institute for Anthropological Research, Zagreb, Croatia. We wish to thank Shelly Pranic, PhD, for proofreading the manuscript.
Author Disclosure Statement
No competing financial interests exist.
Supplementary Material
Supplementary Table S1
Supplementary Table S2
Supplementary Table S3
