Abstract
Background:
Thyroid hormone is a prerequisite for proper fetal and postnatal neurodevelopment, growth, and metabolism. Although much progress has been made in the characterization of genes implicated in thyroid development and function, the majority of genes involved in this process are still unknown. We have previously applied serial analysis of gene expression (SAGE) to identify novel genes preferentially expressed in the thyroid, and this has resulted in the characterization of DUOX2 and IYD (also known as DEHAL1), two genes encoding essential enzymes in the production of thyroid hormone. In the current study we characterize the gene C16orf89, which is linked to another thyroid-specific SAGE tag, CCAGCTGCCT.
Methods:
We establish tissue-specific expression of C16orf89 using novel tissue-specific SAGE libraries and quantitative polymerase chain reaction. In addition, we characterize the C16orf89 gene and protein, and analyze its mRNA expression in response to thyrotropin and during mouse development.
Results:
C16orf89 is predominantly expressed in human thyroid tissue with a specificity intermediate between thyroid transcription factors and proteins involved in thyroid hormone synthesis. C16orf89 shows the same expression pattern as Nkx2-1 (thyroid transcription factor 1) from embryonic day (E) 17.5 onward in the developing mouse thyroid and lung. The developmental timing of C16orf89 mRNA expression is similar to that of the iodide transporter Slc5a5 (also known as Nis). Both transcripts are detected from E17.5 in the developing thyroid. This is clearly later than the onset of Tg mRNA expression (from E14.5), while Nkx2-1 and Iyd mRNA can already be detected in the E12.5 thyroid. In cell culture, C16orf89 expression is stimulated by thyrotropin. The major splice variant encodes a 361 amino acid protein that is well conserved between mammals, contains an N-terminal signal peptide, is secreted in a glycosylated form, and does not contain any known functional domain.
Conclusions:
We present a novel gene highly expressed in thyroid that encodes a currently enigmatic protein.
Introduction
The thyroid gland is the first endocrine gland to appear during embryonic development. It arises from a median endodermal thickening in the floor of the primitive pharynx that develops into a caudally growing diverticulum. Fetal thyroid hormone production initiates at embryonic day (E) 16.5 in mice and 10–12 weeks postconception in humans (2) within the thyrocytes that surround a follicular lumen. In addition, about 1% of the epithelial cell mass of an adult human thyroid gland consists of parafollicular calcitonin producing C cells that arise from cells within the ultimobranchial body (3).
Murine models have been essential for the delineation of thyroid morphogenesis, and experiments with knockout mice show that the transcription factors NKX2-1, FOXE1, PAX8, and HHEX are crucial for thyroid development. Although these transcription factors are also expressed in other tissues, they are coexpressed only in thyroid.
Apart from controlling thyroid development and growth, they also control expression of genes essential to the process of thyroid hormonogenesis (3,4). After proper development of the thyroid gland, the synthesis, storage, and secretion of thyroid hormones require a sequence of precisely tuned reactions, in which a large number of proteins and factors are involved. The provision of adequate amounts of the rare element iodide at the site of hormone synthesis at the apical membrane depends on the presence of the sodium-iodide symporter NIS (SLC5A5) at the basal membrane of the thyrocyte (5), while for iodide transport across the apical membrane pendrin (SLC26A4) is currently the most likely candidate (6). Subsequently, tyrosine residues present in the scaffold protein thyroglobulin (TG) (7) can be iodinated by thyroid peroxidase (TPO) (8) in the presence of H2O2 that is supplied by DUOX2 (9). Some of the iodinated tyrosine residues are coupled to form thyroid hormone, while residual iodide present in uncoupled iodotyrosines can be reused for thyroid hormone synthesis after release by iodotyrosine deiodinase (IYD, also known as DEHAL1) (10).
Mutations in genes involved in either thyroid development or thyroid function form the molecular basis of congenital hypothyroidism, which affects about 1 in 3000–4000 infants (11). The vast majority of cases, however, have an as-yet-unknown molecular defect (12–15). To identify novel genes involved in thyroid physiology, we have previously applied serial analysis of gene expression (SAGE) to human thyroid tissue (16). A computational subtraction approach identified thyroid-specific SAGE tags that could be linked to transcripts that were uncharacterized at the time (17). This approach has led to the identification of DUOX2 and IYD, two enzymes within the thyroid that are involved in thyroid hormone production. Subsequent studies established that defects in these enzymes are the molecular basis for congenital hypothyroidism (18,19). The current article describes the initial characterization of C16orf89, the gene linked to another thyroid-specific SAGE tag, CCAGCTGCCT.
Materials and Methods
Cell culture
HEK-293 cells (ATCC CRL-1573) were maintained in Dulbecco's modified Eagle's medium supplemented with 10% newborn calf serum and penicillin/streptomycin (Invitrogen, Carlsbad, CA). Rat thyroid PCCl3 cells (kindly provided by Dr. A. Fusco) were cultured as described (20) in H6/Coon's modified F12 medium (Autogen Bioclear, Nottingham, United Kingdom) supplemented with 10% newborn calf serum and penicillin/streptomycin (Invitrogen).
Quantitative polymerase chain reaction
High-purity total RNA from 1 × 10⁶ rat PCCl3 cells was isolated using the MagNA Pure LC RNA Isolation Kit High Performance (Roche, Almere, The Netherlands), and reverse transcribed using AMV First Strand cDNA Synthesis Kit for reverse transcriptase-polymerase chain reaction (RT-PCR) (Roche). For human tissues, cDNA was obtained from the Human Major Tissue qPCR Panel (Origene, Rockville, MD). Quantitative PCR (qPCR) was performed on a LightCycler 480 system (Roche) with reaction mixtures containing 2.5 μL cDNA, 0.4 μM of each primer, 100 nM UPL probe (Roche), and 5 μL Absolute qPCR mix (Thermo Scientific, Waltham, MA) in a total volume of 10 μL. Primers were designed using the Universal ProbeLibrary Assay Design Center (Roche) (Table 1).
Data were analyzed and quantified using the second derivative maximum for Cp determination with LightCycler 480 software 1.5.0 (Roche).
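The idea behind a second derivative maximum Cp can be illustrated with a minimal sketch. This is not the LightCycler 480 software's algorithm, which is proprietary and fits the measured curve; the simulated logistic amplification curve, the function names, and the parabolic refinement step below are all assumptions made for illustration only.

```python
import math


def logistic_curve(cycles, f_max, k, c_half):
    """Simulated fluorescence (hypothetical model): a logistic amplification curve."""
    return [f_max / (1.0 + math.exp(-k * (c - c_half))) for c in cycles]


def second_derivative_max_cp(cycles, fluorescence):
    """Estimate Cp as the cycle where the discrete second derivative peaks.

    Assumes one reading per cycle at unit spacing. A parabola through the
    peak and its two neighbours gives a fractional-cycle estimate.
    """
    # Central second differences, defined for interior cycles only.
    d2 = [
        fluorescence[i - 1] - 2.0 * fluorescence[i] + fluorescence[i + 1]
        for i in range(1, len(fluorescence) - 1)
    ]
    j = max(range(len(d2)), key=d2.__getitem__)
    cp = float(cycles[j + 1])  # d2[j] belongs to cycles[j + 1]
    if 0 < j < len(d2) - 1:
        denom = d2[j - 1] - 2.0 * d2[j] + d2[j + 1]
        if denom != 0.0:
            cp += 0.5 * (d2[j - 1] - d2[j + 1]) / denom
    return cp


cycles = list(range(1, 46))
# Two illustrative samples; the second reaches half-maximum 3 cycles later,
# as would be expected for roughly 8-fold less template at perfect efficiency.
cp_a = second_derivative_max_cp(cycles, logistic_curve(cycles, 10.0, 0.8, 25.0))
cp_b = second_derivative_max_cp(cycles, logistic_curve(cycles, 10.0, 0.8, 28.0))
print(round(cp_a, 2), round(cp_b, 2), round(cp_b - cp_a, 2))
```

Because the estimate depends only on the shape of the curve around its steepest acceleration, it does not require a user-chosen fluorescence threshold, which is the usual motivation for this style of Cp determination.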
In situ hybridization
Probes for in situ hybridization (ISH) were generated by PCR amplification of mouse C16orf89 (AU021092) nucleotides 234–612 (GenBank NM_001033220.1), Iyd nucleotides 299–663 (GenBank NM_027391.3), Tg nucleotides 104–506 (GenBank NM_009375.2), Slc5a5 nucleotides 1140–1570 (GenBank NM_053248.2), and Nkx2-1 nucleotides 2294–2715 (GenBank NM_009385.3). PCR fragments were cloned into pGEM-Teasy (Promega, Madison, WI) from which DIG-labeled RNA probes were generated using Roche's DIG RNA labeling kit (SP6/T7).
ISH was essentially performed as described (21). FVB mouse embryos were fixed in 4% paraformaldehyde, dehydrated in a graded alcohol series, and embedded in paraplast. Ten-micrometer-thick sections were mounted onto aminoalkylsilane-coated slides. Probe binding was observed using NBT/BCIP, according to the manufacturer's protocol (Roche). After color development, sections were rinsed in double-distilled water, dehydrated in a graded ethanol series, treated by xylene, and embedded in Entellan (Merck, Darmstadt, Germany).
Tandem affinity purification
Expression vectors with the C16orf89-coding region fused to the SBP/CBP tag (vector pCTAP; Stratagene, La Jolla, CA) or the Myc/His tag (vector pcDNA3-Myc/His; Invitrogen) were generated using standard molecular biological techniques. C16orf89-CTAP was transfected into HEK-293 cells using the Fugene6 protocol (Roche), and a stable C16orf89-CTAP expressing cell line was generated by G418 selection. Subsequently, C16orf89-Myc/His was transiently transfected into this cell line and tandem affinity purification was performed according to the manufacturer's instructions (Stratagene).
Deglycosylation
HEK-293 cells were transfected with pcDNA3-Myc/His using the Fugene6 protocol (Roche) generating a stable C16orf89-Myc/His-expressing cell line by G418 selection. Cells were lysed in 8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, and 0.05% Tween 20 (pH 8.0), and the His-tagged C16orf89 protein was purified using Ni-NTA Magnetic Agarose beads (Qiagen, Hilden, Germany). The culture medium was harvested and dialyzed against 8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, 0.01 M imidazole, and 0.05% Tween 20 (pH 8.0), and the His-tagged protein was purified using Ni-NTA Magnetic Agarose beads, dialyzed against double-distilled water at 4°C, and deglycosylated using the GlycoPro™ prO-LINK™ Extender Kit (Prozyme-GLYKO, Hayward, CA) according to the protocol provided by the manufacturer.
Western blot
Three grams of normal thyroid tissue was pulverized under liquid nitrogen, and homogenized by a 3-minute polytron treatment after adding 20 mL nondenaturing lysis buffer (50 mM Tris-HCl pH 7.4, 300 mM NaCl, 5 mM ethylenediaminetetraacetic acid, and 1% Triton X-100) with protease inhibitors (complete ethylenediaminetetraacetic acid-free protease inhibitor cocktail [Roche] and 1 mM phenylmethanesulfonyl fluoride [PMSF]). Protein samples were separated by sodium dodecyl sulfate–polyacrylamide gel electrophoresis and transferred to a 0.2-μm polyvinylidene fluoride membrane (Bio-Rad, Hercules, CA). For C16orf89, an affinity-purified rabbit antibody, generated against the peptides H2N-FSR-RVK-RRE-KQF-PDG-C-CONH2 and H2N-CNR-EPH-PST-PPP-PPS-R-COOH (Eurogentec, Seraing, Belgium), was used (see Fig. 1). The Myc tag was detected using a c-Myc Monoclonal antibody (Clontech, Mountain View, CA). Primary antibody binding was observed by horseradish-peroxidase-conjugated anti-mouse/rabbit secondary antibodies (Dako, Glostrup, Denmark) in combination with the Lumi-Light Western blotting kit (Roche).

Human C16orf89 mRNA and translated protein sequence. The major C16orf89 mRNA sequence (GenBank NM_001098514) is presented with the untranslated regions in lower case characters and the translated regions in upper case characters. The two potential translation start codons (ATG) and the translational stop codon (TGA) are shown in bold. The relevant putative Kozak sequences are indicated by a dashed line above the sequence. The SAGE tag is underlined. The corresponding translated protein sequence was determined using Vector NTI software (Invitrogen) and is presented in bold uppercase characters. Translation of C16orf89 most likely starts at the second methionine based on the observation that the in frame first ATG is not conserved in all species (see also Supplemental Fig. S1, available online at
Immunohistochemistry
Paraffin-embedded sections were treated with pepsin for antigen retrieval, and subsequently processed for immunohistochemistry using the EnVision-HRP/DAB (Dako) protocol as described (22). Monoclonal mouse anti-human antibody (M0781) was obtained from Dako.
Results
SAGE tag annotation
The tag CCAGCTGCCT is uniquely linked to the human gene C16orf89. According to the NCBI Entrez Gene database (23), C16orf89 contains eight exons, spans 22 kb, and encodes two alternatively spliced mRNAs. These transcripts differ at their 3′ end as a result of using an out-of-frame alternative splice acceptor site at the intron 7–exon 8 border. qPCR analysis indicates the existence of both transcript variants in human thyroid, but the variant designated as #2 in the NCBI database is by far (>78%) the most prominent transcript in human thyroid (data not shown). Compared to #2, #1 uses an alternative acceptor splice site upstream within intron 7. The putative in frame translation start site of both transcripts is not conserved in all species and is preceded by a rather poor Kozak sequence (24). When using the second in frame ATG start site, the major transcript encodes a 361 amino acid protein, while the minor transcript encodes a 402 amino acid protein. The expected molecular weights of the proteins are 40.6 and 45.4 kDa, respectively, and they both contain a putative signal peptide and O-glycosylation sites. A schematic outline of the human C16orf89 gene, mRNA, and protein is shown in Figures 1 and 2.

Schematic representation of the human C16orf89 gene, mRNA, and protein. According to the Entrez Gene database, the human C16orf89 gene (also known as MGC45438) contains eight exons (boxes in middle part of figure). At least two different mRNAs that differ in their 3′ end because of alternative splicing of exon 8 are generated from C16orf89: a minor transcript of 1.9 kb, using a more upstream exon 8 acceptor splice site, encoding a 402 aa protein (top), and a major transcript of 1.5 kb that uses a frame-shifting alternative acceptor splice site 358 bp downstream, encoding a 361 aa protein (bottom). The resulting proteins are identical in the N-terminal 319 aa, but differ in their C-terminus. The gray-shaded areas within the mRNA indicate the untranslated regions, and the position of the SAGE tag is indicated by the arrows.
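The arithmetic behind the two splice variants can be checked with a short sketch. It uses only the figures quoted in the caption (a 358 bp offset between acceptor sites, proteins of 402 and 361 aa, and 319 shared N-terminal residues); the function and variable names are hypothetical and the script is a consistency check, not a reconstruction of the annotation.

```python
def shifts_reading_frame(offset_bp: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return offset_bp % 3 != 0


ACCEPTOR_OFFSET_BP = 358   # distance between the two exon 8 acceptor sites (from the caption)
SHARED_N_TERMINAL_AA = 319  # residues common to both proteins (from the caption)
PROTEIN_LENGTH_AA = {"minor (1.9 kb)": 402, "major (1.5 kb)": 361}

print("offset mod 3:", ACCEPTOR_OFFSET_BP % 3)
print("frame shifted:", shifts_reading_frame(ACCEPTOR_OFFSET_BP))

# Residues unique to each C-terminus once the shared part is subtracted.
unique_c_terminal = {
    name: length - SHARED_N_TERMINAL_AA for name, length in PROTEIN_LENGTH_AA.items()
}
for name, n_unique in unique_c_terminal.items():
    print(f"{name}: {n_unique} variant-specific C-terminal residues")
```

Since 358 leaves a remainder of 1 when divided by 3, the downstream acceptor site necessarily changes the reading frame, which agrees with the description of the two proteins diverging after their common N-terminal portion. The 358 bp offset is also in line with the roughly 0.4 kb difference between the 1.9 kb and 1.5 kb transcripts.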
ClustalW analysis (
Tissue-specific expression
In the original thyroid SAGE library (26), SAGE tag CCAGCTGCCT was labeled as highly thyroid specific by the Tissue Preferential Expression algorithm (27). In silico reanalysis of tissue expression for SAGE tag CCAGCTGCCT using SAGE Digital Gene expression Displayer (28) and a second normal thyroid SAGE library (29) containing 115938 SAGE tags in which CCAGCTGCCT was present at the level of 129 Tags/million again demonstrated a high thyroid-specificity score. This was calculated by comparing the expression level of the SAGE tag in the normal thyroid library to 93 other normal human tissue libraries, resulting in a tag odds ratio of 128.74 (p < 0.0001) (Supplemental Table S1A, available online at

C16orf89 mRNA expression in human tissues. qPCR was performed on the Origene Human Major Tissue qPCR Panel. Values are the average C16orf89/PSMD4 copies from duplicate qPCR samples ± standard deviation. Numbers on the right of the bars represent the tags per million for the C16orf89 SAGE tag CCAGCTGCCT in the corresponding normal tissue SAGE libraries (
Analysis of the mouse ortholog for C16orf89 (AU021092, hereafter called mC16orf89) by RNA ISH at E12.5, 14.5, 17.5, and postnatal day (P) 70 demonstrated expression from E17.5 onward (Fig. 4 and Table 2). We could not detect any mC16orf89 signal at E14.5, while the thyroid-specific markers Nkx2-1, Tg, and Iyd are clearly visible at this stage. Remarkably, Iyd is clearly expressed in the E12.5 thyroid, a time when the mRNAs encoding other components involved in thyroid hormone production are still undetectable. In addition, the tracheal epithelium shows expression of both mC16orf89 and Nkx2-1, but no expression of Iyd. Since Nkx2-1 is known to be strongly expressed in the developing thyroid, trachea, and lung (30,31), we have also investigated mC16orf89 expression in lung. As shown in Figure 4B, Nkx2-1 and mC16orf89 are expressed at the same location in lung, with staining mainly in the bronchial epithelium.

mC16orf89 expression in thyroid and lung. In situ hybridization with sense mC16orf89, and antisense mC16orf89, Nkx2-1, Tg, Slc5a5, and Iyd probes on parallel E12.5, E14.5, and E17.5 cross sections of mouse thyroid (
E, embryonic day; P, postnatal day; +, present; −, absent; nd, not determined.
Regulation of expression
Rat PCCl3 cells were cultured without thyrotropin (TSH) and subsequently stimulated with TSH, forskolin, or dibutyryl–cyclic adenosine monophosphate (dbcAMP). qPCR showed that expression of the rat ortholog of C16orf89 (RGD1565166, hereafter called rC16orf89) increased in response to each treatment, in a pattern similar to but less pronounced than that of Tg and Tpo (Fig. 5).

TSH induces rC16orf89 expression in cultured rat thyroid cells. PCCl3 cells were maintained for 24 hours on the culture medium without TSH, and subsequently treated with 1 μIU/mL bovine TSH (Sigma), 10 μM forskolin (Fluka), or 1 mM dibutyryl–cyclic adenosine monophosphate (dbcAMP) (Sigma) for 0, 24, or 48 hours. mRNA levels for rC16orf89, Tg, and Tpo were determined by qPCR. Shown is the fold induction of mRNA expression (normalized to Hprt1 expression) over vehicle-treated (no add) cells. Values are the average normalized transcript copies from duplicate mRNA samples ± standard deviation.
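The fold induction described in this legend can be sketched as follows. The copy numbers are invented placeholders, not data from this study, and the function names are hypothetical; the sketch only shows the order of operations assumed here: each target value is divided by its reference gene (Hprt1) value, and the treated ratios are then expressed relative to the mean vehicle ratio.

```python
from statistics import mean, stdev


def normalized(target_copies, reference_copies):
    """Target transcript copies divided by reference gene copies, per sample."""
    return [t / r for t, r in zip(target_copies, reference_copies)]


def fold_induction(treated_target, treated_ref, vehicle_target, vehicle_ref):
    """Mean and standard deviation of fold induction over vehicle-treated cells."""
    vehicle_mean = mean(normalized(vehicle_target, vehicle_ref))
    folds = [x / vehicle_mean for x in normalized(treated_target, treated_ref)]
    return mean(folds), stdev(folds)


# Placeholder duplicate measurements (arbitrary copy numbers, illustration only).
fold_mean, fold_sd = fold_induction(
    treated_target=[400.0, 440.0],
    treated_ref=[100.0, 100.0],
    vehicle_target=[100.0, 100.0],
    vehicle_ref=[100.0, 100.0],
)
print(f"fold induction: {fold_mean:.2f} +/- {fold_sd:.2f}")
```

Normalizing to a reference transcript before taking the ratio to vehicle compensates for differences in RNA input and reverse transcription efficiency between samples, provided the reference gene itself is not affected by the treatment.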
C16orf89 protein characterization
The UniProt Protein Knowledgebase (
As shown in Figure 6A, C16orf89 protein can be detected in both cell lysate and medium of the C16orf89-Myc/His expressing HEK-293 cells. The molecular weight of the secreted protein is larger than that of the intracellular protein, which is due to glycosylation as demonstrated by the treatment with glycosidases, indicating that C16orf89 requires glycosylation before secretion.

C16orf89 protein characterization in thyroid and HEK-293 cells. Western blot analysis of (
To identify potential binding partners of C16orf89, we transfected HEK-293 cells with a C16orf89-CTAP vector, and performed a tandem affinity purification. Mass spectrometric analysis of copurified proteins eluted from a sodium dodecyl sulfate–polyacrylamide gel indicated that within HEK-293 cells C16orf89 mainly dimerizes with itself (data not shown). To verify this homodimerization of C16orf89, we generated a cell line stably expressing C16orf89-CTAP and transfected C16orf89-Myc/His into these cells. C16orf89-CTAP complexes were purified by tandem affinity purification and analyzed by immunoblotting. As shown in Figure 6B, C16orf89-Myc/His copurified with C16orf89-CTAP, demonstrating that C16orf89 forms homodimers.
To detect endogenous C16orf89, we raised polyclonal antibodies against two antigenic epitopes of C16orf89 that recognize both the native and the secreted glycosylated form of the C16orf89 protein on Western blot. Subsequent Western blot analysis of extracts from normal thyroid tissue identified only C16orf89 protein that in size corresponds to the nonglycosylated form of the protein (Fig. 6C), indicating that the glycosylated secreted form observed in transfected HEK-293 cells is not present in thyroid tissue.
Immunohistochemistry (Fig. 7) of normal thyroid tissue with anti-C16orf89 demonstrates uniform cytosolic/membranous staining of the follicular epithelium. In contrast to TG, the staining was exclusively in the epithelial lining of the follicles, and not in the colloidal content of the follicles (which was positive for TG). A comparable expression pattern was observed in the thyroidal epithelium of patients with Graves' disease and in the proliferating epithelial cells of papillary thyroid carcinoma.

C16orf89 protein expression in thyroid epithelial cells. Sections from human normal (
Discussion
We report the initial characterization of a novel protein C16orf89 that is transcribed from human chromosome 16p13.3 and mouse chromosome 16A1. C16orf89 is highly expressed in thyroid, with an expression level intermediate between the transcription factors involved in thyroid development (PAX8, FOXE1, and NKX2-1) and thyroid hormonogenesis (TG, IYD, and TPO). C16orf89 is also expressed in some other tissues such as pancreas, urethra, and lung.
Apart from an N-terminal cleavable signal peptide and several O-glycosylation sites, the 361 amino acid protein does not have any functionally annotated domains. A preliminary report (34) described features characteristic of the cystine knot three-dimensional structure, but there is currently no evidence to support this. Global alignment of orthologs shows that cystine knot consensus residues C2 (C16orf89 amino acid residue 192) and C3 (amino acid residue 196) flanking the three conserved amino acids (X-G-X) and forming part of the ring are not conserved (see Supplemental Fig. S1). Additionally, the amino acid residue preceding C3 is a tyrosine in all orthologs. Both features are in contrast with the consensus derived from all established cystine knot proteins (35).
It is intriguing to speculate that C16orf89 is essential to thyroid development or function and might form the molecular basis of a specific subtype of congenital hypothyroidism. As the protein does not contain any functionally annotated domains, we performed several experiments to elucidate whether the protein might be involved in either thyroid development or thyroid hormonogenesis. ISH has established that the traditional thyroid transcription factors are already expressed at E8.5 (3). The first factor directly involved in the process of thyroid hormone synthesis that comes up during mouse embryonal development is Iyd that is present at E12.5 in mouse thyroid (Fig. 4A). Subsequently Tg, Tpo, and Tshr are present at E14.5 (30,31) (Fig. 4), while our data show that mC16orf89 is not yet expressed at that stage. On the basis of these data we see no rational basis to expect a functional role for C16orf89 in thyroid development. Strikingly, we observe mC16orf89 expression in thyroid, trachea, and lung, tissues known to express Nkx2-1 during development.
The stimulation of endogenous rC16orf89 expression in the thyroid cell system PCCl3 by TSH, dbcAMP, and forskolin indicates that C16orf89 is at least in part under TSH control via the cAMP pathway, similar to TG and TPO (36–38), suggesting a role for C16orf89 in thyroid hormonogenesis. Although we see strong C16orf89 protein expression in Graves' thyroid and papillary thyroid carcinoma, it is premature to suggest that C16orf89 is upregulated under these conditions compared to the normal thyroid. HEK-293 cells expressing C16orf89 show that the protein is secreted in a glycosylated form and is able to form homodimers; we did not find any evidence for a heterodimerizing partner.
A functional role in thyroid hormonogenesis would fit with a location at the apical membrane where thyroid hormonogenesis takes place. Especially in the process of posttranslational modification of TG, internalization of hormone-containing TG across the apical membrane of the thyroid follicle, and release of TG-bound thyroid hormone, not all factors involved have been fully identified (39). Elucidation of these processes might establish the molecular basis of disease in patients who have a TG synthesis defect based on clinical and chemical criteria but no mutation in the TG cDNA (40). However, our studies show that C16orf89 is not a membrane-bound but a secreted protein. Immunohistochemical studies on thyroid sections using a polyclonal antibody raised against two antigenic epitopes of C16orf89 did not demonstrate the presence of any C16orf89 in the follicular lumen, rendering secretion to the circulation the most likely option.
Combined, the data present an interesting puzzle. On one hand, the protein is not likely to be involved in early thyroid development; on the other hand, the homodimerized glycosylated protein is secreted but not to the follicular lumen where it could play a role in thyroid hormonogenesis. This implies that C16orf89 must be secreted to the circulation where it most likely would need a receptor to perform its intended biological action. The functional role of C16orf89 remains an enigma, especially since patients suffering from thyroid agenesis do demonstrate a significant increase in morphological abnormalities compared to the control population (41), but do not show clinical evidence of a consistent lack of any endocrine factor apart from thyroid hormone deficiency.
Footnotes
Acknowledgments
We thank Mrs. C. de Gier-de Vries (Department of Anatomy and Embryology) for her technical assistance in setting up the ISH studies in our laboratory, and Dr. D. Speijer (Department of Medical Biochemistry) for performing the Mass Spectrometry (MS) analysis.
Disclosure Statement
The authors declare that they have no commercial associations that might create a conflict of interest in connection with this article.
