Abstract
Background:
Reported cutoffs for childhood thyrotropin (TSH) and free thyroxine (fT4) reference ranges vary widely, and knowledge on the determinants of childhood thyroid function is sparse. This study aimed to summarize the existing studies on thyroid function reference ranges in children. Furthermore, the objective was to investigate the determinants of childhood TSH and fT4 concentration in a population-based prospective cohort.
Methods:
First, to identify studies on childhood thyroid reference ranges, The National Library of Medicine's PubMed, Embase, Ovid Medline, Web of Science, and Google Scholar databases were systematically searched. Second, in a non-selected sample of 4273 children (median age 6.0 years, range 4.9–9.1 years) from the cohort, the associations of age, sex, anthropometric characteristics, ethnicity, maternal education, and time and season at venipuncture were studied with TSH and fT4 concentrations. The study also investigated to what extent between-individual variations in the determinants of TSH and fT4 could influence the calculation of reference ranges.
Results:
Published reference ranges for TSH and fT4 differ per age range and within age ranges (cutoffs low TSH: 0.13 to >1 mIU/L; high TSH: 2.36 to >10 mIU/L; low fT4: 7.0 to >10 pmol/L; high fT4: 15.5 to >30 pmol/L). In the present cohort, weight, sex, and ethnicity were determinants of TSH (p ≤ 0.03) and fT4 concentrations (p ≤ 0.01), and height and time at venipuncture were determinants of TSH only (p < 0.0001). The between-individual variation depending on clinical determinants for TSH ranged between 0.64 and 0.96 mIU/L (total population 0.87 mIU/L) for the lower limit and 4.30 and 5.62 mIU/L (total population 5.20 mIU/L) for the upper limit, whereas for fT4, the lower limit ranged between 13.6 and 14.2 pmol/L (total population 13.8 pmol/L) and the upper limit ranged between 20.2 and 23.0 pmol/L (total population 20.8 pmol/L).
Conclusions:
Considerable differences exist in the reported reference ranges for childhood TSH and fT4 across and within age ranges and assays. The present cohort shows only a minimal association between TSH and fT4, suggesting that the hypothalamus–pituitary–thyroid axis remains unaffected by thyroid interfering factors. Various determinants of TSH and fT4 in children were identified, which accounted for a considerable variation of reference range cutoffs.
Introduction
In order to diagnose thyroid disease, adequate reference ranges for thyrotropin (TSH) and free thyroxine (fT4) are essential. The guidelines of the European Thyroid Association for the management of subclinical hypothyroidism in children recommend the use of age-related normative values (7). However, there is no further consensus on the definition of TSH and fT4 reference ranges during childhood. This complicates the interpretation of thyroid function tests and clinical diagnosis of thyroid dysfunction, as is illustrated for example by the wide range of TSH cutoffs (between 5.0 and 10 mIU/L) currently used to define subclinical hypothyroidism during childhood (7).
Multiple studies have been performed to define thyroid function reference ranges in pediatric populations (8–43). Although some studies adhere to the recommendations by the International Federation of Clinical Chemistry, there is considerable between-study heterogeneity, as studies have been conducted across various age ranges, using different assays, and in populations comprising different ethnicities and living under different geographical conditions. It is currently unknown to what extent between-study variations in methodology and between-population differences in thyroid function determinants add to this heterogeneity.
Although some studies have indicated that TSH and fT4 reference ranges are influenced by child age, Tanner stage, ethnicity, anthropometric characteristics, and/or iodine intake, data on determinants of thyroid function during childhood are sparse (22,43,44). Further knowledge on determinants of TSH and fT4 concentrations during childhood may help to identify specific causes that underlie an abnormal test result. In addition, such knowledge enables physicians to assess the generalizability of described reference ranges to a specific patient population. With regard to the research setting, knowledge on determinants is important in order to define mediating and/or confounding factors that can influence studies on the effects of thyroid function on clinical outcomes.
The aim of the current study was to assess and summarize systematically the current literature on thyroid function reference ranges during childhood in order to create a general overview of TSH and fT4 reference ranges during childhood. Subsequently, in a large, iodine-sufficient pediatric population, the study aimed to investigate which clinical characteristics are determinants of thyroid function and to quantify to what extent these determinants affect reference ranges for TSH and fT4.
Methods
Literature overview
A systematic literature search of The National Library of Medicine's PubMed, Embase, Ovid Medline, Web of Science, and Google Scholar databases was performed to identify studies published from inception until November 18, 2016 (search terms are outlined in the Supplementary Appendix; Supplementary Data are available online at
Original study
Design and participants
This study was embedded in The Generation R Study, a population-based prospective cohort from early fetal life onwards in Rotterdam, the Netherlands. This study has been described in detail elsewhere (45). In total, all children with consent for follow-up during childhood (N = 8305) were invited to visit the research center, of whom 6674 children attended. After consent by the mother and child, serum samples were obtained for 4593 children, and TSH and/or fT4 concentrations were determined in 4286 samples with adequate serum volumes. Children with known thyroid disease, chronic illness (endocrine, inflammatory, autoimmune, cancer, or kidney disease) or thyroid (interfering) medication usage (levothyroxine or growth hormone) were excluded (n = 13), resulting in a final population of 4273 children (median age 6.0 years, range 4.9–9.1 years).
Determinants and covariates
Potential determinants were selected based on the literature, biological plausibility, and data availability (13,22,46). These included age, sex, ethnicity, height, weight, maternal education level (as a marker of social economic status), and time and season of venipuncture. Information on these determinants was obtained by questionnaires and measurements during the visit to the research center (on the same day as blood sampling). Medical history was assessed by questionnaires, and answers were verified by certified medical doctors. Information on maternal education level was obtained through postal questionnaires. Child ethnicity was determined by the country of origin of the child and/or parents and was defined according to the classification of Statistics Netherlands and categorized according to the major ethnic groups in Rotterdam (45). These were: Dutch, Turkish, Moroccan, Surinamese, Dutch Antilles, African/Cape Verdean, other Western (European, Oceanian, and Caucasian descent Americans/Asians), and other non-Western.
Procedures
Plain tubes were centrifuged, and serum was stored at −80°C. Child TSH and fT4 concentrations were determined using an electrochemiluminescence immunoassay on the Cobas e601 immunoanalyzer (Roche Diagnostics, Mannheim, Germany). The intra- and inter-assay coefficients of variation were 1.1–3.0% for TSH (range 0.4–0.04 mIU/L) and 1.6–5.0% for fT4 (range 1.6–24.1 pmol/L).
Statistical analyses
Reference ranges for TSH and fT4 in the Generation R Study were defined by the 2.5th and 97.5th percentiles. For analyses aimed to identify thyroid function determinants, TSH concentrations were log transformed to adhere to model assumptions (back-transformed values are displayed in graphs to allow for better interpretation). Multiple linear regression models were used to investigate the association between the determinants and childhood TSH or fT4. Nonlinearity of the association between continuous variables and childhood TSH or fT4 concentrations was investigated by ordinary least squares linear regression models with restricted cubic splines utilizing three to five knots. As a sensitivity analysis, the analyses were repeated after exclusion of children with TSH or fT4 concentrations outside of the 95% range to investigate the effect of potential data outliers. To study the effects of highest and lowest values of TSH and fT4 determinants on the TSH and fT4 reference ranges, the 95% range for both TSH and fT4 was calculated at the highest 10% and lowest 10% of each determinant. Multiple imputation was used to cope with missing data for determinants/covariates. The multiple imputation model included maternal education level, ethnicity of the child, height, weight, age, sex, time of venipuncture, and season (missing data in 15.4%, 2.5%, 0.2%, 0.2%, and for the rest, 0%, respectively), and TSH and fT4 concentrations were used as prediction variables only. Five imputed data sets were created and pooled for further analyses. There were no statistically significant differences between the original and imputed data sets. All analyses were performed using SPSS Statistics for Windows v20.0 (IBM Corp., Armonk, NY) and R statistical software v3.03 (“rms” package).
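The percentile-based definition and the stratification at the highest and lowest 10% of a determinant can be illustrated with a minimal sketch. This is not the analysis code of the study (which used SPSS and R); the function names, the interpolation rule for percentiles, and the synthetic weight and TSH values are assumptions made purely for illustration.

```python
import math
import random


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def reference_range(values):
    """Nonparametric reference range: 2.5th and 97.5th percentiles."""
    return percentile(values, 2.5), percentile(values, 97.5)


def stratified_ranges(records, determinant, analyte):
    """Reference range of `analyte` within the lowest and highest 10% of `determinant`."""
    det_values = [r[determinant] for r in records]
    low_cut = percentile(det_values, 10)
    high_cut = percentile(det_values, 90)
    lowest = [r[analyte] for r in records if r[determinant] <= low_cut]
    highest = [r[analyte] for r in records if r[determinant] >= high_cut]
    return {
        "lowest 10%": reference_range(lowest),
        "highest 10%": reference_range(highest),
    }


# Synthetic data for illustration only (hypothetical values, not study data):
# weight in kg, TSH in mIU/L drawn from a right-skewed (log-normal) distribution.
rng = random.Random(0)
records = []
for _ in range(4000):
    records.append({
        "weight": rng.gauss(23.0, 4.0),
        "tsh": math.exp(rng.gauss(math.log(2.3), 0.45)),
    })

overall = reference_range([r["tsh"] for r in records])
by_weight = stratified_ranges(records, "weight", "tsh")
print("overall TSH reference range:", overall)
print("TSH reference range by weight stratum:", by_weight)
```

Because the synthetic TSH values are generated independently of weight, the two strata differ only through sampling variation; in real data, a true determinant would shift the stratum-specific limits.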
Results
Literature overview
The systematic search yielded 4704 studies of potential interest, of which 4620 studies were excluded after assessment of the title and the abstract. After further examination, 35 studies were finally included for extraction of the data on TSH and/or fT4 reference ranges (Supplementary Fig. S1). An overview of all included studies and reference ranges for various age categories is shown in Supplementary Tables S1–S8. In general, the variability of the reported upper and lower reference range limits for TSH and fT4 was highest in the first week of life and decreased with increasing age of the study population. Similar effects were observed in studies with longitudinal data on TSH and fT4 reference ranges (22–28). There were considerable differences in the reported lower and upper limits of TSH and fT4, as is shown in Table 1. Taken together, in children aged ≥1 year, the lower limit for TSH ranged between 0.32 and 1.30 mIU/L, while the upper limit for TSH ranged from 2.36 to 6.57 mIU/L (Table 1). Furthermore, in children aged ≥1 year, the lower limit for fT4 ranged between 7.0 and 18.0 pmol/L, and the upper limit for fT4 ranged between 15.5 and 34.7 pmol/L (Table 1).
Reference ranges derived from 2.5th and 97.5th percentiles. Reference ranges that were calculated in populations with overlapping age ranges were counted for the category with most overlap. Derived from the data extracted from the reviewed studies.
TSH, thyrotropin; fT4, free thyroxine.
Original study
Initially, the study investigated which child characteristics are determinants of thyroid function. Subsequently, the aim was to identify to what extent differences in such determinants between populations underlie the large between-study differences in reference range limits for TSH and fT4. Descriptive characteristics of the study population are shown in Supplementary Table S9. After exclusions, the final study population comprised 4273 children (Fig. 1). There were no considerable differences in characteristics between children with or without data available on TSH or fT4 concentrations (Supplementary Table S10). Child serum samples were obtained at a median age of 6.0 years (95% range 5.7–8.0 years), and the majority of subjects were of Dutch origin (57.8%; Supplementary Table S9). The number of drawn samples was equally distributed throughout the year, and samples were on average taken in the afternoon (median time 14:02 h, 95% range 11.17–5.17 h; Supplementary Table S9). The median and reference range (2.5th–97.5th percentile) for TSH concentrations were 2.30 and 0.87–5.20 mIU/L, respectively (Table 2). The median and reference range (2.5th–97.5th percentile) for fT4 were 16.8 and 13.8–20.8 pmol/L, respectively (Table 2). There was a negative, nonlinear association of fT4 with TSH, exhibiting a stable TSH concentration across fT4 concentrations ranging between roughly 12 and 18 pmol/L (Fig. 2).

Flow chart showing selection procedure of the study population.

Plot shows the association of free thyroxine (fT4) with thyrotropin (TSH) in children (median age 6 years, 95% range 5.7–8.0 years) with corresponding confidence interval, adjusted for age, sex, ethnicity, height, weight, time at venipuncture, season, and maternal education.
Data shown are the 95% range of TSH and fT4 in the highest and lowest 10% values of the determinants.
Data are shown as median 10% versus highest 10% and lowest 10%, due to a nonlinear association of time at venipuncture with TSH.
Assessment of thyroid function determinants
Boys had a higher TSH concentration than girls (Fig. 3; p < 0.0001). TSH differed according to ethnicity, with the lowest concentration in children of Dutch Antilles origin and the highest concentration in Dutch children (Fig. 3; p = 0.0003). Height was negatively associated with TSH (p < 0.0001), and there was a positive linear association of weight with TSH (Fig. 3; p = 0.03). There was a U-shaped association of time at venipuncture with TSH, with higher TSH concentrations during the morning and late afternoon (Fig. 3; p < 0.0001). Age, maternal education, and season at venipuncture were not associated with TSH concentrations (Fig. 3, p = 0.90; Supplementary Fig. S1, p = 0.20 and p = 0.15, respectively).

Different biological determinants and their association with TSH with corresponding confidence interval. Every association has been adjusted for the remaining determinants and further adjusted for season of the year and maternal education status (supplement). OnW, other non-Western ethnicities; Afr/Ca, African and Cape Verdean; Moro, Moroccan; DuAnt, Dutch Antilles; Surinam, Surinamese; Turk, Turkish; OthWes, other Western ethnicities.
Boys had a lower fT4 concentration than girls (Fig. 4; p < 0.0001). fT4 differed according to ethnicity, with the lowest concentration in Dutch children and the highest concentration in children of non-Western or Surinamese origin (Fig. 4; p < 0.0001). There was a negative linear association of weight with fT4 concentrations (Fig. 4; p = 0.002). There was a nonlinear association of age with fT4 concentrations in which fT4 was higher at the lower age range (Fig. 4; p = 0.01). Season at venipuncture was associated with fT4, with the highest fT4 concentration during autumn, and higher maternal education was associated with lower fT4 (Supplementary Fig. S1; p = 0.006 and p < 0.0001, respectively). Height and time at venipuncture were not associated with fT4 (Fig. 4; p = 0.10 and p = 0.23, respectively). All results remained similar after exclusion of children outside of the 95% reference range for TSH and/or fT4 or when age-standardized values for height or weight were studied.

Different biological determinants and their association with fT4 with corresponding confidence interval. Every association has been adjusted for the remaining determinants and further adjusted for season of the year and maternal education status (supplement).
Subsequently, reference ranges were stratified according to the studied thyroid function determinants and their highest and lowest values (10% and 90% cutoffs; Table 2). The lower limit of TSH in this study ranged between 0.64 and 0.96 mIU/L (total population 0.87 mIU/L) according to between-individual variation in clinical determinants (Table 2). The upper limit ranged between 4.30 and 5.62 mIU/L (total population 5.20 mIU/L; Table 2). For fT4, the lower limit ranged between 13.6 and 14.2 pmol/L (total population 13.8 pmol/L), and the upper limit ranged between 20.2 and 23.0 pmol/L (total population 20.8 pmol/L), according to between-individual variation in clinical determinants (Table 2).
Sensitivity analyses were performed to examine whether the association of the various determinants with TSH and fT4 concentrations differed by ethnicity; no relevant effect modification was identified (data not shown).
Discussion
The current study provides a literature overview of published reference ranges for thyroid function in children, demonstrating large differences in the reported reference ranges for TSH and fT4 during childhood. Differences were present across different age categories, between studies using different assays, as well as between studies utilizing a similar assay. Subsequently, in the present population-based cohort from an iodine-sufficient area, child age, sex, ethnicity, and various anthropometric characteristics were identified as thyroid function determinants. Already within this population, between-individual variation in a single clinical determinant accounted for variation in the lower and upper cutoffs of 0.64–0.96 mIU/L and 4.30–5.62 mIU/L for TSH, and 13.6–14.2 pmol/L and 20.2–23.0 pmol/L for fT4.
In both the clinical as well as the research setting, there is very little consensus on how to define abnormal thyroid function and reference ranges in children (7). For example, some studies included in the present literature overview defined pediatric reference ranges for thyroid function using a nonparametric approach utilizing the 2.5th–97.5th range or the 5th–95th range to define a normal TSH or fT4, while others used a (semi-) parametric approach defining normality based on ±1.96 or two standard deviations from the mean (25). Such methodological differences are the most likely cause of the large differences in pediatric reference ranges for TSH and fT4 in the literature, as shown in the literature overview. These differences hamper translation of research findings to the clinical setting and also affect the accuracy of the literature summary. The findings demonstrate the need for standardization of reference range methodology in this field, and suggest that further studies are required to optimize clinical diagnosis of thyroid disease in children.
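To make the methodological contrast concrete, the following minimal sketch computes a nonparametric 2.5th–97.5th percentile range and a parametric mean ± 1.96 SD range on the same right-skewed sample. The data are synthetic and the distribution parameters are assumptions chosen only to resemble a skewed analyte such as TSH; the log-scale variant is included as one common way of handling skewness and is not taken from any of the reviewed studies.

```python
import math
import random
import statistics


def nonparametric_range(values):
    """2.5th and 97.5th percentiles (first and last of the 39 cut points for n=40)."""
    cuts = statistics.quantiles(values, n=40)
    return cuts[0], cuts[-1]


def parametric_range(values, z=1.96):
    """Mean +/- z standard deviations, assuming an approximately normal distribution."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - z * sd, mean + z * sd


def log_parametric_range(values, z=1.96):
    """Parametric range computed on the log scale and back-transformed."""
    lo, hi = parametric_range([math.log(v) for v in values], z)
    return math.exp(lo), math.exp(hi)


# Synthetic, right-skewed values (hypothetical; not data from any study).
rng = random.Random(42)
tsh = [rng.lognormvariate(math.log(2.3), 0.45) for _ in range(4000)]

nonparametric = nonparametric_range(tsh)
parametric = parametric_range(tsh)
log_parametric = log_parametric_range(tsh)

print("nonparametric 2.5th-97.5th:", nonparametric)
print("mean +/- 1.96 SD (raw scale):", parametric)
print("mean +/- 1.96 SD (log scale, back-transformed):", log_parametric)
```

On skewed data such as these, the raw-scale parametric lower limit falls well below the percentile-based lower limit, which illustrates how the choice of method alone can move a cutoff.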
Apart from differences in the methodology of calculating reference ranges, the study size, study population selection, and exclusion of individuals with major disease known to affect thyroid function may also play an important role (47). Many studies identified through the literature overview lack a sufficiently sized population to generate appropriate reference ranges for different age intervals. Although a minimum of 120 subjects is often proposed to define reference ranges, this is only recommended as an absolute minimum for the calculation of nonparametric 90% coverage intervals (e.g., 5th and 95th percentile reference ranges) (48–51). However, because of the high inter-individual variability and skewness of TSH and to some extent also fT4, a minimum of approximately 400 individual measurements per partition is required for these measurements (48–51). As shown in the literature overview, 14 studies presented data derived from <100 measurements (11,15,18,20,21,23,24,26,28,31–33,38,42).
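The effect of sample size on the stability of an estimated reference limit can be explored with a small simulation. The sketch below is illustrative only: it repeatedly draws samples of 120 and 400 values from an assumed log-normal distribution (hypothetical parameters) and compares how much the estimated 97.5th percentile varies between samples. It does not reproduce the derivation behind the recommendations cited above.

```python
import math
import random
import statistics


def upper_limit(values):
    """Estimated 97.5th percentile (last of the 39 cut points for n=40)."""
    return statistics.quantiles(values, n=40)[-1]


def spread_of_upper_limit(sample_size, repeats=500, seed=1):
    """Standard deviation of the estimated upper limit across repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        sample = [
            rng.lognormvariate(math.log(2.3), 0.45)  # assumed skewed distribution
            for _ in range(sample_size)
        ]
        estimates.append(upper_limit(sample))
    return statistics.stdev(estimates)


spread_120 = spread_of_upper_limit(120)
spread_400 = spread_of_upper_limit(400)
print("SD of estimated upper limit, n=120:", spread_120)
print("SD of estimated upper limit, n=400:", spread_400)
```

Under these assumptions the estimated upper limit fluctuates more between samples of 120 than between samples of 400, consistent with the argument that small partitions yield imprecise cutoffs.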
Another important determinant of the large differences in reference ranges for TSH and fT4 concentrations is the assay that is used. While most studies used an immunoassay, some studies used equilibrium dialysis and/or liquid chromatography–mass spectrometry (8,20). However, even when similar assays were used, large between-study differences were present. For comparison, for the five- to eight-year-old children from the cohort study, the reference range was 0.87–5.20 mIU/L for TSH. Studies that used a similar assay report a TSH reference range that lies anywhere between 0.48 and 5.66 mIU/L (8,12,22). This variation may also suggest that differences in population characteristics can account for some of the between-study differences in thyroid function reference ranges. Although characteristics such as child age and anthropometric data have previously been identified as determinants of thyroid function, it is unknown to what extent these may affect reference ranges (22,43,44). This population-based cohort study shows that already within a population representing children from a small geographical area, lower and upper cutoffs for TSH may vary up to 0.21 mIU/L and 0.93 mIU/L (up to 21% and 31%), respectively, according to a single thyroid function determinant. For lower or upper fT4 cutoffs, this variation was much lower (0–2.5 pmol/L, or up to 4.3% and 12.0%, respectively). This larger variation according to determinants in TSH than in fT4 may reflect only mild alterations in the hypothalamic–pituitary–thyroid axis (HPTa). In the vast majority of young children, it is likely that the HPTa is not yet subjected to pathophysiological processes such as development of toxic nodules or thyroid autoimmunity. This is supported by the results from the present cohort study, showing a stable association of fT4 with TSH. 
Therefore, it is speculated that the majority of the between-individual differences in TSH and fT4 presented in this study are more likely caused by differences in the HPTa set point, for example based on genetic variation (52). In order to clarify the explained variability in thyroid function further, genetic studies in children could thus prove to be valuable. The higher TSH values in boys compared to girls that were identified in this study differ from study results in adults, in which women tend to have higher TSH values compared to men (53). These differences could perhaps be explained by genetic differences, which cause more prominent TSH differences at an age with a lower prevalence of thyroid disorders, or, alternatively, slight sex differences could be present in the maturation of the HPTa. Another relevant difference in determinants of TSH and fT4 as identified in the current study is that height was associated with TSH but not fT4, while weight was associated with both TSH and fT4. This may indicate that the association of height with TSH is perhaps caused by genetic pleiotropy, implicating the existence of genes affecting both the HPTa set point, as well as height, while weight is more likely to interfere with the HPTa (54). Although the fat component of weight could potentially increase TSH and decrease fT4 via higher leptin concentrations and higher thyroxine-binding globulin concentrations (54), respectively, it was demonstrated that lean body mass but not fat mass is associated with fT4 concentrations in a previous study (55). Since it is likely that the association of body composition with thyroid function is at least partly subject to reverse causation (56), further studies are needed to clarify the underlying mechanisms of the results.
The current literature overview provides a detailed summary of the existing studies on the thyroid function reference ranges during childhood. Furthermore, clinical determinants of thyroid function were studied in a large prospective population-based cohort of children living in an iodine-sufficient area. A limitation of this study is the fact that the population comprised a relatively narrow age range. It is therefore not possible to extrapolate the results to other age categories. These data should be collected in the future. In addition, venipuncture in this study was mostly done in the afternoon (median time 14:02 h, 95% range 11:20–17:20 h) in a non-fasting state. In clinical practice, determining thyroid function usually occurs in the morning in a fasting state. Current literature about childhood reference ranges does not often consider the time of venipuncture or the fasting state (9,14,19,22–28,31,36,41,42), which makes the comparison between studies challenging. Furthermore, several studies have determined reference ranges in different fasting and non-fasting states (13,21,29,33), which could potentially complicate the interpretation of reference ranges even further. The lower concentrations of TSH in the present study when the time of venipuncture was in the early afternoon could be partially mediated by food intake, as free triiodothyronine concentrations rise and TSH concentrations decrease postprandially (57). However, the design of this study was not adequate for investigating postprandial thyroid function changes, as breakfast or lunch was not consumed at defined times, and snacks were provided to the children during the visit. In addition, contrary to some studies (13,21,29,33), the majority of studies on childhood thyroid function reference ranges do not consider the time of venipuncture or the relation to fasting and feeding (9,14,19,22–28,31,36,41,42).
Based on the present results on the impact of the time of venipuncture, this makes the comparison of thyroid function tests between studies and between individuals in a clinical setting more challenging. Another potential limitation of the cohort study is that data on thyroid peroxidase antibodies (TPOAbs) are not available. However, it is unlikely that the relatively short exposure to thyroid autoimmunity in children with a median age of six years already affects thyroid function. This is illustrated by the fact that lower fT4 concentrations were not associated with higher TSH concentrations in the cohort study. Moreover, it has been shown that thyroid function in children with positive serum TPOAbs is not lower than in TPOAb-negative children (58). Finally, the observational nature of the population-based cohort study leaves the possibility of residual confounding and the uncertainty about causality within studied associations.
In conclusion, the current literature overview demonstrates large heterogeneity in published pediatric thyroid function reference ranges. In the population-based cohort study, a minimal association of TSH and fT4 was demonstrated, suggesting that the HPTa in children is still unaffected by thyroidal pathological processes. It is also shown that child age, sex, ethnicity, anthropometric data, and time of venipuncture are determinants of TSH and/or fT4 concentrations, and that between-individual variations in these determinants can influence the calculation of reference ranges. The identification of these determinants and quantification of their effects can help with the interpretation of thyroid function tests. Future efforts should focus on generating evidence-based recommendations to define abnormal thyroid function in children, in order to tackle the large heterogeneity in the current literature.
Acknowledgments
The contributions of the endocrine laboratory technicians are highly appreciated. The Generation R study is conducted by the Erasmus Medical Center (Rotterdam) in close collaboration with the School of Law and Faculty of Social Sciences of the Erasmus University Rotterdam; the Municipal Health Service Rotterdam area, Rotterdam; the Rotterdam Homecare Foundation, Rotterdam; and the Stichting Trombosedienst and Artsenlaboratorium Rijnmond, Rotterdam. We gratefully acknowledge the contribution of children and parents, general practitioners, hospitals, midwives, and pharmacies in Rotterdam. The general design of the Generation R Study is made possible by financial support from the Erasmus Medical Center, Rotterdam; the Erasmus University Rotterdam; The Netherlands Organization for Health Research and Development; The Netherlands Organisation for Scientific Research; the Ministry of Health, Welfare, and Sport; and the Ministry of Youth and Families. Furthermore, we gratefully acknowledge W.M. Bramer for his contribution to the systematic search through different libraries in order to find eligible studies for the systematic review.
This work was supported by a clinical fellowship from ZonMw, project number 90 700 412 (to R.P.P.) and by a fellowship from ERAWEB, a project funded by the European Commission (to M.B.).
Author Disclosure Statement
The authors have nothing to disclose.
