Abstract
Background:
Biorepositories facilitate research and clinical studies in many settings. Modern biobanks use state-of-the-art storage methods and low temperatures, while many older collections of biospecimens have been stored at less optimal temperatures. The Janus Serum Bank Cohort in Norway holds over 700,000 serum samples collected decades ago and stored at −25°C. To gain insight into the stability of serum components at −25°C over prolonged storage, we measured a panel of 15 serum components at 7 storage times of up to 108 months.
Method:
A selection of analytes (proteins, an enzyme, electrolytes, small molecules, hormones, lipids, and a vitamin) was measured in serum from 40 anonymous donors. The serum components were measured in fresh samples and after 3, 6, 12, 24, 36, 72, and 108 months in storage at −25°C. We tested for variations using analysis of variance and paired-samples t-tests and performed trend analyses of serum component levels against storage time.
Results:
All measured serum components showed differences in values for at least one of the timepoints. Trend analyses identified significantly decreasing levels for nine components, whereas four components showed significantly increasing levels. Two components did not show significant trends.
Conclusion:
Storage of serum at −25°C may result in changes in serum analyte levels over time. We cannot exclude that batch effects of assay kits, laboratory instrument changes, and changes in standards contributed to the observed differences. To mitigate the influence of increasing storage time, storage time should be used as a matching criterion for control samples included in research projects.
Introduction
Collections of biological material are important resources in medical research. Exploitation of biorepositories allows for identification of biomarkers for exposures, early detection of diseases, or treatment side effects. Such biomarkers can provide important insights into the development and manifestation of diseases and medical conditions. 1 Recently, increasingly large population-based biorepositories, such as the UK Biobank, 2 have been systematically established, following standard procedures to ensure the best quality of the stored samples. However, many of today’s older biorepositories originate from historical collections in disease-specific biobanks and smaller population-based research biobanks. Many of these older biorepositories are valuable because they have a long follow-up time with an increasing proportion of donors with specific disease outcomes relevant for research projects. Compared with newly established systematic biobanks with relatively short follow-up time, the older biobanks have the benefit of access to historical samples with declared outcomes. Furthermore, prospective biobanks collecting samples from donors with a specific diagnosis of interest or at risk for an event may take years to collect adequate numbers of samples. Thus, a common requirement for all biobanks is long-term sampling and intermediate storage of collected samples, and it is therefore important to have knowledge about the quality parameters impacting the storage.
Earlier studies of storage time effects on serum-based markers have shown that while most serum components appear stable over time in storage, time-related differences are observed for some markers, even over shorter storage periods.3–6 These studies assessed the stability of markers over periods of months or years and at different storage temperatures between −20 and −80°C.
In the Janus Serum Bank, over 700,000 serum samples from 318,628 donors have been stored at −25°C for up to 50 years. 7 Previously, we have shown that 10 repeated freeze–thaw cycles affected 3 of 15 assessed serum components (bilirubin, sodium, and thyroid-stimulating hormone). 8 Furthermore, a study comparing storage at −20°C versus −80°C revealed that 15 out of 193 assessed serum components were affected by the difference in temperature, with a median storage time of 4.2 years. 9 In addition to the effect of putative freeze–thaw cycles and storage temperature, the time in storage is an important factor to consider for the sustainability of biorepositories. The impact of time in storage on analyte stability has been investigated in both human and animal studies. A study by Cray et al. showed that common biochemical analytes in rat serum were generally stable for up to 7 days when refrigerated and that storage at −70°C resulted in only modest changes in analyte levels even after 360 days, whereas prolonged storage at −20°C resulted in many changes in the analytes measured: after 360 days at −20°C, Cray et al. observed significant declines in enzyme activities as well as less extensive, though significant, changes for the analytes CO2 and calcium. 10 A study by Herrington et al. on human urine samples reported that the albumin-to-creatinine ratio remained stable after 6 months of storage at −80°C and −40°C but decreased after 6 months of storage at −20°C. 11 Kleeberger et al. performed a study of serum samples stored in vapor-phase nitrogen. 5 For 11 out of 21 measured serum components, this study reported significantly different measurements between the baseline collection and analysis after 6 years.
However, after accounting for a potential systematic underrecovery of one analyte, the assay-specific internal control variance, coefficients of variation, and direction of effect, four analytes showed strong evidence of systematic degradation. 5 Most of the earlier studies have only a single follow-up time, comparing a baseline sample with one follow-up sample. The dynamics of storage time effects are therefore not well investigated. To obtain insights into the effects of storage time at −25°C, we performed seven repeated measurements over up to 9 years in storage for 15 biochemical markers in serum. We included electrolytes, metabolites, proteins, an enzyme, lipids, small molecules, hormones, and a vitamin in the panel of markers.
Materials and Methods
We included blood samples from 40 consenting men and women aged between 30 and 60 years. Whole blood was collected in eight 7 mL vials (BD Vacutainer, Cat. No. 367615) from each individual donor, and the tubes were inverted 10 times. The blood was then coagulated for 60 minutes before centrifugation at 1300 × g for 10 minutes at room temperature. For each donor, serum was pooled into a single vial and thoroughly mixed by rolling and pipetting and divided into seven 1.0 mL aliquots in 1.5 mL polypropylene screwcap vials (Sarstedt, Cat. No. 72.730.406). Six aliquots were stored together in colocalized cardboard boxes in a monitored and alarmed freezer facility at −25°C until measurement. There were no critical temperature deviations in the storage during the study. One fresh sample was used for biochemical analyses for a panel of 15 serum components (Table 1). These 15 components were subsequently measured again in an untouched aliquot, which was thawed for 35 minutes on a benchtop roller before being submitted for analyses after 3, 6, 12, 24, 36, 72, and 108 months of storage at −25°C (Fig. 1). All analyses were performed within 8 hours of submission.
Table 1. Analytical Methods and Assay-Specific Coefficients of Variation at Different Component Levels for 15 Serum Components
Relevant instrument and assay changes are listed in Supplementary Table S2.

Fig. 1. Flowchart of sample collection and measurement times. The 3-month thawed sample T1 is used as the reference sample for all subsequent measurements.
The panel of 15 serum components was measured using a Cobas Biochemical analyzer (Roche Diagnostics) and comprised proteins (albumin, immunoglobulin G, and C-reactive protein); an enzyme (aspartate aminotransferase); electrolytes (sodium and potassium); small molecule components (creatinine, glucose, total bilirubin, and urea); hormones (testosterone and thyroid-stimulating hormone); lipids (cholesterol and triglycerides); and a vitamin (B12). The analyses were performed at the Department of Medical Biochemistry at Oslo University Hospital HF, Rikshospitalet, which is accredited according to NS-EN ISO 15189. For each timepoint, samples were thawed and analyzed in a single batch with internal controls for each assay. The analytical method and coefficient of variation at corresponding component levels for each serum component are provided in Table 1. Relevant equipment changes, restandardization, and reagent changes are provided in Supplementary Table S2.
For values below the lower limit of quantification (Lloq, provided in Table 2), we replaced the reported value with the assay-specific Lloq/√2. This applied to aspartate aminotransferase (3 replacements for 1 donor); bilirubin (13 replacements for 4 donors, of which 1 donor at all measurement points); C-reactive protein (117 replacements for 17 donors, of which 13 donors at all measurement points); testosterone (27 replacements for 8 donors, of which 2 donors at all measurement points); and vitamin B12 (1 replacement).
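The substitution rule for values below the Lloq can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function name `impute_below_lloq` and the example Lloq of 0.6 are assumptions made for the sketch and are not taken from Table 2. The sketch also assumes that values below the Lloq arrive as plain numbers.

```python
import math


def impute_below_lloq(value: float, lloq: float) -> float:
    """Return lloq / sqrt(2) for a value below the Lloq, else the value itself.

    Hypothetical helper: assumes below-Lloq results are reported as numbers
    smaller than the Lloq (not as strings such as "<0.6").
    """
    if value < lloq:
        return lloq / math.sqrt(2)
    return value


# Illustrative values only; 0.6 is NOT an Lloq from the study.
example_lloq = 0.6
reported = [0.3, 0.6, 1.2]
imputed = [impute_below_lloq(v, example_lloq) for v in reported]
print(imputed)
```

Only values strictly below the Lloq are replaced; a value equal to the Lloq is kept as reported.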
Table 2. Median Values and Interquartile Range for 15 Serum Components in the Time-Series Measurements After Storage at −25°C
Lloq: in-house lower limit of quantification. Reported values below this limit were replaced with Lloq/√2.
Reference: in-house reference range for normal adult values of the analyte; the lower and upper boundaries cover adult males and females combined.
The fresh measurements at T0 are provided for comparison.
Statistical methods
Median value and interquartile range are reported for each component at each timepoint, as well as the mean percentage difference from the 3-month reference with 95% confidence intervals. Samples in biobanks are expected to experience at least one freeze–thaw cycle; to account for the potential impact of freezing, 8 the samples thawed after 3 months in storage at −25°C were considered the baseline reference samples (T1). We also present the measured values of the fresh samples (T0), which are more relevant in a clinical setting; however, these T0 values are not used in any of the statistical analyses. Each subsequent measurement was compared with the 3-month sample as the reference for a sample that had experienced freezing (Fig. 1). For each component, the biochemical data were tested for approximate normality, and where relevant, the data were transformed to approximate a normal distribution (Supplementary Table S1). One-way repeated-measures analyses of variance (ANOVAs) were used to compare differences in component levels up to 108 months of storage against the 3-month reference sample. For each component’s model, we performed Mauchly’s test for sphericity. 12 Upon violation of the sphericity assumption, a Greenhouse–Geisser correction 13 was applied. For each component where the ANOVA test was significant, post hoc paired-samples t-tests were used to test pairwise for differences between the 3-month reference and subsequent measurements. A Wilcoxon test was used for C-reactive protein and testosterone because their distributions did not approximate a normal distribution even after transformation. Time trends were tested using likelihood ratio tests comparing a linear mixed model with random intercept and no covariates to the same model with elapsed time from sample collection added as a continuous variable.
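As an illustration of the trend test, the likelihood ratio statistic for two nested models that differ by a single fixed effect (here, elapsed time) can be referred to a chi-square distribution with 1 degree of freedom. The sketch below is not the study's analysis code. It assumes that the log-likelihoods of both models have already been obtained from maximum-likelihood (not REML) fits, and the function name `lrt_one_df` and the example log-likelihood values are hypothetical.

```python
import math


def lrt_one_df(loglik_null: float, loglik_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)), so no external library
    is needed.
    """
    # Guard against tiny negative values caused by numerical noise.
    statistic = max(2.0 * (loglik_alt - loglik_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods: null model (random intercept only)
# versus the model with elapsed time added as a fixed effect.
statistic, p_value = lrt_one_df(loglik_null=-101.92, loglik_alt=-100.0)
print(f"LRT statistic = {statistic:.2f}, p = {p_value:.4f}")
```

The closed-form p-value applies only when the two models differ by exactly one parameter; comparisons involving more parameters would need a general chi-square survival function.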
To test for clinical impact, we performed equivalence tests by the two one-sided test (TOST) procedure 14 based on a random intercept linear mixed model for percent differences. The zone of clinical indifference for TOST was defined as Δcritical = CVw/3, where CVw is the assay-specific biological variance within individuals from the “European Federation of Clinical Chemistry and Laboratory Medicine” biological variation database accessed on February 11, 2025. 15 The values are provided in Supplementary Table S1. Analyses were performed using R version 4.3.2 (http://cran.r-project.org/), and for linear mixed model analyses, the nlme package 16 was used.
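The logic of the TOST procedure can be sketched for a single measurement time as follows. This is a simplified paired-sample version and not the random-intercept mixed-model procedure used in the study. The function names, the example CVw of 6.0%, the example percent differences, and the critical value passed in (about 1.895 for a one-sided α of 0.05 with 7 degrees of freedom) are all assumptions made for illustration.

```python
import math
import statistics


def equivalence_margin(cv_within: float) -> float:
    """Zone of clinical indifference: one third of the within-subject CV (%)."""
    return cv_within / 3.0


def tost_paired(pct_diffs: list[float], margin: float, t_crit: float) -> dict:
    """Two one-sided t-tests on percent differences from the reference.

    t_crit is the one-sided critical t value for len(pct_diffs) - 1 degrees
    of freedom; it is supplied by the caller because the standard library
    has no t distribution.
    """
    n = len(pct_diffs)
    mean = statistics.fmean(pct_diffs)
    se = statistics.stdev(pct_diffs) / math.sqrt(n)
    t_lower = (mean + margin) / se  # tests H0: true difference <= -margin
    t_upper = (mean - margin) / se  # tests H0: true difference >= +margin
    equivalent = t_lower > t_crit and t_upper < -t_crit
    return {"mean": mean, "t_lower": t_lower, "t_upper": t_upper,
            "equivalent": equivalent}


# Hypothetical percent differences from T1 for 8 donors.
diffs = [0.5, -0.5, 0.2, -0.2, 0.1, -0.1, 0.3, -0.3]
result = tost_paired(diffs, equivalence_margin(6.0), t_crit=1.895)
print(result["equivalent"])
```

Equivalence is declared only when both one-sided null hypotheses are rejected, which corresponds to the confidence interval for the mean difference lying entirely inside the zone of clinical indifference.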
Unless mentioned otherwise, p-values are provided unadjusted.
Ethics statement
The project was evaluated by the Regional Committee for Medical and Health Research Ethics in Norway. The committee considered this project as a quality assurance study based on anonymized samples, and thus, no explicit ethical approval was required.
Results
Samples in biobanks are expected to experience at least one freeze–thaw cycle; therefore, the samples thawed after 3 months in storage at −25°C were considered the baseline reference samples (T1), against which the subsequent measurements were compared. We also present the measured values of the fresh samples (T0), which are more relevant in a clinical setting. The median concentration and the interquartile range for all components by timepoint (T1 through T7 and T0) are presented in Table 2. For several of the measured components, for example, cholesterol, creatinine, and triglycerides, the levels are higher at T1 than at T0. Figure 2 displays the median value and interquartile range for all 15 serum components, whereas the individual-level data are plotted in Supplementary Figure S1.

Fig. 2. Plots of the median of the individual levels of serum components (y-axis) measured after increasing storage time (x-axis). Each storage time median value is connected by a line across the timepoints. The vertical lines at each timepoint indicate the interquartile range. The 3-month reference is provided in blue (dashed interquartile range). The freshly measured level (marked with *) is provided in green (dashed interquartile range). Component names and measurement units are provided at the top of each panel. A representation of all individual values is provided in Supplementary Figure S1. ASAT, aspartate aminotransferase; TSH, thyroid-stimulating hormone.
The percentage difference between the 3-month reference sample (T1) and each subsequent measurement (T2–T7) was calculated by pairwise subtraction of the 3-month measurement (T1). The mean percentage difference and its confidence interval for each measurement are provided in Table 3. For most components, a difference from T1 is present for all subsequent measurements. For some components, this difference has a consistent direction (e.g., cholesterol showing consistently lower measurements and creatinine showing consistently higher measurements), whereas other components show differences in both positive and negative directions in the subsequent measurements (e.g., bilirubin measuring higher at T2 and T3 and lower at T4–T7). For comparison, the difference between the fresh sample measurement (T0) and T1 is also provided in Table 3.
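The pairwise calculation can be sketched as below. The sketch assumes that the percentage difference for each donor is the follow-up value minus that donor's T1 value, divided by the T1 value and multiplied by 100; the exact formula is not spelled out in the text. The function names and the example values are hypothetical.

```python
import statistics


def percent_differences(reference: list[float], followup: list[float]) -> list[float]:
    """Per-donor percent difference of a follow-up measurement from T1.

    Both lists must be ordered by donor. A reference value of zero would
    raise ZeroDivisionError and is assumed not to occur.
    """
    if len(reference) != len(followup):
        raise ValueError("reference and followup must have the same length")
    return [100.0 * (f - r) / r for r, f in zip(reference, followup)]


def mean_percent_difference(reference: list[float], followup: list[float]) -> float:
    """Mean of the per-donor percent differences."""
    return statistics.fmean(percent_differences(reference, followup))


# Hypothetical values for three donors (not study data).
t1_values = [100.0, 200.0, 50.0]
t2_values = [110.0, 190.0, 50.0]
print(percent_differences(t1_values, t2_values))
print(mean_percent_difference(t1_values, t2_values))
```

Pairing the values by donor before averaging keeps each donor as their own control, which is why the aliquots from one donor must be matched across timepoints.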
Table 3. Overview of the Mean Percent Differences and Confidence Intervals for All Measurement Times After Pairwise Subtraction of the Individual T1 Values
Statistical tests for T2–T7 are presented in Table 4.
The percent differences between the fresh measurement T0 and the study reference T1 are provided for comparison and not subjected to statistical testing.
For the components that did not approximate normal distribution (C-reactive protein and testosterone), the median percent difference is provided.
Table 4. Statistical Test Results (p-Values) for the Comparisons of Storage Time
The provided p-values are not adjusted for multiple testing.
Likelihood ratio test between linear mixed model including time “component ∼ Time + (1|donor-id)” and the null model “component ∼ 1 + (1|donor-id).”
For aspartate aminotransferase, cholesterol, and creatinine, the Mauchly test was not performed due to low variance attributable to rounded values in the assay output data file.
ANOVA, analysis of variance; N/A, not available; n.s., not significant.
Mauchly’s test indicated a violation of the sphericity assumption for 11 out of 12 tested components (Table 4). ANOVA-based testing of the difference against the 3-month reference value revealed that for all components, a statistically significant difference was present in the time series (Table 4). To identify the timepoints with significant differences, we performed paired-samples t-tests for each component, testing the measurements T2–T7 against the reference T1. As shown in Table 4, most of these paired-samples t-tests identified significant differences between the measurements. The most differences were observed at T7, for which all comparisons were significant, while the fewest differences were observed at T3, where 4 out of the 15 components did not show significant differences.
To test whether observed differences between the measurements adhere to a specific direction of effect, we performed trend analyses (Table 4). Only albumin and thyroid-stimulating hormone did not show significant trends of effect for the time in storage, whereas the remaining 13 components had significant trends. Of these 13, 9 components showed a negative direction of effect with lower measurements for increasing storage time, whereas 4 components showed a positive direction of effect with higher measurements for increasing storage time (Table 4).
We assessed whether the observed differences in measured values of the components exceeded the zone of clinical indifference (Fig. 3). Except for triglycerides and C-reactive protein, all components had at least one measurement time that exceeded the zone of clinical indifference. For glucose, IgG, sodium, and vitamin B12, such measurement differences beyond the threshold for clinical indifference were observed in both directions. For the remaining nine components, the zone of clinical indifference was exceeded in one direction for at least one measurement time (higher for albumin, creatinine, and testosterone; lower for aspartate aminotransferase, bilirubin, cholesterol, potassium, thyroid-stimulating hormone, and urea). For cholesterol and creatinine, all measurement times were outside the zone of clinical indifference.

Fig. 3. Equivalence plots for component percentage deviations from the reference sample T1 at all measurement times. The x-axis displays the percent difference from the reference sample T1. The yellow shaded area marks the zone of clinical indifference, delimited by a maximum deviation of one-third of the assay-specific biological variance within individuals (see Materials and Methods); the cutoffs of this zone are marked by the left and right dashed vertical lines. For each analyte, the mean difference (black dot) and 95% confidence interval for the mean difference are provided (black horizontal bars). The 3-month reference T1 is provided as a blue diamond shape and marked with the center dotted vertical line. The metrics for the fresh sample are provided in green (short vertical line and horizontal bars) for comparison; these T0 values are not included in any statistical tests. (ref), reference.
Discussion
Our study shows significant differences in measured analyte levels for all of the 15 measured components after up to 9 years in storage at −25°C. Except for triglycerides and C-reactive protein, these differences exceeded the zone of clinical indifference, which was defined as one-third of the measurand-specific biological variance within individuals. In the time-trend analyses, only albumin and thyroid-stimulating hormone did not show a significant direction of effect. For creatinine, IgG, urea, and vitamin B12, the time-trend analyses showed higher measurements for longer storage time, whereas the remaining nine analytes showed a negative trend with increasing storage time. The most consistent time-in-storage effect was observed for bilirubin, which showed increasingly lower levels with longer storage times. Earlier reports of storage time effects on bilirubin at −20°C are inconsistent: Amin et al. 17 and the CALIPER study 4 reported a time-dependent decline in levels at −20°C after 2 weeks and after more than a month, respectively. Other studies did not find such a time-dependent effect at −20°C 3 or only observed it for some timepoints in their time series at −20°C. 6 Our current study supports a time-related decline in serum bilirubin levels in frozen samples. In contrast to the clear pattern of time-related decline for bilirubin, the other components we considered showed more erratic patterns after time in storage, although most components showed a declining trend across the times measured. This observation is in line with a previous study of serum samples stored in vapor-phase nitrogen, which showed that the predominant direction of effect for significant changes was toward lower levels after 6 years in storage. 5
Inherent to the design of our study, in which aliquots were frozen from a common serum sample for each donor, we cannot exclude that variations in the reported level for each analyte are attributable to technical variations in the laboratory equipment, assay kits, and standards used at the different timepoints. The measurements were performed in an accredited facility, which closely monitors assay performance using internal and external standards, and no deviations were reported during the period this study was conducted. Despite such technical considerations, it is likely that consistent time trends are caused by true time-in-storage effects, illustrated by, for example, bilirubin levels that show a consistent decline in reported level with increasing storage time. A more frequent sampling interval, repeated measurements at the sampling intervals, and/or an extended period of monitoring may reveal additional time trends for samples stored frozen.
No information about the sex of the anonymous donors was recorded. It should be noted that under normal circumstances, testosterone is present at much lower concentrations in females compared with males; based on low testosterone levels, most of the donors appear to be female. Therefore, testosterone may be less suited for the comparisons conducted in the current study; the variance is low for samples with near-zero levels for testosterone and the distribution of testosterone does not approximate a normal distribution after transformations. Similarly, many of the C-reactive protein measurements are either in the lower range or below the limit of quantification, indicating a low inflammatory state for the donors. For C-reactive protein, this also resulted in a distribution that does not approximate a normal distribution after transformations.
The results in the current study should be considered indicative. To draw solid conclusions with respect to the impact of storage time on serum analyte stability, additional studies are needed in which the potential technical bias is further addressed. Such potential technical batch effects over longer periods of time may negatively impact the robustness of the data on the investigated analytes. The fact that we have used aliquots originating from a single pooled sample and that these aliquots remained untouched in the freezer until analysis contributes to minimizing batch effects from other sources.
Four of the 15 analyzed components showed a time trend toward higher levels for longer storage times. Although also observed in other studies on analyte stability during storage,3,5,10 the mechanism behind increasing concentrations measured after longer storage times remains unclear. Possibly, the epitopes that are detected by the assays are masked in assays performed at the earlier timepoints, whereas these masked epitopes may become more accessible by time-related degradation of the masking components at later timepoints. Other sources of such increases could be sublimation/evaporation processes; however, this would also result in increased concentrations of the remaining 11 components.
The generalizability of this study to storage at lower temperatures is unclear, although Valo et al. showed that 15 out of 193 components were more affected at −20°C than at −80°C. 9 Storage at lower temperatures may therefore result in fewer time-in-storage effects.
When using biospecimens stored over prolonged time periods, it is recommended to match control samples with similar storage times to avoid systematic errors in the analyses of biomarker levels.
Conclusion
We show that serum analyte stability is affected by prolonged storage times at −25°C. Biobanks that can provide samples of index cases with matched controls should match on sample storage time. Bilirubin clearly shows signs of degradation corresponding to the time spent in storage. Other serum components, such as triglycerides and C-reactive protein, show more stable measurements over time in storage with respect to the zone of clinical indifference, and yet other components, such as serum glucose, show a high degree of variation across the different timepoints, without consistent time trends.
Authors’ Contributions
S.D.B.: Curation and analysis of data, interpretation of results, drafting the article and coordinating the coauthor contributions; and final approval of the article. M.L.: Sample handling, acquisition and curation of the data, interpretation of the data, reviewing the article, and final approval of the article. R.E.G.: Conceptualization of the study, acquisition of data, funding of the study, interpretation of results, reviewing the article, and final approval of the article. N.S.: Analysis of data, interpretation of results, reviewing the article, and final approval of the article. O.I.K.: Acquisition and curation of the data, interpretation of results, reviewing the article, and final approval of the article. H.L.: Conceptualization of the study, acquisition of data, funding of the study, interpretation of results, reviewing the article, and final approval of the article.
Footnotes
Acknowledgments
The authors acknowledge the voluntary contribution of the serum donors used in this study.
Author Disclosure Statement
No competing financial interests exist.