Abstract
Introduction:
Based on the experience from a Swedish biobank, we established a clinical cervical cytology biobank and adapted it to a Danish setting. The aim of the present study was to validate the biobank material regarding quality and quantity, to determine the usefulness of the material for future diagnostics and biomarker testing.
Methods:
Cervical cytology samples collected in ThinPrep were analyzed before and after biobanking using p16/ki-67 dual staining, a human papillomavirus (HPV) DNA test (Cobas), and a test for HPV messenger RNA (mRNA; Aptima). The concordance of the test results before and after biobanking was assessed. We also evaluated the morphology before and after biobanking and did additional tests on the biobanked material to qualify the usefulness of the material (library preparation for next-generation sequencing [NGS], reverse transcription–polymerase chain reaction [RT-PCR], and the Inno-Lipa HPV genotyping test).
Results:
For the Cobas HPV test, the concordance was 92% (122/133), and for the Inno-Lipa test (30 samples), it was 100%. For the Aptima assay, the concordance was somewhat lower, 84% (42/50). Cell morphology was well preserved, and the concordance of the p16/ki-67 dual staining was 88% (37/42). The functional tests showed that DNA-based NGS libraries (TST15 panel; Illumina) had good quality parameters. However, in the RT-PCR, 12% of the samples failed because of poor RNA quality or insufficient input material.
Conclusion:
The quality of the biobanked samples is high, and the material is suitable for testing of DNA, RNA, and protein. However, for testing of specific biomarkers, pilot studies are recommended to ensure sufficient input amount and quality of the material, especially for RNA-based studies.
Introduction
Screening for cervical cancer has been carried out in Denmark for more than five decades. 1 The first Danish National Screening Guideline was issued in 1986 and updated in 2007, 2012, and 2018.2,3 Since 2007, it has been recommended to screen women aged 23–49 years every 3 years, and women aged 50–64 years every 5 years, and ∼400,000 samples are analyzed every year in Denmark. 4
With the introduction of liquid-based cytology (LBC) for cervical cancer screening, biobanking of these samples has become feasible. Systematic biobanking makes it possible to collect and store biological material for later clinical use and for quality assurance. A biobank is also a unique resource for research, because biological, demographic, and epidemiological data can be combined with relevant endpoint data on follow-up and disease. In Denmark, all residents have a unique Civil Registration Number, which is used universally in all national health registers and administrative systems. 5 Using the Civil Registration Number, data from biobanked samples can be linked with a variety of health data, making it possible to conduct high-quality epidemiological studies with virtually no loss to follow-up. 5
However, implementation of a biobank requires thorough preparation to ensure future use of the samples.6,7 For most laboratories, it is not possible to store the original vials due to limited space, and therefore, samples have to be processed. Accordingly, sample processing and storage procedures need to be standardized to ensure quality and usefulness of the biobanked material.
In 2010, a cervical cytology biobank was initiated in Sweden, 8 and after adjustments, the biobank was expanded to a nationwide implementation.7,9 Based on their experience, we have established a clinical cervical cytology biobank in a Danish setting. The biobank was developed as part of the first Danish implementation study of human papillomavirus (HPV)-based screening, HPV SCREEN DENMARK, which is embedded in the screening program at Lillebaelt Hospital in the Region of Southern Denmark. 10 In 2020, Lillebaelt Hospital processed 12% of the cervical cytology samples in Denmark. 4
The overall purpose of the biobank was to collect and store residual material from LBC vials under controlled and standardized conditions from all liquid-based cervical cytologies processed at the Department of Pathology, Lillebaelt Hospital. Accordingly, the aim of the present study was to validate the biobank regarding yield and quality of the biobanked material and to describe an efficient high-throughput workflow.
Materials and Methods
The biobank consists of residual material from cervical LBC samples received at the Department of Pathology, Lillebaelt Hospital, in the period May 29, 2017, to May 5, 2020. In total, 157,396 LBC samples were registered and analyzed during this period. We were able to biobank 150,240 (95.5%) of the samples. The main reason for not biobanking a sample was too little residual material in the vial (samples with <6 mL were excluded). Other, much less frequent, reasons were missing samples or technical issues in the biobanking procedure.
All LBC samples were collected in ThinPrep Media (Hologic, Marlborough, MA) and initially processed for routine clinical purposes according to the Danish National Guidelines for screening for cervical cancer 3 or according to the algorithms for the HPV SCREEN DENMARK study for women aged 30–59 years. 10 After routine testing, the residual material was stored in a dark basement at room temperature and biobanked 5.5–13 months later. The quality of the oldest samples was examined before starting to biobank (Supplementary Data S1).
The biobanking procedure is outlined in Figure 1. After sedimentation for a minimum of 2 hours, 6 mL (3 × 2 mL) of the sample was aspirated from the bottom of the ThinPrep vial and dispensed into an intermediate conical tube. After another 30 minutes of sedimentation, 570 μL of sedimented sample was transferred to storage tubes A and B, respectively. The tubes were stored at −18°C.

Figure 1. Biobank flowchart.
The sample processing procedure was automated in a 96-well format using a Tecan Freedom robot (Tecan, Männedorf, Switzerland). Identification and traceability of the samples were ensured by a Laboratory Information Management System (LIMS; LabWare Nordic, Helsingborg, Sweden), which links the unique sample identification number to a two-dimensional (2D) barcode at the bottom of the storage tubes and to a specific position in the freezer.
A standard operating procedure (SOP) was prepared to ensure uniform and correct collection and storage of the samples. Likewise, an SOP for retrieving samples from the biobank has been prepared. All documents are kept in a document management system (D4).
Ethical considerations and data protection
The biobank was established as a clinical biobank at the Department of Pathology, Lillebaelt Hospital, and was, according to Danish legislation, reported to the Data Protection Agency in the Region of Southern Denmark (Journal No. 18/21475). In the screening invitation letter, the women were informed that the residual sample material would be stored in a biobank. The women were informed that the biobanked sample could be used for (1) Clinical purposes, that is, later diagnostics and treatment of the woman herself; (2) Evaluation and quality assurance of the screening program; and (3) Research purposes. Storage tube B in the biobank (Fig. 1) is reserved exclusively for clinical purposes. Women were given the possibility to opt out of the biobank by registering in the Danish Register for Use of Tissue (“Vævsanvendelsesregisteret”) or by contacting Vejle Department of Pathology directly. Before using material from the biobank for research purposes, the specific research project must be approved by the Danish Scientific Ethics Committee System.
The biobank was established to comply with European data protection regulations, including Directive 95/46/EC (before May 25, 2018) and the European General Data Protection Regulation (EU 2016/679) and associated national Danish laws. The electronic biobank database is hosted on a secure regional server, which is continuously backed up. The database is password protected, and access is granted only to authorized staff. The physical biobank is located in an access-restricted room only accessible by authorized personnel, and biobank freezers are temperature-monitored to minimize risk of sample loss.
Samples in the biobank are labeled with a unique 2D barcode and thus cannot be directly linked to a woman's identity (pseudonymization). The LIMS stores the link between the 2D barcode and the unique sample identification number. Linkage between the sample identification number and the identity of each woman is possible through the Danish Pathology Databank, which is a nationwide pathology register and administrative system for the Danish cervical cancer screening program. 11 The Pathology Databank holds diagnostic information on all pathology specimens processed in Denmark, including the samples stored in the biobank, registered under the sample identification number and the patient's unique Civil Registration Number. Authorized staff at the Vejle Department of Pathology can access the Pathology Databank by logging in with a password.
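The two-step linkage described above can be illustrated with a minimal sketch. It is not the LabWare LIMS or the Danish Pathology Databank interface; the class names, field names, identifier formats, and the `resolve_identity` helper are hypothetical and serve only to show that the barcode alone does not reveal identity, and that a second, separately held register is needed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class StorageRecord:
    """One storage tube as a LIMS-like system might record it (hypothetical fields)."""
    barcode_2d: str        # 2D barcode at the bottom of the storage tube
    sample_id: str         # sample identification number, not a personal identifier
    tube: str              # "A" or "B"; tube B is reserved for clinical purposes
    freezer_position: str  # position in the freezer; the format is an assumption


class Lims:
    """Holds only the barcode-to-sample-ID link, never the woman's identity."""

    def __init__(self) -> None:
        self._by_barcode: Dict[str, StorageRecord] = {}

    def register(self, record: StorageRecord) -> None:
        if record.barcode_2d in self._by_barcode:
            raise ValueError(f"barcode already registered: {record.barcode_2d}")
        self._by_barcode[record.barcode_2d] = record

    def lookup(self, barcode_2d: str) -> Optional[StorageRecord]:
        return self._by_barcode.get(barcode_2d)


def resolve_identity(
    barcode_2d: str,
    lims: Lims,
    pathology_register: Dict[str, str],
) -> Optional[str]:
    """Step 1: barcode -> sample ID (LIMS). Step 2: sample ID -> person (separate register).

    `pathology_register` stands in for a separately access-controlled register
    that maps sample IDs to personal identifiers.
    """
    record = lims.lookup(barcode_2d)
    if record is None:
        return None
    return pathology_register.get(record.sample_id)
```

Keeping the two mappings in separate systems means that access to the freezer or to the LIMS alone is not sufficient to identify a woman.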
Establishment of workflow and validation of sample quality
We performed several tests to validate the quality of the biobank content, and initially, we evaluated the yield of material in a biobanked sample compared with the original tube, to assess if the input amount in the biobank was adequate for different subsequent analyses. The setup is outlined in Table 1.
Table 1. Summary of Validation Tests Performed on the Biobank Material
gDNA, genomic DNA; HPV, human papillomavirus; mRNA, messenger RNA; NGS, next-generation sequencing; PAP, Papanicolaou; RT-PCR, reverse transcription–polymerase chain reaction.
Calculation of yield
To calculate the yield, we constructed 24 samples by pooling 48 randomly chosen anonymous patient samples in pairs. Each of the 24 pooled samples was divided into 2 uniform aliquots, which were biobanked in parallel to compare the yield from input volumes of 4 and 6 mL. Yield was calculated by extrapolating the amount of DNA (ng) measured in a 50 μL sample to the total content (ng) of the original and of the biobanked sample and comparing the two.
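The extrapolation can be sketched as follows. This is a minimal illustration under stated assumptions, not the calculation script used in the study: the function names are hypothetical, the DNA amounts and the 20 mL original volume in the example are invented for illustration, and the sketch assumes that the measured 50 μL is representative of the whole volume.

```python
def total_dna_ng(dna_ng_in_aliquot: float, aliquot_ul: float, total_volume_ul: float) -> float:
    """Scale the DNA amount measured in an aliquot up to the full sample volume."""
    if aliquot_ul <= 0 or total_volume_ul <= 0:
        raise ValueError("volumes must be positive")
    return dna_ng_in_aliquot * total_volume_ul / aliquot_ul


def yield_percent(biobanked_total_ng: float, original_total_ng: float) -> float:
    """Express the DNA content of the biobanked sample as a percentage of the original."""
    if original_total_ng <= 0:
        raise ValueError("original DNA content must be positive")
    return 100.0 * biobanked_total_ng / original_total_ng


# Illustrative numbers only (not study data):
# original sample: 5 ng measured in 50 uL, assumed total volume 20 mL
original_ng = total_dna_ng(5.0, 50.0, 20_000.0)   # 2000 ng
# biobanked sample: 40 ng measured in 50 uL, total volume 2 x 570 uL
biobanked_ng = total_dna_ng(40.0, 50.0, 1_140.0)  # 912 ng
example_yield = yield_percent(biobanked_ng, original_ng)  # 45.6 %
```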
Concordance of DNA and RNA test results
For the concordance of HPV DNA test results in biobanked and original patient samples, we used two different methods: the Cobas HPV assay (Roche, Heidelberg, Germany) and the Inno-Lipa HPV genotyping assay (Fujirebio, Zwijnaarde, Belgium).
The Cobas HPV assay is a quantitative polymerase chain reaction (qPCR)-based method that targets 14 high-risk HPV types, with individual detection of HPV16 and HPV18 and pooled detection of 12 other oncogenic types (31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68). For this investigation, 144 routine samples were randomly selected (72 HPV positive and 72 HPV negative). The samples were anonymized and biobanked. For the Cobas HPV test, 300 μL of biobanked sample was added to 700 μL pure ThinPrep solution and tested for the presence of high-risk HPV DNA according to the manufacturer's instructions.
The Inno-Lipa is a probe-based blotting assay targeting the L1 region of the HPV genome. It is a qualitative method, which detects and identifies 32 HPV genotypes individually (6, 11, 16, 18, 26, 31, 33, 35, 39, 40, 42, 43, 44, 45, 51, 52, 53, 54, 56, 58, 59, 61, 62, 66, 67, 68, 70, 73, 81, 82, 83, and 89). Fifty nanograms or 10 μL of purified DNA was used, and the results were compared with the original Cobas HPV result. We only considered the 14 high-risk HPV types that could be detected by both assays. In total, 30 samples were tested.
For the concordance of HPV messenger RNA (mRNA) test results, 50 biobanked samples initially tested by the Aptima HPV assay (Hologic) for triage according to the Danish National Cervical Cancer Screening Guidelines 3 were randomly selected and anonymized. The biobanked material was re-analyzed using the same assay, which is a qualitative multiplex amplification test that targets E6/E7 mRNA of the 14 high-risk HPV types (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68). We used 100 μL of biobanked material for the analysis, and the result was compared with the original test result.
Functional test of purified genomic DNA and RNA
Next-generation sequencing (NGS) is a more advanced molecular method compared with qPCR, and it holds great potential for future investigation. Therefore, we investigated whether purified genomic DNA (gDNA) from the biobank would be useful for library preparation for NGS. We used the multiplex amplicon-based TST15 panel (Illumina, San Diego, CA), which covers 15 genes 12 and considered the quality parameters from the library generation to assess the quality of gDNA (concentration of gDNA input in relation to the library concentration and size distribution [bp] of the libraries).
A functional test of purified RNA from the biobank was done by reverse transcription–polymerase chain reaction (RT-PCR), with visualization of the PCR result using TapeStation (Agilent, Cheshire, United Kingdom). The RT reaction was performed using the QuantiTect Reverse Transcription Kit from Qiagen (Hilden, Germany) and an input amount of 12 μL RNA. The target gene was GAPDH (forward primer 5′-GTCAGCCGCATCTTCTTTTG-3′, reverse primer 5′-GCGCCCAATACGACCAAATC-3′), and we used DreamTaq DNA polymerase (Thermo Fisher Scientific, Paisley, United Kingdom) for the amplification. Fifty randomly selected samples were tested.
Cell morphology and protein expression
The cellularity in LBC samples varies, and this is reflected in the amount of material that is transferred to the storage tubes when biobanking. We initially tested different volumes of biobanked material to determine the minimum volume needed to produce slides that were adequate for cytology evaluation according to the Bethesda classification. We decided to use 200 μL. We suspended the 200 μL in 20 mL of ThinPrep media and processed the samples using the routine setting on the T5000 instrument (Hologic). The slides were stained with Papanicolaou (PAP) stain. We randomly selected 30 specimens, and an experienced cyto-technician and a pathologist evaluated the slides under a light microscope for preservation of cell morphology compared with the original LBC slide prepared before biobanking.
We also did an evaluation of dual staining for p16/ki-67 proteins before and after biobanking. We selected 45 LBC samples, of which 18 samples had normal cytology and 27 had abnormal cytology (1 atypical squamous cell of undetermined significance, 13 low-grade squamous intraepithelial lesions, and 13 high-grade squamous intraepithelial lesions). After selection, the samples were anonymized. For details on how the evaluation of the dual staining was performed, see Supplementary Data S2.
Statistical analysis
Concordance of results before and after biobanking was calculated using Cohen's kappa statistics 13 and the GraphPad Software (San Diego, CA). We interpreted the kappa values as “poor” agreement (κ = 0.00–0.20), “fair” agreement (κ = 0.21–0.40), “moderate” agreement (κ = 0.41–0.60), “good” agreement (κ = 0.61–0.80), or “very good” agreement (κ = 0.81–1.00). 13
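For readers who want to reproduce this type of analysis outside GraphPad, Cohen's kappa for a 2 × 2 table (positive/negative before vs. after biobanking) can be sketched as below. The function names and the example counts are hypothetical and are not study data. The interpretation bands follow those listed above; assigning negative kappa values to "poor" is an added assumption.

```python
def cohens_kappa(both_pos: int, first_pos_only: int, second_pos_only: int, both_neg: int) -> float:
    """Cohen's kappa for two binary ratings of the same samples."""
    n = both_pos + first_pos_only + second_pos_only + both_neg
    if n == 0:
        raise ValueError("table is empty")
    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + first_pos_only) / n    # positive rate, first test
    second_pos = (both_pos + second_pos_only) / n  # positive rate, second test
    # Agreement expected by chance from the marginal rates
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map kappa to the agreement categories used in this study."""
    if kappa <= 0.20:
        return "poor"  # values below 0 are also treated as poor (assumption)
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical table: 40 positive in both, 5 + 5 discordant, 50 negative in both
example_kappa = cohens_kappa(40, 5, 5, 50)      # about 0.80
example_label = interpret_kappa(example_kappa)  # "good"
```

In this example, the observed agreement is 90% and the agreement expected by chance is 50.5%, which is why kappa is lower than the raw percentage.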
Results
As an initial experiment, we tested the yield from 4 and 6 mL of original sample for biobanking. The yield of gDNA from 6 mL was on average 47% (median 46%) of the initial sample, whereas for 4 mL, it was 30% (median 23%).
Concordance of DNA and mRNA results
We investigated the concordance between the Cobas HPV DNA test results in the biobanked and corresponding original samples. Eleven of the biobanked samples had an invalid HPV test result. For the remaining 133 samples, 92% showed the same HPV result as in the original Cobas HPV test (Table 2). The 11 discrepant samples were all characterized by a weak HPV-positive result in either the original or the biobanked sample reflected in high Ct values (Ct range 32.7–40.0). Five of the samples with an invalid test result in the biobanked sample were re-tested using the Inno-Lipa HPV assay, which requires less input material than the Cobas HPV assay. For these samples, valid HPV test results were seen.
Table 2. Concordance of the Cobas HPV Results (DNA), Inno-Lipa Test Results (DNA), and Aptima HPV Results (RNA)
The Inno-Lipa test result is compared with the original Cobas HPV test result.
Defined as positive for ≥1 of the 14 high-risk HPV types (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68).
We also investigated the concordance between the Cobas HPV test results in original tubes and the results of Inno-Lipa HPV genotyping assay on the corresponding biobanked samples. We found a 100% concordance for the 30 samples considering positive and negative HPV test results (Table 2).
Finally, we examined the concordance using the Aptima mRNA assay (Table 2). For 84% (42/50) of the samples, the result in the biobanked sample was concordant with that in the original sample. The eight discordant samples were all positive in the original sample but negative in the biobanked sample. For these, the Aptima test was repeated using more material from the biobanked sample (200 μL). On retesting, six of the eight samples were positive, increasing the concordance to 96% with a kappa value of 0.91.
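The percentages reported in this section follow directly from the counts. A short sketch of the arithmetic, using the counts given in the text (the function name is hypothetical):

```python
def percent_agreement(concordant: int, total: int) -> float:
    """Percentage of samples with the same result before and after biobanking."""
    if total <= 0 or not 0 <= concordant <= total:
        raise ValueError("need 0 <= concordant <= total and total > 0")
    return 100.0 * concordant / total


cobas = percent_agreement(122, 133)            # about 91.7, reported as 92%
aptima_initial = percent_agreement(42, 50)     # 84.0
# Six of the eight discordant samples were positive on retesting with 200 uL
aptima_retest = percent_agreement(42 + 6, 50)  # 96.0
```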
Functional test (DNA and RNA)
For the NGS analysis, we tested 16 samples. For 9 of the 16 samples, we used 2 × 10 ng of gDNA for library preparation, which is the recommended amount from the manufacturer. For the remaining seven samples, the DNA concentration only allowed 2 × 2–6 ng for library preparation, which is sufficient if the DNA quality is good.
The library concentrations had a median of 25 ng/μL, but five samples had one library pool with a concentration between 2 and 6 ng/μL. The manufacturer recommends a concentration of 20 ng/μL or more for successful sequencing. However, all samples showed a correct and satisfactory size distribution in the TapeStation analysis, indicating that all samples were adequate for sequencing.
mRNA quality was examined using RT-PCR. Of the 50 samples tested, 88% (44/50) had a positive RT-PCR result, whereas six samples showed no band on the TapeStation, owing to either poor RNA quality or low input in the RT-PCR.
Cell morphology and protein expression
We evaluated the morphology of 30 biobanked samples compared with the Pap-stained slides prepared immediately after collection and before biobanking. No differences were seen before and after biobanking (Fig. 2). In all biobanked samples, the cells were intact, well preserved, and suitable for cytology.

Figure 2. PAP-stained imprint of cervical cell samples for cytological evaluation. Comparative illustration of imprint immediately after sample collection
Altogether, 90 slides consisting of 45 pairs obtained before and after biobanking were evaluated for dual staining of p16/ki-67. Three cases were excluded due to low cellularity according to the Bethesda criteria. The results are summarized in Table 3 showing agreement in 37 of 42 cases.
Table 3. Concordance of Dual Staining for p16/ki-67 Before and After Biobanking
Discussion
A clinical cervical cytology biobank is a very useful resource for supplementary analyses of patient samples to guide further treatment, for quality assurance, and for research projects. However, this requires that the biobanked material is of high quality and available in adequate amounts. Accordingly, this study focused on evaluating a biobank workflow that delivers samples meeting these requirements.
Initially, we evaluated the amount of material to be stored in the biobank, comparing a 6 mL sample with a 4 mL sample. The median yield was doubled in 6 mL compared with 4 mL, and therefore, 6 mL was preferred as this will enable considerable additional testing. We stored the biobanked material in two tubes (in total 1140 μL), where one of the tubes is reserved exclusively for potential future clinical analysis for the woman. In the Swedish biobank, they stored one sample of 600 μL from each woman but reserved 100 μL of the material to enable further clinical analysis for the woman.7,8
Using the Cobas HPV test, we found a very good concordance (92%) between results before and after biobanking. This is in line with a similar study based on the Swedish biobank, which reported an agreement of 87%. 14 Tests are known to show intra- and inter-assay variation even without additional handling of the material such as biobanking, and our results are close to those of validation studies, in which an intralaboratory agreement of around 98% and an interlaboratory agreement of 95% were reported for the Cobas HPV assay.15,16
For the 11 samples (8%) with a discrepant Cobas HPV result, the Ct values were close to the cutoff for the analysis. We also found that when invalid samples were re-tested with the Inno-Lipa assay, valid results were obtained. Both these observations indicate that for some samples, the Cobas HPV assay will require more than the 300 μL of biobanked material we used to produce a valid test result.
In relation to the Aptima HPV test, we also found a very good concordance (84%) using 100 μL material, which is similar to the results reported by Larsson et al. 14 When we increased the amount of material to 200 μL, six of eight HPV-negative biobanked samples became HPV positive. This increases the concordance to 96% and is in line with data from Heideman et al., 17 where intralaboratory variation for the Aptima HPV assay was 96%. Accordingly, even though mRNA is more susceptible to degradation, samples from the biobank do have intact mRNA suitable for biomarker analysis. It is important that future studies also perform initial investigations to ensure that an adequate amount of biobanked material is used for the planned study.
To further describe the usefulness of the biobanked material, we evaluated the cell morphology and found that the cells were intact and suitable for cytology also after biobanking. The same result was seen in the study on the Swedish biobank. 8
For the p16/ki-67 dual staining, we found a concordance of 88% when scoring samples before and after biobanking. This is a high degree of reproducibility, considering that the biobanked samples were processed through the biobank workflow before the p16/ki-67 staining was repeated. To the best of our knowledge, this is the first study to investigate the reproducibility of p16/ki-67 on biobanked material. Other studies have investigated interobserver reproducibility of the scoring of p16/ki-67 dual staining with concordance ranging from 70% to 95%.18,19 Studies have demonstrated that training and possibly artificial intelligence can improve the reproducibility of the scoring of the p16/ki-67 analysis.20–22
For future research projects or triage of HPV-positive women, new biomarkers will most likely appear. For this purpose, it will be of interest to do NGS. In the present validation study, we therefore prepared a panel-based library for NGS, and based on the quality parameters in the NGS workflow, the biobanked material seems adequate for this type of analysis. Still, one would need to do pilot testing of the specific analysis to determine the amount of biobanked material needed.
In conclusion, this validation study shows that the biobanked material is of high quality and for most analyses in adequate amount, making it suitable for investigations of cell, protein, RNA, and DNA. A limiting factor can be the amount of material, as some future analyses may require a larger input, and this needs to be considered when planning future studies using the biobanked material.
Footnotes
Author Disclosure Statement
No conflicting financial interests exist. SKK has received a research grant through her affiliated institution.
Funding Information
The project was supported by the Mermaid project (Mermaid 3) and the Region of Southern Denmark (Damhaven 12, 7100 Vejle, Denmark). Louise T. Thomsen was supported by the Lundbeck Foundation (R287-2018-1454).
References
