Abstract
Introduction:
During the COVID-19 pandemic, an extraordinary number of nasopharyngeal secretion samples inoculated in viral transport medium (VTM) were collected and analyzed to detect SARS-CoV-2 infection. In addition to viral detection, those samples can also be a source of host genomic material, providing excellent opportunities for biobanking and research.
Objective:
To describe a simple, in-House–developed DNA extraction method that obtains high-yield, high-quality genomic DNA from VTM samples for host genetic analysis, and to assess its relative efficiency by comparing its yield and suitability for downstream applications with those of two commercial DNA extraction kits.
Methods:
In this study, 13 VTM samples were processed by two commercial silica-based kits and compared with an in-House–developed protocol for host DNA extraction. An additional 452 samples were processed by the in-House method. The quantity and quality of the differentially extracted DNA samples were assessed by Qubit and spectrophotometric measurements. The suitability of extracted samples for downstream applications was tested by polymerase chain reaction (PCR) amplification followed by amplicon sequencing and allelic discrimination in real-time PCR.
Results:
The in-House method provided greater median DNA yield (0.81 μg), being significantly different from the PureLink® method (0.14 μg, p < 0.001), but not from the QIAamp® method (0.47 μg, p = 0.980). Overall satisfactory results in DNA concentrations and purity, in addition to cost, were observed using the in-House method, whose samples were able to produce clear amplification in PCR and sequencing reads, as well as effective allelic discrimination in real-time PCR TaqMan® assay.
Conclusion:
The described in-House method proved to be suitable and economically viable for genomic DNA extraction from VTM samples for biobanking purposes. These results are extremely valuable for the study of the COVID-19 pandemic and other emergent infectious diseases, allowing host genetic studies to be performed in samples initially collected for diagnosis.
Introduction
Human DNA isolated from clinical samples can be used in a wide range of methodological approaches for diagnostic and research purposes. Since the completion of the Human Genome Project, technologies to analyze genetic material have been rapidly developed and constantly updated. These advances have made it possible to obtain large amounts of genetic data quickly and efficiently.1,2 Hence, these technologies have revolutionized knowledge in the human health field, resulting in faster and enhanced diagnosis, prognosis, and treatment of diseases.1,3
In clinical diagnosis and research, DNA is purified mostly through standard commercially available kits using silica adsorption technologies with magnetic beads or spin columns. These methods may be highly reliable and convenient, but are usually expensive and still require sample preparation steps.4,5 Numerous DNA extraction protocols have been created and continuously optimized for a wide variety of samples. Some of these techniques resulted in the development of commercial DNA or RNA extraction kits,4 whereas several remarkable in-House protocols are still described and widely used.5–7 Implementation of such procedures may contribute to operational sustainability in a DNA biobank setting, especially in hospital or university-run initiatives in low- and middle-income countries.8,9
The main technique for laboratory detection of SARS-CoV-2 infection, the reverse transcription–polymerase chain reaction (PCR), detects viral RNA from nasopharyngeal swab-derived samples stored in a viral transport medium (VTM),10,11 which can also be a source of host DNA. Swab collection, however, recovers a limited number of host cells, which are diluted in VTM and usually frozen, reducing the amount of available DNA compared with blood and saliva samples.12 In this study, we describe an in-House–developed method to obtain high-quality genomic DNA from VTM samples for host DNA analysis. We compared the yield of the developed protocol with that of two commercial DNA extraction kits, tested its performance in downstream applications, and compared costs for large-scale applications.
Materials and Methods
Samples
Residual VTM samples inoculated with nasopharyngeal swabs previously tested for SARS-CoV-2 were obtained from the Laboratório Central do Rio Grande do Sul (LACEN-RS), the main public laboratory for COVID-19 diagnosis in the southernmost state of Brazil. We compared differentially extracted DNAs from 13 randomly selected samples, collected for diagnostic purposes in July 2020 (n = 6) and March 2021 (n = 7), stored initially at −80°C and then at −20°C after COVID-19 testing until the DNA extraction procedure. In addition, 452 samples were processed by our in-House method. To assess total cell-free DNA recovery without sampling variables, we also included two clean VTM samples spiked with cell-free lambda phage DNA (Thermo Fisher Scientific), at a total DNA mass of 10 μg per sample, which were run together with the clinical samples in the extraction procedures.
DNA extraction methods
All extraction methods were executed by the same scientist, using the same input sample amount (400 μL) and the same elution/resuspension volume (50 μL). The elution volume was chosen because of the previously observed low yield in this kind of sample and is within the volume range indicated by the manufacturers. The same clinical samples were tested across the different extraction methods to minimize sample collection variability, but were not processed in replicates.
Commercial DNA extraction kits
We included two commercial kits in this study: the PureLink® Genomic DNA Mini Kit (Invitrogen) and the QIAamp® Viral RNA Mini Kit. The PureLink Kit (Protocol A) was chosen because of its flexibility in extracting high-quality DNA from different sample types. The QIAamp Viral RNA Kit (Protocol B) was selected because it is widely used in sample preparation for SARS-CoV-2 detection by health/diagnostic services. This kit is designed for the purification of viral RNA, but does not separate viral RNA from DNA, suggesting that RNA samples routinely extracted by this method could also be used for host genetics research. DNA extraction with commercial kits was performed following the manufacturer's instructions with minimal adaptations related to initial sample volumes. Detailed protocols are given in Supplementary Data S2.
In-House method
Briefly, the protocol consists of lysis by sodium dodecyl sulfate (SDS) and proteinase K, followed by deproteinization by salting-out and DNA precipitation in isopropanol. It was designed using commonly available molecular biology reagents. Samples (400 μL) were digested by adding 20 μL of VTM lysis buffer (200 mM Tris-HCl pH 8.0, 3 M NaCl, 80 mM ethylenediaminetetraacetic acid [EDTA], 200 mM KCl), 20 μL of 10% SDS, and 10 μL of proteinase K (10 mg/mL), and incubating at 56°C for 20 minutes. After digestion, a salting-out procedure was carried out by adding 130 μL of ammonium acetate (8.5 M) to the lysate and mixing thoroughly by vortexing for 5 seconds. The samples were incubated at −20°C for 5 minutes. Proteins were precipitated by centrifugation at 15,000 × g for 10 minutes.
The supernatant was transferred to a new microcentrifuge tube, combined 1:1 with room-temperature isopropanol, and mixed by inversion 10 times. The DNA was precipitated by centrifugation at 15,000 × g for 10 minutes. DNA pellets were washed with 1 mL of 70% ethanol and centrifuged at 15,000 × g for 5 minutes. After decanting the alcohol and air drying the DNA pellet, 50 μL of TE buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA) was used for DNA rehydration.
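As a rough aid for planning other input volumes, the reagent amounts above can be scaled linearly with the sample volume, in line with the proportional scaling suggested in the Discussion. The sketch below is illustrative only: the function name `scale_reagents` is hypothetical, and the isopropanol figure assumes the entire lysate is recovered as supernatant, so it is an upper bound rather than a measured value.

```python
# Reagent volumes (in microliters) per 400 uL of VTM sample, as given in the protocol text.
BASE_INPUT_UL = 400.0
BASE_REAGENTS_UL = {
    "VTM lysis buffer": 20.0,
    "SDS 10%": 20.0,
    "proteinase K (10 mg/mL)": 10.0,
    "ammonium acetate (8.5 M)": 130.0,
}


def scale_reagents(input_ul):
    """Scale each reagent linearly with the sample input volume (hypothetical helper)."""
    if input_ul <= 0:
        raise ValueError("input volume must be positive")
    factor = input_ul / BASE_INPUT_UL
    volumes = {name: vol * factor for name, vol in BASE_REAGENTS_UL.items()}
    # Assumption: the whole lysate is carried over as supernatant, so this is
    # an upper bound for the 1:1 isopropanol volume.
    volumes["isopropanol (upper bound)"] = input_ul + sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, vol in scale_reagents(800.0).items():
        print(f"{name}: {vol:.0f} uL")
```

For inputs much larger than 400 μL the combined volume exceeds a standard microcentrifuge tube, so the precipitation step would need to be split or moved to a larger tube.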
Analysis of the extracted DNA samples
Quantification, purity, and integrity assessment
Purity ratios of DNA samples were determined by spectrophotometry (NanoDrop® One; Thermo Fisher Scientific, Waltham, MA), using ratios of absorbance at 260/280 nm and 260/230 nm. Double-stranded DNA (dsDNA) quantification was performed in a Qubit® 2.0 fluorometer using dsDNA BR Assay Kit or dsDNA HS Assay Kit (Life Technologies, Carlsbad, CA). The integrity of extracted DNA was assessed by resolving samples in 1% agarose gel electrophoresis (7 V/cm for 50 minutes), using 1 × Tris/Borate/EDTA buffer with GelRed® (Biotium, Inc., Fremont, CA) staining and a known molecular weight marker.
Molecular analysis
The suitability of differentially extracted DNA samples for downstream applications was assessed by PCR amplification followed by amplicon sequencing and by a real-time PCR genotyping experiment.
PCR amplification, clean-up, and sequencing
To test the samples' suitability for downstream applications, a 342 bp fragment of the serotonin 2A receptor (5-HTR2A) gene was amplified by PCR. The amplification products were analyzed by electrophoresis on 1.5% agarose gel stained with GelRed and visualized under ultraviolet light. PCR products were purified by precipitation using polyethylene glycol 8000 and sequenced by capillary electrophoresis using BigDye chemistry in an ABI 3730 system DNA analyzer (Applied Biosystems, Foster City, CA). Sequences were analyzed using CodonCode Aligner. Detailed information about reaction conditions and the clean-up protocol is given in Supplementary Data S2.
Genotyping experiment
Genotyping of the rs3775291 SNP in the TLR3 gene was performed using TaqMan® SNP Genotyping Assay. PCR reactions were performed according to the manufacturer's instructions using 1 × TaqMan™ Genotyping Master Mix and 10 ng template DNA, in the StepOnePlus Real-Time PCR system (Applied Biosystems). The primers and genotyping assay were chosen based on previous expertise of our group and our interest in genetics of immune response and neurological and developmental conditions. Additional information about real-time genotyping experiments is given in Supplementary Data S2.
Statistical analysis
Descriptive statistics were computed, and data are given as medians with their respective 25th and 75th percentiles or 95% confidence intervals. The Kolmogorov–Smirnov test was used to verify the normality of continuous variables, which were compared between methods by Friedman's test. All statistical analyses were performed using SPSS (version 20.0; SPSS, Inc., Chicago, IL). Time and cost for each DNA extraction method were assessed considering 50 samples.
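To make the paired comparison concrete, the following is a minimal standard-library sketch of Friedman's test for three methods applied to the same samples. It is not the SPSS procedure used in the study: it omits the tie correction and any post hoc pairwise comparison, the function names are hypothetical, and the yield values are invented for illustration. For three methods the chi-square approximation has 2 degrees of freedom, so the p value reduces to exp(−Q/2).

```python
import math


def rank_row(values):
    """Rank values within one sample (1 = smallest), averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = average_rank
        i = j + 1
    return ranks


def friedman_three_methods(rows):
    """Friedman chi-square statistic (no tie correction) and p value for k = 3."""
    n, k = len(rows), 3
    rank_sums = [0.0] * k
    for row in rows:
        if len(row) != k:
            raise ValueError("each row must hold exactly three measurements")
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    p = math.exp(-q / 2.0)  # chi-square survival function with 2 degrees of freedom
    return q, p


if __name__ == "__main__":
    # Hypothetical DNA yields (ug) per sample: (Protocol A, Protocol B, in-House).
    yields = [(0.10, 0.30, 0.90), (0.20, 0.50, 0.70),
              (0.05, 0.40, 1.10), (0.15, 0.45, 0.80)]
    q, p = friedman_three_methods(yields)
    print(f"Q = {q:.2f}, p = {p:.4f}")
```

With small sample numbers the chi-square approximation is coarse, so an exact or software-reported p value would be preferable in practice.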
Results
A total of 13 randomly selected clinical samples were compared in this study between extraction methods. Six of them were collected in July 2020, stored at −80°C for 14 months, and at −20°C for 3 days before DNA extraction. The other seven samples were collected in March 2021, stored at −80°C for 3 months, and at −20°C for 45 days before DNA extraction. The patients' mean age was 53.2 years and 77% were women. Data on individual samples are given in Supplementary Data S1.
The cost and time for each DNA extraction method are described in Table 1. The time includes hands-on time as well as incubation and centrifugation steps to process 50 VTM samples, without taking into account the reagents' preparation time. Although the in-House method takes longer to execute, it is considerably less expensive than either of the other methods, based on the amount of each reagent used, all of which are commonly available in molecular biology laboratories.
Time and Cost for Each DNA Extraction Method for Extraction of 50 Viral Transport Medium Samples
Time and cost for processing 50 samples in one batch. Reagents preparation time not considered (∼2 hours). Commercial kits suggested input volume was used.
Value based on quantity of each reagent used, all commonly available in molecular biology laboratories.
Yield, purity, and integrity of extracted nucleic acids
The yield and purity ratios of DNA extracted from clinical samples are given in Table 2 and Figure 1a–d. The highest DNA median yield was obtained with the in-House DNA extraction method (0.81 μg), statistically different from protocol A (0.14 μg, p < 0.001), but not from protocol B (0.47 μg, p = 0.980). Protocol B produced the highest A260/280 and A260/230 ratios, significantly different from both Protocol A (p < 0.001 in both readings) and the in-House method (p = 0.001 and p = 0.019, respectively). The in-House protocol showed a higher total DNA recovery rate (86.8%) in the VTM samples spiked with lambda phage DNA, significantly different from protocol A (22.6%, p = 0.014), but not from protocol B (28.3%, p = 0.473; Fig. 1e).
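For readers reproducing these figures, total yield and spike-in recovery follow from simple arithmetic on the fluorometric concentration. The sketch below assumes yield is computed as concentration multiplied by the 50 μL elution volume and recovery as recovered mass divided by the 10 μg lambda DNA spike; the function names and the example concentration are hypothetical, chosen only so that the result lands near the reported 86.8%.

```python
ELUTION_UL = 50.0      # elution/resuspension volume used for all methods
SPIKE_MASS_UG = 10.0   # lambda DNA mass spiked into each clean VTM sample


def total_yield_ug(conc_ng_per_ul, elution_ul=ELUTION_UL):
    """Total DNA mass in micrograms from a concentration in ng/uL."""
    return conc_ng_per_ul * elution_ul / 1000.0


def recovery_percent(recovered_ug, spiked_ug=SPIKE_MASS_UG):
    """Percentage of the spiked DNA mass recovered after extraction."""
    if spiked_ug <= 0:
        raise ValueError("spiked mass must be positive")
    return 100.0 * recovered_ug / spiked_ug


if __name__ == "__main__":
    example_conc = 173.6  # hypothetical fluorometer reading, ng/uL
    recovered = total_yield_ug(example_conc)
    print(f"yield = {recovered:.2f} ug, recovery = {recovery_percent(recovered):.1f}%")
```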

DNA quantity and quality metrics. Plotted comparison of DNA yield by fluorometric reads
Analysis of DNA Yield and Purity Between Different Extraction Methods
Data are given as median (25th–75th percentiles). Values indicated with different letters are significantly different.
Sample storage time and temperature did not have a significant effect on DNA yield with any of the tested extraction methods. A detailed description of the spectrophotometric analysis of the additional 452 samples extracted by the in-House method is given in Supplementary Data S3. No relevant differences were observed in DNA integrity, evaluated by band patterns on 1% agarose gel, among samples from the different extraction methods (Fig. 2).

Agarose gel electrophoresis of differentially extracted samples. All 13 extracted DNA samples (lanes 1–13) from the three methods were tested. A volume of 2 μL sample was used. A marker (M) of 1 kb was used to infer fragment length distribution.
PCR amplification, sequencing, and genotyping
All DNA samples obtained with the three extraction methods produced clear discrimination between genotypes of the TLR3 rs3775291 SNP in the real-time genotyping experiment, which included positive control reactions for each genotype (Fig. 3a). All DNA samples from the three methods also produced the 5-HTR2A amplification fragment with high yield and specificity in a standard PCR reaction, and the purified PCR products from all methods performed equally well and generated clear sequences (Fig. 3b, c).

Results of PCR experiments with in-House extracted samples.
Discussion
In the course of the COVID-19 pandemic there has been a huge effort in the collection and analysis of nasopharyngeal samples which, in addition to pathogen detection, could also generate a wealth of host genomic information, aiding in the discovery and understanding of disease prognostic predictors and genetic risk factors.13,14 In this context, the main purpose of this study was to develop and describe a rapid, simple, and affordable isolation method capable of recovering host DNA in sufficient amounts and quality for biobanking purposes and use in research.
Regarding sample purity, protocol B showed the highest values for both parameters, with ratios higher than usually observed (Table 2). An A260/280 ratio of ∼1.8 and an A260/230 ratio of 2.0 to 2.2 are generally accepted as “pure” for DNA, according to Thermo Scientific Technical Bulletin T042. The out-of-range readings in samples processed with this kit are possibly due to the addition of carrier RNA to the lysis buffer and the presence of sodium azide in buffer AVE, used for elution. Carrier RNA is added to the lysis buffer to improve nucleic acid binding and prevent degradation, whereas sodium azide is added to the elution buffer to prevent microbial growth and nuclease contamination. As stated in the kit's handbook, both sodium azide and carrier RNA may affect spectrophotometric absorbance readings at 260 nm, hindering spectrophotometric quantification and purity analysis, but have no effect on downstream applications.15,16
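The purity criteria cited above can be expressed as a simple check on the two absorbance ratios. This is only a sketch: the function name `purity_flags` is hypothetical, and the ±0.1 tolerance around 1.8 for A260/280 is an assumption made here to give "∼1.8" a concrete meaning, not a value taken from the bulletin.

```python
def purity_flags(a260_280, a260_230, tol_280=0.1):
    """Flag whether each ratio falls in the range generally accepted as pure DNA.

    tol_280 is an assumed tolerance around the nominal A260/280 value of 1.8.
    """
    return {
        "a260_280_ok": abs(a260_280 - 1.8) <= tol_280,
        "a260_230_ok": 2.0 <= a260_230 <= 2.2,
    }


if __name__ == "__main__":
    # Hypothetical readings: one within range, one above range on both ratios.
    for ratios in [(1.85, 2.10), (2.60, 3.00)]:
        print(ratios, purity_flags(*ratios))
```

As noted for protocol B, readings flagged by such a check may reflect buffer components rather than a problem with the DNA itself, so the flags are best read alongside fluorometric quantification.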
To overcome contaminant interference, fluorometric DNA quantification was performed in the samples used for yield comparison. The in-House protocol showed a higher median DNA yield than both kits, but was statistically different only from protocol A, indicating that its efficiency in extracting DNA from these samples is comparable with that of the commercial kits. The in-House method also exhibited a higher percentage recovery of lambda DNA from VTM samples than both kits, although the difference was statistically significant only for protocol A (p = 0.014), as reported in the Results. The higher recovery of the highly available cell-free lambda DNA with the in-House method in comparison with both commercial kits suggests that one or more sample matrix components may influence the efficiency of silica-based DNA purification.17,18
Recently, Gorzynski et al. from Stanford University described a method for high-throughput sequencing of SARS-CoV-2 and the host genome in which DNA was extracted from 200 μL VTM samples using the Qiagen DNeasy® Blood and Tissue kit. The mean DNA mass obtained from the 160 analyzed samples was 200 ng (min 0 ng, max 4700 ng).19 In our study, an input sample volume of 400 μL was used for DNA extraction and a final mean yield of 646 ng was obtained (min 94.75 ng, max 5000 ng). Thus, relative to the Qiagen DNeasy Blood and Tissue kit performance reported by Gorzynski et al., the in-House protocol used twice the sample volume as input and yielded approximately three times the purified DNA, presenting comparable performance.
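The comparison with Gorzynski et al. can be made explicit by normalizing the mean yield to the input volume. The sketch below uses only the figures quoted above; the function name is hypothetical, and because the two studies differ in kit, cohort, and handling, the resulting ratio should be read as a rough indication rather than a controlled comparison.

```python
def yield_per_input_ul(mean_yield_ng, input_ul):
    """Mean DNA yield (ng) obtained per microliter of VTM input."""
    if input_ul <= 0:
        raise ValueError("input volume must be positive")
    return mean_yield_ng / input_ul


if __name__ == "__main__":
    in_house = yield_per_input_ul(646.0, 400.0)  # this study
    dneasy = yield_per_input_ul(200.0, 200.0)    # Gorzynski et al.
    print(f"in-House: {in_house:.3f} ng per uL of input")
    print(f"DNeasy:   {dneasy:.3f} ng per uL of input")
    print(f"raw yield ratio:       {646.0 / 200.0:.2f}")
    print(f"per-input yield ratio: {in_house / dneasy:.2f}")
```

Normalized this way, the roughly threefold difference in raw yield shrinks to well under twofold per unit of input, which is consistent with describing the two approaches as comparable.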
Gel electrophoresis analysis did not indicate significant differences in migration patterns of DNA samples across the three extraction methods, suggesting neither differential DNA recovery based on molecule size nor considerable DNA degradation. Lambda DNA samples also did not present differential migration patterns across methods. To evaluate whether the extracted DNA samples could be used in downstream molecular analysis, we performed a TaqMan genotyping assay, PCR amplification, and amplicon sequencing. All samples extracted with protocols A and B and with the in-House method produced clear and consistent genotype discrimination in the TaqMan assay and were easily amplified by PCR, producing clear sequencing reads.
These results show that DNA extracts obtained with the in-House method did not contain inhibitors in amounts sufficient to significantly impair amplification, suggesting the suitability of this method for large-scale applications. It is important to note that samples processed with protocol B (QIAamp Viral RNA Kit) for the purpose of SARS-CoV-2 detection can be used for host genetic analysis, providing the opportunity to use already processed samples for additional applications in biobanking and research.
The high variability in DNA yield observed with the same sample input volume across the different extraction protocols likely reflects the number of cells in each sample. The samples were collected by different operators, from patients with no prior preparation, of different ages and with different viral loads, factors that significantly influence cell richness in this kind of respiratory tract sample.20 Another factor that can influence nucleic acid yield is the composition of the VTM.21 Our study included only one type of VTM (based on Hank's balanced salt solution with bovine serum albumin, phenol red, gentamicin, and amphotericin B), but COVID-19 testing swabs are also eluted in other media and even in phosphate-buffered saline. It has been demonstrated that eluting swabs directly in lysis buffer is also an efficient way to preserve nucleic acids, as it lowers the risk of nuclease degradation.22 In our study, sample storage time and temperature did not significantly enhance or reduce DNA yield or purity in any of the tested methods, suggesting that VTM effectively conserved DNA for this time period.
One limitation of our study is the relatively low number of samples. Testing the same samples across different extraction methods, however, allowed us to infer the usefulness of the tested protocol in comparison with the commercial DNA extraction kits. Other kits could be used in comparison, including the QIAamp Circulating Nucleic Acid Kit, MagMAX Cell-Free DNA Isolation Kit, and Wizard Genomic DNA purification Kit. In addition, Trizol and Chelex-100 resin could be used to test those compounds' efficiency in removing sample contaminants. Analysis by nanoelectrophoresis would be more informative than agarose gel electrophoresis, providing a better resolution of DNA fragmentation amongst different methods. Considering that the extracted DNA comes from both human and bacterial cells, it would be more accurate to quantify it through quantitative PCR using specific primers.
In our study, an input volume of 400 μL was chosen to test samples for all three methods. The in-House protocol permits higher input volumes, expecting higher yields, using proportionally higher amounts of reagents in lysis, protein, and DNA precipitation steps. Some other points can be adjusted in the protocol for optimization according to sample conditions. In samples with high cell or mucus content, higher concentrations of proteinase K 23 or the reducing agent dithiothreitol can be used to form a homogenate. 24 Different salts, including sodium acetate, sodium chloride, and potassium acetate can be used in salting-out and/or DNA precipitation steps, which can be performed with isopropanol, as described, or ethanol, with or without incubation or use of coprecipitants, such as linear polyacrylamide or glycogen.25–27 Additional information about protocol optimization suggestions can be found along with detailed description in Supplementary Data S2.
Furthermore, the total processing time of the in-House protocol was longer than that of both commercial kits because of longer centrifugation, drying, and resuspension steps, not counting reagent preparation time, which can take up to 2 hours. The hands-on time, however, is almost the same, and if greater sample volumes are used the in-House procedure is faster than both kits, which would require repeated lysate loading steps. An inherent advantage of commercial kits is the possibility of automation, which would substantially reduce hands-on time and diminish contamination risks. The described protocol, in turn, can be considerably more affordable than both kits. Its relatively easy execution and especially its low cost make it a viable alternative for application in the biobanking workflow for this kind of sample, especially in low- and middle-income countries.
Conclusions
We described a simple, efficient, and cost-effective method for DNA extraction from residual VTM used in SARS-CoV-2 diagnostic testing. The results obtained with this in-House protocol were comparable with those of the tested commercial extraction kits, considering DNA quantity, integrity, and performance on downstream applications. This, in addition to the remarkably low cost of the described in-House method, proved its suitability and potential for large-scale application to generate samples for DNA biobanking and use in host genetic analysis.
Footnotes
Acknowledgments
The authors are grateful for the scholarships provided by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brasília, Brazil. FSLV is the recipient of a CNPq scholarship, grant number 312960/2021-2.
Authors' Contributions
R.C.S.: conceptualization, data curation, formal analysis, investigation, methodology, writing—original draft, writing—review and editing. M.F.F.: data curation, formal analysis, investigation, validation, visualization, writing—review and editing. N.A.C.: data curation, formal analysis, investigation, validation, visualization, writing—review and editing. G.C.G.: data curation, investigation, validation, writing—review and editing. T.W.K.: conceptualization, formal analysis, methodology, validation, writing—review and editing. T.S.G.: data curation, investigation, resources, supervision, writing—review and editing. J.A.B.C.: funding acquisition, project administration, resources, supervision, writing—review and editing. F.S.L.V.: conceptualization, formal analysis, funding acquisition, methodology, project administration, resources, supervision, visualization, writing—review and editing.
Ethical Approval
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Research Ethics Committee of the Hospital de Clínicas de Porto Alegre (HCPA) under the number 30797220.9.0000.5327.
Author Disclosure Statement
No conflicting financial interests exist.
Funding Information
This study was supported with scholarships and financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001 (Process number 23038.003012/2020-16).
References
