Abstract
New SARS-CoV-2 variants are constantly emerging and putting a strain on public health systems by spreading faster and potentially evading the immune protection conferred by vaccination. One of these is the B.1.1.7 variant, which was initially described in the United Kingdom and has subsequently spread to several countries. Monitoring the amplification of the S gene, a major hotspot for molecular evolution, by reverse transcription polymerase chain reaction (RT-PCR) allows rapid screening for such variants. This report describes the detection of sequence variants in Romania using this strategy, followed by next-generation sequencing of the entire genome for confirmation and further characterization. One B.1.1.7 and three B.1.258 sequences were confirmed. Each of these strains presented additional mutations with a possible impact on replicative capacity. Public health strategies should be devised to ensure molecular monitoring of SARS-CoV-2 evolution during the pandemic and to allow an adequate and rapid reaction.
The evolution of SARS-CoV-2 is mainly driven by spontaneous mutations and recombination, immune selective pressure, and other stochastic factors. New variants with mutations conferring potential replicative advantages have been constantly reported during the pandemic. One recent variant is B.1.1.7 (VOC 202012/01), which has spread efficiently in the United Kingdom since September 2020 and carries numerous mutations and deletions in its genome. Mutations in the Spike gene are probably responsible for its increased transmissibility.
Soon after the variant was described in the United Kingdom, other countries started reporting its presence. Since December 2020, we have implemented a pilot program to screen for B.1.1.7 strains using the TaqPath COVID-19 CE-IVD reverse transcription polymerase chain reaction (RT-PCR) kit (Thermo Scientific). The samples were collected mainly from patients presenting to the National Institute for Infectious Diseases, Bucharest, Romania. Samples that were positive by SARS-CoV-2 RT-PCR but showed no amplification of the S gene were selected for whole genome sequencing (WGS) through next-generation sequencing (NGS).
RNA extraction with the QIAamp® DSP Virus Kit (QIAGEN) was followed by rRNA depletion and DNA library preparation with the TruSeq Stranded Total RNA kit (Illumina). DNA libraries were sequenced using the Illumina® MiSeq® Reagent Kit v3. The optimized WGS bioinformatics pipeline used to generate the complete SARS-CoV-2 genome sequences consisted of a double-assembly approach in which de novo assembly was combined with multiple rounds of reference mapping, as previously described. 1 The final data set for phylogenetic analysis included 288 sequences: 153 Romanian sequences and 135 control sequences (some selected by BLAST search and others used as an outgroup). These were aligned with MAFFT v7.450 as implemented in Geneious Prime. The phylogenetic trees were generated with RAxML 8.2.11 (nucleotide model: GTR GAMMA I; algorithm: rapid hill-climbing; number of starting trees or bootstrap replicates: 1; parsimony random seed: 1). The WGS sequences generated in this study were submitted to the GISAID platform under the following accession numbers: EPI_ISL_794744 and EPI_ISL_807154–EPI_ISL_807156.
A total of 673,271 cases have been reported in Romania since the beginning of the SARS-CoV-2 epidemic, with a fatality rate of about 2.5%, mainly affecting the elderly (>60 years) and men. 2 The highest peak of new infections/deaths was registered in mid-November 2020.
Two subtypes, segregated in distinct geographic regions (B.1.1 in Southern and B.1.5 in Northern Romania), were circulating during the first months of the epidemic; the genetic diversity of SARS-CoV-2 strains increased significantly after travel restrictions were lifted. 1 According to an analysis of all the available sequences on the GISAID platform with the Pangolin subtyping tool, the most prevalent subtype in Romania is B.1.5 (31%), followed by B.1.2 (17%) and B.1.1 (15%).
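As a rough illustration of how lineage proportions like those above can be tallied, the sketch below counts lineage assignments and converts them to percentages. It assumes the per-sequence lineage calls have already been exported from the subtyping tool as a plain list; the function name `lineage_shares` and the toy input are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def lineage_shares(lineages):
    """Return (lineage, percent) pairs, most frequent lineage first."""
    counts = Counter(lineages)
    total = sum(counts.values())
    return [
        (name, round(100 * n / total, 1))
        for name, n in counts.most_common()
    ]


# Hypothetical toy input: one lineage call per sequence (not the study data).
toy_calls = ["B.1.5"] * 6 + ["B.1.2"] * 3 + ["B.1.1"] * 1
shares = lineage_shares(toy_calls)

for name, pct in shares:
    print(f"{name}\t{pct}%")
```

Percentages computed this way depend entirely on which sequences were deposited, so they describe the sequenced sample rather than the true population prevalence.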
Our screening strategy identified three sequences assigned to the B.1.258 subtype and one assigned to B.1.1.7. All these sequences shared one deletion in the S gene (del H69–V70). These sequences are represented in magenta in Figure 1, whereas other sequences previously reported to circulate in Romania are marked in blue. The defining genetic changes for the B.1.1.7 and B.1.258 lineages, as well as the newly observed mutations reported in this study, are represented on the SARS-CoV-2 genome map in Figure 1. The three sequences assigned to the B.1.258 subtype were closely related to a sequence collected at the end of 2020 in Abruzzo, Italy. The B.1.258 samples were all from a closely knit religious community with limited observance of social distancing measures. Similar strains have been reported in the Czech Republic, Ireland, and, to a greater extent, Denmark. 3 This variant is also characterized by the N439K substitution, which improves the interaction between ACE2 and the receptor-binding motif and has been reported to reduce sensitivity to neutralizing antibodies in vitro. 4 Aside from the I1683T mutation in NSP3 and the H290Y mutation in NSP13, which are known to belong to the B.1.258 lineage, most of the other mutations (Fig. 1) found in the new Romanian SARS-CoV-2 strains are new or very rarely attributed to this lineage.
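The shared del H69–V70 can be read directly off an alignment of the spike protein against a reference. The following is a minimal sketch under stated assumptions, not the pipeline used in this study: it expects two already aligned amino acid strings with `-` marking gaps, the function names are hypothetical, and the reference fragment is meant to represent spike residues 61 to 80 and should be verified against the actual reference sequence before reuse.

```python
def find_deletions(ref_aln, query_aln, ref_start=1):
    """List gap runs in the query as (first_res, first_pos, last_res, last_pos).

    Positions follow reference numbering, starting at ref_start.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    deletions = []
    run = None            # deletion currently being extended, if any
    pos = ref_start - 1   # reference position of the last residue seen

    for ref_res, query_res in zip(ref_aln, query_aln):
        if ref_res == "-":
            # Insertion in the query: reference numbering does not advance.
            continue
        pos += 1
        if query_res == "-":
            if run is None:
                run = [ref_res, pos, ref_res, pos]
            else:
                run[2], run[3] = ref_res, pos
        elif run is not None:
            deletions.append(tuple(run))
            run = None

    if run is not None:
        deletions.append(tuple(run))
    return deletions


def format_deletion(deletion):
    """Render a deletion tuple in the 'del H69-V70' style."""
    first_res, first_pos, last_res, last_pos = deletion
    return f"del {first_res}{first_pos}-{last_res}{last_pos}"


# Illustrative fragment assumed to cover spike residues 61-80.
REF_FRAGMENT = "NVTWFHAIHVSGTNGTKRFD"
QUERY_FRAGMENT = "NVTWFHAI--SGTNGTKRFD"

found = find_deletions(REF_FRAGMENT, QUERY_FRAGMENT, ref_start=61)
labels = [format_deletion(d) for d in found]
print(labels)
```

Working on the protein alignment keeps the example short; a nucleotide-level check would additionally need the gene coordinates and reading frame.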

Figure 1. The characteristics of novel SARS-CoV-2 variants identified in Romania. The phylogenetic analysis of the sequences reported in this study (magenta bars) and all available Romanian sequences on the GISAID platform (blue bars). Specific clusters are highlighted (Top). SARS-CoV-2 genomic representation of the defining mutations for B.1.1.7 (dark magenta) and B.1.258 (green) lineages and the additional mutations observed in Romanian B.1.1.7 (magenta) and B.1.258 (khaki) sequences (Bottom).
The Romanian sequence assigned as VOC 202012/01 clustered with other sequences recently reported in Western European countries (The Netherlands, Finland, Germany, and France). This sequence was collected from a Romanian patient with mild COVID-like illness and no history of recent travel abroad. However, the patient lives in a small town where a relatively large proportion of inhabitants work abroad in Western European countries, including the United Kingdom. Further extended epidemiological inquiries in the area are ongoing.
The particular mutations present in the VOC 202012/01 spike protein might have biological implications: N501Y is located in the receptor-binding domain and is associated with increased binding affinity to the human ACE2 receptor; P681H seems to be important for infection and transmission because it lies adjacent to the furin cleavage site; and the 69–70 deletion is associated with increased viral infectivity in vitro and immune escape in immunocompromised patients. 5 This deletion, also described in other lineages, is most likely responsible for the failure of some diagnostic assays to detect the S-gene target.
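This S-gene dropout is what the RT-PCR screening step relies on: a sample that is positive on the other assay targets but shows no S-gene amplification is flagged for sequencing. A minimal sketch of such a rule follows, assuming a three-target assay (ORF1ab, N, and S) and a single Ct cutoff for calling a target amplified. The cutoff value, the `PcrResult` record, and the function names are assumptions made for illustration, not the criteria applied in this study.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoff: a target counts as amplified if its Ct is at or below this.
CT_POSITIVE_MAX = 37.0


@dataclass
class PcrResult:
    """Hypothetical per-sample record; None means no amplification."""
    sample_id: str
    orf1ab_ct: Optional[float]
    n_ct: Optional[float]
    s_ct: Optional[float]


def amplified(ct, cutoff=CT_POSITIVE_MAX):
    """True if the target produced a Ct value within the cutoff."""
    return ct is not None and ct <= cutoff


def is_sgtf_candidate(result):
    """Flag samples positive on ORF1ab and N but lacking S amplification."""
    others_positive = amplified(result.orf1ab_ct) and amplified(result.n_ct)
    return others_positive and not amplified(result.s_ct)


# Toy records (hypothetical values).
results = [
    PcrResult("sample-A", 22.1, 23.0, None),   # S dropout
    PcrResult("sample-B", 22.1, 23.0, 22.5),   # all targets amplified
    PcrResult("sample-C", None, None, None),   # negative sample
]
to_sequence = [r.sample_id for r in results if is_sgtf_candidate(r)]
print(to_sequence)
```

Because the deletion also occurs in other lineages, a flagged sample is only a candidate; lineage assignment still requires sequencing, as was done here.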
Additional mutations, not previously detected in VOC 202012/01, were present in the Romanian B.1.1.7 strain: P111T in NSP15, A119V in NSP14, and K68stop in ORF8. However, in a few sequences uploaded to the GISAID database in January 2021, K68stop appears together with the Q27stop mutation. The latter alters the protein and favors the accumulation of further mutations downstream (e.g., R52I, K68stop, and Y73C). 6 The ORF8 protein of SARS-CoV-2 was suggested to contribute to immune evasion. Cell culture data showed that ORF8 binds and downregulates the major histocompatibility complex class I, facilitating CD8 T cell escape. 7 Moreover, the ORF8 protein is a major target for the humoral immune response. 8 Viruses with deletions in the ORF8 gene were reported during the first months of the pandemic. A 382-nucleotide deletion in the ORF8 gene was shown not to impact virus fitness and was suggested to be an immune evasion strategy. 9
VOC 202012/01 was first detected in the United Kingdom in September 2020, and by mid-December it had become prevalent in three regions of England (South East, East of England, and London) and had dispersed throughout the country. 5,10 Starting at the end of December 2020, an increasing number of countries, mostly European, reported sequences corresponding to VOC 202012/01; some of them, such as Denmark, Germany, or Australia, reported samples collected as far back as November. 11 Several biological mechanisms might contribute to the high prevalence of this lineage. An important mechanism is most probably its increased infectiousness, as suggested both by mathematical models and by the observed lower Ct values. 5 Neither a shorter generation time nor the VOC's ability to escape the immune response has so far been supported by mathematical models as an explanation for the rapid dispersion of this lineage. 5,10
In conclusion, by screening for S-gene mutations using RT-PCR and WGS, we have identified four novel sequence variants: one B.1.1.7 and three B.1.258. Each of these strains presented additional mutations with a possible impact on replicative capacity. Our results and similar ones support the view that public health strategies should include molecular monitoring of SARS-CoV-2, so that changes that might affect transmissibility and/or antigenicity can be detected and acted upon as soon as possible. Such a system is currently being implemented in Romania.
Footnotes
Acknowledgments
The sequencing reagents were kindly provided by the National Red Cross Society in Romania. The authors thank Dr. Mihaela Stoian, who kindly provided samples collected at Deva County Emergency Hospital.
Author Disclosure Statement
No competing financial interests exist. The sponsor, the National Red Cross Society in Romania, had no role in the design, execution, interpretation, or writing of the study.
Funding Information
M.S. was supported by the Research Institute of the University of Bucharest (ICUB), grant no. 20964/30.10.2020. The study was supported by the POSCCE program, CRCBABI project (642/2014).
