Abstract
The current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak demonstrates the potential of coronaviruses, especially bat-derived beta coronaviruses, to rapidly escalate to a global pandemic that has already caused deaths on the order of several million. The huge efforts put in place by the scientific community to address this emergency have shown that implementing new technologies in the prepandemic period is crucial to face future ecological crises in a timely manner. In this context, we argue that metagenomics and new approaches to understanding ecosystems and biodiversity offer real prospects to innovate therapeutics and diagnostics against novel and existing infectious agents. We discuss the opportunities and challenges associated with the science of metagenomics, specifically with an eye to informing and preventing the ecological crises and pandemics that are looming on the horizon in the 21st century.
Introduction
In the past decade, the world witnessed increasing examples of emerging viral diseases. Since 2009, the World Health Organization (WHO) has declared six “Public Health Emergencies of International Concern”: H1N1 (swine flu), polio, West Africa Ebola, Zika, the ongoing Kivu Ebola, and the recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Wilder-Smith and Osman, 2020); H1N1 and SARS-CoV-2 escalated to pandemics. Notably, both H1N1 and SARS-CoV-2 viral infections have a zoonotic origin (Andersen et al., 2020; Boni et al., 2020; Smith et al., 2009); that is, they were transmitted to humans from wild or domestic animals through direct contact or indirectly through exposure to the urine or feces of infected animals.
Host shift is a hallmark of RNA viruses that can rapidly evolve and easily cross the species barrier by mutation, recombination, or reassortment of their genetic material, taking advantage of high polymorphism and low fidelity of RNA-dependent RNA polymerases (Drew, 2011; Wilke et al., 2001). As recent history shows, the emergence of zoonotic viruses from wildlife can have a strong impact on public health and generate a pandemic risk.
Viral metagenomics consists of the characterization of the complete viral diversity in a sample isolated from an organism or an environment using high-throughput sequencing technologies. The rise of the SARS-CoV-2 outbreak highlights that viral metagenomics may help in the discovery of causative viral agents of zoonotic diseases, suggesting that these methodologies could be widely applied to yield actionable tools to prevent the pandemic spread of further viruses in the next decades (Fig. 1). In this review, we discuss how metagenomics offers hopes and opportunities for the diagnosis, surveillance, and prevention of zoonotic infections in a world facing not only the current coronavirus disease 2019 (COVID-19) pandemic but also future ecological crises.

Fig. 1. How metagenomics might improve our management of current and future epidemiological crises.
Zoonotic Infections
What we learned from SARS-CoV-2 and other RNA viruses
The ongoing SARS-CoV-2 pandemic has abruptly exposed the vulnerability of the current globalized and interconnected society. Although the natural reservoirs of SARS-CoV-2 have not yet been conclusively identified, a viral shift from a wildlife reservoir is considered the most likely origin of this virus (Andersen et al., 2020; Boni et al., 2020; Smith et al., 2009). Nevertheless, the lack of genetic data documenting the recent evolution of this virus in wild or domesticated animals leaves open many hypotheses that cannot be formally proved or disproved.
Currently, based on the close genomic similarity (96% overall genome identity) between SARS-CoV-2 and RaTG13, sampled from a Rhinolophus affinis horseshoe bat in 2013 in Yunnan Province, wild horseshoe bats are considered the most likely candidate as the wildlife reservoir of SARS-CoV-2 (Zhou et al., 2020). Despite huge ongoing efforts to isolate and identify further related coronaviruses (Hul et al., 2021; Zhou et al., 2021), RaTG13 still represents the most closely related coronavirus genome identified thus far.
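As an illustration of what an overall identity figure measures, the following minimal sketch computes percent identity between two sequences that have already been aligned. It is not the procedure used in the cited study: the function name, the convention of skipping gap columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    Other conventions (e.g., counting gaps as mismatches) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # gap column: excluded from the comparison
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy sequences, not real viral genomes.
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
identity = percent_identity(toy_a, toy_b)
```

In practice the two genomes would first have to be aligned with a dedicated tool, and the reported value depends on how gaps and ambiguous bases are treated.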
Notably, estimates of the divergence between SARS-CoV-2 and the closest genomically characterized coronavirus RaTG13 (Zhou et al., 2020) date the last common ancestor back to the middle of the 20th century (Boni et al., 2020), suggesting that several still undisclosed related coronaviruses might exist in wildlife reservoirs. A further hypothesis postulates putative intermediate hosts between the wild RaTG13 reservoir and humans (Zhang et al., 2020). Indeed, most of these hypotheses stem from the large knowledge base on coronaviruses collected after the SARS-CoV-1 and Middle East respiratory syndrome (MERS) outbreaks.
In the past decades, several epidemics were associated with RNA viruses that moved from natural reservoirs to human hosts: H1N1 (Smith et al., 2009), HIV-1, HIV-2 (Reperant and Osterhaus, 2017; Sharp and Hahn, 2011), Ebola (Jacob et al., 2020), SARS-CoV-1 (Ksiazek et al., 2003), MERS coronavirus (MERS-CoV) (Zaki et al., 2012), and SARS-CoV-2 (Zhou et al., 2020). However, further RNA virus spillovers with human-to-human transmission that did not result in large outbreaks have also been reported (Chua et al., 2007; Martinez et al., 2005; Plowright et al., 2019; Uehara et al., 2019).
Such anecdotal evidence is supported by quantitative studies that have formally investigated the dynamics of RNA virus evolution and host shift (Dolan et al., 2018a, 2018b) and consistently demonstrated that RNA viruses are more likely to shift their host compared with DNA viruses (Dolja and Koonin, 2018; Longdon et al., 2014; Olival et al., 2017; Wells et al., 2020). Indeed, topological comparison of phylogenetic trees of DNA viruses and their hosts highlighted close similarities, suggesting that for DNA viruses host shifts are a rare event. On the contrary, a similar approach on RNA viruses highlights a very poor overlap between virus and host phylogenetic trees, confirming the propensity of RNA viruses to frequent host shifts (Geoghegan et al., 2017).
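The idea of topological comparison can be illustrated with a small sketch. One common congruence measure is the Robinson-Foulds distance, which counts the clades present in one tree but not in the other; it is used here only as an example and is not necessarily the method of the cited studies. The nested-tuple tree encoding, the function names, and the four-host toy trees are assumptions, and each virus tip is assumed to be labeled with the host it infects.

```python
def clades(tree):
    """Return (all leaves, set of nontrivial clades) for a rooted tree.

    A tree is a nested tuple whose leaves are strings, e.g. (("A", "B"), "C").
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the root clade is shared by definition
    return all_leaves, found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades found in only one tree."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same leaf labels")
    return len(clades_a ^ clades_b)


def normalized_rf(tree_a, tree_b):
    """Scale to [0, 1], assuming binary trees with at least three leaves."""
    n_leaves = len(clades(tree_a)[0])
    return rf_distance(tree_a, tree_b) / (2 * (n_leaves - 2))


# Hypothetical host tree and two virus trees labeled by host.
host_tree = ((("A", "B"), "C"), "D")
codiverging_virus = ((("A", "B"), "C"), "D")  # mirrors the host tree
host_shifting_virus = (("A", "D"), ("B", "C"))  # shares no clade with it

congruent = normalized_rf(host_tree, codiverging_virus)
incongruent = normalized_rf(host_tree, host_shifting_virus)
```

A value near 0 indicates that the virus tree mirrors the host tree, as expected under co-divergence, whereas a value near 1 indicates little shared topology, as expected when host shifts are frequent.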
Epidemiological analyses suggest that most spillover events may remain overlooked even for RNA viruses causing severe diseases such as Ebola (Glennon et al., 2019). Accordingly, retrospective phylogenetic analyses established that two independent HIV viruses and multiple different serotypes spilled over to Homo sapiens in several distinct events over the past decades (Sharp and Hahn, 2011). On these premises, it is reasonable to cautiously speculate that the reported outbreaks represent only a fraction of the viral spillover events that actually occur.
The plasticity of RNA viruses also increases the likelihood of a “reverse” spillover from humans to a new wild or domesticated animal reservoir. In silico studies have highlighted that the spike protein of SARS-CoV-2 can proficiently bind the angiotensin-converting enzyme 2 (ACE2) receptor of a wide range of mammals (Fischhoff et al., 2021; Luan et al., 2020; Zhai et al., 2020). Experimental studies have since confirmed that SARS-CoV-2 can infect several mammals, including hamsters (Chan et al., 2020; Sia et al., 2020; Trimpert et al., 2020), monkeys (Woolsey et al., 2021), ferrets (Richard et al., 2020; Schlottau et al., 2020), and minks (Koopmans, 2021; Munnink et al., 2021). Notably, in some of these models, sustained animal-to-animal airborne transmission has been documented (Richard et al., 2020).
A further warning is that transmission back from minks to humans has been documented (Munnink et al., 2021). SARS-CoV-2 antibodies were detected in escaped minks, suggesting that spread of SARS-CoV-2 to wildlife is, at least in theory, possible (Shriner et al., 2021). Moreover, New World rodents were shown to be susceptible to SARS-CoV-2 infection (Fagre et al., 2020). These observations raise the possibility that a wildlife or domesticated reservoir of SARS-CoV-2 may soon be established, and the spillover of SARS-CoV-2 to minks in the Netherlands provided a robust proof of principle (Koopmans, 2021).
The ongoing global vaccination efforts will likely provide most communities with a sustained herd immunity against the currently circulating SARS-CoV-2 variants. However, taking into account the sustained mutation rate and the unpredictable selective pressures that SARS-CoV-2 may experience in wild or domesticated animal reservoirs, a further spillover of novel variants to humans may be foreseen in the next decades. Indeed, rapid selection of spike protein variants upon infection of cells expressing a suboptimal host receptor has been observed for MERS-CoV (Letko et al., 2018). SARS-CoV-2 spike protein variants represent a significant threat to the ongoing worldwide vaccination efforts (Madhi et al., 2021) and animal reservoirs might represent the ideal environment for the development of such variants.
Re-emergence of viral diseases from sylvatic reservoirs is also a concern for other human pandemic viruses, both known and unknown. Indeed, the end of smallpox vaccination programs, after eradication of the disease, is associated with the re-emergence of monkeypox from presumed wildlife reservoirs in sub-Saharan Africa (Patrono et al., 2020; Simpson et al., 2020). Smallpox vaccination likely conferred sufficient immunity to monkeypox, and the re-emergence of this virus might represent a public health challenge.
Metagenomics: A Powerful Tool for Facing and Preventing Zoonotic Infections
The past two decades have witnessed an exponential growth of the field of metagenomics, and characterization of the virome by shotgun sequencing is nowadays a mature technique (Garmaeva et al., 2019; Khan Mirzaei et al., 2021; Shkoporov et al., 2018). These approaches provide invaluable tools to face future viral pandemics. The current pandemic has highlighted that third-generation sequencing technology may increase the throughput and efficacy of virome characterization (Kim et al., 2020). Indeed, Oxford Nanopore technology is a valuable tool in this scenario because it can directly sequence long RNA reads (Bull et al., 2020; Lewandowski et al., 2019; Viehweger et al., 2019), which makes it particularly fit for the characterization of RNA viral genomes.
The integrative Human Microbiome Project has allowed us to attain an exhaustive knowledge of the communities inhabiting diverse body sites under different conditions (Proctor et al., 2019). However, several environmental niches remain relatively unexplored, and the virome is a thus far overlooked component of the microbiome in several environments. Recent findings suggest that the DNA and RNA viruses identified and characterized thus far represent only a fraction of the total virome (Tisza et al., 2020; Wolf et al., 2020; Zhang et al., 2018).
In line with the One Health perspective, a global integrative multi-OMICS approach might be propelled by the development of extensive and exhaustive viral genome databases. Recently, widespread antimicrobial resistance provided a benchmark of this approach (Lammie and Hughes, 2016). Characterization of the vertebrate virome from a One Health perspective (Kelly et al., 2017), involving professionals in the fields of virology, molecular biology, veterinary science, and ecology, would certainly require a considerable effort in terms of financial, computational, and technological resources. However, recent multi-OMICS approaches to the study of the resistome suggest that such efforts could actually be beneficial in facing future pandemics. We can foresee feasible approaches that could greatly benefit from the development of an extensive database of viral genomes.
When still-unknown viruses emerge, no molecular or serological diagnostic assays are available for their detection. In this context, extensive characterization of viromes through metagenomics is the only strategy to both identify such viruses and develop actionable biotechnological tools (quantitative polymerase chain reaction tests, monoclonal antibodies, and vaccines) for field epidemiology.
Metagenomic sequencing has the unprecedented power of profiling viral populations in wildlife hosts that have close and frequent contact with humans and domestic animals. Accordingly, metagenomic-based surveillance of urban rats (Firth et al., 2014) and mice (Williams et al., 2018) as well as blood-feeding arthropods (Bouquet et al., 2017; Coffey et al., 2014) has revealed a variety of viruses that are phylogenetically related to human-infecting taxa and might, therefore, be considered at high risk for zoonoses.
The intensification of livestock farming by increasing population size and density might facilitate disease transmissions within herds and between livestock and humans (Jones et al., 2013). Therefore, the potential emergence of zoonoses from livestock populations should not be underestimated. In a recent review, Kwok et al. (2020) analyzed and summarized the data from 120 published records of viral metagenomics in common livestock, including cattle, small ruminants, poultry, and pigs, to identify background virus diversity profiles of common farm animals to guide the survey of future zoonoses.
Thus far, virome characterization has been largely achieved through several independent initiatives. It is a matter of debate whether ongoing coordinated global efforts toward a Global Virome Project (Carroll et al., 2018) would be justified (Holmes et al., 2018; Jonas and Seifman, 2019). Assembling, cataloging, and making publicly available the genomes of DNA and RNA viruses infecting domestic and wild hosts is an ambitious yet reachable achievement in the next two decades. The cost of a 10-year global effort toward exhaustive metagenomic characterization of the global virome is estimated at about 3–4 billion U.S. dollars (USD) (Jonas and Seifman, 2019).
Compared with the estimated 16 trillion USD economic losses of the current pandemic (Cutler and Summers, 2020), a Global Virome Project would be worth being supported even if the best expected outcome was shortening by 1 month the resolution of the next pandemic.
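The scale of this comparison can be made explicit with a back-of-the-envelope calculation. The cost and loss figures below are the ones quoted in the text. The pandemic duration, and the premise that losses accrue evenly over it, are purely illustrative assumptions that appear in neither cited estimate.

```python
# Figures quoted in the text.
PROJECT_COST_USD = (3e9, 4e9)  # 10-year Global Virome Project estimate
PANDEMIC_LOSS_USD = 16e12      # estimated losses of the current pandemic

# Hypothetical assumption for illustration only: losses accrue evenly
# over this many months.
ASSUMED_DURATION_MONTHS = 24


def cost_share(cost_usd: float, loss_usd: float) -> float:
    """Project cost expressed as a fraction of the pandemic losses."""
    return cost_usd / loss_usd


def benefit_cost_ratio(cost_usd: float, loss_usd: float,
                       duration_months: float,
                       months_saved: float = 1.0) -> float:
    """Losses avoided by ending the pandemic earlier, per dollar spent."""
    avoided_usd = loss_usd / duration_months * months_saved
    return avoided_usd / cost_usd


for cost in PROJECT_COST_USD:
    share = cost_share(cost, PANDEMIC_LOSS_USD)
    ratio = benefit_cost_ratio(cost, PANDEMIC_LOSS_USD,
                               ASSUMED_DURATION_MONTHS)
    print(f"cost {cost:.0e} USD: {share:.4%} of losses, "
          f"benefit/cost for 1 month saved = {ratio:.0f}")
```

Under these assumptions the project costs a few hundredths of a percent of the estimated losses, so a one-month gain would repay it many times over. The exact ratio scales inversely with the assumed duration.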
Notably, evaluating which of the newly discovered viruses pose sufficient risk of infecting humans to deserve detailed laboratory characterization and surveillance remains largely speculative. We should acknowledge that predicting which virus will cause the next pandemic is not feasible. Nevertheless, such an effort would propel in silico and in vitro analyses, which may in turn provide preliminary data to prioritize viral genomes possibly encoding proteins able to mediate viral entry into human cells.
Algorithms to identify potential zoonotic viruses have been proposed (Olival et al., 2017), along with criteria for zoonosis prioritization (Rist et al., 2014). In a recent pilot study, viral metagenomic data from feces and saliva of common vampire bats were analyzed using machine learning to rank the newly identified viruses by their relative likelihood of human infection based on viral genome sequences (Bergner et al., 2021).
Application of molecular modeling of viral surface proteins extracted from prioritized genomes could efficiently help to identify viral surface proteins and predict their putative binding to human cell surface proteins, which might act as viral receptors (Cho and Son, 2019; Kruglikov et al., 2021; Yan et al., 2019). Further improvements and combinations of this kind of approach might help in both assessing the risk of zoonoses and prioritizing surveillance resources on those most likely to pose a threat of emergence in humans (Kress et al., 2020).
Evidence that SARS-CoV-2 crossed the species barrier at least twice (from a wildlife reservoir to humans and from humans to minks) clearly exemplifies the urgency of setting up coordinated actions to characterize potentially pandemic viruses through metagenomics not only in wild but also in domesticated animal reservoirs, including farm animals.
Conclusions and Outlook
The current pandemic witnessed a race to develop biotechnological tools to detect, diagnose, treat, and prevent SARS-CoV-2 spread. Although real-time reverse transcription polymerase chain reaction (rRT-PCR) assays were deployed about 10 days after the publication of the SARS-CoV-2 genomic sequence (Corman et al., 2020), the first monoclonal antibodies against viral proteins were reported about 4 months later (Sun et al., 2020; Wu et al., 2020) and phase 3 vaccine trials started a few months later. Reducing the deployment time of each of these biotechnological tools by a few months would probably have had a dramatic impact on limiting the spread of SARS-CoV-2 and consequently reduced the socioeconomic effects of the pandemic.
We are witnessing the success of mRNA vaccines against SARS-CoV-2 infection, and current data suggest that mRNA-based vaccines provide a flexible, rapid, and scalable platform for immunizations (Tombácz et al., 2021). This innovative tool parallels the recent explosion of monoclonal antibody technologies. On the other hand, huge efforts aiming at drug repurposing, mainly focused on host cell actionable targets, were not entirely successful (Cavalcanti et al., 2020; Rubin et al., 2021). Such results highlight the potential of global virome characterization efforts. Overall, these observations suggest that further investigation of the host response and of host interactions with viruses is required before successful strategies leveraging cellular targets can be rapidly deployed.
Devoting sufficient resources to virome characterization would allow us to compile a list of hundreds to thousands of potentially pandemic viruses. The obvious observation that guessing the next pandemic virus is not possible argues, in our opinion, in favor of an extensive unbiased catalog, as science does not aim at exact prediction of the future but is rather committed to building large knowledge bases for future challenges. We are aware that global virome characterization requires a huge financial effort and collective action by the scientific community. Nevertheless, recent EU-financed projects approached a 1-billion budget for a 10-year timeline (Abbott, 2019), an effort similar to the one estimated for a Global Virome Project (Jonas and Seifman, 2019). We believe that the global scientific community can express the managerial skills along with the technical and scientific tools to undertake this challenge.
It is tempting to speculate that current compound libraries that are routinely screened by pharmaceutical companies to identify actionable small molecules will be flanked by a set of monoclonal antibody libraries against surface antigens of potentially pandemic viruses and by corresponding mRNA libraries for rapid deployment of mRNA vaccines against future pandemics. Anticipating the first steps of the development of such reagents would largely increase preparedness and the timely deployment of tools to face future crises, at a cost corresponding to a fraction of the financial losses caused by the current pandemic, not to mention the large number of human lives that could be saved.
Footnotes
Author Disclosure Statement
The authors declare they have no conflicting financial interests.
Funding Information
No funding was received for this article.
