Abstract
Background:
Alzheimer’s disease (AD) is a fatal neurodegenerative disease, the etiology of which is unclear. Previous studies have suggested that some viruses are neurotropic and associated with AD.
Objective:
By using bioinformatics analysis, we investigated the potential association between viral infection and AD.
Methods:
A total of 5,066 differentially expressed genes (DEGs) in the temporal cortex between AD and control samples were identified. These DEGs were then examined via weighted gene co-expression network analysis (WGCNA) and clustered into modules of genes with similar expression patterns. Of the identified modules, module turquoise had the highest correlation with AD. Module turquoise was further characterized using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis.
Results:
Our results showed that the KEGG pathways of the module turquoise were mainly associated with viral infection signaling, specifically Herpes simplex virus, Human papillomavirus, and Epstein-Barr virus infections. A total of 126 genes were enriched in viral infection signaling pathways. In addition, based on values of module membership and gene significance, a total of 508 genes within the module were selected for further analysis. By intersecting these 508 genes with those 126 genes enriched in viral infection pathways, we identified 4 hub genes that were associated with both viral infection and AD: TLR2, COL1A2, NOTCH3, and ZNF132.
Conclusion:
Through bioinformatics analysis, we demonstrated a potential link between viral infection and AD. These findings may provide a platform to further our understanding of AD pathogenesis.
INTRODUCTION
Alzheimer’s disease (AD) is the leading cause of age-related dementia and is characterized by progressive decline in cognitive capacities, behavioral and language deterioration, and loss of the ability to carry out daily functions [1, 2]. AD currently affects approximately 50 million people worldwide, a number estimated to reach 131.5 million by 2050. The AD-related annual societal economic cost was expected to rise to $1 trillion in 2018 and $2 trillion in 2030 [3]. As the population ages, the increasing prevalence of AD will have an even more serious impact on global healthcare systems.
Although significant advances have been achieved in recent AD research, there are still no effective treatments available to halt the progression of AD. AD is a complex disease with multiple risk factors, such as age, genetic makeup, family history, cerebrovascular diseases, diabetes, dyslipidemia, head injury, hypertension, smoking, and obesity [4–7]. It is generally believed that amyloid plaques and neurofibrillary tangles are pathological hallmarks of AD. However, these features are insufficient to fully explain many complexities of this disorder [8–10].
The link between viral infection and AD has recently attracted more attention in AD research. Involvement of infection in AD pathogenesis was first proposed by Dr. Oskar Fischer in 1907 [11]. Subsequent studies have suggested that some viruses are neurotropic and associated with central nervous system (CNS) diseases, including AD [12, 13]. Some viruses may invade the CNS through the trigeminal nervous system or the oral-nasal pathway, some may directly translocate across the blood-brain barrier, and some may infect the CNS by penetrating the gastrointestinal tract [14–16]. Many viruses can remain latent for decades before being reactivated in the CNS by stress, immune compromise, or other factors. Recent studies have further suggested that viral infection may be a risk factor for dementia [17–20]. Readhead et al. demonstrated that levels of human herpes virus (HHV)-6A and HHV-7 were higher in people who had AD than in controls [21]. Moreover, Romeo et al. reported that HHV-6A affected autophagy by infecting astrocytes and primary neurons, thereby promoting intracellular and extracellular Aβ accumulation and tau hyperphosphorylation, and eventually leading to the development of AD [12]. Similarly, Eimer et al. reported that Herpes simplex virus-1 (HSV-1) infection could be a trigger for persistent Aβ over-production and Aβ plaque development, either alone or in combination with impaired Aβ clearance [22]. Interestingly, increasing evidence indicates that Aβ peptides have antimicrobial and antiviral activities [22, 23]. Recent evidence suggests that anti-HSV drugs reduce Aβ and p-tau accumulation in the brains of infected mice [17]. These findings suggest that antiviral regimens might be helpful for AD therapy in the future.
To further elucidate the potential link between viral infection and AD, we used comprehensive bioinformatics to analyze differentially expressed genes (DEGs) in temporal tissue samples from AD patients and control samples.
MATERIALS AND METHODS
Microarray data analysis
GSE118553 expression profiles and related clinical information were retrieved from the Gene Expression Omnibus (GEO) website (https://www.ncbi.nlm.nih.gov/geo/) [24]. Temporal tissue samples (from 52 AD and 31 control patients) were included in the dataset. The corresponding GPL10558 platform annotation file, which covers more than 31,000 annotated genes with more than 47,000 probes, was used to map the probes to their target genes. If a target gene was annotated with two or more probes, the mean value was calculated. Among the targeted genes, the protein-coding genes were selected by referring to the human genome assembly GRCh38. Then, the limma package [25] for R was used to detect DEGs between AD and control samples. DEGs were screened with the following cut-off criteria: |log2 fold change (FC)| > 0.5 and p-value < 0.05.
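The two data-handling steps described above, averaging multiple probes per gene and screening by fold change and p-value, can be sketched as follows. This is a minimal illustration in plain Python, not the R/limma workflow used in the study; the probe IDs, gene names, and numeric values are hypothetical, and the per-gene statistics are assumed to come from a separate differential expression test.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe-level log2 expression values (two samples per probe);
# the probe IDs, gene names, and numbers are illustrative only.
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
probe_values = {"p1": [7.0, 8.0], "p2": [7.5, 8.5], "p3": [5.0, 5.5]}


def collapse_probes(probe_values, probe_to_gene):
    """Average all probes that map to the same gene, sample by sample."""
    grouped = defaultdict(list)
    for probe, values in probe_values.items():
        grouped[probe_to_gene[probe]].append(values)
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }


def screen_degs(stats, lfc_cutoff=0.5, p_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and p-value < p_cutoff."""
    return sorted(
        gene
        for gene, (log2_fc, p_value) in stats.items()
        if abs(log2_fc) > lfc_cutoff and p_value < p_cutoff
    )


gene_values = collapse_probes(probe_values, probe_to_gene)

# Hypothetical per-gene results (log2 fold change, p-value), standing in for
# the output of a differential expression test.
stats = {
    "GENE_A": (0.8, 0.001),   # passes both cut-offs
    "GENE_B": (-0.7, 0.01),   # passes: the absolute fold change is used
    "GENE_C": (0.3, 0.0001),  # fails the fold-change cut-off
    "GENE_D": (1.2, 0.2),     # fails the p-value cut-off
}
degs = screen_degs(stats)
```

Both up- and down-regulated genes pass the screen because the cut-off is applied to the absolute log2 fold change.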
Construction of co-expression networks
Modules of correlated genes were identified using weighted gene co-expression network analysis as implemented in the WGCNA R package. A soft-thresholding power (β = 12) was selected based on the approximate scale-free topology criterion and was used to calculate the adjacency. The adjacency matrix was transformed into a topological overlap matrix, from which the corresponding dissimilarity was calculated. Modules with a correlation higher than 0.8 were merged.
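The chain from correlation to adjacency to topological overlap to dissimilarity can be sketched in a few lines. This is a minimal sketch that assumes an unsigned network (adjacency = |correlation|^β) and the standard topological overlap formula; the text does not state which network type was used, and the 3-gene correlation matrix is hypothetical.

```python
def adjacency_from_correlation(cor, beta=12):
    """Soft-threshold adjacency a_ij = |cor_ij| ** beta (assumed unsigned network).

    The diagonal is set to 0 so that a gene does not count as its own neighbor.
    """
    n = len(cor)
    return [
        [0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]  # k_i = sum of adjacencies of gene i
    tom = [[1.0] * n for _ in range(n)]       # a gene fully overlaps with itself
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (
                min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            )
    return tom


# Hypothetical correlation matrix for three genes (illustrative values only).
cor = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.20],
    [0.10, 0.20, 1.00],
]
adj = adjacency_from_correlation(cor, beta=12)
tom = topological_overlap(adj)
# Dissimilarity used for hierarchical clustering: 1 - TOM.
dissimilarity = [[1.0 - value for value in row] for row in tom]
```

Raising correlations to a high power suppresses weak correlations far more than strong ones, so the strongly correlated pair (genes 0 and 1) ends up much closer in dissimilarity than the weakly correlated pairs.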
DEG functional enrichment analysis
Functional annotation and enrichment analysis were performed using the R package clusterProfiler. Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were both performed with this package. Its compareCluster function was used to compare the enriched functional categories of each gene module. The top ten enrichment terms were visualized using the ggplot2 R package [26].
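Over-representation analysis of this kind is commonly based on a hypergeometric test: given a background of N genes, of which K belong to a pathway, it asks how likely it is that a module of n genes contains at least k pathway genes by chance. The sketch below illustrates that test in plain Python; it is not the clusterProfiler implementation, it omits multiple-testing correction, and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(overlap, pathway_size, module_size, background_size):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from a background containing pathway_size pathway genes."""
    upper = min(pathway_size, module_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 3 in the pathway,
# a module of 4 genes, 2 of which are pathway genes.
p_value = hypergeom_enrichment_p(
    overlap=2, pathway_size=3, module_size=4, background_size=10
)
```

A small p-value indicates that the pathway is represented in the module more often than expected by chance; in practice the p-values for all tested pathways would then be adjusted for multiple comparisons.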
Identification of hub genes
We defined genes with a gene significance (GS) over 0.2 and a module membership (MM) over 0.8 in the clinically relevant gene module networks as hub genes. The expression of hub genes was compared between the two groups using an independent-samples t-test. A p-value < 0.05 was regarded as statistically significant.
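The GS/MM filter can be sketched as follows, assuming the usual WGCNA definitions: GS is the absolute correlation between a gene's expression and the trait (here AD status), and MM is the absolute correlation between a gene's expression and the module eigengene. The expression values, trait vector, and eigengene below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def select_candidates(expression, trait, eigengene, gs_cutoff=0.2, mm_cutoff=0.8):
    """Return genes whose |GS| and |MM| both exceed the cut-offs."""
    selected = []
    for gene, values in expression.items():
        gene_significance = abs(pearson(values, trait))
        module_membership = abs(pearson(values, eigengene))
        if gene_significance > gs_cutoff and module_membership > mm_cutoff:
            selected.append(gene)
    return sorted(selected)


# Hypothetical data: three AD samples (trait = 1) and three controls (trait = 0).
trait = [1, 1, 1, 0, 0, 0]
eigengene = [0.5, 0.4, 0.3, -0.3, -0.4, -0.5]
expression = {
    "GENE_A": [9.0, 8.8, 8.5, 7.0, 6.9, 6.5],  # tracks the trait and the eigengene
    "GENE_B": [5.0, 6.0, 5.0, 6.0, 5.0, 6.0],  # weakly related to the eigengene
}
candidates = select_candidates(expression, trait, eigengene)
```

Requiring both criteria keeps genes that are central to the module and also related to the clinical trait, rather than genes that satisfy only one of the two.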
An overview of the workflow of this integrated bioinformatics analysis is shown in Fig. 1.

Fig. 1. An overview of the workflow of the integrated bioinformatics analysis.
RESULTS
Construction of weighted gene co-expression network
Through integrated analysis of the GSE118553 dataset, a total of 5,066 DEGs between AD and control samples were identified. WGCNA was used to explore potential gene modules related to AD in the gene expression data. We selected β = 12 as the soft-thresholding power to ensure a scale-free network, based on the scale-free topology criterion and the mean connectivity across candidate soft-thresholding powers (Fig. 2A). As shown in Fig. 2B, the first set of modules was obtained with the Dynamic Tree Cut algorithm, and correlated modules (r > 0.8) were then merged (Merge dynamic). A total of 7 modules were identified. For each module, the gene co-expression was summarized. As presented in Fig. 2C, each colored row represents a color-coded module containing a group of highly connected genes, and the module turquoise was significantly associated with AD (cor = 0.84, p = 4e-22). The scatter plot further validated the relationship between module turquoise and AD (Fig. 2D, cor = 0.96, p < 1e-200). Further analysis demonstrated that genes in the module turquoise were positively correlated with AD and negatively correlated with the control group (Fig. 2E, p < 0.001). Thus, the module turquoise was identified as the most significant clinically related module and selected for subsequent analyses.

Fig. 2. Weighted gene co-expression network analysis (WGCNA) was applied to construct a gene co-expression network.
Functional enrichment analysis of DEGs
GO biological process (BP) analysis and KEGG pathway enrichment analysis were performed for module turquoise. A total of 595 BPs and 55 KEGG pathways (Table 1) were enriched. The top ten most significant results are shown in Fig. 3A. For the KEGG analysis, 4 of the top 10 pathways were related to viral infection: HSV, HSV-1, Human papillomavirus (HPV), and Epstein-Barr virus (EBV) infection. Of the 595 enriched BPs, 18.5% were associated with developmental processes, 14% with metabolic processes, and 11.1% with immune system processes (Fig. 3B). Among the 55 enriched KEGG pathways, 7 were associated with viral infection (Fig. 3C).
Table 1. KEGG term enrichment in Metascape

Fig. 3. GO biological process (BP) analysis and KEGG pathway enrichment analysis were performed for module turquoise.
Identification of hub genes
Using a GS over 0.2 and an MM over 0.8, we selected 508 significant genes in the turquoise module (Table 2). In addition, a total of 126 genes were enriched in viral infection signaling pathways (Table 3). By intersecting those 508 clinically relevant genes with these 126 genes enriched in viral infection pathways, we identified 4 hub genes that were associated with both viral infection and AD: TLR2, COL1A2, NOTCH3, and ZNF132 (Fig. 4A). As presented in Fig. 4B, all 4 hub genes were highly expressed in the AD patient samples. The differences in hub gene expression between AD and control samples are shown in Fig. 4C and 4D. Compared with the control samples, the temporal cortex of the AD group showed significantly higher expression levels of all 4 hub genes (Fig. 4C). Similar results were seen in the entorhinal cortex of AD patients (Fig. 4D).
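The intersection step amounts to a set operation on two gene lists. In the sketch below, the four hub genes are those named in the text, while the PLACEHOLDER entries are hypothetical stand-ins for the remaining members of the 508-gene and 126-gene lists, which are not reproduced here.

```python
# Genes passing the GS/MM filter in the module (abbreviated; the
# PLACEHOLDER names are hypothetical and stand in for the other members).
module_candidates = {
    "TLR2", "COL1A2", "NOTCH3", "ZNF132", "PLACEHOLDER_1", "PLACEHOLDER_2",
}

# Genes enriched in viral infection pathways (abbreviated in the same way).
viral_pathway_genes = {
    "TLR2", "COL1A2", "NOTCH3", "ZNF132", "PLACEHOLDER_3",
}

# Hub genes are the genes present in both lists.
hub_genes = sorted(module_candidates & viral_pathway_genes)
```

Sorting the result gives a stable ordering for reporting, since Python sets are unordered.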
Table 2. The 508 genes with high gene significance and high intramodular connectivity in the turquoise module (absolute gene significance higher than 0.2 and absolute module membership higher than 0.6)
Table 3. Virus infection pathways from KEGG enrichment in Metascape

Fig. 4. Identification of hub genes.
DISCUSSION
In this study, we used comprehensive bioinformatics technology to demonstrate that viral infection (in particular HSV, HPV, and EBV infection) might be an important contributing factor in the development of AD. Subsequent analysis identified TLR2, COL1A2, NOTCH3, and ZNF132 as hub genes that might be essential links between viral infection and AD.
All 4 hub genes that we identified were associated with viral infection and AD. Toll-like receptors (TLRs) are expressed in microglia, astrocytes, oligodendrocytes, and neurons in the CNS. Besides pathogen recognition, TLR2 also recognizes danger-associated molecular patterns (DAMPs) during sterile inflammation, as found in cerebral ischemia, traumatic brain injury, Parkinson’s disease, and AD [27]. Collagen, type I, alpha 2 (COL1A2, a fibril-forming collagen) was found to be increased in Aβ deposits in AD brains [28]. A previous study showed abundant cortical Aβ42-positive plaques in the vicinity of NOTCH3 deposits [29]. In addition, NOTCH receptors might be the angiogenic pathological basis of AD by affecting γ-secretase [30, 31]. Zinc finger MYM-type protein 3 (ZMYM3) was previously reported among the top three genes involved in the progression of late-onset AD [32].
Previous studies reported the association of AD with various viral infections, including HSV-1, HSV-2, HHV-6A, HHV-7, EBV, Kaposi Sarcoma Herpes virus (KSHV), cytomegalovirus, and human immunodeficiency virus (HIV) [12, 33–37]. Our results suggest a role for HSV, HSV-1, HPV, and EBV in the pathogenesis of AD. These viruses persist in AD brains. The reactivation of latent viruses could result in neuroinflammation, neuronal dysfunction, and death. Wozniak et al. detected the neurotropic pathogen HSV-1 in Aβ plaques in AD brains [33, 38]. A study of the Taiwanese population showed that HSV infection was significantly correlated with a higher risk of dementia later in life and that anti-herpetic treatment could greatly reduce the risk of dementia onset [39]. HPV is the most common sexually transmitted infection in the United States. A recent study demonstrated that HPV might play an essential role in the pathogenesis of AD in certain populations [40]. EBV infection is associated with many neurological diseases, such as encephalitis, neuritis, myelitis, acute disseminated encephalomyelitis, and multiple sclerosis. A high proportion of AD patients were reported to have EBV-positive blood leukocytes [41]. In addition, a recent bioinformatics analysis also indicated that viral infection might contribute to AD pathogenesis by inducing oxidative and inflammatory responses [42]. Other potential molecular mechanisms by which viral infection may induce AD-related pathophysiology include amyloid-β (Aβ) deposition, altered AβPP metabolism, hyperphosphorylation of tau proteins, dysregulation of calcium homeostasis, impaired autophagy, local and systemic inflammation, oxidative stress, mitochondrial injury, synaptic dysfunction, effects on neurotransmitters, and neuronal and glial dysfunction.
The COVID-19 pandemic has affected hundreds of millions of lives and caused millions of deaths globally. In addition to respiratory symptoms, more than 25% of COVID-19 patients developed various neurological symptoms. Examination of deceased COVID-19 patients showed that SARS-CoV-2 viral particles were neurotropic and were found in the neuronal cell body, extending into apparent neurite structures [43–45]. Although the long-term effects remain unclear, SARS-CoV-2 might exacerbate inflammatory processes in the brain, increase the probability of cognitive impairment, and accelerate the progression of AD.
Consistent with previous studies, our findings suggest that antiviral therapy, in combination with other treatments, might be effective in preventing or slowing the progression of AD. Interestingly, a clinical trial has been registered to investigate whether the antiviral agent valacyclovir is effective in the treatment of AD (https://clinicaltrials.gov/ct2/show/NCT03282916). Further clarification of the regulatory mechanisms of viral infection in AD may provide new platforms for developing novel AD treatment strategies.
Footnotes
ACKNOWLEDGMENTS
The present study was supported by the National Natural Science Foundation of China (grant number 81600921 to Cheng Li), the Natural Science Foundation of Shanghai (grant number 20ZR1442900 to Cheng Li), the National Natural Science Foundation of China (grant number 81870824 to Li Tian), and the Shanghai Rising-Star Program (grant number 20QA1407800 to Li Tian).
