Revealing Molecular Patterns of Alzheimer’s Disease Risk Gene Expression Signatures in COVID-19 Brains

Abstract

Background:

Various viral infections are known to predispose to Alzheimer’s disease (AD), and a link between COVID-19 and AD has been established. COVID-19 infection modulates the expression of genes implicated in the progression of AD.

Objective:

The study aimed to determine molecular patterns, codon usage, and codon context for genes that are modulated during COVID-19 infection and implicated in AD.

Methods:

Our study employed relative synonymous codon usage, codon adaptation index, neutrality and parity, rare codon, and codon context analyses to determine the molecular patterns present in genes up- or downregulated during COVID-19 infection.

Results:

G/C-ending codons were preferred in upregulated genes but not in downregulated genes, and in both gene sets, longer genes had higher expressivity. T was preferred over A at the third codon position, and selection was the major evolutionary force shaping codon usage in both gene sets. Apart from stop codons, the codons CGU – Arg, AUA – Ile, UUA – Leu, UCG – Ser, GUA – Val, and CGA – Arg in upregulated genes, and CUA – Leu, UCG – Ser, and UUA – Leu in downregulated genes, were present at frequencies below 0.5%. Glutamine-initiated codon pairs have high residual values in upregulated genes. The identical codon pairs GAG-GAG and GUG-GUG were preferred in both gene sets.

Conclusions:

The shared and unique molecular features in the up- and downregulated gene sets provide insights into the complex interplay between COVID-19 infection and AD. Further studies are required to elucidate the relationship of these molecular patterns with AD pathology.

Keywords

Alzheimer’s disease, codon context, codon usage, COVID-19, relative synonymous codon usage

INTRODUCTION

Alzheimer’s disease (AD) affects the quality of life of patients and their caregivers. There are various theories about the etiology of AD, including accumulation of amyloid-β (Aβ) in the extracellular spaces of neurons, formation of neurofibrillary tangles of hyperphosphorylated tau protein, neuroinflammation, and cholinergic neuron damage. 1 A viral connection with AD has also been reported. Herpes Simplex type 1 and 2, 2,3 Epstein–Barr virus, 4,5 human cytomegalovirus, 6,7 influenza virus, 8 Ljungan virus, Borna disease virus, 9 and hepatitis C virus 10 have been reported to increase the risk of AD and cognitive decline. 11 These viruses infect cells of the central nervous system (CNS), and a connection between viral infection and AD has been found. 9,12,13

The coronavirus disease 2019 (COVID-19) pandemic has caused over 600,000,000 infections globally thus far. 14 The consequences of infection range from severe illness and death to long-term neurological effects. 15 Neurological consequences include the development of neurodegenerative disorders, including AD and other dementias. 16,17 Research has shown that COVID-19 accelerates structural and functional brain deterioration in dementia patients regardless of dementia type. 18 A meta-analysis encompassing 11 studies, 939,824 post-COVID-19 cases, and 6,765,117 controls demonstrated a significant link between COVID-19 infection and new-onset dementia.

Attempts to link the molecular patterns that appear after COVID-19 infection with those seen in the AD brain have provided evidence that the infection modulates the expression of genes involved in the development of AD. 14 Whole-transcriptome expression analysis of COVID-19 versus control and AD versus control frontal cortex patient samples revealed differential expression of genes in both cases. Green et al. compiled a list of genes that are involved in AD and related dementias and whose expression is modulated during COVID-19 infection. Five genes are activated (upregulated), and nineteen genes are deactivated (downregulated) during COVID-19 infection, all of which are implicated in AD. We sought to determine the codon usage and other specific molecular patterns in the genes that are modulated during COVID-19 infection and implicated in the progression of AD. The information obtained from the present study may help modulate and fine-tune the expression of these genes so that they might be used for therapeutic purposes against AD progression after COVID-19 infection.

MATERIAL AND METHODS

Sequence retrieval

Upon reviewing the literature, we found several risk/pathology genes for AD and related dementias; during COVID-19 infection, some of them are predicted to be activated or deactivated, and others show increased (upregulated) or decreased (downregulated) expression. For this study, genes predicted to be activated or deactivated were excluded, and the upregulated genes CXCL8, GFAP, IL18, IL6R, KLF4, and STAT3 and the downregulated genes BDNF, CAMKK2, CTCFL, CXCL8, EGFR, GFAP, IFI16, IL18, IL6R, KLF4, LGALS3, CAV1, FKBP5, IFITM3, C3, C5AR1, PLAT, HSPA8, HSP90AA1, and TAC1 were considered. All transcripts were taken from the initiation codon to the stop codon. Predicted transcripts were omitted. A total of 33 transcripts for upregulated genes and 79 transcripts for downregulated genes were included in the study. Compositional analysis was done for all genes and included overall % A, % T, % G, and % C, as well as % AT and % GC at the third codon position.
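The compositional analysis described above can be illustrated with a minimal Python sketch. It is not the tool used in the study; the function names `codons` and `composition` and the toy sequence are hypothetical and serve only to show how overall base percentages and third-position % AT and % GC might be obtained from a coding sequence.

```python
from collections import Counter

def codons(cds):
    """Split a coding sequence into complete codons (trailing bases are dropped)."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def composition(cds):
    """Return overall base percentages plus %AT3 and %GC3 for one transcript."""
    seq = cds.upper().replace("U", "T")
    overall = Counter(seq)
    third = Counter(c[2] for c in codons(seq))  # base at the third codon position
    n, n3 = len(seq), sum(third.values())
    result = {f"%{b}": 100 * overall[b] / n for b in "ATGC"}
    result["%AT3"] = 100 * (third["A"] + third["T"]) / n3
    result["%GC3"] = 100 * (third["G"] + third["C"]) / n3
    return result

toy = "ATGGCCTTAGGCTGA"  # hypothetical 5-codon sequence, not from the study
comp = composition(toy)
print(comp)
```

In practice, the same function would be applied to each retrieved transcript and the values averaged per gene set.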

Relative synonymous codon usage analysis

If two or more codons encode the same amino acid, they are called synonymous codons. Synonymous codons are not used equally, and the choice of codon varies by species, 19 organism, 20 tissue, 21 and cellular level. 22 This uneven usage of codons is called codon bias. 23 One of the indices of this bias is relative synonymous codon usage (RSCU). Codons with RSCU values above 1.6 and below 0.6 are considered overrepresented and underrepresented, respectively. 24 RSCU values were calculated using the CAIcal server. 25
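For illustration, RSCU can be computed as the observed count of a codon divided by the mean count of all codons in its synonymous family. The sketch below is a minimal Python version of that definition under the standard genetic code; it is not the CAIcal implementation, and the function name `rscu` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """RSCU = observed codon count / mean count of its synonymous family."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp), and absent amino acids.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values

toy = "CTGCTGCTGCTAGTGGTT"  # hypothetical sequence, not from the study
values = rscu(toy)
over = sorted(c for c, v in values.items() if v > 1.6)   # overrepresented
under = sorted(c for c, v in values.items() if v < 0.6)  # underrepresented
print(values["CTG"], over, under)
```

The 1.6 and 0.6 cut-offs are the thresholds stated in the text; amino acids that do not occur in a transcript are left out rather than assigned a value.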

Codon adaptation index analysis (CAI)

CAI is another parameter that indicates bias and exhibits the codon usage frequency in a given transcript. 26 The CAI value has shown a correlation with the gene expression data; 27 hence, CAI is generally used as a surrogate for protein expression. Highly expressed genes exhibit higher CAI values. It is calculated by using the highly expressed genes as a reference gene set. The value ranges between 0 and 1, 28 and values towards 1 show higher expression. 29
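As a rough illustration of how CAI is derived from a reference set, the sketch below computes the relative adaptiveness w of each codon (its count divided by the count of the most used synonymous codon in the reference set) and then takes the geometric mean of w over a transcript. It is a simplified sketch, not the tool used in the study: the reference sequences are hypothetical, and the 0.5 pseudocount for unseen codons is an assumed convention.

```python
import math
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_list(cds):
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def relative_adaptiveness(reference_cds):
    """w = codon count / count of the most used synonymous codon in the reference set."""
    counts = Counter(c for cds in reference_cds for c in codon_list(cds))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    weights = {}
    for aa, family in families.items():
        best = max(counts[c] for c in family)
        if aa == "*" or len(family) == 1 or best == 0:
            continue
        for codon in family:
            # Assumption: unseen codons get a pseudocount of 0.5 so that log(w) is defined.
            weights[codon] = max(counts[codon], 0.5) / best
    return weights

def cai(cds, weights):
    """Geometric mean of w over the codons of a transcript that have a weight."""
    logs = [math.log(weights[c]) for c in codon_list(cds) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

reference = ["CTGCTGCTGCTC", "GAGGAGGAA"]  # hypothetical "highly expressed" set
w = relative_adaptiveness(reference)
print(cai("CTGGAG", w), cai("CTCGAA", w))
```

A transcript built only from the most used codons of the reference set scores 1, and the score falls as less used synonymous codons appear.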

Neutrality analysis

The neutrality plot reveals the equilibrium between selection and mutational forces. 30 Percent GC3 and the average % GC content at the first and second codon positions (% GC12) are determined. % GC3 is plotted on the X axis, while % GC12 is plotted on the Y axis. 31 Each dot represents a transcript. The regression coefficient (slope) of the plot ranges between 0 and 1. A value of 1 indicates that codon usage is driven solely by mutational forces, and deviation from 1 reflects the contribution of selection forces. A regression coefficient less than 0.5 suggests the dominance of selection forces. 32
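The neutrality analysis can be sketched as follows: % GC12 and % GC3 are computed per transcript, and the slope of the least-squares regression of % GC12 on % GC3 is read as the mutational contribution. This is a minimal Python illustration with hypothetical function names and toy sequences, not the PAST4 workflow used in the study.

```python
def gc12_gc3(cds):
    """Return (%GC12, %GC3) for one coding sequence."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

    def gc(pos):
        return 100 * sum(c[pos] in "GC" for c in codons) / len(codons)

    return (gc(0) + gc(1)) / 2, gc(2)

def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical transcripts, not from the study.
transcripts = ["GCCATGAAA", "GCGGCCGGC", "ATGAATTTA"]
points = [gc12_gc3(t) for t in transcripts]
b = slope([p[1] for p in points], [p[0] for p in points])  # x = %GC3, y = %GC12
print(f"mutation share: {100 * b:.2f}%, selection share: {100 * (1 - b):.2f}%")
```

Reading the slope as a percentage follows the interpretation given in the text, where a slope below 0.5 points to selection as the dominant force.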

Parity plot analysis

According to the parity rule, in the absence of selection or mutational forces, G = C and A = T in DNA, with C + G + A + T = 1. At the center of the plot, where both coordinates are 0.5, C = G and A = T, and the nucleotide composition follows the parity rule. 33 Deviation from this state suggests the presence of mutation, selection, or both. The plot shows the distribution of experimental versus predicted values. 34 AT or GC bias at the third codon position is determined by parity plot analysis. The AT bias [A3/(A3 + T3)] and GC bias [G3/(G3 + C3)] were calculated for parity analysis. 35 A bias value lower than 0.5 indicates a preference for pyrimidine over purine. 36
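The two bias values used in the parity plot follow directly from third-position base counts. The short Python sketch below applies the formulas A3/(A3 + T3) and G3/(G3 + C3) given above; the function name `parity_biases` and the toy sequence are hypothetical.

```python
from collections import Counter

def parity_biases(cds):
    """Return (AT bias, GC bias) at the third codon position."""
    seq = cds.upper().replace("U", "T")
    third = Counter(seq[i + 2] for i in range(0, len(seq) - len(seq) % 3, 3))
    at = third["A"] + third["T"]
    gc = third["G"] + third["C"]
    at_bias = third["A"] / at if at else float("nan")
    gc_bias = third["G"] / gc if gc else float("nan")
    return at_bias, gc_bias

toy = "GCAGCTGCTGGGGGC"  # hypothetical; third positions are A, T, T, G, C
at_bias, gc_bias = parity_biases(toy)
print(at_bias, gc_bias)
```

Here the AT bias below 0.5 indicates that T is used more often than A at the third position, while a GC bias of 0.5 indicates no preference between G and C.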

Rare codon analyses

Rare codons are those present at low frequencies. 37 The presence of rare codons not only helps to pause ribosomes transiently on the mRNA but also helps proteins fold properly. 38 These codons also help enhance protein solubility. 39 Codon frequencies were obtained, and codons with a frequency below the threshold of 5/1000 (less than 0.5%) were considered rare codons.
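The thresholding step can be sketched as follows: codons are pooled across a transcript set, frequencies are expressed per thousand codons, and sense codons below 5/1000 are flagged. This is an illustrative Python sketch with hypothetical names and toy data, not the software used in the study.

```python
from collections import Counter
from itertools import product

STOPS = {"TAA", "TAG", "TGA"}
SENSE = ["".join(c) for c in product("ACGT", repeat=3) if "".join(c) not in STOPS]

def rare_codons(transcripts, threshold_per_thousand=5.0):
    """Return sense codons whose pooled frequency is below the threshold (per 1000 codons)."""
    counts = Counter()
    for cds in transcripts:
        seq = cds.upper().replace("U", "T")
        counts.update(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    total = sum(counts.values())
    per_thousand = {c: 1000 * counts[c] / total for c in SENSE}
    return {c: f for c, f in per_thousand.items() if f < threshold_per_thousand}

# Hypothetical pool of 1000 codons: 996 CTG and 4 TTA.
pool = ["CTG" * 996 + "TTA" * 4]
rare = rare_codons(pool)
print(rare["TTA"], "CTG" in rare)
```

Codons that never occur in the pool are also reported as rare, with a frequency of zero.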

Residual table formation for codon context

The tendency of codons to be preferred or rejected as pairs is called codon context. Codon pairs with high residual values are preferred, while those with highly negative values are avoided. Using ANACONDA V2.0 software, 40 all transcripts were scanned starting with the first six nucleotides, and the window was then advanced three nucleotides at each step. The frequency of all hexanucleotides was computed. A matrix plot was generated for up- and downregulated gene transcripts based on residual values. Treating the sets of up- and downregulated gene transcripts as genomes, a differential display map was generated, and the difference was shown on a color map. 40 For differential display maps, residual values below 20 were considered to indicate no significant difference in codon context between up- and downregulated genes. Residual values between 20–100 were considered significant differences and above 200 were considered very significant. Different matrices, including xNN-Nxx and dinucleotides at the p3-p1 junction (third position of the first codon and first position of the second codon), were calculated using ANACONDA V2 software. Four context clusters based on the p3-p1 position were calculated that are XXU-AYY, XXC-AYY, XXU-GYY. Dinucleotide context is related to DNA repair and replication constraints. 41 Sixteen dinucleotide frequencies at the junction of codon pairs (xxA-Axx, xxC-Axx, xxG-Axx, xxU-Axx, xxA-Cxx, xxC-Cxx, xxG-Cxx, xxU-Cxx, xxA-Gxx, xxC-Gxx, xxG-Gxx, xxU-Gxx, xxA-Uxx, xxC-Uxx, xxG-Uxx, xxU-Uxx) were calculated.
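To make the idea of a residual table concrete, the sketch below slides a six-nucleotide window in steps of three to collect codon pairs, builds a first-codon by second-codon contingency table, and computes adjusted residuals of the chi-square test of independence. This is an assumption about how such residuals can be obtained and may differ from the exact computation in ANACONDA; all names and the toy sequence are hypothetical.

```python
import math
from collections import Counter

def codon_pairs(cds):
    """Adjacent codon pairs, i.e. a 6-nt window moved 3 nt at a time."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return list(zip(codons, codons[1:]))

def adjusted_residuals(transcripts):
    """Adjusted residual per codon pair: (O - E) / sqrt(E * (1 - row/N) * (1 - col/N))."""
    pair_counts = Counter(p for t in transcripts for p in codon_pairs(t))
    n = sum(pair_counts.values())
    first, second = Counter(), Counter()
    for (a, b), k in pair_counts.items():
        first[a] += k
        second[b] += k
    residuals = {}
    for a in first:
        for b in second:
            expected = first[a] * second[b] / n
            denom = math.sqrt(expected * (1 - first[a] / n) * (1 - second[b] / n))
            if denom > 0:
                residuals[(a, b)] = (pair_counts[(a, b)] - expected) / denom
    return residuals

toy = ["GAGGAGCTGCTG"]  # hypothetical transcript, not from the study
res = adjusted_residuals(toy)
print(res[("GAG", "GAG")], res[("CTG", "GAG")])
```

Positive residuals mark pairs seen more often than expected from the individual codon frequencies, and negative residuals mark pairs seen less often.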

RESULTS

RSCU analysis revealed a high preference for G/C ending codons in upregulated genes, while no such trend was evident in downregulated genes

Compositional analysis was done for up- and downregulated genes. For upregulated genes, % A3, % T3, % G3, % C3, % AT3, and % GC3 were 16.95%, 19.31%, 34.91%, 28.81%, 26.61%, and 49.3%, respectively. For downregulated genes, the values were 21.65%, 24.74%, 28.87%, 24.74%, 34.02%, and 41.24%, respectively. Composition at the third codon position influences RSCU values. RSCU analysis was done to determine the most and least preferred codons. 42 Overall trend analysis revealed that G/C-ending codons were preferred in upregulated gene transcripts, while the same did not hold for downregulated genes. GTG and CTG were the preferred codons in most transcripts in both transcript sets. Considering the average RSCU values across all up transcripts, the G/C-ending codons CGG, TCC, AGC, and ATC were overrepresented, while none of the A/T-ending codons was overrepresented. On the other hand, the A/T(U)-ending codons TTA (UUA), CGA, CAA, CTA (CUA), ATA (AUA), GTA (GUA), GGT (GGU), CGT (CGU), and GTT (GUU) were underrepresented. Among G/C-ending codons, CGC, TCG, GCG, ACG, and CCG were underrepresented. In downregulated genes, the AGA codon was overrepresented, while TTA (UUA), CGA, CAA, CTA (CUA), ATA (AUA), GTA (GUA), GGT (GGU), CGT (CGU), CTT (CUU), TCG (UCG), GCG, ACG, and CCG were underrepresented. In summary, CTG (CUG) and GTG (GUG) were overrepresented in both gene sets, similar to the results of Khandia et al. (2022), 43 while codons containing the dinucleotide CpG or TpA (UpA) were underrepresented. Table 1 shows the average RSCU values of the different genes, with overrepresented codons highlighted. The heat maps are shown in Fig. 1A for upregulated and Fig. 1B for downregulated genes.

Fig. 1A

Data matrix visualizing cell values with a color gradient for upregulated genes. The heatmap was plotted using the heatmap R package (version 0.7.7), and the dist and hclust functions in R were used for clustering.

Fig. 1B

Data matrix visualizing cell values with a color gradient for downregulated genes. The heatmap was plotted using the heatmap R package (version 0.7.7), and the dist and hclust functions in R were used for clustering.

Table 1

Average RSCU values of up- and downregulated gene transcripts

		Up genes						Down genes
Amino acid	Codon	CXCL8	GFAP	IL18	IL6R	KLF4	STAT3	BDNF	CAMKK2	CTCFL	CXCL8	EGFR	GFAP	IFI16	IL18	IL6R	KLF4	LGALS3	CAV1	FKBP5	IFITM3	C3	C5AR1	PLAT	HSPA8	HSP90AA1	TAC1
F	TTT	1	0.55	1.57	0.39	0.44	1.09	0.68	0.8	0.73	1	0.95	0.55	0.87	1.57	0.39	0.44	0.63	0.91	1.44	0	0.43	0.61	0.77	1.33	1.04	1.69
F	TTC	1	1.45	0.43	1.61	1.56	0.91	1.32	1.2	1.27	1	1.05	1.45	1.13	0.43	1.61	1.56	1.37	1.09	0.56	2	1.57	1.39	1.23	0.67	0.96	0.31
L	TTA	0	0	0.46	0.15	0.14	0.2	0.04	0.23	0.27	0	0.45	0	0.85	0.46	0.15	0.14	0.8	0	1.4	0	0.04	0.12	0.15	0.15	0.52	1.01
L	TTG	1.5	0.45	1.39	0.55	0.29	0.78	1.94	0.85	1.05	1.5	0.91	0.45	1.03	1.39	0.55	0.29	1.2	0.96	0.61	0	0.6	0.82	0.68	0.83	1.61	0.54
L	CTT	1	0.1	1.39	0.42	0.57	0.73	1.36	0.16	0.54	1	0.46	0.1	1.17	1.39	0.42	0.57	0	0	1.42	0	0.3	0	0.29	1.57	1.04	1.3
L	CTC	1	1.47	1.39	1.77	1.93	1.11	0.71	1.41	1.17	1	1.62	1.47	0.7	1.39	1.77	1.93	1.2	2.63	0.97	2	1.51	1.53	1.11	1.58	0.85	0.54
L	CTA	0	0.1	0.92	0.15	0.21	0.59	0.31	0.11	0.43	0	0.1	0.1	0.94	0.92	0.15	0.21	0	0	0	0	0.53	0.12	0.15	0.15	0.51	0.24
L	CTG	2.5	3.88	0.46	2.95	2.86	2.59	1.63	3.24	2.53	2.5	2.46	3.88	1.32	0.46	2.95	2.86	2.8	2.4	1.6	4	3.02	3.41	3.63	1.72	1.46	2.37
I	ATT	2	0.84	1.24	0.85	1	1.23	1.18	0.52	1.25	2	0.79	0.84	0.94	1.24	0.85	1	1.5	1.34	1.87	0.6	0.47	0.27	0.64	1.81	1.19	0.25
I	ATC	0.5	1.88	0.71	1.76	2	1.7	0.65	2.26	1.45	0.5	1.56	1.88	1.17	0.71	1.76	2	0.3	1.33	0.69	2.1	2.24	2.46	1.77	1.04	1.37	2.2
I	ATA	0.5	0.28	1.06	0.39	0	0.07	1.17	0.23	0.3	0.5	0.65	0.28	0.89	1.06	0.39	0	1.2	0.33	0.44	0.3	0.29	0.27	0.59	0.15	0.45	0.55
V	GTT	1.14	0.76	1	0.72	0	0.42	0.51	0.12	0.61	1.14	0.22	0.76	0.7	1	0.72	0	1.71	0.71	0.67	0.31	0.5	0.23	0.47	1.38	1.14	0.37
V	GTC	0	1.07	1	1.14	1.06	0.87	0.87	1.52	1.31	0	1.52	1.07	0.44	1	1.14	1.06	0.57	1.72	1.16	1.54	1.27	0.91	1.44	1.28	0.69	1.9
V	GTA	0.57	0.1	1	0.19	0.1	0.29	0.64	0	0.06	0.57	0.13	0.1	1.22	1	0.19	0.1	0.29	0.42	0.76	0	0.19	0.23	0	0.38	0.69	0
V	GTG	2.29	2.07	1	1.94	2.84	2.41	1.99	2.37	2.02	2.29	2.13	2.07	1.64	1	1.94	2.84	1.43	1.15	1.41	2.15	2.04	2.63	2.09	0.95	1.48	1.73
S	TCT	1.86	0.38	1.88	0.92	0.62	0.52	0.68	0.79	1.16	1.86	0.52	0.38	0.55	1.88	0.92	0.62	2.07	1.48	0.14	1.33	1	0.25	0.49	2.64	2.04	1.28
S	TCC	1.86	2.9	0.75	1.43	1.29	1.61	0.99	1.91	0.78	1.86	1.85	2.9	0.77	0.75	1.43	1.29	0.61	0.61	1.24	3.33	1.67	3	1.22	1.16	0.59	2.55
S	TCA	0.43	0.38	1.88	0.54	0.5	1.03	1.36	0.78	0.95	0.43	0.41	0.38	1.58	1.88	0.54	0.5	0.41	0	1.76	0	0.5	0.5	0.76	0.76	1.68	0.38
S	TCG	0	0.31	0	0.2	1.35	0.21	0.62	0.44	0.29	0	0.19	0.31	0	0	0.2	1.35	0.83	0.27	0.78	0	0.61	0.25	0.95	0	0.48	0
S	AGT	1.86	0.57	1.13	0.79	0.11	0.87	1.36	0.45	1.48	1.86	0.87	0.57	1.67	1.13	0.79	0.11	1.24	0.61	1.43	0.67	0.61	0.25	0.54	0.57	0.43	0.9
S	AGC	0	1.46	0.38	2.12	2.13	1.76	1	1.64	1.33	0	2.16	1.46	1.43	0.38	2.12	2.13	0.83	3.03	0.65	0.67	1.61	1.75	2.04	0.87	0.79	0.9
P	CCT	0.8	1.2	2.29	0.85	0.68	0.86	2	0.48	1.36	0.8	0.97	1.2	1.15	2.29	0.85	0.68	1.95	0.69	0.88	0.8	0.72	0.75	0.87	1.68	1.41	1
P	CCC	1.6	1.2	0.57	1.02	1.27	1.36	1.55	2.15	1.42	1.6	1.58	1.2	0.84	0.57	1.02	1.27	0.7	0.57	0.86	2.4	1.83	2.25	1.78	1.24	1.37	2
P	CCA	1.6	0.9	1.14	0.95	0.56	1.45	0.45	0.75	0.57	1.6	1.02	0.9	1.67	1.14	0.95	0.56	1.35	2.06	1.84	0.4	0.87	1	0.47	1.08	0.71	0
P	CCG	0	0.7	0	1.18	1.48	0.32	0	0.61	0.65	0	0.44	0.7	0.34	0	1.18	1.48	0	0.69	0.42	0.4	0.58	0	0.87	0	0.52	1
T	ACT	1.33	0.29	2.29	1.02	0.31	1.15	1.28	0.11	1.02	1.33	0.49	0.29	1.31	2.29	1.02	0.31	0.54	0.54	1.78	0.44	0.5	0.64	0.72	1.42	1.29	3
T	ACC	0	2.83	0.57	1.11	2.23	1.82	1.24	2.7	1.81	0	1.65	2.83	0.98	0.57	1.11	2.23	1.86	2.39	1.17	3.11	2.29		1.4	1.8	1.2	0
T	ACA	2.67	0.42	0.57	1.37	0.69	0.72	0.87	0.45	0.82	2.67	1.07	0.42	1.49	0.57	1.37	0.69	1.07	0.54	0.78	0	0.61	0.32	0.83	0.65	1.43	1
T	ACG	0	0.47	0.57	0.5	0.77	0.32	0.61	0.74	0.34	0	0.79	0.47	0.22	0.57	0.5	0.77	0.54	0.54	0.28	0.44	0.61	1.44	1.06	0.14	0.09	0
A	GCT	1.47	1.06	2.76	0.92	0.68	0.96	0.96	0.65	1.58	1.47	0.68	1.06	1.78	2.76	0.92	0.68	1.19	0.42	1.24	0.73	0.78	0.28	0.49	2.2	1.63	1.25
A	GCC	1.27	1.79	0	1.84	1.37	1.64	1.08	2.23	1.35	1.27	1.76	1.79	1.09	0	1.84	1.37	1.19	1.47	1.13	2.55	2.31	2.9	2.34	1.06	0.92	1.47
A	GCA	1.27	0.65	1.24	0.46	0.39	0.83	1.65	0.6	0.87	1.27	1.13	0.65	1.13	1.24	0.46	0.39	1.48	1.69	1.57	0.36	0.58	0.55	0.7	0.74	1.09	1.28
A	GCG	0	0.49	0	0.79	1.56	0.56	0.31	0.52	0.19	0	0.43	0.49	0	0	0.79	1.56	0.15	0.42	0.06	0.36	0.33	0.28	0.47	0	0.36	0
Y	TAT	0	0.65	0.4	0.92	0.5	1.13	0	0.5	0.73	0	0.87	0.65	1.71	0.4	0.92	0.5	0.92	0.28	1.5	1.5	0.35	0.55	0.16	1.12	0.98	1.1
Y	TAC	2	1.35	1.6	1.08	1.5	0.87	2	1.5	1.27	2	1.13	1.35	0.29	1.6	1.08	1.5	1.08	1.72	0.5	0.5	1.65	1.46	1.84	0.88	1.02	0.9
H	CAT	0	0.49	2	0.33	0.46	0.79	0.84	0.29	0.81	0	0.77	0.49	1.5	2	0.33	0.46	0.9	0.17	1.03	0.4	0.41	0.4	0.83	1.38	1.27	1.42
H	CAC	2	1.51	0	1.67	1.54	1.21	1.16	1.71	1.19	2	1.23	1.51	0.5	0	1.67	1.54	1.1	1.83	0.97	1.6	1.59	1.6	1.17	0.62	0.73	0.58
Q	CAA	0	0.14	1.33	0.4	0.59	0.44	0.94	0.31	0.57	0	0.27	0.14	0.52	1.33	0.4	0.59	0.67	0.58	0.73	0.5	0.4	0	0.21	0.47	0.89	0.13
Q	CAG	2	1.86	0.67	1.6	1.41	1.56	1.06	1.69	1.43	2	1.73	1.86	1.48	0.67	1.6	1.41	1.33	1.42	1.27	1.5	1.6	2	1.79	1.53	1.11	1.88
N	AAT	0.33	0.57	1.23	0.93	0.73	0.93	1.12	0.86	0.98	0.33	0.69	0.57	1.24	1.23	0.93	0.73	1	0.39	1.43	0.29	0.43	0.55	0.51	1.05	0.51	2
N	AAC	1.67	1.43	0.77	1.07	1.27	1.07	0.88	1.14	1.02	1.67	1.31	1.43	0.76	0.77	1.07	1.27	1	1.61	0.57	1.71	1.57	1.46	1.49	0.95	1.49	0
K	AAA	0.8	0.27	1.33	0.33	0.78	0.94	1.02	0.49	0.97	0.8	0.93	0.27	1.11	1.33	0.33	0.78	1.26	0.86	1.14	0	0.55	0.62	0.64	0.57	0.99	1.28
K	AAG	1.2	1.73	0.67	1.67	1.22	1.06	0.98	1.51	1.03	1.2	1.07	1.73	0.89	0.67	1.67	1.22	0.74	1.14	0.86	2	1.45	1.39	1.36	1.43	1.01	0.72
D	GAT	1	0.62	1.16	0.67	0.38	0.7	0.46	0.55	1.09	1	0.74	0.62	1.13	1.16	0.67	0.38	1.11	0.38	0.99	0	0.56	0.57	0.97	1.13	1.12	1.13
D	GAC	1	1.38	0.84	1.33	1.62	1.3	1.54	1.45	0.91	1	1.26	1.38	0.87	0.84	1.33	1.62	0.89	1.62	1.01	2	1.44	1.43	1.04	0.87	0.88	0.87
E	GAA	1.18	0.4	1.51	0.43	0.28	0.64	1.01	0.72	0.9	1.18	1.02	0.4	1.27	1.51	0.43	0.28	1.67	1.1	1.19	0	0.62	0.25	0.6	1.06	1.09	1.29
E	GAG	0.82	1.6	0.49	1.57	1.72	1.36	0.99	1.28	1.1	0.82	0.98	1.6	0.73	0.49	1.57	1.72	0.33	0.9	0.81	2	1.38	1.75	1.4	0.94	0.91	0.71
C	TGT	1.2	0.25	1.6	0.52	0.69	0.92	0.95	1.05	0.87	1.2	0.53	0.25	1	1.6	0.52	0.69	0.33	0.67	0.91	0	0.52	0.89	0.58	1.75	1.03	0
C	TGC	0.8	1.75	0.4	1.48	1.31	1.08	1.05	0.95	1.13	0.8	1.47	1.75	1	0.4	1.48	1.31	1.67	1.33	1.1	2	1.48	1.11	1.42	0.25	0.97	0
R	CGT	0	0.25	0	0.31	0.39	0.17	0.05	0.97	0.96	0	0.49	0.25	0.39	0	0.31	0.39	0	1.35	0	0	0.59	0.32	0.34	2.54	0.96	0.68
R	CGC	0	1.95	0	0.6	0.97	0.45	0.65	1.15	1.42	0	1.39	1.95	0.45	0	0.6	0.97	1.33	4.05	0.52	2	1.32	0.95	1.12	0.7	0.51	0
R	CGA	0	0.56	0	0.27	0.78	0.52	1.11	0.27	0.5	0	0.9	0.56	1.12	0	0.27	0.78	0	0	0	0	0.88	0.95	0.44	1.04	0.71	0
R	CGG	0	1.79	1.5	1.44	1.54	2.58	1.16	2.66	0.6	0	1.03	1.79	0.22	1.5	1.44	1.54	0.67	0.6	1.23	0	1.46	3.16	1.06	0.23	0.61	1.35
R	AGA	3	0.31	4.5	1.44	0.39	1.42	1.26	0.09	1.65	3	0.73	0.31	2.86	4.5	1.44	0.39	2.67	0	2.29	0	0.59	0	1.67	1.16	1.92	3.3
R	AGG	3	1.14	0	1.95	1.94	0.85	1.77	0.86	0.88	3	1.46	1.14	0.95	0	1.95	1.94	1.33	0	1.96	4	1.17	0.63	1.38	0.34	1.29	0.68
G	GGT	1.33	0.38	0.67	0.38	0.62	0.43	1.08	0.69	0.44	1.33	0.76	0.38	0.48	0.67	0.38	0.62	0.12	0	0.53	0	0.49	0.29	0.21	1.37	1.51	0
G	GGC	0	1.86	0.67	1.4	2.23	1.42	1.53	1.89	0.83	0	1.49	1.86	0.44	0.67	1.4	2.23	0.81	2.31	1.05	2.29	1.76	2.29	1.95	0.76	0.66	1.66
G	GGA	2.67	0.26	2	1.25	0.54	1.23	0.49	0.58	1.77	2.67	0.88	0.26	2	2	1.25	0.54	1.45	0.22	1.98	0.57	0.69	0.29	0.83	1.62	1.11	2.34
G	GGG	0	1.51	0.67	0.97	0.61	0.92	0.91	0.84	0.95	0	0.87	1.51	1.07	0.67	0.97	0.61	1.62	1.47	0.44	1.14	1.06	1.14	1	0.24	0.71	0

Gene expression increases with gene length in both gene sets

The average length was 1691.56±641.31 and 1460.58±816.70, while the average CAI was 0.787±0.02 and 0.777±0.03, for up and down genes, respectively. Correlation analysis between CAI and gene length revealed a positive and significant correlation, indicating that the level of gene expression increased with length. The Pearson correlation coefficient was r = 0.392, p < 0.05 and r = 0.198, p < 0.05 for up and down genes, respectively.
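The correlation between transcript length and CAI can be illustrated with a plain Pearson coefficient. The Python sketch below uses made-up length and CAI values, not data from the study, and the function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical transcript lengths (nt) and CAI values.
lengths = [900, 1200, 1500, 2100, 2400]
cai_values = [0.75, 0.77, 0.76, 0.80, 0.79]
r = pearson_r(lengths, cai_values)
print(round(r, 3))
```

A significance test is not included here; where SciPy is available, `scipy.stats.pearsonr` returns both the coefficient and a p-value.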

Selective forces are dominant in both gene sets

A neutrality plot depicts the equilibrium between mutation and selection, the two dominant forces shaping codon usage. % GC12 was plotted against % GC3. A correlation between % GC3 and % GC12 indicates the likely presence of mutational forces, since such forces act on all codon positions. 44,45 We found a correlation between % GC12 and % GC3 in both the up- and downregulated gene sets: the Pearson correlation coefficient was r = 0.810, p < 0.0001 and r = 0.681, p < 0.0001 for up- and downregulated genes, respectively. These significant correlations indicated the presence of mutational force; 46 however, it was not dominant, since the regression coefficient corresponded to 47.92% and 28.49% for up- and downregulated genes, respectively. Thus, mutational forces contributed 47.92% and selection forces 52.08% for upregulated genes (Fig. 2A), and mutational forces contributed 28.49% and selection forces 71.51% for downregulated genes (Fig. 2B). In both cases, selection is the dominant force shaping codon bias.

Fig. 2A

Neutrality plot analysis for upregulated genes. Neutrality analysis was done using PAST4 software. 47

Fig. 2B

Neutrality plot analysis for downregulated genes.

Parity plot revealed preference of T (U) over A nucleotide at third codon position in both up and down gene transcripts

Parity plot analysis indicates the bias between A and T (U) and between C and G at the third codon position. The average GC bias value was 0.528±0.03 and 0.477±0.04 for up- and downregulated genes, respectively. The average AT (AU) bias values were 0.46±0.01 and 0.487±0.06 for up- and downregulated genes, respectively (Fig. 3). The GC bias results suggest that G is preferred over C in upregulated genes, whereas C is preferred over G in downregulated genes at the third codon position. For the AT bias, T (U) is preferred over A at the third codon position in both the up- and downregulated genes.

Fig. 3

Parity plot analysis for up- and downregulated genes.

Rare codon analysis

Apart from stop codons, the codons CGU – Arg, AUA – Ile, UUA – Leu, UCG – Ser, GUA – Val, and CGA – Arg in upregulated genes, and CUA – Leu, UCG – Ser, and UUA – Leu in downregulated genes, were present below the threshold value of 5/1000, which was set as the default, indicating that these codons made up less than 0.5% of the codons in the studied transcripts (Fig. 4).

Fig. 4

Rare codon analysis for up- and downregulated genes. Apart from stop codons, six codons in up- and three codons in downregulated genes were present below the threshold value of 5/1000.

Glutamine-initiated codon pairs have high residual values in upregulated genes

To identify the strongest and weakest codon context bias in up- and downregulated genes, the highest and lowest residual values were obtained. In upregulated genes, the bias was higher, with residual values ranging between 426 and –844. In downregulated genes, residual values ranged between 246 and –508, indicating a lower bias than in upregulated genes. Interestingly, in upregulated genes, five of the ten codon pairs with the largest negative residual values were initiated with the glutamine codon CAG (Table 2). Codon contexts for up- and downregulated genes are depicted in Figs. 5 and 6, respectively.

Table 2

Highest and lowest residual values

10 Highest residual values				10 Lowest residual values
Up genes		Down genes		Up genes		Down genes
AUA-ACG	426289	AGA-UAG	246158	CAG-CAG	–84453	AAA-CAG	–50889
CAU-CCU	378243	UAU-GCC	152406	CAG-GUG	–72847	UGC-GAG	–50423
UUA-GCA	368609	CGU-CAU	147455	CUG-AUG	–61268	GUG-AAA	–47693
CGU-CAA	313917	GUC-CGA	136969	CAG-GGC	–55268	CUG-UUC	–44497
AUU-GAC	303186	ACU-GUU	116874	UGG-GAG	–54298	GCU-AAG	–43640
UCA-UUA	273384	UUC-CGA	110710	CAG-GAA	–53927	CCC-GAA	–43089
CUA-CGA	254991	CUU-UCU	110314	CAG-GAC	–53717	CAC-GAG	–42698
AGU-CAA	247146	CGC-UAU	108284	CUG-AAA	–51006	AUC-GAA	–42214
AUG-UGA	244966	UUC-ACC	107157	GAG-GCC	–49896	CAC-AAG	–41536
ACU-GCG	230531	GCU-GGU	104799	GAA-CUG	–49563	GAG-CAC	–41473

Fig. 5

Codon context in upregulated genes during COVID-19 infection that contributes to Alzheimer’s disease. Green, red, and grey colors indicate high, low, and null occurrences of codon pairs appearing here.

Fig. 6

Codon context in downregulated genes during COVID-19 infection that contributes to Alzheimer’s disease. Green, red, and grey colors indicate high, low, and null occurrences of codon pairs appearing here.

In another approach, the codon context maps of the up- and downregulated genes were compared to produce differential display maps, using a color scale. Common features are displayed in black, significant differences in blue, and very high codon context changes in pink. The ACG-AUA, CAA-CGU, CCU-CAU, GAC-AUU, and GCA-UUA codon pairs exhibited the maximum difference in context (Fig. 7).

Fig. 7

A differential display map for up- and downregulated genes. Black presents similar residual values, while blue represents a medium difference. Pink boxes represent the highest residual value differences.

Codon pair analysis revealed high occurrence of the identical codon pairs GAG-GAG and GUG-GUG in both up- and downregulated genes

In up genes, 1591 codon pairs existed, while down genes had 2048 codon pairs. In up genes, 455 possible codon pairs were absent, while in down genes, all possible codon pairs were present. Among the top 20 most frequent codon pairs, only one pair ending with A/U (CAG-UUU) was present in up genes, while six such pairs were present in down genes: GAG-GAA = 161, AAG-GAA = 137, GAA-GAA = 127, GAG-AAA = 124, AAA-GAA = 106, and CAG-AAA = 92. The identical codon pairs GAG-GAG and GUG-GUG are abundant in both the up- and downregulated gene sets (Table 3). In addition, the identical codon pair CAG-CAG (n = 77) was abundant in upregulated genes, while GAA-GAA (n = 127) was abundant in downregulated genes.
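A ranking of the most frequent codon pairs, with identical pairs flagged, can be obtained by simple counting. The following Python sketch illustrates this on toy sequences; it is not the ANACONDA procedure used in the study, and the function name `top_codon_pairs` is hypothetical.

```python
from collections import Counter

def top_codon_pairs(transcripts, n=20):
    """Return (pair, count, is_identical) for the n most frequent adjacent codon pairs."""
    counts = Counter()
    for cds in transcripts:
        seq = cds.upper().replace("T", "U")  # report pairs in the RNA alphabet
        codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
        counts.update(zip(codons, codons[1:]))
    return [(f"{a}-{b}", k, a == b) for (a, b), k in counts.most_common(n)]

# Hypothetical transcripts, not from the study.
toy = ["GAGGAGGAGCTG", "GTGGTGAAA"]
ranking = top_codon_pairs(toy, n=3)
for pair, count, identical in ranking:
    print(pair, count, "identical" if identical else "")
```

Pairs are counted within each transcript only, so the last codon of one transcript is never paired with the first codon of the next.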

Table 3

Top 20 most frequently occurring codon pairs in up- and downregulated genes. Identical codon pairs are underlined

S. No.	Codon pair in upregulated genes	Occurrence of codon pair	S. No.	Codon pair in downregulated genes	Occurrence of codon pair
1.	GAG-GAG	151	1.	GAG-GAA	161
2.	GAG-CUG	120	2.	GAG-GAG	153
3.	CAG-CAG	110	3.	GAG-AAG	138
4.	CUG-GAG	109	4.	AAG-GAA	137
5.	AUU-GAC	101	5.	GAG-CUG	128
6.	CUG-GCC	93	6.	CUG-GAG	128
7.	GAG-CAG	92	7.	GAA-GAA	127
8.	CAG-AUG	89	8.	GAG-AAA	124
9.	AAC-AUG	88	9.	CUG-CUG	119
10.	AUG-GAG	84	10.	GUG-GUG	118
11.	GUG-GUG	77	11.	GUG-GAG	117
12.	CAG-CUG	77	12.	CUG-AAG	114
13.	CAG-GAG	76	13.	GAA-GAC	112
14.	GUG-GAG	71	14.	AAA-GAA	106
15.	GAG-CGG	71	15.	AAG-GAG	99
16.	AUG-CUG	71	16.	AUC-AAG	97
17.	CAG-UUU	70	17.	UUC-ACC	96
18.	GAG-AAG	69	18.	CAG-GAG	96
19.	ACC-AAG	67	19.	CAG-AAA	92
20.	AAC-AAC	67	20.	AAG-AUC	92

The 3′ xNN-Nxx matrix indicates that this position is influenced by purifying selection

In the upregulated gene set, at the p2-3-p1 positions, the trinucleotides TAA, GTA, TAC, GAC, TCG, GTT, and TAT were present at frequencies below 2%. Surprisingly, TGA, which forms a stop codon, was present at a frequency above 10%, while TAG and TAA were present at 2% and 1.53%, respectively. In downregulated genes, only CGA and TAG had a frequency below 2%, and the stop codons TGA, TAA, and TAG had frequencies of 9.35%, 2.28%, and 1.99%, respectively (Table 4).

Table 4

3′ xNN-Nxx matrix for up- and downregulated gene transcripts (percent occurrence)

	Up genes						Down genes
S.No.	xNN	Axx	Cxx	Gxx	Uxx	S.No.	xNN	Axx	Cxx	Gxx	Uxx
1	xAA	4.89	4	9.01	6.32	1	xAA	8.27	5.96	9.98	11.69
2	xAC	10.91	8.61	3.42	8.76	2	xAC	10	9.8	4.38	11.48
3	xAG	14.81	17.13	9.56	13.93	3	xAG	11.58	10.88	11.96	10.48
4	xAU	3.43	7.39	5.44	5.94	4	xAU	4.19	3.59	9.00	10.08
5	xCA	3.71	3.03	7.47	6.29	5	xCA	5.5	3.26	7.17	9.54
6	xCC	11.56	10.39	3.09	12.25	6	xCC	11.16	9.58	3.51	7.89
7	xCG	2.11	2.83	3.88	2.55	7	xCG	1.64	2.98	3.21	6.03
8	xCU	2.63	3.57	8.87	5.29	8	xCU	2.25	6.08	9.17	5.27
9	xGA	3.8	1.45	3.99	2.76	9	xGA	4.26	2.72	4.40	5.06
10	xGC	7.85	6.39	3.36	6.23	10	xGC	8.43	8.99	3.17	4.66
11	xGG	6.03	6.67	6.43	5.26	11	xGG	5.34	5.06	4.65	4.60
12	xGU	1.2	3.05	4.4	0.68	12	xGU	2.35	3.36	6.38	3.67
13	xUA	1.53	1.78	2.66	0.33	13	xUA	2.28	2.37	1.99	2.44
14	xUC	11.64	7.41	2.63	12.14	14	xUC	10.38	9.67	2.55	2.43
15	xUG	10.74	11.2	17.06	7.24	15	xUG	9.35	10.66	12.85	2.40
16	xUU	3.14	5.09	8.75	4.04	16	xUU	3.01	5.04	5.64	2.30

Dinucleotide analysis at codon pair junction

Among the 16 dinucleotides, CpG and TpA at p3-p1 are very low in frequency in both up- and downregulated genes. In addition, the ApC dinucleotide at the p3-p1 junction is low in frequency, below 15%. ApA is also present at low frequency in upregulated genes, but the same is not true for downregulated genes. Apart from the ApA dinucleotide, the trend in dinucleotide frequency at the p3-p1 junction is similar for the up- and downregulated genes (Fig. 8).

Fig. 8

Dinucleotide frequency at p3-p1 junction of codons in up- and downregulated genes.
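The dinucleotide frequencies at the p3-p1 junction can be sketched by pairing the third base of each codon with the first base of the next codon. The Python example below is illustrative only; the function name `junction_dinucleotides` and the toy transcripts are hypothetical, and sequences are handled in the DNA alphabet.

```python
from collections import Counter
from itertools import product

def junction_dinucleotides(transcripts):
    """Percent frequency of each of the 16 dinucleotides at codon junctions (p3-p1)."""
    counts = Counter()
    for cds in transcripts:
        seq = cds.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3
        # seq[i + 2] is position 3 of one codon, seq[i + 3] is position 1 of the next.
        counts.update(seq[i + 2] + seq[i + 3] for i in range(0, usable - 3, 3))
    total = sum(counts.values())
    return {
        "".join(d): (100 * counts["".join(d)] / total if total else 0.0)
        for d in product("ACGT", repeat=2)
    }

# Hypothetical transcripts: junctions are A|A and G|G in the first, C|G in the second.
toy = ["GCAATGGCC", "GACGTT"]
freq = junction_dinucleotides(toy)
print(freq["CG"], freq["TA"])
```

Applied to each gene set separately, the resulting 16 percentages would allow the up- and downregulated profiles to be compared side by side.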

DISCUSSION

AD is a disease characterized by neuronal degeneration. 48 A viral etiology for AD is known, and several viruses have been reported to contribute to the progression of AD, including Herpes Simplex type 1 and 2, Epstein–Barr virus, human cytomegalovirus, influenza virus, hepatitis C virus, and COVID-19. Many molecular changes that contribute to predisposition to AD have been reported in COVID-19 brains. 49 In the present study, we investigated the codon usage pattern; the forces affecting gene architecture, such as composition, mutation, and selection; codon pair bias; the differential display of codon pair bias in up- and downregulated genes; rare codons; and the dinucleotide pattern at the codon junction.

At the p2,3-p1 positions, stop-codon trinucleotides are expected to occur at frequencies below 2%, since the formation of stop codons is avoided due to selective forces. 35 However, in upregulated genes, two of the three stop codons had a frequency above 2% (TAG and TGA). The same is seen in downregulated genes, with two stop codons above 2% (TGA and TAA, with frequencies above 9% and 2%, respectively). This indicates that purifying selection plays only a minor role at the p2,3-p1 position.

In the present study, based on codon pair occurrence analysis, we found an abundance of identical codon pairs. For a run of identical amino acids, codons that use the same tRNA, and thus identical codons, tend to be used. The usage of identical codons therefore appears to favor the translation process. The presence of identical codons allows the tRNA to be recharged and to translate both codons before it diffuses away. Co-tRNA and identical codon pairing conserve resources and increase translational efficiency by approximately 30%. 50 It is also established that translational dynamics leave a significant signature on the genome. 51 GTG-GTG and CTG-CTG codon pairs have been reported to be the most favored codon pairs in the depression-associated gene set. 52 In the proteome of E. coli MG1655, proline codon pairs exhibited a regulatory role in translation. 53

The TGA codon occurs more frequently in both gene sets than TAA and TAG. TGA has a selective advantage over the other stop codons, since the three termination codons are not entirely synonymous and their ability to act as termination codons varies. 54 The TAA codon is the most efficient in termination. In contrast, TGA is the least efficient, and significant readthrough has been observed. 55 Positive selection towards TGA has been observed, since readthrough is a highly regulated mechanism that expands proteome diversity by generating additional C-terminally modified protein variants. 56

Negative selection against codons containing TpA and CpG is a common phenomenon. 57 Other than stop codons, the codons that are rare (occurrence below 0.5%) in both up- (CGU – Arg, AUA – Ile, UUA – Leu, UCG – Ser, GUA – Val, CGA – Arg) and downregulated genes (CUA – Leu, UCG – Ser, UUA – Leu) all contain TpA or CpG as an integral part. 24 TpA is part of the stop codon TAA, and CpG is predisposed to mutation by deamination of 5-methylcytosine at CpG sites, resulting in C to T changes. 58,59
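The observation that every rare codon carries TpA or CpG can be checked directly. The short Python sketch below uses the codon lists given in the paragraph above; the function name is hypothetical, and UpA in the RNA notation is treated as TpA.

```python
# Rare codons (occurrence below 0.5%) as listed in the text, in RNA notation.
RARE_UP = ["CGU", "AUA", "UUA", "UCG", "GUA", "CGA"]
RARE_DOWN = ["CUA", "UCG", "UUA"]

def has_tpa_or_cpg(codon):
    """Return True if the codon contains a TpA (UpA) or CpG dinucleotide."""
    dna = codon.upper().replace("U", "T")
    return "TA" in dna or "CG" in dna

# Map each rare codon to the dinucleotides it contains.
found = {
    codon: [d for d in ("TA", "CG") if d in codon.replace("U", "T")]
    for codon in RARE_UP + RARE_DOWN
}
```

Here `found` maps each rare codon to the dinucleotide responsible, for example `found["CGU"]` is `["CG"]` and `found["AUA"]` is `["TA"]`.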

CpG and TpA at p3-p1 are underrepresented in both up- and downregulated genes. This underrepresentation at the p3-p1 junction suggests the role of other forces, such as immune pressure, high mutability resulting in a transition from CpG to TpG, selection forces, the mRNA-destabilizing effect of TpA, and the higher susceptibility of TpA to cytoplasmic RNases. 60
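Under- or overrepresentation of a junction dinucleotide is commonly expressed as an observed/expected odds ratio. The Python sketch below is one such formulation and is an assumption, not necessarily the calculation used in this study: the expected count of dinucleotide XY is derived from how often X occurs at p3 and Y at the following p1. The function name is hypothetical.

```python
from collections import Counter

def junction_odds_ratios(sequences):
    """Observed/expected ratio for each dinucleotide at the p3-p1 junction.

    Values below 1 suggest underrepresentation. Dinucleotides that never
    occur at a junction are absent from the returned dictionary.
    """
    pair_counts, p3_counts, p1_counts = Counter(), Counter(), Counter()
    for cds in sequences:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        # i indexes the third position of every codon except the last one
        for i in range(2, usable - 1, 3):
            x, y = cds[i], cds[i + 1]
            pair_counts[x + y] += 1
            p3_counts[x] += 1
            p1_counts[y] += 1
    total = sum(pair_counts.values())
    ratios = {}
    for dinuc, observed in pair_counts.items():
        expected = p3_counts[dinuc[0]] * p1_counts[dinuc[1]] / total
        ratios[dinuc] = observed / expected
    return ratios
```

For a gene set, `junction_odds_ratios(seqs).get("CG")` and `.get("TA")` would give the CpG and TpA ratios to compare against 1.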

Codon pairs influence the translation process. 61 In upregulated genes, five of the top ten codon pairs were initiated with glutamine. In the comorbidity of cancer and neurodegeneration, the APP, CCND1, and PTPA genes also exhibited an abundance of glutamine-initiated codon pairs. 62

In a study encompassing 14,026 human genes, selection pressure was found to be responsible for the heterogeneity of C content at the third codon position. 63 G/C ending codon usage increases with increasing GC bias and vice versa. 64,65 In the present study, in the upregulated gene set, GC content at the third codon position was higher than the average AT content, and thus G/C ending codons were preferred, in accordance with the phenomenon mentioned above. In the downregulated gene set, GC content at the third codon position was also high, yet no clear trend of preference for G/C ending codons was observed. Thus, it can be inferred that factors other than compositional forces can influence nucleotide bias at the third codon position. The parity plot analysis shows that T is preferred over A in both gene sets. This result might be partially explained by composition, since in the upregulated genes % A is lower than % T. However, the same is not true for the downregulated genes, where the % A and % T compositions are not much dissimilar.
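The third-position quantities discussed above can be sketched in Python as follows. This is a simplified illustration with a hypothetical function name: it uses every third codon position, whereas parity (PR2) analysis is often restricted to fourfold degenerate codons, so the values will not match such an analysis exactly.

```python
def third_position_stats(cds):
    """Return GC3 (%), AT bias A3/(A3+T3) and GC bias G3/(G3+C3).

    Assumption: all third codon positions are used. An AT bias below 0.5
    means T is used more often than A at the third position.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third = cds[2:usable:3]           # every third nucleotide, in frame
    a, t = third.count("A"), third.count("T")
    g, c = third.count("G"), third.count("C")
    gc3 = 100.0 * (g + c) / len(third) if third else None
    at_bias = a / (a + t) if (a + t) else None
    gc_bias = g / (g + c) if (g + c) else None
    return {"GC3": gc3, "AT_bias": at_bias, "GC_bias": gc_bias}
```

Plotting `GC_bias` against `AT_bias` for every gene gives the parity plot; points with `AT_bias` below 0.5 correspond to the T-over-A preference described above.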

Several forces shape the genome of any organism, including selection, mutation, and compositional forces. The paragraph above considered the impact of composition on codon bias. We further investigated whether other forces, such as selection, are operating. A regression analysis between % GC3 and % GC12 attributed 52.08% and 71.51% of the effect to selection forces in up- and downregulated genes, respectively.
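In the usual reading of a neutrality plot, GC12 is regressed on GC3; the slope is taken as the share attributable to mutation pressure and the remainder as the share attributable to selection. The Python sketch below follows that convention with an ordinary least-squares slope. The function names and the toy numbers in the usage note are illustrative assumptions, not data from the study.

```python
def neutrality_slope(gc3, gc12):
    """Least-squares slope of GC12 (y) regressed on GC3 (x), one point per gene."""
    n = len(gc3)
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    return sxy / sxx

def force_split(slope):
    """Convert a slope into (mutation %, selection %) under the usual convention."""
    mutation = 100.0 * slope
    return mutation, 100.0 - mutation
```

For example, toy inputs `gc3 = [40, 50, 60, 70]` and `gc12 = [44, 46, 48, 50]` give a slope of 0.2, which would be read as roughly 20% mutation and 80% selection. Under the same reading, the 52.08% selection share reported above would correspond to a slope of about 0.48.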

Gene length is associated with many factors, such as intron number and gene duplication. Longer genes tend to be tissue-specific; the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and brain. 66 In neurodegeneration-related genes, codon bias is significantly correlated with gene length. 67 The codon adaptation index (CAI) can be used to measure codon bias. It is also a surrogate for protein expression, and CAI can be correlated with gene expression. 68 Brown (2021) reported a correlation between gene expression and gene length, 69 and we obtained the same result, with the correlation present in both gene sets. The role of gene length in regulating gene expression has been reported. 69 Highly expressed genes are generally reported to be short; 16,66 however, in the present study, we found that longer genes have higher expressivity than shorter genes.
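The relationship between gene length and CAI can be examined with a short Python sketch. It assumes the standard definition of CAI as the geometric mean of per-codon relative adaptiveness values, and a plain Pearson correlation; the `weights` table must come from a reference set chosen by the analyst, and the function names and toy values are hypothetical.

```python
import math

def cai(cds, weights):
    """Geometric mean of relative adaptiveness over the codons of a sequence.

    `weights` maps DNA codons to values in (0, 1]. Codons missing from
    the table (for example ATG, TGG, or stops) are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    logs = [math.log(weights[cds[i:i + 3]])
            for i in range(0, usable, 3) if cds[i:i + 3] in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

With per-gene lists `lengths` and `cai_values`, `pearson(lengths, cai_values)` gives the correlation coefficient; a positive value would be in line with the finding that longer genes show higher expressivity.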

Conclusion and future perspectives

A viral cause of neurodegeneration is known, and COVID-19 is one of several viral infections that predispose towards AD and related dementias. Transcriptome analysis of AD, control, and COVID-19 patients revealed modulation of several genes, some of which are upregulated and others downregulated. The study of nucleotide composition revealed that, although both sets are GC-rich transcripts, G/C ending codons were preferred only in upregulated genes, while no such pattern was present in downregulated genes. However, the up- and downregulated gene sets also shared some common features. In both gene sets, gene expression increased with gene length, and selection was a larger operative force than mutation in shaping the transcript architecture. All rare codons contained either CpG or TpA as an integral part, again suggestive of negative selection. The T nucleotide was preferred over the A nucleotide at the third codon position in both gene sets. An interesting observation was the presence of glutamine-initiated codon pairs among the top 10 codon pairs with the highest residual values. The present study helps in understanding the molecular patterns present in these genes, which might, in the future, help in finding therapeutics against AD in COVID-19 patients.

AUTHOR CONTRIBUTIONS

Yan Liu (Conceptualization; Data curation; Formal analysis; Methodology; Writing – review & editing); Weiyue Xu (Conceptualization; Data curation; Formal analysis; Validation; Writing – review & editing); Pan Yang (Conceptualization; Data curation; Formal analysis; Resources; Writing – original draft); Xingshun Liu (Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Writing – review & editing).

Footnotes

ACKNOWLEDGMENTS

Authors acknowledge support from their respective universities.

FUNDING

The authors have no funding to report.

CONFLICT OF INTEREST

The authors have no conflict of interest to report.

DATA AVAILABILITY

The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

References

1. , Wang and Geng. Alzheimer’s disease hypothesis and related therapies. Transl Neurodegener 2018; 7: 2.

2. Itzhaki. Overwhelming evidence for a major role for herpes simplex virus type 1 (HSV1) in Alzheimer’s disease (AD); underwhelming evidence against. Vaccines (Basel) 2021; 9: 679.

3. Protto, Marcocci, Miteva, et al. Role of HSV-1 in Alzheimer’s disease pathogenesis: A challenge for novel preventive/therapeutic strategies. Curr Opin Pharmacol 2022; 63: 102200.

4. Tiwari, Mittal and Jha. Unraveling the links between neurodegeneration and Epstein-Barr virus-mediated cell cycle dysregulation. Curr Res Neurobiol 2022; 3: 100046.

5. Zhang, Zuo, Jiang, et al. Epstein-Barr virus and neurological diseases. Front Mol Biosci 2021; 8: 816098.

6. Liu, Wang, Niu, et al. Single-cell RNA sequencing transcriptomics revealed HCMV IE2-related microglia responses in Alzheimer’s-like disease in transgenic mice. Mol Neurobiol 2024; 61: 1331–1345.

7. Mody, Marvin, Hynds, et al. Cytomegalovirus infection induces Alzheimer’s disease-associated alterations in tau. J Neurovirol 2023; 29: 400–415.

8. Imfeld, Toovey, Jick, et al. Influenza infections and risk of Alzheimer’s disease. Brain Behav Immun 2016; 57: 187–192.

9. Bruno, Abondio, Bruno, et al. Alzheimer’s disease as a viral disease: Revisiting the infectious hypothesis. Ageing Res Rev 2023; 91: 102068.

10. Chiu and Chen. PIN79 Hepatitis C virus infection increases the risk of Alzheimer’s diseases. Value Health 2012; 15: A399.

11. Piekut, Hurła, Banaszek, et al. Infectious agents and Alzheimer’s disease. J Integr Neurosci 2022; 21: 73.

12. Sait, Angeli, Doig, et al. Viral involvement in Alzheimer’s disease. ACS Chem Neurosci 2021; 12: 1049–1060.

13. Liu, Jiang and Li. The viral hypothesis in Alzheimer’s disease: SARS-CoV-2 on the cusp. Front Aging Neurosci 2023; 15: 1129640.

14. Green, Mayilsamy, McGill, et al. SARS-CoV-2 infection increases the gene expression profile for Alzheimer’s disease risk. Mol Ther Methods Clin Dev 2022; 27: 217–229.

15. Parotto, Gyöngyösi, Howe, et al. Post-acute sequelae of COVID- Understanding and addressing the burden of multisystem manifestations. Lancet Respir Med 2023; 11: 739–754.

16. Grishkevich and Yanai. Gene length and expression level shape genomic novelties. Genome Res 2014; 24: 1497–1503.

17. Toniolo, Scarioni, Di Lorenzo, et al. Dementia and COVID-19, a bidirectional liaison: Risk factors, biomarkers, and optimal health care. J Alzheimers Dis 2021; 82: 883–898.

18. Dubey, Das, Ghosh, et al. The effects of SARS-CoV-2 infection on the cognitive functioning of patients with pre-existing dementia. J Alzheimers Dis Rep 2023; 7: 119–128.

19. Hershberg and Petrov. Selection on codon bias. Annu Rev Genet 2008; 42: 287–299.

20. Parvathy, Udayasuriyan and Bhadana. Codon usage bias. Mol Biol Rep 2022; 49: 539–565.

21. Khandia, Pandey, Khan, et al. Synthetic biology approach revealed enhancement in haeme oxygenase-1 gene expression by codon pair optimization while reduction by codon deoptimization. Ann Med Surg (Lond) 2024; 86: 1359–1369.

22. Quax TEF, Claassens, Söll, et al. Codon bias as a means to fine-tune gene expression. Mol Cell 2015; 59: 149–161.

23. Jiao, Jing, Zhang, et al. Codon pattern and context analysis in genes triggering Alzheimer’s disease and latent tau protein aggregation post-anesthesia exhibited unique molecular patterns associated with functional aspects. J Alzheimers Dis 2024; 97: 1645–1660.

24. , Khandia, Papadakis, et al. An investigation of codon usage pattern analysis in pancreatitis associated genes. BMC Genom Data 2022; 23: 81.

25. Puigbò, Bravo and Garcia-Vallve. CAIcal: A combined set of tools to assess codon usage adaptation. Biol Direct 2008; 3: 38.

26. Puigbò, Bravo and Garcia-Vallvé. E-CAI: A novel server to estimate an expected value of Codon Adaptation Index (eCAI). BMC Bioinformatics 2008; 9: 65.

27. Hao, Liang, Ping, et al. Chloroplast gene expression level is negatively correlated with evolutionary rates and selective pressure while positively with codon usage bias in Ophioglossum vulgatum L. BMC Plant Biol 2022; 22: 580.

28. Masłowska-Górnicz, van den Bosch MRM, Saccenti, et al. A large-scale analysis of codon usage bias in bacterial genomes shows association of codon adaptation index with GC content, protein functional domains and bacterial phenotypes. Biochim Biophys Acta Gene Regul Mech 2022; 1865: 194826.

29. Jansen, Bussemaker and Gerstein. Revisiting the codon adaptation index from a whole-genome perspective: Analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models. Nucleic Acids Res 2003; 31: 2242–2251.

30. Deb, Uddin and Chakraborty. Codon usage pattern and its influencing factors in different genomes of hepadnaviruses. Arch Virol 2020; 165: 557–570.

31. Srivastava, Chanyal, Dubey, et al. Patterns of codon usage bias in WRKY genes of Brassica rapa and Arabidopsis thaliana. J Agric Sci 2019; 11: 76.

32. Khandia, Khan, Karuvantevida, et al. Insights into synonymous codon usage bias in hepatitis C virus and its adaptation to hosts. Pathogens 2023; 12: 325.

33. Shi S-L, Xia R-X. Codon usage in the iflaviridae family is not diverse though the family members are isolated from diverse host taxa. Viruses 2019; 11: 1087.

34. Krishna and Padma Sree. Response surface modeling and optimization of chromium (VI) removal from aqueous solution using borasus flabellifer coir powder. Int J Appl Sci Eng 2013; 11: 213–226.

35. Alqahtani, Khandia, Puranik, et al. Codon usage is influenced by compositional constraints in genes associated with dementia. Front Genet 2022; 13: 884348.

36. Alqahtani, Khandia, Puranik, et al. Leucine encoding codon TTG shows an inverse relationship with GC content in genes involved in neurodegeneration with iron accumulation. J Integr Neurosci 2021; 20: 905–918.

37. Daniel, Onwukwe, Wierenga, et al. ATGme: Open-source web application for rare codon identification and custom DNA sequence optimization. BMC Bioinformatics 2015; 16: 303.

38. Komar. A code within a code: How codons fine-tune protein folding in the cell. Biochemistry (Mosc) 2021; 86: 976–991.

39. Rosano and Ceccarelli. Rare codon content affects the solubility of recombinant proteins in a codon bias-adjusted Escherichia coli strain. Microb Cell Fact 2009; 8: 41.

40. Moura, Pinheiro, Arrais, et al. Large scale comparative codon-pair context analysis unveils general rules that fine-tune evolution of mRNA primary structure. PLoS One 2007; 2: e847.

41. Nussinov. Doublet frequencies in evolutionary distinct groups. Nucleic Acids Res 1984; 12: 1749–1763.

42. , Liu, Li, et al. Comprehensive analysis of synonymous codon usage patterns and influencing factors of porcine epidemic diarrhea virus. Arch Virol 2021; 166: 157–165.

43. Khandia, Pandey, Khan, et al. Codon usage and context analysis of genes modulated during SARS-CoV-2 infection and dental inflammation. Vaccines (Basel) 2022; 10: 1874.

44. Jenkins and Holmes. The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Res 2003; 92: 1–7.

45. Kumar, Khandia, Singhal, et al. Insight into codon utilization pattern of tumor suppressor gene EPB41L3 from different mammalian species indicates dominant role of selection force. Cancers (Basel) 2021; 13: 2739.

46. Khandia, Singhal, Kumar, et al. Analysis of Nipah virus codon usage and adaptation to hosts. Front Microbiol 2019; 10: 886.

47. Hammer, Harper DAT, Ryan. PAST: Paleontological Statistics Software Package for Education and Data Analysis.

48. DeTure and Dickson. The neuropathological diagnosis of Alzheimer’s disease. Mol Neurodegener 2019; 14: 32.

49. Rudnicka-Drożak, Drożak, Mizerski, et al. Links between COVID-19 and Alzheimer’s disease-what do we already know? Int J Environ Res Public Health 2023; 20: 2146.

50. Miller, McKinnon, Whiting, et al. Codon pairs are phylogenetically conserved: A comprehensive analysis of codon pairing conservation across the Tree of Life. PLoS One 2020; 15: e0232260.

51. Cannarozzi, Schraudolph, Faty, et al. A role for codon order in translation dynamics. Cell 2010; 141: 355–367.

52. Khandia, Gurjar, Kamal, et al. Relative synonymous codon usage and codon pair analysis of depression associated genes. Sci Rep 2024; 14: 3502.

53. Krafczyk, Qi, Sieber, et al. Proline codon pair selection determines ribosome pausing strength and translation efficiency in bacteria. Commun Biol 2021; 4: 589.

54. Trexler, Bányai, Kerekes, et al. Evolution of termination codons of proteins and the TAG-TGA paradox. Sci Rep 2023; 13: 14294.

55. McCaughan, Brown, Dalphin, et al. Translational termination efficiency in mammals is influenced by the base following the stop codon. Proc Natl Acad Sci U S A 1995; 92: 5431–5435.

56. Manjunath, Singh, Som, et al. Mammalian proteome expansion by stop codon readthrough. Wiley Interdiscip Rev RNA 2023; 14: e1739.

57. Khandia, Alqahtani. Genes common in primary immunodeficiencies and cancer display overrepresentation of codon CTG and dominant role of selection pressure in shaping codon usage. Biomedicines 2021; 9: 1001.

58. Khandia, Sharma, Alqahtani, et al. Strong selectional forces fine-tune CpG content in genes involved in neurological disorders as revealed by codon usage patterns. Front Neurosci 2022; 16: 887929.

59. Munjal, Khandia, Shende, et al. Mycobacterium lepromatosis genome exhibits unusually high CpG dinucleotide content and selection is key force in shaping codon usage. Infect Genet Evol 2020; 84: 104399.

60. Gurjar, Karuvantevida, Rzhepakovsky, et al. A synthetic biology approach for vaccine candidate design against delta strain of SARS-CoV-2 revealed disruption of favored codon pair as a better strategy over using rare codons. Vaccines (Basel) 2023; 11: 487.

61. Harigaya and Parker. The link between adjacent codon pairs and mRNA stability. BMC Genomics 2017; 18: 364.

62. Khandia, Pandey, Zaki MEA, et al. Application of codon usage and context analysis in genes up- or down-regulated in neurodegeneration and cancer to combat comorbidities. Front Mol Neurosci 2023; 16: 1200523.

63. Sueoka and Kawanishi. DNA G + C content of the third codon position and codon usage biases of human genes. Gene 2000; 261: 53–62.

64. Palidwor, Perkins, Xia. A general model of codon bias due to GC mutational bias. PLoS One 2010; 5: e13431.

65. Khandia, Ali Khan, Alexiou, et al. Codon usage analysis of pro-apoptotic Bim gene isoforms. J Alzheimers Dis 2022; 86: 1711–1725.

66. Lopes, Altab, Raina, et al. Gene size matters: An analysis of gene length in the human genome. Front Genet 2021; 12: 559998.

67. Khandia, Saeed, Alharbi, et al. Codon usage bias correlates with gene length in neurodegeneration associated genes. Front Neurosci 2022; 16: 895607.

68. Huyan, Tang, Li, et al. Optimized expression and purification of Humbug in Pichia pastoris and its monoclonal antibody preparation. Iran J Public Health 2015; 44: 1632–1642.

69. Brown. Role of gene length in control of human gene expression: Chromosome-specific and tissue-specific effects. Int J Genomics 2021; 2021: 8902428.
