Abstract
The first primary age-related tauopathy (PART) genome-wide association study confirmed significant associations of Alzheimer’s disease (AD) and progressive supranuclear palsy (PSP) genetic variants with PART, and highlighted a novel genetic variant, rs56405341. Here, we performed a comprehensive analysis of rs56405341. We found that rs56405341 was significantly associated with C4orf33 mRNA expression, but not with JADE1 mRNA expression, in multiple brain tissues. C4orf33 was mainly expressed in cerebellar hemisphere and cerebellum, whereas JADE1 was mainly expressed in thyroid and coronary artery. Meanwhile, we found that C4orf33 expression was significantly downregulated in both AD and PSP compared with normal controls.
INTRODUCTION
Primary age-related tauopathy (PART) is a neurodegenerative pathology characterized by neurofibrillary tangles without associated amyloid-β (Aβ) pathology [1–3]. PART is distinct from Alzheimer’s disease (AD) and is common in very elderly people [4]. Recently, the first PART genome-wide association study (GWAS) was conducted in 647 PART individuals using Braak neurofibrillary tangle stage as a quantitative trait [1]. This study confirmed significant associations with PART for seven genetic variants around AD risk loci (including SLC24A4, MS4A6A, and HS3ST1) and progressive supranuclear palsy (PSP) risk loci (including MAPT and EIF2AK3) [1]. Importantly, this study highlighted a novel non-coding genetic variant, rs56405341, reaching genome-wide significance (p < 5×10⁻⁸) in a locus containing three genes (C4orf33, SCLT1, JADE1) [1].
Collectively, this first PART GWAS provides important findings and may further improve our understanding of the genetics of PART. However, expression quantitative trait loci (eQTLs) analysis identified only a modestly significant association between the rs56405341 variant and JADE1 mRNA expression (p = 0.038) in dorsolateral prefrontal cortex of 452 postmortem samples from the Religious Orders Study and Memory and Aging Project (ROSMAP) [1]. Therefore, this finding should be further validated, as described in the limitations [1]. Multiple large-scale eQTLs datasets and gene expression datasets in human brain tissues are now publicly available, which prompted us to further evaluate this finding.
Here, we performed a comprehensive analysis of the rs56405341 variant and its target genes. In stage 1, we conducted an eQTLs analysis to identify the target genes regulated by rs56405341 using two eQTLs datasets from Brain xQTL Serve and PsychENCODE in dorsolateral prefrontal cortex. In stage 2, we conducted an eQTLs analysis to verify the association of the rs56405341 variant with the expression of its target genes using 13 eQTLs datasets from 13 GTEx brain tissues. In stage 3, we performed a gene expression analysis of these target genes using mRNA expression datasets across 49 GTEx human tissues. In stage 4, we conducted a differential mRNA expression analysis of these target genes in AD, PSP, and normal controls using two case-control gene expression datasets, one from the Mayo Clinic Brain Bank and Banner Sun Health Research Institute, and one from the Harvard Brain Tissue Resource Center.
MATERIALS AND METHODS
eQTLs analysis using datasets in prefrontal cortex
We performed an eQTLs analysis of rs56405341 in dorsolateral prefrontal cortex, as done by Farrell and colleagues [1]. Here, we selected two independent eQTLs datasets. We first selected a large-scale eQTLs dataset in human brain from Brain xQTL Serve (updated June 2021) [5]. Original genotype data and gene expression data are available for 534 individuals from the ROSMAP study and the CommonMind Consortium [6, 7]. eQTLs analysis was performed using a linear regression of genotype data and gene expression data from these 534 individuals from ROSMAP and 16,918 expressed genes [6, 7]. We also selected another eQTLs dataset from the PsychENCODE Consortium, which provides a comprehensive online resource including 1,866 individuals in dorsolateral prefrontal cortex [8]. In PsychENCODE, a linear regression analysis was used to conduct the eQTLs analysis [8].
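As a minimal sketch of the kind of linear regression described above, the following Python snippet regresses expression on allele dosage (0, 1, or 2 copies of the effect allele) under an additive model. The function name, the toy data, and the normal approximation used for the p-value are illustrative assumptions; the Brain xQTL Serve and PsychENCODE pipelines apply normalization and covariate adjustment that are not reproduced here.

```python
import math

def eqtl_regression(dosages, expression):
    """Simple additive-model eQTL test: expression ~ intercept + beta * dosage.

    Returns (beta, standard_error, two_sided_p). The p-value uses a normal
    approximation to the t-statistic, which is an assumption that is only
    reasonable for large sample sizes.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    beta = sxy / sxx                          # least-squares slope
    intercept = mean_e - beta * mean_g
    rss = sum((e - (intercept + beta * g)) ** 2
              for g, e in zip(dosages, expression))
    se = math.sqrt(rss / (n - 2) / sxx)       # standard error of the slope
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area
    return beta, se, p

# Hypothetical toy data (not from any dataset used in this study).
dosages = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [5.1, 4.9, 5.0, 4.6, 4.4, 4.5, 4.1, 3.9, 4.0]
beta, se, p_value = eqtl_regression(dosages, expression)
```

In this toy example the slope is negative, meaning each additional copy of the effect allele is associated with lower expression.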
eQTLs analysis using datasets from GTEx
In addition to dorsolateral prefrontal cortex, we conducted an eQTLs analysis of rs56405341 using the Genotype-Tissue Expression Consortium (GTEx, version 8) datasets in multiple human brain tissues [9]. GTEx version 8 includes 828 donors, 15,201 samples, and 49 human tissues (number of samples with genotype ≥ 70) [9]. Here, we performed the eQTLs analysis using the online GTEx eQTL Calculator, focusing on 13 brain tissues including Amygdala, Anterior cingulate cortex (BA24), Caudate (basal ganglia), Cerebellar Hemisphere, Cerebellum, Cortex, Frontal Cortex (BA9), Hippocampus, Hypothalamus, Nucleus accumbens (basal ganglia), Putamen (basal ganglia), Spinal cord (cervical c-1), and Substantia nigra [10].
Gene expression analysis
For the target genes regulated by the rs56405341 variant, we conducted a gene expression analysis to evaluate their mRNA expression across the 49 human tissues using the gene expression data in GTEx (version 8) [9]. The gene expression level was quantified by transcripts per million (TPM) based on the GENCODE 26 annotation, collapsed to a single transcript model for each gene using a custom isoform collapsing procedure [9].
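For readers unfamiliar with the unit, the standard definition of TPM can be sketched as follows: read counts are divided by gene length, and the resulting rates are rescaled so that they sum to one million within a sample. This is only an illustration of the general definition; the function name and the toy counts and lengths are hypothetical, and the GTEx quantification pipeline itself is not reproduced here.

```python
def transcripts_per_million(counts, lengths_kb):
    """Convert read counts to TPM given gene lengths in kilobases."""
    # Length-normalized read rate for each gene.
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    # Rescale so that values within one sample sum to 1,000,000.
    return [r / total * 1e6 for r in rates]

# Hypothetical toy sample with three genes (not GTEx data).
toy_counts = [100, 200, 300]
toy_lengths_kb = [1.0, 2.0, 3.0]
toy_tpm = transcripts_per_million(toy_counts, toy_lengths_kb)
```

Because the three toy genes have the same number of reads per kilobase, they receive equal TPM values even though their raw counts differ, which is why TPM rather than raw counts is used to compare expression across genes and tissues.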
Case-control gene expression analysis
For the target genes regulated by the rs56405341 variant, we evaluated their differential mRNA expression in AD, PSP, and normal controls using two case-control gene expression datasets in cerebellum. The first gene expression dataset includes the RNAseq data from AD (n = 86), PSP (n = 84), and normal controls (n = 80) without neurodegenerative diagnoses, collected by the Mayo Clinic Brain Bank and Banner Sun Health Research Institute [11]. Normalized read counts were assessed for differential expression across diagnosis groups by a multivariable linear regression adjusting for key covariates [11]. The second gene expression dataset includes the gene expression in cerebellum from 129 AD cases and 101 nondemented healthy controls, collected by the Harvard Brain Tissue Resource Center [12]. Here, we identified the differentially expressed genes using the online analysis tool GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) [13].
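The idea of a covariate-adjusted regression can be illustrated with the short Python sketch below, which fits expression on a diagnosis indicator plus one covariate by ordinary least squares. It is not the model used in [11]: the single covariate (age), the function name, and the toy values are all hypothetical and chosen only to show how adjustment can change the estimated diagnosis effect.

```python
def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: list of predictor tuples; an intercept column is added here.
    Returns coefficients [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    M = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

# Hypothetical toy samples: (diagnosis indicator, age); 1 = case, 0 = control.
samples = [(0, 70), (0, 80), (0, 90), (1, 75), (1, 85), (1, 95)]
# Toy expression built so that the true diagnosis effect is -2.0.
expr = [10.0 - 2.0 * d + 0.1 * age for d, age in samples]

intercept, diagnosis_effect, age_effect = fit_ols(samples, expr)

# Unadjusted comparison of group means, for contrast.
cases = [e for (d, _), e in zip(samples, expr) if d == 1]
controls = [e for (d, _), e in zip(samples, expr) if d == 0]
naive_difference = sum(cases) / len(cases) - sum(controls) / len(controls)
```

In this toy example the cases are older on average, so the unadjusted difference in group means understates the diagnosis effect, whereas the adjusted coefficient recovers the value built into the data.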
RESULTS
eQTLs analysis of rs56405341 variant in dorsolateral prefrontal cortex
In the first dorsolateral prefrontal cortex eQTLs dataset from Brain xQTL Serve, we found no significant association between the rs56405341 variant and the expression of five genes, namely JADE1 (p = 9.71E-01), SCLT1 (p = 9.83E-02), C4orf33 (p = 2.65E-01), PGRMC2 (p = 3.01E-01), and LINC02615 (p = 3.93E-01), using p < 0.05 as the statistical significance threshold, as provided in Table 1. In the second dorsolateral prefrontal cortex eQTLs dataset from PsychENCODE, we found a significant association between the rs56405341 variant and C4orf33 mRNA expression (p = 5.34E-05), but no significant association with the other four genes, namely PGRMC2 (p = 6.69E-01), JADE1 (p = 9.94E-01), SCLT1 (p = 7.13E-01), and EEF1GP8 (p = 7.75E-01), as provided in Table 1. Collectively, the current findings from Brain xQTL Serve and PsychENCODE showed that the rs56405341 variant was not significantly associated with JADE1 mRNA expression in dorsolateral prefrontal cortex.
eQTLs analysis of rs56405341 in GTEx (version 8) human brain tissues
Beta is the regression coefficient, based on the rs56405341 A allele using an additive model. Beta > 0 and Beta < 0 indicate that the A allele is associated with increased and reduced gene expression, respectively. The significance threshold is p < 0.05.
eQTLs analysis of rs56405341 variant in GTEx
Using the GTEx eQTLs datasets, we focused on JADE1 and C4orf33 mRNA expression. eQTLs analysis indicated no significant association between the rs56405341 variant and JADE1 mRNA expression, as provided in Table 1. Interestingly, the GTEx eQTLs findings further support the significant association between the rs56405341 variant and C4orf33 mRNA expression in cerebellar hemisphere (p = 9.90E-04), cerebellum (p = 2.20E-03), caudate (basal ganglia) (p = 3.00E-02), hippocampus (p = 3.00E-02), and spinal cord (p = 4.00E-02), as provided in Table 1. However, the direction of the effect of the rs56405341 A allele on C4orf33 mRNA expression differs across GTEx brain tissues, as provided in Table 1. Meanwhile, the eQTLs findings did not support a significant association between the rs56405341 variant and C4orf33 mRNA expression in frontal cortex (BA9) (p = 5.00E-01).
Gene expression analysis
Gene expression analysis using mRNA expression datasets across 49 GTEx human tissues showed that C4orf33 is mainly expressed in Cerebellar Hemisphere (median TPM = 11.40), Cerebellum (median TPM = 10.00), Cells - EBV-transformed lymphocytes (median TPM = 7.848), Pituitary (median TPM = 6.073), Spinal cord (median TPM = 5.360), Ovary (median TPM = 4.832), Thyroid (median TPM = 4.167), Kidney - Medulla (median TPM = 4.141), Lung (median TPM = 3.716), Spleen (median TPM = 3.696), Hypothalamus (median TPM = 3.684), and Adrenal Gland (median TPM = 3.611), which were the tissues with the highest expression. The box plots for the expression of C4orf33 in different tissues are provided in Fig. 1.

Bulk tissue gene expression for C4orf33 (ENSG00000151470.12) in different tissues in GTEx. The gene expression values are shown in transcripts per million (TPM). The gene expression level was quantified by TPM based on the GENCODE 26 annotation, collapsed to a single transcript model for each gene using a custom isoform collapsing procedure.
Gene expression analysis using mRNA expression datasets across 49 GTEx human tissues showed that JADE1 is mainly expressed in Thyroid (median TPM = 35.54), Artery - Coronary (median TPM = 34.16), Nerve - Tibial (median TPM = 25.84), Artery - Tibial (median TPM = 25.54), Ovary (median TPM = 23.75), Cerebellar Hemisphere (median TPM = 23.15), Esophagus - Gastroesophageal Junction (median TPM = 23.03), Adipose - Subcutaneous (median TPM = 22.89), Artery - Aorta (median TPM = 20.94), and Breast - Mammary Tissue (median TPM = 20.29), as the top 10 expressed tissues. The box plots for the expression of JADE1 in different tissues are provided in Fig. 2.

Bulk tissue gene expression for JADE1 (ENSG00000077684.15) in different tissues in GTEx. The gene expression values are shown in transcripts per million (TPM). The gene expression level was quantified by TPM based on the GENCODE 26 annotation, collapsed to a single transcript model for each gene using a custom isoform collapsing procedure.
Case-control gene expression analysis
Using the RNAseq data from AD (n = 86), PSP (n = 84), and normal controls (n = 80) collected by the Mayo Clinic Brain Bank and Banner Sun Health Research Institute, we found that the expression of C4orf33 was significantly downregulated compared with normal controls in AD (fold change = 0.75, p = 3.86E-10) and in PSP (fold change = 0.76, p = 3.01E-08). Interestingly, we further verified the downregulated expression of C4orf33 in AD (fold change = 0.86, p = 1.49E-23) using the gene expression data from 129 AD cases and 101 nondemented healthy controls collected by the Harvard Brain Tissue Resource Center.
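To aid interpretation of the fold changes reported above, the short Python sketch below converts a fold change into a percent change and a log2 fold change. It assumes that fold change is the ratio of mean expression in cases to that in controls on a linear scale; this definition is an assumption made for illustration, and the function name is hypothetical.

```python
import math

def describe_fold_change(fold_change):
    """Return percent change and log2 fold change for a case/control ratio."""
    return {
        "percent_change": (fold_change - 1.0) * 100.0,
        "log2_fold_change": math.log2(fold_change),
    }

# Fold changes for C4orf33 reported in the text.
ad_mayo = describe_fold_change(0.75)
psp_mayo = describe_fold_change(0.76)
ad_harvard = describe_fold_change(0.86)
```

Under this assumption, a fold change of 0.75 corresponds to expression about 25% lower in cases than in controls, and a fold change of 0.86 to expression about 14% lower.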
DISCUSSION
Recently, the first PART GWAS confirmed significant associations of AD and PSP genetic variants around SLC24A4, MS4A6A, HS3ST1, MAPT, and EIF2AK3 with the risk of PART [1]. This PART GWAS also highlighted a novel non-coding genetic variant, rs56405341, around C4orf33, SCLT1, and JADE1 [1]. Single-cell RNA-seq and immunohistochemistry demonstrated higher JADE1 expression in tangle-bearing neurons than in non-tangle-bearing neurons at both the mRNA and protein levels [1]. Experimental studies in Drosophila indicated a neuroprotective role of JADE1 in tauopathy [1]. However, eQTLs analysis showed only a modestly significant association between the rs56405341 variant and JADE1 mRNA expression (p = 0.038) in 452 dorsolateral prefrontal cortex samples, which prompted us to perform a comprehensive analysis of the rs56405341 variant and its target genes.
In stage 1, we conducted an eQTLs analysis to identify the target genes regulated by the rs56405341 variant using two eQTLs datasets from Brain xQTL Serve and PsychENCODE in dorsolateral prefrontal cortex. We found that the rs56405341 variant was significantly associated with C4orf33 mRNA expression, but not with JADE1 mRNA expression, in dorsolateral prefrontal cortex. In stage 2, we conducted an eQTLs analysis to verify the association of the rs56405341 variant with C4orf33 and JADE1 mRNA expression using 13 eQTLs datasets from 13 GTEx brain tissues [9]. We found that the rs56405341 variant was significantly associated with C4orf33 mRNA expression, but not with JADE1 mRNA expression, in multiple GTEx brain tissues. In stage 3, we performed a gene expression analysis of C4orf33 and JADE1 using mRNA expression datasets across 49 GTEx human tissues. We identified that C4orf33 was mainly expressed in cerebellar hemisphere and cerebellum, and that JADE1 was mainly expressed in thyroid and coronary artery. In stage 4, we conducted a differential mRNA expression analysis of C4orf33 in AD, PSP, and normal controls using two case-control gene expression datasets, one from the Mayo Clinic Brain Bank and Banner Sun Health Research Institute, and one from the Harvard Brain Tissue Resource Center. We found that C4orf33 expression was significantly downregulated in both AD and PSP compared with normal controls.
C4orf33, chromosome 4 open reading frame 33, is a gene without functional annotation, and its exact roles remain unclear. Interestingly, several studies have reported the involvement of C4orf33 in frontotemporal lobar degeneration [14, 15] and autism spectrum disorder [16]. Another open reading frame gene, C9orf72 (chromosome 9 open reading frame 72), has been widely reported to be associated with amyotrophic lateral sclerosis and frontotemporal dementia [17–20], as well as AD [21–23] and Parkinson’s disease [24, 25].
Our current study still has a limitation, despite the observed dysregulation of C4orf33 in both AD and PSP. Previous evidence showed no significant association between the rs56405341 variant and AD (p = 0.722) in a large-scale AD GWAS dataset including 21,982 cases and 41,944 cognitively normal controls [26]. Meanwhile, the association of the rs56405341 variant with PSP remains unclear, as rs56405341 is not available in the PSP GWAS dataset including 1,114 PSP cases and 3,247 controls [27]. These findings suggest that the effect of rs56405341 may be specific to the disease context. Future studies with large sample sizes are required to verify our current findings.
Taken together, we demonstrated that C4orf33, but not JADE1, is the target gene regulated by rs56405341, that it is mainly expressed in cerebellum, and that its expression is significantly downregulated in AD and PSP cerebellum. In addition to JADE1, we provide genetic evidence that C4orf33 may also be involved in PART, which may provide useful supplementary information for the PART GWAS findings [1].
ACKNOWLEDGMENTS
The results published here are in whole or in part based on data obtained from the AD Knowledge Portal (https://adknowledgeportal.org). The Mayo RNAseq study data was led by Dr. Nilüfer Ertekin-Taner, Mayo Clinic, Jacksonville, FL as part of the multi-PI U01 AG046139 (MPIs Golde, Ertekin-Taner, Younkin, Price). Samples were provided from the following sources: The Mayo Clinic Brain Bank. Data collection was supported through funding by NIA grants P50 AG016574, R01 AG032990, U01 AG046139, R01 AG018023, U01 AG006576, U01 AG006786, R01 AG025711, R01 AG017216, R01 AG003949, NINDS grant R01 NS080820, CurePSP Foundation, and support from Mayo Foundation. Study data includes samples collected through the Sun Health Research Institute Brain and Body Donation Program of Sun City, Arizona. The Brain and Body Donation Program is supported by the National Institute of Neurological Disorders and Stroke (U24 NS072026 National Brain and Tissue Resource for Parkinson’s Disease and Related Disorders), the National Institute on Aging (P30 AG19610 Arizona Alzheimer’s Disease Core Center), the Arizona Department of Health Services (contract 211002, Arizona Alzheimer’s Research Center), the Arizona Biomedical Research Commission (contracts 4001, 0011, 05-901 and 1001 to the Arizona Parkinson’s Disease Consortium) and the Michael J. Fox Foundation for Parkinson’s Research. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal on 2/22/2022.
FUNDING
This work was supported by funding from the National Natural Science Foundation of China (Grant No. 82071212, and 81901181), Beijing Natural Science Funds for Distinguished Young Scholar (Grant No. JQ21022), the Mathematical Tianyuan Fund of the National Natural Science Foundation of China (Grant No. 12026414), and Beijing Ten Thousand Talents Project (Grant No. 2020A15). This work was also partially supported by funding from the Science and Technology Beijing One Hundred Leading Talent Training Project (Z141107001514006), the Beijing Municipal Administration of Hospitals’ Mission Plan (SML20150802), the Funds of Academic Promotion Programme of Shandong First Medical University & Shandong Academy of Medical Sciences (No. 2019QL016, No. 2019PT007).
CONFLICT OF INTEREST
Guiyou Liu is an Editorial Board Member of this journal but was not involved in the peer-review process nor had access to any information regarding its peer-review. The other authors have no conflict of interest to report.
