Abstract
Background:
Both INPP5D and INPP5F are members of the INPP5 family. The INPP5F rs117896735 variant is associated with Parkinson’s disease (PD) risk, and INPP5D is an Alzheimer’s disease (AD) risk gene. However, the role of the INPP5F rs117896735 variant in AD remains unclear.
Objective:
We aim to investigate the roles of rs117896735 in AD.
Methods:
First, we conducted a candidate variant study to evaluate the association of the rs117896735 variant with AD risk using a large-scale AD GWAS dataset. Second, we conducted a gene expression analysis of INPP5F to investigate differences in INPP5F expression across human tissues using two large-scale gene expression datasets. Third, we conducted an expression quantitative trait loci analysis to evaluate whether the rs117896735 variant regulates the expression of INPP5F. Fourth, we explored the potential differential expression of INPP5F between AD cases and controls using multiple AD-control gene expression datasets from human brain tissues and whole blood.
Results:
We found that 1) the rs117896735 A allele was associated with an increased risk of AD (OR = 1.15, 95% CI 1.005–1.315, p = 0.042); 2) the rs117896735 A allele could increase INPP5F expression in multiple human tissues; 3) INPP5F expression differed across human tissues, especially brain tissues; 4) INPP5F showed significant expression dysregulation in AD compared with controls in human brain tissues.
Conclusion:
INTRODUCTION
It is known that the inactivating mutation (R258Q) in SYNJ1/PARK20 could cause Parkinson’s disease (PD) [1–4]. In 2014, a large-scale meta-analysis of genome-wide association studies (GWAS) using 19,081 cases and 100,833 controls identified SAC2/INPP5F rs117896735 variant to be significantly associated with PD with p = 4.34E-13 and odds ratio (OR) = 1.624 for A allele [5]. Both SYNJ1 and INPP5F encode Sac domain-containing proteins [1]. In 2020, Cao and colleagues used mouse genetics to verify the hypothesis that Synj1 and Sac2 may have some overlapping functions [1]. Interestingly, Cao and colleagues identified a synthetic effect of the Synj1 and Sac2 in mice, which further supported the role of INPP5F in PD risk [1].
Both INPP5D and INPP5F are members of the inositol polyphosphate-5-phosphatase (INPP5) family [6, 7]. INPP5D encodes the SH2-containing inositol 5-phosphatase 1 (SHIP1), affects multiple signaling pathways such as PI3K/AKT signaling, and functions as a negative regulator of myeloid cell proliferation and survival [8–10]. INPP5F is an inositol 4-phosphatase that functions in the endocytic pathway [11]. It regulates cardiac hypertrophic responsiveness, inhibits STAT3 activity, and suppresses glioma tumorigenicity [12, 13].
To date, INPP5D has been widely reported to be an Alzheimer’s disease (AD) risk gene [6, 14–18]. Evidence shows that the expression of INPP5D increases as AD progresses, mainly in plaque-associated microglia in the 5xFAD mouse model [6]. Importantly, inhibition of INPP5D expression reduces amyloid pathology [6]. However, there are still no publicly available studies evaluating the association of the INPP5F rs117896735 variant with AD or the corresponding pathological processes. Meanwhile, rs117896735 is an intronic, non-coding variant of INPP5F [1]. Cao and colleagues concluded that INPP5F was responsible for the PD risk variant rs117896735 [1]. However, they did not evaluate the direct association between the rs117896735 variant and INPP5F [1]. Non-coding genetic variants have been reported to regulate the expression of nearby genes [19–23]. However, it remains unclear whether the rs117896735 variant regulates the expression of INPP5F. Here, we conducted a comprehensive analysis of the INPP5F rs117896735 variant.
MATERIALS AND METHODS
Study design
In stage 1, we conducted a candidate variant study to evaluate the association of the rs117896735 variant with AD risk using a large-scale AD GWAS dataset. In stage 2, we conducted a gene expression analysis of INPP5F to investigate differences in INPP5F expression across human tissues, especially brain tissues, using two large-scale gene expression datasets. In stage 3, we conducted an expression quantitative trait loci (eQTLs) analysis to evaluate whether the rs117896735 variant regulates the expression of INPP5F. In stage 4, we explored the potential differential expression of INPP5F between AD cases and controls using multiple AD-control gene expression resources from human brain tissues and whole blood.
AD GWAS datasets
In order to evaluate the association of rs117896735 variant with AD risk, we selected a large-scale AD GWAS dataset from the International Genomics of Alzheimer’s Project (IGAP) [14]. The IGAP stage 1 consisted of 21,982 AD and 41,944 cognitively normal controls of European descent [14]. These individuals are from four consortia including Alzheimer Disease Genetics Consortium (ADGC), Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium (CHARGE), The European Alzheimer’s Disease Initiative (EADI), and Genetic and Environmental Risk in AD/Defining Genetic, Polygenic and Environmental Risk for Alzheimer’s Disease Consortium (GERAD/PERADES) [14]. AD patients are diagnosed using the NINCDS-ADRDA criteria or DSM-IV guidelines [14]. More detailed information has been described in the original study and the recent studies [24–26].
UKBEC datasets
To investigate potential differences in INPP5F expression across human brain tissues, we selected the gene expression datasets from the UK Brain Expression Consortium (UKBEC), which included 134 neuropathologically normal individuals of European descent with a mean age at death of 59 years; 26% of these donors were female [24, 27]. In UKBEC, gene expression datasets are available for 10 brain regions: cerebellar cortex, frontal cortex, hippocampus, medulla, occipital cortex, putamen, substantia nigra, temporal cortex, thalamus, and intralobular white matter [27]. More detailed information is described in the Braineac database [27].
GTEx datasets
To evaluate potential differences in INPP5F expression across human tissues (both brain and non-brain) and to conduct the eQTLs analysis, we selected both the genotype and gene expression datasets from GTEx (version 8, dbGaP Accession phs000424.v8.p2). GTEx consisted of 828 donors and 15,201 samples in 49 human tissues or cells (number of samples with genotype ≥70), including 13 brain tissues (amygdala, anterior cingulate cortex, caudate basal ganglia, cerebellar hemisphere, cerebellum, cortex, frontal cortex, hippocampus, hypothalamus, nucleus accumbens, putamen, spinal cord, and substantia nigra) [28, 29]. The selected donors in GTEx are of multiple descents including European (85.3%), African (12.3%), Asian (1.4%), and Hispanic or Latino (1.9%), with a mean age at death of 55 years; 33% of these donors were female [24]. About 99% of the donors are neuropathologically normal individuals, and 1% of the donors of these brain tissues died of neurological diseases (1.3% in age 20–39 and 1.2% in age 60–71) [29]. Recent studies have provided more detailed information about these datasets [19–21, 30–32].
AD-control gene expression datasets
Here, we explored the potential differential expression of INPP5F between AD cases and controls using five AD-control gene expression resources from human brain tissues and whole blood. The first is from the Harvard Brain Tissue Resource Center (HBTRC), including 129 late-onset AD patients and 101 nondemented healthy controls in three brain regions: cerebellum, dorsolateral prefrontal cortex, and visual cortex [33]. The second is from 20 National Alzheimer’s Coordinating Center (NACC) brain banks and the Miami Brain Bank, covering two brain tissues: frontal cortex and temporal cortex [34]. The third is from 7 well-established National Institute on Aging Alzheimer’s disease brain banks, in hippocampus [35]. The fourth is from the Medical Research Council (MRC) London Neurodegenerative Diseases Brain Bank, including 52 AD and 27 control human brains, in entorhinal cortex [36]. The fifth is from the AddNeuroMed cohort study, including 145 AD cases and 104 normal elderly controls, in whole blood [37].
Statistical analysis
Genetic association of rs117896735 with AD: The original genotype datasets in IGAP are not publicly available. Here, we used the IGAP GWAS summary results to evaluate the genetic association between rs117896735 and AD. In brief, logistic regression was used to test the association under an additive genotype model, adjusting for age, sex, and population substructure using principal components [14].
Gene expression analysis of INPP5F: In UKBEC, the expression profile was measured at the exon-specific level using Affymetrix Exon 1.0 ST Arrays [27]. Expression at the gene or transcript-specific level was a Winsorised mean over exon-specific levels [27]. In GTEx, Illumina TruSeq RNA sequencing and Affymetrix Human Gene 1.1 ST Expression Array were selected to measure the levels of gene expression, which were quantified as transcripts per million (TPM) based on the GENCODE 26 annotation [28]. Here, we selected the t-test or analysis of variance (ANOVA) to evaluate the expression of INPP5F in different human tissues. The threshold for statistical significance was p < 0.05.
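The Winsorised mean mentioned above can be illustrated with a minimal sketch. The clamping fraction, the function name, and the exon-level values below are hypothetical and chosen only for illustration; the source does not state the proportion used for the UKBEC data.

```python
def winsorised_mean(values, fraction=0.1):
    """Mean after clamping the lowest and highest `fraction` of values.

    `fraction` is a hypothetical default; the source does not state the
    proportion used for the UKBEC data.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * fraction)  # number of values clamped at each end
    low, high = ordered[k], ordered[n - k - 1]
    clamped = [min(max(v, low), high) for v in ordered]
    return sum(clamped) / n


# Hypothetical exon-level intensities for one gene in one sample
exon_levels = [7.9, 8.1, 8.0, 8.2, 8.3, 7.8, 8.1, 8.0, 8.2, 12.5]
gene_level = winsorised_mean(exon_levels, fraction=0.1)
```

In this toy input the single outlying exon value (12.5) is pulled back to the next-highest value before averaging, which is the reason a Winsorised mean is less sensitive to aberrant probes than a plain mean.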
eQTLs analysis of rs117896735 variant: We selected the additive model to encode the rs117896735 genotype dosages as GG = 0, AG = 1, and AA = 2, and conducted the eQTLs analysis using linear regression. We used the online GTEx eQTL Calculator with the linear regression method to evaluate the association between the rs117896735 variant and INPP5F expression [29]. Statistical significance was defined as p < 0.05/49 ≈ 1.00E-03 (Bonferroni correction for 49 tissues). A suggestive association was defined as p < 0.05.
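The additive coding described above can be sketched as a simple least-squares fit of expression on A-allele dosage. This is only an illustration of the model, not the GTEx eQTL Calculator itself: the function name and the toy data are hypothetical, and the sketch omits any covariates or expression normalisation that a real analysis would include.

```python
DOSAGE = {"GG": 0, "AG": 1, "AA": 2}  # copies of the A (effect) allele


def eqtl_slope(genotypes, expression):
    """Ordinary least-squares slope of expression on A-allele dosage."""
    x = [DOSAGE[g] for g in genotypes]
    n = len(x)
    if n != len(expression) or n < 2:
        raise ValueError("need matching inputs with at least two samples")
    mean_x = sum(x) / n
    mean_y = sum(expression) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, expression))
    return sxy / sxx


# Hypothetical toy data, not taken from GTEx
genotypes = ["GG", "GG", "GG", "AG", "AG", "AA"]
expression = [0.0, 0.2, -0.2, 0.4, 0.6, 1.0]
beta = eqtl_slope(genotypes, expression)
```

A positive slope means that each additional copy of the A allele is associated with higher expression, and a negative slope with lower expression.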
AD-control gene expression analysis of INPP5F: We explored the potential differential expression of INPP5F between AD cases and controls using GEO2R, which performs comparisons using the GEOquery and limma R packages from the Bioconductor project [38]. The threshold for statistical significance was p < 0.05, as defined in a recent study [39].
RESULTS
Genetic association between rs117896735 and AD
In the IGAP stage 1 dataset, the results revealed rs117896735 variant A allele to be significantly associated with the increased risk of AD with OR = 1.15, 95% CI 1.005–1.315, p = 0.042. In addition to AD, recent findings also indicated that the rs117896735 variant A allele was significantly associated with increased risk of PD with OR = 1.624 and p = 4.34E-13 [5].
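As a consistency check on the summary statistics above, the p-value can be approximated from the reported OR and 95% CI alone. The sketch below assumes a Wald-type interval on the log-odds scale and a normal approximation; the function name is hypothetical, and this is not the procedure used by IGAP.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log-odds scale (Wald interval).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return beta, se, z, p


beta, se, z, p = p_from_or_ci(1.15, 1.005, 1.315)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

Under these assumptions the recovered p-value is close to the reported p = 0.042, so the OR, the interval, and the p-value are mutually consistent.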
Gene expression analysis of INPP5F
Using the gene expression datasets from UKBEC, we found significantly different expression of INPP5F across the 10 brain tissues, with a maximum fold change of 2.1 and p = 3.60E-38 (fold change between MEDU and WHMT). A box plot of INPP5F expression across the 10 brain tissues is provided in Fig. 1. Gene expression analysis in GTEx shows that INPP5F is mainly expressed in brain tissues, including Cerebellar Hemisphere (TPM median = 117.7), Cerebellum (TPM median = 93.17), Frontal Cortex (BA9) (TPM median = 85.96), Nucleus accumbens (basal ganglia) (TPM median = 54.69), Cortex (TPM median = 51.48), Spinal cord (cervical c-1) (TPM median = 42.08), Anterior cingulate cortex (BA24) (TPM median = 41.96), Caudate (basal ganglia) (TPM median = 40.59), Putamen (basal ganglia) (TPM median = 33.43), Substantia nigra (TPM median = 27.87), and Amygdala (TPM median = 26.09). Meanwhile, the expression levels of INPP5F in brain tissues are significantly higher than in non-brain tissues (p < 0.05). The box plots of INPP5F expression in different tissues are provided in Fig. 2.

Fig. 1. The box plots for the expression of INPP5F in different tissues in UKBEC. CRBL, cerebellar cortex; FCTX, frontal cortex; HIPP, hippocampus; MEDU, medulla (specifically inferior olivary nucleus); OCTX, occipital cortex (specifically primary visual cortex); PUTM, putamen; SNIG, substantia nigra; TCTX, temporal cortex; THAL, thalamus; WHMT, intralobular white matter.

Fig. 2. The box plots for the expression of INPP5F in different tissues in GTEx. The gene expression values are shown in transcripts per million (TPM). The gene expression level was quantified by TPM based on the GENCODE 26 annotation, collapsed to a single transcript model for each gene using a custom isoform collapsing procedure [28].
eQTLs analysis of rs117896735 variant
The eQTLs analysis in GTEx showed that the rs117896735 variant A allele could significantly increase INPP5F expression in subcutaneous adipose (beta = 0.39, p = 7.80E-04) and nucleus accumbens (basal ganglia) (beta = 0.38, p = 5.50E-04). Meanwhile, the rs117896735 variant A allele could also increase the expression of INPP5F in cortex (p = 2.20E-02), frontal cortex (p = 2.50E-02), muscularis esophagus (p = 9.90E-03), and small intestine (p = 2.50E-02), but reduce the expression of INPP5F in spinal cord (p = 3.00E-03), as provided in Table 1.
Table 1. rs117896735 variant A allele and INPP5F expression in 49 human tissues
EA, effect allele; Beta is the regression coefficient based on the effect allele. Beta > 0 and Beta < 0 mean that the effect allele increases and reduces gene expression, respectively. The threshold of statistical significance for eQTLs analysis was p < 0.05/49 ≈ 1.00E-03.
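The two thresholds used for the eQTLs analysis can be applied to the p-values reported in the text with a short sketch. The function and variable names are hypothetical; the p-values are those quoted above, and the exact value of 0.05/49 is about 1.02E-03, which the text rounds to 1.00E-03.

```python
N_TISSUES = 49
ALPHA = 0.05
BONFERRONI = ALPHA / N_TISSUES  # about 1.02E-03; the text rounds to 1.00E-03

# p-values as reported in the text for the A allele of rs117896735
reported_p = {
    "subcutaneous adipose": 7.80e-04,
    "nucleus accumbens (basal ganglia)": 5.50e-04,
    "cortex": 2.20e-02,
    "frontal cortex": 2.50e-02,
    "muscularis esophagus": 9.90e-03,
    "small intestine": 2.50e-02,
    "spinal cord": 3.00e-03,
}


def classify(p, strict=BONFERRONI, suggestive=ALPHA):
    """Label a p-value using the corrected and the nominal thresholds."""
    if p < strict:
        return "significant"
    if p < suggestive:
        return "suggestive"
    return "not significant"


labels = {tissue: classify(p) for tissue, p in reported_p.items()}
```

With either the exact or the rounded threshold, only subcutaneous adipose and nucleus accumbens pass the corrected cut-off; the remaining listed tissues are suggestive.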
AD-control gene expression analysis of INPP5F
Using GEO2R, we identified significant expression dysregulation of INPP5F in AD compared with controls in dorsolateral prefrontal cortex, visual cortex, hippocampus, and entorhinal cortex (p < 0.05, Table 2). In brief, we found significantly increased expression of INPP5F in AD in dorsolateral prefrontal cortex (fold change = 1.12, p = 5.79E-17) and visual cortex (fold change = 1.15, p = 9.41E-17). However, reduced expression of INPP5F in AD was observed in hippocampus (fold change = 0.51, p = 7.52E-08) and entorhinal cortex (fold change = 0.83, p = 5.19E-05), as provided in Table 2.
Table 2. Gene expression analysis of INPP5F in human brain tissues and blood
FC, fold change based on AD versus control. The significance level is defined to be p < 0.05.
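The limma package underlying GEO2R reports differential expression as a log2 fold change (logFC). The sketch below shows the conversion between that scale and a linear fold change, assuming the fold changes reported in the text are linear AD-to-control ratios; the source does not state how they were derived, and the function names are hypothetical.

```python
import math


def fold_change_from_log2(log_fc):
    """Convert a log2 fold change (AD versus control) to a linear ratio."""
    return 2.0 ** log_fc


def log2_from_fold_change(fold_change):
    """Inverse conversion; requires a strictly positive ratio."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(fold_change)


# Fold changes reported in the text (AD versus control),
# assumed here to be linear ratios
reported_fc = {
    "dorsolateral prefrontal cortex": 1.12,
    "visual cortex": 1.15,
    "hippocampus": 0.51,
    "entorhinal cortex": 0.83,
}
direction = {
    region: "increased in AD" if log2_from_fold_change(fc) > 0 else "reduced in AD"
    for region, fc in reported_fc.items()
}
```

On the linear scale a value above 1 indicates higher expression in AD and a value below 1 indicates lower expression; on the log2 scale the same distinction is the sign of logFC.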
DISCUSSION
Large-scale GWAS have identified the INPP5F rs117896735 variant A allele to be significantly associated with increased risk of PD [5]. Both INPP5D and INPP5F are members of the INPP5 family [6, 7]. Importantly, INPP5D has been widely reported to be associated with AD [6, 14–18]. However, the association of rs117896735 with AD risk and with the expression of INPP5F has remained unclear, as rs117896735 is an intronic, non-coding variant of INPP5F. Here, we analyzed the rs117896735 variant and the INPP5F gene comprehensively. In stage 1, genetic association analysis showed that the rs117896735 variant A allele was also associated with an increased risk of AD [5]. In stage 2, gene expression analysis using the datasets from UKBEC and GTEx showed differences in INPP5F expression across human tissues, especially brain tissues. In stage 3, we found that the rs117896735 variant A allele significantly increased INPP5F expression in multiple human tissues. In stage 4, we found significant expression dysregulation of INPP5F in AD compared with controls, including increased expression in dorsolateral prefrontal cortex and visual cortex, and reduced expression in hippocampus and entorhinal cortex.
The roles of INPP5D in AD have already been reported. A large-scale meta-analysis of AD GWAS datasets using 74,046 individuals identified the INPP5D rs35349669 variant to be significantly associated with the risk of AD (p = 3.20E-08) [16]. Tsai and colleagues recently conducted a gene expression analysis to evaluate INPP5D expression in AD and its association with amyloid plaque density and microglial markers using the RNA-Seq data from the Accelerating Medicines Partnership for Alzheimer’s Disease (AMP-AD) Consortium [6]. They found significantly increased expression of INPP5D in AD, and a positive association between the expression of INPP5D and amyloid plaque density [6]. Using 5xFAD mice, they found that the expression of Inpp5d increased with disease progression, selectively in plaque-associated microglia [6].
Compared with INPP5D, the roles of INPP5F in neurological diseases, especially in AD, are largely unknown. Some studies have suggested potential roles of INPP5F in AD, although its exact roles remain unclear. Hu and colleagues found that INPP5F DNA methylation (cg17214023) was associated with AD and type 2 diabetes [40]. Zhang and colleagues conducted a whole exome sequencing analysis and found the INPP5F rs3736822 variant to be suggestively associated with AD in Han Chinese (p = 5.6E-03) [41]. Meanwhile, evidence shows that INPP5F inhibits STAT3 activity and suppresses glioma tumorigenicity [13]. Inpp5f-v3, a transcriptional variant of Inpp5f, is specifically expressed in mouse brain, and may play a role in the development of mouse brain [42]. Therefore, replication studies are required to investigate the exact roles of INPP5F in AD.
It is noted that we only selected the clinically diagnosed AD GWAS dataset from IGAP [14]. In fact, there are also some large-scale datasets from GWAS for family history of AD, known as GWAS by disease proxy phenotype (GWAX) via self-report using the UK Biobank participants [43–49]. Our recent MR study highlighted significant difference and genetic heterogeneity in clinically diagnosed AD GWAS and self-report proxy phenotype GWAX [25, 51]. Meanwhile, only 37.5% (15) of the 40 identified AD susceptibility loci could be replicated across the AD GWAS, GWAX, and GWAS+GWAX datasets at the genome-wide significance p < 5.00E-08 [52]. Therefore, we did not select these large-scale AD GWAX or GWAS+GWAX datasets in our genetic association study.
Meanwhile, our study may have some potential limitations. First, we only conducted the eQTLs analysis of the rs117896735 variant. We think that a methylation quantitative trait loci (mQTLs) analysis may also be helpful. However, an mQTLs analysis of the rs117896735 variant in 543 human brain cortex tissues from ROSMAP and 420,103 CpG sites reported negative findings [53]. Second, we only obtained the GWAS summary results from the IGAP to evaluate the genetic association between rs117896735 and AD. Therefore, we could not provide the corresponding statistical chart, as the original genotype datasets are not publicly available. Third, both INPP5D and INPP5F are members of the INPP5 family [6, 7]. We considered that there may be an interaction between INPP5D and INPP5F. However, protein-protein interaction network analysis using the STRING database indicates no interaction between INPP5D and INPP5F [54]. Therefore, the roles of INPP5 in the pathology of AD should be further evaluated. Fourth, we did not perform the eQTLs analysis using the UKBEC datasets, as the rs117896735 variant and its proxies are rare variants and not available in UKBEC [27]. Therefore, replication studies are required to confirm our findings. Fifth, our conclusions are based on findings from different datasets rather than from a prospective study. Therefore, a prospective study should be designed to verify our findings with all data collected in one study.
Conclusions
Taken together, we demonstrate that the PD risk variant rs117896735 could regulate INPP5F expression in human brain tissues and increase the risk of AD. These findings may provide important information about the role of rs117896735 in AD.
DATA AVAILABILITY
All relevant data are within the paper. The authors confirm that all data underlying the findings are either fully available without restriction through consortia websites or may be made available from consortia upon request. IGAP consortium data are available at https://web.pasteur-lille.fr/en/recherche/u744/igap/igap_download.php and https://www.niagads.org/datasets/ng00075. Braineac (UKBEC) data are available at https://www.braineac.org/.
ACKNOWLEDGMENTS
This work was supported by funding from the National Natural Science Foundation of China (Grant No. 82071212, 81901181, and 12026414), Beijing Natural Science Foundation (Grant No. JQ21022), and Beijing Ten Thousand Talents Project (Grant No. 2020A15). This work was also partially supported by funding from the Science and Technology Beijing One Hundred Leading Talent Training Project (Z141107001514006), the Beijing Municipal Administration of Hospitals’ Mission Plan (SML20150802), and the Funds of Academic Promotion Programme of Shandong First Medical University & Shandong Academy of Medical Sciences (No. 2019QL016, No. 2019PT007).
We thank the International Genomics of Alzheimer’s Project (IGAP) and UK Biobank for the GWAS summary statistics. We thank the Braineac, Mayo Clinic, and GTEx for the eQTLs dataset resources. We also thank Marioni et al. for the GWAS datasets about the family history of AD. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The i-Select chips were funded by the French National Foundation on AD and related disorders. EADI was supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2 and the Lille University Hospital. GERAD was supported by the Medical Research Council (Grant n° 503480), Alzheimer’s Research UK (Grant n° 503176), the Wellcome Trust (Grant n° 082604/2/07/Z) and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grant n° 01GI0102, 01GI0711, 01GI0420. CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01-AG-12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants: U01 AG032984, U24 AG021886, U01 AG016976, and the Alzheimer’s Association grant ADGC-10-196728.
