Abstract
Background:
Patients with Alzheimer’s disease (AD) have among the highest levels of comorbidity compared with persons with other diseases. However, it is unclear whether these conditions arise from shared pathophysiology attributable to the genetic pleiotropy of AD risk genes.
Objective:
To characterize the genetic pleiotropy of AD risk genes across a wide range of diseases.
Methods:
We estimated the polygenic risk score (PRS) for AD and tested the association between the PRS and 16 ICD10 main chapters, 136 ICD10 level-1 chapters, and 377 diseases with more than 1,000 cases each in 312,305 individuals without an AD diagnosis from the UK Biobank.
Results:
After correction for multiple testing, AD PRS was associated with two main ICD10 chapters: Chapter IV (endocrine, nutritional and metabolic diseases) and Chapter VII (eye and adnexa disorders). When narrowing the definition of the phenotypes, positive associations were observed between AD PRS and other types of dementia (OR = 1.39, 95% CI [1.34, 1.45], p = 1.96E-59) and other degenerative diseases of the nervous system (OR = 1.18, 95% CI [1.13, 1.24], p = 7.74E-10). In contrast, we detected negative associations between AD PRS and diabetes mellitus, obesity, chronic bronchitis, other retinal disorders, pancreas diseases, and cholecystitis without cholelithiasis (ORs range from 0.94 to 0.97, FDR < 0.05).
Conclusion:
Our study confirms several previously reported associations and identifies some novel ones, extending the knowledge of the genetic pleiotropy of AD across a range of diseases. Further mechanistic studies are necessary to elucidate the molecular mechanisms behind these associations.
INTRODUCTION
Alzheimer’s disease (AD) is an aging-related, debilitating neurological disorder featuring progressive neurodegeneration and deterioration of memory and cognitive function, characterized by the accumulation of extracellular amyloid-β (Aβ) deposits and of intracellular hyperphosphorylated tau in neurofibrillary tangles (NFTs) in the brain [1]. It has strong underlying genetic and environmental components [2], with heritability estimated at 58%-79% [3]. Studies have indicated that dementia patients have among the highest levels of co-occurring chronic disorders compared with persons with other conditions [4]. Moreover, several acquired comorbidities have been linked with an increased risk of developing AD, such as hypertension, obesity, and diabetes mellitus [5]. However, it is not clear whether these comorbidities and AD share pathophysiology owing to the genetic pleiotropy of AD risk genes. Given the huge public health burden, using a population without an AD diagnosis to identify the relationship between the genetic risk of AD and other disease conditions can improve our understanding of the genetic pleiotropy of AD risk genes, which may benefit the management and treatment of AD.
A phenome-wide association study (PheWAS) is a genotype-to-phenotype approach that identifies the shared genetic etiology of a range of diseases by testing the association of multiple phenotypes with one genetic locus [6]. However, conventional PheWASs are often limited by unsatisfactory power due to the small effect size of each included single-nucleotide polymorphism (SNP) [7]. A polygenic risk score (PRS), a summary score calculated by aggregating the risk carried by multiple genetic variants from large genome-wide association studies (GWAS), can improve statistical power [8]. The PRS approach, which combines the effects of multiple SNPs based on their effect sizes from GWAS [9], has the potential to identify an individual’s AD risk [10–12]. It has been shown to discriminate between AD cases and controls, achieving a prediction accuracy of 75-84% in clinical and population-based cohorts [13, 14]. In summary, a hypothesis-free PheWAS can identify multiple phenotypes associated with AD genetic risk and thus provides the opportunity to understand the pleiotropy of AD risk genes more fully.
Only one previous PheWAS has used the AD PRS to examine these associations, involving 30,118 individuals of different ancestries [15]; it revealed only that the AD PRS was related to AD, mild cognitive impairment, memory loss, dementia, and gout in individuals of European ancestry. Therefore, by expanding the sample size to over 300,000 UK Biobank individuals, our study aimed to perform a more comprehensive PheWAS of the AD PRS to investigate the genetic pleiotropy of AD risk genes on various health outcomes in individuals without an AD diagnosis. The overall analysis pipeline is shown in Fig. 1. Specifically, we first constructed the AD PRS based on the largest AD GWAS. Then, we investigated the associations between the AD PRS and a wide range of illnesses in the UK Biobank.

Flowchart of the study.
MATERIALS AND METHODS
Study participants
The UK Biobank comprises over 500,000 participants aged 40-69 years recruited from England, Wales, and Scotland between 2006 and 2010 [16]. Our analyses were restricted to 337,138 unrelated British individuals with PRS available (see methods below) [16]. Individuals who had been diagnosed with AD (N = 1,820) or lacked related records of dementia (N = 22,626) were excluded from the main analysis, yielding 312,305 individuals with all covariates available. These exclusions were determined from algorithmically-defined dementia outcomes (Category 47), first occurrences data (Category 1712), death register records (Field 40001, Field 40002), hospital inpatient data (Fields 41270-41271, 41280-41281), and primary care data (Field 42040). The UK Biobank has obtained ethical approval from the National Research Ethics Committee (11/NW/0382). Participants have provided informed consent for the UK Biobank to access their health-related information. This research was conducted under application number 19542.
Alzheimer’s disease-PRS
The PRS was constructed using GWAS summary data from the International Genomics of Alzheimer’s Project (IGAP) [17], which did not include any participants from the UK Biobank, as the discovery data. In this study, we downloaded the imputed SNP genotype data from the UK Biobank resource [16]. Participants with a low genotyping rate (<5%), those whose self-reported gender did not match their genetic data, those who were not white British, and those with too many relatives were removed. Similar to a previous study [18], we downloaded the sample quality control file ‘ukb_sqc_v2.txt’ provided by the UK Biobank and restricted our analysis to individuals used in computing the principal components (‘used_in_pca_calculation’ column) and white British individuals (‘in.white.British.ancestry.subset’ column), and excluded individuals with sex chromosome aneuploidy (‘putative.sex.chromosome.aneuploidy’ column), heterozygosity rate outliers (‘het.missing.outliers’ column), and individuals with more than ten putative third-degree relatives (‘excess.relatives’ column). We further removed variants with call rate < 0.95, MAF < 0.01, Hardy–Weinberg p value < 1e-6, or imputation quality score < 0.8. We used the classic clumping and thresholding method to generate PRS in PRSice2 [19], using 14 different p value thresholds to select variants: p < 5e-8, p < 1e-6, p < 5e-6, p < 1e-5, p < 5e-5, p < 1e-4, p < 5e-4, p < 0.001, p < 0.005, p < 0.01, p < 0.05, p < 0.1, p < 0.5, p < 1. To reduce the multiple testing burden and to fully utilize the information in the PRS from different thresholds, we performed principal component analysis on the set of 14 PRS [20, 21] and used the scaled first principal component (PC1) in the subsequent analyses. As the sign of the loadings for PC1 is arbitrary and the effect size in the base data was based on the reference allele, we flipped the direction of the first principal component to keep the same direction as other studies.
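The thresholding and principal component steps described above can be illustrated with a minimal sketch. It is not the PRSice2 implementation: it assumes the variants passed in have already been clumped, it uses only the Python standard library (power iteration stands in for a full PCA routine), and the function names and the rule for orienting PC1 (agreement with a caller-supplied reference score) are hypothetical choices made for illustration.

```python
import math
import statistics

# The 14 p value thresholds listed in the text.
THRESHOLDS = [5e-8, 1e-6, 5e-6, 1e-5, 5e-5, 1e-4, 5e-4,
              1e-3, 5e-3, 1e-2, 5e-2, 0.1, 0.5, 1.0]


def prs_at_threshold(dosages, betas, pvalues, threshold):
    """Per-individual sum of dosage * effect size over variants with p < threshold.

    Assumption: `dosages` holds only variants that already survived clumping.
    """
    return [
        sum(d * b for d, b, p in zip(row, betas, pvalues) if p < threshold)
        for row in dosages
    ]


def standardize(values):
    """Center to mean 0 and scale to (population) SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def first_principal_component(columns, reference, iterations=500):
    """Scaled PC1 of the score columns, oriented to agree with `reference`."""
    # Drop constant columns (e.g. a threshold that selects no variants).
    z = [standardize(c) for c in columns if statistics.pstdev(c) > 0]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the leading eigenvector (PC1 loadings).
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        v = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    pc1 = standardize([sum(w[i] * z[i][r] for i in range(k)) for r in range(n)])
    # The sign of PC1 is arbitrary; flip it if it disagrees with the reference.
    ref_mean = statistics.fmean(reference)
    if sum(p * (r - ref_mean) for p, r in zip(pc1, reference)) < 0:
        pc1 = [-p for p in pc1]
    return pc1
```

In this sketch one would build one score column per entry of `THRESHOLDS` and pass, for example, the most inclusive score as `reference`, so that a higher PC1 corresponds to a higher genetic risk. In practice a numerical library would replace the hand-written power iteration.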
Assessment of disease phenotypes
Information on disease outcomes was ascertained from first occurrences (Category 1712) [22] and records from the cancer register (Category 100092) [23]. Records in first occurrences are generated by mapping primary care data (Category 3000), hospital inpatient data (Category 2000), death register records (Field 40001, Field 40002), and self-reported medical conditions (Field 20002) reported at baseline or follow-up to 3-character ICD-10 codes [22]. The cancer register links to national cancer registries and contains records from separate regional cancer centers around the UK. Altogether, we obtained around 1,200 3-character ICD-10 codes. On the one hand, these 3-character ICD-10 codes, covering ICD10 chapters I-XVII (except for chapter XVI, due to the low prevalence of cases), were classified into 16 main chapters. Similarly, the 3-character ICD-10 codes were classified based on level-1 of the ICD-10 tree [24] (for example, A00-A09 Intestinal infectious diseases); participants who had at least one diagnosis within a level-1 classification were defined as cases for that classification. A total of 173 categories were classified according to level-1 of the ICD-10 tree (except for chapter XVI). Focusing on common diseases, we restricted our analysis to those with more than 1,000 cases, resulting in a total of 136 categories. On the other hand, we converted these codes into phecodes, which are considered to be closely aligned with diseases commonly used in clinical practice [25]. For each phecode, participants were defined as cases when they had at least one ICD10 code mapped to that phecode, whilst those without the same phecode were considered controls. In total, 729 phecodes were mapped in the current study. Considering the low prevalence of some diseases, we excluded phecodes with fewer than 1,000 cases, leaving 377 phecodes in the main analysis.
The distribution of the number of cases for each mapped phecode is shown in Supplementary Figure 1.
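The case definition described above (a participant is a case for a group if at least one of their ICD-10 codes maps to it, and only groups with enough cases are kept) can be sketched as follows. The function names, the toy participant data, and the toy code-to-group mapping are hypothetical; the real analysis would use the full level-1 ICD-10 tree or the phecode map.

```python
from collections import defaultdict


def build_case_sets(diagnoses, code_to_group):
    """Map each group (level-1 block or phecode) to the set of case IDs.

    diagnoses:     dict of participant ID -> set of 3-character ICD-10 codes
    code_to_group: dict of ICD-10 code -> group label (toy mapping here)
    """
    cases = defaultdict(set)
    for participant, codes in diagnoses.items():
        for code in codes:
            group = code_to_group.get(code)
            if group is not None:
                # One mapped diagnosis is enough to count as a case.
                cases[group].add(participant)
    return dict(cases)


def filter_by_case_count(cases, min_cases):
    """Keep only groups with more than `min_cases` cases."""
    return {g: ids for g, ids in cases.items() if len(ids) > min_cases}


def case_control_vector(cases, group, participants):
    """1 for cases of `group`, 0 for everyone else (treated as controls)."""
    return [1 if p in cases[group] else 0 for p in participants]


# Toy example: four participants, three level-1 blocks.
diagnoses = {1: {"E10", "E66"}, 2: {"E11"}, 3: {"A00"}, 4: set()}
code_to_group = {"E10": "E10-E14", "E11": "E10-E14",
                 "E66": "E65-E68", "A00": "A00-A09"}
cases = build_case_sets(diagnoses, code_to_group)
kept = filter_by_case_count(cases, min_cases=1)  # the study used 1,000
outcome = case_control_vector(kept, "E10-E14", [1, 2, 3, 4])
```

With the study's setting, `min_cases` would be 1,000 for the main analysis and 200 for the analysis of less common diseases.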
Statistical analyses
We used R, version 4.0.0, to perform the main analysis. First, to assess the overall association between the AD PRS and disease risk, we fitted a logistic regression of each of the 16 ICD10 main chapters against the AD PRS using the R package “drgee” (version 1.1.10), adjusting for birth year, age, sex, region, and the top 10 genetic principal components. Second, logistic regression was also used for the 136 disease classifications based on level-1 of the ICD-10 tree, adjusting for the same covariates. Third, we conducted a PheWAS of each phecode against the AD PRS using logistic regression in the R package “PheWAS” (version 0.99.5.5), adjusting for the same covariates. In addition, to investigate the association between the AD PRS and diseases with a low incidence, we also retained phecodes with more than 200 cases and performed a PheWAS. To investigate the effect of APOE, we calculated the PRS without the APOE region and performed a PheWAS. A false discovery rate (FDR) corrected p value was used to control for multiple testing [26].
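The multiple-testing step can be sketched as below. The text does not name the specific FDR procedure, so this example assumes the widely used Benjamini-Hochberg step-up method; the function name is hypothetical, and the adjusted values follow the usual definition (each p value multiplied by m/rank, then made monotone from the largest rank downward).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumption: the FDR correction in the text is the BH procedure.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example with four tests; an association is called significant
# when its adjusted p value is below 0.05.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [q < 0.05 for q in fdr]
```

Applied to one analysis level at a time (16 chapters, 136 level-1 classifications, or 377 phecodes), `m` is the number of phenotypes tested at that level.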
RESULTS
The demographic and clinical characteristics of the study population are shown in Table 1. Among the 312,305 participants included, 54% were female. The average attained age of the participants was 57 years. Most participants self-reported a good health status (excellent: 16.0%; good: 58.3%).
Descriptive characteristics of the study population
Associations of AD PRS with ICD10 main chapters
The estimated odds ratios (ORs) and corresponding 95% confidence intervals (95% CIs) for the 16 ICD10 main chapters per 1-SD increase in AD PRS are shown in Table 2. Because of the low prevalence of diagnoses in Chapter XVI, we did not test the association between Chapter XVI and AD PRS. Two ICD10 main chapters were statistically significant after FDR correction but with small effect sizes, which may be due to opposing effects on different diseases within the same ICD10 chapter. The most significant association was with Chapter IV (Endocrine, nutritional and metabolic diseases) (OR = 1.008, 95% CI [1.006, 1.010], p < 0.001). We also found a negative association between Chapter VII (Eye and adnexa disorders) and AD PRS (OR = 0.998, 95% CI [0.996, 0.999], p = 0.022) after multiple testing correction.
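For readers less familiar with how an OR per 1-SD increase and its 95% CI relate to the underlying logistic regression coefficient, the arithmetic can be sketched as follows. This is a generic Wald-type calculation, not the output of the “drgee” or “PheWAS” packages; the function names are hypothetical, and the example input is the dementia estimate reported in this paper (OR 1.39, 95% CI [1.34, 1.45]), used only to illustrate the round trip.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def odds_ratio_ci(beta, se, z=Z_95):
    """OR and Wald CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Recover the log-odds coefficient and SE from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def two_sided_p(beta, se):
    """Two-sided p value of the Wald z statistic under a normal null."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Illustration with a reported estimate (rounded values, so the recovered
# interval is only approximately the published one).
beta, se = beta_se_from_ci(1.39, 1.34, 1.45)
or_, lo, hi = odds_ratio_ci(beta, se)
p = two_sided_p(beta, se)
```

Because the PRS is scaled to unit variance, `beta` is the change in log odds per 1-SD increase in the score, and an OR very close to 1 (as for the main chapters) corresponds to a coefficient very close to 0.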
Association of main ICD10 Chapters with Alzheimer’s disease PRS
OR, odds ratio; CI, confidence interval. Models were adjusted for birth year, age, sex, region, and the top 10 genetic principal components. FDR-corrected p values are reported.
Associations of AD PRS with ICD10 level-1 chapters
To further investigate the association between AD PRS and disease outcomes, we performed logistic regression of the 136 disease classifications with more than 1,000 cases, based on level-1 of the ICD-10 tree, against the scaled AD PRS. Figure 2 shows the significant associations after FDR correction; the full results can be found in Supplementary Table 1.

Associations of AD PRS with ICD10 level-1 chapters. Significant associations after FDR correction are shown with ORs and corresponding 95% CIs.
Ten phenotypes were significantly associated with the scaled AD PRS after FDR correction. Consistent with the primary results for the ICD10 main chapters, the most significant association was a positive association between metabolic disorders and AD PRS (OR 1.08, 95% CI [1.07, 1.09], p = 1.28E-71), followed by organic, including symptomatic, mental disorders (OR 1.22, 95% CI [1.19, 1.25], p = 4.39E-49); we also found a negative association between disorders of choroid and retina and AD PRS (OR 0.97, 95% CI [0.96, 0.99], p = 6.29E-3). Another two phenotypes, diabetes mellitus (OR 0.97, 95% CI [0.96, 0.98], p = 1.68E-5) and obesity and other hyperalimentation (OR 0.96, 95% CI [0.95, 0.97], p = 7.83E-11), were negatively associated with AD PRS. Beyond the phenotypes belonging to the ICD10 main chapters identified in the first step, we also found a positive association of AD PRS with other degenerative diseases of the nervous system (OR 1.18, 95% CI [1.13, 1.24], p = 7.74E-10) and a negative association with polyneuropathies and other disorders of the peripheral nervous system (OR 0.93, 95% CI [0.91, 0.96], p = 1.59E-4). Furthermore, AD PRS was negatively associated with chronic lower respiratory diseases (OR 0.99, 95% CI [0.98, 0.99], p = 1.65E-2). Two additional phenotypes, disorders of gallbladder, biliary tract, and pancreas (OR 0.98, 95% CI [0.97, 0.99], p = 1.77E-2) and infections of the skin and subcutaneous tissue (OR 0.98, 95% CI [0.97, 0.99], p = 1.77E-2), were significantly inversely associated with AD PRS. Finally, as obesity and diabetes mellitus often co-occur, we merged them into a single group and tested its association with AD PRS. AD PRS was negatively associated with the merged obesity and diabetes mellitus group, with a lower p value after merging (OR 0.97, 95% CI [0.96, 0.98], p = 2.53E-12) (Supplementary Table 1).
PheWAS of AD PRS
A PheWAS of the 377 phecodes with more than 1,000 cases against a 1-SD change in AD PRS was performed to characterize the relationship between AD PRS and medical conditions more precisely (Fig. 3). After FDR correction, a total of 13 phenotypes were associated with AD PRS, including 2 mental disorder, 4 endocrine/metabolic, 3 respiratory, 3 digestive, and 1 sense organ phenotypes.

Manhattan plot for the AD PRS phenome-wide association study. Phenotypes with more than 1,000 cases were classified into 17 categories, and their corresponding p values from the logistic regression are shown. The upward triangle represents a positive association, and the downward triangle represents a negative association. The red line denotes the FDR-corrected p value threshold and the blue line denotes the nominal p threshold.
Corresponding to the results found in the second step, 13 of 13 phenotypes were replicated as being associated with AD PRS. As expected, AD PRS was positively associated with dementias (OR 1.39, 95% CI [1.34, 1.45], p = 1.96E-59) and with delirium dementia and amnestic and other cognitive disorders (OR 1.35, 95% CI [1.30, 1.40], p = 1.77E-55). Apart from the two mental disorders, obesity yielded the most significant association with the AD PRS (OR 0.96, 95% CI [0.95, 0.97], p = 1.72E-10), followed by overweight, obesity and other hyperalimentation; diabetes mellitus; obstructive chronic bronchitis; type 2 diabetes; chronic bronchitis; chronic airway obstruction; diseases of pancreas; acute pancreatitis; other retinal disorders; and cholecystitis without cholelithiasis. The complete results for the 377 phecodes can be found in Supplementary Table 2.
When we included phecodes with more than 200 cases in the PheWAS, 577 phecodes remained, and 4 additional phenotypes were identified after FDR correction, all of them mental disorder phenotypes with fewer than 1,000 cases (Supplementary Figure 2). Specifically, the 4 newly identified phenotypes (vascular dementia, other specified nonpsychotic and/or transient mental disorders, specific nonpsychotic mental disorders due to brain damage, and paranoid disorders) were all positively associated with AD PRS (ORs from 1.14 to 1.45). The complete results for the 577 phecodes can be found in Supplementary Table 3. Since the effect of APOE is substantially larger than that of other common associated variants, we recalculated the AD PRS excluding the APOE region and performed a PheWAS in diseases with more than 1,000 cases (Supplementary Table 4, Supplementary Figure 3). Only 5 of the associations detected in the above PheWAS remained significant, and these had flipped OR values, indicating an important contribution of the APOE region to disease risk in individuals without AD.
DISCUSSION
In this study, we investigated the association between AD PRS and a range of diseases, from broad to narrow definitions, in 312,305 individuals without an AD diagnosis in the UK Biobank. We provided evidence that AD PRS was significantly associated with metabolic disorders, organic, including symptomatic, mental disorders, and other degenerative diseases of the nervous system. Unexpectedly, we also found that individuals with a higher AD PRS were less likely to suffer from several diseases, including diabetes mellitus (type 2 diabetes); obesity; overweight, obesity and other hyperalimentation; obstructive chronic bronchitis; chronic bronchitis; chronic airway obstruction; other retinal disorders; diseases of pancreas; acute pancreatitis; and cholecystitis without cholelithiasis. Our study provides new insight into the pleiotropy of AD risk genes and can help the clinical management of comorbidity in individuals at high risk of AD.
In the analysis of the 136 disease classifications based on level-1 of the ICD-10 tree against the scaled AD PRS, we provide evidence that AD risk genes are not specific to AD and also contribute to the occurrence of metabolic disorders, organic, including symptomatic, mental disorders, and other degenerative diseases of the nervous system. The positive association with metabolic disorders (mainly comprising patients with disorders of lipoprotein metabolism, disorders of mineral metabolism, and other disorders of fluid, electrolyte, and acid-base balance) was in accordance with previous studies [27–30]. In addition to the well-established role of APOE ɛ4 as a risk factor for AD and hypercholesterolemia [31, 32], ɛ4 allele carriers have also been found to have increased sodium, copper, and magnesium levels [29], which further supports our genetically based causal association between AD PRS and metabolic disorders. As for organic, including symptomatic, mental disorders (mainly comprising diagnoses of various dementias and delirium) and other degenerative diseases of the nervous system (mainly comprising diagnoses of AD and other degenerative diseases of the nervous system, not elsewhere classified), because we excluded participants diagnosed with AD before the analysis, the positive associations indicate shared pathogenic genes between AD and other degenerative diseases, which is in line with previous studies [33, 34]. Furthermore, a previous study reported that both infectious delirium and AD involve synapse pathology and loss of homeostatic microglial control [35], suggesting a potential target of the shared genes between AD and delirium.
We also found that a higher AD PRS has beneficial effects on some medical conditions, which is consistent with a previous study focusing on the effects of APOE on a range of diseases [36]. Indeed, improved fitness during fetal development, infancy, and youth has been found in young ɛ4 allele carriers relative to ɛ3 allele carriers [37]. In our study, we found an inverse association between AD PRS and obesity. Likewise, low weight has been found in the preclinical stage of autosomal dominant AD [38], and a lower BMI was found in ɛ4 carriers among children [39], which may be due to a global metabolic shift toward lipid oxidation and enhanced thermogenesis [40]. In addition, since severe obesity drives the risk for T2DM in adolescents and young adults [41], the negative association between AD PRS and diabetes mellitus may be mediated by the inverse association with obesity. As for the decreased rate of other retinal disorders, APOE ɛ4 and APOE ɛ2 have been reported to have effects in age-related macular degeneration (AMD) opposite to their effects in AD [42–44]. Proangiogenic effects of APOE ɛ2 and APOE ɛ3 and their role in pathogenic subretinal inflammation may partly explain this association [45, 46]. In summary, most of the inverse associations found in prior studies were based on APOE, one contributor to the total AD PRS, and our study extends the pleiotropic effects to AD risk genes as a whole.
The inverse association between AD PRS and bronchitis found in our study is novel. The relationship between AD and bronchitis is not clear, but growing evidence suggests that inflammation may be the bridge between them. C-reactive protein (CRP), a serum biomarker of acute and chronic inflammation [47], was found to be negatively associated with APOE ɛ4 in cognitively healthy individuals [48]. A recent work showed a positive association between genetically determined CRP and bronchitis and, conversely, a negative association with AD [49], supporting an inverse relationship between AD PRS and bronchitis. Whether inflammation mediates this association still needs further investigation.
Novel associations between AD PRS and diseases of pancreas, acute pancreatitis, and cholecystitis without cholelithiasis were also found in our study. As a protective factor for AD, APOE ɛ2 has been associated with type III hyperlipoproteinemia, which is linked with acute pancreatitis [50–52] and can partly explain the inverse relationship between AD PRS and acute pancreatitis. As for cholecystitis without cholelithiasis, a previous study reported a protective effect of APOE ɛ4 on cholecystitis and cholelithiasis [36], and the observed AD PRS-associated decrease in pancreatic disease risk may be explained by the reduction in cholecystitis risk. Future studies are needed to clarify the relationship between AD risk genes and cholecystitis without cholelithiasis.
The main strength of our study is the large sample size. Compared with a recent study using only about 30,000 individuals [15], using information from over ten times as many participants makes genetic effects across a range of diseases detectable. In addition, compared with a previous study that investigated the associations between different APOE genotypes and multiple medical conditions [36], focusing on the PRS integrates effects from multiple genetic variants and thus increases statistical power. Moreover, excluding individuals with a diagnosis of AD can provide evidence of the direct causal effects of AD PRS independent of the disease, as reverse causality is not possible. Furthermore, testing diseases from broad to narrow can reduce hypothesis-driven associations and facilitate the investigation of genetic pleiotropy.
There are several limitations to our study. First, although over 700 diseases were included in our analysis, including diseases with fewer than 200 cases, the limited number of cases restricted the power to detect the causal effect of AD PRS on rare diseases. Thus, to make the results generalizable and reliable, we mainly focused on associations with diseases with more than 1,000 cases. Second, as the PRS integrates effects from multiple genetic variants, the single genetic variants and variants in linkage disequilibrium need further investigation to elucidate the specific effects of AD risk genes. Third, as we only included White British participants in our study, further exploration is needed to test generalizability to other ancestries. Finally, biological experiments are necessary to elucidate the molecular mechanisms behind these associations.
Altogether, we confirmed several associations reported previously and found some novel results using a population without AD diagnosis, which greatly extended the pleiotropic effects of AD risk genes. Specifically, AD PRS was linked to an increased risk of other types of dementia and other degenerative diseases of the nervous system and a decreased risk of diabetes mellitus, obesity, chronic bronchitis, diseases of pancreas, other retinal disorders, and cholecystitis without cholelithiasis. Further mechanistic studies are needed to better understand the pleiotropy and underlying etiology of AD risk genes, which might benefit the treatment of AD.
AVAILABILITY OF DATA AND MATERIALS
The dataset supporting the conclusions of this article is available in the UK Biobank (https://biobank.ndph.ox.ac.uk/showcase/index.cgi) upon application.
ACKNOWLEDGMENTS
We thank the International Genomics of Alzheimer’s Project (IGAP) for providing summary results data for these analyses. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The i–Select chips was funded by the French National Foundation on Alzheimer’s disease and related disorders. EADI was supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2 and the Lille University Hospital. GERAD/PERADES was supported by the Medical Research Council (Grant n° 503480), Alzheimer’s Research UK (Grant n° 503176), the Wellcome Trust (Grant n° 082604/2/07/Z) and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grant n° 01GI0102, 01GI0711, 01GI0420. CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01–AG–12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants: U01 AG032984, U24 AG021886, U01 AG016976, and the Alzheimer’s Association grant ADGC–10–196728.
Funding sources: Science and Technology Innovation 2030 Major Projects no. 2022ZD0211600 (JTY); National Natural Science Foundation of China (no. 82071201 and no. 81971032 to JTY; no. 82071997 to WC); Shanghai Rising-Star Program (no. 21QA1408700 to WC); Shanghai Municipal Science and Technology Major Project (no.2018SHZDZX01 to JTY); Research Start-up Fund of Huashan Hospital (2022QD002 to JTY); Excellence 2025 Talent Cultivation Program at Fudan University (3030277001 to JTY); ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of Ministry of Education, Fudan University (to JTY).
