Abstract
Background
This project has investigated the role of the Bacillus Calmette-Guérin (BCG) vaccine as a potential treatment against Alzheimer's disease (AD) and related dementias (ADRD).
Objective
To further establish that BCG treatment results in lower risk of ADRD through novel machine learning methods and to analyze the heterogeneity of treatment effects.
Methods
This retrospective cohort study was conducted from May 28, 1987 to May 6, 2021, in patients who were 50 years or older and were diagnosed with non-muscle-invasive bladder cancer (NMIBC). Follow-up duration was 15 years. Machine learning algorithms using survival analysis and the random forest algorithm were the primary methods of data analysis.
Results
On average, NMIBC patients who received BCG treatment had a 6.9% (95% CI: 0.43%, 13.4%) lower risk of developing ADRD compared to those who did not. Heterogeneous treatment effects were also detected for those with a history of mental health disorders and for those with a history of respiratory diseases. Those with mental health disorders were at a 14.7% (95% CI: 0.6%, 28.9%) reduced risk of ADRD if they received BCG treatment compared to no BCG treatment. In contrast, those with respiratory diseases who received BCG treatment were at a 13.6% (95% CI: 1.1%, 26.1%) increased risk of ADRD compared to those with no BCG treatment.
Conclusions
Using novel analysis methods, BCG treatment was associated with a lower risk of ADRD, and heterogeneity of treatment effects was detected. This presents BCG as a potential low-cost method, with few side effects, to prevent ADRD.
Keywords
Introduction
Alzheimer's disease (AD) stands as one of the largest challenges in contemporary global healthcare. This progressive neurodegenerative disorder is characterized by debilitating cognitive decline, memory loss, and impaired daily functioning and affects healthcare systems worldwide.1–9 This underscores the pressing need for effective strategies to prevent and treat this devastating condition.1,10–14 AD is a specific type of dementia in which memory loss is an early and prominent symptom.15–18 It is also the most common type of dementia pathology, accounting for roughly two-thirds of dementia cases.19–22 For this study, the outcome of interest is Alzheimer's disease and related dementias (ADRD), which encompass AD as well as other pathologies that contribute to dementia.
The Bacillus Calmette-Guérin (BCG) vaccine, initially developed for tuberculosis prevention, has emerged as a promising candidate to mitigate ADRD risk.23–28 Owing to its action and positive effects on the body's immune system, BCG has also become an FDA-approved treatment for bladder cancer, particularly in patients with non-muscle-invasive bladder cancer (NMIBC).29–35 AD is characterized by several markers: the build-up of amyloid-beta plaques and tau protein tangles (the pathologic hallmarks of the disease), genetic risk factors such as the APOE4 gene, and neuroinflammation.16,36–38 This neuroinflammation, driven by myeloid cells, worsens as the disease progresses.39–42 Because of the BCG vaccine's recognized immunomodulatory properties and enhancement of the body's immune system, BCG has been suggested as a novel prevention strategy for ADRD.23,29,43
This research therefore builds upon the work of Weinberg et al. (2023) 23 to further investigate the average treatment effect of BCG treatment on the risk of ADRD in NMIBC patients, using novel and more advanced data analysis methods that incorporate machine learning (ML) and survival analysis with the random forest algorithm. It also analyzes the heterogeneous treatment effects of this relationship. 34 The overall goal is to determine whether certain subgroups of the population are at higher or lower risk for ADRD if they are treated with the BCG vaccine. To better understand heterogeneity, this study utilized electronic health records from the Mass General Brigham (MGB) healthcare system on the 6467 patients with NMIBC studied by Weinberg and colleagues. 23
Our first prespecified hypothesis for these analyses is that the overall treatment effect of BCG on the risk of ADRD will be in line with previous literature and show a decreased risk of ADRD for patients treated with BCG. 23 As for heterogeneity, we hypothesize that those who are at increased risk for ADRD (e.g., those with depression or other mental health disorders) will show a greater reduction in risk of ADRD through BCG treatment.
Methods
Study design
This study utilized retrospective cohort (observational) data to analyze the relationship between BCG treatment for bladder cancer and the overall risk of ADRD as well as the heterogeneous treatment effects of that association. We specifically build upon work done by Weinberg et al. (2023), 23 utilizing the same cohort, which found a reduced risk of ADRD in NMIBC patients who were treated with BCG relative to those who were not. The data used for this study were drawn from the Mass General Brigham Healthcare Research Patient Data Registry (RPDR), which is a warehouse of electronic health records (EHR) across this large healthcare system. The research team selected patients in the MGB network who were diagnosed with NMIBC. 23 The exposed (treatment) group comprised patients who received BCG for treatment of NMIBC, and the unexposed group comprised patients who did not. BCG is a common agent for the treatment of bladder cancer in addition to chemotherapy. 29 This analysis was cross-sectional but accounted for time-to-event and censoring through survival analysis using ML. Survival analysis was used because of the possibility of right-censoring due to death or loss to follow-up in the data.23,44 The primary outcome for this study was whether or not a person was diagnosed with ADRD. For this study, Alzheimer's disease is defined as an ADRD diagnosis reported in MGB's electronic health records.
Setting and participants
The inclusion period for the study cohort was from May 28, 1987 to May 6, 2021, and included people who were 50 years or older, had an NMIBC diagnosis, and underwent transurethral resection of bladder tumor (TURBT), one of the initial and primary treatments for bladder cancer. 23 The study included a 15-year follow-up of both those treated with BCG and those who were not. Furthermore, patients were excluded if their cancer developed into muscle-invasive bladder cancer within 4 months of the TURBT treatment or if they had a report of radical cystectomy pathology (indicative of muscle-invasive bladder cancer as opposed to NMIBC). 23 Finally, patients were also excluded if they had less than 1 year of follow-up (defined as less than one year after the initial diagnosis of NMIBC) or a history of ADRD development within a year. 23
Variables
Treatment variable: BCG treatment
This variable categorizes individuals into two groups: NMIBC patients who were treated with BCG and those who were not. It is the primary independent variable under investigation as the potential treatment that may influence ADRD risk.
Outcome variable: Alzheimer's disease or related dementias status
This binary outcome variable serves as the dependent variable, indicating whether an individual has been diagnosed with Alzheimer's disease and related dementias (ADRD). The outcome variable had been previously coded in the Weinberg et al. (2023) study. 23 That study identified this cohort and classified ADRD based on ICD-9 codes, ICD-10 codes, and drug codes for AD and related conditions. 23 They included the following codes from ICD-9: 290.X, 294.X, 331.X, and 780.93X and the following from ICD-10: G30.X and G31.X. 23 Lastly, the presence of a prescription for drugs prescribed almost exclusively for ADRD was also used to classify ADRD cases. These drugs included galantamine, donepezil, rivastigmine and memantine, and were used as an indication of an ADRD diagnosis even if the relevant code was absent. 23
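The outcome definition above can be illustrated with a short sketch. This is not the code used by Weinberg et al.; the function name `has_adrd`, the input format (lists of code strings written with a decimal point, and a list of drug names), and the use of simple prefix matching for the ".X" wildcard are all assumptions made for illustration.

```python
# Illustrative sketch only: flag ADRD from diagnosis codes or prescriptions.
# Code prefixes and drug names come from the text; everything else is hypothetical.
ICD9_PREFIXES = ("290", "294", "331", "780.93")
ICD10_PREFIXES = ("G30", "G31")
ADRD_DRUGS = {"galantamine", "donepezil", "rivastigmine", "memantine"}


def has_adrd(icd9_codes, icd10_codes, medications):
    """Return True if any ICD-9 code, ICD-10 code, or drug matches the definition."""
    if any(code.strip().startswith(ICD9_PREFIXES) for code in icd9_codes):
        return True
    if any(code.strip().upper().startswith(ICD10_PREFIXES) for code in icd10_codes):
        return True
    # A prescription counts even when no relevant diagnosis code is present.
    return any(drug.strip().lower() in ADRD_DRUGS for drug in medications)
```

For example, a hypothetical patient with only bladder cancer codes but a donepezil prescription would be classified as an ADRD case, mirroring the rule that the drugs indicate a diagnosis even if the relevant code is absent.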
Statistical methods
Two separate treatment effects models were used for this study. The first was designed to detect the average overall treatment effect of BCG on ADRD, using ML, incorporating survival analysis and the Random Forest algorithm. The second model was used to detect heterogeneous treatment effects, providing novel data on which parts of the population may be at higher or lower risk of ADRD if they were treated with BCG.
Average treatment effects model
Covariates included for average treatment effects model
These covariates included: Age, Sex (Male or Female), Race/Ethnicity (White or non-White), Education Level (Categorical: less than High School = 0, High School degree = 1, some college = 2, Bachelor's degree = 3, Master's degree = 4, Doctoral degree = 5), Depression (Yes or No), Diabetes (Yes or No), Hypertension (Yes or No), Tobacco Usage (Yes or No), grade of tumor malignancy (Low, Medium or High). These variables were included in the model as they are potential confounders due to their relationship with more severe levels of NMIBC therefore leading to BCG treatment and because they are risk factors for the development of ADRD.45,46 Including these covariates thereby increases accuracy of the model.
Data analysis
To determine the average treatment effects, this study utilized ML algorithms in R through the “grf” package. 44 The data were initially split into training and testing data, with 70% of the data used for training and 30% used for testing the model. Next, the model was trained using the “causal_survival_forest” algorithm. This algorithm allowed for the estimation of the conditional average treatment effects (CATE) based on the covariates included in the model, while incorporating time to event and censoring in the analysis as well as the Random Forest algorithm to predict onset of ADRD. Following the generation of the conditional average treatment effects, the model was run against the testing data. Finally, the average treatment effect (ATE) was generated using the “average_treatment_effect” function in R, which gave the overall effect as the risk of ADRD in those treated with BCG minus the risk of ADRD in those who were not: E[Y(1) – Y(0)]. 44 In summary, the average treatment effects model uses the causal survival forest ML model to estimate the overall average treatment effect of BCG treatment on the risk of ADRD.
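The two quantities named above can be written out explicitly. The notation here is introduced for illustration and is an assumption: X denotes the covariate vector, Y(1) and Y(0) denote the potential ADRD outcomes with and without BCG treatment, and τ(x) denotes the CATE at covariate value x.

```latex
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right],
\qquad
\mathrm{ATE} = \mathbb{E}\left[\, \tau(X) \,\right] = \mathbb{E}\left[\, Y(1) - Y(0) \,\right]
```

Under this sign convention, a negative ATE corresponds to a lower risk of ADRD among treated patients, and averaging the CATE over the covariate distribution recovers the ATE.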
Heterogeneous treatment effects model
Unsupervised clustering of diagnosis codes
In addition to the analysis of BCG treatment's impact on ADRD, our study included an unsupervised clustering analysis of ICD-9 codes to group similar diseases. Clustering was done because of the large number of unique codes (69,364 different disease codes) among all patients across all patient visits. Dimensionality reduction was therefore used to cluster the codes into patterns of disease, so that heterogeneous treatment effects could be assessed across clusters rather than across all individual codes.
Clustering methodology
We utilized the Louvain clustering method, combined with low-dimensional embedding, to systematically categorize diseases based on similarities. To cluster the codes, we used embeddings (numerical vectors) from Choi et al. (2016), which utilized neural language modeling and disease co-occurrence to learn embeddings from claims data. 47 The learned embeddings for the diagnosis codes from Choi et al. (2016) were then used to make the clusters such that patients with similar International Classification of Diseases, Ninth Revision (ICD-9) codes were close to each other. 47 ICD-9 is an official system of assigning codes to diagnoses and procedures related to hospitalization around the world. 48 We utilized the embeddings from Choi et al. (2016) because they learn from description and co-occurrence in patient charts. This enhances the ability to cluster similar diseases together and makes the clusters conceptually different from the ICD-9 groupings. 47 To implement this, we first extracted all the ICD-9 codes from the cohort. Codes with low prevalence (frequency count less than 10) were discarded to focus on more prevalent disease patterns.
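Two of the preparatory steps described above can be sketched in a few lines: discarding codes with a frequency count below 10, and comparing codes by the similarity of their embedding vectors. This is a minimal sketch, not the study's pipeline. Cosine similarity is an assumed choice of similarity measure (the text does not state one), the Louvain step itself is not shown, and the codes and two-dimensional vectors are toy values rather than the Choi et al. embeddings.

```python
import math
from collections import Counter


def filter_rare_codes(code_occurrences, min_count=10):
    """Keep codes that occur at least `min_count` times in the cohort."""
    counts = Counter(code_occurrences)
    return {code for code, n in counts.items() if n >= min_count}


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy data (hypothetical): one entry per recorded occurrence of a code.
occurrences = ["428.0"] * 12 + ["410.9"] * 10 + ["999.9"] * 3
kept = filter_rare_codes(occurrences)

# Toy 2-dimensional embeddings standing in for the learned vectors.
embeddings = {"428.0": [1.0, 0.0], "410.9": [0.8, 0.6]}
similarity = cosine_similarity(embeddings["428.0"], embeddings["410.9"])
```

In a fuller pipeline, pairwise similarities of this kind would typically define the edges of a graph on which a community detection method such as Louvain is run.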
Labeling and analysis
Post clustering, we labeled each cluster based on the most frequent disease in the cluster. This step was crucial in understanding the predominant disease categories within each cluster. After each disease was classified, the cluster was further filtered to incorporate only the codes that were relevant to the cluster label. For example, if a cluster was labeled “Heart Diseases” based on prevalence within the cluster, the codes that were not relevant to Heart Diseases were removed. We then joined these disease clusters to our primary dataset, which included BCG status and AD diagnosis, to explore ATE and heterogeneous treatment effects. All the clusters were then included in the heterogeneous treatment effect model as covariates to determine heterogeneity of treatment effects.
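The labeling and filtering steps can be sketched as follows. The helper names, the input dictionaries, and the idea of mapping each code to a broad disease category before counting are assumptions for illustration; the text does not specify how "most frequent disease" was computed.

```python
from collections import Counter


def label_clusters(code_to_cluster, code_to_category, code_counts):
    """Label each cluster with the category that accounts for the most occurrences."""
    tallies = {}
    for code, cluster in code_to_cluster.items():
        tally = tallies.setdefault(cluster, Counter())
        tally[code_to_category[code]] += code_counts[code]
    return {cluster: tally.most_common(1)[0][0] for cluster, tally in tallies.items()}


def filter_to_label(code_to_cluster, code_to_category, labels):
    """Drop codes whose category does not match the label of their cluster."""
    return {
        code: cluster
        for code, cluster in code_to_cluster.items()
        if code_to_category[code] == labels[cluster]
    }


# Toy data (hypothetical codes, categories, and counts).
code_to_cluster = {"428.0": 0, "410.9": 0, "250.00": 0, "300.02": 1}
code_to_category = {
    "428.0": "Heart Diseases",
    "410.9": "Heart Diseases",
    "250.00": "Endocrine",
    "300.02": "Mental Health Disorders",
}
code_counts = {"428.0": 40, "410.9": 25, "250.00": 30, "300.02": 15}

labels = label_clusters(code_to_cluster, code_to_category, code_counts)
filtered = filter_to_label(code_to_cluster, code_to_category, labels)
```

In this toy case cluster 0 is labeled "Heart Diseases", and the endocrine code assigned to it is removed, as in the example given in the text.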
Identifying heterogeneous treatment effects
To identify heterogeneous treatment effects, the same method of data analysis was used as for the average treatment effects model. Initially, the data were split into testing and training data, and the model was then trained using the “causal_survival_forest” algorithm. The causal survival forest algorithm allowed for the estimation of the conditional average treatment effects based on the covariates included in the model, while also factoring in time-to-event and censoring. In the CATE analysis, individual-level CATE values (‘CATEi’) were treated as the dependent variable (‘y’), while the covariates (‘Xi’), that is, the different disease clusters, were employed as independent variables. Linear models of the form CATEi ∼ beta_0 + Xi * beta were constructed using the “best_linear_projection” function for each covariate (disease cluster). 44 This was done to examine the potential influence of these disease clusters on ADRD risk within the context of BCG treatment, and it allowed for the detection of heterogeneous treatment effects across the various disease clusters. There were no missing data in the treatment or outcome columns of the dataset, and a significance level of 0.05 was used for all conclusions.
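The linear model CATEi ∼ beta_0 + Xi * beta can be illustrated for a single binary covariate with ordinary least squares. This is a simplified sketch, not the grf implementation: the grf function regresses doubly robust scores rather than raw CATE predictions and reports heteroskedasticity-robust standard errors, both of which are omitted here. The function name and the toy values are hypothetical.

```python
def simple_linear_projection(x, cate):
    """Ordinary least squares fit of cate ~ beta_0 + beta * x for one covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cate) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cate))
    beta = sxy / sxx
    beta_0 = mean_y - beta * mean_x
    return beta_0, beta


# Toy data (hypothetical): 1 = patient has a code in the cluster, 0 = does not.
in_cluster = [1, 1, 1, 0, 0, 0]
cate_hat = [-0.20, -0.18, -0.22, -0.05, -0.03, -0.07]
beta_0, beta = simple_linear_projection(in_cluster, cate_hat)
```

With a binary covariate, beta_0 is the mean CATE among patients outside the cluster and beta is the difference in mean CATE between patients inside and outside it, so a negative beta indicates a larger risk reduction for patients in that cluster.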
Class-based effects of heterogeneity
After detection of any heterogeneous treatment effects, further analysis was done to detect if there was a class-based or disease specific effect to the heterogeneity. To explore the disease specific effect within the clusters, the codes that were most frequent within the cluster were separated. Initially, frequency analysis was done to see which diseases were most common within the respective disease cluster. Then the top five most frequent codes were added as covariates to the treatment effects model to see if heterogeneous treatment effects were detected for any of the five specific diseases within the cluster.
Results
Participants and descriptive data
The cohort utilized for this study comes from a previous research study (Weinberg et al.) that investigated the overall treatment effects of BCG treatment on the risk of ADRD for NMIBC patients through the generation of hazard ratios for time to onset of AD and death. 23 Following the aforementioned inclusion criteria, 17,274 unique patients were identified with bladder cancer in the study timeframe (from 1987 to 2021). 23 However, 10,807 were excluded based on the specific requirement for NMIBC and for at least one year of follow-up from the initial pathology report of NMIBC. Additional criteria for exclusion from the study can be found in Figure 1. The final cohort was 6467 patients. Among these, 3388 patients received BCG treatment and 3079 did not (Table 1). 23

Figure 1. CONSORT diagram of the patients included in the cohort. 23
Table 1. Demographic and descriptive data of the cohort, based upon covariates included in the ATE model.
Outcome and analysis
Average treatment effects
The initial goal of the study was to analyze the average treatment effect of BCG treatment on the overall risk of ADRD, comparing those who had been treated with BCG to those who had not, using novel ML techniques. The ATE represents the estimated average reduction in the risk of ADRD associated with BCG treatment. The calculated ATE was: −0.069, 95% CI: −0.134, −0.0043 (SE 0.033). This result indicates that, on average, NMIBC patients who received BCG treatment had a 6.9% (95% CI: 0.43%, 13.4%) lower risk of developing ADRD compared to those who did not, accounting for demographic and clinical conditions.
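The interval can be checked directly from the reported point estimate and standard error. The sketch below assumes a normal-approximation (Wald) interval with a critical value of 1.96; the function name is hypothetical.

```python
def wald_interval(estimate, standard_error, z=1.96):
    """Normal-approximation confidence interval: estimate plus or minus z * SE."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


lower, upper = wald_interval(-0.069, 0.033)
print(f"{lower:.4f}, {upper:.4f}")  # prints -0.1337, -0.0043
```

On the risk scale these bounds correspond to a reduction of roughly 13.4% at one end and 0.43% at the other, which agrees with the percentages reported for the ATE.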
Heterogeneous treatment effects
Clustering diagnosis codes of bladder cancer cohort
In Figure 2, the UMAP algorithm has been used to project the clusters’ feature dimensions onto two dimensions (UMAP1 and UMAP2) for visualization. The clustering yielded 18 clusters: Supplemental Classification of Diseases, Cancer, Heart Diseases, Injury and Poisoning, Diseases of the Musculoskeletal System, Symptoms Signs and Ill-Defined Conditions, Diseases of the Nervous System, Diseases of the Digestive System, Diseases of the Genitourinary System, Endocrine Nutritional and Metabolic Diseases, Endocrine Nutritional and Metabolic Diseases, Diseases of the Respiratory System, Diseases of the Skin, Mental Health Disorders, Infectious and Parasitic Diseases, Diseases of the Blood, External Causes of Injury, Congenital Anomalies and Conditions Originating in the Perinatal Period.

Figure 2. Graphical representation of the clustering technique used. Codes that were not relevant to the major disease pattern were filtered out.
Presence of heterogeneity from ICD-9 coding classification of disease
As shown in Table 2, among patients with a mental health disorder, there was a 14.7% (95% CI: 0.6%, 28.9%) reduced risk of ADRD if they received BCG treatment compared to no BCG treatment (p = 0.047). However, among those with respiratory disease, BCG treatment was associated with a 13.6% (95% CI: 1.1%, 26.1%) increased risk of ADRD compared to no BCG treatment (p = 0.032).
Table 2. Evaluation of heterogeneous treatment effects utilizing generated clusters and relevant confounders that were not included in clusters as covariates. Significant values are bolded and asterisked.
Class-based effects of heterogeneity
To explore the disease specific effect within the mental health disorders cluster, the codes that were most frequent within the cluster were separated. Initially, frequency analysis was done to see which mental health disorders were most common within the mental health disorders column. It was found that the following diseases were most common: Delirium, Generalized Anxiety Disorder, Panic Disorder and Post-traumatic stress disorder (PTSD). This further analysis was aimed at looking at heterogeneity within the disease group. Upon analysis, it was found that no specific disease led to heterogeneous treatment effects within the disease group of mental health disorders. From this finding, it was concluded that there is a class-based heterogeneous treatment effect of mental health disorders influencing or determining the impact of BCG treatment on the risk of ADRD. Similarly, common diseases within the Respiratory Disease cluster were plotted and separately explored in the model. These diseases included: pulmonary collapse, pleural effusion, bronchitis, pharyngitis and emphysema. It was found that no disease specific heterogeneous treatment effect was noted within the respiratory disease cluster. We can again conclude that there is a class-based effect for the risk of ADRD in BCG treated individuals who have respiratory diseases.
Discussion
The results of this study are consistent with previous literature regarding the decrease in onset of ADRD for those who receive BCG treatment, while incorporating novel ML algorithms into the analysis. These findings further demonstrate the positive immunotherapeutic effects of the BCG vaccine to prevent disease. 35 The difference in risk between those who received BCG treatment and those who did not is similar to and consistent with the differences in risk found in the Weinberg study. 23 Differences in the values from the Weinberg study are likely due to the cumulative differences in risk investigated in this analysis, rather than the hazard ratios used in the previous study, and to the lack of stratification by age here. Also, this study applies ML to improve accuracy of estimates and generates values based on each individual's counterfactual outcome of whether or not they would have received BCG treatment. We, therefore, believe that these estimates give the most accurate measure of BCG's effect on risk for ADRD.
Additionally, this study provides novel findings about the heterogeneous treatment effects for subgroups based on comorbid diseases. Particularly, we found that those with mental health disorders display a class-based decreased risk of ADRD and those with respiratory diseases display a class-based increased risk for ADRD if they take BCG treatment among NMIBC patients.
This study has some limitations. First, this study utilizes a population of bladder cancer patients that is older than the general population with an average age of 70 years old. Therefore, this study may miss some of the early ADRD cases who were excluded from the study. This would lead to a moderate bias towards the null as we are missing some of the ADRD cases as outcomes that occur earlier in life. Second, ICD-9 and ICD-10 codes may underestimate the number of cases of ADRD detected leading to bias in the detection of the outcome.23,49 This may lead to an overall moderate bias towards the null as there is less incidence of the outcome, leading to a decreased ability to make significant findings. However, this limitation is not differential across exposure groups as all ADRD cases were detected using the same methods. Lastly, some patients may have taken the BCG vaccine, especially immigrants to the United States, outside of America during childhood. Thus, previous vaccination would not be accounted for. This would mean a greater number of people in the unexposed group who were actually exposed to the BCG vaccine, leading to a small bias towards the null as the exposed and unexposed groups would be more similar. Bladder cancer patients also may not be generalizable to the general population due to their alteration in cellular metabolism, functional changes in glycolysis and mitochondrial metabolism, tumor onset and overall increased risk of death from the disease as well as other age-related conditions. 50
This study has found that among patients with a mental health disorder there was a statistically significant reduced risk of ADRD if they received BCG treatment compared to no BCG treatment. Mental health disorders such as depression may increase the risk of ADRD or may be an early indicator of the future onset of ADRD. A study by Korczyn and Halperin (2009) found that those who have clinically diagnosed depression are at higher risk of dementia compared to those without depression. 51 BCG therefore may be decreasing the risk of ADRD for a subgroup of the population that is at higher risk for disease onset compared to those without a mental health disorder. Mental health disorders, such as depression, weaken the body's immune system. 52 Thus, the proven ability of BCG to enhance the body's immune system gives further protection from protein misfolding, such as in the tau and beta amyloid proteins, which are neuropathological hallmarks of AD. 16 Finally, those with mental health disorders, such as schizophrenia, depression and post-traumatic stress disorder, may also respond better to BCG treatment to reduce the risk of ADRD because of altered brain structure and functioning. 53
The BCG vaccine was originally intended to reduce the rates of respiratory diseases through vaccination against tuberculosis, a disease that mainly affects the lungs, the largest organ of the respiratory system. However, we find an increased risk of ADRD onset for NMIBC patients who have received BCG treatment and also have a history of respiratory diseases. Respiratory diseases, in fact, have an impact on brain health, which may be the reason for this association with ADRD, a disease which affects brain functioning. 54 Respiratory diseases can lead to low levels of oxygen supply to the brain, resulting in hypoxia which can lead to brain damage and impairment. 55 The respiratory system and central nervous system, including the brain, are also closely connected: disorders of the respiratory system will also affect brain functioning. 54 Such brain damage or change, resulting from respiratory diseases, may contribute to the increased number of ADRD cases among those exposed to the BCG treatment, owing to the effect of respiratory diseases on brain health. This suggests a potential biological mechanistic pathway from respiratory diseases, through brain damage or alterations, to increased ADRD onset. BCG vaccination is also intended to reduce rates of respiratory disease onset, primarily tuberculosis. 28 However, this subgroup of the population still had a history of respiratory diseases. Therefore, those with a history of respiratory diseases may not respond well to BCG treatment and may have decreased positive effects of the intended treatment. These effects likely include a lack of immune system enhancement to protect against ADRD for those with a history of respiratory diseases.
The results of this study carry significant implications for our understanding of BCG treatment as a potential means of reducing ADRD risk. The consistent reduction in AD risk associated with BCG treatment, observed across diverse demographic and patient profiles, highlights the potential of this agent to be repurposed as a preventive measure against ADRD. For the population of NMIBC patients studied, BCG treatment had been administered in the bladder to stimulate an immune response against bladder cancer cells. However, it is our belief that direct insertion into the brain, especially where neuroinflammation is occurring, may provide the greatest benefits, leading to even lower levels of risk of ADRD associated with BCG treatment. In addition, this study has also found differences within overall heterogeneous treatment effects for both those with mental health disorders and those with respiratory diseases. This shows which subpopulations may benefit most from taking the BCG vaccine as a protection from ADRD and deepens our understanding of what biological mechanisms may contribute to ADRD onset. BCG treatment therefore represents a potentially low-cost and simple way, with few side effects, in which we can decrease levels of ADRD at the population level. While these results are promising, it is still important to recognize that further research, including randomized controlled trials, is necessary to establish causality and to elucidate the mechanisms underlying the observed protective effect.
Footnotes
Acknowledgments
The authors would like to thank Harvard University and specifically the Department of Epidemiology for their support and help in this research project. We would also like to thank Massachusetts General Hospital, Department of Neurology, for providing the data that were utilized for this work.
Author contributions
Irfan Chaudhuri (Data curation; Formal analysis; Investigation; Methodology; Visualization; Writing – original draft; Writing – review & editing); Sudeshna Das (Conceptualization; Project administration; Supervision; Writing – review & editing).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
