Abstract
Background:
Epidemiological studies have reported positive associations between long-term exposure to particulate matter of 2.5 microns or less in diameter (PM2.5) and risk of Alzheimer’s disease and other clinical dementia. Many of these studies have analyzed data using Cox Proportional Hazards (PH) regression, which estimates a hazard ratio (HR) for the treatment (in this case, exposure) effect on the time-to-event outcome while adjusting for influential covariates. PM2.5 levels vary over time. As air quality standards for PM2.5 have become more stringent over time, average outdoor PM2.5 levels have decreased substantially.
Objective:
Investigate whether a Cox PH analysis that does not properly account for exposure that varies over time could produce a biased HR of similar magnitude to the HRs reported in recent epidemiological studies of PM2.5 and dementia risk.
Methods:
Simulation analysis.
Results:
We found that bias can affect statistical analyses that consider exposure levels at event times only, especially if PM2.5 levels decreased consistently over time. Furthermore, the direction of such bias is away from the null and of a magnitude that is consistent with the reported estimates of dementia risk in several epidemiological studies of PM2.5 exposure (HR≈1.2 to 2.0).
Conclusions:
This bias can be avoided by correctly assigning exposure to study subjects throughout the entire follow-up period. We recommend that investigators provide a detailed description of how time-dependent exposure variables were accounted for in their Cox PH analyses when they report their results.
INTRODUCTION
In recent years, air pollution has been increasingly investigated as a risk factor for neurological outcomes. Several systematic reviews have evaluated associations between air pollution and neurodegenerative diseases [1–3]. These reviews have reported positive associations between PM2.5 and risk of dementia, including clinical Alzheimer’s disease (AD). For example, Delgado-Saborit et al. [1] identified nine studies published in 2019 or earlier that evaluated the association between exposure to PM2.5 and risk of dementia and/or AD [4–12] and concluded that consistent associations were reported between dementia incidence and air pollutants, including PM2.5. Ten more studies were published between 2020 and 2023 [13–22]. Most of these studies used Cox proportional hazards (PH) regression analyses.
When reviewing the results of these studies, we observed that associations of larger magnitude between PM2.5 and dementia were reported more often in studies that used moving PM2.5 averages as the exposure metric, i.e., in studies where the exposure values used in the statistical analyses decreased over time (e.g., Cacciottolo et al. [7], Mortamais et al. [14], Shaffer et al. [18]) than in studies that used baseline or cumulative exposures, i.e., in studies where statistical analyses were not based on decreasing exposure levels over time (e.g., Chen et al. [4]; Chen et al. [5], Ilango et al. [11]). Bowe et al. [12] reported a hazard ratio (HR) for dementia of 1.19 (95% confidence interval (CI): 1.15–1.23) based on a one-year moving average of PM2.5 among more than 4.5 million US veterans followed for up to 11 years through 2016 (median age, 64 years at baseline year of 2005), and also reported an HR of 1.04 (95% CI: 1.01–1.07) based on the three-year average exposure prior to baseline. Jung et al. [8] studied a cohort of 95,670 adults≥65 years of age in Taiwan and reported an adjusted AD HR of 2.38 (95% CI 2.21–2.56) per increase of 4.34μg/m3 in the one-year moving average PM2.5 but an adjusted AD HR of 1.03 (95% CI 0.95–1.11) per increase of 13.21μg/m3 in the one-year baseline average PM2.5.
Separately, results from two studies based on the Women’s Health Initiative Memory Study (WHIMS) cohort (ages 65 to 79 years) during comparable time periods were also inconsistent (see Fig. 1 in Cacciottolo et al. [7]; see Table 12 in Chen et al. [4]). Cacciottolo et al. [7] reported an adjusted dementia HR for high versus low exposure of 1.92 (95% CI 1.31–2.80) with high exposure defined as a 3-year moving average PM2.5 level that exceeded the current US National Ambient Air Quality Standards (NAAQS) for PM2.5 of 12μg/m3 (low exposure, 3-year moving average≤12μg/m3). In contrast, the analyses by Chen et al. [4] were based on annual cumulative averages and resulted in a null finding with an adjusted dementia HR of 0.99 (95% CI: 0.81–1.22) for the interquartile range (IQR) increase in PM2.5 of 3.9μg/m3.

Fig. 1. Seasonally-weighted annual average PM2.5 concentrations, 2000–2020. Reproduced from USEPA [23].
Annual average PM2.5 levels have declined over the past 30 years in the United States and many other countries. A summary measure of time-dependent average PM2.5 exposure estimated near the diagnosis of a chronic disease does not sufficiently represent the exposure circumstance related to an underlying physiological process associated with chronic diseases that develop slowly over time. The exposure period relevant to the disease induction could be many years earlier (assuming that a relationship exists between exposure and disease). Dementia, for example, has an insidious onset which occurs over many years.
Intuitively, baseline exposures allow for a reasonable latency and disease induction for dementia in cohort members. Therefore, if there were an association between PM2.5 and progressive clinical dementia, we would expect HRs based on baseline exposures to exceed HRs based on moving averages. Many factors may have contributed to the inconsistent findings including the possibility that investigators focused their conclusions on positive findings resulting from moving PM2.5 averages rather than null findings based on baseline or cumulative exposures. Another explanation could be that time-dependent exposure was not appropriately accounted for in the set-up of the Cox PH models. Cox PH regression assesses time-to-event data (in this case, dementia diagnoses) throughout follow-up. Persons diagnosed with dementia, as well as persons who died from other causes or who are lost to follow-up for other reasons, are removed from further consideration. If exposure changes over time, dementia diagnoses must be linked with temporally relevant exposure levels. Bias may result if only exposures measured at time of diagnosis are considered relevant.
As an illustrative example, we considered a follow-up period with four distinct timepoints (Table 1). We assumed that the timepoint-specific PM2.5 exposure levels, which declined from 18μg/m3 to 8μg/m3 over time, were temporally relevant to the dementia diagnoses (rather than the current exposure). Persons A and B were diagnosed with dementia at timepoints 1 and 4, respectively; therefore, one exposure measure was sufficient for Person A, but four exposure measures were required for Person B.
For Person A, an exposure level of 18μg/m3 was relevant to the dementia diagnosis at timepoint 1; no further follow-up was required (Table 1). For Person B, the exposure relevant to the dementia diagnosis was the low exposure of 8μg/m3 at timepoint 4; higher exposures of 18μg/m3, 15μg/m3, and 12μg/m3 corresponded to the absence of diagnoses at timepoints 1–3. Overall, the data provided no evidence of an association between high exposure and dementia. Next, we assumed that only exposures relevant to time at diagnosis were available and that the exposure of 8μg/m3 measured at Person B’s diagnosis was applied to all 4 timepoints. In this case, Person B’s absence of diagnosis at timepoints 1 to 3 was mistakenly assigned to the low exposure level of 8μg/m3 creating the appearance of an association between exposure and dementia.
Table 1. Illustrative example of correct and incorrect assignment of timepoint-specific PM2.5 exposure levels in Cox PH regression
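The two set-ups described for Persons A and B can be sketched as counting-process records, with one row per person and timepoint. This is a minimal illustration in Python and not the authors' SAS code; the function name `build_records`, the field names, and the (start, stop] interval coding are assumptions introduced here. The exposure values are the four timepoint-specific levels given in the example.

```python
# Timepoint-specific PM2.5 levels (ug/m3) and diagnosis timepoints from the example.
EXPOSURE_BY_TIMEPOINT = {1: 18, 2: 15, 3: 12, 4: 8}
DIAGNOSIS_TIMEPOINT = {"A": 1, "B": 4}


def build_records(person, correct=True):
    """Return one record per timepoint at which the person was under follow-up.

    correct=True  : each interval carries the exposure of its own timepoint.
    correct=False : the exposure at diagnosis is copied to every interval.
    """
    last = DIAGNOSIS_TIMEPOINT[person]
    records = []
    for t in range(1, last + 1):
        if correct:
            pm25 = EXPOSURE_BY_TIMEPOINT[t]
        else:
            pm25 = EXPOSURE_BY_TIMEPOINT[last]
        records.append({
            "person": person,
            "start": t - 1,              # hypothetical interval coding (start, stop]
            "stop": t,
            "pm25": pm25,
            "dementia": int(t == last),  # 1 only in the interval of diagnosis
        })
    return records


for setup in (True, False):
    label = "correct" if setup else "incorrect"
    for person in ("A", "B"):
        for row in build_records(person, correct=setup):
            print(label, row)
```

In the correct set-up, Person B contributes three event-free intervals at 18, 15, and 12μg/m3 and a diagnosis at 8μg/m3. In the incorrect set-up, all four of Person B's intervals carry 8μg/m3, so the event-free time is attributed to low exposure.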
The cohorts in these studies were followed up over time during which national PM2.5 exposures generally declined (Fig. 1). When exposures decrease over time, basing exposure levels exclusively on the time of diagnosis or the end of follow-up always leads to bias away from the null. This can be explained as follows, ignoring (for simplicity and without loss of generality) persons lost to follow-up due to other causes. Low exposure levels are possible among persons diagnosed with dementia later in the follow-up period (e.g., Person B in Table 1) and among persons remaining dementia-free at the end of follow-up. However, high exposure levels are limited to earlier diagnoses (e.g., Person A in Table 1) and are only possible in the presence of dementia. This creates an artificial association between high exposure and dementia. When temporally relevant exposures are used for each timepoint instead, a high exposure level can be linked with a diagnosis or with the absence of a diagnosis. For example, at timepoint 1 in Table 1, the exposure level of 18μg/m3 corresponds to a dementia diagnosis for Person A and the absence of a diagnosis for Person B.
Investigators do not typically provide a detailed description of how time-dependent exposure variables were accounted for in their Cox PH analyses. In this study, we used a simulation analysis to investigate whether a Cox PH analysis that does not properly account for exposure that varies over time could produce an HR of similar magnitude to the HRs reported in these studies.
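The mechanism can also be written in terms of the Cox partial likelihood. The notation below is introduced here for illustration and does not appear in the original: β is the log HR per unit of exposure, x_j(t) is the exposure of person j at time t, t_i is the diagnosis time of case i, R(t_i) is the set of persons still under follow-up at t_i, and T_j ≥ t_i is person j's own event or exit time.

```latex
% Correct set-up: every member of the risk set is compared using the
% exposure that applied at the case's diagnosis time t_i.
L_{\text{correct}}(\beta) = \prod_{i \in \text{cases}}
  \frac{\exp\{\beta\, x_i(t_i)\}}{\sum_{j \in R(t_i)} \exp\{\beta\, x_j(t_i)\}}

% Incorrect set-up: each comparator carries the exposure from its own
% later event or exit time T_j, which is lower when exposure declines.
L_{\text{incorrect}}(\beta) = \prod_{i \in \text{cases}}
  \frac{\exp\{\beta\, x_i(t_i)\}}{\sum_{j \in R(t_i)} \exp\{\beta\, x_j(T_j)\}},
  \qquad T_j \ge t_i

% Illustrative example, timepoint 1, R = \{A, B\}:
\underbrace{\frac{e^{18\beta}}{e^{18\beta} + e^{18\beta}} = \frac{1}{2}}_{\text{correct}}
\qquad \text{versus} \qquad
\underbrace{\frac{e^{18\beta}}{e^{18\beta} + e^{8\beta}} = \frac{1}{1 + e^{-10\beta}}}_{\text{incorrect}}
```

In the two-person example, the correct contribution at timepoint 1 equals 1/2 for every β and therefore carries no information about the exposure effect. The incorrect contribution increases with β, so maximizing it favors β > 0 even though exposure and diagnosis are unrelated. At timepoint 4 only Person B remains in the risk set, and the contribution equals 1 under both set-ups.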
METHODS
To illustrate the potential influence of incorrect modeling of time-dependent exposures, we generated three simulated cohorts.
Cohort A was loosely based on the study by Bowe et al. [12]. We generated a hypothetical cohort of 4,500,000 participants aged 55–76 years in 2006 who were followed from 2006 through 2016 (Supplementary Table 1, columns 1 and 2). Annual PM2.5 exposures for the follow-up years (Supplementary Table 1, column 3) were based on the median reported by Bowe et al. [12]; concentrations decreased slightly through 2012 and then remained fairly stable, similar to US national trends (Fig. 1). Because most cohort members in the Bowe et al. [12] study were men, we based mortality rates (Supplementary Table 1, column 4) on the 2010 US life table for men [24]. We assumed a slight increase in dementia diagnoses as the cohort aged (Supplementary Table 1, column 5) and roughly matched the dementia prevalence throughout follow-up to the prevalence of dementia deaths of 22 per 1,000 reported by Bowe et al. [12].
Cohort B was loosely based on the study by Cacciottolo et al. [7] and consisted of 3,600 women aged 65–79 years in 1995 who were followed from 1995 through 2010 (Supplementary Table 2, columns 1 and 2). Annual average PM2.5 exposure values for the follow-up years (Supplementary Table 2, column 3) were based on the median of 12.19μg/m3 and interquartile range (IQR) of 10.62–14.34μg/m3 reported by Cacciottolo et al. [7] and decreased over time, similar to US national trends [23]. We based mortality rates (Supplementary Table 2, column 4) on the 2000 US life table for women [24], and we assumed an increasing trend in dementia diagnoses as the cohort aged (Supplementary Table 2, column 5). The dementia prevalence throughout follow-up roughly matched the corresponding prevalence of 47 per 1,000 reported by Cacciottolo et al. [7].
Cohort C was loosely based on the study by Jung et al. [8]. We generated a hypothetical cohort of 100,000 participants aged 65–84 years in 2000 who were followed from 2000 through 2010 (Supplementary Table 3, columns 1 and 2). Annual PM2.5 exposures for the follow-up years were based on median PM2.5 levels in Taiwan as reported in Fig. 5 in Jung et al. [8]. Except for the years 2004 and 2005, there was an overall tendency towards lower exposure levels over time (Supplementary Table 3, column 3). We based mortality rates (Supplementary Table 3, column 4) on the 2000 US life table [24] because crude mortality rates and proportions of deaths after age 65 were shown to be similar in the US and Taiwan [8, 25 (pages 12 and 17)]. We assumed a slight increase in dementia diagnoses as the cohort aged (Supplementary Table 3, column 5) and roughly matched the dementia prevalence throughout follow-up to the prevalence of 15 per 1,000 reported by Jung et al. [8].
We based individual cohort members’ exposures on the annual average PM2.5 levels for each follow-up year shown in Supplementary Table 1 for Cohort A, in Supplementary Table 2 for Cohort B, and in Supplementary Table 3 for Cohort C. Specifically, we randomly selected values from normal distributions with means equal to the annual average values. Standard deviations were chosen such that the IQRs of the exposures in Cohort A, Cohort B, and Cohort C roughly matched the values reported by Bowe et al. [12], Cacciottolo et al. [7], and Jung et al. [8], respectively.
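One way to choose a standard deviation from a target IQR is sketched below, using the fact that the IQR of a normal distribution equals about 1.349 standard deviations. This is an illustration under stated assumptions, not the authors' procedure: the function names and the annual means in the usage lines are hypothetical, and the sketch matches the IQR within a single year, whereas the IQR over an entire follow-up period also reflects the time trend.

```python
import random
from statistics import NormalDist


def sd_from_iqr(iqr):
    """Standard deviation of a normal distribution with the given IQR."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return iqr / (2.0 * z75)


def draw_exposures(annual_means, sd, rng):
    """Draw one exposure value per follow-up year for a single cohort member."""
    return [rng.gauss(mean, sd) for mean in annual_means]


# IQR of 10.62 to 14.34 ug/m3 as reported for the Cacciottolo et al. cohort.
sd = sd_from_iqr(14.34 - 10.62)

# Hypothetical declining annual means; the actual values are in the supplement.
annual_means = [14.0, 13.5, 13.0, 12.5, 12.0]
rng = random.Random(2024)
print(round(sd, 2), [round(x, 1) for x in draw_exposures(annual_means, sd, rng)])
```

Because declining annual means add spread to the pooled exposure distribution, a standard deviation obtained this way would typically need to be reduced to match an IQR reported over the whole follow-up period.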
To evaluate if incorrect modeling of time-dependent exposures could create an artificial association between PM2.5 and dementia, we assumed that PM2.5 exposure was not associated with dementia or death. Therefore, we randomly assigned dementia diagnoses and deaths to cohort members based on the rates shown in Supplementary Table 1 for Cohort A, in Supplementary Table 2 for Cohort B, and in Supplementary Table 3 for Cohort C. For each follow-up year, we created two cut points (cut point 1 = Dementia diagnoses per 1,000, cut point 2 = Dementia diagnoses per 1,000 + Deaths per 1,000) and randomly assigned a number, R, between 0 and 1 to each living and dementia-free cohort member. Dementia diagnoses and deaths were then assigned as follows: If R≤cut point 1, the cohort member was marked as a dementia case and excluded from further follow-up. If cut point 1 < R≤cut point 2, the cohort member was marked as deceased and removed from follow-up. Cohort members with R > cut point 2 were considered alive and dementia-free and were reassessed in the next follow-up year. For example, for year 1995 in Supplementary Table 2, cut point 1 was 1 per 1,000 and cut point 2 was 1 per 1,000 + 25 per 1,000 = 26 per 1,000. Cohort members with R≤1 per 1,000 were marked as dementia cases, cohort members with 1 per 1,000 < R≤26 per 1,000 were marked as deceased, and cohort members with R > 26 per 1,000 were considered alive and dementia-free.
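The cut-point rule can be sketched as follows. This is a minimal Python illustration rather than the authors' SAS code; the function names and outcome labels are assumptions, and only the 1995 rates (1 dementia diagnosis and 25 deaths per 1,000) come from the text. The rates for the other years in the usage lines are hypothetical placeholders.

```python
import random


def assign_outcome(r, dementia_per_1000, deaths_per_1000):
    """Classify one cohort member for one follow-up year from a uniform draw r."""
    cut_point_1 = dementia_per_1000 / 1000.0
    cut_point_2 = cut_point_1 + deaths_per_1000 / 1000.0
    if r <= cut_point_1:
        return "dementia"
    if r <= cut_point_2:
        return "death"
    return "at_risk"  # alive and dementia-free, reassessed next year


def follow_member(yearly_rates, rng):
    """Follow one member through (year, dementia_per_1000, deaths_per_1000) rows."""
    for year, dementia_rate, death_rate in yearly_rates:
        outcome = assign_outcome(rng.random(), dementia_rate, death_rate)
        if outcome != "at_risk":
            return year, outcome  # removed from further follow-up
    return yearly_rates[-1][0], "end_of_follow_up"


# 1995 rates are from the text; 1996 and 1997 are hypothetical placeholders.
rates = [(1995, 1, 25), (1996, 2, 27), (1997, 3, 29)]
rng = random.Random(7)
print([follow_member(rates, rng) for _ in range(5)])
```

Because the draw R does not depend on exposure, any association that appears in a later regression must come from how exposure is assigned to follow-up time.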
We conducted Cox PH regression analyses using two different approaches. The first approach accounted for each person’s entire follow-up period (similar to the correct setup for Person B in Table 1). The second approach focused on exposure levels at event times (i.e., at dementia diagnosis, death, or end of follow-up) and applied them to event-free time periods (similar to the incorrect setup for Person B in Table 1).
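The contrast between the two approaches can be reproduced on a small scale with the sketch below. It is an illustration under stated assumptions and not the authors' SAS analysis: the cohort size, the constant yearly event probabilities, the linearly declining exposure means, and all function names are hypothetical, and the single-covariate Cox model is fitted by Newton-Raphson with the Breslow approximation for tied event times.

```python
import math
import random


def simulate_cohort(n, yearly_means, sd, p_dementia, p_death, rng):
    """Return (exposures, exit_year, is_case) per member; exposure has no effect."""
    cohort = []
    for _ in range(n):
        exposures = [rng.gauss(mean, sd) for mean in yearly_means]
        exit_year, is_case = len(yearly_means) - 1, False
        for year in range(len(yearly_means)):
            r = rng.random()
            if r <= p_dementia:
                exit_year, is_case = year, True
                break
            if r <= p_dementia + p_death:
                exit_year = year
                break
        cohort.append((exposures, exit_year, is_case))
    return cohort


def time_varying(member, year):
    return member[0][year]        # approach 1: exposure of the risk-set year


def at_exit(member, year):
    return member[0][member[1]]   # approach 2: exit-time exposure for all years


def fit_log_hr(cohort, exposure_at, n_years, center):
    """Maximize the Breslow partial likelihood for one covariate."""
    beta = 0.0
    for _ in range(25):
        score, info = 0.0, 0.0
        for year in range(n_years):
            s0 = s1 = s2 = case_sum = 0.0
            n_cases = 0
            for member in cohort:
                if member[1] < year:
                    continue  # already removed from follow-up
                x = exposure_at(member, year) - center
                w = math.exp(beta * x)
                s0, s1, s2 = s0 + w, s1 + w * x, s2 + w * x * x
                if member[2] and member[1] == year:
                    n_cases += 1
                    case_sum += x
            if n_cases:
                mean = s1 / s0
                score += case_sum - n_cases * mean
                info += n_cases * (s2 / s0 - mean * mean)
        step = max(-1.0, min(1.0, score / info))  # damped Newton step
        beta += step
        if abs(step) < 1e-8:
            break
    return beta


rng = random.Random(1)
yearly_means = [14.0 - 0.5 * t for t in range(10)]  # hypothetical decline
cohort = simulate_cohort(10000, yearly_means, 2.0, 0.01, 0.03, rng)
hr_correct = math.exp(fit_log_hr(cohort, time_varying, 10, 12.0))
hr_biased = math.exp(fit_log_hr(cohort, at_exit, 10, 12.0))
print(f"HR per 1 ug/m3, full follow-up: {hr_correct:.2f}")
print(f"HR per 1 ug/m3, exit-time exposure: {hr_biased:.2f}")
```

Under these assumptions, the first HR should be close to 1 and the second should exceed 1, because members who exit late carry low exposure values into early risk sets where the cases have high values. The size of the second HR depends on the assumed decline and spread and is not comparable to the values in Table 2.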
All analyses were conducted in SAS 9.4 (SAS Institute, Cary, NC). The code is provided in the Supplementary Material.
RESULTS
When exposure and outcome were evaluated across the entire follow-up period, there was a null association between PM2.5 exposure and dementia in all three cohorts (Table 2). When the analyses focused on exposures at event times that were then applied to the entire follow-up period, the HRs were biased away from the null. The bias was more pronounced in the two simulated analyses with steeper declines in PM2.5 exposure levels over time (Cohorts B and C, bias in HRs > 100%) than for Cohort A where PM2.5 concentrations decreased slightly initially but then remained fairly stable (bias in HR > 20%).
Table 2. Results from Cox PH regression analysis for cohorts A, B, and C
(a) IQR of the PM2.5 range over the follow-up period in the study by Bowe et al. (2019) [12]. (b) High exposure defined as a 3-year average exceeding the current US National Ambient Air Quality Standards (NAAQS) for PM2.5 of 12μg/m3 (low exposure, 3-year average≤12μg/m3) in the study by Cacciottolo et al. (2017) [7]. (c) IQR of the decrease in PM2.5 over the follow-up period in the study by Jung et al. (2015) [8].
DISCUSSION
We quantitatively assessed the impact of incorrectly modeling time-dependent exposures in models assessing effects of PM2.5 average exposures on dementia risk. We showed that the resulting bias can affect statistical analyses that consider exposure levels at event times only, especially if PM2.5 levels decreased consistently over time. Furthermore, we showed that the direction of such bias is away from the null and of a magnitude that is consistent with the reported estimates of dementia risk in several epidemiological studies of PM2.5 exposure. Importantly, we showed that the bias can be avoided by correctly assigning exposure to study subjects throughout the entire follow-up period. To proactively address questions about whether Cox PH regression was performed correctly, study investigators should explicitly describe how they accounted for time-dependent variables, for example, by using and correctly analyzing a counting process in which multiple records reflecting different exposure periods are created for each study subject. The authors could also provide the relevant programming code used to create and/or analyze the time-dependent variables.
We assumed that the prevalence of dementia increased over time. This is a reasonable assumption considering the aging population. While there is some evidence that the incidence of dementia is decreasing in the US [26–28] and other high-income countries [28], the burden of disease is expected to increase in the population ≥65 years of age [28, 29].
Some risk factors for dementia are the same as risk factors for cardiovascular disease, including high blood pressure, smoking, diabetes, lack of physical activity, obesity, and poor diet [28, 30]. Additional risk factors include alcohol consumption, low cognitive engagement, depression, hearing loss, social isolation, and traumatic brain injuries [28, 30]. Potential biological pathways by which long-term PM2.5 exposure could lead to dementia include neuroinflammation and oxidative stress in the cerebral cortex, hippocampus, and hypothalamus, possibly via respiratory tract inflammation or translocation of particles and soluble components [31]. Neuroinflammation may be associated with morphological changes in the brain and neurodegeneration as indicated by reduced brain volume and cortical white matter and increased indicators of Alzheimer’s disease (e.g., phosphorylated tau and amyloid-β peptides) in the cerebral cortex [31]. Alternatively, activation of the sympathetic nervous system, possibly via upregulation of the renin-angiotensin system (RAS), may contribute to increased blood pressure and vascular dysfunction [31].
As described above, a summary measure of time-dependent average PM2.5 exposure estimated near the diagnosis of a chronic disease with an insidious onset such as dementia does not sufficiently represent the exposure period relevant to the disease induction, which could be many years earlier. In addition, Li et al. [32] state that repair mechanisms for neurological damage from PM2.5 exposure are not yet understood but are likely observed within some time window following exposure. Cumulative exposures, lagged to account for a relevant window of exposure for disease development, are likely to be a more appropriate summary measure in studies of dementia etiology. Because cumulative exposure measures take earlier, higher exposure periods into account, these exposure measures may prove useful in identifying early biological indicators of any PM2.5-related decreases in cognitive function or Alzheimer’s disease, as well as help identify relevant time windows of repair or deterioration in the injury process related to PM2.5 exposure.
CREDIT AUTHOR STATEMENT
Linda D. Dell (Conceptualization; Data curation; Funding acquisition; Investigation; Methodology; Project administration; Resources; Supervision; Validation; Visualization; Writing – original draft; Writing – review & editing); Annette M. Bachand (Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Software; Validation; Visualization; Writing – original draft; Writing – review & editing).
ACKNOWLEDGMENTS
The authors are grateful to Ms. Evelyn DeVellis for support with manuscript preparation.
FUNDING
This research was funded in part by the American Petroleum Institute (API) through a contract between API and Ramboll U.S. Consulting, Inc., an international science and engineering company that provided salary compensation to the authors. No authors were directly compensated by API.
CONFLICT OF INTEREST
AB and LD are salaried employees of Ramboll Americas Engineering, Inc., a consulting firm that provides scientific and technical support to a variety of clients in private and public sectors. API requested a review of the completed work but was not involved in the preparation of the manuscript or the interpretation of the results. The views expressed in this manuscript are those of the authors and do not necessarily reflect the views of API, Ramboll Americas Engineering, Inc. or its parent company, Ramboll Group A/S.
DATA AVAILABILITY
The data supporting the findings of this study are available within the article and/or its supplementary material.
