Abstract
Objective
Epidemiological models for estimating the prevalence and burden of disease inform health policy and service planning decisions. Our aim was to describe the challenges in evaluating such models using the example of epidemiological models for chronic obstructive pulmonary disease (COPD).
Methods
Two reviewers searched Medline, Embase, CAB Abstracts and World Health Organization (WHO) Databases from 1980 to November 2013 for epidemiological models of COPD prevalence and burden. Two reviewers extracted data and assessed the quality of the studies. We then undertook a descriptive and narrative synthesis of data.
Results
We identified 22 models employing a variety of techniques to calculate the prevalence and/or burden of COPD. Models calculated prevalence and/or mortality or other facets of disease burden using demographics and risk factors or trends, Markov-type modelling and microsimulation modelling. The six models which scored highly on the quality framework were: the Peabody model, which generated estimates of COPD prevalence; the WHO DISMOD II model, which produced burden estimates in terms of disability adjusted life years with COPD and life years lost to COPD; the Atsou model, which gave the life expectancy gains of individual smokers who quit smoking and associated costs; two Dutch COPD models, which produced estimates of mortality and health care costs related to COPD; and the Pichon–Riviere model, which gave the costs and cost effectiveness of smoking quit programmes.
Conclusions
The field of chronic disease modelling is burgeoning. As a result, policy makers need to understand how to interpret epidemiological models and their data sources.
Introduction
Governments need accurate and timely information in order to be able to plan for the health care needs of their population. Over recent years, the challenges of ageing populations with a high prevalence of chronic diseases have been widely discussed. 1 While epidemiological modelling has traditionally been used to describe patterns of infectious diseases, chronic disease epidemiological modelling is relatively new. It combines elements of mathematical and health economic disease modelling with clinical data to estimate current and future prevalence and disease burden. However, as models proliferate, decision makers will need to choose which will best suit their purposes.
Key challenges in the evaluation of an epidemiological model
First, it is pertinent to consider as precisely as possible the definition of the disease that is being modelled. Diseases may be defined according to the presence of risk factors or according to diagnostic tests.
Second, data sources should be clearly documented so that they can be consulted and assessed by other researchers for their validity. This is sometimes poorly done in modelling; however, it is important that other researchers can decide for themselves whether the sources of data used for the modelling are the most representative available and substitute alternative sources of data into the model where appropriate to allow cross validation of the model.
Third, the method of modelling should be documented transparently so that future researchers can understand what analysis has taken place. The 2010 Global Burden of Disease study 2 did not make clear how its data had been processed. The earlier version of the Global Burden of Disease study had an online supplement and an additional publication3,4 detailing how prevalence had been calculated from mortality figures; however, this was lacking for the 2010 version.
Fourth, modelling is a relatively young science and there are a variety of different techniques being used. The Markov model is one of the most commonly used, and many models in this systematic review describe themselves as Markov models or Markov-type models. Such models originate in the discipline of health economics but also give estimates of prevalence and burden. Essentially, Markov models describe a disease as a number of linked health states and a cohort moves through these states from health to deteriorating health to (eventually) death. Sometimes states can additionally describe recovery from disease. Data from clinical trials is used to inform the transition probabilities between the states. For an excellent introduction to Markov models please see Briggs and Sculpher. 5
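The cohort mechanics described above can be illustrated with a minimal sketch. It assumes a hypothetical three-state model (healthy, diseased, dead) with annual cycles; the state names and all transition probabilities are invented for illustration and are not taken from any of the reviewed models.

```python
# Hypothetical three-state Markov cohort model: healthy -> diseased -> dead.
# All transition probabilities are illustrative assumptions only.

STATES = ["healthy", "diseased", "dead"]

# Annual transition probabilities; each row sums to 1.
TRANSITIONS = {
    "healthy":  {"healthy": 0.97, "diseased": 0.02, "dead": 0.01},
    "diseased": {"healthy": 0.00, "diseased": 0.90, "dead": 0.10},
    "dead":     {"healthy": 0.00, "diseased": 0.00, "dead": 1.00},
}


def step(cohort):
    """Advance the cohort by one annual cycle."""
    nxt = {state: 0.0 for state in STATES}
    for src, count in cohort.items():
        for dst, prob in TRANSITIONS[src].items():
            nxt[dst] += count * prob
    return nxt


def run(cohort, years):
    """Return the cohort distribution at year 0, 1, ..., years."""
    history = [dict(cohort)]
    for _ in range(years):
        cohort = step(cohort)
        history.append(cohort)
    return history


def prevalence(cohort):
    """Proportion of living cohort members who are in the diseased state."""
    alive = cohort["healthy"] + cohort["diseased"]
    return cohort["diseased"] / alive if alive else 0.0


history = run({"healthy": 1000.0, "diseased": 0.0, "dead": 0.0}, years=10)
for year, cohort in enumerate(history):
    print(year, round(prevalence(cohort), 4))
```

In a published model the transition probabilities would be informed by trial or cohort data and would typically vary by age, sex and risk-factor status; a recovery state could be represented by a non-zero probability of moving from the diseased state back to the healthy state.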
Fifth, the time frame for the model’s operation needs to be considered: does the model use past data to produce an estimate for the present (and, if so, is this done by projecting trends or by another method), or does it use present data to project into the future? The population for which the projection is made should also be considered. Entire countries are often the frame of reference; however, this pools estimates across urban and rural areas and masks other forms of geographical heterogeneity.
Sixth, sensitivity analysis should be carried out. This is where the inputs of the model are altered by a set amount, e.g. 10%, and the model is re-run to see the effect on the output. This allows the reviewer to evaluate how sensitive the model is to each of its inputs. Sensitivity analysis may be one-way, where only one input parameter is changed at a time, or multi-way, where several input parameters are changed simultaneously to evaluate their combined effect on the output.
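One-way and multi-way sensitivity analysis can be sketched as follows. The toy model here (steady-state prevalence as incidence divided by the sum of incidence and an exit rate) and its base-case values are assumptions made purely for illustration; they are not drawn from any of the models reviewed.

```python
# Toy model and base-case inputs are hypothetical, for illustration only.

def toy_prevalence(incidence, exit_rate):
    """Steady-state prevalence in a simple two-state (well/diseased) system."""
    return incidence / (incidence + exit_rate)


BASE = {"incidence": 0.005, "exit_rate": 0.05}


def one_way(model, base, change=0.10):
    """Vary each input by -change and +change in turn, holding the rest fixed.

    Returns, per input, the relative change in output for the low and high value.
    """
    base_out = model(**base)
    results = {}
    for name in base:
        effects = []
        for factor in (1 - change, 1 + change):
            inputs = dict(base)
            inputs[name] = base[name] * factor
            effects.append((model(**inputs) - base_out) / base_out)
        results[name] = tuple(effects)
    return results


def multi_way(model, base, factors):
    """Scale several inputs at once and return the relative change in output."""
    base_out = model(**base)
    inputs = {name: value * factors.get(name, 1.0) for name, value in base.items()}
    return (model(**inputs) - base_out) / base_out


for name, (low, high) in one_way(toy_prevalence, BASE).items():
    print(f"{name}: {low:+.1%} / {high:+.1%}")
print(multi_way(toy_prevalence, BASE, {"incidence": 1.1, "exit_rate": 1.1}))
```

In this toy model, raising both inputs by the same proportion leaves the output essentially unchanged, which a one-way analysis alone would not reveal; this is the kind of interaction that multi-way analysis is intended to expose.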
And finally, it would be possible to compare the results from each model with results from large prevalence surveys. Comparisons with surveys have to ensure that the model and the survey were aiming for the same target population prevalence or burden. In the absence of an appropriate survey with which to compare, more than one model for the same population can be used for cross validation.
Our aim was to review systematically the models available for one specific long-term condition – chronic obstructive pulmonary disease (COPD).
Methods
We developed and reported a detailed protocol for this work. 6 We also registered the protocol in the PROSPERO database. 7
Eligibility criteria
Any modelling study which used demographic and epidemiological data to estimate the prevalence and/or disease burden was eligible for inclusion. The outcomes of interest were incidence, prevalence, disease burden and mortality. For the purposes of this review, disease burden was considered from the perspective of the health system.
The types of models of interest included demographic models, microsimulation models and Markov-type models. Models were excluded if they described animal cell lines, clinical series or estimates of individual risk (such as individual prognostic models). Decision analytical models or decision support models were excluded where they referred to clinical decision-making for individuals. Models comparing one intervention with another intervention were also excluded as the aim was to estimate the prevalence, disease burden and mortality rather than to investigate the effectiveness of interventions.
Information sources and study selection
A search strategy was developed using search terms to include the concepts of ‘modelling’, ‘disease burden’ and ‘chronic obstructive pulmonary disease’ as has been fully described elsewhere in McLean et al. 6 Searches were conducted in Medline, Embase, CAB Abstracts, World Health Organization (WHO) Library and Information Services (WHOLIS – library catalogue of books and reports) and the WHO Regional Indexes (AIM (AFRO), LILACS (AMR/PAHO), IMEMR (EMRO), IMSEAR (SEARO) and WPRIM (WPRO)). A modified search strategy was used to identify reports from the WHO home website and Google. Searches were for both published and unpublished modelling studies from 1980 (when modelling first began to be widely used8,9) to November 2013. All studies were independently reviewed against the stated inclusion criteria by SM and VB and all disagreements were resolved by discussion between reviewers.
Data extraction
A piloted data extraction form was used by SM and checked by VB. The following data were extracted from each study: author and email address, year, institution and funding source, the purpose of the model, model title, model type, model setting, time period and population. Model inputs, sources of input data and details of the processing of the model were also extracted, along with COPD outcomes (incidence, prevalence, mortality, primary care visits, emergency department visits, hospitalizations, treatment costs), model output/results, details of the model’s availability, any comparisons with other studies, social and economic policy implications of the model’s output and future research recommendations.
Quality appraisal framework
We quality appraised the reporting of the models. A quality of reporting framework was designed following review of key guidelines on good practice in modelling.10–13 A scoring mechanism was devised to weight the different elements required to produce a relevant and high quality model. 14 This scoring framework was described in the protocol for this systematic review 6 and considered the reporting of model development, the elements of calibration and internal and external validation, and whether it was reported that policy and decision makers had had input into the development of the model. The quality of reporting score had a maximum of 20 points.
Synthesis of results
As the models had different purposes and were based in different settings, this heterogeneity precluded any meaningful quantitative synthesis; we therefore present a narrative synthesis of results.
Results
Around 1726 studies were title-screened following duplicate deletion. About 157 titles and abstracts were selected for full text review. Excluded were regression models which were quantifying the effect of risk factors on COPD rather than projecting prevalence and/or disease burden. Twenty-one models3,4,15–35 and DISMOD 3 (unpublished) were selected, as described in the PRISMA flow diagram in Figure 1.
Figure 1. PRISMA flow diagram.
Six models scored highly on the quality of reporting framework out of a maximum 20 points: Peabody’s prevalence model 21 (16), Atsou’s smoking burden model 36 (16), WHO burden model as described by Shibuya 4 (16), two Dutch models: Feenstra 16 (17) and Hoogendoorn et al. 17 (17) and the Pichon–Riviere smoking burden model 34 (17.5). These models are described below in detail; summaries of all models are found in Tables 1 to 4, available online. High reporting quality should ensure that adequate information is included in the report in order to draw some conclusions as to the underlying quality of the model. The main focus of the quality assessment was whether sensitivity analysis had been reported, what its results were, and whether other features of quality assurance, such as debugging and external validation of results, were described.
Peabody prevalence estimation model
This model used six key risk factors for COPD identified and quantified from the literature: smoking, age, gender, indoor air pollution, outdoor air pollution and occupational exposure to airborne particles. The authors assumed no cases under the age of 30 and a smoking prevalence of 15%. Based on the literature, it was assumed that 1.9% of the non-smoking population was exposed to occupational airborne particles and that the prevalence of COPD in urban populations was 1.4%. In the model, the population was distributed by percentage between urban and rural populations and linked to environmental exposure. There was also an input for national socioeconomic development based on World Bank figures. COPD prevalence was then estimated for four countries and compared with survey data for external validation. COPD prevalences for an additional 12 countries were then estimated.
Sensitivity analysis was performed on hypothetical populations with differences in their rural/urban population distribution and high/low income split. The model was externally validated by comparing its results to survey results for Nepal, Norway, Poland and Spain. The predictions were not statistically different from the survey results for any of the countries suggesting that the model is a useful prediction tool.
Atsou smoking burden model
This Markov-type model included the severity stages of COPD: mild, moderate, severe and very severe. Transition probabilities could be altered for modelled patients moving up these severity stages as their disease progressed. The model aimed to report the impact of smoking cessation on a COPD patient’s life expectancy in terms of individual health gains. The model includes a cost-effectiveness assessment of the impact of smoking cessation programmes in England.
The sensitivity analyses involved changing the transition rates from one disease severity stage to the next, the mortality and exacerbation rates, the costs of COPD management, the discounting rates and the smoking cessation rates. For the scenarios investigating the costs, Quality Adjusted Life Years and cost effectiveness of the smoking cessation programmes, the effects of different input costs for the programme and different quit rates were investigated. The results were sensitive to transition rates from one disease stage to the next and to an increase in mortality rates. When it was assumed that ex-smokers experience fewer exacerbations than smokers, there were monetary and health care gains. The model was not very sensitive to changes in disease management costs.
WHO burden of disease model DISMOD II
DISMOD II used the WHO world regions to calculate the burden of COPD in terms of years lived with disability and mortality in terms of years of life lost. The model was based on a regression model that included a measure of smoking impact, an air pollution variable to take into account the proportion of households in each region that use indoor biofuels, and age and sex dummy variables which reflected variability in the exposure data for different regions. The input data to the model were the COPD-related mortality rates and total mortality rates per WHO region. These data were used to generate an equation which could be solved for the prevalence and incidence of COPD.
No sensitivity analysis using DISMOD II was undertaken. However, the predictive validity of the model was checked by comparing the estimated relative risks for death from COPD for the America and Pacific regions with recent burden of disease analyses from the United States and Australia. Published prevalence data based on spirometry in the WHO regions were also compared to the DISMOD II outputs and found to be consistent, with the exception of the Global Burden of Disease 1990 estimates, a difference attributed to improvements in methodology since then. 37
Feenstra chronic disease model
This model concentrated on the population with COPD in the Netherlands. A dynamic multistate life table model combined information on the demography of the Dutch population and smoking. The number of new cases of COPD each year was calculated from the incidence rates of COPD for smokers and former smokers in combination with prevalence data. Independently modelled modules exist for mortality from lung cancer, stroke, coronary heart disease and asthma. The model is dynamic in that each year’s total incorporates calculations for births and mortality.
One way sensitivity analysis was performed to determine the effect of changes in inputs of incidence and excess mortality rates on the output prevalence rates. It was found that a 20% change in incidence input rates resulted in a 17% change in prevalence rates and a 20% change in excess mortality input resulted in a 4% change in prevalence output. Therefore, the model was more sensitive to changes in incidence rates.
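The comparison above can be made explicit by expressing each result as the ratio of the percentage change in output to the percentage change in input. The following is a minimal sketch using the figures reported above; the function name is our own, and the ratio is a simple descriptive summary rather than a calculation reported by the model’s authors.

```python
def sensitivity_ratio(input_change_pct, output_change_pct):
    """Percentage change in output per one percent change in input."""
    return output_change_pct / input_change_pct


# Figures as reported for the Feenstra model's one-way sensitivity analysis.
incidence_ratio = sensitivity_ratio(20, 17)
excess_mortality_ratio = sensitivity_ratio(20, 4)

print("incidence:", incidence_ratio)
print("excess mortality:", excess_mortality_ratio)
print("more sensitive to incidence:", incidence_ratio > excess_mortality_ratio)
```

On this measure, prevalence responds about four times more strongly to a change in incidence than to the same proportional change in excess mortality.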
Hoogendoorn COPD model
This Markov-type model was developed from the Feenstra chronic disease model 16 by the addition of the four Global Initiative for Chronic Obstructive Lung Disease (GOLD) severity stages for COPD into the model. The COPD population is divided into mild, moderate, severe and very severe COPD subpopulations and also by smoking status: never-smoker, smoker or former smoker. Annual incidence and mortality rates are applied to each age, sex and smoking subpopulation. Costs are calculated by multiplying annual cost data for each severity stage by the number of patients in each stage.
Sensitivity analyses for this model were conducted by changing the input severity distribution of prevalent COPD: it was first assumed that COPD patients began with less severe disease (more mild and moderate cases) and then that they began with more severe disease (more severe and very severe cases). The severity distribution of incident COPD cases was varied in the same way, first towards less severe and then towards more severe disease. Further sensitivity analyses varied the rate of lung function decline and changed the effect of stopping smoking on lung function decline from an improvement to no effect. In all sensitivity analyses, prevalence results were within 5% of the base case projections. The results were most sensitive to the severity distribution of incident cases. Cost projections were more sensitive than prevalence projections because costs differ between COPD severity stages.
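The cost calculation described above can be sketched as follows. The patient numbers and annual costs per stage are hypothetical values chosen for illustration; they are not taken from the Hoogendoorn model.

```python
# Patient counts and annual costs are hypothetical, for illustration only.

STAGES = ("mild", "moderate", "severe", "very severe")


def total_cost(patients_by_stage, annual_cost_by_stage):
    """Sum over severity stages of (patients in stage) x (annual cost per patient)."""
    return sum(patients_by_stage[s] * annual_cost_by_stage[s] for s in STAGES)


patients = {"mild": 1000, "moderate": 2000, "severe": 500, "very severe": 100}
annual_cost = {"mild": 200.0, "moderate": 500.0, "severe": 2000.0, "very severe": 5000.0}

print(total_cost(patients, annual_cost))
```

Because the cost per patient rises steeply with severity in a calculation of this form, a shift in the severity distribution can change total costs considerably even when the total number of patients is unchanged.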
Pichon–Riviere smoking burden model
This model met the needs of 68 decision makers who had been surveyed across seven countries in Latin America (i.e. Argentina, Bolivia, Brazil, Chile, Colombia, Mexico and Peru). It considered heart disease, cerebrovascular disease, COPD, pneumonia/influenza, lung cancer and nine other neoplasms. The model used country-specific data sources where possible. However, as a result of the lack of local data, estimates of incidence were often derived from mortality data using the WHO DISMOD II methodology. 4 The baseline risk in non-smokers was then calculated based on the age-, sex- and country-specific smoking prevalence as well as disease-specific smoking relative risk. The parameters for smokers and former smokers were then calculated from this baseline risk. The model was a microsimulation of individual subjects incorporating the natural history, costs and quality of life of the above diseases. A functioning version of the model was constructed, validated and calibrated using data from Argentina.
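The exact equations used to derive the baseline risk are not given here. One common way of performing this kind of calculation, shown below as a sketch, is to treat the overall population rate as a prevalence-weighted average of the rates in never-smokers, current smokers and former smokers. The function name, smoking prevalences, relative risks and overall rate are all hypothetical values for a single age and sex stratum.

```python
# All inputs are hypothetical; this is one common approach, not necessarily
# the method used in the Pichon-Riviere model.

def baseline_rate(overall_rate, prevalence, relative_risk):
    """Rate in never-smokers implied by the overall rate.

    Assumes overall_rate = baseline * sum(prevalence[g] * relative_risk[g]).
    """
    weight = sum(prevalence[g] * relative_risk[g] for g in prevalence)
    return overall_rate / weight


smoking_prevalence = {"never": 0.5, "current": 0.3, "former": 0.2}
relative_risk = {"never": 1.0, "current": 4.0, "former": 2.0}
overall_rate = 0.01  # overall annual event rate in this stratum

base = baseline_rate(overall_rate, smoking_prevalence, relative_risk)

# Rates for each smoking group follow from the baseline and the relative risks.
rates = {group: base * rr for group, rr in relative_risk.items()}
print(rates)
```

A useful check on any calculation of this kind is that the group-specific rates, weighted by smoking prevalence, reproduce the overall rate that was used as input.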
The Argentinian version of this model underwent extensive internal validation with debugging by the inputting of null and extreme values and checks for inconsistencies. The model was then calibrated by comparing general mortality and all age- and sex-specific death rates predicted by the model with local health statistics. COPD was excepted from this process as COPD mortality was agreed to be underestimated in national statistics. Equations were modified to improve fit to the reference values. Lethality and survival rates were estimated from local and international studies and also used to calibrate the model by visual exploration of observed and expected curves to confirm a good fit. After calibration, as had been expected, correlation between predicted and observed results was better among high incidence events.
External validation for age- and sex-specific COPD predicted prevalence was performed by comparison with the results from the Latin American Project for the Investigation of Obstructive Lung Disease, a population-based survey carried out in five Latin American cities. 38 The model underestimated the level of COPD in comparison to the survey; however, the results were within 5% for each age group. This model’s report followed the International Society for Pharmacoeconomics and Outcomes Research guidelines for model development 11 and reporting and therefore included extensive description of preliminary consultation with policy and decision makers and detail regarding debugging, internal validation, calibration and external validation procedures.
Discussion
Main findings
We identified 22 models, of which six scored highly on the quality of reporting framework. The Peabody model 21 generates credible estimates of COPD prevalence in specific countries around the world. The WHO DISMOD II model 4 produces burden estimates in terms of disability adjusted life years with COPD and life years lost to COPD, and the Atsou model 32 gives the life expectancy gains of individual smokers who quit smoking and the associated costs. The Feenstra et al. 16 and Hoogendoorn et al. 17 models produce estimates of mortality and health care costs related to COPD. The Pichon–Riviere model 34 gives the costs and cost effectiveness of smoking quit programmes in the context of all tobacco-related diseases.
Strengths and limitations
This study involved a very broad and comprehensive search strategy covering many world databases held in the WHO’s library. A limitation of this study is that we did not consider models that studied the effects of interventions on COPD, because it was decided that our first priority should be to establish the baseline burden of COPD and because pharmacological interventions have already been systematically reviewed. 39
COPD models are chiefly focused on smoking as the main risk factor, and any risk-reduction predictions made by these models were from consideration of smoking alone. As understanding of the pathogenesis of COPD increases and the role of other factors in its development becomes better understood, the overwhelming focus on smoking may be considered a further limitation of these models.
A limitation of using the quality of reporting framework is that a high quality model may be overlooked because its reporting is of lower quality or lacks clarity. No report on the development of the model and its subsequent quality validation for COPD was found for DISMOD 3, the model that was used to calculate the Global Burden of Disease in the recent Lancet series.2,40
Deep critical review of all the models in this paper was not possible as we did not have full access to all the models and their underlying mathematics. This is a potential limitation when reviewing any kind of model: there is a need for transparency in publication and for a mathematical modelling skill-set to interpret the findings.
Interpretation of findings in the light of previous research
This is the first systematic review of modelling studies for COPD prevalence and burden. A review of chronic heart disease policy models 14 commented that, while reporting criteria are available for many study types, this does not yet apply to modelling studies; this affects the quality of model reporting and, consequently, the underlying model quality as inferred by potential model users. In addition, it was highlighted that although models are heavily reliant on their data inputs, few models critique the quality of their data sources.
Implications for policy and research
Implications for COPD policy are not clear overall, as the models included have many different purposes and were designed for different contexts; there is not yet a consensus on the structure of the optimum COPD model.
As techniques of chronic disease modelling are increasingly being used and recognized, policymakers need to be aware of how to interpret models and how to critique the datasets that are used as model inputs.
Implications for research on COPD models include that sensitivity analyses should be conducted in order to highlight the parameters that most affect the outputs and so demonstrate the internal validity of each model. Validation of the included COPD models is fertile ground for further research, in particular as more detailed and higher quality data become available. In addition, the design of future COPD models to include other risk factors such as prematurity, low birthweight, asthma, tuberculosis and childhood respiratory infections needs further research.
In terms of models in general, Markov-type models cannot easily model differences between cohorts and so future work could be directed towards how to best represent such differences where they exist.
Conclusions
COPD epidemiological models have widely differing structures and include Markov models and microsimulation models. In general, the field of chronic disease modelling is burgeoning. As a result, policy makers need to understand how to interpret epidemiological models and their data sources.
Footnotes
Funding
SM is funded by the University of Edinburgh and an Edinburgh and Lothian Research Fund Grant. VB had no specific funding.
Acknowledgements
SW, CS and AS are all members of staff at the University of Edinburgh. AS has served as a research consultant to Almirall and Napp.
Professor Simon Capewell of Liverpool University helped to devise the quality of reporting scoring framework.
