Improving Prognostic Web Calculators: Violation of Preferential Risk Independence

Abstract

Background:

Web-based applications are available for prognostication of individual patients. These prognostic models were developed for groups of patients. No one is the average patient, and using these calculators to inform individual patients could provide misleading results.

Objective:

This article gives an example of paradoxical results that may emerge when indices used for prognosis of the average person are used for care of an individual patient.

Methods:

We calculated the expected mortality risks of stomach cancer and its associated comorbidities. Mortality risks were calculated using data from 140,699 Veterans Administration nursing home residents.

Results:

On average, a patient with hypertension has a higher risk of mortality than one without hypertension. Surprisingly, among patients with stomach cancer, hypertension appears protective and reduces risk of mortality. This paradoxical result is explained by how group-level, average prognosis could mislead individual patients. In particular, average prognosis of stomach cancer patients reflects the impact of various comorbidities that co-occur in stomach cancer patients. The presence of hypertension, a relatively mild comorbidity of stomach cancer, indicates that more serious comorbidities have not occurred. It is not that hypertension is protective; it is the absence of more serious comorbidities that is protective. The article shows how the presence of these anomalies can be checked through the mathematical concept of preferential risk independence.

Conclusion:

Instead of reporting average risk scores, web-based calculators may improve accuracy of predictions by reporting the unconfounded risks.

Introduction

Web-based prognostic calculators are increasingly available for clinical use. For example, web calculators are used to predict who is at risk of diabetes,¹ heart failure,² stroke outcomes,³ and common cancers,⁴ among other diseases, with websites available to assist clinicians and patients in their use:

• http://eprognosis.ucsf.edu/,

• http://depts.washington.edu/shfm/,

• http://hp2010.nhlbihin.net/atpiii/calculator.asp,

• http://www.zunis.org/FHS%20Afib%20Risk%20Calculator.htm,

• http://www.cancer.gov/bcrisktool/,

• http://gosset.wharton.upenn.edu/mortality/,

• http://www.mdcalc.com

Web calculators also report survival risks in different settings of care including community-based and institutional settings.^5–8 Indices developed by Mazzaglia,⁹ Gagne,¹⁰ Carey,^11,12 Lee,⁸ Schonberg,¹³ Porock,^14,15 Mitchell,¹⁶ Flacker,^17,18 Levine,¹⁹ Walter,^20,21 Di Bari,²² Inouye,²³ Pilotto,^24,25 Teno,²⁶ Dramé,²⁷ and Fischer²⁸ provide survival estimates for a variety of diagnoses and specific patient populations using clinical data. In addition to the clinical scales, a number of investigators have developed prognostic indices from patients' medical history within electronic health records. These include efforts led by Charlson,²⁹ Deyo,³⁰ Romano,³¹ Manitoba,³² D'Hoore,³³ Elixhauser,³⁴ and van Walraven.³⁵ Because of the importance of prognosis in management of patient care and in research, repeated efforts have been made to refine these methods and estimates.

Despite widespread efforts to create web-based calculators, a fundamental limitation exists in these efforts. Most prognostic indices were developed to describe survival of the average person but are being used to report survival of an individual. Since everyone differs from the average person by some feature, these statistical models may be misleading, when applied to care of individuals. The best decision for an average patient is likely to be different from the best advice to an individual patient.^36,37 Statisticians often describe this phenomenon in terms of pooled estimation across different subgroups and subgroup (stratified) analysis. One way to personalize data is to divide the data into strata; then if the patient matches a specific stratum, the estimate for that stratum is the most appropriate for the patient. The cross-strata average is not a good estimate. Still another way to personalize data is to create models with numerous interacting variables and to evaluate the models at the patient's profile. None of the web calculators reviewed earlier examined interactions among their variables or conducted subgroup (stratified) analysis, and, therefore, the use of these models to guide care of an individual patient may be suspect. Statisticians warn against the use of these models to guide individuals, but such use continues.^38–48
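The stratified approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any calculator cited here: the function name `stratum_estimate`, the record layout, the `min_size` fallback rule, and all counts are hypothetical.

```python
from collections import defaultdict


def stratum_estimate(records, profile, min_size=1):
    """Return (estimate, source) for a patient profile.

    records: iterable of (profile, died) pairs, where profile is a frozenset
             of diagnosis labels and died is True or False.
    Falls back to the pooled (cross-strata) rate when the matching stratum
    has fewer than min_size patients.
    """
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for prof, died in records:
        totals[prof] += 1
        deaths[prof] += int(died)

    n = totals.get(profile, 0)
    if n >= min_size:
        return deaths[profile] / n, "stratum"
    pooled = sum(deaths.values()) / sum(totals.values())
    return pooled, "pooled"


# Hypothetical records: two strata with very different mortality.
mild = frozenset({"cancer", "hypertension"})
severe = frozenset({"cancer", "liver metastasis"})
records = (
    [(mild, True)] * 2 + [(mild, False)] * 8        # 2 of 10 died
    + [(severe, True)] * 9 + [(severe, False)] * 1  # 9 of 10 died
)

print(stratum_estimate(records, mild))                   # stratum-specific rate
print(stratum_estimate(records, frozenset({"cancer"})))  # no such stratum: pooled rate
```

In this toy cohort the pooled rate sits between the two strata, so it overstates risk for the mild profile and understates it for the severe one.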

Many examples can be provided where patient characteristics interact. For example, in diabetes,⁴⁹ arthritis,⁵⁰ cardiovascular illness,⁵¹ mental health,⁵² and cancer,⁵³ comorbidities interact to change a patient's prognosis. Age and comorbidities also interact.⁵⁴ These interactions are well known in the literature but absent in prognostic calculators. At the group level, it may not matter, as the averages reflect the influence of interacting comorbidities; for some patients in the group, we may under-estimate the prognosis; for others, we may over-estimate; and for the remaining, we may still be accurate.^55–66 For one person, however, prognosis can vary several fold based on their comorbidities, and ignoring interactions could mislead the patient. The purpose of this brief article is to highlight, by way of an example, how the result of these websites could be misleading.

Paradoxical results: Diseases that are good for you?

The failure to adequately examine combinations of features could lead to paradoxical advice from web-based calculators: The calculator may report that having an illness could improve prognosis. Sometimes this is possible; for example, having chicken pox in childhood may create immunity to this illness later. In general, however, one expects that the majority of illnesses make prognosis at least tangibly worse, rarely better. The fact that many illnesses can be found that would increase longevity seems prima facie wrong. Furthermore, if an illness sometimes increases risk and at other times decreases risk, we have a paradox. In mathematics, this situation is referred to as the violation of preferential risk independence.^67,68

To understand preferential independence, imagine two people, one well and another with an illness, let us say hypertension. The hypertensive patient has a higher risk of mortality than the well patient. So far, the comparison makes sense. Preferential independence is violated when adding an identical diagnosis, for example, stomach cancer, to both the normal and hypertensive patients reverses the risk preference (see Fig. 1). Hypertension increases the risk when it occurs by itself, and it decreases the risk when it occurs among patients with stomach cancer. This contradicts our expectation that diseases adversely affect prognosis. Keeney and Raiffa⁶⁷ have also shown that if the principle of preferential risk independence does not hold, no model that assigns each diagnosis its own score can reproduce the reversal of risk preference. In other words, there is no mathematical function that can score hypertension and stomach cancer separately so that adding the same diagnosis (stomach cancer) changes hypertension from a risk to a protective factor.
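The reversal can be stated compactly. The notation below is ours rather than the article's: r(h, c) denotes the risk of mortality for a patient with hypertension status h and stomach cancer status c (1 = present, 0 = absent), and the additive score s is an assumption chosen to represent indices that score each diagnosis separately.

```latex
% Preferential risk independence of hypertension (h) from cancer (c):
% the ordering of risks by h does not depend on the level of c.
r(1, 0) \ge r(0, 0) \;\Longrightarrow\; r(1, 1) \ge r(0, 1)

% A violation is a reversal of that ordering:
r(1, 0) > r(0, 0) \quad\text{and}\quad r(1, 1) < r(0, 1)

% An additive (main-effects) score cannot reproduce a reversal:
s(h, c) = \beta_0 + \beta_h h + \beta_c c
\;\Longrightarrow\;
s(1, c) - s(0, c) = \beta_h \quad \text{for every } c
```

Because the contribution of h equals the same constant at both levels of c, its sign is fixed: a score of this form can label hypertension as harmful or as protective, but not both. The same holds after any increasing transformation of s, such as the logistic function.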

FIG. 1.

Violation of Preferential Independence. Values in parentheses are average risk of mortality in six months for veterans residing in nursing homes.

The concept of preferential risk independence is closely related to Simpson's paradox, which has been shown in numerous settings.^69–77 In Simpson's paradox, the impact of a variable is reversed when it is examined in the subgroup as opposed to overall patients. Similar to Simpson's paradox, preferential risk independence reports reversal of impact of a variable, but now in two different subgroups.
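A reversal of the Simpson's paradox type is easy to reproduce with a small table of counts. The sketch below uses hypothetical counts chosen only to show the mechanism; the stratum labels and the helper names `rate` and `pooled` are illustrative.

```python
def rate(deaths, patients):
    """Mortality rate as a fraction."""
    return deaths / patients


# Hypothetical counts: (deaths, patients) by stratum and comorbidity status.
counts = {
    "low-risk stratum": {"with": (90, 900), "without": (5, 100)},
    "high-risk stratum": {"with": (60, 100), "without": (450, 900)},
}

# Within each stratum, patients with the comorbidity have higher mortality.
for stratum, groups in counts.items():
    print(stratum, rate(*groups["with"]), rate(*groups["without"]))


def pooled(status):
    """Mortality rate for one comorbidity status, pooled over all strata."""
    deaths = sum(groups[status][0] for groups in counts.values())
    patients = sum(groups[status][1] for groups in counts.values())
    return rate(deaths, patients)


# Pooled over strata the direction reverses, because patients with the
# comorbidity are concentrated in the low-risk stratum.
print("pooled", pooled("with"), pooled("without"))
```

The comorbidity raises mortality in both strata, yet looks protective in the pooled comparison, which is the same mechanism that makes a mild comorbidity appear to lower risk when it marks membership in a healthier subgroup.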

In this article, we provide data that confirm violation of preferential independence and the counter-intuitive results produced when average risk is used to guide prognostic calculators. At the end of this article, we provide alternative ways of resolving these contradictions.

Source of data

Data were obtained for 140,699 nursing home residents from electronic health records of veterans across 126 medical centers through the Veterans Administration Informatics and Computing Infrastructure (VINCI). These data included all hospital diagnoses from 2003 to 2012. All-cause probability of mortality was calculated as the percentage of residents who died within six months after the date of the diagnosis. In a separate paper, we have reported the use of these data to predict prognosis of nursing home patients.⁷⁸ A naïve Bayes model was used to predict prognosis of patients. Similar to other indices in the literature, this model does not account for interactions among the diagnoses. It works for the average person, but later we show that its use for an individual patient can create confusing and contradictory results.
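Six-month mortality as a simple proportion can be computed along the following lines. This is a sketch under assumptions, not the study's actual pipeline: the record layout, the function name `six_month_mortality`, and the choice of 182 days for six months are ours, and the records are invented.

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=182)  # assumption: six months taken as 182 days


def six_month_mortality(records):
    """Fraction of patients who died within six months of diagnosis.

    records: list of (diagnosis_date, death_date or None) pairs.
    """
    if not records:
        raise ValueError("no records")
    died = sum(
        1
        for diagnosed, death in records
        if death is not None and death - diagnosed <= SIX_MONTHS
    )
    return died / len(records)


# Hypothetical records, not taken from the VINCI data.
records = [
    (date(2010, 1, 5), date(2010, 3, 1)),   # died within six months
    (date(2010, 2, 1), date(2011, 6, 30)),  # died later
    (date(2010, 4, 9), None),               # no death recorded
    (date(2011, 7, 1), date(2011, 12, 1)),  # died within six months
]
print(six_month_mortality(records))
```

The sketch treats every patient as fully observed for six months; a real analysis would also need a rule for patients with shorter follow-up.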

Violation of Preferential Risk Independence

We provide an example to illustrate the violation of preferential independence. Consider the probability of mortality among patients who have stomach cancer (i.e., Malignant Neoplasm of Stomach Cardia, code 151.0 in International Classification of Disease Version 9). On average, the probability of six-month mortality is 0.52 and is shown as a dashed line in Figure 2. This is the pooled estimate across various subgroups. Among veterans in nursing homes, stomach cancer occurs with several different comorbidities, each of which defines a new subgroup. The comorbidities that occurred more than 250 times in our sample of data are provided in Figure 2. These comorbidities are listed in order of average probability of mortality on the X-axis, starting with “hypertension” and ending with “palliative consult.” The dark solid line shows the probability of mortality of patients who have both stomach cancer and the comorbidity. The gray solid line shows the probability of mortality for patients who have only the comorbidities. In our cohort of nursing home residents, hypertension, by itself, increased risk of mortality by 8%. However, the probability of mortality for hypertensive patients who had stomach cancer was 0.44, which was 8 percentage points lower than the average probability of mortality for patients who had stomach cancer: 0.52. Paradoxically, hypertension increased the risk of a normal patient but reduced the risk of a patient with stomach cancer. This is an example of violation of preferential independence. Clinicians and patients may find it counter-intuitive that adding “hypertension” to a patient who has “stomach cancer” would reduce the average mortality rate.
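A check for this kind of reversal can be written in a few lines. The sketch below plugs in the values quoted in the text (0.52, 0.44, and the 8-point increase for hypertension alone). The reference risk for a patient with neither diagnosis is not stated in the text, so `BASELINE = 0.0` is a placeholder assumption; the function name is also ours.

```python
def violates_preferential_independence(risk, factor, context):
    """True if `factor` changes its direction of effect once `context` is added.

    risk maps a frozenset of diagnoses to a probability of mortality.
    """
    alone = risk[frozenset({factor})] - risk[frozenset()]
    with_context = risk[frozenset({factor, context})] - risk[frozenset({context})]
    return alone * with_context < 0  # opposite signs indicate a reversal


BASELINE = 0.0  # assumption: placeholder reference risk, not given in the text

risk = {
    frozenset(): BASELINE,
    frozenset({"hypertension"}): BASELINE + 0.08,         # 8-point increase alone
    frozenset({"stomach cancer"}): 0.52,                  # pooled average in the text
    frozenset({"hypertension", "stomach cancer"}): 0.44,  # reported in the text
}

print(violates_preferential_independence(risk, "hypertension", "stomach cancer"))
```

The result depends only on the signs of the two differences, so the placeholder baseline does not affect the outcome as long as hypertension alone raises risk.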

FIG. 2.

Average Mortality from Stomach Cancer and Frequent Comorbidities. Description of comorbidity codes: 1, hypertension NOS; 2, hyperlipidemia NEC/NOS; 3, diabetes mellitus without complication not stated as uncontrolled; 4, esophageal reflux; 5, depressive disorder NEC; 6, coronary atherosclerosis of native coronary artery; 7, coronary atherosclerosis of unspecified type of vessel; 8, chronic airway obstruction NEC; 9, benign hypertrophy of prostate without urinary obstruction; 10, history tobacco use; 11, iron deficiency anemia NOS; 12, congestive heart failure unspecified; 13, hypopotassemia; 14, atrial fibrillation; 15, anemia NOS; 16, esophageal stricture; 17, urinary tract infection NOS; 18, dehydration; 19, encounter for chemotherapy; 20, antineoplastic chemotherapy encounter; 21, pneumonia organism NOS; 22, acute renal failure NOS; 23, pleural effusion NOS; 24, dysphagia NOS; 25, dysphagia; 26, radiotherapy encounter; 27, malignant neoplasm of intra-abdominal lymph nodes; 28, protein-calorie malnutrition NOS; 29, secondary malignant neoplasm of lung; 30, secondary malignant neoplasm liver; and 31, palliative care encounter. NOS stands for not otherwise specified. NEC stands for not elsewhere classified.

Discussion

The data reviewed in this article show the paradoxical results where hypertension increases the risk of mortality for normal patients but decreases the risk of mortality for patients with stomach cancer. The pooled estimate across different comorbidities reflects the influence of serious (#30 liver cancer) and less serious (#1 Hypertension NOS) diseases. These estimates are an average. If the patient has less serious disease, the average is too high. If the patient has more serious disease, the average is too low. Since on discharge from hospitals, the 5 most serious diseases are listed, the inclusion of hypertension, a relatively benign comorbidity, signals the absence of more serious diseases. It is not surprising that hypertension appears to reduce the average probability of dying from stomach cancer.

To avoid these types of counter-intuitive situations where comorbidities reduce risk of mortality, we could rely on measurement of the probability (or risk) of mortality from stomach cancer without any other comorbidities. Note that this is not the same as the average probability of mortality among stomach cancer patients, which reflects the influence of various comorbidities of stomach cancer. In Figure 2, the unconfounded impact of “stomach cancer” can be obtained by averaging the additional contribution of “stomach cancer” beyond the contribution of each of its comorbidities: the difference between the dark and gray solid lines. For example, for hypertensive patients, stomach cancer adds 36 percentage points to the mortality rate. For patients with secondary malignant neoplasm of the liver, stomach cancer contributes far less. In our data, the direct impact of stomach cancer ranges from 4% to 43% and on average, it is 19%. A patient with stomach cancer and no other illness has a 19% chance of mortality and not the earlier reported average risk of mortality of 52%. In this scenario, all comorbidities, including “hypertension,” add to this estimate.
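The averaging step can be sketched as below. Only the hypertension row follows the text (0.44 with stomach cancer, and 0.44 minus the 36-point contribution, or 0.08, without it); the other two rows are hypothetical placeholders, not values read from Figure 2. The use of an unweighted mean is also an assumption, since the text does not say whether the average is weighted by subgroup size.

```python
def direct_impact(with_cancer, without_cancer):
    """Average added risk of the index disease across comorbidities.

    with_cancer[c]    : mortality for patients with the index disease and comorbidity c
    without_cancer[c] : mortality for patients with comorbidity c only
    Returns (per-comorbidity differences, their unweighted mean).
    """
    diffs = {c: with_cancer[c] - without_cancer[c] for c in with_cancer}
    return diffs, sum(diffs.values()) / len(diffs)


# Hypertension values follow the text; the other rows are hypothetical.
with_cancer = {"hypertension": 0.44, "dehydration": 0.60, "liver metastasis": 0.80}
without_cancer = {"hypertension": 0.08, "dehydration": 0.35, "liver metastasis": 0.75}

diffs, mean_diff = direct_impact(with_cancer, without_cancer)
for comorbidity, d in diffs.items():
    print(f"{comorbidity}: {d:.2f}")
print(f"mean added risk: {mean_diff:.2f}")
```

With these placeholder rows the added risk shrinks as the comorbidity becomes more severe, which mirrors the pattern described for the dark and gray lines.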

If a patient or clinician asks for the prognosis of “stomach cancer” assuming no influence from any other comorbidity, the correct probability to report in our example is 19%. Web-based calculators that report the average risk should clarify that these risks include the contribution of a constellation of comorbidities, which may or may not be present in the patient at hand. To report the probability of six-month mortality from stomach cancer as 52% could be misleading for a patient who has no other comorbid conditions.

There are many ways this problem can be corrected in web-based calculators. Some investigators, for example, Elixhauser,³⁵ have avoided this problem by scoring only severe diagnoses that increase the odds of mortality. One could, for example, report the maximum risk associated with patients' various diagnoses. This approach, though simpler, still distorts the data. It may report the prognosis of patients with serious diseases correctly but would inaccurately report the prognosis of patients with a combination of serious and relatively benign comorbidities.

A second approach to addressing the confusion with a combination of serious and relatively benign diseases is to report the probability of mortality associated with the combination of the patient features. To accomplish this, web-based calculators derived from logistic regression equations would need to include all interaction terms; web-based calculators that are derived from Bayesian likelihood ratios must calculate the likelihood ratio associated with combinations of events. Given the increasing size of databases and available data, it is often feasible to calculate the probability of joint diagnoses.
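For calculators built on likelihood ratios, the difference between multiplying per-diagnosis ratios and using the ratio for the joint event can be sketched as follows. All counts are hypothetical, and the function names are ours; the odds form of Bayes' rule is standard.

```python
def likelihood_ratio(n_feature_dead, n_dead, n_feature_alive, n_alive):
    """LR = P(feature | died) / P(feature | survived)."""
    return (n_feature_dead / n_dead) / (n_feature_alive / n_alive)


def posterior_probability(prior, lr):
    """Update a prior probability of death with a likelihood ratio (odds form)."""
    odds = prior / (1 - prior) * lr
    return odds / (1 + odds)


# Hypothetical cohort of 1,000 patients: 200 died, 800 survived.
N_DEAD, N_ALIVE = 200, 800
PRIOR = N_DEAD / (N_DEAD + N_ALIVE)

lr_a = likelihood_ratio(100, N_DEAD, 200, N_ALIVE)   # diagnosis A
lr_b = likelihood_ratio(120, N_DEAD, 400, N_ALIVE)   # diagnosis B
lr_ab = likelihood_ratio(40, N_DEAD, 120, N_ALIVE)   # A and B observed together

# Naive Bayes multiplies the separate ratios, assuming A and B are
# independent given the outcome; the joint ratio makes no such assumption.
naive = posterior_probability(PRIOR, lr_a * lr_b)
joint = posterior_probability(PRIOR, lr_ab)
print(f"naive product: {naive:.3f}, joint: {joint:.3f}")
```

In this toy cohort the joint estimate equals the observed mortality among patients with both diagnoses (40 of 160), while the product of separate ratios overstates it.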

Finally, we also recommend the use of propensity or other forms of matching and data balancing to remove confounding, thereby estimating the unconfounded mortality rate. Models derived in this fashion would be far more complicated, perhaps not as accurate across large groups, but certainly more valid for the patient at hand. The information provided to patients and their clinicians is personalized and different from the information provided to the average person. For web calculators to be accurate, they need to fit their data to the patient's comorbidities and reduce reliance on use of population averages. Web calculators purport to provide personalized data; however, they do not do so. They provide population averages, and, by implication, assume that these averages are the best prediction for the patient at hand.

Acknowledgment

This project was funded by appropriation #3620160 from VA Office of Geriatrics and Extended Care.

Author Disclosure Statement

No competing financial interests exist.

The contents of this article do not represent the views of the Department of Veterans Affairs or the United States Government.

References

1. Hippisley-Cox, Coupland, Robson, et al.: Predicting risk of type 2 diabetes in England and Wales: Prospective derivation and validation of QDScore. BMJ, 2009; 338:b880.

2. Horne, May, Kfoury, et al.: The Intermountain Risk Score (including the red cell distribution width) predicts heart failure and other morbidity endpoints. Eur J Heart Fail, 2010; 12:1203–1213.

3. Flint, Faigeles, Cullen, et al.: VISTA Collaboration. THRIVE score predicts ischemic stroke outcomes and thrombolytic hemorrhage risk in VISTA. Stroke, 2013; 44:3365–3369.

4. Hippisley-Cox, Coupland: Development and validation of risk prediction algorithms to estimate future risk of common cancers in men and women: Prospective cohort study. BMJ Open, 2015; 5:e007825.

5. Horne, Muhlestein, Lappé, et al.: The intermountain risk score predicts incremental age-specific long-term survival and life expectancy. Transl Res, 2011; 158:307–314.

6. Horne, Lappé, Muhlestein, et al.: Repeated measurement of the intermountain risk score enhances prognostication for mortality. PLoS One, 2013; 8:e69160.

7. Yourman, Lee, Schonberg, et al.: Prognostic indices for older adults: A systematic review. JAMA, 2012; 307:182–192.

8. Lee, Lindquist, Segal, Covinsky: Development and validation of a prognostic index for 4-year mortality in older adults. JAMA, 2006; 295:801–808.

9. Mazzaglia, Roti, Corsini, et al.: Screening of older community-dwelling people at risk for death and hospitalization: The Assistenza Socio-Sanitaria in Italia project. J Am Geriatr Soc, 2007; 55:1955–1960.

10. Gagne, Glynn, Avorn, et al.: A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol, 2011; 64:749–759.

11. Carey, Covinsky, Lui, et al.: Prediction of mortality in community-living frail elderly people with long-term care needs. J Am Geriatr Soc, 2008; 56:68–75.

12. Carey, Walter, Lindquist, Covinsky: Development and validation of a functional morbidity index to predict mortality in community-dwelling elders. J Gen Intern Med, 2004; 19:1027–1033.

13. Schonberg, Davis, McCarthy, Marcantonio: Index to predict 5-year mortality of community-dwelling adults aged 65 and older using data from the National Health Interview Survey. J Gen Intern Med, 2009; 24:1115–1122.

14. Porock, Oliver, Zweig, et al.: Predicting death in the nursing home: Development and validation of the 6-month Minimum Data Set mortality risk index. J Gerontol A Biol Sci Med Sci, 2005; 60:491–498.

15. Porock, Parker-Oliver, Petroski, Rantz: The MDS Mortality Risk Index: The evolution of a method for predicting 6-month mortality in nursing home residents. BMC Res Notes, 2010; 3:200.

16. Mitchell, Miller, Teno, et al.: Prediction of 6-month survival of nursing home residents with advanced dementia using ADEPT vs hospice eligibility guidelines. JAMA, 2010; 304:1929–1935.

17. Flacker, Kiely: Mortality-related factors and 1-year survival in nursing home residents. JAGS, 2003; 51:213–221.

18. Kruse, Parker Oliver, Mehr, et al.: Using mortality risk scores for long-term prognosis of nursing home residents: Caution is recommended. J Gerontol A Biol Sci Med Sci, 65:1235–1241.

19. Levine, Sachs, Jin, Meltzer: A prognostic model for 1-year mortality in older adults after hospital discharge. Am J Med, 2007; 120:455–460.

20. Walter, Brand, Counsell, et al.: Development and validation of a prognostic index for 1-year mortality in older adults after hospitalization. JAMA, 2001; 285:2987–2994.

21. Rozzini, Sabatini, Trabucchi: Prediction of 6 month mortality among older hospitalized adults. JAMA, 2001; 286:1315–1316.

22. Di Bari, Balzi, Roberts, et al.: Prognostic stratification of older persons based on simple administrative data: Development and validation of the “silver code,” to be used in emergency department triage. J Gerontol A Biol Sci Med Sci, 2010; 65:159–164.

23. Inouye, Bogardus, Vitagliano, et al.: Burden of illness score for elderly persons: Risk adjustment incorporating the cumulative impact of diseases, physiologic abnormalities, and functional impairments. Med Care, 2003; 41:70–83.

24. San Carlo, D'Onofrio, Franceschi, et al.: Validation of a modified-multidimensional prognostic index (m-MPI) including the mini nutritional assessment short-term (MNA-SF) for the prediction of one-year mortality in hospitalized elderly patients. J Nutr Health Aging, 2011; 15:169–173.

25. Pilotto, Ferrucci, Franceschi, et al.: Development and validation of a multidimensional prognostic index for one-year mortality from comprehensive geriatric assessment in hospitalized older patients. Rejuvenation Res, 2008; 11:151–161.

26. Teno, Harrell Jr, Knaus, et al.: Prediction of survival for older hospitalized patients: The HELP survival model. JAGS, 2000; 48(5 Suppl):S16–S24.

27. Dramé, Novella, Lang, et al.: Derivation and validation of a mortality-risk index from a cohort of frail elderly patients hospitalized in medical wards via emergencies: The SAFES study. Eur J Epidemiol, 2008; 23:780–791.

28. Fischer, Gozansky, Sauaia, et al.: A practical tool to identify patients who may benefit from a palliative approach: The CARING criteria. J Pain Symptom Manage, 2006; 31:285–292.

29. Charlson, Pompei, Ales, MacKenzie: A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. J Chronic Dis, 1987; 40:373–383.

30. Deyo, Cherkin, Ciol: Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol, 1992; 45:613–619.

31. Romano, Roos, Jollis: Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: Differing perspectives. J Clin Epidemiol, 1993; 46:1075–1079.

32. Roos, Walld, Romano, Roberecki: Short-term mortality after repair of hip fracture. Do Manitoba elderly do worse? Med Care, 1996; 34:310–326.

33. D'Hoore, Sicotte, Tilquin: Risk adjustment in outcome assessment: The Charlson comorbidity index. Methods Inf Med, 1993; 32:382–387.

34. Elixhauser, Steiner, Harris, Coffey: Comorbidity measures for use with administrative data. Med Care, 1998; 36:8–27.

35. Van Walraven, Austin, Jennings, et al.: A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Medical Care, 2009; 47:626–633.

36. Sacristan: Evidence from randomized controlled trials, meta-analyses, and subgroup analyses. JAMA, 2010; 303:1253–1254; author reply 4–5.

37. Bouwmeester, Zuithoff, Mallett, et al.: Reporting and methods in clinical prediction research: A systematic review. PLoS Med, 2012; 9:1–12.

38. Kent, Hayward: Limitations of applying summary results of clinical trials to individual patients: The need for risk stratification. JAMA, 2007; 298:1209–1212.

39. van Klaveren, Vergouwe, Farooq, et al.: Estimates of absolute treatment benefit for individual patients required careful modeling of statistical interactions. J Clin Epidemiol, 2015; 68:1366–1374.

40. …, Vollenweider, Varadhan, et al.: Support of personalized medicine through risk-stratified treatment recommendations - an environmental scan of clinical practice guidelines. BMC Med, 2013; 11:7.

41. Spiegel, Hawkins: 'Personalized medicine' to identify genetic risks for type 2 diabetes and focus prevention: Can it fulfill its promise? Health Aff (Millwood), 2012; 31:43–49.

42. Kent, Rothwell, Ioannidis, et al.: Assessing and reporting heterogeneity in treatment effects in clinical trials: A proposal. Trials, 2010; 11:85.

43. Rothwell: Can overall results of clinical trials be applied to all patients? Lancet, 1995; 345:1616–1619.

44. Hayward, Kent, Vijan, Hofer: Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol, 2006; 6:18.

45. Rothwell: Treating individuals 2. Subgroup analysis in randomised controlled trials: Importance, indications, and interpretation. Lancet, 2005; 365:176–186.

46. Greenfield, Kravitz, Duan, Kaplan: Heterogeneity of treatment effects: Implications for guidelines, payment, and quality assessment. Am J Med, 2007; 120:S3–S9.

47. Rothwell, Mehta, Howard, et al.: Treating individuals 3: From subgroups to individuals: General principles and the example of carotid endarterectomy. Lancet, 2005; 365:256–265.

48. Kravitz, Duan, Braslow: Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. Milbank Q, 2004; 82:661–687.

49. Imai, Kaira, Mori, et al.: Prognostic significance of diabetes mellitus in locally advanced non-small cell lung cancer. BMC Cancer, 2015; 15:989.

50. Boers: Evidence for interaction between disease severity and comorbidity in rheumatoid arthritis? Comment on the article by Navarro-Cano et al. Arthritis Rheum, 2004; 50:1695; author reply 1696–1697.

51. Ferdinandy, Hausenloy, Heusch, et al.: Interaction of risk factors, comorbidities, and comedications with ischemia/reperfusion injury and cardioprotection by preconditioning, postconditioning, and remote conditioning. Pharmacol Rev, 2014; 66:1142–1174.

52. Keeley, Chmielewski, Bagby: Interaction effects in comorbid psychopathology. Compr Psychiatry, 2015; 60:35–39.

53. Extermann: Interaction between comorbidity and cancer. Cancer Control, 2007; 14:13–22. Review.

54. Incalzi, Capparella, Gemma, et al.: The interaction between age and comorbidity contributes to predicting the mortality of geriatric patients in the acute-care hospital. J Intern Med, 1997; 242:291–298.

55. Hand, Yu: Idiot's Bayes-Not So Stupid After All? Int Stat Rev, 2001; 69:385–398.

56. Titterington, Murray, et al.: Comparison of discrimination techniques applied to a complex data set of head injured patients. J R Stat Soc, 1981; Series A, 144:145–175.

57. Monti, Cooper: A Bayesian network classifier that combines a finite mixture model and a naive Bayes model. In Proceedings of the 15th Conference on Uncertainty in AI, Stockholm, Sweden, 1999.

58. Nordyke, Kulikowski: A comparison of methods for the automated diagnosis of thyroid dysfunction. Comput Biomed Res, 1971; 4:374–389.

59. Todd, Stamper: The relative accuracy of a variety of medical diagnostic programmes. Methods Inf Med, 1994; 33:402–416.

60. Gammerman, Thatcher: Bayesian diagnostic probabilities without assuming independence of symptoms. Methods Inf Med, 1991; 30:15–22.

61. Croft, Mitchol: Mathematical models in medical diagnosis. Ann Biomed Eng, 1987; 2:69–89.

62. Ohmann, Yang, Künneke, et al.: Bayes theorem and conditional dependence of symptoms: Different models applied to data of upper gastrointestinal bleeding. Methods Inf Med, 1988; 27:73–83.

63. de Dombal, Leaper, Staniland, et al.: Computer aided diagnosis of acute abdominal pain. Br Med J, 1972; 2:9–13.

64. Bailey NTJ: Probability methods of diagnosis based on small samples. Math Comput Sci Biol Med, 1964; 103–107.

65. Boyle, Greig, Franklin, et al.: Construction of a model for computer-assisted diagnosis: Application to the problem of non-toxic goitre. Q J Med, 1966; 35:565–588.

66. Fryback: Bayes' theorem and conditional non-independence of data in medical diagnosis. Comput Biomed Res, 1978; 11:423–434.

67. Keeney, Raiffa: Decisions with Multiple Objectives–Preferences and Value Tradeoffs, Cambridge University Press, Cambridge and New York, 1993.

68. Thurston: Multi-attribute utility analysis of conflicting preferences. In: Lewis, et al. (ed): Decision Making in Engineering Design. New York, NY: ASME Press, 2006, pp. 125–133.

69. Jiamsakul, Kerr, Chandrasekaran, et al.: TREAT Asia HIV Observational Database (TAHOD). The occurrence of Simpson's Paradox if site-level effect was ignored in the TREAT Asia HIV Observational Database (TAHOD). J Clin Epidemiol, 2016; pii: S0895–S4356(16)00093–00097.

70. Holt: Potential Simpson's Paradox in Multicenter Study of Intraperitoneal Chemotherapy for Ovarian Cancer. J Clin Oncol, 2016.

71. Kronman, Freund, Hanchate, et al.: Nursing home residence confounds gender differences in Medicare utilization: An example of Simpson's paradox. Womens Health Issues, 2010; 20:105–113.

72. Albers: Dutch research funding, gender bias, and Simpson's paradox. Proc Natl Acad Sci U S A, 2015; 112:E6828–E6829.

73. Martin, Martin: Simpson's Paradox: Why Smoking Reduces the Risk of Dying of Cardiovascular Disease. Value Health, 2015; 18:A383.

74. Persoskie, Leyva: Blacks Smoke Less (and More) than Whites: Simpson's Paradox in U.S. Smoking Rates, 2008 to 2012. J Health Care Poor Underserved, 2015; 26:951–956.

75. Carey: CCR 20th Anniversary Commentary: Simpson's Paradox and Neoadjuvant Trials. Clin Cancer Res, 2015; 21:4027–4029.

76. Gale, Lamb, Allen, et al.: Simpson's Paradox and the impact of different DNMT3A mutations on outcome in younger adults with acute myeloid leukemia. J Clin Oncol, 2015; 33:2072–2083.

77. Marang-van de Mheen, Shojania: Simpson's paradox: How performance measurement can fail even with perfect risk adjustment. BMJ Qual Saf, 2014; 23:701–705.

78. Levy, Kheirbek, Alemi, et al.: Predictors of six-month mortality among nursing home residents: Diagnoses may be more predictive than functional disability. J Palliat Med, 2015; 18:100–106.