Abstract
The aim of the present study is to use the ACG (Adjusted Clinical Groups) System to create an impactibility model by identifying homogeneous clinical subgroups of patients with high risk of an adverse health outcome in a population of heart failure patients with complex health care needs (PCHCN). This method will allow policy makers to target and prioritize services for the highest risk PCHCN in the context of limited health care resources, by identifying relatively homogeneous groups of patients with similar comorbidities. Subjects classified in 2012 as PCHCN in a local health unit by the ACG System were linked with hospital discharge records in 2013. The authors applied the Apriori algorithm to identify the most common sets of the most predictive diseases for the following outcomes of interest: at least 1 admission and at least 1 preventable admission in the year. Predictive performance for the former outcome was compared between the impactibility model and the ACG System's individual risk score. The Apriori algorithm was also applied to predict the latter outcome as an example of an event that a policy maker would be able to prevent. Evidence showed no statistically significant difference between the 2 methods. The present model also displayed evidence of good calibration. The Apriori algorithm was applied as an impactibility model, built on the ACG System, that allowed the authors to obtain an “ACG-based group risk score” and use it to identify clinically homogeneous subgroups of PCHCN. This will help policy makers develop “tool kits” for homogeneous groups of patients that improve health outcomes.
Introduction
The greatest challenge facing health systems globally in the twenty-first century concerns the progressive aging of the population, accompanied by an increasing burden of chronic diseases 1 that require ongoing management over a period of years or decades. 2 A strategic vision, coupled with the ability to mobilize and deliver appropriate resources to chronic disease patients in the community, is needed so that health care professionals can provide accessible, safe, well-coordinated, cost-effective, and high-quality care. 3
Therefore, policy decision makers should prioritize how limited resources can be used to optimize the health needs of the chronically ill by applying health information technology and data sharing. In fact, great potential exists to improve the health of these individuals through improved coordination that integrates multiple domains of the population health management approach.
Patients with complex health care needs (PCHCN), who place a heavy burden on health care resources, typically include individuals with multiple chronic conditions (multimorbidity). 4 For example, a previous study found that elderly patients labelled as PCHCN suffer from >4 diseases on average. 5
Targeting the needs of PCHCN by designing tailored health care models capable of improving these patients' health outcomes while reducing related costs is therefore an important policy priority. 6
One strategy could be to optimize the case-finding process by developing tools such as a risk score to prioritize enrollment in case management programs. 7 Such programs can provide intensive and personally tailored care to those PCHCN who are at greatest risk of hospital admissions and who are responsible for the highest costs. 8
An incremental second strategy would be to identify the subgroup of patients who are more likely to benefit from a case management (CM) intervention. For example, a systematic review revealed that adopting CM in the subgroup of heart failure PCHCN produced a greater reduction in unplanned hospital admissions and length of hospital stay compared to patients with other chronic diseases. 9 Similarly, a study found that a CM intervention improved outcomes of patients with chronic heart failure and decreased readmissions and length of hospital stay. 10 A further strategy to increase the effectiveness of CM interventions is the adoption of “impactibility models” that identify, within a cohort of PCHCN, the subgroup of patients at higher risk of events that could potentially be prevented by CM interventions. 7
This allows one to redesign primary care using the framework suggested by Porter to sustain and improve primary care practice. This framework specifies that primary care should be organized around subgroups of patients with similar comorbidities and therefore similar needs and that team-based services should be provided to each patient subgroup over the full care cycle. 11 As an example, a previous study pointed out that the presence of morbidities like chronic kidney disease, chronic obstructive pulmonary disease (COPD), or asthma in heart failure PCHCN was strongly associated with a greater demand for hospital services measured as “at least one hospital admission” and “at least one preventable admission.” 12
Thus, the aim of the present study is to create an impactibility model that uses the ACG System 13 to identify homogeneous clinical subgroups of heart failure patients with high risk of an adverse health outcome. This method will allow policy makers to more efficiently target and prioritize services for the highest risk PCHCN in the context of limited health care resources. In the study, the research team chose to prioritize based on morbidity variables, defining the pathologies that best predict adverse outcomes in this cohort. The 2 adverse health care events on which the team focused are at least 1 hospitalization and at least 1 preventable hospital admission in a year.
Methods
Context
The Italian National Health System (NHS) is a public service financed mainly by general taxation. It is grounded in values of universality, free access, freedom of choice, pluralism in provision, and equity. Regional authorities plan and organize health care facilities and activities through their regional health departments in accordance with a national health plan designed to assure equitable provision of comprehensive care throughout the country. Regional authorities coordinate and control local health units (LHUs), each of which is a separate geographically-based public company delivering public health, promotion, community health services, primary care, and hospital care, either with their own facilities and personnel or through outside contractors. The Veneto Regional Health Service had 21 such LHUs serving a population of about 5 million.
The LHU involved in the present study was the Azienda ex-ULSS4-Veneto, which serves a population of about 190,000 in the province of Vicenza, in northeastern Italy.
The ACG (adjusted clinical groups) system: identification of subjects
The ACG System is a method of population risk stratification that is used internationally to characterize multimorbidity on the strength of routinely collected administrative data that are gathered together using record linkage. 13 This tool has been implemented in the Veneto Region since 2012. 14
Starting from individual-level diagnoses, and based on clinical judgement of the likelihood of certain outcomes (persistence or recurrence over time, demand for specialist services, hospitalization, disability or decline in quality of life, and expected need for and use of diagnostic or therapeutic procedures), the system adjusts for age and sex to create 93 mutually exclusive combinations of conditions (ACGs) that represent clinically logical categories of patients expected to require similar types and levels of health care resources.
Based on their health care resource usage, the ACG System automatically collapses the different ACG categories into 6 Resource Utilization Bands (RUBs), from 0 (nonuser or invalid diagnosis) to 5 (very high morbidity). For the purposes of the present study, only people older than age 65 years in 2012, with a diagnosis of heart failure, residing in the area served by the Azienda ULSS4-Veneto LHU, and characterized as PCHCN corresponding to RUBs 4 and 5 (high morbidity and very high morbidity, respectively) were included.
The ACG system: identification of morbidities
Diagnoses of heart failure and other chronic diseases were established using the Expanded Diagnosis Clusters (EDCs), which correspond to the clinical diagnoses the ACG System assigns to individual patients by combining different information flows. To improve sensitivity of the model, patients with chronic conditions were also identified by means of the information available from the Pharmacy (Rx)-based Morbidity Markers (Groups), or Rx-MGs, and the clinical criteria used to assign medications to morbidity groups. The Rx-MGs provide further methods to describe the unique morbidity profile of a population and form the basis of the pharmacy-based predictive models. 13
The selection and definition of comorbidities to be included is inevitably subjective to some degree and depends strictly on available data. This study focused on a subset of conditions including: cancer, ischemic heart disease, atrial fibrillation, cerebrovascular disease, Alzheimer's disease, depression, asthma/bronchitis, diabetes, COPD, osteoporosis, hypothyroidism, and chronic renal disease. In particular, cases of neoplastic disease, Alzheimer's disease, atrial fibrillation, and cerebrovascular disease were only available from EDC codes.
In order to better identify patients at higher risk, the research team decided to distinguish between patients with and without “gaps” in pharmacy utilization. A gap may occur when the patient has no prescriptions in an appropriate chronic medication drug class, when the patient is classified as potentially untreated, or when an appropriate drug class has been discontinued (eg, asthma with gap). 13 This information is provided by the ACG System and is available for the following diseases: ischemic heart disease, depression, asthma, diabetes, osteoporosis, and hypothyroidism. A dichotomous variable was assigned to each chronic disease, separately for presence and absence of gaps, where applicable (1 if EDC or Rx-MG identifies that clinical condition, 0 otherwise).
Identification of outcomes
Diagnoses and pharmacy records for subjects identified as PCHCN (RUBs 4 and 5) with a diagnosis of heart failure were linked to hospital discharge records and a mortality registry for the year 2013. Two outcomes were defined: (1) at least 1 admission and (2) at least 1 preventable admission, that is, an admission for an “ambulatory care sensitive condition” (ACS) as defined by the Agency for Healthcare Research and Quality in its Prevention Quality Indicators. 15 Specifically, the research team included all discharges with International Classification of Diseases, Ninth Revision, Clinical Modification principal diagnostic codes for bacterial pneumonia, hypovolemia, urinary tract infection, angina, congestive heart failure (CHF), hypertension, asthma, COPD, uncontrolled diabetes, short-term complications of diabetes, and long-term complications of diabetes.
Identification of homogeneous subgroups of patients with higher risk of hospital admission via association rules
Although the ACG System provides a score that predicts the probability of hospitalization in the next 12 months (Probability Inpatient [IP] Hospitalization score), it does not allow one to identify similar groups of patients at higher risk. In fact, patients with higher values of this risk score could actually be a very heterogeneous group.
Hence, the goal of the present study is to define an ACG-based group risk score that identifies homogeneous (in terms of diagnoses of comorbid diseases) subgroups of heart failure patients at higher risk of events, benefitting from information provided by the ACG System (namely the cohort of PCHCN and their morbidities).
To accomplish this aim, the Apriori algorithm, 16,17 a data mining technique used to identify the most frequent sets of “association rules,” was used. An association rule is a relationship of the form X ⇒ Y, where X is called the antecedent and Y the consequent. Specifically, this work focused on association rules in which Y is one of the aforementioned outcomes, while X may represent a single chronic condition or a combination of chronic conditions. Through the Apriori algorithm, one can thus discover the most frequent sets of the most predictive diseases for the outcomes of interest.
Briefly, each association rule can be evaluated by the following indexes:
Support, which is the proportion of patients in the whole PCHCN cohort who present both the disease (or group of diseases) X and the outcome Y.
Confidence, which evaluates the probability that an outcome Y occurs given the presence of a chronic condition or a combination of these conditions X (that is the positive predictive value).
Lift, which is calculated as the ratio between the confidence for the association rule X ⇒ Y and the proportion of patients with the outcome Y. The further the lift exceeds 1, the stronger the association between the chronic condition and the outcome. The lift index can thus provide a “priority” measure for each chronic condition.
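As a minimal illustration of the three indexes, the following sketch computes support, confidence, and lift for a single rule X ⇒ Y. The toy cohort, the disease labels, and the function name rule_metrics are invented for illustration and are not taken from the study data.

```python
def rule_metrics(patients, condition, outcome="admitted"):
    """Return (support, confidence, lift) for the rule condition => outcome.

    `patients` is a list of dicts holding a set of diseases and a boolean outcome.
    Assumes at least one patient has the condition and at least one has the outcome.
    """
    n = len(patients)
    with_x = [p for p in patients if condition in p["diseases"]]
    with_x_and_y = [p for p in with_x if p[outcome]]
    with_y = [p for p in patients if p[outcome]]

    support = len(with_x_and_y) / n               # P(X and Y) in the whole cohort
    confidence = len(with_x_and_y) / len(with_x)  # P(Y | X), the positive predictive value
    lift = confidence / (len(with_y) / n)         # P(Y | X) / P(Y)
    return support, confidence, lift


# Toy cohort (invented): 10 patients, 4 of whom experience the outcome.
toy_cohort = (
    [{"diseases": {"CKD"}, "admitted": True}] * 3
    + [{"diseases": {"CKD"}, "admitted": False}] * 1
    + [{"diseases": {"DM"}, "admitted": True}] * 1
    + [{"diseases": {"DM"}, "admitted": False}] * 5
)

support, confidence, lift = rule_metrics(toy_cohort, "CKD")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.3f}")
```

In this toy cohort, 3 of 10 patients have both the condition and the outcome (support 0.30), 3 of the 4 patients with the condition have the outcome (confidence 0.75), and the outcome rate in the whole cohort is 0.40, giving a lift of about 1.875.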
The Apriori algorithm can be applied in order to discover the combinations of disease that are frequent (the ones with a support above a minimum threshold, here defined at 2%) and predictive of an outcome (ones with a confidence above a minimum threshold defined by the user).
These conditions then can be ranked according to their priority measure: this allows one to discover which groups of patients have priority to be enrolled on the basis of the morbidity from which they suffer. Thus, an ACG-based group risk score can be calculated for each patient on the basis of the positive predictive value (or confidence of the association rule) of the disease from which he or she suffers to predict the outcome.
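The ranking and scoring step can be sketched as follows for single conditions only. This is a brute-force count rather than the full Apriori candidate generation, which is sufficient when X is a single disease. The cohort is invented, and two choices are assumptions not stated in the text: a patient with several ranked diseases is given the highest confidence among them, and a patient with none is given the cohort outcome rate.

```python
def mine_single_condition_rules(patients, min_support=0.02, min_confidence=0.0):
    """Rules {disease} => outcome that pass the support and confidence thresholds."""
    n = len(patients)
    base_rate = sum(p["outcome"] for p in patients) / n
    diseases = set().union(*(p["diseases"] for p in patients))
    rules = {}
    for d in diseases:
        carriers = [p for p in patients if d in p["diseases"]]
        hits = sum(p["outcome"] for p in carriers)
        support = hits / n
        confidence = hits / len(carriers)
        if support >= min_support and confidence >= min_confidence:
            rules[d] = {
                "support": support,
                "confidence": confidence,
                "lift": confidence / base_rate,
            }
    return rules, base_rate


def group_risk_score(patient, rules, base_rate):
    """Assumption: take the highest confidence among the patient's ranked diseases;
    fall back to the cohort outcome rate when none of them is ranked."""
    scores = [rules[d]["confidence"] for d in patient["diseases"] if d in rules]
    return max(scores) if scores else base_rate


# Invented cohort of 100 heart failure patients, 16 with the outcome.
cohort = (
    [{"diseases": {"CKD", "DM"}, "outcome": True}] * 6
    + [{"diseases": {"CKD"}, "outcome": False}] * 4
    + [{"diseases": {"DM"}, "outcome": True}] * 9
    + [{"diseases": {"DM"}, "outcome": False}] * 30
    + [{"diseases": {"HT"}, "outcome": True}] * 1   # support 1%: below the 2% threshold
    + [{"diseases": set(), "outcome": False}] * 50
)

rules, base_rate = mine_single_condition_rules(cohort, min_support=0.02)
ranked = sorted(rules, key=lambda d: rules[d]["lift"], reverse=True)
for d in ranked:
    print(d, {k: round(v, 3) for k, v in rules[d].items()})
```

Here the rare condition is dropped by the 2% support threshold, and the remaining conditions are ordered by lift, which is the priority list used for enrollment.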
Although this algorithm allows the discovery of complex combinations of predictive conditions (such as dyads and triads of diseases), the present work focuses only on a single chronic condition, because the research team did not see any improvement in the predictive performance of the model when considering dyads of diseases.
Validation of ACG-based group risk score
Discrimination of the ACG individual risk score and the ACG-based group risk score was compared using the area under the receiver operating characteristic curve (AUROCC). The research team also created a gain chart (also called “cumulative lift chart”), on which the percentage of patients enrolled in the cohort is shown on the x-axis, while the percentage of patients with the outcome correctly targeted (sensitivity) is shown on the y-axis. This graph helps to determine how many patients should be enrolled to achieve the desired sensitivity, as well as the sensitivity that can be achieved if the number of patients able to be enrolled is fixed by the available resources.
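Both quantities can be computed directly from a list of risk scores and observed outcomes. The sketch below is a plain-Python illustration on invented values; the function names are hypothetical, and the study's own software is not described in the text.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def gain_curve(scores, labels):
    """(fraction enrolled, sensitivity) after enrolling patients by decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    points, captured = [], 0
    for rank, i in enumerate(order, start=1):
        captured += labels[i]
        points.append((rank / len(scores), captured / total_pos))
    return points


# Invented group risk scores (shared by patients in the same subgroup) and outcomes.
scores = [0.6, 0.6, 0.33, 0.33, 0.16, 0.16, 0.16, 0.16]
labels = [1, 0, 1, 0, 0, 0, 1, 0]

auc = auroc(scores, labels)
points = gain_curve(scores, labels)
print(round(auc, 3), points[3])
```

Because a group risk score takes only a few distinct values, many patients are tied; the position of tied patients on the gain curve then depends on how ties are ordered, so the curve is most informative at the boundaries between subgroups.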
Model calibration refers to whether the predicted probabilities agree with the observed ones and was assessed by means of the Hosmer-Lemeshow goodness-of-fit test, and visually by a calibration plot. 18
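For reference, the Hosmer-Lemeshow statistic can be sketched as follows: patients are sorted by predicted probability, split into groups of roughly equal size, and the observed and expected numbers of events and non-events are compared in each group. The binning rule and the toy values are assumptions made for illustration; the statistic is conventionally compared with a chi-square distribution with (number of groups − 2) degrees of freedom.

```python
def hosmer_lemeshow(predicted, observed, groups=10):
    """Hosmer-Lemeshow chi-square statistic over `groups` bins of sorted predictions.

    Assumes every bin has an expected event count strictly between 0 and its size.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events in the bin
        events = sum(y for _, y in chunk)     # observed number of events in the bin
        statistic += (events - expected) ** 2 / expected
        statistic += ((size - events) - (size - expected)) ** 2 / (size - expected)
    return statistic


# Two invented subgroups of 10 patients with predicted risks 0.2 and 0.5.
predicted = [0.2] * 10 + [0.5] * 10
well_calibrated = [1, 1] + [0] * 8 + [1] * 5 + [0] * 5   # 2 and 5 events observed
over_observed = [1] * 4 + [0] * 6 + [1] * 5 + [0] * 5    # 4 and 5 events observed

perfect = hosmer_lemeshow(predicted, well_calibrated, groups=2)
off = hosmer_lemeshow(predicted, over_observed, groups=2)
print(round(perfect, 3), round(off, 3))
```

A small statistic (large P value) indicates no evidence of miscalibration, which is the sense in which the test is used in this study.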
To prevent model overfitting and overly optimistic results, all of the aforementioned analyses were evaluated by randomly splitting the data into a training sample (75%) and a validation sample (25%), using the former to select the most predictive conditions and the latter to assess model performance. A range of plausible values for the estimate of the AUROCC was determined by iterating the procedure of random splitting 100 times and recalculating the AUROCC obtained for each new validation sample.
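The repeated split can be sketched as below on a simulated cohort. Everything in the example is hypothetical: the simulated risks, the helper names, the scoring rule (highest per-disease confidence learned on the training sample), and the choice to summarize the 100 validation AUROCC values by their minimum and maximum, since the text does not say how the range was derived.

```python
import random


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_group_score(train):
    """Learn per-disease confidence on the training sample; score = highest confidence."""
    base = sum(p["outcome"] for p in train) / len(train)
    conf = {}
    for d in set().union(*(p["diseases"] for p in train)):
        carriers = [p["outcome"] for p in train if d in p["diseases"]]
        conf[d] = sum(carriers) / len(carriers)
    return lambda p: max([conf[d] for d in p["diseases"] if d in conf], default=base)


def repeated_split_auroc(patients, fit, n_iter=100, train_frac=0.75, seed=0):
    """Range (min, max) of validation AUROCC over repeated random 75/25 splits."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        shuffled = patients[:]
        rng.shuffle(shuffled)
        cut = int(train_frac * len(shuffled))
        train, valid = shuffled[:cut], shuffled[cut:]
        score = fit(train)                      # conditions are selected on training data only
        labels = [p["outcome"] for p in valid]
        if 0 < sum(labels) < len(labels):       # AUROCC needs both classes present
            results.append(auroc([score(p) for p in valid], labels))
    return min(results), max(results)


def simulate_cohort(n, seed=1):
    """Simulated patients with invented disease prevalences and outcome risks."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        diseases = {d for d in ("CKD", "COPD", "DM") if rng.random() < 0.3}
        risk = 0.25 + 0.2 * ("CKD" in diseases) + 0.15 * ("COPD" in diseases)
        cohort.append({"diseases": diseases, "outcome": rng.random() < risk})
    return cohort


low, high = repeated_split_auroc(simulate_cohort(400), fit_group_score)
print(f"validation AUROCC range over 100 splits: {low:.3f} to {high:.3f}")
```

Fitting inside the loop, on the training portion only, is what keeps the validation estimate free of the optimism the paragraph above is concerned with.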
Subgroup selection for the outcome of at least 1 preventable hospital admission
Finally, the research team also applied the Apriori algorithm to predict the outcome of at least 1 preventable admission in the subsequent year. This is an example of an event a policy maker may be able to prevent using a CM approach.
Once again, the team considered only single diseases present in at least 2% of the cohort, in order to avoid considering rare conditions. Given the list of ranked diseases, the team evaluated each subgroup by reporting, for each single chronic condition, its positive predictive value and its lift index. Also, for each combination of diseases (for which the combination at the kth step is built as the kth disease plus the previous ones in the priority list), sensitivity, specificity, AUROCC, and lift are reported.
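The cumulative construction can be illustrated as follows: at step k, every patient with any of the first k diseases in the priority list is enrolled, and the enrolled group is evaluated as a binary classifier. The cohort and the ranking are invented. Computing the area under the curve of a binary enroll/do-not-enroll rule as the mean of sensitivity and specificity is an assumption here; the text does not state how the AUROCC in each row was obtained.

```python
def cumulative_subgroups(patients, ranked_diseases):
    """For each k, enroll patients with any of the first k diseases and report metrics."""
    total_pos = sum(p["outcome"] for p in patients)
    total_neg = len(patients) - total_pos
    base_rate = total_pos / len(patients)
    rows, enrolled_diseases = [], set()
    for disease in ranked_diseases:
        enrolled_diseases.add(disease)
        enrolled = [p for p in patients if p["diseases"] & enrolled_diseases]
        tp = sum(p["outcome"] for p in enrolled)   # enrolled patients with the outcome
        fp = len(enrolled) - tp                    # enrolled patients without the outcome
        sensitivity = tp / total_pos
        specificity = (total_neg - fp) / total_neg
        ppv = tp / len(enrolled)
        rows.append({
            "added": disease,
            "enrolled_fraction": len(enrolled) / len(patients),
            "sensitivity": sensitivity,
            "specificity": specificity,
            "ppv": ppv,
            "lift": ppv / base_rate,
            # Assumption: AUC of a single binary rule = (sensitivity + specificity) / 2.
            "auc": (sensitivity + specificity) / 2,
        })
    return rows


# Invented cohort of 100 heart failure patients, 16 with a preventable admission.
cohort = (
    [{"diseases": {"CKD", "DM"}, "outcome": True}] * 6
    + [{"diseases": {"CKD"}, "outcome": False}] * 4
    + [{"diseases": {"DM"}, "outcome": True}] * 9
    + [{"diseases": {"DM"}, "outcome": False}] * 30
    + [{"diseases": set(), "outcome": True}] * 1
    + [{"diseases": set(), "outcome": False}] * 50
)

rows = cumulative_subgroups(cohort, ["CKD", "DM"])
for row in rows:
    print({k: (round(v, 3) if isinstance(v, float) else v) for k, v in row.items()})
```

Each added disease enlarges the enrolled group, so sensitivity rises while specificity and positive predictive value fall; the rows make that trade-off explicit for a fixed enrollment capacity.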
Results
A total of 185,887 persons were resident in the LHU4–Veneto Region in 2012, of whom 39,643 individuals were older than age 64 years. The ACG System classified 2691 subjects older than age 64 years as PCHCN; 2250 (84%) were RUB 4 and 441 (16%) were RUB 5 in 2012. Of these, 1690 (62.8%) had a diagnosis of CHF, and 1225 (72.5%) were still alive at the beginning of 2013 and constitute the study cohort. Of these, 481 patients (39.3%) had at least 1 hospital admission in 2013, while 185 (15.1%) had at least 1 preventable hospital admission.
The performance of the ACG-based group risk model in predicting the outcome of at least 1 hospital admission, as measured using AUROCC, was compared with the ACG risk index in the validation set and is shown in Figure 1. There was no statistically significant difference in the area under the 2 curves. Although both methods discriminate poorly on the validation set, one must consider that this study is focusing on a cohort of patients who are characterized by a particularly high risk of hospital admission and only 39.3% actually experience this outcome.

Figure 1. Comparison of the empirical receiver operating characteristic curves of the ACG-based Group Risk Score and ACG individual risk score in predicting the occurrence of the outcome of at least 1 hospital admission. ACG, Adjusted Clinical Groups.
The lift chart (Figure 2) shows that for the same percentages of patients enrolled, the model achieved a slightly lower sensitivity. A sensitivity equal to 59% was achieved by enrolling 41% of PCHCN with the model, while by selecting the same number of patients according to ACG risk score, a sensitivity equal to 65% was achieved.

Figure 2. Lift chart: comparison of the sensitivities achieved by ACG-based Group Risk Score and ACG individual risk score in predicting the occurrence of the outcome of at least 1 hospital admission on the basis of the same percentages of patients enrolled. ACG, Adjusted Clinical Groups.
Despite the relatively poor discrimination, the Hosmer-Lemeshow test showed the model to have good calibration. In particular, by splitting the expected probabilities into 5, 10, and 15 groups, the research team obtained statistics of 7.04 (P = 0.071), 9.03 (P = 0.340), and 13.07 (P = 0.442), respectively. In the calibration plot in Figure 3, the expected versus observed probabilities, divided into deciles (10 groups), are reported. Observed probability of hospitalization was consistently lower than predicted.
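As a worked check, the reported P values are consistent with comparing each statistic against a chi-square distribution with (number of groups − 2) degrees of freedom. That the authors used this convention is inferred from the numbers, not stated in the text. The sketch below uses the closed-form upper tail of the chi-square distribution for integer degrees of freedom; the function name chi2_sf is hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df = 2m + 1: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k-1/2) / Gamma(k+1/2)
    series = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


# Statistics reported in the text for 5, 10, and 15 groups.
reported = {5: 7.04, 10: 9.03, 15: 13.07}
p_values = {groups: chi2_sf(stat, groups - 2) for groups, stat in reported.items()}
for groups, p in p_values.items():
    print(f"{groups} groups: statistic {reported[groups]}, df {groups - 2}, P = {p:.3f}")
```

All three P values exceed 0.05, so none of the groupings gives evidence of miscalibration at the conventional threshold, although the 5-group result is the closest to it.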

Figure 3. Calibration plot for the ACG-based Group Risk Score in predicting the outcome of at least 1 admission on the validation set. Deciles of expected probabilities have been plotted versus the observed probabilities. ACG, Adjusted Clinical Groups.
Regarding the outcome of at least 1 preventable admission in the year, Table 1 reports the full list of frequent and most predictive chronic pathologies sorted by lift index, including each one's positive predictive value and lift index. For each subgroup of patients defined at each row, the same indexes are reported, along with sensitivity, specificity, and AUROCC. For example, by prioritizing enrollment of patients with asthma with no medication gaps, chronic kidney disease, or COPD in a program to reduce readmissions, the team would enroll 27.9% of the PCHCN identified by ACG and capture 39% of those with at least 1 preventable admission in the year. Figure 4 reports the observed vs expected probabilities graph and indicates good calibration, confirmed by P = 0.166 for the Hosmer-Lemeshow test. Again, observed probability of hospitalization was consistently lower than predicted.

Figure 4. Calibration plot for the ACG-based Group Risk Score in predicting the outcome of at least 1 preventable admission on the whole cohort of patients with complex health care needs. Deciles of expected probabilities have been plotted versus the observed probabilities. ACG, Adjusted Clinical Groups.
Table 1. Pathologies Sorted in Decreasing Order by Lift Index in Predicting the Outcome of at Least 1 Preventable Admission
Performance measures have been reported both for each chronic disease and for the combination of these. At the kth row, the subgroup of patients enrolled comprises those affected by any of the first k diseases, along with heart failure.
AD, Alzheimer's disease; AF, atrial fibrillation; ASTH, asthma; AUC, area under the curve; CAN, cancer; CHD, coronary heart disease; CKD, chronic kidney disease; COPD, chronic obstructive pulmonary disease; CVD, cerebrovascular disease; DD, depressive disorder; DM, diabetes mellitus; HT, hypothyroidism; NPV, negative predictive value; PPV, positive predictive value.
Discussion
This study has demonstrated an approach to identifying highly homogeneous subgroups of patients with a high probability of experiencing an adverse health care event, such as a hospitalization or readmission, using the ACG System. By identifying a small number of homogeneous groups of patients, CM teams can tailor approaches to the same handful of conditions in these high-risk heart failure patients, rather than having to develop an ad hoc approach for each patient because they have different combinations of comorbidities.
The research team evaluated the performance of the ACG-based Group Risk Score in a population of PCHCN with a diagnosis of heart failure, comparing it with the individual risk score provided by the ACG System. An individual risk stratification system such as the one implemented in the ACG System produces an individual risk score for each patient, while a simple method such as the one proposed herein allows health systems to identify clinically homogeneous groups who suffer from a given chronic disease or diseases.
Although the simplicity of the present study's approach is an advantage, it comes at the cost of lower sensitivity compared with the ACG individual risk model. Furthermore, both methods showed only modest discrimination based on the AUROCC. It must be considered, however, that the research team tried to discriminate which patients would experience at least 1 admission in the subsequent year within a cohort in which all patients had been labeled as PCHCN, had a diagnosis of heart failure, and thus had a high risk of admission.
Nevertheless, the selection via ACG-based Group Risk Score had excellent calibration and led to clinically homogeneous subgroups of patients being identified. In addition, examination of the gain charts (which display the sensitivity with respect to the proportion of the population enrolled) demonstrated that the 2 methods have similar performance in terms of “impact.”
Although health care costs are proportional to the proportion of the population requiring services, they also are increased when services are not standardized. 19 By identifying a small number of homogeneous patient subgroups that predict risk for hospitalization or preventable hospitalization, such as patients with heart failure and chronic kidney disease, standardized tool kits and protocols can be developed to facilitate consistent, evidence-based approaches to care management.
As Sikka has pointed out, the Quadruple Aim of the health care system should be improving the individual experience, improving overall health, improving the experience of providing care, and reducing the cost of providing care. 20 Moreover, the present study's approach allows one to develop models useful to predict other outcomes of interest (eg, multiple different kinds of hospitalizations, death).
The research team found that the best predictors for the occurrence of at least 1 preventable admission were asthma without gaps, chronic kidney disease, and COPD. These diagnoses are ACSs for which good primary care could potentially prevent hospitalization and reduce complications. While it may seem counter-intuitive that asthma patients without a gap in medication have a higher likelihood of admission, this could represent patients with frequent or daily symptoms and more severe asthma who are therefore less likely to omit doses or refills of their medication. Hospital admissions for ACSs are in fact indicators of reduced or poor access to primary care services and can be used as a proxy measure for the quality of primary care received. A study by Caminal and colleagues identified the aspects of primary health care that were most responsible for preventing hospitalization, which included primary prevention, early detection and monitoring of acute episodes, and follow-up and monitoring of chronic conditions. 21
All of these can be implemented by means of a targeted CM program aimed at early detection of symptoms related to the onset of disease, appropriate treatment once the disease is diagnosed, and adequate monitoring to avoid or delay the occurrence of acute and chronic complications and resulting hospitalizations. The ACG Group Risk Score methodological approach could be used by policy makers to tailor health care services to address the needs of specific homogeneous groups of patients at higher risk for resource consumption.
In fact, identifying high-risk individuals who suffer poor health outcomes, who will have high utilization in the future, and whose utilization patterns can be reduced through intervention is one way to improve outcomes and reduce overall costs. An additional challenge is identifying those patients with “rising risk” for whom CM interventions may have a positive impact. Such identification would enable one to organize health care services with a tertiary preventive approach, for example by adopting homogeneous care approaches such as care pathways, recommended and contraindicated medications, and frequency of monitoring, with the aim of maximizing the value of coordinated care and minimizing the need for high-cost interventions such as emergency department visits and hospitalizations. Current approaches to CM often fail to demonstrate economic value across an entire managed care membership.
This is not surprising because health care management generally does not systematically target the care of the relatively small number of members who consume a disproportionate share of medical resources. 22 In fact, new population-oriented care models, which differ from the conventional approach of “crisis-oriented care,” could seek to provide care for patients through better use of information technology, evidence-based care, and an integrated approach among health and social workers, as summarized in a previous review. 23 This has the capacity to enhance management of the health of populations by identifying potentially high-risk patients, with the aim of achieving all aspects of the Quadruple Aim: reduced cost, improved outcomes, improved experience of patients, and motivation of health care professionals. 22,24–26
However, all strategies used to improve the impact of risk stratification programs and impactibility models act as population screening. No screening test, risk stratification tool, or impactibility model is ever completely accurate; therefore, it is important to consider the adverse impact of false positives and false negatives.
The problem with false positive results is that the individuals concerned are offered an intervention to prevent an event that they were not actually going to experience. As a result, the preventive intervention would be “wasted” on these people and the resources would have been better spent elsewhere. Moreover, these individuals might experience needless anxiety from being wrongly told that they were high risk, and they also might be subject to over-investigation or overtreatment. In contrast, the difficulties associated with false negative results are related to unwarranted reassurance. 27 However, this approach may help reduce health care inequalities because ACSs are more prevalent in more deprived areas. 28
Footnotes
Author Disclosure Statement
The authors declare that there are no conflicts of interest. The authors received no financial support for this study.
