Abstract
Individuals infected with HIV who are out of care are at a higher risk of HIV-related morbidity and mortality. It has been difficult to recruit a representative sample of out-of-care patients for epidemiological studies. Using a novel weighting method, we constructed a representative sample of out-of-care HIV patients from a representative sample of in-care patients. In-care patients were weighted based on the probability of receiving care during the study period and the probability of selection to participate in the study, and out-of-care patients were represented by those who were previously out of care and recently returned. The method can be used in other patient populations, if every patient in the population has a known, non-zero probability of receiving care and a known, non-zero probability of participating in the study.
Antiretroviral therapy (ART) can significantly improve the length and quality of life for HIV-infected persons.1 Despite the wide availability of HIV treatment in the United States, many HIV-infected persons are not in care or receiving ART.2–9 The Centers for Disease Control and Prevention (CDC) estimated that only 65.8% of 1,148,200 persons living with HIV in 2009 in the United States were linked to care and 36.7% were retained in care.10 Persons who are out of care are at a higher risk of HIV-related morbidity and mortality and of transmitting the virus to others because of unsuppressed HIV viral load.11,12
To better engage the out-of-care population, we need to better understand them; to better understand this population, we need a representative sample. However, without out-of-care patients’ updated contact information, it is difficult, if not impossible, to construct a sampling frame and draw a representative sample. Without a representative sample, studies have relied on a convenience sample, e.g., out-of-care patients who were reachable by outreach staff, or a subset of the population, e.g., newly diagnosed patients who delayed their initiation of care or who were linked to care but later dropped out.13–16 Such studies have reported that out-of-care patients were more likely to be people of color and injection drug users. However, the findings from these studies may not be generalizable.
Weighted analysis has been widely used in population-based surveys and research studies when study subjects have a known, non-zero probability of selection.17,18 Each participant is given a weight, which is the inverse of the probability of being sampled, to adjust for unequal selection probability. When a study of limited duration enrolls patients who present for care in a clinical setting, patients receiving regular care will have a higher probability of selection and patients receiving sporadic care will have a lower probability. Given the natural history of untreated HIV and the wide availability of health care generally and HIV treatment specifically in developed countries,3,19 almost all patients would eventually seek care, meaning that all patients, including out-of-care patients, have a non-zero probability of participating in the study. Patients who have been out of care for a longer period will have a lower probability, and patients who have been out of care for a shorter period will have a higher probability, of being included in the study. Out-of-care patients can therefore be represented by those who were previously out of care and recently returned, after adjusting for unequal selection probability.
Methods
Study design
We used the data from the New York City Medical Monitoring Project (NYC MMP), a supplementary surveillance project sponsored by the CDC to provide representative, population-based data on clinical status, care, outcomes, and behaviors of HIV-infected persons receiving care in the United States. MMP uses a three-stage sampling design to obtain a nationally representative sample of HIV-infected adults who are in care. The methods have been previously described elsewhere in detail.20,21 Briefly, health jurisdictions in the United States were randomly selected based on the number of persons living with acquired immunodeficiency syndrome (AIDS) in the jurisdiction, and NYC was one of the 23 jurisdictions selected. Then, patients were selected through a two-stage process at participating jurisdictions. HIV medical facilities in NYC were randomly selected to participate in MMP with a probability proportional to size based on an estimated patient load (EPL) – an estimate of the number of HIV-infected adult patients served during a pre-determined time period. NYC MMP staff compiled a list of HIV-infected adults who received medical care at participating facilities during the population definition period (PDP), a predetermined period of time which defines the population of inference, and CDC then randomly selected patients from the list in a manner that resulted in an equal probability of selection at the patient level in NYC. Face-to-face interviews, or telephone interviews if face-to-face interviews were not feasible, were conducted to collect information on demographics, health status, behavioral risk factors, and adherence to HIV medication regimens, and patients’ medical records were abstracted to collect information on prescription of ART, comorbidities, and health service utilization. Patients were also matched to the NYC HIV registry to collect basic demographic data.
Weighting and construction of an out-of-care patient sample
Base weight
The NYC MMP 2012 cycle included patients who received medical care from an NYC HIV medical facility during the PDP, 1 January to 30 April 2012. The base weight is the inverse of the probability that a patient was selected to participate in MMP, given that the patient was receiving care during the PDP. As patients were randomly selected in a manner that resulted in an equal probability of selection at the patient level in NYC, patients who visited only one HIV medical facility, once or multiple times, during the PDP would receive a base weight of 1, and patients who visited more than one HIV medical facility would receive a base weight equal to the inverse of the number of facilities that they visited during the PDP. Information on the number of HIV medical facilities that a patient visited during the PDP was obtained from HIV registry data. Ineligible patients, e.g., persons younger than 18, who were included in a facility’s patient list and sampled would make eligible patients less likely to be sampled. Therefore, the base weight of participants from that facility would be adjusted to the number of sampled patients (eligible + ineligible patients) divided by the number of eligible patients.
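The base weight rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study: the function and parameter names are hypothetical, and the text does not say how the multiple-facility rule and the eligibility adjustment combine, so the sketch assumes they are multiplied.

```python
def base_weight(n_facilities_visited: int, n_sampled: int, n_eligible: int) -> float:
    """Illustrative base weight for one sampled patient.

    n_facilities_visited: HIV medical facilities the patient visited during the PDP.
    n_sampled: patients sampled from the facility (eligible + ineligible).
    n_eligible: eligible patients among those sampled from the facility.
    """
    if n_facilities_visited < 1:
        raise ValueError("a sampled patient must have visited at least one facility")
    if n_eligible < 1 or n_sampled < n_eligible:
        raise ValueError("need 1 <= n_eligible <= n_sampled")
    # A patient seen at k facilities has k chances of selection, hence weight 1/k.
    multiplicity = 1.0 / n_facilities_visited
    # Ineligible patients on the list reduce the chance that eligible ones are sampled.
    eligibility_adjustment = n_sampled / n_eligible
    # Assumption: the two adjustments are combined by multiplication.
    return multiplicity * eligibility_adjustment
```

Under these assumptions, a patient seen at two facilities, neither of which had ineligible patients sampled, would receive a base weight of 0.5.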
PDP weight
We used the presence of any CD4 cell count or viral load report in the NYC HIV registry to indicate a care visit, and an in-care patient was defined as having at least one care visit in a 12-month period between 1 May 2011 and 30 April 2012. The PDP weight is the inverse of the probability that a patient received care during the four-month PDP, given that the patient was in care (≥1 CD4 cell count/viral load test between 1 May 2011 and 30 April 2012). The time interval between the last care visit prior to 2012 and the first visit in 2012 was used to calculate the probability.
If the time interval was ≤4 months, meaning that the patient was receiving care very frequently and was certain (probability of 100%) to be included in the sampling frame, the patient would receive a PDP weight of 1. If a patient’s time interval was >4 months but ≤1 year, meaning that the patient was receiving care less frequently and had a lower probability of being included in the sampling frame, the patient would receive a PDP weight equal to the time interval in months divided by four, which is the length of the PDP in months. If a patient’s time interval was >1 year, meaning that the patient had been out of care and returned for care in 2012, the patient would receive a PDP weight of 3 because the patient could have returned at any time during the year and had a 1/3 chance of receiving care during the four-month PDP. If there was no care visit prior to 2012, meaning that the patient was either newly diagnosed in NYC, previously diagnosed in NYC but never linked to care, or previously diagnosed outside of NYC and recently moved into NYC, the patient would also receive a PDP weight of 3 because he or she also had a 1/3 chance of receiving care during the four-month PDP.
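The piecewise PDP weight rule can be written as a short function. The sketch below is a hedged illustration of the rule as described, with hypothetical names; `None` is used here to stand for a patient with no care visit prior to 2012.

```python
from typing import Optional


def pdp_weight(interval_months: Optional[float], pdp_months: float = 4.0) -> float:
    """Illustrative PDP weight from the gap between the last care visit
    before 2012 and the first visit in 2012, measured in months.

    interval_months is None when there was no care visit prior to 2012.
    """
    year_months = 12.0
    if interval_months is None:
        # No prior visit: 1/3 chance of care during the four-month PDP.
        return year_months / pdp_months
    if interval_months < 0:
        raise ValueError("interval_months must be non-negative")
    if interval_months <= pdp_months:
        # Frequent care: certain to appear in the sampling frame.
        return 1.0
    if interval_months <= year_months:
        # Less frequent care: weight is the interval divided by the PDP length.
        return interval_months / pdp_months
    # Returning after more than a year: weight capped at 12 / 4 = 3.
    return year_months / pdp_months
```

For example, a patient with an eight-month gap between visits would receive a PDP weight of 2, and the rule is continuous at 12 months, where both branches give 3.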
In-care weight
Every HIV patient in NYC had a non-zero probability of receiving care in 2012, because all HIV patients would eventually seek care given the natural history of untreated HIV and the wide availability of health care generally and HIV treatment specifically in NYC.3,19 The in-care weight is the inverse of the probability that an HIV patient would receive care during the one-year study period, given that the patient lived in NYC. The probability was calculated based on the time interval between the last care visit prior to 2012, or the date of diagnosis if the patient was previously diagnosed in NYC with no care visits prior to 2012, and the first care visit during the PDP, 1 January to 30 April 2012.
If a patient’s time interval was ≤1 year, meaning that the patient was in care (≥1 care visit in a year) and was certain (probability of 100%) to receive care during the one-year study period, the patient would receive an in-care weight of 1. If a patient was previously diagnosed outside of NYC and recently moved into NYC with no CD4 cell count/viral load tests prior to 2012 in the NYC HIV registry, the patient would also receive an in-care weight of 1. If a patient’s time interval was greater than one year, the patient would receive a weight equal to the time interval in years. For example, if a patient had his last care visit exactly three years prior to his first care visit in 2012, he would receive a weight of 3, and the patient would represent not only himself but also two other patients who did not receive care between 1 May 2011 and 30 April 2012 and were out of care.
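The in-care weight rule can be expressed as a small function. This is a minimal sketch with hypothetical names; the flag for patients who moved into NYC without prior registry tests is an illustrative way to encode the special case described above.

```python
from typing import Optional


def in_care_weight(interval_years: Optional[float], moved_into_nyc: bool = False) -> float:
    """Illustrative in-care weight.

    interval_years: years between the last care visit before 2012 (or the
        NYC diagnosis date if there was no prior visit) and the first care
        visit during the PDP.
    moved_into_nyc: True for a patient diagnosed outside NYC with no prior
        CD4 cell count/viral load tests in the NYC registry.
    """
    if moved_into_nyc:
        # No NYC history to measure an interval from: weight of 1.
        return 1.0
    if interval_years is None or interval_years < 0:
        raise ValueError("a non-negative interval is required")
    # In care (interval <= 1 year) gives 1; otherwise the interval in years.
    return max(1.0, interval_years)
```

A patient returning after exactly three years would receive `in_care_weight(3.0)`, that is 3, matching the example in the text.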
Construction of an out-of-care patient sample
Patients who were previously out of care and recently returned represent not only themselves, who are now in care, but also patients who are still out of care. To account for the different care status, we split each record with an in-care weight greater than 1 into two: one with an in-care weight of 1 representing the patient himself or herself, and the other with the original weight minus 1 representing the out-of-care patients. Using the previous example, the patient with a weight of 3 would be split into two records, one with an in-care weight of 1 and a care status of ‘in-care’, and the other with an in-care weight of 2 and a care status of ‘out-of-care’.
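The record-splitting step can be illustrated as follows. The `Record` type and its field names are hypothetical and carry only the fields needed for the illustration; a real analysis file would carry the other weights and patient characteristics along with each copy.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Record:
    patient_id: str
    in_care_weight: float
    care_status: str = "in-care"


def split_record(rec: Record) -> List[Record]:
    """Split a record with in-care weight > 1 into an in-care part
    (weight 1) and an out-of-care part (original weight minus 1)."""
    if rec.in_care_weight <= 1.0:
        # Patient was in care all along: nothing to split.
        return [rec]
    in_care = replace(rec, in_care_weight=1.0, care_status="in-care")
    out_of_care = replace(
        rec,
        in_care_weight=rec.in_care_weight - 1.0,
        care_status="out-of-care",
    )
    return [in_care, out_of_care]
```

Applied to the example above, `split_record(Record("A", 3.0))` yields one ‘in-care’ record with weight 1 and one ‘out-of-care’ record with weight 2.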
By doing this, we constructed a representative sample of the HIV patient population, including both in-care and out-of-care patients, from a representative sample of the in-care patient population. In-care patients were represented by the entire MMP sample, and out-of-care patients were represented by the subset of the sample who were previously out of care and returned for care during the PDP after an absence from HIV care of more than one year.
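Once records carry a care status and a weight, the weighted proportion of out-of-care patients is a ratio of weight sums. The sketch below shows only this point estimate, using hypothetical names; it does not reproduce the design-based variance estimation that the survey procedures provide.

```python
from typing import Sequence, Tuple


def out_of_care_proportion(records: Sequence[Tuple[float, str]]) -> float:
    """Weighted proportion of out-of-care patients.

    records: (weight, care_status) pairs, one per record after splitting.
    """
    total = sum(weight for weight, _ in records)
    if total <= 0:
        raise ValueError("total weight must be positive")
    out_of_care = sum(weight for weight, status in records if status == "out-of-care")
    return out_of_care / total
```

With two in-care records of weight 1 and one out-of-care record of weight 2, the estimated proportion would be 2/4 = 0.5.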
Non-response weight
MMP collects minimum data (sex, age, race/ethnicity, transmission risk, and most recent CD4 cell count) from the local HIV registry on each patient sampled. In NYC, some facilities refused to provide the names of sampled patients for minimum data abstraction if the patient chose not to participate in MMP. For this analysis, patients with minimum data collected were considered respondents, and those without were considered non-respondents. We used the facility information to calculate the non-response weight, because the only information on these non-respondents was the facility where they received care. Patients who received care at facilities where all sampled patients’ names were released to the MMP team for minimum data abstraction would receive a non-response weight of 1. Patients who received care at facilities where only the names of patients who participated in MMP interviews and/or medical record abstractions were released for minimum data collection would receive a non-response weight equal to the number of sampled patients divided by the number of MMP participants at that facility. For example, if 10 patients were sampled from one facility but the names of only eight were given to the MMP team for minimum data abstraction, each of these eight patients would receive a non-response weight of 1.25 (10/8 = 1.25). Patients who did not match any record in the NYC HIV registry were also considered non-respondents for this analysis.
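The facility-level non-response weight is a simple ratio, sketched below with hypothetical names. The second argument stands for the number of sampled patients whose names the facility released for minimum data abstraction.

```python
def non_response_weight(n_sampled: int, n_released: int) -> float:
    """Illustrative facility-level non-response weight: sampled patients
    divided by patients whose names were released for minimum data
    abstraction. Equals 1 when all sampled names were released."""
    if n_released < 1 or n_released > n_sampled:
        raise ValueError("need 1 <= n_released <= n_sampled")
    return n_sampled / n_released
```

For the example in the text, `non_response_weight(10, 8)` gives 1.25.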
Final weight
The product of the above four weights was computed to give each record an initial weight, which was then adjusted to the unweighted sample size to give each record a final weight. The unweighted sample size is the number of respondents whose minimum data were available for analysis. Because patients had an equal probability of selection at the patient level in NYC and the participation rate for minimum data collection was high (98.4%), no post-stratification weights were assigned. The six assigned weights are summarized in the box below.
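The composition of the final weight can be sketched as follows. The names are hypothetical, and the sketch assumes that "adjusted to the unweighted sample size" means rescaling the initial weights so that they sum to the number of respondents; the text does not spell out the rescaling.

```python
import math
from typing import List, Sequence


def initial_weight(base: float, pdp: float, in_care: float, non_response: float) -> float:
    """Product of the four component weights for one record."""
    return math.prod([base, pdp, in_care, non_response])


def final_weights(initial: Sequence[float], n_respondents: int) -> List[float]:
    """Rescale initial weights so they sum to the unweighted sample size.

    Assumption: a single scale factor n_respondents / sum(initial) is
    applied to every record, including split out-of-care records.
    """
    total = sum(initial)
    if total <= 0:
        raise ValueError("initial weights must sum to a positive value")
    scale = n_respondents / total
    return [w * scale for w in initial]
```

Rescaling by a common factor leaves weighted proportions unchanged, so under this assumption it affects only the weighted counts reported.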
Measures
We use patients who were previously out of care, recently returned to care, and are now in care to represent patients who are currently out of care. However, out-of-care patients may be ‘healthier’ than their representative in-care patients, who recently returned to care because of sickness or other reasons. We made an adjustment when estimating the CD4 cell count of out-of-care patients by adding back 50 cells/mm3 per year.22 No adjustment was made when estimating viral suppression status because (1) if representative in-care patients were virally suppressed, their represented out-of-care patients were very likely to have the same status and be suppressed, and (2) if representative in-care patients were virally unsuppressed, their represented out-of-care patients could be suppressed or unsuppressed and, to be conservative, we assumed them all to be unsuppressed, the same status as their representative in-care patients.23
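The CD4 adjustment can be sketched as a linear add-back. The text does not state which interval "per year" is counted over, so the `years` parameter below is a hypothetical input left to the analyst, and the function name is illustrative.

```python
CD4_ADD_BACK_PER_YEAR = 50.0  # cells/mm3 per year, the value given in the text


def adjusted_cd4(observed_cd4: float, years: float) -> float:
    """Illustrative CD4 estimate for a represented out-of-care patient.

    observed_cd4: CD4 cell count of the representative in-care patient.
    years: number of years to add back (assumption: supplied by the
        analyst, since the text does not define the interval).
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return observed_cd4 + CD4_ADD_BACK_PER_YEAR * years
```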
Statistical analysis
We described the characteristics of HIV patients in NYC by their care status. The survey procedures in SAS were used for weighted analysis while taking clustering into account, because patients selected from the same facility tend to have similar characteristics. Statistical significance was determined by the Rao-Scott modified Chi-square test at the significance level of 0.05.
MMP was considered to be not human subject research but a routine disease surveillance activity by CDC, and as such did not require Institutional Review Board (IRB) review at CDC. However, at each participating jurisdiction, the determination was made locally, and in NYC the study protocol was reviewed and approved by the IRBs at the NYC Department of Health and Mental Hygiene and participating facilities at which an IRB review and approval was required.
Results
In the NYC MMP 2012 study cycle, among 35 facilities selected, five were later found to be ineligible because they did not provide HIV medical care and four refused to participate. A total of 800 patients were randomly selected from 26 participating HIV facilities. Thirty-six patients were excluded from the analysis, including two who were HIV-negative, one younger than 18, five with no matching record in the NYC HIV registry, 13 with no identifiers provided by their HIV facility, and 15 with no evidence of receiving HIV care in NYC during 2011–2012. One patient was sampled twice from two facilities and was de-duplicated, with one record kept in the dataset. The final sample size was 763.
Characteristics of the study sample.
CI: confidence interval; HIV: human immunodeficiency virus; IDU: injection drug users; MSM: men who have sex with men.
Sum of weighted estimates may not equal total due to rounding.
Estimated proportion of persons living with HIV in New York City who were out of care in 2012, by characteristics.
CI: confidence interval; HIV: human immunodeficiency virus; IDU: injection drug users; MSM: men who have sex with men.
Sum of weighted estimates may not equal total due to rounding.
p values cannot be computed when at least one table cell has 0 frequency. p value for race/ethnicity was calculated by excluding other/unknown group, and p value for age was calculated by combining 18–24 and 25–34 age groups.
Estimated CD4 cell count and viral load values of persons living with HIV in New York City in 2012, by care status.
CI: confidence interval; HIV: human immunodeficiency virus.
19 patients missing CD4 cell counts and 17 missing viral load values.
Discussion
If every patient would seek care with a non-zero probability, statistical weighting can be used to construct a representative sample of out-of-care patients from a representative sample of in-care patients. Out-of-care patients are represented by a subset of the study sample: those who were previously out of care and recently returned for care. Using an HIV patient sample, we demonstrated the method, constructed a representative sample of out-of-care patients, and estimated that 5.3% (95% CI: 3.0%, 7.6%) of persons living with diagnosed HIV in NYC were out of care, with no differences by sex, age, race/ethnicity, transmission risk or year of diagnosis. Similar findings were observed in an analysis based on population-based HIV surveillance data, with 9.0% (95% CI: 8.8%, 9.2%) out of care and no differences by those characteristics.24 We did not find any differences in CD4 cell count or viral suppression between in-care and out-of-care patients, likely because of a lack of statistical power, as only 27 study subjects represented out-of-care patients. On the other hand, the fact that only 27 patients who were previously out of care were included in a random sample of 763 in-care patients indicates a low proportion of out-of-care patients in NYC.
The method has several limitations. First, the newly developed weighting method may overestimate the proportion of out-of-care patients. To take an extreme case, if all previously out-of-care patients were brought back to care during the one-year study period, so that no out-of-care patients remained in the jurisdiction, the method would still produce a non-zero estimate of out-of-care patients. However, this may not be a major concern because (1) return-to-care efforts, no matter how comprehensive and elaborate, are unlikely to have an immediate, dramatic impact,7,25 and (2) the estimate of out-of-care patients could be validated by comparing it with estimates from earlier years or estimates from other methods.24,26
Second, the representative sample of out-of-care patients constructed from patients who were previously out of care enables us to make inferences regarding some characteristics of the out-of-care patient population, e.g., basic demographics and chronic conditions, but not all. Many previously out-of-care patients returned to care because of a recent change in their lives, e.g., an acute illness or a change of health insurance status. We assumed no such changes when we used the representative in-care patients to estimate the characteristics of the patients who are currently out of care.
In summary, we demonstrated that a weighting method can be used to construct a representative sample of out-of-care patients from a representative sample of in-care patients by weighting participants based on their care visit time interval and selection probability. We described the method using an HIV population. The method can be used in other patient populations as well, if every patient in the population has a non-zero probability of receiving care and a non-zero probability of selection to participate in the study, and the two probabilities can be estimated.
Footnotes
Acknowledgements
The authors would like to acknowledge Jacek Skarbinski of the CDC for his contributions to the national MMP, and Kent Sepkowitz, John Rojas, Jay Varma, and James Hadler of the New York City Department of Health and Mental Hygiene (NYC DOHMH) for their review and comments on this paper.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was supported in part by a Cooperative Agreement with the Centers for Disease Control and Prevention (CDC), PS09-937, #5U62PS001598-04.
