Test–Retest, Responsiveness, and Minimal Important Change of the Ability to Perform Physical Activities of Daily Living Questionnaire in Individuals with Type 2 Diabetes and Obesity

Abstract

Background:

The Ability to Perform Physical Activities of Daily Living Questionnaire (APPADL) measures the self-reported ability of individuals with type 2 diabetes mellitus (T2DM) and obesity to perform daily physical activities. The primary objective of this study was to estimate APPADL test–retest reliability, responsiveness, and minimal important change (MIC).

Subjects and Methods:

Study participants were individuals with T2DM and body mass index ≥30 kg/m² enrolled in clinical weight loss programs in the United States. Data were obtained for clinical measures, APPADL, and other patient-reported instruments. APPADL test–retest reliability was estimated with intraclass correlation coefficient. To estimate responsiveness in a subgroup of participants, baseline and 6-month data were analyzed using paired t test and calculation of responsiveness indices (e.g., effect size [ES]). To estimate MIC, both distribution-based and anchor-based methods were used.

Results:

Test–retest data for 106 study participants (mean age, 52 years; 69% female; 31% white; mean body mass index, 38 kg/m²) yielded an intraclass correlation coefficient of 0.91. In the subgroup (n=40) used to estimate responsiveness, weight was significantly less at end point than at baseline (mean, 222.0 vs. 231.9 pounds; P<0.001, ES=0.24), and APPADL scores were significantly better than at baseline (mean, 77.0 vs. 70.8; P=0.01, ES=0.32). Results of distribution- and anchor-based methods to establish MIC suggest values of 6–14 points (0–100 scale).

Conclusions:

The APPADL has demonstrated reliability and validity. In addition, it has demonstrated responsiveness to weight loss in individuals with T2DM and obesity, thereby making it a potentially valuable tool in the evaluation of weight loss interventions (e.g., antihyperglycemic medications that produce weight loss) targeted toward patients with T2DM.

Introduction

In a previous article published in Diabetes Technology & Therapeutics, Hayes et al.¹ reported on the development and preliminary validation of a new patient-reported outcome (PRO) measure, the Impact of Weight on Activities of Daily Living Questionnaire (IWADL). The IWADL (renamed and subsequently referred to as the Ability to Perform Physical Activities of Daily Living Questionnaire [APPADL]) was designed to assess the self-reported ability to perform daily physical activities of individuals with type 2 diabetes mellitus (T2DM) and obesity.¹ As antihyperglycemic medications (e.g., incretin mimetics) that not only produce improvements in glycemic control but also weight loss became available in the United States, the question arose as to what benefits may be directly attributed to this weight loss from the patient perspective. In a qualitative study involving interviews with 30 individuals with T2DM and obesity (body mass index [BMI], 30–35 kg/m²), participants agreed that a 5% weight loss, while not reflective of their ultimate goal, would be meaningful. Their expected benefits of this weight loss included improved ability to perform physical activities of daily living.² The physical activities that these individuals identified as important and relevant were classified as pertaining to the concepts of flexibility, mobility, and activity level. Thus, these physical activities served as the basis for generation of 13 items. Psychometric tests conducted with data from a sample of individuals with self-reported T2DM and obesity (BMI, 30–40 kg/m²), participating in a cross-sectional Web-based survey, led to item reduction for a finalized seven-item APPADL and a demonstration of its reliability (i.e., internal consistency) and construct validity.¹

Although the preliminary validation of the APPADL lent support for its validity as a measure of self-reported performance of physical activities of daily living, additional evidence is needed if the APPADL is to be used as a PRO end point in the evaluation of weight loss interventions targeted toward individuals with T2DM. It must demonstrate the ability to detect significant change in the individuals' self-reported ability to perform physical activities when significant weight loss occurs.³ Prior to that, however, there must be evidence that there is insignificant intrasubject variability in the measure when no weight loss occurs (i.e., test–retest reliability).³ Finally, if the APPADL is responsive to change in self-reported ability to perform physical activities as a result of weight loss, it would be important to know the change in APPADL scores that is consistent with patient-perceived and/or clinical benefit (i.e., minimal important change [MIC])^4,5 rather than simply a change that is statistically significant. Because the previously published study conducted by Hayes et al.¹ was a cross-sectional study, it was not possible to obtain any of this essential information. Therefore, the primary objective of this prospective study was to estimate APPADL test–retest reliability, responsiveness, and MIC in a sample of individuals with T2DM and obesity.

Research Design and Methods

Study participants

Study participants were recruited primarily from four sites in the United States. Three of the four sites were managed by a company that is no longer in operation but offered a holistic approach to weight loss. The fourth site is a university-based site that was testing a standard behavioral weight loss program plus a portion-controlled diet (Nutrisystem^®, Horsham, PA) for weight loss versus a diabetes support and education program. In addition, a few study participants were recruited from a specialty clinic in which all physicians are board-certified in Endocrinology and Metabolism and provide diabetes management training or from a weight loss program that offered behavioral and lifestyle modification in addition to an oral serotonin supplement. Eligibility criteria included (1) actively seeking or currently engaged in a weight loss intervention, (2) diagnosed with T2DM at least 6 months prior to screening, (3) 25–65 years of age, and (4) BMI ≥30 kg/m². Although the APPADL was developed for a target population of patients with T2DM and BMI of 30–40 kg/m², no upper limit of BMI was imposed to facilitate recruitment. In addition, no criteria were imposed as to when a participant began or how long a participant had been involved in a weight loss program. Sample size was targeted at 300 patients for robust psychometric analyses.

Procedure

At baseline (Visit 1), site coordinators provided study participants with a study packet that included a sociodemographic form, the APPADL, and other PRO instruments. Site coordinators captured clinical data including height, weight, co-morbidities, diabetes medications, and verification of T2DM (glucose reading or confirmation of prescription for diabetes medication) on a standardized form. At Visit 2, approximately 5 days postbaseline, participants were administered the APPADL. Six months postbaseline (Visit 3), participants were administered the APPADL, the PRO instruments administered at Visit 1, and a global impression of change question. Clinical variables (i.e., weight) were also captured at Visit 3. Study approval was obtained from the Independent Investigational Review Board, Inc.

PRO instruments

The packet administered to participants at Visits 1 and 3 included, along with the APPADL, the Short-Form 36 Version 2.0 (SF-36),⁶ Weight-Related Symptom Measure (WRSM),^7,8 and the Obesity and Weight Loss-Quality of Life Measure (OWLQOL).^7,8 The seven-item APPADL is designed to assess self-reported ability to perform physical activities of daily living.¹ The SF-36 consists of eight domains, including a physical function domain, and is designed to provide a general self-evaluation of health status.⁶ The WRSM is designed to assess existence and bothersomeness of 20 symptoms related to obesity, and the 17-item OWLQOL is designed to assess the impact of obesity or weight loss on aspects of quality of life.^7,8 The SF-36, WRSM, and OWLQOL were administered for comparison purposes (see Statistical analysis below). Additional details of PRO instruments administered are found in Table 1. Finally, at Visit 3, participants were asked to respond to a global question pertaining to change in ability to perform physical activities since the last visit on a 3-point scale from “worse” to “better.”

Table 1.

Patient-Reported Outcome Instruments Administered to Study Participants at Visits 1 and 3

Patient-reported outcome	Number of items	Recall period	Item concepts	Response set
APPADL	7	Current status	Difficulty in doing, for example, standing for 2–3 h, household chores or yard work that requires bending over or squatting down, or moderate activities	5-point scale: from “unable to do” to “not at all difficult”
Short-Form 36 Domains
Physical Function	10	Current status	Extent to which health limits ability to do activities, for example, walking various distances, moderate or vigorous activities	3-point scale: “yes, limited a lot” to “no, not limited at all”
Role Physical	4	Past week	Amount of time physical health has interfered with, for example, the amount of time spent on work or other activities	5-point scale: from “all of the time” to “none of the time”
Role Emotional	3	Past week	Amount of time emotional health has interfered with, for example, amount of time spent on work or other activities	5-point scale: from “all of the time” to “none of the time”
Social Functioning	2	Past week	Amount of time/extent to which physical health or emotional problems have interfered with social activities	5-point scale: from “all of the time” to “none of the time”/“not at all” to “extremely”
Bodily Pain	2	Past week	Amount of pain/extent to which pain interfered with life	6-point scale: from “none” to “very severe”/“not at all” to “extremely”
Vitality	4	Past week	Amount of time feeling, for example, worn out	5-point scale: from “all of the time” to “none of the time”
Mental Health	5	Past week	Amount of time feeling, for example, calm and peaceful	5-point scale: from “all of the time” to “none of the time”
General Health	5	Current status	Extent to which statements are true or false for the respondent, for example, “I seem to get sick a little easier than other people”	5-point scale: from “definitely true” to “definitely false”
			Rating of general health (1 item)	5-point scale: from “poor” to “excellent”
OWLQOL	17	At this time	The extent to which statements describe the respondent, for example, “Because of my weight, I try to wear clothes that hide my shape.”	7-point scale: from “not at all” to “a very great deal”
WRSM	20	Past 4 weeks	Presence of symptom, for example:	Yes or no
			• Shortness of breath
			• Decreased physical stamina
			If symptom present, bothersomeness of symptom	7-point scale: from “no symptom/not at all” to “a very great deal”

APPADL, Ability to Perform Physical Activities of Daily Living; OWLQOL, Obesity and Weight Loss-Quality of Life Measure; WRSM, Weight-Related Symptom Measure.

Statistical analysis

Descriptive statistics (e.g., mean and SD) were calculated for demographics, clinical variables, and each PRO instrument's domain and total scores. Total and domain scores were reverse-scored when necessary so that higher scores for all PRO measures corresponded to better outcome (e.g., greater ability to perform physical activities of daily living, greater health status). Subsequently, all total scores were linearly transformed to a 0–100 scale for easier comparison among instruments. However, APPADL individual item scores are reported as original scores (1–5) to more easily compare item statistics with those in the previous study conducted by Hayes et al.¹
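As a minimal sketch of the scoring step described above, the following Python rescales a raw score linearly to 0–100, with optional reverse scoring. It assumes the APPADL total is the mean of the seven 1–5 item scores; that scoring rule and the function names are illustrative assumptions, not the published scoring algorithm.

```python
def to_0_100(raw, low, high, reverse=False):
    """Linearly rescale a raw score from [low, high] to [0, 100].

    With reverse=True the score is flipped first, so that a higher
    transformed score always corresponds to a better outcome.
    """
    if not low <= raw <= high:
        raise ValueError("raw score outside the instrument's range")
    if reverse:
        raw = low + high - raw
    return 100.0 * (raw - low) / (high - low)


def appadl_total(item_scores):
    """Mean of seven 1-5 item scores rescaled to 0-100 (assumed rule)."""
    if len(item_scores) != 7:
        raise ValueError("expected seven item scores")
    mean_item = sum(item_scores) / len(item_scores)
    return to_0_100(mean_item, 1, 5)
```

Under this assumed rule, a mean item score of 3.3 on the 1–5 scale maps to about 57.5 on the 0–100 scale.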

APPADL test–retest reliability was estimated with the intraclass correlation coefficient (absolute agreement two-way random effects model)⁹ using data from Visits 1 and 2. APPADL internal consistency reliability was estimated with coefficient α¹⁰ using data from Visits 1, 2, and 3. Reliability coefficients of 0.80 or higher were considered satisfactory.
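The two reliability coefficients can be sketched in plain Python as follows. The single-measure form of the absolute-agreement, two-way random-effects coefficient (often written ICC(A,1)) is an assumption, because the text does not say whether the single- or average-measure form was used. The function names are hypothetical.

```python
from statistics import mean, variance


def icc_a1(scores):
    """ICC(A,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant, each holding that
    participant's score at every occasion, e.g. [visit1, visit2].
    """
    n = len(scores)      # number of participants
    k = len(scores[0])   # number of occasions
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between participants
    msc = ss_cols / (k - 1)                 # between occasions
    mse = ss_error / ((n - 1) * (k - 1))    # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cronbach_alpha(items):
    """Coefficient alpha; `items` is a list of rows, one per participant,
    each holding that participant's score on every item."""
    k = len(items[0])
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

With made-up data in which every participant scores exactly one point higher at the second visit, `icc_a1([[1, 2], [2, 3], [3, 4]])` evaluates to 2/3 rather than 1, which illustrates that the absolute-agreement form penalizes systematic shifts between visits.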

To establish whether there was a significant weight loss in the subgroup of participants followed prospectively, a paired t test was performed using Visits 1 and 3 data. To estimate APPADL responsiveness, a paired t test was performed using Visits 1 and 3 data. The common responsiveness indices of effect size (ES) (mean change in score/SD of baseline scores) and standardized response mean (SRM) (mean change in score/SD of mean change score) were calculated.^11,12 To compare APPADL responsiveness with change in clinical measures (weight, BMI) and the responsiveness of other PRO instruments that have been used as end points in clinical trials of pharmaceutical and commercial weight loss interventions (e.g., SF-36 domains),^8,13,14 similar analyses were performed using Visits 1 and 3 data for the other measures.
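The responsiveness indices defined above can be sketched as follows, assuming paired baseline and end point scores for the same participants. The function name and return structure are hypothetical, and the P value is omitted because it needs a t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness(baseline, endpoint):
    """Return mean change, effect size (ES), standardized response
    mean (SRM), and the paired t statistic for paired scores."""
    if len(baseline) != len(endpoint):
        raise ValueError("baseline and endpoint must be paired")
    change = [e - b for b, e in zip(baseline, endpoint)]
    mean_change = mean(change)
    sd_change = stdev(change)  # sample SD of the change scores
    return {
        "mean_change": mean_change,
        "es": mean_change / stdev(baseline),   # change / SD of baseline
        "srm": mean_change / sd_change,        # change / SD of change
        "t": mean_change / (sd_change / sqrt(len(change))),
    }
```

The two indices differ only in the denominator: ES scales the change by the spread of baseline scores, whereas SRM scales it by the spread of the change scores themselves.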

To estimate MIC, Crosby et al.¹⁵ recommend an integrated approach that includes both distribution- and anchor-based methods. Distribution-based methods use a statistical property (e.g., SD) of the sample or PRO measure to estimate MIC, while anchor-based methods compare changes in the PRO measure with, for example, patient impression of change or other clinically relevant variables. For this study, distribution-based methods were 0.5 SD and 1 SE of measurement (1 SEM, calculated as the baseline SD multiplied by the square root of [1 minus reliability]). These two statistics have been shown to be good approximations of MIC determined by other methods.^4,5,16,17
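Written out, the two distribution-based estimates take the following form, where $SD_{\text{baseline}}$ is the SD of baseline scores and $r$ is the reliability coefficient (both symbols are introduced here for illustration only):

```latex
\mathrm{MIC}_{0.5\,SD} = 0.5 \times SD_{\text{baseline}},
\qquad
\mathrm{SEM} = SD_{\text{baseline}} \times \sqrt{1 - r}
```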

Participants' responses to the global question pertaining to change in ability to perform daily physical activities since the last visit were dichotomized into two categories: “better” and “no change/worsening.” Percentage of body weight loss was computed by dividing weight change by Visit 1 weight. Because a 5% body weight loss is considered to be both important to individuals with T2DM and obesity² and clinically beneficial,^18,19 participants were divided into those who achieved a 5% or greater weight loss from Visit 1 to Visit 3 and those who did not. To provide an anchor-based method for determining an MIC, the difference in the APPADL mean change score for those participants who reported having better ability to perform physical activities and those who reported no change/worsening was calculated. The difference in the APPADL mean change score for those who achieved 5% or more weight loss and those who achieved less than 5% was also calculated. Independent t tests were also performed to establish the statistical significance of the observed difference in means for both group comparisons (global impression of change and percentage weight loss). The α level for all analyses was set at P<0.05.
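The anchor-based calculation described above can be sketched as follows. The function names and the example values are hypothetical and are not study data. The sketch only computes the difference in mean change between the two anchor groups and leaves out the independent t test.

```python
from statistics import mean


def pct_weight_loss(weight_v1, weight_v3):
    """Percentage of Visit 1 body weight lost by Visit 3 (positive = loss)."""
    return 100.0 * (weight_v1 - weight_v3) / weight_v1


def anchor_mic(changes, improved):
    """Difference in mean change score between the two anchor groups.

    `changes` holds each participant's Visit 3 minus Visit 1 score;
    `improved` holds the matching True/False anchor classification.
    """
    yes = [c for c, flag in zip(changes, improved) if flag]
    no = [c for c, flag in zip(changes, improved) if not flag]
    if not yes or not no:
        raise ValueError("both anchor groups need at least one participant")
    return mean(yes) - mean(no)


# Hypothetical example: classify by the 5% weight loss anchor.
weights_v1 = [240.0, 200.0, 220.0, 250.0]
weights_v3 = [225.0, 198.0, 205.0, 249.0]
score_changes = [14, 2, 18, 0]
lost_5pct = [pct_weight_loss(a, b) >= 5.0
             for a, b in zip(weights_v1, weights_v3)]
mic_estimate = anchor_mic(score_changes, lost_5pct)
```

The same `anchor_mic` function applies to the global impression anchor by passing a list that is True for participants who answered “better.”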

Results

Study participants

Because of recruitment issues, including the ceasing of operation of the business that managed three study sites, the study was terminated at the recruitment of 119 participants. Study participants (n=106) with complete data for Visits 1 and 2 were included in the test–retest analyses. These participants were mostly African-American (54%), female (69%), middle-aged (mean [SD]=52 [10] years), and moderately to severely obese (BMI mean [SD], 38 [6] kg/m²) (Table 2). Complete data needed for the responsiveness analysis (Visits 1 and 3 data) were only available for 40 participants from the university-based study site. With the exception of one individual who did not participate in the test–retest study, these 40 participants were a subgroup of the participants for the test–retest analyses. Characteristics of the subgroup were similar to those of the larger test–retest group with the exception that the responsiveness group was less racially diverse (15% white vs. 31% white) and was recruited from one clinical site.

Table 2.

Demographics, Selected Co-Morbidities, Diabetes Medications, and Patient-Reported Outcome Instrument Scores at Baseline for Test–Retest Study Participants (n=106) and Responsiveness Subgroup (n=40)

Survey variables	Test–retest participants	Responsiveness participants ^a
Age (years)	52 (10)	52 (12)
Weight (pounds)	235 (44)	232 (42)
Body mass index (kg/m²)	38 (6)	37 (5)
Gender (female)	69%	65%
Race
White	31%	15%
African-American	54%	78%
All other	15%	8%
Education (some college or higher)	68%	63%
Employment (full-time)	47%	60%
Co-morbidities
High blood pressure	72%	73%
High cholesterol	56%	65%
Asthma	7%	3%
Arthritis	5%	3%
Gastroesophageal reflux disease	4%	10%
Diabetes medications
No medications	11%	0%
Single or multiple orals	67%	80%
Combination (orals+incretin mimetics and/or insulin)	17%	18%
Insulin only	6%	3%
Patient-reported outcome measures^b
APPADL	57.0 (24.5)	70.8 (19.1)
SF-36
Physical Function	63.7 (25.7)	76.9 (21.0)
Role Physical	68.0 (26.9)	79.5 (25.6)
Bodily Pain	63.1 (26.0)	77.2 (22.6)
Mental Health	73.8 (17.0)	79.6 (14.3)
Role Emotional	76.3 (26.4)	85.4 (22.1)
Social Functioning	80.0 (23.4)	85.6 (23.3)
Vitality	51.4 (19.3)	59.2 (18.0)
General Health	52.3 (20.9)	61.5 (19.2)
WRSM	72.6 (18.8)	83.0 (13.7)
OWLQOL	54.8 (25.7)	64.8 (23.8)

Data are mean (SD) values or percentages.

^a Subgroup (n=40) of test–retest study participants (n=106).

^b All scores have been transformed to a scale from 0 to 100, with higher scores corresponding to better outcome.

APPADL, Ability to Perform Physical Activities of Daily Living; OWLQOL, Obesity and Weight Loss-Quality of Life Measure; SF-36, Short-Form 36; WRSM, Weight-Related Symptom Measure.

Item statistics, internal consistency, and test–retest reliability

Item statistics for APPADL at Visit 1 indicated that most participants reported moderate difficulty (APPADL item score, approximately 2.5–3.5 on a 1–5 scale) in performing the seven physical activities. Ceiling effects (percentage of participants responding “not at all difficult”) were less than 25% (Table 3). Cronbach's α coefficients calculated for Visits 1, 2, and 3 APPADL administrations were all ≥0.89. Test–retest reliability for Visits 1 and 2 was 0.91.

Table 3.

Item Statistics for the Seven-Item Impact of Weight on Activities of Daily Living Questionnaire at Visit 1 (n=106)

Item number	How difficult is it for you to:	% unable to do (floor)	% not at all difficult (ceiling)	Mean	SD	Median	Item-total correlations
1	… get up from the floor or ground?	1	11	3.3	1.0	3.0	0.75
2	… get down—for example, to sit, squat, or kneel on the floor or ground?	3	16	3.3	1.1	3.0	0.79
3	… stand for 2–3 h?	12	19	3.3	1.3	4.0	0.70
4	… walk up two flights of stairs?	5	23	3.5	1.2	4.0	0.80
5	… do household chores or yard work that require you to bend over or squat down, such as cleaning the bathtub or weeding?	4	18	3.4	1.1	4.0	0.84
6	… engage in moderate physical activity for 30 min, such as walking quickly, playing softball, playing volleyball, or ice skating?	9	23	3.4	1.2	4.0	0.81
7	… engage in strenuous physical activity for 30 min such as running, playing basketball, biking, skiing, or swimming laps?	18	8	2.7	1.2	3.0	0.77
APPADL total score		6	11	3.3	1.0	3.3

Visit 1 Cronbach's α=0.93 (n=106), 0.89 (n=40); Visit 2 Cronbach's α=0.94 (n=106); test–retest reliability (Visit 1 and 2)=0.91 (n=106); and Visit 3 Cronbach's α=0.93 (n=40).

APPADL, Ability to Perform Physical Activities of Daily Living Questionnaire.

Responsiveness

A significant difference was found in participants' mean weight between Visits 1 and 3 (231.9 vs. 222.0 pounds; P<0.001) (Table 4), with an average body weight loss of 4%. A significant difference was also found between Visits 1 and 3 in mean BMI (37.0 vs. 35.4 kg/m²; P<0.001), mean APPADL scores (70.8 vs. 77.0; P=0.01), mean SF-36 Vitality scores (59.3 vs. 66.2; P=0.04), and mean OWLQOL scores (64.8 vs. 72.2; P=0.01). No significant differences in any of the other PRO instruments were observed. The ES for change in weight was 0.24, and that for change in BMI was 0.31. The values of ES for change in APPADL, SF-36 Vitality scores, and OWLQOL scores were 0.32, 0.38, and 0.31, respectively. The SRMs for change in APPADL, SF-36 Vitality scores, and OWLQOL scores were 0.42, 0.34, and 0.42, respectively.

Table 4.

Differences in Patient-Reported Outcome Instrument and Weight-Related Means Between Baseline (Visit 1) and End Point (Visit 3)

		Mean (SD)
Variable	n	Baseline (Visit 1)	End point (Visit 3)	Mean change ^a	ES	SRM
APPADL	40	70.8 (19.1)	77.0 (21.9)	6.2 (14.6)^**	0.32	0.42
SF-36
Physical Function	40	76.9 (21.0)	81.0 (18.5)	4.1 (15.6)
Physical Role	40	79.5 (25.6)	83.3 (22.0)	3.8 (18.9)
Bodily Pain	40	77.2 (22.6)	78.1 (25.5)	0.8 (26.0)
Mental Health	38	79.7 (14.5)	77.2 (17.9)	−2.5 (17.1)
Emotional Role	40	85.4 (22.1)	87.5 (22.8)	2.1 (24.8)
Social Function	40	85.6 (23.3)	83.4 (24.6)	−2.2 (22.1)
Vitality	39	59.3 (18.2)	66.2 (18.8)	6.9 (20.1)^*	0.38	0.34
General Health	40	61.5 (19.2)	61.0 (20.3)	−0.5 (17.9)
WRSM	40	83.0 (13.7)	87.0 (14.0)	4.0 (14.0)
OWLQOL	40	64.8 (23.8)	72.2 (22.2)	7.4 (17.5)^**	0.31	0.42
Weight (pounds)	39	231.9 (42.3)	222.0 (39.5)	−10.0 (14.9)^***	0.24
BMI (kg/m²)	39	37.0 (4.8)	35.4 (5.0)	−1.5 (2.1)^***	0.31

^a Mean change is calculated by subtracting Visit 1 value from Visit 3 value.

^* P<0.05, ^** P≤0.01, ^*** P≤0.001.

APPADL, Ability to Perform Physical Activities of Daily Living Questionnaire; BMI, body mass index; ES, effect size; OWLQOL, Obesity and Weight Loss-Quality of Life Measure; SF-36, Short-Form 36; SRM, standardized response mean; WRSM, Weight-Related Symptom Measure.

MIC

The two distribution-based methods used to estimate MIC in APPADL scores were 0.5 SD and 1 SEM. These were calculated as 9.6 and 6.3, respectively. MIC was also estimated by “anchoring” APPADL scores to weight loss and patient global impression of improvement. The mean change in APPADL score for those participants who achieved a 5% or more weight loss (n=12) from baseline to end point was 15.5 compared with 1.9 (P=0.01) for those individuals who did not lose at least 5% (n=27), suggesting an MIC of 13.6 (i.e., difference between 15.5 and 1.9). The mean change in APPADL score for those participants who reported having better (n=20) ability to perform daily physical activities since the last visit was 11.1 compared with 1.3 (P=0.03) for those who reported no change/worsening (n=20), suggesting an MIC of 9.8 (i.e., difference between 11.1 and 1.3).
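The MIC estimates can be reproduced approximately from the reported summary statistics. The sketch below assumes that the baseline SD is the responsiveness subgroup's Visit 1 APPADL SD (19.1) and that the reliability entering the SEM is that subgroup's Visit 1 Cronbach's α (0.89). The text does not state which reliability coefficient was used, so that choice is an assumption.

```python
from math import sqrt

BASELINE_SD = 19.1   # Visit 1 APPADL SD, responsiveness subgroup (Table 4)
RELIABILITY = 0.89   # ASSUMPTION: Visit 1 Cronbach's alpha for n=40

# Distribution-based estimates.
half_sd = 0.5 * BASELINE_SD
sem = BASELINE_SD * sqrt(1 - RELIABILITY)

# Anchor-based estimates: difference in mean change between groups.
mic_weight_anchor = 15.5 - 1.9    # >=5% weight loss vs. <5%
mic_global_anchor = 11.1 - 1.3    # "better" vs. no change/worsening

print(f"0.5 SD = {half_sd:.2f}, SEM = {sem:.2f}")
print(f"weight anchor = {mic_weight_anchor:.1f}, "
      f"global anchor = {mic_global_anchor:.1f}")
```

Under these assumptions the distribution-based values come out near 9.55 and 6.33, close to the reported 9.6 and 6.3; the small gap for 0.5 SD presumably reflects rounding of the SD. Substituting the test–retest coefficient of 0.91 would instead give an SEM near 5.7.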

Discussion

The overall goal of this study was to provide additional support for use of the APPADL as a potential secondary end point in trials of weight loss interventions targeted at individuals with T2DM and moderate to severe obesity. This could include those antihyperglycemic medications that produce weight loss. The first objective was to demonstrate acceptable item statistics and test–retest reliability with the APPADL measure. Item statistics showed that, in general, the racially diverse and moderately to severely obese sample in the present study reported the ability to do most of the physical activities of daily living with moderate difficulty, and internal consistency of the items was quite high. These item statistics were similar to those observed by Hayes et al.¹ in a sample of primarily white individuals with T2DM and moderate obesity. Test–retest reliability was acceptable and comparable to the test–retest coefficients reported for generic physical function measures (e.g., SF-36 Physical Function Domain²⁰) and obesity-specific physical function measures (e.g., Impact of Weight on Quality of Life-Lite Physical Function Domain¹⁵).

The previous study by Hayes et al.¹ showed an association between APPADL scores and change in weight as calculated from participants' reports of their current weight and their previous year's weight. This relationship was supported by significant differences in weight loss across APPADL groups (low, medium, and high scores). However, although the ability of scores to discriminate between groups is suggestive of responsiveness, a true test of responsiveness requires a prospective study in which a true difference from baseline to end point occurs. Therefore, our second objective was to estimate responsiveness by testing the APPADL within a prospective observational study of weight loss interventions in individuals with T2DM and obesity. The change in weight from baseline to end point was statistically significant. The APPADL detected that weight change, as shown by the statistically significant change in APPADL scores. The responsiveness indices (ES and SRM) for the APPADL mean change were relatively small, but the ES for weight change was also small. The correspondence between weight change and APPADL scores further supported the correspondence between weight loss and performing physical activities that was anticipated by individuals with T2DM and obesity in the qualitative study.²

The SF-36 Vitality Scale and the OWLQOL measure also were responsive to the change in weight. Although these are important tools for measuring the patient perspective of weight loss, these measures have limitations for the support of labeling claims in the development of pharmaceutical interventions that produce weight loss.²¹ The SF-36 Vitality Scale was designed to assess health status across different diseases and conditions⁶ and was not developed with input from any target population such as individuals with T2DM and obesity. The OWLQOL includes quality of life items that measure possible negative psychological impacts of obesity (i.e., shame, frustration, and stress) that could be influenced by other factors (e.g., patient expectations and social support) besides weight. Moreover, as defined by the Food and Drug Administration, a quality of life measure contains “non health-related aspects of life, and because the term generally is accepted to mean what the patient thinks it is, it is too general and undefined to be considered appropriate for a medical product claim.”²¹ The APPADL, in contrast, was developed according to regulatory guidance by obtaining input from the target patient population (i.e., individuals with T2DM and moderate obesity) in its development and limiting its conceptual framework to those concepts deemed important and relevant to weight loss in that target population.

The last objective was to suggest a change in the APPADL score that may be considered meaningful for purposes of interpretation. The distribution-based methods for determining MIC indicated that an approximately 6–10-point change may be meaningful. However, anchor-based approaches, which were based on patient input or a weight loss change deemed meaningful by patients, suggested a larger MIC of 10–14 points. Thus, for sample size estimation aimed at showing statistical significance, 6 points may be adequate,⁵ but the change at which patients actually perceive a difference in their ability to perform physical activity, and/or that corresponds to meaningful weight loss, may be closer to 10 points on a 0–100 scale. It should be noted, however, that, as with other psychometric properties of instruments, the MIC is sample-specific. Additional research is needed to provide support for these estimates.

Limitations

The current sample was mostly African-American (54%) and therefore not representative of the general population of individuals with T2DM and obesity. However, given that Hayes et al.¹ conducted a preliminary validation of the APPADL in a primarily white sample, recruitment in this study was aimed at providing support for APPADL validation in a more racially diverse sample.

Inclusion criteria required study participants to be actively seeking or currently engaged in a weight loss intervention. Fontaine et al.²² observed that obese populations seeking treatment are different than those who do not seek treatment because they have a tendency to report greater impairment, specifically in the areas of self-reported physical function. Therefore, the participants in this study may not be representative of all individuals with T2DM who are obese. Nevertheless, individuals who enroll in clinical trials of antihyperglycemic medications that also produce weight loss are, by virtue of their enrollment, seeking treatment and may be similarly impaired. Consequently, the results reported in this study may not be generalizable to the population of individuals with T2DM and obesity as a whole but are likely representative of the target population for clinical trials.

Another limitation, not of the study design, but potentially of the APPADL, is that it is a self-reported assessment of ability to perform physical activities as opposed to an assessment of physical performance as rated by trained professionals. Summary performance measures of, for example, lower extremity function (walking speed, timed chair stands, and standing balance) are useful because they have been shown to be predictive of disability.²³ However, these measures capture performance on physical tests that are not necessarily linked to daily physical activities. The APPADL asks respondents to provide their perception of their ability to perform various physical activities. These activities were identified by the target population (individuals with T2DM and moderate obesity) as relevant and important to them because they experience doing them, or attempting to do them, on a nearly daily basis.² Therefore, although physical performance tests are valued tools in research and geriatric assessment, especially as predictors of deterioration,²³ patients' self-reported improvement in daily physical activities may be more relevant as a direct measure of weight loss treatment benefit. Moreover, the simplicity of the administration of a questionnaire over a physical performance measure also provides a clear advantage for use in clinical trials. The APPADL takes less than 5 min to complete, has a Flesch-Kincaid reading level of 9th grade, and has been linguistically validated in several languages. The APPADL is publicly available and can be obtained by contacting the first author.

Conclusions

Per the Food and Drug Administration guidance for the use of PRO measures in medical product development, “Use of a PRO instrument [in clinical trials] is advised when measuring a concept best known by the patient or best measured from the patient perspective.”^21, p. 2 Although it cannot be denied that clinical measures (i.e., weight loss) are the appropriate primary end points for weight loss interventions, the relevance of this weight loss to a patient can only be reported by the patient. A previous study² has shown that for individuals with T2DM and obesity, relevance of weight loss is linked to their perceived improvement in their ability to perform physical daily activities. Therefore, to fully evaluate the benefit of weight loss interventions, it would seem important to collect not only clinical data but also patients' perceptions of their physical function.

This study provided additional information on the psychometric properties of the APPADL. The data indicated acceptable test–retest reliability, responsiveness, and an MIC of approximately 10–13 points in a racially diverse sample. Thus, the APPADL has shown the potential to be a useful tool in the evaluation of weight loss interventions, including antihyperglycemic medications that produce weight loss, targeted at individuals with T2DM and moderate obesity.

Acknowledgments

This study was funded by Eli Lilly and Company. The authors would like to thank both Teri Tucker, a full-time employee of PharmaNet/i3, an inVentiv Health Company, and Michael Meldahl for their assistance in the preparation of this manuscript.

Author Disclosure Statement

All authors are full-time employees and shareholders of Eli Lilly and Company.

References

1. Hayes, Nelson, Meldahl, Curtis. Ability to perform daily physical activities in individuals with type 2 diabetes and moderate obesity: a preliminary validation of the Impact of Weight on Activities of Daily Living Questionnaire. Diabetes Technol Ther, 2011; 13:705–712.

2. Curtis, Hayes, Fehnel, Zografos. Assessing the effect of weight and weight loss in obese persons with type 2 diabetes. Diabetes Metab Syndr Obes, 2008; 1:13–23.

3. Kirshner, Guyatt. A methodological framework for assessing health indices. J Chronic Dis, 1985; 38:27–36.

4. Norman, Sloan, Wyrwich. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care, 2003; 41:582–592.

5. Terwee, Roorda, Knol, De Boer, De Vet. Linking measurement error to minimal important change of patient-reported outcomes. J Clin Epidemiol, 2009; 62:1062–1067.

6. Ware. SF-36^® Health Survey Update. www.sf-36.org/tools/SF36.shtml. 2012 January .

7. Niero, Martin, Finger, Lucas, Mear, Wild, Glauda, Patrick. A new approach to multicultural item generation in the development of two obesity-specific measures: the Obesity and Weight Loss Quality of Life (OWLQOL) questionnaire and the Weight-Related Symptom Measure (WRSM). Clin Ther, 2002; 24:690–700.

8. Patrick, Bushnell, Rothman. Performance of two self-report measures for evaluating obesity and weight loss. Obes Res, 2004; 12:48–57.

9. McGraw, Wong. Forming inferences about some intraclass correlation coefficients. Psychol Methods, 1996; 1:30–46.

10. Cronbach. Coefficient alpha and the internal structure of tests. Psychometrika, 1951; 16:297–334.

11. Husted, Cook, Farewell, Gladman. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol, 2000; 53:459–468.

12. Terwee, Dekker, Wiersinga, Prummel, Bossuyt. On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation. Qual Life Res, 2003; 12:349–362.

13. Samsa, Kolotkin, Williams, Nguyen, Mendel. Effect of moderate weight loss on health-related quality of life: an analysis of combined data from 4 randomized trials of sibutramine vs placebo. Am J Manag Care, 2001; 7:875–883.

14.

Foster

, Borradaile

, Vander Veur

, Leh Shantz

, Dilks

, Goldbacher

, Oliver

, Lagrotte

, Homko

, Satz

. The effects of a commercially available weight loss program among obese patients with type 2 diabetes: a randomized study. Postgrad Med, 2009; 121:113–118.

15.

Crosby

, Kolotkin

, Williams

. An integrated method to determine meaningful changes in health-related quality of life. J Clin Epidemiol, 2004; 57:1153–1160.

16.

Sloan

, Symonds

, Vargas-Chanes

, Fridley

. Practical guidelines for assessing the clinical significance of health-related quality of life changes within clinical trials. Drug Inf J, 2003; 37:23–31.

17.

Rejas

, Pardo

, Ruiz

. Standard error of measurement as a valid alternative to minimally important difference for evaluating the magnitude of changes in patient-reported outcomes measures. J Clin Epidemiol, 2008; 61:350–356.

18.

de Leiva

. What are the benefits of moderate weight loss? Exp Clin Endocrinol Diabetes, 1998; 106,Suppl 2:10–13.

19.

Center for Drug Evaluation Research, Food, Drug Administration, U.S. Department of Health and Human Services: Guidance for Industry: Developing Products for Weight Management. 2007. www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm071612.pdf. 2012 March 25.

20.

Marx

, Menezes

, Horovitz

, Jones

, Warren

. A comparison of two time intervals for test-retest reliability of health status instruments. J Clin Epidemiol, 2003; 56:730–735.

21.

Center for Drug Evaluation Research, Center for Biologics Evaluation Research, Center for Devices and Radiological Health, Food and Drug Administration, U.S. Department of Health and Human Services: Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. 2009. www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf. 2012 March 25.

22.

Fontaine

, Bartlett

, Barofsky

. Health-related quality of life among obese persons seeking and not currently seeking treatment. Int J Eat Disord, 2000; 27:101–105.

23.

Guralnik

, Ferrucci

, Pieper

, Leveille

, Markides

, Ostir

, Studenski

, Berkman

, Wallace

. Lower extremity function and subsequent disability: consistency across studies, predictive models, and value of gait speed alone compared with the short physical performance battery. J Gerontol A Biol Sci Med Sci, 2000; 55:M221–M231.