Abstract
Accurate retrospective reporting of activities and symptoms has been shown to be problematic for older adults, yet standard clinical care relies on self-reports to aid in assessment and management. Our aim was to examine the relationship between self-report and sensor-based measures of activity. We administered an online activity survey to participants in our ongoing longitudinal study of in-home ubiquitous monitoring. We found a wide range of accuracies when comparing self-report with time-stamped sensor-based data. Of the 95 participants who completed the 2-hr activity log, nearly one quarter did not complete the task in a way that could potentially be compared with sensor data. Where comparisons were possible, agreement between self-reported and sensor-based activity was achieved by a minority of participants. The findings suggest that capture of real-time events with unobtrusive activity monitoring may be a more reliable approach to describing behavioral patterns and meaningful changes in older adults.
Self-report assessment is a cost-effective means of gathering large amounts of data, and can be adapted for different populations and research applications. Thus, retrospective self-reports of health status, daily activity, clinical symptomatology, and other important health-related variables have been widely used for decades in health outcomes research. However, in many cases, poor or inconsistent reliability of self-report questionnaires and inventories has been well recognized. Discrepancies between self-report and more objective measures have been documented in a wide variety of important behaviors, including medication adherence (Wild & Cotrell, 2003), health service utilization (Wallihan, Stump, & Callahan, 1999), dietary intake (O’Loughlin et al., 2013), and driving (Cotrell, Wild, & Bader, 2006). This is an increasingly important issue as more survey research moves online while potentially more accurate and objective methods of data collection are tested and developed.
For example, the importance of physical activity in maintaining health and independence in older adults has been well established (Concannon, Grierson, & Harrast, 2012; Sun, Norman, & While, 2013). At the same time, self-reporting of types, intensities, and amounts of physical activity has been shown to be unreliable. Saelens and Sallis (2000), in a review of self-report physical activity measures, concluded that while such measures have been developed for use with older adults, impaired recall, misinterpretation of survey items, and social desirability biases can limit the accuracy of data collected by those methods. Similarly, Tudor-Locke and Myers (2001) describe limitations of physical activity questionnaires when administered to more sedentary adults. Most instruments fail to address lower levels of activity both in terms of time spent and level of intensity, and emphasize more vigorous pursuits. Given the documented limited reliability of such measures in assessing activities more typical of older adults, the authors recommend approaches that combine self-report with more objective measures derived from motion-sensing devices.
With the development of applicable technologies, research has recently focused on achieving greater levels of accuracy in detecting and describing daily physical activity. Banda et al. (2010) found poor agreement between questionnaire responses and accelerometer data in middle-aged and older adults, with significantly higher levels of activity on self-report. Others have reported similar results with discrepancies between self-report and accelerometry (Dyrstad, Hansen, Holme, & Anderssen, 2014; Grimm, Swartz, Hart, Miller, & Strath, 2012; Hekler et al., 2012), pedometers (Harris et al., 2009), and doubly labeled water as a measure of total energy expenditure (Neilson, Robson, Friedenreich, & Csizmadi, 2008). In a review of 21 studies comparing self-report and direct measures of physical activity, Kowalski, Rhodes, Naylor, Tuokko, and MacDonald (2012) found “weak to moderate” correlations between these data collection methods. Further analysis revealed higher strength of association between direct measures than between indirect, or self-report, measures of activity. The authors suggest that the measurement of physical activity in older adults presents unique challenges, including intra-individual variability due to changes in health, and failure of indirect measures to appropriately assess intensity, type, and duration of activity relevant to this population.
In response to the demonstrated unreliability of self-reports of daily events, more direct reporting techniques have been developed to minimize the effects of memory lapses and recall biases. Ecological Momentary Assessment (EMA; Kahneman, Krueger, Schkade, Schwarz, & Stone, 2004) is based on real-time monitoring or sampling of behaviors of interest as they occur in one’s natural environment (Shiffman, Stone, & Hufford, 2008). In a review of EMA research with older adults, Cain, Depp, and Jeste (2009) described the typical study as having collected data for a period of 2 weeks, related to physical and daily living activities. They found compliance to be acceptable across studies and recommended the use of such methods in combination with other continuous monitoring techniques and domains.
The Day Reconstruction Method (Kahneman et al., 2004) is a strategy for real-time behavior sampling in which a structured diary format elicits recall of a continuous sequence of events by means of a detailed description of duration and location of activities and the related social interactions and affect states. Similarly, Milligan, Bingley, and Gatrell (2005) used “solicited diaries” to collect data on the health and daily activities of older adults. They found that this method offered unique insights that were less influenced by memory lapses or biases by capturing events close in time to their actual occurrence. However, limitations of this technique included differences in response styles and possible burden on respondents with increasing duration of participation. Jacelon and Imperio (2005) also reported that solicited diaries were a productive means of tracking daily activities of older adults. However, they found that the richness of the data tended to decline over time, and recommended a 2-week maximum for such intense participation by older adults. Others have reported mixed results with the diary method. A large sample of older adults completed a 7-day daily diary symptom report (Aroian & Vander Wal, 2007), but respondents were found to endorse fewer symptoms each day. The authors hypothesized that frequently occurring symptoms become “taken for granted” and thus are less likely to be reported on a daily basis. They recommended the use of this method primarily for quantifying the appearance of new symptoms or change in symptoms over time. In an effort to reduce burden on older adults, we used an abbreviated Day Reconstruction Method approach to record activities for a 24-hour period. Consistent with other studies, we found wide variability both across participants and within diaries in terms of level of detail and consistency of response (Wild, Maxwell, Campbell, Hayes, & Kaye, 2011).
In-home monitoring of daily activity and behavior has benefited from dramatic advances in related technologies. Brownsell, Bradley, Blackburn, Cardinaux, and Hawley (2011), in a review of “lifestyle monitoring technologies,” concluded that such approaches are useful in tracking and detecting changes in activity, but the important next step of applying those data to identification of health care needs and interventions has not yet been reported. Similarly, Reeder et al. (2013) reviewed promising studies of smart home technologies and concluded that while prediction of functional decline based on changes in patterns of activity is readily attainable, future research should focus on the interpretation of those findings for improving individualized health care. Kaye et al. (2011) described a system of in-home continuous activity monitoring that has been deployed in the homes of more than 250 older adults for an average of nearly 3 years. In this longitudinal study, real-time ubiquitous and unobtrusive monitoring of activities and behaviors is captured by strategically placed in-home motion sensors. At the same time, participants respond to regularly scheduled questionnaires via a desktop computer. Continuously assessed metrics include walking speed, total activity, computer use, and time out of home, among others. With continued advances in monitoring capabilities and algorithms to interpret the ensuing data, assessment by self-report of symptomatology, behaviors, and activities may become a useful adjunct rather than the sole means of data collection.
Given this increasing capability to integrate “of the moment” self-report data with objectively sensed data, we were interested in determining the degree of fidelity in self-reports of activity relative to sensor-detected activity in older adults. Furthermore, we hypothesized that the reliability of recalled events and behaviors might be limited, particularly in older adults. To reduce burden and maximize accuracy, we developed an online survey to obtain activity information. In the present study, web-based self-reports of activity were compared with time-stamped sensor-based activity data.
Method
Participants
All participants had provided written informed consent and were already enrolled in one of two ongoing studies of in-home monitoring: the Oregon Center for Aging and Technology (ORCATECH) Living Laboratory study and the Intelligent Systems for Assessing Aging Change (ISAAC) study. The protocol was approved by the Oregon Health & Science University (OHSU) Institutional Review Board (IRB #2353). Both studies use the same in-home sensor technology and computers to detect early behavioral and cognitive changes that occur with aging. Inclusion criteria were as follows: 60 years and older for the Living Laboratory study and 80 and older for the ISAAC study, living independently, cognitively healthy (Clinical Dementia Rating score < 1; Mini-Mental State Exam score > 24), and of average health for age with no or well-controlled chronic health conditions. The details of the sensor system and its deployment in homes have been described previously (Kaye et al., 2011). All participants from these two cohorts who routinely completed a weekly electronic health form were asked to participate in this study. At the time of this study, that included 136 currently active participants who were regular computer users.
Procedures
At the time of completion of their online health form, participants were asked to complete an additional survey at one time point (see Figure 1). In the survey, participants were asked to report their activity, location in the home, and time of activity for the 2 hours immediately preceding completion of the survey. Text boxes in the survey were designed to expand as needed for each entry. Examples were given to illustrate appropriate responses. Sensor-based activity data for the 2 hours prior to the time stamp of the activity survey were examined for each participant for comparison with self-reported activity. Standard clinical and cognitive measures are administered on a yearly basis to all ISAAC participants and were available for this cohort.
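The selection of sensor data for the 2 hours preceding the survey time stamp can be sketched as follows. This is a minimal illustration in Python, not the study's actual processing pipeline: the function name `events_in_window`, the `(timestamp, room)` event format, and the example dates and rooms are all hypothetical.

```python
from datetime import datetime, timedelta

def events_in_window(events, survey_time, hours=2):
    """Return (timestamp, room) events from the `hours` before survey_time.

    `events` is an iterable of (datetime, room) pairs; the result is
    sorted chronologically. (Event format is an assumption.)
    """
    start = survey_time - timedelta(hours=hours)
    selected = [(ts, room) for ts, room in events if start <= ts <= survey_time]
    return sorted(selected)

# Hypothetical survey time stamp and sensor firings, for illustration only.
survey_time = datetime(2012, 5, 1, 10, 0)
events = [
    (datetime(2012, 5, 1, 7, 45), "bedroom"),      # before the window
    (datetime(2012, 5, 1, 8, 15), "kitchen"),
    (datetime(2012, 5, 1, 9, 30), "living room"),
    (datetime(2012, 5, 1, 10, 5), "kitchen"),      # after the survey
]

window = events_in_window(events, survey_time)
print([room for _, room in window])  # ['kitchen', 'living room']
```

Only firings that fall between 2 hours before the survey and the survey time stamp itself are retained for comparison with the self-report.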

Figure 1. Online activity survey.
Recognizing that most participants could not document the time and location of activity with the precision of motion sensors, we translated sensor data into sequences of locations rather than comparing exact time stamps. For example, sensor firings in the living room followed by firings in the kitchen and bedroom could be assessed for agreement with self-reported sequences of activity for the same time frame. Concordance between self-report and sensor data was determined by direct comparison of the room locations identified by each method for the 2-hour time frame. Comparisons in which the locations identified in self-report were in the same sequence as determined by sensor data were termed a “match”; those in which at least one location was consistently identified were considered to be in “partial agreement”; and those in which there was no overlap between self-report and sensor data were termed “no match.”
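The three-way concordance rule described above can be illustrated with a short Python sketch. It is one possible reading of the rule, not the procedure actually used in the study. The function names are hypothetical, and two details are assumptions: consecutive firings in the same room are collapsed into a single location, and a "match" requires the reported and sensed room sequences to be identical.

```python
from itertools import groupby

def room_sequence(rooms):
    """Normalize room names and collapse consecutive repeats,
    so that many firings in one room count as a single location."""
    cleaned = [r.strip().lower() for r in rooms]
    return [room for room, _ in groupby(cleaned)]

def classify_concordance(reported_rooms, sensor_rooms):
    """Classify agreement as 'match', 'partial agreement', or 'no match'."""
    reported = room_sequence(reported_rooms)
    sensed = room_sequence(sensor_rooms)
    # Assumption: a match means identical sequences after collapsing repeats.
    if reported and reported == sensed:
        return "match"
    # At least one location identified by both methods.
    if set(reported) & set(sensed):
        return "partial agreement"
    return "no match"

# Hypothetical sensor firings: living room, then kitchen, then bedroom.
sensor_firings = ["living room", "living room", "kitchen", "kitchen", "bedroom"]

print(classify_concordance(["Living room", "Kitchen", "Bedroom"], sensor_firings))  # match
print(classify_concordance(["Kitchen", "Bathroom"], sensor_firings))                # partial agreement
print(classify_concordance(["Garage"], sensor_firings))                             # no match
```

A looser definition of a match (for example, the reported rooms appearing in order as a subsequence of the sensed rooms) would classify more reports as matches; the strict version is used here only to keep the sketch simple.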
Statistical Analysis
Demographics, clinical characteristics, and neuropsychological test scores were compared between participants who were and were not able to complete the real-time activity reporting task. Student’s t test or Wilcoxon rank sum test was used as appropriate for continuous variables and Pearson’s chi-square test or Fisher’s exact test was used as appropriate for categorical variables. All analyses were performed using SAS 9.3 software (SAS Institute, Inc., Cary, NC).
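To make the categorical comparison concrete, the sketch below computes a two-sided Fisher's exact test for a 2 x 2 table using only the Python standard library. The analyses reported here were run in SAS; this code is an independent illustration. The function name and the example cell counts are hypothetical and are not data from this study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )

# Hypothetical counts (not study data): rows are two groups,
# columns are presence/absence of a characteristic.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

Fisher's exact test is generally preferred over Pearson's chi-square test when expected cell counts are small, which is the sense of "as appropriate" for the categorical variables.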
Results
Of the 136 ISAAC participants who completed a weekly health form, 95 completed the activity survey as requested. Of those 95 older adults, nearly one quarter (n = 22) did not complete the survey adequately, as defined by failure to include at least one activity, location, and time that could potentially be compared with time-stamped sensor-based data. Group differences between those who were and were not able to complete the real-time activity reporting task are presented in Table 1.
Table 1. Characteristics of Older Adults Who Were and Were Not Able to Complete the Real-Time Activity Reporting Task.
Note. MMSE = Mini-Mental State Examination; GDS = Geriatric Depression Scale; FAQ = Functional Assessment Questionnaire; CIRS = Cumulative Illness Rating Scale.
Represents time to completion in seconds; higher score reflects worse performance.
Between-group comparisons showed significantly lower mental status scores in those who did not submit usable activity reports than those who did. Verbal memory as measured by delayed recall on a word list learning task was also significantly lower for the participants who were unable to successfully complete the survey, while differences in word list learning were significant at the .06 level. Two participants who had most recent Clinical Dementia Rating scores of 0.5 indicative of mild cognitive impairment were both in the “unusable” survey group.
Of the 73 respondents with usable surveys, 49 were residing alone and had usable sensor data (see Figure 2). Of the 49 instances of comparable data sets, roughly one quarter (12) demonstrated a match between self-report and data based on sensor firings in a particular sequence of locations. Thirteen residents had no agreement at all between self-reported activity and sensor firings for the same 2-hr period (see Figures 3 and 4). For example, one respondent reported attending an exercise class, while sensors were firing for much of the reported episode. Finally, about half the reports had some overlap with sensor-based activity readings. There were no differences between these three groups (match, partial agreement, and no match) on any demographic or cognitive variables.
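The proportions reported above can be checked with simple arithmetic, sketched here in Python. The counts of 95, 22, 49, 12, and 13 are taken from the text. The count of 24 partial-agreement reports is not stated directly; it is derived on the assumption that the three groups together account for all 49 comparable data sets.

```python
completed = 95                       # participants who completed the activity survey
unusable = 22                        # surveys that could not be compared with sensor data
usable = completed - unusable        # 73 usable surveys
comparable = 49                      # living alone, with usable sensor data

counts = {"match": 12, "no match": 13}
# Derived, assuming the three groups partition the 49 comparable data sets.
counts["partial agreement"] = comparable - sum(counts.values())

print(f"unusable surveys: {unusable}/{completed} = {unusable / completed:.1%}")
for label, n in counts.items():
    print(f"{label}: {n}/{comparable} = {n / comparable:.1%}")
```

Under that assumption, the match, no-match, and partial-agreement groups make up roughly 24%, 27%, and 49% of the comparable data sets, consistent with the descriptions "roughly one quarter" and "about half."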

Figure 2. Participants in survey study.

Figure 3. Example of matching self-report (arrows) and sensor data.

Figure 4. Example of mismatch between self-report (arrows) and sensor data.
Discussion
We have demonstrated the difficulties inherent in obtaining accurate assessments of daily activity from older adults’ self-reports. In a relatively structured daily activity reconstruction format, nearly one quarter of the participants in an online survey failed to record a usable activity entry. Even among the generally cognitively intact older adults of the present study, those who had difficulty completing an online survey had lower scores on objective measures of mental status and verbal memory. While the clinical significance of these small test score differences may be slight, that such differences were statistically significant in a cohort of healthy adults without neurological diagnoses is of note. These individuals were not demented, lived independently, and had responded regularly to online surveys for the previous 4 years. Continued follow-up of these participants will serve to detect clinically meaningful changes in cognition as they occur. Finally, we found that agreement between self-reported and sensor-based activity was achieved by a minority of participants (24%). One-half of usable survey responses had partial overlap with data based on time-stamped sensor firings. Fully one quarter of the respondents reported activity that had no correspondence to sensor data for the same time interval.
While self-report instruments are a cost-effective means of gathering large amounts of data, and can be adapted for different populations and research applications, limitations of this type of assessment have been recognized and described in multiple research settings. van Uffelen, Heesch, Hill, and Brown (2011) reported “poor to fair agreement” between self-reported and objectively measured sedentary activity in older adults. They cited respondents’ difficulty recalling actual activities as opposed to typical or average time spent sitting, and failure to restrict their responses to the time span in question. Furthermore, they noted a tendency to report activities that were given as questionnaire examples and to omit activities that were not a regular part of their schedule. Similarly, in a comparison of self-reported and objective measures of driving experience, Blanchard, Myers, and Porter (2010) found that older adults made significant errors in both under- and overestimating miles driven per week, and recommended use of multiple measures from different sources for maximizing accuracy of data.
Bolstered by previous research documenting the limitations of self-report as a sole source of behavioral data, we suggest that the present path of investigation as a whole highlights the need to seek more reliable methods of data collection whenever possible, especially in older populations. We are not the first to call for more cost-effective, scalable, and reproducible measures to describe daily activity and to detect subtle changes in those activities. “Smart home” technologies aimed at prolonging safety and independence in older adults, and identifying the earliest signs of physical or cognitive decline, have been extensively described and reviewed elsewhere (Alwan, 2009; Brownsell et al., 2011; Mahoney et al., 2007; Reeder et al., 2013). However, few have been deployed in long-term, large-scale studies. Reeder et al. (2013) reported an attempt to define the relationship between self-report and sensor-based measures of activity in a small group of older adults living in an independent retirement community. While they gathered important data on the acceptability of in-home sensors, difficulties with technical issues precluded the intended comparisons.
Physical activity assessed by remote sensing in the home is just one objective metric suitable for use as a marker of incipient decline in older adults. Recently, Kaye et al. (2014) demonstrated that daily, continuous monitoring of computer use can detect change in function over time, despite relative stability on more traditional functional status measures. Similarly, subtle changes in medication use (Hayes, Larimer, Adami, & Kaye, 2009), walking speed (Dodge, Mattek, Austin, Hayes, & Kaye, 2012; Silbert, 2012), sleep patterns (Dodge et al., 2012), and overall activity level (Kaye et al., 2011) have been identified by in-home monitoring technologies as potential early signs of incipient cognitive decline. In a related finding, Thielke et al. (2014) report associations between monitored in-home behaviors and self-reported low mood.
There are current limitations to the approach reported here. The current state of sensor-derived data analysis has yet to successfully address the issue of multiple person dwellings; thus, our comparison of self-reports with sensed data was limited to participants living alone. Future research will benefit from the development of analytic strategies that can reconstruct the activities and movements of multiple persons within the same space. In addition, the participants recruited for these kinds of studies are generally “early adopters” of technology and may not be representative of the broader older adult population. As aging cohorts have more experience with computers and in-home technology, it can be expected that those findings will be more generalizable to the older adult population as a whole. In particular, the inclusion of a more diverse group of older adults will be essential in moving this work forward.
We propose that the traditional methods of assessment are not ideally suited to detecting subtle but potentially meaningful changes in behavior. Continuous sensor-based assessment acquires data in a way that minimizes memory deficits and recall biases, by taking place in real time in the home environment rather than being subject to retrospective, episodic clinic-based reporting. While the difficulties inherent in self-report retrospective data collection are not limited to older adults, the prevalence of memory impairment in this cohort further weakens the reliability of such methods. We believe that due to the inherent inaccuracies of self-report data, behavioral research can be substantially advanced by using sensor-based pervasive computing assessment methods that have been shown capable of detecting subtle changes that might signal early cognitive decline. Finally, as applications of in-home technology move from research to the clinical practice setting, reliance on more sensitive and accurate behavioral profiles can serve to improve patient care.
Footnotes
Acknowledgements
We are grateful to all the participants and research staff involved in this project.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by grants from the National Institutes of Health [P30AG024978, R01AG024059, P30AG008017].
