Pilot Validation of Ambulatory Activity Monitors for Sleep Measurement in Huntington’s Disease Gene Carriers

Abstract

Sleep disturbance occurs early in Huntington’s disease (HD). Consumer- and research-grade activity monitors may enable routine assessment of sleep disturbances in HD. We compared Actiwatch Spectrum Pro, Jawbone UP2 and Fitbit One to the gold standard, polysomnography, in four late presymptomatic and three early HD participants. Compared to polysomnography, all ambulatory monitors overestimated total sleep time by >60 minutes and sleep efficiency by ∼15%. Thus, for assessment of specific sleep parameters in HD, none of the activity monitors are sufficiently accurate to replace polysomnography, although they may be sufficient for estimating overall sleep-wake patterns. Larger sample replication is required.

Keywords

Huntington’s disease, sleep, actigraphy, ambulatory monitoring, validation

INTRODUCTION

Sleep disruption is one of the earliest symptoms of Huntington’s disease (HD), emerging up to 10 years prior to diagnosis [1]. Chronic insufficient sleep in this population may contribute to cognitive impairment or decline, some neuropsychiatric symptoms [2–4], and an increased rate of neurodegeneration [5]. Sleep problems are prevalent in people with HD, and perceived by them as contributing to the disease burden [6]. Sleep dysfunction is greater in more advanced disease [1]. Ongoing sleep monitoring using non-invasive techniques has the potential to clarify how sleep disturbances advance with the disease and how they relate to disease progression, particularly to cognitive function.

At present, actigraphy and consumer wearable activity monitors, such as Fitbit or Jawbone, might be the most suitable assessment tools for this task. These devices estimate sleep-wake patterns by detecting wrist movements, and can be deceived by an absence of movement during wake, or by excessive movement during sleep. The precision of actigraphic sleep estimates varies between populations and between monitor models; validation of each specific model of actigraph, for each population, has therefore been recommended [7]. In healthy populations, actigraphy and consumer wearables have shown reasonable validity [8–13]. In clinical samples with disrupted sleep, validity is reduced, depending on the level and characteristics of sleep disturbance [14–21]. To date, no published study has reported comparable statistics that would allow the validity of actigraphy and/or consumer wearables to be assessed in HD populations.

Poor sleep quality is common in HD populations [22] and, together with the characteristic movement disorder, underscores the importance of validating actigraphy and consumer wearables before research and/or clinical use. Therefore, we evaluated the validity of Actiwatch Spectrum Pro, Jawbone UP2 and Fitbit One, in comparison to lab-based polysomnography as the gold standard for sleep measurement in late premanifest and early stage HD.

MATERIALS AND METHODS

Participants

Seven Huntington’s gene carriers, all Caucasian (6 females, 1 male; M_age = 54.14±6.4, M_CAG = 42.6), were recruited from the Monash University HD research volunteer database. The participants’ disease severity ranged from presymptomatic (n = 4; M_{Disease Burden Score} = 333.4) to early symptomatic (n = 3; M_{duration of illness} = 3.2 years). The Unified Huntington’s Disease Rating Scale Total Functional Capacity score ranged from 9 to 13, and the Total Motor Score ranged from 0 to 19 (see Supplementary material 1 for more clinical characteristics and medication list). Exclusion criteria included: a) current participation in clinical drug trials; b) concomitant major neurological, psychiatric, or severe medical illness; c) a history of traumatic brain injury; d) drug or alcohol abuse; e) shift work; f) travel across time zones within the previous 3 months; and g) regular consumption of >300 mg caffeine per day, or of ≥4 standard alcoholic drinks in one sitting or ≥2 a day. The study was approved by the Monash University Human Research Ethics Committee. All participants provided written informed consent.

Self-reported sleep quality

Participants completed two self-report measures: the Pittsburgh Sleep Quality Index (PSQI) [23], measuring global sleep quality, and the Insomnia Severity Index (ISI) [24], measuring severity of insomnia symptoms. On average, participants reported poor sleep quality (PSQI total = 10.6±6.1), and subthreshold insomnia (ISI = 10.4±10.5).

Polysomnography

We recorded standard clinical polysomnography using Compumedics Grael High Definition (Compumedics Limited, Australia). Sleep was scored according to the American Academy of Sleep Medicine criteria [25] in 1 min epochs.

Actigraphy, Jawbone UP2 and Fitbit One

The consumer wearable monitors, Jawbone UP2 and Fitbit One, were set to default settings, and their companies’ specialised software provided sleep scored in 1 min epochs. To keep data comparable, the Actiwatch Spectrum Pro (Philips/Respironics, Murrysville, PA) was also set to collect data in 1 min epochs using default settings: a medium sensitivity threshold (40 counts), and a 10 min immobility rule with ≤1 epoch scored as wake for automatic scoring of sleep onset and offset.

Procedure

At their habitual sleep-wake times, participants underwent overnight laboratory polysomnography while wearing the Actiwatch SP, Jawbone UP2 and Fitbit One on their non-dominant wrist (see Supplementary material 6).

Data processing and analysis

We aligned each monitor’s data with polysomnography and analysed the period from lights-out until lights-on. From the sleep-wake activity recorded by polysomnography and by each monitor, we calculated the following sleep parameters: total sleep time – time asleep; sleep latency – time until the first 10 min of inactivity with ≤1 epoch scored as wake; sleep efficiency – percentage of sleep epochs between lights-out and lights-on; and wake after sleep onset – time awake between initial sleep onset and final awakening.
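The parameter definitions above can be illustrated with a minimal Python sketch. It is not the scoring software used in the study: the function names, the 1/0 coding of sleep and wake, and the requirement that the 10 min onset window begins on a sleep epoch are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

EPOCH_MIN = 1  # epoch length in minutes (the study used 1 min epochs)


def sleep_onset_index(epochs: List[int], window: int = 10,
                      max_wake: int = 1) -> Optional[int]:
    """Return the index of the first sleep epoch that opens a `window`-epoch
    run containing at most `max_wake` wake epochs (assumed onset rule)."""
    for start in range(len(epochs) - window + 1):
        block = epochs[start:start + window]
        if epochs[start] == 1 and block.count(0) <= max_wake:
            return start
    return None


def sleep_parameters(epochs: List[int]) -> Dict[str, Optional[float]]:
    """Summarise one night; epochs are 1 = sleep, 0 = wake, from lights-out
    to lights-on."""
    if not epochs:
        raise ValueError("at least one epoch is required")
    n_sleep = sum(epochs)
    result: Dict[str, Optional[float]] = {
        "tst_min": float(n_sleep * EPOCH_MIN),      # total sleep time
        "se_pct": 100.0 * n_sleep / len(epochs),    # sleep efficiency
        "sl_min": None,                             # sleep latency
        "waso_min": None,                           # wake after sleep onset
    }
    onset = sleep_onset_index(epochs)
    if onset is None:
        return result  # no sleep onset found; latency and WASO undefined
    # Final awakening is taken as the epoch after the last sleep epoch.
    last_sleep = len(epochs) - 1 - epochs[::-1].index(1)
    result["sl_min"] = float(onset * EPOCH_MIN)
    result["waso_min"] = float(epochs[onset:last_sleep + 1].count(0) * EPOCH_MIN)
    return result


# Hypothetical 60 min recording: 5 min awake, 20 asleep, 3 awake, 30 asleep, 2 awake.
night = [0] * 5 + [1] * 20 + [0] * 3 + [1] * 30 + [0] * 2
params = sleep_parameters(night)
```

Applying the same function to the polysomnography hypnogram and to each monitor’s epoch series would yield the paired estimates that the statistical comparison requires.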

Statistical analyses

Estimates of sleep parameters from the ‘gold standard’ polysomnography were compared to estimates from Actiwatch, Jawbone and Fitbit using paired t-tests. We used the Bland-Altman method [26] to assess agreement between monitoring methods. We set a priori clinical agreement limits to ±30 min for total sleep time, sleep latency, and wake after sleep onset, and to ±5% for sleep efficiency [21, 27].
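As an illustration of the agreement analysis, the Python sketch below computes the Bland-Altman bias and 95% limits of agreement (bias ± 1.96 SD of the paired differences) and compares the bias with an a priori clinical limit. The function name and the data are hypothetical and are not the study’s values.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Union


def bland_altman(monitor: Sequence[float], psg: Sequence[float],
                 clinical_limit: float) -> Dict[str, Union[float, bool]]:
    """Bias and 95% limits of agreement for paired monitor/PSG estimates."""
    if len(monitor) != len(psg) or len(monitor) < 2:
        raise ValueError("need paired samples with n >= 2")
    diffs = [m - p for m, p in zip(monitor, psg)]  # monitor minus PSG
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences
    return {
        "bias": bias,
        "sd": sd,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        # True when the mean error stays inside the a priori clinical limit
        "within_clinical_limit": abs(bias) <= clinical_limit,
    }


# Made-up total sleep time values (min) for seven participants.
monitor_tst = [420, 455, 390, 470, 430, 445, 400]
psg_tst = [350, 400, 300, 380, 390, 360, 330]
agreement = bland_altman(monitor_tst, psg_tst, clinical_limit=30)
```

In this made-up example the bias exceeds the ±30 min limit, so the monitor would be judged not to agree with polysomnography for total sleep time.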

We assessed epoch-by-epoch concordance between polysomnography and each monitor by determining sensitivity, specificity, accuracy, predictive value for sleep, predictive value for wake, and the Prevalence and Bias-Adjusted Kappa (PABAK). PABAK gives balanced weight to sleep and wake epochs [28], correcting for the overrepresentation of sleep epochs. PABAK’s strength of agreement was interpreted using Landis and Koch’s guidelines [29].
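The epoch-by-epoch measures can be sketched in Python as follows. This assumes sleep is coded 1 and wake 0, treats sleep as the "positive" class, and uses the standard two-category form PABAK = 2 × observed agreement − 1; the exact computation used in the study is not stated, so the function names and example data here are illustrative only.

```python
from typing import Dict, Sequence


def _ratio(num: int, den: int) -> float:
    """Safe division; returns NaN when the denominator is zero."""
    return num / den if den else float("nan")


def epoch_agreement(psg: Sequence[int], monitor: Sequence[int]) -> Dict[str, float]:
    """Epoch-by-epoch agreement with PSG as reference (1 = sleep, 0 = wake)."""
    if len(psg) != len(monitor) or not psg:
        raise ValueError("need two non-empty series of equal length")
    pairs = list(zip(psg, monitor))
    tp = sum(1 for p, m in pairs if p == 1 and m == 1)  # sleep scored as sleep
    tn = sum(1 for p, m in pairs if p == 0 and m == 0)  # wake scored as wake
    fp = sum(1 for p, m in pairs if p == 0 and m == 1)  # wake scored as sleep
    fn = sum(1 for p, m in pairs if p == 1 and m == 0)  # sleep scored as wake
    n = len(pairs)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, n),
        "pvs": _ratio(tp, tp + fp),   # predictive value for sleep
        "pvw": _ratio(tn, tn + fn),   # predictive value for wake
        "pabak": _ratio(2 * (tp + tn) - n, n),
    }


def landis_koch(kappa: float) -> str:
    """Strength-of-agreement label following Landis and Koch's bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical 10-epoch example in which the monitor misses half of the wake epochs.
psg_epochs = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
monitor_epochs = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
stats = epoch_agreement(psg_epochs, monitor_epochs)
```

The example reproduces the typical actigraphy pattern in miniature: every sleep epoch is detected, but only half of the wake epochs are.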

RESULTS

Estimates of the sleep parameters by the monitors differed significantly from polysomnography, with each monitor showing similar patterns (Table 1). Actiwatch significantly overestimated total sleep time by 74 min (t = 3.60, p = 0.011, d = 1.36) and sleep efficiency by 14.8% (t = 3.54, p = 0.012, d = 1.34). Jawbone overestimated total sleep time by 78.7 min (t = 4.07, p = 0.007, d = 1.54) and sleep efficiency by 16.3% (t = 4.25, p = 0.005, d = 1.6), and underestimated wake after sleep onset by 36 min (t = –3.38, p = 0.015, d = –1.28). Fitbit overestimated total sleep time by 88.1 min (t = 4.93, p = 0.003, d = 1.9) and sleep efficiency by 17.4% (t = 4.64, p = 0.004, d = 1.8), and underestimated wake after sleep onset by 39 min (t = –3.55, p = 0.012, d = –1.34).

Table 1

Results of parametric paired comparisons of sleep parameters between PSG and monitors

	M difference	SD	95% CI lower	95% CI upper
Actiwatch Spectrum Pro – PSG
Total Sleep Time (min)	74.0 ^*	54.4	23.7	124.3
Sleep Efficiency (%)	14.8 ^*	11.0	4.6	25.0
Sleep Latency (min)	–23.0	26.4	–47.5	1.5
Sleep Latency exc. outlier (min)	–14	12.6	–27.2	–0.8
Wake After Sleep Onset (min)	–20.0	32.2	–49.8	9.8
Jawbone UP2 – PSG
Total Sleep Time (min)	78.7 ^**	51.2	31.4	126.0
Sleep Efficiency (%)	16.3 ^**	10.1	6.9	25.6
Sleep Latency (min)	–19.3	28.8	–46.0	7.4
Sleep Latency exc. outlier (min)	–9.2	11.8	–21.5	3.2
Wake After Sleep Onset (min)	–36.0^*	28.2	–62.1	–9.9
Fitbit One – PSG
Total Sleep Time (min)	88.1 ^**	47.3	44.4	131.9
Sleep Efficiency (%)	17.4 ^**	9.9	8.2	26.6
Sleep Latency (min)	–17.1	27.4	–42.5	8.2
Sleep Latency exc. outlier (min)	–8	14.1	–22.8	6.8
Wake After Sleep Onset (min)	–39.0^*	29.1	–65.9	–12.1

Note. Significantly different compared to polysomnography at ^*p < 0.025, ^**p < 0.01 (two-tailed). Estimation errors beyond a priori clinical agreement limits are in bold. PSG – polysomnography.
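The summary statistics in Table 1 can be cross-checked against the reported t and d values. The Python sketch below assumes n = 7 paired observations, an effect size defined as the mean difference divided by the SD of the differences, and the tabulated two-tailed 5% critical value of Student’s t for 6 degrees of freedom (2.447); the reported values are consistent with these assumptions, although the source does not state the formulas explicitly.

```python
from math import sqrt
from typing import Dict

N_PARTICIPANTS = 7
T_CRIT_DF6 = 2.447  # two-tailed 5% critical value of Student's t, df = 6


def paired_summary(mean_diff: float, sd_diff: float,
                   n: int = N_PARTICIPANTS,
                   t_crit: float = T_CRIT_DF6) -> Dict[str, float]:
    """Paired t statistic, effect size and 95% CI from summary statistics."""
    se = sd_diff / sqrt(n)  # standard error of the mean difference
    return {
        "t": mean_diff / se,
        "d": mean_diff / sd_diff,
        "ci_lower": mean_diff - t_crit * se,
        "ci_upper": mean_diff + t_crit * se,
    }


# Actiwatch Spectrum Pro vs PSG, total sleep time (Table 1): M = 74.0, SD = 54.4
check = paired_summary(74.0, 54.4)
```

For the Actiwatch total sleep time row, this gives t ≈ 3.60, d ≈ 1.36 and a 95% CI of approximately 23.7 to 124.3 min, matching the values reported in the text and in Table 1.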

Bland-Altman analyses (Supplementary materials 2-4) showed that, compared to the polysomnography gold standard, the average estimation errors (bias) of all three monitors for total sleep time and sleep efficiency fell outside the clinical agreement limits. For wake after sleep onset, this was the case only for Jawbone and Fitbit. Total sleep time was overestimated by Actiwatch in 86% of cases, and by Jawbone and Fitbit in 71% of cases; sleep efficiency was overestimated by Actiwatch, Jawbone and Fitbit in 86% of cases; wake after sleep onset was underestimated by Actiwatch in 29% of cases, and by Jawbone and Fitbit in 43% of cases. All monitors showed a trend towards larger estimation errors of sleep parameters in participants with poorer sleep, as seen in the regression lines in Supplementary materials 2 and 3. Descriptively, all monitors showed slightly smaller estimation errors in the early HD group (Supplementary material 5), although all comparisons between presymptomatic and symptomatic subgroups are exploratory only due to the very small sample sizes.

All monitors showed high sensitivity in determining sleep, low and substantially varied specificity in identifying wake, and a satisfactory level of accuracy. PABAK scores fell at the low end of substantial agreement between PSG and all monitors (Table 2).

Table 2

Epoch by epoch agreement analyses

	Sensitivity	Specificity	Accuracy	PVS	PVW	PABAK
Actiwatch SP
M	0.97	0.31	0.80	97.1	30.9	0.62
Range	0.91–1	0.08–0.57	0.69–0.89
Jawbone UP2
M	0.99	0.34	0.83	99.5	33.7	0.65
Range	0.96–1	0.20–0.69	0.69–0.94
Fitbit One
M	0.99	0.27	0.81	99.0	26.9	0.62
Range	0.97–1	0.12–0.55	0.68–0.93

Note. Sensitivity – the proportion of PSG-classified sleep epochs correctly identified by actigraphy; Specificity – the proportion of PSG-classified wake epochs correctly identified by actigraphy; Accuracy – the proportion of all epochs correctly identified by actigraphy; PVS – predictive value for sleep; PVW – predictive value for wake; PABAK – Prevalence and Bias-Adjusted Kappa. The range represents the spread of values of sensitivity, specificity and accuracy across the participants.

DISCUSSION

Actiwatch SP, Jawbone UP2 and Fitbit One exhibited similar patterns across sleep parameters, significantly overestimating total sleep time by >60 minutes, and sleep efficiency by ∼15%, while underestimating wake after sleep onset by >30 minutes (Table 1), thus failing to meet our pre-specified threshold for acceptable levels of clinical agreement. The only exceptions were the Actiwatch showing acceptable agreement with polysomnography on wake after sleep onset, and all monitors showing acceptable agreement with polysomnography for sleep latency, although these were still notably underestimated. The unexpected observation of marginally more precise estimation of sleep parameters by the monitors in the symptomatic subgroup (Supplementary material 5) seems to contradict the trend towards reduced monitor validity with sleep deterioration (Supplementary materials 2, 3). This subgroup had worse subjective sleep as shown by the PSQI (Supplementary material 1); however, subjective ratings are rarely good predictors of objective sleep [30]. This observation is also only descriptive in nature, due to the small size of the subgroups.

Overall, these results are similar to those of other studies [14, 31], in which epoch-by-epoch comparisons with polysomnography showed consistently high sensitivity for sleep detection, and varied and low specificity for wake identification (see Table 2). PABAK showed acceptable consistency between polysomnography and each of the monitors, but only just reached the acceptable level. Accuracy, which accounts for both sleep and wake, was acceptable for participants with better levels of sleep efficiency, but dropped below acceptable levels for participants with low sleep efficiency (<72%), suggesting a reduction in the monitors’ validity as sleep deteriorates (Supplementary materials 2-4). This could have important consequences for longitudinal studies in HD, where sleep is expected to deteriorate over time.

Our small sample might not reflect the true population’s objective sleep at this stage of HD. It is unlikely, though, that sleep quality in our sample was better than average, given the disruptive nature of the “first laboratory night effect” [32] and the possible insomnia side effects of medication used by two of our participants. If we captured worse than average sleep quality, then it provided a greater test for the monitors. However, our observations are consistent with patterns shown in other studies in clinical populations of similar age and clinical characteristics (i.e., overestimating total sleep time and sleep efficiency, and underestimating wake after sleep onset) [14–19, 21]. One caveat is that our sample had very little chorea, so we were unable to ascertain the impact of chorea on the accuracy of the monitors, although it is reasonable to assume that chorea will adversely affect monitor accuracy. Accuracy could potentially be improved by utilising more sensitive activity thresholds for determining wake periods on the Fitbit One and Actiwatch SP (see Parkinson’s study [14]); however, Jawbone UP2 does not provide an option for changing activity thresholds. Another option is to change the sampling rate to 30 sec epochs on Actiwatch SP, which is not available for Jawbone or Fitbit.

Overall, we demonstrated that compared to polysomnography, Actiwatch SP, Fitbit One and Jawbone UP2 produce less accurate estimates of specific sleep parameters in HD. Nonetheless, in the absence of inexpensive alternatives to polysomnography that can be widely applied in patients’ homes, consumer-grade wearables may be sufficient for overall estimations of sleep-wake patterns, and/or to assess gross level changes over time.

CONFLICT OF INTEREST

The authors have no conflict of interest to report.

ACKNOWLEDGMENTS

The authors would like to thank all the participants for their contribution to this study. The authors would also like to acknowledge the team members of the Monash University Sleep and Circadian Medicine Laboratory, who provided training, support and night shift supervision. Special thanks go to Christopher Andara and Parisa Vidafar.

This research was funded by Monash University.

The supplementary material is available in the electronic version of this article.

References

1. Lazar, Panin, Goodman, et al. Sleep deficits but no metabolic deficits in premanifest Huntington’s disease. Ann Neurol. 2015;78:630–48. doi: 10.1002/ana.24495
2. Videnovic, Leurgans, Fan, et al. Daytime somnolence and nocturnal sleep disturbances in Huntington disease. Parkinsonism Relat Disord. 2009;15:471–4. doi: 10.1016/j.parkreldis.2008.10.002
3. Baker, Domínguez DJF, Stout, et al. Subjective sleep problems in Huntington’s disease: A pilot investigation of the relationship to brain structure, neurocognitive, and neuropsychiatric function. J Neurol Sci. 2016;364:148–53. doi: 10.1016/j.jns.2016.03.021
4. Aziz, Anguelova, Marinus, et al. Sleep and circadian rhythm alterations correlate with depression and cognitive impairment in Huntington’s disease. Parkinsonism Relat Disord. 2010;16:345–50. doi: 10.1016/j.parkreldis.2010.02.009
5. Xie, Kang, Xu, et al. Sleep drives metabolite clearance from the adult brain. Science. 2013;342(6156):373–7. doi: 10.1126/science.1241224
6. Taylor, Bramble. Sleep disturbance and Huntingdon’s disease. Br J Psychiatry. 1997;171:393c. doi: 10.1192/bjp.171.4.393c
7. Sadeh. The role and validity of actigraphy in sleep medicine: An update. Sleep Med Rev. 2011;15:259–67. doi: 10.1016/j.smrv.2010.10.001
8. Morgenthaler, Alessi, Friedman, et al. Practice parameters for the use of actigraphy in the assessment of sleep and sleep disorders: An update for 2007. Sleep. 2007;30:519–29. doi: 10.1093/sleep/30.4.519
9. Kanady, Drummond, Mednick. Actigraphic assessment of a polysomnographic-recorded nap: A validation study. J Sleep Res. 2011;20:214–22. doi: 10.1111/j.1365-2869.2010.00858.x
10. Cellini, Buman, McDevitt, et al. Direct comparison of two actigraphy devices with polysomnographically recorded naps in healthy young adults. Chronobiol Int. 2013;30:691–8. doi: 10.3109/07420528.2013.782312
11. De Souza, Benedito-Silva, Nogueira, et al. Further validation of actigraphy for sleep studies. Sleep. 2003;26:81–5. doi: 10.1093/sleep/26.1.81
12. Paquet, Kawinska, Carrier. Wake detection capacity of actigraphy during sleep. Sleep. 2007;30:1362–9. doi: 10.1093/sleep/30.10.1362
13. Rupp, Balkin. Comparison of Motionlogger Watch and Actiwatch actigraphs to polysomnography for sleep/wake estimation in healthy young adults. Behav Res Methods. 2011;43:1152–60. doi: 10.3758/s13428-011-0098-4
14. Maglione, Liu, Neikrug, et al. Actigraphy for the assessment of sleep measures in Parkinson’s disease. Sleep. 2013;36:1209–17. doi: 10.5665/sleep.2888
15. van de Wouw, Evenhuis, Echteld. Comparison of two types of Actiwatch with polysomnography in older adults with intellectual disability: A pilot study. J Intellect Dev Disabil. 2013;38:265–73. doi: 10.3109/13668250.2013.816274
16. Kushida, Chang, Gadkary, et al. Comparison of actigraphic, polysomnographic, and subjective assessment of sleep parameters in sleep-disordered patients. Sleep Med. 2001;2:389–96. doi: 10.1016/S1389-9457(00)00098-8
17. Marino, Li, Rueschman, et al. Measuring sleep: Accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep. 2013;36:1747–55. doi: 10.5665/sleep.3142
18. Sivertsen, Omvik, Havik, et al. A comparison of actigraphy and polysomnography in older adults treated for chronic primary insomnia. Sleep. 2006;29:1353–8. doi: 10.1093/sleep/29.10.1353
19. Taibi, Landis, Vitiello. Concordance of polysomnographic and actigraphic measurement of sleep and wake in older women with insomnia. J Clin Sleep Med. 2013;9:217–25. doi: 10.5664/jcsm.2482
20. Montgomery-Downs, Insana, Bond. Movement toward a novel activity monitoring device. Sleep Breath. 2012;16:913–7. doi: 10.1007/s11325-011-0585-y
21. de Zambotti, Claudatos, Inkelis, et al. Evaluation of a consumer fitness-tracking device to assess sleep in adults. Chronobiol Int. 2015;32:1024–8. doi: 10.3109/07420528.2015.1054395
22. Goodman, Barker. How vital is sleep in Huntington’s disease? J Neurol. 2010;257:882–97. doi: 10.1007/s00415-010-5517-4
23. Buysse, Reynolds, Monk, et al. The Pittsburgh Sleep Quality Index: A new instrument for psychiatric practice and research. Psychiatry Res. 1989;28:193–213. doi: 10.1016/0165-1781(89)90047-4
24. Bastien, Vallieres, Morin. Validation of the Insomnia Severity Index as an outcome measure for insomnia research. Sleep Med. 2001;2:297–307. doi: 10.1016/S1389-9457(00)00065-4
25. Berry, Brooks, Gamaldo, Harding, Lloyd, Marcus, et al. The AASM manual for the scoring of sleep and associated events: Rules, terminology and technical specifications. Darien, Illinois: American Academy of Sleep Medicine; 2016.
26. Bland, Altman. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;327(8476):307–10. doi: 10.1016/S0140-6736(86)90837-8
27. Meltzer, Hiruma, Avis, et al. Comparison of a commercial accelerometer with polysomnography and actigraphy in children and adolescents. Sleep. 2015;38:1323–30. doi: 10.5665/sleep.4918
28. Byrt, Bishop, Carlin. Bias, prevalence and kappa. J Clin Epidemiol. 1993;46:423–9. doi: 10.1016/0895-4356(93)90018-V
29. Landis, Koch. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74. doi: 10.2307/2529310
30. Zhang, Zhao. Objective and subjective measures for sleep disorders. Neurosci Bull. 2007;23:236–40. doi: 10.1007/s12264-007-0035-9
31. Toon, Davey, Hollis, Nixon, Horne, Biggs. Comparison of commercial wrist-based and smartphone accelerometers, actigraphy, and PSG in a clinical cohort of children and adolescents. J Clin Sleep Med. 2016;12:343–50. doi: 10.5664/jcsm.5580
32. Agnew HWJ, Webb, Williams. The first night effect: An EEG study of sleep. Psychophysiology. 1966;2:263–6. doi: 10.1111/j.1469-8986.1966.tb02650.x