Abstract
PURPOSE:
Aside from typical symptoms such as dizziness and vertigo, persons with vestibular disorders often have cognitive and motor problems. These symptoms have been assessed in single-task condition. However, dual tasks assessing cognitive-motor interference might offer added value, as they better reflect daily life situations. Therefore, the 2BALANCE protocol was developed. In the current study, the test-retest reliability of this protocol was assessed.
METHODS:
The 2BALANCE protocol was performed twice in 20 healthy young adults with an in-between test interval of two weeks. Two motor tasks and five different cognitive tasks were performed in single and dual-task condition. Intraclass correlation coefficients (ICC), the standard error of measurement, and the minimal detectable difference were calculated.
RESULTS:
All cognitive tasks, with the exception of the mental rotation task, had favorable reliability results (0.26≤ICC≤0.91). The dynamic motor task indicated overall substantial reliability values in all conditions (0.67≤ICC≤0.98). Similar results were found for the static motor task during dual-tasking (0.50≤ICC≤0.92), but were slightly lower in single-task condition (–0.26≤ICC≤0.75).
CONCLUSIONS:
The 2BALANCE protocol was overall consistent across trials. However, the mental rotation task showed the lowest reliability values.
Introduction
Postural balance and gaze stabilization are mediated by a complex multisensory network of visual, proprioceptive, somatosensory, and vestibular input. Suboptimal functioning of one or more of these input systems might compromise stable and safe stance and ambulation. Consequently, persons suffering from peripheral vestibular disorders might show aberrant postural control and gait characteristics, such as increased postural sway, stance variability and swing time, and decreased gait speed [3, 36]. These postural disturbances can be partially attributed to alterations in the three vestibular reflex pathways (i.e. vestibulo-ocular, vestibulospinal, and vestibulocervical).
In some cases, pharmaceutical or surgical interventions can treat peripheral vestibular disorders; however, physiotherapy is usually the main therapeutic approach, as lost vestibular function cannot be regained. Nevertheless, complaints such as problems with concentration and attention, short-term memory loss, and problems with multitasking often remain, which could all indicate cognitive fatigue [6, 41]. Additionally, although balance and gait exercises in a therapeutic setting often alleviate motor symptoms, they differ from everyday situations, which often require adequate cognitive-motor dual-task (DT) performance. Consequently, the motor confidence experienced in the controlled therapeutic environment may decrease in everyday situations, which might lead to an increased fall risk [15]. This feeling of unsafety may lead to anxiety and stress [21], and can play a major role in maintaining primary complaints. Additionally, avoiding the provocative context may impede participation in physical and societal activities, thereby further hampering the level of physical performance, and again increasing fall risk.
In healthy adults, based on the attentional capacity model, everyday DTs often do not exceed their total cognitive and attentional capacity, leading to adequate performance on both tasks [26]. However, in persons with vestibular hypofunction, postural motor tasks cease to be automatic and therefore require a certain cognitive capacity [31]. This might decrease the cognitive reserve to perform both motor and cognitive tasks adequately.
Moreover, even in single-task (ST) setting, without the vestibular system being challenged, problems with visuospatial cognition, attention, memory, processing speed, and executive function have been observed in persons with vestibular hypofunction [6, 32]. This can be explained by a multitude of neural networks that are also involved in cognitive processes. These networks extend beyond the vestibular reflex pathways and are dispersed throughout subcortical and cortical areas, with the hippocampus playing a pivotal role [7, 42].
Because of these motor and cognitive symptoms in persons with vestibular hypofunction, DT performance might be disproportionately impaired compared to cognitive and motor performance in ST condition. Such a decrease in DT performance has already been reported in other populations, such as persons with Parkinson’s and Alzheimer’s disease [13]. However, in the vestibular-impaired population, such studies are scarce and their outcomes are very heterogeneous [2, 41]. We believe that DT assessment has great potential to shed light on the difficulties experienced in daily life, for which separate laboratory motor and cognitive assessment might not be sufficiently sensitive. Additionally, diagnosis and therapy currently focus mainly on motor complaints, while cognitive complaints are often overlooked. DTs might indicate the domain each individual struggles with most, which could then be used as a starting point for individualized rehabilitation.
This led to the development of the 2BALANCE protocol [14]. Before implementation in patients with vestibular disorders, the outcome measures should be consistent across trials in the healthy population. Therefore, the test-retest reliability was assessed in 20 healthy adults. To minimize the burden on the participants’ cognitive and attentional resources, an optimal test duration was investigated by limiting the number of test items for each cognitive task, without substantially compromising the test-retest reliability.
Methods
Participants
Twenty healthy adults, ranging from 19 to 32 years old, were recruited from the general population by means of convenience sampling. This sample size was based on calculations by Bujang and Baharum (2017), according to which 15 subjects should suffice for intraclass correlation coefficient (ICC) values of 0.6 and higher, given two observations per subject and a power of 80% [8]. A male-to-female ratio of 1:1 was applied, with mean ages of 25.2 and 24.9 years, respectively. Factors with a possible impact on cognitive-motor performance, such as vestibular, auditory, motor, developmental, or affective complaints or disorders, or color blindness, were used as exclusion criteria and queried using an anamnestic questionnaire. Additionally, persons with a score of 25 or less on the Montreal Cognitive Assessment were excluded. Finally, all participants scored within the normal values on the Dizziness Handicap Inventory (DHI), the Activities-specific Balance Confidence scale (ABC), the Hospital Anxiety and Depression Scale (HADS), the Falls Efficacy Scale-International (FES-I), the Standard Assessment of Negative Affectivity, Social Inhibition, and Type D Personality (DS14), the Headache Impact Test (HIT), the Tinnitus Handicap Inventory (THI), and the general condition questionnaire (algemene toestandslijst).
Test protocol
The 2BALANCE protocol consisted of a series of cognitive-motor DTs comprising two motor tasks and five cognitive tasks, all assessing a different cognitive domain or modality (Fig. 1). Each cognitive and motor task was additionally performed in ST condition. All participants were instructed to perform both tasks to the best of their abilities, and were not asked to prioritize either task. Testing took place in the morning, to limit the influence of fatigue. The test-retest interval was exactly two weeks and both sessions started at the same time of day. The order of all motor and cognitive tests was randomized between subjects to account for possible order effects. Within subjects, the same randomization was used for the first and second test sessions. The test protocol is briefly discussed below. A more detailed description of each subtest can be found in Danneels et al. (2020) and at clinicaltrials.gov with identifier NCT04126798 [14].
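The randomization scheme described above (a different test order per participant, but the same order in both sessions) could be sketched as follows. This is only an illustration: the block labels, the function name `session_order`, and the idea of seeding a random generator with the participant ID are assumptions, not the procedure documented for the protocol.

```python
import random

# Hypothetical block labels for illustration only; the actual subtests
# are described in Danneels et al. (2020) [14].
TEST_BLOCKS = [
    "static motor ST", "dynamic motor ST",
    "corsi block", "mental rotation",
    "visual Stroop", "auditory Stroop",
    "visual BDRT", "auditory BDRT", "coding task",
]


def session_order(participant_id, blocks=TEST_BLOCKS):
    """Return a test order that differs between participants but is
    identical every time it is requested for the same participant."""
    rng = random.Random(participant_id)  # assumption: the ID serves as the seed
    order = list(blocks)                 # copy, so the master list stays untouched
    rng.shuffle(order)
    return order


# The same participant receives the same order at test and retest.
first_session = session_order(7)
second_session = session_order(7)
print(first_session == second_session)
```

Seeding with the participant ID keeps the order reproducible without having to store it, which is one simple way to guarantee an identical sequence two weeks later.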

Fig. 1. Visual representation of the 2BALANCE protocol. Two motor tasks are performed: a static motor task consisting of balancing on a force platform and a dynamic motor task consisting of walking at a self-selected speed on the GAITRite Walkway. The cognitive tasks assess visuospatial memory (corsi block), response inhibition (visual and auditory Stroop task), mental rotation (mental rotation task), processing speed (coding task), and working memory (visual and auditory backward digit recall test).
For most cognitive tasks, the percentage of correct responses (%), the response time (sec), and the processing time (sec) were calculated. For the coding task, the total number of correct responses was assessed (items/minute). For the corsi block and the backward digit recall test (BDRT), processing time was only calculated in case of correct responses. All test sessions were conducted by the same examiner, who read and repeated the instructions before the start of each test. These instructions were also presented visually.
Motor tasks
For both motor tasks, the participants were barefoot.
Dual-task cost (mean and standard deviation) for session 1 of the motor data. The following parameters were analyzed for the GymPlate data: surface (mm2), length (mm), length left/right (mm), length rear/front (mm), mean velocity (mm/s) and LFS. The following parameters were analyzed for the GAITRite data: velocity (cm/sec), step length, stride length, base support. Dual-task cost was analyzed for each condition: corsi block (CB), mental rotation (MR), auditory and visual Stroop task (aSTR and vSTR), auditory and visual backward digit recall test (aBDRT and vBDRT), and the coding task (CT). Negative values indicate a decrease for a specific parameter in dual-task setting compared to single-task setting, while positive values indicate an increase
Dual-task cost (mean and standard deviation) for session 1 of the cognitive data. Negative values indicate a decrease for the cognitive parameter in dual-task condition compared to single-task condition, while positive values indicate an increase
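The captions above give the sign convention of the dual-task cost but not its formula. A minimal sketch, assuming the common definition as the percentage change from single-task to dual-task performance (the function name and the example values are hypothetical):

```python
def dual_task_cost(single_task, dual_task):
    """Percentage change from single-task (ST) to dual-task (DT) performance.

    Assumed definition: 100 * (DT - ST) / ST. Negative values mean the
    parameter decreased in the dual-task setting, positive values mean it
    increased, matching the sign convention in the captions.
    """
    if single_task == 0:
        raise ValueError("single-task value must be non-zero")
    return 100.0 * (dual_task - single_task) / single_task


# Hypothetical example: gait velocity drops from 120 cm/sec (ST) to 108 cm/sec (DT).
print(dual_task_cost(120, 108))
```

Whether a negative cost reflects worse performance depends on the parameter: a lower gait velocity is a deterioration, whereas a smaller sway surface is not.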
Statistical analyses were performed using SPSS (IBM Corp. 2017, IBM SPSS Statistics, Version 26.0, Armonk, NY). Descriptive statistics were calculated for all cognitive and motor parameters for the test and retest session. The normality of all cognitive and motor data was assessed using Q-Q plots, the Kolmogorov-Smirnov test, and histograms. Intraclass correlation coefficient (ICC) values were calculated for all ST and DT conditions using the two-way random effects model with absolute agreement. Labels assigned by Landis and Koch were used for interpreting the ICC values [29]. Values with an agreement of > 0.80, 0.61–0.80, 0.41–0.60, 0.21–0.40, 0.00–0.20, and < 0.00 were respectively considered perfect, substantial, moderate, fair, slight, and poor. ICC values of 0.61 and higher were considered sufficiently reliable. The same cutoff value had been used for the sample size calculation as well as in the systematic review on which the development of the current test protocol was based [13]. The test length was shortened for each cognitive test while still trying to maintain ICC values above the cutoff value. In case of lower ICC values, the sequence length with the highest ICC value was chosen. Subsequently, further analyses were performed on this shortened cognitive protocol as well as on the GAITRite data corresponding with the number of cognitive items and the total acquisition length of the GymPlate data (30 sec). Additionally, the standard error of measurement (SEM) and the minimal detectable difference with a confidence interval of 95% (MDD95) were calculated. The former was calculated as
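The expressions for the SEM and MDD95 are not reproduced in this text. The sketch below therefore rests on assumptions: the single-measurement form of the two-way random effects ICC with absolute agreement (often written ICC(2,1); the text does not say whether single or average measures were used), the conventional definitions SEM = SD × √(1 − ICC) and MDD95 = 1.96 × √2 × SEM, and a standard deviation pooled over both sessions. The data and function names are hypothetical, and the analysis in the study itself was run in SPSS.

```python
import math
import statistics


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per subject, each holding one value
    per session (here: [session 1, session 2]).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def landis_koch_label(icc):
    """Label an ICC using the bands listed in the text (two-decimal rounding assumed)."""
    r = round(icc, 2)
    if r < 0.0:
        return "poor"
    if r <= 0.20:
        return "slight"
    if r <= 0.40:
        return "fair"
    if r <= 0.60:
        return "moderate"
    if r <= 0.80:
        return "substantial"
    return "perfect"


def sem_and_mdd95(sd, icc):
    """Assumed conventional definitions of SEM and MDD95."""
    sem = sd * math.sqrt(1.0 - icc)
    mdd95 = 1.96 * math.sqrt(2.0) * sem
    return sem, mdd95


# Hypothetical gait velocities (cm/sec) of five participants at test and retest.
velocity = [[118, 121], [132, 129], [105, 109], [126, 124], [140, 138]]
icc = icc_2_1(velocity)
pooled_sd = statistics.stdev(v for row in velocity for v in row)  # assumption: pooled SD
sem, mdd95 = sem_and_mdd95(pooled_sd, icc)
print(landis_koch_label(icc), round(sem, 2), round(mdd95, 2))
```

Dividing the SEM and MDD95 by the mean score and multiplying by 100 gives the percentage forms (SEM% and MDD95%) reported in the Results.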
Results
Reduction of test length and reliability analysis of cognitive parameters
Table 1 reports the ICC values for all different test lengths for the cognitive STs and DTs. Only the shortened test protocol will be discussed (Table 2). The coding task and both Stroop tasks had an overall substantial agreement (0.61≤ICC≤0.79). The corsi block showed substantial to perfect agreement (0.61≤ICC≤0.85), with the exception of the response time and processing time values in ST condition, which only had slight and moderate agreement (ICC = 0.10 and 0.50, respectively). For both BDRTs, ICC values indicated substantial to perfect agreement (0.65≤ICC≤0.90), except for several percentages of correct responses (% C) that had slight to moderate agreement (vBDRT_SDT_% C, vBDRT_DDT_% C, and aBDRT_ST_% C; 0.2≤ICC≤0.59). The mental rotation task showed the lowest ICC values (0.05≤ICC≤0.67). SEM% was < 10% for the % C values of all cognitive tasks (3.18–7.59%). Similar results were found for the reaction time of the Stroop tasks and the number of responses per minute of the coding task (5.17–10.81%). The reaction time and processing time of the corsi block and BDRTs indicated a larger spread around the true scores, with values between 8.68 and 35.64%. This tendency could also be observed in the MDD95%, where all % C values as well as the coding task and Stroop values were below 30% [11], while the reaction time and processing time values for the corsi block and the BDRT ranged from 34.57 to 98.79%.
Intraclass correlation coefficient (ICC) for all cognitive tasks in single-task (ST), static dual-task (SDT), and dynamic dual-task (DDT) condition. Reaction times (RT) and processing times (PT), as well as the percentage of correct responses (% C) are presented when applicable. Values in bold indicate adequate ICC values (≥0.61). The chosen test length and amount of test items is marked in gray
Mean and standard deviation (SD) for session 1 and session 2, intraclass correlation coefficients (ICC), standard error of measurement (SEM), the percentage of SEM (SEM%), the minimal detectable difference (MDD95), and the minimal detectable percentage of change (MDD95%). All values are presented for the shortened test protocol for single-task (ST), static dual-task (SDT), and dynamic dual-task (DDT) setting. Values in bold indicate adequate ICC values (≥0.61)
For the GymPlate data (Table 3), ICC values were lowest for the ST condition, where all postural parameters had moderate to substantial agreement (0.41≤ICC≤0.75), except for the surface parameter, which had poor agreement (–0.26). For the DTs, all postural parameters had moderate to perfect agreement (0.50≤ICC≤0.92), with most ICC values scoring higher than the cutoff value of 0.61. For the STs and DTs, the surface parameter showed the highest (i.e. least favorable) SEM% and MDD95% values (SEM%: 22.48–60.00%; MDD95%: 62.30–166.31%). All other parameters showed lower (i.e. more favorable) SEM% (10.32–31.85%) and MDD95% (21.13–88.28%) values. All dynamic motor parameters measured on the GAITRite Walkway (Table 4) showed substantial to perfect ICC values (0.67≤ICC≤0.98). Additionally, all parameters had SEM% values lower than 10% (2.30–7.04%) and MDD95% values lower than 20% (6.37–19.50%).
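As a quick consistency check on the reported percentages, one can assume the conventional relation MDD95 = 1.96 × √2 × SEM, which is an assumption here because the expression is not reproduced in this text. Under that relation, MDD95% should be about 2.77 times SEM%, and the reported GAITRite bounds agree with this up to rounding:

```latex
\[
\mathrm{MDD}_{95}\% \;=\; 1.96 \times \sqrt{2} \times \mathrm{SEM}\% \;\approx\; 2.772 \times \mathrm{SEM}\%
\]
\[
2.772 \times 2.30\% \approx 6.38\% \quad (\text{reported: } 6.37\%),
\qquad
2.772 \times 7.04\% \approx 19.51\% \quad (\text{reported: } 19.50\%)
\]
```

The same ratio holds for the upper bounds of the GymPlate surface parameter (60.00% and 166.31%).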
Mean and standard deviation (SD) for session 1 and session 2 of the GymPlate data. Intraclass correlation coefficients (ICC), standard error of measurement (SEM), the percentage of SEM (SEM%), the minimal detectable difference (MDD95), and the minimal detectable percentage of change (MDD95%) are presented. All values are calculated for the single-task (ST) as well as static dual-task (SDT) condition. For each condition, the following parameters were analyzed: surface, length, length left/right (L/R), length rear/front (R/F), mean velocity, and the length in function of surface (LFS). Values in bold indicate adequate ICC values (≥0.61)
Mean and standard deviation (SD) for session 1 and session 2 of the GAITRite data. Intraclass correlation coefficients (ICC), standard error of measurement (SEM), the percentage of SEM (SEM%), the minimal detectable difference (MDD95), and the minimal detectable percentage of change (MDD95%) are presented. All values are calculated for the single-task (ST) as well as dynamic dual-task (DDT) condition. For each condition, the following parameters were analyzed: velocity (cm/sec), step length, stride length, base support
Discussion
Dual-task performance is still under-explored in persons with vestibular disorders. Because of the motor as well as cognitive complaints in this population, cognitive-motor DTs might offer added value over the more routinely performed STs. Given the novel character of the 2BALANCE protocol, its feasibility and test-retest reliability were assessed in healthy adults. However, these results cannot simply be extrapolated to patient populations yet. Aside from persons with isolated vestibular dysfunction, this protocol also shows potential for use in patient populations that are likewise characterized by motor and cognitive dysfunction, such as patients with Parkinson’s disease and Alzheimer’s disease. Interestingly, recent studies indicated that vestibular dysfunction is more prevalent in these populations than in the healthy population [19, 30]. The current results can therefore serve as a starting point for validating the protocol in a variety of patient groups.
The first purpose of the current study was to select the optimal test length for each cognitive task. To ensure the optimal balance between a feasible test duration and acceptable test-retest reliability, ICC values were assessed for different lengths of the cognitive tasks (Table 1). Only the mental rotation task was not shortened, as its ICC values were below 0.61. To the best of our knowledge, the ideal test length for DTs had not been assessed before. In the meantime, the feasibility of this shortened protocol has been confirmed in a group of patients with bilateral vestibulopathy (n = 30).
Subsequently, the test-retest reliability of the cognitive tasks was assessed for this shortened protocol. In line with previous studies in healthy adults, the auditory and visual Stroop tasks had good reliability in ST and DT setting [1, 39]. The reliability of the coding task had only been assessed in DT setting in persons with multiple sclerosis, resulting in high ICC values, similar to the current study [33]. To the best of our knowledge, test-retest reliability of the mental rotation task had not been assessed in any ST or DT study before. The low reliability values in the current study might have been caused by a lack of between-subjects variability. More specifically, even if the variability between both test sessions is low, ICC values will be low when subjects differ only little from each other [40]. As shown in Table 2, the mean % C values for the mental rotation task were near 100%, indicating a ceiling effect. Persons with vestibular hypofunction have previously shown aberrant visuospatial performance [9]. Therefore, this ceiling effect might not be encountered in persons with vestibular hypofunction, and ICC values might be higher in that population. ICC values should always be interpreted within the context of each test, and should be complemented by additional reliability values such as SEM% and MDD95%. These values were indeed lower than 10% and 30%, respectively, for the % C. This indicates that a change in an individual’s score exceeding these relatively small percentages can, with 95% confidence, be attributed to more than random measurement error. These measures could be valuable to document a person’s evolution for rehabilitation purposes [19]. Similar findings could be observed for the BDRTs, where conditions with the highest mean % C (vBDRT_SDT, vBDRT_DDT, and aBDRT_ST) also showed the lowest ICC values.
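The dependence of the ICC on between-subjects variability can be illustrated with a small numerical sketch. The two data sets below are invented for illustration and are not study data; both share exactly the same session-to-session differences (+2, −2, +1, −1, 0), but only the first has a wide spread between subjects. The function assumes the single-measurement, two-way random effects, absolute-agreement form of the ICC.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical % correct scores, [session 1, session 2] per subject.
# Wide spread between subjects:
wide = [[60, 62], [70, 68], [80, 81], [90, 89], [100, 100]]
# Scores compressed near the ceiling, with identical test-retest differences:
ceiling = [[96, 98], [97, 95], [98, 99], [97, 96], [96, 96]]

print(round(icc_2_1(wide), 2), round(icc_2_1(ceiling), 2))
```

With these invented numbers the first ICC is close to 1 and the second is around 0.3, even though the measurement error is identical in both cases. This is why the SEM and MDD95 are useful complements to the ICC when scores cluster near a ceiling.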
In contrast, the outcome parameters reaction time and processing time showed greater variation between test items and test subjects for the BDRT and the mental rotation task, resulting in larger standard deviations (Table 2). This variability resulted in higher ICC values for reaction time and processing time than for % C. Similar findings were observed by Tamura et al. (2018) [38], where ICC values for the BDRT were higher for reaction time than for % C. Standard deviations do not enter the ICC calculation directly, but are used to calculate SEM and MDD95 values, which could explain the high SEM% and MDD95% values for these parameters. The corsi block showed overall adequate reliability, except for the reaction time and processing time in ST condition. This might again be explained by a possible lack of variability in the least challenging test condition. To summarize, the cognitive tests in the 2BALANCE protocol showed an overall sufficient test-retest reliability based on ICC as well as SEM and MDD95 values, except for the mental rotation task, which should be interpreted with caution in future research.
Finally, the test-retest reliability of the motor tasks was studied. In accordance with previous research, all assessed parameters measured by the GAITRite Walkway showed high reliability and appeared to be sensitive to change [4, 37]. The current study was the first to assess the test-retest reliability of the GymPlate. This equipment showed overall adequate ICC values. The lowest scores were obtained for the ST condition, which is consistent with the constrained-action hypothesis. This hypothesis states that an internal attentional focus on the motor task can negatively affect motor performance, which is otherwise an automatic process. Adding a secondary cognitive task might draw attention away toward an external focus and thereby restore movement automaticity [18, 25]. The ICC values of the static motor task (0.50–0.92) were higher than 0.61 in combination with all cognitive tasks, except for the auditory BDRT, for which values were slightly lower than for the visual variant. It might be hypothesized that the lack of visual fixation during the auditory task influenced postural balance. However, this tendency could not be observed for the auditory compared to the visual Stroop task. To summarize, the motor tasks also showed overall sufficient reliability in ST and DT condition, based on the ICC as well as SEM and MDD95 values.
These findings should not simply be generalized to all populations and age categories, but should be analyzed in persons with vestibular hypofunction before clinical implementation. Notwithstanding the evidence demonstrating a clear link between vestibular and cognitive dysfunction [35], recent studies have also shown an important link between hearing loss and cognitive decline [16]. It is therefore important to control for hearing loss and to uncover the contribution of both sensory input systems.
Conclusions
2BALANCE is the first comprehensive protocol developed for persons with vestibular hypofunction, taking into account motor and cognitive symptoms. Overall sufficient reliability levels were achieved in ST and DT setting in healthy adults. For the cognitive tasks, the lowest reliability values were observed for the mental rotation task, possibly caused by a lack of between-subjects variability in the healthy population.
Ethics and data management
Approval by the ethics committee of Ghent University was obtained on July 5th 2019 (registration number B670201940465). In accordance with the Declaration of Helsinki, all participants gave their written informed consent. This work was supported by Fonds voor Wetenschappelijk Onderzoek (FWO) with grant number 3F020219.
Acknowledgments
The authors would like to thank Lubbe Van Gucht and Jolien Vangeneugden for their assistance with the dual-task assessment of several participants.
