Abstract
Background
Autonomic neuropathy assessment is needed for the diagnosis and prognostication of different clinical disorders. Heart rate variability (HRV) and autonomic reactivity assessment by Ewing’s battery of tests form the cornerstones of laboratory assessment of cardiac autonomic neuropathy evaluation.
Purpose
While these tests are routinely used, there are conflicting reports regarding the visit-to-visit repeatability of these tests. Therefore, we assessed autonomic measures derived using aforementioned tests on multiple visits in healthy subjects.
Methods
We enrolled 31 healthy subjects and evaluated autonomic function on five visits by assessing HRV and autonomic reactivity: day 1 forenoon, day 1 afternoon, the next day, one week later and one month later. Repeatability was evaluated using intraclass correlation coefficients (ICCs). Values were defined as moderate, good and excellent based on previously reported criteria.
Results
Thirty-one subjects completed all five visits (17 males, 14 females; mean age = 29 ± 5.44 years). While time-domain measures demonstrated good to excellent repeatability, frequency-domain measures were only moderately repeatable. Autonomic reactivity indices also displayed good to excellent repeatability with the exception of blood pressure response to orthostatic challenge which was moderately repeatable.
Conclusion
We recommend that sole reliance on frequency domain metrics for HRV assessment should be avoided. HRV indices and autonomic reactivity measures may continue to be used for cardiac autonomic neuropathy assessment.
Introduction
There is increasing evidence available that autonomic neuropathy plays a pivotal role in morbidity and mortality in various clinical disorders. These disorders range from endocrine disorders such as Diabetes mellitus to neurological disorders such as Parkinson’s disease and Multisystem atrophy. In addition, autonomic neuropathy due to pharmacotherapy for diseases such as epilepsy is also being explored.1–8
Evaluation of cardiac autonomic neuropathy is performed using a combination of heart rate variability (HRV) and autonomic reactivity.9–11 HRV assesses variability between interbeat intervals in the ECG and provides an estimate of cardiac autonomic tone.12–15 Autonomic reactivity assessment quantifies the blood pressure and heart rate responses to physiological stimuli such as orthostasis, Valsalva manoeuvre, deep breathing, cold stimulation and isometric exercise, which are components of Ewing’s battery of tests.10, 16–20 Normative ranges for these parameters have been defined in various populations across the globe.21–23
While these tests are routinely used in clinical practice, there are conflicting reports regarding their visit-to-visit reliability. Some reports find these tests to be reliable and repeatable,24–26 whereas others question their repeatability.27, 28 HRV metrics have been shown to be highly reliable in some reports but only moderately to poorly reliable in others.29 Similarly, reports on the reproducibility of blood pressure responses to isometric exercise and postural challenge are mixed.25, 30–32 In addition, there are recommendations to generate and assess repeatability data for individual centres.30
Therefore, there is a need to investigate temporal stability of metrics derived from autonomic neuropathy testing when assessed repeatedly in the neurophysiology laboratory. We conducted the present work to assess repeatability of HRV metrics and indices derived using components of Ewing’s battery in healthy adult subjects.
Method
The study was prospective and observational in nature. We commenced subject recruitment after obtaining ethical clearance from the Institute Ethics Committee (IEC) of our institution (IEC of All India Institute of Medical Sciences, Jodhpur, Rajasthan; Letter number AIIMS/IEC/2021/3813 dated 8/9/2021). Informed written consent was obtained from all subjects before recruitment. Apparently healthy subjects of either gender, aged between 18 and 45 years, were recruited in the study. All female subjects were recruited between the 2nd and 6th day of their menstrual cycle for the first visit.
The subjects reported to the lab two hours after a light meal. Abstinence from tea, coffee and any form of tobacco was ensured on the day of the test. Also, the subjects were requested to refrain from heavy exercise in the 24 hours preceding the test. The tests were performed in a noise-free, humidity- and temperature-controlled environment in the autonomic function laboratory of our department.
The first visit was scheduled in the forenoon (9 am – 12 noon) of day 1. The subjects were requested to revisit the lab in the afternoon of day 1 (2:30 pm – 5 pm), forenoon of next day (day 2), forenoon after one week (day 8) and forenoon after one month (day 31).
On arrival at the lab, subjects were requested to empty their bladder. Adhesive Ag-AgCl ECG electrodes were placed in Lead II configuration. A digital stethograph belt was tied around the chest at the level of the 4th intercostal space. Digital ECG and respiration were acquired using the Bionomadix™ wireless module of the Biopac MP 150™ system (Biopac Inc, USA). Data were visualised using Acqknowledge™ software version 4.4 installed on a desktop system. Baseline and all subsequent blood pressure measurements were made using an Omron™ BP device (model HEMCR24™).
Assessment of autonomic tone and reactivity was done as described previously.10, 33, 34 After a supine rest of 5 minutes, Lead II ECG was acquired for 5 minutes. The data was visually inspected for presence of artefacts, if any. A noise-free segment was chosen to assess autonomic tone in accordance with Task Force Guidelines proposed by the European Society of Cardiology. 12 Time domain and frequency domain indices and non-linear measures were computed and tabulated. SDNN, RMSSD, SDSD and pRR50 were computed amongst time domain indices. Total power, LF-power, HF-power, LF/HF ratio were computed amongst frequency domain indices. SD1, SD2 and SD1/SD2 were computed amongst non-linear measures.11, 35, 36
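The time domain and non-linear indices listed above can be illustrated with a minimal sketch. It assumes a list of artefact-free RR intervals in milliseconds, uses the sample standard deviation, and derives SD1 and SD2 from SDSD and SDNN through the commonly used Poincaré plot identities. The function name `time_domain_hrv` and the dictionary keys are hypothetical and do not reflect the software used in the study.

```python
import math
from statistics import mean, stdev


def time_domain_hrv(rr_ms):
    """Compute time domain and Poincare indices from RR intervals (ms).

    Illustrative sketch only: assumes the series is already free of
    artefacts and ectopic beats.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")

    # Differences between adjacent RR intervals
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]

    sdnn = stdev(rr_ms)                                   # SD of all intervals
    rmssd = math.sqrt(mean([d * d for d in diffs]))       # root mean square of successive differences
    sdsd = stdev(diffs)                                   # SD of successive differences
    prr50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)

    # Poincare descriptors via the standard identities:
    # SD1^2 = SDSD^2 / 2 and SD2^2 = 2 * SDNN^2 - SDSD^2 / 2
    sd1 = math.sqrt(0.5) * sdsd
    sd2 = math.sqrt(max(2.0 * sdnn ** 2 - 0.5 * sdsd ** 2, 0.0))

    return {
        "SDNN": sdnn,
        "RMSSD": rmssd,
        "SDSD": sdsd,
        "pRR50": prr50,
        "SD1": sd1,
        "SD2": sd2,
        "SD1/SD2": sd1 / sd2 if sd2 > 0 else float("nan"),
    }
```

Frequency domain indices are omitted from this sketch because they additionally require resampling and spectral estimation choices that the text does not specify.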
This was followed by performance of Ewing’s battery of tests for autonomic reactivity.16, 37–40 The tests performed were the Lying to standing test, deep breathing test, cold pressor test and hand grip test, using standard protocols as described previously.10, 33, 41 The Lying to standing test involved attaining a standing posture from the supine position within three seconds. The deep breathing test involved 6–7 slow deep breathing cycles, with 5 seconds each of inhalation and exhalation. The hand grip test involved isometric hand grip exercise at one-third of maximum voluntary capacity (MVC) using the dominant hand for 4 minutes. The cold pressor test involved immersion of the hand in cold water (10 ºC) for 1 minute. Lead II ECG and respiration were recorded throughout the test protocol. Blood pressure values were recorded at standard time points as described in the literature.
We computed and tabulated fall in systolic blood pressure (∇SBP) and 30:15 ratio during Lying to standing test, Change in Heart rate (∇HR) and E:I ratio during deep breathing test, Rise in Diastolic blood pressure (DBP) above baseline in cold pressor test and hand grip exercise test, termed as CPTDBP and HGTDBP, respectively.
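As a rough illustration of the heart rate based indices, the sketch below computes the 30:15 ratio and the deep breathing indices from RR intervals in milliseconds. Several choices here are assumptions, since the text defers to earlier protocol descriptions: the 30:15 ratio is taken as the RR interval at the 30th beat divided by that at the 15th beat after standing, the E:I ratio as the mean of the longest RR interval per breathing cycle divided by the mean of the shortest, and ∇HR as the mean per-cycle difference between maximum and minimum heart rate. The function names are hypothetical.

```python
from statistics import mean


def ratio_30_15(rr_after_standing_ms):
    """30:15 ratio from RR intervals (ms), indexed from the first beat after standing.

    Assumption: the simple definition, RR at beat 30 divided by RR at beat 15.
    """
    if len(rr_after_standing_ms) < 30:
        raise ValueError("need at least 30 beats after standing")
    return rr_after_standing_ms[29] / rr_after_standing_ms[14]


def deep_breathing_indices(cycles_rr_ms):
    """Return (E:I ratio, change in heart rate) from per-cycle RR intervals (ms).

    Assumption: within each breathing cycle the longest RR interval falls in
    expiration and the shortest in inspiration.
    """
    if not cycles_rr_ms or any(len(c) < 2 for c in cycles_rr_ms):
        raise ValueError("each breathing cycle needs at least two RR intervals")

    longest = [max(c) for c in cycles_rr_ms]
    shortest = [min(c) for c in cycles_rr_ms]

    ei_ratio = mean(longest) / mean(shortest)
    # Heart rate in beats per minute is 60000 / RR(ms)
    delta_hr = mean([60000.0 / s - 60000.0 / l for s, l in zip(shortest, longest)])
    return ei_ratio, delta_hr
```

Many laboratories instead use the longest RR interval around the 30th beat and the shortest around the 15th beat, so the indexing above should be adapted to the local protocol.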
A considerable part of the data acquisition was done during the COVID-19 pandemic period. Performance of Valsalva manoeuvre (VM) involves forceful expiration into a mouthpiece connected to a sphygmomanometer. Ensuring sterility of the mouthpiece and the manometer was technically challenging during the pandemic period. Therefore Valsalva manoeuvre was not performed to prevent subject-to-subject spread of infection.20, 31, 42–44
HRV and Ewing’s battery of tests (except VM) were repeated in the afternoon of day 1 (visit 2, V2), on the next day (visit 3, V3), one week later (visit 4, V4) and one month later (visit 5, V5). All indices were computed and tabulated.
Statistical analysis was performed using MedCalc software version 22.016 (MedCalc Software Ltd, Ostend, Belgium;
Results
We recruited 39 healthy subjects in the study. Five subjects did not complete all five follow-up visits and were therefore excluded from the study. One subject did not wish to continue the battery of tests and chose to withdraw from the study. Two subjects demonstrated elevated blood pressure repeatedly during baseline measurements and were excluded from the study. Thirty-one subjects completed all five visits (17 males, 14 females; mean age = 29 ± 5.44 years, BMI 24.03 ± 3.40 kg/m2) and were included in the final analysis.
Baseline systolic blood pressures were comparable across the five visits for all subjects (115.87 ± 13.99, 114.97 ± 13.44, 114.41 ± 11.26, 112.87 ± 13.18 and 112.96 ± 12.72 mm Hg, P = .03, Repeated Measures ANOVA with no difference between the groups on Tukey’s post hoc test). Similarly, diastolic blood pressures were comparable across the visits (72.25 ± 7.74, 71.74 ± 7.81, 71.25 ± 7.05, 71.74 ± 7.56 and 72.45 ± 8.12 mm Hg, P = .64, Repeated Measures ANOVA).
Time domain indices
Median SDNN values were comparable across the groups on five subsequent visits (P = .33, RMANOVA with post hoc test). Similar trends were observed in RMSSD, SDSD and pRR50 wherein all values for five consecutive visits did not show any statistically significant difference (P = .14, .05 and .098, respectively, RMANOVA with post hoc tests).
Frequency domain parameters
Total power was comparable across visits 1–5 (P = .67, Friedman’s test with post hoc multiple comparison tests). HF-power for visit 2 was significantly lower than for visits 1, 3 and 4 (P = .012, Friedman’s test with post hoc multiple comparison tests). LF-power was comparable across all visits (P = .14, Friedman’s test with post hoc multiple comparison tests). VLF power was also comparable across all visits (P = .85, Friedman’s test with post hoc multiple comparison tests). Although HF values differed between visits, they remained within the normal range reported for healthy subjects elsewhere. The values for time and frequency domain are summarised in Figures 1 and 2, respectively.


Changes in systolic blood pressure from baseline (∇SBP) and 30:15 ratio during Lying to standing test were comparable across all visits (P = .86 and .81, respectively, Friedman’s test with post hoc multiple comparison tests). Indices derived using deep breathing test (∇HR and E:I ratio) were also comparable for all visits (P = .73 and .47, respectively, Friedman’s test with post hoc multiple comparison tests). The rise in diastolic blood pressure (HGTDBP) during hand grip test was significantly higher for visit 2 when compared to visits 1 and 4, visit 3 when compared with visit 4 and visit 5 when compared with visit 4 (P = .01, Friedman’s test with post hoc Repeated measures test). However, all values were above established cut-offs for normal population.10, 33 Similarly, rise in DBP for CPT (CPTDBP) on visit 1 was significantly higher than on visits 3, 4 and 5 (P = .04, Friedman’s test with post hoc Repeated measures test). All values for CPT were also above established cut-offs for normal population.10, 33 The values of autonomic reactivity parameters are summarised in Figure 3.

Test-retest reliability was assessed using Intraclass correlation coefficient. We employed a single rater with a two-way random effect model for the computation. ICC has been shown to be a reliable marker for estimation of test-retest reliability. The values of ICCs for different parameters are summarised in Table 1.
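For readers who wish to reproduce this kind of estimate, the following is a minimal sketch of a single-measurement, two-way random effects ICC, often written ICC(2,1). It assumes the absolute agreement definition, which the text does not specify, and a complete subjects-by-visits table with no missing values. The function name `icc_2_1` is hypothetical and this is not the MedCalc implementation.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, single measurement, absolute agreement.

    scores: list of rows, one row per subject, one column per visit.
    Assumption: absolute agreement rather than consistency.
    """
    n = len(scores)            # number of subjects
    k = len(scores[0])         # number of visits
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 visits")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between visits
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

With 31 subjects and five visits, `scores` would be a 31 × 5 table for one index, for example SDNN. Under the consistency definition the visit term `k * (msc - mse) / n` is dropped from the denominator, which generally gives a value at least as high.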
Intraclass Correlation Coefficients for Heart Rate Variability Parameters.
Time domain indices displayed high ICC values, with the exception of pRR50. Median ICCs for SDNN, SDSD and RMSSD were 0.72, 0.74 and 0.70, respectively. Frequency domain parameters, on the contrary, had relatively lower ICC values. Median ICCs for Total Power, LF-power, HF-power and LF/HF ratio were 0.49, 0.47, 0.52 and 0.53, respectively. A similar trend was demonstrated by the non-linear measures SD1, SD2 and SD1/SD2. Median ICC values for these parameters were 0.61, 0.62 and 0.57, respectively.
Autonomic reactivity indices demonstrated highest reliability values for indices derived using deep breathing test—∇HR and E:I ratio—median values being 0.85 and 0.90, respectively. Other parameters also demonstrated moderate to good reliability. The values are summarised in Table 2.
Intraclass Correlation Coefficient for Autonomic Reactivity Parameters.
Discussion
Autonomic neuropathy is an important hallmark of multiple clinical disorders. Assessment of the status of the different limbs of the system can be performed by a battery of tests. Cardiac autonomic neuropathy screening is important from a clinical standpoint since it has a direct impact on morbidity and mortality in various pathophysiological states. These diseases include, but are not limited to, Hypertension, Diabetes mellitus, Parkinson’s disease, Multi-system atrophy and Heart failure.5, 48–55
Test-retest reliability of autonomic function assessment is an important criterion when using these tests for clinical assessment. Reliability of assessed parameters is important from a clinical and laboratory perspective. Two key parameters reported in this context are the Intraclass correlation coefficient and the Coefficient of Variation.
We assessed time and frequency domain indices derived from HRV. While time-domain indices demonstrated good to excellent repeatability, frequency-domain indices showed moderate repeatability (Table 1). This can probably be attributed to VLF power differing between visits, since subjects breathed spontaneously rather than at a paced rate. Reliability of HRV metrics has been explored in depth by multiple groups. Long-duration (24 hours) as well as ultra-short-duration (2.5 minutes) HRV recordings have been shown to have excellent reproducibility.56–58 Reproducibility of HRV indices from short-term recordings (5 min) has been shown to be good to excellent, especially for time domain indices. We observed similar results, with good to excellent reliability for time-domain indices and moderate reliability for frequency-domain indices (Table 1). Commonly used time domain indices, such as SDNN, SDSD and RMSSD, exhibited high reproducibility in our subjects, as reflected by their ICC values. This is consistent with previous studies.27, 59, 60 Only pRR50 showed moderate reliability among time domain measures in our study; we could not identify the reason for this finding.
The reliability of frequency domain indices was only moderate. In this context, we would like to point out that our data collection was done using spontaneous breathing, instead of paced/metronomic breathing. We chose spontaneous breathing as there is a large corpus of literature that recommends its use over metronomic breathing. Pitzalis and colleagues have reported that frequency domain parameters derived using spectral analysis demonstrated better reproducibility during spontaneous breathing, with a decline in reproducibility observed during paced breathing.61 They have suggested that paced breathing may not be a good intervention for reliability assessment since it is likely to change alertness and may act as a stimulant for the nervous system, which is likely to affect reproducibility. Kowalewski et al. and Bernardi et al. also recommend use of spontaneous breathing for the assessment of HRV repeatability, since they argue that breath frequency is likely to have a bearing on spectral measures of HRV.62, 63
However, the use of paced breathing instead of spontaneous breathing is a matter of debate. Some groups recommend use of paced breathing for good repeatability estimates. Gisselman and colleagues investigated intersession reliability of HRV parameters and observed that controlled breathing improves reliability and decreases dispersion in HRV indices when compared to spontaneous breathing.64
Contrary to the aforementioned schools of thought, Sinnreich and colleagues65 observed that metronomic breathing did not exert any effect on reproducibility of HRV indices. Similar observations were made by Bertsch and colleagues.66 Based on our findings, we propose that spontaneous breathing is likely to affect repeatability estimates of HRV indices. This is because the contribution of VLF power to Total power in a five-minute window is likely to change based on respiratory frequency and is bound to have a bearing on overall reproducibility. However, we concede that this debate regarding choice of spontaneous versus paced breathing needs further exploration by studies on large sample sizes.
Few studies have performed a comprehensive estimation of tone and reactivity parameters in a single study. Kowalewski and colleagues assessed HRV and reactivity parameters and concluded that HRV parameters were more reproducible and should be considered when performing long-term observations.62 Keet and colleagues also evaluated 20 healthy adults for HRV and cardiac autonomic function tests. They observed moderate to good reproducibility for HRV indices and autonomic indices and poor reproducibility for the VLF band of HRV. However, they assessed these parameters at only three time points. Also, they did not evaluate responses to the cold pressor test, an important component of autonomic reactivity. They observed poor reproducibility of BP and HR responses to the postural challenge test.
Autonomic reactivity assessment was done using blood pressure and heart rate responses to postural challenge, deep breathing test, cold pressor test, and isometric exercise (handgrip) test. We could not perform Valsalva manoeuvre, a standard component of Ewing’s battery, because the sphygmomanometer could not be disinfected after each use and this carried a risk of spreading infection. We observed good to excellent reliability of metrics derived using heart rate/RR intervals, that is, 30:15 ratio, ∇HR and E:I ratio. Diastolic blood pressure responses to cold pressor test and isometric exercise test also had good reliability. This is consistent with previous observations by different groups wherein autonomic reactivity parameters have been found to be more robust and consistent in repeated measurements. The lowest reliability was observed for blood pressure response to Lying to standing test. This is similar to observations made by Keet et al.,32 who reported an ICC value of 0.32 under standardised testing conditions. Other groups have also reported low reproducibility of blood pressure response to orthostatic challenge.67, 68 Multiple reasons have been ascribed to this phenomenon, including changes in autonomic activity, hydration status and environmental effects.25 While we attempted to standardise timing of tests and temporal relationship with meal (2 hours after a light meal) in our protocol and maintained similar environmental conditions in the testing laboratory, we did not have means to measure hydration status and therefore cannot comment upon this factor influencing our results about orthostatic challenge. Also, continuous beat-to-beat blood pressure recording, which might have provided a better assessment of blood pressure responses to orthostatic challenge, was not available at our centre. We believe this phenomenon needs to be explored in more detail in future work.
The premise of defining reliability by Intraclass correlation coefficient values needs discussion. While ICC is a standard metric for reliability assessment, different groups have defined their own criteria for grading reliability based on this parameter. Da Cruz defined reliability to be excellent, good, moderate and poor based on ICC being >0.90, between 0.75 and 0.90, from 0.50 to <0.75 and <0.50, respectively.59 Kowalewski and colleagues followed a different approach and defined reproducibility to be excellent for ICC > 0.74 and values to be reproducible for ICC > 0.60.62 In the present work, we have followed guidelines by Koo and colleagues,47 wherein reliability is classified as moderate, good and excellent based on ICCs between 0.5 and 0.75, between 0.75 and 0.90 and >0.90, respectively.
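The grading described above can be expressed as a small helper. This is a sketch under stated assumptions: values below 0.5 are labelled poor, and values falling exactly on a boundary are assigned to the higher band except at 0.90, which stays in the good band because excellent is defined as strictly greater than 0.90. The function name `grade_reliability` is hypothetical.

```python
def grade_reliability(icc):
    """Map an ICC value to a reliability grade using the cut-offs 0.5, 0.75 and 0.90.

    Boundary handling is an assumption made for this sketch.
    """
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Illustrative values only, not taken from the study tables
for value in (0.95, 0.80, 0.60, 0.30):
    print(value, grade_reliability(value))
```

Because grading schemes differ between groups, the same ICC can receive different labels, so the cut-offs in use should be stated alongside any grade.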
There are some limitations to our study. Repeatability was assessed only in healthy subjects, and whether the findings extend to diseased populations remains to be tested. Our findings are limited by the small sample size of our study and require replication in large cohorts across age groups for further validation. Also, we feel that computation of reliability of these parameters over longer time windows, say over a span of a few years, may yield interesting results. We were unable to include Valsalva manoeuvre in our battery and therefore cannot comment on reliability of this test. Another point that merits discussion is the potential effect of menstrual cycle on autonomic parameters in female subjects. The fourth visit, which took place one week after the first visit (itself scheduled between the 2nd and 6th day of the menstrual cycle), may have been prone to this effect. While we did not find a significant reduction in any parameter on visit 4 with respect to visit 1, as observed in repeated measures analysis, we cannot conclusively state that menstrual cycle phase did not affect the indices in our study. Our findings are limited by the absence of hormonal profile measurement, which we could not perform due to cost constraints. We accept this as a limitation of our study and emphasise the need for future studies exploring the effect of gender differences on repeatability.
Conclusion
To conclude, short-term HRV provides various metrics in time and frequency domains. Time domain metrics were found to be very reliable over the period of one month in our population. We may safely propose their use for follow-up studies in healthy subjects and patients. Frequency domain measures were only moderately reliable over the described time window. We may safely conclude that sole reliance on frequency domain measures as indicators of cardiac autonomic tone may not be a prudent measure.
Autonomic reactivity assessment measures derived using standard battery of tests demonstrated good to excellent reliability over a period of one month in our study population. Based on our findings, we recommend continued use of these measures for assessment of cardiac autonomic neuropathy. Blood pressure response to postural challenge demonstrated moderate reliability in our study and therefore we recommend further work on this index in future studies. Based on the findings in the present work, we recommend laboratory assessment of cardiac autonomic neuropathy using HRV and autonomic reactivity tests, since they are reliable for use in day-to-day practice. These indices are likely to provide valuable information regarding different clinical disorders leading to autonomic neuropathy.
Footnotes
Abbreviations
ECG = electrocardiogram, HRV = heart rate variability, IEC = Institute Ethics Committee, Ag-AgCl = Silver-Silver chloride, SDNN = standard deviation of normal-to-normal intervals, RMSSD = root mean square of successive differences, SDSD = standard deviation of differences between adjacent NN intervals, pRR50 = percentage of RR intervals varying by more than 50 milliseconds, LF = low frequency, HF = high frequency, MVC = maximum voluntary capacity, SBP = systolic blood pressure, DBP = diastolic blood pressure, HR = heart rate, VM = Valsalva manoeuvre, RMANOVA = repeated measures analysis of variance, CPTDBP = rise in diastolic blood pressure after cold pressor test, HGTDBP = rise in diastolic blood pressure after hand grip test.
Acknowledgements
The authors acknowledge the help provided by Mr. Shiv Kumar, Laboratory staff, in data acquisition.
Authors’ Contribution (as per CRediT© taxonomy)
Conceptualization ASJ, SS, BM. Data curation ASJ, SS. Formal analysis ASJ, SS. Investigation ASJ. Supervision SS, BM. Writing original draft ASJ, SS. Writing – review and editing SS, BM.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
ICMJE Statement
This article complies with the International Committee of Medical Journal Editors (ICMJE) uniform requirements for the manuscript.
Informed Consent
Informed written consent was obtained from all subjects before recruitment.
Statement of Ethics
The study was approved by Institute Ethics Committee, All India Institute of Medical Sciences Jodhpur vide letter number AIIMS/IEC/2021/3813 dated 8/9/2021.
