Abstract
Background
Clinical fall risk prediction often relies on subjective observation or simplistic metrics, despite the high costs associated with falls in older adults.
Objective
This proof-of-concept study evaluated the validity and reliability of a consumer-grade depth camera system as an objective alternative for automated fall risk assessment.
Methods
Thirty-nine community-dwelling adults performed Timed Up and Go (TUG), Five Times Sit-to-Stand (FTSS), and Tandem Stance (TST) tests. Concurrent measurements were taken by an automated depth camera and blinded physical therapists. Validity (concurrent, convergent, discriminative) and reliability were assessed.
Results
Automated FTSS and TUG tests demonstrated strong concurrent validity with therapist measurements (r = 0.813 and 0.915, respectively) and high discriminative accuracy for fall history (AUC = 0.941 and 0.864, respectively). Depth camera-based FTSS vertical velocity was significantly lower in participants with a fall history (p < 0.001). TST sway metrics showed limited discriminative validity. The system showed good to excellent test-retest reliability. In an age-stratified analysis of older adults (≥65 years), automated FTSS time and the AUC-weighted composite score demonstrated acceptable discrimination for retrospective fall history (AUC = 0.892 and 0.867, respectively).
Conclusions
The depth camera system showed promise as a valid and reliable tool for objective quantification of performance on fall-risk-related functional tests, particularly FTSS and TUG, when benchmarked against therapist-administered measurements. Discriminative findings against retrospective fall history should be interpreted as exploratory, and larger prospective studies are required before clinical screening thresholds can be recommended.
Trial Registration
ClinicalTrials.gov (NCT06519864).
Introduction
Falls remain a leading cause of injury and functional decline in older adults, affecting one-third of community-dwelling individuals over 65 annually.1,2 The resulting economic burden exceeds $50 billion annually in the United States alone, highlighting the urgent need for accessible and accurate risk assessment tools. 3 While laboratory-based motion capture systems can detect subtle movement patterns predictive of falls, their cost and complexity limit clinical translation.
Current clinical practice relies on functional tests such as the Timed Up and Go (TUG), Five Times Sit-to-Stand (FTSS), and Berg Balance Scale, which have demonstrated utility but reduce complex movements to single temporal values.4,5 This field of clinical assessment is continually expanding, with ongoing validation of other dynamic balance tests, such as the Timed 360° Turn Test and the Modified Four Square Step Test, and investigations into interventions targeting fall risk in vulnerable populations.6–8 Physical function, particularly in basic movements like walking and standing, strongly predicts fall risk.9–11 However, the subjective nature of clinical observation and reliance on simple timing may miss important movement characteristics that contribute to fall risk.12,13
Instrumented approaches have therefore been explored to add objectivity to fall risk assessment, most commonly using wearable inertial sensors and force platforms. Wearable sensors can be deployed across settings and can quantify movement during standardized tests or daily activities; however, they require donning, doffing, and correct placement, can be influenced by adherence and sensor orientation, and often provide limited multi-segment kinematics unless multiple sensors are used.12,14 Force platforms provide high-fidelity center of pressure measurement for postural control and have been widely studied as predictors of falls, yet their high cost, dedicated installation, and limited portability can restrict routine use in many outpatient and community settings.13 Consumer-grade depth cameras offer a complementary middle ground by enabling markerless full-body skeletal tracking without wearable instrumentation and with a relatively small footprint; however, they remain sensitive to line-of-sight occlusion, restricted capture volume, and environmental conditions (such as lighting or interference), and may be less portable than small wearables, underscoring the need for standardized setup and robust algorithms.
Among these sensing modalities, recent advances in depth camera technology, particularly systems using artificial intelligence for skeletal tracking, have enabled markerless motion capture (MMC) as a practical approach for objective movement assessment without requiring wearable instrumentation. MMC/depth-camera approaches have been increasingly explored for fall risk and frailty assessment. For example, a recent scoping review identified 39 studies evaluating fall risk and/or frailty using MMC technologies (including 3,114 participants), with Microsoft Kinect being the most commonly used platform and with substantial heterogeneity in extracted features and reporting practices. 15
Within this literature, depth cameras have been used to instrument common clinical mobility tests and to derive richer parameters beyond total time. Dubois et al. automated the Timed Up and Go (TUG) test using Kinect depth images, identified phases, and extracted gait- and turn-related parameters that supported discrimination of fall-risk groups. 16 Ejupi et al. developed a Kinect-based five-times-sit-to-stand (5STS) algorithm and reported that timing- and velocity-related measures (including sit-to-stand transition velocity) discriminated fallers and correlated across laboratory and in-home assessments. 17 More recently, Kinect-based multifactorial test batteries have been evaluated with prospective fall follow-up in larger cohorts, illustrating the feasibility of depth-camera-based screening in geriatric populations. 18
Despite a growing body of work demonstrating the feasibility of depth-camera-based instrumented assessments,19–22 key gaps remain for clinical translation. These include heterogeneity in protocols and feature definitions, as well as inconsistent reporting of setup parameters and data quality, which limit comparability across studies. In addition, measurement properties that are critical for clinical deployment—such as concurrent validity against clinician-administered assessments, test-retest and inter-rater reliability, and measurement error—have been reported inconsistently across test types and implementations. Specifically, few studies have systematically validated a single, accessible system against a comprehensive suite of established clinical fall risk measures while simultaneously assessing its clinical feasibility and reliability.
This proof-of-concept study therefore aimed to evaluate the measurement properties of an automated depth-camera-based multi-test workflow (TUG, FTSS, and TST), including concurrent validity against blinded therapist measurements, convergent validity, and test-retest and inter-rater reliability, while treating discrimination against retrospective fall history as exploratory.
Materials and methods
We conducted a cross-sectional proof-of-concept study to validate an automated depth camera system for fall risk assessment, following the Standards for Reporting Diagnostic Accuracy Studies (STARD) guidelines. 23 The study protocol was approved by the Institutional Review Board of Sahmyook University (SYU 2022-06-003-001) and registered at ClinicalTrials.gov (NCT06519864). All procedures adhered to the Declaration of Helsinki principles, with written informed consent obtained from all participants before enrollment.
Participants
Between August and December 2023, we recruited community-dwelling adults through a purposive sampling strategy designed to capture a spectrum of fall risk across the adult lifespan. Participants were enrolled from senior community centers and university-affiliated health programs in Seoul, South Korea. Inclusion criteria were: (1) age ≥18 years, (2) ability to ambulate independently for at least 6 meters without assistive devices, (3) Mini-Mental State Examination score ≥24, and (4) provision of written informed consent. The sample size was calculated to detect a correlation coefficient of 0.60 between automated and clinical measurements against a null hypothesis of r = 0.00, with 80% power and α = 0.05, yielding a minimum requirement of 35 participants.24,25 We oversampled by 10% to account for potential technical failures.
Automated assessment system
The automated assessment system utilized a Microsoft Azure Kinect depth camera (Microsoft, USA), which incorporates time-of-flight sensing with AI-powered skeletal tracking. Operating at a 30 Hz capture rate with a maximum of 512 × 512 pixel depth resolution, the system tracked 32 joint positions in real-time.
The custom software required for data extraction and analysis was developed using the C# programming language within the Unity 2021.3 (Unity Technologies, USA) software environment. The LightBuzz Kinect4Azure SDK was employed to interface with the camera and receive skeletal tracking data.
To ensure data stability and mitigate the signal noise inherent in markerless tracking, all 3D joint coordinates, particularly those of the Pelvis joint, were processed in real time with a digital 4th-order Butterworth low-pass filter (6 Hz cutoff). These smoothed data formed the basis for all subsequent kinematic calculations. The core of the automated assessment was a custom-built postural state machine algorithm that classified the participant's state in each frame as either ‘Sit’ or ‘Stand’. The classification was made by comparing the vertical (Y-axis) coordinate of the filtered Pelvis joint against height thresholds established for each participant during an initial calibration.
The specific algorithms for each functional test were as follows:
For the AFTSS, the test timer and data recording were initiated automatically when the system confirmed that the participant was stable in the initial ‘Sit’ state. The algorithm then tracked discrete state transitions, and a repetition counter was incremented by one only when a full transition from the ‘Sit’ state to the ‘Stand’ state was detected. Transitions from ‘Stand’ back to ‘Sit’ were also tracked for validation. The test was concluded automatically, and the final total time was recorded, as soon as the algorithm detected the fifth ‘Stand’ state. During the AFTSS, the system measured the total completion time (s) and the mean vertical velocity of the pelvis (m/s), calculated by averaging the mean vertical velocities recorded during each of the five upward sit-to-stand transitions. A fail-safe mechanism was also implemented: if the algorithm detected no change in postural state for 10 consecutive seconds, the test was automatically terminated and flagged as incomplete.
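The repetition-counting logic described above can be illustrated with a minimal sketch. It is not the authors' C#/Unity implementation: the function name `run_aftss`, the two-threshold hysteresis (`sit_max`, `stand_min`), and the way rise velocity is taken between the last frame inside the Sit band and the frame where Stand is detected are all assumptions made for illustration. Only the 30 Hz frame rate, the five-repetition stop condition, and the 10 s fail-safe come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

FRAME_RATE_HZ = 30.0  # capture rate reported for the depth camera


@dataclass
class FtssResult:
    completed: bool
    total_time_s: Optional[float]
    mean_rise_velocity_m_s: Optional[float]


def run_aftss(pelvis_y: List[float], sit_max: float, stand_min: float,
              timeout_s: float = 10.0, target_reps: int = 5) -> FtssResult:
    """Count Sit->Stand transitions on a filtered pelvis-height series.

    sit_max / stand_min are hypothetical per-participant calibration
    thresholds (metres); frame 0 is assumed to be the confirmed Sit start.
    """
    dt = 1.0 / FRAME_RATE_HZ
    state = "Sit"
    reps = 0
    last_change = 0          # frame index of the most recent state change
    rise_start = None        # last frame still inside the Sit band
    rise_velocities = []
    for i, y in enumerate(pelvis_y):
        if state == "Sit" and y <= sit_max:
            rise_start = i
        if state == "Sit" and y >= stand_min:
            state, last_change, reps = "Stand", i, reps + 1
            if rise_start is not None:
                # Mean vertical velocity over this upward transition.
                rise_velocities.append(
                    (y - pelvis_y[rise_start]) / ((i - rise_start) * dt))
            if reps == target_reps:
                mean_v = (sum(rise_velocities) / len(rise_velocities)
                          if rise_velocities else None)
                return FtssResult(True, i * dt, mean_v)
        elif state == "Stand" and y <= sit_max:
            state, last_change = "Sit", i
            rise_start = i
        elif (i - last_change) * dt >= timeout_s:
            # Fail-safe: no postural change for timeout_s seconds.
            return FtssResult(False, None, None)
    return FtssResult(False, None, None)
```

Using two thresholds rather than one avoids spurious transitions when the pelvis hovers near a single cut-off; whether the original system uses one threshold or two is not stated in the text.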
The ATUG algorithm was governed by a sequential state machine. The test sequence was initiated, and the timer was reset to zero, once the participant was confirmed to be in the stable, initial ‘Sit’ state. The algorithm then progressed through a defined set of phases: (1) sit start, (2) walk (triggered upon detecting a ‘Stand’ posture), and (3) sit end. This state-based progression constitutes the phase decomposition of the test. The total test time was captured when the algorithm detected a return to the ‘Sit’ posture, which triggered the final ‘SitEnd’ state. A 30-s timeout was implemented; if the test did not reach the ‘SitEnd’ state within this period, it was automatically terminated. The total completion time was selected as the primary ATUG endpoint to maximize comparability with conventional manual TUG timing in this proof-of-concept study and to avoid over-interpreting phase-specific metrics in a small sample.
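A compact sketch of this sequential state machine is given below. It takes an already classified per-frame posture stream as input; the names `Phase` and `run_atug` are illustrative, and the sketch assumes that timing starts at the first frame of the stream (taken as the moment the initial ‘Sit’ state is confirmed). The 30 Hz frame rate and the 30 s timeout are the values reported in the text.

```python
from enum import Enum
from typing import Iterable, Optional


class Phase(Enum):
    SIT_START = 1
    WALK = 2
    SIT_END = 3


def run_atug(postures: Iterable[str], frame_rate_hz: float = 30.0,
             timeout_s: float = 30.0) -> Optional[float]:
    """Return total TUG time in seconds, or None if the test timed out.

    `postures` is a per-frame sequence of "Sit" / "Stand" labels.
    """
    phase = Phase.SIT_START
    for i, posture in enumerate(postures):
        t = i / frame_rate_hz
        if t > timeout_s:
            return None                 # SitEnd not reached in time
        if phase is Phase.SIT_START and posture == "Stand":
            phase = Phase.WALK          # participant has stood up
        elif phase is Phase.WALK and posture == "Sit":
            phase = Phase.SIT_END       # participant has sat back down
            return t
    return None
```

For example, a stream of 30 ‘Sit’ frames, 300 ‘Stand’ frames, and a final ‘Sit’ frame yields a total time of 11.0 s.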
For the ATST, data recording was initiated and the timer was started once the system detected that the participant had lifted one foot off the ground. This was determined by calculating the relative distance between lower-limb joints and comparing it to a predefined threshold. During the test, the algorithm continuously monitored and cumulatively summed the 3D displacement of the Pelvis joint to quantify postural sway. The test was programmed to run for a fixed duration of 30 s, after which it automatically stopped and recorded the final sway data, aligning with the “success” condition of the clinical test.
In summary, the system produced the following outputs. For the ATST, it continuously monitored the pelvic trajectory over 30 s to yield the cumulative sway path length of the pelvis (m) (referred to as ATST displacement), mean sway velocity (m/s), and sway acceleration (m/s²). For the AFTSS, it measured the total completion time (s) and the mean vertical velocity of the pelvis during the sit-to-stand transitions (AFTSS velocity, m/s). For the ATUG, it calculated the total completion time (s). While the system is technically capable of providing phase decomposition and turn-related kinematic outputs, the present study used the total completion time as the primary ATUG outcome to maximize comparability with conventional manual TUG timing. Therefore, turn-related outputs were not included in the primary analyses.
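As an illustration of the sway outputs, the sketch below computes the cumulative 3D path length of the pelvis and the corresponding mean sway velocity from a sequence of filtered joint positions. The function name `sway_metrics` is hypothetical, and mean velocity is assumed here to be path length divided by recording duration; sway acceleration is left out because the text does not define how it is derived.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # pelvis position (x, y, z) in metres


def sway_metrics(pelvis: List[Point],
                 frame_rate_hz: float = 30.0) -> Tuple[float, float]:
    """Return (cumulative path length in m, mean sway velocity in m/s)."""
    # Sum the Euclidean distance between each pair of consecutive frames.
    path = sum(math.dist(a, b) for a, b in zip(pelvis, pelvis[1:]))
    duration = (len(pelvis) - 1) / frame_rate_hz
    mean_velocity = path / duration if duration > 0 else 0.0
    return path, mean_velocity
```

At 30 Hz, a full 30-s trial corresponds to roughly 900 frame-to-frame displacements, so residual tracking noise accumulates in the path length; this is one reason the low-pass filtering step matters for this metric.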
Detailed postural state machine and kinematic calculations with explicit equations, diagram of the postural state machine used to classify Sit/Stand based on participant-specific pelvis-height calibration and stability criteria, and test-specific flowcharts are provided in the Supplementary Materials.
Clinical assessment
Clinical assessments were conducted by two experienced physical therapists, each with at least five years of experience in geriatric evaluation. To ensure objectivity, both therapists were blinded to the automated system's results and to each other's measurements. All timed tests were performed using calibrated digital stopwatches (CL-066, Accuresearch, Republic of Korea) with a precision of ±0.01 s. The clinical Tandem Stance Test (TST) was evaluated as a binary outcome, recording success or failure to maintain the position for 30 s. The Five Times Sit-to-Stand (FTSS) and Timed Up and Go (TUG) tests were administered and timed according to widely established, standardized protocols to ensure consistency and comparability.
In addition to these primary tests, a suite of additional established clinical fall risk assessments was administered. The Short Physical Performance Battery (SPPB) assessed lower extremity function through balance, gait, and sit-to-stand tests. 26 The Berg Balance Scale (BBS), using the Korean translated version, measured balance through 14 functional tasks.27,28 The Functional Reach Test (FRT) evaluated limits of stability, 29 the Four-Square Step Test (FSST) assessed dynamic balance, 30 and Grip Strength Ratio (grip strength/body weight, GSR) was measured with a dynamometer as an indicator of overall strength.31,32
Study procedure
Assessments took place in a living laboratory at Sahmyook University, which simulated a home environment while maintaining standardization. Each test (TST, FTSS, TUG) was performed four times: one practice trial, two trials measured by therapist A (for test-retest reliability), and one trial measured by therapist B (for inter-rater reliability). Three-minute rest intervals were provided to prevent fatigue (Figure 1A). Safety personnel remained within arm's reach without providing support unless balance was lost.

Figure 1. Study procedure and experimental setup. (A) Flow diagram of the study procedure. Participants first completed the suite of baseline clinical assessments (SPPB, BBS, FRT, FSST, GSR). Subsequently, for each of the three primary tests (TUG, FTSS, TST), they performed one practice trial followed by three main trials separated by 3-min rest intervals. All main trials were concurrently measured by the automated depth camera system. (B) Schematic of the standardized experimental setup showing the depth camera placement (height: 0.90 m), chair seat height (0.46 m), the 3-m TUG walkway with the turn-around point, and the camera-to-chair distance (5 m). Abbreviations: SPPB, Short Physical Performance Battery; BBS, Berg Balance Scale; FRT, Functional Reach Test; FSST, Four-Square Step Test; GSR, Grip Strength Ratio; TUG, Timed Up and Go; FTSS, Five Times Sit-to-Stand; TST, Tandem Stance Test.
The depth camera was mounted on a tripod at a height of 0.90 m and positioned at a fixed location relative to the chair and walkway (Figure 1B). The chair seat height was 0.46 m. For the TUG, a 3-m walkway was marked on the floor with a turn-around cone placed at the end of the walkway. The camera-to-chair distance was 5 m, and the camera optical axis was aligned with the walkway centerline with a fixed orientation (yaw 0° and pitch −10°, tilted downward). Camera placement (height, distance, and orientation) was fixed and identical across all participants and all tests using floor markings and a fixed tripod setup.
Statistical analysis
All statistical analyses were performed using SPSS (version 26.0, IBM, USA), with p < 0.05 considered statistically significant. The normality of the data was assessed using the Kolmogorov-Smirnov test. Baseline differences between the younger and older groups were assessed using independent samples t-tests, or Mann-Whitney U tests for non-normally distributed data. Intraclass Correlation Coefficient (ICC) models were selected based on the experimental design to appropriately evaluate reliability and validity.33 Specifically, a two-way random-effects model with absolute agreement, ICC2,1, was used to evaluate concurrent validity (between the automated system and therapists) and inter-rater reliability. Conversely, a two-way mixed-effects model, ICC3,1, was employed for test-retest reliability, as the repeated trials were assessed by the same fixed rater.
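To make the difference between the two ICC forms concrete, the following sketch computes both from the two-way ANOVA mean squares using the standard Shrout and Fleiss formulas. The study itself used SPSS; the function name `icc_2_1_and_3_1` and the example data are illustrative only.

```python
from typing import List, Tuple


def icc_2_1_and_3_1(data: List[List[float]]) -> Tuple[float, float]:
    """Single-measure ICC(2,1) and ICC(3,1) for an n-subjects x k-raters table.

    ICC(2,1): two-way random effects, absolute agreement.
    ICC(3,1): two-way mixed effects, consistency.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc21 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc31 = (msr - mse) / (msr + (k - 1) * mse)
    return icc21, icc31


# Hypothetical example: rater B reads exactly 1 s longer than rater A.
icc21, icc31 = icc_2_1_and_3_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
```

In this example the constant 1-s offset leaves the consistency form ICC(3,1) at 1.0 but lowers the absolute-agreement form ICC(2,1) to about 0.67, which is why the absolute-agreement model is the stricter choice when comparing an automated system against therapist timing.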
To evaluate validity, several approaches were used. Concurrent validity between the automated and therapist-measured FTSS and TUG was assessed using Pearson correlation (r), the coefficient of determination (r²), and ICC2,1.34 To visually assess the agreement and proportional bias between automated and therapist measurements, scatter plots were generated for both the TUG and FTSS tests. For the TST, because the automated test yielded continuous sway data (such as displacement) while the clinical assessment was a binary (pass or fail) outcome, a Receiver Operating Characteristic (ROC) curve analysis was used to evaluate how well the system's displacement data could classify the binary outcome of the therapist's test.35,36 Convergent validity was determined by correlating the automated tests with the SPPB, BBS, FRT, FSST, and GSR.37 Discriminative validity was assessed using ROC curve analysis, with the area under the curve (AUC) quantifying the ability of each automated test to distinguish between participants with and without a history of falls and the Youden index used to find optimal cut-off values.38–40 AUC values were interpreted as follows: poor (0.50–0.69), acceptable (0.70–0.79), excellent (0.80–0.89), and outstanding (≥0.90).41
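The ROC quantities used here can be sketched in a few lines. The example below computes the AUC as the probability that a randomly chosen faller scores higher than a randomly chosen non-faller (ties counted as one half) and selects the cut-off that maximizes the Youden index J = sensitivity + specificity − 1. It assumes that higher values indicate higher risk (as with completion times), and the numbers are synthetic, not study data.

```python
from typing import List, Tuple


def auc(pos: List[float], neg: List[float]) -> float:
    """AUC via pairwise comparison of positives (fallers) and negatives."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(pos: List[float], neg: List[float]) -> Tuple[float, float]:
    """Return (cutoff, J), classifying score >= cutoff as positive."""
    best_j, best_c = -1.0, float("nan")
    for c in sorted(set(pos) | set(neg)):
        sensitivity = sum(p >= c for p in pos) / len(pos)
        specificity = sum(n < c for n in neg) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:          # keep the lowest cutoff among ties
            best_j, best_c = j, c
    return best_c, best_j


# Synthetic completion times in seconds (illustrative only).
fallers = [14.0, 15.5, 16.2, 13.0]
non_fallers = [10.1, 11.0, 12.5, 13.5]
print(auc(fallers, non_fallers))            # 0.9375
print(youden_cutoff(fallers, non_fallers))  # (13.0, 0.75)
```

For a measure where lower values indicate higher risk, such as sit-to-stand velocity, the values would need to be negated (or the comparison reversed) before applying the same functions.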
Test-retest reliability was assessed by having a single rater measure two separate trials performed by the same participant, while inter-rater reliability was assessed by having two raters independently measure the same performance trial. Reliability was quantified using r, r², and ICC (ICC3,1 for test-retest; ICC2,1 for inter-rater).34 Absolute reliability was further assessed using the Standard Error of Measurement (SEM) and the 95% Minimal Detectable Change (95%MDC), with Bland-Altman plots used to visualize agreement.42,43 The correlation and ICC grading followed the guidelines: <0.50 = poor; 0.50–0.75 = moderate; 0.75–0.90 = good; >0.90 = excellent.
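The text does not state the formulas used for the absolute reliability indices. A minimal sketch under the conventional definitions, SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, together with the Bland-Altman bias and 95% limits of agreement, is shown below; the function names and input values are illustrative assumptions.

```python
import math
import statistics
from typing import List, Tuple


def sem_and_mdc95(sd: float, icc: float) -> Tuple[float, float]:
    """Conventional SEM and MDC95 from a sample SD and an ICC."""
    sem = sd * math.sqrt(1.0 - icc)
    mdc95 = 1.96 * math.sqrt(2.0) * sem
    return sem, mdc95


def bland_altman(a: List[float], b: List[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical inputs: SD of 2.0 s across participants, ICC of 0.91.
sem, mdc95 = sem_and_mdc95(sd=2.0, icc=0.91)
```

With these inputs the SEM is about 0.60 s and the MDC95 about 1.66 s, meaning that a change smaller than roughly 1.7 s between two sessions could not be distinguished from measurement error at the 95% level.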
To explore the potential of combining multiple depth camera variables, two multivariate approaches were employed using fall history as the primary outcome measure.44
First, a multivariate logistic regression model with backward elimination was used to identify an optimal combination of predictors; the criterion for variable removal was set at p > 0.10. Second, a composite fall risk score was developed by integrating four key variables (ATUG, AFTSS, AFTSS velocity, and ATST sway) using an AUC-weighted z-score normalization method. Specifically, each variable's raw value ($x_i$) was standardized to a z-score:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where $\mu_i$ and $\sigma_i$ are the sample mean and standard deviation of variable $i$.
A weight ($w_i$) was then assigned to each variable in proportion to its individual AUC for fall history:
$$w_i \propto \mathrm{AUC}_i$$
The final composite score for each participant was calculated as the weighted sum of these normalized values:
$$\text{Composite score} = \sum_{i=1}^{4} w_i z_i$$
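The composite score construction can be sketched as follows. Two details are assumptions rather than statements from the text: the weights are normalized so that they sum to one (the text says only that they are proportional to each variable's AUC), and a hypothetical `signs` argument flips variables for which lower values indicate higher risk, such as sit-to-stand velocity, so that a higher composite always means higher risk.

```python
import statistics
from typing import Dict, List


def composite_scores(columns: Dict[str, List[float]],
                     aucs: Dict[str, float],
                     signs: Dict[str, int]) -> List[float]:
    """AUC-weighted sum of z-scores, one score per participant.

    columns: variable name -> raw values (same participant order in each)
    aucs:    variable name -> individual AUC for fall history
    signs:   variable name -> +1 or -1 (hypothetical orientation flag)
    """
    total_auc = sum(aucs[name] for name in columns)
    n = len(next(iter(columns.values())))
    scores = [0.0] * n
    for name, values in columns.items():
        mu = statistics.mean(values)
        sd = statistics.stdev(values)        # sample standard deviation
        weight = aucs[name] / total_auc      # assumed normalization
        for i, x in enumerate(values):
            scores[i] += weight * signs[name] * (x - mu) / sd
    return scores


# Illustrative values only (three participants, two variables).
demo = composite_scores(
    columns={"time_s": [1.0, 2.0, 3.0], "velocity_m_s": [3.0, 2.0, 1.0]},
    aucs={"time_s": 0.9, "velocity_m_s": 0.6},
    signs={"time_s": +1, "velocity_m_s": -1},
)
```

In this toy example the slowest participant, who also has the lowest velocity, receives the highest composite score (1.0) and the fastest receives the lowest (−1.0).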
The robustness of all reported AUC values was confirmed by calculating bootstrap confidence intervals with 2000 iterations. Finally, to address the potential confounding effect of age, a subgroup analysis was conducted comparing functional measures between fallers and non-fallers within the older adult (≥65 years) cohort only, using independent t-tests or Mann-Whitney U tests.
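A bootstrap confidence interval for an AUC can be sketched as below. The text specifies only the number of iterations (2000); the percentile method, resampling within the faller and non-faller groups separately, and the fixed seed are assumptions made for this illustration.

```python
import random
from typing import List, Tuple


def auc(pos: List[float], neg: List[float]) -> float:
    """AUC via pairwise comparison, counting ties as one half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(pos: List[float], neg: List[float],
                     n_boot: int = 2000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUC (stratified resampling)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        p = rng.choices(pos, k=len(pos))
        n = rng.choices(neg, k=len(neg))
        stats.append(auc(p, n))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]
```

With groups as small as those in the age-stratified subgroup, such intervals are expected to be wide, which is consistent with treating the subgroup AUC values as exploratory.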
Results
Participant characteristics
The flow of participants through each stage of the study, reported in accordance with the STARD guidelines, is detailed in Figure 2. From an initial 45 potentially eligible individuals, 39 participants (11 males, 28 females) were included in the analysis. Detailed demographic and clinical characteristics are presented in Table 1.

Figure 2. STARD flow diagram (adapted) for the validation of automated fall risk assessment using depth camera technology. All participants completed automated depth camera assessments (index test) and were classified using the composite fall risk score with an exploratory cutoff of 1.12. The reference standard (12-month fall history) identified 12 fallers (30.8%) and 27 non-fallers (69.2%). No inconclusive results occurred. Classification yielded 11 true positives, 4 false positives, 23 true negatives, and 1 false negative.
Table 1. General characteristics of the participants.
Note. Values are expressed as mean ± standard deviation (SD).
BMI = body mass index; TST = Tandem Stance Test; FTSS = Five Times Sit-to-Stand Test; TUG = Timed Up and Go; SPPB = Short Physical Performance Battery; BBS = Berg Balance Scale; FRT = Functional Reach Test; FSST = Four Square Step Test; GS = Grip Strength; GSR = Grip Strength Ratio.
Concurrent validity
The automated system showed strong correlations with therapist measurements (Table 2). The ATUG and AFTSS demonstrated excellent agreement, while the ATST showed good classification performance against the binary pass/fail outcome of the therapist-rated TST. The data points in Figure 3 for both the AFTSS vs. PFTSS and ATUG vs. PTUG comparisons clustered tightly around the regression line, indicating a high level of agreement and minimal systematic bias.
Table 2. Concurrent validity of the automated measurements (n = 39).
Note. TUG = Timed Up and Go; TST = Tandem Stance Test; FTSS = Five Times Sit-to-Stand Test.
*TST compared using ROC analysis (continuous vs binary outcome).

Figure 3. Concurrent validity of automated timing measures against the reference standard. (a) Scatter plot of the Automated Five Times Sit-to-Stand (AFTSS) test against the physical therapist measurement (PFTSS). (b) Scatter plot of the Automated Timed Up and Go (ATUG) test against the physical therapist measurement (PTUG). Each dot represents a single test trial from the participants, shown with the linear regression line ± 95% confidence interval.
Construct validity
Construct validity was assessed through convergent and discriminative analyses. For convergent validity, the AFTSS and ATUG tests demonstrated moderate to strong correlations with a range of established clinical fall risk measures (Table 3). Notably, both the AFTSS (r = −0.887) and ATUG (r = −0.889) showed strong negative correlations with the total SPPB score, indicating that longer test times were associated with poorer physical function. Moderate correlations were also observed with the BBS, FRT, and FSST. In contrast, the ATST showed only weak or non-significant correlations with most clinical measures, suggesting limited convergent validity.
Table 3. Convergent validity assessed with clinical assessments (n = 39).
Note. AFTSS Velocity represents the mean vertical velocity of the pelvis across the five sit-to-stand transitions. ATST Displacement represents the total cumulative 3D displacement of the pelvis over the 30-s trial. **p < 0.01.
Abbreviations: TUG = Timed Up and Go; ATST = Automated Tandem Stance Test; AFTSS = Automated Five Times Sit-to-Stand Test; SPPB = Short Physical Performance Battery; BBS = Berg Balance Scale; FRT = Functional Reach Test; FSST = Four Square Step Test; GS/kg = Grip Strength/body weight.
For discriminative validity, the system's ability to distinguish between participants with and without a fall history was evaluated (Table 4). In the exploratory full-sample analysis, AFTSS time (AUC = 0.941; cutoff 13.73 s) and ATUG time (AUC = 0.864; cutoff 12.13 s) showed high discrimination for retrospective fall history. However, given the age imbalance between fallers and non-fallers, these estimates are likely inflated; age-stratified analyses are presented in Table 5.
Table 4. Discriminative validity for retrospective fall history using the automated assessments (n = 39).
Note. AFTSS Velocity represents the mean vertical velocity of the pelvis across the five sit-to-stand transitions. ATST Displacement represents the total cumulative 3D displacement of the pelvis over the 30-s trial. ‡Weighted combination of all measures.
Abbreviations: TUG = Timed Up and Go; ATST = Automated Tandem Stance Test; AFTSS = Automated Five Times Sit-to-Stand Test; PPV = positive predictive value; NPV = negative predictive value.
Table 5. Discriminative validity for retrospective fall history within older adults (≥65 years, n = 22).
Note. AFTSS Velocity represents the mean vertical velocity of the pelvis across the five sit-to-stand transitions. ATST Displacement represents the total cumulative 3D displacement of the pelvis over the 30-s trial. ‡Weighted combination of all measures.
Abbreviations: TUG = Timed Up and Go; ATST = Automated Tandem Stance Test; AFTSS = Automated Five Times Sit-to-Stand Test; PPV = positive predictive value; NPV = negative predictive value.
The composite score was developed by integrating four variables from the depth camera assessments: ATUG time, AFTSS time, AFTSS vertical velocity, and ATST sway velocity. The raw values (xi) were converted to Z-scores for standardization. A weight (wi) was then calculated for each variable, proportional to its individual ability to discriminate fall history as measured by its Area Under the Curve (AUC) value. The final composite score for each participant was then calculated as the weighted sum of the normalized Z-scores (Figure 4).

Figure 4. Flowchart for the development of the AUC-weighted depth camera-based composite score. A composite score was developed by integrating four standardized (Z-score) variables from the depth camera assessments. Each variable was weighted based on its individual discriminative performance (AUC for fall history), with the final score calculated as the weighted sum of these values.
The results of an age-stratified subgroup ROC analysis on the older adult cohort (≥65 years, n = 22) are presented in Table 5. The automated measures showed promising discriminative potential within this older group. The AFTSS demonstrated acceptable to good discriminative ability for fall history (AUC = 0.892) at an optimal cutoff of 15.54 s. The exploratory composite score yielded an AUC of 0.867. Furthermore, the ATUG time and the AFTSS vertical velocity showed moderate discriminative potential (AUC = 0.717 and 0.733, respectively). Consistent with the full sample analysis, the ATST displacement did not effectively discriminate fallers in this small cohort (AUC = 0.550).
In addition to the ROC analysis, independent comparisons of functional measures within this older adult subgroup (Table 6) confirmed that the automated timing measures, AFTSS (p = 0.016) and ATUG (p = 0.032), both remained significant discriminators, showing poorer performance (47.14% and 31.75% slower, respectively) in fallers. Physical therapist measurements showed mixed results; the manual FTSS was also significant (p = 0.041), but the manual TUG showed no statistically significant difference (p = 0.051).
Table 6. Comparison of functional measures by fall status for the full sample and age-stratified subgroup (≥65 years, n = 22).
Note. AFTSS Velocity represents the mean vertical velocity of the pelvis across the five sit-to-stand transitions. ATST Displacement represents the total cumulative 3D displacement of the pelvis over the 30-s trial. * Physical therapist's measurement.
Abbreviations: ATUG = Automated Timed Up and Go; ATST = Automated Tandem Stance Test; AFTSS = Automated Five Times Sit-to-Stand Test; TUG = Timed Up and Go; FTSS = Five Times Sit-to-Stand Test.
Reliability
The reliability of the automated assessments was robust. For test-retest reliability, all three automated tests demonstrated good to excellent consistency across two separate trials, with ICC values ranging from 0.858 to 0.976 (Table 7). Notably, the ATST (ICC = 0.858) showed higher test-retest reliability than the therapist-measured PTST (ICC = 0.694). For inter-rater reliability, the automated tests showed moderate to good agreement when assessed by two different raters, with ICC values ranging from 0.579 to 0.885 (Table 7). The Standard Error of Measurement (SEM) and 95% Minimal Detectable Change (95%MDC) values indicated a reliable level of precision for the AFTSS and ATUG.
Table 7. Reliability evaluations of the automated and physical therapists’ measurements (n = 39).
Note. ATUG = Automated Timed Up and Go; ATST = Automated Tandem Stance Test; AFTSS = Automated Five Times Sit-to-Stand Test; PFTSS = Physical Therapist's Five Times Sit-to-Stand Test; PTUG = Physical Therapist's Timed Up and Go Test; SEM = Standard Error of Measurement; MDC95 = Minimal Detectable Change at 95% confidence.
Discussion
This proof-of-concept study provides preliminary evidence that a depth camera system can be used for automated fall risk assessment with promising reliability and technical feasibility. The strong concurrent validity with therapist measurements suggests technical feasibility, while the encouraging discriminative accuracy for fall history, particularly for the AFTSS test, warrants further investigation. However, as a proof-of-concept study with a modest sample size and cross-sectional design, these findings should be considered hypothesis-generating rather than definitive.
Within the existing depth-camera/Kinect-based fall-risk literature, which includes automated TUG with phase-level parameters,16 Kinect-based FTSS measures such as transition velocity,17 and larger prospective multifactorial batteries,18 the present study contributes primarily by reporting measurement properties of a standardized, clinically familiar multi-test workflow implemented on an accessible depth-camera platform. Specifically, we provide concurrent validity against blinded therapist measurements, test-retest and inter-rater reliability, and measurement error estimates, which are essential for interpreting change over time and for planning future prospective validation and implementation studies.
A key objective of this study was to determine if novel kinematic data, such as sit-to-stand movement velocity, offered insights beyond traditional timing. In the full sample analysis, this appeared highly promising, with the AFTSS vertical velocity being 34.8% lower in participants with a fall history (p < 0.001), suggesting it as a potential digital biomarker. 45 This aligns with laboratory-based research indicating that lower extremity power generation is a critical factor in maintaining balance and preventing falls. 9 By quantifying the quality of movement rather than just its duration, depth camera technology has the potential to add a new layer of biomechanical insight to routine clinical tests, using accessible equipment. 46
The high discriminative accuracy observed in the full sample (AUC = 0.941 for AFTSS) was likely influenced by the age differences between the younger cohort (no fall history) and the older cohort (which included all participants with a fall history), introducing an age-related confounding effect. 47 To explore whether the system's metrics might reflect fall risk beyond mere age differences, we conducted an exploratory age-stratified subgroup analysis isolating the older adult cohort (≥65 years, n = 22). Within this small but clinically relevant population, the AFTSS time and the composite score yielded AUC values of 0.892 and 0.867, respectively, for identifying retrospective fall history. Furthermore, the novel kinematic metric of AFTSS vertical velocity showed moderate discriminative potential (AUC = 0.733) in this subgroup. While the small sample size warrants cautious interpretation, these preliminary findings suggest that the automated depth camera metrics may capture functional decline and biomechanical alterations associated with fall risk, rather than simply acting as an ‘age detector.’ However, larger prospective studies in age-matched older adults are necessary to confirm these exploratory results.
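The effect of age confounding on discrimination can be illustrated with a small sketch. It computes the AUC through its Mann-Whitney interpretation (the probability that a randomly chosen participant with a fall history has a higher score than one without, counting ties as one half), first in a mixed-age sample and then in the ≥65 subgroup only. All records below are invented for illustration and are not data from this study; the function and variable names are likewise hypothetical.

```python
def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), with ties counted as 0.5."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def auc_for(records):
    """AUC of test time for fall history within a list of (age, fell, time_s)."""
    fallers = [t for _, fell, t in records if fell]
    non_fallers = [t for _, fell, t in records if not fell]
    return auc_mann_whitney(fallers, non_fallers)


# Hypothetical records: (age in years, fall history, sit-to-stand time in s).
records = [
    (24, False, 7.9), (31, False, 8.4), (28, False, 8.1),
    (68, False, 10.4), (71, False, 11.9), (66, False, 12.1),
    (74, True, 12.8), (79, True, 14.2), (70, True, 11.5), (82, True, 16.0),
]

auc_full = auc_for(records)                       # mixed-age sample
older = [r for r in records if r[0] >= 65]        # age-stratified subgroup
auc_older = auc_for(older)
print(f"full sample AUC = {auc_full:.3f}, >=65 subgroup AUC = {auc_older:.3f}")
```

In this toy example the full-sample AUC is higher than the subgroup AUC because the fast, young non-fallers are trivially separated from the older fallers, which mirrors the reasoning behind the age-stratified analysis.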
The clinical implementation of fall prevention strategies is often hindered by a significant gap between providers’ beliefs and their perceived expertise. While the vast majority of providers believe fall risk assessment is effective and should be universally applied to older adults, a previous report suggested that providers lack the specific expertise to conduct these assessments and that awareness of evidence-based toolkits remains low. 48 This is a critical barrier that accessible technology could help overcome. A low-cost, automated system, such as the one evaluated in this study, could lower the barrier to screening by providing an objective, easy-to-use tool that does not require specialized biomechanical expertise. This could empower clinicians in primary care or community settings to conduct regular, standardized monitoring, complementing their clinical judgment with reliable, documented data for tracking functional changes over time.
To explore the potential of combining measures, a composite score was also developed. This score, which integrated multiple depth camera-derived kinematic parameters, also demonstrated high discriminative accuracy (AUC = 0.929). This suggests that a multivariate approach can effectively model the multifactorial nature of fall risk.14,49 Interestingly, the composite score did not provide additional discriminative value beyond the single AFTSS metric, either in the full sample (AUC = 0.929 vs. 0.941) or within the older adult subgroup (AUC = 0.867 vs. 0.892). This suggests that sit-to-stand performance was already the dominant variable captured by the model, and the inclusion of weaker, non-significant variables (such as ATST displacement) did not provide further clinical utility in this specific cohort.
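One plausible way to build an AUC-weighted composite is sketched below: each metric is standardized, oriented so that higher values indicate higher risk, and weighted in proportion to how far its individual AUC exceeds chance (AUC − 0.5). This weighting scheme is an assumption chosen for illustration; the exact formula, metric names, and values used in the study are not reproduced here, and everything in the code is hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of values using the population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant metric")
    return [(v - m) / s for v in values]


def auc_weights(aucs):
    """Weights proportional to each metric's excess over chance (AUC - 0.5)."""
    excess = {name: max(auc - 0.5, 0.0) for name, auc in aucs.items()}
    total = sum(excess.values())
    if total == 0:
        raise ValueError("no metric discriminates better than chance")
    return {name: e / total for name, e in excess.items()}


def composite_scores(metric_values, aucs, higher_is_riskier):
    """Weighted sum of oriented z-scores, one composite value per participant."""
    weights = auc_weights(aucs)
    n = len(next(iter(metric_values.values())))
    scores = [0.0] * n
    for name, values in metric_values.items():
        sign = 1.0 if higher_is_riskier[name] else -1.0
        for i, z in enumerate(zscores(values)):
            scores[i] += weights[name] * sign * z
    return scores


# Hypothetical metrics for four participants (NOT data from this study).
metric_values = {
    "ftss_time": [8.0, 9.0, 12.0, 15.0],
    "tug_time": [7.0, 8.0, 10.0, 13.0],
    "sway": [1.0, 3.0, 2.0, 1.5],
}
aucs = {"ftss_time": 0.9, "tug_time": 0.8, "sway": 0.5}
higher_is_riskier = {"ftss_time": True, "tug_time": True, "sway": True}

weights = auc_weights(aucs)
scores = composite_scores(metric_values, aucs, higher_is_riskier)
```

Under this scheme a metric at chance level receives zero weight, so the composite is driven by the strongest single metric. That is consistent with the observation that adding weak variables need not improve on the best individual measure.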
Tandem stance sway metrics showed limited discriminative ability in this study, a finding that contrasts with previous investigations but aligns with emerging understanding of postural control complexity. 50 Notably, greater sway in community-dwelling adults may sometimes reflect adaptive compensatory strategies to maintain stability rather than inherent instability. 51 Recent studies have demonstrated that during complex postural tasks, key stabilizing muscles play essential roles in maintaining balance against gravity, activating compensatory balance control mechanisms through heightened muscle activation and strong neuromuscular coordination.52,53 Previous studies suggested that while sophisticated sway parameters from laboratory equipment showed promise, simple displacement measures often failed to predict falls prospectively. 54 The distinction between static balance assessment (maintaining a fixed position) and dynamic balance during functional tasks (sit-to-stand, walking, turning) appears crucial for fall risk evaluation, with dynamic measures consistently demonstrating superior predictive validity compared to static balance measures in community-dwelling older adults. 55 Dynamic balance tasks engage complex neuromuscular coordination patterns and adaptive strategies that better reflect real-world fall scenarios, since most falls occur during dynamic activities such as walking, turning, or transitional movements rather than during quiet standing.56,57
To bridge technical feasibility and clinical implementation, future work should explicitly engage clinicians as key stakeholders using a human-centered co-design approach. Involving physical therapists, geriatricians, and primary care providers could refine the selection and sequencing of test protocols, safety procedures, and the interface and reporting format so that outputs are clinically actionable. Clinician feedback would also help define decision thresholds and identify how automated assessments can be integrated into existing clinical workflows. Alongside technical validation, subsequent studies should incorporate clinician-focused usability and acceptability evaluations using a mixed-methods approach, combining standardized usability scales with qualitative methods to capture barriers, facilitators, and interpretation needs in real-world settings.
Several limitations must be acknowledged. First, the cross-sectional design with retrospective fall history assessment limits causal inference and is subject to recall bias. Prospective validation is essential to establish predictive validity for future falls. Second, the sample size, while adequate for this proof-of-concept, is insufficient for establishing definitive clinical cutoff values. Third, a major limitation of the full-sample analysis is the severe confounding effect of age, which resulted from the concentration of falls in the older adult group. This methodological issue underscores why the high discriminative accuracy values from the full sample are likely inflated and not generalizable. While our exploratory subgroup analysis of older adults partially mitigated this issue and demonstrated promising discriminative potential, the small sample size (n = 22) means these findings remain preliminary. Fourth, we tested only one depth camera model; validation across different systems would be beneficial.
Furthermore, while our system successfully explored specific novel kinematic biomarkers such as FTSS vertical velocity, we acknowledge that the current analysis did not fully exploit the broader spectrum of kinematic data theoretically available, such as step length, trunk inclination, or turning radius. We chose not to leverage these variables fully due to the well-recognized kinematic challenges in a single, front-facing camera setup. Capturing complex rotational movements around the vertical axis during turns often leads to severe body segment occlusion and relies heavily on the camera's fixed orientation. 19 In addition, in IMU-based gait analyses, event detection and parameter estimation have been reported to be less robust during turning compared with straight walking, highlighting the methodological complexity of turn-specific metrics. 58 Beyond turning, extracting robust steady-state spatio-temporal gait parameters (e.g., step length) during the short 3-m TUG walk, which consists primarily of acceleration and deceleration phases, proved equally challenging. Together, these factors make the robust analysis of complex gait and turn phases technically demanding without multi-sensor or multi-camera configurations.
This proof-of-concept study provides a foundation for larger, prospective trials. The primary focus for future work must be validation in an age-matched cohort to determine if these automated measures can truly predict incident falls within a high-risk demographic, rather than simply discriminating between young and old participants.
Conclusion
In conclusion, this study suggests that depth camera-based automated systems can evaluate fall-related function with promising reliability and technical feasibility. Although encouraging, these findings should be considered preliminary, as larger prospective studies are still required to validate these depth camera variables and establish their role in clinical practice. This work supports the continued development of accessible, objective tools that could one day enhance fall risk screening and prevention strategies for aging adults.
Supplemental Material
Supplemental material, sj-docx-1-thc-10.1177_09287329261445778 for Validity and reliability of an automated fall risk assessment system using a depth camera in community-dwelling adults: A proof-of-concept pilot study by Sungbae Jo in Technology and Health Care
Supplemental material, sj-docx-2-thc-10.1177_09287329261445778 for Validity and reliability of an automated fall risk assessment system using a depth camera in community-dwelling adults: A proof-of-concept pilot study by Sungbae Jo in Technology and Health Care
Footnotes
Acknowledgments
The author thanks the participants of the study for their time and cooperation.
Ethical considerations
The study protocol was approved by the Institutional Review Board of Sahmyook University (SYU 2022-06-003-001) and the study was conducted in accordance with the principles of the Declaration of Helsinki.
Consent to participate
Written informed consent to participate in the research was obtained from all individuals prior to their enrollment.
Consent for publication
Not applicable.
Author contributions
SJ conceived the study, designed the protocol, developed the automated software, performed data collection and analysis, and wrote the manuscript.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
Data supporting the findings of this study are available from the corresponding author upon reasonable request.
Supplemental material
Supplemental material for this article is available online.
