Abstract
Background:
The Rowland Universal Dementia Assessment Scale (RUDAS) is a cognitive test with favorable diagnostic properties for detecting dementia and a low influence of education and cultural biases.
Objective:
We aimed to validate the RUDAS in people with Alzheimer’s disease (AD), Parkinson’s disease (PD), and multiple sclerosis (MS).
Methods:
We enrolled two hundred and seventy participants (60 with AD, 30 with PD, 60 with MS, and 120 healthy controls (HC)). All clinical groups completed a comprehensive neuropsychological battery, the RUDAS, and the standard cognitive test for each disorder: MMSE, SCOPA-COG, and Symbol Digit Modalities Test, respectively. Intergroup comparisons between clinical groups and HC were performed, and ROC curves were estimated. Random Forest algorithms were trained and validated to detect cognitive impairment using RUDAS and rank the most relevant scores.
Results:
The RUDAS scores were lower in patients with AD, and in patients with PD and MS with cognitive impairment, compared to healthy controls. Effect sizes were generally large. The total score was the most discriminative, followed by the memory score. Correlations with standardized neuropsychological tests were moderate to high. Random Forest algorithms obtained accuracies over 80–90% using the RUDAS for diagnosing AD and cognitive impairment associated with PD and MS.
Conclusion:
Our results suggest the RUDAS is a valid candidate as a multi-disease cognitive screening tool in AD, PD, and MS.
INTRODUCTION
Cognitive screening tests are the first step in neuropsychological assessment and constitute useful tools for detecting cognitive impairment and guiding diagnosis, treatment and monitoring [1, 2]. Several screening measures have been developed for different neurodegenerative diseases, giving rise to a wide range of available screening tests at present [3–5].
Ideally, a neuropsychological test should be useful in different settings, populations, and disorders, allowing the comparison between different clinical conditions, reducing possible cultural biases that could influence performance during cognitive assessment, and facilitating rater training. A “broad spectrum” cognitive test should also be helpful to detect incident cognitive impairment in community sample studies. Nevertheless, very few screening tools have been studied for their application in different neurodegenerative diseases. Due to their prevalence and cognitive implications, three disorders are among the most frequent reasons for cognitive assessment: Alzheimer’s disease (AD), Parkinson’s disease (PD), and multiple sclerosis (MS).
AD is the most common cause of dementia [6]. The most prevalent symptoms at the early stages of the disease are the impairment of cognitive functions, particularly episodic memory deficits, and changes in functioning and behavior [7]. Several screening tests have been proposed for the cognitive assessment of AD patients, with differences in length and number of cognitive domains evaluated [8]. The Mini-Mental State Examination (MMSE) [9] is still the most used screening test for AD [3]. However, a substantial body of literature has shown notable limitations of the MMSE regarding language, culture, ethnicity, and education [10–18]. The Rowland Universal Dementia Assessment Scale (RUDAS) has been proposed as an alternative for multicultural cognitive assessments [19]. It includes different items to measure visuospatial orientation, praxis, visuoconstructional drawing, judgment, memory, and semantic fluency, with a maximum total score of 30. Several studies have supported the suitable psychometric and ecological properties of RUDAS compared with MMSE [8, 21]. According to a meta-analysis, RUDAS was less affected by language and education than MMSE [22], supporting its utility for screening dementia from a multicultural perspective. Very few studies have evaluated the diagnostic capacity of RUDAS in mild cognitive impairment or prodromal AD [22–24].
While some advances have been made in the field of AD, cross-cultural tests have been less studied in PD, the second most frequent neurodegenerative disorder and a common cause of cognitive impairment [25], and in MS, which is the most common non-traumatic disabling disease in young adults [26, 27]. Both disorders are often associated with cognitive impairment, especially with ageing [28, 29]. In the case of PD, cognitive impairment can range from mild cognitive impairment to severe dementia. In particular, executive functioning, attention, visuospatial abilities, and memory deficits are common in PD [30]. Previous studies already highlighted the limitations of the MMSE as a screening tool for cognitive impairment in PD, since PD patients with cognitive impairment could score high on this test [31]. In this regard, other specific screening tools have been developed, such as Mini-Mental Parkinson [32], Parkinson’s Disease Cognitive Rating Scale (PD-CRS) [33], Parkinson’s Disease Dementia-Short Screen [34], and Scales for Outcomes in Parkinson’s Disease-Cognition (SCOPA-COG) [35]. PD-CRS and SCOPA-COG have shown good reliability and validity properties, but these tests require longer periods of time (approximately 20 min) and are influenced by education level and age [36].
MS is the most common cause of non-traumatic neurological disability in young adults [37] and cognitive decline is a common symptom in MS with a prevalence of 50–60% [38]. Slowed cognitive processing speed and episodic memory deficits are the most prominent cognitive deficits, sometimes in combination with executive functioning, verbal fluency, and visuospatial processing impairments [39–42]. Currently, the Symbol Digit Modalities Test (SDMT) is the most frequently used and recommended screening test in MS [43–45], although it is especially focused on processing speed assessment. This is a potential limitation of SDMT for capturing the cognitive complexities of MS. Also, normative data for SDMT considering different populations have shown an important influence of age, education, and ethno-racial group [46]. Another test, the Paced Auditory Serial Addition Test, has been associated with math abilities and psychological stress that results in high rejection rates [47, 48]. To our knowledge, no previous studies of RUDAS validation in PD or MS exist.
RUDAS was developed considering a multicultural approach [19]. The main advantages of this test are the following: 1) the average time of administration is only 10 min, 2) it has been translated into at least 30 languages with no significant modifications, 3) it assesses multiple cognitive domains, including executive functioning, 4) it involves different response modalities (verbal, non-verbal, written, and praxis), allowing a more comprehensive assessment, and 5) it shows low influence of the most important biases in neuropsychological assessment (e.g., age, education level, preferred language, sex) [19, 22].
We hypothesized that the RUDAS could be a valid cognitive screening test in several clinical conditions. This could be very useful in clinical practice, especially in non-specialized settings, where the availability of cognitive tools and time is generally limited. Thus, we aimed to validate the RUDAS in AD (in MCI and mild dementia stages), PD with MCI, and MS.
MATERIALS AND METHODS
Participants
Two hundred and seventy participants were enrolled from the Department of Neurology of our center. Sixty participants with AD (50% Clinical Dementia Rating (CDR) = 0.5 and 50% CDR = 1.0), 30 participants with PD with mild cognitive impairment (PD-MCI), 60 participants with MS (50% cognitively impaired, MS-CI and 50% non-cognitively impaired, MS-nCI), and 120 healthy controls (HC) were recruited for this study. All participants were Spaniards, and Spanish was their mother tongue. Table 1 shows the main demographic and clinical characteristics of all participants. Because each disorder has different demographic characteristics, a different control group was enrolled for each disease group (AD, PD, and MS), with no differences in sex, age, or years of education (Supplementary Material 1). Inclusion and exclusion criteria are summarized in Supplementary Material 2.
Main demographic and clinical characteristics
Data are shown as mean (standard deviation) or percentage. N, number of participants; MMSE, Mini-Mental State Examination test; GDS, Geriatric Depression Scale; FAQ, Functional Activities Questionnaire; IDDD, Interview for deterioration in daily living activities in dementia; SDMT, Symbol Digit Modalities test; EDSS, Expanded Disability Status Scale; FSS, Fatigue Severity Scale; MFIS, Modified Fatigue Impact Scale; BDI, Beck’s Depression Inventory; SCOPA-COG, Scales for Outcomes in Parkinson’s Disease-Cognition; AD, Alzheimer’s disease group; HC, healthy control group; MS, multiple sclerosis group; (n)CI, (non)cognitively impaired; PD-MCI, Parkinson’s disease with mild cognitive impairment.
Neuropsychological assessment
Neuropsychological assessment was performed by a trained neuropsychologist. All clinical groups completed the neuropsychological battery Neuronorma (NN) with normative data in our country [49]. As standard cognitive screening tests, we considered MMSE, SCOPA-COG, and SDMT, for AD, PD-MCI, and MS participants, respectively. The Geriatric Depression Scale [50] was completed by the AD and PD-MCI groups and their HC groups. Furthermore, CDR and Functional Activities Questionnaire (FAQ) [51] were also administered for people with AD; Hoehn and Yahr scale [52] for PD-MCI; and Expanded Disability Status Scale [53], Fatigue Severity Scale [54], Modified Fatigue Impact Scale [55], and Beck’s Depression Inventory [56] for MS. The HC group was evaluated with MMSE, and also with CDR and FAQ to verify the inclusion criteria.
All participants completed the RUDAS. The main characteristics of RUDAS are described in Supplementary Material 3.
Procedure
The study was conducted with the approval of our hospital’s Ethics Committee (code 19/126-E), and all participants gave written informed consent.
The neuropsychological assessment took approximately three hours and was conducted in two sessions during the same week, in which cognitive and functional scales were administered. NN scores confirmed the diagnosis of MCI–multiple domains for PD-MCI participants (i.e., at least two impaired tests in one cognitive domain or one impaired test in two different cognitive domains) [57]. We used age- and education-adjusted scaled scores≤5 to define impairment on each test according to the recommendations of the NN project [49]. Finally, the classification between MS with cognitive impairment (MS-CI) and MS without cognitive impairment (MS-nCI) was based on a previous validation study using the NN battery in our setting [40]. The neurological diagnosis and cognitive classification were performed independently and blind to RUDAS, which was not used for diagnosis.
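The impairment rule described above (scaled score ≤ 5 on a test; at least two impaired tests in one domain, or one impaired test in each of two domains) can be sketched as a small function. This is only an illustration under stated assumptions: the function names, the domain labels, and the example scores are hypothetical and are not taken from the NN battery or the study data.

```python
from typing import Dict, List

IMPAIRMENT_THRESHOLD = 5  # scaled score <= 5 counts as impaired (from the text)


def impaired_tests_per_domain(scaled_scores: Dict[str, List[int]]) -> Dict[str, int]:
    """Count the impaired tests (scaled score <= 5) in each cognitive domain."""
    return {
        domain: sum(score <= IMPAIRMENT_THRESHOLD for score in scores)
        for domain, scores in scaled_scores.items()
    }


def meets_impairment_rule(scaled_scores: Dict[str, List[int]]) -> bool:
    """True if two tests are impaired in one domain, or one test in two domains."""
    counts = impaired_tests_per_domain(scaled_scores)
    two_in_one_domain = any(n >= 2 for n in counts.values())
    one_in_two_domains = sum(n >= 1 for n in counts.values()) >= 2
    return two_in_one_domain or one_in_two_domains


# Hypothetical participant: domain names and scores are illustrative only.
example = {
    "memory": [4, 8],       # one impaired test
    "attention": [9, 10],   # none impaired
    "executive": [5, 7],    # one impaired test
}
print(meets_impairment_rule(example))
```

In this hypothetical case the rule is met through the second branch, because one test is impaired in each of two different domains.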
Statistical analysis
Statistical analysis was performed using SPSS Statistics 22.0, JASP 0.16.3.0, Jamovi 2.2.5, and Python 3.8.12. A p-value<0.05 was considered statistically significant. Bonferroni correction was applied for multiple comparisons. For the study of normality, the Shapiro-Wilk test was calculated, and Q-Q plots were examined.
Pearson’s chi-squared test was calculated for intergroup comparisons with categorical variables, while Kruskal-Wallis and post-hoc tests were calculated for intergroup comparisons of quantitative variables (more than two groups). For effect size estimation, we used eta squared and it was regarded as small (η2 = 0.01), medium (η2 = 0.06), and large (η2 = 0.14). For intergroup comparisons between two groups (e.g., PD-MCI versus HC), Mann-Whitney’s U test was calculated, and Cohen’s d was used as a measure of effect size, regarded as small (d = 0.20), medium (d = 0.50), and large (d = 0.80). Spearman’s rho correlation was calculated to study the relationship between quantitative variables, and correlations were regarded as very low (0–0.29), low (0.30–0.49), moderate (0.50–0.69), high (0.70–0.89), or very high (> 0.89). ROC curves were estimated and DeLong’s test was performed to compare two AUCs [58]. We defined the best cutoff points on total score, according to Youden’s index (always > 0.40), in those cases where the area under the curve was statistically significant and > 0.70.
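As an illustration of the cutoff selection described above, the following minimal sketch scans candidate cutoffs and keeps the one that maximizes Youden’s index (sensitivity + specificity − 1). It assumes that a score at or below the cutoff counts as a positive screen, which is a convention chosen here for illustration and is not stated in the text. The function names and the scores are made up and are not study data.

```python
from typing import List, Tuple


def sens_spec(cases: List[int], controls: List[int], cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when a score <= cutoff is a positive screen."""
    true_pos = sum(score <= cutoff for score in cases)
    true_neg = sum(score > cutoff for score in controls)
    return true_pos / len(cases), true_neg / len(controls)


def best_cutoff(cases: List[int], controls: List[int]) -> Tuple[int, float]:
    """Return the cutoff with the highest Youden's index J = Se + Sp - 1."""
    candidates = sorted(set(cases) | set(controls))
    best, best_j = candidates[0], -1.0
    for cutoff in candidates:
        se, sp = sens_spec(cases, controls, cutoff)
        j = se + sp - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best, best_j = cutoff, j
    return best, best_j


# Made-up total scores (0-30), for illustration only; not study data.
cases = [20, 22, 23, 24, 25, 27]
controls = [26, 27, 28, 28, 29, 30]
cutoff, j = best_cutoff(cases, controls)
print(cutoff, round(j, 3))
```

A complete analysis would also estimate the AUC and its confidence interval before choosing a cutoff, as the text requires a significant AUC above 0.70.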
Scores on MMSE, SCOPA-COG, and SDMT were considered to determine concurrent validity using Spearman’s rho correlation coefficient.
Machine learning analysis
We implemented a supervised classification algorithm, Random Forest, using the scikit-learn library [59] v.0.24.2 in Python v.3.8.12. We applied the RandomForestClassifier algorithm to evaluate the ability to correctly classify diagnosis based on seven variables corresponding to the seven cognitive subscores of the RUDAS test. According to the diagnostic groups defined above, we proposed six classification tasks, all of them classifying diagnosis (positive label) against their respective controls (negative label). The classification tasks proposed were: (i) AD versus HC, (ii) AD CDR = 1.0 versus HC, (iii) AD CDR = 0.5 versus HC, (iv) MS versus HC, (v) MS-CI versus HC, and (vi) PD-MCI versus HC. We randomly split each of the six datasets used for those tasks into training (80%) and test (20%) sets. The split was stratified, meaning that the train-test split for each task preserves the distribution of each diagnostic label in the original dataset. Supplementary Material 4 shows the number of cases and controls in the original datasets and their corresponding train and test sets. We performed a 5-Fold Cross-Validation Grid Search on the training sets to determine the best hyperparameters for each model. Then, we obtained the accuracy, precision, recall, and F1-score values over the test sets of the best models for each of the six cases. Finally, we used Random Forest models to rank the importance of the RUDAS test variables for each of the six classification tasks.
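The workflow described above (stratified 80/20 split, 5-fold cross-validated grid search on the training set, evaluation on the test set, feature ranking) could look roughly like the following scikit-learn sketch. It is not the authors’ code: the data are synthetic, the feature names are hypothetical placeholders, the hyperparameter grid is an assumption (the tuned values are reported in Supplementary Material 6), and the grid search uses scikit-learn’s default accuracy scoring because the selection metric is not stated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 60 "cases" and 60 "controls" with seven scores each.
# Feature names and distributions are hypothetical, not the study data.
rng = np.random.default_rng(0)
feature_names = ["f1", "f2", "f3", "f4", "f5", "f6", "f7"]
controls = rng.normal(loc=1.0, scale=1.0, size=(60, 7))
cases = rng.normal(loc=0.0, scale=1.0, size=(60, 7))
X = np.vstack([cases, controls])
y = np.array([1] * 60 + [0] * 60)  # 1 = diagnosis (positive), 0 = HC (negative)

# Stratified 80/20 split keeps the case/control ratio in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 5-fold cross-validated grid search on the training set only.
# This grid is an assumption made for the sketch.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 3]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out test set.
y_pred = search.best_estimator_.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0
)

# Rank features by mean impurity decrease across the trees of the forest.
importances = search.best_estimator_.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda item: item[1], reverse=True)
print(accuracy, precision, recall, f1)
print(ranking)
```

Keeping the grid search inside the training set means the test set is touched only once, for the final metrics.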
RESULTS
Alzheimer’s disease
Intergroup comparisons between CDR 0.5, CDR 1, and HC
We found a group effect for the total score, copy of cube, memory, and fluency. Post-hoc comparisons showed differences between the three groups in total score and memory. There were no differences on the copy of cube task between CDR = 0.5 and CDR = 1.0 or between HC and CDR = 0.5 on fluency. Effect sizes were large for those variables with intergroup differences. Data are shown in Table 2. Total scores of all groups are represented in Fig. 1.
Intergroup comparisons in RUDAS for AD, PD-MCI, and MS
Data are shown as mean (standard deviation). Significant differences according to post-hoc test: aHC versus CDR = 0.5. bHC versus CDR = 1.0. cCDR = 0.5 versus CDR = 1.0. dHC versus MS-CI. eMS-CI versus MS-nCI.

Violin plots showing RUDAS (total score) for all clinical groups and their healthy control groups. HC-AD, healthy control group for AD; HC-PD, healthy control group for PD-MCI; HC-MS, healthy control group for MS.
ROC analysis for the discrimination between HC versus CDR = 0.5 and HC versus CDR = 1.0
In the case of CDR = 0.5, the area under the curve (AUC) was 0.904 (p < 0.001, CI: 0.831–0.978) for total score and 0.771 (p < 0.001, CI: 0.652–0.889) for memory. Accordingly, the best cutoff point on total score was 26 (sensitivity = 83.33%, specificity = 80.0%, Youden’s index = 0.633).
In the case of CDR = 1.0, AUCs were 0.989 (p < 0.001, CI: 0.970–1.0) for total score and 0.923 (p < 0.001, CI: 0.858–0.988) for memory. According to the ROC analysis, the best cutoff point for the total score was 24 (sensitivity = 100%, specificity = 93.33%, Youden’s index = 0.933). All ROC curves are represented in Fig. 2. All cutoff points with significant AUCs are reported in Supplementary Material 5.
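As a quick check, the reported Youden’s indices follow from the reported sensitivity and specificity through the standard definition. The symbols J, Se, and Sp are notation introduced here for illustration.

```latex
J = \mathrm{Se} + \mathrm{Sp} - 1

% CDR = 0.5, cutoff 26
J = 0.8333 + 0.8000 - 1 = 0.633

% CDR = 1.0, cutoff 24
J = 1.0000 + 0.9333 - 1 = 0.933
```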

ROC curves for RUDAS (total score) for all clinical groups and their healthy control groups.
RUDAS and MMSE
The correlation between MMSE and RUDAS was moderate (r = 0.603, p < 0.001). For the discrimination between CDR = 0.5 and HC groups, AUC was 0.891 (p < 0.001, CI: 0.809–0.978) for MMSE. According to DeLong’s test, there were no statistically significant differences between the two AUCs (Z = –0.35, p = 0.726). In the case of CDR = 1.0 and HC groups, AUC was 0.993 (p < 0.001, CI: 0.977–1.0) for MMSE, also with no statistically significant differences compared with RUDAS total score (Z = 0.34, p = 0.734).
Parkinson’s disease with mild cognitive impairment: PD-MCI
Intergroup comparisons
Compared with HC, PD-MCI performance was lower in total score and memory with large effect sizes (Table 2).
ROC analysis for the discrimination between HC versus PD-MCI
AUCs were 0.832 (p < 0.001, CI: 0.731–0.933) for total score and 0.834 (p < 0.001, CI: 0.729–0.940) for memory. The best cutoff point on total score was 25 (sensitivity = 93.33%, specificity = 53.33%, Youden’s index = 0.467).
RUDAS and SCOPA-COG
Correlation between scores on SCOPA-COG and RUDAS was moderate (r = 0.671, p < 0.001).
Multiple sclerosis: MS-CI and MS-nCI
Intergroup comparisons
Kruskal-Wallis’s test showed an intergroup effect for the total score, praxis, judgement, and memory, with large effect sizes. Post-hoc tests showed differences between MS-CI versus HC and MS-CI versus MS-nCI. Data are shown in Table 2.
ROC analysis for the discrimination between HC versus MS-nCI and HC versus MS-CI
Regarding MS-nCI and HC, ROC analysis showed a statistically significant AUC for total score: 0.652 (p = 0.019, CI: 0.538–0.766). In the case of MS-CI and HC, the AUC for total score was higher, 0.912 (p < 0.001, CI: 0.848–0.976), and the AUC for memory was 0.917 (p < 0.001, CI: 0.849–0.985). The best cutoff point on total score was 27 (sensitivity = 86.67%, specificity = 86.67%, Youden’s index = 0.733).
RUDAS and SDMT
Correlation between SDMT and RUDAS was high (r = 0.715, p < 0.001). For the discrimination between MS-CI and MS-nCI, the AUC was 0.875 (p < 0.001, CI: 0.785–0.965) for SDMT and 0.970 (p < 0.001, CI: 0.926–1.0) for RUDAS total score; the AUC for RUDAS was significantly higher according to DeLong’s test (Z = –2.14, p = 0.032).
Machine learning classification
We used Random Forest to predict six different diagnoses using the RUDAS test scores as input variables to evaluate this test’s classification capacity. Table 3 shows the six classification tasks proposed and their best Random Forest model performance over the test set in terms of several classification metrics (accuracy, precision, recall, and F1-score). Supplementary Material 6 summarizes each Random Forest model’s tuned hyperparameters and specifications. Regarding the F1-score results obtained, the best classification model was the one that detects PD-MCI subjects against HC (0.9161). The other classification tasks also achieved good F1-scores (Table 3). The model that obtained the lowest F1-score was the prediction of MS against HC (0.6080). There were no differences in classification between different levels of functional impairment in AD: both AD CDR = 1.0 and AD CDR = 0.5 obtained an F1-score of 0.8333 (Table 3). Figure 3 shows the feature importances of each Random Forest model built for the six classification tasks.
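For readers less familiar with these metrics, the sketch below shows how accuracy, precision, recall, and F1-score are derived from confusion-matrix counts. The function name and the counts are hypothetical; the study’s confusion matrices are not given in this section, so the example should not be read as reproducing any entry of Table 3.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set counts (cases = positive label, HC = negative label).
metrics = classification_metrics(tp=5, fp=1, fn=1, tn=5)
print(metrics)
```

Because F1 combines precision and recall, it penalizes a model that reaches high accuracy by favoring only one of the two labels.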
Performance values of Random Forest models over the test sets for six classification tasks
*F1-Scores higher than 0.7.

Feature importances of the RUDAS test variables (y-axis) obtained in Random Forest models for each classification task (x-axis). Purple: AD; green: MS; blue: PD classification tasks. Feature importances represent the average impurity decrease of each feature for all decision trees in each Random Forest model; higher scores indicate more important features or variables in the decision-making process.
DISCUSSION
This study aimed to evaluate the diagnostic properties of the RUDAS in three of the most common disorders causing cognitive impairment: AD, PD, and MS. The use of a single screening tool for different neurological diseases has important advantages in clinical practice. Remarkably, we found favorable diagnostic properties of the RUDAS in the three disorders evaluated, compared with standard cognitive screening tests.
In all clinical groups of AD, the total score of the RUDAS test showed high areas under the curve, in line with the previous literature on AD [17, 20]. AUC was higher for CDR = 1.0 than CDR = 0.5 compared with HC, consistent with the more evident cognitive impairment in later stages of the disease. However, the AUC in the MCI due to AD group was 0.904, which suggests that this test is also valid for diagnosing the early stages of AD. This is consistent with a recent study in a Thai population that showed similar AUC between RUDAS and MoCA for detecting MCI [23]. In our sample, no statistically significant differences were found between the AUCs of RUDAS and MMSE for CDR = 0.5 and CDR = 1.0.
Regarding PD-MCI, we observed a high AUC for RUDAS (AUC = 0.832) in PD-MCI compared to HC. Other screening tools such as MMSE or MoCA showed a similar capacity for cognitive impairment detection in PD versus HC [60]. In addition, the RUDAS test showed a moderate correlation with SCOPA-COG, a test developed for assessing cognitive deficits specific to PD [36]. These results suggest that the RUDAS is a valid and useful cognitive screening test for PD-MCI patients. Moreover, the RUDAS test may help fill the gap in cognitive screening tools for PD, by offering a screening test with low cultural influence [36].
We also found favorable properties for diagnosing cognitive impairment in MS, and a higher AUC for RUDAS than for SDMT in MS. Other screening tools such as the BICAMS provide a more comprehensive and focused assessment of cognitive function in MS but are generally limited to MS units. Our findings suggest that the RUDAS could have a role in less specialized settings.
The cutoff suggested in our study for dementia due to AD was 24, which is similar to previous studies in Australia [19, 61] and Malaysia [62]. We also suggest, for the first time, cutoff points for MCI due to AD, PD-MCI, and MS with cognitive impairment. However, although the RUDAS seems to be less influenced by age, language, or education than other cognitive tests, these cutoffs should be validated in independent samples.
Furthermore, RUDAS showed moderate to high correlations with MMSE, SCOPA-COG, and SDMT in AD, PD-MCI, and MS, respectively. This supports the concurrent validity of the test in these settings, in which those tests are regarded as standards to evaluate cognitive dysfunction. Correlations and diagnostic capacity were adequate in the three samples (AD, PD, and MS), even considering the demographic differences between them (i.e., in our sample AD patients were older and showed low levels of education, while MS patients were younger and had high levels of education). Overall, this suggests that the RUDAS performs well in patients with high and low levels of education, as has been found in previous studies [61]. However, specific studies directly comparing high- and low-educated patients with the same disorder and age range are necessary to confirm this finding.
Another interesting result of our study is the application of a machine learning algorithm to the test. Machine learning is increasingly used in cognitive assessment to optimize the diagnosis using neuropsychological tests alone or in combination with other clinical data, biomarkers, or neuroimaging [63, 64]. In our study, random forest algorithms supported the diagnostic capacity of the test in the different scenarios and allowed us to select the most important subtests. In this regard, it is worth mentioning that the total and memory scores were the most important variables in all the diagnoses. However, there were some differences in other subtests. For instance, judgement followed by cubes were more important in PD, while cubes and language were more important in AD. Although the disorders we studied are not generally part of each other’s differential diagnosis, these findings suggest that a careful assessment of RUDAS subtests could help characterize the cognitive profile of each disorder, as has been observed in other screening tests [65]. This opens the door to evaluating the use of RUDAS in the differential diagnosis of other neurodegenerative disorders with overlapping clinical presentations with AD, such as frontotemporal dementia or dementia with Lewy bodies, among others.
The present study has some limitations. First, diagnoses of AD had no pathological confirmation. However, all patients met the current diagnostic criteria supported by biomarkers and were followed up for at least six months, confirming the diagnosis. Second, participants were recruited from a specialized department. Although our department receives patients directly from primary care [66], it would be interesting to study RUDAS in less specialized settings [18]. Third, the comparison of AUCs between RUDAS and SCOPA-COG was not reported in the PD-MCI results, due to the absence of SCOPA-COG in the assessment protocol of HC. Finally, although the sample sizes were adequate for statistical analyses, the study findings should be taken with caution when applying them to broader AD, PD-MCI, and MS populations, and the study was not powered to evaluate the superiority of RUDAS against other tests.
In conclusion, the present study supports the validity of the RUDAS as a multi-disease screening tool. We found adequate diagnostic properties for AD (in both MCI and mild dementia stages) and for detecting cognitive impairment in PD-MCI and MS. Future studies aiming to directly compare RUDAS with other screening tools and evaluate the role of RUDAS in the differential diagnosis of neurodegenerative disorders (different causes of dementia and atypical parkinsonisms) may be of interest.
ACKNOWLEDGMENTS
This work is supported by the Instituto de Salud Carlos III (PI19/01260). Jordi A. Matias-Guiu is supported by Instituto de Salud Carlos III through the project INT20/00079 (co-funded by European Regional Development Fund “A way to make Europe”). María Valles-Salgado is supported by Instituto de Salud Carlos III through a predoctoral contract PFIS (FI20/000145) (co-funded by European Regional Development Fund “A way to make Europe”).
