Abstract
BACKGROUND:
Computerized neuropsychological tests offer clinicians advantages in cost, administration, and time. However, studies have reported performance differences between the manual and computerized versions of some neuropsychological tests. One of these is the Wisconsin Card Sorting Test (WCST). Because of this performance difference, normative data from the manual test cannot be used for its computerized version. Therefore, normative data studies are needed for computerized versions.
OBJECTIVE:
This study aimed to determine the norm values of the computerized version of the WCST (WCST-CV) in a healthy sample.
METHODS:
A total of 422 healthy adults aged 18–78 participated in this study. WCST-CV sub-scores were modeled by regression analysis based on age and education level to generate normative data. Among the 13 WCST scores, the regression models for WCST 2, WCST 3, WCST 4, WCST 10, and WCST 11 were significant. WCST 2, WCST 4, and WCST 11 scores were estimated with Ordinary Least Squares (OLS), whereas WCST 3 and WCST 10 scores were estimated with Weighted Least Squares (WLS) because the homoscedasticity assumption was violated.
RESULTS:
The regression results show that p-values calculated from error increase as age and education level increase.
CONCLUSION:
As a result of our research, norm values for ages 18–78 were produced using regression analysis. Gender was not significant for any sub-score. Therefore, of the socio-demographic variables, only age and education level were included in the model.
Introduction
Neuropsychological tests have long been used to investigate cognitive profiles in healthy individuals and in those with neuropsychiatric disorders. Revealing these profiles gives information about the “normal-healthy” cognitive structure and helps us diagnose and follow the rehabilitation process for different mental and neurological disorders (Kessels & Hendriks, 2016; Nelson & Searl, 2002; Shan et al., 2008).
The Wisconsin Card Sorting Test (WCST) has been one of the most frequently used tests in evaluating frontal functions. To succeed in the WCST, an individual must employ many cognitive abilities concurrently, including attention, set shifting, perseveration, cognitive flexibility, inhibition of impulsive responses, problem-solving, and planning (Heaton et al., 1993). Hence, the WCST exhibits a considerably more intricate structure than other neuropsychological tests. When a deficit in cognitive abilities results from a mental or neurological disorder, there is a substantial decline in test performance compared to healthy controls (Gillan et al., 2020; Itzhaky et al., 2023; Petkus et al., 2020; Tercero et al., 2021; Venezia et al., 2018).
Psychometrically, different methods exist to compare individuals with their peers regarding specific cognitive abilities. Percentiles (Atalay & Cinan, 2007; Faustino et al., 2022) and Z and T transform scales created from Classical Test Theory are among the most common methods (Crawford, 2003; Guàrdia-Olmos et al., 2015). However, the most significant limitation of these methods is the lack of a gold standard for classifying socio-demographic characteristics, especially education level and age, which are known to affect cognitive performance (Miranda et al., 2019; Rammal et al., 2019). WCST performance changes across age groups due to age-related cognitive decline and individual differences such as education level and gender (Rhodes, 2004; Rammal et al., 2019). These variations highlight the need for age-appropriate norms and for considering multiple factors when interpreting WCST results across different age groups. In this respect, the most common practice when creating norm values is to categorize the sample arbitrarily according to age and education level and extract norm tables (Karakaş et al., 1999; Salthouse, 2001; Salthouse et al., 2003). However, this method can lead to wrong decisions about an individual’s cognitive performance, because it does not reveal to what extent age and education level affect test performance. Another limitation is that age does not affect cognitive performance to the same degree in every age range. In other words, while test performance may decline with a particular slope up to a point, it can drop dramatically after a specific age. This is especially evident in neurodegenerative disorders, as cognitive performance declines more steeply after 65 years of age. To address these limitations, this study explores the application of regression models in cognitive assessment, drawing on recent advances in the field (Arango-Lasprilla et al., 2015; Guàrdia-Olmos et al., 2015).
Using regression models also helps eliminate floor and ceiling effects often encountered in norm tables (Hartshorne & Germine, 2016).
Computerized WCST
Developing computer versions of psychological tests provides clinicians with many advantages. These advantages include i) saving time, ii) minimizing errors, and iii) being cheaper in the long term. However, studies show that the computer and manual versions of the WCST do not exhibit similar psychometric properties in either psychiatric or healthy samples (Çelik et al., 2021). Therefore, it is necessary to create norm tables for computerized tests, as has been done for the manual versions. This study aimed to generate predictive normative data for healthy individuals aged 18–78 using regression models.
Methods
Participants
Data were gathered from a sample of 478 adult participants. The data of 47 individuals were omitted because their scores on the Beck Depression and Anxiety Scales exceeded the predefined cut-off scores. Furthermore, 9 participants were excluded because they discontinued the assessment scales and testing procedures during data collection. Consequently, analyses were performed on the dataset of the remaining 422 healthy adults aged 18–78. The average age of the participants was 44.05±12.04 years. 59.5% (n = 251) of the participants were male, and the rest were female. Years of education were recorded as a continuous variable, and the average was 11.08±4.34. Participants who reported any neurological or psychiatric disease, or in whom the psychiatrist noticed one, were not included in the study. In addition, participants with long-term alcohol or substance use and those who had taken drugs that may affect cognitive processes in the last month were excluded from the study. Written informed consent was obtained from all participants who voluntarily participated in the study. Ethics committee approval of the study was received from the Bartın University Social and Human Sciences Ethics Committee (Approval number: 2021-SBB-0489; Approval date: 15.12.2021).
Measures and procedures
A structured survey instrument was employed to systematically collect socio-demographic data from study participants, encompassing critical variables including age, educational level, and gender. This comprehensive data-gathering instrument consisted of inquiries aimed at eliciting pertinent socio-demographic details, preferred handedness, the presence of psychiatric and neurological disorders, and an exhaustive medical history.
The Beck Depression and Beck Anxiety Scales were used for all participants to determine their current depression and anxiety levels. Those with scores above 9 and 8 on these scales, respectively, were excluded from the research (Hisli, 1988; Ulusoy et al., 1998). Similarly, the Montreal Cognitive Assessment Test (MOCA) was used to screen cognitive skills and show that especially older individuals are not at cognitive risk. Participants who scored less than 21 points on the MOCA were also excluded due to poor cognitive skills (Ozdilek & Kenangil, 2014). Lastly, WCST-CV was administered to participants who received the required scores from all scales and tests.
Wisconsin Card Sorting Test-Computerized Version (WCST-CV)
This study used the computer version of WCST developed by Çelik et al. (2021). WCST-CV allows the participant to read the instructions and complete the computerized test. Therefore, the participant must have basic computer skills (moving the cursor on the screen, etc.). In its manual application, the tester’s “True, False” feedback was arranged in WCST-CV with both auditory and visual feedback. When the participant completes the test, it calculates their performance according to the scoring directive used in Karakaş et al.’s (1999) study. Thirteen different scores were obtained through WCST administrations. These were total number of responses (WCST1), total number of errors (WCST2), total number of correct responses (WCST3), number of categories completed (WCST4), total number of perseverative responses (WCST5), total number of perseverative errors (WCST6), total number of nonperseverative errors (WCST7), percentage of perseverative responses (WCST8), number of responses used to complete the first category (WCST9), number of conceptual level responses (WCST10), percentage of conceptual level responses (WCST11), failure to maintain set (WCST12), and score of learning to learn (WCST13).
Statistical analysis
All analyses were performed in R with the olsrr, lmtest, and Metrics libraries.
Regression models and estimators
Regression analysis can be used to model the relationship between the WCST sub-scores as dependent variables and age and education level as independent variables in order to generate normative data (Rammal et al., 2019). The OLS estimator is optimal for linear regression analysis under specific assumptions; when these assumptions are violated, it may lead to misleading results (Yildirim and Kantar, 2014; Yenilmez et al., 2018). These assumptions include normally distributed error terms, no multicollinearity between the independent variables, and homoscedasticity of the error variance (Vicente et al., 2021). The Kolmogorov-Smirnov test is used to test the normality of the error term (the p-value must be greater than the significance level, e.g., 0.05), Variance Inflation Factors (VIF) are used to assess multicollinearity (the value must be lower than 10) (Rivera et al., 2021), and the Breusch-Pagan test is used to test for heteroscedasticity (the p-value must be greater than the significance level, e.g., 0.05). When these assumptions are violated, alternative regression methods must be used, such as robust regression for violation of normality (Yildirim and Kantar, 2014, 2020) and Weighted Least Squares (WLS) for heteroscedasticity (Kantar, 2016). In the regression analysis, age is treated as a continuous variable and education level as a dummy variable, similar to Rammal et al. (2019) (12 years and below is coded 0, and over 12 years is coded 1).
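The VIF criterion described above can be illustrated with a minimal Python sketch (the analyses in this study were run in R; this is only an illustration, not the authors' code). It relies on the fact that, in a model with exactly two predictors, both predictors share the same VIF, equal to 1/(1 − r²), where r is their Pearson correlation. The function names and the data are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors of a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical illustrative data (NOT from the study).
age = [20, 30, 40, 50, 60, 70]
education = [1, 1, 0, 1, 0, 0]  # dummy: 1 = over 12 years, 0 = 12 years and below

vif = vif_two_predictors(age, education)
no_multicollinearity = vif < 10  # threshold quoted in the text
print(f"VIF = {vif:.3f}, below 10: {no_multicollinearity}")
```

With more than two predictors, each VIF would instead come from regressing that predictor on all the others, so this shortcut applies only to the age plus education-dummy setting used here.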
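The linear regression model is defined as
$$\text{WCST Score}_k = \beta_0 + \beta_1 \cdot \text{Age} + \beta_2 \cdot \text{Education Level} + \varepsilon \tag{1}$$
where β0, β1, and β2 are regression coefficients, ɛ is the error term, and k indexes the WCST sub-scores, with k = 2, 4, 11 for OLS.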
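The weighted regression model is defined as
$$\text{WCST Score}_k = \beta_0 + \beta_1 \cdot \text{Age} + \beta_2 \cdot \text{Education Level} + \varepsilon \tag{2}$$
where β0, β1, and β2 are regression coefficients estimated by minimizing the weighted sum of squared errors $\sum_i \text{Weight}_i \, \varepsilon_i^2$, ɛ is the error term, and k indexes the WCST sub-scores, with k = 3, 10 for WLS. The weights should be determined according to the form of the heteroscedasticity. In this study, the weights are Weight = 1/Age for k = 3 and Weight = 1/Fitted WCST Score for k = 10.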
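To make the two estimators concrete, the following minimal Python sketch solves the (weighted) normal equations for an intercept, age, and the education dummy. It is an illustration under stated assumptions, not the authors' R code: the helper names `solve_3x3` and `fit_wls` and the noise-free data are hypothetical. Passing no weights gives OLS; passing Weight = 1/Age or Weight = 1/Fitted Score mirrors the weighting schemes named in the text.

```python
def solve_3x3(a, b):
    """Solve the 3x3 system a @ x = b by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_wls(age, education, score, weight=None):
    """Return [b0, b1, b2] minimizing sum(w * (score - b0 - b1*age - b2*edu)^2).

    With weight=None every observation has weight 1, i.e. ordinary least squares.
    """
    if weight is None:
        weight = [1.0] * len(score)
    rows = [(1.0, a, e) for a, e in zip(age, education)]
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weight)) for j in range(3)]
            for i in range(3)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weight, score)) for i in range(3)]
    return solve_3x3(xtwx, xtwy)


# Hypothetical noise-free data (NOT from the study): score = 80 - 0.5*age + 10*edu.
age = [20, 30, 40, 50, 60, 70]
education = [1, 0, 1, 0, 1, 0]
score = [80.0 - 0.5 * a + 10.0 * e for a, e in zip(age, education)]

ols_coefs = fit_wls(age, education, score)                                # OLS
wls_age_coefs = fit_wls(age, education, score, [1.0 / a for a in age])   # Weight = 1/Age

# Weight = 1/Fitted Score needs a first-pass OLS fit to obtain the fitted values.
fitted = [ols_coefs[0] + ols_coefs[1] * a + ols_coefs[2] * e
          for a, e in zip(age, education)]
wls_fitted_coefs = fit_wls(age, education, score, [1.0 / f for f in fitted])
```

Because the illustrative data contain no noise, all three fits recover the same coefficients; with real, heteroscedastic data the weighted fits would generally differ from OLS.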
Regression coefficients and R2 values are given in Table 1. Age and education coefficients are significant at the 1% level for WCST 1, 2, 3, 4, 7, 10, and 11. There is a negative relationship between age and WCST 3, 4, 10, and 11, as well as between education level and WCST 1, 2, and 7. While establishing the regression models, gender was added to the model, but since its effect on WCST scores was found to be insignificant, it was removed. This indicates that gender had no significant effect on WCST scores. However, only the WCST 2, 3, 4, 7, 10, and 11 models give meaningful results when R2 values are considered. Age and education variables were insufficient to explain the scores of WCST 5, 6, 8, 9, 12, and 13. SDe represents the standard deviation of the error terms.
Regression coefficients and evaluation metrics
Signif. codes: ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1.
Test statistics for the OLS assumptions are given in Table 2. As seen in Table 2, all VIF values are less than 10, so there is no multicollinearity problem between the independent variables. However, there are non-normality or heteroscedasticity problems for the WCST 1, 5, 6, 7, 8, 9, 12, and 13 models. Although the OLS assumptions are not satisfied for WCST 3 and WCST 10, this issue was overcome using WLS. In this study, weights are determined as Weight = 1/Age for WCST 3 and Weight = 1/Fitted WCST Score for WCST 10. Additionally, WCST 7 does not satisfy the OLS assumptions, and a suitable weight for WLS could not be identified. Among the 13 WCST models, the OLS assumptions are satisfied by WCST 2, 4, and 11.
Assumption of OLS
Normative data are used to determine the percentile corresponding to the residual between the predicted and observed values. The predicted WCST score is calculated by regression equation (1) for OLS and equation (2) for WLS, with the coefficients presented in Table 1. The procedure is illustrated with the WCST 11 score. The predicted WCST 11 score is calculated as WCST Score11 = 76.4240 – 0.7427*Age + 10.9989*Education Level, where Education Level is a binary variable coded 0 for 12 years and below and 1 for over 12 years. For instance, suppose a 25-year-old person with 16 years of education obtains a score of 26 on WCST 11. The predicted WCST score is 76.4240 – 0.7427*25 + 10.9989*1 = 68.8554. The residual is the difference between the observed and predicted values, here 26 – 68.8554 = –42.8554. The standardized residual is calculated as residual/SDe, here –42.8554 / 18.5622 = –2.30875, where the SDe values are given in Table 1 for each WCST score. This standardized residual is also known as the z-score, and the corresponding percentile can be read from any z table. The percentile in our example is 1%. In other words, a 25-year-old participant with 16 years of education is expected to have a score of 68.8554, but this participant’s observed score is 26. According to the regression model, this participant’s score is higher than that of only 1% of his/her peers.
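The WCST 11 calculation above can be reproduced with a short Python sketch using only the standard library. The coefficients and SDe are the values quoted in the text; the function name `wcst11_percentile` is hypothetical, and the standard normal CDF from `statistics.NormalDist` stands in for a printed z table. This is an illustration, not the calculator provided in the appendix.

```python
from statistics import NormalDist

# Values quoted in the text for the WCST 11 model.
B0, B_AGE, B_EDU = 76.4240, -0.7427, 10.9989
SD_E = 18.5622  # standard deviation of the error terms for WCST 11


def wcst11_percentile(age, education_years, observed_score):
    """Return (predicted score, z-score, percentile) for an observed WCST 11 score."""
    edu_dummy = 1 if education_years > 12 else 0  # 12 years and below -> 0
    predicted = B0 + B_AGE * age + B_EDU * edu_dummy
    z = (observed_score - predicted) / SD_E       # standardized residual
    percentile = 100 * NormalDist().cdf(z)        # replaces the z-table lookup
    return predicted, z, percentile


predicted, z, percentile = wcst11_percentile(age=25, education_years=16, observed_score=26)
print(f"predicted = {predicted:.4f}, z = {z:.5f}, percentile = {percentile:.0f}%")
```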
An important point that clinicians should consider when evaluating the percentiles obtained here is that a high percentile is not always a good indicator. A high percentile for WCST 3, WCST 4, WCST 10, and WCST 11 is a good indicator, while a high percentile for WCST 2 is negative. For this reason, clinicians can interpret WCST 2 by subtracting the percentile from 100. An example for WCST 2 is given below.
For a 25-year-old participant with more than 12 years of education and an observed WCST 2 score of 56, the predicted WCST 2 score is 18.5533 + 0.8251*25 – 12.5971*1 = 26.5837. The residual is 56 – 26.5837 = 29.4163 and the standardized residual is 29.4163 / 19.5748 = 1.5028. The percentile in this example is 93%. Because WCST 2 counts error responses, a higher score is undesirable. For this reason, it is more accurate to subtract the percentile in the table from 100. In this example, the participant’s percentile is 7%.
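The reversal for error-type scores can be sketched in the same way. In the Python illustration below, the coefficients and SDe are the WCST 2 values quoted in the text, while the function name `norm_percentile` and its `higher_is_better` flag are hypothetical conveniences introduced only for this example.

```python
from statistics import NormalDist

# Values quoted in the text for the WCST 2 (total errors) model.
B0, B_AGE, B_EDU = 18.5533, 0.8251, -12.5971
SD_E = 19.5748


def norm_percentile(age, edu_dummy, observed_score, higher_is_better):
    """Return (z-score, percentile); the percentile is reversed for error-type scores."""
    predicted = B0 + B_AGE * age + B_EDU * edu_dummy
    z = (observed_score - predicted) / SD_E
    percentile = 100 * NormalDist().cdf(z)
    if not higher_is_better:
        percentile = 100 - percentile  # more errors means worse performance
    return z, percentile


z, raw = norm_percentile(25, 1, 56, higher_is_better=True)        # table value
_, adjusted = norm_percentile(25, 1, 56, higher_is_better=False)  # clinical reading
print(f"z = {z:.4f}, table percentile = {raw:.0f}%, adjusted = {adjusted:.0f}%")
```

Keeping the direction as an explicit flag makes the reversal a deliberate choice per score type, following the distinction the text draws between WCST 2 and WCST 3, 4, 10, and 11.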
Appendix pages 2–7 present the predicted values generated through the regression model. Due to the extensive nature of these pages, a user-friendly calculator has been incorporated on the first page of the appendix. This calculator simplifies the process for clinicians and researchers when assessing individuals. They can input the participant’s age, education level, and the raw score obtained from the WCST-CV. Subsequently, the calculator will automatically display the corresponding percentile of the participant’s peers for the specific score type in the percentile section.
In addition to the five score types derived from the analysis, to interpret scores from other score types within the domain of neuropsychological evaluation, mean and standard deviation values for each score type are provided across six distinct age groups and three different education levels. These values enable clinicians to perform calculations using the conventional approach.
As previously mentioned in the introduction, the selection of age and education level categories in establishing normative groups for neuropsychological tests is arbitrary. Initially, WCST norms were categorized into five-year age groups, conforming to the common approach, resulting in 12 age groups (Rammal et al., 2019; Arango-Lasprilla et al., 2015; Arango-Lasprilla et al., 2017; Shan et al., 2008). However, group comparisons showed no statistically significant differences among individuals’ WCST score types, particularly within the age range of 18 to 37. Consequently, participants within this age range were grouped together, and the age groups were recategorized as illustrated in Table 3.
The means and standard deviations of sub-scores derived from the Wisconsin Card Sorting Test-Computer Version (WCST-CV)
* This table is intended for clinicians seeking to interpret additional scores obtained from the WCST-CV. To ensure statistical significance, we stratified the sample by various age groups and levels of education. Consequently, some age groups span wider ranges because no significant differences in WCST scores were found within them, notably for individuals aged 18 to 37. However, as cognitive decline becomes more pronounced in older age groups, the age categories become narrower. A similar pattern is observed for education levels. Our statistical group comparisons revealed significant differences, particularly between individuals with 1–5 years of education and those with 6–8 years of education. Nevertheless, when education levels were further subdivided, the number of observations per group became insufficient. Categories marked “Not Applicable (N.A.)” contain fewer than ten people; their mean and standard deviation values are not reported because consistent and reliable findings cannot be obtained from groups this small. An essential point to consider is that the averages of some scores, particularly in older age groups, did not differ significantly by education level. For instance, in the 64 and older age group, the average for the WCST1 score type was 128, with very low standard deviations. As noted in the introduction, this is attributed to the test’s limited ability to distinguish between individuals within this age range. Given that individuals in this age bracket consistently achieved the lowest scores, a ceiling effect, indicative of the test’s inability to capture fine distinctions, became evident.
The primary objective of this investigation is to generate normative data for the WCST-CV for a sample of healthy Turkish adults using regression modeling. Our findings indicate that five distinct WCST-CV sub-scores conform to the prerequisites for regression analysis, with education level and age providing a satisfactory explanation for the variance observed. When Tables 1 and 2 are examined together, it becomes evident that more than five WCST scores are associated with education level and age. Nevertheless, norm scores were not derived for these additional score types because of their notably low explained variance. The existing literature reveals that in studies adopting a similar methodology, if the regression model attains statistical significance, estimated data are often presented without regard for the magnitude of explained variance (Arango-Lasprilla et al., 2015; Arango-Lasprilla et al., 2017). However, it is essential to emphasize that this practice represents an unsuitable approach within the broader context of regression analysis (Gujarati, 2022).
Norms for psychological tests should be updated periodically within the targeted sample. An illustrative example of this practice is found in the Wechsler Intelligence Tests for Children. The intelligence scores are periodically revised because biological (genetic) factors and countries’ development levels change over time, as do access to formal education, which begins at increasingly early ages, and technology. This phenomenon is explained by the Flynn effect, which highlights a consistent upward trend in intelligence scores over time (Graves et al., 2021; Trahan et al., 2014; Bratsberg & Rogeberg 2018). The Turkish adaptation of the Wisconsin Card Sorting Test (WCST) was undertaken by Karakaş et al. in 1999. Nevertheless, this adaptation was limited to a relatively small sample of 69 healthy individuals, and it did not address applicability within clinical populations, particularly concerning ecological validity, which is one of the pivotal methodological concerns. Furthermore, as noted in the introduction, the WCST ranks among the prominent neuropsychological tests employed internationally for the assessment of cognitive functioning in both healthy and clinical populations. However, its infrequent use in our national context can be attributed to the scarcity of individuals with expertise in its administration and scoring, in contrast to more commonly employed tests. In light of this, our research team developed an extended version of the WCST in 2021 to facilitate its broader application in research and clinical contexts (Çelik et al., 2021). Nevertheless, psychometric investigations conducted among individuals with and without a schizophrenia diagnosis reveal disparities between the two versions.
Consequently, this discrepancy necessitates formulating a distinct set of norms tailored explicitly to the WCST-CV, particularly when endeavors to assess an individual’s cognitive capacity within a clinical setting are at the forefront.
Using the computerized WCST version gives clinicians some advantages, such as being more economical and minimizing errors in administration and scoring. Because of these advantages, producing norms for the WCST-CV will enable the widespread use of this test in clinical and research settings. The increasing use of technology, especially in neuropsychological assessment tools, necessitates a review of norm studies. Unlike other norm studies, we defined age as a continuous variable. Thus, we think that clinicians will be able to make more precise decisions. Notably, the gender variable was not significant for any score type in the regression models. Therefore, gender was not included in the regression models. A comprehensive review of the literature shows that relatively few studies demonstrate a substantial gender-related impact on WCST scores (Lineweaver et al., 1999). The consensus in the field is that gender has no significant influence on WCST scores (Tercero et al., 2021; Norman et al., 2011; Caffara et al., 2004). Similarly, the previous norm study conducted in Türkiye found that gender did not affect WCST performance.
Conclusion
Compared to other studies, this study calculated norm values for the WCST-CV in healthy adults using a much larger sample. In addition, standardized tests such as the Beck Depression and Anxiety Scales and MOCA were used to determine whether the participants were mentally and neurologically healthy. Our aim is to establish norm values for the Turkish sample in order to popularize the use of the computer version of the WCST, which provides comprehensive information to clinicians and researchers about an individual’s frontal functions. Unlike other studies, instead of presenting norm lists/tables exceeding 200 pages for each score type, we developed a calculator. However, despite the unique aspects of our research, it also has some limitations. One of the most important limitations is that, as in other studies, education level was treated as a dummy variable. Dividing education level into two groups (12 years or less, and more than 12 years) makes it difficult to investigate the real effect of education on WCST scores. For example, as Table 3 shows, when education level was divided into more groups (1–8, 9–12, and 13 or more years), a significant difference was found between education groups in terms of WCST scores. However, examining education level with more categories requires a much larger sample than the current one, which makes data collection difficult and reduces the applicability of the research. In addition, the number of healthy older adults with 13 or more years of education is relatively lower than in other groups.
Therefore, as can be seen from Table 3, the sample size is not sufficient for some specific education and age groups (at least 10 people were required for each age and education level in this study; if a group has fewer than 10 people, mean and standard deviation values are not given and are indicated as N.A.). A second limitation is that the norm values obtained are valid for the Turkish sample and cannot be generalized to other countries.
Footnotes
Acknowledgment
The Scientific and Technological Research Council of Turkey (TÜBİTAK) supported this study under project number 1919B012113778.
Ethics statement
The study was approved by the Bartın University Social and Human Sciences Ethics Committee (Approval number: 2021-SBB-0489; Approval date: 15.12.2021).
Conflict of interest
The authors have no known conflict of interest to disclose.
Availability of data and materials
If researchers are interested in obtaining the WCST-CV program and the calculator for estimating norm values in healthy adults, they can contact the corresponding author via email.
