Abstract
Background:
Semantic memory (SM) constitutes a cognitive system that is seriously affected by Alzheimer’s disease (AD). There are several tests for assessing SM, but a tool is needed to assess AD in the early stages of the illness.
Objective:
The study aimed to create, validate, and normalize a new test to assess SM, called the Ikos test, for AD and early AD in clinical practice.
Methods:
Sixty-two healthy adults as a control group (CG), 62 patients with AD, and 60 patients with amnestic mild cognitive impairment (aMCI), subdivided into those who progressed to AD and those who did not, were selected. Internal consistency (IC), construct validity (CV), inter-rater reliability, and test-retest reliability were analyzed. We used a Bayesian approach to establish the diagnostic accuracy of the Ikos test in AD and early AD.
Results:
IC showed a Kuder-Richardson index of r = 0.945. For CV, the Intraclass Correlation Coefficient (ICC) between the Ikos test and the Pyramids and Palm Trees test was 0.897. The Kappa index for inter-rater reliability was between 0.865 and 0.912, and the ICC for test-retest reliability was 0.873. The Area Under the Curve was 0.981, sensitivity (SE) was 0.95, and specificity (SP) was 0.95 in the AD/CG contrast. In the MCI-AD/MCInotAD contrast, SE = 0.77 and SP = 0.80.
Conclusion:
The Ikos test meets the criteria of validity and reliability with high correlation indexes. Therefore, it can be considered a valid, reliable, and easily applicable tool for SM assessment in diagnosing AD and the early stages of clinical disease.
INTRODUCTION
Semantic memory (SM) constitutes an essential cognitive system upon which general knowledge of the meaning of the world and of the relationships between different concepts is built [1]. Moreover, SM is involved in essential processes such as the acquisition, representation, and processing of conceptual information [2], and because of the central role it plays, damage to it can lead to significant cognitive deficits.
One of the neurodegenerative diseases in which SM is affected is Alzheimer’s disease (AD), and recent research also points to its involvement in the prodromal stage of the disease known as mild cognitive impairment (MCI), especially the amnestic subtype (aMCI). However, the theories about how SM is damaged and the most effective assessment technique are still up for debate [2–16].
The study of the underlying mechanisms of SM has followed the hypothesis of the existence of a dissociation between living and non-living things and has generally found specific categorical damage in living things. This dissociation may be the result of a structure in terms of physical attributes (living beings) versus functional attributes (non-living beings) [17, 18]. However, several studies report different results and do not report category-specific damage or the prevalence of one over the other [19–22].
On the other hand, some authors [22–26] propose that the different manifestations of SM impairment are related to how SM is evaluated, specifically to the degree of processing effort required by the task. They contend that although factors such as word frequency, familiarity, or the visual complexity of the images may have a substantial impact on semantic performance, poor performance should not be immediately attributed to a semantic deficit. Instead, some assessment results reflect other cognitive aspects, depending on the stimuli.
The temporal pole is generally considered the center of semantic processing at the structural level. This view stems from the considerable atrophy of this area in individuals with semantic dementia (SD), which results in a significant deficit in semantic knowledge [27–30]. In contrast, AD is insidious in onset, and the damage is initially less severe and more diffuse than in SD [28].
Some authors such as Gonnerman et al. reject an explanation based on the extension of the pathology to different specialized anatomical regions and propose that semantic impairment in AD is caused by a diffuse process that simultaneously affects many parts of a distributed network of semantic representations [31]. Neuroimaging studies report dysfunctions in posteromedial areas (precuneus and posterior cingulum) and frontal areas, affecting the default network area in which a higher concentration of amyloid-β (Aβ) protein exists in AD. Results have even shown alterations in this network in patients with MCI and patients with genetic risk of AD [32].
Currently, another hypothesis proposes that semantic cognition processes depend on interactive neural systems that form patterns consistent across individuals and distributed in different cortical association areas that function according to the stimulus. In addition, this neuronal system needs a semantic hub or axis that makes the information coherent and generalizable; this hub is in the anterior temporal region (bilaterally) [33–39].
In addition, this approach proposes that semantic cognition is regulated by semantic control, carried out by a distributed frontal and temporoparietal neural network involved in the executive control of semantic information, that is, in selecting the information necessary and appropriate to perform a task according to the time or situation [35].
Another research proposal suggests that in AD there is an early dysfunction of executive control [35]. That is, the fronto-temporo-parietal neural network is affected rather than the central semantic representations, which are related to the semantic hub and tend to degrade in the moderate and severe stages of the disease [40, 41]. Because SM is less affected in the early stages of AD, such as aMCI, changes in SM are usually subtle and do not affect activities of daily living or cognitive performance. Hence, their assessment is not a priority, and the evaluation focuses mainly on deficits in episodic memory. One disadvantage of focusing assessment on episodic memory is that it is prone to decline even in healthy aging, whereas SM remains stable and even improves and solidifies with age [14, 43].
Therefore, an alteration in SM detected early could indicate the presence of a neurodegenerative process such as AD in prodromal stages, earlier than other cognitive alterations. In addition, many authors reported alterations in SM in aMCI. They claim that SM is affected several months before the onset of any other cognitive impairment and that this appears informative in predicting disease progression in patients with cognitive impairment [2–16].
Therefore, it is necessary to incorporate semantic tests into routine clinical assessments, and performance on them should be taken as a diagnostic criterion in AD and aMCI, jointly with additional biomarker assessment. This would increase diagnostic accuracy at prodromal stages, where intervention could have better results [6, 44].
Evaluation of the SM
There are several tests for assessing SM, each using a different strategy. Generally, tests through verbal access are prioritized [45, 46], such as naming tasks [47] and semantic fluency, among others [48]. Another alternative is SM testing through visual access, such as the Birmingham Object Recognition Battery [49], which poses object decision tasks. Finally, some assessment methodologies propose assessing episodic memory and SM simultaneously through semantic category encoding and the learning of word lists [50].
Despite the existence of several tests to assess SM, the first-choice tests are the Pyramids and Palm Trees (PPT) Test [51] and the Cambridge Semantic Battery, which includes several tasks for the evaluation of SM, among them the Camels and Cactus Test [52], an improved version of the PPT test.
Most of these tests that assess SM require the person to recognize each item presented; retrieve, activate, and compare the corresponding semantic information; and identify semantic associations [10, 53]. In addition, the level of education, gender, age, and cultural background of the patient can influence those tests [53–56]. Therefore, subjects with SM pathologies would perform better on tests whose items or objects are frequently used, familiar, typical, and culturally appropriate [53, 58]. Conversely, failure on a test with highly familiar and typical items may indicate early SM impairment.
The tests described above have high cognitive demands that make SM challenging to assess, especially in the population under study. Furthermore, the frequent use of verbal tasks to assess SM presents a problem, as language difficulties are an essential feature of AD, and it might be difficult to separate semantic from linguistic deficits using such tests [23]. One way to reduce these cognitive demands when assessing SM is to use color image stimuli, as they provide fewer confounds than verbal stimuli, and semantic feature information can be directly manipulated [59–61].
The present study aimed to create and validate a new tool that is feasible and useful in clinical practice for assessing SM in AD and in aMCI related to AD, using the PPT paradigm with familiar and culturally appropriate visual stimuli in which the relationship of functional knowledge is the semantic key.
METHODS
Materials and procedures
We created a new test to assess SM, called the Ikos test, based on the paradigm of the PPT test [51], which is commonly used in clinical and research settings to evaluate the integrity of semantic knowledge. In this new test, participants are shown four stimuli simultaneously on each print (20 prints in the test, with 80 images in total): one as a sign (seman) at the top of the print and three as possible relationships or links (tikos) at the bottom. They must identify which of the three items at the bottom is most closely related to the stimulus at the top. Compared with the PPT test, we added one more stimulus at the bottom to avoid the 50% probability of answering correctly by chance, which jeopardizes the psychometric properties.
We created new visual stimuli with realistic representations of everyday objects to adapt the visual items to the population with possible dementia. This task requires participants to recognize all four items and compare their corresponding semantic information, selecting the appropriate link and rejecting the other relationships. The key semantic information needed relates to the use of the objects represented in the Ikos test. Because these are objects of everyday use, the population to whom the test is addressed can be expected to have this prior knowledge, which makes it more “ecological”.
The only modality considered in this new test is the visual modality, through pictures. With this test, we want to demonstrate that people without pathology can obtain all the correct answers and that errors indicate dysregulation of the semantic systems. The instruction was as follows: which of the three objects at the bottom is best related to the object at the top?
We considered an acceptable range of errors in subjects without pathology, which is analyzed statistically below. Because item/print one showed a low index of internal consistency, we decided to use it as a practice trial, which allows the examiner to explain in detail how the test works. The Ikos test is therefore scored on 19 items, with a maximum score of 19 points, plus one practice item. Figure 1 displays item number 18 of the Ikos test.
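The effect of the extra distractor on guessing can be illustrated with a short calculation. The sketch below is not part of the original study: it assumes that a respondent guesses independently on each of the 19 scored items, and the threshold of 15 correct answers is an arbitrary value chosen only for illustration. It compares a hypothetical two-choice format (chance = 1/2) with the three-choice format of the Ikos test (chance = 1/3).

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct by guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_ITEMS = 19      # scored items in the Ikos test (item 1 is a practice trial)
THRESHOLD = 15    # hypothetical score used only for this illustration

for options in (2, 3):
    p_guess = 1 / options
    expected = N_ITEMS * p_guess  # mean score obtained by pure guessing
    tail = prob_at_least(THRESHOLD, N_ITEMS, p_guess)
    print(f"{options} options: expected chance score = {expected:.2f}, "
          f"P(score >= {THRESHOLD} by guessing) = {tail:.6f}")
```

Under these assumptions, the expected score from pure guessing drops from 9.5 to about 6.3 out of 19 when a third option is added, so high scores are much harder to reach by chance.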

Figure 1. Item/print number 18 of the Ikos test.
At the same time, all participants were cognitively assessed with a neuropsychological battery, the Basic Neuropsychological Battery-Dementia version (BNB-D) [62]. In addition, all subjects were classified according to the Episodic Memory Stage [63] using the episodic memory test of the BNB-D.
Participants
Sixty-two healthy adults aged 50 to 99 years were selected from the community as a control group (CG) and were used to determine the psychometric attributes (validity and reliability) and the average performance on the new test. To ensure that the control subjects did not have cognitive impairment, we required a Mini-Mental State Examination (MMSE) [64] score higher than 26 points and a maximum of one error in the recall of the three words of this screening test.
We complemented this criterion with a more complete neuropsychological assessment, the BNB-D [62], on which the control subjects had to obtain more than 76 points in the global cognitive index (Icog, limit of –1 Standard Deviation).
Sixty-two patients were selected for the AD group at the Dementia Unit Service from April 2017 to April 2022. They fulfilled the diagnostic criteria for AD according to the National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA). In addition, they were in the mild-moderate stage of illness according to the Global Deterioration Scale, stage 4 [65].
A subgroup of 60 patients diagnosed with amnestic-type MCI [66] was selected at their first evaluation in 2017 and followed up for at least four years, until April 2022. The MCI group was divided into two subgroups: those who progressed to AD (MCI-AD, N = 26) and those who did not progress within four years of MCI diagnosis (MCInotAD, N = 30). At the four-year follow-up, four MCI subjects had not completed the assessment (two had died and two were lost to follow-up). We used this cohort to determine SM impairment and to establish the diagnostic accuracy of the Ikos test.
We used medical and psychiatric records to ensure that all participants were free from neurological disease, psychiatric disease, head injury, and stroke. In addition, all subjects signed the informed consent.
Statistical analysis
In the first level of analysis, we obtained descriptive data on the sample (CG, AD, MCInotAD, and MCI-AD) and the tool used in the study. Afterward, the internal consistency of the Ikos test was established with the Kuder-Richardson KR20 index. Next, the Intraclass Correlation Coefficient (ICC) between the Ikos test and the PPT test (random selection of 20 CG subjects) was used to establish construct validity. In addition, the Kappa index was calculated between four raters (random selection of 20 CG subjects) to determine inter-rater reliability.
Four clinicians (two different pairs) assessed the participants; one clinician of each pair led the session, and both later rated the Ikos test independently. Finally, test-retest reliability was analyzed (random selection of 20 CG subjects) using the ICC between two assessments of the same participant at different times (interval: 25–30 days).
In the second level of analysis, we compared the results of the Ikos test between groups (CG, AD, and MCI subgroups) using non-parametric tests (Kruskal-Wallis for the four-group comparison and the Mann-Whitney U test for pairs) due to the non-Gaussian distribution of the scores on the new semantic test. Age and years of formal education were also compared between groups (CG/AD/MCI subgroups) using one-way analysis of variance (ANOVA) with Scheffé post hoc comparisons. Finally, we compared the MMSE and BNB-D using Student’s t-test for pairs: CG/AD and MCI-AD/MCInotAD.
We applied multiple linear regression analysis in the CG, with age and education level (years of formal instruction) as independent variables and the Ikos test score as the dependent variable, to identify the magnitude of the modulation these two variables exert on the score. The regression used the enter method.
To establish the diagnostic accuracy of the new SM test in distinguishing CG subjects from AD patients (and to set the cut-off score), we used the area under the Receiver Operating Characteristic (ROC) curve (AUC) to summarize the global diagnostic accuracy of the Ikos test. Moreover, using 2×2 tables, we calculated sensitivity, specificity, and the positive and negative likelihood ratios (+LR and –LR, respectively) using the Bayesian approach. Finally, we calculated the 95% confidence interval (CI) of the likelihood ratios. All diagnostic accuracy analyses were done at the study level and not at the level of individual diagnostic judgments.
To establish predictive values (+LR and –LR) in MCI patients and their probability of having prodromal AD, the same procedure was applied to the MCI-AD and MCInotAD groups at the cut-off score obtained in the CG/AD contrast.
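For readers unfamiliar with the KR20 index, a minimal sketch of its usual formula follows: KR20 = k/(k − 1) · (1 − Σ p_j q_j / σ²), where k is the number of dichotomous items, p_j is the proportion of correct answers on item j, q_j = 1 − p_j, and σ² is the variance of the total scores. The function name `kr20`, the use of the population variance, and the toy response matrix are assumptions made for illustration; they are not the study's data or software.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for rows of 0/1 item scores (one row per subject)."""
    n_subjects = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance: an assumption of this sketch
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_subjects  # proportion correct on item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: 5 subjects x 4 items (not data from the study)
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy), 3))  # 0.8
```

Note that the index cannot be computed when every subject obtains the same total, which is worth keeping in mind for a test designed to produce a ceiling effect in healthy subjects.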
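The agreement between the two raters of a pair can be sketched with Cohen's Kappa, which corrects the observed agreement for the agreement expected by chance. The example assumes, purely for illustration, that each rater scores every response as correct (1) or incorrect (0); the function name `cohen_kappa` and the ratings are hypothetical, and the source does not state which Kappa variant or software was used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the marginal proportions, summed over categories
    expected = sum(counts_a[c] * counts_b[c] for c in set(counts_a) | set(counts_b)) / n**2
    if expected == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical item scores from two raters (not data from the study)
rater_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
rater_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
print(round(cohen_kappa(rater_1, rater_2), 3))  # 0.583
```

With four raters organized in pairs, one Kappa per pair would be obtained in this way, which is consistent with reporting a worst and a best index.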
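The Mann-Whitney U statistic used for the pairwise contrasts can be sketched as a count of pairwise comparisons between the two groups, with ties counted as one half. The code below computes only the statistic, not its p-value; the function name `mann_whitney_u` and the scores are hypothetical and serve only to show the mechanics.

```python
def mann_whitney_u(group_x, group_y):
    """Return (U_x, U_y): pairwise wins of each group, ties counted as 0.5."""
    u_x = 0.0
    for x in group_x:
        for y in group_y:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    u_y = len(group_x) * len(group_y) - u_x  # the two statistics always sum to n_x * n_y
    return u_x, u_y


# Hypothetical Ikos scores (not data from the study)
controls = [19, 19, 18, 19]
patients = [15, 17, 18, 12]
u_controls, u_patients = mann_whitney_u(controls, patients)
print(u_controls, u_patients)  # 15.5 0.5
```

The smaller of the two values is the one conventionally reported as U; a value near zero means the two score distributions barely overlap.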
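The 2×2 calculation described above can be sketched as follows. The confidence intervals use the common log method for likelihood ratios, which is an assumption here because the source does not name the interval method. The cell counts are also assumptions: they are hypothetical values chosen to be compatible with the group sizes and with the sensitivity and specificity reported in the abstract, since the per-score counts of Table 2 are not reproduced in this text.

```python
from math import exp, log, sqrt


def log_ci(ratio, se_ln, z=1.96):
    """Confidence interval for a ratio, built on the log scale."""
    return exp(log(ratio) - z * se_ln), exp(log(ratio) + z * se_ln)


def diagnostic_accuracy(tp, fn, fp, tn):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table.

    All four cells must be greater than zero for the log-method intervals.
    """
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    lr_pos = se / (1 - sp)
    lr_neg = (1 - se) / sp
    se_ln_pos = sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    se_ln_neg = sqrt(1 / fn - 1 / (tp + fn) + 1 / tn - 1 / (fp + tn))
    return {
        "sensitivity": se,
        "specificity": sp,
        "lr_pos": lr_pos,
        "lr_pos_ci": log_ci(lr_pos, se_ln_pos),
        "lr_neg": lr_neg,
        "lr_neg_ci": log_ci(lr_neg, se_ln_neg),
    }


# Hypothetical counts (assumed, not taken from the study's tables)
ad_vs_cg = diagnostic_accuracy(tp=59, fn=3, fp=3, tn=59)    # 62 AD, 62 CG
mci = diagnostic_accuracy(tp=20, fn=6, fp=6, tn=24)         # 26 MCI-AD, 30 MCInotAD

for name, result in (("AD/CG", ad_vs_cg), ("MCI-AD/MCInotAD", mci)):
    print(name, {key: value for key, value in result.items()})
```

A positive likelihood ratio well above 10 and a negative one below 0.1 are usually read as strong evidence for ruling a diagnosis in or out, which is the practical reason for reporting them alongside sensitivity and specificity.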
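How a likelihood ratio turns into a probability of prodromal AD for an individual MCI patient can be sketched with Bayes' rule in odds form: post-test odds = pre-test odds × LR. The sensitivity and specificity below are the MCI values given in the abstract; the pre-test probability of 26/56 (the share of this cohort that progressed) is an assumption chosen for illustration and would differ in other clinical settings. The function name `post_test_probability` is hypothetical.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Bayes' rule in odds form: convert a pre-test probability using a likelihood ratio."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


SE, SP = 0.77, 0.80           # MCI contrast values given in the abstract
LR_POS = SE / (1 - SP)        # about 3.85
LR_NEG = (1 - SE) / SP        # about 0.29
PRE_TEST = 26 / 56            # assumption: progression rate observed in this cohort

print(f"pre-test probability: {PRE_TEST:.2f}")
print(f"after an impaired Ikos score:  {post_test_probability(PRE_TEST, LR_POS):.2f}")
print(f"after a preserved Ikos score:  {post_test_probability(PRE_TEST, LR_NEG):.2f}")
```

The design choice of working with likelihood ratios rather than predictive values is that the ratios do not depend on prevalence, so the same test properties can be combined with whatever pre-test probability applies locally.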
RESULTS
One hundred eighty subjects participated in the study (62 control subjects, 62 AD patients, and 56 MCI subjects subdivided into 26 MCI who progressed to AD and 30 who did not progress to AD). Of the 62 control subjects, 50.32% were women. In the AD group, that figure was 74.19%; in the MCI-AD group, it was 69.23%; and in the MCInotAD group, it was 66.66%. Table 1 describes the sociodemographic and clinical characteristics of the different groups. Age and education did not differ significantly between groups (F = 1.30, p = 0.274 for age, and F = 2.94, p = 0.065 for years of education). MMSE and the Icog index of the BNB-D differed significantly between the CG and AD groups (t = 7.82, p = 0.001 for MMSE and t = 9.93, p = 0.001 for the Icog index), but not between the MCI-AD and MCInotAD groups (t = 1.59, p = 0.211 for MMSE and t = 1.62, p = 0.193 for BNB-D).
Table 1. Sociodemographic and clinical characteristics of the sample by group
CG, control group; AD, Alzheimer’s disease group; MCI-AD, MCI due to Alzheimer’s disease; MCI-notAD, MCI without progression to Alzheimer’s disease; y, years; EMS, Episodic Memory Stage according to the Cejudo 2016 classification [63]. *No statistically significant differences between groups according to ANOVA. **Statistically significant differences between groups according to ANOVA and Mann-Whitney U for the Ikos test.
The Ikos test scores differed between the CG and AD groups (Mann-Whitney U = 73.00, p = 0.001) and between the MCI-AD and MCInotAD groups (Mann-Whitney U = 138.00, p = 0.001).
We applied regression analysis in CG subjects, with age and education level (years of formal instruction) as independent variables and the Ikos test score as the dependent variable. The model showed no influence of age or years of formal education on the Ikos test scores (F = 1.43, p = 0.242). The internal consistency of the new semantic knowledge tool showed a Kuder-Richardson KR20 index of r = 0.945 for the full scale (19 items). Item/print 1 of the 20 initial items was the most confusing for the control subjects (KR20 index = 0.71 with item 1 included), so it became the practice item, on which the rater explains in greater detail the type of relationship between the images and the task the subject must perform. For construct validity, the ICC between the Ikos test and the PPT test was 0.897.
The Kappa index was between 0.865 and 0.912 (worst and best index between rater pairs) for reliability between raters. The ICC index was 0.873 for the test-retest reliability.
Regarding the diagnostic accuracy of the new SM test in AD, the area under the ROC curve (AUC) was 0.981. At a cut-off score of 17, the coordinates of the curve gave a sensitivity of 0.95 and a specificity of 0.95.
Using the Bayesian approach, we established a sensitivity of 0.95 (95% CI: 87–98%) and a specificity of 0.95 (95% CI: 87–98%) in the CG and AD groups, with a positive likelihood ratio (+LR) of 20 (95% CI: 6.51–59) and a negative likelihood ratio (–LR) of 0.05 (95% CI: 0.02–0.15). Furthermore, at the Ikos cut-off score of 17, we established a sensitivity of 0.77 (95% CI: 61–88%) and a specificity of 0.80 (95% CI: 66–89%) in the MCI-AD and MCInotAD groups, with a +LR of 3.85 (95% CI: 1.82–8.11) and a –LR of 0.29 (95% CI: 0.14–0.60).
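The relation between the cut-off score, the ROC coordinates, and the AUC can be sketched as follows. The example treats a score at or below the cut-off as a positive (impaired) result, as the Discussion does for scores of 17 or below. The function names and the two small score lists are hypothetical; they are not the study's data and do not reproduce its AUC.

```python
def sensitivity_specificity(patients, controls, cutoff):
    """Scores <= cutoff count as impaired; return (sensitivity, specificity)."""
    sensitivity = sum(score <= cutoff for score in patients) / len(patients)
    specificity = sum(score > cutoff for score in controls) / len(controls)
    return sensitivity, specificity


def roc_points(patients, controls, max_score=19):
    """ROC coordinates (1 - specificity, sensitivity) for every cut-off from 0 to max_score."""
    points = [(0.0, 0.0)]
    for cutoff in range(max_score + 1):
        se, sp = sensitivity_specificity(patients, controls, cutoff)
        points.append((1 - sp, se))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical Ikos scores (not data from the study)
patients = [12, 15, 17, 18]
controls = [18, 19, 19, 19]

print(sensitivity_specificity(patients, controls, cutoff=17))  # (0.75, 1.0)
print(auc_trapezoid(roc_points(patients, controls)))           # 0.96875
```

Tabulating sensitivity and specificity for every possible cut-off in this way is what allows a single operating point, such as a score of 17, to be chosen from the curve.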
Table 2 shows direct results from the Ikos test according to study groups (number of subjects at every score of the Ikos test).
Table 2. Direct results from the Ikos test according to study groups (number of subjects at every score)
Table 3 shows sensitivity, specificity, and likelihood ratios according to scores of the Ikos test in MCI subgroups contrast.
Table 3. Sensitivity and specificity according to scores of the Ikos test in MCI subgroups contrast
SE, sensitivity; SP, specificity; +LR, positive likelihood ratio; –LR, negative likelihood ratio.
DISCUSSION
We developed the Ikos test to evaluate semantic knowledge (SM) through a visual image-image matching task with stimuli that are frequently used in daily life, taking functional knowledge relations as the semantic key. The Ikos test uses visual stimuli because assessing semantics through visual access reduces confounds and provides a way to manipulate information about semantic performance directly [60]. In addition, the new test does not impose a high cognitive demand, since the stimuli presented through images are objects used frequently in daily life, making it more ecological than other tests.
The Ikos test meets the basic principles of construct validity, criterion validity, classic psychometrics, and reliability with high correlation indexes. Based on these results, the Ikos test can be considered a valid, reliable, and easily applicable tool for SM assessment, especially in diagnosing AD and early AD. In addition, the Ikos test can be used in daily clinical practice thanks to its brief administration time of three to seven minutes.
The Ikos test score is designed to be maximal (19 out of 19 correct items) in subjects without cognitive impairment (ceiling effect). However, subjects with SM impairment show a dispersion toward lower scores. Table 2 shows these results.
Using everyday objects as the relevant material for semantic judgment is justified as a way to diminish the effects of previously acquired knowledge that is more related to academic knowledge and formal instruction (schooling). Also, because the semantic judgment concerns everyday objects and their use, performance on the test does not change with age.
The Ikos test has excellent diagnostic sensitivity and specificity for AD, which shows that SM is already affected in the early clinical stages of the disease. In our sample, patients with AD mainly obtained Ikos test scores indicating impairment (17 or below) in the mild stage of the disease [65]. These findings agree with those of different authors who have reported in their research that SM was affected in early stages of AD [2–12, 14–16].
On the other hand, some authors suggest that early SM disturbances are related to the difficulty of the tests used for evaluation. For example, some batteries that assess SM use tasks with high cognitive demands, and low performance cannot be attributed to a purely semantic failure when it is strongly mediated by other cognitive functions, such as attention and executive processes, that high-demand tasks require [21, 23]. In addition, although some tests [50] propose to assess semantic and episodic memory simultaneously, we propose to evaluate episodic memory and SM separately because, although the two mechanisms are related, assessing them separately yields much more information about the person’s cognitive-amnestic state.
Therefore, when exploring semantic dysfunction in AD, the cognitive demands should be reduced as much as possible. The Ikos test proves to be a straightforward task for control participants, while performance begins to be altered in the early clinical stages of AD. Thus, tasks with low cognitive demand and high familiarity and frequency of stimuli, in which information can be accessed or retrieved automatically, may better expose the true semantic deficits [22, 23].
The Ikos test results also show no influence of age or formal education. That is a crucial point, since other tests usually used to evaluate SM, such as the PPT test, are influenced by variables such as the level of education, according to the validation performed in the Spanish population [55]. In addition, other tests such as semantic fluency and confrontation naming are also influenced by factors such as gender, age, and the level of schooling. That is because these tests usually assess semantics through lexical access and working memory, among others [67–69]. In normal aging these abilities naturally decline; therefore, failure cannot be entirely attributed to a semantic deficit [60, 70–73], and people with AD perform worse on cognitive tasks with verbal stimuli than with other stimuli, such as pictures [61].
The results for identifying the AD group, what we call “diagnostic accuracy,” are excellent and higher than we initially hypothesized. Patients already clinically diagnosed with AD show a wide range of performance on the Ikos test but very infrequently reach the maximum score, as shown in Table 2. In contrast, control subjects show a ceiling effect on the test that differentiates the two groups, resulting in excellent sensitivity and specificity. This shows that most people with a precise clinical diagnosis of AD already have evident SM impairment when the effect of schooling (acquired knowledge) is eliminated by using functional relationships between everyday objects. Furthermore, scores in control subjects are not dispersed as they are in other tests of SM evaluation.
This research shows that a deficit in SM in prodromal stages such as MCI could indicate an increased likelihood of developing AD in the future [9–11, 74–76]. Assessing SM with the Ikos test in addition to episodic memory would add a further indicator to help identify likely AD cases early on. That indicator, jointly with biological markers and neuroimaging tests such as functional magnetic resonance, should serve as a cognitive marker of disease progression.
One of the study’s limitations is that, although the Ikos test is easy to administer, it must be administered and scored by a person experienced in neuropsychological assessment. That is to ensure that any errors made are not due to visual deficits (blindness, agnosia) and to be able to pick up the clinical semiology of semantic impairment while administering the test.
On the other hand, it is essential to investigate how the Ikos test works with other types of dementias concerning SM, especially those with direct involvement of temporal lobe structures, such as in frontotemporal dementia and especially in SD.
