Abstract
BACKGROUND:
The term ‘dementia’ covers a range of progressive brain diseases from which many elderly people suffer. Traditional cognitive and pathological tests are currently used to detect dementia; however, applications using Artificial Intelligence (AI) methods have recently shown improved detection accuracy and efficiency.
OBJECTIVE:
This research paper investigates the efficacy of one type of data analytics called supervised learning to detect Alzheimer’s disease (AD) – a common dementia condition.
METHODS:
The aim is to evaluate cognitive tests and common biological markers (biomarkers) such as cerebrospinal fluid (CSF) to develop predictive classification systems for dementia detection.
RESULTS:
A data analytics process has been proposed, implemented, and tested against real data obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) repository.
CONCLUSION:
The models showed good power in predicting AD levels, notably from specified cognitive tests’ scores and tauopathy-related features.
Introduction
Dementia describes a range of symptoms that can affect how the brain works and is more likely to be exhibited by people over 60 years of age [1]. It is a progressive disease, meaning that symptoms such as forgetting recent events, losing the ability to carry out some of the normal everyday functions, misplacing items in inappropriate places, poor judgment, and speech impediment worsen over time [2]. The most common form of progressive dementia is Alzheimer’s disease (AD), which causes brain cells to degenerate and eventually die [3]. There are approximately 5.8 million individuals aged 65 and older in the United States of America (USA) diagnosed with AD [3].
Early detection of dementia conditions is an important area of research that may contribute to disease intervention and healthcare service [4, 5, 6]. Typically, AD is established through cognitive tests and pathological procedures. The latter include magnetic resonance imaging (MRI), positron emission tomography (PET), and cerebrospinal fluid (CSF) from a lumbar puncture [7], while the former include the Montreal Cognitive Assessment (MoCA) and the Mini Mental State Examination (MMSE) [8, 9]. However, AD cannot be confirmed until an autopsy or biopsy is performed on the brain.
Cognitive testing involves various methods, including the cognitive criteria defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [10]. These methods often require many tests to measure the cognitive levels of individuals, and their performance varies significantly, so researchers have investigated attributes linked to genetics, neuroimaging, and biomarkers to enhance dementia detection, such as [11, 12, 13]. However, these medical procedures can be time consuming, invasive, expensive, and usually unavailable in most countries [14]. Therefore, finding features related to cognitive tests and affordable biomarkers can be a promising approach in dementia detection research. Our research falls under this category: we try to combine various dementia features to identify a few effective features that physicians can evaluate during a clinical check. Specifically, we combine cognitive scores and biomarkers to find out whether adding indicators such as tauopathy (tau) to cognitive test scores can improve the performance of the models developed by supervised learning techniques on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data subjects [3].
Furthermore, this research builds upon previous research works related to enhancing dementia classification systems aiming to identify which classification techniques are suitable for dementia detection. We thus propose a data analytics process a) to identify effective cognitive scores and pathological features, and b) to learn from these features automated systems for dementia detection by using classification techniques and attribute selection. The research questions that we aim to answer are:
How can we find a set of cognitive scores and tau biomarkers related to AD to train a classification system based on real data subjects using supervised learning in data analytics? Which classification technique is superior for predicting AD using real data subjects?
The paper is structured as follows: Section 2 reviews recent and related works with a focus on data analytics techniques. Section 3 presents the process, and Section 4 discusses the experiments and the results analysis. Finally, we conclude in Section 5.
Since we developed a data analytics process using real data from ADNI, most of the reviewed work in this section is based on empirical studies with data examples from the ADNI project (
[16] characterised clinical measures cross-sectionally and longitudinally in 819 subjects of the ADNI data repository with three possible baseline diagnoses: NC (neutral cognitive/normal cognitive), mild cognitive impairment (MCI), and AD, to understand the progression of the disease. Subjects in each of the NC, MCI, and AD groups may undergo different cognitive and biomarker tests at visits held every six months. Results based on descriptive analyses using the baseline diagnosis (class label) and visits within 12 months showed that subjects with MCI were more memory impaired than NC subjects. In addition, non-memory cognitive measures were found to be minimally impaired in those with MCI.
[17] investigated cognitive tests to predict the progression of AD in young and old patients using a subset of the ADNI data repository. The biomarkers and cognitive tests were initially examined to find if CSF- and PET-related attributes would lose prognostic value when used on patients older than 75 years of age as opposed to those of MRI. The CSF attributes included Amyloid-B, total-tau, and phospho-tau (proteins) and phosphorylated threonine. FDG-PET attributes included those related to the gyrus. The authors applied logistic regression [18] to analyse attributes against the diagnosis. They found that MRI-related attributes and neuro test scores predicted a conversion of 63–67% classification success in both the under and over 75-year-olds, whereas CSF biomarkers only attained a reasonably similar result for those under 75 years of age. FDG-PET attributes in the total sample classified 57% of the subjects correctly.
[19] investigated predicting the conversion to AD with longitudinal measures using subjects from the ADNI dataset. The authors aimed to find which clinical biomarkers are influential for tracking progression in MCI to probable AD subjects. The analysis conducted required at least one follow up visit from the baseline and included demographics, cognitive test scores, and biomarkers. Results of the Cox-Model (cross-sectional prediction model) revealed that the strongest predictor on the selected subset of subjects was the ADAS-Cog 13 score [17] followed by FAQ score, RAVLT score, and biomarker including hippocampal volume and middle temporal gyrus.
[20] contrasted the performance of classification techniques in the early stage of dementia using 324 subjects from the ADNI dataset. All the considered subjects had taken the CDR cognitive test as a follow-up at least 18–36 months from the baseline. A Support Vector Machine (SVM) developed a good predictive model to perform binary group separation using Matlab as a platform to implement the classifier. Classification used both linear and nonlinear kernels for performance comparison. Results showed that ADAS-Cog, ALVT, and FAQ cognitive tests were more frequent predictors in the models developed by the SVM classifier.
[21] studied the performance of deep learning techniques for early prediction of AD based on the MRI attributes within the ADNI dataset and the Australian Imaging Biomarkers and Lifestyle Study of Aging (AIBL) cohorts. The goal was to provide an accurate means to facilitate clinical trials within a specific period. The authors used a time-to-event model to predict progression to AD. The longitudinal data used a baseline MRI scan and data from one clinical follow up. The deep learning model achieved a concordance index (C-index) of 0.752 on 439 testing MCI subjects with a follow up between 6 and 78 months and a C-index of 0.781 on 40 AIBL testing MCI subjects with a follow up between 18 and 54 months.
[22] applied the Random Forest algorithm [23] to enhance the prediction of AD using pairwise selection from a subset of the ADNI dataset. Two demographic attributes (age and gender), three cognitive test attributes (MMSE, CDRSB, FAQ), biomarker attributes related to MRI scans, the time delay for time series classification, patient ID, and the three-class diagnosis (DX) were used for prediction. The authors used relative importance for the attributes measured by increased prediction error when the value of the variable was permuted across the out-of-bag observations. It was discovered that RID and TIME_DELAY were the most important diagnosis predictors hence the last examination scores were central. Classification results using Random Forest led to an area under curve (AUC) of 82% and accuracy of 73% compared to Support Vector Machine (SVM) at 62% with accuracy of 52%.
[24] used attribute selection methods and a Random Forest classifier to evaluate which attributes affect the forecasting of AD and CN using the ADNI dataset [15]. The authors used a dataset consisting of 425 attributes and 240 instances, divided equally into classes of mild MCI converting (cMCI), MCI, AD, and CN. The results of the Random Forest classifier showed that the Memory-related attributes captured by the MMSE cognitive test, having ApoE, hippocampal volume, and a range of nutritional variables were relevant attributes to determine AD. The result was also found to have a peak accuracy of 89% with Random Forest when using 210 attributes.
Table 1 shows the summary of the literature reviewed here. The above research studies identified that dementia detection at early phases is fundamental for designing therapy plans and interventions. However, few studies have accounted for the combination of pathological and cognitive elements to detect dementia conditions such as AD. For example, Spencer et al. [25, 26] only considered simple cognitive elements in common dementia medical tests to build models for early classification of dementia levels. The authors used descriptive analytics to measure whether the cognitive elements can detect AD levels. Furthermore, [20] evaluated just behavioural and cognitive elements and concentrated on detecting AD using models developed from these elements. Also, [5] empirically studied the association between dementia advancement and the decline in functional activities of patients from the ADNI project using machine learning techniques with feature engineering.
Summary of literature review conducted
On the other hand, [21] focused on features extracted from the brains of cases and controls using neuroimaging techniques like MRI. These images contain features that usually show areas within the brains with noticeable deposition of certain proteins such as tauopathy and amyloid-beta related ones and are used as dementia indicators. The neuroimaging techniques may provide useful information on the progress of the disease, yet they are costly and require trained clinicians and expensive equipment to perform, which are rarely available in most third world countries.
In this research, we aim to integrate cognitive, neuroimaging, and pathological elements to identify whether such a large set of dementia features can lead to more accurate classification of dementia levels. We try to categorize the related features and then look for their impact on the disease in the early phases. We also test whether the cognitive elements of the chosen medical assessment method can be linked to the cognitive areas in the DSM-5 for degenerative diseases, so that medical professionals understand the significant elements that are associated with dementia diseases like AD. Lastly, this research builds on previous research works on how data-driven approaches such as computational data analytics can be beneficial and can provide effective systems for dementia detection that are usually more accurate than traditional medical procedures.
The process used in this research is depicted in Fig. 1. Initially, the data subjects are obtained from the ADNI-Merge data repository [15]. The dataset is longitudinal, which means that each patient has a clinical visit every six months during which new cognitive test scores are recorded. However, since we sought attributes that differentiated between the different possible class labels in the dataset, we only considered the baseline class label (first visit class label in the dataset). Therefore, the raw ADNI-Merge dataset has been minimised in the pre-processing stage, as we are limited to demographics, cognitive tests scores, and CSF-related biomarkers such as tauopathy attributes. The reason for limiting biomarkers to tauopathy-related attributes is cost effectiveness.
Process of the research work.
Using an attribute selection process, we assessed the attributes’ ability to predict the medical diagnosis class. The attribute selection methods used in our research to find dementia indicators are Correlation attribute subset evaluation (CFS) [27], Leave One out Cross Validation (LOO-CV) [28], and Relief-F [29]. We evaluated common dementia symptoms that appeared in the subsets chosen by the attribute selection methods, in addition to evaluating each attribute’s significance separately. This evaluation phase produced a few cognitive attributes and biomarkers to offer as input to the classification learning algorithms to build predictive dementia models.
Distinctive classification algorithms have been exploited to measure the effectiveness of the attributes reported in the dementia attributes evaluation phase. These are based on various learning mechanisms including decision trees, probability, support vector machine (SVM), fuzzy rules, and statistical regression. The grounds for using the classification algorithms are:
To obtain inclusive results from which conclusions on predictive performance can be drawn. To evaluate diverse learning mechanisms and their effect in predicting AD. Such a comparison has not previously been done in a single research study concerning AD prediction.
The algorithms used were C4.5, Random Forest, Ripple Down Rule Learner (RIDOR), Repeated Incremental Pruning to Produce Error Reduction (RIPPER), PART, Fuzzy Unordered Rule Induction Algorithm (FURIA), Naïve Bayes, logistic regression, and Sequential Minimal Optimization (SMO) [30, 23, 31, 32, 33, 34, 35, 18, 36].
Average age for the possible diagnosis values.
Gender vs diagnosis.
(a) Average cognitive score vs diagnosis. (b) Standard deviation of cognitive score per diagnosis.
The distinct attribute sets used for training the models by the classification techniques were chosen based on an in-depth analysis of the attributes. The focus was on high-scoring attributes having good correlation with the class. For instance, we evaluated dementia attributes with high ranking. In measuring the systems developed from the distinct attribute sets, we used evaluation methods related to classification, including precision, recall, predictive accuracy, and others.
The dataset used in this research is ADNI-Merge, consisting of over 2,000 patients with a total of 14,628 instances and more than 110 attributes [15]. Each patient is associated with multiple instances, each corresponding to a clinical examination visit. Different types of attributes including demographics, cognitive, and CSF have been used. Examples of the cognitive tests used include ADAS-Cog, CDRSB, MMSE, FAQ, and MoCA. CSF attributes (TAU, PTAU, ABETA and APOE4) are extracted using a lumbar puncture, a procedure taking 20–45 minutes and performed by inserting a needle into the spine [37]. The cognitive tests whose total scores have been used in the experimental analysis are explained in the next section.
The dataset contains 818 control (CN), 1026 MCI, and 396 demented instances, excluding instances with a missing baseline diagnosis. We extracted 2,261 data instances for the empirical analyses to answer the research questions (see Table 5). Data subjects with missing target class values were ignored.
Figure 3 displays the distribution between gender versus the diagnostic class, and DX which consists of three possible values: cognitively normal (CN), mild cognitive impairment (MCI), and dementia/AD. The data seems to be balanced with respect to the class label. We added ‘ethnicity’ as an extra attribute with possible values including Indian, Asian, Black, European, and others, to further categorise gender. Most of the data subjects relate to European ethnicity. Figure 3 displays the average age (from male and female) corresponding to each diagnosis. The average rounded ages for CN
Figures 4A and B display the diagnosis against the average and standard deviation (SD) scores of each cognitive test respectively including ADAS13, MMSE, CDRSB, and FAQ. MMSE is showcased differently as scores of that test are inverted (high
In this section we discuss the main dementia cognitive tests that are used in the experimental analysis and focus on how the total score attribute is computed and recorded by medical professionals. The reason for discussing these tests is that we are evaluating combinations of cognitive elements that are highly correlated with the diagnostic decision related to dementia.
The Mini Mental State Examination (MMSE), developed by [9], is a widely used diagnosis tool formulated to determine the presence of cognitive impairment in the elderly. The conventional cut-off score is 24; any score less than 24 represents a certain level of cognitive impairment. The questions are designed to evaluate various domains including orientation to time, orientation to place, registration, attention and calculation, recall, language, and visual construction. Although the MMSE is easy to run and is popular for clinically assessing cognitive impairment, it has a few limitations: it does not test long-delay recall and does not take into account executive function and spatial recall. Additionally, age and education influence the MMSE score, but these factors are not considered when assigning the score to the patient. According to a meta-analysis of 34 studies, the MMSE showed a pooled sensitivity of 79.8% and specificity of 81.3% at a cut-off score of 24.
For the purpose of staging dementia severity, researchers developed a rating scale called the Clinical Dementia Rating Scale Sum of Boxes (CDR-SB), which is a more detailed measure than the global score of the Clinical Dementia Rating (CDR) [38]. The CDR-SB is effective in providing information for mild dementia patients. There are several advantages in using this scale over the global CDR score: it is easier to calculate, it doesn’t require an algorithm, and the score can be treated as interval rather than ordinal data. The main benefit of the CDR-SB in staging dementia severity is that it has higher precision in tracking variations across time. The score is calculated from six domains: memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care. Each domain is rated on a 5-point scale, except personal care, which is rated on a 4-point scale, as shown in Table 2A. The CDR-SB tallies the scores from all domains, giving a range of 0 to 18 points. The CDR-SB delivered a pooled sensitivity of 84% and specificity of 94% across 15 different studies [39].
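The sum-of-boxes tally described above can be sketched in a few lines of Python. This is only an illustration: the box values (0, 0.5, 1, 2, 3, with no 0.5 rating for personal care) are assumed from the standard CDR form rather than taken from Table 2A, and the function and domain names are hypothetical.

```python
# Assumed box values from the standard CDR form; Table 2A is the authoritative source.
FIVE_POINT_SCALE = (0, 0.5, 1, 2, 3)
FOUR_POINT_SCALE = (0, 1, 2, 3)  # personal care: assumed to have no 0.5 rating

# Hypothetical identifiers for the six domains named in the text.
CDR_DOMAINS = (
    "memory",
    "orientation",
    "judgment_and_problem_solving",
    "community_affairs",
    "home_and_hobbies",
    "personal_care",
)


def cdr_sum_of_boxes(ratings):
    """Return the CDR-SB total (0 to 18) for a dict mapping domain -> box score."""
    missing = set(CDR_DOMAINS) - set(ratings)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    total = 0.0
    for domain in CDR_DOMAINS:
        allowed = FOUR_POINT_SCALE if domain == "personal_care" else FIVE_POINT_SCALE
        if ratings[domain] not in allowed:
            raise ValueError(f"invalid score {ratings[domain]!r} for {domain}")
        total += ratings[domain]
    return total


if __name__ == "__main__":
    example = {
        "memory": 1,
        "orientation": 0.5,
        "judgment_and_problem_solving": 0.5,
        "community_affairs": 0.5,
        "home_and_hobbies": 0.5,
        "personal_care": 0,
    }
    print(cdr_sum_of_boxes(example))  # 3.0
```

Because the total is a plain sum, no lookup algorithm is needed, which is the ease-of-calculation advantage over the global CDR score noted in the text.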
Point scale of CDR-SB’s items
[8] developed a diagnosis tool called the MoCA to detect MCI with the objective of helping health professionals to check for early signs of dementia. The MoCA is a 30-point test to gather information on short-term memory, visuospatial abilities, executive function, attention and concentration, language skills, orientation to time, and orientation to place. At a cut-off score of 26, the MoCA delivers a sensitivity of 90% and specificity of 87%. The major benefits of this tool are that it is easy to administer, has good accuracy, tests vital cognitive domains, and can be administered in just 10 minutes. The test includes various activities to evaluate different domains. For short-term memory, the test taker is told five words and asked to repeat them; this process is repeated a few times and later, after other tasks have been completed, the test taker is asked to recall the words again. For attention, the person is asked to recite a number forwards and a different number backwards; for language, they are asked to repeat two sentences and list all words starting with the letter ‘F’. The MoCA also includes a ‘Clock Drawing’ test with the time set at 11:10.
[40] proposed a scale for older adults to check their functional capacity for living independently. They developed the Functional Activities Questionnaire (FAQ), based on the Instrumental Activities of Daily Living (IADL) scale [41], in which daily routine tasks for a self-sufficient older person were utilised. This questionnaire has a total of 10 questions and each question is scored from 0 to 3 points, where a score of 3 means that the participant is fully dependent on other individuals and 0 means that the participant is self-reliant. Table 2B shows the scoring system for the FAQ test questions. From the total score of the questionnaire, a cut-off score of 9 determines the impairment level in the participant. One advantage of the FAQ test is that it doesn’t require assistance from a health professional and can be completed by any relative or informant. It delivers a sensitivity of 85% and a specificity of 81%, whereas the IADL shows a lower sensitivity of 57% and a higher specificity of 92%.
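As a small illustration of the FAQ scoring just described, the sketch below totals ten item scores and applies the cut-off of 9. Treating a total at or above the cut-off as impaired is an assumption of this sketch, and the function names are hypothetical.

```python
FAQ_ITEM_COUNT = 10           # ten questions, as described in the text
FAQ_ITEM_SCORES = (0, 1, 2, 3)  # 0 = self-reliant, 3 = fully dependent
FAQ_CUTOFF = 9                # cut-off score quoted in the text


def faq_total(item_scores):
    """Sum the ten FAQ item scores (possible range 0 to 30)."""
    if len(item_scores) != FAQ_ITEM_COUNT:
        raise ValueError(f"expected {FAQ_ITEM_COUNT} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in FAQ_ITEM_SCORES:
            raise ValueError(f"invalid item score: {score!r}")
    return sum(item_scores)


def faq_flags_impairment(item_scores, cutoff=FAQ_CUTOFF):
    """Assumption: a total at or above the cut-off is flagged as impaired."""
    return faq_total(item_scores) >= cutoff


if __name__ == "__main__":
    answers = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
    print(faq_total(answers), faq_flags_impairment(answers))  # 9 True
```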
Scoring system for the FAQ test
[42] developed the Alzheimer’s Disease Assessment Scale, a scale to determine the cognitive and non-cognitive severity in the behaviour of an Alzheimer’s patient. This test takes about 45 minutes to administer and initially covered a total of 11 domains, including memory, language, praxis, and orientation. The maximum score for a patient is 70 points, of which 48 are for the first 9 items and 22 for word recall and recognition. A higher score indicates more severe cognitive dysfunction in an AD patient. A modified version of this scale, ADAS-Cog 13, was created with two additional items, number cancellation and delayed free recall, and the total possible score in this scale is 85. The most efficient cut-off score is 17, which shows a sensitivity of 90.09% and specificity of 85.88%.
Table 3 shows the description of the attributes used in this study and their data type based on ADNI data dictionary description.
Attributes used with brief description
Experimental settings
For all experiments, the Waikato Environment for Knowledge Analysis (Weka version 3.8), an open-source data mining software package, was used [28] with default hyperparameter settings for all attribute evaluation and classification methods. Ten-fold cross validation [43] was used in all experiments related to building the classification models to ensure that the models developed did not overfit the training dataset. Lastly, all experiments were run on a personal computer with a 2.7 GHz processing unit and 8 GB of random access memory.
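To make the ten-fold protocol concrete, the following sketch partitions instance indices into ten folds and yields one train/test split per fold. It is not Weka's implementation: the shuffling, the fixed seed, and the absence of class stratification are simplifying assumptions of this illustration, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Plain (non-stratified) partitioning; the seed is an arbitrary choice.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # 2,261 is the number of extracted instances reported in the text.
    for fold_number, (train, test) in enumerate(k_fold_indices(2261), start=1):
        print(fold_number, len(train), len(test))
```

Each instance appears in exactly one test fold, so every model is evaluated on data it was not trained on, which is how the procedure guards against overfitting.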
Contingency table for a binary classification problem
Contingency table for a binary classification problem
To evaluate the quality of the attributes chosen we used known performance metrics (Eqs (1)–(4)) from the confusion matrix that are related to binary classification problems as shown in Table 4. The metrics are calculated using the rates shown in the table and can be explained as follows:
TP
Accuracy: The proportion of correctly classified instances out of the total instances.
Precision (P): The proportion of correctly predicted positive instances out of the total predicted positive instances.
Recall (R): The proportion of correctly predicted positive instances to all instances in the actual class-yes. This is also known as true positive rate.
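The three metrics defined above can be computed directly from the four cells of a binary contingency table. The sketch below uses the conventional cell names (TP, FP, TN, FN); the function name and the example counts are illustrative and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Illustrative counts only.
    print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```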
Attribute selection
The proposed data analytics process evaluates the attributes related to the cognition and biomarkers of patients and controls using three methods: Relief-F, CFS, and LOO-CV [29, 27, 28]. These methods were chosen because they adopt diverse approaches in assessing attribute-class correlations, so a wider perspective can be obtained on the relationships between the attributes and the class. We try to have fewer but different attributes.
Relief-F is a method that calculates a score for each attribute with respect to the class label using the neighbouring instances in the training dataset [29]. The original algorithm draws a number of training observations, set by the end-user, from the training dataset to calculate the correlation of the attributes and the class according to Eq. (4).
Where,
The CFS method evaluates not only attributes and class relationships, but also considers minimising attribute-attribute relationships (Hall, 1999). The quality of the correlation is outlined mathematically using Eqs (5) and (6).
Where
The correlation between
Where
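For reference, the subset merit heuristic of CFS is commonly written, following Hall's formulation, in the form sketched here. The symbols are chosen for this sketch and may not match the notation of Eqs (5) and (6).

```latex
% Merit of an attribute subset S containing k attributes (standard CFS heuristic).
% \overline{r_{cf}}: mean attribute-class correlation over the attributes in S
% \overline{r_{ff}}: mean attribute-attribute inter-correlation within S
\[
  \mathrm{Merit}_S \;=\; \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}
\]
```

The numerator rewards subsets whose attributes correlate with the class, while the denominator penalises redundancy among the attributes themselves, which matches the description of CFS given above.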
Finally, the LOO-CV method shows the worth of an attribute when that attribute is dropped from the dataset. In other words, it measures how much accuracy is gained or lost when a certain attribute is dropped from the training dataset using a classifier. We used the Naïve Bayes classifier in the experiments of LOO-CV.
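The drop-one-attribute idea can be sketched as follows. To keep the example self-contained, a simple nearest-centroid classifier scored on its own training data stands in for the Naïve Bayes classifier and cross validation used in the experiments; the classifier, the function names, and the toy data are all assumptions of this sketch.

```python
from statistics import mean


def centroid_accuracy(rows, labels, columns):
    """Training-set accuracy of a nearest-centroid classifier on the chosen columns.

    Stand-in for the real classifier; not the Naive Bayes model used in the paper.
    """
    classes = sorted(set(labels))
    centroids = {
        c: [mean(r[j] for r, y in zip(rows, labels) if y == c) for j in columns]
        for c in classes
    }

    def predict(row):
        return min(
            classes,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(columns, centroids[c])),
        )

    hits = sum(predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def attribute_drop_report(rows, labels, names):
    """Return baseline accuracy and the accuracy lost when each attribute is dropped."""
    all_columns = list(range(len(names)))
    baseline = centroid_accuracy(rows, labels, all_columns)
    report = {}
    for j, name in enumerate(names):
        kept = [c for c in all_columns if c != j]
        report[name] = baseline - centroid_accuracy(rows, labels, kept)
    return baseline, report


if __name__ == "__main__":
    # Toy data: the first column separates the classes, the second does not.
    names = ["CDRSB", "noise"]
    rows = [[0.0, 1.0], [0.5, 2.0], [5.0, 1.0], [6.0, 2.0]]
    labels = ["CN", "CN", "AD", "AD"]
    print(attribute_drop_report(rows, labels, names))
```

A large positive value in the report means that accuracy falls sharply when the attribute is removed, which is the signal used to rank attributes in this method.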
C4.5 is a decision tree algorithm that builds tree models for classification problems [30]. Usually, C4.5 uses entropy to decide which attribute is the best to grow the tree. Once the tree is constructed, C4.5 prunes the tree to reduce overfitting. Random Forest is another decision tree algorithm which develops several tree models to allocate the target class of test data based on a collective decision (majority class in the models developed) [44]. PART is a hybrid algorithm that produces models with rules using decision trees [32]. FURIA is a rule induction algorithm that learns fuzzy rules instead of conventional rules and unordered rule sets instead of rule lists [33].
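The entropy-based choice of a splitting attribute can be illustrated with a short sketch. It computes class entropy and the information gain of a nominal attribute; C4.5 itself normalises this gain into a gain ratio and also handles numeric attributes, which this simplified example omits. The function names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on a nominal attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / n * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


if __name__ == "__main__":
    labels = ["AD", "AD", "CN", "CN"]
    informative = ["high", "high", "low", "low"]
    uninformative = ["a", "b", "a", "b"]
    print(information_gain(informative, labels))    # 1.0
    print(information_gain(uninformative, labels))  # 0.0
```

The attribute with the largest gain yields the purest child nodes, so a tree learner of this kind would place it nearest the root.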
RIDOR is a rule-based classifier, which learns rules with exceptions to construct classification models (Gaines & Compton, 1995). RIPPER is a propositional rule learner that builds a set of rules for each class in the training dataset while minimising errors [31]. RIPPER uses an optimisation procedure and multiple pruning methods to reduce rules redundancy in the final classification system.
Naïve Bayes is a probabilistic algorithm based on Bayes’ Theorem that holds an assumption of independence among attributes [35]. Naïve Bayes assumes that the presence of an attribute in a class is unrelated to the presence or identity of any other attribute. SMO is an SVM algorithm that separates the dataset based on the available class labels [36]. The logistic regression (LG) algorithm finds the relationships between the independent variables and the dependent variable and then models these relationships using a function [18].
We have selected these classification algorithms for the following reasons:
These algorithms have been used in medical research including dementia detection. These algorithms produce different outcomes. These algorithms use different training methods; for instance, Ridor is a rule induction algorithm, and Naïve Bayes uses probability theorem to classify test data. On the other hand, decision tree algorithm uses information theory-based measures to build trees for classification.
The aim of the attribute evaluation experiments is to ensure that we have few attributes that can help in constructing diagnostic systems for AD prediction. We included cognitive and biomarker attributes from the ADNI-Merge dataset in the attribute evaluation experiments. Table 5 depicts the attributes that caused a significant decrease in accuracy when dropped from the dataset using LOO-CV with Naïve Bayes, RIPPER, and PART classifiers. The results of the LOO-CV imply that the CDRSB and MMSE cognitive tests have a strong relationship with detecting AD, as they have appeared consistently in the NB, RIPPER, and PART models.
LOO-CV results
The CDRSB cognitive score has been identified as the primary cognitive attribute that can be used by clinicians for dementia diagnosis. This finding supports the fact that this test was used by ADNI clinicians to help assign class in the ADNI dataset. If CDRSB is dropped from the dataset the accuracy of RIPPER, NB, and PART when used in LOO-CV is reduced by 13%, 8%, and 14%, respectively. On average the accuracy of the classification models drops by 7% when CDRSB and MMSE scores are absent from the ADNI-Merge dataset.
The impact of PTAU and APOE4, based on the results obtained from LOO-CV, is less than that of the cognitive scores, at least on the considered dataset. These results indicate that tauopathy-related indicators are possibly not primary indicators for dementia detection; PTAU revealed minimal effect on the models of the LOO-CV experiments. In addition, the whole brain volume (WBV) has only appeared to be significant when dropped by RIPPER classification models. WBV normally decreases as the patient’s dementia progresses and more areas of the brain, such as the amygdala and hippocampus, become affected [45]. Lastly, the APOE4 protein seems to have some correlation with dementia, which supports other research studies; APOE4 attracts amyloid beta accumulation (A
The attributes selected by the CFS and Relief-F methods from the cognitive and Tau-related indicators (47 attributes) of the ADNI merge dataset exhibit that the CDRSB, MMSE, and other cognitive scores such as RAVLT and ECog are highly significant (see Table 6). Interestingly, APOE4 is the only biomarker related to the A
Top attributes selected by Relief-F and CFS methods
Description of the attribute sets
To disclose the significance of the attributes, we assessed various distinctive attribute sets, chosen based on the results of the attribute evaluation, using classification algorithms. We examined dementia classification models using classification algorithms against distinctive attribute sets as described in Table 7. In all experimental runs, the classification models were trained with ten-fold cross validation to avoid overfitting. Initially we assessed cognitive scores alongside demographics, as shown in Table 8, recording the accuracy, precision, and recall of the classification models built against cognitive attribute subsets.
Performance of the classification algorithms on the attributes’ subsets of the ADNI data
Compared with other algorithms, C4.5 performed better under all performance metrics with each attribute set. More notably, in contrast with other combinations, attribute set 8 with CDRSB score and 9 with MMSE score yielded promising results for this paper’s research question. Results analysis indicated that logistic regression when processing 8 attributes, and the C4.5 algorithms when processing 9 attributes outperform other classifiers with an accuracy of 90.13% and 92.28%, respectively, while also achieving high precision and recall rates.
It is notable that the results obtained by the classification algorithms against a set of 12 attributes yield the highest accuracy of 92.77% given cognitive scores. Nevertheless, the impact on accuracy is not that significant, i.e. the increase is less than 0.5% compared with models developed by the same algorithms from the 9-attribute set. Hence, it can be said that CDRSB and MMSE scores are the dominant attributes for AD diagnosis, at least from the neuropsychological perspective and using the ADNI-Merge data subjects.
Overall performance results in Table 8 uncovered that FURIA’s models against the 11-attribute set outperform models developed by the other classification algorithms achieving an accuracy of 92.77%, with class precision and recall at 93% and 92.8%, respectively. This result also indicates that APOE4 and P-tau biomarkers from a lumbar puncture are influential attributes that can contribute to improving models developed by the classification algorithms – they are also cost effective.
Dementia is a progressive brain condition that affects many people, particularly the elderly, and early detection is crucial for early intervention. AI techniques such as supervised learning in data analytics can offer quality solutions for dementia detection since they provide specialist physicians with predictive systems that supply additional knowledge. The aim of our research was to identify the best combination of cognitive test scores and biomarker indicators that can lead to an effective classification model using real data subjects. This was achieved by proposing a data analytics approach that evaluates cognitive test scores and inexpensive biomarkers to build predictive models using attribute evaluation and classification.
The experimental process included three attribute selection methods and nine classification algorithms, which were applied to a substantial number of data observations obtained from the ADNI-Merge dataset. The results revealed that the CDRSB and MMSE cognitive test scores, together with biomarker indicators including APOE4, P-tau, and whole brain, are influential in predicting AD. The models developed by the classification algorithms on the subsets of cognitive test scores and biomarkers obtained from the attribute evaluation phase showed that rule-based algorithms were superior to the other algorithms. In particular, FURIA developed models from just an 11-attribute subset (two tauopathy indicators, CDRSB and MMSE test scores, and demographics) with an accuracy of 92.77%, precision of 93%, and recall of 92.8%.
The study is limited to the ADNI-Merge dataset, and the experimental analysis is limited to the attribute selection and classification methods considered. We will soon expand the work to investigate disease progression across the critical stages (CN to MCI and MCI to light dementia). A further limitation of this study is that the individual cognitive items within the considered cognitive tests (MMSE and CDRSB) were not evaluated. In the future, we would like to extract cognitive items from the ADNI project to find which of them have the strongest associations with dementia, and with which dementia category.
Footnotes
Acknowledgments
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
Conflict of interest
None to report.
