Abstract
Background:
The number of patients with Alzheimer’s disease is increasing rapidly every year. Scholars often use computer vision and machine learning methods to develop an automatic diagnosis system.
Objective:
In this study, we developed a novel machine learning system that can make diagnoses automatically from brain magnetic resonance images.
Methods:
First, the brain images were preprocessed, including skull stripping and spatial normalization. Second, one axial slice was selected from each volumetric image, and stationary wavelet entropy (SWE) was used to extract texture features. Third, a single-hidden-layer neural network was used as the classifier. Finally, a predator-prey particle swarm optimization was proposed to train the weights and biases of the classifier.
Results:
Our method used 4-level decomposition and yielded 13 SWE features. The classification yielded an overall accuracy of 92.73±1.03%, a sensitivity of 92.69±1.29%, and a specificity of 92.78±1.51%. The area under the curve was 0.95±0.02. Additionally, the method took only 0.88 s to identify a subject in the online stage, once its volumetric image had been preprocessed.
Conclusion:
In terms of classification performance, our method outperformed 10 state-of-the-art approaches as well as human observers. Therefore, the proposed method is effective in the detection of Alzheimer’s disease.
INTRODUCTION
Alzheimer’s disease (AD) is an irreversible and progressive brain disorder [1–3]. It is currently ranked as the 6th leading cause of death in the United States. The disease is named after Dr. Alois Alzheimer, who noticed plaques, tangles, and the loss of neural connections in the brain of a woman who had died. In the preclinical stage, patients are symptom-free, but abnormal protein deposits form tau tangles [4] and amyloid plaques [5] throughout the brain. This causes healthy neurons to eventually stop functioning, and the brain begins to shrink. In the final stage of the disease, the whole brain shrinks dramatically and becomes dysfunctional.
Various research teams have proposed detection methods based on magnetic resonance imaging (MRI). Plant et al. [6] employed brain region cluster (BRC). They tested voting feature interval (VFI) and support vector machine (SVM). Zhang et al. [7] combined kernel support vector machine (KSVM) and decision tree (DT). Zhang et al. [8] proposed eigenbrain (EB). They used a radial basis function support vector machine (RBF-SVM). Phillips [9] expanded the EB to three dimensions (3D-EB) and utilized the RBF-SVM method. Savio and Grana [10] used geodesic anisotropy (GDA) and Bhattacharyya distance (BD) as features. They employed SVM as the classifier. Wang [11] used a multilayer perceptron (MLP) and introduced a novel algorithm, biogeography-based optimization (BBO), to train the MLP. Zhang [12] proposed a new displacement field (DIF) feature. They tested both SVM and generalized eigenvalue proximal SVM (GEPSVM) methods. Gray et al. [13] utilized voxel-based morphometry (VBM) and random forest (RF). Du [14] presented a method that combined pseudo Zernike moment (PZM) and linear regression classifier (LRC).
The methods described above achieved satisfactory detection results. However, they also have several problems: 1) Their features did not capture the texture information of brain tissues. 2) Their classifiers are not stable, i.e., the classification performance may fluctuate among different runs.
To solve the first problem, we considered the discrete wavelet transform. A wavelet is a brief, wave-like oscillation, and wavelet analysis has been widely used in signal processing fields, such as fingerprint recognition [15], facial recognition [16], dendrite spine detection [17], emotion recognition [18], fruit classification [19], and tea classification [20]. Wavelets have also been used in detecting diseases of the brain, such as Parkinson’s disease [21], sensorineural hearing loss [22, 23], and neuromuscular disease [24].
The discrete wavelet transform is translation-variant. Because of this property, the extracted features may change even if the brain image is translated by only one or two pixels. Hence, we used the stationary wavelet transform (SWT) in this study. Additionally, entropy was computed from the SWT coefficients in order to reduce the number of features.
To solve the second problem, we introduced a bioinspired algorithm, particle swarm optimization (PSO) [25]. PSO iteratively improves candidate solutions by mimicking bird flocks and fish schools. The stability [26] and convergence [27] of PSO have been shown to be superior to those of traditional gradient-descent algorithms. PSO is currently a hot topic in the field of computational mathematics and has been applied in crop classification [28], circuit design [29], path planning [30, 31], job scheduling [32], spam detection [33], protein-ligand docking [34], and structuring element decomposition [35].
The predator-prey (PP) model mimics the behavior of sardines and killer whales. Adding the PP model to PSO can improve its optimization performance. Hence, PP-PSO is expected to give better stability than PSO alone.
Our contribution in this study involves three points: 1) We proposed the stationary wavelet entropy to extract the texture information of brain images. 2) We proposed the use of PP-PSO to increase algorithm stability. 3) In terms of accuracy, our proposed system is better than 10 state-of-the-art approaches.
MATERIALS AND METHODS
Materials and subjects
The brain imaging data come from two sources. The first source was downloaded from the “Open Access Series of Imaging Studies (OASIS)” [36]. We selected 126 subjects after removing those with missing records. The 126 subjects comprise 28 AD patients and 98 healthy control (HC) subjects. Their demographic status is provided in reference [37]. The second source is local hospitals (Affiliated Nanjing Brain Hospital of Nanjing Medical University, Children’s Hospital of Nanjing Medical University, and Zhong-Da Hospital of Southeast University). We enrolled 70 AD subjects through community advertisements. The exclusion criteria for all participants were known neurological or psychiatric diseases, brain lesions such as tumors or strokes, taking psychotropic medications, and contraindications to MR imaging. This study was approved by the Ethics Committees of those hospitals. A signed informed consent form was obtained from every subject prior to entering this study. Scanning was performed on a Siemens Verio Tim 3.0T MR scanner (Siemens Medical Solutions, Erlangen, Germany). All subjects were instructed to lie as still as possible with their eyes closed and not to fall asleep. In total, 176 sagittal slices covering the whole brain were acquired using an MP-RAGE sequence. The imaging parameters were: TE = 2.48 ms, TR = 1900 ms, TI = 900 ms, FA = 9°, FOV = 256 mm×256 mm, matrix = 256×256, slice thickness = 1 mm.
All the images from the online OASIS dataset and the local hospitals were combined; their demographics are given in Table 1. After combining both datasets, there were 98 AD subjects and 98 HCs. In total, a 196-image dataset was created.
Demographics of two sources
AD, Alzheimer’s disease; HC, healthy controls; M, male; F, female; MMSE, Mini-Mental State Exam; CDR, Clinical Dementia Rating.
All the images were preprocessed via FMRIB Software Library (FSL) v5.0. The brain extraction tool was utilized to extract brain areas. FLIRT and FNIRT were used for spatial normalization. Smoothing was implemented by a Gaussian kernel.
Our proposed system consisted of four main steps: 1) preprocessing, which included brain extraction, spatial normalization, smoothing, slice selection, and histogram stretching; 2) stationary wavelet entropy for feature extraction; 3) a single-hidden-layer neural network as the classifier; 4) predator-prey particle swarm optimization as the training algorithm. The pipeline of our method is illustrated in Fig. 1.

Pipeline of our method.
Slice selection
Only the most distinguishing slice along the axial direction was selected for each 3D brain. The selection criterion was that the slice include the hippocampus, which is believed to shrink in AD patients. The selected slice was the axial slice at Z = –22 mm in MNI space. Figure 2 shows three slices: an HC subject from OASIS, an AD subject from OASIS, and an AD subject from local hospitals, respectively. Two points were observed: 1) The AD subjects had a smaller hippocampus than the HC subjects; 2) The images obtained in local hospitals were slightly darker than those from the OASIS dataset.

Illustration of samples. HC, healthy control; AD, Alzheimer’s disease.
Histogram stretching
Because the images came from two different sources, histogram stretching (HS) was used to fulfill two objectives: 1) to increase the dynamic range of all brain images; and 2) to remove the effect of having different image sources. HS transformed the original image c into a new image d as:
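The transformation itself is not reproduced in this text. As a minimal sketch, assuming the common min-max form of histogram stretching, d = (c − c_min)/(c_max − c_min) rescaled to an output range, the step could look as follows. The function name `histogram_stretch`, the output range [0, 255], and the sample gray levels are illustrative assumptions, not values from the study.

```python
def histogram_stretch(image, out_min=0.0, out_max=255.0):
    """Linearly map the gray levels of `image` (a list of rows) onto [out_min, out_max].

    Assumed min-max form; the exact formula used in the study is not shown in the text.
    """
    flat = [v for row in image for v in row]
    c_min, c_max = min(flat), max(flat)
    if c_max == c_min:
        # A constant image has no dynamic range to stretch.
        return [[out_min for _ in row] for row in image]
    scale = (out_max - out_min) / (c_max - c_min)
    return [[out_min + (v - c_min) * scale for v in row] for row in image]


dark_slice = [[10, 20], [30, 60]]          # hypothetical low-contrast gray levels
stretched = histogram_stretch(dark_slice)  # darkest pixel -> 0, brightest -> 255
```

Because the mapping depends only on each image's own minimum and maximum, applying it to every slice puts images from both sources on the same gray-level scale.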
Stationary wavelet entropy (SWE)
SWE is a novel feature extraction method. It combines the stationary wavelet transform and information entropy [38]. In recent years, SWE has been reported to give excellent results in gene expression [39], multiple sclerosis detection [40], hearing loss detection [41], etc.
There are three reasonable assumptions of why SWE was chosen to extract the features from the brain slices in this study. First, the textures of brain tissues are similar to those of a fingerprint image, and the wavelet based fingerprint identification has achieved a remarkable success in both academic and commercial fields [42]. Second, the structure of the brain corresponds to gray level fluctuation in brain images, and wavelet transform is known to capture the rapid signal change (high-frequency change) in an efficient way. Third, the hippocampus shrinking rearranged the brain image gray-levels, which eventually changes the order/disorder degree, measured by entropy [43].
The input image I was used to perform a one-level SWT decomposition via a low-pass filter (l) and a high-pass filter (h), which generated four subbands: LL1, LH1, HL1, and HH1.
Here, $*_r$ represents row-wise (i.e., horizontal) filtering and $*_c$ represents column-wise (i.e., vertical) filtering. The LL1 subband was sent to perform another one-level SWT decomposition, which generated four new subbands: LL2, LH2, HL2, and HH2.
The decomposition iterated until desired level, as shown in Fig. 3.

Two-dimensional stationary wavelet transform (j is an arbitrary integer).
For a k-level decomposition, the SWT generated (3k+1) subbands. Each subband X was regarded as a random variable with possible values $\{X_1, X_2, \ldots, X_n\}$. The probability mass function, $P_i$, was estimated by an image histogram algorithm. Finally, the information entropy E(X) was computed as
Thus, SWE outputs a set of (3k+1) features. Compared with traditional wavelet entropy (WE) [44, 45] and biorthogonal wavelet entropy [18], SWE has the clear advantage that the extracted features are “translation invariant”. This means that the SWE remains unchanged even if the brain image is shifted by motion during MRI scanning, or the spatially normalized brain is not strictly registered to the atlas.
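The SWE procedure described above can be illustrated with a small, self-contained sketch. The text states neither the wavelet family nor the entropy formula, so everything specific here is an assumption for illustration: Haar filters, periodic boundary extension, a 256-bin histogram, and the Shannon form E(X) = −Σ P_i log2 P_i. The subband naming (LH as low-pass on rows, high-pass on columns) is also only one convention.

```python
import math


def _filter_1d(signal, step, high_pass):
    """Undecimated Haar filtering with periodic extension; `step` is the tap spacing."""
    n = len(signal)
    sign = -1.0 if high_pass else 1.0
    return [(signal[i] + sign * signal[(i + step) % n]) / 2.0 for i in range(n)]


def _filter_rows(image, step, high_pass):
    return [_filter_1d(row, step, high_pass) for row in image]


def _filter_cols(image, step, high_pass):
    columns = [_filter_1d(list(col), step, high_pass) for col in zip(*image)]
    return [list(row) for row in zip(*columns)]


def swt2_level(image, step):
    """One SWT level: returns LL, LH, HL, HH, each the same size as `image`."""
    low_r = _filter_rows(image, step, high_pass=False)
    high_r = _filter_rows(image, step, high_pass=True)
    return (_filter_cols(low_r, step, False), _filter_cols(low_r, step, True),
            _filter_cols(high_r, step, False), _filter_cols(high_r, step, True))


def entropy(subband, bins=256):
    """Shannon entropy (bits) of histogram-estimated probabilities (assumed form)."""
    values = [v for row in subband for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / (hi - lo) * bins), bins - 1)] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def swe_features(image, levels):
    """Entropies of the 3*levels detail subbands plus the final approximation."""
    features, approx = [], image
    for j in range(levels):
        # The filter taps are spread by 2**j instead of downsampling the image.
        approx, lh, hl, hh = swt2_level(approx, step=2 ** j)
        features += [entropy(lh), entropy(hl), entropy(hh)]
    return features + [entropy(approx)]


# Synthetic 16x16 test image (hypothetical data) and a copy shifted by two pixels.
image = [[(7 * r + 3 * c * c) % 32 for c in range(16)] for r in range(16)]
shifted = [row[-2:] + row[:-2] for row in image]
features = swe_features(image, levels=4)        # 3*4 + 1 = 13 features
features_shifted = swe_features(shifted, levels=4)
```

With periodic extension, a circular shift of the image only permutes the coefficients inside each subband, so the histograms, and therefore the entropies, are unchanged. This is the translation invariance that a decimated wavelet transform does not provide.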
Single-hidden-layer neural network
The single-hidden-layer neural network (SNN) is described in detail in many textbooks. The SNN is a particular feedforward neural network, and the universal approximation theorem [46] guarantees that it can approximate any continuous function to arbitrary accuracy, given enough hidden neurons. Structurally, the SNN contains three fully connected layers: the input layer, the hidden layer, and the output layer.
Figure 4 shows the structure of the SNN. The features extracted from the brain images were sent to the input layer; hence, the number of input neurons equals the number of features. The number of hidden-layer neurons was determined by grid search. The output neuron corresponds to the class category, and the argmax function chose the class with the highest output. Training followed the back-propagation scheme. First, the weights were initialized randomly and the error was evaluated. Then, the error information was propagated backwards through the network in order to update the weights. This procedure was repeated iteratively while the error over the training set decreased, and it terminated when the error over the validation set began to increase.
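To make the classifier structure concrete, the following is a minimal sketch of an SNN forward pass in which all weights and biases are stored in one flat vector, which is the form a swarm-based trainer would optimize. The sigmoid hidden activation, the linear output layer, and the helper names (`n_parameters`, `unpack`, `forward`, `predict`) are assumptions for illustration. The number of output neurons is left as a parameter because the text mentions both an argmax over outputs and a single output neuron.

```python
import math


def n_parameters(n_in, n_hid, n_out):
    """Total number of weights and biases in an n_in-n_hid-n_out network."""
    return n_in * n_hid + n_hid + n_hid * n_out + n_out


def unpack(theta, n_in, n_hid, n_out):
    """Split a flat parameter vector into (W1, b1, W2, b2)."""
    sizes = [n_in * n_hid, n_hid, n_hid * n_out, n_out]
    assert len(theta) == sum(sizes), "theta has the wrong length"
    parts, start = [], 0
    for size in sizes:
        parts.append(theta[start:start + size])
        start += size
    w1 = [parts[0][h * n_in:(h + 1) * n_in] for h in range(n_hid)]
    w2 = [parts[2][o * n_hid:(o + 1) * n_hid] for o in range(n_out)]
    return w1, parts[1], w2, parts[3]


def forward(theta, x, n_hid, n_out):
    """Forward pass: sigmoid hidden layer, linear output layer (assumed activations)."""
    w1, b1, w2, b2 = unpack(theta, len(x), n_hid, n_out)
    hidden = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def predict(theta, x, n_hid, n_out):
    """Return the index of the output neuron with the highest activation."""
    outputs = forward(theta, x, n_hid, n_out)
    return max(range(n_out), key=outputs.__getitem__)


theta = [0.0] * n_parameters(13, 3, 2)   # hypothetical two-output variant
theta[-1] = 1.0                          # bias of the second output neuron
features = [0.5] * 13                    # placeholder for 13 SWE features
label = predict(theta, features, n_hid=3, n_out=2)
```

For 13 inputs, 3 hidden neurons, and 1 output, `n_parameters(13, 3, 1)` gives 46, which would be the dimension of the search space for the training algorithm under these assumptions.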

Structure of single-hidden-layer neural network.
Predator-prey particle swarm optimization
The traditional gradient-descent method cannot get stable results, i.e., the converged result depends on the initial values. Hence, scholars have proposed to use bioinspired algorithms to train the SNN. Allahkarami et al. [47] utilized genetic algorithm (GA). Buyukada [48] used PSO. Dugenci et al. [49] suggested the use of bee algorithm (BA). Yang [50] proposed the use of BBO algorithm.
The above-mentioned algorithms alleviate the instability to some degree. To further improve the training algorithm, we used PSO as the basis and proposed PP-PSO. PSO mimics the behavior of bird flocks and fish schools, treating each bird or fish as a particle. Each particle represents a candidate solution; hence, the best solution is sought by moving the particles towards the global optimum.
Particle swarm optimization
In the algorithm, each particle was assigned two characteristics: position P and the velocity V. The fitness function f is evaluated at each iteration over the whole particle swarm. In each iteration, two categories of best particles were updated.
One category was the previous best ($B_p$) position a particle has traversed so far:
The other category was the global best ($B_g$) position that all particles have traversed so far:
Based on $B_p$ and $B_g$, the whole swarm was updated by
Here $w$ is the inertia weight, which can balance local exploitation and global exploration. Two positive constant parameters $q_p$ and $q_g$ are acceleration coefficients with the aim of modifying the distance towards $B_p$ and $B_g$, respectively. $x_p$ and $x_g$ are random variables within the range [0, 1].
Velocity clamping [48] technique was used to limit particles flying out of the search space as
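The update equations referred to above are not reproduced in this text. The following is a minimal sketch of the standard PSO update that matches the description: an inertia term weighted by w, attraction towards B_p and B_g scaled by q_p·x_p and q_g·x_g, and velocity clamping to [−v_max, v_max]. All numerical values (swarm size, w, q_p, q_g, bounds, v_max) and the function name `pso` are illustrative assumptions, not the settings of this study, and the fitness is assumed to be minimized.

```python
import random


def pso(fitness, dim, n_particles=20, iterations=100, w=0.7, q_p=1.5, q_g=1.5,
        bound=5.0, v_max=1.0, seed=0):
    """Minimize `fitness` with a basic PSO; returns (best position, best fitness)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_p = [p[:] for p in pos]                       # B_p: best position per particle
    best_p_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=best_p_fit.__getitem__)
    best_g, best_g_fit = best_p[g][:], best_p_fit[g]   # B_g: best position of the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                x_p, x_g = rng.random(), rng.random()  # random factors in [0, 1)
                v = (w * vel[i][d]
                     + q_p * x_p * (best_p[i][d] - pos[i][d])
                     + q_g * x_g * (best_g[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # velocity clamping
                pos[i][d] += vel[i][d]
            fit = fitness(pos[i])
            if fit < best_p_fit[i]:
                best_p[i], best_p_fit[i] = pos[i][:], fit
                if fit < best_g_fit:
                    best_g, best_g_fit = pos[i][:], fit
    return best_g, best_g_fit


def sphere(p):
    """Toy fitness with its minimum at the origin (stands in for the training error)."""
    return sum(v * v for v in p)


best_pos, best_fit = pso(sphere, dim=3)
```

To train the SNN with such a routine, each particle position would hold the flattened weights and biases, and the fitness would be the classification error on the training set.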
Proposed training algorithm
PP-PSO mimics the behavior of sardines and killer whales. Inspired by this behavior, Wang and Lv [51] proposed a predator-prey model in which the predators chase the center of the prey swarm, and the prey escape from the predators using different behaviors. The swarm in PP-PSO is divided into two types: the prey swarm (marked as y) and the predator swarm (marked as r). The core idea is that the predators chase the prey, while the prey try to escape from the predators. Using this idea, equation (13) was updated as follows:
The weights $w_r(t)$ and $w_y(t)$ are inertia weights for the predator and prey swarms, respectively. They are defined as:
The whole algorithm
The whole algorithm is depicted in Table 2. A 10-fold cross validation was used to obtain statistically rigorous results, and the complete 10-fold cross validation was repeated 50 times independently. This is more than in comparable studies: for example, Hasan et al. [52] used 5 complete runs, Mu and colleagues [53] used only one run, and Sanz et al. [54] used 10 runs. Compared with these experiments, the 50 complete runs in this study better reflect the distribution of the classifier performance.
Pseudocode of our algorithm
SWE, stationary wavelet entropy; SNN, single-hidden-layer neural network; PP-PSO, predator-prey particle swarm optimization.
Running the 10-fold cross validation 50 times is a rigorous way to guard against overfitting and to check that the result is generalizable. Under this setting, one classifier was created in each trial; thus, 10 classifiers were created for one 10-fold cross validation, and 500 different classifiers were created for the 50 independent runs. It is not practical to draw the receiver operating characteristic (ROC) curves of all 500 classifiers in this paper. Instead, this paper reports the mean and standard deviation of sensitivity, specificity, accuracy, and area under the curve (AUC).
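The evaluation protocol can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: folds are stratified by class, the confusion matrix is pooled over the 10 folds of each run, AD is treated as the positive class, and `train_and_predict` is a hypothetical callback standing in for the SWE + SNN + PP-PSO pipeline.

```python
import random
import statistics


def stratified_folds(labels, k, rng):
    """Assign each sample index to one of k folds, keeping class proportions similar."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def repeated_cv(labels, train_and_predict, k=10, runs=50, seed=0):
    """Run `runs` independent k-fold cross validations.

    Returns a list of per-run (sensitivity, specificity, accuracy) tuples and the
    number of classifiers that were trained.
    """
    rng = random.Random(seed)
    results, n_classifiers = [], 0
    for _ in range(runs):
        tp = tn = fp = fn = 0
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            predictions = train_and_predict(train_idx, test_idx)  # one classifier per fold
            n_classifiers += 1
            for i, y_hat in zip(test_idx, predictions):
                if labels[i] == 1:
                    tp, fn = tp + (y_hat == 1), fn + (y_hat != 1)
                else:
                    tn, fp = tn + (y_hat == 0), fp + (y_hat != 0)
        results.append((tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)))
    return results, n_classifiers


labels = [1] * 98 + [0] * 98           # 98 AD (positive) and 98 HC subjects


def always_ad(train_idx, test_idx):    # trivial placeholder classifier
    return [1] * len(test_idx)


results, n_classifiers = repeated_cv(labels, always_ad)
mean_acc = statistics.mean(acc for _, _, acc in results)
std_acc = statistics.stdev(acc for _, _, acc in results)
```

With 10 folds and 50 runs, the callback is invoked 500 times, matching the 500 classifiers mentioned above. The placeholder classifier yields 100% sensitivity, 0% specificity, and 50% accuracy, which shows why all three measures need to be reported together.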
RESULTS
Histogram stretching
The histogram-stretching method was used to make the AD images from local hospitals brighter than the originals. The original images and their histograms are plotted in Fig. 5. Pixels with a gray level of 0 were not counted, since the background contains too many of them. Additionally, the histogram in Fig. 5h is smoothed and shown in Fig. 5i, since many density values drop to zero as a result of histogram stretching.

Illustration of histogram stretching.
Statistical result
The decomposition level was set to 4, and the number of hidden-layer neurons was set to 3. These parameters were obtained by grid search, as described in the following subsections. The sensitivities, specificities, and accuracies of the 50 runs are presented in Table 8. On average over the 50 runs, the proposed method achieved a sensitivity of 92.69±1.29%, a specificity of 92.78±1.51%, an accuracy of 92.73±1.03%, and an AUC of 0.95±0.02.
Table 3. Algorithm comparison (unit: %)
Bold means the best. BRC, brain region cluster; VFI, voting feature interval; SVM, support vector machine; KSVM, kernel support vector machine; DT, decision tree; EB, eigenbrain; RBF, radial basis function; 3D, three-dimensional; GDA, geodesic anisotropy; BD, Bhattacharyya distance; MLP, multilayer perceptron; BBO, biogeography-based optimization; DIF, displacement field; GEPSVM, generalized eigenvalue proximal support vector machine; VBM, voxel-based morphometry; RF, random forest; SWE, stationary wavelet entropy; SNN, single-hidden-layer neural network; PP-PSO, predator-prey particle swarm optimization.
Table 4. Effect of slice selection
Table 5. Comparison with manual interpretation
Table 6. Computational time in offline training over the dataset
h, hour; min, minute; s, second.
Table 7. Computational time in online identification over a single volumetric image
min, minute; s, second.
Table 8. Statistical result on our balanced dataset
R, Run; Sen, Sensitivity; Spc, Specificity; Acc, Accuracy; Avr, Average.
Optimal decomposition level
Grid search was used to find the optimal decomposition level. The decomposition level k was varied from 1 to 8 in increments of 1. The results are shown in Fig. 6.

Optimal decomposition level.
Optimal neuron number at hidden layer
In this experiment, the number of hidden-layer neurons was varied from 2 to 10 in increments of 1. The results are presented in Fig. 7.

Optimal neuron number at hidden layer.
Training algorithm comparison
The feature dimension and the neural network structure remained unchanged in this experiment. The proposed PP-PSO algorithm was compared with four global optimization algorithms: GA [47], PSO [48], BA [49], and BBO [50]. The maximum iteration number was set to 1,000. The Matlab command “boxplot” was used to show the median, quartiles, whiskers, and outliers of each algorithm over 50 runs. The results are shown in Fig. 8.

Our training algorithm compared to global optimization based training algorithm. GA, genetic algorithm; PSO, particle swarm optimization; BA, bee algorithm; BBO, biogeography-based optimization; PP-PSO, predator-prey particle swarm optimization.
Next, the proposed method was compared with traditional gradient-based backpropagation methods. The competing algorithms include backpropagation, momentum backpropagation, and adaptive backpropagation. Each algorithm was run 50 times, and the iteration number was set to 1000. The results comparing the different algorithms are shown in Fig. 9.

Our training algorithm compared to gradient-based training algorithms.
Comparison with state-of-the-art approaches
The SWE+SNN+PP-PSO method was compared with 10 state-of-the-art approaches: BRC+VFI [6], BRC+SVM [6], KSVM+DT [7], EB+RBF-SVM [8], 3D-EB+RBF-SVM [9], GDA-BD+SVM [10], MLP+BBO [11], DIF+SVM [12], DIF+GEPSVM [12], and VBM+RF [13]. All the simulation settings were the same as in the previous experiments. The results are listed in Table 3. The performance results of the competing approaches were taken from their respective publications. Some methods were run only once; hence, their standard deviations could not be reported.
Effect of slice selection
This section describes how the effect of slice selection was validated. Without slice selection, SWE features would have to be extracted from all slices of the brain. Each slice generates 13 features, and 120 slices were used for the whole brain (slices above and below the brain were excluded), giving 13×120 = 1560 features. These 1560 features were submitted to the SNN+PP-PSO. The comparison results are listed in Table 4.
Comparison with manual interpretation
We compared our algorithm with three experienced observers (O1, O2, O3), each with more than 10 years of clinical experience in neuroradiology. In total, 62 subjects (23 ADs and 39 age-matched and sex-matched HCs) were enrolled and scanned. The observers were blinded to the age and sex of the subjects, and they assessed only the selected slice. The comparison results are listed in Table 5.
Computational complexity
The computational complexity of the proposed method was tested with regards to offline training and online identification. The computation time of offline training stage is listed in Table 6. The computation time to process one image of online identification stage is listed in Table 7.
DISCUSSION
Figure 5a-d shows that HC and AD subjects from the same OASIS dataset have similar histogram envelopes. From Fig. 5c-f, it can be seen that the AD subjects from local hospitals are slightly darker than the AD subjects from OASIS. Figure 5f confirms that the histogram of the AD subjects from local hospitals is a low-contrast histogram. Figure 5g shows the HS result of Fig. 5e, with the new histogram and its smoothed version given in Fig. 5h and 5i. It is clear that the histograms from the two sources (OASIS dataset and local hospitals) have similar gray-level distribution envelopes, as compared in Fig. 5c and 5i.
There are other excellent intensity normalization algorithms, such as whole body intensity standardization [55], Whitestripe [56], contrast-limited adaptive histogram equalization [57], etc. However, the method proposed in this paper is simple and works only on one slice. Hence, it is faster than other methods. In the future, performance of other algorithms will be tested.
As Fig. 6 shows, the curves peaked at k = 4; hence, the decomposition level was set to 4. The reason is two-fold: 1) as k increases, higher levels give a finer multiresolution decomposition; 2) if k is too large, however, calculation errors (such as rounding errors) accumulate and worsen the classification performance.
Figure 7 shows that changing the number of hidden-layer neurons has less effect than changing the decomposition level (Fig. 6). As the size of the hidden layer increases, the performance decreases slightly. The optimal number of hidden-layer neurons was 3, which suggests that the best SNN structure in this study was 13-3-1 in terms of numerical results. Nevertheless, the classification performance with 2 or 4 hidden neurons was close to that with 3, which indicates that this problem may have multiple near-optimal solutions.
Figure 8 shows that the proposed PP-PSO was the most stable algorithm and yielded the highest performance among all algorithms. BBO [50] ranked second, and PSO [48] ranked third, with slightly inferior performance to BBO. BA [49] ranked fourth, and GA [47] gave the worst performance.
The comparison results in Fig. 9 show that PP-PSO had a better mean value than the other three algorithms in terms of sensitivity, specificity, and accuracy. PP-PSO also had a much smaller variance than the other three algorithms. All the measures demonstrate the robustness of PP-PSO as a training algorithm.
The reason is that all three competing methods are based on gradient descent, which starts from an initial solution that is usually chosen at random. This initialization profoundly influences the convergence. That is, if the initialization is near a local minimum, the algorithm may become stuck in that local minimum.
The proposed PP-PSO belongs to the family of global optimization algorithms, which can alleviate this problem. The swarm contains several candidate solutions. If one candidate is trapped in a local minimum, other candidates with better results within the swarm will pull it out of the local region. If traditional backpropagation and its variants were used, either a good or a bad result could be obtained because of the large variance. This indicates the necessity of using PP-PSO. There are many other interesting global optimization algorithms, such as artificial bee colony [58], bat algorithm [59], harmony search [60], gray wolf algorithm [61], etc. There are also many other variants of PSO, such as quantum-behaved PSO [62], PSO with time-varying acceleration coefficient [63], bare-bone PSO [64], etc. In the future, these algorithms should be tested objectively to improve the performance of the training algorithm.
Table 3 shows that the proposed method achieved the highest accuracy of 92.73±1.03%, better than the other methods. The MLP+BBO [11] obtained the second highest accuracy of 92.40%, but the authors did not report the standard deviation. The GDA-BD+SVM [10] obtained the third highest accuracy of 92.09±2.60%. However, its sensitivity was low (80.00±4.00%) and had a large standard deviation.
In terms of specificity, the BRC+VFI [6] obtained a perfect specificity, but its sensitivity was too low at 65.63%. In terms of sensitivity, the BRC+SVM [6] obtained the highest sensitivity of 96.88%, but its specificity was only 77.78%. Considering all three measures, the proposed method is the best among all 11 algorithms.
Table 4 shows that using the whole brain did not increase the detection performance. On the contrary, detection over the whole brain slightly decreased the specificity and accuracy, with a sensitivity of 92.71±1.46%, a specificity of 92.61±1.41%, and an accuracy of 92.66±1.08%. The reason may be two-fold: 1) There were too many features (1560) when using the whole brain. The excessive features could have made the classifier training difficult, so that the classifier did not converge to its optimal condition. 2) Many slices are unrelated to AD. Therefore, including these slices makes the input data more complicated than when the slice selection method is used.
Table 5 shows that the three human observers reached accuracies in the range of 72% to 78%, while the developed algorithm achieved an accuracy of 91.94%. This again shows the power of machine learning and computer vision.
On the other hand, deep learning has already proved powerful in automated diagnostic systems: Esteva et al. [65] used a deep convolutional neural network to create a dermatologist-level classifier. Suk et al. [66] used deep sparse multi-task learning in diagnosing AD. Morabito et al. [67] employed deep learning representation to check early-stage Creutzfeldt-Jakob disease. In this study, deep learning was not used due to the small size of the dataset. Yet, it is believed that deep learning can help improve the proposed system in future research work.
From Table 6, one can observe that, in offline training, preprocessing cost 40.65 h, slice selection cost 4.71 s, SWE cost 156.42 s, and classifier training cost 182.80 min. Note that 196 images had to be handled, and the cross validation had to be run 50 times during training. Hence, the computation time per image was reasonable. The training time for one run was about 3.66 min.
From Table 7, it can be seen that, in online identification, preprocessing cost 14.12 min, slice selection cost 0.02 s, SWE cost 0.85 s, and prediction cost only 0.01 s. This means that once the image has been preprocessed, only 0.88 s is needed to identify whether a subject has AD. This is rapid, and it suggests that the proposed method meets the real-time requirement after preprocessing.
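The two derived timing figures quoted in this discussion follow directly from the table entries. As a quick check of the arithmetic, assuming the 182.80 min of classifier training is spread evenly over the 50 runs:

```latex
\begin{align*}
t_{\text{online}} &= 0.02\,\text{s} + 0.85\,\text{s} + 0.01\,\text{s} = 0.88\,\text{s},\\
t_{\text{train per run}} &= \frac{182.80\,\text{min}}{50} \approx 3.66\,\text{min}.
\end{align*}
```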
Table 8 shows the sensitivities, specificities, and accuracies over the balanced dataset based on 50 runs. On average, the algorithm of this study achieved a sensitivity of 92.69±1.29%, a specificity of 92.78±1.51%, and an accuracy of 92.73±1.03%. If only the data from the online OASIS dataset were used, the unbalanced dataset (28 AD patients versus 98 HCs) might bias the classifier towards the majority class (i.e., the healthy subjects).
A shortcoming of the proposed method is that the system is oriented towards the computer rather than the human. Hence, it can be regarded as a “black box” from which human experts cannot extract effective rules. This is characteristic of “machine learning” methods. In the future, we shall try to use rule-based systems to translate the machine-oriented rules into human-oriented rules.
Another shortcoming is that the employed feature, SWE, is only suitable for two-dimensional images. This is why a slice had to be selected before the SWE operation. In the future, we shall try to extend SWE to three dimensions so that the volumetric image can be processed directly by a three-dimensional SWE feature.
Conclusion
Our team developed a novel system based on computer vision and machine learning. We proposed a novel predator-prey particle swarm optimization to help train the classifier. The proposed system performed better than 10 state-of-the-art approaches on the combined dataset from the OASIS dataset and local hospitals. Additionally, the proposed system performed better than human observers in analyzing realistic brain imaging data. In the future, we shall try to detect mild cognitive impairment. We shall also make tentative tests using advanced classifiers, such as convolutional neural networks or autoencoders.
ACKNOWLEDGMENTS
This study was supported by Natural Science Foundation of China (61602250), Program of Natural Science Research of Jiangsu Higher Education Institutions (16KJB520025, 15KJB470010), Open fund for Jiangsu Key Laboratory of Advanced Manufacturing Technology (HGAMTL1601), Open Program of Jiangsu Key Laboratory of 3D Printing Equipment and Manufacturing (3DL201602), Open fund of Key Laboratory of Guangxi High Schools Complex System and Computational Intelligence (2016CSCI01), and Natural Science Foundation of Jiangsu Province (BK20150983).
The authors gratefully acknowledge the OASIS dataset, which was supported by NIH grants P50AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, and R01 MH56584.
