Hierarchical genetic optimization of convolutional neural models for diabetic retinopathy classification

Abstract

Diabetic retinopathy (DR) is one of the worst conditions caused by diabetes mellitus (DM). Because DR may show no symptoms in its initial stages, it can leave the patient completely blind. Expert physicians have been developing technologies for the early detection and classification of DR to curb the growing number of patients. Some authors have used convolutional neural networks (CNNs) for this purpose. Pre-processing the image database is important for increasing the detection accuracy of a CNN, and an optimization algorithm can increase that accuracy further. In this work, four pre-processing methods are presented and compared in order to select the best one. A hierarchical genetic algorithm (HGA) is then combined with the selected pre-processing method with the intention of increasing the classification accuracy of a new CNN model. Using the HGA increases the accuracies obtained with the pre-processing methods alone and outperforms the results obtained by other authors. In the binary study case (detection of DR), the highest accuracy was 0.9781, with a mean accuracy of 0.9650 and a standard deviation of 0.007665. In the multi-class study case (classification of DR), the highest accuracy was 0.7762, with a mean accuracy of 0.7596 and a standard deviation of 0.009948.

Keywords

Convolutional neural networks, image pre-processing, hierarchical genetic algorithms, diabetic retinopathy

1. Introduction

Diabetic retinopathy (DR) is one of the conditions caused by type 1 and type 2 diabetes mellitus (DM) [1]. It is the main cause of blurred vision or complete vision loss in people who did not receive adequate treatment [2]. One study estimated that by 2030 there will be twice as many people with this disease as there were in 2010 [3]. In other words, twice as many people are likely to suffer from reduced vision.

There are several reasons why DR cases have grown so rapidly [4]. The first is the age at which the disease begins to affect patients: some degree of DR has been detected in DM patients as young as 20 years old [5]. Another problem is the small number of people who are trained to detect the disease [6]. For this reason, different computer technologies have been used to help detect this condition and to support better decisions about the medical treatment of patients.

The use of artificial neural networks has benefited the detection of different diseases (not only DR) since they were first implemented [7, 8]. Deep learning methods have been used to detect DR, and one of the methods that has offered the best results is the convolutional neural network (CNN) [9]. Satisfactory results can be obtained either by creating an architecture from scratch or by using an existing convolutional neural network model. However, these models have a margin of error, and optimization algorithms such as particle swarm optimization and genetic algorithms are used to reduce it [10].

Optimization methods have been applied to existing trained CNN models such as ResNet for COVID-19 [11, 12] and to a CNN model built from scratch for DM classification [13]. Given these promising results, hybrid intelligent systems are used to take advantage of two or more technologies. Deep learning methods have benefited greatly from this approach, which has helped to solve problems that could not have been solved using each technology separately [14, 15].

In this work, the first step was to create four different pre-processing methods for the images of the database in order to improve the accuracy obtained by the CNN models. Other researchers have focused on at most one pre-processing method, yet pre-processing is very important, and a poor choice can negatively affect the final accuracy obtained after training the CNN. Two different CNN models are used in order to analyze what happens when few or many convolutional layers are used. Studying pre-processing methods could help other authors to create or improve their CNN models in the future.

The second step is the implementation of a hierarchical genetic algorithm to optimize the hyperparameters of the CNN architecture for the detection of DR. Other research has already used genetic algorithms to optimize neural network models, but without focusing on the detection or classification of diabetic retinopathy.

Section 2 presents related work by different authors. Section 3 details the basic concepts, with examples and figures, needed to fully understand this work. Section 4 explains and describes in detail the methods applied in this work. Section 5 presents the results of the experiments carried out, and finally Section 6 presents the conclusions.

2. Related work

Different works share the objective of detecting and classifying diabetic retinopathy. However, it is also important to mention work on the optimization of convolutional neural networks. In this work, these two technologies were combined to fulfill our objective, and the previous works described here served as inspiration.

The research carried out by Victor Vives-Boix and Daniel Ruiz-Fernandez consists of experiments to classify diabetic retinopathy in both a binary and a multiclass manner [16]. The authors took existing, previously trained convolutional neural network architectures and applied them to the classification of diabetic retinopathy. They did not limit themselves to using these models as they were, but also modified them to improve the results obtained.

According to its authors, the work proposed a bioinspired approach based on synaptic metaplasticity in CNNs. This biological phenomenon is known to directly affect both learning and memory by reinforcing less common events during the learning process. Synaptic metaplasticity was included in the backpropagation stage of each convolutional layer.

The APTOS 2019 database was used with four existing CNN architectures. The results show that including synaptic metaplasticity in convolutional neural network architectures improves the accuracy obtained. The best results were obtained with the InceptionV3 architecture with synaptic metaplasticity: an accuracy of 95.56% for binary classification using 3662 images for training, and an accuracy of 76.67% for multiclass classification using the same modified architecture.

Omar Dekhil et al. observed a significant increase in the number of patients losing vision due to diabetic retinopathy. In response to this problem, they presented a computer-aided diagnostic tool based on a convolutional neural network to classify fundus images into one of five DR stages [17]. The proposed method first applies pre-processing to the APTOS 2019 database. The proposed CNN consists of five convolutional layers, each with the ReLU activation function, followed by three fully connected layers.

Transfer learning was adopted to minimize overfitting by training the model on a larger dataset of 3.2 million images (i.e., the ImageNet database) before using the model on the APTOS 2019 database. The neural network model used for transfer learning was the VGG 2016. The proposed approach achieved a test accuracy of 77% (0.7700), which, according to its authors, offers a promising solution for successful early diagnosis and automated staging of DR. No tests for binary classification were performed.

The research carried out by Ashray Bhandare and Devinder Kaur consists of the development of a genetic algorithm with the aim of discovering convolutional neural network architectures that can work without the intervention of an expert technician [18]. In other words, the aim was to automate the process of selecting a convolutional neural network using a genetic algorithm that was responsible for optimizing the hyperparameters of the network.

The database used in this work was MNIST, a database of images of handwritten digits from 0 to 9. It is derived from the “NIST” database, named after the National Institute of Standards and Technology, which originally compiled it. The images were modified (pre-processed) to facilitate the work of learning algorithms, and because of this change the letter “M” was added to the database name.

The MNIST database contains 60,000 labeled images that are used to train and validate the convolutional neural network, and another 10,000 labeled images that are normally reserved for testing. It is one of the most popular databases, as it allows researchers to study their proposed methods in a controlled environment. In this work, the genetic algorithm is responsible for optimizing different hyperparameters of the convolutional neural network. Examples taken from the MNIST database can be observed in Fig. 1.

Figure 1.

Example images taken from the MNIST database.

To encode all the hyperparameter values in a chromosome, the authors of that work decided to use a binary chromosome, that is, each gene can only take the value 0 or 1. To obtain the decimal value of each hyperparameter, several genes are read together, with the number of genes depending on the maximum value that the hyperparameter can take. This means that 7 genes are necessary to obtain the number of epochs, while only 3 genes are required for the size of the filters. This representation can be seen in Fig. 2.
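The binary encoding can be illustrated with a minimal sketch. The upper bounds used below (100 epochs, filter size 7) and the most-significant-bit-first ordering are assumptions chosen only so that the gene counts match the 7 and 3 genes mentioned in the text; they are not taken from the original work.

```python
def bits_needed(max_value: int) -> int:
    """Number of binary genes needed to represent the integers 0..max_value."""
    return max(1, max_value.bit_length())


def decode(genes: list[int]) -> int:
    """Read a list of 0/1 genes as an unsigned integer (most significant bit first)."""
    value = 0
    for gene in genes:
        value = (value << 1) | gene
    return value


# Hypothetical upper bounds: up to 100 epochs, filter sizes up to 7.
print(bits_needed(100))                # 7
print(bits_needed(7))                  # 3
print(decode([1, 1, 0, 0, 1, 0, 0]))   # 100
```

A decoded value that falls outside the allowed range of a hyperparameter would still need to be clipped or repaired; how that was handled is not described here.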

Figure 2.

Example of the chromosome used to obtain the epochs.

The population size of the genetic algorithm is 10 individuals. The fitness of each individual is the accuracy (a value between 0 and 1) obtained on the MNIST test images by the CNN built from the hyperparameters encoded in its chromosome.

The genetic algorithm uses the following operators: tournament selection between two individuals, single-point crossover with a probability of 50%, and multipoint mutation with a probability of 80%, mutating only 10% of the genes.
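The three operators can be sketched as follows for a binary chromosome. This is an illustrative sketch rather than the original implementation: the function names, the use of `random.Random`, and the choice to flip at least one gene when mutation fires are assumptions.

```python
import random


def tournament(population, fitness, rng, k=2):
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


def single_point_crossover(a, b, rng, p=0.5):
    """With probability p, swap the tails of two parents at a random cut point."""
    if rng.random() >= p:
        return a[:], b[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def multipoint_mutation(chromosome, rng, p=0.8, fraction=0.1):
    """With probability p, flip a fixed fraction of the binary genes."""
    child = chromosome[:]
    if rng.random() < p:
        n_flips = max(1, round(fraction * len(child)))
        for i in rng.sample(range(len(child)), n_flips):
            child[i] ^= 1
    return child


rng = random.Random(0)
parent_a, parent_b = [0] * 10, [1] * 10
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
mutated = multipoint_mutation(child_a, rng)
```

Passing an explicit random generator keeps runs reproducible, which is useful when the same algorithm is executed several times and the results are compared.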

The genetic algorithm was executed 10 times, and the highest accuracy obtained was 0.9919. Only the MNIST database was used, but the authors concluded that genetic algorithms have the potential to generate successful convolutional neural network architectures automatically.

3. Basic concepts

The related work mentions several methods and technologies, so, for a complete understanding of this work, it is essential to know the concepts presented in this section.

3.1 Neural networks

Since their creation, artificial neural networks have been used for disease detection, prediction, and pattern recognition [19]. For disease detection, neural networks obtain better results with supervised learning [20]. Supervised learning means that the data used to train the network is labeled.

Training a conventional neural network with images requires an expert technician to extract the best features of the images, but there is a kind of neural network that does not need an expert technician: the convolutional neural network [21]. This type of network is capable of detecting and classifying objects, medical diseases, and more.

The main difference between convolutional neural networks and feedforward neural networks is the use of convolutional layers before the hidden layers [22]. The layers of a CNN are arranged hierarchically, starting with a convolutional layer followed by an activation function and a MaxPooling layer [23].

Figure 3.

Example of the convolutional function.

Figure 4.

Example of the ReLU activation function.

3.1.1 Convolutional layer

The first layer used in CNNs after the input layer is the convolutional layer. This layer uses kernels to detect lines and edges; the more of them are used, the more complex the shapes that can be recognized [24]. The kernel slides over all the neurons of the input, producing a new matrix. An example of this layer can be observed in Fig. 3.
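As an illustration of how a kernel slides over its input, the following sketch applies a 3x3 vertical-edge kernel to a small image. Plain Python lists are used to keep the example dependency-free; the kernel values, the image, and the choice of no padding with a stride of 1 are assumptions made for the example.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
# Kernel that responds to left-to-right increases in intensity.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(conv2d(image, vertical_edge))  # [[0, 27, 27], [0, 27, 27]]
```

The output is zero where the window covers a uniform area and large where the window straddles the dark-to-bright boundary, which is the edge response described above.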

3.1.2 Rectified linear unit function (ReLU)

As previously mentioned, an activation function is needed after the convolutional layer, and one that has offered good results with CNNs is the ReLU function [25]. This function replaces negative numbers with zeros and leaves positive numbers unchanged. An example of this function can be seen in Fig. 4.
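The behavior of ReLU, f(x) = max(0, x), can be shown on a small matrix. The helper name and the input values are illustrative only.

```python
def relu(feature_map):
    """Apply ReLU element-wise: negatives become 0, other values pass through."""
    return [[max(0, value) for value in row] for row in feature_map]


print(relu([[-3, 5], [0, -1]]))  # [[0, 5], [0, 0]]
```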

3.1.3 MaxPooling layer

Like the convolutional layer, the MaxPooling layer uses a kernel, but here the kernel is used to find the maximum value in each region of the matrix, that is, to extract the strongest feature within the area covered by the kernel [26]. This layer reduces the computational load because it converts the original matrix into a smaller one. An example of this function can be observed in Fig. 5.
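A minimal sketch of max pooling with a 2x2 kernel and a stride of 2 shows how a 4x4 matrix is reduced to 2x2. The function name and the input values are assumptions made for the example.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
print(max_pool(feature_map))  # [[6, 2], [2, 9]]
```

Each output value summarizes four input values, so the matrix passed to the next layer has a quarter of the original entries.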

Figure 5.

Example of the MaxPooling function.

3.2 Hierarchical genetic algorithms

Theories of evolution have been discussed by many authors for many years [27], and one of the most studied is Darwin's theory of evolution. Genetic algorithms (GA) are based on how Darwin's theory works: an initial population is created, the fitness of each individual is obtained, and then selection, crossover, and mutation are applied [28].

Simple genetic algorithms have different extensions, and the one used in this work is the hierarchical genetic algorithm (HGA), designed for hierarchical problems. Unlike a simple GA, an HGA uses two types of genes: control and parametric [29]. It is usually represented as a tree in which the control genes occupy the higher levels and control the parametric genes in the lower levels. An example of an HGA can be observed in Fig. 6.

Figure 6.

Example of HGA.

Figure 7.

The two CNN architectures used in this work.

4. Proposed methods

This section presents the methods developed, starting with the CNN models used in the pre-processing experiments. It then explains the APTOS 2019 database and the pre-processing methods applied to it, and finally the structure of the HGA and the meaning of the value of each gene.

4.1 Convolutional neural networks models

Two different CNN architectures were designed, and each one was trained with the four pre-processing methods in the two study cases. The main difference between the two CNN models is the number of convolutional layers. The first model has 3 convolutional layers and takes input images of 200x200 pixels. The second model is similar but has 10 convolutional layers and takes input images of 512x512 pixels.

Both models have two hidden layers after the convolutional layers and an output layer of 2 or 5 neurons. The number of neurons in the output layer depends on the study case (2 neurons for the binary study case and 5 for the multi-class study case). The activation function of the output layer also depends on the study case: Sigmoid for the binary study case and SoftMax for the multi-class study case. The complete CNN architectures can be observed in Fig. 7.

4.2 APTOS 2019 database

This database contains 3662 labeled images for training and validating a CNN for the detection and classification of DR [30]. Table 1 shows how many images belong to each stage of DR, and Table 2 shows the number of images of healthy retinas and of retinas with DR. The distribution in Table 2 is used for the binary study case, while the distribution in Table 1 is used for the multi-class study case.

Table 1
Number of stage images in APTOS 2019

Stage of diabetic retinopathy	Number of images
Healthy retina	1805
Non-proliferative (NP) Mild	370
NP Moderate	999
NP Severe	295
Proliferative	193

Table 2

Number of healthy images and DR images in APTOS 2019

Type of image	Number of images
Healthy retina	1805
Retina with DR	1857
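The relationship between the two tables can be checked with a short sketch: collapsing the four DR stages of Table 1 into a single class reproduces the counts in Table 2. The integer stage codes 0 to 4 and the helper name are assumptions made for illustration.

```python
# Image counts per stage, copied from Table 1.
# The integer stage codes 0-4 are an assumed labeling convention.
STAGE_COUNTS = {
    0: 1805,  # healthy retina
    1: 370,   # NP mild
    2: 999,   # NP moderate
    3: 295,   # NP severe
    4: 193,   # proliferative
}


def to_binary_label(stage: int) -> int:
    """Map a DR stage to 0 (healthy retina) or 1 (retina with any degree of DR)."""
    return 0 if stage == 0 else 1


binary_counts = {0: 0, 1: 0}
for stage, count in STAGE_COUNTS.items():
    binary_counts[to_binary_label(stage)] += count

print(binary_counts)  # {0: 1805, 1: 1857}
```

The two binary classes are nearly balanced, whereas the five-stage distribution is strongly skewed toward healthy and moderate cases.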

4.3 Pre-processing

Four different pre-processing methods were implemented on the APTOS 2019 database with the intention of comparing them using the two CNN models.

Figure 8.

Examples of the first pre-processing.

Figure 9.

Examples of the second pre-processing.

4.3.1 First pre-processing

This pre-processing is also the starting point of the following pre-processing methods, which makes it the most important one, and it is one of the most used by other authors. The method consists of deleting the dark background pixels and extracting the retina from the image. To do this, the original images are first converted from RGB (an image with three color channels: red, green, and blue) to grayscale. The grayscale image is then converted to a binary image pixel by pixel: if the value of a pixel is less than 20, its new value is 0; otherwise, it is 1. The next step is to find the retina in the binary image, which is easy because the retina is simply the biggest shape. Once the retina is found, its position is extracted and used to crop the retina from the original image. Examples of the resulting images can be seen in Fig. 8.
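The steps above can be sketched on a tiny image stored as nested lists of RGB tuples. Two simplifications are assumed here and are not part of the described method: grayscale is computed as the plain average of the three channels, and the retina is located as the bounding box of all foreground pixels rather than by searching for the largest connected shape. In practice an image library would be used instead of plain lists.

```python
def to_gray(pixel):
    """Grayscale value as the plain channel average (an assumed conversion)."""
    r, g, b = pixel
    return (r + g + b) // 3


def crop_retina(image, threshold=20):
    """Binarize with the threshold, then crop the original image to the foreground."""
    mask = [[1 if to_gray(px) >= threshold else 0 for px in row] for row in image]
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return image  # no pixel reaches the threshold, so nothing is cropped
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


BLACK, RETINA = (0, 0, 0), (200, 80, 40)
image = [
    [BLACK] * 5,
    [BLACK, RETINA, RETINA, RETINA, BLACK],
    [BLACK, RETINA, RETINA, RETINA, BLACK],
    [BLACK] * 5,
]
cropped = crop_retina(image)
print(len(cropped), len(cropped[0]))  # 2 3
```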

4.3.2 Second pre-processing

This pre-processing method is similar to the previous one. The images resulting from the first pre-processing method contain the information necessary to start training CNN models, but the CNN models in this work require input images with equal width and height. This means that non-square images would be deformed when resized for training.

To solve this, the second pre-processing method takes the width and height of the images resulting from the first pre-processing method and adds black pixels where necessary to obtain an image with equal width and height. Examples of the resulting images can be seen in Fig. 9.
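A minimal sketch of the padding step follows. The text only states that black pixels are added where necessary; centering the original image inside the square is an assumption made here.

```python
def pad_to_square(image, fill=(0, 0, 0)):
    """Add black rows or columns so that width equals height (image kept centered)."""
    height, width = len(image), len(image[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    padded = [[fill] * side for _ in range(top)]
    for row in image:
        padded.append([fill] * left + list(row) + [fill] * (side - width - left))
    padded += [[fill] * side for _ in range(side - height - top)]
    return padded


wide = [[(1, 1, 1)] * 4 for _ in range(2)]  # 2 rows x 4 columns
square = pad_to_square(wide)
print(len(square), len(square[0]))  # 4 4
```

Because the aspect ratio of the retina is preserved, a later resize to the input size of the network scales the image uniformly instead of stretching it.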

4.3.3 Third pre-processing

After the first pre-processing method is applied, the color channels of the images are extracted. The RGB color space is used in this work, so the extraction yields three images, one per channel. Histogram equalization is then applied to each one, and finally the three images are merged to obtain a single retina image again. Examples of the resulting images can be seen in Fig. 10.
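The split, equalize, and merge sequence can be sketched as follows for 8-bit channels. The equalization uses the common cumulative-histogram mapping; whether the original implementation used exactly this variant is an assumption, and the function names are illustrative.

```python
def equalize(channel):
    """Histogram-equalize one 8-bit channel given as a list of rows."""
    flat = [v for row in channel for v in row]
    hist = [0] * 256
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant channel: there is nothing to spread out
        return [row[:] for row in channel]
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * 255)) for c in cdf]
    return [[lut[v] for v in row] for row in channel]


def equalize_rgb(image):
    """Split an RGB image into channels, equalize each one, and merge them back."""
    channels = [[[px[k] for px in row] for row in image] for k in range(3)]
    r, g, b = (equalize(ch) for ch in channels)
    return [
        [(r[i][j], g[i][j], b[i][j]) for j in range(len(image[0]))]
        for i in range(len(image))
    ]


print(equalize([[10, 20], [30, 40]]))  # [[0, 85], [170, 255]]
```

In the small example, four intensities squeezed into the range 10 to 40 are spread over the full range 0 to 255, which is the contrast-stretching effect that equalization is used for.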

Figure 10.

Examples of the third pre-processing.

Figure 11.

Examples of the fourth pre-processing.

4.3.4 Fourth pre-processing

This is the last pre-processing method, and it has been used before by other authors. It is very similar to the third pre-processing method, with the difference that only the green channel of the retina image is used. First, the green channel of the retina image is extracted, and then histogram equalization is applied. The resulting retina image is in grayscale. Examples of the resulting images can be seen in Fig. 11.

4.4 Hierarchical genetic algorithm

Section 3 mentioned that hierarchical genetic algorithms have control and parametric genes. These two types of genes serve different purposes, and it is necessary to understand what each of them does.

Figure 12.

Representation of the control genes.

4.4.1 Control genes

The first control gene has an integer value that decides how many convolutional layers the CNN model will use; its value is between 3 and 6, inclusive. The second gene has an integer value that specifies how many fully connected layers will be created after the convolutional layers, not counting the output layer. The third and last control gene represents the number of epochs and has an integer value between 10 and 100 that must be a multiple of 10. A graphic representation of the complete control chromosome fragment can be observed in Fig. 12.
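A random control fragment can be sketched as below. The range of the second gene is not stated in this section; the range of 1 to 4 used here is an assumption, as are the dictionary keys and the function name.

```python
import random


def random_control_genes(rng, max_fc_layers=4):
    """Draw one random set of control genes within the ranges described above."""
    return {
        "conv_layers": rng.randint(3, 6),            # 3 to 6, inclusive
        "fc_layers": rng.randint(1, max_fc_layers),  # assumed range, see lead-in
        "epochs": rng.randrange(10, 101, 10),        # 10, 20, ..., 100
    }


rng = random.Random(42)
control = random_control_genes(rng)
```

Restricting the epochs to multiples of 10 shrinks the search space of that gene from 91 possible values to 10.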

Figure 13.

Representation of the parametric genes for convolutional layers.

4.4.2 Parametric genes for convolutional layers

The first control gene decides how many convolutional layers the CNN model will have, and the parametric genes described here set the hyperparameters of each convolutional layer created by that control gene. The first parametric gene has an integer value of 1, 2, or 3 and indicates which layers follow the convolutional layer. If the value is 1, the convolutional layer is followed only by the ReLU activation function. If the value is 2, the ReLU function is followed by a MaxPooling layer with a 2x2 kernel and a stride of 2. If the value is 3, the ReLU function and the MaxPooling layer are followed by a Dropout layer.

The second parametric gene has an integer value of 3, 4, or 5 and sets the width and height of the filter of the convolutional layer. The third gene also has an integer value and sets how many filters the convolutional layer will have. The minimum and maximum values of this gene depend on the position of the convolutional layer. If the position is 1 or 2, the value is between 16 and 32, inclusive. If the position is 3 or 4, the value is between 64 and 128, inclusive. Finally, if the position is 5, the value is between 256 and 512, inclusive.

The last parametric gene for convolutional layers has an integer value between 10 and 50, inclusive, that must be a multiple of 10. The Dropout layer needs a value between 0 and 1, so this value is divided by 100 before being inserted into the CNN model. A graphic representation of the complete parametric chromosome fragment for convolutional layers can be observed in Fig. 13.
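The four parametric genes of one convolutional layer can be sketched as a small record plus a decoder that lists the layers they produce. The dictionary keys, function names, and layer labels are hypothetical; only the value ranges come from the text, and the filter ranges cover positions 1 to 5 as given above.

```python
import random

# Filter-count range per layer position, as described in the text.
FILTER_RANGES = {1: (16, 32), 2: (16, 32), 3: (64, 128), 4: (64, 128), 5: (256, 512)}


def random_conv_genes(position, rng):
    """Draw the four parametric genes for the convolutional layer at `position`."""
    low, high = FILTER_RANGES[position]
    return {
        "block_type": rng.randint(1, 3),            # 1: ReLU, 2: +MaxPooling, 3: +Dropout
        "kernel_size": rng.randint(3, 5),           # filter width and height
        "filters": rng.randint(low, high),
        "dropout_gene": rng.randrange(10, 51, 10),  # 10, 20, ..., 50
    }


def describe_conv_layer(genes):
    """List the layers produced by one set of convolutional parametric genes."""
    k = genes["kernel_size"]
    layers = [f"Conv2D({genes['filters']}, {k}x{k})", "ReLU"]
    if genes["block_type"] >= 2:
        layers.append("MaxPooling(2x2, stride 2)")
    if genes["block_type"] == 3:
        layers.append(f"Dropout({genes['dropout_gene'] / 100})")
    return layers


example = {"block_type": 3, "kernel_size": 3, "filters": 32, "dropout_gene": 20}
print(describe_conv_layer(example))
# ['Conv2D(32, 3x3)', 'ReLU', 'MaxPooling(2x2, stride 2)', 'Dropout(0.2)']
```

Note that the dropout gene is only expressed when the first gene equals 3; for the other two values it is carried in the chromosome but has no effect on the model.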

4.4.3 Parametric genes for fully connected layers

The second control gene decides how many fully connected layers the CNN model will have, and the parametric genes described here set the hyperparameters of each fully connected layer created by that control gene. The first parametric gene has an integer value of 1 or 2 and indicates which layers follow the fully connected layer. If the value is 1, the fully connected layer is followed only by the ReLU activation function. If the value is 2, the ReLU function is followed by a Dropout layer.

The second parametric gene has an integer value and sets how many neurons the fully connected layer will have. The minimum and maximum values of this gene depend on the position of the fully connected layer. If the position is 1 or 2, the value is between 64 and 128, inclusive. If the position is 3 or 4, the value is between 128 and 256, inclusive.

The last parametric gene for fully connected layers has an integer value between 10 and 50, inclusive, that must be a multiple of 10. The Dropout layer needs a value between 0 and 1, so this value is divided by 100 before being inserted into the CNN model. A graphic representation of the complete parametric chromosome fragment for fully connected layers can be observed in Fig. 14.
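The hierarchical relationship between control and parametric genes can be sketched for the fully connected part of the chromosome. The sketch assumes that the chromosome always stores parametric rows for the maximum number of layers and that the control gene decides how many of them are expressed; this storage layout, the dictionary keys, and the function names are assumptions rather than details given in the text.

```python
def describe_fc_layer(genes):
    """List the layers produced by one set of fully connected parametric genes."""
    layers = [f"Dense({genes['neurons']})", "ReLU"]
    if genes["block_type"] == 2:
        layers.append(f"Dropout({genes['dropout_gene'] / 100})")
    return layers


def express_fc_layers(control, fc_rows):
    """Decode only the rows switched on by the control gene; the rest stay dormant."""
    return [describe_fc_layer(row) for row in fc_rows[: control["fc_layers"]]]


fc_rows = [
    {"block_type": 2, "neurons": 128, "dropout_gene": 30},
    {"block_type": 1, "neurons": 64, "dropout_gene": 10},
    {"block_type": 2, "neurons": 256, "dropout_gene": 50},
    {"block_type": 1, "neurons": 200, "dropout_gene": 20},
]
control = {"fc_layers": 2}
print(express_fc_layers(control, fc_rows))
# [['Dense(128)', 'ReLU', 'Dropout(0.3)'], ['Dense(64)', 'ReLU']]
```

Under this layout, crossover and mutation can still act on dormant rows, so a later change in the control gene may activate values that evolved while they were unused.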

Figure 14.

Representation of the parametric genes for fully connected layers.

Table 3

Distribution of APTOS 2019 database

Type of images	Percentage of images	Number of images
Training	72%	2637
Validation	8%	293
Testing	20%	732
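The image counts in Table 3 follow from applying the percentages to the 3662 images of the database and rounding to the nearest integer, as this short check shows. The variable names are illustrative.

```python
TOTAL_IMAGES = 3662
SPLIT = {"Training": 0.72, "Validation": 0.08, "Testing": 0.20}

# Apply each percentage to the full database and round to whole images.
counts = {name: round(TOTAL_IMAGES * share) for name, share in SPLIT.items()}

print(counts)                # {'Training': 2637, 'Validation': 293, 'Testing': 732}
print(sum(counts.values()))  # 3662
```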

Table 4

Results of the binary study case

Experiment number	CNN depth	Pre-processing	Mean accuracy	Standard deviation	Min value	Max value
1	3	1	0.9363	0.00739364	0.9154	0.9461
2	3	2	0.9401	0.00824493	0.9154	0.9523
3	10	3	0.9011	0.08574371	0.5539	0.9659
4	10	1	0.9012	0.07387703	0.5375	0.9536

Table 5

Results of the multi-class study case

Experiment number	CNN depth	Pre-processing	Mean accuracy	Standard deviation	Min value	Max value
5	3	1	0.7218	0.00724198	0.7053	0.7381
6	3	2	0.7401	0.01051900	0.7176	0.7585
7	10	1	0.7091	0.03506277	0.5866	0.7394
8	10	2	0.7204	0.02397598	0.6712	0.7585

5. Experimental results

The first experiments concerned the pre-processing methods. Two study cases were considered: binary and multi-class. Both cases used the CNN models shown in Section 4. Eight experiments were carried out for the binary case and eight for the multi-class case, one for each combination of the 4 pre-processing methods and the two CNN models.

Each pre-processing experiment was executed 30 times to obtain the mean and standard deviation, which are needed for a statistical test to determine which pre-processing method offers better results. The minimum and maximum values were also recorded. The optimizer used in each experiment was Adaptive Moment Estimation (Adam). For all experiments, the APTOS 2019 database was split as shown in Table 3. The same settings were used for the hierarchical genetic algorithm experimentation.

5.1 Binary case for pre-processing experimentation

As mentioned above, eight experiments were performed for this study case. Table 4 shows the best 4 experiments: 2 using the CNN model with 3 convolutional layers and 2 using the CNN model with 10 convolutional layers.

According to the results in Table 4, the highest mean accuracy was obtained in experiment #2, using the CNN model with 3 convolutional layers and pre-processing method 2. The maximum value was obtained in experiment #3, using the CNN model with 10 convolutional layers and pre-processing method 3.

5.2 Multi-class case for pre-processing experimentation

As in the binary study case, eight experiments were performed for the multi-class study case. Table 5 shows the best 4 experiments: 2 using the CNN model with 3 convolutional layers and 2 using the CNN model with 10 convolutional layers.

According to the results in Table 5, the highest mean accuracy was obtained in experiment #6, using the CNN model with 3 convolutional layers and pre-processing method 2. The maximum value was obtained in experiments #6 and #8, using the CNN models with 3 and 10 convolutional layers respectively, both with pre-processing method 2.

5.3 Comparison of pre-processing experimentation results

To better understand the data in the tables, statistical tests were carried out to determine which pre-processing method offers better results. According to the work in [31], experiments using the CNN model with three convolutional layers offered better results than those using the CNN model with ten convolutional layers. Therefore, only the two best experiments using the CNN with three convolutional layers were used for the statistical tests.

5.3.1 Hypothesis testing for binary case in pre-processing experimentation

According to the results in Table 4, the hypothesis test compares experiments #1 and #2. The second experiment obtained a higher mean accuracy, so our claim is that pre-processing method 2 offers a higher mean accuracy than pre-processing method 1 for the binary study case.

To reject the null hypothesis, the test statistic must be less than the critical value of $-1.96$, which is used because the alpha value is 0.05. The test statistic is $-1.8564$. Since $-1.8564$ is not lower than $-1.96$, the test fails to reject the null hypothesis, which means that there is not enough evidence to support the claim. The hypothesis test can be observed in Fig. 15.
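The form of the test statistic is not stated explicitly. Assuming a two-sample z-test on the means of the 30 runs of each experiment, with sample means $\bar{x}_1, \bar{x}_2$, standard deviations $s_1, s_2$, and sample sizes $n_1 = n_2 = 30$, the statistic would be:

$$
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

Substituting the rounded values of experiments #1 and #2 from Table 4 gives:

$$
z = \frac{0.9363 - 0.9401}{\sqrt{\dfrac{0.00739364^2}{30} + \dfrac{0.00824493^2}{30}}} \approx -1.88
$$

This is close to the reported value; a small gap of this size would be expected if the reported statistic was computed from unrounded means.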

Figure 15.

Hypothesis testing for binary study case for pre-processing experimentation.

5.3.2 Hypothesis testing for multi-class case in pre-processing experimentation

According to the results in Table 5, the hypothesis test compares experiments #5 and #6. Experiment #6 obtained a higher mean accuracy, so our claim is similar to that of the binary study case: pre-processing method 2 offers a higher mean accuracy than pre-processing method 1 for the multi-class study case.

To reject the null hypothesis, the test statistic must be less than the critical value of $-1.96$, which is used because the alpha value is 0.05. The test statistic is $-7.8358$. Since $-7.8358$ is lower than $-1.96$, the null hypothesis is rejected, which means that there is enough evidence to support the claim. The hypothesis test can be observed in Fig. 16.

Figure 16.

Hypothesis testing for multi-class study case for pre-processing experimentation.

5.4 Binary case for hierarchical genetic algorithm experimentation

For the HGA experimentation, pre-processing method 2 is used. The experiment consisted of executing the hierarchical genetic algorithm 30 times with an initial population of 10 individuals, 10 generations, a selection probability of 80%, a crossover probability of 50%, and a mutation probability of 80%. The best children replace the worst parents. Binary crossover was used, and mutation was implemented by taking a random row of each parametric gene and changing its value as detailed in the previous section. The fitness of each chromosome was the accuracy obtained by the CNN model on the testing images. Table 6 shows the five best fitness values obtained by the hierarchical genetic algorithm.
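The replacement rule, in which the best children replace the worst parents, can be sketched as follows. The number of individuals replaced per generation is not stated, so `n_replace` is a hypothetical parameter, and the function name and data layout are assumptions.

```python
def replace_worst(parents, parent_fitness, children, child_fitness, n_replace):
    """Swap the n_replace worst parents for the n_replace best children."""
    ranked_parents = sorted(zip(parent_fitness, parents),
                            key=lambda pair: pair[0], reverse=True)
    survivors = ranked_parents[: len(parents) - n_replace]
    ranked_children = sorted(zip(child_fitness, children),
                             key=lambda pair: pair[0], reverse=True)
    merged = survivors + ranked_children[:n_replace]
    return [ind for _, ind in merged], [fit for fit, _ in merged]


population, fitness = replace_worst(
    parents=["a", "b", "c", "d"], parent_fitness=[0.9, 0.5, 0.7, 0.6],
    children=["x", "y"], child_fitness=[0.8, 0.4],
    n_replace=1,
)
print(population, fitness)  # ['a', 'c', 'd', 'x'] [0.9, 0.7, 0.6, 0.8]
```

The population size stays constant, and the best parent always survives as long as `n_replace` is smaller than the population size.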

Table 6
Best five results obtained by the HGA for binary study case

Experiment number	Best fitness obtained
3	0.9727
8	0.9781
11	0.9713
18	0.9727
30	0.9747

According to the results, the highest fitness was obtained in experiment #8, with an accuracy of 0.9781 using 100 epochs. The mean accuracy over the experiments was 0.9650, with a standard deviation of 0.007665.

5.5 Multi-class case for hierarchical genetic algorithm experimentation

The same parameters for the hierarchical genetic algorithm were used in this study case. Table 7 shows the results obtained by the hierarchical genetic algorithm.

Table 7
Results obtained by the HGA for multi-class study case

Experiment number	Best fitness obtained	Experiment number	Best fitness obtained	Experiment number	Best fitness obtained
1	0.7558	11	0.7639	21	0.7544
2	0.7748	12	0.7762	22	0.7557
3	0.7571	13	0.7612	23	0.7680
4	0.7694	14	0.7503	24	0.7366
5	0.7598	15	0.7612	25	0.7748
6	0.7667	16	0.7476	26	0.7571
7	0.7571	17	0.7571	27	0.7735
8	0.7557	18	0.7530	28	0.7353
9	0.7680	19	0.7598	29	0.7639
10	0.7653	20	0.7503	30	0.7598

According to the results, the highest fitness was obtained in experiment #12, with an accuracy of 0.7762 using 90 epochs. The mean accuracy over the experiments was 0.7596, with a standard deviation of 0.009948. The CNN architectures for the binary and multiclass study cases can be observed in Fig. 17.
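The summary statistics can be reproduced from the 30 values in Table 7 with Python's standard library. The reported standard deviation appears to be the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

# Best fitness of each of the 30 HGA runs, copied from Table 7 (experiments 1-30).
best_fitness = [
    0.7558, 0.7748, 0.7571, 0.7694, 0.7598, 0.7667, 0.7571, 0.7557, 0.7680, 0.7653,
    0.7639, 0.7762, 0.7612, 0.7503, 0.7612, 0.7476, 0.7571, 0.7530, 0.7598, 0.7503,
    0.7544, 0.7557, 0.7680, 0.7366, 0.7748, 0.7571, 0.7735, 0.7353, 0.7639, 0.7598,
]

print(round(mean(best_fitness), 4))   # 0.7596
print(round(stdev(best_fitness), 6))  # 0.009948
print(max(best_fitness))              # 0.7762
```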

Table 8

Comparison results for binary study case

Work	Best accuracy obtained	Mean accuracy	Standard deviation
Pre-processing experiment #2	0.9523	0.9401	0.008245
HGA binary experimentation	0.9781	0.9650	0.007665
Modified InceptionV3 [16]	0.9557	–	–

Table 9

Comparison results for multi-class study case

Work	Best accuracy obtained	Mean accuracy	Standard deviation
Pre-processing experiment #6	0.7585	0.7401	0.010519
HGA multi-class experimentation	0.7762	0.7596	0.009948
Modified InceptionV3 [16]	0.7667	–	–
Pre-trained VGG 2016 [17]	0.7700	–	–

Figure 17.

Best architectures of the CNN models for a) binary and b) multiclass study cases.

5.6 Comparison of results

Before presenting the conclusions, the best accuracies for the binary study case are shown in Table 8, and those for the multi-class study case in Table 9. The tables include the accuracies of related work, of the pre-processing experimentation, and of the hierarchical genetic algorithm experimentation.

According to Tables 8 and 9, a statistical test against the related work is not possible because the other authors did not report a mean accuracy or a standard deviation. Therefore, the hypothesis tests below only compare the pre-processing experimentation with the HGA experimentation.

5.6.1 Hypothesis testing for binary case

According to the results in Table 8, the hypothesis test compares “Pre-processing experiment #2” with “HGA binary experimentation”. The HGA binary experimentation obtained a higher mean accuracy, so our claim is that the HGA binary experimentation offers a higher mean accuracy than Pre-processing experiment #2.

To reject the null hypothesis, the test statistic must be less than the critical value of $-1.96$, which is used because the alpha value is 0.05. The test statistic is $-1.7691$. Since $-1.7691$ is not lower than $-1.96$, the test fails to reject the null hypothesis, which means that there is not enough evidence to support the claim. The hypothesis test can be observed in Fig. 18.

Figure 18.

Rejection region for hypothesis testing of binary case.

5.6.2 Hypothesis testing for multi-class case

According to the results in Table 9, the hypothesis test compares “Pre-processing experiment #6” with “HGA multi-class experimentation”. The HGA multi-class experimentation obtained a higher mean accuracy, so the claim is that the HGA multi-class experimentation offers a higher mean accuracy than Pre-processing experiment #6.

To reject the null hypothesis, the test statistic must be less than the critical value of $-1.96$, which corresponds to an Alpha value of 0.05. The test statistic is $-7.3771$, which is less than $-1.96$, so the null hypothesis is rejected. That is, there is enough evidence to support the claim. The hypothesis test is illustrated in Fig. 19.

Figure 19.

Rejection region for hypothesis testing of multi-class case.
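The decision rule used in the multi-class hypothesis test can be sketched in a few lines of Python. This is only a minimal sketch, assuming a two-sample z statistic for the difference of two independent means and 30 runs per experiment. The sample size and the helper name `two_sample_z` are assumptions that are not stated in this section. The means and standard deviations are taken from Table 9, and the critical value is the one reported in the text.

```python
import math


def two_sample_z(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Return the z statistic for mean_a - mean_b of two independent samples."""
    standard_error = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# ASSUMPTION: number of runs per experiment (not stated in this section).
N_RUNS = 30
# Critical value as reported in the text for an Alpha value of 0.05.
CRITICAL_VALUE = -1.96

# Mean accuracy and standard deviation from Table 9 (multi-class study case).
z_multiclass = two_sample_z(
    0.7401, 0.010519, N_RUNS,  # Pre-processing experiment #6
    0.7596, 0.009948, N_RUNS,  # HGA multi-class experimentation
)

# The null hypothesis is rejected when the statistic falls below the critical value.
reject_null = z_multiclass < CRITICAL_VALUE
print(f"z = {z_multiclass:.4f}, reject null hypothesis: {reject_null}")
```

Under the assumed sample size, the statistic comes out close to the value reported for the multi-class case, and it falls inside the rejection region.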

6. Conclusions

In this work, four different pre-processing methods were presented and applied to the same images, taken from the APTOS 2019 database. After that, two CNN models with different numbers of convolutional layers were developed and compared. Two study cases were then prepared: binary and multi-class. The CNN model with fewer convolutional layers offered better results than the one with more convolutional layers. Also, the CNN model with the highest accuracies used the second pre-processing method, which was therefore used for the rest of the work.

The next step was to develop the HGA in order to improve the results obtained by the pre-processing experiments. The same HGA was used for the binary and multi-class study cases. The experiments produced two CNN architectures, one per study case. In the multi-class study case, the improvement over the pre-processing experimentation was supported by the corresponding hypothesis test. In the binary study case, the HGA obtained higher accuracies, but the hypothesis test did not find enough evidence to support the improvement.

As future work, the number of pre-processing methods could be increased or the methods could be changed. For example, a fifth pre-processing method could combine pre-processing methods #2 and #4. Pre-processing method #2 was used throughout the HGA experimentation, so improving that method could further increase the accuracy obtained.

Also, the HGA could be improved by increasing the number of hyperparameters it optimizes. Many hyperparameters are available, such as the kernel size of the Max-Pooling layer or additional activation functions after the convolutional layer. The HGA has its limits, so changing the optimization algorithm could also improve the complete CNN architecture.

Finally, APTOS 2019 is a good database for the detection of DR, but it is not the biggest one, so using a bigger database could help in the detection and classification of this disease. The results obtained show that CNN models are a robust deep learning method for detecting complicated diseases whose diagnosis requires an expert physician.

References

1. et al., Diabetes mellitus as a risk factor for chemotherapy-induced peripheral neuropathy: A meta-analysis, Supportive Care in Cancer 29 (2021), 7461–7469.

2. Rahim S.S., Palade, Shuttleworth and Jayne, Automatic screening and classification of diabetic retinopathy and maculopathy using fuzzy image processing, Brain Informatics 3 (2016), 249–267.

3. Burgess P.I. et al., First Prospective Cohort Study of diabetic retinopathy from Sub-Saharan Africa, Ophthalmology 123 (2016), 1919–1925.

4. Qiao, Zhu and Zhou, Diabetic retinopathy detection using prognosis of Microaneurysm and early diagnosis system for non-proliferative diabetic retinopathy based on Deep Learning Algorithms, IEEE Access 8 (2020), 104292–104302.

5. Sarki et al., Image preprocessing in classification and identification of diabetic eye diseases, Data Science and Engineering 6 (2021), 455–471.

6. Fenwick E.K. et al., The impact of diabetic retinopathy on quality of life: Qualitative findings from an item bank development project, Quality of Life Research 21 (2012), 1771–1782.

7. Mangipudi P.S., Pandey H.M. and Choudhary, Improved optic disc and cup segmentation in glaucomatic images using Deep Learning Architecture, Multimedia Tools and Applications 80 (2021), 30143–30163.

8. Sun, The neural network of one-dimensional convolution-an example of the diagnosis of diabetic retinopathy, IEEE Access 7 (2019), 69657–69666.

9. Somfai G.M. et al., Automated classifiers for early detection and diagnosis of retinopathy in diabetic eyes, BMC Bioinformatics 15(106) (2014).

10. Tong D.L. and Mintram, Genetic Algorithm-Neural Network (GANN): A study of neural network activation functions and depth of genetic algorithm search applied to feature selection, International Journal of Machine Learning and Cybernetics 1 (2010), 75–87.

11. Bharati et al., CO-ResNet: Optimized ResNet model for COVID-19 diagnosis from X-ray images, International Journal of Hybrid Intelligent Systems 17(1–2) (2021), 71–85.

12. Farag H.H. et al., Hyperparameters Optimization for ResNet and Xception in the Purpose of Diagnosing COVID-19, Journal of Intelligent & Fuzzy Systems 41(2) (2021), 3555–3571.

13. Melin and Sánchez, Optimal design of type-2 fuzzy systems for diabetes classification based on genetic algorithms, International Journal of Hybrid Intelligent Systems 17(1–2) (2021), 15–32.

14. Umoh et al., Hybrid intelligent telemedical monitoring and predictive systems, International Journal of Hybrid Intelligent Systems 17(1–2) (2021), 43–57. doi: 10.3233/HIS-210005.

15. Kadam, Jadhav and Yadav, Bagging based ensemble of support vector machines with improved elitist GA-SVM features selection for cardiac arrhythmia classification, International Journal of Hybrid Intelligent Systems 16(1) (2020), 25–33.

16. Vives-Boix and Ruiz-Fernández, Diabetic retinopathy detection through convolutional neural networks with synaptic metaplasticity, Computer Methods and Programs in Biomedicine 206 (2021), 106094–106094.

17. Dekhil et al., Deep Learning based method for computer aided diagnosis of diabetic retinopathy, in: 2019 IEEE International Conference on Imaging Systems and Techniques (IST), 2019, pp. 1–4.

18. Bhandare and Kaur, Designing convolutional neural network architecture using genetic algorithms, International Journal of Advanced Network, Monitoring and Controls 6 (2021), 26–35.

19. Gomez-Cravioto D.A., Diaz-Ramos R.E., Cantu-Ortiz F.J. and Ceballos H.G., Data Analysis and forecasting of the COVID-19 spread: A comparison of recurrent neural networks and time series models, Cognitive Computation (2021).

20. Yang et al., Weakly-supervised convolutional neural networks of renal tumor segmentation in abdominal CTA images, BMC Medical Imaging 20 (2020), 37.

21. Chi and Tian, Deep convolutional neural networks for feature extraction of images generated from complex networks topologies, Wireless Personal Communications 103 (2018), 327–338.

22. Xin, Zhang and Shao, Complex network classification with Convolutional Neural Network, Tsinghua Science and Technology (25) (2020), 447–457.

23. Bhandari, Koppen and Agzarian, Convolutional neural networks for Brain Tumor segmentation, Insights into Imaging 11 (2020), 77.

24. Rodriguez, Gonzalez C.I., Martinez G.E. and Melin, An Improved Convolutional Neural Network Based on a Parameter Modification of the Convolution Layer, in: Fuzzy Logic Hybrid Extensions of Neural and Optimization Algorithms: Theory and Applications. Studies in Computational Intelligence, Castillo and Melin, ed., Springer, Cham, Vol. 940, 2021, pp. 125–147.

25. Ying et al., Rectified exponential units for convolutional neural networks, IEEE Access 7 (2019), 101633–101640.

26. Pratt et al., Convolutional Neural Networks for diabetic retinopathy, Procedia Computer Science 90 (2016), 200–205.

27. Rosenau, Evolution and biogeography: Leading students in Darwin’s and Wallace’s footsteps, Evolution: Education and Outreach 5 (2012), 582–584.

28. Serrano J.I. and del Castillo M.D., On the origin of the evolutionary computation species influences of Darwin’s theories on Computer Science, Artificial Intelligence Review 38 (2012), 41–54.

29. Xing et al., Hierarchical genetic algorithm based RBF Neural Networks and application for modelling of the automatic depth control electrohydraulic system, International Journal of Control, Automation and Systems 9 (2011), 759–767.

30. et al., Semi-supervised auto-encoder graph network for diabetic retinopathy grading, IEEE Access 9 (2021), 140759–140767.

31. Cordero-Martínez, Sánchez and Melin, Comparison of Image Pre-processing for Classifying Diabetic Retinopathy Using Convolutional Neural Networks, in: Hybrid Intelligent Systems. HIS 2021. Lecture Notes in Networks and Systems, Abraham, Siarry, Piuri, Gandhi, Casalino, Castillo and Hung, ed., Springer, Cham, Vol. 420, 2022, pp. 194–204.
