A novel deep learning technique for analysis and detection of ARMD using OCT scan images

Abstract

Age-Related Macular Degeneration (ARMD) is a medical condition that results in blurred or lost vision in the centre of the visual field. Although the disease does not cause complete blindness, it makes day-to-day activities such as reading, driving and recognizing people very difficult. This paper aims to detect ARMD from Optical Coherence Tomography (OCT) scans by detecting drusen in the macula and identifying infected eyes. The images are first passed through Directional Total Variation (DTV) denoising, followed by an active contour algorithm that marks the boundaries of the layers in the macula. The resulting images are then categorized as healthy or infected using a Convolutional Neural Network (CNN), a class of deep neural networks most commonly applied to analyzing visual imagery. Different CNN variants, namely AlexNet, VggNet and GoogleNet, are compared in the experiments, and the results obtained are better than those of traditional methods.

Keywords

1. Introduction

Age-Related Macular Degeneration (ARMD) is an ailment whose prevalence grows rapidly with age. Vision is one of the most important senses for human beings, and ARMD is a leading cause of blindness in people over 50 years old in developed countries. 80% of the affected people have the “dry” form of ARMD and 20% have the “wet” form.

The risk factors for the development of ARMD are given below:

1. Genetics: Various scientific studies have shown that genetics can play a key role in the development of ARMD. The complement factor H (CFH) Y402H polymorphism, which is located on chromosome 1, is associated with a higher risk of ARMD. Alterations in this gene may affect the regulation of inflammation. Patients homozygous for the risk allele of CFH have a seven-fold increased risk of ARMD.
2. Race: Caucasians have a higher likelihood of developing ARMD.
3. Age: Advancing age is an additional risk factor in the development of ARMD.

ARMD is diagnosed by checking the vision and eye pressure and by administering topical drops to the eye to dilate the pupils, which helps to view the retina. To study ARMD, the normal retina, a complex tissue located in the back part of the eye, must first be understood. The retina has various cellular layers; underneath the retina lie the basement membrane and a tissue called the Retinal Pigment Epithelium (RPE). In the initial stages of ARMD, the basement membrane thickens and yellow spots (drusen) become apparent.

The classification of ARMD is performed in the following ways:

i. No ARMD: A person with no drusen or only a few small drusen ( $<$ 63 $\mu$ m) is classified as having no ARMD.
ii. Early ARMD: A person with many small or a few medium-size drusen ( $\geqslant$ 63 $\mu$ m, $<$ 125 $\mu$ m) shows signs of early ARMD.
iii. Intermediate ARMD: A person with many medium-size drusen or $\geqslant$ 1 large drusen ( $\geqslant$ 125 $\mu$ m) is categorized as having intermediate ARMD.
iv. Advanced ARMD: Advanced ARMD takes two forms: neovascular (“wet”), which causes choroidal neovascularization, and non-neovascular (“dry”), which causes geographic atrophy. Untreated advanced ARMD may lead to loss of central vision.

Micronutrient supplementation with certain antioxidants and vitamins is beneficial in the treatment of ARMD. The National Eye Institute has found that a combination of antioxidants provides some benefit and slows the progression of ARMD. This medication contains Vitamin C, Vitamin E, Zinc, Copper, and Beta-carotene or Lutein and Zeaxanthin. Early ARMD must be examined every year and intermediate ARMD every 6 months.

Untreated advanced ARMD in the wet form leads to severe vision loss. The symptoms of advanced ARMD are reduced central vision, central scotoma, distortion, decreased contrast sensitivity and decreased colour vision. The wet form of ARMD is caused by Choroidal Neovascularization, in which new abnormal blood vessels proliferate and penetrate the basement membrane in the setting of large drusen. The clinical features of Choroidal Neovascularization (wet ARMD) are subretinal hemorrhage, subretinal fluid, intraretinal fluid and RPE elevation. Retinal photographs and Optical Coherence Tomography (OCT) are non-invasive imaging tests; OCT provides a 2D cross-sectional view of the retina. Another imaging test is fluorescein angiography, in which an intravenous dye is given to view the retinal and choroidal blood vessels. A newer imaging tool is OCT Angiography (OCTA), a non-invasive method that shows the retinal and choroidal blood vessels without requiring an intravenous dye.

Effective treatments for wet ARMD work by blocking elevated levels of Vascular Endothelial Growth Factor (VEGF) using an intravitreal injection of anti-VEGF medicine, that is, a shot of medicine into the back of the eye. Untreated wet ARMD leads to scar formation, in which the new blood vessels, accompanied by scar tissue that replaces normal retinal tissue, continue to grow, resulting in permanent loss of vision. Therefore, early diagnosis and treatment of wet ARMD is essential. Additional treatments for wet ARMD are being pursued through new medicines with a longer duration of action and through the evaluation of stem cell therapy and gene therapy.

The other advanced form of ARMD is Geographic Atrophy, i.e., “dry ARMD”, which involves the center of the fovea and is accompanied by thinning of the retina and loss of pigmentation that can cause loss of vision. Geographic Atrophy appears as a dark spot on autofluorescence imaging. At this time, there are no proven treatments for dry ARMD, although clinical trials are underway.

Martidis et al. [1] carried out a series of case studies to investigate the effect of intravitreal injection (IVI) of triamcinolone acetonide on visual acuity and macular edema in patients with BRVO. Crabb et al. [2] proposed a drusen analysis method for isolating microgram quantities of drusen and Bruch’s membrane for proteome analysis. This is an experimental method based on spectrometric analysis, in which the proteins are analyzed directly to study the molecular mechanisms of ARMD. Drusen dissections from several patients were carried out to identify the infected ones.

Lee et al. [3] proposed computer-aided diagnosis of Optical Coherence Tomography (OCT) images using deep learning. A large collection of OCT images, both healthy and infected, was selected and a database was constructed. The two sets of images are fed as input to a Convolutional Neural Network, which classifies the images as infected or normal. Ting et al. [4] presented the applications of Artificial Intelligence (AI) and deep learning in ophthalmology. Tian et al. [5] proposed an automatic segmentation of the choroid in enhanced depth imaging optical coherence tomography images. The following sections illustrate the proposed techniques used for the implementation.

Benhamou et al. [6] investigated the effect of intravitreal triamcinolone injection on refractory pseudophakic cystoid macular edema and recorded a significant improvement in visual acuity. Rosenfeld et al. [7] carried out experiments to test the antibody ranibizumab in ARMD. The authors concluded that intravitreal administration of ranibizumab for 2 years prevented vision loss and increased average visual acuity.

Antcliff et al. [8] carried out a series of case studies to investigate the effect of intravitreal triamcinolone acetonide on uveitic cystoid macular edema. Sunness et al. [9] presented a method to measure Geographic Atrophy in ARMD. This clinical method projects a 30 degree image on the macula of the eye. The key points and the atrophic and spared areas present in the image are marked, and the total area categorized as atrophy is computed.

2. Proposed method

The images are first passed through Directional Total Variation (DTV) denoising, followed by an active contour algorithm that marks the boundaries of the layers in the macula. The resultant data is fed to a CNN to identify any abnormality in the input image.

Figure 1.

Proposed technique block diagram.

2.1 DTVD

In image signal processing, total variation denoising is commonly used for noise removal. The method is based on the principle that images with excessive and possibly spurious detail have a large total variation, meaning that the variation between adjacent pixels is much higher than in a clean image. When this variation is reduced while keeping the result close to the original image, the unwanted noise is removed and the edges are preserved. Unlike traditional denoising techniques such as linear and median filtering, which smooth the edges, total variation denoising retains the edges present in the image. To increase the noise removal capability, a directional total variation (DTV) denoising algorithm is proposed. The algorithm is described below.

Algorithm

Step 1. Read the input image.
Step 2. Convert the image from colour to grey scale if the image read is a colour image.
Step 3. Initialize the parameters:

   a. $\lambda$ : Weight of the TV term in the cost function
   b. $\alpha$ : Length of the major axis of the ellipse
   c. MAX_ITER: Maximum number of iterations allowed for the process

Step 4. Define the direction of the ellipse, $\theta$ .
Step 5. Initialize the filters for realizing the Delta operator and its transpose.
Step 6. $h=[1\;\;-1]$ ;
Step 7. $g=[-1\;\;1]$ ;
Step 8. $R=\exp(i\theta)$
Step 9. $\kappa={\displaystyle\frac{{\alpha}^{-2}}{8{\lambda}^{2}}}$
Step 10. In every iteration, calculate

   a. $ux=\alpha\,vx$
   b. $uy=vy$ , where $ux$ and $uy$ are the vector fields in the $X$ and $Y$ directions
   c. $u=R(ux+i\,uy)$
   d. Apply the filters $g$ and $g^{\prime}$ on the real and imaginary parts of $u$ to update $ux$ and $uy$
   e. $u=x-\lambda(ux+uy)$
   f. Apply the filters $h$ and $h^{\prime}$ on the real and imaginary parts of $u$ to update $ux$ and $uy$
   g. $u=\lambda\,\text{conj}(R)(ux+i\,uy)$ ;
   h. Obtain the final $vx$ and $vy$ using the equations

      i. $vx=vx+\kappa\,\text{real}(u)\,\alpha$
      ii. $vy=vy+\kappa\,\text{imag}(u)$

   i. Calculate the magnitude of $v$
   j. Extract the maximum value of the magnitude, $m$
   k. Normalize $vx$ and $vy$ w.r.t. $m$

   End iteration.
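
The iteration above can be read as a projected-gradient scheme on a dual vector field $v$. The following Python sketch is a minimal illustration under that reading, not the authors' implementation. The function names `grad`, `div` and `dtv_denoise`, the forward-difference boundary handling, the default parameter values, and the per-pixel clipping of $v$ to unit magnitude are all assumptions introduced here; only $\lambda$, $\alpha$, $\theta$, MAX_ITER and the step size $\kappa$ come from the steps listed above.

```python
import numpy as np


def grad(x):
    """Forward differences along columns (gx) and rows (gy); zero on the far border."""
    gx = np.zeros_like(x)
    gy = np.zeros_like(x)
    gx[:, :-1] = x[:, 1:] - x[:, :-1]
    gy[:-1, :] = x[1:, :] - x[:-1, :]
    return gx, gy


def div(px, py):
    """Discrete divergence, defined as the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy


def dtv_denoise(y, lam=0.1, alpha=2.0, theta=0.0, max_iter=100):
    """Directional TV denoising sketch (assumes alpha >= 1 and a grey-scale image)."""
    y = np.asarray(y, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    kappa = alpha ** -2 / (8.0 * lam ** 2)  # step size from Step 9
    vx = np.zeros_like(y)
    vy = np.zeros_like(y)

    def estimate(vx, vy):
        # Scale the major-axis component by alpha, rotate by theta,
        # then form the current image estimate from the dual field.
        ux, uy = alpha * vx, vy
        px, py = c * ux - s * uy, s * ux + c * uy
        return y + lam * div(px, py)

    for _ in range(max_iter):
        gx, gy = grad(estimate(vx, vy))
        # Rotate the gradient back by -theta and weight it by lambda.
        rx = lam * (c * gx + s * gy)
        ry = lam * (-s * gx + c * gy)
        vx = vx + kappa * alpha * rx
        vy = vy + kappa * ry
        # Assumption: keep every vector of v inside the unit ball.
        m = np.maximum(1.0, np.hypot(vx, vy))
        vx, vy = vx / m, vy / m
    return estimate(vx, vy)
```

With $\theta$ set along the retinal layers, variation along the layers is penalized $\alpha$ times more strongly than variation across them, which is one plausible reason for preferring a directional penalty on OCT scans.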

2.2 Active contour

Active contour is an image segmentation concept that is based on curves present in the image rather than on pixels. The goal of the algorithm is to obtain a segmentation curve that smoothly conforms to the edges present in the image. The algorithm begins by taking a rough position of the contour as input. Over the iterations, the contour then stretches or contracts to adjust to the nearby edges. This concept is best explained with the help of an energy function. Consider a curve $c$ having energy $E(c)$ . The energy $E(c)$ increases as the curve moves away from its “optimal” position. The energy often also includes conditions that demand smoothness or symmetry of the curve. The total energy is usually subdivided into an external (potential) energy, which generates forces that pull the contour towards the object boundaries in the image data $I$ , and an internal energy, which keeps the contour smooth regardless of the data. The goal is to find the contour that minimizes this total energy; a curve is said to have segmented a shape with good accuracy if its energy is minimal. The energy consists of two parts:

$\displaystyle E(c)=E_{\text{internal}}(c)+E_{\text{external}}(c)$ (1)

here, the $E_{\text{internal}}(c)$ term represents the smoothness of the shape of the contour and $E_{\text{external}}(c)$ represents the relation of the curve with respect to the image pixel values. The internal component is expressed as

$\displaystyle E_{\text{internal}}(c)=\int\nolimits^{1}_{0}{\left(\alpha{c^{\prime}(s)}^{2}+\beta{c^{\prime\prime}(s)}^{2}\right){\rm d}s}$ (2)

here, the term $c^{\prime}(s)$ represents the stretch of the curve and the $c^{\prime\prime}(s)$ represents the smoothness of the curve. $\alpha$ and $\beta$ are constants.

$\displaystyle E_{\text{external}}(c)=\int\nolimits^{1}_{0}{-{\nabla I(c(s))}^{2}{\rm d}s}$ (3)

where $\nabla I(c(s))$ represents the gradient of the image intensity $I$ at the curve point $c(s)$ .

These two terms $E_{\text{internal}}(c)$ and $E_{\text{external}}(c)$ are calculated at every point of the curve. The internal part tries to make the curve as smooth as possible and the external part fits the curve along the edges present on the surface.
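
As a hedged illustration of Eqs. (1) to (3), the sketch below evaluates a discretized version of the two energy terms for a closed contour given as a list of points. The function name `snake_energy`, the finite-difference approximation of $c^{\prime}(s)$ and $c^{\prime\prime}(s)$, and the nearest-pixel sampling of the image gradient are assumptions made for this example; a full active contour would additionally move the points to reduce the total energy.

```python
import numpy as np


def snake_energy(contour, image, alpha=1.0, beta=1.0):
    """Return (E_internal, E_external, E_total) for a closed contour.

    contour: (N, 2) array of (row, col) points sampled along the curve.
    image:   2D grey-scale array I.
    """
    c = np.asarray(contour, dtype=float)
    n = len(c)
    ds = 1.0 / n  # the curve parameter s runs over [0, 1]

    # Finite-difference approximations of c'(s) and c''(s) on a closed curve.
    d1 = (np.roll(c, -1, axis=0) - c) / ds
    d2 = (np.roll(c, -1, axis=0) - 2.0 * c + np.roll(c, 1, axis=0)) / ds ** 2
    e_int = np.sum(alpha * np.sum(d1 ** 2, axis=1)
                   + beta * np.sum(d2 ** 2, axis=1)) * ds

    # Squared image-gradient magnitude sampled at the nearest pixel.
    g_row, g_col = np.gradient(np.asarray(image, dtype=float))
    rows = np.clip(np.rint(c[:, 0]).astype(int), 0, image.shape[0] - 1)
    cols = np.clip(np.rint(c[:, 1]).astype(int), 0, image.shape[1] - 1)
    grad_sq = g_row[rows, cols] ** 2 + g_col[rows, cols] ** 2
    e_ext = -np.sum(grad_sq) * ds

    return e_int, e_ext, e_int + e_ext
```

A contour lying on a strong edge receives a more negative external energy than one in a flat region, and a jagged contour receives a higher internal energy than a smooth one of similar size.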

2.3 Convolutional neural network

In deep learning, a convolutional neural network (CNN) is a class of deep neural networks most commonly applied to analyzing visual imagery. CNNs are a special class of artificial neural networks (ANNs) designed to handle computer vision applications. Like regular ANNs, CNNs have neurons, weights, biases and an objective function to optimize. Images are provided as inputs to the CNN, which analyses the extracted features through sparse connections. CNNs are most popularly used for image recognition, SAR image processing, semantic segmentation and medical image analysis. The general structure of a CNN consists of

• ImageInputLayer: This layer takes the input images.
• Convolution2d layer: These layers perform the convolution operation on the input images to extract features. The size of the filters is specified in this layer.
• ReluLayer: This layer performs a thresholding operation in which negative values are changed to zero.
• Max pooling 2d layer: This layer down-samples the input by dividing it into rectangular pooling regions and keeping the maximum of each region.
• Fully connected layer: This layer connects the intermediate layers to the output.
• SoftmaxLayer: This layer applies the softmax function to convert the outputs into class probabilities.
• Classification layer: This layer comes into play in multi-class classification by computing the cross-entropy loss.

With the help of these layers, CNNs extract texture patterns from images using the pre-trained filters present in the CNN. These patterns are then converted to feature vectors representing edges, shapes, colours, etc.
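
To make the role of each layer concrete, the following sketch runs a single forward pass through one convolution, ReLU, max pooling, fully connected and softmax stage using NumPy only. It is an illustrative toy under stated assumptions, not the networks used in the experiments: all function names are hypothetical, the weights are supplied by the caller (random and untrained in the usage note), and training by backpropagation is omitted.

```python
import numpy as np


def conv2d(image, kernels):
    """Valid cross-correlation of an (H, W) image with (F, kh, kw) kernels."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernels.shape[1:])
    return np.einsum("ijkl,fkl->fij", windows, kernels)


def relu(x):
    """Thresholding: negative values become zero."""
    return np.maximum(x, 0.0)


def max_pool2d(fmaps, size=2):
    """Non-overlapping max pooling over size x size regions of (F, H, W) maps."""
    f, h, w = fmaps.shape
    h2, w2 = h // size, w // size
    blocks = fmaps[:, :h2 * size, :w2 * size].reshape(f, h2, size, w2, size)
    return blocks.max(axis=(2, 4))


def softmax(z):
    """Numerically stable softmax over a 1D vector of scores."""
    e = np.exp(z - z.max())
    return e / e.sum()


def forward(image, kernels, fc_weights, fc_bias):
    """Convolution -> ReLU -> max pooling -> fully connected -> softmax."""
    features = max_pool2d(relu(conv2d(image, kernels)))
    return softmax(fc_weights @ features.ravel() + fc_bias)


def cross_entropy(probs, label):
    """Cross-entropy loss for one sample with an integer class label."""
    return -np.log(probs[label])
```

For an 8 $\times$ 8 input and four 3 $\times$ 3 kernels, the pooled features have shape (4, 3, 3), so a two-class fully connected layer (healthy versus infected) needs a weight matrix of shape (2, 36).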

In a CNN, an image is given to a neural network model, which is basically a function that returns the probability of each object class being present in the image. For example, a model covering a thousand object classes outputs a thousand probabilities, one for each class the object in the image could belong to. The CNN is broken down into a smaller set of functions called layers, each of which is itself a neural network. A neural network can have a single layer or multiple layers; the multiple-layer neural network is shown in Figure 2.

Figure 2.

Multiple-layer neural network model.

2.4 AlexNet

AlexNet was one of the first CNNs used for large-scale image classification. It outperformed all previous non-deep-learning models by a significant margin, and it started the surge of ConvNet research and usage that followed. The basic AlexNet architecture is a conv layer followed by a pooling layer and normalization, then another conv, pool, norm block, then a few more conv layers, a pooling layer, and finally several fully connected layers.

Figure 3.

CNN architecture for AlexNet.

Figure 4.

CNN architecture for VggNet.

2.5 VggNet

The VGG network has much smaller filters than AlexNet. It is also much deeper than its predecessor, as VggNet has more layers than AlexNet. A key design choice is that very small filters are maintained throughout: only 3 $\times$ 3 convs are used all the way, which is basically the smallest conv filter size that still looks at a little of the neighbouring pixels. The network uses a simple structure of 3 $\times$ 3 convs with pooling.
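
One way to see the benefit of small filters is to count weights. The sketch below is an illustrative calculation, not a figure from the experiments: it assumes layers with equal numbers of input and output channels (the hypothetical value 64), stride 1, and no biases. Under these assumptions, three stacked 3 $\times$ 3 conv layers cover the same 7 $\times$ 7 receptive field as a single 7 $\times$ 7 layer while using fewer weights.

```python
def conv_weights(kernel, channels, layers=1):
    """Weights in `layers` square conv layers with `channels` inputs and outputs each."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convs with the same kernel size."""
    return 1 + layers * (kernel - 1)


channels = 64  # hypothetical channel count chosen for illustration
stacked = conv_weights(3, channels, layers=3)  # three 3x3 layers
single = conv_weights(7, channels)             # one 7x7 layer
print(receptive_field(3, 3), receptive_field(7, 1))  # 7 7
print(stacked, single)                               # 110592 200704
```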

2.6 GoogleNet

GoogleNet has 22 layers, making it a deeper network, but the primary concern of this network is computational efficiency. This is achieved using the inception module, with many inception modules stacked on top of each other. The fully connected layers were removed from the network, resulting in twelve times fewer parameters than AlexNet, which had 60 million, even though GoogleNet is much deeper.

3. Experimental results

The experiments were performed on the SDOCT data set freely available from Kaggle.com. The images are of size 512 $\times$ 1000, and a total of 2500 images were chosen for the experimentation.

Figure 5 shows the input ARMD image. Figure 6 shows the DTV output. Figure 7 shows the result of the active contour marking the boundary, and Fig. 8 shows the binary image marking the layer. In the obtained binary image, the upper edge indicates the presence of drusen. A smooth curve indicates a healthy eye and a crooked or rough curve indicates an infected eye.
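
The distinction between a smooth and a rough upper edge can be quantified in several ways. As a hedged illustration, the sketch below extracts the upper edge of a binary layer mask and scores its roughness as the mean absolute second difference of the edge height. The function names and the roughness measure are assumptions made here and are not part of the proposed method, in which the binary image is passed to the CNN for classification.

```python
import numpy as np


def upper_edge(mask):
    """Row index of the first foreground pixel in each column (NaN for empty columns)."""
    mask = np.asarray(mask, dtype=bool)
    first = mask.argmax(axis=0).astype(float)
    first[~mask.any(axis=0)] = np.nan
    return first


def roughness(edge):
    """Mean absolute second difference of the edge height; 0 for a straight edge."""
    edge = edge[~np.isnan(edge)]
    if edge.size < 3:
        return 0.0
    return float(np.mean(np.abs(np.diff(edge, n=2))))
```

A perfectly flat layer gives a roughness of zero, while a bump in the upper edge, such as the one a drusen would produce, gives a positive value.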

Figure 5.

Input ARMD noisy image.

Figure 6.

DTV output.

Figure 7.

Active contour result.

Figure 8.

Binary image.

Figure 9.

Smooth upper edge.

Figure 10.

Rough upper edge.

Figure 11.

(a) Healthy eye input image; (b) Healthy eye binary result.

Figure 12.

(a) Infected eye input image; (b) Infected eye binary result.

Figures 11–13 show the input images and the final binary results given as input to the CNN module. Figure 11 represents healthy eye images and Figs 12 and 13 represent infected images.

Figure 13.

(a) Infected eye input image; (b) Infected eye binary result.

4. Case study

We collected fundus and OCT scan images of 25 patients (16 male and 9 female) from East and West Godavari Districts, Andhra Pradesh, India, which were used to validate the proposed method on real-time images. The scans are presented in Figs 14 and 15.

Figure 14.

Macula multi cross OCT scan image of male aged 71.

Figure 15.

Macula multi cross OCT scan image of female aged 64.

Table 1

Comparison of accuracy

Training images	CNN AlexNet	VggNet	GoogleNet
70 percent training	75.34%	77.20%	78.54%
80 percent training	80.98%	81.85%	82.74%
90 percent training	84.54%	85.58%	86.24%

Table 2

Approximate drusen lengths on 100 OCT scan images

	$<$ 63 $\mu$ m	$\geqslant$ 63 $\mu$ m and $<$ 125 $\mu$ m	$\geqslant$ 125 $\mu$ m
Image count	24	54	22

After preprocessing and applying the proposed method, the images were segmented and the drusen were clearly identified.

Figure 16.

(a) Real time OCT input image; (b) Real time OCT image binary result.

Figure 17.

(a) Input image; (b) Active contour output; (c) obtained drusen length.

5. Analysis

The active contour algorithm is used to calculate the width of the region between the choroid and the retina. This analysis helps to determine the length of the drusen and thereby the stage of ARMD. The width of the region is measured in every column, from the left to the right of the image. For a healthy eye, the width of the region remains constant along the entire image. For an infected eye, the minimum width is calculated, which in turn measures the length of the drusen. This phenomenon is illustrated in Figure 17.
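
The column-wise width measurement described above might be sketched as follows. This is one possible reading rather than the authors' procedure: the segmented region is assumed to be available as a binary mask, the drusen extent is taken to be the longest run of columns whose width deviates from the median width, and the function names and the `tolerance` parameter are hypothetical.

```python
import numpy as np


def column_widths(mask):
    """Number of foreground pixels in each column of a binary region mask."""
    return np.asarray(mask, dtype=bool).sum(axis=0)


def longest_deviation_run(widths, tolerance=2):
    """Longest run of columns whose width differs from the median by more than tolerance."""
    widths = np.asarray(widths, dtype=float)
    deviating = np.abs(widths - np.median(widths)) > tolerance
    best = run = 0
    for flag in deviating:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best
```

For a healthy eye the widths are constant and the run length is zero; for an infected eye the run length, in pixels, gives an estimate of the horizontal extent of the drusen.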

The database images were captured using Bioptigen, Inc (Research Triangle Park, NC) equipment at 4 clinics located in different places. The images cover a retinal cross section of 6.7 mm $\times$ 6.7 mm, which makes the scale of the image approximately 1 pixel to 6 $\mu$ m. The lengths of the drusen are averaged to obtain the final drusen length, and the category of the drusen is identified from this value. In the experiment, 100 drusen images were taken and categorized accordingly.
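
The conversion from pixels to micrometres and the subsequent categorization can be sketched as below. The scale of 6 $\mu$ m per pixel is the value stated above, and the 63 $\mu$ m and 125 $\mu$ m thresholds are those given in the Introduction; the function names and the labels "small", "medium" and "large" are assumptions chosen for this example.

```python
MICRONS_PER_PIXEL = 6.0  # scale stated in the text (1 pixel = 6 micrometres)


def drusen_length_um(length_px):
    """Convert a drusen length from pixels to micrometres."""
    return length_px * MICRONS_PER_PIXEL


def drusen_category(length_um):
    """Size category using the 63 and 125 micrometre thresholds."""
    if length_um < 63:
        return "small"
    if length_um < 125:
        return "medium"
    return "large"


def categorize(lengths_px):
    """Average the measured lengths (in pixels) and return (mean length in um, category)."""
    mean_um = drusen_length_um(sum(lengths_px) / len(lengths_px))
    return mean_um, drusen_category(mean_um)
```

For example, measured lengths of 8, 10 and 12 pixels average to 10 pixels, that is 60 $\mu$ m, which falls in the small category.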

6. Conclusion

OCT scans reveal the drusen present in the macula region in the fundus of the human eye caused by ARMD. Reading these images requires technical expertise and time. The purpose of this work is to evaluate the performance of deep learning with image augmentation for detecting ARMD automatically. The proposed method involves DTV-based preprocessing for denoising, active contour for segmentation and CNN for classification. The results obtained outperform the existing techniques.

Acknowledgments

We sincerely thank Dr. T. Krishna MBBS, DOMS, FIVRS, Vitreo Retina Surgeon for providing real time SDOCT scans for carrying out our experimentation and validation.

References

[1] Martidis, Duker J.S., Greenberg P.B., et al., Intravitreal triamcinolone for refractory diabetic macular edema, Ophthalmology 109(5) (2001), 920–927.

[2] Crabb J.W., Miyagi, Shadrach, West K.A., Sakaguchi, Kamei, Hasan, Yan, Rayborn M.E., Salomon R.G. and Hollyfield J.G., Drusen proteome analysis: An approach to the etiology of age-related macular degeneration, Proc Natl Acad Sci USA 99 (2002), 14682–14687.

[3] Lee C.S., Baughman D.M. and Lee A.Y., Deep learning is effective for the classification of OCT images of normal versus age-related macular degeneration, Ophthalmol Retina 1 (2016), 322–327.

[4] Ting D.S.W., Pasquale L.R., Peng, et al., Artificial intelligence and deep learning in ophthalmology, Br J Ophthalmol 103 (2019), 167–175.

[5] Tian, Marziliano, Baskaran, Tun T.A. and Aung, Automatic segmentation of the choroid in enhanced depth imaging optical coherence tomography images, Biomed Opt Express 4(3) (2013), 397–411.

[6] Benhamou, Massin, Haouchine, Audren, Tadayoni and Gaudric, Intravitreal triamcinolone for refractory pseudophakic macular edema, American Journal of Ophthalmology 135(2) (2003), 246–249.

[7] Rosenfeld P.J., Brown D.M., Heier J.S., et al., Ranibizumab for neovascular age-related macular degeneration, New England Journal of Medicine 355(14) (2006), 1419–1431.

[8] Antcliff R.J., Spalton D.J., Stanford M.R., Graham E.M., Ffytche T.J. and Marshall, Intravitreal triamcinolone for uveitic cystoid macular edema: An optical coherence tomography study, Ophthalmology 108(4) (2001), 765–772.

[9] Sunness J.S., Bressler N.M., Tian, Alexander and Applegate C.A., Measuring geographic atrophy in advanced age-related macular degeneration, Invest Ophthalmol Vis Sci 40 (1999), 1761–1769.