Abstract
This paper presents a supervised method for blood vessel segmentation in digital retinal images by a combination of learning and classification. For an image, the method defines and computes pixel strength as primary features for conservatively computing vessel and background pixels as preliminary segmentation, from which the main segmentation selects training data to learn a neutral network (NN) classifier on the fly. Each pixel in the training data set is represented by an 8-D vector composed of intensity descriptor and pixel strength features, and the learned classifier for the image is next applied to classify the undetermined pixels. The segmentation results are further refined by filtering out the outliers. The method was evaluated on the publicly available DRIVE database, and the results showed better or comparable performance when comparing with other existing solutions in literature. The much better sensitivity and robustness of our approach with different image conditions make it potentially suitable for clinical applications such as automated screening for early diabetic retinopathy detection, and auto- and semi-automatic grading of diabetic retinopathy.
Introduction
Automatic blood vessel segmentation in retinal images has been recognized as a key to quantitative analysis of retinal conditions, which is crucial for timely and accurate diagnosis of retinopathy. Vessel segmentation methods informally can be categorized in two groups: rule-based methods and supervised methods. The rule based methods include matched filters [1–8], adaptive thresholds [3, 9], intensity edges [10, 11], region growing [12], statistical inferencing [13], mathematical morphology [12, 15], andHessian measures [16–18]. While supervised methods are based on pixel classification, which classify each pixel into two classes: vessel and background. Normally a classifier is trained by supervised learning with manually segmented images as training data. For most image segmentation tasks, nonlinear classifiers are preferable to linear classifiers. The nonlinear classifiers such as the KNN method [19, 20], support vector machines [21], Bayesian classifier [22], or neural networks [23] usually perform well in vessel segmentation, but further improvement is still needed especially when these image conditions exist: (1) vessel central light reflex; (2) poor contrast for bifurcation and crossover regions; (3) thin vessels; (4) noisy optic disc and pathological regions; (5) overlapping or closely neighboringof vessels.
All the above conditions can cause the inaccurate segmentation and measurements of vascular structures. The impact of these factors to the vessel segmentation of some existing methods is illustrated in Fig. 1 also revealing the limitations of these approaches. For the segmentation result of Staal et al. [18, 19], there are fragmented vessel, as well as part of the optic disc contour misclassified as vessel. For Soares et al. [21] method, some of the closely neigh- boring vessels are merged together. Both methods missed some thin vessels, and some of the central part of vessels due to their inability to handle the central light reflex. Ricci-line [24] segmented the image with many disconnected vessel segments, and some of the crossing regions are somehow merged together. Ricci-svm [24] segmentation is less noisy, and extracted vessels are less fragmented than [21]. It, however, also missed some thin vessels, and produced some false positives inthe results.
In addition, there are signal or intensity variations between images, supervised methods based on global training data set (all images) are not adaptively sensitive to variations between individual images. This is one of the issue we tried to address in our proposed method.
Instead of performing classifier training offline first, and then do segmentation using learned classifier as most classification systems do, we embed or integrate the classifier training inside the image processing and segmentation process, which is similar, in a way, to classification using online learning [25]. For each image a simple and conservative method based on pixel strength is used for preliminary extraction of vessel pixels. By setting a biased threshold for pixel strength, the method guarantees to minimize the false positives. The training samples are then chosen automatically from the preliminary segmentation result, and used forsubsequent classifier training.
The embedded learning is designed for adaptive vessel segmentation and carried out dynamically for each individual image. We choose a neural network in our implementation. The training and classification use a richer feature vector including both pixel strength and intensity based descriptor. Both pre-processing and post-processing are also performed in our method. As an example, Fig. 1(f) shows an improved segmentation result by our method applied to the same image.
Experimental results with DRIVE database showed a promising potential of the proposed method forapplications in real clinical environments.
This paper is structured in four sections, Section 2 describes in details the proposed vessel segmentation method, Section 3 is the experiment and evaluation, we concluded the discussion in Section 4.
Vessel segmentation by embedded learning and classification
Some vessels in retinal image are very thin, because of the weak contrast, noise, nonuniform illumination etc., it is difficult to completely segment all the vessels, sometimes the segmentation includes some false positives, i.e. some background pixels are misclassified as vessel pixels, and the optic disc edges may also be segmented as vessels. So the image needs to be preprocessed so as to enhance the image contrast and remove the noise.
Preprocessing
Contrast enhancement
One of the main problems faced in vessel extraction of low resolution images is the efficient extraction of the smaller vessels. It is difficult to segment them out because of the low contrast with the background [26]. So it is necessary to enhance the contrast of the retinal images. Normally the enhancement was done on the green channel because it offers the highest intensity contrast between the vessels and the background [27]. This paper used contrast-limited adaptive histogram equalization (CLAHE) [28] to enhance the contrast adaptively through the image by constraining the maximum slope in the transformation function as shown in Fig. 2, T (r) is the transformation function determined by the location of points (r 1, s 1) and (r 2, s 2). Divide the image into small non-overlapping regions, and CLAHE is applied on the small regions. Then, the neighboring tiles are combined using bilinear interpolation to reduce induced boundaries. In this paper, the image is divided into 8×8 pixels, and the histogram is generated with 128 bins. Figure 3 shows the result of CLAHE contrast enhancement on green-channel of retinal images.
Local adaptive noise filtering
The contrast of the image has been enhanced by CLAHE, yet the noise still exists in the image which will affect the vessel segmentation accuracy, so noise filtering needs to be performed. The traditional noise filtering techniques such as mean filter or Gaussian filter will neutralize the contrast enhancement, hence this paper proposed a local adaptive noise filtering (LANF)technique, which will filter the pixel based on itsneighbor context.
For each pixel, a rectangle window is set with size no larger than the vessel width so as to avoid filtering the whole vessel, to determine whether a pixel needs to be filtered or not, the pixel intensity status is analyzed within the window, the average intensity a and variance v is calculated, and a variance threshold t v is set. If v <t v , then the region in the window is considered to be homogenous, and they may be all vessel pixels or background pixels; in the homogenous region, if the intensity of the center pixel has a large difference from the average intensity a, then it is considered to be noise, and will be replaced by the color value composed of average R,G,B value of all the pixels within the window; if v ≥ t v , the region is considered to be heterogenous region, then no filtering will be performed. By this way, the salt and pepper noise may be eliminated which can reduce the false positives. The effect of the adaptive filtering method for the segmentation result is shownin Fig. 4.
Algorithm design
The fundamental idea underlying the proposed method is two-folds: be sensitive to local variation (between images), and combine a robust method for preliminary segmentation, and training sampleselection for embedded learning.
Definitions
Given a matrix I n×m to represent an n × m two dimensional image, n is the image width, and m is the image height. Centered at each pixel, 18 lines with different angles θ (θ is relative to the horizontal axis) are generated, and each line Line θ has fixed numberof pixels.
Different from [29, 30], this paper used a fixed window, so M is fixed, from the 18 lines, the line with largest average intensity is chosen to compute the pixel strength of p.
In the feature extraction stage, pixels are computed and the image is characterized by a feature vector. In the classification stage, the vector is used to discriminate vessel and background. Two sets of featuresare used. Intensity-level based features: within a certain sized window, the features are defined based on the difference of the intensity values between the candidate pixel and its surrounding pixels. Pixel strength based features: within a certain sized window, the features are defined based on the difference of the pixel strength values between the candidate pixel and its surrounding pixels.
1) Intensity-level based features: according to the fact that the blood vessels are always darker than the background, and vessels are surrounded by large area of background. So for the vessel pixels, there should be a certain number of background pixels whose intensity is higher than it. Choose a window with certain size w × w to define a neighborhood of the center pixel (x, y), and features based on the intensity-level variation of pixels within the neighborhood are determined. Let represent the set of coordinates in the window, and I E represent the enhanced image. Then, the feature descriptors can be expressed as:
2) Pixel strength based features: as discussed in 2.2.3, the vessel pixel has higher pixel strength than the background pixel, and for a real vessel pixel, it should have certain amount of background pixels whose pixel strength is lower than it. So within a w × w sized square window centered at the point (x, y), features based on the pixel strength difference between surrounding pixels and the center candidate pixel are determined, let S E represent the pixel strength matrix of enhanced image I E . Then, the feature descriptors can beexpressed as:
In the retinal image, background pixel has larger intensity value than vessel pixel. Let I vessel be the vessel pixel intensity, and I background be the background pixel intensity, so I vessel < I background . For a line whose length is l, let l vessel represent the number of vessel pixels, and l background represent the number of background pixels.
The average intensity L on the line is calculated by Equation (10).
If p is a vessel pixel, then l vessel >0, and there are S vessel vessel pixels in the window. Normally S vessel >> l vessel , so the average intensity M of the window is lower than the average intensity L of the line, and for this reason, a vessel pixel has larger pixel strength.
If l vessel = 0, then p is a background pixel, and the average intensity L will be larger, while most or even all of the pixels within the window centered at p are background pixels, and the average intensity M is close to or even larger than L, so it has smaller pixel strength.
Based on the intensity and pixel strength features of vessel and background, if a pixel has smaller intensity and larger pixel strength, then it has higher probability to be a vessel pixel, otherwise it has higher probability to be a background pixel.
Initial segmentation & training data selection
As mentioned above, a pixel with higher pixel strength has higher probability to be a vessel pixel. A pixel strength threshold T is set to determine if a pixel belongs to vessel or background. To make it more adaptive, T is computed for each retinal image. Firstly, a pixel strength is calculated for every pixel, and a histogram is generated by analyzing the strength distribution status, see Fig. 5. Each bin represents the number of pixels with the same pixel strength. For a 565×584 retinal image, the number of vessel pixels is between 20000 and 30000. So we choose the bin with pixel number between 10000 and 20000 and its right neighbor bin has less than 10000 pixels, here in this histogram, T is set to be 3. For each pixel in the image, if its pixel strength is larger than T, then it is marked as vessel, otherwise marked as background. The segmentation result is shown in Fig. 6.
As shown in Fig. 6, the above simple threshold method can only extract the vessel skeleton, and thus the black pixels include the true background and some vessel pixels, so the segmentation result needs to be processed to search the missed true vessel pixels from the black pixel set S Black , and a learned classifier is used for further refinement.
Neural network learning & classification
In the feature extraction stage, each pixel in a retinal image is characterized by a vector in an 8-D feature space.
Now, based on the feature vector, each undetermined pixel in S Black will be classified as vessel(C 1)(in white in Fig. 6) or background(C 2)(in black). To learn a classifier adaptively for the single image, the labeled pixels produced in the preliminary segmentation are used as training samples as shown in Fig. 6, where the white pixels represent the vessel, and the black pixels include background and some true incorrectly labeled vessel pixels due to the deliberately pre-set high threshold to guarantee no false positives are generated or at least false positives are minimized. Both positive samples and negative samples are selected from this labeled pixel pool with a distance criteria to ensure the quality of the samples. A local classifier is then trained for this image. The same training process is repeated for each subsequent image in tandem. The classifier learned this way for each image is more sensitive to individual image properties than a single global classifier learned from a group of images. In this paper we select a multilayer feedforward NN as our classifier, and the classification stage is composed of two phases.
1) Neural Network Design: A multilayer feedforward network composed of an input layer, three hidden layers and an output layer is used in this paper. For the input layer, there are a number of neurons equal to the dimensions of the feature vector(in this context, 8 neurons). While each of the hidden layers contains 15 neurons which will provide optimal NN configuration. The output layer has only one single neuron linked to a non-linear logistic sigmoid activation function which will limit the output range within 0 and 1.
As we can see from Fig. 6, with a higher pixel strength threshold, vessel pixels with stronger strength values are segmented with little or no noise introduced, so these pixels are used as samples whose classification results are C
1; while in S
Black
, there are vessel and background pixels. How to get the real background samples is important. As discussed previously, the pixel with lower pixel strength has a higher probability to be background, so by setting a lower threshold, those real background pixels can be extracted, and by this way the training set S
t
is obtained. Suppose there are N candidates in S
t
represented by the feature vector F defined from Equation (11), and the classification results (C
1 or C
2: vessel or background) are known, then S
t
can be represented as
Once S t is established, NN is trained by adjusting the weights of the connections through errorinterpretation.
2) Neural Network Application: at this stage, the trained NN is used to refine the segmentation result by searching more vessel pixels from S Black . The pixel strength threshold T is reduced gradually, and those pixels whose strength values are higher than T are candidates which will be classified by NN. Each candidate is represented by the feature vector defined from (11) and individually passed through the NN. Since a logistic sigmoidal activation function was selected for the single neuron of the output layer, the NN decision determines a classification value between 0 and 1 which is called probability P (x, y) of a pixel belongs to vessel. In order to obtain a binary segmentation result, a threshold P T is set to determine whether a pixel of the undetermined set S Black whose pixel strength is above T belongs to vessel(C 1) or background(C 2). So a refined classification result image I R is generated by assigning class C 1 an intensity value 255 and C 2 an intensity value 0 as shown in Fig. 7. Mathematically
Where P (C 1, F (x, y)) represents the probability of a pixel (x, y) described by feature vector F (x, y) belongs to class C 1.
As shown in Fig. 8, there are some discrete regions which contain small number of pixels, and some of these regions are noise which need to be removed like the regions marked by red ellipses, but some are vessel segments and should be remained like the regions marked by red rectangles. How to discriminate the two kinds of regions is a challenge.
As we can see from Fig. 8, the red ellipses are normally discrete regions with a relatively large distance to other regions, while small vessel segments are often very close to the main vessel skeleton, based on this feature, a double-window is set to screen the noise regions, which is a small window surrounded by a large window as shown in Fig. 9, then slide it on the segmentation result such as Fig. 7, if some small discrete regions can be completely surrounded by the small window, then check whether the large window can also completely include the region, if it is true, which means that the region is far from other vessels, then the regions are considered to be noise and should be removed. Here the function of the large window is to discriminate noise and vessel segments; if some region can be surrounded by small window but cannot be surrounded by large window as shown in green double window, then they are considered as small vessel segments which should be remained. In order to catch more small regions, the double-window size L is set as 4 ≤ L ≤ 100, and the size difference between the two windows is 2. The result is shown in Fig. 10.
Experiment and evaluation
In this paper, the Digital Retinal Image for Vessel Extraction (DRIVE) database of retinal images is used. It contains 40 retinal images with 565×584 size, and it provides ground truth image which contains vessels segmented manually by two experts to test the segmentation accuracy. Our method is implemented by java, under Windows 7 platform, with Intel(R) Core(TM)2 Duo CPU E7600 @3.06GHZ 3.06GHZ and 4GB RAM.
In order to test the segmentation performance of our algorithm, this paper used Accuracy (Acc), Sensitivity(SN), Specificity(SP) and Positive Predictive Value(PPV). These metrics are defined based on the terms in Table 1.
The accuracy(Acc) is measured by the ratio of the total number of correctly classified pixels (TP+TN) and the number of pixels in the image. Sensitivity (SN) reflects the ability of an algorithm to detect the vessel pixels. Specificity(SP) is the ability to detect background pixels. The Positive Predictive Value (PPV) gives the proportion of identified vessel pixels which are true positives. These metrics are defined by the following equations:
The average of the selected performance metrics obtained for the DRIVE database is shown in Table 2.
This paper also used ROC curve to show the good performance of our proposed method, in ROC curve, the closer the curve to the left top corner, the better of the method performance, and the performance can also be evaluated by the area AUC under the curve, the closer of AUC to 1, the better of the method performance. Figure 11 is the ROC curve of the proposed method for vessel segmentation on retinal images of DRIVE database.
The performance of the proposed method is compared with state of the art algorithms published in the last decade in Table 3 for DRIVE. Figure 12 shows some segmentation results of the retinal images in DRIVE compared with the ground truth images.
As shown in Table 3, our method shows an overall better or comparable performance in terms of accuracy and AUC measurements comparing with all other tested methods. Interestingly, our method does show a significantly better sensitivity than all other tested methods but a below average specificity. This may be explained by the effect of the embedded learning based segmentation refinement which is done within a single image rather than using whole set of gold standard image data. Each image is segmented using its own classifier being sensitive to the local characteristic of the image. The other possible explanation is the missing of true positives in some of the images from the gold standard dataset. We reviewed and compared, with our eye specialist, original images in the dataset against mark-up images and realized this is the case.
Automatic analysis of blood vessels is a very important task in many clinical investigations and scientific research related to retina vascular features. To address some limitations in handling some poor image quality and cross-image variations, this paper proposed a supervised vessel segmentation method by embedded learning and classification using intensity descriptor and pixel strength features, which is carried out dynamically and adaptively for each individual image. The method was tested with the DRIVE database, and the results showed very promising potential for clinical applications such as automated screening for early diabetic retinopathy detection, and auto- and semi-automatic grading of diabetic retinopathy. The proposed method has a relatively high computational complexity mainly due to the embedded learning. So we plan in our future work to implement a parallel algorithm of the method to significantly improve the speed of the current implementation. In some cases of poor image quality, the method also missed parts of vessel especially with thin vessels. In addition to further improving the performance of the method, we will incorporate vessel width and tortuosity measurements in the method and develop an interactive vessel analysis software tool for ophthalmologists.
Footnotes
Acknowledgments
This work was partially supported by National Natural Science Foundation of China, under grant No. 61373079,61272244,1175023,61175053,61272430, Natural Science Foundation of Shandong Province, under grant No. ZR2013FL022, Key Project of Chinese Ministry of Education, under grant No. 2012101.
