Abstract
BACKGROUND:
Carotid atherosclerotic plaque rupture is an important cause of myocardial infarction and stroke. Effective segmentation of ultrasound images of carotid atherosclerotic plaques helps clinicians assess plaque stability accurately. At present, this procedure relies mainly on medical practitioners manually segmenting the ultrasound image of the plaque, which depends on their experience and is time-consuming.
OBJECTIVE:
This study aims to establish an automatic, intelligent segmentation method for ultrasound images of carotid plaque.
METHODS:
The present study combined the U-Net and DenseNet networks to automatically segment ultrasound images of carotid atherosclerotic plaques. The same test set was also segmented using the traditional U-Net network and the ResUNet network. The prediction results of the three network models were compared using the Dice similarity coefficient (Dice) and the volumetric overlap error (VOE).
RESULTS:
Compared with the existing U-Net network and ResUNet network, the Dense-UNet network achieved the best performance in the automated segmentation of the ultrasound images.
CONCLUSION:
The Dense-UNet network could realize the automatic segmentation of atherosclerotic plaque ultrasound images, and it could assist medical practitioners in plaque evaluation.
Introduction
The continuous aging of human populations and the acceleration of urbanization have increased the risk factors of cardiovascular disease (CVD) incidence in Chinese residents. A rapid growth and individual aggregation trend is noted in younger, lower-income groups. The number of patients with cardiovascular disease will continue to grow rapidly in the next 10 years [1].
Carotid atherosclerosis is a major risk factor for the serious cardiovascular events reported in recent years, such as ischemic stroke. Once carotid atherosclerosis develops, it threatens the patient’s life and can seriously impair quality of life. CVD monitoring and diagnosis are therefore of great significance [2]. According to the “Report on Cardiovascular Health and Diseases in China 2020”, the number of patients with cardiovascular disease is estimated at 330 million, of whom 13 million are stroke patients. The formation of carotid atherosclerotic plaque is part of the process of carotid atherosclerosis, and the apparent features of the plaque are closely related to the occurrence of ischemic stroke [3]. Therefore, investigating the stability of the carotid plaque is important for the prevention of cardiovascular diseases. Ultrasound imaging is currently the preferred method for screening carotid atherosclerotic plaques because of its ease of use, low cost, and safety [4]. It can show the anatomy of the blood vessels and the characteristics of the plaque [5].
Quantitative analysis has important clinical significance for accurately assessing the severity of cardiovascular disease and the effect of the corresponding treatment, and medical image segmentation is a key step in such analysis. However, no authoritative semi-automatic or automatic method, at home or abroad, yet segments the carotid atherosclerotic plaque accurately. Practice is still at an early stage: the medical practitioner visually inspects the medical image of the carotid atherosclerotic plaque and manually segments the plaque boundary, so the segmentation result is greatly influenced by human factors.
Although segmentation is one of the most common medical image processing tasks, ultrasound images are susceptible to noise and their imaging resolution is not high. In addition, carotid atherosclerotic plaques and blood vessels are irregular in morphology. Factors such as the unclear boundary between the vascular wall and the plaque affect the results of automatic segmentation of the ultrasound image. The edge of the carotid plaque is also difficult to define, so the segmentation of the vascular wall and the carotid plaque area is prone to errors.
Before deep learning was proposed [6], the common methods for segmenting carotid plaques mainly included threshold segmentation based on the Snakes model [7, 8, 9], level set segmentation [10], and clustering-based segmentation [11]. These methods lose a notable amount of edge information. In recent years, significant progress has been made with deep learning, which is widely used in the field of computer vision. Applied to image segmentation, deep learning can perform the task end to end and reduce labor costs. Shin J et al. proposed the use of AlexNet to segment carotid atherosclerotic plaques [12, 13]. It was mainly used to define the region of interest (ROI) in carotid ultrasound images and thereby improve the segmentation accuracy, but the method required a large number of manual markers in the ultrasound image pre-processing stage. In 2015, Ronneberger et al. proposed U-Net [14], a new image segmentation method that has been used mainly in the medical field. A large number of studies have investigated ultrasound image segmentation of carotid atherosclerotic plaques using variants of the basic U-Net network, in which components suited to the segmentation task are added [15]. Chen et al. merged the blood vessel map into the U-Net input, so that the network highlights the characteristics of the carotid plaque and the segmentation is significantly improved. However, because of the complexity of the edge contour of the carotid plaque, part of the contour is missing in the segmentation result. Because carotid ultrasound images are complex, even experienced physicians may draw different conclusions during diagnosis, which makes the task more difficult for a deep learning network [16]. Li et al. used the H-Dense-UNet network [17] to segment 3D liver CT images. However, ultrasound images are more blurred than CT images, so that method could not be directly applied to the carotid artery.
In response to these problems, the carotid vessel and the atherosclerotic plaque were labeled in the ultrasound images, and the images were segmented by a network that combines DenseNet [18] with the U-shaped convolutional neural network. The network merges the advantages of the dense connection path and the U-Net connection. The densely connected path originates from the densely connected network DenseNet, whose improved information flow and parameter efficiency reduce the difficulty of training a deep network. Unlike DenseNet, the network adds U-Net connections between the encoding and decoding parts of the architecture. These connections allow the network to preserve low-level spatial features for better exploration of context within the image and improved segmentation of the arterial plaque. Segmentation performance is important for the diagnosis of the arterial plaque.
Methods and materials
The introduction of the U-Net network
The U-Net and the FCN (Fully Convolutional Network) appeared almost simultaneously [19]. Compared with the FCN, the U-Net network is completely symmetrical: the downsampling path on its left side mirrors the upsampling path on its right side. To merge feature maps, U-Net uses a stacking (concatenation) operation, whereas the FCN uses an addition operation.
The classic U-Net network consists of four downsampling operations and four corresponding upsampling operations. Skip connections are used to combine low-level feature maps with high-level feature maps. The study received ethical approval from Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital (2019[220]). Informed consent was not required.
By combining low-level feature maps with high-level feature maps, the features of low-dimensional and high-dimensional feature maps can be obtained simultaneously, which improves the accuracy of the model results.
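The difference between stacking and adding feature maps can be illustrated with a small NumPy sketch. The array shapes and the function names `skip_concat` and `skip_add` are hypothetical and chosen only for illustration; they are not taken from the study's implementation.

```python
import numpy as np

def skip_concat(encoder_map, decoder_map):
    """U-Net style skip connection: stack feature maps along the channel axis."""
    return np.concatenate([encoder_map, decoder_map], axis=-1)

def skip_add(encoder_map, decoder_map):
    """FCN style skip connection: element-wise sum (channel counts must match)."""
    return encoder_map + decoder_map

# Hypothetical feature maps in (height, width, channels) layout.
low_level = np.ones((64, 64, 32))          # from the downsampling path
high_level = np.full((64, 64, 32), 2.0)    # from the upsampling path

concat_out = skip_concat(low_level, high_level)
add_out = skip_add(low_level, high_level)
print(concat_out.shape)  # (64, 64, 64): both sets of features are kept side by side
print(add_out.shape)     # (64, 64, 32): features are merged into the same channels
```

Concatenation doubles the channel count and leaves it to the following convolution to decide how to mix low-level and high-level features, whereas addition fixes the mixing in advance.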
Relatively few ultrasound images of carotid atherosclerotic plaques are available, and the U-Net network requires only a small amount of data for segmentation. Because the ultrasound image of the carotid plaque is used clinically, the information derived from the entire image should help the doctor assess parameters such as position, shape, and size, as well as the internal tissue components of the region. Information at every scale is important and contributes to the diagnosis, so effective information should be captured at as many scales as possible. U-Net combines high-level and low-level features to gather more of this information. For these reasons, the U-Net network is suitable for segmenting ultrasound images of carotid atherosclerotic plaques.
The introduction of DenseNet network
The DenseNet network is a convolutional neural network with dense connections. It builds on the ResNet network [20]. In the DenseNet network, each layer is directly connected to all the layers before it, which enables feature reuse. In other words, the input of each layer is the output of all preceding layers, and the feature maps output by each layer are passed directly to all subsequent layers as input. Each dense block in Fig. 1 has the same structure: every layer within a dense block performs a BN-ReLU-Conv operation (batch normalization, ReLU activation, and convolution), and a transition layer, consisting of convolution and pooling layers, sits between adjacent blocks.
Network structure diagram of DenseNet.
The defining feature of the DenseNet network is the dense connection. It alleviates the vanishing gradient problem in deep neural networks, reduces over-fitting on smaller training data sets, enhances feature propagation, and encourages feature reuse, which greatly reduces the number of parameters. In the ultrasound image of the carotid atherosclerotic plaque, the edge and other information about the plaque is essential for medical practitioners to assess the severity of the disease. Capturing the features of the carotid plaque ultrasound images is therefore very important, and the feature reuse of the DenseNet network is well suited to this task. The DenseNet network helps ensure that the characteristics of the plaque are identified, with improved efficiency compared with other networks.
Ultrasound images contain considerable noise that degrades segmentation. Therefore, prior to the experiment, the ultrasound image was preprocessed with image-processing algorithms. First, the region of interest is extracted; narrowing the recognition range can significantly improve the segmentation accuracy. An image enhancement operation is then performed so that the arterial plaque is depicted more clearly and the edges and contours in the ultrasound image are more pronounced. The image enhancement process is shown in Fig. 2: first, the image is converted to grayscale; second, it is binarized; third, it is enhanced; and finally, its contrast is enhanced.
Flow diagram of Image enhancement.
Edge information of the atherosclerotic plaque is important, yet segmenting the ultrasound image with the U-Net network alone results in a significant loss of edge information. The traditional U-Net network structure is therefore improved by combining DenseNet and U-Net.
The present study uses a combination of DenseNet and U-Net for segmentation. This network has deep and efficient network features. In order to fully extract the characteristics of each image, this network combines the advantages of DenseNet and U-Net networks. The dense connection path of the DenseNet network is used and the skip connection of the U-Net network is added in the encoding and decoding parts of the DenseNet network.
The image feature extraction portion uses the structure of DenseNet-161, which consists of repeated densely connected building blocks with different output dimensions. In each densely connected building block, there is a direct connection from any layer to all subsequent layers. As shown in Fig. 3, each layer produces k feature maps, where k is the growth rate. One advantage of dense connections between layers is that each layer needs fewer output dimensions than in traditional networks, which avoids learning redundant features. In addition, densely connected paths ensure maximum information flow between layers and improve gradient flow, thereby reducing the burden of searching for optimal results in deep neural networks.
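The way channels accumulate inside a densely connected block can be sketched in NumPy as follows. This is a toy illustration only: a random 1x1 convolution followed by ReLU stands in for the BN-ReLU-Conv layer, and the input size, number of layers, and growth rate are hypothetical values, not those of DenseNet-161 or of the network in this study.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block: every layer sees all earlier feature maps and adds k new ones."""
    features = [x]
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=-1)   # reuse all earlier outputs
        in_channels = stacked.shape[-1]
        # Stand-in for BN-ReLU-Conv: a random 1x1 convolution followed by ReLU.
        weights = rng.standard_normal((in_channels, growth_rate)) * 0.1
        new_maps = np.maximum(stacked @ weights, 0.0)
        features.append(new_maps)
    return np.concatenate(features, axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))   # hypothetical input with 16 channels
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
print(out.shape)  # (8, 8, 64): 16 input channels + 4 layers * 12 new feature maps
```

Each layer contributes only k new feature maps, yet its input grows with depth because all earlier outputs are concatenated, which is why a small growth rate is sufficient.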
Network structure diagram of Dense-UNet.
DenseNet-161 was originally designed for object classification tasks. A segmentation network, moreover, contains several max pooling layers and upsampling operations, which may cause part of the image feature information to be lost. Therefore, the dense connection path and the U-Net connection are both used. Dense connections between layers are employed within each micro-block to ensure maximum information flow, while the U-Net long-range connections link the encoding and decoding parts of the network.
Network architecture table of Dense-UNet
The diagram and detailed structure of the Dense-UNet are shown in Fig. 3 and Table 1, respectively. Dense-UNet extends to 167 layers and is termed Dense-UNet-167; it consists of convolutional layers, pooling layers, dense blocks, transition layers, and upsampling layers. The dense blocks extract features in the contracting (encoder) portion, which follows the DenseNet network. The decoding portion generates the label output, using skip connections to obtain the features of each stage of the encoder portion and deconvolution operations in the decoder. All layers within a dense block are directly connected. A transition layer is required, which includes Batch Normalization (BN) and a 1
Batch normalization was used in the Dense-UNet network because of the network's large number of layers. As the network is gradually deepened, training becomes more and more difficult and the convergence speed gradually decreases. The cause is that the parameters feeding the intermediate layers are updated continually during training, so the distribution of the data entering each layer slowly shifts, and the shift is amplified as the network deepens. The model must then keep adapting to new data distributions, which increases the burden of training and reduces the training speed. Ioffe et al. proposed the batch normalization method [21], which corrects the mean and variance of the data passed between layers. These operations keep the distributions of the parameters and the training data relatively stable. In addition, two learnable parameters are introduced to scale and shift the data, which preserves the ability of the network to extract features. Batch normalization speeds up the convergence of the Dense-UNet network and helps prevent over-fitting, so the network can be trained with little or no dropout or other regularization.
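The mean and variance correction followed by a learnable scale and shift can be sketched as below. This is a minimal training-time sketch under stated assumptions: it normalizes per channel over a hypothetical batch in (batch, height, width, channels) layout and omits the running statistics that a framework layer keeps for inference.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel over the batch and spatial axes, then scale and shift."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta   # gamma and beta are the two learnable parameters

rng = np.random.default_rng(0)
# Hypothetical activations whose distribution has drifted (mean 5, standard deviation 3).
x = rng.normal(loc=5.0, scale=3.0, size=(4, 16, 16, 8))
gamma = np.ones(8)    # scale, initialized to 1
beta = np.zeros(8)    # shift, initialized to 0
y = batch_norm(x, gamma, beta)
print(y.mean(axis=(0, 1, 2)).round(3))
print(y.std(axis=(0, 1, 2)).round(3))
```

With `gamma` at 1 and `beta` at 0, every channel leaves the layer with approximately zero mean and unit variance regardless of how the incoming distribution has shifted.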
The effectiveness of the U-Net connections was analyzed in the framework used in the present study. DenseNet and Dense-UNet were trained with the same training strategy. The difference is that Dense-UNet contains long-range connections between the encoding and decoding parts to preserve high-resolution features. As shown in the loss curves of the different networks in Fig. 9, Dense-UNet achieves a lower loss value than DenseNet. The U-Net model uses skip connections to take the features from the contracting (encoder) path and concatenate them with the features obtained by the deconvolution operation in the expanding path, which yields multi-scale feature information and improves network performance. This indicates that the U-Net connections help the network integrate features efficiently.
In current neural network models, a nonlinear activation function is introduced after the convolutional layer, so that the network is no longer a linear combination of its inputs and can theoretically approximate an arbitrary function. This effectively improves the expressive power of the network model. The commonly used nonlinear activation functions are the sigmoid function [22], the tanh function [23], and the ReLU function [24]. The ReLU activation function is shown in Fig. 4.
Graph of the ReLU activation function. The ReLU function is a piecewise linear function.
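As a minimal sketch, the ReLU function and its gradient can be written in NumPy as follows. The sample input values are arbitrary, and setting the gradient to 0 at exactly x = 0 is a common convention assumed here rather than something stated in the text.

```python
import numpy as np

def relu(x):
    """Pass positive values through unchanged and set negative values to 0."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise (0 at x = 0 by convention)."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x).tolist())       # [0.0, 0.0, 0.0, 1.5, 3.0]
print(relu_grad(x).tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0]
```

The gradient is exactly 1 wherever a unit is active, so the error signal is not shrunk as it passes back through active units.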
As Eq. (1) shows, the ReLU function is a piecewise linear function that converts all negative values to 0 and leaves positive values unchanged. This operation is generally called unilateral suppression, and it allows the neurons in the neural network to be sparsely activated. Because of this sparse activation, the network model can better acquire the feature information of the image, accelerate convergence, and fit the training data. The ReLU function is more expressive than a linear function, and this advantage is more pronounced in deep neural networks. Compared with other common nonlinear functions, the gradient of ReLU is constant over the positive interval, so the gradient does not vanish. With activation functions whose gradient is less than 1, the error between the automatically segmented image and the image labeled by the doctor is attenuated at every layer through which it propagates, which slows the convergence of the training model; ReLU therefore converges faster and trains more quickly. At present, ReLU functions are often applied in deeper network models. The activation layers of the network in the present study use the ReLU function, which is described by the following formula:

$$f(x) = \max(0, x) \tag{1}$$

where x is the input to the activation function.
In the network structure of the present study, the parameters of the encoder part of the Dense-UNet are initialized with the weights of a DenseNet trained for object classification, and the decoder part is trained from random initialization. Since the weights in the decoder section are initialized from a random distribution, following several iterations, the U-Net connection is added to adjust the model, and the cross-entropy loss
Operation diagram of data initialization to prepare for deep learning (A) Original image of Ultrasound; (B) Original label image (C) Plaque tissue composition image; (D) Transformed label image.
The ultrasound images of the carotid atherosclerotic plaque used in this paper were provided by the Sichuan Provincial People’s Hospital. The original data contained images of carotid atherosclerotic plaque from more than 200 patients. The areas of the carotid plaque and its tissue components were marked by experienced ultrasound doctors from Sichuan Provincial People’s Hospital, as shown in Fig. 5. First, the original image in Fig. 5A was annotated with ITK-SNAP, the medical image annotation software provided by the Image Computing Science Laboratory of the University of Pennsylvania, USA, using different colors for the plaque region, its tissue components, and the blood vessel wall. The labeling results are shown in Fig. 5B and C: Fig. 5B shows the result in grayscale mode, and Fig. 5C shows it in color mode. This study focuses on the segmentation of plaques. Therefore, for the carotid plaque segmentation experiment, the labeling results in Fig. 5B or C must be replaced with the plaque region segmentation label required by the task, whose format is shown in Fig. 5D. In this experiment, the annotation images were converted to label images by a basic image-processing algorithm.
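One simple way to turn a multi-class annotation into a plaque-only label image is sketched below. The label ids are hypothetical, since the actual annotation values are not given in the text, and this is not necessarily the conversion algorithm used in the study.

```python
import numpy as np

# Hypothetical label ids; the real annotation values are not given in the text.
BACKGROUND, VESSEL_WALL, PLAQUE_TISSUE_A, PLAQUE_TISSUE_B = 0, 1, 2, 3
PLAQUE_IDS = (PLAQUE_TISSUE_A, PLAQUE_TISSUE_B)

def to_plaque_mask(label_image, plaque_ids=PLAQUE_IDS):
    """Collapse a multi-class annotation into a binary plaque / non-plaque mask."""
    return np.isin(label_image, plaque_ids).astype(np.uint8)

# Toy 4x4 annotation: vessel wall (1) surrounding two plaque tissue classes (2, 3).
label = np.array([
    [0, 1, 1, 0],
    [1, 2, 3, 1],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
])
mask = to_plaque_mask(label)
print(mask)
```

All plaque tissue classes map to 1 and everything else, including the vessel wall, maps to 0, which matches the single-region label format described for the plaque segmentation task.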
To test the accuracy of the segmentation algorithm and carry out the corresponding comparison experiment, the data were divided into a training set and a test set at a ratio of 6:1. The relevant regions of the carotid atherosclerotic plaque were then extracted. Each original image in the training set and its corresponding label image were subjected to two image deformations, and the images were subsequently horizontally flipped, translated, and mirrored one by one. The resulting data were input into the network as training samples.
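The 6:1 split and the flip and translation augmentations can be sketched as follows. The sample count of 210, the random seed, and the helper names are hypothetical; the key point illustrated is that every geometric transform is applied to the image and its label together.

```python
import numpy as np

def split_indices(n_samples, train_parts=6, test_parts=1, seed=0):
    """Shuffle sample indices and split them train:test = 6:1 by default."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = n_samples * train_parts // (train_parts + test_parts)
    return order[:n_train], order[n_train:]

def flip_horizontal(image, mask):
    """Mirror the image and its label left-to-right together."""
    return image[:, ::-1], mask[:, ::-1]

def translate(image, mask, dy, dx):
    """Shift image and label by (dy, dx) pixels, filling the exposed border with zeros."""
    def shift(a):
        out = np.zeros_like(a)
        h, w = a.shape
        src = a[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
        top, left = max(0, dy), max(0, dx)
        out[top:top + src.shape[0], left:left + src.shape[1]] = src
        return out
    return shift(image), shift(mask)

train_idx, test_idx = split_indices(210)   # 210 is a hypothetical sample count
print(len(train_idx), len(test_idx))       # 180 30

image = np.arange(16, dtype=float).reshape(4, 4)
mask = (image > 7).astype(np.uint8)
flipped_img, flipped_mask = flip_horizontal(image, mask)
shifted_img, shifted_mask = translate(image, mask, dy=1, dx=0)
```

Augmentation is applied only to the training indices so that the test set remains a set of unmodified images.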
Data pre-processing
When selecting an image enhancement method, factors such as the processing method and its effect have to be considered. The image enhancement algorithm used in the current study was median filtering, a relatively classic filtering algorithm. Although Gaussian filtering retains image features relatively well, it denoises by neighborhood averaging, which blurs some boundaries. Median filtering, in contrast, is a non-linear image processing method that preserves the edge information of the image well.
The Ultrasound image pre-processing to reduce the influence of noise (A) Original image of Ultrasound; (B) Median filter enhancement image.
In Fig. 6, Fig. 6A is the original ultrasound image, and Fig. 6B is the result obtained after several passes of median filtering. Some details are enhanced after processing. Eliminating part of the noise in this way also helps the network model extract features.
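The edge-preserving behavior of median filtering can be illustrated with a small NumPy sketch. The 3x3 window, edge padding, and toy image are assumptions made for illustration; the window size and number of passes used in the study are not stated.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood (edge-padded)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    # Gather one shifted copy of the image per neighborhood offset.
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(size) for dx in range(size)]
    return np.median(np.stack(windows, axis=0), axis=0)

image = np.full((5, 5), 10.0)
image[2, 2] = 255.0      # one isolated speckle
image[:, 4] = 100.0      # a sharp vertical edge at the right border
smoothed = median_filter(image)
print(smoothed)
```

In this toy image the isolated speckle is removed while the sharp edge keeps its full contrast; a neighborhood average would instead smear both. Several passes can be obtained by calling `median_filter` repeatedly on its own output.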
In addition, the contrast enhancement is used to further enhance the edge information, as shown in the following equation:
where
In the present study, a Dense-UNet segmentation model for ultrasound images of atherosclerotic plaque was built with the Keras framework. To assess the accuracy of the network, the data set was divided into training samples and test samples. To evaluate the segmentation performance of the algorithm quantitatively, two evaluation indicators were used in the current study [25]:
(1) Dice (Dice similarity coefficient), which intuitively represents the ratio of the area where two regions intersect to their total area. This indicator measures the degree of coincidence between the actual segmentation result and the reference segmentation result [26]. Its definition is shown in Eq. (4):

$$\mathrm{Dice} = \frac{2\,|A \cap B|}{|A| + |B|} \tag{4}$$

where A is the region segmented by the algorithm and B is the region labeled by the medical practitioner.
(2) VOE (volumetric overlap error). This is similar to Dice. It is defined as shown in Eq. (5). Compared with Dice, it replaces the operation with a subtraction operation to represent the error rate [27]. The following formula is used:
The value of VOE ranges between 0 and 1. The lower the VOE, the better the region segmented by the algorithm coincides with the medical practitioner’s annotation, and the better the segmentation.
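Both indicators can be computed from binary masks as sketched below. The Dice function follows the standard definition. The VOE function uses the commonly cited overlap-based form, 1 minus intersection over union, which is an assumption here because Eq. (5) is not reproduced in this text; the definition used in the study may differ.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice = 2 * |intersection| / (|pred| + |truth|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / total

def volumetric_overlap_error(pred, truth):
    """Assumed definition: VOE = 1 - |intersection| / |union| (may differ from Eq. (5))."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0  # both masks empty: no overlap error
    return 1.0 - np.logical_and(pred, truth).sum() / union

# Toy masks: 4 labeled pixels, 4 predicted pixels, 2 pixels in common.
truth = np.zeros((4, 4), dtype=np.uint8)
truth[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 2:4] = 1

print(dice_coefficient(pred, truth))            # 0.5
print(volumetric_overlap_error(pred, truth))    # about 0.667
```

A perfect segmentation gives Dice = 1 and VOE = 0 under these definitions.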
The test set was used to test the effectiveness of the algorithm, and the average segmentation accuracy of Dense-UNet on each patient’s arterial plaque ultrasound images was measured. Figure 7 shows the segmentation results for several arterial plaque ultrasound images. The first row (Fig. 7A–D) shows the original ultrasound images, the second row (Fig. 7A1–D1) shows the segmentation manually labeled by the doctor, and the third row (Fig. 7A2–D2) shows the segmentation produced automatically by the algorithm. Figure 7 shows that the automatic segmentation does not differ considerably from the segmentation map manually drawn by the medical practitioner.
Evaluation table of dice and VOE coefficient
Image segmentation results of Dense-UNet. The first row shows the original test images (A, B, C, D), the second row shows the results drawn by the doctor (A1, B1, C1, D1), and the third row shows the Dense-UNet segmentation results (A2, B2, C2, D2). The Dense-UNet segmentation is not considerably different from the segmentation map manually drawn by the medical practitioner.
Table 2 indicates the prediction results of the Dense-UNet network structure, which lists the evaluation values of the Dice coefficient and the VOE coefficient. Table 2 indicates that the value of Dice reached 0.96 on average, and the average value of the VOE coefficient was 0.047. The results of automatic segmentation were not considerably different from those marked by professional medical practitioners.
In order to verify that the Dense-UNet network in the current study was superior to other segmentation algorithms in segmenting carotid plaque, the current study selected two image segmentation networks and compared them with the Dense-UNet algorithm used. These two algorithms were the U-Net network model and the ResUNet network model combined with ResNet and U-Net.
Figure 8 shows the automatic segmentation of the same arterial plaque ultrasound image by the different network models. Figure 8A shows the pre-processed carotid ultrasound image, and Fig. 8B compares the contours produced by the three networks: red is the medical practitioner’s delineation, green is the Dense-UNet segmentation, blue is the traditional U-Net segmentation, and yellow is the ResUNet segmentation. The comparison shows that the Dense-UNet network gives an improved segmentation. Figure 8C shows the image marked by the medical practitioner, Fig. 8D shows the segmentation by the Dense-UNet network of the current study, Fig. 8E shows the segmentation by the U-Net network, and Fig. 8F shows the segmentation by the ResNet101 network combined with the U-Net network [28]. The Dense-UNet segmentation resembles the manual segmentation performed by the medical practitioner.
Figure 9 shows how the loss value of the three network structures changes with the training epoch. The loss value of the Dense-UNet network converged faster. The final converged loss value of the UNet network was small.
Evaluation coefficient of result graphs for automatic segmentation of different networks
Comparison of segmentation results of different networks (A) Original image of ultrasound; (B) segmentation contour comparison image; (C) annotation image of the medical practitioner; (D) Dense-UNet segmentation result; (E) U-Net segmentation result; (F) ResUNet segmentation result. It can be seen that the effect of the Dense-UNet segmentation resembled the result of the manual segmentation performed by the medical practitioner.
Loss value curves of the different networks. The loss value of the Dense-UNet network converged faster. The final converged loss value of the UNet network was small.
Table 3 lists the average Dice coefficient and VOE coefficient for the automatic segmentation by the three networks. The Dense-UNet network used in the current study obtained the highest average Dice and the lowest VOE, which indicates that its segmentation was the best of the three and the closest to that of the medical practitioner. Figure 8 and Table 3 from the comparative experiment above demonstrate the feasibility and superiority of the Dense-UNet network.
For the automatic segmentation of atherosclerotic plaque ultrasound images, an automatic segmentation algorithm based on the Dense-UNet network model was proposed. In this model, the original images were preprocessed by image-processing algorithms and the data were augmented. The processed images were input into the Dense-UNet network as a training set. The novelty of the network is that it combines the advantages of the DenseNet and U-Net network models and deepens the network to 167 layers. This improves the extraction of the characteristics of ultrasound images of carotid atherosclerotic plaques, which in turn improves the actual segmentation. Compared with the existing U-Net and ResUNet networks, the network of the current study not only improves the segmentation accuracy but also converges faster.
Funding
This work was supported by the Sichuan Provincial Department of Science and Technology Project (grant number 2018JY0649) and the Fundamental Research Funds for the Central Universities (grant number ZYGX2020ZB038).
Ethics statement
The study received ethical approval by Sichuan Academy of Medical Sciences and Sichuan Provincial People’s Hospital (2019[220]). Informed consent was not required.
Data availability
All data generated or analysed during this study are included in this published article.
Acknowledgments
None to report.
Conflict of interest
The authors declare that they have no conflict of interest.
