Fully-automatic segmentation of coronary artery using growing algorithm

Abstract

Cardiac computed tomography angiography (CTA) is now widely used to diagnose coronary artery disease, and automatic segmentation of the coronary arteries plays an important role in that diagnosis. In this study, we propose and test a fully automatic coronary artery segmentation method that does not require any human-computer interaction. The proposed method uses a growing strategy and contains three main parts: (1) initial seed detection, which automatically detects the root points of the left and right coronary arteries where the ascending aorta meets the coronary arteries; (2) the growing strategy, which searches neighborhood blocks and decides whether they contain coronary arteries using an improved convolutional neural network; and (3) the iterative termination condition, which decides when the growing iteration finishes. The proposed framework is validated using a dataset containing 32 cardiac CTA volumes from different patients for training and testing. Experimental results show that the proposed method obtains a Dice coefficient ranging from 0.70 to 0.83, which indicates that the new method outperforms traditional methods such as level set.

Keywords

Coronary artery segmentation, computed tomography angiography (CTA), growing algorithm, 3D U-Net, deep learning

1 Introduction

Coronary artery disease (CAD) has become one of the major diseases threatening human health in recent years. Myocardial ischemia caused by coronary artery stenosis is the main cause of CAD. With the increasing demand for CAD diagnosis, cardiovascular imaging techniques have developed rapidly. In current clinical practice, conventional coronary angiography (CCA) is the gold standard for CAD diagnosis. Although lesions of the coronary artery are shown clearly in CCA, the technique is invasive. With the advent of computed tomography (CT) scanning technology, computed tomography angiography (CTA) has been developed and widely used in the diagnosis of cardiac and pulmonary diseases [1, 2]. Specifically, cardiac computed tomography angiography (CCTA) has become the main technique in CAD diagnosis because it is noninvasive and provides high-resolution three-dimensional (3D) data [3]. However, the boundaries between tissues are usually blurred in CCTA data because the contrast agent is unevenly distributed. Medical image analysis algorithms are therefore necessary to improve the efficiency and accuracy of CAD diagnosis.

Segmentation of the coronary arteries is important in CCTA analysis for two reasons. First, it can be used to measure coronary artery stenosis and thus provides quantitative information to doctors. Kirili, et al. [1] introduced a standardized framework for evaluating coronary artery stenosis detection and stenosis quantification. Jelmer, et al. [4] proposed a method to score coronary artery calcification. Such systems depend on accurate coronary artery segmentation, which makes it a key technique for CCTA. Second, coronary segmentation can provide doctors with registration information for medical images from different sources during heart operations.

With the development of pattern recognition and medical image processing, many coronary artery segmentation algorithms have been proposed recently. Zhou, et al. [5] proposed a dynamic balloon tracking method to extract the coronary artery tree, in which the vascular structures within the heart region are enhanced and segmented using a multiscale coronary response. Kitamura, et al. [6] proposed a supervised detection and shape matching method: a classifier of tubular 3D objects is learned, and then a Markov random field (MRF) framework is used to segment the vascular structure. Deng, et al. [7] proposed a 3D interactive coronary artery segmentation method using random forests and MRF optimization. Basit, et al. [8] presented a multi-scale model for 2D segmentation of coronary arteries combining directional information and Hessian eigenvalue analysis. Chi Y, et al. [9] proposed a learning-based coronary segmentation method built on a composite of features, which integrates coronary artery density features, local shape, and global structure into the learning framework. Gong [10] proposed an automatic coronary artery segmentation method that combines the work of Frangi, et al. [11] with region growing. With the development of deep learning, many algorithms based on convolutional neural networks (CNNs) have been applied to coronary artery segmentation. For example, Ronneberger, et al. [12] presented an encoder-decoder CNN called U-Net, which achieved good performance in different biomedical segmentation applications. In 2016, Moeskops, et al. [13] proposed a multi-layer CNN to segment the coronary artery, trained on three 2D patches taken at orthogonal views from the 3D volumes. In 2017, Øyvind [14] proposed a dual-pathway 3D architecture, DeepMedic, which incorporates both local and larger contextual information of the coronary artery.

The above methods apply different feature extractors and classifiers to achieve coronary artery segmentation. However, they need to be improved in two respects: automaticity and accuracy. Regarding automaticity, in the work of Chi Y, et al. [9], the aortic valve and apical coordinates need to be specified manually, and the algorithm proposed by Kirili, et al. [1] relies on manual positioning of the aortic valve, mitral valve, and apex. In addition, most existing methods require the region of interest (ROI) covering the entire heart to be cropped interactively. As for accuracy, traditional algorithms rely on hand-crafted features of density or tubular structure. Although those features are significant for segmentation, methods that rely only on hand-crafted features perform poorly because such features have limited representational power. Other researchers [15, 16] extracted the centerlines or segmented the coronaries with deep learning methods. A well-trained CNN can segment the coronary artery well, but the false positive rate is often high because global and local information are not fused.

In this study, a fully automatic segmentation method using a growing strategy is proposed. Initial seed points are located automatically by detecting the root points of the Left Coronary Artery (LCA) and Right Coronary Artery (RCA). Then, a growing algorithm is developed to segment the coronary arteries. The growing starts from the initial seed points and segments the coronary arteries in a neighborhood block with an improved 3D U-Net [17, 18]. After segmentation, new seed points are detected and used as the starting points of the next iteration. When no new seed points are found, the growing stops. Compared with a traversing method, growing from seed points avoids interference from other vessel-like structures.

The main contributions of this study are as follows.

Automatic segmentation is achieved by automatically locating the initial seed points and by applying a growing strategy.

A growing algorithm combined with 3D U-Net is proposed. The improved 3D U-Net is used for coronary artery segmentation. To help the network converge to a good optimum, residual blocks and ‘Two-Phase Training’ are applied. Because the input of the network is the neighborhood block of a seed point, which keeps the coronary artery centered, the proposed network needs only a small training dataset.

The remainder of this paper is organized as follows. Section 2 describes our method. Section 3 reports the experimental methods and results. Finally, conclusions and future work are discussed in Section 4.

2 Methods

In this section, the details of the proposed method are presented. As shown in Fig. 1, the proposed segmentation algorithm adopts a growing strategy and contains the following three main parts.

Fig. 1

Flowchart of the proposed method.

2D U-Net based initial seed detection. The initial seed points are the root points of the left and right coronary arteries, which are located automatically to start the growing.

3D U-Net based coronary artery segmentation by growing. The growing is an iterative procedure that starts from the initial seed points and extends over the whole coronary arteries until the iterative termination condition is reached. For each seed point in the iteration, a data block of size 64×64×64 is cropped and segmented with an improved multi-receptive field 3D U-Net. New seed points are generated from the output and used as the seed points for the next iteration.

Iterative termination condition. The condition decides whether the growing iteration stops.

2.1 2D U-Net based initial seed (root point) detection

In this study, the initial seeds are defined as the root points of the left and right coronary arteries, that is, the points where the ascending aorta (AAO) meets the coronary arteries. Initial seeds are therefore also called root points in this paper. The AAO region in different CT slices is shown in Fig. 2(a), (b), (c), and the reconstructed 3D data is shown in Fig. 2(d). The slice containing the RCA root point is called the RCA root slice, and the slice containing the LCA root point is called the LCA root slice.

Fig. 2

Ascending aorta region (red bordered area): (a, b, c) show 3 different slices with AAO and (d) shows 3D data.

Although Fig. 2 shows that the features of the AAO are more obvious than those of other tissues, AAO segmentation is still challenging because the CCTA of different patients shows different densities, owing to different doses of contrast agent and different acquisition machines. Because the root points are located from the segmented AAO, the accuracy of root point detection depends on the accuracy of AAO segmentation.

The initial seed detection method is the first step of the proposed method. As shown in Fig. 3, it contains two major procedures: AAO segmentation and the root point location (in the 2 dashed boxes in Fig. 3).

Fig. 3

The flowchart of the first part of the proposed method (initial seed detection).

2.1.1 Procedure 1: AAO segmentation

Because ascending aorta segmentation is of great importance for root point detection, a 2D U-Net is used to segment the AAO. The 2D U-Net is an encoder-decoder convolutional neural network. In the encoder, abstract features are extracted by convolutional and pooling layers. In the decoder, the abstract features are remapped to the original data space by upsampling or transposed convolutional layers. To minimize the loss of spatial information caused by the pooling layers, the decoder feature maps are concatenated with the corresponding convolutional feature maps from the encoder.

2.1.2 Procedure 2: Root point location

The coordinate of a voxel in a volume is defined as (x, y, t), where t is the index of a slice in the volume and (x, y) are the coordinates within the 2D slice. The aim of this procedure is to locate t and (x, y) separately.

The AAO masks of the CTA slices are obtained by feeding the slices into the trained 2D U-Net. Since the area of the AAO changes sharply near the coronary root points, as shown in Fig. 10 (c), this change can be used as a feature to detect the root points. Specifically, Equation (1) is used to detect the slices where the root points are located.

$\begin{matrix} {\dot{t}}_{r} = \underset{t \in [1, N - 1]}{\arg\max} \left( \sum_{x \in [1, h]} \sum_{y \in [1, w]} I (x, y, t + 1) - \sum_{x \in [1, h]} \sum_{y \in [1, w]} I (x, y, t) \right) \\ {\dot{t}}_{l} = \underset{t \in [1, N - 1]}{\arg\min} \left( \sum_{x \in [1, h]} \sum_{y \in [1, w]} I (x, y, t + 1) - \sum_{x \in [1, h]} \sum_{y \in [1, w]} I (x, y, t) \right) \end{matrix}$ (1)

where ${\dot{t}}_{r}$ and ${\dot{t}}_{l}$ are the indices of the RCA and LCA root slices, respectively, I(x, y, t) represents the segmented slices, and h, w, N are the height, width, and depth of the volume, respectively. Then, the plane coordinates on the slice are obtained with Equation (2).

$\begin{matrix} \dot{x} = \frac{\sum_{x \in [1, h]} \sum_{y \in [1, w]} \tilde{I} (x, y, \dot{t}) \cdot x}{\sum_{x \in [1, h]} \sum_{y \in [1, w]} \tilde{I} (x, y, \dot{t})} \\ \dot{y} = \frac{\sum_{x \in [1, h]} \sum_{y \in [1, w]} \tilde{I} (x, y, \dot{t}) \cdot y}{\sum_{x \in [1, h]} \sum_{y \in [1, w]} \tilde{I} (x, y, \dot{t})} \end{matrix}$ (2)

In Equation (2), $\tilde{I}$ is the image containing the maximal connected object. $\dot{t}$ represents the slice index and its value is ${\dot{t}}_{r}$ or ${\dot{t}}_{l}$ . The coordinates $(\dot{x}, \dot{y}, \dot{t})$ provide initial position for coronary artery segmentation. Finally, two initial seeds are obtained, namely, the left seed point $LSP ({\dot{x}}_{l}, {\dot{y}}_{l}, {\dot{t}}_{l})$ and the right seed point $RSP ({\dot{x}}_{r}, {\dot{y}}_{r}, {\dot{t}}_{r})$ .
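
Equations (1) and (2) can be illustrated with a minimal sketch in plain Python. It assumes that the per-slice AAO areas have already been computed from the 2D U-Net masks and that the maximal connected object $\tilde{I}$ has already been extracted as a binary 2D mask; the function names `root_slice_indices` and `centroid` are hypothetical, and the indices are 0-based whereas the equations are written with 1-based indices.

```python
def root_slice_indices(areas):
    """Return (t_r, t_l) following Equation (1).

    areas[t] is the number of AAO voxels in segmented slice t.
    t_r maximises areas[t + 1] - areas[t]; t_l minimises it.
    """
    diffs = [areas[t + 1] - areas[t] for t in range(len(areas) - 1)]
    t_r = max(range(len(diffs)), key=diffs.__getitem__)
    t_l = min(range(len(diffs)), key=diffs.__getitem__)
    return t_r, t_l


def centroid(mask):
    """Centroid (x, y) of a binary 2D mask, following Equation (2).

    mask is a list of rows; x indexes rows (height), y indexes columns (width).
    """
    total = sum_x = sum_y = 0
    for x, row in enumerate(mask):
        for y, value in enumerate(row):
            total += value
            sum_x += value * x
            sum_y += value * y
    if total == 0:
        raise ValueError("mask is empty")
    return sum_x / total, sum_y / total
```

For example, with areas `[100, 102, 101, 140, 139, 95, 96]` the largest increase occurs at index 2 and the largest decrease at index 4, so `root_slice_indices` returns `(2, 4)`.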

2.2 3D U-Net based coronary artery segmentation by growing

The coronary arteries are the arteries of the coronary circulation that transport blood into and out of the cardiac muscle. They mainly consist of the left and right coronary arteries, both of which give off branches. Coronary arteries on different slices are shown in Fig. 4 (areas outlined in red).

Fig. 4

Areas of coronary arteries on different slices.

As shown in Fig. 4, the areas of the coronary arteries are very small. The diameters of the coronary arteries are about 5 mm. In this paper, each slice has 512×512 voxels, and the field of view of the image is about 20 cm×20 cm. Small coronary arteries occupy about 50 voxels on a slice, while large coronary arteries occupy about 500 voxels. As a result, coronary artery segmentation in 2D space is very difficult.

One of the most popular coronary artery segmentation algorithms is region growing in 3D data. Region growing is a segmentation method based on density and tube-like features. However, the density of the coronary artery depends on the dose of contrast agent and the precision of the equipment, so robust feature extractors are crucial for effective segmentation. In this section, the proposed segmentation method, shown in Fig. 5, is described in detail. The method has two main aspects: the segmentation neural network and the growing strategy.

Fig. 5

Overview of the second part of the proposed method (growing method using 3D U-Net).

2.2.1 Multi-receptive field 3D U-Net

CNN models trained on large amounts of data can extract abstract and robust features that are useful for decision making. Although 2D image segmentation performs well for coarse-grained segmentation, the models need to be improved for fine-grained segmentation.

(1) Architecture: In the improved network, 3D convolutional kernels are used. In our experiments, the trained model did not work when all slices of one volume were fed into it, because the positive sample areas are too small and it is difficult for the model to learn the real mapping from overwhelmingly negative samples. Fixed-size blocks of data are therefore used to train the model. The block size of 64×64×64 was chosen for two reasons. First, the input size of a network is usually a power of 2 because the stride of a pooling layer is usually set to 2. Second, a 32×32×32 block is too small for some deep networks and can describe only local features, while the proportion of negative samples becomes large if 128×128×128 is used. The network adopts an encoder-decoder structure. To make the network easier to train and improve convergence, Batch Normalization (BN) [19] and residual blocks [20] (structure shown in Fig. 6(b)) are applied. A pyramid pooling module (structure shown in Fig. 6(a)) is used to improve accuracy, because fusing local and global features helps the classifier make more reasonable judgements.

Fig. 6

The structure of network modules where (a) shows a pyramid pooling module and (b) shows a residual block.

In the encoder, 8×8×8×512 feature maps are obtained after a block of size 64×64×64 is passed through 8 convolution layers, which comprise 2 convolutional layers and 3 residual blocks. A full pre-activation residual block is used because this architecture was shown to be effective by He, et al. [21]. Global pooling, 4×4×4 average pooling, and 2×2×2 average pooling are used to separate those feature maps into different sub-regions. To maintain the weight of the global feature, a 1×1×1 convolution layer is used. The different sub-region representations are followed by upsampling and concatenation layers to fuse local and global context information into the final feature representation. The pyramid pooling module [22], used instead of max pooling alone, makes the encoder more powerful. In the decoder, upsampling and concatenation layers are applied. To offset the loss of spatial information caused by pooling in the encoder, the upsampled feature maps are concatenated with the encoder feature maps of the same size and then passed through two convolutions followed by BN and a Rectified Linear Unit (ReLU). The outputs of the final convolutional layer are mapped into the range 0–1 by a sigmoid activation. The network has 22 convolutional layers in total. The architecture of the proposed network is shown in Fig. 7.

Fig. 7

The architecture of the proposed network.

In Fig. 7, ‘Ec’ represents the encoder process, while ‘Dec’ denotes the decoder process. The parameters of the convolutional layers of different modules are shown in Table 1, which contains the number, size and stride of convolutional kernels and the output shape of the convolutional layer.

Table 1

Parameters of the proposed network

Layer name	Kernel (number, size, stride)	Output size	Layer name	Kernel (number, size, stride)	Output size
Ec Conv Layer1	${\begin{matrix} 64, 3 \times 3 \times 3, 1 \\ 64, 3 \times 3 \times 3, 1 \end{matrix}}$	64³ × 64	Pyramid Layer	{512, 1 × 1 × 1, 1} × 4	8³ × 512
Res Layer1	{128, 3 × 3 × 3, 1} × 2	32³ × 128	Dec Conv Layer2	${\begin{matrix} 256, 2 \times 2 \times 2, 1 \\ \{256, 3 \times 3 \times 3, 1\} \times 2 \end{matrix}}$	16³ × 256
Res Layer2	${\begin{matrix} 256, 3 \times 3 \times 3, 2 \\ 256, 3 \times 3 \times 3, 1 \end{matrix}}$	16³ × 256	Dec Conv Layer3	${\begin{matrix} 128, 2 \times 2 \times 2, 1 \\ \{128, 3 \times 3 \times 3, 1\} \times 2 \end{matrix}}$	32³ × 128
Res Layer3	${\begin{matrix} 512, 3 \times 3 \times 3, 2 \\ 512, 3 \times 3 \times 3, 1 \end{matrix}}$	8³ × 512	Dec Conv Layer4	${\begin{matrix} 64, 2 \times 2 \times 2, 1 \\ \{64, 3 \times 3 \times 3, 1\} \times 2 \\ 1, 1 \times 1 \times 1, 1 \end{matrix}}$	64³ × 1
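
The spatial sizes listed in Table 1 can be checked with a short arithmetic sketch in plain Python. It assumes that each encoder stage divides the edge length by its downsampling factor, with the factors (1, 2, 2, 2) inferred from the output sizes in the table, and that the pyramid pooling windows do not overlap (stride equal to window size), which the text does not state explicitly. The function names are hypothetical.

```python
def encoder_sizes(input_size=64, factors=(1, 2, 2, 2)):
    """Edge length of the feature maps after each encoder stage."""
    sizes = []
    size = input_size
    for factor in factors:
        size //= factor
        sizes.append(size)
    return sizes


def pyramid_bins(feature_size=8, windows=(8, 4, 2)):
    """Edge length of each pooled sub-region grid for the given pooling windows.

    A window of 8 on an 8x8x8 map corresponds to global pooling.
    """
    return {window: feature_size // window for window in windows}


print(encoder_sizes())   # [64, 32, 16, 8]
print(pyramid_bins())    # {8: 1, 4: 2, 2: 4}
```

Under these assumptions, the three pooling branches yield 1³, 2³, and 4³ grids, which are then upsampled back to 8³ before concatenation.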

(2) Loss and optimizer: The prediction is obtained from the above network. A loss function is defined for training so that back-propagation minimizes the difference between prediction and label. In coronary artery segmentation, label imbalance is challenging because the coronary artery occupies a very small area. Assigning different weights to the loss of different labels is generally an effective solution, but overly large weights hinder network convergence. Therefore, another solution is applied: directly optimizing the evaluation criterion. The Dice coefficient is a common evaluation criterion, so the Dice loss is used as the loss function of the proposed network (Equation (3)),

$\mathrm{Dice\ Loss} = 1 - \frac{2 \times \sum_{i \in D} (\mathit{lab}_{i} \cdot \mathit{pre}_{i}) + \mathit{smooth}}{\sum_{i \in D} \mathit{pre}_{i} + \sum_{i \in D} \mathit{lab}_{i} + \mathit{smooth}}$ (3)

where D is the set of 64³ voxels in a block, lab and pre represent the ground truth and prediction of each voxel, respectively, and smooth is a constant that prevents the denominator from being 0; smooth is set to 1 here. Adaptive Moment Estimation (Adam) is used as the optimizer, with the learning rate set to 1e-5 and the decay rate for the moment estimates set to 0.95.
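
As an illustration of Equation (3), the following plain-Python sketch evaluates the Dice loss on flattened lists of voxel values. It is not the training code (which runs in the deep learning framework); the function name `dice_loss` is hypothetical, and `lab` and `pre` are assumed to be equally long sequences of values in [0, 1].

```python
def dice_loss(lab, pre, smooth=1.0):
    """Dice loss of Equation (3) for flattened ground truth and prediction."""
    if len(lab) != len(pre):
        raise ValueError("lab and pre must have the same length")
    intersection = sum(l * p for l, p in zip(lab, pre))
    return 1.0 - (2.0 * intersection + smooth) / (sum(pre) + sum(lab) + smooth)


# One of two artery voxels is predicted: (2*1 + 1) / (1 + 2 + 1) = 0.75
print(dice_loss([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.25
# An empty block predicted as empty gives zero loss thanks to the smooth term
print(dice_loss([0, 0, 0, 0], [0, 0, 0, 0]))  # 0.0
```

The second call shows why the smooth term matters: without it, a block that contains no coronary artery would produce a 0/0 division.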

(3) Two-phase training: Although blocks of size 64×64×64 are selected for training, the proportion of positive samples is still relatively small. When all training data are used to train the network from the start, the loss oscillates, suggesting that learning from so few positive samples is difficult. In the first phase, only the blocks that contain coronary artery are sampled to train the network, which helps the network learn the true distribution of the small positive class. In the second phase, a lower learning rate is used to train on all the training data, which contain both positive samples (coronary artery area) and negative samples (background area).
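
The data selection behind the two phases can be sketched as follows. This is only an illustration in plain Python: the function name `two_phase_plan` and the representation of a block as a `(block_id, n_positive_voxels)` pair are hypothetical, and both learning rates are left to the caller because the text does not give the value used in the second phase.

```python
def two_phase_plan(blocks, phase1_lr, phase2_lr):
    """Return [(phase name, blocks to train on, learning rate), ...].

    blocks is a list of (block_id, n_positive_voxels) pairs.
    Phase 1 uses only blocks that contain coronary artery voxels;
    phase 2 uses every block with a lower learning rate.
    """
    if phase2_lr >= phase1_lr:
        raise ValueError("phase 2 is expected to use a lower learning rate")
    with_artery = [block for block in blocks if block[1] > 0]
    return [
        ("phase 1", with_artery, phase1_lr),
        ("phase 2", list(blocks), phase2_lr),
    ]
```

Each returned entry would then drive one training run, with the second run continuing from the weights produced by the first.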

2.2.2 Growing strategy

There are two patterns for selecting data blocks from CCTA: traversing and growing. The traversing pattern selects blocks sequentially across the whole CCTA volume. It therefore processes many invalid blocks that do not contain coronary arteries, so the segmentation results are easily affected by other blood vessels and the algorithm is slowed down. To make the segmentation robust and accurate, the proposed growing method selects suitable data blocks and searches for new seeds on the surfaces of the blocks. The strategy is shown in Fig. 8.

Fig. 8

The flowchart of point searching strategy.

As discussed in Section 2.1, the root points of the coronary arteries (point ‘O’ in Fig. 8) are located first and set as the initial seed points of the growing method. A data block centered on the initial seed point is selected, and a mask of the same size (Block 1 in Fig. 8) is obtained after the block is fed into the trained improved 3D U-Net model. To locate the subsequent seed points, the segmentation results of the data blocks are analyzed. The far ends of the newly segmented region are the candidates for subsequent seed points. To prevent possible mis-segmentation from affecting the search, only the region connected to the central point is considered. Therefore, a 3D thinning of the connected region is applied first, and each end point connected to the central point is considered as a new seed. To avoid processing the same data twice, an end point that lies in a region that has already been calculated is not considered as a subsequent seed point. For example, in Block 2 of Fig. 8, points ‘O’, ‘3’, and ‘4’ are the end points. Because ‘O’ has been calculated in Block 1, ‘O’ is not considered as a subsequent seed point. In addition, to improve search efficiency, an end point that lies inside the volume of the block (not on its surfaces) is not regarded as a subsequent seed point. For example, in Fig. 8 (Block 2), points ‘3’ and ‘4’ do not extend to the surface, so they are not regarded as subsequent seed points.

According to Fig. 8, there are obvious overlaps between neighboring blocks, and the central voxel of each block is always a coronary voxel. The coronary artery is therefore aligned to some degree across blocks, which makes it possible to train the network with a relatively small training dataset.
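
The two rejection rules for end points can be sketched in plain Python. The sketch assumes that the 3D thinning has already produced the end points in block-local coordinates, that each processed block is identified by the volume coordinate of its corner (`origin`), and that "on the surface" means lying exactly on a face of the block; all names are hypothetical, and a practical implementation might allow a small tolerance near the faces.

```python
BLOCK = 64  # edge length of a data block, in voxels


def on_surface(point, size=BLOCK):
    """True if a block-local point lies on one of the six faces of the block."""
    return any(c == 0 or c == size - 1 for c in point)


def inside(point, origin, size=BLOCK):
    """True if a volume-coordinate point lies inside the block at origin."""
    return all(o <= c < o + size for c, o in zip(point, origin))


def select_new_seeds(endpoints, origin, processed_origins, size=BLOCK):
    """Keep end points that reach the block surface and fall in unprocessed space."""
    seeds = []
    for local in endpoints:
        if not on_surface(local, size):
            continue  # the vessel ends inside the block, nothing to follow
        world = tuple(o + c for o, c in zip(origin, local))
        if any(inside(world, done, size) for done in processed_origins):
            continue  # this region has already been calculated
        seeds.append(world)
    return seeds
```

With a current block at origin `(32, 0, 0)` and a processed block at `(0, 0, 0)`, the end points `(0, 10, 10)` (already covered) and `(20, 20, 20)` (interior) are rejected, while `(63, 30, 30)` is kept as the seed `(95, 30, 30)`.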

2.3 Iterative termination condition

The proposed growing strategy can be illustrated with a tree (Fig. 9). When no new seed points are found, all branches have reached their leaves and the iteration terminates.

Fig. 9

The illustration of the proposed growing strategy.

In summary, the segmentation algorithm detects the initial seeds and then segments the coronary artery by adopting a growing strategy and a modified convolutional neural network. The growing finishes when there are no new seed points. The pseudo-code of the algorithm is shown in Table 2.

Table 2

The pseudo-code of the proposed method

Algorithm: Coronary artery segmentation
Input: CTA volume, V_data; queue of seed points, Q_n (initialized with the root points); trained segmentation model, F_model
Output: coronary artery mask, Out_mask
while Q_n is not empty:
    pop a seed point p_new from Q_n
    extract the 64×64×64 patch V_patch centered on p_new from V_data
    if V_patch has been segmented before: continue with the next seed point
    segment V_patch with F_model to obtain V_pre
    record V_patch as processed and update Out_mask with V_pre
    compute the subsequent seed points p_next from V_pre
    push p_next into Q_n
end while
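
The control flow of Table 2 can be sketched as a queue-driven loop in plain Python. This is a minimal illustration rather than the authors' implementation: the names `grow`, `segment_block`, and `next_seeds` are hypothetical, the trained model and the seed search are passed in as callables, the mask is represented as a set of voxel coordinates, and a patch is treated as already segmented when its center seed has been processed.

```python
from collections import deque


def grow(initial_seeds, segment_block, next_seeds):
    """Grow a coronary mask from the root points until no seeds remain.

    segment_block(seed) returns the set of voxels segmented in the block
    centered on seed; next_seeds(seed, prediction, processed) returns the
    seeds to visit next.
    """
    queue = deque(initial_seeds)
    processed = set()
    mask = set()
    while queue:
        seed = queue.popleft()
        if seed in processed:
            continue  # this patch was segmented before
        prediction = segment_block(seed)
        processed.add(seed)
        mask |= prediction
        queue.extend(next_seeds(seed, prediction, processed))
    return mask
```

The loop ends exactly when the queue is empty, which corresponds to the termination condition of Section 2.3: every branch of the tree in Fig. 9 has reached a leaf.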

The segmentation algorithm takes one cardiac CTA volume, the root points, and the trained CNN model as input. The final mask is generated when the segmentation algorithm finishes. There are only two values in this mask: ‘0’ and ‘1’. ‘0’ represents the background area, while ‘1’ represents the coronary artery area.

3 Results

In this section, the experiments on both initial seed detection and coronary artery segmentation are described in detail. A dataset containing 32 cardiac CTA volumes of different patients is used for training and testing. The patients include 3 who suffer from coronary stenosis (called patient 1, patient 2, and patient 3), and the segmentation results on their volumes are reported in Table 4. The CTA volumes were collected with GE multidetector CT scanners at Beijing Anzhen Hospital. Each volume has about 224 slices, and each slice has 512×512 voxels. The in-plane voxel size is 0.3828×0.3828 mm and the slice thickness is 0.625 mm. The 32 CCTA volumes are divided into 3 groups: 10 for training, 1 for validation, and 21 for testing.
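
The reported voxel dimensions translate into physical extents as follows. The sketch below is simple arithmetic in plain Python using only the numbers given above; the helper name `extent_mm` is hypothetical.

```python
IN_PLANE_MM = 0.3828  # reported in-plane voxel size
SLICE_MM = 0.625      # reported slice thickness


def extent_mm(n_voxels, spacing_mm):
    """Physical length covered by n_voxels voxels of the given spacing."""
    return n_voxels * spacing_mm


slice_width = extent_mm(512, IN_PLANE_MM)     # about 196 mm
volume_depth = extent_mm(224, SLICE_MM)       # 140 mm
block_in_plane = extent_mm(64, IN_PLANE_MM)   # about 24.5 mm
block_depth = extent_mm(64, SLICE_MM)         # 40 mm
```

The slice width of about 196 mm agrees with the field of view of about 20 cm stated in Section 2.2, and a 64×64×64 block covers roughly 24.5 mm in-plane but 40 mm along the slice axis, so the blocks are anisotropic in physical space.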

3.1 Results of initial seed detection

In this experiment, 1000 CTA slices, comprising 600 positive samples and 400 negative samples, are extracted from 10 CTA volumes. The positive samples are slices that contain the AAO region, while the negative samples are slices that do not. A model pre-trained on ImageNet is fine-tuned on our training slices. The AAO area (segmentation results are shown in Fig. 10 (b)) is obtained when the test slices are fed into the trained model. After AAO segmentation, the z-coordinate of each initial seed, which indicates the slice on which the seed appears, is estimated from the AAO area difference shown in Fig. 10 (c). Once the z-coordinate is set, x and y are calculated as the centroid of the maximal connected component of the region obtained by differencing the z-th slice and the (z–1)-th slice. The centroids are shown in Fig. 10 (d).

Fig. 10

Analysis of root points detection. From top to bottom respectively showing (a) one slice from a volume, (b) the segmented AAO corresponding to the slice in (a), (c) AAO area difference of the whole volume, and (d) the slice containing root points region. Two columns (I and II) represent two test CTA volumes.

3.2 Results of coronary artery segmentation

In this section, three main parts are introduced in detail, including dataset preparation, training and performance evaluation.

3.2.1 Dataset and environment

In this paper, segmentation is a supervised task. To get the labels of dataset, each CTA volume is manually labeled with the help of doctors of Beijing Anzhen Hospital. Then, 1,094 blocks are randomly selected from coronary area of the training volumes, while 500 blocks are randomly selected from areas that do not contain coronary arteries. Some slices and corresponding labels along the coronal plane in training dataset are shown in Fig. 11.

Fig. 11

Some slices and corresponding labels on training dataset.

The program is based on the TensorFlow framework and performs all computation on GPUs in single-precision arithmetic. The experiments are conducted on a computer with an Intel Core i5-4590 CPU and an NVIDIA GTX 1080 Ti graphics card.

Three evaluation criteria are used: Dice, Positive Predictive Value (PPV), and Sensitivity, as defined in Equation (4).

$\begin{matrix} Dice = \frac{| P_{1} \cap T_{1} |}{(| P_{1} | + | T_{1} |) / 2} & PPV = \frac{| P_{1} \cap T_{1} |}{| P_{1} |} & Sensitivity = \frac{| P_{1} \cap T_{1} |}{| T_{1} |} \end{matrix}$ (4)

where T₁ is the true coronary artery area and P₁ is the subset of voxels predicted as positives for the coronary artery region. |S| represents the size (number of elements) of a set S.
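
Equation (4) can be written directly in terms of sets of voxel coordinates, as in the following plain-Python sketch. The function names are hypothetical. The second function uses the identity Dice = 2·PPV·Sensitivity / (PPV + Sensitivity), which follows from Equation (4) by eliminating |P₁ ∩ T₁|.

```python
def overlap_metrics(predicted, truth):
    """Return (Dice, PPV, Sensitivity) for two sets of voxel coordinates."""
    if not predicted or not truth:
        raise ValueError("both voxel sets must be non-empty")
    overlap = len(predicted & truth)
    dice = overlap / ((len(predicted) + len(truth)) / 2)
    ppv = overlap / len(predicted)
    sensitivity = overlap / len(truth)
    return dice, ppv, sensitivity


def dice_from_ppv_sensitivity(ppv, sensitivity):
    """Dice as the harmonic mean of PPV and Sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


print(round(dice_from_ppv_sensitivity(0.647, 0.859), 3))  # 0.738
```

The printed value reproduces the Dice reported in Table 4 for the proposed method on patient 1 from its PPV and Sensitivity, which is a quick way to cross-check the tabulated numbers.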

3.2.2 Training

In the early stages of training, the 1,594 blocks are fed into the network. The Dice curves in Fig. 12 indicate overfitting, so two countermeasures are applied: data augmentation and a dropout layer. The training dataset is augmented by rotating blocks by 90 degrees and shifting them randomly by 5–10 voxels in 6 directions, while training is regularized by dropout on the final convolutional layer (dropout ratio set to 0.5). With these measures, the network works well on both the training and validation sets. Figure 12 shows the loss on the training set and the Dice on the validation set.

Fig. 12

Loss on training set and Dice on validation set.

In the line chart, the dotted lines represent the Dice on the validation dataset, while the solid lines represent the training Dice loss. ‘Ag’ denotes data augmentation and ‘Dp’ denotes dropout. Without data augmentation and dropout regularization, the network performs better on the training dataset but worse on the validation dataset. To avoid overfitting, training is stopped when the output is stable (in the experiments, it is stable after 31K steps).

3.2.3 Test and analysis

The test procedure contains two parts: a test on the block dataset and a test on whole CTA volumes. The block test is introduced first. The trained models are tested on a test dataset of 3500 blocks selected from the 21 test CTA volumes. The results of the two networks are shown in Table 3.

Table 3
Results on 3500 test blocks with size of 64×64×64

Methods	PPV	Sensitivity	Dice
3D U-Net	0.693	0.809	0.742
Proposed	0.876	0.728	0.795

In Table 3, the sensitivity of 3D U-Net is higher, but its PPV is lower. This means that its false positive rate is high and more tissue is mistakenly segmented as coronary artery. The proposed method fuses different receptive fields and incorporates multi-scale features, so less tissue is mistakenly segmented as coronary artery. As a result, the improved network improves on the original 3D U-Net in terms of Dice (0.795 versus 0.742). Because the test blocks are mostly selected from the coronary artery region, the proportion of blocks that contain small arteries is small.

To test the performance of the proposed segmentation algorithm on whole volumes, four methods are compared: 3D U-Net (method 1), the region growing method (method 2), the traversal method based on the proposed network (method 3), and the growing method based on the multi-receptive field 3D U-Net (method 4, proposed). The final results for the 3 patients suffering from coronary stenosis are shown in Table 4.

Table 4

Performance comparison of different methods on CTA volumes from 3 patients

Methods	Patient1			Patient2			Patient3			Ave Dice
	PPV	Sens	Dice	PPV	Sens	Dice	PPV	Sens	Dice
⁽¹⁾3D U-Net	0.466	0.938	0.622	0.441	0.759	0.558	0.733	0.912	0.813	0.664
⁽²⁾R-growing	0.938	0.441	0.600	0.976	0.468	0.633	0.965	0.406	0.572	0.602
⁽³⁾Traversing	0.68	0.568	0.619	0.951	0.357	0.52	0.867	0.563	0.682	0.607
⁽⁴⁾ Proposed	0.647	0.859	0.738	0.825	0.615	0.705	0.906	0.776	0.836	0.760

In Table 3 and 4, the Dice of method 1 is lower by 0.053 on block dataset and lower by 0.096 on CTA volume data than that of the proposed method. The method 1’s PPV is very low, indicating that false positive region is too large because of lacking the global information. The results of method 3 indicate that the method based on growing is powerful in coronary artery segmentation because this method could find blocks on which CNN model may have good performance instead of sequentially selecting. The PPV of method 2 is highest, but the sensitivity is too low. The results of region growing are determined by a suitable threshold and single threshold does not work well because the distribution of contrast agents is relatively weak in some fine vessels. So, region growing could not segment well the fine vessels, which causes that the sensitivity is too low. Meanwhile, overgrowing is avoided by a relatively suitable threshold, and it makes the PPV very high.

The mask of the coronary artery is computed by the proposed method, and the final output is calculated as the voxel-wise product of the mask and the original CTA data. To visualize the segmentation results, they are reconstructed with professional medical image processing software. Two volumes are shown in Fig. 13. Each row represents the segmentation result of one test sample. The first column is the ground truth label of each CTA volume. The second to the last columns show, respectively, the segmentation results of region growing, 3D U-Net and the proposed method in the anterior view.
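
The masking step can be written in one line with NumPy. The sketch below uses a tiny made-up volume and mask; the array names and values are assumptions for illustration, and real CTA volumes would be loaded from the scanner data instead.

```python
import numpy as np

# Stand-in data: a tiny 2x2x2 "CTA volume" and a binary vessel mask.
volume = np.array([[[120, 300], [310, 90]],
                   [[305, 80], [60, 295]]], dtype=np.int16)
mask = np.array([[[0, 1], [1, 0]],
                 [[1, 0], [0, 1]]], dtype=np.uint8)

# Voxel-wise product: intensities are kept inside the mask and set to 0 outside.
segmented = volume * mask
print(segmented)
```

Only the voxels labeled as coronary artery keep their original intensity, so the result can be passed directly to volume rendering software.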

Fig. 13

Results of the reconstructed images.

In Fig. 13, the left and right main coronary arteries are segmented well by all three methods. Although 3D U-Net segments some fine arteries better than the proposed method, its results are easily influenced by other tissue, which means that the false positive region is too large. The proposed method reduces false positives through the pyramid pooling module, which incorporates both local and global information. The above experiments show that the proposed method is feasible for automatic coronary artery segmentation and achieves better overall performance.

Table 5 compares the Dice of different methods, measured against the ground truth that was manually generated with the help of doctors. The multi-receptive-field 3D U-Net method presented in this paper achieves better results.

Table 5

Comparison of different methods by Dice

Methods	Proposed method	3 views of 2D CTA	DeepMedic, 3D	32×32×32 3D U-Net
Dice	0.70–0.83	0.65	0.60	0.71–0.78

We also compared our results with state-of-the-art methods from the literature, namely those of Mohr et al. [23] and Shahzad et al. [24], which are based on the coronary centerline. The comparison results are shown in Table 6.

Table 6

Comparison with methods introduced in other papers by Dice

Methods	Proposed method	Graph cut [24]	Level set [23]
Dice	0.70–0.83	0.65–0.68	0.69–0.73

4 Discussion

Coronary artery segmentation is a challenging task because of the fine vessels and the blurred boundaries among tissues. Fully automatic segmentation is even more difficult, because without a selected region of interest (ROI) or key points such as the aortic valve and heart apex, the results are easily affected by other tissue. This study proposes and tests a fully automatic coronary artery segmentation method whose main contributions are twofold. First, a root point detection method replaces manually selected seed points. Second, a segmentation method based on growing and a CNN is used to extract more representative features, which provides evidence that CNN-based methods are feasible for medical image segmentation.

The growing strategy adopted in this study ensures that the central voxel of each block is a coronary voxel, so the network could be trained with only 10 volumes, a relatively small amount of data. The results demonstrate that this strategy is encouraging, especially considering the fully supervised nature of the neural approach, which learns from raw pixel data and does not rely on any a priori knowledge of the vascular structure. To extract robust, higher-order features, two modules, a residual block and pyramid pooling, were added to the original U-Net. In this way, the raw block data are transformed into a more abstract representation that carries key information about the appearance of coronary arteries, and this abstract representation is then remapped into the original parameter space.
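
The idea that every processed block is centered on a vessel voxel can be sketched as a queue-based loop. This is only an illustration under strong assumptions: `segment_block` is a hypothetical stand-in (a plain threshold) for the trained network, the block half-width `half` is arbitrary, and the loop simply stops when no new centers remain, which is not necessarily the termination condition used in this study.

```python
from collections import deque
import numpy as np

def segment_block(block):
    """Hypothetical stand-in for the trained CNN: a plain intensity threshold."""
    return block > 200

def grow_segmentation(volume, seeds, half=2):
    """Block-wise growing over a 3D volume, starting from seed voxels."""
    mask = np.zeros(volume.shape, dtype=bool)
    visited = set()
    queue = deque(seeds)
    while queue:  # stops when no unvisited block centers remain
        center = queue.popleft()
        if center in visited:
            continue
        visited.add(center)
        # Block around the center, clipped to the volume bounds.
        lo = [max(c - half, 0) for c in center]
        hi = [min(c + half + 1, s) for c, s in zip(center, volume.shape)]
        sl = tuple(slice(a, b) for a, b in zip(lo, hi))
        pred = segment_block(volume[sl])
        mask[sl] |= pred
        # Predicted vessel voxels on the block faces become the next centers,
        # so each new block is again centered on a vessel voxel.
        border = np.ones(pred.shape, dtype=bool)
        border[1:-1, 1:-1, 1:-1] = False
        for idx in np.argwhere(pred & border):
            nxt = tuple(int(i + a) for i, a in zip(idx, lo))
            if nxt not in visited:
                queue.append(nxt)
    return mask

# Made-up volume: one bright "vessel" along the last axis and one distant bright spot.
volume = np.zeros((12, 5, 20), dtype=np.int16)
volume[2, 2, :] = 300
volume[10, 2, 5] = 300
mask = grow_segmentation(volume, seeds=[(2, 2, 0)])
print(int(mask.sum()), bool(mask[10, 2, 5]))
```

In this toy volume the loop follows the bright line from the seed and never visits the distant bright spot, because no block centered on the vessel contains it. A whole-volume pass of the same classifier would have labeled that spot as vessel.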

The performance of the method suggests that the two modules help feature extraction and fusion in the encoding stage. The final output is more accurate than that of the original U-Net and of region growing, which uses only the original intensity value as its criterion. Improving the performance of the network on small coronary arteries is left as future work. Two improvements are considered: more cardiac CTA volumes should be collected, and the network architecture should be further refined to segment small coronary arteries. In addition, appropriate preprocessing may help the network perform better, and post-processing algorithms may be investigated to refine the result in the rare cases of incorrect segmentation. A potential application of coronary artery segmentation is coronary stenosis detection. Although coronary stenosis detection is not covered in detail in this paper, it deserves further study in future work.

Acknowledgments

The authors would like to acknowledge the financial support from the National Key R&D Program of China under Grant 2017YFB0802300.

References

1. Kirişli H.A., Schaap, Metz C.T., et al., Standardized evaluation framework for evaluating coronary artery stenosis detection, stenosis quantification and lumen segmentation algorithms in computed tomography angiography, Medical Image Analysis 17 (2013), 859–876.

2. [...], Gao, Kong, et al., Application of 640-slice CT wide-detector volume scan in low-dose CT pulmonary angiography, Journal of X-ray Science and Technology 27(2) (2019), 197–205.

3. Chen, Wu, Di and Zhao, Association between magnetic resonance imaging of carotid artery and coronary stenosis detected by computed tomography angiography, Journal of X-ray Science and Technology 28(2) (2020), 299–309.

4. Wolterink J.M., Leiner, et al., Automatic coronary calcium scoring in cardiac CT angiography using convolutional neural networks, Medical Image Analysis 34 (2015), 123–136.

5. Zhou, Chan H.P., Chughtai, et al., Automated coronary artery tree extraction in coronary CT angiography using a multiscale enhancement and dynamic balloon tracking (MSCAR-DBT) method, Computerized Medical Imaging & Graphics 36 (2012), 1–10.

6. Kitamura, Li, Ito and Ishikawa, Automatic coronary extraction by supervised detection and shape matching, IEEE International Symposium on Biomedical Imaging (2012), 234–237.

7. Deng J.J., Xie X.H., Alcock and Roobottom, 3D interactive coronary artery segmentation using random forests and Markov random field optimization, IEEE International Conference on Image Processing, ICIP (2014), 942–946.

8. Basit, Khan S.A. and Akram M.U., Segmentation of coronary arteries, IEEE Symposium on Industrial Electronics & Applications (ISIEA), Kota Kinabalu, (2014), 66–70.

9. Chi, Huang, Zhou, et al., A composite of features for learning-based coronary artery segmentation on cardiac CT angiography, International Workshop on Machine Learning in Medical Imaging, Springer, Cham 9352 (2015), 271–279.

10. Gong Z.H., The research of the automatic segmentation algorithm of coronary CTA scans, Thesis in The University of Chinese Academy of Sciences, (2016) (in Chinese).

11. Frangi A.F., Niessen W.J., Vincken K.L. and Viergever M.A., Multiscale vessel enhancement filtering, Medical Image Computing and Computer-Assisted Intervention, MICCAI'98 1496 (1998), 130–137.

12. Ronneberger, Fischer and Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation, International Conference on Medical Image Computing and Computer-Assisted Intervention 9351 (2015), 234–241.

13. Moeskops, Wolterink J.M., van der Velden B.H.M., et al., Deep learning for multi-task medical image segmentation in multiple modalities, Medical Image Computing and Computer-Assisted Intervention, MICCAI 9901 (2017), 478–486.

14. Øyvind, Segmentation of Coronary Arteries from CT-scans of the heart using Deep Learning, Thesis for M.Sc in Computer Science, Norwegian University of Science and Technology, June (2017).

15. Wolterink J.M., Van Hamersvelt R.W., Viergever M.A., et al., Coronary artery centerline extraction in cardiac CT angiography using a CNN-based orientation classifier, Medical Image Analysis 51 (2019), 46–60.

16. Huang, Huang, Lin, et al., Coronary artery segmentation by deep learning neural networks on computed tomographic coronary angiographic images, 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE (2018), 608–611.

17. Long, Shelhamer and Darrell, Fully convolutional networks for semantic segmentation, IEEE Transactions on Pattern Analysis & Machine Intelligence 39(4) (2014), 640–651.

18. Shi, Jiang and Zheng, A stacked generalization U-shape network based on zoom strategy and its application in biomedical image segmentation, Computer Methods and Programs in Biomedicine 197 (2020), 105678.

19. Ioffe and Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, Computer Science 37 (2015), 448–456.

20. He, Zhang, Ren and Sun, Deep residual learning for image recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, (2015), 770–778.

21. He, Zhang, Ren and Sun, Identity mappings in deep residual networks, Computer Vision – ECCV 9908 (2016), 630–645.

22. Zhao, Shi, Qi, et al., Pyramid scene parsing network, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 6230–6239.

23. Mohr, Masood and Plakas, Accurate lumen segmentation and stenosis detection and quantification in coronary CTA, In Proceedings of 3D Cardiovascular Imaging: A MICCAI Segmentation Challenge Workshop, (2012).

24.

Shahzad

, Kirişli