Abstract
BACKGROUND:
Spectral computed tomography (CT) can resolve the energy of incident photons and therefore has the potential to distinguish different material compositions. Although material decomposition methods based on x-ray attenuation characteristics perform well in dual-energy CT imaging, they are limited by image contrast and noise levels.
OBJECTIVE:
This study focused on multi-material decomposition of spectral CT images based on a deep learning approach.
METHODS:
To classify and quantify different materials, we proposed a multi-material decomposition method based on an improved Fully Convolutional DenseNets (FC-DenseNets). A mouse specimen was first scanned with a spectral CT system based on a photon-counting detector using different energy ranges. We then constructed a training set from the reconstructed CT images so that the network could learn to decompose different materials.
RESULTS:
Experimental results demonstrated that, at high noise levels, the proposed multi-material decomposition method identified bone, lung and soft tissue more effectively than basis material decomposition in post-reconstruction space.
CONCLUSIONS:
The newly proposed approach performed well on spectral CT material decomposition and could help establish guidelines for multi-material decomposition approaches based on deep learning.
Introduction
A conventional x-ray CT system employs a photon-integrating detector, which collects photons over the whole x-ray spectrum and ignores the spectral responses of materials [1, 2]. Thus, conventional CT often lacks sufficient contrast resolution for biological soft tissues. In contrast, spectral CT based on a photon-counting detector can identify absorption features in the available x-ray energy ranges [3–5] and obtain more attenuation information for material classification and quantification.
Traditional material decomposition can be performed in the projection domain before reconstruction or in the image domain after reconstruction [6–10]. Projection-based methods solve the decomposition functions directly from projection data, and in the multi-material case they need to exploit the K-edge characteristics of contrast agents [11]. Image-based methods can in theory decompose multiple materials, but image quality is easily degraded by artifacts and noise [12]. Moreover, poor image contrast can lead to inaccurate decomposition results for materials with similar attenuation characteristics.
Recently, deep learning has attracted widespread attention in various imaging applications such as image denoising, deblurring, segmentation, detection and recognition [13–16]. It solves identification problems through feature learning and hierarchical feature extraction, which makes it a feasible way to improve the efficiency and accuracy of material decomposition. Several studies have applied neural networks in CT. W. J. Lee et al. adopted a feedforward neural network to perform material depth reconstruction in a multi-energy x-ray imaging simulation [17]. T. Wurfl et al. developed a reconstruction network to solve limited-angle problems in parallel-beam geometry and then extended it to fan-beam and cone-beam geometry [18]. E. Kang et al. and H. Chen et al. studied the removal of artifacts and noise in low-dose CT imaging with convolutional neural networks (CNNs) [19, 20]. D. P. Clark et al. used a CNN-based method with a U-net structure for material decomposition in spectral micro-CT [21]. Y. Xu et al. proposed a deep neural network (DNN) consisting of two sub-nets to solve projection decomposition problems [22] and used a Fully Convolutional Network (FCN) to decompose dual-energy CT images [23]. C. Feng et al. designed a neural network for multi-energy material decomposition that outputs monochromatic attenuation maps from polychromatic reconstructions [24]. Thus, deep learning has opened a door for multi-material decomposition research in spectral CT.
In this study, we proposed a multi-material decomposition method for spectral CT via FC-DenseNets. Spectral data were acquired with a photon-counting detector, and a training set was constructed from the reconstructed spectral CT images. We then performed real experiments to demonstrate that our deep learning method performs better than basis material decomposition in post-reconstruction space. In the next section, the methods are described. After presenting the results in the third section, we discuss some related issues and conclude the paper in the last section.
Methods
Image-based material decomposition
Spectral CT can acquire attenuation characteristics in the available x-ray energy ranges. After the x-ray beams pass through an object, the photon counts captured by the photon-counting detector are represented as:
When m < n, the number of unknowns exceeds the number of equations and Equation (3) becomes an underdetermined problem. One solution is to introduce the principle of mass conservation into the projection decomposition algorithm [25]; another is to use a volume conservation assumption in triple material decomposition [26]. Both methods are based on dual-energy CT material decomposition. In this paper, we study multi-material decomposition combined with deep learning to solve Equation (3) in the case of m > n.
Because of artifacts and noise in spectral CT images, it is difficult for the basis material method to obtain an accurate material decomposition. The detailed interior structure of spectral CT images is similar across energy ranges, while the pixel gray values differ, which makes it feasible to construct abundant training sets for deep learning to distinguish diverse material regions. In this study, our objective is to classify the bone, lung and soft tissue regions in reconstructed spectral CT images using a CNN-based method.
In material decomposition of spectral CT images, a CNN with a suitable encoder-decoder structure can achieve precise pixel-wise classification. This study designed an improved FC-DenseNets to perform material decomposition. Figure 1 shows the network structure of FC-DenseNets, which can be divided into two paths. The first is the encoder downsampling path, which extracts features from spectral CT images, and the second is the decoder upsampling path, which recovers spatially detailed information. The encoder and decoder exchange data through long skip connections. Compared with other networks such as FCN and U-net [27, 28], FC-DenseNets uses a densely connected structure (as shown in Fig. 2) to extract more feature information and collects the outputs of all dense blocks via the long skip connections, so that features extracted by the shallower layers are effectively reflected in the final result.
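The effect of dense connectivity on feature-map counts can be illustrated with a little arithmetic. The sketch below assumes the usual dense-block convention, in which every layer emits a fixed number of new feature maps (the growth rate) and receives the concatenation of the block input and all earlier layer outputs. The input channel count of 48 and the growth rate of 16 are hypothetical values chosen for illustration; only the 4-layer depth comes from Fig. 2.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Channel bookkeeping for one dense block.

    Each layer sees the block input concatenated with the outputs of all
    previous layers, and adds `growth_rate` new feature maps.
    Returns (channels seen by each layer, channels after the block), where
    the block output is assumed to concatenate the input with every layer
    output.
    """
    layer_inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    block_output = in_channels + num_layers * growth_rate
    return layer_inputs, block_output


# Hypothetical values: 48 input channels, growth rate 16; 4 layers as in Fig. 2.
layer_inputs, block_output = dense_block_channels(
    in_channels=48, growth_rate=16, num_layers=4
)
print(layer_inputs)   # [48, 64, 80, 96]
print(block_output)   # 112
```

The growing input width of later layers is what lets them reuse features computed earlier in the block.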

Network structure of FC-DenseNets.

A dense block construction of 4 layers.
Encoder architectures of the designed FC-DenseNet
Step 1: Feed the spectral CT images into the first convolution layer; forward propagation through FC-DenseNets outputs preliminary decomposition maps.
Step 2: Use the cross entropy loss function expressed in Equation (4) to compute the errors for bone, lung and soft tissue between the decomposition maps and the label maps.
Step 3: Apply the backpropagation algorithm to update the training parameters of all layers in the network, using the gradient vector to adjust each weight.
Step 4: Repeat the previous steps until the loss value no longer drops or the iteration limit is reached. The trained network then produces decomposition images with a spatial resolution of 512 × 512.
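The four steps can be sketched on a toy problem. The code below is not the FC-DenseNets implementation: it replaces the network with a hypothetical per-pixel linear classifier on a single gray value, assumes the standard softmax cross entropy form of the loss, and uses made-up pixel values, labels and learning rate. It only shows how forward propagation, loss evaluation, gradient update and the stopping rule fit together.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: one gray value per pixel.
# Labels: 0 = bone, 1 = lung, 2 = soft tissue.
pixels = [0.90, 0.85, 0.10, 0.15, 0.50, 0.45]
labels = [0, 0, 1, 1, 2, 2]

# Stand-in "network": one weight and one bias per class.
weights = [0.0, 0.0, 0.0]
biases = [0.0, 0.0, 0.0]
learning_rate = 0.5   # hypothetical value for the toy problem
history = []

for step in range(200):
    grad_w = [0.0] * 3
    grad_b = [0.0] * 3
    loss = 0.0
    for x, y in zip(pixels, labels):
        # Step 1: forward propagation gives class probabilities per pixel.
        probs = softmax([w * x + b for w, b in zip(weights, biases)])
        # Step 2: cross entropy between prediction and label.
        loss += -math.log(probs[y])
        # Step 3: gradient of the loss w.r.t. each class score is
        # (probability - one-hot label); accumulate parameter gradients.
        for c in range(3):
            delta = probs[c] - (1.0 if c == y else 0.0)
            grad_w[c] += delta * x
            grad_b[c] += delta
    n = len(pixels)
    history.append(loss / n)
    weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    biases = [b - learning_rate * g / n for b, g in zip(biases, grad_b)]
    # Step 4: stop once the loss no longer drops noticeably.
    if len(history) > 1 and history[-2] - history[-1] < 1e-6:
        break

print(f"first loss {history[0]:.4f}, last loss {history[-1]:.4f}")
```

With all parameters initialised to zero, the first loss equals ln 3, the cross entropy of a uniform guess over three classes.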
Data set preparation
The experimental data were acquired with a spectral CT system equipped with a hybrid photon-counting detector (SANTIS 0804 produced by DECTRIS). The detector has four fully calibrated and independently adjustable thresholds and 1024 × 256 pixels with a pixel size of 150 μm. The experiments were carried out in strict accordance with the Experimental Animal Management Regulations of China National Science and Technology Council. We selected a small mouse with a length of about 10 cm and a weight of about 150∼180 g. The mouse was anesthetized with urethane and placed in a plastic bottle for scanning, as shown in Fig. 3. For the small-animal scan configuration, the source-to-detector distance was 350 mm, the source-to-object distance was 210 mm, the tube voltage was 90 kVp and the tube current was 200 μA. A total of 250 projection views over 360° were acquired for each scan, and the energy thresholds were set to 25, 30, 35, 40, 45, 50, 55, 60, 65 and 70 keV.
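A few derived quantities follow directly from the scan parameters above. The sketch below assumes a divergent-beam geometry in which the magnification is the ratio of source-to-detector to source-to-object distance, and it assumes that each threshold defines a bin that is open towards higher energies, bounded only by the 90 keV maximum photon energy of a 90 kVp tube. Both are assumptions made for illustration; the variable names are hypothetical.

```python
# Scan parameters taken from the text.
source_to_detector_mm = 350.0
source_to_object_mm = 210.0
detector_pixel_um = 150.0
num_views = 250
tube_voltage_kvp = 90
thresholds_kev = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]

# Assumed divergent-beam geometry: magnification at the detector plane.
magnification = source_to_detector_mm / source_to_object_mm

# Detector pixel size projected back to the object plane.
pixel_at_object_um = detector_pixel_um / magnification

# Angular spacing between consecutive projection views.
angular_step_deg = 360.0 / num_views

# Assumed open-ended bins: photons between the threshold and the tube maximum.
bin_widths_kev = [tube_voltage_kvp - t for t in thresholds_kev]

print(f"magnification      : {magnification:.3f}")
print(f"pixel at object    : {pixel_at_object_um:.1f} um")
print(f"angular step       : {angular_step_deg:.2f} deg")
print(f"bin widths (keV)   : {bin_widths_kev}")
```

Under these assumptions the 25 keV threshold gives the widest bin (65 keV) and the 70 keV threshold the narrowest (20 keV), so fewer photons are counted at higher thresholds.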
Spectral CT system based on the photon-counting detector.
Reconstructed mouse CT images were generated by the Split-Bregman algorithm, which can reconstruct high-quality images from sparse-view or high-noise data [30, 31]. The scanned spectral data contain 256 projection slices for each energy bin, and Fig. 4 shows the middle slice of the reconstructed images for the ten energy bins. To distinguish the bone, lung and soft tissue regions of the mouse CT images, we selected 100 slices of reconstructed images in each energy bin to construct the training set, giving a total of 1000 training samples.
Reconstructed CT images of middle slice with ten energy bins. The energy thresholds from (a) to (j) were 25, 30, 35, 40, 45, 50, 55, 60, 65, 70 keV.
The network training was performed on a PC with an Intel i7-4790k CPU, 16 GB DDR4 RAM and a TITAN Xp 12 GB GPU. The FC-DenseNets was implemented in PyTorch 0.4.0 on Ubuntu 16.04. The initial learning rate was 0.001 and decayed exponentially with a factor of 0.95. Training the network took about 8 hours for 100 epochs, until the pixel accuracy (PA) converged to 95%. The 1000 spectral CT images were randomly rotated and fed to the network for training, and CT images outside the training set were used as the testing set. Figure 5 shows the performance of the trained network for material decomposition. The designed FC-DenseNets structure successfully classified and extracted the bone, lung and soft tissue regions of the mouse CT images.
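The learning-rate schedule can be written out explicitly. The sketch below assumes that the factor of 0.95 is applied once per epoch, which the text does not state; the function name is hypothetical.

```python
initial_lr = 0.001   # from the text
decay = 0.95         # from the text
num_epochs = 100     # from the text


def learning_rate(epoch):
    """Learning rate at a given epoch, assuming one decay step per epoch."""
    return initial_lr * decay ** epoch


schedule = [learning_rate(epoch) for epoch in range(num_epochs)]

print(f"epoch  0: {schedule[0]:.6f}")
print(f"epoch  1: {schedule[1]:.6f}")
print(f"epoch 99: {schedule[-1]:.2e}")
```

Under this assumption the rate falls by more than two orders of magnitude over 100 epochs. In PyTorch, `torch.optim.lr_scheduler.ExponentialLR` with `gamma=0.95` produces the same per-epoch decay; whether that scheduler was used here is not stated.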
Decomposed images with energy threshold of 25 keV. Row (a), (b) and (c) are the images of three different slices. Column 1: the input reconstructed images for testing, column 2: the output bone images, column 3: the output lung images, column 4: the output soft tissue images.
The test images in Fig. 5 were acquired at a low energy threshold (25 keV), which means that the detector receives more photons in a wider energy bin, so the reconstructed images contain fewer artifacts and less noise. To analyze the decomposition performance of our method on high-noise data, we tested the slice in the first row of Fig. 5 with narrower energy bins; the decomposition results are shown in Fig. 6. The results indicate that the proposed method can distinguish the three components well even with poor image quality, including some details that could not be identified by eye.

Decomposed images with energy thresholds of (a) 40 keV (b) 60 keV and (c) 70 keV. The three test images are the same slice as the first row in Fig. 5.
To evaluate the performance of the proposed material decomposition method, we used PA and intersection over union (IoU) as evaluation metrics. PA is the ratio of correctly classified pixels to total pixels.
IoU is a standard metric calculated as the ratio of the intersection to the union of the label map (ground truth) and the decomposition map (predicted value). Table 2 lists the accuracy evaluation for the three decomposition targets.
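Both metrics are straightforward to compute from a pair of label images. The following sketch uses the standard definitions on a hypothetical 4 × 4 example with class indices 0 = bone, 1 = lung and 2 = soft tissue; the maps and the function names are illustrative and not taken from the experiments.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class matches the label map."""
    correct = sum(
        p == t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    total = sum(len(row) for row in truth)
    return correct / total


def intersection_over_union(pred, truth, cls):
    """IoU of one class: |pred AND truth| / |pred OR truth|."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += p == cls and t == cls
            union += p == cls or t == cls
    return intersection / union if union else float("nan")


# Hypothetical label map (ground truth) and decomposition map (prediction).
truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
]
pred = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],   # one bone pixel predicted as lung
    [2, 2, 2, 2],
    [0, 2, 2, 2],   # one soft tissue pixel predicted as bone
]

print(pixel_accuracy(pred, truth))              # 0.875
print(intersection_over_union(pred, truth, 0))  # 0.6
print(intersection_over_union(pred, truth, 1))  # 0.8
print(intersection_over_union(pred, truth, 2))  # 0.875
```

The example shows why IoU is the stricter metric: two wrong pixels out of sixteen leave PA at 0.875, while the IoU of the small bone class drops to 0.6.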
Evaluations of network performance
For comparison, we also decomposed the three parts using basis material decomposition in post-reconstruction space. In the reconstructed mouse CT images, the attenuation characteristics of lung and soft tissue are very close to each other and significantly different from those of bone. To decompose three materials, we selected basis materials with atomic numbers close to those of bone, lung and soft tissue: calcium, water and carbon were used as the three basis materials for triple material decomposition. Three basis materials must be matched with three energy spectrum measurements according to Equation (3). The decomposed basis material images at different noise levels are shown in Figs. 7 and 8. As the decomposed images show, the traditional method could not distinguish lung and soft tissue well because of their similar attenuation characteristics.
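Matching three basis materials with three energy measurements amounts to solving a 3 × 3 linear system for every pixel. The sketch below assumes the common image-domain model in which the reconstructed value in each energy bin is a weighted sum of the basis material attenuation values. The matrix entries are hypothetical numbers, not measured attenuation coefficients, and the helper names are illustrative.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def solve3(a, b):
    """Solve a @ x = b for a 3 x 3 system with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("basis materials are not linearly independent")
    solution = []
    for col in range(3):
        # Replace one column of `a` by the measurement vector `b`.
        replaced = [
            [b[r] if c == col else a[r][c] for c in range(3)] for r in range(3)
        ]
        solution.append(det3(replaced) / d)
    return solution


# Hypothetical attenuation values. Rows: three energy bins.
# Columns: calcium, water, carbon.
basis = [
    [2.0, 0.40, 0.35],
    [1.2, 0.30, 0.28],
    [0.8, 0.25, 0.24],
]

# Hypothetical pixel made of 10% calcium, 60% water, 30% carbon.
true_fractions = [0.1, 0.6, 0.3]
measured = [sum(a * f for a, f in zip(row, true_fractions)) for row in basis]

fractions = solve3(basis, measured)
print([round(f, 6) for f in fractions])
```

In this made-up matrix the water and carbon columns are nearly proportional, so the determinant is small and the solution is sensitive to measurement noise. This mirrors, in a simplified way, why materials with similar attenuation are hard to separate with a direct inversion.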

Decomposed images of basis materials with low noise levels. From left to right, (a) the reconstructed CT images with energy thresholds of 25, 30 and 35 keV, (b) the decomposed “bone”, “lung” and “soft tissue” images.

Decomposed images of basis materials with high noise levels. From left to right, (a) the reconstructed CT images with energy thresholds of 50, 55 and 60 keV, (b) the decomposed “bone”, “lung” and “soft tissue” images.
This study focused on multi-material decomposition of spectral CT images. To assess our method, we compared it with basis material decomposition in post-reconstruction space, which can separate components with large attenuation differences such as bone and soft tissue. However, different types of soft tissue have similar attenuation characteristics, and the basis material decomposition method failed to decompose them accurately. For spectral CT images with narrower energy bins in particular, noise and artifacts can lead to seriously inaccurate results.
Among other material decomposition methods based on deep learning, D. P. Clark et al. [21] proposed a CNN-based method for material decomposition and used phantom simulation data to train the network. Y. Xu et al. [22, 23] focused on material decomposition for dual-energy CT imaging. C. Feng et al. [24] theoretically analyzed multi-energy material decomposition using a neural network. In this study, we proposed a new multi-material decomposition method via FC-DenseNets and used real spectral CT images to construct the training set. The experiments demonstrated that the proposed method could effectively decompose three different material components at high noise levels.
The improved FC-DenseNets has better feature extraction ability, fewer network parameters and faster convergence. The deepening of the network layers also improved the efficiency and accuracy of material decomposition. According to the evaluation metrics, our proposed method performed well on the test images. In a follow-up study, we will inject contrast agents into the mouse to mark specific soft tissue regions and identify the contrast agent regions to further characterize different soft tissues.
In conclusion, we proposed a material decomposition method for spectral CT via FC-DenseNets. The experimental results demonstrated that our proposed method achieved more accurate multi-material decomposition than basis material decomposition in post-reconstruction space and performed better on spectral CT data with high noise levels. Our work could help establish guidelines for multi-material decomposition of spectral CT images based on deep learning.
Acknowledgments
This work was partially supported by the National Key R&D Program of China (No. 2016YFC0104609), the National Natural Science Foundation of China (No. 61401049, 11605017), the Strategic Industry Key Generic Technology Innovation Project of Chongqing (No. cstc2015zdcy-ztzxX0002), the Chongqing Foundation and Frontier Research Project (No. cstc2016jcyjA0473), the Grant for Science and Technology Innovation in Chongqing (No. cstc2017shmsA00004), and the Fundamental Research Funds for the Central Universities (No. 10611CDJXZ238826, 2018CDGFGD0008). The authors would like to thank Beijing Tai Kun Industrial Equipment Co., Ltd. for testing the DECTRIS photon-counting detector.
