Abstract
Introduction
Radiation dose exposed to patients in computed tomography (CT) imaging has become a widespread concern due to the increasing CT examinations in medical imaging. A growing body of research indicates that a high x-ray radiation dose will increase the long-term risk of developing cancer [1]. Therefore, how to reduce the CT radiation dose is a crucial problem in the field of CT.
One of the key strategies to reduce the CT radiation dose is reconstructing images from few-view data. CT image can be obtained by exact reconstruction assuming sufficient projection data by using conventional filtered back-projection (FBP) and back-projection filtration (BPF) algorithms [2–7]. However, these methods are prone to cause artifacts and distortions in CT images reconstructed from insufficient few-view data due to the Shannon/Nyquist sampling theorem. Different from analytic algorithms, iterative algorithms arrive at the final solution in an iterative manner and they make milder assumptions on missing data. Therefore, iterative algorithms are usually adopted by CT reconstruction from few-view data.
Compressed sensing (CS) is a technique used for signal recovery from highly undersampled data [8]. It states that sparse or compressible signals can be accurately recovered from a reduced set of measurements, which may even be significantly below the Shannon/Nyquist sampling rate. Inspired by CS theory, various iterative algorithms have been proposed to solve the few-view CT reconstruction problem [9–18], among which total variation (TV) regularization is one of the most commonly used. Sidky et al. developed a CS image reconstruction algorithm which aims at solving constrained minimization of the estimated image TV, the so-called ASD-POCS framework [19–21]. The algorithm treats the raw data fidelity and the sparseness constraint separately in an alternating manner. The constraints are enforced by the use of projection onto convex sets (POCS) and the TV objective is minimized by steepest descent with an adaptive step-size. Ritschl et al. presented an improved method for optimized parameter adaption within the ASD-POCS framework for sparsity constrained image reconstruction [22]. Chen et al. proposed a prior image constrained compressed sensing (PICCS) method for CS-CT, where a prior image is utilized to sparsify CT images, which takes into account the similarity among images derived from the same image object [23].
However, as a fixed basis, TV regularization favors solutions with sparse gradients, thus it cannot effectively balance between true structures and image noise. In addition, a given fixed basis may not be universally optimal for all images. Dictionary learning (DL) operations can retain more true structures than TV [24–26]. It exploits the inherent sparsity in the image through the similar property and nonlocal operation on corresponding overlapping patches. DL methods adopt a two-step iterative scheme, where the sparse approximations are found with the dictionary fixed and the dictionary is subsequently optimized based on the current sparse coefficients. Xu et al. presented a dictionary learning based approach for low-dose X-ray CT, where the adaptive dictionary was learned from the samples and used to sparsify the images [27]. However, the existing DL methods train the dictionary in the original image-domain. An interesting question to consider is whether it is more effective when training the dictionary in a transform domain.
Recently, we developed a gradient-based dictionary learning method for MRI image recovery, where the dictionaries are trained from the horizontal and vertical gradient images instead of image itself, and the desired image is reconstructed from the sparse representations of both gradients [28]. This method can efficiently recover images from undersampled MRI data and presents superior performance compared to the traditional approaches. Then, we extended this strategy to CT reconstruction preliminarily [29]. Specifically, the reconstruction problem was formulated as an optimization problem and the alternating direction method (ADM) was used to provide a solution. However, this work is only confined to parallel beam CT geometry and was evaluated with simple computer simulations. Moreover, the step of reconstructing the image from gradients is based on an optimization algorithm which also leads to high computational complexity [30].
In the present study, we extend this method to practical fan-beam and cone-beam geometry and evaluate it with commercial micro-CT systems. Furthermore, the least-square method is adopted instead of the previous complex and time-consuming algorithm to reconstruct image from gradients. The proposed method has two advantages. First, the gradient-domain learned dictionary can lead to sparser representations than the conventional image-domain DL methods. This is because the gradient images are sparser than the image itself. Second, the learned dictionary can alleviate the artifacts caused by the piecewise smooth assumption and allow an image with complex structure to be recovered accurately. Experiments with simulation and real projections were performed to evaluate and validate the proposed Grad-DL algorithm.
The rest of this paper is organized as follows. Section 2 states the backgrounds on dictionary learning and DL-based CT reconstruction. In Section 3, the proposed model to learn dictionaries, image recovery methods from gradient-domain, and the corresponding overall workflow of Grad-DL algorithm are described in detail. Section 4 demonstrates the performance of the proposed algorithm on numerical and experimental results. Conclusions and future work are presented in Section 5.
Methodology
Dictionary learning
Given a 2D image f of size , fj,k is the vector representation of small square image patch of size , N >> n. Each image patch fj,k is uniquely indexed by the location of its top-left corner pixel (j, k) in the image f. The sliding distance of the patch is the number of pixels between the top left pixels of adjacent image patches. The dictionary is a matrix D ∈ Rn×K and each column d k ∈ R n (k = 1, . . . , K) is called an atom. The dictionary has K atoms, each is a n dimensional vector corresponding to a pixels image patch fj,k ∈ R n . It is assumed that each patch fj,k can be approximated by a linear combination Dαj,k of dictionary atoms, where αj,k ∈ R K has few nonzero entries. Accordingly, αj,k is sparse and the sparse representation of fj,k with respect to the dictionaryD. When n < K, the number of atoms is greater than the dimension of a patch, and D is said to be redundant.
The dictionary learning is the solution of the following optimization problem
The above equation can be rewritten as follows unconstrained form by the Lagrange method
Dictionary learning methods can avoid the defect resulting from fixed basis i.e. a given basis may not always be suitable for all images. Let f represent an image to be reconstructed, M denote the measurement matrix and g represent the projection data, the CT reconstruction problem can be expressed as the following formula based on the idea of dictionary learning.
It is observed that the gradient images are sparser than the image itself. Therefore, we can obtain sparser representation for image by learning dictionary in gradient-domain than in the conventional image-domain.
Here we show an example to demonstrate that the distribution of the signal samples influence the effectiveness of the DL method, which suggests improving sparsity indeed enhances the quality of dictionary learning. Specifically, we generated a dictionary with size 2 × 10 (referred to later as generated dictionary), whose ten atoms have a uniform angular distribution in the circle and unit l2 norm. Then two types of synthetic signal samples (each with size of 2 × 1500) were produced by a scalar multiplication of one atom, with uniformly distributed i.i.d. coefficients in random and independent locations. The only difference between these two types of sample is that one type of coefficients were drawn from standard Gaussian distribution while the other from standard Laplacian distribution to simulate difference sparsity level. Additionally, additive Gaussian noise was added to the resulting signal samples. Two dictionaries corresponding to these two signal samples were learned separately using K-SVD over 50 times, and compared to the original generated dictionary to measure the identification/recovery ability in the same criteria as that used in Ref. [31]. It can be obtained that the average identification percentage of the former is 96%, while the latter is almost 100%. Figure 1 shows the results in one run where the blue plus sign (+) denotes the signal samples, the green vectors represent generated atoms, and the red vectors show the learned atoms. It’s obvious that the samples from Laplacian distribution have a greater sparsity than those from Gaussian distribution. It also can be observed that some learned atoms from Gaussian coefficients deviate from the generating dictionary’s atoms as shown in Figure 1(a), while all atoms are successfully learned from Laplacian coefficients shown in Figure 1(b). Please note the learned basis vectors can be antiparallel to the generated vectors, because both positive and negative coefficients are allowed [32].
Proposed new model
We propose a new model to reconstruct CT image using dictionaries learned from the gradient-domain
A method is needed to exactly recover an image from the gradient-domain after the DL operations have been applied to the gradient image.
Let G = [G
x
, G
y
] denote the gradients of a 2D imagef (j, k) in the horizontal and vertical directions. The symbols j and k are the indices of the location of the discrete image, where and is the size of the 2D image. In the discrete domain, the gradient fields of an image in the horizontal and vertical directions are defined as the simple forward differences
The proposed Grad-DL algorithm consists of three components: ART reconstruction, gradient-domain DL and gradient images recovery. The algorithm alternately updates sparse representations of the gradient image patches, recovers the horizontal and vertical gradients, and estimates the desired image from both gradients. The flowchart of the Grad-DL algorithm is presented in Table 1.
Experimental results
In this section, two numerical simulations and real projections study were designed to compare the performance of the proposed algorithm with ART-TV and conventional DL algorithms. During the implementation of existing dictionary learning algorithm and proposed Grad-DL algorithm, the sparsity level T0 for image reconstruction was set to 5. The local reconstruction error threshold was set to 0.00025. The patch size was 12 × 12 pixels with the sliding distance of 2 pixels. After every 10 iterations of the ART, the DL process was performed. The number of iterations for TV-steepest descent was set to 10 and the gradient descent step was fixed at 0.2. In general, the algorithm parameters followed the setting of Ref. [19, 26] with the best performance.
Human dental simulation
The FBP reconstruction from the normal-dose sinogram in the prototype dental CT system was chosen as a numerical phantom. The developed dental CT prototype system consists of an x-ray source (X20P, IAE, Italy) and a flat-panel detector (PaxScan 2520D, Varian, USA). The detector consists of a 1920 × 1536 array with a pixel pitch of 0.127 mm. In human tooth imaging, the detector was operated in the mode of 2 × 2 Z pixel binning, corresponding to a 0.254 mm pixel size. Circular cone-beam data were acquired from the 360 projection views distributed over 2π of human tooth using the parameters of 90 kVp and 9 mA. The source to object (SOD) distance was set at 540 mm and the source to detector (SDD) distance was set at 654 mm.
In the simulation, the parameters of fan-beam CT geometry were set to be identical to those used in actual human tooth imaging except that 60 views were uniformly distributed over 360°. The iterative process was stopped after 105 iterations. Figure 2 shows the images reconstructed from the 60-view data by using the ART-TV, ART-DL, and Grad-DL algorithms, respectively. It can be observed that the images reconstructed from the Grad-DL and ART-DL preserve more detail structures than those reconstructed by using conventional ART-TV. In addition, the results of Grad-DL always have less noise than those of the ART-DL algorithm.
Laboratory rat simulation
The FBP reconstruction from the full view data in prototype micro-CT system was chosen as a numerical phantom. The developed micro-CT prototype system consists of an x-ray source, a rotational object holder and a CCD detector. The open x-ray source (FXE: 160.51, YXLON, Germany) has a minimum focal spot of the size less than 0.002 mm. A cooled x-ray CCD imaging detector (Quad-RO: 4320, Princeton Instruments, USA) is used to acquire the images. The CCD imaging detector has a high spatial resolution, large active imaging area and low noise (0.024 mm pixel size, 2084×2084 array and 50×50 mm2 active area). In order to validate the proposed algorithm’s performance, a laboratory rat was scanned to obtain projection data. Circular cone-beam data were acquired from the 360 projection views of the rat distributed over 2π, with the parameters of 80 kVp, 7 w and 8 s per exposure. The source-to-detector distance was set at 710 mm and the SOD was set at 530 mm.
In the simulation, the parameters of fan-beam geometry was set to be identical to those used in laboratory rat imaging except that few view data (1/6 times of the full scan) is used. The reconstructed images from 60 projections of the rat simulation using ART-TV, ART-DL and Grad-DL algorithms are shown in Fig. 3. The top, middle, and bottom rows demonstrate the images reconstructed from 105 iterations using ART-TV, ART-DL, and Grad-DL algorithms, respectively. Apparently the quality of reconstructed image using the Grad-DL algorithm is significantly better than that using the ART-TV or ART-DL algorithm. It can be seen that there are much streak artifacts around bones in the ART-DL reconstruction. This is consistent with the results from the dental CT images.
Real data study
In this experiment, the commercial in-vivo SkyScan-1076 micro-CT system was used to collect the real data. An anesthetized mouse was scanned in a circular cone-beam mode. Over a 2π range, 120 projections were uniformly collected. This meant that each projection covered 3°. The cone-beam was filtered with a 0.5 mm thick aluminum plate to reduce the beam hardening artifact and avoid saturating the detector. A source voltage of 70 kV and current of 140μA were used. The source to object and detector were 121 mm and 165 mm, respectively.
The reconstructed images are shown in Fig. 4. It is demonstrated that Grad-DL algorithm can reconstruct details more accurately from few-view data than conventional dictionary learning algorithms. Moreover, compared to Grad-DL, few-view data reconstruction by using the ART-TV algorithm appears serious deviation. The corresponding magnified regions are shown in Fig. 4(d). Grad-DL can suppress image noise more effectively while keeping subtle structures.
Parameters and quantitative evaluation
In addition to visualization based evaluation, we analyzed the sensitivity of the Grad-DL algorithm to parameter setting by varying one parameter at a time while keeping the rest fixed at their nominal values. We compared the sliding distance, patch size and sparsity level of the proposed algorithm in human dental simulation. Fixing the patch size at 12 × 12 and sparsity level at 5, reconstructions were performed when the sliding distance increased from 1 to 8. Then, fixing the sliding distance at 2 and sparsity level at 5, the reconstructions were performed for the patch size increasing from 4 × 4 to 28 × 28. Finally, the sparsity level increases from 2 to 10 when patch size and sliding distance are fixed at 12 × 12 and 2, respectively. The Grad-DL algorithm was quantitatively evaluated using the root mean square error (RMSE)
In order to evaluate the performance of the present Grad-DL algorithm in a more quantitative manner, the peak signal-to-noise ratio (PSNR) was used in this study. It is defined as
Due to the harmful effects of radiation dose to patients, minimizing the risk of x-ray exposure has become a hot research field in medical CT imaging. Few-view CT image reconstruction is of great importance in clinical imaging owing to its potentially reduced scan time and x-ray radiation dose. In this study, a novel gradient based dictionary learning method is proposed for CT image reconstruction from few-view data. This algorithm effectively integrates the gradient-domain image recovery and dictionary learning technique. The proposed Grad-DL algorithm aims to learn dictionary in the sparser gradient domain for better CT reconstruction. This algorithm has been tested in numerical simulations and with real CT datasets. The promising results were obtained in terms of preserving structural details and suppressing image noise. Especially, it is also verified that the proposed approach outperforms the conventional dictionary learning method for few-view CT data. Further studies are needed to improve the proposed algorithm, such as learning dictionaries in both pixel and gradient domains or introducing some filtering operation for gradient image. In addition, pre-learning a dictionary from images reconstructed from normal-dose full data also is a better choice.
Footnotes
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (81401410, 81301216, 81527901, 81527804, 11575285, 61501446), the Basic Research Program of Shenzhen in China (JCYJ20140417113430558, JCYJ20140610151856710, JCYJ20150630114942310), the Applied Science and Technology Project of Guangdong Province (2015B020233014), the National High Technology Research and Development Program (863 Program) of China (SS2015AA020935), the Special Fund Project for Development of Strategic Emerging Industry of Shenzhen in China (CXZZ20140505091419405).
