Abstract
X-ray radiation is harmful to human health. Reconstructing high-quality images from only a few projection views is therefore a major challenge in computed tomography (CT), because it allows the radiation dose to be reduced. In this study, we propose a new algorithm, named PWLS-TGV-DL, that combines penalized weighted least-squares using total generalized variation (PWLS-TGV) with dictionary learning (DL) to address this challenge. We evaluated the algorithm through both data simulation and physical experiments, and analyzed the results qualitatively and with quantitative measures such as the structural similarity index (SSIM) and the root mean square error (RMSE). The experiments and data analysis indicate that the new algorithm recovers CT images more effectively and yields better results than traditional CT image reconstruction approaches.
Introduction
CT is widely used in hospitals for diagnosis and intervention. However, X-ray radiation is harmful to human health, and many clinical studies have indicated that CT radiation doses beyond the normal range can cause diseases such as metabolic abnormalities and cancer. Because CT radiation doses accumulate over time, repeated CT scans increase the probability of cancer [1], highlighting the importance of low-dose CT. However, reducing the X-ray dose degrades the reconstructed image quality. Therefore, obtaining high-quality diagnostic CT images from either sparse view acquisition or low-dose radiation is an important research topic in the field of CT.
Two strategies exist to reduce the radiation dose: lowering the milliampere-seconds (mAs) or peak kilovoltage (kVp) [2, 3] and reducing the number of projection views per rotation around the body [4, 5]. Various techniques, including better image reconstruction methods [5–7] and scanning schemes [8–10], have been extensively studied to reduce radiation doses in CT examinations. Among these is statistical iterative reconstruction (SIR), which optimizes a maximum-likelihood or penalized-likelihood function formulated according to the statistical characteristics of the projection data and thereby permits a lower radiation dose than the filtered back-projection (FBP) reconstruction algorithm [11, 12]. In general, however, conventional regularization terms penalize the differences between neighboring pixels and therefore tend to oversmooth edge regions. To address this shortcoming, several edge-preserving regularization terms have been proposed [13, 14]. Two typical examples are total variation (TV) regularization with the piecewise constant assumption (PCA) [15–17] and total generalized variation (TGV) regularization [18, 19] adapted under the penalized weighted least-squares (PWLS) criterion. Unfortunately, neither TV nor TGV regularization yields satisfactory reconstructions from few-view projections.
Dictionary learning (DL) has received increasing attention in recent years and has been incorporated into iterative reconstruction frameworks by various algorithms [20–23], such as adaptive dictionary-based statistical iterative reconstruction (ADSIR) and 3D dictionary learning (3D-DL) [24]. The combination of dictionary learning and iterative reconstruction in these methods can effectively improve the imaging performance. Other methods treat dictionary learning as postprocessing to reduce artifacts in images reconstructed with FBP, such as artifact suppression dictionary learning (ASDL) algorithms [25–31].
To reconstruct a better image from few-view projections (116 views in our simulation study), we propose an improved CT statistical iterative reconstruction method in this paper. Our approach is to reconstruct intermediate images using TGV under the PWLS criterion [18] and then postprocess the results using sparse coding dictionary learning to eliminate residual noise and produce clinically acceptable CT images. The proposed method is abbreviated as PWLS-TGV-DL for simplicity. The novelty of our approach is that PWLS-TGV-DL can produce better images with less noise and fewer patchy artifacts. Both qualitative and quantitative assessments on digital and physical phantoms are presented and evaluated in terms of accuracy and resolution.
The remainder of this paper is organized as follows. Section 2 reviews the PWLS image reconstruction model, TGV model, dictionary learning, the PWLS-TGV-DL algorithm, its optimization algorithm, and its workflow. In Section 3, we present the experimental setup, the evaluation metrics, and the results of the PWLS-TGV-DL algorithm along with those of comparison methods using both simulation and physical experiments. Finally, Section 4 provides a discussion and a conclusion.
Methods
PWLS image reconstruction
The penalized weighted least-squares (PWLS) approach for iterative reconstruction of X-ray CT images was studied previously [32]. Based on the noise properties of CT projection data, the PWLS criterion for CT image reconstruction can be written as follows [33]:
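As a rough illustration of the criterion referred to above, the sketch below evaluates a PWLS cost of the commonly used generic form (y − Hf)ᵀ Σ⁻¹ (y − Hf) + β R(f) on a toy problem. The names `pwls_cost`, `H`, `var`, and `beta`, and the quadratic neighbor-difference penalty are assumptions chosen for illustration; they are not taken from the paper and need not match its exact equation.

```python
import numpy as np

def pwls_cost(f, H, y, var, beta, penalty):
    """Weighted least-squares data term plus a weighted penalty.

    f: image as a flat vector; H: system (projection) matrix;
    y: measured sinogram; var: per-measurement variance, i.e. the
    diagonal of Sigma; penalty: callable returning R(f).
    """
    r = y - H @ f
    data_term = np.sum(r * r / var)  # (y - Hf)^T Sigma^{-1} (y - Hf)
    return data_term + beta * penalty(f)

def quadratic_roughness(f):
    # Illustrative penalty only: squared differences of neighboring pixels.
    return np.sum(np.diff(f) ** 2)

# Toy problem: 8 measurements of a 4-pixel "image" (hypothetical data).
rng = np.random.default_rng(0)
H = rng.random((8, 4))
f_true = np.array([1.0, 1.0, 2.0, 2.0])
y = H @ f_true              # noise-free measurements
var = np.full(8, 0.5)       # assumed constant variance

cost_true = pwls_cost(f_true, H, y, var, beta=0.1, penalty=quadratic_roughness)
cost_zero = pwls_cost(np.zeros(4), H, y, var, beta=0.1, penalty=quadratic_roughness)
```

In this toy case the data term vanishes at `f_true`, so only the penalty contributes to `cost_true`, whereas the all-zero image pays a large data-fidelity cost. Dividing by the variance down-weights unreliable measurements, which is the reason for using a weighted rather than an ordinary least-squares term.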
Based on previous works already available [34], in this study, the variance of
The TGV algorithm was first proposed by Bredies et al. in the image denoising model [18], which is used to measure image characteristics up to a certain order of differentiation. Mathematically, TGV can be defined as follows:
In this work, we focus only on the second-order TGV, which can be mathematically defined as
The ∞-norm in Eq. 5 can be calculated by
The minimum is taken over all vector fields in Ω, and the weak symmetrized derivative
Let N and K be integers and ℝ be the real space. Given a 2D image f of size
Finding a sparse representation α_{j,k} ∈ ℝ^{K×1} of an image patch f_{j,k} ∈ ℝ^{N×1} with respect to a given dictionary is equivalent to solving the following optimization problem:
Equation 10 can be rewritten in the following unconstrained form by the Lagrange method
We formulate the CT reconstruction problem using DL and discuss the strategy for dictionary construction. Inspired by the studies of TGV in image reconstruction [19], we provide the following cost function of our algorithm:
Therefore, the minimization problem in Equation 12 can be rewritten as follows:
Because Equation 14 includes several variables, we propose an improved alternative optimization method that involves three subproblems, which are mathematically defined as follows:
This subproblem is a classic statistical image reconstruction problem. The solution to this problem in the implementation has been studied by Elbakri and Fessler et al. [11], and this solution is a separable paraboloidal surrogate algorithm written as follows:
A corresponding discrete version to solve this problem was studied by Bredies et al. [39], which can be rewritten as follows:
Likewise, another
This subproblem aims at finding the sparsest representation α_{j,k} with respect to the dictionary D and updating an image with respect to an adaptive dictionary. DL-based image reconstruction in the few-view case was studied by Xu et al. [40] and Liao and Sapiro [41]. These studies used the orthogonal matching pursuit (OMP) algorithm [42] and the K-singular value decomposition (K-SVD) algorithm [23] to solve this problem. Then, the DL step was incorporated in the iterative reconstruction procedure to update an intermediate image.
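The sparse coding step mentioned above can be sketched with a textbook version of OMP. This is a minimal illustration, not the implementation used in the cited studies: the function name `omp`, the stopping tolerance, and the toy dictionary (identity atoms plus normalized Hadamard atoms) are all assumptions, and the K-SVD dictionary update is not shown.

```python
import numpy as np

def omp(D, x, max_atoms, tol=1e-10):
    """Greedy sparse coding of x over the columns (atoms) of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    alpha = np.zeros(D.shape[1])
    for _ in range(max_atoms):
        if residual @ residual <= tol:
            break
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    if support:
        alpha[support] = coef
    return alpha

# Toy overcomplete dictionary: 16 identity atoms + 16 unit-norm Hadamard atoms.
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H16 = H2
for _ in range(3):
    H16 = np.kron(H16, H2)
D = np.hstack([np.eye(16), H16 / 4.0])

alpha_true = np.zeros(32)
alpha_true[2] = 3.0
alpha_true[20] = -2.0
x = D @ alpha_true                 # a 2-sparse signal
alpha = omp(D, x, max_atoms=2)     # recovers the two active atoms
```

The toy dictionary has low mutual coherence, so a 2-sparse signal is recovered exactly; a learned dictionary gives no such guarantee, which is why a representation-error tolerance is normally used as the stopping rule.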
The entire PWLS-TGV-DL algorithm can be summarized as follows:
f^{old} = f^n, w^{old} = w^n, w^{n+1} = w^n + τ(p^{n+1} + div_2 q^{n+1})
Using the OMP and SVD algorithms, update
Selection of β1, β2 and β3
β1, β2 and β3 are hyperparameters that balance the data fidelity and regularization terms. Choosing appropriate values is a difficult problem in CT image reconstruction. In practice, we selected them empirically and then assessed the choice by visually comparing the images reconstructed with the selected parameters against the normal-dose images. In our work, we found that the reconstructed images were relatively insensitive to the value of β1 within the range 1×10⁻³ ≤ β1 ≤ 1×10⁻². The two parameters β2 and β3 are especially important in controlling the smoothness of the reconstructed image; they were tuned case by case on the basis of the noise level of the image and the sparse-view projections.
Selection of parameters in solving (P2) and (P3)
The number of sub-iterations is an important factor for obtaining a successful result. Too few iterations may fail to reach convergence, whereas too many make the computational load very heavy. In practice, using the trial solution of (P1) as the initial value for (P2) and the trial solution of (P2) as the initial value for (P3) greatly reduces the total number of iterations. Furthermore, the step variables ρ and τ, which control the step lengths of the updating procedure, are also crucial and must be chosen suitably in the experiment. In our studies, ρ and τ were optimized by using α0 = 2, α1 = 1. Dictionary redundancy improves the sparsity of the representation, and the dictionary learning parameters should likewise lie in an appropriate range. A patch size of 5×5 pixels was used in this paper. In the dictionary learning process, we solve the optimization problem (22) by minimizing the representation error with ε = 2.5×10⁻⁵.
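To make the role of the 5×5 patch size concrete, the following sketch splits an image into overlapping 5×5 patches (each flattened to a vector of length N = 25) and reassembles the image by averaging the overlaps. The helper names `extract_patches` and `average_patches` and the stride of one pixel are assumptions for illustration; the paper does not state how patches were sampled.

```python
import numpy as np

def extract_patches(img, size=5, stride=1):
    """Return patches as columns of a (size*size, n_patches) matrix."""
    rows = range(0, img.shape[0] - size + 1, stride)
    cols = range(0, img.shape[1] - size + 1, stride)
    corners = [(r, c) for r in rows for c in cols]
    patches = [img[r:r + size, c:c + size].ravel() for r, c in corners]
    return np.array(patches).T, corners

def average_patches(patches, corners, shape, size=5):
    """Put patches back in place and average overlapping pixels."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for vec, (r, c) in zip(patches.T, corners):
        acc[r:r + size, c:c + size] += vec.reshape(size, size)
        weight[r:r + size, c:c + size] += 1.0
    return acc / weight  # with stride 1 every pixel is covered at least once

img = np.arange(64, dtype=float).reshape(8, 8)   # hypothetical test image
P, corners = extract_patches(img)                # P has shape (25, 16)
restored = average_patches(P, corners, img.shape)
```

In a dictionary-based denoising step, each column of `P` would be replaced by its sparse approximation before averaging; here the patches are left unchanged, so `restored` reproduces `img`.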
Experiments and results
Experimental setup
To evaluate the performance of the PWLS-TGV-DL algorithm for CT image reconstruction, we conducted experiments on the digital extended cardiac-torso (XCAT) phantom and the head physical phantom.
Digital XCAT phantom
Figure 1(a) shows a slice of the XCAT phantom. We chose a geometry representative of a monoenergetic fan-beam CT scanner setup with a circular orbit to acquire 1160 projection views over 2π. The number of channels per view was 672. The distance from the detector arrays to the X-ray source was 1040 mm, and the distance from the rotation center to the X-ray source was 570 mm. The reconstructed images were composed of 512×512 square pixels. Each projection datum along an X-ray through the sectional image was calculated based on the known densities and intersection areas of the ray with the geometric shapes of the objects in the sectional image.

Digital and physical phantoms used in the studies: (a) a slice of the digital XCAT phantom; (b) the standard image reconstructed by the FBP method of the head phantom.
Similar to previous studies by Niu et al. [19], we first simulated the noise-free sinogram data ŷ and then generated the noisy transmission measurement I according to the statistical model of the prelogarithm projection data, that is,
Figure 1(b) shows the standard image of the head phantom reconstructed by the FBP method. This experiment was performed on an in-house CT imaging bench in our lab. The system had a rotating-anode tungsten target diagnostic level X-ray tube (Varex G-242, Varex Imaging Corporation, UT, USA) and was operated in 120.00 kV continuous fluoroscopy mode with a 0.40 mm nominal focal spot. The X-ray tube current was set to 11.00 mA. The X-ray detector was an energy-resolving photon-counting detector (XC-Hydra FX50, XCounter AB, Sweden) made from cadmium telluride (CdTe). The projection datasets were acquired with a 1×1 detector binning mode and were rebinned to 6×6 during the postprocessing procedures. Because there was no need to discriminate the photon energies, the detector was operated and calibrated with only a single energy threshold (10.00 keV). The source-to-detector distance was 1500.00 mm, and the source-to-rotation-center distance was 1000.00 mm. A dental and diagnostic head phantom (Atom Max 711-HN, CIRS Inc., VA, USA) was imaged in this work. The phantom was rotated through 360 degrees at a 1.0 degree angular interval.
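The 6×6 rebinning mentioned above can be pictured as follows. This is a sketch under the assumption that rebinning sums the photon counts of each 6×6 block of detector pixels; the text does not say whether counts were summed or averaged, and the function name `rebin` and the frame size are hypothetical.

```python
import numpy as np

def rebin(frame, factor=6):
    """Sum counts over non-overlapping factor x factor pixel blocks."""
    h2, w2 = frame.shape[0] // factor, frame.shape[1] // factor
    # Drop edge pixels that do not fill a complete block.
    trimmed = frame[:h2 * factor, :w2 * factor]
    return trimmed.reshape(h2, factor, w2, factor).sum(axis=(1, 3))

frame = np.ones((13, 20))   # hypothetical detector frame, one count per pixel
binned = rebin(frame)       # 2 x 3 blocks, each holding 36 counts
```

Summing raises the counts per binned pixel and so lowers the relative Poisson noise, at the cost of coarser detector sampling.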
Performance evaluation
We applied numeric metrics to perform an objective assessment. The first metric is the structural similarity index (SSIM), which measures the similarity between two images by considering the luminance l, contrast c, and structural information s, where f is the reconstructed image, and y is the reference image. SSIM is defined mathematically as follows:
The second metric is root mean square error (RMSE), which indicates the difference between the reconstructed image and the ground truth image and characterizes the reconstruction accuracy:
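The two metrics can be sketched as below. The SSIM here is a simplified single-window (global) version using the customary constants C1 = (0.01 L)² and C2 = (0.03 L)², where L is the data range; SSIM is usually computed over local windows and averaged, so this is an assumption made for brevity, and the names `global_ssim` and `rmse` are not from the paper.

```python
import numpy as np

def rmse(f, y):
    """Root mean square error between reconstruction f and reference y."""
    return float(np.sqrt(np.mean((f - y) ** 2)))

def global_ssim(f, y, data_range=1.0):
    """SSIM evaluated once over the whole image (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_f, mu_y = f.mean(), y.mean()
    var_f, var_y = f.var(), y.var()
    cov = np.mean((f - mu_f) * (y - mu_y))
    numerator = (2 * mu_f * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_f ** 2 + mu_y ** 2 + c1) * (var_f + var_y + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(1)
ref = rng.random((32, 32))                            # hypothetical reference
noisy = ref + 0.05 * rng.standard_normal(ref.shape)   # hypothetical result

err = rmse(noisy, ref)
sim = global_ssim(noisy, ref)
```

A perfect reconstruction gives an RMSE of 0 and an SSIM of 1, which is why a lower RMSE and an SSIM closer to 1 indicate better agreement with the reference.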
To validate and evaluate the performance of PWLS-TGV-DL, TV regularization and TGV regularization were also conducted using the PWLS criterion for comparison. The methods being compared are referred to simply as PWLS-TV and PWLS-TGV. All the experiments were implemented using MATLAB 2016a and executed on a PC equipped with an Intel Core i7-6700 CPU @ 3.40 GHz and 56.0 GB of memory.
In the digital phantom study, the original phantom data were directly used as the ground-truth image. As mentioned above, low-dose CT can be implemented by limiting the number of projection views. To achieve a more comprehensive study, we tested our method using a few-view case.
In the few-view case, the CT scan projection views were set to 116 views over 360°. The reconstructed images and the zoomed ROIs of the tested methods are shown in Fig. 2. For simplicity, the two ROIs shown in the figure are referred to herein as ROI1 and ROI2, respectively. The PWLS-TV result in Fig. 2(a) contains substantial noise throughout the entire reconstructed region; some small structures are almost entirely obscured by noise, which could cause misdiagnosis in the clinic. Figure 2(b) shows the image reconstructed by PWLS-TGV, which is better than that of PWLS-TV; the image is much cleaner after TGV regularization, but it still contains some noise. Figure 2(c) shows the reconstruction by our proposed new model, which eliminates most of the noise and artifacts, indicating that PWLS-TGV-DL is effective at reducing noise. These images demonstrate that the proposed approach is better than the traditional CT reconstruction methods for few-view reconstruction.
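One simple way to obtain a 116-view dataset from the 1160-view simulation is uniform angular subsampling, sketched below. Keeping every tenth view is an assumption; the text gives only the two view counts and does not describe how the few-view data were generated. The zero-filled sinogram stands in for real projection data.

```python
import numpy as np

n_full, n_few, n_channels = 1160, 116, 672
step = n_full // n_few                      # keep every 10th view

angles_full = np.linspace(0.0, 2 * np.pi, n_full, endpoint=False)
keep = np.arange(0, n_full, step)

sinogram_full = np.zeros((n_full, n_channels))  # placeholder projection data
sinogram_few = sinogram_full[keep]              # shape (116, 672)
angles_few = angles_full[keep]                  # evenly spaced over 2*pi
```

The retained views stay evenly spaced, so the angular step grows from 2π/1160 to 2π/116 and the measurement count drops by a factor of ten.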

Image reconstructed by (a) PWLS-TV, (b) PWLS-TGV, and (c) our proposed new model.
We also provide the RMSE and SSIM values of the images reconstructed by the three algorithms in Table 1. The PWLS-TGV-DL method achieves the lowest RMSE value, and its SSIM value is closest to 1. Because the improvement measured over the full image may appear modest, we also provide the RMSE and SSIM values for ROI1 and ROI2 in Table 2 to better show the advantages of our algorithm.
Numeric results from three algorithms (PWLS-TV, PWLS-TGV and Proposed method)
Numerical results of (ROI1 and ROI2) from three algorithms (PWLS-TV, PWLS-TGV and the proposed method)
The profile images and residual images are compared in Figs. 3 and 4, respectively. Clearly, the proposed method curve is closer to the Phantom curve than are those of the other two methods. The results show that compared to the other methods, our proposed method achieves images with superior quality.

The profiles were located at the pixel positions on the x-axis from 160 to 400 and y = 320. The Phantom curve represents the profile of the true image in Fig. 1(a). The PWLS-TGV curve represents the profile of the reconstructed image using the PWLS-TGV method in Fig. 2(b). The curve of the proposed method represents the profile of the reconstructed image using the PWLS-TGV-DL method in Fig. 2(c).

Residual images of the reconstructed results from the simulation XCAT data based on the (a) PWLS-TV method, (b) PWLS-TGV method, and (c) our proposed method. All the images are displayed in the same window.
In the physical phantom study, an image reconstructed with 720 views by the FBP method was used as the true image. To further evaluate our proposed method, we also performed an experiment in which the views were reduced to 90 and 180.
Figure 5 shows the images reconstructed by the different methods at 90 and 180 views. To further display the advantages of PWLS-TGV-DL, zoomed ROIs (as indicated by the squares in Fig. 5(a)) are shown in Fig. 6. Serious artifacts existed in the PWLS-TV results in all cases. Our proposed method yielded more noticeable gains than the PWLS-TGV method in terms of patchy artifact suppression. These results suggest that compared with the PWLS-TGV method, PWLS-TGV-DL achieves profiles that better match the gold standard.

Experimental results of the head phantom. The images in (a)– (c) are reconstructed with 90 views, and the images in (d)– (f) are reconstructed with 180 views. The images in (a) and (d) were reconstructed by the PWLS-TV algorithm; the images in (b) and (e) were reconstructed by the PWLS-TGV algorithm; and the images in (c) and (f) were reconstructed by our proposed algorithm.

Zoomed-in views of images reconstructed by PWLS-TV ((a) and (d)), PWLS-TGV((b) and (e)), and our proposed method ((c) and (f)) from 90-view (1st row) and 180-view (2nd row) projections.
Table 3 compares the numeric results of these methods. Note that PWLS-TGV-DL obtains the best results on all the numeric metrics. We can conclude that our proposed method has a clear performance advantage over the compared methods.
Numeric results from three algorithms on head phantom
In this paper, we proposed a new few-view CT reconstruction solution that combines TGV minimization and sparse dictionary learning under the PWLS criterion, and we assessed the results both qualitatively and with quantitative measures such as SSIM and RMSE. In terms of both visual quality and the performance metrics, our proposed method has considerable advantages in reconstructed image quality relative to traditional algorithms such as PWLS-TV and PWLS-TGV. The proposed method has a higher reconstruction accuracy, and it can suppress artifacts and noise while preserving more edge structure information.
In summary, we proposed a new algorithm that combines penalized weighted least-squares using total generalized variation and dictionary learning for few-view projection data and demonstrated its promising performance. First, an intermediate image is reconstructed using TGV minimization; then, it is postprocessed using dictionary learning to remove residual noise and produce a clinically acceptable CT image. The proposed PWLS-TGV-DL method is targeted at efficiently eliminating noise and reducing the artifacts of the TGV-based method for few-view CT image reconstruction. From the simulation and physical experimental results presented in Section 3, in which PWLS-TGV-DL was compared with reconstruction methods such as PWLS-TV and PWLS-TGV, the proposed method improves the quality of reconstructed images.
The improved reconstruction performance of the proposed method suggests a new direction for studying how dictionary learning can improve the image quality of traditional reconstruction algorithms. The proposed method outperforms the conventional denoising scheme and the edge-preserving regularization in terms of image quality at a similar level of noise suppression. The approach is convenient and attractive for clinical applications. However, the main shortcoming of PWLS-TGV-DL is that the matrix update process during dictionary learning increases the computational burden and requires a long running time. To solve this problem, a fast computer and dedicated hardware are required. Future work will include evaluations of the approach in more realistic situations and on clinical raw data. We believe that iterative image reconstruction methods such as the PWLS-TGV-DL algorithm will be widely used in clinical practice in the future.
Acknowledgments
The authors would like to thank Prof. Jianhua Ma and Zhaoying Bian at Southern Medical University for providing the total generalized variation code. This work was supported by the National Natural Science Foundation of China (81871441), the Guangdong Special Support Program of China (2017TQ04R395), the Natural Science Foundation of Guangdong Province in China (2017A030313743), the Guangdong International Science and Technology Cooperation Project of China (2018A050506064), and the National Natural Science Foundation of China (61601426). The authors would like to thank the editor and anonymous reviewers for their constructive comments and suggestions.
