Super-resolution image reconstruction from sparsity regularization and deep residual-learned priors

Abstract

BACKGROUND:

Computed tomography (CT) plays an important role in non-destructive testing. However, conventional CT images often have blurred edges and unclear textures, which hinders subsequent medical diagnosis and industrial testing.

OBJECTIVE:

This study aims to generate high-resolution CT images using a new CT super-resolution reconstruction method that combines sparsity regularization with a deep learning prior.

METHODS:

The new method reconstructs CT images through a reconstruction model that incorporates image gradient L₀-norm minimization and deep image priors within a plug-and-play super-resolution framework. The deep learning priors are learned by a deep residual network and then plugged into the proposed framework, and the alternating direction method of multipliers is used to solve the model iteratively.

RESULTS:

The simulation results show that the new method improves the peak signal-to-noise ratio (PSNR) by 7%, and the modulation transfer function (MTF) curves show that the MTF50 value increases by a factor of 0.02, compared with deep plug-and-play super-resolution. On real CT image data, the new method improves the PSNR by 5.1% and the MTF50 by a factor of 0.11.

CONCLUSION:

Both simulation and real data experiments show that the proposed CT super-resolution method using deep learning priors can reconstruct CT images with lower noise and better detail recovery. The method is flexible, effective, and widely applicable to the super-resolution of low-resolution CT images.

Keywords

CT image reconstruction, super-resolution, sparsity regularization, deep learning prior, plug-and-play, alternating direction method of multipliers

1 Introduction

During medical computed tomography (CT) imaging, the resolution of CT images is affected by various factors, such as the imaging resolution of the CT equipment and the radiation dose that patients can bear [1, 2]. Low-resolution CT images make subsequent diagnosis more difficult. Image resolution is an important indicator of CT image quality: the higher the resolution, the more information can be obtained from the image [3, 4]. CT imaging quality can be improved by using more sophisticated hardware; however, manufacturing costs are often very high and delicate components have a short service life. Therefore, super-resolution (SR) reconstruction algorithms that transform low-resolution CT (LRCT) images into high-resolution CT (HRCT) images on existing hardware have become a research hotspot [5–7].

At present, the mainstream super-resolution reconstruction algorithms for CT images fall into two types: those based on optimization models and those based on deep learning. Algorithms based on optimization models constrain the solution space of the super-resolution reconstruction problem by adding prior knowledge of the image to the reconstruction model as a constraint [8–10]. However, when the prior information is insufficient, the results of these methods, whose regularization terms are hand-crafted mathematical expressions, are not satisfactory [11]. For example, total variation (TV) regularization is reported to over-smooth the reconstruction results [12], and in wavelet-based super-resolution the choice of wavelet basis suffers from limited adaptability and an insufficient ability to preserve high-frequency details [13].

Recently, deep learning networks have been widely used in image super-resolution [14, 15]. Ren proposed two CT super-resolution (CTSR) models [16] based on convolutional neural networks (CNN) and residual learning: a single-slice CTSR network (S-CTSRN) and a multi-slice CTSR network (MCTSRN). In 2018, Park [17] used the classic U-Net for CT image super-resolution. In 2019, You [18] used an unsupervised residual learning method based on the Cycle-GAN network to learn the mapping between high-resolution and low-resolution CT images. Recently, some super-resolution reconstruction methods combining sparse regularization and deep learning priors have been applied to natural images [19, 20]. Compared with sparse regularization terms based on mathematical expressions, deep learning networks can learn deeper image features from training data sets [21], thus improving the quality of super-resolution results. Combining deep learning with an optimization model is therefore regarded as a technical route that meets practical demands. CT imaging places high requirements on reliability, and the reliability of deep learning remains controversial [22]; embedding the network in an optimization model can improve reliability to a certain extent and make the method practical.

In this paper, a CT image super-resolution reconstruction model is designed based on low-resolution projection data. A CT super-resolution method based on deep learning prior (DIP-CTSR) is proposed, which combines sparse regularization and deep learning priors. The proposed method uses a deep plug-and-play framework and the alternating direction method of multipliers (ADMM) [11] to introduce super-resolution networks into iterative optimization algorithms. Theoretical studies have shown that a suitable degradation model is the key to the success of super-resolution algorithms [23, 24]. Unlike super-resolution reconstruction algorithms for natural images, which start from the image domain, CT imaging goes through scanning, projection acquisition and reconstruction, so the CT super-resolution model takes the projection data as its input. Therefore, designing a reasonable super-resolution reconstruction model for CT images is an important challenge for the proposed method.

The rest of the paper is organized as follows. Section 2 presents the CT super-resolution reconstruction model and implementation details. The simulated and real data and experimental results are described in Section 3, and the discussion and conclusions are given in Sections 4 and 5.

2 Method

2.1 Super-resolution reconstruction framework

In this work, a CT super-resolution reconstruction algorithm combining an L₀-regularization term and deep learning priors is proposed. It is a model-driven super-resolution method. During the generation of projection data, an insufficient number of photons introduces additional noise. The degradation process is represented by Equation (8), where u ∈ R^N (N = N_w × N_h) represents a vectorized 2D CT image, g ∈ R^M (M = M₁ × M₂) stands for the projection data set, and M₁ and M₂ are the numbers of views and detector elements, respectively. The matrix H ∈ R^{M×N} represents the system matrix, and δ ∈ R^M represents the sensing noise and random noise:

$g = Hu + δ.$ (8)

In model-based iterative CT reconstruction, the image u is reconstructed by solving the problem

$\hat{u} = \underset{u}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + λ_{1} {∥ \nabla u ∥}_{0}.$ (9)

Since the CT super-resolution problem is an ill-posed inverse problem, regularization terms need to be introduced to reduce the solution space. The regularization term ∥ ∇ u ∥ ₀ is introduced to suppress noise, preserve edges and obtain relatively clear results. A relatively ideal u is obtained by minimizing the gradient L₀-norm, which also benefits the subsequent super-resolution step. Because of the limited detector size, the image u underlying the projection data is itself a degraded CT image. We assume that u is related to x, the high-resolution CT image to be finally reconstructed, by Equation (10), where ↓_s denotes down-sampling by a factor of s, q denotes the blur kernel, n represents the noise, and ⊗ represents the convolution operation:

$u = (x ↓_{s}) \otimes q + n.$ (10)

For the reconstruction of x, the proposed super-resolution model is

$\underset{u, x}{min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - (x ↓_{s}) \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla u ∥}_{0} + λ_{2} φ (x),$ (11)

where $\frac{1}{2} {∥ Hu - g ∥}_{2}^{2}$ is the projection fidelity term, ${∥ u - (x ↓_{s}) \otimes q ∥}_{2}^{2}$ is the super-resolution reconstruction term, ∥ ∇ u ∥ ₀ and φ (x) are regularization terms with respect to u and x, and σ, λ₁ and λ₂ are coefficients.
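
To make the sparsity term concrete, the following minimal sketch counts the pixels with a non-zero gradient, which is what the gradient L₀-norm ∥ ∇ u ∥ ₀ measures. The forward-difference discretization, the zero gradient at the image border, and the function name `gradient_l0` are assumptions made for illustration; the paper does not specify its discretization.

```python
def gradient_l0(image):
    """Count pixels whose forward-difference gradient is non-zero."""
    rows, cols = len(image), len(image[0])
    count = 0
    for i in range(rows):
        for j in range(cols):
            # Forward differences; the gradient is taken as zero past the border.
            dx = image[i][j + 1] - image[i][j] if j + 1 < cols else 0.0
            dy = image[i + 1][j] - image[i][j] if i + 1 < rows else 0.0
            if dx != 0 or dy != 0:
                count += 1
    return count


flat = [[1.0] * 4 for _ in range(4)]            # constant image
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # one sharp vertical edge
ramp = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]  # smooth intensity ramp

l0_flat = gradient_l0(flat)
l0_step = gradient_l0(step)
l0_ramp = gradient_l0(ramp)
```

The sharp edge is cheaper under this measure than the smooth ramp, which is why the penalty favors piecewise-constant images with preserved edges.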

2.2 Algorithm implementation

For such multi-variable optimization problems, this paper uses the ADMM algorithm [23, 25] to decompose the optimization model into several sub-problems and solve them separately. To facilitate the solution, an auxiliary variable z is added to the optimization model, and Equation (11) is reformulated as a constrained optimization problem:

$\begin{matrix} \underset{u, z, x}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla u ∥}_{0} + λ_{2} φ (x), \\ s . t . \; z = x ↓_{s} . \end{matrix}$ (12)

The augmented Lagrangian function L of Equation (12) is

$L = \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla u ∥}_{0} + λ_{2} φ (x) + 〈 z - x ↓_{s}, r 〉 + \frac{μ}{2} {∥ x ↓_{s} - z ∥}_{2}^{2},$ (13)

where r is the Lagrangian multiplier, and μ is a nonnegative penalty parameter. Using an alternating minimization scheme, Equation (13) can be decomposed into three sub-problems followed by a multiplier update:

$u^{k + 1} = \underset{u}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z^{k} \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla u ∥}_{0},$ (14)

$z^{k + 1} = \underset{z}{arg min} \frac{1}{2 σ^{2}} {∥ u^{k + 1} - z \otimes q ∥}_{2}^{2} + \frac{μ}{2} {∥ x^{k} ↓_{s} - z - r^{k} / μ ∥}_{2}^{2},$ (15)

$x^{k + 1} = \underset{x}{arg min} λ_{2} φ (x) + \frac{μ}{2} {∥ x ↓_{s} - z^{k + 1} - r^{k + 1} / μ ∥}_{2}^{2},$ (16)

$r^{k + 1} = r^{k} + μ (z^{k + 1} - x^{k + 1} ↓_{s}) .$ (17)

2.2.1 Solution for u sub-problem

The u sub-problem can be regarded as an image gradient L₀-norm minimization. Introducing an auxiliary variable v for u, Equation (14) can be transformed into the following constrained optimization problem:

$\begin{matrix} \underset{u, v}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z^{k} \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla v ∥}_{0}, \\ s . t . \; v = u, \end{matrix}$ (18)

and Equation (18) can be further written as

$\underset{u, v, t}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z^{k} \otimes q ∥}_{2}^{2} + λ_{1} {∥ \nabla v ∥}_{0} + \frac{β}{2} {∥ u - v - t / β ∥}_{2}^{2},$ (19)

where t is the Lagrangian multiplier, and β is a nonnegative penalty parameter. Equation (19) can be solved by iterating Equations (20)–(22), where k and j are the iteration indices of the outer and inner loops, respectively:

$u^{j + 1} = \underset{u}{arg min} \frac{1}{2} {∥ Hu - g ∥}_{2}^{2} + \frac{1}{2 σ^{2}} {∥ u - z^{k} \otimes q ∥}_{2}^{2} + \frac{β}{2} {∥ u - v^{j} - t^{j} / β ∥}_{2}^{2},$ (20)

$v^{j + 1} = \underset{v}{arg min} λ_{1} {∥ \nabla v ∥}_{0} + \frac{β}{2} {∥ u^{j + 1} - v - t^{j} / β ∥}_{2}^{2},$ (21)

$t^{j + 1} = t^{j} + β (v^{j + 1} - u^{j + 1}) .$ (22)

Equation (20) can be solved using the separable paraboloidal surrogate method, which gives the following iterative update of u:

$u^{j + 1} = u^{j} - \frac{H^{T} ({Hu}^{j} - g) + \frac{1}{σ^{2}} (u^{j} - z^{k} \otimes q) + β (u^{j} - v^{j} - t^{j} / β)}{H^{T} H + \frac{1}{σ^{2}}} .$ (23)

The v sub-problem in Equation (21) involves the L₀-norm minimization of the image gradient, for which the proposed method adopts the approximate method proposed in [24], a special half-quadratic splitting alternating optimization strategy.
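
Half-quadratic splitting schemes for L₀ gradient minimization typically reduce the auxiliary gradient step to a per-pixel hard threshold: a gradient pair is set to zero when its squared magnitude does not exceed a threshold, and kept unchanged otherwise. The sketch below illustrates that rule only. Whether [24] uses exactly this form is an assumption, and `threshold` is a hypothetical parameter standing in for the ratio of the regularization weight to the splitting penalty.

```python
def l0_gradient_threshold(grad_x, grad_y, threshold):
    """Hard-threshold gradient pairs: zero them if gx^2 + gy^2 <= threshold."""
    kept_x, kept_y = [], []
    for gx, gy in zip(grad_x, grad_y):
        if gx * gx + gy * gy <= threshold:
            # Weak gradients are treated as noise and removed entirely.
            kept_x.append(0.0)
            kept_y.append(0.0)
        else:
            # Strong gradients (edges) pass through unchanged.
            kept_x.append(gx)
            kept_y.append(gy)
    return kept_x, kept_y


hx, hy = l0_gradient_threshold([0.1, 1.0, -0.6], [0.1, 0.0, 0.8], threshold=0.25)
```

Unlike the soft threshold associated with TV regularization, the surviving gradients are not shrunk, which is consistent with the edge-preserving behavior attributed to the L₀ term.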

2.2.2 Solution for z sub-problem

The z sub-problem can be solved efficiently with the Fast Fourier Transform (FFT). Equation (15) has a fast closed-form solution [20, 26]:

$z^{k + 1} = F^{- 1} (\frac{F^{*} (q) F (u^{k + 1}) + μ σ^{2} F (x^{k} ↓_{s})}{F^{*} (q) F (q) + μ σ^{2}}),$ (24)

where F () denotes the FFT operator, F^{-1} () its inverse, F^* () the complex conjugate of F (), and q the blur kernel. In this method, a Gaussian blur kernel with a size of 3×3 and a variance of 0.25 is selected. The blur kernel q appears only in the closed-form solution of Equation (24), so the z sub-problem is the step that handles the blur distortion. Its purpose is to further refine the current estimate and facilitate the subsequent super-resolution step.
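
The closed-form update of Equation (24) can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a naive O(N²) DFT to stay dependency-free (an FFT library would be the practical choice), assumes circular boundary conditions, and places the kernel origin at index 0. The names `dft` and `z_update` are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (forward, or inverse with 1/n scaling)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def z_update(u, x_down, kernel, mu, sigma):
    """1-D analogue of Equation (24) with circular boundary conditions."""
    n = len(u)
    # Zero-pad the kernel so that all spectra share the same frequency grid.
    q = list(kernel) + [0.0] * (n - len(kernel))
    Fq, Fu, Fx = dft(q), dft(u), dft(x_down)
    w = mu * sigma ** 2
    Fz = [(Fq[k].conjugate() * Fu[k] + w * Fx[k]) / (abs(Fq[k]) ** 2 + w)
          for k in range(n)]
    return [value.real for value in dft(Fz, inverse=True)]


# With an identity (delta) kernel the update is a weighted average of u and x_down.
u_demo = [1.0, 2.0, 3.0, 4.0]
z_demo = z_update(u_demo, [0.0, 0.0, 0.0, 0.0], [1.0], mu=1.0, sigma=1.0)
```

The term μσ² in the denominator keeps the division well conditioned at frequencies where the kernel spectrum is close to zero.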

2.2.3 Solution for x sub-problem

To facilitate its solution, the x sub-problem is written as

$\underset{x}{arg min} \frac{μ}{2} {∥ x ↓_{s} - z^{k + 1} - r^{k + 1} / μ ∥}_{2}^{2} + λ_{2} φ (x) .$ (25)

Let $τ = \sqrt{λ_{2} / μ}$; then Equation (25) can be rewritten as

${prox}_{φ} (x) = \underset{x}{arg min} φ (x) + \frac{1}{2 τ^{2}} {∥ x ↓_{s} - z^{k + 1} - r^{k + 1} / μ ∥}_{2}^{2} .$ (26)

Using the plug-and-play framework, the proximal operator prox_φ of the regularization term is replaced by a deep learning-based super-resolver, which maps the low-resolution image to the high-resolution image. In this case, Equations (25) and (26) can be further rewritten as

$x^{k + 1} = G (z^{k + 1} + r^{k + 1} / μ, τ),$ (27)

where G represents the mapping: its input is a relatively clear low-resolution CT image after de-blurring, and its output is the corresponding high-resolution CT image after further processing. G is thus the plug-and-play network in the algorithm, and s is the scaling factor, which is set to 2. As deep learning has developed, more and more neural networks have been introduced into iterative algorithms as prior models, and here the x sub-problem is solved by the mapping of the super-resolution network. The super-resolution network is trained in the presence of a certain amount of noise to achieve further quality recovery of low-resolution CT images.

2.2.4 Super-resolution with deep image prior

After solving the first two sub-problems, we obtain the image z after initial de-blurring. Bicubic interpolation is used to enlarge the image by a factor of two, and the result is taken as the input of the super-resolution network. In the iterative reconstruction process, the super-resolution network maps the input CT image to a clearer HRCT image. After a certain number of alternating iterations, a super-resolution CT image with a low noise level and high resolution is finally obtained.

The network structure used in this method is an improved VDSR network [27], referred to as VDSR+. The basic block of the network is a convolutional layer followed by a nonlinear layer, and the cascaded basic blocks form 30 layers. The network takes the interpolated low-resolution image as input and adds it to the residual image learned by the network to obtain the final output. Each convolutional layer contains 64 filters with a kernel size of 3×3 and a padding of 1. The nonlinear layer uses the leaky rectified linear unit (LReLU). Compared with VDSR, the depth of the improved network is increased to 30 layers, which gives a larger receptive field. The VDSR+ structure is shown in Fig. 1. The loss function also differs from that of the original VDSR network: instead of the L₂-norm, the more robust L₁-norm is used, because it is less sensitive to outliers in the data. The network is trained with the Adam optimizer, and the initial learning rate is 0.0001. To demonstrate that the improved network further improves the super-resolution results, in the experimental section the algorithm that uses the unimproved VDSR network to solve the x sub-problem is named DIP-CTSR, and the algorithm that uses the improved VDSR+ network is named DIP-CTSR+.
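
Two of the design choices above can be checked with simple arithmetic. The sketch below assumes that every layer is a stride-1 convolution with a 3×3 kernel, so each layer widens the receptive field by two pixels, and it compares how the mean L₁ and L₂ losses react to a single outlier. The function names and the toy residuals are illustrative assumptions, not values from the paper.

```python
def receptive_field(num_layers, kernel_size=3):
    """Receptive field (in pixels per side) of stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def l1_loss(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


rf_20_layers = receptive_field(20)  # depth of the original VDSR
rf_30_layers = receptive_field(30)  # depth described for the improved network

target = [0.0, 0.0, 0.0, 0.0]
clean = [0.1, 0.1, 0.1, 0.1]        # small errors everywhere
outlier = [0.1, 0.1, 0.1, 2.0]      # one grossly wrong pixel

# Factor by which a single outlier inflates each loss.
l1_inflation = l1_loss(outlier, target) / l1_loss(clean, target)
l2_inflation = l2_loss(outlier, target) / l2_loss(clean, target)
```

Under these assumptions the receptive field grows from 41×41 to 61×61 pixels, and the single outlier inflates the L₂ loss by roughly two orders of magnitude while the L₁ loss grows by less than a factor of six.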

Fig. 1

Improved Residual Block and improved network structure.

The network training data set was collected from Henan Provincial People’s Hospital and includes 1200 head CT images of 7 patients and 750 abdominal CT images of 5 patients, with an image size of 512×512. To construct the training data set, we obtained low-resolution projections through simulation and reconstructed 256×256 low-resolution CT images. In the supervised training mode, the high-resolution and low-resolution CT images were used together for training. The validation data set included 200 head CT images from the other two patients and 110 abdominal CT images from the third patient, for a total of 310 images. Low-resolution projection data were generated using 100 abdominal slice images from one patient in the test data set.

The process of the whole algorithm is presented in Algorithm 1 as follows:

Algorithm 1: DIP-CTSR
Input: g, σ, β, λ₁, λ₂, μ, α, s, q, J, K;
Initialize: u⁰, v⁰, t⁰, z⁰, x⁰;
For k from 0 to K
  (1) Update variable u
  For j from 0 to J
    update u^{j+1} via Equation (23);
  End
  Set u^{k+1} = u^{j+1}
  (2) Update variable z
  update z^{k+1} via Equation (24);
  (3) Update variable x and Lagrangian multiplier r
  update x^{k+1} via Equation (27);
  update r^{k+1} via Equation (17);
End
Return final x
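
The control flow of Algorithm 1 can be sketched as a plain loop. This is only a skeleton under stated assumptions: the sub-problem solvers are passed in as hypothetical callables (`update_u`, `update_z`, `super_resolve`, `downsample`), the loops are assumed to run K and J times, the x step uses the most recent multiplier available in the listed update order, and the toy run uses scalars instead of images purely to exercise the structure.

```python
def dip_ctsr(g, state, update_u, update_z, super_resolve, downsample, mu, K, J):
    """Control flow of Algorithm 1; every callable is a user-supplied stand-in."""
    u, v, t, z, x, r = state
    for k in range(K):                    # outer ADMM iterations
        for j in range(J):                # inner iterations of the u sub-problem
            u, v, t = update_u(u, v, t, z, g)
        z = update_z(u, x, r)             # de-blurring step, Equation (24)
        x = super_resolve(z, r)           # plug-and-play step, Equation (27)
        r = r + mu * (z - downsample(x))  # multiplier update, Equation (17)
    return x


# Toy run with scalar stand-ins, only to exercise the loop structure.
counts = {"u": 0, "z": 0, "x": 0}


def toy_update_u(u, v, t, z, g):
    counts["u"] += 1
    return 0.5 * (u + g), v, t            # drift towards the "data" g


def toy_update_z(u, x, r):
    counts["z"] += 1
    return u


def toy_super_resolve(z, r):
    counts["x"] += 1
    return z


result = dip_ctsr(g=1.0, state=(0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
                  update_u=toy_update_u, update_z=toy_update_z,
                  super_resolve=toy_super_resolve, downsample=lambda x: x,
                  mu=7e-4, K=10, J=50)
```

With K = 10 and J = 50 as in Table 1, the u update runs 500 times in total while the network is evaluated only 10 times, so the cost of the learned prior stays small relative to the projection-domain updates.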

In the simulation experiments, the initial image of all methods is set to zero, while in the real data experiments, the initial image is reconstructed using filtered back-projection (FBP). All iterative methods stop when convergence is reached, and the ordered-subset SART (OS-SART) strategy is adopted to accelerate convergence (with 10 subsets) [28]. Table 1 lists the parameter values for the proposed method. These values were chosen through extensive experimentation.

Table 1

The parameter values for the proposed method

Parameter	σ	β	λ₁	λ₂	μ	α	s	J	K
Value	0.5	3×10⁻⁶	2	2	7×10⁻⁴	0.25	2	50	10

2.3 Image quality assessment

Objective evaluation of CT images in super-resolution work covers the complexity of the reconstruction algorithm as well as the root mean square error (RMSE), peak signal-to-noise ratio (PSNR) [29–31] and structural similarity (SSIM), which reflect the degree of deviation between images. The RMSE measures the deviation between the reconstructed image and the original image, while the PSNR is computed from the mean square error (MSE = RMSE²) normalized by the maximum pixel value, so it decreases linearly with the logarithm of the RMSE between the original HR image and the generated image. PSNR is commonly used to judge whether a processing result is satisfactory. However, neither RMSE nor PSNR takes human visual perception into account, so the SSIM criterion [32–34], which evaluates the similarity between images in terms of the structural similarity of visual features, is a useful supplement to PSNR. SSIM is based on three relatively independent comparisons of brightness, contrast and structure, and is expressed as their weighted product. Assuming that both the HR image and the reconstructed image have N pixels, the three evaluation indicators are calculated as follows:

$RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(I (i) - \hat{I} (i))}^{2}},$ (28)

$PSNR = 10 {log}_{10} (\frac{L^{2}}{MSE}),$ (29)

$SSIM (I, \hat{I}) = {[l (I, \hat{I})]}^{α} {[c (I, \hat{I})]}^{β} {[s (I, \hat{I})]}^{γ} .$ (30)

In Equation (28), I (i) represents the reference image, $\hat{I} (i)$ represents the image to be evaluated, and N represents the total number of pixels in the image. In Equation (29), L is the maximum pixel value. In Equation (30), $l (I, \hat{I})$ denotes the brightness comparison, $c (I, \hat{I})$ denotes the contrast comparison, $s (I, \hat{I})$ denotes the structure comparison, and α, β and γ are the weights of the brightness, contrast and structure comparison functions, respectively. The lower the RMSE value, the higher the PSNR value, and the closer the SSIM value is to 1, the better the image quality.
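
The RMSE and PSNR definitions can be written down directly; the sketch below does so for flattened pixel lists. The default peak value L = 1 is an assumption: it is not stated in the text, although the RMSE and PSNR pairs reported in Table 2 appear consistent with it up to rounding. The helper `psnr_from_rmse` is a hypothetical convenience that uses MSE = RMSE².

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equally sized pixel lists."""
    n = len(reference)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(reference, estimate)) / n)


def psnr(reference, estimate, peak=1.0):
    """PSNR in dB, with MSE taken as RMSE squared and `peak` as the maximum pixel value."""
    mse = rmse(reference, estimate) ** 2
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_from_rmse(rmse_value, peak=1.0):
    """Equivalent form: 10*log10(L^2 / RMSE^2) = 20*log10(L / RMSE)."""
    return 20.0 * math.log10(peak / rmse_value)


reference = [0.0, 0.5, 1.0, 0.5]
estimate = [0.1, 0.4, 0.9, 0.6]     # every pixel is off by 0.1

demo_rmse = rmse(reference, estimate)
demo_psnr = psnr(reference, estimate)

# Cross-check against two RMSE values from Table 2 (FBP and DIP-CTSR+).
psnr_fbp = psnr_from_rmse(0.6246)
psnr_proposed = psnr_from_rmse(0.1211)
```

Because the PSNR depends on the RMSE only through its logarithm, halving the RMSE always adds about 6 dB regardless of the starting point.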

In addition to the three common evaluation parameters above, sharpness, which refers to the clarity of the boundary between two regions with different tones or colors, is also an important factor in image quality. It can be illustrated by imaging bar patterns of progressively increasing spatial frequency. The modulation transfer function (MTF) [35, 36] is a term from optical engineering that can also be used to evaluate image sharpness. The vertical coordinate of the MTF curve is the value of the transfer function, and the horizontal coordinate is the spatial frequency. In our work, the slanted-edge method is used to calculate the MTF curve. MTF50 (written MTF50% in some literature) is the spatial frequency at which the MTF curve falls to 50% of its peak value. To calculate the MTF, the average of more than ten adjacent edge profiles at an edge of the reconstructed CT image is taken first. The gradient of each averaged edge profile then gives the line spread function (LSF). After zero-padding, the Fast Fourier Transform (FFT) of the LSF gives the MTF value at each frequency. To some extent, the higher the MTF50 value, the higher the resolution.
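
The chain from edge profile to MTF50 described above can be sketched as follows. This is a simplified illustration, not the slanted-edge procedure used in the paper: it takes an already averaged 1-D edge profile, uses a naive DFT instead of an FFT to stay dependency-free, and reads MTF50 off by linear interpolation. The synthetic edge (a step blurred by a Gaussian with a standard deviation of 2 pixels) and all function names are assumptions.

```python
import cmath
import math


def mtf_from_edge(edge_profile, padded_length=256):
    """Edge profile -> LSF (finite difference) -> zero-pad -> normalized |DFT|."""
    lsf = [b - a for a, b in zip(edge_profile, edge_profile[1:])]
    lsf += [0.0] * (padded_length - len(lsf))        # zero-padding
    half = padded_length // 2
    magnitudes = []
    for k in range(half + 1):
        acc = sum(lsf[t] * cmath.exp(-2j * cmath.pi * k * t / padded_length)
                  for t in range(padded_length))
        magnitudes.append(abs(acc))
    freqs = [k / padded_length for k in range(half + 1)]  # cycles per pixel
    mtf = [m / magnitudes[0] for m in magnitudes]         # MTF(0) = 1
    return freqs, mtf


def mtf50(freqs, mtf):
    """First frequency at which the MTF crosses 0.5 (linear interpolation)."""
    for i in range(1, len(mtf)):
        if mtf[i] <= 0.5 < mtf[i - 1]:
            frac = (mtf[i - 1] - 0.5) / (mtf[i - 1] - mtf[i])
            return freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
    return None


# Synthetic edge: ideal step at pixel 32 blurred by a Gaussian (sigma = 2 pixels).
edge = [0.5 * (1.0 + math.erf((i - 32) / (2.0 * math.sqrt(2.0)))) for i in range(64)]
freqs, mtf = mtf_from_edge(edge)
f50 = mtf50(freqs, mtf)
```

For a Gaussian LSF with standard deviation s, the analytic MTF50 is sqrt(ln 2 / (2π²)) / s, about 0.094 cycles per pixel for s = 2, which the sketch reproduces to within the frequency resolution of the padded transform.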

3 Experiment and result

3.1 Line pair phantom simulation

To verify the performance of the proposed algorithm, the FBP reconstruction algorithm, the SART-TV reconstruction algorithm [37, 38], the block matching and TV joint regularization reconstruction algorithm (BMTV) [39], and the deep plug-and-play image super-resolution (DPSR) algorithm [20] were selected for comparison. The experiments were carried out on an AMAX workstation with two GeForce RTX2080Ti (NVIDIA Corporation) GPUs. This section presents the experimental results of the proposed CT super-resolution reconstruction method on simulated data. For the qualitative comparison, regions of interest (ROI) are marked with red dashed boxes in the reconstruction results to visually evaluate the ability of the different SR methods to recover image details. RMSE, PSNR and SSIM were selected as quantitative indexes to evaluate the reconstructed images. In addition, the spatial resolution of the different reconstruction results is shown by MTF curves. For a fair comparison, the parameters of the plug-and-play comparison algorithm were tuned on the simulated data and its optimal parameters were used.

The first simulated data experiment used a digital phantom whose reference image is shown in Fig. 2. This digital phantom is used to generate sinograms. To simulate low-resolution CT projections, a linear detector with 256 elements was simulated. The scan covers the full angular range at intervals of 1°, which yields a sinogram of size 256×360. Furthermore, to illustrate the robustness and practicality of our method, Gaussian noise is added: the initial intensity of incident photons is set to 1×10⁵, and the noise variance is set to 25.
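
The text does not spell out how the noise enters the projections, so the following is only one plausible reading: line integrals are converted to detected intensities with the stated incident intensity of 1×10⁵, zero-mean Gaussian noise with variance 25 is added in the intensity domain, and the result is converted back by a logarithm. The function name, the fixed seed, and the clamping to a minimum intensity of 1 are assumptions made for this sketch.

```python
import math
import random


def add_detector_noise(line_integrals, i0=1e5, variance=25.0, seed=0):
    """Add Gaussian noise in the intensity domain and return noisy line integrals."""
    rng = random.Random(seed)           # fixed seed for a reproducible sketch
    std = math.sqrt(variance)
    noisy = []
    for p in line_integrals:
        intensity = i0 * math.exp(-p) + rng.gauss(0.0, std)
        intensity = max(intensity, 1.0)  # guard against log of non-positive values
        noisy.append(-math.log(intensity / i0))
    return noisy


clean_projection = [0.0, 0.5, 1.0, 2.0]
noisy_projection = add_detector_noise(clean_projection)
```

Under this reading the perturbation of each line integral grows with attenuation, because the same intensity noise is divided by a smaller detected signal.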

Fig. 2

Reference image of the phantom 1 used for the simulation data.

The experimental results are shown in Fig. 3. The first row shows the reconstruction results, and the second row shows the details of the selected ROI. From left to right, column (a) is the ground-truth image, and columns (b) to (g) are the results of the different methods: the FBP reconstruction algorithm, the SART-TV reconstruction algorithm, the joint regularization reconstruction algorithm using block matching and TV (BMTV), the deep plug-and-play image super-resolution algorithm (DPSR), the CT super-resolution reconstruction with deep image priors using the DIP-CTSR method, and the proposed CT super-resolution reconstruction using the DIP-CTSR+ method. As can be seen from Fig. 3, the FBP reconstruction results have obvious ring artifacts. Although the reconstruction results of SART-TV and BMTV have no artifacts, they contain obvious noise-like points, and the image details are still not clear enough. The DPSR reconstruction results are visually clearer than the SART-TV and BMTV results, but some details are distorted. Although the proposed DIP-CTSR and DIP-CTSR+ fail to recover all line pairs, their detail recovery is significantly better than that of the other algorithms. Table 2 shows that the CT image generated by DIP-CTSR+ further reduces the image noise and improves the SSIM to a certain extent. Compared with the DPSR super-resolution algorithm, the PSNR of the proposed method is 43% higher and the SSIM is 7.6% higher. The MTF curves of each reconstruction result are shown in Fig. 4. They show that the MTF50 values of DIP-CTSR and DIP-CTSR+ are higher than those of the other comparison algorithms.

Fig. 3

CT image reconstruction results and related details of digital phantom 1. (a) Ground truth. (b) FBP. (c) SART-TV. (d) BMTV. (e) Deep plug-and-play image super-resolution (DPSR). (f) Image reconstructed using DIP-CTSR. (g) Image reconstructed using DIP-CTSR+. The red dashed rectangle represents the selected noise evaluation ROI. Green arrows point to adjacent boundaries for MTF assessment. The grayscale range of all images in the figure is [0, 0.12].

Table 2

Quantitative comparison of reconstructed image results for digital phantom 1

	FBP	SART-TV	BMTV	DPSR	DIP-CTSR	DIP-CTSR+
RMSE	0.6246	0.2647	0.2562	0.2291	0.1283	0.1211
PSNR	4.0872	11.5458	11.8299	12.7972	17.8318	18.3433
SSIM	0.4566	0.8691	0.8551	0.8481	0.9033	0.9134

Fig. 4

Measured MTF curves of reconstructed CT images for phantom 1.

3.2 Abdominal phantom results

To verify the effectiveness of the method on CT images, two digital chest CT images were used as test phantoms in the second simulation experiment; their reference images are shown in Fig. 5, marked as (a-1) and (a-2), respectively. As before, sinograms were generated by simulated scanning of the digital CT image slices, and the projected sinograms were down-sampled by a factor of two and corrupted with noise. The simulated scan parameters, the noise level and the reconstruction parameters are the same as for the first digital phantom.

Fig. 5

Reference image of the phantom 2 used for the simulation data.

The experimental results are shown in Figs. 6 and 7, which give the reconstruction results of slice 1 and slice 2, respectively. It can be seen from Figs. 6 and 7 that the reconstruction results of FBP, SART-TV and BMTV have obvious noise, and the image details are not clear enough. The DPSR reconstruction results are visually clearer than the SART-TV and BMTV results, but some details are distorted. The super-resolution results of the proposed DIP-CTSR+ effectively suppress artifacts and restore image details more clearly, with a higher signal-to-noise ratio. To quantitatively evaluate the reconstruction accuracy of the DIP-CTSR algorithm, three image evaluation indicators, PSNR, RMSE and SSIM, are used. The quantitative results are shown in Tables 3 and 4. Overall, the RMSE of the super-resolution image obtained by the proposed method is reduced by more than 15% compared with the BMTV algorithm and by more than 17% compared with the DPSR super-resolution algorithm. The SSIM of the super-resolution image obtained by DIP-CTSR+ is also higher than that of the other comparison algorithms. Combining the qualitative and quantitative results, the DIP-CTSR+ super-resolution method proposed in this paper shows significant advantages in the simulation experiments and can retain detailed information while suppressing noise.

Fig. 6

CT image reconstruction results and related details of slice 1 for digital phantom 2. (a) Ground truth. (b) FBP. (c) SART-TV. (d) BMTV. (e) Deep plug-and-play image super-resolution (DPSR). (f) Image reconstructed using DIP-CTSR. (g) Image reconstructed using DIP-CTSR+. The red dashed rectangle represents the selected ROI. Green arrows point to adjacent boundaries for MTF assessment. The grayscale range of all images in the figure is [0, 0.06].

Fig. 7

CT image reconstruction results and related details of slice 2 for digital phantom 2. The rows and columns show the same content as in Fig. 6.

Table 3

Quantitative comparison of simulated data experiment results for slice 1

Slice 1	FBP	SART-TV	BMTV	DPSR	DIP-CTSR	DIP-CTSR+
RMSE	0.1050	0.0650	0.0630	0.0617	0.0566	0.0508
PSNR	19.5706	23.7395	24.0086	24.1840	24.9448	25.8809
SSIM	0.8622	0.9637	0.9623	0.9595	0.9655	0.9755

Table 4

Quantitative comparison of simulated data experiment results for slice 2

Slice 2	FBP	SART-TV	BMTV	DPSR	DIP-CTSR	DIP-CTSR+
RMSE	0.0961	0.0532	0.0518	0.0466	0.0361	0.0352
PSNR	20.3458	25.4710	25.6828	26.6331	28.8597	29.0548
SSIM	0.8916	0.9755	0.9778	0.9764	0.9841	0.9845
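
The relative RMSE reductions quoted in the text can be recomputed from Tables 3 and 4. The short sketch below does this for DIP-CTSR+ against BMTV and DPSR; the table values are copied from above, and the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# RMSE values taken from Table 3 (slice 1) and Table 4 (slice 2).
rmse_tables = {
    "slice 1": {"BMTV": 0.0630, "DPSR": 0.0617, "DIP-CTSR+": 0.0508},
    "slice 2": {"BMTV": 0.0518, "DPSR": 0.0466, "DIP-CTSR+": 0.0352},
}

reductions = {
    name: {method: relative_reduction(values[method], values["DIP-CTSR+"])
           for method in ("BMTV", "DPSR")}
    for name, values in rmse_tables.items()
}
```

The smallest reductions occur on slice 1 (about 19.4% against BMTV and 17.7% against DPSR), which matches the stated lower bounds of 15% and 17%; on slice 2 the reductions are about 32.0% and 24.5%.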

To investigate the spatial resolution of the super-resolution results, we plotted the MTF curves. The MTF curves of the reconstruction results for the abdominal simulation data are shown in Fig. 8. The adjacent boundaries used for the MTF evaluation are indicated by green arrows in the reconstruction results above. Measured at the frequency where the MTF amplitude falls to 50%, the spatial resolution of the CT images reconstructed by DIP-CTSR+ is improved by factors of 0.21 and 0.02 for the two slices compared with the BMTV method, and by factors of 0.19 and 0.11 compared with the DPSR method. The MTF results therefore agree with the qualitative and quantitative comparisons above: in the simulation experiments, the proposed DIP-CTSR+ method retains details while suppressing noise and generates higher-quality CT images.

Fig. 8

The illustration of (a) MTF curves of the reconstructed results of slice 1 for the simulated data, (b) MTF curves of the reconstructed results of slice 2 for the simulated data.

3.3 Real data experiments

To verify the SR performance of the proposed network model in an actual CT system, an actual scanning experiment was performed with the Chengdu Dosimetric Phantom (CDP) as the imaging object. The structure of the CDP model is shown in Fig. 9, and its specifications are described in Report No.48 of the International Commission on Radiation Units and Measurements [40]. The phantom is very similar to the human head, and the equivalent error of the attenuation coefficient between its synthetic materials and human tissues and organs does not exceed 5%.

Fig. 9

Real data experimental head phantom.

In the CT imaging system used to scan the CDP phantom, the X-ray source is a Thales Hawkeye-130 and the flat panel detector is a Thales 4343F. For the CDP phantom scan, the tube voltage is 120 kVp, the tube current is 200 μA, 360 projection frames are acquired over a scanning angle of 360°, the distance from the X-ray source to the rotation axis is 529.16 mm, the distance from the X-ray source to the detector is 998.75 mm, and the reconstructed image size is 512×512.

In the real data experiment, the evaluation results are again presented for two representative test slices from the test data set, denoted as Slice 1 and Slice 2. To display the image information more clearly, the reconstructed images of the two slices are shown in an enlarged 360×360 field of view, as shown in (a-1) and (a-2) in Fig. 10. The grayscale range is [0, 0.1]. The images in Fig. 10 are reconstructed from the HR projection sinogram obtained in the 1×1 acquisition mode and are used as the reference reconstruction images.

Fig. 10

Reference reconstruction images of two representative test slices of CDP.

This section presents the results of the proposed CT super-resolution reconstruction method on real data for two structurally representative slices, shown in Figs. 11 and 12. The FBP reconstruction is visibly noisy, and the skull structure in the image is overwhelmed by the noise. Although the SART-TV and BMTV algorithms suppress the image noise to a certain extent, the edges are still not clear enough. The reconstructions of the DPSR and DIP-CTSR algorithms are noticeably clearer, but the DPSR results show excessive edge enhancement, which may compromise the authenticity of the CT image, and contain some flocculent artifacts in the background. The reconstruction results of DIP-CTSR+ using the improved network recover details better, and the algorithm generates high-quality images that are closer to human visual perception. From the perspective of the MTF curves, the error of the image reconstructed by the HL-BMTV algorithm is also the smallest. Therefore, the visual evaluation of the real data experiments again verifies the effectiveness of the proposed algorithm in practical applications.

Fig. 11

CT image reconstruction results and related details of slice 1. (a) Reference image, (b) FBP, (c) SART-TV, (d) BMTV, (e) DPSR, (f) DIP-CTSR, (g) DIP-CTSR+. The red dashed rectangle represents the selected ROI. Green arrows point to adjacent boundaries for MTF assessment. The grayscale range is [0, 0.1].

Fig. 12

CT image reconstruction results and related details of slice 2. The panels are arranged in the same rows and columns as in Fig. 11.

The quantitative results for the real data are given in Tables 5 and 6. DIP-CTSR+ achieves the highest SSIM on both slices and the lowest RMSE and highest PSNR on slice 1; on slice 2, SART-TV attains a slightly lower RMSE (0.0343 versus 0.0354) and a slightly higher PSNR (29.2741 versus 29.0173). Relative to the DPSR algorithm, the PSNR of DIP-CTSR+ on slice 1 is improved by 5.1%. As can be seen from Fig. 13, the proposed method increases the MTF50 value of the reconstruction result, that is, the frequency at which the MTF amplitude falls to 50%, compared with the other algorithms. Compared with the DPSR method, the MTF total factor is increased by 0.21; compared with the BMTV algorithm, it is increased by 0.32. Combining the qualitative and quantitative results with the MTF curves, the proposed DIP-CTSR+ super-resolution method shows certain advantages in the CT super-resolution experiment on real projection data: it preserves detail while suppressing noise and improving image resolution.

Table 5

Quantitative comparison of real data experiment results for slice 1

Slice 1	FBP	SART-TV	BMTV	DPSR	DIP-CTSR	DIP-CTSR+
RMSE	0.0750	0.0411	0.0425	0.0357	0.03158	0.0302
PSNR	22.4988	27.7144	27.4181	28.9253	30.0112	30.3930
SSIM	0.9322	0.9598	0.9517	0.9618	0.9761	0.9773

Table 6

Quantitative comparison of real data experiment results for slice 2

Slice 2	FBP	SART-TV	BMTV	DPSR	DIP-CTSR	DIP-CTSR+
RMSE	0.0712	0.0343	0.0358	0.0441	0.0362	0.0354
PSNR	22.9504	29.2741	28.9057	27.0963	28.8103	29.0173
SSIM	0.9345	0.9695	0.9618	0.9105	0.9757	0.9768
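The RMSE and PSNR rows of Tables 5 and 6 are closely linked, and the relative PSNR gain can be recomputed from the tabulated values. The sketch below illustrates the standard definitions. The peak value of 1.0 is an assumption: the paper does not state it, although it reproduces the FBP entry of Table 5. The function names are illustrative.

```python
import numpy as np

def rmse(reference, image):
    """Root-mean-square error between a reference and a reconstruction."""
    reference = np.asarray(reference, dtype=float)
    image = np.asarray(image, dtype=float)
    return float(np.sqrt(np.mean((reference - image) ** 2)))

def psnr_from_rmse(rmse_value, peak=1.0):
    """PSNR in dB; `peak` = 1.0 is an assumed maximum intensity."""
    return float(20.0 * np.log10(peak / rmse_value))

def relative_gain_percent(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

# Values copied from Table 5 (slice 1).
psnr_fbp = psnr_from_rmse(0.0750)                  # FBP RMSE from Table 5
gain_vs_dpsr = relative_gain_percent(30.3930, 28.9253)  # DIP-CTSR+ vs DPSR PSNR

print(round(psnr_fbp, 4), round(gain_vs_dpsr, 2))
```

Under the assumed peak of 1.0, the FBP RMSE of 0.0750 gives a PSNR of about 22.4988 dB, matching the table. The PSNR gain of DIP-CTSR+ over DPSR on slice 1 comes out at about 5.07%, that is, roughly 5.1%.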

Fig. 13

The illustration of (a) MTF curves of the reconstructed results of slice 1 for the real data, (b) MTF curves of the reconstructed results of slice 2 for the real data.

4 Discussion

In this paper, the advantages of DIP-CTSR in CT super-resolution reconstruction are verified by simulation and real data experiments. Compared with model-based super-resolution methods without deep learning priors, namely BMTV and SART-TV, the proposed method reconstructs details that are more realistic and clearer. The reason is that the prior terms of TV regularization and L₀ regularization rely on explicit mathematical expressions that capture only some features of the image, whereas some deep image features cannot be expressed mathematically. Deep learning priors can learn such deep features of images from training data sets.
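To make concrete what a hand-crafted prior measures, the sketch below evaluates an anisotropic TV term and a gradient L₀ term on two toy images. It is a simplified illustration, not the regularizers exactly as used in BMTV, SART-TV or the proposed model: the function names, the tolerance `tol`, and the separate counting of horizontal and vertical differences are assumptions.

```python
import numpy as np

def image_gradients(img):
    """Forward differences along columns (x) and rows (y)."""
    img = np.asarray(img, dtype=float)
    return np.diff(img, axis=1), np.diff(img, axis=0)

def anisotropic_tv(img):
    """Sum of absolute horizontal and vertical differences."""
    gx, gy = image_gradients(img)
    return float(np.abs(gx).sum() + np.abs(gy).sum())

def gradient_l0(img, tol=1e-8):
    """Number of non-zero differences (a simplified gradient L0 count)."""
    gx, gy = image_gradients(img)
    return int((np.abs(gx) > tol).sum() + (np.abs(gy) > tol).sum())

# Two 8x8 toy images that both rise from 0 to 1 from left to right.
step = np.zeros((8, 8))
step[:, 4:] = 1.0                            # one sharp edge per row
ramp = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))  # gradual transition

tv_step, tv_ramp = anisotropic_tv(step), anisotropic_tv(ramp)
l0_step, l0_ramp = gradient_l0(step), gradient_l0(ramp)
print(tv_step, tv_ramp, l0_step, l0_ramp)
```

Both images have the same TV value, because the total rise per row is identical, but the gradient L₀ count is much smaller for the sharp step. Each term therefore encodes a single, explicitly formulated property (total variation or edge sparsity), which is the limitation that a learned prior is meant to complement.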

Compared with the reconstruction results of the DPSR method, the texture details of the CT images reconstructed by DIP-CTSR+ are more realistic, as shown by the experimental results on both simulated and real data. The spatial resolution of the reconstructed images is also improved to some extent: the reconstructed details and the MTF curves verify that the method can improve the resolution of CT images obtained from real CT data. The experimental results show that the proposed method is flexible, effective and widely applicable for the super-resolution of CT images.

5 Conclusion

This paper proposes a method that combines deep learning with iterative model optimization. The method builds on the traditional sparse L₀ regularization term, incorporates the plug-and-play framework, and introduces deep learning priors to further exploit the prior information and deep features of CT images. Simulation and real data experiments show that the texture details of the CT images reconstructed by DIP-CTSR+ are more realistic, which verifies the advantages of combining a deep learning prior with a super-resolution optimization reconstruction model. The method is flexible, effective and widely applicable for LRCT image super-resolution.

Acknowledgments

This work was supported by the National Key Research and Development Project of China (Grant No. 2020YFC1522002). This work was also supported by the National Natural Science Foundation of China (Grant No. 62101596) and the China Postdoctoral Science Foundation (Grant No. 2019M663996).

References

1. Wang, Rahman S.S. and Arns C.H., Super resolution reconstruction of μ-CT image of rock sample using neighbour embedding algorithm, Physica A: Statistical Mechanics and its Applications 493 (2018), 177–188.

2. Zang, Aly, Idoughi, et al., Super-resolution and sparse view CT reconstruction, in Proceedings of the European Conference on Computer Vision (ECCV), (2018), 137–153.

3. Jin, Xi, Wang, et al., Nanoparticles for super-resolution microscopy and single-molecule tracking, Nature Methods 15 (2018), 415–423.

4. Nieuwenhuizen R.P., Lidke K.A., Bates, et al., Measuring image resolution in optical nanoscopy, Nature Methods 10 (2013), 557–562.

5. Leng, Xu and Zhang, Medical image interpolation based on multi-resolution registration, Computers & Mathematics with Applications 66 (2013), 1–18.

6. Isaac J.S. and Kulkarni, Super resolution techniques for medical image processing, in 2015 International Conference on Technologies for Sustainable Development (ICTSD), (2015), 1–6.

7. Umehara, Ota and Ishida, Application of super-resolution convolutional neural network for enhancing image resolution in chest CT, Journal of Digital Imaging 31 (2018), 441–450.

8. Yan, Li, Lu, et al., Super resolution in CT, International Journal of Imaging Systems and Technology 25 (2015), 92–101.

9. Wang, Chen, Wu, et al., Enhanced generative adversarial network for 3D brain MRI super-resolution, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, (2020), 3627–3636.

10. Park S.C., Park M.K. and Kang M.G., Super-resolution image reconstruction: a technical overview, IEEE Signal Processing Magazine 20 (2003), 21–36.

11. Venkatakrishnan S.V., Bouman C.A. and Wohlberg, Plug-and-play priors for model based reconstruction, in 2013 IEEE Global Conference on Signal and Information Processing, (2013), 945–948.

12. Cascarano, Piccolomini E.L., Morotti and Sebastiani, Plug-and-Play gradient-based denoisers applied to CT image enhancement, Applied Mathematics and Computation 422 (2022), 126967.

13. Feng, Zhang, Su and Xu, Optical Remote sensing image denoising and super-resolution reconstructing using optimized generative network in wavelet transform domain, Remote Sensing 13 (2021), 1858.

14. , Sixou and Peyrin, A review of the deep learning methods for medical images super resolution problems, IRBM 42 (2021), 120–133.

15. Kaji and Kida, Overview of image-to-image translation by use of deep neural networks: denoising, super-resolution, modality conversion, and reconstruction in medical imaging, Radiological Physics and Technology 12 (2019), 235–248.

16. Ren, El-Khamy and Lee, CT-SRCNN: cascade trained and trimmed deep convolutional neural networks for image super resolution, in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), (2018), 1423–1431.

17. Park, Hwang, Kim K.Y., et al., Computed tomography super-resolution using deep convolutional neural network, Physics in Medicine & Biology 63 (2018), 145011.

18. You, Li, Zhang, et al., CT super-resolution GAN constrained by the identical, residual, and cycle learning ensemble (GAN-CIRCLE), IEEE Transactions on Medical Imaging 39 (2019), 188–203.

19. D.H., Srivastava, Thibault J.B., et al., Deep residual learning for model-based iterative CT reconstruction using plug-and-play framework, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2018), 6668–6672.

20. Zhang, Zuo and Zhang, Deep plug-and-play super-resolution for arbitrary blur kernels, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2019), 1671–1681.

21. Van Veen, Jalal, Soltanolkotabi, et al., Compressed sensing with deep image prior and learned regularization, arXiv preprint, (2018), arXiv:1806.06438.

22. Saxe, Nelli and Summerfield, If deep learning is the answer, what is the question?, Nature Reviews Neuroscience 22 (2021), 55–67.

23. Zhu and Chern I.L., Fast alternating minimization method for compressive sensing MRI under wavelet sparsity and TV sparsity, in 2011 Sixth International Conference on Image and Graphics, (2011), 356–361.

24. , Lu, Xu and Jia, Image smoothing via L0 gradient minimization, in Proceedings of the 2011 SIGGRAPH Asia conference, (2011), 1–12.

25. Wang, Jebara and Chang S.-F., Graph transduction via alternating minimization, in Proceedings of the 25th international conference on Machine learning, (2008), 1144–1151.

26. Tirer and Giryes, Image restoration by iterative denoising and backward projections, IEEE Transactions on Image Processing 28 (2018), 1220–1234.

27. Kim, Lee J.K. and Lee K.M., Accurate image super-resolution using very deep convolutional networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, (2016), 1646–1654.

28. Wang and Jiang, Ordered-subset simultaneous algebraic reconstruction techniques (OS-SART), Journal of X-Ray Science and Technology 12 (2004), 169–177.

29. Sara, Akter and Uddin M.S., Image quality assessment through FSIM, SSIM, MSE and PSNR—a comparative study, Journal of Computer and Communications 7 (2019), 8–18.

30. Wycoff, Chan T.-H., Jia, et al., A non-negative sparse promoting algorithm for high resolution hyperspectral imaging, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, (2013), 1409–1413.

31. Patidar and Morya M.S., Performance analysis of QoS parameters like PSNR, MAE & RMSE used in image transmission using Matlab, Journal for Scientific Research and Development 2 (2017), 67321463.

32. Fookes, Lin, Chandran and Sridharan, Evaluation of image resolution and super-resolution on face recognition performance, Journal of Visual Communication and Image Representation 23 (2012), 75–93.

33. Sheikh H.R., Sabir M.F. and Bovik A.C., A statistical evaluation of recent full reference image quality assessment algorithms, IEEE Transactions on Image Processing 15 (2006), 3440–3451.

34. Channappayya S.S., Bovik A.C. and Heath R.W., Rate bounds on SSIM index of quantized images, IEEE Transactions on Image Processing 17 (2008), 1624–1639.

35. Boreman G.D., Modulation transfer function in optical and electro-optical systems vol. 4: SPIE press Bellingham, Washington, 2001.

36. Park S.K., Schowengerdt and Kaczynski M.A., Modulation-transfer-function analysis for sampled image systems, Applied Optics 23 (1984), 2572–2582.

37. Chen, Jin, Li and Wang, A limited-angle CT reconstruction method based on anisotropic TV minimization, Physics in Medicine & Biology 58 (2013), 2119.

38. Qiao and Lu, A TV-minimization image-reconstruction algorithm without system matrix, Journal of X-Ray Science and Technology 29 (2021), 851–865.

39. Liu and Osher, Block matching local SVD operator based sparsity and TV regularization for image denoising, Journal of Scientific Computing 78 (2019), 607–624.

40. White, Buckland-Wright, Griffith, et al., ICRU Report 48: Phantoms and computational models in therapy, diagnosis and protection, Journal of the International Commission on Radiation Units and Measurements, os-25 (1992).