Adaptive orthogonal directional total variation with kernel regression for CT image denoising

Abstract

BACKGROUND:

Low-dose computed tomography (CT) has been successful in reducing radiation exposure for patients. However, the use of reconstructions from sparse angle sampling in low-dose CT often leads to severe streak artifacts in the reconstructed images.

OBJECTIVE:

In order to address this issue and preserve image edge details, this study proposes an adaptive orthogonal directional total variation method with kernel regression.

METHODS:

The reconstructed CT images are first processed by kernel regression, in which an N-term Taylor series serves as a local representation of the regression function. Expanding the series to second order yields the desired estimate of the regression function together with local estimates of its first and second derivatives. To mitigate the impact of noise on these derivatives, kernel regression is performed again to update them. Subsequently, the original reconstructed image, its local approximation, and the updated derivatives are combined in a weighted sum to derive the image used for calculating orientation information. To further remove streak artifacts, the study introduces the adaptive orthogonal directional total variation (AODTV) method, which denoises along both the edge direction and the normal direction, guided by the previously obtained orientation.

RESULTS:

Both the simulation and the real-data experiments yield good results. In the two real-data experiments, the proposed method achieves PSNR values of 34.5408 dB and 29.4634 dB, which are 1.2392–5.9333 dB and 2.828–6.7995 dB higher than those of the comparison denoising algorithms, respectively, indicating that the proposed method has good denoising performance.

CONCLUSIONS:

The study demonstrates the effectiveness of the method in eliminating streak artifacts and preserving the fine details of the images.

Keywords

CT reconstruction; denoising; orthogonal direction; kernel regression; edge adaptive directional total variation

1 Introduction

In the field of medical imaging, the reconstruction of computed tomography (CT) images is a critical area of research. CT employs X-rays to create cross-sectional images of the body, which are useful for diagnosing a variety of health issues. CT imaging reconstructs an image by applying the Radon and inverse Radon transforms to data collected from the transmission of an X-ray beam. The quality of reconstructed CT images is influenced by various factors, including blurring, field of view (FOV), artifacts, and visual noise. An artifact is a distortion or inaccuracy in an image that does not faithfully represent the object being imaged. While reducing the radiation dose in CT scans lowers the risk of cancer, low-dose imaging is usually accompanied by streak artifacts.

In the literature, numerous techniques have been proposed for CT image denoising, which can be broadly classified into two categories: spatial domain filtering and transform domain filtering. Spatial filters, which are either linear or nonlinear, reduce noise by directly filtering the original noisy image. To denoise the image in the spatial domain, linear filters such as the Gaussian filter [1] were initially used. However, these filters did not effectively preserve image details. To address this issue, nonlinear filters like the Wiener filter [2] and bilateral filter [3] were employed. In recent years, researchers have also explored least squares fidelity minimization [4] to remove Gaussian noise. Various regularization functions, including Tikhonov, anisotropic diffusion, and total variation methods, have been investigated. The Tikhonov method [5] is the simplest approach, using the L₂ norm for regularization, but it tends to oversmooth image details [6]. Anisotropic diffusion-based methods [7] can enhance image details by preventing diffusion at the edges, but they may still blur the edges. To overcome the oversmoothing of denoised images, total variation (TV) based regularization was developed in parallel with anisotropic diffusion. However, TV techniques often suffer from staircase artifacts [8]. To address these limitations, the anisotropic total variation method [9] was introduced, combining the advantages of TV and anisotropic diffusion. Bayram et al. [10] proposed a directional TV method for image denoising, which targets images with a single dominant direction and uses the directional derivative of the image. This method was further improved in [11], where it became spatially adaptive through pixel-specific angle estimates. Non-local means (NLM) [12] filters have also gained popularity in recent years for image denoising.
The wavelet transform [13] is the most investigated transform in denoising. It has been widely used in scientific and engineering fields [14]. Among the various methods, the wavelet threshold shrinkage method [15] has been extensively studied. Currently, BM3D [16] is considered a state-of-the-art denoising method, surpassing other existing methods in terms of denoising performance. Dictionary learning is another commonly used approach, which learns a sparse dictionary representation of an image to achieve denoising. An improved deep convolutional dictionary learning algorithm [17] was proposed for low-dose CT image processing and denoising.

The EADTV method in [11] uses a directional parameter based on the spatial variation of image edge directions. It estimates the edge direction from either the noisy image itself or a noiseless reference image. For CT reconstructions, a noiseless reference image for estimating edge orientation is typically difficult to obtain. In [11], the edge direction was estimated from a noisy image that had been preliminarily smoothed with a Gaussian filter. To obtain more accurate directional information, we propose using kernel regression to fit the original noisy image. The kernel regression method [18] is a non-parametric estimation method that aims to estimate the unknown regression function [19] from observational data. It can be applied to many types of images and noise, preserving image details while requiring minimal information about the image or noise distribution. Additionally, because the EADTV model does not consider the specific characteristics of streak artifacts in CT images, it may mistakenly treat them as image edges and preserve them instead of removing them. To address this issue, we apply the EADTV model to denoise along both the edge direction and the normal direction.

Based on the discussion above, we propose a new CT denoising method called AODTV. This method performs kernel regression on the original image, which is reconstructed using the FBP algorithm [20]. The first step of AODTV is to approximate the image and its first- and second-order derivatives using kernel regression. These derivatives are then updated through further kernel regression. To create an image containing more accurate directional information for the AODTV model, the reconstructed image, the regression function estimate, and the updated derivatives are combined using weights. The second step of AODTV is to denoise the reconstructed image along two orthogonal directions, namely the edge direction and the normal direction. This approach preserves edge details and reduces streak artifacts. To evaluate the performance of our method, we compared it with the FBP algorithm and several established denoising algorithms, including Guided Image Filtering [22], the L₀ smoothing algorithm [23], the WLS algorithm [24], the TV algorithm, the EADTV algorithm and the Truncated TV algorithm [25]. The comparison was conducted using both simulated and real data. The results demonstrate that our AODTV method outperforms the classical algorithms in terms of preserving details and eliminating artifacts.

With the wide application of CT imaging technology in clinical practice, low-dose CT imaging has received increasing attention as a way to reduce harm to the human body. However, images obtained from low-dose CT reconstruction are seriously disturbed by noise and exhibit streak artifacts, so the noise in the image must be suppressed. EADTV is a denoising method that can handle images with multiple dominant orientations; it relies on a noisy reference image, pre-smoothed by Gaussian filtering, to estimate the edge orientation [11]. Gaussian filtering is a linear smoothing technique that assumes a linear relationship between a pixel and its neighbors. This assumption may not be accurate enough in real images, especially in regions of complex texture, such as the richly textured bone trabeculae in medical images, or when the noise is not simply Gaussian. Kernel regression is a non-parametric nonlinear method that can build more complex mappings from the data, so it better captures the local trend and structure of the image and provides a more accurate recovery of the noisy image. In theory, selecting an appropriate kernel function and bandwidth gives kernel regression better control over the estimate, which helps protect image details and edge information. To obtain higher quality reference images and thus estimate the edge orientation of CT images more accurately, we introduce kernel regression. Since the first and second derivatives of the fitted image obtained by kernel regression still contain noise, we run kernel regression again on them.
For multi-directional medical images, such as the reconstructed image of bone trabecular projection data used in our experiment, the EADTV method only denoises along the edge direction of the image. It therefore tends to mistake streak artifacts for texture structure, resulting in incomplete denoising. Therefore, we use the EADTV method to denoise along both the edge direction and its normal direction to achieve a better denoising effect.

The structure of this work is as follows: Section 2 provides a brief presentation of some basic theories and Section 3 introduces the AODTV method. The experiments and their results are reported in Section 4. Section 5 analyzes the influence of parameters on the experimental results. Finally, Section 6 discusses and concludes this work.

2 Related work

2.1 Classical kernel regression

The two-dimensional data measurement model is as follows [18]: $y_{i} = z(x_{i}) + ɛ_{i}, \quad i = 1, \dots, P,$ (1) where the 2 × 1 vector x_i holds the coordinates of the measured datum y_i, z(·) is the regression function, and ɛ_i is the noise. Accordingly, the local expansion of the regression function is as follows: $\begin{aligned} z(x_{i}) &= z(x) + \{\nabla z(x)\}^{T} (x_{i} - x) + \frac{1}{2} (x_{i} - x)^{T} \{\nabla^{2} z(x)\} (x_{i} - x) + \dots \\ &= z(x) + \{\nabla z(x)\}^{T} (x_{i} - x) + \frac{1}{2} \mathrm{vec}^{T}\{\nabla^{2} z(x)\} \, \mathrm{vec}\{(x_{i} - x)(x_{i} - x)^{T}\} + \dots \\ &= β_{0} + β_{1}^{T} (x_{i} - x) + β_{2}^{T} \, \mathrm{vech}\{(x_{i} - x)(x_{i} - x)^{T}\} + \dots \end{aligned}$ (2)

where ∇ and ∇² are the gradient (2 × 1) and Hessian (2 × 2) operators, respectively, β₀ = z(x), $β_{1} = \nabla z(x) = \left[ \frac{\partial z(x)}{\partial x_{1}}, \frac{\partial z(x)}{\partial x_{2}} \right]^{T}$, $β_{2} = \frac{1}{2} \left[ \frac{\partial^{2} z(x)}{\partial x_{1}^{2}}, 2 \frac{\partial^{2} z(x)}{\partial x_{1} \partial x_{2}}, \frac{\partial^{2} z(x)}{\partial x_{2}^{2}} \right]^{T}$, and vech(·) is the half-vectorization operator, which lexicographically orders the lower triangular portion of a symmetric matrix into a column-stacked vector, e.g., $\mathrm{vech}\left( \begin{bmatrix} a & b \\ b & d \end{bmatrix} \right) = [a \;\; b \;\; d]^{T}, \qquad \mathrm{vech}\left( \begin{bmatrix} a & b & c \\ b & e & f \\ c & f & i \end{bmatrix} \right) = [a \;\; b \;\; c \;\; e \;\; f \;\; i]^{T}.$ (3)
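The half-vectorization in Equation (3) can be illustrated with a few lines of NumPy. This is only a minimal sketch; the function name `vech` and the sample offset are chosen here for illustration.

```python
import numpy as np

def vech(a):
    """Half-vectorization: stack the lower-triangular part of a symmetric
    matrix column by column, as in Eq. (3)."""
    a = np.asarray(a)
    n = a.shape[0]
    # Column j contributes its diagonal entry and everything below it.
    return np.concatenate([a[j:, j] for j in range(n)])

# Example: the quadratic term of the local expansion for one offset x_i - x.
d = np.array([2.0, 3.0])      # illustrative offset (d1, d2)
outer = np.outer(d, d)        # (x_i - x)(x_i - x)^T
print(vech(outer))            # [d1^2, d1*d2, d2^2] -> [4. 6. 9.]
```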

Then the parameters $\{β_{n}\}_{n = 0}^{N}$ are estimated by solving the following optimization problem (for N = 2): $\min_{\{β_{n}\}} \sum_{i = 1}^{P} \left[ y_{i} - β_{0} - β_{1}^{T} (x_{i} - x) - β_{2}^{T} \, \mathrm{vech}\{(x_{i} - x)(x_{i} - x)^{T}\} \right]^{2} K_{H}(x_{i} - x),$ (4) where $K_{H}(t) = \frac{1}{\det(H)} K(H^{-1} t)$ and H is the 2 × 2 smoothing matrix, whose standard form is H = hI. The smoothing parameter h controls the size of the kernel function. A smaller h gives a sharper kernel that focuses on local features in the data, while a larger h gives a broader kernel that follows the overall trend. The choice of the kernel function K(·) is open, provided that it satisfies the following two conditions: $\int_{R^{2}} t K(t) \, dt = 0, \quad \int_{R^{2}} t t^{T} K(t) \, dt = h^{2} I.$ (5)

The Gaussian function is commonly used as a kernel function due to its lower computational requirements compared to other kernel functions that satisfy the given conditions. Additionally, Equation (4) can be written in matrix form as a weighted least-squares optimization problem: $\hat{b} = \underset{b}{\arg\min} \, \| y - X_{x} b \|_{W_{x}}^{2} = \underset{b}{\arg\min} \, (y - X_{x} b)^{T} W_{x} (y - X_{x} b),$ (6) where $y = [y_{1}, y_{2}, \dots, y_{P}]^{T}, \quad b = [β_{0}, β_{1}^{T}, β_{2}^{T}]^{T} = [β_{0}, β_{11}, β_{12}, β_{21}, β_{22}, β_{23}]^{T},$ (7) $W_{x} = \mathrm{diag}[K_{H}(x_{1} - x), K_{H}(x_{2} - x), \dots, K_{H}(x_{P} - x)],$ (8) $X_{x} = \begin{bmatrix} 1 & (x_{1} - x)^{T} & \mathrm{vech}^{T}\{(x_{1} - x)(x_{1} - x)^{T}\} \\ 1 & (x_{2} - x)^{T} & \mathrm{vech}^{T}\{(x_{2} - x)(x_{2} - x)^{T}\} \\ \vdots & \vdots & \vdots \\ 1 & (x_{P} - x)^{T} & \mathrm{vech}^{T}\{(x_{P} - x)(x_{P} - x)^{T}\} \end{bmatrix}.$ (9) Therefore, the least-squares estimates are $\hat{z}(x) = \hat{β}_{0} = e_{1}^{T} (X_{x}^{T} W_{x} X_{x})^{-1} X_{x}^{T} W_{x} y,$ (10) $\hat{\nabla} z(x) = \hat{β}_{1} = \begin{bmatrix} e_{2}^{T} \\ e_{3}^{T} \end{bmatrix} (X_{x}^{T} W_{x} X_{x})^{-1} X_{x}^{T} W_{x} y,$ (11) $\hat{\nabla}^{2} z(x) = \hat{β}_{2} = \begin{bmatrix} e_{4}^{T} \\ e_{5}^{T} \\ e_{6}^{T} \end{bmatrix} (X_{x}^{T} W_{x} X_{x})^{-1} X_{x}^{T} W_{x} y,$ (12) where e_i (i = 1, 2, …, 6) is a 6 × 1 column vector with the i-th entry equal to one and all other entries equal to zero. The above describes the classical kernel regression estimation method. This method involves a local polynomial fit of the data (second order here), which gives approximately optimal filtering results in the smooth regions of the image.
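The weighted least-squares estimate of Equations (6) to (12) can be sketched as follows. On a regular pixel grid, X_x and W_x are the same for every pixel, so the estimator reduces to six fixed linear filters. The function names, the Gaussian kernel with H = hI, the reflective padding, and the convention that x₁ is the row axis and x₂ the column axis are all assumptions made for this illustration, not details taken from the paper.

```python
import numpy as np

def kr_filters(ksize=5, h=1.0):
    """Rows of (X^T W X)^{-1} X^T W for a ksize x ksize window (ksize odd, >= 3)."""
    r = ksize // 2
    d1, d2 = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    d1 = d1.ravel().astype(float)
    d2 = d2.ravel().astype(float)
    # Each row of X_x: [1, (x_i - x)^T, vech^T{(x_i - x)(x_i - x)^T}]
    X = np.stack([np.ones_like(d1), d1, d2, d1 * d1, d1 * d2, d2 * d2], axis=1)
    # Gaussian kernel weights; the 1/det(H) factor cancels in the solution.
    w = np.exp(-(d1 ** 2 + d2 ** 2) / (2.0 * h ** 2))
    XtW = X.T * w                          # X^T W with W diagonal
    A = np.linalg.solve(XtW @ X, XtW)      # shape (6, ksize * ksize)
    return A.reshape(6, ksize, ksize)

def kernel_regression(y, ksize=5, h=1.0):
    """Return b with shape (6, H, W): b[0] is the fit, b[1:3] the entries of
    beta_1, and b[3:6] the entries of beta_2 at every pixel."""
    r = ksize // 2
    A = kr_filters(ksize, h)
    yp = np.pad(np.asarray(y, dtype=float), r, mode="reflect")
    H, W = np.shape(y)
    b = np.zeros((6, H, W))
    for a in range(ksize):
        for c in range(ksize):
            # Sample at offset (a - r, c - r) from each pixel, weighted per coefficient.
            b += A[:, a, c][:, None, None] * yp[a:a + H, c:c + W]
    return b
```

Running `kernel_regression` a second time on `b[1]` to `b[5]` would mirror the derivative update described for the AODTV method; rows 3 to 5 hold the fitted coefficients of d₁², d₁d₂ and d₂², i.e. the entries of β₂.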

2.2 Edge adaptive directional total variation

In 1992, Rudin, Osher, and Fatemi introduced the classical total variation image denoising algorithm [9]. This algorithm restores a clean image from a noisy one by formulating a noise model and solving the resulting optimization problem iteratively, so that the restored image gradually approaches the ideal denoised image. The total variation model minimizes the following functional: $E_{TV}(f) = λ \int_{Ω} | \nabla f | \, dΩ + \frac{1}{2} \int_{Ω} (f - I)^{2} \, dΩ,$ (13) where f is the denoised image, I is the observed noisy image, Ω is the image domain, and λ > 0 is a parameter that balances the regularity and fidelity terms. By introducing the concept of the support function, the TV regulariser $\int_{Ω} | \nabla f | \, dΩ$ can be reformulated as follows: $TV(f) = \sum_{i, j} \sup_{P \in B_{2}} \langle \nabla f(i, j), P \rangle,$ (14) where B₂ is the unit ball of the L₂ norm. The TV model effectively reduces noise in images while maintaining edge information. However, it tends to create a staircase effect when dealing with high levels of noise. Additionally, total variation is isotropic, making it less suitable for processing images with prominent directional features. Later, Bayram and Kamasak [10] proposed replacing the unit ball B₂ with an ellipse, so that the regulariser becomes: $DTV_{α, θ}(f) = \sum_{i, j} \sup_{P \in B_{α, θ}} \langle \nabla f(i, j), P \rangle,$ (15) where $B_{α, θ}$ is an ellipse oriented along the direction θ, with a minor axis of unit length and a major axis of length α > 1. The larger the value of α, the more sensitive the DTV is to variation along the direction θ. The image denoising model using the DTV regulariser is: $E_{DTV}(f) = λ \, DTV_{α, θ}(f) + \frac{1}{2} \int_{Ω} (f - I)^{2} \, dΩ.$ (16)
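The supremum in Equation (15) has a closed form, because the support function of an ellipse is a weighted Euclidean norm of the gradient expressed in the rotated frame. The sketch below compares that closed form with a brute-force maximisation over the ellipse boundary. It assumes the ellipse has semi-axes α (along θ) and 1 (across θ), so that α = 1 recovers the unit ball B₂; the function names are illustrative.

```python
import numpy as np

def dtv_support(v, alpha, theta):
    """Closed form of sup_{P in B_{alpha,theta}} <v, P> for one gradient vector v."""
    c, s = np.cos(theta), np.sin(theta)
    along = c * v[0] + s * v[1]       # component of v along the direction theta
    across = -s * v[0] + c * v[1]     # component of v perpendicular to theta
    return np.hypot(alpha * along, across)

def dtv_support_sampled(v, alpha, theta, n=20000):
    """Brute-force check: maximise <v, P> over points sampled on the ellipse boundary."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    c, s = np.cos(theta), np.sin(theta)
    # Boundary point (alpha*cos t, sin t), rotated by theta.
    px = c * alpha * np.cos(t) - s * np.sin(t)
    py = s * alpha * np.cos(t) + c * np.sin(t)
    return np.max(v[0] * px + v[1] * py)
```

With α = 1 the closed form is the ordinary gradient magnitude of Equation (14); with α > 1 a gradient pointing along θ is penalised α times more strongly than one pointing across it.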

The DTV model enhances diffusion along the direction θ. When the dominant direction in the image aligns with θ, the DTV model enhances the dominant structure; when it does not, the model destroys the structure. When the image contains multiple dominant directions, the parameter θ must vary spatially across the image. In [11], a spatially varying parameter θ(x, y) based on the image edge orientation is proposed. It is obtained from the edge direction as follows: $(\cos(θ(x, y)), \sin(θ(x, y))) = (n_{1}(x, y), n_{2}(x, y)),$ (17) where (n₁(x, y), n₂(x, y)) is the edge direction of the image, which should be estimated in advance. When there is no reference image, the edge direction is calculated from the noisy image I(x, y): $(n_{1}(x, y), n_{2}(x, y)) = (- g_{y}, g_{x}) / \sqrt{g_{x}^{2} + g_{y}^{2}},$ (18) where (g_x, g_y) is the gradient vector of g(x, y), and g(x, y) is a version of I(x, y) smoothed with a Gaussian filter [11]. The DTV model in Equation (16) with the spatially varying θ(x, y) is called the edge adaptive directional total variation (EADTV) model. In this way, the EADTV model adapts to the image and enhances diffusion along the edge direction.
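Equations (17) and (18) translate directly into array operations. The following is a minimal sketch that takes the already smoothed image g as input (the Gaussian pre-smoothing is left out), uses central differences from `np.gradient`, and treats rows as the y axis and columns as the x axis. The function name and the small `eps` guard against division by zero in flat regions are assumptions of this illustration.

```python
import numpy as np

def edge_direction(g, eps=1e-12):
    """Unit edge direction (n1, n2) and angle theta per pixel, following Eqs. (17)-(18)."""
    g = np.asarray(g, dtype=float)
    gy, gx = np.gradient(g)            # derivative along rows (y), then columns (x)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    n1 = -gy / (mag + eps)             # edge direction is the gradient rotated by 90 degrees
    n2 = gx / (mag + eps)
    theta = np.arctan2(n2, n1)         # so that (cos(theta), sin(theta)) = (n1, n2)
    return n1, n2, theta
```

For an image whose intensity increases only along x, the gradient is (1, 0), so the edge direction is (0, 1) and θ = π/2. In flat regions the gradient vanishes and the direction is not meaningful.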

2.3 Comparison algorithm

In the experimental section, Guided Image Filtering, the L₀ smoothing algorithm, the WLS algorithm, the TV algorithm, the EADTV algorithm and the Truncated TV algorithm are compared with the AODTV method for image denoising. The AODTV method has four main groups of parameters. The kernel sizes ksize1 and ksize2 control the size of the support region of the kernel, which affects the smoothness of the kernel regression and the quality of the fit to the data. The weight parameters t₁ and t₂ balance the kernel regression results and the reconstructed image. The regularization parameters of the EADTV regularization term are denoted by λ₁ and λ₂. The major axis lengths of the ellipses along the image edge direction and normal direction are denoted by α₁ and α₂. In the Guided Image Filtering algorithm, ɛ is the regularization parameter, r is the local window radius and s is the subsampling ratio. In the L₀ smoothing algorithm, λ is the smoothing parameter controlling the degree of smoothing, and κ controls the rate [23]; a smaller value of κ leads to more iterations and sharper edges. In the WLS algorithm, λ is the balance parameter between the data term and the smoothing term, and increasing λ produces smoother images; α controls the affinities by nonlinearly scaling the gradient, and increasing α results in sharper preserved edges. In the TV algorithm, λ is the smoothness parameter. In the EADTV algorithm, λ is the regularization parameter, α is the length of the major axis of the ellipse, and varargin is the number of iterations. In the Truncated TV algorithm, ɛ is the threshold parameter in the model, α is the regularization parameter of the truncated TV term, and λ is the parameter introduced in the iterative solution of each sub-problem in the model.

3 AODTV method

In general, the AODTV method has two steps:

Step 1: Kernel regression is applied to the image y reconstructed by FBP to obtain its fit $\hat{z} (x)$ and its first and second derivatives $\hat{\nabla} z (x)$ and ${\hat{\nabla}}^{2} z (x)$. The derivative estimates are then updated by a second kernel regression. The reconstructed image, the kernel regression fit, and the updated first and second derivatives are combined in a weighted sum to obtain $\hat{I}$, which is used to estimate the edge direction in image y.

Step 2: The reconstructed image is denoised using the EADTV model along the two orthogonal directions (edge and normal) of the image $\hat{I}$.

The specific procedure is as follows. Kernel regression is performed again on the first-order and second-order derivative estimates β₁₁, β₁₂, β₂₁, β₂₂, β₂₃ in Equation (7), which were obtained by kernel regression of the FBP reconstructed image y. Their corresponding regression functions are denoted as z_x (x_i), z_y (x_i), z_xx (x_i), z_xy (x_i), z_yy (x_i), and the updated derivative estimates are denoted as ${\hat{z}}_{x} (x)$, ${\hat{z}}_{y} (x)$, ${\hat{z}}_{xx} (x)$, ${\hat{z}}_{xy} (x)$, ${\hat{z}}_{yy} (x)$. They are then summed with $\hat{z} (x)$ using the following weights: $\tilde{z} (x) = k_{1} \hat{z} (x) + k_{2} {\hat{z}}_{x} (x) + k_{3} {\hat{z}}_{y} (x) + k_{4} {\hat{z}}_{xx} (x) + k_{5} {\hat{z}}_{xy} (x) + k_{6} {\hat{z}}_{yy} (x),$ (19) where k_i (i = 1, 2, …, 6) are manually chosen weighting coefficients. The weighted sum of $\tilde{z} (x)$ and y is then: $\hat{I} = t_{1} \tilde{z} (x) + t_{2} y,$ (20) where t₁ and t₂ are the weight parameters. $\hat{I}$ is the reference image used to estimate the edge direction.

To be specific, the edge direction (n₁ (x, y) , n₂ (x, y)) based on the image $\hat{I}$ can be calculated by the following equation: $(n_{1} (x, y), n_{2} (x, y)) = (- {\hat{I}}_{y}, {\hat{I}}_{x}) / \sqrt{{\hat{I}}_{x}^{2} + {\hat{I}}_{y}^{2}},$ (21)

The spatially varying parameter θ₁ (x, y) is calculated as follows: $(\cos(θ_{1}(x, y)), \sin(θ_{1}(x, y))) = (n_{1}(x, y), n_{2}(x, y)).$ (22) Then the EADTV model is: $f^{*} = \underset{f}{\arg\min} \, \frac{1}{2} \| \hat{I} - f \|_{2}^{2} + λ_{1} \, EADTV_{α_{1}, θ_{1}(x, y)}(f).$ (23)

Since the penalty is applied only along the edge direction of the image at this stage, the details are preserved, but the image still has streak artifacts in the normal direction. Therefore, f^* is taken as the new input image, and the spatially varying parameter θ₂ (x, y) is determined from the normal direction (t₁ (x, y) , t₂ (x, y)) of image $\hat{I}$, so that the image is penalized along the normal direction to remove streak artifacts: $(- \sin(θ_{2}(x, y)), \cos(θ_{2}(x, y))) = (t_{1}(x, y), t_{2}(x, y)).$ (24) The normal direction of the image is calculated as: $(t_{1}(x, y), t_{2}(x, y)) = (- {\hat{I}}_{x}, - {\hat{I}}_{y}) / \sqrt{{\hat{I}}_{x}^{2} + {\hat{I}}_{y}^{2}}.$ (25) The resulting model is: $f^{**} = \underset{f^{*}}{\arg\min} \, \frac{1}{2} \| \hat{I} - f^{*} \|_{2}^{2} + λ_{2} \, EADTV_{α_{2}, θ_{2}(x, y)}(f^{*}).$ (26)

$f^{**}$ is the final image obtained by the AODTV method. The solution method for this model can be found in [10].
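To make the structure of one EADTV pass concrete, the sketch below minimises a discrete version of the fidelity-plus-EADTV energy with a generic projected-gradient iteration on the dual problem. It is an illustration under stated assumptions, not necessarily the solver of [10]: forward differences with columns as x and rows as y, an ellipse with semi-axes α and 1, a step size derived from the bound 8·max(α, 1)² on the squared operator norm, and hypothetical function names.

```python
import numpy as np

def grad(f):
    """Forward differences; gx along columns (x), gy along rows (y)."""
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]
    gy[:-1, :] = f[1:, :] - f[:-1, :]
    return gx, gy

def grad_adjoint(px, py):
    """Adjoint of grad (the negative discrete divergence)."""
    out = np.zeros_like(px)
    out[:, :-1] -= px[:, :-1]
    out[:, 1:] += px[:, :-1]
    out[:-1, :] -= py[:-1, :]
    out[1:, :] += py[:-1, :]
    return out

def eadtv_energy(f, img, theta, lam, alpha):
    """0.5*||img - f||^2 + lam * sum of the directional gradient norms."""
    gx, gy = grad(f)
    c, s = np.cos(theta), np.sin(theta)
    along = alpha * (c * gx + s * gy)
    across = -s * gx + c * gy
    return 0.5 * np.sum((img - f) ** 2) + lam * np.sum(np.hypot(along, across))

def eadtv_denoise(img, theta, lam, alpha, n_iter=200):
    """Approximately minimise eadtv_energy over f; theta may be a scalar or a map."""
    img = np.asarray(img, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    p1 = np.zeros_like(img)
    p2 = np.zeros_like(img)
    step = 1.0 / (8.0 * max(alpha, 1.0) ** 2 * lam)
    f = img.copy()
    for _ in range(n_iter):
        gx, gy = grad(f)
        # Dual ascent: rotate the gradient into (along, across) and stretch by alpha.
        p1 = p1 + step * alpha * (c * gx + s * gy)
        p2 = p2 + step * (-s * gx + c * gy)
        # Project every dual vector onto the unit disc.
        norm = np.maximum(1.0, np.sqrt(p1 ** 2 + p2 ** 2))
        p1 = p1 / norm
        p2 = p2 / norm
        # Primal update: f = img - lam * A^T p.
        qx = alpha * c * p1 - s * p2
        qy = alpha * s * p1 + c * p2
        f = img - lam * grad_adjoint(qx, qy)
    return f
```

Under these assumptions, the two-stage scheme described above would call `eadtv_denoise` once with the θ₁ map, λ₁ and α₁, and then call it again on the result with the θ₂ map, λ₂ and α₂.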

4 Experiment

In this section, both simulation and real data are images with texture details. The choice of experimental parameters is a compromise between removing noise and preserving detail. The parameters were selected manually through a large number of experiments. Mean square error (MSE), peak signal-to-noise ratio (PSNR), mean structural similarity (MSSIM) [26] and Gradient Magnitude Similarity Deviation (GMSD) [27] are utilized for quantitative evaluation of the denoised images.

Mean square error (MSE) is the average of the squared differences between the true and predicted pixel values. The calculation formula is as follows: $MSE = \frac{1}{H \times W} \sum_{i = 1}^{H} \sum_{j = 1}^{W} (X(i, j) - Y(i, j))^{2},$ (27) where X and Y are the true and predicted images, each of size H × W.

PSNR is the most common and widely used objective image evaluation index, which is based on the error between the corresponding pixels, that is, error-sensitive image quality evaluation. The unit is dB, with higher values indicating less distortion. The calculation formula is as follows: $PSNR = 10 \cdot {log}_{10} (\frac{{(2^{B} - 1)}^{2}}{MSE}) .$ (28)

B is the number of bits per pixel, generally taken as 8, and 2^B − 1 is the maximum possible pixel value of the image.
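Equations (27) and (28) can be checked numerically with a short sketch; the function names are illustrative.

```python
import numpy as np

def mse(x, y):
    """Mean square error between two images of equal size, Eq. (27)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

def psnr_from_mse(mse_value, bits=8):
    """PSNR in dB for a given MSE and bit depth B, Eq. (28)."""
    peak = 2 ** bits - 1               # maximum possible pixel value
    return 10.0 * np.log10(peak ** 2 / mse_value)

def psnr(x, y, bits=8):
    """PSNR in dB between two images."""
    return psnr_from_mse(mse(x, y), bits)
```

For example, with B = 8, `psnr_from_mse(1684.8)` gives about 15.865 dB, which agrees with the MSE and PSNR reported for FBP in Table 2.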

Mean structural similarity (MSSIM): a sliding window divides the image into M blocks, and the mean, variance and covariance of each window are computed with weights w_ij that satisfy ∑_i∑_j w_ij = 1; a Gaussian kernel is usually used. The structural similarity (SSIM) of each pair of corresponding blocks is then calculated, and the average over all blocks is taken as the structural similarity measure of the two images. SSIM measures image similarity in terms of luminance, contrast and structure: $l(x, y) = \frac{2 μ_{x} μ_{y} + C_{1}}{μ_{x}^{2} + μ_{y}^{2} + C_{1}},$ (29) $c(x, y) = \frac{2 σ_{x} σ_{y} + C_{2}}{σ_{x}^{2} + σ_{y}^{2} + C_{2}},$ (30) $s(x, y) = \frac{σ_{xy} + C_{3}}{σ_{x} σ_{y} + C_{3}},$ (31) where μ_x and μ_y are the means of images x and y, σ_x and σ_y are their standard deviations, σ_xy is the covariance of x and y, and C₁, C₂ and C₃ are constants. The SSIM is calculated as follows: $SSIM(x, y) = [l(x, y)]^{α} \cdot [c(x, y)]^{β} \cdot [s(x, y)]^{γ},$ (32) where the exponents α, β and γ weight the three components.

The local statistics and the MSSIM are calculated as: $μ_{x} = \sum_{i = 1}^{H} \sum_{j = 1}^{W} w_{ij} \, x(i, j),$ (33) $σ_{x} = \left( \sum_{i = 1}^{H} \sum_{j = 1}^{W} w_{ij} (x(i, j) - μ_{x})^{2} \right)^{\frac{1}{2}},$ (34) $σ_{xy} = \sum_{i = 1}^{H} \sum_{j = 1}^{W} w_{ij} (x(i, j) - μ_{x}) (y(i, j) - μ_{y}),$ (35) $MSSIM(x, y) = \frac{1}{M} \sum_{j = 1}^{M} SSIM(x_{j}, y_{j}),$ (36) where x_j and y_j are the image contents in the j-th window and M is the number of windows.
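A compact sketch of Equations (29) to (36) is given below. The paper does not state the constants or the window, so the values used here are assumptions: α = β = γ = 1, C₁ = (0.01L)², C₂ = (0.03L)² and C₃ = C₂/2 with L = 2^B − 1, and an 11 × 11 Gaussian window with standard deviation 1.5 moved one pixel at a time. The function names are illustrative.

```python
import numpy as np

def ssim_window(x, y, w, C1, C2, C3):
    """SSIM of one pair of windows with normalised weights w, Eqs. (29)-(35)."""
    mu_x = np.sum(w * x)
    mu_y = np.sum(w * y)
    sig_x = np.sqrt(np.sum(w * (x - mu_x) ** 2))
    sig_y = np.sqrt(np.sum(w * (y - mu_y) ** 2))
    sig_xy = np.sum(w * (x - mu_x) * (y - mu_y))
    lum = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    con = (2 * sig_x * sig_y + C2) / (sig_x ** 2 + sig_y ** 2 + C2)
    struct = (sig_xy + C3) / (sig_x * sig_y + C3)
    return lum * con * struct            # exponents alpha = beta = gamma = 1 (assumed)

def mssim(x, y, win=11, sigma=1.5, bits=8):
    """Mean SSIM over all sliding windows, Eq. (36)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    L = 2 ** bits - 1
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2   # assumed constants
    C3 = C2 / 2.0
    ax = np.arange(win) - (win - 1) / 2.0
    g = np.exp(-ax ** 2 / (2.0 * sigma ** 2))
    w = np.outer(g, g)
    w /= w.sum()                          # weights sum to one
    vals = [ssim_window(x[i:i + win, j:j + win], y[i:i + win, j:j + win], w, C1, C2, C3)
            for i in range(x.shape[0] - win + 1)
            for j in range(x.shape[1] - win + 1)]
    return float(np.mean(vals))
```

Identical images give an MSSIM of 1, and any distortion lowers the value.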

GMSD assesses image quality by measuring pixel-level gradient magnitude similarity. GMSD values lie between 0 and 1, and values closer to 0 indicate better results. The gradient is calculated with the Prewitt filter, whose kernels along the horizontal (x) and vertical (y) directions are defined as: $h_{x} = \begin{bmatrix} 1/3 & 0 & -1/3 \\ 1/3 & 0 & -1/3 \\ 1/3 & 0 & -1/3 \end{bmatrix}, \quad h_{y} = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ 0 & 0 & 0 \\ -1/3 & -1/3 & -1/3 \end{bmatrix}.$ (37)

Convolving the reference image r and the distorted image d with the above operators yields their horizontal and vertical gradients. The gradient magnitudes of r and d at location i, denoted by m_r (i) and m_d (i), are computed as follows: $m_{r}(i) = \sqrt{(r \otimes h_{x})^{2}(i) + (r \otimes h_{y})^{2}(i)},$ (38) $m_{d}(i) = \sqrt{(d \otimes h_{x})^{2}(i) + (d \otimes h_{y})^{2}(i)}.$ (39) The gradient magnitude similarity (GMS) map is computed as follows: $GMS(i) = \frac{2 m_{r}(i) m_{d}(i) + c}{m_{r}^{2}(i) + m_{d}^{2}(i) + c},$ (40) where c is a positive constant that ensures numerical stability. The gradient magnitude similarity mean (GMSM) is computed as follows: $GMSM = \frac{1}{N} \sum_{i = 1}^{N} GMS(i),$ (41) where N is the number of pixels. Combining GMS and GMSM, GMSD is calculated as: $GMSD = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (GMS(i) - GMSM)^{2}}.$ (42)
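Equations (37) to (42) can be sketched as follows. The constant c is not given in the text; the default c = 170 used here is a value commonly quoted for 8-bit images and should be read as an assumption, as should the function names. The sketch filters only the interior of the image and omits any preliminary downsampling.

```python
import numpy as np

HX = np.array([[1.0, 0.0, -1.0],
               [1.0, 0.0, -1.0],
               [1.0, 0.0, -1.0]]) / 3.0   # horizontal Prewitt kernel, Eq. (37)
HY = HX.T                                  # vertical Prewitt kernel

def filter3(img, k):
    """3x3 correlation over the interior of img. For these antisymmetric kernels
    it differs from convolution only in sign, so magnitudes are unchanged."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for a in range(3):
        for b in range(3):
            out += k[a, b] * img[a:a + H - 2, b:b + W - 2]
    return out

def gmsd(r, d, c=170.0):
    """Gradient magnitude similarity deviation between reference r and distorted d."""
    r = np.asarray(r, dtype=float)
    d = np.asarray(d, dtype=float)
    m_r = np.hypot(filter3(r, HX), filter3(r, HY))          # Eq. (38)
    m_d = np.hypot(filter3(d, HX), filter3(d, HY))          # Eq. (39)
    gms = (2 * m_r * m_d + c) / (m_r ** 2 + m_d ** 2 + c)   # Eq. (40)
    return float(np.std(gms))   # population standard deviation, Eqs. (41)-(42)
```

Identical images give a GMSD of 0, since every entry of the GMS map then equals 1.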

4.1 Simulation

For the simulation study, we used a medical image with a region of interest (ROI) marked by a red rectangle. To facilitate visual comparison, we zoomed in on the ROI to show its details, as shown in Fig. 1. Poisson noise was added to the projection data of the simulation image, and the reconstructed images were obtained by sampling the projections at intervals of four and eight angles within the range of 0 to π, respectively. The results obtained by applying the AODTV method and the comparison methods to the FBP reconstructed images are presented in Figs. 2 and 3.

Fig. 1

Simulation data. (a) Original simulation image. (b) Local magnification of the simulation image.

Fig. 2

(a) The FBP reconstructed result obtained by sampling simulated image projection data, with Poisson noise added, at intervals of every four angles within the range of 0 to π. (b)-(h) Performance of different denoising methods in this experiment. (b) Guided Image Filtering. (c) L₀ smoothing. (d) WLS. (e) TV. (f) EADTV. (g) Truncated TV. (h) AODTV.

Fig. 3

(a) The FBP reconstructed result obtained by sampling simulated image projection data, with Poisson noise added, at intervals of every eight angles within the range of 0 to π. (b)-(h) Performance of different denoising methods in this experiment. (b) Guided Image Filtering. (c) L₀ smoothing. (d) WLS. (e) TV. (f) EADTV. (g) Truncated TV. (h) AODTV.

For the first simulation, Fig. 2(a) shows the image reconstructed by the FBP algorithm from the projection data of the simulated image, with Poisson noise added, sampled at intervals of four angles within the range of 0 to π. The parameter values presented in Table 1 were chosen to attain the most effective denoising outcome for each denoising algorithm. The denoising effects of Guided Image Filtering and Truncated TV in Fig. 2(b) and (g) are similar: some noise remains in the image, and the details are not clearly preserved in the locally enlarged area of Fig. 2(b). Figure 2(c) and (d) show that considerable noise remains in the ROI and that the texture structure is unclear after L₀ smoothing and WLS denoising. The TV result in Fig. 2(e) does not differ significantly from the EADTV result in Fig. 2(f). The image in the ROI retains detailed information after EADTV denoising; however, some residual noise remains in Fig. 2(f). The AODTV method exhibits the best detail preservation and denoising effect in Fig. 2(h). The quantitative results are presented in Table 2, where the AODTV method yields the highest PSNR and MSSIM and the lowest MSE and GMSD.

Table 1

The best parameter values of different methods in the first simulation

Method	Parameters
Guided Image Filtering	r = 1, ɛ = 240, s = 1
L₀ smoothing	λ = 0.001, κ = 2
WLS	λ = 0.0001, α = 6
TV	λ = 6
EADTV	λ = 4, α = 2, varargin = 200
Truncated TV	ɛ = 5, α = 9, λ = 4
AODTV	ksize1 = 3, ksize2 = 1, t₁ = 0.1, t₂ = 0.9, λ₁ = 1, λ₂ = 2, α₁ = 4, α₂ = 1

Table 2

Evaluation results of different denoising methods for the first simulation

Method	MSE	PSNR(dB)	MSSIM	GMSD
FBP	1684.8	15.8653	0.8538	0.1790
Guided Image Filtering	1629.6	16.0100	0.8551	0.1549
L₀ smoothing	1632.4	16.0026	0.8571	0.1735
WLS	1656.6	15.9388	0.8540	0.1721
TV	1627.5	16.0156	0.8551	0.1513
EADTV	1623.5	16.0262	0.8554	0.1562
Truncated TV	1629.3	16.0109	0.8555	0.1546
AODTV	1611.7	16.0580	0.8595	0.1512

For the second simulation, Fig. 3(a) shows the FBP reconstruction obtained from the projection data of the simulated image, with Poisson noise added, sampled at intervals of eight angles within the range of 0 to π. Table 3 presents the optimal parameter values for each denoising algorithm. In the locally magnified area of Fig. 3(b), Guided Image Filtering reduces the noise significantly, but the details are not very clear. Figure 3(c) shows that the L₀ smoothing technique preserves image details but still retains a considerable amount of noise. The denoising effect of the WLS method, shown in Fig. 3(d), is poor. The locally magnified areas in Fig. 3(e), (f), and (g) show that the TV, EADTV, and Truncated TV methods yield better denoising results, although the image details are relatively smooth. Figure 3(h) illustrates that the AODTV method preserves the texture details of the image, giving relatively clear edges and a more noticeable noise removal effect. Table 4 shows that the AODTV method attains the lowest MSE and GMSD and the highest PSNR and MSSIM among all methods. This indicates that the AODTV method can remove noise while preserving image details.

Table 3

The best parameter values of different denoising methods in the second simulation

Guided Image Filtering	r = 1 ɛ = 700 s = 1
L₀ smoothing	λ = 0.008 κ = 2
WLS	λ = 0.003 α = 3
TV	λ = 10
EADTV	λ = 9 α = 5 varargin = 1000
Truncated TV	ɛ = 8 α = 16 λ = 5
AODTV	ksize1 = 9 ksize2 = 1 t₁ = 0.9 t₂ = 0.1 λ₁ = 0.001 λ₂ = 1 α₁ = 5 α₂ = 1

Table 4

Evaluation results of different denoising methods for the second simulation

Method	MSE	PSNR(dB)	MSSIM	GMSD
FBP	3611.4	12.5540	0.7456	0.2540
Guided Image Filtering	3498.0	12.6926	0.7492	0.2278
L₀ smoothing	3554.1	12.6236	0.7476	0.2491
WLS	3532.9	12.6495	0.7482	0.2401
TV	3485.9	12.7077	0.7501	0.2176
EADTV	3474.9	12.7214	0.7499	0.2171
Truncated TV	3482.1	12.7123	0.7506	0.2234
AODTV	3123.7	13.1841	0.7665	0.2127

4.2 Real data experiment

In the real experiment, the full-angle reconstructed image of the trabecular bone projection data and its locally magnified area (indicated by the red rectangle) are presented in Fig. 4. The projection data were sampled at intervals of six and fifteen angles within a sampling range from 0 to π. These data were then processed using the AODTV method and compared with the other denoising methods, as shown in Figs. 5 and 6.

Fig. 4

Real data. (a) The full angle reconstructed image of the trabecular bone projection data. (b) The local magnified area of (a).

Fig. 5

(a) The reconstruction result of trabecular bone projection data sampled every six angles in the sampling range from 0 to π. (b)-(h) Comparison of noise reduction results for the first real experiment. (b) Guided Image Filtering. (c) L₀ smoothing. (d) WLS. (e) TV. (f) EADTV. (g) Truncated TV. (h) AODTV.

Fig. 6

(a) The reconstruction result of trabecular bone projection data sampled every fifteen angles in the sampling range from 0 to π. (b)-(h) Comparison of noise reduction results for the second real experiment. (b) Guided Image Filtering. (c) L₀ smoothing. (d) WLS. (e) TV. (f) EADTV. (g) Truncated TV. (h) AODTV.

In Fig. 5(a), reconstructing the projection data sampled at intervals of six angles within the range from 0 to π produces noticeable strip artifacts. The partial enlargement in Fig. 5(c) shows that L₀ smoothing smooths away both the strip artifacts and the image details, so part of the image information is lost. In Fig. 5(d) and (g), the image edges are blurred after denoising by WLS and Truncated TV. Although the edges in Fig. 5(b) are distinct, the artifact removal is not entirely satisfactory. In Fig. 5(e) and (f), TV and EADTV remove the artifacts well, but the details are not clear. In Fig. 5(h), the AODTV method effectively removes the artifacts while maintaining the texture structure of the image. The optimal parameter values for each algorithm are presented in Table 5, and the quantitative results are given in Table 6. The AODTV method achieves the best value of every evaluation index: the lowest MSE and GMSD and the highest PSNR and MSSIM.

Table 5

The best parameter values of different methods in the first real experiment

Guided Image Filtering	r = 2 ɛ = 300 s = 1
L₀ smoothing	λ = 0.001 κ = 2
WLS	λ = 0.01 α = 3
TV	λ = 7
EADTV	λ = 6 α = 2 varargin = 500
Truncated TV	ɛ = 10 α = 5 λ = 3
AODTV	ksize1 = 3 ksize2 = 1 t₁ = 0.3 t₂ = 0.7 λ₁ = 3 λ₂ = 0.012 α₁ = 2 α₂ = 20

Table 6

Evaluation results of different denoising methods in Fig. 5(a)

Method	MSE	PSNR(dB)	MSSIM	GMSD
FBP	99.4789	28.1535	0.9120	0.0884
Guided Image Filtering	84.0609	28.8849	0.9322	0.0343
L₀ smoothing	30.4029	33.3016	0.9497	0.0742
WLS	89.6037	28.6075	0.9261	0.0525
TV	83.2311	28.9279	0.9375	0.0381
EADTV	82.9772	28.9412	0.9375	0.0371
Truncated TV	84.8475	28.8444	0.9353	0.0356
AODTV	22.8559	34.5408	0.9604	0.0334

In the second real experiment, the reconstruction contains more strip artifacts because the projection data were sampled at a larger angular interval, as shown in Fig. 6(a). The differences between the denoising algorithms are also more visually apparent. L₀ smoothing in Fig. 6(c) produces a highly blurred texture with significant loss of detail. WLS in Fig. 6(d) shows the poorest ability to denoise and to preserve image edges. The strip artifacts are relatively well removed in Fig. 6(e) and (f), but the edges appear very blurred after TV and EADTV denoising. The effects of Guided Image Filtering and Truncated TV in Fig. 6(b) and (g) are similar. Figure 6(h) shows that the AODTV method balances the preservation of edge details against the removal of strip artifacts and yields the best result. The optimal parameter values for each algorithm are presented in Table 7. The quantitative results in Table 8 show that the AODTV method outperforms the other denoising algorithms on all four evaluation indices.

Table 7

The best parameter values of different methods in the second real experiment

Guided Image Filtering	r = 4 ɛ = 800 s = 1
L₀ smoothing	λ = 0.001 κ = 2
WLS	λ = 0.001 α = 8
TV	λ = 20
EADTV	λ = 20 α = 2 varargin = 800
Truncated TV	ɛ = 10 α = 10 λ = 25
AODTV	ksize1 = 3 ksize2 = 1 t₁ = 0.5 t₂ = 0.5 λ₁ = 6 λ₂ = 0.05 α₁ = 2 α₂ = 20

Table 8

Evaluation results of different denoising methods in Fig. 6(a)

Method	MSE	PSNR(dB)	MSSIM	GMSD
FBP	400.4584	22.1052	0.7088	0.1642
Guided Image Filtering	347.7650	22.7179	0.7861	0.0760
L₀ smoothing	141.1049	26.6354	0.8432	0.1112
WLS	352.1226	22.6639	0.7819	0.0952
TV	345.1679	22.7505	0.7946	0.0828
EADTV	343.3642	22.7733	0.7981	0.0798
Truncated TV	349.0687	22.7017	0.7865	0.0799
AODTV	73.5765	29.4634	0.8703	0.0732
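
The PSNR improvements quoted in the abstract for the two real experiments can be recomputed from Tables 6 and 8. The sketch below is only a worked check of that arithmetic. It takes the PSNR values from the tables and compares AODTV with the six comparison denoising methods; FBP is left out because it is the unprocessed reconstruction rather than a denoising algorithm.

```python
# PSNR (dB) of the comparison denoising methods, copied from Tables 6 and 8.
table6_psnr = {
    "Guided Image Filtering": 28.8849, "L0 smoothing": 33.3016, "WLS": 28.6075,
    "TV": 28.9279, "EADTV": 28.9412, "Truncated TV": 28.8444, "AODTV": 34.5408,
}
table8_psnr = {
    "Guided Image Filtering": 22.7179, "L0 smoothing": 26.6354, "WLS": 22.6639,
    "TV": 22.7505, "EADTV": 22.7733, "Truncated TV": 22.7017, "AODTV": 29.4634,
}

def gain_range(psnr_by_method, proposed="AODTV"):
    """Return (smallest, largest) PSNR gain of `proposed` over the other methods."""
    reference = psnr_by_method[proposed]
    gains = [reference - value
             for method, value in psnr_by_method.items() if method != proposed]
    return min(gains), max(gains)

for name, table in (("Table 6", table6_psnr), ("Table 8", table8_psnr)):
    low, high = gain_range(table)
    print(f"{name}: AODTV gain between {low:.4f} and {high:.4f} dB")
```

In both experiments the smallest gain is over L₀ smoothing and the largest is over WLS.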

5 Parameter sensitivity analysis

In this section, the analysis is based on the first real experiment. We present the effect of the kernel size ksize2 of the second kernel regression, α₂ in Equation (26), λ₁ and λ₂ in Equations (23) and (26), and the weight parameters t₁ and t₂ in Equation (20). These parameters were chosen because a large number of experiments showed that they have a large impact on the results, whereas adjusting the remaining parameters ksize1 and α₁ has little impact. The analysis results are shown in Fig. 7. To determine the optimal value of a given parameter, that parameter is varied while the other parameters are kept fixed. Three well-known image quality metrics, PSNR, MSE and MSSIM, are used to analyze the influence of the parameters. Figure 7(a) shows that the PSNR value is essentially flat and then trends downward once α₂ exceeds 20, while the MSE changes from a downward trend to a gradual increase at the same point, which indicates that the optimal value of the major axis α₂ of the ellipse is 20. In Fig. 7(b), when ksize2 is 1, the PSNR reaches its maximum and the MSE reaches its minimum, so the optimal value of ksize2 is 1. Figure 7(c) shows that when λ₁ is larger than 3, the MSE value decreases and the PSNR value increases gradually, so the optimal value of λ₁ is 3. In Fig. 7(d), from λ₂ = 0.012 onward the MSE curve remains flat over a certain range and then starts to increase, whereas the PSNR curve remains flat over that range and then gradually decreases. These findings suggest that the optimal value of λ₂ is 0.012. Figures 7(e) and 7(f) show that when the weight parameters t₁ and t₂ are 0.3 and 0.7, respectively, the PSNR is maximal and the MSE is minimal, and the MSSIM also reaches its maximum, indicating that the optimal values of t₁ and t₂ are 0.3 and 0.7. The optimal values of the parameters are listed in Table 5.

Fig. 7

The effect of parameters. (a) The corresponding MSE and PSNR values for different α₂. (b) The corresponding MSE and PSNR values for different ksize2. (c) The corresponding MSE and PSNR values for different λ₁. (d) The corresponding MSE and PSNR values for different λ₂. (e) The corresponding PSNR and RMSE values for different weight parameters. (f) The corresponding MSSIM values for different weight parameters.

The parameter sensitivity analysis shows that perturbing the parameters α₂, λ₁ and λ₂ has a significant impact on the experimental results, whereas adjusting ksize2 within a certain range changes the evaluation indices only slightly. Therefore, for trabecular bone, we give ranges of these parameters that stably produce reasonable CT images. The ranges of α₂, λ₁ and λ₂ are [15–21], [2–4], and [0.011–0.013], respectively. For the weight parameters t₁ and t₂, t₂ should be given the larger proportion, but it should not equal 1.
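
The one-parameter-at-a-time procedure used in this analysis can be sketched as a small sweep routine. The code below is an illustration only: `sweep_parameter`, `toy_denoise` and its `strength` and `passes` arguments are hypothetical names, and `toy_denoise` is a simple neighbour-averaging stand-in, not the AODTV method. A peak value of 255 is assumed for the PSNR.

```python
import numpy as np

def mse(reference, image):
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(image, dtype=np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference, image, peak=255.0):
    err = mse(reference, image)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))

def sweep_parameter(denoise, noisy, reference, base_params, name, candidates):
    """Vary one parameter, keep the others at base_params, score every setting."""
    results = []
    for value in candidates:
        params = {**base_params, name: value}  # copy, so base_params is untouched
        restored = denoise(noisy, **params)
        results.append((value, mse(reference, restored), psnr(reference, restored)))
    best = max(results, key=lambda row: row[2])  # highest PSNR
    return results, best

def toy_denoise(image, strength=0.5, passes=1):
    """Hypothetical stand-in denoiser: blend each pixel with its 4-neighbour mean."""
    out = np.asarray(image, dtype=np.float64)
    for _ in range(passes):
        p = np.pad(out, 1, mode="edge")
        neighbours = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
        out = (1.0 - strength) * out + strength * neighbours
    return out

rng = np.random.default_rng(0)
reference = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))    # synthetic ramp image
noisy = reference + rng.normal(0.0, 10.0, reference.shape)   # synthetic Gaussian noise
base = {"strength": 0.5, "passes": 1}

results, best = sweep_parameter(toy_denoise, noisy, reference, base,
                                "strength", [0.0, 0.25, 0.5, 0.75, 1.0])
for value, err, score in results:
    print(f"strength={value:.2f}  MSE={err:.3f}  PSNR={score:.3f} dB")
print("best strength:", best[0])
```

Repeating the call with a different `name` while leaving `base` unchanged gives one curve per parameter, which is the form of the panels in Fig. 7. A sweep of this kind does not capture interactions between parameters, which is one reason the recommended ranges above are stated per parameter.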

6 Discussion and conclusion

In this work, we propose a new denoising method that achieves a good denoising effect while preserving the structure and details of CT images. Although the experimental results are good, the method has limitations. It involves many parameters, and tuning so many parameters is not convenient for clinical use. Based on the sensitivity analysis, the parameters can be chosen within the appropriate ranges to obtain a good denoising effect. Traditional denoising methods lose the details of the CT image when they remove the noise. During denoising, the proposed method estimates the edge direction of the image more accurately and denoises along the orthogonal directions, which better retains the details of the CT image while removing strip artifacts and noise. AI models offer advantages in performance, which makes them popular for low-dose and sparse-sampling CT imaging. Various deep learning-based image denoising methods exist, including convolutional neural network denoising methods such as N2N [28] and FFDNET [29], residual network denoising methods such as FC-AIDE [30], and generative adversarial network denoising methods such as GCDN [31]. These deep learning methods can quickly and autonomously extract shallow pixel-level features and deep semantic-level features, and they show strong representation learning ability and effective denoising. Current research in deep learning and machine learning explores combining diffusion models with regularization models to enhance model performance and generalization [32, 33]. Combining the proposed method with deep learning is a problem that we intend to study further.

In this study, kernel regression is introduced into EADTV to obtain a higher-quality reference image, and the CT image is denoised along the orthogonal directions. The results show that the AODTV method outperforms the other denoising algorithms in the reported experiments. The method focuses on eliminating strip artifacts while retaining the fine details of the image, which it achieves by balancing the penalties applied in the edge direction and the normal direction of the reconstructed image. However, determining the precise proportion of the penalty in the orthogonal direction requires additional research.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants Nos. 82371960, 82102037, and 82071922 and Tianjin Municipal Education Commission (2020KJ208).

References

1. Deng, Cahill L.W., Adaptive Gaussian filter for noise reduction and edge detection, IEEE Nuclear Science Symposium & Medical Imaging Conference 3 (1993), 1615–1619.

2. Jin, Fieguth, Winger, et al., Adaptive Wiener filtering of noisy images and image sequences, IEEE International Conference on Image Processing 3 (2003), 349–352.

3. Tomasi, Manduchi, Bilateral filtering for gray and color images, in: Proceedings of the IEEE International Conference on Computer Vision, IEEE, pp. 839–846.

4. Yan, Lu W.S., New algorithms for sparse representation of discrete signals based on ℓp-ℓ2 optimization, in: IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing – Proceedings, pp. 73–78.

5. Bell J.B., Tikhonov A.N., Arsenin V.Y., Solutions of Ill-Posed Problems, Mathematics of Computation 32 (1978), 1320.

6. Nikolova, Local Strong Homogeneity of a Regularized Estimator, SIAM Journal on Applied Mathematics 61 (2000), 633–658.

7. Weickert, Anisotropic diffusion in image processing, Theoretical Foundations of Computer Vision, 1996, p. 165.

8. Chambolle, Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of Mathematical Imaging and Vision 40 (2011), 120–145.

9. Rudin L.I., Osher, Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: Nonlinear Phenomena 60 (1992), 259–268.

10. Bayram, Kamasak M.E., Directional Total Variation, IEEE Signal Processing Letters 19 (2012), 781–784.

11. Zhang, Wang, Edge adaptive directional total variation, The Journal of Engineering 2013, pp. 61–62.

12. Buades, Coll, Morel J.M., A Review of Image Denoising Algorithms, with a New One, Multiscale Modeling and Simulation 4 (2005), 490–530.

13. Mallat S.G., A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989), 674–693.

14. Vaiyapuri, Alaskar, Sbai, Devi, GA-based multi-objective optimization technique for medical image denoising in wavelet domain, Journal of Intelligent and Fuzzy Systems 41 (2021), 1575–1588.

15. Van Fleet P.J., Wavelet shrinkage: an application to denoising, in: Discrete Wavelet Transformations, Wiley, pp. 231–260.

16. Dabov, Foi, Katkovnik, et al., Image denoising with block-matching and 3D filtering, in: Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning 6064 (2006), 606414.

17. Wang, et al., Improved deep convolutional dictionary learning with no noise parameter for low-dose CT image processing, Journal of X-Ray Science and Technology 31 (2023), 593–609.

18. Takeda, Farsiu, Milanfar, Kernel Regression for Image Processing and Reconstruction, IEEE Transactions on Image Processing 16 (2007), 349–366.

19. Bowman A.W., Wand M.P., Jones M.C., Kernel Smoothing, Biometrics 54 (1998), 393.

20. Herman G.T., Naparstek, Fast image reconstruction based on a Radon inversion formula appropriate for rapidly collected data, SIAM Journal on Applied Mathematics 33 (1977), 511–533.

21. Horn B.K.P., Fan-beam reconstruction methods, Proceedings of the IEEE 67 (1979), 1616–1623.

22. He, Sun, Tang, Guided image filtering, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (2013), 1397–1409.

23. Xu, Lu, Xu, et al., Image smoothing via L0 gradient minimization, ACM Transactions on Graphics 30 (2011), 1–12.

24. Farbman, Fattal, Lischinski, et al., Edge-preserving decompositions for multi-scale tone and detail manipulation, ACM Transactions on Graphics 27 (2008), 1–10.

25. Dou, Song, Gao, et al., Image smoothing via truncated total variation, IEEE Access 5 (2017), 27337–27344.

26. Wang, Bovik A.C., Sheikh H.R., et al., Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13 (2004), 600–612.

27. Xue, Zhang, Mou, et al., Gradient magnitude similarity deviation: A highly efficient perceptual image quality index, IEEE Transactions on Image Processing 23 (2014), 684–695.

28. Moran, Schmidt, Zhong, et al., Noisier2noise: Learning to denoise from unpaired noisy data, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, 12064–12072.

29. Zhang, Zuo, Zhang, FFDNet: Toward a fast and flexible solution for CNN-based image denoising, IEEE Transactions on Image Processing 27(9) (2018), 4608–4622.

30. Cha, Moon, Fully convolutional pixel adaptive image denoiser, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, 4160–4169.

31. Valsesia, Fracastoro, Magli, Deep graph-convolutional image denoising, IEEE Transactions on Image Processing 29 (2020), 8226–8237.

32. Sun, Peng, Zhang, et al., Dynamic PET image denoising using deep image prior combined with regularization by denoising, IEEE Access 9 (2021), 52378–52392.

33. Wang, Huang, Liu, Variational-based mixed noise removal with CNN deep learning regularization, IEEE Transactions on Image Processing 29 (2019), 1246–1258.