Abstract
BACKGROUND AND OBJECTIVE:
Optimization based image reconstruction algorithm is an advanced algorithm in medical imaging. However, the corresponding solving algorithm is challenging because the model is usually large-scale and non-smooth. This work aims to devise a simple and convergent solver for optimization model.
METHODS:
The alternating direction method of multipliers (ADMM) algorithm is a simple and effective solver of the optimization model. However, there always exists a sub-problem that has not close-form solution. One may use gradient descent algorithm to solve this sub-problem, but the step-size selection via line search is time-consuming. Or, one may use fast Fourier transform (FFT) to get a close-form solution if the sparse transform matrix is of special structure. In this work, we propose a fully linearized ADMM (FL-ADMM) algorithm that avoids line search to determine step-size and applies to sparse transform of any structure.
RESULTS:
We derive the FL-ADMM algorithm instances for three total variation (TV) models in 2D computed tomography (CT). Further, we validate and evaluate one FL-ADMM algorithm and explore how two important factors impact convergence rate. These studies show that the FL-ADMM algorithm may accurately solve the optimization model.
CONCLUSION:
The FL-ADMM algorithm is a simple, effective, convergent and universal solver of optimization model in image reconstruction. Compared to the standard ADMM algorithm, the new algorithm does not need time-consuming step-size line-search or special demand to sparse transform. It is a rapid prototyping tool for optimization based image reconstruction.
Introduction
Image reconstruction is the core technique of medical imaging. There are three types of reconstruction algorithms: analytical algorithm [1], optimization based iterative algorithm [2, 3] and deep learning/machine learning based algorithm [4]. Among them, the optimization based algorithm has some advantages relative to the two other algorithms. Compared to the analytical algorithm, the optimization based algorithm may incorporating prior information into the optimization model and thus may achieve more accurate reconstruction from incomplete data and/or noisy data. Compared to the deep learning based algorithm, the optimization based algorithm is more stable and is not data dependent [5]. Therefore, the optimization based algorithm is always a hot spot in image reconstruction field.
The imaging system model of the optimization based algorithm is a linear system of equations. Usually, it is large-scale, ill-posed and often underdetermined. So, it is impossible to solve the linear system by direct inversion. We may continue to construct an optimization model to solve the ill-posed and underdetermined problems by incorporating prior information like sparse prior [6] or low-rank prior [7], etc.
In the optimization models for image reconstruction, total variation (TV) type models have been widely used for they may not only suppress streak artifacts but also suppress noise [8]. In 2006, Sidky et al. proposed a data divergence constrained, TV (DDcTV) minimization model for 2D CT [9]. In 2008, Sidky et al. proposed a DDcTV model for 3D CT and achieved accurate reconstruction [2]. From then on, a variety of TV-type models were proposed in image reconstruction, for example adaptively weighted TV (awTV) [10], anisotropic TV (aTV) [11], high order TV (HOTV) [12, 13], non-local TV (NLTV) [14], nuclear TV (nTV) [15], directional TV (dTV) [16], etc. These TV-type models may achieve more accurate reconstructions in their suitable cases. However, the solving problems of these TV-type models are challenging because these models are large-scale and non-smooth. They cannot be solved by simple gradient decent algorithm.
For TV-type models, there are mainly three solving algorithms: adaptive steepest descent-projection onto convex sets (ASD-POCS) algorithm [2, 9], Chambolle-Pock (CP) algorithm [17, 18], and alternating direction method of multipliers (ADMM) algorithm [19–26]. We have systematically studied ASD-POCS and CP algorithms and applied them to solve TV models for CT and electron paramagnetic resonance imaging (EPRI) [27–30]. ASD-POCS algorithm is devised according to the physical meaning and the optimization strategy, whereas it is not derived mathematically. Thus, those algorithm parameters in this algorithm should be carefully, empirically chosen to achieve convergence. CP algorithm is obtained by mathematical derivation and may guarantee convergence, but the derivation is not so easy for those who are not familiar with optimization, for example it needs the calculation of convex conjugate [18]. ADMM is another algorithm framework for solving TV-types models. The derivation of algorithm instance according to the algorithm framework is easier than CP algorithm for the core step is only to split the whole optimization problem into several simple sub-optimization problems by alternating direction technique. However, for TV-types models in image reconstruction, there is always a sub-problem that has not close-form solution in ADMM algorithm. One method is to solve this sub-problem by gradient descent algorithm [26]. However, on the one hand, the iteration number must be large enough to achieve accurate solution of this sub-problem, on the other hand, the step-size should be selected by one dimensional line search technique, for example nonmonotone line search [26], leading to long-time inner-iteration. Another method is to use fast Fourier transform (FFT) algorithm to get a close-form solution [20]. However, it demands that the system matrix and the sparse transform matrix have special structure, for example the system matrix for image deblurring, image denoising, etc, so that they are be regarded as convolution operations. For CT and EPRI, this method is not suitable. This is why we never use ADMM algorithm in TV-based reconstructions in CT and EPRI for a long time.
We noticed that some researchers began to use the linearization technique to linearize the quadratic term in the difficult sub-problem in ADMM so as to the FFT algorithm may be used to construct a close-form solution [24]. Additionally, linearized ADMM (L-ADMM) has been proposed to improve the ADMM algorithm [31–37]. However, we found that these L-ADMM algorithms are not thorough, i.e. people only linearized one quadratic term in the difficult sub-problem. Why do not people linearize all the quadratic terms in the difficult sub-problem? We think it should be deeply investigated.
In this work, we proposed a fully linearized ADMM (FL-ADMM) for image reconstruction to simplify the ADMM algorithm by avoiding the search of optimal step-size in gradient descent algorithm and the use of FFT algorithm. The proposed FL-ADMM algorithm framework may be used to derive simple, effective, and convergent algorithm instances for a variety of optimization models. To show the potential of the FL-ADMM for prototyping of the optimization models, we derive the FL-ADMM algorithm instances for unconstrained TV (ucTV) minimization model, data divergence constrained, TV (DDcTV) minimization model, and TV constrained, data divergence minimization (TVcDM) model for two dimensional (2D) computed tomography (CT). Also, we validate and evaluate the DDcTV-FL-ADMM algorithm for 2D CT to illuminate that the FL-ADMM algorithm is actually an accurate solver of the DDcTV model. Finally, we explore how the penalty parameter and other algorithm parameters of DDcTV-FL-ADMM impact the convergence rate.
In Section 2, we give the derivation of the FL-ADMM algorithm instances for three TV models. In Section 3, we perform reconstruction experiments via the proposed DDcTV-FL-ADMM algorithm. We give discussion and draw conclusions briefly in Section 4.
Methods
Preliminary knowledge
In this section, we give some basic optimization knowledge, which will be used in the following parts.
Shrinkage algorithm
1D Shrinkage algorithm or operation may solve the optimization problem shown in Equation (1).
Here, x is a vector of size N, ∥ · ∥ 1 is the ℓ1 norm of a vector, and ∥ · ∥ 2 is the ℓ2 norm of a vector.
Suppose that x
i
indicates the ith. element of this vector. Then the solution of this optimization problem is
Here,
2D Shrinkage algorithm or operation may solve the optimization problem shown in Equation (3).
Here, x is a 2D-vector-valued vector of size N, x
i
indicates the ith element of this vector,
The solution of this optimization problem is
Here,
In optimization theory, projection onto convex sets (POCS) algorithm/operation may solve a special optimization problem as follows.
Here, C is a convex set. x and a are both vectors of size N.
Then, the solution of this optimization problem is
Here,
If the convex set is a ℓ2 norm ball of radius, r, then the POCS operation of a to this ball is
Here, L2Ball (r) is the ℓ2 norm ball of radius r. This POCS operation means that, if the point a is in the ball, then the projection value is a, whereas, if the point is outside the ball, then projection value is the intersection point of this ℓ2 norm circle and the vector a. (Note that any point may also be regarded as a vector.)
If the vector a is of 2D form, then the POCS schematic diagram is shown in Fig. 1.

Projection onto ℓ2 norm ball. (a) is for the case that a is in the ball, and (b) is for the case that a is outside the ball.
The POCS operation of projection onto ℓ1 norm ball is more complicated than that of projection onto ℓ2 norm ball. Its schematic diagram for 2D case is shown in Fig. 2.

Projection onto ℓ1 norm ball. (a) is for the case that a is in the ball, and (b) is for the case that a is outside the ball.
From Fig. 2, we may see that, if a is in the ℓ1 norm ball, the projection point is still itself, whereas, if it is outside the ball, the projection point should be the foot of a perpendicular from a to the ℓ1 norm circle.
There exists accurate algorithm for
Pseudo-codes for
The indicator function may be defined as
Thus, a constrained optimization model may be transformed into the unconstrained optimization form. For example, Equation (5) may be written in the unconstrained form as follows.
In the following sections, we will often use this transformation. Note that, it is just a form-transformation and the optimization meaning of the two forms are completely the same.
In fact, any quadratic function may be linearized at a point according to the Taylor expansion.
Here, x0 is any point (vector), s cannot be selected arbitrarily and usually we may set its value according to [33]
Here, λ max means the largest eigenvalue of a matrix.
ADMM algorithm framework
We consider the structured, constrained convex optimization problem
Where, x and y are multi-dimensional vectors, A and B are matrices indicating linear transform, θ1 (x) and θ2 (y) are convex but not necessary smooth functions.
The corresponding augmented Lagrange function is
The ADMM algorithm framework is
The sub-problem (14.1) may be further written as
The sub-problem (14.2) may be further written as
Clearly, the ADMM algorithm may divide the original problem into three simple sub-problems. By this splitting technique, the two convex functions are split and the whole solving problem is potentially simplified.
For the sub-problems (15) and (16) are similar, we will just discuss the linearization technique for (15).
Often, (15) is still hard to solve for usually there is not a simple, close-form solution, for example in case that θ1 (x) is also a quadratic function. Thus, people proposed the L-ADMM algorithm, whose difference from the ADMM algorithm is only linearization to the quadratic function in (15).
According to the linearization method shown in Section 2.1.4, the linearized form of (15) is
Where, s ≥ β ∥ A T A ∥ 2 = βλ max (A T A) to guarantee convergence. (17) is potentially easier to get a close-form solution.
If θ1 (x) is a simple quadratic function, i.e. the corresponding matrix is an identity matrix or is of special structure, for example it is equivalent to a convolution operation, then (17) may get a close-form solution by use of the simple optimality condition and the FFT technique. However, the matrix corresponding to θ1 (x) usually is not of special structure in CT and EPRI, so (17) is still difficult to get a close-form solution.
The difference of FL-ADMM from L-ADMM is only that it linearizes any quadratic functions. By use of this fully linearization technique, (17) may get a close-form solution.
If
In this section, we will derive three FL-ADMM algorithm instances corresponding to three TV models.
TV models in image reconstruction have been heavily investigated and have achieved accurate reconstructions via sparse-view projections and/or noisy projections. There are three types of TV models: ucTV, DDcTV and TVcDM models. The unconstrained version is often solved by ADMM algorithm, whereas, DDcTV and TVcDM are often solved by ASD-POCS and CP algorithm. Though the derivation below, we will see that FL-ADMM algorithm may solve any type of TV models and each sub-problem is easy to compute because of its close form.
Without loss of generality, we derive the algorithm instances for 2D CT.
FL-ADMM algorithm for ucTV model
If the 2D image is of size [n x , n y ], then N = n x × n y . If the projection data (sinogram) is of size [n p , n a ], i.e. there are n a projections and each projection has n p measurements, then M = n P × n a .
The ucTV model may be formulated as
Here, D is a matrix of size 2N × N, indicating the gradient transform, and is of this form
Here, D
x
and D
y
are both matrices of size N × N, indicating the x and y direction gradient transform, respectively, shown as
Here, n x and n y are the row and column number of the image, respectively. And, x and y are the row and column index of the image, respectively.
Thus, the gradient magnitude transform ∥Du ∥ 2 may be formulated as
And, the TV norm is
Now, we have the ucTV model,
The FL-ADMM algorithm instance derivation process is as follows.
Let y = Du, i.e. Du - y = 0, then (27) is equivalent to this minimization problem,
The corresponding augmented Lagrange function is
According to the FL-ADMM algorithm framework, we may derive the algorithm instance.
We perform linearization to each quadratic function in (30), and get
Let the gradient of the objective function in (31) be 0, we get
Here, s1 ≥ λ
max
(A
T
A) and s2 ≥ βλ
max
(D
T
D). Clearly, (32) is of close form and only involve simple matrix-vector multiplication.
According to the 2D shrinkage algorithm in (4), we get
Note that y and λ are vectors of size 2N. But, if we regard them as vector-valued vector, then they are both vectors of size N. Further, if we consider their 2D form, they are both vector-valued matrices of size [n
x
, n
y
]. So, each element of y and λ is yi,j and λi,j, respectively. But, we should note that they are both a 2D vector, so (34) is a 2D shrinkage algorithm.
Now, we get the ucTV-FL-ADMM algorithm.
Pseudocode for K steps of the ucTV-FL-ADMM algorithm.
Algorithm 2-A is a simple but effective algorithm. It does not need step-size search via one dimensional search technique. Also, it does not need to use FFT algorithm. The main operations in this algorithm are just simple matrix-vector multiplication and shrinkage algorithm. When we implement this algorithm, the matrices do not need to be of explicit form, i.e. we must not store them in the computer memory. We may regard them as some specific operations, for example, Au means projection operation, whereas A T g means backprojection operation.
However, we find that this algorithm converges very slowly, so we propose an improved fast algorithm with inner iterations as follows.
Pseudocode for K steps of the ucTV-FL-ADMM algorithm with inner iterations.
Here, ‘n_inner_iteration’ means the inner iteration number. By adding this inner iterations technique, the algorithm will converge much faster. The reason will be discussed in the Section 4.
The data divergence constrained, TV (DDcTV) minimization model may be formulated as
This model has superior performance than the ucTV model since its model parameter ɛ has clear physical meaning that it embodies the noise level and level of system-inconsistence.
According to (21) and (8), (36) may be written as
Let y = Du, and Au - g = z, then (37) is equivalent to this minimization problem,
The corresponding augmented Lagrange function is
According to the FL-ADMM algorithm framework, we may derive the algorithm instance.
By linearizing the two quadratic functions in (40), letting the gradient of the objective function be 0, we may get
Here, s1 ≥ β1λ
max
(D
T
D) and s2 ≥ β2λ
max
(A
T
A). Cleary, (41) is of a simple close form and thus is easy to implement.
According to the 2D shrinkage algorithm in (4), we get
Now, we get the DDcTV-FL-ADMM algorithm.
Pseudocode for K steps of the DDcTV-FL-ADMM algorithm.
Also, this standard FL-ADMM algorithm is slow to convergence, so we propose the fast, improved FL-ADMM algorithm with inner iterations.
Pseudocode for K steps of the DDcTV-FL-ADMM algorithm with inner iterations.
By adding this inner iterations technique, the algorithm will converge much faster. The reason will be discussed in the Section 4.
The TV constrained, data divergence minimization (TVcDM) algorithm is another constrained TV model and has been widely used in CT and EPRI. They are often solved by the CP algorithm. Next, we will see that it may also be solved by FL-ADMM and will see that its derivation is easier than that of CP algorithm for it does not need to calculate the convex conjugate functions.
The TVcDM model may be formulated as
By use of the indicator function and the definition of the TV norm, (47) is equivalent to
Let y = Du, then (48) becomes
Then, the corresponding augmented Lagrange function is
According to the FL-ADMM algorithm framework, we may derive the corresponding algorithm instance.
It is the same as (30). By full linearization, we may get its close-form solution.
Here, s1 ≥ λ
max
(A
T
A) and s2 ≥ βλ
max
(D
T
D).
Then
Now, we get the TVcDM-FL-ADMM algorithm.
Pseudocode for K steps of the TVcDM-FL-ADMM algorithm.
Also, we may speed up this algorithm by use of inner iteration technique.
Pseudocode for K steps of the TVcDM-FL-ADMM algorithm with inner iteration
By adding this inner iterations technique, this algorithm will converge much faster. The reason will be discussed in Section 4.
The aim of this work is to design a FL-ADMM algorithm framework and to give prototyping method for optimization models in image reconstruction via the FL-ADMM algorithm framework. For optimization based image reconstruction, we know that optimization model decides the final solution, whereas the solving algorithm just impact the convergence rate and path. Since the FL-ADMM algorithm is a solving algorithm, we should validate if it may solve optimization model accurately, and evaluate what may impact its convergence rate. We have derived three FL-ADMM algorithms for ucTV, DDcTV and TVcDM models. Next, we will just validate and evaluate this solving algorithm for DDcTV model in 2D CT.
We design 4 studies: (1) Inverse crime of DDcTV-FL-ADMM algorithm; (2) The impact of β selection on convergence rate; (3) The impact of inner iteration number on convergence rate; and (4) Comparison with the CP algorithm.
Inverse crime of DDcTV-FL-ADMM algorithm
Inverse crime is a tool to validate the correctness of an inverse problem [39]. If the projection data is complete and exact, then the sign of inverse crime is that the reconstructed image is almost the same with the truth image, i.e. there is not any error except for the computer floating point error.
The imaging configurations are as follows. The phantom is the Shepp-Logan phantom of size [256, 256]. The imaging coordinate system is located at [128, 128]. The projection signal at each angle is of size 256 and its coordinate range is [–127, 128]. The virtual detector bin length is 1. The pixel size is also 1. The parallel beam scanning pattern is adopted. The projections are simulated by use of pixel-driven method [40]. We collect 256 projections uniformly distributed in the range of [0, π].
Thus, the projection data is of size [256, 256] and the image is also of size [256, 256]. For this discrete-to-discrete imaging linear system, the number of unknowns and the number of equations are the same. For the linear system is absolutely consistent, the solution of the corresponding DDcTV model should be accurate enough to reappear the phantom, i.e. the RMSE of the reconstructed image compared with the truth image should be small enough. If the inverse crime may be achieved, then we may think that the imaging system modelling, the optimization problem modelling, the FL-ADMM algorithm derivation and its computer implementation are all correct.
For this reconstruction case, we use Algorithm 3-B (it is used in all reconstructions in Section 3), in which, ɛ = 0, β1 = β2 = 1, and the inner iteration number is 50. The gray image has only 256 gray-level. So, if RMSE is less than 1/256 ≈ 3.9 × 10-3, the reconstructed image and the truth image will be visually the same, which may be the sign of inverse crime. More strictly, we define the sign of inverse crime as that RMSE ≤10-4.
At iteration 4570, inverse crime is achieved. The reconstructed images and the corresponding profiles are shown in Fig. 3. Three iteration trends are plotted in Fig. 4. From Fig. 3, we may see that the reconstructed image is visually the same with the truth image, and that the vertical-center-line profiles of the reconstructed image and the truth image are completely coincident. This shows that inverse crime is achieved. Now, we may think that the imaging system model, the DDcTV model, the FL-ADMM algorithm and its computer implementation are all correct. In Fig. 4, we plots three iteration trends to observe the iteration behavior. It may be seen that the RMSE of the reconstructed image and the data error between the guessed data and the truth data may go down and down with the increase of iteration number. At iteration 4570, the RMSE has been less than 10-4, which means the inverse crime is achieved. We may see that these curves still has a descent trend, showing its good convergence performance. From Fig. 4(c), we may see that during the later iteration period, the TV value may go down and down with the iteration marching. When the iteration stops, the TV value of the reconstructed image becomes the same with the truth TV.

The reconstructed image and profile of the DDcTV-FL-ADMM algorithm in the context of inverse crime. (a) is the truth image, (b) is the reconstructed image, and (c) plots the vertical-center-line profiles of (a) and (b).

The iteration trends of the DDcTV-FL-ADMM algorithm in the context of inverse crime. (a) is for the RMSE of the reconstructed image. (b) is for the data error, i.e. ∥Au - g2∥. (c) is for the TV value of the reconstructed image.
In this section, we evaluate how the penalty parameter, β, impact convergence rate. Here, β = β1 = β2.
The simulation phantom is the FORBILD phantom [41] of size [256, 256]. The projection data is of size [256, 100]. The scanning pattern is of parallel beam form. The 100 projections are uniformly distributed in the range of [0, π]. In the DDcTV-FL-ADMM reconstructions, we vary β from 0.01, 0.1, 1, 10, to 100, fix the inner iteration number as 50, and fix the iteration number as 5000 to evaluate their respective convergence rate.
The reconstructed images of different β are shown in Fig. 5, and the convergence curves of different β are shown in Fig. 6. All the reconstructions of different β are stopped at iteration 5000. Thus, more accurate images means faster convergence. From Fig. 5, we may see that β values of 0.1, 1 and 10 may achieve more accurate reconstructions, whereas β values of 0.01 and 100 lead to reconstructions of a certain level of artifacts. From the image at row 2 and column 5, we may see obvious artifacts. This means that the FL-ADMM algorithm of β value of 100 suffer from the slowest convergence rate. This observation may be more clearly seen in Fig. 6. The order of convergence rate of different β from slow to fast is 100, 0.01, 0.1, 1, and then 10. This indicates that an appropriate β value may achieve fast convergence, whereas too large or too small values always lead to too slow convergence. Even that too huge value may lead to a wrong convergence, i.e. the RMSE will converge to a very large value. However, we want to emphasize that the optimal β value is imaging-condition dependent. In this simulation study, the optimal value is 10. But, in other cases, it may be other value. Usually, one may vary the β value with interval of an order of magnitudes, then select the optimal one to achieve the fastest convergence.

The reconstructed images of the DDcTV-FL-ADMM algorithm with different β. The number above the images indicate the β value. The images on the second row is the zoomed-in images in the red box shown in the first row. The display window is [0,1].

Plots of RMSE as function of iteration number. The iteration number is 5000. The two axis are both of logarithm form.
In Algorithm 3-B, i.e. the DDcTV-FL-ADMM algorithm, there is an inner loop process. In this section, we investigate how the inner iteration number impact the convergence rate.
The phantom is still the FORBILD phantom of size [256, 256]. The imaging condition is the same with that of Section 3.2. In the DDcTV-FL-ADMM reconstructions, we vary the inner iteration number from 1, 50, 100, 150, to 200 and fix the iteration number as 5000, fix β as 10, to evaluate their respective convergence rate. Figure 7 shows the reconstructed images and Fig. 8 plots the convergence curves.

The reconstructed images of the DDcTV-FL-ADMM algorithm with different inner iteration number. The number above the images indicate the inner iteration number. The images on the second row is the zoomed-in images in the red box shown in the first row. The display window is [0,1].

Plots of RMSE as function of iteration number. The iteration number is 5000. The two axis are both of logarithm form.
From Fig. 7, we may see that the reconstructed images with inner iteration number of 50, 100,150 and 200 all have higher accuracy. However, if the inner iteration number is 1, i.e. if the FL-ADMM algorithm used is Algorithm 3-A, the reconstructed image suffers from serious artifacts. This means that appropriately selected inner iteration number may achieve faster convergence. The standard FL-ADMM algorithm without inner iteration always leads to too slow convergence rate. This observation may be clearly seen in Fig. 8. Though use of larger inner iteration number may achieve faster convergence rate, the inner loop process will take longer time. Thus, one should select an appropriate inner iteration number to achieve fast convergence and fast inner loop computation. In this case, inner iteration number of 50 is the optimal selection. Also, we want to emphasize that the optimal number of inner iteration is imaging-condition dependent. Usually, one may select some values to perform reconstructions and then select the optimal one.
CP algorithm has been proposed and widely used in image reconstruction for more than 10 years. We have applied the CP algorithm in EPR imaging and CT. The most important advantage of the CP algorithm is that it may always get closed-form solutions for sub-problems and the finial algorithm instance only involves simple matrix-vector multiplications and some simple operations. Thus, we say the CP algorithm is a fast prototyping tool for optimization based image reconstruction. Similar to CP, the FL-ADMM is also a fast prototyping tool for it may always get closed-form solutions for difficult sub-problems and it has not special demand on system matrix and sparse-transform matrix.
Through the studies in subsections 3.1 to 3.3, we have known the correctness of the FL-ADMM algorithm and have realized how the β selection and inner iteration number impact the convergence rate. In this subsection, we investigate the sparse reconstruction capability of the DDcTV-FL-ADMM algorithm by comparison with the DDcTV-CP algorithm whose pseudocode and some explanations are shown in Appendix 1.
The phantom is a real thoracic CT image of size [256, 256]. The imaging condition is the same with that of Section 3.2. In the DDcTV-FL-ADMM reconstructions, we fix the inner iteration number as 50, fix β as 10, and vary the projection number from 20,40,60,80, to 100. In the DDcTV-CP reconstruction, we set λ = 1 and
Figure 9 shows the reconstructed images of the two algorithms and Tables 1 to 3 shows the quantitative results.

The reconstructed images by the DDcTV-CP and DDcTV-FL-ADMM algorithms. The number above the images indicate the projection number. The text at the left of the images indicate the algorithm used. The images in row 3 and 4 are the enlarged region-of-interest (ROI) images which is indicated by the red rectangle in the right-top image. The red ellipses encircle a fine structure to emphasize observation.
RMSE comparison of the FL-ADMM and CP algorithms
SSIM comparison of the FL-ADMM and CP algorithms
PSNR comparison of the FL-ADMM and CP algorithms
The two algorithms are solving the same optimization model, DDcTV model. Theoretically speaking, model decides the solution, whereas the solving algorithm only decides the convergence rate and path. However, practically speaking, different solving algorithms cannot achieve the absolutely the same solution because the model-solution may be a solution-set, the algorithm-parameters cannot be selected absolutely optimally and there always exists numerical error induced by the computer floating point error. If the two solvers are both convergent accurate solvers, their reconstruction accuracy should be both very high and should be very similar. The CP algorithm has been used in CT and other imaging modalities for more than ten years and has been deeply explored. In this comparison, we may regard the CP algorithm as the state-of-the-art (SOTA) algorithm.
From Fig. 9, we may see that the image quality is better and better with the increase of the projection number for both algorithms. The reconstructed images via 100 projection by the two algorithms are both almost the same with the truth image. With the decrease of the projection number, the reconstructed images degrade gradually. The reconstructed images via 20 projections become too smooth. Their ROI images have lost the fine structure. For the two algorithms, they are both accurate solvers of the DDcTV model and both have capability to perform accurate sparse reconstructions. Comparing the two algorithms by Fig. 9, we almost cannot see the difference between each other. This shows the FL-ADMM may achieve comparable reconstruction quality with the SOTA CP algorithm according to the visual observations.
Tables 1 to 3 shows the quantitative comparison results of the two algorithms via metric of RMSE, SSIM and PSNR, respectively. RMSE means root mean square error, SSIM means structural similarity index measure, and PSNR means peak signal to noise ratio. From them, we may see that the two algorithms have very close accuracy, which is visually validated by Fig. 9. However, we may also see that the FL-ADMM algorithm is always a little bit better than the CP algorithm. This is because the optimal β selection in FL-ADMM is easier than the optimal selections of λ and ν in CP algorithm. For β, one may just search the optimal value via several reconstructions of different values. However, for the optimal selections of λ and ν, it would be more difficult for the search is in the range of a two dimensional plane. In image reconstructions, we usually search these values at tenfold intervals. Clearly, one-algorithm-parameter selection is much easier than two-algorithm-parameters selections. Viewed from this perspective, FL-ADMM algorithm is superior to the CP algorithm.
Both qualitative and quantitative evaluations show that the FL-ADMM algorithm may accurately solve the DDcTV model. In fact, we have also evaluated the FL-ADMM algorithms for ucTV and TVcDM models. To be brief, we just show the results of DDcTV-FL-ADMM algorithm. All the experiments on Shepp-Logan, FORBILD and real-CT-image phantoms show that FL-ADMM algorithm is a simple, effective, convergent and universal solving algorithm for optimization models in image reconstruction.
In this work, we propose a novel ADMM algoithm, FL-ADMM algorithm, which may be used as a prototyping tool for optimiaiton model. The key operation is to linearize all the quadratic terms so that the corresponding sub-problem may get a simple close-form. Further, we propose the fast FL-ADMM algorithm by use of the inner iteration technique. We have derived three FL-ADMM algorithm instances for three TV models, ucTV, DDcTV, and TVcDM. Further, we validate and evaluate the correctness and sparse recontruction capability of the DDcTV-FL-ADMM algorithm. Also, we analyze how the penalty parameter and the inner iteration number impact the convergence rate.
In optimization based image reconstruction, especially in TV-type norm based image reconstruciton, the ADMM algorithm always has a problem that one sub-problem has not simple close-form solution. Usually, people has two choices to solve this problem. One is to use gradient desent algorithm to solve this sub-problem. However, the step-size selection is difficult. If one uses the line search method to select the optimal step-size for each iteration, it needs too long time, especially when the imaging model is large-scale. The other one is to use FFT technique. In fact, why FFT may be used is because the sparse transform matrix and the system matrix may both be regarded as a convolution operation. So, this method has not universality. Once a sparse transform cannot be regarded as a convolution operation, this method loses efficacy.
These difficuties of ADMM motivated this work whose aim is to devise a method to simplify the implementation of the ADMM algoirthm. Motivated by the linearization technique of the L-ADMM algorithm, we propose the FL-ADMM algorithm and propose its acceleration version.
In Section 2, the standard FL-ADMM algorithms are named Algorithm-A, whearas the acclerated FL-ADMM algorithm are named Algoritm-B. For example, for the DDcTV-FL-ADMM algorithm, the standard algorithm is Algorithm 3-A, whereas its improved, accelerated version is Algorithm 3-B. In fact, the accelerated FL-ADMM algorithm uses a special gradient desenct algorithm. In Algorithm 3-B, Equation (4.3) is, in fact, a gradient desent equation. But, very importantly, the step-size is
Compared to the ADMM algorithm using FFT, this proposed algoirthm has universality for it does not need that the sparse transform matrix and the system maxtrix may be both regarded as a convolution operation.
Though this proposed algorithm has the two advantages over gradient descent based and FFT based ADMM algorithms mentioned above, it still has two disadvantages. One is the convergence rate. By varying the penalty parameter and the inner iteration number, we may actually get a comparatively high convergence rate. However, according to systematical comparison with the ASD-POCS and CP algorithms, we find that the FL-ADMM algorithm and the CP algorithm have similar convergence rate and are both slower than the ASD-POCS algorithm. Thus, the acceleration of FL-ADMM should be studied in the future. The other one is that we still do not know why the mathematically-derived, standard FL-ADMM without inner iteration is so slow. When we observed that the close-form solution for the sub-problem on data-fidelity term, for examle, Equation (52) in Algorithm 3-A, is, in fact, a gradient descent step, we realize that the slow convergence may be because Equation (52) only runs one time of gradient descent. Thus, we propose to accelerate the standard FL-ADMM algorithm by letting the gradient descent steps run more times/iterations. Though, we use inner iteration to perform gradient descent here, it is different from the direct use of gradient descent algorithm to solve the corresponding sub-problem, which needs time-consuming line search to select the optimal step-size for each iteration. However, as the CP algorithm, the FL-ADMM algorithm is a good prototyping tool for optimization model. If one design a new optimization model, one may use the FL-ADMM algorithm framework to derive the corresponding algorithm instance. This process will be very fast for the FL-ADMM algorithm instance is easy to derive and implement. Compared to CP algorithm, this algorithm derivation does not need the complicated calculation of convex conjugate. Compared to ASD-POCS algorithm, people do not need set up so many algorithm-parameters empirically. Compared to the standard ADMM algorithm, it may always get close-form solution for each sub-problem because of the full application of linearization technique. Also, the computer implementation of the FL-ADMM algorithm is easy for it usually just involve matrix-vector multiplication and some simple operations, for example shrinkage and projection operations.
Conclusions
The conclusions of this work may be summed up briefly as the following points. The FL-ADMM algorithm is a universal, simple, effective, and accurate solver of convex optimiztion model, no matter it is unconstrained or constrained. The FL-ADMM algoirthm improves the traditional ADMM algorithm by avoiding the step-size search for gradient desecnt and the special demands to the sparse transfrom and the system matrix for use of FFT technique. The penalty parameter in this proposed algoritm may impact the convergence rate. Too large or small values both lead to slow convergence. The inner iteration number in this algorithm may impact the convergence rate. One may select the optimal one by running several reconstrutions with different inner iteration number. According to our experience, 30-50 is usually efficient. But, we note that the optimal inner iteration number is imaging-condition dependent.
In the future, the FL-ADMM algorithm should be focused on its acceleration technique which may borrow the ideas in accelertion of the traditional ADMM algorithm.
Footnotes
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under grant 62071281, by the Local Science and Technology Development Fund Project Guided by the Central Government under grant YDZJSX2021A003, and by the NIH under grants P41 EB002034, R01 CA098575, and P30 CA014599.We are very thankful to Prof. Bingsheng He with Southern University of Science and Technology, Shenzhen, Guangdong, China, for his constructive discussions with us.
Appendix 1.
DDcTV-CP algorithm
The DDcTV model can be formulated as
By use of the CP algorithm framework, we may derive the CP algorithm instance to solve (A2), which is shown in Algorithm A1.
