A fully linearized ADMM algorithm for optimization based image reconstruction

Abstract

BACKGROUND AND OBJECTIVE:

Optimization based image reconstruction algorithm is an advanced algorithm in medical imaging. However, the corresponding solving algorithm is challenging because the model is usually large-scale and non-smooth. This work aims to devise a simple and convergent solver for optimization model.

METHODS:

The alternating direction method of multipliers (ADMM) algorithm is a simple and effective solver of the optimization model. However, there always exists a sub-problem that has not close-form solution. One may use gradient descent algorithm to solve this sub-problem, but the step-size selection via line search is time-consuming. Or, one may use fast Fourier transform (FFT) to get a close-form solution if the sparse transform matrix is of special structure. In this work, we propose a fully linearized ADMM (FL-ADMM) algorithm that avoids line search to determine step-size and applies to sparse transform of any structure.

RESULTS:

We derive the FL-ADMM algorithm instances for three total variation (TV) models in 2D computed tomography (CT). Further, we validate and evaluate one FL-ADMM algorithm and explore how two important factors impact convergence rate. These studies show that the FL-ADMM algorithm may accurately solve the optimization model.

CONCLUSION:

The FL-ADMM algorithm is a simple, effective, convergent and universal solver of optimization model in image reconstruction. Compared to the standard ADMM algorithm, the new algorithm does not need time-consuming step-size line-search or special demand to sparse transform. It is a rapid prototyping tool for optimization based image reconstruction.

Keywords

Fully linearized ADMM optimization total variation computed tomography image reconstruction

1 Introduction

Image reconstruction is the core technique of medical imaging. There are three types of reconstruction algorithms: analytical algorithm [1], optimization based iterative algorithm [2, 3] and deep learning/machine learning based algorithm [4]. Among them, the optimization based algorithm has some advantages relative to the two other algorithms. Compared to the analytical algorithm, the optimization based algorithm may incorporating prior information into the optimization model and thus may achieve more accurate reconstruction from incomplete data and/or noisy data. Compared to the deep learning based algorithm, the optimization based algorithm is more stable and is not data dependent [5]. Therefore, the optimization based algorithm is always a hot spot in image reconstruction field.

The imaging system model of the optimization based algorithm is a linear system of equations. Usually, it is large-scale, ill-posed and often underdetermined. So, it is impossible to solve the linear system by direct inversion. We may continue to construct an optimization model to solve the ill-posed and underdetermined problems by incorporating prior information like sparse prior [6] or low-rank prior [7], etc.

In the optimization models for image reconstruction, total variation (TV) type models have been widely used for they may not only suppress streak artifacts but also suppress noise [8]. In 2006, Sidky et al. proposed a data divergence constrained, TV (DDcTV) minimization model for 2D CT [9]. In 2008, Sidky et al. proposed a DDcTV model for 3D CT and achieved accurate reconstruction [2]. From then on, a variety of TV-type models were proposed in image reconstruction, for example adaptively weighted TV (awTV) [10], anisotropic TV (aTV) [11], high order TV (HOTV) [12, 13], non-local TV (NLTV) [14], nuclear TV (nTV) [15], directional TV (dTV) [16], etc. These TV-type models may achieve more accurate reconstructions in their suitable cases. However, the solving problems of these TV-type models are challenging because these models are large-scale and non-smooth. They cannot be solved by simple gradient decent algorithm.

For TV-type models, there are mainly three solving algorithms: adaptive steepest descent-projection onto convex sets (ASD-POCS) algorithm [2, 9], Chambolle-Pock (CP) algorithm [17, 18], and alternating direction method of multipliers (ADMM) algorithm [19 –26]. We have systematically studied ASD-POCS and CP algorithms and applied them to solve TV models for CT and electron paramagnetic resonance imaging (EPRI) [27 –30]. ASD-POCS algorithm is devised according to the physical meaning and the optimization strategy, whereas it is not derived mathematically. Thus, those algorithm parameters in this algorithm should be carefully, empirically chosen to achieve convergence. CP algorithm is obtained by mathematical derivation and may guarantee convergence, but the derivation is not so easy for those who are not familiar with optimization, for example it needs the calculation of convex conjugate [18]. ADMM is another algorithm framework for solving TV-types models. The derivation of algorithm instance according to the algorithm framework is easier than CP algorithm for the core step is only to split the whole optimization problem into several simple sub-optimization problems by alternating direction technique. However, for TV-types models in image reconstruction, there is always a sub-problem that has not close-form solution in ADMM algorithm. One method is to solve this sub-problem by gradient descent algorithm [26]. However, on the one hand, the iteration number must be large enough to achieve accurate solution of this sub-problem, on the other hand, the step-size should be selected by one dimensional line search technique, for example nonmonotone line search [26], leading to long-time inner-iteration. Another method is to use fast Fourier transform (FFT) algorithm to get a close-form solution [20]. However, it demands that the system matrix and the sparse transform matrix have special structure, for example the system matrix for image deblurring, image denoising, etc, so that they are be regarded as convolution operations. For CT and EPRI, this method is not suitable. This is why we never use ADMM algorithm in TV-based reconstructions in CT and EPRI for a long time.

We noticed that some researchers began to use the linearization technique to linearize the quadratic term in the difficult sub-problem in ADMM so as to the FFT algorithm may be used to construct a close-form solution [24]. Additionally, linearized ADMM (L-ADMM) has been proposed to improve the ADMM algorithm [31 –37]. However, we found that these L-ADMM algorithms are not thorough, i.e. people only linearized one quadratic term in the difficult sub-problem. Why do not people linearize all the quadratic terms in the difficult sub-problem? We think it should be deeply investigated.

In this work, we proposed a fully linearized ADMM (FL-ADMM) for image reconstruction to simplify the ADMM algorithm by avoiding the search of optimal step-size in gradient descent algorithm and the use of FFT algorithm. The proposed FL-ADMM algorithm framework may be used to derive simple, effective, and convergent algorithm instances for a variety of optimization models. To show the potential of the FL-ADMM for prototyping of the optimization models, we derive the FL-ADMM algorithm instances for unconstrained TV (ucTV) minimization model, data divergence constrained, TV (DDcTV) minimization model, and TV constrained, data divergence minimization (TVcDM) model for two dimensional (2D) computed tomography (CT). Also, we validate and evaluate the DDcTV-FL-ADMM algorithm for 2D CT to illuminate that the FL-ADMM algorithm is actually an accurate solver of the DDcTV model. Finally, we explore how the penalty parameter and other algorithm parameters of DDcTV-FL-ADMM impact the convergence rate.

In Section 2, we give the derivation of the FL-ADMM algorithm instances for three TV models. In Section 3, we perform reconstruction experiments via the proposed DDcTV-FL-ADMM algorithm. We give discussion and draw conclusions briefly in Section 4.

2 Methods

2.1 Preliminary knowledge

In this section, we give some basic optimization knowledge, which will be used in the following parts.

2.1.1 Shrinkage algorithm

(1) One dimensional (1D) shrinkage

1D Shrinkage algorithm or operation may solve the optimization problem shown in Equation (1). $x^{*} = arg min_{x} λ ∥ x ∥_{1} + \frac{1}{2} ∥ x - a ∥_{2}^{2}$ (1)

Here, x is a vector of size N, ∥ · ∥ ₁ is the ℓ₁ norm of a vector, and ∥ · ∥ ₂ is the ℓ₂ norm of a vector.

Suppose that x_i indicates the ith. element of this vector. Then the solution of this optimization problem is ${x_{i}}^{*} = 𝕊 (a_{i}, λ) = max (| a_{i} | - λ, 0) \cdot sgn (a_{i}) . 1 \leq i \leq N$ (2)

Here, $𝕊$ is the 1D shrinkage operator and sgn (·) is the standard sign function, whose value is 1 for positive number, 0 for 0, and -1 for negative number. max(· , ·) is an operator for selecting maximal value.

(2) Two dimensional (2D) shrinkage

2D Shrinkage algorithm or operation may solve the optimization problem shown in Equation (3). $x^{*} = arg min_{x} λ ∥ ∥ x ∥_{2} ∥_{1} + \frac{1}{2} ∥ x - a ∥_{2}^{2}$ (3)

Here, x is a 2D-vector-valued vector of size N, x_i indicates the ith element of this vector, $x_{i}^{1}$ and $x_{i}^{2}$ are the two elements of the 2D vector, x_i.

The solution of this optimization problem is ${x_{i}}^{*} = 𝕊^{2} (a_{i}, λ) = max (∥ a_{i} ∥_{2} - λ, 0) \cdot a_{i} / ∥ a_{i} ∥_{2} . 1 \leq i \leq N$ (4)

Here, $𝕊^{2}$ is the 2D shrinkage operator and we should note that x_i and a_i are both 2D vector.

2.1.2 Projection algorithm

In optimization theory, projection onto convex sets (POCS) algorithm/operation may solve a special optimization problem as follows. $x^{*} = arg min_{x} ∥ x - a ∥_{2}^{2} s . t . x \in C$ (5)

Here, C is a convex set. x and a are both vectors of size N.

Then, the solution of this optimization problem is $x^{*} = ℙ (a, C) .$ (6)

Here, $ℙ$ is the projection operator. The meaning of (6) is to project the point a onto the convex set C.

(1) Projection onto ℓ₂ norm ball

If the convex set is a ℓ₂ norm ball of radius, r, then the POCS operation of a to this ball is $P (a, L 2 Ball (r)) = \frac{ra}{max (r, ∥ a_{2} ∥)} .$ (7)

Here, L2Ball (r) is the ℓ₂ norm ball of radius r. This POCS operation means that, if the point a is in the ball, then the projection value is a, whereas, if the point is outside the ball, then projection value is the intersection point of this ℓ₂ norm circle and the vector a. (Note that any point may also be regarded as a vector.)

If the vector a is of 2D form, then the POCS schematic diagram is shown in Fig. 1.

Fig. 1

Projection onto ℓ₂ norm ball. (a) is for the case that a is in the ball, and (b) is for the case that a is outside the ball.

(2) Projection onto ℓ₁ norm ball

The POCS operation of projection onto ℓ₁ norm ball is more complicated than that of projection onto ℓ₂ norm ball. Its schematic diagram for 2D case is shown in Fig. 2.

Fig. 2

Projection onto ℓ₁ norm ball. (a) is for the case that a is in the ball, and (b) is for the case that a is outside the ball.

From Fig. 2, we may see that, if a is in the ℓ₁ norm ball, the projection point is still itself, whereas, if it is outside the ball, the projection point should be the foot of a perpendicular from a to the ℓ₁ norm circle.

There exists accurate algorithm for $ℙ (a, L 1 Ball (r))$ , shown in Algorithm 1 [38].

Algorithm 1.

Pseudo-codes for $ℙ (a, L 1 Ball (r))$

1: Function

ℙ (a, L 1 Ball (r))

2: if ∥a ∥ ₁ ≤ r then

3: return a

4: end if

5: m = |a|

6: Sort m in descending order

7: k = max(j)

such that m_{j} - \frac{1}{j} (\sum_{i = 1}^{j} m_{i} - r) > 0 for all j \in [1, N]

θ = (1 / k) (\sum_{i = 1}^{k} m_{i} - r)

9: w = max(|a| - θ,0)

10: return w · sgn (a)

11: end function

2.1.3 Indicator function and constrain-unconstrain transformation

The indicator function may be defined as $δ_{C} (x) = {\begin{matrix} 0 x \in C \\ \infty x \notin C \end{matrix} .$ (8)

Thus, a constrained optimization model may be transformed into the unconstrained optimization form. For example, Equation (5) may be written in the unconstrained form as follows. $x^{*} = arg min_{x} ∥ x - a ∥_{2}^{2} + δ_{C} (x)$ (9)

In the following sections, we will often use this transformation. Note that, it is just a form-transformation and the optimization meaning of the two forms are completely the same.

2.1.4 Linearization of a quadratic function

In fact, any quadratic function may be linearized at a point according to the Taylor expansion. $\frac{β}{2} ∥ Ax - b ∥_{2}^{2} \approx \frac{β}{2} ∥ A x_{0} - b ∥_{2}^{2} + 〈 x - x_{0}, β A^{T} (A x_{0} - b) 〉 + \frac{s}{2} ∥ x - x_{0} ∥_{2}^{2}$ (10)

Here, x₀ is any point (vector), s cannot be selected arbitrarily and usually we may set its value according to [33] $s \geq β ∥ A^{T} A ∥_{2} = β λ_{max} (A^{T} A) .$ (11)

Here, λ_max means the largest eigenvalue of a matrix.

2.2 Three ADMM algorithm framework

2.2.1 ADMM algorithm framework

We consider the structured, constrained convex optimization problem $min {θ_{1} (x) + θ_{2} (y)} s . t . Ax + By = b .$ (12)

Where, x and y are multi-dimensional vectors, A and B are matrices indicating linear transform, θ₁ (x) and θ₂ (y) are convex but not necessary smooth functions.

The corresponding augmented Lagrange function is $L_{β} (x, y; λ) = θ_{1} (x) + θ_{2} (y) - 〈 λ, Ax + By - b 〉 + \frac{β}{2} ∥ Ax + By - b ∥_{2}^{2} .$ (13)

The ADMM algorithm framework is ${\begin{matrix} x^{k + 1} = arg min L_{β} (x, y^{k}; λ^{k}) (14.1) \\ y^{k + 1} = arg min L_{β} (x^{k + 1}, y; λ^{k}) (14.2) \\ λ^{k + 1} = λ^{k} - β (A x^{k + 1} + B y^{k + 1} - b) . (14.3) \end{matrix}$

The sub-problem (14.1) may be further written as $x^{k + 1} = arg min {θ_{1} (x) + \frac{β}{2} {∥ Ax + B y^{k} - b - \frac{λ^{k}}{β} ∥}_{2}^{2}} .$ (15)

The sub-problem (14.2) may be further written as $y^{k + 1} = arg min {θ_{2} (y) + \frac{β}{2} {∥ A x^{k + 1} + By - b - \frac{λ^{k}}{β} ∥}_{2}^{2}} .$ (16)

Clearly, the ADMM algorithm may divide the original problem into three simple sub-problems. By this splitting technique, the two convex functions are split and the whole solving problem is potentially simplified.

2.2.2 Linearized ADMM (L-ADMM) algorithm framework

For the sub-problems (15) and (16) are similar, we will just discuss the linearization technique for (15).

Often, (15) is still hard to solve for usually there is not a simple, close-form solution, for example in case that θ₁ (x) is also a quadratic function. Thus, people proposed the L-ADMM algorithm, whose difference from the ADMM algorithm is only linearization to the quadratic function in (15).

According to the linearization method shown in Section 2.1.4, the linearized form of (15) is $x^{k + 1} = arg min {θ_{1} (x) + 〈 x, β A^{T} (A x^{k} + B y^{k} - b - \frac{λ^{k}}{β}) 〉 + \frac{s}{2} ∥ x - x^{k} ∥_{2}^{2}} .$ (17)

Where, s ≥ β ∥ A^TA ∥ ₂ = βλ_max (A^TA) to guarantee convergence. (17) is potentially easier to get a close-form solution.

2.2.3 Fully Linearized ADMM (FL-ADMM) algorithm framework

If θ₁ (x) is a simple quadratic function, i.e. the corresponding matrix is an identity matrix or is of special structure, for example it is equivalent to a convolution operation, then (17) may get a close-form solution by use of the simple optimality condition and the FFT technique. However, the matrix corresponding to θ₁ (x) usually is not of special structure in CT and EPRI, so (17) is still difficult to get a close-form solution.

The difference of FL-ADMM from L-ADMM is only that it linearizes any quadratic functions. By use of this fully linearization technique, (17) may get a close-form solution.

If $θ_{1} (x) = \frac{γ}{2} ∥ Fx - a ∥_{2}^{2}$ , according to (17) and (10), then (17) has the close-form solution below. $x^{k + 1} = x^{k} - \frac{γ}{s_{1} + s_{2}} [F^{T} (F x^{k} - a)] - \frac{β}{s_{1} + s_{2}} [A^{T} (A x^{k} + B y^{k} - b - \frac{λ^{k}}{β})],$ (18) where, s₁ ≥ γ ∥ F^TF ∥ ₂, and s₂ ≥ β ∥ A^TA ∥ ₂. In fact, this is a gradient descent operation with fixed step-size.∥Clearly, FL-ADMM algorithm may make close-form solution and simplify the difficult sub-problem. Otherwise, one must use time-consuming line search to select the step-size for solving (17) via gradient descent or one must use FFT algorithm to solve (17), for which, F and A must be of special structure. This proposed FL-ADMM algorithm is the core contribution of this work.

2.3 FL-ADMM algorithm instances derivation of three TV models

In this section, we will derive three FL-ADMM algorithm instances corresponding to three TV models.

TV models in image reconstruction have been heavily investigated and have achieved accurate reconstructions via sparse-view projections and/or noisy projections. There are three types of TV models: ucTV, DDcTV and TVcDM models. The unconstrained version is often solved by ADMM algorithm, whereas, DDcTV and TVcDM are often solved by ASD-POCS and CP algorithm. Though the derivation below, we will see that FL-ADMM algorithm may solve any type of TV models and each sub-problem is easy to compute because of its close form.

Without loss of generality, we derive the algorithm instances for 2D CT.

2.3.1 FL-ADMM algorithm for ucTV model

(1) The ucTV model The discrete-to-discrete (D2D) imaging system model of 2D CT may be formulated as $g = Au,$ (19) where, u is a vector of size N, indicating the image, g is a vector of size M, indicating the projection data, and A is the system matrix (projection matrix) of size M × N, indicating the 2D Radon transform for parallel beam CT and the ray transform for fan beam CT. A_i,j is the contribution of the jth pixel to the ith ray.

If the 2D image is of size [n_x, n_y], then N = n_x × n_y. If the projection data (sinogram) is of size [n_p, n_a], i.e. there are n_a projections and each projection has n_p measurements, then M = n_P × n_a.

The ucTV model may be formulated as $u^{*} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} + α ∥ u ∥_{TV},$ (20) where, ∥u ∥ _TV is the TV norm of image u, and its isotropic form is $u_{TV} = ∥ (∥ D u ∥_{2}) ∥_{1} .$ (21)

Here, D is a matrix of size 2N × N, indicating the gradient transform, and is of this form $D = (\begin{matrix} D_{x} \\ D_{y} \end{matrix}) .$ (22)

Here, D_x and D_y are both matrices of size N × N, indicating the x and y direction gradient transform, respectively, shown as $(D_{x} u)_{x, y} = {\begin{matrix} u_{x, y} - u_{x - 1, y} x \in [2, n_{x}] \\ 0 x = 1 \end{matrix},$ (23) $(D_{y} u)_{x, y} = {\begin{matrix} u_{x, y} - u_{x, y - 1} y \in [2, n_{y}] \\ 0 y = 1 \end{matrix} .$ (24)

Here, n_x and n_y are the row and column number of the image, respectively. And, x and y are the row and column index of the image, respectively.

Thus, the gradient magnitude transform ∥Du ∥ ₂ may be formulated as ${(∥ Du ∥_{2})}_{x, y} = \sqrt{{(D_{x} u)}_{x, y}^{2} + {(D_{y} u)}_{x, y}^{2}} .$ (25)

And, the TV norm is $∥ u ∥_{TV} = \sum_{x, y} \sqrt{{(u_{x, y} - u_{x - 1, y})}^{2} + {(u_{x, y} - u_{x, y - 1})}^{2}} .$ (26)

(2) The ucTV-FL-ADMM algorithm

Now, we have the ucTV model, $u^{*} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} + α ∥ (∥ Du ∥_{2}) ∥_{1} .$ (27)

The FL-ADMM algorithm instance derivation process is as follows.

Let y = Du, i.e. Du - y = 0, then (27) is equivalent to this minimization problem, $min \frac{1}{2} ∥ Au - g ∥_{2}^{2} + α ∥ (∥ y ∥_{2}) ∥_{1} s . t . Du - y = 0$ (28)

The corresponding augmented Lagrange function is $L_{β} (u, y; λ) = \frac{1}{2} ∥ Au - g ∥_{2}^{2} + α ∥ (∥ y ∥_{2}) ∥_{1} - 〈 λ, Du - y 〉 + \frac{β}{2} ∥ Du - y ∥_{2}^{2} .$ (29)

According to the FL-ADMM algorithm framework, we may derive the algorithm instance. $u^{k + 1} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} + \frac{β}{2} ∥ Du - y^{k} - \frac{1}{β} λ^{k} ∥_{2}^{2}$ (30)

We perform linearization to each quadratic function in (30), and get

$\begin{matrix} u^{k + 1} = & arg min_{u} 〈 u, A^{T} (A u^{k} - g) 〉 + \frac{s_{1}}{2} ∥ u - u^{k} ∥_{2}^{2} \\ + 〈 u, β D^{T} (D u^{k} - y^{k} - \frac{1}{β} λ^{k}) 〉 + \frac{s_{2}}{2} ∥ u - u^{k} ∥_{2}^{2} . \end{matrix}$ (31)

Let the gradient of the objective function in (31) be 0, we get $u^{k + 1} = \frac{(s_{1} + s_{2}) u^{k} - A^{T} (A u^{k} - g) - β D^{T} (D u^{k} - y^{k} - \frac{1}{β} λ^{k})}{s_{1} + s_{2}} .$ (32)

Here, s₁ ≥ λ_max (A^TA) and s₂ ≥ βλ_max (D^TD). Clearly, (32) is of close form and only involve simple matrix-vector multiplication. $y^{k + 1} = arg min_{y} α ∥ (∥ y ∥_{2}) ∥_{1} + \frac{β}{2} ∥ D u^{k + 1} - y - \frac{1}{β} λ^{k} ∥_{2}^{2}$ (33)

According to the 2D shrinkage algorithm in (4), we get ${(y^{k + 1})}_{i, j} = 𝕊^{2} ({(D u^{k + 1} - \frac{1}{β} λ^{k})}_{i, j}, α / β) .$ (34)

Note that y and λ are vectors of size 2N. But, if we regard them as vector-valued vector, then they are both vectors of size N. Further, if we consider their 2D form, they are both vector-valued matrices of size [n_x, n_y]. So, each element of y and λ is y_i,j and λ_i,j, respectively. But, we should note that they are both a 2D vector, so (34) is a 2D shrinkage algorithm. $λ^{k + 1} = λ^{k} - β (D u^{k + 1} - y^{k + 1})$ (35)

Now, we get the ucTV-FL-ADMM algorithm.

Algorithm 2-A.

Pseudocode for K steps of the ucTV-FL-ADMM algorithm.

Input:g, A, D, α

1: set up β value, s₁ = λ_max (A^TA), s₂ = βλ_max (D^TD).

2: u⁰=0, y⁰=0, λ⁰=0, k = 0.

3: repeat

4: Equation (32) for u^k+1.

5: Equation (34) for y^k+1.

6: Equation (35) for λ^k+1.

7: k = k + 1

8: untilk ≥ K

Output: u ^K

Algorithm 2-A is a simple but effective algorithm. It does not need step-size search via one dimensional search technique. Also, it does not need to use FFT algorithm. The main operations in this algorithm are just simple matrix-vector multiplication and shrinkage algorithm. When we implement this algorithm, the matrices do not need to be of explicit form, i.e. we must not store them in the computer memory. We may regard them as some specific operations, for example, Au means projection operation, whereas A^Tg means backprojection operation.

However, we find that this algorithm converges very slowly, so we propose an improved fast algorithm with inner iterations as follows.

Algorithm 2-B.

Pseudocode for K steps of the ucTV-FL-ADMM algorithm with inner iterations.

Input:g, A, D, α

1: set up β value, s₁ = λ_max (A^TA), s₂ = βλ_max (D^TD).

2: u⁰ = 0, y⁰ = 0, λ⁰ = 0, k = 0.

3: repeat

4.1: u = u^k.

4.2: for ii = 1:n_inner_iteration

4.3:

u = \frac{(s_{1} + s_{2}) u - A^{T} (Au - g) - β D^{T} (Du - y^{k} - \frac{1}{β} λ^{k})}{s_{1} + s_{2}}

4.4: end for

4.5: u^k+1 = u

5: Equation (34) for y^k+1.

6: Equation (35) for λ^k+1.

7: k = k + 1

8: untilk ≥ K

Output: u ^K

Here, ‘n_inner_iteration’ means the inner iteration number. By adding this inner iterations technique, the algorithm will converge much faster. The reason will be discussed in the Section 4.

2.3.2 FL-ADMM algorithm for DDcTV model

(1) The DDcTV model

The data divergence constrained, TV (DDcTV) minimization model may be formulated as $u^{*} = arg min_{u} ∥ u ∥_{TV} s . t . ∥ Au - g ∥_{2} \leq ɛ .$ (36)

This model has superior performance than the ucTV model since its model parameter ɛ has clear physical meaning that it embodies the noise level and level of system-inconsistence.

(2) The DDcTV-FL-ADMM algorithm

According to (21) and (8), (36) may be written as $u^{*} = arg min_{u} ∥ (∥ Du ∥_{2}) ∥_{1} + δ_{L 2 Ball (ɛ)} (Au - g)$ (37)

Let y = Du, and Au - g = z, then (37) is equivalent to this minimization problem, $min ∥ (∥ y ∥_{2}) ∥_{1} + δ_{L 2 Ball (ɛ)} (z) s . t . Du - y = 0 and Au - z = g .$ (38)

The corresponding augmented Lagrange function is $\begin{matrix} L_{β_{1}, β_{2}} (u, y, z; λ_{1}, λ_{2}) = ∥ (∥ y ∥_{2}) ∥_{1} \\ + δ_{L 2 Ball (ɛ)} (z) - 〈 λ_{1}, Du - y 〉 + \frac{β_{1}}{2} ∥ Du - y ∥_{2}^{2} \\ - 〈 λ_{2}, Au - z - g 〉 + \frac{β_{2}}{2} ∥ Au - z - g ∥_{2}^{2} . \end{matrix}$ (39)

According to the FL-ADMM algorithm framework, we may derive the algorithm instance. $u^{k + 1} = arg min_{u} \frac{β_{1}}{2} {∥ Du - y^{k} - \frac{1}{β_{1}} λ_{1}^{k} ∥}_{2}^{2} + \frac{β_{2}}{2} {∥ Au - z^{k} - g - \frac{1}{β_{2}} λ_{2}^{k} ∥}_{2}^{2}$ (40)

By linearizing the two quadratic functions in (40), letting the gradient of the objective function be 0, we may get $u^{k + 1} = u^{k} - \frac{β_{1}}{s_{1} + s_{2}} D^{T} (D u^{k} - y^{k} - \frac{1}{β_{1}} λ_{1}^{k}) - \frac{β_{2}}{s_{1} + s_{2}} A^{T} (A u^{k} - z^{k} - g - \frac{1}{β_{2}} λ_{2}^{k})$ (41)

Here, s₁ ≥ β₁λ_max (D^TD) and s₂ ≥ β₂λ_max (A^TA). Cleary, (41) is of a simple close form and thus is easy to implement. $y^{k + 1} = arg min_{y} ∥ (∥ y ∥_{2}) ∥_{1} + \frac{β_{1}}{2} {∥ D u^{k + 1} - y - \frac{1}{β_{1}} λ_{1}^{k} ∥}_{2}^{2}$ (42)

According to the 2D shrinkage algorithm in (4), we get ${(y^{k + 1})}_{i, j} = 𝕊^{2} ({(D u^{k + 1} - \frac{1}{β_{1}} λ_{1}^{k})}_{i, j}, \frac{1}{β_{1}}) .$ (43) $\begin{matrix} z^{k + 1} = arg min_{z} δ_{L 2 Ball (ɛ)} (z) + \frac{β_{2}}{2} {∥ A u^{k + 1} - z - g - \frac{1}{β_{2}} λ_{2}^{k} ∥}_{2}^{2} \\ = ℙ (A u^{k + 1} - g - \frac{1}{β_{2}} λ_{2}^{k}, L 2 Ball (ɛ)) = \frac{ɛ (A u^{k + 1} - g - \frac{1}{β_{2}} λ_{2}^{k})}{max (ɛ, {∥ A u^{k + 1} - g - \frac{1}{β_{2}} λ_{2}^{k} ∥}_{2})} \end{matrix}$ (44) $λ_{1}^{k + 1} = λ_{1}^{k} - β_{1} (D u^{k + 1} - y^{k + 1})$ (45) $λ_{2}^{k + 1} = λ_{2}^{k} - β_{2} (A u^{k + 1} - z^{k + 1} - g)$ (46)

Now, we get the DDcTV-FL-ADMM algorithm.

Algorithm 3-A.

Pseudocode for K steps of the DDcTV-FL-ADMM algorithm.

Input:g, A, D, e

1: set up β₁ and β₂ values, s₁ = β₁λ_max (D^TD), s₂ = β₂λ_max (A^TA).

2: u⁰=0, y⁰=0, z⁰=0,

λ_{1}^{0}

λ_{2}^{0} =

0, k = 0.

3: repeat

4: Equation (41) for u^k+1.

5: Equation (43) for y^k+1.

6: Equation (44) for z^k+1

7: Equation (45) for

λ_{1}^{k + 1}

6: Equation (46) for

λ_{2}^{k + 1}

7: k = k + 1

8: untilk ≥ K

Output: u ^K

Also, this standard FL-ADMM algorithm is slow to convergence, so we propose the fast, improved FL-ADMM algorithm with inner iterations.

Algorithm 3-B.

Pseudocode for K steps of the DDcTV-FL-ADMM algorithm with inner iterations.

Input:g, A, D, e

1: set up β₁ and β₂values, s₁ = β₁λ_max (D^TD), s₂ = β₂λ_max (A^TA).

2: u⁰=0, y⁰=0, z⁰=0,

λ_{1}^{0}

λ_{2}^{0} =

0, k = 0.

3: repeat

4.1: u = u^k.

4.2: for ii = 1:n_inner_iteration

4.3:

u = u - \frac{β_{1}}{s_{1} + s_{2}} D^{T} (Du - y^{k} - \frac{1}{β_{1}} λ_{1}^{k}) - \frac{β_{2}}{s_{1} + s_{2}} A^{T} (Au - z^{k} - g - \frac{1}{β_{2}} λ_{2}^{k})

4.4: end for

4.5: u^k+1 = u

5: Equation (43) for y^k+1

6: Equation (44) for z^k+1

7: Equation (45) for

λ_{1}^{k + 1}

8: Equation (46) for

λ_{2}^{k + 1}

9: k = k + 1

10: untilk ≥ K

Output: u ^K

By adding this inner iterations technique, the algorithm will converge much faster. The reason will be discussed in the Section 4.

2.3.3 FL-ADMM algorithm for TVcDM model

(1) The TVcDM model

The TV constrained, data divergence minimization (TVcDM) algorithm is another constrained TV model and has been widely used in CT and EPRI. They are often solved by the CP algorithm. Next, we will see that it may also be solved by FL-ADMM and will see that its derivation is easier than that of CP algorithm for it does not need to calculate the convex conjugate functions.

The TVcDM model may be formulated as $u^{*} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} s . t . ∥ u ∥_{TV} \leq t .$ (47)

(2) The TVcDM-FL-ADMM algorithm

By use of the indicator function and the definition of the TV norm, (47) is equivalent to $u^{*} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} + δ_{L 1 Ball (t)} (∥ Du ∥_{2}) .$ (48)

Let y = Du, then (48) becomes $min \frac{1}{2} ∥ Au - g ∥_{2}^{2} + δ_{L 1 Ball (t)} (∥ y ∥_{2}) s . t Du - y = 0 .$ (49)

Then, the corresponding augmented Lagrange function is $L_{β} (u, y; λ) = \frac{1}{2} ∥ Au - g ∥_{2}^{2} + δ_{L 1 Ball (t)} (∥ y ∥_{2}) - 〈 λ, Du - y 〉 + \frac{β}{2} ∥ Du - y ∥_{2}^{2} .$ (50)

According to the FL-ADMM algorithm framework, we may derive the corresponding algorithm instance. $u^{k + 1} = arg min_{u} \frac{1}{2} ∥ Au - g ∥_{2}^{2} + \frac{β}{2} {∥ Du - y^{k} - \frac{1}{β} λ^{k} ∥}_{2}^{2}$ (51)

It is the same as (30). By full linearization, we may get its close-form solution. $u^{k + 1} = \frac{(s_{1} + s_{2}) u^{k} - A^{T} (A u^{k} - g) - β D^{T} (D u^{k} - y^{k} - \frac{1}{β} λ^{k})}{s_{1} + s_{2}}$ (52)

Here, s₁ ≥ λ_max (A^TA) and s₂ ≥ βλ_max (D^TD). $y^{k + 1} = arg min_{y} δ_{L 1 Ball (t)} (∥ y ∥_{2}) + \frac{β}{2} {∥ D u^{k + 1} - y - \frac{1}{β} λ^{k} ∥}_{2}^{2}$ (53) Let $m = ℙ ({∥ D u^{k + 1} - \frac{1}{β} λ^{k} ∥}_{2}, L 1 Ball (t)),$ (54)

Then ${(y^{k + 1})}_{i, j} = \frac{{(D u^{k + 1} - \frac{1}{β} λ^{k})}_{i, j}}{{({∥ D u^{k + 1} - \frac{1}{β} λ^{k} ∥}_{2})}_{i, j}} \times m_{i, j} .$ (55) $λ^{k + 1} = λ^{k} - β (D u^{k + 1} - y^{k + 1})$ (56)

Now, we get the TVcDM-FL-ADMM algorithm.

Algorithm 4-A.

Pseudocode for K steps of the TVcDM-FL-ADMM algorithm.

Input:g, A, D, t

1: set up β value, s₁ = λ_max (A^TA), s₂ = βλ_max (D^TD).

2: u⁰=0, y⁰=0, λ⁰=0, k = 0.

3: repeat

4: Equation (52) for u^k+1.

5: Equations (54) and (55) for y^k+1.

6: Equation (56) for λ^k+1.

7: k = k + 1

8: untilk ≥ K

Output: u ^K

Also, we may speed up this algorithm by use of inner iteration technique.

Algorithm 4-B.

Pseudocode for K steps of the TVcDM-FL-ADMM algorithm with inner iteration

Input:g, A, D, t

1: set up β value, s₁ = λ_max (A^TA), s₂ = βλ_max (D^TD).

2: u⁰=0, y⁰=0, λ⁰=0, k = 0.

3: repeat

4.1: u = u^k.

4.2: for ii = 1:n_inner_iteration

4.3:

u = \frac{(s_{1} + s_{2}) u - A^{T} (Au - g) - β D^{T} (Du - y^{k} - \frac{1}{β} λ^{k})}{s_{1} + s_{2}}

4.4: end for

4.5: u^k+1 = u

5: Equations (54) and (55) for y^k+1.

6: Equation (56) for λ^k+1.

7: k = k + 1

8: untilk ≥ K

Output: u ^K

By adding this inner iterations technique, this algorithm will converge much faster. The reason will be discussed in Section 4.

3 Results

The aim of this work is to design a FL-ADMM algorithm framework and to give prototyping method for optimization models in image reconstruction via the FL-ADMM algorithm framework. For optimization based image reconstruction, we know that optimization model decides the final solution, whereas the solving algorithm just impact the convergence rate and path. Since the FL-ADMM algorithm is a solving algorithm, we should validate if it may solve optimization model accurately, and evaluate what may impact its convergence rate. We have derived three FL-ADMM algorithms for ucTV, DDcTV and TVcDM models. Next, we will just validate and evaluate this solving algorithm for DDcTV model in 2D CT.

We design 4 studies: (1) Inverse crime of DDcTV-FL-ADMM algorithm; (2) The impact of β selection on convergence rate; (3) The impact of inner iteration number on convergence rate; and (4) Comparison with the CP algorithm.

3.1 Inverse crime of DDcTV-FL-ADMM algorithm

Inverse crime is a tool to validate the correctness of an inverse problem [39]. If the projection data is complete and exact, then the sign of inverse crime is that the reconstructed image is almost the same with the truth image, i.e. there is not any error except for the computer floating point error.

The imaging configurations are as follows. The phantom is the Shepp-Logan phantom of size [256, 256]. The imaging coordinate system is located at [128, 128]. The projection signal at each angle is of size 256 and its coordinate range is [–127, 128]. The virtual detector bin length is 1. The pixel size is also 1. The parallel beam scanning pattern is adopted. The projections are simulated by use of pixel-driven method [40]. We collect 256 projections uniformly distributed in the range of [0, π].

Thus, the projection data is of size [256, 256] and the image is also of size [256, 256]. For this discrete-to-discrete imaging linear system, the number of unknowns and the number of equations are the same. For the linear system is absolutely consistent, the solution of the corresponding DDcTV model should be accurate enough to reappear the phantom, i.e. the RMSE of the reconstructed image compared with the truth image should be small enough. If the inverse crime may be achieved, then we may think that the imaging system modelling, the optimization problem modelling, the FL-ADMM algorithm derivation and its computer implementation are all correct.

For this reconstruction case, we use Algorithm 3-B (it is used in all reconstructions in Section 3), in which, ɛ = 0, β₁ = β₂ = 1, and the inner iteration number is 50. The gray image has only 256 gray-level. So, if RMSE is less than 1/256 ≈ 3.9 × 10^-3, the reconstructed image and the truth image will be visually the same, which may be the sign of inverse crime. More strictly, we define the sign of inverse crime as that RMSE ≤10^-4.

At iteration 4570, inverse crime is achieved. The reconstructed images and the corresponding profiles are shown in Fig. 3. Three iteration trends are plotted in Fig. 4. From Fig. 3, we may see that the reconstructed image is visually the same with the truth image, and that the vertical-center-line profiles of the reconstructed image and the truth image are completely coincident. This shows that inverse crime is achieved. Now, we may think that the imaging system model, the DDcTV model, the FL-ADMM algorithm and its computer implementation are all correct. In Fig. 4, we plots three iteration trends to observe the iteration behavior. It may be seen that the RMSE of the reconstructed image and the data error between the guessed data and the truth data may go down and down with the increase of iteration number. At iteration 4570, the RMSE has been less than 10^-4, which means the inverse crime is achieved. We may see that these curves still has a descent trend, showing its good convergence performance. From Fig. 4(c), we may see that during the later iteration period, the TV value may go down and down with the iteration marching. When the iteration stops, the TV value of the reconstructed image becomes the same with the truth TV.

Fig. 3

The reconstructed image and profile of the DDcTV-FL-ADMM algorithm in the context of inverse crime. (a) is the truth image, (b) is the reconstructed image, and (c) plots the vertical-center-line profiles of (a) and (b).

Fig. 4

The iteration trends of the DDcTV-FL-ADMM algorithm in the context of inverse crime. (a) is for the RMSE of the reconstructed image. (b) is for the data error, i.e. ∥Au - g₂∥. (c) is for the TV value of the reconstructed image.

3.2 The impact of β selection on convergence rate of the DDcTV-FL-ADMM algorithm

In this section, we evaluate how the penalty parameter, β, impact convergence rate. Here, β = β₁ = β₂.

The simulation phantom is the FORBILD phantom [41] of size [256, 256]. The projection data is of size [256, 100]. The scanning pattern is of parallel beam form. The 100 projections are uniformly distributed in the range of [0, π]. In the DDcTV-FL-ADMM reconstructions, we vary β from 0.01, 0.1, 1, 10, to 100, fix the inner iteration number as 50, and fix the iteration number as 5000 to evaluate their respective convergence rate.

The reconstructed images of different β are shown in Fig. 5, and the convergence curves of different β are shown in Fig. 6. All the reconstructions of different β are stopped at iteration 5000. Thus, more accurate images means faster convergence. From Fig. 5, we may see that β values of 0.1, 1 and 10 may achieve more accurate reconstructions, whereas β values of 0.01 and 100 lead to reconstructions of a certain level of artifacts. From the image at row 2 and column 5, we may see obvious artifacts. This means that the FL-ADMM algorithm of β value of 100 suffer from the slowest convergence rate. This observation may be more clearly seen in Fig. 6. The order of convergence rate of different β from slow to fast is 100, 0.01, 0.1, 1, and then 10. This indicates that an appropriate β value may achieve fast convergence, whereas too large or too small values always lead to too slow convergence. Even that too huge value may lead to a wrong convergence, i.e. the RMSE will converge to a very large value. However, we want to emphasize that the optimal β value is imaging-condition dependent. In this simulation study, the optimal value is 10. But, in other cases, it may be other value. Usually, one may vary the β value with interval of an order of magnitudes, then select the optimal one to achieve the fastest convergence.

Fig. 5

The reconstructed images of the DDcTV-FL-ADMM algorithm with different β. The number above the images indicate the β value. The images on the second row is the zoomed-in images in the red box shown in the first row. The display window is [0,1].

Fig. 6

Plots of RMSE as function of iteration number. The iteration number is 5000. The two axis are both of logarithm form.

3.3 The impact of inner iteration number on convergence rate of the DDcTV-FL-ADMM algorithm

In Algorithm 3-B, i.e. the DDcTV-FL-ADMM algorithm, there is an inner loop process. In this section, we investigate how the inner iteration number impact the convergence rate.

The phantom is still the FORBILD phantom of size [256, 256]. The imaging condition is the same with that of Section 3.2. In the DDcTV-FL-ADMM reconstructions, we vary the inner iteration number from 1, 50, 100, 150, to 200 and fix the iteration number as 5000, fix β as 10, to evaluate their respective convergence rate. Figure 7 shows the reconstructed images and Fig. 8 plots the convergence curves.

Fig. 7

The reconstructed images of the DDcTV-FL-ADMM algorithm with different inner iteration number. The number above the images indicate the inner iteration number. The images on the second row is the zoomed-in images in the red box shown in the first row. The display window is [0,1].

Fig. 8

Plots of RMSE as function of iteration number. The iteration number is 5000. The two axis are both of logarithm form.

From Fig. 7, we may see that the reconstructed images with inner iteration number of 50, 100,150 and 200 all have higher accuracy. However, if the inner iteration number is 1, i.e. if the FL-ADMM algorithm used is Algorithm 3-A, the reconstructed image suffers from serious artifacts. This means that appropriately selected inner iteration number may achieve faster convergence. The standard FL-ADMM algorithm without inner iteration always leads to too slow convergence rate. This observation may be clearly seen in Fig. 8. Though use of larger inner iteration number may achieve faster convergence rate, the inner loop process will take longer time. Thus, one should select an appropriate inner iteration number to achieve fast convergence and fast inner loop computation. In this case, inner iteration number of 50 is the optimal selection. Also, we want to emphasize that the optimal number of inner iteration is imaging-condition dependent. Usually, one may select some values to perform reconstructions and then select the optimal one.

3.4 Comparison with the CP algorithm

CP algorithm has been proposed and widely used in image reconstruction for more than 10 years. We have applied the CP algorithm in EPR imaging and CT. The most important advantage of the CP algorithm is that it may always get closed-form solutions for sub-problems and the finial algorithm instance only involves simple matrix-vector multiplications and some simple operations. Thus, we say the CP algorithm is a fast prototyping tool for optimization based image reconstruction. Similar to CP, the FL-ADMM is also a fast prototyping tool for it may always get closed-form solutions for difficult sub-problems and it has not special demand on system matrix and sparse-transform matrix.

Through the studies in subsections 3.1 to 3.3, we have known the correctness of the FL-ADMM algorithm and have realized how the β selection and inner iteration number impact the convergence rate. In this subsection, we investigate the sparse reconstruction capability of the DDcTV-FL-ADMM algorithm by comparison with the DDcTV-CP algorithm whose pseudocode and some explanations are shown in Appendix 1.

The phantom is a real thoracic CT image of size [256, 256]. The imaging condition is the same with that of Section 3.2. In the DDcTV-FL-ADMM reconstructions, we fix the inner iteration number as 50, fix β as 10, and vary the projection number from 20,40,60,80, to 100. In the DDcTV-CP reconstruction, we set λ = 1 and $ν = \frac{∥ A ∥_{2}}{∥ D ∥_{2}}$ and vary the projection number from 20, 40, 60, 80, to 100.

Figure 9 shows the reconstructed images of the two algorithms and Tables 1 to 3 shows the quantitative results.

Fig. 9

The reconstructed images by the DDcTV-CP and DDcTV-FL-ADMM algorithms. The number above the images indicate the projection number. The text at the left of the images indicate the algorithm used. The images in row 3 and 4 are the enlarged region-of-interest (ROI) images which is indicated by the red rectangle in the right-top image. The red ellipses encircle a fine structure to emphasize observation.

Table 1

RMSE comparison of the FL-ADMM and CP algorithms

	20	40	60	80	100
CP	0.0451	0.0225	0.0158	0.0127	0.0107
FL-ADMM	0.0445	0.0214	0.0145	0.0108	0.0082

Table 2

SSIM comparison of the FL-ADMM and CP algorithms

	20	40	60	80	100
CP	0.800	0.928	0.962	0.975	0.981
FL-ADMM	0.806	0.934	0.966	0.980	0.987

Table 3

PSNR comparison of the FL-ADMM and CP algorithms

	20	40	60	80	100
CP	26.92	32.96	36.02	37.95	39.38
FL-ADMM	27.02	33.40	36.80	39.34	41.67

The two algorithms are solving the same optimization model, DDcTV model. Theoretically speaking, model decides the solution, whereas the solving algorithm only decides the convergence rate and path. However, practically speaking, different solving algorithms cannot achieve the absolutely the same solution because the model-solution may be a solution-set, the algorithm-parameters cannot be selected absolutely optimally and there always exists numerical error induced by the computer floating point error. If the two solvers are both convergent accurate solvers, their reconstruction accuracy should be both very high and should be very similar. The CP algorithm has been used in CT and other imaging modalities for more than ten years and has been deeply explored. In this comparison, we may regard the CP algorithm as the state-of-the-art (SOTA) algorithm.

From Fig. 9, we may see that the image quality is better and better with the increase of the projection number for both algorithms. The reconstructed images via 100 projection by the two algorithms are both almost the same with the truth image. With the decrease of the projection number, the reconstructed images degrade gradually. The reconstructed images via 20 projections become too smooth. Their ROI images have lost the fine structure. For the two algorithms, they are both accurate solvers of the DDcTV model and both have capability to perform accurate sparse reconstructions. Comparing the two algorithms by Fig. 9, we almost cannot see the difference between each other. This shows the FL-ADMM may achieve comparable reconstruction quality with the SOTA CP algorithm according to the visual observations.

Tables 1 to 3 shows the quantitative comparison results of the two algorithms via metric of RMSE, SSIM and PSNR, respectively. RMSE means root mean square error, SSIM means structural similarity index measure, and PSNR means peak signal to noise ratio. From them, we may see that the two algorithms have very close accuracy, which is visually validated by Fig. 9. However, we may also see that the FL-ADMM algorithm is always a little bit better than the CP algorithm. This is because the optimal β selection in FL-ADMM is easier than the optimal selections of λ and ν in CP algorithm. For β, one may just search the optimal value via several reconstructions of different values. However, for the optimal selections of λ and ν, it would be more difficult for the search is in the range of a two dimensional plane. In image reconstructions, we usually search these values at tenfold intervals. Clearly, one-algorithm-parameter selection is much easier than two-algorithm-parameters selections. Viewed from this perspective, FL-ADMM algorithm is superior to the CP algorithm.

Both qualitative and quantitative evaluations show that the FL-ADMM algorithm may accurately solve the DDcTV model. In fact, we have also evaluated the FL-ADMM algorithms for ucTV and TVcDM models. To be brief, we just show the results of DDcTV-FL-ADMM algorithm. All the experiments on Shepp-Logan, FORBILD and real-CT-image phantoms show that FL-ADMM algorithm is a simple, effective, convergent and universal solving algorithm for optimization models in image reconstruction.

4 Discussions

In this work, we propose a novel ADMM algoithm, FL-ADMM algorithm, which may be used as a prototyping tool for optimiaiton model. The key operation is to linearize all the quadratic terms so that the corresponding sub-problem may get a simple close-form. Further, we propose the fast FL-ADMM algorithm by use of the inner iteration technique. We have derived three FL-ADMM algorithm instances for three TV models, ucTV, DDcTV, and TVcDM. Further, we validate and evaluate the correctness and sparse recontruction capability of the DDcTV-FL-ADMM algorithm. Also, we analyze how the penalty parameter and the inner iteration number impact the convergence rate.

In optimization based image reconstruction, especially in TV-type norm based image reconstruciton, the ADMM algorithm always has a problem that one sub-problem has not simple close-form solution. Usually, people has two choices to solve this problem. One is to use gradient desent algorithm to solve this sub-problem. However, the step-size selection is difficult. If one uses the line search method to select the optimal step-size for each iteration, it needs too long time, especially when the imaging model is large-scale. The other one is to use FFT technique. In fact, why FFT may be used is because the sparse transform matrix and the system matrix may both be regarded as a convolution operation. So, this method has not universality. Once a sparse transform cannot be regarded as a convolution operation, this method loses efficacy.

These difficuties of ADMM motivated this work whose aim is to devise a method to simplify the implementation of the ADMM algoirthm. Motivated by the linearization technique of the L-ADMM algorithm, we propose the FL-ADMM algorithm and propose its acceleration version.

In Section 2, the standard FL-ADMM algorithms are named Algorithm-A, whearas the acclerated FL-ADMM algorithm are named Algoritm-B. For example, for the DDcTV-FL-ADMM algorithm, the standard algorithm is Algorithm 3-A, whereas its improved, accelerated version is Algorithm 3-B. In fact, the accelerated FL-ADMM algorithm uses a special gradient desenct algorithm. In Algorithm 3-B, Equation (4.3) is, in fact, a gradient desent equation. But, very importantly, the step-size is $\frac{1}{s_{1} + s_{2}}$ , which may be calculated before the whole iteration process. Thus, compared with the classical gardient descent algorihtm, this proposed method will be much faster for it doesnot need the time-consuming and complicated step-size search via line search technique.

Compared to the ADMM algorithm using FFT, this proposed algoirthm has universality for it does not need that the sparse transform matrix and the system maxtrix may be both regarded as a convolution operation.

Though this proposed algorithm has the two advantages over gradient descent based and FFT based ADMM algorithms mentioned above, it still has two disadvantages. One is the convergence rate. By varying the penalty parameter and the inner iteration number, we may actually get a comparatively high convergence rate. However, according to systematical comparison with the ASD-POCS and CP algorithms, we find that the FL-ADMM algorithm and the CP algorithm have similar convergence rate and are both slower than the ASD-POCS algorithm. Thus, the acceleration of FL-ADMM should be studied in the future. The other one is that we still do not know why the mathematically-derived, standard FL-ADMM without inner iteration is so slow. When we observed that the close-form solution for the sub-problem on data-fidelity term, for examle, Equation (52) in Algorithm 3-A, is, in fact, a gradient descent step, we realize that the slow convergence may be because Equation (52) only runs one time of gradient descent. Thus, we propose to accelerate the standard FL-ADMM algorithm by letting the gradient descent steps run more times/iterations. Though, we use inner iteration to perform gradient descent here, it is different from the direct use of gradient descent algorithm to solve the corresponding sub-problem, which needs time-consuming line search to select the optimal step-size for each iteration. However, as the CP algorithm, the FL-ADMM algorithm is a good prototyping tool for optimization model. If one design a new optimization model, one may use the FL-ADMM algorithm framework to derive the corresponding algorithm instance. This process will be very fast for the FL-ADMM algorithm instance is easy to derive and implement. Compared to CP algorithm, this algorithm derivation does not need the complicated calculation of convex conjugate. Compared to ASD-POCS algorithm, people do not need set up so many algorithm-parameters empirically. Compared to the standard ADMM algorithm, it may always get close-form solution for each sub-problem because of the full application of linearization technique. Also, the computer implementation of the FL-ADMM algorithm is easy for it usually just involve matrix-vector multiplication and some simple operations, for example shrinkage and projection operations.

5 Conclusions

The conclusions of this work may be summed up briefly as the following points.

The FL-ADMM algorithm is a universal, simple, effective, and accurate solver of convex optimiztion model, no matter it is unconstrained or constrained.

The FL-ADMM algoirthm improves the traditional ADMM algorithm by avoiding the step-size search for gradient desecnt and the special demands to the sparse transfrom and the system matrix for use of FFT technique.

The penalty parameter in this proposed algoritm may impact the convergence rate. Too large or small values both lead to slow convergence.

The inner iteration number in this algorithm may impact the convergence rate. One may select the optimal one by running several reconstrutions with different inner iteration number. According to our experience, 30-50 is usually efficient. But, we note that the optimal inner iteration number is imaging-condition dependent.

In the future, the FL-ADMM algorithm should be focused on its acceleration technique which may borrow the ideas in accelertion of the traditional ADMM algorithm.

Footnotes

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under grant 62071281, by the Local Science and Technology Development Fund Project Guided by the Central Government under grant YDZJSX2021A003, and by the NIH under grants P41 EB002034, R01 CA098575, and P30 CA014599.We are very thankful to Prof. Bingsheng He with Southern University of Science and Technology, Shenzhen, Guangdong, China, for his constructive discussions with us.

Appendix 1.

DDcTV-CP algorithm

The DDcTV model can be formulated as

(A1) $u^{*} = arg min_{u} ∥ u ∥_{TV}, s . t . ∥ g - Au ∥_{2} \leq ɛ,$ where, ɛ is a model parameter, referred to as the data tolerance parameter, which governs the acceptable noise level and system inconsistency in the reconstructed image. To balance the relative magnitude of data fidelity term and TV regularization term, we introduce two balance parameters for (A1) and formulate a new DDcTV model as

(A2) $u^{*} = arg min_{u} ν ∥ u ∥_{TV}, s . t . λ ∥ g - Au ∥_{2} \leq λ ɛ,$ where, λ and ν are the two balance parameters, which do not impact the solution of (A1) but may impact the convergence rate of the solving algorithm when solving (A2).

By use of the CP algorithm framework, we may derive the CP algorithm instance to solve (A2), which is shown in Algorithm A1.

In Algorithm A1, $u, \bar{u} \in R^{N}$ , p, a ∈ R^M, q, c ∈ R^2N. In line 8, 1_I ∈ R^N is a ‘1’ vector. In line 1 and 2, the calculation algorithm of ∥A_∥2, ∥D ∥ ₂ and L is referred to ref. [] of this paper.

References

Pan

, Sidky

E.Y.

, Vannier

, Why do commercial CT scanners still employ traditional, filtered back-projection for image reconstruction? Inverse Problems 25(12) (2009), 123009.

Sidky

E.Y.

, Pan

, Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization, Physics in Medicine & Biology 53(17) (2008), 4777–4807.

Qiao

, Zhang

, Pan

, et al., Optimization-based image reconstruction from sparsely sampled data in electron paramagnetic resonance imaging, Journal of Magnetic Resonance 294 (2018), 24–34.

Ben Yedder

, Cardoen

and Hamarneh

, Deep learning for biomedical image reconstruction: a survey, Artificial Intelligence Review 54(1) (2021), 215–251.

Sidky

E.Y.

, Lorente

, Brankov

J.G.

, et al., Do CNNs solve the CT inverse problem? IEEE Transactions on Biomedical Engineering 68(6) (2020), 1799–1810.

Zhang

, Han

, Pearson

, et al., Artifact reduction in short-scan CBCT by use of optimization-based reconstruction, Physics in Medicine & Biology 61(9) (2016), 3387–3406.

Zhao

, Haldar

J.P.

, Brinegar

, et al., Low rank matrix recovery for real-time cardiac MRI. pp. 996–999.

Han

, Bian

, Eaker

D.R.

, et al., Algorithm-enabled low-dose micro-CT imaging, IEEE Transactions on Medical Imaging 30(3) (2010), 606–620.

Sidky

E.Y.

, Kao

C.-M.

, Pan

, Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT, Journal of X-ray Science and Technology 14(2) (2006), 119–139.

10.

Liu

, Ma

, Fan

, et al., Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction, Physics in Medicine & Biology 57(23) (2012), 7923–7956.

11.

Chen

, Jin

, Li

, et al., A limited-angle CT reconstruction method based on anisotropic TV minimization, Physics in Medicine & Biology 58(7) (2013), 2119–2141.

12.

Chan

, Marquina

, Mulet

, High-order total variation-based image restoration, SIAM Journal on Scientific Computing 22(2) (2000), 503–516.

13.

, Qiao

, Wang

, et al., Study of CT image reconstruction algorithm based on high order total variation, Optik 204 (2020), 163814.

14.

Liang

, Wang

, Chang

, et al., Sensitivity encoding reconstruction with nonlocal total variation regularization, Magnetic Resonance in Medicine 65(5) (2011), 1384–1392.

15.

Rigie

D.S.

, La Rivière

P.J.

, Joint reconstruction of multi-channel, spectral CT data via constrained total nuclear variation minimization, Physics in Medicine & Biology 60(5) (2015), 1741–1762.

16.

Zhang

, Chen

, Xia

, et al., Directional-TV algorithm for image reconstruction from limited-angular-range data, Medical Image Analysis 70 (2021), 102030.

17.

Chambolle

, Pock

, A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of Mathematical Imaging and Vision 40(1) (2011), 120–145.

18.

Sidky

E.Y.

, Jørgensen

J.H.

, Pan

, Convex optimization problem prototyping for image reconstruction in computed tomography with the Chambolle–Pock algorithm, Physics in Medicine & Biology 57(10) (2012), 3065–3091.

19.

Chan

R.H.

, Tao

, Yuan

, Constrained total variation deblurring models and fast algorithms based on alternating direction method of multipliers, SIAM Journal on imaging Sciences 6(1) (2013), 680–697.

20.

Wang

, Yang

, Yin

, et al., A new alternating minimization algorithm for total variation image reconstruction, SIAM Journal on Imaging Sciences 1(3) (2008), 248–272.

21.

Huang

, Ng

M.K.

, Wen

Y.-W.

, A fast total variation minimization method for image restoration, Multiscale Modeling & Simulation 7(2) (2008), 774–795.

22.

Goldstein

, Osher

, The split Bregman method for L1-regularized problems, SIAM Journal on Imaging Sciences 2(2) (2009), 323–343.

23.

Osher

, Mao

, Dong

, et al., Fast linearized Bregman iteration for compressive sensing and sparse denoising, Communications in Mathematical Sciences 8(1) (2010), 93–111.

24.

Xiao

Y.-H.

, Song

H.-N.

, An inexact alternating directions algorithm for constrained total variation regularized compressive sensing problems, Journal of Mathematical Imaging and Vision 44(2) (2012), 114–127.

25.

Yang

, Zhang

, Yin

, A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data, IEEE Journal of Selected Topics in Signal Processing 4(2) (2010), 288–297.

26.

, Yin

, Jiang

, et al., An efficient augmented Lagrangian method with applications to total variation minimization, Computational Optimization and Applications 56(3) (2013), 507–530.

27.

Qiao

, Redler

, Epel

, et al., 3D pulse EPR imaging from sparse-view projections via constrained, total variation minimization, Journal of Magnetic Resonance 258 (2015), 49–57.

28.

Qiao

, Liang

, Tang

, et al., Optimization-based image reconstruction from fast-scanned, noisy projections in EPR imaging, IEEE Access 7 (2019), 19590–19601.

29.

Qiao

, Redler

, Epel

, et al., A balanced total-variation-Chambolle-Pock algorithm for EPR imaging, Journal of Magnetic Resonance 328 (2021), 107009.

30.

Qiao

, Redler

, Qian

, et al., Investigation of the preconditioner-parameter in the preconditioned Chambolle-Pock algorithm applied to optimization-based image reconstruction, Journal of X-ray Science and Technology 26(3) (2018), 435–448.

31.

Chan

R.H.

, Tao

, Yuan

, Linearized alternating direction method of multipliers for constrained linear least-squares problem, East Asian Journal on Applied Mathematics 2(4) (2012), 326–341.

32.

Yang

, Yuan

, Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization, Mathematics of Computation 82(281) (2013), 301–329.

33.

Fang

E.X.

, He

, Liu

, et al., Generalized alternating direction method of multipliers: new theoretical insights and applications, Mathematical Programming Computation 7(2) (2015), 149–187.

34.

Ouyang

, Chen

, Lan

, et al., An accelerated linearized alternating direction method of multipliers, SIAM Journal on Imaging Sciences 8(1) (2015), 644–681.

35.

Nien

, Fessler

J.A.

, Relaxed linearized algorithms for faster X-ray CT image reconstruction, IEEE Transactions on Medical Imaging 35(4) (2015), 1090–1098.

36.

Hager

W.W.

, Zhang

, Inexact alternating direction methods of multipliers for separable convex optimization, Computational Optimization and Applications 73(1) (2019), 201–235.

37.

Liu

, Shen

, Gu

, Linearized admm for nonconvex nonsmooth optimization with convergence analysis, IEEE Access 7 (2019), 76131–76144.

38.

Sidky

E.Y.

, Kraemer

D.N.

, Roth

E.G.

, et al., Analysis of iterative region-of-interest image reconstruction for x-ray computed tomography, Journal of Medical Imaging 1(3) (2014), 031007.

39.

Kaipio

, Somersalo

, Statistical inverse problems: discretization, model reduction and inverse crimes, Journal of Computational and Applied Mathematics 198(2) (2007), 493–504.

40.

Qiao

, Redler

, Gui

, et al., Three novel accurate pixel-driven projection methods for 2D CT and 3D EPR imaging, Journal of X-ray Science and Technology 26(1) (2018), 83–102.

41.

, Noo

, Dennerlein

, et al., Simulation tools for two-dimensional experiments in x-ray computed tomography using the FORBILD head phantom, Physics in Medicine & Biology 57(13) (2012), N237–N252.