Low dose CT reconstruction via L 1 norm dictionary learning using alternating minimization algorithm and balancing principle

Abstract

Excessive radiation exposure in computed tomography (CT) scans increases the chance of developing cancer and has become a major clinical concern. Recently, statistical iterative reconstruction (SIR) with l₀-norm dictionary learning regularization has been developed to reconstruct CT images from the low dose and few-view dataset in order to reduce radiation dose. Nonetheless, the sparse regularization term adopted in this approach is l₀-norm, which cannot guarantee the global convergence of the proposed algorithm. To address this problem, in this study we introduced the l₁-norm dictionary learning penalty into SIR framework for low dose CT image reconstruction, and developed an alternating minimization algorithm to minimize the associated objective function, which transforms CT image reconstruction problem into a sparse coding subproblem and an image updating subproblem. During the image updating process, an efficient model function approach based on balancing principle is applied to choose the regularization parameters. The proposed alternating minimization algorithm was evaluated first using real projection data of a sheep lung CT perfusion and then using numerical simulation based on sheep lung CT image and chest image. Both visual assessment and quantitative comparison using terms of root mean square error (RMSE) and structural similarity (SSIM) index demonstrated that the new image reconstruction algorithm yielded similar performance with l₀-norm dictionary learning penalty and outperformed the conventional filtered backprojection (FBP) and total variation (TV) minimization algorithms.

Keywords

alternating minimization algorithm balancing principle low dose CT reconstruction

1 Introduction

The x-ray computer tomography (CT) scanner has been extensively used in medical diagnosis. However, it has been shown that frequent use of conventional CT could significantly increase the chance of developing cancer and other diseases due to the high radiation exposure [1]. Therefore, how to reduce radiation dose exposure while maintain high image reconstruction quality has become a major concern in the CT reconstruction field [2 –6]. To the best of our knowledge, there are two ways to reduce the radiation dose: one is to lower the mAs/kVp levels in data acquisition protocols, and the other is to reduce the number of projection views. In these cases, conventional reconstruction algorithms, such as filtered backprojection (FBP), will result in severe image quality degradation. Therefore, it is highly desirable to develop a new technique to reconstruct high-quality CT image from noisy and under sampled projections. Recently, the compressed sensing (CS) frame provides a general solution in which a high-quality signal/image can be reconstructed from far fewer measurements than what is usually required by the Nyquist sampling theorem provided that the signal is known to be sparse [7, 8]. For the CT images that reflect the anatomical information in the human body, there are many structures and hence they can be sparsely represented well. In recent years, a finite difference operator, i.e., total variation (TV), has been widely used as such sparsifying transform in CT reconstructions and shown the tremendous power in solving under sampling projection reconstruction [9 –21], limited angle reconstruction [9], interior reconstruction [22, 23] for piecewise constant or homogeneous images. In the same time, other forms of sparsifying transformation, such as the Harr transform [24], tight frame transform [25], S-transform [26] total generalized variation [27], fractional-order TV [28] and Gamma potential function [29] have also been introduced in the context of under-sampled projection reconstruction.

Statistical iterative reconstruction (SIR) based algorithms have been used for a long time in x-ray CT for their ability to reduce statistical noise in reconstructed images. Very recently, a lot of authors [12 , 23] have evaluated the TV regularization based CS algorithm in the SIR framework. However, images reconstructed with TV regularization may lose some detailed features and contain blocky artifacts in the cases of noisy projections. To improve the image quality, a sparse representation constraint in terms of redundant dictionary was introduced to substitute the TV term in the CS based SIR framework [30]. In such an approach, the CT image is reconstructed by minimizing the objective function consisting of a dictionary learning based sparse representation term and a data fidelity term posed by the X-ray projections. Experimental results in this study show that the DL based method outperforms the one with TV regularization in terms of recovering fine structures. Nonetheless, there are still two issues have not been solved. First, the sparse regularization term adopted in this approach is l₀-norm, which gives the sparsest regularization to achieve the best reconstruction quality but cannot guarantee the global convergence of the proposed algorithm. It should be interesting that when substituting the l₀-norm term with a convex one, such as the l₁-norm, how much difference in quality between the reconstructed images? If the quality difference is very limited, it should be suggested to use the l₁-norm regularization to avoid non-convex property in this framework. Secondly, in the implementation of the iterative computation with regularization constraint, the regularization parameter (balancing the data fitting term and the regularization term) is empirically selected because there is less prior information about the image content and the error in the measured projection data in general. In practice, to find the optimal regularization parameter, one often needs to try a large number of regularization parameters. This will require large computation time and cost. Hence, finding a way to fairly select the regularization parameter will be naturally one of the major motivations of this study.

In the framework of the l₁-norm based computation, there exists a significant amount of researches on the development of the strategies for selecting regularization parameters, such as the Morozov’s discrepancy principle, L-curve method [31, 32] and generalized cross validation (GCV) method [33 –35]. However, less of them have been utilized for practical applications due to the fact that theoretically justified methods in those works often require unrealistic assumptions. For example, Morozov’s discrepancy principle [36] is a posteriori strategy to choose the regularization parameter which require accurate estimate of the noise level. Unfortunately, the noise level is usually unavailable in practice. The L-curve method is mainly based on the plot of the norm of the regularized solution versus that of the corresponding data-fitting residual. The characteristic L-shaped “corner” of the plot is chose as the optimal regularization parameter. However, there is currently no rigorous justification of the L-curve method. Moreover, the “corner point’ is not a well-defined notion [37]. The GCV method is based on a priori knowledge of the structure of the input error, which means that the errors in the measurement data can be considered to be white noise. However, it is well known that the GCV method tends to under-smooth the solution. Sometimes, the GCV function has multiple minimum solutions [33].

Recently, the balancing principle derived from model function approach for determining the regularization parameter has been studied. The basic idea of this principle is to approximate the minimum of regularizing functional by an appropriate model function with the regularization parameter as a variable. Then, the determination of the regularizing parameter from the balance principle is approximated from a new equation satisfied by the model function. The advantage of this method is that the approximation of regularizing parameter can be solved from a simplified nonlinear equation and some convergence property for the regularization parameter can be set up. From the numerical point of view, this approach introduces a new way to optimize regularizing parameter with relatively small amount of computation because the approximate nonlinear equation with respect to the regularization parameter has an explicit expression.

The novelties of the proposed algorithm are as follows. Firstly, l₁-norm dictionary learning penalty is introduced into SIR framework for low dose CT reconstruction to obtain superior reconstruction performance than conventional l₁-norm based TV penalty. Secondly, an alternating minimization algorithm is developed to minimize the associated objective function, which transforms CT image reconstruction problem into a sparse coding subproblem and an image updating subproblem. Thirdly, balancing principle derived from model function approach is applied in the image updating subproblem to choose regularization parameter adaptively, which does not require knowledge of the noise level.

The remainder of this paper is organized as follows. Section 2 describe the SIR, sparse coding and dictionary learning, SIR framework with l₁-norm dictionary learning penalty, the alternating minimization algorithm for proposed reconstruction framework. Section 3 illustrates the performance of the proposed reconstruction approach via realistic CT projection data and two numerical simulated experiments. Discussion and conclusions are presented in section 4.

2 Materials and method

2.1 Statistical iterative reconstruction

The transmission x-ray CT can be described in a discrete form using the following statistical model: $y_{i} \sim Poisson (b_{i} 1 pt \exp (- p_{i})), i = 1, \dots I,$ (1) where y = (y₁, …, y_I) is a vector represents the raw detector measurements with the corresponding blank scan b = (b₁, … b_I), $p_{i} = \sum_{j = 1}^{J} a_{ij} μ_{j} = [A μ]_{i}$ is the projection data that measures the integral of the x-ray linear attenuation coefficients, μ = (μ₁, ⋯ , μ_J) ^T is a vector representing the attenuation coefficients of the imaged object, A ={ a_ij } is the I × J system matrix that models the imaging system, the element a_ij of the system matrix represents the length of the intersection between the i^th projection path and the j^th pixel which is determined by the focal spot position and the detector pixel position, and the symbol T denotes the transpose operator.

Based on the previous work [23, 30], the SIR is obtained by minimizing the following cost function: $Φ (μ) = {\sum_{i = 1}^{I} \frac{w_{i}}{2} ([A μ]_{i} - p_{i})^{2} + β U (μ)},$ (2) where w_i respects the maximum likelihood estimated by the inverse of the variance of the projection measurement, U ( μ) is known as the regularization term that presents prior knowledge of the image object and β is the regularization parameter that control the power of the regularization.

2.2 Sparse coding and dictionary learning

Let μ ∈ R^W×H denote the image to be reconstructed. Let $R_{s} = {r_{nj}^{s}} \in R^{N \times J}$ be the sth patch extraction operator. The operator is a matrix of all zeros except for a one in each row that exacts vectorized $\sqrt{N} \times \sqrt{N}$ image patch R_s μ from the image. Let D ∈ R^N×M be a dictionary of M atoms represented as columns d_k ∈ R^N. The sparse representation of R_s μ over D is to find a sparse vector α = [α₁ ; … ; α_M] (i.e. most of the coefficients in α are close to zero) such that R_s μ ≈ Dα. The sparse vector α can be obtained by solving the following minimization problem: $arg min_{α} {{∥ R_{s} μ - D α ∥}_{2}^{2} + λ {∥ α ∥}_{p}}$ (3) where λ is a regularization parameter. Because the l₀-norm (p = 0) minimization is an NP-hard problem, the l₁-norm minimization (p = 1), as the closer convex function to l₀-norm minimization, is widely employed in sparse representation, and it has been shown that l₀ and l₁ minimizations are equivalent if the solution is sufficiently sparse [38]. In this paper, we adopt the l₁-norm minimization.

A critical ingredient of sparse representation modeling is the choice of dictionary. In most cases, to learn a good dictionary, the training patches are usually extracted from the high-quality sample images of the same type to be reconstructed image. Given a set of training patches X = [x₁, … x_S] ∈ R^N×S, the goal of dictionary learning is to seek a dictionary that make each patch in the training set be sparsely represented by the atoms of this dictionary, that is, jointly optimize the dictionary D and the representation coefficient matrix α = [ α₁, … α_S] such that x_i ≈ Dα_i and ∥α_i ∥ ₁ ≤ T. This can be formulated by the following optimization problem: $arg min_{α, D} {{∥ X - D α ∥}_{2}^{2} + λ {∥ α ∥}_{1}}$ (4)

A fast online learning technique has been proposed to solve such a dictionary learning problem [39], which is an iterative method that alternates between sparse coding of the image patches set based on the current dictionary and a process of updating the dictionary D. In addition, the optimization toolbox named Sparse Modeling Software for solving the above dictionary learning problem has been distributed [40]. In this study, we use the toolbox to train a dictionary.

2.3 DL based SIR with l₁-norm minimization

The sparse representation in term of redundant patch-based dictionary as the prior information is introduced into the framework of SIR for image reconstruction. Therefore, the CT reconstruction problem can be stated as the following minimization problem with the l₁-norm prior: $arg min_{α, μ} {\sum_{i = 1}^{I} \frac{w_{i}}{2} ([A μ]_{i} - p_{i})^{2} + β \sum_{s = 1}^{S} ({∥ R_{s} {μ - D α}_{s} ∥}_{2}^{2} + λ {∥ α_{s} ∥}_{1})} .$ (5)

2.4 Algorithm development

Mathematically, equation (5) contains the following two sub-problems, which can be solved in an alternation manner of the two following sub-problems.

Sub-problem A: $arg min_{α} {\sum_{s = 1}^{S} ({∥ R_{s} μ - {D α}_{s} ∥}_{2}^{2} + λ {∥ α_{s} ∥}_{1})} .$ (6)

This sub-problem A is called sparse representation, which is closely related to one of the following convex constrained optimization problems: $min_{α} {∥ α_{s} ∥}_{1} s . t . {∥ R_{s} μ - {D α}_{s} ∥}_{2}^{2} \leq ɛ,$ (7) $min_{α} {∥ R_{s} μ - {D α}_{s} ∥}_{2}^{2} s . t . {∥ α_{s} ∥}_{1} \leq t,$ (8) where ɛ and t are nonnegative real parameters. Using convex analysis, it can be shown that a solution of (7) is α_s = 0, or else is a minimization of (6), for some λ > 0. Similarly, a solution of (8) for any t ≥ 0 is also a minimization of (6) for some λ ≥ 0. These claims have been proven [41] (referred as Theorem 27.4) Either Equation (7) or Equation (8) is an l₁-regularizied linear least-square problem. A number of algorithms have been proposed to solve this type of problem, such as coordinate descent algorithm [42], LARS-Lasso algorithm [39]. In this paper, we adopt the LARS algorithm to solve Equation (7).

Sub-Problem B: $arg min_{μ} {\sum_{i = 1}^{I} \frac{w_{i}}{2} ([A μ]_{i} - p_{i})^{2} + β \sum_{s = 1}^{S} {∥ R_{s} μ - {D α}_{s} ∥}_{2}^{2}} .$ (9)

This sub-problem aims at updating an image with respect to a fixed dictionary and associated sparse representation, which can be solved by the separate parabolic surrogate method [43], Equation (9) can be iteratively solved as $μ_{j}^{k} = μ_{j}^{k - 1} - \frac{\sum_{i = 1}^{I} (a_{ij} w_{i} ({[{A μ}^{k - 1}]}_{i} - p_{i})) + 2 β \sum_{s} \sum_{n = 1}^{N} r_{nj}^{s} ({[R_{s} μ^{k - 1}]}_{n} - {[{D α}_{s}]}_{n})}{\sum_{i = 1}^{I} (a_{ij} w_{i} \sum_{m = 1}^{J} a_{im}) + 2 β \sum_{s} \sum_{n = 1}^{N} r_{nj}^{s} \sum_{m = 1}^{J} r_{nm}^{s}}, j = 1, \dots J,$ (10) where the superscript k = 1, … K represents the iteration number.

2.5 Parameter selection of the proposed reconstruction algorithm

As can be seen from the above iteration process, there are two parameters in the iterative process. One is the error level in sparse representation step, i.e. the error ɛ, and the other is the regularization parameter β which only acts on the image update step as formulated by Equation (10). The sparse coding error ɛ represents the tolerance of the difference between the reconstructed and original image which is determined by the image noise level and the property of the dictionary. Previous research has proven that the sparse coding error ɛ of each image patch is very small [30]. In this study, we choose 1 × 10^-7 as sparse error value for all cases. It can be observed that the proposed algorithm using the chosen sparse coding error work well in the experiments of this study.

During the preliminary experimental studies, we have found out that if the value of β in Equation (10) is fixed during iterative reconstruction, the quality of the reconstructed image becomes sensitive to the value of β. For example, if it is selected as a relatively large value, some fine image details would probably be lost. On the other hand, if it is selected as a relatively small value, the effect of sparse coding is weakened and the proposed method behaves similar to conventional SIR method and the obtained image contains some noise. To get the best reconstructed CT image quality, we manually chose the regularization parameter to adjust the relative weight between the data fidelity term and the regularization term. It is well known that the optimal value of this parameter is case dependent: any change in noise level, the number of projections or the phantom image may lead to different optimal regularization parameter. The selection will be time-consuming for iterative reconstruction algorithm.

The balancing principle derived from the model function approach was first proposed to automatically choose the regularization parameter of l₁-norm data fitting term in the image processing field [44]. Very recently, it has also been applied in the multispectral bioluminescence tomography [45]. In these applications, it has been demonstrated that the balancing principle derived from the model function approach is an effective automatic regularization parameter choosing method. It should be noted that the data fitting term in this paper is in the form of penalized weighted least-squares, which can be rewritten as the l₂-norm data fitting term. To develop a unified method of choosing the adaptive regularization parameter in different situations for low-dose CT reconstruction via dictionary learning, here we apply the balance principle to adaptively choose the regularization parameter into the alternating minimization algorithm. Before going into the details of the balancing principle, we firstly need to rewrite the objection function Equation (9) as follow: $F (β) = \sum_{i = 1}^{I} \frac{w_{i}}{2} ([{A μ}_{β}]_{i} - p_{i})^{2} + β \sum_{s = 1}^{S} {∥ R_{s} μ_{β} - {D α}_{s} ∥}_{2}^{2},$ (11)

where the solution μ_β denotes the optimization value μ for each fixed β. The idea of balance principle for regularization parameter choice method is to balance the data fitting term $φ (β) = \sum_{i = 1}^{I} \frac{w_{i}}{2} ([{A μ}_{β}]_{i} - p_{i})^{2}$ and the penalty term $β F^{'} (β) = β \sum_{s = 1}^{S} {∥ R_{s} μ - {D α}_{s} ∥}_{2}^{2}$ in Equation (11). Choosing the best regularization parameter β^* based on the balance principle is equal to solve the following equation with respect to β [44] $(σ - 1) φ (β^{*}) = β^{*} F^{'} (β^{*}) .$ (12)

where σ > 1 control relative weight of these two terms.

To find a solution of the balancing equation, Equation (12) is rewritten as $F (β^{*}) = σ (F (β^{*}) - β^{*} F^{'} (β^{*})) .$ (13)

And consider a fixed point iteration, where β^k+1 is chosen as the solution of $F (β^{k + 1}) = σ (F (β^{k}) - β^{k} F^{'} (β^{k})) .$ (14)

To compute the solution, model function approach is adopted to determine the regularization parameter, which locally approximates the function F (β) by rational polynomials. We adopt a model function of the form: $m (β) = b + \frac{s}{t + β} .$ (15)

Note that when s → 0 or β→ ∞, according to Equation (15) b is obtained easily. The parameters s and t are determined by the following interpolation conditions: $m (β) = F (β), m^{'} (β) = F^{'} (β),$ (16) which together with the definition of m (β) give $b + \frac{s}{t + β} = F (β), - \frac{s}{(t + β)^{2}} = F^{'} (β) .$ (17)

More precisely, the parameters s and t are computed according to following equation: $s^{k} = - \frac{(b - F (β^{k}))^{2}}{F^{'} (β^{k})}, t^{k} = \frac{b - F (β^{k})}{F^{'} (β^{k})} - β^{k} .$ (18)

Solving for β^k+1 in m^k (β^k+1) = σ (F (β^k) - β^kF′ (β^k)), we have ${\hat{m}}^{k} = F (β^{k}) - β^{k} F^{'} (β^{k}),$ (19) $β^{k + 1} = \frac{s^{k}}{σ {\hat{m}}^{k} - b} - t^{k} .$ (20)

Finally, the flowchart of the proposed algorithm is summarized as in Table 1.

Table 1

The flowchart of the proposed algorithm

Global Dictionary learning

1. Prepare a high-quality image, which is reconstructed from complete and normal dose projections;

2. Extract patches from the high-quality image;

3. Construct a dictionary using the fast online dictionary learning technique.

Image reconstruction algorithm

Given ɛ > 0, σ > 1 and K

1. Initialize μ⁰α⁰β⁰ = 0 and set k = 0;

2. Solve the optimization problem Equation (8) for α^k with a Lasso algorithm;

3. Update the reconstructed image μ^k according to Equation (10);

4. Compute F (β^k) and F′ (β^k) according to Equation (11);

5. Update s^k and t^k according to Equation (18);

6. Construct the model function

{\hat{m}}^{k}

by substituting s^k and t^k according to Equation (19);

7. Update β^k+1 based on Equation (20);

8. STOP if k = K; otherwise set k = k + 1, GoTo 2;

9. When the stop criterion is met, return μ^k, β^k+1 as the regularized solution and regularization parameter, respectively.

3 Results

In this section, a series of experiments were designed to demonstrate the performance of the proposed algorithm. First, low dose and under sampled raw projections of a sheep lung perfusion were used to validate the performance of the proposed algorithm. Second, to validate the robustness of the proposed algorithm against different noise levels, simulated noisy projections were used with different noise levels based on the sheep lung CT perfusion examples. The noise levels are 1×10⁵, 5×10⁴ and 1×10⁴ photons per detector element. Third, chest numerical simulation studies were designed to evaluate the performance of the proposed algorithm. In the following, the proposed DL based SIR method is denoted by DLSIR-BPMF, we also perform the TV based SIR [23] (denoted by TVSIR) and global dictionary learning based SIR [30] (GDSIR) algorithm for comparison.

Balancing principle used in the proposed method can adaptively choose the regularization parameter β with a clear physical meaning that balances the data fitting term and the penalty term. However, another parameter σ is also introduced. There are some literatures on how to select this parameter [46, 47]. Per our experiments, the existing selection methods are not suitable to the minimization problem of this study shown in Equation (5). To investigate possible selection rule for this minimization problem, we took great deal of experiments for each case of reconstruction by varying σ in a range and the one resulting in best image quality is selected as the optimal parameter. By studying the optimal selections of different reconstruction cases with different image contents, noise levels, and views, we would become conscious of a preliminary rule of this selection.

3.1 Realistic sheep lung study

In this sheep lung perfusion study, we scanned an anesthetized sheep on a SIEMENS Somatom Sensation 64-Slice CT scanner in a circular cone-beam scanning mode at normal and low does respectively. A scan protocol was developed for low-dose studies with ECG gating: time point 1 for a normal X-ray dose scan (100 kV/150 mAs) before a contrast agent injection, and time points 2–21 for low-dose scans (80 kV/17 mAs) after the contrast agent injection. We extracted all the sinograms of the central slice which were in fan-beam geometry. The radius of the trajectory was 57 cm. 1160 projections were uniformly collected over a 360° range. For each projection, 672 detector elements were equi-angularly distributed defining a field of view (FOV) of 25.05 cm radius.

3.1.1 Global dictionary learning

For global dictionary learning used in DLSIR-BPMF and GDSIR, a baseline image was first reconstructed from a normal-dose sinogram using the classic FBP algorithm, which is of 768×768 pixels covering a 43.63×43.63 cm² region. The procedure of training the dictionary was as follows: a set of overlapping image patches of 8×8 pixels was extracted from the entire lunge region which is of 495×370 pixels in the baseline image, and the direct current (DC) component was removed from each patch. Then a global dictionary D ∈ R^64×256 with 256 atoms was constructed using the online dictionary learning method with a fixed sparse level [40]. Finally a DC atom was added to the global dictionary. The lung region and the final dictionary are shown in Fig. 1.

Fig.1

Sample image used to train the global dictionary and the learned dictionary. (a) A sheep lung image reconstructed from a normal-dose sinogram by the FBP method, which is used to extract the training patches. (b) The learned dictionary consisting of 256 atoms. The display window is [–700,800] HU.

3.1.2 Low-dose results

Unless otherwise mentioned, the reconstructed images in these experiments were 512×512 pixels covering a 43.63×43.63 cm² region. The precision in this study was set as ɛ = 1 ×10^-7 for all experiments. The FBP was firstly applied to reconstruct image from the aforementioned low-dose sinogram. The reconstruction result is shown in Fig. 2(a), which was also used as initial image of iteration for the proposed DLSIR-BPMF algorithm. Then, the proposed algorithm was applied to reconstruct image from the low-dose sinogram, the sparsity constraint was enforced on the entire image, and the parameters σ in this section was experimentally set as σ = 1 +5 . 5 × 10^-5. An order-subset strategy was used to speed up the convergence [48]. The number of subset was 40. The iterative process was stopped after 50 iterations. The corresponding reconstructed images were shown in Fig. 2(b). To compare the performance among the l₁-norm sparsity constraint, l₀-norm sparsity constraint and TV-based algorithms, GDSIR [30] and TVSIR were applied to the same low-dose sinogram. For a fair comparison, the parameter in GDSIR and TVSIR are tuned to achieve the best reconstruction performance. The corresponding reconstructed images were shown in Fig. 2(c) and Fig. 2(d). It can be observed that there is strong noise in the FBP reconstruction. The proposed algorithm with l₁-norm sparse constraint has almost the same performance visually as the GDSIR algorithm with l₀-norm sparse constraint. Both of them achieve supersite performance in removing the noise and reserving the details. The image reconstructed by the TVSIR was a little blocky and had poor performance compared to the dictionary learning based methods. According to Ref. [40], balancing principle and model function approach could receive the optimal regularization theoretically when the iterative reconstruction algorithm converges. To validate this statement, we reconstructed image using proposed l₁-norm dictionary learning based SIR framework with optimal fixed regularization parameter β⁵⁰ = 0.1718 from 1160 low dose projections, which is shown is Fig. 2(e). It can be seen that the reconstructed image is almost same as the proposed DLSIR-BPMF visually.

Fig.2

Reconstructed images from a representative low dose sinogram collected in the realistic sheep lung study. (a) FBP, (b) DLSIR-BPMF, (c) GDSIR, (d) TVSIR and (e) DLSIR with optimal fixed regularization parameter β⁵⁰ = 0.1718, respectively. The display window is [–700,800] HU.

3.1.3 Convergence analysis

To examine the convergence of the proposed algorithm, Fig. 3 shows the value of cost function, residual which was defined as ∥ μ^k+1squo - μ^k ∥ ₂/ - ∥ μ^k ∥ ₂ with k as iteration number and regularization parameter β in terms of iteration steps of the proposed DLSIR-BPMF algorithm from 1160 low dose views in realistic sheep lung study. The curves indicate that the proposed DLSIR-BPMF algorithm can converge to a steady solution after ∼40 iterations. In this study, we choose the iteration number, which is usually used in low dose CT reconstruction as the stop criterion.

Fig.3

Plots (a), (b) and (c) show the value of cost function, residual and regularization parameter β in terms of iteration steps of the proposed DLSIR-BPMF algorithm from 1160 low dose projections in realistic sheep lung study, respectively.

3.1.4 Few-view low-dose results

Reducing the number of projection views is an effective way to reduce imaging time and radiation dose, resulting the few-view problem. To achieve high image quality from the few-view projections, the parameter σ may also be tuned. To further demonstrate this point, the number of low-dose projections was down-sampled from 1160 to 580, 290 and 145, respectively. The FBP, DLSIR-BPMF, GDSIR and TVSIR were applied to reconstruct images from the aforementioned data sets. For the reconstructions from 580, 290 and 145 views, the numbers of subset were set to 20, 10 and 5, the iteration processes were stopped after 100, 200 and 400 iterations, the parameter σ were set as 1+5.5×10^–5×3, 1+5.5×10^–5×9, 1+5.5×10^–5×27, respectively. The reconstruction results are shown in Fig. 4. It can be observed that the FBP reconstruction results became worse and worse when the number of views was gradually decreased from 1160 to 580. The similar phenomena can also be found from the TVSIR reconstruction results.

Fig.4

Images (from left to right) reconstructed using FBP, DLSIR-BPMF, GDSIR and TVSIR from 580 (the first row), 290 (the second row) and 145 (the third row) projection views, respectively. The display window is [–700,800] HU.

In contrast, the proposed method, as well as GDSIR, can suppress the noise effectively while the edges are well preserved. In the case of 580 views, the reconstruction result was almost as good as that reconstructed from 1160 views in Fig. 2(b). However, in the cases of 290 and 145 views, some fine structures were lost.

3.2 Sheep lung simulation study

To further validate the robustness of our proposed algorithm against different noise levels in low dose scanning situations, Poisson noise was added to the normal dose sinogram collected in the sheep lung CT perfusion study. 1×10⁵, 5×10⁴ and 1×10⁴ photons per detector element were assumed to simulate different low dose scanning situations. The FBP, DLSIR-BPMF, GDSIR and TVSIR were performed on the above three different low-dose datasets. The parameter in DLSIR-BPMF algorithm is tuned for different noise levels. Hence, the parameter σ was set as 1+5.16×10^–5/1.2, 1+5.16×10^–5 and 1+5.16×10^–5×1.3, respectively. The number of subset was 40, and the iterative process was stopped after 50 iterations. The reconstructed results are shown in Fig. 5. It can be seen that the FBP results became worse and worse when the number of photon decreased from 1×10⁵ to 1×10⁴. In the case of 1×10⁵ photons, while the reconstruction results using the proposed method was almost as good as that reconstructed using the GDSIR and TVSIR. In the case of 1×10⁵ photons, the DL based algorithm outperforms the TVSIR in preserving structures and suppressing noise.

Fig.5

Images (from left to right) are reconstructed using the FBP, DLSIR-BPMF, GDSIR and TVSIR from noisy projections with 1×10⁵ (the first row), 5×10⁴ (the second row) and 1×10⁴ (the third row) photons per detector element, respectively. The display window is [–700,800] HU.

To further illustrate the performance of the present DLSIR-BPMF algorithm, the zoomed images of a selected region of interest (ROI) as indicated by the red rectangles in Fig. 5 are shown in Fig. 6 with soft tissue display window. As can be seen in Fig. 6, the conventional SIRTV algorithm are suffering severe blocky artifacts, while the present DLSIR-BPMF algorithm shows superior image quality, suggesting that the proposed algorithm outperforms SIRTV and FBP algorithms.

Fig.6

Zoomed details of the ROI as indicated by the red rectangle in Fig. 5. The display window is [–700, 400] HU.

3.3 Chest simulation study

To evaluate the proposed method for different image contents, a human chest CT image was down loaded from website [49] to simulate low-dose and few-view projections. The size of chest image is 512×512; each pixel covers an area of 0.9766×0.9766 mm². The fan-beam geometry with an equi-angular detector was assumed. The distance between the x-ray source and the system origin is 57 cm. 360 projections were uniformly collected in [0, 2π]. For each projection, 672 detector elements spanned an FOV of 25.05 cm in radius. In this experiment, the sparsity constraint was enforced on the entire image.

3.3.1 Global dictionary learning

For global dictionary, the entire chest region of 220×335 pixels was extracted from the baseline image, which is given in the Fig. 7(a), and then the patches were extracted from the entire chest region to construct a global dictionary as described in subsection 3.1.1. The learned dictionary D ∈ R^64×256 with 256 atoms is shown in Fig. 7(b).

Fig.7

Sample image used for training the global dictionary and learned dictionary. (a) A Chest image. (b) The learned dictionary consisting of 256 atoms. The display window is [–1000, 800] HU.

3.3.2 Simulated data with noise

Poisson noise assuming 1×10⁵, 5×10⁴ and 1×10⁴ photons per detector element was added to the simulated projection data, respectively. The FBP, DLSIR-BPMF, GDSIR and TVSIR were performed on the above three low-dose dataset. The parameter σ in our experiment was set as 1+0.0072/1.2, 1+0.0072 and 1+0.0072×1.3 to respectively adapt to different noise levels. The number of subset was 12, and the iterative process was stopped after 50 iterations. Fig. 8 shows the reconstructed results from noisy projections with 1×10⁵, 5×10⁴ and 1×10⁴ photons per detector element, respectively. It is seen that the FBP results became worse and worse when the number of photon decreased from 1×10⁵ to 1×10⁴, and the reconstructed results using the proposed algorithm are all performed well in preserving structures and suppressing noise, and are almost visually the same as the original image.

Fig.8

Chest images (from left to right) reconstructed using FBP, DLSIR-BPMF, GDSIR and TVSIR from noisy projections with 1×10⁵(the first row), 5×10⁴(the second row) and 1×10⁴(the third row) photons, respectively. The display window is [–1000, 800] HU.

3.3.3 Few-view results

The proposed algorithm with adaptive regularization parameter choice was evaluated in the case of low-dose few-view reconstruction. The number of low-dose projection with 5×10⁴ photons per detector element was downsampled from 360 to 180 and 90, respectively. The FBP, DLSIR-BPMF, GDSIR and TVSIR were performed on the above two few-view dataset. For 180 and 90 views, the parameter σ for the proposed algorithm was set as 1+0.0072×3 and 1+0.0072×9, the number of subset was 6 and 3, and the iterative process was stopped after 100 and 200 iterations, respectively. The results are shown in Fig. 9. It can be seen that the results of our proposed algorithm outperforms the FBP and SIRTV counterparts, which has similar performance to GDSIR.

Fig.9

Chest images (from left to right) reconstructed using FBP, DLSIR-BPMF, GDSIR and TVSIR from 180 (the first row) and 90 (the second row) projection views with 5×10⁴ photons, respectively. The display window is [–1000, 800] HU.

3.3.4 Performance evaluation

To quantify the reconstruction accuracy of the algorithm, we use root mean square error (RMSE) to measure the similarity between the ground truth and the reconstructed image. Besides, the image quality assessment (IQA) index for structural similarity (SSIM) has been proven to be consistent with visual perception [50]. Hence the SSIM is also used as another metric to evaluate the reconstruction accuracy in this study. In this section, we use Fig. 7(a) as the ground truth. The RMSE and SSIM indices of the reconstructed images in chest simulation study are summarized in Table 4. It can be concluded that our proposed DLSIR-BPMF algorithm and GDSIR had basically the same performance. Both the DL based algorithms outperform the TV-base algorithm in preserving the fine structure and removing image noise.

4 Discussion

The l₁-norm sparsity constraint in term of redundant dictionary has been introduced into the SIR framework for low dose and/or sparse-view projection measurements in this work. Specifically the l₁-norm minimization is a proxy for the l₀-norm minimization if the solution is sufficiently sparse. Experimental results in this study have demonstrated that the l₁-norm minimization has the similar performance with the l₀-norm sparsity constraint in removing the noise and preserving the fine structure for low dose CT reconstruction. In practice, the l₁-norm based SIR via DL method will be more applicable for its guarantee of global convergence and little performance discount.

Balancing principle using the model function approach has been applied to determine regularization parameter adaptively for image denoising in the image processing field, in which the quantity σ is always selected in the interval (1.01, 1.05). In this study, we applied balancing principle to choose regularization parameter in the dictionary learning based SIR framework. A quantity σ is also introduced to control the weight of the sparse coding term relative to the data fidelity term. Although balancing principle using the model function approach proposed in this work can not realize the goal of automatic parameter selection, some rules of choosing quantity σ can be observed for the test cases in this study.

We also surveyed that the order of the quantity σ - 1 is proportion to the order of the reciprocal of the initial Log-likelihood term. In particular, we took noisy projection with 50000 photons Poisson noise as example, the quantity σ in sheep study and chest study is selected in the interval (1+5×10^–5, 1+8×10^–5) and (1+6×10^–3, 1+8×10^–3), respectively, which is proportion to the reciprocal of the initial Log-likelihood term. We also observed that if the number of low dose projection views was down-sampled by a factor of two, then the value of quantity σ - 1 grows by a factor of three in few view low-dose experiments. It can be inferred that the selection of the parameter σ would be much easier than the selection of the original hyperparameters.

There are some literatures that have discussed the issue of parameter selection to apply the balancing principle [46, 47]. By our experimental verification, the proposed solution is not general for the DL based SIR so far. Taking the solution proposed in Ref. [47] for example, we represent the determination as following: $\frac{1}{σ - 1} = γ_{0} {\frac{\sum_{i = 1}^{I} \frac{w_{i}}{2} ([A μ *]_{i} - p_{i})^{2}}{0.05 \times \sum_{i = 1}^{I} \frac{w_{i}}{2} (p_{i})^{2}}}^{\frac{1}{4}} .$ (21)

By numerical verification, the rule shown in Tables 2 and 3 accords roughly with Equation (21) respectively for each image content when noise level varying. But for different image contents, the parameter γ₀ in Equation (21) has to be tuned accordingly. In this study, the two images content has a difference of the order of 2 in quantity. However, with the merit of having clear physical meaning, balancing principle is hopeful in resolving the problem of automatic parameter selection.

A general convergence proof of the alternating minimization algorithm with fixed regularization parameter is presented in the Ref. [51]. Due to the cost function in Equation (5) is composite convex, so we can prove the convergence of proposed l₁-norm DL based SIR framework with fixed regularization parameter based on Theorem 2 in Ref. [51]. Although the regularization parameter varied in every iteration in our proposed DLSIR-BPMF algorithm, experimental results indicate that the proposed DLSIR-BPMF algorithm can converge to a steady solution. It is noting that the convergence proof of the balancing principle derived from model function has been stated in Theorem 4.9, Theorem 4.10 and Remark 4.11 in Ref. [44]. Furthermore, Lin et al. [52] have proven that the linearized augmented Lagranigian is convergent when the penalty parameter is non-decreasing and bounded above. Borrowing the ideas of Theorem 4.9, Theorem 4.10, Remark 4.11 in Ref. [44] and Theorem 8 in Ref. [52], we would demonstrate the convergence of the DLSIR-BPMF algorithm in the future.

Table 2

A summary of σ and initial log-likelihood term values in F (β⁰) of low dose sheep lung study and sheep simulation study

	σ	Log-likelihood term
Low dose 1160 projection views	1+5.5×10^–5	1.942×10³
Low dose 580 projection views	1+5.5×10^–5×3	1.041×10³
Low dose 290 projection views	1+5.5×10^–5×9	6.36×10²
Low dose 145 projection views	1+5.5×10^–5×27	5.27×10²
Normal dose 1160 projection views with 1×10⁵ photons	1+5.16×10^–5/1.2	1.934×10³
Normal dose 1160 projection views with 5×10⁴ photons	1+5.16×10^–5	1.947×10³
Normal dose 1160 projection views with 1×10⁴ photons	1+5.16×10^–5×1.3	2.047×10³

Table 3

A summary of σand initial log-likelihood term values in F (β⁰) of chest simulation study

	σ	Log-likelihood term
360 projection views with 1×10⁵ photons	1+0.0072/1.2	5.1963
360 projection views with 5×10⁴ photons	1+0.0072	5.6263
360 projection views with 1×10⁴ photons	1+0.0072×1.3	9.3985
180 projection views with 5×10⁴ photons	1+0.0072×3	41.5654
90 projection views with 5×10⁴ photons	1+0.0072×9	267.3202

Table 4

RMSE and SSIM values (in bracket) of the chest images reconstructed by the FBP, DLSIR-BPMF, GDSIR and TVSIR in the chest simulation study (Unit: HU)

Photons	Views	FBP	TVSIR	GDSIR	DLSIR-BPMF
1×10⁵	360	40.8025 (0.5832)	42.2444 (0.7348)	30.7652 (0.7535)	30.1556 (0.7594)
5×10⁴	360	54.3216 (0.5102)	52.9655 (0.6331)	31.8785 (0.7319)	31.3234 (0.7349)
1×10⁴	360	115.8552 (0.3296)	66.8143 (0.5450)	38.2414 (0.6298)	35.3131 (0.6321)
5×10⁴	180	89.2538 (0.3915)	59.4731 (0.5911)	37.3296 (0.6767)	33.7877 (0.6591)
5×10⁴	90	154.2458 (0.2722)	70.9918 (0.5122)	52.2070 (0.5465)	52.5455 (0.4953)

Our MATLAB codes were developed on a PC with a 3.50 GHz Intel core i7 processor and a 16GB RAM. The proposed DLSIR-BPMF include a sparse coding step, image updating step (including projection/backprojection) and regularization parameter updating step. In our realistic sheep lung experiment with 1160 low dose projections, sparse coding step, image updating step and regularization parameter updating step takes 90.71 seconds (in average), 0.03 seconds (in average) and 3.3 seconds (in average), respectively. Graphics processing unit (GPU) was employed in projection and backprojection operation to speed up the proposed algorithm. It can be observed that the time consumption for the regularization parameter updating is quite short.

Experimental results show the proposed DLSIR-BPMF algorithm can yield superior performance than FBP algorithm. However, it has not be accepted and used in the clinical CT machines. This is because the sparse coding error was chosen manually to get the best reconstructed image quality and the reconstruction speed still cannot compete with the speed of conventional FBP. To make the DLSIR-BPMF practical, on the one hand we would try to develop an automatic way to choose sparse coding error, on the other hand, we would further speed up our iterative reconstruction algorithm using Nesterov’s first-order accelerating strategy [53, 54] and implement the sparse coding step and regularization updating step on GPU.

In conclusion, we proposed and tested a convergent reconstruction algorithm for low-dose CT reconstruction via dictionary learning in this study. In particular, an l₁-norm based sparse constraint in term of redundant dictionary is incorporated into an objective function in the SIR framework, and an alternating minimization algorithm is developed to minimize the objective function. Furthermore, an efficient model function approach based on balancing principle is applied to choose the regularization parameter. Using the balancing principle method, the regularization parameter is updated in alternating minimization steps. Low-dose or/and few-view reconstruction results demonstrated that our proposed algorithm can receive the excellent results for low-dose CT reconstruction. In particular, it has been verified that the proposed approach has similar performance with GDSIR and outperforms the TV minimization based method in this study.

Footnotes

Acknowledgments

We would like to thank anonymous reviewers for their helpful suggestions and also thank Dr. Hengyong Yu, Department of Electrical and Computer Engineering, University of Massachusetts Lowell, for providing the realistic sheep lung projection data. This work is also supported in part by the Natural Science Foundation of China (NSFC) (61571359, 61772416), Natural Science Basic Research Plan in Shaanxi Province of China (2016JQ6065, 2017JM5048), Key Laboratory Project of the Education Department of Shaanxi Province of China (17JS098), Scientific Research Program Funded by Shaanxi Provincial Education Department of China (15JK1535) and by the Key Laboratory of Computer Network and Information Integration, Southeast University and Ministry of Education of China (K93-9-2017-04).

References

de González

A.B.

and Darby

, Risk of cancer from diagnostic X-rays: Estimates for the UK and 14 other countries, The Lancet 363 (2004), 345–351.

Chen

, Ma

, Feng

, et al., Nonlocal prior bayesian tomographic reconstruction, Journal of Mathematical Imaging & Vision 30 (2008), 133–146.

Chen

, Shi

, Feng

, et al., Artifact suppressed dictionary learning for low-dose CT image processing, IEEE TRANSACTIONS ON MEDICAL IMAGING 33 (2014), 2271–2292.

Chen

, Zhang

, Yang

, et al., Structure-adaptive Fuzzy estimation for random-valued impulse noise suppression, IEEE Transactions on Circuits & Systems for Video Technology PP (2016), 1.

Liu

, Hu

, Yang

, et al., 3D feature constrained reconstruction for low dose CT imaging, IEEE Transactions on Circuits & Systems for Video Technology PP (2016), 1.

Jin

, Ma

, Yi

, et al., Discriminative Feature Representation to Improve Projection Data Inconsistency for Low Dose CT Imaging, IEEE TRANSACTIONS ON MEDICAL IMAGING PP (2017), 1.

Candès

E.J.

, Romberg

and Tao

, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, Information Theory, IEEE Transactions on 52 (2006), 489–509.

Donoho

D.L.

, Compressed sensing, Information Theory, IEEE Transactions on 52 (2006), 1289–1306.

Sidky

E.Y.

, Kao

C.-M.

and Pan

, Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT, Journal of X-Ray Science and Technology 14 (2006), 119–139.

10.

Sidky

E.Y.

and Pan

, Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization, Physics in Medicine and Biology 53 (2008), 4777–4807.

11.

Chen

G.H.

, Tang

and Leng

, Prior image constrained compressed sensing (PICCS): A method to accurately reconstruct dynamic CT images from highly undersampled projection data sets, Medical Physics 35 (2008), 660.

12.

Tang

, Nett

B.E.

and Chen

G.H.

, Performance comparison between total variation (TV)-based compressed sensing and statistical iterative reconstruction algorithms, Physics in Medicine and Biology 54 (2009), 5781.

13.

Censor

, Davidi

and Herman

, Perturbation resilience and superiorization of iterative algorithms, Inverse Problems 26 (2010), 065008.

14.

and Wang

, A soft-threshold filtering approach for reconstruction from a limited number of projections, Physics in Medicine and Biology 55 (2010), 3905–3916.

15.

Choi

, Wang

, Zhu

, et al., Compressed sensing based cone-beam computed tomography reconstruction with a first-order method, Medical Physics 37 (2010), 5113–5125.

16.

Defrise

, Vanhove

and Liu

, An algorithm for total variation regularization in high-dimensional linear problems, Inverse Problems 27 (2011), 065002.

17.

Tian

, Jia

, Yuan

, et al., Low-dose CT reconstruction via edge-preserving total variation regularization, Physics in Medicine and Biology 56 (2011), 5949.

18.

Sidky

E.Y.

, Duchin

, Pan

and Ullberg

, A constrained, total-variation minimization algorithm for low-intensity X-ray CT, Medical Physics 38 (2011), S117.

19.

Ritschl

, Bergner

, Fleischmann

and Kachelrieß

, Improved total variation-based CT image reconstruction applied to clinical data, Physics in Medicine and Biology 56 (2011), 1545–1561.

20.

Ertas

, Yildirim

, Kamasak

and Akan

, Iterative image reconstruction using non-local means with total variation from insufficient projection data, Journal of X-ray Sciecne and Technolgy 24 (2016), 1–8.

21.

Niu

, Jiang

, Bian

, et al., Iterative reconstruction for spase-view X-ray CT using alpha-divergence constrained total generalized variation minimization, Journal of X-ray Science and Technology 25 (2017), 673–688.

22.

and Wang

, Compressed sensing based interior tomography, Physics in Medicine and Biology 54 (2009), 2791–2805.

23.

, Mou

, Wang

, et al., Statistical interior tomography, Medical Imaging, IEEE Transactions on 30 (2011), 1116–1128.

24.

Garduño

, Herman

and Davidi

, Reconstruction from a few projections by ℓ1-minimization of the Haar transform, Inverse Problems 27 (2011), 055006.

25.

Jia

, Dong

, Lou

and Jiang

S.B.

, GPU-based iterative cone beam CT reconstruction using tight frame regularization, Physics in Medicine and Biology 56 (2011), 3787–3807.

26.

Luo

, Liu

, Li

, et al., Image reconstruction from sparse projections using S-transform, Journal of Mathematical Imaging and Vision 43 (2012), 227–239.

27.

Niu

, Gao

, Bian

, et al., Sparse-view x-ray CT reconstruction via total generalized variation regularization, Physics in Medicine and Biology 59 (2014), 2997.

28.

Zhang

, Zhang

, Lei

and Zhou

, Few-view image reconstruction with fractional-order total variation, JOSA A 31 (2014), 981–995.

29.

Zhang

, Hu

, Yang

, et al., Sparse-view X-ray CT reconstruction with Gamma regularization, Neurocomputing 230 (2017), 251–269.

30.

, Yu

, Mou

, et al., Low-dose X-ray CT reconstruction via dictionary learning, IEEE Transactions on Medical Imaging 31 (2012), 1682–1697.

31.

Hansen

P.C.

, Analysis of discrete ill-posed problems by means of the L-curve, SIAM Review 34 (1992), 561–580.

32.

Yang

, Hofmann

, Dapp

, et al., TV-based conjugate gradient method and discrete L-curve for few-view CT reconstruction of X-ray in vivo data, Optics Express 23 (2015), 5368–5387.

33.

Golub

G.H.

and Von

, Matt, Generalized cross-validation for large-scale problems, Journal of Computational and Graphical Statistics 6 (1997), 1–34.

34.

Ramani

, Liu

, Rosen

, et al., Regularization parameter selection for nonlinear iterative image restoration and MRI reconstruction using GCV and SURE-based methods, IEEE Transactions on Image Processing 21 (2012), 3659–3672.

35.

Zhang

, Zhou

, Niu

, et al., Regularization parameter selection for penalized-likelihood list-mode image reconstruction in PET, Physics in Medicine & Biology (2017).

36.

Engl

, Discrepancy principles for Tikhonov regularization of ill-posed problems leading to optimal convergence rates, Journal of Optimization Theory and Applications 52 (1987), 209–215.

37.

Vogel

C.R.

, Siam, Computational methods for inverse problems 23 (2002).

38.

Donoho

D.L.

, For most large underdetermined systems of linear equations the minimal, Communications on Pure and Applied Mathematics 59 (2006), 797–829.

39.

Mairal

, Bach

, Ponce

and Sapiro

, Online learning for matrix factorization and sparse coding, The Journal of Machine Learning Research 11 (2010), 19–60.

40.

http://spams-devel.gforge.inria.fr/

41.

Rockafellar

R.T.

, Convex analysis. Princeton: Princeton university press; 1970.

42.

T.T.

and Lange

, Coordinate descent algorithms for lasso penalized regression, The Annals of Applied Statistics (2008), 224–244.

43.

Elbakri

I.A.

and Fessler

J.A.

, Statistical image reconstruction for polyenergetic X-ray computed tomography, Medical Imaging, IEEE Transactions on 21 (2002), 89–99.

44.

Clason

, Jin

and Kunisch

, A semismooth Newton method for L1 data fitting with automatic choice of regularization parameters and noise calibration, SIAM Journal on Imaging Sciences 3 (2010), 199–231.

45.

Feng

, Qin

, Jia

, et al., An adaptive regularization parameter choice strategy for multispectral bioluminescence tomography, Medical Physics 38 (2011), 5933–5944.

46.

Abdalah

, Mitra

, Boutchko

and Gullberg

, Optimization of regularization parameter in a reconstruction algorithm, Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC) IEEE (2013), 1–4.

47.

Ito

, Jin

and Takeuchi

, A regularization parameter for nonsmooth Tikhonov regularization, SIAM Journal on Scientific Computing 33 (2011), 1415–1438.

48.

Kamphuis

and Beekman

F.J.

, Accelerated iterative transmission CT reconstruction using an ordered subsets convex algorithm, IEEE Transactions on Medical Imaging 17 (1998), 1101–1105.

49.

http://wiki.kek.jp/pages/viewpage.action?pageId=13667347

50.

Wang

, Bovik

A.C.

, Sheikh

H.R.

and Simoncelli

E.P.

, Image quality assessment: From error visibility to structural similarity, IEEE Transactions on Image Processing 13 (2004), 600–612.

51.

Csiszár

and Tusnády

, Information geometry and alternating minimization procedures, Statistics & Decisions 1 (1984).

52.

Lin

, Liu

and Su

, Linearized Alternating Direction Method with Adaptive Penalty for Low-Rank Representation, Advances in Neural Information Processing Systems (2011), 612–620.

53.

Nesterov

, A method of solving a convex programming problem with convergence rate, in Soviet Mathematics Doklady (1983), 372–376.

54.

Nesterov

, Smooth minimization of non-smooth functions, Mathematical Programming 103 (2005), 127–152.