Bivariate shrinkage using undecimated dual-tree complex wavelet transform for image denoising

Abstract

In this paper, we present a bivariate shrinkage method for denoising of images corrupted with additive white Gaussian noise. The proposed method first employs undecimated dual-tree complex wavelet transform on noisy image, which can keep a direct one-to-one relationship between the co-located complex wavelet coefficients at all scales. After that, it estimates the wavelet coefficients by taking into account the parent-child dependency under the non-Gaussian bivariate model, and completes the denoising procedure. Experimental results on test images show that our method can not only eliminate different levels of noise but also obtain fine structures preservation.

Keywords

Image denoising undecimated dual-tree complex wavelet transform bivariate shrinkage

1 Introduction

Denoising is often used as a crucial pre-processing step in image processing, which aims to preserve important structures of the image while removing noise. As a kind of representative method, transform domain denoising has received a lot of attention, mostly in conjunction with the coefficient-wise wavelet shrinkage proposed by Donoho [1]. However, such method assumes that the wavelet coefficients are independent, which actually is not the case for most natural images. Therefore, interscale and intrascale dependencies between image wavelet coefficients are taken into considering by recently presented denoising methods [2].

Hansen et al. [3] propose a minimum description length criterion based on a Laplacian model considering the intrascale dependency of wavelet coefficients, and utilize it to achieve high performance forsimultaneous denoising and compression. Portilla Hansen et al. [4] present a method based on a statistical model of the intrascale coefficients. In such model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear estimates over all possible values of the hidden multiplier variable. Pizurica et al. [5] propose a new subband-adaptive shrinkage function by assuming a generalized Laplacian prior for noise-free subband coefficients, which outperforms Bayesian thresholding approaches in terms of MSE. Additionally, Vapnik-Chervonenkis (VC) theory has recently emerged as a general theory for estimation of dependencies from finite samples. Under the framework of VC-theory, Cherkassky et al. [6, 7] propose a tree structure to order the wavelet coefficients based on theirs magnitude, scale and spatial location. Wavelet domain hidden Markov tree (HMT) has also been applied to signal and image denoising, which can effectively capture the probability density of the wavelet coefficients [8, 9]. Sendur et al. [10, 11] propose a new non-Gaussian bivariate distributions to consider the dependencies between thecoefficients and their parents. Under this model, nonlinear shrinkage functions are derived by using Bayesian estimation theory. The results of image denoising are better than those using HMT models. After that, Achim et al. [12, 13] propose a novel bivariate shrinkage technique based on a heavy tailed bivariate distribution to provide a quantitative improvement for image denoising, which no longer uses the real and imaginary components of the complex valued subband coefficients separately. Recently, several adaptive methods capturing the dependency of both interscale and intrascale wavelet coefficients are proposed [14 –20].

Note that the discrete wavelet transform (DWT) has been most used in transform domain denoising methods mentioned above. However, thresholded DWT coefficients do not give optimal results due to its drawback of being neither shift invariant nor phase information. Therefore, researchers try to use undecimated DWT for image denoising [21, 22]. Although the undecimated DWT is shift invariant, it lacks directionality. To overcome the problem, Kingsbury et al. [23, 24] propose a new form of DWT, called dual-tree complex wavelet transform (DT-CWT), which can provide approximate shift invariance and directionally selective filters while preserving the usual properties of perfect reconstruction and computational efficiency. However, the subsampled subbands of the DT-DWT have a restricted number of coefficients directly related to each spatial position in the image, which may lower the denoising performance of bivariate shrinkage based methods [25].

In this paper, we aim to extend Sendur’s non-Gaussian bivariate model and apply undecimated DT-CWT (UDT-CWT) to improve denoising performance. The rest of this paper is organized as follows. Section 2 describes the UDT-CWT implementing. Section 3 presents the bivariate shrinkage method using UDT-CWT for image denoising. Section 4 analyzes the experimental results and Section 5 addresses the conclusions.

2 The undecimated dual-tree complex wavelet transform

As a extension of the traditional dual tree complex wavelet transform (DT-CWT), the undecimated DT-CWT (UDT-CWT) can provide a one-to-one relationship between co-located complex coefficients in all subbands, which can be effectively utilized by bivariate shrinkage method for image denoising [13]. The UDT-CWT analysis stage is shown in Fig. 1, where the crosses represent the subsampling of the traditional DTCWT has been removed. In order to offset this effect, the filters at each stages need to be upsampled [25]. The UDT-CWT filter $h_{0}^{(l + 1)}$ at stage l + 1 is defined recursively as follow:

$h_{0}^{(l + 1)} [k] = h_{0}^{(l)} [k] ↑ 2 = {\begin{matrix} h_{0}^{(l)} [k / 2], if k is even \\ 0, if k is odd \end{matrix}$ (1)

Note that $h_{0}^{(1)}$ at the first stage is a non upsampled filter. The other filters at stage l + 1, i.e., $h_{1}^{(l + 1)}$ , $g_{0}^{(l + 1)}$ and $g_{1}^{(l + 1)}$ are similarly defined. The synthesis stage is the inverse of Fig. 1. In this paper, Q-shift filters are adopted in the UDT-CWT analysis and synthesis stages.

In 2-D UDT-CWT, an image signal f (x, y) is decomposed by using a complex scaling function and six complex wavelet functions as follow

$\begin{matrix} f (x, y) & = & \sum_{l \in Z^{2}} A_{j_{0}, l} φ_{j_{0}, l} (x, y) \\ + \sum_{k \in α} \sum_{j = 1}^{j_{0}} \sum_{l \in Z^{2}} D_{j, l}^{k} φ_{j, l}^{k} (x, y) \end{matrix}$ (2) where A_j₀,l and $D_{j, l}^{k}$ are scaling coefficients and complex wavelet coefficients respectively. φ_j₀,l (x, y) denotes the scaling function and $φ_{j, l}^{k} (x, y)$ denotes six wavelet functions which are oriented at k ∈ α = {±15°, ± 45°, ± 75°}.

Considering the 2-D wavelet φ (x, y) = φ (x) φ (y) associated with the row-column implementation of the wavelet transform, if φ_g (t) is approximately the Hilbert transform of φ_h (t), i.e., φ_g (t) ≈ H {φ_h (t)}, one can obtain the following six wavelets [26]: $φ_{i} (x, y) = \frac{1}{\sqrt{2}} (φ_{1, i} (x, y) - φ_{2, i} (x, y))$ (3) $φ_{i + 3} (x, y) = \frac{1}{\sqrt{2}} (φ_{1, i} (x, y) + φ_{2, i} (x, y))$ (4) for i = 1, 2, 3, where the two separable 2-D wavelet bases are defined as follows: $φ_{1, 1} (x, y) = φ_{h} (x) φ_{h} (y), φ_{2, 1} (x, y) = φ_{g} (x) φ_{g} (y)$ (5) $φ_{1, 2} (x, y) = φ_{h} (x) φ_{h} (y), φ_{2, 2} (x, y) = φ_{g} (x) φ_{g} (y)$ (6) $φ_{1, 3} (x, y) = φ_{h} (x) φ_{h} (y), φ_{2, 3} (x, y) = φ_{g} (x) φ_{g} (y)$ (7)

In fact, the 2-D UDT-CWT is implemented by taking sum/difference of two separable wavelet filter banks. Since each of the above six wavelets are aligned along a specific direction, the UDT-CWT can capture more image information than the conventional wavelet transform.

3 Bivariate shrinkage method using UDT-CWT for image denoising

3.1 Non-Gaussian bivariate model

In this paper, the non-Gaussian bivariate model is adopted for image denoising, which can take into account the statistical dependency between a coefficient and its parent. Let w_1 be a complex wavelet coefficient at some scale by using UDT-CWT to perform decomposition on a noisy image, and w_2 be the parent of w_1 . Considering y₁ = w₁ + n₁ and y₂ = w₂ + n₂, where y₁ and y₂ denote noisy complex wavelet coefficients, w_1 and w_2 denote the true coefficients, and n₁ and n₂ denote the independent white Gaussian noise. One can write in vector form as follow $y = w + n$ (8) where y = (y₁, y₂), w = (w₁, w₂) and n = (n₁, n₂).

We aim to estimate w from the noisy observation y by using the maximum a posteriori (MAP) estimator. According to Bayes rule, one can obtain $\begin{matrix} \hat{w} (y) = arg max_{w} p_{w | y} (w | y) \\ = arg max_{w} p_{y | w} (y | w) p_{w} (w) \\ = arg max_{w} p_{n} (y - w) p_{w} (w) \end{matrix}$ (9)

Assume that the noise is i.i.d. Gaussian, and the noise pdf can be written as follow $\begin{matrix} p_{n} (n) = \frac{1}{2 π σ_{n}^{2}} \cdot exp (- \frac{n_{1}^{2} + n_{2}^{2}}{2 σ_{n}^{2}}) \\ = \frac{1}{2 π σ_{n}^{2}} \cdot exp (- \frac{(y_{1} - w_{1})^{2} + (y_{2} - w_{2})^{2}}{2 σ_{n}^{2}}) \end{matrix}$ (10)

For simplicity, we use the following pdf of complex wavelet coefficients [10] $p_{w} (w) = \frac{3}{2 π σ^{2}} \cdot exp (- \frac{\sqrt{3}}{σ} \sqrt{w_{1}^{2} + w_{2}^{2}})$ (11)

Solving (9) by using (10) and (11), one can obtain the following bivariate shrinkage function ${\hat{w}}_{1} = \frac{{(\sqrt{y_{1}^{2} + y_{2}^{2}} - \frac{\sqrt{3} σ_{n}^{2}}{σ})}_{+}}{\sqrt{y_{1}^{2} + y_{2}^{2}}} \cdot y_{1}$ (12)

Note that such function is actually some kind of soft thresholding. The bivariate shrinkage function is shown in Fig. 2. One can see that the estimated value of complex wavelet coefficient also depends on its parent value.

As is shown in (12), calculating MAP estimator for each complex wavelet coefficient requires the prior knowledge of the noise variance $σ_{n}^{2}$ and the marginal variance σ².

The median absolute deviation (MAD) of coefficients at the first level of UDT-CWT decomposition is adopted to estimate the noise variance ${\hat{σ}}_{n}^{2} = \frac{median (| y_{i} |)}{0.6745}, y_{i} \in Yh$ (13)

where Yh is the subband on the finest scale.

Given the kth complex wavelet coefficient on current scale, the marginal variance of the noisy parent-child coefficients can be estimated as follow [11] ${\hat{σ}}_{y}^{2} = \frac{1}{M} \sum_{y_{i} \in N (k)} y_{i}^{2}$ (14)

where N (k) is the neighborhood of the kth complex wavelet coefficient, and M is the size of N (k). Then σ² can be estimated as follow ${\hat{σ}}^{2} = max ({\hat{σ}}_{y}^{2} - {\hat{σ}}_{n}^{2}, 0)$ (15)

3.2 Our proposed image denoising method

Based on the preliminary study described in the previous sections, our proposed image denoising method can be summarized as follows.

Perform L-level UDT-CWT decomposition on an input noisy image.

For complex wavelet coefficients in the 6 highpass subbands from scale 1 to L-1:

Calculate the noise variance using (13).

Calculate the marginal variance using (15).

Estimate each coefficient y₁ using (12).

Perform inverse UDT-CWT to obtain the denoised image.

Note that at the coarsest scale L, we can not directly exploit (12) for estimating since the current coefficients have not parents. Therefore, we can adopt three solutions after step 2 to obtain three versions of the proposed method.

Version 1 (V1): Leave complex wavelet coefficient y₁ in highpass subbands at scale L being unprocessed.

Version 2 (V2): Take corresponding coefficient in lowpass subband from the final level as the parent, and estimate y₁ in highpass subbands at scale L using (12).

Version 3 (V3): Use the following Laplacian prior based shrinkage function [27] to estimate y₁ in highpass subbands at scale L. ${\hat{w}}_{1} = {(y_{1} - \frac{\sqrt{2} σ_{n}^{2}}{σ})}_{+} \cdot y_{1}$ (16)

4 Experimental results

In this section, we perform extensive experiments on several test images in the presence of additive Gaussian noise to show that the proposed method is good at detail preserving and noise smoothing. Ten test images with different resolutions are employed in the experiments. Eight of them are benchmark images, i.e., 512×512 Barbara, 512×512 Lena, 512×512 Baboon, 512×512 Peppers, 256×256 Boat, 256×256 Hill, 256×256 Cameraman, 256×256 Couple, while 128×128 Composites1 and 128×128 Composites2 are two images (see Fig. 3) which show the structure of carbonized C/Ph composites. The debonding and crack (two kinds of traditional flaws in composites) are included in the two images (see the areas recorded by red). All the experiments are performed on the platform of Lenovo M7100 (Dual core 2.83 GHz CPU and 4 G RAM).

4.1 Comparison of three versions of our proposed denoising method

In a first set of experiments, we implemented three versions of the denoiseing method mentioned in Section 3.2. All versions perform 3-level transform on the test images using the 13/19 tap filters at level 1 and the Q-shift 14/14 tap filters at the rest levels. The neighborhood size N (k) is 5×5. Figure 4 shows the obtained PSNR and SSIM results of different denoising versions performed on image Hill with the noise standard deviation changing from 5 to 50. The similar results can be obtained on other images. It can be seen that V3 always performs better than V2 and V1 for the full range of noise levels. Especially, while noise increasing the advantage of V3 is more obvious. Note that when UDT-CWT decomposition level is not large (3 in our experiments), there still exist much noise which should not be ignored in the subband at the coarsest scale. Since the pdf of Laplace model is similar with that of bivariate model, shrinking the coefficients at the coarsest scale by utilizing threshold based on Laplace model can further improve the performance of denoising. Figures 5 and 6 show the visual results of the three denoising versions performing on noisy Boat and Hill with σ= 20.

4.2 DT-CWT vs. UDT-CWT

As a second set of experiments, we tested our proposed method V3 in comparison with its DT-CWT transform version. Both versions perform 3-level transform on the test images using the 13/19 tap filters at level 1 and the Q-shift 14/14 tap filters at the rest levels. The neighborhood size N (k) is 5×5. The PSNR results of the two denoising versions on ten test images with the noise standard deviation 30 is shown in Fig. 7. Obviously, denoising by implement UDT-CWT can achieve higher PSNR values on all test images with different noise levels. The reason is that UDT-CWT can describe the relationship between co-located complex coefficients more precisely, which helps to utilize bivariate shrinkage scheme for image denoising.

Figures 8 and 9 show the visual results of the two denoising versions on Barbara and Lena. Taking Barbara as an example, it can be seen that the edge of the chair is more localised while using the UDT-CWT version for denoising. Furthermore, there is less ringing caused by edges, and the textured areas, such as scarf, have less artifacts within them after denoising. Since the undecimated transform has no reduction in spatial size of the subbands at all scales, the local estimation based on Non-Gaussian bivariate model can help to achieve higher denoising performance.

4.3 Comparing our method with other state-of-the-art methods

In a third set of experiments, we compared our proposed method (V3) with other state-of-the-art methods published in [10–13 , 25] and [28]. The partition block size of 32×32 was adopted by using method of [28]. All methods were implemented by using 3-level decomposition for fair comparison, and the optimal parameters were set as configured by their authors. The PSNR and SSIM results are listed in Tables 1 and 2 respectively.

On the whole, methods of [12 , 25] and our method always outperform those of [10 , 28], since DT-CWT or UDT-CWT has the properties of directional selectivity and approximately shift invariance. Next, we focus on methods of [12 , 25] and ours. The Cauchy distribution based bivariate shrinkage is adopted in [12 , 25], which can better capture the heavy-tailed nature of the complex coefficients. However, our proposed method further shrinks the complex coefficients at the coarsest scale, which can offset the deficiency of the conventional non-Gaussian bivariate model. Therefore, our method achieves better denoising performance compared to methods of [12] and [25] in terms of PSNR as well as SSIM (in all cases). Moreover, it outperforms method of [13] for noising images with lower resolution (see PSNR results of Composites1 and Composites2 in Table 1, and SSIM results of Cameraman, Couple, Composites1 and Composites2 in Table 2), whereas is slightly inferior than that of [13] for noising images with moderate and high resolution. Additionally, the average execution time of method of [12 , 25] and ours is 7.45 s, 11.10 s, 9.25 s and 5.13 s respectively. It illustrates that our method is more efficient since Cauchy distribution based bivariate shrinkage method needs to evaluate the extra dispersion parameter, which is much time consuming. In a word, the PSNR as well as SSIM results of our method are compared to the state-of-the-art denoising method of [13], while the processing time of our method is halved. Visual results of different denoising methods on Composites1 and Composites2 are shown in Figs. 10 and 11. Obviously, the proposed method can better recover the debonding (see Fig. 10) and crack (see Fig. 11), which will be useful in some flaw identification applications.

5 Conclusions

This paper presents an image denoising method based on UDT-CWT and Non-Gaussian bivariate shrinkage model. While performing UDT-CWT decomposition on an input noisy image, the proposed method removes the traditional subsampling, but instead makes the filters at each stage be upsampled. A direct one-to-one relationship between the co-located complex wavelet coefficients at all scales can be derived by using such scheme. We implement three versions of the proposed method, and illustrate that V3, which applies Laplacian prior based shrinkage function to estimate the coefficients at the final decomposition level, is the best. We also compared V3 of the proposed method with other conventional methods, and the experimental results indicate that it can achieve higher PSNR and SSIM values with fine structures preservation for image denoising.

Footnotes

Acknowledgments

We are grateful to the referees for their valuable comments. This work is financially supported by the National Natural Science Foundation of China (61363050, 61272077, 61563037), and the Natural Science Foundation of JiangXi Province (20142BDH80026).

References

Donoho

, De-noising by soft-thresholding, IEEE Transactions on Information Theory41 (1995), 613–627.

Liu

and Moulin

, Information-theoretic analysis of interscale and intrascale dependencies between image wavelet coefficients, IEEE Trans Image Processing10 (2001), 1647–1658.

Hansen

and Yu

, Wavelet thresholding via MDL for natural images, IEEE Trans Inform Theory46 (2000), 1778–1788.

Portilla

, Strela

and Wainwright

M.J.

, Image denoising using scale mixtures of Gaussians in the wavelet domain, IEEE Trans Image Processing12 (2003), 1338–1351.

Pizurica

and Philips

, Estimating the probability of the presence of a signal of interest in multiresolution single and multi band image denoising, IEEE Trans Image Processing15 (2006), 654–665.

Zhong

and Cherkassky

, Image denoising using wavelet thresholding and model selection, pp, In: Proceedings of the IEEE International Conference on Image Processing (2000), 262–265.

Cherkassky

and Shao

, Signal estimation and denoising using VC-theory, Neural Networks14 (2001), 37–52.

Fan

and Xia

X.G.

, Improved hidden Markov models in the wavelet-domain, IEEE Trans Signal Processing49 (2001), 115–120.

Romberg

J.K.

, Choi

and Baraniuk

R.G.

, Bayesian tree-structured image modeling using wavelet domain hidden Markov models, IEEE Trans Image Processing10 (2001), 1056–1068.

10.

Sendur

and Selesnick

I.W.

, Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency, IEEE Transactions on Signal Processing50 (2002), 2744–2756.

11.

Sendur

and Selesnick

I.W.

, Bivariate shrinkage with local variance estimation, Signal Processing Letters9 (2002), 438–441.

12.

Achim

and Kuruoglu

E.E.

, Image denoising using bivariate α-stable distributions in the complex wavelet domain, Signal Processing Letters12 (2005), 17–20.

13.

Hill

P.R.

, Achim

A.M.

, Bull

D.R.

and Al-Mualla

M.E.

, Dual-tree complex wavelet coefficient magnitude modelling using the bivariate Cauchy–Rayleigh distribution for image denoising, Signal Processing105 (2014), 464–472.

14.

Cai

, Cheng

T.H.

and Lu

, Efficient wavelet-based image denoising algorithm, Electronics Letters37 (2001), 683–685.

15.

Pizurica

, Philips

and Lemahieu

, A joint inter- and intra-scale statistical model for bayesian wavelet based image denoising, IEEE Trans Image Processing11 (2002), 545–557.

16.

Rahman

S.M.

and Hasan

M.K.

, Wavelet-domain iterative center weighted median filter for image denoising, Signal Processing83 (2003), 1001–1012.

17.

Eom

I.K.

and Kim

Y.S.

, Spatially adaptive denoising based on mixture modeling and interscale dependence of wavelet coefficients, pp, In: Proceedings of IEEE International Conference on Neural Networks & Signal Processing (2003), 1070–1073.

18.

Atto

, Abdourrahmane, Pastor dominique and mercier grégoire, Smooth Adaptation by Sigmoid Shrinkage (2009), 1–24.

19.

Chesneau

, Fadili

and Starck

J.L.

, Stein block thresholding for image denoising, Applied and Computational Harmonic Analysis28 (2010), 67–88.

20.

Geert

and Paul

, Geodesics on the manifold of multivariate generalized gaussian distributions with an application to multicomponent texture discrimination, International Journal of Computer Vision95 (2011), 265–286.

21.

Zhang

, Nosratinia

and Wells

R.O.

Jr., , Image denoising via wavelet-domain spatially adaptive FIR Wiener filtering, In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (2000), pp. 2179–2182.

22.

Claudio

R.J.

and Jacob

, Adaptive image denoising and edge enhancement in scale-space using the wavelet transform, Pattern Recognition Letters24 (2003), 965–971.

23.

Kingsbury

N.G.

, Complex wavelets for shift invariant analysis and filtering of signals, Applied and Computational Harmonic Analysis10 (2001), 234–253.

24.

Kingsbury

N.G.

, Design of q-shift complex wavelets for image processing using frequency domain energy minimization, In: Proceedings of International Conference on Image Processing (2003), pp. 1013–1016.

25.

Hill

, Achim

and Bull

, The Undecimated Dual Tree Complex Wavelet Transform and its application to bivariate image denoising using a Cauchy model, In: Proceedings of International Conference on Image Processing (2012), pp. 1205–1208.

26.

Selesnick

I.W.

, Baraniuk

R.G.

and Kingsbury

N.C.

, The dual tree complex wavelet transform, IEEE Signal Processing Magazine22 (2005), 123–151.

27.

Pierre

and Juan

, Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors, IEEE Transactions on Information Theory45 (1999), 909–919.

28.

Chen

G.Y.

and Kégla

, Image denoising with complex ridgelets, Pattern Recognition40 (2007), 578–585.