Abstract
This paper presents a novel transform based watermarking approach using the ridgelet transform for image decomposition. This method embeds the watermark into the Fourier amplitudes of the ridgelet coefficients of the host image. The embedding weights are calculated such that the perceptual transparency is maintained and the robustness against different attacks is guaranteed. The proposed method forms a trade off function considering the similarity between the coefficients of watermarked image and those of the original image, and robustness against different attacks. The ridgelet coefficients of the watermarked image can be obtained by the minimization of this function. Experimental results demonstrate that the newly proposed method performs well in terms of perceptual transparency, and outperforms several state of the art schemes in terms of robustness against common image processing operations and most of the geometric attacks.
Introduction
With the rapid growth of information technologies, more and more digital media products, e.g., audio, images, and video, can be accessed, distributed, and copied in many convenient ways. However, this has also created a demand for protecting the copyright of digital content against piracy and malicious manipulation. To address these concerns, digital watermarking has been introduced [1, 2]. Digital watermarking is the process of embedding a digital watermark into a digital host image without perceptible degradation, so that the embedded watermark can resist extraneous attacks [3].
A successful image watermarking algorithm should be robust against a wide variety of possible attacks. In the field of watermarking, attacks can generally be classified as common image processing operations (or noise-like signal processing [4]) and geometric distortions. The first class, i.e., common image processing operations, includes low-pass filtering, median filtering, noise contamination (usually Gaussian and salt and pepper), edge sharpening, histogram equalization, and compression, e.g., JPEG (Joint Photographic Experts Group) and JPEG-2000. The second class, i.e., geometric distortions, includes rotation, cropping, translation, affine transformations, and scaling. From the image watermarking point of view, geometric distortions cause synchronization errors between the encoder and the decoder: although the digital watermark still exists, the decoder is no longer able to extract it from the attacked image.
In addition to robustness against attacks, imperceptibility and embedding capacity are primary concerns in the context of image watermarking. Imperceptibility means that the effect of watermark embedding should be unobtrusive to the human visual system. A conflict usually arises among robustness, imperceptibility, and capacity, such that increasing the capacity facilitates robustness while decreasing imperceptibility. However, the importance of these requirements is application dependent. For example, high capacity is the most important requirement of watermarking systems for covert communication, while robustness against geometric attacks is the main concern for copyright protection.
To date, several image watermarking schemes have been proposed, which can be classified into several categories [5], namely, methods based on histogram [6, 7], moments [8, 9], transforms [3, 10], spread spectrum (SS) [11, 12], and quantization [13, 14].
The methods based on SS modulate the digital watermark using spreading code patterns and insert the result into the cover image either additively or multiplicatively. The methods belonging to this family have simple embedding and detection procedures, but they usually suffer from the problem of HSI (host signal interference) [15]. This family originated from the idea proposed by Cox et al. [16]. Cannons and Moulin proposed an SS based watermarking algorithm in which the host data are chosen from a secret subset of the full-frame DCT (discrete cosine transform) of an image, and the information of the watermark is then inserted by means of multiplicative embedding [17]. By suitably designing the hash function, this method can reject the HSI. Malvar and Florêncio introduced the improved spread spectrum (ISS) method, in which the host signal does not act as a noise source [18]. By modulating the watermark energy based on the correlation between the digital watermark and the cover image, this method can reduce the HSI. In contrast to the method proposed in [17], this method is blind. The correlation-aware improved spread spectrum (CAISS) embedding scheme was proposed in [19]. This method not only maintains the simplicity of the decoder but also significantly reduces the HSI effect in data hiding by incorporating the side information. To deal with HSI effectively, the improved multiplicative spread spectrum (IMSS) embedding scheme for data hiding was proposed in [20], exploiting both the correlation between the cover signal and the watermark and the decoder structure in embedding the bit information. Liu et al. proposed a new watermarking scheme which numerically encrypts the watermark by a simulation of the optical double random phase encoding (DRPE) process [21]. The random complex image produced by this procedure is then processed to provide a real valued random image having a low number of quantization levels. Adding this image to the cover image gives the watermarked one.
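To make the HSI problem concrete, the following is a minimal sketch of plain additive SS embedding with a correlation detector. It is not the method of any of the cited works; the function names, the vector length, and the host statistics are assumptions chosen for illustration only.

```python
import numpy as np

def ss_embed(host, bit, code, strength):
    """Additive spread spectrum: add +/- strength * code to the host vector."""
    sign = 1.0 if bit == 1 else -1.0
    return host + sign * strength * code

def ss_decode(received, code):
    """Correlation detector: the sign of <received, code> decides the bit."""
    return 1 if np.dot(received, code) >= 0 else 0

def host_interference(host, code):
    """Projection of the host onto the code; this term is the HSI."""
    return np.dot(host, code) / np.dot(code, code)

# Hypothetical host features and a +/-1 spreading code (illustrative values).
rng = np.random.default_rng(0)
n = 4096
host = rng.normal(0.0, 10.0, size=n)
code = rng.choice([-1.0, 1.0], size=n)

hsi = host_interference(host, code)
# The detector statistic equals n * (hsi +/- strength), so decoding is
# reliable only when the embedding strength exceeds |hsi|.
strong = abs(hsi) + 1.0
decoded = [ss_decode(ss_embed(host, b, code, strong), code) for b in (0, 1)]
print("HSI:", hsi, "decoded bits:", decoded)
```

In this sketch the host projection enters the detector exactly like noise, which is why ISS-type schemes adapt the embedding to the measured correlation instead of using a fixed strength.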
To embed the information of the watermark in quantization based watermarking methods, features are extracted from the cover image and then quantized to lattice points. In contrast to SS based methods, this family does not suffer from HSI. The main problem in this family is the fragility against the amplitude scaling attack. To tackle this problem, a maximum likelihood technique was used for estimating amplitude scaling factors in [22]. This method first derives the probability density function (PDF) of the watermarked and attacked images in the absence of subtractive dither and subsequently modifies the models to incorporate the subtractive dither of the encoder. The method proposed in [23] uses Watson’s perceptual model to adaptively determine the quantization step size based on the estimated perceptual slack. This model is then improved so that the slacks scale linearly with valumetric changes (e.g., changes in the amplitude). In this way, a quantization index modulation (QIM) algorithm is obtained which is theoretically invariant to valumetric scaling. A robust quantization-based image watermarking scheme was proposed in [24], which uniformly quantizes the angles of gradient vectors at wavelet planes to embed the watermark bits. This method is therefore called GDWM (gradient direction watermarking). Because the watermark is embedded in the angles of the significant gradient vectors, this method is robust to amplitude scaling attacks. The method proposed in [14] divides the cover signal into two parts and calculates the $\ell_p$-norm of each part. Owing to the use of the division function, this method is robust to scaling attacks.
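The fragility of lattice quantization under amplitude scaling can be illustrated with a minimal scalar QIM sketch. This is a textbook-style construction with two interleaved uniform lattices, not the scheme of any cited work; the feature values, the step size, and the scaling factor are assumptions.

```python
import numpy as np

def qim_embed(x, bits, step):
    """Quantize each feature to the lattice selected by its bit.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted by step/2.
    """
    x = np.asarray(x, dtype=float)
    dither = np.asarray(bits) * (step / 2.0)
    return np.round((x - dither) / step) * step + dither

def qim_decode(y, step):
    """Pick, for each sample, the lattice whose nearest point is closer."""
    y = np.asarray(y, dtype=float)
    dist0 = np.abs(y - np.round(y / step) * step)
    shifted = np.round((y - step / 2.0) / step) * step + step / 2.0
    dist1 = np.abs(y - shifted)
    return (dist1 < dist0).astype(int)

# Illustrative (made-up) feature values and watermark bits.
features = [10.3, 47.9, 101.2, 150.7]
bits = [0, 1, 1, 0]
step = 8.0

marked = qim_embed(features, bits, step)
print("clean decode: ", qim_decode(marked, step))
print("scaled decode:", qim_decode(1.3 * marked, step))
```

Small additive perturbations (below a quarter of the step) leave the decision unchanged, whereas multiplying the marked features by a constant moves them off the lattice, which is the weakness that the scaling-invariant schemes discussed above are designed to remove.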
Alghoneimy and Tewfik proposed a novel technique for watermarking digital images based on moment invariants [25]. In this method, the watermark is constructed from the second and third order moments, which are invariant to scaling and orthogonal transformations. In other words, the watermarked image is formed by a linear combination of the host signal and a nonlinear transformation of it. A watermarking scheme was proposed in [26], in which the watermark embedding is carried out in the DCT (discrete cosine transform) domain of the host image. In the extraction phase, the geometric distortion factors are first estimated by using moments of the image. By using the estimated factors, one can convert the transformed image back to its original size, position, and shape. Based on a fast and accurate framework for the calculation of Zernike moments (ZMs), a new watermarking scheme was proposed in [27]. High capacity is achieved in this scheme by maximizing the data embedding size and modifying the hiding ratio. This method introduced the conditional quantization technique, which makes it possible to reduce the total number of ZMs that must be altered during embedding, to improve perceptual transparency, and to enhance the robustness against various attacks. Liu proposed a new moment based method which can be applied to colour images [28]. The bi-level moment-preserving technique is incorporated in this scheme, which is a two-stage dual parity-check method used in the proofing process to verify two sets of dual authentication data for obtaining a higher performance.
Due to the fact that the histogram shape of a natural image does not change with pixel position and is insensitive to different attacks, it has a good potential to be exploited in the watermarking. Based on the histogram shape, several watermarking methods have been proposed [6, 30]. The main problem related to this family is the fragility against histogram equalization. The method proposed by Wei et al. uses the invariant property of the image statistic characteristic [29]. According to a predefined step, a specific range of pixel values is chosen and divided into some intervals. Subsequently, this method quantizes the pixels of each interval to have the same value according to the bits of the watermark. A geometrically robust image watermarking scheme was proposed in [30], in which the Harris-Laplace detector is used to extract the feature points and to construct local circular regions (LCRs). Then, a set of non-overlapped LCRs is chosen by a clustering-based feature selection method and geometrically invariant regions are formed via orientation normalization. The histogram of the pixel position is calculated over the regions to be used in embedding. In the embedding phase of the method proposed in [6], the cover image is pre-processed by a Gaussian low pass filter. Then, a number of gray levels are selected by a secret key and the histogram of them is constructed. Subsequently, the histogram-shape-related index is utilized to select the groups of pixels having the highest number of pixels. Finally, the watermark is inserted into the selected pixel groups.
Generally, transform domain approaches are more popular because more information can be embedded and greater robustness against attacks can be achieved by these methods [3]. The transforms usually used for watermarking schemes are DCT [31], discrete Fourier transform (DFT) [32], discrete wavelet transform (DWT) [3], ridgelet [33], contourlet [11, 34], ripplet [35–37], and shearlet [38]. In most of the methods of this family, a combination of SVD (singular value decomposition) and a multiscale transform such as wavelet is used [3, 40]. Furthermore, some successful watermarking methods have been proposed by modeling wavelet, contourlet, and shearlet coefficients (for example, contourlet coefficients are modeled with the GGD (generalized Gaussian distribution) in the method proposed by Akhaee et al. [11]). Here, several state of the art transform based methods are reviewed. Based on fuzzy support vector machine (FSVM) correction, a watermarking method was proposed in [41]. This method extracts a significant bitplane image from the host image and performs the undecimated discrete wavelet transform (UDWT) on it. It then divides the lowpass subband into small blocks, modulates selected wavelet coefficients in the small blocks, and embeds them into the host image. The method proposed in [42] uses eight samples of wavelet low frequency coefficients for each image block to form two line segments in the two-dimensional (2D) space. The angle between these line segments is changed for data embedding. To resolve the tradeoff between the robustness and the transparency of the watermarked image, geometrical tools have been used in this method.
In this paper, a new technique in the ridgelet domain is proposed, in which the watermark is embedded in the rotation invariant features of the host image. This method uses the Fourier amplitudes of the ridgelet coefficients of the image for data embedding. The model aims to increase the robustness against attacks while maintaining the visual transparency. The embedding weights are calculated depending on the watermark and the host image. Experimental results confirm the validity of the proposed method and demonstrate its superiority in terms of visual transparency and robustness against attacks, including common image processing operations and most of the geometric distortions.
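The rotation invariance of these features rests on a basic DFT property, which the following sketch checks numerically. It assumes that a rotation of the image acts as a circular shift of the direction index of the coefficient matrix, which is an idealization; the random matrix below merely stands in for ridgelet coefficients, and the function name is hypothetical.

```python
import numpy as np

def direction_fourier_amplitude(coeffs):
    """Magnitude of the DFT taken along the direction axis (axis=1)."""
    return np.abs(np.fft.fft(coeffs, axis=1))

# Stand-in for a ridgelet coefficient matrix (assumed layout: one column
# per direction). The values are random, not a real ridgelet transform.
rng = np.random.default_rng(1)
coeffs = rng.normal(size=(16, 32))

# Idealized rotation: circular shift of the direction index by 5 columns.
rotated = np.roll(coeffs, shift=5, axis=1)

amp_original = direction_fourier_amplitude(coeffs)
amp_rotated = direction_fourier_amplitude(rotated)
print("max amplitude difference:", np.max(np.abs(amp_original - amp_rotated)))
```

A circular shift only multiplies each DFT coefficient by a unit-modulus phase factor, so the amplitudes agree up to floating point error even though the coefficient matrices themselves differ.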
The remainder of this paper is organized as follows. Section 2 describes the modeling of the problem, the solution of the model to extract the weights and the watermarked image, and the decoding process. Section 3 is dedicated to experimental results and a comparison of the proposed method with some state of the art techniques. Conclusions are given in Section 4.
Proposed methodology
The search for nonseparable multiscale transforms that can capture the intrinsic geometrical structures of an image, e.g., smooth contours, led to the advent of the ridgelet transform. Applying this transform to a 2D image gives a 2D matrix in which each column represents a specific direction. In addition, it has been shown that the amplitude of the Fourier transform over the direction in the ridgelet domain is rotation invariant [43]. Here, we exploit this property to make the algorithm robust against rotation attacks. To this end, we apply the DFT to the rows of the ridgelet matrix to obtain another matrix, which is called the “host matrix”. Subsequently, a new matrix is formed by calculating the amplitude of each element of the host matrix. Then, this new matrix and the watermark are rearranged into row-wise vectors, which are called the host vector and the watermark vector, respectively. Let the host vector be
So, minimization of
where (X: Y) = trace(XY). After some manipulations we have:
Setting Equation (6) equal to zero yields:
After some manipulations we have:
Using the steepest descent algorithm, one can obtain
So, the steps of the proposed algorithm are:
Given numbers Nu and Ni which are the number of iterations: Calculate the matrix obtained by the Fourier amplitudes of the ridgelet coefficients of the host image. Then, rearrange it into a row-wise vector. Set Obtain the extraction weights from Equation (7):
Repeat the step 5 for Ni times. Set t = t+1 and calculate the vector of the watermarked image:
Set n = n+1 and if n < Nu go to step 3. Rearrange the vector
The minimization strategy of the proposed method is similar to the block-coordinate minimization algorithm used in [45], that starts with an initialization
The calculated extraction weights i.e.,
The decoding steps are as follows: Calculate the matrix obtained by the Fourier amplitudes of the ridgelet coefficients of the watermarked image. Then, rearrange it into a row-wise vector. Multiply this vector into
In this section, the performance of the proposed image watermarking method is evaluated in comparison with the methods proposed in [39, 40], and [46]. For the experiments, ten well known images of size 512×512 are used: Fruits, Gold hill, Lena, Baboon, Man, Couple, Peppers, Stream and bridge, Barbara, and Boat (see Fig. 1). Most of these images are available in the USC-SIPI image database. Five pseudorandom binary watermarks and five different logos, which are shown in Fig. 2, have been used in the experiments.

The host images used in the experiments: (a) Fruits, (b) Gold hill, (c) Lena, (d) Baboon, (e) Man, (f) Couple, (g) Peppers, (h) Stream and bridge, (i) Barbara, (j) Boat.

The digital watermarks used in the experiments.
In the proposed algorithm, the two trade-off parameters, i.e., α and β, are very important because they control the trade-off between the perceptual transparency and the robustness against different attacks. The time step Δt is set to 0.1, and the iteration parameters Nu and Ni are set to 10 and 12, respectively. In order to achieve a desirable perceptual transparency, the PSNR (peak signal-to-noise ratio) should be higher than 40 dB. Figure 3 shows the robustness against three different attacks, namely, JPEG (quality parameter = 50), Gaussian noise (variance = 0.003), and cropping (x 25%, y 25%), versus various trade-off parameters. In order to quantify the robustness, the normalized cross correlation (NC) metric has been used in this figure (see subsection 3.3).
The average NC versus different values of α and β.
The images in Fig. 1(f) to (j) are used in this subsection to evaluate the perceptual transparency of the proposed method (the parameters are tuned by using Fig. 1(a) to (e)). Figure 4 shows the watermarked images obtained by the proposed scheme. From this figure, one can see that the watermarked images are perceptually similar to the original host images (see Fig. 1).
Two criteria are usually used to quantitatively assess the perceptual transparency of the watermarked images. The first criterion is PSNR, which is defined as the ratio between the maximum possible power of the original signal and the power of the distorting noise that affects the quality of its representation, expressed in decibels. In other words, we take the square of the maximum value of the original image (for an 8 bit image, the maximum value is 255) and divide it by the MSE (mean square error) between the original and watermarked images:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right)$$
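The PSNR described above can be computed as in the following minimal sketch. It assumes 8 bit images (peak value 255) and the usual decibel form of the ratio; the function name and the toy images are illustrative only.

```python
import numpy as np

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    original = np.asarray(original, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((original - distorted) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy example: every pixel differs by one gray level, so MSE = 1.
host = np.zeros((8, 8), dtype=np.uint8)
marked = np.ones((8, 8), dtype=np.uint8)
print("PSNR (dB):", psnr(host, marked))
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8 bit arithmetic would otherwise introduce.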

Examples of the watermarked images obtained by our approach (“Black and White” logo has been used in all of the images): (a) Couple, (b) Peppers, (c) Stream and bridge, (d) Barbara, (e) Boat.
The second criterion is the structural similarity (SSIM) index which is a perceptual measure that quantifies image quality distortion caused by image processing e.g., watermarking. This criterion is a full reference metric requiring both the reference image and the distorted image and is calculated on local windows:
where μ
Table 1 shows these criteria for the proposed scheme. For each value, ten experiments (one per watermark) are performed and their average is reported. As can be seen from this table, the PSNRs are higher than 40 dB for all of the images, which indicates that the embedded watermark is not visible to the human eye.
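For reference, the SSIM index can be sketched as follows. This uses the widely used form with stabilizing constants (K1 = 0.01, K2 = 0.03) evaluated over a single window covering the whole image, whereas the criterion described above is calculated on local windows and averaged; the constants, the function name, and the test images are assumptions and may differ from the settings used in the experiments.

```python
import numpy as np

def ssim_single_window(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM computed over one window that spans the whole image."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * peak) ** 2  # stabilizes the luminance term
    c2 = (k2 * peak) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Made-up test data: a random image and its gray-level inverse.
rng = np.random.default_rng(2)
image = rng.integers(0, 256, size=(32, 32))
inverted = 255 - image
print("SSIM(image, image):   ", ssim_single_window(image, image))
print("SSIM(image, inverted):", ssim_single_window(image, inverted))
```

An image compared with itself gives an index of 1, and any distortion lowers it, which is why values close to 1 are read as high perceptual transparency.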
Perceptual transparency for different host images in terms of PSNR and SSIM
Again, the images in Fig. 1(f) to (j) are used to evaluate the robustness of the proposed method against common image processing operations and geometric attacks. Each watermarked image is tested under different attacks, including median filtering, Gaussian filtering, salt and pepper noise, Gaussian noise, speckle noise, JPEG compression, rotation, histogram equalization, translation, cropping, and edge sharpening, and the results are averaged over the five test images. The NC metric is used to evaluate the robustness of our scheme and to compare it with the three state of the art methods proposed in [39, 40, 46]:
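One common form of the normalized cross correlation between an original watermark and an extracted one is sketched below. NC is defined in slightly different ways across the literature, so this particular normalization (the inner product divided by the product of the two Euclidean norms) is an assumption rather than necessarily the definition used in the experiments; the function name and the example bits are illustrative.

```python
import numpy as np

def normalized_correlation(watermark, extracted):
    """Inner product of the two watermarks divided by the product of their norms."""
    w = np.asarray(watermark, dtype=np.float64).ravel()
    e = np.asarray(extracted, dtype=np.float64).ravel()
    denom = np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(e ** 2))
    if denom == 0:
        return 0.0  # an all-zero watermark carries no information
    return float(np.sum(w * e) / denom)

# Made-up 8-bit watermark and an extraction with one flipped bit.
original_bits = [1, 0, 1, 1, 0, 1, 0, 0]
extracted_bits = [1, 0, 1, 1, 0, 1, 0, 1]
print("NC (perfect):  ", normalized_correlation(original_bits, original_bits))
print("NC (one error):", normalized_correlation(original_bits, extracted_bits))
```

With this normalization a perfectly recovered watermark gives NC = 1, and extraction errors reduce the value.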
Some of the attacked watermarked images for the Peppers image are shown in Fig. 5. The robustness results for the different common image processing operations and geometric attacks are reported in terms of NC (see Table 2). According to the values in this table, the proposed method has excellent robustness against common image processing operations, e.g., histogram equalization, Gaussian noise, speckle noise, median filtering, Gaussian filtering, and salt and pepper noise. Furthermore, the proposed method has satisfactory robustness against geometric attacks: it has the best robustness against cropping, rotation, and translations less than or equal to 10 pixels.

The attacked watermarked images: (a) Median filtering (3×3), (b) Gaussian filtering (3×3), (c) Salt and pepper noise (0.02), (d) Gaussian noise (0.002), (e) Speckle noise (0.01), (f) JPEG 50, (g) Rotation (50), (h) Histogram equalization, (i) Translation (H 15, V 15), (j) Cropping (x 25%, y 25%), (k) Edge sharpening.
The watermark detection results for common image processing operations and geometric distortions (NC)
This subsection investigates the capacity performance of the proposed watermarking scheme. For this purpose, the robustness and perceptual transparency are assessed for different watermark lengths. The average NC and average PSNR over five cover images are shown in Fig. 6. This figure shows that the proposed method can embed long watermarks while still satisfying the perceptual transparency and robustness requirements.

The average NC and average PSNR versus different watermark lengths: (a) Average NC results versus different watermark lengths, (b) Average PSNR results versus different watermark lengths.
In this paper, a novel ridgelet based watermarking method was proposed using the Fourier amplitudes of the ridgelet coefficients for data embedding. To obtain the extraction weights, a trade-off function was formed which takes into account not only the perceptual transparency but also robustness against attacks. In order to maintain the perceptual transparency, the proposed method guarantees the similarity between the Fourier amplitudes of the ridgelet coefficients of the watermarked image and those of the original image. In addition, the embedding weights are determined such that serious attacks to the host image do not considerably change the extracted watermark. Experimental results demonstrated that the proposed method is highly robust against different common image processing operations, in comparison with other state-of-the-art watermarking methods. Meanwhile, the proposed watermarking method provided a satisfactory performance under geometric attacks such that it offered the best performance against cropping, rotation, and translations less than or equal to 10 pixels.
