Abstract
Image steganography provides efficient techniques and methods for embedding secure data into an image. Researchers face many challenges in this field, such as ensuring that the quality of the stego image is adequate, ensuring that the hidden message is secure, increasing the hiding capacity, recovering the hidden message and the cover image losslessly, and overcoming the effects of lossy image compression on the hidden message. In this paper, we address these challenges by proposing a new image steganography model to ensure the security of data in cloud storage. The fundamental processes within the base level of the proposed model preprocess both the cover image and the secret message. The cover image is transformed to the wavelet domain using an integer-to-integer transform, while the secret message is compressed using lossless entropy coding and then encrypted for additional security before embedding. The control level of the model drives the steganography process as follows. Firstly, it selects significant coefficients from the transformed cover image according to some threshold values. Then, it creates groups of 7 bits and 3 bits from the non-lossy bits of the selected significant coefficients and the encrypted bit stream of the secret message, respectively. Finally, the non-lossy bits in the selected significant coefficients are updated by injecting the secret bits. Throughout the steganography process, the consistency between the payload of the secret message and the number of selected significant coefficients is checked. The model is validated and verified using extensive real experiments. Moreover, the performance of the proposed model is measured by comparison with other recent models.
Keywords
Introduction
Steganography is the science of hiding one or more messages inside a cover object so that only the sender and the intended receiver can detect the existence of such messages, thereby protecting both the important information and the communicating parties. There are two categories of steganography: traditional and modern. There are two types of traditional steganography: technical and linguistic [3, 11]. Technical steganography hides information using scientific methods, devices, and tools. Linguistic steganography manipulates language and text. There are two types of linguistic steganography: open codes and semagrams. Open codes embed messages in other messages in a way that does not arouse the suspicion of the average person who is not looking for a hidden message. Semagrams hide information using special visual signs. These visual signs can be pictures or changes in the visual style of the text. There are three types of modern steganography: image steganography, where messages are hidden in digital images in a non-intrusive way that is hard to detect visually; audio steganography, where audio information is hidden within an audio message; and video steganography, where video information is hidden within a video stream.
Images are the most widely used cover objects for steganography. Image steganography techniques can be used in the following domains [1, 20]. Firstly, the spatial domain, where steganography algorithms hide data directly in certain intensity values of the image. Spatial domain steganography techniques include least significant bit (LSB) replacement, matrix embedding, random pixel embedding (RPE), mapping pixel, pixel value differencing (PVD), edge-based data embedding (EBE), labeling and connectivity, pixel intensity, texture-based, histogram shifting (HS), and multi-cover adaptive steganography. Secondly, the transform domain, where various algorithms and transformations are applied to images to hide information. Thirdly, the frequency domain, where the embedding process is performed after some type of transformation of the image pixel values. The transformed pixels are classified into significant and non-significant coefficients, which control image quality. Hiding data in the frequency domain has several advantages over the spatial domain, one being that data hidden in significant coefficients are less exposed to image processing operations such as compression. The techniques applied in the frequency domain are based on one of the discrete (Fourier, cosine and wavelet) transforms. Fourthly, the compression domain, where steganography techniques take into account the effect of compression on the secret hidden data. Compression plays an important role in choosing which steganographic algorithms to use. Lossy compression increases the possibility that the embedded data may be partly lost. Lossless compression keeps both the original cover image and the embedded data unchanged, but it achieves lower compression ratios than lossy compression. The most commonly used compression techniques for digital images in steganography are the joint photographic experts group (JPEG), embedded zerotree wavelet (EZW) and JPEG2000.
In this paper, a new image steganography model is proposed that securely embeds messages within images using the EZW technique, which helps to ensure a higher level of data security in cloud environments. The proposed model is composed of a base level and a control level. The base level has two functions: preprocessing and postprocessing. The preprocessing stage is responsible for creating the ready I/O image using the integer wavelet transform (integer-WT) and for preparing the secret messages using text compression and encryption. The postprocessing stage starts after the control level finishes its processes. It uses the EZW encoder with a predetermined compression ratio to generate the encoded image and the EZW decoder to generate the stego image. The control level is responsible for driving the steganography process through the following steps: (1) extracting the significant coefficients from the prepared I/O image and generating the cover bits as groups of 7 bits; (2) generating the secret bits from the bit stream of the compressed and encrypted message as groups of 3 bits; (3) injecting each group of 3 bits into its corresponding group of 7 bits and updating the significant coefficients; and (4) reflecting the updated and consistent coefficients to their corresponding positions in the ready I/O image.
The remainder of this paper is organized as follows. In Section 2, the related work is reviewed and discussed. The proposed steganography model, its structural components, and the behavior sequence that describes its role are presented in Section 3. In Section 4, the experiment results and performance analysis are provided. Finally, some concluding remarks and future work are discussed in Section 5.
Related work
To fulfill the steganography requirements of images, appropriate tools, methods and techniques are needed. Existing research on image steganography is reviewed below.
In a previous study [16], the authors analyzed both least significant bit (LSB)-based steganography and DCT-based steganography. In LSB-based steganography, the secret text message is embedded into the LSBs of the cover image pixels in the spatial domain, whereas in DCT-based steganography, the text message is embedded into the LSBs of the DCT coefficients of the cover image. The authors state, “the DCT-based steganography scheme works perfectly with minimal distortion to image quality (in PSNR) as compared to LSB-based steganography scheme for both gray scale and color images” (Walia, 2010, p. 7). In their work, the amount of secret data that can be hidden using the DCT-based scheme is small compared to the LSB-based scheme. The DCT-based scheme is recommended by the authors because of the minimum distortion of the image quality.
In another study [21], the authors employed a technique where secret data are embedded into the compression codes during the image compression process using a prediction-based image hiding scheme. The scheme embeds a secret message into the transformed host image coefficients in the DCT domain. Embedding secret data during the compression process of the host image is based on the compression type which can be lossy or lossless. In the prediction coding stage of image compression, the predictive pixel values are estimated using a predictive coding technique. The main idea of the proposed method is to embed secret data into the prediction error values, which is the difference between the predictive values and the original values. In the proposed scheme, a predictor is employed to estimate the predictive pixel values, and the prediction error values are slightly modified to embed secret data. The modified values are then compressed with entropy coding. The results show that hiding data via the predictive coding approach supports a large hiding capacity and high stego image quality.
In a recent study [10], a reversible data hiding scheme based on the variation of the DCT coefficients of an image was proposed. Cover images are decomposed into several different frequencies and the high-frequency parts are embedded with secret data. In this work, an integer-to-integer mapping is applied to implement the 2D DCT to solve the problem of rounding errors. Thus, the image recovered from the modified coefficients can be transformed back to the correct data-hidden coefficients. Due to the nature of the 2D DCT, the coefficients approximately follow a Gaussian distribution centered at zero, which makes them a natural candidate for embedding secret data using the histogram shifting method. The main idea of the proposed approach is to shift the positive and negative coefficients around zero to the right and to the left, respectively, to free space for hiding the secret data. The authors claim that their proposed reversible data hiding scheme improves the quality of stego images and increases hiding capacity. They concluded that “the histogram shifting method based on DCT coefficients has high potential in competition with the method of hiding in quantized DCT coefficients” (Lin, 2012, p. 10).
A lossless data hiding method for digital images using the integer wavelet transform and a threshold embedding technique was investigated in [19]. In this study, an image is transformed using the Cohen-Daubechies-Feauveau (CDF) integer wavelet transform adopted by JPEG2000 for lossless image compression. The data are embedded into the LSB bit-plane of the high-frequency CDF integer wavelet coefficients whose magnitudes are smaller than a certain predefined threshold. Histogram modification is applied to prevent overflow and underflow problems. Since the coefficients in the three high-frequency subbands represent 75% of all the transformed coefficients, and at most one bit can be embedded into a coefficient, the highest payload is 0.75 bits per pixel (bpp). The larger the threshold, the higher the payload will be. On the other hand, the larger the threshold, the lower the quality of the stego image. To further improve the payload, a multi-time embedding technique can be adopted. Specifically, the stego image can be treated as an original image. By applying the same embedding procedure to the stego image, the payload can be increased. The drawback of this method is the degradation of the stego image quality caused by the multi-time embedding process.
The work in [8] proposed a method that depends on the contrastive relation between the gray-level values of the selected pixels and those of their top, bottom, left and right neighbors. The authors claimed that some of the contrastive relations of gray-level values are not destroyed after lossy compression using JPEG. According to the value of each bit in the hidden data, some pixels (exactly 5) in the cover image may be altered. The authors found that it is extremely complex to alter 5 pixels to embed just one bit from the hidden data. As a result, the size of the hidden data, as discussed in [8], is only 2056 bytes in a cover image of 512×512 pixels. Moreover, the JPEG compression ratios the authors tested range between 2.05 : 1 and 2.44 : 1, which are low compression ratios.
Similarly, in [14], the investigators proposed a frequency-domain algorithm that is compatible with lossy JPEG compression. The algorithm embeds a message into the DCT coefficients of an image according to the relative size of the selected coefficient value and the average value of all its adjacent coefficients in the block used to embed and extract the hidden message bit. The goal of the algorithm is to maximize the quality of the images after compression and hiding, not the size of the hidden message. The results showed that the algorithm produced stego images whose quality exceeded 30 dB when the quantization factor was greater than 35%. Even though the quality of the stego images is good, the method restricts the number of hidden bits to be equal to the number of 8×8 blocks in the cover image.
In a more recent study [15], the authors introduced a high capacity reversible steganography (CRS) method using multilayer embedding. The method is based on image interpolation to enhance the embedding capacity while preserving the quality of the stego image. The main objective of the method is to take greater advantage of the similar properties of the neighboring pixels’ difference value.
In [6], the authors discussed the effects of the JPEG lossy compression algorithm on image pixels and coding techniques to counter the corruption of steganographically hidden data. A byte-by-byte comparison of the original image files and the JPEG-processed versions showed that the least significant bits contain a number of errors that differs from image to image, so data embedded in any or all of the lower 5 bits are corrupted beyond recognition. Attempts to embed data in these bits and recover them after JPEG processing showed that the recovered data were completely garbled by JPEG. The authors observed that, in JPEG-processed images, the pixels that were changed from their original appearance were similar in color to the original. The proposed method introduced a new coding scheme based on viewing each pixel as a point in space with the three color channel values as its coordinates. The method results in a lower error rate, but requires the addition of repetitive redundancy to even attempt to achieve a reasonable error rate. The method is able to demonstrate a recovery rate of approximately 30% of embedded data.
Recently investigators examined a method to increase the image compression rate of JPEG based on DCT and DWT using steganography [9]. The key idea was to compress a target block of an image using JPEG, and then hide the resulting bits into subsequent blocks of the compressed image. Consequently, data compression is performed twice, once using the JPEG technique and the other using steganography.
The work in [12] introduced a fast algorithm for matrix embedding steganography. Matrix embedding steganography using Hamming and random linear codes suffers from the excessive computational complexity of the embedding process. The method introduced a fast algorithm to reduce computational complexity using the syndrome of the error correction code to search for the coset leader, which was a modification to the cover image. Moreover, it improved the parity check matrix of the Hamming code to make the syndrome indicate its coset leader by itself. For matrix embedding using random linear codes, the syndrome was used to indicate its coset leader for most cosets, and the coset leader was searched for by its syndrome for the rest. The proposed scheme has the same embedding efficiency as conventional matrix embedding, but it has lower computational complexity for steganographic systems with only low and medium embedding rates.
Finally, recent evidence suggests the use of a reversible data hiding method in the JPEG compression domain [17]. In this method, the data are hidden in the space made by lowering certain quantization table entries and lifting corresponding quantized DCT coefficients with an adjustment value added. The embedding strategy and sequence are optimized in order to get a better stego image quality.
Proposed model
In this section, we present a new steganography model for hiding secure messages within images under both lossy and lossless compression to increase data security in the cloud environment, as shown in Fig. 1. Lossy coding of the cover image after embedding a secret message is not guaranteed to recover the original message correctly. The main function of the model is therefore to embed the secret bits in the non-lossy bits of significant coefficients during the compression process. The proposed model comprises two levels: the base level and the control level. The base level is the input/output interface of the model and has two types of process flow. The first type concerns the preprocessing of both the cover image and the secret message. The image is initially transformed into the wavelet domain using an integer-to-integer wavelet transform through the lifting process with a perfect reconstruction filter bank. Two conditions are necessary for a perfect reconstruction filter bank:
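For reference, a common statement of these two conditions in the lifting literature (for example, in the factorization of wavelet transforms into lifting steps by Daubechies and Sweldens) is given below. The symbols are assumptions introduced here for illustration: $h$ and $g$ denote the low-pass and high-pass analysis filters, and $\tilde{h}$ and $\tilde{g}$ the corresponding synthesis filters. The first condition removes distortion and the second cancels aliasing.

$$
h(z)\,\tilde{h}(z^{-1}) + g(z)\,\tilde{g}(z^{-1}) = 2,
\qquad
h(z)\,\tilde{h}(-z^{-1}) + g(z)\,\tilde{g}(-z^{-1}) = 0
$$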

Fig. 1. Proposed coding/steganography model.
We can rewrite the z-transform as:
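In the lifting literature, this rewriting is usually the polyphase split $h(z) = h_e(z^2) + z^{-1} h_o(z^2)$, which separates the even and odd samples of a filter. The following Python sketch illustrates the integer-to-integer property that lifting provides on such an even/odd split. It is a minimal sketch that assumes the integer Haar (S) transform as the wavelet, since the filter actually used by the model is not named here; the function names are hypothetical.

```python
def forward_lift(signal):
    """One level of the integer Haar (S) transform computed by lifting.

    Assumption: Haar is used purely for illustration; the model may use
    a different integer wavelet filter.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even, odd = signal[0::2], signal[1::2]
    # Predict step: detail = odd sample minus its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: approximation = even sample + floor(detail / 2).
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def inverse_lift(approx, detail):
    """Undo forward_lift exactly by reversing the lifting steps."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend((e, o))
    return signal


pixels = [12, 15, 200, 198, 0, 255, 77, 77]
approx, detail = forward_lift(pixels)
# Integer in, integer out, and the inverse restores every sample.
assert inverse_lift(approx, detail) == pixels
```

Because each lifting step adds or subtracts a rounded quantity that the inverse can recompute, no rounding error accumulates, which is what allows the cover image to be restored losslessly.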
The transformed coefficients are then passed to the control level, where they serve as the cover media in the hiding process. The secret message, in turn, is compressed using arithmetic coding [18] as a lossless compression technique. This compression enhances the hiding capacity by increasing the amount of message content that can be injected into the cover media. The message can be encrypted after the compression process using the DES encryption technique [5]. Using this approach, the security of the hidden message is ensured twice: once by encryption and once by hiding it within the cover media. The second process flow in the base level works after the hiding process has finished. Once the updated coefficients have been received from the control level, the transformed coefficients are fed into the compression process using EZW to compress the image in either a lossless or a lossy format according to a predetermined quality level or a target bit rate [13]. In EZW, zero-trees are used to predict the insignificance of wavelet coefficients at low levels according to the insignificance of the root at the higher level. This can be clarified by assuming c to be a child of r whose probability density functions are related as:
This means:
Let x = c² and y = r², and assume that x and y are correlated with correlation coefficient ρ; then we have the following relations:
Then the minimum error variance for the best linear unbiased estimator of x given y can be given by:
The predetermined compression ratio is important for selecting both coefficients and cover bits used in the hiding process. The compressed bit-stream may be stored as an encoded image or transmitted through a network. To explain the effect of the hiding process on the compressed image, the encoded image can be decoded by the EZW decoder to obtain the stego image.
The control level is divided into two views. The first view is the abstract view, which identifies the components of the hiding process and the dependencies between them. The second view is the operational view, which visualizes the physical functions related to each of the abstract components. The control-level scenario in both views is as follows. The first component is the target coefficients extractor, which receives the transformed wavelet coefficients of the cover image and the predetermined compression ratio defined by the user. The target coefficients extractor selects only the significant coefficients according to selected threshold values. The initial threshold is set to a power of 2 that is greater than half of the maximum coefficient magnitude, and is computed using the following equation [13]:
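A common form of this rule in EZW implementations is $T_0 = 2^{\lfloor \log_2 \max |c| \rfloor}$, with the threshold halved on each subsequent pass. The Python sketch below assumes that form and integer coefficients; the function names and the sample coefficients are hypothetical and only illustrate how significant coefficients could be selected.

```python
def initial_threshold(coeffs):
    """Largest power of 2 not exceeding the peak coefficient magnitude.

    Assumption: T0 = 2 ** floor(log2(max |c|)), the usual EZW choice.
    """
    peak = max(abs(c) for c in coeffs)
    if peak == 0:
        raise ValueError("all coefficients are zero")
    return 1 << (peak.bit_length() - 1)


def threshold_schedule(coeffs, passes):
    """Thresholds for successive passes: T0, T0/2, T0/4, ... (at least 1)."""
    t0 = initial_threshold(coeffs)
    return [max(t0 >> k, 1) for k in range(passes)]


def significant(coeffs, threshold):
    """(index, value) pairs whose magnitude reaches the threshold."""
    return [(i, c) for i, c in enumerate(coeffs) if abs(c) >= threshold]


coeffs = [63, -34, 49, 10, 7, 13, -12, 7]   # illustrative integer coefficients
t0 = initial_threshold(coeffs)              # 32, since 32 <= 63 < 64
first_pass = significant(coeffs, t0)        # coefficients 63, -34 and 49
```

The number of passes would follow from the predetermined compression ratio: fewer passes mean fewer bit planes survive coding, so fewer low-order bits remain safe for embedding.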
The second component is the bit generator/injector, which extracts the required bits from both the selected significant coefficients and the secret message and then injects the secret bits into the cover bits extracted from the coefficients. The generator starts by dividing the stream of bits received from the text compression and encryption algorithms into groups of 3 bits. It also selects the least significant bit(s) from the significant coefficients and divides them into groups of 7 bits. In the selection process, the generator has two optional modes: the fixed position mode and the variable position mode. In the variable mode, it selects the LSBs from all significant coefficients regardless of their position in the binary representation. The LSB position changes between coefficients according to the behavior of bit-plane coding and the variable threshold across the different passes. In the fixed position mode, the generator selects a fixed position in the binary representation of the coefficients, usually some of the LSBs and their neighbors. For example, if it selects the third position from the right-hand side, this is the LSB for the coefficients in the last iteration but may not be the least significant bit for the coefficients in the earlier, higher iterations. After generating and grouping the cover and secret bits, the generator injects the groups of secret bits into the groups of cover bits using a method similar to that in [4], as follows:
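Assume that an m-bit codeword x is used to embed an n-bit message M, and that a hash function f extracts n bits from a modified codeword x′. The matrix embedding technique finds a suitable modified codeword x′ for every x and M with M = f (x′), such that the Hamming distance between x and x′ is as small as possible. With m = 7 and n = 3, as used here, at most one of the seven cover bits is changed to embed three secret bits.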
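The 7-bit and 3-bit grouping corresponds to the classic (1, 7, 3) matrix embedding built on the [7, 4] Hamming code. The Python sketch below assumes that standard construction, in which the hash f is the XOR of the 1-based positions of the set bits; it is not necessarily the exact procedure of [4], and the function names are hypothetical.

```python
from itertools import product


def syndrome(bits):
    """Hash f: XOR of the 1-based positions of all set bits (value 0..7)."""
    value = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            value ^= position
    return value


def embed(cover_bits, secret_bits):
    """Hide 3 secret bits in 7 cover bits, flipping at most one cover bit."""
    if len(cover_bits) != 7 or len(secret_bits) != 3:
        raise ValueError("expected 7 cover bits and 3 secret bits")
    message = secret_bits[0] << 2 | secret_bits[1] << 1 | secret_bits[2]
    # The position to flip is the XOR of the current hash and the message.
    flip = syndrome(cover_bits) ^ message
    stego_bits = list(cover_bits)
    if flip:                      # flip == 0 means the cover already encodes it
        stego_bits[flip - 1] ^= 1
    return stego_bits


def extract(stego_bits):
    """Recover the 3 secret bits from a 7-bit stego group."""
    value = syndrome(stego_bits)
    return [(value >> 2) & 1, (value >> 1) & 1, value & 1]


# Exhaustive check over all 128 cover groups and all 8 messages.
for cover in product((0, 1), repeat=7):
    for secret in product((0, 1), repeat=3):
        stego = embed(list(cover), list(secret))
        assert extract(stego) == list(secret)
        assert sum(a != b for a, b in zip(cover, stego)) <= 1
```

The receiver needs only the stego group and the shared hash to read the message, so no side information about which bit was flipped has to be transmitted.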
The consistency checker verifies whether the cover bits are sufficient to accommodate the secret bits. Depending on the number of selected significant coefficients (Sc), the number of selected LSBs, and the number of secret bits (Sb), the checker decides either to proceed with the hiding process or to ask the user to change some parameters and restart. The checker works based on the following equation:
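One plausible form of this check, assuming every complete group of 7 cover bits carries exactly 3 secret bits, is Sb ≤ 3 × ⌊(Sc × number of LSBs) / 7⌋. The Python sketch below implements that assumed inequality; the function and parameter names are hypothetical.

```python
COVER_GROUP_BITS = 7    # cover bits per group
SECRET_GROUP_BITS = 3   # secret bits carried by each group


def capacity_bits(num_coeffs, lsbs_per_coeff):
    """Secret bits that fit when each full 7-bit cover group carries 3 bits."""
    cover_bits = num_coeffs * lsbs_per_coeff
    return (cover_bits // COVER_GROUP_BITS) * SECRET_GROUP_BITS


def is_consistent(num_coeffs, lsbs_per_coeff, num_secret_bits):
    """True if the selected coefficients can hold the whole secret stream."""
    return num_secret_bits <= capacity_bits(num_coeffs, lsbs_per_coeff)


# 1000 coefficients with 2 LSBs each give 285 groups, i.e. 855 secret bits.
assert is_consistent(1000, 2, 600)
# With a single LSB only 142 groups (426 bits) are available.
assert not is_consistent(1000, 1, 600)
```

When the check fails, the user could either raise the number of LSBs per coefficient or lower the threshold so that more coefficients are selected, at some cost in stego image quality.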
Finally, the refactor coefficient engine reflects the updated coefficients to their corresponding positions in the transformed image so that the coding process can proceed using the EZW encoder.
Experimental results
In this section, several experiments are detailed and their results are discussed. Without loss of generality, the proposed steganography method is implemented using the EZW compression technique to illustrate the effect of the compression process on steganographic secret data. The recently proposed method for Secure Bit-plane [4] is used for hiding secret bits in coefficients, as explained previously. The same idea can be applied to any compression technique in which the transformed coefficients can be classified into low and high frequencies and a predetermined coding rate is available. At the end of this section, the performance of the proposed method is compared with the most recent method presented in [10].
Experiment setup
The proposed method is implemented based on [4, 18]. The coefficients of the high-frequency subbands are used to hide the secret data during the compression process using EZW. The secret data are compressed using the arithmetic coding (AC) technique to increase the hiding capacity. Moreover, we used the data encryption standard (DES) to encrypt the secret data before hiding. This increases the security of the hiding process and combines cryptography with steganography. For simplicity, we proceed in our experiment using the fixed LSB position in the selected coefficients. Between one and three LSBs of each significant coefficient are used to collect the cover bits before embedding. This gives hiding capacities of 84258, 168519, and 252780 bits for the three high-frequency subbands of the first decomposition level of a grayscale image of size 512×512 pixels. Zero to three LSBs take part in the cover-bit selection process, which supports both the lossless and lossy features of the coding technique. By ignoring 0 bits, either the first LSB of each coefficient, or the first and second LSBs, or the first three LSBs are used to collect the cover bits. On the other hand, ignoring 1 bit from the rightmost side of the binary representation of the coefficient implies using one of
Ignoring bits from the rightmost side of the coefficient value supports the truncation of the embedded bit stream of EZW at any point, which allows the lossy compression of the cover image. On the other hand, selecting the LSBs after the ignored bits guarantees the reversibility of the secret data. There is a tradeoff between the number of ignored bits, the number of bits used for embedding secret data, and the stego image quality. A group of grayscale test images of size 512×512 pixels is used in this experiment, as shown in Fig. 2.
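The hiding capacities quoted for one, two, and three LSBs can be reproduced with a short calculation. This is a worked check under an assumption, not a statement of the authors' derivation: every coefficient of the three first-level high-frequency subbands contributes k cover bits, and each complete group of 7 cover bits carries 3 secret bits. The symbols $N_c$ and $C(k)$ are introduced here for illustration.

$$
N_c = 3 \times 256 \times 256 = 196608,
\qquad
C(k) = 3 \left\lfloor \frac{k\,N_c}{7} \right\rfloor
$$

$$
C(1) = 3 \times 28086 = 84258,
\qquad
C(2) = 3 \times 56173 = 168519,
\qquad
C(3) = 3 \times 84260 = 252780
$$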

Fig. 2. The grayscale images used in our experiment.
To measure the effectiveness of our method, experiments were carried out on a large sample of grayscale images that are recognized as well-known test images (for example, a woman (Lena), an airplane, a baboon and a man). The experiments were conducted using a desktop computer with an i7 processor, 8.00 GB of memory, the Windows 10 operating system and MATLAB 2018. Figs. 3 and 4 show the results of our method.

Fig. 3. Airplane (cropped): (a) original image; stego images with (b) 0 ignored bits, (c) 1 ignored bit, and (d) 2 ignored bits. An embedding rate of 3% is achieved in all cases.

Fig. 4. Lena (cropped): (a) original image; stego images with (b) 0 ignored bits, (c) 1 ignored bit, and (d) 2 ignored bits. An embedding rate of 2.5% is achieved in all cases.
We also analysed our method using different parameters. Tables 1, 2, 3 and 4 show the results of the proposed method for different images at different embedding rates, with different numbers of bits used in the embedding process and different numbers of ignored bits, measured using the peak signal-to-noise ratio (PSNR) and the structural similarity index metric (SSIM).
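For readers who want to reproduce the quality measurements, PSNR for 8-bit grayscale images is conventionally computed as 10 log10(255² / MSE), where MSE is the mean squared error between cover and stego pixels. The Python sketch below assumes that standard definition and images stored as nested lists of integers; the function name is hypothetical, and SSIM is omitted because it requires local window statistics.

```python
import math


def psnr(original, stego, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(original) != len(stego) or any(
        len(a) != len(b) for a, b in zip(original, stego)
    ):
        raise ValueError("images must have the same dimensions")
    squared_error = 0
    count = 0
    for row_a, row_b in zip(original, stego):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf           # identical images
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)


cover = [[100] * 4 for _ in range(4)]
stego = [row[:] for row in cover]
stego[0][0] += 1                  # a single pixel changed by one level
quality = psnr(cover, stego)      # about 60.17 dB for this 4x4 example
```

Higher values indicate a stego image closer to the cover; identical images give an infinite PSNR.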
Table 1. PSNR and SSIM of the airplane image at different embedding rates, different numbers of bits used in the embedding process and different numbers of ignored bits.
Table 2. PSNR and SSIM of the Lena image at different embedding rates, different numbers of bits used in the embedding process and different numbers of ignored bits.
Table 3. PSNR and SSIM of the baboon image at different embedding rates, different numbers of bits used in the embedding process and different numbers of ignored bits.
Table 4. PSNR and SSIM of the man image at different embedding rates, different numbers of bits used in the embedding process and different numbers of ignored bits.
In addition, we compared the accuracy of the proposed method with the methods in [22–25] as shown in Fig. 5 and Fig. 6.

Fig. 5. Comparison of PSNR for the proposed method with respect to different embedding rates: (a) airplane, (b) man, (c) Lena and (d) baboon.

Fig. 6. Comparison of SSIM for the proposed method with respect to different embedding rates: (a) airplane, (b) man, (c) Lena and (d) baboon.
Conclusion
In this paper, a new method is proposed that uses the matrix embedding technique to hide secret messages in EZW coefficients to enhance cloud storage security. The secret message can be recovered completely from the compressed bit-stream of EZW. The method concentrates on avoiding the loss of the secret message caused by stego image compression. It adopts the invertible integer-to-integer wavelet transform to embed the secret message into the EZW coefficients. The security of the message is ensured twice: once by encryption and once by scattering the encrypted bits across the EZW coefficients. The method can accommodate a large secret message because multiple bits of the selected coefficients can be used. The quality of the stego image is good because at most one bit of seven is changed to embed three secret bits.
The experimental results of our proposed method show that the stego images are of high visual quality in terms of PSNR and SSIM. The results show that the method can hide large payloads while maintaining the quality of the stego image, protecting the secret message and recovering the message losslessly, thereby avoiding the effect of the lossy compression of EZW on the stego image. The PSNR of the stego images exceeds 61.8 dB on average at an embedding rate of 3%. The method produces stego images of superior quality compared to [22–25] at different embedding rates.
In future studies, the matrix embedding technique for data hiding should be investigated further with different image compression methods. It is also important to explore the performance of other image compression techniques, such as JPEG and JPEG2000.
