Abstract
This paper presents a discrete quantum behaved gravitational search algorithm (DQGSA) which is used to provide a novel steganographic algorithm based on the spatial domain: Least Significant Bits (LSB). This approach improves the LSB matching method proposed by Mielikaien. Whereas Mielikaien’s method uses a binary function to reduce the number of changed pixel values, in suggested method a three-score system is proposed to assess the performance of different orders for LSB matching. Afterward, the proposed DQGSA is employed to search for an optimal solution among all the permutation orders. By employing the proposed method, the distortion of the stegano image could be reduced while the probability of detection is decreased. The experimental results show that the proposed method achieves better performance than Mielikainen’s pair-wise LSB matching method in less distortion (means higher visual quality) and more preservation against steganalysis.
Keywords
Introduction
Since the rise of the Internet, the security of information has become one of the most important aspects of communication and information technology [18]. Steganography is the art and science of hiding information by embedding messages within other media, seemingly innocent covers [4]. Steganography can be viewed as similar to cryptography. Both have been used to protect secure information. In comparison with cryptography, it could be said that cryptography is the science of writing in secret codes. On the other hand, cryptography is the study of hiding information, while steganography deals with composing hidden messages so that only the sender and the receiver know that the message even exists [33]. In other words, in steganography, only the sender and the receiver know the existence of the message, whereas in cryptography the existence of the encrypted message is visible to the world. Due to this, steganography removes the unwanted attention coming to the hidden message. The word steganography literally means covered writing as derived from Greek words steganos meaning “covered or protected” and graphei meaning “writing” [26].
The following formulation provides a simple description of the steganographic process:
Simple steganographic techniques have been used for hundreds of years. But with the increasing use of files in an electronic format, new techniques for hiding information have become accessible. Steganography and watermarking are main parts of the fast developing area of information hiding and they are techniques to hide important information undetectably and/or irremovably in image, audio and video data. These two information hiding techniques are explained in next section in detail [31].
Steganography methods depend on in which domain the insertion is performed, could be divided into two main groups, namely transform domain and spatial domain. The transform domain methods consist of Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT) and Discrete Wavelet Transform (DWT). The transform domain steganography methods hide messages in more significant areas of the cover images which are explained completely in next section. The second group belongs to spatial domain methods. The spatial domain techniques embed messages in the intensity value of the pixels directly. The first attempt in spatial domain was least significant bit (LSB) substitution; LSB substitution steganography is a simple technique that hides message bits in LSB of image pixels. Against simplicity of this method, one significant drawback of this method is that the secret message could be detected very easily [5].
After some years an improved LSB based method was proposed which is called the LSB matching procedure [19]. As the main advantage, this method decreases the probability of detection. In the LSB matching method, introduced by Mielikainen, initially a matching process between secret message and the cover image is performed and then based on the maximum similarity; embedding data takes place [19].
In [15], a LSB matching method is developed that adjusts the matching order between the secret messages and the host image in order to obtain optimal solution. They found that using exhaustive search to find optimal solution is impractical because of time-consuming. Therefore, they used immune programming strategy to get nearly-optimal solution for LSB matching. They demonstrated that their experiments achieved better performance than Mielikainen’s method in terms of visual quality of stego-image and survival rate against detection.
Inspired by their method, the authors in [21] proposed a new method by using quantum gravitational search algorithm (QGSA) to search for a near-optimalsolution.
In this paper, first a new method of evaluating gain matrix is proposed to improve presented results in [21]. Afterward, a discrete version of quantum behaved gravitational search algorithm (DQGSA) is suggested and then employed to search into gain matrix in order to find the best matching order that improves Mielikainen’s method. Based on experimental results, the proposed method increases the speed of algorithm and simultaneously decreases the probability ofdetection.
The rest of the paper is organized as follows. InSection 2, related works are reviewed. The details of our proposed method are described in Section 3. InSection 4, the experimental results are given. Finally, the conclusions are presented in Section 5.
Basic concepts
Steganography is the art and science of hiding information such that no one, apart from the sender and receiver, suspects the existence of the message [6]. Secret information is encoded in such a way that the existence of the information is concealed in a human perception. The main goal of steganography is to communicate securely in a completely undetectable manner [13].
Many developments in steganography occurred during World War II. This included the development of invisible inks, microdots, and encoded messages [3]. It would be very instructive to compare first the basic groups of information hiding techniques in order to introduce the steganography in detail.
Cryptography vs. steganography
Cryptography is the science of encrypting data in such a way that nobody can understand the encrypted message, whereas in steganography nobody should suspect/discover the existence of data. On the other hand, the goal of cryptography is to make data unreadable by a third party, whereas the goal of steganography is to hide the data from a third party [11].
In steganography, the information to be hidden is embedded into the cover object which can be text, image, audio or video so that the appearances of cover object doesn’t vary unusually after information become hidden. In other words, cryptographic methods try to protect the content of a message, while steganography uses methods that would hide both the message as well as the content [35].
Steganography vs. watermarking
Watermarking is mainly used to avoid piracy in digital media. In steganography the data to be hidden is not at all related to the cover object. However, the main intention is secret communication. Whereas in watermarking the data to be hidden is related to the cover object. In other words, the main intention is to restrict piracy of digital data [23].
As another difference in watermarking process, any changes in the watermarked object should have no effect on the watermark. On the other hand, the watermark inside the image should survive the manipulations; otherwise the attackers can very easily remove the watermark [32].
Steganography is a very powerful tool because, as mentioned above, it can be very difficult to detect secret data.
A brief review of steganography’s method
Methods of image steganography can be divided into two main groups: Transform Domain methods (TD) and Spatial Domain methods (SD) [20]. In the transform domain methods, images are first transformed and afterward the message is embedded into the cover image. In other words, in transform domain methods, messages are hidden in more significant areas of the cover image, and therefore this make it more robust [34]. In order to hide information in a more significant area, the cover image is initially separated into high, middle and low frequency components.
One of the most important issues in steganography methods is being undetectable. Therefore, in order to avoid image distortion, the information has been embedded in higher frequencies, which have lessvisibility.
Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT) and Discrete Wavelet Transform (DWT) are the most common transforms that are used for data hiding [36]. Comparing with other methods in transform domain, the DCT approach is very robust to JPEG compression, since JPEG compression itself uses DCT [24].
The most of transform domain methods are independent of the image format [28]. As another advantage, security and robustness against statistical attacks and image manipulation can be mentioned. However, high computational complexity is the main disadvantage of these methods.
The second group is spatial domain methods. In these methods, messages are directly embedded in the pixels information. The main idea behind this group of methods is to insert the secret message’s bits into the least significant bits of the cover image pixels hence it has less effect on visibility. The first attempt in spatial domain is LSB substitution. LSB substitution steganography is a simple technique that hides message bits in LSB of image pixels [5].
Popular steganographic tools based on LSB embedding vary in the existing approaches for hiding information [9, 10]. Some algorithms change LSB of pixels visited in a random walk [8], others modify pixels in certain areas of images [14], or instead of just changing the last bit they increment or decrement the pixel value [30]. Another approach for embedding secret message in spatial domain was proposed by [17] which is based on statistical information of noise.
The major advantages of the LSB algorithms are their speed and simplicity. Because of computational efficiency, steganography software has been developed which uses LSB manipulation. Against simplicity of this method, one significant drawback of this method is that the secret message could be detected very easily [29].
Therefore, a new method which is called LSB matching method has been proposed and it is more robust against detection [1]. In this method, first a matching process between secret message and the cover image is done. Afterward, message is embedded based on the maximum similarity into cover image. Having a security key in this method decreases the probability of detection in comparison with LSB insertion.
The performance of the LSB matching was improved by Mielikainen [19]. In Mielikainen’s method, initially a matching process between secret message and the cover image is performed (which is explained in section 2.4 in detail) and then based on the maximum similarity; embedding data would take place [19]. In [15], in order to improve the quality of image, a three-score system is proposed to assess the performance of different orders for LSB matching. Then, an immune programming strategy is adopted to search for a near-optimal solution among all the permutation orders.
Mielikainen’s method
For starting the explanation of the proposed method Mielikainen’s method is described in detail. In Mielikainen’s method [19] the LSBs of the cover image are not just simply replaced with the LSB of the secret message. Instead, if the message bit does not match the LSB of the cover image, the value of the cover pixel is randomly either added or subtracted by one.
Assume that S, C and St are the matrixes of secret data, cover image and stegano-image, respectively. In Mielikainen’s method, first, by scanning left-to-right and top-to-bottom of matrixes of S, Cand St, they are converted into streams of C0, S0and St0respectively, as shown in Equation (1).
where r c and c c indicate numbers of row and column in the cover image, r s and c s indicate numbers of row and column in the secret data and r st and c st indicate numbers of row and column in the stegano-image. Then, each two consecutive elements in streams of C0 and S0 are selected. Assume that two cover pixels are c i and ci+1, two secret bits to be hidden are s i and si+1and two stegano pixels are assumed st i and sti+1.
The Mielikainen’s method is performed as follows:
(i) The value of secret bit s i is compared with the LSB of the first cover pixel c i . If they are equal (LSB (c i ) = s i ), then the first cover pixel is kept, st i = c i , otherwise the second cover pixel is kept, sti+1 = ci+1.
(ii) In the case of st
i
= c
i
, si+1 is compared with f (c
i
, ci+1) where it is defined as:
If they are equal, then sti+1 = ci+1, otherwise, the second pixel sti+1 is calculated by randomly incrementing/decrementing the second cover pixel ci+1. If s i is not equal with LSB(c i ), sti+1 = ci+1. Then, if si+1 = f (c i - 1, ci+1) the value of sti+1 = c i - 1, otherwise sti+1 = c i + 1. Mielikainen’s method is illustrated in Fig. 1.
A brief introduction to GSA algorithm
The Gravitational Search Algorithm (GSA) was first introduced in 2009 [7]. Up to now, many works have been done to improve the GSA algorithm [27]. As a population-based evolutionary technique, GSA is a stochastic optimization approach which maintains a collection of objects representing the candidate solutions to the problem at hand. Objects moves through a multidimensional search space to find the optima or sub-optima.
In the standard gravitational search algorithm (SGSA) with Nobjects in a D-dimensional search space, the potential solution could be represented by theobject’s position vector X
i
(t). In GSA, the values of objects’ masses are calculated by computing the current population fitness, as follows:
where M
i
is the mass of i-th object (agent), fit
i
(t) represents the fitness value of the agent i at time t, and worst (t) is defined as follows (for a minimization problem):
To compute the acceleration of an agent, by using the law of motion, is calculated as follows:
where, G (t) is gravitational constant at time t, note that G (t) is a decreasing function of time, which is set to G0 at the beginning and decreased exponentially toward zero by lapse of time. ɛ is a small constant, R
ij
(t) is the Euclidean distance between two agents i and j and rand
j
is a uniformly distributed random number in the interval [0, 1]. In equation (5) Kbest is the set of K agents with the biggest masses,K is a function of time, initialized to K0 at the beginning and decreased linearly with time. Afterward, the next velocity of an agent iscalculated as a fraction of its current velocity added to acceleration.
Eventually, the agent’s position, X
i
, is updated according to Equations (7).
This process is repeated until the stopping criterion is reached, usually a sufficiently good fitness or a maximum number of iterations.
Physics is a foundation of modern science and technology. Recently, novel optimization methods have been motivated from the concepts of quantum mechanics and computation [23]. By quantum theoretical development, claims have been made that polynomial quantum computing algorithms exist for problems which are considered non-polynomial (i.e, NP-hard) on classical computer [25]. It is clear that the algorithm described by Shor represents a significant advance in knowledge of how to program quantum computational networks if they exist.
In [2], inspired by Shor’s quantum algorithm for factoring very large numbers, an extension to Shor’s method is described and this leads to provide a novel discrete algorithm. In other words, in this paper a brief outline of genetic algorithm for solving discrete problems is proposed namely quantum inspired genetic algorithm (QIGA). QIGA identifies and demonstrates the feasibility of a novel computational paradigm which is inspired by the principle of quantum mechanics and quantum computing. In QIGA new kinds of crossover and mutation for discrete computing inspired by Shor’s algorithm are presented.
QGSA is one of the novel optimization methods based on quantum mechanics. In quantum model of GSA, each object has quantum behavior. Thus, the state of the object is characterized by wave-function ω, where |ω|2 is the probability density function of its position.
Employing Monte Carlo method, the objects in QGSA [22] move according to the following iterative equation:
where r1and r2 are uniformly distributed random numbers in the interval [0,1] and Center i is center of potential well. In QGSA to avoid local optima, we assume that each Kbest is center of an attractive potential field. Therefore, in each iteration, the number of attractive potential field is equal to Kbest. As mentioned above, Kbest is a function of time.
In this step, with a probabilistic mechanism, every agent chooses one of the Kbest members and moves towards it. This probabilistic mechanism is based on the masses of Kbest set. The way of selecting one of Kbest is so similar to the roulette wheel selection in genetic algorithm. In this kind of selection, objects are given a probability of selection that is directly proportionate to their masses (quality of the solutions).
It should be noted that QGSA is a real algorithm; in other words, it cannot be used for solving discrete problems in the current form.
As mentioned in the previous section, the QGSA cannot be used to generate discrete values for solving discrete problems since the positions definition (solutions) are real-valued.
For discrete optimization problem, conventional QGSA must address the following two issues: How to change the position of the objects? How to guarantee the position is reasonable?
Inspired by [2], we propose a novel discrete quantum behaved gravitational search algorithm (DQGSA) to tackle the discrete space problems. To do this, we redefine the position updating procedure in QGSA algorithm for the discrete problem.
The travelling salesman problem (TSP) is a well-known NP-hard problem in combinatorial optimization, operations research and theoretical computer science. In TSP, a list of cities and their pairwise distances are given and the task is to find the shortest possible route that visits each city exactly once and returns to the origin city.
For an D-ordered TSP problem, D cities are numbered 1, 2, 3, … and D respectively, denote as the i-th agent’s position in the swarm of DQGSA, which represents the traveling tour of .
To simplify the explanation, consider an example containing 7 cities with the names of A, B, C, E, F, G and H. In this example, a tour can be represented as an agent consisting of 7-dimensions (city) where each dimension is a city. The first dimension is the start of tour. The procedure of DQGSA is described as follows:
Scan the agent from first dimension to the last one and compare it with a basic agent containing all possible cities. As an example consider an agent as follow:
And the basic agent [A B C E F G H] which contains all possible cities respected to feasible space.
In the tour represented by X i in Equation (9), the first, second and third dimensions consist of ‘A’, ‘C’ and ‘H’ respectfully. Therefore, ‘A’, ‘C’ and ‘H’ are omitted from basic agent and the basic agent is changed to [B E F G].
The fourth position is ‘A’ which omitted before. The scanning procedure is continued. After scanning all dimensions the remaining letters in basic agent are ‘B’, ‘F’ and ‘E’ which mean that ‘B’, ‘F’ and ‘E’ are not available in sample agent X i .
Afterward, the scanning procedure is started again in order to make a tour. The first, second and third dimensions are copied without any changes. At this point, the ‘A’ in the fourth dimension cannot be included since ‘A’ already occurs in the first dimension. Since the basic agent already contains ‘B’, ‘F’ and ‘E’, an arbitrary letter is chosen instead of ‘A’ in the fourth dimension.
It is suggested that the letter is chosen randomly in order to have the algorithm more exploration ability in the search space. Assume that ‘F’ is chosen. The chosen letter is omitted from basic agent. So, the basic agent is changed to ‘B’ and ‘E’. It will be continued until all dimensions are checked. Finally the rebuilt agent has the character of a tour.
As shown in Fig. 1, in the Mielikainen’s method at first, the value of secret bit s i is compared with the LSB of the first cover pixel c i . If they are equal (s i = LSB (c i )), then the first cover pixel is kept unchanged, otherwise the second cover pixel would be kept unchanged. In the first branch, when st i = c i , s i is compared with f (c i , ci+1). If they are equal, then sti+1 = ci+1, otherwise, the second pixel sti+1 is calculated by randomly incrementing/decrementing the second cover pixel ci+1; where f (c i , ci+1) is defined as Equation (2).
If s i is not equal to LSB (c i ), sti+1 = ci+1. If sti+1 = f (c i - 1, ci+1), then the value of st i = c i , otherwise st i = c i + 1.
The main drawback of this method is that this method may also distort stegano images. Therefore, in this paper, by finding the best matching order between cover image and secret data, we try to decrease the distortion of stegano images while a heuristic search algorithm is used to find this best matching order.
From the flowchart in Fig. 1, it could be declared that best matching with fewer changes of the cover image is achieved when s i = LSB (c i ) and si+1 = f (c i , ci+1), which means values of stegano pixel s i and si+1 and cover pixel c i and ci+1 are equal respectively, and as a result the difference between stegano image and cover image is minor.
In our proposed method, a four-tier state-scoring is used that operates as follows:
If two cover image pixels are similar to stegano image pixels and there is the first time they are used, score is T1.
If two cover image pixels are similar to stegano image pixels and there is the Q-th time they are used, score is .
If at least one pixel of the stegano image is changed related to cover image, score is T3.
4. If both pixels are changed, score is T4. In order to show the superiority of keeping the cover pixel values unchanged,T1 is considered greater than 3T3. In our proposed method, each pair pixels of cover image pixels which are similar to stegano image pixels can be used more than one. But after each use their score decreases to when Q is the number of time they are used. It should be mentioned that in the first step for evaluating the score matrix, T1, T3 and T4 just are considered and in the next step if two cover image pixels are utilized more than once, their score is decreased to .
To distinguish the performances of these processes, they must be assigned hierarchical scores such as T1 > T3 > T4 and as mentioned before T1 > 3T3 > T2 which indicates that keeping the cover pixel values unchanged is better than changing them. The scores are given as follows: T1 = 40,T3 = 10,T4 = 0. Score matrix
In order to evaluate the performance of different matching order of secret image and cover image, first a matrix which is called score matrix SM is defined.
After converting matrix of C and Sinto streams of C0 and S0, for each two sequential pixels in the stream, the value of T is evaluated and then the score matrix SML
c
×L
S
is calculated as follows:
where sm (i, j) is the i-th row and the j-th column element of this matrix SM, L s is the length of secret stream and L c is the length of cover stream. In order to finding the best matching order, we need to choose L s element in different columns from the matrix SM. It should be noted that only one element from each column should be selected. But each row can be selected more than once. This situation may happen whenever the element of row is T1, then this value is decreased to where Q is the number of time each row are used
In addition to this, an adjustment list is evaluated from these selected L s elements, namely J = [j1, j2, …, j k , …, j L s ] where index k indicates the k-th column of SM and the value of j k refers to j k th element of k-th column.
It should be mentioned that many adjustment lists could be made by taking different elements, but the one with the highest score which means the best matching structure will be found by the DQGSA algorithms.
By having an adjustment list J, the fitness (total score) can be calculated by adding sm (j
k
, k), where j
k
is the kth element of J and k = 1, 2, …, L
s
. For simplicity, the normalized fitness of J is introduced as follows:
As said in [15], after finding the best adjustment list J, the secret data will be extracted as follows:
First, the stegano stream St0 will be arranged by scanning the stegano image St as explained before. Afterward, for each consecutive two elements of the stream St0 the position of J has been checked. At first, the value of second secret bit si+1 is calculated by using Equation (1). If si+1 = LSB (st i ), the first secret bit s i is si+1. If si+1 is not equal to LSB (st i ), then s i is calculated from s i = f (st i , sti+1 ± 1). Eventually, when the entire secret bits are extracted, the secret stream is rearranged into the secret image.
Experimental results
To validate the performance of the proposed method, experimental results are presented and discussed in this section. In our experiment all images are 256 gray levels of size 512×512. In order to increase the speed of algorithm all images are resized into 128×128. We take the images Peppers, Barbara, Baboon and Lena, shown in Fig. 2(a), (b), (c), (d), respectively. The image coin (Fig. 2(e)) and a randomly generated secret binary stream data are taken as secret data.
Setting parameters
The proposed DQGSA has three main parameters, population size, δ which determines the balance between exploration and exploitation and the size of Kbest. Performance of each algorithm depends on suitable parameter values. In search algorithms setting parameters correctly has an effective influence on exploration and exploitation of the algorithm. But the main problem is how to set parameter values properly and how to vary them in order to maximize the searching ability of algorithms.
The proposed algorithm is tested with different population sizes to get the best performance. The results show that DQGSA with swarm size of 450–600 have good performance for this problem (Fig. 3). Thus, the swarm size is set to 500 in our experiment.
The next important parameter is δ. In DQGSA δ is a crucial parameter where can affect the precision of final solutions obtained. On the other hand, δ has a direct effect on the convergence behavior of the algorithm. In this paper because the problem is multimodal and based on our previous experiments, we set the parameter δ to 1.
It should be noted that since local exploitation and global exploration capabilities are always twisted together in the search process, it is difficult to fix the values for parameters. On the other hand, a good balance between exploration and exploitation abilities of a heuristic algorithm makes it more robust to avoid getting trap into local optima and also increases the convergence speed of the algorithm.
In this paper to control the exploration and exploitation of the proposed DQGSA, Kbest is considered as a function of time with the initial value at the beginning and is decreased with time by the linear control. It is set to N and the beginning and is decreased to 1 at the end.
Invisibility and capacity test
Figure 2 shows the cover image and secret data. The secret data is the image of Coin. Before embedding, both the cover image and the secret data are permutated randomly. The stegano images are shown in Fig. 4b after embedding data into the LSB planes by the proposed method. We find that the visual quality of the stegano images is not degraded much. Similar tests are also performed on the Baboon, Peppers and Woman images.
Table 1 shows the PSNR value of the proposed method compared with Mielikainen’s method and 3 other algorithms consist of GA, PSO and GSA. All algorithms were tested using similar parameters such as population size and the maximum number of iterations, the number of fitness evaluations. From the table, we find that with different embedding capacity, the PSNR value of our method is higher than other algorithm which proves the ability of DQGSA and proposed method in spotting acceptable optimum. As Table 1 shows, DQGSA increases the PSNR value of Mielikainen’s by about 9 dB, which is a significant result.
The results in Table 1 show clearly that by using the DQGSA in the proposed method, the number of changes of the LSB positions of the stegano image is considerably decreased. Hence, the visual quality of the stegano image and the PSNR value are enhanced compared with the original pair-wise LSB matching steganography and four other algorithms.
Conclusion
This paper presented a discrete quantum GSA in order to solve LSB matching method. The proposed solution evaluates different LSB matching orders to find the best matching order. It was shown that the proposed method could achieve higher PSNR values of stegano images and reduced number of changes in LSB position. To get good results, the parameters of the proposed DQGSA such as N, Kbest and δare investigated. In conclusion, experimental results show an improved visual quality of stegano images.
