Abstract
This work explores the extent to which LSB embedding can be made secure against structural steganalysis through a modification of cover image statistics prior to message embedding. LSB embedding disturbs the statistics of consecutive k-tuples of pixels, and a kth-order structural attack detects hidden messages with lengths in proportion to the size of the imbalance amongst sets of k-tuples. To protect against kth-order structural attacks, cover modifications involve the redistribution of k-tuples among the different sets so that symmetries of the cover image are broken, then repaired through the act of LSB embedding so that the stego image bears the statistics of the original cover. We find this is only feasible for securing against up to 3rd-order attacks since higher-order protections result in virtually zero embedding capacities. To protect against 3rd-order attacks, we perform a redistribution of triplets that also preserves the statistics of pairs. This is done by embedding into only certain pixels of each sextuplet, constraining the maximum embedding rate to be
Introduction
Hiding secret messages in the least significant bits of pixels in digital images is the oldest steganographic technique. It follows a simple rule: to embed a message bit into a pixel of value x, flip the pixel’s least significant bit (LSB) to match the message bit,

(a) Illustration of how pixel value pairs
This asymmetry has given rise to a barrage of steganalytic attacks over the last two decades, starting in 2000 with the histogram attack [60] which seeks to detect LSB embedding by examining the extent to which neighboring bins in the image’s pixel value histogram tend to equality as a result of this embedding asymmetry. This attack only works well for high embedding rates, and so new techniques looking for changes in higher-order statistics, like correlations among neighboring pixel values, were developed [6–8,19,30,31,33]. These powerful methods are generally referred to as structural steganalysis, because they target the statistical properties of image structures (like pixel pairs, triplets, and so on). Structural steganalysis is based on the idea that the cardinalities of certain sets of consecutive pixel groups should be approximately equal for natural images, but diverge in an idiosyncratic way under LSB embedding. Structural attacks analyze the count statistics of these pixel groups and render an estimate of the hidden message length in proportion to this divergence. The most sensitive attacks in this family are able to detect embedding rates as low as 3% [30], and are more accurate than other prominent attacks against LSB embedding [5,10,18,34,70] for certain image formats, like JPEG-compressed images. We discuss these methods at length in Section 4.
In this work, we seek a means of securing LSB embedding against higher-order structural steganalysis, including both RS/Sample Pairs Analysis (SPA) [7,8,19] and Triples analysis [31], that 1) does not produce obvious statistical artifacts, 2) does not require significant additional secret data for message recovery, and 3) preserves the computational and algorithmic simplicity of LSB embedding. We develop a method of cover modification, in which pixel values of the cover are modified, disturbing the natural count statistics of various pixel groups prior to message embedding, such that the act of embedding the message restores the count statistics to their natural values. With apparently normal set cardinalities, structural analyzers will then be fooled into concluding that no hidden message is present. This kind of cover modification was presented as a defense against SPA in [55], and we extend it here to protect against up to third-order (Triples) attacks. This extension is not straightforward: to protect against all orders up to some order k, the statistics of n-tuples must be preserved, where n is the least common multiple of all orders up to and including k. We find that cover modifications in terms of sextuplets, required to preserve the statistics of both pairs and triplets, are highly constrained and result in cover images with virtually zero embedding capacity. We demonstrate how to instead perform cover modifications at third order, which is much less constrained, by redistributing triplets in such a way that also preserves the statistics of pairs. The trade-off is that only certain pixels in the image can be embedded, and so steganographic capacities are reduced.
We test this method against a range of different image types and find that we can achieve average maximum undetectable embedding rates of 0.12, 0.17, and 0.21 bits per channel for uncompressed grayscale, uncompressed color, and JPEG-compressed color raster images, respectively. We also argue that extending protections to higher order (quadruples analysis [32]) is only possible at the cost of virtually zero embedding capacity; however, such detectors are difficult to implement in practice.
This paper is organized as follows: Section 2 explores related prior art on the subject of improving the security of LSB embedding against structural steganalysis, and Section 3 provides some rationale for still considering LSB embedding in today’s steganographic landscape. Section 4 reviews the family of structural steganalysis techniques and in Section 5 we introduce the procedure of cover modification as it is used to secure LSB embedding against SPA in [55], with some new elements necessary for extending it to higher-order. Section 6 develops the new cover modification procedure to protect against both second- and third-order structural attacks, and presents results of testing on a data set of grayscale and color images. In Section 7 we discuss the possibility of extending this methodology to higher-order, and in Section 8 we conclude.
Related work
Due to its simplicity, there has been much work on improving the security of LSB embedding against the ever-escalating wave of steganalytic attacks. These approaches evade statistical attacks focused on image structures by preserving these statistics during the embedding process. There are three broad approaches to this problem: embedding strategies, in which LSB embedding is only performed on subgroups of pixels that preserve certain statistics; statistical restoration, in which portions of the cover image are altered after LSB embedding to recover certain statistics of the cover image; and cover modification, in which the cover image is altered prior to embedding so that the stego image retains certain statistics of the cover image. Some methods incorporate more than one of these aspects. We review several relevant works in this section.
The earliest approaches attempted to circumvent histogram-based attacks. In the work of [13], the histogram is preserved by encoding the message such that the probabilities of 1’s and 0’s in the message are precisely those required to keep the frequencies of adjacent bins unchanged. Some protection against second-order statistics is conferred if the pixel pairs are chosen such that their frequencies in the image co-occurrence matrix are unchanged after the embedding; however, this method is not protective against second-order attacks like RS analysis [2]. This procedure is only applicable for high embedding rates, since otherwise the histogram attack is not particularly effective. In [9], a histogram-preserving data mapping, introduced under the assumption that pixels are i.i.d., embeds data with the same distribution as the cover image histogram so as to minimize their relative entropy. The i.i.d. assumption is in general not true for natural images, however, and so this method is susceptible to higher-order structural attacks, e.g. as shown in [59].
The LSB+ method of Wu et al. [63] also seeks to preserve the pixel value histogram by compensating for bits embedded into a given pair of neighboring bins by appropriately changing the values of other pixels from these bins reserved for this purpose. To protect against second-order attacks, like SPA, only restricted groupings of pixels are embedded so that these statistics too can be preserved; the result is that test images had a very low average embedding capacity of around 2.5%. Protection against higher-order statistics would result in even lower embedding capacities. An improvement in capacity is offered by [24] but this method is equally susceptible to higher-order structural steganalysis.
With the increasing use of the powerful SPA technique, steganographers recognized the need to go beyond the preservation of first-order image statistics. In [48], an inverse histogram transformation is applied to the cover image prior to embedding. This operation compresses the range of pixel values in the cover image, essentially coarse-graining the image prior to embedding. Since structural attacks like RS and SPA rely primarily on trace sets with small differences among neighboring pixels, as these are the most common pairs, the method of [48] is able to defeat these attacks for a wide range of embedding rates. To recover the hidden message, the recipient must reverse the compression. The trouble with this method is that the compression transformation eliminates entire pixel values from the stego image, which appear as empty bins in the image histogram. Analysis of the histogram enables one to reverse the transformation and then directly analyze the LSB-embedded image. Lou and Hu [44] correct this problem by performing multiple transformations with different parameters on different pixel groups of the cover, with the result that the combined histogram has no missing levels. Depending on the pixel grouping strategy, the authors of [44] acknowledge that this approach could be susceptible to a brute force attack wherein the steganalyst examines the histograms of many different pixel groups looking for evidence of missing levels. As the number of pixel groups grows, such that the histograms contain ever fewer pixels, missing levels could occur naturally and the authors argue that in this case there are insufficient histogram statistics to support steganalysis. This claim, however, remains to be validated in general.
An interesting example of statistical restoration is provided by the method of dynamic compensation [47]. Here the message is embedded and the values of half of the image pixels are increased by one. This has the effect of essentially “resetting” the statistics of the stego image, and structural steganalysis is unable to detect any hidden messages. The pixels used for this compensation must generally be chosen dynamically such that detection by common structural attacks is minimized. The main drawback of this method is that message retrieval requires a reversal of this compensation procedure, and so the locations of all modified pixels must be communicated to the recipient. This is a sizable amount of data: for a
An approach that combines cover modification with an embedding scheme based on the eight-queens problem is presented in [1]. Here, each LSB is flipped or not according to whether its pixel, when taken as part of an eight-pixel block, is masked by one of the 92 eight-queens solutions. A group of pixels is reserved to restore set cardinalities to approximate those of the cover so that SPA is unable to detect the message. A general upper bound on embedding capacities is not established in [1], but sample images are tested up to relative payloads of 30%. This is lower than the cover modification technique of [55], discussed below, and similarly does not protect against higher-order attacks. It should also be noted that message extraction using eight-queens encoding is considerably more complicated than simple LSB embedding, and does not confer additional security against structural steganalysis.
Most recently, the work of [55] considers cover modification where the cardinalities of sets analyzed by SPA to detect the presence of LSB steganography are adjusted prior to message embedding such that the relevant second-order statistics are preserved in the process. This approach successfully protects against SPA at the cost of lower embedding capacities, up to around 50% on average [55]. Though second-order statistics are carefully preserved in this method, higher-order statistics can still be targeted by Triples analysis to uncover the hidden message length.
Why LSB embedding?
The technique of substituting, or embedding, message bits into the least significant bits of cover image pixels is perhaps the oldest and arguably the simplest steganographic technique. Since its inception, LSB embedding has been targeted by a wide range of steganalysis, and is considered today to be effectively broken. A number of techniques were soon developed that incorporate LSB matching [52] into more secure frameworks.1
In LSB matching pixel values are randomly changed by
Possible reasons for the tenacious popularity of LSB embedding include 1) its simplicity (in terms of code, compute, storage, and hardware requirements), 2) its availability, 3) its ease of message extraction (in terms of additional data required, beyond perhaps a once-pre-shared key), and 4) that the expected threat of sophisticated steganalysis is not sufficiently high to warrant more advanced approaches. While it is difficult to know to what extent the use of steganography over lower-risk channels influences the popularity of LSB embedding, pragmatism argues that one “do only what is necessary, and no more.” This maxim shapes standard tradecraft in fields like penetration testing, in which simple, unsophisticated attacks are used whenever and wherever possible. We argue below that state-of-the-art methods suffer from a number of these kinds of impracticality, favoring the use of simpler steganography, like LSB embedding, particularly over lower-risk channels. In this paper, we therefore seek to improve its security against those steganalytic attacks designed to target it in practice.
As an example of its simplicity, LSB embedding can be implemented in 80 characters of Perl at the Linux command line [30]. Software for both message embedding and extraction is widely available for free on the Internet (see [61] for a list of steganography programs, many of which are free and include LSB embedding). The code used to perform the cover modifications and embedding described in this paper is publicly available:2
In contrast, methods employing machine learning are significantly more complex and pre-trained networks are generally unavailable; for example, several popular implementations [53,57,58,65] have no reported open source code. These techniques require considerable expertise to develop from scratch: as examples typical of this class of methods, the works [57,58,71] make use of sets of deep convolutional neural networks that must be trained via adversarial learning. Generally tens or hundreds of thousands of images [26,29,57,58,71] are required for training, ideally on high-performance hardware like GPUs to speed up training and hyperparameter optimization. Training times vary, ranging from upwards of 78 hours for ASDL-GAN [58] to 9 hours for UT-GAN [65] on a single GPU. These methods, while state-of-the-art, are very much research-oriented and not suited for wide deployment outside of academia. In short, actors employing steganography to send secret messages are generally not artificial intelligence engineers capable of training deep neural networks.
A further observation is that, while some GAN-based methods have embedding capacities competitive with state-of-the-art adaptive techniques [69], most can accomplish at most 0.4 bits per pixel with detection error rates in the 20%-30% range [26,57,58,65,71], and are limited to working with smaller images (
Traditional steganography embeds encrypted messages in the cover image; this is done both for security and because the resulting pseudo-random bit stream nicely randomizes pixel modifications. Messages are also typically embedded into a pseudo-random pixel sequence. Each of these operations requires that (at least one) secret key be pre-shared between sender and recipient, and anything else required for message extraction is considered additional data. LSB embedding requires nothing beyond the pre-shared secret key, and this is true as well of the modification described in this paper. The need for additional data, particularly data specific to individual stego images, increases the difficulty of practical implementation because it requires the existence of a secure channel that can be accessed on a per-message basis.
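To make concrete the point that extraction needs nothing beyond the shared key, the following is a minimal sketch of LSB embedding along a keyed pseudo-random pixel path. It is an illustration only and not the code accompanying this paper: the function names are hypothetical, the image is a flat list of 8-bit values, the recipient is assumed to know the message length, and Python's `random.Random` stands in for the stream cipher that a real implementation would use.

```python
import random


def pixel_path(num_pixels, key, length):
    """Return `length` distinct pixel indices derived from the shared key.

    Assumption: random.Random is a stand-in for a keyed stream cipher;
    it is NOT cryptographically secure.
    """
    rng = random.Random(key)
    return rng.sample(range(num_pixels), length)


def embed(pixels, bits, key):
    """Replace the LSB of each pixel on the keyed path with a message bit."""
    stego = list(pixels)
    for idx, bit in zip(pixel_path(len(pixels), key, len(bits)), bits):
        stego[idx] = (stego[idx] & ~1) | bit
    return stego


def extract(stego, num_bits, key):
    """Read the LSBs back along the same keyed path."""
    return [stego[idx] & 1 for idx in pixel_path(len(stego), key, num_bits)]


cover = [52, 53, 200, 17, 128, 255, 0, 91]
message = [1, 0, 1, 1]
stego = embed(cover, message, key=1234)
recovered = extract(stego, len(message), key=1234)
```

Because only the LSB is replaced, an even value can only stay the same or increase by one and an odd value can only stay the same or decrease by one, which is the embedding asymmetry that structural attacks exploit.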
The powerful adaptive techniques described above work by selecting pixels for embedding such that some optimization criterion is achieved. In order for the recipient to extract the message, they must know which pixels were embedded. Many of these methods [22,27,28,40,51] use syndrome trellis codes (STC) to reduce the number of pixel modifications, and require the parity-check matrix of the code for message extraction. This matrix encodes the embedding specific to a single stego image, and so must be shared along with the image for extraction. In addition, it must be kept secret since an adversary that intercepts it can use it to extract the message. As an example, for HUGO [50] this matrix has dimensions
Machine learning-based methods are also impractical from this standpoint, since they generally require that the recipient has a specially-trained neural network for message extraction. The extractor must typically be trained on the same data set as the generator, and so in the above scenarios it is developed by the sender and must be sent to the recipient. Short of providing a fully-executable neural network, the sender could opt to send the recipient only the parameters (weights, biases, activations) of the network which they would then use to develop their own network. The extractor networks, however, can be rather large and the data structure representing these parameters can be sizeable, e.g. almost 70 Mb for the model of [26] according to [68]. Some deep learning-based models that implement adaptive strategies [53] or matrix embedding [57] must additionally provide parity-check matrices. And so, like the adaptive methods, machine learning-based models require considerable additional data for message extraction, challenging their practicality.
Prevalence of state-of-the-art steganalysis
Deep learning has also been applied to steganalysis [3,4,41,56,64,66,67], notably as the discriminator networks in GAN-based steganography. The 20-layer convolutional model, called Xu-Net [64], serves as the discriminator for several of the above methods [57,58,65]. Outside of this application, deep learning-based steganalyzers are some of the most powerful general purpose detectors ever developed, with the 11-layer SRNet [3] cited as one of the most powerful at the end of 2018 [4]. While several of these models have publicly available code [3,64,66,67], they are deep neural networks like the above GANs, and considerable resources and expertise are required to train and implement them. As noted for example in [4], SRNet “requires strong know-how for its initialization.” Meanwhile, machine learning-based steganalyzers without deep architectures, like rich models [21], and ensemble and SVM-based classifiers [38], are available [14,16] and easier to train, but still require careful hyperparameter optimization and regularization for good generalizability [37].
Deep learning-based steganalysis is an exciting but immature technology, and its complexity and training requirements prevent its widespread adoption in practical detectors. In contrast, attacks in the structural family, like SPA, Triples, and Weighted Stego, are publicly available, require no training, and work “out-of-the-box”. A low-resourced or unsophisticated “warden” might therefore be expected to opt for this brand of steganalysis, and it is this “lower-risk channel” for which a more secure LSB embedding algorithm, like ours and those reviewed in the Related Work section, might find useful application given its relative ease of use (in comparison with machine learning-based models) and ease of message extraction (unlike adaptive and machine learning-based models).
Structural steganalysis
Structural steganalysis refers to a family of techniques that seek to detect hidden messages in spatial domain images by analyzing the statistical properties of contiguous groups of pixels. These methods have had good success detecting randomized LSB embedding at even low embedding rates.
First-order attacks
First-order statistics, like frequency counts of pixel values, were the basis of the early histogram-based attacks. Often referred to in the literature as the histogram attack, the approach of [60] employs a
Also based on the image histogram, the work of [25] modeled LSB steganography as additive noise and observed that the smoothing-out of neighboring histogram bins observed in [60] could be quantified in terms of the center of mass of the histogram characteristic function. In [25], this attack was only tested on a few color images at full embedding capacity, and so its performance against lower rates has not been carefully studied. Absent good models of first-order statistics for natural images, we expect this method to likewise struggle to detect lower embedding rates.
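The equalization of neighboring histogram bins that these first-order attacks look for can be illustrated with a chi-square-style statistic over pairs of values (2i, 2i+1). This is a sketch in the spirit of the histogram attack and not a reproduction of the test in [60]; the function name and the choice to return the raw statistic together with the number of contributing bin pairs are our own assumptions.

```python
from collections import Counter


def pov_chi_square(pixels):
    """Chi-square-style statistic comparing each even bin with the mean
    of its (2i, 2i+1) pair of values.

    A value near zero means neighboring bins are nearly equal, which is
    what heavy LSB embedding tends to produce. Returns the statistic and
    the number of bin pairs that contributed to it.
    """
    hist = Counter(pixels)
    stat = 0.0
    pairs_used = 0
    for even in range(0, 256, 2):
        observed = hist[even]
        expected = (hist[even] + hist[even + 1]) / 2
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
            pairs_used += 1
    return stat, pairs_used


balanced = pov_chi_square([10, 11, 10, 11])
unbalanced = pov_chi_square([10, 10, 10, 10])
```

Turning the statistic into a detection decision requires a reference distribution and a threshold, which are not modeled here.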
Second-order attacks
Sample pairs analysis (SPA) [7,8,19,30] considers the second-order statistics of natural images, and is based on the premise that natural images of objects with continuous shading should exhibit fairly small differences between neighboring pixels, and, for a given pair of such neighboring pixels,
We define a sample pair as a doublet of neighboring pixels
A multiset is the generalization of a set to include non-unique elements. Hereafter we will simply refer to them as sets.

Transition probabilities of subsets of the trace set
In terms of the above sets, the SPA4
Often in the literature, the term “SPA” refers specifically to the technique used in [8] to compute the change rate, p, from Eq. (9); here, we use it more generally to refer to the 2nd-order structural steganalysis and cover assumptions that yield Eq. (9), irrespective of how it is solved.
To infer the embedding rate,
Sample pairs analysis using least squares optimization has proven quite successful at detecting embedding rates as low as 5% [45], and the additional optimizations of [30] have achieved rates as low as 3% [31]. But it is possible to do better by considering the higher-order statistics of larger sets of pixels.
Ker [31] has developed a generalized approach for analyzing n-tuples of pixels; specifically, he explored whether the cardinalities of sets of triplets of consecutive pixels,
Finally, an analysis of quadruples was also studied [32]. The cover image symmetries considered were the analog parity symmetry, the inversion symmetry
Cover modifications to defeat SPA
Sample pairs analysis is premised on the key assumption that natural images should satisfy the constraint
Schematically, LSB embedding transforms a cover image, I, into a stego image,

The effect of LSB embedding on trace subsets of an original and modified cover image. On the left are subset cardinalities of the original and modified cover, and on the right is how these cardinalities change after some amount of LSB embedding. The dashed horizontal lines are a guide to assess how well the “stego modified cover” (red) in the right plot resembles the “modified cover” (black) in the left plot.
The more data we wish to embed into the cover image, the more pairs need to be moved out of donor subsets. Since the donor subsets are of finite cardinality, there is a limit to the embedding capacity that depends on the particular cover image. Each trace set,
The number of trace sets to modify is arbitrary, though good results are obtained for
In any case, once the value of α has been obtained, we are ready to perform the cover modification. This is just a redistribution of pairs among the trace subsets according to Eq. (14) with the chosen embedding rate, α. The appropriate number of pairs are moved out of each donor subset into non-donor subsets according to their deficits. For color images, trace sets are adjusted separately for each color channel.
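The redistribution is confined to subsets of the same trace set because flipping LSBs cannot move a pair from one trace set to another. The sketch below illustrates this under an assumed convention in the style of the general framework of [31]: a trace set is indexed by the difference of the two pixel values with their LSBs discarded, and its subsets are labeled by the pair's LSB pattern. The function names are hypothetical, and the labeling is for illustration rather than a restatement of the definitions used in this paper.

```python
def trace_index(u, v):
    """Trace set index: difference of the pixel values with LSBs discarded."""
    return (v >> 1) - (u >> 1)


def subset_label(u, v):
    """Subset within the trace set, labeled by the LSB pattern of the pair."""
    return (u & 1, v & 1)


def lsb_variants(u, v):
    """All four pairs reachable from (u, v) by replacing LSBs."""
    base_u, base_v = u & ~1, v & ~1
    return [(base_u | a, base_v | b) for a in (0, 1) for b in (0, 1)]


pair = (100, 103)
reachable = lsb_variants(*pair)
indices = {trace_index(u, v) for u, v in reachable}
labels = {subset_label(u, v) for u, v in reachable}
```

For any starting pair, `indices` contains a single value while `labels` contains all four LSB patterns, so both embedding and cover modification only shuffle pairs among the subsets of a fixed trace set.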
This kind of cover modification has been shown to be quite effective at evading SPA [55], but what about higher-order attacks? How does redistributing pixel pairs in this way affect the distribution of triplets? We perform a test on 1000
Confidence limits were established by running SPA and Triples detections on the raw un-embedded images.
The Triples analysis of [31] is able to detect the presence of a hidden message in almost every image, and estimate its length to within 50% accuracy for most. The noisiness observed in the Triples detections at high embedding rate possibly arises from the same instability observed by Ker in [31]. And so, perhaps unsurprisingly, a second-order cover modification is insufficient for securing LSB embedding against higher-order structural attacks.

The results of SPA and Triples detections on 1000 uncompressed grayscale raster images with cover modifications made to defeat SPA. Dashed lines indicate 95% confidence limits for detection by the detector with the corresponding color.
To understand why the second-order cover modification did not also provide third-order protections, consider two consecutive triplets,
where the
In what follows, we refer to an nth-order cover modification as one that adjusts the cardinalities of sets of n-tuples.
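Since protecting against all orders up to k requires preserving the statistics of n-tuples with n the least common multiple of those orders, the required order grows quickly with k. The short sketch below computes n together with the number of distinct LSB patterns an n-tuple can take. The function names are hypothetical, and treating the number of LSB patterns as the number of subsets per trace set is an assumption made for illustration.

```python
from functools import reduce
from math import gcd


def modification_order(k):
    """Least common multiple of all orders 1..k."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, k + 1), 1)


def lsb_patterns(n):
    """Number of distinct LSB patterns of an n-tuple of pixels."""
    return 2 ** n


table = [(k, modification_order(k), lsb_patterns(modification_order(k)))
         for k in (2, 3, 4)]
```

For k = 3 this gives n = 6 and 64 patterns, and for k = 4 it gives n = 12 and 4096 patterns, which indicates how quickly the count statistics are spread over more subsets as the protected order increases.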
In [31], Ker developed an approach to structural steganalysis of arbitrary order, which we apply here. Trace sets carry five indices denoting the differences between consecutive pixels in the sextuplet,
and similarly for the inverses,
With so many more subsets per trace set at sixth-order, there is a real danger that we will encounter trace sets with at least one empty subset, preventing us from embedding into that trace set. To find out, we test this cover modification on 1000 uncompressed grayscale raster images (
The problem with the sixth-order approach is that a single empty subset excludes all the pixels in its trace set. We cannot get rid of empty subsets because they are a property of the cover image which we intend to preserve; but, we can mitigate the collateral damage their exclusion has on other subsets. One idea is to reduce the sizes of the trace sets so that the number of pixels that must be omitted from embedding is smaller in the event that the trace set contains an empty subset. One way to reduce the number of subsets per trace set is to reduce the dimensionality of the transformation: while we are stuck preserving sixth-order statistics, we are not actually stuck with the sixth-order transformation,
Then,
Since only two pixels in each triplet are ever embedded, each triplet undergoes a second-order transformation governed by
We also considered the strategy of omitting the four middle pixels from Eq. (17), which, on its face would seem worse since twice as many pixels are excluded at the outset. But, the trace sets are even smaller in this case and the subsets even more general, and so the maximum α could be large enough to compensate for the loss of pixels. The LSB embedding transformation is composed of two separate families of single-pixel transformations on triplets:

Distribution of maximum embedding rates α for images cover-modified to resist both SPA and Triples steganalysis. Results of 1000 (a) uncompressed grayscale, (b) uncompressed color, (c) JPEG-compressed color raster images.
To perform this cover modification, we consider sets with
For JPEG-compressed images, we find that larger embedding rates are possible, with a range of 0.05–0.40 bpc, and an average of 0.21 bpc, Fig. 5(c). This result is especially of interest since Triples analysis has been shown to be much more reliable than pairs analysis at both detecting messages and estimating their length in JPEG-compressed covers, making it the last line of defense against these image types. Cover modifications that resist these attacks at moderate embedding capacities might therefore be of considerable value.
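The rates reported in this section are obtained while leaving the middle two pixels of each sextuplet unembedded. As a rough sketch of that embedding strategy, the function below marks which positions along a scan line are available, assuming that sextuplets tile the line without overlap starting at index 0 and that leftover pixels in an incomplete final sextuplet are skipped. Both are our own simplifications, the function name is hypothetical, and the omission of whole trace sets is not modeled.

```python
def embeddable_mask(num_pixels):
    """True where a pixel may carry a message bit.

    Within each complete sextuplet, positions 0, 1, 4 and 5 are available
    and the middle pair (positions 2 and 3) is left untouched.
    """
    mask = []
    for i in range(num_pixels):
        pos = i % 6
        in_complete_sextuplet = (i - pos + 6) <= num_pixels
        mask.append(in_complete_sextuplet and pos not in (2, 3))
    return mask


mask = embeddable_mask(12)
available_fraction = sum(mask) / len(mask)
```

Under these assumptions four of every six pixels remain available, so the strategy alone caps the embedding rate at two thirds of the pixels before any cardinality limits from the cover modification are taken into account.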
Before closing this section, we note that since first-order statistics, namely the quantities characterizing the distribution of single pixel values, are not adjusted during cover modification, the pixel value histogram will reflect LSB embedding. However, the
Once the cover modification is complete, messages can be embedded in the standard way, typically along a pixel path selected pseudo-randomly from the image. This pseudo-random sequence can be generated via a stream cipher with a secret key shared between the sender and receiver. The difficulty here, though, is that some pixels along the pseudo-random path might not be suitable for embedding for one of two reasons: i) the pixel belongs to an omitted trace set, or ii) the pixel belongs to the middle pair of a sextuplet (position
Even for the small
It is also standard to encrypt the message prior to embedding, both for confidentiality and so that, as an effectively pseudo-random bit sequence, the message won’t introduce statistical artifacts into the image. Here, if the message is encrypted and then embedded, the recipient’s decrypted message will contain errors because the extracted sequence will contain additional bits—those corresponding to the LSBs of omitted pixels—that were ignored during encryption. For example, schematically, key bit
Rather than encrypt the message itself (hereafter
Upon receipt, the recipient extracts all LSBs along the pseudo-random path (excluding middle-pair pixels, since the embedding strategy is assumed known to them), and decrypts the resulting bit sequence obtaining the sequence

Message encryption/embedding (left) and extraction/decryption (right) processes. Gray squares indicate omitted pixels and red squares indicate decryption errors. See text for symbol definitions and discussion.
We have demonstrated that LSB embedding can be secured against second- and third-order structural attacks. But this raises the obvious question: is it susceptible to fourth-order attacks? In principle, yes. The quadruples analysis of [32] was shown generally effective but difficult to apply, owing to the uncertainty over which root of the quartic polynomial for q to select as the predicted change rate. Ker suggests selecting the root closest to the estimate from a prior detection using SPA or Triples; however, these methods fail to detect any embedded message for covers modified according to Section 6. It is therefore unclear how quadruples analysis could be applied in practice against these kinds of stego images.
An extension of this methodology to provide fourth-order protections is possible, but we find that embedding capacities are close to zero. This is due to two factors: the loss of available pixels from the embedding strategy, and the limits imposed by donor set cardinality during cover modification. The embedding strategy is necessary since to preserve all statistics up to fourth-order, one needs to work with
This strategy reduces the number of available pixels to
In general, the order of the cover modification is the least common multiple of all relevant orders whose statistics are to be preserved. Let the highest-order preserved statistic be k, and let the least common multiple be n. Then, only
Conclusions
This work has explored the extent to which LSB embedding can be made secure against structural steganalysis by modifying consecutive pixel count statistics of cover images prior to message embedding. It is observed that modifications to protect against structural steganalysis at a particular order do not secure LSB embedding against higher-order attacks. Given the effectiveness of the third-order Triples analysis of [31] at detecting moderate LSB embedding rates, particularly against JPEG-compressed images, we sought in this research to develop a cover modification that would be protective against both Sample Pairs and Triples analyses.
We found that the sixth-order cover modification necessary to preserve both the second- and third-order cover statistics targeted by Sample Pairs and Triples analyses resulted in virtually zero embedding capacity. This is because the large, 64-dimensional trace sets overwhelmingly tend to have at least one empty subset, preventing the redistribution of sextuplets within that trace set. We therefore considered instead reverting to a third-order cover modification, but only embedding into certain pixels so that both second- and third-order statistics would be preserved. Specifically, if all but the middle two pixels in each sextuplet are available for embedding, redistribution of pixel triplets also preserves second-order statistics and moderate embedding rates can be achieved. We find that for uncompressed grayscale and color raster images, undetectable embedding rates range from around 0.05–0.30 bpc, with an average of 0.12 bpc and 0.17 bpc, respectively. For JPEG-compressed color images, we find generally higher undetectable payloads of up to 0.40 bpc, with an average of 0.21 bpc. Since Triples analysis has been shown to be superior to SPA at detecting the presence of messages and estimating their length in JPEG-compressed images [31], cover modifications that can defeat Triples are especially salient for this image type.
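The effect of empty subsets on capacity can be explored with a short sketch that measures the fraction of n-tuples lying in trace sets whose subsets are all populated. It assumes non-overlapping n-tuples taken along a flat pixel sequence, trace sets indexed by the differences between consecutive pixel values with LSBs discarded, and subsets labeled by LSB pattern. These conventions and the function name are illustrative assumptions rather than the exact definitions used in the experiments.

```python
from collections import defaultdict


def usable_fraction(pixels, n):
    """Fraction of non-overlapping n-tuples whose trace set has no empty subset.

    A trace set with any of its 2**n LSB-pattern subsets empty is treated
    as excluded from embedding, along with every tuple it contains.
    """
    tuples = [tuple(pixels[i:i + n]) for i in range(0, len(pixels) - n + 1, n)]
    if not tuples:
        return 0.0
    trace_sets = defaultdict(lambda: defaultdict(int))
    for t in tuples:
        halves = [x >> 1 for x in t]
        index = tuple(b - a for a, b in zip(halves, halves[1:]))
        pattern = tuple(x & 1 for x in t)
        trace_sets[index][pattern] += 1
    usable = sum(sum(subsets.values())
                 for subsets in trace_sets.values()
                 if len(subsets) == 2 ** n)
    return usable / len(tuples)


full = usable_fraction([0, 0, 0, 1, 1, 0, 1, 1], 2)
missing_one = usable_fraction([0, 0, 0, 1, 1, 0], 2)
```

Running this on real images for n = 2, 3, and 6 would be one way to see how the requirement of fully populated trace sets becomes harder to satisfy as the number of subsets per trace set grows.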
We also conclude that cover modifications performed at higher than third order result in virtually zero embedding capacity, and so protections cannot be extended beyond Triples analysis. This finding suggests that quadruples and even higher-order structural steganalysis should continue to be matured and developed in the face of these kinds of cover modifications.
Though accurate and powerful, structural steganalysis is not the only attack against LSB embedding. For example, the weighted stego-image [18,34] and asymptotic uniform most powerful (AUMP) [10] tests are robust detectors of LSB embedding that operate according to different principles, and so are not defeated with these kinds of cover modifications. It is an open question whether the cover statistics targeted by structural steganalysis can be modified while also preserving the cover models exploited by weighted stego-image and AUMP steganalysis. Our approach might also be extended to secure against more general pixel grouping geometries like those explored in the Closure of Sets work of [35,36].
Lastly, a nagging shortcoming of this methodology is the need to omit the pixels of entire trace sets in order to increase the maximum embedding rate. This requires that the recipient perform the additional work of identifying and removing the LSBs of the omitted pixels from the extracted data before a meaningful message can be recovered. Further, of course, having these pixels available for embedding in the first place would considerably increase the embedding capacity in many cases. Future work could explore cover pre-processing (prior to the modifications studied here) that redistributes pixels in trace sets with small (and, hence, limiting) donor subsets; such transfers, however, would not be reversed in the course of LSB embedding and so would stand as permanent modifications to the cover image. Such alterations would need to be performed carefully to avoid the introduction of statistical artifacts, and hence warrant further study.
Acknowledgment
The author thanks colleague Max Kresch for helpful discussions and for providing the NRCS image data set. The author acknowledges use of the software available at
