StegEraser: Defending cybersecurity against malicious covert communications

Abstract

Traditionally, the mission of intercepting malicious traffic between the Internet and the internal network of entities like organizations and corporations is largely fulfilled by techniques such as deep packet inspection (DPI). However, steganography, the methodology of hiding secret data in seemingly benign public mediums (e.g., images), has been leveraged by advanced persistent threat (APT) groups in recent years, and is almost impossible to detect and intercept with traditional techniques, posing a pervasive and realistic threat to cybersecurity. Additionally, internal networks’ vulnerability to steganography is further exacerbated by the connectivity and large attack surface of the Internet of Things (IoT), whose adoption and deployment are quickly expanding. To protect computer systems against malicious communications that apply steganographic methods potentially unknown to cybersecurity stakeholders, we propose StegEraser, an approach to removing the secret information embedded in public mediums by adversaries. It is fundamentally distinct from existing research, which is primarily designed for known steganographic methods. Implemented for images, StegEraser uses a novel steganographic method to inject an excessively large amount of random binary data into the images, utilizing the information-merging capabilities of invertible neural networks (INNs), in order to “overload” the steganographic hiding capacity of images transmitted through the firewall performing DPI. In the meantime, StegEraser preserves the perceptual quality of the images. In other words, StegEraser “defeats unknown steganography with steganography”. Extensive evaluation verifies that StegEraser significantly outperforms state-of-the-art (SOTA) methods in terms of removing secret information embedded with both traditional and neural network-based steganographic methods, while visually maintaining the image quality.

Keywords

Cyber security; Advanced persistent threat; Steganography; Invertible neural network; Internet of Things

1. Introduction

Internet connectivity has now become commonplace for electronic devices that are rapidly growing in number, such as smartphones, corporate computers, the Internet of Things (IoT), and the Industrial IoT (IIoT). For instance, with the technological advances in the industrial sector, including the IIoT and Industry 4.0, industrial systems, such as critical infrastructure, cyber-physical systems (CPSs), and factories, are no longer isolated from the Internet as they used to be [40]. Despite the benefits brought by such connectivity, including a higher level of automation and lower cost, cyberattacks and major data leaks have become increasingly prevalent in recent years, jeopardizing the security of computer systems. As an example, IBM estimates that the average cost of a data breach reached $4.35 million in 2022 [26]. Apart from the economic damages and privacy issues, the consequences of cyber threats also include potential Internet and power outages, or even severe damage to critical infrastructure [47].

Traditionally, the responsibility of protecting the cyber perimeter of entities like corporations, government agencies, organizations, and industrial systems largely falls on firewalls with deep packet inspection (DPI) capabilities [10,39]. However, as an emerging threat, steganography, the methodology of hiding secret information in seemingly benign public mediums (e.g., images and audio that can be analyzed by DPI), enables APT groups and malicious insiders to transmit malware or sensitive information like intellectual property to and from computers on the internal network. With neural networks, researchers have proposed a variety of methods to hide secret images or even arbitrary binary data in “cover” images or audio [22,48]. For instance, a popular framework based on generative adversarial networks (GANs) can effectively hide more than 4 bits of arbitrary binary information in a single pixel on average [52]. In other words, at 4 bits per pixel, an RGB image of 1920 × 1080 pixels has a hiding capacity of roughly 1012 KB, which is large considering that many malware payloads are tailored to small sizes. When combined with cryptographic techniques like AES or RSA, binary secret data hidden with steganography pose a greater threat and are even more difficult to defend against [24].
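The capacity figure quoted above follows from simple arithmetic. The sketch below reproduces it under two assumptions that are not stated in the text: the payload is taken as exactly 4 bits per pixel, and "KB" is read as 1024 bytes. The helper name `hiding_capacity_kib` is hypothetical.

```python
def hiding_capacity_kib(width: int, height: int, bits_per_pixel: float) -> float:
    """Nominal hiding capacity of an image in KiB (1 KiB = 1024 bytes)."""
    total_bits = width * height * bits_per_pixel
    return total_bits / 8 / 1024


# 1920 x 1080 pixels at an assumed payload of 4 bits per pixel.
capacity = hiding_capacity_kib(1920, 1080, 4)
print(f"{capacity:.1f} KiB")  # prints "1012.5 KiB"
```

Since the cited framework hides more than 4 bits per pixel on average, this value is best read as a lower bound on the nominal capacity.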

The threat of steganography to cybersecurity is not just a theory, but already a reality. In recent years, the United States Cybersecurity and Infrastructure Security Agency (CISA) has issued several alerts about advanced persistent threat (APT) groups that apply steganography to obscure the Command & Control (C&C) communications and target critical infrastructure, as well as private sector organizations [46]. In addition, the MITRE Corporation, maintainer of the CVE (Common Vulnerabilities and Exposures) program, has identified multiple APT groups that leverage steganography in cyber attacks [44], with examples including hiding malicious Portable Executables (PEs) and shellcode within PNG and JPEG files. As noted by MITRE, except for certain steganographic (or stego for short) methods that leave easily recognizable signatures of artifacts on the images, the detection of most attacks that are based on steganography is extremely difficult [44], and exceeds the capabilities of traditional DPI equipment.

To defend against malicious communications based on steganography, researchers have proposed various methods for steganalysis, which aim to detect the presence of hidden messages [28]. However, steganalysis tools cannot sufficiently prevent covert communications, since adversaries may apply stego methods that were not considered when the tools were developed [58]. Moreover, apart from the potential false positive and false negative results of steganalysis tools, positive results would also require proper further processing, such as the removal of the embedded secret information. Therefore, instead of detecting the existence of hidden messages, researchers have recognized the necessity to directly modify all transmitted public messages (e.g., images) and remove potential stego information regardless of whether the transmitted messages are benign or malicious [28,58]. This paradigm is sometimes referred to as the “active warden” in the literature [42]. Hence, the covert communication channel provided by steganography is disrupted without the necessity of successfully classifying the messages as benign or malicious.

For the blind removal of potential secret information embedded in public mediums, such as images, video, and audio, the following primary difficulties exist:

A large variety of stego techniques with distinct characteristics need to be defended against, and many techniques might even be unknown to the active warden.

Such removal should be as transparent as possible to ordinary users, i.e., without severely impairing the perceptual quality of the public medium.

Faced with the above practical challenges, most existing approaches of active wardens are only designed specifically for certain known stego methods [4,5]. This implicit assumption may, to a great extent, limit the practical application of these methods, since new stego techniques are constantly being developed. More recent measures of active warden, such as PixelSteganalysis [28], do not require knowledge of the exact stego method applied by adversaries, but nevertheless make some other restrictive assumptions, as discussed in Section 2.

To address these challenges while considering the limitations of existing research, we propose StegEraser, an approach to disrupting malicious covert communications applying steganography, that is fundamentally distinct from existing research, as shown in Fig. 1. In the figure, APT groups can bypass traditional DPI techniques, and persistently transmit large payloads embedded in seemingly benign medium without being noticed by the firewall. The payloads are then extracted by an internal device compromised via previous attacks or a malicious insider, followed by lateral movement (i.e., moving across the network from the compromised devices for greater access to the internal resources). In the reverse direction, insiders or victim devices collect sensitive information, embed the information into public medium with stego methods, and then send it to the Internet (external network).

Fig. 1.

An overview of the threat of steganography to industrial systems, and StegEraser, the proposed scheme for disrupting steganography-based malicious covert communication.

With the realistic assumption that the firewall with DPI capabilities has access to the relevant cryptographic keys and decrypts the network traffic for inspection, StegEraser serves as an enhancement to traditional firewalls. Unauthorized encrypted traffic could be blocked by the firewall by default. In StegEraser, we propose to “overload” the stego hiding capacity of public medium, and “overwrite” potential malicious secret information with random binary data. Specifically, by leveraging the transformation capabilities of invertible neural networks (INNs) [11], along with a metric of perceptual image quality [53], StegEraser injects an excessively large amount of random binary data into the public medium, and consequently inhibits adversaries’ ability to correctly decode hidden information. Thus, the covert communication channel enabled by steganography is disrupted, such that the cost and effort required by adversaries to transmit data across the firewall are significantly increased.

As a demonstration, we implement StegEraser for images due to the abundance of public datasets and relevant research on steganography, while our idea can also be extended to other forms of data, such as video and audio.

The main contributions of this paper are summarized as follows:

A novel methodology is proposed to disrupt malicious covert communications based on steganography, which is fundamentally distinct from existing approaches.

We propose to adopt the perceptual quality metric instead of the commonly used pixel-level difference to preserve the visual quality of processed images.

As a byproduct, StegEraser can also be repurposed as a stego method for hiding arbitrary binary data in images, with a capacity higher than SOTA models.

Extensive evaluation indicates that StegEraser outperforms existing SOTA methods in terms of removing secret information while preserving the quality of images.

The remainder of this paper is organized as follows. In Section 2, we briefly introduce the related work and research gap. The threat model and details of StegEraser are presented in Section 3. Then, the evaluation of StegEraser is shown in Section 4. Finally, the limitations and potential countermeasures against StegEraser are discussed in Section 5, and Section 6 concludes the paper.

2. Related work

In this section, clarification regarding several relevant concepts is presented, and the related work is discussed, as well as the major differences between them and this paper.

2.1. Steganography and blind watermarking

As the most widely studied form of steganography, the objective of image steganography is to hide secret information (e.g., images, encrypted messages) in “cover” images, such that the generated image (“stego image”) is as indistinguishable as possible from the cover image. Methods of image steganography are generally categorized as either conventional or “deep steganography”, i.e., based on neural networks (NNs). Examples of conventional methods include modifying the least significant bits (LSBs) of pixels, as well as hiding secret information in the wavelet domain [29]. With regard to deep steganography, Zhang et al. develop UDH [51], a general NN-based framework that disentangles the encoding of secret images and cover images, and is robust to pixel intensity shifts. In [7], the author applies several neural networks to hide a full-size color image in a cover image.

Conceptually, image steganography is quite similar to blind watermarking [2], i.e., hiding invisible watermarks (in the forms of bits or images) within an image. While the term steganography focuses on covert communication, blind watermarking is generally oriented towards the identification of ownership, and typically has a smaller hiding capacity.

Apart from images, steganography can also be conducted in many other forms. For instance, secret information can be embedded into text [37], video [34], or even the activity patterns of network traffic [18]. As a demonstration, this paper focuses on image steganography for simplicity, but the idea can be extended to other forms of steganography.

2.2. Steganalysis

To detect the presence of secret information embedded in images, researchers have proposed a variety of methods. These methods are generally based on complicated statistics of pixel values, or analysis in the transform domains (e.g., discrete cosine transform and discrete wavelet transform) [30].

For instance, Fillatre et al. [19] propose an approach to detecting secret information embedded in LSBs of images by an adaptive statistical test. In [9], the authors apply deep residual networks to detect the existence of both spatial-domain and JPEG steganography in images.

However, as mentioned in §1, it is unrealistic to rely solely on steganalysis to protect cyberspace, due to factors including unknown advanced stego methods, and potential false positive as well as false negative classification results.

2.3. Active wardens

The majority of existing steganographic destruction techniques (i.e., active wardens) are based on conventional methods like wavelet transforms, and often target a specific known stego method [4,5]. For instance, the LSBs of pixels can be overwritten with Gaussian noise to prevent LSB-based steganography [20], and image filters can be applied to remove the secret information embedded with certain stego methods [43]. Despite the simplicity of implementation, these conventional methods typically lead to significant degradation in image quality [58].

By contrast, PixelSteganalysis [28] utilizes an architecture based on convolutional neural networks to establish pixel and edge distribution for each image, and then removes the hidden secret information at the suspicious pixels, while maximally preserving the quality of the processed images. It does not require knowledge of the exact stego method used by adversaries. However, PixelSteganalysis makes the assumption that the probability distribution of pixel values for the cover images used by adversaries is similar to the distribution of the dataset available to the active warden. This may limit its real-world application.

In [58], Zhu et al. propose to first use a simple neural network to simulate a Gaussian filter, which is an effective yet non-differentiable measure of removing hidden information, and then apply another neural network to compensate for the loss of image quality caused by the filter. With GANs, Corley et al. [12] propose the Deep Digital Steganography Purifier (DDSP), which removes the stego content from images. In DDSP, the authors also assume that the stego methods are already known, such that an autoencoder (generator) can be trained to purify the stego images, while the discriminator distinguishes between the clean cover image and the purified image.

Without making restrictive assumptions regarding the distribution of cover images, StegEraser differs from the aforementioned approaches in the sense that, instead of attempting to identify suspicious pixels or to simulate a traditional filter, it adopts a fundamentally distinct method of “defeating steganography with steganography”, i.e., overloading the stego hiding capacity of images with noise and invertible neural networks. Using a SOTA metric of perceptual image quality, StegEraser is able to preserve the visual quality of the images being processed.

2.4. Invertible neural networks

With carefully designed network architectures, invertible neural networks (INNs) are mathematically guaranteed to be invertible [27,31], i.e., mapping their input into their output in the forward process, and mapping the output back into the input in the inverse process. Typical invertible operations include coupling and specially designed convolutions [31].

INNs inherently excel at merging information together without loss [11]. In most cases, they are used for normalizing flows [35], which are designed to map a complex distribution (e.g., natural images) into a tractable distribution (e.g., Gaussian), with applications including density estimation and image generation.

However, as it is unnecessary to calculate the exact probability (likelihood) of images in our case, INNs are directly used without involving normalizing flows.

3. Details of StegEraser

3.1. Threat model and assumptions

As depicted in Fig. 1, the threat model and the major assumptions we make are described as follows:

Unknown stego methods. Similar to PixelSteganalysis [28], we do not make restrictive assumptions concerning the exact type of stego method applied by the APT groups. Hence, StegEraser could have a broad impact and be applied in many real-world systems.

Availability of DPI. By definition, steganography utilizes “public medium” to transmit secret information. Therefore, we assume that the firewall has DPI capabilities. Specifically, it owns the relevant cryptographic keys, decrypts network traffic, and analyzes the plaintext content, such that the unencrypted data can be further processed by StegEraser. Practically, the firewall could block unauthorized encrypted traffic by default. If DPI is unavailable, stakeholders’ ability to disrupt stego communications would be hindered, and they could turn to alternative approaches like behavioral pattern analysis [41] or software that monitors employees’ computers.

Adversarial access to springboard devices. We assume that adversaries have already acquired access to at least one springboard device residing in the internal network, either by prior penetration or with a malicious insider. For instance, as a commonplace example in many corporations, employees might click on a seemingly authentic URL in an email sent by hackers, or unknowingly insert an infected flash drive into their computer. Consequently, the springboard device can send embedded sensitive internal data to the Internet (or external network), or extract malicious payloads hidden with stego methods in public medium/data received from the Internet. Therefore, further persistent malicious communications across the firewall are feasible for the adversaries.

Protection of data in large quantities. From a practical perspective, it is almost impossible to completely destroy the information hidden with stego methods. As an extreme example, APT groups could secretly transmit merely tens of bits of information each time by sending an image of a cat among all the predefined choices (e.g., a cat, a dog, etc.), or by choosing the specific timing of transmission. Thus, a more realistic objective is to implement proactive measures to disrupt the covert transmission of large amounts of data, such that the effective capacity of the stego channel of communication is reduced, and the cost and difficulty of malicious transmission are increased. For instance, with a stego capacity heavily reduced by StegEraser, APT groups would need much more time to transmit the same amount of information into or out of the internal network, and would be forced to transmit a smaller amount of data at each opportunity of transmission. Otherwise, they would risk being noticed for having an unusual traffic pattern.

Non-generative steganography. For simplicity, this paper does not consider the emerging threat of generative steganography [33], which directly generates natural images from secret information. Further explanation and discussion are provided in §5.2.

As discussed in Section 2, these assumptions are less restrictive than those of existing research (e.g., [28,58]).

3.2. An overview of StegEraser

Fig. 2.

The StegEraser model, with modifications to IICNet [11]. During training, the forward direction generates stego images, and the inverse direction recovers the embedded binary data. After training, the model is used for erasing potential hidden information, and only the forward direction is utilized, which produces the purified image.

As a proof of concept, we modify the architecture of the IICNet [11] (a generic framework for the lossy embedding of several color images within a single image using INNs) and develop the proposed StegEraser model shown in Fig. 2.

During training, the overall objective is to acquire a steganographic model which has a substantial hiding capacity for arbitrary binary data, while causing little visual degradation of the image. As a result, after training, the model can be utilized to “overwrite” potential secret information hidden in images that need to be purified.

The design of StegEraser is primarily based on the following observations. First, in most cases, whether in the spatial domain or transform domain, image steganography inevitably results in changes to the pixel values of images, while attempting to preserve the appearance of the original image (with exceptions discussed in Section 5). Second, humans’ perception of images greatly differs from that of machines. One intuitive example is DeepFool [38], in which a small and hardly noticeable perturbation to the image leads to wrong NN classification results. Based on the above observations, StegEraser pre-emptively encodes random binary data in images, with the goal of only preserving the perceptual quality of images instead of pixel-level similarity. In other words, it is conceptually more reasonable to use perceptual metrics of image quality like DISTS [14], instead of pixel-level metrics like mean squared error and PSNR.

Regarding our choice of encoding random binary data using StegEraser, one may naturally ask why we do not encode random natural images instead, since most stego methods, especially those based on NNs, are designed to embed secret color images within cover images (e.g., [51]). The reasons are twofold. First, from the perspective of implementation, it is easier to design the neural network if the cover image and the secret data to be embedded have the same spatial size. As secret images typically have three color channels, it can be complicated to achieve a stego hiding capacity (measured in bpp, i.e., bits per pixel) other than multiples of three. By contrast, with binary secret data, it is much more straightforward to precisely control the desired bpp, making it more favorable. Second, natural images exhibit strong spatial correlations, which are often exploited by stego neural networks [52]. In other words, natural images have less entropy than random binary data. Hence, using random binary data can be considered less restrictive than using natural images.

Denote by $x$ the input image of size $H \times W$ with three color channels, processed by operations like normalization, where $H$ and $W$ are the height and width. Then, a total of $D$ planes of random binary data $y = \{y_{1}, \dots, y_{D}\}$ are generated, with each data plane having $H \times W$ elements. Each element of $y$ is independently and identically distributed (i.i.d.) according to the Bernoulli distribution with equal probability.
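As a minimal sketch of how the random payload described above could be generated, the snippet below draws $D$ planes of i.i.d. fair bits, each with $H \times W$ elements. The function name `random_bit_planes`, the nested-list layout, and the use of Python's standard `random` module are illustrative assumptions, not the implementation used in the paper.

```python
import random


def random_bit_planes(height, width, depth, seed=None):
    """Return `depth` planes of shape height x width filled with i.i.d. fair bits."""
    rng = random.Random(seed)
    return [
        [[rng.getrandbits(1) for _ in range(width)] for _ in range(height)]
        for _ in range(depth)
    ]


# Hypothetical toy sizes: a 4 x 6 image with D = 3 data planes.
y = random_bit_planes(height=4, width=6, depth=3, seed=0)
payload_bpp = len(y)  # one bit per pixel per plane, so the payload is D bpp
```

Because each plane contributes exactly one bit per pixel, the number of planes directly sets the embedded payload in bpp.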

To avoid confusion with the backward propagation, we describe the two directions of information flow in Fig. 2 as forward and inverse, as opposed to “backward”. In the forward pass, $y$ is embedded in $x$, yielding the stego image $x_{s}$. In the inverse direction, the StegEraser model takes $x_{s}$ as input, and recovers the image $\hat{x}$ and binary data $\hat{y}$.

3.3. Model architecture

Nonlinearity module. Due to the limited nonlinear representation capacity of INNs resulting from their architectural constraints [16], we prepend the INNs with a nonlinearity module (shown in Fig. 2), which serves two purposes. First, it preprocesses the heterogeneous inputs of image and binary data, which are further processed and merged in the INN module. Second, it provides a richer nonlinear representation capability to the StegEraser model. The nonlinearity module is implemented with two separate symmetric Dense Blocks [25], since the Dense Block itself is not invertible. One Dense Block is for the forward direction, and the other is for the inverse direction. Each one consists of four densely connected convolutional layers, along with the LeakyReLU activation function and batch normalization.

INN module. To better merge the information of the input image and the random binary data, the INN module is composed of multiple identical invertible blocks. There exist many methods to implement the invertible operation, and we choose the widely adopted affine coupling [15] for its simplicity. For the $l$-th invertible block, in the forward direction, the affine coupling process first splits its input $u^{l}$ into two halves $u_{1}^{l}$, $u_{2}^{l}$. The splitting can be performed across the channel dimension, or spatial dimensions using a checkerboard-like pattern [17]. Then, an affine transformation is conducted on the two halves according to the following equations,
$$u_{1}^{l + 1} = u_{1}^{l} + f (u_{2}^{l}) \tag{1}$$
$$u_{2}^{l + 1} = u_{2}^{l} \odot \exp [2 \sigma_{s} (g (u_{1}^{l + 1})) - 1] + h (u_{1}^{l + 1}) \tag{2}$$
where $\sigma_{s} (z) = 1 / [1 + \exp (- z)]$ is the sigmoid function, $\odot$ is element-wise multiplication, and the functions $f$, $g$, $h$ can be arbitrary neural networks. In our case, $f$, $g$, $h$ are each implemented with a Dense Block for simplicity.

It is straightforward to verify that the mapping from $(u_{1}^{l}, u_{2}^{l})$ to $(u_{1}^{l + 1}, u_{2}^{l + 1})$ is invertible, since
$$u_{2}^{l} = (u_{2}^{l + 1} - h (u_{1}^{l + 1})) \odot \exp [- 2 \sigma_{s} (g (u_{1}^{l + 1})) + 1] \tag{3}$$
$$u_{1}^{l} = u_{1}^{l + 1} - f (u_{2}^{l}) \tag{4}$$
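The invertibility claimed by Eqs. (1)–(4) can be checked numerically. The sketch below is only an illustration: the element-wise functions `f`, `g`, `h` are hypothetical stand-ins for the Dense Blocks, and the two halves are short Python lists rather than feature tensors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for the Dense Blocks f, g, h (any functions would do).
def f(u):
    return [0.5 * math.tanh(v) for v in u]


def g(u):
    return [2.0 * v for v in u]


def h(u):
    return [v * v for v in u]


def coupling_forward(u1, u2):
    v1 = [a + b for a, b in zip(u1, f(u2))]                      # Eq. (1)
    scale = [math.exp(2.0 * sigmoid(s) - 1.0) for s in g(v1)]
    v2 = [a * s + t for a, s, t in zip(u2, scale, h(v1))]        # Eq. (2)
    return v1, v2


def coupling_inverse(v1, v2):
    inv_scale = [math.exp(-2.0 * sigmoid(s) + 1.0) for s in g(v1)]
    u2 = [(a - t) * s for a, t, s in zip(v2, h(v1), inv_scale)]  # Eq. (3)
    u1 = [a - b for a, b in zip(v1, f(u2))]                      # Eq. (4)
    return u1, u2


u1, u2 = [0.2, -1.0, 0.7], [1.5, 0.3, -0.4]
v1, v2 = coupling_forward(u1, u2)
r1, r2 = coupling_inverse(v1, v2)
max_err = max(abs(a - b) for a, b in zip(u1 + u2, r1 + r2))
print(max_err)  # limited only by floating-point rounding
```

Note that `f`, `g`, and `h` themselves never need to be inverted, which is why they can be arbitrary networks. The scale factor $\exp[2\sigma_{s}(\cdot) - 1]$ is bounded between $e^{-1}$ and $e$, so it never reaches zero.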

Channel squeeze. To reduce the number of channels for the output of the INN module, the channel squeeze layer calculates the average of its input across channels, and produces an image of exactly three channels (RGB) in the forward direction [11]. For the inverse direction, the input is simply duplicated across channels. As its name suggests, this layer is not invertible, and causes loss of information.

Specifically, given an input with $3 C$ channels, the channel squeeze layer takes the average of channel 1 to channel C as the value for the first channel of the produced image (e.g., red channel). The green and blue channels are obtained in a similar manner.
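The channel squeeze and its lossy inverse can be sketched as follows. This is a toy illustration, with each channel stored as a flat list of pixel values. The function names are hypothetical, and the ordering of the duplicated channels in the inverse direction is an assumption.

```python
def channel_squeeze(channels):
    """Average each consecutive group of C channels into one of three outputs."""
    if len(channels) % 3 != 0:
        raise ValueError("expected a multiple of three channels")
    c = len(channels) // 3
    squeezed = []
    for k in range(3):
        group = channels[k * c:(k + 1) * c]
        # Average the C channels of this group pixel by pixel.
        squeezed.append([sum(vals) / c for vals in zip(*group)])
    return squeezed


def channel_unsqueeze(rgb, c):
    """Inverse direction: duplicate each of the three channels c times."""
    return [list(plane) for plane in rgb for _ in range(c)]


# Toy input with 3C = 6 channels (C = 2) and two pixels per channel.
channels = [[0.0, 1.0], [1.0, 0.0],   # averaged into the red channel
            [0.2, 0.4], [0.4, 0.6],   # averaged into the green channel
            [1.0, 1.0], [0.0, 0.0]]   # averaged into the blue channel
rgb = channel_squeeze(channels)
restored = channel_unsqueeze(rgb, 2)
```

Here `restored` has the original number of channels but not the original values, which makes the information loss of this layer concrete.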

Quantization. As the outputs of the channel squeeze layer are floating-point values, it is necessary to quantize the values, such that a valid image can be produced. However, the quantization operation is not differentiable since its derivative is zero almost everywhere, effectively prohibiting backward propagation during training. To solve this problem, we apply a commonly used trick [6], in which the quantization of a continuous value $z$ is replaced with the addition of uniformly distributed noise during training. After training, the normal rounding (quantization) operation is performed. This process during training is described as,
$$Q (z) = z + \epsilon \tag{5}$$
where the noise $\epsilon$ follows the uniform distribution $U (- 0.5 / 255, 0.5 / 255)$, since the pixel values of $[0, 255]$ are normalized to $[0.0, 1.0]$.
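The difference between the training-time surrogate of Eq. (5) and the rounding applied after training can be sketched on a single normalized pixel value. The function names are hypothetical, and the clipping to the valid range in `quantize_infer` is an added assumption that the text does not mention.

```python
import random

STEP = 1.0 / 255  # width of one quantization bin for pixels normalized to [0, 1]


def quantize_train(z, rng):
    """Training-time surrogate, Eq. (5): add uniform noise instead of rounding."""
    return z + rng.uniform(-0.5 * STEP, 0.5 * STEP)


def quantize_infer(z):
    """After training: round to the nearest of the 256 levels.

    Clipping to [0, 255] is an assumption made so the result is a valid pixel.
    """
    level = min(255, max(0, round(z * 255)))
    return level / 255


rng = random.Random(0)
noisy = quantize_train(0.5, rng)   # stays within half a bin of 0.5
rounded = quantize_infer(0.2)      # 0.2 * 255 = 51, so this is level 51
```

The noise has the same range as the rounding error, so the surrogate exposes the model to perturbations of comparable size while leaving the operation differentiable with respect to $z$.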

3.4. Loss functions

The aim of the StegEraser model during training is to maintain the visual quality of images, while embedding as much binary secret data as possible into the images, such that it has a higher probability of overwriting potential secret information hidden in images by adversaries after training.

Although the INN module itself is (mathematically) invertible, a certain degree of unavoidable information loss breaks the invertibility of the whole model, due to the inherent and inevitable error of computers’ representation of floating-point numbers, the non-invertible nonlinearity module, channel squeeze layer, as well as the quantization operation. As a result, it is impossible to perfectly embed and extract secret binary data, leading to the necessity of introducing a loss function to be minimized.

Perceptual loss. As noted above, since the human perception of images significantly differs from that of computer algorithms (including neural networks), and image steganography generally involves making subtle changes to the pixel values, we propose to apply the perceptual image quality loss, as opposed to existing research [58] that relies on pixel-level differences (e.g., MSE, mean squared error). Specifically, the SOTA metric of LPIPS (Learned Perceptual Image Patch Similarity) [53] is adopted, which calculates the distance between the input image $x$ and the reference image $x_{0}$ in the deep feature space of $L$ layers of a specific neural network (e.g., VGG). The perceptual loss $L_{perc}$ is defined as
$$L_{perc} (x, x_{0}) = \sum_{l} \frac{1}{H_{l} W_{l}} \sum_{h, w} \left\| w_{l} \odot \left( \hat{y}_{h w}^{l} - \hat{y}_{0 h w}^{l} \right) \right\|_{2}^{2} \tag{6}$$
where $H_{l}$, $W_{l}$ are the feature dimensions at layer $l$, $w_{l}$ is a weighting vector, and $\hat{y}_{h w}^{l}$, $\hat{y}_{0 h w}^{l}$ are the corresponding elements of the features extracted from $x$, $x_{0}$.
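To make the structure of Eq. (6) concrete, the sketch below evaluates it on feature maps that are assumed to be already extracted and normalized. It does not run a real feature network such as VGG. The function name `perceptual_distance` and the nested-list layout (channel, height, width) are illustrative assumptions.

```python
def perceptual_distance(feats_x, feats_x0, weights):
    """Evaluate Eq. (6) on precomputed features.

    feats_x, feats_x0: one entry per layer, each indexed as [channel][h][w].
    weights: one per-channel weighting vector w_l per layer.
    """
    total = 0.0
    for fx, f0, w in zip(feats_x, feats_x0, weights):
        n_h, n_w = len(fx[0]), len(fx[0][0])
        layer_sum = 0.0
        for c, w_c in enumerate(w):
            for i in range(n_h):
                for j in range(n_w):
                    # Squared, channel-weighted feature difference at (i, j).
                    layer_sum += (w_c * (fx[c][i][j] - f0[c][i][j])) ** 2
        total += layer_sum / (n_h * n_w)  # spatial average for this layer
    return total


# Toy example: one layer, two channels, a 1 x 2 spatial grid.
feats_a = [[[[1.0, 0.0]], [[0.0, 2.0]]]]
feats_b = [[[[0.0, 0.0]], [[0.0, 0.0]]]]
w = [[1.0, 0.5]]
d = perceptual_distance(feats_a, feats_b, w)
print(d)  # (1^2 + (0.5 * 2)^2) / 2 = 1.0
```

The distance is zero when the two feature sets coincide and grows with the weighted feature difference, regardless of how different the raw pixel values are.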

Data decoding loss. For the extraction of the random secret binary data embedded in the stego image, the data decoding loss $L_{data}$ is defined as
$$L_{data} (\hat{y}, y) = CE (\hat{y}, y) \tag{7}$$
where $CE$ is the standard cross-entropy.

Therefore, the total loss to be minimized is as follows,
$$L (\theta) = E [L_{perc} (x, x_{s}) + \beta L_{data} (\hat{y}, y)] \tag{8}$$
where $\beta$ is a weighting factor, $\theta$ denotes the model parameters to be optimized, and the expectation is taken over the entire dataset.
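The way Eqs. (7) and (8) combine for a single training sample can be sketched as follows. The text only states that $CE$ is the standard cross-entropy, so the binary form averaged over bits is an assumption here, as are the function names and the toy values of $\beta$ and the perceptual loss.

```python
import math


def binary_cross_entropy(probs, bits, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and true bits."""
    total = 0.0
    for p, b in zip(probs, bits):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= b * math.log(p) + (1 - b) * math.log(1.0 - p)
    return total / len(bits)


def total_loss(l_perc, probs, bits, beta):
    """Per-sample version of Eq. (8): perceptual term plus weighted decoding term."""
    return l_perc + beta * binary_cross_entropy(probs, bits)


# Toy values: an uninformative decoder (p = 0.5) costs ln(2) per bit.
l_data = binary_cross_entropy([0.5, 0.5], [0, 1])
loss = total_loss(l_perc=0.1, probs=[0.5, 0.5], bits=[0, 1], beta=2.0)
```

A larger $\beta$ pushes the model toward recoverable payloads at the expense of perceptual quality, and a smaller $\beta$ does the opposite.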

3.5. Complexity of training and inference

The training process of StegEraser involves both the forward and inverse directions. Therefore, training takes longer and requires more GPU memory as the number of INN blocks increases. The relatively slow speed of training is common for models of normalizing flows [31], which are also based on INNs.

Fortunately, model inference, i.e., execution after deployment of the model, only involves the forward direction, and is both fast and resource-efficient.

Details of the training and inference speed in our experiments are given in Section 4.

4. Evaluation

In this section, we evaluate StegEraser to answer the following questions:

How well does StegEraser perform in removing hidden secret images and binary data? (§4.2.1, §4.2.2)

How does StegEraser affect the stego images and benign images? (§4.2.3, §4.2.5)

Does processing images with StegEraser for multiple passes lead to better results? (§4.2.4)

How much overhead does StegEraser introduce (i.e., inference speed)? (§4.2.6)

What are the impacts of the perceptual quality metric and the number of invertible blocks in the INN module (ablation study)? (§4.2.8)

4.1. Evaluation setup

Stego methods. Similar to [58], we include a blind watermarking approach due to the scarcity of usable open-source implementations of conventional stego methods. The approaches used for generating stego images are as follows:

J-UNIWARD (JPEG UNIversal WAvelet Relative Distortion) [23], a conventional JPEG stego method utilizing a carefully designed distortion cost function of embedding secret binary information, such that the embedding effects can be estimated.

DWT+SVD, a blind watermarking scheme based on discrete wavelet transform (DWT) and singular value decomposition (SVD) [8] for hiding binary data, using the open-source implementation at https://github.com/guofei9987/blind_watermark. It is highly robust against rotation, noise, brightness change, and JPEG compression, but has a low hiding capacity as a result (roughly 0.016 bpp).

Deep Steganography (DS) [7], an NN-based method for hiding a secret color image in a cover image of the same size.

Universal Deep Hiding (UDH) [51], an NN-based general framework for hiding a color image within a cover image.

SteganoGAN (SG) [52], a popular framework (https://github.com/DAI-Lab/SteganoGAN) based on GANs for hiding binary data in images. A total of three variants are used, namely basic (SG-B), residual (SG-R), and dense (SG-D), which can respectively hide 5, 8, and 8 secret bits per pixel in nominal terms.

Methods for secret info removal. Similar to related literature [28,58], the performance of StegEraser is compared against

Gaussian Filter (GF), an effective approach for removing hidden information in images [58]. The values of the parameter σ are selected from {0.5, 1.0, 1.5, 2.0}. A higher σ corresponds to a higher level of blurring. We use the implementation from scikit-image (https://scikit-image.org/) with default settings.

Gaussian Noise (GN). For images normalized to [0.0, 1.0], Gaussian noise is added, with a total of 4 levels of the parameter σ, namely {0.02, 0.03, 0.04, 0.05}. A greater σ leads to more noise.

OSN [58], the state-of-the-art method based on NNs. It simulates the secret-removal effect of a traditional filter (specifically, the Gaussian filter) with a neural network, while improving the quality of the processed image with another neural network. The details are shown below.

The final loss function of OSN is defined as, $\begin{matrix} (9) & L_{OSN} = ‖ X_{att} - X_{flt} ‖^{2} + λ {‖ X_{res} - (X_{ori} - X_{att}) ‖}^{2} \end{matrix}$ where $X_{res}$ denotes the residual image, $X_{ori}$ denotes the original clean image, $X_{att}$ is the attacked image, and $X_{flt}$ is the filtered image. The first term on the right hand side of the equation attempts to make the aforementioned first neural network mimic the behavior of the traditional filter, and the second term improves the quality of the processed image by making it as close to the original image as possible.

As noted in the original paper [58] and confirmed in our own experiments, different values of λ have a negligible impact on the quality of processed images and the destruction rate of secret information. For a fair comparison, the OSN model is trained with settings almost identical to those of StegEraser, with λ chosen from {1.0, 5.0, 10.0, 20.0}.

In our experiments, J-UNIWARD has a success rate of almost zero in extracting the secret information from images processed by any of the above methods for secret info destruction; its results are therefore not shown below.

Datasets. For faster speed and higher resource efficiency, we train the StegEraser and OSN models on 50k images in the validation set of ImageNet [13]. To make a reasonable comparison and avoid potential correlation between the training and test set, evaluation is conducted on datasets other than ImageNet, i.e., 1000 images in the validation set of COCO [32] and 800 images in the training set of DIV2K [1], separately. All images are resized to 256 × 256 and pixel values are normalized to [0.0, 1.0].

Metrics. For the objective evaluation of image quality, we include the following conventional metrics,

PSNR, peak signal-to-noise ratio, measured in dB.

SSIM, structural similarity index measure.

MS-SSIM, multi-scale SSIM.

To measure the perceptual quality of images, we adopt the metric of Deep Image Structure and Texture Similarity index (DISTS) [14] based on neural networks, in addition to the LPIPS [53] metric with two variants, namely LPIPS-Alex (LPIPS-A) and LPIPS-VGG (LPIPS-V). Some more recent metrics of perceptual image quality can be found in [21], but they are not adopted in this paper due to their current insufficient popularity.

Incidentally, the DISTS metric is considered slightly better than LPIPS, but unfortunately cannot be used as a loss function for image reconstruction (e.g., training an autoencoder). Thus, DISTS is not compatible with the training of StegEraser, and is only adopted for the evaluation of perceptual quality.

Note: as lower DISTS and LPIPS scores correspond to better image quality, we redefine the scores as follows for the rest of the paper for consistency with metrics like PSNR, $\begin{array}{cl} (10) & DISTS = 1 - {DISTS}_{original} \\ (11) & LPIPS = 1 - {LPIPS}_{original} \end{array}$

To measure the efficacy of removing secret binary information, the Recovery Rate (RR) is defined as the percentage of the originally embedded bits that are successfully decoded by the stego methods. In other words, it is the complement of the bit error rate (BER), i.e., $\begin{matrix} (12) & RR = 1 - BER \end{matrix}$

In cases where one cannot recover binary data hidden in the cover image with 100% accuracy, it is necessary to consider forward error correction (FEC) to appropriately measure the effective hiding capacity of stego methods. An adversary could theoretically attempt to embed as much secret information as possible into the image, but without an FEC mechanism, the embedded secret information cannot be recovered with 100% certainty and accuracy. In the extreme case, the BER of the recovered secret is so high that it is equivalent to a random guess (noise). With the Reed-Solomon (RS) FEC scheme, the RS bits per pixel (RS-bpp) defined in [52] is as follows, $\begin{matrix} (13) & \text{RS-bpp} = D \times (1 - 2 \times BER) \end{matrix}$ where BER is the bit error rate of the decoded binary data without considering any FEC measures, and D is the aforementioned number of binary data planes (each data plane has the same spatial dimensions as the image), which can also be interpreted as the nominal hiding capacity measured in bpp. The RS-bpp measures the average maximum number of secret bits that can be reliably embedded in a single pixel.
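The definitions in Eqs. (12) and (13) can be illustrated with a short sketch. The bit sequences below are made-up toy values rather than experimental data, and the function names are illustrative.

```python
def bit_error_rate(sent_bits, decoded_bits):
    """Fraction of decoded bits that differ from the embedded bits."""
    if len(sent_bits) != len(decoded_bits):
        raise ValueError("bit sequences must have the same length")
    errors = sum(1 for s, d in zip(sent_bits, decoded_bits) if s != d)
    return errors / len(sent_bits)


def recovery_rate(ber):
    """Eq. (12): RR = 1 - BER."""
    return 1.0 - ber


def rs_bpp(depth, ber):
    """Eq. (13): RS-bpp = D * (1 - 2 * BER), with D the number of data planes."""
    return depth * (1.0 - 2.0 * ber)


# Toy example: 2 of 8 bits are decoded incorrectly.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
decoded = [1, 0, 0, 1, 0, 1, 1, 0]
ber = bit_error_rate(sent, decoded)   # 0.25
rr = recovery_rate(ber)               # 0.75
capacity = rs_bpp(7, ber)             # 3.5 bits per pixel for D = 7
useless = rs_bpp(7, 0.5)              # 0.0: a BER of 50% is a random guess
```

The last line shows why a recovery rate of 50% is the target for an eraser: at that point the effective capacity is zero no matter how large D is.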

Hardware. All experiments are conducted on a server with two Intel Xeon E5-2678 v3 @2.50 GHz CPUs, and four Nvidia RTX 3080 graphics cards, each equipped with 10 GB GPU memory.

StegEraser settings. The number of planes for secret binary data D is set to 7, and a total of 8 INN blocks are present in the INN module. Values of β include {0.0, 0.01, 0.05, 0.1, 0.5, 1.0}.

Training. The StegEraser models are trained with the Adam optimizer at an initial learning rate of $2 \times 10^{- 4}$ , which is reduced by 10% for every epoch. The batch size is set to 2 due to the high GPU memory consumption of INNs. Gradients are accumulated for 8 training steps, yielding an effective batch size of 16. Each model is trained for a total of 10 epochs, and one epoch’s training takes approximately 2 hours on a single Nvidia RTX 3080 graphics card.
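The arithmetic behind these training settings can be sketched as follows. The sketch assumes that "reduced by 10% for every epoch" means multiplying the learning rate by 0.9 after each epoch, and that all 50k training images are used once per epoch; both are interpretations of the text rather than details confirmed by the implementation.

```python
INITIAL_LR = 2e-4
DECAY_PER_EPOCH = 0.9      # assumption: a 10% reduction is a factor of 0.9
BATCH_SIZE = 2
ACCUMULATION_STEPS = 8
NUM_EPOCHS = 10
NUM_TRAIN_IMAGES = 50_000  # ImageNet validation set, as stated under "Datasets"


def learning_rate(epoch):
    """Learning rate used during the given zero-based epoch."""
    return INITIAL_LR * DECAY_PER_EPOCH ** epoch


# Gradients from 8 batches of size 2 are summed before each optimizer step.
effective_batch_size = BATCH_SIZE * ACCUMULATION_STEPS
optimizer_steps_per_epoch = NUM_TRAIN_IMAGES // effective_batch_size
schedule = [learning_rate(epoch) for epoch in range(NUM_EPOCHS)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.3e}")
```

Under these assumptions, the learning rate in the final epoch is roughly 39% of its initial value.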

Apart from trying a few learning rates to stabilize training while achieving a suitable convergence speed, we conduct no extensive hyperparameter tuning (e.g., grid search), for the following two reasons. First, the training cost is high, especially considering that multiple models need to be trained to investigate the tradeoff curves. Second, primarily as a proof of concept, StegEraser exhibits satisfying performance even without such tuning.

4.2. Evaluation results

4.2.1. RS-bpp and impacts of β

Table 1
The hiding capacity of StegEraser (SE) measured in RS-bpp, when treated as a stego method

Method	COCO RS-bpp	COCO DISTS	COCO PSNR	DIV2K RS-bpp	DIV2K DISTS	DIV2K PSNR
SG-B	3.61	0.92	31.25	3.47	0.92	30.62
SG-R	3.99	0.96	37.72	3.84	0.96	36.97
SG-D	4.74	0.96	37.59	4.57	0.96	36.82
SE, $β = 0.01$	1.65	0.99	37.05	1.59	0.99	36.40
SE, $β = 0.05$	3.13	0.99	35.28	3.01	0.99	34.51
SE, $β = 0.1$	3.74	0.99	33.36	3.66	0.99	32.77
SE, $β = 0.5$	5.47	0.96	29.43	5.40	0.96	28.92
SE, $β = 1.0$	5.79	0.93	27.31	5.74	0.92	26.82

The best results are in bold face, and the second-best results are in italics.

Since StegEraser per se can be repurposed as a stego method, Table 1 lists its RS-bpp for different values of β, given the original “clean” images as input. In the table, StegEraser is compared only with SG, which embeds secret binary information into images, because DS and UDH hide secret images within images and DWT+SVD has a rather low hiding capacity. On the COCO dataset, StegEraser provides an effective hiding capacity (RS-bpp) of 5.47 when $β = 0.5$, 15.4% higher than the 4.74 of the SOTA model SG-D, at the same level of perceptual image quality (i.e., $DISTS = 0.96$). On the DIV2K dataset, StegEraser achieves an RS-bpp of 5.40, 18.2% higher than that of the SG-D model, at the same DISTS score. We believe that such a large hiding capacity enables StegEraser to sufficiently overwrite potential secret information embedded by adversaries. Comparing the DISTS and PSNR scores shows that, at the same level of perceptual quality, StegEraser yields lower PSNR scores than SG-D (e.g., 29.43 dB vs. 37.59 dB on COCO), suggesting that StegEraser focuses on the perceptual quality of the processed images instead of minimizing pixel-level differences. In other words, the better performance of StegEraser can be partly explained by the fact that the perceptual loss leaves more room for embedding secret information than the more restrictive pixel-level losses.
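To relate the capacities in Table 1 back to decoding accuracy, Eq. (13) can be inverted to obtain the BER implied by a reported RS-bpp. The sketch below does this for StegEraser with $β = 0.5$ and $D = 7$ (the setting given in §4.1), and also computes the relative capacity gain over SG-D. The helper names are illustrative.

```python
D = 7  # number of data planes used by StegEraser (Section 4.1)


def implied_ber(rs_bpp, depth):
    """Invert Eq. (13): BER = (1 - RS-bpp / D) / 2."""
    return 0.5 * (1.0 - rs_bpp / depth)


def relative_gain(value, baseline):
    """Relative improvement of value over baseline, as a fraction."""
    return (value - baseline) / baseline


# RS-bpp values taken from Table 1 (StegEraser with beta = 0.5, and SG-D).
ber_coco = implied_ber(5.47, D)
ber_div2k = implied_ber(5.40, D)
gain_coco = relative_gain(5.47, 4.74)
gain_div2k = relative_gain(5.40, 4.57)

print(f"COCO:  implied BER = {ber_coco:.4f}, gain over SG-D = {gain_coco:.1%}")
print(f"DIV2K: implied BER = {ber_div2k:.4f}, gain over SG-D = {gain_div2k:.1%}")
```

In other words, at $β = 0.5$ StegEraser decodes its own random payload with a bit error rate of roughly 11%, which is what allows about 5.4 of the 7 nominal bits per pixel to count as reliably embedded.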

As shown in Fig. 3 and Table 1, a higher β puts more emphasis on decoding binary data with higher accuracy during training, and leads to better destruction of adversaries’ secret information during inference, at the cost of degraded image quality of the processed images measured against the stego images.

Fig. 3.

Impacts of β on the RR of secret binary information (for DWT+SVD, SG-B, SG-R, SG-D), quality of the processed images against the stego images (for all stego methods), and quality of the secret images revealed from processed images against the original revealed secret images (for UDH and DS). Numbers are averaged across the stego methods.

4.2.2. Tradeoff curves

Fig. 4.

Comparison of methods for removing secret information, measured on the COCO dataset. (a)–(f), the recovery rate (lower is better) vs the image quality of processed images against the stego images (higher is better). (g)–(h), the quality of secret images revealed from processed images (lower is better) against the original revealed images.

The tradeoff curves for the four methods of removing secret information are given in Fig. 4, in which OSN and StegEraser perform better than the conventional approaches of Gaussian noise and Gaussian filtering. It is worth mentioning that, according to the aforementioned definition of RS-bpp, a recovery rate of 50% is equivalent to a random guess, i.e., the channel cannot be used to hide secret information, since its RS-bpp is zero.

For removing embedded secret binary information of low bpp (i.e., DWT+SVD, 0.001–0.016 bpp), OSN provides slightly better performance than StegEraser, as shown in Fig. 4(a)–(c). However, OSN has only a very narrow range of performance (mentioned in §4.1), leaving users with few options for the tradeoff, which might become a limiting factor in its real-world deployment. For removing secret binary information of high bpp (i.e., SG-B, SG-R, and SG-D), StegEraser consistently outperforms the other methods, as shown in Fig. 4(d)–(f).

For removing embedded secret color images (DS and UDH), as illustrated by Fig. 4(g)–(h), StegEraser leads to the lowest quality of secret images revealed from processed images by adversaries (i.e., the highest level of destruction regarding secret images), at the same level of image quality for the processed images.

4.2.3. Image quality

Fig. 5.

Visualization of images processed by StegEraser (SE) and other methods, in which (s) denotes stego, (r) denotes the revealed hidden image before processing (erasing), and (pr) denotes the revealed hidden image after processing.

A qualitative investigation of the images processed by different methods is presented in Fig. 5. The stego method of DWT+SVD is configured with a payload ratio of 0.9, the parameter σ for GN is 0.04, σ for GF is 1.5, λ for OSN is 10.0, and β for StegEraser is 0.5. These parameters are selected to be relatively high within their respective ranges, such that their effects can be clearly seen. Note that Fig. 5 serves only the purpose of visualization; readers are referred to figures such as Fig. 4 for a more rigorous comparison of the methods.

In Fig. 5, the original “clean” images are not shown, because they are almost indistinguishable from the first five columns. After embedding secret information into the clean images, the stego images corresponding to the five methods are shown in the first five columns. The stop signs in the sixth column are the revealed (extracted) secret images embedded by DS, and the next column DS (pr) shows the secret images revealed from stego images processed by erasing methods like OSN. It should be noted that DS and UDH embed secret images into the cover image, and consequently have the columns labeled with (r) and (pr), while other methods embed binary data and therefore do not have the corresponding revealed images. As expected, by comparing the rows of images, GN adds a noticeable amount of noise to the processed images, and perceptually degrades the image quality. GF, as an effective conventional method of removing secret information, makes the images blurry. The quality difference of images processed by OSN and StegEraser is less obvious.

For the secret images embedded with DS and UDH, by comparing the columns of DS (r), DS (pr), and UDH (pr), it can be seen that GN successfully makes the revealed image of DS unrecognizable, but leaves the revealed image of UDH partially intact (notice the red stop sign). The effectiveness of GF is exemplified in the second row, in which it is almost impossible to extract information from the revealed images. At a lower σ, e.g., 0.5, the recovered image of DS processed by GF is similar to the originally embedded secret image (not shown in Fig. 5), instead of appearing black in column DS (pr). For OSN, a low but non-negligible amount of information is retained in the revealed secret images. StegEraser renders the lowest quality of revealed images, which can be regarded as noise, and therefore StegEraser successfully disrupts covert communications.

4.2.4. Running multiple passes

Fig. 6.

The effect of running StegEraser for multiple passes for removing secret information on the COCO dataset, measured for the stego methods of (a) DWT+SVD, with a payload ratio of 0.9, and (b) SG-D.

To investigate whether injecting random binary data by running StegEraser multiple times leads to better destruction of stego contents, we illustrate in Fig. 6 the effects of running multiple forward passes on images, with different randomly generated binary data encoded in each pass.

For secret information embedded with DWT+SVD, running more forward passes leads to worse image quality, as expected, and the tradeoff between image quality and destruction of secret information is better than a single pass. However, this is not the case for secret information embedded with SG-D, where a single pass consistently outperforms multiple passes. The results for other stego methods are similar, suggesting that running StegEraser for multiple passes does not necessarily lead to better or worse performance.
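The multi-pass procedure can be sketched as a simple loop that draws a fresh random payload for every pass. In the sketch below, `erase_fn` is a hypothetical placeholder for the forward pass of a trained StegEraser model, and the image is represented as a nested list of rows; neither reflects the actual implementation.

```python
import random


def random_bit_planes(depth, height, width, rng):
    """Draw `depth` planes of random bits, each with the image's spatial size."""
    return [
        [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
        for _ in range(depth)
    ]


def run_passes(image, erase_fn, num_passes, depth=7, seed=0):
    """Apply erase_fn repeatedly, embedding new random bits in every pass."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    for _ in range(num_passes):
        payload = random_bit_planes(depth, height, width, rng)
        image = erase_fn(image, payload)  # output of one pass feeds the next
    return image


# Stand-in eraser that returns the image unchanged and records its payloads.
recorded_payloads = []


def identity_eraser(image, payload):
    recorded_payloads.append(payload)
    return image


toy_image = [[0.0] * 4 for _ in range(3)]  # 3 x 4 placeholder image
result = run_passes(toy_image, identity_eraser, num_passes=3)
```

Because each pass degrades the image further, the number of passes acts as another tradeoff parameter alongside β rather than as a guaranteed improvement.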

4.2.5. Impacts on benign images

Table 2
Impacts of StegEraser (SE) and other methods on benign images. Values of A/B are for the COCO dataset and the DIV2K dataset respectively

Method	Param	DISTS	LPIPS-V	LPIPS-A	PSNR	SSIM	MS-SSIM
GN	0.02	0.91/0.92	0.87/0.88	0.95/0.96	34.05/34.03	0.87/0.89	0.98/0.99
GN	0.03	0.86/0.88	0.80/0.82	0.90/0.92	30.60/30.57	0.78/0.81	0.97/0.97
GN	0.04	0.82/0.84	0.75/0.77	0.85/0.87	28.15/28.12	0.69/0.73	0.95/0.96
GN	0.05	0.79/0.81	0.70/0.73	0.79/0.83	26.26/26.23	0.61/0.66	0.93/0.94
GF	0.5	0.95/0.95	0.97/0.98	0.97/0.97	38.45/37.29	0.99/0.98	1.00/1.00
GF	1.0	0.82/0.82	0.82/0.83	0.78/0.77	29.48/28.27	0.89/0.87	0.98/0.98
GF	1.5	0.76/0.76	0.72/0.72	0.66/0.64	26.70/25.50	0.80/0.77	0.96/0.95
GF	2.0	0.72/0.72	0.65/0.64	0.56/0.54	25.08/23.94	0.73/0.68	0.93/0.91
OSN	1.0	0.97/0.97	0.97/0.98	0.99/0.99	33.58/33.25	0.98/0.98	1.00/1.00
OSN	5.0	0.97/0.97	0.97/0.98	0.99/0.99	33.49/33.16	0.98/0.98	1.00/1.00
OSN	10.0	0.97/0.97	0.97/0.98	0.99/0.99	33.44/33.10	0.98/0.98	1.00/1.00
OSN	20.0	0.97/0.97	0.97/0.97	0.99/0.99	33.46/33.12	0.98/0.98	1.00/1.00
SE	0.01	0.99/0.99	0.99/0.99	1.00/1.00	37.05/36.40	0.97/0.97	1.00/1.00
SE	0.05	0.99/0.99	0.99/0.99	1.00/1.00	35.28/34.51	0.93/0.94	0.99/0.99
SE	0.1	0.99/0.99	0.99/0.99	1.00/1.00	33.36/32.77	0.89/0.90	0.99/0.99
SE	0.5	0.96/0.96	0.96/0.96	0.99/0.99	29.43/28.92	0.80/0.82	0.98/0.98
SE	1.0	0.93/0.93	0.92/0.92	0.97/0.97	27.31/26.82	0.76/0.78	0.97/0.97

The best results are in bold face.

Since not all images would contain secret information in practice, the impacts of all the active-warden methods on benign images are listed in Table 2.

It is evident from the table that the degradation of image quality caused by StegEraser, especially the perceptual quality (DISTS and LPIPS), is low compared with other methods. Comparing the results of GF and StegEraser shows that although GF can achieve slightly better traditional scores of PSNR and SSIM, it is outperformed on perceptual metrics by StegEraser, which aims to optimize perceptual image quality instead of pixel-level similarity.

4.2.6. Model complexity and inference speed

Table 3
Comparison of the models’ complexity and processing speed

Method	# Params	MACs	Processing time (ms)
GN	/	/	6.35 ± 0.21
GF	/	/	3.99 ± 0.17
OSN	0.67 M	43.95 G	3.07 ± 0.03
StegEraser	1.57 M	100.90 G	24.07 ± 1.31

To preliminarily investigate the overhead that StegEraser would bring to firewall systems, we measure the inference speed and model complexity and present the results in Table 3. In the table, MAC represents the “Multiply-accumulate operation” [3], and is a common objective metric indicating the computational complexity of neural networks, since processing speeds on different GPUs can vary to a great extent. The decoding and encoding of image files (e.g., PNG), as well as file operations (reading and storing) are excluded from our calculations. GN and GF are executed on the CPU, while OSN and StegEraser models are run on a single GPU.
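A per-image timing measurement of this kind can be sketched as below. This is an illustrative harness and not the one used for Table 3: `process_fn` is a hypothetical stand-in for any of the four methods, and for GPU models the device would additionally need to be synchronized before each clock reading, which this pure-Python sketch omits.

```python
import statistics
import time


def time_per_image_ms(process_fn, images, warmup=2):
    """Return (mean, standard deviation) of the processing time in ms per image."""
    for image in images[:warmup]:
        process_fn(image)  # warm-up runs are excluded from the statistics
    timings = []
    for image in images:
        start = time.perf_counter()
        process_fn(image)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings), statistics.stdev(timings)


def images_per_second(mean_ms):
    """Convert a mean per-image latency in ms into sequential throughput."""
    return 1000.0 / mean_ms


# Placeholder workload; file decoding and I/O stay outside the timed region.
mean_ms, std_ms = time_per_image_ms(lambda image: sum(image), [[0.5] * 1000] * 20)
print(f"{mean_ms:.3f} ± {std_ms:.3f} ms per image")
```

For reference, the 24.07 ms reported for StegEraser corresponds to roughly 41.5 images per second when images are processed one at a time.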

As evident in the table, the StegEraser model is relatively lightweight in terms of parameter count (1.57 million), considering that the well-known image classification model VGG-16 has 138.36 million parameters and requires 15.5 GMACs on an image of size 224 × 224. The processing time of StegEraser is greater than that of the other methods, reaching about 24 ms per image, but is still acceptable.

It should be noted that the inference speed of StegEraser in real-world deployment could be much faster than the results reported in the table, because many techniques can increase speed and resource efficiency, such as using a specially designed inference engine like TensorRT (https://developer.nvidia.com/tensorrt), model pruning, and weight quantization [45].

4.2.7. StegEraser as a stego method

Fig. 7.

Comparison of methods for removing secret information embedded by StegEraser, i.e., StegEraser itself being used as a stego method. Results are measured on the COCO dataset. The image quality corresponds to the processed images measured against the stego images (higher is better).

As mentioned above, StegEraser can also be regarded as a stego method. Therefore, the removal methods can also be evaluated on the stego information embedded by StegEraser itself. To avoid potential confusion, the corresponding results are shown separately here, instead of being incorporated in §4.2.2. When used as a stego method, the StegEraser model is configured with $β = 0.1$ and $D = 7$. Fig. 7 shows that, when faced with StegEraser as a stego method, StegEraser as a defensive measure consistently outperforms its competitors in the tradeoff between image quality (both perceptual and objective) and the recovery rate.

4.2.8. Ablation study

Fig. 8.

Ablation study, measured for SG-D on the COCO and DIV2K datasets. (a)–(b) The effect of the LPIPS perceptual quality loss, vs the MSE loss. (c)–(d) The effect of the number of invertible blocks in the INN module.

To investigate the extent to which the LPIPS perceptual quality loss contributes to the overall performance of StegEraser, Figs. 8(a) and 8(b) present the tradeoff between recovery rate and image quality when LPIPS is replaced with the mean squared error (MSE) in the loss function. All other training settings are unchanged. As shown in the figures, the perceptual quality loss (orange curves) significantly improves the destruction of the secret information compared with the pixel-level difference loss of MSE, at the same level of perceptual image quality measured by DISTS.

Similarly, the performance of StegEraser benefits from the INN module, as shown in Figs. 8(c) and 8(d). The zero-block setting is equivalent to completely removing the INN module in StegEraser. It is evident that without the information processing capabilities of INNs, StegEraser yields an inferior destruction rate of secret information at the same level of perceptual image quality.

Evaluation on other stego methods leads to similar results, which are consequently not plotted.

5. Discussion

In this section, we discuss the limitations of StegEraser and potential countermeasures against it.

5.1. Limitation

One of the potential limitations of StegEraser is its inference speed. Although our model is quite fast as shown in Section 4, such speed might be inadequate in scenarios with stringent latency requirements, such as time-sensitive industrial systems [36]. To alleviate this problem, one could distill, quantize, or prune [45] the model for a smaller size and faster speed.

5.2. Potential countermeasures

As mentioned in Section 1, with the lower effective channel capacity due to StegEraser, the malicious stego transmission by APT groups is disrupted. Therefore, as a countermeasure, they could instead take a longer time, or use a greater number of compromised devices in the internal network to collaboratively send data, such that their ability to transmit large amounts of data is partly restored, while their risk of being exposed remains manageable. Nevertheless, StegEraser still raises the cost and effort required of adversaries.

As the offensive and defensive measures are constantly evolving in cybersecurity, the adversaries may alternatively devise and apply new stego methods that could circumvent the effects of StegEraser. For instance, in recent years, researchers have proposed methods of generative steganography, i.e., semantically manipulating images, such that realistic images can be generated directly from the secret information and no embedding process is involved. In [33], Liu et al. design a scheme to convert the bits of secret information into a synthetic image using structure and texture characteristics. Zhou et al. show that a capacity of 4 bpp can be achieved by utilizing the Glow model for generative steganography [56]. As a more recent example, considering that diffusion models have now become the de facto SOTA paradigm for generating highly realistic images, Zhao et al. demonstrate that it is possible to embed watermark strings into the images generated by diffusion models [55].

Furthermore, faced with the foreseeable modifications on images conducted by active wardens, adversaries could incorporate into stego models the existing techniques that increase the robustness of their stego models, such that the covert communication channel is resilient against interference. In [49], the authors consider perturbations to images including noise and JPEG compression, and introduce a stego scheme robust against such distortions.

Given the fundamental limitations of StegEraser regarding generative steganography, we list two categories of potential defense against it, to the best of our knowledge.

DeepFake detection. It is possible to detect whether an image contains stego information hidden with generative steganography by leveraging existing research on DeepFake detection [54] (in a general sense), which aims to determine whether images are naturally captured or generated by neural networks. However, such detection is inevitably imperfect and yields both false positives and false negatives.

Trusted computing. As a more restrictive but more powerful defense strategy, malicious programs running generative steganography algorithms (and other malware) can be mostly prevented by the paradigm of trusted computing [50], in which only trusted and authorized programs can be executed on computers. For instance, the trusted execution environment (TEE) [57] provides the opportunity to run programs safely by applying security measures implemented in CPU hardware.

6. Conclusion

As an advanced technique extremely difficult to defend against with traditional deep packet inspection, covert communication based on steganography poses a pervasive and realistic threat to the cybersecurity of corporations, the IoT, industrial systems, etc. To address this challenge, we propose StegEraser, an approach that is fundamentally distinct from existing research, and is intuitively described as defeating steganography with steganography. In StegEraser, a large volume of randomly generated binary data are effectively embedded in the images with invertible neural networks, such that potential secret information hidden by adversaries is overwritten. The visual quality of the processed images is maximally preserved by leveraging the image perceptual quality loss. Evaluation on multiple steganographic methods with different characteristics and two datasets indicates that StegEraser outperforms the state-of-the-art method, and has the capability to disrupt covert communications involving both secret images and secret binary data.

Acknowledgments

This work is partially supported by the Natural Science Foundation of Tianjin (No. 20JCZDJC00610), the National Natural Science Foundation of China (No. 62172241), and the Technology Research and Development Program of Tianjin (No. 18ZXZNGX00200).

References

Agustsson and

Timofte, Ntire 2017 challenge on single image super-resolution: Dataset and study, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 126–135.

Ahmaderaghi,

Kurugollu,

J.M.

Del Rincon and

Bouridane, Blind image watermark detection algorithm based on discrete shearlet transform using statistical decision theory, IEEE Transactions on Computational Imaging 4(1) (2018), 46–59. doi:10.1109/TCI.2018.2794065.

A.B.

Amjoud and

Amrouch, Convolutional neural networks backbones for object detection, in: International Conference on Image and Signal Processing, Springer, 2020, pp. 282–289. doi:10.1007/978-3-030-51935-3_30.

Amritha,

Induja and

Rajeev, Active warden attack on steganography using Prewitt filter, in: Proceedings of the International Conference on Soft Computing Systems, Springer, 2016, pp. 591–599. doi:10.1007/978-81-322-2674-1_56.

Amritha,

Sethumadhavan and

Krishnan, On the removal of steganographic content from images, Defence Science Journal 66(6) (2016). doi:10.14429/dsj.66.10797.

Ballé,

Laparra and

E.P.

Simoncelli, End-to-end optimization of nonlinear transform codes for perceptual quality, in: 2016 Picture Coding Symposium (PCS), IEEE, 2016, pp. 1–5.

Baluja, Hiding images in plain sight: Deep steganography, in: Advances in Neural Information Processing Systems (NeurIPS), 2017.

Bhatnagar and

Raman, A new robust reference watermarking scheme based on DWT-SVD, Computer Standards & Interfaces 31(5) (2009), 1002–1013. doi:10.1016/j.csi.2008.09.031.

Boroumand,

Chen and

Fridrich, Deep residual network for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 14(5) (2018), 1181–1193. doi:10.1109/TIFS.2018.2871749.

10.

Cheminod,

Durante,

Seno and

Valenzano, Performance evaluation and modeling of an industrial application-layer firewall, IEEE Transactions on Industrial Informatics 14(5) (2018), 2159–2170. doi:10.1109/TII.2018.2802903.

11.

K.L.

Cheng,

Xie and

Chen, IICNet: A generic framework for reversible image conversion, in: IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 1991–2000.

12.

Corley,

Lwowski and

Hoffman, Destruction of image steganography using generative adversarial networks, 2019, arXiv preprint arXiv:1912.10070.

13.

Deng,

Dong,

Socher,

L.-J.

Li,

Li and

Fei-Fei, ImageNet: A large-scale hierarchical image database, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 248–255. doi:10.1109/CVPR.2009.5206848.

14.

Ding,

Ma,

Wang and

E.P.

Simoncelli, Image quality assessment: Unifying structure and texture similarity, IEEE Transactions on Pattern Analysis and Machine Intelligence 44(5) (2022), 2567–2581. doi:10.1109/TPAMI.2020.3045810.

15.

Dinh,

Krueger and

Bengio, Nice: Non-linear independent components estimation, 2014, arXiv preprint arXiv:1410.8516.

16.

Dinh,

Krueger and

Bengio, NICE: Non-linear independent components estimation, in: Workshop of the International Conference on Learning Representations ICLR, 2015.

17.

Dinh,

Sohl-Dickstein and

Bengio, Density estimation using real NVP, 2016, arXiv preprint arXiv:1605.08803.

18.

Fathi-Kazerooni and

Rojas-Cessa, GAN tunnel: Network traffic steganography by using GANs to counter Internet traffic classifiers, IEEE Access 8 (2020), 125345–125359. doi:10.1109/ACCESS.2020.3007577.

19.

Fillatre, Adaptive steganalysis of least significant bit replacement in grayscale natural images, IEEE Transactions on Signal Processing 60(2) (2011), 556–569. doi:10.1109/TSP.2011.2174231.

20.

Fridrich, Goljan and Hogea, Steganalysis of JPEG images: Breaking the F5 algorithm, in: International Workshop on Information Hiding, Springer, 2002, pp. 310–323.

21.

Gu, Cai, Dong, J.S. Ren, Timofte, Gong, Lao, Shi, Wang, Yang et al., NTIRE 2022 challenge on perceptual image quality assessment, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 951–967.

22.

Hassaballah, M.A. Hameed, A.I. Awad and Muhammad, A novel image steganography method for industrial Internet of Things security, IEEE Transactions on Industrial Informatics 17(11) (2021), 7743–7751. doi:10.1109/TII.2021.3053595.

23.

Holub, Fridrich and Denemark, Universal distortion function for steganography in an arbitrary domain, EURASIP Journal on Information Security 2014(1) (2014), 1–13. doi:10.1186/1687-417X-2014-1.

24.

Huang, Huang and Y.-Q. Shi, New framework for reversible data hiding in encrypted domain, IEEE Transactions on Information Forensics and Security 11(12) (2016), 2777–2789. doi:10.1109/TIFS.2016.2598528.

25.

Huang, Liu, Van Der Maaten and K.Q. Weinberger, Densely connected convolutional networks, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4700–4708.

26.

IBM, Cost of a data breach report, 2022. https://www.ibm.com/security/data-breach.

27.

J.-H. Jacobsen, Smeulders and Oyallon, i-RevNet: Deep invertible networks, in: International Conference on Learning Representations (ICLR), Vancouver, Canada, 2018. https://hal.archives-ouvertes.fr/hal-01712808.

28.

Jung, Bae, H.-S. Choi and Yoon, PixelSteganalysis: Pixel-wise hidden information removal with low visual degradation, IEEE Transactions on Dependable and Secure Computing (2021).

29.

I.J. Kadhim, Premaratne, P.J. Vial and Halloran, Comprehensive survey of image steganography: Techniques, evaluations, and trends in future research, Neurocomputing 335 (2019), 299–326. doi:10.1016/j.neucom.2018.06.075.

30.

Karampidis, Kavallieratou and Papadourakis, A review of image steganalysis techniques for digital forensics, Journal of Information Security and Applications 40 (2018), 217–235. doi:10.1016/j.jisa.2018.04.005.

31.

Kobyzev, S.J. Prince and M.A. Brubaker, Normalizing flows: An introduction and review of current methods, IEEE Transactions on Pattern Analysis and Machine Intelligence 43(11) (2020), 3964–3979. doi:10.1109/TPAMI.2020.2992934.

32.

T.-Y. Lin, Maire, Belongie, Hays, Perona, Ramanan, Dollár and C.L. Zitnick, Microsoft COCO: Common objects in context, in: European Conference on Computer Vision (ECCV), Springer, 2014, pp. 740–755.

33.

Liu, Ma, Zhang, Schaefer and Fang, Image disentanglement autoencoder for steganography without embedding, in: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 2303–2312.

34.

Liu, Wang, Zhao and Liu, Video steganography: A review, Neurocomputing 335 (2019), 238–250. doi:10.1016/j.neucom.2018.09.091.

35.

Lugmayr, Danelljan, L.V. Gool and Timofte, SRFlow: Learning the super-resolution space with normalizing flow, in: European Conference on Computer Vision, Springer, 2020, pp. 715–732.

36.

Ma, Shang, Song, Huang and Fan, Reliability versus latency in IIoT visual applications: A scalable task offloading framework, IEEE Internet of Things Journal (2022).

37.

M.A. Majeed, Sulaiman, Shukur and M.K. Hasan, A review on text steganography techniques, Mathematics 9(21) (2021), 2829. doi:10.3390/math9212829.

38.

S.-M. Moosavi-Dezfooli, Fawzi and Frossard, DeepFool: A simple and accurate method to fool deep neural networks, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2574–2582.

39.

G.D.L.T. Parra, Rad and K.-K.R. Choo, Implementation of deep packet inspection in smart grids and industrial Internet of Things: Challenges and opportunities, Journal of Network and Computer Applications 135 (2019), 32–46. doi:10.1016/j.jnca.2019.02.022.

40.

J.E. Rubio, Roman and Lopez, Integration of a threat traceability solution in the industrial Internet of Things, IEEE Transactions on Industrial Informatics 16(10) (2020), 6575–6583. doi:10.1109/TII.2020.2976747.

41.

I.H. Sarker, M.H. Furhad and Nowrozy, AI-driven cybersecurity: An overview, security intelligence modeling and research directions, SN Computer Science 2 (2021), 1–18. doi:10.1007/s42979-020-00382-x.

42.

Sharifzadeh, Agarwal, Aloraini and Schonfeld, Convolutional neural network steganalysis’s application to steganography, in: IEEE Visual Communications and Image Processing (VCIP), IEEE, 2017, pp. 1–4.

43.

P.L. Shrestha, Hempel, Ma, Peng and Sharif, A general attack method for steganography removal using pseudo-CFA re-interpolation, in: 2011 International Conference for Internet Technology and Secured Transactions, IEEE, 2011, pp. 454–459.

44.

The MITRE Corporation, Obfuscated files or information: Steganography, 2022. https://attack.mitre.org/techniques/T1027/003/.

45.

Tung and Mori, Deep neural network compression by in-parallel pruning-quantization, IEEE Transactions on Pattern Analysis and Machine Intelligence 42(3) (2018), 568–579. doi:10.1109/TPAMI.2018.2886192.

46.

US Cybersecurity and Infrastructure Security Agency, Advanced persistent threat compromise of government agencies, critical infrastructure, and private sector organizations, 2020. https://www.cisa.gov/uscert/ncas/alerts/aa20-352a.

47.

US Cybersecurity and Infrastructure Security Agency, APT cyber tools targeting ICS/SCADA devices, 2022. https://www.cisa.gov/uscert/ncas/alerts/aa22-103a.

48.

Wu, Chen, Luo and Fang, Audio steganography based on iterative adversarial attacks against convolutional neural networks, IEEE Transactions on Information Forensics and Security 15 (2020), 2282–2294. doi:10.1109/TIFS.2019.2963764.

49.

Xu, Mou, Hu, Xie and Zhang, Robust invertible image steganography, in: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 7875–7884.

50.

Xu, Mauldin, Yao, Pei, Wei and Yang, A bus authentication and anti-probing architecture extending hardware trusted computing base off CPU chips and beyond, in: ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), IEEE, 2020, pp. 749–761.

51.

Zhang, Benz, Karjauv, Sun and I.-S. Kweon, UDH: Universal deep hiding for steganography, watermarking, and light field messaging, in: Advances in Neural Information Processing Systems (NeurIPS), 2020.

52.

K.A. Zhang, Cuesta-Infante, Xu and Veeramachaneni, SteganoGAN: High capacity image steganography with GANs, 2019, arXiv preprint arXiv:1901.03892.

53.

Zhang, Isola, A.A. Efros, Shechtman and Wang, The unreasonable effectiveness of deep features as a perceptual metric, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 586–595.

54.

Zhao, Zhou, Chen, Wei, Zhang and Yu, Multi-attentional deepfake detection, in: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 2185–2194.

55.

Zhao, Pang, Du, Yang, N.-M. Cheung and Lin, A recipe for watermarking diffusion models, 2023, arXiv preprint arXiv:2303.10137.

56.

Zhou, Su, Li, Yu, Q.J. Wu, Fu and Shi, Secret-to-image reversible transformation for generative steganography, IEEE Transactions on Dependable and Secure Computing (2022).

57.

Zhu, Hou, Wang, Cao, Zhao, Wang, Zhang, Ying, Zhang et al., Enabling rack-scale confidential computing using heterogeneous trusted execution environment, in: IEEE Symposium on Security and Privacy (SP), IEEE, 2020, pp. 1450–1465.

58.

Zhu, Li, Qian and Zhang, Destroying robust steganography in online social networks, Information Sciences 581 (2021), 605–619. doi:10.1016/j.ins.2021.10.023.


¹ https://github.com/guofei9987/blind_watermark

Table 3
Comparison of the models’ complexity and processing speed

| Method | # Params | MACs | Processing time (ms) |
|---|---|---|---|
| GN | / | / | 6.35 ± 0.21 |
| GF | / | / | 3.99 ± 0.17 |
| OSN | 0.67 M | 43.95 G | 3.07 ± 0.03 |
| StegEraser | 1.57 M | 100.90 G | 24.07 ± 1.31 |