Adaptive multi-cascaded ResNet-based efficient multimedia steganography framework using hybrid mouth brooding fish-emperor penguin optimization mechanism

Abstract

A massive amount of data is transmitted in the Internet of Things (IoT). Security has become a major concern when transferring data through wireless networks, and preserving data privacy has become correspondingly difficult. In this research work, a new model for multimedia steganography is developed. Initially, the required video is obtained from publicly available datasets, and the acquired input is subjected to an Adaptive Discrete Cosine Transformation (DCT) based block process. The optimal blocks for applying the stego data are chosen by the Adaptive Multi-cascaded ResNet (AMC-ResNet) model. Here, the parameters of the DCT and the ResNet model are optimized to enhance the steganography performance via the Mouth Brooding Fish Emperor Penguin Optimization (MBFEPO), which is derived from the Mouth Brooding Fish Algorithm (MBFA) and the Emperor Penguin Optimization Algorithm (EPOA). Finally, the inverse DCT is applied to the blocks to obtain the final stego video. In the audio steganography phase, the required audio is gathered from external websites. The collected data are given to the Short-time Fourier Transform (STFT) to convert them into a spectrogram image, which is then given to the Adaptive DCT block process to select the blocks in which the stego data are applied. The blocks are selected using the AMC-ResNet, where the parameters within the DCT and the ResNet are optimized via the same MBFEPO to improve the performance. Afterwards, the inverse adaptive DCT is applied to reconstruct the spectrogram image, and the resultant stego audio is obtained by using the inverse STFT. Finally, several experiments are conducted to evaluate the proposed steganography model. In terms of the median, the recommended model performs 12.3%, 52.6%, 12.3%, and 84.3% better than SFO, HBA, MBFA, and EPOA, respectively. The recommended model outperforms the existing approaches.

Keywords

Multimedia steganography; Adaptive Discrete Cosine Transformation; Adaptive Multi-cascaded ResNet; Mouth Brooding Fish Emperor Penguin Optimization; Short-Time Fourier Transform

1. Introduction

With the rapid growth of the multimedia era, digital multimedia data is increasingly transmitted over public channels [12]. Consequently, more attention has been given to the security of public channel transmission [35]. Initially, cryptography was used to protect multimedia data, but the protection is lost once the data are decrypted [1]. In this case, steganography provides effective secrecy for images or text to protect them from attackers. Steganography embeds a message into a cover image by changing the properties of the cover. It enables secret communication by hiding the presence of a message from an attacker or hacker [33]. Protecting the secret message from detection is the crucial role of steganography [26]. Hence, it covers a wide array of secret communication techniques that conceal the very existence of the message. These techniques include spread spectrum, covert channels, digital signatures, character arrangement, microdots, and invisible inks [6]. Steganography has also been applied to conceal video, audio, and images, and audio steganography has been integrated with cryptography on personal computers to improve security [32]. Various multimedia steganography techniques have been implemented. Most of them are based on the spatial domain, which offers simple and popular schemes for embedding and extraction. Others are based on the transform domain, which is highly robust [17].

Users generally secure information on personal computers before it is transmitted across the network and stored in the cloud [16]. Information security is categorized into two types: information hiding and encryption [2]. Encryption ensures the confidentiality of the data. However, the resulting ciphertext has an unintelligible or meaningless form that invites suspicion [25]. Hence, information hiding is generally preferred over encryption. Information hiding embeds the data into digital media to make the communication secure [19]. The cover used for information hiding is multimedia data, among which images are the most important medium in applications [22].

In recent times, deep learning-based techniques have attained better outcomes in several research fields [13]. Classical image steganography is divided into two categories: spatial domain-based and transform domain-based methods. In spatial domain-based techniques, the secret information is embedded into the image pixel values [30]. Typical techniques include histogram-based approaches, prediction error, and Least Significant Bit (LSB) methods. Spatial domain-based algorithms have a small effect on the quality of the cover image and offer a large embedding capacity [5]. However, they generally have low robustness. To enhance robustness, researchers have suggested embedding the secret information in the transform domain [29]. Typical techniques include steganography based on the Discrete Fourier Transform (DFT), Discrete Wavelet Transform (DWT), and DCT. Classical image steganography guarantees information security to a certain extent. However, most classical steganography algorithms can be detected by classical steganalysis algorithms at certain payloads. As steganalysis algorithms improve, classical steganography algorithms also constantly refine the distortion function to resist detection. The most commonly used deep learning model is ResNet, in which the weight layers learn residual functions with reference to the layer inputs. ResNet can effectively extract features to enhance the performance of the model. In steganography, ResNet can extract steganalysis features to provide accurate outcomes. Additionally, ResNet addresses the vanishing-gradient and overfitting problems by introducing residual blocks, and it helps to maintain a low error rate in the network. Here, the newly proposed multimedia steganography model is implemented.

The major contributions of the proposed model are as follows.

To design a new multimedia steganography model based on deep learning. The model is well suited to mobile environments.

To implement the Adaptive DCT technique for dividing the images obtained from both the video and the audio into 8 × 8 image blocks, where the type of DCT is optimized using the MBFEPO algorithm to enhance the performance.

To develop the AMC-ResNet for selecting the image blocks in which the stego data are applied, where parameters such as the number of epochs and the hidden neuron count of the AMC-ResNet are optimized by the MBFEPO algorithm.

To propose the new MBFEPO algorithm for optimizing the parameters of both the DCT and the AMC-ResNet, increasing the performance by maximizing the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Measure (SSIM).

The remainder of this paper is organized as follows. Section 2 describes the literature survey on traditional steganography. The dataset description and the proposed architecture for multimedia steganography are given in Section 3. In Section 4, image or frame decomposition by adaptive DCT is detailed. Deep learning-based block selection for multimedia steganography is explained in Section 5. The results and discussion are given in Section 6, and the conclusion is in Section 7.

2. Literature survey

2.1. Related works

In 2018, Zhang et al. [35] proposed a coverless image steganography algorithm based on Latent Dirichlet Allocation (LDA) topic classification and the DCT. First, LDA was used to classify the image database. Second, images belonging to one topic were chosen, and an 8 × 8 block DCT was performed on them. A robust feature sequence was then generated from the relation between the direct current (DC) coefficients of adjacent blocks. Finally, an inverted index containing the feature sequence, location coordinates, and image path was built. Throughout the process, no modification was made to the original images. The algorithm showed a better ability to resist steganalysis compared with classical algorithms.

In 2019, Lu et al. [18] explored steganalysis techniques based on pre-classification and feature selection. First, using features extracted from adjacent image data, images with different texture and content complexity were pre-classified into multiple clusters by standard algorithms. Then, the performance of various traditional steganalysis features was validated for the different image clusters, and the optimal features for each cluster were selected for the final classification. The experimental outcomes showed that the suggested model improved the detection accuracy. In addition, its availability and rationality were verified.

In 2018, Qian et al. [24] presented novel methodologies for multimedia security, such as multimedia tampering detection, privacy preservation on the cloud, and novel steganography. The main intention was to assure the security of multimedia data and of its further processing, given the constant advances in big data analytics and computational capability. The implemented techniques showed better performance than other models.

In 2021, Hao et al. [9] proposed a semi-construction coverless steganography algorithm. In the first phase, web crawler techniques were employed to crawl a wide range of small icons and images from the Internet. These were used as the image subset, and the images were then processed according to the construction rules. In the second phase, an AlexNet network was designed for training the algorithm, and adversarial samples were added to the training set. In the third phase, the privacy carrier image was formed from the images according to the construction principle. The experimental outcomes and the evaluation showed that the algorithm can resist steganalysis tools efficiently and has good robustness against several image attacks. These promising outcomes indicate that the algorithm can be employed to build covert communications.

In 2020, Gutub and Ghamdi [8] enhanced the counting-based secret sharing methodology for simple and fast computation as well as higher share security. The major aim of this method was to resolve the defects in the reconstruction phase by implementing new distribution techniques, after which the technique was optimized. In addition, the share reconstruction techniques enhanced the security of the system via steganography. Further, steganography techniques based on multimedia images were used for storing the optimized shares. The outcomes showed that the optimized counting-based secret sharing scheme is a promising solution.

In 2020, Sukumar et al. [31] proposed a classifier for transforming multimedia content, which was then embedded into a selected cover image. The produced stego images were stored in the cloud. When the multimedia content was required, the stego images were downloaded from the cloud and passed to the inverse Integer Wavelet Transform (IWT). The experimental results showed better values than the available schemes, and the robustness and security evaluation of the proposed model was also better.

In 2020, Wu et al. [33] developed a new steganography technique for digital audio in the time domain. Unlike related techniques for image steganography, which rely heavily on various traditional embedding costs, the proposed technique started from an even embedding cost and then updated the initial cost until promising security performance was attained. Extensive experimental outcomes showed that the developed model significantly outperformed adaptive and non-adaptive steganography techniques and attained state-of-the-art outcomes. In addition, the experiments investigated the utilization of embedding modifications across steganography techniques.

In 2020, Mstafa et al. [19] designed a novel technique for video steganography based on the LSB algorithm and the corner point principle. The designed model first used a standard algorithm to detect the regions within the cover video frames and then used another heuristic algorithm to hide the confidential data inside them. The experimental outcomes revealed that the designed model achieved higher invisibility and security than other models.

In 2023, Sonali et al. [27] developed a secure stego key-based video steganographic method that embeds data using the Framelet Transform. Here, the stego key was used to minimize the computational cost, and the encryption was performed using ECC. The scheme was also enhanced to provide better robustness. It was compared with state-of-the-art methods in terms of Bit Error Rate (BER), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM).

In 2023, Massoud et al. [3] developed a novel steganography approach termed One secret Picture is sent using only Two cover Pictures (OPTP). For each secret image, the stego images were generated to provide better security and quality. To provide better capacity, a lossy compression algorithm was applied, and verification was carried out with the help of a Bloom filter. The developed model achieved better performance when validated against the existing approaches.

In 2023, Aimin et al. [34] implemented a new steganography algorithm based on machine learning models. Initially, the images were split into foreground and background regions using adaptive threshold segmentation. Next, high-capacity steganography was applied to each pixel to enhance the embedding capacity. Hybrid machine learning was also used to encrypt the data to improve security. An empirical analysis was performed to validate the performance of the developed model against the baseline approaches.

2.2. Problem specifications

The advantages and disadvantages of the existing multimedia steganography methods are tabulated in Table 1. The LDA technique [35] effectively resists classical steganalysis algorithms. It improves the ability to resist detection, which is useful for enhancing the robustness of the model. However, its capacity with less side information is lacking. The K-means-based method [18] resolves issues in the feature selection process and enhances the steganalysis process, but it faces issues with complex textures during validation. The ensemble technique [24] has been used to conceal messages for security purposes, but it faces computational complexity issues. The transfer learning method [9] has the potential to enhance the effectiveness of the training model and to speed up the entire process, but it faces the issue of negative transfer. The counting-based secret sharing technique [8] has a low computational cost compared with other models and has shown its effectiveness in terms of capacity, robustness, and security. However, its support for enhancing the integrity and availability of the data is limited. The DRT technique [31] has been developed to increase the strength of the scheme and has provided promising outcomes in validation and simulation analysis, but the colour image steganographic technique for cloud storage is restricted in this model. The CNN model [33] attains a very effective embedding cost, resolves gradient issues, and provides better outcomes than other models, but the gradient amplitudes and the adaptive parameters are limited in this work. The LSB algorithm [19] has been used for secretly sharing the data bits of hidden messages. It is regarded as the simplest and easiest model to implement, but it faces capacity and storage issues.

Motivation. A few challenges of the existing models are summarized here. In existing approaches, handling a large amount of data is a complicated issue that prevents effective outcomes. Moreover, these approaches consume more computational time, and their accuracy and reliability are not sufficient. In this research work, a new model is developed for multimedia steganography. Here, the designed MBFEPO algorithm is utilized for tuning the parameters to avoid complex optimization issues. Moreover, the MBFEPO algorithm balances the exploitation and exploration phases. The recommended AMC-ResNet model speeds up the training process and avoids overfitting issues. Various metrics are validated to show the effectiveness of the developed model. Throughout the entire simulation process, the developed model provides better performance than the existing approaches.

Table 1
Advantages and disadvantages of the traditional multimedia steganography process


Author [citation]	Methodology	Features	Challenges
Zhang et al. [35]	LDA	∙ This technique effectively resists classical steganalysis algorithms. ∙ It improves the ability to resist detection, which is useful for enhancing the robustness of the model.	∙ The capacity with less side information is lacking in this model.
Lu et al. [18]	K-means algorithms	∙ This method resolves issues in the feature selection process and enhances the steganalysis process.	∙ It faces issues with complex textures during validation.
Qian et al. [24]	Ensemble	∙ This technique has been used to conceal messages for security purposes.	∙ This model faces computational complexity issues.
Hao et al. [9]	Transfer learning	∙ This method has the potential to enhance the effectiveness of the training model. ∙ It also speeds up the entire process.	∙ This model faces the issue of negative transfer.
Gutub and Ghamdi [8]	Counting-based secret-sharing technique	∙ The computational cost of this scheme is low compared with other models. ∙ It has shown its effectiveness in terms of capacity, robustness, and security.	∙ Its support for enhancing the integrity and availability of the data is limited.
Sukumar et al. [31]	DRT	∙ This technique has been developed to increase the strength of the scheme. ∙ It has also provided promising outcomes in validation and simulation analysis.	∙ The colour image steganographic technique for cloud storage is restricted in this model.
Wu et al. [33]	CNN	∙ This model attains a very effective embedding cost. ∙ It also resolves gradient issues and provides better outcomes than other models.	∙ The gradient amplitudes and the adaptive parameters are limited in this work.
Mstafa et al. [19]	LSBs algorithm	∙ This technique has been used for secretly sharing the data bits of hidden messages. ∙ It is regarded as the simplest and easiest model to implement.	∙ It faces capacity and storage issues.

3. Dataset description and proposed architecture for multimedia steganography

3.1. Dataset details

In this process, the data required to perform the multimedia steganography are gathered from the two datasets described below.

Dataset 1: This dataset is named Audio Steganalysis with Deep Learning. It is a standard dataset used for audio steganalysis and includes audio files with bit rates of 128 and 320 kbps. The data were collected from “https://github.com/Charleswyt/tf_audio_steganalysis” (access date: 2022-12-19).

Dataset 2: This dataset is known as Deep Video Steganography: Hiding Videos in Plain Sight. It includes various video frames that are used for steganography. The data were collected from “https://github.com/anilsathyan7/Deep-Video” (access date: 2022-12-19).

Hence, the video and audio data are obtained and are denoted as ${VD}_{ax}$, $ax = 1, 2, \dots, AX$, where $AX$ is the total number of collected videos, and ${AD}_{bx}$, $bx = 1, 2, \dots, BX$, where $BX$ is the total number of collected audio files, respectively.

3.2. Architecture of multimedia steganography

In this modern technological era, everyone wants safety and secrecy when communicating data. There are two mechanisms for securing data: steganography and cryptography. In cryptography, the text or data are transformed into ciphertext so that no one can recover the original data without the cipher key. However, the cipher or encryption key can be easily accessed by unauthorized users, which lowers security. For this reason, steganography is adopted here. Steganography is defined as the procedure of hiding information inside a data source. Its most widely recognized use is concealing a record inside another document. It usually hides messages within unsuspected multimedia information and is used as part of secret correspondence among recognized parties. It is regarded as a system for encryption that has the ability to hide information among the bits of cover objects. There are four kinds of steganography, namely video, audio, image, and text, for securing information in the required format. Image steganography has been utilized to secure data transfer over the Internet by using images. Statistical techniques modify the statistical properties of an image while preserving them in the embedding process. In transform-domain techniques, the message bits are encoded in the transform-domain coefficients of the data. Data embedding in the transform domain is used to address weak robustness. Even though traditional steganography has various advantages, it also has some limitations in securing the data. Thus, a new concept of multimedia steganography is implemented, and its architecture is depicted in Fig. 1.

Fig. 1. Architectural model for the newly designed multimedia steganography.

Thus, the recommended multimedia steganography is performed in two phases: (a) the image or video steganography phase and (b) the audio steganography phase. Initially, the required image and video are obtained from publicly available datasets, and the acquired input is fed into the Adaptive DCT-based block process. Afterwards, the optimal blocks are selected using the AMC-ResNet model, and the stego data are applied. Here, the parameters of the DCT and the ResNet model are optimized via the MBFEPO to enhance the steganography performance. Finally, the inverse DCT is applied to the blocks to obtain the final stego image and video. In the audio steganography phase, the required audio is gathered from external websites. The collected data are given to the STFT to convert them into spectrogram images, which are then given to the Adaptive DCT block process to select the blocks in which the stego content is applied. The blocks are selected using the AMC-ResNet, where the parameters within the DCT and the ResNet are optimized via the same MBFEPO to improve the performance. Afterwards, the inverse DCT is applied to reconstruct the spectrogram image, and the resultant stego audio is obtained by using the inverse STFT. In the end, several investigations are carried out to show the performance of the recommended steganography model.
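The image phase described above (block DCT, block selection, embedding, inverse DCT) can be illustrated with a minimal sketch. Several parts are assumptions made only for illustration and are not taken from the paper: blocks are ranked by pixel variance as a stand-in for the AMC-ResNet selector, one bit per block is written into the parity of a quantised DCT coefficient (the position `POS` and step `STEP` are hypothetical values), the stego image is kept in floating point, and the list of chosen blocks is assumed to be shared with the receiver.

```python
import numpy as np

N = 8            # block size, as stated in the paper
STEP = 16.0      # quantisation step: an assumed value, not from the paper
POS = (2, 1)     # coefficient that carries the bit: an assumed choice


def dct_matrix(n: int = N) -> np.ndarray:
    """Orthonormal DCT-II matrix D, so that D @ z is the 1D DCT of z."""
    k = np.arange(n).reshape(-1, 1)
    y = np.arange(n)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * y + 1) * k / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D


def to_blocks(img: np.ndarray) -> np.ndarray:
    """Reshape an (H, W) image into (H/N, W/N, N, N) non-overlapping blocks."""
    H, W = img.shape
    return img.reshape(H // N, N, W // N, N).swapaxes(1, 2)


def from_blocks(blocks: np.ndarray) -> np.ndarray:
    """Inverse of to_blocks."""
    bh, bw = blocks.shape[:2]
    return blocks.swapaxes(1, 2).reshape(bh * N, bw * N)


def select_blocks(blocks: np.ndarray, count: int) -> list:
    """Placeholder selector (NOT the AMC-ResNet): highest-variance blocks first."""
    variance = blocks.var(axis=(2, 3))
    order = np.argsort(variance, axis=None)[::-1][:count]
    rows, cols = np.unravel_index(order, variance.shape)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]


def embed(cover: np.ndarray, bits: list):
    """Hide one bit per selected block in the parity of a quantised coefficient."""
    D = dct_matrix()
    blocks = to_blocks(cover.astype(np.float64))
    chosen = select_blocks(blocks, len(bits))
    coeffs = D @ blocks @ D.T                      # 2D DCT of every block
    for (i, j), bit in zip(chosen, bits):
        q = int(np.round(coeffs[i, j, POS[0], POS[1]] / STEP))
        if q % 2 != bit:                           # force parity to match the bit
            q += 1
        coeffs[i, j, POS[0], POS[1]] = q * STEP
    return from_blocks(D.T @ coeffs @ D), chosen   # inverse DCT, floating point


def extract(stego: np.ndarray, chosen: list) -> list:
    """Read the bits back from the same blocks and coefficient position."""
    D = dct_matrix()
    coeffs = D @ to_blocks(stego) @ D.T
    return [int(np.round(coeffs[i, j, POS[0], POS[1]] / STEP)) % 2
            for i, j in chosen]


rng = np.random.default_rng(1)
cover = rng.integers(0, 256, size=(32, 32))
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego, chosen = embed(cover, bits)
recovered = extract(stego, chosen)
```

In this sketch the recovered bits match the embedded ones because the stego image is not rounded back to 8-bit pixels; a practical system would need to handle that rounding and clipping as well.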

4. Image or frame decomposition by adaptive discrete cosine transformation

4.1. Image formation of raw video and audio

In this phase, the data aggregated from the audio ${AD}_{bx}$ and the video ${VD}_{ax}$ are fed as the input for forming images from both kinds of data.

STFT [11]: The STFT splits a longer time signal into shorter segments. Moreover, the STFT is efficient when dealing with noisy signals. It detects periodic patterns accurately, which gives reliable performance, and it captures the time and frequency information of a signal simultaneously. Owing to this effectiveness, the STFT is adopted in this research work. In this phase, the STFT is used to obtain spectrogram images from the audio data. Only a small amount of detail is lost from the signal during the transformation. It also supports good compression, restoration, and enhancement. Generally, the major limitation of the Fourier transform for transient signals is that it fails to depict the time-variant behavior of the signal. The STFT is based on the DFT and describes the frequency and phase of a section of a time-dependent signal. By splitting the given signal into equal parts with a windowing method and performing overlapping DFTs, the discrete STFT is obtained, as given in Eq. (1).

$\begin{matrix} (1) & \mathrm{STFT}\{t[w]\}(u, \xi) = \sum_{w = -\infty}^{\infty} t[w] \, V[w - u] \, e^{-s \xi w} \end{matrix}$

Here, $s$ denotes the imaginary unit, $V[w]$ is the window function, whose duration is short relative to the input signal, and $t[w]$ is the time signal to be transformed. The size of the window is 512. The window function is given in Eq. (2), where $G$ is the window length.

$\begin{matrix} (2) & V[w] = \begin{cases} 0.5 \left[1 - \cos\left(\frac{2 \pi w}{G - 1}\right)\right], & 0 \leqslant w \leqslant G - 1 \\ 0, & \text{otherwise} \end{cases} \end{matrix}$

Hence, the time-domain signals are transformed into spectrogram images by plotting each data recording as an individual 256 × 256 pixel image.

Finally, the spectrogram images obtained with the STFT are denoted as ${SI}_{cx}^{stft}$, $cx = 1, 2, \dots, CX$, where $CX$ is the number of obtained images.
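As an illustration of Eqs. (1) and (2), the following minimal NumPy sketch computes a discrete STFT with the Hann window of length 512. The hop size of 256 samples, the 8 kHz test tone, and the function names are assumptions chosen for the example; the text fixes only the window size. Each frame is transformed with a frame-local DFT, which differs from Eq. (1) only by a phase factor and leaves the magnitude spectrogram unchanged.

```python
import numpy as np


def hann_window(G: int) -> np.ndarray:
    """Eq. (2): V[w] = 0.5 * (1 - cos(2*pi*w / (G - 1))) for 0 <= w <= G - 1."""
    w = np.arange(G)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * w / (G - 1)))


def stft(signal: np.ndarray, G: int = 512, hop: int = 256) -> np.ndarray:
    """Windowed, overlapping DFTs of the signal.

    Returns a complex array of shape (n_frames, G // 2 + 1).
    The hop size is an assumption; the paper states only the window size.
    """
    window = hann_window(G)
    n_frames = 1 + (len(signal) - G) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + G] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def spectrogram_db(signal: np.ndarray, G: int = 512, hop: int = 256) -> np.ndarray:
    """Log-magnitude spectrogram; the small constant avoids log(0)."""
    return 20.0 * np.log10(np.abs(stft(signal, G, hop)) + 1e-10)


# Usage: a 1 kHz tone sampled at 8 kHz (hypothetical test signal).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2.0 * np.pi * 1000.0 * t)
spec = stft(tone)
peak_bin = int(np.argmax(np.abs(spec[0])))   # frequency bin with most energy
peak_hz = peak_bin * fs / 512
```

Rendering the log-magnitude array as a 256 × 256 pixel image, as described in the text, would additionally require resizing, which is omitted here.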

Video data consist of a sequence of frames, and each frame is treated as an image. This is used to enhance the quality of the given images. Thus, the images obtained from the video data are denoted as $I_{vx}$, $vx = 1, 2, \dots, VX$, where $VX$ is the number of images obtained from the video frames.

4.2. DCT-based decomposition

In this research work, DCT-based decomposition is performed, which shows effective performance compared with existing techniques. Here, the image is decomposed into its spatial frequency spectrum. The DCT is widely applicable to videos, images, audio, and other media, and it produces real-valued transform coefficients. The images obtained from the audio and video, denoted ${SI}_{cx}^{stft}$ and $I_{vx}$, are given as the input to the DCT to obtain the decomposed images. The DCT has been employed in different areas such as high-dynamic-range compression, image abstraction, and image smoothing.

DCT [20]: The DCT is generally considered one of the most important transforms in image processing. The large DCT coefficients are concentrated in the low-frequency region.

The 1D-DCT of a sequence $z(y)$ of length $C$ is denoted as $E(x)$ and is given in Eq. (3), with the scaling factor $\omega(x)$ defined in Eq. (4).

$\begin{matrix} (3) & E(x) = \omega(x) \sum_{y = 0}^{C - 1} z(y) \cos\left(\frac{\pi (2 y + 1) x}{2 C}\right), \quad 0 \leqslant x \leqslant C - 1 \end{matrix}$

$\begin{matrix} (4) & \omega(x) = \begin{cases} \sqrt{\frac{1}{C}}, & x = 0 \\ \sqrt{\frac{2}{C}}, & x \neq 0 \end{cases} \end{matrix}$

When $x = 0$, Eq. (3) becomes $E(0) = \sqrt{\frac{1}{C}} \sum_{y = 0}^{C - 1} z(y)$.

The first transform coefficient is proportional to the average of all the samples in the sequence and is called the DC coefficient, while the other transform coefficients are called AC coefficients.
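The 1D-DCT of Eqs. (3) and (4) can be written directly in a few lines of NumPy. The sketch below is a direct transcription of the formula rather than an optimized implementation, and the sample sequence is an arbitrary example. It also checks numerically that the first coefficient equals the sum of the samples divided by $\sqrt{C}$.

```python
import numpy as np


def dct_1d(z: np.ndarray) -> np.ndarray:
    """Eq. (3)-(4): E(x) = omega(x) * sum_y z(y) * cos(pi * (2y + 1) * x / (2C))."""
    C = len(z)
    y = np.arange(C)                      # sample index
    x = np.arange(C).reshape(-1, 1)       # coefficient index
    basis = np.cos(np.pi * (2 * y + 1) * x / (2 * C))   # shape (C, C)
    omega = np.full(C, np.sqrt(2.0 / C))  # scaling for x != 0
    omega[0] = np.sqrt(1.0 / C)           # scaling for x == 0
    return omega * (basis @ z)


# Arbitrary example sequence of length C = 8.
z = np.array([8.0, 16.0, 24.0, 32.0, 40.0, 48.0, 56.0, 64.0])
E = dct_1d(z)

dc = E[0]                                  # DC coefficient
dc_from_sum = z.sum() / np.sqrt(len(z))    # E(0) = sqrt(1/C) * sum of samples
ac = E[1:]                                 # AC coefficients
```

Because of the scaling in Eq. (4), this transform is orthonormal, so the energy of the coefficients equals the energy of the samples, and a constant sequence has all AC coefficients equal to zero.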

By using the DCT, the decomposed images from the video and audio are obtained and are termed $Dc - v_{gx}^{dct}$ and $Dc - a_{hx}^{dct}$, respectively.

4.3. Adaptive DCT-based decomposition

The DCT is used to separate images into various parts. It also has high computational power. However, it does not reveal any information about the spatial domain. To tackle these disadvantages, the newly designed adaptive DCT is used. In this phase, the adaptive DCT model splits the decomposed images into 8 × 8 blocks for all channels (RGB). RGB images have three channels: red, green, and blue. High-dimensional rich features are extracted from the residuals of the channel differences of the images for effective steganography. Here, the recommended MBFEPO algorithm is used to optimize the DCT model to enhance the performance of the steganography process; in this phase, the type of DCT from [1,12,33,35] is optimized. The process of adaptive DCT-based image decomposition and block splitting is shown in Fig. 2.

Fig. 2.

Adaptive DCT-based image decomposition and block split process.

4.4. Proposed MBFEPO

Related studies on multimedia steganography have been investigated, and numerous techniques have been developed to achieve significant performance. The diverse algorithms analyzed in this research area, however, fail to provide effective performance. The existing techniques fall into local minima, and their training requires more time to validate the model, so the convergence speed is affected. To solve these issues, the MBFA and EPOA algorithms are considered because of their effective performance. The binary version of the EPOA is restricted in resolving multi-objective issues. The MBFA yields powerful and efficient results in determining the optimum solution for structural optimization issues and has also been used to minimize the construction cost of the model, but it faces storage issues that degrade the performance of the model. In this research work, the MBFA and EPOA algorithms are therefore combined into a new algorithm called MBFEPO, which is intended to resolve the issues in the existing algorithms. It provides better solutions and effective outcomes in terms of both space and time complexity. It also offers high local optima avoidance and is applied to real-time optimization issues to demonstrate its efficiency.

The newly designed MBFEPO algorithm is implemented by using Eq. (5). $\begin{matrix} (5) & {tm}^{'} = {tm}_{0} (\frac{{Mx}_{it}}{b - {Mx}_{it}}) \end{matrix}$

Here, ${tm}^{'}$ denotes the temperature profile around the huddle, ${Mx}_{it}$ indicates the maximum number of iterations, and $b$ denotes the current iteration. When ${tm}^{'} < (\frac{{Mx}_{it}}{4})$ holds, the position is updated by using the MBFA algorithm; otherwise, the position is updated through the EPOA algorithm.

EPOA [4]: The EPOA is an optimization algorithm developed by observing the huddling behavior of emperor penguins. The emperor penguin, scientifically named Aptenodytes forsteri, is the heaviest and tallest of all penguin species. The major intention of the algorithm is to identify an effective mover within the huddle. The steps involved in the process are given below.

a) Huddle boundary determination: During huddling, the emperor penguins position themselves on a polygon-shaped grid boundary. The wind flow around the huddle is used to determine the huddle boundary around the polygon. Moreover, the wind flow is stronger than the movement of the emperor penguins. Let the wind velocity be $α$ and the gradient of $α$ be $χ$, as given in Eq. (6). $\begin{matrix} (6) & χ = \nabla α \end{matrix}$

The vector $ϕ$ is then combined with $α$ to produce the complex potential expressed in Eq. (7). $\begin{matrix} (7) & A = α + a ϕ \end{matrix}$

Here, the analytical function on the polygon plane is denoted as $A$, and the imaginary constant is denoted as $a$. The locations of the emperor penguins are updated randomly towards the location of the emperor penguin.

b) Temperature profile: The emperor penguins form huddles in order to conserve energy and to increase the ambient temperature in the huddle. If the radius of the polygon is $B > 1$ , the temperature is $tm = 0$ , and if the radius is $B < 1$ , the temperature is $tm = 1$ . The temperature profile around the huddle is denoted as ${tm}^{'}$ and is given in Eq. (8), with $tm$ defined in Eq. (9). $\begin{matrix} (8) & {tm}^{'} = (tm - \frac{{Mx}_{it}}{b - {Mx}_{it}}) \end{matrix}$ $\begin{matrix} (9) & tm = \begin{cases} 0 & \text{if } B > 1 \\ 1 & \text{if } B < 1 \end{cases} \end{matrix}$

Here, time for detecting the best optimal solution is represented as $tm$ , the radius is given as B, the current iteration is denoted as b, and the maximum number of iterations is termed as ${Mx}_{it}$ .

c) Distance: The other search agents update their locations according to the current best optimal solution, as given in Eq. (10). $\begin{matrix} (10) & {\vec{O}}_{emp} = abs (C (\vec{Z}) . D (\vec{b}) - \vec{E} . D_{emp} (b)) \end{matrix}$

Here, the current iteration is denoted as $b$, the position vector of the emperor penguin is denoted as $D_{emp}$ , the best optimal solution is denoted as $\vec{E}$ , the distance between the emperor penguin and the fittest search agent is denoted as ${\vec{O}}_{emp}$ , and the social forces of the emperor penguins are represented by $F ()$ . Collisions among neighbors are avoided by using $\vec{Z}$ and $\vec{E}$ , which are expressed in Eqs. (11) and (13). $\begin{matrix} (11) & \vec{Z} = (G \times {tm}^{'} + E_{grd} (acr) \times rd ()) - {tm}^{'} \end{matrix}$ $\begin{matrix} (12) & E_{grd} (acr) = abs (\vec{E} - {\vec{E}}_{emp}) \end{matrix}$ $\begin{matrix} (13) & \vec{E} = rd () \end{matrix}$

Here, $rd ()$ is a random function over $[0, 1]$ , $E_{grd} (acr)$ is the polygon grid accuracy, and $G$ is the movement parameter, which maintains a gap between search agents for collision avoidance. The function $F ()$ is expressed in Eq. (14). $\begin{matrix} (14) & F (\vec{Z}) = {(\sqrt{d . f^{- \frac{b}{e}} - f^{b}})}^{2} \end{matrix}$

Here, exploration and exploitation parameters are indicated as d and e, and the expression function is given as f.

d) Relocate the mover: The next location of the emperor penguin has been updated by using Eq. (15). $\begin{matrix} (15) & {\vec{E}}_{emp} (b + 1) = \vec{E} (b) - \vec{Z} . {\vec{O}}_{emp} \end{matrix}$

Here, the next updated position of the emperor penguin is denoted as ${\vec{E}}_{emp} (b + 1)$ .
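The EPOA steps in Eqs. (8) to (15) can be sketched in NumPy as shown below. This is only an illustrative reading of the equations, under several assumptions: the function name `epoa_step` and the default values of `G`, `d`, and `e` are hypothetical; the term $C (\vec{Z})$ in Eq. (10) is taken to be the social-force function of Eq. (14); and the expression function $f$ is taken to be the natural exponential with a negative exponent in the second term, so that the square root stays real.

```python
import numpy as np

def epoa_step(positions, best, b, max_iter, radius, rng, G=2.0, d=2.0, e=1.5):
    """One EPOA-style position update for all search agents (illustrative sketch).

    positions: (n_agents, dim) current positions.
    best:      (dim,) best solution found so far.
    b:         current iteration, with 0 <= b < max_iter.
    radius:    polygon radius B used in Eq. (9).
    G, d, e:   movement, exploration and exploitation parameters (assumed values).
    """
    tm = 0.0 if radius > 1 else 1.0                              # Eq. (9)
    tm_prime = tm - max_iter / (b - max_iter)                    # Eq. (8)
    grid_acc = np.abs(best - positions)                          # Eq. (12)
    Z = (G * tm_prime + grid_acc * rng.random(positions.shape)) - tm_prime  # Eq. (11)
    E = rng.random(positions.shape)                              # Eq. (13)
    # Eq. (14), assuming f is exp() and a negative exponent in the second term.
    social = np.sqrt(d * np.exp(-b / e) - np.exp(-b)) ** 2
    distance = np.abs(social * best - E * positions)             # Eq. (10)
    return best - Z * distance                                   # Eq. (15)

rng = np.random.default_rng(0)
positions = rng.uniform(-5.0, 5.0, size=(6, 3))   # six agents in three dimensions
best = np.zeros(3)
new_positions = epoa_step(positions, best, b=3, max_iter=25, radius=0.5, rng=rng)
```

In the hybrid scheme this update would be applied only on iterations where the switching condition selects the EPOA branch.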

MBFA [28]: The MBF algorithm simulates the symbiotic interaction strategy that organisms adopt to survive and propagate in the ecosystem. It is a recently designed meta-heuristic algorithm. Mouth brooders are well known for their ability to protect and take care of their offspring, largely owing to their very unusual methods. The algorithm is carried out in the following phases.

a) Main movements: The major movements for every cichlid have been calculated by using Eq. (16). $\begin{matrix} (16) & Y_{ps} = ps \times CM \end{matrix}$

Here, the mother’s source point is denoted as $ps$ , and the last movement of the cichlid is denoted as $CM$ . The source point for the next iteration is updated as in Eq. (17). $\begin{matrix} (17) & ps = ps \times {ps}_{dmp} \end{matrix}$

Here, the damping of the mother’s source point is denoted as ${ps}_{dmp}$ . $\begin{matrix} (18) & Y_{bl} = disp \times (Bst . cic - Pos . cic) \end{matrix}$

Here, $disp$ is the dispersion, a control parameter that can increase or decrease the effect of this movement, $Bst . cic$ is the best position of the cichlid, and $Pos . cic$ is the current position of the same cichlid. $\begin{matrix} (19) & Y_{bg} = disp \times (Bst . bg - Pos . cic) \end{matrix}$

Here, the current position for each cichlid is depicted as $Pos . cic$ , the best position found of all cichlids is given as $Bst . bg$ . $\begin{matrix} (20) & NEW . p = 10 \times ps \times NF . pst (sc) \end{matrix}$

Here, the best position of the last and current generation is termed as $NF . pst (sc)$ . $\begin{matrix} (21) & Y_{fn} = disp \times (NEW . p - NF . pst) \end{matrix}$

Here, the best position of cichlids at last iteration is represented as $NF . pst$ .

In terms of the main movements, every cichlid can move no more than the “Additional Surrounding Dispersion Negative or Additional Surroundings Dispersion Positive (ASDN or ASDP)”. These two parameters are expressed in Eq. (22). $\begin{matrix} (22) & asdp = 0.1 \times (max . va - mi . va), \quad asdn = - asdp \end{matrix}$

Here, the maximum and minimum limits of the problem variables are denoted as $max . va$ and $mi . va$ , respectively.

When the current location falls outside the search space, the new movement is obtained by applying the mirror effect, as given in Eq. (23). $\begin{matrix} (23) & CM = - CM \end{matrix}$

Here, $CM$ on the right-hand side and on the left-hand side denotes the movement of the cichlid before and after the mirror effect, respectively.
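The main movement of a single cichlid can be sketched as follows. How the individual effects are combined is not stated explicitly here, so the sketch assumes that they are summed; it also omits the term $Y_{fn}$ of Eq. (21) for brevity and applies the mirror effect per coordinate. The function name `mbfa_main_move` and all numeric values are hypothetical.

```python
import numpy as np

def mbfa_main_move(pos, last_move, best_own, best_global, ps, disp, lo, hi):
    """Main movement of one cichlid, sketched from Eqs. (16), (18), (19), (22), (23)."""
    asdp = 0.1 * (hi - lo)               # Eq. (22): positive dispersion limit
    asdn = -asdp                         # Eq. (22): negative dispersion limit
    y_ps = ps * last_move                # Eq. (16): effect of the previous movement
    y_bl = disp * (best_own - pos)       # Eq. (18): pull towards the cichlid's own best
    y_bg = disp * (best_global - pos)    # Eq. (19): pull towards the global best
    # Summing the effects is an assumption; the result is limited to [asdn, asdp].
    move = np.clip(y_ps + y_bl + y_bg, asdn, asdp)
    outside = (pos + move < lo) | (pos + move > hi)
    move = np.where(outside, -move, move)   # Eq. (23): mirror effect
    return pos + move, move

pos = np.array([0.0, 9.5])
new_pos, move = mbfa_main_move(
    pos,
    last_move=np.array([0.0, 1.0]),
    best_own=np.array([5.0, 9.5]),
    best_global=np.array([5.0, 9.5]),
    ps=1.0, disp=0.5, lo=0.0, hi=10.0,
)
```

In this example the first coordinate is limited to a step of 1.0 by the dispersion limit, and the second coordinate would leave the search space, so its movement is mirrored.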

b) The additional movements: The mother keeps as many cichlids as the capacity of her mouth permits, and the remaining ones are called left-out cichlids. The number of left-out cichlids is given in Eq. (24). $\begin{matrix} (24) & mn = 0.04 \times popFish \times {ps}^{- 0.431} \end{matrix}$

Here, the mother’s source point is indicated as $ps$ , the population size of cichlids is given as $popFish$ , and left-out cichlids are denoted as $mn$ . The amount of cells for selecting left-out cichlids has been equated as in Eq. (25). $\begin{matrix} (25) & nucech = [var m \times sdisp] \end{matrix}$

Here, the number of cells that are to be changed is represented as $nucech$ . The left-out cichlids undergo a second part of the movement, for which the dispersion limit is multiplied by 4, as given in Eq. (26). $\begin{matrix} (26) & uadispp = 4 \times asdp, \quad uadispn = - uadispp \end{matrix}$

Here, the dispersion positive and negative limits for the left-out cichlid’s propagation are represented as $uadispp$ and $uadispn$ . Then, the second part of the propagation has been calculated as in Eq. (27). $\begin{matrix} (27) & LC . pst = uadispp \pm c . s (sc) \end{matrix}$

Here, the new position of left-out cichlids after the second part of the movements is $LC . pst$ , the randomly selected cells of cichlids are termed as $c . s (sc)$ .

c) Crossover: In this phase, the mouth brooding fish allows its best cichlids to mate. The MBF algorithm selects one pair of parents for each cichlid by using Roulette Wheel selection or a probability distribution.

d) Shark attack: In this phase, the amount of cichlids for shark attack propagation has been expressed in Eq. (28). $\begin{matrix} (28) & shrkm = 0.04 \times popFish \end{matrix}$

Here, the number of cichlids for the shark attack effect is given as $shrkm$ . The shark attack and location is equated in Eq. (29). $\begin{matrix} (29) & ch . NP = shrkatt \times c . p \end{matrix}$

Here, the randomly selected cichlids are depicted as $c . p$ , the matrix that holds the number of cells and Cichlids is given as $shrkatt$ . The pseudo-code for the newly designed MBFEPO is given in Algorithm 1.

Algorithm 1

Developed MBFEPO

The overall process of the MBFEPO algorithm is shown as a flowchart in Fig. 3.

Fig. 3.

The flowchart for the given MBFEPO algorithm.

5. Deep learning-based block selection for multimedia steganography

5.1. Adaptive multi-cascaded ResNet-based block selection

In this phase, the 8 × 8 image blocks obtained by using the adaptive DCT model are fed to the AMC-ResNet model, which selects the blocks of the images into which the stego data are embedded.

ResNet [15]: Deep convolutional neural networks (DCNNs) have achieved several breakthroughs in image classification. Deeper networks can improve a deep learning model, but their convergence is restricted by the degradation problem: as the depth increases, the accuracy first saturates and then decreases. The ResNet model enhances the ability of the network to retain features through cross-layer feature fusion, which improves the network performance as the depth grows. It has shown better classification performance than other models, as well as an improved accuracy rate at greater network depth. To tackle issues such as error rate and degradation, a residual function is included in every layer of the network. In terms of mathematical statistics, the residual is the difference between the actual and estimated values. The identity mapping and the convolutional layers are the two major components of the residual layer. The network usually involves 34 layers, with convolutional filters, the same padding, max-pooling layers, and fully-connected layers, and it ends with the Softmax function to obtain the classified images. The kernel size in the convolutional layers is 3 × 3. The input and output dimensions of the residual components are therefore the same and can be added to each other directly. If the step size is 1, the input passes through convolution, batch normalization, and Rectified Linear Unit (ReLU) activation, and the padding keeps the original size. If the step size is 2, the input follows the same operations as for a step size of 1, and average pooling is additionally applied to obtain the filling layer. Finally, the input of the output layer is the sum of the output of the filling layer and the output of the residual components.

In this phase, two ResNet models are cascaded in parallel to form the AMC-ResNet model. The two separate ResNet models produce two different scores, which are averaged and concatenated to obtain the final single outcome.

The AMC-ResNet model includes training and testing phases. In both phases, the 8 × 8 image blocks are given as the input, and the target indicates whether a block is selected. The targets are 0 and 1: a smooth region of the image is labeled 1, and a sharp region is labeled 0. The label is calculated from the standard deviation of the image pixels. In a smooth region, the intensities are nearly the same, and therefore there is no loss of information. A sharp region, on the other hand, contains different intensities, and there is a loss of information. In this way, the training and testing phases of the AMC-ResNet model are carried out to effectively select the blocks for embedding.
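The target labels for the blocks can be illustrated with a short sketch based on the pixel standard deviation. The threshold value and the function name `label_blocks` are assumptions, since the cut-off between smooth and sharp regions is not specified here.

```python
import numpy as np

def label_blocks(blocks, threshold=10.0):
    """Label each 8 x 8 block: 1 for smooth (low standard deviation), 0 for sharp.

    blocks:    array of shape (n_blocks, 8, 8).
    threshold: assumed cut-off on the pixel standard deviation.
    """
    blocks = np.asarray(blocks, dtype=float)
    std = blocks.reshape(blocks.shape[0], -1).std(axis=1)
    return (std < threshold).astype(int)

flat = np.full((8, 8), 120.0)    # uniform intensity: a smooth block
edge = np.zeros((8, 8))
edge[:, 4:] = 255.0              # vertical step edge: a sharp block
labels = label_blocks(np.stack([flat, edge]))
```

Labels produced this way could serve as the 0/1 targets when training the block selector.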

The ResNet model tackles the vanishing gradient issue by utilizing identity mapping, so a network with a large number of layers can be trained easily without increasing the training error. However, it is infeasible for real-time applications. The DCT, on the other hand, is efficient in handling illumination variation and has been used for contrast enhancement, but it fails to localize the frequency components in space. To tackle the difficulties in both the ResNet and DCT models, the newly designed MBFEPO algorithm is used for optimization with the objective function given in Eq. (30). $\begin{matrix} (30) & obje = \underset{{{ty}_{dct}, {EP}_{resnet}, {HN}_{resnet}}}{arg max} (PSNR + SSIM) \end{matrix}$

Here, the objective function is represented as $obje$ . The type of DCT ${ty}_{dct}$ , chosen from [1,12,33,35], the number of epochs in the ResNet ${EP}_{resnet}$ , in the range [50, 100], and the hidden neuron count in the ResNet ${HN}_{resnet}$ , in the range [5, 255], are optimized with the aid of the MBFEPO algorithm to enhance the performance of multimedia steganography. The peak signal-to-noise ratio is denoted as $PSNR$ , and it is expressed in Eq. (31). $\begin{matrix} (31) & PSNR = 10 \log_{10} (\frac{{df}_{max}^{2}}{MSE}) \end{matrix}$

Here, ${df}_{max}$ is the maximum value in the original (cover) data, including audio and video, and the mean squared error $MSE$ is expressed in Eq. (32). $\begin{matrix} (32) & MSE = \frac{1}{AD \cdot BV} \sum_{cd = 0}^{BV - 1} \sum_{ed = 0}^{AD - 1} {[I_{ste} (cd, ed) - I_{or} (cd, ed)]}^{2} \end{matrix}$

Here, $AD$ and $BV$ are the numbers of columns and rows in the multimedia data, respectively, and $I_{ste} (cd, ed)$ and $I_{or} (cd, ed)$ are the stego and original data, respectively.

SSIM: The SSIM is measured over various windows of the image, as given in Eq. (33). $\begin{matrix} (33) & SSIM = \frac{(2 μ_{a} μ_{b} + x_{1}) (2 σ_{a b} + x_{2})}{(μ_{a}^{2} + μ_{b}^{2} + x_{1}) (σ_{a}^{2} + σ_{b}^{2} + x_{2})} \end{matrix}$

Here, $μ_{a}$ and $μ_{b}$ are the pixel sample means of $a$ and $b$, $σ_{a}^{2}$ and $σ_{b}^{2}$ are their variances, $σ_{a b}$ is the covariance of $a$ and $b$, and $x_{1}$ and $x_{2}$ are constants. The overall process of the AMC-ResNet model is shown in Fig. 4.
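The quantity maximized in Eq. (30) can be sketched with NumPy as shown below. The sketch makes several assumptions: the SSIM is computed over a single window covering the whole image rather than averaged over local windows, the constants follow the commonly used values $(0.01 \cdot 255)^{2}$ and $(0.03 \cdot 255)^{2}$, the peak value is 255, and all function names are hypothetical.

```python
import numpy as np

def mse(stego, original):
    """Mean squared error between stego and original data, as in Eq. (32)."""
    diff = stego.astype(float) - original.astype(float)
    return float(np.mean(diff ** 2))

def psnr(stego, original, peak=255.0):
    """Peak signal-to-noise ratio in dB, as in Eq. (31); infinite for identical data."""
    err = mse(stego, original)
    return float("inf") if err == 0 else 10.0 * np.log10(peak ** 2 / err)

def ssim_global(a, b, x1=(0.01 * 255) ** 2, x2=(0.03 * 255) ** 2):
    """SSIM of Eq. (33) over one window spanning the whole image (a simplification)."""
    a = a.astype(float)
    b = b.astype(float)
    mu_a, mu_b = a.mean(), b.mean()
    cov = np.mean((a - mu_a) * (b - mu_b))
    num = (2 * mu_a * mu_b + x1) * (2 * cov + x2)
    den = (mu_a ** 2 + mu_b ** 2 + x1) * (a.var() + b.var() + x2)
    return float(num / den)

def objective(stego, original):
    """Fitness maximized by the optimizer in Eq. (30): PSNR + SSIM."""
    return psnr(stego, original) + ssim_global(stego, original)

original = np.arange(64, dtype=np.uint8).reshape(8, 8)
stego = original.copy()
stego[0, 0] += 1    # change a single pixel by one intensity level
score = objective(stego, original)
```

An optimizer would evaluate `objective` once per candidate setting of the DCT type, the epoch count, and the hidden neuron count, and keep the setting with the largest score.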

Fig. 4.

Diagrammatic depiction of the AMC-ResNet model for block selection.

5.2. Process of image or video steganography

The overall process included in the image steganography is given below.

Step 1: The image gathered from the video, $I_{vx}$ , is given as the input to the ADCT to obtain the decomposed images $Dc - v_{gx}^{dct}$ .

Step 2: The decomposed images are then split into 8 × 8 image blocks for all the RGB channels.

Step 3: The 8 × 8 image blocks are given as the input to the AMC-ResNet, which undergoes the training and testing phases to select the blocks for embedding. The targets are 0 for sharp blocks and 1 for smooth blocks. To further enhance the performance of the recommended model, the parameters in the DCT and ResNet are optimized with the aid of the MBFEPO algorithm.

Step 4: The stego data are then embedded into the blocks selected by the ResNet, and the blocks are recombined.

Step 5: The recombined images are given as the input to the inverse ADCT to obtain the reconstructed stego images.

The inverse form of the ADCT is given in Eq. (34). $\begin{matrix} (34) & z (y) = \sum_{x = 0}^{C - 1} ω (x) E (x) \cos (\frac{π (2 y + 1) x}{2 C}), \quad 0 ⩽ y ⩽ C - 1 \end{matrix}$

This equation is generally called the inverse transform or synthesis formula. The basis sequences are $cos (\frac{π (2 y + 1) x}{2 C})$ , which are real, discrete-time sinusoids.
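The synthesis formula can be checked numerically with a short round-trip sketch. The helper names are assumptions, and the 2D version simply applies the 1D transform along the rows and columns of a square block, which is the usual separable construction rather than something specified in this section.

```python
import numpy as np

def dct_matrix(C):
    """C x C matrix whose row x holds omega(x) * cos(pi * (2y + 1) * x / (2C))."""
    y = np.arange(C)
    x = np.arange(C).reshape(-1, 1)
    omega = np.where(x == 0, np.sqrt(1.0 / C), np.sqrt(2.0 / C))
    return omega * np.cos(np.pi * (2 * y + 1) * x / (2 * C))

def dct_1d(z):
    """Forward transform: E(x) as a matrix-vector product."""
    return dct_matrix(len(z)) @ np.asarray(z, dtype=float)

def idct_1d(E):
    """Inverse transform: z(y) = sum over x of omega(x) E(x) cos(...)."""
    return dct_matrix(len(E)).T @ np.asarray(E, dtype=float)

def dct_2d(block):
    """Separable 2D DCT of a square block."""
    D = dct_matrix(block.shape[0])
    return D @ block @ D.T

def idct_2d(coeffs):
    """Separable 2D inverse DCT of a square coefficient block."""
    D = dct_matrix(coeffs.shape[0])
    return D.T @ coeffs @ D

z = np.array([4.0, 2.0, 6.0, 8.0])
z_back = idct_1d(dct_1d(z))
block = np.arange(64, dtype=float).reshape(8, 8)
block_back = idct_2d(dct_2d(block))
```

Because the basis is orthonormal, the inverse is the transpose of the forward matrix, and both round trips recover the input up to floating-point error.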

Finally, the multimedia steganography images are obtained with the maximization of PSNR and SSIM values.

5.3. Process of audio steganography

The overall process included in the audio steganography is given as follows.

Step 1: The spectrogram image obtained from the audio by using the STFT, ${SI}_{cx}^{stft}$ , is fed as the input to the DCT to obtain the decomposed images $Dc - a_{hx}^{dct}$ .

Step 2: The decomposed images are then split into 8 × 8 image blocks for all the RGB channels.

Step 3: The 8 × 8 image blocks are given as the input to the AMC-ResNet, which undergoes the training and testing phases to select the blocks for embedding. The targets are 0 for sharp blocks and 1 for smooth blocks. To further enhance the performance of the recommended model, parameters such as the type of DCT in the ADCT and the epoch and hidden neuron counts in the ResNet are optimized with the aid of the MBFEPO algorithm.

Step 4: The stego data are then embedded into the blocks selected by the ResNet, and the blocks are recombined.

Step 5: The recombined images are given as the input to the inverse ADCT to obtain the reconstructed stego images.

Step 6: Finally, the reconstructed stego images are given to the inverse STFT to obtain the reconstructed stego audio.

The inverse of STFT is given in Eq. (35). $\begin{matrix} (35) & istft {t [w]} (u, ξ) = \frac{1}{2 π} \int_{- π}^{π} \times t [w] V [w - u] v^{s ξ w} d ξ \end{matrix}$

Finally, the multimedia steganography images are obtained with the maximization of PSNR and SSIM values.
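The STFT and inverse STFT steps at the two ends of the audio pipeline can be illustrated with a NumPy-only sketch. It uses a periodic Hann window with 50% overlap and weighted overlap-add reconstruction, which is one standard construction and is assumed here; the window, frame length, hop size, and function names are not taken from this work. The sketch keeps the complex spectrum, because the handling of the phase during embedding is not described in this section.

```python
import numpy as np

def stft(signal, n_fft=256, hop=128):
    """Short-time Fourier transform with a periodic Hann window."""
    window = np.hanning(n_fft + 1)[:-1]
    # Zero-pad both ends so that every original sample is covered by two frames.
    padded = np.pad(signal, (n_fft, n_fft))
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack(
        [padded[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def istft(spectrum, length, n_fft=256, hop=128):
    """Inverse STFT by weighted overlap-add, returning `length` samples."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spectrum, n=n_fft, axis=1)
    total = n_fft + hop * (len(frames) - 1)
    out = np.zeros(total)
    norm = np.zeros(total)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n_fft] += frame * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-12)         # undo the squared-window weighting
    return out[n_fft : n_fft + length]     # drop the zero padding

rng = np.random.default_rng(0)
audio = rng.standard_normal(1000)    # stand-in for an audio clip
spec = stft(audio)
spectrogram = np.abs(spec)           # magnitude image of the kind fed to the DCT stage
restored = istft(spec, len(audio))
```

When the spectrum is left unmodified, the round trip recovers the signal up to floating-point error, so any difference in the stego audio comes from the embedding step itself.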

6. Results and discussion

6.1. Experimental setup

The recommended multimedia steganography model was implemented and validated in Python. The performance validation was carried out by utilizing various measures such as “NCC, Mean Squared Error (MSE), Bit Rate (BR), SSIM, PSNR, and Embedding Capacity (EC)”. The Sunflower Optimization (SFO) [7], Honey Badger Algorithm (HBA) [10], EPOA [4], MBFA [28], Lifting Wavelet Transform (LWT) [21], and Dual Tree Complex Wavelet Transform (DTCWT) [23] were among the algorithms and classifiers used for comparison. The population size was 1, the chromosome length was 3, and the maximum number of iterations was 25.

6.2. Performance metrics

The various performance metrics included in the process are represented below.

(a) BR: “It is used for measuring the change of the bit rate, in which the bit rate increase rate” and it is given in Eq. (36). $\begin{matrix} (36) & br = \frac{{rv}_{EMB} - {rv}_{OR}}{{rv}_{OR}} \times 100 % \end{matrix}$

Here, ${rv}_{EMB}$ is the size of the data after the information is embedded, and ${rv}_{OR}$ is the size of the original multimedia data.

(b) EC: “The maximum quantity of secret data that can be embedded in cover data. It depends on the properties of cover data and the embedding function” and it is given in Eq. (37). $\begin{matrix} (37) & ec = \frac{{rv}_{EMB}^{mx}}{{rv}_{OR}} \times 100 \end{matrix}$

Here, the maximum size of embedded data is represented as ${rv}_{EMB}^{mx}$ .
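Eqs. (36) and (37) amount to two simple ratios, sketched below. The function names and the sizes in the example are hypothetical and serve only to show the arithmetic.

```python
def bit_rate_increase(size_embedded, size_original):
    """Bit rate increase in percent, following Eq. (36)."""
    return (size_embedded - size_original) / size_original * 100.0

def embedding_capacity(max_embedded_size, size_original):
    """Embedding capacity in percent of the cover size, following Eq. (37)."""
    return max_embedded_size / size_original * 100.0

# Hypothetical sizes (in any common unit) chosen only to show the arithmetic.
br = bit_rate_increase(size_embedded=1050, size_original=1000)
ec = embedding_capacity(max_embedded_size=200, size_original=1000)
```

A lower `br` means that embedding inflates the data less, while a higher `ec` means that more secret data fit into the same cover.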

(c) NCC: “To evaluate the likeness between the original secret data and the extracted secret data”, and it is given in Eq. (38). $\begin{matrix} (38) & ncc = \frac{\sum_{cd = 0}^{BV - 1} \sum_{ed = 0}^{AD} [I_{ste} (cd, ed) - I_{or} (cd, ed)]}{\sum_{cd = 0}^{BV - 1} \sum_{ed = 0}^{AD} {[I_{or} (cd, ed)]}^{2}} \end{matrix}$

(d) SSIM: It is “a perceptual metric that quantifies image quality degradation caused by processing” and it is given in Eq. (39). $\begin{matrix} (39) & SSIM = mean (\frac{(2 ψ_{I_{or}} ψ_{I_{ste}} + {vc}_{1}) (2 ζ_{I_{or} I_{ste}} + {vc}_{2})}{(ψ_{I_{or}}^{2} + ψ_{I_{ste}}^{2} + {vc}_{1}) (ζ_{I_{or}}^{2} + ζ_{I_{ste}}^{2} + {vc}_{2})}) \end{matrix}$

Here, ${vc}_{1}$ and ${vc}_{2}$ denote constant values, the variances of $I_{or}$ and $I_{ste}$ are denoted as $ζ_{I_{or}}^{2}$ and $ζ_{I_{ste}}^{2}$ , the covariance of $I_{or}$ and $I_{ste}$ is denoted as $ζ_{I_{or} I_{ste}}$ , and the means of $I_{or}$ and $I_{ste}$ are denoted as $ψ_{I_{or}}$ and $ψ_{I_{ste}}$ , respectively.

6.3. Proposed steganography results

The experimental results for the proposed multimedia steganography are given in Fig. 5.

Fig. 5.

Experimental results for the proposed multimedia steganography.

6.4. Performance evaluation over algorithms for dataset 1 and 2

Figures 6 and 7 show the performance analysis of the proposed multimedia steganography model against the classical optimization algorithms for datasets 1 and 2. For dataset 1, the MSE of the newly designed MBFEPO-AMC-ResNet is 47%, 19%, 15%, and 26% smaller than that of SFO-AMC-ResNet, HBA-AMC-ResNet, MBFA-AMC-ResNet, and EPOA-AMC-ResNet, respectively. A similar trend is observed for dataset 2. Therefore, the statistical analysis indicates better outcomes for the proposed multimedia steganography model.

Fig. 6.

Validation over various algorithms for proposed multimedia steganography images for dataset 1 regarding “(a) BR, (b) EC, (c) MSE, (d) NCC, (e) PSNR, and (f) SSIM”.

Fig. 7.

Validation over various algorithms for proposed multimedia steganography images for dataset 2 regarding “(a) BR, (b) EC, (c) MSE, (d) NCC, (e) PSNR, and (f) SSIM”.

6.5. Performance evaluation over classifiers for dataset 1 and 2

Figures 8 and 9 show the performance analysis of the proposed multimedia steganography model against the classical classifiers for datasets 1 and 2. For dataset 2, the standard deviation (STD) of BR for the recommended MBFEPO-AMC-ResNet model is 14%, 11%, 30%, and 10% lower than that of DWT, LWT, DTCWT, and ResNet, respectively. Hence, the performance validation of the recommended MBFEPO-AMC-ResNet model provides promising outcomes.

Fig. 8.

Validation over various classifiers for proposed multimedia steganography images for dataset 1 regarding “(a) BR, (b) EC, (c) MSE, (d) NCC, (e) PSNR, and (f) SSIM”.

Fig. 9.

Validation over various classifiers for proposed multimedia steganography images for dataset 2 regarding “(a) BR, (b) EC, (c) MSE, (d) NCC, (e) PSNR, and (f) SSIM”.

6.6. Overall performance evaluation over algorithms and classifiers for datasets 1 and 2

Tables 2 and 3 present the overall performance analysis of the proposed video and audio steganography for both the algorithms and the classifiers. For dataset 1, the BEST value of the MBFEPO-AMC-ResNet model is 39%, 53%, 54%, and 54% higher than that of SFO-AMC-ResNet, HBA-AMC-ResNet, MBFA-AMC-ResNet, and EPOA-AMC-ResNet, respectively. Likewise, the BEST value of the recommended MBFEPO-AMC-ResNet model for dataset 1 is 23%, 40%, 27%, and 22% higher than that of DWT, LWT, DTCWT, and ResNet, respectively. Hence, the recommended MBFEPO-AMC-ResNet model gives maximized PSNR values in multimedia steganography and thus improves the performance.
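The percentages quoted for the algorithm comparison appear to correspond to the difference between the proposed model and each baseline, expressed relative to the value of the proposed model. This normalization is inferred from the numbers and is not stated explicitly, so the sketch below should be read as a consistency check on the Best row of the algorithm analysis in Table 2 rather than as the exact procedure used.

```python
# Best values for dataset 1, taken from the algorithm analysis in Table 2.
best = {
    "SFO-AMC-ResNet": 7.140924,
    "HBA-AMC-ResNet": 5.419136,
    "MBFA-AMC-ResNet": 5.361269,
    "EPOA-AMC-ResNet": 5.361269,
}
proposed = 11.61156   # MBFEPO-AMC-ResNet

# Relative difference in percent, normalized by the proposed model's value (assumed).
gains = {name: round(100 * (proposed - value) / proposed) for name, value in best.items()}
```

Under this assumption the rounded gains reproduce the figures quoted for the four optimization algorithms.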

Table 2

Overall performance validation on dataset 1 over various algorithms and classifiers

Terms (%)	Algorithm analysis

	SFO-AMC-ResNet [7]	HBA-AMC-ResNet [10]	MBFA-AMC-ResNet [28]	EPOA-AMC-ResNet [4]	MBFEPO-AMC-ResNet
Best	7.140924	5.419136	5.361269	5.361269	11.61156
Worst	21.22239	29.21434	33.26359	28.15814	27.3468
Mean	15.03354	14.54232	15.1579	13.3776	18.13218
Median	15.88542	11.76791	11.00337	9.995503	16.78519
Standard deviation	5.071806	9.801633	10.92531	8.752113	5.734445
Terms (%)	Classifier analysis

	DWT [20]	LWT [21]	DTCWT [23]	MC-ResNet [14]	MBFEPO-AMC-ResNet
Best	4.541831	7.195952	6.969547	6.969547	11.61156
Worst	11.78377	26.92543	31.78377	28.26195	27.3468
Mean	8.431692	15.09951	15.82154	15.54252	18.13218
Median	8.700582	13.13833	12.26641	13.46929	16.78519
Standard deviation	2.574751	7.618243	9.471655	8.489869	5.734445

Table 3

Overall performance validation on dataset 2 over various algorithms and classifiers

Terms (%)	Algorithm analysis

	SFO-AMC-ResNet [7]	HBA-AMC-ResNet [10]	MBFA-AMC-ResNet [28]	EPOA-AMC-ResNet [4]	MBFEPO-AMC-ResNet
Best	7.17599	5.66248	6.817913	7.297889	9.368566
Worst	23.03205	100	27.71372	19.04061	27.41863
Mean	15.06438	31.94501	12.41298	11.16058	17.63476
Median	15.02473	11.05879	7.560138	9.151913	16.87592
Standard deviation	5.799098	39.45552	8.846889	4.63852	6.43651
Terms (%)	Classifier analysis

	DWT [20]	LWT [21]	DTCWT [23]	MC-ResNet [14]	MBFEPO-AMC-ResNet
Best	3.588405	5.893239	5.700354	5.700354	9.368566
Worst	14.46228	29.24053	34.46228	28.44168	27.41863
Mean	9.125958	17.37307	15.85577	14.4242	17.63476
Median	9.226575	17.17927	11.63023	11.77739	16.87592
Standard deviation	4.171199	10.56622	11.05329	8.466827	6.43651

6.7. Time and space analysis of the developed model

The time and space analyses of the designed method are provided in Tables 4 and 5. The designed method requires less time and less space than the existing methods.

Table 4

Time analysis of the developed model for multimedia steganography

Algorithm analysis
Methods	SFO [7]	HBA [10]	MBFA [28]	EPOA [4]	MBFEPO-AMC-ResNet
Time (Mins)	8.90436	7.45675	6.2329	6.8023	6.0043

Classifier analysis
Methods	DWT [20]	LWT [21]	DTCWT [23]	MC ResNet [14]	MBFEPO-AMC-ResNet
Time (Mins)	12.9604	14.3638	10.9034	6.80376	6.0043

Table 5

Space analysis of the offered model for multimedia steganography

Algorithm analysis
Methods	SFO [7]	HBA [10]	MBFA [28]	EPOA [4]	MBFEPO-AMC-ResNet
Space (Kb)	57	50	43	43	40

Classifier analysis
Methods	DWT [20]	LWT [21]	DTCWT [23]	MC ResNet [14]	MBFEPO-AMC-ResNet
Space (Kb)	130	184	102	44	40

6.8. Analysis based on convergence using the designed model

The convergence of the recommended MBFEPO-AMC-ResNet model is compared with that of the existing approaches in Fig. 10. As the number of iterations increases, the cost function of the recommended model decreases. Thus, the convergence analysis shows effective outcomes for the recommended model when compared with the existing approaches.

Fig. 10.

Convergence analysis of the developed model in terms of (a) dataset 1 and (b) dataset 2.

Fig. 11.

Screenshot for the Python codes.

6.9. Samples for the Python codes for steganography

The implementation screenshot for the designed approach using the Python platform is shown in Fig. 11.

7. Conclusion

Steganography hides a message signal in a host signal without introducing perceptual distortion in the host signal, and it is performed to provide better security for the data. The technique thus helps to secure files, audio, video, and messages on other media covers. However, the existing approaches do not provide sufficiently effective performance: inserting many characters makes the data vulnerable, and security issues emerge. Owing to these issues, this paper has implemented a multimedia steganography model with the aid of the AMC-ResNet and the MBFEPO algorithm. Initially, the required video was obtained from the datasets, and the acquired input was subjected to the adaptive DCT-based block process. The optimal blocks were then selected by utilizing the AMC-ResNet model. The parameters of the DCT and ResNet models were optimized using the MBFEPO algorithm to enhance the steganography performance. Finally, the inverse DCT was applied to the blocks to get the final stego image and video. In the audio steganography phase, the required audio was gathered from external websites. The collected data were given to the STFT to convert them into a spectrogram image, which was then given to the adaptive DCT block to obtain a number of blocks. The blocks were selected by the AMC-ResNet, where the parameters within the DCT and the ResNet were optimized using the same MBFEPO to improve the performance. Afterwards, the inverse DCT was applied to reconstruct the spectrogram image, and the resultant stego audio was obtained by using the inverse STFT. For dataset 2, the MEAN value of the recommended MBFEPO-AMC-ResNet model was 48%, 1%, 10%, and 18% higher than that of DWT, LWT, DTCWT, and ResNet, respectively.
In the end, various investigations were conducted to evaluate the performance of the proposed steganography model, which showed promising outcomes along with maximized PSNR and SSIM values. The developed model has the following limitations. Standard performance metrics such as accuracy, FNR, and precision could not be validated, so the accuracy of the model could not be established. Moreover, real-time data could not be utilized in the designed model. In the future, we will try to implement and evaluate the developed model on real-time data. The combined evaluation of steganography and cryptography will also be investigated in upcoming works.

Practical applications. The IoT is widely used in diverse applications such as smart homes, smart cities, health care, and mobility. With the emergence of these technologies, a huge amount of data is transmitted through wireless networks, and steganography is used to secure the data over the Internet. In a practical scenario, information-hiding techniques such as steganography are widely applicable in smart homes to protect communications in critical IoT environments. Thus, they offer significant economic growth and safety prospects.
