DICO: Dingo coot optimization-based ZF net for pansharpening

Abstract

With recent advancements in technology, there has been tremendous growth in the usage of satellite images in various applications, like defense, academics, resource exploration, land-use mapping, and so on. Certain mission-critical applications need images of higher visual quality, but the images captured by the sensors normally suffer from a tradeoff between high spectral and high spatial resolution. Hence, to obtain images with high visual quality, it is necessary to combine the low-resolution multispectral (MS) image with the high-resolution panchromatic (PAN) image, and this is accomplished by means of pansharpening. In this paper, an efficient pansharpening technique is devised by using a hybrid optimized deep learning network. The Zeiler and Fergus network (ZF Net) is utilized for fusing the sharpened and upsampled MS image with the PAN image. A novel Dingo coot (DICO) optimization is created for updating the learning parameters and weights of the ZF Net. Moreover, the effectiveness of the devised DICO_ZF Net for pansharpening is examined using measures like Peak Signal to Noise Ratio (PSNR) and Degree of Distortion (DD), for which it attains values of 50.177 dB and 0.063 dB, respectively.

Keywords

Pansharpening; ZF-Net; contrast-limited adaptive histogram equalization; dingo optimizer; coot algorithm

1. Introduction

Recent developments in remote sensing have made it an effective tool in many applications, such as defense, security, precision agriculture, environmental management, and so on. Such applications need to discriminate land covers accurately and to define shapes and textures clearly. For this purpose, it is essential to capture images with high spectral and spatial resolutions [1]. Satellites such as WorldView-2, Pleiades, IKONOS, Kompsat, QuickBird, and WorldView-3 carry two space-borne sensors, namely the MS and PAN sensors. MS sensors acquire images with high spectral information but lack spatial information. PAN sensors capture PAN images that have clear texture and edges; however, these images fall short of spectral information. Owing to technical limitations, the sensors cannot provide images with high spectral and high spatial resolution simultaneously [2, 3]. Thus, to acquire an image with rich spectral as well as spatial resolution, pansharpening is employed. Pansharpening has been a hot research topic in remote sensing and aims at attaining an MS image with rich spatial resolution by fusing the MS image with the PAN image. Basically, it can be defined as the enhancement of the spatial resolution of an image by injecting spatial information extracted from another image [4]. This step is required in several remote sensing image analysis processes, like image correction [5], registration [6], terrain classification [7], image segmentation [8], object detection [9], and so on.

Pansharpening aims at producing, through fusion, an MS image that contains both the spectral and the spatial information of the MS and PAN images of a scene acquired by a single satellite. Image fusion is the process of combining two or more images to generate a high-quality image [10, 11]. It incorporates several images acquired from multiple sensors to improve image quality, thereby enabling efficient decision-making. Moreover, it makes the image easier to perceive for humans as well as machines. Image fusion helps to preserve the significant details and eliminate discrepancies by extracting all vital details contained in the images [12]. Furthermore, it can be effectively employed to handle missing data, which may occur when the time-series images captured by a satellite are contaminated by shadow or cloud. The techniques used in pansharpening can be broadly classified into decision, feature, and pixel levels [13]. The decision-level approaches process the input images separately and extract decisions, and the extracted decisions are then fused. The feature-level techniques fuse the extracted spectral, structural, and geometrical features, whereas the pixel-level fusion technique performs the image fusion pixel by pixel [10].

Over the last few years, there has been an increased focus on developing various methodologies of pansharpening. Generally, the prevailing methods of pansharpening can be classified as traditional methods and deep-learning techniques [14]. The traditional techniques of pansharpening can be further classified as Variational Optimization (VO)-based techniques, Multiresolution Analysis (MRA)-based techniques, and Component Substitution (CS)-based techniques. The VO-based approach comprises two processes, namely devising an energy function and optimizing the solution. MRA-based techniques decompose the image into a series of bandpass channels, and the high-frequency channel of the PAN image is inserted into the MS channel. The CS-based techniques project the MS bands into a new space, where the component representing spatial information is substituted by the PAN image. Finally, a fused image is obtained by performing inverse projection [15]. The recent progress of deep learning approaches has led to remarkable accomplishments in various computer vision tasks and has inspired the use of Deep Neural Networks (DNN) in the field of pansharpening [16]. The DNN-based approaches can achieve an enhanced fusion effect by learning complex features from a number of samples [17]. With the advent of the Convolutional Neural Network (CNN), the utilization of deep learning techniques in multiple fields has increased tremendously [34, 35]. CNN has gathered huge interest and has been successfully used in pansharpening. A number of CNN-based techniques have been devised for enhancing the spatial resolution of the image acquired from sensors, for effectively performing fusion, and for health monitoring [18, 32, 33]. Several deep learning techniques, such as the CNN-based pansharpening method (PNN) [19], a Generative Adversarial Network (GAN) for pansharpening (PSGAN) [20], PanNet [21], etc., have been developed for performing pansharpening [14].

In this paper, a hybrid optimized deep learning method named DICO_ZF Net is devised for fusing the MS and PAN images. The MS images are first sharpened using Contrast-limited adaptive histogram equalization (CLAHE), and are then upsampled. These images are then fused with the PAN images by employing ZF Net to obtain the pansharpened image. Here, the fusion process is carried out by ZF Net, which is optimized by means of the DICO algorithm to enhance the effectiveness of the pansharpening process.

The key contributions of this research are highlighted as follows:

Devised DICO_ZF Net for pansharpening: An innovative network named DICO_ZF Net is developed for performing pansharpening. ZF Net is employed to fuse the MS and PAN images, wherein the developed DICO algorithm is employed for modifying the learning parameters and weights of the network. Here, the DICO algorithm is developed by modifying the Dingo Optimization Algorithm (DOX) as per the COOT optimizer for achieving enhanced performance.

The rest of the article has the following organizational structure: Section 2 details the related works in pansharpening, Section 3 elaborates on the introduced pansharpening technique in detail, and in Section 4, experimental results and the assessment of the method are detailed. The conclusion of the research article is provided in Section 5 with the further scope.

2. Motivation

Pansharpening is an essential process in remote sensing, and it is also a crucial pre-processing phase in many image processing applications, such as classification, segmentation, and feature extraction. The information content of the MS and PAN images is complementary but also comprises redundant data, which makes it difficult to build an effective pansharpening technique. This section briefly describes the existing pansharpening methodologies and the issues encountered that motivated the development of the introduced technique.

2.1 Literature review

A large number of studies have been carried out to develop effective pansharpening techniques. Here, a few of the devised methodologies are considered and briefly explained with their advantages and demerits. Luo et al. [18] developed an unsupervised CNN-based technique for fusing the MS and PAN images. Here, features were extracted from the input images and fused by an iterative network based on the similarity between the MS and PAN images. The technique was extremely effective in generating full-resolution images, but it failed to reduce spatial and spectral distortions. The drawback in [18] was overcome in [22], where Wang et al. developed a Dual-Path Fusion Network (DPFN) for performing pansharpening. The DPFN was implemented using two components, namely the Local Sub-Network (LSN) and the Global Sub-Network (GSN). The GSN was employed for capturing the global spatial and textural features, and the global contour was obtained by fusing the PAN and MS images. Further, the spatial information in the MS images was enhanced by the LSN. This technique successfully restored the edges and fine details, improving the visual effects. However, it did not employ an optimization method for accelerating the process. The drawback listed in [22] was overcome in [17], where Yang et al. devised a low-rank fuzzy fusion model for carrying out pansharpening. The pansharpening method was implemented in two phases: the low-rank fuzzy fusion technique was first utilized for fusing the high-frequency information of the PAN and MS images, and an adaptive detail supplement was then employed for obtaining the final injection information, which was injected into the bands of the MS images. This method was highly effective in extracting detailed information without tampering with the spectral resolution. However, the computational complexity of this technique was high. An improved tradeoff between computational complexity and performance was achieved by the context-based generalized Laplacian pyramid (GLP), developed by Vivone [23] for pansharpening. Here, the injection coefficients were estimated by using regression-based techniques, which combine the GLP technique and M-estimators. Moreover, homogeneous clusters were created by applying K-means to segment the MS images. The technique can be effectively employed in cases where the images contain vegetated areas, but it was unsuccessful in improving the performance of the system.

An approach with enhanced performance was introduced in [24], wherein Lai et al. presented a Multi-Scale Fusion Network (MSFN) for fusing the MS and PAN images. The multi-scale features were extracted by using an encoder-decoder structure, and the information was preserved in an information pool; both were then passed to the multi-scale feature fusion module for reconstructing the MS image. The MSFN technique was extremely efficient for real-time implementation, but it failed to consider an effective spatial enhancement process. The spatial detail was enhanced by an adaptive details injection method in [25], where Wu et al. created a multi-objective decision model for performing pansharpening. Here, the pixels in the MS images were intrinsically adapted for eliminating the effect of injected details by using the multi-objective optimization model. The spectral modulation coefficient was then computed by a multi-objective decision algorithm, and finally the spatial information was enhanced by the adaptive detail injection technique. This technique was highly effective in enhancing the texture and edges of the image, although it suffered from low performance because different fusion problems required different filters. The problem of dedicated filters was overcome by the usage of dynamic filters in the Multiscale Dynamic Convolutional Neural Network (MDCNN) [16], developed by Hu et al. for pansharpening. Here, the network adaptivity was strengthened by utilizing filters that were dynamically generated using convolutions. The PAN image features were extracted by dynamic convolutions at various scales, and the relationship between the features was adjusted using the weights acquired from the weight generation network. MDCNN was successful in overcoming the gradient vanishing and degradation issues, although it incurred a high computational time. Reduced time consumption was attained in [14], where Zhou et al. presented an unsupervised generative adversarial network (UCGAN). Fusion was carried out by utilizing a two-stream generator for extracting the modality-specific features of the MS and PAN images. Moreover, the output image quality was boosted by a hybrid loss function. The UCGAN was capable of reducing the memory cost as well as the time, but it did not employ an optimization technique for enhancing the performance of the network.

2.2 Challenges

The prevailing methodologies of pansharpening encountered the following challenges:

• An unsupervised CNN was developed in [18] for pansharpening, which offered pansharpened images with high spectral and spatial quality. However, this method did not consider single-image super-resolution techniques or deep image priors for minimizing the spatial and spectral distortions in the images ahead of fusion.
• The spatial and spectral distortions were minimized by the DPFN developed for pansharpening in [22]. This approach was highly effective for enhancing the high-frequency information of the MS bands, although it failed to minimize the computational complexity, which remained the main challenge.
• The drawback in [22] was overcome by the MSFN [24] proposed for pansharpening, which can be successfully employed in real-time applications. However, this method did not consider the gradient characteristics of the image in order to enhance its spatial resolution.
• The spatial resolution was improved by the multi-objective decision model in [25], which was developed for attaining a high-quality fused image. The main challenge encountered by this method was that it did not address the high-dimension image fusion issue by investigating image reduction.
• The prevailing pansharpening methodologies encounter huge challenges, as they need ideal high-resolution MS images to be applicable in real-time scenarios. Moreover, these techniques suffer from degraded performance when they are applied to full-scale images rather than down-scaled images.

3. Introduced DICO_ZF Net for pansharpening

This section describes the process of pansharpening using the devised DICO_ZF Net. Pansharpening is considered a vital pre-processing step in many applications that require images with high spectral as well as spatial resolution. The spectral details in the low-resolution MS image are integrated with the spatial information of the PAN image to obtain a high-resolution MS image. The devised pansharpening technique is implemented in the following steps: data acquisition, sharpening, upsampling, and fusion. The first step is the acquisition of the MS and PAN images from datasets. The MS image is sharpened using CLAHE [26], and the sharpened image is then upsampled using the upsampling process of the Red Green Blue (RGB)-Guided Hyperspectral Image Upsampling algorithm [27]. Later, the PAN image and the upsampled MS image are fused by using the ZF Net [28], wherein the learning parameters and the weights of ZF Net are modified according to the devised DICO algorithm. The DICO algorithm is developed by adapting the encircling behavior of dingoes in the DOX [29] algorithm with respect to the Coot [30] algorithm. Figure 1 displays the schematic representation of the introduced DICO_ZF Net for pansharpening. The entire process is elaborated in the ensuing subsections.

Figure 1.

Schematic representation of the introduced DICO_ZF Net for pansharpening.

3.1 Data acquisition

The initial phase in the process of pansharpening is the acquisition of images. Here, pansharpening is carried out using the PAN and MS images, and these images are acquired from datasets.

3.1.1 MS image acquisition

Consider a database $A$ comprising a total of $a$ MS images, which is represented by,

$\displaystyle A=\left\{{A_{1},A_{2},\ldots,A_{k},\ldots,A_{a}}\right\}$ (1)

Here, $A_{k}$ signifies the $k^{\text{th}}$ MS image in the database, which is subjected to the sharpening phase.

3.1.2 PAN image acquisition

Consider a database $B$ comprising a total of $b$ PAN images, which is expressed as,

$\displaystyle B=\left\{{B_{1},B_{2},\ldots,B_{l},\ldots,B_{b}}\right\}$ (2)

Here, $B_{l}$ represents the $l^{\text{th}}$ PAN image in the database.

3.2 Sharpening process of the MS image

The acquired MS image $A_{k}$ is subjected to the sharpening phase, wherein the detailed information in the image is highlighted using the CLAHE [26] process. CLAHE effectively enhances the contrast of the image and is implemented in five phases. Initially, the acquired image is divided into rectangular blocks of the same size, and histogram adjustment is carried out on each block. The histogram adjustment consists of creating the histogram, clipping it, and redistributing the clipped pixels. From the clipped histogram, the Cumulative Distribution Function (CDF) is computed to obtain the mapping function. The artifacts at block boundaries are removed by performing bilinear interpolation among the blocks. Here, the peak of the histogram in every block is cut off at the clip point in order to restrict the contrast, wherein the clip point is computed by using the following equation.

$\displaystyle\alpha=\frac{C}{D}\left(1+\frac{\beta}{100}\textit{Slope}_{\text{max}}\right)$ (3)

Here, $\beta$ represents the clip factor, $\textit{Slope}_{\text{max}}$ is the maximum slope, $C$ denotes the pixel count in a block, and $D$ signifies the dynamic range in the block. A higher clip point yields higher contrast. The clipped pixels are redistributed over the gray levels, and the gray levels of the blocks are remapped using the mapping function based on the CDF. The remapping function is given by the following equation.

$\displaystyle E(f)=\textit{CDF}(f)\times f_{\text{max}}$ (4)

where $f$ and $f_{\text{max}}$ signify the pixel gray level and the maximal pixel value in the block, respectively, and the CDF is given by,

$\displaystyle\textit{CDF}(f)=\sum_{i=0}^{f}{\textit{pdf}\ (i)}$ (5)

Here, pdf represents the probability density function. Pixels are interpolated between the mapping functions of neighboring blocks to prevent block artifacts. Bilinear interpolation is performed for finding the remapped pixel and is represented as,

$\displaystyle E(c(j))=N\cdot\bigl(M\cdot E_{p}\cdot c(j)+(1-M)\cdot E_{q}\cdot c(j)\bigr)+(1-N)\cdot\bigl(M\cdot E_{r}\cdot c(j)+(1-M)\cdot E_{s}\cdot c(j)\bigr)$ (6)

Here, $c$ represents the pixel to be remapped in a block, $p$, $q$, $r$, and $s$ denote the center pixels of the surrounding blocks, $c(j)$ is the value of the $j^{\text{th}}$ pixel at the $(x,y)$ coordinate, and $M$ and $N$ are given by,

$\displaystyle M=\frac{x_{q}-x_{c}}{x_{q}-x_{p}}$ (7)

$\displaystyle N=\frac{y_{r}-y_{c}}{y_{r}-y_{p}}$ (8)

Thus, the block artifacts are eliminated by interpolation. Moreover, as the CLAHE performs independent processing of blocks, the computational complexity is minimized. The sharpened image is represented by $I_{sh}$ and is subjected to the upsampling phase.
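The per-block part of CLAHE described above (clip point, clipping, redistribution, and CDF-based remapping) can be illustrated with a minimal NumPy sketch. This is only an illustration under stated assumptions, not the implementation used in the paper: the function names, the default values of the clip factor and maximum slope, and the single-pass uniform redistribution of the clipped excess are all hypothetical choices.

```python
import numpy as np

def clip_limit(pixels_per_block, dynamic_range, clip_factor, slope_max):
    """Clip point alpha of Eq. (3): (C / D) * (1 + beta / 100 * Slope_max)."""
    return (pixels_per_block / dynamic_range) * (1.0 + clip_factor / 100.0 * slope_max)

def clahe_block_mapping(block, levels=256, clip_factor=50.0, slope_max=4.0):
    """Grey-level mapping E(f) of Eq. (4) for one integer-valued block.

    clip_factor and slope_max defaults are illustrative assumptions.
    """
    # Histogram of the block over all grey levels.
    hist = np.bincount(block.ravel(), minlength=levels).astype(float)
    alpha = clip_limit(block.size, levels, clip_factor, slope_max)
    # Cut the histogram peaks at the clip point and collect the excess.
    excess = np.sum(np.maximum(hist - alpha, 0.0))
    hist = np.minimum(hist, alpha)
    # Redistribute the excess uniformly over all grey levels
    # (single pass; an assumption made for brevity).
    hist += excess / levels
    pdf = hist / hist.sum()
    cdf = np.cumsum(pdf)              # Eq. (5)
    f_max = float(block.max())
    return cdf * f_max                # Eq. (4)
```

A remapped block would then be obtained as `clahe_block_mapping(block)[block]`; the bilinear blending between neighboring blocks of Eq. (6) is not included in this sketch.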

3.3 Upsampling of the MS image

The captured MS images have very low spatial information owing to the limitation of the sensors, and so, to improve the spatial resolution of the MS image, the sharpened image $I_{sh}$ is forwarded to the upsampling phase. Here, upsampling is carried out using the upsampling method employed in the RGB-Guided Hyperspectral Image Upsampling algorithm [27]. This technique is extremely efficient for providing high-resolution images from a low-resolution input; further, it is not affected by noise degradations and has a very low run time. The sharpened image $I_{sh}$ has a dimension of $u\times v\times w$, wherein $u$, $v$, and $w$ denote the width, height, and sampled wavelength count of the image. The image $I_{sh}$ has to be transformed into a high-resolution image $I_{up}$ with dimension $U\times V\times w$, where $u\ll U$ and $v\ll V$.

3.3.1 Learning the high resolution-low resolution (HR-LR) exemplar

The image $I_{sh}$ is upsampled using a fast learning-based single-image super-resolution algorithm, which is modified with respect to the spectrum correlation over the various wavelength channels in $I_{sh}$ and the high-resolution structure in $F$. Here, $F$ is the high-resolution image. A synthetic low-resolution image is created by downsampling the high-resolution image with bicubic interpolation in order to prepare the training examples for learning the High Resolution-Low Resolution (HR-LR) exemplars. Patches of size 5 $\times$ 5 are sampled from the training examples, and the primitive structures corresponding to $I_{up}$, $I_{sh}$, and $F$ are denoted as $\rho_{I_{up}}$, $\rho_{I_{sh}}$, and $\rho_{F}$. Training examples in $I_{up}$ and $I_{sh}$ have a resolution ratio of 2. If the resolution ratio (upsampling factor) between $I_{sh}$ and $F$ is more than 2, then the upsampling of the image $I_{sh}$ is carried out several times. Spectrum substitution is applied once the image is upsampled, and the procedure is repeated until the required resolution is met. The luminance $Y$ is computed for every patch $G$ from its RGB values, and the feature vector is then created by stacking the pixels of the $Y$ image and applying mean subtraction. Clustering of the sample patches is then performed to form $H$ groups. The exemplars are formed from the clustered patches in each group for obtaining super-resolution.
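As a small illustration of the patch feature described above, the following sketch computes the luminance of a 5 $\times$ 5 RGB patch, stacks its pixels into a vector, and subtracts the mean. The text does not state which luminance formula is used; the Rec. 601 weights below, as well as the function name, are assumptions made only for this example.

```python
import numpy as np

def patch_feature(rgb_patch):
    """Return (mean-subtracted luminance vector, patch mean) for an RGB patch.

    rgb_patch: array of shape (5, 5, 3).
    """
    # Rec. 601 luma weights; an assumption, the source does not specify them.
    weights = np.array([0.299, 0.587, 0.114])
    y = rgb_patch @ weights          # luminance image Y, shape (5, 5)
    vec = y.ravel()                  # stack the pixels of Y into one vector
    mean = vec.mean()
    return vec - mean, mean          # mean subtraction
```

The returned mean is kept because the reconstruction step later adds a patch mean back to the estimated structure.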

3.3.2 HR hyperspectral image reconstruction

Assume $P=\{P_{I_{up}}(d,e),P_{I_{sh}}(d,e),P_{F}(d,e)\}$ represents the trained exemplars, such that $1\leqslant d\leqslant H$ and $1\leqslant e\leqslant J(d)$, wherein $J(d)$ represents the exemplar count of each group and $H$ signifies the clustered group count. The exemplars in every group share the same primitive structure, and hence each of them can be expressed as a linear combination of the remaining exemplars of the corresponding group as $P(d,e)=\sum\nolimits_{i=1,i\neq e}^{J(d)}{\phi_{i}P(d,i)}$, in which the linear coefficient is signified by $\phi_{i}$. The linear coefficients can be estimated from the values of $\rho_{I_{sh}}$ and $\rho_{F}$ as,

$\displaystyle\phi^{*}=\arg\min\limits_{\phi}\left\|\left[\begin{array}{l}\rho_{I_{sh}}\\ \rho_{F}\end{array}\right]-\sum_{i=1}^{J(d)}\phi_{i}\left[\begin{array}{l}P_{I_{sh}}(d,i)\\ P_{F}(d,i)\end{array}\right]\right\|_{2}^{2}$ (9)

Here, $\phi^{*}$ signifies the optimum solution and is computed by using simple linear regression. The value of $\rho_{I_{up}}$ can be estimated by the following equation,

$\displaystyle\rho_{I_{up}}=\sum\limits_{i=1}^{J(d)}{\phi_{i}^{*}P_{I_{up}}(d,i)}$ (10)

Once $\rho_{I_{up}}$ is estimated, the upsampled patch of $I_{up}$ is obtained by adding the mean patch of $I_{sh}$.
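Equations (9) and (10) amount to an ordinary least-squares fit followed by a weighted sum of exemplars. The sketch below shows this for one clustered group using `numpy.linalg.lstsq`; the function name and the array layout (one exemplar per row) are assumptions introduced for illustration, and the sketch is not claimed to match the original implementation.

```python
import numpy as np

def reconstruct_patch(rho_sh, rho_f, P_sh, P_f, P_up):
    """Estimate rho_{I_up} for one group of exemplars.

    rho_sh: (n_sh,) structure of the I_sh patch
    rho_f:  (n_f,)  structure of the F patch
    P_sh, P_f, P_up: (J, n_sh), (J, n_f), (J, n_up) exemplars, one per row
    """
    # Stack the two known structures, as on the left of Eq. (9).
    target = np.concatenate([rho_sh, rho_f])
    # Each column of the basis is one stacked exemplar [P_sh(d,i); P_f(d,i)].
    basis = np.concatenate([P_sh, P_f], axis=1).T
    # Least-squares solution phi* of Eq. (9).
    phi, *_ = np.linalg.lstsq(basis, target, rcond=None)
    # Eq. (10): combine the high-resolution exemplars with the same weights.
    rho_up = phi @ P_up
    return rho_up, phi
```

The mean patch would still have to be added to `rho_up` afterwards, as stated in the text.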

As the input image $I_{sh}$ can contain noise, a structure-guided total variation regularization is incorporated and is represented by,

$\displaystyle I_{up}^{o}=\arg\min\limits_{I_{up}}\left\|I_{up}-\hat{I}_{up}\right\|_{2}^{2}+\chi\left(1-\left|\nabla_{\text{max}}F\right|\right)\left|\nabla I_{up}\right|_{1}$ (11)

Here, $\chi$ represents the regularization weight having a value of 0.01, $\left|\nabla I_{up}\right|_{1}$ signifies the total variation regularization, the maximum absolute gradient of $F$ over the RGB channels is given by $\nabla_{\text{max}}F=\max\left(\left|\nabla_{R}F\right|,\left|\nabla_{G}F\right|,\left|\nabla_{B}F\right|\right)$, and $\hat{I}_{up}$ denotes the image obtained by performing exemplar super-resolution. The upsampled image obtained, $I_{up}^{o}$, is then forwarded to the ZF-Net for performing fusion.

3.4 Fusion of images using ZF-Net

The fusion of the MS image with the PAN image using the ZF-Net is elaborated here. The upsampled MS image $I_{up}^{o}$ is integrated with the PAN image $B_{l}$ acquired from the database to obtain a high-resolution image using ZF-Net. Further, the devised DICO algorithm for updating the learning parameters along with the weights of the ZF Net is detailed in this section.

3.4.1 ZF-Net for image fusion

Here, fusion of the MS image with the PAN image is carried out by means of the ZF Net, wherein the upsampled MS image $I_{up}^{o}$ is fused with the PAN image $B_{l}$, which is acquired from the database. ZF Net [28] is extremely robust and avoids the subjectivity of traditional image processing techniques. ZF Net was developed with the aim of understanding the functioning of CNNs, and it demonstrated methods for improving their efficiency. ZF Net is a variant of AlexNet and comprises five shareable convolutional (conv) layers, three Fully Connected (FC) layers, max-pooling layers, and dropout layers. The first layer of ZF Net uses a reduced stride of 2 and a filter size of 7 $\times$ 7. The network has the capability to retain the majority of the information in the first two layers. The conv layers reduce the dimensions of the image by applying small matrices known as kernels or filters. The pooling layers reduce the spatial dimension of the image, which lowers the number of computations and parameters and helps to avoid overfitting. An FC layer connects every neuron in one layer to all the neurons in the next layer, so that its outcome is a mixture of signals. Dropout layers are utilized to avoid overfitting by randomly nullifying the outputs of hidden units. Figure 2 displays the architecture of the ZF Net.

Figure 2.

Architecture of ZF Net.

The output obtained from the ZF Net is the pansharpened image $K$ .
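To make the effect of the 7 $\times$ 7, stride-2 first layer concrete, the standard convolution output-size formula can be evaluated as below. The helper name, the 224-pixel input size, and the padding of 1 are assumptions taken from common descriptions of ZF Net; the text above does not state the input size or padding used for pansharpening.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

# First conv layer: 7x7 filter, stride 2 (from the text);
# input 224 and padding 1 are assumed for illustration.
first_layer = conv_output_size(224, kernel=7, stride=2, padding=1)
print(first_layer)  # 110
```

With these assumed values, the first layer roughly halves each spatial dimension, which is the dimension reduction by the conv layer mentioned above.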

3.4.2 DICO for training ZF-Net

An innovative DICO algorithm is proposed in this paper for optimizing the fusion process by adjusting the hyperparameters of ZF Net so that the loss is minimized and the desired outcomes are achieved. Here, the DICO is developed by altering the encircling behavior of dingoes while hunting in DOX [30] with respect to the COOT algorithm [29]. DOX is a metaheuristic optimizer inspired by the hierarchical conduct of dingoes in a pack. The optimizer is created by considering the way in which dingoes target prey: the dingoes normally investigate the area for prey, then encircle it, and then attack, and DOX is formulated based on these processes. The optimizer is extremely effective for finding solutions with less effort in real-time problems. However, it does not address multi-objective problems. On the other hand, COOT is a swarm-based algorithm devised by considering the conduct of coot birds as a swarm on the surface of water. The birds undertake two kinds of motion on the surface of the water, namely irregular and regular. In the initial phase, the birds move towards the leaders in an irregular manner to reach the food supply, and later, every bird follows the bird in front of it, thereby generating the coot chain. The COOT algorithm is realized by considering different movements, like random motion, chain motion, adjusting the position with respect to the group leader, and leading the group to the optimal region. The COOT algorithm is highly efficient for solving engineering optimization issues, and it is effective for determining solutions in unknown search spaces. However, this algorithm suffers from a low convergence rate. Hence, by amalgamating the two optimization algorithms, the devised DICO optimizer combines the merits of both and improves the optimization of ZF Net. The devised DICO is implemented by means of the following steps.

Step 1: Initialization

The population of dingoes is first initialized in the search space, and it can be expressed by using the following equation.

$\displaystyle S=\{S_{1},S_{2},\ldots,S_{z},\ldots,S_{m}\}$ (12)

where $S_{z}$ designates the $z^{\text{th}}$ dingo with $1\leqslant z\leqslant m$, and $m$ denotes the overall count of dingoes.

Step 2: Estimation of Fitness

Because any computed solution carries some error, the optimization is treated as a minimization problem whose objective is to find the solution with minimal error. Thus, the fitness function chosen here is the mean square error (MSE), and it is represented by,

$\displaystyle\textit{Err}=\frac{1}{\lambda}\sum\limits_{g=1}^{\lambda}\left(K_{g}^{*}-K_{g}\right)^{2}$ (13)

where $K_{g}$ designates the output obtained from ZF Net, $K_{g}^{*}$ denotes the target output, and $\lambda$ represents the overall number of samples.
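The fitness of Eq. (13) can be sketched in a few lines of NumPy. The function name is hypothetical, and treating every element of the output as one sample in the average is an assumption, since the text does not state how image-valued outputs are reduced to a scalar.

```python
import numpy as np

def mse_fitness(target, output):
    """Mean square error between target K* and network output K, Eq. (13)."""
    target = np.asarray(target, dtype=float)
    output = np.asarray(output, dtype=float)
    # Average of squared differences over all samples (elements).
    return float(np.mean((target - output) ** 2))
```

A candidate solution with a lower value of this function is considered fitter, in line with the minimization objective stated above.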

Step 3: Encircling

Dingoes are proficient in determining the position of prey. Once the position of the prey is traced, the entire pack is led by the alpha to that position, and the pack begins to encircle the prey. The hierarchical conduct of dingoes is modeled on the assumption that the best search agent indicates the location of the prey, as the search space is not known in advance. Concurrently, the other search agents keep updating their search strategies accordingly. The conduct of dingoes can be formulated using the equations given below.

$\displaystyle\vec{T}_{di}=\left|\vec{W}\cdot\vec{O}_{pr}(j)-\vec{O}(j)\right|$ (14)

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)-\vec{X}\cdot\vec{T}_{di}$ (15)

$\displaystyle\vec{W}=2\cdot\vec{g}_{1}$ (16)

$\displaystyle\vec{X}=2\vec{h}\cdot\vec{g}_{2}-\vec{h}$ (17)

$\displaystyle\vec{h}=3-\left(L\cdot\left(\frac{3}{L_{\max}}\right)\right)$ (18)

Here, $\vec{T}_{di}$ designates the distance between the prey and the dingo, $\vec{O}$ and $\vec{O}_{pr}$ represent the locations of the dingo and the prey, $\vec{W}$ and $\vec{X}$ signify the coefficient vectors, $\vec{g}_{1}$ and $\vec{g}_{2}$ denote arbitrary vectors, the value of $\vec{h}$ decreases linearly from 3 to 0 over the iterations, and $L$ and $L_{\text{max}}$ represent the iteration count and the maximal number of iterations.
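The encircling update of Eqs. (14) to (18) can be sketched as a single function. This is an illustrative reading of the equations rather than the reference DOX code: the function name is hypothetical, products are taken element-wise, and the arbitrary vectors $\vec{g}_{1}$ and $\vec{g}_{2}$ are assumed to be drawn uniformly from $[0,1)$.

```python
import numpy as np

def dox_encircle(O, O_pr, L, L_max, rng):
    """One encircling step of a dingo at position O towards prey at O_pr."""
    h = 3.0 - L * (3.0 / L_max)       # Eq. (18): decreases linearly from 3 to 0
    g1 = rng.random(O.shape)          # arbitrary vectors (uniform, assumed)
    g2 = rng.random(O.shape)
    W = 2.0 * g1                      # Eq. (16)
    X = 2.0 * h * g2 - h              # Eq. (17)
    T_di = np.abs(W * O_pr - O)       # Eq. (14): distance between prey and dingo
    return O_pr - X * T_di            # Eq. (15): new dingo position
```

At the final iteration ($L=L_{\max}$), $\vec{h}$ and therefore $\vec{X}$ vanish, so the dingo lands exactly on the prey position, which matches the shrinking-encirclement behavior described above.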

Substituting Eq. (14) in (15),

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)-\vec{X}\cdot\left|\vec{W}\cdot\vec{O}_{pr}(j)-\vec{O}(j)\right|$ (19)

Now assuming $\vec{O}_{pr}(j)>\vec{O}(j)$ , we get,

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)-\vec{X}\cdot\left(\vec{W}\cdot\vec{O}_{pr}(j)-\vec{O}(j)\right)$ (20)

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)-\vec{X}\cdot\vec{W}\cdot\vec{O}_{pr}(j)+\vec{X}\cdot\vec{O}(j)$ (21)

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)\left(1-\vec{X}\cdot\vec{W}\right)+\vec{X}\cdot\vec{O}(j)$ (22)

Based on the arbitrary motion of Coot birds, the position of a bird can be determined from the following equation,

$\displaystyle\textit{Coot}(j+1)=\textit{Coot}(j)+\kappa\times r_{d}\times(\nu-\textit{Coot}(j))$ (23)

where, $r_{d}$ represents an arbitrary number having value between 0 and 1, $\nu$ indicates the random location to which coot moves and $\kappa$ is expressed as,

$\displaystyle\kappa=1-L\times\left({\frac{1}{L_{\max}}}\right)$ (24)

The arbitrary location $\nu$ is given by,

$\displaystyle\nu=\textit{rnd}(1,n)\,*\,\left({\textit{up}-\textit{low}}\right)+\textit{low}$ (25)

Here, $n$ denotes the dimension, that is, the number of problem variables, and up and low indicate the upper and lower bounds of the search space.
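The random movement of Eqs. (23)-(25) can be sketched as follows. The function name `coot_random_move` is hypothetical, $r_{d}$ is drawn as a single scalar in [0, 1), and the position on the right-hand side of Eq. (23) is read as the current position $\textit{Coot}(j)$, as in Eq. (27).

```python
import numpy as np

def coot_random_move(coot, L, L_max, low, up, rng):
    """Move a coot toward a random location nu, Eqs. (23)-(25)."""
    n = coot.shape[0]                          # number of problem variables
    kappa = 1.0 - L * (1.0 / L_max)            # Eq. (24): decays from 1 to 0
    nu = rng.random(n) * (up - low) + low      # Eq. (25): random point in the bounds
    r_d = rng.random()                         # arbitrary number in [0, 1)
    return coot + kappa * r_d * (nu - coot)    # Eq. (23)

rng = np.random.default_rng(0)
coot = np.array([0.2, -0.4, 0.9])
moved = coot_random_move(coot, L=5, L_max=50, low=-1.0, up=1.0, rng=rng)
```

Because $\kappa\times r_{d}$ lies in [0, 1), the new position is a convex combination of the current position and $\nu$, so a coot that starts inside the bounds stays inside them.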

Rewriting Eq. (23), we get,

$\displaystyle\textit{Coot}(j+1)=\textit{Coot}(j)+\kappa\times r_{d}\times\nu-\kappa\times r_{d}\times\textit{Coot}(j)$ (26)

$\displaystyle\textit{Coot}(j+1)=\textit{Coot}(j)\left({1-\kappa\times r_{d}}\right)+\kappa\times r_{d}\times\nu$ (27)

Now assuming $\textit{Coot}(j+1)=\vec{O}\left({j+1}\right)$ , $\textit{Coot}(j)=\vec{O}\left(j\right)$ , $\kappa=\vec{W}$ , $r_{d}=\vec{r}_{d}$ and $\nu=\vec{\nu}$ , Eq. (27) can be rewritten as,

$\displaystyle\vec{O}(j+1)=\vec{O}(j)(1-\vec{W}\times\vec{r}_{d})+\vec{W}\times\vec{r}_{d}\times\vec{\nu}$ (28)

$\displaystyle\vec{O}(j)=\frac{\vec{O}(j+1)-\vec{W}\times\vec{r}_{d}\times\vec{\nu}}{(1-\vec{W}\times\vec{r}_{d})}$ (29)

Substituting Eq. (29) in (22), we get,

$\displaystyle\vec{O}(j+1)=\vec{O}_{pr}(j)(1-\vec{X}\cdot\vec{W})+\vec{X}\cdot\left({\frac{\vec{O}(j+1)-\vec{W}\times\vec{r}_{d}\times\vec{\nu}}{(1-\vec{W}\times\vec{r}_{d})}}\right)$ (30)

$\displaystyle\vec{O}(j+1)-\frac{\vec{X}\cdot\vec{O}(j+1)}{(1-\vec{W}\times\vec{r}_{d})}=\vec{O}_{pr}\left(j\right)(1-\vec{X}\cdot\vec{W})-\frac{\vec{X}\times\vec{W}\times\vec{r}_{d}\times\vec{\nu}}{(1-\vec{W}\times\vec{r}_{d})}$ (31)

$\displaystyle\frac{\vec{O}(j+1)(1-\vec{W}\times\vec{r}_{d}-\vec{X})}{(1-\vec{W}\times\vec{r}_{d})}=\frac{\vec{O}_{pr}(j)(1-\vec{X}\cdot\vec{W})(1-\vec{W}\times\vec{r}_{d})}{(1-\vec{W}\times\vec{r}_{d})}-\frac{\vec{X}\times\vec{W}\times\vec{r}_{d}\times\vec{\nu}}{(1-\vec{W}\times\vec{r}_{d})}$ (32)

$\displaystyle\vec{O}(j+1)=\frac{\vec{O}_{pr}(j)(1-\vec{X}\cdot\vec{W})(1-\vec{W}\times\vec{r}_{d})}{(1-\vec{W}\times\vec{r}_{d}-\vec{X})}-\frac{\vec{X}\times\vec{W}\times\vec{r}_{d}\times\vec{\nu}}{(1-\vec{W}\times\vec{r}_{d}-\vec{X})}$ (33)
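The combined update of Eq. (33) can be checked numerically with a short sketch. The function name `dico_update` and the numeric values are illustrative assumptions, and all products are taken element-wise. The check recovers $\vec{O}(j)$ from Eq. (29) and confirms that Eq. (22) is satisfied by the result.

```python
import numpy as np

def dico_update(O_pr, X, W, r_d, nu):
    """Combined DICO position update of Eq. (33), applied element-wise."""
    denom = 1.0 - W * r_d - X                            # shared denominator of Eq. (33)
    prey_term = O_pr * (1.0 - X * W) * (1.0 - W * r_d)   # first numerator
    coot_term = X * W * r_d * nu                         # second numerator
    return (prey_term - coot_term) / denom

# Consistency check: Eq. (33) must satisfy Eqs. (22) and (29) simultaneously.
rng = np.random.default_rng(0)
O_pr, nu = rng.random(4), rng.random(4)
X, W, r_d = 0.3, 0.8, 0.5                                # illustrative coefficients
O_next = dico_update(O_pr, X, W, r_d, nu)
O_cur = (O_next - W * r_d * nu) / (1.0 - W * r_d)        # Eq. (29)
assert np.allclose(O_next, O_pr * (1.0 - X * W) + X * O_cur)   # Eq. (22)
```

Note that the denominator $1-\vec{W}\times\vec{r}_{d}-\vec{X}$ can approach zero for some random draws; the text does not state how this case is handled, so a practical implementation would presumably need a guard of its own.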

Step 4: Hunting

Dingoes do not generally know the optimal location of the prey, but the hunting plan is modeled by assuming that all dingoes in the pack have information about the prey position. Normally, the alpha directs the hunt, but in some scenarios other dingoes also participate in the hunting process, and hence two of the best values are considered. The rest of the pack keeps updating its position according to the best search agents, and this can be modeled as,

$\displaystyle\vec{T}_{\textit{alp}}=\left|{\vec{W}_{1}\cdot\vec{O}_{\textit{alp}}-\vec{O}}\right|$ (34)

$\displaystyle\vec{T}_{\textit{bet}}=\left|{\vec{W}_{2}\cdot\vec{O}_{\textit{bet}}-\vec{O}}\right|$ (35)

$\displaystyle\vec{T}_{\textit{oth}}=\left|{\vec{W}_{3}\cdot\vec{O}_{\textit{oth}}-\vec{O}}\right|$ (36)

$\displaystyle\vec{O}_{1}=\left|{\vec{O}_{\textit{alp}}-\vec{X}\cdot\vec{T}_{\textit{alp}}}\right|$ (37)

$\displaystyle\vec{O}_{2}=\left|{\vec{O}_{\textit{bet}}-\vec{X}\cdot\vec{T}_{\textit{bet}}}\right|$ (38)

$\displaystyle\vec{O}_{3}=\left|{\vec{O}_{\textit{oth}}-\vec{X}\cdot\vec{T}_{\textit{oth}}}\right|$ (39)

Moreover, for every dingo, the intensity has to be computed and is expressed as,

$\displaystyle\vec{\xi}_{\textit{alp}}=\log\left({\frac{1}{\textit{Fit}_{\textit{alp}}-\left({1\text{E}-100}\right)}+1}\right)$ (40)

$\displaystyle\vec{\xi}_{\textit{bet}}=\log\left({\frac{1}{\textit{Fit}_{\textit{bet}}-\left({1\text{E}-100}\right)}+1}\right)$ (41)

$\displaystyle\vec{\xi}_{\textit{oth}}=\log\left({\frac{1}{\textit{Fit}_{\textit{oth}}-\left({1\text{E}-100}\right)}+1}\right)$ (42)

Here, $\textit{Fit}_{\textit{alp}}$ , $\textit{Fit}_{\textit{bet}}$ and $\textit{Fit}_{\textit{oth}}$ designate the fitness of the alpha (alp), beta (bet) and other (oth) dingoes.
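The hunting step of Eqs. (34)-(42) can be sketched as below. The function names `hunting_candidates` and `intensity` are hypothetical, the products are element-wise, and the logarithm is assumed to be the natural logarithm because the text does not state a base. How the three candidates $\vec{O}_{1}$, $\vec{O}_{2}$ and $\vec{O}_{3}$ are combined is not specified here, so the sketch returns them separately.

```python
import numpy as np

def hunting_candidates(O, O_alp, O_bet, O_oth, X, W1, W2, W3):
    """Candidate positions guided by the alpha, beta and other dingoes."""
    T_alp = np.abs(W1 * O_alp - O)     # Eq. (34)
    T_bet = np.abs(W2 * O_bet - O)     # Eq. (35)
    T_oth = np.abs(W3 * O_oth - O)     # Eq. (36)
    O1 = np.abs(O_alp - X * T_alp)     # Eq. (37)
    O2 = np.abs(O_bet - X * T_bet)     # Eq. (38)
    O3 = np.abs(O_oth - X * T_oth)     # Eq. (39)
    return O1, O2, O3

def intensity(fit):
    """Eqs. (40)-(42): intensity of a dingo with positive fitness value `fit`."""
    return np.log(1.0 / (fit - 1e-100) + 1.0)   # natural log is an assumption

rng = np.random.default_rng(0)
O = rng.random(3)
O_alp, O_bet, O_oth = rng.random(3), rng.random(3), rng.random(3)
O1, O2, O3 = hunting_candidates(O, O_alp, O_bet, O_oth, X=0.5,
                                W1=2 * rng.random(3), W2=2 * rng.random(3),
                                W3=2 * rng.random(3))
xi_alp = intensity(0.8)
```

With this form, the intensity decreases monotonically as the fitness value grows, so a dingo with a smaller fitness value receives a larger intensity.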

Step 5: Attacking prey

When no position is available for updating, the dingoes have completed the hunt by attacking the prey. The attack is formulated by decreasing the value of $\vec{h}$ linearly from 3 to 0. As $\vec{h}$ decreases, $\vec{T}_{\textit{alp}}$ takes a value in the range $[-3g,3g]$. When the value of $\vec{T}_{\textit{alp}}$ lies in $[-1,1]$, the next location of the search agent falls between the prey location and its current location.

Step 6: Feasibility re-computation

The optimal solution is estimated from the fitness of each solution, computed using Eq. (13). When the existing solution has a higher fitness value than the current solution, the current one replaces the existing one.

Step 7: Termination

The entire operation is repeated until the stopping criterion is attained, and the optimal solution is thereby determined. Table 1 displays the pseudo-code of the devised DICO algorithm.

Table 1

Pseudo-code of devised DICO algorithm

Sl. No.	Pseudocode of devised DICO algorithm
1	Input: Dingo Population $S$
2	Output: Best dingo $\vec{O}\left({j+1}\right)$
3	Begin
4	Produce the initial search agent $T_{\textit{in}}$
5	Set the value of $\vec{h}$ , $\vec{W}$ and $\vec{X}$
6	While terminating criterion not achieved
7	Determine the fitness as well as intensity cost of every dingo
8	$T_{\textit{alp}}$ is dingo with the best search
9	$T_{\textit{bet}}$ is dingo with the second best search
10	$T_{\textit{oth}}$ is dingo search results later
11	Iteration 1
12	Repeat
13	for $j=$ 1: $T_{\textit{in}}$ do
14	Renew the status of the newest search agent
15	end for
16	Recompute the value of fitness as well as the intensity cost of dingoes
17	Document $\textit{Fit}_{\textit{alp}}$ , $\textit{Fit}_{\textit{bet}}$ and $\textit{Fit}_{\textit{oth}}$
18	Document the values of $\vec{h}$ , $\vec{W}$ and $\vec{X}$
19	Iteration $=$ iteration $+$ 1
20	If iteration $\geqslant$ terminating criteria
21	Output
22	end while

The integration of the DOX and COOT algorithms yields the DICO algorithm, which improves the performance of the ZF Net in fusing the two images for enhanced pansharpening.
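The overall flow of Table 1 can be illustrated with a compact sketch on a toy problem. This is not the training code of the ZF Net: the fitness of Eq. (13) is replaced by a hypothetical sphere function that is minimised, the best agent is taken as the prey, and the denominator guard and bound clipping are assumptions added only to keep the example numerically safe.

```python
import numpy as np

def sphere(x):
    """Toy fitness to minimise, a stand-in for the fitness of Eq. (13)."""
    return float(np.sum(x ** 2))

def dico_minimise(fitness, n_agents=20, dim=5, low=-5.0, up=5.0, L_max=100, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(low, up, size=(n_agents, dim))     # initial search agents
    fit = np.array([fitness(p) for p in pop])
    history = [fit.min()]
    for L in range(1, L_max + 1):
        h = 3.0 - L * (3.0 / L_max)                      # Eq. (18)
        O_pr = pop[np.argmin(fit)].copy()                # best agent taken as prey
        for a in range(n_agents):
            W = 2.0 * rng.random(dim)                    # Eq. (16)
            X = 2.0 * h * rng.random(dim) - h            # Eq. (17)
            r_d = rng.random(dim)
            nu = rng.random(dim) * (up - low) + low      # Eq. (25)
            denom = 1.0 - W * r_d - X
            denom = np.where(np.abs(denom) < 1e-6, 1e-6, denom)  # guard (assumption)
            cand = (O_pr * (1 - X * W) * (1 - W * r_d)
                    - X * W * r_d * nu) / denom          # Eq. (33)
            cand = np.clip(cand, low, up)                # stay in bounds (assumption)
            f = fitness(cand)
            if f < fit[a]:                               # Step 6: keep the better one
                pop[a], fit[a] = cand, f
        history.append(fit.min())
    best = int(np.argmin(fit))
    return pop[best], fit[best], history

best_x, best_f, history = dico_minimise(sphere)
```

Because a candidate only replaces an agent when it has a lower fitness value, the best fitness recorded in `history` never increases from one iteration to the next.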

4. Results and discussion

This section presents the analytical evaluation of the devised DICO_ZF Net for pansharpening, including a comparison with state-of-the-art methodologies.

Figure 3.

Experimental results of the devised DICO_ZF Net for pansharpening. (a) Input images 1, 2, 3 and 4; (b) sharpened images 1, 2, 3, and 4; (c) upsampled images 1, 2, 3, and 4; (d) fused images 1, 2, 3, and 4.

4.1 Experimental setup

The devised DICO_ZF Net for pansharpening is implemented in Python on a system with the following specifications: Windows 10 OS, Intel i3 processor, and 8 GB RAM.

4.2 Dataset description

To demonstrate the performance of the devised DICO_ZF Net for pansharpening, the technique is evaluated on four datasets, namely Indian Pines, Salinas, Pavia Centre, and Pavia University [31].

i) Indian Pines: The dataset contains scenes collected from north-western Indiana, comprising the Indian Pines test site. The data is collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor and comprises 16 classes and 224 spectral reflectance bands with wavelengths ranging from (0.4–2.5) $\times$ 10 ${}^{-6}$ m. Moreover, the scene has a size of 145 $\times$ 145 pixels. One-third of the scene consists of perennial vegetation, such as forest, and the remaining part comprises agricultural land.

ii) Salinas: This dataset contains images captured over the Salinas Valley, California, using the 224-band AVIRIS sensor. The dataset contains 16 classes comprising areas such as vineyard fields, vegetables, and bare soils, and the region encompasses 512 lines by 217 samples. The images have a spatial resolution of 3.7 m per pixel.

iii) Pavia Centre: This dataset comprises images captured over Pavia, Northern Italy, using the ROSIS sensor. It comprises 9 classes with 102 spectral bands, and the images have a dimension of 1096 $\times$ 1096 pixels with 1.3 m resolution.

iv) Pavia University: Similar to Pavia Centre, this dataset also contains images captured during the flight campaign over Pavia. The dataset comprises 9 classes with 103 spectral bands, and the images have 1.3 m resolution and a dimension of 610 $\times$ 610 pixels.

Figure 4.
Comparative assessment of devised DICO_ZF Net using Indian pines by varying number of bands (a) PSNR and (b) DD.

4.3 Experimental outcomes

Here, the experimental outcomes of the proposed method are portrayed in Fig. 3. Figure 3a shows the input images, Fig. 3b displays the sharpened images, and Fig. 3c and d correspond to the upsampled and fused images, respectively.

4.4 Evaluation metrics

Two evaluation measures, namely PSNR and DD, are utilized for illustrating the effectiveness of the developed DICO_ZF Net for pansharpening. These measures are detailed below.

(i) PSNR: PSNR is a metric used to compare the pansharpened image with the original image. It is computed as the average over all bands and is given by,

$\displaystyle\textit{PSNR}=10\log_{10}\left({\frac{R_{\text{max}}^{2}}{\varepsilon}}\right)$ (43)

where $R_{\text{max}}$ denotes the maximum possible pixel value and $\varepsilon$ denotes the mean squared error between the two images.

(ii) DD: DD can be defined as the relative degree by which the object is misrepresented or distorted and is expressed as,

$\displaystyle\textit{DD}({\hat{Q},Q})\!=\!\frac{1}{\text{SN}}\left\|{\textit{vec}(\hat{Q})-\textit{vec}(Q)}\right\|_{1}$ (44)

where $\textit{vec}(Q)$ and $\textit{vec}(\hat{Q})$ represent the vectorization of the matrices $Q$ and $\hat{Q}$ . A lower value of DD corresponds to less distortion and indicates higher spectral quality.
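Both measures can be computed with a few lines of NumPy, as sketched below. The function names are hypothetical, the images are assumed to be arrays of shape (rows, columns, bands), PSNR uses the conventional $10\log_{10}$ form in dB with a per-band mean squared error, and SN in Eq. (44) is assumed to be the total number of elements, since it is not defined in the text.

```python
import numpy as np

def psnr(reference, fused, r_max=1.0):
    """Band-averaged PSNR in dB for images of shape (rows, cols, bands)."""
    ref = reference.astype(np.float64)
    out = fused.astype(np.float64)
    mse = np.mean((ref - out) ** 2, axis=(0, 1))       # one MSE value per band
    return float(np.mean(10.0 * np.log10(r_max ** 2 / mse)))

def degree_of_distortion(reference, fused):
    """Eq. (44): L1 norm of the difference divided by the element count."""
    ref = reference.astype(np.float64).ravel()         # vec(Q)
    out = fused.astype(np.float64).ravel()             # vec(Q_hat)
    return float(np.sum(np.abs(out - ref)) / ref.size)

rng = np.random.default_rng(0)
reference = rng.random((8, 8, 4))
fused = np.clip(reference + 0.01 * rng.standard_normal(reference.shape), 0.0, 1.0)
print(psnr(reference, fused), degree_of_distortion(reference, fused))
```

The `r_max` argument should match the dynamic range of the data, for example 1.0 for images scaled to [0, 1] or 255 for 8-bit images. Identical images give a zero mean squared error, so the PSNR is unbounded in that case.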

4.5 Comparative techniques

The devised DICO_ZF Net is tested for its effectiveness by comparing it with existing techniques, namely CNN [18], the low-rank fuzzy fusion model [17], Context-based GLP [23], the Competitive Multi-Verse Feedback Artificial Tree (CMVFTA)-based Deep Maxout network (DMN), and the Fractional CMVFTA (FrCMVFTA)-based DMN.

Figure 5.

Comparative assessment of the devised DICO_ZF Net using Indian pines by varying neta value (a) PSNR and (b) DD.

Figure 6.

Comparative assessment of the devised DICO_ZF Net using Salinas by varying number of bands (a) PSNR and (b) DD.

4.6 Comparative analysis

The introduced DICO_ZF Net is examined for its efficacy by comparing it with the prevailing approaches on the four datasets while varying the number of bands and the neta value.

4.6.1 Assessment using Indian pines

The proposed DICO_ZF Net is evaluated using the Indian Pines dataset by varying the neta value and the number of bands, as illustrated in this section.

i) Assessment by varying number of bands

Figure 4 portrays the assessment of the introduced DICO_ZF Net using Indian Pines by varying the number of bands. In Fig. 4a, the comparative analysis is portrayed using PSNR. For 20 bands, the PSNR computed by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, Context-based GLP, and the devised DICO_ZF Net is 47.917 dB, 43.829 dB, 19.278 dB, 10.666 dB, 27.211 dB, and 51.655 dB, respectively. The evaluation with respect to DD is displayed in Fig. 4b. The value of DD computed by the prevailing methodologies, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, is 0.067 dB, 0.071 dB, 0.419 dB, 0.193 dB, and 0.099 dB, respectively, whereas the devised DICO_ZF Net attained a lower DD of 0.064 dB.

ii) Assessment by varying neta

Figure 7.

Comparative assessment of the devised DICO_ZF Net using Salinas by varying neta value (a) PSNR and (b) DD.

The assessment of the devised DICO_ZF Net by varying the neta value is elaborated in Fig. 5. In Fig. 5a, the pansharpening techniques are compared based on the PSNR value. When the value of neta is 30, the PSNR computed by the prevailing techniques, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, is 43.093 dB, 41.944 dB, 22.106 dB, 11.718 dB, and 28.256 dB, respectively, whereas the devised DICO_ZF Net attained a higher PSNR of 45.679 dB. The analysis with respect to DD is shown in Fig. 5b. When the neta value is kept at 40, the DD attained by Context-based GLP is 0.106 dB, CNN is 0.172 dB, low-rank fuzzy fusion model is 0.417 dB, CMVFTA-based DMN is 0.081 dB, and FrCMVFTA-based DMN is 0.077 dB, whereas the devised DICO_ZF Net computed a lower DD of 0.074 dB.

4.6.2 Assessment using Salinas

The proposed DICO_ZF Net is evaluated using the Salinas dataset by varying the neta value and the number of bands, as illustrated in this section.

Figure 8.

Comparative assessment of the devised DICO_ZF Net using Pavia centre by varying number of bands (a) PSNR and (b) DD.

i) Assessment by varying number of bands

The analysis of the devised DICO_ZF Net is displayed in Fig. 6. In Fig. 6a, the effectiveness of the developed approach with respect to PSNR is shown. When 30 bands are considered, the PSNR computed by the introduced technique is 42.774 dB, which is better than the PSNR values of 41.177 dB, 39.877 dB, 17.902 dB, 20.054 dB, and 37.283 dB computed by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, respectively. The DD-based assessment is demonstrated in Fig. 6b. For 10 bands, the developed DICO_ZF Net attained a low DD of 0.063 dB, whereas the state-of-the-art methodologies, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, yielded higher DD values of 0.066 dB, 0.071 dB, 0.402 dB, 0.183 dB, and 0.097 dB, respectively.

ii) Assessment by varying neta

The assessment of the proposed DICO_ZF Net by varying the value of neta is displayed in Fig. 7, wherein Fig. 7a shows the evaluation with respect to PSNR and Fig. 7b portrays the assessment considering DD. When the neta value is fixed at 10, the PSNR values attained by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, Context-based GLP, and the developed DICO_ZF Net are 47.432 dB, 44.511 dB, 25.863 dB, 10.402 dB, 26.870 dB, and 49.273 dB, respectively, which shows the improved performance of the developed approach. Similarly, when the value of neta is 40, the proposed DICO_ZF Net attained a low DD of 0.074 dB, whereas the DD computed by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP is 0.077 dB, 0.081 dB, 0.395 dB, 0.152 dB, and 0.102 dB, respectively.

Figure 9.

Comparative assessment of the devised DICO_ZF Net using Pavia centre by varying neta value (a) PSNR and (b) DD.

Figure 10.

Comparative assessment of the devised DICO_ZF Net using Pavia university by varying number of bands (a) PSNR and (b) DD.

Table 2

Comparative discussion

Dataset	Variation	Metrics (dB)	Context based GLP	CNN	Low rank fuzzy fusion	CMVFTA- based DMN	FrCMVFTA- based DMN	Devised DICO_ZF Net
Indian pines	Number of bands	PSNR	27.211	10.666	19.278	43.829	46.546	50.177
		DD	0.099	0.193	0.419	0.071	0.069	0.066
	Neta	PSNR	27.608	11.032	21.075	41.750	44.093	46.739
		DD	0.107	0.179	0.415	0.081	0.077	0.074
Salinas	Number of bands	PSNR	37.283	20.054	17.902	39.877	42.094	43.728
		DD	0.096	0.176	0.401	0.071	0.067	0.064
	Neta	PSNR	29.103	12.641	23.604	41.972	44.038	45.746
		DD	0.102	0.149	0.396	0.081	0.078	0.075
Pavia centre	Number of bands	PSNR	37.283	20.054	17.902	39.877	42.094	43.728
		DD	0.079	0.094	0.470	0.072	0.066	0.063
	Neta	PSNR	37.433	20.137	17.837	38.815	41.562	43.174
		DD	0.089	0.116	0.459	0.082	0.077	0.074
Pavia university	Number of bands	PSNR	35.048	18.046	19.464	40.200	42.202	43.839
		DD	0.081	0.110	0.449	0.072	0.067	0.064
	Neta	PSNR	35.060	18.038	19.629	39.108	43.891	45.594
		DD	0.091	0.132	0.439	0.082	0.077	0.074

Figure 11.

Comparative assessment of the devised DICO_ZF Net using Pavia university by varying neta value (a) PSNR and (b) DD.

4.6.3 Assessment using Pavia centre

The devised DICO_ZF Net is analyzed for its effectiveness using the Pavia Centre dataset by considering various numbers of bands and neta values.

i) Assessment by varying number of bands

This section elaborates the assessment of the introduced DICO_ZF Net by considering different numbers of bands, as demonstrated in Fig. 8. The PSNR-based evaluation is displayed in Fig. 8a. For 40 bands, the PSNR computed by the prevailing methodologies, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, is 41.009 dB, 39.877 dB, 17.902 dB, 20.054 dB, and 37.283 dB, respectively, whereas the devised DICO_ZF Net computed a higher PSNR of 42.601 dB. The assessment using DD is portrayed in Fig. 8b. For 10 bands, FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, Context-based GLP, and the devised DICO_ZF Net computed DD values of 0.066 dB, 0.072 dB, 0.473 dB, 0.091 dB, 0.078 dB, and 0.063 dB, respectively, which shows the improved performance of the devised approach.

ii) Assessment by varying neta

The evaluation of the devised DICO_ZF Net is carried out by varying the neta values, and this is illustrated in Fig. 9. Figure 9a and b display the assessment with respect to PSNR and DD, respectively. When the neta value is kept at 30, the PSNR achieved by the proposed DICO_ZF Net is 42.938 dB, which is higher than the PSNR values of 41.335 dB, 38.775 dB, 18.564 dB, 20.568 dB, and 37.859 dB calculated by the prevailing FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, respectively. When the neta value is 20, the DD computed by the introduced DICO_ZF Net is 0.074 dB, whereas the DD calculated by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP is 0.077 dB, 0.082 dB, 0.459 dB, 0.112 dB, and 0.088 dB, respectively.

4.6.4 Assessment using Pavia university

The devised DICO_ZF Net is evaluated using the Pavia University dataset in this section by considering various neta values and numbers of bands.

i) Assessment by varying number of bands

Figure 10 portrays the assessment of the introduced technique by varying the number of bands. In Fig. 10a, the assessment of the devised DICO_ZF Net with respect to PSNR is displayed. The prevailing approaches, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, computed PSNR values of 42.800 dB, 40.146 dB, 19.759 dB, 18.404 dB, and 35.437 dB, respectively, whereas the devised DICO_ZF Net attained a higher PSNR of 44.461 dB for 10 bands. Likewise, the evaluation of the devised approach using DD is depicted in Fig. 10b. For 30 bands, the devised DICO_ZF Net computed a low DD of 0.064 dB, compared with the DD values of 0.067 dB, 0.072 dB, 0.449 dB, 0.110 dB, and 0.081 dB achieved by the existing approaches, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, respectively.

ii) Assessment by varying neta

The proposed DICO_ZF Net is evaluated using the Pavia University dataset by varying the neta values, as illustrated in Fig. 11. In Fig. 11a, the PSNR achieved by FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP for a neta value of 20 is 43.562 dB, 39.072 dB, 20.456 dB, 18.654 dB, and 35.725 dB, respectively, whereas the devised DICO_ZF Net attained a higher PSNR of 45.253 dB, which reveals improved performance. The DD-based assessment of the developed DICO_ZF Net is represented in Fig. 11b. When the neta is set at 40, the DD attained by the state-of-the-art approaches, namely FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP, is 0.078 dB, 0.082 dB, 0.438 dB, 0.130 dB, and 0.091 dB, respectively. However, the developed DICO_ZF Net calculated a lower DD of 0.075 dB, revealing improved performance.

4.7 Comparative discussion

This section presents a comparative discussion of the devised DICO_ZF Net for pansharpening. The developed DICO_ZF Net is compared with FrCMVFTA-based DMN, CMVFTA-based DMN, low-rank fuzzy fusion model, CNN, and Context-based GLP in terms of PSNR and DD, by varying the number of bands and neta. Table 2 displays the values of PSNR and DD computed by the various approaches using the Indian Pines, Salinas, Pavia Centre, and Pavia University datasets for a neta value of 50 and for 50 bands. From the table, it can be seen that the developed DICO_ZF Net attained a PSNR as high as 50.177 dB (Indian Pines) and a DD as low as 0.063 dB (Pavia Centre). The high value of PSNR is attributed to the use of the hybrid optimization algorithm during the training of ZF Net. Further, the utilization of CLAHE for sharpening the MS image led to low noise in the image, thereby decreasing the DD.
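As a quick arithmetic check on Table 2, the PSNR margin of the devised DICO_ZF Net over the next-best method (FrCMVFTA-based DMN) can be computed directly from the tabulated values. The sketch below copies only the two relevant columns of the "Number of bands" rows; the names `table2_psnr` and `psnr_margin` are illustrative.

```python
# PSNR (dB) from the "Number of bands" rows of Table 2, stored as
# (FrCMVFTA-based DMN, devised DICO_ZF Net).
table2_psnr = {
    "Indian pines": (46.546, 50.177),
    "Salinas": (42.094, 43.728),
    "Pavia centre": (42.094, 43.728),
    "Pavia university": (42.202, 43.839),
}

def psnr_margin(table):
    """Absolute (dB) and relative (%) PSNR gain of the second entry over the first."""
    return {
        name: (round(ours - base, 3), round(100.0 * (ours - base) / base, 2))
        for name, (base, ours) in table.items()
    }

margins = psnr_margin(table2_psnr)
for name, (gain_db, gain_pct) in margins.items():
    print(f"{name}: +{gain_db} dB ({gain_pct} %)")
```

The same helper can be applied to the neta rows or to any other pair of columns by changing the dictionary.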

5. Conclusion

In this article, a novel framework for pansharpening is introduced using the devised DICO_ZF Net. Owing to technical limitations, the sensors mounted on a satellite cannot capture images with both high spectral and high spatial resolution. Thus, pansharpening is needed to improve the quality of the captured images. Here, pansharpening is performed using three processes, namely sharpening, upsampling, and fusion. The MS images are sharpened using CLAHE and are then upsampled using the upsampling process of the RGB-Guided Hyperspectral Image Upsampling algorithm. ZF Net is employed for fusing the upsampled MS image and the PAN image, wherein the DICO algorithm is utilized for optimizing the fusion process. Further, the devised DICO_ZF Net is examined for its effectiveness by considering metrics such as PSNR and DD. The devised DICO_ZF Net attained a higher PSNR of 50.177 dB and a lower DD of 0.063 dB, thereby revealing enhanced performance. The proposed technique can be used in various applications, like map updating, change detection, and accurate land use classification. Also, the comprehensive chromatic and spatial information offered by the pan-sharpened image helps in territorial transformation identification, feature extraction, and photo interpretation. The CLAHE process of upsampling is restricted by the appearance of cast shadows, which is a major limitation, and to overcome this, future work will focus on using other effective upsampling techniques.

References

1. Ghassemian. A review of remote sensing image fusion methods. Information Fusion. 2016; 32: 75–89.

2. Javan, Samadzadegan, Mehravar, Toosi. A review on spatial quality assessment methods for evaluation of pan-sharpened satellite imagery. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences. 2019; 42: 255–261.

3. Jegatheeswari, Angelin Deepa. Fuzzy Weighted Least Square Filter for Pansharpening in Satellite Images. Multimedia Research. 2019; 2(1): 17–22.

4. Kahraman, Ertürk. Review and performance comparison of pansharpening algorithms for RASAT images. Electrica. 2018; 18(1): 109–120.

5. Aiazzi, Alparone, Garzelli, Santurri. Blind correction of local misalignments between multispectral and panchromatic images. IEEE Geoscience and Remote Sensing Letters. 2018; 15(10): 1625–1629.

6. Zambanini. Feature-based groupwise registration of historical aerial images to present-day ortho-photo maps. Pattern Recognition. 2019; 90: 66–77.

7. Zhao, Jia. Hyperspectral remote sensing image classification based on tighter random projection with minimal intra-class variance algorithm. Pattern Recognition. 2021; 111: 107635.

8. Zhao, Wang, Zhao. Remote sensing image segmentation using geodesic-kernel functions and multi-feature spaces. Pattern Recognition. 2020; 104: 107333.

9. Sun, Lei, Sun, Kuang. Nonlocal patch similarity based heterogeneous remote sensing change detection. Pattern Recognition. 2021; 109: 107598.

10. Javan, Samadzadegan, Mehravar, Toosi, Khatami, Stein. A review of image fusion techniques for pan-sharpening of high-resolution satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing. 2021; 171: 101–117.

11. Schmitt, Zhu. Data fusion and remote sensing: An ever-growing relationship. IEEE Geoscience and Remote Sensing Magazine. 2016; 4(4): 6–23.

12. Kaur, Koundal, Kadyan. Image fusion techniques: a survey. Archives of Computational Methods in Engineering. 2021; 28(7): 4425–4447.

13. Belgiu, Stein. Spatiotemporal image fusion in remote sensing. Remote Sensing. 2019; 11(7): 818.

14. Zhou, Liu, Weng, Wang. Unsupervised Cycle-Consistent Generative Adversarial Networks for Pan Sharpening. IEEE Transactions on Geoscience and Remote Sensing. 2022; 60: 1–14.

15. Meng, Shen, Zhang. Review of the pansharpening methods for remote sensing images based on the idea of meta-analysis: Practical discussion and challenges. Information Fusion. 2019; 46: 102–113.

16. Kang, Zhang, Fan. Pan-sharpening via multiscale dynamic convolutional neural network. IEEE Transactions on Geoscience and Remote Sensing. 2020; 59(3): 2231–2244.

17. Yang, Wan, Huang, Wan. Pansharpening Based on Low-Rank Fuzzy Fusion and Detail Supplement. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. September 2020; 13: 5466–5479.

18. Luo, Zhou, Feng, Xie. Pansharpening via unsupervised convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. July 2020; 13: 4295–4310.

19. Masi, Cozzolino, Verdoliva, Scarpa. Pansharpening by convolutional neural networks. Remote Sensing. 2016; 8(7): 594.

20. Liu, Zhou, Liu, Wang. PSGAN: A generative adversarial network for remote sensing image pan-sharpening. IEEE Transactions on Geoscience and Remote Sensing. 2020; 59(12): 10227–10242.

21. Yang, Huang, Ding, Paisley. PanNet: A deep network architecture for pan-sharpening. In: Proceedings of the IEEE international conference on computer vision, 2017, pp. 5449–5457.

22. Wang, Shao, Huang, Zhang. A dual-path fusion network for pan-sharpening. IEEE Transactions on Geoscience and Remote Sensing. June 2021.

23. Vivone, Marano, Chanussot. Pansharpening: context-based generalized Laplacian pyramids by robust regression. IEEE Transactions on Geoscience and Remote Sensing. March 2020; 58(9): 6152–6167.

24. Lai, Chen, Jeon, Liu, Zhong, Yang. Real-time and effective pan-sharpening for remote sensing using multi-scale fusion network. Journal of Real-Time Image Processing. 2021; 18(5): 1635–1651.

25. Yin, Jiang, Cheng, TCE. Pan-sharpening based on multi-objective decision for multi-band remote sensing images. Pattern Recognition. 2021; 118: 108022.

26. Chang, Jung, Song, Hwang. Automatic contrast-limited adaptive histogram equalization with dual gamma correction. IEEE Access. 2018; 6: 11782–11792.

27. Kwon, Tai. RGB-guided hyperspectral image upsampling. In: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 307–315.

28. Makde, Bhavsar, Jain, Sharma. Deep neural network-based classification of tumourous and non-tumorous medical images. In: The Proceedings of International Conference on Information and Communication Technology for Intelligent Systems, Springer, Cham, 2017, pp. 199–206.

29. Naruei, Keynia. A new optimization method based on COOT bird natural life model. Expert Systems with Applications. 2021; 183: 115352.

30. Bairwa, Joshi, Singh. Dingo Optimizer: A Nature-Inspired Metaheuristic Approach for Engineering Problems. Mathematical Problems in Engineering. 2021.

31. Indian Pines, Salinas, Pavia Centre, University datasets. Available at http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes.

32. Sookhoo, Kavathekar, Bonacaro. Effectiveness and Experiences of Team-based Learning in Undergraduate Nursing Education Programs: Some Findings from a Mixed Methods Systematic Review. In: The Proceeding of 6th World Congress on Advanced Nursing and Healthcare, Brussels, Belgium, 2019.

33. Fusini, Zanchini. Mini-open surgical treatment of an ex professional volleyball player with unresponsive Hoffa’s disease. Minerva Ortopedica e Traumatologica. 2016; 67(4): 192–194.

34. Gangappa, Kiran Mai, Sammulal. Enhanced Crow Search Optimization Algorithm and Hybrid NN-CNN Classifiers for Classification of Land Cover Image. Multimedia Research. 2019; 2(3): 12–22.

35. Santosh Kumar, Venkata Ramanaiah. An Efficient Hybrid Optimization Algorithm for Image Compression. Multimedia Research. 2019; 2(4): 1–11.