An image fusion algorithm of infrared and visible imaging sensors for cyber-physical systems

Abstract

The rapid development of computation, communication and control has given rise to cyber-physical systems (CPS). For round-the-clock urban surveillance or military reconnaissance in complex environments, infrared and visible imaging sensors typically need to be integrated into the CPS. An effective and stable image fusion algorithm is therefore important for the CPS to provide images with rich information, and such an algorithm is introduced in this paper. Compared with traditional multi-scale and multi-direction decomposition (MSMD) based algorithms, the proposed MSMD based algorithm is more efficient. Firstly, edge-preserving base layers and detail layers are obtained by multi-scale decomposition. Secondly, multi-direction decomposition is applied to the base layers rather than to the detail layers as in traditional methods. Then, the series of detail layers and the multi-directional base layers are fused by a patch-based max-choosing rule. After the inverse multi-direction transform is applied to the fused multi-directional base layers, the final result is reconstructed by superposing the fused base and detail layers. Experiments show that our algorithm outperforms state-of-the-art algorithms.

Keywords

Cyber-physical systems, image fusion, rolling guided filter, non-subsampled directional filter bank

1 Introduction

In round-the-clock surveillance or reconnaissance systems, capturing more information at night or in a complicated environment is an important issue [1]. Generally, visible images cannot reveal an object hidden by bushes, dense fog, or low light, but they show a clear background and details. In contrast, infrared images present the object via its thermal radiation but are unable to exhibit a clear background and details. Consequently, an image fusion algorithm is commonly used to merge information from infrared and visible imaging sensors in CPS [2]. As Fig. 1 shows, image fusion algorithms blend the complementary information from the sensors [3], which makes the fused image more suitable for perception. In this paper, a stable and efficient MSMD based image fusion algorithm is proposed for surveillance or reconnaissance systems.

Fig.1

Image fusion of infrared imaging sensor and visible imaging sensor for CPS.

Currently, there are three kinds of image fusion algorithms: pixel-level, decision-level, and feature-level based algorithms. The pixel-level based algorithms are favored for their efficiency and simplicity. In pixel-level based algorithms, spatial [4, 5] and transform [6, 7] domain-based image fusion algorithms are the most popular. In addition, there are also many optimization-based algorithms.

For fusing information from infrared and visible imaging sensors, spatial domain based algorithms cause the fused images to lose details. The key issue in these algorithms is to obtain appropriate weight maps for the source images. To address this problem, many algorithms such as weighted average [4] and principal component analysis (PCA) [8] have been proposed. However, the spectral information of infrared and visible images differs greatly [9], which makes the weights obtained by these algorithms inappropriate. As a result, some details and texture are lost in the fused image. Consequently, spatial domain based algorithms are unable to produce satisfying fusion results from infrared and visible images.

Although transform domain based algorithms retain more details than spatial domain based algorithms, they produce artifacts in the fused images. The most popular type of transform domain based algorithm is the multi-scale decomposition (MSD) based algorithm [10–12]. Using multi-scale decomposition, these algorithms fuse the source images at different scales to obtain a better fused image. In the past few years, many MSD based algorithms have been presented, such as the multi-scale contrast-based model [13], the gradient pyramid (GP) [14] and the multi-scale morphological focus measure [15]. However, inappropriate scale decomposition results in artifacts in the fused images. Moreover, some details and texture are still lost in the fused images obtained by these algorithms. To obtain more details, an MSMD based algorithm, the curvelet transform (CVT) based algorithm [16], was proposed. Components at different scales and in different directions can be extracted by CVT, which facilitates the fusion of detail and edge information. However, CVT is not shift-invariant because of its down-sampling and up-sampling, which results in spectral aliasing and distortion in the fused image. Therefore, a non-subsampled contourlet transform (NSCT) based algorithm was introduced to overcome the drawbacks of CVT by removing the down-sampling [10]. However, because the down-sampling is removed, the multi-direction decomposition of the detail layers becomes time-consuming.

Recently, an optimization-based algorithm was proposed that fuses infrared and visible images by gradient transformation and total variation (GTF) [17]. This optimization model merges the details in accordance with a regularization term. However, only gradient information is utilized in the regularization term, so the fused images contain far less detail than those of the MSD based algorithms. Accordingly, GTF performs unstably on source images with rich details. In addition, GTF is time-consuming because an optimization problem has to be solved.

Overall, some defects of the pixel-level algorithms still need to be addressed in order to blend information from infrared and visible imaging sensors:

Fused images acquired by spatial domain based algorithms lose many details because the spectra of infrared and visible images differ.

Artifacts are introduced into fused images by the inappropriate scale decomposition of MSD based algorithms.

MSMD based algorithms are either time-consuming or produce spectral aliasing in the fused images.

The optimization-based algorithms are time-consuming, and GTF performs unstably on different types of images.

Considering the problems mentioned above, an effective and stable MSMD based algorithm is designed for CPS to fuse infrared and visible images. Thanks to the MSMD processing, the fused images of our algorithm are rich in details and texture. Furthermore, the multi-direction decomposition is applied to the base layer instead of the detail layers, which makes the computational time of our algorithm much lower than that of traditional MSMD based algorithms. The main contributions of this paper are as follows:

To integrate urban surveillance and military reconnaissance equipment into CPS, an effective and stable image fusion algorithm is proposed for infrared and visible imaging sensors.

To better separate details, edges and low-frequency information, an MSMD based decomposition is applied to the source images, in which the MSMD is composed of a rolling guided filter (RGF) and a non-subsampled directional filter bank (NSDFB).

To obtain a better MSD, RGF is applied to decompose the original images. RGF is a scale-aware and edge-preserving filter [18], which preserves edges while removing small structures.

To extract the edges in the base layer, the NSDFB is applied to the base layer. NSDFB decomposes the base layer into multi-directional components and is shift-invariant [19].

To save computation time, a fast MSMD based algorithm is introduced. Compared with traditional MSMD based algorithms such as CVT and NSCT, the proposed algorithm saves time by applying the multi-directional decomposition only to the base layer instead of to the series of detail layers.

The rest of the paper is organized as follows. Section 1 is the introduction. A brief description of RGF and NSDFB is given in Section 2. In Section 3, we explain the specific steps of the proposed algorithm. Experimental results and discussion are presented in Section 4. In Section 5, we conclude the paper.

2 Related work

2.1 Rolling guided filter

The rolling guided filter proposed by Zhang et al. [18] removes small details while preserving large-scale edges. Compared with other edge-preserving filters such as the weighted least squares filter, the rolling guided filter converges quickly and needs only a few iterations to obtain the filtered image. The algorithm is easy to understand and implement, and its process is shown in Fig. 2. Step 1 is Gaussian filtering (GF), and Step 2 is joint bilateral filtering (JBF) with T iterations. I is the input image of GF and JBF. G_i represents the result of the i-th JBF iteration, where G₀ is the result of the Gaussian filter in Step 1. As shown in the schematic diagram, the entire filtering process can be divided into two stages.

Fig.2

The illustration of RGF.

First, small structures are removed by GF as described in Equation (1). Given an input image I_in, G represents the output image (the initial guidance image G₀ of Step 2), and t and s represent the center pixel and a neighborhood pixel in the Gaussian kernel, respectively. $G(t) = \frac{1}{w_{s}} \sum_{s \in N(t)} \exp\left(\frac{-\|s - t\|^{2}}{2\sigma_{s}^{2}}\right) I_{in}(s)$ (1)

The term $w_{s} = \sum_{s \in N(t)} \exp\left(-\|s - t\|^{2} / 2\sigma_{s}^{2}\right)$ is for normalization, and N(t) denotes the set of neighborhood pixels in the Gaussian kernel centered at t. σ_s denotes the standard deviation of the Gaussian filter.

Second, the edges are recovered by the joint bilateral filter. The output of RGF, G_T(t), is obtained after T iterations of joint bilateral filtering (JBF). The JBF is defined as follows: $G_{i}(t) = \frac{1}{M_{s}} \sum_{s \in N(t)} \exp\left(-\frac{\|s - t\|^{2}}{2\sigma_{s}^{2}} - \frac{\|G_{i-1}(s) - G_{i-1}(t)\|^{2}}{2\sigma_{r}^{2}}\right) I_{in}(s), \quad i = 1, 2, \dots, T$ (2)

where

$M_{s} = \sum_{s \in N(t)} \exp\left(-\frac{\|s - t\|^{2}}{2\sigma_{s}^{2}} - \frac{\|G_{i-1}(s) - G_{i-1}(t)\|^{2}}{2\sigma_{r}^{2}}\right)$ (3)

is for normalization, and I_in indicates the input image. G_i(t) is the output of the i-th JBF and the guidance image of the (i + 1)-th JBF. σ_s and σ_r are the standard deviations of the domain and range Gaussian kernels, which control the spatial and range weights, respectively.
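The two stages can be sketched in NumPy as follows. This is a minimal, unoptimized illustration rather than the authors' implementation: the function name, the window radius of 3σ_s, the reflective border padding and the default parameter values are all assumptions made for the sketch. Each output pixel is a normalized weighted average of the input pixels in its neighborhood, with spatial weights only in Step 1 and spatial plus range weights (computed on the previous result) in Step 2.

```python
import numpy as np


def _shifted_views(padded, radius, shape):
    """Yield (dy, dx, view) for every offset in a (2r+1) x (2r+1) window."""
    h, w = shape
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = radius + dy, radius + dx
            yield dy, dx, padded[y0:y0 + h, x0:x0 + w]


def rolling_guidance_filter(image, sigma_s=3.0, sigma_r=0.1, iterations=4):
    """Sketch of RGF: Gaussian smoothing, then `iterations` joint bilateral passes."""
    image = np.asarray(image, dtype=np.float64)
    radius = int(np.ceil(3 * sigma_s))  # assumed window radius, not given in the text
    pad_in = np.pad(image, radius, mode="reflect")

    # Step 1: Gaussian filtering removes small structures.
    num = np.zeros_like(image)
    den = np.zeros_like(image)
    for dy, dx, nb in _shifted_views(pad_in, radius, image.shape):
        w = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
        num += w * nb
        den += w
    guide = num / den

    # Step 2: joint bilateral filtering of the input, guided by the previous result.
    for _ in range(iterations):
        pad_g = np.pad(guide, radius, mode="reflect")
        num = np.zeros_like(image)
        den = np.zeros_like(image)
        views_in = _shifted_views(pad_in, radius, image.shape)
        views_g = _shifted_views(pad_g, radius, image.shape)
        for (dy, dx, nb_in), (_, _, nb_g) in zip(views_in, views_g):
            w_s = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
            w_r = np.exp(-((nb_g - guide) ** 2) / (2 * sigma_r ** 2))
            num += w_s * w_r * nb_in
            den += w_s * w_r
        guide = num / den
    return guide
```

Because every output value is a convex combination of input values, the result always stays within the intensity range of the input; the range parameter `sigma_r` has to be chosen relative to that intensity scale.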

2.2 Non-subsampled directional filter bank

Inspired by the NSCT, multi-direction decomposition is utilized in our algorithm. However, there is a difference between the proposed algorithm and NSCT: the multi-directional decomposition is applied to the base layer instead of the detail layers. In NSCT, the non-subsampled Laplacian pyramid (NSLP) is used to decompose images into different scales. However, the NSLP is not edge-preserving, so both details and edges are decomposed into the detail layers, and the two cannot be separated well. In contrast, in our algorithm the large-scale edges of the source images are retained in the base layer by RGF filtering, and the NSDFB is then applied to the base layer to extract these large-scale edges.

NSDFB is a modified version of the directional filter bank (DFB) that uses quincunx up-sampling in place of the down-sampling and up-sampling steps [19], so the NSDFB is shift-invariant. NSDFB decomposes the 2-D frequency plane of an image into multi-directional bandpass sub-bands, as shown in Fig. 3. If the index of directional decomposition is k, the image is decomposed into 2^k directional sub-bands. The decomposition can be described as follows:

Fig.3

Non-subsampled directional filter bank with directional index k = 2.

$\{D^{s, d} \mid d = 1, 2, 3, \dots, 2^{k}\} = DF(D_{s})$ (4)

where DF(•) represents the filtering process of NSDFB, and D^{s,d} denotes the d-th directional component of the s-th layer D_s.
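To make the idea of splitting the frequency plane into 2^k directional sub-bands concrete, the sketch below uses ideal angular wedge masks in the FFT domain. It is a simplified stand-in and not the NSDFB filter structure of [19]: the function names and the wedge layout are assumptions made for illustration. It shares two properties with the decomposition described above, namely that no sampling is involved (so it is invariant to circular shifts) and that the original image can be recovered from the sub-bands.

```python
import numpy as np


def directional_decompose(image, k=2):
    """Split an image into 2**k directional sub-bands using angular FFT masks.

    Illustrative stand-in for DF(.) in Eq. (4); this is NOT the NSDFB itself.
    """
    image = np.asarray(image, dtype=np.float64)
    n_dirs = 2 ** k
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    # Fold each frequency sample into the upper half-plane so that f and -f
    # receive the same orientation (the spectrum of a real image is symmetric).
    flip = (fy < 0) | ((fy == 0) & (fx < 0))
    theta = np.arctan2(np.where(flip, -fy, fy), np.where(flip, -fx, fx))
    index = np.minimum((theta / np.pi * n_dirs).astype(int), n_dirs - 1)

    spectrum = np.fft.fft2(image)
    bands = []
    for d in range(n_dirs):
        mask = (index == d).astype(np.float64)
        mask[0, 0] = 1.0 / n_dirs  # share the DC term equally among the bands
        # np.real drops any small imaginary residue from unpaired samples.
        bands.append(np.real(np.fft.ifft2(spectrum * mask)))
    return bands


def directional_reconstruct(bands):
    """Inverse of directional_decompose: the masks sum to one, so adding suffices."""
    return np.sum(bands, axis=0)
```

For example, `directional_decompose(img, k=2)` returns four sub-bands whose sum reproduces `img`. Ideal wedge masks cause ringing that properly designed directional filters avoid, so this sketch only illustrates the partition of the frequency plane.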

3 Proposed algorithm

Figure 4 illustrates the process of the proposed algorithm, which consists of four steps. The steps are explained in detail in Sections 3.1–3.4; a brief overview is given first.

Fig.4

The framework of the proposed algorithm.

The MSD of source images: a base layer and a series of detail layers are separated from each source image by RGF.

The multi-direction decomposition of base layers: each base layer is decomposed by the NSDFB into multi-directional base layers, each of which is a different directional component of the base layer.

Fusion of detail layers and multi-directional base layers: a patch-based max-choosing strategy is applied to the multi-directional base layers and the series of detail layers to obtain the fused multi-directional base layers and the fused detail layers.

Obtaining the fused image: the fused base layer is acquired by applying the inverse transform of the NSDFB to the fused multi-directional base layers, and the fused detail layers are then added to it to obtain the final fused image.

3.1 Step 1: Multi-scale decomposition by RGF

In the first step, RGF is applied to decompose the original images into different scales. Suppose the number of MSD layers is L, and R and V represent the infrared and visible source images. To obtain well-defined multi-scale layers, the decomposition is done in two stages. Firstly, RGF is utilized to acquire the base layers and images with different degrees of blur: $R_{i} = RGF(R_{i - 1}, \sigma_{s}, \sigma_{r}, T), \quad i = 1, 2, \dots, L$ (5) $V_{i} = RGF(V_{i - 1}, \sigma_{s}, \sigma_{r}, T), \quad i = 1, 2, \dots, L$ (6)

where RGF(•) represents the filtering of RGF. R_i represents the i-th filtering result of the infrared source image, R₀ = R, and R_L is the infrared base layer. Similarly, V_i represents the i-th filtering result of the visible source image, V₀ = V, and V_L is the visible base layer. σ_s and σ_r denote the standard deviations of the domain and range Gaussian kernels, the same as σ_s and σ_r in Equation (2). T is the number of JBF iterations in Equation (2).

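Secondly, the detail layers of different scales are obtained as the difference between adjacent blurred images: $D_{i}^{R} = R_{i - 1} - R_{i}$ (7) $D_{i}^{V} = V_{i - 1} - V_{i}$ (8)

where $D_{i}^{R}$ and $D_{i}^{V}$ represent the i-th detail layers of the infrared and visible images, respectively. With this sign convention the differences telescope, so each source image equals its base layer plus the sum of its detail layers, which is what the superposition in Equation (17) relies on.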
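The decomposition of Step 1 can be sketched as below, taking each detail layer as the previous level minus the current one so that the layers add back to the source. The smoother is passed in as a function; `box_smooth` is a hypothetical mean-filter stand-in used only to keep the example self-contained, whereas the paper applies RGF at this point. All function names are illustrative.

```python
import numpy as np


def box_smooth(image, radius=2):
    """Hypothetical stand-in smoother (mean filter); the paper uses RGF here."""
    padded = np.pad(image, radius, mode="edge")
    size = 2 * radius + 1
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1))


def multiscale_decompose(image, levels=3, smooth=box_smooth):
    """Return (base, details): the last smoothed image and one detail layer per level."""
    current = np.asarray(image, dtype=np.float64)
    details = []
    for _ in range(levels):
        smoothed = smooth(current)          # next, more blurred level
        details.append(current - smoothed)  # detail removed at this scale
        current = smoothed
    return current, details


def multiscale_reconstruct(base, details):
    """Superpose base and detail layers; the differences telescope to the source."""
    return base + np.sum(details, axis=0)
```

With `levels=3` this yields one base layer and three detail layers per source image, and `multiscale_reconstruct(base, details)` returns the input up to floating-point rounding, whatever smoother is plugged in.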

3.2 Step 2: Multi-directional decomposition by NSDFB

In this step, NSDFB is applied to the base layers to extract edges and details in different directions, since the base layer retains many edges and details owing to the scale-aware and edge-preserving characteristics of RGF. For the base layers R_L and V_L, the multi-directional decomposition can be described as follows: $\{I_{d}^{B} \mid d = 1, 2, \dots, 2^{k}\} = DF(R_{L}, k)$ (9) $\{V_{d}^{B} \mid d = 1, 2, \dots, 2^{k}\} = DF(V_{L}, k)$ (10)

where DF(•) represents the processing of NSDFB. $I_{d}^{B}$ and $V_{d}^{B}$ denote the d-th directional component of the infrared base layer and the visible base layer, respectively. k is the index of multi-direction decomposition.

3.3 Step 3: Fusion of multi-directional base layers and detailed layers

In this step, the fused detail layers and the fused multi-directional base layers are obtained by a patch-based max-choose rule. A max-choose rule is used so that more details are merged into the fused image. However, infrared images always contain some noise and irrelevant details, which are undesirable in the final fused image. Therefore, a patch-based max-choose rule replaces the naive pixel-wise max-choose rule to remove noise while keeping details. The fusion rule consists of the following three stages:

The initial decision map is obtained via a patch-based max-choose as follows: $M(i, j) = \max_{(m, n) \in \Omega} (A(m, n), B(m, n))$ (11) where M(i, j) represents the value of the initial decision map at location (i, j), and A, B denote the source images. Ω is a local window centered at location (i, j).

The initial weight map is acquired by: $S(i, j) = \begin{cases} 1, & \sum_{\omega} H(i, j) > \frac{m \times n}{2} \\ 0, & \text{otherwise} \end{cases}$ (12) and $H(i, j) = \begin{cases} 1, & IR(i, j) > Vis(i, j) \\ 0, & \text{otherwise} \end{cases}$ (13) where IR(i, j) and Vis(i, j) denote the initial decision maps of the infrared and visible layers obtained by Equation (11), and the sum runs over the local window ω centered at location (i, j).

The fused multi-directional base layers and the fused detail layers are acquired by the following equations: $F_{d}^{B} = S_{d} * I_{d}^{B} + (1 - S_{d}) * V_{d}^{B}$ (14) $F_{l}^{D} = S_{l} * D_{l}^{R} + (1 - S_{l}) * D_{l}^{V}$ (15) where S indicates the weight of the infrared layer. Subscripts l and d denote the l-th detail layer and the d-th directional component of the base layer, respectively. $F_{l}^{D}$ and $F_{d}^{B}$ are the fused results of the l-th detail layer and the d-th directional component of the base layer, respectively.
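One possible reading of the three stages is sketched below for a single pair of layers. Several points are assumptions rather than details given in the text: the decision map of each layer is taken as its local maximum over a square window, the same window size is used for the majority vote, borders are edge-padded, and the function names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(image, radius):
    """All (2r+1) x (2r+1) patches of an edge-padded image, one per pixel."""
    size = 2 * radius + 1
    return sliding_window_view(np.pad(image, radius, mode="edge"), (size, size))


def patch_max_fuse(ir_layer, vis_layer, radius=1):
    """Fuse two layers with a patch-based max-choose rule; returns (fused, weights)."""
    ir = np.asarray(ir_layer, dtype=np.float64)
    vis = np.asarray(vis_layer, dtype=np.float64)

    # Stage 1: decision maps, the largest value inside each local window.
    ir_map = _windows(ir, radius).max(axis=(-2, -1))
    vis_map = _windows(vis, radius).max(axis=(-2, -1))

    # Stage 2: H marks pixels where infrared dominates; S is a majority vote of H.
    h = (ir_map > vis_map).astype(np.float64)
    votes = _windows(h, radius).sum(axis=(-2, -1))
    window_area = (2 * radius + 1) ** 2
    s = (votes > window_area / 2).astype(np.float64)

    # Stage 3: binary weighted combination of the two layers.
    return s * ir + (1.0 - s) * vis, s
```

The same function would be called once per directional base component and once per detail layer. The majority vote means a pixel takes the infrared value only when most of its neighborhood also favors infrared, which is how the rule discourages isolated noisy selections.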

3.4 Step 4: Obtaining the final fused image

To acquire the final fused image, the fused base layer is first obtained by the inverse transform of NSDFB, as Equation (16) shows. Then, the reconstruction result is obtained by superposing the fused base and detail layers, as described in Equation (17). $B_{f} = IDF(\{F_{d}^{B} \mid d = 1, 2, \dots, 2^{k}\})$ (16) $F = B_{f} + \sum_{l = 1}^{L} F_{l}^{D}$ (17)

where IDF (•) denotes the inverse transform of NSDFB. B_f and k indicate the fused base layer and index of direction decomposition, respectively. The F represents the final fused image.

4 Experimental results and comparisons

4.1 Experiment setting

There are several major parameters in the proposed algorithm: the decomposition level L, the directional index k of NSDFB, the standard deviations σ_s and σ_r of RGF, and the iteration number T of joint bilateral filtering in RGF. In this paper, we set L = 3, σ_s = 3, σ_r = 0.24, T = 4, and k = 2. Besides, nine other algorithms are compared with the proposed algorithm: five MSD based algorithms, DTCWT [7], RP [20], LP [21], DWT [11] (labeled WT in the tables and discussion below) and MSVD [22]; two MSMD based algorithms, CVT [16] and NSCT [10]; and two recent algorithms, GFF [5] and GTF [17]. The settings of these nine algorithms follow the corresponding papers. Urban and military surveillance images are used in the experiments; all source images are from [23, 24].

In addition, several image quality assessment metrics are selected to compare the algorithms objectively: EN, MI, Q_ab/f and SD [25]. EN and MI are information theory based metrics. EN represents the amount of information in an image [26], and MI indicates the amount of information from the source images that is preserved in the fused image [27]. Q_ab/f measures the amount of edge information transferred from the source images to the fused image [28]. SD, a statistics based metric, measures the amount of details and texture [29]. For all four metrics, a larger value means a better result.
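For orientation, EN, SD and MI can be computed along the following lines. This sketch uses the common histogram-based definitions and assumes 8-bit grey levels with 256 bins; the exact formulations in [25–27] may differ in detail, and the function names are illustrative. Q_ab/f is omitted because it requires an edge-strength and orientation model that is not described here.

```python
import numpy as np


def entropy(image, bins=256, value_range=(0, 255)):
    """EN: Shannon entropy (bits) of the grey-level histogram."""
    hist, _ = np.histogram(image, bins=bins, range=value_range)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))


def standard_deviation(image):
    """SD: spread of grey levels around the image mean."""
    return float(np.std(np.asarray(image, dtype=np.float64)))


def mutual_information(a, b, bins=256, value_range=(0, 255)):
    """Mutual information (bits) between two images via their joint histogram."""
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins,
                                 range=[value_range, value_range])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def fusion_mi(ir, vis, fused, **kwargs):
    """MI of a fused image: information shared with each source, summed."""
    return (mutual_information(ir, fused, **kwargs)
            + mutual_information(vis, fused, **kwargs))
```

A constant image has zero entropy and zero standard deviation, and the mutual information of an image with itself equals its entropy, which gives quick sanity checks for an implementation.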

4.2 Comparative analysis

In this subsection, our algorithm is compared with the nine other algorithms mentioned in Section 4.1. The performance of these ten algorithms is analyzed on various images.

4.2.1 Comparison with other algorithms

Five image pairs, Dataset-1, shown in Fig. 5 are analyzed in detail, including three urban surveillance pairs shown in Fig. 5(a-c) and two military reconnaissance pairs shown in Fig. 5(d-e). For each pair, the subjective visual effect is analyzed first, followed by the objective assessment.

Fig.5

Dataset-1 utilized in experiments.

The fused images for the first pair in Dataset-1 are arranged in Fig. 6, and a detail in each sub-figure is magnified in the bottom-left corner. The fused images of CVT, LP, DTCWT, and NSCT show artifacts around the traffic light, and some road texture is lost in them. The second lamp of the traffic light disappears in the fused image obtained by GTF, as the magnified area shows. In the fused image acquired by RP, distortion appears around the traffic light. Compared with GFF and our algorithm, the fused image of WT is low in brightness and contrast. As for MSVD, aliasing is produced at the edge of the traffic light. In contrast, the fused image obtained by the proposed algorithm has clear and natural edges, and its brightness is higher than that of all other fused images. To compare the algorithms objectively, the quantitative assessment is presented in Table 1. The best results are obtained by our algorithm. Its fused image obtains the highest scores in EN and MI, which means that it retains more information from the source images than the other algorithms. In addition, rich details and texture are indicated by high values of Q_ab/f and SD, for which our algorithm also obtains the highest scores.

Fig.6

Experimental results of different algorithms for the first image in Dataset-1.

Table 1

Quantitative analysis of Fig. 6 under different algorithms

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	6.5329	6.4843	6.4260	6.6594	6.6739	6.5416	6.2112	6.2167	6.5542	6.8156
MI	1.4709	1.5454	1.6370	2.0059	1.6070	1.4714	1.5942	1.656	1.5494	2.7470
Q_ab/f	0.3673	0.4174	0.4600	0.3879	0.4790	0.4014	0.2103	0.2838	0.4592	0.4593
SD	26.8678	26.2614	25.8712	26.5794	29.2224	27.4105	22.357	22.3798	27.0885	30.6502

Figure 7(c-l) shows the fused images obtained by the different algorithms for another pair of source images. The observations are similar to those for Fig. 6. The edge of the person in the fused images obtained by CVT, DTCWT and NSCT has obvious artifacts. The contrast is low in the fused images obtained by GTF and WT, and some details are lost in them. In the fused image acquired by RP, the edge of the person is blurred, and some noise appears around the person. In the fused image of MSVD, pixel blocks appear around the edge of the person. The fused images of GFF and LP are similar to that of the proposed algorithm, but their contrast and brightness are lower. As shown in Table 2, the EN, MI and SD of the proposed algorithm are the highest among all algorithms, and its Q_ab/f is lower only than those of LP and GFF. However, the MI and SD of GFF and LP are much lower than those of the proposed algorithm. Therefore, our algorithm gives the best result for this pair of source images.

Fig.7

Experimental results of different algorithms for the second image in Dataset-1.

Table 2

Quantitative analysis of Fig. 7 under different algorithms

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	6.5329	6.4843	6.4260	6.6594	6.6739	6.5416	6.2112	6.2167	6.5542	6.8156
MI	1.4709	1.5454	1.6370	2.0059	1.6070	1.4714	1.5942	1.656	1.5494	2.7470
Q_ab/f	0.3673	0.4174	0.4600	0.3879	0.4790	0.4014	0.2103	0.2838	0.4592	0.4593
SD	26.8678	26.2614	25.8712	26.5794	29.2224	27.4105	22.357	22.3798	27.0885	30.6502

In Fig. 8, some information is lost in the fused images of CVT and GTF, and the edge of the object is unnatural. In the magnified area of the fused images obtained by DTCWT and RP, artifacts appear around the object. As in Figs. 6(i) and 7(i), there are many pixel blocks around the edge of the object in the fused image acquired by MSVD. As for the fused images obtained by GFF, LP, WT, NSCT and the proposed algorithm, it is difficult to tell them apart visually, but the brightness of WT and NSCT is lower than that of GFF, LP and the proposed algorithm. Table 3 gives the quantitative result. Although the Q_ab/f of our algorithm is lower than that of GFF, the two values are close. In contrast, the MI and SD of GFF are much lower than those of the proposed algorithm. This shows that more information is preserved in the fused image by our algorithm.

Fig.8

Experimental results of different algorithms for the third image in Dataset-1.

Table 3

Quantitative analysis of Fig. 8 under different algorithms

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	6.5636	6.5646	6.7396	6.6214	6.5536	6.5935	6.3866	6.3835	6.5568	6.8456
MI	3.3300	3.4673	4.4702	3.2861	3.7448	3.7652	3.4955	4.322	3.5965	5.8691
Q_ab/f	0.6473	0.6747	0.7204	0.5455	0.7050	0.5949	0.4131	0.6411	0.7044	0.7097
SD	36.279	36.3592	38.8512	31.605	39.6529	36.568	30.5228	31.1703	37.2982	43.4704

Moreover, two pairs of military reconnaissance images are used to verify the effect of the proposed algorithm in military reconnaissance. The first pair is shown in Fig. 9(a, b), and Fig. 9(c-l) shows the fusion results of the different algorithms. In the visible image, we can only see that the person in the middle is holding an object, whereas in the infrared image a concealed weapon appears inside the shirt of the person on the right. Therefore, more information can be obtained by fusing the original images. However, in the fused images of CVT, DTCWT, LP, RP, MSVD, WT and NSCT the concealed weapon has low brightness. Moreover, a lot of noise appears in the fused image acquired by RP, so the concealed weapon cannot be recognized. The brightness of the concealed weapon is high in the fused image of GTF, but many details are lost. The magnified area also shows that some information in Fig. 9(b) disappears in the fused image of GTF. In contrast, rich information is preserved in the fused image of our algorithm, and the brightness is high. Compared with that of GFF, the fused image of our algorithm has a clearer edge at the neckband of the person on the left. Table 4 lists the quantitative assessments. The best result for each metric is achieved by the proposed algorithm, which agrees with the visual effect.

Fig.9

Experimental results of different algorithms for the fourth image in Dataset-1.

Table 4

Quantitative analysis of Fig. 9 under different algorithms

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	6.1028	6.0750	6.5718	6.1207	6.0606	6.1732	6.1254	6.1187	6.0524	6.8445
MI	1.5808	1.5985	2.6929	1.5014	1.7631	1.237	2.3069	3.2137	1.6963	4.8732
Q_ab/f	0.6090	0.6500	0.7368	0.3419	0.7135	0.3299	0.4154	0.4372	0.6970	0.7674
SD	29.1998	30.2162	33.5853	28.2741	34.1214	35.4836	25.7806	25.6843	30.8415	40.084

To verify the fusion of details, a pair of images with rich details and texture is used in this experiment. Figure 10(a) and (b) are the original images, and Fig. 10(c-l) shows the fused images of the different algorithms. Noise still appears in the fused image obtained by RP. There are some pixel blocks around the edge of the tank in the magnified area of MSVD. Compared with the fused image of the proposed algorithm, the fused images obtained by CVT, DTCWT, GFF, GTF, LP, WT, and NSCT lose many details, especially that of GTF. In addition, the edge information in the magnified area of the proposed algorithm is richer than in the others. The quantitative assessment is presented in Table 5. All the metrics of the proposed algorithm are the highest, which agrees with the visual effects.

Fig.10

Experimental results of different algorithms for the fifth image in Dataset-1.

Table 5

Quantitative analysis of Fig. 10 under different algorithms

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	7.4069	7.404	7.7941	6.384	7.4477	7.5668	7.2445	7.1855	7.4148	7.9424
MI	1.7997	1.8795	2.3709	1.1579	1.9869	1.2004	2.3367	3.5294	1.9546	5.5113
Q_ab/f	0.4028	0.449	0.5464	0.3791	0.511	0.193	0.1679	0.1774	0.4962	0.5484
SD	51.3241	50.9015	61.0499	52.8302	54.5032	61.1764	39.3623	38.5437	52.0198	72.3966

4.2.2 The performance in different images

To analyze the performance of the different algorithms on various images, another five pairs of source images, shown in Fig. 11, are used in this experiment. The fused images of these source images are presented in Fig. 12. The fused images of Fig. 11(a) are listed in the first column of Fig. 12. The contrast of the fused image of our algorithm is higher than that of the other algorithms, and the edge between the trees and the sky is clearer. In the second column, the brightness of the proposed algorithm is the highest. In the third column, obvious artifacts can be found in the fused image obtained by RP, and the brightness of the object is low in all the fused images except those of GTF and the proposed algorithm. However, the fused image obtained by GTF has few details from the visible image. In the fourth column of Fig. 12, the fused images obtained by GTF, LP, RP, MSVD, and WT show fewer details of the trees than those of CVT, DTCWT, GFF, NSCT, and the proposed algorithm. Compared with GFF and the proposed algorithm, the fused images obtained by CVT, DTCWT and NSCT are darker. The fused images of the last pair in Fig. 11 are shown in the last column of Fig. 12. The object has almost disappeared in the fused images acquired by RP, MSVD and WT, and the island is lost in the fused image obtained by GTF. In contrast, both the island and the object are clear in the fused image of our algorithm.

Fig.11

Dataset-2 utilized in experiments.

Fig.12

Experimental results of different algorithms for all images in Dataset-2.

Quantitative assessments are also conducted in this experiment. Figure 13 shows the means of EN, MI, Q_ab/f and SD over the ten fused images obtained from Dataset-1 and Dataset-2. Although its Q_ab/f is lower than that of GFF, the proposed algorithm achieves the best result in EN, MI and SD. This means that the proposed algorithm has the best performance across different images among these algorithms. To compare the stability of each algorithm, the ratio of standard deviation to mean is listed in Table 6. The ratios for EN and MI of our algorithm are the smallest among all algorithms, which means that our algorithm achieves the most stable performance in EN and MI. Although the ratios for Q_ab/f and SD of the proposed algorithm are not the best, they rank among the top. Therefore, the proposed algorithm is better in terms of stability than the other algorithms.

Fig.13

The mean of each metric obtained by the different algorithms.

Table 6

Ratio of standard deviation to mean

	CVT	DTCWT	GFF	GTF	LP	RP	MSVD	WT	NSCT	PROPOSED
EN	0.1216	0.1222	0.1156	0.1625	0.1207	0.1232	0.1306	0.1384	0.1215	0.1091
MI	0.4333	0.4465	0.5534	0.4784	0.4166	0.5704	0.4220	0.3842	0.4348	0.3756
Q_ab/f	0.2211	0.1890	0.1465	0.2286	0.1543	0.3468	0.3284	0.3874	0.1680	0.1780
SD	0.3868	0.3859	0.3952	0.5001	0.3605	0.4078	0.4211	0.4270	0.3807	0.3728
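The text describes the stability measure as the ratio of standard deviation to mean of a metric over the test images, where a smaller ratio indicates more consistent behavior. A minimal sketch follows; the population standard deviation (NumPy's default) is an assumption, since the text does not say which estimator was used, and the score lists are hypothetical values chosen only to illustrate the comparison.

```python
import numpy as np


def coefficient_of_variation(scores):
    """Ratio of standard deviation to mean for one metric over a set of images."""
    scores = np.asarray(scores, dtype=np.float64)
    return float(np.std(scores) / np.mean(scores))


# Hypothetical per-image EN scores for two algorithms (illustrative values only).
steady = [6.8, 6.9, 7.0, 6.9, 6.8]
erratic = [5.5, 7.9, 6.1, 7.6, 6.4]
print(coefficient_of_variation(steady) < coefficient_of_variation(erratic))
```

Because the ratio is normalized by the mean, it allows metrics with very different scales, such as EN and SD, to be compared in the same table.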

In addition, a comparison of the computational time over all these images is presented in Table 7. The experiments in this paper are implemented on a computer with 16 GB of RAM and an Intel i7-4790 @ 3.69 GHz. The computational time is the average over 20 repeated runs. Although the proposed algorithm takes more time than LP, RP, GFF, DTCWT, MSVD and WT, it is faster than the traditional MSMD based algorithm CVT (4.3067 versus 4.9179) and much faster than NSCT (12.8436) and GTF (13.5915).

Table 7

Average computational time of different algorithms

	LP	RP	GFF	CVT	DTCWT	MSVD	WT	NSCT	GTF	PROPOSED
Time	0.0357	0.2276	0.9804	4.9179	0.9724	1.0931	1.2158	12.8436	13.5915	4.3067

5 Conclusion

An image fusion algorithm for infrared and visible imaging sensors is proposed for CPS in this paper. First, the details and large-scale edges of the source images are extracted by RGF and NSDFB. Then, a patch-based max-choosing rule is utilized to fuse the detail layers and multi-directional base layers. Comparative experiments show that our algorithm achieves the best results in EN, MI and SD. In addition, our algorithm is more stable than the others, which makes it more suitable for CPS.

However, much work remains for further research on this algorithm. Firstly, further research is needed on the fusion rule, which is designed to merge information from infrared and visible imaging sensors. Besides, the proposed algorithm could be extended to fuse remote sensing images.

Acknowledgments

This research is funded by the National Natural Science Foundation of China (NO. 61771378) and the Science Foundation of Sichuan Science and Technology Department (NO. 2018GZ0718).

References

1. Jeon, Anisetti, Lee, Bellandi, Damiani and Jeong, Concept of linguistic variable-based fuzzy ensemble approach: Application to interlaced hdtv sequences, IEEE Trans Fuzzy Systems 17(6) (2009), 1245–1258.

2. Soumya and S.M. Thampi, A fuzzy fusion approach to enlighten the illuminated regions of night surveillance videos, Journal of Intelligent & Fuzzy Systems 32(4) (2017), 3143–3149.

3. Zhen and H.K.T. Muzaffar, Intelligent fusion algorithm for multi-sensor information in integrated power grid operation system, Journal of Intelligent & Fuzzy Systems (Preprint), 1–11.

4. Malviya and S.G. Bhirud, Image fusion of digital images, International Journal of Recent Trends in Engineering 2(3) (2009), 146.

5. Li, Kang and Hu, Image fusion with guided filtering, IEEE Transactions on Image Processing 22(7) (2013), 2864–2875.

6. Zhang, Shao, Yao, Li and Wang, Underwater multi-focus image fusion based on sparse matrix, Journal of Intelligent & Fuzzy Systems (Preprint), 1–9.

7. J.J. Lewis, R.J. O'Callaghan, S.G. Nikolov, D.R. Bull and Canagarajah, Pixel- and region-based image fusion with complex wavelets, Information Fusion 8(2) (2007), 119–130.

8. Wan, Zhu and Qin, Multifocus image fusion based on robust principal component analysis, Pattern Recognition Letters 34(9) (2013), 1001–1008.

9. Jeon, Anisetti, Wang and Damiani, Locally estimated heterogeneity property and its fuzzy filter application for deinterlacing, Information Sciences 354 (2016), 112–130.

10. Adu, Gan, Wang and Huang, Image fusion based on nonsubsampled contourlet transform for infrared and visible light image, Infrared Physics & Technology 61 (2013), 94–100.

11. Li, B.S. Manjunath and S.K. Mitra, Multisensor image fusion using the wavelet transform, Graphical Models and Image Processing 57(3) (1995), 235–245.

12. Jian, Yang, Zhou, Zhou and Liu, Multi-scale image fusion through rolling guidance filter, Future Generation Computer Systems 83 (2018), 310–325.

13. Xing, Cai, Zeng, Chen, Zhu and Hou, A multi-scale contrast-based image quality assessment model for multi-exposure image fusion, Signal Processing 145 (2018), 233–240.

14. V.S. Petrovic and C.S. Xydeas, Gradient-based multiresolution image fusion, IEEE Transactions on Image Processing 13(2) (2004), 228–237.

15. Zhang, Bai and Wang, Boundary finding based multi-focus image fusion through multi-scale morphological focus-measure, Information Fusion 35 (2017), 81–101.

16. Nencini, Garzelli, Baronti and Alparone, Remote sensing image fusion using the curvelet transform, Information Fusion 8(2) (2007), 143–156.

17. Ma, Chen, Li and Huang, Infrared and visible image fusion via gradient transfer and total variation minimization, Information Fusion 31 (2016), 100–109.

18. Zhang, Shen, Xu and Jia, Rolling guidance filter, In European Conference on Computer Vision, Springer, 2014, pp. 815–830.

19. R.H. Bamberger and M.J.T. Smith, A filter bank for the directional decomposition of images: Theory and design, IEEE Transactions on Signal Processing 40(4) (1992), 882–893.

20. Toet, Image fusion by a ratio of low-pass pyramid, Pattern Recognition Letters 9(4) (1989), 245–253.

21. Wang and Chang, A multi-focus image fusion method based on laplacian pyramid, JCP 6(12) (2011), 2559–2566.

22. V.P.S. Naidu, Image fusion technique using multi-resolution singular value decomposition, Defence Science Journal 61(5) (2011), 479–484.

23. https://figshare.com/articles/TNO_Image_Fusion_Dataset/1008029.

24. http://www.imagefusion.org.

25. Wang, Wu, Anisetti and Jeon, Bayesian method application for color demosaicking, Optical Engineering 57(5) (2018), 053102.

26. Liu, Blasch, Xue, Zhao, Laganiere and Wu, Objective assessment of multiresolution image fusion algorithms for context enhancement in night vision: A comparative study, IEEE Transactions on Pattern Analysis and Machine Intelligence 34(1) (2012), 94–109.

27. Hossny, Nahavandi and Creighton, Comments on “Information measure for performance of image fusion”, Electronics Letters 44(18) (2008), 1066–1067.

28. C.S. Xydeas and Petrovic, Objective image fusion performance measure, Electronics Letters 36(4) (2000), 308–309.

29. Zhu, Huang and Lei, Fusion of infrared and visible images based on bemd and nsdfb, Infrared Physics & Technology 77 (2016), 82–93.