Fusion of 3-D medical image gradient domain based on detail-driven and directional structure tensor

Abstract

BACKGROUND:

Multi-modal medical image fusion plays a crucial role in many areas of modern medicine, such as diagnosis and therapy planning.

OBJECTIVE:

Because the structure tensor preserves image geometry, we used it to construct a directional structure tensor and, on that basis, propose an improved 3-D medical image fusion method.

METHOD:

The local entropy metrics were used to construct the gradient weights of different source images, and the eigenvectors of traditional structure tensor were combined with the second-order derivatives of image to construct the directional structure tensor. In addition, the guided filtering was employed to obtain detail components of the source images and construct a fused gradient field with the enhanced detail. Finally, the fusion image was generated by solving the functional minimization problem.

RESULTS AND CONCLUSION:

Experimental results demonstrated that this new method is superior to the traditional structure tensor and multi-scale analysis in both visual effect and quantitative assessment.

Keywords

3-D medical image, multi-modal image fusion, directional structure tensor, local entropy, detail component

1 Introduction

Medical images acquired with different imaging modalities, such as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and single-photon emission computed tomography (SPECT), reflect structural or molecular metabolic information about the human body from different perspectives. PET and SPECT convey information on blood perfusion and on the metabolism of tissues and organs, which gives them an essential advantage in the early diagnosis of lesions. However, because of their limited resolution, these images cannot show anatomical structure clearly enough, which makes lesions difficult to localize. To provide physicians with more comprehensive clinical information, the fusion of multiple imaging modalities has been proposed and put into practice. For example, PET-CT image fusion, which combines anatomical features with functional metabolic information, can not only improve the diagnostic accuracy of head and neck cancer [1], but also facilitate the detection of lung cancer in organs subject to large motion [2]; MRI-PET image fusion can be used to detect brain tumors [3] and metastasis of liver cancer [4]; SPECT-CT image fusion can be employed to study coronary artery disease [5], and to identify bone cancer metastasis or benign and malignant lesions [6]. Moreover, medical image fusion has important application value in surgical planning, radiotherapy planning, and the assessment of post-treatment effects [7, 8].

Image fusion is an effective information synthesis technology, and many medical image fusion methods have been proposed in recent decades [9–14]. Multi-scale analysis (MTA), the most popular approach in the transform domain, has given rise to various transforms and improvements, including the discrete wavelet transform (DWT) [15], discrete shearlet transform [16], contourlet transform [17], non-subsampled contourlet transform (NSCT) [18], and non-subsampled shearlet transform (NSST) [19]. However, the fusion rules applied to the different sub-bands in MTA may not always fuse the image features effectively, resulting in loss of detail and degraded image quality. Sparse representation (SR) has also emerged in transform-domain fusion research [9, 20]. An SR-based algorithm first obtains the sparse coefficients of the source images through dictionary learning, then fuses the sparse coefficients and reconstructs the fused image with the overcomplete dictionary. Although SR-based methods have been applied successfully to medical image fusion, usually only one dictionary is used to represent the different morphological structures of the source images. The common features contained in the dictionary are not sufficient to express the image content effectively, especially irregular details, which leads to over-smoothed fusion results. Deep learning has also been applied successfully to image fusion [21, 22]. However, because sufficient training data and an authoritative ground truth for supervision are lacking, medical image fusion based on deep learning remains difficult.

Recently, researchers have proposed image fusion methods in the gradient domain [23–28], in which important image features, including edge information, can be well preserved in the fused image. Piella [12] used the structure tensor as gradient information to fuse the contrast information of multi-spectral images, thus effectively preserving the edge details of the input images. For noisy images, Zhao et al. [28] obtained good results in fusion experiments by defining gradient entropy as the gradient-domain weight and introducing a P-Laplace diffusion constraint. However, these methods are mainly applied to the fusion of two-dimensional (2-D) multi-spectral images. A medical image is generally a three-dimensional (3-D) volume made up of slices collected along a specific direction. The information shared between adjacent slices is essential for experts to evaluate the integrity and activity of organs and tissues. Previous studies mostly focused on 2-D image fusion, in which 3-D fusion is achieved by fusing slice by slice. This crude way of generating a 3-D fusion from 2-D fusions can easily ignore the volume context hidden in the 3-D structure, resulting in loss of information and incomplete fusion [27]. It is therefore of great clinical significance to propose an effective 3-D image fusion method. For instance, some researchers combined gradient-based image fusion with MTA to fuse 3-D images [23, 25]. Wang and Liu [26] used a gradient-based fusion method to fuse 3-D PET/CT images of non-small cell lung cancer.

Although the fusion method based on the 3-D structure tensor can fuse PET-CT images well, it still has some drawbacks. On the one hand, it cannot capture continuous 3-D detail structures well and lacks spatial continuity; on the other hand, the fused image shows a small but consistent loss of detail because the fused gradient domain is incomplete. To address these problems, we propose an improved method. To capture structural continuity in 3-D medical images, the directional structure tensor (DST) was proposed in our previous work [29], where it was shown to describe local structural information precisely in denoising. The DST contains not only the direction of local change in the image but also the second-order derivatives. The second-order derivatives are used to build a more complete gradient field, which is valuable when fusing images based on the gradient field. Inspired by this work, we propose a 3-D medical image fusion method based on the DST. Furthermore, because the continuity of information must be maintained when fusing images, we use local entropy to construct the gradient weights of the different images and thereby build a local entropy-weighted directional structure tensor. Local entropy can express the continuous geometric structure and texture information of the image, which helps to keep the information continuous. Additionally, to highlight detailed information in the fused image and to compensate for the loss of small details caused by reconstruction, we use guided filtering to obtain the base components of the images and then subtract the base components from the source images to obtain the detail components. The detail components are then converted into the gradient domain and added to the fused gradient field to construct the final objective function.

The remainder of this paper is organized as follows. Section 2 briefly introduces the basic principle of the DST. Section 3 describes the proposed image fusion method. Experimental results are presented in Section 4. The discussion and conclusions are given in Sections 5 and 6, respectively.

2 Directional structure tensor

In this section, we briefly describe how to build DST. Firstly, the traditional structure tensor (ST) is constructed according to the image local neighborhood gradient information, and then the eigenvectors of the traditional ST are interpolated into the original image to reconstruct a DST with the direction of neighborhood change. Finally, the contribution of the second-order derivatives is also added to construct a more complete gradient field, which is very beneficial for reconstructing the fused image from the gradient field.

Supposing f(x) is a 3-D image, the traditional ST of each point is defined as: $T = g_{σ} * \nabla f \nabla f^{T} = 〈 g g^{T} 〉 = [\begin{matrix} 〈 g_{x} g_{x} 〉 & 〈 g_{x} g_{y} 〉 & 〈 g_{x} g_{z} 〉 \\ 〈 g_{y} g_{x} 〉 & 〈 g_{y} g_{y} 〉 & 〈 g_{y} g_{z} 〉 \\ 〈 g_{z} g_{x} 〉 & 〈 g_{z} g_{y} 〉 & 〈 g_{z} g_{z} 〉 \end{matrix}]$ (1)

The eigen-decomposition of the 3-D ST is: $T = λ_{u} u u^{T} + λ_{v} v v^{T} + λ_{w} w w^{T}$ (2) where u, v, w are the normalized eigenvectors corresponding to the eigenvalues λ_u, λ_v, λ_w, respectively, with λ_u ≥ λ_v ≥ λ_w ≥ 0.
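Equations (1) and (2) can be illustrated with a minimal NumPy/SciPy sketch. It is not the authors' implementation: the function names, the use of `np.gradient` for the derivatives, the default `sigma`, and the mapping of array axes to x, y, z are all assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def structure_tensor_3d(f, sigma=1.0):
    """Return T with shape f.shape + (3, 3): the Gaussian-smoothed outer product of the gradient."""
    f = np.asarray(f, dtype=float)
    grads = np.gradient(f)  # central differences along each of the three array axes
    T = np.empty(f.shape + (3, 3))
    for a in range(3):
        for b in range(a, 3):
            T[..., a, b] = gaussian_filter(grads[a] * grads[b], sigma)
            T[..., b, a] = T[..., a, b]  # the tensor is symmetric
    return T


def sorted_eigen(T):
    """Eigenvalues in descending order with the matching eigenvectors as columns."""
    vals, vecs = np.linalg.eigh(T)  # eigh returns eigenvalues in ascending order
    return vals[..., ::-1], vecs[..., ::-1]


# Toy volume (hypothetical data): intensity ramps along the first axis only
vol = np.tile(np.arange(8.0)[:, None, None], (1, 8, 8))
lam, vec = sorted_eigen(structure_tensor_3d(vol))
```

For this ramp, the only non-zero eigenvalue belongs to an eigenvector aligned with the first axis, which is the direction of intensity change.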

To describe the local structure information more accurately, we construct the DST from the eigenvectors of the traditional ST and the second-order derivatives. The DST is defined as follows [29, 30]:

$\begin{matrix} DST = (\nabla I \nabla I^{T}) + g_{σ} * μ [{(\nabla f)}_{xx} {(\nabla f)}_{xx}^{T} + {(\nabla f)}_{yy} {(\nabla f)}_{yy}^{T} + {(\nabla f)}_{zz} {(\nabla f)}_{zz}^{T}] \\ \nabla I \nabla I^{T} = [\begin{matrix} {g_{u}}^{2} & g_{u} g_{v} & g_{u} g_{w} \\ g_{v} g_{u} & {g_{v}}^{2} & g_{v} g_{w} \\ g_{w} g_{u} & g_{w} g_{v} & {g_{w}}^{2} \end{matrix}] \end{matrix}$ (3) where μ is the gain factor, set to 1.25 in the experiments, and g_u, g_v, g_w are the redefined directional derivatives, calculated by interpolating the original image f(x) along the eigenvectors u(x), v(x), w(x) of the traditional ST. With $\nabla I = {[\begin{matrix} g_{u} & g_{v} & g_{w} \end{matrix}]}^{T}$, the derivatives g_u, g_v, g_w are defined as: $\left\{ \begin{matrix} g_{u} (x) = \frac{1}{2} [f (x + u (x)) - f (x - u (x))] \\ g_{v} (x) = \frac{1}{2} [f (x + v (x)) - f (x - v (x))] \\ g_{w} (x) = \frac{1}{2} [f (x + w (x)) - f (x - w (x))] \end{matrix} \right.$ (4)
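The directional derivatives in Eq. (4) require image values at non-integer positions x ± u(x). One possible way to evaluate them is sketched below with `scipy.ndimage.map_coordinates`. The trilinear interpolation (`order=1`), the `mode="nearest"` boundary handling, and the function name are assumptions; the source does not specify the interpolation scheme.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def directional_derivative(f, d):
    """g_d(x) = 0.5 * [f(x + d(x)) - f(x - d(x))], evaluated by trilinear interpolation.

    f: volume of shape (X, Y, Z); d: unit vector field of shape (X, Y, Z, 3).
    """
    f = np.asarray(f, dtype=float)
    # Integer voxel coordinates, shape (3, X, Y, Z)
    grid = np.stack(
        np.meshgrid(*[np.arange(n) for n in f.shape], indexing="ij"), axis=0
    )
    offset = np.moveaxis(np.asarray(d, dtype=float), -1, 0)
    # Boundary handling ("nearest") is an assumption of this sketch
    forward = map_coordinates(f, grid + offset, order=1, mode="nearest")
    backward = map_coordinates(f, grid - offset, order=1, mode="nearest")
    return 0.5 * (forward - backward)
```

Applying this function to the three eigenvector fields u, v, w gives g_u, g_v, g_w, from which the matrix ∇I∇I^T in Eq. (3) follows as an outer product at every voxel.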

The advantage of the DST is that the horizontal structure of the image or the discontinuity between slices can be easily captured by the reconstructed directional derivatives. To illustrate the advantages of the DST in describing local texture information more intuitively, one lung CT image is selected for display. Figure 1 shows the maximum eigenvalue images of (a) the traditional ST and (b) the DST. From Fig. 1 (b), we can see that the DST describes the local structure information of the image more completely. Specifically, as shown in the region of interest marked in red, the continuity of the blood vessel is better, and the linear texture of the image is represented more clearly and completely. Thus, the DST captures the volume context information in adjacent slices better.

Fig.1

Maximum eigenvalue images of the tensor matrices: (a) traditional ST, (b) DST.

3 The fusion method

In this section, we first introduce the local entropy weighting used to construct a new DST. Then we use guided filtering to obtain the detail components and increase the contribution of the details in the fused image. Afterwards, we extract the image features according to the established DST, fuse these features and the extracted details in the gradient domain, and finally reconstruct the fused image from the gradient domain. The texture features of the image are fully considered during fusion, which is conducive to obtaining richer information and a clearer image in the fusion results. Figure 2 illustrates the flow chart of the proposed fusion algorithm.

3.1 Local entropy weighting

Fig.2

Algorithm flow chart.

As an index of the local information in an image, the local entropy reflects the discreteness of the voxel distribution over the gray levels of the image and characterizes its local texture information. The local entropy LE is defined as follows: $p_{i} = \frac{f_{i}}{\sum_{j} f_{j}}$ (5) $LE = - \sum_{i = 0}^{L - 1} p_{i} \log p_{i}$ (6) where f_i is the number of voxels with gray level i in the local neighborhood, j runs over the gray levels of the voxels in the local neighborhood, the size of the 3-D local neighborhood is generally taken as 9 × 9 × 9, p_i is the probability that the voxel gray value is i in the local neighborhood, and L is the number of gray levels of the image. For regions with rich information, the value of local entropy is low because of its discreteness, whereas for regions with uniform gray distribution, the value is high [31]. To make full use of the effective information in the source images, the local entropy metrics are used to construct the gradient weights of the different source images, in which regions with larger discreteness receive larger weights to enhance the edge details. The local entropy weighting is defined as: $\begin{matrix} w_{k} (x) = 1 - \exp (| L E_{k} (x) |) & (k = 1, 2) \end{matrix}$ (7)

Therefore, combining the source image gradient field with local entropy weighting, the final DST is defined as: $\begin{matrix} DST = {(w_{k})}^{2} \cdot DST & (k = 1, 2) \end{matrix}$ (8)
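The local entropy of Eqs. (5) and (6) can be sketched as follows. This is an illustrative implementation only: it covers the entropy map, not the weighting of Eqs. (7) and (8), and the quantisation to `levels` gray levels, the natural logarithm, the boundary mode, and the function name are assumptions. The local histogram is obtained by box-filtering one indicator volume per gray level, so that each filtered value is the fraction p_i of voxels with that level in the window.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def local_entropy(f, levels=16, size=9):
    """Shannon entropy of the gray-level histogram in a size^3 window around each voxel."""
    f = np.asarray(f, dtype=float)
    span = f.max() - f.min()
    # Quantise to a small number of gray levels (an assumption to keep the loop short)
    if span == 0:
        q = np.zeros(f.shape, dtype=int)
    else:
        q = np.minimum((levels * (f - f.min()) / span).astype(int), levels - 1)
    le = np.zeros(f.shape)
    for i in range(levels):
        # Fraction of voxels in the window whose level is i, i.e. p_i in Eq. (5)
        p = uniform_filter((q == i).astype(float), size=size, mode="nearest")
        p = np.clip(p, 0.0, 1.0)
        nz = p > 1e-12  # skip empty bins, where p * log(p) is taken as 0
        le[nz] -= p[nz] * np.log(p[nz])
    return le
```

A constant volume gives an entropy of zero everywhere, while a window split almost evenly between two gray levels gives a value close to log 2.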

3.2 Image detail extraction

To enhance the detail in the fused image, we extract the detail components of the source images. First, guided filtering is applied to the source image to obtain the base component: $B = f * G$ (9) where f is the source image and G is the guided filter. The base component is then subtracted from the source image to obtain the detail component: $D = f - B$ (10)

Because guided filtering preserves the edges of the image while blurring the fine details in the base component, subtracting the base component from the source image leaves only the detail part of the image. Fig. 3 (a) shows the details extracted from one slice of the image in the CT sequence, and Fig. 3 (b) shows the details extracted from the image in the MR-T1ce sequence.

Fig.3

Detail map: (a) details extracted from the CT image; (b) details extracted from the MR-T1ce image.
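The base/detail split of Eqs. (9) and (10) can be sketched with a box-filter guided filter in which the image serves as its own guide. The source does not state the guide image, window radius, or regularisation, so `radius`, `eps`, the self-guided choice, and the function names below are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def guided_filter_self(f, radius=2, eps=1e-2):
    """Box-filter guided filter with the image as its own guide (2-D or 3-D arrays)."""
    f = np.asarray(f, dtype=float)
    size = 2 * radius + 1
    mean = uniform_filter(f, size=size, mode="nearest")
    var = uniform_filter(f * f, size=size, mode="nearest") - mean * mean
    var = np.maximum(var, 0.0)   # guard against small negative round-off
    a = var / (var + eps)        # near 1 at strong edges, near 0 in flat regions
    b = (1.0 - a) * mean
    mean_a = uniform_filter(a, size=size, mode="nearest")
    mean_b = uniform_filter(b, size=size, mode="nearest")
    return mean_a * f + mean_b


def detail_component(f, radius=2, eps=1e-2):
    """Return (D, B): detail D = f - B as in Eq. (10) and base B as in Eq. (9)."""
    base = guided_filter_self(f, radius, eps)
    return np.asarray(f, dtype=float) - base, base
```

Larger values of `eps` smooth the base component more strongly, which moves more of the fine texture into the detail component.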

3.3 Image feature extraction

According to the analysis of the ST in [30], the eigenvalues λ₁, λ₂, λ₃ provide a measure of the anisotropy of the structural information, such as the linearity and planarity of the local structural features of the image. Medical images contain a large number of complex structures such as blood vessels and organs, and image fusion must display these important structures. Therefore, based on the relationship among the eigenvalues, a coherence measurement among the gray values of local pixels is defined to describe the local information of the image. It aims to highlight the texture details and edge regions of the image, so that the important organizational features are better preserved in the fusion weighting process. The coherence measurement is designed as: $\left\{ \begin{matrix} c_{1} = {(λ_{1} + λ_{2} + λ_{3})}^{2} \\ c_{2} = {[3 \times max (λ_{1}, λ_{2}, λ_{3}) - (λ_{1} + λ_{2} + λ_{3})]}^{2} \\ c = c_{1} + 3 c_{2} \end{matrix} \right.$ (11) where λ₁, λ₂, λ₃ are the eigenvalues of the DST.

After the feature template of each source image has been constructed, a weight is assigned to the gradient field of each source image according to its structural features: $W_{k} = \frac{c_{k}^{2}}{\sum_{j = 1}^{n} c_{j}^{2}}$ (12) where c_k is the coherence measurement of the k-th source image and n is the number of source images.

Therefore, the final fused gradient field is defined as: $WT = \sum_{k = 1}^{n} W_{k} \cdot DST$ (13)
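Equations (11) to (13) translate almost directly into array operations. The sketch below assumes that each weighted DST is stored as an array of shape (..., 3, 3); the function names and the small constant `tiny`, which avoids division by zero where all coherence values vanish, are additions made for illustration and are not part of the source.

```python
import numpy as np


def coherence(eigvals):
    """Eq. (11): c = c1 + 3 * c2, with the three eigenvalues on the last axis."""
    s = eigvals.sum(axis=-1)
    c1 = s ** 2
    c2 = (3.0 * eigvals.max(axis=-1) - s) ** 2
    return c1 + 3.0 * c2


def fusion_weights(c_list, tiny=1e-12):
    """Eq. (12): W_k = c_k^2 / sum_j c_j^2 (tiny is an added safeguard against 0/0)."""
    sq = np.stack([c ** 2 for c in c_list], axis=0)
    return sq / (sq.sum(axis=0, keepdims=True) + tiny)


def fused_tensor(dst_list):
    """Eq. (13): WT = sum_k W_k * DST_k for tensor fields of shape (..., 3, 3)."""
    c_list = [coherence(np.linalg.eigvalsh(d)) for d in dst_list]
    W = fusion_weights(c_list)
    return sum(W[k][..., None, None] * dst_list[k] for k in range(len(dst_list)))
```

As a worked check of Eq. (11), eigenvalues (1, 0, 0) give c₁ = 1, c₂ = (3 − 1)² = 4 and c = 13, whereas the isotropic case (1, 1, 1) gives c₁ = 9, c₂ = 0 and c = 9, so a strongly oriented structure of smaller total energy still receives the larger weight.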

3.4 Fusion gradient domain

The tensor matrix of the fused image, which is closely related to the fused gradient field, needs to be as close to WT as possible so that the important structural features of the source images are extracted. As before, WT can be eigen-decomposed as $WT = λ_{1} e_{1} {e_{1}}^{T} + λ_{2} e_{2} {e_{2}}^{T} + λ_{3} e_{3} {e_{3}}^{T}$ (14) where λ₁ is again taken to be the largest eigenvalue and represents the maximum contrast, and e₁ is the eigenvector corresponding to λ₁. Only when the tensor matrix WT^′ of the fused image approximates WT as closely as possible can the fused image retain more of the structural features and geometric shape of the source images, that is: WT^′ = λ₁e₁e₁^T. The target gradient at pixel x is then constructed as $V (x) = \sqrt{λ_{1} (x)} \cdot e_{1} (x)$ (15)

However, since the eigenvector e₁ has two directions, the direction of the target gradient V (x) cannot be completely determined. To avoid this situation, we use the method in [28] to specify the gradient average direction of each source image as the direction of the fusion gradient. Therefore, the fusion gradient can be updated to: $V = \sqrt{λ_{1}} \cdot e_{1} \cdot sign 〈 e_{1}, \frac{1}{2} \sum_{i = 1}^{2} \nabla f_{i} 〉$ (16)
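Equations (15) and (16) can be sketched as below. The array layout (tensor field of shape (..., 3, 3), gradients of shape (..., 3)), the function name, the clipping of the eigenvalue against round-off, and the tie-break used when the inner product is exactly zero are assumptions of this sketch rather than details given in the source.

```python
import numpy as np


def target_gradient(WT, grad_list):
    """Eqs. (15)-(16). WT: (..., 3, 3); grad_list: source gradients, each (..., 3)."""
    vals, vecs = np.linalg.eigh(WT)               # eigenvalues in ascending order
    lam1 = np.clip(vals[..., -1], 0.0, None)      # largest eigenvalue, clipped against round-off
    e1 = vecs[..., :, -1]                         # eigenvector matching lam1
    mean_grad = sum(grad_list) / len(grad_list)   # average gradient of the source images
    sign = np.sign(np.sum(e1 * mean_grad, axis=-1))
    sign = np.where(sign == 0, 1.0, sign)         # added tie-break for a zero inner product
    return (np.sqrt(lam1) * sign)[..., None] * e1
```

If WT is built from a single gradient field g as g gᵀ and both sources share that gradient, the function returns g itself, including its sign, which is the behaviour Eq. (16) is meant to guarantee.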

3.5 Reconstruction from gradient domain

Now we need to reconstruct the fused image F from the fused gradient domain V. The gradient field of the reconstructed image F should be as close as possible to the fused gradient domain V, which can be expressed mathematically by the following minimization problem: $F = \arg \min_{F} {{∥ \nabla F - V ∥}_{F}^{2}}$ (17)

Further, we transform the previously extracted detail information into the gradient domain so that more of the useful details are fused; the details are preserved and enhanced by solving the optimization model: $F = \arg \min_{F} {{∥ \nabla F - (V + α \nabla D_{1} + β \nabla D_{2}) ∥}_{F}^{2}}$ (18) where the coefficients α, β control the relative weighting of the detail terms. If one of the source images is a morphological image and the other is a molecular image, only the detail component of the morphological image is used (i.e., α = 1, β = 0 or α = 0, β = 1), so that its details are reflected more clearly in the fused image. If both source images are morphological images, the coefficients are set as α = 0.5, β = 0.5.

Based on the calculus of variations, formula (18) reduces to the following equation: $Δ F - div V^{'} = 0$ (19) where Δ is the Laplacian operator, div is the divergence operator, and V′ = V + α ∇ D₁ + β ∇ D₂. The approximation of ∇F and divV′ can be found in [12]. Using the finite difference method, equation (19) is solved iteratively, where t is a small positive step size: $f^{t + 1} (i, j, k) = f^{t} (i, j, k) + t \cdot (Δ f^{t} (i, j, k) - div (V^{' t} (i, j, k)))$ (20)

For the initial image f⁰, we use the weighted combination $\sum_{k} \frac{w_{k} (x)}{\sum w (x)} f_{k} (x)$ with the local entropy weights given by (7). The number of iterations is set to 150, a value chosen by referring to [26] and to pre-experiments. In the pre-experiments, we set the number of iterations to 50, 100, 150, 200, and 250, and found that with more iterations (200 and 250) the quality of the fusion result is not significantly improved while the computational cost increases.
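The iteration in Eq. (20) can be sketched as explicit gradient descent with finite differences. The source refers to [12] for the discrete approximations, so the forward-difference gradient, the matching backward-difference divergence, the step size of 0.1, and the function names used here are assumptions of this sketch. With these operators the 3-D iteration is stable for step sizes below 1/6.

```python
import numpy as np


def grad(f):
    """Forward differences, zero at the far boundary; result has shape (3,) + f.shape."""
    g = np.zeros((3,) + f.shape)
    g[0, :-1] = f[1:] - f[:-1]
    g[1, :, :-1] = f[:, 1:] - f[:, :-1]
    g[2, :, :, :-1] = f[:, :, 1:] - f[:, :, :-1]
    return g


def div(v):
    """Backward-difference divergence, the negative adjoint of grad."""
    d = np.zeros(v.shape[1:])
    d[:-1] += v[0, :-1]
    d[1:] -= v[0, :-1]
    d[:, :-1] += v[1, :, :-1]
    d[:, 1:] -= v[1, :, :-1]
    d[:, :, :-1] += v[2, :, :, :-1]
    d[:, :, 1:] -= v[2, :, :, :-1]
    return d


def reconstruct(f0, V, step=0.1, iters=150):
    """Iterate f <- f + step * (Laplacian(f) - div V), written as step * div(grad f - V)."""
    f = np.array(f0, dtype=float)
    for _ in range(iters):
        f += step * div(grad(f) - V)
    return f
```

Because the discrete divergence sums to zero over the volume, the iteration preserves the mean intensity of the initial image, so the choice of f⁰ fixes the overall brightness of the result while V determines its structure.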

4 Experimental results and analysis

In this section, several experiments on multi-modal medical images are performed and analyzed to verify the effectiveness of the proposed method. The experimental data were downloaded from the websites of TCIA [32] and BraTS 2018 Data [33, 34]. The CT/PET data come from the TCIA website. Each lung CT image is 512×512×257 and each lung PET image is 128×128×257, with the same slice spacing of 3.2 mm. Each brain CT image is 512×512×134 and each brain PET image is 256×256×134, with the same slice spacing of 3.0 mm. The MR data come from the BraTS 2018 Data website; all volumes are 240×240×155 with the same slice spacing of 1.0 mm. The proposed method is applied to two groups of MR images (T1ce/T2). Before fusion, a preprocessing step is required to complete the feature and size adaptation.

In addition, several classical fusion methods, namely 3D-ST [26], ST-NSST [35], GTF [36], GFF [11], and NSCT-LLE [37], are compared with the proposed method. 3D-ST is a 3-D structure tensor model for non-small cell lung cancer PET/CT image fusion. ST-NSST is a multi-modal image fusion method based on the structure tensor and the non-subsampled shearlet transform (NSST). GTF is a method for infrared/visible fusion based on gradient transfer and total variation minimization. GFF is a fusion method based on guided filtering. NSCT-LLE is a medical image fusion algorithm that combines phase congruency and local Laplacian energy with NSCT. Because GTF was designed for infrared/visible fusion, which is similar to PET/CT fusion, it is used only in the PET/CT fusion experiments. Because GFF was designed for multi-exposure images, which are similar to T1ce/T2 images, it is used only in the T1ce/T2 fusion experiments. Moreover, 3D-ST and ST-NSST are carried out in 3-D, whereas GTF, GFF and NSCT-LLE are designed for 2-D images, so 3-D volumes are fused with these methods slice by slice.

4.1 Qualitative comparisons

First, we conduct experiments on fusing PET and CT images. Two groups of CT and PET images are shown in Fig. 4. Figure 5 shows the fusion results; the rectangles marked in red are regions of interest, which are enlarged to facilitate visual comparison. Every method fuses the important features of the source images, but there are some differences among them. From Fig. 5 (a) and (b), it is easy to see that the results of GTF and NSCT-LLE show the lesion clearly but lose some vessel details in the volume context, and the GTF results are dark because they retain only the brightness of the PET images. 3D-ST, ST-NSST and the proposed method all use the structure tensor to extract features, so the blood vessel information in their fusion results is richer than in the others, and the proposed method better maintains the brightness and contrast of the source images while keeping high definition. Although the lesion contour in Fig. 5 (e) is not as clear as in Fig. 5 (a), it can still be seen clearly and is better than with 3D-ST and ST-NSST. In conclusion, judging especially by the connectivity of the blood vessels and the sharpness of the lesion contour, the proposed method better reflects the advantages of the DST in extracting the deep structure information of 3-D images and connecting vessel breaks.

Fig.4

Two pairs of CT/PET source images: (a) (b) are the Lung1 images, (c) (d) are the Lung2 images. The rectangles marked in red are the regions of interest.

Fig.5

Fusion results of the Lung1 and Lung2 images. From the left to the right are fusion results of GTF method, NSCT-LLE method, 3D-ST method, ST-NSST method, and the proposed method, respectively.

Figure 6 shows the source images of the other three lung image pairs and two head image pairs, and Fig. 7 shows the fusion results of the different methods. Since the structural details are mainly contained in CT, almost every method performs well on detailed information, and the main differences lie in the connection of the vessels and the distortion of the PET images. We can clearly observe that the PET structures are over-smoothed in the fused images of Fig. 7 (a) and (b), specifically in Head1 and Head2, and that the vessels in Lung3–Lung5 are unclear compared with Fig. 7 (e). 3D-ST connects the vessels well but has comparatively low visual definition. Although the ST-NSST method retains the details in the image, the definition of its fused results is not as good as that of the proposed method. Generally, the proposed method is superior to the other methods in displaying details and in preserving the brightness and contrast of the source images in the fused image without introducing distortion.

Fig.6

Source images of Lung3–Lung5 and Head1–Head2.

Fig.7

Fusion results of the Lung3–Lung5 images and Head1–Head2 images. From left to right are the results of the GTF method, NSCT-LLE method, 3D-ST method, ST-NSST method, and the proposed method, respectively.

To further verify the applicability and performance of our method, we next conduct experiments on two groups of T1ce/T2 images. As Fig. 8 shows, the T1ce and T2 images contain complementary information because of their different imaging settings. Figure 9 (a)–(e) illustrates the fusion results obtained by the comparison methods and the proposed method. As seen in Fig. 9 (e), the contours of the ventricle and edema are shown with better sharpness in our fused results. Meanwhile, compared with the other methods, our method transfers more features of interest and better preserves the brightness and contrast of the source images, which indicates that the proposed method performs well in preserving both structural and detailed information when fusing T1ce/T2 images.

Fig.8

Source images of T1ce and T2 images.

Fig.9

Fusion results of the Head3–Head4 images. From left to right are the results of the GFF method, NSCT-LLE method, 3D-ST method, ST-NSST method, and the proposed method, respectively.

Table 1

Quantitative assessment of different fusion methods of CT/PET images

CT/PET Images	Methods	Q^E	Q^JE	Q^MG
Lung1	GTF	3.5553	3.7992	2.6037
	NSCT-LLE	3.5839	3.6199	2.8699
	3D-ST	3.9963	4.3610	2.6452
	ST-NSST	4.3303	4.7149	2.7116
	Proposed	4.5012	4.9671	2.7048
Lung2	GTF	2.4925	3.4092	2.6583
	NSCT-LLE	2.5905	2.2373	2.7032
	3D-ST	2.9819	3.7841	2.2248
	ST-NSST	3.1296	3.9529	2.3594
	Proposed	3.3506	5.1098	2.7887
Lung3	GTF	3.0925	3.4339	2.4603
	NSCT-LLE	3.0826	2.9296	2.2587
	3D-ST	3.2936	3.6368	2.3973
	ST-NSST	3.3717	3.8383	2.5029
	Proposed	3.5800	4.0257	2.5134
Lung4	GTF	3.6166	4.4635	2.9180
	NSCT-LLE	3.3286	3.9954	2.2437
	3D-ST	3.8830	4.8232	2.6776
	ST-NSST	4.0064	4.9983	2.7424
	Proposed	4.1707	5.1098	3.1200
Lung5	GTF	3.2292	3.6885	2.4743
	NSCT-LLE	2.8150	3.1732	2.6380
	3D-ST	3.2909	3.7275	2.4881
	ST-NSST	3.4388	3.9048	2.6183
	Proposed	3.6332	4.1206	2.6248
Head1	GTF	2.7131	5.2465	1.6070
	NSCT-LLE	2.6859	5.1293	1.4810
	3D-ST	2.9542	5.0368	1.5846
	ST-NSST	3.0039	5.2969	1.5971
	Proposed	3.1490	5.4757	1.6094
Head2	GTF	3.4013	4.1602	2.0176
	NSCT-LLE	2.9972	3.9615	1.7153
	3D-ST	3.3878	4.7195	1.7557
	ST-NSST	3.4298	5.1722	1.8993
	Proposed	3.5692	5.0940	2.0273

4.2 Quantitative comparisons

To further illustrate the effectiveness of the proposed method, information entropy (Q^E), joint entropy (Q^JE), mean cross entropy (Q^MCE) and mean gradient (Q^MG) are used as quantitative evaluation metrics for image fusion results [38]. Q^E evaluates the richness of information for the fused image. Q^JE reflects the joint information between the source image and the fused image. Q^MCE is used to measure the difference between the corresponding pixels of the source images and the fused image. Q^MG is a gradient-based quality metric to indicate the level of detail contrast expression.
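For orientation, the two entropy-based metrics can be sketched with their textbook histogram definitions. The exact formulas used in [38] are not reproduced in this paper, so the bin count, the base-2 logarithm, and the function names below are assumptions, and the sketch is not expected to reproduce the tabulated values.

```python
import numpy as np


def entropy(img, bins=256):
    """Shannon entropy (in bits) of the gray-level histogram of one image or volume."""
    hist, _ = np.histogram(img, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def joint_entropy(a, b, bins=256):
    """Shannon entropy (in bits) of the joint gray-level histogram of two images."""
    hist, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

Under these definitions, a volume split evenly between two gray levels has an entropy of one bit, and the joint entropy of an image with itself equals its own entropy.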

The experimental results are compared quantitatively according to the objective evaluation metrics, and the results are shown in Tables 1 and 2. In the fusion of CT/PET images, the DST connects the discontinuities of tubular structures such as blood vessels in the CT images during fusion, which makes the difference between the fused image and the source images large. Therefore, the Q^MCE metric is not meaningful for assessing the CT/PET fusion results, and we choose Q^E, Q^JE and Q^MG as the objective evaluation metrics for CT/PET fusion. For the fusion of T1ce/T2 images, it is more important to evaluate the difference between the fused image and the source images, and to evaluate whether the fused image effectively integrates the structure information of the lesion in the two source images. We therefore choose Q^E, Q^MCE and Q^MG as the objective evaluation metrics for T1ce/T2 fusion. The boldface in the tables indicates the best results of the objective evaluation metrics. Tables 1 and 2 show that the proposed method outperforms the other methods on almost all the metrics. In detail, the Q^E and Q^JE of the proposed framework in Table 1 are clearly higher than those of the other fusion methods, which demonstrates that our fusion method has an advantage in obtaining information from the source images. Besides, the Q^E and Q^MCE of the proposed framework in Table 2 indicate that the proposed framework preserves the structure information of the source images well. Although the Q^MG of NSCT-LLE is slightly higher than that of the proposed method in some cases, most of the fusion results obtained by the proposed method still have higher resolution. The objective comparison on the four metrics supports the advantage of the proposed method in preserving image structure information and reducing distortion.

Table 2
Quantitative assessment of different fusion methods of T1ce/T2 images

MR Images	Methods	Q^E	Q^MCE	Q^MG
Head3	GFF	1.9007	0.1122	1.1395
	NSCT-LLE	1.9843	0.0938	1.1723
	3D-ST	1.9353	0.1390	1.0128
	ST-NSST	1.8483	0.1204	1.0595
	Proposed	2.3463	0.1258	1.2918
Head4	GFF	1.9542	0.0782	1.3665
	NSCT-LLE	1.8817	0.0967	1.3291
	3D-ST	1.9693	0.0740	1.1968
	ST-NSST	2.0055	0.0914	1.2691
	Proposed	2.3238	0.0726	1.5324

5 Discussion

In this section, we discuss the applicability and limitations of the proposed fusion method. The purpose of this paper is to establish a fusion method suited to 3-D medical images, one that can effectively capture the structural continuity among slices and the hidden structural information. Therefore, in the gradient-based fusion method, it is necessary to establish a more ideal and accurate fused gradient domain and to make the gradient of the fused image as close as possible to it.

To construct this ideal fused gradient domain, we use the directional structure tensor weighted by local entropy to extract the local structure features of the image, and fuse the gradient features of the source images effectively via the coherence measurement function. As Fig. 1 shows, the directional structure tensor is capable of connecting discontinuous structures and capturing the information hidden in the voxels, while the local entropy-weighted directional structure tensor highlights the important structural features of the details in the source images and is conducive to the subsequent feature fusion. In addition, the approximation of the fused gradient domain is composed only of the maximum eigenvalue of the weighted directional structure tensor and the corresponding eigenvector, which may result in a small loss of continuous details. To address this problem, the details of the source images are extracted and added to the final fused gradient domain, and the weight of the detail components is increased to enhance the details in the fused image. By taking full advantage of the DST and the detail components, the proposed fusion method gives a satisfactory result in preserving structure information, capturing structural continuity and enhancing the vessel structures. In conclusion, the fusion method proposed in this paper can be applied to most 3-D medical images, and can even be applied to other fusion cases that need to obtain 3-D inter-component information. However, the method has a limitation: the 3-D weighted directional structure tensor has a slight diffusion effect, so it is not suitable for color images, where it may cause color distortion.

6 Conclusion

In this paper, a 3-D fusion method based on the directional structure tensor is proposed for the fusion of 3-D medical images. First, the eigenvectors of the traditional ST and the second-order derivatives of the source image are combined to construct the DST. Then, the gradient weights of the different source images are constructed using local entropy metrics, yielding a weighted directional structure tensor that extracts the important and continuous structural features. To highlight the detailed information in the fused image, we add the detail components to the gradient domain. Experiments are carried out on several 3-D CT/PET images and 3-D MR-T1ce/MR-T2 images to verify the effectiveness of the fusion method. The experimental results demonstrate that the fusion results obtained by this method are more complete and continuous in displaying the pulmonary vascular structure, and preserve the brightness and contrast of the source images in the fused image without introducing distortion. Future research is expected to focus on the fusion of noisy medical images.

Acknowledgments

This work was sponsored by the Natural Science Foundation of Shanghai (18ZR1426900) and the National Natural Science Foundation of China (61201067).

References

1. Schoder, Yeung H.W, Gonen, et al., Head and neck cancer: clinical usefulness and accuracy of PET/CT image fusion, Radiology 231 (2004), 65–72.

2. Cai, Chu J.C, Recine, et al., CT and PET lung image registration and fusion in radiotherapy treatment planning using the chamfer-matching method, International Journal of Radiation Oncology*Biology*Physics 43 (1999), 883–891.

3. Shahdoosti H.R, Mehrabi, MRI and PET image fusion using structure tensor and dual ripplet-II transform, Multimed Tools Appl 77 (2018), 22649–22670.

4. Donati O.F, Hany T.F, Reiner C.S, et al., Value of retrospective fusion of PET and MR images in detection of hepatic metastases: comparison with 18F-FDG PET/CT and Gd-EOB-DTPA–enhanced MRI, J Nucl Med 51 (2010), 692–699.

5. Gaemperli, Schepis, Valenta, et al., Cardiac image fusion from stand-alone SPECT and CT: clinical experience, J Nucl Med 48 (2007), 696–703.

6. Ogata, Nakahara, Ode, et al., 3D SPECT/CT fusion using image data projection of bone SPECT onto 3D volume-rendered CT images: feasibility and clinical impact in the diagnosis of bone metastasis, Ann Nucl Med 31 (2017), 304–314.

7. Grosu A.L, Weber W.A, Franz, et al., Reirradiation of recurrent high-grade gliomas using amino acid PET (SPECT)/CT/MRI image fusion to determine gross tumor volume for stereotactic fractionated radiotherapy, Int J Radiat Oncol 63 (2005), 511–519.

8. Nemec S.F, Donat M.A, Mehrain, et al., CT–MR image data fusion for computer assisted navigated neurosurgery of temporal bone tumors, Eur J Radiol 62 (2007), 192–198.

9. , Hu, Liu, Multi-focus image fusion based on joint sparse representation and optimum theory, Signal Process-Image 78 (2019), 125–134.

10. , He, Tao, et al., Joint medical image fusion, denoising and enhancement via discriminative low-rank sparse dictionaries learning, Pattern Recogn 79 (2018), 130–146.

11. , Kang, Hu, Image fusion with guided filtering, IEEE T Image Pr 22 (2013), 2864–2875.

12. Piella, Image fusion for enhanced visualization: A variational approach, Int J Comput Vi 83 (2009), 1–11.

13. Singh, Khare, Fusion of multimodal medical images using Daubechies complex wavelet transform – A multiresolution approach, Inform Fusion 19 (2014), 49–60.

14. Zhu Z.Q, Chai, Yin H.P, Li A.X, A novel dictionary learning approach for multi-modality medical image fusion, Neurocomputing 214 (2016), 471–482.

15. Bhavana, Krishnappa H.K, Multi-modality medical image fusion using discrete wavelet transform, Procedia Comput Sci 70 (2015), 625–631.

16. Lim W.Q, The discrete shearlet transform: a new directional transform and compactly supported shearlet frames, IEEE T Image Pr 19 (2010), 1166–1180.

17. M.N, Vetterli, The contourlet transform: an efficient directional multiresolution image representation, IEEE T Image Pr 14 (2005), 2091–2106.

18. Da Cunha A.L, Zhou, Do M.N, The non-subsampled contourlet transform: theory, design and applications, IEEE T Image Pr 15 (2006), 3089–3101.

19. Liu, Wang, Lu, et al., Multi-focus Image Fusion Based on Adaptive Dual-channel Spiking Cortical Model in Non-subsampled Shearlet Domain, IEEE Access 7 (2019), 56367–56388.

20. , Hu, Liu, et al., Noisy remote sensing image fusion based on JSR, IEEE Access 8 (2020), 31069–31082.

21. Liu, Wang, Lu, et al., Multi-Focus Image Fusion Based on Residual Network in Non-Subsampled Shearlet Domain, IEEE Access 7 (2019), 152043–152063.

22. Liu, Chen, Cheng, Peng, A medical image fusion method based on convolutional neural networks, IEEE 20th International Conference on Information Fusion (2017), 1–7.

23. Huang P.W, Chen C.I, Chen, et al., PET and MRI brain image fusion using wavelet transform with structural information adjustment and spectral information patching, IEEE International Symposium on Bioelectronics and Bioinformatics (IEEE ISBB 2014), IEEE, 2014, 1–4.

24. Petrovic V.S, Xydeas C.S, Gradient-based multiresolution image fusion, IEEE T Image Pr 13 (2004), 228–237.

25. Wang, Li, Tian, Multimodal Medical Volumetric Data Fusion Using 3-D Discrete Shearlet Transform and Global-to-Local Rule, IEEE T Bio-Med Eng 61 (2013), 197–206.

26. Wang Y.J, Liu, Three-dimensional structure tensor based PET/CT fusion in gradient domain, J X-ray Sci Tec 27 (2019), 307–319.

27. Yin, Tensor sparse representation for 3-D medical image fusion using weighted average rule, IEEE T Bio-Med Eng 65 (2018), 2622–2633.

28. Zhao, Xu, Zhao, Gradient entropy metric and p-Laplace diffusion constraint-based algorithm for noisy multispectral image fusion, Inform Fusion 27 (2016), 138–149.

29. Wang, Wang Y.J, Anisotropic diffusion filtering method with weighted directional structure tensor, Biomed Signal P 53 (2019), 101590.

30. , Janson, Directional structure tensors in estimating seismic structural and stratigraphic orientations, Geophys J Int 210 (2017), 534–548.

31. , Wang, Chen, Active contours driven by weighted region-scalable fitting energy based on local entropy, Signal Process 92 (2012), 587–600.

32. Clark, Vendt, Smith, et al., The cancer imaging archive (TCIA): maintaining and operating a public information repository, J Digit Imag 26 (2013), 1045–1057.

33. Menze B.H, Jakab, Bauer, et al., The multimodal brain tumor image segmentation Benchmark (BRATS), IEEE T Med Imag 34 (2015), 1993–2024.

34. Bakas, Akbari, Sotiras, et al., Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features, Nature Scientific Data 4 (2017), 170117.

35. Liu, Mei, Du, Structure tensor and nonsubsampled shearlet transform based algorithm for CT and MRI image fusion, Neurocomputing 235 (2017), 131–139.

36. , Chen, Li, Huang, Infrared and visible image fusion via gradient transfer and total variation minimization, Inform Fusion 31 (2016), 100–109.

37. Zhu, Zheng, Qi, et al., A Phase Congruency and Local Laplacian Energy Based Multi-Modality Medical Image Fusion Method in NSCT Domain, IEEE Access 7 (2019), 20811–20824.

38. Wang, Li, Cao, et al., Image fusion incorporating parameter estimation optimized Gaussian mixture model and fuzzy weighted evaluation system: A case study in time-series plantar pressure data set, IEEE Sens J 17 (2016), 1407–1420.