Abstract
In this paper, we formulate the joint/simultaneous X-ray CT and MRI image reconstruction. In particular, a novel algorithm is proposed for MRI image reconstruction from highly under-sampled MRI data and CT images. It consists of two steps. First, a training dataset is generated from a series of well-registered MRI and CT images on the same patients. Then, an initial MRI image of a patient can be reconstructed via edge-oriented dual-dictionary guided enrichment (EDGE) based on the training dataset and a CT image of the patient. Second, an MRI image is reconstructed using the dictionary learning (DL) algorithm from highly under-sampled k-space data and the initial MRI image. Our algorithm can establish a one-to-one correspondence between the two imaging modalities, and obtain a good initial MRI estimation. Both noise-free and noisy simulation studies were performed to evaluate and validate the proposed algorithm. The results with different under-sampling factors show that the proposed algorithm performed significantly better than those reconstructed using the DL algorithm from MRI data alone.
Keywords
Introduction
For diagnosis and therapy, it is highly desirable to integrate different modalities for comprehensive analysis, which is called multimodality imaging [1]. As the first generation of medical multimodality technology, PET-CT has had a major impact especially on cancer treatment [2]. With an integrated gantry, CT and PET images are sequentially acquired within a relatively short time period. These two types of images are thus in a good registration. The PET-CT system effectively solves the attenuation correction and image registration problems, being widely used in oncological settings [3]. After PET-CT, other multimodality imaging instruments such as SPECT-CT and PET-MRI were developed and accepted in various healthcare applications. For example, PET-MRI, a latest multimodality system, performs well in neurosurgery planning and cardiac examination, such as localization of epileptic foci and stroke [4 –7].
Nowadays, MRI has become an indispensable medical modality. However, a typical MRI procedure usually lasts ten minutes or even longer. It is uncomfortable for the patient to keep motionless surrounded by the huge and noisy MRI gantry in such a long time. Moreover, some organs movements such as heartbeat and spasm would cause motion artifacts. It has been demonstrated that the average displacement is over 0.35mm within 100 seconds when a healthy young adult is lying on the table, and this displacement would increase to 2.5mm for a patient [8, 9]. Therefore, it is significant to shorten MRI scanning time for better image quality. On the other hand, it is well known that the CT scanner only takes a few seconds to scan most parts of the human body, which is much faster than MRI. In addition, the spatial resolution of CT image is better than MRI as well. Therefore, although both CT and MRI are good at structural imaging, simultaneous MRI and CT imaging are an under-investigated yet interesting multimodality mode which can provide complementary information for healthcare benefits. Usually, MRI offers better soft tissue information while CT depicts sharper interfaces between air and tissue or tissue and bone. Hence, MRI-CT hybrid imaging in one machine was expected to provide superior image quality in both soft and hard tissues [10, 11]. Moreover, it is possible to greatly shorten the scanning time of MRI by reconstructing an MRI image from highly under-sampled data and CT data from the same subject [12, 13]. Actually, before the conception of the MRI-CT scanner Fahrig et al. worked to put the X-ray radiography devices into an MRI scanner [14, 15]. They found that the magnetic field influenced on the electron beam in the X-ray tube and proposed a method for handling the deflection of the electron beam. It was suggested that paramagnetic and ferromagnetism materials must be replaced by other non-magnetic material in the integrated device.
This paper focuses on the joint CT-MRI image reconstruction. A new algorithm is proposed to reconstruct an MRI image from highly under-sampled data with a good initial MRI image generated using a dual-dictionary learning (DDL) method from a well-registered MRI and CT image training set. In the next section, the dictionary learning (DL) method is introduced for medical image reconstruction. Then, an edge-oriented dual-dictionary guided enrichment (EDGE) scheme is proposed to reconstruct an MRI image from highly under-sampled MRI k-space data. The numerical results are presented in the third section. In the last section, the discussions are presented to inspire further efforts.
Method
Background
CT and MRI have different imaging principles. The CT image reconstruction is to solve the inverse problem of the integrals of the X-ray attenuation coefficient distribution in the human body. X-ray projections or integrals of linear attenuation coefficients is usually described as
Different from X-ray CT, each pixel value of an MRI image represents a property relevant to the resonance nuclear density (hydrogen protons) and two relaxation times, with the details depending on a specific pulse sequence. The scan signals, known as k-space data, are recorded by phase and frequency encoding. An image can be reconstructed by the inverse Fourier transform,
Although Equations (2-3) represent different models, the two reconstruction problems can be both transfered into similar mixed-norm regularization problems according to compressed sensing (CS) theory [16]. Moreover, though the physical principles and quantities are different for these imaging modalities, the human structural information in these images are correlated [17, 18]. Inspired by the correlative nature of CT and MRI datasets, we are motivated to study whether an MRI image can be reconstructed from less k-space data when the prior CT information of the same patient is available.
Recently, CS theory has served as a powerful tool in producing high quality image restoration and reconstruction from under-sampled data which may be much less than the requirements of the traditional Shannon/Nyquist sampling theory [19
–24]. According to the CS theory, a highly under-sampled image reconstruction problem is to solve an underdetermined system of linear equations F
u
x = y by minimizing the l
0 norm of a sparsified transformΨx. The corresponding optimization problem may be described as
This problem is also known as a sparse coding problem which is an NP-hard problem. Usually, the l 0 norm can be replaced by the l 1 norm because the l 1 norm minimization is a proxy for the l 0 sparsity constraint when the restricted isometry property (RIP) holds [19]. Then, the problem can be efficiently solved by linear programming in the real domain or second order cone programming in the complex domain.
DL is a promising implementation of CS theory by training a dictionary for efficient sparse representation [25, 26]. It aims to find a proper representation of data in adaptive low-dimensional subspaces. DL related methods are widely used, already giving significant improvements in the field of medical image reconstruction [27 –29].
Given an image of size , it can be decomposed into small patches of size , b << N. Each patch can be expressed as a vector . All the patches are extracted from the image according to a designed partition. A dictionary is a matrix consisting of k atoms (columns). An initial dictionary generated from sample images is usually redundant or over-complete. With this dictionary D, each vector x
p
of an image can be sparsely represented as
If an image contains S patches, DL is to find a dictionary that all the patches can be sparsely represented by
Many algorithms were proposed to solve the dictionary learning problem Equation (7) [25 , 30]. As one of the popular algorithms, the k-means singular value decomposition (K-SVD) algorithm iteratively updates the dictionary atoms to better fit data. It does SVD on the errors, and refines the current dictionary atoms and coefficients simultaneously. As a widely-used method, K-SVD enjoys an excellent performance in convergence and sparsity [25]. The DL method was reported to perform well for MRI reconstruction from under-sampled k-space data.
However, image features may be in large disparities across multimodality images or even within one modality, for example, dual-energy CT. These different images cannot be directly integrated into one dictionary. Thus, two or more dictionaries are required to describe image features in different domains. A number of dual-dictionary learning (DDL) methods were proposed to solve this problem, such as MRI-CT reconstruction, multi-energy CT reconstruction, image reconstruction from sparse data [12 , 31–33]. DDL enables us to establish a connection between different types of images using the prior information of two modalities.
In multimodality image reconstruction, the existing DDL algorithms usually use the Euclidean distances among different patches to choose the dictionary atoms, which work well for the multimodality images in consistent contrasts, or from a high contrast image to a low contrast destination; for example, estimating or reconstructing a CT image of the human brain from an MRI image using DDL [12]. However, these algorithms would perform poorly in reconstructing a high contrast image from a low contrast counterpart; for example, reconstructing an MRI image of the human brain from a CT image. Therefore, we are motivated to investigate an edge-oriented dual-dictionary guided enrichment (EDGE) alternative to solve the MRI reconstruction problem from the training dataset consisting of well-registered MRI and CT images subject to highly under-sampled k-space data and a complete CT dataset. The key is to use the relative positions guided by the edges and contours in the images to establish the connections among the patches instead of the Euclidean distance that can be misled by inconsistent contrast and grey-scalebias.
The EDGE algorithm has two steps. First, an initial MRI image is generated by an edge-oriented DDL algorithm from a training dataset consisting of well-registered MRI and CT images of selected previous patients, and the CT image of a current patient. The EDGE mechanism enables the two dictionaries well mapped and self-adaptive. Then, the estimated MRI image will be used as an initial value to facilitate the final MRI image reconstruction from highly under-sampled k-space data.
Step 1. Initial MRI image estimation using the EDGE strategy.
First, to establish two dictionaries D
MR and D
CT, we need to collect a series of MRI and CT images of patients. As a training dataset, these images may be gathered from different CT and MRI scanners. The dual-modality images of each patient require to be well-registered by structure similarity. For ahigh-resolution CT image , a patch vector (b is the patch size) can be represented by a specific dictionary according to DL theory as [27]
Using the orthogonal matching pursuit (OMP) algorithm, the patches in Equation (8) can be represented as [34],
When MRI and CT data of an object are obtained simultaneously on a hybrid CT-MRI scanner or sequentially in a consistent status, u
MR has a corresponding CT image u
CT. We assume that the patches and of these two modal images in registration satisfy
Equation (11) may be simplified to
Therefore, through sparse representation and dictionary learning, a high-quality patch can be sparsely coded by the same vector in the relationship of D
MR = Q · D
CT. It other words, the patch of MRI image can be approximately recovered by multiplying D
MR by the sparse representation obtained from the corresponding CT image and the dictionary D
CT if the atom mapping from the dictionary D
CT to D
MR is provided. That is,
Next, the key is to establish an atom-to-atom mapping between these two dictionaries of different imaging modalities. Here we propose an edged-orientated method to establish such a mapping. It was proved that the public edge information between the well-registered MRI and CT images was useful to improve the MRI image quality in MRI-CT hybrid imaging [35]. Different from that method, here the edge information in the low contrast CT images is used to choose the atoms of the CT dictionary to sparsely code a given CT image, instead of the traditional Euclidean distance measure.
For a given CT image u CT, some sample CT images are first chosen from the initial dataset whose structure information is close to the target image. Then, a specific dictionary D CT may be generated from the patches of these sample images. To represent each patch of the given CT image, we perform an edge-based searching to select the closest atoms in D CT.
As shown in Fig. 1, (a) is the MRI image of a patient’s brain to be reconstructed, (b) is the corresponding CT image at the same slice, (d) and (e) are the MRI and CT sample images selected from the training dataset. Without loss of generality, let us consider a patch of the CT image (b) whose center locates at N
0 (x
0, y
0). To find the patches in Fig. 1(e) similar to , we use their location from the boundaries instead of the Euclidean distance of the grayscale values because the contrast of this CT image of the human brain is rather low. Thus, the high contrast skull edges in Fig. 1 (b) and (e) are first segmented and extracted. Two contours can be respectively outlined by single-pixel-wide edges as shown in Fig. 1(c) and (f). The four purple points in Fig. 1(c) are the intersections of the contour and the horizontal and vertical lines through the point N
0 (x
0, y
0). The distances between N
0 and these four intersections can be calculated by
We can compute two distance ratios of the above distances along the x- and y-axes respectively,
Assume that in the sample image Fig. 1(e) N
1 (x
1, y
1) is the closest patch or atom to N
0 (x
0, y
0). As shown in Fig. 1(f), which is the contour image of (e), the distances between N
1 and the four intersections on the edge can be calculated using the equation similar to Equation (14). Then, two distance ratios of these distances with respect to N
1 may be given by
Let k
x_N
1
= k
x_N
0
and k
y_N
1
= k
y_N
0
, we have
For any given patch of the CT image Fig. 1(b), a corresponding patch can be determined in the sample image Fig. 1(e) using Equation (17). That is, Equation (17) establishes an edge-oriented atom mapping from the patient’s CT image to the sample CT image, which selects the atoms having the same distance ratios defined by Equation (15).
However, because the values of related to the location of N 1, (x 1, y 1) cannot be directly determined by one forward calculation of Equation (17), generally we need to use an iterative procedure in Fig. 2 to solve this problem, which is a crucial DDL step in our EDGE algorithm.
In this paper, the central pixel of this atom was determined by being rounded to the nearest integer from the output (x 1, y 1). Furthermore, as shown in Fig. 1, once the patch N 1 (x 1, y 1) is selected from the sample CT image Fig. 1(e), its corresponding patch M 1 (x 1, y 1) is also determined at the same position in the sample MRI image Fig. 1(d) because these two sample images are presumably well-registered.
To each patch of the known corresponding CT image Fig. 1(b), , its dictionary consists of the atoms selected by the procedure in Fig. 2 from the sample CT images. To improve the accuracy, the atom in each dictionary triples its length by adding the first-order gradient vector of each patch along x- and y- directions respectively [32]. Then, the sparse representation of each with respect to can be calculated using the OMP algorithm. Given on the coherence between the CT and MRI images, a correlative dictionary can be generated from the corresponding well-registered MRI image. Thus, each patch of the reconstructed MRI image can be approximately recovered by Equation (13) from the corresponding spare representation . Once all the patches are obtained, the final MRI image can be reconstructed by combining these patches. Figure 3 is a general workflow of the EDGE algorithm to reconstruct an initial MRI image u MR.
Note that the above EDGE algorithm requires that the single-pixel contour should be a convex and closed curve as shown in Fig. 1. Otherwise, there may have more than four intersections of the contour and those horizontal and vertical lines through the point N 0 (x 0, y 0). Figure 4 shows the general process to get this convex contour. First, the high contrast bone edges, Fig. 4(b), can be extracted using the common threshold method from the original CT image. Then, the largest connected domain is selected with a certain area threshold as shown in Fig. 4(c). Figure 4(d) is the minimum convex set containing the connected domain of Fig. 4(c). Finally, we can extract the convex set and its single-pixel contour shown in Fig. 4(d) and (e) respectively.
Step 2. High-resolution MRI image reconstruction using DL from highly under-sampled k-space data and an initial image.
Once an MRI image is obtained by Step 1 from its counterpart CT image and two dictionaries D
CTand D
MR, it will be used as an initial input in the following high-resolution MRI image reconstruction from highly under-sampled k-space data. Based on the CSMRI model described in [27], the image reconstruction problem can be approximately described as
Equation (18) is a common l 2 norm minimization problem which can be solved directly using a classical iterative optimization technique, such as Gauss-Seidel and conjugate gradient (CG). The detailed solution procedure is summarized in Fig. 5.
In Equation (19), y (k x , k y ) denotes the updated value at the location (k x , k y ) in the k-space, S 0 represents the original k-space data which are highly under-sampled, Ω is a subset of k-space that has been sampled, and S = FFT (x (n)) represents the k-space data generated from the current iterative image x (n).
In this section, the EDGE algorithm was evaluated by using the well-registered clinical MRI and CT images on human brain provided by the Visible Human Project of the National Library of Medicine (http://www.nlm.nih.gov). The MRI images have 256×256 pixels with the size of 1.01562×1.01562 mm2, while the CT images have 512×512 pixels with the size of 0.527344×0.527344 mm2. The highly under-sampled k-space data was simulated by subsampling the Fourier transform of original (test) images using a corresponding under-sampling mask.
In the simulation, 5 pairs of CT and MRI images were selected to construct two dictionaries. First, CT images were down-sampled into 256×256 pixels and re-registered with the same pixel size as MRI images. Thus, for each image patch to be represented, its corresponding dictionaries D CT and D MR have 135 atoms by selecting the nearest 9 patches to the each patch center. The patch size was 9×9 with the sliding distance of 2 pixels. After the initial MRI image was reconstructed by the Algorithm 1 described as Fig. 3 (Step 1 of EDGE), it is used as an initial value of the Algorithm 2 described as Fig. 5 (Step 2 of EDGE). Moreover, in the implementation of Fig. 5, the iteration number of K-SVD in step 1) was set 15. The block size is 8, and the sparseness level is 9. In step 2), all the atoms were used in OMP iteration. And, its convergence condition ε OMP = 0.023.
The results of the EDGE algorithm were compared with the reconstruction of the traditional DL-MRI method [27]. In addition, we used the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) for quantitative image quality assessment [16]. The PSNR is computed as the ratio between the peak intensity value of the reference image and the root-mean-square of the reconstruction error. The SSIM between 0 and 1 compares local patterns of pixel intensities that have been normalized for luminance and contrast. All implementations were coded in Matlab 7.11.0 (R2010b).
Noiseless simulation
The sample CT and MRI images in Fig. 6 were selected to generate the dictionaries of these two modalities. The images in each column have the same size and are well-registered. The images in the 3rd row are the closed single-pixel edges of the skulls in the CT images of the 1st row. Figure 7 shows the test MRI and CT images of the same patient. (a) is the test CT image. (b) is the closed single-pixel edge of the skull extracted from (a). (c) is the target MRI image to be reconstructed. (d) is the initial MRI image reconstructed by the step 1 of EDGE (Algorithm 1) from the training data set consisting of the five pairs of well-registered MRI and CT images from other patients shown in Fig. 6, and the CT image of this same imaged object shown in Fig. 7(a). (e) is the reconstructed MRI image by the traditional DDL algorithm without edge-oriented mapping [12]. According to the true value of the target MRI image, calculate the MSE (mean squared error) and SSIM of these two reconstructed results. The MSE and SSIM of Fig. 6(d) are 8720.3 and 0.8003, respectively. However, the MSE and SSIM of Fig. 6(e) are 9909.5 and 0.7043, respectively. Enhanced by the adaptive edge information, EDGE algorithm can provide much better image quality than the traditional DDL algorithm.
Figure 8 shows the final MRI reconstruction results from the 20 folds under-sampling k-space measurement data. (a) is the test MRI image. (b) is the 20 folds under-sampling matrix in k-space. (c) is the reconstruction by the Fourier transform algorithm with zero-filling of measured data. (d) is the initial MRI image reconstructed by the Step 1 of EDGE. (e) is the reconstruction by the traditional DL-MRI algorithm with the initial image (c) [27]. (f) is the reconstruction by the EDGE algorithm with the initial image (d). The MSE of the reconstructed image (f) is 0.0065 which is much lower than the MSE of the reconstructed image (e), 0.0564. It can also be seen that the EDGE reconstruction has much cleaner background and less errors than the results of DL-MRI.
Figure 9 represents the PSNR as a function of the number of iterations for the EDGE and traditional DL-MRI algorithms respectively. After 15 iterations, for the DL-MRI reconstruction the PSNR and SSIM are 12.5128 dB and 0.6292, respectively. In contrast, the PSNR and SSIM of EDGE reconstruction are 27.7277 and 0.9322. Moreover, according to the same iterative convergence condition, our algorithm can stop when the iteration number equaled 6, while DL-MRI method stopped more than 15 iterations. Figure 10 shows the PSNR curves for different under-sampling folds in k-space measurements. We can find that the result of EDGE has a slow decline. However, the result of DL-MRI has a rapid fall after 14 folds.
Benefiting from the edge-oriented patch mapping between D CT and D MR, similar structural information are preserved within these two dictionaries. The initial MRI image reconstructed by EDGE can be regard as a highly under-sampling MRI image whose under-sampling factor is about 14. Therefore, as shown in Fig. 10, when the under-sampling factor is higher than 13, EDGE keep a much better PSNR than DL-MRI. The mean PSNR of the EDGE is about 18 dB higher than the DL-MRI result when the k-space is 25 under-sampled.
Noisy data simulation
In the noisy data simulation, the k-space data were added by the Gaussian noise with zero mean and standard deviation σ. And, the k-space data was 20 folds under-sampled. Figure 11 shows the results with the noise level σ = 10 reconstructed by DL-MRI and EDGE respectively. It may be found that the reconstruction of EDGE has a higher PSNR and cleaner background than that of DL-MRI. Table 1 shows the mean PSNR values with different noise levels for EDGE and DL-MRI reconstructions. It can be seen that the PSNR values for EDGE are much better than the results of DL-MRI from either noise-free or noisy data. In the case of high level noise, increasing the number of dictionary atoms, such as 10 or 15 (each image generates two or more atoms), may provide a better reconstruction result.
Discussions and conclusion
In conclusion, we proposed a new edge-oriented DDL MRI image reconstruction algorithm from highly under-sampled k-space data in order to decrease the MRI data acquisition time in the MRI-CT scanner. The key step is to establish a one-to-one mapping relationship between the training CT images and MRI images by using the edge-oriented mapping. Without any measured MRI data an initial MRI image can be generated within this new algorithm from the CT image of the target patient and the training CT and MRI data sets, which proved to be useful to improve the MRI image reconstruction from high under-sampled k-space data in this paper. This method is expected to be applied in fast MRI imaging where requires high temporal resolution such as dynamic cardiac imaging. Though the algorithm in this paper was derived and simulated in MRI-CT simultaneous imaging, we expect it can be used in other multi-modality imaging, e.g., the Omni-tomography including CT, MRI, PET, SPECT and optical image [10].
As mentioned above, the dictionary mapping in the EDGE algorithm decides the quality of the initial MRI image. Thus, the performance of EDGE is highly related to the mapping accuracy between the training CT and MRI images. The mismatch of the pairs of atoms in the two dictionaries D CT and D MR may damage their mapping relationship. Therefore, the MRI and CT images in the training data set should have the same pixel size and be well registered in data pre-processing stage. Moreover, because the edge information is extracted and matched from the training CT images and the patient’s CT image, we need to ensure that these two kinds of CT images have the similar and high image qualities, which means that the MRI-CT scanner should provide the similar CT image quality to the current clinical X-ray CT.
Footnotes
Acknowledgments
This work was partly supported by the grants from NSFC 61571256 and 81427803, Beijing Excellent Talents Training Foundation (2013D009004000004), and Beijing Municipal Science & Technology Commission (Z151100003915079).
