Abstract
BACKGROUND:
Recent studies have explored layered correction strategies, employing a slice-by-slice approach to mitigate the prominent limited-view artifacts present in reconstructed images from high-pitch helical CT scans. However, challenges persist in determining the angles, quantity, and sequencing of slices.
OBJECTIVE:
This study aims to explore the optimal slicing method for high pitch helical scanning 3D reconstruction. We investigate the impact of slicing angle, quantity, order, and model on correction effectiveness, aiming to offer valuable insights for the clinical application of deep learning methods.
METHODS:
In this study, we constructed and developed a series of data-driven slice correction strategies for 3D high pitch helical CT images using slice theory, and conducted extensive experiments by adjusting the order, increasing the number, and replacing the model.
RESULTS:
The experimental results indicate that indiscriminately augmenting the number of correction directions does not significantly enhance the quality of 3D reconstruction. Instead, optimal reconstruction outcomes are attained by aligning the final corrected slice direction with the observation direction.
CONCLUSIONS:
The data-driven slicing correction strategy can effectively suppress the artifacts in high-pitch helical scanning. Increasing the number of correction directions does not significantly improve the quality of the reconstruction results; the best reconstruction quality is achieved when the final correction angle is consistent with the observation angle.
Introduction
Data-driven 3D medical image restoration presents a complex and formidable challenge, requiring the recovery of pristine images from damaged or corrupted 3D images riddled with artifacts [1–3]. In recent years, end-to-end deep learning methods have become increasingly prevalent in tasks such as segmentation, generation, denoising, and artifact removal. Nevertheless, the complexity and computational demands of building end-to-end 3D networks for 3D image restoration, coupled with hardware limitations, severely restrict the size of images that can be effectively processed. Therefore, the majority of deep learning models are designed for 2D domains.
However, the 3D features of medical images are of great significance for clinical diagnosis. Many diseases cannot be diagnosed accurately by relying solely on 2D slices [6]. For example, it may not be possible to fully assess the shape, size, and location of lung nodules using only 2D slices, although these properties are crucial for determining whether the nodules are benign or malignant [7, 8]. For cardiovascular diseases such as aneurysms or heart valve abnormalities, 3D images are essential for evaluating the shape, diameter, and blood flow within the vessels [9]. In addition, joint diseases such as arthritis or cartilage injury also require 3D images to assess the structure and extent of damage to the joint [10]. Therefore, numerous previous studies have transformed 3D objects into 2D planes through slicing and used 2D end-to-end networks to restore each slice individually, after which the restored slices are stacked to recover the 3D image. However, these studies varied in how they sliced and rectified the 3D images. Some sliced the 3D volume along a single direction and recovered it layer by layer, while others sliced it along several directions separately and recovered it multiple times before stacking the rectified 2D images.
High-pitch helical CT scanning can effectively accelerate scanning and reduce radiation dose in clinical diagnosis. For this scanning method, this article aims to answer three questions through theory and experiment. First, which slicing direction yields the best results when employing single-directional slicing correction? Second, can multi-directional corrections significantly enhance recovery accuracy compared to single-directional corrections? Third, how can 3D volume slicing and correction be optimized for maximum benefit with limited computing resources? Unlike methods in materials science that statistically estimate the possible composition of 3D anisotropic media from multiple perspectives [11], our research focuses on constructing realistic and accurate 3D images through slicing methods rather than estimating them. To cover as many damaged-image scenarios as possible and to fit real clinical application scenarios, our experiments use helical computed tomography, the most commonly used CT structure in clinical diagnosis. By adjusting the pitch, we introduced sparse damage between layers and limited-angle damage within layers, ensuring that the experimental images are damaged along all directions [12]. Building upon this foundation, we apply image post-processing techniques for image restoration [13], further validating our findings. Reconstructing 3D images affected by such damage has long been a formidable obstacle in clinical medical diagnosis. Restoring such 3D images through data-driven methods is one of today’s most prominent research directions [14] and is crucial for improving the quality of medical imaging and the accuracy of diagnosis and treatment.
Through the experimental results of this study, we hope to draw conclusions and confirm whether a single directional correction is sufficient and whether multiple directional slices and one-by-one correction can significantly improve recovery accuracy. This will help to improve the methods and technologies of 3D image recovery and provide more reliable and high-quality 3D image recovery solutions for practical applications.
This article represents the first exploration of slicing methods in the medical 3D image restoration process. Our principal contributions are as follows. First, we establish a general theoretical model for 3D volume correction using slice methods, applicable to data reconstruction and correction in 3D helical scanning scenarios. Second, through extensive experimentation, we validate the effects of correction direction, sequence, and their combinations on the improvement of correction results, and provide a quantitative analysis. Third, we assess the performance of CNN, GAN, and Transformer architecture networks in 3D volume correction under different pitch conditions, providing a reference for selecting the optimal network architecture. Fourth, for clinical practices employing high-pitch and accelerated helical scanning methods, we propose strategies and recommendations for data reconstruction and correction to optimize image quality and reduce artifacts.
The rest of this paper is organized as follows. Section 2 reviews the relevant research on using 2D slices for 3D volume correction. Section 3 establishes a 2D slice correction model for a 3D image and details the possible sources of error in the correction process, as well as the experimental scenario and setup. Section 4 presents and analyzes the experimental results. Section 5 discusses the experimental results in depth. Finally, Section 6 summarizes this work and gives our suggestions.
Background
Damaged 3D images are closer to the real clinical situation, and more than half of clinical diagnoses require the use of 3D data. The restoration of damaged 3D medical images is important for the accuracy of clinical diagnosis [15]. In previous studies, data-driven 3D feature recovery has been widely applied to recover 3D image damage caused by various scenarios such as low dose, metal artifacts, and motion artifacts. However, the computational complexity of 3D end-to-end deep learning networks is often too large, and they are limited by the memory size of graphics processors, so they can only handle small 3D volumes [16]. Therefore, most research focuses on 2D images or on slicing 3D objects and then using end-to-end networks for learning. Initially, to reduce computational complexity, slices along one direction were usually used for correction. For example, a classic study used a Wasserstein Generative Adversarial Network (WGAN) to reconstruct 2D computed tomography (CT) slice images from a limited number of projection images and then performed longitudinal superposition to complete 3D image reconstruction [17]. This is a general method that can be used by almost all deep learning networks with end-to-end nonlinear fitting capabilities. In addition, the Department of Mechanical Engineering at Tsinghua University constructed a 3D slicing reconstruction model (3DSR) to super-resolve and stitch 3D volumes layer by layer, improving accuracy while reducing data acquisition resources and storage space during the 3D modeling process [18].
However, research has found that there are always problems with the accuracy of using 2D slices to characterize 3D features [19], especially in scenarios where there is a significant pixel shift along the direction perpendicular to the slice. In complex textures or porous microstructures [20], single-direction correction cannot meet the accuracy requirements of 3D reconstruction [21]. Therefore, many studies have attempted to improve the inter-layer consistency during single-direction correction. For example, Patrik Kamencay et al. used SURF (Speeded-Up Robust Features) descriptor and SSD (Sum of Squared Differences) matching algorithm to improve the quality of 3D reconstruction under slice methods using the idea of inter-layer feature point matching [22], but the drawback is longer computation time. L Salvolini et al. proposed a fast multi-GPU acceleration framework for slice-to-volume reconstruction to address the problems of motion artifacts and slow algorithms in MRI imaging. The proposed method ensures high reconstruction accuracy by accurately calculating the point spread function (PSF) for each input data point [23]. K Choi et al. first used the correlation between slices for model training in the context of low-dose CT (LDCT). They constructed a self-supervision framework (Corr2Self) to train a CNN denoiser. The proposed method incorporated the inter- and intra-slice correlation of LDCT images into the self-supervision learning procedure by using adjacent slices and thicker slices, without unnecessary radiation dose transfer [24]. In addition, some simple techniques can also increase the 3D consistency of 2D slices in the perpendicular direction. For example, K Choi et al. used multiple consecutive slices as inputs in low-dose CT (LDCT) reconstruction tasks, which is essentially also a 3D reconstruction network, but significantly reduces the computational load to a certain extent [25].
Another method is to slice and correct the 3D volume from different directions in multiple stages. This method often selects orthogonal slicing angles to minimize the possible shift of features in 3D space. Although this method can further improve the accuracy of 3D feature recovery, it is not widely used due to cumbersome steps and exponentially increased recovery time. For example, JW Hayes et al. used a U-Net network to correct the 3D volume from two directions, the coronal and sagittal planes, in high-pitch clinical CT, which can simultaneously eliminate finite-angle artifacts in sections and sparse artifacts between layers [26]. In addition, in the field of fast multi-layer magnetic resonance imaging (MRI), K Kim et al. adopted a method called slice intersection motion correction (SIMC) to directly align multiple slice stacks by obtaining the matching structure of orthogonal slices and all intersecting slices, to provide a single high isotropic resolution 3D image. The registration effect of this method has significantly improved compared to previous methods [27].
However, the above studies all assume the validity of slicing and have not explored the impact of slicing methods or angle selection on 3D volume recovery from a theoretical or experimental perspective. To our knowledge, our study is the first to attempt to investigate the effectiveness of single-direction and multi-direction slicing and their effects on 3D volume accuracy recovery.
Methods
2D slicing methods for high pitch helical CT volume correction
Most image translation work directly targeted at 3D images uses 3D convolution operations in the network encoding stage, which requires a large amount of memory, so it can only handle very small-sized 3D volumes. Therefore, the current approach is to correct damaged 3D images by slicing them layer by layer. Taking the 3D body structure of the chest and abdomen of the human body as an example, the slicing method along the sagittal plane (x-axis) is shown in Fig. 1(b). In addition, there are also common slicing methods along two mutually perpendicular coronal planes as shown in Fig. 1(c) and (d). The above process can be generalized to the general form of slicing 3D images at any angle. Given the angle θ, a unit vector

Three reconstruction slicing methods for 3D volumes.
To describe the slicing process more generally, let us assume that we have a plane
The deep learning method that emerged in recent years abandoned the idea of accurately solving the analytical methods from a mathematical perspective but instead constructed a nonlinear space to realize image-to-image transformation. In theory, with sufficient training parameters, the deep learning model can be trained to approximate any complex space, thus achieving damaged image restoration. Especially in the field of repairing damaged medical images, this process can also be vividly referred to as “image translation”.
Assuming we have a 3D volume
The entire 3D volume
If we assume that a deep learning network model is denoted as
The loss of applying deep learning networks in each layer slice in each direction can be expressed as:
When considering multiple directions, we can add up the error functions in each direction and then minimize the overall error function. Assume there are D directions with correction parameter vectors {x1, x2, …, xD}, where each xd is the correction parameter vector for the corresponding direction d. The overall error function can be expressed as:
Therefore, our goal is to find the optimal set of correction parameter vectors
This problem can be formalized as a multivariate function optimization problem:
This single network objective function can be described as follows:
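As a hedged sketch of the formulation described above, the per-direction loss, the overall error function, and the optimization problem can be written as follows. Apart from D and the parameter vectors x_d, all notation is assumed here for illustration: V is the damaged volume, G the ground-truth volume, S_d^i(·) extracts the i-th slice along direction d, N_d is the number of slices in that direction, and f(·; x_d) is the 2D network. The squared L2 norm is also an assumption; the networks used in the experiments may be trained with other losses.

```latex
% Per-direction slice loss (assumed squared L2 form)
\mathcal{L}_d(x_d) = \frac{1}{N_d} \sum_{i=1}^{N_d}
    \left\| f\!\left(S_d^{i}(V);\, x_d\right) - S_d^{i}(G) \right\|_2^2

% Overall error function summed over the D correction directions
E(x_1, \dots, x_D) = \sum_{d=1}^{D} \mathcal{L}_d(x_d)

% Multivariate optimization problem
\left(x_1^{*}, \dots, x_D^{*}\right) = \arg\min_{x_1, \dots, x_D} E(x_1, \dots, x_D)
```

With a single correction direction (D = 1), the sum reduces to one term, which corresponds to a single-network objective.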
To verify the influence of the correction angle on the recovery of 3D volumes, we used a variety of classic deep learning models from the field of computer vision. The models and the experimental setup are briefly introduced as follows. U-Net is a well-known neural network architecture with an encoder-decoder structure that is extensively employed in tasks such as image segmentation and restoration [28]. The U-Net used in the experiment has the classic five-layer structure with a maximum downsampling rate of 16. GAN methods are widely used in restoring damaged medical images because they can adaptively generate complex images by refining the loss function during training. Among these methods, pix2pixGAN [29] is notable. It learns the mapping between input and target images by training the generator and the discriminator simultaneously. ResNet-9 was used as its generator in the experiment, while PatchGAN was used as the discriminator. CycleGAN [30] is an image translation model that, unlike pix2pixGAN, does not require paired training data and can convert between different domains, such as converting horse images to zebra images. Swin-Unet [31] is an important work in the field of image segmentation that combines the characteristics and advantages of the Swin Transformer and U-Net. Compared to traditional convolutional neural networks, it has excellent modeling capabilities and handles long-distance dependencies effectively. The code used in the experiment can be found in reference [31].
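The stated maximum downsampling rate can be checked with a few lines of arithmetic. The sketch below assumes that "five-layer" means five resolution levels connected by four 2× downsampling steps, and it uses the slice sizes quoted in the data preparation paragraph; the helper name `downsampling_rate` is hypothetical.

```python
def downsampling_rate(num_levels, factor=2):
    """Total downsampling of an encoder with `num_levels` resolution levels.

    Assumes one `factor`-times downsampling step between consecutive levels,
    i.e. num_levels - 1 steps in total.
    """
    return factor ** (num_levels - 1)


rate = downsampling_rate(5)  # four 2x steps

# Slice sizes quoted in the text for the training data.
slice_sizes = [(288, 288), (512, 512)]

# Spatial size of the feature map at the deepest level for each slice size.
bottleneck_sizes = [(h // rate, w // rate) for h, w in slice_sizes]

# Both sizes should be divisible by the rate so that the decoder can
# restore the original resolution without extra padding or cropping.
divisible = all(h % rate == 0 and w % rate == 0 for h, w in slice_sizes)
```

Under these assumptions both slice sizes pass through the encoder without any resizing.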
In the experiment, we wrote the program in Python, implemented all deep learning methods using PyTorch, and ran them on a computer equipped with an Intel(R) Xeon(R) Silver 4210 R CPU@2.40 GHz and an eight-card NVIDIA RTX3090 GPU. The experimental procedure is outlined in Fig. 2, illustrating the overall flow of the experiment.

Experimental steps and end-to-end network architecture diagram for 3D volume slice validation.
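The slice-wise correction steps described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `correct_along_axis` and `correct_in_sequence` are hypothetical, and `identity_model` is a placeholder standing in for a trained 2D network.

```python
import numpy as np


def correct_along_axis(volume, axis, correct_slice):
    """Slice `volume` along `axis`, correct each 2D slice, and restack."""
    # Move the slicing axis to the front so that moved[i] is the i-th slice.
    moved = np.moveaxis(volume, axis, 0)
    corrected = np.stack([correct_slice(s) for s in moved], axis=0)
    # Put the slicing axis back where it was.
    return np.moveaxis(corrected, 0, axis)


def correct_in_sequence(volume, steps):
    """Apply single-direction corrections one after another.

    `steps` is a list of (axis, correct_slice) pairs; the output of one
    step is the input of the next, e.g. x, then y, then z.
    """
    for axis, correct_slice in steps:
        volume = correct_along_axis(volume, axis, correct_slice)
    return volume


def identity_model(slice_2d):
    """Placeholder for a trained 2D network (hypothetical)."""
    return slice_2d


volume = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
restored = correct_in_sequence(
    volume, [(0, identity_model), (1, identity_model), (2, identity_model)]
)
```

Passing a single `(axis, model)` pair gives single-direction correction, and reordering the list changes the correction sequence, which are the two factors varied in the experiments.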
In theory, achieving precise 3D helical FDK algorithms requires satisfying two data completeness conditions: (1) the scanning angle interval must be sufficiently small, and the rays must be dense enough to prevent sparse angle artifacts; (2) the scanning trajectory needs to meet the Tuy condition to prevent helical artifacts between layers and limited-angle artifacts within layers. Only by meeting these conditions can it be ensured that the rays adequately irradiate all voxels and achieve accurate reconstruction. Literature [26] has already demonstrated that in helical scanning trajectories, as shown in Fig. 3, ensuring a standard pitch P < 1.375 satisfies the data completeness condition. In practice, some CT vendors have also introduced data extrapolation schemes to extend the pitch to P = 1.5. To fully demonstrate the correction effect of end-to-end deep learning algorithms, we selected scanning conditions with pitches of 2, 3, and 4, as shown in Fig. 4.
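To make the pitch condition concrete, the following sketch uses the common definition of helical pitch as table feed per gantry rotation divided by the total collimated beam width, and compares it with the completeness bound of 1.375 quoted above. The function names and the 40 mm beam width are hypothetical values chosen for illustration.

```python
COMPLETENESS_LIMIT = 1.375  # bound quoted in the text for data completeness


def helical_pitch(table_feed_per_rotation_mm, beam_width_mm):
    """Pitch = table feed per rotation / total collimated beam width."""
    if beam_width_mm <= 0:
        raise ValueError("beam width must be positive")
    return table_feed_per_rotation_mm / beam_width_mm


def is_data_complete(pitch, limit=COMPLETENESS_LIMIT):
    """True if the pitch stays below the completeness bound."""
    return pitch < limit


# Hypothetical scanner geometry: 40 mm beam width, 80 mm feed per rotation.
example_pitch = helical_pitch(80.0, 40.0)

# A standard pitch of 1 and the three pitches used in the experiments.
status = {p: is_data_complete(p) for p in (1.0, 2.0, 3.0, 4.0)}
```

All three experimental pitches fall outside the bound, which is why the FDK reconstructions contain missing-data artifacts for the networks to correct.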

Scanning schematic diagram of different pitches. When the standard pitch is less than 1, the scanning trajectory rotates one revolution and is still within the detector plane range. Otherwise, it will cause the projection data to be missing along the detector direction.

Reconstruction results of the FDK algorithm for standard helical trajectories under different pitches: (a-4), (b-4), (c-4), and their cross-sections (a-2), (b-2), and (c-2); schematic diagram of the coronal plane (a-3), (b-3), (c-3). As the pitch increases, the severity of damage to the 3D reconstruction results and slices in all directions gradually increases, and the training difficulty faced by the end-to-end model also gradually increases. The display windows are [–1000 1000] HU.
To adapt to the 3D characteristics in this study, we used the AbdomenCT-1K dataset, a large-scale abdominal CT dataset that includes 1112 CT scans for the segmentation of four abdominal organs: the liver, kidney, spleen, and pancreas. To facilitate the training of end-to-end deep learning models, we first proportionally scale all 2D slices along the x-axis from the original size of 512 × 512 to 288 × 288. Then, along the y-axis, we symmetrically pad the edges of volumes that are shorter than 512 pixels and center-crop those exceeding 512 pixels. Finally, all 3D volumes have a size of 288 × 288 × 512, and the slice sizes along the x-axis, y-axis, and z-axis are 288 × 288, 512 × 512, and 512 × 512, respectively.
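The padding and cropping step can be sketched with NumPy as follows. This is an illustration under assumptions: the function name `pad_or_crop_axis` is hypothetical, the fill value of zero is a guess (a different value such as that of air may be more appropriate for CT data), and which array axis holds the 512-long dimension depends on how the volumes are stored.

```python
import numpy as np


def pad_or_crop_axis(volume, axis, target, fill_value=0):
    """Symmetrically pad or center-crop `volume` along `axis` to `target`."""
    size = volume.shape[axis]
    if size < target:
        # Split the missing length as evenly as possible between both ends.
        before = (target - size) // 2
        after = target - size - before
        pad_width = [(0, 0)] * volume.ndim
        pad_width[axis] = (before, after)
        return np.pad(volume, pad_width, mode="constant",
                      constant_values=fill_value)
    # Keep the central `target` entries (a no-op when size == target).
    start = (size - target) // 2
    index = [slice(None)] * volume.ndim
    index[axis] = slice(start, start + target)
    return volume[tuple(index)]


# Small in-plane size keeps the example light; only the last axis matters.
short_volume = np.ones((4, 4, 300), dtype=np.float32)
long_volume = np.ones((4, 4, 600), dtype=np.float32)

padded = pad_or_crop_axis(short_volume, axis=2, target=512)
cropped = pad_or_crop_axis(long_volume, axis=2, target=512)
```

Both calls return a volume whose last axis has length 512, so every case ends up with the same shape regardless of its original extent.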
We randomly selected 1062 cases and obtained 174,540, 305,856, and 305,856 2D slices along the x, y, and z axes, respectively, using the slicing method. The training set, test set, and validation set were divided in a ratio of 8:1:1. Finally, we implemented the U-Net, pix2pixGAN, and Swin-Unet methods in the PyTorch framework. The training, validation, and testing of the models were performed on an Intel Core i7-7700 (3.60 GHz) central processing unit (CPU) and eight RTX3090 graphics processing units (GPUs) with 24 GB of video memory. To fully verify the effectiveness of multi-angle 2D slices for 3D feature recovery, we designed five sets of controlled experiments covering different correction directions, correction orders, increased numbers of correction dimensions, different deep learning methods, and different levels of correction difficulty. The data involved in the training reached 9.8 TB, and all model training lasted for 10 weeks and 3 days.
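A minimal sketch of an 8:1:1 split is shown below. It assumes the split is made at the case level, which keeps all slices of one patient in the same subset; the text does not state whether the split was made per case or per slice. The function name `split_cases` and the fixed seed are hypothetical.

```python
import random


def split_cases(case_ids, ratios=(8, 1, 1), seed=0):
    """Shuffle case IDs and split them into train, test, and validation."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_test = len(ids) * ratios[1] // total
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]  # the remainder goes to validation
    return train, test, val


train, test, val = split_cases(range(1062))

# Slice count per direction when every case contributes 288 slices.
slices_per_direction = 1062 * 288
```

The last line reproduces the 305,856 slices reported for the y and z directions.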
Selection of Evaluation Indicators
When evaluating the results of 3D high-pitch CT reconstruction, we used a series of objective evaluation indicators, including SSIM (structural similarity index), PSNR (peak signal-to-noise ratio), as well as 3D SSIM and 3D PSNR. These indicators provide a comprehensive evaluation and are particularly suitable for our scenario of end-to-end, data-driven deep learning reconstruction, in which the effects of slicing along the x, y, and z directions are compared.
Firstly, we use the Structural Similarity Index (SSIM) to measure the structural similarity between the reconstructed image and the original image. The calculation formula for SSIM is as follows:
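In its commonly used form, SSIM between a reconstructed image x and a reference image y is given below. The symbols follow the conventional definition and are assumed here rather than taken from the paper's settings: μ denotes the mean, σ² the variance, σ_xy the covariance, and L the dynamic range of the pixel values; K₁ = 0.01 and K₂ = 0.03 are the customary default constants.

```latex
\mathrm{SSIM}(x, y) =
  \frac{\left(2\mu_x \mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}
       {\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)},
\qquad C_1 = (K_1 L)^2, \quad C_2 = (K_2 L)^2
```

In practice the statistics are computed in local windows and the resulting map is averaged; a 3D variant would use 3D windows over the volume instead of 2D windows over a slice.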
Secondly, we use Peak Signal to Noise Ratio (PSNR) to quantify the signal-to-noise ratio between the reconstructed image and the original image. The calculation formula for PSNR is as follows:
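The standard definition of PSNR is given below, with symbols assumed for illustration: x is the reconstructed image, y the reference image, N the number of pixels, and MAX the maximum possible pixel value (or the dynamic range of the data).

```latex
\mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - y_i\right)^2,
\qquad
\mathrm{PSNR}(x, y) = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}(x, y)}\right)
```

For the 3D variant, the sum in the MSE would run over all voxels of the volume rather than over the pixels of a single slice.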
Results on single directional correction
Table 1 and Fig. 5 depict the results of slice images corrected from the cross-sectional, coronal, and sagittal directions using the pix2pixGAN model, with a pitch of 2. The results clearly illustrate the ability of end-to-end data-driven models to restore original images from damaged 3D high-pitch scenarios, with outcomes closely approximating the ground-truth slice images.
Traditional analytical 3D FDK algorithms exhibit significant limited-angle artifacts within cross-sectional planes, manifested as extensive data voids and arc-like artifacts, while coronal and sagittal planes present severe helical artifacts that hinder discernment of internal structural textures. In contrast, all slices corrected using the pix2pixGAN model show no apparent artifacts or errors. The error maps in Fig. 5 reveal the error distribution across slice directions. As per the objective evaluation metrics in Table 1, results corrected along directions perpendicular to the final observed slice direction demonstrate optimal performance, while slices in other directions exhibit some small, dense, vertical, or horizontal stripe artifacts, as indicated by the red arrows in Fig. 5. Moreover, Table 1 indicates minimal discrepancies in the 3D reconstruction results along the various correction directions.

Slices along different axes, corrected using the FDK (II) and pix2pixGAN algorithm with a pitch of (p = 2), are presented. Slice (a) along the x-axis of the cross-section and its corresponding error map (a1), slice (b) along the y-axis of the coronal plane and its error map (b1), and slice (c) along the z-axis of the sagittal plane and its error map (c1) are depicted. The window size is set to [–1000, 1000].
The quantitative evaluation index for image slices corrected in different directions and orders using the pix2pixGAN model with a pitch of 2. Bold font represents the maximum value in a column
Figure 6 and Table 2 elucidate the reconstruction outcomes of the pix2pixGAN model under a pitch of 2, following the augmentation of correction angles and adjustment of correction sequences. Relative to the 3D FDK model, any sequence or quantity of corrections significantly enhances the quality of resulting images, effectively mitigating circular artifacts in cross-sectional planes and helical artifacts in coronal and sagittal planes. Objective evaluation metrics in Table 2 indicate a minor improvement in 3D image quality with increased correction angles, albeit not substantial. However, noteworthy is the observation, as indicated by the red arrows in the Error Map of Fig. 6, that augmenting correction directions not only fails to noticeably reduce image errors but may introduce artifacts along other directions, leading to a further increase in overall error, a phenomenon corroborated by the objective evaluation metrics in Table 2.

Results obtained using the FDK (II) and pix2pixGAN algorithm with a pitch of (p = 2) have been presented with corrections along the x-axis as follows: (III) corrected along the x-axis, (IV) x-axis before y-axis, (V) x-axis before z-axis, and (VI) in the order of x, y, and z-axis. The results are depicted along the x-axis of the cross-section (a) and its corresponding error map (a1), along the y-axis of the coronal plane (b) and its error map (b1), and along the z-axis of the sagittal plane (c) and its error map (c1). The window size is set to [–1000, 1000].
The quantitative evaluation index for image slices corrected in different directions and orders using the pix2pixGAN model with a pitch of 2, (PSNR/SSIM). The column directions represent slicing the 3D volume along different directions and then correcting it. The row direction represents the objective evaluation indicators of slicing the corrected results along the cross-sectional (x-slice), coronal (y-slices), and sagittal (z-slices) planes. 3D represents the evaluation results using 3D objective evaluation indicators. Bold font represents the maximum value in a column
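The layout of the table (slice-wise scores along each axis plus a 3D score) can be reproduced with a short NumPy sketch. It assumes that slice-wise PSNR is the mean of per-slice values and that the data range equals the width of the display window; neither detail is stated in the text. The function names are hypothetical.

```python
import numpy as np


def psnr(reference, estimate, data_range):
    """PSNR in dB over all elements of the two arrays (2D or 3D)."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def slicewise_psnr(reference, estimate, axis, data_range):
    """Mean PSNR of the 2D slices taken along `axis`."""
    ref = np.moveaxis(reference, axis, 0)
    est = np.moveaxis(estimate, axis, 0)
    return float(np.mean([psnr(r, e, data_range) for r, e in zip(ref, est)]))


data_range = 2000.0  # assumed from the [-1000, 1000] display window

reference = np.zeros((4, 5, 6), dtype=np.float32)
estimate = reference + 10.0  # uniform error, for illustration only

scores = {
    "x-slices": slicewise_psnr(reference, estimate, 0, data_range),
    "y-slices": slicewise_psnr(reference, estimate, 1, data_range),
    "z-slices": slicewise_psnr(reference, estimate, 2, data_range),
    "3D": psnr(reference, estimate, data_range),
}
```

With a uniform error all four scores coincide; for real reconstructions the slice-wise means differ between axes because the error is not distributed evenly across slices.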
Furthermore, the 3D reconstruction outcomes following correction sequence permutations, as depicted in Table 2, exhibit negligible discrepancies. However, upon closer examination from a columnar perspective, it becomes apparent that optimal results are generally achieved when the final corrected slice aligns with the observed direction.
Table 3 illustrates the results of correction along the cross-sectional slice (x-axis) by different models at various pitches. With increasing pitch, all models exhibit a certain degree of quality deterioration, which is especially evident in the traditional CNN-based U-Net model, which shows the lowest reconstruction quality at a pitch of 4. In contrast, the pix2pixGAN model demonstrates consistently superior reconstruction quality at pitches of 2 and 3, surpassing the unsupervised CycleGAN model. Additionally, although the Transformer-based Swin-Unet model does not achieve the best reconstruction quality in every pitch scenario, its reconstruction results are considerably stable across varying pitches, without significant degradation as the pitch (and therefore the network training difficulty) increases. Furthermore, as evident from Table 3, the reconstruction results along the cross-sectional slice (x-axis) direction remain optimal across all directional slices.
Quantitative evaluation indicators for image slices corrected from the cross-sectional (x-axis) direction using different models with pitches of 2, 3, and 4, respectively. Bold font represents the maximum value in a row
Figure 7 presents the objective evaluation metrics results of different models after increasing correction directions at a pitch of 4. It is notable that increasing correction directions does not universally result in an improvement in reconstruction quality, particularly for traditional CNN and GAN architectures. The results indicate that the Swin-Unet network model can achieve consistently superior reconstruction performance on average, particularly in challenging scenarios (pitch of 4), with the most stable outcomes.

Quantitative evaluation index curves of image slices corrected along the X, X⇒Y and X⇒Y⇒Z directions using different models in the pitch 4 scenario.
Discussion
Recovering data from high-pitch helical CT scans, where data loss is inherent in 3D space, presents considerable challenges in clinical applications. Our research focuses on investigating the predominant method for restoring compromised 3D reconstruction outcomes: slice correction. Our objective is to identify the correction approach that can offer the most advantageous outcomes.
The widely recognized 3D FDK analytical algorithm relies on global filtering operators and back-projection, so even minor data loss is amplified during back-projection, resulting in prominent inter-slice helical artifacts and intra-slice limited-angle artifacts. The results degrade further as the pitch increases. This is because the FDK filter is a high-pass filter: the sharp discontinuities created by missing data are enhanced by the filtering, which further increases the impact of artifacts on the reconstruction results. Nonetheless, its concise and efficient nature positions it as the initial step in end-to-end data-driven approaches. In the subsequent stage, we deliberately selected slices in three different directions and explored various permutations of correction sequences, encompassing both single-directional and multi-directional corrections. Our objective was to investigate whether selecting or increasing the correction angles could effectively enhance correction accuracy. To this end, we conducted experiments covering a wide array of mainstream 2D deep learning network architectures. Both objective and subjective evaluation metrics demonstrate significant improvements in reconstruction quality when employing the slice correction method compared to 3D FDK.
Indeed, if a model possesses sufficiently strong nonlinear fitting capabilities, and each slice corrected using the slice method closely resembles the corresponding label slice, correction from a second direction may be unnecessary regardless of the direction of damage to artifacts in the 3D volume. However, the nonlinear fitting capabilities of any end-to-end deep learning model are inherently limited, especially in challenging scenarios such as 3D high-pitch reconstruction. Even after model training, artifacts may persist along certain directions. Hence, research into correction directions, sequences, and model performance becomes crucial. Through experimentation, we observed that the correction sequence does not significantly influence the results for different models. Furthermore, increasing the number of corrections does not notably enhance 3D reconstruction quality, even in severely compromised scenarios, as illustrated in Fig. 5. One potential explanation could be that under different correction directions, models prioritize different aspects of recovery. For instance, during recovery along the cross-sectional plane, models may mitigate errors within the plane but overlook the coherence between adjacent cross-sectional planes. This oversight leads to severe artifacts along the longitudinal axis of the coronal plane, as indicated by the red arrows in Fig. 5 and Fig. 6, posing challenges for subsequent corrections in other directions. Similarly, during subsequent corrections in other directions, even if a network model can reduce image errors within a specific plane to a sufficiently low level, it cannot guarantee coherence across other directions. It may even reintroduce artifacts from previous corrections, thereby explaining why the objective evaluation metrics of 3D reconstruction, as shown in Table 2, do not significantly improve with an increase in correction directions.
On the other hand, we have observed the advantages of Transformer-based models in reconstructing images from ultra-high-pitch helical scans. Different models exhibit varying learning capacities, and we conducted extensive experiments on prevalent methods, as evidenced in Table 3 and Fig. 5. Notably, although our data-driven approaches employ similar types and sizes of datasets, their nonlinear fitting capabilities differ. Transformer architectures like Swin-Unet have demonstrated good stability across training scenarios of varying difficulty. In contrast, CNN and GAN architectures exhibit more variability. In 3D helical reconstruction, large-scale, global limited-angle artifacts are prevalent within the planes. Traditional CNN architectures struggle to recognize such extensive global features due to constraints imposed by convolutional kernel sizes and pooling layers. This limitation hinders the effective use of long-range dependencies for image recovery, particularly at higher helical pitches, where reconstruction quality degrades significantly. Moreover, GAN-based methods achieve better image restoration accuracy than CNN architectures but lag in training time and inference speed. This delineates the trade-offs between these methodologies in the context of 3D helical reconstruction, with Transformer models like Swin-Unet demonstrating promising stability and efficacy across various training difficulties.
Conclusion
Through theoretical analysis and extensive experiments, our study has provided further clarity on the impact of correction directions on the quality of 3D and 2D slice reconstruction. We found that increasing the number of correction directions does not significantly enhance reconstruction quality once the model is fully trained. Similarly, selecting correction sequences in multiple directions does not notably affect the accuracy of 3D reconstruction.
Moreover, we observed that during the multi-directional correction process, artifacts in the correction results tend to align with the direction perpendicular to the previous correction. While models with the ability to capture long-range dependencies may face challenges in training and inference speed, they demonstrate excellent performance in scenarios with high pitch.
Given these insights, we recommend cautious clinical use of 2D slicing methods for correcting high pitch spiral reconstructions. Considering limited computing resources and time constraints, it is advisable to correct from a single direction. Optimal results are achieved when the correction direction is perpendicular to the final clinical observation direction.
For scenarios with abnormally high pitch (p ≥ 4), models with stronger capabilities in modeling long-range dependencies can be used for correction. If significant errors persist after single-direction correction, multi-directional correction can be employed to achieve the highest 3D reconstruction accuracy, ensuring that the final correction direction aligns as closely as possible with the observation direction.
Acknowledgments
This research is supported by the National Natural Science Foundation of China (Grant No.: 52075133) and the CGN-HIT Advanced Nuclear and New Energy Research Institute (Grant No.: CGN-HIT202215).
