Abstract
BACKGROUND:
The corpus callosum in the midsagittal plane plays a crucial role in the early diagnosis of diseases. When the anisotropy of the diffusion tensor in the midsagittal plane is calculated, the anisotropy of the corpus callosum is close to that of the fornix, which blurs the boundary of the segmented region.
OBJECTIVE:
To apply a fuzzy clustering algorithm combined with new spatial information to achieve accurate segmentation of the corpus callosum in the midsagittal plane in diffusion tensor images.
METHODS:
In this algorithm, a fixed region of interest is selected from the midsagittal plane, and the anisotropic filtering algorithm based on tensor is implemented by replacing the gradient direction of the structural tensor with an eigenvector, thus filtering the diffusion tensor of region of interest. Then, the iterative clustering center based on K-means clustering is used as the initial clustering center of tensor fuzzy clustering algorithm. Taking filtered diffusion tensor as input data and different metrics as similarity measures, the neighborhood diffusion tensor voxel calculation method of Log Euclidean framework is introduced in the membership function calculation, and tensor fuzzy clustering algorithm is proposed. In this study, MGH35 data from the Human Connectome Project (HCP) are tested and the variance, accuracy and specificity of the experimental results are discussed.
RESULTS:
Segmentation results of three groups of subjects in MGH35 data are reported. The average segmentation accuracy is 97.34%, and the average specificity is 98.43%.
CONCLUSIONS:
When segmenting the corpus callosum in diffusion tensor images, our method can not only denoise the images effectively but also achieve high accuracy and specificity.
Introduction
Corpus callosum (CC), as the largest white matter fiber connecting the left and right cerebral hemispheres, presents a semilunar shape in the midsagittal plane [1]. According to studies, the shape and size of the corpus callosum are related to age, sex, and handedness. Some diseases, such as Alzheimer’s disease, multiple sclerosis, schizophrenia, epilepsy and depression, can also cause changes in the corpus callosum [2–4]. The accurate segmentation of corpus callosum and observation of its pathological information can be used to realize the early diagnosis of the disease, which provides a strong basis for improving the clinical diagnosis rate and controlling the disease development as soon as possible.
Diffusion tensor imaging (DTI) is an imaging method that obtains information about human tissue structure from the diffusion of water molecules in the body [5]. The traditional approach to DTI-based corpus callosum segmentation converts the tensor information into a scalar measure, such as fractional anisotropy (FA), and segments the resulting scalar map. In the FA map of the midsagittal plane, the fornix and the corpus callosum have similar positions and FA values, so they cannot be easily separated. In the directional information of DTI, however, the fornix runs in the anterior-posterior direction, while the corpus callosum runs in the left-right direction. Niogi et al. [6] weighted the FA by the direction of the principal eigenvector to construct a weighted color-coded anisotropy map, and then segmented the corpus callosum with a binary mask. The algorithm is simple and accurate, but selecting seed pixels manually takes time. On this basis, Freitas et al. [7] proposed a watershed algorithm based on weighted FA, which automated the procedure and effectively distinguished the corpus callosum from other structures such as the fornix and cingulate gyrus. Rittner et al. [8] improved the watershed algorithm by replacing the original scalar mapping with a tensorial morphological gradient mapping, but had some difficulty segmenting the posterior midbody. Wang et al. [9] took the target shape as prior information and proposed a multi-atlas active shape model algorithm. In this method, the manual calibration of feature pixels was cumbersome and time-consuming, and only similar shapes could be segmented.
In order to automate the segmentation while using shape and direction information, this study uses a clustering algorithm to segment the corpus callosum. The K-means clustering algorithm has been widely used because of its simple idea and fast convergence. Fuzzy C-means (FCM), a generalization of K-means clustering, has poor noise immunity and cannot smooth the image, so an accurate target boundary cannot be obtained during image segmentation. To solve the boundary problem, Pham [10] proposed a robust fuzzy C-means algorithm (RFCM), which uses an empirical parameter to construct a membership function containing neighborhood pixels and introduces neighborhood information into the objective function. To avoid empirical parameters in the objective function, Krinidis et al. [11] proposed the fuzzy local information C-means algorithm (FLICM), which makes the algorithm automatic. However, when a clustering center is a noise point, neighborhood pixels are assigned to that class in order to reduce the objective function value. Ji et al. [12] replaced pixels with image patches in fuzzy clustering and constructed a weighting scheme that enables the pixels in each image patch to have anisotropic weights. The resulting weighted image patch-based FCM algorithm deals effectively with noise interference, but at the cost of more computation and a longer running time. To overcome the noise problem, He et al. [13] reduced the weight of noise points by introducing adjacent pixels as neighborhood constraints, which improved noise immunity. Liu [14] introduced normally distributed spatial information into the fuzzy membership function, on the assumption that within the spatial neighborhood the correlation is greater the closer a point is to the pixel, so as to reduce the influence of image noise and human factors on the segmentation results.
Up to now, the fuzzy clustering algorithms based on DTI have all taken pixels as input data and introduced pixel weights or distribution functions as neighborhood information. Therefore, this paper proposes the tensor fuzzy clustering algorithm (TFCA), which makes improvements in two respects. First, to address the edge problem caused by the partial volume effect, this paper proposes a tensor-based anisotropic filtering algorithm, in which the direction of the gradient operator of the structure tensor is replaced by an eigenvector. Second, the TFCA algorithm itself is proposed. In the calculation, the diffusion tensor data are computed first and used as the input data. To calculate the distance between symmetric positive definite matrices, different metrics are introduced as similarity measures, and the corresponding Fréchet mean is used to calculate the cluster centers. In addition, when selecting spatially adjacent information, the diffusion tensor pixel calculation method of the Log Euclidean (LE) framework is used to calculate adjacent points, which addresses the problem of edge ambiguity and of points with deviating values. In this paper, experiments are carried out on MGH35 data from the HCP, and the experimental results are evaluated in detail.
Method
Diffusion tensor
The diffusion tensor is a symmetric positive definite matrix, which can be defined as:
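For reference, a symmetric 3×3 diffusion tensor is conventionally written in the form below, together with its eigendecomposition. The component names $D_{xx}, \dots, D_{zz}$, the eigenvalues $\lambda_i$ and the eigenvectors $e_i$ are the usual textbook notation and are assumed here, not taken from the authors' own equation.

```latex
D =
\begin{pmatrix}
D_{xx} & D_{xy} & D_{xz} \\
D_{xy} & D_{yy} & D_{yz} \\
D_{xz} & D_{yz} & D_{zz}
\end{pmatrix},
\qquad
D = \sum_{i=1}^{3} \lambda_i \, e_i e_i^{T},
\quad \lambda_1 \ge \lambda_2 \ge \lambda_3 > 0 .
```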

Diffusion tensor ellipsoids. From left to right: three equal eigenvalues, two larger eigenvalues, and only one larger eigenvalue.
To describe the degree of diffusion of water molecules, researchers have proposed several measures, of which the most common are mean diffusivity (MD) and FA. MD measures the overall diffusivity and carries no information about the direction of diffusion. FA is the ratio of the anisotropic part of the diffusion to the overall diffusivity [5], and the formula is as follows:
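In its standard form, FA is computed from the three eigenvalues of the tensor as sqrt(3/2) · sqrt(Σ(λ_i − MD)²) / sqrt(Σ λ_i²), where MD is the mean of the eigenvalues. The sketch below is a minimal illustration of that standard definition, assuming NumPy; the function name and the two example tensors are hypothetical and not taken from the paper.

```python
import numpy as np

def fractional_anisotropy(D):
    """Return the FA of a symmetric 3x3 diffusion tensor D (standard definition)."""
    lam = np.linalg.eigvalsh(D)              # eigenvalues of the symmetric matrix
    md = lam.mean()                          # mean diffusivity (MD)
    num = np.sqrt(((lam - md) ** 2).sum())   # spread of the eigenvalues around MD
    den = np.sqrt((lam ** 2).sum())          # overall magnitude of the diffusion
    if den == 0.0:
        return 0.0                           # guard for an all-zero tensor (assumption)
    return float(np.sqrt(1.5) * num / den)

# Hypothetical example tensors: one isotropic, one strongly directional.
isotropic = 0.7e-3 * np.eye(3)
directional = np.diag([1.7e-3, 0.2e-3, 0.2e-3])
fa_iso = fractional_anisotropy(isotropic)
fa_dir = fractional_anisotropy(directional)
```

The isotropic tensor yields an FA near zero and the directional one an FA well above it, which is the contrast the FA map relies on.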
The smaller the FA, the less restricted the diffusion. The larger the FA, the more regular and directional the tissue, and the stronger the nerve conduction function.
As can be seen from Fig. 2(a), there is a boundary between high and low FA. In the midsagittal plane, the rectangular region containing the corpus callosum serves as the region of interest. In the region of interest in Fig. 2(b), the fornix appears as a small semilunar shape connected to and lying beneath the corpus callosum.

FA image of brain. (a) FA image after skull removal, in which the box represents the region of interest; (b) Region of interest.
The traditional FCM algorithm uses the Euclidean metric as the similarity measure to calculate the membership function, and achieves clustering by minimizing the objective function. However, FCM does not incorporate information about spatial context, which makes it sensitive to noise. Pham [10] therefore introduced a penalty function into the objective function and proposed a robust fuzzy clustering algorithm, the earliest fuzzy clustering algorithm based on neighborhood information. The objective function is as follows:
To avoid the use of empirical parameters, Krinidis et al. [11] proposed the robust fuzzy local information C-means clustering algorithm (FLICM), which automates the algorithm and balances the preservation of image details against noise suppression. Its objective function is:
The FLICM algorithm adopts a transplantation approach: the expressions for its membership function and clustering centers are carried over from the iteration expressions of the FCM algorithm. The formulas are as follows:
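The alternating updates can be sketched as follows. This is a minimal illustration, assuming the standard FCM update rules and the usual FLICM convention of adding a fuzzy local factor G to the squared distance before computing memberships. The function names, the toy data and the zero-valued G are hypothetical; computing G from neighboring pixels is deliberately left out.

```python
import numpy as np

def update_memberships(dist2, G, m=2.0):
    """dist2, G: arrays of shape (c, n). Returns memberships u of shape (c, n).

    u[k, i] = 1 / sum_j ((dist2[k, i] + G[k, i]) / (dist2[j, i] + G[j, i])) ** (1 / (m - 1))
    """
    cost = dist2 + G + 1e-12                 # small constant avoids division by zero
    inv = cost ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def update_centers(x, u, m=2.0):
    """x: (n, d) samples, u: (c, n) memberships. Returns (c, d) cluster centers."""
    w = u ** m
    return (w @ x) / w.sum(axis=1, keepdims=True)

# Toy 1-D data with two clusters; G = 0 reduces the update to plain FCM.
x = np.array([[0.0], [0.1], [0.9], [1.0]])
v = np.array([[0.2], [0.8]])
for _ in range(20):
    dist2 = ((x[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)   # shape (c, n)
    u = update_memberships(dist2, np.zeros_like(dist2))
    v = update_centers(x, u)
```

Because the center update does not involve G, the neighborhood information enters only through the memberships, which is the limitation the text attributes to the transplantation approach.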
Later, scholars improved the calculation method, including objective function, membership function and clustering center [15, 16]. In addition, the fuzzy clustering algorithms based on DTI were always improved by adding weight to pixels or defining new spatial distribution function [12, 14].
In diffusion tensor imaging, the edges of the image are swollen by the partial volume effect, so image smoothing techniques have been proposed [17, 18]. Weickert et al. [19] introduced a nonlinear structure tensor based on an anisotropic diffusion filter; this method can effectively distinguish the boundary regions of the image and enhance corner detection.
The common gradient tensor $G$ is given by $\nabla I_p$ convolved with $K_p$. The use of the common gradient $G$ in each weighted direction ensures that the missing boundary information in the other weighted directions can be obtained in one weighted direction. The calculation formula is as follows:
Both the diffusion tensor and the structure tensor are symmetric positive definite matrices. On this premise, the anisotropic filtering algorithm based on the structure tensor is extended, for the first time, to an anisotropic filtering algorithm based on the tensor. In the structure tensor, the direction and geometry of the gradient operator represent the direction and speed of changes in the gray value; these are replaced by the eigenvectors and eigenvalues of the diffusion tensor $D$, which represent the different diffusion directions $\rho$ and diffusion rates $v_\rho$. The common gradient tensor $G$ is constructed by summing $v_\beta \otimes v_\beta^{T}$ over the different directions of the diffusion vectors and convolving the sum with the Gaussian kernel function $K_\rho$.
Finally, $v_{ti} = v_{gi}$ and $\lambda_{ti} = 1/\lambda_{gi}$, $i = 1, 2, 3$, are used to construct a new mathematical diffusion tensor $T$, where $v_{g1}, v_{g2}, v_{g3}$ are the eigenvectors of $G$ and $\lambda_{g1}, \lambda_{g2}, \lambda_{g3}$ are the eigenvalues of $G$. The mathematical diffusion tensor is then obtained as $T = \sum_{i=1}^{3} v_{ti}\,\lambda_{ti}\,v_{ti}^{T}$.
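The construction of T from G can be sketched in a few lines. This is a minimal illustration, assuming NumPy and a symmetric positive definite G; the function name and the lower bound `eps` on the eigenvalues are assumptions added for numerical safety and are not part of the paper.

```python
import numpy as np

def tensor_from_common_gradient(G, eps=1e-12):
    """Keep the eigenvectors of G and invert its eigenvalues to build T."""
    lam_g, V = np.linalg.eigh(G)             # columns of V are the eigenvectors of G
    lam_t = 1.0 / np.maximum(lam_g, eps)     # lambda_ti = 1 / lambda_gi (clamped, assumption)
    return (V * lam_t) @ V.T                 # sum_i lambda_ti * v_i v_i^T

# Hypothetical symmetric positive definite example.
G = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
T = tensor_from_common_gradient(G)
```

For a strictly positive definite G this construction coincides with the matrix inverse of G, so directions in which the common gradient is strong receive a small diffusion rate and are smoothed less.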
When segmenting DTI images, Lenglet [5] showed that the segmentation result is influenced by the choice of metric: the splenium was not visible under the Euclidean metric but was visible under the Riemannian metric. In this paper, common metrics on the space of covariance matrices are introduced to compare the segmentation results: the Euclidean, Riemannian, Log Euclidean, Cholesky, and Root Euclidean metrics. All of these except the Euclidean metric are non-Euclidean, and their formulas are given in Table 1.
Different distances and mean values between D1 and D2
where $L_1$ is the Cholesky factor of the symmetric positive definite matrix $D_1$, i.e. $D_1 = L_1 L_1^{T}$.
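The five distances can be sketched as follows. This is a minimal illustration, assuming the definitions commonly used in the covariance matrix literature (Frobenius norms of differences of the matrices, their logarithms, Cholesky factors or square roots, and the affine-invariant form for the Riemannian metric); these are assumed to correspond to the entries of Table 1. All function names are hypothetical.

```python
import numpy as np

def _sym_fun(D, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    lam, V = np.linalg.eigh(D)
    return (V * fun(lam)) @ V.T

def d_euclidean(D1, D2):
    return np.linalg.norm(D1 - D2, "fro")

def d_log_euclidean(D1, D2):
    return np.linalg.norm(_sym_fun(D1, np.log) - _sym_fun(D2, np.log), "fro")

def d_riemannian(D1, D2):
    # Affine-invariant form: || log(D1^{-1/2} D2 D1^{-1/2}) ||_F
    S = _sym_fun(D1, lambda lam: lam ** -0.5)
    M = S @ D2 @ S
    M = 0.5 * (M + M.T)                      # remove round-off asymmetry
    return np.linalg.norm(_sym_fun(M, np.log), "fro")

def d_cholesky(D1, D2):
    return np.linalg.norm(np.linalg.cholesky(D1) - np.linalg.cholesky(D2), "fro")

def d_root_euclidean(D1, D2):
    return np.linalg.norm(_sym_fun(D1, np.sqrt) - _sym_fun(D2, np.sqrt), "fro")

# Hypothetical example tensors.
D1 = np.diag([1.0, 2.0, 3.0])
D2 = np.diag([2.0, 2.0, 3.0])
distances = {f.__name__: f(D1, D2) for f in
             (d_euclidean, d_log_euclidean, d_riemannian, d_cholesky, d_root_euclidean)}
```

Only the Riemannian distance needs a matrix product inside the logarithm; the others reduce to a Euclidean distance after a fixed transformation of each tensor, which is why their means have closed forms.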
In addition to the similarity measure, the clustering effect is also affected by the initial selection of the center points. Therefore, the mean of the data points in each cluster obtained by the FA-based K-means clustering algorithm is taken as the initial clustering center. This method has low time complexity and is simple to carry out. To calculate the mean, we follow Jayasumana [20], who generalized the mean, expectation and variance from a vector space to a Riemannian manifold. Arsigny et al. [21] explained why the Euclidean mean is not suitable for the space of positive definite matrices (especially for DTI). Additionally, Dryden [22] proposed using the Fréchet and Karcher means when calculating the mean of covariance matrices. The Fréchet mean minimizes the variance globally, whereas the Karcher mean minimizes it locally. Because a global minimum is preferred, this paper selects the Fréchet mean to calculate the clustering centers after each iteration, and its formula is as follows:
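In general, the Fréchet mean of tensors D_1, ..., D_N under a distance d is the matrix that minimizes the sum of squared distances Σ d(D_i, M)². For the Log Euclidean metric the minimizer has the closed form exp((1/N) Σ log D_i). The sketch below is a minimal illustration of that closed form, assuming NumPy; the function names and the optional `weights` argument (for example, fuzzy memberships raised to the fuzziness exponent) are assumptions and not the authors' implementation.

```python
import numpy as np

def _sym_fun(D, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    lam, V = np.linalg.eigh(D)
    return (V * fun(lam)) @ V.T

def log_euclidean_mean(tensors, weights=None):
    """Closed-form Fréchet mean of SPD matrices under the Log Euclidean metric."""
    logs = np.stack([_sym_fun(D, np.log) for D in tensors])      # shape (N, 3, 3)
    if weights is None:
        weights = np.full(len(tensors), 1.0 / len(tensors))
    else:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()                        # normalize to sum 1
    mean_log = np.tensordot(weights, logs, axes=1)               # weighted mean of logs
    return _sym_fun(mean_log, np.exp)

# Hypothetical example: the mean of I and 4I is 2I (a geometric mean).
center = log_euclidean_mean([np.eye(3), 4.0 * np.eye(3)])
```

The result is always symmetric positive definite, which the arithmetic (Euclidean) mean also guarantees but without the geometric averaging of eigenvalues that the Log Euclidean framework provides.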
For every metric in Table 1 other than the Euclidean metric, the corresponding mean formula is obtained by substituting the metric into the Fréchet mean and simplifying. It can be seen from Table 1 that,
When calculating new spatial neighborhood information, this paper introduces the calculation method of neighborhood points in LE framework [23], and the formula is as follows:
The specific steps of the DTI image segmentation method based on the tensor fuzzy clustering algorithm are as follows:
Step 1. Read the DTI data, calculate the FA image of the DTI image according to Equation (2), and select the ROI.
Step 2. Apply the tensor-based anisotropic filtering algorithm to the tensor data in the ROI to obtain the filtered tensor data.
Step 3. Recalculate the FA of the ROI, set the number of clusters to K = 5, and run the K-means algorithm to obtain the initial cluster centers.
Step 4. Calculate the neighborhood points of each tensor through the LE framework, and construct a new spatial penalty function from the distances between the tensors and their neighborhood points under the different metrics. Minimize the objective function by iterating on the membership function and the center points, where the center points are updated with the Fréchet mean. The flow is shown in Fig. 3.

Flow based on tensor fuzzy clustering algorithm.
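The initialization in Step 3 can be sketched as a one-dimensional K-means on the FA values of the ROI. This is a minimal illustration, assuming NumPy; the function name, the quantile-based starting centers and the toy FA values are assumptions, since the paper does not state how K-means is seeded.

```python
import numpy as np

def kmeans_1d(values, k=5, n_iter=100):
    """Lloyd's K-means on scalar values. Returns (labels, centers)."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: evenly spaced quantiles of the data (assumption).
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:                     # leave empty clusters where they are
                new_centers[j] = members.mean()
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Final assignment so that labels match the returned centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

# Hypothetical FA values forming five tight groups.
fa = np.array([0.1] * 4 + [0.3] * 4 + [0.5] * 4 + [0.7] * 4 + [0.9] * 4)
labels, centers = kmeans_1d(fa, k=5)
```

The voxels sharing a label would then be averaged as tensors, using the mean that matches the chosen metric, to give the initial tensor-valued cluster centers for Step 4.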
Experimental environment and data
The data, from the Human Connectome Project (HCP) of the National Institutes of Health, were collected from 35 young adults using a customized MGH Siemens 3T Connectom scanner with a maximum gradient strength of 300 mT/m for diffusion imaging. Five b-values were used: 0, 1000, 3000, 5000 and 10000 s/mm².
According to the age and gender analysis of the experimental data in Table 2, the segmentation experiments are carried out on data from three groups: females aged 20–29, males aged 30–39 and males aged 40–49. The 70th sagittal slice of the DTI volume, in which the corpus callosum is most clearly visible within the brain tissue region, is used to analyze the experimental results.
Experimental data analysis of 35 cases
In order to evaluate the clustering effect of the algorithm, the manual segmentation by experienced doctors is used as the ground truth, and the evaluation measures are variance (var), accuracy and specificity:
According to Equation (14), the better the clustering, the closer each point in a cluster is to the cluster center and the lower the variance. The higher the accuracy, the more points are assigned to the correct cluster and the better the segmentation. The higher the specificity, the fewer points outside the corpus callosum are assigned to the corpus callosum cluster.
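Accuracy and specificity can be sketched from the confusion counts between a predicted mask and the ground truth mask. This is a minimal illustration, assuming the standard definitions accuracy = (TP + TN) / (TP + TN + FP + FN) and specificity = TN / (TN + FP), which are assumed to match the paper's equations; the function name and the toy masks are hypothetical, and the variance measure is not covered here.

```python
import numpy as np

def accuracy_specificity(pred, truth):
    """Return (accuracy, specificity) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)        # corpus callosum voxels labeled correctly
    tn = np.sum(~pred & ~truth)      # background voxels labeled correctly
    fp = np.sum(pred & ~truth)       # background labeled as corpus callosum
    fn = np.sum(~pred & truth)       # corpus callosum labeled as background
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return float(accuracy), float(specificity)

# Hypothetical 2x3 masks: 1 marks corpus callosum voxels.
truth = np.array([[0, 1, 1],
                  [0, 0, 1]])
pred = np.array([[0, 1, 0],
                 [1, 0, 1]])
acc, spec = accuracy_specificity(pred, truth)
```

Because the corpus callosum occupies a small part of the ROI, both measures are dominated by background voxels, so high values should be read alongside the visual comparison of contours.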
Experiment 1 shows the result of the anisotropic filter algorithm based on tensor. Here, the number of iterations is set as 10, the iteration time is 0.001, and the Gaussian kernel function β is 0.5. The results are as follows.
It can be seen from Fig. 4 that, compared with Fig. 4(a)-(c), the regions with high and low FA values in Fig. 4(d)-(f) have more obvious boundaries. According to Equation (10), the diffusion rate is lower in regions with a low FA, and these regions are smoothed by the convolution with the Gaussian kernel function, as shown in the upper left corner of Fig. 4(d)-(f) and in the region below the fornix. Compared with the unfiltered image, the FA of the image decreases significantly.

The filtering result: (a)-(c) the original FA image; (d)-(f) the FA image after filtering.
Experiment 2 shows the variance of the corpus callosum segmented by the tensor fuzzy clustering algorithm with different metrics and mean values. The results are shown in Table 3.
Variances of corpus callosum based on different measures and corresponding mean values
When the same metric method is used to calculate the distance and the mean value, var (CC) is the smallest and the clustering effect is better. However, the units of variance calculated by different methods are different, which cannot be used to evaluate the effectiveness of different measures and mean values. Therefore, Experiment 3 is proposed.
Because the shape and size of the corpus callosum have important implications for disease research in clinical practice, this experiment uses the determinant |A| of the covariance matrix A as the evaluation criterion. The results are shown below.
Figure 5 shows that the filtered data give a better clustering effect than the data before filtering, and also reflects the accuracy of the clustering when the Log Euclidean metric and the Riemannian metric are used as similarity measures. In practice, however, the mean under the Riemannian metric requires a numerical method with high time complexity, and the results of the Log Euclidean and Riemannian metrics are similar. Therefore, in the following experiments the Log Euclidean metric is used as both the similarity measure and the mean calculation method.

Determinant of covariance matrix by different method.
Experiment 4 shows the results of the clustering algorithms based on FA. The segmentation experiments are carried out using the K-means algorithm, the traditional FCM algorithm, the RFCM algorithm, the FLICM algorithm and the improved spatial fuzzy clustering algorithm.
Segmentation with FA as the input data is the most common approach for DTI. In this case, Fig. 6 shows that the K-means algorithm and the various fuzzy clustering algorithms cannot separate the corpus callosum from the fornix, regardless of whether the two are connected or some distance apart. Whatever the algorithm, there is also a lot of noise both inside and outside the corpus callosum. For these reasons, we propose a fuzzy clustering algorithm based on tensor information.

Segmentation results of different segmentation methods, (a)(g)(m) FA map; (b)(h)(n) K-means algorithm; (c)(i)(o) Traditional FCM algorithm; (d)(j)(p) RFCM algorithm; (e)(k)(q) FLICM algorithm; (f)(l)(r) improved spatial fuzzy clustering.
Experiment 5 verifies the feasibility of the fuzzy clustering algorithm based on tensor information and the segmentation effect of the TFCA algorithm. The traditional FCM, RFCM, FLICM and improved spatial fuzzy clustering algorithms are rewritten into corresponding fuzzy clustering algorithms based on tensor information, and the segmentation results are compared with the corpus callosum contours drawn by professional doctors with many years of experience.
Figure 7 demonstrates the feasibility of the fuzzy clustering algorithm based on tensor information, which resolves the difficulty of separating the corpus callosum from the fornix in the midsagittal plane. Moreover, compared with the large number of noise points in Fig. 6, the results in Fig. 7 also suppress noise points to some extent.

Segmentation results of different segmentation methods, (a)-(c) FA map; (d)-(f) Traditional FCM algorithm; (g)-(i) RFCM algorithm; (j)-(l) FLICM algorithm; (m)-(o) improved spatial fuzzy clustering; (p)-(r) TFCA algorithm; (s)-(u) the contour of the corpus callosum, drawn by professional doctors who have worked for many years.
Fig. 7(f) shows intuitively that the fuzzy clustering algorithm based on tensor information cannot guarantee complete segmentation of the splenium when the Euclidean metric is used as the similarity measure. Although the RFCM algorithm remedies FCM's lack of spatial information, it also causes edge swelling, which leaves the structural edge of the corpus callosum unclear. The FLICM algorithm improves on the edge problem of RFCM, but some internal outliers are not clustered successfully. The improved spatial fuzzy clustering algorithm can eliminate the internal points, but the morphology of the corpus callosum, especially in Fig. 7(m), shows that this method segments the splenium poorly; moreover, compared with the outline drawn by professional doctors, the corpus callosum it segments is thicker. The points inside the corpus callosum indicate that the normal distribution of DTI pixel points can be applied to the improved spatial fuzzy clustering algorithm. The TFCA algorithm, by contrast, handles the above problems better.
The sixth experiment evaluates the segmentation effect of the algorithm with accuracy and specificity. Since the algorithm based on FA is unable to separate the corpus callosum from the fornix, this paper only evaluates the fuzzy clustering algorithm based on tensor information, with the emphasis on evaluating the superiority of TFCA algorithm compared with other fuzzy clustering algorithms based on tensor information. The accuracy results are shown in Table 4, and the specificity results are shown in Table 5.
Accuracy of different methods
Specificity of different methods
It can be seen from Tables 4 and 5 that filtering significantly improves the accuracy and specificity of segmentation compared with the unfiltered results. The spatial distribution function defined in the improved spatial fuzzy clustering algorithm is found to give better accuracy and specificity, and the filtering technique proposed in this paper improves them further.
In DTI, the common segmentation of corpus callosum is mostly performed in the coronal plane, but the pathological changes of corpus callosum need to be observed in the midsagittal plane. Therefore, this paper proposes a tensor fuzzy clustering algorithm to segment the corpus callosum in the midsagittal plane.
Unlike other algorithms, which need to select seed points or build a database to segment the corpus callosum, the clustering method processes the tensor data directly to achieve automatic segmentation, which reduces the burden on medical staff and improves the accuracy and specificity of segmentation. In the past, scholars calculated and segmented scalar data derived from the diffusion tensor. Scalar data contain no direction information, so previous improvements added directional information to the scalar data. In this paper, tensor data are chosen as the input data. Because symmetric positive definite matrices form a Riemannian manifold, and the Log Euclidean framework is similar to the Riemannian framework, the proposed neighborhood point calculation method reflects the directivity of the diffusion tensor better than other algorithms that use weighted FA or a Gaussian normal distribution. The distance calculation method under the LE framework is used as well.
Compared with other fuzzy clustering algorithms of recent years [24, 25], the TFCA algorithm proposed in this paper requires no manual intervention, handles direction information better, and adapts well to each individual. The third experiment shows that the variance of the image and the change in the size and shape of the corpus callosum are smallest, and the clustering accuracy is highest, when the metric and the mean are both calculated in the Log Euclidean framework. Compared with the other three clustering algorithms, TFCA has the best effect on splenium segmentation. At the same time, in the processing of edge information, TFCA introduces new spatial information, which makes the edges more accurate and smooth. The fifth experiment shows that TFCA has lower time complexity and better edge processing than the spatial fuzzy clustering algorithm. However, since the clustering center is calculated by the transplantation method and does not contain neighborhood information, future work will improve the method in this direction to further increase the segmentation accuracy.
Conclusion
In this study, a new tensor fuzzy clustering algorithm is proposed and tested; it comprises filtering, selection of the similarity measure, calculation of the mean, and calculation of the neighborhood points. Compared with segmentation algorithms based on scalar information, this study demonstrates the feasibility of clustering based on tensor information. The accuracy and specificity calculated for the different fuzzy clustering algorithms based on tensor information indicate that the TFCA algorithm achieves higher accuracy and specificity and can present the complete contour structure of the corpus callosum. Therefore, this is a feasible and more accurate DTI segmentation method.
Acknowledgments
This work is sponsored by the Natural Science Foundation of Shanghai (18ZR1426900) and the National Natural Science Foundation of China (61201067).
