Abstract
Meningioma is among the most common primary tumors of the brain. The firmness of a Meningioma is a critical factor that influences operative strategy and patient counseling. Conventional methods for predicting tumor firmness rely on the correlation between the consistency of a Meningioma and its preoperative MRI findings, such as the signal intensity ratio between the tumor and the normal grey matter of the brain. Machine learning techniques have not yet been investigated for the Meningioma firmness detection problem. The main purpose of this research is to couple supervised learning algorithms with typical descriptors to develop a computer-aided detection (CAD) system for Meningioma tumor firmness in MRI images. Specifically, Local Binary Patterns (LBP), Gray Level Co-occurrence Matrix (GLCM) and Discrete Wavelet Transform (DWT) features are extracted from real, labeled T2-weighted MRI images and fed into two classifiers, namely the support vector machine (SVM) and the k-nearest neighbor (KNN) algorithm, to learn the association between the visual properties of the region of interest and the pre-defined firm and soft classes. The learned model is then used to classify unlabeled T2-weighted MRI images. This paper presents a baseline comparison of different features used in a CAD system intended to accurately recognize Meningioma tumor firmness. The proposed system was implemented and assessed using a clinical dataset. The LBP feature yielded the best performance when coupled with the KNN classifier, with an F-score of 95%, a balanced accuracy of 87% and an area under the ROC curve (AUC) of 0.87.
Introduction
There are many types of brain tumors, which are categorized as primary or secondary. Primary tumors, such as Astrocytoma, Glioblastoma Multiforme, Meningioma and Medulloblastoma, originate in the brain itself, while secondary tumors consist of cancer cells that originate in another part of the body and spread to the brain [1, 2]. Meningiomas are the most common type [3, 4]; they originate from the meninges that cover the brain and spinal cord, hence the name. Around 90% of Meningiomas are diagnosed as benign tumors, while the remaining 10% are atypical or malignant. However, when benign tumors grow, they affect the brain and can cause disability or even be life-threatening [5, 6]. Another important property of a Meningioma is its firmness, or consistency, which ranges from soft to firm [7]. It is a critical factor that influences the operative strategy and patient treatment planning [8, 9]. Soft Meningiomas are removed by tissue suction, which takes less time and has lower morbidity and a lower rate of recurrence [10, 11], whereas for firm tumors a craniotomy is usually performed, in which the skull is opened to remove the tumor [12, 13].
In general, physicians usually begin the diagnosis of a brain tumor with a Magnetic Resonance Imaging (MRI) scan, a widely used medical imaging technique that utilizes a magnetic field and radio waves to provide detailed images of the internal tissues and organs of the body [14, 15]. Reaching concrete results requires expert radiologists to visually analyze the MRI scans in order to avoid ambiguity and subjective variability in decisions. Consequently, radiologists sometimes rely on Computer-Aided Diagnosis (CAD) systems as a second examiner [16]. CAD systems have been developed during the last two decades to aid radiologists in interpreting medical images and to increase the accuracy of brain tumor detection and recognition. Typical CAD systems are fundamentally based on image processing techniques that extract distinctive features from the medical images, and they rely on machine learning techniques to map the extracted features to the predefined tumor classes [17].
Despite the efforts made to develop CAD systems able to predict the malignancy of brain tumors [18, 19], to classify different types of brain tumors including Meningioma [21, 22], or to grade them [27, 28], machine learning techniques have not yet been investigated for the Meningioma firmness detection problem. Although many medical studies have shown the importance of tumor consistency for neurosurgery, particularly for Meningioma, most of the existing traditional approaches focus on the correlation between the consistency of a Meningioma and its preoperative MRI findings, such as the signal intensity ratio between the tumor and the normal grey matter of the brain [30, 31].
An interesting alternative consists in associating supervised learning techniques with typical visual feature extraction to form a CAD system able to recognize how soft or firm a given Meningioma tumor is. The outcome of such a system would support better management of the subsequent surgical procedure. This research aims to design and implement a supervised machine learning based system that automatically maps a given Meningioma tumor in an MRI image to the soft or firm class. This learning task relies on a collection of labeled T2-weighted MRI images with a pre-specified region of interest (ROI) that has been annotated by a radiologist. Specifically, we investigate two typical classifiers, the Support Vector Machine (SVM) and k-Nearest Neighbors (KNN) [34], along with state-of-the-art texture features, namely the Local Binary Patterns (LBP) [22], the Gray Level Co-occurrence Matrix (GLCM) [24, 25] and the Discrete Wavelet Transform (DWT) [27, 28], to enhance the overall Meningioma firmness detection performance.
Related works
To the best of our knowledge, no research to date has dealt with the automatic recognition of Meningioma firmness. Thus, this section covers brain tumor classification techniques for tumor malignancy or grading, which describes and depends on the appearance of tumor cells. The classification of brain tumors is a critical step that affects treatment, therapy and surgical planning. However, manual classification is a tedious and time-consuming task. Therefore, the need for CAD systems that predict the malignancy of brain tumors, classify different types of brain tumors or grade them has drastically increased, and many studies have been conducted and reported in the literature.
In [27], the diagnostic value of MRI texture and shape analysis is investigated for grading and classifying Meningiomas into high grade and low grade. 279-dimensional textural features were extracted from the ROI images. The intensity features of the ROIs were normalized to reduce the influence of brightness and contrast. For each tumor, 20 dimensions correspond to the grey level run length (four directions), and 220 dimensions represent the co-occurrence matrix (four directions and five between-pixel distances). Similarly, 20 dimensions correspond to the wavelet parameters at five scales within four frequency bands, 14 dimensions represent the histogram and gradient-based parameters, and 20 dimensions correspond to the autoregressive model parameters. In addition, 73 dimensions were obtained for shape features such as diameter, perimeter divided by convex perimeter, and skeleton length divided by area. The three top-ranked texture features and shape features were selected using Correlation-based Feature Subset Selection. Three classifiers, Logistic Regression (LR) [36], Naive Bayes [37] and Support Vector Machine (SVM) [38], were trained with 10-fold cross-validation on 131 T1-weighted MRI images from a university-affiliated hospital. SVM yielded better results than the other classifiers.
Brain tumor grade identification was also addressed in [28]. The authors introduced a non-invasive technique for grading Astrocytoma tumors based on the Gray Level Co-occurrence Matrix (GLCM) [33] as textural features, along with shape and intensity features. A pulse coupled neural network [39] and a median filter were used for noise removal. The ROIs of the images were segmented using several methods: Fuzzy c-means clustering [40] for finding interesting patterns, edge-based segmentation and the watershed segmentation algorithm. Textural, shape and intensity-based features were then extracted. Seven features from the GLCM were manually chosen, in addition to area features and Principal Component Analysis (PCA) features. These features were reduced using the Shuffling frog leaping algorithm (SFLA) [41]. SVM [35], Learning Vector Quantization (LVQ) [42] and Naive Bayes [37] classifiers were deployed on the features extracted from 200 T2-weighted images. The authors in [29] built a CAD system able to distinguish Glioblastomas (GBM) from lower-grade Gliomas (LGG) using local binary patterns. Quantitative moment features, namely mean, variance, skewness and kurtosis, were obtained from the local binary patterns (LBP) transformation [32] of the ROI, along with textural features based on the GLCM [43] in four directions. Features were selected by backward elimination, then individually tested using the Kolmogorov-Smirnov test and evaluated by Student’s t-test and the Mann-Whitney U-test [42]. The relevant features were combined in a binary logistic regression [36] to classify GBM from LGG, with leave-one-out validation. A comparison with pure GLCM features showed that the LBP transform is beneficial when combined with texture calculation.
In [18], a fully automated method was developed for segmentation and classification at two levels. Specifically, the authors investigated the image level for tumor malignancy and the tumor level for tumor grades. In the preprocessing step, artifacts in the MR images were eliminated and the images were converted to gray scale. In addition, a 5×5 Gaussian filter was used for smoothing. Several techniques were used for tumor segmentation, including the unsupervised K-means algorithm and morphological operations such as erosion and dilation. For feature extraction, intensity, shape (area, circularity and perimeter) and texture features (GLCM at four angles [43]) were computed. Then, an SVM [38] with various kernels was applied to map the extracted features to the two predefined classes and to the grades. Similarly, the authors in [19] proposed a method to detect tumors in T2-weighted MRI images. The MR images were preprocessed using dynamic stochastic resonance (DSR) combined with anisotropic diffusion (AD) for contrast enhancement, edge sharpening and smoothing. The segmentation was done using a multilevel customization of Otsu’s thresholding technique. The textural features were LBP [32], Tamura [44], Gabor [45], GLCM [43] and Zernike moments [46]. In addition, shape features were extracted from the segmented image. Then, the two most prominent features, LBP (means of the LBP histogram) and Tamura (coarseness, contrast, and directionality), were selected as the most relevant using the entropy measure. Finally, the extracted features were fed into an SVM [35] classifier to categorize a dataset of 1100 images and another collection of 600 images. In [20], a hybrid technique was proposed to detect and classify tumor malignancy. Median and high-pass filters were applied to the MR images to remove high-frequency components.
For feature extraction, the Discrete Wavelet Transform (DWT) [47] with two- and three-level decomposition via the Haar wavelet was applied to obtain the coefficients from the segmented images. The dimensionality of the coefficients was reduced by applying Principal Component Analysis (PCA). The system in [21] focused on three types of tumors: Meningioma, Glioma, and pituitary tumor. The method was deployed on 3064 slices, containing 708 Meningiomas, 1426 Gliomas and 930 pituitary tumors. The tumors were segmented and partitioned using region augmentation and ring-form partition, which took the tumor-surrounding tissues into consideration. The ROI was then divided into ring-form subregions. Three different feature extraction methods were applied to the subregions: intensity histogram, GLCM, and the bag-of-words (BoW) [49] model. For the intensity histogram, the values were scaled to (0, 1) through min-max normalization. For the second method, the GLCM elements were used directly rather than second-order statistics. Specifically, the GLCM averaged over the directions was used. The BoW-based features, which rely on local features from the segmented ROIs, were extracted using K-means for dictionary learning, followed by feature coding and pooling. The obtained features were reduced using Linear Discriminant Analysis (LDA). Finally, an SVM with a histogram intersection kernel was used for classification. Six brain tumors including Meningioma were categorized using a combination of textural and statistical features in [22]. A set of 72 intensity and texture features was extracted using several techniques.
These comprised 16 GLCM features [43] at four different offsets, 16 Laplacian of Gaussian (LoG) [50] features at four Gaussian widths, 25 Directional Gabor [45] features at 5 directions and 5 wavelengths, 10 Rotation Invariant Circular Gabor Features (RICGF) [50] at two phase offsets and 5 wavelengths, and 9 Rotation Invariant Local Binary Patterns (RILBP) [32] features at 3 scales. Furthermore, 6 intensity-based features and the shape-based features were also extracted. A Genetic Algorithm (GA) was then used to select the 28 most relevant features. For classification, an SVM [35] with a Gaussian kernel and a standard multi-layer perceptron (MLP) were individually applied to detect the brain tumors.
In [23], Meningioma and Glioma tumors were detected after classifying tumors into primary and metastatic. The system relies on textural features and a modified probabilistic neural network (PNN) classifier [48], and it was deployed on 67 T1-weighted post-contrast MR images with pre-specified ROIs. The texture features extracted from the ROIs resulted in 36-dimensional vectors. Specifically, 4 dimensions represent the histogram features, 22 dimensions encode the GLCM [43] and the remaining 10 dimensions encode the run-length matrices. A two-level hierarchical decision tree using a modified PNN classifier with a non-linear least squares feature transformation was applied to distinguish between the primary and the metastatic categories. Then, the discrimination between Meningioma and Glioma tumors was performed within the primary group.
The authors in [24] claimed that GLCM is the most discriminative feature for their application, which is intended to recognize Meningioma as well as three other types of tumors. A Gaussian filter was first applied to the MR images for noise removal. Then, four features from each of the intensity histogram and the GLCM were extracted from the images. Each set of features was used separately for classification, and the GLCM features yielded the best results. They gave promising accuracy when coupled with the C4.5 decision tree algorithm on a collection of 250 MRI images. Another study [25] used texture features, specifically the GLCM [43], to encode four different tumor types including Meningioma. A Gaussian filter was applied to improve the quality of the MR images. From the GLCM, 16 dimensions were chosen. A two-layered feed-forward neural network (FNN) was trained using the extracted GLCM features. The method was deployed on a total of 80 MRI samples. In [26], PCA was used for feature selection and a probabilistic neural network (PNN) [48] was trained to classify images into three types: normal, benign and malignant. Malignant tumors were then categorized into Meningioma and Glioma. MR images were converted to gray level and resized to 256×256. The features extracted from the 105 MRI images were mapped to the benign, malignant and normal categories. In addition, this method was applied to another dataset containing 44 MRI images of malignant tumors labeled as Meningioma or Glioma.
As can be noticed, most of the reviewed works focus on the classification of brain tumors using supervised learning methods. Based on this literature review, one can claim that the features most commonly used to encode the visual properties of the ROIs (tumor and/or density) are the GLCM, LBP, Gabor filters and the Discrete Wavelet Transform. Table 1 summarizes the related works outlined above.
Summary of the related brain tumor classification methods
The objective of this study is to investigate the use of pattern recognition methods, specifically texture analysis, for classifying Meningioma firmness in T2-weighted MRI images. This section outlines the overall architecture of the proposed system, presents its components and describes the functionality of each part. The main components of the proposed CAD system are shown in Fig. 1. The data is divided into training and testing sets. During the training phase, the system learns to differentiate between firm and soft Meningiomas based on the manually labeled MRI images. In other words, it builds a classification model that maps the visual properties of the MRI images to one of the pre-defined classes (soft or firm). This model is then used to predict the class value (soft or firm) of the unlabeled images during the testing phase.

Block diagram for the proposed CAD system.
After the manually cropped images are acquired, the proposed system starts with feature extraction, which plays a major role in capturing the visual properties of the ROIs. In our system, we investigate various texture features, namely the Local Binary Patterns, the GLCM and the discrete wavelet features. The rationale behind the choice of these features is their wide use in cancer detection systems. Lastly, the extracted features are used along with the corresponding labels to build the classification models.
Feature extraction can be defined as the process of encoding the image content as a set of numerical vectors that represent the relevant information of the given image. In this research, we extracted the Local Binary Patterns, the GLCM and the Discrete Wavelet Transform features. These features, as summarized in Table 1, were successfully used for cancer detection in the related works.
Local binary patterns
MRI images usually have different intensity distributions because of the device settings or illumination. Therefore, the reliability of texture analysis is affected by the brightness variance [29]. The LBP operator has proved to be robust and efficient in this case due to its resistance to illumination changes [32]. In our system, the standard, simple LBP is computed using a 3×3 window. The LBP code of a center pixel is calculated from its 8 surrounding neighbors, resulting in a pattern of 8 binary bits. Then, a 256-bin histogram is built from the LBP codes of all pixels and used as the feature vector of the image. The LBP code of a pixel is given by (1), where $P$ is the number of neighbors, $R$ is the radius of the neighborhood, $g_c$ is the intensity of the center pixel and $g_p$ is the intensity of its $p$-th neighbor [29]:
$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p} \qquad (1)$$
where
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
As illustrated in Fig. 2, to extract the patterns, each pixel in the image is compared with its eight neighbors in a 3×3 window centered on that pixel. The value of the central pixel is used as a threshold to label the neighbors: if a neighbor value is greater than or equal to the central pixel value, then 1 is assigned; otherwise 0 is assigned. The resulting 8-bit binary code is then converted into a decimal number, which is the LBP code of the central pixel. Thus, the distribution of the LBP values, represented by the LBP histogram, encodes the texture property of the image.

Illustrative example of LBP code extraction.
In this work, the LBP feature is extracted from the ROI images using a 3×3 window with a radius equal to one. The LBP code is calculated in the clockwise direction starting from the upper-left corner neighbor, using the center pixel as the threshold. The image obtained after computing the code of every pixel is shown in Fig. 3 (b). Then, the 256-bin histogram is computed to reflect the occurrences of the different LBP codes.
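The extraction described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the experiments: the function names are hypothetical, border pixels are skipped, neighbor p (in clockwise visiting order from the upper-left corner) is assumed to contribute the weight 2^p as in (1), and ties with the center are assumed to map to 1.

```python
# Clockwise neighbor offsets (row, col), starting at the upper-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of the interior pixel (r, c) of a 2-D list of intensities."""
    center = img[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        # Assumption: a neighbor equal to the center is labeled 1.
        if img[r + dr][c + dc] >= center:
            code += 1 << p  # assumption: neighbor p carries the weight 2**p
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over interior pixels (borders skipped)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


patch = [[5, 4, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch, 1, 1))  # neighbors 4..7 exceed the center: 16+32+64+128 = 240
```

A different starting neighbor or bit order only permutes the histogram bins, so it would not change the information carried by the feature vector.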

Images converted to LBP. (a) Original Grayscale image, (b) Local Binary Pattern.
One should note that the histograms obtained using (1) are rough local estimates of the probability density function and carry information on the micro-patterns therein. Besides, the resulting feature vector is high-dimensional, which may lead to the curse of dimensionality and affect the learning performance. Therefore, we grouped the domain values to reduce the variance of the distribution estimate, and thus reduced the length of the extracted vector. It has been observed that bit patterns with more than two transitions carry very little texture information [54]. For instance, when read as circular sequences, the 8-bit binary strings 00000110 and 11100011 contain two transitions each, as does 10000000 (one 1-to-0 transition plus the wrap-around 0-to-1 transition). Thus, as suggested in [54], we lumped all the bit patterns with more than two transitions into one bin. This accumulation of the binary patterns with more than two transitions into a single bin reduced the length of the LBP vector from 256 to 59. Besides, all extracted features are min-max normalized as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is an original feature value, $x'$ is its normalized value, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum feature values.
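The reduction from 256 to 59 bins and the min-max scaling can be illustrated with the following sketch. It assumes that transitions are counted on the circular 8-bit string, which is what yields 58 patterns with at most two transitions plus one shared bin. The helper names are hypothetical, and the scope of the minimum and maximum (here, a single list of values) is an assumption, since the text does not state whether scaling is done per feature or per vector.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes in the circular bit string of `code`."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))  # rotate right by one bit
    return bin(code ^ rotated).count("1")


# Codes with at most two transitions each keep their own bin ...
UNIFORM = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM)}
OTHER_BIN = len(UNIFORM)  # ... and every remaining code shares this last bin.


def reduce_histogram(hist256):
    """Map a 256-bin LBP histogram to the reduced 59-bin histogram."""
    out = [0] * (len(UNIFORM) + 1)
    for code, count in enumerate(hist256):
        out[BIN_OF.get(code, OTHER_BIN)] += count
    return out


def min_max(values):
    """Scale a list of values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # assumption: constant input maps to zeros
    return [(v - lo) / (hi - lo) for v in values]


print(len(UNIFORM) + 1)        # length of the reduced vector
print(min_max([2, 4, 6]))      # [0.0, 0.5, 1.0]
```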
The GLCM features have been used in many CAD systems intended to detect tumors [29]. They describe the image pattern using a matrix built from the co-occurrence frequencies of two adjacent pixels (i and j), which represents the relationship between them in four directions θ: 0°, 45°, 90° and 135°, as shown in Fig. 4.

GLCM directions.
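For the proposed system, the GLCM is calculated on the ROI, and then the second-order statistics contrast, correlation, energy and homogeneity are derived as follows:
$$\text{Contrast} = \sum_{i,j} |i-j|^{2}\, p(i,j \mid d,\theta)$$
$$\text{Correlation} = \sum_{i,j} \frac{(i-\mu_x)(j-\mu_y)\, p(i,j \mid d,\theta)}{\sigma_x \sigma_y}$$
$$\text{Energy} = \sum_{i,j} p(i,j \mid d,\theta)^{2}$$
$$\text{Homogeneity} = \sum_{i,j} \frac{p(i,j \mid d,\theta)}{1+|i-j|}$$
where $p(i,j \mid d,\theta)$ is the normalized co-occurrence frequency of gray levels $i$ and $j$ at distance $d$ in direction $\theta$, and $\mu_x$, $\mu_y$ and $\sigma_x$, $\sigma_y$ are the means and standard deviations (SD) of its marginal distributions.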
Thus, a 16-dimensional vector (4 statistics per direction) is extracted to encode GLCM feature.
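As an illustration of how the 16-dimensional vector arises, the sketch below builds one normalized co-occurrence matrix per direction and derives the four statistics from each. Several choices are assumptions rather than details taken from the text: a pixel distance of 1, a non-symmetric matrix, an image already quantized to `levels` integer gray levels, the 1/(1+|i-j|) form of homogeneity, and a correlation of 0 when a marginal has zero variance. All function names are hypothetical.

```python
import math

# (row, col) offsets for distance 1 at 0, 45, 90 and 135 degrees.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(img, offset, levels):
    """Normalized co-occurrence matrix; img needs at least 2 rows and 2 columns."""
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols, total = len(img), len(img[0]), 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r][c]][img[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_stats(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_x = sum(i * v for i, j, v in cells)
    mu_y = sum(j * v for i, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, j, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for i, j, v in cells))
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    energy = sum(v * v for i, j, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    if sd_x > 0 and sd_y > 0:
        cov = sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells)
        correlation = cov / (sd_x * sd_y)
    else:
        correlation = 0.0  # assumption: undefined correlation reported as 0
    return [contrast, correlation, energy, homogeneity]


def glcm_features(img, levels):
    """Concatenate the 4 statistics over the 4 directions (16 values)."""
    feats = []
    for angle in (0, 45, 90, 135):
        feats.extend(glcm_stats(glcm(img, DIRECTIONS[angle], levels)))
    return feats


checker = [[0, 1],
           [1, 0]]
print(len(glcm_features(checker, levels=2)))
```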
Wavelets are used to analyze and represent images at multiple resolution levels. They have frequently been used to extract wavelet coefficients from MRI images, and they are also useful for classification because they localize the frequency information of the image [20]. The discrete wavelet transform (DWT) uses discretely sampled wavelets, as represented in (8), where x(n) represents the signal (the image), while dj,k and aj,k are the detail and approximation sub-bands, respectively. The parameters j and k are the wavelet scale and translation factors, and the coefficients of the low-pass and high-pass filters are represented by h(n) and g(n). One of the best-known types of DWT is the Haar wavelet, denoted in (9), where L represents the decomposition level [20]. To compute the wavelet features, the DWT is applied to the images using the Haar wavelet function, with decomposition up to level 3 based on [20]. The features derived from the resulting approximation and detail sub-band images characterize the texture of the image.
Specifically, to compute the wavelet features, the DWT with the Haar wavelet function is applied to the ROI images, decomposing each image into four sub-band images (the approximation sub-band LL and the three detail sub-bands LH, HL and HH). This is done for all three decomposition levels (level = 1, 2, 3), as illustrated in Figure 5. The mean and standard deviation of all four sub-bands at each level are then calculated, which yields a 24-dimensional vector (3 levels × 4 sub-bands × 2 statistics) that encodes the Discrete Wavelet Transform feature.
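The 24-dimensional wavelet descriptor can be sketched as follows. This is an illustration under stated assumptions: the image height and width are divisible by 8 so that three levels are possible, the orthonormal Haar scaling is used (each 2×2 block is divided by 2), the population standard deviation is taken, and the assignment of the names LH and HL to the row and column differences follows one common convention that may differ from the toolbox used in the experiments. The function names are hypothetical.

```python
import statistics


def haar_level(img):
    """One 2-D Haar step on an image with even height and width."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2)  # approximation
            row_lh.append((a + b - d - e) / 2)  # difference between rows
            row_hl.append((a - b + d - e) / 2)  # difference between columns
            row_hh.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def dwt_features(img, levels=3):
    """Mean and SD of LL, LH, HL, HH at each level (8 values per level)."""
    feats = []
    current = img
    for _ in range(levels):
        bands = haar_level(current)
        for band in bands:
            flat = [v for row in band for v in row]
            feats.append(statistics.fmean(flat))
            feats.append(statistics.pstdev(flat))  # assumption: population SD
        current = bands[0]  # the next level decomposes the approximation
    return feats


flat_roi = [[4] * 8 for _ in range(8)]
print(len(dwt_features(flat_roi)))
```

On a constant image every detail sub-band is zero, which is a quick sanity check that the three detail bands respond only to intensity changes.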

DWT decomposition (a) Original image, (b) Single-level discrete 2-D wavelet transform ‘Haar’.
To validate the effectiveness of the feature extraction techniques for discriminating between soft and firm ROIs, two classifiers are chosen to learn the model and their performance is compared: the support vector machine (SVM) and k-nearest neighbors (KNN) [34].
Support Vector Machines have yielded accurate detection and recognition of brain tumors [18, 19]. The SVM classifier is therefore chosen for this study, as it is well suited to binary classification problems. SVM discriminates between the data instances $\{x_i, y_i\}$, where $x_i$ is the feature vector and $y_i \in \{\text{firm}, \text{soft}\}$ is the class value, by finding the maximum-margin hyperplane that splits the data. The equation of the hyperplane is defined as:
$$w^{T} x + b = 0$$
where $w$ is the normal vector to the hyperplane and $b$ is the bias term.
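In this research, different types of hyperplanes/margins are used. In particular, the hard margin, which is usually applied to linearly separable data, is used as follows, with the class labels encoded as $y_i \in \{+1, -1\}$:
$$\min_{w,b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{subject to} \quad y_i\,(w^{T} x_i + b) \ge 1 \ \ \forall i$$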
In addition, we used the soft margin defined as:
$$\min_{w,b}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i} \xi_i$$
where $\xi_i = \max\big(0,\ 1 - y_i\,(w^{T} x_i + b)\big)$.
Note that $\xi_i$ is used to relax the strict condition of linear separability, while $C$ is the cost parameter that controls the trade-off between misclassification and a hard margin. The higher $C$ is, the harder the margin will be.
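The role of C can be made concrete by evaluating the soft-margin objective for a fixed hyperplane. The sketch below is only an evaluation of the objective, not a solver; the function names are hypothetical, and the labels are assumed to be encoded as +1 (firm) and -1 (soft).

```python
def hinge_slack(w, b, x, y):
    """Slack of one sample: max(0, 1 - y * (w . x + b)), with y in {+1, -1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def soft_margin_objective(w, b, X, Y, C):
    """0.5 * ||w||^2 + C * sum of slacks."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    return regularizer + C * sum(hinge_slack(w, b, x, y) for x, y in zip(X, Y))


w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
Y = [1, 1, -1]  # assumption: +1 = firm, -1 = soft

# Only the second sample lies inside the margin (slack 0.5).
print(soft_margin_objective(w, b, X, Y, C=1))   # 0.5 + 1 * 0.5 = 1.0
print(soft_margin_objective(w, b, X, Y, C=10))  # 0.5 + 10 * 0.5 = 5.5
```

With a larger C the same margin violation costs more, so the optimizer is pushed toward hyperplanes that tolerate fewer violations, which is the sense in which the margin becomes harder.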
Lastly, the non-linear (kernel) SVM, which maps the input data into a higher-dimensional feature space, is formulated using:
$$K(x_i, x_j) = \exp\!\big(-\gamma\, \lVert x_i - x_j \rVert^{2}\big)$$
where $x_i$ and $x_j$ are two data points and $\gamma$ (‘gamma’) is the RBF hyperparameter. The selected kernel, the Radial Basis Function (RBF), can handle the situation where the relationship between the attributes and the class labels is nonlinear. Moreover, it has fewer hyperparameters and lower numerical complexity than the polynomial kernel. Namely, it relies on the parameter $\gamma$, which controls the spread of the kernel and thereby changes the decision region. The values selected for $C$ and $\gamma$ affect the performance of the classifier and its efficiency in predicting unknown data. Several alternative methods, such as grid search, random search and Bayesian model-based optimization, can be used to optimize these hyperparameters. In our research, we adopted the Bayesian optimization method [51], a sequential model-based approach that is combined with cross-validation for automatic hyperparameter tuning. Bayesian optimization is time-efficient and tends to yield better generalization performance on the test set.
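To show how γ changes the reach of the kernel, the following sketch evaluates the standard RBF form exp(-γ‖x_i - x_j‖²) for one pair of points at several values of γ. The function name and the sample values are illustrative only.

```python
import math


def rbf_kernel(xi, xj, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


a, b = [0.0, 0.0], [1.0, 1.0]  # squared distance 2
for gamma in (0.1, 0.5, 5.0):
    print(gamma, rbf_kernel(a, b, gamma))
```

The similarity of the same two points shrinks as γ grows, so a large γ makes each training sample influence only its immediate neighborhood, while a small γ produces smoother decision regions.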
The KNN algorithm assigns an input sample (ROI image) to a class (firm or soft) by finding the k closest samples in the training set according to a distance metric. The test sample is then assigned to the class to which the majority of those k closest samples belong. The most common distance metric is the Euclidean distance, calculated using:
$$d(u, v) = \sqrt{\sum_{m=1}^{M} (u_m - v_m)^{2}}$$
where $u$ and $v$ are two $M$-dimensional feature vectors.
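A minimal sketch of this decision rule is given below. The function names and the toy feature vectors are hypothetical; a real system would use the extracted LBP, GLCM or DWT vectors. Ties in the vote are resolved in favor of the label that appears first among the nearest neighbors, which is an assumption; using an odd k avoids ties in the two-class case.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(train_X, train_y, query, k):
    """Majority label among the k training samples closest to `query`."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


train_X = [[0, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
train_y = ["soft", "soft", "firm", "firm", "firm"]
print(knn_predict(train_X, train_y, [1, 0], k=3))  # two of the three nearest are soft
```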
This section reports and discusses the results obtained by experimenting with six different combinations of settings, formed from three visual descriptors (DWT [47], GLCM [33] and LBP [22]) and two typical classifiers (SVM [35] and KNN [34]). The attained performances are analyzed and compared in order to determine the optimal setting for this problem. The experiment steps are summarized in Figure 6. In the following, we first describe the dataset and the extraction of the ROIs. Next, we present the considered feature extraction approaches. Then, we outline the performance measures used to assess the experiments. Last, we present the results of each experiment and discuss and analyze them.

Sample diagram of the conducted experiments.
The dataset was collected from King Khalid University Hospital and contains 31 cases of male and female patients with Meningioma brain tumors; 19 cases are labeled firm and 12 cases are labeled soft. The labels were assigned by a surgeon who performed craniotomy on most of the cases; based on his findings, the tumors were assigned to the soft or firm category. Each case has sequences of T2-weighted MRI images, most of them in the axial view. The raw images are 16-bit images in DICOM (Digital Imaging and Communications in Medicine) format, an international standard for transmitting, storing, retrieving, printing, processing, and displaying medical imaging information. The data is converted to TIF for easier computation.
The image size ranges from 512×512 to 488×640. More detailed demographic information on the raw dataset is provided in Table 2. Sample images from the dataset are shown in Fig. 7 where the first row shows firm samples while the second row depicts soft samples.
Demographic features of Meningioma cases

Samples from the raw dataset. (a) firm cases, (b) soft cases.
The image cropping was done manually by a radiologist. The process starts by choosing all slices of the MRI sequence that contain the brain tumor; their number varies from case to case. Then, the regions of interest (ROIs) are selected by drawing a rectangle on the largest area of the tumor. In addition, sub-areas are selected as well, as depicted in Fig. 8 (a). The areas near the edges of the tumor are also captured, as shown in Fig. 8 (b). This results in a different number of images, of variable sizes, for each case, depending on the tumor size. The cropping was done by a radiologist for all cases in order to mimic all possible cropping scenarios that a physician might perform. This Data Augmentation [53] task is intended to address the limitations of the small data set, since small datasets do not capture the variations of all cases. Data Augmentation tackles the overfitting problem at its root (the training set) and yields better generalization of the machine learning models. It relies on the assumption that further information can be extracted from the original data collection through various operations.

Manual segmentation (a) global and local sub-areas, (b) Near-Boundary areas.
As a result, 921 samples were obtained, of which 698 are firm samples and 223 are soft samples. Samples of the regions/frames cropped by the radiologist are shown in Fig. 9. As one can see in Figure 9 (a), some of these samples contain ‘vessels’. According to the radiologist, this is a main challenge for this kind of data set because it may lead to confusion between ‘soft’ and ‘firm’ samples.

Samples from the cropped dataset: the first row shows firm cases and the second row shows soft cases; (a) global areas, (b), (c), (d), (e) and (f) sub-areas.
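To evaluate the proposed system, we used standard classification measures. Namely, we computed the accuracy, sensitivity, specificity, area under the ROC (receiver operating characteristic) curve and F-score measures. However, since our data is imbalanced, the balanced accuracy was also considered as an alternative to the standard accuracy. These measures were calculated as follows, where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{Balanced accuracy} = \frac{\text{Sensitivity} + \text{Specificity}}{2}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{F-score} = \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}$$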
The equations above are derived from the confusion matrix shown in Table 3, where the positive class refers to the firm category while the negative class refers to the soft class.
Confusion Matrix
The accuracy can be defined as the proportion of correctly classified positive and negative cases among all cases. The sensitivity, on the other hand, is the proportion of positive ‘firm’ cases correctly classified as positive among all positive cases, while the specificity is the proportion of negative ‘soft’ cases correctly classified as negative among all negative cases. The ROC curve is obtained by plotting the sensitivity against 1 − specificity, i.e., the false positive rate (FPR), at various thresholds. The AUC is the area under this curve and lies between 0 and 1, where 1 corresponds to a perfect classifier. The balanced accuracy is the average of the specificity and the sensitivity, while the F-score is the harmonic mean of precision and sensitivity, which is more meaningful than the accuracy in the case of an uneven class distribution.
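These definitions translate directly into a small helper that works from the four confusion-matrix counts. The function name and the example counts are hypothetical and are not results from this study; the positive class is taken to be ‘firm’, as stated above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard measures from confusion-matrix counts (positive class = firm)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumption: 0 when undefined
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "precision": precision,
        "f_score": 2 * precision * sensitivity / denom if denom else 0.0,
    }


# Hypothetical counts, chosen only to exercise the formulas.
for name, value in classification_metrics(tp=90, fn=10, tn=30, fp=20).items():
    print(f"{name}: {value:.3f}")
```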
In addition, a Student’s t-test [52] with a confidence level of 95% was computed in order to assess the statistical significance of the obtained results. The t-test is a statistical hypothesis test that checks whether two means are reliably different from each other. So, when the difference between the mean performance measures of two models is statistically significant, we reject the null hypothesis, which assumes that the two samples have the same mean.
As mentioned above, in this research we investigated the DWT [47], GLCM [33] and LBP [32] features along with the SVM [35] and KNN [34] classifiers. For SVM, we considered hard-margin and soft-margin SVMs in addition to a kernel-based SVM. All classification models were trained using a 10-fold cross-validation strategy. Similarly, we tuned the number of neighbors for KNN using 10-fold cross-validation.
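The 10-fold protocol can be sketched as an index generator that yields one train/test partition per fold. This is an illustration only: the function name is hypothetical, and the shuffling, the fixed seed and the absence of stratification are assumptions, since the text does not specify how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: shuffled, not stratified
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 921 is the number of cropped samples reported for the dataset.
for train, test in k_fold_indices(921):
    print(len(train), len(test))
```

Each sample is used for testing exactly once, and a performance measure is then averaged over the ten folds. The same loop can be repeated for each candidate number of neighbors to select k for KNN.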
The ROC curves obtained using the hard-margin SVM are shown in Fig. 10. As can be seen, LBP outperforms both DWT and GLCM. In fact, DWT and GLCM have nearly diagonal ROC curves, which indicates performance close to random guessing. Thus, DWT and GLCM are not appropriate for discriminating between the firm and soft classes using a hard-margin SVM.
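To illustrate why a diagonal ROC curve corresponds to an uninformative classifier, the sketch below computes the AUC through its pairwise-ranking interpretation: the fraction of (firm, soft) pairs in which the firm case receives the higher score. The function name and the scores are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# A classifier that assigns the same score to every case carries no information.
uninformative = auc([0.5, 0.5, 0.5], [0.5, 0.5])
# A classifier that scores every firm case above every soft case is perfect.
perfect = auc([0.9, 0.8, 0.7], [0.3, 0.1])
```

The uninformative scores give an AUC of 0.5, the area under the diagonal, while the perfectly separating scores give 1.0.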

ROC curves obtained using DWT, GLCM and LBP with a hard margin SVM.
The corresponding performance measures are reported in Fig. 11. As one can note, LBP yielded an F-score of 82.7% and a balanced accuracy of 69.8%. Since the hard-margin SVM is prone to overfitting, we used a linear soft-margin SVM in the next experiment in order to further improve the results.

Performance measures obtained using DWT, GLCM and LBP with a hard margin SVM.
For the soft-margin SVM experiments, the ‘c’ hyperparameter was optimized using the Bayesian optimization method [51]. The obtained ROC curves are displayed in Fig. 12. As can be seen, LBP yields better results with the soft-margin SVM than with the hard-margin SVM.

ROC curves obtained using DWT, GLCM and LBP with a soft margin SVM.
The corresponding performance metrics are shown in Fig. 13. For LBP, the sensitivity increased by almost 20%. However, the specificity did not increase. This means that the instances of the ‘soft’ class are still not well classified. The results obtained using DWT and GLCM reached 100% sensitivity and 0% specificity. This means that these models classify every sample as firm, i.e. all firm cases are correctly classified, but none of the soft cases is identified.
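This degenerate behaviour can be checked directly from the confusion-matrix counts. The worked example below assumes the standard definitions of the measures and the notation TP, FN, FP and TN, which are assumptions here. When every sample is predicted as firm, FN = 0 and TN = 0, so:

```latex
\begin{align}
\text{Sensitivity} &= \frac{TP}{TP + 0} = 1, \\
\text{Specificity} &= \frac{0}{0 + FP} = 0, \\
\text{Balanced accuracy} &= \frac{1 + 0}{2} = 0.5 .
\end{align}
```

The balanced accuracy therefore falls to chance level, even though the standard accuracy and the F-score can remain high when firm cases form the majority of the data.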

Performance metrics obtained using DWT, GLCM and LBP with a soft margin SVM.
Figure 14 depicts the performance measures attained using the RBF SVM. As can be seen, when using the kernel SVM [35], the performance of the proposed system increased significantly for all considered features. As with the hard-margin and soft-margin SVMs, LBP outperforms the other two features. We should mention that, for LBP, while the sensitivity slightly decreased, the specificity increased by 25%, resulting in an F-score of 93.3% and a balanced accuracy of 81.7%. DWT also yielded much better results when coupled with a kernel SVM.

Performance measures obtained using DWT, GLCM and LBP with an RBF SVM.
Specifically, DWT reached an F-score of 91.5% and a balanced accuracy of 77.3%. Similarly, the performance of the proposed system increased when the GLCM feature was fed to the kernel SVM. The obtained F-score is 90.5% and the balanced accuracy is 72.7%.
In summary, one can claim that LBP [32] outperforms DWT [47] and GLCM [33]. Thus, it is more appropriate for discriminating between the soft and firm instances. Moreover, the kernel SVM significantly improved the performance of the proposed system compared to the hard-margin and soft-margin SVMs. This suggests that the soft and firm classes are not linearly separable and that a mapping of the features to a new feature space is necessary in order to separate the two classes. Table 4 summarizes the obtained results.
Performance measures obtained using DWT, GLCM and LBP along with SVM
For the KNN based experiments, we built four models using different numbers of neighbors K. Specifically, we set K to 1, 3, 5 and 7. This was done for each set of features (DWT, GLCM and LBP) to investigate the effect of the number of neighbors on the overall performance. The Euclidean distance was used to measure the distance between a data instance and its neighbors. In addition, inverse distance weighting was applied to the nearest neighbors. The results of these experiments are reported in Table 5 and illustrated in Fig. 15. As can be seen for LBP, the best results were attained using K = 1, with an F-score of 95% and a balanced accuracy of 87%. This was confirmed by the other performance measures too. However, as K increases, the LBP based results decrease. On the contrary, the results obtained using DWT and GLCM are higher for large K. One should note that DWT yielded relatively similar results across settings. Its three top F-scores all lay between 90% and 91.2% (90%, 91% and 91.2% with K = 3, 5 and 7 respectively). The corresponding accuracies were 75.8%, 76% and 74.9%. Similarly, the top two F-scores obtained using GLCM, with K = 5 and 7, were 86.6% and 86.5% respectively, and the best balanced accuracy values were 62% and 61.3% respectively. As with SVM, LBP still outperforms the other features, as illustrated by the ROC curves in Fig. 15, except for K = 7, where DWT is slightly better than LBP in terms of F-score.
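The KNN setting described above can be sketched as follows, assuming that the distance weighting means each neighbor votes with weight 1/d, where d is its Euclidean distance to the query. The function name and the two-dimensional toy data are hypothetical and stand in for the actual feature vectors.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by inverse-distance-weighted voting among its k nearest neighbors."""
    # Sort all training samples by Euclidean distance to the query.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        if distance == 0.0:
            return label  # an exact match decides the class outright
        votes[label] += 1.0 / distance  # closer neighbors weigh more
    return max(votes, key=votes.get)


# Hypothetical toy data: two 'soft' samples near the origin, three 'firm' samples far away.
train_x = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.9, 1.0)]
train_y = ["soft", "soft", "firm", "firm", "firm"]
near_soft = knn_predict(train_x, train_y, query=(0.2, 0.1), k=5)
near_firm = knn_predict(train_x, train_y, query=(1.0, 0.9), k=3)
```

In the first query, three of the five neighbors are firm, yet the two much closer soft samples outweigh them, so the weighted vote returns 'soft' where an unweighted majority vote would return 'firm'.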
Performance measures achieved using DWT, GLCM and LBP along with KNN (K = 1, 3, 5, 7)

ROC curves obtained using different features (DWT, GLCM and LBP) along with KNN (K = 1, 3, 5, 7).
The best performances for each feature with both classifiers (SVM and KNN) are reported in Table 6. The RBF based SVM outperforms KNN when used with DWT and GLCM. On the other hand, for LBP, KNN with K = 1 outperforms the RBF SVM.
Comparison between the best SVM models and the KNN model for each feature type
To statistically validate the conclusion above, we conducted a two-tailed paired Student’s t-test with a confidence level of 95% using the balanced accuracy and F-score results of each of the 10 folds of the cross-validation. The test is intended to determine whether the differences between the best results and the others are statistically significant. If the p-value is below 0.05, the difference is statistically significant and the t-test rejects the null hypothesis. On the other hand, when the p-value is equal to 0.05 or above, there is no statistically significant difference between the compared results and the t-test does not reject the null hypothesis.
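A minimal sketch of such a paired test on per-fold scores is given below. It compares the t statistic with the two-tailed critical value for 9 degrees of freedom (2.262 at the 0.05 level) instead of computing a p-value, which is an equivalent decision rule. The function name and the per-fold values are hypothetical and are not the paper's results.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t for alpha = 0.05 and 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def paired_t_statistic(a, b):
    """t statistic of the paired differences between two equally long score lists."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical per-fold balanced accuracies of two models over the same 10 folds.
model_a = [0.88, 0.85, 0.90, 0.86, 0.87, 0.89, 0.84, 0.88, 0.86, 0.87]
model_b = [0.80, 0.82, 0.83, 0.79, 0.81, 0.84, 0.80, 0.82, 0.78, 0.81]

t_stat = paired_t_statistic(model_a, model_b)
# Reject the null hypothesis of equal means when |t| exceeds the critical value.
significant = abs(t_stat) > T_CRIT_DF9
```

The test is paired because both models are evaluated on the same folds, so only the fold-wise differences enter the statistic.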
First, a t-test was conducted on the KNN folds to determine the best K value for each feature set. The attained results are reported in Table 7. Then, to determine which classifier performs better with each feature, another t-test was used to compare the SVM and KNN results, as shown in Table 8. Lastly, as can be seen in Tables 9, 10 and 11, a t-test was also conducted to validate the results obtained using the different features.
T-test results on KNN results obtained using different features
T-test of classifiers comparison (SVM vs KNN) for each feature
T-test results to statistically validate DWT and GLCM performance
As one can see, most of the KNN based results are significantly different from each other. However, based on the t-test results using the balanced accuracy, KNN classification using DWT with K = 3 and K = 5 yielded almost the same results. Therefore, KNN classification using DWT performs best with both K = 3 and K = 5 compared to the other KNN settings. On the other hand, GLCM coupled with K = 3, 5 or 7 yields statistically different results. Specifically, GLCM performs best with K = 5. Lastly, with LBP, all models performed differently, which confirms that LBP performs best when K = 1.
T-test results to statistically validate LBP and DWT performance
T-test results to statistically validate GLCM and LBP performance
Based on the t-test results obtained using the F-score, the KNN classification results based on the DWT feature are all statistically different. Thus, one can claim that DWT with K = 5 yields the best performance. For GLCM, there is no significant difference between the K = 5 and K = 7 results, and both outperform the other settings. Finally, the results obtained by the different settings with LBP are also statistically different. This makes K = 1 the best setting for the LBP feature.
Table 8 shows that all SVM vs KNN results are significantly different from each other. According to this and to the previous results in Table 7, one can conclude that the RBF SVM outperforms all KNN configurations, except for LBP with K = 1. However, the K = 1 setting may result in overfitting. In other words, the RBF SVM with LBP is expected to generalize better.
To validate the feature comparison results (DWT vs GLCM vs LBP), we conducted a t-test based on the performance of both classifiers (SVM and KNN). For KNN, we considered the best setting for each feature according to the results above. Specifically, we used DWT with K = 5, GLCM with K = 5 and 7, and LBP with K = 1.
According to the t-test results in Tables 9, 10 and 11, one can conclude that the best combination is LBP along with KNN (K = 1). However, if we take the overfitting issue into consideration, LBP coupled with the RBF SVM is the best combination, followed by LBP with KNN (K = 3). For the DWT and GLCM features, the best performance was attained using the RBF SVM. In this research, LBP proved to be the most discriminative feature compared to DWT and GLCM for the automatic detection of Meningioma tumor firmness in MRI images. This can be attributed to the high distinctive power of LBP and its invariance to grayscale changes.
Detecting Meningioma tumor firmness in MRI images is crucial for operative strategy and patient counseling. However, up to now, no research has tackled Meningioma firmness detection using machine learning techniques. Our experiments using state-of-the-art machine learning and feature extraction techniques yielded promising classification performance and showed that automatic Meningioma firmness detection can be exploited to support the physician's decision regarding the operative strategy to be adopted for the patient. As a start, in this research we proposed to couple supervised machine learning techniques with typical texture descriptors in order to automatically detect the Meningioma tumor firmness. Specifically, DWT, GLCM and LBP descriptors, along with the SVM and KNN classification algorithms, were used to predict the Meningioma firmness. In addition, a real dataset was collected and annotated with accurate ROIs by a radiologist in order to conduct the planned experiments. Moreover, the collected dataset was augmented to alleviate the concern related to the relatively small data size.
For SVM based classification, LBP [32] outperformed DWT [47] and GLCM [33]. Moreover, the kernel SVM outperformed the hard-margin and soft-margin SVMs. This suggests that the soft and firm classes are not linearly separable. Thus, the mapping of the original features into a new feature space using the RBF kernel yielded an easier discrimination between the two considered classes. In particular, LBP, when coupled with the kernel SVM, yielded an increase in specificity of 25%, resulting in an F-score of 93.3% and a balanced accuracy of 81.7%. Similarly, for KNN based classification, LBP outperformed the other features, as illustrated by the ROC curves depicted in Fig. 15.
In summary, the obtained results showed that LBP yields the best classification performance using both KNN and the RBF SVM. Specifically, it attained balanced accuracies of 87% and 81.76% and F-scores of 95% and 93% respectively. This performance can be mainly attributed to the statistical robustness of LBP. In fact, using uniform patterns (59 dimensions) instead of all the possible patterns (256 dimensions) yielded better classification results. Such uniform patterns are more stable and exhibit less sensitivity to noise. Moreover, the significantly lower number of possible LBP labels (59 vs. 256) requires fewer data instances to yield an accurate estimate of their distribution. One should note that KNN has considerable drawbacks. In particular, it is a lazy learner that does not generalize well and hence can be prone to overfitting.
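The figure of 59 dimensions can be verified with a short sketch, assuming the usual definition of a uniform pattern for an 8-neighbor LBP code: a pattern with at most two circular transitions between 0 and 1. The function name is hypothetical.

```python
def transitions(pattern, bits=8):
    """Count the circular 0/1 transitions in the binary representation of an LBP code."""
    count = 0
    for i in range(bits):
        current_bit = (pattern >> i) & 1
        next_bit = (pattern >> ((i + 1) % bits)) & 1  # wraps around at the last bit
        count += current_bit != next_bit
    return count


# Uniform patterns are those with at most two circular transitions.
uniform = [p for p in range(256) if transitions(p) <= 2]
# One extra histogram bin collects all non-uniform patterns.
n_bins = len(uniform) + 1
```

Of the 256 possible codes, 58 are uniform, and the single shared bin for the remaining codes brings the histogram length to 59.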
As future work, the proposed system could be extended through the inclusion of more complex visual descriptors along with fusion techniques to aggregate the extracted features. In addition, transfer learning could be adopted in order to apply deep learning paradigms to this classification problem. Besides, the collection of a dataset with a better representation, such as the FLAIR MRI sequence, can be considered as an interesting alternative for future work.
Acknowledgments
This work was supported by the Research Center of the College of Computer and Information Sciences at King Saud University. The authors are grateful for this support.
