Ensembled EfficientNetB3 architecture for multi-class classification of tumours in MRI images

Abstract

Healthcare informatics plays a major role in processing medical images for the diagnosis and treatment of brain tumours worldwide. Timely detection of abnormal structures in the brain supports clinicians in processing and analysing medical images. Multi-class image classification of brain tumours faces challenges such as scaling to large datasets, training on image data, efficiency and accuracy. The EfficientNetB3 neural network scales in three dimensions (depth, width and resolution), resulting in improved accuracy. The proposed framework optimizes an ensembled architecture of EfficientNetB3 with U-Net for MRI images, applying a semantic segmentation model with pre-trained backbone networks. The proposed model operates on a substantial network that gains robustness by extracting features in the U-Net encoder. The decoder enables pixel-level localization at a definite precision level through an average ensemble of segmentation models. The ensembled pre-trained models provide better training and prediction of abnormal structures in MRI images, together with thresholds for multi-class classification in medical image visualization. The proposed model achieves a mean accuracy of 99.24 on the Kaggle dataset of 3064 images with a mean Dice score coefficient (DSC) of 0.9124, and is compared with two state-of-the-art neural models.

Keywords

Ensemble, EfficientNet B3, U-Net architecture, medical images, multi-class image classification, segmentation

1. Introduction

The introduction of deep learning models in the medical domain and in clinical diagnosis aids medical procedures and improves decision-making through the visualization of medical imaging modalities. Cases of abnormal structures such as brain tumours are rising rapidly all over the world for reasons such as age, chemical exposure and radiation [1]. Healthcare therefore needs well-organized and reliable practices to analyse diseases such as cancer, which is a major cause of mortality worldwide. The best practice for investigating and identifying tumours is magnetic resonance imaging (MRI). Grading shows that in most cases third- and fourth-stage tumours are harmful, as they develop in the brain at a rapid rate [2, 3]. In the initial stage the tumour does not spread much; in later stages, however, it may spread to other parts of the brain. Early diagnosis of a tumour is hence recommended, because cells in the affected part of the brain are destructive and may subsequently pass into the circulation systems of the brain and spread into the surrounding cells. Consequently, it is imperative to identify brain tumours in their early phases with the highest possible accuracy.

Brain MRI images enable the classification of tumours into multiple classes by means of data segmentation and image detection using deep neural network models. Multi-class image classification techniques applied to medical imaging modalities increase the diversity of the available dataset through image cropping, flips, Gaussian filters, Principal Component Analysis (PCA), colour jittering etc. [4, 5], and through these techniques the significant characteristics of the original images can be captured. The possible improvement in data visualization depends entirely on the learning strategy, and deep learning strategies that include data augmentation reduce overfitting of the models during training. This technique is one of the image pre-processing methods considered effective for training on datasets of an extremely discriminative nature. Abnormalities in the brain structure are diagnosed by experts using manual methodologies, i.e., visualizations such as MRI scans. The main problem is that diagnosis by experts is time-consuming and prone to errors. As the number of cases rises, the workload of diagnosis also increases and may lead to erroneous decisions; because the procedure is not repeatable, there may be inter-reader and intra-reader variation in the interpretation of MRI scans.

Even though many distinguished approaches have been used in earlier works to analyse abnormalities in MRI scans with good results, there is still considerable scope for improved multi-class detection and higher accuracy to support diagnosis by experts. The proposed work provides a non-invasive method using a neural model that segments and detects the type of tumour among multiple classes with higher efficiency and accuracy. Although various convolutional neural models have been developed for T1-weighted CT-MRI images with average accuracy, misdiagnosis remains a major concern [6, 7, 8, 9]. The proposed neural model aims to make the use of medical image datasets efficient and affordable in a low-budget economy. The paper makes the following contributions to the existing literature:

(i) It deals with the pre-processing of brain MRI images, followed by image augmentation to enlarge the image dataset for training and testing.
(ii) The proposed methodology combines state-of-the-art approaches in an ensembled EfficientNetB3 architecture with U-Net to fine-tune the model for identifying multiple classes of tumours in the MRI dataset.
(iii) The next stage incorporates segmentation and detection of tumours, which are then classified into three labelled classes (first class meningioma, second class glioma and third class pituitary tumour) with optimization of parameters.
(iv) A comparative analysis of the proposed model against competitive CNN models is carried out with the inclusion of fewer parameters.

The remainder of the paper is organized as follows. Section 2 reviews the literature. Section 3 describes the materials and methods used in the proposed work, Section 4 explains the proposed method, and its results are discussed in Section 5. Section 6 concludes the work and outlines its future scope.

2. Literature review

Several neural models have been used to segment tumours in the brain using MRI scans. Image classification has been performed on multiple classes to identify structural abnormalities; these works are discussed below in three categories.

2.1 Medical image augmentation strategies

Venu (2021) addresses the problem of imbalanced datasets caused by the cost of annotation. To this end, the data augmentation methodology generates artificial medical images through a generative adversarial approach called DCGAN (Deep Convolutional Generative Adversarial Network), with an FID score of 1.28. The experiment demonstrates that the neural network classifier performs better when it is trained on the augmented dataset [10]. Barrowclough (2021) proposed image segmentation with implicit spline representations using a deep neural network. The experiment is based on predicting control points, where the spline function is fitted to the segmentation boundary, and quantitatively computes standard metrics for the loss function on different networks. An average volumetric test Dice score of 92% was achieved on the medical dataset [11]. Chen (2019) uses a co-teaching architecture to train networks on mutually exclusive partitions of the medical training dataset, and then applies cross-validation to select among the inputs. The disadvantage of this methodology is that only a subset is used for training. The strategy improves the generalization of neural networks with respect to real-world and synthetic training noise [12]. Yun (2019) introduces the CutMix strategy, which cuts a random rectangular section from an image and, based on a binary mask, mixes two random images with almost similar parameters [13].

2.2 Brain tumour segmentation and detection

Jia (2020) proposed a procedure for identifying tumours in brain MRIs using FAHS-SVM (Fully Automatic Heterogeneous Segmentation using Support Vector Machine) [14]. Tumour regions are recognized by using classification data statistically and by grouping the inherent structure of the medical image. The tumour regions are spatially small relative to the overall image content, so suitable and robust information must be available for the subsequent segmentation. The FAHS-SVM technique can achieve promising tumour segmentation in combination with a semi-supervised method under a global and local accuracy structure. The experiments focus on multiparametric brain MRI and indicate that the precise location of the tumour is identified correctly and rapidly. The strategy detects abnormal tissue in brain MRI and is adequate for supporting diagnostic and clinical specialists in clinical decision systems. Rai (2020) used the LU-Net convolutional neural network to segment brain MRIs and classify them as with or without tumour. LU-Net is a less complex model with fewer layers than U-Net, which specialises in medical image datasets. In the reported experiment, segmentation and detection on brain MRI delivered the best accuracy and outperformed the other deep neural network models. The image dataset was pre-processed and augmented before training and validation, and the results were compared with the VGG-16 and LeNet neural models [15]. The performance of the models is assessed with metrics such as precision, recall, F-score and specificity. The results show that the proposed model surpasses the other neural models with an accuracy of 98%.

Krishnammal (2019) classified the tumour and localized it accurately using the curvelet transform, which extracts features smoothly at high resolution and with good directionality. In the experiment, which uses a k-means algorithm, 100% accuracy is reported for the validation and training phases [16]. In a CNN, the training phase requires more time, as a large labelled image dataset is the primary condition for model training. The work reports that a GPU interface handles datasets with higher accuracies.

Bahrami (2020) proposed an efficient convolutional neural network model comprising a U-Net model and a Seg-net model with the SeLU activation layer in order to generate MRI images efficiently. The quantitative measures, i.e., MAE (mean absolute error) and ME (mean error), indicate performance that is more promising than that of atlas-based methods [17]. The efficient CNN deep learning model was assessed against ground-truth CT (computed tomography) images and the generated synthetic computed tomography (sCT) images using the mean absolute error and mean error metrics, where $N$ denotes the number of voxels, indexed from $i=1$ to $N$ [18, 19, 20]. For soft tissue, segmented bone and air cavities in the reference and synthetic CT images, the Dice coefficient was measured to assess tissue identification accuracy (TIA) against the atlas method and the U-Net model. The method shows an efficient deep learning ability for small training datasets compared with the U-Net and atlas methods.

2.3 Multiclass image classification of MRI images

Siva Raja (2020) experimented with hybrid autoencoders for the classification of MRIs in order to eliminate noise from the MRI image. A non-local mean filter is applied, after which a Bayesian fuzzy clustering method is used for segmentation and features are extracted using wavelet packets. The approach yields good accuracy but requires more computation time, which makes it inefficient for larger datasets [21]. Kermi (2018) proposed a U-Net architecture as a more accurate and automatic technique for segmenting MRI medical images. The deep neural model was trained on slices of both Low-Grade Glioma and High-Grade Glioma volumes [22]. The main objective is to estimate the network parameters that minimize the loss function. The loss function combines Generalized Dice Loss and Weighted Cross-Entropy to address the issue of class imbalance in the dataset. The reported segmentation time and evaluation measures show that the results are similar to those obtained manually by experts, although the technique can still be improved.

Yaniv (2020) proposed a three-dimensional V-net architecture for prostate segmentation in CT-MRI imaging, which is mainly effective with respect to memory use and measures such as Dice accuracy. This prototype uses a macro-structure 3D network [23, 24, 25] built on two paths, an encoder and a decoder. The architecture is trained with a distinct loss function based on the DSC (Dice Similarity Coefficient). In prostate segmentation, the resulting network has 90% fewer learnable parameters and requires 90% less storage, showing that the V-net Light network can handle complex three-dimensional segmentation tasks with limited resources. Togacar (2020) introduces BrainMRNET, a novel CNN architecture for image classification with an accuracy of 96.05%. The major limitation of this experiment is that good accuracy is shown only for binary classification; for multi-class classification the model is not efficient [26]. Romera (2018) proposed ERFNet (Efficient Residual Factorized Network), which applies factorized convolutions with residual connections. The method runs at about 83 FPS on a single Titan X and at 7 FPS on an embedded Jetson TX1 device [27]. Raghavendra (2022) proposed a framework using a multi-layered stacked probabilistic belief network to classify tumours as malignant or benign on the BraTS image dataset, yielding higher accuracy. The medical images are pre-processed using anisotropic filters, and contrast is enhanced via histogram equalization. The method performs well in identifying the shape, location and size of multiple classes of tumours [28]. Jemimma (2022) proposed a novel approach called CWSCO-enabled CNN for pre-processing brain tumour images by removing noise and artifacts from the MRI image. The approach applies fuzzy clustering for tumour segmentation, using the wavelet and scattering transforms for feature extraction. Afterwards, a deep CNN model trained by CWSCO is used for tumour classification, achieving a specificity of 98.59% and an accuracy of 95.52% [29].

3. Materials and methods

This section covers the methodologies and the dataset used for the segmentation and detection of brain tumours in MRI images, which are further utilized in the proposed methodology.

3.1 Dataset description

The Kaggle image dataset consists of T1-weighted CE-MRI brain images, with 3064 slices from 233 patients, of which 1426 are gliomas, 930 are pituitary tumours and 708 are meningiomas. The images have a resolution of 512 $\times$ 512 with a pixel size of 0.49 $\times$ 0.49 mm$^{2}$. The 2D slices are 6 mm thick with a slice gap of 1 mm [30]. In line with clinical settings, the CT-MRI images comprise a fixed number of 2D slices with large gaps, as shown in Fig. 1.

Figure 1.

T1 weighted CT-MRI image.

Figure 2.

Flow diagram for EfficientNetB3 neural network.

Figure 3.

Compound scaling with three dimensions for EfficientNet B3.

3.2 EfficientNetB3 deep learning

EfficientNet is a family of convolutional neural models that are pre-trained on the ImageNet database. The concept was proposed by Mingxing Tan of the Google research team in 2019 in the conference paper titled “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks” [31]. EfficientNet provides a baseline network that is scaled to attain maximum accuracy on image datasets. Such models serve as backbone networks that focus on image classification tasks. Here the architecture is used in a novel end-to-end semantic segmentation model. The model is lightweight: an AutoML framework is used to develop the baseline B0 network, whose depth, width and resolution are then scaled using a compound coefficient. The coefficients are then increased to obtain the models B1 to B7, which outperform existing neural models [32, 33, 34].

The scaling method behind EfficientNetB3 is one of the recent scaling methods and also works well on MobileNets and ResNet models. Its effectiveness depends on the baseline network, which led to the development of a new baseline network that is scaled up to obtain the EfficientNet family, including EfficientNetB3. Compared with previous neural models, the EfficientNet family reaches GPipe-level accuracy with fewer parameters and runs faster at inference. For a similar number of FLOPS, accuracy improves with this model. The main motivation behind the model is to develop a CNN under a fixed GPU resource budget. EfficientNets are scaled to obtain the maximum accuracy with the same available resources on benchmark datasets. Many neural models are scaled along a single dimension, such as depth or width, with manually tuned parameters; in contrast, as shown in Fig. 2, this model scales all three dimensions (depth, width and resolution) together [35].

3.2.1 Compound scaling

This neural network, also called the Efficient ConvNet architecture, uses a technique called the compound coefficient, which uniformly scales the model in each dimension, width ($W$), depth ($D$) and resolution ($R$), through a fixed set of scaling coefficients. The width ($W$) denotes the number of channels in the layers of the model, the depth ($D$) denotes the number of layers in the model, and the resolution ($R$) is the size of the input image. By varying these scaling dimensions, the models EfficientNet B0 to EfficientNet B7 were introduced, which exceed the state-of-the-art accuracy of most neural models with better efficiency. The baseline network of the EfficientNet architecture has approximately 5.3M parameters and a 224 $\times$ 224 input image, as shown in Fig. 3.

Colaco et al. [36, 37] propose a compound scaling process that uses a compound coefficient $\phi$ to scale up the network's dimensions. Scaling up the dimensions of the neural model allows more complex features to be extracted, although training the network becomes more difficult because of the vanishing gradient problem. Compound scaling captures fine-grained features while remaining easy to train efficiently. In addition, medical images of higher resolution allow the proposed neural model to extract fine-grained features from the image dataset. The methodology balances the network's dimensions by maintaining a constant ratio for all of them, as in Eq. (1).

$\displaystyle\text{Depth }(D)=\alpha^{\phi},\quad\text{Width }(W)=\beta^{\phi},\quad\text{Resolution }(R)=\gamma^{\phi}$ (1)

$\displaystyle\text{where}\ \alpha\cdot\beta^{2}\cdot\gamma^{2}\approx 2\ \text{and}\ \alpha\geqslant 1,\ \beta\geqslant 1,\ \gamma\geqslant 1$

Here $\alpha$, $\beta$, $\gamma$ are constants determined through grid search, and $\phi$ is the user-defined compound coefficient that controls how many additional computational resources are available for model scaling. The values of $\alpha$, $\beta$, $\gamma$ are fixed by the grid search and are then scaled up with different values of $\phi$ to produce the models ranging from B0 to B7. The FLOPS of a regular convolution are proportional to $d$, $w^{2}$, $r^{2}$: doubling the depth of the network doubles the FLOPS, whereas doubling the width or the resolution increases the FLOPS fourfold. To reduce the design space, all layers are restricted to scale uniformly with a constant ratio [38, 39]. The aim is to gain maximum accuracy, which can be further optimized through the parameters. Improved accuracy is attained by scaling the ConvNet in the three dimensions ($D$, $W$, $R$), and it is crucial to balance all of them. The objective in Eq. (2) maximizes accuracy over the three dimensions subject to target memory and FLOPS budgets.
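As a minimal sketch of the compound scaling rule in Eq. (1), the snippet below computes the depth, width and resolution multipliers and the resulting FLOPS growth for a given compound coefficient. The constants 1.2, 1.1 and 1.15 are the grid-search values quoted later in this section; the function names and the range of compound coefficients in the loop are hypothetical and chosen only for illustration.

```python
# Illustrative sketch of the compound scaling rule in Eq. (1).
# ALPHA, BETA, GAMMA are the depth, width and resolution constants quoted in
# this section; the function names are hypothetical.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return the (depth, width, resolution) multipliers for coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi


def flops_factor(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """FLOPS grow in proportion to d * w^2 * r^2."""
    d, w, r = compound_scale(phi, alpha, beta, gamma)
    return d * w ** 2 * r ** 2


if __name__ == "__main__":
    for phi in range(4):  # assumed range, for illustration only
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.3f}, width x{w:.3f}, "
              f"resolution x{r:.3f}, FLOPS x{flops_factor(phi):.2f}")
```

With these constants the product of the depth constant and the squared width and resolution constants is about 1.92, so each unit increase of the compound coefficient roughly doubles the FLOPS, in line with the constraint stated for Eq. (1).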

$\displaystyle\mathcal{N}=\bigodot_{i}\mathcal{F}_{i}^{L(i)}\left(X_{(H,W,C)}\right)$

$\displaystyle\max_{D,W,R}\ \text{Accuracy}\left(\mathcal{N}(D,W,R)\right)$ (2)

$\displaystyle\text{where}\ \mathcal{N}(D,W,R)=\bigodot_{i}\mathcal{F}_{i}^{D\cdot L(i)}\left(X_{(R\cdot H(i),\,R\cdot W(i),\,W\cdot C(i))}\right),\quad\text{Memory}(\mathcal{N})\leqslant\text{target-memory},\quad\text{Flops}(\mathcal{N})\leqslant\text{target-flops}$

where $\mathcal{N}$ represents the network and $i$ is the stage index. $\mathcal{F}$ denotes the convolution operation, $L(i)$ is the number of times $\mathcal{F}$ is repeated in stage $i$, and $H$, $W$, $C$ are the dimensions of the input tensor predefined in the baseline network. In this scaling procedure, $L(i)$ governs the depth, $H$ and $W$ give the input resolution of the network, and $C$ governs the width of the neural network. Two rules restrict the choice of compound coefficients: first, every layer is scaled uniformly by a constant ratio; second, every layer uses similar convolution operations. Equation (3) defines the candidate pool CP, whose elements are the convolutional network models determined by the combined scaling coefficients $D_{i}$, $W_{j}$ and $R_{k}$ ($i$, $j$ and $k$ index the magnitudes of the scaling coefficients), and $N$ denotes the magnitude scale.

$\displaystyle\text{CP}=\{\text{CNN Model}\ (D_{i},W_{j},R_{k})\mid\text{for}\ i,j,k\in N\ \text{and}\ 0<W_{j},D_{i},R_{k}<1\}$ (3)

Table 1

Depth coefficients for scaling down the EfficientNet

$D_{i}$	$D_{1}$	$D_{2}$	$D_{3}$	$D_{4}$	$D_{i}$
Coefficient	1.0	0.7	0.6	0.5	…
Total operators	$T_{1}$	$T_{2}$	$T_{3}$	$T_{4}$	$T_{u}$
	18	17	15	12	…

The depth coefficients are calculated from Eq. (4). For scaling down, predefined coefficients from the EfficientNet family are used, as shown in Table 1. Because of the repeated ceiling function, the coefficients 0.9 and 0.8 result in the same depth as $D_{1}=1.0$ and therefore do not appear in the table.

$\displaystyle D_{i}=\begin{cases}1.0&\text{if}\ i=1\\ D_{i-1}-0.1x,\ \text{where}\ x\in N\ \text{such that}\ \sum_{s=1}^{9}\text{Ceil}(R_{k}\cdot D_{i-1})>\sum_{s=1}^{9}\text{Ceil}(R_{k}\cdot D_{i})&\text{if}\ i>1\end{cases}$ (4)
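The selection rule in Eq. (4) can be illustrated with a short sketch. It assumes that the quantity inside the ceiling function is the number of layer repeats of each of the nine stages of the baseline network, and it assumes the repeat pattern 1, 1, 2, 2, 3, 3, 4, 1, 1 (stem, seven MBConv stages, head), which is not stated in the text. Under these assumptions the sketch reproduces the coefficients and operator totals of Table 1. Function names are hypothetical.

```python
# Sketch of the depth-coefficient selection in Eq. (4).
# BASE_REPEATS is an assumed per-stage repeat pattern for the nine stages of
# the baseline network; it is not given in the text.
BASE_REPEATS = [1, 1, 2, 2, 3, 3, 4, 1, 1]


def total_operators(tenths, repeats=BASE_REPEATS):
    """Sum of Ceil(repeat * D) over all stages, with D = tenths / 10.

    The coefficient is kept as an integer number of tenths so that the
    ceiling is computed exactly, without floating-point error.
    """
    return sum(-(-rep * tenths // 10) for rep in repeats)


def depth_coefficients(min_tenths=5):
    """Step down from D_1 = 1.0 in units of 0.1 and keep a coefficient only
    when it lowers the total operator count of the previously kept one."""
    kept = [(10, total_operators(10))]
    for tenths in range(9, min_tenths - 1, -1):
        ops = total_operators(tenths)
        if ops < kept[-1][1]:
            kept.append((tenths, ops))
    return [(t / 10, ops) for t, ops in kept]


if __name__ == "__main__":
    for coeff, ops in depth_coefficients():
        print(f"D = {coeff:.1f} -> total operators = {ops}")
```

The coefficients 0.9 and 0.8 leave every ceiling unchanged, so they yield the same 18 operators as 1.0 and are skipped, which matches the jump from 1.0 to 0.7 in Table 1.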

To calculate the width and resolution coefficients from Eq. (5), constant ratios for all three dimensions are set to the values $W=1.1$, $D=1.2$ and $R=1.15$ obtained through a predefined grid search. The predefined coefficients scale down $W_{j}$ and $R_{k}$ together with the depth using the grouping method defined below:

$\displaystyle\Delta D=\frac{T_{u+1}}{T_{u}},\quad\Delta W=\frac{W_{j+1}}{W_{j}},\quad\Delta R=\frac{R_{k+1}}{R_{k}}$ (5)

$\displaystyle\text{where}\ \Delta W:\Delta D:\Delta R=1.1:1.2:1.15$

Table 2

Hyperparameters of the EfficientNetB3

Model	EfficientNetB3
Size of image (max resolution)	300 $\times$ 300
Batch size	32
Optimizer	Adam
Learning rate	0.001
Learning rate decay	SGD decay

The hyperparameters set for the EfficientNet B3 architecture are listed in Table 2.

The complexity of the neural model, which raises overfitting issues on the image dataset, can be limited in the proposed methodology by not fine-tuning all the layers of the neural network. For that purpose SGD is considered as an optimizer, since it uses a single randomly selected sample per iteration. SGD decay is the learning rate scheduler used to stabilize convergence [40, 41]. All the layers of this architecture receive gradient updates during training, i.e., the weights of all layers are pre-trained on data classified into diverse classes. The fine-tuning of the neural model is therefore based on the differences between the dataset distributions, and for that the SGD optimizer with SGD learning rate decay is set as the default.

3.3 U-Net

U-Net is a fully convolutional deep network that accommodates input images of varied size and reduces the number of parameters of the neural model. In the context of EfficientNet B3, the U-Net is trained end to end so that the output image size is close to the input image size. The encoder, called the contraction path, extracts features and captures context, and the decoder enables precise localization. The encoder and decoder blocks use 3 $\times$ 3 convolutions, each followed by an activation layer. The final layer applies a 1 $\times$ 1 convolution to map the features to the desired classes [42, 43, 44]. Skip connections recover the information that is lost on the way from the encoder to the decoder by linking them at matching levels. When this architecture is trained from scratch with randomly initialized weights, it requires a huge amount of data at a high cost.

3.4 Evaluation metrics

The network model for brain MRI segmentation and detection is based entirely on the classification of pixels for meningioma, glioma and pituitary tumours. The EfficientNet B3 in U-Net architecture produces the network output as a binary image (the predicted mask). For each of the three classes, the network output is “1” for tumour and “0” for non-tumour. To evaluate the performance of the model, the measures considered for the images in the dataset include mean loss, learning rate, Dice score, accuracy (AUC), precision (Pr) and recall (Re) [45, 46, 47]. The end-to-end semantic segmentation model is evaluated on these metrics, and the reported score is the average of each metric. Precision and recall are calculated using Eqs (6) and (7):

$\displaystyle\text{Precision},\ \textit{Pr}=\frac{\textit{TP}}{\textit{TP}+\textit{FP}}$ (6)

$\displaystyle\text{Recall},\ \textit{Re}=\frac{\textit{TP}}{\textit{TP}+\textit{FN}}$ (7)

TP stands for true positive, FP false positive, FN false negative. Based on these two equations, $F_{\beta}$ can be evaluated for balancing precision and recall:

$\displaystyle F_{\beta}=\frac{(1+\beta^{2})\cdot\textit{Pr}\cdot\textit{Re}}{(\beta^{2}\cdot\textit{Pr})+\textit{Re}}$ (8)

$\displaystyle\textit{DSC}(M,G)=\frac{2|M\cap G|}{|M|+|G|}$ (9)

where DSC is the Dice Score Coefficient, $M$ is the predicted pixel set and $G$ is the ground truth set in Eq. (9). The learning rate is set by a cosine annealing scheduler, which is initialized to $10^{-4}$ and steadily declines to $10^{-6}$. This scheduler starts with a high value, drops quite swiftly to a value near zero and then rises again to the maximum, as specified in Eq. (10).
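The metrics of Eqs (6) to (9) can be sketched in a few lines. The example below is illustrative only: function names are hypothetical, masks are represented as sets of (row, column) pixel coordinates to mirror the set notation of the Dice score, and the Dice score is computed as the overlap ratio, twice the intersection size divided by the sum of the two set sizes. Returning 0.0 or 1.0 for empty denominators is an assumed convention, not something the text specifies.

```python
# Minimal sketch of the evaluation metrics in Eqs (6)-(9).
# Function names and the handling of empty denominators are assumptions.

def precision(tp, fp):
    """Pr = TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Re = TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_beta(pr, re, beta=1.0):
    """F_beta balances precision and recall; beta = 1 gives the F1 score."""
    denom = beta ** 2 * pr + re
    return (1 + beta ** 2) * pr * re / denom if denom else 0.0


def dice_score(pred, truth):
    """Overlap ratio 2|M ∩ G| / (|M| + |G|) for two sets of pixel coordinates."""
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


if __name__ == "__main__":
    pred = {(0, 0), (0, 1), (1, 1)}    # predicted tumour pixels (toy data)
    truth = {(0, 1), (1, 1), (1, 0)}   # ground-truth tumour pixels (toy data)
    tp = len(pred & truth)
    fp = len(pred - truth)
    fn = len(truth - pred)
    pr, re = precision(tp, fp), recall(tp, fn)
    print(pr, re, f_beta(pr, re), dice_score(pred, truth))
```

For binary masks the Dice score and the F1 score coincide, which is why both evaluate to the same value on the toy masks above.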

$\displaystyle y(t)=\frac{y_{0}}{2}\left(\cos\left(\pi\,\frac{\text{mod}(t-1,\lceil F/C\rceil)}{\lceil F/C\rceil}\right)+1\right)$ (10)

Equation (10) specifies the cosine annealing scheduler, where $y(t)$ represents the learning rate at epoch $t$, $y_{0}$ is the maximum learning rate, $F$ is the total number of epochs and $C$ is the total number of cycles.
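A small sketch of the schedule in Eq. (10) follows. The maximum learning rate of 1e-4 is taken from the text; the total of 30 epochs and 3 cycles are hypothetical values chosen so that each cycle lasts exactly 10 epochs, and the function name is an assumption. No lower floor is applied, since Eq. (10) itself does not contain one.

```python
import math


def cosine_annealing_lr(t, y0=1e-4, total_epochs=30, cycles=3):
    """Learning rate at epoch t (1-based) under cyclic cosine annealing.

    y0 is the maximum learning rate; total_epochs and cycles are assumed
    example values. Each cycle spans ceil(total_epochs / cycles) epochs and
    restarts at y0.
    """
    period = math.ceil(total_epochs / cycles)
    position = (t - 1) % period
    return y0 / 2 * (math.cos(math.pi * position / period) + 1)


if __name__ == "__main__":
    for epoch in (1, 6, 10, 11):
        print(epoch, cosine_annealing_lr(epoch))
```

Within a cycle the rate starts at the maximum, falls to half of it at the midpoint and approaches zero at the end of the cycle, after which it jumps back to the maximum for the next cycle.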

4. Proposed methodology

The EfficientNet family classifies medical images with enhanced accuracy; in the proposed methodology the neural model is optimized with a U-Net network to improve results on larger datasets.

4.1 Hybrid ensembled EfficientNetB3 with U-Net architecture

The EfficientNetB3 architecture is given as a layered description of the model, in which the first column, indicated by $i$, shows the number of the block in the network, and the second column, the function $F_{i}$, gives the name of the convolutional layer of each block and its kernel size. The next column gives the image resolution for the different convolutional layers. The last column gives the total number of layers in every block.

Figure 4.

Flow for hybrid ensembled EfficientNet B3 with U-Net neural model.

The ensemble of EfficientNetB3 with the U-Net architecture builds on a pre-trained, robust and powerful convolutional backbone. Skip connections permit localization at pixel level, as shown in Fig. 4. For segmentation, test-time augmentation is applied to the test set, which is augmented with resizing and normalization. The average prediction is computed over the augmented images, and the output of the segmentation network is an image set of the same size as the input image, in which each pixel carries the probability of belonging to the defined class. To convert these outputs into discrete masks of 0s and 1s for every tumour class, a binary threshold is applied. As given in Eq. (11), the value of a pixel is set according to the threshold value, and it is initially zero.

$\displaystyle\textit{Bin-TH}(x,y)=\begin{cases}\text{max}&\text{if}\ \textit{source}(x,y)>\textit{Threshold}(x,y)\\ 0&\text{otherwise}\end{cases}$ (11)

Here, Bin-TH($x$, $y$) is the output MRI image, and max is the non-zero value assigned to the pixels for which the specified condition is fulfilled. The source in the equation denotes the array of the 8-bit single-channel grayscale input image, and the threshold is determined individually for each pixel. As shown in Fig. 5, the 256 $\times$ 256 input image passes through the convolutional neural network with stride (1, 1) and padding (1, 1, 1, 1); after the EfficientNetB3 encoder and the U-Net decoder, the model classifies the image into multiple classes labelled class 0, class 1 and class 2.
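The averaging of predictions followed by the binary threshold of Eq. (11) can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the implementation used in the experiments: the function names, the threshold of 0.5 and the toy probability maps are hypothetical, and the threshold may be a scalar or a per-pixel array.

```python
import numpy as np


def binary_threshold(source, threshold, max_value=1):
    """Eq. (11): max_value where source > threshold, 0 otherwise.

    threshold may be a scalar or an array with one value per pixel.
    """
    source = np.asarray(source)
    return np.where(source > threshold, max_value, 0).astype(np.uint8)


def average_ensemble_mask(prob_maps, threshold=0.5, max_value=1):
    """Average several probability maps pixel-wise, then binarize the mean.

    prob_maps is a sequence of equally shaped arrays, for example the
    outputs of several segmentation models or of test-time augmentations.
    The default threshold of 0.5 is an assumed example value.
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return binary_threshold(mean_prob, threshold, max_value)


if __name__ == "__main__":
    model_a = np.array([[0.9, 0.2], [0.6, 0.4]])  # toy probability map
    model_b = np.array([[0.7, 0.4], [0.3, 0.8]])  # toy probability map
    print(average_ensemble_mask([model_a, model_b]))
```

Averaging before thresholding lets a confident prediction from one model outweigh a weak one from another, whereas thresholding each map first and then voting would discard that confidence information.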

Figure 5.

Ensembled EfficientNet B3 with U-Net architecture for identification of Brain MRI images for 233 patients.

Figure 6.

Architecture for EfficientNetB3 neural model with mobile inverted convolution blocks with Relu activation function and predefined filter size.

The metrics calculated for the segmentation and detection of brain MRI images are accuracy, Dice score, learning rate, precision and recall. The experiments were carried out in a Python environment with OpenCV 2.0 on Windows 10. In the first step of image classification, image augmentation is performed in order to increase the size of the medical dataset. The operations performed on the dataset include vertical flip, horizontal flip, grid distortion etc.

Training a deep neural model requires a large amount of supervised image data, and the network must be trained specifically on such an image set. The image dataset comprises T1-weighted brain CE-MRI images, labelled into three classes: glioma, pituitary and meningioma tumours. These images are partitioned into training, test and validation sets. The batch size is set to 10 as a trade-off between GPU memory and training speed. Out of 3064 images, the training set uses 2479 images, the test set 339 images and the validation set 246 images. The proposed model is trained with the Adam optimizer (Adaptive Moment Estimation). The initial learning rate is 0.0001, and it is successively reduced by half, i.e., by factors of 0.5, 0.25 and then 0.125. Training runs for 29 epochs, with the mean loss and accuracy monitored on the validation set. The proposed convolutional neural network model was implemented in Python with TensorFlow.
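The partition sizes and the halving of the learning rate described above can be sketched as follows. The split sizes and the initial learning rate come from the text; the random shuffling, the seed and the function names are assumptions, since the text does not state how images were assigned to the three sets.

```python
import random


def split_indices(n_total=3064, n_train=2479, n_test=339, n_val=246, seed=0):
    """Shuffle image indices and cut them into train, test and validation sets.

    The sizes follow the text; random assignment with a fixed seed is an
    assumption made for this illustration.
    """
    if n_train + n_test + n_val != n_total:
        raise ValueError("split sizes must add up to the dataset size")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    val = indices[n_train + n_test:]
    return train, test, val


def halved_learning_rates(initial=1e-4, steps=3):
    """Initial rate followed by successive halvings (factors 0.5, 0.25, 0.125)."""
    return [initial * 0.5 ** k for k in range(steps + 1)]


if __name__ == "__main__":
    train, test, val = split_indices()
    print(len(train), len(test), len(val))
    print(halved_learning_rates())
```

The three sets cover roughly 81%, 11% and 8% of the 3064 images, and after three halvings the learning rate is one eighth of its initial value.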

In Fig. 6, the MBConv blocks are predefined before the model is applied: 3 $\times$ 3 MBConv in block 1, block 2, block 4 and block 7, and 5 $\times$ 5 MBConv in block 3, block 5 and block 6. An MBConv block is an inverted linear bottleneck layer built on depthwise separable convolution, which factorises the 3 $\times$ 3 convolutions in order to condense the layers. In the first phase, a 3 $\times$ 3 filter is applied to each channel of the input; a 1 $\times$ 1 filter is then applied across every channel, and this is repeated for all seven blocks in the model.
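To make the saving from the depthwise-then-pointwise factorisation concrete, the following sketch compares the weight count of a standard convolution with that of a depthwise separable one. The channel sizes are illustrative assumptions, not values taken from EfficientNetB3, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 32, 64  # illustrative sizes, not from the paper
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
print(standard, separable, round(standard / separable, 2))  # 18432 2336 7.89
```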

4.2 Flow architecture for hybrid ensembled EfficientNetB3 with U-Net architecture

The EfficientNet family has diverse variations of MBConv building blocks. Moving up the family of this network, all three scaling parameters, namely depth, width and resolution, increase along with the accuracy. The EfficientNetB3 model achieves better accuracy on the given dataset and is faster than other neural models. The architecture is divided into 7 blocks of MBConv blocks that vary in filter size, stride and number of channels, as shown in Fig. 6.

Figure 7.

Flow architecture for ensembled EfficientNetB3 with U-Net framework for segmentation and detection. The concatenation layer and UpConvolution2D layer are included in the decoder.

As the U-Net architecture is built around a contracting path, and the hyperparameters are fine-tuned with the EfficientNetB3 neural model, EfficientNetB3 acts as the encoder for the contracting path, as shown in Fig. 7. The role of the decoder is to maintain a sequence of UpConv2D (up-convolution) layers along with concatenation layers to produce the segmentation map. It also determines the image resolution, the number of levels and the channels of each feature map in the forward pass. The decoder bilinearly up-samples the MBConv feature map of the last encoder logits by a factor of 2 and concatenates it with the encoder's feature map of matching spatial resolution. This is followed by a repeated process of a 3 $\times$ 3 MBConv layer and up-sampling by a factor of 2, and the reconstruction is repeated until the size of the segmentation map equals that of the input image. The U-Net decoder uses the EfficientNet encoder as its backbone network, whose feature maps have varied spatial sizes. At every block, the U-Net decoder is additionally connected through the U-Net skip connections, together with the increase in spatial size from nearest-neighbour up-sampling. The EfficientNet encoder delivers its output to the corresponding decoder stage, which in turn produces the up-sampled output after every up-sampling operation. The up-sampled feature map is then concatenated with the encoder's output at each defined level for the next decoder block. In summary, the hybrid model is asymmetric, and the contracting path is much deeper than the expansion path. Nevertheless, the overall performance of the algorithm improves, as discussed in the next section.
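One decoder step, up-sampling by a factor of 2 followed by concatenation with the encoder's skip feature map, can be sketched with NumPy arrays. This is a shape-level illustration under stated assumptions: the helpers `upsample2x` and `decoder_step` are hypothetical, nearest-neighbour up-sampling stands in for the interpolation used in the model, the channel counts and the 8 $\times$ 8 starting resolution are made up, and the convolution that normally follows the concatenation is omitted.

```python
import numpy as np

def upsample2x(feature_map):
    """Nearest-neighbour up-sampling of an (H, W, C) array by a factor of 2."""
    return feature_map.repeat(2, axis=0).repeat(2, axis=1)

def decoder_step(decoder_features, skip_features):
    """Up-sample the decoder features and concatenate the encoder skip features."""
    upsampled = upsample2x(decoder_features)
    if upsampled.shape[:2] != skip_features.shape[:2]:
        raise ValueError("spatial sizes must match before concatenation")
    return np.concatenate([upsampled, skip_features], axis=-1)

# Assumed shapes for illustration only.
deep = np.ones((8, 8, 384), dtype=np.float32)     # deepest encoder output
skip = np.zeros((16, 16, 136), dtype=np.float32)  # encoder skip feature map
merged = decoder_step(deep, skip)
print(merged.shape)  # (16, 16, 520)

# Number of 2x up-sampling steps needed to go from 8 x 8 back to 256 x 256.
steps = int(np.log2(256 // 8))
print(steps)  # 5
```

Repeating the step until the spatial size equals the 256 $\times$ 256 input corresponds to the reconstruction loop described in the paragraph above.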

5. Results and discussions

5.1 Results

In the second phase of the classification process, the mean values of accuracy, Dice score and loss on the image dataset are computed. Table 3 shows the values obtained after pre-processing the dataset and splitting it into training and validation sets.

Table 3
Evaluation metrics for training and validation set for EfficientNetB3

Parameters	On training	On validation
Mean Loss	0.3292	0.2859
Mean DICE score	0.7509	0.8019
Mean accuracy	0.9924	0.9935

Table 4

Dice score correlation with threshold for the multi-class classification

Class	Threshold size	Dice
1	0.8	1200
2	0.8	100
3	0.95	100

Figure 8.

Original and transformed image on masking for glioma tumour.

The initial experiments examined the loss as a function of the number of epochs for both the training and validation procedures. The data for training and validation is divided into 80% and 20%, respectively. The graph in Fig. 9a plots the loss, computed by the loss function, against the epoch in order to compare the loss on the training and test data, while Fig. 9b shows the Dice score at each epoch for the SGD (stochastic gradient descent) optimizer with the momentum set to 0.9. The network is trained using SGD with learning-rate decay and a batch size of 16 for the EfficientNet B3-UNet model. The learning rate is initially set to 0.001 and is updated at each iteration within an epoch by the cosine annealing scheduler; the schedule mentioned above is initially set to 1e ${}^{-4}$. The image dataset was initially divided into three parts: training, testing and validation sets. The predefined evaluation metrics and loss function of the proposed model are plotted for analysis in Fig. 9.
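A cosine annealing schedule of the kind mentioned above can be written in a few lines. The sketch below uses the standard cosine annealing formula and is not the training code of the paper: the function name is hypothetical, the range 1e-3 to 1e-5 and the 30 steps are borrowed from Table 7, and the rate is updated once per epoch here for brevity, whereas the text describes per-iteration updates.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Standard cosine annealing from lr_max (step 0) down to lr_min (last step)."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine

total_steps = 30  # assumed: one update per epoch, 30 epochs as in Table 7
schedule = [cosine_annealing_lr(s, total_steps) for s in range(total_steps + 1)]

print(f"start: {schedule[0]:.6f}")
print(f"middle: {schedule[total_steps // 2]:.6f}")
print(f"end: {schedule[-1]:.6f}")
```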

Figure 9.

(a) The first graph in the figure depicts the loss in the image dataset computed between epoch on x-axis and loss on y-axis. (b) The second graph depicts the dice score of entire image dataset. (c) The third graph shows the accuracy of the dataset.

After training on the MRI image dataset, the Dice score of each class is evaluated as a function of the threshold size in order to capture the inter-class relationships of the three classes, where class 1 represents meningioma tumour, class 2 glioma tumour and class 3 pituitary tumour, as shown in Table 4. This score measures the overlapping region between the predicted tumour and the class to which it belongs, based on the estimated threshold size. Hence, the score is considered as a candidate term of the cost function, replacing the class weights, and pixel maps are created to fine-tune the hyperparameters of the cross-entropy.

In the graphs below, the classification models for each tumour class of the proposed methodology are plotted over the continuous output of each class in order to predict class membership using the threshold values. Each threshold value is a point on the curve, ranging over ( $-\infty$ , $+\infty$ ). The main aim is to select the threshold value that keeps the optimal balance between the true positive and true negative rates for the segmentation on the test data, and hence to plot the Dice score at the minimum threshold value for each class, as shown in Fig. 10, where the optimal threshold value ranges from 0.0 to 0.8.
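The relationship between the threshold and the Dice score can be illustrated with a toy probability map. The sketch below is an assumption-laden example, not the evaluation code of the paper: the helpers `dice_score` and `binarise` are hypothetical, the values are made up, and the `min_size` rule (discarding a predicted mask with fewer pixels than a minimum count) is only one possible reading of the minimum size mentioned for Fig. 10.

```python
import numpy as np

def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

def binarise(prob_map, threshold, min_size=0):
    """Threshold a probability map; drop the mask if it has fewer than min_size pixels."""
    mask = np.asarray(prob_map) > threshold
    if mask.sum() < min_size:
        mask = np.zeros_like(mask)
    return mask

# Toy per-pixel probabilities for one class and the matching ground truth.
prob_map = np.array([[0.9, 0.85, 0.2],
                     [0.7, 0.95, 0.1],
                     [0.1, 0.05, 0.0]])
true_mask = np.array([[1, 1, 0],
                      [0, 1, 0],
                      [0, 0, 0]])

for threshold in (0.5, 0.8):
    pred = binarise(prob_map, threshold)
    print(threshold, round(dice_score(pred, true_mask), 3))
```

In this toy case the higher threshold removes the single false-positive pixel, so the Dice score rises from about 0.857 to 1.0; on real data the best threshold has to be searched per class.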

Table 5

Dice coefficient score (DSC) for the multi-class classification for proposed model

Best DICE score	In percentage
Meningioma	95.345
Glioma	83.178
Pituitary tumor	95.199

Figure 10.

(a) Threshold and minimum size on x-axis with Dice on y-axis for class 1: meningioma tumour; (b) class 2: glioma tumour; (c) class 3: pituitary tumour.

As shown in Fig. 10, the T1-weighted CE-MRI brain images were evaluated through the Dice coefficient for each class: class 1 (meningioma tumour) scores 95.345%, class 2 (glioma tumour) scores 83.178% and class 3 (pituitary tumour) scores 95.199%. These scores were computed from the predictions of the fitted proposed model, giving a mean Dice score of 91.24% over all three classes in the dataset after training, as shown in Table 5.
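The reported mean Dice score follows directly from the three per-class values in Table 5, as this short check shows. The dictionary keys are illustrative labels.

```python
# Best per-class Dice scores from Table 5, in percent.
best_dice = {"meningioma": 95.345, "glioma": 83.178, "pituitary": 95.199}

mean_dice = sum(best_dice.values()) / len(best_dice)
print(f"{mean_dice:.2f}%")  # 91.24%
```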

The precision-recall curve is generated by the proposed methodology from the class labels and the predicted probabilities over different thresholds. Classification accuracy on the image dataset is measured via the precision-recall curve, with precision on the x-axis and recall on the y-axis, over the entire range of values between 0 and 1. As shown in Fig. 11a, for the meningioma class the precision is 0.85 at the selected recall threshold. For glioma tumour and pituitary tumour in Fig. 11b and 11c, the precision is 1.0 and the recall is 0.99 at the selected thresholds.
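The points of a precision-recall curve can be computed by sweeping a threshold over the predicted probabilities of one class. The sketch below uses made-up scores and labels, and the helper name and the conventions for empty denominators are assumptions; it only illustrates the definitions precision = TP / (TP + FP) and recall = TP / (TP + FN).

```python
import numpy as np

def precision_recall_points(scores, labels, thresholds):
    """Return (threshold, precision, recall) for each threshold (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    points = []
    for t in thresholds:
        predicted = scores >= t
        tp = np.sum(predicted & labels)
        fp = np.sum(predicted & ~labels)
        fn = np.sum(~predicted & labels)
        # Conventions assumed here: precision is 1.0 when nothing is predicted
        # positive, recall is 0.0 when there are no positive labels.
        precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        points.append((t, float(precision), float(recall)))
    return points

# Toy one-vs-rest scores for a single class (illustrative values only).
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0]
points = precision_recall_points(scores, labels, thresholds=[0.3, 0.5, 0.9])
for t, p, r in points:
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold in this toy example trades recall for precision, which is the balance read off the curves when a per-class operating point is selected.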

The proposed model was used to segment the brain tumour from the classified, labelled images. For 233 patients with 3064 MRI images, the U-Net model ensembled with the EfficientNet B3 model generates the mask corresponding to each type of tumour and hence segments the tumour from the MRI image. In Table 6, the tumours are shown class-wise with their corresponding MRI masks. This section shows the results of the class-wise segmentation of tumours, carried out in Python on the Windows 10 operating system with an Intel Core i3 processor and 8 GB RAM. The images depict the type of tumour through the Dice overlap, and the extracted area is marked separately along with the segmented image.

Table 6

Segmentation of the three classes of tumours shown in the columns

Type of tumour	Segmentation of tumour
Meningioma tumour
Glioma tumour
Pituitary tumour

Table 7

Metrics for fine-tuning the hyperparameters of neural models in the experiment

Models/hyperparameters	Optimizer	LR schedule	Batch size	Epochs	Reference
EfficientNetB3 ensembled U-Net	Adam	1e ${}^{-3}$ to 1e ${}^{-5}$	16	30	–
ResNextU-Net	Adam	1e ${}^{-3}$	16	30	[46]
U-Net	Adam	1e ${}^{-3}$	32	50	[47]

Table 8

Comparative analysis of evaluation parameters for neural models

Model/evaluation parameters	Mean loss on training	Mean loss on validation	Mean dice score	Reference
EfficientNetB3 ensembled U-Net	0.3292	0.2859	0.8019	–
ResNextU-Net	0.270	0.219	0.790	[46]
Multi Class U-Net	0.212	0.198	0.789	[47]

Table 9

Comparative analysis of DSC coefficient for neural models

Classes of tumour	EfficientNet ensembled U-Net	ResNextU-Net	Multi class U-Net
Meningioma tumour	95.345%	87.018%	85.157%
Glioma tumour	83.178%	68.102%	60.520%
Pituitary tumour	95.199%	82.065%	78.536%

5.2 Comparison with other methods

Various experiments have been conducted with different configurations, as shown in Table 7, which lists the training and validation parameters of each experiment. The first model is the proposed methodology, which uses EfficientNet as the backbone network of the segmentation architecture. The second model, the ResNextU-Net neural model, adopts cardinality as a hyperparameter to adjust the neural model. It also uses the split-and-transform strategy of the Inception model on the medical image dataset. The last model, the U-Net architecture, allows pre-trained neural networks to act as a backbone through the encoder, with up-sampling through the decoder. The hyperparameters listed in the table are fine-tuned for all the neural networks considered in the experiments to obtain better accuracy. The parameters considered include the model's optimiser, which uses randomly selected samples in each iteration.

Figure 11.

(a) Precision-Recall curve for class 1 meningioma tumour; (b) Precision-Recall curve for class 2 glioma tumour; (c) Precision-Recall curve for class 3 pituitary tumour.

The learning rate (LR) scheduler reduces the learning rate during training according to a pre-defined schedule, and this hyperparameter is critical for the stability of a neural network. The learning rate determines how much of the loss gradient is applied to the current parameters in order to reduce the loss of the network. The learning rate for the neural model is increased linearly, and the network is trained for the given epochs at each value of the LR schedule, over the range of values shown in Table 7. Effective training of the neural network lies within the range of 0.0001–0.01; beyond this range the validation loss of the network is quite high.

The Dice coefficient of the proposed architecture, i.e., ensembled EfficientNet with U-Net, is compared with that of other competitive neural models and indicates a positive correlation for the segmentation of images in the dataset. It measures the similarity between the ground-truth image and the predicted image on a scale from 0 to 1, as shown in Table 9.

6. Conclusion and future work

This paper proposed the EfficientNetB3 architecture ensembled with a U-Net network, which outperforms the compared neural models on the medical image dataset. The methodology provides a novel method for segmenting tumours along with classifying them into three classes: meningioma tumour, glioma tumour and pituitary tumour. This is done using an encoder-decoder design that relies entirely on the EfficientNet neural network, which serves as the feature encoder of the proposed backbone structure along with the U-Net (i.e., U-shaped) network. The methodology uses compound scaling to efficiently scale the dimensions of the MRI images in the open-source dataset. The model applies augmentation and then computes the metrics over several training runs. The results show a mean accuracy of 99.24% on the training set and 99.35% on the validation set. The mean Dice score over all three classes is 91.24%, which can further be used to evaluate tumour segmentation in medical images. The EfficientNetB3 model scales to state-of-the-art accuracy with fine-tuning of the hyperparameters in the proposed model. This baseline network ensembled with one of the CNN architectures achieves maximum accuracy and reduced mean loss, but it is penalised by a slow inference time, so prediction time increases slightly. As GPU memory usage increases, a class imbalance problem may also arise because of the multi-modal input dimensions of the proposed system. In future work, the proposed system can therefore be extended with neural architectures such as ResNet and GoogLeNet by fine-tuning the parameters and allowing the spatial information to be embedded in the system itself. In the next phase, experiments on improving the positive predictive value during training on the image dataset can be carried out in real scenarios with emerging technologies.
As EfficientNet scales down the image along three dimensions, the dataset used in the proposed framework contains standard images of 3000 slices, and at present no larger dataset is available for implementation. If a larger dataset becomes available in future, the proposed framework can be trained on it to achieve better accuracy.

References

1. Gaikwad. Study on artificial intelligence in healthcare. In: 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS). IEEE; 2021 Mar 19, vol. 1. pp. 1165-1169.

2. Beam, Kohane. Artificial intelligence in healthcare. Nature Biomedical Engineering. 2018 Oct; 2(10): 719-31.

3. Khan, Yairi. A review on the application of deep learning in system health management. Mechanical Systems and Signal Processing. 2018 Jul 1; 107: 241-65.

4. Singh, Chetty, Sharma. A novel machine learning approach for detecting the brain abnormalities from mri structural images. In: IAPR International Conference on Pattern Recognition in Bioinformatics. Berlin, Heidelberg: Springer; 2012 Nov 8. pp. 94-105.

5. Krishnammal, Raja. Convolutional neural network based image classification and detection of abnormalities in mri brain images. In: 2019 International Conference on Communication and Signal Processing (ICCSP). IEEE; 2019 Apr 4. pp. 548-553.

6. Ker, Wang, Rao, Lim. Deep learning applications in medical image analysis. IEEE Access. 6: 9375-9389.

7. Khan, Jue, Mushtaq. Brain tumor classification in MRI image using convolutional neural network. Math Biosci Eng. 2020 Sep 1; 17(5): 6203-16.

8. Esteva, Chou, Yeung, Naik, Madani, Mottaghi, Liu, Topol, Dean, Socher. Deep learning-enabled medical computer vision. NPJ digital medicine. 2021 Jan 8; 4(1): 1-9.

9. Krishna, Bartake, Niu, Wang, Lai, Jia, Mueller. Image Synthesis for Data Augmentation in Medical CT using Deep Reinforcement Learning. arXiv preprint arXiv:2103.10493. 2021 Mar 18.

10. Kora Venu, Ravula. Evaluation of deep convolutional generative adversarial networks for data augmentation of chest x-ray images. Future Internet. 2020 Dec 31; 13(1): 8.

11. Barrowclough, Muntingh, Nainamalai, Stangeby. Binary segmentation of medical images using implicit spline representations and deep learning. Computer Aided Geometric Design. 2021 Feb 1; 85: 101972.

12. Zhang, Goodfellow, Metaxas, Odena. Self-attention generative adversarial networks. In: International Conference on Machine Learning. PMLR; 2019 May 24. pp. 7354-7363.

13. Yun, Han, Chun, Choe, Yoo. Cutmix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. pp. 6023-6032.

14. Jia, Zuo, Shen. Deep neural network ensemble for the intelligent fault diagnosis of machines under imbalanced data. IEEE Access. 2020 Jul 3; 8: 120974-82.

15. Rai, Chatterjee. Detection of brain abnormality by a novel Lu-Net deep neural CNN model from MR images. Machine Learning with Applications. 2020 Dec 15; 2: 100004.

16. Krishnammal, Raja.

17. Bahrami, Karimian, Fatemizadeh, Arabi, Zaidi. A new deep convolutional neural network design with efficient learning capability: Application to CT image synthesis from MRI. Medical physics. 2020 Oct; 47(10): 5158-71.

18. Sharif, Khan, Saleem. Active deep neural network features selection for segmentation and recognition of brain tumors using MRI images. Pattern Recognition Letters. 2020 Jan 1; 129: 181-9.

19. Nayak, Dash, Majhi. Automated diagnosis of multi-class brain abnormalities using MRI images: a deep convolutional neural network based method. Pattern Recognition Letters. 2020 Oct 1; 138: 385-91.

20. Dubey, Bhatt. Implementation of Autoencoder for Super Resolution of 3D MRI Imaging using Convolution Neural Network. In: 2022 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence). IEEE; 2022 Jan 27. pp. 398-404.

21. Siva Raja. Brain tumor classification using a hybrid deep autoencoder with Bayesian fuzzy clustering-based segmentation approach. Biocybernetics and Biomedical Engineering. 2020 Jan 1; 40(1): 440-53.

22. Kermi, Mahmoudi, Khadir. Deep convolutional neural networks using U-Net for automatic brain tumor segmentation in multimodal MRI volumes. In: International MICCAI Brainlesion Workshop. Springer, Cham; 2018 Sep 16. pp. 37-48.

23. Yaniv, Portnoy, Talmon, Kiryati, Konen, Mayer. V-net light-parameter-efficient 3-d convolutional neural network for prostate mri segmentation. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE; 2020 Apr 3. pp. 442-445.

24. Ronneberger, Fischer, Brox. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, Cham; 2015 Oct 5. pp. 234-241.

25. Milletari, Navab, Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV). IEEE; 2016 Oct 25. pp. 565-571.

26. Toğaçar, Ergen, Cömert. BrainMRNet: Brain tumor detection using magnetic resonance images with a novel convolutional neural network model. Medical hypotheses. 2020 Jan 1; 134: 109531.

27. Romera, Alvarez, Bergasa, Arroyo. Erfnet: Efficient residual factorized convnet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems. 2017 Oct 9; 19(1): 263-72.

28. Raghavendra, Harshavardhan, Neelakandan, Partheepan, Walia, Rao. Multilayer stacked probabilistic belief network-based brain tumor segmentation and classification. International Journal of Foundations of Computer Science. 2022 Sep 31; 33(06n07): 559-82.

29. Masoumi, Keshavarz, Fotohi. File fragment recognition based on content and statistical features. Multimedia Tools and Applications. 2021 May; 80(12): 18859-74.

30. www.kaggle.com. Available from: https://www.kaggle.com/datasets/awsaf49/brain-tumor.

31. Tan. Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. PMLR; 2019 May 24. pp. 6105-6114.

32. Marques, Agarwal, de la Torre Díez. Automated medical diagnosis of COVID-19 through EfficientNet convolutional neural network. Applied soft computing. 2020 Nov 1; 96: 106691.

33. Koonce. EfficientNet. In: Convolutional Neural Networks with Swift for Tensorflow. Berkeley, CA: Apress; 2021. pp. 109-123.

34. Chaudhary, Mehta, Sharma, Gupta, Khanna, Rodrigues. Efficient-CovidNet: deep learning based COVID-19 detection from chest x-ray images. In: 2020 IEEE International Conference on E-health Networking, Application & Services (HEALTHCOM). IEEE; 2021 Mar 1. pp. 1-6.

35. Nayak, Padhy, Mallick, Zymbler, Kumar. Brain Tumor Classification Using Dense Efficient-Net. Axioms. 2022 Jan 17; 11(1): 34.

36. Colaco, Han. Deep Learning-Based Facial Landmarks Localization Using Compound Scaling. IEEE Access. 2022 Jan 11; 10: 7653-63.

37. Guang, Liang. Cmsea: Compound model scaling with efficient attention for fine-grained image classification. IEEE Access. 2022 Feb 9; 10: 18222-32.

38. Shah, Saeed, Yun, Park, Paul, Kang. A Robust Approach for Brain Tumor Detection in Magnetic Resonance Images Using Finetuned EfficientNet. IEEE Access. 2022 Jun 17; 10: 65426-38.

39. Marques, Ferreras, de la Torre-Diez. An ensemble-based approach for automated medical diagnosis of malaria using EfficientNet. Multimedia Tools and Applications. 2022 Mar 29; 1-8.

40. Monis, Sarkar, Nagavarun, Bhadra. Efficient Net: Identification of Crop Insects Using Convolutional Neural Networks. In: 2022 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI). IEEE; 2022 Jan 28. pp. 1-7.

41. Coccomini, Messina, Gennaro, Falchi. Combining efficientnet and vision transformers for video deepfake detection. In: International Conference on Image Analysis and Processing. Springer, Cham; 2022. pp. 219-229.

42. Siddique, Paheding, Elkin, Devabhaktuni. U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access. 2021 Jun 3; 9: 82031-57.

43. Cui, Chang, Jiang, Xia, Zhang. Multiscale attention guided U-Net architecture for cardiac segmentation in short-axis MRI images. Computer Methods and Programs in Biomedicine. 2021 Jul 1; 206: 106142.

44. Dong, Yang, Liu, Guo. Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks. In: Annual Conference on Medical Image Understanding and Analysis. Springer, Cham; 2017 Jul 11. pp. 506-517.

45. Nicholson, Wooster, Sigurslid, Jiang, Tian, Cardenas, Malhotra. Estimating risk of mechanical ventilation and in-hospital mortality among adult COVID-19 patients admitted to Mass General Brigham: The VICE and DICE scores. EClinicalMedicine. 2021 Mar 1; 33: 100765.

46. Alrashedy, Almansour, Ibrahim, Hammoudeh. BrainGAN: Brain MRI Image Generation and Classification Framework Using GAN Architectures and CNN Models. Sensors. 2022 Jun 6; 22(11): 4297.

47. Wang, Chen, Han. Copy-move image forgery detection based on evolving circular domains coverage. Multimedia Tools and Applications. 2022 Apr 22: 1-26.

48. Available from: https://www.kaggle.com/code/bonhart/brain-tumor-multi-class-segmentation-baseline.

49. Available from: https://www.kaggle.com/code/banggiangle/multi-class-unet.
