Breast cancer classification application based on QGA-SVM

Abstract

Early diagnosis of breast cancer plays an important role in improving the survival rate. Physiological changes in breast tissue can be observed and measured through medical electrical impedance, and the results can serve as a preliminary diagnosis for doctors before treatment. In this paper, a quantum genetic algorithm (QGA) and a support vector machine (SVM) are combined to classify breast tissue and assist clinicians in diagnosis. QGA is used to optimize the SVM parameters and thereby improve the classification performance of the SVM. The electrical impedance data measured from breast tissue and provided by UCI [58] are used as the data set. The data set is small and not strongly representative. Nevertheless, the experimental results show that QGA-SVM achieves better classification performance than SVM alone.

Keywords

Quantum genetic algorithm; Support vector machines; Breast cancer; Medical electrical impedance

1 Introduction

According to statistics from the past decade, the incidence of breast cancer (BC) ranks first in the world. For breast cancer patients, early detection, early diagnosis and treatment, and comprehensive adjuvant therapy after surgery are the keys to prolonging survival [1, 2]. Mammography is the most common way to screen for and detect breast cancer, but its sensitivity in detecting the risk of cancer development decreases with increasing breast density, mainly because denser tissue tends to mask lesions. In addition to mammography, magnetic resonance imaging (MRI) can also be performed. Fine needle aspiration biopsy (FNAB) is another common technique for the investigation and diagnosis of breast cancer. Although these methods have advanced lesion screening, they are painful and uncomfortable for patients, and mammography can even damage breast tissue [3–5]. In contrast, interdisciplinary computer-aided diagnosis makes diagnosis more efficient and more acceptable. In 2000, Silva et al. proposed using electrical impedance spectroscopy (EIS) to classify breast tissue with linear discriminant analysis, and designed a three-stage hierarchical approach. The maximum overall classification accuracy was 92%, and the cancer tissue identification rate was greater than 86%. Narumol et al. [6] proposed a breast tissue classification algorithm based on Bootstrap Aggregation, which samples the data with random replacement and builds a classifier on the randomized data, so that the amount of data stays the same after randomization rather than being reduced. They performed multiple cross-validation on the samples and obtained an accuracy of 74.47%. In 2019, Yoke et al. [7] proposed a new perspective, using EIS to classify wounds in breast tissue. Breast wounds can be observed and distinguished by EIS.
Under different conditions, wounds have different patterns and levels. The experiment used learning vector quantization (LVQ) to classify breast tissue wounds, and a genetic algorithm (GA) was used to optimize the LVQ weight values in order to obtain better results. The experiment compared the classification of breast tissue wounds by LVQ and GA-LVQ. The maximum classification accuracy of GA-LVQ was 73%, which is better than LVQ. Toukir et al. [8] used five ensemble-based machine learning (ML) algorithms, namely Random Forest (RF), Extreme Random Tree (ERT), Decision Tree (DT), Gradient Boosting Tree (GBT), and Adaptive Boosting (ADB), to classify breast tissue. The results show that the three bagging ensemble algorithms, namely RF, ERT and DT, have better classification accuracy than the two boosting algorithms GBT and ADB, with a maximum classification accuracy of 86%. Pranav et al. [9] analyzed the EIS dataset using four different ML algorithms: SVM, DT, RF, and modified random forests (MRF). MRF achieved a mean accuracy of 99%, and a test size of 15% worked best. Most computer-aided diagnosis systems traditionally use manual feature extraction. Because this approach is inefficient and time-consuming, DM Vo et al. [10] proposed a method that extracts the most useful visual features from multi-scale training images for breast cancer classification with an ensemble of trained DCNNs. To maximize the classification performance, they combined the DCNN with a boosting tree classifier. The Bioimaging 2015 breast histology classification challenge dataset and the BreakHis dataset were used to test the effectiveness of the method. The results show that these deep learning models extract better features than handcrafted feature extraction methods, with classification accuracy up to 96%.
Deniz et al. [11] used transfer learning and deep feature extraction so that pre-trained CNN models could help classify breast cancer. They performed feature extraction on the BreakHis dataset, followed by transfer learning. Features were extracted with pre-trained neural network structures for classification. The pre-trained network was a modified AlexNet in which the last three layers were removed and new layers were added. Finally, classification was performed with an SVM. The results show that transfer learning produces better results than deep feature extraction with SVM, with a classification accuracy of 93.57%.

Most of the above experiments used the Breast Tissue sample set of the UCI database, which includes 106 breast tissue samples. Because the data set is small and does not meet the data requirements of some algorithms [12–14], some algorithms are not highly sensitive to the characteristics of breast tissue, resulting in low classification accuracy [15–18]. With a data set this small, a simple linear model can outperform a deep network model. In order to analyze the pathological state from the measured data better, faster and more accurately [19–22], this paper proposes QGA-SVM. QGA combines the advantages of quantum computing with GA: it introduces quantum coding and the quantum rotation gate, increases the possibility of genome change, and produces more diverse offspring than traditional GA. Moreover, QGA has a higher convergence speed and stronger search ability, and its optimization performance has been demonstrated in several fields. The radial basis function is used as the SVM kernel function, which reduces computation and saves storage space. In this paper, the algorithm is applied to the classification of breast tissue, and its convergence and accuracy are tested on EIS data.

2 Related work

When breast tissue forms a tumor or develops cancer, it releases a vascular factor that stimulates the tumor to produce numerous nutrient blood vessels, often spreading at the tumor margins or inserting into surrounding tumors [23–25]. As a result, the tumor and its surrounding tissue are rich in new blood vessels, which speeds up blood flow and increases blood supply. Because blood has a characteristic impedance in the human body, the impedance of the tumor and surrounding tissues changes significantly [26–28].

Medical electrical impedance technology is a non-invasive detection technique that extracts biomedical information related to a person's pathological condition from the electrical characteristics of biological tissues and organs and the way they change. It usually applies a small AC measurement signal through an electrode system placed on the body surface, and calculates the electrical impedance and its changes from the measured signal to obtain the relevant physiological and pathological information [29–31].

The basic structural unit of the human body is the cell. A cell is surrounded by a membrane, a semi-permeable membrane with a special structure and function, called a cell membrane [32 –34]. It allows the selective passage of certain substances, but strictly maintains the stability of the cellular material composition. It separates the cell contents from the surrounding environment of the cell, and enables the cell to selectively exchange substances with the surrounding environment through the cell membrane to maintain life activities [35 –38]. The cell membrane is not only a barrier between the cell and its environment, but also a gateway for the cell to receive influence from the outside world and other cells. The cell membrane is also closely related to the physiological and pathological processes of immune function and cell division, differentiation and canceration [39 –41].

Due to the presence of the cell membrane, the liquids inside and outside the cell can be regarded as electrolytes, and the membrane separating them can be regarded as a capacitance. A single cell is equivalent to the circuit model shown in Fig. 1, where R_e is the resistance of the extracellular fluid and C_e its parallel capacitance; R_m is the resistance of the cell membrane and C_m its parallel capacitance; R_i is the resistance of the intracellular fluid and C_i its parallel capacitance. In the low-frequency range below 1 MHz, the cell membrane resistance R_m is very large and can be regarded as an open circuit, while the parallel capacitances C_i and C_e of the inner and outer liquids are small and can also be regarded as open circuits. This gives the simplified equivalent circuit model shown in Fig. 2, also called the parallel equivalent circuit model. Biological tissue as a whole can be assumed to consist of many cells, so its circuit model is also equivalent to the circuit shown in Fig. 2, with R_i, R_e and C_m now representing the components of the entire tissue. This is the so-called three-element bio-impedance model, with internal and external liquid resistances and a membrane capacitance [42, 43].

Fig. 1

Circuit model.

Fig. 2

Simplified model.
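The frequency dependence implied by the three-element model can be illustrated with a short numerical sketch. It assumes the common topology in which R_e lies in parallel with the series branch of R_i and C_m; this topology is an assumption, because Fig. 2 is not reproduced here, and the component values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def tissue_impedance(freq_hz, r_e, r_i, c_m):
    """Complex impedance of R_e in parallel with (R_i in series with C_m)."""
    omega = 2.0 * math.pi * freq_hz
    # Intracellular branch: resistance R_i in series with membrane capacitance C_m.
    z_intra = r_i + 1.0 / (1j * omega * c_m)
    # The extracellular resistance R_e is in parallel with that branch.
    return (r_e * z_intra) / (r_e + z_intra)


# Hypothetical component values (ohm, ohm, farad), for illustration only.
R_E, R_I, C_M = 1000.0, 500.0, 1e-8
for f_khz in (15.625, 31.25, 62.5, 125, 250, 500, 1000):
    z = tissue_impedance(f_khz * 1e3, R_E, R_I, C_M)
    print(f"{f_khz:8.3f} kHz  |Z| = {abs(z):8.2f} ohm  "
          f"phase = {math.degrees(cmath.phase(z)):7.2f} deg")
```

At very low frequency the capacitor blocks the intracellular branch and the impedance tends to R_e; at very high frequency it tends to the parallel combination of R_e and R_i.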

Medical electrical impedance technology can extract electrical impedance characteristics related to functional changes of tissues and organs at the cell level, so as to identify physiological and pathological events at that level and provide early disease reports or predictions before structural changes occur in biological tissues [44–46].

QGA is an evolutionary algorithm that uses quantum logic gates for chromosome evolution, combining quantum computing with the traditional genetic algorithm. It aims to overcome the slow convergence and the tendency to fall into local extrema that improper selection, crossover and mutation cause in the classical genetic algorithm. Based on the quantum state vector representation, the algorithm applies the probability amplitudes of quantum bits to the encoding of chromosomes, so that a single chromosome can represent a superposition of multiple states while the solution is optimized [47, 48]. Compared with the traditional genetic algorithm, QGA has good population diversity, strong global search ability and fast convergence [49–51].

Quantum coding and the quantum rotation gate are the two most important components of QGA [52, 53]. Quantum coding represents chromosomes as quantum state vectors. It allows one chromosome to express a superposition of multiple states, which increases the diversity of the population and enables the algorithm to find the optimum with a smaller population. The quantum gates update the population and enable fast convergence of the algorithm.

QGA uses quantum coding, which rests on two concepts: the qubit and the quantum superposition state. In quantum computing, the qubit is the smallest unit of information storage [54, 55]. A superposition of the single-qubit states |0〉 and |1〉 is used to represent genetic information. A qubit can be in the |0〉 state, which represents spin up, in the |1〉 state, which represents spin down, or in any superposition of |0〉 and |1〉. The state of a qubit can be described as follows: $|ρ〉 = λ|0〉 + μ|1〉$ (1) where λ and μ are the probability amplitudes of the quantum state, a pair of complex numbers satisfying |λ|² + |μ|² = 1. When the population is measured during optimization, |ρ〉 collapses to the |0〉 state with probability |λ|² or to the |1〉 state with probability |μ|². The collapsed states are then evaluated with the fitness function to guide the optimization. A system of m qubits can thus describe 2^m states simultaneously, although on every observation each qubit collapses into a single definite state. A chromosome of m qubits can be expressed as:

$Γ_{m} = [\begin{matrix} τ_{1} & τ_{2} & \dots & τ_{i} & \dots & τ_{m} \end{matrix}] = [\begin{matrix} λ_{1} & λ_{2} & \dots & λ_{i} & \dots & λ_{m} \\ μ_{1} & μ_{2} & \dots & μ_{i} & \dots & μ_{m} \end{matrix}]$ (2)

A population of n chromosomes, each described by m qubits, can be expressed as: $Ω_{n} = [\begin{matrix} Γ_{m}^{1} \\ Γ_{m}^{2} \\ ⋮ \\ Γ_{m}^{i} \\ ⋮ \\ Γ_{m}^{n} \end{matrix}] = [\begin{matrix} τ_{1}^{1} & τ_{2}^{1} & \dots & τ_{m}^{1} \\ τ_{1}^{2} & τ_{2}^{2} & \dots & τ_{m}^{2} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ τ_{1}^{i} & τ_{2}^{i} & \dots & τ_{m}^{i} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ τ_{1}^{n} & τ_{2}^{n} & \dots & τ_{m}^{n} \end{matrix}]$ (3)
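The encoding of Eqs. (1) to (3) and the collapse on measurement can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes real amplitudes and an initial equal superposition, and the function names `init_population` and `measure` are hypothetical.

```python
import math
import random


def init_population(n, m):
    """n chromosomes of m qubits, each in the equal superposition (1/sqrt(2), 1/sqrt(2))."""
    amp = 1.0 / math.sqrt(2.0)
    return [[(amp, amp) for _ in range(m)] for _ in range(n)]


def measure(chromosome, rng):
    """Collapse each qubit: bit 0 with probability |lambda|^2, otherwise bit 1."""
    return [0 if rng.random() < abs(lam) ** 2 else 1 for lam, _mu in chromosome]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = init_population(n=4, m=8)
for chromosome in population:
    print(measure(chromosome, rng))
```

Each call to `measure` yields one definite binary string, even though the unmeasured chromosome describes all 2^m strings at once.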

In the quantum genetic algorithm, qubits are changed through quantum gates. The introduction of quantum gates not only supports exploitation and exploration by the algorithm, but also drives its convergence. The quantum gate is the operation mechanism that carries out the evolution. Commonly used quantum gates include the NOT gate, the controlled NOT gate, the rotation gate and the Hadamard gate [56, 57]. Different quantum gates can be selected for different problems.

The quantum rotation gate is a 2 × 2 invertible matrix $U (θ_{i}) = [\begin{matrix} \cos (θ_{i}) & - \sin (θ_{i}) \\ \sin (θ_{i}) & \cos (θ_{i}) \end{matrix}]$ . Quantum gates generally update qubits through the following matrix transformation: $[\begin{matrix} α_{i}^{'} \\ β_{i}^{'} \end{matrix}] = U (θ_{i}) [\begin{matrix} α_{i} \\ β_{i} \end{matrix}] = [\begin{matrix} \cos (θ_{i}) & - \sin (θ_{i}) \\ \sin (θ_{i}) & \cos (θ_{i}) \end{matrix}] [\begin{matrix} α_{i} \\ β_{i} \end{matrix}]$ (4)

Here (α_i, β_i) and (α′_i, β′_i) are the probability amplitudes of the i-th qubit of the chromosome (written λ_i and μ_i in Eqs. (1) to (3)) before and after the rotation gate update, and θ_i is the rotation angle, whose magnitude determines the convergence speed.
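The update of Eq. (4) can be checked numerically with a few lines of code. The sketch below assumes real amplitudes and uses a hypothetical rotation angle; it shows that the rotation preserves the normalization α² + β² = 1 while shifting probability between the two basis states.

```python
import math


def rotate(alpha, beta, theta):
    """Apply the rotation gate U(theta) of Eq. (4) to one qubit (alpha, beta)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * alpha - s * beta, s * alpha + c * beta


alpha, beta = 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)
theta = 0.05 * math.pi  # hypothetical rotation angle
for step in range(3):
    alpha, beta = rotate(alpha, beta, theta)
    print(step, round(alpha ** 2, 4), round(beta ** 2, 4),
          round(alpha ** 2 + beta ** 2, 4))
```

A larger |θ| moves the amplitudes faster, which is why the rotation angle governs how quickly the population converges.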

3 Method

The support vector machine is a supervised learning method for binary classification in the field of machine learning [59, 60]. Its core idea is to transform a linearly inseparable problem into a linearly separable one by mapping the data nonlinearly into a higher-dimensional space. In practical applications, its recognition accuracy depends to a large extent on the choice of parameters. The parameters (C, σ) determine the performance of the SVM. The penalty parameter C is the tolerance to errors, and the kernel parameter σ affects the complexity of the distribution of the sample data in the feature subspace. If σ is chosen poorly, over-fitting or under-fitting will occur [61]. These two parameters affect the speed of training and prediction and, in turn, the classification performance of the classifier. Therefore, this algorithm uses the quantum genetic algorithm for parameter optimization, which improves the classification performance. The algorithm uses the kernel function to carry out the nonlinear mapping from the low-dimensional to the high-dimensional space, without needing the explicit expression of the mapping. This classification method therefore does not increase the amount of computation and avoids the complex calculations that would otherwise grow with the dimension.

The multi-parameter machine learning model selected in this paper is the nonlinear SVM, whose optimization problem is: $\begin{matrix} \min_{W, e} \frac{1}{2} ∥ W ∥^{2} + \frac{C}{2} \sum_{i = 1}^{m} e_{i}^{2} \\ \text{s.t.} \; y_{i} (W \cdot φ (x_{i}) + b) \geq 1 - e_{i}, \; i = 1, \dots, m \\ e_{i} \geq 0, \; i = 1, \dots, m \end{matrix}$ (5)

Using the Lagrange multiplier method, the optimization problem is transformed into its dual problem: $\begin{matrix} \min_{α} \frac{1}{2} \sum_{i = 1}^{m} \sum_{j = 1}^{m} α_{i} α_{j} y_{i} y_{j} K (x_{i}, x_{j}) - \sum_{i = 1}^{m} α_{i} \\ \text{s.t.} \; \sum_{i = 1}^{m} α_{i} y_{i} = 0 \\ 0 \leq α_{i} \leq C, \; i = 1, \dots, m \end{matrix}$ (6) where $K (x_{i}, x_{j}) = \exp (- \frac{{∥ x_{i} - x_{j} ∥}^{2}}{2 σ^{2}})$ is the radial basis function kernel.
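To make the role of σ concrete, the radial basis function kernel above can be evaluated directly. The following sketch uses hypothetical two-dimensional feature vectors and plain Python; the helper names `rbf_kernel` and `gram_matrix` are illustrative and do not come from the paper.

```python
import math


def rbf_kernel(x_i, x_j, sigma):
    """K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(samples, sigma):
    """Symmetric kernel matrix with entries K(x_i, x_j)."""
    return [[rbf_kernel(a, b, sigma) for b in samples] for a in samples]


# Hypothetical 2-D feature vectors, for illustration only.
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
for sigma in (0.5, 2.0):
    rows = gram_matrix(samples, sigma)
    print(sigma, [[round(v, 4) for v in row] for row in rows])
```

A small σ makes the kernel decay quickly, so only very close samples look similar (risk of over-fitting), whereas a large σ makes all samples look alike (risk of under-fitting). Libraries that parameterize the kernel as exp(-γ‖x_i - x_j‖²) correspond to γ = 1/(2σ²).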

QGA-SVM improves the classification performance of SVM through its parameters: QGA first searches for the optimal parameters, and SVM then classifies the data with them. This combination helps the search avoid local optima and improves the classification accuracy.

The specific process of the algorithm is as follows:

Step 1: Input the electrical impedance spectral characteristic data.

Step 2: Set the basic parameters of the algorithm, including the population size N, the maximum number of iterations T, the parameters C and σ to be optimized together with their value intervals [C_min, C_max] and [σ_min, σ_max], and the quantum rotation angle.

Step 3: Initialize the quantum form of the population, and set the quantum population as $Q (t) = \{q_{1}^{t}, q_{2}^{t}, \dots, q_{N}^{t}\}$ , where t is the current generation and $q_{i}^{t}$ is a chromosome of the population, defined as $q_{i}^{t} = [\begin{matrix} α_{Ci}^{t} & α_{σ i}^{t} \\ β_{Ci}^{t} & β_{σ i}^{t} \end{matrix}]$ . $(α_{Ci}^{t}, β_{Ci}^{t})$ and $(α_{σ i}^{t}, β_{σ i}^{t})$ are the qubits of the parameters C and σ to be optimized, respectively. Initialize the quantum angle sequence and convert it into the list of quantum probability amplitudes of the population.

Step 4: Measure each individual in the initial population. The binary representation of each chromosome is converted to decimal and scaled into the intervals [C_min, C_max] and [σ_min, σ_max]. The fitness function is $Fitness = \frac{S_{succ}}{S} \times 100 \%$ , where S_succ is the number of correctly classified samples and S is the total number of samples. The fitness is averaged over cross-validation to find the best fitness value and the corresponding binary representation of the parameters.

Step 5: Determine whether the termination conditions are met. If so, terminate. Otherwise, proceed to the next step.

Step 6: Measure the individuals in population Q (t) to generate the binary solution set, and evaluate the fitness of each determined solution.

Step 7: Adjust the individuals using full-interference crossover and the quantum rotation gate. First, the list of fitness values after crossover is obtained; then the rotation angle of each qubit is initialized and calculated. From the rotation angle of each qubit, a new quantum angle list of the population is generated to obtain the new population Q (t + 1). The corresponding binary solution set is obtained by measuring each individual in Q (t + 1). Then, based on the fitness evaluation of each determined solution, the optimal individual and its fitness value are recorded.

Step 8: Increase the iteration counter by 1 and return to Step 5.

Fig. 3

Algorithm flow chart of QGA-SVM.

The algorithm flow chart of QGA-SVM is shown in Fig. 3.
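The search loop described above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the authors' implementation: qubits are stored as angles, the rotation rule simply nudges each qubit towards the best solution found so far, full-interference crossover is omitted, the only termination condition is the iteration budget, and `toy_fitness` is a hypothetical stand-in for the cross-validated SVM accuracy.

```python
import math
import random


def measure(angles, rng):
    """Collapse every qubit: bit 1 with probability sin^2(angle), else bit 0."""
    return [1 if rng.random() < math.sin(a) ** 2 else 0 for a in angles]


def decode(bits, lo, hi):
    """Convert a bit list to decimal and scale it into the interval [lo, hi]."""
    value = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def toy_fitness(c, sigma):
    """Hypothetical stand-in for the cross-validated SVM accuracy (percent)."""
    return 100.0 * math.exp(-(math.log10(c) - 1.0) ** 2 - math.log10(sigma) ** 2)


def qga_search(fitness, c_range, sigma_range, n=10, t_max=50, bits=10,
               delta=0.02 * math.pi, seed=0):
    rng = random.Random(seed)
    # Each chromosome holds `bits` qubit angles for C followed by `bits` for sigma;
    # the angle pi/4 gives equal probability to bit 0 and bit 1.
    population = [[math.pi / 4] * (2 * bits) for _ in range(n)]
    best_bits, best_fit = None, -math.inf
    for _ in range(t_max):  # termination: fixed iteration budget only
        solutions = [measure(angles, rng) for angles in population]
        for sol in solutions:
            fit = fitness(decode(sol[:bits], *c_range),
                          decode(sol[bits:], *sigma_range))
            if fit > best_fit:
                best_bits, best_fit = sol, fit
        # Simplified rotation rule: nudge each qubit towards the best bit so far.
        for angles, sol in zip(population, solutions):
            for k, (bit, target) in enumerate(zip(sol, best_bits)):
                if bit != target:
                    step = delta if target == 1 else -delta
                    angles[k] = min(max(angles[k] + step, 0.01),
                                    math.pi / 2 - 0.01)
    return (decode(best_bits[:bits], *c_range),
            decode(best_bits[bits:], *sigma_range), best_fit)


c, sigma, fit = qga_search(toy_fitness, (0.1, 100.0), (0.1, 10.0))
print(f"C = {c:.3f}, sigma = {sigma:.3f}, fitness = {fit:.2f}")
```

In an actual QGA-SVM run, `toy_fitness` would be replaced by a function that trains an RBF-kernel SVM with the decoded (C, σ) and returns its cross-validated accuracy; the intervals and population settings shown here are placeholders.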

4 Experiments and Results

Whatever the number of samples used for modeling and prediction, the test samples in this experiment are selected manually and at random in equal proportion from the six categories, which makes the experimental results more generally applicable. The maximum number of iterations is set to 50, and the average of the 10 best accuracy rates in each case is reported as the final classification result.

4.1 Dataset

The experiment used the Breast Tissue sample set from the UCI database [58]. The data set recorded 120 spectra from breast tissue samples of 64 patients undergoing breast surgery, and each spectrum included 12 impedance measurements at different frequencies from 488 Hz to 1 MHz. 14 spectra were discarded due to abnormalities. Impedance measurements of freshly excised breast tissue were made at the following frequencies: 15.625, 31.25, 62.5, 125, 250, 500 and 1000 kHz. These measurements, plotted in the complex impedance plane, constitute the impedance spectrum from which the breast tissue features are computed. Among the remaining 106 cases, the normal tissue category included 14 connective tissue, 22 adipose tissue and 16 glandular tissue samples, and the pathological tissue category included 21 carcinoma, 15 fibroadenoma and 18 breast disease samples. The dataset has a total of 9 attribute features, as shown in Table 1.

Table 1
Attribute features

Feature	Description
I0	Impedivity (ohm) at zero frequency
PA500	phase angle at 500 kHz
HFS	high-frequency slope of phase angle
DA	impedance distance between spectral ends
AREA	area under spectrum
A/DA	area normalized by DA
MAXIP	maximum of the spectrum
DR	distance between I0 and real part of the maximum frequency point
P	length of the spectral curve

4.2 The influence of the number of samples on the classification accuracy of the algorithm

In the experiment, the number of training samples was set to 20, 40, 60, 80 and 100, and the remaining samples were taken as the test sample set. The three algorithms were respectively run to compare the effects.
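One way to draw training sets of these sizes while keeping the six classes in roughly equal proportion is a stratified split. The sketch below is an assumption about how such a split could be automated (the paper states that samples were chosen manually and at random); the class sizes are those listed in Section 4.1, and `stratified_split` is a hypothetical helper.

```python
import random

# Class sizes of the 106-sample data set as listed in Section 4.1.
CLASS_COUNTS = {"carcinoma": 21, "fibroadenoma": 15, "breast disease": 18,
                "glandular": 16, "connective": 14, "adipose": 22}


def stratified_split(labels, n_train, rng):
    """Return (train, test) index lists with n_train training samples in total."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    total = len(labels)
    quota = {c: n_train * len(ix) / total for c, ix in by_class.items()}
    take = {c: int(q) for c, q in quota.items()}
    # Give the remaining slots to the classes with the largest fractional parts.
    leftover = n_train - sum(take.values())
    ranked = sorted(quota, key=lambda c: quota[c] - take[c], reverse=True)
    for c in ranked[:leftover]:
        take[c] += 1
    train = []
    for c, ix in by_class.items():
        rng.shuffle(ix)
        train.extend(ix[:take[c]])
    train_set = set(train)
    test = [i for i in range(total) if i not in train_set]
    return sorted(train), test


labels = [c for c, k in CLASS_COUNTS.items() for _ in range(k)]
for n_train in (20, 40, 60, 80, 100):
    train, test = stratified_split(labels, n_train, random.Random(0))
    print(n_train, len(train), len(test))
```

The per-class counts produced by this rounding rule need not match the counts chosen by hand in the experiments.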

QGA-SVM, PCA-SVM and SVM were used to classify breast cancer tissues, breast diseases and fibroadenomas. The accuracy results obtained from the experiment are shown in Fig. 4, Fig. 5 and Fig. 6 respectively. The performance comparison of the three algorithms in terms of running time, time complexity, bias trade-off, parameterization, etc. is shown in Table 2.

Fig. 4

Comparison of breast cancer classification results.

Fig. 5

Comparison of classification results of breast diseases.

Fig. 6

Comparison of classification results of fibroadenoma.

Table 2

Performance comparison table of three classification algorithms

Algorithm	Running time	Time complexity	Bias tradeoff	Parametric model
SVM	Nonparametric model
PCA-SVM	Nonparametric model
QGA-SVM	Nonparametric model

The line graphs in Fig. 4, Fig. 5 and Fig. 6 show the trend of the accuracy of QGA-SVM, PCA-SVM and SVM for the classification of sample data with the change of the number of samples. From the figures the following conclusions can be drawn:

The classification accuracy of the three algorithms increases significantly with the increase of the number of training samples, which indicates that the more training samples we have, the higher the accuracy of the learning model. If the number of samples is large enough, the classification algorithm can predict the class with very high accuracy.

Overall, QGA-SVM has faster convergence speed and higher classification accuracy than the other two algorithms even when the number of training samples is small.

Although the figures show that QGA-SVM improves the accuracy of data classification, it also increases the running time. When the number of samples is fixed, the effect is small. However, when the number of samples is large enough, QGA-SVM takes noticeably longer than the algorithms with lower classification accuracy. An algorithm should therefore be chosen according to the requirements of the application.

Table 2 compares the performance of three classification algorithms in four aspects:

Although QGA-SVM has excellent performance in classification accuracy, its running time and time complexity are not superior among the three algorithms. SVM is outstanding in these two aspects.

The bias error comes from the fact that the model is biased towards a specific solution or hypothesis, and QGA-SVM does not show significant bias in the classification process.

All three algorithms are nonparametric models. A parametric model has a fixed number of parameters, whereas the number of parameters of a nonparametric model grows with the amount of data.

4.3 Comparing the performance of the three algorithms with a fixed training sample of 80

Eighty samples from the dataset were selected as the training set, and the remaining 26 samples were used as the test set, comprising 5 breast cancer, 5 adipose, 5 breast disease, 4 fibroadenoma, 4 glandular and 3 connective tissue samples. The classification results of the three algorithms on breast tissue are shown in Table 3.

Table 3
Comparison of classification accuracy of LDA, QGA-SVM and SVM for breast tissue

Tissue type	LDA	QGA-SVM	SVM
Breast cancer	81.82%	100%	96.86%
Fibroadenoma	66.67%	79.82%	71.55%
Breast disease	16.67%	67.34%	59.88%
Glandular tissue	54.54%	74.42%	73.71%
Connective tissue	85.71%	76.98%	78.99%
Adipose tissue	90.91%	99.21%	95.62%

Table 3 shows that the accuracy of QGA-SVM for breast cancer classification reaches 100%, and its accuracy for adipose tissue exceeds 99%. Table 3 also shows that:

Compared with LDA and SVM algorithms, QGA-SVM improved the overall classification accuracy, but the classification accuracy of other tissues except breast cancer and adipose tissue was still less than 90%.

When the number of training samples was 80, QGA-SVM was used to classify connective tissue, and the recognition accuracy was lower than LDA and SVM algorithms. Because connective tissue is a normal tissue without any disease, although the classification accuracy is lower than LDA and SVM, it does not affect the overall accuracy of QGA-SVM in classifying diseased breast tissue.

The experimental results show that, with a fixed training set, LDA and SVM achieve good accuracy, but QGA-SVM achieves the highest accuracy for every tissue class except connective tissue.
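As a quick arithmetic check of the overall comparison, the per-class accuracies of Table 3 can be averaged with equal weight per class. The unweighted mean computed below is only an illustrative summary derived from the table; it is not a figure reported in the paper, and it ignores the different number of test samples per class.

```python
# Per-class accuracies (percent) copied from Table 3, in the table's row order.
TABLE_3 = {
    "LDA":     [81.82, 66.67, 16.67, 54.54, 85.71, 90.91],
    "QGA-SVM": [100.0, 79.82, 67.34, 74.42, 76.98, 99.21],
    "SVM":     [96.86, 71.55, 59.88, 73.71, 78.99, 95.62],
}


def macro_average(values):
    """Unweighted mean over the six tissue classes."""
    return sum(values) / len(values)


for name, accuracies in TABLE_3.items():
    print(f"{name:8s} {macro_average(accuracies):6.2f}")
```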

4.4 Different training sets and test sets are set, and different classification methods are used to test QGA-SVM

4.4.1 A total of 106 groups of breast tissues were used, all of which were used as training sets and test sets.

Separate the three pathological tissues. In the process of differentiation, for example, when breast cancer is distinguished, breast cancer is regarded as a class, and fibroadenoma and breast disease are regarded as a class. The same procedure is used to distinguish fibroadenoma from breast disease. The classification effect is shown in Table 4.

Table 4
The effect of distinguishing three kinds of pathological tissues separately

Tissue type	Classification accuracy
Breast cancer	100%
Fibroadenoma	91.53%
Breast disease	94.25%
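The one-against-the-rest grouping used for these experiments (one tissue as a class, the other two as the second class) amounts to a simple relabelling step before training a binary SVM. The sketch below uses hypothetical sample values and helper names; it only illustrates the relabelling, not the classifier.

```python
def restrict(samples, labels, keep):
    """Keep only the samples whose label is in `keep`, e.g. the pathological tissues."""
    pairs = [(x, y) for x, y in zip(samples, labels) if y in keep]
    return [x for x, _ in pairs], [y for _, y in pairs]


def one_vs_rest(labels, positive):
    """Return +1 for the tissue of interest and -1 for every other tissue."""
    return [1 if label == positive else -1 for label in labels]


# Hypothetical feature values and labels, for illustration only.
samples = [0.2, 0.9, 0.4, 0.7, 0.1]
labels = ["carcinoma", "adipose", "fibroadenoma", "breast disease", "carcinoma"]
pathological = {"carcinoma", "fibroadenoma", "breast disease"}

x, y = restrict(samples, labels, pathological)
print(x, one_vs_rest(y, "carcinoma"))
```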

Separately distinguish three normal tissues. In the process of differentiation, for example, when distinguishing connective tissue, connective tissue is regarded as a class, and adipose tissue and gland tissue are regarded as a class. The same division is performed when distinguishing between adipose tissue and glandular tissue. The classification effect is shown in Table 5.

Table 5

The effect of distinguishing three kinds of normal tissues separately

Tissue type	Classification accuracy
Connective tissue	93.95%
Adipose tissue	100%
Glandular tissue	94.32%

As can be seen from the tables, because the training set and the test set are identical, the classification accuracy for both pathological and normal tissues is higher than 90%. This kind of evaluation is essentially only a sanity check of the discriminative method; the classification performance is truly demonstrated only when the training and test sets differ. Nevertheless, to a certain extent it reflects the strong classification performance of the algorithm.

4.4.2 53 sets of samples are selected as the training set, and the remaining samples are used as the test set.

53 samples were selected as the training set, including 10 cancer tissue, 8 fibroadenoma, 9 breast disease, 7 connective tissue, 11 adipose tissue and 8 glandular tissue samples, and the remaining 53 samples were used as the test set. When the training set and the test set differ, the tissues can still be classified, and some regular patterns can still be observed.

The three pathological tissues were separately distinguished, and the classification effects were shown in Table 6.

Table 6
The effect of distinguishing three kinds of pathological tissues separately

Tissue type	Classification accuracy
Breast cancer	98.84%
Fibroadenoma	70.99%
Breast disease	71.65%

Three normal tissues were separately distinguished, and the classification effects were shown in Table 7.

Table 7

The effect of distinguishing three kinds of normal tissues separately

Tissue type	Classification accuracy
Connective tissue	69.55%
Adipose tissue	95.34%
Glandular tissue	75.96%

4.4.3 Taking 106 groups of data samples and 53 groups of data samples as training sets, respectively, the algorithm is used to classify cancer tissue, glandular tissue and adipose tissue, that is, to classify pathological tissue from normal tissue.

In the process of differentiation, for example, when cancer tissue is distinguished, cancer tissue is regarded as one class and glandular tissue and adipose tissue as the other. The same is done to distinguish glandular tissue and adipose tissue.

The classification results with all 106 samples as the training set are shown in Table 8.

Table 8
Classification effects with 106 samples as the training set

Tissue type	Classification accuracy
Breast cancer	100%
Adipose tissue	100%
Glandular tissue	95.17%

The classification results with 53 samples as the training set are shown in Table 9.

Table 9

Classification effects with 53 samples as the training set

Tissue type	Classification accuracy
Breast cancer	99.27%
Adipose tissue	97.18%
Glandular tissue	78.37%

Across the different training sets and classification schemes above, whether the pathological and normal tissues are classified separately or the pathological tissue is identified among normal tissues, the classification accuracy decreases to varying degrees when only part of the samples is used for training. Overall, the classifier used here shows high classification performance: it accurately identifies cancer tissue both among normal tissues and among other pathological tissues, which makes it useful for analyzing such data. In future work, we will focus on expanding the data set and training the algorithm better to make it more robust.

4.5 Confusion matrix

From the repeated experiments, one run with typical performance was selected, and its confusion matrix is shown in Fig. 7. Although the overall result of this run is only moderate, its classification of cancerous tissue is still excellent compared with other algorithms.

Fig. 7

Confusion matrix.
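For readers who want to reproduce this kind of figure, the following is a minimal sketch of how a confusion matrix and the per-class recognition rate can be computed from true and predicted labels. The label lists are hypothetical and do not correspond to the values in Fig. 7; the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Hypothetical labels for seven samples (illustration only).
classes = ["cancer", "glandular", "adipose"]
y_true = ["cancer", "cancer", "glandular", "glandular", "glandular", "adipose", "adipose"]
y_pred = ["cancer", "cancer", "glandular", "adipose", "glandular", "adipose", "adipose"]

matrix = confusion_matrix(y_true, y_pred, classes)
recalls = per_class_recall(matrix)

for name, row, recall in zip(classes, matrix, recalls):
    print(f"{name:10s} {row} recall={recall:.2f}")
```

In this toy example every cancer sample lies on the diagonal, while one glandular sample is assigned to adipose tissue. A matrix of this shape is what "excellent at classifying cancerous tissue" in an otherwise moderate run would look like.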

4.6 Comparison with existing work

We compared recent studies on the use of EIS in the diagnosis of breast cancer; their advantages and disadvantages are summarized in Table 10. These studies are still devoted to the classification of breast tissue. As research has progressed, the performance of the related methods has continuously improved, and EIS has been described and organized more comprehensively.

Table 10
Comparison of related literature

Year of publication	Method used	Problems solved	Dataset used	Results	Advantages and Disadvantages
2016 [51]	Extraction of mean, standard deviation, entropy, kurtosis and skewness from X-ray images using SVM and PNN.	Interpret texture changes on fat and dense mammograms to aid the radiologist in diagnosis.	MIAS dataset (including 106 fat images and 216 dense images)	SVM achieves the highest classification accuracy of 94.4%.	Advantages: 1. Eliminates the preprocessing step; 2. Enhances the texture properties of the underlying tissue. Disadvantage: Some of the acquired Laws texture features may be redundant.
2019 [6]	Bootstrap aggregation samples the data, uses the random replacement method to create a classifier, and uses 10-fold cross validation.	Classifying breast tissue.	Breast Tissue dataset provided by UCI	74.47%	Advantages: Data remains the same after randomization, rather than decreasing. Disadvantage: The result has a 25.53% error.
2019 [7]	GA was combined with LVQ to classify breast wounds.	Help identify pathological tissue.	Breast Tissue dataset provided by UCI	73%	Advantages: Has the ability to find the closest distance during the learning process. Disadvantage: The weight value is not optimal.
2020 [8]	Multiple machine learning-based breast cancer risk stratification classifiers were compared using EIS.	Classifying breast tissue.	Breast Tissue dataset provided by UCI	>90%	Advantages: Effective feature extraction on datasets. Disadvantage: Results are not significantly improved.
2020 [52]	RF, ER, DT, GBT and ADB were used to classify breast tissues.	Comparing the classification performance of five machine learning algorithms.	Breast Tissue dataset provided by UCI	RF achieves the highest classification accuracy of 86%	Advantages: 1. Has high precision and stability; 2. Allows mapping to nonlinear relationships. Disadvantage: Requires manual tuning of important parameters.
2020 [53]	Integrated classification mechanism based on majority voting mechanism.	Evaluate classifier performance to help diagnose breast cancer.	Wisconsin Breast Cancer Dataset (WBCD)	99.42%	Advantages: Outperforms state-of-the-art voting mechanisms. Disadvantage: Only binary classification is done.
2021 [9]	Breast tissues were classified using SVM, DT, RF and MRF.	Help diagnose breast cancer.	Breast Tissue dataset provided by UCI	MRF achieves 99% accuracy.	Advantages: The impact of multiple machine learning algorithms on breast cancer classification was studied. Disadvantage: Only binary classification is done.
Ours	QGA was used for parameter optimization, and SVM was used for breast tissue classification.	Classifying breast tissue helps doctors make a diagnosis.	Breast Tissue dataset provided by UCI	>99%	Advantages: 1. Multi-class classification; 2. The results are better than those of many algorithms. Disadvantage: Only a single dataset is used.

5 Conclusion

QGA-SVM is obtained by combining QGA, an efficient and fast global optimization algorithm, with SVM. The algorithm was applied to feature data of glandular tissue, connective tissue, adipose tissue, breast disease, fibroadenoma and carcinoma obtained by electrical impedance measurement. The experimental results show that the improved classification algorithm classifies breast tissue with high accuracy, which helps to effectively distinguish normal breast tissue from malignant tumor tissue. Compared with the traditional classification algorithm, QGA-SVM greatly improves classification performance, especially in the classification of breast cancer and adipose tissue. The accuracy is more than 99%, and satisfactory optimization results are obtained. At the same time, the increase in calculation time is within an acceptable range. Across the different training sets, the algorithm performs best when the training set and the verification set are the same, but the case where they differ is more representative and realistic. Even in that case, the algorithm still shows excellent classification accuracy.

References

1. Shawarib M.Z.A., Latif A.E.A., Al-Zatmah B.E.E.D., Abu-Naser S.S. Breast Cancer Diagnosis and Survival Prediction Using JNN, International Journal of Engineering and Information Systems (IJEAIS) 4(10) (2020).

2. Tsochatzidis, Costaridou, Pratikakis. Deep learning for breast cancer diagnosis from mammograms: a comparative study, Journal of Imaging 5(3) (2019) 37.

3. Zou, Yu, Meng, Zhang, Liang, Xie. A technical review of convolutional neural network-based mammographic breast cancer diagnosis, Computational and Mathematical Methods in Medicine (2019).

4. Wang, Zheng, Yoon S.W., Ko H.S. A support vector machine-based ensemble algorithm for breast cancer diagnosis, European Journal of Operational Research 267(2) (2018) 687–699.

5. Dhahri, Al Maghayreh, Mahmood, Faisal Nagi. Automated breast cancer diagnosis based on machine learning algorithms, Journal of Healthcare Engineering (2019).

6. Chumuang, Pramkeaw, Farooq. Electrical impedance of breast tissue classification by using boot strap aggregating. In 2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS) (2019, November), (pp. 551–556). IEEE.

7. Arbawa Y.K., Pisefty R.A.D., Bachtiar F.A. Wound Classifications Of Breast Tissues with Electrical Impedance Spectroscopy (EIS): Comparison of LVQ and GALVQ. In 2019 5th International Conference on Science in Information Technology (ICSITech) (2019, October), (pp. 229–234). IEEE.

8. Ahmed M.T., Masud M.R., Al Mamun. Comparisons Among Multiple Machine Learning Based Classifiers for Breast Cancer Risk Stratification Using Electrical Impedance Spectroscopy, European Journal of Electrical Engineering and Computer Science 4(4) (2020).

9. Verma, Ramasamy, Selvam D.D.D.P. Classification of breast cancer from electrical impedance measurements dataset in samples of freshly excised breast tissues, Platform: A Journal of Science and Technology 4(1) (2021) 107–116.

10. D.M., Nguyen N.Q., Lee S.W. Classification of breast cancer histology images using incremental boosting convolution networks, Information Sciences 482 (2019) 123–138.

11. Deniz, Sengur, Kadiroğlu, Guo, Bajaj, Budak. Transfer learning based histopathologic image classification for breast cancer detection, Health Information Science and Systems 6(1) (2018) 1–7.

12. Chakradar, Aggarwal, Cheng, Rani, Kumar, Shankar. A non-invasive approach to identify insulin resistance with triglycerides and HDL-c ratio using machine learning, Neural Processing Letters (2021) 1–21.

13. Shrestha, Dhasarathan, Kumar, Nidhya, Shankar, Kumar. Deep Learning Based Convolution Neural Network-DCNN Approach to Detect Brain Tumor. In Proceedings of Academia-Industry Consortium for Data Science (2022), (pp. 115–127). Springer, Singapore.

14. Chakradar, Aggarwal, Cheng, Rani, Kumar, Shankar. A non-invasive approach to identify insulin resistance with triglycerides and HDL-c ratio using machine learning, Neural Processing Letters (2021) 1–21.

15. Dutt, Ahuja N.J., Kumar. An intelligent tutoring system architecture based on fuzzy neural network (FNN) for special education of learning disabled learners, Education and Information Technologies 27(2) (2022) 2613–2633.

16. Raheja, Kasturia, Cheng, Kumar. Machine learning-based diffusion model for prediction of coronavirus-19 outbreak, Neural Computing and Applications (2021) 1–20.

17. Kumar, Alshehri, AlGhamdi, Sharma, Deep. A DE-ANN inspired skin cancer detection approach using fuzzy c-means clustering, Mobile Networks and Applications 25(4) (2020) 1319–1329.

18. Rani, Jain, Kumar. Identification of copy-move and splicing based forgeries using advanced SURF and revised template matching, Multimedia Tools and Applications 80(16) (2021) 23877–23898.

19. Dhasarathan, Kumar, Srivastava A.K., Al-Turjman, Shankar, Kumar. A bio-inspired privacy-preserving framework for healthcare systems, The Journal of Supercomputing 77(10) (2021) 11099–11134.

20. Yuvalı, Kavalcıoğlu, Kaba ŞS., Işın. Fuzzy Ordination of Breast Tissue with Electrical Impedance Spectroscopy Measurements. In International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions (2019, August), (pp. 151–157). Springer, Cham.

21. Calvo P.C., Campo, Guerra, Castano, Fonthal. Design of using chamber system based on electrical impedance spectroscopy (EIS) to measure epithelial tissue, Sensing and Bio-Sensing Research 29 (2020) 100357.

22. Singh, Singh A.K. Role of image thermography in early breast cancer detection-Past, present and future, Computer Methods and Programs in Biomedicine 183 (2020) 105074.

23. Zandi, Gilani, Abbasvandi, Katebi, Tafti S.R., Assadi, Abdolahad. Carbon nanotube based dielectric spectroscopy of tumor secretion; electrochemical lipidomics for cancer diagnosis, Biosensors and Bioelectronics 142 (2019) 111566.

24. Rubfiaro A.S., Tsegay P.S., Lai, Cabello, Shaver, Hutcheson, He. Scanning ion conductance microscopy study reveals the disruption of the integrity of the human cell membrane structure by oxidative DNA damage, ACS Applied Bio Materials 4(2) (2021) 1632–1639.

25. Xiao, Song, Bu, Pang, Zhou, Zhang, Xie. The investigation of detection and sensing mechanism of spicy substance based on human TRPV1 channel protein-cell membrane biosensor, Biosensors and Bioelectronics 172 (2021) 112779.

26. Canella, Martini, Cavicchio, Cervellati, Benedusi, Valacchi. Involvement of the TREK-1 channel in human alveolar cell membrane potential and its regulation by inhibitors of the chloride current, Journal of Cellular Physiology 234(10) (2019) 17704–17713.

27. Mohsen, Said L.A., Elwakil A.S., Madian A.H., Radwan A.G. Extracting optimized bio-impedance model parameters using different topologies of oscillators, IEEE Sensors Journal 20(17) (2020) 9947–9954.

28. Tang, Lu, Xie, Yin. A Novel Efficient FEM Thin shell model for bio-impedance analysis, Biosensors 10(6) (2020) 69.

29. Freeborn T.J., Fu. Fatigue-induced Cole electrical impedance model changes of biceps tissue bioimpedance, Fractal and Fractional 2(4) (2018) 27.

30. Yousri, AbdelAty A.M., Said L.A., Elwakil A.S., Maundy, Radwan A.G. Chaotic flower pollination and grey wolf algorithms for parameter extraction of bioimpedance models, Applied Soft Computing 75 (2019) 750–774.

31. Freeborn T.J. Residual impedance effect on emulated bioimpedance measurements using Keysight EA precision impedance analyzer, Measurement 134 (2019) 468–479.

32. Esco M.R., Nickerson B.S., Fedewa M.V., Moon J.R., Snarr R.L. A novel method of utilizing skinfolds and bioimpedance for determining body fat percentage via a field-based three-compartment model, European Journal of Clinical Nutrition 72(10) (2018) 1431–1438.

33. Al-Ali A.A., Elwakil A.S., Maundy B.J., Freeborn T.J. Extraction of phase information from magnitude-only bio-impedance measurements using a modified Kramers–Kronig transform, Circuits, Systems, and Signal Processing 37(8) (2018) 3635–3650.

34. Ghita, Neckebroek, Juchem, Copot, Muresan C.I., Muresan C.M. Bioimpedance sensor and methodology for acute pain monitoring, Sensors 20(23) (2020) 6765.

35. Lyu, Hu, Zhou, Wang. Application of improved MCKD method based on QGA in planetary gear compound fault diagnosis, Measurement 139 (2019) 236–248.

36. Zhou, Zhou, Xia, Hong W.C. Construction of EMD-SVR-QGA Model for Electricity Consumption: Case of University Dormitory, Mathematics 7(12) (2019) 1188.

37. Shao Liangshan, Zhou Yu. Application of QGA-RFR model in prediction of height of water flowing fractured zone, China Safety Science Journal 28(3) (2018) 19.

38. Zhang, Wu T.Y., Wang, Xiong, Ding, Mei, Liu. Application of quantum genetic optimization of LVQ neural network in smart city traffic network prediction, IEEE Access 8 (2020) 104555–104564.

39. Man, Li, Di, Mu. Application of quantum genetic algorithm in high noise laser image security, Optoelectronics Letters 18(1) (2022) 59–64.

40. Hua, Chen, Pei, Zhang, Zhou. Quantum image encryption algorithm based on image correlation decomposition, International Journal of Theoretical Physics 54(2) (2015) 526–537.

41. Wang, Su, Luo, Nian, Teng. Color image encryption algorithm based on hyperchaotic system and improved quantum revolving gate, Multimedia Tools and Applications (2022) 1–21.

42. Wang, Hu, Zhang, Wu. A prediction method of soybean moisture content in the process of soy sauce brewing production using quantum revolving gate of quantum evolution algorithm back propagation. In IOP Conference Series: Materials Science and Engineering (2018, July), (Vol. 382, No. 3, p. 032001). IOP Publishing.

43. Dong, Zhang. An improved hybrid quantum optimization algorithm for solving nonlinear equations, Quantum Information Processing 20(4) (2021) 1–22.

44. Guo, Wei, Xu. A sonar image segmentation algorithm based on quantum-inspired particle swarm optimization and fuzzy clustering, Neural Computing and Applications 32(22) (2020) 16775–16782.

45. Chauhan V.K., Dahiya, Sharma. Problem formulations and solvers in linear SVM: a review, Artificial Intelligence Review 52 (2019) 803–855.

46. Zhu, Gan. Robust SVM with adaptive graph learning, World Wide Web 23(3) (2020) 1945–1968.

47. Vijayarajeswari, Parthasarathy, Vivekanandan, Basha A.A. Classification of mammogram for early detection of breast cancer using SVM classifier and Hough transform, Measurement 146 (2019) 800–805.

48. Shi, Zhang. Fault diagnosis of an autonomous vehicle with an improved SVM algorithm subject to unbalanced datasets, IEEE Transactions on Industrial Electronics 68(7) (2020) 6248–6256.

49. Singh, Parmar K.S., Makkhan S.J.S., Kaur, Peshoria, Kumar. Study of ARIMA and least square support vector machine (LS-SVM) models for the prediction of SARS-CoV-2 confirmed cases in the most affected countries, Chaos, Solitons & Fractals 139 (2020) 110086.

50. Huang, Zheng, Ma, Wang, Huang, Leng, Guo. Quantitative contribution of climate change and human activities to vegetation cover variations based on GA-SVM model, Journal of Hydrology 584 (2020) 124687.

51. Virmani, Dey, Kumar. PCA-PNN and PCA-SVM based CAD systems for breast density classification. In Applications of Intelligent Optimization in Biology and Medicine (2016), (pp. 159–180). Springer, Cham.

52. Rahman S.M., Ali M.A., Altwijri, Alqahtani, Ahmed, Ahamed N.U. Ensemble-Based Machine Learning Algorithms for Classifying Breast Tissue Based on Electrical Impedance Spectroscopy. In International Conference on Applied Human Factors and Ergonomics (2019, July), (pp. 260–266). Springer, Cham.

53. Assiri A.S., Nazir, Velastin S.A. Breast tumor classification using an ensemble machine learning method, Journal of Imaging 6(6) (2020) 39.

54. Kumar, Aggarwal, Rani, Stephan, Shankar, Mirjalili. Secure video communication using firefly optimization and visual cryptography, Artificial Intelligence Review 55(4) (2022) 2997–3017.

55. Shrestha, Dhasarathan, Kumar, Nidhya, Shankar, Kumar. A Deep Learning Based Convolution Neural Network-DCNN Approach to Detect Brain Tumor. In Proceedings of Academia-Industry Consortium for Data Science (2022), (pp. 115–127). Springer, Singapore.

56. Madhu, Govardhan, Ravi, Kautish, Srinivas B.S., Chaudhary, Kumar. DSCN-net: a deep Siamese capsule neural network model for automatic diagnosis of malaria parasites detection, Multimedia Tools and Applications (2022) 1–23.

57. Cinarer, Kilic. Diabetic Retinopathy Detection with Deep Transfer Learning Methods. In International Conference on Intelligent and Fuzzy Systems (2021, August), (pp. 147–154). Springer, Cham.

58. Jock Blackard. UCI Repository of machine learning databases. http://archive.ics.uci.edu/ml/datasets/ILPD, Accessed October 2015.

59. Bhushan, Alshehri, Agarwal, Keshta, Rajpurohit, Abugabah. A novel approach to face pattern analysis, Electronics 11(3), 10.3390/electronics11030444. https://www.mdpi.com/2079-9292/11/3/444

60. Singh A.K., Kumar, Bhushan, Kumar, Vashishtha. A proportional sentiment analysis of MOOCs course reviews using supervised learning algorithms, Ingenierie des Systemes d’Information 26(5) (2021) 501–506. https://doi.org/10.18280/isi.260510

61. Bhushan, Alshehri, Keshta, Chakraverti A.K., Rajpurohit, Abugabah. An experimental analysis of various machine learning algorithms for hand gesture recognition, Electronics 11(6), 10.3390/electronics11060968. https://www.mdpi.com/2079-9292/11/6/968
