Abstract
Background:
Early detection of cancerous tumors is a critical factor in improving treatment outcomes. To address this need, this study explores a simple, effective, and cost-efficient method for early cancer detection by measuring the bioimpedance of living tissues. Bioimpedance-based methods hold significant promise for the early detection of cancerous tumors.
Materials and Methods:
The study begins by simulating the impedance behavior of the human breast under two conditions: healthy and containing cancerous tumors. The Cole–Cole model is used to simulate the dielectric properties of both breast and tumor tissues using finite element modeling. In the measurement phase, eight electrodes are evenly distributed around the breast model to ensure comprehensive data collection. Subsequently, a dataset is prepared encompassing three breast sizes (60, 70, and 80 mm) in both the healthy and tumor-afflicted states, with tumor sizes of 5, 8, and 10 mm radius. This dataset is utilized to develop machine learning models, including support vector machines (SVM), convolutional neural networks (CNN), and random forest (RF), for breast cancer detection.
Results:
The results of this study demonstrate the practicality of integrating machine learning techniques with multielectrode bioimpedance measurements to achieve precise and automated breast cancer detection. Notably, the RF model outperformed both SVM and CNN in terms of cancer detection accuracy.
Conclusions:
This study underscores the potential of bioimpedance-based methods, coupled with machine learning algorithms, for early cancer detection. The findings suggest that RF models hold promise for accurate and automated breast cancer detection, offering a valuable tool for improving patient outcomes.
Introduction
Cancer is one of the leading causes of death globally. Worldwide, an estimated 19.3 million new cancer cases (18.1 million excluding nonmelanoma skin cancer) and almost 10.0 million cancer deaths (9.9 million excluding nonmelanoma skin cancer) occurred in 2020. Female breast cancer has surpassed lung cancer as the most commonly diagnosed cancer, with an estimated 2.3 million new cases (11.7%), followed by lung (11.4%), colorectal (10.0%), prostate (7.3%), and stomach (5.6%) cancers. 1
Early detection plays a pivotal role in managing breast cancer. Early detection involves identifying breast cancer before symptoms appear, and it encompasses methods such as breast self-exams, clinical breast exams, and mammograms. Studies, including one by Allemani et al., 2 have highlighted a strong correlation between early detection and higher survival rates. For instance, localized breast cancer cases exhibit a five-year relative survival rate of 92%, underscoring the importance of detecting the disease at an early stage.
Detecting breast cancer early empowers healthcare providers to consider less aggressive treatment options. Patients may benefit from treatments such as lumpectomy (partial removal of breast tissue) instead of mastectomy (complete breast removal) or hormone therapy instead of chemotherapy. Cold et al.’s 3 research discusses the potential for less extensive surgeries and milder treatment regimens when breast cancer is diagnosed early. Early detection allows for the preservation of breast tissue, which can lead to improved cosmetic outcomes and a better quality of life for survivors. Research by Waljee et al. 4 emphasizes the significance of considering patients’ quality of life in breast cancer diagnosis and treatment decisions.
Late-stage breast cancer carries a substantial economic burden because of higher treatment costs and reduced productivity. Early detection can significantly reduce this economic impact. The study by Yabroff et al. 5 provides valuable insights into the economic consequences of late-stage breast cancer diagnosis, highlighting the potential for significant economic benefits associated with early detection.
Improving the early detection of breast cancer is pivotal to raising survival rates, and screening has traditionally relied on mammography as the primary method. Mammography is a specialized radiographic method for examining breast tissue that uses a dedicated X-ray apparatus designed exclusively for breast imaging. Its primary application lies in the screening, detection, and diagnosis of breast cancer. During a mammogram, the breast is gently pressed between two plates to achieve uniform tissue thickness, which facilitates the acquisition of precise X-ray images capable of identifying anomalies such as tumors. 6
Nevertheless, mammography has substantial drawbacks, including reduced sensitivity in dense breast tissue, elevated false-positive rates, and concerns about radiation exposure. Recent years have seen significant advances in breast cancer detection methods aimed specifically at mitigating these limitations. 7
Breast magnetic resonance imaging (MRI) uses radiofrequency pulses to stimulate hydrogen nuclei contained within water molecules in the body. These stimulated nuclei emit signals, which are captured by a receiver coil positioned around the breast. Subsequently, a computer processes these signals to produce cross-sectional images of the breast. Typically, an intravenous injection of a contrast agent, such as gadolinium, is administered to heighten the signal distinctions between normal and abnormal tissues. 8 Breast MRI is more sensitive than mammography, especially in dense breast tissue, but lower specificity leads to more false positives and higher costs. 9
During ultrasound screening, a portable transducer is applied to the breast and moved systematically to capture images from various angles. The transducer emits sound waves that interact with breast tissue and are then captured by the same instrument. These reflected waves are transformed into electrical signals, which a computer interprets to produce detailed breast tissue images. 10 Ultrasound screening markedly increased the detection of cancers among high-risk women; however, it also yielded a substantial number of false-positive results. 11
Molecular breast imaging (MBI) capitalizes on the higher metabolic activity of breast cancer cells compared with normal tissue. It does so by using 99m Tc-sestamibi, a radiotracer that is absorbed by the mitochondria inside cells. Subsequently, gamma rays emitted by the radiotracer are captured by a gamma camera system, generating visual representations of the breast tissue. 12 MBI shows promise as a supplementary screening technology for women with dense breasts, yet it is worth noting that MBI involves a higher radiation dose compared with mammography and exhibits lower specificity. 13
Bioimpedance for cancer detection
Bioimpedance, also known as bioelectrical impedance, characterizes the electrical properties of biological tissues, specifically their ability to impede the flow of electric current. It is a measure of how biological tissues resist the passage of electrical signals and can be quantified by analyzing the response to an electrical excitation, typically by calculating the ratio of voltage to current. 14
Biological tissues consist of living cells organized in a three-dimensional structure. The cellular membrane takes on a bilayer lipid structure, rendering it highly capacitive. These membranes also contain channels that selectively permit the passage of molecules. Simultaneously, the intracellular and extracellular media in living cells contain numerous ions, which confer conductive properties to these tissues. Consequently, the impedance of biological tissues encompasses both capacitance and resistance. Variations in bioimpedance depend on several factors, including tissue composition, anatomical characteristics, and the frequency of the applied electrical signal (excitation).15,16
In the context of measuring the impedance of living tissues, two prominent methods are used: the two-electrode and four-electrode methods. The former utilizes two electrodes for both current injection and voltage measurement, whereas the latter involves two pairs of electrodes, with one pair for current injection and the other for voltage measurement. Impedance is calculated by dividing the voltage by the current.
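As a brief illustration of the voltage-to-current ratio, and of why tissue impedance has both resistive and capacitive parts, the sketch below evaluates a simple three-element equivalent circuit: an extracellular resistance in parallel with an intracellular resistance in series with a membrane capacitance. The circuit topology, the component values, and the function names are assumptions chosen for illustration; they are not taken from this study.

```python
import cmath
import math

def tissue_impedance(freq_hz, r_extra=1000.0, r_intra=500.0, c_mem=10e-9):
    """Complex impedance (ohms) of R_e in parallel with (R_i in series with C_m).

    All component values are hypothetical placeholders.
    """
    omega = 2.0 * math.pi * freq_hz
    z_cell = r_intra + 1.0 / (1j * omega * c_mem)   # intracellular branch
    return (r_extra * z_cell) / (r_extra + z_cell)  # parallel combination

def impedance_from_phasors(voltage, current):
    """Impedance as the ratio of complex voltage to complex current."""
    return voltage / current

if __name__ == "__main__":
    current = 1e-3 + 0j  # 1 mA excitation, taken as the phase reference
    for f in (1e2, 20e3, 1e6, 100e6):
        voltage = tissue_impedance(f) * current  # voltage that would be measured
        z = impedance_from_phasors(voltage, current)
        print(f"{f:>12.0f} Hz  |Z| = {abs(z):8.2f} ohm  "
              f"phase = {math.degrees(cmath.phase(z)):7.2f} deg")
```

In this assumed circuit the magnitude falls from the extracellular resistance at low frequency toward the parallel combination of the two resistances at high frequency, which is the qualitative behavior that makes the excitation frequency a relevant measurement parameter.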
Studies have shown that alterations in the electrical properties of biological tissues, such as changes in conductivity and permittivity, lead to shifts in tissue impedance. These changes can result from variations in tissue composition, health status, or the frequency of electrical excitation. Researchers have harnessed these properties in various applications. 17
In the case of tongue cancer, studies conducted at discrete frequencies, such as 20 and 50 kHz, have uncovered discernible disparities in impedance between patients afflicted by tongue cancer and their healthy counterparts, shedding light on the diagnostic potential of bioimpedance in this context. 18
Similarly, when addressing liver cancer detection, impedance measurements within the frequency range of 10–200 kHz have emerged as a powerful diagnostic tool. These measurements have revealed notable differences in phase between healthy liver tissues and cancerous tumors, offering promising prospects for early diagnosis and intervention. 19
Turning our attention to skin cancer, bioimpedance spectroscopy has emerged as a valuable asset in the clinical setting. This technology aids healthcare practitioners in conducting examinations and identifying potential cases of non-melanoma skin cancer, contributing to early intervention and improved patient outcomes. 20 Moreover, several research endeavors have explored the synergy between bioimpedance spectroscopy and advanced classification algorithms. This collaboration has demonstrated a remarkable ability to discriminate between malignant and benign skin lesions, further enhancing the diagnostic potential of bioimpedance in skin cancer assessment.21,22
One pivotal criterion for distinguishing skin cancer from benign growths involves the relationship between lesion impedance and reference skin impedance. Multifrequency bioimpedance measurements have played a pivotal role in this regard, offering promising results in the detection of skin cancer and its differentiation from other benign conditions. 23
The advantages of using bioimpedance spectroscopy in skin cancer detection are multifaceted. It exhibits high sensitivity, aids in informed decisions regarding biopsy excisions, possesses the capability to detect both melanoma and non-melanoma skin cancer, and facilitates the longitudinal monitoring of melanocytic lesions. This ongoing monitoring, made possible through bioimpedance, has significant implications for the early detection of melanoma, potentially leading to more successful treatment outcomes. 24
In the context of breast cancer detection, bioimpedance has demonstrated its diagnostic potential across a spectrum of frequencies, providing insights into impedance differences between benign and cancerous breast tissues. Notably, within the frequency range of 100 Hz–2 MHz, distinct impedance behaviors have been observed, with a notable differentiation occurring at 200 kHz. 25 In addition, a novel approach involves comparing impedance values between the right and left breasts at a low frequency of 1 kHz, enabling the identification of malignant tumors. Significant impedance disparities in this comparison signal the necessity for further medical evaluation, highlighting the sensitivity of bioimpedance in breast cancer diagnosis. 26 Moreover, bioimpedance measurements offer the ability to differentiate between breast cancer and benign tumors through the computation of indicators related to the Cole–Cole parameters, offering a comprehensive assessment tool for clinicians in distinguishing between these conditions. 27 A significant stride in the application of bioimpedance in breast cancer care lies in its role in identifying lymphedema, a condition often associated with breast cancer patients. In a comprehensive analysis encompassing 14 distinct research investigations, bioimpedance spectroscopy was harnessed to detect lymphedema in its early stages, well before clinical symptoms became apparent. The results were compelling, demonstrating exceptional sensitivity and specificity in identifying lymphedema. 28 The potential of bioimpedance spectroscopy extends beyond diagnosis. It offers the early identification of lymphedema, enabling prompt intervention to halt its progression. Moreover, it proves beneficial in monitoring the responsiveness to lymphedema treatments and evaluating the effectiveness of diverse therapeutic approaches. 
However, it is essential to acknowledge the challenges associated with bioimpedance spectroscopy, including the need for standardized protocols, calibration, and stringent quality control measures. Variables such as temperature, hydration levels, and electrode placement can impact measurement accuracy and should be carefully considered in its application. 29
Materials and Methods
This section reviews the dielectric properties of biological tissues and presents the mathematical models used to describe them, including the Debye and Cole–Cole models. It also outlines the benefits of machine learning algorithms for breast cancer detection by highlighting promising results reported in this domain.
Dielectric properties of biological tissues
The dielectric properties of living tissues depend on tissue composition, which in turn depends on the health of the tissue. These properties also depend on the frequency of the electrical excitation applied to the tissue. 30 Owing to their important applications, especially in medicine, many studies have focused on measuring and modeling the electrical properties of different types of biological tissues in different frequency ranges.
One study determined the electrical properties of breast tissue samples from three classes (healthy, malignant, and benign) in the range from 1 to 20 GHz and found a clear difference between the values for healthy and cancerous tissues. 31
The dielectric properties of healthy and malignant human skin have been estimated in many studies, in the frequency range from 20 to 100 GHz 32 and from 1 to 100 MHz, 33 where different layers of human skin were found to show different values of dielectric properties.
Human and porcine liver tissue was characterized in vivo and ex vivo in three states (healthy, malignant, and cirrhotic) over a wide frequency range of 0.5–20 GHz. A significant difference was found between the ex vivo dielectric properties of healthy and malignant tissues, as well as a difference in the electrical properties of healthy tissues between ex vivo and in vivo conditions at certain frequencies. 34
In a study covering 20 types of tissue, the dielectric properties were determined over a wide frequency range extending from 10 Hz to 20 GHz, making the effect of frequency on these properties clear. 35
The dielectric properties of biological tissues vary with frequency through three dispersion regions, α, β, and γ, as shown in Figure 1. The α-dispersion occurs at low frequencies (10 Hz–10 kHz) and is mainly affected by the ionic atmosphere surrounding the cells, the β-dispersion describes structural relaxation (10 kHz–10 MHz), and the γ-dispersion is associated with the relaxation of water molecules at high frequencies (0.1–100 GHz).36–38

Dispersion regions, idealized. 38
Mathematical models of dielectric properties of biological tissues
The ability to simulate the interaction between biological tissues and electromagnetic fields is a starting point for many applications, especially in the medical field. Hence, models that characterize the dielectric properties of biological tissues are essential. The Debye model gives the complex relative permittivity in terms of frequency and a set of parameters:39–42
For the second-order Debye model:
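The equations referred to here do not appear in this text. As a hedged reconstruction, the Debye model is commonly written in the following forms, first with a single pole and then with two poles (second order). The symbols are assumptions based on standard notation: ε∞ is the permittivity at infinite frequency, εs the static permittivity, Δε the magnitude of a dispersion, τ a relaxation time, σs the static ionic conductivity, ε0 the permittivity of free space, and ω the angular frequency. Some authors omit the conductivity term.

```latex
\varepsilon_r^{*}(\omega) = \varepsilon_\infty
  + \frac{\varepsilon_s - \varepsilon_\infty}{1 + j\omega\tau}
  + \frac{\sigma_s}{j\omega\varepsilon_0}

\varepsilon_r^{*}(\omega) = \varepsilon_\infty
  + \frac{\Delta\varepsilon_1}{1 + j\omega\tau_1}
  + \frac{\Delta\varepsilon_2}{1 + j\omega\tau_2}
  + \frac{\sigma_s}{j\omega\varepsilon_0}
```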
A single-dispersion Debye model was used with magnetic resonance images (MRIs) to develop a realistic numerical model of the breast,45–47 and the same technique was used to estimate the spatially averaged dielectric properties of breast tissue. 48 Parameters for one-pole and two-pole Debye models of healthy and malignant breast tissues were presented in the range from 0.5 to 20 GHz. 40 The frequency dependence of the relative permittivity and conductivity of breast tissue was modeled using a single-pole Debye model for the microwave imaging via space-time technique for early breast cancer detection. 49
By deriving a fourth-order Debye model for each tissue of the head in the range from 0.1 to 3 GHz, a three-dimensional finite-difference time-domain (FDTD) model of the head was formed that can be used to study the effect of the electromagnetic field on the head. 50
Following the same approach as the Debye model, the Cole–Cole model (first presented by Cole and Cole 51) describes the dispersion and absorption of a large number of liquids and dielectrics over a wide frequency range. Because of the complexity of the structure and composition of biological material, each dispersion region may be broadened by various contributions; for this reason, a distribution parameter α, with a value between 0 and 1, was added to describe the broadening of the dispersion. The general equation of the Cole–Cole model is given in (8).
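Equation (8) is not reproduced in this text. A commonly used multi-term form, consistent with the four-term model of Gabriel et al. cited in this section, is sketched below. The symbol names are assumptions based on standard notation: N is the number of dispersion regions, and Δεn, τn, and αn are the magnitude, relaxation time, and distribution parameter of the n-th dispersion. Setting every αn to 0 recovers the Debye model.

```latex
\varepsilon_r^{*}(\omega) = \varepsilon_\infty
  + \sum_{n=1}^{N} \frac{\Delta\varepsilon_n}{1 + (j\omega\tau_n)^{1-\alpha_n}}
  + \frac{\sigma_s}{j\omega\varepsilon_0}
```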
The order of the Cole–Cole model used to predict a material’s dielectric properties is determined by the dispersion regions considered. 52
For frequencies above a few hundred megahertz, the complex permittivity can be represented using a single pole, 53 as dipolar relaxation of water is the main relaxation mechanism. Then, the Cole–Cole equation can be written as:
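The equation itself is missing from this text. Under the usual sign convention, in which the complex permittivity is split into a real part ε′ and a loss part ε″, the single-pole Cole–Cole expression is generally written as below. The notation (ε∞, Δε, τ, α, σs, ε0) is assumed from standard usage rather than taken from the original equation.

```latex
\varepsilon_r^{*}(\omega) = \varepsilon'(\omega) - j\,\varepsilon''(\omega)
  = \varepsilon_\infty
  + \frac{\Delta\varepsilon}{1 + (j\omega\tau)^{1-\alpha}}
  + \frac{\sigma_s}{j\omega\varepsilon_0}
```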
The real and imaginary parts of a single-dispersion Cole–Cole model are given by the equations: 54
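These equations are also absent from this text. Assuming the single-pole form with a static conductivity term σs and the convention ε* = ε′ − jε″, expanding (jωτ)^(1−α) with Euler's formula gives the following real and imaginary parts, where D is a shorthand introduced here for the common denominator.

```latex
D(\omega) = 1 + 2(\omega\tau)^{1-\alpha}\sin\!\left(\tfrac{\alpha\pi}{2}\right)
  + (\omega\tau)^{2(1-\alpha)}

\varepsilon'(\omega) = \varepsilon_\infty
  + \frac{\Delta\varepsilon\left[1 + (\omega\tau)^{1-\alpha}
    \sin\!\left(\tfrac{\alpha\pi}{2}\right)\right]}{D(\omega)}

\varepsilon''(\omega) =
  \frac{\Delta\varepsilon\,(\omega\tau)^{1-\alpha}
    \cos\!\left(\tfrac{\alpha\pi}{2}\right)}{D(\omega)}
  + \frac{\sigma_s}{\omega\varepsilon_0}
```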
The work of Gabriel et al., 52 which established a four-term Cole–Cole model for 17 different tissue types in the frequency range from 10 Hz to 100 GHz, is one of the best-known works in this area. In another work, Sasaki et al. 55 determined, through an identification algorithm, the parameters of a double Cole–Cole model for 43 types of tissues and organs in the frequency range from 1 MHz to 20 GHz.
As the Cole–Cole model allows us to overcome the complexities of experimental measurements, it was used in many studies to model the propagation of electromagnetic waves in biological tissues.53,56 For medical applications, this model was derived for healthy and malignant breast tissue within a frequency range from 0.5 to 50 GHz, where there was a difference between the model parameters for healthy and malignant tissues. 57 In the same way, a one-pole model of normal, malignant, and benign breast tissue was derived within the frequency range from 0.5 to 20 GHz. 31 In addition, the Cole–Cole model was used to find a relationship between changes in dielectric properties and glucose concentration in blood plasma, through which blood glucose concentration can be monitored. 58 In many other studies, this model was used to simulate near-realistic models of some human organs.59–61
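To illustrate how such a model is evaluated numerically, the following sketch computes the complex relative permittivity and the effective conductivity of a single-pole Cole–Cole model at a few frequencies. The two parameter sets are hypothetical placeholders chosen only to show a contrast between tissues; they are not the values of Table 1, and the function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def cole_cole(freq_hz, eps_inf, delta_eps, tau, alpha, sigma_s):
    """Complex relative permittivity of a single-pole Cole-Cole model."""
    omega = 2.0 * math.pi * freq_hz
    pole = delta_eps / (1.0 + (1j * omega * tau) ** (1.0 - alpha))
    return eps_inf + pole + sigma_s / (1j * omega * EPS0)

def effective_conductivity(freq_hz, eps_complex):
    """Effective conductivity (S/m) from the loss part: omega * eps0 * eps''."""
    omega = 2.0 * math.pi * freq_hz
    return -eps_complex.imag * omega * EPS0

# Hypothetical parameter sets, NOT the values of Table 1.
NORMAL = dict(eps_inf=7.0, delta_eps=40.0, tau=1.0e-8, alpha=0.1, sigma_s=0.05)
TUMOR = dict(eps_inf=9.0, delta_eps=90.0, tau=1.5e-8, alpha=0.1, sigma_s=0.40)

if __name__ == "__main__":
    for f in (20e3, 1e6, 100e6):
        for name, params in (("normal", NORMAL), ("tumor", TUMOR)):
            eps = cole_cole(f, **params)
            print(f"{f:>11.0f} Hz {name:>6}: eps' = {eps.real:8.2f}, "
                  f"sigma = {effective_conductivity(f, eps):.4f} S/m")
```

Python's complex power uses the principal branch, so `(1j * omega * tau) ** (1 - alpha)` matches the usual definition of the fractional power in the model; with `alpha = 0` the function reduces to a Debye pole.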
Advantages of machine learning in detecting breast cancer
Breast cancer detection has witnessed significant advancements through machine learning techniques. In a recent study, a hybrid convolutional neural network (CNN)–support vector machine (SVM) model was introduced for the automated detection of breast tumors in ultrasound images. The model exhibited impressive performance metrics, including an accuracy of 97.5%, a sensitivity of 96.0%, and a specificity of 98.3%, as demonstrated on a dataset comprising 200 ultrasound images of breast tumors. Notably, this hybrid model surpassed the individual CNN and SVM models, showcasing superior accuracy, sensitivity, and specificity in the context of breast tumor detection. 62
A review encompassing diverse deep learning techniques for breast cancer detection across multiple modalities, encompassing mammography, ultrasound, thermography, and MRI, was conducted. It was concluded that the CNN classifier is deemed suitable for early breast cancer detection, attaining a notable 99% accuracy when compared with alternative classifiers. Furthermore, the complexities and forthcoming directions within the field of deep learning for breast cancer detection were discussed. 63
An optimized SVM-based model was introduced for breast cancer prediction using the Bayesian search method to identify the optimal hyperparameters of the SVM classifier. Evaluation was conducted on two datasets, namely the Wisconsin Breast Cancer Dataset (WBCD) and Breast Cancer Coimbra Dataset (BCCD). The results indicated that an accuracy of 97.14% was attained on WBCD, while 83.33% accuracy was achieved on BCCD. These accuracies surpassed those of the SVM classifier with default hyperparameters. 64
A breast cancer contour detection model was introduced, designed for application in conventional ultrasound, elasticity, and Doppler images. The segmentation of contours was carried out using region-based level sets, followed by classification using SVM to discern between breast cancer contours and false contours. Features were extracted from level sets and through the FM method, which combines information from ultrasound, elasticity, and Doppler images. The model achieved an impressive accuracy rate of 98.5% when tested on a dataset containing 40 breast tumor images. 65
A CNN method was proposed to improve breast cancer identification in whole-slide images. Using various CNN architectures and a large dataset of 275,000 50 × 50-pixel RGB image patches, the system achieved an accuracy of 87%. This approach has the potential to reduce diagnostic errors. 66
The CNN algorithm, utilizing the Wisconsin breast cancer dataset, demonstrates impressive performance in binary classification (benign vs. malignant). The novel CNN model achieves a remarkable validation accuracy of 97.85%, outperforming both the standard CNN and LSTM models, which achieve accuracies of 94.12% and 93.50%, respectively, with different optimizers. Meanwhile, the multilayer perceptron attains an accuracy of 92.44% using the Adam optimizer. 67
An automated model for cancer detection and classification from microscopic biopsy images utilizing biologically interpretable features was proposed. The approach used the Socially Spider Optimization encryption method in conjunction with a neural network for categorizing cancerous biopsy images. The framework’s performance was assessed using accuracy, sensitivity, specificity, and Matthews Correlation Coefficient evaluations, resulting in respective values of 95.91%, 94.25%, 97.12%, and 97.68%. 68
A cost-sensitive learning approach for breast cancer classification using the random forest (RF) algorithm was proposed. The model was evaluated on the Wisconsin breast cancer dataset and achieved an accuracy of 97.51%. 69
A machine learning approach for breast cancer prediction utilized the WBCD and two classifiers, RF and Gaussian Naïve Bayes. To address the dataset imbalance, the Borderline Synthetic Minority Oversampling Technique (BSM) was applied. Results show that the combination of BSM with the RF algorithm achieved the highest recall score, ∼99.80%, whereas BSM with the Gaussian Naïve Bayes classifier resulted in the lowest recall score, at 78.20%. 70
Simulations
The impedance values for tissues with tumors exhibit a consistent pattern of being lower than those for healthy tissue, indicating that the presence of a tumor significantly influences the electrical properties of the tissue. Furthermore, as the tumor size increases, the impedance values decrease more substantially, underscoring the greater impact of larger tumors on tissue impedance.
In the case of human breast tissue, as depicted in Figure 2, a similar trend is observed. With increasing frequency, the difference in impedance between healthy breast tissue and breast tissue with tumors diminishes. For instance, at 20 kHz, the impedance difference between healthy breast tissue and breast tissue with a 20 mm tumor is 294.6 Ω, whereas at 100 MHz, the difference reduces to 82.15 Ω. Similarly, at 20 kHz, the impedance difference between healthy breast tissue and breast tissue with a 10 mm tumor is 41.5 Ω, whereas for breast tissue with a 20 mm tumor, the difference significantly increases to 294.6 Ω. This observation highlights that larger tumors result in more pronounced differences in impedance compared with smaller tumors.

Difference of impedance between normal human breast and human breast containing a tumor in the frequency range from 20 kHz to 100 MHz, with tumor radius: T1 = 10 mm, T2 = 12 mm, T3 = 15 mm, and T4 = 20 mm.
In our study, we leverage this characteristic to effectively distinguish between healthy breast tissue and breast tissue containing tumors. Based on these findings, we have chosen a minimum frequency of 20 kHz for bioimpedance measurements.
In the initial step of our simulation, we developed a breast model representing it as a half-sphere as shown in Figure 3. To accurately capture the dielectric properties of breast tissue, we incorporated the Cole–Cole model. Similarly, within this breast model, we represented the tumor as a sphere, and its dielectric properties were also characterized using the Cole–Cole model. The dielectric properties of the tissues are modeled using Cole–Cole parameters as shown in Table 1 for human breast. For the purpose of electrical impedance measurements, we positioned a set of eight negative electrodes evenly distributed around the base of the breast model. In addition, a positive electrode was placed on the top of the breast model.

3D breast model containing a tumor and surrounded by 8 negative electrodes, with the positive electrode positioned on top.
Cole–Cole Parameters for Normal Breast Tissue and Tumor in the Frequency Range from 20 kHz to 100 MHz 71
In this simulation, we incorporated three breast sizes, with radii of 60, 70, and 80 mm. Within each breast size, tumors with radii of 5, 8, and 10 mm were introduced.
To comprehensively assess the impact of these parameters, we conducted 200 bioimpedance measurements for each combination of breast size and tumor size. During each measurement, we used the Latin hypercube sampling (LHS) algorithm to calculate the coordinates for the tumor’s position within the breast model. This allowed us to capture the dynamic impedance behavior associated with different tumor positions.
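A minimal sketch of how Latin hypercube sampling could place tumor centers inside a half-sphere is given below. It is an illustration under stated assumptions, not the procedure used in the study: the base of the breast is taken to lie in the plane z = 0, the whole tumor is required to fit inside the model, and the mapping from the unit cube to the half-sphere, the function names, and the seed are all hypothetical choices.

```python
import math
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one point per stratum per axis."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def tumor_centers(n_samples, breast_radius, tumor_radius, seed=0):
    """Map LHS points to tumor centers (mm) inside a half-sphere.

    Assumption: centers are drawn from a reduced half-sphere of radius
    R - 2r that is lifted by r along z, so the tumor never crosses the
    base plane or the curved surface.
    """
    rng = random.Random(seed)
    rho_max = breast_radius - 2.0 * tumor_radius
    centers = []
    for u, v, w in latin_hypercube(n_samples, 3, rng):
        rho = rho_max * u ** (1.0 / 3.0)  # uniform in volume
        cos_theta = v                      # polar angle limited to the upper half
        sin_theta = math.sqrt(1.0 - cos_theta ** 2)
        phi = 2.0 * math.pi * w
        centers.append((rho * sin_theta * math.cos(phi),
                        rho * sin_theta * math.sin(phi),
                        tumor_radius + rho * cos_theta))
    return centers

if __name__ == "__main__":
    points = tumor_centers(200, breast_radius=60.0, tumor_radius=10.0)
    print(len(points), points[0])
```

Compared with plain random sampling, the stratification guarantees that each of the 200 intervals along every sampled coordinate is used exactly once, which spreads the tumor positions more evenly for a fixed number of simulations.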
In addition to the tumor-focused measurements, we performed 100 bioimpedance measurements for healthy breasts without tumors in each breast size category. For each measurement, we utilized the LHS algorithm to determine the coordinates for slight perturbations in the position of the positive electrode. In total, for each breast size, we conducted 4,800 measurements with tumors, considering 200 tumor positions, 3 tumor sizes, and 8 electrodes. In addition, we performed 800 measurements without tumors, accounting for 100 different positions of the positive electrode and 8 electrodes in total. By adopting this comprehensive approach, we were able to thoroughly explore the bioimpedance characteristics of breasts containing tumors of varying sizes and positions, while also establishing a baseline for healthy breast tissue impedance.
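The measurement counts quoted above can be checked with a few lines of arithmetic. The sketch assumes that one "sample" is the set of eight electrode readings taken for a single tumor position or electrode perturbation; that reading of the text, like the variable names, is an assumption.

```python
BREAST_RADII_MM = (60, 70, 80)
TUMOR_RADII_MM = (5, 8, 10)
N_ELECTRODES = 8
TUMOR_POSITIONS = 200    # per breast size and tumor size
HEALTHY_POSITIONS = 100  # positive-electrode perturbations per breast size

def counts_per_breast_size():
    """Sample and measurement counts for one breast size."""
    tumor_samples = TUMOR_POSITIONS * len(TUMOR_RADII_MM)
    return {
        "tumor_samples": tumor_samples,
        "tumor_measurements": tumor_samples * N_ELECTRODES,
        "healthy_samples": HEALTHY_POSITIONS,
        "healthy_measurements": HEALTHY_POSITIONS * N_ELECTRODES,
    }

if __name__ == "__main__":
    per_size = counts_per_breast_size()
    print("per breast size:", per_size)
    totals = {k: v * len(BREAST_RADII_MM) for k, v in per_size.items()}
    print("all breast sizes:", totals)
    ratio = per_size["tumor_samples"] / per_size["healthy_samples"]
    print("tumor-to-healthy sample ratio:", ratio)
```

Under this reading, tumor samples outnumber healthy samples in every breast size category, so the two classes are imbalanced before any further processing.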
Following the preparation of the dataset, we proceeded with the classification process, using three distinct machine learning methods: SVM, RF, and CNN. The classification process is divided into two distinct parts. In the first part, individual classifications are performed for each breast size, considering all tumor sizes within each category. Subsequently, in the second part, a global classification is conducted, encompassing all breast sizes and tumor sizes, as illustrated in the following algorithm steps:
1. Initialize arrays, variables, and data structures. Create arrays for storing evaluation metrics.
2. Define breast size categories.
3. For each breast size category:
   a. Augment the healthy data to match the tumor data.
   b. Combine the data and create corresponding labels.
   c. Normalize the impedance values using z-score normalization.
4. For each breast size category:
   a. Perform 5-fold cross-validation.
   b. Train a machine learning model.
   c. Evaluate the model’s performance.
   d. Save the trained model for the current breast size.
5. Initialize arrays for storing metrics of the global model.
6. Perform 5-fold cross-validation for the combined dataset.
7. Train a global machine learning model.
8. Evaluate the global model’s performance.
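The steps above can be sketched in a few dozen lines. The code below is an illustration only: it runs on synthetic impedance vectors rather than simulation output, it uses a nearest-centroid classifier as a plainly labeled stand-in for the SVM, RF, and CNN models, and the augmentation method (jittered resampling of healthy samples) is an assumption, since the text does not specify how the healthy data were augmented.

```python
import random
import statistics

def zscore(rows):
    """Z-score each feature column (one column per electrode)."""
    cols = list(zip(*rows))
    mu = [statistics.fmean(c) for c in cols]
    sd = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, mu, sd)] for r in rows]

def augment(rows, target, rng, jitter=0.01):
    """Grow a class to `target` rows by jittered resampling (assumed method)."""
    out = list(rows)
    while len(out) < target:
        base = rng.choice(rows)
        out.append([x * (1.0 + rng.gauss(0.0, jitter)) for x in base])
    return out

def fit_centroids(X, y):
    """Stand-in classifier: one centroid per class (not SVM, RF, or CNN)."""
    cents = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        cents[label] = [statistics.fmean(c) for c in zip(*members)]
    return cents

def predict(cents, x):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda k: sum((a - b) ** 2 for a, b in zip(x, cents[k])))

def cross_validate(X, y, k=5, seed=0):
    """Return the mean accuracy over k shuffled folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return statistics.fmean(scores)

def make_synthetic(n, mean, rng, n_electrodes=8):
    """Hypothetical impedance vectors; real data would come from the simulation."""
    return [[rng.gauss(mean, 5.0) for _ in range(n_electrodes)] for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(42)
    tumor = make_synthetic(600, 480.0, rng)  # lower impedance with a tumor
    healthy = augment(make_synthetic(100, 500.0, rng), len(tumor), rng)
    X = zscore(tumor + healthy)
    y = [1] * len(tumor) + [0] * len(healthy)
    print(f"5-fold accuracy: {cross_validate(X, y):.3f}")
```

The sketch normalizes the full dataset before splitting, following the order of the listed steps. Computing the z-score statistics on the training folds only would be a stricter variant that avoids sharing information with the held-out fold.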
Next, the results of our classification experiments using various machine learning techniques will be discussed.
Results
In this section, we will present the outcomes obtained from using various machine learning algorithms for breast cancer detection.
SVM model performance
The outcomes obtained using the SVM algorithm for breast cancer detection are as follows:
Performance Metrics for Breast Size SVM Models and Global SVM Model
SVM, support vector machine.
CNN model performance
The outcomes obtained using the CNN algorithm for breast cancer detection are as follows:
Performance Metrics for Breast Size CNN Models and Global CNN Model
CNN, convolutional neural network.
RF model performance
For the RF technique, in contrast to the other two models used in this study, we adopted a unique approach that involved varying the number of electrodes from 2 to 8. This approach allowed us to assess the influence of the number of electrodes on the performance of the RF model. The findings are presented in Table 4.
Performance Metrics for Different Measurements and Breast Sizes for Random Forest Model
RF, random forest.
In summary, increasing the number of measurements (electrodes) generally leads to improved performance for both specific breast sizes and the global RF model. This improvement is observed in terms of accuracy, sensitivity, specificity, and the F1 score, indicating that more measurements provide better information for classifying breast tissue samples and detecting tumors.
Interestingly, our findings suggest the existence of a saturation point where increasing the number of measurements leads to only marginal improvements in performance. Specifically, the shift from 6 to 8 measurements for each tumor position, especially in the case of larger breast sizes, resulted in relatively minor performance gains.
This observation implies the presence of an optimal balance between the quantity of measurements and computational resources. Beyond a certain threshold, accumulating more measurements may not necessarily lead to significant enhancements in model performance. This underscores the importance of exercising prudence in weighing the efforts of data acquisition against the associated costs and computational complexities.
In the comparative analysis of the global classification models, the RF model emerges as the top performer. RF achieved an accuracy of 99.67%, the highest sensitivity (99.78%), and the highest specificity (99.33%), giving an F1 score that rounds to 1.00. The SVM model reached an accuracy of 90.50% and a sensitivity of 93.72%, with moderate specificity (80.83%) and an F1 score of 0.87. The CNN model performed comparably, with an accuracy of 89.53%, sensitivity of 91.50%, specificity of 87.56%, and an F1 score of 0.89. Thus, the RF model demonstrates superior classification capabilities, making it the preferred choice for this task.
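For reference, the four metrics reported here follow from the entries of a binary confusion matrix, as in the short sketch below. The counts in the example are hypothetical and are not taken from the study's tables; the function name is illustrative, and the F1 score shown is the standard single-class definition, which may differ from the averaging used in the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
    for name, value in metrics.items():
        print(f"{name:>12}: {value:.4f}")
```

Because the F1 score is reported to two decimals, a value of 1.00 does not by itself imply error-free classification; it only requires the score to be at least 0.995.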
Discussion
In this study, we investigated the performance of various machine learning models for breast cancer detection, including SVM, CNN, and RF models. The results obtained from these models demonstrate promising accuracy and sensitivity in classifying breast tissue samples of different sizes, highlighting their effectiveness in tumor detection.
The SVM model exhibited reasonable performance for breast tissue samples of 60 mm, achieving an average accuracy of 84.00% and high sensitivity (95.17%). However, specificity was relatively low at 50.50%, indicating challenges in correctly classifying healthy tissue samples. As breast tissue size increased to 70 and 80 mm, the SVM model’s accuracy improved, reaching 97.50% and 99.25%, respectively. Sensitivity increased, while specificity remained relatively stable.
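A model can combine fairly high accuracy with low specificity when tumor samples outnumber healthy ones, because accuracy is a class-weighted average of sensitivity and specificity: accuracy = p × sensitivity + (1 − p) × specificity, where p is the fraction of tumor samples. The sketch below inverts this identity for the 60 mm SVM figures quoted above. It assumes the same class fraction in every evaluation run, which is not stated in the text, and the function name is hypothetical.

```python
def implied_positive_fraction(accuracy, sensitivity, specificity):
    """Solve accuracy = p*sensitivity + (1 - p)*specificity for p,
    the fraction of positive (tumor) samples.

    Assumes sensitivity != specificity and a constant class fraction.
    """
    return (accuracy - specificity) / (sensitivity - specificity)


# SVM figures for the 60 mm breast size as quoted in the text.
p = implied_positive_fraction(accuracy=0.8400,
                              sensitivity=0.9517,
                              specificity=0.5050)
print(f"implied tumor fraction: {p:.3f}")
```

Under that assumption, the implied tumor fraction is close to three quarters, which would be consistent with a dataset containing three tumor sizes for each healthy case. This is an inference from the reported metrics, not a figure given by the study.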
The CNN model demonstrated competitive performance, with an accuracy of 89.53% and an F1 score of 0.89 for the global dataset. High sensitivity and strong specificity indicated effective tumor detection and good performance in identifying healthy tissue samples. As the breast tissue size increased, the CNN model’s accuracy improved, reaching 99.75% for both 70 mm and 80 mm sizes, with sustained high sensitivity and improved specificity.
The RF model exhibited excellent performance across different breast sizes, achieving high accuracy, sensitivity, and specificity. With an increase in the number of measurements, the RF model’s performance consistently improved, reaching perfect accuracy, sensitivity, and specificity for breast tissue samples of 60, 70, and 80 mm. This suggests that the RF model becomes more accurate and effective as more measurements are considered.
In comparing the global classification models, the RF model emerged as the top performer, achieving the highest accuracy (99.67%), sensitivity (99.78%), and specificity (99.33%), with an F1 score that rounds to 1.00. In contrast, the SVM model exhibited lower accuracy (90.50%) and sensitivity (93.72%), with moderate specificity (80.83%) and an F1 score of 0.87. The CNN model showed comparable performance, with an accuracy of 89.53%, sensitivity of 91.50%, specificity of 87.56%, and an F1 score of 0.89.
These results underscore the potential of machine learning models in breast cancer detection and emphasize the importance of selecting an appropriate model based on specific task requirements. The RF model demonstrated superior classification capabilities, making it the preferred choice for this task. However, further research is needed to investigate the impact of different model parameters, feature selection techniques, and data preprocessing methods on model performance.
Conclusion
This simulation-based study has shown the utility of bioimpedance as a tool for the early detection of breast cancer. It underscores the importance of multielectrode configurations for bioimpedance measurements, which contribute significantly to the accuracy and reliability of breast cancer detection. Moreover, this research highlights the role of machine learning techniques in enhancing breast cancer detection, particularly in automated and precise classification tasks.
A comparison of the results shows that the RF model outperforms SVM and CNN in breast cancer detection. RF achieved the highest accuracy, sensitivity, and specificity, with an F1 score that rounds to 1.00, making it the preferred choice for this medical application.
As a future direction, this study could be further improved by extending the analysis to include smaller tumor sizes and a more extensive dataset obtained from experimental measurements. This expansion could provide even greater insights into the effectiveness of bioimpedance-based breast cancer detection and contribute to the ongoing efforts to improve early diagnosis and treatment outcomes in breast cancer patients.
Footnotes
Acknowledgment
The authors acknowledge Ecole Militaire Polytechnique for providing the necessary resources and facilities for this research.
Authors’ Contributions
O.B.: Conceptualization, investigation, visualization, writing—original draft, and writing—reviewing and editing. Y.A.: Investigation, writing—original draft, and writing—reviewing and editing. A.Z.: Supervision, investigation, and writing—reviewing and editing.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
