Choledochal cancer region detection in hyperspectral images using U-Net based models

Abstract

Cholangiocarcinoma (CCA) is a type of cancer that forms in the bile ducts, which carry digestive fluid from the liver. CCA is a primary form of liver cancer that typically affects people aged 60 to 69 years, and it is difficult to diagnose at an early stage. Hyperspectral (HS) imaging is an advanced imaging technique that combines spectroscopy with conventional imaging, and it is an emerging field of study that can be used for early CCA detection. HS imaging captures images across many spectral bands, forming a three-dimensional data cube often called a hyperspectral data cube. In this study, two U-Net based models, namely U-Net and DenseUNet, were used to perform semantic segmentation on HS images of CCA tissues. A band-selective approach was employed to derive a subset of meaningful bands based on the spectrum plot of the HS image. The HS images were further preprocessed with Principal Component Analysis (PCA). The models were evaluated by computing accuracy, AUC (area under the ROC curve), sensitivity, and specificity. The proposed U-Net and DenseUNet models reported overall accuracies of 73.47% and 77.09%, respectively. The DenseUNet model outperforms the U-Net model on every evaluation metric. The proposed models were also compared with other state-of-the-art (SOTA) models trained on various HS datasets. This study explores the application of HS imaging in carcinoma detection, and its findings could be used to further enhance the approach.

Keywords

Hyperspectral imaging, U-Net, DenseUNet

1. Introduction

Cholangiocarcinoma (CCA), also known as Choledochal cancer, is a type of cancer that forms in the slender tubes (bile ducts) that carry the digestive fluid bile. The bile ducts connect the liver to the gallbladder and to the small intestine. CCA is the primary cancer of the bile ducts. It originates from the malignant transformation of cholangiocytes, the epithelial cells lining the biliary system [1]. CCA is generally categorized into two types based on its location within the biliary tree: intrahepatic and extrahepatic. Intrahepatic CCA develops within the liver parenchyma, forming distinct mass lesions and often displaying advanced clinical symptoms. Extrahepatic CCA originates in the larger bile ducts, including the left and right hepatic ducts, the common hepatic duct, and the common bile duct. CCA is a significant contributor to primary liver cancers, accounting for roughly 10% to 25% of these malignancies globally [2]. CCA rarely occurs before the age of 40; the typical age at presentation is between 60 and 69 years [3]. Because symptoms are absent in the early stages and because of the location of the tumors, CCA is usually diagnosed at an advanced stage. Advanced diagnostic techniques such as fluorescence in situ hybridization (FISH) and mutational analysis have become crucial for accurate diagnosis [4]. Early diagnosis is necessary to prevent a fatal outcome for the patient. The primary method for diagnosing CCA is histopathological analysis of choledochal tissues stained with hematoxylin and eosin (HE); sample micrographs of Choledochal tissues stained with HE are depicted in Fig. 1.

Figure 1.

Micrographs of distinct categories of Cholangiocarcinoma obtained from [5] (a) Tissue without cancer regions (b) Tissue with partial cancer regions (c) Tissue entirely affected by cancer.

Hyperspectral imaging (HSI) is a non-ionizing sensing technique that captures diffuse reflectance spectra across the visible (VIS) and near-infrared (NIR) wavelength range [6]. It involves capturing multiple images across a range of adjacent spectral bands, which allows the reflectance spectrum to be reconstructed for every pixel. This process results in the acquisition of a three-dimensional hypercube. The spatially resolved spectra provide valuable tissue diagnostic data, facilitating non-invasive monitoring of biopsies, histological and fluorometric analysis, and an improved understanding of diseases. In the resulting hyperspectral cube, two dimensions represent the spatial extent of the scene and the third dimension represents the spectral content [7]. HSI has found applications in numerous fields, such as archaeology [8, 9, 10], vegetation and water resource control [11, 12, 13, 14, 15, 16], food quality control [17, 18, 19, 20], and many more. HSI is an advanced imaging technique that combines spectroscopy and imaging: it collects spectral information at each pixel of a 2D detector array, resulting in a 3D dataset containing both spectral and spatial information, often referred to as a hypercube [21]. HS imaging can be used to provide intra-operative feedback to the surgeon for objective assessment of cancer [22]. Studies show the effectiveness of HS imaging in non-invasive tissue analysis and in the detection of cancer in ex vivo human tissue samples of the breast [23, 24, 25], skin [26, 27, 28, 29, 30, 31], colon [6, 32], brain [33], etc.

Conventional algorithms such as Support Vector Machines [34], K-Nearest Neighbors [35], Convolutional Neural Networks [36], and other image processing algorithms have been used on RGB (Red Green Blue) color space images for carcinoma detection. HS images require a substantial amount of memory, so applying image processing techniques to them is resource-intensive. For example, in the dataset used in this study, the average size of an HS image is about 150 Megabytes (MB). These images contain multiple bands representing spectral information, which increases the dimensionality of the dataset. Therefore, preprocessing is necessary to reduce dimensionality and make HS images compatible with conventional algorithms and models.

In this study, we explore the capabilities of U-Net based models on HS images of Choledochal tissues. This study proposes two semantic segmentation models, namely U-Net and DenseUNet, which were trained on HS images preprocessed with Principal Component Analysis (PCA) to perform semantic segmentation. The proposed models are also compared with other state-of-the-art (SOTA) models trained on various HS datasets.

2. Literature review

The literature reviewed for this study indicates that hyperspectral (HS) imaging is an emerging field that demonstrates enhanced performance compared to RGB color space images. However, the storage requirements of HS images are enormous, which makes them difficult to work with; various image processing techniques are therefore needed before a conventional model can be trained on HS image data. In [37], the authors investigated the segmentation of rat bile duct carcinoma from hyperspectral images (HSI) using the Otsu algorithm (OTSU) and a support vector machine (SVM). Their study demonstrated the potential of HSI in detecting liver tumors, contributing to automated tumor detection techniques and paving the way for improved surgical outcomes and future applications in image-guided interventions for liver cancer. Using a combination of spectral and spatial data analysis, the study in [23] employed a broadband hyperspectral imaging technique based on the U-Net [38] architecture that identifies breast cancers with high efficiency.

In [24], Aboughaleb et al. investigated the efficacy of Hyperspectral Imaging (HSI) combined with advanced processing techniques for diagnosing ex vivo breast cancer. Their study revealed distinct optical responses in breast tissue properties, using K-means clustering to differentiate between malignant and normal tissue. The results demonstrated high sensitivity (95%) and specificity (96%) in discriminating tumor regions from normal tissue, suggesting the potential of HSI for improving surgical outcomes compared to conventional methods. In [26], the authors presented a hyperspectral imaging (HSI) technique that provides a non-invasive tool for skin cancer diagnosis. Their research demonstrates the effectiveness of HS imaging in detecting and classifying pigmented skin lesions (PSLs), achieving high sensitivity (87.5%) and specificity (100%) in discriminating between benign and malignant PSLs. Their study underscores the potential of HSI to assist dermatologists in real-time diagnosis during clinical practice, offering a significant advancement in skin cancer detection. Manni et al. [6] assessed the potential of Hyperspectral Imaging (HSI) for automating colon cancer detection during surgery. Their study employed a spectral-spatial patch-based classification approach on six ex vivo specimens, which demonstrated promising results with a sensitivity of 0.88 and a specificity of 0.78. Comparison with deep learning approaches highlights the superiority of their hybrid CNN method, paving the way for improved surgical outcomes with HSI guidance.

The study [32] proposed a machine learning and hyperspectral imaging technique for automatic colon and esophagogastric cancer recognition. Their study highlights the effectiveness of 3D Convolutional Neural Networks (3DCNN) with a high ROC-AUC of 0.93, suggesting the potential of this approach in clinical practice. Jansen-Winkeln et al. in [39] explored the potential of combining Hyperspectral Imaging (HSI) with artificial intelligence algorithms for automatic colorectal cancer (CRC) detection. Using a four-layer perceptron neural network, they achieved a sensitivity of 86% and specificity of 95% in distinguishing cancerous or adenomatous tissue from healthy mucosa. Additionally, HSI revealed significant perfusion parameter differences related to tumor staging and neoadjuvant therapy, suggesting its ability to detect chemotherapy-induced biological changes. Khan et al. in [40] provided a review on the surge of deep learning applications in medical hyperspectral image analysis. Addressing a gap in the literature, the paper explores how deep learning methods are utilized for classification, segmentation, and detection in this domain. By synthesizing current research, the authors identify challenges and propose strategies for future advancements, offering valuable insights for researchers in the field. The study done by Tsai et al. in [41] proposed a method combining Hyperspectral Imaging (HSI) with deep learning for early esophageal cancer detection. Using a single-shot multibox detector (SSD)-based system, they achieve 88% accuracy with white-light endoscopic images (WLI) and 91% with narrow-band endoscopic images (NBI). Compared to RGB images, this approach shows a 5% increase in accuracy for both WLI and NBI, suggesting substantial improvement in cancer detection precision. Urbanos et al. in [33] explore the synergy between supervised machine learning (ML) methods and hyperspectral imaging (HSI) techniques for brain cancer classification. 
Their study uses HSI and ML algorithms such as SVM, Random Forest (RF), and CNN to differentiate healthy and tumor tissues during brain tumor surgery. The results showed overall accuracy ranging from 60% to 95%, indicating promising potential for ML-assisted diagnosis and surgical guidance in brain cancer. Wang et al. (2021) [25] proposed a PCA-U-Net method for segmenting breast cancer nests from hyperspectral images. By combining unsupervised principal component analysis with the U-Net neural network, the approach achieves an 87.14% segmentation accuracy, offering potential for aiding pathologists in diagnosing breast cancer lesions and advancing tumor diagnosis.

A method for staging skin cancer, focusing on squamous cell carcinoma (SCC), using Hyperspectral Microscopic Imaging (HMI) and machine learning was developed in [27]. The study highlights the importance of early detection due to increasing global incidence. The authors optimized their approach, achieving a staging accuracy of 0.952 $\pm$ 0.014 and a kappa value of 0.928 $\pm$ 0.022, with spectral data from nuclear compartments proving most influential in accurate SCC staging. In study [42], the authors assessed the utility of hyperspectral imaging (HSI) combined with deep learning for detecting thyroid carcinoma on whole histologic slides. They compare the performance of classifiers trained on different imaging modalities using a dataset of 33 tissues from patients with follicular thyroid carcinoma. The study showcased that the deep learning classifier trained on HSI data achieves the highest AUC-ROC of 0.966, outperforming classifiers trained on RGB and HSI-synthesized RGB data. This highlights the potential of HSI to improve cancer classification accuracy on entire histologic slides, offering an automated approach for thyroid cancer detection. The study conducted in [43] explored the efficacy of hyperspectral imaging (HSI) with band selection and color reproduction for early esophageal cancer detection. By enhancing the visibility of blood vessel features in simulated narrow-band endoscopic images (NBIs) from white-light images (WLIs), they improved prediction performance. With a dataset of 1780 esophageal cancer images, they achieved mean average precision (mAP) of 80% in WLIs, 85% in NBIs, and 84% in HSI images. This study highlights HSI’s potential to boost accuracy by around 5% compared to white-light imagery, in line with previous narrow-band imaging (NBI) findings.

Agrawal et al. [7] proposed a lossy compression method for hyperspectral images, employing modified convolutional autoencoders with attention layers. The encoder-decoder architecture, tested on images taken from Airborne Visible / Infrared Imaging Spectrometer (AVIRIS), Reflective Optics System Imaging Spectrometer (ROSIS), and NASA EO1, achieves up to a 5% increase in Peak Signal to Noise Ratio (PSNR) and up to 200 times higher compression ratio compared to existing methods, addressing the challenge of processing large hyperspectral datasets efficiently. La Salvia et al. conducted a study [44], which utilized Hyperspectral Imaging (HSI) to automate glioblastoma segmentation during surgery. Their AI-based approach, employing deep learning techniques, improves processing times for real-time segmentation. Evaluated against ground truths, their method enhances the gold-standard machine learning pipeline for intraoperative glioblastoma delineation. Mohamed et al. [45] introduced the Automated Laryngeal Cancer Detection and Classification using a Dwarf Mongoose Optimization Algorithm with Deep Learning (ALCAD-DMODL) technique for automating the detection and classification of laryngeal cancer (LCA). This method combined deep learning with the Dwarf Mongoose Optimization Algorithm to enhance accuracy in identifying LCA from throat region images, surpassing existing approaches in performance metrics. In the study [28], the authors investigated the use of hyperspectral imaging (HSI) combined with deep learning to classify skin cancer lesions. Utilizing the ISIC dataset, they trained models using YOLOv5 on both HSI and RGB images. Results showed that the HSI model outperformed the RGB model in identifying squamous cell carcinoma (SCC) features, with a recall rate of 0.794. This study highlights the potential of HSI technology for improving skin cancer classification accuracy. 
In [46], the authors addressed the challenge of early melanoma diagnosis using hyperspectral imaging and deep learning. Their study, encompassing samples from 50 melanoma and nevus patients, achieved promising classification accuracies of 89% and 98% for one-dimensional and two-dimensional data, respectively. This approach shows potential for improving diagnostic precision in distinguishing between melanoma and nevus, offering a non-invasive alternative to traditional histological methods. The studies in [29, 30] focus on parallelizing HS processing methods using CUDA to expedite classification, emphasizing the need for efficient disease detection. The results showed significant improvements in classification times with parallel SVM and XGBoost algorithms, affirming the suitability of GPUs for hyperspectral image analysis.

Huang et al. [31] pioneered the use of AI and hyperspectral imaging for identifying skin lesions, notably distinguishing Mycosis fungoides (MF) from psoriasis (PsO) and atopic dermatitis (AD). Using a dataset of 1659 skin images, the authors developed a multi-frame AI algorithm that achieved high accuracy in lesion segmentation and classification. Their study highlights the potential of AI and HSI in dermatological diagnostics, offering a non-invasive and efficient approach for early detection of skin conditions. In summary, the literature reviewed provides valuable insights into HS imaging. However, it is important to acknowledge a limitation observed in the existing literature: the enormous storage requirements of HS images make them difficult to work with.

3. Dataset description

In this study, we used a secondary dataset, the Multidimensional Choledoch Database [5], to train the proposed models for semantic segmentation. The Multidimensional Choledoch Database contains both microscopy hyperspectral images and RGB color space images of cholangiocarcinoma tissues stained with HE (hematoxylin and eosin). All the images in the dataset were meticulously labelled by experienced pathologists to generate annotation files; these annotation files are further processed to form the ground truth maps used for training.

Figure 2.

HS images in the dataset with their respective ground truth masks: (a) Sample with full cancer regions (N). (b) Sample with no cancer regions (P). (c) Sample with partial cancer regions (L) [47].

The images in the dataset can be categorized into three types: L (samples with partial cancer regions, accompanied by annotation files), N (samples with full cancer regions), and P (samples without cancer regions). The dataset contains 880 samples of multidimensional images captured from choledoch tissues of 174 patients. Among these multidimensional images, 689 scenes contain partial cancer areas, 49 scenes depict complete cancer areas, and 142 scenes do not feature any cancer areas. The annotations are stored in “.xml” files, which contain the coordinates of the polygons that represent cancerous regions. These “.xml” files must be converted into binary masks that can be used for training and testing. These binary masks serve as ground truth maps, as can be seen in Fig. 2: the white regions in the mask represent the cancerous regions, while the black regions indicate the non-cancerous regions of the tissue. In the following sections we discuss the image acquisition system and the image formats.
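
The conversion from annotation polygons to a binary mask can be illustrated with a minimal sketch. It assumes that the polygon vertices have already been read from the “.xml” file as (x, y) pixel coordinates; the XML schema is not described here, so parsing is omitted, and the function name `polygon_to_mask` is hypothetical. The rasterization uses the even-odd rule evaluated at pixel centres, which is one common choice and not necessarily the one used to build the ground truth maps in this study.

```python
import numpy as np

def polygon_to_mask(vertices, height, width):
    """Rasterize one polygon, given as a list of (x, y) vertices, into a mask.

    Pixels whose centre lies inside the polygon are set to 255 (cancerous),
    all other pixels stay 0 (non-cancerous).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    px = xs + 0.5  # x-coordinate of each pixel centre
    py = ys + 0.5  # y-coordinate of each pixel centre
    inside = np.zeros((height, width), dtype=bool)
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge never crosses a horizontal ray
        # True where the edge straddles the horizontal line through the centre
        crosses = (y0 > py) != (y1 > py)
        # x-coordinate at which the edge meets that horizontal line
        x_at_y = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        # Even-odd rule: toggle on every crossing to the right of the centre
        inside ^= crosses & (px < x_at_y)
    return inside.astype(np.uint8) * 255

# Hypothetical square annotation on an 8 x 8 image
mask = polygon_to_mask([(2, 2), (6, 2), (6, 6), (2, 6)], height=8, width=8)
```

Masks for several polygons in the same image can be combined with `np.maximum`, so that a pixel is white if it falls inside any annotated region.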

3.1 Image acquisition system

The imaging system comprises a microscope (Nikon 80i, Nikon Corp.), an acousto-optic tunable filter (AOTF) adapter (VA310-.37.80-L, Brimrose Corp.), an SPF Model AOTF controller (VFI130-140SPFB2C2exSTS, Brimrose Corp.), a gray scientific complementary metal oxide semiconductor detector (sCMOS, Dhyana 400D, Tucsen Corp.), a color charge coupled device detector (color CCD, DigiRetina 16, Tucsen Corp.), and a personal computer [5]. The hyperspectral images in the dataset are captured using the system depicted in Fig. 3.

Figure 3.

Schematic of the image acquisition system used to capture the HS images, taken from [5].

Single-band images are acquired by sCMOS with wavelength ranging from 550 nm to 1000 nm, utilizing narrow bandwidth via the AOTF [5]. These images contain two-dimensional spatial data and one-dimensional spectral data. They can be visualized as a three-dimensional cube.

3.2 Image formats

The dataset contains the microscopy HS images, which are stored in two file formats: “.hdr” and “.raw”. The “.hdr” files contain a description of the corresponding “.raw” files. Important parameters stored in the “.hdr” file include “band $=$ 60”, which indicates that the hyperspectral data cube includes 60 bands, and “interleave $=$ bsq”, which specifies how the image is stored, in this case the BSQ (band sequential) format. The “.raw” files contain the actual hyperspectral data. These files occupy most of the storage in the dataset, as can be seen in Table 1. The dataset was acquired through the “Kaggle” platform; however, due to the platform’s storage restrictions, we were unable to access the entire dataset, which is why Table 1 lists fewer samples than the full database. The typical size of a single hyperspectral image is approximately 150 MB, and the total storage required for the samples listed in Table 1 is approximately 91.06 GB.
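
As an illustration of the BSQ layout, the following minimal sketch reads a “.raw” file into a NumPy array. The sample type (16-bit unsigned integers) and the absence of a header offset are assumptions and should be checked against the “.hdr” file; the function name `read_bsq_cube` is hypothetical. Under the 16-bit assumption, a cube of $1024\times 1280\times 60$ samples occupies $1024\times 1280\times 60\times 2$ bytes, which is 150 MiB and is consistent with the approximate image size reported above.

```python
import os
import tempfile

import numpy as np

def read_bsq_cube(raw_path, bands, lines, samples, dtype=np.uint16, offset=0):
    """Load a band-sequential (BSQ) file as an array of shape (lines, samples, bands)."""
    count = bands * lines * samples
    data = np.fromfile(raw_path, dtype=dtype, count=count, offset=offset)
    if data.size != count:
        raise ValueError("file is smaller than the declared cube dimensions")
    # BSQ stores band 0 in full, then band 1, and so on
    cube = data.reshape(bands, lines, samples)
    return np.transpose(cube, (1, 2, 0))

# Demo on a tiny synthetic cube written to a temporary file
reference = np.arange(2 * 3 * 4, dtype=np.uint16).reshape(2, 3, 4)  # (bands, lines, samples)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.raw")
    reference.tofile(path)
    cube = read_bsq_cube(path, bands=2, lines=3, samples=4)

# Size of one full image under the 16-bit assumption
bytes_per_image = 60 * 1024 * 1280 * np.dtype(np.uint16).itemsize
print(cube.shape, bytes_per_image / 2**20)
```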

Table 1
Storage details of ‘.raw’ files in the database [47]

Category	Number of samples	Average size	Total size
L	483	150 MB	75.81 GB
N	44	150 MB	6.6 GB
P	55	150 MB	8.65 GB
Total	582		91.06 GB

4. Research methodology

In this section we discuss methods and models used in this study. We begin by describing the Principal Component Analysis (PCA) algorithm, U-Net model and DenseUNet model. We also delve into the details related to the various methods employed in this study and also discuss the rationale behind selecting those methods.

4.1 Dimensionality reduction using principal component analysis (PCA)

As is evident from Table 1, HS images have large storage requirements, which makes processing them challenging. Due to the large number of bands, HS images often suffer from the “curse of dimensionality” phenomenon [48]. Many bands within a hyperspectral image are also strongly correlated. PCA is one of the oldest and simplest techniques for reducing the dimensionality of a dataset while preserving as much “variability” as possible. Principal Component Analysis (PCA) serves as a descriptive tool that does not rely on distributional assumptions. It is an adaptive exploratory method suitable for analyzing numerical data of various types. The PCA transformation is a linear conversion of the original image bands into a collection of new, uncorrelated features.

Consider an $n\times p$ data matrix $X$, where each of the $n$ rows represents a different repetition of the experiment and each of the $p$ columns represents a specific type of feature. The transformation is defined by a set of $l$ $p$-dimensional vectors of weights or coefficients $w_{\left(k\right)}=\left({w_{1},\ldots,w_{p}}\right)_{\left(k\right)}$. These coefficients map each row vector $x_{\left(i\right)}$ of $X$ to a new vector of principal component scores $t_{\left(i\right)}=\left({t_{1},\ldots,t_{l}}\right)_{\left(i\right)}$, given by

$$t_{k\left(i\right)}=x_{\left(i\right)}\cdot w_{\left(k\right)}\quad\text{for}\quad i=1,\ldots,n,\quad k=1,\ldots,l \tag{1}$$

In this manner, the individual variables $t_{1},\ldots,t_{l}$ of $t$, considered over the dataset, successively inherit the maximum possible variance from $X$, with each coefficient vector $w$ constrained to be a unit vector [49].

In hyperspectral images, many bands are strongly correlated. PCA applies a linear transformation to the original bands of the image to produce a new collection of uncorrelated features. These new features are determined by the eigenvectors of the image covariance matrix, where each eigenvalue denotes the variance along the direction of its corresponding eigenvector. A very small number of principal components can therefore capture a significant portion of the variation in the image.
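
The relationship between the covariance eigenvectors and the scores in Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the implementation used in this study: the data matrix is centred before projection, the function name `pca_scores` is hypothetical, and the synthetic spectra are only a stand-in for the rows of $X$.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return (scores, components, explained_variance_ratio) for an n x p matrix X."""
    Xc = X - X.mean(axis=0)                 # centre each feature (band)
    cov = np.cov(Xc, rowvar=False)          # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :n_components]           # columns are the unit vectors w_(k)
    T = Xc @ W                              # t_k(i) = x_(i) . w_(k) on centred data
    ratio = eigvals[:n_components] / eigvals.sum()
    return T, W, ratio

rng = np.random.default_rng(0)
# Synthetic stand-in for pixel spectra: 1000 pixels, 6 strongly correlated bands
base = rng.normal(size=(1000, 1))
X = base @ rng.normal(size=(1, 6)) + 0.1 * rng.normal(size=(1000, 6))
T, W, ratio = pca_scores(X, n_components=3)
```

The entries of `ratio` correspond to the contribution rate of each retained component, and the columns of `T` are mutually uncorrelated by construction.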

Figure 4.

Contribution rate of each principal component after applying PCA.

In [47], we applied PCA to the HS image and retained only the first component, which retains around 80.95% of the original variance, as can be seen in Fig. 4. This study extends that approach by retaining the top three components, which together retain around 96.96% of the original variance.

4.2 U-Net

The U-Net [38] model is a popular model that was originally introduced for biomedical image segmentation. U-Net falls under the category of Fully Convolutional Networks (FCNs), which are neural networks that contain only convolutional layers. The architecture of U-Net makes it capable of performing semantic segmentation with very few training images while yielding more precise segmentations.

Figure 5.

Architecture of a classical U-Net model [38].

The U-Net architecture comprises a contracting pathway and an expanding pathway, resulting in a symmetrical model structure. The contracting path comprises a sequence of convolutions, each succeeded by a rectified linear unit (ReLU) and a max-pooling operation with a stride of 2 for downsampling. At every downsampling step the number of feature channels is doubled. In the expansion phase of the model, the feature map is upsampled and then subjected to a 2 $\times$ 2 convolution that halves the number of feature channels. The result is concatenated with the corresponding cropped feature map from the contracting path. Subsequently, two 3 $\times$ 3 convolutions are applied, each followed by a ReLU. In the final layer, a convolution is employed to transform each 64-component feature vector into the required number of classes. The architecture of the U-Net model is schematically depicted in Fig. 5. The contracting path is responsible for feature extraction and allows the network to capture contextual information about the image. As the image proceeds through the contracting phase of the model, its spatial dimensions decrease while the important features are preserved. The architecture of U-Net makes it suitable for performing effective semantic segmentation on various types of images.

4.3 DenseUNet

FCNs based on encoder (contracting) and decoder (expanding) architectures usually have millions of parameters and suffer from the vanishing gradient problem: because these networks are very deep, the gradient signal must backpropagate across many layers.

Figure 6.

Network architecture of DenseUNet [50].

DenseUNet [50] is a modified version of the classical U-Net architecture that uses Dense Blocks (DB) to create a densely connected U-Net. Similar to U-Net, the architecture of DenseUNet consists of contracting and expanding paths. DenseUNet has four major core blocks: the Down Transition Block (DTB), the Up Transition Block (UTB), the bottleneck, and the Dense Block (DB). The DBs are added to the network to address the vanishing gradient problem. The DTB consists of two layers: a 2 $\times$ 2 max pooling layer with stride 2 and a dropout layer with a dropout rate of 0.2. Hence, the output feature maps of the DTB are halved in size compared to the input feature maps, while retaining the same number of channels. The UTB comprises two layers: a 2 $\times$ 2 upsampling layer and a dropout layer with a dropout rate of 0.2. When upsampling, the UTB doubles the spatial size of the feature maps and outputs twice as many channels as the input feature maps. The DTB and UTB form the contracting and expanding paths of the DenseUNet model. DenseUNet uses dense connectivity, where each layer is connected to all preceding layers. This allows for better feature propagation and reuse, which means that DenseUNet can learn more complex and discriminative features with fewer parameters. In addition, DenseUNet uses dropout layers within the architecture to prevent overfitting. Dropout layers randomly drop some of the activations in a layer during training, which helps to regularize the model and improve generalization performance.
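
Dense connectivity and the spatial effect of the transition blocks can be illustrated with a toy NumPy sketch. It is not the DenseUNet implementation of [50]: the learned convolutions are replaced by random 1 $\times$ 1 projections, dropout and the channel change in the UTB are omitted, and the names `dense_block`, `down_transition`, `up_transition`, and the `growth_rate` parameter are assumptions introduced only for illustration.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block on an (H, W, C) feature map.

    Each layer receives the concatenation of the block input and all earlier
    layer outputs, and contributes `growth_rate` new channels.
    """
    features = [x]
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=-1)
        # Random 1x1 projection as a stand-in for a learned convolution
        w = rng.normal(size=(stacked.shape[-1], growth_rate))
        features.append(np.maximum(stacked @ w, 0.0))  # ReLU
    return np.concatenate(features, axis=-1)

def down_transition(x):
    """2x2 max pooling with stride 2: halves H and W, keeps C (H, W even)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def up_transition(x):
    """2x2 nearest-neighbour upsampling: doubles H and W."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
y = dense_block(x, num_layers=3, growth_rate=2, rng=rng)  # 4 + 3 * 2 channels
down = down_transition(y)
up = up_transition(down)
```

Because every layer output is kept and concatenated, the gradient has a short path from the block output back to each layer, which is the property the dense blocks rely on to ease training of deep networks.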

5. Proposed methods

In this section we describe the various methods and approaches employed in the proposed method. Preprocessing of HS images is an important step for training the Fully Convolutional Networks (FCNs) models. This section describes the various methods employed for preprocessing the HS images. Further, we also discuss the configuration details of the FCNs in great detail.

5.1 Data preprocessing

Figure 7.

(a) Spectrum plot for cancerous and non-cancerous regions in a sample HS image. (b) Contribution rate for the top 3 components after applying PCA.

Figure 8.

Hyperspectral Image preprocessing pipeline.

The proposed method consists of several stages that take place sequentially. The first stage is to preprocess the HS images; this stage is common to both models. A band-selective approach is implemented, wherein specific bands are chosen based on the spectrum curve of the hyperspectral (HS) image. As depicted in Fig. 7(a), bands 10 to 50 were selected from the original HS image because the highest intensity is observed in this range of the spectral plot. The preprocessing steps are schematically depicted in Fig. 8. The preprocessing starts with the original hyperspectral data cubes, which are first normalized by removing the mean and scaling to unit variance. The normalization standardizes the dataset, which makes further processing more consistent. After normalization, the next stage is to apply Principal Component Analysis (PCA), an effective algorithm for dimensionality reduction. The storage requirements of hyperspectral images are enormous, which makes them difficult to preprocess and makes training models on them resource-intensive. Initially, the hyperspectral (HS) images have dimensions of $1024\times 1280\times 60$. After applying PCA to the HS images, their dimensions are reduced to $1024\times 1280\times 3$. This dimensionality reduction makes the images more suitable for training. The top three PCA components retain around 96.96% of the original variance of the HS image, as can be seen in Fig. 7(b). In [47], we used a similar approach, in which the HS images were preprocessed with PCA, only the first component was retained, and normalization was applied afterwards. In the approach proposed in this study, however, the first three principal components are retained and the normalization is applied before PCA. The next stage in the preprocessing phase is to extract patches from the preprocessed HS images, that is, to split each image into multiple small tiles, which are easier to work with.
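
The order of operations described above (band selection, normalization, then PCA) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact pipeline of this study: band indices are treated as 0-based with an inclusive upper bound, the normalization is applied per band, PCA is computed through a singular value decomposition, and the function name `preprocess_cube` is hypothetical. A small random cube stands in for a real $1024\times 1280\times 60$ image.

```python
import numpy as np

def preprocess_cube(cube, first_band=10, last_band=50, n_components=3):
    """Band selection -> per-band standardization -> PCA on an (H, W, B) cube."""
    h, w, _ = cube.shape
    # Assumption: 0-based band indices, inclusive upper bound
    selected = cube[:, :, first_band:last_band + 1].astype(np.float64)
    pixels = selected.reshape(-1, selected.shape[-1])  # one spectrum per row
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    std[std == 0] = 1.0                                # guard against constant bands
    z = (pixels - mean) / std                          # zero mean, unit variance
    # PCA via SVD of the standardized data; rows of vt are the components
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)              # contribution rates
    return scores.reshape(h, w, n_components), explained[:n_components]

rng = np.random.default_rng(0)
demo_cube = rng.random((32, 40, 60))                   # synthetic stand-in
reduced, explained = preprocess_cube(demo_cube)
```

On real tissue spectra the first entries of `explained` would be expected to dominate, as in Fig. 7(b); on the random stand-in used here they are close to uniform.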

We used a tile size of 256 pixels with a stride of 256 pixels, which ensures that the tiles do not overlap; each preprocessed HS image therefore yields 20 tiles ($4\times 5$). These tiles have dimensions of $256\times 256\times 3$. The ground truth masks corresponding to the HS images are split into tiles in the same way. The tiles are stored as arrays that form the dataset used for training and evaluating the Fully Convolutional Networks (FCNs). Together, these preprocessing phases form the Hyperspectral Image Preprocessing Pipeline through which each image is passed; the output of this pipeline is a series of smaller HS image patches that are used for training and testing.
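
The tiling step can be sketched in NumPy as follows. The function name `extract_tiles` is hypothetical, and zero-filled arrays stand in for a preprocessed image and its ground truth mask; with a tile size and stride of 256, a $1024\times 1280$ image yields $4\times 5=20$ non-overlapping tiles.

```python
import numpy as np

def extract_tiles(image, tile=256, stride=256):
    """Split an (H, W, C) image into tiles of shape (tile, tile, C)."""
    h, w = image.shape[:2]
    tiles = [
        image[r:r + tile, c:c + tile]
        for r in range(0, h - tile + 1, stride)   # top edge of each tile row
        for c in range(0, w - tile + 1, stride)   # left edge of each tile column
    ]
    return np.stack(tiles)

# Stand-ins for one preprocessed HS image and its ground truth mask
image = np.zeros((1024, 1280, 3), dtype=np.float32)
mask = np.zeros((1024, 1280, 1), dtype=np.uint8)
image_tiles = extract_tiles(image)
mask_tiles = extract_tiles(mask)
```

Applying the same function to the image and to its mask keeps the two tile sequences aligned, so tile `k` of the image always corresponds to tile `k` of the mask.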

5.2 Image segmentation using U-Net

In this study, we propose a U-Net based FCN for performing semantic segmentation on the preprocessed HS image patches. In [47], a classical U-Net model was used to perform semantic segmentation on hyperspectral (HS) images of Choledochal cancer tissues. That model was effective in segmenting the images and performed well across diverse evaluation metrics. This study proposes a similar U-Net model; however, the proposed model was trained on image patches, which made it less computationally demanding. The model was trained with the hyperparameters shown in Table 2. The “n_filters” parameter refers to the number of filters, also known as kernels or channels, applied to the input data. Each filter is responsible for detecting specific patterns or features in the input data. Increasing the number of filters allows the CNN to learn more diverse and complex patterns from the input data, potentially improving its ability to extract meaningful features and make accurate predictions [36]. The “Dropout Rate” parameter is used for regularizing the FCN. By training the model with various dropout rates and evaluating its performance, we identified 0.5 as the value that best avoided overfitting while maximizing the effectiveness of the model.

Table 2
Hyperparameters for U-Net model

Hyperparameter	Value
Epochs	25
Batch size	32
Optimizer	Adam
Loss	Binary cross entropy
n_filters	20
Learning rate	$1\times 10^{-9}$
Dropout rate	0.5

Table 3

Parameter information for the proposed U-Net model

	Layer		Kernel function (size/channels)	Activation function	Output size
Convolution part	Down 1	conv 1.1	$3\times 3/1$	ReLU	$256\times 256$
		conv 1.2	$3\times 3/1$
		conv 1.3	$3\times 3/1$
	Down 2	max pool 2.1	$2\times 2/-$	ReLU	$128\times 128$
		conv 2.1	$3\times 3/128$
		conv 2.2	$3\times 3/128$
	Down 3	max pool 3.1	$2\times 2/-$	ReLU	$64\times 64$
		conv 3.2	$3\times 3/256$
		conv 3.3	$3\times 3/256$
	Down 4	max pool 4.1	$2\times 2/-$	ReLU	$32\times 32$
		conv 4.2	$3\times 3/512$
		conv 4.3	$3\times 3/512$
	Down 5	max pool 5.1	$2\times 2/-$	ReLU	$16\times 16$
		conv 5.2	$3\times 3/1024$
		conv 5.3	$3\times 3/1024$
Deconvolution part	Up 1	up-conv 6.1	$2\times 2/512$	ReLU	$32\times 32$
		conv 6.2	$3\times 3/512$
		conv 6.3	$3\times 3/512$
	Up 2	up-conv 7.2	$2\times 2/256$	ReLU	$64\times 64$
		conv 7.2	$3\times 3/256$
		conv 7.3	$3\times 3/256$
	Up 3	up-conv 8.1	$2\times 2/128$	ReLU	$128\times 128$
		conv 8.1	$3\times 3/128$
		conv 8.3	$3\times 3/128$
	Up 4	up-conv 9.1	$3\times 3/64$	ReLU	$256\times 256$
		conv 9.1	$3\times 3/64$
		conv 9.3	$3\times 3/64$
Output		conv 10	$1\times 1/2$	Sigmoid	$256\times 256$

A convolution layer requires two important parameters: the kernel size and the number of channels. The kernel size sets the dimensions of the filter matrix. The channels parameter defines the depth, that is, the number of feature channels of the convolutional layer. The parameters of the convolutional layers used in the proposed classical U-Net model are listed in Table 3. The U-Net conducts feature extraction via convolutional and pooling layers in the contracting path, whereas upsampling in the expanding path is performed by the up-convolution (up-conv) layers. The model takes an input image of size $256\times 256\times 3$, which is reduced to a spatial size of $16\times 16$ at the end of the contracting path. The expanding path then yields an image containing the segmentation predicted by the model. This segmentation image has dimensions of $256\times 256$. The output layer uses a sigmoid activation function to produce the binary segmentation result.
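The spatial sizes listed in Table 3 can be checked with a short sketch. It assumes that the $3\times 3$ convolutions preserve the spatial size (padded convolutions), so that only the four $2\times 2$ max-pooling and four up-convolution steps change it; the function names are hypothetical.

```python
def contracting_sizes(input_size=256, num_pools=4):
    """Spatial size after each 2x2 max-pooling step (size is halved)."""
    sizes = [input_size]
    for _ in range(num_pools):
        sizes.append(sizes[-1] // 2)
    return sizes

def expanding_sizes(bottleneck_size=16, num_upconvs=4):
    """Spatial size after each up-convolution step (size is doubled)."""
    sizes = [bottleneck_size]
    for _ in range(num_upconvs):
        sizes.append(sizes[-1] * 2)
    return sizes

down = contracting_sizes()       # Down 1 ... Down 5
up = expanding_sizes(down[-1])   # bottleneck, then Up 1 ... Up 4
print(down)
print(up)
```

Under these assumptions the contracting path ends at $16\times 16$ and the expanding path restores the input resolution of $256\times 256$.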

5.3 Image segmentation using DenseUNet

U-Net is regarded as the state-of-the-art (SOTA) method for biomedical segmentation [38]. However, the vanishing gradient problem restricts U-Net’s training capability; moreover, U-Net often includes millions of learnable parameters, which require substantial computational resources. Similar to U-Net, the architecture of DenseUNet features contracting and expanding paths. DenseUNet integrates advancements from both U-Net and DenseNet. DenseUNet requires relatively few parameters, which makes training the model less resource-intensive; this is discussed further in a subsequent section.

Table 4
Parameter information about the proposed DenseUNet model

Block	Block details	Output size	Output channels
Input		256 $\times$ 256	3
Conv	3 $\times$ 3	256 $\times$ 256	32
DB	L $=$ 4, g $=$ 8	256 $\times$ 256	64
DTB		128 $\times$ 128	64
DB	L $=$ 4, g $=$ 16	128 $\times$ 128	128
DTB		64 $\times$ 64	128
DB	L $=$ 4, g $=$ 32	64 $\times$ 64	256
DTB		32 $\times$ 32	256
DB	L $=$ 4, g $=$ 64	32 $\times$ 32	512
UTB		64 $\times$ 64	256
Concatenation		64 $\times$ 64	512
Bottleneck		64 $\times$ 64	256
DB	L $=$ 4, g $=$ 64	64 $\times$ 64	512
Bottleneck		64 $\times$ 64	256
UTB		128 $\times$ 128	128
Concatenation		128 $\times$ 128	256
Bottleneck		128 $\times$ 128	128
DB	L $=$ 4, g $=$ 32	128 $\times$ 128	256
Bottleneck		128 $\times$ 128	128
UTB		256 $\times$ 256	64
Concatenation		256 $\times$ 256	128
Bottleneck		256 $\times$ 256	64
DB	L $=$ 4, g $=$ 16	256 $\times$ 256	128
Bottleneck		256 $\times$ 256	64
Conv	3 $\times$ 3	256 $\times$ 256	2
Conv	3 $\times$ 3	256 $\times$ 256	1

The exact configuration of the DenseUNet model used in this study is shown in Table 4. The input layer takes an image of dimensions $256\times 256$ with 3 channels. As the image proceeds through the model, the number of channels increases, which helps the model learn the important features of the image. At the end of the contracting path, the feature map has a dimension of $32\times 32$ with 512 channels. It then goes through the expanding path, which upsamples it to the original dimensions to produce the segmentation result.

The Dense Block (DB) contains “L” layers; the value of “L” determines the depth of the Dense Block, which in turn affects the representational capacity of the network. The growth rate “g” is a key hyperparameter in architectures like DenseUNet: it determines the number of additional feature maps that each layer within a block contributes.
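The relation between “L”, “g” and the channel counts in Table 4 can be illustrated with a small calculation. The sketch assumes the usual dense-connectivity rule, in which each layer appends “g” feature maps to its input, so a block with $c$ input channels outputs $c + L\cdot g$ channels; the function name is hypothetical.

```python
def dense_block_channels(in_channels, num_layers, growth_rate):
    """Output channels of a dense block: each layer appends growth_rate maps."""
    channels = in_channels
    for _ in range(num_layers):
        channels += growth_rate
    return channels

# (input channels, L, g) for the four encoder Dense Blocks in Table 4
encoder_blocks = [(32, 4, 8), (64, 4, 16), (128, 4, 32), (256, 4, 64)]
encoder_outputs = [dense_block_channels(c, L, g) for c, L, g in encoder_blocks]

# (input channels after Bottleneck, L, g) for the three decoder Dense Blocks
decoder_blocks = [(256, 4, 64), (128, 4, 32), (64, 4, 16)]
decoder_outputs = [dense_block_channels(c, L, g) for c, L, g in decoder_blocks]

print(encoder_outputs)
print(decoder_outputs)
```

With this rule the computed values match the output channels listed for every DB row in Table 4.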

Table 5

Hyperparameters used for the proposed DenseUNet model

Hyperparameter	Value
Epochs	25
Batch size	32
Learning rate	$1\times 10^{-9}$
Loss	Binary cross entropy
Optimizer	Adam
n_filters	16

Furthermore, DenseUNet yields results comparable to conventional methods with reduced pre- and post-processing requirements [50]. The proposed DenseUNet model was trained on the same dataset of HS image patches. The DenseUNet model incorporates Dropout layers within the DTB, UTB and Bottleneck blocks, which eliminates the need to explicitly define the Dropout Rate hyperparameter for the model. As with U-Net, the “n_filters” parameter is the number of filters applied to the input data. In this study we used a classical DenseUNet model, which was trained using the hyperparameters shown in Table 5.

6. Results

The dataset was partitioned such that 70% was allocated for training the models, while the remaining 30% was reserved for testing. An additional 10% of the training dataset was used for validating the model at the end of each epoch, which makes it possible to check that the evaluation metrics remain consistent throughout the training phase; this validation strategy is called “epoch-wise validation”. Data augmentation was applied to the patch dataset by randomly flipping the images horizontally and vertically, which helps the models generalize to unseen data. Both models were trained using a “P100” GPU accelerator on the Kaggle platform. The models were optimized using Adam, a popular algorithm for gradient-based optimization of machine learning models [51]. To evaluate the performance of the segmentation models, we compute the accuracy, area under the curve (AUC), sensitivity and specificity of both models. The models were trained on the patches created by the Hyperspectral Image Preprocessing Pipeline.
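A minimal sketch of the split and the flip augmentation is given below. The helper names, the random seed and the 50% flip probability are assumptions made for illustration; the study does not state them.

```python
import numpy as np

def split_indices(n, test_frac=0.30, val_frac=0.10, seed=0):
    """Shuffle n patch indices into train/validation/test subsets.

    30% of all patches go to testing; 10% of the remaining training
    patches are held out for epoch-wise validation.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(round(n * test_frac))
    test, train = idx[:n_test], idx[n_test:]
    n_val = int(round(len(train) * val_frac))
    val, train = train[:n_val], train[n_val:]
    return train, val, test

def random_flip(image, mask, rng):
    """Apply the same random horizontal/vertical flip to a patch and its mask."""
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]  # horizontal flip
    if rng.random() < 0.5:
        image, mask = image[::-1, :], mask[::-1, :]  # vertical flip
    return image, mask

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Flipping the mask together with the image keeps each pixel label aligned with its spectral values, which is essential for segmentation targets.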

Table 6
Performance of the proposed models on the discussed evaluation metrics

Evaluation metric	U-Net			DenseUNet
	Training	Validation	Testing	Training	Validation	Testing
Accuracy	76.19%	74.70%	73.47%	80.79%	78.07%	77.09%
AUC	78.47%	74.99%	72.27%	85.62%	80.81%	80.10%
Sensitivity	87.06%	82.63%	78.97%	93.32%	87.55%	87.55%
Specificity	86.34%	82.76%	78.16%	92.85%	88.43%	88.43%

Figure 9.

Learning curve for U-Net: (a) Loss curve (b) Accuracy curve.

Figure 10.

Learning curve for DenseUNet: (a) Loss curve (b) Accuracy curve.

The performance metrics of both models in the different phases are shown in Table 6. The U-Net and DenseUNet models achieved accuracies of 73.47% and 77.09% respectively on the testing split. The learning curves for the U-Net and DenseUNet models, which consist of the loss and accuracy curves, can be seen in Figs 9 and 10 respectively.

The validation loss steadily decreases as training proceeds; similarly, the accuracy steadily increases throughout the training phase. This trend is evident in both models. Early stopping based on the training loss was employed for both models, which ensures that training stops when the training loss no longer improves. The performance of the models is consistent across the dataset splits, which indicates that the models do not overfit.
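The early-stopping rule can be sketched in a few lines. The patience of 3 epochs and the `min_delta` threshold are assumptions made for illustration, since the study does not report them, and the function name is hypothetical.

```python
def early_stopping_epoch(losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the monitored loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement in this epoch
            if wait >= patience:
                return epoch
    return len(losses)    # never triggered: all epochs were run

# Hypothetical training-loss history that plateaus after the third epoch
history = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.59]
stop_epoch = early_stopping_epoch(history)
print(stop_epoch)
```

With a steadily decreasing loss the rule never triggers and all scheduled epochs are run.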

7. Result discussion

In this section we discuss the results generated by the proposed models. The DenseUNet model scores slightly higher than the U-Net model on the discussed evaluation metrics across all dataset splits, indicating its effectiveness in performing semantic segmentation. The U-Net achieves an overall accuracy of 73.47% on the testing dataset, while DenseUNet achieves an overall accuracy of 77.09%. The Receiver Operating Characteristic - Area Under the Curve (ROC-AUC) is a metric used to evaluate binary classification models. It quantifies the overall performance of the model across all possible threshold values. The AUC value ranges between 0 and 1 (reported here as a percentage), where a higher value indicates better performance and 0.5 corresponds to random guessing. The ROC-AUC for U-Net is 72.27%, while DenseUNet achieves an AUC value of 80.10%. These AUC values indicate that both models separate the cancerous and non-cancerous regions of a tissue considerably better than chance, with DenseUNet doing so more reliably.
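For reference, the four metrics can be computed from pixel-wise labels as in the sketch below. The function names and the toy labels are hypothetical, and the pairwise AUC computation is chosen for clarity; it is only practical for small inputs because it compares every positive pixel with every negative pixel. Both functions assume that each class occurs at least once.

```python
import numpy as np

def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary pixel labels."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    y_pred = np.asarray(y_pred).astype(bool).ravel()
    tp = np.sum(y_true & y_pred)     # cancerous pixels predicted cancerous
    tn = np.sum(~y_true & ~y_pred)   # healthy pixels predicted healthy
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Toy example: six pixels, sigmoid outputs thresholded at 0.5
labels = np.array([1, 1, 1, 0, 0, 0])
probs = np.array([0.9, 0.8, 0.4, 0.6, 0.2, 0.1])
metrics = confusion_metrics(labels, probs >= 0.5)
auc = roc_auc(labels, probs)
print(metrics, auc)
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas the AUC is computed from the raw scores and summarizes performance over all thresholds.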

Table 7
Comparison of proposed models with SOTA models

Reference	Proposed model	Dataset	Results
[5]	Support Vector Machine (SVM)	Multidimensional Choledoch Dataset	Accuracy: 0.9375 Sensitivity: 0.9173 Specificity: 0.8234
[5]	Neural Net (NN)	Multidimensional Choledoch Dataset	Accuracy: 0.9427 Sensitivity: 0.8834 Specificity: 0.9598
[28]	YOLOv5	Skin Cancer Dataset	Accuracy: 0.7870 Sensitivity: 0.7260 Specificity: 0.7860
[47]	PCA based U-Net	Multidimensional Choledoch Dataset	Accuracy: 0.6195 Sensitivity: 0.5472 Specificity: 0.7528
Proposed models	U-Net	Multidimensional Choledoch Dataset	Accuracy: 0.7347 Sensitivity: 0.7897 Specificity: 0.7816
	DenseUNet	Multidimensional Choledoch Dataset	Accuracy: 0.7709 Sensitivity: 0.8755 Specificity: 0.8843

Figure 11.

Comparisons of memory size (in MB) of the proposed models.

As seen in Fig. 11, the trained weights of DenseUNet take around 2.34 MB of memory, whereas those of U-Net take around 7.02 MB. DenseUNet has 609,377 trainable parameters, while U-Net has 1,839,621. DenseUNet therefore has roughly one third as many trainable parameters as U-Net, which makes it considerably more memory-efficient, yet it outperforms U-Net on the discussed evaluation metrics.
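The reported memory sizes are consistent with a simple estimate from the parameter counts. The sketch assumes that each parameter is stored as a 32-bit float (4 bytes) and that 1 MB is taken as $1024^2$ bytes; neither is stated in the study, and the function name is hypothetical.

```python
def weights_size_mb(num_params, bytes_per_param=4):
    """Approximate weight storage in MB, assuming 1 MB = 1024**2 bytes."""
    return num_params * bytes_per_param / (1024 ** 2)

unet_mb = weights_size_mb(1_839_621)
denseunet_mb = weights_size_mb(609_377)
param_ratio = 1_839_621 / 609_377

print(f"U-Net: {unet_mb:.2f} MB, DenseUNet: {denseunet_mb:.2f} MB")
print(f"U-Net has {param_ratio:.2f}x the parameters of DenseUNet")
```

Under these assumptions the estimate gives about 7.02 MB for U-Net and about 2.32 MB for DenseUNet; the small gap to the reported 2.34 MB may come from file metadata or non-trainable values stored alongside the weights.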

8. Conclusion

This section further discusses and interprets the performance of the models and compares them with other SOTA models. In [5, 47], results for three conventional algorithms, namely Support Vector Machine (SVM), Neural Net (NN) and PCA based U-Net, trained on HS images of Choledochal tissues were reported; the models achieved overall accuracies of 93.75%, 94.27% and 61.95% respectively. Additionally, a YOLOv5 model trained on HS images of skin cancer achieved an accuracy of 78.70% [28]. Although accuracy is an important evaluation metric that indicates the correctness of a model, accuracy alone cannot be used to assess the overall capabilities of a model. For image segmentation tasks, other important evaluation metrics include specificity, sensitivity and AUC (area under the curve).

A comparison of the proposed models with other SOTA models trained on various datasets is shown in Table 7. The comparison reveals that the segmentation capabilities of the proposed models are comparable with those of other models. The results also indicate that the patch-based image segmentation approach is superior to training the model on entire images [47]. Both proposed models score relatively high on the discussed evaluation metrics; between the two, the DenseUNet model outperforms the U-Net model.

In conclusion, our research highlights the potential of HS imaging and advanced segmentation models for early detection of CCA. While the dataset limitations may have impacted performance to some extent, access to a more extensive dataset could further improve model accuracy. Refining these techniques and expanding dataset access could significantly enhance diagnostic capabilities and patient outcomes in CCA detection.

9. Future discussion

In this section, we explore the potential future applications and challenges that may emerge from the technologies employed in this study. Hyperspectral (HS) images provide enhanced accuracy compared to conventional RGB color space images. However, capturing HS images requires a significant investment in proprietary hardware setups, which are costly and not readily accessible. Hence, cost-effective and easily accessible hardware devices need to be developed to facilitate the advancement of this technology. HS images also demand extensive storage, which makes processing them time-consuming. Therefore, there is a need for software algorithms capable of processing HS images efficiently, thereby simplifying their usage. In summary, the findings of this study lay the groundwork for ongoing advancements and refinements in the field, paving the way for a more effective and accessible implementation of hyperspectral imaging technology in various applications.

References

1. Lazaridis K.N., Gores G.J., Cholangiocarcinoma, Gastroenterology 128 (2005), 1655–1667. doi: 10.1053/j.gastro.2005.03.040.

2. Sarcognato, Sacchi, Fassan, Fabris, Cadamuro, Zanus, Cataldo, Capelli, Baciorri, Cacciatore, Guido, Cholangiocarcinoma, Pathologica 113 (2021), 158–169. doi: 10.32074/1591-951X-252.

3. Tyson G.L., El-Serag H.B., Risk factors for cholangiocarcinoma, Hepatology 54 (2011), 173–184. doi: 10.1002/hep.24351.

4. Gonda T.A., Viterbo, Gausman, Kipp, Sethi, Poneros J.M., Gress, Park, Khan, Jackson S.A., Blauvelt, Toney, Finkelstein S.D., Mutation profile and fluorescence in situ hybridization analyses increase detection of malignancies in biliary strictures, Clinical Gastroenterology and Hepatology 15 (2017), 913–919.e1. doi: 10.1016/j.cgh..

5. Zhang, Sun, Zhou, Chu, A multidimensional choledoch database and benchmarks for cholangiocarcinoma diagnosis, IEEE Access 7 (2019), 149414–149421. doi: 10.1109/ACCESS.2019.2947470.

6. Manni, Fonolla, Der Sommen F.V., Zinger, Shan, Kho, De Koning S.B., Ruers, De With P.H.N., Hyperspectral imaging for colon cancer classification in surgical specimens: towards optical biopsy during image-guided surgery, in: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, Montreal, QC, Canada, (2020), 1169–1173. doi: 10.1109/EMBC44109.2020.9176543.

7. Agrawal, Debnath, Sagnika, Bilgaiyan, Gupta, Hyperspectral image compression using modified convolutional autoencoder, International Journal of Computer Information Systems and Industrial Management Applications 15 (2023), 396–407.

8. Liang, Advances in multispectral and hyperspectral imaging for archaeology and art conservation, Applied Physics A 106 (2012), 309–323. doi: 10.1007/s00339-011-6689-1.

9. Doneus, Verhoeven, Atzberger, Wess, Ruš, New ways to extract archaeological information from hyperspectral pixels, Journal of Archaeological Science 52 (2014), 84–96. doi: 10.1016/j.jas.2014.08.023.

10. Ødegård Ø., Mogstad A.A., Johnsen, Sørensen A.J., Ludvigsen, Underwater hyperspectral imaging: a new tool for marine archaeology, Applied Optics 57 (2018), 3214. doi: 10.1364/AO.57.003214.

11. Benelli, Cevoli, Fabbri, In-field hyperspectral imaging: An overview on the ground-based applications in agriculture, Journal of Agricultural Engineering 51 (2020), 129–139. doi: 10.4081/jae.2020.1030.

12. Kim, Baek, Stocker M.D., Smith J.E., Van Tassell A.L., Qin, Chan D.E., Pachepsky, Kim M.S., Hyperspectral Imaging from a Multipurpose Floating Platform to Estimate Chlorophyll-a Concentrations in Irrigation Pond Water, Remote Sensing 12 (2020), 2070. doi: 10.3390/rs12132070.

13. Dao, Liu, Shang, Recent Advances of Hyperspectral Imaging Technology and Applications in Agriculture, Remote Sensing 12 (2020), 2659. doi: 10.3390/rs12162659.

14. Liu, Hou, Zhang, Liu, Wang, Zhong, Tan, Xia, Qian, UAV-Borne Hyperspectral Imaging Remote Sensing System Based on Acousto-Optic Tunable Filter for Water Quality Monitoring, Remote Sensing 13 (2021), 4069. doi: 10.3390/rs13204069.

15. Wang, Liu, Zhu, Hou, Liu, A review of deep learning used in the hyperspectral image analysis for agriculture, Artificial Intelligence Review 54 (2021), 5205–5253. doi: 10.1007/s10462-021-10018-y.

16. Khan, Vibhute A.D., Mali, Patil C.H., A systematic review on hyperspectral imaging technology with a machine and deep learning methodology for agricultural applications, Ecological Informatics 69 (2022), 101678. doi: 10.1016/j.ecoinf.2022.101678.

17. Khan, Munir M.T., Young B.R., A Review Towards Hyperspectral Imaging for Real-Time Quality Control of Food Products with an Illustrative Case Study of Milk Powder Production, Food and Bioprocess Technology 13 (2020), 739–752. doi: 10.1007/s11947-020-02433-w.

18. Caporaso, ElMasry, Gou, Hyperspectral imaging techniques for noncontact sensing of food quality, in: Innovative Food Analysis, Elsevier, (2021), 345–379. doi: 10.1016/B978-0-12-819493-5.00013-3.

19. Özdoğan, Lin, Sun D.-W., Rapid and noninvasive sensory analyses of food products by hyperspectral imaging: Recent application developments, Trends in Food Science and Technology 111 (2021), 151–165. doi: 10.1016/j.tifs.2021.02.044.

20. Saha, Manickavasagan, Machine learning techniques for analysis of hyperspectral images to determine quality of food products: A review, Current Research in Food Science 4 (2021), 28–44. doi: 10.1016/j.crfs.2021.01.002.

21. Fei, Medical hyperspectral imaging: a review, Journal of Biomedical Optics 19 (2014), 010901. doi: 10.1117/1.JBO.19.1.010901.

22. Fei, Wang, Zhang, Little J.V., Patel M.R., Griffith C.C., El-Diery M.W., Chen A.Y., Label-free reflectance hyperspectral imaging for tumor margin assessment: a pilot study on surgical specimens of cancer patients, Journal of Biomedical Optics 22 (2017), 1. doi: 10.1117/1.JBO.22.8.086009.

23. Kho, Dashtbozorg, De Boer L.L., Van De Vijver K.K., Sterenborg H.J.C.M., Ruers T.J.M., Broadband hyperspectral imaging for breast tumor detection using spectral and spatial information, Biomed Opt Express 10 (2019), 4496. doi: 10.1364/BOE.10.004496.

24. Aboughaleb I.H., Aref M.H., El-Sharkawy Y.H., Hyperspectral imaging for diagnosis and detection of ex-vivo breast cancer, Photodiagnosis and Photodynamic Therapy 31 (2020), 101922. doi: 10.1016/j.pdpdt.2020.101922.

25. Wang, Tao, Sun, Chen, Zhou, PCA-U-Net based breast cancer nest segmentation from microarray hyperspectral images, Fundamental Research 1 (2021), 631–640. doi: 10.1016/j.fmre.2021.06.013.

26. Leon, Martinez-Vega, Fabelo, Ortega, Melian, Castaño, Carretero, Almeida, Garcia, Quevedo, Hernandez J.A., Clavo, Callico G.M., Non-Invasive Skin Cancer Diagnosis Using Hyperspectral Imaging for In-Situ Clinical Support, Journal of Clinical Medicine 9 (2020), 1662. doi: 10.3390/jcm9061662.

27. Liu, Zhang, Staging of Skin Cancer Based on Hyperspectral Microscopic Imaging and Machine Learning, Biosensors 12 (2022), 790. doi: 10.3390/bios12100790.

28. Huang H.-Y., Hsiao Y.-P., Mukundan, Tsao Y.-M., Chang W.-Y., Wang H.-C., Classification of Skin Cancer Using Novel Hyperspectral Imaging Engineering via YOLOv5, Journal of Clinical Medicine 12 (2023), 1134. doi: 10.3390/jcm12031134.

29. Petracchi, Gazzoni, Torti, Marenzi, Leporati, Machine Learning-Based Classification of Skin Cancer Hyperspectral Images, Procedia Computer Science 225 (2023), 2856–2865. doi: 10.1016/j.procs.2023.10.278.

30. Petracchi, Torti, Marenzi, Leporati, Acceleration of Hyperspectral Skin Cancer Image Classification through Parallel Machine-Learning Methods, Sensors 24 (2024), 1399. doi: 10.3390/s24051399.

31. Huang H.-Y., Nguyen H.-T., Lin T.-L., Saenprasarn, Liu P.-H., Wang H.-C., Identification of Skin Lesions by Snapshot Hyperspectral Imaging, Cancers 16 (2024), 217. doi: 10.3390/cancers16010217.

32. Collins, Maktabi, Barberio, Bencteux, Jansen-Winkeln, Chalopin, Marescaux, Hostettler, Diana, Gockel, Automatic Recognition of Colon and Esophagogastric Cancer with Machine Learning and Hyperspectral Imaging, Diagnostics 11 (2021), 1810. doi: 10.3390/diagnostics11101810.

33. Urbanos, Martín, Vázquez, Villanueva, Villa, Jimenez-Roldan, Chavarrías, Lagares, Juárez, Sanz, Supervised Machine Learning Methods and Hyperspectral Imaging Techniques Jointly Applied for Brain Cancer Classification, Sensors 21 (2021), 3827. doi: 10.3390/s21113827.

34. Wang, ed., Support Vector Machines: Theory and Applications, Springer Berlin Heidelberg, Berlin, Heidelberg, (2005). doi: 10.1007/b95439.

35. Kramer, Dimensionality Reduction with Unsupervised Nearest Neighbors, Springer Berlin Heidelberg, Berlin, Heidelberg, (2013). doi: 10.1007/978-3-642-38652-7.

36. Yamashita, Nishio, Do R.K.G., Togashi, Convolutional neural networks: an overview and application in radiology, Insights into Imaging 9 (2018), 611–629. doi: 10.1007/s13244-018-0639-9.

37. Wang, Zhou, Sun, Segmentation of Pathological Features of Rat Bile Duct Carcinoma from Hyperspectral Images, in: 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), IEEE, Beijing, China, (2018), 1–5. doi: 10.1109/CISP-BMEI.2018.8633189.

38. Ronneberger, Fischer, Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation, (2015). doi: 10.48550/ARXIV.1505.04597.

39. Jansen-Winkeln, Barberio, Chalopin, Schierle, Diana, Köhler, Gockel, Maktabi, Feedforward Artificial Neural Network-Based Colorectal Cancer Detection Using Hyperspectral Imaging: A Step towards Automatic Optical Biopsy, Cancers 13 (2021), 967. doi: 10.3390/cancers13050967.

40. Khan, Paheding, Elkin C.P., Devabhaktuni V.K., Trends in Deep Learning for Medical Hyperspectral Image Analysis, IEEE Access 9 (2021), 79534–79548. doi: 10.1109/ACCESS.2021.3068392.

41. Tsai C.-L., Mukundan, Chung C.-S., Chen Y.-H., Wang Y.-K., Chen T.-H., Tseng Y.-S., Huang C.-W., Wu I.-C., Wang H.-C., Hyperspectral Imaging Combined with Artificial Intelligence in the Early Detection of Esophageal Cancer, Cancers 13 (2021), 4593. doi: 10.3390/cancers13184593.

42. Tran, Litter J.V., Chen A.Y., Fei, Thyroid carcinoma detection on whole histologic slides using hyperspectral imaging and deep learning, in: Levenson R.M., Tomaszewski J.E., Ward A.D. (Eds.), Medical Imaging 2022: Digital and Computational Pathology, SPIE, San Diego, United States, (2022), 19. doi: 10.1117/12.2612963.

43. Tsai T.-J., Mukundan, Chi Y.-S., Tsao Y.-M., Wang Y.-K., Chen T.-H., Wu I.-C., Huang C.-W., Wang H.-C., Intelligent Identification of Early Esophageal Cancer by Band-Selective Hyperspectral Imaging, Cancers 14 (2022), 4292. doi: 10.3390/cancers14174292.

44. La Salvia, Torti, Gazzoni, Marenzi, Leon, Ortega, Fabelo, Callicó, Leporati, AI-based segmentation of intraoperative glioblastoma hyperspectral images, in: Barnett N.J., Gowen A.A., Liang (Eds.), Hyperspectral Imaging and Applications II, SPIE, Birmingham, United Kingdom, (2023), 12. doi: 10.1117/12.2646782.

45. Mohamed, Almutairi R.L., Abdelrahim, Alharbi, Alhomayani F.M., Elamin Elnaim B.M., Elhag A.A., Dhakal, Automated Laryngeal Cancer Detection and Classification Using Dwarf Mongoose Optimization Algorithm with Deep Learning, Cancers 16 (2023), 181. doi: 10.3390/cancers16010181.

46. Tian, Zhang, Liu, Chen, Zhao, Zhang, Zhao, Combining hyperspectral imaging techniques with deep learning to aid in early pathological diagnosis of melanoma, Photodiagnosis and Photodynamic Therapy 43 (2023), 103708. doi: 10.1016/j.pdpdt.2023.103708.

47. Nabajja, Kanojia, Yadav, Choledochal cancer region detection in hyperspectral tissue images using U-Net, in: Intelligent Systems Design and Applications (2023).

48. Altman, Krzywinski, The curse(s) of dimensionality, Nature Methods 15 (2018), 399–400. doi: 10.1038/s41592-018-0019-x.

49. Jolliffe I.T., Cadima, Principal component analysis: a review and recent developments, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374 (2016), 20150202. doi: 10.1098/rsta.2015.0202.

50. Cao, Liu, Peng, DenseUNet: densely connected UNet for electron microscopy image segmentation, IET Image Processing 14 (2020), 2682–2689. doi: 10.1049/iet-ipr.2019.1527.

51. Kingma D.P., Adam: A Method for Stochastic Optimization (2014). doi: 10.48550/ARXIV.1412.6980.
