Abstract
Underwater images are often degraded by wavelength-dependent absorption, scattering, and turbidity, resulting in color distortion, low contrast, and noise. To address these challenges, we propose a multi-stream preprocessing and multi-scale fusion framework guided by perceptual weight maps for underwater image enhancement. The framework generates three complementary representations of the input: a white-balanced stream for global color correction, a CLAHE-enhanced stream for local contrast improvement, and a Gaussian-filtered CLAHE stream for noise reduction. Each stream is decomposed using Laplacian pyramids, and four weight maps (chromatic, local contrast, saturation, and exposure) are adaptively estimated to guide the fusion process. This approach ensures consistent color correction, enhanced textures and structural details, and effective noise suppression in the reconstructed output. The method was evaluated on the UIEB and EUVP datasets using both reference-based (PSNR, SSIM, MSE) and no-reference (NIQE, AG, UIQM, and entropy) metrics. Comparative experiments with UDCP, CLAHE, Water-Net, Retinex, GDCP, FUnIE-GAN, and UGAN demonstrate consistent improvements in color restoration, visibility, and perceptual quality. Our framework achieved 25.44 dB PSNR, 0.895 SSIM, and 7.68 entropy, outperforming both conventional and learning-based enhancement methods. These results, further supported by histogram analysis and ablation studies, confirm the reliability and effectiveness of the approach.
Keywords
Introduction
Underwater image processing is a critical field in computer vision, with applications in marine biology, robotics, archaeology, and environmental surveillance. However, various factors such as light absorption, scattering, and color distortion severely degrade image quality, making it challenging to obtain clear underwater visuals (Zhu et al., 2021). These distortions occur because light wavelengths are absorbed at different rates in water, causing deep-sea images to appear dominantly blue or green (Metzner & Salzmann, 2023).
Many techniques have been developed to mitigate these problems, ranging from classical image processing to deep learning (Shi et al., 2022). While traditional techniques such as white balance correction and histogram equalization can enhance color balance and contrast, they tend to struggle in complex underwater lighting environments (B & Maheswari, 2020). Meanwhile, data-driven (learning-based) models leverage large datasets to enhance image quality (Chu, 2022). However, across a wide range of underwater settings, these approaches may generalize poorly and incur high computational cost (Cacciapuoti & D'Amore, 2024).
In this research, we introduce a unique multi-scale fusion-based preprocessing method, featuring a new framework for generating weight maps to restore underwater images (Yang et al., 2024). Our approach utilizes Laplacian pyramid decomposition to process images at multiple scales while integrating three fundamental enhancement techniques: white balance for color correction, CLAHE (Contrast-Limited Adaptive Histogram Equalization) for contrast improvement, and Gaussian filtering for noise reduction (Moradi et al., 2024). Beyond this, the enhancement process was refined with four components of weighted balance: WCH (Weighted Color Balance) for natural color reproduction, WCL (Weighted Local Contrast) for edge and texture enhancement, WSAT (Weighted Saturation Adjustment) for color vividness, and WEXP (Weighted Exposure Correction) for optimal brightness. By computing multi-scale weight maps, our method adaptively enhances image structures across different sizes. Within this multi-stage framework, we perform an optimal fusion of the weight maps to preserve global image characteristics, such as overall brightness and color fidelity, while simultaneously enhancing local details, including textures and edges. This comprehensive approach effectively mitigates noise artifacts and significantly improves image clarity. Extensive experimental evaluations on the UIEB dataset (https://www.kaggle.com/datasets/larjeck/uieb-dataset-raw) confirm that our method outperforms existing baseline techniques across multiple quantitative metrics, including Entropy (measuring information content), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Squared Error (MSE), demonstrating its superior performance in underwater image enhancement.
Current underwater image enhancement approaches, including UDCP and Water-Net, fail to generalize to varied underwater conditions because they rely either on predefined priors or on data-driven models fitted to the training domain, which often do not adapt across diverse water types and lighting conditions. To overcome these limitations, we propose a multi-scale fusion-based method and a new weight map generation framework that uses Laplacian pyramids for decomposition, followed by fusion of three enhancement components: white balance for color consistency, CLAHE for contrast enhancement, and Gaussian filtering for noise reduction. The proposed method uses four weight terms (WCH, WCL, WSAT, and WEXP) to fine-tune the enhancement, with multi-scale processing to ensure global image fidelity and fine detail preservation.
The main contributions of this paper are as follows. First, we propose a novel underwater image enhancement method that combines Laplacian pyramid decomposition with adaptive multi-scale processing for improved restoration. Second, we introduce a dynamic weighting framework comprising four components: Chromatic Weight (WCH) for color balance, Local Contrast Weight (WCL) for contrast refinement, Saturation Weight (WSAT) for color vividness, and Exposure Weight (WEXP) for brightness control. These weights are computed adaptively across multiple scales to ensure balanced enhancement throughout the image. Third, we develop a multi-scale fusion strategy that integrates white balance for color correction, Contrast-Limited Adaptive Histogram Equalization (CLAHE) for contrast enhancement, and Gaussian filtering for noise reduction within a unified framework. This approach simultaneously preserves global image characteristics and enhances fine local details. Fourth, we implement a normalization process for the computed weights, ensuring that each enhancement component contributes comparably to the final fusion.
Related Works
Choubey and Choubey (2024) focus on the role of preprocessing algorithms in underwater image analysis. They underscore the potential of noise reduction and distortion removal methods, from adaptive filtering to wavelet denoising, to mitigate problems caused by turbidity, scattering, and the absorption of light by water. Vijayalakshmi and Sasithradevi (2024) provide a comprehensive review of deep learning architectures for preprocessing underwater images. They present the pipeline of underwater image processing, which includes image collection, preprocessing, feature extraction, and classification.
Markkandan (2024) surveys underwater image processing using artificial intelligence technologies. The study reviews classical techniques such as histogram equalization and white balancing, as well as more recent AI approaches such as convolutional neural networks (CNNs) and generative adversarial networks (GANs). Jiang et al. (2023) proposed ECO-GAN, an underwater image enhancement method based on a generative adversarial network. The architecture employs an encoder to learn features, with separate decoders for removing motion blur, increasing brightness, and correcting color. Cross-stage fusion modules are used to enhance output quality.
Prasenan and Suriyakala (2022) analyze preprocessing techniques for underwater images, focusing on challenges such as light absorption, scattering, and noise. The work examines the physics of light transmission in water and surveys algorithms for image enhancement and feature extraction. Shuang et al. (2024) conduct a comprehensive review of algorithms for improving underwater optical image quality. The authors identify unique challenges in underwater imaging and propose a new taxonomy of underwater imaging methods based on algorithmic approaches. Alsakar et al. (2024) offer a detailed review of underwater image restoration and enhancement techniques. They classify approaches as either enhancement or restoration and review degradation factors such as light absorption and scattering.
Huang et al. (2024) propose an underwater image quality evaluation system based on the Multi-Exposure Fusion-based (MEFB) method. They evaluate the performance of YOLOv8 using augmented image datasets, highlighting discrepancies between quantitative and qualitative metrics. Umamageswari et al. (2024) present a strategy for preprocessing, augmentation, and noise reduction in underwater images. Their approach integrates DnCNN for noise removal, SURF for feature extraction, and CLAHE for image enhancement. This comprehensive method improves edge detection, color correction, and brightness adjustment, resulting in better detection accuracy compared to existing methods.
The study by Singh and Bhat (2023) reviews key technologies for preprocessing underwater optical images, categorizing preprocessing methods into three types: (1) image acquisition methods, (2) sharpening methods (including both traditional and deep learning-based approaches), and (3) segmentation methods. Rajinikanth and Rama (2023) systematically review methodologies for underwater image processing and enhancement, covering restoration techniques, enhancement methods, deep learning-based approaches, datasets, and evaluation metrics. Muniraj and Dhandapani (2023) propose an algorithm for assessing Regions of Interest (ROIs) in underwater images using genetic algorithms combined with firefly and particle swarm optimization. Their process involves image collection, thresholding, ROI identification, and performance evaluation using the UFO-120 dataset, though they note this represents only the initial stage of their research.
Wang et al. (2024) introduce an underwater image enhancement approach that combines modified color correction, an adaptive Look-Up Table (LUT), and edge-preserving filters. The method first transforms the image into LAB color space, then enhances contrast based on pass probability thresholds, and finally applies edge-preserving filtering with a fast local Laplacian filter. Wu et al. (2024) analyze underwater image enhancement and restoration techniques, categorizing them based on their dependence on physical imaging models. The study includes comprehensive experimental comparisons using public datasets and quality assessment methods. Pang et al. (2023) develop a variable contrast and saturation enhancement model for underwater images, while Liang et al. (2024) propose NPT-UL, an unsupervised learning framework based on non-physical transformation for underwater image enhancement. Peng et al. (2025) introduce a U-Net++ neural network for preprocessing LOFARgrams in underwater target detection. The network uses nested skip connections to improve time-frequency feature fusion, achieving superior noise suppression while preserving target signatures through training on both synthetic and real LOFARgram data. Pratama et al. (2025) compare system identification methods for AUV yaw dynamics prediction, evaluating N4SID and ARX models using preprocessed data.
Karthikeyan et al. (2025) introduced a lightweight deep hybrid convolutional neural network (CNN) integrated with attention mechanisms for underwater image restoration. Their model combines depthwise-separable convolutions with channel attention modules to reduce computational cost while addressing color distortion and haze through multi-scale feature fusion. The approach demonstrated competitive performance with fewer parameters. Shao et al. (2024) introduced UIEAnything, a zero-shot framework combining depth estimation and White Balance (WB) models. In this framework, depth maps guide adaptive scene recovery and an improved Sea-Thru model handles backscatter. The method requires no training data, generalizes across environments, and shows superior adaptability to unseen scenes. Zhang et al. (2025a) developed a precise target localization system using a small-scale vertical hydrophone array. The system applies beamforming and time-delay estimation to process acoustic signals and exploits the advantages of vector sensors for 3D localization, achieving high precision in shallow-water tests. Wang et al. (2025) introduced an adaptive acoustic target recognizer with multi-scale residual and attention modules. The model learns scale-invariant features via hierarchical convolutions, while attention mechanisms highlight discriminative frequency bands. It outperformed traditional methods under noisy conditions.
Overview of Fusion-Based Enhancement and CLAHE
Fusion-based techniques have emerged as a prominent strategy for underwater image enhancement, aiming to combine the strengths of multiple algorithms or image representations to overcome the limitations of any single approach. A summary of representative fusion-based methods is provided in Table 1, highlighting the diversity of preprocessing components and fusion strategies.
Comparison of Fusion-Based Underwater Image Enhancement Methods.
Contrast-Limited Adaptive Histogram Equalization (CLAHE) is among the most widely adopted standalone techniques for contrast enhancement in underwater imaging, as evidenced by its extensive use in recent literature (Li et al., 2022; Naik et al., 2021). Its popularity stems from its ability to improve local contrast without the excessive noise amplification often observed with global histogram equalization. However, as reported by Li et al. (2022), a primary limitation of CLAHE is its tendency to over-enhance noise in relatively homogeneous regions, such as open water or sandy seabeds. Similarly, Naik et al. (2021) emphasize that while CLAHE effectively improves contrast, it does not correct color distortions on its own and is therefore often coupled with additional preprocessing steps.
To address these limitations, the proposed framework advances this field by moving beyond sequential preprocessing. CLAHE is not applied in isolation but instead serves as a dedicated branch within a parallel multi-stream fusion architecture. Its output is dynamically balanced with color-corrected and noise-suppressed representations using adaptive perceptual weight maps, ensuring that local contrast is preserved without introducing noise or color bias.
As summarized in Table 1, existing fusion-based enhancement methods typically integrate preprocessing techniques such as white balance, CLAHE, filtering, or Retinex variants through multi-scale decomposition and weighted fusion. While these approaches have demonstrated improvements in contrast, color correction, and structural preservation, most rely on either fixed or handcrafted weight maps (Qu et al., 2024; Zhang et al., 2025b; Zhang et al., 2025c) or adopt black-box deep learning fusion strategies, where interpretability and adaptability are limited (Kahveci & Ayaroglu, 2020; Verma et al., 2024).
Moreover, CLAHE, though widely employed as a preprocessing step (Tian et al., 2025; Zhao et al., 2023), often amplifies noise in homogeneous regions, while Retinex-based pipelines are prone to over-enhancement artifacts (Kahveci & Ayaroglu, 2020). As a result, these methods face persistent trade-offs such as color over-correction, noise amplification, and texture loss, which limit their robustness across diverse underwater conditions. These challenges underscore the need for a more adaptive, perceptually guided fusion framework, an objective directly addressed in the proposed methodology.
Underwater image enhancement is a particularly challenging task due to the combined effects of wavelength-dependent light absorption, scattering, turbidity, and noise from suspended particles. These degradations lead to severe color distortions, reduced contrast, and loss of fine structural details, making standard enhancement techniques insufficient. To address these challenges, we propose a multi-stream preprocessing and adaptive fusion framework that strategically integrates complementary enhancement operations into a unified pipeline.
The overall workflow of the proposed method is illustrated in Figure 1. The framework begins with white balance correction to restore global color fidelity by compensating for the loss of red wavelengths and neutralizing the dominant blue-green tint. Next, Laplacian pyramid decomposition provides a multi-scale representation of the image, separating low-frequency structures from high-frequency textures. To address the poor local visibility often observed in underwater scenes, Contrast-Limited Adaptive Histogram Equalization (CLAHE) is applied, which enhances local contrast while controlling over-amplification of noise. Since CLAHE may still introduce high-frequency distortions, we incorporate a Gaussian filtering stage to suppress noise in homogeneous regions while retaining essential structures.
Finally, the three complementary versions of the image (white-balanced, CLAHE-enhanced, and Gaussian-filtered CLAHE) are adaptively fused using Laplacian pyramids guided by perceptual weight maps (chromaticity, saturation, local contrast, and exposure). This multi-stream fusion design ensures that the strengths of each preprocessing stage are preserved while their weaknesses are minimized, producing outputs that are both visually natural and quantitatively superior.
Input Raw Image Formation
Underwater images captured by imaging devices are often severely degraded due to the optical properties of water, including wavelength-dependent absorption, scattering by suspended particles, and reduced illumination with increasing depth. These effects lead to significant challenges such as the loss of red wavelengths (resulting in dominant blue–green hues), low contrast, and noise from turbidity. Such degradations reduce visibility and obscure fine details, necessitating preprocessing before enhancement.
Schematic Representation of the Proposed Preprocessing and Adaptive Multi-Scale Fusion Framework for Underwater Image Enhancement.
Mathematically, a raw underwater image can be represented as a two-dimensional matrix of pixel intensities:
Underwater images are strongly affected by wavelength-dependent light absorption, with red and yellow attenuated rapidly while blue–green components dominate. This results in severe color distortions and an unnatural visual appearance. White balance correction is therefore essential to restore perceptual naturalness and prepare the image for subsequent enhancement steps.
The correction is modeled as a linear transformation applied to the raw input image:
where
The pixel intensity at spatial location (x, y) can be expressed as:
The scene's illuminant is estimated under one of the following assumptions.
Gray World Assumption:
White Patch Assumption:
Shades of Gray (Minkowski norm):
Channel-wise gain factors are computed as:
The corrected pixel intensities are then:
This scales the color channels to neutralize the illuminant.
A chromatic adaptation transform (CAT) is applied:
where M is the diagonal adaptation matrix, I is the original image, and I′ is the color-corrected image.
After applying white balance in linear space, the image is typically converted back to RGB by applying a gamma correction:
This ensures that the image appears correct on standard displays.
Final formula for white balance correction:
Unlike conventional white balance corrections applied directly in RGB space, our method integrates chromatic adaptation with gamma correction in a structured sequence (Eqns. 3–14). This ensures that the illuminant estimation is explicitly modeled and corrected before further enhancement. Such a staged formulation provides a more accurate restoration of true colors in underwater scenes compared to existing MSF approaches.
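The sequence described above (illuminant estimation, channel gains, diagonal adaptation, gamma encoding) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the neutral-gray target (the mean of the three illuminant components), and the display gamma of 2.2 are all assumptions introduced here. The Minkowski exponent p selects the estimator: p = 1 corresponds to Gray World, and p → ∞ to White Patch.

```python
import numpy as np

def estimate_illuminant(img, p=1.0):
    """Per-channel Minkowski-norm illuminant estimate.

    img: float RGB array in [0, 1], shape (H, W, 3), assumed linear.
    p = 1 is the Gray World mean; p = inf is the White Patch maximum.
    """
    flat = img.reshape(-1, 3)
    if np.isinf(p):
        return flat.max(axis=0)
    return np.mean(flat ** p, axis=0) ** (1.0 / p)

def white_balance_linear(img, p=1.0, eps=1e-6):
    """Apply diagonal gains I' = M I with M = diag(gains)."""
    illum = estimate_illuminant(img, p)
    # Gains that map the estimated illuminant to a neutral gray
    # (the gray target is an assumption of this sketch).
    gains = illum.mean() / (illum + eps)
    return np.clip(img * gains, 0.0, 1.0), gains

def gamma_encode(linear, gamma=2.2):
    """Display encoding; gamma = 2.2 is an assumed value."""
    return np.clip(linear, 0.0, 1.0) ** (1.0 / gamma)
```

For an image with a blue-green cast, the red gain comes out larger than the green and blue gains, which is the expected compensation for attenuated red wavelengths.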
The Laplacian pyramid provides a multi-scale image representation by decomposing an image into a sequence of detail layers. It is constructed by first generating a Gaussian pyramid, consisting of progressively blurred and downsampled versions of the original image. The Laplacian pyramid is then obtained by computing the difference between each Gaussian level and the upsampled version of its next coarser level. Each Laplacian layer captures the high-frequency information, such as edges and textures, that is lost during downsampling. This multi-scale separation allows simultaneous analysis of both low-frequency structures and fine details, making the Laplacian pyramid a widely used representation in image enhancement tasks.
Laplacian Pyramid Construction
The Laplacian pyramid is built from a sequence of Gaussian pyramid levels. At each level k, a Gaussian-blurred image
The original image can be reconstructed by summing the Laplacian levels with the upsampled coarser levels:
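A minimal sketch of this construction and its inverse is given below, for a single-channel image. The 5-tap binomial kernel, the pixel-replication upsampling, and all function names are assumptions chosen for brevity. Because each Laplacian level stores exactly the difference between a Gaussian level and the upsampled coarser level, reconstruction is exact for any choice of upsampling operator.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # 5-tap binomial filter

def blur(img):
    """Separable 5-tap blur with reflective borders (2-D array, sides >= 3)."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def downsample(img):
    """Blur, then keep every second row and column."""
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Replicate pixels to the target shape, then blur to smooth the blocks."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def build_laplacian_pyramid(img, levels=3):
    gaussians = [img.astype(float)]
    for _ in range(levels):
        gaussians.append(downsample(gaussians[-1]))
    # Each detail layer is a Gaussian level minus the upsampled coarser level.
    laplacians = [g - upsample(g_next, g.shape)
                  for g, g_next in zip(gaussians[:-1], gaussians[1:])]
    laplacians.append(gaussians[-1])  # coarsest level kept as the residual
    return laplacians

def reconstruct(laplacians):
    img = laplacians[-1]
    for lap in reversed(laplacians[:-1]):
        img = lap + upsample(img, lap.shape)
    return img
```

For a color image, the same functions can be applied to each channel independently.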
When applied to the white-balanced image Iwb, the Laplacian pyramid decomposes the scene into multiple spatial frequency bands, allowing both global low-frequency structures and fine high-frequency details to be retained. This property is particularly crucial for underwater image enhancement, where global color distortions must be corrected while preserving subtle textures and edges.
Multi-scale fusion enhances underwater images by combining weighted features to preserve color fidelity, contrast, and details while reducing noise. Four weight maps are derived to emphasize visually significant regions:
where
The overall perceptual weight is obtained as a weighted sum:
Unlike prior multi-scale fusion approaches that typically rely on two simple weights (e.g., contrast and saturation), our design incorporates four perceptual weights with adaptive scaling, ensuring a balanced contribution of color fidelity, sharpness, brightness, and vividness. The normalization in Eqn. (23) prevents any single weight from dominating, leading to stable and visually consistent results across diverse underwater scenes.
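The weighting scheme can be illustrated with the sketch below. The specific per-pixel definitions are assumptions borrowed from common practice in the multi-scale fusion literature (Laplacian magnitude for local contrast, channel spread for saturation, a Gaussian around mid-gray for exposure, and closeness to the maximum saturation for the chromatic term); they are not necessarily the definitions used in this paper. The equal coefficients are placeholders, not the tuned values of α, β, γ, and δ.

```python
import numpy as np

def laplacian_abs(lum):
    """Absolute 4-neighbour Laplacian response, used as a local contrast cue."""
    p = np.pad(lum, 1, mode="edge")
    return np.abs(p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * lum)

def weight_maps(img, sigma_exp=0.25, sigma_ch=0.3):
    """img: float RGB in [0, 1], shape (H, W, 3). Returns four (H, W) maps.

    All four definitions are illustrative assumptions.
    """
    lum = img.mean(axis=2)
    sat = img.std(axis=2)                                  # spread across channels
    w_ch = np.exp(-((sat - sat.max()) ** 2) / (2 * sigma_ch ** 2))
    w_cl = laplacian_abs(lum)
    w_sat = sat
    w_exp = np.exp(-((lum - 0.5) ** 2) / (2 * sigma_exp ** 2))
    return w_ch, w_cl, w_sat, w_exp

def combined_weight(img, coeffs=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four maps; coeffs stand in for (alpha, beta, gamma, delta)."""
    a, b, c, d = coeffs
    w_ch, w_cl, w_sat, w_exp = weight_maps(img)
    return a * w_ch + b * w_cl + c * w_sat + d * w_exp

def normalize_across_streams(weights, eps=1e-12):
    """Scale the per-stream weights so they sum to 1 at every pixel."""
    stack = np.stack(weights) + eps
    return stack / stack.sum(axis=0, keepdims=True)
```

Normalizing across streams is what keeps any single input from dominating: at every pixel the three weights form a convex combination.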
While white balance correction restores global color fidelity, underwater images often remain affected by severely degraded local contrast due to scattering and wavelength-dependent absorption. To overcome this limitation, we employ Contrast-Limited Adaptive Histogram Equalization (CLAHE), which adaptively redistributes pixel intensities within local regions while limiting excessive noise amplification. Unlike global histogram equalization, CLAHE is particularly effective for underwater imaging, where illumination varies substantially across spatial regions.
The CLAHE operation on a white-balanced image
The image I(x,y) of size M × N is divided into non-overlapping tiles
where i, j are the indices of the tile in the image grid.
For each tile
This function counts the number of occurrences of each intensity level k within the tile.
The normalized CDF is given by:
To prevent noise over-amplification, a clip limit
The excess pixels are redistributed uniformly across all histogram bins.
To avoid block artifacts at tile boundaries, bilinear interpolation is applied:
For each pixel at (x, y) the new intensity is computed using the four nearest tiles:
The enhanced image is obtained by replacing each original intensity I(x, y) with the mapped intensity I′(x, y) from the contrast-limited histogram.
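The histogram, clipping, redistribution, and CDF-mapping steps can be sketched for a single tile as follows. This is a simplified illustration, not a full CLAHE implementation: it omits the bilinear interpolation between neighbouring tiles, redistributes the excess in a single pass, and expresses the clip limit as a multiple of the mean bin count, which is an assumed convention. The function names are hypothetical.

```python
import numpy as np

def clipped_equalization_map(tile, clip_limit=4.0, n_bins=256):
    """Build a lookup table for one uint8 tile from its clipped histogram."""
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(float)
    # Clip limit as a multiple of the average bin count (assumed convention).
    clip = clip_limit * tile.size / n_bins
    excess = np.maximum(hist - clip, 0.0).sum()
    # Clip the peaks, then spread the clipped mass uniformly over all bins.
    hist = np.minimum(hist, clip) + excess / n_bins
    cdf = np.cumsum(hist) / hist.sum()
    return np.round(cdf * (n_bins - 1)).astype(np.uint8)

def equalize_tile(tile, clip_limit=4.0):
    """Map every pixel of the tile through the contrast-limited CDF."""
    lut = clipped_equalization_map(tile, clip_limit)
    return lut[tile]
```

A lower clip limit flattens the histogram more aggressively before the CDF is computed, so the resulting mapping is closer to linear and stretches contrast less.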
If applied in LAB space, enhancement is confined to the luminance channel:
Then, the final color-enhanced image is reconstructed by combining
As with the white-balanced image, perceptual weight maps (saturation, chromaticity, local contrast, exposure) are computed from the Laplacian pyramid of
The combined weight is then normalized to ensure it is comparable across all pixels:
Although CLAHE has been widely applied in image enhancement, most existing approaches use it as an independent preprocessing step, which often amplifies noise or produces uneven enhancement. In contrast, our framework integrates CLAHE into a multi-scale Laplacian fusion pipeline, where its contribution is regulated through four adaptive perceptual weight maps. This design ensures that contrast improvements are effectively preserved while suppressing noise and artifacts, resulting in a more consistent and visually natural reconstruction. Importantly, the weight maps are derived directly from the CLAHE-enhanced image, allowing the fusion process to be guided by its unique contrast characteristics.
Although CLAHE effectively enhances local contrast, it can also amplify high-frequency noise, especially in homogeneous regions of underwater images. To address this issue, a Gaussian filtering stage is applied to the CLAHE-enhanced output. The Gaussian filter functions as a low-pass operator, reducing fluctuations caused by scattering and sensor noise, thereby improving perceptual smoothness. The resulting noise-suppressed image serves as the third input to the multi-scale fusion pipeline, alongside the white-balanced and CLAHE-enhanced images. A corresponding set of weight maps is derived from the Gaussian-filtered image, allowing the fusion process to utilize its smooth regions while depending on the CLAHE output for fine textures and the white-balanced result for accurate color representation. This complementary combination effectively preserves structural details and suppresses noise, producing a cleaner and more visually consistent enhanced image.
Gaussian Filtering Formulation
The Gaussian kernel is defined as:
Kernel size K is chosen according to:
Ensuring adequate coverage of the Gaussian distribution. The filtering process is then performed as a convolution:
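As a rough illustration, the kernel and the convolution can be written directly in NumPy. The defaults below mirror the σ = 1.5 and kernel size 9 reported in the parameter settings; the edge-replication border handling and the function names are assumptions of this sketch.

```python
import numpy as np

def gaussian_kernel(size=9, sigma=1.5):
    """Normalized 2-D Gaussian kernel of odd side length `size`."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def gaussian_filter(img, size=9, sigma=1.5):
    """Convolve a single-channel image with the kernel (edge-replicated borders)."""
    kernel = gaussian_kernel(size, sigma)
    h, w = img.shape
    r = size // 2
    padded = np.pad(img.astype(float), r, mode="edge")
    out = np.zeros((h, w), dtype=float)
    # The kernel is symmetric, so correlation and convolution coincide.
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Because the kernel sums to one, flat regions keep their mean intensity while pixel-level fluctuations are averaged out.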
As in the earlier stages, the Gaussian-filtered image is decomposed into multiple scales using a Laplacian pyramid. Four perceptual weights are then computed, namely saturation (Ws), chromaticity (Wc), local contrast (Wl), and exposure (We), and combined as:
While Gaussian filtering has been widely used for general image denoising, its integration within the proposed multi-scale fusion framework introduces a distinctive enhancement strategy. Unlike conventional approaches that employ CLAHE for contrast improvement or apply Gaussian smoothing as an isolated preprocessing step, the proposed method utilizes the Gaussian-filtered CLAHE output as a dedicated third input to the fusion process. This configuration allows the noise-suppressed homogeneous regions from the Gaussian output to complement the high-contrast details of the CLAHE image and the color-corrected features of the white-balanced image. The adaptive weighting mechanism of the Gaussian-filtered channel prevents excessive smoothing by emphasizing areas where noise is most significant. This integration of Gaussian filtering within a weighted multi-scale fusion framework marks a notable advancement over existing fusion-based enhancement techniques, producing reconstructions that are sharp, well-balanced, and perceptually consistent under complex underwater conditions.
The final stage of the proposed framework integrates the outputs of the three complementary preprocessing streams (white balance, CLAHE, and Gaussian-filtered CLAHE) through a multi-scale fusion strategy. Each stream addresses a distinct limitation of underwater imaging: white balance restores global color fidelity, CLAHE enhances local contrast and texture visibility, and Gaussian filtering suppresses noise in homogeneous regions. Rather than relying on a single enhanced version, the framework adaptively fuses all three, ensuring their respective strengths are preserved while mitigating weaknesses.
As illustrated in Figure 1, the workflow begins with the generation of the three preprocessed images, which are subsequently decomposed into Laplacian pyramids to capture both low-frequency structures (color gradients, smooth regions) and high-frequency details (edges, textures). For each decomposition level, four perceptual weight maps (chromatic, saturation, local contrast, and exposure) are computed and normalized to adaptively guide the contribution of each stream. Formally, the fused Laplacian coefficients at scale k are computed as:
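A common form of this rule in the multi-scale fusion literature, assumed in the sketch below, multiplies the Laplacian level k of each stream by the Gaussian-pyramid level k of that stream's normalized weight map and sums over the streams. The code is a single-channel illustration with hypothetical helper names and a 5-tap binomial blur; it is not the authors' implementation.

```python
import numpy as np

K = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # 5-tap binomial filter

def blur(x):
    h, w = x.shape
    p = np.pad(x, 2, mode="reflect")
    rows = sum(k * p[:, i:i + w] for i, k in enumerate(K))
    return sum(k * rows[i:i + h] for i, k in enumerate(K))

def down(x):
    return blur(x)[::2, ::2]

def up(x, shape):
    big = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    return blur(big[:shape[0], :shape[1]])

def gaussian_pyramid(x, levels):
    pyr = [x]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(x, levels):
    g = gaussian_pyramid(x, levels)
    return [a - up(b, a.shape) for a, b in zip(g[:-1], g[1:])] + [g[-1]]

def fuse(streams, weights, levels=3, eps=1e-12):
    """streams, weights: lists of (H, W) float arrays, one pair per stream."""
    w = np.stack(weights) + eps
    w = w / w.sum(axis=0)                      # per-pixel normalization
    fused = None
    for img, wi in zip(streams, w):
        lap = laplacian_pyramid(img, levels)   # detail layers of the stream
        gw = gaussian_pyramid(wi, levels)      # smoothed weights at each scale
        contrib = [l * g for l, g in zip(lap, gw)]
        fused = contrib if fused is None else [f + c for f, c in zip(fused, contrib)]
    out = fused[-1]                            # collapse the fused pyramid
    for lap in reversed(fused[:-1]):
        out = lap + up(out, lap.shape)
    return out
```

Smoothing the weights with a Gaussian pyramid, rather than applying them at full resolution only, is what avoids visible seams where the preferred stream changes abruptly.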
The novelty of this framework lies not in the individual preprocessing operations, but in the strategic manner of their adaptive integration. Unlike prior multi-scale fusion approaches that depend on a single enhanced image or employ fixed weighting functions, the proposed method treats preprocessing as a multi-stream complementary fusion problem. By leveraging perceptual weight maps that dynamically regulate the contribution of each stream, the framework resolves common trade-offs such as over-enhancement, color bias, and noise amplification.
This content-aware integration allows the system to produce outputs that are both visually natural and quantitatively superior, distinguishing it from conventional underwater image enhancement pipelines. Thus, the proposed design advances multi-scale fusion by rethinking preprocessing as an adaptive, perceptually guided integration problem, ensuring balanced enhancement that is robust across diverse underwater conditions.
Algorithm 1
Input: raw underwater image
Output: enhanced image
White balance algorithm
Decomposition: Laplacian pyramids (first step)
Weight map calculation (first step)
Contrast enhancement (CLAHE)
Decomposition: Laplacian pyramids (second step)
Weight map calculation (second step)
Gaussian filter for noise reduction
Decomposition: Laplacian pyramids (third step)
Weight map calculation (third step)
Weight map normalization
Multi-scale fusion
Reconstruction
Output enhanced image
End Algorithm
To enhance the reproducibility of the proposed framework, the critical parameters used in the preprocessing stages were selected based on empirical evaluations on a subset of the UIEB and EUVP datasets (100 images). The full processing pipeline is detailed in Algorithm 1, which outlines each enhancement stage and the corresponding weight map generation.
Weight Scaling Factors (α, β, γ, δ)
The four weights, namely Chromatic Weight (WCH), Local Contrast Weight (WCL), Saturation Weight (WSAT), and Exposure Weight (WEXP), were combined using a weighted summation. We empirically determined the optimal coefficients based on PSNR and SSIM scores as:
This configuration prioritizes local contrast enhancement while maintaining balanced emphasis on color, saturation, and brightness.
Gaussian Filter Standard Deviation (σ)
We evaluated σ values in the range [0.5, 2.5] and found that σ = 1.5 (with a kernel size of 9, spanning roughly ±3σ) offered the best compromise between noise suppression and structural detail preservation.
CLAHE Clip Limit
The CLAHE clip limit was tuned through grid search (values: 2, 4, 6, 8). A value of clip limit = 4.0 consistently produced the best visual and statistical enhancement results without over-amplifying noise, especially in homogeneous regions.
Results and Discussions
To comprehensively assess the effectiveness of the proposed multi-stream fusion framework, experiments were conducted on two widely used underwater image enhancement benchmarks: UIEB and EUVP. The evaluation employed both objective quantitative metrics, including Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Mean Squared Error (MSE), and Entropy, and qualitative assessments based on visual inspection of enhanced outputs. In addition, ablation studies, histogram-based analyses, and cross-dataset validations were performed to systematically examine the contribution of individual components, the progressive improvements achieved across enhancement stages, and the generalizability of the method under diverse underwater conditions.
Overall, the results demonstrate that the fused outputs achieve substantial improvements in color balance, contrast, and visibility while preserving structural details. The objective evaluations confirm significant gains in clarity, noise suppression, and perceptual quality compared to baseline methods, establishing the proposed approach as an effective and robust solution for underwater image enhancement.
The detailed results are presented in the following subsections, beginning with a quantitative evaluation against representative baseline methods.
Dataset Description
To comprehensively evaluate the proposed method and to minimize the risk of dataset-specific bias, experiments were conducted on two widely used underwater image enhancement benchmarks: UIEB and EUVP.
The UIEB (Underwater Image Enhancement Benchmark) dataset consists of 950 real-world underwater images: 890 raw images with corresponding high-quality reference images and 60 challenging images without satisfactory references. In this work, we use 100 images from the test split for quantitative evaluation. UIEB serves as the primary dataset for both training and testing, as it is one of the most widely adopted benchmarks in underwater image enhancement research. The dataset is publicly available at: https://www.kaggle.com/datasets/larjeck/uieb-dataset-raw
To demonstrate the generalizability of our framework and to evaluate robustness under diverse underwater conditions, additional experiments were conducted on the EUVP (Enhancing Underwater Visual Perception) dataset. EUVP contains both paired and unpaired underwater images captured across different visibility and illumination levels. For consistency, we report results on 100 test images, covering a variety of turbidity, color distortion, and lighting conditions. No retraining or fine-tuning is performed on EUVP; instead, the model trained on UIEB is directly applied, thereby validating cross-dataset performance. The dataset is publicly available at: https://www.kaggle.com/datasets/pamuduranasinghe/euvp-dataset.
Performance Evaluation
The effectiveness of the proposed method is compared with state-of-the-art underwater image enhancement techniques, including UDCP (Zhou et al., 2023), CLAHE (Alhajlah, 2023), Water-Net (Liu et al., 2023), RETINEX (Zhou et al., 2023), GDCP (Islam et al., 2020), FUnIE-GAN (Chen et al., 2024), and UGAN (Cong et al., 2023). The per-method comparison is based on PSNR, entropy, MSE, and SSIM.
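For reference, three of these metrics can be computed with a few lines of NumPy. The sketch below uses the standard definitions for 8-bit images (peak value 255, base-2 Shannon entropy over 256 gray levels); the function names are hypothetical and the exact evaluation code used in this study may differ. SSIM is omitted because it requires local windowed statistics.

```python
import numpy as np

def mse(reference, enhanced):
    """Mean squared error between two images of equal shape."""
    diff = reference.astype(np.float64) - enhanced.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference, enhanced, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, enhanced)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

def entropy(gray):
    """Shannon entropy in bits of an 8-bit grayscale image."""
    counts = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))
```

Under these definitions PSNR and MSE are tied by PSNR = 10 log10(255² / MSE), so an MSE of 100 corresponds to roughly 28.1 dB, and the entropy of an 8-bit grayscale image cannot exceed 8 bits.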
Table 2 shows the performance of the UDCP model for underwater image enhancement. The results indicate moderate noise reduction (PSNR: 27.93–28.05 dB), limited detail retention (Entropy: 4.29–5.32), and low to moderate structural similarity (SSIM: 0.14–0.52).
Table 3 presents the performance results of the CLAHE model for underwater image enhancement, evaluated using PSNR, Entropy, MSE, and SSIM metrics. The PSNR values range from 28.04 to 29.31 dB, indicating effective noise reduction, while the entropy values (6.59 to 7.71) demonstrate significant contrast enhancement. The SSIM values (0.79 to 0.86) indicate improved structural similarity, making CLAHE a strong method for enhancing underwater images.
Performance Evaluation of the UDCP Model Based on PSNR, Entropy, MSE, and SSIM for Underwater Image Enhancement.
Quantitative Analysis of the CLAHE Model Using PSNR, Entropy, MSE, and SSIM for Underwater Image Enhancement.
Performance Evaluation of the WATERNET Model Based on PSNR, Entropy, MSE, and SSIM for Underwater Image Enhancement.
Quantitative Analysis of the GDCP Model Using PSNR, Entropy, MSE, and SSIM for Underwater Image Enhancement.
Table 4 shows the performance of Water-Net for underwater image enhancement. Its high PSNR (28.3–29.9 dB) and high SSIM (0.82) indicate strong structural preservation. However, the reported MSE and entropy values are zero, indicating a possible error in the contrast enhancement evaluation or in the metric computation.
Table 5 presents the performance metrics of the Generalized Dark Channel Prior (GDCP) model on enhanced underwater images. The PSNR values range from 27.33 to 29.31 dB, indicating moderate improvement in signal quality across different images. The entropy values, varying between 5.72 and 7.17, suggest effective enhancement in information richness and visual detail. The MSE values fall between 76.20 and 119.97, showing a relatively acceptable error margin in image reconstruction.
Table 6 presents the results of the RETINEX model applied to both original and enhanced images, using four evaluation metrics: PSNR, Entropy, MSE, and SSIM. The PSNR values for the enhanced images range from 26.9 to 28.9, compared with 27.8 for the original image. Overall, the RETINEX enhancement leads to a slight decrease in image quality (lower PSNR), increased distortion (higher MSE), reduced complexity (lower entropy), and a loss of structural similarity (lower SSIM) compared to the original image.
Evaluation of the RETINEX Model Using PSNR, Entropy, MSE, and SSIM Metrics Across Various Underwater Images.
Table 7 presents the performance of the UGAN model on underwater images. The enhanced images show a moderate improvement in quality with PSNR values ranging from 27.4 to 28.2, compared to the original 27.4. Entropy values remain consistently high (7.05–7.31), reflecting good detail preservation, while the slightly higher MSE values (99.7–118.8) indicate increased pixel-level differences.
Performance Evaluation of the UGAN Model Using PSNR, Entropy, MSE, and SSIM Metrics for Underwater Image Enhancement.
Table 8 highlights the results of the FUnIE-GAN model. The PSNR values (26.5–28.1) reveal only modest improvements in signal fidelity, with some cases showing no gain over the original images. Nevertheless, the entropy range (6.91–7.15) suggests that the model effectively retains image details, while the MSE values (103.6–125.2) indicate moderate reconstruction errors, comparable to UGAN.
Evaluation Results of the FUnIE-GAN Model Showing PSNR, Entropy, MSE, and SSIM Values for Enhanced Underwater Images Compared to the Original Images.
As shown in Table 9, the proposed model attains higher entropy (up to 8.91), indicating better contrast and greater information retention, together with high PSNR (29.4–31.7 dB). Its SSIM (0.970–0.975) shows better structural preservation than UDCP and CLAHE. Overall, the model substantially improves underwater images while balancing reconstruction error (MSE) against noise reduction.
Quantitative Results of the Proposed Multi-Scale Fusion Model Across Various Underwater Images, Including PSNR, Entropy, MSE, and SSIM.
Figure 2 presents a visual comparison of different underwater image enhancement methods. The raw images suffer from strong color casts, low contrast, and poor visibility. Traditional methods like UDCP and CLAHE partially restore visibility but often introduce over-enhancement or unnatural tones. Water-Net, GDCP, and RETINEX reduce haze but still fail to achieve natural color balance, while GAN-based approaches (FUnIE-GAN and UGAN) enhance perceptual quality but sometimes produce artifacts, over-saturation, or blurred textures. In contrast, the proposed method restores natural color fidelity, improves edge sharpness, and enhances contrast while preserving fine details, producing results that are visually closest to the ground truth. This demonstrates the robustness of the proposed framework across diverse underwater conditions.

Visual Comparison of Enhancement Results Using Different Methods, Including the Proposed Multi-Scale Fusion Approach, Against Ground Truth Images.
The quantitative evaluation results on the UIEB and EUVP datasets are summarized in Tables 10 and 11. Traditional non-learning approaches such as RETINEX, CLAHE, GDCP, and UDCP achieve only moderate improvements in PSNR and SSIM, with relatively higher MSE values, indicating a limited ability to correct severe color casts and contrast degradation. Learning-based methods, including Water-Net, FUnIE-GAN, and UGAN, show noticeable gains over conventional techniques, particularly in SSIM and entropy, reflecting their improved capacity for structural recovery and visual enhancement.
Quantitative Comparison of Enhancement Methods on the UIEB and EUVP Datasets Using Reference Metrics.
Quantitative Comparison Using No-Reference Image Quality Metrics on the UIEB and EUVP Datasets.
The proposed Multi-scale Fusion framework demonstrates superior performance across both reference-based and no-reference quality metrics. In the reference-based evaluation, it achieves a PSNR of 25.44 dB, an SSIM of 0.895, and an MSE of 185.2 on UIEB, and a PSNR of 25.11 dB, an SSIM of 0.852, and an MSE of 190.6 on EUVP. This represents a significant 2.68 dB PSNR improvement over the strongest baseline (Water-Net). In the no-reference evaluation, the framework attains optimal scores for NIQE (5.1 on UIEB, 5.0 on EUVP), AG (4.8 on UIEB, 4.9 on EUVP), Entropy (7.68 on UIEB, 7.61 on EUVP), and UIQM (3.55 on UIEB, 3.58 on EUVP). The low NIQE values confirm superior perceptual naturalness, while the high AG values validate effective edge preservation. The elevated entropy and UIQM scores further highlight enhanced information content and superior underwater-specific visual quality. These consistent improvements across diverse datasets and evaluation metrics confirm the robustness of the proposed framework in achieving a balanced enhancement of color fidelity, contrast, and structural detail without introducing artifacts.
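As a point of reference for the AG scores, the average gradient can be sketched as below. This uses one common definition based on forward differences; the function name is hypothetical, and the exact formulation used to produce the reported values is an assumption.

```python
import numpy as np

def average_gradient(gray):
    """Average gradient (AG) of a 2-D grayscale image.

    Mean over pixels of sqrt((dx^2 + dy^2) / 2), using forward differences.
    """
    g = gray.astype(np.float64)
    dx = g[:-1, 1:] - g[:-1, :-1]   # horizontal forward difference
    dy = g[1:, :-1] - g[:-1, :-1]   # vertical forward difference
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```

A flat image yields AG = 0, and sharper edges or finer textures increase the score, which is why higher AG is read as better edge preservation.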
The experimental results across all metrics confirm the superior performance of the proposed Multi-scale Fusion method compared to existing approaches. In Figure 3 (PSNR), our method consistently achieves higher values, surpassing traditional techniques (UDCP, GDCP) and learning-based methods (Water-Net, FUnIE-GAN). Similarly, Figure 4 (SSIM) illustrates better preservation of structural integrity, while Figure 5 (Entropy) demonstrates improved information retention with the highest entropy values. Notably, Figure 6 (MSE) highlights the method's accuracy, consistently displaying the lowest error rates. Taken together, the results across PSNR, SSIM, Entropy, and MSE support the robustness and effectiveness of the proposed Multi-scale Fusion approach relative to both conventional and learning-based methods, with stable behavior across all iterations.

PSNR Comparison Across Different Enhancement Methods.

SSIM Comparison Across Different Enhancement Methods.

Entropy Comparison Across Different Enhancement Methods.

MSE Comparison Across Underwater Image Enhancement Methods.
This subsection presents an ablation study conducted on 100 test images each from the UIEB and EUVP datasets to evaluate the contribution of individual components in the proposed multi-scale fusion framework. Individual modules, including White Balance, CLAHE, Gaussian filtering, and Laplacian pyramid fusion, were selectively removed to assess their impact on enhancement performance. Similarly, each perceptual weight map (WCH, WCL, WSAT, WEXP) was independently excluded, and the adaptive weight computation was compared against a baseline configuration with uniform weights (α=β=γ=δ=0.25).
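To make the pyramid-fusion ablation concrete, the sketch below contrasts Laplacian pyramid fusion with a direct per-pixel weighted average, which corresponds to fusing without the pyramid. It is a simplified single-channel illustration, not the authors' implementation: all function names are hypothetical, 2×2 block averaging and pixel repetition stand in for the usual Gaussian blur and interpolated expansion, and image sides are assumed to be divisible by 2^(levels−1).

```python
import numpy as np

def down(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):
    """Double resolution by pixel repetition (stand-in for interpolated expand)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def gaussian_pyramid(img, levels):
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    # Band-pass residuals at each level, plus the coarsest low-pass image.
    return [g[i] - up(g[i + 1]) for i in range(levels - 1)] + [g[-1]]

def collapse(lap):
    """Reconstruct an image from its Laplacian pyramid, coarse to fine."""
    out = lap[-1]
    for band in reversed(lap[:-1]):
        out = up(out) + band
    return out

def fuse_pyramid(streams, weights, levels=3):
    """Blend the Laplacian bands of each stream with smoothed weight maps."""
    fused = None
    for img, w in zip(streams, weights):
        lap = laplacian_pyramid(img, levels)
        gw = gaussian_pyramid(w, levels)
        bands = [band * wt for band, wt in zip(lap, gw)]
        fused = bands if fused is None else [f + b for f, b in zip(fused, bands)]
    return collapse(fused)

def fuse_naive(streams, weights):
    """Ablated variant: per-pixel weighted average with no pyramid."""
    return sum(img * w for img, w in zip(streams, weights))
```

With spatially constant weights the two variants coincide, because the decomposition is linear. They differ when the weight maps vary sharply across the image: the pyramid blends each frequency band with a correspondingly smoothed weight map, which is the usual motivation for multi-scale fusion.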
The results, summarized in Table 12, show that removing any component leads to a noticeable degradation across all quantitative metrics. The exclusion of Laplacian pyramid fusion causes the largest drop (PSNR: 22.37 dB on UIEB), confirming its essential role in preserving structural details. The absence of white balance correction significantly affects color restoration, while removing CLAHE or Gaussian filtering reduces contrast and noise suppression. Among the perceptual weights, the chromatic (WCH) and local contrast (WCL) components are most influential for maintaining color fidelity and fine detail. Although the uniform weighting configuration performs moderately well, it is consistently outperformed by the adaptive weighting strategy, underscoring the effectiveness of content-aware fusion. Statistical analysis using paired t-tests confirms that the improvements achieved by the proposed configuration are statistically significant (p < 0.05) across both datasets. These findings collectively demonstrate that each module and adaptive weighting contributes meaningfully to the overall enhancement quality, with their integration being essential for high-fidelity and visually balanced underwater image restoration.
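The paired t-test mentioned above compares per-image scores of two configurations on the same test images. A minimal sketch of the test statistic, using only the Python standard library, is given below; the function name is hypothetical and the inputs are assumed to be per-image metric values (for example, PSNR) for the full model and an ablated variant.

```python
import math

def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with n >= 2")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt(var / n), n - 1
```

For 100 paired images (99 degrees of freedom), a two-sided test at the 0.05 level rejects the null hypothesis when |t| exceeds approximately 1.98.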
Ablation Study on the UIEB and EUVP Datasets Evaluating the Contribution of Individual Components to the Underwater Image Enhancement Model.
Table 13 presents the non-reference evaluation, confirming the significance of each component in the framework. The absence of Laplacian fusion results in the highest NIQE and lowest AG, indicating reduced naturalness and edge sharpness, while the removal of the chromatic weight (WCH) lowers color quality as reflected by decreased UIQM values. The steady improvement from ablated variants to the complete model, with optimal NIQE, AG, and UIQM scores, demonstrates that every module contributes effectively to the overall enhancement. The complete configuration with adaptive perceptual weighting ensures balanced and visually consistent underwater image quality.
Ablation Study on the EUVP Dataset Evaluating the Contribution of Individual Components to the Underwater Image Enhancement Model.
To further illustrate the progressive improvements of the proposed framework, histogram distributions of pixel intensities were analyzed across different stages of enhancement. Histograms provide a statistical view of color distribution and contrast, highlighting the recovery of suppressed channels, redistribution of intensities, and balancing of overall color composition.
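A per-channel histogram of this kind can be computed as sketched below. The function names are hypothetical, and the input is assumed to be an 8-bit image in RGB channel order.

```python
import numpy as np

def channel_histograms(rgb):
    """256-bin histograms for the R, G, B channels of an HxWx3 uint8 image."""
    return {name: np.bincount(rgb[..., i].ravel(), minlength=256)
            for i, name in enumerate("RGB")}

def channel_means(rgb):
    """Mean intensity per channel, a quick indicator of a color cast."""
    return {name: float(rgb[..., i].mean()) for i, name in enumerate("RGB")}
```

A blue-dominant cast appears as a blue histogram concentrated at higher intensities than the red one, and as a correspondingly larger blue channel mean.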
As observed in Figure 7, the histogram of the raw underwater image reveals a clear dominance of the blue channel, suppression of the red channel, and only moderate intensity in the green channel. The grayscale distribution is narrow and clustered around mid-intensity values, reflecting typical underwater degradations such as severe color cast, limited dynamic range, and poor contrast.
After white balance preprocessing, as shown in Figure 8, the red channel begins to recover, reducing the dominance of blue and shifting the histograms toward a more balanced distribution. However, contrast remains limited and structural details are still underrepresented.
With the application of CLAHE, the grayscale histogram is broadened as illustrated in Figure 9, redistributing pixel intensities across a wider range and enhancing local contrast. This step accentuates edges and textures but may also introduce slight over-enhancement in homogeneous regions.
The Gaussian filtering stage suppresses high-frequency fluctuations, as observed in Figure 10, which is indicated by the reduced spikiness in the histogram, effectively attenuating noise while retaining key structural features.
Finally, in the multi-scale fusion output as seen in Figure 11, the histograms demonstrate a well-balanced and uniformly distributed spread across all three RGB channels. The red, green, and blue intensities are more evenly aligned, eliminating the bluish-green dominance typical of underwater imagery. The grayscale histogram spans the full 0–255 range, indicating improved brightness, enhanced global contrast, and preservation of both shadows and highlights. The smooth yet broad distribution confirms balanced integration across channels, resulting in natural color reproduction and visually pleasing enhancement.

Histogram Analysis of the Original Image. (a) Distribution of Intensities Across the Three RGB Color Channels. (b) Intensity Distribution of the Corresponding Grayscale Image.

Histogram Analysis After White Balance Processing. (a) Distribution of Intensities Across the Corrected RGB Color Channels. (b) Intensity Distribution of the Corresponding Grayscale Image.

Histogram Analysis After CLAHE Contrast Enhancement. (a) Distribution of Intensities Across the RGB Color Channels. (b) Intensity Distribution of the Corresponding Grayscale Image.

Histogram Analysis After Gaussian Filtering of the CLAHE-Enhanced Image. (a) Distribution of Intensities Across the RGB Color Channels. (b) Intensity Distribution of the Corresponding Grayscale Image.

Histograms of the Fused Output. (a) RGB Histogram Illustrating Balanced Distribution Across Red, Green, and Blue Channels with Improved Color Consistency. (b) Grayscale Histogram Showing an Expanded Tonal Range and Enhanced Contrast, Highlighting the Effectiveness of the Fusion Process in Restoring Image Details.
Overall, the histogram evaluation validates that each stage contributes uniquely to the final reconstruction: white balance restores chromatic fidelity, CLAHE enhances local contrast, Gaussian filtering reduces noise, and multi-scale fusion ensures adaptive integration. Collectively, these steps achieve superior visual quality and structural fidelity in underwater images.
This subsection evaluates the computational performance of the proposed framework in terms of runtime, memory consumption, and algorithmic complexity. All experiments were conducted in a Google Colab environment with an NVIDIA T4 GPU (16 GB VRAM), 2 vCPUs, and 12 GB of system RAM. At an input resolution of 256 × 256, the average processing time was 0.12 s per image (8.3 FPS). At 512 × 512, the runtime increased to 0.35 s per image (2.9 FPS). Memory usage remained modest, with ∼1.2 GB at 256 × 256 and ∼1.8 GB at 512 × 512.
From a computational standpoint, the complexity of each module was assessed. The Contrast-Limited Adaptive Histogram Equalization (CLAHE) step is the most demanding, operating in O(N log N) due to local histogram processing, while white balance correction, Gaussian filtering, Laplacian pyramid decomposition/fusion, weight map calculations, and reconstruction scale linearly (O(N)). Therefore, the overall asymptotic complexity of the framework is O(N log N), dominated by CLAHE.
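The linear cost of the pyramid stages can be justified with a short geometric-series argument. Assuming each pyramid level halves both image dimensions, level k of an L-level pyramid contains N/4^k pixels, so the total number of pixels processed across all levels is bounded by a constant multiple of N:

```latex
\[
\sum_{k=0}^{L-1} \frac{N}{4^{k}}
  = N \cdot \frac{1 - 4^{-L}}{1 - \tfrac{1}{4}}
  < \frac{4}{3}\, N
\]
```

With a fixed-size filter applied per pixel at each level, pyramid decomposition, band-wise weighting, and reconstruction therefore remain O(N) regardless of the number of levels.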
Although validation was performed only on benchmark datasets (UIEB and EUVP) and not deployed on embedded platforms or Autonomous Underwater Vehicles (AUVs), the results show that the method is computationally lightweight compared to deep CNN-based approaches and achieves near real-time performance at moderate resolutions. The combination of low runtime, modest memory requirements, and efficient complexity suggests strong potential for future adaptation in resource-constrained or real-time underwater imaging systems, pending platform-specific optimization.
Conclusion
This work presented a novel multi-stream preprocessing and multi-scale fusion framework for underwater image enhancement, integrating white balance correction, CLAHE-based contrast adjustment, and Gaussian-filtered noise suppression within a unified architecture. Each preprocessed stream was decomposed into Laplacian pyramids, and their contributions were adaptively balanced using four perceptual weight maps: chromatic, saturation, local contrast, and exposure. Unlike conventional approaches that rely on single-stream enhancement or handcrafted weighting, the proposed method treats preprocessing as a complementary fusion problem and employs adaptive weight normalization to preserve global color fidelity, enhance local contrast, and suppress high-frequency noise simultaneously.
Extensive experiments on the UIEB and EUVP datasets demonstrate that the proposed framework consistently outperforms traditional enhancement techniques and recent fusion-based methods, both in terms of objective metrics (PSNR, SSIM, MSE, UIQM) and qualitative assessments. The ablation study confirms that each module and weight map contribute meaningfully to the final enhancement quality, while statistical significance testing validates the robustness of the observed improvements. Furthermore, histogram analyses illustrate the progressive restoration of balanced color distributions and improved contrast across processing stages. The method also achieves near real-time performance with modest computational and memory requirements, highlighting its practical feasibility.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
