End-to-end dehazing of traffic sign images using reformulated atmospheric scattering model

Abstract

As an advanced machine vision task, traffic sign recognition is of great significance to the safe driving of autonomous vehicles, and haze seriously degrades its performance. This paper proposes a dehazing network with multi-scale residual blocks that significantly improves the recognition of traffic signs in hazy weather. First, we introduce the idea of residual learning and design an end-to-end multi-scale feature fusion method. Second, we use subjective visual effects and objective evaluation metrics, including the Visibility Index (VI) and Realness Index (RI), which are based on the characteristics of real-world environments, to compare our method with well-performing traditional and deep learning dehazing methods. Finally, this paper combines image dehazing with traffic sign recognition, using the proposed algorithm to dehaze traffic sign images captured in real-world hazy weather. The experiments show that the proposed algorithm improves the performance of traffic sign recognition in hazy weather and fulfils the requirements of real-time image processing. They also demonstrate the effectiveness of the reformulated atmospheric scattering model for dehazing traffic sign images.

Keywords

Deep learning; image processing; dehazing of real-world images; traffic sign recognition; reformulated atmospheric scattering model

1 Introduction

The rapid development of the world economy has brought environmental pollution and other problems. Hazy weather in particular has a considerable impact on people’s lives. The many suspended particles in haze absorb and scatter light, resulting in reduced contrast, color distortion, and low recognizability of the images obtained by the camera, which poses significant challenges to traffic sign recognition. Autonomous vehicles receive road condition information through the camera. On the one hand, traffic signs often differ in color from the surrounding environment; on the other hand, the outdoor scenes captured by the camera cover a wide range and contain large sky areas, which makes them very different from indoor images. However, although there is much research on image dehazing, most studies test on hazy images synthesized from indoor scenes, and few methods are designed for subsequent high-level vision tasks (such as target detection). Therefore, it is of great significance to design a dehazing method suitable for traffic sign recognition in hazy weather.

Traditional image dehazing methods are based on image enhancement and image restoration. Image enhancement methods improve the visual effect of an image by enhancing its contrast, visibility, and grey-level distribution. Among them, dehazing algorithms based on the Retinex theory have been used in recent years, with good results in image detail enhancement, color preservation, and recognition enhancement [1–4]. Jobson et al. [5, 6] improved the window scale and successively proposed image enhancement algorithms based on Single-Scale Retinex (SSR) and Multi-Scale Retinex (MSR) to improve the dehazing effect. Yu et al. [7] studied an image enhancement algorithm based on a physical lighting model. The experimental results show that the algorithm increases the richness of the visible edges of the image and maintains the consistency and naturalness of the colors. Image restoration methods establish an image degradation model by considering interference factors such as haze, noise, and the air medium, and inversely deduce a clear image based on the model [8]. For example, Jason et al. [9] used the polarization information of light scattering, combined with image processing technology, to obtain clear images after dehazing. Liang et al. [10] considered that infrared radiation penetrates haze better and proposed a polarization dehazing method based on visible light and infrared image fusion to improve the visibility of hazy images. He et al. [11] proposed an image dehazing method based on the Dark Channel Prior (DCP) theory. This method refines the transmittance through soft matting, so its dehazing efficiency is low due to the heavy computation. However, because the DCP principle and algorithm are simple, many scholars have improved on DCP-based dehazing. For example, Jackson et al. [12] combined DCP and Rayleigh scattering theory to improve the transmittance and atmospheric light estimation. Experiments have shown that the method has an excellent dehazing effect and a somewhat improved calculation speed. Sarkar et al. [13] applied DCP and guided filters to the empirical wavelet domain of hazy images for dehazing, and the overall visual quality of the output image is high. Lu et al. [14], based on the principle of DCP, exploited the CPU architecture features and instruction types of an embedded DSP processor to preserve the haze penetration effect on thick hazy images while increasing the processing speed by 50 times; Yang et al. [15] also improved computational efficiency through the DCP. Zhu et al. [16] proposed a Color Attenuation Prior (CAP), which creates a linear model of the scene depth of the hazy image, uses a supervised learning method to train its parameters, and dehazes with an atmospheric model. Berman et al. [17] proposed Non-Local Dehazing (NLD), which is based on the observation that the colors of a haze-free image form tight clusters in RGB space and that, under haze, each color cluster of the clear image becomes a line in RGB space. The distance map and the clear image can be restored at the same time through the algorithm.

With the development of artificial intelligence, deep learning has gradually become a key technology in image dehazing. Many scholars have improved the effect of image dehazing based on deep learning methods [18–22], although many classic methods still leave room for improvement. Li et al. [23] proposed the AOD-Net dehazing model based on a Convolutional Neural Network (CNN). The model does not need to estimate atmospheric light and transmittance separately; instead, a light-weight and efficient CNN realizes end-to-end dehazing, and experiments have shown that AOD-Net has superior performance. Qian et al. [24] designed the FAOD-Net model based on AOD-Net, using depthwise separable convolution instead of standard convolution. This model avoids the estimation errors caused by separately estimating the transmittance and atmospheric light value of the hazy image, and reduces the number of training parameters and the running time. Cai et al. [25] also designed an end-to-end trainable model, DehazeNet, based on the CNN architecture to estimate the transmittance of hazy images. They proposed a new non-linear activation function called the Bilateral Rectified Linear Unit (BReLU) to replace the commonly used ReLU or Sigmoid. Experiments show that DehazeNet is efficient and easy to use, and that its dehazing effect is better than that of existing methods. Rick and Tim et al. [26, 27] solved linear inverse problems in image dehazing by learning the proximal operator in an iterative optimization algorithm. Yang et al. [28] designed a proximal Dehaze-Net on this basis, combining the advantages of traditional prior-based dehazing methods and deep learning by incorporating haze-related prior knowledge into the deep network. Experiments show that the method achieves state-of-the-art performance in single image dehazing. Ren et al. [29] proposed a Multi-Scale Convolutional Neural Network (MSCNN) for single image dehazing, which is trained to learn the mapping between hazy images and their transmittance and restores clear images from the estimated atmospheric transmittance; experiments show that the method is effective in restoring both synthetic and real hazy images.

Image dehazing belongs to low-level vision tasks and is usually considered a preprocessing step for high-level vision tasks. At present, only a small number of researchers have studied dehazing in combination with subsequent high-level vision tasks. For example, Wiesemann [30] studied the influence of hazy weather on traffic sign detection algorithms under different road conditions, using Koschmieder’s haze model and semi-automatically generated depth maps to simulate and analyze hazy weather, thereby establishing a realistic traffic sign recognition environment. Yan et al. [31] proposed an effective method for the recognition of traffic speed limit signs in haze. This method uses HOG features and an SVM classifier to detect the sign regions, and improves both processing time and robustness, which brings it closer to practical applications. Inspired by the residual network, Cao et al. [32] designed a dehazing algorithm suitable for railway monitoring images and showed through experiments that the detection accuracy on the dehazed images was improved. Given the current state of research, few studies connect dehazed images with high-level vision tasks, so the follow-up performance cannot be tested truly and effectively [23].

Therefore, this paper proposes an end-to-end dehazing network with multi-scale residual blocks that uses a reformulated atmospheric scattering model. The network jointly accounts for the atmospheric light value and the transmittance and establishes an adaptive deep model whose parameters change with the input hazy image. In this way, the reconstruction error between the dehazed output and the real clear image is minimized, the contrast of the dehazed image is improved, the edge characteristics of traffic signs are maintained, and the recognition accuracy of traffic signs in hazy weather is enhanced.

2 Related work

This paper uses deep learning methods to recognize traffic signs in hazy weather, aiming to improve the accuracy of traffic sign recognition and thereby the safety of autonomous driving. The procedure is divided into two stages: first, the traffic sign images captured in hazy weather are dehazed; second, the traffic signs in the dehazed images are recognized. The overall scheme studied in this paper is shown in Fig. 1.

Fig. 1

The overall scheme of dehazing of traffic sign images.

In terms of image dehazing, this paper builds on the end-to-end physical model proposed by Li et al. [23]. With the subsequent high-level vision task of traffic sign recognition in mind, we design a light-weight neural network that introduces the idea of the residual network, which effectively improves the contrast of the dehazed image and maintains the edge characteristics of the image. We aim to obtain a dehazed image that is more conducive to traffic sign recognition.

2.1 Reformulated atmospheric scattering model

Atmospheric scattering model theory is a classic description of the generation of hazy images [33, 34], and the specific expression is shown in Eq. (1):

$I(x) = J(x) t(x) + A (1 - t(x))$ (1)

where I(x) is the input hazy image, J(x) is the clear image, A is the global atmospheric light value, and t(x) is the transmittance, whose value ranges from 0 to 1. Solving Eq. (1) for the clear image gives Eq. (2):

$J(x) = \frac{1}{t(x)} I(x) - A \frac{1}{t(x)} + A$ (2)

Traditional methods use prior knowledge to estimate the global atmospheric light value A and the transmittance t(x) separately and then restore the clear image through Eq. (2). Direct end-to-end dehazing cannot be achieved in this way, and the separate estimation errors accumulate or even amplify. To merge A and t(x) into a single variable K(x), we establish Eqs. (3) and (4):

$J(x) = K(x) I(x) - K(x) + b$ (3)

$K(x) = \frac{\frac{1}{t(x)} (I(x) - A) + (A - b)}{I(x) - 1}$ (4)

where b is a constant bias with a default value of 1. It can be seen from Eq. (4) that the new variable K(x) not only integrates the atmospheric light value A and the transmittance t(x) but also changes with the input hazy image I(x). Therefore, we train the adaptive model by reducing the loss between the dehazed image J(x) and the original clear image, thus directly completing end-to-end dehazing.
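The relationship between Eqs. (1), (3) and (4) can be checked numerically. The following is a minimal per-pixel sketch, not the network itself: the function names and the sample values of A, t and J are illustrative assumptions, and K(x) is computed here from known A and t(x), whereas the proposed model learns it from I(x). Note that Eq. (4) is undefined where I(x) = 1.

```python
def hazy_pixel(J, t, A):
    """Eq. (1): I = J*t + A*(1 - t)."""
    return J * t + A * (1.0 - t)


def k_value(I, t, A, b=1.0):
    """Eq. (4): K = ((I - A)/t + (A - b)) / (I - 1); undefined when I == 1."""
    return ((I - A) / t + (A - b)) / (I - 1.0)


def dehazed_pixel(I, K, b=1.0):
    """Eq. (3): J = K*I - K + b."""
    return K * I - K + b


# Illustrative values only (not taken from the paper's data).
A, t = 0.9, 0.6
clear = [0.10, 0.35, 0.80]

hazy = [hazy_pixel(J, t, A) for J in clear]
recovered = [dehazed_pixel(I, k_value(I, t, A)) for I in hazy]

for J, I, R in zip(clear, hazy, recovered):
    print(f"J={J:.2f}  I={I:.3f}  recovered={R:.3f}")
```

Substituting Eq. (4) into Eq. (3) returns exactly the right-hand side of Eq. (2), which is why the recovered values match the clear ones up to rounding.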

2.2 Multi-scale convolution fusion

The size of the convolution kernel will affect the model calculation ability and the feature extraction. Multi-scale convolution can improve the robustness of capturing image depth information. Ren et al. [29] connected the coarse-scale network with the intermediate layer of the fine-scale network, and merged filters of different sizes to extract multi-scale features. The DenseNet proposed by Huang et al. [35] realized the fusion of low-level features and high-level features. It also significantly reduces the number of parameters. The most basic structure in DenseNet is Dense Block. Its function is to merge the channels of the feature map through the connection layer so that the feature maps of different levels are merged. The connection method is shown in Fig. 2. Each x represents a hierarchical feature map.

Fig. 2

Schematic diagram of a Dense Block connection.

This paper uses multiple convolution kernels with sizes of 1×1, 3×3, 5×5, and 7×7 to directly extract features from hazy traffic sign images. The calculation is given by Eq. (5):

$G_{i}(x) = W_{i} * G_{i-1}(x)$ (5)

where $G_{i}(x)$ is the output feature map of the i-th layer, $W_{i}$ is the convolution kernel of the i-th layer, * represents the convolution operation, and $G_{i-1}(x)$ is the feature map of the (i-1)-th layer.
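As a rough illustration of multi-scale feature extraction, the sketch below applies 1×1, 3×3, 5×5 and 7×7 kernels to the same single-channel input with zero padding, so that every output keeps the input size and the maps could be concatenated along the channel axis. It assumes stride 1 and uses fixed averaging kernels as stand-ins for the learned weights; `conv2d_same` and `box_kernel` are hypothetical helper names, not part of the paper's implementation.

```python
def conv2d_same(image, kernel):
    """Zero-padded, stride-1 2-D convolution (cross-correlation, as in CNNs)."""
    h, w = len(image), len(image[0])
    k = len(kernel)              # odd, square kernel assumed
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < h and 0 <= cc < w:   # outside pixels count as zero
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def box_kernel(k):
    """Averaging kernel of size k x k (placeholder for learned weights)."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


# Toy 8x8 single-channel input.
image = [[float((r * 8 + c) % 5) for c in range(8)] for r in range(8)]

# One feature map per kernel size; all share the input's spatial size.
features = [conv2d_same(image, box_kernel(k)) for k in (1, 3, 5, 7)]
print([(len(f), len(f[0])) for f in features])
```

Larger kernels aggregate a wider neighbourhood per output pixel, which is the property the multi-scale design relies on.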

2.3 The residual block

In mathematical statistics, the residual is the difference between the actual observed value and the fitted value; in a correct regression model, the residual can be regarded as an estimate of the error. For a neural network, as the number of layers increases, training becomes more difficult and performance on the training set degrades. This is called the degradation problem, and the residual network can be used to mitigate it. He et al. [36] proposed a deep residual network to solve the problem of performance degradation as the network depth increases. Since then, the residual network has been applied to various fields of image processing and has achieved remarkable results [37–39]. Veit et al. [40] regarded the residual network as an ensemble model assembled from a collection of paths, and showed that the paths of the unrolled residual network have a certain degree of independence and redundancy. Based on this point of view, this paper conducts residual learning in the convolutional layers, retains the feature information of the previous layer, and uses skip connections to merge features of different levels. This can increase the diversity of features, ease gradient descent, and speed up model training.

This paper introduces the idea of residual learning in the convolutional layers to form residual modules. The structure of each residual module is shown in Fig. 3, where x represents the input and F(x) represents the output of the residual module after the linear transformation and activation of the first layer, called the residual term. F(x)+x represents the output before the activation function of the second convolutional layer.

Fig. 3

The residual module structure of this paper.

The BN layer in the residual module normalizes the input data so that the processed data can be learned more fully by the next layer of the neural network, which alleviates the vanishing gradient problem to a certain extent.
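The skip connection can be sketched in a few lines. The example below is a simplified stand-in, assuming a two-layer residual branch with ReLU activations in which per-element affine maps replace the convolutions and the BN layer is omitted; the exact layer ordering of Fig. 3 is not reproduced, and all function names are illustrative.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def affine(v, weight, bias):
    """Per-element affine map, a stand-in for a convolutional layer."""
    return [weight * x + bias for x in v]


def residual_module(x, w1, b1, w2, b2):
    f = relu(affine(x, w1, b1))      # first layer followed by activation
    f = affine(f, w2, b2)            # second layer, before its activation
    summed = [fi + xi for fi, xi in zip(f, x)]   # F(x) + x via the skip path
    return relu(summed)              # activation applied after the addition


x = [1.0, 2.0, 0.5]

# With a zeroed residual branch the module passes its input through unchanged.
print(residual_module(x, 0.7, 0.1, 0.0, 0.0))

# With non-zero weights the branch adds a learned correction to x.
print(residual_module(x, 0.7, 0.1, 0.5, -0.2))
```

Because the input is added back unchanged, the branch only has to learn the correction to x, which is what makes the feature information of the previous layer easy to retain.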

3 The proposed method

3.1 Network design

The AOD-Net proposed by Li et al. [23] consists of two modules: the K(x) estimation module and the clear image generation module. K(x) is estimated from the input hazy image I(x) and then used as the input-adaptive parameter to restore the clear image J(x). In this section, by introducing the idea of residual learning, we design a new K(x) estimation module for traffic sign recognition after dehazing, called Res-K(x). The Res-K(x) estimation module uses 7 convolutional layers, conv1 to conv7, with convolution kernels of different sizes to extract image features; 3 connection layers, concat1 to concat3, to concatenate the output features of the intermediate convolutional layers; and 3 residual modules, res-block1 to res-block3. The specific structure is shown in Fig. 4.

Fig. 4

Res-K(x) estimation module structure.

In the Res-K(x) estimation module, the “concat1” layer concatenates the outputs of the “conv1” layer and “res-block1”, the “concat2” layer concatenates the outputs of the “conv2” layer and “res-block2”, and the “concat3” layer concatenates the output features from “conv1”, “conv2”, “conv3”, “conv4”, “conv5” and “res-block3”. In summary, the Res-K(x) estimation module uses multi-scale convolution to extract features, uses the concat layers to realize a smooth transition from low-level to high-level features, and compensates for the information loss in the convolution process through the residual modules. In addition, each convolutional layer in “res-block1” and the “conv7” layer uses only three filters, each convolutional layer in “res-block2” uses six filters, and each convolutional layer in “res-block3” uses nine filters, so the proposed network remains a light-weight neural network.

3.2 Training data and experimental settings

Training the designed network requires a large amount of data. In real environments, which are affected by many factors, it is difficult to obtain hazy images at many different haze levels together with their corresponding clear images, and hazy images containing traffic signs are even harder to obtain. The background of the training set should first of all be outdoors and, more specifically, on outdoor highways. Therefore, we choose the outdoor training set (OTS) of RESIDE-beta [41]. So far, this dataset contains 4477 real outdoor images captured in real weather. To cover the haze concentrations of different weather, each image is synthesized with scattering coefficients β= (0.04, 0.06, 0.08, 0.1, 0.12, 0.14, 0.16, 0.2), 7 categories in total, and atmospheric light values A= (0.8, 0.85, 0.9, 0.95, 1), 5 categories in total, giving 156695 synthetic hazy images. From these, we selected 2377 clear images whose background is a traffic road scene, corresponding to 83195 synthetic hazy images. Of these, 79035 samples were used for training and 4160 for validation. Following the method proposed in [26], we use pre-trained projection operators to preprocess the dataset, solving linear inverse problems such as denoising to improve image quality.
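The dataset sizes quoted above follow from simple counting. The short check below uses only the category counts and image counts stated in the text (7 values of β, 5 values of A); the variable names are illustrative.

```python
n_beta, n_light = 7, 5             # categories of beta and A stated in the text
per_image = n_beta * n_light       # hazy variants synthesized per clear image

ots_clear, road_clear = 4477, 2377
ots_hazy = ots_clear * per_image   # all synthetic hazy images in OTS
road_hazy = road_clear * per_image # hazy images with a traffic road background

train, val = 79035, 4160

print(per_image, ots_hazy, road_hazy, train + val)
print(f"validation share: {val / road_hazy:.3f}")
```

Each clear image therefore contributes 35 hazy variants, and the training and validation subsets together cover all selected traffic road samples.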

The model is trained with Pytorch 1.5.1 + cu101. We initialize the weights with Gaussian random variables with a mean of 0 and a variance of 2/n, use ReLU as the activation function, and set the momentum to 0.9, the weight decay to 0.0001, and the learning rate to 0.0001. The specific parameters are shown in Table 1.

Table 1
Training parameters

Parameters	Value
Image size	512×512
Learn rate	0.0001
Momentum	0.9
Weight decay	0.0001
Train batch size	16
Val batch size	8
Epochs	51
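The weight initialization described above (zero-mean Gaussian with variance 2/n) can be sketched as follows. This is an illustration under the assumption that n denotes the number of inputs feeding each filter (kernel height × kernel width × input channels), which the text does not spell out; the layer shape and the helper name `gaussian_init` are hypothetical.

```python
import math
import random


def gaussian_init(fan_in, count, seed=0):
    """Draw `count` weights from N(0, 2 / fan_in)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)    # standard deviation for variance 2/n
    return [rng.gauss(0.0, std) for _ in range(count)]


fan_in = 3 * 3 * 3                   # hypothetical: 3x3 kernel over 3 input channels
weights = gaussian_init(fan_in, 20000)

mean = sum(weights) / len(weights)
variance = sum((w - mean) ** 2 for w in weights) / len(weights)

print(f"target variance: {2.0 / fan_in:.4f}")
print(f"sample mean: {mean:.4f}, sample variance: {variance:.4f}")
```

Scaling the variance with 1/n keeps the magnitude of the activations roughly stable from layer to layer when ReLU is used, which is the usual motivation for this choice.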

Figure 5 compares how the loss of our model and of AOD-Net changes with the number of iterations. To facilitate comparison and observation, we fit the data and plot the fitted curves in dark colors. It can be seen from Fig. 5 that our model converges faster. This is because, on top of the Adam optimization algorithm and inspired by the ReZero method proposed by Thomas [42], we initialize every layer in the network as an identity map and add one additional learnable parameter to each layer, which gives the network a faster convergence speed.

Fig. 5

Loss function curves of our method compared with AOD-Net.
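The ReZero idea referred to above can be written as x_{i+1} = x_i + α_i F(x_i), where the learnable scalar α_i starts at zero so that each layer is initially an identity map. The sketch below illustrates only this update rule; `toy_transform` is a hypothetical placeholder for the layer's transformation, and how α is integrated into the proposed network is not specified in the text.

```python
def rezero_layer(x, transform, alpha):
    """x_next = x + alpha * F(x), with alpha a learnable scalar."""
    fx = transform(x)
    return [xi + alpha * fi for xi, fi in zip(x, fx)]


def toy_transform(v):
    """Placeholder for the layer's transformation F."""
    return [2.0 * x - 1.0 for x in v]


x = [0.2, -0.4, 1.5]

at_init = rezero_layer(x, toy_transform, alpha=0.0)       # identity at initialization
after_update = rezero_layer(x, toy_transform, alpha=0.1)  # after alpha has moved from 0

print(at_init)
print(after_update)
```

Since the signal passes through unchanged at the start of training, gradients reach every layer directly, which is consistent with the faster convergence reported in Fig. 5.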

4 Dehazing experiment and results

This section shows that our method has advantages in terms of visual effects and evaluation metrics. The images of a real-world dataset are dehazed, and the evaluation metrics (MSE, SSIM, PSNR, VI and RI) are analyzed and compared with those of 8 existing methods.

4.1 Subjective visual effect

The high-level visual task studied in this paper is traffic sign recognition. Whether the traffic signs are clear or not is a subjective judgment. Therefore, we can first compare the visual effects of traffic sign images before and after dehazing to make a preliminary evaluation of the proposed method. In the real-world hazy image, the effect of using our method to dehaze is shown in Fig. 6. The yellow box marks the location of the traffic signs.

Fig. 6

Subjective visual effects of traffic signs after dehazing using our method.

It can be seen from the figure that after dehazing by the algorithm in this paper, the traffic signs in the image are clearer than before dehazing and their edge and contour information is highlighted, so they can easily be picked out by the human eye. However, the reliability of this subjective judgment is low, and the results need to be further verified by detection with YOLOv5.

4.2 Evaluation metrics in image quality assessment

Because it is difficult to evaluate the effect of image dehazing by personal subjective judgment, researchers mostly use Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) [42] to analyze experimental results quantitatively. However, these evaluation metrics are usually applied to indoor images with depth of field. There is a certain gap between real-world hazy images and indoor images, and a dehazing method that fits indoor images well does not necessarily fit real-world images well. Because it is difficult to measure the dehazing effect with a single standard, Zhao et al. [43] proposed evaluating dehazing methods from the two aspects of visibility and realness and established two FR-IQA criteria, VI and RI. A large number of experiments have shown the superiority of VI and RI over other evaluation metrics, especially for real-world images. Therefore, this section selects MSE, PSNR, SSIM, VI and RI as objective evaluation metrics to evaluate the proposed dehazing method.
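For reference, MSE and PSNR are related by PSNR = 10·log10(MAX²/MSE), where MAX is the peak intensity. The sketch below computes both for flattened images, assuming intensities normalized to [0, 1] (so MAX = 1); the pixel values are made up for illustration, and SSIM, VI and RI are not covered.

```python
import math


def mse(reference, restored):
    """Mean squared error between two equally sized, flattened images."""
    return sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)


def psnr(mse_value, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given MSE and peak intensity."""
    return 10.0 * math.log10(peak ** 2 / mse_value)


# Made-up pixel values in [0, 1], for illustration only.
reference = [0.20, 0.50, 0.90, 0.40]
restored = [0.25, 0.45, 0.80, 0.40]

error = mse(reference, restored)
print(f"MSE = {error:.5f}, PSNR = {psnr(error):.2f} dB")
```

A lower MSE corresponds to a higher PSNR, so the two metrics rank methods identically on a single image; differences can appear once values are averaged over a dataset.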

We compared the datasets provided in the literature [43–46]. Most of them are artificially synthesized, and synthetic hazy images differ from real-world hazy images. Therefore, we choose BeDDE, provided in [43], as the dataset for calculating the dehazing evaluation metrics MSE, PSNR, SSIM, VI and RI. The BeDDE dataset consists entirely of real hazy images that fulfil the experimental requirements of this paper. From it, hazy images at three levels (light, medium, and thick haze) are selected for dehazing, as shown in Fig. 7. In Fig. 7, the buildings and other scenery in the medium-haze image (b) are close to the camera, which leads to the subjective judgment that its haze level is lower than that of image (a); this further illustrates the importance of objective metrics for evaluating the dehazing effect. We choose 8 dehazing methods with superior performance to compare with our method, namely CAP [16], DCP [11], NLD [17], Dehaze-Net [25], MSCNN [29], FFA-Net [47], GCA-Net [48] and AOD-Net [23]. The first three are traditional dehazing methods, and the last five are deep learning dehazing methods. The dehazing results are shown in Figs. 8, 9 and 10, respectively, and the specific values of MSE, PSNR, SSIM, VI and RI are shown in Tables 2, 3 and 4. The red in the tables indicates the three best values of each evaluation metric, and the blue indicates the three worst.

Fig. 7

Light hazy, medium hazy and thick hazy effect display in the BeDDE dataset.

Fig. 8

Results of 8 different dehazing methods in light hazy scenes.

Fig. 9

Results of 8 different dehazing methods in medium hazy scenes.

Fig. 10

Results of 8 different dehazing methods in thick hazy scenes.

Table 2

MSE, PSNR, SSIM, VI and RI values of dehazing images in light hazy scenes

Light	MSE	SSIM	PSNR	VI	RI
CAP	0.0308	0.8236	15.1125	0.8768	0.9636
DCP	0.0730	0.7221	11.3659	0.8941	0.9715
NLD	0.0640	0.7433	11.9365	0.8447	0.9605
Dehaze-Net	0.0389	0.7432	14.0982	0.8515	0.9760
MSCNN	0.0382	0.7914	14.1741	0.8606	0.9743
FFA-Net	0.0148	0.8755	18.3124	0.8451	0.9742
GCA-Net	0.1083	0.6355	9.6549	0.8974	0.9655
AOD-Net	0.0353	0.7608	20.5404	0.8622	0.9669
Ours	0.0156	0.8763	18.0489	0.8714	0.9717

Table 3

MSE, PSNR, SSIM, VI and RI values of dehazing images in medium hazy scenes

Medium	MSE	SSIM	PSNR	VI	RI
CAP	0.0154	0.8572	18.1298	0.8054	0.9264
DCP	0.0488	0.7851	13.1127	0.8639	0.9461
NLD	0.0592	0.6916	12.2801	0.8455	0.9496
Dehaze-Net	0.0186	0.8258	17.2998	0.8291	0.9496
MSCNN	0.0211	0.8593	16.7472	0.8147	0.9419
FFA-Net	0.0213	0.8941	16.7257	0.8127	0.9509
GCA-Net	0.0139	0.8819	18.5582	0.8416	0.9460
AOD-Net	0.0211	0.8268	22.7760	0.8380	0.9474
Ours	0.0129	0.8849	18.9224	0.8456	0.9498

Table 4

MSE, PSNR, SSIM, VI and RI values of dehazing images in thick hazy scenes

Thick	MSE	SSIM	PSNR	VI	RI
CAP	0.0162	0.8471	17.9167	0.8013	0.9118
DCP	0.0494	0.7755	13.0641	0.8536	0.9431
NLD	0.0655	0.6976	11.8670	0.8627	0.9440
Dehaze-Net	0.0170	0.8420	17.6742	0.7769	0.9504
MSCNN	0.0152	0.8818	18.1923	0.7738	0.9498
FFA-Net	0.0143	0.9016	19.1839	0.7389	0.9514
GCA-Net	0.0148	0.8834	18.2897	0.8194	0.9338
AOD-Net	0.0170	0.8528	17.7018	0.7929	0.9489
Ours	0.0142	0.8788	18.4596	0.8012	0.9503

4.3 Running time comparison

The above experimental results verify the effectiveness of our method in terms of subjective visual effects and the objective evaluation metrics MSE, PSNR, SSIM, VI and RI. Traffic sign recognition in hazy weather is an essential part of advanced driver assistance systems and autonomous driving systems, and autonomous driving scenarios in particular require operational efficiency to be verified. We test the dehazing algorithms mentioned above on a real-world hazy dataset with an image size of 512×512, ignoring the model loading time; the average running time of each algorithm is shown in Table 5. As can be seen from the table, the deep learning dehazing algorithms are generally faster than the traditional ones, and the algorithms on the Matlab platform are slower than those on the Pytorch platform. The algorithm in this paper takes 0.054 s per image, slower only than GCA-Net and AOD-Net, which fulfils the needs of real-time detection.

Table 5
The average running time of each dehazing algorithm in the real-world hazy dataset

Methods	Platform	Time (s)
CAP	Matlab	0.790
DCP	Matlab	4.780
NLD	Matlab	6.120
Dehaze-Net	Matlab	2.140
MSCNN	Matlab	1.750
FFA-Net	Pytorch	0.265
GCA-Net	Pytorch	0.030
AOD-Net	Pytorch	0.038
Ours	Pytorch	0.054
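A measurement of this kind can be sketched as a small timing harness that excludes start-up cost and averages over the test images. The code below is an illustration only: `average_runtime` and `frames_per_second` are hypothetical helpers, the dummy callable stands in for a dehazing model, and it does not reproduce the measurement setup behind Table 5.

```python
import time


def average_runtime(fn, inputs, warmup=1):
    """Average seconds per call of fn over inputs, after a few warm-up calls."""
    for x in inputs[:warmup]:
        fn(x)                          # warm-up calls are not timed
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    return (time.perf_counter() - start) / len(inputs)


def frames_per_second(seconds_per_image):
    return 1.0 / seconds_per_image


# Dummy workload standing in for a dehazing model.
dummy_images = [list(range(1000)) for _ in range(20)]
elapsed = average_runtime(lambda img: [v * 0.5 for v in img], dummy_images)
print(f"average runtime of the dummy workload: {elapsed:.6f} s")

# Throughput implied by the per-image times listed in Table 5.
for name, seconds in [("GCA-Net", 0.030), ("AOD-Net", 0.038), ("Ours", 0.054)]:
    print(f"{name}: {frames_per_second(seconds):.1f} images per second")
```

At 0.054 s per image the proposed method processes roughly 18 images per second, which gives a concrete sense of the real-time claim.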

5 Traffic sign recognition after dehazing

At present, few studies link image dehazing with high-level vision tasks. However, in hazy weather, the contrast of the images obtained by the camera is reduced, the color is distorted, and the recognizability is low, which seriously affects the recognition of traffic signs. The dehazing model proposed in this paper effectively removes the haze in traffic sign images and improves traffic sign recognition accuracy, which verifies the superiority of the model.

We use YOLOv5 as the traffic sign recognition algorithm in this paper because it is a classic target detection algorithm with fast detection speed and high accuracy. It is also well known to researchers, which makes it suitable for testing the effect of the dehazing algorithms. The CCTSDB [49] was used to train the model, and 600 images with haze added based on Koschmieder’s model were chosen as the synthetic hazy testing dataset for this experiment. However, traffic sign recognition is oriented to the application scenarios of automatic driving, and synthetic haze differs from real-world haze. Since there is currently no public dataset of hazy traffic signs, we also collected 100 images of hazy traffic signs in the real world as a testing dataset. We compare the traffic sign recognition results after dehazing with the 8 state-of-the-art dehazing methods on both the synthetic and the real-world testing datasets, using Recall, Precision and mAP as evaluation indicators.
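As a reminder of how these indicators are defined, Precision = TP/(TP+FP) and Recall = TP/(TP+FN), and under mAP@.5 a detection is usually counted as a true positive when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. The sketch below shows these building blocks with made-up boxes and counts; it is not the YOLOv5 evaluation code, and the averaging over classes and confidence thresholds that yields mAP is omitted.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive and false negative counts."""
    return tp / (tp + fp), tp / (tp + fn)


# Made-up boxes: a predicted sign box against a ground-truth box.
predicted = (10, 10, 50, 50)
ground_truth = (20, 10, 60, 50)
overlap = iou(predicted, ground_truth)
print(f"IoU = {overlap:.3f}, true positive at 0.5: {overlap >= 0.5}")

# Made-up detection counts.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

The stricter mAP@.5:.95 indicator averages the result over IoU thresholds from 0.5 to 0.95, so it rewards tighter localization of the sign boxes.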

Figure 11 shows the synthetic hazy images and real-world hazy images used in the test. (a) is the synthetic image of light hazy, medium hazy and thick hazy in the same scene; (b) is the three hazy images of different scenes in the real world, and three images with significant differences are selected as far as possible. Figure 12 shows the recognition results of the synthetic hazy testing dataset, and the specific evaluation metrics values are shown in Table 6. Figure 13(A), (B), (C) respectively show the recognition results of the three graphs in Fig. 11(b), and the specific evaluation metrics values are shown in Table 7.

Fig. 11

The synthetic hazy images and real-world hazy images used in the test.

Fig. 12

The recognition results of the synthetic hazy testing dataset.

Table 6

Evaluation metrics of traffic sign recognition results of synthetic hazy testing dataset

Methods	Precision	Recall	mAP@.5	mAP@.5:.95
Synthetic Hazy	92.8%	72.1%	75.5%	68.6%
CAP	97.8%	95.2%	97.7%	92.9%
DCP	98.0%	99.0%	98.8%	96.7%
NLD	99.2%	98.0%	98.9%	97.0%
Dehaze-Net	97.2%	90.6%	95.0%	89.8%
MSCNN	96.8%	89.8%	94.7%	89.5%
FFA-Net	98.9%	86.2%	90.0%	85.3%
GCA-Net	99.8%	96.7%	98.4%	93.1%
AOD-Net	96.7%	89.3%	94.3%	88.1%
Our method	98.2%	95.6%	97.0%	91.8%

Fig. 13

The recognition results of the real-world hazy testing dataset.

Table 7

Evaluation metrics of traffic sign recognition results of real-world hazy testing dataset

Methods	Precision	Recall	mAP@.5	mAP@.5:.95
Real-world Hazy	63.1%	80.0%	66.8%	53.6%
CAP	76.7%	75.4%	72.1%	57.5%
DCP	75.1%	75.3%	76.6%	57.9%
NLD	78.4%	71.6%	74.2%	58.6%
Dehaze-Net	78.6%	82.9%	81.6%	62.1%
MSCNN	86.8%	82.2%	81.1%	61.2%
FFA-Net	76.4%	78.1%	77.1%	60.1%
GCA-Net	77.8%	72.9%	77.9%	60.1%
AOD-Net	83.0%	79.4%	80.6%	61.3%
Our method	85.4%	82.7%	83.8%	63.3%

In Tables 6 and 7, red indicates the three best values of each evaluation metric and blue the three worst. We compared the objective evaluation metrics in Tables 2–4 with the traffic sign recognition results in Tables 6 and 7. Tables 2–4 show that CAP and NLD score worse on MSE, SSIM and PSNR on the real-world BeDDE dataset, while FFA-Net and our method score better. However, as shown in Table 6, CAP and NLD achieve better traffic sign recognition results than FFA-Net and our method on the synthetic hazy dataset. Table 7, in turn, shows that the recognition results of CAP and NLD are poor on the real-world dataset, which indicates that synthetic hazy images differ from real-world images and that evaluation metrics such as MSE, SSIM and PSNR cannot be relied on alone. Therefore, we calculated the values of VI and RI. In summary, dehazing methods with a high VI value achieve better traffic sign recognition results on the synthetic hazy dataset, while methods with a high RI value achieve better results on the real-world hazy dataset.

Since the dehazing method in this paper targets real-world hazy scenes in autonomous driving, we mainly analyze the traffic sign recognition results in Table 7. Traffic signs dehazed with traditional methods such as CAP, DCP and NLD are recognized relatively poorly. From Fig. 15, we can observe that CAP, DCP and NLD suffer from severe color distortions and artifacts in sky regions because of their underlying prior assumptions, which reduces the recognition of traffic signs. In contrast, FFA-Net and GCA-Net recover images with excessive brightness. FFA-Net achieves higher PSNR and SSIM values on BEDDE, but its dehazed images still show obvious color distortion, and GCA-Net handles high-frequency details such as textures, edges and the blue sky unsatisfactorily. The results of AOD-Net are dim and retain residual haze: AOD-Net cannot remove the haze completely and tends to output low-brightness images, possibly because it is a light-weight CNN and cannot extract enough valid information. Dehaze-Net and MSCNN rely on a CNN model to estimate the atmospheric light and transmission map for dehazing, but an inaccurate transmission map leaves residual haze in their results.
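The link between an inaccurate transmission map and residual haze can be made explicit with the standard atmospheric scattering model. The notation is assumed here and may differ from the symbols used elsewhere in the paper: I is the hazy image, J the scene radiance, A the global atmospheric light, t the true transmission and \hat{t} its estimate. A is taken to be estimated exactly.

```latex
\begin{align}
I(x) &= J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr), \\
\hat{J}(x) &= \frac{I(x) - A}{\hat{t}(x)} + A
            = J(x)\,\frac{t(x)}{\hat{t}(x)}
              + A\left(1 - \frac{t(x)}{\hat{t}(x)}\right).
\end{align}
```

The restored image \hat{J} therefore has the same form as a hazy image with effective transmission t/\hat{t}. When the transmission is overestimated (\hat{t} > t), this ratio is below 1 and part of the airlight A remains as residual haze. When it is underestimated, the ratio exceeds 1 and contrast is over-amplified.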

In our method, multi-scale convolution is used to extract features, and the residual module compensates for the information lost during convolution. Experimental results show that our method preserves the edge features of traffic signs, corrects color distortion and improves image contrast. Moreover, the light-weight network designed in this paper takes only 54 ms to dehaze a single image, which meets real-time requirements. Therefore, compared with the other methods, our dehazing method is better suited as image preprocessing for traffic sign recognition in hazy weather, improving the safety of driving assistance and autonomous driving systems.
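To put the 54 ms figure in context, the per-image running times reported in Table 5 can be converted to frames per second. This is a minimal sketch. The 15 fps budget is a hypothetical threshold chosen only for illustration, since the paper does not state a numeric real-time requirement, and the function names are likewise illustrative.

```python
# Average running time per image in seconds, copied from Table 5.
RUNTIME_S = {
    "CAP": 0.790,
    "DCP": 4.780,
    "NLD": 6.120,
    "Dehaze-Net": 2.140,
    "MSCNN": 1.750,
    "FFA-Net": 0.265,
    "CGA-Net": 0.030,
    "AOD-Net": 0.038,
    "Ours": 0.054,
}

# Hypothetical frame-rate budget for the dehazing stage (an assumption).
ASSUMED_BUDGET_FPS = 15.0


def frames_per_second(seconds_per_image):
    """Convert a per-image running time into a throughput in frames/s."""
    return 1.0 / seconds_per_image


def methods_within_budget(runtimes, budget_fps):
    """Return the methods whose throughput reaches the given budget."""
    return [
        name
        for name, seconds in runtimes.items()
        if frames_per_second(seconds) >= budget_fps
    ]


if __name__ == "__main__":
    for name, seconds in RUNTIME_S.items():
        print(f"{name:>10}: {frames_per_second(seconds):6.1f} fps")
    print(methods_within_budget(RUNTIME_S, ASSUMED_BUDGET_FPS))
```

Note that Table 5 mixes Matlab and PyTorch implementations, so the figures are indicative rather than a like-for-like comparison, and they cover the dehazing stage only, not the subsequent detection step.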

6 Conclusion

In this paper, we introduce the idea of residual learning and propose a dehazing network containing multi-scale residual modules. Multi-scale convolution is used to extract features, and the residual module compensates for the information lost during convolution, yielding an end-to-end multi-scale feature fusion method. Experiments show that our method better preserves the edge characteristics of traffic signs after dehazing, corrects color distortion and improves image contrast. We also tie the method to the real-world application of traffic sign recognition. Using MSE, PSNR, SSIM, VI and RI as objective evaluation metrics, we ran experiments on the real-world hazy images of BEDDE and compared our method with 8 state-of-the-art dehazing methods to verify its effectiveness. More importantly, we dehazed the traffic sign images of the synthetic and real-world hazy testing datasets, applied YOLOv5 as the detection tool, and selected Recall, Precision and mAP as evaluation metrics to compare the recognition results with those of the 8 state-of-the-art dehazing methods. Because traffic sign recognition has strict real-time requirements, we also compared the running time of our method with that of the existing dehazing methods while testing recognition performance in hazy weather, which demonstrates the superiority of our method.

7 Discussion

We proposed a light-weight dehazing network with multi-scale residual blocks that improves traffic sign recognition in real-world hazy weather. To demonstrate the superiority of our method, we compared the MSE, SSIM and PSNR of 8 state-of-the-art dehazing methods with those of our method. However, these metrics were designed to evaluate general image distortions such as noise and blur, and their effectiveness for evaluating dehazing has not been verified. Therefore, VI and RI were also used in this paper to evaluate the dehazed images. Moreover, since our dehazing method is specifically intended for traffic sign recognition, we believe that the most direct evaluation metric is the traffic sign recognition result itself.

We first tested on the synthetic hazy traffic sign dataset and found that the traditional dehazing methods outperformed the deep learning dehazing methods in traffic sign recognition. This contradicts the MSE, PSNR, SSIM, VI and RI values calculated on the BEDDE dataset, which indicate that deep learning methods are better than traditional dehazing methods. We speculate that haze synthesis and the traditional dehazing methods rest on similar principles, which is why the traditional methods perform well on the synthetic hazy traffic sign dataset. No public dataset of real-world hazy traffic signs is currently available. Therefore, we collected 100 real-world hazy traffic sign images as a testing dataset and repeated the comparative traffic sign recognition experiment. The results show that our method is superior to the other 8 state-of-the-art dehazing methods in recognizing real-world hazy traffic sign images. However, before the method can truly be applied in autonomous driving scenes, its ability to dehaze images with thick haze and the accuracy of traffic sign recognition still need to be improved.

Acknowledgments

Our deepest gratitude goes to the anonymous reviewers for their thoughtful suggestions, which have helped improve this paper substantially.

