Abstract
Hazy weather seriously impairs drivers’ sightlines and poses a significant safety hazard. This paper proposes a novel approach for recognizing speed limit signs in hazy weather. It consists of three major modules: haze removal, speed limit sign location, and sign recognition. For haze removal, the image is dehazed with the dark channel prior. The speed limit sign is located by Histogram of Oriented Gradients (HOG) feature extraction and Support Vector Machine (SVM) classification, and is recognized by a seven-layer Convolutional Neural Network (CNN). Experimental results show that the proposed method performs better than state-of-the-art dehazing methods and also reduces the processing time. The recognition rate for speed limit signs is 98.51%, which is better than human performance, and the classifier can recognize speed limit signs under rotation, shift, scale and other distortions.
Introduction
Traffic signs play an important role in providing information to drivers. In particular, the speed limit sign tells drivers the speed restriction [1]. Recognizing traffic signs correctly is therefore important for people’s safety. However, it is hard to recognize signs precisely by eye alone. In addition, hazy weather has occurred frequently in China in recent years and seriously impairs drivers’ sightlines. Automatic traffic sign recognition (TSR) in hazy weather is therefore necessary. TSR is a part of the intelligent transport system (ITS); it can help drivers recognize signs on the road and prevent accidents.
The essence of haze removal is to enhance images and improve contrast. Multi-Scale Retinex (MSR) is a simple image enhancement method. The main idea of MSR [2, 3] is that the color of an object is determined by its ability to reflect light and does not depend on the absolute intensity of the light source. MSR treats the image to be enhanced as having two components: an incident component and a reflective component. Convolution and logarithm operations reduce or even remove the influence of the incident component, leaving the reflective component, which is the enhanced image. MSR has good color fidelity but handles details poorly. Kaiming He et al. [4–6] proposed an effective image prior, the dark channel prior, to remove haze from a single input image. This method estimates the thickness of the haze and removes it by using the dark channel prior. It preserves edge information well but is time-consuming. Moreover, if the input image contains large sky or highlight areas, the output image may be distorted in these areas.
In the area of sign location and recognition, Akatsuka and Imai [7] made the first attempt to build a real-time system. Many approaches have been proposed since then. Jack Greenhalgh et al. [8] proposed a cascade of SVM classifiers trained on HOG features to locate and recognize signs. This method is very fast, but its recognition accuracy is relatively low. Fatin Zaklouta et al. [9] used K-d trees and Random Forests to recognize traffic signs. Their method achieves a 95.95% recognition rate, but it may over-fit on classes with noisy samples. Thus, this method is not suitable for practical application.
This paper proposes an effective method to recognize the speed limit sign in hazy weather. It can remove haze effectively and recognize the signs correctly. At the same time, it is not sensitive to noise, rotation and translation.
The remainder of this paper is organized as follows: Section 2 describes the complete process of the proposed method. Section 3 compares the proposed method with state-of-the-art methods and presents the experimental results. Finally, conclusions are drawn in Section 4.
Speed limit sign recognition system
This paper focuses on recognizing the speed limit sign in hazy weather. Figure 1 shows some speed limit signs.

Speed limit signs.
In order to recognize these signs, this paper proposes a recognition system. An overview of this system is shown in Fig. 2.

The overview of sign recognition system.
Firstly, haze in the frame extracted from the input video is removed by single image dehazing with the dark channel prior. Secondly, the red areas are separated out by color segmentation in HSV space, and the circular degree is then used to find candidate sign areas. Thirdly, the HOG features of these areas are extracted and classified by the trained SVM classifier to decide whether they contain a speed limit sign. Finally, the speed limit sign is recognized with a trained seven-layer CNN.
Dehazing has a great influence on sign segmentation. In this paper, image dehazing with the dark channel prior is used to dehaze the image. The contribution in this part is to improve the calculation of the transmission used to recover the dehazed image from the hazy one. There are three improvements: adding a parameter to make the dehazed image look more natural, downsampling the original image to reduce the cost of calculating the transmission, and introducing the largest atmospheric light to obtain the atmospheric light.
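In computer vision and computer graphics, a simplified atmospheric scattering model is widely used to explain the formation of a hazy image. This model is as follows [10]:

$$I(x) = J(x)\,t(x) + A\,(1 - t(x)) \qquad (1)$$

where $I$ is the observed hazy image, $J$ is the haze-free image to be recovered, $A$ is the global atmospheric light and $t$ is the transmission.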
Obviously, the model does not have a unique solution, so some prior knowledge is necessary. Kaiming He et al. [4] proposed an effective image prior named the dark channel prior. This prior states that in most patches that do not contain sky, some pixels have very low intensity in at least one color channel. In other words, the minimum intensity in such a patch is very low. For an arbitrary image $J$, its dark channel $J^{dark}$ is defined as Equation (2):

$$J^{dark}(x) = \min_{y \in \Omega(x)} \Big( \min_{c \in \{r, g, b\}} J^{c}(y) \Big) \qquad (2)$$

where $J^{c}$ is a color channel of $J$ and $\Omega(x)$ is a local patch centered at $x$.

Images and their dark channels. (a) and (c) Original images with and without haze respectively. (b) and (d) The corresponding dark channels.
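As an illustration of Equation (2), the dark channel can be computed with a minimal Python sketch. The function name `dark_channel`, the representation of the image as nested lists of (r, g, b) tuples and the default patch size are assumptions made for this example; they are not taken from the original implementation, which was written in Matlab.

```python
def dark_channel(image, patch=3):
    """Return the dark channel of an RGB image given as rows of (r, g, b) tuples."""
    height, width = len(image), len(image[0])
    # Per-pixel minimum over the three color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    radius = patch // 2  # patch size is an assumed parameter
    dark = []
    for y in range(height):
        row = []
        for x in range(width):
            # Minimum over the square patch centered at (x, y), clipped at the borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            row.append(min(min_rgb[j][i] for j in ys for i in xs))
        dark.append(row)
    return dark


# A 2x2 toy image; most pixels have at least one low color channel.
toy = [[(200, 30, 40), (180, 190, 20)],
       [(210, 220, 230), (10, 120, 90)]]
print(dark_channel(toy, patch=1))  # channel minimum of each pixel
print(dark_channel(toy, patch=3))  # minimum over the whole toy image
```

Plain lists are used only to keep the sketch free of dependencies; an array library would be the more natural choice for full-size frames.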
In order to calculate the transmission from the hazy image to the dehazed image, for each color channel, Equation (1) is normalized by $A^c$:
Substituting Equation (5) into Equation (4) gives the transmission.
A parameter ω in the range from 0 to 1 is added here, because the image may look unnatural if all haze is removed [11]. In this paper, ω is 0.95.
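The derivation referred to in this passage can be sketched as follows, following the standard dark channel prior derivation of He et al. [4]. The equation numbers (3) to (6) are inferred from the references in the text, and the symbol $\tilde{t}(x)$ for the patch-wise transmission is an assumption of this sketch.

```latex
% (3) The scattering model divided by A^c for each color channel c
\frac{I^{c}(x)}{A^{c}} = t(x)\,\frac{J^{c}(x)}{A^{c}} + 1 - t(x)

% (4) Minimum over the color channels and over the local patch \Omega(x),
%     assuming the transmission \tilde{t}(x) is constant within the patch
\min_{y \in \Omega(x)} \min_{c} \frac{I^{c}(y)}{A^{c}}
  = \tilde{t}(x) \min_{y \in \Omega(x)} \min_{c} \frac{J^{c}(y)}{A^{c}} + 1 - \tilde{t}(x)

% (5) Dark channel prior: the dark channel of the haze-free image tends to zero
J^{dark}(x) = \min_{y \in \Omega(x)} \min_{c} J^{c}(y) \rightarrow 0

% (6) Transmission, with the parameter \omega that keeps a small amount of haze
\tilde{t}(x) = 1 - \omega \min_{y \in \Omega(x)} \min_{c} \frac{I^{c}(y)}{A^{c}}
```

With ω = 0.95, a patch whose normalized dark channel equals 0.5 receives a transmission of 1 − 0.95 × 0.5 = 0.525.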
Calculating the transmission
The reasoning above assumes that the atmospheric light A is known. In practice, it can be obtained from the hazy image by means of the dark channel. The method proposed in [4] uses the following steps: (i) pick the top 0.1% brightest pixels in the dark channel; (ii) among these positions, find the pixel with the highest intensity in the hazy image and select its value as the atmospheric light A.
Here, the atmospheric light A is the intensity of a single pixel, so A may tend to 255 in each color channel, which causes many color spots and a color cast after processing. Moreover, if the hazy image has large sky areas or other bright areas, A will be taken from a pixel in these areas, which is not what is expected, and obvious transition regions may appear in these areas after processing [12]. This paper proposes a new method with a parameter called the largest atmospheric light $A_{max}$ to obtain the atmospheric light A. The improved steps are as follows: (i) pick the top 0.1% brightest pixels in the dark channel; (ii) calculate the average value of these pixels in the hazy image, denoted $A_{avg}$; (iii) if $A_{avg}$ is less than $A_{max}$, the atmospheric light A is $A_{avg}$; otherwise A is $A_{max}$.
Because A is the average over the top 0.1% brightest dark-channel pixels in the hazy image, it does not tend to 255 in each color channel. $A_{max}$ is a statistical value. To obtain it, 500 images of different scenes with large bright sky areas were chosen and their sky areas were extracted. The dark channels of these sky areas were then computed and their atmospheric light values were calculated following steps (i) and (ii) above. From these statistics, $A_{max}$ is set to 220. It prevents the atmospheric light from being the intensity of a pixel in the sky areas.
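The capped estimate of the atmospheric light can be sketched in Python as follows. This is only an illustration: the function name `atmospheric_light`, the list-based image representation and the choice to average each color channel separately are assumptions, while the 0.1% fraction and the cap of 220 come from the text.

```python
def atmospheric_light(image, dark, a_max=220, top_fraction=0.001):
    """Estimate the atmospheric light per channel, capped at a_max.

    image: rows of (r, g, b) tuples; dark: the dark channel of the same size.
    """
    height, width = len(image), len(image[0])
    coords = [(y, x) for y in range(height) for x in range(width)]
    # Step (i): positions of the brightest dark-channel pixels, brightest first.
    coords.sort(key=lambda p: dark[p[0]][p[1]], reverse=True)
    count = max(1, int(len(coords) * top_fraction))
    selected = coords[:count]
    light = []
    for c in range(3):
        # Step (ii): A_avg is the mean hazy-image intensity at those positions
        # (averaged per channel here, which is an assumption).
        a_avg = sum(image[y][x][c] for y, x in selected) / count
        # Step (iii): use A_avg if it is below A_max, otherwise A_max.
        light.append(min(a_avg, a_max))
    return light


# Toy example: the brightest dark-channel pixel exceeds the cap in every channel.
sky_image = [[(250, 240, 230), (100, 90, 80)]]
sky_dark = [[230, 80]]
print(atmospheric_light(sky_image, sky_dark))
```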
When the transmission $t(x)$ is very low, $J(x)$ may tend to 255 and the image tends to white. Thus, a lower threshold $t_0$ is needed: when $t(x)$ is less than $t_0$, $t(x)$ is set to $t_0$. In this paper, $t_0 = 0.1$. The final recovery equation is as follows:

$$J(x) = \frac{I(x) - A}{\max(t(x), t_0)} + A \qquad (7)$$
The image without haze can be obtained from Equation (7).
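The transmission estimate with ω and the recovery step with the threshold $t_0$ can be illustrated by the following Python sketch. The function names, the patch size and the final clipping of the result to the range 0 to 255 are assumptions of this example; ω = 0.95 and $t_0$ = 0.1 are the values stated in the text.

```python
def estimate_transmission(image, light, omega=0.95, patch=3):
    """t(x) = 1 - omega * min over the patch and channels of I^c / A^c."""
    height, width = len(image), len(image[0])
    # Channel-wise minimum of the image normalized by the atmospheric light.
    norm_min = [[min(p[c] / light[c] for c in range(3)) for p in row]
                for row in image]
    r = patch // 2  # patch size is an assumed parameter
    trans = []
    for y in range(height):
        row = []
        for x in range(width):
            local = min(norm_min[j][i]
                        for j in range(max(0, y - r), min(height, y + r + 1))
                        for i in range(max(0, x - r), min(width, x + r + 1)))
            row.append(1.0 - omega * local)
        trans.append(row)
    return trans


def recover(image, trans, light, t0=0.1):
    """J(x) = (I(x) - A) / max(t(x), t0) + A, clipped to [0, 255] (assumption)."""
    result = []
    for img_row, t_row in zip(image, trans):
        out_row = []
        for pixel, t in zip(img_row, t_row):
            t = max(t, t0)  # lower bound on the transmission
            out_row.append(tuple(
                min(255.0, max(0.0, (pixel[c] - light[c]) / t + light[c]))
                for c in range(3)))
        result.append(out_row)
    return result


# One gray hazy pixel under an atmospheric light of 220 in every channel.
hazy = [[(110, 110, 110)]]
A = [220, 220, 220]
t = estimate_transmission(hazy, A, patch=1)
print(t, recover(hazy, t, A))
```

In this toy case the normalized intensity is 0.5, so the transmission is 0.525 and the recovered pixel is much darker than the hazy input, as expected when a veil of haze is removed.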
The next step after dehazing is to locate the speed limit sign in the image. The main steps are as follows:
(i) Convert the image from RGB to HSV color space.
(ii) Segment red areas by threshold segmentation in HSV color space, as shown in Fig. 4(b).
(iii) Use the Canny operator to extract edges, as shown in Fig. 4(c).
(iv) Apply the morphological closing operation.
(v) Extract connected domains, as shown in Fig. 4(d).
(vi) Fill the holes in the connected domains, as shown in Fig. 4(e).
(vii) Calculate the circular degree of each connected domain.
(viii) If the circular degree is greater than the threshold, the area is regarded as circular, as shown in Fig. 4(f) (the area is marked in yellow), and its HOG features are extracted.
(ix) Use the trained SVM classifier to judge whether the circular area contains a speed limit sign, as shown in Fig. 4(g) (the area is marked with a green rectangle).

The location of the speed limit sign. (a) Original image. (b) Red areas after color segmentation. (c) Extracted edges. (d) Connected domains. (e) Fill connected domains with holes. (f) The circular areas found by calculating the circular degree. (g) Mark the area with speed limit sign.
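Two of the location steps, the red test in HSV space and the circular degree, can be sketched in Python as follows. The paper does not give its HSV thresholds or its exact definition of the circular degree, so the threshold values and the common circularity measure 4πA/P² used here are assumptions made for illustration.

```python
import colorsys
import math


def is_red(pixel, sat_min=0.4, val_min=0.2, hue_tol=20 / 360):
    """Return True if an (r, g, b) pixel in 0..255 looks red in HSV space.

    All three thresholds are assumed values, not those of the paper.
    """
    h, s, v = colorsys.rgb_to_hsv(*(c / 255.0 for c in pixel))
    # Red wraps around hue 0, so accept hues close to 0 or close to 1.
    near_red = h <= hue_tol or h >= 1.0 - hue_tol
    return near_red and s >= sat_min and v >= val_min


def circular_degree(area, perimeter):
    """Circularity 4*pi*A / P^2: 1.0 for a perfect circle, lower otherwise."""
    return 4.0 * math.pi * area / (perimeter ** 2)


radius = 10.0
circle = circular_degree(math.pi * radius ** 2, 2 * math.pi * radius)
square = circular_degree(10.0 * 10.0, 4 * 10.0)
print(is_red((200, 20, 30)), is_red((30, 200, 20)), circle, square)
```

Under this measure a square scores π/4, about 0.785, so a threshold between that value and 1 would separate circular sign rims from square regions.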
The HOG feature is invariant to geometric and optical deformation. HOG features with an SVM classifier are widely used in pedestrian detection [13] and are also useful for detecting traffic signs [14]. Some systems use HOG features to describe the speed limit sign and obtain good results [15, 16]. This paper therefore also uses HOG features to describe the speed limit sign. The candidate areas are resized to 32×32 pixels, and their HOG features are extracted with a cell size of 4×4 pixels and a block size of 4×4 cells. Thus, the HOG feature vector of each 32×32 pixel candidate area has 1296 dimensions. The SVM classifier is built from 527 positive samples containing speed limit signs and 1071 negative samples. Before training, these positive and negative samples are randomly divided into two parts, one for training and the other for testing. Each sample is resized to 32×32 pixels and its HOG features are extracted. The accuracy of the SVM classifier is 98.8962%. With this classifier, the speed limit sign can be located correctly in the image.
The last step in the proposed system is to recognize the speed limit sign located in the previous step. In this paper, a trained seven-layer CNN is adopted to recognize the sign.
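The stated feature length of 1296 can be checked with a short calculation. The window, cell and block sizes below come from the text; the block stride of 2 cells and the 9 orientation bins are assumptions, chosen because they are common HOG settings and reproduce the stated dimension.

```python
def hog_length(window=32, cell=4, block_cells=4, stride_cells=2, bins=9):
    """Length of a HOG descriptor for a square window.

    stride_cells and bins are assumed values; the paper does not state them.
    """
    cells_per_side = window // cell  # 32 / 4 = 8 cells per side
    # Number of block positions per side when sliding by stride_cells.
    blocks_per_side = (cells_per_side - block_cells) // stride_cells + 1  # 3
    cells_per_block = block_cells ** 2  # 16
    return blocks_per_side ** 2 * cells_per_block * bins


print(hog_length())  # 3 * 3 blocks * 16 cells * 9 bins = 1296
```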
The traditional recognition algorithm usually consists of three steps: pre-processing, feature extraction and recognition. The most important step is feature extraction. The extracted features affect the accuracy of recognition directly.
In 1958, David Hubel and Torsten Wiesel found a kind of neuron called the orientation selective cell. These neurons become active when the eye sees the edge of an object and the edge points in a certain direction. They also proposed the concept of the receptive field [17]. This finding contributed to the breakthrough in artificial intelligence some forty years later, from which the deep neural network emerged.
The deep neural network does not need manually extracted features; it simulates the way the human brain analyzes, learns and explains data. Thus, it can learn features at the levels of pixels, edges, object parts and object models. This removes the need for manual feature extraction and helps ensure the effectiveness of the classifier.
The CNN is a special deep neural network model. Its particularity lies in two aspects: its neurons are not fully connected, and some connections between neurons in the same layer share weights. CNNs have proved useful in handwriting recognition and are also suitable for traffic sign recognition [18–21]. Figure 5 shows the architecture of the seven-layer CNN used in this paper.

Architecture of CNN.
This CNN consists of two pairs of convolutional and subsampling layers. In Fig. 5, the rectangles C1, C3 and C5 represent the convolutional layers and the rectangles S2 and S4 represent the subsampling layers; each layer only receives connections from its previous layer. With this structure, the feature vector is extracted from the raw pixel intensities of the input image. The last two layers, the rectangles F6 and F7, are fully connected layers that classify the input image.
The structure of the seven-layer CNN is shown in Table 1.
Structure of the seven layers CNN
The output feature maps of each convolutional layer are obtained by convolving the feature maps of the previous layer and passing the result through a nonlinear activation function. It can be described as follows:
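In the notation commonly used for convolutional networks, this operation can be written as below. The symbols are assumptions of this sketch and may differ from those of the original equation: $x_j^{l}$ is the $j$-th output map of layer $l$, $M_j$ is the set of input maps connected to it, $k_{ij}^{l}$ is a convolution kernel, $b_j^{l}$ is a bias and $f$ is the nonlinear activation function.

```latex
x_j^{l} = f\Big( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l} \Big)
```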
For a subsampling layer, there are N output maps if there are N input maps, but each output map is smaller than its input map. Usually, the subsampling function calculates the sum over the different n×n patches of the input map. The result then passes through a sigmoid activation function, which makes the feature maps shift-invariant.
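A subsampling layer of this kind can be sketched in Python for a single feature map. The function name and the optional `weight` and `bias` parameters are assumptions of this example; the summation over non-overlapping n×n patches followed by a sigmoid follows the description above.

```python
import math


def subsample(feature_map, n=2, weight=1.0, bias=0.0):
    """Sum each non-overlapping n x n patch, then apply a sigmoid.

    weight and bias are assumed trainable coefficients; with the defaults the
    layer reduces to sigmoid(sum of the patch).
    """
    height, width = len(feature_map), len(feature_map[0])
    out = []
    for y in range(0, height - n + 1, n):
        row = []
        for x in range(0, width - n + 1, n):
            total = sum(feature_map[y + dy][x + dx]
                        for dy in range(n) for dx in range(n))
            row.append(1.0 / (1.0 + math.exp(-(weight * total + bias))))
        out.append(row)
    return out


# A 4x4 map of zeros becomes a 2x2 map; sigmoid(0) is 0.5.
zeros = [[0.0] * 4 for _ in range(4)]
print(subsample(zeros))
```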
Classification layer
Each feature map of the last convolutional layer is subsampled to one pixel. The last fully connected layer has one output unit per class in the recognition task. This process is similar to the traditional back-propagation (BP) network.
The training algorithm is similar to that of the BP network. The CNN can recognize images with rotation, shift, scale and other distortions, so it can be used to recognize traffic signs in real conditions.
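How a 32×32 input can shrink to one pixel per feature map by layer C5 is illustrated by the following size calculation. The 5×5 kernels and 2×2 subsampling windows are assumptions borrowed from the classic LeNet-5 layout; the actual layer parameters of this paper are those listed in Table 1.

```python
def conv_size(size, kernel):
    """Side length after a convolution without padding and with stride 1."""
    return size - kernel + 1


def pool_size(size, window):
    """Side length after non-overlapping subsampling."""
    return size // window


def trace_sizes(input_size=32, kernel=5, window=2):
    """Feature-map side length after each layer (kernel and window are assumed)."""
    sizes = {"input": input_size}
    sizes["C1"] = conv_size(sizes["input"], kernel)
    sizes["S2"] = pool_size(sizes["C1"], window)
    sizes["C3"] = conv_size(sizes["S2"], kernel)
    sizes["S4"] = pool_size(sizes["C3"], window)
    sizes["C5"] = conv_size(sizes["S4"], kernel)
    return sizes


print(trace_sizes())
```

Under these assumptions the side lengths are 32, 28, 14, 10, 5 and 1, so each C5 feature map is a single value that feeds the fully connected layers F6 and F7.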
Experimental results
The proposed haze removal method is verified on two image databases. One is an image database collected by the authors, which consists of 200 hazy images without large-scale sky areas and 300 hazy images with large-scale sky areas. The other is the open traffic sign dataset GTSRB (German Traffic Sign Recognition Benchmark). The GTSRB dataset includes eight classes of speed limit signs and thirty-five classes of other traffic signs; each image contains a Region-of-Interest with only one traffic sign. The videos are taken by a camera mounted on a car in different conditions. The video frame rate is 20 frames per second. All experiments are run in MATLAB 2010a on a computer with a 2.2 GHz Intel Core i7 CPU.
Performance of haze removal
In Fig. 6, the proposed approach is compared with MSR and He’s work [4]. The top row shows the original hazy images; from left to right, the haze becomes gradually denser. The second row shows the results of MSR, the third row shows the results of He’s method, and the last row shows the results of the proposed method.
In Fig. 6, the results of MSR have good color fidelity but are blurred in details, and some areas are over-enhanced. The results do not look natural because MSR is not based on the atmospheric scattering model and cannot remove haze according to the depth of objects. He’s results look more natural and preserve details better than MSR. However, this method has a flaw in calculating the global atmospheric light: if the hazy image has large sky areas, the results have spots in these areas and halos in other areas. The proposed method improves on He’s method when the hazy image has large sky areas. It prevents the global atmospheric light from tending to 255 in each color channel and from accidentally falling into the sky area. Its results are better than He’s results in sky areas and have no halos around objects.

Haze removal results. Top: original images. Second: MSR results. Third: He’s results. Bottom: This paper’s results.
In order to improve the processing speed, the proposed method downsamples the input image before calculating the transmission map. The transmission map is then upsampled to the original size by linear interpolation. This may slightly affect the result, but the processing is faster. The results of the proposed method with and without downsampling in calculating the transmission map are shown in Fig. 7.

Comparison between the method with and without downsampling in calculating the transmission map. (a) Original images. (b) Results of the method without downsampling. (c) Results of the method with downsampling in calculating the transmission map.
Here, the two kinds of results are almost the same in subjective perception. Downsampling reduces the precision of the transmission map, but it has little influence on the dehazing effect, and most importantly the processing speed improves dramatically. To compare the processing time of MSR, He’s method and the proposed method without and with downsampling, five video clips are randomly chosen; each clip is 3 seconds long at 20 frames per second. Table 2 shows the comparative results. The proposed method with downsampling in calculating the transmission map is almost three times faster than the method without downsampling.
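The downsample, estimate and upsample scheme can be sketched in Python as follows. The function names, the downsampling factor of 2 and the simple decimation used for downsampling are assumptions; the paper states only that the transmission map is computed on a downsampled image and restored to the original size by linear interpolation. The transmission estimator is passed in as a function so that the sketch stays independent of any particular implementation.

```python
def downsample(grid, factor):
    """Keep every factor-th sample in both directions (simple decimation)."""
    return [row[::factor] for row in grid[::factor]]


def upsample_linear(grid, out_h, out_w):
    """Bilinear interpolation of a 2-D grid of numbers to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for y in range(out_h):
        # Map the output row index to a fractional input row index.
        fy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def fast_transmission(image, estimate, factor=2):
    """Estimate the transmission on a smaller image, then restore its size."""
    small = downsample(image, factor)
    coarse = estimate(small)  # any function returning a 2-D transmission map
    return upsample_linear(coarse, len(image), len(image[0]))


print(upsample_linear([[0.0, 1.0]], 1, 3))
```

With a factor of 2 the estimator processes a quarter of the pixels, which is where the time saving comes from.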
Processing time of haze removal method
In this paper, a seven-layer convolutional neural network (CNN) is used to recognize the speed limit sign. The GTSRB dataset is used as the source of samples to train this CNN. The training set contains 12906 positive samples and 34057 negative samples. These samples differ in degree of occlusion, lighting conditions and scale. Figure 8 shows some samples.

Training samples. (a) Positive samples. (b) Negative samples.
Before training, the samples are normalized to 32×32 pixels. This paper compares the effects of different preprocessing methods (none, image adjustment, histogram equalization and adaptive histogram equalization). The test samples are randomly divided into five groups; each group includes 4690 test samples. The accuracy of the proposed method is evaluated with precision defined as follows, and Table 3 shows the comparative results:
Recognition accuracy using different preprocessing methods
According to Table 3, the preprocessing method influences the recognition accuracy. Adaptive histogram equalization performs better than the other preprocessing methods, with a recognition accuracy of 98.51%. Therefore, adaptive histogram equalization is applied before training the CNN.
The CNN classifier is compared with the SVM classifier proposed in [8] and the K-d trees and Random Forests proposed in [9]. Also, the human performance is compared with these methods. The comparative results are shown in Table 4.
Average accuracy and computing time of SVM and AdaBoost
As Table 4 shows, the CNN performs much better than the SVM and Random Forests classifiers. SVM and Random Forests rely on manually extracted features, and there is no guarantee that these features represent objects well or meet the requirements of all cases. The CNN, however, learns features by itself and extracts higher-level features in each successive convolutional layer. Thus, given enough training samples, the CNN can recognize the object in most cases. The recognition accuracy of the CNN is also better than human performance. Therefore, the CNN classifier can help human beings recognize traffic signs.
Figure 9 shows the performance of the system on real traffic video. The videos are taken by a camera on a car. There is a little haze in the images. The speed limit sign is enclosed by a green rectangle and labeled with red warning text.

Performance of the recognition system. (a, b, c, d) Performance of the recognition system in hazy weather. (e, f) Performance of the recognition system in sunny weather.
According to Fig. 9, the method proposed in this paper can recognize the speed limit sign correctly in different conditions. In hazy weather, the method removes haze effectively, which allows the subsequent steps to work well. Therefore, the method is suitable for recognizing the speed limit sign in hazy weather.
Conclusions and future work
This paper proposes an efficient method of speed limit sign recognition in hazy weather. The method improves single image dehazing with the dark channel prior and uses HOG features with an SVM classifier to detect areas containing signs. A seven-layer convolutional neural network is used to recognize the sign.
In the dehazing step, this method adds a parameter to the transmission of the dehazing equation to make the output image more natural, without halos and distortions in sky areas. Downsampling is also performed before calculating the transmission map, which greatly reduces the computing time while preserving the quality of dehazing. In addition, the largest atmospheric light is proposed to obtain the atmospheric light. In the recognition step, this paper builds a seven-layer CNN to recognize the sign. The recognition accuracy of the CNN is up to 98.51%, which is much higher than that of the SVM classifier, Random Forests and human performance. The CNN learns features automatically and extracts higher-level features in each successive convolutional layer, which enables it to recognize the sign in different environments.
In future work, the quality of dense haze removal needs to be improved, because dense haze occurs frequently. An adaptive dehazing model will be built with big data and deep learning algorithms in order to obtain a largest atmospheric light $A_{max}$ that suits different haze conditions. The processing time and robustness of the method can also be improved to make it more suitable for real-life use.
Acknowledgments
This work was supported by Tianjin Sci-tech Planning Projects (Grant No. 14RCGFGX00846), the Natural Science Foundation of Hebei Province, China (Grant No. F2015202239) and Tianjin Sci-tech Planning Projects (Grant No. 15ZCZDNC00130).
