Abstract
Low-illumination image restoration is widely used in many fields. To address the low resolution and noise amplification of images captured in low-light environments, this paper applies the style transfer of CycleGAN (Cycle-Consistent Generative Adversarial Networks) to low-illumination image enhancement. In the generator, convolution kernels of different sizes extract features along three paths, and a deep residual shrinkage network suppresses the noise after convolution. The identity loss of CycleGAN corrects the color deviation of the image. In the discriminator, different convolution kernels extract image features along two paths. The proposed CycleGAN based on multi-scale deep residual shrinkage is trained and tested on the LOL dataset and the Brightening dataset and compared with Deep-Retinex, GLAD, KinD and other networks. It achieves PSNR = 24.62, NIQE = 4.9856 and SSIM = 0.8628 on the LOL dataset, and PSNR = 27.85, NIQE = 4.7652 and SSIM = 0.8753 on the Brightening dataset. Both the visual results and the objective metrics indicate that the proposed method performs well in low-illumination enhancement, detail recovery and denoising.
Introduction
With the rapid development of artificial intelligence, artificial neural networks have come to play an important role in many fields. A well-designed network structure can solve, or improve on existing solutions to, many difficult and complex problems, including tasks that previously required a great deal of human labor. In image processing [20], which covers image classification, image recognition, image segmentation, image super-resolution reconstruction, image restoration and other tasks, the development of deep learning allows researchers to combine traditional algorithms and ideas into networks that handle these problems well.
Image restoration is an area of image processing in which the quality of a result is difficult to judge, because there is no fully accurate reference or index for deciding whether a result is an image that most people would accept as ideal. It nevertheless remains a very active topic, because image quality is crucial to researchers who work with images. There are many ways to improve image quality. Better acquisition technology and equipment can improve it considerably, but many external factors, such as illumination, random ambient noise and relative motion during imaging, still cause deviation and distortion between the observed image and the real scene. The quality of the image therefore also needs to be improved through subsequent image processing.
Deep learning has reshaped the development of low-illumination image enhancement technology. After comparing traditional image quality improvement techniques with deep learning networks, researchers have gradually recognized the advantages of deep learning. Traditional methods for low-illumination image enhancement are single-purpose, and their enhancement results do not scale well. Deep learning, on the other hand, can integrate many processing steps and enhance images at multiple scales. Because artificial neural networks are highly extensible and adaptable, and because their training involves randomness, the result of the image processing is hard to predict. This unpredictability makes researchers all the more curious about what kinds of networks can be constructed to produce more desirable results.
Low-illumination image restoration is of great research significance and also has a very wide range of practical applications. In video surveillance, for example, it can improve the quality of footage captured in poor light; in medical imaging, it can improve image sharpness and aid medical diagnosis; and in face recognition, it can improve recognition accuracy. The processing ability and diversity of artificial neural networks are developing rapidly, and low-illumination image processing is becoming more comprehensive and effective.
In the development of low-illumination image restoration, many methods have been used to improve brightness, such as histogram equalization, gray-level transformation of pixel values, illumination estimation and brightness enhancement algorithms. One of the most influential theories is Retinex, and artificial neural networks are still built on Retinex ideas and methods today. For denoising, the traditional mean filter and the more recent Block-Matching and 3D filtering (BM3D) algorithm [5] are commonly used. These denoising algorithms provide strong guidance for the construction of artificial neural networks, and integrating them into a network has produced remarkable results.
Related work
Many researchers at home and abroad have contributed to low-illumination enhancement methods based on deep learning. Networks built for illumination estimation treat illumination information as the key quantity to study, and this is the main research direction for restoring image brightness. Using extracted illumination features as the input of a neural network can effectively improve image brightness and correct chromatic aberration and white balance [21]. Low-illumination enhancement improves image quality more comprehensively, addressing brightness, contrast, noise, artifacts and other factors. Because it is a continuation of illumination estimation, a low-illumination enhancement network either includes an illumination-estimation component in training or uses end-to-end training to improve the quality of low-illumination images.
In 2017, Kin Gwn Lore, Adedotun Akintayo et al. proposed using stacked autoencoders to denoise and enhance synthetic datasets. They simulated a low-light environment by applying gamma correction and Gaussian noise to synthesize images, and then used LLNet (a deep autoencoder approach to natural low-light image enhancement) [6] to learn contrast enhancement and denoising. Experiments with the trained model on real scene images demonstrated the effectiveness of the network. In 2018, Jianrui Cai, Shuhang Gu and other researchers published a method for single-image contrast enhancement [7], which applied CNNs, as used for denoising, super-resolution and deblurring tasks, to contrast enhancement. That work constructs a large-scale multi-exposure image dataset and a two-stage enhancement model. The model extracts the low-frequency and high-frequency components of the image, enhances them separately, fuses the enhanced results as the input of the next enhancement network, and finally obtains the contrast-enhanced image. Deep Retinex Decomposition for Low-Light Enhancement [9] uses a CNN to decompose a low-illumination image into an illumination image and a reflectance image. The enhancement result is obtained by multiplying the illumination image by the reflectance image, and BM3D is used for denoising. In 2019, Zhang et al. constructed the KinD (Kindling the Darkness) low-light enhancement network [10], which is divided into a decomposition network, an enhancement network and a restoration network. Trained on the LOL dataset, it achieved very good results in visual, qualitative and quantitative experiments.
After collecting and analyzing the problems that arise in low-illumination image enhancement, we propose solutions for illumination enhancement, chromatic aberration and noise reduction. Uneven illumination and the enhancement of backlit images are also taken into account, so the model is also applied to the restoration of backlit images in the experiments.
Low illuminance image enhancement network based on CycleGAN
The design idea of CycleGAN
Generative Adversarial Networks (GAN) [11] are very effective learning networks in deep learning. Based on game theory, a GAN consists of two networks that compete against each other. It was originally devised to compensate for scarce data by creating additional data with the desired properties. The idea of adversarial generation is well worth studying, and GANs have shown superior performance in image reconstruction. Many GAN variants have since been derived to solve problems in various fields.
CycleGAN (Cycle-Consistent Generative Adversarial Networks) [1] is an image style transfer network that is essentially composed of two mirrored GANs arranged in a ring. The network takes two sets of images with different features as input, and its purpose is to transform an image with feature A into an image with feature B. For example, it can convert a photograph of a real landscape into a Van Gogh-style oil painting, or an image of a horse into a zebra. CycleGAN is flexible and inherits the advantages of GANs, generating very clear images after transformation and reconstruction. The cycle-consistency loss also provides stronger constraints and helps prevent vanishing gradients during training. Following CycleGAN’s idea, we treat low-illumination images as one image style and normal-illumination images as another. Converting a low-illumination image into a high-quality image under normal illumination is then similar to transforming night into day, and we expect training to yield a generator that performs this conversion. The resulting CycleGAN model structure for low-illumination image enhancement is shown in Fig. 1.

The two mirrored GANs in CycleGAN each learn one mapping. The first mapping G(A) converts the real low-illumination, low-quality image Real_A into the generated normal-illumination, high-quality image Fake_B, and the second mapping G(B) converts the real normal-illumination, high-quality image Real_B into the generated low-illumination, low-quality image Fake_A. Next, Fake_A is fed back into the network and converted by mapping G(A) into the reconstructed image Rec_B with normal light and high quality. Likewise, Fake_B is fed back and converted by mapping G(B) into the reconstructed image Rec_A with low light and low quality. The two mirrored networks improve the generators’ two mappings based on this loop and on feedback from the discriminators.
However, the general CycleGAN model cannot generate high-quality images for this transformation, and the results sometimes show a checkerboard effect, uneven illumination and other problems. Even so, the method performs the transformation we want. As shown in Fig. 2, we restore the low-illumination image to a normal image and then convert the normal-illumination image back to a low-illumination image.
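The data flow of the two cycles can be traced with a small sketch. The functions `g_a` and `g_b` below are toy stand-ins (a fixed gain and its inverse) for the trained generators G(A) and G(B), and the constant `GAIN` is a hypothetical value chosen only for illustration; the real mappings are convolutional networks.

```python
# Toy stand-ins for the two generators; the real G(A) and G(B) are CNNs.
GAIN = 4.0  # hypothetical brightening factor, for illustration only


def g_a(image):
    """Toy G(A): low illumination -> normal illumination."""
    return [min(1.0, p * GAIN) for p in image]


def g_b(image):
    """Toy G(B): normal illumination -> low illumination."""
    return [p / GAIN for p in image]


real_a = [0.02, 0.05, 0.10, 0.20]  # dark pixels, values in [0, 1]
real_b = [0.10, 0.40, 0.60, 0.90]  # normally lit pixels, values in [0, 1]

fake_b = g_a(real_a)  # Real_A -> Fake_B
rec_a = g_b(fake_b)   # Fake_B -> Rec_A, should resemble Real_A
fake_a = g_b(real_b)  # Real_B -> Fake_A
rec_b = g_a(fake_a)   # Fake_A -> Rec_B, should resemble Real_B
```

In this toy case the two mappings are exact inverses, so each reconstruction matches its input; during training, the gap between Rec_A and Real_A (and between Rec_B and Real_B) is what the cycle-consistency constraint penalizes.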

Build
The next step is to generate high-quality images. We ran a large number of experiments on the problems encountered and adopted targeted measures to solve them. For denoising, we borrow the idea of the deep residual shrinkage network [2]. We widen the network, use multiple paths to extract input features, build deep residual shrinkage blocks from residual modules, and suppress noise with soft thresholds that the network learns on its own. Dilated convolution [16] and upsampling are then used to remove artifacts and the checkerboard effect [17] and to enlarge the receptive field, so that images are generated smoothly and image quality improves. The results are shown in Fig. 3. Owing to the adversarial training of CycleGAN, the generator’s mappings gradually move toward producing high-quality images. To prevent color distortion of the generated image as far as possible, a content loss function is used as a constraint.
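To make the receptive-field argument concrete, the following minimal sketch computes the span covered by a dilated (atrous) kernel and applies a one-dimensional dilated convolution. The function names are illustrative and are not taken from the paper's implementation.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Valid (no padding) 1-D dilated cross-correlation."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


print(effective_kernel_size(3, 2))                      # 5
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], 2))  # [9, 12]
```

A 3-tap kernel with dilation 2 therefore covers the same span as a 5-tap kernel while keeping only three weights per axis, which is how dilation enlarges the receptive field without adding parameters.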

After the checkerboard effect appeared in the generated images, a dilated convolution and upsampling structure was added to the generator. This resolved the checkerboard effect and significantly improved image quality. Fig. 3 compares the generated results.
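One common explanation for checkerboard artifacts is uneven kernel overlap in transposed convolution. The text does not state which upsampling layer the baseline generator used, so the transposed convolution below is an assumption made purely for illustration. The sketch counts, in one dimension, how many kernel taps contribute to each output position.

```python
def transposed_conv_overlap(n_in, kernel, stride):
    """Number of kernel taps landing on each output of a 1-D transposed convolution."""
    out_len = (n_in - 1) * stride + kernel
    counts = [0] * out_len
    for i in range(n_in):
        for j in range(kernel):
            counts[i * stride + j] += 1
    return counts


def upsample_then_conv_overlap(n_in, kernel, scale):
    """Taps per output of a stride-1 valid convolution applied after nearest upsampling."""
    up_len = n_in * scale
    return [kernel] * (up_len - kernel + 1)


print(transposed_conv_overlap(4, 3, 2))     # [1, 1, 2, 1, 2, 1, 2, 1, 1]
print(upsample_then_conv_overlap(4, 3, 2))  # [3, 3, 3, 3, 3, 3]
```

With kernel size 3 and stride 2 the interior overlap alternates between 1 and 2, which can show up as a periodic pattern in the output. Upsampling first and then convolving with stride 1 gives every output position the same number of taps.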
To capture detailed information better, we use a multi-path method with convolution kernels of different dimensions and sizes to extract feature information, which reduces the amount of calculation and improves the calculation speed.
In network path 1, a dilated convolution kernel of size 3 × 3 × 64 with a dilation rate of 2 is used for initial feature extraction. Nine deep residual shrinkage blocks of size 3 × 3 × 64 then extract features, and the output X1 has dimension 64.
In network path 2, a dilated convolution kernel of size 3 × 3 × 64 with a dilation rate of 2 is used for initial feature extraction. Six deep residual shrinkage blocks of size 1 × 1 × 64 then extract features, and the output X2 has dimension 32.
In network path 3, a convolution kernel of size 3 × 3 × 32 is used, and the preliminary feature output X3 has dimension 32. Six deep residual shrinkage blocks of size 5 × 5 × 32 then extract features, and the output X4 has dimension 32.
The output X3 of the initial convolution of path 3 and the output X4 after deep residual shrinkage are combined into Y1 with dimension 64, and the output X2 of path 2 and X3 are combined into Y2 with dimension 64. The features X1, Y1 and Y2 extracted from the three paths are then combined into a feature map Out with dimension 192. A 1 × 1 × 64 convolution kernel smooths the feature information of the three paths, and a final 3 × 3 × 3 dilated convolution kernel produces the Output image.
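The dimension bookkeeping of the three paths can be checked with a few lines. This sketch assumes that "combined" means concatenation along the channel axis, which is consistent with the stated sizes (32 + 32 = 64); the dictionary and helper names are illustrative only.

```python
# Output dimensions of the path features, as stated in the text.
PATH_OUTPUTS = {"X1": 64, "X2": 32, "X3": 32, "X4": 32}


def concat_channels(*names):
    """Channel count after concatenating the named feature maps (assumed operation)."""
    return sum(PATH_OUTPUTS[name] for name in names)


y1 = concat_channels("X3", "X4")         # path 3: initial conv + shrinkage output
y2 = concat_channels("X2", "X3")         # path 2 output + path 3 initial conv
fused = PATH_OUTPUTS["X1"] + y1 + y2     # X1, Y1 and Y2 combined

print(y1, y2, fused)  # 64 64 192
```

The fused 192-channel map is then reduced to 64 channels by the 1 × 1 convolution and to 3 channels by the final convolution, matching an RGB output.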
The deep residual shrinkage network combines a deep residual network with a sub-network that learns a soft threshold function. After obtaining a set of thresholds, the soft-threshold sub-network applies them channel by channel, which reduces redundant information and suppresses noise.
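Soft thresholding itself is a simple function: values whose magnitude is below the threshold are set to zero, and the rest are shrunk toward zero by the threshold. The sketch below applies it to one channel. Following the usual channel-wise formulation of deep residual shrinkage networks, the threshold is taken as a scaling factor times the mean absolute value of the channel; here `alpha` is a fixed hypothetical constant standing in for the factor that the network would learn.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_channel(channel, alpha):
    """Soft-threshold one channel with tau = alpha * mean(|x|).

    alpha is a placeholder in (0, 1) for the learned scaling factor.
    """
    tau = alpha * sum(abs(v) for v in channel) / len(channel)
    return [soft_threshold(v, tau) for v in channel]


features = [1.0, -0.1, 0.05, -2.0]
print(shrink_channel(features, alpha=0.5))
```

In this example the two small-magnitude entries are zeroed while the two large ones are only reduced, which is the behavior that lets the block discard low-amplitude, noise-like responses.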
The generator model is shown in Fig. 4.

Generator network structure.
In a generative adversarial network, the discriminator provides feedback. It compares the features of real and generated samples and feeds the differences in structure, pixel and content information back to the generator. Through gradient optimization, the generator continuously updates its parameters to reduce the loss and generate high-quality images under normal illumination.
The structure of the binary discriminator is shown in Fig. 5. It extracts input features through two paths with two types of convolution kernels, reduces them to a one-dimensional feature value through a convolution layer, and finally calculates the difference between the generated sample and the real sample.

Discriminator network structure.
Adversarial loss of $D_Y(G_A(X))$: as shown in Equation 1, the adversarial loss $L_{GAN\_A}$ of generator $G_A$ and discriminator $D_Y$ is defined by the expectation over the target distribution.
Adversarial loss of $D_X(G_B(Y))$: as shown in Equation 3, the adversarial loss $L_{GAN\_B}$ of generator $G_B$ and discriminator $D_X$ is likewise defined by the expectation over the target distribution.
Cycle-consistency loss $L_{Cycle}(G_A, G_B)$: as shown in Equation 5, terms such as $\lVert G_B(G_A(X)) - X \rVert_1$ measure the loss after reconstruction and are computed with the L1 loss.
Identity loss $L_{Identity}(G_A, G_B)$: as shown in Equation 6, terms such as $\lVert G_A(Y) - Y \rVert_1$ are also computed with the L1 loss.
The adversarial losses $L_{GAN\_A}$ and $L_{GAN\_B}$, the identity loss $L_{Identity}$ and the cycle-consistency loss $L_{Cycle}$ are calculated and added together as the total loss of the whole network. The cycle-consistency loss and the adversarial losses act as reconstruction losses, which aim to generate a sample as similar as possible to the target image. The identity loss prevents the generator from changing the tone of the image, ensuring that the generated sample has the same color as the real image.
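As a rough illustration of how the L1-based terms and the total are assembled, the sketch below computes a cycle-consistency term and an identity term on tiny pixel lists and adds them to placeholder adversarial values. The L1 loss is averaged over pixels here (a normalization choice, not something the text specifies), and the weights `w_cycle` and `w_identity` are hypothetical parameters that default to 1.0 so that the total is the plain sum described above.

```python
def l1_loss(pred, target):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def total_loss(gan_a, gan_b, cycle, identity, w_cycle=1.0, w_identity=1.0):
    """Sum of the four loss terms; the weights are hypothetical and default to 1."""
    return gan_a + gan_b + w_cycle * cycle + w_identity * identity


x = [0.10, 0.20, 0.30]         # Real_A
rec_x = [0.10, 0.25, 0.20]     # G_B(G_A(X)), i.e. Rec_A
y = [0.50, 0.60]               # Real_B
g_a_of_y = [0.50, 0.70]        # G_A(Y), used by the identity term

cycle = l1_loss(rec_x, x)          # ||G_B(G_A(X)) - X||_1 term
identity = l1_loss(g_a_of_y, y)    # ||G_A(Y) - Y||_1 term
loss = total_loss(gan_a=0.7, gan_b=0.6, cycle=cycle, identity=identity)
```

The adversarial values 0.7 and 0.6 are arbitrary placeholders; in training they would come from the discriminators' outputs on generated samples.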
Experiment settings
This section describes the evaluation of the experiments. To evaluate the proposed method fully, we test the effectiveness of the network on low-light datasets. The training set is the real captured low/normal-light image pairs of the LOL dataset, which contains 485 pairs of images with a size of 400 × 600 pixels. We also validated the method on the Brightening Train dataset, which contains 1000 pairs of 384 × 384 pixel images. Both subjective and objective evaluations are reported, and the results are compared with those of other networks: Deep Retinex [9], KinD [10] and GLAD-Net [12]. It is noteworthy that these networks all adopt supervised or semi-supervised learning; most follow the Retinex theory and are trained end to end with supervision. Our method, by contrast, is an unsupervised network built on CycleGAN’s idea of style transfer and used to restore low-illumination images. Even so, the subjective and objective results of the unsupervised network are not inferior to those of the end-to-end supervised networks. During training we train two generators and two discriminators, whereas at test time we only need to create a single generator and load its trained parameters to test its image reconstruction ability on the generated results.
Implementation details
The generator and the discriminator do not need to be trained separately. To speed up training and reduce memory usage, we resized the input samples to 96 × 96 pixels, set batch_size to 1, and set the number of training iterations to 1000. We use Visdom to visualize the training results, and the reconstruction results are saved every 50 iterations so that training can be monitored more intuitively. When testing the network model, the input sample size can be adjusted arbitrarily. The network parameters are initialized with the Kaiming method [15]. The Adam optimizer was used in all our experiments, with initial learning rates of 1e-3 for the generator and 2e-3 for the discriminator [1], and the learning rate was adjusted dynamically with the number of training iterations. We use instance normalization, which normalizes only over H and W, to accelerate model convergence and maintain the independence of each image instance.
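The statement that instance normalization normalizes only over H and W can be illustrated with a minimal sketch for a single image stored as nested lists in C × H × W order. The learnable scale and shift are omitted, and the small constant `eps` is an assumed value for numerical stability, not one reported in the paper.

```python
import math


def instance_norm(image, eps=1e-5):
    """Normalize each channel of one image over its H x W values.

    image: list of channels, each a list of rows (C x H x W).
    """
    normalized = []
    for channel in image:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        normalized.append([[(v - mean) * scale for v in row] for row in channel])
    return normalized


image = [
    [[1.0, 2.0], [3.0, 4.0]],      # channel 0
    [[10.0, 10.0], [20.0, 20.0]],  # channel 1
]
out = instance_norm(image)
```

Because the statistics are computed per image and per channel, the result for one image does not depend on any other image in the batch, which is why the method works with a batch size of 1.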
Evaluation metrics
For the subjective evaluation, images generated by different networks on the two test sets, LOL-dataset and Brightening Train, were compared. The objective evaluation indexes are peak signal-to-noise ratio (PSNR), structural similarity (SSIM), natural image quality evaluator (NIQE) and patch-based contrast quality index (PCQI).
Table 1 shows the objective evaluation indicators of the generated samples on the LOL dataset. We compare and analyze the test results of five deep learning-based low-illuminance image enhancement networks. PSNR measures the average error between the generated image and the reference image; the higher the value, the higher the image quality. SSIM measures the similarity of two images; the closer the value is to 1, the closer the generated image is to the label. NIQE measures how far the multivariate statistics of a test image deviate from those of natural images. It is a completely blind, no-reference image quality index that requires no knowledge of the distortion, and the smaller its value, the better the image agrees with human visual perception. PCQI compares the mean and variance within patches, where a larger variance means stronger contrast. Comparing the PSNR, SSIM, NIQE and PCQI values shows that the proposed CycleGAN based on multi-scale deep residual shrinkage is clearly superior to the other networks on PSNR, SSIM and NIQE, whereas Retinex achieves the best PCQI value.
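For reference, PSNR follows directly from the mean squared error between the reference and the generated image. The sketch below works on flattened pixel lists and assumes 8-bit images, so the peak value defaults to 255; this is an illustration of the metric, not the evaluation script used in the experiments.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [100, 100, 100, 100]
generated = [110, 90, 110, 90]   # every pixel is off by 10, so MSE = 100
value = psnr(reference, generated)
```

Because the error appears in the denominator, a smaller mean squared error gives a larger PSNR, which is why higher values indicate better reconstruction.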
Image quality evaluation of the results generated on the LOL test set
As shown in Fig. 6, we selected some of the results for comparison. The comparison methods are currently effective low-illumination enhancement networks, and the same test data are used as the input of every network. The first group of images shows the network input and the results generated by each network for that input. In terms of subjective visual effect, every network improves the brightness of scene images captured in low light, but the results of the other networks still show color deviation from the label and are affected by noise. Our network outperforms the other methods not only in objective metrics but also in subjective visual results. To evaluate the visual effect more closely, the second group of images magnifies details of the results generated by the networks, showing that the proposed network corrects the color deviation and suppresses noise better.

Subjective comparison between the proposed low-illumination image enhancement method and other network methods on the LOL dataset.
Table 2 shows the objective evaluation indexes of the images generated on the Brightening Train test set, using the same four evaluation indexes. Our method reaches a PSNR of 27.85, an SSIM of 0.8753 and an NIQE of 4.7652. These three indexes show that our results are superior to those of the comparison methods in image quality, brightness improvement and detail reconstruction. However, the PCQI value of GLAD is superior to that of our method.
Comparison of quality metrics for images generated on the Brightening Train test set
As shown in Fig. 7, our method performs well on the Brightening dataset. To compare visual effects, we selected typical underexposed and backlit images as test samples. All the networks improve the brightness of the image, but the results of the Retinex network still suffer from distortion and noise, and the results of GLAD and KinD show different degrees of color distortion when compared with the real image. The proposed method performs well in brightness enhancement, detail recovery and color deviation correction.

Subjective comparison of the proposed low-illumination image enhancement method, the Retinex network and GLAD on the Brightening dataset.
Figure 8 compares the enhancement effect of each network on backlit images. We hope the model generalizes well, so we conjecture that it can also improve images of backlit scenes. To test this, we took backlit images with a size of 1360 × 800 as test samples. These test images have no labels, so objective indicators cannot be calculated and we assess only the subjective visual improvement. Moreover, we did not train any of the networks on the backlit dataset, but only verified the generalization ability of our proposed method. Judged by visual effect alone, the comparison of brightness enhancement and detail recovery shows that the proposed method enhances backlit images more effectively than the Retinex network and the GLAD network.

Comparison of the enhancement results of Retinex, GLAD and the proposed method on backlit test images.
To verify the extensibility of the proposed method, we also carried out some extension experiments, asking whether underwater images can be enhanced and foggy images defogged through the idea of style transfer. This is left as future work, and we hope it will inspire readers. Figs. 9 and 10 show our test results on the open underwater image dataset EUVP and on the open foggy-weather dataset Donator-Beta.

Some experimental results of our method on the underwater image dataset EUVP. From left to right: the input image, the result generated by our method, and the label image.

Some experimental results of our method on the Donator-Beta foggy dataset. From left to right: the input image, the result generated by our method, and the label image.
Conclusion
CycleGAN based on multi-scale residual shrinkage achieves the goal of enhancing low-illuminance images. Using CycleGAN as the prototype, we use two mirrored GANs to generate and transform low-illuminance and normal-illuminance scene images. In feature extraction, we adopt a multi-path method to widen the network and enlarge the receptive field, and add deep residual shrinkage to suppress image noise. The network exploits the advantages of the cyclic generative adversarial network: the discriminator and generator compete with each other during training, and the cycle-consistency constraints of the two mirrored GANs bring the quality of the generated image closer to that of the label. The experimental results show that the PSNR, NIQE and SSIM of the proposed CycleGAN based on multi-scale deep residual shrinkage are 24.62, 4.9856 and 0.8628 respectively on the LOL dataset, and 27.85, 4.7652 and 0.8753 respectively on the Brightening low-illumination image dataset. Compared with the KinD, GLAD and Deep-Retinex networks, the proposed method significantly improves PSNR on both datasets, brings SSIM closer to 1, and significantly reduces NIQE. Although the experimental results of backlit image restoration [22] were not evaluated objectively, the images generated by our method are visibly better than those of KinD, GLAD and Deep-Retinex in subjective vision. The experimental results show that CycleGAN based on multi-scale residual shrinkage significantly improves detail restoration, brightness enhancement, color deviation correction and noise suppression, and that the method has a certain generalization ability in low-illumination image processing.
The method proposed in this paper applies the idea of style transfer to low-illumination image enhancement for the first time. Low-illumination, low-quality images are regarded as one style, and normal-illumination, high-quality images as another. With CycleGAN as the prototype, and according to the characteristics of low-illumination images, image noise is suppressed, the receptive field is enlarged, and the training procedure and data are improved, so that images are converted from weak light to normal light and their quality is improved. Although the experiments in this paper demonstrate the effectiveness of the method on the LOL dataset and the Brightening low-illuminance image dataset, and a simple test on backlit images was conducted at the end of the experiments, the method was limited by the available datasets, so it could not be verified comprehensively whether it applies to all scenes affected by light. Low-illumination image enhancement has a very wide range of applications, and deep learning approaches in this field are still being explored. The proposed approach is a new idea in this field, and we believe it can be validated in more areas: it may be used to train deep networks for image defogging [19], or applied to underwater image enhancement [18]. The generalization ability of CycleGAN based on multi-scale deep residual shrinkage in different fields needs to be verified in subsequent experiments.
Acknowledgments
This research was supported by the National Natural Science Foundation of Region(CN)(61966035), the Key Program of Joint Fund of National Natural Science Foundation of Region(CN)(U1803261), the International Cooperation Project of the Science and Technology Department of the Autonomous Region “Construction of Data-driven China-Russia Cloud Computing Sharing Platform” (2020E01023). We would also like to thank our tutor for her careful guidance and all participants for their insightful opinions.
Disclosures
The authors declare no conflicts of interest.
