Abstract
Small object detection has important application value in the fields of autonomous driving and drone scene analysis. As one of the most advanced object detection algorithms, YOLOv3 faces some challenges when detecting small objects, such as detection failures for small objects and occluded objects. To solve these problems, an improved YOLOv3 algorithm for small object detection is proposed. In the proposed method, the dilated convolutions mish (DCM) module is introduced into the backbone network of YOLOv3 to improve the feature expression ability by fusing the feature maps of different receptive fields. In the neck network of YOLOv3, the convolutional block attention module (CBAM) and a multi-scale fusion module are introduced to select the information that is important for small object detection in the shallow network, suppress the non-critical information, and fuse the feature maps of different scales, so as to improve the detection accuracy of the algorithm. In addition, the Soft-NMS and Complete-IOU (CIOU) strategies are applied to candidate frame screening, which improves the accuracy of the algorithm for the detection of occluded objects. The experimental results on the MS COCO2017, VOC2007, and VOC2012 datasets and the ablation experiments on the MS COCO2017 dataset demonstrate the effectiveness of the proposed method. The experimental results show that the proposed method achieves better accuracy in small object detection than the original YOLOv3 model.
Introduction
Object detection is an important branch of computer vision, whose purpose is to find the location of objects and classify them. It has very important practical applications in many fields, such as monitoring safety, autonomous driving, traffic monitoring, scene analysis and robust vision. In recent years, many deep learning based object detection methods have achieved state-of-the-art results; they can be roughly divided into one-stage and two-stage detectors.
In the field of computer vision, small object detection has always been a challenging task. It is generally defined in the MS COCO dataset as shown in Table 1. It can be seen that the small object has less pixel information. Compared to larger objects, small objects typically occupy a smaller pixel area in the image, have less distinguishable appearances, and are easily interfered with by the surrounding background. Therefore, how to accurately detect small objects has been one of the hotspots in computer vision research.
Comparison of different YOLO Networks(√ stands for improving the small object detection module or method, and ♣ stands for improving the network receptive field)
With the development of deep learning technology, many deep learning-based methods have been proposed and achieved good performance in small object detection [1–4]. These methods typically rely on convolutional neural networks to utilize high-level semantic information and multi-scale features to improve the accuracy of small object detection. In addition, some methods based on feature pyramids, region proposals, attention mechanisms, multi-task learning, and reinforcement learning have also been proposed to enhance the performance and robustness of small object detection. Woo et al. [5] present a lightweight attention module CBAM, which has been experimentally proved to have certain advantages in identifying target objects. Li et al. [6] propose a YOLOv3 method of adaptive multi-scale feature fusion to realize the detection of small objects in remote sensing images, which is not ideal for detecting small objects in general scenes because it does not consider the relationship between background and objects. Dong et al. [7] put forward a lightweight vehicle detection model based on YOLOv5, which proves that the CBAM module is of great significance for the selection of important features of vehicles.
Although some progress has been made, small object detection is still a challenging task. In practical applications, small object detection is often applied in fields such as monitoring, security, transportation, and medical care, where it is of significant importance for ensuring the safety of personnel and equipment, improving production efficiency, and enhancing medical diagnosis. Therefore, how to solve the problems in small object detection and improve its accuracy and robustness is still an important issue in computer vision research. To this end, we study and compare the characteristics of different versions of the YOLO network, as shown in Table 2, and propose an improved small object detection algorithm based on the YOLOv3 network.
Definition of objects size in MS COCO
The main contributions of our work can be summarized as follows. First, in the backbone extraction network, we use a multi-scale feature fusion module and CBAM to strengthen the fragile relationship between small objects and the background, and provide rich contextual semantic information for the prediction of small objects. Second, a novel feature enhancement module, DCM, is proposed, which enhances the information in the predicted feature map by fusing the feature maps of different receptive field sizes. Third, an improved Soft-NMS strategy is applied to the model. By optimizing the matching and filtering rules for candidate boxes, we increase the prediction accuracy for occluded objects while avoiding the overlapping issue caused by directly reducing confidence scores.
With the development of deep learning, people use deep learning methods to improve the detection accuracy of small objects. The research on deep learning methods is mainly carried out from the following four aspects.
Data augmentation
Due to the uneven distribution of objects of various sizes in training samples, data augmentation is the most direct method to improve the performance of small object detection. Yu et al. [8] suggest a scale matching strategy, which reduces the loss of small object information by cropping different object sizes and reducing the difference between objects of different sizes. Kisantal et al. [9] present a method of replication enhancement, which increases the number of training samples of small objects by replicating small objects in the image. Accordingly, it improves the detection performance of small objects. Chen et al. [10] propose an adaptive resampling strategy for data enhancement in RrNet by considering the context information of the objects, and achieve a better data augmentation result.
Data augmentation improves the detection performance of small objects by increasing the data samples of small objects. However, on the one hand, this method increases the amount of calculation. On the other hand, it introduces new noise by using inappropriate data augmentation strategies.
Context learning
In real life, objects are related to the background environment in which they are located. By modeling this relationship in deep learning, the performance of object detection can be improved. Lim et al. [11] propose a method that combines context with multi-scale features to improve the detection accuracy of small objects in the real world. Fu et al. [12] put forward a context-based method that reduces false and missed detections of small objects by using more of the spatial information in the small objects.
This method makes full use of the relevant information between small objects and the background, and has a certain improvement in the detection of small objects. However, the lack of background information of the current object and the correlation between objects are not considered.
Generative adversarial learning
The purpose of generative adversarial learning is to map low-resolution small objects to high-resolution objects. This method mainly improves the detection performance of small objects by increasing the features of small objects. Bai et al. [13] develop a multi-task adversarial generative network that improves the detection performance of small objects by recovering clear objects from blurred small objects. Noh et al. [14] propose a new super-resolution method that addresses the problem of mismatched receptive fields generating incorrect features.
Generative adversarial learning improves its detection performance by augmenting the features of small objects. However, the entire network training is more difficult and the generator cannot generate rich samples, so the performance improvement is limited.
Multi-scale feature fusion
Due to the small size of the object in the image, multi-scale is an effective processing method. Multi-scale feature fusion methods include pyramid feature fusion, feature pyramid network (FPN), and PANet. Furthermore, Nayan et al. [15] demonstrate a real-time detection algorithm that extracts multi-scale features of the network, which can improve the detection accuracy of small objects. Liu et al. [16] propose a high-resolution detection network that retains the location information of small objects as much as possible to extract more semantic information. Deng et al. [17] show an extended feature pyramid network for small object detection.
Although existing multi-scale fusion strategies have achieved very good results in small object detection, there are still some limitations, such as feature loss, computational complexity, and scale selection. Scale selection arises because multi-scale feature fusion requires features at different scales to be selected and combined. For small objects, how to choose the appropriate scale and how to combine the features of different scales is extremely important. To solve this problem, this paper proposes an effective scale selection and feature fusion strategy.
YOLOv3 employs a feature pyramid network to predict objects of different sizes at different spatial resolutions. YOLOv3 shows a lower recognition rate on small objects because it does not consider the relationship between the object and the background, and because the prediction feature map carries little information about small objects. Inspired by multi-scale feature fusion and context learning models, this paper uses CBAM in the backbone extraction network of the model to select the feature information of small objects and suppress unimportant information or noise in shallow features. In addition, our DCM module uses multi-scale fusion of feature maps of different receptive fields to enrich feature information, and combines Soft-NMS and CIOU to improve the detection accuracy of occluded small objects.
Architecture of network
In this section, we introduce the proposed network architecture in detail. We first describe the overall composition of the network and then describe each module. To make the following discussion more convenient, some of the terms used and their corresponding abbreviations are shown in Table 3.
Define nouns and corresponding abbreviations
This paper presents a modified version of the YOLOv3 object detection network that achieves improved performance over the original. The network architecture comprises three key components: a backbone extraction network called darknet-53, a feature enhancement module in the neck, and a prediction head. The original YOLOv3 network did not take into account the influence of network depth and feature fusion in the neck on the detection of small objects, resulting in a low detection accuracy of only 16.3% for small objects on the MS COCO2017 dataset [18]. Therefore, the primary objective of the method proposed in this paper is to improve the feature enhancement and backbone network components of the original network [6]. We add a feature enhancement module DCM to the backbone network and a multi-scale fusion module to the neck to improve the detection accuracy of the network for small objects. In addition, because small objects are easily occluded by large objects, the algorithm discards some candidate frames of small objects when screening candidate frames, which causes these small objects to be missed. To this end, we use the Soft-NMS and CIOU strategies, on the one hand to reduce the penalty for small objects, and on the other hand to improve the localization accuracy of candidate boxes. Thus, the detection accuracy of occluded objects is improved.
Through the above analysis, we design the network structure shown in Figure 1. The entire network structure is mainly composed of three parts: the backbone, the neck and the detection head. Among them, the backbone network uses multi-scale fusion and attention mechanism to enhance the feature extraction ability of small objects. The neck structure strengthens the feature information extracted by the backbone network and expands the receptive field of the feature map by using the DCM module. In this way, auxiliary information can be provided for the detection of small objects through the global feature information, which makes the detection of small objects more accurate. For the detection head part, we use an improved detection head to predict the regression box and the classification task separately.

The proposed method based on the combination of feature fusion and feature enhancement can propagate the feature information of small objects in the shallow layer from the shallow layer of the network to the deep layer through skip connections. The CBAM module weights the shallow information through the attention module to reduce some shallow noise information, and the Fusion module mainly fuses the feature information of different scales extracted by the backbone network, and sends it to the small object detection head through the skip connection. The DCM module is mainly used in the backbone and neck of the network. The module in the backbone is mainly to enhance feature information, and the one in the neck is to expand the receptive field.
The attention mechanism originates from the study of human vision. Humans give priority to important information and ignore unimportant information. In the convolutional neural network, the attention mechanism mainly uses the learned weight feature map to realize the screening of important information in the original feature map. We add an attention module to the multi-scale feature fusion module at the neck of the network to reduce the interference of redundant information on small object prediction so as to improve the detection of small objects [5].
The structure of the module is shown in Figure 2, which combines spatial and channel attention. The output result of the convolutional layer will first pass a channel attention module to obtain different feature weights on the channel, and then pass through a spatial attention module, and finally obtain the proportion result of the feature map.

This module mainly includes channel attention and spatial attention. It is mainly composed of MLP layer network, which is used to weight the shallow semantic information and remove some noise information in the shallow feature map.
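The channel attention mechanism compresses the feature map in the channel dimension to obtain different proportions of channel information and improve the feature representation ability. A channel description matrix is obtained by using global average pooling and global maximum pooling, and then sent to a fully connected network to obtain a weight matrix. The final attention feature map is obtained by multiplying the weight matrix with the original feature map. The channel attention mechanism can be expressed as:
$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \qquad F' = M_c(F) \otimes F$$
where $F$ is the input feature map, $\sigma$ is the sigmoid function, the MLP is shared between the two pooled descriptors, and $\otimes$ denotes element-wise multiplication [5].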
The spatial attention weights different regions in the image to find the most important parts for the network to process. The specific operation for a feature map with an input size of H × W × C is as follows. First, the max pooling and average pooling operations are performed along the channel dimension to obtain two feature maps of size H × W × 1. Then, the two feature maps are concatenated, and the weight coefficients are obtained through a 3*3 convolutional layer. Finally, the weight coefficients are multiplied with the input feature map to obtain a new weighted feature map.
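The two attention steps described above can be sketched in NumPy as follows. This is a minimal illustration rather than the implementation used in the experiments: the channel-first layout `(C, H, W)`, the function names, the reduction ratio implied by the shapes of `w1` and `w2`, the ReLU inside the MLP, and the sigmoid applied to the spatial weights are all assumptions in the spirit of CBAM [5].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W). w1: (C//r, C) and w2: (C, C//r) are the shared MLP weights."""
    avg = feat.mean(axis=(1, 2))                 # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2))                   # global max pooling -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # two-layer MLP with ReLU (assumed)
    weights = sigmoid(mlp(avg) + mlp(mx))        # one weight per channel
    return feat * weights[:, None, None]

def conv2d_same(x, kernel):
    """Zero-padded convolution. x: (C_in, H, W), kernel: (C_in, k, k) -> (H, W)."""
    _, k, _ = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for i in range(k):
        for j in range(k):
            out += np.einsum("c,chw->hw", kernel[:, i, j], xp[:, i:i + h, j:j + w])
    return out

def spatial_attention(feat, kernel):
    """Pool over channels, concatenate, convolve, and reweight each location."""
    mx = feat.max(axis=0, keepdims=True)         # (1, H, W)
    avg = feat.mean(axis=0, keepdims=True)       # (1, H, W)
    stacked = np.concatenate([mx, avg], axis=0)  # (2, H, W)
    weights = sigmoid(conv2d_same(stacked, kernel))
    return feat * weights[None]

def cbam(feat, w1, w2, kernel):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)
```

Because both weight maps lie in (0, 1), the module can only attenuate activations, which matches its role of suppressing shallow noise rather than amplifying features.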
Generally, after the downsampling operation of the first few convolutional layers in the backbone extraction network, the resolution of the feature map has become very low. If the downsampling operation is continued, a lot of context information will be lost. Inspired by dilated convolution and U-Net network, we propose a new feature enhancement module DCM [19]. The downsampling operation is no longer used in this module, and this module is applied to the deep network of the backbone extraction network and the neck feature enhancement module respectively. The useful information in the prediction feature map is enriched by fusing and enhancing multiple features in the feature map, and the enhanced feature map and the prediction feature map are effectively fused to provide more information for the small object prediction feature map, thus it effectively improves the prediction accuracy of small objects.
Figure 3 depicts the primary structure of the module, which consists of a series of 3*3 convolutional layers with varying expansion coefficients. The module utilizes symmetrical structures with expansion coefficients of 2, 4, 8, 4, and 2 to capture feature information at different receptive fields. These feature maps are then subjected to feature fusion using a network structure similar to U-Net. By leveraging this structure, the module can provide additional context information for the prediction of small objects, leading to improved accuracy in their detection. Overall, the module is designed to effectively extract and fuse features from the input data, thereby enhancing the performance of small object detection.
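To make the effect of the expansion coefficients concrete, the following sketch computes the effective kernel size of each dilated 3*3 convolution and the receptive field of the five layers. It assumes that the layers are applied in series with stride 1, which is one reading of the description above; the helper names are hypothetical.

```python
def effective_kernel(kernel, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel + (kernel - 1) * (dilation - 1)

def stacked_receptive_field(kernel, dilations):
    """Receptive field of stride-1 dilated convolutions applied in series."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

rates = [2, 4, 8, 4, 2]
print([effective_kernel(3, d) for d in rates])  # [5, 9, 17, 9, 5]
print(stacked_receptive_field(3, rates))        # 41
```

Under these assumptions a single output location sees a 41 × 41 window of the input without any further downsampling, whereas five ordinary 3*3 convolutions would see only 11 × 11.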

This is our proposed feature enhancement module, which is mainly composed of convolutional layers with different dilation rates, and enhances the feature information of the input feature map by adopting cross-layer fusion. Finally, the input feature information is fused with the module output information in a way similar to residual connection.
For the backbone extraction network, low-level features contain more pixel, edge and texture information about small objects, but they lack higher-level abstraction and semantic information and may include more noise and irrelevant information. High-level features, in contrast, contain more abstract semantic information, which can be used in object detection to determine the position and size of detection boxes, as well as to classify the detected objects. Combining low-level and high-level semantic information can effectively improve the performance and robustness of deep learning models, while also enhancing their ability to understand and represent the data. To achieve effective fusion of these two levels of features, we use the CBAM module to filter the noise information. We use a multi-scale fusion module to fuse and complement the three low-level feature maps to preserve the spatial information of small objects. Specifically, the different features of the first three residual blocks of the backbone network are concatenated in the channel dimension.
The backbone network outputs feature maps with different scales, such as 80*80, 40*40, and 20*20. The 80*80 feature map is used to predict small objects. During the forward convolution of the network, the feature map gradually becomes smaller, and the information of the shallow features cannot be well retained in the deep feature map. As a result, the feature information of the upper layer cannot be effectively transmitted to the next layer. We add a new structure to solve this problem, shown in Figure 5. The structure contains a max pooling layer, an average pooling layer and a convolutional layer. Using max pooling and average pooling, we downsample the output feature map of the first residual block 1 time and that of the second residual block 2 times. Then, we feed the three different outputs into the CBAM module to get the weighted feature maps. Finally, we concatenate the three weighted feature maps in the channel dimension. In this feature fusion module, we use L2 regularization instead of batchnorm, mainly to balance the differences between samples [20]. Through this connection, more of the information in the upper layers of the network can be transmitted to the next layers [21, 22]. The above process can be described as

The fusion module mainly consists of three paths, where the feature maps of the lower two paths are downsampled using average pooling and max pooling, and then L2 regularization is used to balance different samples. Finally, the module uses 1*1 convolution to convert the number of channels to the specified size, and splices in the channel dimension to obtain the final output result.
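The three-path fusion can be sketched in NumPy as below. This is an illustration under stated assumptions, not the authors' code: the "L2 regularization" step is read here as L2 normalization of each location's channel vector, the assignment of max pooling to the first path and average pooling to the second is arbitrary, and the pooling factors are chosen only so that all three maps reach the same spatial size. All function names are hypothetical.

```python
import numpy as np

def pool2d(x, factor, mode):
    """Non-overlapping pooling of a (C, H, W) map by an integer factor."""
    c, h, w = x.shape
    blocks = x.reshape(c, h // factor, factor, w // factor, factor)
    return blocks.max(axis=(2, 4)) if mode == "max" else blocks.mean(axis=(2, 4))

def l2_normalize(x, eps=1e-12):
    """Scale the channel vector at every location to unit L2 norm (assumed reading)."""
    norm = np.sqrt((x ** 2).sum(axis=0, keepdims=True))
    return x / (norm + eps)

def conv1x1(x, weight):
    """1*1 convolution as a per-location linear map. weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def fuse(f1, f2, f3, weights):
    """Bring f1 and f2 to the resolution of f3, normalize, project, concatenate."""
    p1 = pool2d(f1, f1.shape[1] // f3.shape[1], "max")
    p2 = pool2d(f2, f2.shape[1] // f3.shape[1], "avg")
    branches = [l2_normalize(p) for p in (p1, p2, f3)]
    branches = [conv1x1(b, w) for b, w in zip(branches, weights)]
    return np.concatenate(branches, axis=0)  # concatenation in the channel dimension
```

Normalizing before the 1*1 projection keeps the three paths on a comparable scale, so that no single resolution dominates the concatenated output.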
Finally, the features of these three levels are concatenated in the channel dimension, and the final features are obtained by using a 3*3 convolution layer:
The detection head of YOLOv3 is very simple. It consists of a 1*1 convolutional layer and a 3*3 convolutional layer that produce the final predictions. In the network, the regression and classification tasks affect each other, and predicting the two tasks together has a greater impact on the final result. We replace the YOLOv3 detection head with a decoupled head, as shown in Figure 4. It contains a 1*1 convolutional layer and two 3*3 convolutional layers. The 1*1 convolutional layer mainly reduces the channel dimension, and the two 3*3 convolutional layers are used for the regression and classification prediction tasks respectively. The first branch is mainly used to predict the category of objects. The other branch is mainly used to predict the size and position of the object regression box. We replace the Leaky ReLU function in the 3*3 convolutional layer with the Mish activation function, which can further improve the accuracy and generalization of the network detection head [23].

This module is mainly composed of two branches, the upper one is the classification branch, and the lower one is the regression branch. The blue CBL module is a 1*1 convolution, which mainly reduces the number of channels of the feature, and the gray CBM module is composed of two 3*3 convolutions, which are used to complete classification and regression tasks. The purple module is a 1*1 convolution that mainly reduces the number of channels to a specific number, while the yellow module splices the results of classification and regression according to the channel dimension.
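For reference, the Mish activation used in the decoupled head is x · tanh(softplus(x)). The sketch below compares it with Leaky ReLU on scalars; the negative slope of 0.1 is an assumed value for illustration, and the function names are hypothetical.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x):
    """Mish activation: smooth, non-monotonic, unbounded above."""
    return x * math.tanh(softplus(x))

def leaky_relu(x, slope=0.1):
    """Leaky ReLU with an assumed negative slope."""
    return x if x >= 0 else slope * x

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  mish={mish(x):+.4f}  leaky_relu={leaky_relu(x):+.4f}")
```

Unlike Leaky ReLU, Mish is smooth at zero and lets small negative values pass with a bounded magnitude, which is the property usually cited for its effect on optimization.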
Non-maximum suppression (NMS) is a very important part of the object detection network. First, it sorts the prediction boxes by score and selects the detection box with the highest score. Then, it calculates the IOU between this box and each of the other prediction boxes, and discards any prediction box whose IOU is greater than the set threshold. Finally, from the remaining prediction boxes, it selects the one with the largest score, and so on. However, NMS has problems with the detection of dense objects, because it forces the scores of adjacent detection boxes to zero. In this case, multiple overlapping objects fail to be detected, which reduces the average detection accuracy of the network. Therefore, we can reduce the score of the adjacent detection boxes instead of completely removing them. Although the score is reduced, the adjacent detection boxes remain in the sequence of detections. This is Soft-NMS [24]. Algorithm 1 shows the detailed steps.
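Most of the small objects in the dataset cannot be adequately detected because they are occluded by large objects. We therefore use the CIOU to replace the IOU in the original algorithm [25]. The CIOU considers the distance between the object and the anchor box, the overlap rate, and the aspect ratio, which makes the prediction frame of each object more accurate and avoids unnecessary overlap between prediction frames. The formula is as follows:
$$\mathrm{CIOU} = \mathrm{IOU} - \frac{\rho^2(b, b^{gt})}{c^2} - \alpha v$$
where
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1 - \mathrm{IOU}) + v}$$
Here $\rho(b, b^{gt})$ is the Euclidean distance between the centers of the two boxes, $c$ is the diagonal length of the smallest box enclosing both, and $w$ and $h$ are the box width and height, following the definition in [25].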
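A minimal sketch of the CIOU computation for two axis-aligned boxes is given below, following the commonly used definition (IOU minus a normalized center-distance term and an aspect-ratio term). The `(x1, y1, x2, y2)` box format, the function name, and the small `eps` guard are assumptions for illustration.

```python
import math

def ciou(box_a, box_b, eps=1e-9):
    """CIOU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IOU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # Squared distance between box centers
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps)) - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v
```

Unlike plain IOU, this value stays informative for non-overlapping boxes: it becomes negative and decreases as the centers move apart.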
For the penalty function, it is best to choose a continuous one; otherwise it will cause a sudden change in the detection order. A continuous penalty function should apply no penalty when there is no overlap and a high penalty when there is high overlap. In addition, when the degree of overlap is low, the penalty should increase gradually, that is, M should not affect the score of a prediction box with a low degree of overlap. However, when the overlap of $B_i$ and M is close to 1, $B_i$ should receive a significant penalty. Based on this consideration, we choose the Gaussian penalty function, whose expression is shown in formula (7). This function is used in each iteration to update the scores of all remaining detection frames. Experiments show that the best results are obtained with the $N_t$ value set to 0.9 and g=1.1.
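The Gaussian rescoring described above can be sketched as follows, using the standard Soft-NMS update s_i ← s_i · exp(−overlap² / σ). This is an illustration only: the parameter names `sigma` and `score_thresh` and their default values are assumptions and are not meant to correspond to the settings reported above, and plain IOU is used as the default overlap measure so that the block is self-contained. Another overlap function can be passed through the hypothetical `overlap` argument; negative overlaps are clamped to zero, which is also an assumption.

```python
import math

def iou(a, b):
    """Plain IOU between two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001, overlap=iou):
    """Gaussian Soft-NMS. Returns (index, rescored confidence) in selection order."""
    scores = list(scores)
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        # Select the box M with the highest current score
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        keep.append((m, scores[m]))
        # Decay the scores of the other boxes instead of deleting them
        for i in remaining:
            o = max(0.0, overlap(boxes[m], boxes[i]))
            scores[i] *= math.exp(-(o * o) / sigma)
        # Drop boxes whose score has become negligible
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(soft_nms(boxes, [0.9, 0.8, 0.7]))
```

In this example the second box overlaps the first heavily, so its score is reduced and it moves to the end of the ranking, but it is still kept. Hard NMS with a typical threshold would have removed it.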
Evaluation metrics
In order to quantitatively evaluate the performance of the model proposed in this paper, we use average precision (AP), the evaluation index commonly used in object detection, to measure the effect of the model. To evaluate the performance of different models more comprehensively, we compute AP at several IOU thresholds and report the mean over these thresholds:
$$\mathrm{AP} = \frac{1}{|T|} \sum_{t \in T} \mathrm{AP}_t \qquad (1)$$
where $T$ is the set of chosen IOU thresholds and $\mathrm{AP}_t$ is the AP at threshold $t$.
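As an illustration of this metric, the sketch below computes AP as the area under a precision-recall curve and then averages AP values over a set of IOU thresholds. It uses simple all-point interpolation, which differs in detail from the COCO API toolbox; the function names and the example numbers are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the precision envelope (all-point interpolation).

    recalls must be sorted in increasing order.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision monotonically non-increasing from right to left
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

def mean_over_thresholds(ap_by_iou):
    """Mean of AP values keyed by IOU threshold."""
    return sum(ap_by_iou.values()) / len(ap_by_iou)

print(average_precision([0.5, 1.0], [1.0, 0.5]))      # 0.75
print(mean_over_thresholds({0.50: 0.6, 0.75: 0.4}))   # 0.5
```

In the COCO convention, the thresholds run from 0.50 to 0.95 in steps of 0.05, so the mean is taken over ten AP values.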
In order to prove the effectiveness of the method proposed in this paper, we implemented, trained and verified the model on the PyTorch platform. We selected COCO-train 2017 as the training dataset and resized each picture to 640*640. The hyperparameters for network training are set as follows: learning rate lr=1e-5, batch_size=16, weight_decay=5e-4. To train the network quickly, this paper conducts two-stage training on two 2080Ti GPUs. In the first stage, covering the first 50 epochs, the model backbone is frozen. In the second stage, covering the following 150 epochs, the backbone network is unfrozen and participates in the training of the entire network. Finally, the network tends to converge after about 200 hours of training. The loss curve of the model is shown in Figure 6.

This is the training loss reduction curve of the model on the training dataset and the verification dataset.
To compare the performance of the algorithm in this paper and several other commonly used object detection algorithms in small object detection tasks, we conducted comparisons of different algorithms on multiple aspects such as detection accuracy, model size, computational complexity, etc. using the COCO API toolbox on both the MS COCO2017 and VOC datasets. The evaluation results of all algorithms are shown in Tables 4–6. As can be seen from Table 4, the algorithm in this paper has an AP value of at least 5.4% higher than other algorithms in small object detection. For the detection of large objects, the AP value of the algorithm is also increased by about 3.5%. For the VOC dataset, the algorithm in this paper improves by about 2%. Regarding computational complexity, our model in this paper requires more time compared to several other models. This is mainly due to the adoption of multi-scale feature fusion and feature enhancement modules in our model, as well as the candidate box filtering strategy, which requires judging a large number of candidate boxes for occluded objects.
Accuracy(%) of different object detectors on the MS COCO2017 dataset
Accuracy(%) of different object detectors on the VOC2007 dataset
Accuracy(%) of different object detectors on the VOC2012 dataset
In order to demonstrate more intuitively the accuracy of the algorithm in this paper for small object detection, we visualized the detection results of different algorithms, and the results are shown in Figure 7. The YOLOv3 and YOLOv4 algorithms miss distant people in the picture, and cannot accurately detect occluded or relatively small objects. The main reason may be the lack of sufficient contextual semantic information for small objects in the network, which affects the detection of small objects. The YOLOv5 algorithm can identify some small objects in the picture, but it fails to detect some occluded objects. The improvement is mainly because the YOLOv5 algorithm uses small object data augmentation during network training, which improves the detection accuracy of the network for small objects. However, since the network does not consider the context semantics of small objects, it still misses some occluded objects. In contrast, the method in this paper can distinguish and detect small objects and occluded objects in the picture very well.

Our proposed method can better capture the location of small objects in the picture compared with several other methods. The YOLOv3 and YOLOv4 object detectors have been unable to detect relatively small people, while the YOLOv5 object detector has a better performance, but there is some deviation in the location of small people.
The experimental results in the first and second rows of Figure 8 show that the YOLOv3 detection algorithm suffers from detection errors and missed detections of small objects that are occluded by other objects. This is mainly because the YOLOv3 algorithm uses non-maximum suppression, which matches candidate frames inaccurately and misidentifies them for adjacent objects during screening. The algorithm in this paper improves the positioning of candidate frames and retains redundant candidate frames by combining CIOU and Soft-NMS, and thereby detects occluded objects. In addition, the third row of the figure shows that the YOLOv3 algorithm fails to detect small objects against a plain background. This is mainly because the algorithm ignores the relationship between the background and the object, and because the predicted feature map carries little information. The algorithm in this paper uses CBAM to capture the relationship between the object and the background, and strengthens the information of the predicted feature map by using DCM and multi-scale fusion, so that these small objects are detected.

Comparison of the experimental results of the YOLOv3 algorithm with the improved YOLOv3 algorithm.
In order to verify the influence of different modules in the network on the detection results of small objects, we conducted ablation experiments on the entire network, and the results are shown in Table 7. In this experiment, the interaction between different modules is not considered, and only the influence of a single module on the experimental results is considered.
Roadmap of improved YOLOv3 on the COCO 2017 dataset
A → B First of all, we add a feature enhancement module DCM to the backbone extraction network and the neck. This module effectively improves the context information of small objects in the feature map by fusing feature maps under different receptive fields. As can be seen from Table 7, the detection of small objects in the entire network increases by 5% after adding the feature enhancement module.
B → D Next, in order to further enhance the feature extraction ability of the neck in the network, the algorithm proposed in this paper uses a multi-scale feature map fusion and attention weighting module to enhance the spatial information of small objects, thereby improving the detection accuracy of small objects. As can be seen from the results, the module has improved the detection of small objects by 7.8%.
D → E In order to further improve the detection accuracy of the network, we use a decoupling head in the network to separately process the regression and classification tasks in the object detection task, thereby improving the detection of small objects by 2.4%.
E → F Post-processing is another place where performance can be improved. We use Soft-NMS to replace traditional NMS, and the mAP improves by 0.2%. For the detection of small objects, it improves by 1.3%. When the traditional NMS algorithm selects detection boxes, the detection boxes of some occluded small objects are removed, which causes these small objects to be missed. In contrast, Soft-NMS improves the accuracy of small object detection by reducing the scores of occluded small object detection boxes instead of removing them.
From the above ablation experiments, it can be seen that the use of the dilated convolution module can significantly improve the detection accuracy of large objects, because the dilated convolution significantly enlarges the receptive field of the large object detection feature map and captures more global information. The detection accuracy of small objects, in turn, is significantly improved by using the multi-scale feature fusion and attention module. This module fuses the differently weighted low-level semantic information and then fuses it with the high-level semantic information obtained by upsampling, so that the small object detection head contains more feature information about small objects.
In this paper, to address the problem of how to select appropriate scales and how to combine features of different scales in small object detection, we proposed a novel small object detection model that achieved state-of-the-art performance on the MS COCO2017 and VOC datasets. Our approach incorporated multi-level feature fusion and feature enhancement modules, as well as a candidate box filtering strategy, to improve detection accuracy, especially for small objects. While our model requires more computational time than some existing models, we believe that the improved accuracy justifies the trade-off and demonstrates the effectiveness of the method in experiments. Future work could focus on further optimizing our model to reduce its computational complexity while maintaining or even improving its performance.
Acknowledgment
This work was supported in part by the Fundamental Research Funds for the Central Universities (No. 31920230137, 31920230030, 31920220037), the Gansu Provincial Department of Education University Teachers Innovation Fund Project (No.2023B-056), the Introduction of Talent Research Project of Northwest Minzu University (No. xbmuyjrc201904), the Gansu Provincial First-class Discipline Program of Northwest Minzu University (No.11080305), the Leading Talent of National Ethnic Affairs Commission (NEAC), and the Young Talent of NEAC, and the Innovative Research Team of NEAC (2018) 98.
