Abstract
To address the shortcomings of traditional aircraft target type recognition algorithms, such as difficult feature selection, weak generalization, slow recognition, and low accuracy, this paper proposes a method that detects and recognizes aircraft targets in aerial images quickly and accurately. Aircraft targets are first detected and located with YOLOv3-tiny. After image denoising and shadow detection and localization, the Sobel operator is used to calculate the edge gradient of the target, the aircraft target is segmented with the region growth method, and principal component analysis (PCA) is used to obtain the central axis of the target. The projected distance from the edge contour to the central axis is sampled at equal intervals along the central axis, and its ratio to the length of the central axis is used to construct the feature vector. Finally, the Spearman rank correlation method is used to match feature vectors and recognize the aircraft type. Experiments show that the proposed method is adaptable, computationally light, and able to detect and recognize aircraft targets in aerial images quickly and accurately.
Introduction
Military aircraft play a vital role in modern warfare, and the ability to quickly and accurately locate and identify enemy aircraft in aerial images is crucial to military decision-making.
In practice, aerial photography is often used to reconnoiter enemy targets. Aerial images contain a large amount of data and are updated frequently.
Traditional manual identification of aircraft targets in aerial images cannot meet real-time requirements, nor the needs of current intelligent combat command information systems. Therefore, in recent years, automatic, fast, and accurate detection and identification of aircraft target types in aerial images has become an active research topic [1, 2], and some progress has been made.
Several approaches are currently applied to aircraft target recognition. Some researchers train classifiers on image features.
Diao et al. [3] proposed a classification method based on deep belief networks. Fang et al. [4] proposed an aircraft identification method that combines moment invariants with a back-propagation neural network.
However, a common problem with these methods is that model training requires a large amount of data for each aircraft type. In addition, an imbalanced data distribution seriously degrades recognition accuracy.
Other researchers have performed aircraft type matching based on basic structural features of the aircraft [5–8]. In these methods, the aircraft target is usually segmented first to obtain its edge and morphological features, which are then compared with a standard template.
Zhao et al. [5] transformed aircraft type identification into a key point detection problem and used key point template matching to identify aircraft. Benedetto F et al. [6] proposed a target recognition method based on the skeleton characteristics of aircraft targets. Luo J et al. [7] proposed a target recognition method based on affine invariant moments. Benedetto F et al. [8] proposed a method that classifies aircraft targets with a binary tree built on prior knowledge of aircraft. These methods based on aircraft structural characteristics achieve aircraft target recognition to some extent, but their practical performance is limited: they generally suffer from a low recognition rate, heavy computation, and poor real-time performance.
Furthermore, these methods usually depend on manually selected target features for segmentation and matching. In aerial images, however, aircraft targets often have no fixed gray features because of paint schemes, lighting, and other factors. In addition, the imaged shape of the same aircraft type can differ considerably with the shooting angle, making it difficult to segment the target accurately and extract the corresponding structural features. In summary, the performance of aircraft type recognition algorithms based on structural feature matching is limited by many factors.
Deep learning is a branch of machine learning [9] that imitates biological neural networks. By combining low-level features into more abstract high-level representations of attribute categories or features, it can adaptively learn effective features from large amounts of input data for regression and classification. Compared with traditional machine learning, deep neural networks [10–12] can extract features that conventional methods cannot, without manually defined feature extraction rules. Deep learning is also more adaptable and, when applied to image recognition, is less affected by background, illumination, and other conditions. More and more scholars are using deep learning techniques for image classification [13–15]. Liu Y et al. [16] proposed a simplified YOLO network for airport aircraft target recognition, which recognizes two targets simultaneously and improves recognition speed. Liu Z et al. [17] used YOLOv5 as the base network with multi-scale optimization training, effectively detecting the characteristics of aircraft types and recognizing aircraft targets. Yang G et al. [18] proposed an aircraft target recognition algorithm based on a multilayer BP neural network to address the low recognition rate of methods such as template matching. Zuo J et al. [19] proposed a novel aircraft-type recognition framework based on deep convolutional neural networks.
Although methods based on deep convolutional neural networks have achieved some success in aircraft target recognition, the neural network models they use are highly complex. Model optimization lacks complete mathematical theory support, so the models must be trained through large-scale computation, which leads to a heavy computational load and a high demand for computing resources. In addition, because public military aircraft data sets are lacking, training deep learning networks relies on manually labeled training images, which is labor-intensive.
Of the two types of aircraft type recognition methods described above, template matching suffers from complex target segmentation and feature extraction, while deep learning suffers from the difficulty of training deep neural network models and the need for large amounts of data.
To solve these problems, we combine deep convolutional neural networks with feature matching and propose an aircraft target recognition algorithm based on deep learning and structural feature matching. A deep convolutional neural network first identifies and locates aircraft targets, which are then segmented. The principal axis is obtained with the PCA method, and the feature vector is constructed from the distances between the edge and the principal axis. Finally, the aircraft type is identified by template matching.
The main steps of the proposed aircraft type identification algorithm are as follows:
First, the YOLOv3-tiny single-stage target detection network is used to detect and locate aircraft targets in the aerial image [20]. An anisotropic diffusion filter [21, 22] is then applied to the detected aircraft target region, which removes noise while retaining edge details. To eliminate the adverse effect of shadows on aircraft target segmentation, the Otsu [23] method is used to detect and locate shadows in the aircraft target image [24]. Next, the gray gradient is calculated with the Sobel operator, effective seed points are obtained by combining the gradient with the shadow position information, and the aircraft target is segmented with the region-growing method. Principal component analysis (PCA) [25] is then used to obtain the central axis of the aircraft target. The projection distance from the edge contour to the central axis is sampled at equal intervals along the axis direction, and the ratio of this distance to the length of the central axis is used to construct the feature vector. Finally, the Spearman rank correlation method matches the resulting feature vector against the pre-established aircraft structure database [26] and calculates the correlation coefficient, from which the aircraft type is determined.
The proposed method has two main advantages. First, it effectively improves recognition accuracy while requiring little computation, so recognition is considerably faster than with existing methods. Second, it retains good recognition ability even after the aircraft target image is rotated or scaled.
This paper is structured as follows. Section 2 presents the methods, including aircraft target localization, target segmentation, contour extraction, feature vector acquisition, and feature matching. Section 3 compares and analyzes the recognition results of the proposed algorithm and some of the typical algorithms mentioned above. Finally, Section 4 discusses what we have learned, future work, and conclusions.
Proposed methods
This section describes the architecture of the proposed aircraft identification framework, shown in Fig. 1, and explains how each part works.

The framework of our aircraft target type recognition method.
The framework consists of four parts: aircraft target recognition, aircraft target image segmentation, feature vector extraction, and feature matching.
To identify the aircraft type, we first detect the possible aircraft targets in the image and then segment them. Next, we obtain the central axis of each aircraft target and construct a feature vector from the distances between the edge and the central axis. Finally, the feature vector is matched against the pre-established aircraft structure database to recognize the aircraft type.
Detecting and locating aircraft targets in aerial images is the key prerequisite for aircraft recognition and directly affects the subsequent results. To detect and locate aircraft targets rapidly, we use the YOLOv3-tiny target detection model, an optimized variant of YOLOv3.
YOLO algorithm
YOLO is a convolutional neural network model that detects multiple target locations and categories simultaneously. It inputs the whole image to the network model for training and prediction without extracting the target candidate region from the image. Finally, it regresses the parameter information such as target center location, length, width, and target category.
When the YOLO model trains the network, the image is divided into an S × S grid. Each grid cell predicts B bounding boxes, and each bounding box has four predicted parameters: the center coordinates (X, Y) of the target and the width and height (W, H) of the bounding box relative to the whole image. Each bounding box also has a confidence score C. The object probability is 1 when the bounding box contains a target and 0 otherwise, and the confidence combines this probability with the intersection over union (IoU) between the predicted bounding box and the ground-truth bounding box, as given in Equation (1).
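The overlap ratio between a predicted box and a ground-truth box can be illustrated with a few lines of Python. This is only a sketch: the helper names `iou` and `center_to_corners` and the example boxes are assumptions made for illustration and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def center_to_corners(x, y, w, h):
    """Convert a (center x, center y, width, height) box to corner form."""
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


# Hypothetical boxes in image-relative coordinates.
predicted = center_to_corners(0.5, 0.5, 0.4, 0.2)
truth = center_to_corners(0.5, 0.5, 0.2, 0.2)
overlap = iou(predicted, truth)
print(overlap)
```

Here the ground-truth box lies entirely inside a predicted box of twice its area, so the overlap ratio is about one half.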
During aircraft detection, since aircraft is the only target class, the conditional probability Pr(plane | Object) is generated for each grid cell, and the probability that a bounding box contains an aircraft is given by Equation (2).
The algorithm proceeds as follows. First, the input image is resized to the specified size required by the S × S grid and fed into the convolutional neural network, which outputs the class of each grid cell together with its bounding boxes and confidence scores; the bounding box with the highest confidence is kept. Then the probability that the bounding box contains an aircraft is calculated with Equation (2), and the target is located after comparing this probability with a threshold. In the network model, S = 7 and B = 2. Therefore, an input image eventually generates an output of S × S × B × (X, Y, W, H, C, P(plane)), totaling 7 × 7 × 2 × 6 = 588 parameters.
To improve running speed, allow deployment on UAVs or other aerial vehicles, and detect a single target class such as aircraft, this paper adjusts some parameters of the lightweight YOLOv3-tiny network model: the network has 23 layers, and the network input is 832 × 832. The backbone removes the ResNet structure, retains the multi-scale feature fusion of YOLOv3, and outputs two feature maps of different sizes. The network structure is shown in Fig. 2.

The miniature neural network model diagram we used in the paper.
No public data set exists for aircraft target recognition, especially military aircraft recognition. To train the neural network model in this paper, we selected the 6000 airplane images in the CIFAR-10 data set, comprising 5000 training images and 1000 test images. In addition, 1200 aerial images containing aircraft targets were collected from Google Earth, Bing satellite images, Microsoft flight simulation software, and the Internet. The collected images differ in color, background, lighting, shooting angle, and size; 1000 were used as training images and 200 as test images. After the collected images were named in a unified format, the LabelImg tool was used to label them, as shown in Fig. 3.

Labeling of aircraft targets in images.
Labeling generates an XML file containing the target class and location information. A Python script then traverses the image and annotation files to create the file lists for the training set and the test set.
To compensate for the small number of images in the self-made data set, we use data augmentation to enlarge the aircraft target data set so that the neural network model can be trained more effectively. Augmentation uses image mirroring, rotation, geometric scaling, translation, color enhancement, and combinations of these to expand the 1200 collected images fivefold to 6000. In total, 10000 training images (5000 + 1000 × 5) and 2000 test images (1000 + 200 × 5) are obtained.
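A fivefold expansion of this kind can be sketched in plain Python on a small grayscale image stored as a nested list. The particular four transforms, their parameters, and the function names below are illustrative assumptions; the paper does not specify which combinations were used, and geometric scaling is left out to keep the sketch short.

```python
def mirror(img):
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def shift_right(img, dx, fill=0):
    """Translate the image dx pixels to the right, padding with `fill`."""
    return [[fill] * dx + row[:len(row) - dx] for row in img]


def brighten(img, gain):
    """Scale intensities by `gain`, clipped to the 8-bit range."""
    return [[min(255, int(round(p * gain))) for p in row] for row in img]


def augment(img):
    """Return the original plus four transformed copies (five images in total)."""
    return [img, mirror(img), rotate90(img), shift_right(img, 1), brighten(img, 1.2)]


sample = [[1, 2], [3, 4]]
variants = augment(sample)
# 1200 collected images, each expanded to five variants.
print(1200 * len(variants))
```

With five variants per image, the 1200 collected images become 6000, which matches the count stated above.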

Use multiple image transformations to enhance data.
This section uses the Darknet framework to train the network. The training set contains 10000 images, and the test set contains 2000 images. In training, the loss function is taken as follows. The number of iterations is 10000, the momentum is 0.9, the weight decay is 0.0005, and the initial learning rate is 0.0001. After ten cycles, the learning rate is adjusted to 0.001, and training stops once the loss function drops to a stable value. The best weights are selected according to the evaluation metrics. To evaluate the model, we use GIoU, given in Equation (3), objectness, classification, precision, given in Equation (4), and recall, given in Equation (5).
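Precision and recall follow their usual definitions in terms of true positives (TP), false positives (FP), and false negatives (FN). The sketch below assumes those standard definitions; the counts in the example are made up for illustration and are not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Standard detection metrics from true/false positive and false negative counts."""
    # Precision: fraction of reported detections that are correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: fraction of real targets that were detected.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts, not taken from the experiments.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(p, r)
```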
To verify the effectiveness of the model, the trained YOLOv3-tiny model was used to predict the 200 collected test images, which contain 348 targets in total; the results are shown in Fig. 5.

Aircraft target identification and localization using YOLOv3-tiny.
The results show that the model identifies and locates aircraft targets in the images well. According to the statistics, the YOLOv3-tiny network model runs at up to 41 frames per second, with a precision of 93.85% and a recall of 95.26%, so it can recognize and locate aircraft targets in aerial images quickly and accurately. This lays a foundation for the subsequent identification of aircraft types.
Segmentation of aircraft targets is a vital prerequisite for type recognition. The segmentation process has three stages: image filtering, shadow region localization, and target segmentation based on region growth, as follows.
Aircraft target image filtering
After the aircraft target has been detected and located, the target image needs to be preprocessed so that the characteristics of the aircraft target can be extracted more reliably before the specific aircraft type is identified.
During aerial image acquisition, image noise caused by weather, dust, and electronic noise is inevitable and reduces image quality. A large amount of image noise directly affects the extraction of target edge features, causing some target features to be lost and ultimately affecting feature vector matching.
To better restore the geometric characteristics of the image while retaining edge details, an improved anisotropic diffusion filter is used to smooth the image based on the noise characteristics of aerial images. The anisotropic diffusion filter adaptively selects the diffusion intensity according to the local features of the image, and diffusion stops where it encounters an edge. The formula is given in Equation (7):
The quantities in Equation (7) are the smoothed image after t iterations, the diffusion coefficient, and the gradient threshold K. During filtering, edges with gradient values greater than K are retained, while regions with gradient values less than K are smoothed iteratively.
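The edge-stopping behavior can be illustrated with the classic Perona-Malik scheme. Note that this is the textbook filter, not the improved variant used in the paper; the exponential conduction function, the step size, and the values of `k` and `iterations` are all assumptions chosen for the example.

```python
import math


def perona_malik(img, iterations=10, k=15.0, step=0.2):
    """Classic Perona-Malik diffusion on a 2-D list of gray values."""
    h, w = len(img), len(img[0])
    cur = [[float(p) for p in row] for row in img]
    for _ in range(iterations):
        nxt = [row[:] for row in cur]
        for y in range(h):
            for x in range(w):
                total = 0.0
                # Differences to the four neighbours (border pixels are replicated).
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    diff = cur[ny][nx] - cur[y][x]
                    # Conduction is close to 1 for small differences (noise)
                    # and close to 0 for differences well above k (edges).
                    total += math.exp(-(diff / k) ** 2) * diff
                nxt[y][x] = cur[y][x] + step * total
        cur = nxt
    return cur


# Dark region (0) next to a bright region (200), with one noisy pixel of value 10.
image = [[0.0] * 4 + [200.0] * 4 for _ in range(6)]
image[2][1] = 10.0
smoothed = perona_malik(image)
print(round(smoothed[2][1], 3), round(smoothed[2][4], 3))
```

In this toy image the isolated noisy pixel is flattened toward its surroundings, while the step between the dark and bright halves is left essentially untouched because its gradient is far above `k`.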
After the iterative filtering, salt-and-pepper noise in the image is removed with the contraharmonic (inverse harmonic) mean filter, given in Equation (8).
In this paper, pepper noise is removed with Q = 1.6, and salt noise is removed with Q = -1.6.
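The role of the sign of Q can be seen in a small pure-Python sketch of a contraharmonic mean filter, where each output pixel is the ratio of the sum of g^(Q+1) to the sum of g^Q over a window. The 3 × 3 window, the border replication, and the small `eps` offset (added so that zero-valued pixels can be raised to a negative power) are assumptions of this sketch.

```python
def contraharmonic_mean(img, q, size=3, eps=1e-6):
    """Contraharmonic mean filter of order q over a size x size window."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = 0.0
            den = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    g = img[ny][nx] + eps  # eps keeps 0 ** negative power defined
                    num += g ** (q + 1)
                    den += g ** q
            out[y][x] = num / den
    return out


# A uniform patch of gray level 100 with one pepper (0) or one salt (255) pixel.
pepper = [[100] * 5 for _ in range(5)]
pepper[2][2] = 0
salt = [[100] * 5 for _ in range(5)]
salt[2][2] = 255

pepper_fixed = contraharmonic_mean(pepper, q=1.6)
salt_fixed = contraharmonic_mean(salt, q=-1.6)
print(round(pepper_fixed[2][2], 2), round(salt_fixed[2][2], 2))
```

A positive order suppresses the dark outlier almost completely, while a negative order pulls the bright outlier back to within a few gray levels of its neighbors; using the wrong sign for a given noise type makes the result worse, which is why the two orders are applied separately.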
After the aircraft target detection, positioning, and filtering processing, the aircraft target region image in the aerial image is obtained, eliminating the interference of other irrelevant region images, improving the accuracy of target segmentation, and reducing the amount of computation.
Before the aircraft type is identified, the contour of the aircraft must be extracted to construct the feature vector. Generally, the gray levels of the foreground aircraft target and of the airport runway background are each fairly uniform, and the difference between them is noticeable. Aircraft targets also have rich edge information, so they can be segmented accurately and conveniently according to edge features. However, analysis of a large number of aerial images shows that a considerable proportion are collected under strong illumination, and in these images aircraft targets usually cast shadows. Aircraft shadows also contain considerable edge information, which seriously degrades segmentation and, in turn, the acquisition of feature vectors.
First, the Otsu method is used to detect shadows. The gray values of shadows in an image are low and uniformly distributed. Therefore, when shadows are present, the gray histogram of the image shows a clearly bimodal distribution, as shown in Fig. 6.

Gray histogram of the aircraft target image.
Based on this gray distribution, the Otsu method is used to detect the shadow area and obtain a binary image. The bwareaopen function of MATLAB is then applied to the shadow image to remove regions whose area is smaller than the threshold ST; the remaining region is the shadow of the aircraft target, as shown in Fig. 7.
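Otsu thresholding picks the gray level that best separates the two histogram peaks by maximizing the between-class variance. The following is a minimal pure-Python sketch for 8-bit images; the function names and the toy image are assumptions, and the removal of small regions (the bwareaopen step) is not reproduced here.

```python
def otsu_threshold(img):
    """Gray level t (0..255) maximising between-class variance; img holds ints in 0..255."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with gray level <= t
    sum0 = 0  # sum of the gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def shadow_mask(img):
    """Binary mask that marks the dark class (candidate shadow pixels) with 1."""
    t = otsu_threshold(img)
    return [[1 if p <= t else 0 for p in row] for row in img]


# Bright background (200) with a dark 2 x 2 shadow patch (20).
image = [[200, 200, 200, 200],
         [200, 20, 20, 200],
         [200, 20, 20, 200],
         [200, 200, 200, 200]]
threshold = otsu_threshold(image)
mask = shadow_mask(image)
print(threshold, mask)
```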

Example of shadow detection.
In this paper, a method combining the Otsu method, edge gray gradient calculation, and region growth is used to segment aircraft targets.
After shadow extraction, the gray gradient of the image is calculated. An edge is a region where the gray level changes in a step or roof-like manner. The larger the gradient value at a pixel, the faster the gray level changes along the x-axis and y-axis, and the more likely the pixel lies on an edge. In this paper, the Sobel operator is used to compute the gradient of the image gray value function and detect edges. The Sobel operator consists of two 3 × 3 kernels, one horizontal and one vertical, which are convolved with the image to approximate the horizontal and vertical brightness differences, as shown in Equation (9). F is the original image, Gx and Gy are the horizontal and vertical gray differences, respectively, and the gradient amplitude is computed from Gx and Gy.
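A minimal sketch of the Sobel gradient magnitude in pure Python is given below. It assumes the common kernel layout and the Euclidean magnitude sqrt(Gx² + Gy²); the kernels are applied as a correlation, which only changes the sign of Gx and Gy and leaves the magnitude unchanged. Border pixels are simply left at zero, and the function names are illustrative.

```python
import math

# Horizontal and vertical Sobel kernels (common layout; sign conventions vary).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradient(img):
    """Gradient magnitude of a 2-D list of gray values; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            mag[y][x] = math.sqrt(gx * gx + gy * gy)
    return mag


def edge_points(img, threshold=45):
    """Coordinates (row, column) whose gradient magnitude exceeds the threshold."""
    mag = sobel_gradient(img)
    return [(y, x) for y, row in enumerate(mag) for x, g in enumerate(row) if g > threshold]


# Vertical step edge between gray level 0 and gray level 100.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_points(image)
print(edges)
```

For this step edge, only the two columns on either side of the transition respond, and the response is purely horizontal.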
After the image gray gradient is calculated, the gradient threshold is set to T = 45, and points whose gradient value exceeds T are selected as edge key points. These edge key points serve as seed points for segmenting the target with the region growth method. Starting from the seed points, adjacent pixels are searched on the basis that the gray distribution of the target is similar: a pixel joins the region if the difference between its gray value and that of the seed point is less than the gray threshold M, which is set to M = 10. Growth continues according to this similarity criterion until no pixel meets it. In this way, the image of the aircraft target together with its shadow is segmented, as shown in Fig. 8.
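Region growing from seed points can be sketched as a breadth-first search. The sketch assumes 4-connectivity and compares every candidate pixel with the gray value of the seed it was reached from; both choices, as well as the function name and the toy image, are assumptions, since the paper does not state these details.

```python
from collections import deque


def region_grow(img, seeds, max_diff=10):
    """Grow a binary mask from seed pixels given as (row, column) pairs."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    queue = deque()
    for sy, sx in seeds:
        if not mask[sy][sx]:
            mask[sy][sx] = 1
            queue.append((sy, sx, img[sy][sx]))
    while queue:
        y, x, seed_gray = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx]:
                # Accept the neighbour if it is within max_diff of the seed gray value.
                if abs(img[ny][nx] - seed_gray) < max_diff:
                    mask[ny][nx] = 1
                    queue.append((ny, nx, seed_gray))
    return mask


# A bright 2 x 2 object on a dark background, plus an unrelated gray pixel.
image = [[0, 0, 0, 0, 0],
         [0, 100, 104, 0, 0],
         [0, 102, 108, 0, 0],
         [0, 0, 0, 0, 50]]
mask = region_grow(image, seeds=[(1, 1)])
print(mask)
```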

Segmentation of aircraft target images.
After the image of the aircraft target and shadow, illustrated in Fig. 8(a), is obtained with the region-growing method, the shadow region located by the Otsu method is excluded so that only the aircraft target remains. The image is then binarized, as shown in Fig. 8(b). Finally, the contour of the aircraft target is obtained by morphological operations, as shown in Fig. 8(c). Through image filtering, shadow localization, gray gradient calculation, and region growth, the aircraft target is segmented effectively and the aircraft contour is obtained, laying a foundation for the subsequent feature extraction.
Feature extraction
Aircraft type recognition technology based on deep learning relies on many manually labeled training images to extract the potential features of aircraft targets. However, due to the large number of aircraft types, it is difficult to obtain a sufficient number of different types of aircraft image databases for neural network training. At the same time, the high similarity of various aircraft also brings great difficulties to the identification of aircraft types.
In this paper, after the aircraft target contour is obtained, a method based on principal component analysis (PCA) is used to obtain the direction of the central axis of the aircraft target, as shown in Fig. 9. First, the length LAB of the central axis is obtained. The contour is then divided into M equal regions along the central axis, and in each region the mean distances LMR and LML from the edge points on the right and left sides to the central axis are calculated. M is set to 10 in this paper, giving ten regions. In each region, the distances from the left and right edge points to the central axis are sampled at N equal intervals, with N set to 10.

Example of obtaining the distance from the aircraft profile to the central axis.
Because aerial photography angles differ, the imaged aircraft target may be asymmetric. To reduce the adverse impact on recognition, the average distances from the edge points on the two sides to the central axis are compared in each region, and the larger value is taken as the distance LM from the edge to the central axis in that region, as shown in Equation (10).
This distance is then divided by the length of the central axis to obtain the ratio parameter shown in Equation (11).
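Read literally, the two steps just described amount to the following relations. The lowercase region index m = 1, …, M is introduced here only for clarity and is an assumption about notation; the original Equations (10) and (11) may be typeset differently.

```latex
L_{m} = \max\left(L_{mL},\, L_{mR}\right), \qquad
\alpha_{m} = \frac{L_{m}}{L_{AB}}, \qquad m = 1, \dots, M
```

Because each αm is a ratio of two lengths measured in the same image, it does not change when the whole target is uniformly scaled.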
Finally, the feature vector A = [α1 α2 α3 ⋯ αM] is constructed.
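The whole feature extraction step can be sketched in pure Python for a contour given as a list of (x, y) points. This is a simplified illustration under several assumptions: the principal direction is taken from the closed-form solution for a 2 × 2 covariance matrix, all contour points in a region are averaged instead of sampling N points at equal intervals, and the contour is assumed not to be degenerate (its extent along the axis is non-zero). The function name `axis_features` is hypothetical.

```python
import math


def axis_features(points, regions=10):
    """Ratio feature vector from contour points given as (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2 x 2 covariance matrix of the contour coordinates.
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Direction of the first principal component (closed form in two dimensions).
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    ux, uy = math.cos(theta), math.sin(theta)
    # For every point: position along the axis (t) and signed distance from it (d).
    proj = [((p[0] - mx) * ux + (p[1] - my) * uy,
             -(p[0] - mx) * uy + (p[1] - my) * ux) for p in points]
    t_min = min(t for t, _ in proj)
    t_max = max(t for t, _ in proj)
    length = t_max - t_min  # length of the central axis
    left = [[] for _ in range(regions)]
    right = [[] for _ in range(regions)]
    for t, d in proj:
        k = min(int((t - t_min) / length * regions), regions - 1)
        (left if d >= 0 else right)[k].append(abs(d))
    features = []
    for k in range(regions):
        mean_l = sum(left[k]) / len(left[k]) if left[k] else 0.0
        mean_r = sum(right[k]) / len(right[k]) if right[k] else 0.0
        # Larger of the two side distances, normalised by the axis length.
        features.append(max(mean_l, mean_r) / length)
    return features


# Toy contour: two edges of a 100 x 20 strip, then the same shape scaled 2x and rotated 30 degrees.
contour = [(x, 10.0) for x in range(101)] + [(x, -10.0) for x in range(101)]
angle = math.radians(30)
rotated = [(2 * (x * math.cos(angle) - y * math.sin(angle)),
            2 * (x * math.sin(angle) + y * math.cos(angle))) for x, y in contour]
features = axis_features(contour)
features_rotated = axis_features(rotated)
print([round(f, 3) for f in features])
```

In this toy case both contours give the same ten ratios, which illustrates why a descriptor built from distances relative to the central axis is unaffected by rotation and uniform scaling.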
Because of factors such as aerial photography perspective, distance, weather, and illumination, the feature vectors extracted by this method for the same aircraft type may differ between scenarios. However, for the same aircraft type, the feature vectors extracted in different situations are highly correlated. Based on this property, to identify the aircraft type quickly and accurately, this paper uses the Spearman rank correlation matching algorithm to match the extracted feature vector of the aircraft target against the aircraft feature vectors in the existing database, and recognizes the aircraft type from the correlation coefficients of the feature vectors.
The Spearman rank correlation is shown in Equation (12).
In Equation (12), ρ is the correlation coefficient, Di is the rank difference, and M is the dimension of the feature vector.
The correlation coefficient between the two sets of feature vectors is calculated by Spearman rank correlation, which reflects the similarity between the two sets of vectors. The higher the correlation coefficient, the higher the possibility that the target type to be measured is consistent with that of the aircraft target in the database.
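The matching step can be sketched as follows, assuming the standard rank-difference form of the Spearman coefficient, ρ = 1 − 6·ΣDi² / (M(M² − 1)), which is exact only when there are no tied values. The database entries, the type names, and the helper names are hypothetical and serve only as an illustration.

```python
def ranks(values):
    """Ranks from 1 to n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation of two equal-length vectors (length >= 2)."""
    m = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (m * (m * m - 1))


def rank_candidates(target, database):
    """Database entries as (rho, name) pairs, sorted from best to worst match."""
    return sorted(((spearman(target, vec), name) for name, vec in database.items()),
                  reverse=True)


# Hypothetical four-dimensional feature vectors.
target = [0.10, 0.30, 0.50, 0.20]
database = {
    "type_a": [0.12, 0.28, 0.55, 0.19],
    "type_b": [0.50, 0.30, 0.10, 0.20],
}
matches = rank_candidates(target, database)
print(matches[0])
```

The first entry of the sorted list is the candidate with the highest correlation coefficient, which is taken as the inferred type.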
Experimental environment: CPU: i9-12950HX; GPU: NVIDIA RTX™ A5500; Memory: 32 GB. Software environment: Windows 11, Python 3.7.6, OpenCV 4.2.
In the experiment, 48 different aircraft types are selected from the aircraft in service in China, Japan, South Korea, the United States, and other countries, their contour features are extracted, and the feature vector database is constructed. One hundred images of aircraft targets from several airport areas in China, Japan, South Korea, and the United States are collected from Google satellite maps and the Internet as a test set for recognition testing. The algorithm’s precision, recall, and detection speed are verified and analyzed through experiments.
The correlation coefficients between the target to be recognized and the different aircraft types in the database are calculated and sorted by magnitude, and the target type is inferred from the largest coefficient.
Statistical table of aircraft model recognition algorithm results in this paper
Taking Fig. 10 as an example, Fig. 10(a) shows an aerial image of part of Kadena Air Base, in which three aircraft targets are detected and located by YOLOv3-tiny, as shown in Fig. 10(b). After the image is denoised with the anisotropic diffusion filter, the targets are segmented, and the central axes are obtained, the feature vectors of the three targets are extracted. Finally, the Spearman rank correlation method is used to calculate the correlation coefficient between each target feature vector and the feature vectors in the database. The maximum correlation coefficients ρ for the three targets are 0.965, 0.968, and 0.975, respectively. From these results, all three aircraft are inferred to be US-made F-16 fighters, as illustrated in Fig. 10(c), which is consistent with the ground truth.

Example of aircraft type identification results.
The experiment is carried out on all 100 images to be tested. There are 375 aircraft targets in the images to be tested. The method in this paper is used to detect and locate the aircraft targets and identify the model. The results are shown in the following table:
The results show that the algorithm can quickly and effectively detect the aircraft targets in the images and identify the aircraft type; the recall is 97.86%, the precision is 96.46%, and the detection speed is 41 frames per second, which verifies the effectiveness and reliability of the algorithm.
To verify the performance of our proposed algorithm, we conducted comparative experiments using the proposed algorithm and the methods in [5] and [17].
Because the methods in [5] and [17] cannot recognize multiple aircraft targets at once, we manually segmented and extracted 200 individual aircraft targets and then tested them to compare the number of correct identifications and the accuracy. To further analyze the performance and adaptability of the algorithm, we scaled each of these 200 targets to four additional scales, giving 1,000 samples including the originals. We then rotated each of the 1,000 samples to four additional angles, giving 5,000 test targets, which were randomly divided into ten groups of 500. Each group was tested with the algorithm in this paper and with the methods in [5] and [17]. The test results are shown in Fig. 11.

Comparison of the test results of these three algorithms.
The statistical results obtained after several experiments are shown in Table 2.
Comparison of the recognition accuracy of the three methods
The experimental results show that our method outperforms the methods in [5] and [17], demonstrating its effectiveness and adaptability.
In this paper, YOLOv3-tiny is used to quickly detect and locate aircraft targets in the image. After the image is filtered and segmented, the aircraft contour features are extracted to construct the feature vector, and the aircraft type is finally recognized with the Spearman rank correlation (rank difference) method. Experiments show that the method can effectively identify the aircraft type in aerial images of airport areas and performs well on aircraft images taken from different angles, which reflects the robustness of the algorithm. Overall, the detection and recognition accuracy of the algorithm meets the desired level. Compared with traditional deep neural network or structural feature matching methods, the proposed algorithm offers strong adaptability, less computation, and a high recognition rate, and can effectively improve the accuracy and speed with which a detection system identifies aircraft target types.
In future research, we will focus on fine-tuning the neural network model to speed up both training and detection. The main limitation at present is that the database contains few aircraft types, so only a limited number of types can be identified; the aircraft type database therefore needs to be expanded. We also plan to implement the method on an FPGA to bring it closer to practical application.
Acknowledgments
This work was supported by the Key Natural Science Research Project of Anhui Provincial Universities (No. KJ2020A0673), National College Students Innovation and Entrepreneurship Training Program (No. 202210380048), Chaohu University Quality Engineering Project (No. ch20mooc07).
