Abstract
To improve the detection accuracy of targets in high-voltage dense channel satellite images, a satellite target detection algorithm based on deep learning is proposed. A convolutional neural network is used to extract the feature map of the high-voltage dense channel satellite image, and the extracted feature map is input into the optimized deformable convolutional neural network. The feature map is sampled over a regular region, and the value of each sampling point is weighted by the weight at the corresponding position of the square convolution kernel. The feature map output by the convolution operation is passed through the pooling layer to obtain depth features of the same dimension. The depth features are input into the fully connected layer to obtain the fully connected features of the candidate target areas, and target detection in high-voltage dense channel satellite images is realized. The experimental results show that the target detection accuracy of the method is higher than 99%, and that the missed alarm rate and false alarm rate are both lower than 1.4%.
Keywords
Introduction
The dense transmission channels of the State Grid Corporation are mostly located in mountainous areas with complex terrain, dense vegetation and frequent mountain fires. If several lines trip at the same time, the safety of the power grid is directly endangered. With vigorous economic development, large-scale urban construction is under way in many places, and the high-voltage transmission channels of the State Grid that support this construction are becoming denser. Within a limited space, the conflict between transmission line channels and human activities is becoming more intense. Hidden dangers in transmission channels persist despite repeated prohibitions, and natural disasters occur frequently, which seriously threatens the operation of the lines. Therefore, different monitoring and prevention measures have been adopted in various parts of China, and the existing monitoring systems for high-voltage dense transmission channels have been automated using various modern communication technologies. However, the main monitoring systems still suffer from a lack of unified regional planning, obsolete equipment, and long data collection and analysis cycles, so they cannot meet the needs of real-time monitoring, early warning and prevention [1, 2]. Deep learning can automatically learn features from raw data and discover the internal structure and relationships of image objects from large amounts of data through multi-layer nonlinear networks. It is widely used in environmental pollution detection, image texture feature recognition, vehicle license plate image information collection, target detection and other fields. For example, the Sigma operator has been used to calculate air quality and, combined with a fuzzy reasoning system, to evaluate urban air quality [3].
Convolutional neural networks have been used to reduce the number of training samples, generate image feature vectors, classify image surface textures, and improve image detection [4]. In license plate recognition, the input image is preprocessed by deep learning, and the processed image is segmented to extract and recognize the license plate [5].
Furthermore, many experts and scholars have carried out research on the key technologies of satellite inspection of transmission channels. The aim is to use artificial intelligence to extract and analyze the environmental elements of the transmission channel and to detect their changes, so as to realize intelligent identification, analysis and comparison of the transmission channel, as well as monitoring and early warning, and to improve its safe operation and maintenance. To this end, Reference [6] proposed a deep learning-based aerial small target detection algorithm that studies the local spectral structure characteristics of the image through the statistics of changes in pixel attributes. However, it ignores the geometric structure of the surrounding image, which makes it difficult to process the semantic information of the image, and the accuracy of information extraction is not high. Reference [7] proposed a deep learning-based detection algorithm for the region of interest of traffic targets. Its advantage is that it meets user needs to a certain extent and its detection results can be used directly, but its target extraction is more difficult.
This paper proposes a high-voltage dense channel satellite target detection algorithm based on deep learning, which uses the hierarchical structure of a deep neural network to extract high-level features from massive data. It improves the fine extraction of power transmission channels and the various environmental features within a range of 300 m, and comprehensively improves the intrinsic safety of power transmission lines and the efficiency of operation and inspection. It also effectively addresses the problems of excessive computation, difficult feature selection, poor pertinence, and trivial change patterns in the monitoring of transmission channels. The research results can be applied to the daily operation and maintenance management of dense transmission lines across the country, where they can significantly improve hidden danger identification, monitoring and early warning, and the safe operation of transmission channels, while greatly reducing labor costs. This raises the operation and maintenance management level of the transmission channel and has considerable economic value and broad application prospects [8, 9].
Satellite target detection of high-voltage dense channels based on deep learning
Convolutional neural networks
The single neuron model of the convolutional neural network is shown in Fig. 1.

Fig. 1. Single neuron model.
The input signals $x_i$ of a neuron in the convolutional neural network come from the outputs of the $n$ neurons in the preceding layer. Each input signal $x_i$ is multiplied by its weight $w_i$ and passed on, and the total input value is obtained by summing the weighted inputs and adding the bias $b$. The total input received by the neuron is compared with the set threshold $\theta$, and an activation function is applied to obtain the neuron output. A simple 4-layer convolutional neural network model is shown in Fig. 2 [10].
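The single-neuron computation described above can be sketched as follows. This is only an illustrative example: the function name `neuron_output`, the sigmoid activation and the toy values are assumptions, not details given in the paper.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation (an assumed choice; the paper does not name one)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b, activation=sigmoid):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    total_input = np.dot(w, x) + b
    return activation(total_input)

# Hypothetical inputs from n = 3 preceding neurons
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
b = 0.1
y = neuron_output(x, w, b)
```

Any other activation (for example a step function at the threshold) can be passed through the `activation` argument.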

Fig. 2. Neural network model.
The convolutional neural network is constructed by connecting a large number of neurons of the same form according to a certain hierarchy [11]. Each column of neurons forms a layer of the network, and the layers are divided into the input layer, hidden layers and output layer. The input of each layer is the output of the previous layer, and its output is the input of the next layer.
(1) Convolutional layer
Suppose the input tensor of convolutional layer $l$ is $x^l \in \mathbb{R}^{H^l \times W^l \times D^l}$ and the convolution kernel is $f^l \in \mathbb{R}^{H \times W \times D^l}$. The convolution operation starts at position $(0,0,0)$ of the image: the pixel values at the corresponding positions are multiplied element by element by the parameters of the convolution kernel, and the products are accumulated to give the result of the convolution operation. After one convolution operation, the kernel slides over the input image by the specified stride and performs the same operation in sequence. The results are arranged in order according to their positions to form a two-dimensional output. If there are $N$ such convolution kernels, then with a stride of 1 and no padding the output tensor is $x^{l+1} \in \mathbb{R}^{(H^l - H + 1) \times (W^l - W + 1) \times N}$.
In summary, the convolutional layer is characterized by weight sharing: the same convolution kernel uses the same parameters at different positions, which greatly reduces the number of model parameters. Since the local features of the image are obtained by convolving a local area with a kernel of a given size, the layer is well suited to mining local image features. Different convolution kernels learn different local features, and combining them enables the convolutional neural network to learn high-level semantic information of the image and thus obtain good results.
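The sliding-window computation and the resulting output size can be illustrated with a naive NumPy sketch. It assumes a stride of 1 and no padding; the function name `conv_layer` and the toy shapes are hypothetical and not taken from the paper's implementation.

```python
import numpy as np

def conv_layer(x, kernels):
    """Naive convolution layer (stride 1, no padding).

    x:       input tensor of shape (H_l, W_l, D_l)
    kernels: N kernels of shape (N, H, W, D_l)
    returns: output tensor of shape (H_l - H + 1, W_l - W + 1, N)
    """
    H_l, W_l, D_l = x.shape
    N, H, W, D = kernels.shape
    assert D == D_l, "kernel depth must match input depth"
    out = np.zeros((H_l - H + 1, W_l - W + 1, N))
    for n in range(N):                      # one output channel per kernel
        for i in range(H_l - H + 1):        # slide down
            for j in range(W_l - W + 1):    # slide right
                patch = x[i:i + H, j:j + W, :]
                # element-wise multiply and accumulate
                out[i, j, n] = np.sum(patch * kernels[n])
    return out

x = np.arange(5 * 5 * 2, dtype=float).reshape(5, 5, 2)
kernels = np.ones((3, 3, 3, 2))   # N = 3 kernels of size 3 x 3 x 2
y = conv_layer(x, kernels)        # shape (3, 3, 3)
```

Each kernel reuses the same weights at every position, which is the weight sharing property noted above.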
(2) Pooling layer
The pooling layer abstracts the input image and reduces its dimensionality by imitating the human visual system; in essence, it is a down-sampling operation. For input images with a large number of convolutional features, the pooling layer operates on pixel areas to remove noise to a certain extent while extracting locally salient features. Common pooling operations are average pooling and maximum pooling, given by formula (1) and formula (2) respectively, where $\Omega$ denotes the pooling window and $x_{ij}$ the value at position $(i, j)$ in it:
$$y_{\mathrm{avg}} = \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} x_{ij} \tag{1}$$
$$y_{\max} = \max_{(i,j) \in \Omega} x_{ij} \tag{2}$$
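As an illustration of the two pooling operations, the following sketch applies non-overlapping $k \times k$ average and maximum pooling to a single-channel feature map. The function name `pool2d` and the choice of non-overlapping windows are assumptions made for this example.

```python
import numpy as np

def pool2d(x, k, mode="max"):
    """Non-overlapping k x k pooling of a 2-D feature map."""
    H, W = x.shape
    H2, W2 = H // k, W // k
    # Split the map into (H2, k, W2, k) blocks; rows/cols that do not fit are dropped
    blocks = x[:H2 * k, :W2 * k].reshape(H2, k, W2, k)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'avg'")

x = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2d(x, 2, mode="max")   # [[5, 7], [13, 15]]
avg_pooled = pool2d(x, 2, mode="avg")   # [[2.5, 4.5], [10.5, 12.5]]
```

Both variants halve each spatial dimension here, which is the down-sampling effect described above.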
(3) Fully connected layer
The input and output of the fully connected layer are both column vectors, and every node in one layer is connected to every node in the adjacent layer.
It can map the learned features to the sample space, realize feature dimensionality reduction and sample classification, and play a role in classifying and discriminating features.
Assuming that the input $x$ of the fully connected layer is an $m$-dimensional vector and the output $y$ is an $n$-dimensional vector, the fully connected layer is determined by the coefficient matrix $A \in \mathbb{R}^{n \times m}$ and the bias $b$, so that
$$y = f(Ax + b) \tag{3}$$
where $f(\cdot)$ is the activation function.
No activation function needs to be applied in the last fully connected layer, which gives:
$$y = Ax + b \tag{4}$$
(4) Softmax layer
Softmax performs normalization: the Softmax layer transforms the input column vector $x$ into a probability distribution $y$ whose $i$-th element is
$$y_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}} \tag{5}$$
The function of the Softmax layer is to achieve target multi-feature classification, which serves as the final activation layer of the convolutional neural network. The output vector is the probability that the input image of the convolutional neural network belongs to each category.
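The fully connected mapping and the Softmax normalization can be sketched together as follows. The function names, the toy matrix and the max-subtraction for numerical stability are assumptions of this example rather than details from the paper.

```python
import numpy as np

def fully_connected(x, A, b, f=None):
    """Affine map A x + b, optionally followed by an activation f."""
    z = A @ x + b
    return z if f is None else f(z)

def softmax(z):
    """Turn a score vector into a probability distribution."""
    e = np.exp(z - np.max(z))   # subtracting the max avoids overflow
    return e / e.sum()

# Hypothetical example: m = 3 input features, n = 2 classes
x = np.array([1.0, 2.0, 3.0])
A = np.array([[0.1, 0.2, 0.3],
              [0.0, -0.1, 0.2]])
b = np.array([0.0, 0.5])
scores = fully_connected(x, A, b)   # last layer: no activation
probs = softmax(scores)             # class probabilities
```

The predicted category is then the index of the largest entry of `probs`.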
Based on the above analysis, the convolutional neural network is a feed-forward multilayer neural network, and its network layers mainly include two types: convolutional layer and pooling layer. The convolution layer is the core layer of the network, whose main function is to extract various features of the image, and realize the feature abstraction of the upper layer feature map through the convolution kernel. Then, the new feature map is obtained through the activation function, and the convolution kernel is usually a weight matrix. The role of the pooling layer includes down-sampling the input feature map and abstracting the original feature input signal. By changing the size of the feature map to make it smaller, it achieves a significant reduction in training parameters and a reduction in the degree of model overfitting in the process of convolutional network propagation [12, 13].
Convolutional neural networks are a mature technology that has been applied in many fields. This paper uses a convolutional neural network because it performs well in mining local image features. By introducing different convolution kernels, it learns the local features and high-level semantic information of images, which improves high-voltage dense channel satellite target detection and gives the method strong applicability and practicability.
The convolutional neural network can extract the features of targets in high-voltage dense channel satellite images, but it cannot be applied directly to target detection in such images. The network therefore needs to be optimized, and target detection in high-voltage dense channel satellite images is achieved through regular block convolution operations. The training part of the optimized convolutional neural network consists of two parts, the candidate target area prediction sub-network and the target area discriminating sub-network, which share the convolutional neural network structure.
The feature map of the previous layer is sampled over the regular input region $R$. The value of each sampling point is multiplied by the weight at the corresponding position of the square convolution kernel, and the products are summed. The result is taken as the output of the convolution operation. The receptive field of the square convolution kernel is chosen as the regular region. When the size of the convolution kernel is 3×3, the regular region $R$ is defined as:
$$R = \{(-1,-1), (-1,0), \ldots, (0,1), (1,1)\} \tag{6}$$
For any point $q_0$ in the feature map $y$ output by the convolution operation, the formula is as follows:
$$y(q_0) = \sum_{q_n \in R} w(q_n)\, x(q_0 + q_n) \tag{7}$$
where $q_n$ denotes the elements of the receptive field region, $x$ is the input feature map, and $w$ denotes the weights of the convolution operation.
The offsets $\{\Delta q_n \mid n = 1, 2, \cdots, N\}$ are added in the deformable convolution operation, where $N$ is the number of elements in the receptive field region. Formula (7) is transformed to obtain:
$$y(q_0) = \sum_{q_n \in R} w(q_n)\, x(q_0 + q_n + \Delta q_n) \tag{8}$$
The offsets are produced by an additional convolution applied to the same input feature map. For a 3×3 kernel, this convolution outputs an 18-channel offset feature map, that is, a two-dimensional offset for each of the 9 sampling positions, so that offset values are available for every position in the map. Finally, the output feature map is obtained by sampling at the offset positions [14].
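To make the offset sampling concrete, the sketch below evaluates a 3×3 deformable convolution at a single output position of a single-channel feature map. It is an illustration under stated assumptions: fractional sampling positions are handled by bilinear interpolation with zeros outside the map (a common choice that the text does not spell out), and the names `bilinear` and `deform_conv_at` are hypothetical.

```python
import numpy as np

# Regular 3x3 sampling grid R around the centre position
R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def bilinear(x, py, px):
    """Bilinearly interpolate x at fractional (py, px); zero outside the map."""
    H, W = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    val = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < H and 0 <= xx < W:
                val += x[yy, xx] * (1 - abs(py - yy)) * (1 - abs(px - xx))
    return val

def deform_conv_at(x, w, q0, offsets):
    """Sum over R of w(q_n) * x(q0 + q_n + offset_n) at one position q0.

    offsets has shape (9, 2): one (dy, dx) pair per sampling point,
    i.e. the 18 offset values for a 3x3 kernel.
    """
    total = 0.0
    for n, (dy, dx) in enumerate(R):
        oy, ox = offsets[n]
        total += w[dy + 1, dx + 1] * bilinear(x, q0[0] + dy + oy, q0[1] + dx + ox)
    return total

x = np.arange(25, dtype=float).reshape(5, 5)
w = np.ones((3, 3))
regular = deform_conv_at(x, w, (2, 2), np.zeros((9, 2)))          # no offsets
shifted = deform_conv_at(x, w, (2, 2), np.tile([0.0, 0.5], (9, 1)))  # half-pixel shift
```

With all offsets equal to zero the result reduces to the ordinary convolution over the regular region; in a trained network the offsets would come from the 18-channel offset feature map.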
At the center pixel position of each sliding window, a convolution kernel with the size of the sliding window is used to acquire the depth feature $X_i$. The candidate detection frame is described by
If the candidate detection frame is a positive sample, the intersection of the true value frame
Where
Where W represents the cross-entropy loss function of model parameters; the cross-entropy loss function that measures the classification loss is described by $L_e(q(X), Y)$; various target probabilities are expressed as $q(X)$, and $\lambda$ is the balance parameter. The background detection box that does not participate in the regression operation is described as $[Y \geqslant 1]$, and
The overall loss function formula of the prediction sub-network of the candidate target area is obtained as:
Where $\mu_m$ is a weighting parameter.
The gradient descent method is used to solve the candidate target region prediction sub-network. The predicted result realizes the final target recognition through the target area discriminating sub-network of the convolutional neural network.
The candidate target area is pooled by the pooling layer of the convolutional neural network to obtain depth features of the same dimension. The depth features are input to the fully connected layer to obtain the fully connected features of the candidate target area, which are used for the final position regression and classification of the candidate target area. Because targets in high-voltage dense channel satellite images can be rotated in many directions, the pooled candidate target area may become mixed with the background area, so an offset variable needs to be introduced into the area pooling operation through the deformable convolution method [15].
Given that the size of the rectangular area is $w \times h$ and that the pooling operation converts it to size $k \times k$, the conversion formula is:
$$y(i, j) = \sum_{q \in \mathrm{bin}(i,j)} \frac{x(q_0 + q)}{n_{ij}} \tag{13}$$
where $\mathrm{bin}(i,j)$ is the $(i,j)$-th sub-region, $n_{ij}$ is the number of pixels in that sub-region, and $q_0$ denotes the coordinates of the upper left corner.
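The offset $\Delta q_{ij}$ is added to formula (13), which gives:
$$y(i, j) = \sum_{q \in \mathrm{bin}(i,j)} \frac{x(q_0 + q + \Delta q_{ij})}{n_{ij}} \tag{14}$$
In formula (14), $0 \leqslant i, j < k$.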
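The area pooling with per-bin offsets can be sketched as below. This is a simplified illustration, not the paper's implementation: offsets are restricted to integers (so no interpolation is needed), shifted indices are clipped to the map border, and the name `roi_pool` and its argument layout are hypothetical.

```python
import numpy as np

def roi_pool(x, q0, w, h, k, offsets=None):
    """Average-pool the w x h region with top-left corner q0 = (row, col)
    into k x k bins. offsets[i][j] = (dy, dx) is an integer shift applied
    to bin (i, j); None means ordinary (non-deformable) pooling.
    """
    H, W = x.shape
    out = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            r0 = q0[0] + int(np.floor(i * h / k))
            r1 = q0[0] + int(np.ceil((i + 1) * h / k))
            c0 = q0[1] + int(np.floor(j * w / k))
            c1 = q0[1] + int(np.ceil((j + 1) * w / k))
            dy, dx = (0, 0) if offsets is None else offsets[i][j]
            rows = np.clip(np.arange(r0, r1) + dy, 0, H - 1)
            cols = np.clip(np.arange(c0, c1) + dx, 0, W - 1)
            region = x[np.ix_(rows, cols)]
            out[i, j] = region.mean()   # sum over the bin divided by n_ij
    return out

x = np.arange(36, dtype=float).reshape(6, 6)
plain = roi_pool(x, q0=(1, 1), w=4, h=4, k=2)
offsets = [[(0, 1), (0, 0)], [(0, 0), (0, 0)]]   # shift only bin (0, 0) right by 1
deformed = roi_pool(x, q0=(1, 1), w=4, h=4, k=2, offsets=offsets)
```

Whatever the size of the input region, the output is always $k \times k$, which is what gives the candidate areas depth features of the same dimension.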
Through the above analysis, it can be known that the target area identification sub-network is operated by deformation pooling, and the depth characteristics of each candidate target area are extracted from the residual block group. The convertible loss function is:
Where μ10 represents the weighting parameter of the target area discriminating sub-network, the training set of the target area discriminating sub-network is represented as S10, and W d represents the depth feature of the target area.
The satellite target detection process for high-voltage dense channels based on deep learning is as follows. The target area prediction sub-network and the target area discriminating sub-network are initialized, and a fully connected layer is added. The two sub-networks are trained jointly with the set learning rate. After training is completed, the high-voltage dense channel satellite image to be detected is input. The targets ranked highest by confidence are taken as candidate detection results. Non-maximum suppression is then applied to remove detection frames that overlap strongly with a higher-scoring frame, and the remaining targets form the target detection result of the high-voltage dense channel satellite image.
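The final suppression step can be illustrated with a standard greedy non-maximum suppression routine. The box format `(x1, y1, x2, y2)`, the IoU threshold of 0.5 and the function names are assumptions of this sketch; the paper does not state the threshold it uses.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union of one box with an array of boxes (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    order = np.argsort(scores)[::-1]      # highest confidence first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]   # drop heavily overlapping boxes
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)   # the second box overlaps the first and is suppressed
```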
To verify the effectiveness of the proposed deep learning-based satellite target detection algorithm for high-voltage dense channels, the NWPU VHR-9 data set is selected as the experimental object. The data set contains 1947 high-voltage dense channel satellite images, as shown in Fig. 3.

Fig. 3. High-voltage dense channel satellite images.
The data set is divided into 10 sample sets, and each sample set contains 228 satellite images of high-voltage dense channels. 70% of the samples from each sample set are selected as the training set, and the remaining 30% as the test set. The training set and test set are independent.
The deep learning-based aerial small target detection algorithm (Reference [6]) and the deep learning-based traffic target region of interest detection algorithm (Reference [7]) are selected for comparison with the algorithm in this paper, so that the detection performance of the proposed algorithm can be evaluated intuitively.
The recall rate is the ratio of the number of positive samples correctly detected by an algorithm to the actual number of all positive samples. The accuracy rate (precision) is the proportion of samples detected as positive that are actually positive.
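The two measures can be written out in terms of true positives (TP), false positives (FP) and false negatives (FN). The function name and the counts below are hypothetical and serve only to illustrate the definitions; they are not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative counts: 90 correct detections, 10 false detections, 30 missed targets
precision, recall = precision_recall(tp=90, fp=10, fn=30)
```

Here 90 of the 100 detections are correct, and 90 of the 120 actual targets are found.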
The accuracy and recall rates of the three algorithms are calculated for detecting targets in high-voltage dense channel satellite images, and the P-R curves are drawn from these values. The P-R curves of the three algorithms are shown in Fig. 4.

Fig. 4. P-R curve.
The area enclosed by the P-R curve and the x- and y-axes in Fig. 4 is the average precision. In the target detection results for high-voltage dense channel satellite images, the average precision differs considerably between types of objects. To measure accurately how well the different algorithms detect targets in high-voltage dense channel satellite images, the mean average precision is used to evaluate the detection results of each algorithm. After processing by the proposed method, the image targets become clearer, as shown in Fig. 5:
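Interpreting average precision as the area under the P-R curve, it can be approximated from a ranked list of detections as sketched below. The summation rule (precision weighted by each increase in recall, without interpolation) is one common convention and an assumption here, as are the function name and the toy data.

```python
import numpy as np

def average_precision(scores, is_positive, num_gt):
    """Area under the P-R curve for one class.

    scores:      confidence of each detection
    is_positive: 1 if the detection matches a ground-truth target, else 0
    num_gt:      number of ground-truth targets
    """
    order = np.argsort(scores)[::-1]                 # rank by confidence
    hits = np.asarray(is_positive, dtype=float)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(1.0 - hits)
    precision = tp / (tp + fp)
    recall = tp / num_gt
    prev_recall = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - prev_recall) * precision))

ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 1], num_gt=3)
perfect = average_precision([0.9, 0.8], [1, 1], num_gt=2)
# Mean average precision: the mean of the per-class AP values
mean_ap = float(np.mean([ap, perfect]))
```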

Fig. 5. Detection of satellite images.
For the sample sets of high-voltage dense channel satellite images, the mean average precision of the different algorithms is computed. Average precision measures how well an algorithm predicts target positions and categories, and it is widely used to evaluate target localization, target detection and instance segmentation models. A model predicts many bounding boxes, but most of them have low confidence. Therefore, the required output confidence is set to a fixed threshold, and the 10 sample sets are tested. The statistical results are shown in Table 1.
Table 1. Comparison of mean average precision results
It can be seen from the experimental results in Table 1 that, across the 10 sample sets of high-voltage dense channel satellite images, the average target detection accuracy of the algorithm in this paper is significantly higher than that of the other two algorithms.
The average accuracy of the algorithm in this paper is higher than 99%, which shows that the algorithm performs well in detecting targets in high-voltage dense channel satellite images.
Targets are detected in the high-voltage dense channel satellite images of the different sample sets, and the missed alarm rate and false alarm rate of the three algorithms are computed. The statistical results are shown in Figs. 6 and 7.

Fig. 6. Comparison of missed alarm rate of different algorithms.

Fig. 7. Comparison of false alarm rate of different algorithms.
It can be seen from the experimental results in Figs. 6 and 7 that, for the different sample sets, the missed alarm rate and false alarm rate of the algorithm in this paper when detecting targets in high-voltage dense channel satellite images are significantly lower than those of the other two algorithms. The missed alarm rate and false alarm rate of the algorithm in this paper are both lower than 1.4%, whereas those of the algorithms of References [6] and [7] are both higher than 2.0%. The experimental results show that, for different sample sets, the algorithm in this paper can effectively identify targets in high-voltage dense channel satellite images. It accurately distinguishes positive samples from negative samples and can be applied in practice to the detection of different types of targets in high-voltage dense channel satellite images.
By virtue of their powerful learning and feature extraction abilities, deep learning methods can learn high-level abstract features of images and avoid a large amount of manual feature design and parameter tuning, which are relatively subjective, so that the model generalizes better. The convolutional neural network is the most commonly used deep learning algorithm in the field of target detection. In this paper, a convolutional neural network is used to extract the feature map of high-voltage dense channel satellite images. The algorithm is then optimized in a targeted way: the feature map is sampled over a regular region, and depth features are obtained through the convolution and pooling layers. In the fully connected layer, target detection in high-voltage dense channel satellite images is realized. Experiments on the NWPU VHR-9 data set show that the algorithm is effective in image target detection. The main work of this paper is summarized as follows:
(1) A high-voltage dense channel satellite target detection algorithm based on deep learning is proposed.
(2) A convolutional neural network is introduced, which solves the problem of vanishing gradients in deep networks; at the same time, a deeper network enables better high-level semantic features to be extracted.
(3) The training part of the convolutional neural network is optimized, and the feature map of the previous layer is sampled over the regular input region, so that multi-level features can be fused. This takes advantage of feature information at different levels and greatly improves the detection performance for small targets without increasing the computational complexity.
(4) The depth features of the target area are extracted from the residual block group, and the remaining targets are obtained by suppressing overlapping detection frames, thus completing target detection in high-voltage dense channel satellite images.
The target detection performance of the proposed algorithm is verified by comparative experiments on the open NWPU VHR-9 image data set. The fluctuation of the P-R curve is small, and the prediction probability on the test data is high. The average precision is higher than 99%, and the missed alarm rate and false alarm rate are both lower than 1.4%.
However, high-voltage dense channel satellite targets are complex and diverse, so it is difficult to classify and regress them accurately, and there is still much room for improvement. Future work should therefore improve target attitude extraction and real-time tracking under limited sample data, optimize the extraction accuracy, and improve the real-time performance of the algorithm.
Declarations
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The study was supported by Taizhou Vocational College of Science and Technology Jiamu Talent Training Plan (NO. 2022JM018).
