Abstract
The power metering system is an important part of the smart grid for data acquisition and analysis. The fault state of the main station directly affects the stable and safe operation of the power metering system. Based on real-world data supplied by the monitoring platform of the Metrology Center of Guangdong Power Grid Co., Ltd., we present a novel malfunction diagnosis method for the main station of the power metering system. The proposed method utilizes the synthetic minority over-sampling technique (SMOTE) and designs a combined model of a long short-term memory (LSTM) network and ResNet. SMOTE addresses the sample imbalance problem. Furthermore, the combined LSTM-ResNet model employs LSTM to extract time-dependent signal features and exploits ResNet to optimize data flow. Consequently, the proposed LSTM-ResNet model improves training efficiency and malfunction diagnosis accuracy. The proposed diagnosis method is verified on the real-world data, and the results show that it surpasses traditional methods. A specific analysis of the results and the practical application of the proposed method is also presented.
Introduction
The smart grid is regarded as the infrastructure of the modern power system [1, 2]. It addresses the domino-effect failures that arise in traditional power systems from the hierarchical topology of assets [3]. The power metering system is an important part of the smart grid for data acquisition and analysis [4]. Benefiting from advanced measurement technology in power systems, the smart grid realizes the collection and monitoring of electric energy data of all consumers, as well as massive data analysis and business applications [5]. In recent years, advanced measurement infrastructure has been vigorously promoted for the power grid in China. At present, billions of on-site terminals and smart meters have been put into use, which provides advanced and effective technical means for local and municipal power supply bureaus to realize load prediction and control, electricity monitoring and abnormality analysis, power theft prevention, line loss statistics, etc. However, with the expansion of the power system scale, the frequent failures of the main station in the power metering system increase the difficulty of maintenance. Therefore, for the sake of power system stability, it is necessary to explore the operating state of the main station and investigate malfunction diagnosis technology for it.
At present, the most widely studied fault diagnosis algorithms are expert-database fault detection algorithms [6] and data-driven fault detection algorithms [7]. Among them, data-driven algorithms mainly include principal component analysis [8], partial least squares [9, 10], discriminant analysis [11], qualitative trend analysis [12], etc., as well as improved methods based on these [13]. In recent years, with significant progress in computing power, the effectiveness of data-driven algorithms such as artificial intelligence algorithms has improved significantly [14]. Big data mining and neural network algorithms have been widely studied and applied, and a variety of data-driven algorithms suitable for measurement systems have been proposed, gradually becoming the mainstream fault detection algorithms. Existing work on malfunction diagnosis has mainly focused on artificial neural networks (ANNs). In [15], the authors proposed an optimization method for an ANN-based genetic algorithm to improve the accuracy of the malfunction diagnosis model. For transient instability, an online prediction method based on an ANN was presented in [16]. Additionally, [17] proposed a novel ANN-based fault detection method for a power plant burner system. However, ANNs cannot extract the temporal information during power transmission, which is critical for malfunction diagnosis. In contrast to these ANN-based studies, the authors of [18, 19] used long short-term memory (LSTM) networks, which can extract temporal information, to increase malfunction diagnosis accuracy. As an improved recurrent neural network (RNN), the LSTM network can solve the gradient vanishing problem of the conventional RNN and extract time series features well [20]. In [18], a method based on the LSTM network was proposed that achieves good malfunction diagnosis performance in the presence of strong noise. Relying on common measurement signals, the work in [19] utilized LSTM networks to detect faults in a timely manner. The results showed that the LSTM network outperformed the convolutional network. Overall, LSTM is an efficient method for malfunction diagnosis. However, in research on data-based malfunction diagnosis in power systems, especially power metering systems, the use of LSTM networks is still in its initial stages [21].
Although the authors of [22] applied LSTM to malfunction diagnosis in the power system, they only considered the abnormal points of smart meters rather than the whole main station system. To the best of our knowledge, there is no open literature addressing the malfunction diagnosis of the main station of the power metering system. The large data sets resulting from complicated business processes present a new challenge for malfunction diagnosis of the main station. More specifically, the number of network layers may need to be increased for big data malfunction diagnosis. However, as network layers are stacked, a plain stacked LSTM model suffers from gradient vanishing and network performance degradation. In addition, the small-sample imbalance problem always exists in real-world power systems: the number of fault samples is too small compared with that of normal samples, which decreases malfunction diagnosis accuracy.
To overcome the abovementioned problems, we propose a novel malfunction diagnosis method for the main station of the power metering system based on the real-world data supplied by the monitoring platform of the Metrology Center of Guangdong Power Grid Co., Ltd. Specifically, we considered several methods for addressing the shortage of fault samples. Under-sampling methods lose a large amount of sample information, which is not conducive to model generation; cost-sensitive learning methods highlight negative samples by increasing sample weights, but too few negative samples make the generated models unstable; Gaussian mixture models (GMMs) are limited by the lack of an obvious distinction between positive and negative samples, and the quality of the generated samples is not up to standard; the synthetic minority over-sampling technique (SMOTE) [23] randomly generates new samples from existing faulty samples and can therefore produce a large number of negative samples with obvious features, effectively improving the accuracy of model training and addressing the sample imbalance problem. In addition, since LSTM is suitable for scenarios where historical information needs to be considered, and it also generalizes well enough to accommodate complex data from metering master stations, we use LSTM to model temporal data and analyze the historical state of the network. However, gradient vanishing remains a problem for deep LSTM networks. Since ResNet can alleviate the model “degradation” phenomenon (i.e., the vanishing/exploding gradient when the number of network layers deepens) [24], we introduce the ResNet module and connect it with the LSTM layer to build the LSTM-ResNet model. This LSTM-ResNet model uses LSTM to extract time-dependent signal features and exploits ResNet to optimize data flow. The addition of ResNet further improves training efficiency and malfunction diagnosis accuracy. According to the experimental results, the proposed diagnosis method surpasses LSTM and ResNet for the main station. As a benefit, the proposed method helps ensure reliable and stable operation of the power metering system.
The rest of this paper is organized as follows. Section 2 explores the data set and presents the selected data augmented with SMOTE training samples. In Section 3, we briefly describe the methodologies of LSTM and ResNet, which are used to construct the proposed model. Then, we propose a combined LSTM and ResNet model. In Section 4, we present experimental results compared with other deep learning models. Finally, the conclusion is given in Section 5. Overall, our paper has several innovative contributions, including: 1. Introducing the LSTM-ResNet approach for fault diagnosis. 2. Employing the SMOTE algorithm to balance the data. 3. Conducting experiments to demonstrate the performance of the proposed method.
Materials and equipment
Characteristic parameter extraction
The data set adopted in this paper was collected from July 1st to July 13th, 2022, by the comprehensive monitoring platform of China Southern Power Grid. The data from 4:00 to 10:00 on July 13th are considered fault data. The data set consists of 11 forms, each containing multiple characteristic parameters, representing the status monitoring data of different modules of the metering automation master station. The 11 forms are front-end service operation monitoring, full-link monitoring software program status data, full-link software monitoring calculation data entry link monitoring, full-link software monitoring database table-space monitoring, full-link software monitoring data entry link monitoring, full-link device monitoring IP information table, full-link device monitoring disk, full-link device monitoring network device, full-link device monitoring host device, device monitoring disk input-output (IO), database running service monitoring, and middleware service application list information.
To select from the 11 forms the feature parameters that are conducive to subsequent malfunction diagnosis, we drew data trend graphs of all the feature parameters and extracted the feature parameters by observing these graphs. Take the Middleware Service Application List Information form as an example. The form has three characteristic parameters: the current heap size of the application, the current number of sessions, and the current free heap of the application. Since the fault occurred from 4:00 to 10:00 on July 13th, 2022, we selected the data from 18:15 on July 12th to 14:00 on July 13th to draw the data trend charts, so that the changes in the parameters around the fault can be observed conveniently. The results are shown in Figs 1–3. In Figs 1–3, the X-axis represents time, while the Y-axis in Fig. 1 represents the current heap size of the application, the Y-axis in Fig. 2 represents the current free heap of the application, and the Y-axis in Fig. 3 represents the current number of sessions of the application. The number of current sessions and the current free heap of the application fluctuate more significantly during the fault period. Therefore, these two parameters are selected for subsequent malfunction diagnosis.
Fig. 1. The current heap size of the application.
Fig. 2. The current free heap of the application.
Fig. 3. The number of current sessions of the application.
According to the above principles, the feature parameters are extracted. Finally, 10 feature parameters are selected for subsequent malfunction diagnosis: the number of current sessions of the application, the current free heap of the application, the number of active sessions, the rollback segment, the utilization rate of the temporary table-space, the number of slow SQL statements, the number of files waiting to be stored in the database, the cumulative duration of storing in the database, the number of files successfully stored in the database, and the number of files that failed to be stored.
Fig. 4. The schematic of SMOTE.
SMOTE is an over-sampling algorithm that synthesizes artificial data to address class imbalance. It synthesizes minority-class data by increasing the sampling density in the neighbourhood of the minority-class samples. Concretely, a minority-class sample is randomly selected, its several nearest neighbouring data points are found, and new data samples are obtained by interpolating between the center sample and the neighbouring data points.
To address the small-sample imbalance problem, we use the SMOTE method to enlarge the minority class [25]. In this paper, we select more than 600 normal-state data points and 24 failure-state data points. A schematic of SMOTE is shown in Fig. 4. In this figure, the circles represent the majority-class sample points and the black stars represent the minority-class sample points. How the coordinate axes are drawn depends on the characteristics and type of the data set, and high-dimensional data usually require dimensionality reduction before visualization. Here, the horizontal and vertical axes represent the feature parameters of the sample points, which makes the algorithm process easier to follow.
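The steps of the algorithm, in the standard SMOTE formulation, are as follows:
(1) For each minority-class sample $x_i$, calculate the Euclidean distance to the other minority-class samples and select its $k$ nearest neighbours.
(2) Set the sampling rate $N$ according to the imbalance ratio.
(3) For each of $N$ neighbours $\hat{x}_i$ randomly chosen from the $k$ nearest neighbours of $x_i$, calculate the new sample
$$x_{new} = x_i + \mathrm{rand}(0, 1) \times (\hat{x}_i - x_i)$$
where $\mathrm{rand}(0, 1)$ is a random number between 0 and 1.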
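The interpolation idea behind SMOTE can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this paper: the function names `euclidean` and `smote`, the parameters `n_new`, `k`, and `seed`, and the toy two-feature fault samples are all assumptions made for illustration, and the center sample is drawn at random for every synthetic point rather than generating a fixed number of points per sample.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by interpolating between minority samples."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the center point.
        i = rng.randrange(len(minority))
        center = minority[i]
        # Rank the other minority samples by distance and keep the k nearest.
        others = [s for j, s in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda s: euclidean(center, s))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate: new = center + gap * (neighbour - center), gap in [0, 1).
        gap = rng.random()
        synthetic.append([c + gap * (n - c) for c, n in zip(center, neighbour)])
    return synthetic


# Hypothetical, already normalized fault samples with two features each.
faults = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15], [0.95, 0.05]]
new_points = smote(faults, n_new=6, k=2)
```

Because every synthetic point lies on the segment between two real minority samples, the generated data stay inside the region already occupied by the fault class, which is what distinguishes SMOTE from simply duplicating samples.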
LSTM
LSTM is designed to solve the long sequence dependence problem of the RNN [26]. Unlike the traditional RNN, LSTM uses the concept of “gates”. Meanwhile, the LSTM network has the same chain structure as the RNN, composed of a series of recurrent cell units. However, the LSTM cell is much more complex than the RNN cell, as shown in Fig. 5.
Fig. 5. The internal structure of the LSTM block.
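Figure 5 shows that an LSTM block consists of four parts: the forget gate, the memory cell, the input gate, and the output gate. The forget gate, output gate, and input gate are used to release, refresh, and control the information in the storage unit. At time point $t$, the input of the LSTM block includes the current sequence vector $x_t$, the memory unit $c_{t-1}$, and the previous hidden state $h_{t-1}$. The current hidden state $h_t$ and storage unit $c_t$ are the output. In the standard LSTM formulation, the forget gate, input gate, and output gate are calculated as follows:
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
The update of the storage unit is determined by the forget gate, which determines whether the historical information of the memory cell should be forgotten, and the input gate, which determines the proportions of the current input information and the abstract information of the previous time step in the state update. Precisely, the state of the storage unit is updated according to the following formulas:
$$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
The output corresponding to the hidden layer is represented as:
$$h_t = o_t \odot \tanh(c_t)$$
where $\sigma$ is the sigmoid function, $\odot$ denotes element-wise multiplication, $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input, and $W_f$, $W_i$, $W_o$, $W_c$ and $b_f$, $b_i$, $b_o$, $b_c$ are the weight matrices and bias vectors of the corresponding gates and of the candidate state $\tilde{c}_t$.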
As can be seen from the above calculation formulas, the storage unit runs through the whole LSTM structure. Therefore, the storage unit information can be transferred continuously along the entire chain. This is why LSTM can solve the long-term sequence dependence problem. In addition, LSTM can alleviate the gradient vanishing and exploding induced by RNN back propagation. In short, LSTM is an RNN variant that performs well on time series modeling tasks.
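To make the role of the gates concrete, the following is a minimal sketch of a single LSTM step in its standard textbook form, reduced to a scalar input and a scalar hidden state so that it runs with the Python standard library only. The function names `sigmoid` and `lstm_step`, the `params` dictionary, and all weight values are hypothetical choices for illustration and are not taken from the model in this paper.

```python
import math


def sigmoid(z):
    """Logistic function that squashes a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for a scalar input and a scalar hidden state."""
    def gate(name, activation):
        # Each gate has its own weights for h_prev and x_t plus a bias.
        w_h, w_x, b = params[name]
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = gate("f", sigmoid)          # forget gate
    i_t = gate("i", sigmoid)          # input gate
    o_t = gate("o", sigmoid)          # output gate
    c_hat = gate("c", math.tanh)      # candidate state
    c_t = f_t * c_prev + i_t * c_hat  # storage-unit update
    h_t = o_t * math.tanh(c_t)        # hidden state
    return h_t, c_t


# Hypothetical weights (w_h, w_x, b), shared by all gates for brevity.
params = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "o", "c")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.9]:
    h, c = lstm_step(x, h, c, params)
```

The storage unit `c` is carried from step to step and is only rescaled by the forget gate and incremented by the gated candidate, which is the path along which long-term information is preserved.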
ResNet
With the development of deep learning, networks have become progressively deeper. However, when a CNN reaches a certain depth, stacking more network layers no longer improves the model’s performance and instead leads to a decline in the accuracy of the network. To solve this problem, ResNet was proposed in 2015 by He et al. [27]. Unlike the traditional CNN, ResNet introduces the core idea of identity shortcut connections and adds a skip connection across several network layers. Specifically, the output of an upper layer is passed unchanged to a lower layer. Its structure is shown in Fig. 6.
Fig. 6. The structure of a ResNet block.
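Assume that the input of a certain layer of the CNN is $x$ and the expected output is $H(x)$.
In the ResNet module, the shortcut connection passes $x$ directly to the output, so the stacked layers only need to learn the residual $F(x) = H(x) - x$, and the block output becomes $H(x) = F(x) + x$.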
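The identity shortcut can be sketched in a few lines of pure Python. As a simplifying assumption, the residual branch below uses two small dense layers instead of the convolution and batch-normalization layers of an actual ResNet block; the helper names `relu`, `linear`, and `residual_block` and the all-zero weights are hypothetical.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def linear(v, weights, bias):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Compute relu(F(x) + x), where F is two dense layers with a ReLU between."""
    f = linear(relu(linear(x, w1, b1)), w2, b2)     # residual branch F(x)
    return relu([fi + xi for fi, xi in zip(f, x)])  # identity shortcut adds x


x = [1.0, 2.0]
zeros_w = [[0.0, 0.0], [0.0, 0.0]]
zeros_b = [0.0, 0.0]
# With an all-zero residual branch the block reduces to the identity mapping.
out = residual_block(x, zeros_w, zeros_b, zeros_w, zeros_b)  # -> [1.0, 2.0]
```

The zero-weight case shows why extra blocks need not hurt accuracy: a block that has learned nothing still passes its input through unchanged, and gradients can flow back along the same shortcut.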
LSTM-ResNet
Fig. 7. The structure of LSTM-ResNet.
As shown in Fig. 7, we combine LSTM and ResNet to propose the LSTM-ResNet model. Specifically, the LSTM-ResNet model consists of an LSTM layer, a convolution layer, three ResNet blocks, a pooling layer, and two fully connected layers. The output of the LSTM layer is used as the input of the ResNet blocks to extract spatial features and alleviate gradient vanishing. The ResNet block is shown in Fig. 6. The output of each convolutional layer in the ResNet block is followed by batch normalization (BN) to improve the gradient flow through the network, alleviating gradient vanishing and exploding [28]. The output of the ResNet blocks is used as the input of the fully connected layers, which contain dropout to prevent overfitting.
Experiment setup
Experiment settings
The rectified linear unit (ReLU) was used as the activation function of all convolution layers, and the activation of the output layer was set to softmax. The loss function was set to the focal loss, which adds a weight related to the sample proportion to the cross-entropy loss. The formula is as follows:
where
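As a hedged illustration of how a focal loss reweights the cross-entropy, the sketch below assumes the common alpha-balanced binary form $-\alpha_t (1 - p_t)^{\gamma} \log(p_t)$. Whether this is the exact variant used here is an assumption, and the function names, the default values `alpha=0.25` and `gamma=2.0`, and the clipping constant are illustrative only.

```python
import math


def cross_entropy(p_fault, is_fault):
    """Binary cross-entropy for one sample; p_fault is the predicted fault probability."""
    p_t = p_fault if is_fault else 1.0 - p_fault
    return -math.log(min(max(p_t, 1e-12), 1.0))


def focal_loss(p_fault, is_fault, alpha=0.25, gamma=2.0):
    """Alpha-balanced focal loss for one binary prediction (assumed variant)."""
    # p_t is the probability the model assigns to the true class.
    p_t = p_fault if is_fault else 1.0 - p_fault
    # alpha_t weights the fault class by alpha and the normal class by 1 - alpha.
    alpha_t = alpha if is_fault else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # clip to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, True)   # confident and correct: strongly down-weighted
hard = focal_loss(0.1, True)   # confident and wrong: keeps most of its loss
```

Setting `gamma=0.0` and `alpha=1.0` for a fault sample recovers the plain cross-entropy, which makes the weighting terms easy to inspect in isolation.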
Evaluation index
In these experiments, the effectiveness of the model is evaluated by the False Positive Rate (FPR) and the False Negative Rate (FNR). The formulas are as follows:
$$\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad \mathrm{FNR} = \frac{FN}{FN + TP}$$
where FN is the number of faulty samples judged to be normal, TP is the number of faulty samples judged to be faulty, FP is the number of non-faulty samples judged to be faulty, and TN is the number of non-faulty samples judged to be normal.
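The two rates follow directly from the four counts defined above, using the standard definitions FPR = FP / (FP + TN) and FNR = FN / (FN + TP). The following sketch computes them from label lists; the function names and the toy labels (1 = faulty, 0 = normal) are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN), with 1 = faulty and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def fpr_fnr(y_true, y_pred):
    """False Positive Rate and False Negative Rate; 0.0 when a class is absent."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
fpr, fnr = fpr_fnr(y_true, y_pred)  # FP=1, TN=4, FN=1, TP=2
```

In this toy case one of five normal samples is flagged (FPR = 0.2) and one of three faults is missed (FNR = 1/3), which shows how a small fault class makes the FNR sensitive to individual errors.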
Experiment results
Before the experiments, we processed the original data collected from the main station of the power metering system. First, we selected the feature parameters conducive to subsequent malfunction diagnosis using the method described in Section 2.1 and then sorted all data in time order. Only a short period before and after the faults occurred was retained, and these data were balanced by fault category using SMOTE. All data were normalized to the range of 0 to 1 before being used for model training and testing. We conducted all experiments using TensorFlow 2.4.1 as the back end, with the Adam optimizer and a learning rate of 0.01. All data were shuffled before each epoch of the training process.
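Scaling to the range of 0 to 1 is commonly done with min-max normalization, applied to each feature separately. The sketch below assumes that form; the function name `min_max_normalize` and the sample session counts are hypothetical.

```python
def min_max_normalize(column):
    """Scale one feature column to [0, 1] using its own minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0 everywhere.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


sessions = [12, 15, 40, 18]  # hypothetical raw session counts
scaled = min_max_normalize(sessions)
```

In practice the minimum and maximum would usually be taken from the training split only and reused for the test split, so that no information from the test data influences the scaling.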
Performance of proposed method on different weight factors
In this subsection, we investigate the influence of the weight factor of the focal loss on the performance of the proposed model.
Table 1 shows the False Positive Rate, False Negative Rate, and prediction accuracy of our proposed LSTM-ResNet for different values of the weight factor.
Table 1. Performance metric on different weight factor
With/without SMOTE
In this subsection, we conduct a further experiment to explore how SMOTE influences the performance of the model; the results are shown in Fig. 9. Figure 9a and b show the curves of validation accuracy and loss with/without SMOTE, respectively. SMOTE clearly improves the performance of the proposed model. Specifically, the validation accuracy with SMOTE is 10% higher than that without SMOTE, and the validation loss with SMOTE is 0.15 lower than that without SMOTE.
Fig. 9. Performance of proposed model with/without SMOTE.
To confirm the effectiveness of our proposed LSTM-ResNet, we conducted several experiments with Transformer [29], multilayer perceptron (MLP) [30], LSTM, ResNet, and LSTM-ResNet models. The results of the experiments are shown in Fig. 10 and Table 2. Figure 10a shows the validation accuracy curves of our proposed LSTM-ResNet, LSTM, ResNet, Transformer, and MLP. The validation accuracy of LSTM-ResNet is higher than that of LSTM and ResNet. Specifically, the curve of our LSTM-ResNet soared to 0.85 at the early stage and then rose to 0.9 or more, whereas the curves of LSTM and ResNet rose to around 0.8 and those of Transformer and MLP rose to around 0.85. Figure 10b shows the validation loss curves of our proposed LSTM-ResNet, LSTM, ResNet, Transformer, and MLP. It can be observed that LSTM-ResNet outperforms LSTM, ResNet, and the other networks. To be specific, the curve of LSTM-ResNet plunged to 0.3 at the early stage and then converged to around 0.20, whereas the curve of LSTM converged to around 0.42 and the curve of ResNet converged to 0.30; the other networks behaved similarly.
Table 2 lists the False Positive Rate, False Negative Rate, and prediction accuracy of our proposed LSTM-ResNet, LSTM, ResNet, MLP, and Transformer. The proposed LSTM-ResNet has a lower False Positive Rate, a lower False Negative Rate, and a higher prediction accuracy than the other four models. Overall, the experimental results verify the effectiveness of the proposed model.
Table 2. Performance metric with three models
Fig. 10. Performance of the three models.
Conclusion
To improve operational reliability in the power metering system, this paper proposes a novel malfunction diagnosis method based on the real-world data provided by the monitoring platform of the Metrology Center of Guangdong Power Grid Co., Ltd. The proposed method utilizes SMOTE and designs a combined model of LSTM and ResNet. Specifically, we extend the selected data with SMOTE-generated training samples to address the small-sample imbalance problem. Furthermore, the combined LSTM-ResNet model employs LSTM to extract time-dependent signal features and exploits ResNet to optimize data flow. According to the experimental results, the proposed diagnosis method has a lower False Positive Rate and False Negative Rate than traditional models. However, its False Negative Rate is relatively high compared with its False Positive Rate. In the next step, research will be carried out to reduce the False Negative Rate while keeping the False Positive Rate constant. In addition, the amount of data in the current study is small, so sequential analysis and comprehensive fault analysis on more data are needed in the future.
Acknowledgments
The authors acknowledge the National Natural Science Foundation of China (Grant: 62173256) and the China Southern Power Grid Company Limited (Grant: 035900KK52190016(GDKJXM20199917)).
