Abstract
Even when customers have renewable sources on their own premises, the utility’s supply must be maintained reliably to ensure the availability of electricity. Security prediction of the high voltage transmission system (HVTS) is significant in the modern scenario, since power failures result in huge economic loss and can sometimes affect human life and comfort. The responsibility of the system operator is to supply electricity to its customers with a reasonably high degree of reliability and good power quality. This calls for assessing the security levels and initiating early steps to mitigate the effect of failures. Such prediction of the system security levels ensures availability of service to customers and helps in operations planning. This paper proposes a pattern recognition approach using the k-nearest neighbor (k-NN) classifier to predict the security level of the HVTS from failure rates assessed from historical operating data. The method employs a newly developed Gaussian fuzzy index formulated by the authors using the failure rates in the HVTS. Both simulation and validation using field data have been carried out, and the results are given in the paper. The overall accuracy obtained is 89.88%.
Introduction
Power system security is the probability of a power system’s operating point remaining in a viable state of operation. If the power flows and the bus voltages of a power system are within acceptable limits, irrespective of variations in generation or load demand or equipment outages, the system is in the secure state. Security assessment and security control are the two major divisions of power system security. While the assessment determines the security level of the system operating state, the security control determines the security-constrained scheduling required to attain the target security level in an optimal manner. Supervisory control and data acquisition (SCADA) systems and post-processing by a state estimator are used to monitor the states of the power system [1]. This assessment is significant in modern times, since power failures result in huge economic loss and can sometimes affect human life and comfort. The high voltage transmission system (HVTS) remains the backbone of the modern power system in spite of the large penetration of renewable energy sources (RES), since it links the generating stations and the load centers. The transmission system operator (TSO) plays a key role in ensuring national energy security [2]. Compared to the generation system, transmission system reliability analysis is more complex, and for the operators the assessment of reliability and security remains a challenge due to the multitude of equipment, the insulation coordination and so on. Security assessment studies play a significant role in power system operations planning. The methods can be extended to the real-time system to provide valuable information for maintaining the security of the system under any conditions. Based on such information, the TSO may draw up a plan based on the current forecast and check whether supply and demand will balance at each point on the network for a given finite network capacity.
The TSO, in collaboration with the producers and the end users connected to the grid and with other distribution system operators, must take the necessary measures to ensure the continuity of electric energy supply and to minimize failures, possibly through both demand-side and supply-side management. Substations are important nodal points in the HVTS, specifically when they feed critical loads in a load centre. Quick detection and isolation of transmission failures remain the highest priority in a substation for maintaining high power quality. Pattern recognition techniques are well known for classification. Because the k-NN technique is simple and highly efficient in pattern recognition, it is a popular choice [3, 4] for prediction in different areas of research. The k-NN classifier has been used effectively for crime prediction and classification [5]. A novel privacy-preserving k-NN classification protocol for data security has been presented in [6], and a k-NN based load forecasting technique for generation scheduling has been discussed in [7].
Machine learning remains one of the most popular approaches to power system security assessment. A method for online voltage security assessment using phasor measurement units (PMU) and decision trees (DT) has been presented in [8], where the DTs are trained offline and updated periodically to improve robustness. Only two states, namely secure and insecure, have been considered there for the voltage security assessment. Fast voltage security assessment for severe contingencies has been performed by comparing the real-time PMU measurements with the critical splitting rule (CSR) in the decision tree, and the method has been tested on the American Electric Power (AEP) system. A hybrid approach based on random forest models and a boosting model has been proposed in [9] for the evaluation of power system reliability. In [10], multiple machine learning techniques such as artificial neural networks (ANNs), support vector machines (SVM) and DTs are used to determine the security of the system; this can be taken as a semi-automated method for online security assessment with four states, namely normal, alarm, correctable emergency and non-correctable emergency. Machine learning techniques based on the classification and regression tree (CART) algorithm and a probabilistic DT approach for the estimation of voltage security are discussed in [11].
In this work an attempt is made to predict the security levels of the HVTS using the k-nearest neighbour (k-NN) classifier. A new approach is introduced in which the HVTS is modeled as a 16-state Markov model with five security levels, viz. secure, alert, insecure, emergency and blackout, considering the violations in voltage (V), frequency (F) and thermal limits (T) and the failure of electrical components. The transition between the levels depends on the failure and repair rates of the above-said parameters. From this Markov model, it is possible to obtain the probability of each security level. The system failures due to the violations of the above-said parameters are unpredictable and can be treated as fuzzy membership functions. A universal fuzzy security index (UFSI) has been developed using Gaussian fuzzy membership functions of the failure rates of the system due to violations of voltage, frequency and thermal limits and the failure of heavy electrical components. The k-NN classifier is used to predict the security level of the system, with the failure rates and the UFSI as input features. The predicted results are compared with the actual results, and the overall accuracy is found to be 89.88%.
The paper makes the following contributions:
(i) The HVTS is modeled as a sixteen-state Markov model by taking into account the violations of voltage, frequency and thermal limits and the failure of electrical components.
(ii) A universal fuzzy security index (UFSI) is developed by assigning Gaussian fuzzy membership functions to the violations of voltage, frequency and thermal limits and the failure of electrical components.
(iii) Security prediction is performed by the k-NN classification technique with five inputs, viz. the above-said violations and the UFSI.
(iv) The performance of the k-NN classifier is evaluated, and the overall accuracy obtained is 89.88%.
The remaining part of the paper is organized as follows. A brief description of transmission system security is given in Section 2, and Section 3 explains the development of the Markov model of the HVTS. The formation of the UFSI is given in Section 4, and security prediction using the k-NN classifier is explained in Section 5. The case study with simulation results and the conclusions are presented in Sections 6 and 7 respectively.
Transmission system security
Three different methodologies that quantify transmission system security are (i) probabilistic transient stability assessment, (ii) well-being analysis and (iii) overload security assessment [12]. The probabilistic transient stability assessment includes dynamic analysis, and the remaining two deal with steady-state security assessment. The focus of all three methodologies is on developing overall system indices for system security, not component importance indices.
The DC power flow method, sensitivity analysis and the compensation method [13] are the general methods for transmission section security analysis. In the DC power flow method, the non-linear power flow problem is simplified to a linear one, which makes the calculation fast and convenient [14, 15]. In the sensitivity method, a branch outage is treated as a disturbance of normal operating conditions, and its impact is simulated by power injection into nodes. The sensitivity method is widely used because of its practicality, clear physical concepts and high efficiency [16, 17]. In the compensation method, a transmission line trip is simulated by injecting power or current increments at the two end nodes [18, 19]. An algorithm that combines DC sensitivity and the compensation method, building on the traditional study of branch outages, is proposed in [20] for the fast prediction of cascading overload for transmission section protection. The prediction of the security level of the power system in Bosnia is presented in [21], performed using Markov chains in conjunction with Monte Carlo simulation. The real security levels agree closely with the predicted results, and the computation is fast compared to other methods. Based on a probabilistic steady-state and dynamic security assessment model, a two-level system model is introduced in [22] using time to insecurity as the security index; the results can guide operators in analysing system security effectively and taking preventive control action. A method based on machine learning and proper sampling techniques is proposed in [23] for overcoming the difficulties of conventional Monte Carlo approaches.
In this work, the predicted results obtained from the k-NN classifier are compared with the true results in a confusion matrix. The performance measures of the k-NN classifier are verified and found to be suitable for power system security level prediction.
Markov chain
A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Let $S = \{s_1, s_2, \ldots, s_r\}$ be a set of states; the process may start in any one of them. The process moves successively from one state $i$ to another state $j$, each move being called a step, with a probability $P_{ij}$, which is the transition probability.
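A Markov chain is an integer-time process $\{X_n, n \ge 0\}$ for which the sample values for each random variable $X_n$, $n \ge 1$, lie in a countable set $S$ and depend on the past only through the most recent random variable $X_{n-1}$. More specifically, for all positive integers $n$, and for all states $i, j, k, \ldots, m$ in $S$,
$$\Pr\{X_n = j \mid X_{n-1} = i, X_{n-2} = k, \ldots, X_0 = m\} = \Pr\{X_n = j \mid X_{n-1} = i\} \qquad (1)$$
Equation (1) can be abbreviated as Equation (2), given below [24]:
$$\Pr\{X_n \mid X_{n-1}, X_{n-2}, \ldots, X_0\} = \Pr\{X_n \mid X_{n-1}\} \qquad (2)$$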
Furthermore, $\Pr\{X_n = j \mid X_{n-1} = i\}$ depends only on $i$ and $j$ (not $n$) and is denoted by
$$\Pr\{X_n = j \mid X_{n-1} = i\} = P_{ij} \qquad (3)$$
The initial state
Markov chains are very effective for statistical modeling. They can describe the complex behavior of various stochastic systems and have a well-developed mathematical apparatus. A deterministic model predicts a single outcome from a given set of circumstances, whereas a stochastic model predicts a set of possible outcomes weighted by their probabilities. In Reliability, Maintainability, and Safety (RMS), a system’s operation over time can be described by stochastic modeling, and Markov models are frequently used for this purpose. The failure or repair of a module can occur at any point in time, and the final state reached by the system depends on the system configuration. A Markov process is completely characterized by its transition probability matrix. Here the transition matrix elements are the failure and repair rates in the HVTS.
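As a minimal illustration of how a transition probability matrix characterizes a Markov process, the following Python sketch propagates a state distribution through a small chain. The three-state matrix and all of its values are hypothetical and are not taken from the HVTS data; the function names are likewise only illustrative.

```python
# Hypothetical 3-state transition matrix (each row sums to 1); illustrative values only.
P = [
    [0.90, 0.08, 0.02],
    [0.50, 0.40, 0.10],
    [0.30, 0.00, 0.70],
]


def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def n_step(dist, P, n):
    """State distribution after n steps, starting from dist."""
    for _ in range(n):
        dist = step(dist, P)
    return dist


start = [1.0, 0.0, 0.0]             # the process starts in the first state
after_one = n_step(start, P, 1)     # reproduces the first row of P
after_many = n_step(start, P, 200)  # approaches the long-run distribution
```

After one step the distribution is simply the row of the matrix for the starting state; after many steps it no longer depends noticeably on where the process started, which is the property exploited when long-run level probabilities are derived from a Markov model.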
The sixteen-state Markov model of the HVTS is shown in Fig. 1. The sixteen states are arranged in five levels, which are determined by the failure rates. Instead of taking the failure and repair of components alone, the failures due to violations of voltage, frequency and thermal limits and the contingency of heavy electrical equipment have been considered for modeling the HVTS. State 1 is in the secure level, which is the most desired level. States 2, 3, 4 and 5 are all in the alert level, where a violation is assumed to occur in any one of the four parameters. States 6 to 11 are in the insecure level, where violations occur in any two of the parameters. The emergency level is assigned to states 12 to 15, where three parameters violate the limits. The sixteenth state is assigned the blackout level, where all four parameters violate the limits. Here $\lambda_V$, $\lambda_F$, $\lambda_T$ and $\lambda_H$ are the failure rates, and $\mu_V$, $\mu_F$, $\mu_T$ and $\mu_H$ are the repair rates. The suffixes V, F, T and H represent voltage, frequency, thermal limit and heavy electrical components.

Fig. 1. Sixteen-state, five-level Markov model of HVTS.
The sixteen-state Markov model can be reduced to a five-level diagram, where the levels are identified as secure, alert, insecure, emergency and blackout, as in Fig. 2. The secure level is the most preferred, and the blackout level is the least desirable. The security of the model can be assessed by developing a universal fuzzy security index. Here the index is developed by taking Gaussian membership functions for the failure rates, as discussed in the next section.

Fig. 2. Reduction of the sixteen-state Markov model to five levels.
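The state structure described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: each state is represented by the set of parameters in violation, each transition is assumed to be a single failure or a single repair, the rates are hypothetical values rather than those of the case study, and the steady-state probabilities are obtained by uniformization and power iteration, a generic numerical technique that the paper does not specify.

```python
from itertools import combinations

PARAMS = ("V", "F", "T", "H")
LEVELS = ("secure", "alert", "insecure", "emergency", "blackout")

# Hypothetical failure (lam) and repair (mu) rates; not the case-study values.
lam = {"V": 0.20, "F": 0.10, "T": 0.30, "H": 0.05}
mu = {"V": 2.0, "F": 4.0, "T": 1.5, "H": 1.0}

# A state is the set of parameters in violation. Ordering by the number of
# violations gives state 1 (none) through state 16 (all four).
states = [frozenset(c) for k in range(5) for c in combinations(PARAMS, k)]
index = {s: i for i, s in enumerate(states)}


def transition_rates():
    """Rate matrix Q, assuming one failure or one repair per transition."""
    n = len(states)
    Q = [[0.0] * n for _ in range(n)]
    for s in states:
        i = index[s]
        for p in PARAMS:
            if p in s:
                Q[i][index[s - {p}]] = mu[p]   # repair of parameter p
            else:
                Q[i][index[s | {p}]] = lam[p]  # failure of parameter p
        Q[i][i] = -sum(Q[i])                   # diagonal makes the row sum zero
    return Q


def steady_state(Q, iters=2000):
    """Steady-state probabilities via uniformization and power iteration."""
    n = len(Q)
    rate = 1.05 * max(-Q[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def level_probabilities(pi):
    """Collapse the 16 state probabilities into the five security levels."""
    out = dict.fromkeys(LEVELS, 0.0)
    for s, p in zip(states, pi):
        out[LEVELS[len(s)]] += p               # level = number of violations
    return out


pi = steady_state(transition_rates())
levels = level_probabilities(pi)
```

Grouping states by the number of violated parameters reproduces the 1, 4, 6, 4, 1 split of states across the secure, alert, insecure, emergency and blackout levels.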
Because fuzzy logic can deal with uncertainty and nonlinearity, the failures in the power system can be represented by fuzzy membership functions. The failure rates of the system due to violations of the system parameters, namely the voltage, frequency and thermal limits and the contingency of heavy electrical equipment, are represented by Gaussian fuzzy membership functions. A security membership grade (SMG) can be associated with each failure rate, since each failure rate holds its own security level; the lowest security membership grade (LSMG) indicates the weakest point in the fuzzy membership function of the respective failure. A universal security index of the power system is formed by the interaction of all the failure rates using their fuzzy membership functions. Each failure rate is characterized by its LSMG. Hence, it is possible to identify the part of the high voltage transmission system with the worst security level, as well as to calculate the joint fuzzy level of the system by the universal fuzzy security index (UFSI). A binary mapping is used to aggregate the relevant LSMGs through the binary operator T [25, 26]. The UFSI itself is determined by multiplying the LSMGs of the four failure rates in the HVTS. UFSI = 0 indicates power system blackout.
An adapted Bellman–Zadeh model [27] has been used for the calculation of the UFSI. If $S = \{s_i\}$ is a set of possible states of a power system operation, then a fuzzy security operating state can be defined as a fuzzy set in $S$, characterized by its SMG $\mu(s_i): S \to [0, 1]$, so that $\mu(s_i)$ specifies the grade of membership in a continuous domain of a particular state $s_i \in S$ [28]. For a power system segment in the current power system operating state $s_i$, the fuzzy security state (i.e., the LSMG) can be expressed as in Equation (4),
Here ‘∧’ is the fuzzy operator that represents the minimum of the relevant component SMGs. The UFSI is given as an aggregation of the LSMGs of the power system segments through the fuzzy intersection operator in operating state $s_i$:
The product of the four segment membership functions results in a new fuzzy set with a membership function $\mu_{agg}(s_i)$ by the Larsen implication rule [29].
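The aggregation described above can be sketched in a few lines of Python. This is a hedged illustration only: the function names `lsmg` and `ufsi`, the segment names and all membership grades are hypothetical, and the sketch assumes that the LSMG of a segment is the minimum of its component SMGs and that the UFSI is the product of the four LSMGs, as stated in the text.

```python
from math import prod


def lsmg(component_smgs):
    """Lowest security membership grade of a segment (fuzzy minimum)."""
    return min(component_smgs)


def ufsi(segment_lsmgs):
    """Aggregate the segment LSMGs by multiplication (product rule)."""
    return prod(segment_lsmgs)


# Hypothetical membership grades in [0, 1] for the four failure categories.
smgs = {
    "voltage": [0.9, 0.8],
    "frequency": [0.95],
    "thermal": [0.7, 0.85, 0.9],
    "heavy_equipment": [1.0],
}

segment_grades = {name: lsmg(values) for name, values in smgs.items()}
index_value = ufsi(segment_grades.values())           # 0.8 * 0.95 * 0.7 * 1.0
weakest = min(segment_grades, key=segment_grades.get)  # worst security level
```

Because the aggregation is a product, a single LSMG of zero drives the index to zero, which matches the statement that UFSI = 0 indicates blackout.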
The developed UFSI, along with the failure rates, has been selected as the input feature set for power system security prediction using the k-NN classifier, which is discussed in the next section.
Identifying the security level from the UFSI value alone is not satisfactory, since the ranges for specific levels cannot be defined precisely. Although the index value is high for the secure level and decreases towards the blackout level, the variation of the index value is not uniform. Security prediction can instead be performed with the k-NN classifier technique, where the input features are the four failure rates together with the developed UFSI, and the result can be more accurate. Classification Learner trains models to classify data. With it, the data can be explored, features selected, validation schemes specified, models trained, and results assessed. It supports decision trees, discriminant analysis, support vector machines, logistic regression, nearest neighbors and ensemble classification, and it offers automated training to search for the best classification model type. The k-NN classifier achieves about 90% accuracy with medium prediction speed and memory usage. It categorizes query points based on their distance to points (or neighbors) in a training dataset, and various metrics can be used to determine the distance. Given a set $X$ of $n$ points and a distance function, k-nearest neighbor (k-NN) search finds the $k$ closest points in $X$ to a query point or set of points. k-NN based algorithms are widely used as benchmark machine learning rules.
A set of ‘n’ labeled examples
The distances between different objects are positive, and the distance between ‘x’ and ‘y’ is the same as the distance between ‘y’ and ‘x’. The triangle inequality means, roughly, that going from ‘x’ to ‘z’ and then to ‘y’ is never shorter than going directly from ‘x’ to ‘y’. Some typical distance functions used in distance calculations are shown in the equations below.
In this paper the authors have taken Euclidean distance approach for neighbour calculation.
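A minimal Python sketch of k-NN classification with the Euclidean distance is given below for illustration. The function name `knn_predict`, the choice of k = 3, the majority-vote rule and all feature values are assumptions made for this example; the feature layout (four failure rates followed by the UFSI) follows the description in the text, but the numbers are not the paper's training data.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors: (lambda_V, lambda_F, lambda_T, lambda_H, UFSI).
train_x = [
    (0.05, 0.05, 0.10, 0.05, 0.90),
    (0.10, 0.05, 0.05, 0.10, 0.85),
    (0.40, 0.10, 0.10, 0.05, 0.60),
    (0.45, 0.15, 0.10, 0.10, 0.55),
    (0.90, 0.80, 0.90, 0.85, 0.02),
    (0.95, 0.90, 0.85, 0.90, 0.01),
]
train_y = ["secure", "secure", "alert", "alert", "blackout", "blackout"]

query = (0.42, 0.12, 0.10, 0.08, 0.58)
predicted = knn_predict(train_x, train_y, query, k=3)
```

With these illustrative points, two of the three nearest neighbours of the query carry the label "alert", so the vote returns that level.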
The failure rates and the UFSI with known security levels are chosen as feature vectors and fed to the k-NN classifier for training the model. Figure 3 represents the proposed k-NN classifier with training and testing data. The prediction output from the classifier can be any one of the five security levels.

Fig. 3. k-NN classifier.
Sensitivity, specificity, and accuracy are the performance measures used for the evaluation of classifiers [32]. These parameters are evaluated by comparing the actual test output and the predicted output. A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known; it shows the number of true positives (TPs), false positives (FPs), false negatives (FNs), and true negatives (TNs) for a classifier. For multi-class problems, the elements on the main diagonal of the confusion matrix show correct classifications, and all other elements show incorrect classifications. A correct classification is a true positive prediction (TP). TN is the number of true negative predictions, and FP is the number of false positive predictions for the considered class. FP is the sum of the values in the corresponding column excluding TP. The total number of false negatives for a class is the sum of the values in the corresponding row excluding TP. The total number of true negatives for a specific class is the sum of all columns and rows excluding that class’s column and row.
The precision is the ratio of true positive results to the sum of true positive and false positive results and is given by Equation (14) [33]
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (14)$$
Sensitivity relates to the ability of the trained model to identify positive results. It corresponds to the true positive rate of the considered class and is given by Equation (15) [33]
$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (15)$$
The total number of test examples of any class is the sum of the corresponding row, i.e., TP + FN for that class. Specificity corresponds to the true negative rate of the considered class and is given by Equation (16) [33]
$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (16)$$
The classification accuracy (Acc) is the degree of closeness of the predicted values to the actual (true) values and is defined as in Equation (17) [33]
$$\text{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (17)$$
Accuracy is calculated as the number of correct classifications divided by the total number of classifications; it is the fraction of predictions that are correct.
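The row and column rules above can be illustrated with a short Python sketch. The function names and the three-class confusion matrix are hypothetical and are not the paper's results; rows are taken as actual classes and columns as predicted classes, consistent with the description in the text. The sketch does not guard against a class that is never predicted or never present, which would cause a division by zero.

```python
def per_class_counts(cm, c):
    """TP, FP, FN, TN for class c; rows are actual, columns are predicted."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[c][c]
    fp = sum(cm[r][c] for r in range(n)) - tp  # column sum excluding TP
    fn = sum(cm[c]) - tp                       # row sum excluding TP
    tn = total - tp - fp - fn                  # everything outside row/column c
    return tp, fp, fn, tn


def measures(cm, c):
    """Precision, sensitivity, specificity and accuracy for class c."""
    tp, fp, fn, tn = per_class_counts(cm, c)
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical 3-class confusion matrix; illustrative values only.
cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

m0 = measures(cm, 0)
# Fraction of all test cases on the main diagonal (correctly classified).
overall = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))
```

Note that the per-class accuracy counts true negatives as correct, so it can be higher than the overall fraction of correctly classified test cases.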
The HVTS taken for the case study is the 110 kV utility substation Chevayur of the Kerala State Electricity Board; its representation is shown in Fig. 4. Two 110 kV feeders, from the Nallalam 220 kV substation and the Kakkayam 220 kV switching substation, are the input feeders. The output feeders comprise one 66 kV feeder and twelve 11 kV feeders. During peak hours the 11 kV feeders are loaded to the maximum thermal limit. The maximum load observed in the last year was 33 MVA at 1800 A. Two underground feeders are under construction in the substation, and the substation needs upgrading shortly. The substation feeds a thickly populated city with critical loads such as hospitals, malls, entertainment facilities and domestic users in high-rise buildings, clusters, etc.

Fig. 4. Representation of the HVTS substation Chevayur.
The HVTS has been modeled as a sixteen-state Markov model as explained in Section 3, and this can be reduced to a five-level diagram. The level transitions depend on the failure rates. The failure rates and repair rates, taken from the yearly substation report of 2015 [34], are tabulated in Table 1. For the formation of the fuzzy index, a Gaussian fuzzy membership function is assigned to the failure rates of the parameters, as shown in Fig. 5. Three conditions, namely best, satisfactory/normal and worst, have been considered for each parameter.

Fig. 5. Gaussian membership function.
Failure and repair rates of HVTS
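The membership grade of the Gaussian function is given by
$$\mu(x) = \exp\left(-\frac{(x - C_i)^2}{2\sigma_i^2}\right)$$
where $x$ is the failure rate, $C_i$ is the centre and $\sigma_i$ is the spread of the membership function.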
The security membership grade (SMG) of the Gaussian fuzzy membership functions is calculated for the three conditions of best, satisfactory and worst cases by assigning the values of $C_i$ and $\sigma_i$ given in Table 2.
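As a hedged illustration of this step, the Python sketch below evaluates Gaussian membership grades of a failure rate for the three conditions. It assumes the standard Gaussian membership form with centre c and spread sigma; the function names and the centre and spread values are hypothetical and do not reproduce Table 2.

```python
from math import exp


def gaussian_mf(x, c, sigma):
    """Gaussian membership grade of x for a set centred at c with spread sigma."""
    return exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


# Hypothetical (centre, spread) pairs for the three conditions of one failure rate.
conditions = {
    "best": (0.0, 0.2),
    "satisfactory": (0.5, 0.2),
    "worst": (1.0, 0.2),
}


def condition_grades(failure_rate):
    """Membership grade of a failure rate in each of the three conditions."""
    return {name: gaussian_mf(failure_rate, c, s)
            for name, (c, s) in conditions.items()}


g = condition_grades(0.1)
dominant = max(g, key=g.get)  # condition with the highest membership grade
```

A failure rate close to a centre receives a grade near 1 for that condition, and the grade falls off smoothly as the rate moves away from the centre.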
In the system output, the five levels taken in the Markov model are assigned Gaussian fuzzy membership functions, as shown in Fig. 6, for the secure, alert, insecure, emergency, and blackout levels. The output level is obtained by de-fuzzification.

Fig. 6. Output security levels.
For various failure rates within the range [0, 1], the SMG and UFSI values are tabulated in Table 3. The failure rates are in columns 1, 3, 5 and 7 respectively. The security level is obtained from the de-fuzzification. $\mu_{\lambda_V}$, $\mu_{\lambda_F}$, $\mu_{\lambda_H}$ and $\mu_{\lambda_T}$, the SMGs of the voltage, frequency, heavy electrical equipment and thermal failure rates, are tabulated in columns 2, 4, 6 and 8.
SMG of the Gaussian fuzzy membership functions
It is observed that the value of the UFSI decreases from the secure level to the blackout level. However, security prediction from the UFSI alone is not very reliable, since the index value does not vary in a regular fashion. Hence the prediction can be improved by classifier techniques, with a proper selection of the training vector as the input feature.
For predicting the security of the HVTS, a new training vector of 44 sets of values of the failure rates, the corresponding UFSI, and the security level from the de-fuzzifier is selected as the feature input to the k-NN classifier and is shown as the training vector in Table 4. With this training vector, the security level can be predicted for any failure rates and UFSI. Here the predictions are made for 50 sets of failure cases, and the results are shown in the confusion matrix in Table 5.
Training vector
Confusion matrix
From the confusion matrix, the values of TP, TN, FP, and FN for all levels are calculated and tabulated in Table 6. For checking the performance of the classifier, Equations (14), (15), (16) and (17) are used, and the performance measures are tabulated in Table 7. The accuracy for each individual level is found to be above 90%. The proposed k-NN classifier is therefore suitable for security level prediction of the power system. The overall accuracy obtained by Equation (17) is 89.88%.
True and false prediction values from confusion matrix
Performance measures of k-NN classifier
A new security assessment method is proposed for the high voltage transmission system (HVTS). The system around a high voltage substation has been modeled using a 16-state Markov model. Following the usual approach, five security levels have been selected for the system. On this basis, an improved universal fuzzy security index (UFSI) has been developed by considering the failure rates in the system. Security level prediction was performed by employing a k-NN classifier with the failure rates and the developed UFSI as feature inputs. The predicted security levels were compared with the actual levels and show very good agreement. Historical data obtained from a substation have been used for validating the proposed method. The 16-state Markov model and the k-NN classifier technique can be extended to a wider network for operations planning.
Acknowledgments
The authors acknowledge the support rendered by the Kerala State Electricity Board and its officers in providing the relevant field data for the background studies of the work reported in the paper.
