Research on user phase identification algorithm based on improved cloud model and adaptive segmented voltage

Abstract

To address inaccurate user phase identification, this paper proposes a new algorithm based on an improved cloud model and an adaptive segmented voltage algorithm. First, the algorithm uses the improved cloud model to quickly calculate the digital features of the station area and user voltage sequences. Second, the adaptive segmentation voltage algorithm automatically divides each full voltage sequence into three parts so that local features contribute to phase identification. Finally, the cosine similarity between the segmented voltage cloud models is calculated to identify each user’s phase. Analysis based on station data and field verification shows that, compared with the traditional user phase identification algorithm, the new algorithm reduces calculation time by 41% and enlarges the difference in identification results between phases by a factor of about 1000. The final accuracy of the new algorithm is 95%. These results on actual engineering data demonstrate the feasibility and effectiveness of the new user phase identification algorithm.

Keywords

Phase identification; adaptive segmentation voltage; improved cloud model; cosine similarity

1 Introduction

In power grid enterprise management, each low-voltage station area powered by a distribution transformer is the smallest management unit, and the efficiency of the transformers directly affects the benefits of the enterprise. It is well known that three-phase load unbalance causes high line loss and reduces the efficiency of distribution transformers. Because daily life involves many single-phase loads, the three-phase load unbalance rate tends to be high. In daily operation, the power collection system collects basic information such as the voltage, current and electricity consumption of stations and users, but the phase to which each user belongs is missing, and this missing information hampers calculation of the three-phase unbalance rate. To save energy, a fast and effective method for identifying the phase of users is urgently needed.

At present, the commonly used methods of user phase identification fall into two types [1]. The first is on-site identification, which requires a large number of staff to identify users’ phases one by one. This method is highly accurate but labor-intensive and disruptive to users’ daily lives, so it is infeasible in practice. The second is algorithmic identification, which is based on users’ real power and voltage data.

The algorithms commonly used in phase identification mainly include machine learning, the Pearson correlation coefficient and cluster analysis. In [2, 3], a machine learning method based on data mining, which leverages power consumption data collected through the advanced metering infrastructure (AMI), is proposed for accurate and efficient phase identification. In [4], an integrated method that uses spectral clustering is formulated to solve the user phase identification problem. In [5], the k-means clustering algorithm is used to identify the users’ phase. In [6], Fourier series compression is combined with a clustering algorithm to identify the phase of users. In [7], a consumer phase identification algorithm based on correlation characteristics is proposed to improve accuracy with incomplete data. In [8], phase identification accuracy is improved by averaging across loads connected to the same transformer. In [9], data features are extracted through wavelet analysis.

In our research on user phase identification, we found that the load carried by the three-phase power supply of the station area fluctuates randomly and that there is a latent relationship between the three-phase voltage and users’ voltage. Although this relationship is fuzzy, it is real. Prior studies paid little attention to the fuzziness between users’ voltage and the station’s three-phase voltage, yet this fuzziness plays an important role in user phase identification. To address this, a new algorithm based on an improved cloud model [10, 11] and adaptive segmentation voltage is proposed to identify users’ phase. First, the digital characteristics of the cloud model quantitatively represent both the randomness of users’ electricity consumption and the fuzzy relationship between the three-phase voltage in the station area and the users’ voltage; the improved cloud model is built on the traditional cloud model to reduce calculation time. Second, inspired by prior research, the collected voltage sequence is adaptively divided into three segments by the adaptive segmentation voltage algorithm to accentuate the fluctuation characteristics of each segment, and the digital cloud characteristics of each segmented voltage sequence are quickly calculated by the improved cloud model. Third, the cloud features of the full voltage sequences are calculated by the improved cloud model. Finally, based on the segmented voltage improved cloud models, the cosine similarity between each user’s full-domain voltage sequence and the station’s three-phase voltage sequences is calculated, and the user’s phase is identified as the phase with the largest cosine similarity.

2 Cloud model and its improvement

2.1 Definition of cloud model

The cloud model is a mathematical model that enables conversion between quantitative and qualitative concepts. It consists of a large number of cloud drops, each representing a data point, and its overall shape reflects the qualitative concept underlying the quantitative values. The cloud model is defined as follows:

Let C be a qualitative concept on the quantitative domain U. If x ∈ U is a random realization of C, the certainty degree of x for C is a random number with stable distribution: μ (x) : U → [0, 1] , ∀ x ∈ U. The distribution of x on the domain U is called the cloud model. The quantitative value x reflects the randomness of the concept, and μ (x) reflects the degree to which x belongs to the qualitative concept.

The fluctuation characteristics of different voltage sequences can be expressed by the cloud model, and different voltage sequences have different cloud model shapes. The more similar the shapes of two cloud models are, the more similar the fluctuation characteristics of the corresponding voltage sequences are, and the more likely the two voltage sequences belong to the same phase of the power supply.

2.2 Cloud generator

The cloud model often uses three numerical features such as expectation E_x, entropy E_n and super entropy H_e to describe the qualitative concept of data. Let the voltage sampling sequence be U = [u₁, u₂, u₃, . . . , u_n], and the cloud model of the voltage sequence be C_U = [E_x, E_n, H_e]. When E_x = 235.38, E_n = 1.09, H_e = 0.1 the cloud model is shown in Fig. 1.

Fig. 1

Voltage cloud model C_U = [235.38,1.09,0.1].

The expectation E_x in Fig. 1 is the mathematical expectation of the distribution in the voltage domain space, i.e., the average value of the voltage sequence. The entropy E_n represents the span of the voltage cloud model and reflects the dispersion of the voltage cloud droplet distribution. Analogous to the “3σ rule”, the voltage cloud model has a “3E_n rule”: 99.74% of the voltage droplets fall within [E_x - 3E_n, E_x + 3E_n], and droplets outside this interval are regarded as small-probability events. The super entropy H_e is the entropy of the entropy E_n; it represents the thickness of the voltage cloud model and reflects the degree of deviation of the voltage cloud droplet distribution. The thicker the voltage cloud model, the more unstable the voltage cloud droplet distribution.

Cloud models are closely related to cloud generators. The process of generating a cloud model C_U = [E_x, E_n, H_e] from a voltage sampling sequence is called Backward Cloud Transformation (BCT), while the process of generating more cloud drops from a cloud model is called Forward Cloud Transformation (FCT). The traditional cloud generator algorithms are shown in Tables 1 and 2.

Table 1

The algorithm of forward cloud transformation

Input:	Cloud model C_U = [E_x, E_n, H_e] and number of cloud drops N
Output:	Cloud Drops (x (i) , y (i)) i = 1, . . . , N
Step1:	Calculate the generation entropy
Step2:	Generate cloud droplet horizontal coordinates
Step3:	Generate cloud droplet vertical coordinates
Step4:	Repeat Step1–Step3 until N cloud drops have been generated
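Table 1 names the three steps without giving their formulas. The following is a minimal sketch, assuming the normal cloud generator that is standard in the cloud model literature (the source does not spell these formulas out): draw E_n' from N(E_n, H_e²), draw x from N(E_x, E_n'²), and set y = exp(-(x - E_x)² / (2E_n'²)). The function name `forward_cloud` and the `seed` parameter are illustrative choices, not part of the paper.

```python
import math
import random


def forward_cloud(ex, en, he, n_drops, seed=None):
    """Generate n_drops cloud drops (x, y) from the cloud model [ex, en, he].

    Assumes the standard normal cloud generator; Table 1 does not give
    the formulas explicitly.
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n_drops):
        # Step1: generation entropy En' ~ N(En, He^2)
        en_i = rng.gauss(en, he)
        # Step2: horizontal coordinate x ~ N(Ex, En'^2)
        x = rng.gauss(ex, abs(en_i))
        # Step3: vertical coordinate (certainty degree) in [0, 1]
        if en_i == 0:
            y = 1.0  # degenerate case: x equals ex exactly
        else:
            y = math.exp(-((x - ex) ** 2) / (2 * en_i ** 2))
        drops.append((x, y))
    return drops


# Cloud model from Fig. 1: C_U = [235.38, 1.09, 0.1]
drops = forward_cloud(235.38, 1.09, 0.1, n_drops=2000, seed=1)
```

Plotting `drops` as a scatter of x against y would give a cloud shaped like the one in Fig. 1, with its thickness controlled by H_e.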

Table 2

The algorithm of backward cloud transformation

Input:	Voltage sampling sequence U = [u₁, u₂, u₃, . . . , u_n]
Output:	Voltage cloud model C_U = [E_x, E_n, H_e]
Step1:	$E_{x} = \frac{1}{n} \sum_{i = 1}^{n} u_{i}$
Step2:	$E_{n} = \sqrt{\frac{\pi}{2}} \cdot \frac{1}{n} \sum_{i = 1}^{n} | u_{i} - E_{x} |$
Step3:	$S = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} (u_{i} - E_{x})^{2}}$
Step4:	$H_{e} = \sqrt{S^{2} - E_{n}^{2}}$
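The steps of Table 2 can be sketched as follows. The function returns H_e² rather than H_e so that a negative value, for which the square root is not real, stays visible. The function name and the two-level test sequence are illustrative assumptions, not data from the paper.

```python
import math


def backward_cloud(u):
    """Traditional BCT (Table 2). Returns (Ex, En, He^2)."""
    n = len(u)
    ex = sum(u) / n                                                # Step1
    en = math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in u) / n  # Step2
    s2 = sum((x - ex) ** 2 for x in u) / (n - 1)                   # Step3 (S^2)
    he2 = s2 - en ** 2                                             # Step4, before the root
    return ex, en, he2


# Hypothetical 96-point sequence alternating between two voltage levels.
ex, en, he2 = backward_cloud([235.0, 236.0] * 48)
# he2 is negative here, so sqrt(he2) would not be a real number.
```

For this sequence S² is about 0.25 while E_n² is about 0.39, so the traditional algorithm cannot produce a real super entropy.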

2.3 Improved backward cloud algorithm

As Section 2.2 shows, the cloud model has two types of cloud generators: the forward cloud generator, which generates more cloud drops, and the backward cloud generator, which represents the quantitative voltage cloud drops by the cloud model C_U = [E_x, E_n, H_e]. The traditional backward cloud generator algorithm shown in Table 2 is very simple to calculate. In engineering applications, however, the daily fluctuation of the voltage sampling sequence is relatively small under normal operation, so the sample variance of the voltage data is small and $S^{2} - E_{n}^{2} < 0$ occurs easily. In that case the calculated super entropy H_e is an imaginary number, so the result is invalid. The traditional backward cloud algorithm is therefore often infeasible in practical applications.

To solve the problem that $S^{2} - E_{n}^{2} < 0$, the commonly used improvements are the uncertainty-free backward cloud algorithm (U-BCT) and Multiple Backward Cloud Transformation based on Sampling with Replacement (MBCT-SR). The uncertainty-free backward cloud algorithm increases the sample variance by removing, step by step, the sample points closest to the sample mean E_x until the square of the super entropy of the voltage cloud model is positive; however, removing data points from the original sample sequence may lose information. The MBCT-SR algorithm generates many new sample sequences by randomly sampling and grouping the original sample data, and then recalculates the super entropy from these sequences so that its square is positive. Its results are very stable, but the calculation process is complicated and the model takes too long to solve.

In this paper, an Improved Backward Cloud Transformation (IBCT) algorithm is proposed; its procedure is shown in Table 3.

Table 3
The process of IBCT

Input:	Voltage sampling sequence U = [u₁, u₂, u₃, . . . , u_n]
Output:	Improved voltage cloud model C_U = [E_x, E_n, H_e]
Step1:	$E_{x} = \frac{1}{n} \sum_{i = 1}^{n} u_{i}$
Step2:	$E_{n} = \sqrt{\frac{\pi}{2}} \cdot \frac{1}{n} \sum_{i = 1}^{n} | u_{i} - E_{x} |$
Step3:	$S = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} (u_{i} - E_{x})^{2}}$
Step4:	Judgment: if $S^{2} - E_{n}^{2} < 0$, go to Step5; otherwise go to Step6.
Step5:	Let U_max = max(U), update the voltage sampling sequence to U = [U, U_max], and return to Step3 to recalculate the sample sequence’s variance.
Step6:	$H_{e} = \sqrt{S^{2} - E_{n}^{2}}$
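A minimal sketch of Table 3 follows, under one literal reading of Step5: E_x and E_n stay as computed in Step1 and Step2, and only S is recalculated over the extended sequence. That reading is an assumption, as is the `max_iter` guard, which is added here because appending the maximum cannot lift S² above E_n² when (U_max - E_x)² is itself below E_n². The example sequence is hypothetical.

```python
import math


def improved_backward_cloud(u, max_iter=1000):
    """IBCT sketch (Table 3). Returns (Ex, En, He, number of appended points)."""
    n = len(u)
    ex = sum(u) / n                                                # Step1
    en = math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in u) / n  # Step2
    sq_sum = sum((x - ex) ** 2 for x in u)
    u_max = max(u)  # appending u_max never changes the maximum
    appended = 0
    while True:
        s2 = sq_sum / (n + appended - 1)                           # Step3 (S^2)
        if s2 - en ** 2 >= 0:                                      # Step4
            return ex, en, math.sqrt(s2 - en ** 2), appended       # Step6
        if appended >= max_iter:                                   # hypothetical guard
            raise ValueError("S^2 - En^2 stayed negative after max_iter appends")
        sq_sum += (u_max - ex) ** 2                                # Step5: U = [U, U_max]
        appended += 1


# Hypothetical sequence: two alternating levels plus one higher sample.
ex, en, he, appended = improved_backward_cloud([235.0, 236.0] * 10 + [237.0])
```

On this input the traditional BCT would give a negative S² - E_n², while the loop above reaches a real H_e after a small number of appended points.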

3 User phase identification algorithm

Traditional user phase identification algorithms often use Pearson correlation coefficients, clustering and other methods. Whichever traditional algorithm is used, in practical engineering applications the differences among the judgment results for the three phases are too small to identify users’ phase reliably, so misjudgment is very likely. To solve this problem, the paper proposes a new algorithm based on the adaptive segmentation voltage algorithm and the improved cloud model. The users’ phase is identified by calculating the cosine similarity between the cloud models of the adaptively segmented voltage sequences.

3.1 Adaptive segmentation algorithm

In the daily maintenance of power grid enterprises, the electricity collection system is used to collect the electrical energy information of users and station areas. The system currently samples once every 15 minutes, giving 96 data points per day. In this paper, the voltage sampling sequence therefore consists of 96 voltage samples per day. When the cloud model is calculated for the whole voltage sampling sequence, the result reflects only the overall digital characteristics of the sequence. Because the station area voltage fluctuates little during normal operation, the digital characteristics of the whole-sequence voltage cloud models differ little, the results are poorly differentiated, and the phase of users is easily misjudged.

As described in the previous sections, the entropy E_n of the cloud model represents the degree of disorder of the data. The larger the entropy, the wider the range of values accepted by the qualitative concept, and the more unstable and chaotic the data. The same holds in user phase identification: the smaller the entropy, the more distinct the fluctuation characteristics of the voltage sampling sequence. The adaptive segmentation algorithm therefore aims to reduce the entropy of the voltage sampling sequence, adaptively dividing it into an arbitrary number of segments according to the principle of maximum entropy reduction. The specific algorithm flow is shown in Table 4.

Table 4
The algorithm of adaptive segmentation in voltage

Input:	Voltage sampling sequence U_i = [u_i1, u_i2, . . . , u_i96], number of segments N
Output:	Segmented voltage U_i1, U_i2, . . . , U_iN
Step1:	Let U_ireg = U_i
Step2:	Calculate the cloud model C_{U_ireg} = [E_xi, E_ni, He_i]
Step3:	For each candidate split point j, divide the voltage sequence into two segments and calculate the corresponding cloud models, whose entropies are $E_{n1}^{'}$ and $E_{n2}^{'}$.
Step4:	Calculate $\Delta E(j) = E_{ni} - (E_{n1}^{'} + E_{n2}^{'})$, j = 1, 2, . . . , 95.
Step5:	Let ΔE_max = max(ΔE). Record the corresponding segmentation point, the resulting segments V₁, V₂ and their cloud models C_{V₁}, C_{V₂}.
Step6:	According to C_{V₁} and C_{V₂}, denote the part of V₁, V₂ with the smaller entropy as $V_{1}^{'}$ and the other part as $V_{2}^{'}$. Let $U_{i1} = V_{1}^{'}$, U_ireg = $V_{2}^{'}$, count = count + 1. When count = N go to Step7; otherwise return to Step2.
Step7:	Output the adaptive segmentation sequence {U_i1, U_i2, . . . , U_iN}
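The greedy procedure of Table 4 can be sketched as below. Several details are assumptions made for illustration rather than statements from the paper: the lower-entropy part of each split is emitted as a finished segment while the other part is split again, N - 1 splits are used to obtain N segments, and a hypothetical `min_len` parameter keeps every part at least two samples long so that its entropy is meaningful. The 96-point test sequence is synthetic.

```python
import math


def entropy(u):
    """Cloud model entropy En of a sequence (Step2 of Table 2)."""
    ex = sum(u) / len(u)
    return math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in u) / len(u)


def best_split(u, min_len=2):
    """Split index j that maximizes dE(j) = En - (En1' + En2'), or None."""
    en = entropy(u)
    best_j, best_delta = None, -math.inf
    for j in range(min_len, len(u) - min_len + 1):
        delta = en - (entropy(u[:j]) + entropy(u[j:]))
        if delta > best_delta:
            best_j, best_delta = j, delta
    return best_j


def adaptive_segments(u, n_segments, min_len=2):
    """Return sorted (start, end) index ranges covering the whole sequence."""
    segments = []
    start, rest = 0, list(u)
    while len(segments) < n_segments - 1:
        j = best_split(rest, min_len)
        if j is None:  # remaining part is too short to split further
            break
        left, right = rest[:j], rest[j:]
        end = start + len(rest)
        if entropy(left) <= entropy(right):
            segments.append((start, start + j))  # keep left, split right again
            start, rest = start + j, right
        else:
            segments.append((start + j, end))    # keep right, split left again
            rest = left
    segments.append((start, start + len(rest)))
    return sorted(segments)


# Synthetic day of 96 samples with three voltage levels.
u = [234.5, 234.7] * 16 + [236.5, 236.9] * 16 + [235.5, 236.1] * 16
segs = adaptive_segments(u, n_segments=3)
```

Because each finished segment is a prefix or suffix of the part being split, the returned ranges are contiguous and together cover all 96 sampling points.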

3.2 Adaptive segmented cloud model similarity calculation

With the adaptive segmentation algorithm in Section 3.1, the original voltage sequence can be divided into several segments. Because daily electricity consumption has three peaks, this paper divides the voltage sequence into three parts. The similarity between the cloud models of the original voltage sequences is obtained from the similarities between the segmented voltage cloud models. Common algorithms for sequence similarity calculation are the Pearson correlation coefficient algorithm, the cosine similarity algorithm of cloud model (LICM), the shape similarity of cloud model (PCM) algorithm and the expectation curve based cloud model similarity (ECM) algorithm. This paper uses the cosine similarity algorithm.

Suppose there are two voltage sequences U₁ and U₂ whose cloud models are C₁ = [E_x1, E_n1, He₁] and C₂ = [E_x2, E_n2, He₂]. The vectors formed by the digital features of the two cloud models are $\vec{U}_{1} = (E_{x1}, E_{n1}, He_{1})$ and $\vec{U}_{2} = (E_{x2}, E_{n2}, He_{2})$. The cosine of the angle between $\vec{U}_{1}$ and $\vec{U}_{2}$ is called the cosine similarity of the cloud models and is denoted Q: $Q(\vec{U}_{1}, \vec{U}_{2}) = \cos(\vec{U}_{1}, \vec{U}_{2}) = \frac{\vec{U}_{1} \cdot \vec{U}_{2}}{\|\vec{U}_{1}\| \, \|\vec{U}_{2}\|}$
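As a small illustration of the definition of Q, the sketch below computes the cosine similarity of two cloud model vectors, using the Phase A and User 1 full-domain cloud models reported in Table 8. The function name is an illustrative choice.

```python
import math


def cosine_similarity(c1, c2):
    """Cosine of the angle between two cloud model vectors (Ex, En, He)."""
    dot = sum(a * b for a, b in zip(c1, c2))
    norm1 = math.sqrt(sum(a * a for a in c1))
    norm2 = math.sqrt(sum(b * b for b in c2))
    return dot / (norm1 * norm2)


phase_a = (235.39, 1.09, 0.11)  # Table 8, Phase A
user_1 = (235.33, 1.10, 0.11)   # Table 8, User 1
q = cosine_similarity(phase_a, user_1)
```

Because E_x is two orders of magnitude larger than E_n and H_e, the two vectors are almost parallel and `q` is very close to 1, which is why full-domain cloud models alone separate the phases poorly.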

The two voltage sequences are divided by the adaptive segmentation algorithm into {U_1-1, U_1-2, U_1-3} and {U_2-1, U_2-2, U_2-3}, and the cloud models of the segmented voltage sequences are C_U1-1, C_U1-2, C_U1-3, C_U2-1, C_U2-2 and C_U2-3. Record the cosine similarity between C_U1-1 and C_U2-1 as $q_{1} = Q(\vec{U}_{1-1}, \vec{U}_{2-1})$. In the same order, we obtain $q_{2} = Q(\vec{U}_{1-1}, \vec{U}_{2-2})$, $q_{3} = Q(\vec{U}_{1-1}, \vec{U}_{2-3})$, $q_{4} = Q(\vec{U}_{1-2}, \vec{U}_{2-1})$, $q_{5} = Q(\vec{U}_{1-2}, \vec{U}_{2-2})$, $q_{6} = Q(\vec{U}_{1-2}, \vec{U}_{2-3})$, $q_{7} = Q(\vec{U}_{1-3}, \vec{U}_{2-1})$, $q_{8} = Q(\vec{U}_{1-3}, \vec{U}_{2-2})$ and $q_{9} = Q(\vec{U}_{1-3}, \vec{U}_{2-3})$. Based on the cosine similarities of the segmented voltage sequences, the cosine similarity between the cloud models of the original voltage sequences is: $Q(\vec{U}_{1}, \vec{U}_{2}) = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \alpha_{i} q_{i}}$

where α_i represents the overlap of sampling periods between the two segmented cloud models being compared: $\alpha_{i} = \frac{\mathrm{num}(t_{U_{i-j}} \cap t_{U_{n-m}})}{\max(\mathrm{num}(t_{U_{i-j}}), \mathrm{num}(t_{U_{n-m}}))}$

and t_Ui-j denotes the set of sampling points of the voltage sequence U_i-j.
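The weighted combination can be sketched as follows. The source does not define n in the formula for Q; this sketch assumes n is the number of segments per sequence (3), because that choice makes the similarity of a sequence with itself equal to 1, as on the diagonal of Table 11. That normalization, the function names, and the sampling index ranges attached to the Table 10 cloud models are all assumptions for illustration.

```python
import math


def cosine_similarity(c1, c2):
    dot = sum(a * b for a, b in zip(c1, c2))
    return dot / (math.sqrt(sum(a * a for a in c1)) * math.sqrt(sum(b * b for b in c2)))


def overlap(t1, t2):
    """alpha: shared sampling points divided by the larger segment size."""
    return len(t1 & t2) / max(len(t1), len(t2))


def segmented_similarity(segs1, segs2, n=None):
    """segs: list of (set of sampling indices, (Ex, En, He)) per segment."""
    if n is None:
        n = len(segs1)  # ASSUMPTION: n = number of segments per sequence
    total = sum(
        overlap(t1, t2) * cosine_similarity(c1, c2)
        for t1, c1 in segs1
        for t2, c2 in segs2
    )
    return math.sqrt(total / n)


# Cloud models from Table 10; the index ranges are hypothetical.
phase_a = [
    (set(range(0, 32)), (234.63, 0.52, 0.05)),
    (set(range(32, 64)), (236.69, 0.17, 0.02)),
    (set(range(64, 96)), (235.99, 0.84, 0.08)),
]
phase_b = [
    (set(range(0, 40)), (235.02, 0.54, 0.05)),
    (set(range(40, 64)), (237.09, 0.17, 0.02)),
    (set(range(64, 96)), (236.43, 0.71, 0.07)),
]
q_self = segmented_similarity(phase_a, phase_a)
q_cross = segmented_similarity(phase_a, phase_b)
```

In this sketch the overlap weights, rather than the cosine terms, account for most of the separation between sequences: segments that were cut at different sampling points contribute less to the sum.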

3.3 Phase identification algorithm

In summary, this paper identifies the users’ phase with the adaptive segmentation voltage algorithm and cosine similarity. The specific steps of the identification algorithm are shown in Table 5.

Table 5
The algorithm of users’ phase identification

Input:	Each voltage sampling sequence
Output:	The result of the users’ phase judgment
Step1:	Collect three-phase voltage sequences and customers’ voltage sequences by using the electricity collection system.
Step2:	Segment each voltage sampling sequence into three parts according to the adaptive segmentation algorithm.
Step3:	Calculate the segmented voltage cloud models with the improved cloud model. Calculate the cosine similarity between the users’ voltage sequences and the three-phase voltage sequences.
Step4:	Identify the users’ voltage phase according to the cosine similarity calculation result.
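Step4 amounts to choosing the phase with the largest similarity. A minimal sketch, using the user rows of Table 11 as input; the function name and the dictionary layout are illustrative assumptions.

```python
def identify_phase(similarities):
    """Return the phase label with the largest cosine similarity.

    On an exact tie, max() returns the first label in insertion order.
    """
    return max(similarities, key=similarities.get)


# Segmented-sequence cosine similarities from Table 11.
table_11 = {
    "User 1": {"A": 0.7975, "B": 0.8917, "C": 0.9687},
    "User 2": {"A": 0.8991, "B": 0.9987, "C": 0.7791},
    "User 3": {"A": 0.9997, "B": 0.8993, "C": 0.7742},
}
results = {user: identify_phase(sims) for user, sims in table_11.items()}
```

The resulting labels match the judgment column of Table 11.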

4 Data validation

In this paper, we use the electricity collection system to collect one day of sampled voltage values in Z0*** station area, and verify the validity of each proposed algorithm on this dataset.

4.1 Verification of the effect of improved voltage cloud model solution

The calculation steps of the Improved Backward Cloud Transformation (IBCT) are similar to those of the uncertainty-free backward cloud algorithm, but the IBCT algorithm better preserves the information in the original data. Based on the collected voltage sampling sequences, the numerical characteristics of the cloud model and the calculation time under the different algorithms are calculated in MATLAB, and the comparison results are shown in Tables 6 and 7.

Table 6
The comparison of He² between different methods

He²\Methods	U-BCT	MBCT-SR	IBCT	BCT
User 1	0.0437	0.1069	0.0289	–0.0625
User 2	0.02	0.1148	0.0335	–0.0432
User 3	0.0069	0.0941	0.0113	–0.0694
User 4	0.0312	0.0844	0.0188	–0.0735
User 5	0.0451	0.0911	0.0096	–0.0721

Table 7

The comparison of calculation time between different methods

time\Methods	U-BCT	MBCT-SR	IBCT	BCT
User 1	1.701	9.528	1.692	2.879
User 2	2.301	10.203	2.361	4.101
User 3	2.088	10.014	2.401	4.291
User 4	2.273	9.953	2.099	3.471
User 5	1.865	9.902	1.988	3.394

Table 6 shows that the square of the super entropy under the traditional BCT algorithm is negative, so the calculated super entropy is meaningless. The uncertainty-free BCT (U-BCT), MBCT-SR and IBCT algorithms all resolve the problem of $S^{2} - E_{n}^{2} < 0$.

Table 7 shows that the IBCT algorithm takes much less time than the MBCT-SR algorithm and the BCT algorithm. Compared with the BCT algorithm, the IBCT algorithm takes nearly 41% less time, which corresponds to a speed-up of about 1.7 times and lays the foundation for fast calculation of the cloud model.
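The time saving can be checked directly from Table 7. The sketch below sums the BCT and IBCT columns over the five users; aggregating by column totals is an assumption, since the paper does not state how its 41% figure was derived.

```python
# Calculation times from Table 7 (Users 1 to 5).
bct_times = [2.879, 4.101, 4.291, 3.471, 3.394]
ibct_times = [1.692, 2.361, 2.401, 2.099, 1.988]

bct_total = sum(bct_times)
ibct_total = sum(ibct_times)

# Fraction of BCT time saved by IBCT, and the equivalent speed-up factor.
reduction = (bct_total - ibct_total) / bct_total
speedup = bct_total / ibct_total
```

The reduction comes out slightly above 0.41 and the speed-up slightly above 1.7, in line with the figure quoted from Table 7.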

4.2 Validation of the adaptive segmentation algorithm

For ease of display, the three-phase voltage of the station area and the voltage data of some users are selected as typical voltage sequences. The full-domain voltage fluctuation of the typical sequences is shown in Fig. 2.

Fig. 2

The whole voltage waveform of typical voltage sequences.

The fluctuation characteristics of the users’ voltage and the three-phase voltage in Fig. 2 are relatively similar and hard to distinguish.

The full domain cloud model parameters for typical voltage sequences are calculated by the improved cloud model as shown in Table 8.

Table 8

The results of typical voltage sequences processed by IBCT

Typical voltage sequences	Cloud model C_U = [E_x, E_n, He]
Phase A	[235.39,1.09,0.11]
Phase B	[235.77,1.06,0.11]
Phase C	[235.29,1.05,0.11]
User 1	[235.33,1.10,0.11]
User 2	[235.65,1.02,0.10]
User 3	[235.30,1.06,0.10]

The full domain voltage cloud model is shown in Fig. 3.

Fig. 3

The full domain voltage cloud model diagram.

The difference between the users’ voltage cloud models and the three-phase voltage cloud models in Fig. 3 is too small to tell them apart. The cosine similarity results and the resulting phase judgments are shown in Table 9.

Table 9

The results of phase identification and cosine similarity in whole sequence

Voltage sequences	A	B	C	Judgment results
A	1	0.9999	0.9998	A
B	0.9999	1	1	C
C	0.9998	1	1	C
User 1	1	0.9999	0.9997	A
User 2	0.9994	0.9998	0.9999	C
User 3	0.9999	1	1	C

Table 9 shows that, when phase identification is based on the full-domain voltage sequence cloud models, the cosine similarities between each voltage sequence and the three phase voltages are very close to one another, so misjudgment is very likely. In Table 9, for example, the Phase B voltage sequence is misjudged as Phase C.

To reduce the misjudgment rate, the adaptive segmentation voltage algorithm is introduced. The local waveforms of the three voltage segments obtained by the adaptive segmentation algorithm are shown in Fig. 4.

Fig. 4

The wave comparison in three parts of segmented voltage.

Figure 4 is composed of three sub-plots. Figure (a) compares the first local waveform segment of the three-phase power supply with that of the users’ voltage. The fluctuation characteristics of the users’ voltage sequences and the three-phase voltage sequences are relatively close, so it is difficult to tell which phase each user belongs to. Figure (b) compares the second local waveform segment. Here user1 is closer to the Phase C voltage sequence, user2 to Phase B, and user3 to Phase A. Figure (c) compares the third local waveform segment, and its results match those of Figure (b): the fluctuation characteristics of user1, user2 and user3 are closer to those of the Phase C, Phase B and Phase A voltage sequences, respectively. From this rough analysis, user1 belongs to Phase C, user2 to Phase B, and user3 to Phase A.

In order to conduct a more detailed and in-depth study on user phase identification, the improved voltage cloud model is used to calculate the voltage segmentation cloud model of station area and users, and the calculation results are shown in Table 10.

Table 10

The results of segmented voltage sequences processed by IBCT

Sequence	Cloud model I	Cloud model II	Cloud model III
A	[234.63,0.52,0.05]	[236.69,0.17,0.02]	[235.99,0.84,0.08]
B	[235.02,0.54,0.05]	[237.09,0.17,0.02]	[236.43,0.71,0.07]
C	[234.59,0.49,0.05]	[236.42,0.73,0.07]	[235.56,0.76,0.08]
User1	[234.64,0.59,0.06]	[236.62,0.76,0.08]	[235.61,0.81,0.08]
User2	[234.97,0.54,0.05]	[236.99,0.26,0.03]	[236.23,0.74,0.07]
User3	[234.60,0.47,0.05]	[236.60,0.16,0.02]	[235.84,0.86,0.09]

According to the cloud model comparison between users and station three-phase power segmentation voltage in Table 10, the comparison diagrams between cloud models are shown in Fig. 5.

Fig. 5

The cloud model comparison between three parts of segmented voltage.

The segmented voltage cloud models in Fig. 5 show the following. In segment I, User1’s cloud model is close to both the Phase A and Phase C voltage cloud models. In segments II and III, however, User1’s cloud model is close only to the Phase C voltage cloud model, with a more obvious gap to the other two phases. User3 behaves similarly: in segment I its cloud model is close to both the Phase A and Phase C voltage cloud models with little distinction, but in segments II and III it is close only to the Phase A voltage cloud model, with a more obvious gap to the remaining two phases. User2’s cloud model is closer to the Phase B voltage cloud model in every segment. The segmented cloud model comparison therefore indicates that user1 belongs to Phase C, user2 to Phase B, and user3 to Phase A.

In order to display the user phase identification results digitally, the cosine similarity is used to calculate each segment voltage cloud model, and the calculation and phase identification results are shown in Table 11.

Table 11

The results of phase identification and cosine similarity in segmented sequences

Voltage sequences	A	B	C	Judgment results
A	1	0.9006	0.7750	A
B	0.8983	1	0.7750	B
C	0.8821	0.8795	1	C
User 1	0.7975	0.8917	0.9687	C
User 2	0.8991	0.9987	0.7791	B
User 3	0.9997	0.8993	0.7742	A

The results in Table 11 show that the cosine similarity of the voltage cloud models under adaptive segmentation is more discriminative: the difference in cosine similarity between phases is more obvious, so the possibility of misjudgment is smaller. The cosine similarity is 0.9687 between user1 and Phase C, 0.9987 between user2 and Phase B, and 0.9997 between user3 and Phase A. In each case the similarity with the matching phase is clearly larger than the similarity with the other two phases. In Table 9, the cosine similarity difference between A-A and A-B based on full-domain analysis is 0.0001, whereas in Table 11 the same difference based on segmented sequence analysis is 1 - 0.9006 = 0.0994, nearly 0.1. The difference in similarity between phases is thus enlarged by a factor of about 1000. The resulting phase identification is that user1 belongs to Phase C, user2 to Phase B, and user3 to Phase A. The judgment from the cloud diagrams agrees with that from the cosine similarity, so the numerical and graphical analyses support each other.

4.3 Verification of actual results in the field

To verify the accuracy and feasibility of the proposed user phase identification algorithm, the researchers verified the phases of 60 low-voltage users in Z0*** station area on site using a multifunctional low-voltage station area identifier. This field verification instrument can verify the relationship between users and transformers and also identify the phase of users. The verification device is shown in Fig. 6.

Fig. 6

Multifunctional low-voltage station area identifier.

Fig. 7

The results of station area identifier in scene.

The multifunctional low-voltage station area identifier is composed of two parts: the mother machine and the daughter machine. During the field test, the mother machine is clamped on the three-phase voltage of the low-voltage station table, and the daughter machine is clamped between the neutral line and the live line of each user. Through data communication and phase separation calculation between the two machines, the daughter machine determines which phase the user belongs to.

Sixty users in Z0*** station area were randomly selected for on-site phase verification, and the results of the new algorithm are compared with the on-site results in Table 12.

Table 12

Comparison between the new algorithm's results and the field verification results

User ID	(Algorithm, Field)	User ID	(Algorithm, Field)
***988	(C,C)	***183	(A,A)
***991	(C,C)	***196	(A,A)
***008	(C,C)	***200	(A,A)
***011	(C,C)	***213	(A,A)
***024	(C,C)	***226	(A,A)
***986	(A,A)	***239	(A,A)
***994	(A,A)	***268	(C,C)
***108	(A,A)	***301	(C,C)
***031	(B,B)	***314	(C,C)
***054	(B,B)	***327	(C,C)
***037	(B,B)	***460	(C,C)
***040	(C,C)	***473	(C,C)
***053	(C,C)	***549	(B,B)
***066	(C,C)	***552	(B,B)
***398	(B,B)	***565	(B,B)
***402	(B,B)	***578	(B,B)
***415	(B,B)	***581	(B,B)
***428	(A,B)	***594	(B,B)
***431	(A,A)	***154	(C,C)
***513	(A,A)	***167	(C,C)
***526	(A,A)	***608	(C,C)
***539	(A,A)	***611	(C,C)
***542	(A,A)	***624	(C,C)
***555	(A,A)	***738	(C,C)
***568	(B,B)	***709	(A,C)
***614	(B,B)	***438	(B,B)
***627	(B,B)	***441	(B,B)
***630	(B,B)	***454	(B,B)
***643	(B,B)	***467	(B,B)
***656	(B,B)	***483	(B,B)
***685	(C,C)	***496	(C,B)
***672	(C,C)	***500	(C,C)

Table 12 shows that the phases identified by the algorithm for the randomly selected users agree with the on-site verification results in all but three cases (users ***428, ***709 and ***496), which verifies the feasibility and effectiveness of the algorithm proposed in this paper. In the final result, the accuracy of the new algorithm is 95%.
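The accuracy figure is simply the share of users for which the algorithm and the on-site instrument report the same phase. The sketch below shows that bookkeeping in Python. The function name `compare` is hypothetical, and the `sample` list holds only a few rows copied from Table 12 for illustration, so its accuracy is not the paper's result.

```python
def compare(pairs):
    """Compare algorithm output with field verification.

    pairs: list of (user_id, algorithm_phase, field_phase) tuples.
    Returns (accuracy, list of user IDs where the two phases differ).
    """
    if not pairs:
        raise ValueError("at least one user is required")
    wrong = [uid for uid, algo, field in pairs if algo != field]
    correct = len(pairs) - len(wrong)
    return correct / len(pairs), wrong


# A handful of rows taken from Table 12 (not the full table).
sample = [
    ("***988", "C", "C"),
    ("***183", "A", "A"),
    ("***428", "A", "B"),
    ("***709", "A", "C"),
    ("***496", "C", "B"),
]
accuracy, mismatched = compare(sample)
print(accuracy, mismatched)

# If three mismatches are counted against the 60 users stated in the text,
# the share of agreeing users is 57 / 60.
print(57 / 60)
```

Returning the mismatched user IDs alongside the ratio makes it easy to see which users would need a second field check.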

5 Summary

To solve the problem of inaccurate judgment of users' phase in the daily management of electric power enterprises, this paper proposes a new user phase identification algorithm based on an improved cloud model and an adaptive segmentation voltage algorithm. Compared with traditional user phase identification algorithms, the advantages of the proposed algorithm are as follows:

(1) This paper improves the traditional cloud model. Data comparison shows that the new model not only resolves the meaningless super-entropy calculation while retaining the original data information, but also greatly reduces the computation time of the model, improving the solution efficiency of the user phase identification method by 41%.

(2) On the basis of the improved cloud model, this paper proposes an adaptive segmentation algorithm which divides the full domain sampling sequence of user and three-phase voltages into three segments. The algorithm introduces the local fluctuation characteristics into the user phase discrimination process, and lays the foundation for the new user phase identification algorithm.

(3) The user phase identification algorithm proposed in this paper integrates the two algorithms above. First, the adaptive segmentation algorithm divides the global voltage sampling sequences, which have similar fluctuation characteristics, into three local voltage sequences with more distinct fluctuation characteristics. Then the cloud digital characteristics of the three voltage segments are calculated with the improved cloud model. Finally, the user phase is identified by cosine similarity. Data verification shows that the proposed algorithm widens the cosine similarity gap between a user's voltage sequence and the different supply voltage sequences by about 1000 times, which effectively reduces the misjudgment rate and improves the differentiation and accuracy of user phase identification. In the final result, the accuracy of the new algorithm is 95%. Field verification confirms that the results of the new identification algorithm are feasible.
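The three steps summarized above (segment, compute cloud digital characteristics, compare by cosine similarity) can be sketched as follows. This is a rough illustration under several assumptions, not the method of this paper: the breakpoints `b1` and `b2` are supplied by the caller instead of being found adaptively, the digital characteristics (Ex, En, He) come from the classic backward cloud generator rather than the improved model, the three segments' features are concatenated into one vector before comparison, and all function names and voltage values are hypothetical.

```python
import math
from statistics import mean, variance


def cloud_features(x):
    """Classic backward cloud generator: expectation, entropy, super entropy."""
    ex = mean(x)
    en = math.sqrt(math.pi / 2) * mean([abs(v - ex) for v in x])
    # In the classic generator S^2 - En^2 can be negative, which leaves He
    # undefined. abs() is only a placeholder workaround for this sketch,
    # not the improvement proposed in the paper.
    he = math.sqrt(abs(variance(x) - en ** 2))
    return [ex, en, he]


def segmented_vector(x, b1, b2):
    """Split x at the given breakpoints and stack the features of each part."""
    feats = []
    for seg in (x[:b1], x[b1:b2], x[b2:]):
        feats.extend(cloud_features(seg))
    return feats


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def identify(user, phases, b1, b2):
    """Return the most similar phase and all similarity scores."""
    uvec = segmented_vector(user, b1, b2)
    sims = {name: cosine(uvec, segmented_vector(seq, b1, b2))
            for name, seq in phases.items()}
    return max(sims, key=sims.get), sims


# Made-up voltage samples (volts); the user follows Phase B with a 1 V drop.
phases = {
    "A": [230, 231, 229, 225, 224, 226, 232, 233, 231],
    "B": [228, 227, 229, 231, 232, 230, 224, 223, 225],
    "C": [226, 226, 227, 228, 228, 229, 229, 230, 230],
}
user = [v - 1 for v in phases["B"]]
best, sims = identify(user, phases, 3, 6)
print(best, sims)
```

Each segment needs at least two samples, because the sample variance is undefined for a single point.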

The reduced computation time of the improved model makes it suitable for future research. The idea of adaptive segmentation is general, but the algorithm in this paper only divides the full-domain voltage sequence into three parts according to the station area's load characteristics. It does not take into account users' habits or the influence of seasonal changes. In future research, the voltage sequence could be divided into more parts determined by different users' habits and seasonal effects, which may yield more interesting results.

Acknowledgments

This paper was supported by the National Social Science Fund of China (No. 20BTJ012), Social Science Foundation of Hebei Province of China (No. HB18GL008), Beijing Intelligent Logistics System Collaborative Innovation Center (No. BILSCIC-2019KF-15), and Philosophy and Social Science Key Cultivation Project of Hebei University (No. 2019HPY035).
