Multi-source heterogeneous data fusion model based on fuzzy mathematics

Abstract

As the sensing end of intelligent control systems, sensors collect a wide variety of data in place of human observers. As technology develops, the growing variety of sensors produces multiple data sources with unequal structures, and fusing these data becomes a problem in its own right. Drawing on concepts from fuzzy mathematics, this study constructs an intuitionistic fuzzy transformation method for data with multiple attributes, which characterizes the data through hesitancy degrees and ideal solutions under a Gaussian distribution. Simulations on a classical classification dataset show that the intuitionistic fuzzy transformation method can effectively distinguish the affiliation of the data points, and 800 simulation runs show that the classification accuracy of the algorithm reaches 89%. An analysis of the abnormal data finds that misclassification arises where the Gaussian distributions of an attribute lie too close to each other. The algorithm is further optimized for multi-dimensional data: an optimization operator based on the distance method of superior and inferior solutions is constructed, and several optimization paths are simulated. The results show that the proposed optimization scheme is significantly better than the existing fuzzy operators; over 800 runs the accuracy reaches 95.23%, which is 14.01 percentage points higher than that of a single attribute. This indicates that the intuitionistic fuzzy algorithm of this study is reasonable, can fuse the multi-attribute data of sensors for classification, and can provide the necessary basis for decision making.

Keywords

Fuzzy mathematics, multi-source heterogeneity, sensors, Gaussian distribution

1. Introduction

Multi-source heterogeneous data fusion models are used in a wide range of fields, and with the development of intelligent control and sensors their application has spread to the military, medical, and environmental monitoring industries. Among the fusion levels, decision-level fusion is the higher-level model; it is highly fault-tolerant and does not depend heavily on the sensors [1]. However, the characteristics of multi-source heterogeneous data make accurate quantitative processing computationally expensive, because unequal data structures and distributions and nonlinear relationships greatly complicate the computation. Fuzzy quantification based on fuzzy mathematics can effectively alleviate this problem and also reduces the load that the sensors place on the data fusion center [2]. The basic idea of using fuzzy sets for data fusion is to fuzzify the original data qualitatively and transform it, while preserving its attribute features, to obtain the affiliation of the data [3]. Since most fusion centers or decision centers make determinations or decisions on the basis of attributes, the model must remain simple from acquisition through to the final step [4]. This study constructs an intuitionistic fuzzy transformation method based on fuzzy sets for data obeying a Gaussian distribution, and optimizes the operators in the algorithm to correct their dependence on attributes. Finally, the model is tested on a classical classification dataset to evaluate its performance.

2. Related work

Fuzzy mathematics meets the practical need to describe fuzzy phenomena, and its ability to quantify fuzzy objects has ensured its wide application. Mao and Zhang [5] applied fuzzy mathematics in education: they created a fuzzy mathematics-based model as an evaluation tool for entrepreneurship education, and a case analysis of several universities showed the practical relevance of the model. Alexandru and Ciobanu [6] used fuzzy sets in a finitely supported mathematical framework, describing fuzzy sets over an infinite universe by introducing finite supports; in the process they identified algebraic properties of such fuzzy sets and introduced specific arithmetic and extension principles. Wang [7] used fuzzy mathematics to build a multi-level qualitative and quantitative evaluation system for a gas reservoir whose geological study had been difficult to advance, and with this system successfully predicted three favorable regions for coalbed methane systems and multi-layer commingled production. The breakthrough of that work lies in applying a fuzzy mathematical model to a field that had been difficult to tackle, making effective use of the purposefulness and practicality of the fuzzy concept. Liu [8] proposed a fuzzy mathematics-based method for controlling the volume of water pollution, which first sets a target for the total discharge volume and then evaluates the pollution in key areas. Introducing fuzzy mathematics into sewage discharge is a creative attempt, because the computational cost of precisely controlling urban pollution does not bring correspondingly better benefits, whereas this approach is easier to operate. Peng [9] proposed a new analysis method for the management model of the aviation industry in the context of big data. The method is based on fuzzy mathematics and an error correction model, and its output is used for estimation with a BP model. From the constructed database, the author concluded that the method has positive significance for the intelligent management of the aviation industry, and made some suggestions based on the prediction results.

Multi-source heterogeneous data fusion models can process data with various attributes from sensors. How such a model is constructed is closely tied to the usefulness of the data, and in the context of big data many industries are interested in the prospects of fusion models. Li et al. [10] noticed that most existing multi-source heterogeneous fusion models focus on modeling the network structure information and ignore the multivariate information of different nodes, and proposed a model to address this shortcoming. Zhou et al. [11] considered that traditional deep learning methods are weak at extracting features from multi-source heterogeneous data; to improve the accuracy of fault diagnosis they proposed an alternating optimization network model trained in an alternating manner. The results showed that the method can mine the deep common features of multi-source heterogeneous data and provides ideas for the fault detection of rolling bearings. Chai et al. [12] constructed a multi-source heterogeneous analysis method, and simulation experiments on specific futures data showed the practicality of their framework. Liu et al. [13] considered that mining information from urban multi-source heterogeneous data is a challenging task and constructed a knowledge mining network to solve problems related to urban flow; experiments on Chengdu order and POI datasets showed that the model is effective. Zhu et al. [14] constructed a systematic data fusion method for recommendation with heterogeneous features, which combines user interaction activities with tensor factors to achieve accurate course recommendation. Simulations on data generated by several participants showed that the method outperforms existing neural network and matrix models, but the data sources they studied are relatively stable, and the ability to fuse data of different dimensions has yet to be improved.

There are many ways to construct multi-source heterogeneous fusion models, their range of application keeps expanding, and scholars are actively advancing the research. The objects handled by a fusion model are unequal types of data from various sensors, which makes it difficult for a fusion model to satisfy both accuracy and efficiency. This requirement matches well the way fuzzy mathematics quantifies data, so introducing fuzzy theory into the fusion model of multi-source heterogeneous data is a meaningful attempt.

3. Fuzzy mathematics-based multi-source heterogeneous data fusion model building

3.1 Intuitionistic fuzzy integration algorithm construction based on multi-attribute data

The fuzzy set concept is used here for classification: it allows the attributes of the data transmitted by a sensor to be determined. Specifically, a defined affiliation (membership) function extends the relation between an individual and a sample set to a graded degree of affiliation, which yields an inexact description of the problem. The distribution of the sensors and their data is shown in Fig. 1.

Figure 1.

Cluster sensing structure.

The complete decision fusion process in the wireless sensor domain runs from node acquisition to the final decision proposal. Because the result of the local decision directly affects the final fusion, an accurate decision algorithm is needed that keeps the local decision process simple. The intuitionistic fuzzy set construction algorithm is by far the most commonly used method. It first fuzzifies the attribute values of the raw information collected by the sensors and maps them to affiliation values in $[{0,1}]$, which indicate the degree of affiliation to the classified attributes [15]. Sensor data sets usually consist of natural data, and most of the data points $x$ in such a data set obey a Gaussian distribution, so the affiliation function should also take a Gaussian form, as in Eq. (1).

$\displaystyle f({x,\sigma,\mu})=e^{-\frac{({x-\mu})^{2}}{2\sigma^{2}}}$ (1)
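As a minimal sketch of a Gaussian affiliation function of the kind described by Eq. (1), the following Python snippet assumes a negative exponent so that every value falls in $(0,1]$. The function name `gaussian_membership` and the class statistics `MU` and `SIGMA` are hypothetical and are not taken from the study.

```python
import math


def gaussian_membership(x: float, mu: float, sigma: float) -> float:
    """Gaussian affiliation degree of x for a class with mean mu and std sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # The negative exponent keeps the value in (0, 1], with 1 reached at x == mu.
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Hypothetical class statistics, chosen only for illustration.
MU, SIGMA = 5.0, 0.4
at_mean = gaussian_membership(5.0, MU, SIGMA)         # value at the class mean
one_sigma_away = gaussian_membership(5.4, MU, SIGMA)  # one standard deviation away
far_away = gaussian_membership(7.0, MU, SIGMA)        # five standard deviations away
```

The affiliation degree decays smoothly with the distance from the class mean, measured in units of the standard deviation.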

In Eq. (1), $\sigma$ is the standard deviation of the normal distribution and $\mu$ is its mean; the function maps all attribute values to $[{0,1}]$. If the intuitionistic fuzzy data set is $A=\{{u_{i},u_{A}({u_{i}}),v_{A}({u_{i}})}\}$, then the conversion operator $K_{a}$ from the intuitionistic fuzzy set to the fuzzy set is given in Eq. (2) below.

$\displaystyle K_{a}(A)=\left\{{\left\langle{u,\;u_{A}(u)+\partial\pi_{A}(u),\;v_{A}(u)+({1-\partial})\pi_{A}(u)}\right\rangle}\right\}$ (2)

In Eq. (2), $\partial\in[{0,1}]$, $\pi_{A}(u)$ is the hesitation degree, $u_{A}(u)$ is the affiliation degree, and $v_{A}(u)$ is the non-affiliation degree of the intuitionistic fuzzy set. Eq. (2) shows that $K_{a}(A)$ is a traditional fuzzy set in which the hesitation degree is divided into two parts; the division ratio $\partial$ is usually calculated according to Eq. (3).

$\displaystyle\partial=\frac{u_{A}(u)}{u_{A}(u)+v_{A}(u)}$ (3)

According to the ratio in Eq. (3), one part of the hesitancy degree is assigned to the affiliation degree and the other part to the non-affiliation degree, in proportion to the original share of the affiliation degree. To obtain the conversion from fuzzy sets to intuitionistic fuzzy sets, the construction in Eq. (4) is used.

$\displaystyle\left\{\begin{array}[]{l}\pi_{A}(u)=1-({v_{A}(u)+u_{A}(u)})\\ u=u_{A}(u)+\partial\pi_{A}(u)\\ \end{array}\right.$ (4)

From the construction in Eq. (4), the intuitionistic fuzzy affiliation degree can be derived as in Eq. (5): substituting Eq. (3) into the second line of Eq. (4) and using $u_{A}(u)+v_{A}(u)=1-\pi_{A}(u)$ gives $u=u_{A}(u)/({1-\pi_{A}(u)})$.

$\displaystyle u_{A}(u)=u({1-\pi_{A}(u)})$ (5)
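The relations in Eqs. (3) to (5) can be checked with a short round trip. The sketch below is only an illustration: the function names `split_hesitation` and `merge_hesitation` and the numeric values are assumptions. It splits a fuzzy membership $u$ into affiliation and non-affiliation degrees for a given hesitation $\pi_{A}$, then merges them back.

```python
from typing import Tuple


def split_hesitation(u: float, pi: float) -> Tuple[float, float]:
    """Fuzzy membership u and hesitation pi -> (u_A, v_A), following Eqs. (4)-(5)."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    if not 0.0 <= pi < 1.0:
        raise ValueError("pi must lie in [0, 1)")
    u_a = u * (1.0 - pi)   # Eq. (5)
    v_a = 1.0 - pi - u_a   # first line of Eq. (4), solved for v_A
    return u_a, v_a


def merge_hesitation(u_a: float, v_a: float) -> float:
    """(u_A, v_A) -> fuzzy membership u, following Eqs. (3)-(4)."""
    pi = 1.0 - (u_a + v_a)      # first line of Eq. (4)
    ratio = u_a / (u_a + v_a)   # division ratio of Eq. (3)
    return u_a + ratio * pi     # second line of Eq. (4)


# Hypothetical values: membership 0.8 with a hesitation degree of 0.25.
u_a, v_a = split_hesitation(0.8, 0.25)
recovered_u = merge_hesitation(u_a, v_a)
```

Merging the split values returns the original membership, which is the consistency that Eq. (5) expresses.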

According to Eq. (5), once the hesitation degree of the fuzzy set is determined, the affiliation and non-affiliation degrees can be calculated, so the selection of the hesitation degree is the central step of the whole fuzzy set construction. Existing methods for calculating the hesitation degree are based on intuitionistic fuzzy entropy and on proximity point information, but these cannot be used in a wireless sensing network environment. The study therefore defines an adaptive hesitation calculation method for wireless sensing networks, which again relies on the Gaussian distribution of the data received by the sensing nodes [16]. In a Gaussian distribution the variance is positively related to the dispersion of the data: the smaller the variance of a data set, the more concentrated its distribution and the more easily the affiliation of a data point with the set can be determined, and vice versa. The variance therefore bears directly on the selection of the hesitation degree of the intuitionistic fuzzy set. Based on this relationship, let $\theta_{j}$ be the maximum hesitation over the categories under attribute $j$; by the characteristics of the Gaussian distribution, this threshold can also be regarded as the hesitation of the sample set with the largest standard deviation under attribute $j$. The hesitation of the samples of category $i$ with respect to attribute $j$ can then be calculated according to Eq. (6).

$\displaystyle\theta_{ij}=\frac{\sigma_{ij}}{\max({\sigma_{ij}})}\cdot\theta_{j}$ (6)

In Eq. (6), $\max({\sigma_{ij}})$ is the maximum standard deviation under attribute $j$ and $\sigma_{ij}$ is the standard deviation of sample $i$ in attribute $j$, where $i=({1,2,\ldots,n})$ , $j=({1,2,\ldots,m})$ , and $n$ and $m$ are the total numbers of samples and attributes, respectively.
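A minimal sketch of the adaptive hesitation rule of Eq. (6) for a single attribute follows. The function name `hesitation_degrees`, the standard deviations, and the threshold value 0.3 are hypothetical and serve only to show the scaling.

```python
from typing import List


def hesitation_degrees(sigmas: List[float], theta_j: float) -> List[float]:
    """Scale the maximum hesitation theta_j by each relative standard deviation (Eq. 6)."""
    largest = max(sigmas)
    if largest <= 0:
        raise ValueError("at least one standard deviation must be positive")
    # The sample set with the largest spread receives the full threshold theta_j.
    return [sigma / largest * theta_j for sigma in sigmas]


# Hypothetical standard deviations of three sample sets under one attribute.
sigmas_for_attribute = [0.35, 0.52, 0.64]
theta = hesitation_degrees(sigmas_for_attribute, theta_j=0.3)
```

The more concentrated a sample set is, the smaller the hesitation degree it receives.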

After the calculation equations are established, the algorithm proceeds as follows. Each sample $u_{ij}$ and the sample attribute value $k$ are input first; the mean $\mu_{ij}$ and standard deviation $\sigma_{ij}$ of the samples under each attribute are calculated through training, and the sample distribution under each attribute is determined from the results. The hesitation matrix $\theta_{ij}$ is then used with Eqs (4) and (5) to calculate the affiliation $u_{ij}$ and non-affiliation $v_{ij}$ of each data point with respect to the attribute, and finally the intuitionistic fuzzy set is constructed according to the fuzzy set conversion equation.

3.2 Multi-dimensional based multi-source heterogeneous data fusion model construction

The intuitionistic fuzzy integration algorithm constructed from the attribute values of individual data detected by wireless sensors can only serve as a basis for judgment in a single dimension, whereas multi-source heterogeneous data often span different dimensions. A single integration algorithm therefore cannot serve as the final framework for a decision-level fusion model, and the results of multiple dimensions must be fused to ensure the accuracy of the decisions [17]. Wireless sensor data points generally show inter-category similarity, and the fusion approach of this study takes this into account to derive a method for determining the weight of each attribute. The multi-attribute classification problem is then solved with the distance method of superior and inferior solutions, which completes the data fusion of the sensor network nodes.

Intuitionistic Fuzzy Weighted Averaging (IFWA) and Intuitionistic Fuzzy Weighted Geometric (IFWG) are common fuzzy integration operators that can obtain more accurate multi-attribute results. Although the results of these two operators are also intuitionistic fuzzy numbers and have the usual functional properties, both weight by a single attribute weight, which makes the results too dependent on the attributes and ignores the importance of the intuitionistic fuzzy number itself [18]. Various improved operators have been derived from these two basic operators. Among them, the Intuitionistic Fuzzy Hybrid Averaging (IFHA) operator, whose weighting coefficients $w$ obey a normal distribution, assigns smaller weights to the data at both ends to compensate for the effect of anomalous data on the results; it is used as a processing tool in this study.
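For reference, the two basic operators can be sketched as follows. The snippet assumes the definitions commonly given in the intuitionistic fuzzy aggregation literature, which the study does not restate: for intuitionistic fuzzy numbers $(u_{j},v_{j})$ and weights $w_{j}$ summing to 1, IFWA returns $(1-\prod(1-u_{j})^{w_{j}},\prod v_{j}^{w_{j}})$ and IFWG returns $(\prod u_{j}^{w_{j}},1-\prod(1-v_{j})^{w_{j}})$. The function names and sample values are hypothetical.

```python
import math
from typing import List, Tuple

IFN = Tuple[float, float]  # (affiliation degree, non-affiliation degree)


def ifwa(values: List[IFN], weights: List[float]) -> IFN:
    """Intuitionistic fuzzy weighted averaging (commonly used definition)."""
    prod_non_mu, prod_nu = 1.0, 1.0
    for (mu, nu), w in zip(values, weights):
        prod_non_mu *= (1.0 - mu) ** w
        prod_nu *= nu ** w
    return 1.0 - prod_non_mu, prod_nu


def ifwg(values: List[IFN], weights: List[float]) -> IFN:
    """Intuitionistic fuzzy weighted geometric (commonly used definition)."""
    prod_mu, prod_non_nu = 1.0, 1.0
    for (mu, nu), w in zip(values, weights):
        prod_mu *= mu ** w
        prod_non_nu *= (1.0 - nu) ** w
    return prod_mu, 1.0 - prod_non_nu


# Hypothetical evaluations of one object under two equally weighted attributes.
values = [(0.6, 0.3), (0.8, 0.1)]
weights = [0.5, 0.5]
avg = ifwa(values, weights)
geo = ifwg(values, weights)
```

With the same inputs, the averaging operator gives an affiliation degree at least as large as the geometric one, so the two operators can rank borderline objects differently.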

The study uses the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) for the decision problem with multidimensional attributes. The principle of this algorithm is to judge superiority and inferiority by the distance between the evaluated object and the ideal solutions, which are divided into an optimal and an inferior target [19]. The decision matrix $A$ of a given problem has sample attribute values with different domains and is normalized according to Eq. (7).

$\displaystyle A^{\prime}=({f^{\prime}_{ij}}),\quad f^{\prime}_{ij}=\frac{f_{ij}}{\sqrt{\sum\limits_{i=1}^{n}{f_{ij}^{2}}}}$ (7)

In Eq. (7), $A^{\prime}$ is the normalized matrix and $f_{ij}$ is the evaluation of sample $i$ with respect to attribute $j$. The normalized decision matrix is then weighted by the attribute weights to obtain the matrix $R$, whose elements $r_{ij}$ are the products of the normalized evaluations $f^{\prime}_{ij}$ and the attribute weights $\omega_{j}$. The larger $r_{ij}$ is, the closer it lies to the positive ideal value, and vice versa. The positive and negative ideals are expressed in Eq. (8).

$\displaystyle\left\{\begin{array}{l}R^{+}=(r_{1}^{+},r_{2}^{+},\ldots,r_{m}^{+})=\{\max_{i}r_{ij}|j=1,2,\ldots,m\}\\ R^{-}=(r_{1}^{-},r_{2}^{-},\ldots,r_{m}^{-})=\{\min_{i}r_{ij}|j=1,2,\ldots,m\}\end{array}\right.$ (8)

The distance between each sample in the matrix and the ideals of Eq. (8) is calculated according to Eq. (9).

$\displaystyle\left\{\begin{array}{l}D^{+}=\sqrt{\sum\limits_{j=1}^{m}{({r_{ij}-r_{j}^{+}})^{2}}}\\ D^{-}=\sqrt{\sum\limits_{j=1}^{m}{({r_{ij}-r_{j}^{-}})^{2}}}\end{array}\right.$ (9)
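The standard TOPSIS steps of Eqs. (7) to (9) can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the implementation used in the study: all attributes are treated as benefit attributes (larger is better, as in Eq. (8)), and the decision matrix, the weights, and the function name `topsis_distances` are hypothetical.

```python
import math
from typing import List, Tuple


def topsis_distances(matrix: List[List[float]],
                     weights: List[float]) -> Tuple[List[float], List[float]]:
    """Return (D+, D-) for every row of a decision matrix, following Eqs. (7)-(9)."""
    m = len(weights)
    # Eq. (7): vector normalization of each attribute column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    if any(norm == 0 for norm in norms):
        raise ValueError("an attribute column must not be all zeros")
    # Weighted normalized matrix R with r_ij = w_j * f'_ij.
    r = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    # Eq. (8): positive and negative ideals, attribute by attribute.
    r_pos = [max(row[j] for row in r) for j in range(m)]
    r_neg = [min(row[j] for row in r) for j in range(m)]
    # Eq. (9): Euclidean distance of each row to both ideals.
    d_pos = [math.sqrt(sum((row[j] - r_pos[j]) ** 2 for j in range(m))) for row in r]
    d_neg = [math.sqrt(sum((row[j] - r_neg[j]) ** 2 for j in range(m))) for row in r]
    return d_pos, d_neg


# Hypothetical decision matrix: three objects evaluated under two attributes.
decision_matrix = [[3.0, 4.0], [4.0, 3.0], [0.0, 0.0]]
d_pos, d_neg = topsis_distances(decision_matrix, weights=[0.5, 0.5])
```

An object with a small distance to the positive ideal and a large distance to the negative ideal ranks higher; here the third object coincides with the negative ideal.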

In Eq. (9), $D^{+}$ and $D^{-}$ denote the distances from the positive and negative ideals, respectively. The distance between two solutions is then calculated according to the Euclidean distance equation, as shown in Eq. (10) below.

$\displaystyle D({r_{1},r_{2}})=\frac{1}{2n}\sum\limits_{j=1}^{n}\left[{({u_{r_{1}}({x_{j}})-u_{r_{2}}({x_{j}})})^{2}+({v_{r_{1}}({x_{j}})-v_{r_{2}}({x_{j}})})^{2}+({\pi_{r_{1}}({x_{j}})-\pi_{r_{2}}({x_{j}})})^{2}}\right]$ (10)
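A minimal sketch of the distance in Eq. (10) between two intuitionistic fuzzy sets is given below. It assumes that the middle term compares the two non-affiliation degrees and that the hesitation degree is obtained as $\pi=1-u-v$; the function name `if_distance` and the sample values are hypothetical.

```python
from typing import List, Tuple

IFN = Tuple[float, float]  # (affiliation degree, non-affiliation degree)


def if_distance(r1: List[IFN], r2: List[IFN]) -> float:
    """Normalized squared distance between two intuitionistic fuzzy sets (Eq. 10 form)."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("both sets need the same, non-zero number of elements")
    total = 0.0
    for (mu1, nu1), (mu2, nu2) in zip(r1, r2):
        pi1 = 1.0 - mu1 - nu1  # hesitation degree of the first element
        pi2 = 1.0 - mu2 - nu2  # hesitation degree of the second element
        total += (mu1 - mu2) ** 2 + (nu1 - nu2) ** 2 + (pi1 - pi2) ** 2
    return total / (2.0 * len(r1))


# Hypothetical sets: complete opposites, and a set compared with itself.
opposite = if_distance([(1.0, 0.0)], [(0.0, 1.0)])
identical = if_distance([(0.6, 0.3), (0.2, 0.5)], [(0.6, 0.3), (0.2, 0.5)])
```

With the factor $1/(2n)$ the distance lies in $[0,1]$: it is 0 for identical sets and 1 for a certain member compared with a certain non-member.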

The feasible solutions are ranked by similarity according to the result of Eq. (10), and the ranking is taken as the superiority or inferiority of the evaluated objects to provide a basis for decision making. The method hinges on two points. The first is the selection of the positive and negative ideals: the standard TOPSIS method can only reflect the internal situation of the sample set, so if the object is a single value, the solution determined in this way is not representative. The second is the determination of the attribute weights: traditional weights are mostly given by experts, whereas wireless sensors must be trained on historical information to obtain specific weights. The study therefore improves the method adaptively. The data monitored by the sensor node $S$, with $n$ categories and $k$ attributes, are converted to obtain the fuzzy matrix in Eq. (11).

$\displaystyle R_{s}=({r_{ij}^{s}})_{n\times k}=\left[\begin{array}{cccc}r_{11}^{s}&r_{12}^{s}&\ldots&r_{1k}^{s}\\ r_{21}^{s}&r_{22}^{s}&\ldots&r_{2k}^{s}\\ \ldots&\ldots&&\ldots\\ r_{n1}^{s}&r_{n2}^{s}&\ldots&r_{nk}^{s}\end{array}\right]$ (11)

In Eq. (11), $r_{ij}^{s}$ is the intuitionistic fuzzy degree of attribute $j$ under node $S$ with respect to category $i$. The attribute weights are defined adaptively for data fusion. Under this definition the sensor nodes carry different types of sensing modules and the data differ by category; after the intuitionistic fuzzy transformation the data still retain their original characteristics. The improved weight definition analyzes the fuzzy matrix and assigns weights based on inter-category information: the spacing between categories under the same attribute is the basis for weight assignment, and the weight grows with that distance. The matrix constructed above for the detection point $S$ is split and reorganized according to Eq. (12).

$\displaystyle({d_{ij}^{i_{1}}})_{m\times 1}=\left[\begin{array}{c}r_{ij}^{i_{1}1}\\ r_{ij}^{i_{1}2}\\ \ldots\\ r_{ij}^{i_{1}m}\end{array}\right]$ (12)

In Eq. (12), $i=({1,2,\ldots,n})$ and $j=({1,2,\ldots,k})$ . The distance matrix between the data can then be calculated with the Euclidean distance equation, as in Eq. (13).

$\displaystyle({D_{j}^{i_{1}}})_{n\times n}=\left[\begin{array}{cccc}1&D({d_{1j}^{i_{1}},d_{2j}^{i_{1}}})&\ldots&D({d_{1j}^{i_{1}},d_{nj}^{i_{1}}})\\ D({d_{2j}^{i_{1}},d_{1j}^{i_{1}}})&1&\ldots&D({d_{2j}^{i_{1}},d_{nj}^{i_{1}}})\\ \ldots&\ldots&&\ldots\\ D({d_{nj}^{i_{1}},d_{1j}^{i_{1}}})&D({d_{nj}^{i_{1}},d_{2j}^{i_{1}}})&\ldots&1\end{array}\right]$ (13)

The similarity matrix of the attributes is obtained according to the similarity equation based on the category distance matrix of Eq. (13), and then the results are normalized to obtain the weight vector of the attributes, which is calculated according to Eq. (14).

$\displaystyle\left\{\begin{array}{l}D_{j}=\frac{1}{n}\left({\sum{({d_{j}^{1}})}+\sum{({d_{j}^{2}})}+\ldots+\sum{({d_{j}^{n}})}}\right)\\ \omega=\left({\frac{D_{1}}{\sum\limits_{j}{D_{j}}},\frac{D_{2}}{\sum\limits_{j}{D_{j}}},\ldots,\frac{D_{k}}{\sum\limits_{j}{D_{j}}}}\right)\end{array}\right.$ (14)
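The weighting idea behind Eqs. (12) to (14) can be sketched as follows: attributes whose categories lie farther apart receive larger weights. This is a simplified illustration rather than the exact procedure of the study. It sums only the off-diagonal pairwise distances between categories, it uses the distance form of Eq. (10), and the input layout `profiles[j][i]` (the intuitionistic fuzzy vector of category `i` under attribute `j`) and all names and values are assumptions.

```python
from typing import List, Tuple

IFN = Tuple[float, float]  # (affiliation degree, non-affiliation degree)


def if_distance(a: List[IFN], b: List[IFN]) -> float:
    """Distance between two intuitionistic fuzzy vectors in the form of Eq. (10)."""
    total = 0.0
    for (mu1, nu1), (mu2, nu2) in zip(a, b):
        pi1, pi2 = 1.0 - mu1 - nu1, 1.0 - mu2 - nu2
        total += (mu1 - mu2) ** 2 + (nu1 - nu2) ** 2 + (pi1 - pi2) ** 2
    return total / (2.0 * len(a))


def attribute_weights(profiles: List[List[List[IFN]]]) -> List[float]:
    """Weight each attribute by its average inter-category distance, then normalize."""
    spreads = []
    for per_category in profiles:
        n = len(per_category)
        # Sum of pairwise distances between different categories under this attribute.
        total = sum(if_distance(per_category[a], per_category[b])
                    for a in range(n) for b in range(n) if a != b)
        spreads.append(total / n)
    norm = sum(spreads)
    if norm == 0:
        raise ValueError("categories are indistinguishable under every attribute")
    return [spread / norm for spread in spreads]


# Hypothetical data: two attributes, two categories, one training sample each.
profiles = [
    [[(0.9, 0.05)], [(0.1, 0.8)]],  # attribute 0 separates the categories well
    [[(0.5, 0.4)], [(0.6, 0.3)]],   # attribute 1 barely separates them
]
weights = attribute_weights(profiles)
```

In this example almost all of the weight goes to the attribute that separates the two categories, which matches the stated principle that the weight is positively correlated with the category distance.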

In Eq. (14), $j=({1,2,\ldots,k})$ , and the result is used as the basis for the weight assignment under the same attribute. Once the weight assignment scheme is determined, the selection of the positive and negative ideal solutions is optimized. The idea is to obtain the positive ideal solution with the intuitionistic fuzzy averaging operator, then calculate the distance between each sample in the training set and the positive ideal solution, and take the farthest point as the negative ideal solution. The positive ideal solution and the distance calculation are given in Eq. (15).

$\displaystyle\left\{\begin{array}{l}R_{i}^{+}=\frac{1}{m}({r_{i}^{1}\oplus r_{i}^{2}\oplus\ldots\oplus r_{i}^{m}})\\ D({R_{i}^{+},R_{i}^{-}})=\mathop{\max}\limits_{r_{i}^{s}}({d({r_{i}^{1},R_{i}^{+}}),d({r_{i}^{2},R_{i}^{+}}),\ldots,d({r_{i}^{m},R_{i}^{+}})})\end{array}\right.$ (15)
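The selection of the ideal solutions in Eq. (15) can be sketched as follows: average the training samples to obtain the positive ideal, then take the sample farthest from it as the negative ideal. The snippet assumes the usual intuitionistic fuzzy arithmetic, namely $a\oplus b=(u_{a}+u_{b}-u_{a}u_{b},\,v_{a}v_{b})$ and $\lambda a=(1-(1-u_{a})^{\lambda},\,v_{a}^{\lambda})$, averages over the number of samples supplied, and measures distance in the form of Eq. (10); the names and values are hypothetical.

```python
from typing import List, Tuple

IFN = Tuple[float, float]  # (affiliation degree, non-affiliation degree)


def if_mean(samples: List[IFN]) -> IFN:
    """Equal-weight intuitionistic fuzzy average of the samples."""
    count = len(samples)
    prod_non_mu, prod_nu = 1.0, 1.0
    for mu, nu in samples:
        prod_non_mu *= 1.0 - mu  # the oplus-sum has affiliation 1 - prod(1 - mu)
        prod_nu *= nu            # and non-affiliation prod(nu)
    # Scalar multiplication by 1/count takes the count-th root of both products.
    return 1.0 - prod_non_mu ** (1.0 / count), prod_nu ** (1.0 / count)


def if_distance(a: IFN, b: IFN) -> float:
    """Distance between two intuitionistic fuzzy numbers (Eq. 10 with one element)."""
    pi_a, pi_b = 1.0 - a[0] - a[1], 1.0 - b[0] - b[1]
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (pi_a - pi_b) ** 2) / 2.0


def ideal_solutions(samples: List[IFN]) -> Tuple[IFN, IFN]:
    """Positive ideal = average of the samples; negative ideal = farthest sample."""
    r_pos = if_mean(samples)
    r_neg = max(samples, key=lambda sample: if_distance(sample, r_pos))
    return r_pos, r_neg


# Hypothetical training samples of one category; the last one is an outlier.
training = [(0.8, 0.1), (0.8, 0.1), (0.2, 0.7)]
r_pos, r_neg = ideal_solutions(training)
```

Because the negative ideal is drawn from the training samples themselves, both ideals remain defined even when the object to be classified is a single value.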

In Eq. (15), $i=({1,2,\ldots,n})$ and $s=({1,2,\ldots,m})$ . Once the positive and negative ideal values are determined, the data fusion model can be run; the procedure is shown in Fig. 2.

Figure 2.

Weighted TOPSIS intuitionistic fuzzy heterogeneous data fusion algorithm.

The training sample set in Fig. 2 has $n$ categories and each category has $k$ attributes. The samples are converted with the intuitionistic fuzzy conversion equation, each step is calculated with the equations given above, and the similarity between the final intuitionistic fuzzy value of each category and the ideal value is used as the final basis for the decision. The process fuses data from different sources with different structures and thus meets the requirements of a decision-level multi-source heterogeneous data fusion model.

4. Simulation results and performance analysis of fuzzy mathematics based multi-source heterogeneous data fusion model

The first step of multi-source heterogeneous data fusion is the intuitionistic fuzzification of the data from real scenes, that is, of the original “heterogeneous” data from the sensors. The study uses 40 sets of data as training data and 10 sets as reference data to evaluate the performance of the model.

The fuzzy conversion mechanism was analyzed first, with the classification accuracy evaluated from the degrees of affiliation and non-affiliation. The study randomly selected data as a test set to observe the conversion produced by the intuitionistic fuzzy construction method; the results are shown in Fig. 3.

Figure 3.

The intuitionistic fuzzy situation of class I data.

Figure 3 shows the intuitionistic fuzzy construction of data set I. Fig. 3a gives the membership results with respect to the three data sets; the three bars for each data group from 1 to 10 indicate the membership of the data, which is high only for the sample set to which the data belong. Except for the second group, the membership of the data to the attribute is relatively high and exceeds 0.8 for several groups, which shows that the membership division of this method is reasonable. Fig. 3b gives the non-membership with respect to the three data sets. The non-membership degree of the data is mostly below 0.1, and the non-membership degree of each data point is not higher than the membership degree of the other data sets, with an average gap of more than 0.5. This intuitionistic fuzzy method therefore has good classification performance and can effectively screen data from various sources based on attributes. To quantify the accuracy of this classification, a fuzzy score value is introduced, and the size of the score is proportional to the accuracy. The scores are shown in Fig. 4.

Figure 4.

Membership scores of Class I sample data to sets II and III.

The scores in Fig. 4 show that, except for the second group of data, the affiliation score of each data point for the set to which it belongs is significantly higher than its score for the other sets, which further validates the classification accuracy of the method. Because the second group of data showed a deviation, the study ran 800 experiments with the method to measure how often such deviations occur. The resulting accuracy of 89% indicates that the conversion method is reliable enough to meet the requirements of the data fusion model. An analysis of the anomalies found in the experiments yielded the affiliation functions for Class I and Class II shown in Fig. 5.

Figure 5.

Membership function analysis of outliers.

Figure 6.

Abnormal number of attributes.

The reference value of the Class I data in the data set is 5.5. The membership function in Fig. 5 shows that the membership degree of the Class II data at 5.5 is 0.6021, which is 0.1668 higher than that of Class I (0.4353). This is because the hesitancy degree proposed in this research is determined from the variance, so the hesitancy degree of this type of data is larger, and the resulting degree of membership differs from that given by the traditional fuzzy construction. The intuitionistic fuzzy construction proposed in this study therefore changes the membership results of the outliers, and this result is more in line with reality.

The data of each dataset are then classified. The attributes are the sepal length and width and the petal length and width, denoted A1, A2 and B1, B2, respectively. Both the traditional fuzzy construction method and the intuitionistic fuzzy algorithm proposed in the study are used to classify the dataset, and the evaluation results directly determine the judged accuracy. The outliers of all test samples were counted with this detection method, and the results are shown in Fig. 6.

As shown in Fig. 6, the attribute construction of all test samples gives more than 4 anomalies for each of the attributes A1 and A2, while the number of anomalies for the attributes B1 and B2 is below 2. This indicates that abnormal data or inaccurate classification may be caused by the misjudgment of attributes. To explore how the similarity of the four attributes interferes with the degree of membership, the distribution of the four attributes under a normal distribution was studied, and the results are shown in Fig. 7.

Figure 7.

Distribution of three data sets under four attributes.

Fig. 7 shows that for attributes A1 and A2 the three groups of data are closely distributed, Classes II and III in particular, which may affect the accuracy of the classification. The attributes B1 and B2 are more dispersed, which indicates that the method is still valid.

In terms of the algorithm itself, the degree of affiliation is directly related to the hesitation degree $\theta$ . Real-time selection of the hesitation degree can improve the classification accuracy and thus the performance of the data fusion algorithm. The study will test the optimal range of values for the whole model by selecting different values of hesitation degree under the same experimental conditions, and Fig. 8 shows the correlation between the hesitation degree and the outliers.

As can be seen from Fig. 8, the total number of anomalies remains stable for hesitation degrees $\theta\in[{0.2,0.4}]$ . The intuitionistic fuzzy transformation method constructed in the study thus extends the optimal value of the hesitation degree to a range, which accommodates the diversity of heterogeneous data from multiple sources. Within this range the method produces satisfactory decision results regardless of the data, which indicates its robustness.

Compared with the traditional model, the fusion algorithm is optimized with weighted TOPSIS, and the intuitionistic fuzzy averaging operator is upgraded to handle various kinds of data. To test the effect of the improvement, the various improvement paths were tested on the data set and under the experimental conditions described above, again with 800 runs. The results are shown in Table 1.

Table 1

Performance comparison of several algorithms

Index	Traditional algorithm	IFWA	IFWG	CSWBTA
Accuracy	81.22%	79.25%	84.54%	95.23%
Recall	77.59%	79.06%	82.52%	94.27%
True positive rate	74.61%	77.83%	83.89%	91.69%
False positive rate	9.71%	9.22%	11.21%	2.16%
Final evaluation	79.22%	77.39%	86.16%	97.52%

Figure 8.

Distribution of three data sets under four attributes.

The experimental results show that the improved algorithm proposed in the study significantly optimizes the traditional fusion algorithm, raising the accuracy by 14.01 percentage points. The IFWA optimization path gives a lower accuracy than the original algorithm, because the weight assignment of this operator takes the importance of the attributes into account, whereas IFWG improves the accuracy to some extent. Overall, the optimization path constructed in this study performs best, and the improvement of TOPSIS has a positive effect on the decision results for fused data. The prediction performance of the above control experiments was also evaluated under different hesitation thresholds, and the results are shown in Fig. 9.

Figure 9.

Comparison of accuracy of algorithms under different thresholds.

As can be seen from Fig. 9, both the IFWG and IFWA operators effectively improve the accuracy compared with a single dimension, but the most significant improvement comes from the optimized TOPSIS operator. For hesitation degrees in the interval (0.15, 0.5), a prediction accuracy of 0.8 can be taken as the dividing line between the accuracy of a single dimension and that of the optimization operators. This further shows that the optimization path of the operator is reliable, leaves room for fault tolerance, and can significantly improve the classification accuracy.

5. Conclusion

Fused multi-source heterogeneous data serves as the basis for intelligent control decision-making in various fields. Transforming such data with fuzzy mathematical functions takes advantage of the consistency of results and methods. Specifically, the decision-level fusion model constructed here relies on fuzzy mathematics to achieve data classification. The research constructs an intuitionistic fuzzy transformation algorithm based on fuzzy sets to transform the original data. The simulation results show that the overall classification performance of the algorithm is good. Errors do occur when the algorithm is evaluated attribute by attribute; the reason is that the attributes of the data set under the Gaussian distribution are too close to each other. The repeated simulations on the iris data set show that the accuracy of the model is acceptable. To address the insufficient classification performance in a single dimension, the study introduces the TOPSIS algorithm into the fuzzy ensemble operator, which improves the accuracy of the original model. Although the IFWG and IFWA operators can raise the accuracy above 80% when the hesitation degree is in the interval (0.15, 0.5), the operator introduced in the study performs better. This shows that the construction idea and optimization path are reliable and can meet the decision-making needs of the fusion center in a wireless sensor network environment. However, this research only improves the classification accuracy at the level of local decision-making. Future work will explore data fusion models for multiple types of sensors.

References

1. Yan, Guo. Rotor unbalance fault diagnosis using DBN based on multi-source heterogeneous information fusion. Procedia Manuf. 2019; 35: 1184-1189.

2. Huang, Wang. Network security situation awareness based on the optimized dynamic wavelet neural network. Int J Net Secur. 2018; 20(3): 593-600.

3. Liu, Xie. Urban flow pattern mining based on multi-source heterogeneous data fusion and knowledge graph embedding. IEEE T Knowl Data En. 2021; 1041-4347.

4. Design and application of soil moisture content monitoring system based on cloud-native technology. Trans Chin Soc Agri Eng. 2020; 36(13): 165-172.

5. Mao, Zhang. Analysis of entrepreneurship education in colleges and based on improved decision tree algorithm and fuzzy mathematics. J Intell Fuzzy Syst. 2021; 40(2): 2095-2107.

6. Alexandru, Ciobanu. Fuzzy sets within Finitely Supported Mathematics. Fuzzy Set Syst. 2018; 339: 119-133.

7. Wang, Qin, Xie, Shen, Zhao, Huang, et al. Coalbed methane system potential evaluation and favourable area prediction of Gujiao blocks, Xishan coalfield, based on multi-level fuzzy mathematical analysis. Journal of Petroleum Science and Engineering. 2018; 160: 136-151.

8. Liu. Research on control method of pollutant total amount of water quality based on fuzzy mathematics. Earth Sci Res J. 2020; 24(2): 191-199.

9. Peng. Aviation industry management model and exchange rate index analysis based on error correction model and fuzzy mathematics. J Intell Fuzzy Syst. 2019; 37(5): 6363-6375.

10. Lin, Khan, Cui. Multi-source information fusion based heterogeneous network embedding. Inf Sci. 2020; 534: 53-71.

11. Zhou, Yang, Chen, Wen. Fault diagnosis based on deep learning by extracting inherent common feature of multi-source heterogeneous data. P I Mech Eng I-J Sys. 2021; 235(10): 1858-1872.

12. Chai, Luo. A multi-source heterogeneous data analytic method for future price fluctuation prediction. Neurocomputing. 2020; 11-20.

13. Liu, Xie. Urban flow pattern mining based on multi-source heterogeneous data fusion and knowledge graph embedding. IEEE T Knowl Data En. 2021; 1041-4347.

14. Zhu, Qiu, Shi, Chambua, Niu. Heterogeneous teaching evaluation network based offline course recommendation with graph learning and tensor factorization. Neurocomputing. 2020; 415: 84-95.

15. Dhar, Kundub. A novel method for image thresholding using interval type-2 fuzzy set and Bat algorithm. Appl Soft Comput. 2018; 63: 154-166.

16. Rajasoundaran, Prabu, Routray, Malla, Kumar, Mukherjee, et al. Secure routing with multi-watchdog construction using deep particle convolutional model for IoT based 5G wireless sensor networks. Comput Commun. 2022; 187: 71-82.

17. Angwech, Alfa, Maharaj BTJ. Managing the harvested energy in wireless sensor networks: A priority Geo/Geo/1/k approach with threshold. Energy Rep. 2022; 8: 2448-2461.

18. Fan, Song, Lei. New operators for aggregating intuitionistic fuzzy information with their application in decision making. IEEE Access. 2018; 6: 27214-27238.

19. Rani, Mishra, Mardani, Cavallaro, Krishankumar, Streimikiene. An extended fuzzy divergence measure-based technique for order preference by similarity to ideal solution method for renewable energy investments. Renew Energy Driven Future. 2021; 469-490.