Abstract
An increasing number of applications require the integration of data from various disciplines, which leads to problems with the fusion of multi-source information. In this paper, a special information structure formalized in terms of three indices (the central presentation, population or scale, and density function) is proposed. Single and mixed Gaussian models are used for single-source information and their fusion results, and a parameter estimation method is also introduced. Furthermore, fuzzy similarity computing is developed for solving the fuzzy implications under a Mamdani model and a Gaussian-shaped density function. Finally, an improved rule-based Gaussian-shaped fuzzy control inference system is proposed in combination with a nonlinear conjugate gradient and a Takagi-Sugeno (T-S) model, and its effectiveness is demonstrated in comparison with other fuzzy inference systems.
Introduction
The human brain obtains information from different sources; it then merges this information to form concepts and finally outputs natural language (NL), which is powerful and versatile enough to describe the real world. NL can be regarded as the fusion of disparate information; it is vague, ambiguous, and uncertain. The quantitative calculation and qualitative analysis of NL is the ultimate goal of artificial intelligence. There are two strands of research linking the initial information acquisition with NL: (1) how to simplify the presentation of NL and (2) how to form NL from multi-source information. Usually, humans express emotions about certain objects by using sentences and affective words, but they cannot fully express their intuitive perception of an object simply by separating these terms. Natural Language Processing (NLP) was developed to solve this problem; however, many difficulties remain in this field. Computing with Words (CW) was also introduced to decrease the complexity related to linguistic variables [16–18]. This has allowed for a more exact expression of what a human is thinking about and has provided a feasible direction for NLP under weakened conditions. Zadeh introduced the framework of this phenomenon of uncertainty using Fuzzy Sets (FS) in 2005 [19]. The FS theory was also addressed to describe objects at a coarse-grained level. Herrera and Martínez [5] introduced a 2-tuple fuzzy linguistic representation model for CW without any loss of information. Furthermore, Lawry [13, 14] proposed Label Semantics (LS) for vague concept modeling and reasoning techniques so as to formalize uncertainty in presentation theory. Subsequently, Lawry and Tang [12, 34, 35] proposed a new semantic understanding model: the Prototype Theory (PT). These works discovered the connection between fuzzy presentation technology and high-level semantics.
In engineering fields, linguistic representation models combined with affective words have had some applications, such as fuzzy decision making [21, 31] and KANSEI Engineering (KE). Fuzzy inference methodologies have also been shown to be effective in our previous work on Rough Sets [7] and Fuzzy Support Vector Machines (SVMs) [6].
However, it has been regarded as more feasible to focus on multi-source information fusion rather than on NL itself. Moreover, it is important to discover the mechanics of integrating multi-source information in the human brain. Due to the modular and vague appearance of multi-source information, uncertainty reasoning methods and their associated mathematical tools are thought to offer more interpretability and a much stronger generalization capability [24]. Yager developed the theoretical foundation for multi-source information fusion techniques based on set measure and possibility theories [25, 26]. Normally, single-source information consists of steady features that are more easily formalized and parameterized. In previous studies, the sum, product, max/min, and Weighted Arithmetic Mean (WAM) were used to combine single-source information, and each output represented an independent source of information that could be treated separately [15].
With respect to mathematical research on, and the understanding of, the phenomenon of uncertainty, the integration of information using fuzzy inference techniques pervades many scientific disciplines, such as multivariate and type-2 fuzzy sets; bipolar models [10, 11]; and probability and possibility issues [9, 27]. Information fusion is the merging of information from disparate sources with differing conceptual, contextual, and typographical representations. It has been successfully applied in data mining and the consolidation of data from unstructured or semi-structured resources, and it has also led to many achievements in various fields [1, 8]. Fusion methods include product fusion (such as the Bayes posterior probability model), linear fusion (SVM classifiers), and nonlinear fusion (super-kernel integration) [23]. Recent developments and applications of fuzzy information fusion can be found in pattern classification, image analysis, decision-making, man-made structures, and medicine [30, 32]. Furthermore, over the past several years, there have been a number of successful applications of fuzzy integrals in decision-making and pattern recognition that have employed multiple information sources [3, 20].
In this paper, we formalize multi-source information as a multivariable group and describe each information structure as a special kind of triple, I = < P, d, ρ >, where P denotes a typical point of positive examples relative to the information structure I, d is a distance measurement that represents the population of information, and ρ is a Probability Density Function (PDF). The basic idea of this formalized information structure is to assume that the neighborhood radius of each information structure is uncertain and is limited by the PDF ρ. Thus, we calculate the value of P relative to an information structure under a given level. An information fusion technique is developed by formalizing this special information structure, and information fusion employing fuzzy sets is applied. A Single Gaussian Model (SGM) is applied to single-source information, and a Gaussian Mixed Model (GMM) is applied to the fusion of this information by incorporating probabilistic and statistical methods [28, 36].
The remainder of this paper proceeds as follows. In Section 2, we propose an information structure that incorporates a definition of the information kernel, boundary, and Gaussian PDF. An improved algorithm for parameter estimation is also introduced. Section 3 introduces fuzzy similarity relations and IF-THEN rules for this special information structure. These are helpful for calculating the possibilities in a rule-based fuzzy inference system (FIS). Section 4 develops a rule-based information fusion model using a conjugate gradient and Takagi–Sugeno (T-S) model under a rule-based Gaussian-shaped fuzzy inference system (RGS-FIS). A time-series analysis using natural disaster datasets is also introduced using RGS-FIS, and we demonstrate the effectiveness of our method in comparison to other methodologies. Finally, in Section 5, we give our conclusions and ideas for future work.
Information fusion models by using probability density function
Definitions
Definitions for our information structure and kernel computing method were established as follows.
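$d(P_k \pm Q_k) = \| P_k \pm Q_k \|, \quad \forall P_k, Q_k \in \mathbb{R}^n$
$\forall \alpha, \beta \in \mathbb{R}$ and $P_k, Q_k \in \mathbb{R}^n$, we have $d(\alpha P_k \pm \beta Q_k) = \| \alpha P_k \pm \beta Q_k \|$; in addition,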
The boundary of v k gives the scale of the neighborhood of all elements in this special information structure. This is defined below.
- The Upper Approximation Boundary (UAB)
- The Lower Approximation Boundary (LAB)
Therefore, the boundary is P B = UP B ∖ LP B = B (u, t). Thus, we have P B = P K + λ (P B - P K ), λ ∈ [0, 1], which exhibits fuzziness attributes at the boundary.
- Single Gaussian Model for single-source information
The Gaussian distribution is a continuous probability distribution with a bell-shaped PDF in one-dimensional space:
The parameter μ is the mean or expectation, and σ2 is the variance. The SGM is applied to induct the density function of the proposed information structure I, and we define:
The maximum likelihood estimation can be used to estimate the parameters (Φ, μ) under (8). Taking the logarithm of (8), we have:
Taking the partial derivative w.r.t. μ of O (μ, Φ) and setting it to 0, we obtain the following:
This gives . Similarly, for Φ, we can obtain . Thus, if the density of each point in $v_k$ is , then our estimation of the parameter μ is:
The covariance is converted to
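As a rough illustration of the maximum likelihood step for a single Gaussian, the sketch below uses the standard closed-form estimates in one dimension (the sample mean and the 1/N variance). It is not the density-weighted estimator of this section, whose equations are not reproduced here. The names `fit_sgm` and `gaussian_pdf` and the sample values are assumptions made for illustration only.

```python
import math

def fit_sgm(samples):
    """Closed-form maximum likelihood fit of a one-dimensional Gaussian.

    Returns (mu, sigma2): the sample mean and the 1/N (ML) variance.
    """
    n = len(samples)
    mu = sum(samples) / n
    sigma2 = sum((x - mu) ** 2 for x in samples) / n
    return mu, sigma2

def gaussian_pdf(x, mu, sigma2):
    """Bell-shaped density of N(mu, sigma2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

# Hypothetical sample points standing in for the elements of one structure.
mu_hat, sigma2_hat = fit_sgm([1.0, 2.0, 3.0, 4.0, 5.0])
peak = gaussian_pdf(mu_hat, mu_hat, sigma2_hat)  # density at the estimated center
```

In the multivariate case the same idea applies with a mean vector and a covariance matrix Φ in place of the scalar variance.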
- Gaussian Mixed Model and parameter estimation
For multi-source information fusion, we need to calculate the density functions of all $I_k$ as well as the new density function. For m multi-source information structures, let for a normalized weight parameter α: i.e., $\sum_i \alpha_i = 1$. To calculate and simplify the covariance matrix Φ, let
From the SGM, we have that
Calculate:
and
Then, for , c ∈ R, GMM is defined as $G(P) = \sum_i \alpha_i \delta(P, \mu_i, \sigma_i)$, $i = 1, 2, \cdots, l$. The number of parameters for estimation is 3l. If we let , the objective is that:
Let , so that:
Similarly, we can find:
Setting the above two equations equal to 0, we have
For $\alpha_j$, under the constraint $\sum_j \alpha_j = 1$, we use Lagrange multipliers to re-define the objective as:
Differentiating this new objective w.r.t. $\alpha_j$, we have that:
Given an initial value and in order to achieve convergence, $\mu_1, \mu_2, \cdots, \mu_m$ may be calculated by the cluster method.
If for a given threshold δ, then stop the process; otherwise, proceed to Step 2.
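The iterative estimation described above (initial means from clustering, repeated re-estimation, and a stopping threshold) resembles the standard expectation-maximization (EM) procedure for Gaussian mixtures. The sketch below is a minimal one-dimensional EM loop under that assumption. The function name `fit_gmm_1d`, the initial spread, the use of the log-likelihood change as the stopping test, and the sample data are all illustrative choices rather than details taken from this paper.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fit_gmm_1d(data, init_mus, tol=1e-8, max_iter=500):
    """EM for a 1-D Gaussian mixture; init_mus would come from a clustering step."""
    n_comp = len(init_mus)
    mus = list(init_mus)
    alphas = [1.0 / n_comp] * n_comp              # normalized weights, sum to 1
    spread = (max(data) - min(data)) / n_comp or 1.0
    sigmas = [spread] * n_comp
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: responsibility of each component for each point.
        resp, ll = [], 0.0
        for x in data:
            w = [alphas[j] * gaussian_pdf(x, mus[j], sigmas[j]) for j in range(n_comp)]
            total = sum(w)
            ll += math.log(total)
            resp.append([wj / total for wj in w])
        # M-step: re-estimate weight, mean, and standard deviation per component.
        for j in range(n_comp):
            nj = sum(r[j] for r in resp)
            alphas[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = math.sqrt(max(var, 1e-12))  # floor avoids a collapsed component
        # Stop once the log-likelihood changes by less than the threshold.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return alphas, mus, sigmas

# Hypothetical data: two well-separated groups of points.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
alphas, mus, sigmas = fit_gmm_1d(data, [0.5, 4.5])
```

Each component contributes three parameters (weight, mean, and standard deviation), which matches the count of 3l parameters stated for l components.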
In actuality, the density function of information fusion under this special structure is a product of the fusion of SGMs. For all information structures $v_k$ and their SGM densities $\delta(v_k)$, ; therefore, we have that:
where C is an undetermined constant.
In particular, in a one-dimensional space with σ = 1, we have that:
This is a linear transformation of the basic Gaussian function. Thus, for any two information structures $v_i$, $v_j$, the fusion result is $v_{ij} = \langle \alpha P_{ij}, \beta d_{ij}, \gamma \delta_{ij} \rangle$, where α, β, and γ are undetermined coefficients.
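To make the product-of-Gaussians idea concrete, the sketch below applies the standard identity that the product of two one-dimensional Gaussian densities is a scaled Gaussian with a precision-weighted mean. The function name `fuse_gaussians` is hypothetical, and the scale factor `c` here is the normalizing constant of that identity. It is offered only as an analogue of the undetermined constant C in the text, not as the paper's own expression.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fuse_gaussians(mu1, sigma1, mu2, sigma2):
    """Return (mu, sigma, c) such that pdf1(x) * pdf2(x) == c * pdf(x; mu, sigma)."""
    p1, p2 = 1.0 / sigma1 ** 2, 1.0 / sigma2 ** 2   # precisions
    var = 1.0 / (p1 + p2)
    mu = var * (p1 * mu1 + p2 * mu2)                # precision-weighted mean
    c = gaussian_pdf(mu1, mu2, math.sqrt(sigma1 ** 2 + sigma2 ** 2))
    return mu, math.sqrt(var), c

# One-dimensional case with unit standard deviations, as in the special case above.
mu, sigma, c = fuse_gaussians(0.0, 1.0, 2.0, 1.0)
```

With equal unit variances the fused center lies midway between the two sources and the fused spread is narrower than either input.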
Fuzzy implications of information structures under IF-THEN rules
In fuzzy sets, the rule “IF x is , THEN y is ” indicates a fuzzy implication between and as denoted by . If we let x, y ∈ [0, 1] be the memberships of and , respectively, the Mamdani model computes the membership as: ∀x, y ∈ [0, 1], F (x, y) = Min {x, y}.
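The Mamdani operator above can be sketched in a few lines. The example assumes Gaussian-shaped membership functions scaled to peak at 1 (an unnormalized Gaussian, which is an assumption rather than the density used later in the paper), and the names `gaussian_membership`, `mamdani`, and `rule_surface` are illustrative.

```python
import math

def gaussian_membership(x, mu, sigma):
    """Gaussian-shaped membership in [0, 1], equal to 1 when x == mu."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def mamdani(x, y):
    """Mamdani implication F(x, y) = min{x, y} for memberships x, y in [0, 1]."""
    return min(x, y)

def rule_surface(u, v, mu_a=0.0, sigma_a=1.0, mu_b=0.0, sigma_b=1.0):
    """Strength of 'IF u is A THEN v is B' with Gaussian-shaped A and B."""
    return mamdani(gaussian_membership(u, mu_a, sigma_a),
                   gaussian_membership(v, mu_b, sigma_b))

strength = rule_surface(1.0, 2.0)  # limited by the weaker of the two memberships
```

Because the minimum is taken pointwise, the resulting surface is continuous but not differentiable along the ridge where the two memberships are equal.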
We construct a fuzzy membership based on a new fuzzy implication and inference system. We also derive a similarity relationship and apply this to the Gaussian density function-based fuzzy rule inference system. For δ k in a rule-based IF-THEN inference system, suppose that the rule set is:
IF $I_1$ is $v_1$, THEN $I_o$ is $v_o$, $\omega_1$;
IF $I_2$ is $v_2$, THEN $I_o$ is $v_o$, $\omega_2$.
We can integrate these rules as:
IF $I_1$ is $v_1$ and $I_2$ is $v_2$, THEN $I_o$ is $v_o$,
The Mamdani model for δ k in two-dimensional space (x, y) will be:
In particular, if σ11 = σ12 = σ21 = σ22 = 1 and μ11 = μ12 = μ21 = μ22 = 0, we have that:
Thus, for any other implication operators, the function of rules will have the form:
Mamdani model-based fuzzy control inference system using nonlinear conjugate gradient
In the previous section, information was formalized as $v_k = \langle P_k, d_k, \rho_k \rangle$, where $P_k$ is a central point in $\mathbb{R}^n$, $d_k \in \mathbb{R}$, and $\rho_k$ is a Gaussian density function. $P_k$ and $d_k$ operate as fuzzy numbers using the fuzzy logical operation in Section 2. In our fuzzy rule-based inference system, if different information (multi-source information) implies the same conclusion, then this information is integrated. Supposing that the multi-source information structure $v_k$ will conclude with a particular assertion at the $\rho_k$ level, we have that:
This can be simplified to,
and
Now, we only discuss the density function $\delta_k$ (here $\rho_k$) in our FIS. Letting $\Theta = [\delta_1, \delta_2, \cdots, \delta_n]$, we have that:
From the previous section, we know that ϖ (Δ) is a Gaussian density function, so the rule is re-labeled as IF X THEN f (X). For this rule set, we have
However, as f (X) is a nonlinear function, it is difficult to find its minimum point under the Mamdani model, so we need to linearize f (X) and use the nonlinear conjugate gradient algorithm to optimize the parameters of f (X).
If we suppose that $f(X) = [Ax - b]^T [Ax - b]$, then the gradient is $\nabla_x f(x) = 2A^T (Ax - b)$, and the objective is to find x subject to $\nabla_x f(x) = 0$. The nonlinear conjugate gradient requires f to be twice differentiable; as f is a Gaussian function, it is infinitely differentiable, so this requirement is met. Starting in the direction opposite to the gradient, $\Delta x_0 = -\nabla_x f(x_0)$, with step size α, we have that:
This is the first iteration in the direction of Δx0, and by setting the initial conjugate direction s0 = Δx0, the following steps will calculate Δx n :
The algorithm is based on the quadratic function that we use to normalize the Gaussian function f (x) in order to speed up the iterations. Considering a simplified Mamdani model and from formula (20), we know that:
Using the nonlinear conjugate gradient, we obtain the results given in Fig. 1 and Table 1 by comparing with other special functions. From Table 1, we know that the Gaussian density function will be approximated in just a few steps by the nonlinear conjugate gradient algorithm, which is the reason we selected the Gaussian distribution as the density function of this special structure. We also compare other forms of density function, which appear to require more steps under the nonlinear conjugate gradient algorithm.
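The iteration described in this section can be sketched for the quadratic objective f(x) = (Ax - b)^T (Ax - b). The update rule for the conjugate direction is not reproduced in the text here, so the sketch assumes the Fletcher-Reeves formula and an exact line search, which is available in closed form for a quadratic. The helper names and the 2x2 example system are hypothetical.

```python
def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gradient(a, b, x):
    """Gradient 2 A^T (Ax - b) of f(x) = (Ax - b)^T (Ax - b)."""
    residual = [ri - bi for ri, bi in zip(matvec(a, x), b)]
    return [2.0 * g for g in matvec(transpose(a), residual)]

def conjugate_gradient(a, b, x0, tol=1e-10, max_iter=100):
    """Fletcher-Reeves conjugate gradient (assumed variant) with exact line search."""
    x = list(x0)
    g = gradient(a, b, x)
    s = [-gi for gi in g]                 # initial direction s0 = -grad f(x0)
    steps = 0
    while dot(g, g) > tol ** 2 and steps < max_iter:
        a_s = matvec(a, s)
        # Exact minimizer of f(x + alpha * s) along s for this quadratic.
        alpha = -dot(g, s) / (2.0 * dot(a_s, a_s))
        x = [xi + alpha * si for xi, si in zip(x, s)]
        g_new = gradient(a, b, x)
        beta = dot(g_new, g_new) / dot(g, g)          # Fletcher-Reeves coefficient
        s = [-gn + beta * si for gn, si in zip(g_new, s)]
        g = g_new
        steps += 1
    return x, steps

# Hypothetical system whose least-squares solution is x = [1, 3].
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 3.0]
x_min, n_steps = conjugate_gradient(A, b, [0.0, 0.0])
```

For a quadratic in n variables, this scheme reaches the minimizer in at most n steps in exact arithmetic, which is consistent with the small step counts reported for the quadratic approximation of the Gaussian.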
Takagi and Sugeno [18] proposed a fuzzy IF-THEN rules system as the local input–output relations of a nonlinear system to scale the population of rules under a multi-dimensional fuzzy inference system, known as the T-S model [21]. The normal rules for the T-S model under the special information structure proposed for our information fusion method are:
The T-S model outputs a linear, non-constant function that will reduce the population of rules.
From rule set RT-S, we can simplify , in which a i and b i are undetermined constants. Let the standard deviation in Equation (6), μ = 0, and thus, . The first part of I f is a GMM model that can be estimated by Section 2.2-(2), and the second part of I f is a linear function (see Fig. 2).
Furthermore, for the nonlinear conjugate gradient proposed in Section 4.1, we obtain 100 steps and 301 gradients to find the minimum point (the Mamdani model). As a result, we can simplify this in RGS-FIS under the T-S model to output three linear membership functions. Suppose that the inputs are Gaussian-shaped rules, and the outputs are linear functions. Let the membership function of INPUT 1 and INPUT 2 be a Gaussian function, and the OUTPUT is composed of three linear functions [33]. We have this RGS-FIS system under the T-S model (see Fig. 3).
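A T-S system of the kind described above, with two Gaussian-shaped inputs and three linear output functions, can be sketched as follows. The rule centers, spreads, and linear coefficients are hypothetical placeholders rather than values from Fig. 3. The use of the minimum for the AND of the two inputs and of the firing-strength weighted average for the output are assumptions in line with common T-S practice.

```python
import math

def gaussian_membership(x, mu, sigma):
    """Gaussian-shaped membership in [0, 1], equal to 1 when x == mu."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Each rule: (Gaussian for INPUT 1, Gaussian for INPUT 2, linear output a1*x1 + a2*x2 + b).
# All numbers are illustrative placeholders.
RULES = [
    ((-1.0, 1.0), (-1.0, 1.0), (1.0, 0.0, 0.0)),
    ((0.0, 1.0), (0.0, 1.0), (0.5, 0.5, 1.0)),
    ((1.0, 1.0), (1.0, 1.0), (0.0, 1.0, 2.0)),
]

def ts_output(x1, x2, rules=RULES):
    """Weighted average of the linear consequents, weighted by rule firing strength."""
    num = den = 0.0
    for (mu1, s1), (mu2, s2), (a1, a2, b) in rules:
        # Firing strength: AND of the two Gaussian memberships, taken as min.
        w = min(gaussian_membership(x1, mu1, s1), gaussian_membership(x2, mu2, s2))
        num += w * (a1 * x1 + a2 * x2 + b)
        den += w
    # den is positive for inputs near the rule centers; far-away inputs are not guarded.
    return num / den

y = ts_output(0.0, 0.0)
```

Because each rule carries a linear function rather than a fuzzy output set, three rules cover the input space that would otherwise need a much larger Mamdani rule base.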
Concluding remarks and future works
This paper proposed a novel information structure applicable to a Gaussian-shaped FIS. We developed the RGS-FIS approach using the nonlinear conjugate gradient algorithm and a T-S model. However, there are two problems with RGS-FIS: one is that new fusion operator parameters depend on a complex estimation process, and the other is that all data variables are supposed to be independent (r = 0). The model selection for similarity computing under rule-based fuzzy implication operations should also be improved.
Future work will focus on the pre-processing of datasets as well as the estimation of model parameters. Pre-processing will tune the parameters of the model to display a simpler mathematical presentation and assure a robust inference process. Furthermore, the fusion operator needs to be improved so that it does not solely depend on fuzzy implications. Although similarity computing is the key factor for calculating the possibility of IF-THEN rules, it is not clear whether a feasible algorithm can be developed for this. Hence, the possibility of the IF-THEN rules also needs to be calculated and improved.
Acknowledgments
The authors would like to thank the editors, the anonymous reviewers, and Dr. Thayer El-Dajjani for their most constructive comments and suggestions to improve the quality of this paper. This work is supported by the Zhejiang Provincial Natural Science Fund under No. LY13H180012.
