Practical aspects of equivalence of Baldwin’s and Zadeh’s fuzzy inference

Abstract

The article presents a thorough analysis of the fuzzy inference introduced by Baldwin and compares this approach with Zadeh’s compositional rule of inference. The comparison analyzes the equivalence of the two methods and describes the practical consequences of this fact for simple and compound premises, indicating the advantages and disadvantages of both approaches. The analysis focuses mainly on the computational complexity of the methods. The most important feature of Baldwin’s inference is the transfer of the inference process into a truth space, unified for all input variables. Such an environment makes it possible to obtain one fuzzy truth value describing a compound premise in a sequence of low-dimensional computations. The article proves the equality of this approach with the compositional rule of inference. Consequently, for compound premises, where the compositional rule of inference is multidimensional, Baldwin’s solution is much more computationally efficient.

Keywords

Fuzzy inference, Fuzzy truth value, Fuzzy sets

1 Introduction

Approximate inference based on a fuzzy truth value was first introduced by Baldwin in a research report at Bristol University in 1977 [1], four years after Zadeh had presented the compositional rule of inference [43]. However, the subsequent paper [2], published in 1979, has become more popular and is therefore more frequently cited.

There are a few things that make Baldwin’s approach so intriguing. First, it follows directly from the extension of classical logic [2]. Second, as emphasized by the author himself [2], it is free of the problem of multidimensionality for rules with a compound premise, which is observed for the compositional rule of inference [17, 20]. This results from transferring the inference process into the truth space, which is a unified environment for all input variables of a fuzzy system. However, Baldwin probably did not realize that his more complex solution is equivalent to Zadeh’s approach in terms of the results obtained. Nor did he consider the simple singleton approach, the most common in applications, which reduces Zadeh’s inference to a very efficient mechanism.

Zadeh’s and Baldwin’s techniques are two completely different views on fuzzy inference. The first is extremely popular; the second is practically forgotten. This situation raises many questions. For example, when is it worth using Baldwin’s inference and when is it not? Is Baldwin’s solution really too complex to implement? Can we simplify the method using singletons as membership functions for input variables? Will this make the method as efficient as those commonly used? This article answers these questions. Subsequent sections present a detailed analysis of Baldwin’s approximate inference, based on a fuzzy truth value. This approach is compared with the classical solution, based on Zadeh’s compositional rule of inference, and the equivalence of both systems is demonstrated. The main purpose of the comparison, however, is computational complexity, where Baldwin’s inference shows its advantages. The presented analysis extends the key theory of this subject and makes it possible to implement computationally efficient fuzzy systems without the simplifying assumptions that are commonly used in practice. This subject has never been analyzed in the literature, and it is a very important aspect from the point of view of practical applications.

In further sections of this paper, both approaches to fuzzy inference will be referred to as Baldwin’s inference and Zadeh’s inference.

1.1 Related works

Unfortunately, despite the tremendous extent of the literature on fuzzy reasoning systems, Baldwin’s inference is hardly ever referred to. In the summary of his pioneering article [2], the author himself emphasizes that the concept is only in its initial phase and that a lot of research, which he planned, is required to develop the method and its applications. However, this plan was not carried out in subsequent research. Fortunately, the approach has not been forgotten and is available in books concerning fuzzy sets and approximate reasoning [6, 31].

A possible reason for this situation is the paper [37], whose authors question the idea of inverse truth functional modification and criticize the approach based on it. The authors show that the fuzzy inference presented by Baldwin can be converted back to the direct approach, as they call Zadeh’s solution. In general, the equivalence of the two methods is proven; therefore, as they suggest, any modifications transforming the direct approach into the truth space and back are redundant.

The main problem in this analysis concerns computational complexity. The authors show, using Baldwin’s definition, that the inverse truth functional modification is problematic for a multidimensional antecedent, because obtaining a compound truth function is as complex as operating on a multidimensional relation in the direct approach.

Unfortunately, the authors did not consider obtaining the truth function of a compound statement by joining one truth function after another, working all the time in a low-dimensional environment. Without other simplifying assumptions this is not possible in the direct approach, because subsequent premises are defined in different spaces. The advantage of Baldwin’s approach lies in this very aspect of the inference process: the truth functions representing relationships between facts and premises create a unified environment, which allows subsequent compositions of only two truth functions at a time. Therefore, the multidimensional computational effort does not arise and the complexity is linear in the number of premises in a compound rule. These advantages are also briefly shown in [17, 20].

Other works referring to the approach are [3, 39] and [4]. The paper [39] considers using various fuzzy implications in Baldwin’s inference and introduces some simplifications that lower the computational complexity of calculating the truth function of a conclusion. In papers [3] and [4] Baldwin refers to his pioneering work [2], proposes a simple algorithmic implementation and analyzes fuzzy implication in various approximate inference solutions.

Dubois and Prade have also made a significant contribution to the field of fuzzy truth values. They refer to Baldwin’s paper in [12, 13]; however, their research only mentions the topic of a fuzzy truth value and does not concern Baldwin’s method in particular. A similar situation can be encountered in other papers. For instance, the authors of [11, 30] propose different methods of approximate inference using both fuzzy and non-fuzzy truth values. Yang et al. [41] recommend transferring the inference mechanism based on classical sets to fuzzy sets using the extension principle. In these papers Baldwin’s method is neither analyzed nor compared with the proposed approaches.

Jantzen [16] suggests an interesting matrix solution, where a fuzzy truth value is a three-element vector of values from the range [0, 1]. Again, Baldwin’s method is listed only as an example of work contributing to the field of fuzzy inference and fuzzy truth values.

In the above-mentioned papers, Baldwin’s research is referred to primarily in the introductions, where the authors enumerate various approaches in the state of the art.

The remaining papers only quote the publication [2] emphasizing the author’s contribution to the field of fuzzy inference, but in fact they focus on other problems. The following papers are good examples of research belonging to this group: [15, 29] and [34].

In more recent studies, references to Baldwin’s work can be found in the field of type-2 fuzzy sets. They refer to the fuzzy truth value of each element of a type-2 fuzzy set, but not to the method of inference based on it. The following papers are examples of such research: [14] and [27].

The equivalence of different approximate inference solutions [35, 47] under certain conditions is discussed in [42] for Zadeh’s and Tsukamoto’s approaches and in [40], where Sugeno’s and Takagi’s approaches are compared and the equivalence of all the mentioned methods is considered for Mamdani and Assilian’s conjunction. The work [26] is also worth mentioning because it compares the solutions proposed by Zadeh, Mamdani and Mizumoto. However, comparisons of Baldwin’s approach with other inference methods, except for the already mentioned [37], cannot be found.

In recent years, new approaches to approximate inference have been proposed in the literature: optimal fuzzy inference [22] and the fractional fuzzy inference system [24]. Both are alternatives to Zadeh’s compositional rule of inference. The method described in [22] reduces the inference process to an optimization task: the result of inference is sought such that the fuzzy relation arising from the fact and this result is ‘similar’ to the fuzzy relation containing the knowledge. The similarity can be assessed on the basis of several criteria: the sum of the absolute differences, the sum of the squared differences, and the maximum difference. This method is drastically different from both Zadeh’s compositional rule of inference and Baldwin’s method. So far, no one has proposed extending this method to linguistically defined degrees of truth. In turn, the method described in [24] is based on an alternative concept of the membership function of a set, i.e. on the fractional horizontal membership function. The membership function of this set uses a new variable called the relative-distance-measure or the horizontal index. Using the concept of the fractional horizontal membership function, an extension of Zadeh’s compositional rule of inference is introduced in [24]. An interesting feature of these systems is that the original Zadeh method is their special case (for a fractional index equal to 1). No extension to Baldwin’s inference method has been developed for this approach either. In summary, the new methods of approximate inference currently have only incidental applications, and the Zadeh method is still the leader among approximate inference methods.

The analysis of the state of the art shows that the interesting approach to fuzzy inference introduced by Baldwin has not been further developed or even applied in practice. It is particularly hard to find documented use cases of this method, as well as papers confronting it with other widely used solutions. There are two possible reasons for this situation. First, the criticism in paper [37], published three years after the method was introduced, may have discouraged many scientists from pursuing research in the field. The second reason may be related to the apparently higher degree of complexity compared to other simplified solutions. Therefore, the authors of this paper intentionally focused on comparing the solution with the compositional rule of inference, as well as on the problems associated with the possible application of this approach.

The following sections are organized as follows. First, in Section 2, Baldwin’s and Zadeh’s inference methods are analyzed in a simplified environment (assuming facts as singletons). Next, in Section 3, the methods are compared in the general case, where the membership functions of facts are not limited. The special case with singletons is analyzed mainly for educational purposes, as equivalence in this situation can be shown directly and the process is much easier. In both cases, the equality of Baldwin’s and Zadeh’s solutions is proven and, most importantly, the paper discusses the practical aspects of this fact. Finally, in Section 3.6, the problem of the associativity of truth functions is analyzed. This is an important contribution to the theory and particularly important for the computational complexity of inference. Section 3.7 describes the results of numerical examinations performed on the analyzed fuzzy inference methods, in order to test the real computational complexity and to evaluate the approaches in the context of practical applications.

2 Equivalence in case of facts in a form of singletons

This section focuses on the comparison of Baldwin’s and Zadeh’s approaches, assuming the simplest form of a fact membership function, which is a singleton. The analysis is divided into two stages. The first stage focuses on a non-compound premise, further referred to as a simple premise, following the generalized modus ponendo ponens [43].

$\begin{matrix} \text{FACT}: & X \text{ is } A' \\ \text{RULE}: & \text{if } X \text{ is } A \text{ then } Y \text{ is } B \\ \text{CONCLUSION}: & Y \text{ is } B'. \end{matrix}$ (1)

Analogically, the second stage analyzes a compound premise in the following inference process

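$\begin{matrix} \text{FACT}: & X_{1} \text{ is } A_{1}', X_{2} \text{ is } A_{2}', \dots, X_{N} \text{ is } A_{N}' \\ \text{RULE}: & \text{if } X_{1} \text{ is } A_{1} \text{ and } X_{2} \text{ is } A_{2} \text{ and } \dots \text{ and } X_{N} \text{ is } A_{N} \text{ then } Y \text{ is } B \\ \text{CONCLUSION}: & Y \text{ is } B'. \end{matrix}$ (2)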

Throughout the paper, the membership functions of facts A′ are denoted by μ_A′ for the simple premise and $μ_{A_{i}'}$ for i = 1, ⋯ , N in case of the compound premise. Analogically, the membership functions of premises A are denoted by μ_A and $μ_{A_{i}}$ for i = 1, ⋯ , N. On the side of the consequent, the membership function of conclusion B is denoted by μ_B and that of the fuzzy result B′ by μ_B′.

Fuzzification of an input variable using singletons makes the process of fuzzy inference much less complicated. The analysis presented in this section assumes the fact membership function μ_A′ defined as follows

$μ_{A'} (x) = \begin{cases} 1, & x = x_{w} \\ 0, & x \neq x_{w} \end{cases},$ (3)

where x_w represents an input numerical value.

2.1 Approach in case of a simple premise

In case of a simple premise (1), according to Zadeh’s compositional rule of inference [43], the fuzzy result B′ described by a membership function μ_B′ is obtained as follows [8, 32].

$\forall y \in Y \quad μ_{B'} (y) = \sup_{x \in X} [μ_{A'} (x) ★_{T} I (μ_{A} (x), μ_{B} (y))],$ (4)

where μ_A, μ_B are the membership functions of a premise A and a conclusion B respectively. The function I represents any fuzzy implication, whereas the operation ★_T stands for any T-norm.

For a chosen point y_n from the domain Y the given equation takes the following form

$μ_{B'} (y_{n}) = \sup_{x \in X} [μ_{A'} (x) ★_{T} I (μ_{A} (x), μ_{B} (y_{n}))] .$ (5)

Assuming that the fact membership function is a singleton as in (3), obtaining the function μ_B′ is much less complex because μ_A′ > 0 only for x = x_w. In this case (5) takes the following form

$μ_{B'} (y_{n}) = μ_{A'} (x_{w}) ★_{T} I (μ_{A} (x_{w}), μ_{B} (y_{n})),$ (6)

where, by the boundary condition of a T-norm (1 ★_T a = a), the intersection with the fact membership function can be skipped, because μ_A′ (x_w) = 1. Thus, the final form is obtained as follows

$μ_{B'} (y_{n}) = I (μ_{A} (x_{w}), μ_{B} (y_{n})) .$ (7)

Fig. 1 presents a given problem of obtaining the μ_B′ function in two sample points y₁, y₂ of the domain Y, in which a membership function of conclusion μ_B (y₁) =0 and $μ_{B} (y_{2}) = \frac{1}{2}$ .

Fig. 1

Obtaining a membership function μ_B′ for a sample situation (membership functions of a conclusion B, a premise A and a singleton fact A′). In the example the T-norm minimum and Lukasiewicz’s implication have been used. On the Y axis, the thick dashed line corresponds to an output membership function μ_B′. In the domain X, the thick dashed line corresponds to an implication function I. The result of an intersection of the singleton and the implication function has been marked with the thick, light grey solid line.

In both examples it can be noticed that the intersection of the fact membership function and the implication function is nonzero only for the value x_w in the domain X.
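The reduction from (5) to (7) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a sampled domain X, the minimum T-norm and the Łukasiewicz implication (as in Fig. 1), and a hypothetical triangular premise; all function and variable names are illustrative.

```python
import numpy as np

def lukasiewicz(a, b):
    """Lukasiewicz implication I(a, b) = min(1, 1 - a + b)."""
    return np.minimum(1.0, 1.0 - a + b)

def zadeh_sup(mu_fact, mu_prem, mu_b_y, implication=lukasiewicz, t_norm=np.minimum):
    """Eq. (5) on a sampled domain X: sup_x [mu_A'(x) T I(mu_A(x), mu_B(y_n))]."""
    return float(np.max(t_norm(mu_fact, implication(mu_prem, mu_b_y))))

def zadeh_singleton(mu_prem_at_xw, mu_b_y, implication=lukasiewicz):
    """Eq. (7): for a singleton fact only the point x_w contributes."""
    return float(implication(mu_prem_at_xw, mu_b_y))

# Hypothetical data: triangular premise on X = [0, 10], singleton fact at x_w = 4.
x = np.linspace(0.0, 10.0, 101)
mu_a = np.clip(1.0 - np.abs(x - 5.0) / 2.5, 0.0, 1.0)
w = 40                                  # grid index of x_w = 4.0
mu_fact = np.zeros_like(x)
mu_fact[w] = 1.0                        # singleton fact, eq. (3)

results = {}
for mu_b_y in (0.0, 0.5):               # the two levels of mu_B used in Fig. 1
    full = zadeh_sup(mu_fact, mu_a, mu_b_y)
    short = zadeh_singleton(mu_a[w], mu_b_y)
    results[mu_b_y] = (full, short)
    print(mu_b_y, full, short)
```

In this sketch the supremum over the whole grid and the single evaluation at x_w agree, while the second costs one implication evaluation per output point y_n.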

Baldwin’s fuzzy inference is based on truth functions [2]. Considering the example (1), first the truth function of a premise τ_P is obtained, which precisely describes the relationship between the fact A′ and the premise A. Then, the inference is performed in the truth space using the obtained truth function and the implication. The result of this process is the truth function of the conclusion τ_B, from which the conclusion B′ is obtained by truth functional modification [5].

The following expression presents obtaining the conclusion membership function μ_B′ by truth functional modification of μ_B using τ_B

$μ_{B'} (y_{n}) = τ_{B} (μ_{B} (y_{n})), \quad y_{n} \in Y .$ (8)

Performing a cylindrical extension [8] of the conclusion truth function τ_B, the following is obtained

$μ_{B'} (y_{n}) = \sup_{η \in [0, 1]} [τ_{P} (η) ★_{T} I (η, μ_{B} (y_{n}))],$ (9)

where τ_P in case of a simple premise is obtained as follows [2].

$τ_{P} (η) = \sup_{\substack{x \in X \\ η = μ_{A} (x)}} μ_{A'} (x) .$ (10)

Taking into account that the fact membership function μ_A′ is in a singleton form, the following equation can be written

$\forall η \in [0, 1] \quad τ_{P} (η) = \begin{cases} 1, & η = μ_{A} (x_{w}) \\ 0, & η \neq μ_{A} (x_{w}) \end{cases},$ (11)

because for every x ≠ x_w in the domain X the function μ_A′ is equal to 0. Therefore, the truth function of a premise is also a singleton.

Fig. 2 shows obtaining the truth function of a premise for the given assumptions. It illustrates three different examples. In situation a) the fact is hardly compatible with the premise, which results in a truth function that is close to the absolute false [2].

Fig. 2

Obtaining the truth function of a premise for facts in a form of singletons. In the domain X a membership function of a premise and a fact has been presented. Thick line corresponds to the output truth functions of premises.

The next example is characterized by higher compatibility of the fact with the premise and therefore, the truth function is getting closer to the absolute truth [2], which is obtained in the situation c) for full compatibility.

In case a) it has been shown how to obtain two points of the truth function τ_P for levels η = 0.25 and η = 0.75. For the level 0.25 two points of the domain X are obtained, in which the membership function of a premise intersects this value. For these points the value of the truth function of a premise is obtained and as a result of supremum the higher value is selected - in this case the one equal to 1. The second η value taken into consideration illustrates a situation, where the fact membership function is equal to 0 in given points of the domain. It can be noticed, that such result is obtained for all values η ≠ 0.25.

The example b) precisely shows the procedure of obtaining only one point, for which a non-zero result is obtained. In the example c) a detailed process for level η = 1 has not been shown, because in such situation the supremum operation is performed for the whole range of the domain X, in which the membership function of a premise is equal to 1.

The truth function of a premise τ_P in the form (11), substituted into the expression (9), makes it possible to skip the supremum operation, because the intersection of τ_P with the function I is nonzero only for η = μ_A (x_w)

$μ_{B'} (y_{n}) = τ_{P} (μ_{A} (x_{w})) ★_{T} I (μ_{A} (x_{w}), μ_{B} (y_{n})) .$ (12)

The value of τ_P in the given equation is obviously equal to 1; therefore, on the basis of the T-norm boundary condition, it can be further written as

$μ_{B'} (y_{n}) = I (μ_{A} (x_{w}), μ_{B} (y_{n})),$ (13)

which is equal to the expression (7). Hence, the equation describing Baldwin’s inference comes down to the one resulting from Zadeh’s compositional rule of inference.
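The equality of (13) and (7) can also be illustrated on a discretized truth space. The sketch below is only an illustration under stated assumptions: the interval [0, 1] is sampled at 101 levels, the minimum T-norm and the Łukasiewicz implication are used, and the value μ_A (x_w) = 0.6 is hypothetical; the names are not taken from the paper.

```python
import numpy as np

def lukasiewicz(a, b):
    """Lukasiewicz implication I(a, b) = min(1, 1 - a + b)."""
    return np.minimum(1.0, 1.0 - a + b)

def baldwin_simple(tau_p, eta, mu_b_y, implication=lukasiewicz, t_norm=np.minimum):
    """Eq. (9): sup over eta of tau_P(eta) T I(eta, mu_B(y_n))."""
    return float(np.max(t_norm(tau_p, implication(eta, mu_b_y))))

eta = np.linspace(0.0, 1.0, 101)          # sampled truth space (assumption)
mu_a_xw = 0.6                             # hypothetical value of mu_A(x_w)
tau_p = np.zeros_like(eta)
tau_p[int(round(mu_a_xw * 100))] = 1.0    # eq. (11): singleton truth function

checks = []
for mu_b_y in (0.0, 0.25, 0.5, 1.0):
    baldwin = baldwin_simple(tau_p, eta, mu_b_y)     # eq. (9)
    zadeh = float(lukasiewicz(mu_a_xw, mu_b_y))      # eq. (7)
    checks.append((baldwin, zadeh))
    print(mu_b_y, baldwin, zadeh)
```

For every tested level of μ_B the two values coincide up to floating-point error, because the only nonzero term under the supremum is the one at η = μ_A (x_w).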

2.2 Approaches in case of a compound premise

The compound premise, as shown in (2), is the most common case in rule-based fuzzy systems. To simplify further analysis, let us consider (2) with only two simple premises in the composition (N = 2). In such a case the equation describing Zadeh’s compositional rule of inference takes the following form

$\begin{matrix} \forall y \in Y \quad μ_{B'} (y) = \sup_{x_{1} \in X_{1}, x_{2} \in X_{2}} [(μ_{A_{1}'} (x_{1}) ★_{T_{1}} μ_{A_{2}'} (x_{2})) \\ ★_{T} I (μ_{A_{1}} (x_{1}) ★_{T_{1}} μ_{A_{2}} (x_{2}), μ_{B} (y))], \end{matrix}$ (14)

where $μ_{A_{1}'}$, $μ_{A_{1}}$ represent the membership functions of a fact and a premise in the domain X₁ and $μ_{A_{2}'}$, $μ_{A_{2}}$ describe a fact and a premise in the domain X₂. The $★_{T_{1}}$ operation represents any T-norm responsible for the “AND” conjunction of rule premises.

Considering facts in a singleton form, the operation ★_T will give a nonzero result only for those points of the domains X₁ and X₂ in which the functions $μ_{A_{1}'}$ and $μ_{A_{2}'}$ are equal to 1. Let us denote these points by x_1w and x_2w respectively. Thus, in this situation (14) is reduced to

$\forall y \in Y \quad μ_{B'} (y) = I (μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}), μ_{B} (y)) .$ (15)

It is worth noticing, that the above mentioned expression will also be true for a larger composition. Therefore, for a selected point y_n of the domain Y, in case of N simple premises, it can be written as follows

$\begin{matrix} μ_{B'} (y_{n}) = I (μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) \\ ★_{T_{1}} \dots ★_{T_{1}} μ_{A_{N}} (x_{Nw}), μ_{B} (y_{n})) . \end{matrix}$ (16)
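Expression (16) amounts to folding the T-norm over the N premise degrees and applying the implication once. A minimal sketch follows, assuming the Łukasiewicz implication and hypothetical degrees μ_{A_i} (x_iw); the helper names are illustrative only.

```python
from functools import reduce

def t_min(a, b):
    """Minimum T-norm."""
    return min(a, b)

def t_prod(a, b):
    """Product T-norm."""
    return a * b

def lukasiewicz(a, b):
    """Lukasiewicz implication I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def zadeh_compound_singleton(degrees, mu_b_y, t_norm=t_min, implication=lukasiewicz):
    """Eq. (16): aggregate mu_{A_i}(x_iw) with the T-norm T_1, then apply I once."""
    firing = reduce(t_norm, degrees)      # N - 1 T-norm evaluations
    return implication(firing, mu_b_y)

degrees = [0.8, 0.5, 0.9]                 # hypothetical mu_{A_i}(x_iw), N = 3
out_min = zadeh_compound_singleton(degrees, 0.25, t_min)
out_prod = zadeh_compound_singleton(degrees, 0.25, t_prod)
print(out_min, out_prod)
```

With these numbers the minimum T-norm gives an aggregated degree of 0.5 and the product T-norm gives 0.36, so the outputs are approximately 0.75 and 0.89.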

An equation illustrating Baldwin’s inference in case of a larger number of simple premises looks the same as in case of one premise. The difference lies in the truth function τ_P, which in this case combines all truth functions of premises in the rule. The following equation, suggested by Baldwin [2], defines τ_P for the rule with two premises joined by an “AND” operation, modeled by a T-norm $★_{T_{1}}$

$τ_{P} (η) = \sup_{\substack{η_{1}, η_{2} \in [0, 1] \\ η = η_{1} ★_{T_{1}} η_{2}}} [τ_{P_{1}} (η_{1}) ★_{T_{1}} τ_{P_{2}} (η_{2})],$ (17)

where the truth functions $τ_{P_{1}}$ and $τ_{P_{2}}$ for every simple premise are obtained according to the expression (10) and take the following form

$\begin{matrix} τ_{P_{1}} (η_{1}) = \sup_{\substack{x_{1} \in X_{1} \\ η_{1} = μ_{A_{1}} (x_{1})}} [μ_{A_{1}'} (x_{1})], \\ τ_{P_{2}} (η_{2}) = \sup_{\substack{x_{2} \in X_{2} \\ η_{2} = μ_{A_{2}} (x_{2})}} [μ_{A_{2}'} (x_{2})] . \end{matrix}$ (18)

As has already been shown for the given assumptions, the above truth functions of premises take the form of singletons. They are nonzero only at the truth levels corresponding to the points x_1w and x_2w of the domains X₁ and X₂ respectively.

$\begin{matrix} τ_{P_{1}} (η_{1}) = \begin{cases} 1, & η_{1} = μ_{A_{1}} (x_{1 w}) \\ 0, & η_{1} \neq μ_{A_{1}} (x_{1 w}) \end{cases}, \\ τ_{P_{2}} (η_{2}) = \begin{cases} 1, & η_{2} = μ_{A_{2}} (x_{2 w}) \\ 0, & η_{2} \neq μ_{A_{2}} (x_{2 w}) \end{cases} . \end{matrix}$ (19)

Taking into account the above, the compound truth function τ_P, described by the expression (17), will also take the form of a singleton

$τ_{P} (η) = \begin{cases} 1, & η = μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) \\ 0, & η \neq μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) \end{cases},$ (20)

because the intersection $τ_{P_{1}} (η_{1}) ★_{T_{1}} τ_{P_{2}} (η_{2})$ is nonzero only for $η_{1} = μ_{A_{1}} (x_{1 w})$ and $η_{2} = μ_{A_{2}} (x_{2 w})$. Obviously, whatever T-norm $★_{T_{1}}$ is chosen, η is at most equal to the lower of the two values $μ_{A_{1}} (x_{1 w})$, $μ_{A_{2}} (x_{2 w})$. Therefore, it can be said that the result of compounding two truth functions for an “AND” conjunction is dominated by the truth function which is shifted more towards the absolute false [2]. An example of obtaining a truth function of a compound premise according to (20) is shown in Fig. 3.

Fig. 3

Obtaining a compound truth function for two premises: a) a truth function $τ_{P_{1}}$, b) a truth function $τ_{P_{2}}$, c) a result of τ_P composition for two sample T-norms modeling the conjunction “AND” of premises (solid line - minimum, dashed line - product).

In case of N simple premises, the equation (17) takes the following form

$\begin{matrix} τ_{P} (η) = \sup_{\substack{η_{1}, η_{2}, \dots, η_{N} \in [0, 1] \\ η = η_{1} ★_{T_{1}} η_{2} ★_{T_{1}} \dots ★_{T_{1}} η_{N}}} \\ [τ_{P_{1}} (η_{1}) ★_{T_{1}} τ_{P_{2}} (η_{2}) ★_{T_{1}} \dots ★_{T_{1}} τ_{P_{N}} (η_{N})] \end{matrix}$ (21)

which, for the given assumptions, simplifies similarly to the form

$τ_{P} (η) = \begin{cases} 1, & η = μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) ★_{T_{1}} \dots ★_{T_{1}} μ_{A_{N}} (x_{Nw}) \\ 0, & η \neq μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) ★_{T_{1}} \dots ★_{T_{1}} μ_{A_{N}} (x_{Nw}) \end{cases} .$ (22)

Because any T-norm is associative by definition, (22) can also be obtained from (20) by adding further truth functions of premises to the result one at a time. The associativity of truth functions of premises is obvious in this case. However, it is worth focusing our attention on it because it is of substantial importance for facts with membership functions not limited to singletons.
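The pairwise construction can be sketched on a sampled truth space. This is an illustration under stated assumptions rather than Baldwin's own algorithm: [0, 1] is sampled at 101 levels, the value η₁ ★ η₂ is rounded to the nearest level (which makes the result a grid approximation), and the singleton levels 0.8, 0.5 and 0.9 are hypothetical. The names `compose` and `spike` are introduced here for illustration.

```python
from functools import reduce
import numpy as np

LEVELS = 101
ETA = np.linspace(0.0, 1.0, LEVELS)

def compose(tau1, tau2, t_norm=np.minimum):
    """Eq. (17) on the grid: sup over eta1 T eta2 = eta of tau1(eta1) T tau2(eta2)."""
    e1, e2 = np.meshgrid(ETA, ETA, indexing="ij")
    # Assumption: the composed level is rounded to the nearest grid level.
    idx = np.rint(t_norm(e1, e2) * (LEVELS - 1)).astype(int)
    val = t_norm(tau1[:, None], tau2[None, :])
    out = np.zeros(LEVELS)
    np.maximum.at(out, idx.ravel(), val.ravel())   # supremum per target level
    return out

def spike(level):
    """Singleton truth function as in eq. (19), located at the given level."""
    tau = np.zeros(LEVELS)
    tau[int(round(level * (LEVELS - 1)))] = 1.0
    return tau

taus = [spike(0.8), spike(0.5), spike(0.9)]        # hypothetical premises, N = 3

# Fold the truth functions two at a time, as associativity permits.
tau_min = reduce(compose, taus)
tau_prod = reduce(lambda a, b: compose(a, b, np.multiply), taus)
print(ETA[tau_min.argmax()], ETA[tau_prod.argmax()])
```

Each call to `compose` works on a two-dimensional grid only, and the fold performs N − 1 such calls, so the cost of this sketch grows linearly with the number of premises. For the chosen data the results are singletons at the levels 0.5 (minimum) and 0.36 (product), in agreement with (22).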

Substituting (22) into (9) allows us to skip the supremum operation owing to the singleton form of τ_P

$μ_{B'} (y_{n}) = τ_{P} (η_{w}) ★_{T} I (η_{w}, μ_{B} (y_{n})),$ (23)

where $η_{w} = μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) ★_{T_{1}} \dots ★_{T_{1}} μ_{A_{N}} (x_{Nw})$.

Skipping the intersection with τ_P, which in this case is equal to 1 (on the basis of the T-norm’s boundary condition), the final form is obtained

$\begin{matrix} μ_{B'} (y_{n}) = I (μ_{A_{1}} (x_{1 w}) ★_{T_{1}} μ_{A_{2}} (x_{2 w}) \\ ★_{T_{1}} \dots ★_{T_{1}} μ_{A_{N}} (x_{Nw}), μ_{B} (y_{n})), \end{matrix}$ (24)

which is equal to (16).

In this way the equivalence of Zadeh’s and Baldwin’s approaches in case of a compound premise has been shown, assuming that the membership functions of facts are in the form of singletons.

2.3 Practical consequences of equality in simplified environment

Simplified approaches reduce the relationship between the fact A′ and the premise A to one real number from the range [0, 1] (naturally 1 for full compatibility and 0 for no compatibility). Therefore, the computational complexity of algorithms implementing such solutions is very low (linear, O(n), in the number of simple premises in a rule). The linear complexity is shown in the results of numerical examinations presented further in Section 3.7.

A comparison of the two analyzed approaches in this simplified environment shows a slightly higher complexity of Baldwin’s solution. Singletons in Zadeh’s approach tremendously reduce the analyzed space, which is particularly visible for the compound premise, where (14) reduces to (15). Baldwin’s inference can also be strongly reduced; however, there is the additional step of computing the truth function of the conclusion τ_B, which modifies the conclusion fuzzy set to obtain the result, as shown in (9). Although this step is performed only once, regardless of the number of facts in a rule, it is omitted in Zadeh’s approach.

Generally, it is hard to think of an easier or more direct solution than (16). This is the reason for its popularity in practical applications.

3 Equivalence in case of fuzzy facts

This section presents similar analysis for given inference examples (1) and (2). This time, however, the inference process considers the general case, not limited to singletons.

The assumption that fact membership functions can take any form does not allow us to apply the simplifications discussed in the previous section. Therefore, the comparative analysis of both approaches becomes more complicated. First of all, let us look at the first stage of Baldwin’s fuzzy inference, which is obtaining the truth function of a premise. A deep understanding of how it is obtained, as well as of what it represents, is crucial before proceeding to further stages.

3.1 Truth function of a premise

The following equation describes the method of obtaining the truth function of a premise τ_P [2].

$τ_{P} (η) = \sup_{\substack{x \in X \\ η = μ_{A} (x)}} μ_{A'} (x) .$ (25)

Fig. 4 presents obtaining the truth function of a premise according to (25) in six different situations. To make the graph clearer, the axes are described only in the first situation and do not change in the subsequent ones. Sample fact and premise membership functions are drawn in the domain X, which is oriented vertically. The trapezoidal membership function of the premise is the same in all six situations, whereas the triangular fact membership function changes its position. The obtained truth functions are marked with a thick, black line in the domain [0, 1] × [0, 1].

Fig. 4

Obtaining a truth function of a premise for fuzzy facts in six sample situations. Axes are described only in part a). The truth functions have been marked with a thick black line in the domain [0, 1] × [0, 1], whereas the domain X shows the trapezoidal membership function of a premise μ_A and the triangular membership function of a fact μ_A′.

Taken together, the situations illustrate the process of moving the triangular membership function of a fact towards lower and lower compatibility with the premise. This yields successive truth functions, beginning with the absolutely true [2] for full compatibility in situation a) and ending with situation f), in which the truth function begins to approach the false [2] for lower compatibility.

The least complicated example is illustrated in situation a), because the range of the domain X in which μ_A = 1 entirely contains the area where μ_A′ > 0. The area has been marked with a thick, grey line. Therefore, according to (25), for the part of the domain X in which η = μ_A = 1, the maximum value of μ_A′ is chosen. In this case the maximum of μ_A′ is equal to 1 at point I; hence, this value is transferred as a value of the truth function τ_P.

In this way one point of the truth function, for η = 1, is obtained. The same example presents how to obtain the value of τ_P for the level $η = \frac{1}{2}$, which has been marked with a vertical dashed line. As opposed to η = 1, where a certain range of the domain X has been obtained, in this case the level intersects μ_A at two points, thus giving two points of the domain X. As can be seen, μ_A′ is equal to 0 at both points, which is shown by dashed-line arrows. The supremum of two zero values is obviously equal to 0, which is the result for this η level. It can be observed that the same happens for all levels η ≠ 1, because μ_A′ = 0 for all points of the domain X where μ_A ≠ 1.

In situation b) the thick, grey line similarly shows the part of μ_A′ lying in the area where μ_A = 1. In this situation non-zero values of μ_A′ are present for some areas of X where μ_A ≠ 1. They will obviously be mapped appropriately in the obtained truth function. The calculations have been presented precisely for the level $η = \frac{3}{4}$, which is marked with a vertical dashed line. The intersection of this level with μ_A gives two points of X where μ_A′ = 0. These values of μ_A′ are indicated by dashed-line arrows. It can be noticed that a similar result will be obtained for all levels smaller than the considered η. However, when η is shifted to the right of the considered level, the upper dashed-line arrow starts to indicate higher values of μ_A′. Thus the given level of η is the boundary level at which the truth function is still equal to 0. In this way the first characteristic point of the truth function is obtained and marked with I. The second characteristic point is obtained for η approaching 1. In this case, imagining a vertical dashed line very close to the value 1, two values of μ_A′ are again obtained, one of which is close to II while the other is equal to 0. Therefore, for η → 1, τ_P tends to the value marked by II. The last characteristic point is obtained for η = 1, for which the maximum value of μ_A′ is obtained at III.

Situation c) is a boundary situation, in which the maximum of μ_A′ still lies in the area of the maximum of μ_A. Two characteristic points describing values of τ_P ≠ 0 can be obtained. The truth function begins to rise at the marked level $η = \frac{1}{2}$ and reaches its maximum for η = 1.

In the following situations d), e) and f) the maximum of μ_A′ is obtained for levels η ≠ 1. Situation e) is a characteristic stage in which the half-truth is obtained (τ_P (0) = 0, $τ_{P} (\frac{1}{2}) = 1$, τ_P (1) = 0) [12, 46]. Note that in situation f) the meaning of the thick grey line changes: it now shows the part of μ_A′ lying in the area where μ_A = 0. Starting from this example the truth functions begin to move towards the absolute false, which is obtained when μ_A′ = 0 for every η > 0 (that is, when the triangular membership function of the fact lies entirely outside the support of A).

To summarize the above considerations, the truth function of a premise can be characterized as containing the highest values of μ_A′ for every η in [0, 1], where each η level, mapped back through μ_A, indicates the area of X in which to look for those values of μ_A′. This observation is crucial for the further analysis because it shows what the truth function of a premise actually represents. Therefore, the equation (10)2 can be expressed in the form of the following shortened corollary:

Corollary 1. Function τ_P represents the highest values of a fact in these points of universe of discourse, where a premise takes the same value.

where by the fact and the premise we obviously understand the membership functions of the fact and the premise respectively.
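Corollary 1 can be illustrated with a minimal sketch on a finite grid. The grid, the trapezoidal shapes and all function names below are hypothetical choices made for illustration only; they are not taken from the authors' Java library. The truth function is stored as a mapping from each value η taken by the premise to the highest fact grade found among the points where the premise equals η.

```python
from fractions import Fraction as Fr

def trapezoid(a, b, c, d):
    """Trapezoidal membership function with exact rational arithmetic."""
    def mu(x):
        if x <= a or x >= d:
            return Fr(0)
        if x < b:
            return Fr(x - a) / (b - a)
        if x <= c:
            return Fr(1)
        return Fr(d - x) / (d - c)
    return mu

def truth_function(mu_premise, mu_fact, universe):
    """tau_P(eta) = max of mu_fact(x) over grid points x with mu_premise(x) == eta."""
    tau = {}
    for x in universe:
        eta = mu_premise(x)
        tau[eta] = max(tau.get(eta, Fr(0)), mu_fact(x))
    return tau

universe = [Fr(k, 4) for k in range(41)]    # hypothetical grid 0, 1/4, ..., 10
mu_A = trapezoid(2, 4, 6, 8)                # premise A
fact_inside_core = trapezoid(4, 5, 5, 6)    # triangular fact where mu_A = 1
fact_on_slope = trapezoid(2, 3, 3, 4)       # triangular fact peaking where mu_A = 1/2

tau_core = truth_function(mu_A, fact_inside_core, universe)
tau_slope = truth_function(mu_A, fact_on_slope, universe)
print(tau_core[Fr(1)], tau_slope[Fr(1, 2)])
```

The first fact lies entirely where μ_A = 1, so its truth function is non-zero only at η = 1, loosely resembling situation a). The second fact peaks where μ_A = 1/2 and gives a half-truth-like shape, loosely resembling situation e).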

3.2 Equivalence in case of a simple premise

Analogously to Section 2, where the example (1) was analyzed, let us first consider the inference with one simple premise in the general case. In this setting the equivalence of the analyzed approaches, with no simplifications, can be shown by moving both equations to the same domain. Baldwin’s inference changes the domain of the fact and the premise into [0, 1] through the truth functions. By restoring the original domain both solutions can be compared directly.

Considering the process (1), obtaining the membership function of B′ by Baldwin’s inference can be described by the following equation $μ_{B'} (y_{n}) = \sup_{η \in [0, 1]} [τ_{P} (η) ★_{T} I (η, μ_{B} (y_{n}))] .$ (26) According to (10)2, η = μ_A (x), which describes the connection between the domains [0, 1] and X. Therefore, after substituting μ_A (x) for η the equation operates again in X

$μ_{B'} (y_{n}) = \sup_{x \in X} [τ_{P} (μ_{A} (x)) ★_{T} I (μ_{A} (x), μ_{B} (y_{n}))] .$ (27)

The expansion of the space [0, 1] into X is presented in Fig. 5, which shows the calculations for one given point of the domain Y (for μ_B = 0). The mapping of the values of the truth function and the implication at the point x_i is shown in detail. Note that the same values of the truth function and the implication are repeated at different points of X, which results from the fact that μ_A is not one-to-one. Clearly, the operation does not change the inference result: the projection of the intersection is simply performed in a larger space, on repeated values of both functions.

Fig. 5

Transforming space [0, 1] to X for Baldwin’s inference according to (27). The figure shows a premise truth function and an implication function while calculating one point of a conclusion. The implication function I has been shown by a thick, grey line for μ_B (y_n) =0. The thick, black line shows the premise truth function τ_P. The trapezoidal membership function of a premise μ_A has been shown below.

Obtaining the membership function of the conclusion using Zadeh’s inference is described by the following equation $μ_{B'} (y_{n}) = \sup_{x \in X} [μ_{A'} (x) ★_{T} I (μ_{A} (x), μ_{B} (y_{n}))],$ (28) which is visualized in Fig. 6 for two points y₁ and y₂.

Fig. 6

Obtaining a conclusion membership function B′, shown for two points in domain Y. In this example the minimum T-norm and Łukasiewicz’s implication have been used. On axis Y a thick solid line shows the conclusion membership function, whereas a dashed line shows the result of inference, the membership function of B′. In domain X solid lines present the fact and premise membership functions: the fact in black and the premise in grey. A thick dashed line corresponds to the implication function. The outcome of the intersection of the fact and the implication is shown as a light-grey area.

Comparing (27) with (28) it can be noticed that the equations differ only in the left operand of the T-norm ★_T. In Baldwin’s inference it is τ_P (μ_A (x)), whereas in Zadeh’s it is μ_A′ (x). The only direct connection is that τ_P is created from the values of μ_A′. Going further, according to (10)2, the expression τ_P (μ_A (x)) represents only the highest value of μ_A′ at those points of X where the premise membership function is equal to μ_A (x). In Zadeh’s approach μ_A′ is left unchanged and at these points takes, apart from the maximum value, also lower values.

Therefore, looking more broadly at (28), one should ask when the intersection of μ_A′ (x) with the implication function reaches its maximum, because under the supremum operation only the highest value contributes to the final result.

For a given y_n the only variable argument of the function I in space X is μ_A. Therefore, when only those points of the space are chosen at which μ_A takes the same value, the function I is also constant. In such a constrained part of X, because a T-norm is nondecreasing in each argument, the maximum value of the intersection ★_T is obtained for the maximum value of μ_A′. Therefore, dividing space X into all possible groups of points in which μ_A takes the same value, it is enough to have the maximum value of μ_A′ for each group in order to obtain the highest results of the intersection. By Corollary 1, exactly these requirements are fulfilled by the truth function of a premise, which proves the equality of (28) and (27). This allows us to look closer at the consequences of this equality.
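The grouping argument can be checked numerically on a finite universe. The following is a minimal sketch, assuming the minimum T-norm and Łukasiewicz’s implication (the choices named in the caption of Fig. 6) and hypothetical sampled membership grades; the function names are illustrative and do not come from the authors' implementation. On a finite universe the supremum over η in (26) only needs the values that μ_A actually takes, since τ_P is 0 elsewhere.

```python
from fractions import Fraction as Fr

def lukasiewicz(a, b):
    """Lukasiewicz implication I(a, b) = min(1, 1 - a + b)."""
    return min(Fr(1), 1 - a + b)

def zadeh(mu_a, mu_fact, mu_b_y):
    """Equation (28): sup over x of min(mu_A'(x), I(mu_A(x), mu_B(y)))."""
    return max(min(f, lukasiewicz(a, mu_b_y)) for a, f in zip(mu_a, mu_fact))

def baldwin(mu_a, mu_fact, mu_b_y):
    """Equation (26): build tau_P first, then take the sup over eta."""
    tau = {}
    for a, f in zip(mu_a, mu_fact):
        # keep only the highest fact grade within each group mu_A(x) == eta
        tau[a] = max(tau.get(a, Fr(0)), f)
    return max(min(t, lukasiewicz(eta, mu_b_y)) for eta, t in tau.items())

# hypothetical membership grades sampled at nine points of X
mu_a = [Fr(k, 4) for k in (0, 1, 2, 3, 4, 4, 3, 1, 0)]
mu_fact = [Fr(k, 4) for k in (0, 0, 1, 3, 4, 2, 1, 0, 0)]

levels = [Fr(k, 4) for k in range(5)]       # sampled values of mu_B(y)
results = {y: (zadeh(mu_a, mu_fact, y), baldwin(mu_a, mu_fact, y)) for y in levels}
for y, (z, b) in results.items():
    print(y, z, b)
```

Exact rational arithmetic is used so that grouping points by equal μ_A values is not affected by floating-point rounding.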

The equivalence of the two approaches is shown in Fig. 7 for a sample situation. The fact and the premise, described by functions μ_A′ and μ_A respectively, are shown in the bottom part of the graph. The examples are shown at the top. They present the intersection of the truth function of a premise τ_P with the implication function I for four selected points of the Y space (conclusion space), where μ_B takes the values 0, $\frac{1}{4}$, $\frac{1}{2}$ and $\frac{3}{4}$. To keep the graph clear, the functions are labelled only in the case where μ_B = 0.

Fig. 7

Equivalence of Baldwin and Zadeh’s inference in case of a simple premise

For each situation Baldwin’s inference has been shown in space [0, 1] and X, transferred according to relation η = μ_A (x). Moreover, in space X dotted line shows the membership function of a fact μ_A′. In this way the graph illustrates also Zadeh’s inference by an intersection of μ_A′ with an implication function. The highest values of an intersection have been shown by a horizontal dashed line connecting both approaches.

In space X letters A and B indicate two intervals characterized by different variability of a premise membership function. In the interval B the function μ_A is constant and takes the value 1, whereas in A the μ_A function increases linearly from 0 to 1 therefore including all the possible truth values. Similar areas of X space can be distinguished for μ_A = 0 and its descending slope, however, they are not as valuable because the fact membership function is then constant taking the 0 value.

The A interval has been marked with a grey background in each of inference cases. It presents a rescaled mapping of the truth function τ_P and I function from charts in space [0, 1].

Comparing the graphs of the τ_P and μ_A′ functions it can be noticed that they overlap in the interval A. However, in the interval B τ_P is constant and equal to the maximum value of μ_A′ in this area. Therefore, it can be said that Baldwin’s inference, in obtaining τ_P, has kept only the one value which can give the highest result in an intersection with the implication function I. The remaining values of μ_A′ in B have been skipped. In this way, for inference purposes, the whole space X is considerably constrained (or compressed) to the set of significant points only. In the given situation the whole space X is constrained to the interval A and only one point of B, where μ_A′ takes the value 1.

Fig. 8 shows one important consequence of transforming the relationship between facts and premises into truth functions. A given truth function can be obtained for two different facts, located within the areas of the two slopes of the premise membership function. The truth function τ_P obtained in these two situations would be identical. If μ_A had symmetrical slopes, it would not be possible to determine the precise location of μ_A′ based on τ_P, because there would be two possibilities. Therefore, it can be said that the truth function of a premise defines only the compatibility of a fact with a premise. An identical compatibility can be obtained in some cases for a different location of μ_A′ relative to μ_A, and thus the outcome of the inference process will be the same.

Fig. 8

Ambiguity of Baldwin’s and Zadeh’s inference for a rule with a simple premise.

The given examples, as well as the fact of equivalence, show that Zadeh’s inference is characterized by the same feature. However, in Zadeh’s approach there is no stage at which the compatibility between facts and premises is obtained separately. The result is calculated based on the fact, premise and conclusion altogether. Baldwin’s approach allows us to see this mechanism directly.

3.3 Compound truth function of a premise

Now let us consider a compound premise, as in the example (2) analyzed in Section 2, this time in the general case.

Baldwin defined obtaining a compound truth function for two premises in a rule joined with a conjunction "AND" as well as "OR" [2]. These solutions can also be obtained using the extension principle for operation on non fuzzy truth values [8]. Therefore, by extending F function of N truth values η_i in the following form

$η = F (η_{1}, η_{2}, \dots, η_{N}) = η_{1} ★_{1} η_{2} ★_{2} \dots ★_{(N - 1)} η_{N},$ (29) where ★_i for i = 1, ⋯ , (N - 1) represents any triangular norm (T-norm or S-norm), the following expression is obtained

$τ_{P} (η) = \sup_{\substack{η = F (η_{1}, η_{2}, \dots, η_{N}) \\ η_{1}, η_{2}, \dots, η_{N} \in [0, 1]}} [τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P_{2}} (η_{2}) ★_{T_{2}} \dots ★_{T_{2}} τ_{P_{N}} (η_{N})],$ (30) where ★_{T₂} represents any T-norm, whereas τ_{P₁}, τ_{P₂}, ⋯ , τ_{P_N} are fuzzy truth values, for which the extension operation is being performed. It is also worth noticing that according to the extension principle the F function, as well as the intersection ★_{T₂}, are performed in N dimensional space [37].

Taking every ★_i, i = 1, ⋯ , (N - 1), to be a T-norm, we obtain the equation for a compound truth function in the case when the premises in a rule are joined with the conjunction “AND”. Replacing ★_i with any S-norm results in the connective “OR”.

Mixed cases can also be encountered. However, it is important to notice that a canonical form of a rule contains only “AND” conjunctions [9]. A compound truth function, where three premises are joined with different types of connectives, would be described as follows $τ_{P} (η) = \sup_{\substack{η = η_{1} ★_{T_{1}} η_{2} ★_{S_{1}} η_{3} \\ η_{1}, η_{2}, η_{3} \in [0, 1]}} [τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})],$ (31) where the T-norm ★_{T₁} and the S-norm ★_{S₁} represent the connectives “AND” and “OR” respectively.

Considering η_i = μ_{A_i} (x_i) for the i-th premise in (30), it is possible to define an equation in the space $X = X_{1} \times X_{2} \times \dots \times X_{N}$ as follows $τ_{P} (η) = \sup_{\substack{η = F (x_{1}, x_{2}, \dots, x_{N}) \\ x_{1} \in X_{1}, x_{2} \in X_{2}, \dots, x_{N} \in X_{N}}} [τ_{P_{1}} (μ_{A_{1}} (x_{1})) ★_{T_{2}} τ_{P_{2}} (μ_{A_{2}} (x_{2})) ★_{T_{2}} \dots ★_{T_{2}} τ_{P_{N}} (μ_{A_{N}} (x_{N}))],$ (32) where F (x₁, x₂, ⋯ , x_N) = μ_{A₁} (x₁) ★₁ μ_{A₂} (x₂) ★₂ ⋯ ★_(N-1) μ_{A_N} (x_N).

According to Corollary 1, the truth function of the i-th premise τ_{P_i} represents the highest values of μ_{A′_i} at those points of space X_i where μ_{A_i} takes the same value η_i. In the case of many premises, the same value η in (32) is obtained at those points of $X$ where the function F is constant. Depending on the sequence of ★_i operations represented by the function F, a constant result can be obtained for different values in every space X_i. For example, substituting every ★_i operation with the minimum T-norm ★_{T_min}, the function F takes the following form $η = F (η_{1}, η_{2}, \dots, η_{N}) = η_{1} ★_{T_{\min}} η_{2} ★_{T_{\min}} \dots ★_{T_{\min}} η_{N} .$ (33) In such a situation $η = \frac{1}{4}$ is obtained whenever some $η_{j} = \frac{1}{4}$ and the remaining $η_{i} \geq \frac{1}{4}$. Hence, in the general case η is equal to any value φ ∈ [0, 1] when η_j = φ for some j ∈ {1, 2, ⋯ , N} and η_i ≥ φ for i = 1, 2, ⋯ , (j - 1), (j + 1), ⋯ , N. Every η_i ≥ φ represents a range of possibilities, in which the η_i values do not influence the result of the function F.

Therefore, changing the space into $X$ by assigning η_i = μ_{A_i} (x_i), the function F takes the following form $η = F (x_{1}, x_{2}, \dots, x_{N}) = μ_{A_{1}} (x_{1}) ★_{T_{\min}} μ_{A_{2}} (x_{2}) ★_{T_{\min}} \dots ★_{T_{\min}} μ_{A_{N}} (x_{N}),$ (34) which accordingly transfers the ranges of possibilities from space [0, 1] into spaces X_i, depending on the shape of μ_{A_i}. Consequently, for each η_i there is a certain set of points x_i ∈ X_i for which μ_{A_i} is constant and equal to η_i. Within each such set τ_{P_i} (μ_{A_i} (x_i)) also takes a constant value, equal to the maximum value of μ_{A′_i} in the set. Considering the whole expression of the intersection of the τ_{P_i} in (32), $τ_{P_{1}} (μ_{A_{1}} (x_{1})) ★_{T_{2}} τ_{P_{2}} (μ_{A_{2}} (x_{2})) ★_{T_{2}} \dots ★_{T_{2}} τ_{P_{N}} (μ_{A_{N}} (x_{N})),$ (35) for a selected η there are N ranges of possibilities (one for each η_i in each X_i space). It is worth noticing that, when all the possible η_i are processed, the supremum of the intersection (35) is equal to the supremum of the following intersection of the facts’ membership functions $μ_{A'_{1}} (x_{1}) ★_{T_{2}} μ_{A'_{2}} (x_{2}) ★_{T_{2}} \dots ★_{T_{2}} μ_{A'_{N}} (x_{N}) .$ (36)

For each η_i the functions τ_{P_i} take the same value within the set of x_i points, equal to the maximum of μ_{A′_i} in this set. On the other hand, μ_{A′_i} can take different values in the set. However, according to the T-norm definition (monotonicity in each argument), only the highest value influences the result of the outer supremum.

Therefore, the expression (32) can be transformed to the following form

$τ_{P} (η) = \sup_{\substack{η = F (x_{1}, x_{2}, \dots, x_{N}) \\ x_{1} \in X_{1}, x_{2} \in X_{2}, \dots, x_{N} \in X_{N}}} [μ_{{A'}_{1}} (x_{1}) ★_{T_{2}} μ_{{A'}_{2}} (x_{2}) ★_{T_{2}} \dots ★_{T_{2}} μ_{{A'}_{N}} (x_{N})],$ (37) where instead of τ_{P_i} (μ_{A_i} (x_i)) the membership functions of facts were assigned: μ_{A′_i} (x_i) for i = 1, ⋯ , N.

For better understanding, the equality of (32) and (37) is visualized in Fig. 9, presenting a process of compounding two truth functions using the considered equation. The upper part presents membership functions defining a sample situation of a rule with two premises joined with the conjunction “AND”. Membership functions of premises have been marked as μ_{A₁}, μ_{A₂}, whereas facts as μ_{A′₁} and μ_{A′₂} respectively. The truth functions of premises τ_{P₁} and τ_{P₂} have also been illustrated.

Fig. 9

Obtaining a compound truth function for a rule with two premises joined with a conjunction “ AND”. As an operation of intersection the minimum T-norm was used.

For the purposes of the example the minimum T-norms were used, therefore, the function F takes in this case the following form $η = F (η_{1}, η_{2}) = η_{1} ★_{T_{\min}} η_{2} = μ_{A_{1}} (x_{1}) ★_{T_{\min}} μ_{A_{2}} (x_{2}) .$ (38)

Charts situated in the middle of the figure show intersections of the corresponding membership functions in space X₁ × X₂. Starting from the left, the intersection of the premises’ membership functions is presented first, followed by the intersection of the facts. The last chart presents the same situation for the truth functions of premises transferred into spaces X₁ and X₂. The intersection of two membership functions is performed in three dimensions. The figures show projections of the 3D views onto the X₁ × X₂ plane, where a solid line marks the contours of the 3D structures. White outside the contours represents the value 0. Shades of gray represent values greater than 0 (a darker color represents a higher value). The bottom row of charts shows only the contours of the relevant intersections in space X₁ × X₂, to present clearly only the several analyzed levels. A dashed line depicts three sets of points for which the composition F is equal to $\frac{1}{6}$, $\frac{1}{2}$ and $\frac{5}{6}$.

In the chart showing the contours of the intersection μ_{A′₁} ★_{T_min} μ_{A′₂}, the thick gray line marks the maximum values of this operation in each of the marked sets of points. For $η = \frac{1}{2}$ the maximum is 1 and for $η = \frac{1}{6}$ and $η = \frac{5}{6}$ it is $\frac{1}{3}$. The last chart clearly shows that the same maximum values are obtained for the intersection τ_{P₁} (μ_{A₁}) ★_{T₂} τ_{P₂} (μ_{A₂}). The thick grey line has the same meaning as in the previous chart, whereas the thick black line indicates the common part of the areas with maximum values in both cases. Therefore, it can be said that the intersection of the truth functions of premises only replicates the maximum values of the intersection of the facts’ membership functions within the defined sets of points for which η has the same value.

To summarize the analysis it is worth noting down the equation (37) in a form of the following shortened corollary:

Corollary 2. Compound truth function τ_P represents the highest values of facts intersection in these points of universe of discourse for which any sequence of operations on premises obtains the same value.

where facts and premises are understood as the membership functions of facts and premises respectively, whereas operations stand for any triangular norms (T-norms or S-norms).

Therefore, as it was shown earlier for simple premise, also a compound truth function represents only a relevant area of the universe of discourse - in this case an important area of facts’ intersection - preserving only the most valuable information.
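Corollary 2 can be checked on a small example with two premises. The sketch below is an illustration under stated assumptions: finite universes with hypothetical sampled grades, the minimum or maximum as the joining operation, and the minimum or product as ★_{T₂}. It computes the compound truth function once in the truth space, following (30), and once directly from the facts in X₁ × X₂, following (37). All names are illustrative.

```python
from fractions import Fraction as Fr
from itertools import product

def truth_function(premise, fact):
    """tau_P(eta): highest fact grade among points where the premise equals eta."""
    tau = {}
    for a, f in zip(premise, fact):
        tau[a] = max(tau.get(a, Fr(0)), f)
    return tau

def compound_in_truth_space(tau1, tau2, join, t2):
    """Equation (30) for N = 2, evaluated over pairs of truth values."""
    tau = {}
    for (e1, v1), (e2, v2) in product(tau1.items(), tau2.items()):
        eta = join(e1, e2)
        tau[eta] = max(tau.get(eta, Fr(0)), t2(v1, v2))
    return tau

def compound_in_x_space(prem1, fact1, prem2, fact2, join, t2):
    """Equation (37) for N = 2, evaluated over all points of X1 x X2."""
    tau = {}
    for (a1, f1), (a2, f2) in product(zip(prem1, fact1), zip(prem2, fact2)):
        eta = join(a1, a2)
        tau[eta] = max(tau.get(eta, Fr(0)), t2(f1, f2))
    return tau

def quarters(*ks):
    return [Fr(k, 4) for k in ks]

def product_tnorm(a, b):
    return a * b

# hypothetical sampled grades of premises and facts
prem1, fact1 = quarters(0, 2, 4, 4, 2, 0), quarters(0, 1, 4, 2, 0, 0)
prem2, fact2 = quarters(0, 1, 3, 4, 3, 1, 0), quarters(0, 0, 2, 4, 3, 0, 0)
tau1, tau2 = truth_function(prem1, fact1), truth_function(prem2, fact2)

agreement = {}
for join in (min, max):                  # "AND" (minimum) and "OR" (maximum)
    for t2 in (min, product_tnorm):
        lhs = compound_in_truth_space(tau1, tau2, join, t2)
        rhs = compound_in_x_space(prem1, fact1, prem2, fact2, join, t2)
        agreement[(join.__name__, t2.__name__)] = (lhs == rhs)
print(agreement)
```

Both routes give the same mapping because ★_{T₂} is nondecreasing in each argument, so within each level set of F only the highest fact grades matter.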

3.4 Equivalence in case of a compound premise

Similarly to the previous cases, the equivalence of Baldwin’s and Zadeh’s inference can be shown by moving (9)2 into the $X$ space. Therefore, taking $η = F (x_{1}, x_{2}, \dots, x_{N}) = μ_{A_{1}} (x_{1}) ★_{1} μ_{A_{2}} (x_{2}) ★_{2} \dots ★_{(N - 1)} μ_{A_{N}} (x_{N}),$ (39) where F represents the sequence of operations joining the premises, the following equation is obtained $μ_{B'} (y_{n}) = \sup_{x_{1} \in X_{1}, x_{2} \in X_{2}, \dots, x_{N} \in X_{N}} [τ_{P} (F (x_{1}, x_{2}, \dots, x_{N})) ★_{T} I (F (x_{1}, x_{2}, \dots, x_{N}), μ_{B} (y_{n}))] .$ (40) In this case τ_P represents a compound truth function of N premises.

An expression describing Zadeh’s inference for N premises in a rule, joined by the function F, takes the following form $μ_{B'} (y_{n}) = \sup_{x_{1} \in X_{1}, x_{2} \in X_{2}, \dots, x_{N} \in X_{N}} [T_{A'} (x_{1}, x_{2}, \dots, x_{N}) ★_{T} I (F (x_{1}, x_{2}, \dots, x_{N}), μ_{B} (y_{n}))]$ (41) where T_A′ represents a multidimensional fuzzy set created from the facts’ membership functions using any T-norm ★_{T₂} as follows $T_{A'} (x_{1}, x_{2}, \dots, x_{N}) = μ_{{A'}_{1}} (x_{1}) ★_{T_{2}} μ_{{A'}_{2}} (x_{2}) ★_{T_{2}} \dots ★_{T_{2}} μ_{{A'}_{N}} (x_{N}) .$ (42)

Comparing (41) to (40) it can be seen that the equations differ only in the function on the left side of the operation ★_T. In Baldwin’s inference it is the compound truth function τ_P and in Zadeh’s inference it is the relation of the facts’ membership functions T_A′. Therefore, it must be determined whether that difference influences the inference outcome.

For specified y_n the variability of I depends only on the variability of F. For the points of $X$ , where F obtains the same value, the function I is also constant. Within such a set of points the result of outer supremum depends only on the highest value of intersection ★_T between constant I and T_A′. For constant I, according to the definition of T-norm, the highest result is obtained only for the highest value of T_A′. Therefore, for any set of points in $X$ for which F is constant, it is enough to preserve only the highest value of T_A′, because only the highest value influences the outer supremum.

Confronting this analysis with Corollary 2, it can be noticed that the compound truth function of premises τ_P fulfills the mentioned condition of preserving only the highest values within the defined sets. Thus, (40) and (41) can be considered equal, because for all y_n ∈ Y the two equations produce the same result.
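Since the compound case is hard to show graphically, a small numerical check may help. The sketch below compares (41) with Baldwin’s route for three premises, assuming the minimum T-norm for F, for ★_{T₂} and for ★_T, and Łukasiewicz’s implication; the sampled grades and function names are hypothetical. For brevity the compound truth function is built by pairwise joins in the truth space, which for the minimum gives the same result as the N-ary formula (30).

```python
from fractions import Fraction as Fr
from itertools import product

def implication(a, b):
    """Lukasiewicz implication (an assumption; other implications could be used)."""
    return min(Fr(1), 1 - a + b)

def zadeh_compound(premises, facts, mu_b_y):
    """Equation (41): one supremum over the whole space X1 x ... x XN."""
    axes = [list(zip(p, f)) for p, f in zip(premises, facts)]
    best = Fr(0)
    for point in product(*axes):
        eta = min(a for a, _ in point)       # F with the minimum T-norm
        t_fact = min(f for _, f in point)    # T_A' from (42), minimum as T2
        best = max(best, min(t_fact, implication(eta, mu_b_y)))
    return best

def baldwin_compound(premises, facts, mu_b_y):
    """Truth function per premise, pairwise joins, then one supremum over eta."""
    joined = None
    for p, f in zip(premises, facts):
        tau = {}
        for a, v in zip(p, f):
            tau[a] = max(tau.get(a, Fr(0)), v)
        if joined is None:
            joined = tau
            continue
        nxt = {}
        for (e1, t1), (e2, t2) in product(joined.items(), tau.items()):
            eta = min(e1, e2)
            nxt[eta] = max(nxt.get(eta, Fr(0)), min(t1, t2))
        joined = nxt
    return max(min(t, implication(eta, mu_b_y)) for eta, t in joined.items())

def quarters(*ks):
    return [Fr(k, 4) for k in ks]

# hypothetical sampled grades for three premises and three facts
premises = [quarters(0, 2, 4, 4, 2, 0), quarters(0, 1, 3, 4, 3, 1, 0), quarters(0, 4, 4, 2, 0)]
facts = [quarters(0, 1, 4, 2, 0, 0), quarters(0, 0, 2, 4, 3, 0, 0), quarters(1, 3, 4, 1, 0)]

levels = [Fr(k, 4) for k in range(5)]        # sampled values of mu_B(y)
pairs = [(zadeh_compound(premises, facts, y), baldwin_compound(premises, facts, y))
         for y in levels]
print(pairs)
```

Note that `zadeh_compound` visits every point of the product space, while `baldwin_compound` only ever combines two truth functions at a time.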

The equality of the two approaches for rules with a compound premise is hard to present graphically because of the higher number of dimensions. However, the problem was analyzed similarly to the case of only one premise. Because of the higher complexity and the possibility of using different junction operations for premises, it was helpful to analyze what the compound truth function represents and to formulate Corollary 2. This step allows a direct comparison of the two considered solutions.

3.5 Practical consequences of equality in general case

The thorough analysis of simple and compound truth functions of a premise allows us to see their real nature. In the context of equality, the truth function can be understood as a particular fuzzy set preserving only relevant values of a fact membership function, chosen according to the relationship with a premise membership function.

It can be stated that the truth functions represent a complete fuzzy relationship (or compatibility) between relevant facts and premises. In simplified approaches such compatibility is mapped to only one value in [0, 1] range. It was precisely shown for singletons in the previous Section 2, where the truth functions were also in singleton form, defined by only one input value.

Analyzing the considered solutions in the context of computational complexity, it is now important to mention the criticism of inference based on the truth functional modification in [37]. If one assumes that the computation process is based on (40), then there really is no advantage in Baldwin’s approach, because it also requires processing a multi-dimensional space. However, in contrast to Zadeh’s approach, all truth functions are defined in the same truth space [0, 1]. This very fact allows for subsequent joins of only two truth functions at a time and completely changes the view on the computation process. In such a case the complexity becomes linear in the number of simple premises in a rule, because the whole process stays in a low-dimensional space. The compositional rule of inference, without any simplifications, is characterized by exponential complexity, because adding another simple premise to the rule extends the space of analysis by one dimension.
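The difference in growth can be made concrete with a rough count of elementary evaluations. This is a deliberately simplified cost model, not a measurement: it assumes that every input domain is sampled at the same number of points, that each truth function is kept at a fixed number of points after every join, and that one evaluation per grid point or per pair of truth values dominates the cost. The function names and the point counts are illustrative assumptions.

```python
def zadeh_grid_evaluations(points_per_axis: int, n_premises: int) -> int:
    """Grid points visited by a full N-dimensional supremum, per output point."""
    return points_per_axis ** n_premises

def baldwin_pair_evaluations(points_per_truth_function: int, n_premises: int) -> int:
    """Pairs of truth values visited by N - 1 sequential two-function joins."""
    return (n_premises - 1) * points_per_truth_function ** 2

for n in (2, 3, 5, 10):
    print(n, zadeh_grid_evaluations(23, n), baldwin_pair_evaluations(20, n))
```

Under these assumptions the first count grows exponentially with the number of premises, whereas the second grows linearly.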

Therefore, to complete the comparison, it must be analyzed if obtaining the compound truth function by subsequent joins is equal to the general formula (30). Such analysis is presented in the following section.

3.6 Associativity of truth functions’ composition

Let us consider the following fuzzy rule with a premise composed of three simple premises $\text{if } X_{1} \text{ is } A_{1} \text{ and } X_{2} \text{ is } A_{2} \text{ and } X_{3} \text{ is } A_{3} \text{ then } Y \text{ is } B .$ (43) As was previously shown in (30), a composition of truth functions, in the case of a compound premise, can be obtained using the extension principle. Thus, the composition of truth functions for a rule with three simple premises takes the following form $τ_{P} (η) = \sup_{\substack{η = η_{1} ★_{1} η_{2} ★_{2} η_{3} \\ η_{1}, η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})\} .$ (44) For subsequent compositions of two truth functions, the result can be obtained in the two following steps $τ_{P'} (η') = \sup_{\substack{η' = η_{2} ★_{2} η_{3} \\ η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})\},$ (45) $τ_{P''} (η) = \sup_{\substack{η = η_{1} ★_{1} η' \\ η_{1}, η' \in [0, 1]}} \{τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P'} (η')\} .$ (46) Inserting (45) into (46) we obtain $τ_{P''} (η) = \sup_{\substack{η = η_{1} ★_{1} η' \\ η_{1}, η' \in [0, 1]}} \{τ_{P_{1}} (η_{1}) ★_{T_{2}} \sup_{\substack{η' = η_{2} ★_{2} η_{3} \\ η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})\}\} .$ (47) Taking into account the equation η′ = η₂ ★₂ η₃, (47) can take the following form $τ_{P''} (η) = \sup_{\substack{η = η_{1} ★_{1} (η_{2} ★_{2} η_{3}) \\ η_{1}, η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{1}} (η_{1}) ★_{T_{2}} \sup_{\substack{η' = η_{2} ★_{2} η_{3} \\ η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})\}\}$ (48) where the outer supremum is performed in a space enlarged by one dimension. Therefore, there exists a set of points in η₂ × η₃, depending on the applied triangular norm ★₂, for which a specified η′ is obtained. The inner supremum operation is unchanged. It should be noticed that at this stage the η′ variable is still preserved inside the inner supremum, obtained according to the equation η′ = η₂ ★₂ η₃. In this way, because the space of analysis in the outer supremum has been extended, the operation of the inner supremum is repeated for the same η′.

To simplify further analysis let η₁ be constant. In such a case the variability of η depends only on the variability of η′, and so in fact on η₂ and η₃. Thus, depending on the given ★₁ and ★₂, a chosen η value is obtained for a set of points in η₂ × η₃. Let this set be denoted by Z. Within the set Z the operation of intersection is performed between τ_{P₁} and the result of the inner supremum. Since η₁, and thus also τ_{P₁}, is constant, the highest result depends only on the highest result of the intersection τ_{P₂} ★_{T₂} τ_{P₃} obtained over all points in Z. The highest result is obviously provided by the inner supremum. However, if this supremum is omitted, other, lower values of τ_{P₂} ★_{T₂} τ_{P₃} will also be obtained in Z. Nevertheless, taking into account the outer supremum, this does not influence the outcome, because the lower values will be omitted again. Thus, (48) can be transformed into the following form $τ_{P''} (η) = \sup_{\substack{η = η_{1} ★_{1} η_{2} ★_{2} η_{3} \\ η_{1}, η_{2}, η_{3} \in [0, 1]}} \{τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P_{2}} (η_{2}) ★_{T_{2}} τ_{P_{3}} (η_{3})\},$ (49) which is equal to (44).

It should be noticed that the same procedure can be applied to prove the equality of (44) with the approach joining τ_{P₃} with the composition of τ_{P₁} and τ_{P₂}. Thus, the operation is associative.

An analogous approach can be used in the general case, for a larger number of truth functions. Assuming that there are N truth functions, the partial truth function τ_P′ (η′) joining the last two, τ_{P_(N-1)} and τ_{P_N}, with the operator ★_(N-1) takes the following form $τ_{P'} (η') = \sup_{\substack{η' = η_{(N - 1)} ★_{(N - 1)} η_{N} \\ η_{(N - 1)}, η_{N} \in [0, 1]}} \{τ_{P_{(N - 1)}} (η_{(N - 1)}) ★_{T_{2}} τ_{P_{N}} (η_{N})\} .$ (50) Joining the above with the remaining truth functions gives $τ_{P''} (η) = \sup_{\substack{η = F' (η_{1}, η_{2}, \dots, η_{(N - 2)}, η') \\ η_{1}, η_{2}, \dots, η_{(N - 2)}, η' \in [0, 1]}} [τ_{P_{1}} (η_{1}) ★_{T_{2}} τ_{P_{2}} (η_{2}) ★_{T_{2}} \dots ★_{T_{2}} τ_{P_{(N - 2)}} (η_{(N - 2)}) ★_{T_{2}} τ_{P'} (η')],$ (51) where the F′ function takes the following form $F' (η_{1}, η_{2}, \dots, η_{(N - 2)}, η') = η_{1} ★_{1} η_{2} ★_{2} \dots ★_{(N - 3)} η_{(N - 2)} ★_{(N - 2)} η' .$ This procedure decreases the space of (30) by one dimension. Applying the same approach to (51) results in another reduction. A sequence of N - 1 such steps transforms the (N + 1)-dimensional operation (30) into N - 1 operations working in three-dimensional space.
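The two-step composition (45)-(46) can be compared with the one-step formula (44) on small, hypothetical truth functions. In this sketch the truth functions are dictionaries from truth values to grades, ★_{T₂} is the minimum, and η in (44) is read as η₁ ★₁ (η₂ ★₂ η₃), the grouping made explicit in (48). The function names are illustrative. The check of both join orders is done only for the case where ★₁ and ★₂ are the same associative operation (the minimum), which is an assumption of this example.

```python
from fractions import Fraction as Fr
from itertools import product

def join_two(tau_a, tau_b, op, t2=min):
    """Join two truth functions as in (45) or (46)."""
    out = {}
    for (ea, va), (eb, vb) in product(tau_a.items(), tau_b.items()):
        eta = op(ea, eb)
        out[eta] = max(out.get(eta, Fr(0)), t2(va, vb))
    return out

def join_three_at_once(tau1, tau2, tau3, op1, op2, t2=min):
    """Equation (44), reading eta as eta1 op1 (eta2 op2 eta3)."""
    out = {}
    triples = product(tau1.items(), tau2.items(), tau3.items())
    for (e1, v1), (e2, v2), (e3, v3) in triples:
        eta = op1(e1, op2(e2, e3))
        out[eta] = max(out.get(eta, Fr(0)), t2(v1, t2(v2, v3)))
    return out

def product_tnorm(a, b):
    return a * b

# hypothetical truth functions given as {truth value: grade}
tau1 = {Fr(0): Fr(0), Fr(1, 2): Fr(1, 4), Fr(1): Fr(1)}
tau2 = {Fr(0): Fr(0), Fr(1, 4): Fr(0), Fr(3, 4): Fr(3, 4), Fr(1): Fr(1)}
tau3 = {Fr(0): Fr(1, 4), Fr(1, 2): Fr(1), Fr(1): Fr(1, 4)}

# mixed operators: op1 = minimum, op2 = product
full = join_three_at_once(tau1, tau2, tau3, min, product_tnorm)
stepwise = join_two(tau1, join_two(tau2, tau3, product_tnorm), min)  # (45), then (46)

# the same operator everywhere: both join orders can be compared
all_min_full = join_three_at_once(tau1, tau2, tau3, min, min)
right_first = join_two(tau1, join_two(tau2, tau3, min), min)
left_first = join_two(join_two(tau1, tau2, min), tau3, min)
print(full == stepwise, left_first == right_first == all_min_full)
```

Each call to `join_two` iterates over pairs of truth values only, which is the low-dimensional step that the sequential approach repeats N - 1 times.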

The equality of (30) and the proposed associative approach is an important fact, especially in the context of the proved equality of Baldwin’s and Zadeh’s inference methods. The significant reduction in computational complexity allows practical applications to implement a full version of the inference mechanism, regardless of the type of triangular norms used as junction operators.

3.7 Numerical examinations

Baldwin’s method, as well as Zadeh’s approach, was implemented in the Java programming language [18]. The resulting library made it possible to examine the computational complexity of Baldwin’s method in order to assess its potential in practical applications. Fuzzy sets in the library are defined by piecewise linear membership functions. This flexible form allows the needed description to be created using approximation or interpolation. The examination environment was based on the Windows operating system and an Intel processor with a 2.4 GHz clock rate.

All examinations were performed for compound premises; therefore, the general fuzzy rule for all tests was as follows $\text{if } X_{1} \text{ is } A_{1} \text{ and } X_{2} \text{ is } A_{2} \text{ and } \dots \text{ and } X_{N} \text{ is } A_{N} \text{ then } Y \text{ is } B,$ (52) just like in the example (2) presented at the beginning of Section 2.

Computations were performed for a variable N in simplified (based on singletons) and general case. Membership functions of facts, premises and conclusions were defined randomly using Gaussian function (piecewise linear interpolation with 23 points).

Simplified approaches, for facts fuzzified with singletons, were compared first. Examinations started at N = 100 and ended at N = 5000. Such a large number of attributes in a dataset can occur e.g. in gene processing systems [10, 23]. The results of the examinations are presented in Fig. 10.

Fig. 10

Average computation time of one rule for increasing complexity of antecedent, where N represents the number of simple premises. The results are obtained for both approaches with facts in form of singletons. Results for Baldwin’s system are marked with gray line and Zadeh’s with black.

It can be noticed, that Baldwin’s approach works slightly slower, because in comparison to (16) the algorithm needs to compute the truth function of conclusion, which is an additional step (the truth functions in these tests were described approximately by 50 points). However, it must be emphasized that the difference is very small and probably would not be significant in most practical implementations. Therefore, the results precisely support the theory described in the Section 2.

From the point of view of practical applications, the general case is the most interesting. The increasing complexity of an antecedent makes the full implementation of the compositional rule of inference practically useless even for rules containing only several simple premises.

Fig. 11 presents the results of tests obtained for Baldwin’s and Zadeh’s systems without simplifications, using facts described by Gaussian membership functions. In this examination the truth functions were described by a smaller number of points (10 to 20) because of the higher complexity of the task. The variable N again represents the number of simple premises in a compound rule, which results in N - 1 compositions of partial truth functions in Baldwin’s approach and computations in an N-dimensional space for Zadeh’s solution.
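
The sequential scheme can be sketched in Java as follows. The sketch assumes that all truth functions are sampled on the same uniform grid over [0, 1], that the conjunction is modelled by the minimum, and that two truth functions are joined according to the extension principle. The class and method names are hypothetical, and the join operator used in the examined library may differ.

```java
/** Hypothetical illustration of joining truth functions one after another. */
final class TruthFunctionJoin {

    /**
     * Joins two truth functions sampled on the same uniform grid over [0, 1].
     * out[t] is the largest min(a[i], b[j]) over all pairs with min(i, j) == t.
     */
    static double[] join(double[] a, double[] b) {
        int k = a.length;
        double[] out = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                int t = Math.min(i, j);                // grid index of min(t_i, t_j)
                double degree = Math.min(a[i], b[j]);
                if (degree > out[t]) {
                    out[t] = degree;
                }
            }
        }
        return out;
    }

    /** Folds N truth functions into one using N - 1 two-argument joins. */
    static double[] joinAll(double[][] functions) {
        double[] acc = functions[0];
        for (int n = 1; n < functions.length; n++) {
            acc = join(acc, functions[n]);
        }
        return acc;
    }

    public static void main(String[] args) {
        double[][] premises = {
            {0.0, 0.0, 1.0},   // crisp "true" on a three-point grid
            {0.2, 1.0, 0.3},
            {0.0, 0.5, 1.0}
        };
        System.out.println(java.util.Arrays.toString(joinAll(premises)));
    }
}
```

Each join in this sketch costs K² elementary steps for K grid points, so a rule with N simple premises needs (N - 1)·K² steps, and every step operates on one-dimensional arrays only.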

Fig. 11

Average computation time of one rule for increasing complexity of the antecedent, where N represents the number of simple premises. Panel a) shows the results for Baldwin’s system and panel b) those for Zadeh’s. In this case facts are not limited to singletons and are described with Gaussian membership functions.

Results for Baldwin’s approach are presented on the left and those for Zadeh’s on the right. Without simplifications, the compositional rule of inference is characterized by exponential complexity (note the logarithmic scale in chart b). The examinations in this case were performed only for up to 5 simple premises in a rule; the dotted part of the chart is an extrapolation that merely indicates the higher values expected for such complexity.
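
The difference in growth can be illustrated by counting elementary steps under a simple assumed discretization: K points per input domain for the N-dimensional composition and K points per truth function for the sequential joins. The counts below are a back-of-the-envelope model with hypothetical names, not a reproduction of the measured times, and they ignore the cost of computing the truth functions of the individual simple premises.

```java
import java.math.BigInteger;

/** Hypothetical step-count model; it does not reproduce the measured timings. */
final class InferenceCost {

    /** Number of grid points in an N-dimensional space with k points per axis: k^n. */
    static BigInteger gridPoints(int k, int n) {
        return BigInteger.valueOf(k).pow(n);
    }

    /** Steps for n - 1 pairwise joins of truth functions with k points each: (n - 1) * k^2. */
    static BigInteger sequentialJoinSteps(int k, int n) {
        return BigInteger.valueOf(k).pow(2).multiply(BigInteger.valueOf(n - 1));
    }

    public static void main(String[] args) {
        // 23 points per input domain and 20 points per truth function, 5 simple premises
        System.out.println(gridPoints(23, 5));          // 6436343
        System.out.println(sequentialJoinSteps(20, 5)); // 1600
    }
}
```

Under these assumptions the first count is multiplied by K with every additional premise, whereas the second grows by a constant K² per premise.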

On the other hand, for Baldwin’s approach the average time was less than 20 ms even for a composition of 10 thousand truth functions in a rule. These results clearly show that Baldwin’s system, using sequential joins of truth functions, can be successfully applied even in a very complex environment, such as gene classification systems or other problems involving a large number of attributes. Moreover, the output results will be identical to those obtained by the compositional rule of inference.

The low computational complexity makes it possible to consider implementing the analyzed inference using type-2 fuzzy sets [44]. Although the general concept of type-2 fuzzy sets is rather too complex for practical use [25], interval type-2 fuzzy sets are successfully applied in many fields, such as classification problems [7], medical data analysis [36] and even big data processing [33].

4 Conclusion

The article presented a thorough comparative analysis of Baldwin’s and Zadeh’s approaches to fuzzy inference, focusing on the practical consequences. From the mathematical point of view, the solution presented by Zadeh is direct [37] and therefore simpler in description.

As has been shown for the case where facts are represented by singletons, Baldwin’s approach is characterized by an insignificantly higher computational complexity. Both approaches simplify considerably in such an environment; however, in Baldwin’s solution the need to compute the truth function of the conclusion causes additional overhead. Although the consequences for the computation time are not significant, there are no arguments for using Baldwin’s method in a simplified implementation. The great advantages of this method can, however, be observed in a complex environment, where the fuzzy description of facts is not constrained to singletons.

The compositional rule of inference is problematic in the case of a multidimensional antecedent. Computing the inference result for a rule with only several simple premises is a very time-consuming task. The problem does not exist in Baldwin’s approach, because the truth functions of simple premises are described in a unified space and can therefore be joined in a sequence of computationally efficient operations. The equivalence of subsequent joins with multidimensional composition has been proved in the final sections of the article.

The thorough analysis revealed the character of the premise truth function. Generally speaking, it reflects the compatibility of a fact with a premise in a fuzzy form. Simplified approaches map the compatibility into a single value in the [0, 1] range, as in the commonly used solution first presented by Mamdani and Assilian. In the context of equivalence, it can be stated that the truth function preserves only the valuable information from the fact membership function, namely its relationship (or compatibility) with the premise. It therefore allows the inference process to be moved into a unified truth space, where the truth functions, unlike the different spaces in Zadeh’s approach, can be aggregated in a much more computationally efficient process (the complexity is linear versus exponential). This efficiency was confirmed by the presented results of numerical examinations. It is the strongest argument in favour of Baldwin’s approach, but only when it is based on sequential joins of truth functions.

The described inference system can, in a way, be compared to that of Mamdani and Assilian. It allows a compound antecedent of a rule to be analyzed premise after premise, obtaining partial results in the form of truth functions instead of values in the [0, 1] range. In the end, the final result is aggregated into one compound truth function of the premise, just as in Mamdani and Assilian’s approach with one rule activation level. Taking the equivalence into account, it can be stated that Baldwin’s solution, based on sequential joins of truth functions, optimizes the compositional rule of inference and makes it possible to apply it in its full form.

The low computational complexity makes it possible to look further into applications of the analyzed solution in real-life scenarios. Future work will therefore focus on two major aspects. First, the general complexity of the implemented method must be optimized, because the most common, simplified approaches are more time-efficient. This can be achieved by simplifying the definition of the truth function, using only two or three points of description. Such a modification should preserve the overall process and give satisfactory results; nevertheless, it must be implemented, compared and examined. The second aspect concerns extending the method to type-2 fuzzy sets, which are more flexible in defining uncertainty. Their representation is much more complex, but the lower computational complexity of Baldwin’s approach should allow data to be processed in real time.

Acknowledgment

The authors are grateful to anonymous referees for their constructive comments that have helped to improve the quality and presentation of this manuscript. The publication was partially supported by: the Rector’s research and development grant (J.L): Silesian University of Technology, grant no. 02/130/RGJ20/0001.

References

1. Baldwin, A New Approach to Approximate Reasoning Using a Fuzzy Logic. Research Report EM/FS3. University of Bristol. (1977).

2. Baldwin, A new approach to approximate reasoning using a fuzzy logic, Fuzzy Sets and Systems 2 (1979), 309–325.

3. Baldwin and Guild, Feasible algorithms for approximate reasoning using fuzzy logic, Fuzzy Sets and Systems 2 (1980), 225–251.

4. Baldwin and Pilsworth, Axiomatic approach to implication for approximate reasoning with fuzzy logic, Fuzzy Sets and Systems 3 (1980), 193–219.

5. Bellman and Zadeh, Modern uses of multiple-valued logic. chapter Local and Fuzzy Logics. (pp. 103–165). (1977).

6. Bouchon-Meunier, Dubois, Godo and Prade, Fuzzy sets in approximate reasoning and information systems. chapter Fuzzy Sets and Possibility Theory in Approximate and Plausible Reasoning. (pp. 15–190). Kluwer Academic Publishers. (1999).

7. Chowdhury, Qadir, Laha, Konar and Nagar A.K., Finger-induced motor imagery classification from hemodynamic response using type-2 fuzzy sets. In A.K. Nagar, K. Deep, J.C. Bansal and K.N. Das (Eds.), Soft Computing for Problem Solving 2019 (pp. 185–197). Singapore: Springer Singapore. (2020).

8. Czogała and Łęski, Fuzzy and Neuro-Fuzzy Intelligent Systems, Heidelberg: Physica-Verlag, Springer-Verlag Comp. (2000).

9. Czogała and Łęski, On equivalence of approximate reasoning results using different interpretations of if-then rules, Fuzzy Sets and Systems 117 (2001), 279–296.

10. Danziger, Baronio, Ho, Hall, Salmon, Hatfield, Kaiser and Lathrop, Predicting positive p53 cancer rescue regions using most informative positive (MIP) active learning, PLOS Computational Biology (2009), 5.

11. Ding, Shen and Mukaidono, A new method for approximate reasoning. In Proceedings of Nineteenth International Symposium on Multiple-Valued Logic (pp. 179–185). (1989).

12. Dubois and Prade, Fuzzy sets in approximate reasoning, part 1: Inference with possibility distribution, Fuzzy Sets and Systems 40 (1991), 143–202.

13. Dubois and Prade, What are fuzzy rules and how to use them, Fuzzy Sets and Systems 86 (1996), 169–185.

14. Gera and Dombi, Exact calculations of extended logical operations on fuzzy truth values, Fuzzy Sets and Systems 159 (2008), 1309–1326.

15. Iancu, Propagation of uncertainty and imprecision in knowledge-based systems, Fuzzy Sets and Systems 94 (1998), 29–43.

16. Jantzen, Array approach to fuzzy logic, Fuzzy Sets and Systems 70 (1995), 359–370.

17. Kudłacik, Advantages of an approximate reasoning based on a fuzzy truth value, Medical Informatics & Technologies 16 (2010), 125–132.

18. Kudłacik, http://fuzzlib.eu/.

19. Kudłacik, Performance evaluation of Baldwin’s fuzzy reasoning for large knowledge bases, Medical Informatics & Technologies 20 (2012), 29–38.

20. Kudłacik, An analysis of using triangular truth function in fuzzy reasoning based on a fuzzy truth value, Medical Informatics & Technologies 22 (2013), 103–110.

21. Lascio L.D. and Gisolfi, Averaging linguistic truth values in fuzzy approximate reasoning, International Journal of Intelligent Systems 14 (1998), 1998.

22. […], He, Qin and Meng, Some notes on optimal fuzzy reasoning methods, Information Sciences 503 (2019), 652–669.

23. Liu, Xu, Zhang, Liu, Yu, Liu and Dehmer, Feature selection of gene expression data for cancer classification using double RBF-kernels, BMC Bioinformatics (2018), 19.

24. Mazandarani and Xiu, Fractional fuzzy inference system: The new generation of fuzzy inference systems, IEEE Access 8 (2020), 126066–126082.

25. Mendel, Uncertain rule-based fuzzy logic systems: Introductions and new directions. London: Prentice Hall PTR. (2001).

26. Mizumoto and Zimmermann H.-J., Comparison of fuzzy reasoning methods, Fuzzy Sets and Systems 8 (1982), 253–283.

27. Own, Handling partial truth on type-2 similarity-based reasoning, Expert Systems with Applications 36 (2009), 3007–3016.

28. Own C.-M., Granular computing and intelligent systems: Design with information granules of higher order and higher type. chapter Type-2 Fuzzy Similarity in Partial Truth and Intuitionistic Reasoning. (pp. 95–115). Springer. (2011).

29. Raha and Ray, Reasoning with vague default, Fuzzy Sets and Systems 91 (1997), 327–338.

30. Raha and Ray, Reasoning with vague truth, Fuzzy Sets and Systems 105 (1999), 385–399.

31. Ray K.S., Soft Computing and Its Applications volume 1. Apple Academic Press. (2014).

32. Rutkowski, Computational Intelligence, Methods and Techniques. Springer. (2008).

33. Shukla A.K., Yadav, Kumar and Muhuri P.K., Veracity handling and instance reduction in big data using interval type-2 fuzzy sets, Engineering Applications of Artificial Intelligence 88 (2020), 103315.

34. Straszecka, An interpretation of focal elements as fuzzy sets, International Journal of Intelligent Systems 18 (2003), 821–835.

35. Sugeno and Takagi, Multidimensional fuzzy reasoning, Fuzzy Sets and Systems 9 (1983), 313–325.

36. Tabakov and Jablonski, A new region growing medical image segmentation algorithm based on interval type-2 fuzzy sets. In L. Barolli, F. Amato, F. Moscato, T. Enokido and M. Takizawa (Eds.), Web, Artificial Intelligence and Network Applications (pp. 1330–1340). Springer International Publishing. (2020).

37. Tong R.M. and Efstathiou, A critical assessment of truth function modification and its use in approximate reasoning, Fuzzy Sets and Systems 7 (1982), 103–108.

38. Tsukamoto, Advances in fuzzy set theory and applications. chapter An Approach to Fuzzy Reasoning Method. (pp. 137–149). Amsterdam: North-Holland. (1979).

39. Wang and Hu, Approximate reasoning based on linguistic truth value with α-operator, Fuzzy Sets and Systems 105 (1999), 401–407.

40. Wangming, Equivalence of some methods on fuzzy reasoning. In Proceedings of First International Symposium on Uncertainty Modelling and Analysis. (1990).

41. Yang, Kerre, Ruan and Zhenming, A study on fuzzy reasoning mechanism based on extension principle. In The Ninth IEEE International Conference on Fuzzy Systems volume 1 (pp. 185–190). (2000).

42. Ying M.S., Some notes on multidimensional fuzzy reasoning, Cyber and Syst 19 (1988), 1–13.

43. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on Systems Man and Cybernetics 3 (1973), 28–44.

44. Zadeh, The concept of a linguistic variable and its application to approximate reasoning. parts 1–3, Information Sciences 8 (1975), 199–249, 301–357; 9 (1975), 43–80.

45. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1978), 2–28.

46. Zadeh, PRUF – a meaning representation language for natural languages, International Journal of Man-Machine Studies 10 (1978), 395–460.

47. Zadeh, Machine intelligence volume 9. chapter A Theory of Approximate Reasoning. (pp. 149–194). New York: Wiley. (1979).