Complex objective optimization in fuzzy environments

Abstract

Multi-objective optimization can be used to address potentially conflicting relationships between multiple objectives. However, some objectives are linked by fuzzy temporal relationships, and there is no common method for describing such relationships. To fill this gap, we propose the concept of complex objectives, which are described by a fuzzy temporal logic that includes both temporal and logical operators. Furthermore, we investigate the optimal control problem for complex objectives and develop a class of fuzzy systems called possibilistic decision systems (PDSs) to establish a framework for optimal control. In a PDS, the states of the fuzzy system are determined by a family of variables, and the transitions induced by actions between fuzzy states are themselves uncertain and are characterized by a possibility degree. Importantly, we prove that memoryless strategies are sufficient for optimal control of complex objectives. Finally, the theory presented in this paper is illustrated by a mobile robot simulation.

Keywords

Multi-objective optimization, complex objectives, fuzzy temporal logic, decision systems, possibility theory

1 Introduction

Multi-objective optimization (MOO) [1] is a problem in which multiple decision objectives are considered in the optimization process. Unlike traditional single-objective optimization problems, multi-objective optimization problems have multiple conflicting objectives, so the optimization process is no longer simply a matter of finding the optimal solution to a single objective function, but of finding the best balance between multiple objectives. Multi-objective optimization has important practical applications, such as engineering design [2], financial investment [3–5] and robot control systems [6–8].

Research on multi-objective optimization problems has been ongoing for decades and researchers have proposed many different algorithms and methods, including evolutionary algorithms based on genetic algorithms, particle swarm optimisation, simulated annealing and ant colony algorithms, as well as traditional optimization methods based on linear programming, non-linear programming, support vector machines, neural networks and so on. Each of these methods has its own advantages and disadvantages and is suitable for different types of multi-objective optimization problems.

Most optimization problems in practice are multi-objective optimization problems. In general, sub-objectives may influence and constrain each other: improving one sub-objective may degrade the performance of one or several others. In other words, it is not possible to achieve optimal values for all sub-objectives at the same time; one can only coordinate and compromise between them, so that each sub-objective is optimized as far as possible. In this case, the multi-objective optimization problem can generally be written as the following mathematical model: $f (x) = [f_{1} (x), f_{2} (x), \dots, f_{n} (x)]$ (1) where f (x) denotes the vector of objective functions to be minimized or maximized, and f_i (x), 1 ≤ i ≤ n, are the sub-objectives.

1.1 Motivation and research gap

Multi-objective optimization is a challenging field that must address several issues. One of the primary issues is how to handle conflicts and trade-offs between system objectives in uncertain environments, particularly under fuzzy uncertainty [9]. In some cases, the objectives are related through logical connectives such as ∧ and ¬. In other cases, there may be intricate relationships and constraints between objectives, such as temporal relationships expressed by operators like “next time,” “until,” and “always.” For example, consider the objective of ensuring that sub-objective a is always reached before sub-objective b. Such a temporal relationship is difficult to express through a functional relationship. Therefore, multi-objective optimization requires the design of algorithms and methods that can adapt to different types of logical conflicts and temporal constraints in fuzzy environments.

To fill this gap, we propose the concept of a complex objective (CO), which is described by fuzzy temporal logic so that the objective function can express complex temporal properties between sub-objectives in fuzzy systems. In fact, a complex objective is also a quantitative property of a system. Fuzzy systems [10] are a generalisation of deterministic systems in which the input, output, and state variables are defined on fuzzy sets and therefore carry fuzzy uncertain information. To model systems with fuzzy uncertainty, several operational fuzzy models have been proposed, such as fuzzy discrete event systems (FDES) [11–14], possibility Kripke structures (PKS) [15, 16] and their generalized version, generalized PKS (GPKS) [17]. In this paper we adopt the notion of GPKS, extended by nondeterministic choices for decision-making. Nondeterminism is critical for control systems to make decisions. We call a GPKS with nondeterministic choices a possibilistic decision system (PDS, for short). The use of the PDS representation is based on the assumption that decisions depend on various imprecise sensor measurements obtained at discrete sampling instants. The fuzzy states in a PDS are determined by a family of sensor evaluations, expressed as a set of atomic propositions of states.

Therefore, we focus on the following two issues in this paper: 1) How can a complex objective with temporal properties and logical relationships be characterized? 2) How can complex objective optimization be implemented? Fuzzy temporal logic has features that make it an adequate tool for addressing the large amount of uncertainty inherent in natural environments. In [18], Saffiotti reviews some of the possible uses of fuzzy logic in robot navigation. Furthermore, temporal logic formulas can describe various complex temporal properties in the field of formal verification, such as model checking [19, 20]. Several fuzzy temporal logics have been well studied by Li in [17, 21–24]. Generalized possibility computation tree logic (GPoCTL) is a fuzzy temporal logic that can characterize the basic temporal properties. In this paper, we extend GPoCTL with actions to obtain possibilistic strategy computation tree logic (PoSCTL, for short), which we use to characterize a complex objective.

1.2 Main contribution

In this paper, we investigate complex objective optimization (COO) in fuzzy environments, a field of research that deals with multi-objective optimization problems in which the objective functions are fuzzy in nature. In COO, the objectives are described by PoSCTL formulas, which capture the inherent uncertainty and temporal properties of many real-world optimization problems. COO is thus an extension of traditional multi-objective optimization to fuzzy objective functions. In traditional multi-objective optimization, the goal is to find a set of non-dominated solutions that provide the best compromise between conflicting objectives. In COO, the objective functions are fuzzy and have temporal properties, and the goal is to find an optimal strategy that gives the objective the maximal possibility. Moreover, we show that memoryless strategies are sufficient for solving optimal control problems for complex objectives characterized by PoSCTL.

In brief, the main contributions of this article are:

A fuzzy temporal logic is proposed to characterize a complex objective in a fuzzy environment, so that the objective can express complex temporal properties between sub-objectives.

We prove that memoryless strategies are sufficient for solving optimal control problems of complex objectives.

A decision system called PDS is proposed, which can be used for modeling, formal verification, and decision-making of fuzzy systems with nondeterministic choices.

We have built a framework for the application of PDS to mobile robots. This framework solves the problem of optimal control of a complex objective in a fuzzy environment.

Complex objective optimization in fuzzy environments deals with problems involving multiple, conflicting objectives in the presence of uncertainty or imprecision. Complex objectives are useful for describing temporal relationships between multiple objectives. The technique involves searching, with mathematical optimization techniques, for strategies that are optimal with respect to the multiple objectives, and it can be applied to a wide range of practical problems in various domains.

1.3 Related work

A complex objective is a variant of a multi-objective, except that both temporal and logical relationships between the sub-objectives are considered. We briefly review some studies on multi-objective fuzzy optimal control. In [6] and [7], an approach for the multi-objective control of sampled-data systems that can be modeled as FDES was proposed and applied to the simulation of a mobile robot. In [8], an embedded fuzzy controller for a mobile robot was developed to make the robot follow a target trajectory satisfactorily. However, in these studies the sub-objectives are related only by a single functional relationship (an optimal objective function) or by a fixed sequence of execution. Such relations are not sufficient to characterize complex relations between objectives, such as the temporal relations next, future, until, always and infinitely often, and the logical relations or, and, not and implies. Although there has been much research on robot path planning in Markov decision processes (MDPs) [25–27], these studies are probability-based rather than based on fuzzy theory.

1.4 Structure of this paper

The remainder of this paper is arranged as follows. Section 2 gives the notation of possibility theory and the formal framework of possibilistic decision systems. In Section 3, we study the optimal control of complex objectives in PDSs. In Section 4, we verify the theoretical development through an example of a moving robot. The paper ends with conclusions.

2 Notation and formal framework to possibilistic decision systems

In this section, basic knowledge about the possibility theory introduced in [17] and the formal framework to possibilistic decision system are given.

First, all parameters and variables as well as symbols of this paper are listed by a nomenclature to help the reader follow the paper conveniently, see Table 1.

Table 1
Notations in this paper

Notation	Meaning
$S$	A strategy.
Po	Possibility operator.
Φ	A sub-objective.
φ	A complex objective.
∨	Sup operator.
∧	Inf operator.
∘	Sup-Inf composition operator.
⊨	Satisfy operator.
∥ ∥	∥Φ∥: the degree of satisfiability of an objective Φ.
○	Next temporal operator.
◊	Eventually or future temporal operator.
⊔	Until or constrained temporal operator.

2.1 Generalized possibility theory

Possibility theory is an uncertainty theory devoted to the handling of incomplete information. It provides an alternative to probability theory and uses a pair of dual set functions, i.e., possibility and necessity measures, instead of the single measure used in probability theory [28, 29].

In this paper, for simplicity, we assume that the universe of discourse U is a nonempty set, and assume that all subsets are measurable. A possibility measure is a function Π from the powerset 2^U to [0, 1] such that:

1) Π (∅) = 0;
2) Π (U) = 1;
3) Π (⋃_{i∈I} E_i) = ⋁_{i∈I} Π (E_i) for any family {E_i}_{i∈I} of subsets of the universe U,

where ⋁_{i∈I} a_i denotes the supremum (least upper bound) of the family of real numbers {a_i}_{i∈I}; dually, ⋀_{i∈I} a_i denotes the infimum (greatest lower bound) of {a_i}_{i∈I}. If Π satisfies only conditions 1) and 3), then Π is called a generalized possibility measure [17].
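As a small illustration on a finite universe, a possibility measure can be induced from a point-wise possibility distribution by taking the supremum over the event. The sketch below assumes such a distribution; the name `pi` and its values are hypothetical and are not taken from the paper.

```python
def possibility(pi, event):
    """Pi(E) = sup{pi(u) : u in E}; the empty event gets possibility 0."""
    return max((pi[u] for u in event), default=0.0)


# Hypothetical possibility distribution on U = {"a", "b", "c"} (illustration only).
pi = {"a": 0.3, "b": 1.0, "c": 0.6}
universe = set(pi)

e1, e2 = {"a"}, {"a", "c"}
print(possibility(pi, set()))     # condition 1): the empty set
print(possibility(pi, universe))  # condition 2): holds here because some pi(u) equals 1
# condition 3): the measure of a union is the supremum of the measures
print(possibility(pi, e1 | e2) == max(possibility(pi, e1), possibility(pi, e2)))
```

If no element of the distribution reaches 1, condition 2) fails while conditions 1) and 3) still hold, which gives a generalized possibility measure in the sense above.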

2.2 Possibilistic decision systems

Definition 1. A possibilistic decision system (PDS, in short) is a tuple $M = (S, Act, P, I, AP, L)$ where

S is a countable set of fuzzy states;

Act is a set of actions;

P : S × Act × S → [0, 1] is the possibilistic actions transition function such that for all states s ∈ S, there exist α ∈ Act and t ∈ S satisfying P (s, α, t) >0;

I : S → [0, 1] is the initial distribution such that I (s) >0 for some state s;

AP is a set of atomic propositions;

L : S × AP → [0, 1] is a possibilistic labeling function, which can be viewed as function mapping a state s to the fuzzy set of atomic propositions, i.e., L (s, a) denotes the possibility or truth value of atomic proposition a that is supposed to hold in s.

Remark 1. 1) In this paper, a matrix is called a fuzzy matrix if all its elements are taken from the interval [0, 1]. The composition of fuzzy matrixes is analogous to ordinary matrix multiplication, with the minimum ∧ and maximum ∨ operations replacing the ordinary × and + operations; it is called the max-min composition and is denoted by ∘.

2) Let Act (s) = {α ∈ Act | ⋁_{t∈S} P (s, α, t) > 0}; then Act (s) is nonempty. Each state t for which P (s, α, t) > 0 is called an α-successor of s. The possibilistic actions transition function P : S × Act × S → [0, 1] can also be represented by a family of fuzzy matrixes, called fuzzy (possibilistic) actions transition matrixes, indexed by the set Act. For an action α, let P_α (s, t) = P (s, α, t); then P_α is called a fuzzy α-transition matrix, i.e., P_α = (P (s, α, t))_{s,t∈S}.

3) In the following, S is determined by a family of imprecise sensor measurements, Act is a set of control actions, and the atomic propositions AP represent a family of imprecise sensors. The labeling function L can be viewed as a fuzzifier that maps sensor data to values in [0, 1].

4) The most closely related model is the labeled Markov decision process (MDP) [19]. The main difference between an MDP and a PDS is that the former is based on probability measures and the latter on possibility (fuzzy) measures. Furthermore, the labeling function in a PDS is fuzzy, whereas it is crisp in an MDP. Any GPKS is a PDS in which the action set of every state is a singleton. Conversely, any PDS with this property is a GPKS. The action names are irrelevant and are omitted in GPKSs. Thus, GPKSs form a proper subset of PDSs.
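The max-min composition of Remark 1 and the enabled-action set Act (s) can be sketched as follows. This is a minimal illustration, assuming that fuzzy matrixes are stored as nested Python lists and that P is a dictionary from action names to matrixes; the two-state matrixes and the names `alpha` and `beta` are hypothetical.

```python
def compose(A, B):
    """Max-min composition: (A ∘ B)[i][j] = max_k min(A[i][k], B[k][j])."""
    return [[max(min(a, b) for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def enabled_actions(P, s):
    """Act(s): the actions having at least one successor of positive possibility."""
    return {action for action, matrix in P.items() if max(matrix[s]) > 0}


# Hypothetical fuzzy transition matrixes over two states (illustration only).
P = {
    "alpha": [[0.0, 0.8], [0.5, 1.0]],
    "beta": [[0.0, 0.0], [0.4, 0.9]],
}

print(compose(P["alpha"], P["beta"]))
print(enabled_actions(P, 0), enabled_actions(P, 1))
```

In this example `beta` has an all-zero row for state 0, so it is not enabled there; Definition 1 still holds because `alpha` is.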

2.3 Strategy

A strategy (cf. [19]) for a PDS $M$ is a function $S : S^{+} \to Act$ such that $S (s_{0} s_{1} \dots s_{n}) \in Act (s_{n})$ for all s₀s₁ ⋯ s_n ∈ S⁺. Here, S⁺ denotes the set of finite nonempty strings over S. The path π = s₀ α₁s₁ α₂s₂ α₃⋯ is called an $S - Path$ if $α_{i} = S (s_{0} \dots s_{i - 1})$ for all i > 0. Furthermore, a memoryless strategy is a function $S : S \to Act$ . For a memoryless strategy $S$ , its fuzzy transition matrix $P_{S}$ is defined by $P_{S} (s, t) = P (s, S (s), t),$ (2) for any s, t ∈ S. Other classes of strategies can be defined, such as finite-memory strategies [24], but memoryless strategies are sufficient for the purposes of this paper.
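Equation (2) amounts to picking, for each state, the row of the matrix of the chosen action. A minimal sketch, assuming states are indexed 0..n-1 and P maps action names to fuzzy matrixes; the data are hypothetical:

```python
def strategy_matrix(P, strategy):
    """P_S(s, t) = P(s, S(s), t): row s comes from the matrix of the action chosen in s."""
    for s, action in enumerate(strategy):
        # A strategy must pick an enabled action, i.e. S(s) must lie in Act(s).
        if max(P[action][s]) == 0:
            raise ValueError(f"action {action!r} is not enabled in state {s}")
    return [list(P[action][s]) for s, action in enumerate(strategy)]


# Hypothetical two-state PDS with actions "a" and "b" (illustration only).
P = {
    "a": [[0.0, 0.8], [0.5, 1.0]],
    "b": [[0.6, 0.2], [0.0, 0.9]],
}
memoryless = ["b", "a"]  # S(s0) = b, S(s1) = a

print(strategy_matrix(P, memoryless))
```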

2.4 Complex objective and possibilistic strategy computation tree logic

Computation tree logic is well suited for characterizing branching-time properties, and we use this logic to characterize a system objective. Thus, a complex objective is essentially a quantitative property of a system. We introduce the possibilistic strategy computation tree logic (PoSCTL), which has the same syntax as GPoCTL [17] but differs in its semantics, which depends on the system’s strategy.

Definition 2. (Syntax of PoSCTL) PoSCTL state formulas over the set AP of atomic propositions are formed according to the following grammar: $Φ ::= r | a | Φ_{1} \land Φ_{2} | \neg Φ | P (φ),$ (3) where r ∈ [0, 1], a ∈ AP, and φ is a PoSCTL path formula. PoSCTL path formulas are formed according to the following grammar: $φ ::= ○ Φ | ◊ Φ | Φ_{1} ⊔ Φ_{2},$ (4) where Φ, Φ₁ and Φ₂ are state formulas and ○, ◊, ⊔ are called temporal operators.

Definition 3. (Semantics of PoSCTL) For a PDS $M$ and a strategy $S$ for $M$ , let s ∈ S be a state, Φ, Ψ be PoSCTL state formulas, and φ be a PoSCTL path formula. The semantics of a state formula Φ is a fuzzy set $∥ Φ ∥_{S} : S \to [0, 1]$ , which is defined recursively as follows: $∥ r ∥_{S} (s) = r,$ (5) $∥ a ∥_{S} (s) = L (s, a),$ (6) $∥ Φ \land Ψ ∥_{S} (s) = ∥ Φ ∥_{S} (s) \land ∥ Ψ ∥_{S} (s),$ (7) $∥ \neg Φ ∥_{S} (s) = 1 - ∥ Φ ∥_{S} (s),$ (8) $∥ P (φ) ∥_{S} (s) = {Po}_{S} (s ⊨ φ) .$ (9)

For path formulas, their semantics are defined recursively in a $S - Path$ π = s₀ α₁s₁ α₂s₂ α₃⋯. For “next” formula, its semantics is defined as $∥ ○ Φ ∥_{S} (π) = P (s_{0}, α_{1}, s_{1}) \land ∥ Φ ∥_{S} (s_{1}),$ (10)

For the “eventually” (also called “future”) formula, which expresses a reachability property, the semantics is defined as $∥ ◊ Φ ∥_{S} (π) = ⋁_{j = 0}^{\infty} \Big( ⋀_{1 \leq k \leq j} P (s_{k - 1}, α_{k}, s_{k}) \land ∥ Φ ∥_{S} (s_{j}) \Big),$ (11) where the infimum over an empty index set (j = 0) is 1.

For the “until” formula, which expresses a constrained reachability property, the semantics is defined as $∥ Φ ⊔ Ψ ∥_{S} (π) = ∥ Ψ ∥_{S} (s_{0}) \lor ⋁_{j > 0} \Big( ∥ Φ ∥_{S} (s_{0}) \land ⋀_{0 < k < j} \big( P (s_{k - 1}, α_{k}, s_{k}) \land ∥ Φ ∥_{S} (s_{k}) \big) \land P (s_{j - 1}, α_{j}, s_{j}) \land ∥ Ψ ∥_{S} (s_{j}) \Big).$ (12)

In fact, ◊Φ is defined as true ⊔ Φ, where true denotes the constant formula r = 1. The definition of ${Po}_{S} (s ⊨ φ)$ is the same as the definition in [17], except that ${Po}_{S} (s ⊨ φ)$ in this paper does not consider the possibility measure of paths, but only the measure of formulas. This makes decision-making within a limited horizon more convenient. It is defined as follows: ${Po}_{S} (s ⊨ φ) = ⋁_{π \in S - Paths (s)} ∥ φ ∥_{S} (π) .$ (13) Intuitively, ${Po}_{S} (s ⊨ φ)$ denotes the largest possibility with which an $S - path$ starting at s satisfies the formula φ.
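The path semantics of “until” and “eventually” can be evaluated on a finite prefix of a path with a single left-to-right pass. The sketch below is an illustration under two stated assumptions: the supremum is truncated to the given prefix, and the last factor of each disjunct in Equation (12) is read as the transition into s_j. The function names and the numbers are hypothetical.

```python
def until_on_prefix(phi, psi, trans):
    """Truncated value of ||Phi ⊔ Psi|| on the prefix s_0 ... s_n.

    phi[k], psi[k]: truth values of Phi and Psi in s_k.
    trans[k]: possibility of the transition from s_k to s_{k+1}.
    """
    best = psi[0]  # j = 0: Psi already holds in s_0
    acc = phi[0]   # Phi(s_0) ∧ inf over 0 < k < j of (transition ∧ Phi(s_k))
    for j in range(1, len(psi)):
        step = trans[j - 1]
        best = max(best, min(acc, step, psi[j]))
        acc = min(acc, step, phi[j])
    return best


def eventually_on_prefix(psi, trans):
    """◊Psi = true ⊔ Psi, with 'true' evaluated as 1 in every state."""
    return until_on_prefix([1.0] * len(psi), psi, trans)


# Hypothetical three-state prefix (illustration only).
phi = [0.9, 0.5, 0.4]
psi = [0.1, 0.3, 1.0]
trans = [0.8, 0.6]

print(until_on_prefix(phi, psi, trans))
print(eventually_on_prefix(psi, trans))
```

The “until” value is never larger than the “eventually” value on the same prefix, because the constraint Φ can only lower the running infimum.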

3 Complex objectives optimal control in possibilistic decision systems

In this section, we study the optimal control of complex objectives. For example, in a path formula φ = Φ₁ ⊔ Φ₂, Φ₁ and Φ₂ are called sub-objectives; they are connected by a temporal operator, and φ is called a complex objective (or objective, for short). Furthermore, each subformula of a state formula Φ that is connected by a logical operator is called a subgoal. Multiple objectives are thus combined by logical and temporal operators to form a complex objective φ. We assume that the possibilistic decision system can take measurements from imprecise sensors at each sampling instant in order to determine the current fuzzy state set.

Example 1. Let φ = (¬ Obstacle ∧ ¬ Jam) ⊔ Destination be a complex objective. This objective means that a vehicle drives on a traffic-free road and bypasses obstacles until it arrives at the destination, i.e., $φ = \underbrace{\overbrace{\neg Obstacle}^{\text{subgoal}} \overbrace{\land}^{\text{logic}} \overbrace{\neg Jam}^{\text{subgoal}}}_{\text{sub-objective 1}} \underbrace{⊔}_{\text{temporal}} \underbrace{Destination}_{\text{sub-objective 2}} .$ (14)

The main aim in what follows is to find the optimal strategy for such an objective, and for even more complex ones.

3.1 The process of complex objectives optimal control

Figure 1 shows the process of complex objective optimal control in PDSs. First, an objective is formed by combining multiple sub-objectives via logical and temporal operators. Next, this objective is formalized as a PoSCTL formula and input into the PDS. Then, a family of imprecise sensors samples data, and the data are mapped to the interval [0, 1] by the fuzzifier; the result is used to determine the current state of the system and the atomic propositions of the states. Finally, the PDS calculates and outputs the possibility of the objective and the corresponding optimal memoryless strategy to control the mobile robots or vehicles.

Fig. 1

Process of complex objectives optimal control.

Remark 2. Generally, there are three main ways to obtain the degree of satisfaction of an atomic proposition in a given state. First, fuzzy values can be obtained subjectively from expert experience, as with the data in Example 1. Second, the system data can be derived from monitoring data from various sensors, as with the data in the robot example in Section 4. Third, both the fuzzy values of the states and the fuzzy transition matrix can be obtained more objectively by learning methods. This has been well studied for fuzzy discrete-event systems, but remains a topic worth investigating for complex objective optimization.

Formally, consider sub-objectives Ψ₁ and Ψ₂. The optimal control problem for a complex objective φ of the form ○ Ψ₁, ◊Ψ₁ or Ψ₁ ⊔ Ψ₂ amounts to determining the maximal possibility

${Po}_{max} (s ⊨ φ) ≜ \sup_{S} {Po}_{S} (s ⊨ φ)$ (15) where the supremum ranges over all strategies for the PDS $M$ .

In fact, the Po operator can be viewed as an objective function, and we only need to compute its maximum value. The set of strategies attaining the maximum possibility is the optimal solution set and is the key to our optimal control. However, there are infinitely many strategies, so we need to find subclasses of strategies, such as memoryless strategies, that suffice for optimal control.

3.2 Maximal possibility and the optimal strategy

The strategy corresponding to the maximum possibility of an objective φ is the optimal strategy and, by the definition of maximal possibility, it is globally optimal. There are three ways to construct an objective φ: φ = ○ Ψ (next), φ = ◊ Ψ (future), and φ = Ψ₁ ⊔ Ψ₂ (until).

Theorem 1. For a PDS $M$ , a memoryless strategy $S$ for $M$ , a state s in $M$ and an objective Ψ, then

${Po}_{max} (s ⊨ ○ Ψ) = (⋁_{α \in Act (s)} P_{α} \circ ∥ Ψ ∥_{S}) (s)$ (16) where $∥ Ψ ∥_{S} = (∥ Ψ ∥_{S} (s))_{s \in S}$ .

The proof is placed in Appendix A.

The theorem allows for optimal memoryless control of the next temporal operator. The optimal strategy selects, in each state, the action at which the right-hand side of Equation (16) attains its maximum.
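Equation (16) can be sketched as a per-state maximization over enabled actions. This is a minimal illustration, assuming the values ∥Ψ∥ are already available as a vector indexed by state; the function name `next_max` and the data are hypothetical.

```python
def next_max(P, psi):
    """For each state s, return max over Act(s) of (P_alpha ∘ psi)(s) and a maximizing action."""
    n = len(psi)
    values, strategy = [], []
    for s in range(n):
        best_value, best_action = -1.0, None
        for action, matrix in P.items():
            if max(matrix[s]) == 0:  # action is not in Act(s)
                continue
            value = max(min(matrix[s][t], psi[t]) for t in range(n))
            if value > best_value:
                best_value, best_action = value, action
        values.append(best_value)
        strategy.append(best_action)
    return values, strategy


# Hypothetical two-state PDS and objective values (illustration only).
P = {
    "a": [[0.0, 0.8], [0.5, 1.0]],
    "b": [[0.6, 0.2], [0.0, 0.9]],
}
psi = [0.7, 0.3]

print(next_max(P, psi))
```

The returned list of actions is a memoryless strategy, since each choice depends only on the current state.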

Proposition 1. For a PDS $M$ and a state s in $M$ , let Ψ₁ and Ψ₂ be two sub-objectives and let x_s = Po_max (s ⊨ Ψ₁ ⊔ Ψ₂). Then x_s satisfies the following equation:

$x_{s} = ∥ Ψ_{2} ∥_{S} (s) \lor \max_{α \in Act (s)} ⋁_{s_{1} \in S} \big( ∥ Ψ_{1} ∥_{S} (s) \land P (s, α, s_{1}) \land x_{s_{1}} \big) .$ (17)

The proof is placed in Appendix B.
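One possible way to solve Equation (17) numerically on a finite PDS is to iterate its right-hand side starting from ∥Ψ₂∥ until nothing changes. This is a sketch, not necessarily the procedure used by the authors; it assumes that the maximal possibility is the least solution of the equation, and the three-state data are hypothetical.

```python
def until_max(P, psi1, psi2):
    """Iterate x <- psi2 ∨ max_alpha sup_t (psi1(s) ∧ P(s, alpha, t) ∧ x_t) to a fixed point."""
    n = len(psi2)
    x = list(psi2)
    while True:
        # Actions outside Act(s) have an all-zero row and never raise the maximum.
        new = [
            max(psi2[s],
                max(min(psi1[s], matrix[s][t], x[t])
                    for matrix in P.values() for t in range(n)))
            for s in range(n)
        ]
        if new == x:
            return x
        x = new


# Hypothetical three-state PDS (illustration only); state 2 is the target.
P = {
    "a": [[0.0, 0.9, 0.0], [0.0, 0.0, 0.4], [0.0, 0.0, 1.0]],
    "b": [[0.0, 0.0, 0.3], [0.0, 0.0, 0.8], [0.0, 0.0, 1.0]],
}
psi1 = [1.0, 0.6, 0.0]
psi2 = [0.0, 0.0, 1.0]

print(until_max(P, psi1, psi2))
```

The loop terminates because only min and max are applied, so every iterate takes values from the finite set of numbers occurring in the input, and the iterates never decrease.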

Proposition 2. Let $M$ be a finite PDS with state space S, and let C and B be fuzzy sets of states over S. There exists a memoryless strategy $S_{max}$ such that for any s ∈ S: ${Po}_{max} (s ⊨ C ⊔ B) = {Po}_{S_{max}} (s ⊨ C ⊔ B) .$ (18)

The proof is placed in Appendix C.

Theorem 2. For a PDS $M$ , a state s in $M$ and an “until” type objective Ψ₁ ⊔ Ψ₂, there exists a memoryless strategy $S_{max}$ such that

${Po}_{max} (s ⊨ Ψ_{1} ⊔ Ψ_{2}) = \big( (D_{Ψ_{1}}^{S_{max}} \circ P_{S_{max}})^{*} \circ ∥ Ψ_{2} ∥_{S_{max}} \big) (s),$ (19) where $D_{Ψ_{1}}^{S_{max}} = diag (∥ Ψ_{1} ∥_{S_{max}} (s))_{s \in S}$ and * denotes the reflexive and transitive closure [10].

This theorem follows easily from Propositions 1 and 2; the proof is left to the reader.
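For a fixed memoryless strategy, the right-hand side of Equation (19) can be evaluated with a Floyd-Warshall style closure in the max-min algebra. The sketch below assumes the strategy matrix P_S and the vectors ∥Ψ₁∥, ∥Ψ₂∥ are given; the helper names and the numbers are hypothetical.

```python
def closure(A):
    """Reflexive and transitive closure A* = I ∨ A ∨ A∘A ∨ ... under max-min composition."""
    n = len(A)
    C = [[max(A[i][j], 1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                C[i][j] = max(C[i][j], min(C[i][k], C[k][j]))
    return C


def until_value(P_S, psi1, psi2):
    """((D ∘ P_S)* ∘ psi2)(s) for every s, where D = diag(psi1)."""
    n = len(psi2)
    # D ∘ P_S simply caps row s of P_S by psi1(s).
    A = [[min(psi1[s], P_S[s][t]) for t in range(n)] for s in range(n)]
    C = closure(A)
    return [max(min(C[s][t], psi2[t]) for t in range(n)) for s in range(n)]


# Hypothetical strategy matrix over three states (illustration only).
P_S = [[0.0, 0.9, 0.0], [0.0, 0.0, 0.8], [0.0, 0.0, 1.0]]
psi1 = [1.0, 0.6, 0.0]
psi2 = [0.0, 0.0, 1.0]

print(until_value(P_S, psi1, psi2))
```

The triple loop makes the cost cubic in the number of states for one strategy, which is in line with the polynomial dependence on |S| stated later in Theorem 4.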

Since the number of memoryless strategies is finite (at most |Act|^|S|), we can use the strategy iteration algorithm [30], which is often applied to MDPs, to find an optimal memoryless strategy. The correctness of Algorithm 1 is straightforward and is omitted here. The main steps are as follows:

Step 1: Start with an arbitrary memoryless strategy $S$ ;

Step 2: Evaluate the possibilities by using current memoryless strategy;

Step 3: Improve the strategy for all states by Equation (17);

Step 4: Repeat steps 2 and 3 until the strategy converges.
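The four steps above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: evaluation iterates the fixed-strategy version of Equation (17), and the improvement step switches an action only when it strictly raises the current value, which is an assumption made here so that the loop cannot oscillate between equally good actions. All names and data are hypothetical.

```python
def evaluate(P, strategy, psi1, psi2):
    """Step 2: possibility of psi1 ⊔ psi2 in every state under a fixed memoryless strategy."""
    n = len(psi2)
    x = list(psi2)
    while True:
        new = [max(psi2[s],
                   max(min(psi1[s], P[strategy[s]][s][t], x[t]) for t in range(n)))
               for s in range(n)]
        if new == x:
            return x
        x = new


def strategy_iteration(P, psi1, psi2, strategy):
    """Steps 1-4: start from `strategy`, evaluate, improve, repeat until stable."""
    n = len(psi2)
    strategy = list(strategy)
    while True:
        x = evaluate(P, strategy, psi1, psi2)
        improved = False
        for s in range(n):
            best = x[s]
            for action, matrix in P.items():
                q = max(min(psi1[s], matrix[s][t], x[t]) for t in range(n))
                if q > best:  # Step 3: switch only on strict improvement
                    best, strategy[s], improved = q, action, True
        if not improved:
            return x, strategy


# Hypothetical three-state PDS (illustration only); state 2 is the target.
P = {
    "a": [[0.0, 0.9, 0.0], [0.0, 0.0, 0.4], [0.0, 0.0, 1.0]],
    "b": [[0.0, 0.0, 0.3], [0.0, 0.0, 0.8], [0.0, 0.0, 1.0]],
}
psi1 = [1.0, 0.6, 0.0]
psi2 = [0.0, 0.0, 1.0]

print(strategy_iteration(P, psi1, psi2, ["a", "a", "a"]))
```

Starting from the all-`a` strategy, one improvement round changes the action in state 1 to `b`, after which no action offers a strictly better value and the loop stops.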

Since ◊Ψ = true ⊔ Ψ, taking Ψ₁ = true in Theorem 2, so that D_{Ψ₁} = D_true is the identity matrix, gives a more concise form for the “future” type objective (also called a single objective).

Theorem 3. For a PDS $M$ , a state s in $M$ and a “future” type objective ◊Ψ, there exists a memoryless strategy $S_{max}$ such that ${Po}_{max} (s ⊨ ◊ Ψ) = \big( P_{S_{max}}^{*} \circ ∥ Ψ ∥_{S_{max}} \big) (s) .$ (20)
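As a small worked instance of Equation (20), with hypothetical numbers that are not taken from the paper, consider two states and a memoryless strategy $S$ with $P_{S} = \begin{pmatrix} 0 & 0.8 \\ 0.5 & 1 \end{pmatrix}, \quad ∥ Ψ ∥_{S} = \begin{pmatrix} 0.2 \\ 0.9 \end{pmatrix} .$ For two states, paths longer than one step add nothing to the closure, so $P_{S}^{*} = I \lor P_{S} = \begin{pmatrix} 1 & 0.8 \\ 0.5 & 1 \end{pmatrix} .$ The max-min composition then gives $P_{S}^{*} \circ ∥ Ψ ∥_{S} = \begin{pmatrix} (1 \land 0.2) \lor (0.8 \land 0.9) \\ (0.5 \land 0.2) \lor (1 \land 0.9) \end{pmatrix} = \begin{pmatrix} 0.8 \\ 0.9 \end{pmatrix} ,$ so under this strategy the possibility of eventually satisfying Ψ is 0.8 from the first state and 0.9 from the second.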

Theorem 4. (Time complexity of complex objective optimal control) For a finite PDS $M$ and an objective specified by a PoSCTL formula Φ, the complex objective optimal control problem can be solved in time

$O (poly (| S |) \cdot | Act | \cdot | Φ |),$ (21) where poly (N) denotes a polynomial function of N and |Φ| denotes the number of subformulas of Φ.

The proof is similar to that for GPoCTL model checking over GPKSs [17] and is omitted here.

4 Verifying the theoretical development through computer simulations

Fuzzy systems have a wide range of applications, including the field of vehicles. References such as [31, 32] discuss how fuzzy systems can contribute to more efficient driving by considering various factors such as load, driving style, road conditions, and vehicle conditions. For instance, [33] explores a hybrid-electric autonomous vehicle under uncertain and ambiguous road environments and driver behavior. In [34], a new optimized fuzzy control system is proposed to address the limitations of existing vehicle motion simulators.

Fuzzy uncertainty is often encountered in linguistic interaction with automated driving systems. For instance, when drivers issue commands such as “increase the speed a little” or “go slightly to the left,” the terms “a little” and “slightly” are vague and must be dealt with accordingly by the automated driving system. Additionally, uncertainty arises from the imprecision of sensors placed on vehicles, which introduce measurement uncertainty. These uncertainties are classified as fuzzy environments and are detailed in [6]. The accuracy and range of sensors, their location and orientation, and the method of evaluating sensor measurements all contribute to measurement uncertainty. For example, the detection of an obstacle by proximity sensors and the distance estimation depend on the accuracy and range of sensors, as well as their number, location, and orientation. Reliable measurements can be achieved through an adequate number of sensors rather than just one.

Fig. 2

A PDS model of mobile robots.

We propose a framework for optimal control of complex objectives of mobile robots in a fuzzy environment, based on PDSs. This framework simulates car driving by taking into account multiple objectives and the temporal and logical relationships between them. The growing demand for driving has led to an increase in the complexity of objective control problems, necessitating more sophisticated techniques to solve them. We illustrate this framework with an example of mobile robots studied in [8, 36]. In this example, a robot travels in a 10 × 10 grid to simulate unmanned car driving in a city, with each state representing a set of imprecise sensors placed at an intersection to detect traffic congestion and air quality (indicated by the red and blue contours, respectively). Obstacles are represented by thick black lines, and each edge represents a two-way road (see the partial enlargement in Fig. 2). The robot attempts to move from its current position s₁ to a desired destination s₁₀₀ or a nearby state. Because the sensors are imprecise, the processor installed on the robot converts the received sensor data into fuzzy values, modeled as the truth values of atomic propositions on states, in order to make decisions.

We will focus on the following four complex objectives in our simulation. It should be noted that the objectives chosen are relatively simple to illustrate the proposed approach.

Complex Objective 1: what is an optimal strategy under which the robot eventually arrives at the destination? $φ_{1} = ◊ Destination .$ (22) This objective has the temporal operator ◊, which means “in the future” or “eventually”.

Complex Objective 2: what is an optimal strategy under which the robot eventually arrives at the destination while bypassing obstacles? $φ_{2} = \neg Obstacle ⊔ Destination .$ (23) This objective has the temporal operator ⊔, which means “until”, and one logical operator ¬, which denotes the negation of a sub-objective.

Complex Objective 3: what is an optimal strategy under which the robot eventually arrives at the destination via uncrowded roads while bypassing obstacles? $φ_{3} = \neg Jam \land \neg Obstacle ⊔ Destination .$ (24) This objective has the temporal operator ⊔, which means “until”, and two logical operators, ∧ and ¬, which together express that the negations of two sub-objectives hold simultaneously.

Complex Objective 4: what is an optimal strategy under which the robot arrives at the destination via uncrowded roads with fresh air while bypassing obstacles? $φ_{4} = Fresh \land \neg Jam \land \neg Obstacle ⊔ Destination .$ (25)

Then these amount to determining maximal possibilities Po_max (s₁ ⊨ φ_i) for 1 ≤ i ≤ 4 and the corresponding optimal strategies.

Fig. 3

Optimal strategies for complex objectives 1-3. (a) shows the optimal strategy for objective 1 that considers only the reachability of the destination without considering the obstacles. (b) shows the optimal paths for objective 2 that considers the obstacles. (c) shows the contour of Jam with a threshold of 0.61 and an optimal strategy for objective 3 that considers both the obstacles and the traffic jam.

4.1 Setup of simulation

We wrote a program in MATLAB (version 2021b) to implement the complex objective optimal control algorithms.

There are 100 states spread over a 10 × 10 grid, i.e., $S = \{s_{i}\}_{i = 1}^{100} .$ (26)

According to a certain random distribution, the states are labeled with Obstacle, Jam, Fresh and Destination and assigned numbers that represent the possibility degrees of the atomic propositions in the states. In other words, the numbers represent the sensor data at the current intersection. Then, let $AP = \{Obstacle, Jam, Fresh, Destination\},$ (27) where Obstacle denotes the walls that prevent the robot from moving and L (s, Obstacle) ∈ {0, 1} for s ∈ S. Jam and Fresh refer to the degree of traffic jam and of air quality, respectively: the higher the value, the heavier the traffic jam or the better the air quality, and vice versa. The target state is labeled by the atomic proposition Destination.

Each state has eight actions, i.e., North, Northeast, East, Southeast, South, Southwest, West and Northwest, and each action has only one successor state in this model. (The outermost states may have only two or three actions.) The robot can thus take eight actions that change its direction, i.e., $Act = \{N, NE, E, ES, S, SW, W, WN\} .$ (28)

The robot is initially at state s₁, i.e., I (s₁) = 1 and I (s) = 0 for s ≠ s₁. Let state s₁₀₀ and its nearby states be target states, i.e., $L (s_{100}, Destination) = 1; L (s_{99}, Destination) = 0.9;$ $L (s_{90}, Destination) = 0.9; L (s_{89}, Destination) = 0.8;$ (29) this means that the robot should ideally reach state s₁₀₀, but it is acceptable to reach a state near s₁₀₀.

Obtaining the fuzzy transition matrixes is beyond the scope of this paper; they can be obtained by referring to [37, 38], so we choose an appropriate transition function between states based on that experience. If the robot is moving in the direction of Destination, we set a higher possibility, and vice versa. We thus obtain the eight matrixes P = {P_N, P_NE, P_E, P_ES, P_S, P_SW, P_W, P_WN}.

We then obtain the PDS model $M$ of the mobile robot as a six-tuple: $M = (S, Act, P, I, AP, L) .$ (30)
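The grid structure of this model can be sketched in a few lines. The sketch assumes a row-major numbering in which East adds 1 and North adds 10 to the state index, which is consistent with the state sequences reported in Section 4.2, and it simply drops moves that leave the grid; the function names are hypothetical, and the transition possibilities themselves are not reproduced here.

```python
SIZE = 10

# Action names follow Equation (28); each maps to a (row, column) offset.
MOVES = {"N": (1, 0), "NE": (1, 1), "E": (0, 1), "ES": (-1, 1),
         "S": (-1, 0), "SW": (-1, -1), "W": (0, -1), "WN": (1, -1)}

# Destination labels from Equation (29); every other state is labeled 0.
DESTINATION = {100: 1.0, 99: 0.9, 90: 0.9, 89: 0.8}


def successor(i, action):
    """Index of the unique successor of s_i under `action`, or None if it leaves the grid."""
    row, col = divmod(i - 1, SIZE)
    d_row, d_col = MOVES[action]
    row, col = row + d_row, col + d_col
    if 0 <= row < SIZE and 0 <= col < SIZE:
        return row * SIZE + col + 1
    return None


def enabled(i):
    """Actions of s_i that stay inside the grid (assumed reading of Act(s_i))."""
    return [action for action in MOVES if successor(i, action) is not None]


def follow(i, actions):
    """States visited from s_i when the given actions are taken in order."""
    visited = [i]
    for action in actions:
        i = successor(i, action)
        visited.append(i)
    return visited


print(enabled(1))
print(follow(1, ["NE"] * 9))
```

Taking NE nine times from s₁ visits the states listed in Result 1 and ends in s₁₀₀.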

4.2 Results of simulation

We performed 30 simulations of the above four objectives and visualized the four complex objectives more intuitively using the contour command in MATLAB. The 30 simulations took 142 s.

By Fig. 3, it can be seen that the effect of multiple objectives on the optimal strategy.

Result 1: If only arriving at the destination is considered (objective 1), the optimal strategy is $S_{1} (s_{1}) = S_{1} (s_{12}) = S_{1} (s_{23}) = S_{1} (s_{34}) = S_{1} (s_{45}) = S_{1} (s_{56}) = S_{1} (s_{67}) = S_{1} (s_{78}) = S_{1} (s_{89}) = NE$. As Fig. 3(a) shows, this path has two collision points.

Result 2: When the obstacles are considered, the optimal strategy is $S_{2} (s_{1}) = S_{2} (s_{12}) = S_{2} (s_{25}) = S_{2} (s_{36}) = S_{2} (s_{47}) = S_{2} (s_{58}) = NE$ , $S_{2} (s_{23}) = S_{2} (s_{24}) = S_{2} (s_{69}) = E$ , $S_{2} (s_{70}) = S_{2} (s_{80}) = S_{2} (s_{90}) = N$ . The mobile robot avoids the obstacles until it arrives at the destination; see Fig. 3(b). However, this optimal path may pass through a road with traffic jams.

Fig. 4

Optimal strategies for objective 4. (a) shows the optimal strategy in the obstacle map. (b) shows the optimal strategy in the Jam contour map with threshold of 0.61. (c) shows the optimal strategy in the Fresh contour map with threshold of 0.65.

Fig. 5

Optimal paths of avoiding traffic jams under three traffic jam contour: 0.82, 0.61, 0.38.

Result 3: When both obstacles and traffic jams are considered in complex objective 3, the optimal strategy is $S_{3} (s_{1}) = S_{3} (s_{12}) = S_{3} (s_{26}) = S_{3} (s_{97}) = NE$ , $S_{3} (s_{23}) = S_{3} (s_{24}) = S_{3} (s_{25}) = S_{3} (s_{98}) = S_{3} (s_{99}) = E$ , $S_{3} (s_{37}) = S_{3} (s_{47}) = S_{3} (s_{57}) = S_{3} (s_{67}) = S_{3} (s_{77}) = N$ . This strategy follows the objective better when the traffic jam threshold is set to 0.61; see Fig. 3(c).

Result 4: For complex objective 4, the PDS plans an optimal path that considers obstacles, traffic jams and air quality simultaneously, and this optimal path is the global optimal solution by Theorem 3.2. For convenience, we have visualised the three considered factors separately. Figure 4(a), (b) and (c) show the optimal strategy of complex objective 4 in the Obstacle map, Jam map and Fresh map, respectively.

4.3 Impact of parameters

Firstly, we will consider the effect of different thresholds on the optimal path of the objective, as illustrated in Fig. 5. Different thresholds for traffic congestion generate different optimal paths. Higher thresholds for Jam lead to more congested roads, with an average jam value of less than 0.82. Conversely, lower thresholds lead to smoother roads, with an average jam value of less than 0.38 for the entire path. Generally, higher thresholds result in a shorter distance, while lower thresholds result in a longer distance, as more congested roads need to be avoided. Overall, we consider these results sufficient for confirming the basic theoretical development.

Secondly, in practical engineering applications, the state accuracy λ is a critical parameter for control performance. The number of states increases exponentially with accuracy, as seen in the example of a control system for launching rockets. In this case, the pose state of the rocket is determined by three direction sensors (three atomic propositions) x, y and z with an accuracy of $10^{-3}$, i.e., $AP = \{x, y, z\}$ and λ = 3. A pose state s of the rocket is determined by the three readings of the direction sensors, such as [x = 0.425, y = 0.377, z = 0.856]. Such a model has approximately one billion states, since $(10^{3})^{3} = 10^{9}$, which makes optimization in such a large-scale model highly resource-intensive and time-consuming. Therefore, state reduction technology based on intelligent algorithms is urgently needed.
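The growth of the state space can be checked with a few lines of arithmetic. The sketch below assumes that each sensor reading in [0, 1] is rounded to λ decimal places and therefore takes roughly $10^{λ}$ distinct values; the function name `state_count` is hypothetical.

```python
def state_count(num_propositions, decimal_places):
    """Approximate number of states when each atomic proposition is
    read with the given number of decimal places (assumption: about
    10**decimal_places distinct readings per proposition)."""
    levels_per_proposition = 10 ** decimal_places
    return levels_per_proposition ** num_propositions


# Rocket example: three direction sensors with accuracy 10^-3.
print(state_count(3, 3))  # 1000000000
```

Adding a fourth sensor at the same accuracy would multiply the count by another factor of 1000, which is why state reduction matters in practice.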

In fact, the optimal strategy corresponding to the maximum possibility of the complex objective is a Pareto optimal solution as introduced in [39]. For this simulation, the maximal possibilities of the above four complex objectives are ${Po}_{max} (s_{1} ⊨ φ_{1}) = 1; {Po}_{max} (s_{1} ⊨ φ_{2}) = 1;$ (31) ${Po}_{max} (s_{1} ⊨ φ_{3}) = 0.82; {Po}_{max} (s_{1} ⊨ φ_{4}) = 0.65 .$ (32) A complex objective may have multiple Pareto optimal solutions (a solution is a sequence of actions). Only one of the optimal strategies is shown in the simulation. How to choose a suitable solution within the optimal solution set is beyond the scope of this paper and is left as an open question.

4.4 Comparison with other work

The existing work is based on our previous fuzzy model, Generalized Possibility Kripke Structures (GPKSs), as described in [17]. In comparison, the proposed PDS model is more general and includes the previous models as special cases. Mathematically, any GPKS can be seen as a PDS in which the action set of each state is a singleton. Conversely, any PDS with this property is also a GPKS. The action names in GPKSs are irrelevant and are omitted. Therefore, GPKSs are a proper subset of PDSs. Furthermore, the proposed PDS model offers several innovations and decision-making advantages that were not available in our previous models.

1) The PDS model as an open system increases the nondeterminism of the choices and allows the model to interact with the environment.

2) Due to the nondeterministic choices, different choices can lead to different models, and the degree to which these models satisfy a given property may vary. In this work, we investigate the optimal control of complex objectives, which enables us to determine not only the maximum possibility that the system will reach these objectives when considering all possible strategies, but also the corresponding optimal strategy for decision-making.

In the robot simulation, the mobile robot has nondeterministic choices at each intersection, making it suitable for modeling using PDS. On the other hand, the GPKS model does not perform well in this situation, as it significantly reduces the robot’s ability to make decisions. This is because GPKS models have no action set, or can be seen as a PDS with a singleton action set. This highlights the advantages of using PDS for modeling in such scenarios.

5 Conclusions

In this study, we investigate multi-objective optimization in fuzzy environments, which is an important field of research. We use possibilistic strategy computation tree logic (PoSCTL) to formalize complex objectives that can be described by temporal properties and logical relations. Then, we propose a fuzzy decision system called possibilistic decision systems (PDSs) to model systems with nondeterminism in fuzzy uncertain environments and to optimize complex objectives. Furthermore, we prove mathematically that memoryless strategies are sufficient to solve the optimal control problem for complex objectives, which is a crucial result for decision-making using PDSs. We propose a strategy iteration algorithm that obtains the optimal memoryless decision for a complex objective and runs in time polynomial in the size of the model. Finally, we conduct a simulation study of mobile robots to demonstrate the applicability of complex objectives in a fuzzy environment. The simulation results show that the memoryless strategy implemented by the mobile robot follows complex objectives with the maximum possibility in an environment with different fuzzy uncertain factors. The strategy allows each sub-objective to be satisfied with a certain possibility degree and ensures that the complex objective is optimally achieved. Our study provides a general framework for complex objective optimization in fuzzy environments and highlights the effectiveness of our proposed methods.

There is still a lot of work to be done here. The optimization of complex objectives characterised by linear temporal logic has yet to be explored. Furthermore, although the time complexity of algorithms for strategy iteration is polynomial, they can still be time-consuming for systems with large-scale states. Further state reduction methods need to be developed.

Footnotes

Acknowledgment

This work was partially supported by National Natural Science Foundation of China (Grant Nos: 11671244, 12071271) and the Fundamental Research Funds For the Central Universities (Grant No: 2020CSLY016).

A. The proof of Theorem 1

Proof: Obviously, for any strategy $S$ , ${Po}_{max} (s ⊨ ○ Ψ) \leq ⋁_{α \in Act (s)} P_{α} \circ ∥ Ψ ∥_{S} (s) .$ (33)

Conversely, there exist an action α₁ ∈ Act (s) and a state s₁ ∈ S such that $⋁_{α \in Act (s)} P_{α} \circ ∥ Ψ ∥ (s) = P (s, α_{1}, s_{1}) \land ∥ Ψ ∥ (s_{1}) .$ (34)

We can construct a strategy $S : S^{+} \to Act$ such that $S (s) = α_{1}$ and $S (s ω)$ is arbitrary for any ω ∈ S⁺. Since ${Po}_{S} (s ⊨ ○ Ψ) \geq P (s, α_{1}, s_{1}) \land ∥ Ψ ∥_{S} (s_{1}),$ (35) we have ${Po}_{max} (s ⊨ ○ Ψ) \geq ⋁_{α \in Act (s)} P_{α} \circ ∥ Ψ ∥_{S} (s) .$ (36) This completes the proof.
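The max-min composition behind this result can be illustrated on a toy example. The sketch below computes $⋁_{α \in Act (s)} P_{α} \circ ∥ Ψ ∥ (s)$ for one state and also returns an action that attains the maximum. The dictionary encoding of the transition possibilities and all numerical values are hypothetical and serve only as an illustration.

```python
def po_max_next(state, actions, trans, psi):
    """Return (value, action), where value is the maximum over enabled
    actions and successors t of min(P(state, action, t), psi[t]).
    Assumes every enabled action has at least one successor."""
    best_value, best_action = 0.0, None
    for action in actions[state]:
        value = max(min(p, psi[t]) for t, p in trans[(state, action)].items())
        if best_action is None or value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


# Toy data (hypothetical): two actions enabled in state "s".
actions = {"s": ["a", "b"]}
trans = {
    ("s", "a"): {"t": 0.6, "u": 0.9},
    ("s", "b"): {"t": 1.0},
}
psi = {"t": 0.7, "u": 0.3}  # degrees to which the successors satisfy Psi

print(po_max_next("s", actions, trans, psi))  # (0.7, 'b')
```

Action a yields max(min(0.6, 0.7), min(0.9, 0.3)) = 0.6 and action b yields min(1.0, 0.7) = 0.7, so b attains the maximum.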

B. The proof of Proposition 1

Proof: By the definition of Po_max (s ⊨ Ψ₁ ⊔ Ψ₂), there exists a strategy $S$ such that ${Po}_{max} (s ⊨ Ψ_{1} ⊔ Ψ_{2}) = {Po}_{S} (s ⊨ Ψ_{1} ⊔ Ψ_{2})$ , then $\begin{aligned} {Po}_{max} (s ⊨ Ψ_{1} ⊔ Ψ_{2}) &= ⋁_{π \in S\text{-}Paths (s)} ∥ Ψ_{1} ⊔ Ψ_{2} ∥_{S} (π) \\ &= ∥ Ψ_{2} ∥_{S} (s) \lor ⋁_{s_{1} \in S} ∥ Ψ_{1} ∥_{S} (s) \land P (s, α_{1}, s_{1}) \land x_{s_{1}} \\ &\leq ∥ Ψ_{2} ∥_{S} (s) \lor \max_{α \in Act (s)} \left\{ ⋁_{s_{1} \in S} ∥ Ψ_{1} ∥_{S} (s) \land P (s, α, s_{1}) \land x_{s_{1}} \right\} . \end{aligned}$ (37)

Conversely, there exist an action α ∈ Act (s) and strategies $S_{s_{1}}$ such that $right = ∥ Ψ_{2} ∥_{S} (s) \lor ⋁_{s_{1} \in S} ∥ Ψ_{1} ∥_{S} (s) \land P (s, α, s_{1}) \land {Po}_{S_{s_{1}}} (s_{1} ⊨ Ψ_{1} ⊔ Ψ_{2})$ , where right denotes the right-hand side of the last inequality in (37). We construct a strategy $S_{max} : S^{+} \to Act$ such that $S_{max} (s) = α$ and $S_{max} (s ω) = S_{s_{1}} (ω)$ for any ω ∈ S⁺. Then we have $right = {Po}_{S_{max}} (s ⊨ Ψ_{1} ⊔ Ψ_{2}) \leq {Po}_{max} (s ⊨ Ψ_{1} ⊔ Ψ_{2}) .$ (38)

This completes the proof.
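The recursive characterization used in this proof suggests a simple fixed-point iteration. The sketch below is not the strategy iteration algorithm of this paper; it is a minimal illustration that starts from ∥Ψ₂∥ and repeatedly applies x_s ← ∥Ψ₂∥(s) ∨ max_α ⋁_{s₁} ∥Ψ₁∥(s) ∧ P(s, α, s₁) ∧ x_{s₁} until nothing changes, recording for each state the action that produced its last strict improvement. The dictionary encoding, the function name `po_max_until` and the toy values are all hypothetical.

```python
def po_max_until(states, actions, trans, psi1, psi2):
    """Iterate the max-min fixed-point equation for Psi1 until Psi2.

    Returns (x, choice): x[s] is the stable value for state s and
    choice[s] is the action recorded at the last strict improvement
    (or an arbitrary default if the value never improved)."""
    x = dict(psi2)  # start from the degree to which Psi2 already holds
    choice = {s: actions[s][0] for s in states}  # arbitrary default action
    changed = True
    while changed:
        changed = False
        new_x = dict(x)
        for s in states:
            for a in actions[s]:
                for t, p in trans[(s, a)].items():
                    candidate = min(psi1[s], p, x[t])
                    if candidate > new_x[s]:
                        new_x[s], choice[s] = candidate, a
                        changed = True
        x = new_x
    return x, choice


# Toy PDS (hypothetical values): s0 can go directly to the goal with
# possibility 0.5, or via s1 with possibilities 0.9 and 0.8.
states = ["s0", "s1", "goal"]
actions = {"s0": ["a", "b"], "s1": ["a"], "goal": ["a"]}
trans = {
    ("s0", "a"): {"goal": 0.5},
    ("s0", "b"): {"s1": 0.9},
    ("s1", "a"): {"goal": 0.8},
    ("goal", "a"): {"goal": 1.0},
}
psi1 = {"s0": 1.0, "s1": 1.0, "goal": 1.0}
psi2 = {"s0": 0.0, "s1": 0.0, "goal": 1.0}

values, strategy = po_max_until(states, actions, trans, psi1, psi2)
print(values["s0"], strategy["s0"])  # 0.8 b
```

Because every value is a minimum of finitely many constants and the iterates never decrease, the loop terminates. The recorded choice depends only on the current state, so it is a memoryless strategy in the sense used in this paper.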

C. The proof of Proposition 2

Proof: We first show that the proposition holds for crisp states C and B. Equation (18) is proved by constructing a memoryless strategy $S_{max}$ . If s ∈ B or s ⊭ ∃ C ⊔ B (here, s ⊨ ∃ C ⊔ B means that there exists a path starting at s in the digraph of $M$ that satisfies C ⊔ B; for the definition of the digraph, see [19]), we choose an arbitrary action for each such state s. If s ∈ C ∧ s ∉ B and s ⊨ ∃ C ⊔ B, let Act_max (s) be the set of actions α ∈ Act (s) such that ${Po}_{max} (s ⊨ C ⊔ B) = ⋁_{t \in S} P (s, α, t) \land {Po}_{max} (t ⊨ C ⊔ B) .$ (39)

To ensure reachability of B via C in the GPKS induced from $M$ by $S_{max}$ , for s ⊨ ∃ C ⊔ B, let len (s) denote the length of a shortest path fragment from s to B via C. Since s ∈ C ∧ s ∉ B and s ⊨ ∃ C ⊔ B, it follows that len (s) > 0. By induction on n ≥ 1, we define the actions $S_{max} (s)$ for the states s with s ⊨ ∃ C ⊔ B and len (s) = n. If len (s) = n ≥ 1, we choose an action $S_{max} (s) \in {Act}_{max} (s)$ such that $P (s, S_{max} (s), t) > 0$ for some state t with t ⊨ ∃ C ⊔ B and len (t) = n - 1. This yields a memoryless strategy $S_{max}$ . Moreover, the possibilities for C ⊔ B under $S_{max}$ provide a solution of the following equation system:

• If s ∈ B, then Po_max (s ⊨ C ⊔ B) = 1;

• If s ⊭ ∃ C ⊔ B, then Po_max (s ⊨ C ⊔ B) = 0;

• If s ∈ C ∧ s ∉ B and s ⊨ ∃ C ⊔ B, then ${Po}_{max} (s ⊨ C ⊔ B) = ⋁_{t \in S} P (s, S_{max} (s), t) \land {Po}_{max} (t ⊨ C ⊔ B) .$ (40)

Since the vector (Po_max (s ⊨ C ⊔ B))_{s∈S} also solves the above equation system, this completes the proof of Equation (18).

Second, we show that the proposition holds for general fuzzy states C and B. Let $\{(C ⊔ B) (s) \mid (C ⊔ B) (s) > 0\} = \{λ_{1} < \dots < λ_{k}\},$ (41) then we have $C ⊔ B = ⋁_{i = 1}^{k} λ_{i} \land (C ⊔ B)_{λ_{i}} = ⋁_{i = 1}^{k} λ_{i} \land (C_{λ_{i}} ⊔ B_{λ_{i}}) .$ (42)

By the fuzzy set decomposition theorem [10], we have ${Po}_{max} (s ⊨ C ⊔ B) = ⋁_{i = 1}^{k} λ_{i} \land {Po}_{max} (s ⊨ C_{λ_{i}} ⊔ B_{λ_{i}}) .$ (43)

Since C_{λ_i} and B_{λ_i} are crisp states, by Equation (18), for each i there exists a memoryless strategy $S_{max_{i}}$ such that ${Po}_{max} (s ⊨ C ⊔ B) = ⋁_{i = 1}^{k} λ_{i} \land {Po}_{S_{max_{i}}} (s ⊨ C_{λ_{i}} ⊔ B_{λ_{i}}) .$ (44)

Since k is finite, there always exists $j \in [1, k] \cap ℕ$ that attains the maximum in (44), i.e., ${Po}_{max} (s ⊨ C ⊔ B) = λ_{j} \land {Po}_{S_{max_{j}}} (s ⊨ C_{λ_{j}} ⊔ B_{λ_{j}}) .$ (45)

We choose $S_{max} = S_{max_{j}}$ , then $S_{max}$ is the required memoryless strategy. Next, let us prove that the strategy $S_{max}$ is the optimal memoryless strategy, i.e., ${Po}_{S_{max}} (s ⊨ C ⊔ B) = {Po}_{max} (s ⊨ C ⊔ B) .$ (46)

Obviously, ${Po}_{S_{max}} (s ⊨ C ⊔ B) \leq {Po}_{max} (s ⊨ C ⊔ B)$ .

Conversely, ${Po}_{S_{max}} (s ⊨ C ⊔ B) = ⋁_{i = 1}^{k} λ_{i} \land {Po}_{S_{max}} (s ⊨ C_{λ_{i}} ⊔ B_{λ_{i}}) \geq λ_{j} \land {Po}_{S_{max}} (s ⊨ C_{λ_{j}} ⊔ B_{λ_{j}}) = λ_{j} \land {Po}_{S_{max_{j}}} (s ⊨ C_{λ_{j}} ⊔ B_{λ_{j}}) = {Po}_{max} (s ⊨ C ⊔ B)$ .

This completes the proof.
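The cut decomposition used in the second part of this proof can be checked numerically on a small fuzzy set. The sketch below rebuilds a fuzzy set A from its crisp λ-cuts via A(s) = ⋁_λ λ ∧ A_λ(s), where A_λ = {s | A(s) ≥ λ}; the function names `cut` and `rebuild_from_cuts` and the sample values are hypothetical.

```python
def cut(fuzzy_set, lam):
    """Crisp lambda-cut: the elements whose membership is at least lam."""
    return {s for s, degree in fuzzy_set.items() if degree >= lam}


def rebuild_from_cuts(fuzzy_set):
    """Rebuild the membership degrees as the supremum of the levels
    lam whose cut contains the element (0.0 if no cut contains it)."""
    levels = sorted({d for d in fuzzy_set.values() if d > 0})
    return {
        s: max((lam for lam in levels if s in cut(fuzzy_set, lam)), default=0.0)
        for s in fuzzy_set
    }


# Toy fuzzy state (hypothetical membership degrees).
A = {"s1": 0.0, "s2": 0.4, "s3": 0.9}
print(rebuild_from_cuts(A) == A)  # True
```

Only the finitely many positive membership degrees need to be used as levels, which mirrors the finite index set λ₁ < … < λ_k in the proof.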

The MATLAB program was run on a PC with an Intel i5-10400F 2.90 GHz CPU, 16 GB of RAM and 64-bit Windows 10.

References

1. Marler R.T. and Arora J.S., Survey of multi-objective optimization methods for engineering, Structural and Multidisciplinary Optimization 26 (2004), 369–395.

2. Huang H.Z., Gu Y.K. and Du, An interactive fuzzy multi-objective optimization method for engineering design, Engineering Applications of Artificial Intelligence 19(5) (2006), 451–460.

3. Jin and Yang, Monotonicity theorem for the uncertain fractional differential equation and application to uncertain financial market, Mathematics and Computers in Simulation 190 (2021), 203–221.

4. Jin, Yang, Xia, et al., Reliability index and option pricing formulas of the first-hitting time model based on the uncertain fractional-order differential equation with Caputo type, Fractals 29(01) (2021), 2150012.

5. Tian, Jin, Yang, et al., Reliability analysis of the uncertain heat conduction model, Computers and Mathematics with Applications 119 (2022), 131–140.

6. Schmidt K.W. and Boutalis Y.S., Fuzzy Discrete Event Systems for Multiobjective Control: Framework and Application to Mobile Robot Navigation, IEEE Transactions on Fuzzy Systems 20(5) (2012), 910–922.

7. Boutalis and Schmidt, Multi-objective decision making using fuzzy discrete event systems: A mobile robot example, 18th Mediterranean Conference on Control and Automation (2010), 575–580.

8. Yang S.X., Li, Meng M.-H. and Liu P.X., An embedded fuzzy controller for a behavior-based mobile robot with guaranteed performance, IEEE Transactions on Fuzzy Systems 12(4) (2004), 436–446.

9. Singh A.P., Yadav S.P. and Singh S.K., A multi-objective optimization approach for DEA models in a fuzzy environment, Soft Computing 26(6) (2022), 2901–2912.

10. Analysis of Fuzzy Systems, Science Press, Beijing, 2005.

11. Lin and Ying, Modeling and control of fuzzy discrete event systems, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 32(4) (2002), 408–415.

12. Cao and Ying, Observability and decentralized control of fuzzy discrete-event systems, IEEE Transactions on Fuzzy Systems 14(2) (2006), 202–216.

13. Deng and Qiu, Bifuzzy discrete event systems and their supervisory control theory, IEEE Transactions on Fuzzy Systems 23(6) (2015), 2107–2121.

14. Qiu, Supervisory control of fuzzy discrete event systems: a formal approach, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 35(1) (2005), 72–88.

15. and Li, Model checking of linear-time properties based on possibility measure, IEEE Transactions on Fuzzy Systems 21(5) (2013), 842–854.

16. Li and Ma, Computation tree logic model checking based on possibility measures, Fuzzy Sets and Systems 262 (2015), 44–59.

17. and Ma, Quantitative computation tree logic model checking based on generalized possibility measures, IEEE Transactions on Fuzzy Systems 23(6) (2015), 2034–2047.

18. Saffiotti, The uses of fuzzy logic in autonomous robot navigation, Soft Computing 1(4) (1997), 180–197.

19. Baier and Katoen J.-P., Principles of Model Checking, The MIT Press, 2008.

20. Clarke, Grumberg and Peled, Model Checking, The MIT Press, 1999.

21. Quantitative model checking of linear-time properties based on generalized possibility measures, Fuzzy Sets and Systems 320 (2017), 17–39.

22. Lei and Li, Computation tree logic model checking based on multi-valued possibility measures, Information Sciences 485 (2019), 87–113.

23. and Wei, Possibilistic fuzzy linear temporal logic and its model checking, IEEE Transactions on Fuzzy Systems 29(7) (2021), 1899–1913.

24. Liu, He and Li, Computation tree logic model checking over possibilistic decision processes under finite-memory scheduler, in National Conference of Theoretical Computer Science, Springer, 2021, 75–88.

25. Nardi and Stachniss, Uncertainty-Aware Path Planning for Navigation on Road Networks Using Augmented MDPs, 2019 International Conference on Robotics and Automation (ICRA) (2019), 5780–5786.

26. Pei, An, Liu and Wang, An improved dyna-q algorithm for mobile robot path planning in unknown dynamic environment, IEEE Transactions on Systems, Man and Cybernetics: Systems, 2021.

27. Konar, Chakraborty I.G., Singh S.J., Jain L.C. and Nagar A.K., A deterministic improved q-learning for path planning of a mobile robot, IEEE Transactions on Systems, Man, and Cybernetics: Systems 43(5) (2013), 1141–1153.

28. Zadeh L.A., Fuzzy sets, Information and Control 8(3) (1965), 338–353.

29. Zadeh L.A., Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1(1) (1978), 3–28.

30. Howard R.A., Dynamic programming and markov processes, 1960.

31. Stan, Suciu and Potolea, Smart driving methodology for connected cars, International Conference on System Theory, Control and Computing (ICSTCC), IEEE (2019), 608–613.

32. Mao, Dou, Yang, Tian and Zong, Fuzzy disturbance observer-based adaptive sliding mode control for reusable launch vehicles with aeroservoelastic characteristic, IEEE Transactions on Industrial Informatics 16(2) (2020), 1214–1223.

33. Phan, Bab-Hadiashar, Fayyazi, Hoseinnezhad, Jazar R.N. and Khayyam, Interval Type 2 Fuzzy Logic Control for Energy Management of Hybrid Electric Autonomous Vehicles, IEEE Transactions on Intelligent Vehicles 6(2) (2021), 210–220.

34. Asadi, Bellmann, Mohamed, Lim C.P., Khosravi and Nahavandi, Adaptive Motion Cueing Algorithm using Optimized Fuzzy Control System for Motion Simulators, IEEE Transactions on Intelligent Vehicles. doi: 10.1109/TIV.2022.3147862.

35. Pan, Li, Cao and Li, Reachability in fuzzy game graphs, IEEE Transactions on Fuzzy Systems 25(4) (2016), 972–984.

36. Alexopoulos and Griffin P.M., Path planning for a mobile robot, IEEE Transactions on Systems, Man, and Cybernetics 22(2) (1992), 318–322.

37. Ying and Lin, Online self-learning fuzzy discrete event systems, IEEE Transactions on Fuzzy Systems 28(9) (2019), 2185–2194.

38. Ying and Lin, Learning Fuzzy Automaton’s Event Transition Matrix When Post-Event State Is Unknown, IEEE Transactions on Cybernetics 52(6) (2022), 4993–5000.

39. Ngatchou, Zarei and El-Sharkawi, Pareto Multi Objective Optimization, Proceedings of the 13th International Conference on, Intelligent Systems Application to Power Systems (2005), 84–91.