Abstract
Critical dynamics research on recurrent neural networks (RNNs) is of both theoretical importance and practical significance. Driven by application requirements, the critical dynamical behavior of RNNs has recently drawn special attention. The critical condition is that a discriminant matrix M1(Γ) associated with an RNN is nonnegative definite. Owing to the essential difficulty of the analysis, only a few critical results are available so far. Furthermore, nearly all existing dynamical results impose a diagonally nonlinear requirement on the activation mapping, i.e., the activation mapping G must have the form G(x) = (g1(x1), g2(x2), ⋯, gN(xN))^T. This requirement stems from the difficulty of analyzing the energy function; it is very strict and limits the applications of RNNs. In this paper, under the critical condition, some new global asymptotic stability conclusions are presented for RNNs without the diagonally nonlinear requirement on the activation mappings. The results presented here not only improve substantially upon the existing relevant critical stability results, but also provide further insight into the essential dynamical behavior of RNNs and enlarge their fields of application.
Introduction
Recurrent neural networks are neural networks with feedback loops, in which neurons send feedback signals to each other. An RNN is a dynamical system whose state evolves over time. Its internal memory can be used to process arbitrary sequences of inputs, which makes RNNs suitable for modeling dynamic processes arising in learning, pattern recognition, image processing, associative memory, and optimization. The foundation of RNNs lies in their dynamical properties, such as global convergence, asymptotic stability and exponential stability. The analysis of these dynamical behaviors is therefore a first and necessary step for any practical design and application of RNNs.
In recent years, considerable effort has been devoted to neural network modeling and control, and numerous stability analyses have been carried out for different RNN models with or without time delay (see, e.g., [1–6] and the references therein). Further, the stability results of RNNs can be used in different analyses and applications. For example, adaptive neural networks have been proposed for a class of nonlinear second-order multi-agent systems, where they greatly reduce the online computation burden, and for stabilizing uncertain nonlinear strict-feedback systems with full-state constraints [7]. A fuzzy-neural network has been used to approximate the unknown functions of a class of nonlinear stochastic systems [8]. Radial basis function neural networks have been utilized to approximate an unknown nonlinear function [9].
In the study of RNNs, two fundamental modeling approaches are commonly adopted: using either the neuron states or the local field states of the neurons as the basic variables that describe the dynamical evolution of the network. Correspondingly, local field RNNs and static RNNs represent the two fundamental model classes in current neural network research [10], which are respectively modeled by
Models (1)–(2) cover most of the existing continuous-time RNNs as special cases, e.g., Hopfield-type neural networks, brain-state-in-a-box neural networks, bound-constraints optimization solvers, recurrent back-propagation neural networks, mean-field neural networks, convex optimization solvers, recurrent correlation associative memory neural networks, cellular neural networks, etc.
For a given recurrent neural network, define M1(Γ) = L^{-1}DΓ - (ΓW + W^TΓ)/2, where D and Γ are positive definite diagonal matrices and W is the weight matrix of the network. Surveying the existing stability results for RNNs, one finds that most of them concern exponential stability under the following condition: there exists a positive definite diagonal matrix Γ such that M1(Γ) is positive definite, where L = diag{L(g1), L(g2), …, L(gN)} with each L(gi) > 0 being the Lipschitz constant of gi, and G = (g1, g2, …, gN)^T is the activation mapping of the network. On the other hand, [11, 12] have proved that an RNN is globally exponentially unstable if there is a positive definite diagonal matrix Γ such that M2(Γ) = l^{-1}DΓ - (ΓW + W^TΓ)/2 is negative definite, where l = diag{l(g1), l(g2), …, l(gN)} with each l(gi) > 0 being the inverse Lipschitz constant of gi, i.e., |gi(t) - gi(s)| ≥ l(gi)|t - s| for all s, t ∈ R^N. By the definitions of the Lipschitz constant and the inverse Lipschitz constant, we have l(gi) ≤ L(gi) and, in the sense of nonnegative definiteness, M1(Γ) ≤ M2(Γ). Hence M1(Γ) > 0 is sufficient for the global exponential stability of RNNs, and M2(Γ) ≥ 0 is necessary for RNNs to have globally stable dynamics. The question then arises: what kind of dynamical behavior occurs when M1(Γ) ≤ 0 and M2(Γ) ≥ 0? Since M1(Γ) > 0 is a sufficient condition for the network to be stable, and M1(Γ) ≥ 0 directly implies M2(Γ) ≥ 0, special attention has recently been paid to the dynamical behavior under the condition M1(Γ) ≥ 0. This condition is called the critical condition, and the dynamics analysis under it is called the critical analysis. The critical condition thus marks the essential gap between stability and instability for RNNs.
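As a hedged illustration of the critical condition, the following Python sketch forms M1(Γ) = L^{-1}DΓ - (ΓW + W^TΓ)/2 for diagonal L, D and Γ and classifies it by the principal-minor criterion (a symmetric matrix is nonnegative definite exactly when all of its principal minors are nonnegative, and positive definite exactly when all leading principal minors are positive). The function names and the 2×2 weight matrices are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def discriminant_matrix(lipschitz, d, gamma, w):
    """Return M1 = L^{-1} D Gamma - (Gamma W + W^T Gamma) / 2.

    lipschitz, d and gamma are the diagonals of L, D and Gamma;
    w is a square weight matrix given as a list of rows.
    """
    n = len(w)
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sym = (Fraction(gamma[i]) * Fraction(w[i][j])
                   + Fraction(gamma[j]) * Fraction(w[j][i])) / 2
            m[i][j] = -sym
        m[i][i] += Fraction(d[i]) * Fraction(gamma[i]) / Fraction(lipschitz[i])
    return m


def determinant(a):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in a]
    n = len(a)
    det = Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return det


def classify(m):
    """Classify a symmetric matrix using principal minors (small n only)."""
    n = len(m)
    leading = [determinant([row[:k] for row in m[:k]]) for k in range(1, n + 1)]
    if all(v > 0 for v in leading):
        return "positive definite"
    minors = [
        determinant([[m[i][j] for j in idx] for i in idx])
        for size in range(1, n + 1)
        for idx in combinations(range(n), size)
    ]
    if all(v >= 0 for v in minors):
        return "nonnegative definite but singular"
    return "not nonnegative definite"


ones = [1, 1]  # hypothetical choice L = D = Gamma = I
print(classify(discriminant_matrix(ones, ones, ones, [[0, 1], [-1, 0]])))
print(classify(discriminant_matrix(ones, ones, ones, [[1, 1], [-1, 0]])))
print(classify(discriminant_matrix(ones, ones, ones, [[2, 0], [0, 0]])))
```

The second weight matrix gives M1(Γ) = diag(0, 1), which is the situation the critical analysis targets: nonnegative definite without being positive definite. Enumerating all principal minors is exponential in N, so this check is meant only for small illustrative matrices.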
The goal of the critical analysis is to find the least restrictive conditions on the network that still guarantee stability. To extend the fields of application, and especially to relax the structural design of RNNs, it is important to study their critical dynamics. However, a meaningful critical dynamics analysis is by no means easy, since it is much more difficult than the dynamics analysis under the non-critical condition, i.e., M1(Γ) > 0. The study of the dynamical behavior of RNNs in the critical case is also valuable in both theory and application, because the critical condition characterizes the essential boundary between the stability region and the instability region of RNNs [14].
Up to now, there have been only a few critical stability and convergence analyses of RNNs. For RNNs with the hyperbolic tangent activation function, the global asymptotic stability and global exponential stability of the unique equilibrium point under some specific conditions of the form M1(Γ) ≥ 0 were established in [15]. The authors of [16] obtained the global exponential stability of RNNs with a projection operator under the condition that I - W is nonnegative definite (a special case of M1(Γ) ≥ 0). In [12], the authors proved that an RNN with a sigmoidal activation mapping has a globally attractive equilibrium state and that, when W is quasi-symmetric (i.e., there exists a positive definite diagonal matrix Γ such that ΓW is symmetric), an RNN with the nearest point projection activation mapping is globally convergent on a region defined by the network. The quasi-symmetry requirement on W in [12] was removed in [17, 18]. Further, the authors of [19] showed that an RNN with a general projection mapping is globally convergent under the condition M1(Γ) + P ≥ 0 (here M1(Γ) < 0 and P ≥ 0) if the nonlinear norm defined by the network is bounded, and further studies of such RNNs were conducted in [13, 20–23].
On the other hand, almost all of the dynamics conclusions in the literature rest on the very strong hypothesis that the activation mappings are diagonally nonlinear. This requirement arises because the derivative of a constructed energy function always produces an inner product term, which is hard to handle without the diagonally nonlinear property. The requirement is a serious limitation in applications, since in practice each gi should respond to all xj, not only to the corresponding xi. The requirement on the activation mappings is therefore quite strict and is at odds with both the biological background and practical applications.
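To make the distinction concrete, the sketch below contrasts a diagonally nonlinear mapping (componentwise tanh) with one that is not diagonally nonlinear (the nearest point projection onto the Euclidean unit ball). Both mappings and all helper names are assumptions chosen for illustration, not definitions taken from the paper. Changing only x2 leaves the first output of the diagonal mapping untouched but changes the first output of the projection.

```python
import math


def diagonal_tanh(x):
    """Diagonally nonlinear mapping: output i depends on x[i] only."""
    return [math.tanh(xi) for xi in x]


def ball_projection(x):
    """Nearest point projection onto the closed Euclidean unit ball."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    scale = 1.0 if norm <= 1.0 else 1.0 / norm
    return [scale * xi for xi in x]


x = [2.0, 0.0]
x_shifted = [2.0, 3.0]  # same first component, different second component

# Change in the FIRST output component caused by changing x2 only.
diag_change = abs(diagonal_tanh(x)[0] - diagonal_tanh(x_shifted)[0])
proj_change = abs(ball_projection(x)[0] - ball_projection(x_shifted)[0])
print(diag_change, proj_change)
```

For the projection, the first output moves from 1 to 2/√13, so g1 genuinely depends on the whole vector x.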
In this paper, we address the question of what critical dynamical behavior RNNs exhibit without the diagonally nonlinear requirement. That is, we consider the activation mapping in its general form G(x) = (g1(x), g2(x), ⋯, gN(x))^T and networks under the critical condition M1(Γ) ≥ 0. Since many commonly used activation mappings naturally possess the uniformly anti-monotone property defined in [24], e.g., the sigmoidal mapping, nearest point projection mapping, linear saturating mapping, signum mapping, symmetric multi-valued step mapping, etc., we focus on the critical dynamical behavior of RNNs with the uniformly anti-monotone property. By applying the Lyapunov functional method and Barbalat's lemma, we obtain critical global asymptotic stability results for such RNNs without the diagonally nonlinear requirement. The results in Section 3 only require that the network satisfy the critical condition, i.e., M1(Γ) ≥ 0, and need no additional prerequisites such as the quasi-symmetry of W [12], the boundedness of the nonlinear norm or matrix norm [19], or the other restrictions on RNNs imposed in [15–22]. Furthermore, because the critical dynamics results obtained here hold for general activation mappings without the diagonally nonlinear limitation, they apply directly to, and substantially improve the existing results for, RNNs such as Hopfield-type neural networks, recurrent back-propagation (ReBP) neural networks, brain-state-in-a-box/domain type neural networks, cellular neural networks, bidirectional associative memory neural networks, bound-constraints optimization neural networks, and so on.
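For reference, the form of Barbalat's lemma usually invoked in Lyapunov-type arguments can be stated as follows. This is the standard textbook statement; the comments on how it pairs with an energy function E are a sketch of the typical usage, not a reproduction of the proof given in this paper.

```latex
\textbf{Lemma (Barbalat).} Let $f:[0,+\infty)\to\mathbb{R}$ be uniformly
continuous and suppose that
\[
  \lim_{t\to+\infty}\int_{0}^{t} f(s)\,\mathrm{d}s
  \quad\text{exists and is finite.}
\]
Then $\lim_{t\to+\infty} f(t) = 0$.

% Typical use with f(t) = dE(u(t))/dt:
% if E(u(t)) is nonincreasing and bounded below, then
\[
  \int_{0}^{t}\frac{\mathrm{d}E(u(s))}{\mathrm{d}s}\,\mathrm{d}s
  = E(u(t)) - E(u(0))
\]
% has a finite limit as t -> +infinity, so uniform continuity of
% dE(u(t))/dt yields dE(u(t))/dt -> 0.
```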
For a nonlinear activation mapping G : R^N → R^N, the range and fixed-point set of G are respectively defined by R(G) = {G(x) : x ∈ R^N} and Fix(G) = {x ∈ R^N : G(x) = x}.
In previous analyses of RNNs, especially in the critical analysis, the diagonally nonlinear property has been a necessary requirement for handling the derivative of a constructed energy function. However, this requirement has an obvious limitation: each gi should in general depend on the whole vector x, not only on its component xi.
Many commonly used activation mappings naturally possess the uniformly anti-monotone property, for example, the sigmoidal mapping, nearest point projection mapping, signum mapping, symmetric multi-valued step mapping, linear saturating mapping, etc. Consequently, most of the RNN models widely applied in science and engineering belong to the family with the uniformly anti-monotone property.
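The following sketch checks such a property numerically for one mapping. It assumes that α-UAM means ⟨G(x) - G(y), x - y⟩ ≥ α∥G(x) - G(y)∥² for all x, y; this form is an assumption inferred from the bound ∥G(v1) - G(v2)∥ ≤ α^{-1}∥v1 - v2∥ used in Example 1, and the precise definition is the one in [24]. The mapping tested is the nearest point projection onto the Euclidean unit ball, for which the inequality with α = 1 is the well-known firm nonexpansiveness of projections onto closed convex sets. All function names are hypothetical.

```python
import math
import random


def ball_projection(x):
    """Nearest point projection onto the closed Euclidean unit ball."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    scale = 1.0 if norm <= 1.0 else 1.0 / norm
    return [scale * xi for xi in x]


def uam_gap(g, x, y, alpha):
    """Return <g(x)-g(y), x-y> - alpha * ||g(x)-g(y)||^2.

    Under the assumed definition, g is alpha-UAM when this is >= 0 for all x, y.
    """
    gx, gy = g(x), g(y)
    inner = sum((a - b) * (c - d) for a, b, c, d in zip(gx, gy, x, y))
    squared = sum((a - b) ** 2 for a, b in zip(gx, gy))
    return inner - alpha * squared


rng = random.Random(0)  # fixed seed so the check is reproducible
worst_gap = min(
    uam_gap(
        ball_projection,
        [rng.uniform(-3.0, 3.0) for _ in range(3)],
        [rng.uniform(-3.0, 3.0) for _ in range(3)],
        alpha=1.0,
    )
    for _ in range(2000)
)
print(worst_gap >= -1e-12)  # small tolerance for floating-point rounding

# alpha cannot exceed 1 here: inside the ball the projection is the identity.
print(uam_gap(ball_projection, [0.5, 0.0, 0.0], [0.0, 0.0, 0.0], alpha=2.0))
```

Random sampling can only fail to find a counterexample; it does not prove the property, which for projections follows from the convexity of the target set.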
Without loss of generality, throughout this paper we assume that each L(gi) > 0 and let L = diag{L(g1), L(g2), …, L(gN)}, which is called the minimum Lipschitz matrix of the operator G(x) = (g1(x), g2(x), …, gN(x))^T.
In this section, we establish global asymptotic stability results for RNNs with α-UAM operators, for both system (1) and system (2), under the critical condition that the discriminant matrix defined by the network is positive semi-definite. None of the results requires the mapping G to be diagonally nonlinear, although this property is crucial for most existing critical dynamical analyses. We consider networks of form (1) first. For simplicity, we denote the range of the nonlinear activation operator, i.e., R(G), by Θ.
Suppose that Θ is bounded, closed and convex in R^N. For any y ∈ R(G), define T(y) = WG(y) + q and let Fix(T) be the fixed-point set of T; then, by Brouwer's fixed point theorem, T has at least one fixed point y* ∈ Fix(T). That is, T^{-1}(0), the set of equilibrium states of (1), is not empty. The following theorem establishes global asymptotic stability for system (1).
We will complete the proof in several steps.
Step 1) We show that dE (u (t))/dt ≤ 0.
Now, a direct calculation using system (1) gives
Since M1(Γ) is nonnegative definite, we have
Since G is α-UAM, we then have
Because ΓD is the identity matrix, by (8) and (9) we get
Step 2) We want to show that
Since dE(u(t))/dt is a continuous function and u(t) remains in the bounded and closed set Θ, it follows that dE(u(t))/dt is a uniformly continuous function of t. Furthermore, dE(u(t))/dt ≤ 0 by (10), which, combined with the fact that E(u(t)) is bounded, implies that
So we have
Consequently,
Then it can be deduced that
So, the result
Step 3) To show that
By the differential equation theory, we can also solve the following integral equation:
Let
By Step 2),
Therefore, we conclude from (8) that, when t > s ≥ T_ε,
Letting t→ + ∞ in the above inequality yields
Step 4) Finally, we prove that the equilibrium state of system (1) is unique.
Without loss of generality, we assume that v* ≠ u* is also an equilibrium state of system (1), i.e.,
Similarly, because ɛ is arbitrary, the equation
Theorem 1 gives a critical global asymptotic stability result for system (1) without the diagonally nonlinear requirement, that is, for an activation mapping of the form G(x) = (g1(x), g2(x), ⋯, gN(x))^T. Correspondingly, we can also deduce a critical global asymptotic stability conclusion for the RNN system (2).
In this paper, we have shown that if there exists a positive diagonal matrix Γ such that ΓD is the identity matrix (here D is a matrix defined by the network) and the critical condition holds (i.e., M1(Γ) is nonnegative definite), then the neural network has a unique equilibrium state, which is globally asymptotically stable. The results remove the requirement that the activation function be diagonally nonlinear.
In this section, we provide two illustrative examples to demonstrate the validity of the critical stability results formulated in the previous section. Since most of the existing dynamical results for RNNs assume that the activation mappings of the network are diagonally nonlinear, most of the known results in the literature cannot be applied here.
The weight matrix and the external bias vector are defined as follows:
In this example, it is easy to verify that M1(Γ) is not positive definite for any positive diagonal matrix Γ, so Lemma 3 in [11] is not applicable. Moreover, since the example is built on a general projection operator, the results that require diagonal nonlinearity, i.e., Theorem 3 in [12], Theorem 2 in [18], and Corollary 3.2 in [19], cannot be used either. We now show that the conditions of Theorem 1 are satisfied. In this example, α = 1 by the definition of G, and L = I, as the following argument shows. Let v1 ≠ v2 ∈ R^3. Since G is α-UAM, we get ∥G(v1) - G(v2)∥_2 ≤ α^{-1}∥v1 - v2∥_2 = ∥v1 - v2∥_2. In addition, ∥G(v1) - G(v2)∥_2 ≥ ∥g1(v1) - g1(v2)∥_2, so ∥g1(v1) - g1(v2)∥_2 ≤ ∥v1 - v2∥_2 and hence L(g1) ≤ 1. On the other hand, taking v1 = (-δ, 0, -1) and v2 = (δ, 0, -1), we have
Figure 1 depicts the time responses of the state variables of system (14) starting from random initial points in W(Θ) + q.

Transient behaviors of RNN in system (14) with random initial points u0 ∈ W (Θ) + q.
Obviously, the activation mapping G is a general projection operator and for any positive diagonal matrix Γ, M1 (Γ) is not positive definite in the present case.
Theorem 1 in [18] and Theorem 3.1 in [19] establish the global convergence of system (16) only under the condition that G is diagonally nonlinear, so neither can be used here. Furthermore, Lemma 2 in [11] is not applicable, because it guarantees the global convergence of system (16) only when M1(Γ) is positive definite and G is diagonally nonlinear. We show that the conditions of Corollary 1 are satisfied. As in Example 1, here α = 1 and L = I. Taking D = Γ = I, it is clear that L^{-1}DΓ - (ΓW + W^TΓ)/2 is nonnegative definite. Hence, by Corollary 1, system (16) is globally asymptotically stable on Θ (here Θ = {v ∈ R^6 : ∥v∥_2 ≤ 1}). Figure 2 depicts the time responses of the neural state variables of the system starting from random initial points in Θ.
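Trajectories of this kind can be produced with a simple forward Euler integration, sketched below under loudly stated assumptions: the static model is taken to have the form dx/dt = -x + G(Wx + q) with D = I (the model equations are not reproduced in this text), G is the nearest point projection onto the Euclidean unit ball, and the 2×2 matrix W and vector q are hypothetical values, not the data of system (16). With these values I - (W + W^T)/2 = diag(0, 2), which is nonnegative definite but not positive definite.

```python
import math
import random


def ball_projection(v):
    """Nearest point projection onto the closed Euclidean unit ball."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    scale = 1.0 if norm <= 1.0 else 1.0 / norm
    return [scale * vi for vi in v]


def simulate_static_rnn(w, q, x0, step=0.01, n_steps=2000):
    """Forward Euler for the ASSUMED model dx/dt = -x + G(W x + q)."""
    x = list(x0)
    trajectory = [x]
    for _ in range(n_steps):
        field = [sum(wij * xj for wij, xj in zip(row, x)) + qi
                 for row, qi in zip(w, q)]
        g = ball_projection(field)
        # x + step * (-x + g) is a convex combination of x and g for step <= 1.
        x = [xi + step * (gi - xi) for xi, gi in zip(x, g)]
        trajectory.append(x)
    return trajectory


W = [[1.0, 2.0], [-2.0, -1.0]]  # hypothetical weights (critical case)
Q = [0.5, 0.0]                  # hypothetical bias
rng = random.Random(1)
start = ball_projection([rng.uniform(-1.0, 1.0) for _ in range(2)])
path = simulate_static_rnn(W, Q, start)

max_norm = max(math.hypot(*p) for p in path)
print(max_norm <= 1.0 + 1e-9)  # the discrete trajectory stays in the unit ball
```

Because each Euler step is a convex combination of the current state and a point of the unit ball, a trajectory that starts in the ball cannot leave it. The sketch makes no claim about convergence; plotting the components of `path` against time would give a picture analogous to the transient responses reported in the figures.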

Transient behaviors of RNN in system (16) with random initial points x0 ∈ Θ.
We have developed critical stability results for both local field RNNs and static RNNs with α-UAM mappings and without the diagonally nonlinear requirement. By exploring some intrinsic properties of RNNs and combining the Lyapunov functional method with Barbalat's lemma, we have proved that the RNN has a unique equilibrium state, which is globally asymptotically stable, under the condition that a discriminant matrix M1(Γ) is nonnegative definite. Compared with existing dynamics analyses, the results here extend most of the conclusions achieved so far and, most importantly, substantially relax the restrictions on the activation mappings.
Several problems remain challenging: how to improve the critical stability results without the diagonal nonlinearity requirement by other techniques, e.g., matrix theory and nonlinear analysis methods, and how to obtain new results that avoid the inner product term in the derivative of the energy function under a looser condition than ΓD being the identity matrix. Moreover, in this paper we did not discuss the exponential stability of RNNs, nor did we consider the dynamics of RNNs under more general critical conditions, e.g., M1(Γ) + P ≥ 0. All of these questions are meaningful for applications of RNNs and are under our current investigation.
Acknowledgments
This research was supported by NSFC Nos. 11471006 and 11101327, the National Science and Technology Program of China (No. 2015DFA81780), and the Fundamental Research Funds for the Central Universities (No. xjj2017126), and was partly supported by the HPC Platform, Xi’an Jiaotong University.
