This work concerns Markov chains evolving on a denumerable sate space, which is endowed with a non-negative reward function with finite support. In this context, the problem of determining the Varadhan function, given by the exponential growth rate of the aggregated rewards, is studied. The main results in this direction are expressed in terms of the idea of asystem of local Poisson equations, and can be summarized as follows: (i) the Varadhan function is determined by one of such systems, and (ii) if a finite set is accessible form any state, then a system of local Poisson equations exits.
Let be a Markov chain with transition matrix on a denumerable state space S. Given a running reward function , this work is concerned with the characterization of the exponential grow rate of the aggregated rewards , which determine the following Varadhan function:
where
and stands for the expectation operator given that the initial state is . Within the theory of large deviations, this function plays a crucial role to analyze the rate of convergence of the empirical measure of the Markov chain ; see, for instance, [7] or [11–13]. In the context of stochastic optimal control, this functional has been the fundamental instrument to formulate the class of risk-sensitive average control problems; see, for instance, [8–10,19], or [18] where discrete-time Markov decision chains are studied, or [2,14] or [3], which concern continuous-time models.
When the state space S is finite and communicating, i.e., for all there exists a positive integer such that , the Perron–Frobenious theory of positive matrices [22] yields that is constant, say g, and that is the largest eigenvalue of the matrix . Moreover, g is characterized as the unique real number for which there exists a function satisfying the following Poisson equation:
a result that can be traced back to the seminal paper by Howard and Matheson [17], where it was also shown that, if the above equation is satisfied, then the superior limit in (1.1) can be replaced by limit. Under appropriate ergodicity conditions, this result was extended in [21] to Markov chains with general states space and continuous or discrete time parameter, whereas a formulation of the corresponding eigenvalue problem for differential operators associated with diffusions was given in [15]. The analysis of the risk sensitive ergodic problem for diffusions was analyzed by Kaise and Nagai [20].
The communication condition described above is essential to characterize the Varadhan’s function in terms of the Poisson equation (1.3). Indeed, under the unichain assumption that has only one recurrent class but the set of transient states is non-empty, it is possible to have that is not constant, and then it cannot be characterized by a single Poisson equation; moreover, even if takes only a single value, it is not generally determined via Eq. (1.3) [5]. On the other hand, still within the context of a finite state space, it was shown in [6] that is generally characterized by a system of local Poisson equations, a result that can be roughly described as follows: There exists a partition of the state space such that, on , is constant, say , this value depends only on the rewards earned while the chain stays in , and is characterized by an equation similar to (1.3). This notion is extended in Definition 2.1 to the presented context of a denumerable state space, and the main problem studied in this note is the following:
To determine conditions on the reward and transition structures so that the Varadhan function can be characterized in terms of a system of local Poisson equations.
This problem will be analyzed under the assumption that the reward function is non-negative and has finite support. Within this framework, the main conclusions of the paper, which are formally stated in Section 2, can be described as follows:
the Varadhan function can be obtained from a system of local Poisson equations, and
if some finite subset F of the state space is accessible from any initial state, then a system of local Poisson equations exists.
The approach used below to establish these results relies on the discounted method, which is based on contractive operators on the space of bounded functions on S, and the corresponding fixed points are used to construct the components of a system of local Poisson equations. The discounted technique is the fundamental tool used in [8–10,16,19], where controlled Markov chains endowed with the average index were studied under risk-aversion. The particular usage of the discounted procedure in this note is motivated by the results in [1], where a general characterization of the optimal risk-sensitive average cost was established for controlled models with finite sate space.
The organization of the paper is as follows: In Section 2 the basic structural conditions of the paper are formally expounded as Assumption 2.1, the idea of a system of local Poisson equations is introduced in Definition 2.1, and the main conclusions of this work are stated as Theorems 2.1 and 2.2. The first of these results is the verification theorem, stating that the Varadhan function can be obtained form a system of local Poisson equations, whereas the second one establishes that, under Assumption 2.1, such a system certainly exists. Also, and example illustrating the fundamental idea in Definition 2.1 is presented. Next, Section 3 is dedicated to establish the verification theorem, whereas in Section 4 the family of discounted operators is introduced, and the corresponding fixed points are used in Sections 5 and 6 to construct the components of a system of local Poisson equations. Finally, the exposition concludes in Section 7 with a proof of the existence result in Theorem 2.2.
Notation. For integers a and b, is an infix notation for . If f is a real valued function, the corresponding supremum norm is denoted by , that is,
whereas the class of all real valued and bounded functions defined on a set is denoted by , so that belongs to if and only . The indicator function associated to an event A is denoted by and, without explicit reference, a relation involving random variables holds almost surely with respect to the underlying probability measure.
Local Poisson equations and main results
In this section the main conclusions on the determination of the Varadhan function will be stated. The subsequent analysis will be performed under the following assumption, whose formulation involves the idea of first return time to a non-empty subset B of the state space:
where by convention, the minimum of the empty set is ∞; when is a singleton, the simpler notation is used in place of .
The reward function R is non-negative and has finite support, i.e.,
There exists a finite setsuch that, with probability 1, F is accessible from every initial state, that is,
The accessibility property in Assumption 2.1(ii) is a weak form of the Doeblin condition, which prescribes the existence of a finite set F such that ; a generalized version of that requirement, which is referred to as the simultaneous Doeblin condition, has been intensively used to analyze the risk-neutral average criterion in controlled Markov chains [23].
Under Assumption 2.1 the chain may be transient but, even in that case, the Varadhan function may be positive [4].
The results of this note on the characterization of the Varadhan function involve the idea of system of local Poisson equations, which is now introduced.
A vector of triplets
is a system of local Poisson equations for the reward function R and the transition matrix P if, and only if, the following requirements (i)–(v) are satisfied:
is a partition of S.
For each , , and satisfy that
and
For each , the set is closed with respect to the transition matrix P, that is,
For each , the pair satisfies the following local Poisson equation:
For each ,
where for each positive integer n, is the expectation operator with respect to the measure on defined by
Observe that the left-hand side of (2.6) is positive, so that
From this point, an induction argument using the Markov property yields that, for every positive integer n, when , so that the denominator in (2.8) is always positive, and then the measure is well defined.
The idea in Definition 2.1 is an extension of the corresponding notion formulated in [6]. In this last paper Markov chains with finite state space were considered, and just conditions (i)–(iv) were required for a system of local Poisson equations. Indeed, when the state space is finite, the fifth part in Definition 2.1 follows from the conditions (i)–(iv). To verify this assertion, assume that S is finite and note that, for each , (2.9) yields that the set is non-empty (and finite), so that
setting , it is not difficult to verify that the above display and (2.8) together lead to , and then (2.7) holds.
Assume that condition (iv) in Definition 2.1 is satisfied by the vector of triplets in (2.3). In that case, it will be shown in Lemma 3.1 below that, for every positive integer n and ,
Using this fact, it will be now verified that the requirement (2.7) is equivalent to
To achieve this goal, note (2.10) and the definition of the measure in (2.8) together yield that, for every and ,
Therefore,
and it follows that (2.7) is equivalent to (2.11).
The main results of this note are stated in the following two theorems. The first one establishes that the Varadhan function can be determined from a system of local Poisson equations.
(Verification).
Letbe a Markov chain with transition matrixon a denumerable state space S, and letbe such that Assumption2.1(i) holds. Ifis a system of local Poisson equations in the sense of Definition2.1, thenMoreover, for each,
A complement to this theorem is the following existence result.
If the reward functionand the transition lawsatisfy Assumption2.1, then a system of local Poisson equations for R and P exists.
Before proceeding with the proof of the above theorems, an example illustrating the idea of a system of local Poisson equations will be analyzed.
On the state space , consider the Markov chain with transition matrix determined by
whereas for each positive integer k,
Now, define the reward function by
First, it will be verified that Assumption 2.1 is satisfied in this example.
Since the reward function is non-negative and has finite support, it is sufficient to show that (2.2) occurs with . To achieve this goal, have a glance at (2.12) and note that whereas
so that
To conclude, let the Markov chain associated to the transition law , and for each non-negative integer n, set
It follows from (2.12) and (2.13) that, for each non-negative integer n, conditional on , , takes the values and with probability when , whereas with probability 1 if . Therefore, is a symmetric random walk on S with absorbing barrier , so that for every . Since is equivalent to , it follows that
a property that via (2.16) leads to (2.2). □
Combining the above proposition with Theorem 2.2, it follows that there exists a system of local Poisson equations for the model in Example 2.1. To provide an explicit instance of such a system, first note that, since 0 is an absorbing state and ,
On the other hand, (2.12), (2.14) and (2.15) together yield that , so that , and then
Similarly,
Next, define the partition , , of the state space by
and, for each , define the function on the set by
and
Consider now a fixed state , where or . In this case and , by (2.13), and then , a relation that immediately leads to , so that
This last relation and (2.17)–(2.22) together imply that
With the notation in (2.17)–(2.22),is a system of local Poisson equations for the model in Example2.1.
It will be shown that the five conditions in Definition 2.1 are satisfied by :
By (2.17)–(2.19) the relation occurs, whereas condition (2.5) holds in the present case with , by (2.23).
Combining (2.20) with the specification of the transition law in (2.12) and (2.13), it follows that and are closed with respect to the transition law ; of course is also closed.
It will be verified that, for each ,
In this context, the left-hand side of the above equation is 1, by (2.17) and (2.21). On the other hand, using that for , by (2.14) and (2.21), note that , where the second equality is due to the fact that is closed with respect to . Consequently, (2.24) holds when .
Note that , since , whereas (2.12) and (2.14) yield that . Therefore, by (2.18),
Next, let be arbitrary, and suppose that . In this case (2.13) yields that . Since and is closed, using that it follows that , and then (2.22) implies that
By the Markov property,
and
whereas
see (2.22) for the second equality. Combining the last four displays it follows that
Since for , the above equality and (2.25) together yield that (2.24) holds when , whereas the case can be established along similar lines.
Via Remark 2.2(ii), it is sufficient to show that
for . Recalling that and for , it follows that the above relation holds for . Now, consider the case and observe that, conditionally on , the variables and coincide almost surely, by (2.12), and then
Therefore, using that and , by (2.12) and (2.14), it follows that
Now, let be arbitrary. Using that the reward function is null on , note that the Markov property and the above display together yield that
These two last displays together yield that
and (2.26) follows combining this relation with the equality ; see (2.18) and (2.19). Thus, satisfies the five conditions in Definition 2.1, so that is a system of local Poisson equations. □
The remainder of the paper is dedicated to prove Theorems 2.1 and 2.2. The proof of the verification theorem, which will be presented in the following section, is grounded on the Markov property. On the other hand, the proof of Theorem 2.2 is somewhat involved. The argument is based on contractive (discounted) operators on the space of bounded functions defined on S, whose fixed points are used to construct the different components of a system of local Poisson equations. After presenting the necessary preliminaries in Sections 4–6, the existence result will be proved in Section 7.
Proof of the verification theorem
In this section Theorem 2.1 will be established. Throughout the subsequent analysis Assumption 2.1 is enforced, and
stands for a system of local Poisson equations in the sense of Definition 2.1. The starting point to the proof of the verification theorem is the following.
For each, the following relation holds for every positive integer n:
The argument is by induction. Note that (2.6) can be equivalently written as
showing that (3.1) holds when . Suppose now that (3.1) is valid for a positive integer n, let be arbitrary and observe that the Markov property yields that
where the second equality is due to (2.6). Thus,
an equality that combined with the induction hypothesis yields that (3.1) holds with instead of n, completing the induction argument. □
Note that the proof of Lemma 3.1 depends only on property (iv) of a system of local Poisson equations.
The second instrument that will be used to derive Theorem 2.1 is the following:
The Varadhan functionsatisfies thatsee (1.1) and (1.2)
Before proceeding with the proof of this result, it will be combined with Lemma 3.1 to establish the verification theorem.
Let be arbitrary, select and observe that for every positive integer n
where the second inequality is due to the non-negativity of R, and the third one stems form the relation for every ; see (2.5). Via Lemma 3.1, the above display yields that , and then . Combining this relation with Theorem 3.1 it follows that
and the conclusion follows, since is arbitrary. □
The remainder of the section is dedicated to establish Theorem 3.1. The argument relies on the following two lemmas. The first one shows that is a local rate, in the sense that it equals the grow rate of the aggregated rewards while the system stays in the set .
For every,and
Let , and the positive integer n be arbitrary. Recalling that , note that Lemma 3.1 and (2.5) together yield that
a relation that immediately leads to
On the other hand, as already noted in Remark 2.2(ii), the property in Definition 2.1(v) is equivalent to
Since is arbitrary, these two last displays together imply the desired conclusion in (3.3). To verify the property (3.4), select and, using that set is closed with respect to the transition matrix P, by condition (iii) in Definition 2.1, note that
so that (3.3) implies that
where the inequality is due to the non-negativity of R. Now, (3.4) follows combining this last display with the relation (2.4) in Definition 2.1(ii). □
For eachthe following assertion holds: Given, there existssuch that, for every positive integer n
Let and be arbitrary, and note that Lemma 3.2 implies that there exists a positive integer such that
Now define the real number by
and note that implies that
Recalling that is arbitrary, these three last displays together yield that
To conclude, it will be shown that can be selected independently of . To achieve this goal, consider the following two exhaustive cases:
().
In this context for every . Thus, recalling that (by (3.4)), it follows that (3.5) holds with .
().
In this context set
and observe that, for an arbitrary positive integer n, (3.6) leads to
Next, select and note that for , so that
Next, given a positive integer , observe that the Markov property yields that
where, using that on the event , the first inequality is due to (3.7), and the second one stems from the non-negativity of . Thus,
and combining this inequality with (3.8) it follows that
Since is arbitrary and the numbers and are non-negative, the above relation yields that
Thus, recalling that positive integer n is arbitrary, this last display and (3.7) together yield that (3.5) is also valid in the present case. □
Let be arbitrary and, for each , define
where are the non-negative numbers in Lemma 3.3, so that the relations
hold. It will be verified, by induction, that the following assertion is valid for every positive integer :
Suppose that.
In this case and . Using that the set is closed with respect to the transition matrix P, by Definition 2.1(iii), it follows that for every
and then (3.11) follows from Lemma 3.3 with .
Using (2.4) and (3.10), observe that (3.11) yields that
Next, let and the positive integer n be arbitrary, and observe that
Recalling that is closed with respect to the transition matrix P, observe that the following equalities hold -almost surely:
Combining these facts with the previous display it follows that
Observe now that, by Lemma 3.3, the inclusion yields that
where (3.10) was used to set the second inequality. On the other hand, for every , an application of the Markov property leads to
where the inequality is due to the induction hypothesis. Therefore,
where the third inequality is due to Lemma 3.3, and the fourth one stems from (2.4) and (3.9). Combining the above display with (3.13) and (3.14) it follows that
and then, since and the positive integer n are arbitrary,
Since , this last display and (3.12) together yield that assertion (3.11) is valid with instead of i, completing the induction argument. To conclude, let be arbitrary, and note that (3.11) implies that
so that
and the conclusion of Theorem 3.1 follows, since is arbitrary. □
Contractive operators
The remainder of the paper is dedicated to establish Theorem 2.2, and the necessary tools are presented in this and the following two sections. As already mentioned, the approach relies on the discounted operators introduced below which, in the context of a finite state space, were used in Cavazos-Cadena and Hernández-Hernández [6] to approximate the Varadhan function, and in [1] to characterize the optimal average value function for controlled Markov chains under risk-aversion.
Given , the operator is specified as follows: For each ,
From this definition, the following monotonicity and α-homogeneity properties of an operator can be obtained:
If are such that , then ;
for each and .
Combining these two properties with the relation , it follows that
that is,
so that, for each , the operator is a contractive on the space endowed with the supremum norm. Therefore, by Banach’s fixed point theorem, there exists a unique function satisfying ; note that (4.2) yields that
where is the nth fold composition of with itself. Observe that, by (4.1), the equality is equivalent to
whereas for each . Hence, (4.2) yields that
and combining this relation with the inequality , it follows that
In the following sections, the family of fixed points will be used to build a system of local Poisson equations. The construction relies on the relations on the state space introduced in the following definition. First, using Cantor’s diagonal method, select a sequence such that the following requirements are satisfied:
and
where the second inclusion follows from (4.5). Throughout the remainder of the paper the sequence satisfying these two last properties will be kept fixed.
The relation ‘⪰’ on the state space S is defined as follows: For each ,
is read as ‘x dominates y’.
For ,
From this definition and (4.7) it is not difficult to see that
‘⪰’ is a total order in S: for every , at least one of or is valid;
‘⪰’ is transitive, that is for every states ,
‘∼’ is an equivalence relation which can be more directly described as follows:
so that
see (4.7). Also observe that
Ifis non-negative, thenfor every;
For each state, the set of all states that are dominated by x is closed with respect to the transition matrix. More explicitly,
Let be arbitrary. As already noted, , so that when R is non-negative. In this case the monotonicity of yields that for every positive integer n, and the desired conclusion follows via (4.3).
Let be arbitrary, and note that (4.4) yields that ; combining this relation with the inequality obtained from (4.5), it follows that . From this point, replacing α by and taking the limit as m goes to ∞, (4.6) and (4.7) together imply that
and then
see Definition 4.2. □
In the remainder of the paper the relations in Definition 4.2 will be used to build a system of local Poisson equations.
Partition of the state space and local rates
In this and the following section the relations introduced in Definition 4.2 will be used to construct the components of the triplets comprising a system of local Poisson equations. The present objective is to build the first two components, namely, a partition of the state space and the numbers (local rates) associated to each set in the partition as in Definition 2.1. Throughout the remainder of the paper, even without explicit reference, it is supposed that Assumption 2.1 holds, and the finite set is defined by
The starting point of the argument, which concerns the equivalence relation in Definition 4.2(ii), is stated in the following theorem and can be roughly described as follows: Every state is equivalent to some member of G.
Under Assumption2.1, for eachthere exists a statesuch that. Moreover, if, then.
The proof of this result relies on the following lemma. First, note that the inclusion and (2.1) together yield that , and then Assumption 2.1 yields that
Also, set
where, using that the reward function R is non-negative, the inclusion follows from (4.5) and Lemma 4.1.
Suppose that Assumption2.1holds. In this case, for eachthe following assertions (i) and (ii) are valid:
For each positive integer n and,
For each
The proof is by induction. Combining (4.4) with (5.3) it follows that
and using that , it follows that (5.4) occurs with when . Now suppose that (5.4) holds for certain positive integer n and apply the Markov property to obtain that
where the third equality is due to (5.6). Thus,
observing that on the event , the above equality can be written as
To conclude, note that on the event , so that
Combining these two last displays with the induction hypothesis, it follows that (5.4) also holds with instead of n, completing the induction argument.
By (5.1), the support of R is contained in G. Since when , by (2.1), it follows that for every positive integer n, and then part (i) yields that
Next, observe that
whereas (4.5) and the non-negativity of (see (5.3)) together yield that
Using (5.2), taking the limit as n goes to ∞ in (5.8), via the bounded convergence theorem the two last display together lead to (5.5). □
Since ‘∼’ is an equivalence relation, it is sufficient to prove the assertion in the theorem for x outside G.
Let be arbitrary. Define the set by
and note that, since on the event and , (5.2) implies that . It will be proved that
To achieve this goal, first note that , since , and recalling that is non-negative, use Lemma 5.1(ii) to obtain that for every , so that
Proceeding by contradiction, assume that for every . In this case Definition 4.2(i) and (4.7) together yield that
since is finite, after replacing α by in (5.11) and taking the limit as m goes to ∞ in the resulting inequality, the above displayed condition leads to , which is a contradiction, so that (5.10) holds. Now observe that (4.9) and Lemma 4.1(ii) yield that when is positive, so that, by (5.9),
since , this last display and (5.10) together imply that for some . To conclude, note that the inclusion and (5.9) yield that . □
Recalling that G is a finite set, Theorem 5.1 yields that the family of equivalence classes with respect to ‘∼’ is finite and consists of at most k sets, where
and, for each set , stands for the number of elements of B. Throughout the remainder,
and, with this notation, Theorem 5.1 immediately implies the following conclusion.
For each,. Moreover,for each.
Next, a total order in the family of equivalence classes is introduced.
If and are two different equivalence classes with respect to the equivalence relation ‘∼’ in Definition 4.2(ii), then
By (4.7), (4.11) and (4.12), this relation ‘≺’ is well defined and is a strict total order, that is, if and are two different equivalences classes, then exactly one of or occurs; also, note that the above definition and (4.12) together yield that
Without loss of generality, throughout the remainder the different equivalence classes are labeled in such a way that
For every, the setis closed with respect to the transition matrix, that is,
Let and be fixed. Now, suppose that the integer j is such that and let be arbitrary. In this case , by (5.15), and then (5.14) yields that
so that , by Definition 4.2(i). From this point, Lemma 4.1 yields that . Hence,
and the conclusion follows, since is a partition of the state space. □
Now, a number will be associated to each equivalence class .
For , select a point and define the local rate by
Since is a convergent sequence when , (4.6) and (5.3) yield that the value of does not depend on the choice of , that is,
Let and be arbitrary. In this case Definition 5.1 and (5.15) together yield that the sequence converges to ∞, and then it is bounded below, that is
for a certain constant M. Multiplying both sides of this inequality by it follows that and, after taking the limit as m goes to ∞, (4.6) and (5.18) yield that . □
Relative value functions
In this section a function will be associated with each set in (5.13). For each , the number will be obtained from the relative value of with respect to an appropriately chosen element of , and taking the limit as m goes to ∞. To begin with, consider an index and, using Corollary 1 and the finiteness of G, given an integer m select such that for every . Via Cantor’s diagonal method, after taking an appropriate subsequence of , without loss of generality it can be assumed that each sequence is constant, say , so that, for every ,
The following extensions of this property will be useful.
For each, the following assertions hold: There exist positive integersandsuch thatand
Let the index i between 1 and k and be arbitrary. In this case, it was established in the proof of Theorem 5.1 that the inequality (5.11) occurs, and it follows that
Using that is closed with respect to the transition matrix P, the inclusion and (5.2) yield that . Since is a partition of the state space, it follows that
where the inequality is due to (6.1). Since and for , it follows from (5.14) that there exists a positive integer such that
and combining these relation with the two previous displays it follows that
since is arbitrary and is positive it follows that
and (6.2) follows combining this relation and (6.1). Finally, setting , (6.2) and (6.4) together imply assertion (6.3). □
For each index i between 1 and k, the relative value function on is defined by
The following result establishes two basic properties of the functions .
For each,and the following local Poisson equation is satisfied:
Let be arbitrary. In this case , and then Definition 6.1, (4.10) and (6.2) together yield that is a finite non-positive number, establishing (6.5). On the other hand, replacing α by in (4.4) and multiplying both sides of the resulting equality by it follows that
where the second equality is due to the fact that is closed with respect to the transition matrix P. Next observe that (6.3) yields that, for m large enough,
and that (5.14) and (5.15) imply that
whereas
by Definition 6.1. Taking the limit as m goes to ∞ in (6.7), via the bounded convergence theorem the three last displays together lead to (6.6). □
The system of local Poisson equations
In this section Theorem 2.2 will be finally proved. Define
where the sets are as in (5.13), whereas the local rates and the relative value functions are as in Definitions 5.2 and 6.1, respectively.
It will be verified that the five properties in Definition 2.1 are satisfied by . Note that condition (i) holds, by (5.13), and that Lemmas 5.3 and 6.2 together show that condition (ii) is also valid, whereas properties (iii) and (iv) follow from Lemmas 5.2 and 6.2. To verify that the property in part (v) of Definition 2.1 is satisfied by , it must be proved that, for every index i between 1 and k,
where, for each positive integer n, is the expectation operator with respect to the measure on given in (2.8) with instead of . Combining Lemma 3.1 and Remark 2.2(ii), the above property is equivalent to
a relation that will be now verified: Let be as in (6.1) and, given a positive integer m, replace α by in (4.4) and multiply both sides of the resulting inequality by to obtain, via Definition 5.2, that
Next, set
and, using that is closed with respect to the transition matrix P, by Lemma 5.2, note that (7.4) yields that
Now, select an arbitrary but fixed integer m such that , where is the positive integer in (6.2), so that
a property that combined with the previous display leads to
From this point, an induction argument similar to the one used to establish Lemma 3.1 yields that, for every positive integer n and
and then, recalling that this yields to
Since , it follows that, for every integer and ,
an inequality that immediately leads to
Thus, since is arbitrary, after taking the limit as m goes to ∞ in the left-hand side of the above relation, the desired inequality (7.3) follows via (5.18). Summarizing, it has been shown that in (7.1) is a system of local Poisson equations for the reward function R and the transition law P, completing the proof. □
Footnotes
Acknowledgements
The authors are sincerely grateful to the reviewer by his careful reading of the original manuscript, constructive criticism and helpful suggestions to improve the paper. This work was partially supported by the PSF Organization under Grant No. 2-450-14, by PRODEP under Grant No. 17332-CA-23 and by CONACYT under Grant No. Laboratorio LEMME.
References
1.
A.Alanís-Durán and R.Cavazos-Cadena, An optimality system for finite average Markov decision chains under risk-aversion, Kybernetika48 (2012), 83–104.
2.
A.Biswas, V.S.Borkar and K.Suresh Kumar, Risk-sensitive control with near monotone cost, Appl. Math. Opt.62 (2009), 145–163.
3.
V.S.Borkar and K.Suresh Kumar, Singular perturbations in risk-sensitive stochastic control, SIAM J. Control Optim.48 (2010), 3675–3697.
4.
R.Cavazos-Cadena, The risk-sensitive Poisson equation for a communicating Markov chain on a denumerable state space, Kybernetika45 (2009), 716–736.
5.
R.Cavazos-Cadena and D.Hernández-Hernández, A characterization of exponential functionals in finite Markov chains, Math. Methods Oper. Res.60 (2004), 399–414.
6.
R.Cavazos-Cadena and D.Hernández-Hernández, A system of Poisson equations for a non-constant Varadhan functional on a finite state space, Appl. Math. Opt.53 (2006), 101–119.
7.
A.Dembo and O.Zeitouni, Large Deviations Techniques and Applications, Jones and Bartlett, Boston, MA, 1993.
8.
G.B.Di Masi and L.Stettner, Risk-sensitive control of discrete time Markov processes with infinite horizon, SIAM J. Control Optim.38 (1999), 61–78.
9.
G.B.Di Masi and L.Stettner, Infinite horizon risk sensitive control of discrete time Markov processes with small risk, Syst. Control Lett.40 (2000), 15–20.
10.
G.B.Di Masi and L.Stettner, Infinite horizon risk sensitive control of discrete time Markov processes under minorization property, SIAM J. Control Optim.46 (2007), 231–252.
11.
M.D.Donsker and S.R.Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, I, Commun. Pur. Appl. Math.28 (1975), 1–47.
12.
M.D.Donsker and S.R.Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, II, Commun. Pur. Appl. Math.28 (1975), 279–301.
13.
M.D.Donsker and S.R.Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, III, Commun. Pur. Appl. Math.29 (1975), 389–461.
14.
W.H.Fleming and W.M.McEneaney, Risk sensitive control on an infinite horizon, SIAM J. Control Optim.33 (1995), 1881–1915.
15.
W.H.Fleming and S.J.Sheu, Asymptotics of the principal eigenvalue and eigenfunction of a nearly first-order operator with large potential, Ann. Probab.25 (1997), 1953–1994.
16.
D.Hernández-Hernández and S.I.Marcus, Existence of risk sensitive optimal stationary polices for controlled Markov processes, Appl. Math. Opt.40 (1999), 273–285.
N.Ichihara, Phase transitions for controlled Markov chains on infinite graphs, Preprint, 2014.
19.
A.Jaśkiewicz, Average optimality for risk sensitive control with general state space, Ann. Appl. Probab.17 (2007), 654–675.
20.
H.Kaise and H.Nagai, Ergodic type Bellman equations of risk-sensitive control with large parameters and their singular limits, Asymptotic Analysis20 (1999), 279–299.
21.
I.Kontoyiannis and S.P.Meyn, Spectral theory and limit theorems for geometrically ergodic Markov processes, Ann. Appl. Probab.13 (2003), 304–362.
22.
C.D.Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, Philadelphia, 2000.
23.
M.L.Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, New York, 2005.