Using asymptotic analysis of the Laplace transform, we establish almost sure divergence of certain integrals and derive logarithmic asymptotic of small ball probabilities for quadratic forms of Gaussian diffusion processes. The large time behavior of the quadratic forms exhibits little dependence on the drift and diffusion matrices or the initial conditions, and, if the noise driving the equation is not degenerate, then similar universality also holds for small ball probabilities. On the other hand, degenerate noise leads to a variety of different asymptotics of small ball probabilities, including unexpected influence of the initial conditions.
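Given a Gaussian process $\mathbf{y}=\mathbf{y}(t)$, $t\geq 0$, with values in $\mathbb{R}^d$ and a non-negative-definite symmetric matrix $Q$, how large and how small can the random variable
$$\int_0^t \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds$$
be? For example,
(Q1) Does the integral
$$\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds$$
diverge with probability one? (While the expected value of the integral is easy to study, the almost-sure divergence is non-trivial unless y is ergodic.)
(Q2) What is the asymptotic of
$$\mathbb{P}\left(\int_0^t \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds\leq \varepsilon\right)$$
as $\varepsilon\to 0$?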
Questions of this type arise in the analysis of various statistical estimators [12, Chapter 17] and in the study of Gaussian measures on Hilbert spaces [10]. While the scalar case ($d=1$) has received a lot of attention, much less is known in the multi-dimensional setting.
The objective of this paper is to investigate questions (Q1) and (Q2) for a particular class of multi-dimensional Gaussian processes, namely, Gaussian diffusions. Let be a stochastic basis with an m-dimensional standard Brownian motion w, and let and be constant non-random matrices; . Let be the solution of
$$d\mathbf{y}(t)=A\,\mathbf{y}(t)\,dt+B\,d\mathbf{w}(t),\quad t>0, \tag{1.1}$$
with initial condition $\mathbf{y}(0)$ that is independent of w and is a Gaussian vector with mean m and covariance $K$. We refer to y as a multi-dimensional Gaussian diffusion; a popular alternative name is a multi-dimensional Ornstein–Uhlenbeck process.
Throughout the paper, a column vector is denoted by a lower-case bold letter (Greek or Latin), e.g. or y, whereas an upper-case regular Latin letter, e.g. A, means a matrix; the identity matrix is I. For a matrix A, means transposition. The same notation ⊤ will also be used for column vectors to produce a row vector. The notation means that A is a symmetric non-negative-definite matrix: and ; for such matrices, denotes the non-negative-definite symmetric square root of A. The trace of a square matrix A is and the determinant is . The Euclidean norm of a vector and the induced matrix norm are both denoted by . The notation , as in , means the derivative of the function (scalar, vector or matrix) with respect to t. Zero matrix and zero number are both 0; zero vector is .
One way to address both (Q1) and (Q2) is to investigate the Laplace transform function
$$\Psi(t;Q)=\mathbb{E}\exp\left(-\int_0^t \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds\right) \tag{1.2}$$
and then to apply a suitable Tauberian theorem. This approach requires asymptotic analysis of the function Ψ in various regimes, which, in turn, requires a workable closed-form expression for Ψ. Fortunately, the paper [8] and the book [9] provide all the necessary tools to carry out the asymptotic analysis of (1.2) when y satisfies (1.1) and the noise is non-degenerate in the sense that the matrix $BB^{\top}$ has rank d. The resulting answers to (Q1) and (Q2) turn out to be rather universal in the sense that there is minimal dependence on the drift matrix A and the initial condition $\mathbf{y}(0)$. In particular:
(A1) The integral
$$\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds$$
diverges with probability one (as long as it is not identically zero, which only happens when $Q=0$). In fact, this divergence takes place under a much weaker condition than non-degeneracy of $BB^{\top}$, namely, when the pair $(A,B)$ is controllable; see Theorem 4.3. The proof relies on the large-time asymptotic of Ψ, that is, analysis of $\Psi(t;Q)$ as $t\to\infty$.
(A2) If and the covariance matrix K of the initial condition is non-singular, then
Theorem 4.5 provides the general result, which covers and a singular matrix K. The proof relies on the high frequency asymptotic of Ψ, that is, analysis of $\Psi(t;\lambda Q)$ as $\lambda\to\infty$.
The paper is organized as follows. Section 2 presents the necessary background on the Laplace transform and small ball probabilities. Section 3 summarizes the main properties of the solution of (1.1), including the formula for Ψ. Section 4 contains the main contributions of the paper, related to items (A1) and (A2) above. Section 5 demonstrates how a degenerate matrix $BB^{\top}$ can dramatically change (1.3).
Background
The main challenge in answering question (Q1) is often finding a rigorous proof of an “obvious” result. Question (Q2) presents a somewhat different challenge: getting useful information from the general answer. Indeed, after diagonalizing the matrix Q and expanding the process y in the eigenfunctions of its covariance operator, and under an additional assumption that , , we get
for some and i.i.d. standard normal . Then, as shown in [16],
in the sense that, as , the ratio of the expressions on the left and right sides of (2.1) approaches 1. The function is defined implicitly by the relation
and this implicit dependence on ε is the main drawback of (2.1) in concrete applications.
Sometimes (2.1) can lead to an explicit asymptotic of the probability on the left-hand side (see [10, Section 6.1] and references therein), but when y is a multi-dimensional Gaussian diffusion, a completely different approach, based on asymptotic analysis of the function Ψ from (1.2), appears to be a better option. In particular, this approach makes it possible to handle both questions (Q1) and (Q2).
In fact, the large-time asymptotic of Ψ provides an immediate answer to (Q1).
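Proposition 2.1. If $\lim_{t\to\infty}\Psi(t;Q)=0$, then
$$\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds=+\infty$$
with probability one.
Proof. If $\lim_{t\to\infty}\Psi(t;Q)=0$ but
$$\mathbb{P}\left(\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds<\infty\right)>0,$$
then there is a number $N<\infty$ such that the event $\left\{\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds\leq N\right\}$ has probability $p>0$. Because $Q$ is non-negative definite, (1.2) implies $\Psi(t;Q)\geq p\,e^{-N}>0$ for all $t>0$, a contradiction. □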
The high frequency asymptotic of Ψ provides an answer to (Q2) via an exponential Tauberian theorem (Theorem 2.2). This theorem is a modification of [10, Theorem 3.5], which, in turn, is a modification of [1, Theorem 4.12.9].
Let ξ be a non-negative random variable. Thenholds if and only if
Technically, (2.3) is only logarithmic asymptotic of the probability and is not as strong as (2.1), but, for many applications, the logarithmic asymptotic is good enough, and, by providing an explicit dependence on ε, it is also much more useful.
We write (2.3) as
and say that the random variable ξ has the small ball rate
and the small ball constant
The two extreme cases of (2.3), corresponding to (infinite small ball rate) and (zero small ball rate), are a straightforward exercise in elementary probability.
Let ξ be a non-negative random variable. Then
is equivalent to.
is equivalent to.
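As a sanity check of the exponential Tauberian correspondence, the following sketch works through the simplest case, a one-dimensional standard Brownian motion $w$. It relies on two assumptions that are not spelled out above: the classical Cameron–Martin formula $\mathbb{E}\exp\left(-\lambda\int_0^t w^2(s)\,ds\right)=\left(\cosh(t\sqrt{2\lambda})\right)^{-1/2}$, and the de Bruijn-type form of the Tauberian theorem, in which $\ln\mathbb{E}e^{-\lambda\xi}\sim -c\,\lambda^{\alpha/(1+\alpha)}$ with $c=(1+\alpha)\alpha^{-\alpha/(1+\alpha)}\beta^{1/(1+\alpha)}$ corresponds to $\ln\mathbb{P}(\xi\leq\varepsilon)\sim-\beta\varepsilon^{-\alpha}$. The names `alpha`, `beta` and all function names are illustrative and need not match the notation of (2.2)–(2.4).

```python
import math


def log_cosh(x):
    """Overflow-safe log(cosh(x)) for x >= 0."""
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)


def log_laplace_bm(t, lam):
    """log E exp(-lam * int_0^t w(s)^2 ds) for a standard Brownian motion w.

    Assumption: the classical Cameron-Martin formula (cosh(t*sqrt(2*lam)))**(-1/2).
    """
    return -0.5 * log_cosh(t * math.sqrt(2.0 * lam))


def small_ball_constant(alpha, laplace_constant):
    """Solve c = (1+alpha) * alpha**(-alpha/(1+alpha)) * beta**(1/(1+alpha)) for beta.

    Assumption: de Bruijn-type exponential Tauberian correspondence.
    """
    factor = (1.0 + alpha) * alpha ** (-alpha / (1.0 + alpha))
    return (laplace_constant / factor) ** (1.0 + alpha)


t = 1.0
lam = 1.0e6
# High-frequency constant: -log(Psi) / sqrt(lam) approaches t / sqrt(2).
laplace_constant = -log_laplace_bm(t, lam) / math.sqrt(lam)
# With small ball rate alpha = 1 the small ball constant is t**2 / 8.
beta = small_ball_constant(1.0, t / math.sqrt(2.0))
print(laplace_constant, beta)
```

Under these assumptions the sketch recovers the familiar statement $\ln\mathbb{P}\left(\int_0^1 w^2(s)\,ds\leq\varepsilon\right)\sim-1/(8\varepsilon)$.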
Multi-dimensional Gaussian diffusions
The main object of study in this paper is the -valued process defined by (1.1). Here is a summary of the basic properties of y.
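Proposition 3.1. The solution of (1.1) is a Gaussian process
$$\mathbf{y}(t)=e^{At}\mathbf{y}(0)+\int_0^t e^{A(t-s)}B\,d\mathbf{w}(s) \tag{3.1}$$
with mean
$$\mathbb{E}\,\mathbf{y}(t)=e^{At}\mathbf{m} \tag{3.2}$$
and covariance matrix
$$e^{At}Ke^{A^{\top}t}+\int_0^t e^{As}BB^{\top}e^{A^{\top}s}\,ds. \tag{3.3}$$
The covariance matrix (3.3) has the following properties:
It is the solution of the initial value problem
When $K=0$, it is non-degenerate for every $t>0$ if and only if the pair $(A,B)$ is controllable: the $d\times(dm)$ matrix $[B,\ AB,\ \ldots,\ A^{d-1}B]$ has rank d;
If , the pair $(A,B)$ is controllable, and the eigenvalues of the matrix A have non-positive real parts, then, for every , there exist positive numbers such that, for all ,
Moreover, if the pair $(A,B)$ is controllable and all eigenvalues of A have strictly negative real parts, then Eq. (1.1) is ergodic, and the stationary distribution is Gaussian with mean zero and non-singular covariance matrix that is the unique solution of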
Direct computations show that the solution of (1.1) is (3.1), from which (3.2) and (3.3) immediately follow. The solution of (3.4) is unique and is indeed (3.3). The equivalence between non-degeneracy of , , and controllability of is well known (e.g. [9, Corollary 4.3.2]).
To establish (3.5), note that, for every unit vector ,
In particular, the smallest eigenvalue of is a non-decreasing function of t and, if all the eigenvalues of A have non-positive real parts, then the largest eigenvalue of is bounded above by . Then (3.5) holds with
Finally, controllability of and stability of A imply ergodicity of (1.1) [3, Theorem 9.1.1]. Existence, uniqueness and non-degeneracy of the solution of (3.6) follow from [9, Theorem 5.3.1]. This solution of (3.6) is
(cf. [9, Formula (5.3.3)]). In particular,
is the covariance matrix of the stationary distribution for the solutions of (1.1).
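The two structural conditions used in Proposition 3.1 are easy to check numerically. The sketch below is an illustration only: it tests controllability through the rank of the matrix $[B, AB, \ldots, A^{d-1}B]$ and, for a stable matrix A, computes the stationary covariance under the assumption that (3.6) is the standard Lyapunov equation $AR+RA^{\top}+BB^{\top}=0$. The function names and the sample matrices are hypothetical.

```python
import numpy as np


def kalman_matrix(A, B):
    """Return the d x (d*m) matrix [B, AB, ..., A^(d-1) B]."""
    d = A.shape[0]
    blocks = [B]
    for _ in range(d - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def is_controllable(A, B):
    """Rank test: the pair (A, B) is controllable iff the Kalman matrix has rank d."""
    return int(np.linalg.matrix_rank(kalman_matrix(A, B))) == A.shape[0]


def stationary_covariance(A, B):
    """Solve A R + R A^T + B B^T = 0 by vectorisation (A is assumed stable).

    Assumption: this Lyapunov equation is the form taken by (3.6).
    """
    d = A.shape[0]
    eye = np.eye(d)
    # Row-major vec: vec(A R) = kron(A, I) vec(R), vec(R A^T) = kron(I, A) vec(R).
    lhs = np.kron(A, eye) + np.kron(eye, A)
    rhs = -(B @ B.T).reshape(-1)
    return np.linalg.solve(lhs, rhs).reshape(d, d)


# Integrated Brownian motion: noise enters the second coordinate only.
A_int = np.array([[0.0, 1.0], [0.0, 0.0]])
print(is_controllable(A_int, np.array([[0.0], [1.0]])))  # noise reaches both coordinates
print(is_controllable(A_int, np.array([[1.0], [0.0]])))  # second coordinate is never excited

# A stable, controllable pair with degenerate noise still has a non-singular R.
A_st = np.array([[-1.0, 1.0], [0.0, -2.0]])
B_st = np.array([[0.0], [1.0]])
R = stationary_covariance(A_st, B_st)
print(R)
```

The second example shows the point of the proposition: even though $BB^{\top}$ has rank one, controllability makes the stationary covariance positive definite.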
Given a symmetric non-negative definite matrix , consider the function from (1.2). The key to computing the function Ψ is the algebraic Riccati equation
for the unknown matrix .
Assume that Eq. (3.7) has a symmetric solutionand define the processby
Then
In particular:
(1) Ifis non-random, thenwhere
(2) Ifis a Gaussian random vector with meanmand covariance K, thenwhereandare-by-block matricesand
Theorem 3.2 was essentially proved in [8]; the more recent results from [9] about solvability of (3.7) make it possible to remove the additional restriction (stability of A) used in [8]. There are two main steps in the proof: (a) getting (3.8) via a change of measure, which an interested reader can easily do using a Girsanov-type formula, such as [11, Formula (7.138)]; (b) evaluating the right-hand side of (3.8) using the equality
where is a Gaussian random vector with mean and covariance matrix R, and G is a symmetric matrix. □
Non-zero initial conditions do not increase the value of Ψ:
We fix t, Q, C. Because
the Gaussian random vectors
and
are independent and . By (3.8),
where
We now compute the expectation by conditioning on and using (3.14) twice. The result is
with some function Φ and a matrix N; the matrix N depends on K and t, but the function Φ does not depend on m. While this expression is not as explicit as (3.11), it does establish (3.15). Indeed, by definition,
so if the matrix N is not non-positive definite, then it would be possible to violate (3.16) with a suitable scaling of m. By considering , a similar argument shows that we also must have . □
Let us summarize the basic facts about the symmetric algebraic Riccati equation
with known matrices A, D, Q in such that and .
A standard assumption about (3.17) is that the pair $(A,D)$ is either controllable or stabilizable; it is also often assumed that the pair $(Q,A)$ is either observable or detectable. Below are practical definitions of these four concepts:
$(A,D)$ is controllable if the rank of the matrix $[A-zI,\ D]$ is equal to d for all complex numbers z [9, Theorem 4.3.3];
$(A,D)$ is stabilizable if the rank of the matrix $[A-zI,\ D]$ is equal to d for all complex numbers z with non-negative real part [9, Theorem 4.5.6(a)];
$(Q,A)$ is observable if $(A^{\top},Q^{\top})$ is controllable [9, Proposition 4.2.2];
$(Q,A)$ is detectable if $(A^{\top},Q^{\top})$ is stabilizable (the original definition).
In particular:
If D is invertible, then the pair $(A,D)$ is controllable for every matrix A.
If A is stable (all eigenvalues of A have negative real parts), then the pair $(A,D)$ is stabilizable for every matrix D and the pair $(Q,A)$ is detectable for every matrix Q.
The pair $(A,B)$ is controllable if and only if the pair $(A,BB^{\top})$ is controllable [9, Corollary 4.1.3]; recall that the definition of controllability of $(A,B)$ for a $d\times m$ matrix B is in Proposition 3.1.
For every matrix $F\in\mathbb{R}^{d\times d}$, the pair $(A,D)$ is controllable if and only if the pair $(A+DF,D)$ is controllable [9, Lemma 4.4.1].
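The four definitions above translate directly into rank tests. The following sketch assumes the Hautus-type form of the criteria (rank of $[A-zI,\ D]$) and uses the standard observation that the rank can drop only when z is an eigenvalue of A, so only finitely many values of z need to be checked. Function names and the tolerance are illustrative choices, and the eigenvalue-based check can be numerically delicate for defective matrices.

```python
import numpy as np


def hautus_rank_ok(A, D, z, tol=1e-9):
    """True if the matrix [A - z I, D] has full row rank d."""
    d = A.shape[0]
    M = np.hstack([A - z * np.eye(d), D])
    return int(np.linalg.matrix_rank(M, tol=tol)) == d


def is_controllable(A, D):
    # The rank of [A - z I, D] can only drop at eigenvalues of A.
    return all(hautus_rank_ok(A, D, z) for z in np.linalg.eigvals(A))


def is_stabilizable(A, D):
    # Only eigenvalues with non-negative real part matter.
    return all(hautus_rank_ok(A, D, z) for z in np.linalg.eigvals(A) if z.real >= 0)


def is_observable(Q, A):
    return is_controllable(A.T, Q.T)


def is_detectable(Q, A):
    return is_stabilizable(A.T, Q.T)


A = np.diag([-1.0, 1.0])              # one stable and one unstable direction
D_unstable = np.array([[0.0], [1.0]])  # acts on the unstable direction only
D_stable = np.array([[1.0], [0.0]])    # acts on the stable direction only

print(is_controllable(A, D_unstable), is_stabilizable(A, D_unstable))
print(is_controllable(A, D_stable), is_stabilizable(A, D_stable))
print(is_detectable(np.diag([0.0, 1.0]), A), is_detectable(np.diag([1.0, 0.0]), A))
```

For diagonal A and Q the last line reproduces the familiar rule of thumb: detectability holds when every zero diagonal entry of Q is matched by a negative diagonal entry of A.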
The following is the summary of the main results from [9] about Eq. (3.17).
Proposition 3.4.
(1) If the pair $(Q,A)$ is observable, then every symmetric solution of (3.17) is non-singular [9, Problem 7.11.15].
(2) If the pair $(A,D)$ is stabilizable, then there exists a symmetric solution $X_+$ of (3.17), such that , the eigenvalues of $A-DX_+$ have non-positive real parts, and $X_+\geq X$ for every symmetric solution X of (3.17); $X_+$ is called the maximal symmetric solution of (3.17) [9, Theorems 9.1.1 and 9.1.2].
(3) If $(A,D)$ is stabilizable and $(Q,A)$ is detectable, then the eigenvalues of $A-DX_+$ have strictly negative real parts [9, Theorem 9.1.2].
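A scalar example makes the role of the maximal solution concrete. The sketch below assumes that the symmetric algebraic Riccati equation has the standard form $XDX-XA-A^{\top}X-Q=0$, which for $d=1$ reduces to the quadratic $dx^2-2ax-q=0$; the exact normalization of (3.17) may differ, so the numbers are illustrative only. The helper names are hypothetical.

```python
import math


def scalar_riccati_solutions(a, d, q):
    """Return (x_minus, x_plus), the real solutions of d*x**2 - 2*a*x - q = 0.

    Assumption: scalar version of X D X - X A - A^T X - Q = 0 with d > 0, q >= 0.
    """
    if d <= 0.0 or q < 0.0:
        raise ValueError("this sketch requires d > 0 and q >= 0")
    root = math.sqrt(a * a + d * q)
    return (a - root) / d, (a + root) / d


def closed_loop(a, d, x):
    """The scalar analogue of the matrix A - D X."""
    return a - d * x


x_minus, x_plus = scalar_riccati_solutions(a=1.0, d=1.0, q=3.0)
print(x_minus, x_plus)                 # the maximal solution is the larger root
print(closed_loop(1.0, 1.0, x_plus))   # strictly negative: stabilizable and detectable

# With a = 0 and q = 0 the pair (q, a) is not detectable and the closed loop is only marginal.
print(closed_loop(0.0, 1.0, scalar_riccati_solutions(0.0, 1.0, 0.0)[1]))
```

In this scalar setting the closed-loop coefficient at the maximal solution equals $-\sqrt{a^2+dq}$, which is non-positive in general and strictly negative unless $a=0$ and $q=0$.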
Equation (3.7) can have more than one symmetric solution, and Theorem 3.2 indicates that any such solution can be used to compute . If the pair is controllable, then it is often convenient to take , so that (3.8) becomes
where
Indeed, Propositions 3.1 and 3.4 suggest, and the following result confirms, that representation (3.18) can have advantages over the more general (3.8).
Assume that the pairis controllable, letbe the maximal symmetric solution of (3.7), and consider the processfrom (3.19). DefineThen
If, in addition, the pairis detectable, then the processis ergodic; the stationary distribution ofis Gaussian with mean zero and covariance matrix
Everything follows from Propositions 3.1 and 3.4. In particular, to claim ergodicity, note that detectability of implies stability of , and, by [9, Lemma 4.4.1], controllability of implies controllability of . □
We will also need some basic facts about the Riccati differential equation
with constant square matrices A, D, Q.
Consider Eq. (3.21) under the assumptions that,, and the pairis controllable.
There exists a unique symmetric solutionof (3.21), andfor all [12, Lemma 16.3].
Ifis a symmetric solution ofwith, then, for all,and, as a consequence, [14, Theorem 1] and [13, Exercise 13.5.1].
Formula (3.11) provides an expression for the function Ψ in quadratures: there are no differential Riccati equations to solve, as, for example, in [17, Corollary 1] or [6, Section 4.1]. Still, further simplification of (3.11) or even of (3.9) is, in general, not possible, often because of complications related to the evaluation of (3.10) when the matrices A, B and C do not commute.
Below are two examples when the right-hand side of (3.9) can be simplified further. Both examples can be considered multi-dimensional analogues of the one-dimensional Ornstein–Uhlenbeck process
for which it is known [12, Lemma 17.3] that
Assume that,,,, and the pairis observable. Define the matrixas the symmetric non-negative-definite square root of. Then
The matrix Λ is invertible;
The functionhas the following representation:
With , Eq. (3.7) becomes or
Then
is the maximal symmetric solution of (3.25), and, because , the matrices Λ and are non-degenerate by Proposition 3.4.
The second example shows that, even when system (1.1) is not diagonal, the random variable can have the same distribution as
where are i.i.d. one-dimensional Ornstein–Uhlenbeck processes of the type (3.22); cf. [12, Lemma 17.5] when .
Assume that,,and, where,,, andis a skew-symmetric matrix:. Thenwhereis the right-hand side of (3.23), with
Equation (3.7) becomes
Substituting , , in (3.27) and using yields
With , equality (3.10) becomes
and (3.26) follows from (3.11). □
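For orientation, the one-dimensional Ornstein–Uhlenbeck benchmark behind both examples can be evaluated directly. The sketch below assumes the parametrization $dy=-a\,y\,dt+dw$, $y(0)=0$, and the normalization $\Psi(t;q)=\mathbb{E}\exp\left(-q\int_0^t y^2(s)\,ds\right)$, for which a Feynman–Kac computation gives $\Psi(t;q)=e^{at/2}\left(\cosh\gamma t+\frac{a}{\gamma}\sinh\gamma t\right)^{-1/2}$ with $\gamma=\sqrt{a^2+2q}$. The parametrization used in (3.22) and (3.23) may differ, and the function name is hypothetical.

```python
import math


def log_psi_ou(a, q, t):
    """log E exp(-q * int_0^t y(s)^2 ds) for dy = -a*y dt + dw, y(0) = 0, q > 0.

    Assumption: closed form exp(a*t/2) * (cosh(g*t) + (a/g)*sinh(g*t))**(-1/2),
    with g = sqrt(a**2 + 2*q).
    """
    if q <= 0.0:
        raise ValueError("this sketch requires q > 0")
    gamma = math.sqrt(a * a + 2.0 * q)
    r = a / gamma
    x = gamma * t
    # log(cosh(x) + r*sinh(x)), rewritten to avoid overflow for large x.
    log_bracket = x - math.log(2.0) + math.log((1.0 + r) + (1.0 - r) * math.exp(-2.0 * x))
    return 0.5 * a * t - 0.5 * log_bracket


# a = 0 recovers the Cameron-Martin formula for Brownian motion.
print(log_psi_ou(0.0, 0.5, 1.0), -0.5 * math.log(math.cosh(1.0)))

# Large-time decay rate: -log(Psi)/t approaches (gamma - a)/2.
a, q, t = 1.0, 1.5, 200.0
rate = -log_psi_ou(a, q, t) / t
print(rate, (math.sqrt(a * a + 2.0 * q) - a) / 2.0)
```

The exponential decay of Ψ at rate $(\gamma-a)/2$ is the scalar prototype of the large-time behavior studied in the next section.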
In the case of non-zero initial condition, an explicit and manageable expression for Ψ exists when : for the one-dimensional OU process (3.22) with initial condition having mean and variance , direct computations lead to
where
Asymptotic analysis of the Laplace transform
As a first application of Theorem 3.2, let us investigate the asymptotic of the function $\Psi(t;Q)$ as $t\to\infty$.
Assume that the pairis controllable and. Letbe the maximal symmetric solution ofDenote bythe maximal symmetric solution of (3.7). Thenand
By (3.15), it is enough to consider only zero initial condition . We continue to use the notation
Then (3.18) becomes
By Proposition 3.4, , and therefore . If , then , so that , where is the solution of (3.19) with instead of . Equality (3.18) then implies , contradicting the assumption that . In other words, implies
By Proposition 3.1 and [9, Lemma 4.4.1], for every and
Similarly, the matrix
is positive-definite for every .
Define the matrices
Then (4.2) becomes
or, using (4.3) and ,
By (3.5) and Proposition 3.4(2),
and, to establish (4.1), it remains to verify that
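Let us derive the differential equation satisfied by . For every invertible matrix $M=M(t)$ that is differentiable in t, the inverse matrix satisfies
$$\frac{d}{dt}M^{-1}(t)=-M^{-1}(t)\,\dot{M}(t)\,M^{-1}(t), \tag{4.8}$$
which follows after applying the product rule to $M(t)M^{-1}(t)=I$. Applying (4.8) with we get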
After combining (4.5) and (4.9),
simplifying the result using (3.7),
One more application of (4.8), this time with , together with (4.10), gives the differential equation satisfied by U:
By Proposition 3.6, for all and
where satisfies
Next, let us apply (4.6) when :
where and is from (4.4). Since for all ,
By (3.5) and Proposition 3.4(2),
which implies (4.7).
Combining (4.1) with Proposition 2.1, we get an answer to question (Q1) originally posed in the Introduction.
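If $Q\geq 0$ with $Q\neq 0$, $\mathbf{y}$ is the solution of (1.1), and the pair $(A,B)$ is controllable, then
$$\int_0^{\infty} \mathbf{y}^{\top}(s)\,Q\,\mathbf{y}(s)\,ds=+\infty \tag{4.12}$$
with probability one.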
Indeed, (4.1) implies that
as long as Q is not identically zero. □
While intuitively obvious, (4.12) is surprisingly difficult to prove: even the one-dimensional case relies on the Laplace transform [12, Section 17.3]. Note also that (4.12) is, in general, not true without the controllability assumption: consider
so that, with ,
and the left-hand side of (4.12) is bounded above by . Further analysis of this example shows that, without the controllability assumption, the integral in (4.12) can either converge or diverge, depending on the initial condition and the matrix Q.
A more precise asymptotic of $\Psi(t;Q)$ as $t\to\infty$ exists under additional assumptions, and, even though it adds nothing to the answer to question (Q1), the result can, for example, provide an explicit solution to some optimization problems (cf. [4, Eq. (1.10)]).
Assume that the pairis controllable, the pairis observable, and the initial conditionis a Gaussian vector with meanmand covariance matrix K. Letbe the maximal symmetric solution of (3.7). Then
The matrixfrom (3.20) is well defined and non-singular;
The matrixis non-singular and, as,
Define the Gaussian process by (3.19). Proposition 3.5 implies that the process has a unique invariant measure, which is Gaussian with mean zero and non-singular covariance matrix . Passing to the limit as in (3.12) and (3.13), we find
It remains to verify that the matrix , or, equivalently, , is non-singular; then relation (4.13) will follow from (3.11).
To show that the matrix is non-singular, note that, by [12, Theorem 16.2], the matrix solving Eq. (4.10) has a non-singular limit as that does not depend on the initial condition; by construction, this limit coincides with . □
There is an alternative form of (4.13) using the minimal symmetric solution of (3.7). Indeed, consider the equation
Then C is solution of (3.7) if and only if is a solution of (4.14). In particular, is the minimal symmetric solution of (4.14). Applying [9, Theorem 7.5.1] we conclude that is the maximal symmetric solution of (4.14). Note that direct computations using (4.9), with , confirm that is a symmetric solution of (4.14), but an additional argument is still necessary to claim that it is indeed the maximal solution. By construction, , which leads to an equivalent form of (4.13):
On the one hand, the assumption about observability of cannot, in general, be omitted: take , , and
In this case, the right-hand sides of both (4.15) and (4.13) are not defined because the matrices are singular and the matrix does not exist.
On the other hand, (4.15) can hold without the observability assumption: take ,
so that . In this case, but , , , and the right-hand side of (4.15) gives the correct asymptotic . Incidentally, note that is stable. Further analysis of this example shows that, if the matrices Q and A are both diagonal, then the observability condition can be replaced by a weaker condition of detectability, which, in this case, means that, for every zero entry on the diagonal of Q, the corresponding diagonal entry of A must be negative.
Next, we study the high frequency asymptotic of the function , that is,
for fixed $t$ and fixed $Q$. As paper [2] demonstrates, if the matrix $BB^{\top}$ is not invertible, the high-frequency asymptotic can depend on the matrix Q in a rather complicated way even with zero initial conditions. In the next section, we will see that, when the noise is degenerate, non-zero initial conditions can also change the asymptotic in a profound way. A truly universal result exists only when the matrix $BB^{\top}$ is invertible, which we will assume throughout the rest of this section.
If the matrix $BB^{\top}$ is invertible, then a linear transformation reduces the problem to the case $BB^{\top}=I$. Indeed, define
The process y has the same distribution as the solution of
where v is a d-dimensional standard Brownian motion, and then
where is the solution of
Denote by the maximal symmetric solution of
and define
By Theorem 3.2,
where
and are -by- block matrices
and is a vector in :
We will also need the matrix
The limit in (4.17) exists by a monotonicity argument, and there are two obvious particular cases:
If , then ;
If is invertible, then .
Assume that the matrixis invertible. Denote bythe symmetric non-negative square root of.
Then
In view of (4.16), we need to verify the following:
If , then
By [9, Theorem 11.2.1],
which implies (4.19). Then
and
To establish (4.20) note that (4.22) and (4.24) imply
Accordingly, define the matrix
Then
and equality (4.20) will follow from
To verify (4.26), denote by the largest eigenvalue of the matrix . By (4.22) and continuous dependence of eigenvalues on the elements of the matrix (e.g. [15, Theorem 5.2]), the real part of every eigenvalue of will be greater than or equal to for all sufficiently large λ. Proposition 3.1 then implies existence of a positive number δ such that, for all ,
from which (4.26) follows.
To verify (4.27), note that, by (4.11),
By (4.25),
whereas, by Proposition 3.6,
where is the largest eigenvalue of the matrix , and then (4.27) follows from (4.28).
With the help of Theorem 4.4, we can now answer question (Q2) posed in the Introduction.
Assume that, in Eq. (1.1), the matrix $BB^{\top}$ is invertible and the initial condition $\mathbf{y}(0)$ is independent of w and is a Gaussian vector with mean m and covariance K. Then, for every , where , , is the symmetric non-negative square root of , and is the matrix from (4.17).
Theorem 4.5 shows that if the matrix is invertible, then the random variable
has small ball rate for every initial condition , every drift matrix A, and every non-zero matrix . The small ball constant
depends on both m and Q, but does not depend on the matrix A. If and , then y is a Brownian motion with covariance matrix so that is equal in distribution to , where are independent one-dimensional standard Brownian motions and are the eigenvalues of the matrix . The corresponding small deviations bound can then be derived, for example, from [10, Corollary 3.1], and this bound coincides with (4.29). In other words, as ,
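The Brownian special case can be computed explicitly. The sketch below assumes $A=0$ and $\mathbf{y}(0)=0$, so that $\mathbf{y}=B\mathbf{w}$ and $\mathbf{y}^{\top}Q\mathbf{y}=\mathbf{w}^{\top}B^{\top}QB\,\mathbf{w}$, and combines the Cameron–Martin formula for each independent component with the de Bruijn-type Tauberian correspondence at small ball rate 1. Under these assumptions the small ball constant is $t^2\left(\sum_k\sqrt{\lambda_k}\right)^2/8$, where $\lambda_k$ are the eigenvalues of $B^{\top}QB$. The function name is hypothetical, and the value should agree with (4.29) only to the extent that these assumptions match the setting of Theorem 4.5.

```python
import numpy as np


def bm_small_ball_constant(B, Q, t):
    """Small ball constant for int_0^t (Bw)^T Q (Bw) ds with standard Brownian motion w.

    Assumptions: A = 0, y(0) = 0, small ball rate 1, so that
    log P(... <= eps) ~ -constant / eps with constant = t**2 * (sum sqrt(lambda_k))**2 / 8.
    """
    eig = np.linalg.eigvalsh(B.T @ Q @ B)
    eig = np.clip(eig, 0.0, None)  # remove tiny negative round-off
    return float((t * np.sqrt(eig).sum()) ** 2 / 8.0)


# Two independent coordinates weighted by 1 and 4.
c_diag = bm_small_ball_constant(np.eye(2), np.diag([1.0, 4.0]), 1.0)

# Degenerate noise: both coordinates are driven by the same scalar Brownian motion.
c_deg = bm_small_ball_constant(np.array([[1.0], [1.0]]), np.eye(2), 1.0)
print(c_diag, c_deg)
```

In the second case $\mathbf{y}^{\top}\mathbf{y}=2w^2$, so the constant $1/4$ is simply the scalar constant $1/8$ rescaled by the factor 2.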
It is somewhat remarkable that the variance of the initial condition can affect the small ball constant for . Indeed, consider the one-dimensional OU process (3.22) with initial condition having mean and variance . Equality (3.28) and the asymptotic relation (2.4) imply
More generally, if both K and Q are invertible, then and the small ball constant does not depend on the initial condition:
On the other hand, consider ,
so that
With , we get
Degenerate noise: A two-dimensional example
The objective of this section is to demonstrate how degenerate noise can destroy universality of the conclusion of Theorem 4.4. We will consider a two-dimensional example, corresponding to (1.1) with
In other words, , and satisfies
Let
be a symmetric non-negative definite matrix; occasionally, we will write .
Sometimes, formula (5.5) can be simplified further. For example, consider the integrated Brownian motion , corresponding to , and . Then (5.5) becomes
a well-known result: cf. [7, Section 4.2.1] or [5, Theorem 3.1].
Here are two new examples.
(1) Random harmonic oscillator , , corresponding to , and :
(2) Joint integrated Brownian motion and Brownian motion:
Even when the initial conditions are zero, the high-frequency asymptotic of the function Ψ can depend on the matrix Q in a non-trivial way.
Assume that. Ifand, thenIf, then
When using (5.5), we now keep in mind that the matrix Q is re-scaled by the factor λ, so that y, r, z must be replaced with , , . Throughout the proof we use the and notations for asymptotic comparison as , and write if .
Both (5.6) and (5.7) will follow from (5.5) once we show that
The key question becomes the asymptotic, as , of the function φ from (5.3) for fixed . There are three cases to consider.
(, ).
As , we have , so that for large λ and then, by (5.3),
Then for all ,
so that
and (5.8) follows.
(, ).
Now q does not depend on λ and, as , we have , so that for large λ, and
As a result, (5.3) and (5.4a)–(5.4c) lead to
Then
and (5.8) follows.
(, , ).
We have , so that for large λ, and
With notations
equalities (5.3) and (5.4a)–(5.4c) lead to
Then
and (5.8) follows.
Theorems 2.2 and 5.1 lead to the logarithmic asymptotic of probability of small deviations for the corresponding quadratic functionals. The result shows that the small ball rate depends on whether (cf. (5.10)) or (cf. (5.11)). Equivalently, the random variables and have different small ball rates.
To establish (5.10), note that (5.6) is (2.2) with
Then (5.10) follows from (2.4).
To establish (5.11), note that (5.7) is (2.2) with
Then (5.11) follows from (2.4). □
Let us now consider the effects of non-zero initial conditions. As we saw in the previous section, if the matrix is non-singular, then the initial conditions can increase the small ball constant , but do not change the small ball rate ϖ (cf. Theorem 4.5). For Eq. (5.2), the initial conditions can change the small ball rate as well. The most dramatic change takes place when and the initial condition is non-random () with . In this case, the quadratic form becomes uniformly bounded from below: there exists such that
Informally, the small ball rate ϖ becomes infinite. While this might appear surprising at first, a simple application of the Cauchy–Schwarz inequality shows that the result is to be expected. For example, take
so that
Using
we get
Next, by Cauchy–Schwarz,
so that
and, as long as , we get (5.12) with .
Asymptotic analysis of as , together with Proposition 2.3, leads to a sharper bound in a more general setting; cf. (5.14). In particular, for (5.13), we actually have (5.12) for every with .
By Proposition 2.3, we need to show that
To simplify the presentation, and keeping in mind that the matrix C depends on λ, we will write
and then (3.11) becomes
where
By construction, for some , and therefore
If , then, by (5.17),
and (5.15) follows. Note that the right-hand side of (5.15) vanishes when or , that is, exactly when the matrix Q is singular. □
Further inspection of (5.16) suggests the following qualitative summary of the effects of non-zero initial conditions on small ball probabilities:
If , then the small ball asymptotic at logarithmic scale does not depend on the initial conditions, that is, (5.11) holds when and (5.10) holds when , .
If , then the initial conditions can increase the small ball rate ϖ in the following two cases: (a) , and (b) .
In all other cases, the initial conditions do not change the small ball rate ϖ, but can increase the small ball constant .
At this point, the amount of computation needed to derive the corresponding quantitative results is large enough to warrant a separate paper.
2. X. Chen and W.V. Li, Quadratic functionals and small ball probabilities for the m-fold integrated Brownian motion, Ann. Probab. 31 (2003), 1052–1077.
3. G. Da Prato and J. Zabczyk, Ergodicity for Infinite-Dimensional Systems, Cambridge Univ. Press, Cambridge, 1996.
4. M.D. Donsker and S.R.S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time. IV, Comm. Pure Appl. Math. 36 (1983), 183–212.
5. D. Khoshnevisan and Z. Shi, Chung’s law for integrated Brownian motion, Trans. Amer. Math. Soc. 350 (1998), 4253–4264.
6. M.L. Kleptsyna and A. Le Breton, Optimal linear filtering of general multidimensional Gaussian processes and its application to Laplace transforms of quadratic functionals, J. Appl. Math. Stochastic Anal. 14 (2001), 215–226.
7. M.L. Kleptsyna and A. Le Breton, A Cameron–Martin type formula for general Gaussian processes – A filtering approach, Stoch. Stoch. Rep. 72 (2002), 229–250.
8. K. Koncz, On the parameter estimation of diffusional type processes with constant coefficients (elementary Gaussian processes), Anal. Math. 13 (1987), 75–91.
9. P. Lancaster and L. Rodman, Algebraic Riccati Equations, Oxford Univ. Press, New York, 1995.
10. W.V. Li and Q.-M. Shao, Gaussian processes: Inequalities, small ball probabilities and applications, in: Stochastic Processes: Theory and Methods, D.N. Shanbhag and C.R. Rao, eds, Handbook of Statistics, Vol. 19, North-Holland, Amsterdam, 2001, pp. 533–597.
11. R.S. Liptser and A.N. Shiryaev, Statistics of Random Processes, I: General Theory, Springer, Berlin, 2001.
12. R.S. Liptser and A.N. Shiryaev, Statistics of Random Processes, II: Applications, Springer, Berlin, 2001.
13. L. Mirsky, An Introduction to Linear Algebra, Dover Publications, Inc., New York, 1990.
14. H.L. Royden, Comparison theorems for the matrix Riccati equation, Comm. Pure Appl. Math. 41 (1988), 739–746.
15. D. Serre, Matrices: Theory and Applications, Springer, New York, 2010.
16. G.N. Sytaya, On some asymptotic representation of the Gaussian measure in a Hilbert space, in: Theory of Stochastic Processes, Vol. 2, 1974, pp. 93–104 (in Russian).
17. A.I. Yashin, An extension of the Cameron–Martin result, J. Appl. Probab. 30 (1993), 247–251.