1991 Mathematics Subject Classification. 60F10, 60J27.
<ph f="cmbx">Moderate deviation principle for ergodic Markov chain. Lipschitz summands</ph>

B. Delyon, A. Juditsky,

R. Liptser

Universite de Rennes 1, IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France. E-mail address : bernard.delyon@univ-rennes1.fr University Joseph Fourier of Grenoble, France E-mail address : juditsky@inrialpes.fr Department of Electrical Engineering-Systems, Tel Aviv University, 69978 Tel Aviv Israel E-mail address : liptser@eng.tau.ac.il

1 Introduction and discussion

Let ( X n ) n 0   be a homogeneous ergodic Markov chain, X n R d   with the transition probability kernel for n   steps: P x ( n ) = P ( n ) ( x , d y )   (for brevity P x ( 1 ) : = P x   ) and the unique invariant measure μ   .
Let H   be a measurable function R d H R p   with R d | H ( z ) | μ ( d z ) <   and
R d H ( z ) μ ( d z ) = 0 . (1.1)
Set S n α = 1 n α i = 1 n H ( X i 1 ) , n 1 ; ( 0.5 < α < 1 ) .   In this paper, we examine the moderate deviation principle (in short:
MDP) for the family ( S n α ) n 1   when the spectrum of operator P x   is continuous.
It is well known that for bounded H   satisfying  1.1 ((H) condition), the most MDP compatible Markov chains are characterized by eigenvalues gap condition (EG) (see Wu, [16, [17, Gong and Wu, [7, and citations therein):
  • the unit is an isolated, simple and the only eigenvalue with modulus 1 of the transition probability kernel P x   .
In the framework of (H)-(EG) conditions, the MDP is valid with the rate of speed n ( 2 α 1 )   and the rate function I ( y ) , y R d  
I ( y ) = { 1 2 y B 2 , B B y = y , otherwise , (1.2)
where B   is the pseudoinverse matrix (in Moore-Penrose sense, see e.g.[1) for the matrix
B = R d H ( x ) H * ( x ) μ ( d x ) + n 1 R d [ H ( x ) ( P x ( n ) H ) * + ( P x ( n ) H ) H * ( x ) ] μ ( d x ) (1.3)
(henceforth, *   , | |   , and Q   are the transposition symbol, L 1   norm and L 2   norm with the kernel Q   ( x Q = x , Q x   ) respectively).
Thanks to the quadratic form rate function, the MDP is an attractive tool for an asymptotic analysis in many areas, say, with thesis (see, example in Section  7 )
“MDP instead of CLT”.
In this paper, we intend to apply the MDP analysis to Markov chain defined by the recurrent equation X n = f ( X n 1 , ξ n ) , n 1   generated by i.i.d. sequence ( ξ n ) n 1   of random vectors, where f   is some vector-valued measurable function. Obviously, the function f   and the distribution of ξ 1   might be specified in this way P x   satisfies (EG). For instance, if d = 1   and X n = f ( X n 1 ) + ξ n ,   then for bounded f   and Laplacian random variable ξ 1   (EG) holds. However, (EG) fails for many useful in applications ergodic Markov chains. For d = 1   , a typical example gives Gaussian Markov chain defined by a linear recurrent equation governed by i.i.d. sequence of ( 0 , 1 )   -Gaussian random variables(here | a | < 1   ) X n = a X n 1 + ξ n .   In order to clarify this remark, notice that if (EG) holds true, than for any bounded and measurable function H   , satisfying (H)-property, for some constants K > 0   , ϱ ( 0 , 1 )   , n 1 ,  
| E x H ( X n ) | K ϱ n . (1.4)
However, the latter fails for H ( x ) = sign ( x )   satisfying  1.1 . In fact, if  1.4 were correct, then n = 0 | E x H ( X n ) | K 1 ϱ .   On the other hand, it is readily to compute that n = 0 | E x H ( X n ) |   grows in | x |   on the set { | x | > 1 }   faster than O ( log ( | x | )   .
In this paper, we avoid a verification of (EG). Although our approach is close to a conception of “Multiplicative Ergodicity” (see Balaji and Myen [2) and “Geometrical Ergodicity” (see Kontoyiannis and Meyn, [8and Meyn and Tweedie, [11), Chen and Guillin, [4) we do not follow explicitly these methodologies.
Our main tools are the Poisson equation and the Puhalskii theorem from [15. The Poisson equation enables to reduce the MDP verification for ( S n α ) n 1   to ( 1 n α M n ) n 1   , where M n   is a martingale generated by Markov chain, while the Puhalskii theorem allows to replace an asymptotic analysis for the Laplace transform of 1 n α M n   by the asymptotic analysis for, so called, Stochastic Exponential
E n ( λ ) = n i = 1 E ( exp [ λ , 1 n α ( M i M i 1 ) ] | X i 1 ) , λ R d (1.5)
being the product of the conditional Laplace transforms for martingale increments.
An effectiveness of the Poisson equation approach (method of corrector) combined with the stochastic exponential is well known from the proofs of functional central limit theorem (FCLT) for the family ( S n 0.5 ) n 1   (see, e.g.
Papanicolaou, Stroock and Varadhan [12, Ethier and Kurtz [6, Bhattacharya [3, Pardoux and Veretennikov [13; related topics can be found in Metivier and Priouret (80's) for stochastic algorithms analysis. The use of the same approach for a continuous time setting can be found e.g. in [9, [10).

2 Formulation of main result

We consider Markov chain ( X n ) n 0   , X n R d   defined by a nonlinear recurrent equation
X n = f ( X n 1 , ξ n ) , (2.1)
where f = f ( z , v )   is a vector function with entries f 1 ( z , v ) , , f d ( z , v )   , u R d   , v R p   and ( ξ n ) n 1   is i.i.d. sequence of random vectors of the size p   .
We fix the following assumptions.
Assumption 2.1. Entries of f   are Lipschitz continuous functions in the following sense: for any v  
| f i ( z 1 , z j 1 , z j , z j + 1 , z d , v 1 , , v p ) f i ( z 1 , z j 1 , z j , z j + 1 , z d , v 1 , , v p ) | ϱ i j | z j z j | , | f ( z , v ) f ( z , v ) | ϱ | z z | ,  
where max i , j ϱ i j = ϱ < 1 .  
Assumption 2.2. For sufficiently small positive δ   , Cramer's condition holds:
E e δ | ξ 1 | < .  
Theorem 2.1. Under Assumptions  2.1 and  2.2 , the Markov chain is ergodic with the invariant measure μ   such that R d | z | μ ( d z ) <   . For any Lipschitz continuous function H   , satisfying  1.1 , the family ( S n α ) n 1   obeys the MDP in the metric space ( R d , r )   ( r   is the Euclidean metric) with the rate of speed n ( 2 α 1 )   and the rate function given in  1.2 .
Remark 1. Notice that:
assumptions of Theorem  2.1 do not guarantee (EG), Lipschitz continuous H   , obeying the linear growth condition, are permissible for the MDP analysis, the ξ 1   -distribution with a continuous component is not required.
Consider now a linear version of  2.1 : X n = A X n 1 + ξ n ,   where A   is the d × d   -matrix with entries A i j   . Now, Assumption  2.1 reads as: max i j | A i j | < 1   . This assumption is too restrictive. We replace it by more natural one
Assumption 2.3. The eigenvalues of A   lie within the unit circle.
Theorem 2.2. Under Assumption  2.3 , the Markov chain is ergodic with the invariant measure μ   such that R d z 2 μ ( d z ) <   .
For any Lipschitz continuous function H   , satisfying  1.1 , the family ( S n α ) n 1   obeys the MDP in the metric space ( R d , r )   with the rate of speed n ( 2 α 1 )   and the rate function given in  1.2 .

3 Preliminaries

3.1 (EG)-(H) conditions

To clarify our approach to the MDP analysis, let us first demonstrate its applicability under (EG)-(H) setting.
The (EG) condition provides the geometric ergodicity of P x ( n )   to the invariant measure μ   uniformly in x   in the total variation norm: there exist constants K > 0   and ϱ ( 0 , 1 )   such that for any x R d   , P x ( n ) μ t v K ϱ n , n 1 .   The latter provides an existence of bounded function
U ( x ) = H ( x ) + n 1 P x ( n ) H (3.1)
solving the Poisson equation
H ( x ) = H ( x ) + P x U . (3.2)
In view of the Markov property, a sequence ( ζ i ) i 1   of bounded random vectors with ζ i : = U ( X i ) P X i 1 U   forms a martingale-differences relative to the filtration generated by Markov chain. Hence, M n = i = 1 n ζ i   is the martingale with bounded increments. With the help of Poisson's equation we get the following decomposition
1 n α i = 1 n H ( X i 1 ) = 1 n α [ U ( x ) U ( X n ) ] c o r r e c t o r + 1 n α M n . (3.3)
The boundedness of U   provides a corrector negligibility in the MDP scale, that is, the families S n α   and 1 n α M n   share the same MDP. In view of that, suffice it to to establish the MDP for ( 1 n α M n ) n 1   .
Assume for a moment that ζ i   's are i.i.d. sequence of random vectors.
Recall, E ζ 1 = 0   and denote B = E ζ 1 ζ 1 *   . Then, the Laplace transform for 1 n α M n   is:
E n ( λ ) = ( E e λ , ζ 1 n α ) n , λ R d . (3.4)
Under this setting, it is well known that 1 n α M n   obeys the MDP if B   is not singular matrix and lim n n 2 α 1 log E n ( λ ) = 1 2 λ , B λ , λ R d .   We adapt this method of MDP verification to our setting. Instead of B   , we introduce matrices B ( X i 1 ) , i 1   with
B ( x ) = P x U U * P x U ( P x U ) * . (3.5)
The homogeneity of Markov chain and the definition of ζ i   provide a.s. that E ( ζ i ζ i * | X i 1 ) = B ( X i 1 ) .   Instead of the Laplace transform  3.4 , we apply the stochastic exponential  1.5 , expressed via ζ i   's, E n ( λ ) = n i = 1 E ( e λ , ζ i n α | X i 1 ) , λ R ,   which is not the Laplace transform itself.
The Poisson equation  3.2 and its solution  3.1 permit to transform  3.5 into B ( x ) = H ( x ) H * ( x ) + n 1 [ H ( x ) ( P x ( n ) H ) * + ( P x ( n ) H ) H * ] ,   that is, R d B ( z ) μ ( d z )   coincides with B   from  1.3 .
Now, we are in the position to formulate Puhalskii Theorem. [for more details, see [15and [?.] Assume B   from  1.3 is nonsingular matrix and for any ɛ > 0   , λ R d  
lim n 1 n 2 α 1 log P ( | n 2 α 1 log E n ( λ ) 1 2 λ , B λ | > ɛ ) = . (3.6)
Then, the family 1 n α M n   , n 1   possesses the MDP in the metric space ( R d , r )   ( r   is the Euclidean metric) with the rate of speed n ( 2 α 1 )   and rate function I ( y ) = 1 2 y B 1 2   .
Remark 2. The condition  3.6 is verifiable with the help of
lim n 1 n 2 α 1 log P ( 1 n | i = 1 n λ , [ B ( X i 1 ) B ] λ | > ɛ ) = lim n 1 n 2 α 1 log P ( 1 6 n 1 + α i = 1 n E [ | ζ i | 3 e n α | ζ i | | X i 1 ] > ɛ ) = . (3.7)
The second condition in  3.7 is implied by the boundedness of | ζ i |   's. The first part in  3.7 is known as Dembo's conditions, [?, formulated as follows:
for any ɛ > 0   , λ R d   l i m n 1 n log P ( 1 n | i = 1 n λ , [ B ( X i 1 ) B ] λ | > ɛ ) < 0 .   In order to verify the first condition in  3.7 , we apply again the Poisson equation technique. Set h ( x ) = λ , [ B ( x ) B ] λ   and notice that R d h ( z ) μ ( d z ) = 0 .   Then, the function u ( x ) = h ( x ) + n 1 P x ( n ) h   is well defined and solves the Poisson equation u ( x ) = h ( x ) + P x u .   Similarly to  3.3 , we have 1 n i = 1 n h ( X i 1 ) = u ( x ) u ( X n ) n + m n n ,   where m n = i = 1 n z i   is the martingale with bounded martingale-differences ( z i ) i 1   . Since u   is bounded, the first condition in  3.7 is reduced to
lim n 1 n 2 α 1 log P ( | m n | > n ɛ ) = (3.8)
while  3.8 is provided by Theorem  A.1 in Appendix which states that  3.8 holds for any martingale with bounded increments.

3.1.1 Singular B  

The conditions from  3.7 remain to hold whether B   is nonsingular or singular. For singular B   the Puhalskii theorem is no longer valid. With singular B   , we use the Puhalskii theorem as helpful tool It is well known that the family M n n α   , n 1   obeys the MDP with the rate of speed n ( 2 α 1 )   and some rate function,say I ( y )   provided that
l i m C l i m n 1 n 2 α 1 log P ( M n n α > C ) = l i m ɛ 0 l i m n 1 n 2 α 1 log P ( M n n α y ɛ ) I ( y ) l i m ɛ 0 l i m n 1 n 2 α 1 log P ( M n n α y ɛ ) I ( y ) . (3.9)
The first condition in  3.9 provides the exponential tightness in the metric r   while the next others the local MDP. In order to verify of  3.9 , we introduce “regularized” family M n β n α , n 1   with M n β = M n + β i = 1 n ϑ i ,   where β   is a positive parameter and ( ϑ i ) i 1   is a sequence of zero mean i.i.d.
Gaussian random vectors with cov ( ϑ 1 , ϑ 1 ) = : I   ( I   is the unit matrix). The Markov chain and ( ϑ i ) i 1   are assumed to be independent objects.
It is clear that for this setting the matrix B   is transformed into a positive definite matrix B β = B + β I   . Now, the Puhalskii theorem is applicable and guarantees the MDP with the same rate of speed and the rate function I β ( y ) = 1 2 y B β 1 2 .   We use now the well known fact (see, e.g. Puhalskii, [14) that MDP provides the exponentially tightness and the the local MDP:
l i m C l i m n 1 n 2 α 1 log P ( M n β n α > C ) = l i m ɛ 0 l i m n 1 n 2 α 1 log P ( M n β n α y ɛ ) I β ( y ) l i m ɛ 0 l i m n 1 n 2 α 1 log P ( M n β n α y ɛ ) I β ( y ) . (3.10)
Notice now that  3.9 is implied by  3.10 if
lim β 0 I β ( y ) = { 1 2 y B 2 , B B y = y , otherwise (3.11)
and
lim β 0 l i m n 1 n 2 α 1 log P ( β n α i = 1 n ϑ i > η ) = , η > 0 . (3.12)
Let T   be an orthogonal matrix transforming B   to a diagonal form: diag ( B ) = T * B T .   Then, owing to 2 I β ( y ) = y * ( β I + B ) 1 y = y * T ( β I + diag ( B ) ) 1 T * y ,   for y = B B y   we have (recall that B B B = B   , see [1)
2 I β ( y ) = y * B B T ( β I + diag ( B ) ) 1 T * y
= y * B T T * B T ( β I + diag ( B ) ) 1 T * y
= y * B T diag ( B ) ( β I + diag ( B ) ) 1 T * y
β 0 y * B T diag ( B ) diag ( ( B ) ) T * y
= y * B T diag ( B ) T * T ( diag ( B ) ) T * y
= y * B B B u = u * B y = y B 2 = 2 I ( y ) .
If y B B y   , lim β 0 2 I β ( y ) =   .
Thus,  3.11 holds true.
Since ( ϑ i ) i 1   is i.i.d. sequence of random vectors and entries of ϑ 1   are i.i.d. ( 0 , 1 )   -Gaussian random variables, the verification of  3.12 is reduced to
lim β 0 l i m n 1 n 2 α 1 log P ( | i = 1 n ξ i | > n α η β ) = , (3.13)
where ( ξ i ) i 1   is a sequence of i.i.d. ( 0 , 1 )   -Gaussian random variables, and it suffices to consider the case “+” only. By the Chernoff inequality with λ > 0   , we find that P ( i = 1 n ϑ i > n α η β ) exp ( λ n α η β + n λ 2 2 )   while the choice of λ = n α η n β   provides 1 n 2 α 1 log P ( i = 1 n η i > n α η β ) η 2 2 β β 0 .  

3.2 Virtual scenario

(EG)-(H) are not assumed the ergodicity of Markov chain is checked - H   is chosen to hold  1.1 .
(1) Let  3.1 hold. Hence, the function U   solves the Poisson equation and the decomposition from  3.3 is valid with M n = i = 1 n ζ i   , where ζ i = u ( X i ) P X i 1 u .   Let
E ζ i * ζ i c o n s t .
E [ | ζ i | 3 e n α | ζ i | | X i 1 ] c o n s t .
(2) With B ( x )   and B   are defined in  3.5 and  1.3 respectively, set h ( x ) = λ , [ B ( x ) B ] λ , λ R d .   Let (i) u ( x ) = h ( x ) + n 1 P x ( n ) h   is well defined (ii) for z i = u ( X i ) P X i 1 u   ,
E z i 2 c o n s t .
E [ | z i | 3 e n α | z i | | X i 1 ] c o n s t .
(3) For any ɛ > 0   , let
lim n 1 n 2 α 1 log P ( | U ( X n ) | > n α ɛ ) =
lim n 1 n 2 α 1 log P ( | u ( X n ) | > n α ɛ ) = .
Notice that (EG)-(H) provide (1)-(3) and even if (EG)-(H) fail, (1)-(3) may fulfill. Moreover, (1)-(3) guarantee the validity for all steps of the proof given in Section  3.1 .
Thus, an ergodic Markov chain, possessing (1)-(3), obeys the MDP. The proof of Theorems  2.1 and  2.2 follows this scenario.

4 The proof of Theorem  2.1 

4.1 Ergodic property

Lemma 4.1. Under Assumption  2.1 , ( X n ) n 0   possesses the unique probability invariant measure μ   with R d | z | μ ( d z ) <   .
  • Proof. Let ν   be a probability measure on R d   with R d | x | ν ( d x ) <   and let a random vector X 0   , distributed in the accordance to ν   , is independent of ( ξ n ) n 1   . We initialize the recursion, given in  2.1 , by X 0   . Let now X n   is generated by  2.1 . Then, μ n ( d z ) = R d P x ( n ) ( d z ) ν ( d x )   defines the distribution of X n   .
    We show that the family ( μ n ) n 1   is tight in the Levy-Prohorov metric:
    lim k l i m n μ n ( | z | > k ) = 0 .   By the Chebyshev inequality, μ n ( | z | > k ) E | X n | k   . The tightness follows from sup n 1 E | X n | < .   Further, since By Assumption  2.1 ,
    | X n | = | f ( 0 , ξ n ) + ( f ( X n 1 , ξ n ) f ( 0 , ξ n ) ) |
    | f ( 0 , ξ n ) | + | f ( X n 1 , ξ n ) f ( 0 , ξ n ) ) |
    | f ( 0 , ξ n ) | + ϱ | X n 1 |
    | f ( 0 , 0 ) | + | ξ n | + ϱ | X n 1 | ,
    the sequence ( E | X n | ) n 1   solves a recurrent inequality E | X n | | f ( 0 , 0 ) | + E | ξ 1 | + ϱ E | X n 1 |   subject to E | X 0 | = R d | x | ν ( d x ) ( < )   . Hence, we find that for any n 1   , E | X n | E | X 0 | + | f ( 0 , 0 ) | + E | ξ 1 | 1 ϱ .   Thus, the family { μ n }   is tight, so that, by the Prohorov theorem, { μ n }   contains further subsequence { μ n }   converging, as n   , in the Levy-Prohorov metric to a limit μ   being a probability measure on R d   : for any bounded and continuous function g   on R d   lim n R d g ( z ) μ n ( d z ) = R d g ( z ) μ ( d z ) .   Thence, for g ( z ) = L | z |   and L > 0   , it holds R d ( L | z | ) μ ( d z ) = lim n E ( L | X n | ) l i m n E | X n | <   and, by the monotone convergence theorem, R d | z | μ ( d z ) l i m n E | X n | < .   The μ   is regarded now as a candidate to be the unique invariant measure.
    So, we shall verify R d g ( x ) μ ( d x ) = R d P x g μ ( d x ) .   for any nonnegative, bounded and continuous function g   . For notational convenience, write X n x   and X n ν   , if X 0 = x   and X 0   is distributed in the accordance with ν   . By Assumption  2.1 , | X n x X n ν | ϱ | X n 1 x X n 1 ν | , n 1 ,   that is, | X n x X n ν |   converges to zero exponentially fast as long as n   .
    For any x R d   , the latter provides lim n E g ( X n x ) = R d g ( x ) μ ( d x ) .   Since the Markov chain is homogeneous, we also find that lim n E g ( X n + 1 x ) = R d g ( z ) μ ( d z ) .   On the other hand, owing to E g ( X n + 1 x ) = E P X n x g   , the above relation is nothing but lim n E P X n x g = R d g ( z ) μ ( d z ) .   Finally, owing to P x g = E g ( f ( x , ξ 1 ) )   , the function P x g   of argument x   is bounded and continuous. Consequently, lim n E P X n x g = R d P x g μ ( d x ) .   Assume μ   is another invariant probability measure, μ μ   . Then, taking X 0 μ   and X 0 μ   , distributed in the accordance to μ   and μ   respectively and independent of ( ξ n ) n 1   , we get two stationary Markov chains ( X n μ )   and ( X n μ )   defined on the same probability space as:
    X n μ = f ( X n 1 μ , ξ n )
    X n μ = f ( X n 1 μ , ξ n ) .
    By Assumption  2.1 , | X n μ X n μ | ϱ | X n 1 μ X n 1 μ |   , i.e. lim n | X n μ X n μ | = 0   . Recall that both processes X n μ   and X n μ   are stationary with the marginal distributions μ   and μ   respectively. Hence, for any bounded and continuous function g : R d R   , | R d g ( x ) μ ( d x ) R d g ( x ) μ ( d x ) | E | g ( X n μ ) g ( X n μ ) | n 0 ,   that is, μ = μ   .

4.2 The verification of (1)

Let K   be the Lipschitz constant for H   . Then | H ( x ) | | H ( 0 ) | + K | x |   and R d | H ( z ) | μ ( d z ) < .   By  1.1 , E H ( X n μ ) 0   . Then,
| E H ( X n x ) | = | E ( H ( X n x ) H ( X n μ ) |
K ϱ n E | x X n μ | K ( 1 + | x | ) ϱ n .
Therefore, n 1 | E H ( X n x | K 1 ϱ ( 1 + | x | )   . Consequently, the function U ( x )   , given in  3.1 , is well defined and solves the Poisson equation.
Recall that ζ i = U ( X i ) P X i 1 U   .
Lemma 4.2. The function U ( x )   possesses the following properties:
1) U ( x )   is Lipschitz continuous; 2) P x ( U U * ) P x U ( P x U ) *   is bounded and Lipschitz continuous; 3) For sufficiently small δ > 0   and any i 1   E ( | U ( X i ) P X i 1 U | 3 e δ | U ( X i ) P X i 1 U | | X i 1 ) const.  
  • Proof. 1) Since by Assumption  2.1 , | X n x X n x | ϱ | X n 1 x X n 1 x | , | X 0 x X 0 x | | x x | ,   we have
    | U ( x ) U ( x ) | | H ( x ) H ( x ) | + n 1 E | H ( X n x ) H ( X n x ) | K 1 ϱ | x x | . (4.1)
    2) Recall (see  3.5 ) P x ( U U * ) P x U ( P x U ) * = B ( x )   and denote B p q ( x )   , p , q = 1 , , d   the entries of matrix B ( x )   . Also, denote by U p ( x )   , p = 1 , , d   the entries of U ( x )   . Since B ( x )   is nonnegative definite matrix, suffice it to show only that B p p ( x )   's are bounded functions. Denote F ( z )   the distribution function of ξ 1   . Taking into the consideration  4.1 and Assumption  2.1 , we get
    B p p ( x ) = E ( U p ( f ( x , ξ 1 ) ) R d U p ( f ( x , z ) ) d F ( z ) ) 2
    ( K ) 2 ( 1 ϱ ) 2 E | R d | ξ 1 z | d F ( z ) | 2 4 ( K ) 2 ( 1 ϱ ) 2 E | ξ 1 | 2 < .
    The Lipschitz continuity of B p q ( x )   is proved similarly. Write B p q ( x ) B p q ( x ) = : a b c d ,   where
    a = E ( U p ( f ( x , ξ 1 ) ) R d U q ( f ( x , z ) ) d F ( z ) )
    b = E ( U q ( f ( x , ξ 1 ) ) R d U q ( f ( x , z ) ) d F ( z ) )
    c = E ( U p ( f ( x , ξ 1 ) ) R d U q ( f ( x , z ) ) d F ( z ) )
    d = E ( U q ( f ( x , ξ 1 ) ) R d U q ( f ( x , z ) ) d F ( z ) ) .
    Now, applying a b c d = a ( b d ) + d ( a c )   and taking into account  4.1 and Assumption  2.1 , we find that | a | , | d | 2 K 1 ϱ E | ξ 1 |   and so | B p q ( x ) B p q ( x ) | 4 K 2 ϱ ( 1 ϱ ) 2 E | ξ 1 | | x x | .   3) By  4.1 and Assumption  2.1  | U ( X i ) P X i 1 U | K 1 ϱ ( E | ξ 1 | + | ξ i | ) .  

4.3 The verification of (2)

The properties of B ( x )   to be bounded and Lipschitz continuous provide the same properties for h ( x ) = λ , [ B ( x ) B ] λ .   Hence (2) is provided by (1).

4.4 The verification of (3)

Since U   and u   are Lipschitz continuous, they possess the linear growth condition, e.g., | U ( x ) | C ( 1 + | x | ) , C > 0 .   So, (3) is reduced to the verification of
lim n 1 n 2 α 1 log P ( | X n | > ɛ n α ) = , ɛ > 0 . (4.2)
Due to Assumption  2.1 , we have
| X n | | f ( X n 1 , ξ n ) | | f ( 0 , ξ n ) | + ϱ | X n 1 |
| f ( 0 , 0 ) | + ϱ | X n 1 | + | ξ n | .
Iterating this inequality with X 0 = x   we obtain
| X n | ϱ n | x | + | f ( 0 , 0 ) | j = 1 n ϱ n j + j = 1 n ϱ n j | ξ j |
| x | + | f ( 0 , 0 ) | 1 ϱ + j = 0 n 1 ϱ j | ξ n j | .
Hence,  4.2 is reduced to
lim n 1 n 2 α 1 log P ( j = 0 n 1 ϱ j | ξ n j | n α ɛ ) = . (4.3)
We verify  4.3 with the help of Chernoff 's inequality: with δ   , involving in Assumption  2.2 , and γ = δ 1 ϱ  
P ( j = 0 n 1 ϱ j | ξ n j | n α ɛ ) e n α γ ɛ E e j = 0 n 1 γ ϱ j | ξ n j | .
The i.i.d. property for ξ j   's provides E e j = 0 n 1 γ ϱ j | ξ n j | = E e j = 0 n 1 γ ϱ j | ξ 1 | E e j = 0 γ ϱ j | ξ 1 | = E e δ | ξ 1 | <   and we get 1 n 2 α 1 log P ( j = 0 n 1 ϱ j | ξ n j | n α ɛ ) n 1 α δ ɛ + log E e δ | ξ 1 | n 2 α 1 n .  

5 The proof of Theorem  2.2 

The proof of this theorem differs from the proof of Theorem  2.1 only in some details concerning to (L.1). So, only these parts of the proof are given below.

5.1 Ergodic property and invariant measure

Introduce ( ξ ~ n ) n 1   the independent copy of ( ξ n ) n 1   . Owing to X n = A n x + i = 1 n A n i ξ i = A n x + i = 0 n 1 A i ξ n i ,   we introduce
X ~ n = A n x + i = 0 n 1 A i ξ ~ i (5.1)
and notice that the i.i.d. property of ( ξ i ) i 1   provides ( X n ) n 0 = l a w ( X ~ n ) n 0 .   By Assumption  2.3 , A n 0   , n   , exponentially fast. Particularly, i = 0 trace ( A i cov ( ξ 1 , ξ 1 ) ( A i ) * ) < ,   so that lim n X ~ n = i = 0 A i ξ ~ i   a.s. and in L 2   norm.
Thus, the invariant measure μ   is generated by the distribution function of X ~   . In addition, E X ~ 2 = i = 0 trace ( A i cov ( ξ 1 , ξ 1 ) ( A i ) * )   , so that R d z 2 μ ( d z ) < .  

5.2 The verification of (1) and (2)

Due to ( X n x X n x ) = A ( X n 1 x X n 1 x ) ,   we have ( X n x X n x ) = A n ( x x )   . Let us transform the matrix A   into a Jordan form A = T J T 1   and notice that A n = T J n T 1   . It is well known that the maximal absolute value of entries of J n   is n | λ | n   , where | λ |   is the maximal absolute value among eigenvalues of A   . By Assumption  2.3 , | λ | < 1   .
So, there exist K > 0   and ϱ < 1   such that | λ | < ϱ   . Then, entries A p q n   of A n   are evaluated as: | A p q n | K ϱ n   . Hence, | X n x X n x | K ϱ n | x x |   , n 1   , and the verification of (1), (2) is in the framework of Section  3 .

5.3 The verification of (3)

As in Section  3 , the verification of this property is reduced to
lim n 1 n 2 α 1 log P ( | X n | > ɛ n α ) = , ɛ > 0 . (5.2)
In  5.2 , we may replace X n   by its copy X ~ n   defined in  5.1 . Notice also that | X ~ n | | A n x | + i = 0 max p q | A p q i | | ξ ~ | .   As was mentioned above, | A p q i | K ϱ j   for some K > 0   and ϱ ( 0 , 1 )   .
Hence, suffice it to verify lim n 1 n 2 α 1 log P ( i = 0 ϱ i | ξ i | > ɛ n α ) = , ɛ > 0   what be going on similarly to corresponding part of the proof in Section  3 .

6 Exotic example

Let ( X n ) n 0   , X n R   and X 0 = x   , be Markov chain defined by the recurrent equation
X n = X n 1 m X n 1 | X n 1 | + ξ n , (6.1)
where m   is a positive parameter, ( ξ n )   is i.i.d. sequence of zero mean random variables with E e δ | ξ 1 | < , for some δ > 0 ,   and let 0 0 = 0   .
Although the virtual scenario is not completely verifiable here we show that for H ( x ) = x | x |   the family ( S n α ) n 1   possesses the MDP provided that
m > 1 δ log E e δ | ξ 1 | . (6.2)
Indeed, by  6.1 we have 1 n α k = 1 n X k 1 | X k 1 | = 1 m ( X n x ) n α + 1 n α k = 1 n ξ k m .   The family ( 1 n α k = 1 n ξ k m ) n 1   possesses the MDP with the rate of speed n ( 2 α 1 )   and the rate function I ( y ) = m 2 2 E ξ 1 2 y 2   . Then, the family ( S n α ) n 1   obeys the same MDP provided that ( X n x n α ) n 1   is exponentially negligible family with the rate n ( 2 α 1 )   . This verification is reduced to
lim n 1 n 2 α 1 log P ( | X n | > n α ɛ ) = , ɛ > 0 . (6.3)
By the Chernoff inequality P ( | X n | > n α ɛ ) e δ n α ɛ E e δ | X n | ,   that is  6.3 holds if sup n 1 E e δ | X n | <   for some δ > 0   . We show that the latter holds true for δ   involved in  6.2 . A helpful tool for this verification is the inequality | z m z | z | | | | z | m |   . Write
E e δ | X n | = E e δ | X n | I ( | X n 1 | m ) + E e δ | X n | I ( | X n 1 | > m )
e δ m E e δ | ξ 1 | + e δ m E e δ | ξ 1 | E e δ | X n 1 | .
Set = e δ m E e δ | ξ 1 |   and ϱ = e δ m E e δ | ξ 1 |   . By  6.2 , ϱ < 1   . Hence, V ( x ) = e δ | x |   is the Lyapunov function: P x V ϱ V ( x ) + .   Consequently, E V ( X n ) ϱ E V ( X n ) + , n 1   and so, sup n 1 E V ( X n ) V ( x ) + 1 ϱ   .

7 Statistical example

An asymptotic analysis, given in this section, demonstrate the thesis “MDP instead of CLT”. Let X n = θ f ( X n 1 ) + ξ n ,   where θ   is a number and ( ξ n ) n 1   is i.i.d. sequence of of ( 0 , 1 )   -Gaussian random variables. We assume that | θ | < 1   and f   is bounded continuously differentiable function with | f ( x ) | 1   . By Theorem  2.1 , ( X n )   is an ergodic Markov chain and its invariant measure μ θ   depends on parameter θ   . Since ξ 1   is Gaussian random variables, μ θ   , being a convolution of some measure with Gaussian one, possesses a density relative to d z   . Then, assuming f 2 ( x ) > 0   relative to Lebesgue measure, we have B θ = R f 2 ( z ) μ ( d z ) > 0 .   Under the above assumptions, θ n = i = 1 n f ( X i 1 ) X i i = 1 n f 2 ( X i 1 )   is a strongly consistent estimate of θ   by sampling { X 1 , , X n }   , that is, lim n θ n = θ   a.s. Moreover, it is known its asymptotic in the CLT scale:
n ( θ θ n ) l a w n ( 0 , 1 B θ ) -Gaussian r. v.   Here, we give an asymptotic of θ n   in the MDP scale: for any α ( 1 2 , 1 )   , n 1 α ( θ θ n ) M D P n ( 1 n 2 α 1 , y 2 2 B θ ) .  
Theorem 7.1. The family n 1 α ( θ θ n )   obeys the MDP with the rate of speed 1 n 2 α 1   and the rate function I ( y ) = y 2 2 B θ   .
  • Proof. The use of n 1 α ( θ θ n ) = 1 n α i = 1 n f ( X i 1 ) ξ i 1 n i = 1 n f 2 ( X i 1 )   and the law of large numbers, P - lim n 1 n i = 1 n f 2 ( X i 1 ) = B θ   , give a hint that that the theorem statement is valid provided that (i) for M n = i = 1 f ( X i 1 ) ξ i   , the family ( 1 n α M n ) n   obeys the MDP with the rate of speed 1 n 2 α 1   and the rate function I ( y ) = y 2 2 B θ 1   ; (ii) for any ɛ > 0   , lim n 1 n 2 α 1 log P ( | 1 n i = 1 n [ f 2 ( X i 1 ) B θ ] | ɛ ) = .   Following to  1.5 and taking into account the setting, we notice that E n ( λ ) = exp ( i = 1 n λ 2 2 n 2 α f 2 ( X i 1 ) ) .   is the stochastic exponential related to ( 1 n α M n ) n   . Consequently,  3.6 is reduced to (ii), that is, only (ii) is left to be verified.
    The verification of (ii) is in the framework of Theorem  2.1 . The function H ( x ) = f 2 ( x ) B θ   satisfies the assumptions of Theorem  2.1 . Hence, the family ( 1 n α i = k n H ( X k 1 ) ) n   obeys the MDP with the rate of speed 1 n 2 α 1   and the rate function J ( y ) = { y 2 2 B ^ θ B ^ θ > 0 , , B ^ θ = 0 , y 0 ,   where, in accordance with  1.3 , B ^ θ = R H 2 ( x ) μ θ ( d x ) + 2 n 1 R H ( x ) P x ( n ) H μ θ ( d x ) .   In particular, l i m n 1 n 2 α 1 log P ( | 1 n α k = 1 n H ( X k 1 ) | C ɛ ) { 1 2 B ^ θ C 2 ɛ 2 , B ^ θ > 0 , otherwise .   Hence, for any C > 0   , we find that
    l i m n 1 n 2 α 1 log P ( | 1 n k = 1 n H ( X k 1 ) | ɛ )
    = l i m n 1 n 2 α 1 log P ( | 1 n α k = 1 n H ( X k 1 ) | n 1 α ɛ )
    l i m n 1 n 2 α 1 log P ( | 1 n α k = 1 n H ( X k 1 ) | C ɛ )
    { C 2 ɛ 2 2 B ^ θ B ^ θ > 0 , otherwise C .

A Exponentially integrable martingale-differences

Let ζ n = ( ζ n ) n 1   be a martingale-difference with respect to some filtration F = ( F n ) n 0   and M n = i = 1 n ζ i   be the corresponding martingale.
Theorem A.1. Assume that for sufficiently small positive δ   and any i 1  
E ( e δ | ζ i | | F i 1 ) c o n s t . (A.1)
Then for any α ( 0.5 , 1 )   lim n 1 n 2 α 1 log P ( | M n | > n ɛ ) = .  
  • Proof. Suffice it to prove lim n 1 n 2 α 1 log P ( ± M n > n ɛ ) =   . We verify here only “+” only (the proof of “-” is similar).
    For fixed positive λ   and sufficiently large n   , let us introduce the stochastic exponential E n ( λ ) = n i = 1 E ( e λ ζ i n | F i 1 ) .   A direct verification shows that E exp ( λ M n n log E n ( λ ) ) = 1 .   We apply this equality for further ones
    1 E I ( M n > n ɛ ) exp ( λ M n n log E n ( λ ) ) E I ( M n > n ɛ ) exp ( λ ɛ log E n ( λ ) ) . (A.2)
    Due to E ( λ ζ i n | F i 1 ) = 0   and  A.1 , we find that
    log E n ( λ ) = i = 1 n log ( 1 + E [ e λ ζ i n 1 λ ζ i n | F i 1 ] )
    i = 1 n { λ 2 2 n 2 E ( ( ζ i ) 2 | X i 1 ) + λ 3 6 n 3 E ( | ζ i | 3 e λ | ζ i | n | F i 1 ) }
    K [ λ 2 2 n + λ 3 6 n 2 ] ,
    where K   is some constant. This inequality, being incorporated into  A.2 , provides 1 E I ( M n > n ɛ ) exp ( λ ɛ K [ λ 2 2 n + λ 3 6 n 2 ] ) .   If ɛ < 3   , taking λ = ɛ n K 1   , we find that 1 n 2 α 1 log P ( M n > n ɛ ) ɛ 2 n 2 ( 1 α ) K ( 1 2 ɛ 6 ) n   Thus, the desired statement holds true.
References

  1. Albert, A. (1972) Regression and the Moore-Penrose Pseudoinverse. Academic Press, New York and London.
  2. Balaji, S. Meyen, S. P. (2000) Multiplicative ergodicity and large deviations for an irreducible Markov chain. Stochastic Processes and their Applications. 90, pp. 123-144.
  3. Bhattacharya, R.N. (1992) On the functional central limit theorem and the law of the iterated logarithm for Markov processes, Z. Wharsch. verw. Geb. 60, pp. 185–201.
  4. Chen. Xia, Guillin, A. (2004) The functional moderate deviations for Harris recurrent Markov chains and applications. Annales de l'Institut Henri Poincarè (B) Probabilitès et Statistiques. 40 pp. 89-124
  5. Dembo, A. (1996) Moderate deviations for martingales with bounded jumps,Elect. Comm. in Probab. 1, pp. 11-17.
  6. Ethier, S. N., Kurtz, T. G. (1986), Markov processes. Characterization and convergence, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, New York et al.
  7. Gong, F. and Wu, L. (2000) Spectral gap of positive operators and applications, C. R. Acad. Sci., Sér. I, Math. 331(12), pp. 983-988.
  8. Kontoyiannis, I., Meyn, S.P. (2002) Spectral Theory and Limit Theorems for Geometrically Ergodic Markov Processes. Article math.PR/0209200.
  9. Liptser, R.S. and Spokoiny, V. (1999) Moderate deviations type evaluation for integral functionals of diffusion processes, EJP. 4, Paper 17. (http://www.math.washington.edu/ ejpecp/)
  10. Liptser, R., Spokoiny, V. and Veretennikov, A.Yu. (2002) Freidlin-Wentzell type large deviations for smooth processes. Markov Process and Relat. Fields. 8, pp. 611-636.
  11. Meyn, S.P. Tweedie, R.L.(1993) Markov chains and stochastic stability, Springer-Verlag .
  12. Papanicolaou, C.C., Stroock, D.W., Varahan, S.R.S. (1977) Martingale approach to some limit theorems. in: Conference on Statistical Mechanics, Dinamical Systems and Turbulence, M. Reed ed., Duke Univ. Math. Series, 3.
  13. Pardoux, E., Veretennikov, A.Yu. (2001) On Poisson equation and diffusion approximation, 1. Ann. Prob. 29 (2001), n. 3, pp. 1061-1085.
  14. A.A Puhalskii, (1991) On functional principle of large deviations”. New trends in Probability and Statistics., Vilnius, Lithuania, VSP/Mokslas, , pp. 198-218.
  15. Puhalskii, A.A. (1994) The method of stochastic exponentials for large deviations. Stochast. Proc. Appl. 54, pp. 45-70.
  16. Wu, L. (1992) Moderate deviations of dependent random variables related to CLT and LIL. Prépublication N. 118, Lab. de Probabilité de l'Université Paris VI.
  17. Wu, L. (1995) Moderate deviations of dependent random variables related to CLT, Annals of Probability. 23, No. 1, pp. 420-445.

Universite de Rennes 1, IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France. E-mail address : bernard.delyon@univ-rennes1.fr University Joseph Fourier of Grenoble, France E-mail address : juditsky@inrialpes.fr Department of Electrical Engineering-Systems, Tel Aviv University, 69978 Tel Aviv Israel E-mail address : liptser@eng.tau.ac.il