Quantitative concentration inequalities for empirical measures on non-compact spaces
François Bolley, Arnaud Guillin, Cédric Villani
François Bolley: ENS Lyon, Umpa, 46 allée d'Italie, F-69364 Lyon Cedex 07. E-mail address: fbolley@umpa.ens-lyon.fr
Arnaud Guillin: CEREMADE, Université Paris Dauphine. E-mail address: guillin@ceremade.dauphine.fr
Cédric Villani: ENS Lyon, Umpa, 46 allée d'Italie, F-69364 Lyon Cedex 07. E-mail address: cvillani@umpa.ens-lyon.fr
Abstract.
We establish some quantitative concentration estimates for the empirical measure of many independent variables, in transportation distances. As an application, we provide some error bounds for particle simulations in a model mean field problem. The tools include coupling arguments, as well as regularity and moments estimates for solutions of certain diffusive partial differential equations.
Introduction
Large stochastic particle systems constitute a popular way to perform numerical simulations in many contexts, either because they are used in some physical model (as in e.g. stellar or granular media) or as an approximation of a continuous model (as in e.g. vortex simulation for Euler equation, see [21, Chapter 5] for instance). For such systems one may wish to establish concentration estimates showing that the behavior of the system is sharply stabilized as the number $N$ of particles goes to infinity. It is natural to search for these estimates in the setting of large (or moderate) deviations, since one wishes to make sure that the numerical method has a very small probability to give wrong results. From a physical perspective, concentration estimates may be useful to establish the validity of a continuous approximation such as a mean-field limit.
When one is interested in the asymptotic behavior of just one, or a few observables (such as the mean position...), there are efficient methods, based for instance on concentration of measure theory. As a good example, Malrieu [19] recently applied tools from the fields of Logarithmic Sobolev inequalities, optimal transportation and concentration of measure, to prove very neat bounds like
|
(0.1)
|
Here $(X_t^i)_{1\le i\le N}$ stand for the positions of particles (in phase space) at time $t$, $\varepsilon$ is a given error, $\mathbb{P}$ stands for the probability, $\mu_t$ is a probability measure governing the limit behavior of the system, and $\lambda$ is a positive constant depending on the particular system he is considering (a simple instance of McKean-Vlasov model used in particular in the modelling of granular media). Moreover,
$$\|\varphi\|_{\mathrm{Lip}} := \sup_{x\neq y}\frac{|\varphi(x)-\varphi(y)|}{d(x,y)},$$
where $d$ is the distance in phase space (say the Euclidean norm $|\cdot|$ in $\mathbb{R}^d$).
This approach can lead to nice bounds, but has the drawback of being limited to a finite number of observables. Of course, one may apply (0.1) to many functions $\varphi$, and obtain something like
|
(0.2)
|
where $(\varphi_k)_{k\in\mathbb{N}}$ is an arbitrarily chosen dense family in the set of all $1$-Lipschitz functions converging to 0 at infinity. If we denote by $\delta_x$ the Dirac mass at point $x$, and by
$$\hat\mu_t^N := \frac1N\sum_{i=1}^N \delta_{X_t^i}$$
the empirical measure associated with the system (this is a random probability measure), then estimate (0.2) can be interpreted as a bound on how close $\hat\mu_t^N$ is to $\mu_t$. Indeed,
|
(0.3)
|
defines a distance on probability measures, associated with a topology which is at least as strong as the weak convergence of measures (convergence against bounded continuous test functions). However, this point of view is deceiving: for practical purposes, this distance can hardly be estimated, and in any case (0.2) does not contain more information than (0.1): it is only useful if one considers a finite number of observables.
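Sanov's large deviation principle [12, Theorem 6.2.10] provides a more satisfactory tool to estimate the distance between the empirical measure and its limit. Roughly speaking, it implies, for independent variables $X^1,\dots,X^N$ with common law $\mu$, an estimate of the form
$$\mathbb{P}\big[\hat\mu^N\in A\big] \simeq e^{-N\,H(A|\mu)} \qquad\text{as } N\to\infty,$$
where
$$H(A|\mu) := \inf\big\{H(\nu|\mu);\ \nu\in A\big\} \tag{0.4}$$
and $H(\cdot|\mu)$ is the relative $H$ functional:
$$H(\nu|\mu) = \int \frac{d\nu}{d\mu}\,\log\frac{d\nu}{d\mu}\,d\mu$$
(to be interpreted as $+\infty$ if $\nu$ is not absolutely continuous with respect to $\mu$). Since $H(\cdot|\mu)$ behaves in many ways like a square distance, one can hope that
$$\mathbb{P}\big[\mathrm{dist}(\hat\mu^N,\mu)\ge\varepsilon\big] \simeq e^{-\mathrm{const.}\,N\varepsilon^2}.$$
Here "dist" may be any distance which is continuous with respect to the weak topology, a condition which might cause trouble on a non-compact phase space.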
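To make the relative $H$ functional concrete, here is a minimal sketch for measures supported on finitely many points, where the integral reduces to a sum. The function name `relative_entropy` and the dictionary encoding of the laws are illustrative choices, not notation from the paper.

```python
import math

def relative_entropy(nu, mu):
    """H(nu | mu) = sum_x nu(x) * log(nu(x) / mu(x)) for discrete laws.

    nu and mu are dicts mapping points to probabilities (an encoding
    chosen for this sketch). Returns math.inf when nu is not absolutely
    continuous with respect to mu.
    """
    total = 0.0
    for x, q in nu.items():
        if q == 0.0:
            continue  # convention 0 * log 0 = 0
        p = mu.get(x, 0.0)
        if p == 0.0:
            return math.inf  # nu charges a point that mu does not
        total += q * math.log(q / p)
    return total

mu = {0: 0.5, 1: 0.5}
same = relative_entropy(mu, mu)                    # identical laws
tilted = relative_entropy({0: 0.9, 1: 0.1}, mu)    # strictly positive
singular = relative_entropy({2: 1.0}, mu)          # infinite
print(same, tilted, singular)
```

The three calls illustrate the three regimes: zero entropy for identical laws, a finite positive value for a perturbed law, and an infinite value when absolute continuity fails.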
Yet Sanov's theorem is not the final answer either: it is actually asymptotic, and only implies a bound like
which, unlike (0.1), does not contain any explicit estimate for a given $N$. Fortunately, there are known techniques to obtain quantitative upper bounds for such theorems, see in particular [12, Exercise 4.5.5]. Since these techniques are devised for compact phase spaces, a further truncation will be necessary to treat more general situations.
In this paper, we shall show how to combine these ideas with recent results about measure concentration and transportation distances, in order to derive in a systematic way estimates that are explicit, deal with the empirical measure as a whole, apply to non-compact phase spaces, and can be used to study some particle systems arising in practical problems.
Typical estimates will be of the form
|
(0.5)
|
As a price to pay, the constant
in the right-hand side will be much larger than the one in 0.1 .
Here is a possible application of (0.5) in a numerical perspective. Suppose your system has a limit invariant measure $\mu_\infty$ as $t\to\infty$, and you wish to numerically plot its density $f_\infty$. For that, you run your particle simulation for a long time $T$, and plot, say,
$$\tilde f_T(x) := \frac1N\sum_{i=1}^N \zeta_\alpha\big(x-X_T^i\big), \tag{0.6}$$
where $\zeta_\alpha(x)=\alpha^{-d}\,\zeta(x/\alpha)$ is a smooth approximation of a Dirac mass as $\alpha\to0$ (as usual, $\zeta$ is a nonnegative smooth radial function on $\mathbb{R}^d$ with compact support and unit integral).
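The mollification step can be sketched in a few lines. The example below works in dimension $d=1$, takes for $\zeta$ the classical bump $\exp(-1/(1-x^2))$ normalized numerically, and uses standard Gaussian samples as a stand-in for the particle positions; all of these choices, as well as the names `bump`, `zeta_alpha` and `mollified_density`, are assumptions made for illustration only.

```python
import math
import random

def bump(x):
    """Unnormalized smooth bump, nonnegative, even, supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

# Normalize numerically (midpoint rule) so that zeta has unit integral.
_M = 20000
_Z = sum(bump(-1.0 + (k + 0.5) * 2.0 / _M) for k in range(_M)) * 2.0 / _M

def zeta_alpha(x, alpha):
    """zeta_alpha(x) = zeta(x / alpha) / alpha in dimension d = 1."""
    return bump(x / alpha) / (_Z * alpha)

def mollified_density(x, particles, alpha):
    """Value at x of (1/N) * sum_i zeta_alpha(x - X_i)."""
    return sum(zeta_alpha(x - xi, alpha) for xi in particles) / len(particles)

rng = random.Random(0)
# Hypothetical particle positions: i.i.d. standard Gaussian samples.
particles = [rng.gauss(0.0, 1.0) for _ in range(5000)]
estimate = mollified_density(0.0, particles, alpha=0.3)
print(estimate, 1.0 / math.sqrt(2.0 * math.pi))  # estimate vs. true density at 0
```

A smaller `alpha` reduces the smoothing bias but increases the statistical fluctuation for a fixed number of particles, which is why the choice of $\alpha$ has to be tied to $N$.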
With the help of estimates such as (0.5), it is often possible to compute bounds on, say, $\|\tilde f_T-f_\infty\|_{L^\infty}$ in terms of $N$, $\varepsilon$, $T$ and $\alpha$. In this way one can "guarantee" that all details of the invariant measure are captured by the stochastic system. While this problem is too general to be treated abstractly, we shall show on some concrete model examples how to derive such bounds for the same kind of systems that were considered by Malrieu. In the next section, we shall explain our main tools and results; the rest of the paper will be devoted to the proofs. Some auxiliary estimates of general interest are postponed to the Appendix.
1 Tools and main results
1.1 Wasserstein distances
To measure distances between probability measures, we shall use transportation distances, also called Wasserstein distances. They can be defined in an abstract Polish space $X$ as follows: given $p$ in $[1,+\infty)$, $d$ a lower semi-continuous distance on $X$, and $\mu$ and $\nu$ two Borel probability measures on $X$, the Wasserstein distance of order $p$ between $\mu$ and $\nu$ is
$$W_p(\mu,\nu) := \inf_{\pi\in\Pi(\mu,\nu)}\left(\iint d(x,y)^p\,d\pi(x,y)\right)^{1/p},$$
where $\pi$ runs over the set $\Pi(\mu,\nu)$ of all joint probability measures on the product space $X\times X$ with marginals $\mu$ and $\nu$; it is easy to check [29, Theorem 7.3] that $W_p$ is a distance on the set $P_p(X)$ of Borel probability measures $\mu$ on $X$ such that $\int d(x_0,x)^p\,d\mu(x)<+\infty$.
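As an illustration of the definition, the infimum over couplings can be computed by hand in one special case: on the real line with $d(x,y)=|x-y|$, and for two empirical measures with the same number of atoms, the monotone coupling (pair the sorted samples) is optimal for every $p\ge1$. The sketch below relies on that classical fact; the function name `wasserstein_1d` is an illustrative choice.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """W_p between two empirical measures on R with equally many atoms.

    Uses the monotone (sorted) coupling, which is optimal on the real
    line for the cost |x - y|^p with p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    xs, ys = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)

xs = [0.0, 1.0, 3.0]
ys = [5.0, 1.0, 2.0]
w1 = wasserstein_1d(xs, ys, p=1.0)   # (1 + 1 + 2) / 3
w2 = wasserstein_1d(xs, ys, p=2.0)   # sqrt((1 + 1 + 4) / 3)
print(w1, w2)
```

The inequality `w1 <= w2` observed here is an instance of the ordering of Wasserstein distances in $p$ given by Jensen's inequality.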
For this choice of distance, in view of Sanov's theorem, a very natural class of inequalities is the family of so-called transportation inequalities, or Talagrand inequalities (see [17] for instance): by definition, given $p\ge1$ and $\lambda>0$, a probability measure $\mu$ on $X$ satisfies $T_p(\lambda)$ if the inequality
$$W_p(\nu,\mu)\le\sqrt{\frac{2}{\lambda}\,H(\nu|\mu)}$$
holds for any probability measure $\nu$. We shall say that $\mu$ satisfies a $T_p$ inequality if it satisfies $T_p(\lambda)$ for some $\lambda>0$. By Jensen's inequality, these inequalities become stronger as $p$ becomes larger; so the weakest of all is $T_1$. Some variants introduced in [8] will also be considered.
Of course $T_p$ is not a very explicit condition, and a priori it is not clear how to check that a given probability measure satisfies it. It has been proven [7, 14, 8] that $T_1$ is equivalent to the existence of a square-exponential moment: in other words, a reference measure $\mu$ satisfies $T_1$ if and only if there is $\alpha>0$ such that
$$\int e^{\alpha\,d(x,y)^2}\,d\mu(x)<+\infty$$
for some (and thus any) $y\in X$. If that condition is satisfied, then one can find explicitly some $\lambda$ such that $T_1(\lambda)$ holds true: see for instance [8].
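The square-exponential moment criterion is easy to test on an example. For the standard Gaussian law on the real line (chosen here purely as an illustration), the moment of $e^{\alpha x^2}$ equals $(1-2\alpha)^{-1/2}$ when $\alpha<1/2$ and is infinite otherwise, so the criterion is met for small $\alpha$. The sketch below checks the closed form by a plain midpoint rule; the function name and the integration window are assumptions of this example.

```python
import math

def square_exp_moment_gaussian(alpha, half_width=40.0, steps=100_000):
    """Midpoint-rule approximation of E[exp(alpha * X^2)] for X ~ N(0, 1).

    Only meaningful for alpha < 1/2; for alpha >= 1/2 the true moment is
    infinite and the truncated integral just grows with half_width.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        # Integrand exp(alpha x^2) times the Gaussian density, combined
        # into a single exponent to avoid overflow.
        total += math.exp((alpha - 0.5) * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

numeric = square_exp_moment_gaussian(0.25)
closed_form = (1.0 - 2.0 * 0.25) ** -0.5
print(numeric, closed_form)  # both close to sqrt(2)
```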
This criterion makes $T_1$ a rather convenient inequality to use. Another popular inequality is $T_2$, which appears naturally in many situations where a lot of structure is available, and which has good tensorization properties in many dimensions. Up to now, $T_2$ inequalities have not been so well characterized: it is known that they are implied by a Logarithmic Sobolev inequality [23, 6, 30], and that they imply a Poincaré, or spectral gap, inequality [23, 6]. See [11] for an attempt at a criterion for $T_2$. In any case, contrary to the case $p=1$, there is no hope to obtain $T_2$ inequalities from just integrability or decay estimates.
In this paper, we shall mainly focus on the case $p=1$, which is much more flexible.
1.2 Metric entropy
When $X$ is a compact space, the minimum number $m(X,r)$ of balls of radius $r$ needed to cover $X$ is called the metric entropy of $X$. This quantity plays an important role in quantitative variants of Sanov's Theorem [12, Exercise 4.5.5]. In the present paper, to fix ideas we shall always be working in the particular Euclidean space $\mathbb{R}^d$, which of course is not compact; and we shall reduce to the compact case by truncating everything to balls of finite radius $R$. This particular choice will influence the results through the function $m(P_p(B_R),r)$, where $B_R$ is the ball of radius $R$ centered at some point, say the origin, and $P_p(B_R)$ is the space of probability measures on $B_R$, metrized by $W_p$.
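To give a feeling for how covering numbers scale, here is a crude upper bound for the Euclidean ball itself (not for the space of probability measures on it, which is the harder object treated in the Appendix). The ball of radius $R$ sits in the cube $[-R,R]^d$, which is cut into small cubes whose half-diagonal is at most $r$, so that each small cube fits in a ball of radius $r$. The function name `covering_upper_bound` is an illustrative choice.

```python
import math

def covering_upper_bound(R, r, d):
    """Upper bound on the number of radius-r balls needed to cover B_R in R^d.

    Splits each axis of the cube [-R, R]^d into k equal pieces with
    k = ceil(R * sqrt(d) / r); each small cube then has half-diagonal
    at most r and is contained in the ball of radius r around its center.
    """
    per_axis = math.ceil(R * math.sqrt(d) / r)
    return per_axis ** d

for d in (1, 2, 3):
    print(d, covering_upper_bound(1.0, 0.5, d))
```

The bound grows like $(C R/r)^d$, which is the usual polynomial-in-$1/r$, exponential-in-$d$ behavior of covering numbers in finite dimension.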
1.3 Sanov-type theorems
The core of our estimates is based on variants of Sanov's Theorem, all dealing with independent random variables. Let $\mu$ be a given probability measure on $\mathbb{R}^d$, and let $(X^i)_{1\le i\le N}$ be a sample of independent variables, all distributed according to $\mu$; let also
$$\hat\mu^N := \frac1N\sum_{i=1}^N\delta_{X^i}$$
be the associated empirical measure. In our first main result we assume a $T_p$ inequality for the measure $\mu$, and deduce from that an upper bound in $W_p$ distance:
Theorem 1.1.
Let $p\in[1,2]$ and let $\mu$ be a probability measure on $\mathbb{R}^d$ satisfying a $T_p(\lambda)$ inequality. Then, for any $d'>d$ and $\lambda'<\lambda$, there exists some constant $N_0$, depending on $\lambda'$, $d'$ and some square-exponential moment of $\mu$, such that for any $\varepsilon>0$ and $N\ge N_0\max(\varepsilon^{-(d'+2)},1)$,
$$\mathbb{P}\big[W_p(\mu,\hat\mu^N)>\varepsilon\big]\le e^{-\gamma_p\frac{\lambda'}{2}N\varepsilon^2}, \tag{1.1}$$
where
$$\gamma_p=\begin{cases}1 & \text{if } 1\le p<2,\\ 3-2\sqrt2 & \text{if } p=2.\end{cases}$$
Compared to Sanov's Theorem, this result is more restrictive in the sense that it requires some extra assumptions on the reference measure $\mu$, but under these hypotheses we are able to replace a result which was only asymptotic by a pointwise upper bound on the error probability, together with a lower bound on the required size of the sample.
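In view of the Kantorovich-Rubinstein duality formula
$$W_1(\mu,\nu)=\sup\Big\{\int f\,d(\mu-\nu);\ \|f\|_{\mathrm{Lip}}\le1\Big\}, \tag{1.2}$$
Theorem 1.1 implies concentration inequalities such as
$$\mathbb{P}\Big[\sup_{\|f\|_{\mathrm{Lip}}\le1}\Big(\frac1N\sum_{i=1}^N f(X^i)-\int f\,d\mu\Big)>\varepsilon\Big]\le e^{-\frac{\lambda'}{2}N\varepsilon^2}$$
for $\lambda'<\lambda$, and $N$ sufficiently large, under the assumption that $\mu$ satisfies a $T_1(\lambda)$ inequality, or equivalently admits a finite square-exponential moment. Those types of inequalities are of interest in non-parametric statistics and choice models [22].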
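The easy half of the Kantorovich-Rubinstein duality, namely that every 1-Lipschitz observable is controlled by $W_1$, can be checked numerically. The sketch below works on the real line with two samples of equal size, computes $W_1$ by pairing sorted samples (optimal in dimension one), and compares it with the gap between empirical averages of a few 1-Lipschitz test functions. The sample laws, the test functions and the helper names are assumptions of this example.

```python
import math
import random

def w1_sorted(xs, ys):
    """W_1 between two empirical measures on R with equally many atoms."""
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

def gap(f, xs, ys):
    """Absolute difference of the averages of f under the two samples."""
    return abs(sum(map(f, xs)) / len(xs) - sum(map(f, ys)) / len(ys))

rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [rng.gauss(0.5, 1.5) for _ in range(1000)]

# A few 1-Lipschitz observables (illustrative choices).
observables = {
    "sin": math.sin,
    "abs": abs,
    "identity": lambda x: x,
    "clip": lambda x: max(-1.0, min(1.0, x)),
}

w1 = w1_sorted(xs, ys)
gaps = {name: gap(f, xs, ys) for name, f in observables.items()}
for name, g in gaps.items():
    print(f"{name:>8}: gap = {g:.4f} <= W_1 = {w1:.4f}")
```

Controlling $W_1$ thus controls all such observables at once, which is the point of working with the empirical measure as a whole rather than with finitely many test functions.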
Remark 1.2.
The sole inequality
implies that for every 1-Lipschitz function
,
and it is easy to see that the coefficient
in this inequality is the best possible.
While the quantity controlled in Theorem 1.1 is much stronger, the estimate is weakened only in that
is replaced by some
(arbitrarily close to
) and that
has to be large enough. In fact, a variant of the proof below would yield estimates such as
where now there is no restriction on
, but
is a larger constant, explicitly computable from the proof.
Remark 1.3.
As pointed out to us by M. Ledoux, there is another route to concentration estimates on the empirical measure when
. Indeed, in this specific case,
where
stands for the Heaviside function on
and
denotes the distribution function of
, so that
where
are centered
-valued independent identically distributed random variables.
But, according to [1, Exercise 3.8.14], a centered
-valued random variable
satisfies a Central Limit Theorem if and only if
a condition which for the random variables
's can be written
|
(1.3)
|
Condition 1.3 in turn holds true as soon as (for instance)
is finite for some positive
. Then we may apply a quantitative version of the Central Limit Theorem for random variables in the Banach space $L^1(\mathbb{R})$. See [16] and [18] for related works.
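The one-dimensional identity behind this remark is the classical formula $W_1(\mu,\nu)=\int_{\mathbb{R}}|F_\mu(t)-F_\nu(t)|\,dt$ in terms of distribution functions. The sketch below applies it to an empirical measure and a reference law, using the standard Gaussian as reference; the reference law, the integration window and the function names are assumptions of this example.

```python
import bisect
import math
import random

def normal_cdf(t):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def w1_to_reference(sample, cdf, lo=-10.0, hi=10.0, steps=20000):
    """Approximate W_1 between the empirical measure of `sample` and the
    law with distribution function `cdf`, via the integral of |F_N - F|
    over [lo, hi] (midpoint rule)."""
    xs = sorted(sample)
    n = len(xs)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * h
        empirical = bisect.bisect_right(xs, t) / n  # F_N(t)
        total += abs(empirical - cdf(t))
    return total * h

rng = random.Random(2)
sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
distance = w1_to_reference(sample, normal_cdf)
dirac = w1_to_reference([0.0], normal_cdf)  # equals E|X| = sqrt(2/pi)
print(distance, dirac)
```

For a Dirac mass at the origin the formula returns the mean absolute value of the reference law, which gives a simple sanity check of the implementation.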
Remark 1.4.
Theorem 1.1 applies if
is at least as large as
for some
; we do not know whether
here is optimal.
For the applications that we shall treat, in which the tails of the probability distributions will be decaying very fast, Theorem 1.1 will be sufficient. However, it is worth pointing out that the technique works under much broader assumptions: weaker estimates can be proven for probability measures that do not decay fast enough to admit finite square-exponential moments. Below are some such results using only polynomial moment estimates:
Theorem 1.5.
Let
and let
be a probability measure on
such that
Then (i) For any
,
and
, there exists a constant
such that
for any
and
; (ii) For any
,
and
there exists a constant
such that
for any
and
.
Here are also some variants under alternative “regularity” assumptions:
Theorem 1.6.
-
(i)
Let
; assume that
is finite for some
. Then, for all
, there exist some constants
and
, depending only on
,
and
, such that
for any
and
.
-
(ii)
Suppose that
satisfies
and a Poincaré inequality, then for all
there exist some constants
and
such that
|
(1.4)
|
for any
and
.
-
(iii)
Let
and let
be a probability measure on
satisfying
.
Then for all
and
there exists some constant
, depending on
only through
and some square-exponential moment, such that
|
(1.5)
|
for any
and
.
1.4 Interacting systems of particles
We now consider a system of $N$ interacting particles whose time-evolution is governed by the system of coupled stochastic differential equations
$$dX_t^i=\sqrt2\,dB_t^i-\nabla V(X_t^i)\,dt-\frac1N\sum_{j=1}^N\nabla W(X_t^i-X_t^j)\,dt,\qquad i=1,\dots,N. \tag{1.6}$$
Here $X_t^i$ is the position at time $t$ of particle number $i$, the $B^i$'s are $N$ independent Brownian motions, and $V$ and $W$ are smooth potentials, sufficiently nice that (1.6) can be solved globally in time. We shall always assume that $W$ (which can be interpreted as an interaction potential) is a symmetric function, that is $W(-z)=W(z)$ for all $z\in\mathbb{R}^d$.
Equation (1.6) is a particularly simple instance of coupled system; in the case when $V$ is quadratic and $W$ has cubic growth, it was used as a simple mean-field kinetic model for granular media (see e.g. [19]). While many of our results could be extended to more general systems, that particular one will be quite enough for our exposition.
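For readers who want to experiment, here is a minimal Euler-Maruyama sketch of such a particle system in dimension one. It assumes the dynamics $dX_t^i=\sqrt2\,dB_t^i-\nabla V(X_t^i)\,dt-\frac1N\sum_j\nabla W(X_t^i-X_t^j)\,dt$, with the purely illustrative choices $V(x)=x^2/2$ and $W(z)=z^2/2$ (both with bounded Hessian). The explicit time-stepping scheme, the step size and the function names are assumptions of this sketch, not part of the paper's analysis.

```python
import math
import random

def grad_V(x):
    return x  # V(x) = x^2 / 2, an illustrative confinement potential

def grad_W(z):
    return z  # W(z) = z^2 / 2, an illustrative symmetric interaction

def interaction_drift(xs):
    """(1/N) * sum_j grad_W(x_i - x_j) for each particle i (O(N^2) cost)."""
    n = len(xs)
    return [sum(grad_W(xi - xj) for xj in xs) / n for xi in xs]

def euler_maruyama(xs0, T, steps, rng):
    """Explicit Euler-Maruyama discretization on [0, T] with `steps` steps."""
    dt = T / steps
    xs = list(xs0)
    for _ in range(steps):
        inter = interaction_drift(xs)
        xs = [
            x - (grad_V(x) + g) * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
            for x, g in zip(xs, inter)
        ]
    return xs

rng = random.Random(0)
final = euler_maruyama([3.0] * 50, T=5.0, steps=200, rng=rng)
mean = sum(final) / len(final)
print(mean)  # the center of mass relaxes towards 0 for this choice of V
```

Because $W$ is symmetric, $\nabla W$ is odd and the interaction terms cancel in the sum over particles, so the interaction does not move the center of mass; this is a convenient consistency check on any implementation.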
To this system of particles is naturally associated the empirical measure, defined for each time $t\ge0$ by
$$\hat\mu_t^N := \frac1N\sum_{i=1}^N\delta_{X_t^i}. \tag{1.7}$$
Under suitable assumptions on the potentials $V$ and $W$, it is a classical result that, if the initial positions of the particle system are distributed chaotically (for instance, if they are identically distributed, independent random variables), then the empirical measure $\hat\mu_t^N$ converges as $N\to\infty$ to a solution of the nonlinear partial differential equation
$$\frac{\partial\mu_t}{\partial t}=\Delta\mu_t+\nabla\cdot\big(\mu_t\,\nabla(V+W*\mu_t)\big), \tag{1.8}$$
where $\nabla\cdot$ stands for the divergence operator. Equation (1.8) is a simple instance of McKean-Vlasov equation. This convergence result is part of the by now well-developed theory of propagation of chaos, and was studied by Sznitman for pedagogical reasons [27], in the case of potentials that grow at most quadratically at infinity. Later, Benachour, Roynette, Talay and Vallois [2, 3] considered the case where the interaction potential grows faster than quadratically. As far as the limit equation (1.8) is concerned, a discussion of its use in the modelling of granular media in kinetic theory was performed by Benedetto, Caglioti, Carrillo and Pulvirenti [4, 5], while the asymptotic behavior in large time was studied by Carrillo, McCann and Villani [9, 10] with the help of Wasserstein distances and entropy inequality methods. Then Malrieu [19] presented a detailed study of both limits $t\to\infty$ and $N\to\infty$ by probabilistic methods, and established estimates of the type of (0.1) under adequate convexity assumptions on $V$ and $W$ (see also [29, Problem 15]).
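As announced before, we shall now give some estimates on the convergence at the level of the law itself. To fix ideas, we assume that $V$ and $W$ have locally bounded Hessian matrices satisfying
$$D^2V(x)\ge\beta I,\qquad \gamma I\le D^2W(x)\le\gamma' I,\qquad \forall x\in\mathbb{R}^d, \tag{1.9}$$
for some real constants $\beta$, $\gamma$ and $\gamma'$. Under these assumptions, we shall derive the following bounds.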
Theorem 1.7.
Let
be a probability measure on
, admitting a finite square-exponential moment:
Let
be
independent random variables with common law
. Let
be the solution of 1.6 with initial value
, where
and
are assumed to satisfy 1.9 ; and let
be the solution of 1.8 with initial value
.
Let also
be the empirical measure associated with the
. Then, for all
, there exists some constant
such that, for any
, there exist some constants
and
such that for all
Note that in the above theorem we have proven not only that for all
, the empirical measure is close to the limit measure, but also that the probability of observing any significant deviation during a whole time period
is small.
The fact that
is very close to the deterministic measure
implies the propagation of chaos: two particles drawn from the system behave independently of each other as
(see Sznitman [27] for more details). But we can also directly study correlations between particles and find more precise estimates: for that purpose it is convenient to consider the empirical measure on pairs of particles, defined as
By a simple adaptation of the computations appearing in the proof of Theorem 1.7 , one can prove
Theorem 1.8.
With the same notation and assumptions as in Theorem 1.7 , for all
and
, there exist some constants
and
such that for all
(Here
stands for the Wasserstein distance of order
on
.) Of course, one may similarly consider the problem of drawing
particles with
.
Theorems 1.7 and 1.8 use Theorem 1.1 as a crucial ingredient, which is why a strong integrability assumption is imposed on
. Note however that, under stronger assumptions on the behaviour at infinity of
or
, such as the existence of some
,
such that
it can be proven that any square exponential moment for
becomes instantaneously finite for
. Note also that, by using Theorem 1.5 , one can obtain weaker but still relevant results of concentration of the empirical measure under just polynomial moment assumptions on
, provided that
does not grow too fast at infinity. To limit the size of this paper, we shall not go further into such considerations.
1.5 Uniform in time estimates
In the “uniformly convex case” when
, it can be proven [19, 9, 10] that
converges exponentially fast, as
, to some equilibrium measure
. In that case, it is natural to expect that the empirical measure is a good approximation of
as
and
, uniformly in time. This is what we shall indeed prove:
Theorem 1.9.
With the same notation and assumptions as in Theorem 1.7 , suppose that
. Then there exists some constant
such that for any
, there exist some constants
and
such that for all
As a consequence, there are constants
,
(depending on the initial datum) and
such that, under the same conditions on
and
,
Remark 1.10.
In view of the results in [9]
, it is natural to expect that a similar conclusion holds true when
and
is convex enough. Propositions 3.1 and 3.8 below extend to that case, but it seems trickier to adapt the proof of Proposition 3.8 .
We conclude with an application to the numerical reconstruction of the invariant measure.
Theorem 1.11.
With the same notation and assumptions as in Theorem 1.9 , consider the mollified empirical measure 0.6 . Then one can choose
in such a way that
These results are effective: all the constants therein can be estimated explicitly in terms of the data.
1.6 Strategy and plan
The strategy is rather systematic. First, we shall establish Sanov-type bounds for independent variables in
(not depending on time), resulting in concentration results such as Theorems 1.1 to 1.6. This will be achieved along the ideas in [12, Exercises 4.5.5 and 6.2.19] (see also [25, Section 5]), by first truncating to a compact ball, and then covering the set of probability measures on this ball by a finite number of small balls (in the space of probability measures); the trickiest part will actually lie in the optimization of parameters.
With such results in hand, we will start the study of the particle system by introducing the nonlinear partial differential equation 1.8 . For this equation, the Cauchy problem can be solved in a satisfactory way, in particular existence and uniqueness of a solution, which for
is reasonably smooth, can be shown under various assumptions on
and
(see e.g. [9, 10] ). Other regularity estimates such as the decay at infinity, or the smoothness in time, can be established; also the convergence to equilibrium in large time can sometimes be proven.
Next, following the presentation by Sznitman [27] , we introduce a family of independent processes
, governed by the stochastic differential equation
|
(1.10)
|
As a consequence of Itô's formula, the law
of each
is a solution of the linear partial differential equation
But this linear equation is also solved by
, and a uniqueness theorem implies that actually
, for all
. See [2, 3] for related questions on the stochastic differential equation 1.10 .
For each given
, the independence of the variables
and the good decay of
will imply a strong concentration of the empirical measure
To go further, we shall establish a more precise information, such as a control on
Such bounds will be obtained by combining the estimate of concentration at fixed time
with some estimates of regularity of
(and
) in
, obtained via basic tools of stochastic differential calculus (in particular Doob's inequality).
Finally, we can show by a Gronwall-type argument that the control of the distance of
to
reduces to the control of the distance of
to
: for instance,
|
(1.11)
|
for some constant
. We shall also show how a variant of this computation provides estimates of the type of those in Theorem 1.9 , and how to get data reconstruction estimates as in Theorem 1.11 .
1.7 Remarks and further developments
The results in this paper confirm what seems to be a rather general rule about Wasserstein distances: results in distance
are very robust and can be used in rather hard problems, with no particular structure; on the contrary, results in distance
are stronger, but usually require much more structure and/or assumptions. For instance, in the study of the equation 1.8 , the distance
works beautifully, and this might be explained by the fact that 1.8 has the structure of a gradient flow with respect to the
distance [9, 10] . In the problem considered by Malrieu [19] ,
is also well-adapted, but leads him to impose strong assumptions on the initial datum
, such as the existence of a Logarithmic Sobolev inequality for
, considered as a reference measure. As a general rule, in a context of geometric inequalities with more or less subtle isoperimetric content, related to Brenier's transportation mapping theorem,
is also the most natural distance to use [29] . On the contrary, here we are considering quite a rough problem (concentration for the law of a random probability measure, driven by a stochastic differential equation with coupling) and we wish to impose only natural integrability conditions; then the distance
is much more convenient.
Further developments could be considered. For instance, one may desire to prove some deviation inequalities for dependent sequences, say Markov chains, as both Sanov's theorem and transportation inequality can be established under appropriate ergodicity and integrability conditions.
Considering again the problem of the particle system, in a numerical context, one may wish to take into account the numerical errors associated with the time-discretization of the dynamics (say an implicit Euler scheme). For concentration estimates in one observable, a beautiful study of these issues was performed by Malrieu [20] . For concentration estimates on the whole empirical measure, to our knowledge the study remains to be done. Also errors due to the boundedness of the phase space actually used in the simulation might be taken into account, etc.
At a more technical level, it would be desirable to relax the assumption of boundedness of
in Theorem 1.7 , so as to allow for instance the interesting case of cubic interaction.
This is much more technical and will be considered in a separate work.
Another issue of interest would be to consider concentration of the empirical measure on path space, i.e.
where
is a fixed time length. Here
is a random measure on
and we would like to show that it is close to the law of the trajectories of the nonlinear stochastic differential equation
|
(1.12)
|
where the initial datum
is drawn randomly according to
. This will imply a quantitative information on the whole trajectory of a given particle in the system.
When one wishes to adapt the general method to this question, a problem immediately occurs: not only is
not compact, but also balls with finite radius in this space are not compact either (of course, this is true even if the phase space of particles is compact). One may remedy this problem by embedding
into a space such as
, equipped with the weak topology; but we do not know of any “natural” metric on that space. There is (at least) another way out: we know from classical stochastic processes theory that integral trajectories of differential equations driven by white noise are typically Hölder-
for any
. This suggests a natural strategy: choose any fixed
and work in the space
, equipped with the norm
For any
, the ball of radius
and center 0 (the zero function) in
is compact, and one may estimate its metric entropy. Then one can hope to perform all estimates by using the norm
; for instance, establish a bound on, say, a square-exponential moment on the law of
:
Again, to avoid expanding the size of the present paper too much, these issues will be addressed separately.
2 The case of independent variables
In this section we consider the case where we are given
independent variables
, distributed according to a certain law
. There is no time dependence at this stage. We shall first examine the case when the law
has very fast decay (Theorem 1.1), then variants in which it decays in a slower way (Theorems 1.5 and 1.6).
2.1 Proof of Theorem 1.1
The proof splits into three steps: (1) Truncation to a compact ball
of radius
, (2) covering of
by small balls of radius
and Sanov's argument, and (3) optimization of the parameters.
Step 1: Truncation. Let
, to be chosen later on, and let
stand for the ball of radius
and center 0 (say) in
. Let
stand for the indicator function of
. We truncate
into a probability measure
on the ball
:
We wish to bound the quantity
in terms of
and the associated empirical measure. For this purpose, consider independent variables
drawn according to
, and
drawn according to
, independent of each other; then define
Since
and
are distributed according to
and
respectively, we have, by definition of Wasserstein distance,
But
satisfies a
inequality for some
, hence a fortiori a
inequality, so
for some
(any
would do). If
is large enough (say,
), then the function
is nonincreasing for
, and then
We conclude that
|
(2.1)
|
On the other hand, the empirical measures
satisfy
where
. Then, for any
, we can introduce parameters
and
, and use Chebyshev's exponential inequality and the independence of the variables
to obtain
| |
| |
| |
|
(2.2)
|
In the case when
, for any
, there exists some constant
such that
for all
and
, whence
As a consequence,
|
(2.3)
|
From (2.1), (2.3) and the triangle inequality for
,
| |
| |
| |
|
(2.4)
|
This estimate was established for any given
,
,
,
and
, where
is a constant depending only on
and
.
In the case when
, we let
, and starting from inequality 2.2 again, we choose
and then
: by definition of
and
,
| |
| |
| |
| |
for
large enough, from which
|
(2.5)
|
To sum up, in the case
equation 2.4 writes
|
(2.6)
|
So, apart from some error terms, for all
we have reduced the initial problem to establishing the result only for the probability law
, whose support lies in the compact set
.
We conclude this truncation procedure by proving that
satisfies some modified
inequality. Let indeed
be a probability measure on
, absolutely continuous with respect to
(and hence with respect to
); then, when
is larger than some constant depending only on
, we can write
| |
| |
|
(2.7)
|
But
satisfies a
inequality, so
by the triangle inequality. Combining this with (2.7), we obtain
From this, inequality 2.1 and the elementary inequality
|
(2.8)
|
we deduce that for any
there exists some constant
such that
|
(2.9)
|
Step 2: Covering by small balls. In this second step we derive quantitative estimates on
. Let
be a bounded continuous function on
, and let
be a Borel set in
(equipped with the weak topology of convergence against bounded continuous test functions). By Chebyshev's exponential inequality and the independence of the variables
,
| |
| |
| |
| |
As
is arbitrary, we can pass to the supremum and find
Now we note that the quantity
is linear in
and convex lower semi-continuous (with respect to the topology of uniform convergence) in
; if we further assume that
is convex and compact, then (for instance) Sion's min-max theorem [26, Theorem 4.2'] ensures that
By the dual formulation of the
functional [12, Lemma 6.2.13], we conclude that
|
(2.10)
|
Now, let
and let
be a measurable subset of
. We cover the latter with
balls
with radius
in
metric. Each of these balls is convex and compact, and it is included in the
-thickening of
in
metric, defined as
So, by 2.10 we get
| |
| |
| |
|
(2.11)
|
We now apply this estimate with
From 2.9 we have, for any
,
where
Combining this with 2.11 , we conclude that
|
(2.12)
|
Now, given any
, it follows from 2.8 that there exist
,
and
, depending on
, such that
|
(2.13)
|
where
and
.
Though this inequality holds independently of
, we shall use it only in the case when
. In the case
, on the other hand, we note that for any
,
|
(2.14)
|
where
. Finally, we bound
by means of Theorem A.1 in Appendix A : there exists some constant
(only depending on
) such that for all
and
the set
can be covered by
balls of radius
in
metric, where
stands for
. In particular, given
, we can choose
|
(2.15)
|
balls of radius
, for some constant
depending on
and
(via
) but neither on
nor on
. (The purpose of the 1 in
is to make sure that the estimate is also valid when
.) Combining 2.4 , 2.12 , 2.13 and 2.15 , we find that, given
,
and
, there exist some constants
,
,
and
such that for all
and
,
|
(2.16)
|
for some constant
. In the case when
, we obtain similarly
|
(2.17)
|
for any
and
.
These estimates are not really appealing (!), but they are rather precise and general. In the rest of the section we shall show that an adequate choice of
leads to a simplified expression.
Step 3: Choice of the parameters.
We first consider the case when
. Let
,
and
. We claim that
as soon as
|
(2.18)
|
for some constants
and
depending on
only through
and
.
Indeed, on one hand
for some constant
, on the other hand
for
large enough, and then
for
and
large enough; this is enough to bound the first term in the right-hand side of 2.16 if moreover
is large enough.
Moreover, letting
, we can choose
in such a way that
, so that
which in the end can be bounded by
if
and
are large enough. With this one can get a bound on the right-hand side of 2.16 .
Now let us check that conditions 2.18 can indeed be fulfilled. Clearly, the first condition holds true for all
and
, where
and
are positive constants.
Then, we can choose
so that the second condition holds as an equality. This choice is admissible as soon as
and this, in turn, holds true as soon as
|
(2.19)
|
where
is such that
, and
is large enough.
If
, then we can choose
, i.e.
, and then the second inequality in 2.18 will be true as soon as
is large enough.
To sum up: Given
,
and
, there exists some constant
, depending on
and depending on
only through
and
, such that for all
,
as soon as
. Then we note that, given
, the inequality
holds if condition 2.19 is satisfied for some
large enough. To conclude the proof of Theorem 1.1 in the case when
, it is sufficient to choose
,
.
Now, in the case when
, given
and
, conditions 2.18 imply
Then we let
and
, so that
Then
for
, the above quantity is bounded by
as soon as 2.19 is enforced with
large enough. This concludes the argument.
2.2 Proof of Theorem 1.5
It is very similar to the proof of Theorem 1.1 , so we shall only explain where the differences lie. Obviously, the main difficulty will consist in the control of tails.
We first let
,
and
, and introduce
Then 2.1 may be replaced by
|
(2.20)
|
and 2.2 by
|
(2.21)
|
for some constant
depending on
and
.
Let us establish for instance 2.21 . Introduce
By Chebyshev's inequality,
provided that
. But, since the random variables
are independent and identically distributed, with zero mean, there exists some constant
depending on
such that
where
. This inequality is a consequence of Rosenthal's inequality in the case when
, but also holds true if
(see for instance [24, pp. 62 and 82]).
Then, on one hand,
while on the other hand,
with
standing for various constants. Collecting these two estimates, we conclude that 2.21 holds for
large enough.
Then 2.20 and 2.21 together ensure that
|
(2.22)
|
for any
,
and
large enough.
Since
is supported in
, the Csiszár-Kullback-Pinsker inequality and Kantorovich-Rubinstein formulation of the
distance together ensure that it satisfies a
inequality (see e.g. [8, Particular Case 5] with
). This estimate also extends to any
distance, not as a penalized
inequality as in 2.9 , but rather as
|
(2.23)
|
(see again [8, Particular Case 5]).
From 2.22 and 2.23 we deduce (as in 2.17 ) that
|
(2.24)
|
for any
, where now
Letting
and
, and choosing
, we deduce
for
large enough, and then
|
(2.25)
|
for
, provided that the conditions
|
(2.26)
|
hold for some
and
.
Given any choice of
as a product of powers of
and
, the first term in the right-hand side of 2.25 will always be smaller than the second one, if
goes to infinity while
is kept fixed; thus we can choose
minimizing the second term under the above conditions. Then the second condition in 2.26 will be fulfilled as an equality:
As for the first condition in 2.26 , it can be rewritten as
and then, by 2.25 ,
Hence
|
(2.27)
|
for all
and
larger than some constant and, given
, for all
and
where
is large enough.
In the first case when
, any admissible
belongs to
, so
. If
, we get from 2.27 , with
, that
for all
and
In the second case when
, we only consider admissible
's in
, so that
. Choosing
, we get from 2.27
under the same conditions on
as before. This concludes the argument.
2.3 Proof of Theorem 1.6
It is again based on the same principles as the proofs of Theorems 1.1 and 1.5 , with the help of functional inequalities investigated in [8] and [11] . We skip the argument, which the reader can easily reconstruct by following the same lines as above.
2.4 Data reconstruction estimates
Finally, we show how the above concentration estimates imply data reconstruction estimates. This is a rather general estimate, which is treated here along the lines of [25, Section 5] and [29, Problem 10].
Proposition 2.1.
Let
be a probability measure on
, with density
with respect to Lebesgue measure. Let
be random points in
, and let
be a Lipschitz, nonnegative kernel with unit integral. Define the random measure
and the random function
by
Then,
|
(2.28)
|
where
stands for the modulus of continuity of
, defined as
As a consequence, if
is Lipschitz, then there exist some constants
, only depending on
,
and
, such that
|
(2.29)
|
for all
.
Proof.
First,
| |
| |
Since
is supported in
, and
is a probability density, we deduce
|
(2.30)
|
Now, if
is some point in
, then, thanks to the Kantorovich-Rubinstein dual formulation 1.2 ,
| |
| |
| |
To conclude the proof of 2.28 , it suffices to combine this bound with 2.30 .
Now, let
, and
. The bound 2.28 turns into
In particular,
which is estimate 2.29 . □
Remark 2.2.
Estimate 2.29, combined with Theorem 1.1 or Theorem 1.5, yields simple quantitative (non-asymptotic) deviation inequalities for empirical distribution functions in supremum norm. We refer to Gao [15] for a recent study of deviation inequalities for empirical distribution functions, both in moderate and large deviations regimes.
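As an illustration of the reconstruction procedure in Proposition 2.1, here is a minimal one-dimensional sketch. Everything specific in it is an assumption made for the example and is not taken from the paper: the triangular kernel `zeta`, the target density `target_density`, the scale `alpha = 0.1`, and the use of deterministic quantiles as a stand-in for random points. It only shows how the smoothed empirical density is formed and how its supremum-norm error on a grid can be measured.

```python
import math

def zeta(u):
    """Triangular kernel: Lipschitz, nonnegative, unit integral, supported in [-1, 1]."""
    return max(0.0, 1.0 - abs(u))

def smoothed_density(points, alpha, x):
    """Value at x of zeta_alpha * mu_hat, where zeta_alpha(u) = zeta(u / alpha) / alpha."""
    return sum(zeta((x - p) / alpha) for p in points) / (alpha * len(points))

def target_density(x):
    """Illustrative Lipschitz density f(x) = max(0, 1 - |x|) on the real line."""
    return max(0.0, 1.0 - abs(x))

def target_quantile(u):
    """Inverse of the distribution function of target_density."""
    return math.sqrt(2.0 * u) - 1.0 if u < 0.5 else 1.0 - math.sqrt(2.0 * (1.0 - u))

def sup_error(points, alpha, grid):
    """Maximum over the grid of |smoothed density - target density|."""
    return max(abs(smoothed_density(points, alpha, x) - target_density(x)) for x in grid)

# Deterministic stand-in for a sample: N quantiles of the target law.
N = 2000
points = [target_quantile((i + 0.5) / N) for i in range(N)]
grid = [-1.5 + 0.03 * k for k in range(101)]
error = sup_error(points, alpha=0.1, grid=grid)
```

The error has two sources, as in the proof: a smoothing bias controlled by the modulus of continuity of the density at scale `alpha`, and a sampling term controlled by the Lipschitz constant of the rescaled kernel times the transportation distance between the empirical and the true measure.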
3 PDE estimates
Now we start the study of our model system for interacting particles. The first step towards our proof of Theorem 1.7 consists in deriving suitable a priori estimates on the solution to the nonlinear limit partial differential equation 1.8 . In this section, we recall some estimates which have already been established by various authors, and derive some new ones. All estimates will be effective.
3.1 Notation
In the sequel,
is a probability measure, taken as an initial datum for equation 1.8 , and various regularity assumptions will later be made on
. Assumptions 1.9 will always be made on
and
, even if they are not recalled explicitly; we shall only mention additional regularity assumptions, when used in our estimates. Moreover, we shall write
|
(3.1)
|
The notation
will always stand for the solution (unique under our assumptions) of 1.8 .
We also write
for the (kinetic) energy associated with
, and
for the square exponential moment of order
.
The scalar product between two vectors
will be denoted by
. The symbols
and
will often be used to denote various positive constants; in general what will matter is an upper bound on constants denoted
, and a lower bound on constants denoted
.
The space
is the space of
times differentiable continuous functions.
3.2 Decay at infinity
In this subsection, we prove the propagation of strong decay estimates at infinity:
Proposition 3.1.
With the conventions of Subsection 3.1 , let
be
if
, and an arbitrary negative number otherwise. Let
Then (i)
; (ii) For any
there is a continuous positive function
such that
and
|
(3.2)
|
(iii) Moreover, in the “uniformly convex case” when
and
, then there is
such that
Corollary 3.2.
If
admits a finite square exponential moment, then
satisfies
, for some function
, bounded below on any interval
(
).
Proof.
We start with (i). For simplicity we shall pretend that
is a smoothly differentiable function of
, with rapid decay, so that all computations based on integrating equation 1.8 against
are justified. These assumptions are not a priori satisfied, but the resulting bounds can easily be rigorously justified with standard but tedious approximation arguments.
With that in mind, we compute
with
Since
is an odd function, we have
| |
| |
| |
| |
If
, then
| |
| |
and if
, then for any
| |
| |
This leads to
and the conclusion follows easily by Gronwall's lemma.
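For the reader's convenience, we recall the differential form of Gronwall's lemma in the shape in which it is typically applied to moment bounds of this kind. The function u and the constants a, b below are generic placeholders and are not the specific quantities of the paper.

```latex
\frac{d}{dt}\,u(t) \le a\,u(t) + b \quad (t \ge 0,\ a \neq 0)
\qquad\Longrightarrow\qquad
u(t) \le e^{at}\,u(0) + \frac{b}{a}\,\bigl(e^{at} - 1\bigr).
```

In particular, when a is negative the right-hand side stays bounded uniformly in time, while for positive a it is bounded on each finite time interval.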
We now turn to (ii). Let
be some arbitrary nonnegative
function on
. By using the equation 1.8 , we compute
Since
for all
, we can write
|
(3.3)
|
for any
and
.
Next, our assumptions on
imply
, and
, so
for all
, with
defined by 3.1 . Hence, by Taylor's formula,
| |
| |
| |
|
(3.4)
|
where
is any positive number.
From 3.3 and 3.4 we obtain
|
(3.5)
|
where
and
is a finite constant, while
.
We now choose
in such a way that
, i.e.
This integrates to
Obviously
is a continuous positive function, and our estimates imply
We conclude by using Gronwall's lemma that
Next, the estimate (iii) for
is an easy consequence of our explicit estimates when
(in the case when
and
, we choose
).
As for the estimate about
, it will result from a slightly more precise computation.
From 3.5 , we have
|
(3.6)
|
where
is bounded on
by some constant
, and
Since
, for any fixed
in
we can choose
such that
.
Letting
and
, equation 3.6 becomes
|
(3.7)
|
Let
. The formula
| |
| |
leads to
by decomposing the integral on the sets
and
. From 3.7 we deduce
where
and
are positive constants. It follows that
remains bounded on
if
, and this concludes the argument. □
3.3 Time-regularity
Now we study the time-regularity of
.
Proposition 3.3.
With the conventions of Subsection 3.1 , for any
there exists a constant
such that
|
(3.8)
|
Remark 3.4.
The exponent
is natural in small time if no regularity assumption is made on
; it can be improved if
are assumed to be bounded below by some
. Also, in view of the results of convergence to equilibrium recalled later on, the constant
might be chosen independent of
if
.
Remark 3.5.
A stochastic proof of 3.8 is possible, via the study of continuity estimates for
, which in any case will be useful later on. But here we prefer to present an analytical proof, to stress the fact that estimates in this section are purely analytical statements.
Proof.
Let
be the linear operator
, and let
be the associated semigroup: from our assumptions and estimates it follows that it is well-defined, at least for initial data which admit a finite square exponential moment. Of course
. It follows that
| |
| |
Our goal is to bound this by
. In view of Proposition 3.1 , it is sufficient to prove that for all
,
This estimate is rather easy, since the left-hand side is just the variance of the solution of a linear diffusion equation, starting with a Dirac mass at
as initial datum. Without loss of generality, we assume
, and write
. For simplicity we write the computations in a sketchy way, but they are not hard to justify.
Since the initial datum is
, its square exponential moment
of order
is
.
With an argument similar to the proof of Proposition 3.1 (ii), one can show that
Now, since
,
,
grows at most polynomially, and
admits a square exponential moment of order
, we easily obtain
From these estimates we deduce that the time-derivative of the variance
is bounded by
for any
. Since
has zero variance, it follows that the variance of
is
, which was our goal. □
3.4 Regularity in phase space
Regularity estimates will be useful for Theorem 1.11 . Equation 1.8 is a (weakly nonlinear) parabolic equation, for which regularization effects can be studied by standard tools. Some limits to the strength of the regularization are imposed by the regularity of
. So as not to be bothered by these nonessential considerations, we shall assume strong regularity conditions on
here. Then in Appendix B we shall prove the following estimates:
Proposition 3.6.
With the conventions of Subsection 3.1 , assume in addition that
has all its derivatives growing at most polynomially at infinity.
Then, for each
and for all
,
there is a finite constant
, only depending on
and a square exponential moment of the initial measure
, such that the density
of
is of class
, with
If moreover
,
, then
can be chosen to be independent of
for any fixed
.
Remark 3.7.
For regular initial data and under some adequate assumptions on
and
, some regularity estimates on
, where
is the limit density in large time, are established in [10, Lemma 6.7]. These estimates allow a much more precise uniform decay, but are limited to just one derivative. We shall not need them here.
3.5 Asymptotic behavior
In the “uniformly convex” case when
, the measure
converges to a definite limit
as
. This was investigated in [19, 9, 10]. The following statement is a simple variant of [9, Theorems 2.1 and 5.1].
Proposition 3.8.
With the conventions of Subsection 3.1 , assuming that
, there exists a probability measure
such that
Here the constants
and
only depend on the initial datum
.
4 The limit empirical measure
Consider the random time-dependent measure
|
(4.1)
|
where
,
, are
independent processes solving the same stochastic differential equation
and such that the law of
is
. As we already mentioned, for each
and
,
is distributed according to the law
. We call
the “limit empirical measure” because it is expected to be a rather accurate description, in some well-chosen sense, of the empirical measure
as
.
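To make the notion of distance between empirical measures concrete, here is a minimal sketch restricted to the real line, which is an assumption of the example (the paper works in arbitrary dimension). It relies on the standard fact that, in dimension one, the optimal coupling between two empirical measures with the same number of atoms matches the sorted samples; the function name is hypothetical.

```python
def wasserstein1_empirical(xs, ys):
    """W_1 distance between two empirical measures on the real line.

    Both measures put mass 1/N on each of their N atoms; in dimension one
    the monotone (sorted) matching is optimal for the cost |x - y|.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    n = len(xs)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / n
```

For instance, `wasserstein1_empirical([0.0, 0.0], [1.0, 1.0])` returns `1.0`, and the value does not depend on the order in which the atoms are listed.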
Our estimates on
, and the fact that
is the empirical measure for independent processes, are sufficient to imply good properties of concentration of
around its mean
, as
, for each
. But later on we shall use some estimates about the time-dependent measure (even to obtain a result of concentration for
with fixed
). To get such results, we shall study the time-regularity of
. Our final goal in this section is the following
Proposition 4.1.
With the conventions of Subsection 3.1 , for any
there are constants
and
such that the limit empirical measure 4.1 satisfies
To prove Proposition 4.1, we shall use some classical tools from stochastic calculus.
4.1 SDE estimates
In this subsection we establish the following estimates of time regularity for the stochastic process
: For all
, there exist positive constants
and
such that, for all
,
| |
| |
| |
Proof.
We start with (i). We use Itô's formula to write a stochastic equation on the process
:
where
, viewed as a process depending on
, is a martingale with zero expectation.
Hence
|
(4.2)
|
On one hand
|
(4.3)
|
On the other hand, by Proposition 3.1 ,
has a finite square exponential moment, uniformly bounded for
. More precisely, there exist
and
such that
for all
. Since by assumption
and
, we deduce
In view of 4.2 , it follows that there exists a constant
such that
This concludes the proof of (i).
To establish (ii), we perform a very similar computation. For given
, let
. Another application of Itô's formula yields
On one hand, from (i),
On the other hand
|
(4.4)
|
by Hölder's inequality. But again, since the measures
admit a bounded square exponential moment,
is bounded on
. We conclude that
|
(4.5)
|
Then, with
standing again for various constants which are independent of
and
,
so, from 4.5 ,
and by 4.5 again we successively obtain
and finally
This concludes the proof of (ii).
We finally turn to the proof of (iii). Without real loss of generality, we set
. We shall proceed as in the proof of Proposition 3.1 , and prove the existence of some constant
and some continuous positive function
on
such that
|
(4.6)
|
Let
be a smooth function, and
By Itô's formula,
where
For each
,
, viewed as a stochastic process in
, is a martingale.
By Young's inequality, for any
,
So, by letting
and
we obtain
We choose
in such a way that the function
is identically zero, that is
where
is to be fixed later. Then
from which it is clear that
|
(4.7)
|
By Cauchy-Schwarz and Doob's inequalities,
|
(4.8)
|
Also, by Itô's formula and the Cauchy-Schwarz inequality again,
|
(4.9)
|
In view of (ii), there exists a constant
such that
|
(4.10)
|
Furthermore,
|
(4.11)
|
Recall from Proposition 3.1 that there exist constants
and
such that
If we choose
, the decreasing property of
will ensure that
for all
, and
Then, from 4.11 ,
Now, from 4.9 and 4.10 we deduce
Combining this with 4.8 , we conclude that
In the same way, we can prove that
is bounded for
by bounding
and
. This concludes the proof of 4.6 , and therefore of (iii) above. □
4.2 Time-regularity of the limit empirical measure
We are now ready to prove Proposition 4.1 .
On one hand
so
|
(4.12)
|
where
By Chebyshev's exponential inequality and the independence of the
,
But, for any given
and
,
Let
, so that
. Then, from estimate (iii) in Subsection 4.1 ,
uniformly in
and
. Hence, for any
,
Consequently,
| |
| |
| |
The proof of Proposition 4.1 follows by 4.12 .
5 Coupling
We now (as is classical) reduce the proof of convergence for
to a proof of convergence for the empirical measure
constructed on the auxiliary independent system
. The final goal of this section is the following estimate.
Proposition 5.1.
With the conventions of Subsection 3.1 ,
where
is defined by 3.1 , and
.
Remark 5.2.
Not only does the “second option” in the proof lead to better bounds, it also provides an estimate of the distance between
and
in the
distance, which is stronger than the
distance. However, we do not take any advantage of this refinement.
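For readers who wish to experiment with the particle system numerically, here is a minimal Euler-Maruyama sketch in dimension one. It assumes the usual mean-field form of the dynamics, in which each particle feels a Brownian noise of intensity sqrt(2), the drift of a confinement potential V, and the averaged drift of an interaction potential W. The quadratic potentials, the Gaussian initial law, the step size and all function names are illustrative assumptions, not the paper's choices, and the time discretization error is not covered by the estimates of this paper.

```python
import math
import random

def grad_V(x):
    """Gradient of the illustrative confinement potential V(x) = x**2 / 2."""
    return x

def grad_W(z):
    """Gradient of the illustrative interaction potential W(z) = z**2 / 2."""
    return z

def simulate_particles(n, t_final, dt, seed=0):
    """Euler-Maruyama scheme for n interacting particles on the real line."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # assumed initial law: N(0, 1)
    steps = int(round(t_final / dt))
    for _ in range(steps):
        drifts = []
        for xi in xs:
            # Averaged interaction drift exerted on particle i by all particles.
            interaction = sum(grad_W(xi - xj) for xj in xs) / n
            drifts.append(-grad_V(xi) - interaction)
        # Brownian increment of variance 2 * dt for each particle.
        xs = [xi + d * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
              for xi, d in zip(xs, drifts)]
    return xs
```

The returned list of positions defines the empirical measure of the discretized system at time `t_final`; the cost per time step is quadratic in `n` because every pair of particles interacts.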
6 Conclusion
In this section, we paste together all the estimates established in the previous sections, so as to prove Theorems 1.7 to 1.11 .
6.1 Concentration estimates
We start with the proof of Theorem 1.7 . By
we shall denote various constants depending on
, on our assumptions on
and
, and also on
, for some
.
From Proposition 5.1 ,
In particular, there is a constant
such that
|
(6.1)
|
From Corollary 3.2 and Theorem 1.1 we know that
for all
,
(
). The issue now is to “exchange”
and
in this estimate. As we shall see, this is authorized by the continuity estimates on
and
.
Let
(to be fixed later on), and let
be the integer part of
. We decompose the interval
as
Proposition 3.3 guarantees that, if
for some
small enough, then
|
(6.2)
|
Then, by the triangle inequality and 6.2,
which can be bounded by
By Corollary 3.2 and Theorem 1.1 , there exist some constants
and
such that
for all
, and
. Hence
|
(6.3)
|
On the other hand, from Proposition 4.1 we deduce
for all
and
, so
|
(6.4)
|
We can assume that
, and
; then we can bound the right-hand side of 6.4 by
|
(6.5)
|
From 6.3 and 6.5 we deduce that, for
small enough (depending on
!),
|
(6.6)
|
for
. So we deduce from 6.6 that
where again
stand for various positive constants, and
. This concludes the proof of Theorem 1.7 .
6.2 Uniform in time estimates
Now, we shall focus on the case when
is positive, and derive Theorem 1.9 by a slightly refined estimate.
Let us start again from the bound
where
is positive. Let
(to be fixed later on), and
be the integer part of
. If
is larger than
, then
Indeed,
. As a consequence,
Since, for
,
we conclude that there exists a constant
such that
|
(6.7)
|
We already know that the first term in the right-hand side in 6.7 is bounded by
for some constant
, and so we focus on the other terms.
In the proof of Theorem 1.7, we have established that there are constants
and
, depending on
and on bounds on square exponential moments for
, such that
|
(6.8)
|
Proposition 3.1 guarantees that these square exponential bounds also hold true for
, uniformly in
. Thus we can apply 6.8 with
taken as initial datum, and get
|
(6.9)
|
as soon as
.
We now use 6.9 to bound the sum appearing in the right-hand side of 6.7 . Choose
large enough that
Applying 6.9 with
replaced by
, we can bound the sum in the right-hand side of 6.7 by
for
, where
,
and
are again positive constants. Since again
is larger than 1, there is a constant
such that
, so the sum above is bounded by
If
is large enough, our assumption
implies that
is always less than
, so that the above sum can be bounded by just
. This concludes the proof of the first point of Theorem 1.9 .
The second point is proved by writing
| |
| |
successively by the triangle inequality for the Wasserstein distance and by Proposition 3.8. Then the result follows from the uniform estimate obtained above.
6.3 Data reconstruction
We finally consider Theorem 1.11 . Proposition 3.6 ensures that, as
,
is uniformly bounded in
, where
is arbitrarily large. Since
converges to
as
, we deduce that
is Lipschitz. Then Theorem 1.9 and Proposition 2.1 together imply Theorem 1.11 .
A Metric entropy of a probability space
We now prove the covering result used in Section 2.1 , as a particular case of a more general estimate. Let
be a Polish space; we look for an upper bound on the number
of balls of radius
in Wasserstein distance
needed to cover the space
of probability measures on
. We use the same strategy as in [12, Exercise 6.2.19], where the Lévy distance is used instead of the Wasserstein distance.
Theorem A.1.
Let
be a Polish space with finite diameter
. For any
, define
as the minimal number of balls needed to cover
by balls of radius
. Then there exists a numerical constant
such that for all
and
, the space
can be covered by
balls of radius
in
distance, with
|
(A.1)
|
Remark A.2.
The
distance between any two probability measures on
is at most
, so, for all
, we have the trivial estimate
.
Proof.
Let
, and let
be such that
is covered by the balls
with centers
and radius
. For simplicity we shall write
.
In a first step we prove that for any
there exist nonnegative real numbers
, with
, such that
For this we first replace the balls
's by the sets
's defined by
so that
is partitioned into the
's. Next define
It is easy to check that the required properties are fulfilled. Indeed, we may transport
onto
by sending all
's in
onto
, for each
: the cost of this transport is bounded by
. In the second step we introduce an integer
(whose value will be made more precise later on), and consider the set
where
is the set of all
-tuples
, such that each
is of the form
,
, and
.
Given a probability measure
(where
does not necessarily belong to
), there exists
in
such that
|
(A.2)
|
To prove A.2 , we define
as the integer part
of
and
as the first integer such that
Since
, it is clear that
. Then we define a measure
by
, where
Let us bound the distance between
and
. For that we gradually define a transport plan between
and
in the following way: first of all, at each point
, the mass
stays in place. Then, the remaining masses
are redistributed as follows: all the remaining mass at
is brought to
, together with possibly a bit of mass at
, until a total mass
has been added at location
(for
large enough). If
, then we again bring mass from
, until another mass
has been added at
. We carry on until all the mass at
has been used, thus building a transport plan
which sends
onto
, in such a way that
for all
. Hence,
and this plan yields an upper bound on the Wasserstein distance:
To summarize the first two steps: for any
in
there exists
such that
In other words, the family
covers
.
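The discretization of the weights in the second step can be sketched as follows. This is a simplified variant, not the exact scheme of the proof: the leftover mass is simply handed, in units of 1/K, to the first atoms. The names `weights` (the masses of the atoms) and `K` (the discretization parameter) are hypothetical, and exact rational arithmetic is used so that the total mass is preserved without rounding error.

```python
from fractions import Fraction
from math import floor

def round_weights(weights, K):
    """Round probability weights to multiples of 1/K while keeping total mass 1.

    `weights` must be exact Fractions summing to 1.
    """
    units = [floor(p * K) for p in weights]   # integer parts of K * p_j
    missing = K - sum(units)                  # leftover units of size 1/K (fewer than len(weights))
    for j in range(missing):                  # hand one unit to each of the first atoms
        units[j] += 1
    return [Fraction(k, K) for k in units]

def moved_mass(weights, rounded):
    """Total mass that has to be displaced to turn `weights` into `rounded`."""
    return sum(max(p - q, Fraction(0)) for p, q in zip(weights, rounded))
```

With m atoms, each weight changes by at most 1/K, so the displaced mass is at most m/K; multiplying by the diameter of the space bounds the corresponding transport cost for the distance cost, which is the mechanism behind the choice of the discretization parameter in the third step.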
In the third step we choose some suitable
and
for a given
.
We first choose
in such a way that
and
have the same order of magnitude, for instance
Then
and the balls
have radius at most
if
Now
and
are fixed,
, and we just have to estimate the cardinality
of
. For this we first note that
Without loss of generality, we have assumed
, so
. Then
, and hence
Since
and
, we can write
and we deduce
with
.
Consequently, we have covered
by the
balls
with radius
. This concludes the argument.
□
In the particular case when
is the Euclidean ball
of radius
in
, we have
|
(A.3)
|
for some constant
. To see this, one may for instance consider the balls with center in the lattice
in
. Then Theorem A.1 yields the bound
which is used in the present paper.
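The lattice argument behind A.3 can be checked numerically in small dimension. The sketch below is only an illustration under our own choice of lattice spacing (cube side 2r/sqrt(d), so that every point lies within distance r of a lattice point); the function name and this particular spacing are assumptions, not the paper's construction. The enumeration is exponential in d and is meant for small dimensions only.

```python
import itertools
import math

def lattice_cover_centers(R, r, d):
    """Centers of balls of radius r, on a cubic lattice, that cover the ball B_R in R^d."""
    s = 2.0 * r / math.sqrt(d)             # cube side whose half-diagonal equals r
    kmax = int(math.floor((R + r) / s)) + 1
    limit = (R + r) * (1.0 + 1e-12)        # small tolerance against rounding
    centers = []
    for idx in itertools.product(range(-kmax, kmax + 1), repeat=d):
        c = tuple(s * k for k in idx)
        # Keep only lattice points that can be the nearest one to some point of B_R.
        if math.sqrt(sum(t * t for t in c)) <= limit:
            centers.append(c)
    return centers
```

The number of centers returned grows like a constant times (R/r)^d when R/r is large, which is the order of magnitude used in A.3.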
B Regularity estimates on the limit PDE
In this appendix we study solutions to the limit equation
|
(B.1)
|
and establish the regularity results stated in Proposition 3.6 . Following the method in [13] , we shall measure the regularity in terms of
-Sobolev spaces
Our main result is as follows.
Theorem B.1.
Let
and
such that all their partial derivatives
and
are continuous and grow at most polynomially at infinity, for any multi-index
with
. Let
and let
be a probability density such that
Then, there exists a continuous function
, only depending on
,
,
,
,
and
, such that any classical solution
to B.1 , starting from
, satisfies
Proof.
For the sake of simplicity we only give a formal proof, which can be made rigorous by means of regularization arguments.
Let then
be a solution of
we rewrite the equation as
where
if
is the
-th vector of the canonical basis of
, and
Let
be given. By integration by parts and Cauchy-Schwarz inequality,
By summing over
with
, we find
Given
, by Proposition 3.1 there exist constants
and
, depending only on
,
,
and
, such that
|
(B.2)
|
for all
. In particular, it follows from our assumptions on the derivatives of
and
that all
terms are bounded by some polynomial in
, uniformly in
.
Let
. For
, we introduce the weighted norms
and
Then for any
and
there exist
and
such that
|
(B.3)
|
We shall prove later on the following interpolation lemma:
Lemma B.2.
Given
,
and
, there exist nonnegative constants
and
, and
such that for all
,
Then, again from B.2 , all
norms are bounded on
, so from B.3 and Lemma B.2 there exists some constants
, so from B.3 and Lemma B.2 there exist some constants such that
In other words
satisfies on
the differential inequality
|
(B.4)
|
for some constants
and
depending only on
,
,
,
and
.
Let us distinguish two cases. If
, then we only use the inequality
to make sure that
for any
.
If on the other hand
, we deduce from B.4 that
as long as
, so that
satisfies the inequality
which integrates to
As a consequence, as long as
, we have
In the end, we have obtained an a priori bound on
for
, depending only on
and
, but not on the initial value
. Then the proof can be concluded by an approximation argument. □
Proof of Lemma B.2 .
We proceed by induction on
.
In the first step we prove the result for
. Given
and
, we write
so, by Hölder's inequality,
(with
if
). Then by Sobolev embedding,
where
if
,
is arbitrary in
if
, and
if
, that is,
where
, any
for
, and
for
.
In the second step we let
and assume by induction that there exist some constants
and
such that for all
:
Let then
.
Given
with
and
, we split
into
with
, and integrate by parts:
whence
Since this holds for any
we obtain
Moreover
so that finally
Then, by induction hypothesis,
whence
where
and
. This concludes the argument.
□
Acknowledgments: The authors thank M. Ledoux for his relevant comments and his interest during the preparation of this work, as well as for providing Reference [16].
References
[1] Araujo, A., and Giné, E. The central limit theorem for real and Banach valued random variables. John Wiley & Sons, New York, 1980.
[2] Benachour, S., Roynette, B., Talay, D., and Vallois, P. Nonlinear self-stabilizing processes. I. Existence, invariant probability, propagation of chaos. Stochastic Process. Appl. 75, 2 (1998), 173–201.
[3] Benachour, S., Roynette, B., and Vallois, P. Nonlinear self-stabilizing processes. II. Convergence to invariant probability. Stochastic Process. Appl. 75, 2 (1998), 203–224.
[4] Benedetto, D., Caglioti, E., Carrillo, J. A., and Pulvirenti, M. A non-Maxwellian steady distribution for one-dimensional granular media. J. Statist. Phys. 91, 5-6 (1998), 979–990.
[5] Benedetto, D., Caglioti, E., and Pulvirenti, M. A kinetic equation for granular media. RAIRO Modél. Math. Anal. Numér. 31, 5 (1997), 615–641.
[6] Bobkov, S., Gentil, I., and Ledoux, M. Hypercontractivity of Hamilton-Jacobi equations. J. Math. Pures Appl. 80, 7 (2001), 669–696.
[7] Bobkov, S., and Götze, F. Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal. 163 (1999), 1–28.
[8] Bolley, F., and Villani, C. Weighted Csiszár-Kullback-Pinsker inequalities and applications to transportation inequalities. To appear in Ann. Fac. Sci. Toulouse. Available online via http://www.umpa.ens-lyon.fr/~cvillani/cv.html#publicationlist, 2004.
[9] Carrillo, J. A., McCann, R. J., and Villani, C. Kinetic equilibration rates for granular media and related equations: entropy dissipation and mass transportation estimates. Rev. Mat. Iberoamericana 19, 3 (2003), 971–1018.
[10] Carrillo, J. A., McCann, R. J., and Villani, C. Contractions in the 2-Wasserstein length space and thermalization of granular media. Preprint, 2004.
[11] Cattiaux, P., and Guillin, A. Talagrand's like quadratic transportation cost inequalities. Preprint, 2004. Available online via http://www.ceremade.dauphine.fr/~guillin/index3.html.
[12] Dembo, A., and Zeitouni, O. Large Deviations Techniques and Applications, second ed. Springer-Verlag, New York, 1998.
[13] Desvillettes, L., and Villani, C. On the spatially homogeneous Landau equation for hard potentials. I. Existence, uniqueness and smoothness. Comm. Partial Differential Equations 25, 1-2 (2000), 179–259.
[14] Djellout, H., Guillin, A., and Wu, L. Transportation cost-information inequalities and applications to random dynamical systems and diffusions. Ann. Probab. 32, 3B (2004), 2702–2732.
[15] Gao, F. Moderate deviations and large deviations for kernel density estimators. J. Theor. Prob. 16 (2003), 401–418.
[16] Giné, E., and Zinn, J. Empirical processes indexed by Lipschitz functions. Ann. Probab. 14, 4 (1986), 1329–1338.
[17] Ledoux, M. The concentration of measure phenomenon, vol. 89 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, 2001.
[18] Ledoux, M., and Talagrand, M. Probability in Banach spaces. Springer-Verlag, Berlin, 1991.
[19] Malrieu, F. Logarithmic Sobolev inequalities for some nonlinear PDE's. Stochastic Process. Appl. 95, 1 (2001), 109–132.
[20] Malrieu, F. Convergence to equilibrium for granular media equations and their Euler schemes. Ann. Appl. Probab. 13, 2 (2003), 540–560.
[21] Marchioro, C., and Pulvirenti, M. Mathematical theory of incompressible nonviscous fluids. Springer-Verlag, New York, 1994.
[22] Massart, P. Saint-Flour Lecture Notes. Available at http://www.math.u-psud.fr/~massart, 2003.
[23] Otto, F., and Villani, C. Generalization of an inequality by Talagrand, and links with the logarithmic Sobolev inequality. J. Funct. Anal. 173 (2000), 361–400.
[24] Petrov, V. V. Limit theorems of probability theory. The Clarendon Press, Oxford University Press, New York, 1995.
[25] Schochet, S. The point-vortex method for periodic weak solutions of the 2-D Euler equations. Comm. Pure Appl. Math. 49, 9 (1996), 911–965.
[26] Sion, M. On general minimax theorems. Pac. J. Math. 8 (1958), 171–176.
[27] Sznitman, A.-S. Topics in propagation of chaos. In École d'Été de Probabilités de Saint-Flour XIX—1989, vol. 1464 of Lecture Notes in Math. Springer, Berlin, 1991.
[28] Talagrand, M. Transportation cost for Gaussian and other product measures. Geom. Funct. Anal. 6 (1996), 587–600.
[29] Villani, C. Topics in optimal transportation, vol. 58 of Grad. Stud. Math. American Mathematical Society, Providence, 2003.
[30] Wang, F.-Y. Probability distance inequalities on Riemannian manifolds and path spaces. J. Funct. Anal. 206, 1 (2004), 167–190.