Notes on Free Probability Theory
Dimitri Shlyakhtenko.
Department of Mathematics, UCLA, Los Angeles, CA 90095, USA E-mail address: shlyakht@math.ucla.edu These notes are from a 4-lecture mini-course taught by the author at the conference on von Neumann algebras as part of the “Géométrie non commutative en mathématiques et physique” month at CIRM.
1 Free Independence and Free Harmonic Analysis.
Free probability theory was developed by Voiculescu as a way to deal with von Neumann algebras of free groups. In addition to the view of von Neumann algebras as “non-commutative measure spaces”, which was already presented in this conference, free probability theory considers von Neumann algebras as “non-commutative probability spaces”.
There are by now several standard references on free probability theory, of which we mention two: [VDN92, Voi00] .
1.1 Probability spaces.
Recall that a classical probability space is a measure space $(X, \mathcal{B}, \mu)$. Here $\mathcal{B}$ is a sigma-algebra of subsets of $X$, and $\mu$ is a measure, which is a probability measure, i.e., $\mu(X) = 1$. One thinks of $X$ as a set of events, and for $Y \in \mathcal{B}$, the measure $\mu(Y)$ is the probability of an event occurring in the set $Y$.
1.1.1 Random variables; laws.
An alternative point of view on probability theory involves considering random variables, i.e., measurable functions $f : X \to \mathbb{R}$. One can think of a random variable as a measurement, which assigns to each event $x \in X$ a value $f(x)$. Note that the probability of the value of $f$ lying in a set $A \subset \mathbb{R}$ is exactly $\mu(f^{-1}(A))$. Thus the law of $f$, $\mu_f$, defined to be the push-forward measure $\mu_f = \mu \circ f^{-1}$ on $\mathbb{R}$, measures the probabilities that $f$ assumes various values.
1.1.2 The expectation $E$.
Let us say that $f \in L^\infty(X, \mu)$ is an essentially bounded random variable. Then the integral $E(f) = \int f(x)\,d\mu(x)$ has the meaning of the expected value of $f$. For this reason, the linear functional $E : L^\infty(X, \mu) \to \mathbb{C}$ given by integration against $\mu$ is called an expectation. We note that $E$ satisfies:
$E(1) = 1$ (normalization),
$E(f) \geq 0$ if $f \geq 0$ (positivity).
Note that the knowledge of $(X, \mathcal{B}, \mu)$ is equivalent (up to an isomorphism and up to null sets) to the knowledge of $L^\infty(X, \mu)$ and $E : L^\infty(X, \mu) \to \mathbb{C}$. Thus the notion of a classical probability space can be phrased entirely in terms of commutative (von Neumann) algebras.
1.2 Non-commutative probability spaces.
We now play the usual game of dropping the word “commutative” in a definition:
Definition 1.1.
An algebraic non-commutative probability space is a pair $(A, \phi)$ consisting of a unital algebra $A$ and a linear functional $\phi : A \to \mathbb{C}$, so that $\phi(1) = 1$.
Thus we think of $a \in A$ as a “non-commutative random variable”, $\phi(a)$ as its “expected value” and so on. Of course, any classical probability space is also a non-commutative probability space. But there are many interesting genuinely non-commutative probability spaces. For example, if $\Gamma$ is a discrete group, we could set $A = \mathbb{C}\Gamma$ (the group algebra) and $\phi = \tau_\Gamma$ (the group trace). Here if $\gamma \in \Gamma \subset \mathbb{C}\Gamma$, then $\tau_\Gamma(\gamma) = 0$ if $\gamma \neq e$ and $\tau_\Gamma(e) = 1$. The same construction works with $\mathbb{C}\Gamma$ replaced by the reduced group $C^*$-algebra of $\Gamma$, or the von Neumann algebra of $\Gamma$.
1.2.1 Positivity.
Operator algebras give one a “test” of which algebraic non-commutative spaces “exist in nature”. These are precisely those non-commutative probability spaces that can be represented by (possibly unbounded) operators on a Hilbert space $H$, so that $\phi$ is a linear functional given by a vector-state, $\phi(T) = \langle \xi, T\xi \rangle$ for some $\xi \in H$. If $A$ is a $*$-algebra, it is not hard to characterize these (via the GNS construction) in terms of the properties of $\phi$: $\phi$ must be positive, i.e., $\phi(a^* a) \geq 0$ for all $a \in A$.
1.2.2 The law of a random variable.
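Recall that we assigned to a classical random variable $f$ its law $\mu_f$. If $A$ is an algebra of operators on a Hilbert space $H$, $\phi(T) = \langle \xi, T\xi \rangle$ and $a \in A$ is self-adjoint, then the spectral theorem gives us a measure $\nu$ on $\mathbb{R}$ valued in the set of projections on $H$, so that $a = \int t\,d\nu(t)$. If we let $\mu_a = \phi \circ \nu$, then $\mu_a$ is a measure on $\mathbb{R}$. It is not hard to check that if we are in the classical situation and $A = L^\infty(X, \mu)$, $H = L^2(X, \mu)$, $\xi = 1$, then this construction gives us precisely the law of $a$.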
1.2.3 Moments.
However, if $a$ is not self-adjoint, or if we are dealing with an $n$-tuple of random variables, there is no description of the law of $a$ in terms of a measure.
Fortunately, for $a = a^*$ the moments of $a$, i.e., the expected values $\phi(a^p)$, $p = 1, 2, \ldots$, are exactly the same as the moments of the law $\mu_a$ of $a$. Indeed, $\phi(a^p) = \int t^p\,d\mu_a(t)$ is exactly the $p$-th moment of $\mu_a$. For essentially bounded $a$, the moments of $\mu_a$ determine $\mu_a$.
Thus given a family $F = (a_1, \ldots, a_n)$ of variables $a_j \in A$, we say that an expression of the form $\phi(a_{i_1} \cdots a_{i_p})$ is the $(i_1, \ldots, i_p)$-th moment of the family $F$. The collection of all moments can be thought of as a linear functional $\mu_F$ defined on the algebra of polynomials in $n$ non-commuting indeterminates $t_1, \ldots, t_n$ by $\mu_F(P) = \phi(P(a_1, \ldots, a_n))$. This functional $\mu_F$ is called the joint law, or joint distribution, of the family $F$.
1.3 Classical independence.
Definition 1.2.
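Two random variables $f$ and $g$ in $L^\infty(X, \mu)$ are called independent, if $E(f^n g^m) = E(f^n) E(g^m)$ for all $n, m \geq 0$.
Equivalently, $E(FG) = 0$ whenever $E(F) = E(G) = 0$ and $F$ is in the algebra $W^*(f)$ generated by $f$, while $G \in W^*(g)$.
The equality $E(f^n g^m) = E(f^n) E(g^m)$ is a consequence of the statement that “the probability that the value of $f$ lies in a set $A$ and the value of $g$ lies in the set $B$ is the product of the probabilities that the value of $f$ lies in $A$ and the value of $g$ lies in $B$”, which is a more familiar way of phrasing independence.
If $X = X_1 \times X_2$ and $\mu = \mu_1 \times \mu_2$, then any functions $f, g$ so that $f$ depends only on the $X_1$ coordinate and $g$ only on the $X_2$ coordinate are independent. Note that another way of saying this is that the random variables $f \otimes 1$ and $1 \otimes g$ in $L^\infty(X_1, \mu_1) \otimes L^\infty(X_2, \mu_2)$ are independent, for any $f \in L^\infty(X_1, \mu_1)$ and $g \in L^\infty(X_2, \mu_2)$.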
Thus independence has to do with the operation of taking tensor products of probability spaces.
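The product-space description of independence can be checked exactly on a small example. The following is a minimal sketch, not part of the original notes: the two finite probability spaces and the functions `f` and `g` are hypothetical toy data, chosen only so that `f` depends on the first coordinate and `g` on the second.

```python
from fractions import Fraction
from itertools import product

# Two finite probability spaces (hypothetical toy data, exact rational weights).
mu1 = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
mu2 = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}

# Product measure on X1 x X2.
mu = {(x1, x2): p1 * p2
      for (x1, p1), (x2, p2) in product(mu1.items(), mu2.items())}

def expectation(h):
    """Integrate a function h(x1, x2) against the product measure."""
    return sum(p * h(x1, x2) for (x1, x2), p in mu.items())

# f depends only on the first coordinate, g only on the second.
f = lambda x1, x2: {"a": 2, "b": -1}[x1]
g = lambda x1, x2: x2 * x2 - 1

def factorizes(n, m):
    """Check E(f^n g^m) == E(f^n) E(g^m) exactly."""
    lhs = expectation(lambda x1, x2: f(x1, x2) ** n * g(x1, x2) ** m)
    rhs = (expectation(lambda x1, x2: f(x1, x2) ** n)
           * expectation(lambda x1, x2: g(x1, x2) ** m))
    return lhs == rhs

all_ok = all(factorizes(n, m) for n in range(4) for m in range(4))
print(all_ok)
```

Exact rational arithmetic is used so that the factorization is verified as an identity rather than up to rounding.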
1.4 Free products of non-commutative probability spaces.
There is “more room” in the non-commutative universe to accommodate a different way of combining two non-commutative probability spaces: free products. Just as the notion of a tensor product can be used to recover the notion of independence, free products led Voiculescu to discover the notion of free independence.
1.4.1 Free products of groups.
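We start with a motivating example. Let $\Gamma_1$ and $\Gamma_2$ be two discrete groups. View the group algebra of the free product $\mathbb{C}(\Gamma_1 * \Gamma_2)$ as a non-commutative probability space by letting $\tau$ be the group trace; for $\gamma \in \Gamma_1 * \Gamma_2$, $\tau(\gamma) = 0$ unless $\gamma = e$.
Let us understand the relative positions of $\mathbb{C}\Gamma_1$ and $\mathbb{C}\Gamma_2$ inside of the group algebra of the free product $\mathbb{C}(\Gamma_1 * \Gamma_2)$. Let $w \in \Gamma_1 * \Gamma_2$ be a word. Thus $w = g_1 g_2 \cdots g_n$ with $g_j \in \Gamma_1 \cup \Gamma_2$. We may carry out multiplications and cancellations until we reduce the word so that consecutive letters lie in different groups; i.e., $g_j \in \Gamma_{i(j)}$, $i(1) \neq i(2)$, $i(2) \neq i(3)$ and so on. The resulting word is non-trivial if all $g_j$ are non-trivial. Thus:
$\tau(g_1 \cdots g_n) = 0$ provided that $g_j \in \Gamma_{i(j)}$, $i(1) \neq i(2)$, $i(2) \neq i(3)$, $\ldots$, and $\tau(g_j) = 0$.
By linearity we get:
Proposition 1.3.
If $a \in \mathbb{C}(\Gamma_1 * \Gamma_2)$ has the form $a = a_1 \cdots a_n$ with $a_j \in \mathbb{C}\Gamma_{i(j)}$, $i(1) \neq i(2)$, $i(2) \neq i(3)$, $\ldots$, and $\tau(a_j) = 0$, then $\tau(a) = 0$.
We now note that this proposition allows one to compute $\tau$ on $\mathbb{C}(\Gamma_1 * \Gamma_2)$ in terms of its restriction to $\mathbb{C}\Gamma_1$ and $\mathbb{C}\Gamma_2$.
Indeed, an arbitrary element of $\mathbb{C}(\Gamma_1 * \Gamma_2)$ is a linear combination of $1$ and of terms of the form $a_1 \cdots a_n$ with $a_j \in \mathbb{C}\Gamma_{i(j)}$ and $i(1) \neq i(2)$, $i(2) \neq i(3)$, $\ldots$. But then the equation $\tau\big((a_1 - \tau(a_1)) \cdots (a_n - \tau(a_n))\big) = 0$ allows one to express $\tau(a_1 \cdots a_n)$ in terms of values of $\tau$ on shorter words. By induction, this allows one to express $\tau$ in terms of $\tau|_{\mathbb{C}\Gamma_1}$ and $\tau|_{\mathbb{C}\Gamma_2}$.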
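The reduction argument above can be tried out on a concrete free product. The sketch below is an illustration under stated assumptions, not the author's construction: the choice of the groups Z/2 and Z/3, the encoding of words as tuples of (group index, exponent), and all helper names are hypothetical. It checks that an alternating product of centered elements has trace zero, and that the resulting standard formula for the trace of a·b·a·b in terms of the traces of a, a², b, b² holds for one sample pair.

```python
from fractions import Fraction

ORDERS = (2, 3)  # hypothetical choice: Gamma_1 = Z/2, Gamma_2 = Z/3

def reduce_word(word):
    """Merge neighbouring letters from the same group and drop trivial letters."""
    out = []
    for g, k in word:
        k %= ORDERS[g]
        if k == 0:
            continue
        if out and out[-1][0] == g:
            k = (out.pop()[1] + k) % ORDERS[g]
            if k == 0:
                continue
        out.append((g, k))
    return tuple(out)

def mul(x, y):
    """Multiply two group-algebra elements given as {reduced word: coefficient}."""
    out = {}
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            w = reduce_word(w1 + w2)
            out[w] = out.get(w, 0) + c1 * c2
    return out

def tau(x):
    """Group trace: the coefficient of the empty word (the identity)."""
    return x.get((), 0)

def centered(x):
    """Return x - tau(x) * 1."""
    out = dict(x)
    out[()] = out.get((), 0) - tau(x)
    return out

def word_product(*factors):
    result = {(): Fraction(1)}
    for factor in factors:
        result = mul(result, factor)
    return result

s, t, t2 = ((0, 1),), ((1, 1),), ((1, 2),)
a = {(): Fraction(2), s: Fraction(3)}                    # element of C[Z/2]
b = {(): Fraction(1), t: Fraction(5), t2: Fraction(-2)}  # element of C[Z/3]

a0, b0 = centered(a), centered(b)
vanishing = tau(word_product(a0, b0, a0, b0))
lhs = tau(word_product(a, b, a, b))
rhs = (tau(mul(a, a)) * tau(b) ** 2 + tau(a) ** 2 * tau(mul(b, b))
       - tau(a) ** 2 * tau(b) ** 2)
print(vanishing, lhs == rhs)
```

The formula on the right-hand side is what one obtains by expanding the centered product and solving for the trace of a·b·a·b, exactly as in the induction described above.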
1.4.2 Free products of algebras.
Such an expression is universal and works in any free product of two algebras (not necessarily of group algebras). We thus say:
Definition 1.4.
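[Voi85]
Let $(A_1, \phi_1)$ and $(A_2, \phi_2)$ be two non-commutative probability spaces. We call the unique linear functional $\phi$ on $A_1 * A_2$ which satisfies
$\phi|_{A_j} = \phi_j$, $j = 1, 2$,
$\phi(a_1 \cdots a_n) = 0$ whenever $a_j \in A_{i(j)}$, $i(1) \neq i(2)$, $i(2) \neq i(3)$, $\ldots$, and $\phi_{i(j)}(a_j) = 0$,
the free product of $\phi_1$ and $\phi_2$. It is denoted $\phi_1 * \phi_2$.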
One can check that the free product of two positive linear functionals is positive (to do so it is the easiest to make sense of the product of the underlying GNS representations). Thus one can talk about (reduced) free products of
-algebras or von Neumann algebras by passing to the appropriate closure in the GNS representation associated to the free product functional.
1.4.3 Free independence.
By analogy with the relationship between classical independence and tensor products, Voiculescu gave the following definition:
Definition 1.5.
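[Voi85]
Let $F_1, F_2 \subset (A, \phi)$ be two families of non-commutative random variables. We say that $F_1$ and $F_2$ are freely independent, if $\phi(a_1 \cdots a_n) = 0$ whenever $a_j \in \mathrm{Alg}(1, F_{i(j)})$, $i(1) \neq i(2)$, $i(2) \neq i(3)$, $\ldots$, and $\phi(a_1) = \phi(a_2) = \cdots = \phi(a_n) = 0$.
Here $\mathrm{Alg}(S)$ denotes the algebra generated by a set $S$.
We should point out a certain similarity between this definition and the classical independence, where the requirement was that $E(FG) = 0$ if $E(F) = E(G) = 0$.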
1.5 Free Fock space.
We give an example of freely independent random variables that does not come from groups.
1.5.1 Free Fock space.
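Let $H$ be a Hilbert space, $\Omega$ be a vector, and let $\mathcal{F}(H) = \mathbb{C}\Omega \oplus \bigoplus_{n \geq 1} H^{\otimes n}$ be the Hilbert space direct sum of the tensor powers of $H$ (the one-dimensional space $\mathbb{C}\Omega$ is thought of as the zeroth tensor power of $H$). This space is called the free (or full) Fock space, by analogy with the symmetric and anti-symmetric Fock spaces (where the symmetric or anti-symmetric tensor product is used instead).
1.5.2 Free creation operators.
For $h \in H$ consider the left creation operator $\ell(h) : \mathcal{F}(H) \to \mathcal{F}(H)$ given by $\ell(h)\,\xi_1 \otimes \cdots \otimes \xi_n = h \otimes \xi_1 \otimes \cdots \otimes \xi_n$ (here $h \otimes \Omega = h$ by convention). Then $\ell(h)^*$ exists and is given by $\ell(h)^*\,\xi_1 \otimes \cdots \otimes \xi_n = \langle h, \xi_1 \rangle\,\xi_2 \otimes \cdots \otimes \xi_n$ and $\ell(h)^* \Omega = 0$. The operator $\ell(h)^*$ is also called the annihilation operator.
These operators satisfy $\ell(h)^* \ell(g) = \langle h, g \rangle 1$. In particular, the map $h \mapsto \ell(h)$ is a linear isometry between $H$ (with its Hilbert space norm) and the closed linear span of $\{\ell(h) : h \in H\}$, taken with the operator norm.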
1.5.3 Relation with non-crossing diagrams.
Let
be an orthonormal family. Let
.
Thus
.
The joint distribution of the family
(also known as the
-distribution of
) has a nice combinatorial description.
Suppose that we are interested in
where
and
,
(by
we mean
if
and
if
).
Mark
points on the
-axis in half-plane
at positions
, and color them by
colors, so that the
-th point is colored with the
-th color. Attach to the
-th point the line segment from
to
. Orient this segment upwards (towards infinity) if
and orient it downwards (toward the
-axis) if
. Color the segment the same way as the
-th point, from which it is drawn.
Then there exists at most one way of drawing a diagram so that:
-
The upper end of every segment is connected to the upper end of exactly one other segment, and all segments connected together have the same color;
-
Orient each line connecting two segments counter-clockwise. Then the orientation of the line is compatible with the orientation of the segments;
-
The lines do not cross.
It is not hard to prove that
iff such a diagram exists, while
otherwise.
1.5.4 Moments of
.
Utilizing this description one can prove, for example, that
where
is the number of non-crossing pairings between the integers
. Recall that a pairing of
is an equivalence relation on this set, so that each equivalence class has exactly two elements. Non-crossing pairings are ones for which one can draw lines above the real axis
, connecting the equivalence classes of the pairings, and having no intersections (more generally, one can in a similar way define non-crossing partitions of the set
). We shall later see that the moments of
are related to the semicircle law.
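The moment count can be verified directly in the simplest case. The following minimal sketch assumes a one-dimensional Hilbert space spanned by a unit vector, so that the full Fock space has one basis vector per tensor power, and it takes the expectation to be the vacuum state (the vector state of the vacuum vector); the function names are hypothetical. It computes the vacuum moments of the sum of the creation operator and its adjoint and compares them with the Catalan numbers, which count non-crossing pairings.

```python
from math import comb

def vacuum_moment(n):
    """Vacuum expectation of (l + l*)^n on the full Fock space of a 1-dim space.

    vec[k] is the coefficient of the k-th tensor power of the unit vector;
    vec[0] is the coefficient of the vacuum vector.
    """
    vec = [0] * (n + 2)
    vec[0] = 1
    for _ in range(n):
        new = [0] * (n + 2)
        for k, c in enumerate(vec):
            if c == 0:
                continue
            new[k + 1] += c      # creation: raises the tensor degree by one
            if k > 0:
                new[k - 1] += c  # annihilation: lowers it, and kills the vacuum
        vec = new
    return vec[0]

def catalan(m):
    """Number of non-crossing pairings of a set with 2m elements."""
    return comb(2 * m, m) // (m + 1)

moments = [vacuum_moment(n) for n in range(9)]
print(moments)
```

Odd moments vanish because an odd number of steps can never return to the vacuum; even moments count the up/down paths that do, which is the same count as non-crossing pairings.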
Non-crossing diagrams and non-crossing partitions have a very deep connection with free probability; this connection is beyond the scope of these notes (see e.g. [Spe98] ). We will point out later, however, how this connection explains the relationship between freeness and large random matrices.
1.5.5 Free independence.
Let
and let
be given by
The
-algebra
is an extension of the Cuntz algebra
,
if
and is isomorphic to
if
.
It is not hard to prove that if
are two subspaces of
, then the algebras
are freely independent in
.
1.6 Free Central Limit Theorem.
1.6.1 Convergence in moments.
We say that a sequence
of random variables converges in moments to the law of a random variable
, if
in moments; that is to say, for any
,
This definition makes sense verbatim (with the replacement of
by
) in the setting of a non-commutative probability space.
1.6.2 Classical CLT
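Let $X_1, X_2, \ldots$ be independent random variables, so that for all $j$, $E(X_j) = 0$, $E(X_j^2) = 1$, and so that for any $k$, $\sup_j |E(X_j^k)| < C_k$ for some constants $C_k$. The classical central limit theorem states:
Theorem 1.6.
Let $Y_N = \frac{X_1 + \cdots + X_N}{\sqrt{N}}$. Then the laws of the random variables $Y_N$ converge in moments to the Gaussian law $\mu$ given by $d\mu(t) = \frac{1}{\sqrt{2\pi}} \exp(-t^2/2)\,dt$.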
The main tool used in the proof of this theorem is the fact that if
and
are independent random variables, then the law of their sum is given by a convolution formula:
One then utilizes the fact that the Fourier transform
satisfies
Thus if we write
then
Using this one can compute
and argue that it is quadratic in
. This implies that
converge in moments to a measure whose Fourier transform is proportional to
, so that
.
1.6.3 Free CLT
Amazingly, the statement of the free central limit theorem is essentially the same as that of the classical one. The only difference is the replacement of the requirement of independence by that of free independence. This is only a single example of a surprising number of parallels between the behavior of independent and freely independent random variables.
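Let $X_1, X_2, \ldots$ be freely independent random variables, so that for all $j$, $\phi(X_j) = 0$, $\phi(X_j^2) = 1$, and so that for any $k$, $\sup_j |\phi(X_j^k)| < C_k$ for some constants $C_k$. The free central limit theorem then states:
Theorem 1.7.
[Voi85]
Let $Y_N = \frac{X_1 + \cdots + X_N}{\sqrt{N}}$. Then the laws of the random variables $Y_N$ converge in moments to the semicircle law $\mu$ given by $d\mu(t) = \frac{1}{2\pi}\sqrt{4 - t^2}\,\chi_{[-2,2]}(t)\,dt$.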
We will postpone the proof of this theorem until we get to talk about the $R$-transform. For now let us just note that we need a tool to compute the distribution of
in terms of the distributions of
and
if
and
are freely independent.
1.7 Free Harmonic Analysis.
The corresponding classical problem involved computing the convolution of two measures via the Fourier transform.
1.7.1 Free additive convolution.
By analogy with the classical situation, Voiculescu gave the following definition:
Definition 1.8.
[Voi85]
Let $\mu$ and $\nu$ be two probability measures on $\mathbb{R}$. We define their free additive convolution $\mu \boxplus \nu$ to be the law of the random variable $X + Y$, where $X$ and $Y$ are freely independent, and $X \sim \mu$, $Y \sim \nu$.
Since
are free in
, the freeness condition determines the restriction of
to
in terms of the restrictions of
to
,
. Thus the joint distribution of
and
depends only on
and
. Thus the distribution of
(which depends only on the joint distribution of
and
) depends only on
and
. It follows that
is well-defined.
Note that
is an operation on the space of probability measures on
.
Example 1.9.
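Let $\mu$ be a probability measure and let $\delta_x$ be the point mass at $x$. Then $\mu \boxplus \delta_x = \mu(\cdot - x)$, the translate of $\mu$ by $x$.
In particular, $\mu \boxplus \delta_x$ is the same as the classical convolution $\mu * \delta_x$.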
1.7.2 $R$-transform.
There is a free analog of the logarithm of the Fourier transform, which linearizes free additive convolution.
Let $\mu$ be a probability measure on $\mathbb{R}$, and let $G_\mu(\zeta) = \int \frac{d\mu(t)}{\zeta - t}$ be a function defined in the upper half-plane. This function is sometimes called the Cauchy transform of $\mu$.
If $\mu$ has moments of all orders (e.g., if it is compactly supported), $G_\mu$ is a power series in $1/\zeta$, and we have $G_\mu(\zeta) = \sum_{n \geq 0} m_n \zeta^{-n-1}$, where $m_n = \int t^n\,d\mu(t)$ are the moments of $\mu$. Thus $G_\mu$ is the generating function for the moments of $\mu$.
Define $R_\mu$ by the equation $G_\mu\big(R_\mu(z) + \tfrac{1}{z}\big) = z$. It turns out that $R_\mu$ is analytic in a certain region in $\mathbb{C}$; however, one can simply understand it as a formal power series in $z$ and regard the equation above as an equation involving composition of formal power series.
Voiculescu proved the following linearization theorem, which shows that the map $\mu \mapsto R_\mu$ is a free analog of the logarithm of the Fourier transform.
Theorem 1.10.
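[Voi85]
Let $R_\mu(z) = \sum_{n \geq 0} \alpha_n z^n$ be the $R$-transform of $\mu$. Then: (a) $\alpha_n$ is a universal polynomial expression in the first $n+1$ moments of $\mu$; (b) $R_\mu(z) = z$ if and only if $\mu$ is the semicircle law; i.e., $d\mu(t) = \frac{1}{2\pi}\sqrt{4 - t^2}\,\chi_{[-2,2]}(t)\,dt$; (c) $R_{\mu \boxplus \nu} = R_\mu + R_\nu$; (d) If $X$ has law $\mu$ and $Y = \lambda X$, then $R_{\mu_Y}(z) = \lambda R_\mu(\lambda z)$.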
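The linearization property can be used as a practical recipe for moments of a free additive convolution: pass from moments to the coefficients of the R-transform, add them, and pass back. The sketch below is not taken from the notes. It uses the standard moment/free-cumulant recursion, with the convention that the n-th free cumulant is the coefficient of the (n-1)-st power of z in the R-transform; the function names are hypothetical.

```python
def series_mul(a, b, order):
    """Product of two power series (coefficient lists), truncated at z^order."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a[:order + 1]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:order + 1 - i]):
            out[i + j] += ai * bj
    return out

def free_cumulants(moments):
    """[m_1, ..., m_N] -> [k_1, ..., k_N], where k_n is the z^(n-1) coefficient of R."""
    size = len(moments)
    series = [1] + list(moments)          # M(z) = 1 + m_1 z + m_2 z^2 + ...
    kappa = []
    for n in range(1, size + 1):
        total = series[n]
        power = [1] + [0] * size          # M(z)^0
        for s in range(1, n):
            power = series_mul(power, series, size)
            total -= kappa[s - 1] * power[n - s]
        kappa.append(total)
    return kappa

def moments_from_cumulants(kappa):
    """Inverse of free_cumulants: m_n = sum_s k_s * [z^(n-s)] M(z)^s."""
    size = len(kappa)
    series = [1] + [0] * size
    for n in range(1, size + 1):
        total = 0
        power = [1] + [0] * size
        for s in range(1, n + 1):
            power = series_mul(power, series, size)
            total += kappa[s - 1] * power[n - s]
        series[n] = total
    return series[1:]

def free_convolution_moments(moments_a, moments_b):
    """Moments of the free additive convolution: cumulants simply add."""
    ka, kb = free_cumulants(moments_a), free_cumulants(moments_b)
    return moments_from_cumulants([x + y for x, y in zip(ka, kb)])

semicircle = [0, 1, 0, 2, 0, 5]   # moments of the semicircle law of variance 1
bernoulli = [0, 1, 0, 1, 0, 1]    # moments of the symmetric law on {-1, +1}
print(free_cumulants(semicircle))
print(free_convolution_moments(bernoulli, bernoulli))
```

For the semicircle law only the second cumulant is non-zero, in line with part (b); the free convolution of two symmetric Bernoulli laws produces the moments of the arcsine law on the interval from -2 to 2, a standard example.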
1.7.3 Proof of additivity of $R$-transform.
We will sketch a proof of (a), (b) and (c). We start with a Lemma.
Lemma 1.11.
Let
be a non-commutative random variable. Fix
,
. For a sequence of numbers
, let
acting on the full Fock space
. Then there exists a unique sequence of numbers
, so that for each
,
Moreover, each
is a polynomial in
, and this polynomial is universal, and does not depend on
.
The proof is based on an inductive argument and the combinatorial formula for moments of free creation operators.
1.7.4 Combinatorial definition of $R$-transform.
Given
, let
be as in the Lemma above. Consider the formal power series
For now we’ll consider
given by this new definition, and call it the “combinatorial
-transform”. We shall later prove that
satisfies our old analytic definition in terms of
given above; in particular, it will follow that
.
1.7.5 Additivity of combinatorial $R$-transform.
Proposition 1.12.
.
-
Proof.
Let
be two free creation operators on the free Fock space
, associated to a pair of orthonormal vectors.
Given
and
, let
be random variables in
,
, respectively, so that their first
moments are the same as the first
moments of
and
, respectively.
Since
and
are freely independent,
and
are freely independent. Since moments of order up to
of
depend only on the moments of order up to
of
and
, we see that the moments of order up to
of
and
are the same.
We leave to the reader the combinatorial exercise to check that the moments of
are the same as the moments of
By the uniqueness statement in Lemma 1.11 , it follows that
as claimed. □
1.7.6 Analytic and combinatorial $R$-transforms are the same.
It now remains to prove that the combinatorial
-transform
satisfies the formula relating it to the Cauchy transform
(and so
). The proof of the following proposition is due to Haagerup [Haa97] .
Proposition 1.13.
Let
. With the above notation, one has
and
both equalities interpreted in terms of composition of formal power series.
-
Proof.
Let
be a free creation operator corresponding to a unit vector
, and acting on the full Fock space
. Let
, where
is a polynomial with real coefficients. Thus by definition,
.
For
with
, consider the vector
Then
Similarly,
Thus
is an eigenvector for
with eigenvalue
. Hence
| |
| |
| |
It follows that
Now choose
, so that
is invertible for
. This is possible, since
(since
is a polynomial). Hence
thus
| |
Since by definition of
,
Since all of the coefficients of the power series
are real, we get that
so that
We now substitute
to get
We also see that
is invertible with respect to composition on some neighborhood. Applying its inverse to both sides, and remembering that
, we get that
as claimed.
This concludes the proof in the case that
is a polynomial; the general statement can be deduced from this partial case by taking limits. □
We have thus proved (a) and (c) of Theorem 1.10 .
1.7.7 Semicircular variables.
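Let us prove (b). Assume that $R_\mu(z) = z$. Then $G_\mu(z + 1/z) = z$ and, writing $\zeta = z + 1/z$, $G_\mu(\zeta) + \frac{1}{G_\mu(\zeta)} = \zeta$. Solving this gives $G_\mu(\zeta) = \frac{\zeta - \sqrt{\zeta^2 - 4}}{2}$. One can recover $\mu$ from $G_\mu$ by the formula $d\mu(t) = -\frac{1}{\pi} \lim_{s \to 0^+} \operatorname{Im} G_\mu(t + is)\,dt$. Since $G_\mu(\zeta) \sim 1/\zeta$ as $\zeta \to \infty$ (as is apparent from the integral formula for the Cauchy transform), the branch of the square root must be chosen so that $\sqrt{\zeta^2 - 4} > 0$ for $\zeta$ real and large. It follows that $d\mu(t) = \frac{1}{2\pi}\sqrt{4 - t^2}\,dt$ on $[-2, 2]$ and $\mu = 0$ outside of this interval.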
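As a numerical sanity check of this computation, one can integrate against the semicircle density on the interval from -2 to 2 and compare with the closed form of the Cauchy transform and with the expected moments. This is a rough sketch using a midpoint rule; the step count and function names are arbitrary choices made for illustration.

```python
from math import sqrt, pi

def semicircle_density(t):
    """Density sqrt(4 - t^2) / (2 pi) on [-2, 2], zero elsewhere."""
    return sqrt(max(4.0 - t * t, 0.0)) / (2.0 * pi)

def integrate(h, steps=100000):
    """Midpoint rule on [-2, 2]."""
    width = 4.0 / steps
    return sum(h(-2.0 + (i + 0.5) * width) for i in range(steps)) * width

def moment(p):
    return integrate(lambda t: t ** p * semicircle_density(t))

def cauchy_numeric(zeta):
    """Integral of density(t) / (zeta - t) for real zeta outside [-2, 2]."""
    return integrate(lambda t: semicircle_density(t) / (zeta - t))

def cauchy_closed_form(zeta):
    """(zeta - sqrt(zeta^2 - 4)) / 2, valid for real zeta > 2."""
    return (zeta - sqrt(zeta * zeta - 4.0)) / 2.0

print(moment(0), moment(2), moment(4))
print(cauchy_numeric(3.0), cauchy_closed_form(3.0))
```

The second and fourth moments should come out close to 1 and 2, the first two non-trivial Catalan numbers, and the two values of the Cauchy transform at the sample point 3 should agree to several digits.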
Note that if
, then
unless
. It follows that the variable
on the Fock space
has semicircular distribution.
1.7.8 Proof of free CLT
We are now ready to give a proof of the free central limit theorem.
Let
be freely independent random variables satisfying the assumptions of the free central limit theorem, and let
| |
| |
Let
be the law of
, let
be the law of
and let
be the law of
. Thus
and because of additivity of the $R$-transform,
Write
. Since the coefficient of
in
is a universal polynomial in the moments up to order
of
, and
, it follows that
, where
are some constants independent of
.
Thus
If
is fixed, the estimate
implies that
If
, the fact that
so that
implies that
for all
. Finally, the fact that
implies that
and
for all
.
We conclude that
as
in the sense of coefficient-wise convergence of formal power series. Since the
-th moment of
is a universal polynomial in the first
coefficients of the power series
, it follows that the
-th moment of
converges to the
-th moment of the unique measure
for which
. We saw above that this implies that
is then the semicircle measure, and so
.
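The mechanism of the proof can be seen numerically in the identically distributed case. The sketch below assumes N freely independent copies of one variable and combines additivity of the R-transform with its behaviour under dilation: the p-th free cumulant of the normalized sum is N times the p-th cumulant of the variable, divided by the p-th power of the square root of N. The sample cumulants are those of the symmetric law on {-1, +1}; the function name is hypothetical.

```python
from math import sqrt

def cumulants_of_normalized_sum(kappa, n_copies):
    """Free cumulants of (X_1 + ... + X_N) / sqrt(N) for N free copies of X.

    kappa[p - 1] is the p-th free cumulant of X, i.e. the coefficient of
    z^(p - 1) in its R-transform.
    """
    return [n_copies * k / sqrt(n_copies) ** p
            for p, k in enumerate(kappa, start=1)]

bernoulli_cumulants = [0, 1, 0, -1, 0, 2]  # symmetric law on {-1, +1}

for n_copies in (1, 100, 10000):
    print(n_copies, cumulants_of_normalized_sum(bernoulli_cumulants, n_copies))
```

The second cumulant stays equal to 1 while every higher cumulant decays like a negative power of N, so the R-transform converges coefficient-wise to z.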
1.8 Further topics.
We already briefly touched upon the amazing correspondence between various theorems in the classical and free context. There are several other instances of this. For example, one can consider the free analog of infinite divisibility. A measure $\mu$ is called infinitely divisible if for any $n$ there is a measure $\mu_n$ so that $\mu$ is the $n$-fold convolution $\mu_n * \cdots * \mu_n$. One can say that $\mu$ is freely infinitely divisible if for each $n$ there is a measure $\mu_n$ so that $\mu$ is the $n$-fold free convolution $\mu_n \boxplus \cdots \boxplus \mu_n$. Remarkably, there is a one-to-one correspondence between the classically infinitely divisible measures and the free ones. A similar situation occurs when considering stable and freely stable laws.
There is also a notion of multiplicative free convolution, based on taking products of non-commutative random variables.
The reader is encouraged to consult [Voi00] for more details.
2 Random Matrices and Free Probability.
One of the most important advances in free probability theory was Voiculescu’s discovery that free probability theory describes the asymptotic distribution of certain large random matrices.
This has led to a number of applications of free probability theory, both to spectral computations for random matrices, and to von Neumann algebras. The latter applications rely on the somewhat unexpected presence of a “matricial” structure in free probability theory: if one takes several square arrays of certain free random variables and creates several matrices out of these arrays, then the resulting matrices have surprising freeness properties (for example, the resulting matrices may be freely independent).
2.1 Random matrices.
A random matrix is a matrix, whose entries are random variables. One can also think of a random matrix as a matrix-valued random variable, i.e., as a randomly chosen matrix. Any Borel function of a random matrix becomes then a random variable. For example, the eigenvalues of a random matrix (being functions of its entries) are themselves random variables.
2.1.1 Expected distributions.
Let
be a self-adjoint random matrix of size
. We think of
as a function
on some probability space
. Integration with respect to
has the meaning of taking the expected value and will be denoted by
.
One is frequently interested in the expected proportion of the eigenvalues of
that lie in a given interval
:
| |
| |
Let
be the eigenvalues of
, listed with multiplicity, and viewed as random variables. Let
be a random measure associated with this list of eigenvalues (we say that
is random to emphasize that it depends on
, i.e., is a measure-valued random variable). Then
is the expected value of
. Thus if we set
we obtain that
Note that
where
is the spectral measure of
. In other words,
is the distribution of
, when viewed as a random variable in
. Thus
is the “expected value of the distribution of
”.
2.2 Asymptotics of random matrices.
We are mainly interested in the asymptotics of the expected number of eigenvalues of a random matrix in a given interval. In other words, we are interested in studying the asymptotics of the measure
as
.
It should be mentioned that the eigenvalue distributions of random matrices have been studied in several ways. Instead of looking at the expected numbers of eigenvalues, there is also interest in the behavior of eigenvalue spacing (normalized so that the average spacing is
). One is also interested in the behavior of the largest and smallest eigenvalues (this translates into considering the expected value of the spectral radius, or the operator norm, of the matrix
). We have already heard in this conference of the significant progress recently made by Haagerup and Thorbjørnsen on the latter problem, in the case that
is an arbitrary polynomial of a
-tuple of Gaussian random matrices.
2.2.1 Wigner’s theorem for Gaussian random matrices.
Let
be a self-adjoint random matrix, whose entries are
,
, determined as follows. The variables
are independent; if
, then
is a centered complex Gaussian random variable of variance
; if
, then
is a centered real Gaussian random variable of variance
. Finally, if
,
.
One can think of the random matrix
as a map
Here
is the space of complex
matrices,
is the map
and
is the Gaussian measure on
given by
for a suitable constant
.
Let
be as before the expected value of the distribution of
.
Then
weakly as
. This is a very old result, going back to the work of Wigner in the 1950s [Wig55] .
It turns out that the semicircle law is fairly universal for matrices with independent identically distributed entries. In fact, Wigner’s original work involved matrices
whose entries were not Gaussian, but random signs.
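Wigner's theorem is easy to observe experimentally. The sketch below samples one self-adjoint Gaussian matrix in pure Python and computes its normalized trace moments, which should be close to the semicircle moments 0, 1, 0, 2. The normalization is an assumption made for this illustration (each entry has second absolute moment 1/N, with real diagonal entries); the matrix size, the seed, and the function names are arbitrary.

```python
import random

def sample_gue(n, rng):
    """Self-adjoint n x n Gaussian matrix with E|g_ij|^2 = 1/n (assumed normalization)."""
    x = [[0j] * n for _ in range(n)]
    diag_sigma = (1.0 / n) ** 0.5
    off_diag_sigma = (0.5 / n) ** 0.5   # real and imaginary parts separately
    for i in range(n):
        x[i][i] = complex(rng.gauss(0.0, diag_sigma), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.gauss(0.0, off_diag_sigma),
                        rng.gauss(0.0, off_diag_sigma))
            x[i][j] = z
            x[j][i] = z.conjugate()
    return x

def matmul(a, b):
    n = len(a)
    cols = list(zip(*b))
    return [[sum(a[i][k] * cols[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_moments(x, max_p):
    """Normalized traces (1/n) Tr(x^p) for p = 1, ..., max_p."""
    n = len(x)
    power = x
    moments = []
    for p in range(1, max_p + 1):
        moments.append(sum(power[i][i] for i in range(n)).real / n)
        if p < max_p:
            power = matmul(power, x)
    return moments

rng = random.Random(0)
moments = trace_moments(sample_gue(40, rng), 4)
print(moments)
```

Even a single 40 by 40 sample already lands near the limiting values, because the fluctuations of normalized traces are of order 1/N.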
2.2.2 Voiculescu’s asymptotic freeness results.
The semicircular law also arose in free probability theory as the central limit law. Voiculescu showed that this is not just a coincidence: families of certain
random matrices behave as free random variables in the large
asymptotics.
For each
, let
be a diagonal matrix; assume that the operator norms
are uniformly bounded in
, and assume that the distribution of
(as an element of
) converges in moments to a limit measure
. Let
be random matrices described as follows. Let
with the measure
given by
for a suitable constant
. Then
is the map
More explicitly, if we denote by
the
-th entry of
, then
form a family of independent centered Gaussian random variables, so that:
is a complex Gaussian of variance
if
;
is real Gaussian of variance
; and
if
.
The family
is sometimes called the Gaussian Unitary Ensemble (or GUE) because of the obvious invariance of their joint distribution under conjugation by
unitaries.
Let
be the distribution of the family
, viewed as a linear functional on the space of polynomials in
indeterminates.
Then Voiculescu proved:
Theorem 2.1.
[
Voi91]
Let
be a family of free random variables in a non-commutative probability space
, so that
has distribution
, and
have semicircular distribution. Let
be the distribution of this family, and let
be the distribution of
as described above. Then as
,
in moments.
In other words, for any
and any
,
one has
| |
| |
Note that in particular we have that
and
are asymptotically free. One also recovers Wigner’s result, since in particular
, and
is the semicircle law.
2.2.3 Some remarks on the proof.
We will not prove this theorem here; see e.g. [VDN92] for a proof.
We shall only sketch the essential combinatorial trick used in the proof and explain its connection to non-crossing partitions.
We concentrate on the case of a single random matrix
with Gaussian entries
(depending on
).
Consider the value of the moment
|
(2.1)
|
If
is odd, it is not hard to see that the value of the moment is zero, so we’ll assume that
is even for the remainder of the proof.
Since
are Gaussian of variance
,
is zero unless the variable
entering in the product “pair up” with another variable
entering the product, and
,
(so that
). That is to say, a term in the sum ( 2.1 ) is zero unless for some pairing
of the set
with itself, the indices
satisfy the equations
|
(2.2)
|
(where
is understood as the remainder mod
, and
iff
and
are in the same equivalence class of
).
Suppose now that we fix
and ask how large a contribution we can get from all of the terms that satisfy ( 2.2 ) for this given
. The equations ( 2.2 ) can be visualized as follows. Let
be the cyclic graph with
edges, numbered
through
. Place
on the vertices of this graph, so that the
-th edge, oriented clockwise, has vertices
and
, in that order (
is again understood modulo
). In other words, we can think of the map
as a function on the vertices of
.
The pairing
defines an equivalence relation on the set of edges of
: edges
and
are equivalent if
. Form the quotient graph
by gluing equivalent edges with orientation reversed. Then ( 2.2 ) is equivalent to saying that the function
descends to a function on the quotient graph
.
The total number of such functions is
, where
is the number of vertices of
.
Because the variance of
is
, we can deduce that the contribution to the sum ( 2.1 ) of those terms that satisfy equations ( 2.2 ) for a given
is at most
The first factor
comes from the normalization of the trace; the term
comes from the bound on the variance; and the factor
comes from our estimation of the number of indices
satisfying ( 2.2 ). It follows that the contribution of all of the terms that satisfy ( 2.2 ) for a given
is negligible (is of order
) if
.
Recall that
has exactly
edges and that
is even. Thus
has exactly
edges. It follows that
has
vertices exactly if it is a tree. With a little bit of care, one can show that ( 2.1 ) is then equal to
On the other hand, we mentioned in § 1.5.4 that the
-th moment of a semicircular element is given by
where
stands for the set of non-crossing pairings of
. It is not hard to see that if we interpret a pairing
of
as a pairing of edges of
, it is non-crossing if and only if
is a tree. This concludes the proof.
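The combinatorial claim at the heart of this argument can be confirmed by brute force for small even p. The sketch below is an illustration with hypothetical helper names: it enumerates all pairings of the p edges of the cyclic graph, glues paired edges with orientation reversed using a union-find structure on the vertices, and checks that the quotient has p/2 + 1 vertices exactly when the pairing is non-crossing. Edge j is taken to run from vertex j to vertex j + 1 modulo p.

```python
from math import comb

def pairings(elements):
    """Yield all pairings of a sorted list as lists of pairs (small, large)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def is_noncrossing(pairing):
    """A pairing crosses if a < c < b < d for two of its pairs (a, b), (c, d)."""
    return not any(a < c < b < d for a, b in pairing for c, d in pairing)

def quotient_vertex_count(pairing, p):
    """Vertices left after gluing paired edges of the p-cycle, orientation reversed."""
    parent = list(range(p))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for j, k in pairing:
        # Edge j = (j, j+1) is glued to edge k = (k, k+1) reversed.
        for u, v in ((j, (k + 1) % p), ((j + 1) % p, k)):
            parent[find(u)] = find(v)
    return len({find(v) for v in range(p)})

def tree_pairing_count(p):
    """Number of pairings whose quotient graph has p/2 + 1 vertices."""
    return sum(1 for pi in pairings(list(range(p)))
               if quotient_vertex_count(pi, p) == p // 2 + 1)

for p in (2, 4, 6, 8):
    print(p, tree_pairing_count(p), comb(p, p // 2) // (p // 2 + 1))
```

The two printed counts agree for each p: the pairings whose quotient is a tree are counted by the Catalan numbers, which is also the number of non-crossing pairings.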
2.3 An application to random matrix theory.
Keeping the notations of Theorem 2.1 , let
. It is not hard to work out the limit distribution of
using free probability tools. Indeed,
On the other hand,
and
are freely independent. Thus
The computation of the limit distribution of
can then be carried out using the machinery of the $R$-transform.
2.4 Applications to von Neumann algebras.
Let us say that a non-commutative non-self-adjoint random variable
is circular if
and
are freely independent and are semicircular.
If
and
are two GUE random matrices, then
converges in
-distribution to a circular variable.
If we start with
GUE random matrices
,
, then we can form a new matrix,
of size
, where
. It is not hard to see that
is a pair of GUE random matrices. We thus obtain that
is circular in the limit
. From this it is not hard to prove that if
,
are a free circular family, then the matrix
is again circular. In fact, one can use the asymptotic freeness result to show that if we let
be the algebra of scalar diagonal
matrices, then
is free from
.
This fact underlies the earliest applications of free probability theory to von Neumann algebras and free group factors. For example, one has the following result of Voiculescu [Voi90] :
Theorem 2.2.
Let
be an integer, and let
be a rational number, so that
is an integer. Let
be a projection in the free group factor
associated to the free group on
generators. Assume that
has trace
. Then
|
(2.3)
|
This theorem has many far-reaching extensions due to Dykema and Radulescu, see e.g.
[Voi90, Dyk95, Dyk93b, Dyk93a, Dyk94, Răd92, Răd94]. For example, it turns out that it is possible to define for each
a von Neumann algebra
, called an interpolated free group factor, in such a way that
is the von Neumann algebra on the free group with
generators, if
is an integer. Moreover, the compression formula (2.3) remains valid for non-rational traces of
: the result is an interpolated free group factor with
generators; the same formula is valid also for non-integer
. For a II$_1$ factor $M$, its fundamental group was defined by Murray and von Neumann to be the set
Radulescu proved that
(Voiculescu’s result quoted above implied that the positive rational numbers
). In fact, it turns out that there is a dichotomy:
either all interpolated free group factors are the same among each other (and also are isomorphic to
), and all have
as their fundamental groups; or
for finite
, and
for finite
. It is not known which of the two alternatives holds.
Further developments of these techniques gave information on fundamental groups of more general free products of von Neumann algebras and on subfactors of
(see e.g.
[Răd94, Dyk95, Shl98, Shl99, PS03, SU02, DR00] ).
3 Free Entropy via Microstates.
Free entropy was introduced and developed by Voiculescu in a series of papers [Voi93, Voi94, Voi96, Voi97, Voi98b, Voi98a, Voi99a, Voi99b] as a free probability analogue of the classical information-theoretic entropy; see also Voiculescu’s survey [Voi02] .
3.1 Definition of free entropy.
Voiculescu’s original “microstates” approach to free entropy followed Boltzmann’s definition of entropy of a macroscopic state.
3.1.1 Microstates and Macrostates.
Assume that the macroscopic behavior of a physical system (e.g. gas) is described by several macroscopic parameters (e.g., pressure, volume and temperature). Then a macrostate
is a state of the system corresponding to certain prescribed values of these parameters.
Microscopically, the system is made out of a large number of smaller systems (e.g., the molecules that make up the gas). On this microscopic level, the system can be described by a microstate
that specifies exactly the states of all of the sub-systems (e.g., the exact locations and momenta of all of the molecules of the gas). If we fix a macrostate
, there are many microstates
that lead to the same macroscopic state.
Boltzmann’s formula is then that the entropy of
must be given by
for some constant
.
3.1.2 Matricial microstates.
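Voiculescu’s idea is to interpret the joint law $\mu_{X_1,\dots,X_n}$ of a family $X_1,\dots,X_n$ as a description of a macroscopic state of a system, and as microstates to take the set of all matrices $(x_1,\dots,x_n)$ of a specific dimension that approximate $X_1,\dots,X_n$ in law. More precisely, for $X_1,\dots,X_n$ in a non-commutative probability space $(M,\tau)$, $X_j = X_j^*$, let $M_k^{sa}$ be the space of $k\times k$ self-adjoint matrices, and consider the set
$$\Gamma(X_1,\dots,X_n; m, k, \varepsilon) = \Big\{ (x_1,\dots,x_n) \in (M_k^{sa})^n : \big| \tau(X_{i_1}\cdots X_{i_p}) - \tfrac{1}{k}\mathrm{Tr}(x_{i_1}\cdots x_{i_p}) \big| < \varepsilon \text{ for all } 1 \le p \le m,\ 1 \le i_1,\dots,i_p \le n \Big\}.$$
In other words, we are considering a weak neighborhood $U = U(m,\varepsilon)$ of the joint law $\mu_{X_1,\dots,X_n}$ defined by the property that $\nu \in U$ iff the value of the law $\nu$ on all words of length at most $m$ deviates by less than $\varepsilon$ from that of $\mu_{X_1,\dots,X_n}$. Next, we consider all self-adjoint $k\times k$ matrices $x_1,\dots,x_n$ so that $\mu_{x_1,\dots,x_n} \in U$, where the law of the matrices is taken with respect to the normalized trace $\frac{1}{k}\mathrm{Tr}$. The set $\Gamma(X_1,\dots,X_n; m, k, \varepsilon)$ is called the set of (matricial) microstates for $X_1,\dots,X_n$.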
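The membership condition for a set of microstates can be sketched numerically. The following is a minimal sketch, assuming the law is supplied as a function `law(word)` returning the value of $\tau$ on the word $X_{i_1}\cdots X_{i_p}$ (indices starting at 0); the names `in_microstate_set` and `bernoulli_law` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

def in_microstate_set(matrices, law, max_len, eps):
    """Check whether a tuple of k x k matrices is a microstate.

    matrices: list of n square arrays, required to be self-adjoint.
    law: function taking a tuple of indices (a word) to the moment tau(word).
    max_len: maximal word length m; eps: allowed deviation.
    """
    k = matrices[0].shape[0]
    n = len(matrices)
    if not all(np.allclose(a, a.conj().T) for a in matrices):
        return False
    for length in range(1, max_len + 1):
        for word in itertools.product(range(n), repeat=length):
            product = np.eye(k, dtype=complex)
            for j in word:
                product = product @ matrices[j]
            # Compare the normalized trace with the prescribed moment.
            if abs(np.trace(product) / k - law(word)) >= eps:
                return False
    return True

def bernoulli_law(word):
    """Moments of one variable with law (delta_{-1} + delta_{+1}) / 2."""
    return 1.0 if len(word) % 2 == 0 else 0.0

good = np.diag([1.0, -1.0, 1.0, -1.0])
bad = np.diag([1.0, 1.0, 1.0, -1.0])
is_good = in_microstate_set([good], bernoulli_law, max_len=4, eps=0.1)
is_bad = in_microstate_set([bad], bernoulli_law, max_len=4, eps=0.1)
```

Here `good` reproduces all moments of the symmetric Bernoulli law exactly, while `bad` already fails at the first moment. The loop over all words has cost exponential in `max_len`, so this is only meant for very small examples.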
3.1.3 Definition of free entropy.
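Voiculescu then defined the free entropy by
$$\chi(X_1,\dots,X_n) = \inf_{m,\varepsilon}\ \limsup_{k\to\infty} \Big[ \frac{1}{k^2} \log \mathrm{vol}\, \Gamma(X_1,\dots,X_n; m, k, \varepsilon) + \frac{n}{2} \log k \Big],$$
where $\mathrm{vol}$ refers to the Euclidean volume associated to the standard identification of $(M_k^{sa})^n$ with $\mathbb{R}^{nk^2}$. We use the convention that $\log 0 = -\infty$.
We should note that $\chi(X_1,\dots,X_n)$ depends only on the law of $X_1,\dots,X_n$ and not on the particular realization of this law. It would be also appropriate to write $\chi(\mu_{X_1,\dots,X_n})$.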
3.1.4 Relation to Connes’ problem.
Note that there is no a priori reason for $\Gamma(X_1,\dots,X_n; m, k, \varepsilon)$ to be non-empty. Connes has posed a question in [Con76] of whether every II$_1$ factor can be embedded into an ultrapower of the hyperfinite II$_1$ factor. It is not hard to see that his question is equivalent to the question of whether, given self-adjoint $X_1,\dots,X_n$ in a von Neumann algebra $M$ with $\tau$ a trace, one has that for any $m$ and $\varepsilon > 0$ there is a $k$ so that $\Gamma(X_1,\dots,X_n; m, k, \varepsilon) \neq \emptyset$. This question is open even for $X_1,\dots,X_n$ elements of the group algebra of an arbitrary discrete group $\Gamma$.
3.2 Properties of free entropy.
Voiculescu gave an explicit formula for the free entropy of a single variable $X$ with law $\mu$:
$$\chi(X) = \iint \log|s-t| \, d\mu(s)\, d\mu(t) + C$$
for a certain universal constant $C$.
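The one-variable formula expresses $\chi(X)$ as the logarithmic energy $\iint \log|s-t|\,d\mu(s)\,d\mu(t)$ of the law $\mu$ plus a universal constant. A minimal numerical sketch follows, assuming $\mu$ is approximated by equally weighted points and taking for the constant the value $\frac{3}{4} + \frac{1}{2}\log 2\pi$ usually quoted from Voiculescu's papers (the notes above only say "a certain universal constant", so treat that value as an assumption). The function name is hypothetical.

```python
import numpy as np

def logarithmic_energy(points):
    """Approximate the double integral of log|s - t| against mu x mu.

    mu is approximated by equal point masses at `points` (assumed distinct);
    diagonal terms, where the logarithm is singular, are left out.
    """
    x = np.asarray(points, dtype=float)
    differences = np.abs(x[:, None] - x[None, :])
    off_diagonal = ~np.eye(len(x), dtype=bool)
    return np.log(differences[off_diagonal]).sum() / len(x) ** 2

# Assumed value of the universal constant (not stated in the notes).
ASSUMED_CONSTANT = 0.75 + 0.5 * np.log(2 * np.pi)

# Uniform law on [0, 1]; the exact logarithmic energy is -3/2.
grid = (np.arange(400) + 0.5) / 400
energy_uniform = logarithmic_energy(grid)
chi_uniform = energy_uniform + ASSUMED_CONSTANT
```

The discretization converges slowly (the error is of order $\log N / N$ for $N$ points), which is enough to see the value $-3/2$ to about two digits.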
Free entropy has a number of nice properties, related to freeness and analogous to the properties of classical entropy; we list a few, due to Voiculescu [Voi94] :
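-
If $X_1,\dots,X_n$ are free, then $\chi(X_1,\dots,X_n) = \chi(X_1) + \cdots + \chi(X_n)$. Furthermore, if $\chi(X_1,\dots,X_n) = \chi(X_1) + \cdots + \chi(X_n) > -\infty$, then $X_1,\dots,X_n$ are freely independent.
-
$\chi(X_1,\dots,X_n,Y_1,\dots,Y_m) \le \chi(X_1,\dots,X_n) + \chi(Y_1,\dots,Y_m)$.
-
$\chi(X_1,\dots,X_n)$ is maximal subject to $\sum_j \tau(X_j^2) = n$ iff $X_1,\dots,X_n$ is a free semicircular family and each $X_j$ satisfies $\tau(X_j^2) = 1$.
-
If $S_1,\dots,S_n$ are free semicircular variables, freely independent from the family $X_1,\dots,X_n$, then $W^*(X_1,\dots,X_n)$ embeds into the ultrapower of the hyperfinite II$_1$ factor if and only if $\chi(X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n) > -\infty$ for every $\varepsilon > 0$. Thus semicircular perturbations (i.e., “free Brownian motion”) have a regularization effect on free entropy.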
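To give but one example of the technical difficulties that working with $\chi$ presents, one would be able to prove that $\chi(X_1,\dots,X_n,Y_1,\dots,Y_m) = \chi(X_1,\dots,X_n) + \chi(Y_1,\dots,Y_m)$ if $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are free families, provided that one could argue that the $\limsup$ in the definition of free entropy is a limit.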
3.2.1 Infinitesimal change of variables formula.
We end the review of free entropy by mentioning the change of variables formula [Voi94] .
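Assume that $Y_1,\dots,Y_n$ are given as non-commutative power series in $X_1,\dots,X_n$: $Y_j = F_j(X_1,\dots,X_n)$. Assume moreover that the multi-radius of convergence of $F_j$ is large enough to exceed the norms of all $X_i$. Assume further that $X_j = G_j(Y_1,\dots,Y_n)$ for some non-commutative power series $G_j$, and that similarly the multi-radius of convergence of $G_j$ is large enough to exceed the norms of $Y_1,\dots,Y_n$.
Let $M = W^*(X_1,\dots,X_n)$, and let $\tau$ be the given trace on $M$. Consider the derivation $\partial_j$, defined on non-commutative polynomials (and power series) in $X_1,\dots,X_n$ with values in $M\otimes M^{op}$, determined by
$$\partial_j X_i = \delta_{ij}\, 1\otimes 1.$$
For example,
$$\partial_1 (X_1 X_2 X_1) = 1\otimes X_2 X_1 + X_1 X_2 \otimes 1.$$
Let $\partial F$ be the “Jacobian” of $F = (F_1,\dots,F_n)$: $\partial F = (\partial_j F_i)_{i,j=1}^n \in M_n(M\otimes M^{op})$. Then
$$\chi(Y_1,\dots,Y_n) = \chi(X_1,\dots,X_n) + \log |\det| (\partial F),$$
where $|\det|$ refers to the Kadison-Fuglede determinant
$$|\det|(A) = \exp\big( \hat\tau( \log (A^*A)^{1/2} ) \big).$$
Here $\hat\tau$ is the tensor product $\mathrm{Tr}\otimes\tau\otimes\tau$ of the traces on $M_n$ and $M\otimes M^{op}$.
The explanation of this formula and the appearance of $M\otimes M^{op}$ is that the Jacobian of the transformation $F$ viewed as a map from $(M_k^{sa})^n$ to itself is naturally a matrix in $M_n(M_k\otimes M_k^{op})$, and is given by $\partial F$.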
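The derivation $\partial_j$ used in the change of variables formula acts on a monomial by cutting it at each occurrence of $X_j$ and placing the left and right pieces in the two tensor factors. A minimal sketch on monomials follows, assuming a word is encoded as a tuple of variable indices and a simple tensor $a\otimes b$ as a pair of words; the function name is hypothetical.

```python
from collections import Counter

def free_difference_quotient(word, j):
    """Apply the derivation to the monomial X_{word[0]} X_{word[1]} ...

    Returns a Counter mapping (left_word, right_word) to its integer
    coefficient, where the pair stands for the tensor left (x) right and
    the empty tuple stands for the unit 1.
    """
    terms = Counter()
    for position, letter in enumerate(word):
        if letter == j:
            # Remove this occurrence of X_j and split the word around it.
            terms[(word[:position], word[position + 1:])] += 1
    return terms

# d_1 (X_1 X_2 X_1) = 1 (x) X_2 X_1 + X_1 X_2 (x) 1
example = free_difference_quotient((1, 2, 1), 1)
```

Extending the function linearly to polynomials gives the full derivation; the Leibniz rule holds automatically because every occurrence of $X_j$ is visited exactly once.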
3.3 Free entropy dimension.
Voiculescu’s original idea for defining free entropy dimension was to consider a kind of asymptotic Minkowski dimension of the set of microstates. We present below an equivalent definition of K. Jung, which is based on packing dimension instead.
3.3.1 Packing and covering numbers and Minkowski dimension.
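For a metric space $X$, let $P_\varepsilon(X)$ be the packing number of $X$; that is, the maximal number of disjoint $\varepsilon$-balls that can be placed inside $X$. Similarly, let $K_\varepsilon(X)$ be the covering number of $X$; that is, the minimal number of $\varepsilon$-balls needed to cover $X$.
For a metric space $X$, the upper uniform packing dimension and the upper uniform covering dimension are the same and are defined as
$$\limsup_{\varepsilon\to 0} \frac{\log P_\varepsilon(X)}{|\log\varepsilon|} = \limsup_{\varepsilon\to 0} \frac{\log K_\varepsilon(X)}{|\log\varepsilon|}.$$
It is a theorem that if $X\subset\mathbb{R}^d$, then both of these numbers are the same as the Minkowski dimension of $X$, which is given by
$$d + \limsup_{\varepsilon\to 0} \frac{\log \mathrm{vol}\, N_\varepsilon(X)}{|\log\varepsilon|},$$
where $N_\varepsilon(X)$ denotes the tubular neighborhood of $X$ of radius $\varepsilon$.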
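Packing and covering numbers are hard to compute exactly, but a greedy construction bounds both at once. The sketch below is an illustration under stated assumptions (a finite point set, balls centred at points of the set, closed balls for covering): it builds a maximal $\varepsilon$-separated subset, whose size is at least the covering number by $\varepsilon$-balls and at most the packing number by $\varepsilon/2$-balls. The function name is hypothetical.

```python
def greedy_separated_net(points, eps, dist):
    """Return a maximal eps-separated subset of `points`.

    Every point of `points` lies within eps of some returned centre
    (so the eps-balls around the centres cover), and distinct centres
    are more than eps apart (so the eps/2-balls around them are disjoint).
    """
    centers = []
    for p in points:
        if all(dist(p, c) > eps for c in centers):
            centers.append(p)
    return centers

# Ten equally spaced points on a line, with the usual distance.
line_points = list(range(10))
net = greedy_separated_net(line_points, 2.5, lambda a, b: abs(a - b))
```

Because the covering and packing numbers at comparable scales sandwich the size of such a net, either one can be used in the dimension formula, which is the reason the two upper uniform dimensions agree.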
3.3.2 Free entropy dimension.
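Let $X_1,\dots,X_n \in (M,\tau)$ be self-adjoint. Then let
$$\mathbb{P}_\varepsilon(X_1,\dots,X_n) = \inf_{m,\gamma}\ \limsup_{k\to\infty} \frac{1}{k^2} \log P_\varepsilon\big(\Gamma(X_1,\dots,X_n; m, k, \gamma)\big),$$
$$\mathbb{K}_\varepsilon(X_1,\dots,X_n) = \inf_{m,\gamma}\ \limsup_{k\to\infty} \frac{1}{k^2} \log K_\varepsilon\big(\Gamma(X_1,\dots,X_n; m, k, \gamma)\big).$$
Then K. Jung proved the following theorem [Jun02]:
Theorem 3.1.
One has
$$\limsup_{\varepsilon\to 0} \frac{\mathbb{P}_\varepsilon(X_1,\dots,X_n)}{|\log\varepsilon|} = \limsup_{\varepsilon\to 0} \frac{\mathbb{K}_\varepsilon(X_1,\dots,X_n)}{|\log\varepsilon|}.$$
Moreover, if $S_1,\dots,S_n$ are free semicircular variables, free from $X_1,\dots,X_n$, then both limits are equal to
$$n + \limsup_{\varepsilon\to 0} \frac{\chi(X_1^\varepsilon,\dots,X_n^\varepsilon : S_1,\dots,S_n)}{|\log\varepsilon|},$$
where $X_j^\varepsilon = X_j + \varepsilon S_j$.
The value of any of these limits is then by definition called the free entropy dimension $\delta_0(X_1,\dots,X_n)$.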
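Here $\chi(X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n : S_1,\dots,S_n)$ is the free entropy of $X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n$ in the presence of $S_1,\dots,S_n$; it is a technical modification of the free entropy $\chi$. Very roughly, the value of $\chi(X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n : S_1,\dots,S_n)$ is the asymptotic logarithmic volume of an $\varepsilon$-tubular neighborhood of the set of microstates for $X_1,\dots,X_n$. Thus the number $\delta_0(X_1,\dots,X_n)$ is a kind of asymptotic Minkowski dimension of the set of microstates. This was the original definition of free entropy dimension given by Voiculescu. We finish this section with an example.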
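Let $X_1,\dots,X_n$ be free semicircular variables, and let $S_1,\dots,S_n$ be free semicircular variables, free from $X_1,\dots,X_n$. Then $X_j+\varepsilon S_j$ are also semicircular. In fact, $X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n$ is again a free semicircular family. It follows that $\delta_0(X_1,\dots,X_n) = n$. In particular, the free group factor $L(\mathbb{F}_n)$ can be generated by a family with free entropy dimension $n$.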
3.4 Properties of free entropy dimension.
The theory of free entropy dimension has found a number of spectacular applications to von Neumann algebra theory.
For example, Voiculescu used free entropy dimension to prove that free group factors do not have Cartan subalgebras; soon thereafter, L. Ge gave a proof that free group factors are prime, i.e., cannot be written as tensor products of infinite-dimensional von Neumann algebras.
One of the main remaining questions about free entropy dimension is the extent to which $\delta_0(X_1,\dots,X_n)$ depends on the elements $X_1,\dots,X_n$. Voiculescu asked if $\delta_0(X_1,\dots,X_n)$ is an invariant of the von Neumann algebra generated by $X_1,\dots,X_n$, taken with a fixed trace. Since $L(\mathbb{F}_n)$ has a generating family with free entropy dimension equal to $n$, a positive answer to this question would imply non-isomorphism of free group factors.
3.4.1 Invariance of $\delta_0$.
Voiculescu proved that $\delta_0(X_1,\dots,X_n)$ depends only on the restriction of the trace to the algebra generated by $X_1,\dots,X_n$.
In particular, if $\Gamma$ is a discrete group and $X_1,\dots,X_n$ are self-adjoint generators of the group algebra, then $\delta_0(X_1,\dots,X_n)$ depends only on the group. This invariant seems to be related to the $L^2$-cohomology of $\Gamma$; see below.
3.4.2 Free entropy dimension for a single variable.
Voiculescu proved that if $X$ has law $\mu$, then
$$\delta_0(X) = 1 - \sum_{t\in\mathbb{R}} \mu(\{t\})^2.$$
In particular, notice that $\delta_0(X)$ is an invariant of the von Neumann algebra (with a fixed trace) generated by $X$.
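For a single variable the free entropy dimension is $1 - \sum_t \mu(\{t\})^2$, so only the atoms of the law matter. A small sketch follows, assuming the atoms are given as a dictionary from values to masses (a hypothetical encoding; any diffuse part of $\mu$ is simply omitted since it contributes nothing to the sum).

```python
from fractions import Fraction

def single_variable_entropy_dimension(atoms):
    """Return 1 - (sum of squared atom masses) for a one-variable law.

    atoms: dict mapping each atom t of the law to its mass mu({t});
    the total mass of the atoms may be less than 1 if the law also
    has a diffuse part.
    """
    return 1 - sum(Fraction(mass) ** 2 for mass in atoms.values())

# A projection of trace 1/2 has law (delta_0 + delta_1) / 2.
dim_projection = single_variable_entropy_dimension(
    {0: Fraction(1, 2), 1: Fraction(1, 2)}
)
# A variable with diffuse law (e.g. a semicircular one) has no atoms.
dim_diffuse = single_variable_entropy_dimension({})
# A scalar has a single atom of mass one.
dim_scalar = single_variable_entropy_dimension({3: Fraction(1)})
```

The three values are $1/2$, $1$ and $0$ respectively, which matches the intuition that a diffuse variable is "one-dimensional" and a scalar is "zero-dimensional".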
3.4.3 Upper bounds on $\delta_0$.
If $M$ satisfies any of the following conditions, then $\delta_0(X_1,\dots,X_n) \le 1$ for any $X_1,\dots,X_n$ generating $M$:
-
(1)
[Voi96] $M$ has a Cartan subalgebra, i.e., a maximal abelian subalgebra $A$ so that $W^*(\{u \in M \text{ unitary} : uAu^* = A\}) = M$. Thus free group factors have no Cartan subalgebras.
-
(2)
[Voi96] $M$ has a diffuse regular hyperfinite subalgebra: a hyperfinite subalgebra $R$ so that $W^*(\{u \in M \text{ unitary} : uRu^* = R\}) = M$. This is the case, in particular, if $M = L(\Gamma)$ and $\Gamma$ has an infinite normal amenable subgroup. Thus free group factors do not have diffuse regular hyperfinite subalgebras.
-
(3)
[Voi96] $M$ has property $\Gamma$: there is a sequence of unitaries $u_k \in M$, so that $\tau(u_k) = 0$ but $\|[u_k, x]\|_2 \to 0$ for all $x \in M$. Free group factors are non-$\Gamma$ by a classical result of Murray and von Neumann.
-
(4)
[Ge98] $M = M_1 \otimes M_2$ with $M_1$ and $M_2$ infinite-dimensional. Thus free group factors are prime.
In particular, note that
for any
which can be embedded into the ultrapower of the hyperfinite II$_1$
factor (e.g.,
is already interesting).
There are other conditions assuring upper bounds on $\delta_0$; we mention the work of K. Dykema [Dyk97], M. Stefan [Ste99] and of Ge and Shen [GS00]. Upper estimates on $\delta_0$ turned out to be of relevance also to the theory of type III factors [Shl00, Shl03b].
3.4.4 Lower bounds on $\delta_0$.
K. Jung has proved the following “hyperfinite monotonicity result” [Jun03]: let $M$ be a diffuse von Neumann algebra, and assume that $M$ is embeddable in the ultrapower of the hyperfinite II$_1$ factor. Then $\delta_0(X_1,\dots,X_n) \ge 1$ for any generators $X_1,\dots,X_n$ of $M$.
Combined with the upper estimates, this shows that if $M$ satisfies any of the properties (1)–(4) above and is embeddable into the ultrapower of the hyperfinite II$_1$ factor, then the value of $\delta_0$ is $1$ on any set of generators. In particular, $\delta_0$ is an invariant of the entire von Neumann algebra!
Jung has also computed $\delta_0$ for arbitrary generators of a hyperfinite algebra [Jun03] (which is in general a direct sum of matrix algebras and a diffuse hyperfinite von Neumann algebra) and once again found that $\delta_0$ is an invariant of the von Neumann algebra in that case.
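3.5 Relation with $L^2$-Betti numbers.
By [CS], for any generators $X_1,\dots,X_n$ of a tracial algebra $(A,\tau)$ one has the inequality relating $\delta_0$ to the $L^2$-Betti numbers of $(A,\tau)$:
$$\delta_0(X_1,\dots,X_n) \le \beta_1^{(2)}(A,\tau) - \beta_0^{(2)}(A,\tau) + 1.$$
In particular, specializing to the case of the group algebra of a discrete group $\Gamma$, we have that
$$\delta_0(X_1,\dots,X_n) \le \beta_1^{(2)}(\Gamma) - \beta_0^{(2)}(\Gamma) + 1,$$
where $\beta_j^{(2)}(\Gamma)$ are the $L^2$-Betti numbers of the group.
The same combination of Betti numbers also occurs in Gaboriau’s work on cost of equivalence relations [Gab00, Gab02]; indeed he proves that
$$C(\Gamma) \ge \beta_1^{(2)}(\Gamma) - \beta_0^{(2)}(\Gamma) + 1,$$
where $C(\Gamma)$ is the cost of $\Gamma$. There are no known examples in which equality does not hold.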
It is curious that $C(\Gamma)$ measures the “optimal number of generators” for an equivalence relation induced by $\Gamma$; on the other hand, $\delta_0$ is known to be $\le 1$ in many cases in which the von Neumann algebra is “singly generated” [GP98].
One obstruction for the equality between $\delta_0$ and $\beta_1^{(2)}(\Gamma) - \beta_0^{(2)}(\Gamma) + 1$ is the fact that the latter quantity is insensitive to the outcome of Connes’ embedding question (if there is a non-embeddable group, one can manufacture a non-embeddable group with large Betti numbers by taking free products).
It is also possible to define a “relative” version of Voiculescu’s free entropy dimension for equivalence relations; one can obtain an invariant of an equivalence relation in this way (see [Shl01, Shl03a]).
4 Non-microstates Approach to Free Entropy.
We have reviewed the microstates definition of free entropy in the previous lecture. There are several difficulties connected with that definition. The first is that the involvement of sets of microstates makes the definition hard to work with technically; as we saw there are several properties of free entropy (such as additivity for free families) that one expects to hold, but which one is unable to prove because of such technical difficulties. Another example of such acute difficulties arises when one deals with free Fisher information. By analogy with the classical case, one wants to define the free Fisher information $\Phi(X_1,\dots,X_n)$ by the formula
$$\Phi(X_1,\dots,X_n) = 2\,\frac{d}{dt}\Big|_{t=0} \chi(X_1+\sqrt{t}S_1,\dots,X_n+\sqrt{t}S_n),$$
where $S_1,\dots,S_n$ are free semicircular variables, free from $X_1,\dots,X_n$. The definition works fine in the case that $n = 1$ (the explicit formula for $\chi(X)$ is essential), but it is not clear how to prove that the derivative exists and that the definition makes sense in the case $n \ge 2$.
The other point is that the definition of the microstates free entropy presupposes existence of microstates, i.e., embeddability into the ultrapower of the hyperfinite II$_1$ factor. A priori, it is not clear why one should assume this for elements of an arbitrary non-commutative tracial probability space (although of course if Connes’ embeddability question always has an affirmative answer, this second point disappears).
Voiculescu [Voi98a] gave a new definition of free entropy, based on an “infinitesimal” approach involving free Fisher information.
This new approach does not involve microstates and for this reason the resulting entropy bears the name “non-microstates” or “microstates-free”. It is not at present known if the two definitions (microstates and non-microstates) are the same, except in the one-variable case; and indeed, showing this would give a positive answer to Connes’ embeddability question.
Nonetheless, a recent work by Biane, Capitaine and Guionnet [BCG03] shows that the microstates free entropy is always bounded above by the non-microstates entropy.
To distinguish the two definitions, quantities related to the non-microstates entropy are denoted by the same letter as their microstates analogs, but with an asterisk; for example, the non-microstates free entropy is $\chi^*$, and the corresponding free entropy dimension is $\delta^*$.
4.1 A non-rigorous derivation of the non-microstates definition.
We begin with a (rigorous) consequence of the change of variables formula for microstates entropy. We shall assume that $X_1,\dots,X_n$ are in a non-commutative probability space $(M,\tau)$ with a tracial positive linear functional $\tau$.
4.1.1 Infinitesimal change of variables.
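Let $P_1,\dots,P_n$ be polynomials in $n$ indeterminates. Consider the change of variables
$$Y_j = X_j + \varepsilon P_j(X_1,\dots,X_n), \qquad j = 1,\dots,n.$$
Then for $\varepsilon$ sufficiently small, this change of variables can be inverted and $X_j$ can be expressed as a non-commutative power series in terms of $Y_1,\dots,Y_n$, so that the multi-radius of convergence of that power series exceeds the operator norms of $Y_1,\dots,Y_n$. Thus one can apply the change of variables formula and express $\chi(Y_1,\dots,Y_n)$ in terms of the free entropy $\chi(X_1,\dots,X_n)$ and the logarithm of the Jacobian of our transformation.
Expanding the value of the logarithm of the Jacobian as a power series in $\varepsilon$ gives us the infinitesimal change of variables formula [Voi97]:
$$\chi(X_1+\varepsilon P_1,\dots,X_n+\varepsilon P_n) = \chi(X_1,\dots,X_n) + \varepsilon \sum_{j=1}^n \tau\otimes\tau(\partial_j P_j) + O(\varepsilon^2).$$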
4.1.2 Conjugate variables.
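Let $M = W^*(X_1,\dots,X_n)$, and regard each $\partial_j$ as a densely defined operator from $L^2(M,\tau)$ to $L^2(M,\tau)\otimes L^2(M,\tau)$. Let us now assume that $1\otimes 1$ is in the domain of the adjoint $\partial_j^*$ for each $j$.
Let $\xi_j = \partial_j^*(1\otimes 1)$. The elements $\xi_1,\dots,\xi_n$ are called conjugate variables to $X_1,\dots,X_n$ and satisfy
$$\tau(\xi_j P) = \tau\otimes\tau(\partial_j P)$$
for any polynomial $P$ in $X_1,\dots,X_n$.
Then our infinitesimal change of variables formula becomes:
$$\chi(X_1+\varepsilon P_1,\dots,X_n+\varepsilon P_n) = \chi(X_1,\dots,X_n) + \varepsilon \sum_{j=1}^n \tau(\xi_j P_j) + O(\varepsilon^2).$$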
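It turns out that conjugate variables are intimately connected with free Brownian motion. If we let $X_j^t = X_j + \sqrt{t}S_j$, where $S_1,\dots,S_n$ are a free semicircular family, free from $X_1,\dots,X_n$, then for any polynomial $Q$ in $n$ indeterminates one can prove that
$$\tau\big(Q(X_1^t,\dots,X_n^t)\big) = \tau\big(Q(X_1+\tfrac{t}{2}\xi_1,\dots,X_n+\tfrac{t}{2}\xi_n)\big) + O(t^2).$$
Thus perturbations by conjugate variables give an “approximation” in law to free Brownian motion; note, however, that while $X_j+\sqrt{t}S_j$ no longer lies in $W^*(X_1,\dots,X_n)$, $X_j+\frac{t}{2}\xi_j$ does lie in $L^2(W^*(X_1,\dots,X_n),\tau)$.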
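Conjugate variables frequently exist. For example, if $X_1,\dots,X_n$ are a free semicircular family (normalized so that $\tau(X_j^2) = 1$), then $\xi_1,\dots,\xi_n$ exist and in fact $\xi_j = X_j$, $j = 1,\dots,n$. One can show that for any $X_1,\dots,X_n$ and any $t > 0$, conjugate variables to the family $X_1+\sqrt{t}S_1,\dots,X_n+\sqrt{t}S_n$ always exist. In fact, in this case
$$\xi_j = \frac{1}{\sqrt{t}}\, E_{W^*(X_1+\sqrt{t}S_1,\dots,X_n+\sqrt{t}S_n)}(S_j),$$
where $E$ denotes the trace-preserving conditional expectation onto the indicated von Neumann algebra.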
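The statement that a standard semicircular variable $X$ is its own conjugate variable can be checked on moments. The defining relation $\tau(\xi P) = \tau\otimes\tau(\partial P)$ with $\xi = X$ and $P = X^k$ reads $\tau(X^{k+1}) = \sum_{i=0}^{k-1} \tau(X^i)\,\tau(X^{k-1-i})$, which is the Catalan recurrence. The sketch below assumes the standard fact that the even moments of the variance-one semicircle law are the Catalan numbers; the function names are hypothetical.

```python
from math import comb

def semicircular_moment(p):
    """p-th moment of the semicircle law of variance 1.

    Odd moments vanish; the 2n-th moment is the n-th Catalan number.
    """
    if p % 2 == 1:
        return 0
    n = p // 2
    return comb(2 * n, n) // (n + 1)

def conjugate_relation_holds(k):
    """Check tau(X * X^k) == (tau x tau)(d X^k) for semicircular X.

    The right-hand side uses d(X^k) = sum_i X^i (x) X^(k-1-i).
    """
    left = semicircular_moment(k + 1)
    right = sum(
        semicircular_moment(i) * semicircular_moment(k - 1 - i)
        for i in range(k)
    )
    return left == right

all_relations_hold = all(conjugate_relation_holds(k) for k in range(12))
```

Since polynomials in $X$ are spanned by the monomials $X^k$, checking the relation on monomials is enough in the one-variable case.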
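4.1.3 Non-rigorous derivation of the formula for $\Phi^*$.
Assume now that $\xi_1,\dots,\xi_n$ are conjugate variables to $X_1,\dots,X_n$. Let $S_1,\dots,S_n$ be as before a free semicircular system, free from $X_1,\dots,X_n$.
Recall that we want to define the free Fisher information by
$$\Phi(X_1,\dots,X_n) = 2\,\frac{d}{dt}\Big|_{t=0} \chi(X_1+\sqrt{t}S_1,\dots,X_n+\sqrt{t}S_n).$$
Since $\chi$ depends only on the law of its arguments, and since the laws of $(X_j+\sqrt{t}S_j)_{j}$ and $(X_j+\frac{t}{2}\xi_j)_{j}$ are the same up to higher orders in $t$, one would expect that
$$\Phi(X_1,\dots,X_n) = 2\,\frac{d}{dt}\Big|_{t=0} \chi(X_1+\tfrac{t}{2}\xi_1,\dots,X_n+\tfrac{t}{2}\xi_n).$$
We now assume that $\xi_1,\dots,\xi_n$ are sufficiently nice functions of $X_1,\dots,X_n$ so that the infinitesimal change of variables applies.
Thus
$$\chi(X_1+\tfrac{t}{2}\xi_1,\dots,X_n+\tfrac{t}{2}\xi_n) = \chi(X_1,\dots,X_n) + \frac{t}{2}\sum_{j=1}^n \tau(\xi_j\xi_j) + O(t^2)$$
$$= \chi(X_1,\dots,X_n) + \frac{t}{2}\sum_{j=1}^n \|\xi_j\|_2^2 + O(t^2).$$
Summarizing, we then expect that
$$\Phi(X_1,\dots,X_n) = \sum_{j=1}^n \|\xi_j\|_2^2.$$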
4.1.4 Definition of $\Phi^*$.
This leads us to take the non-rigorous formula for $\Phi$ as a definition of the non-microstates free Fisher information:
Definition 4.1.
[Voi98a]
Let $X_1,\dots,X_n$ be a family of non-commutative random variables in $(M,\tau)$. If conjugate variables $\xi_1,\dots,\xi_n$ to this family exist, then we set
$$\Phi^*(X_1,\dots,X_n) = \sum_{j=1}^n \|\xi_j\|_2^2.$$
If the conjugate variables do not exist, we set $\Phi^*(X_1,\dots,X_n) = +\infty$.
Note that this definition does not involve microstates.
In the case of a single variable, $\xi$ ends up being the restriction of the Hilbert transform of the distribution of $X$ to the support of this distribution. One can then compute that if $\mu_X$ is Lebesgue absolutely-continuous, and $d\mu_X(t) = p(t)\,dt$, then
$$\Phi^*(X) = \frac{4\pi^2}{3}\int p(t)^3\,dt.$$
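As a sanity check on the normalization, one can evaluate the one-variable Fisher information of the standard semicircle law. The sketch below assumes the closed form $\Phi^*(X) = \frac{4\pi^2}{3}\int p(t)^3\,dt$ for a law with density $p$ (our reading of the one-variable formula) and uses a plain midpoint rule; since a standard semicircular variable is its own conjugate variable, the expected value is $\|X\|_2^2 = 1$. Function names are hypothetical.

```python
import numpy as np

def fisher_information_from_density(density, lower, upper, num_points=200_000):
    """Midpoint-rule approximation of (4 pi^2 / 3) * integral of density^3."""
    step = (upper - lower) / num_points
    midpoints = lower + (np.arange(num_points) + 0.5) * step
    return (4 * np.pi ** 2 / 3) * np.sum(density(midpoints) ** 3) * step

def semicircle_density(t):
    """Density of the semicircle law of variance 1, supported on [-2, 2]."""
    return np.sqrt(np.clip(4 - t ** 2, 0, None)) / (2 * np.pi)

phi_semicircle = fisher_information_from_density(semicircle_density, -2.0, 2.0)
```

The integral can also be done by hand: $\int_{-2}^{2}(4-t^2)^{3/2}\,dt = 6\pi$, so $\int p^3 = 3/(4\pi^2)$ and the prefactor cancels exactly.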
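4.1.5 Definition of $\chi^*$.
Since $\Phi^*$ was supposed to be proportional to the derivative of free entropy, one can recover free entropy from the free Fisher information. The formula is
$$\chi^*(X_1,\dots,X_n) = \frac{1}{2}\int_0^\infty \Big( \frac{n}{1+t} - \Phi^*(X_1^t,\dots,X_n^t) \Big)\,dt + \frac{n}{2}\log 2\pi e;$$
here as before $X_j^t = X_j+\sqrt{t}S_j$, and $S_1,\dots,S_n$ is a free semicircular family, free from $X_1,\dots,X_n$.
Voiculescu proved that the function $t \mapsto \Phi^*(X_1^t,\dots,X_n^t)$ is monotone decreasing and right semi-continuous in the sense that
$$\lim_{s\downarrow t} \Phi^*(X_1^s,\dots,X_n^s) = \Phi^*(X_1^t,\dots,X_n^t).$$
It is an important open question if this function is always continuous.
Furthermore, if $C^2 = \sum_j \tau(X_j^2)$, then
$$\frac{n^2}{C^2+nt} \le \Phi^*(X_1^t,\dots,X_n^t) \le \frac{n}{t},$$
which implies that the integral defining $\chi^*$ makes sense and converges to a value in $[-\infty,+\infty)$.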
4.2 Properties of $\chi^*$.
As we mentioned in the foreword to this section, the principal outstanding question in the theory of free entropy is the question of when $\chi = \chi^*$. To this end there are two results:
-
[Voi98a] In the single-variable case, the two quantities are equal: $\chi(X) = \chi^*(X)$;
-
[BCG03] In general, the following inequality is satisfied: $\chi(X_1,\dots,X_n) \le \chi^*(X_1,\dots,X_n)$.
The non-microstates definition turns out to be easier to work with in some respects, but harder in others. One of the big difficulties in the non-microstates framework is one’s inability to prove the change of variables formula. This difficulty is related to our inability to handle the continuity properties of the “non-commutative Hilbert transform”, $(X_1,\dots,X_n) \mapsto (\xi_1,\dots,\xi_n)$, where $\xi_1,\dots,\xi_n$ are the conjugate variables to $X_1,\dots,X_n$.
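Nonetheless, $\chi^*$ has a lot of nice properties, for example (all of these are from [Voi98a]):
-
$\chi^*(X_1,\dots,X_n,Y_1,\dots,Y_m) = \chi^*(X_1,\dots,X_n) + \chi^*(Y_1,\dots,Y_m)$ if the families $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are free;
-
$\chi^*(X_1,\dots,X_n,Y_1,\dots,Y_m) \le \chi^*(X_1,\dots,X_n) + \chi^*(Y_1,\dots,Y_m)$;
-
$\chi^*(X_1,\dots,X_n)$ is maximal subject to $\sum_j \tau(X_j^2) = n$ if and only if $X_1,\dots,X_n$ are free semicircular variables, and $\tau(X_j^2) = 1$.
-
If $S_1,\dots,S_n$ are free semicircular variables, free from the family $X_1,\dots,X_n$, then for any $\varepsilon > 0$, $\chi^*(X_1+\varepsilon S_1,\dots,X_n+\varepsilon S_n) > -\infty$.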
Comparing the last property of $\chi^*$ with the corresponding property of $\chi$ explains why $\chi = \chi^*$ would imply a positive answer to Connes’ embeddability question.
4.3 Non-microstates free entropy dimension.
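Although we don’t know how to formulate the packing number definition of free entropy dimension in the non-microstates approach, the Minkowski dimension definition does have a straightforward analog. We set
$$\delta^*(X_1,\dots,X_n) = n - \liminf_{t\to 0} \frac{\chi^*(X_1^t,\dots,X_n^t)}{\log t^{1/2}},$$
where as before $X_j^t = X_j+\sqrt{t}S_j$, and $S_1,\dots,S_n$ is a free semicircular family, free from the family $X_1,\dots,X_n$.
It is tempting to formally apply L’Hôpital’s rule in the definition of $\delta^*$ and use the fact that $\frac{d}{dt}\chi^*(X_1^t,\dots,X_n^t) = \frac{1}{2}\Phi^*(X_1^t,\dots,X_n^t)$. Thus we write
$$\delta^\star(X_1,\dots,X_n) = n - \liminf_{t\to 0}\, t\,\Phi^*(X_1^t,\dots,X_n^t).$$
One can easily show that $\delta^*(X_1,\dots,X_n) \le \delta^\star(X_1,\dots,X_n)$, with no known examples in which equality does not hold.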
There are unfortunately precious few computations of $\delta^*$ or $\delta^\star$, and much less is known about their properties than about the properties of $\delta_0$. In particular, it is not known in general if $\delta^*$ or $\delta^\star$ depend only on the algebra generated by $X_1,\dots,X_n$, taken with its trace.
We summarize what is known below:
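-
$\delta^*(X) = \delta^\star(X) = 1 - \sum_{t\in\mathbb{R}} \mu(\{t\})^2$, where $\mu$ is the law of $X$;
-
$\delta^*(X_1,\dots,X_n,Y_1,\dots,Y_m) = \delta^*(X_1,\dots,X_n) + \delta^*(Y_1,\dots,Y_m)$ if $X_1,\dots,X_n$ are free from $Y_1,\dots,Y_m$; the same is true for $\delta^\star$;
-
[CS] If $X_1,\dots,X_n$ are generators of a tracial algebra $(A,\tau)$, then
$$\delta^*(X_1,\dots,X_n) \le \delta^\star(X_1,\dots,X_n) \le \beta_1^{(2)}(A,\tau) - \beta_0^{(2)}(A,\tau) + 1,$$
where $\beta_j^{(2)}(A,\tau)$ are the $L^2$-Betti numbers of $(A,\tau)$.
-
[MS] If $X_1,\dots,X_n$ are self-adjoint and generate the group algebra of a discrete group $\Gamma$, then equality holds: $\delta^*(X_1,\dots,X_n) = \delta^\star(X_1,\dots,X_n) = \beta_1^{(2)}(\Gamma) - \beta_0^{(2)}(\Gamma) + 1$, where $\beta_j^{(2)}(\Gamma)$ are the $L^2$-Betti numbers of $\Gamma$. In particular, in this case $\delta^*$ and $\delta^\star$ are algebraic invariants;
-
[CS] If $W^*(X_1,\dots,X_n)$ has diffuse center, then $\delta^\star(X_1,\dots,X_n) \le 1$.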
References
-
P. Biane, M. Capitaine, and A. Guionnet, Large deviation bounds for matrix Brownian motion, Invent. Math. 152 (2003), no. 2, 433–459.
-
A. Connes, Classification of injective factors. Cases II$_1$, II$_\infty$, III$_\lambda$, $\lambda \neq 1$, Ann. of Math. (2) 104 (1976), no. 1, 73–115.
-
A. Connes and D. Shlyakhtenko, $L^2$-homology for von Neumann algebras, Preprint math.OA/0309343, to appear in J. Reine Angew. Math.
-
K. Dykema and F. Radulescu, Compressions of free products of von Neumann algebras, Math. Ann. 316 (2000), no. 1, 61–82.
-
K. Dykema, Free products of hyperfinite von Neumann algebras and free dimension, Duke Math J. 69 (1993), 97–119.
-
K. Dykema, On certain free product factors via an extended matrix model, J. Funct. Anal 112 (1993), 31–60.
-
K. Dykema, Interpolated free group factors, Pacific J. Math. 163 (1994), 123–135.
-
K. Dykema, Amalgamated free products of multi-matrix algebras and a construction of subfactors of a free group factor, Amer. J. Math. 117 (1995), no. 6, 1555–1602.
-
Kenneth J. Dykema, Two applications of free entropy, Math. Ann. 308 (1997), no. 3, 547–558.
-
D. Gaboriau, Coût des relations d’équivalence et des groupes, Invent. Math. 139 (2000), no. 1, 41–98.
-
D. Gaboriau, Invariants $\ell^2$ de relations d’équivalence et de groupes, Publ. Math. Inst. Hautes Études Sci. 95 (2002), 93–150.
-
L. Ge, Applications of free entropy to finite von Neumann algebras. II, Ann. of Math. (2) 147 (1998), no. 1, 143–157.
-
Liming Ge and Sorin Popa, On some decomposition properties for factors of type II$_1$, Duke Math. J. 94 (1998), no. 1, 79–101.
-
L. Ge and J. Shen, Free entropy and property $T$ factors, PNAS 97 (2000), 9881–9885.
-
Uffe Haagerup, On Voiculescu’s $R$- and $S$-transforms for free non-commuting random variables, Free probability theory (Waterloo, ON, 1995), Fields Inst. Commun., vol. 12, Amer. Math. Soc., Providence, RI, 1997, pp. 127–148.
-
K. Jung, A free entropy dimension lemma, Preprint math.OA/0207149, 2002.
-
Kenley Jung, The free entropy dimension of hyperfinite von Neumann algebras, Trans. Amer. Math. Soc. 355 (2003), no. 12, 5053–5089 (electronic).
-
I. Mineyev and D. Shlyakhtenko, Non-microstates free entropy dimension for groups, Preprint, math.OA/0312242, to appear in GAFA.
-
S. Popa and D. Shlyakhtenko, Universal properties of $L(\mathbb{F}_\infty)$ in subfactor theory, MSRI preprint 2000-032, to appear in Acta Math., 2003.
-
F. Radulescu, A one parameter group of automorphisms of $L(\mathbb{F}_\infty)\otimes B(H)$ scaling the trace, C.R. Acad. Sci. Paris 314 (1992), no. 1, 1027–1032.
-
F. Radulescu, Random matrices, amalgamated free products and subfactors of the von Neumann algebra of a free group, of noninteger index, Invent. math. 115 (1994), 347–389.
-
D. Shlyakhtenko, Some applications of freeness with amalgamation, J. reine angew. Math. 500 (1998), 191–212.
-
D. Shlyakhtenko, $A$-valued semicircular systems, J. Func. Anal 166 (1999), 1–47.
-
D. Shlyakhtenko, Prime type III factors, Proc. Natl. Acad. Sci. USA 97 (2000), 12439–12441.
-
D. Shlyakhtenko, Free Fisher information with respect to a completely positive map and cost of equivalence relations, Comm. Math. Phys. 218 (2001), no. 1, 133–152.
-
D. Shlyakhtenko, Microstates free entropy and cost of equivalence relations, Duke Math. J. 118 (2003), 375–425.
-
D. Shlyakhtenko, On the classification of full factors of type III, Preprint math.OA/0201007, to appear in Trans. AMS, 2003.
-
R. Speicher, Combinatorial theory of the free product with amalgamation and operator-valued free probability theory, Mem. Amer. Math. Soc. 132 (1998), x+88.
-
M. Stefan, Indecomposability of free group factors over nonprime subfactors and abelian subalgebras, Preprint, 1999.
-
D. Shlyakhtenko and Y. Ueda, Irreducible subfactors of $L(\mathbb{F}_\infty)$ of index $\lambda > 4$, J. reine angew. Math 548 (2002), 149–166.
-
D.-V. Voiculescu, K. Dykema, and A. Nica, Free random variables, CRM monograph series, vol. 1, American Mathematical Society, 1992.
-
D.-V. Voiculescu, Symmetries of some reduced free product $C^*$-algebras, Operator Algebras and Their Connections with Topology and Ergodic Theory, Lecture Notes in Mathematics, vol. 1132, Springer Verlag, 1985, pp. 556–588.
-
D.-V. Voiculescu, Circular and semicircular systems and free product factors, Operator Algebras, Unitary Representations, Enveloping Algebras, and Invariant Theory, Progress in Mathematics, vol. 92, Birkhauser, Boston, 1990, pp. 45–60.
-
D.-V. Voiculescu, Limit laws for random matrices and free products, Invent. math 104 (1991), 201–220.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory I, Commun. Math. Phys. 155 (1993), 71–92.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory II, Invent. Math. 118 (1994), 411–440.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory, III, Geometric and Functional Analysis 6 (1996), 172–199.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory, IV: Maximum entropy and freeness, Free Probability (D.-V. Voiculescu, ed.), American Mathematical Society, 1997, pp. 293–302.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability, V, Invent. Math. 132 (1998), 189–227.
-
D.-V. Voiculescu, A strengthened asymptotic freeness result for random matrices with applications to free entropy, IMRN 1 (1998), 41 – 64.
-
D.-V. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability, VI, Adv. Math. 146 (1999), no. 2, 101–166.
-
D.-V. Voiculescu, Free entropy dimension $\le 1$ for some generators of property $T$ factors of type II$_1$, J. reine Angew. Math. 514 (1999), 113–118.
-
Dan Voiculescu, Lectures on free probability theory, Lectures on probability theory and statistics (Saint-Flour, 1998), Lecture Notes in Math., vol. 1738, Springer, Berlin, 2000, pp. 279–349.
-
D.-V. Voiculescu, Free entropy, Bull. London Math. Soc. 34 (2002), no. 3, 257–278.
-
E.P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Annals of Math. 62 (1955), 548–564.