Version of 3/9/05
y Jason Fulman (fulman@math.pitt.edu)
niversity of Pittsburgh Math Department, 414 Thackeray Hall Pittsburgh, PA 15260
An Inductive Proof of the Berry-Esseen Theorem for Character Ratios
B
U Abstract: Bolthausen used a variation of Stein's method to give an inductive proof of the Berry-Esseen theorem for sums of independent, identically distributed random variables. We modify this technique to prove a Berry-Esseen theorem for character ratios of a random representation of the symmetric group on transpositions. An analogous result is proved for Jack measure on partitions.
2000 Mathematics Subject Classification: 05E10, 60C05.
Key words and phrases: character ratio, Berry-Esseen theorem, Stein's method, Plancherel measure, Jack polynomial.
1 Introduction
The Plancherel measure of a finite group
is a probability measure on the set of irreducible representations of
which chooses a representation
with probability
, where
denotes the dimension of
. For instance if
is the symmetric group, the irreducible representations are parameterized by partitions
of
, and the Plancherel measure chooses a partition
with probability
where the product is over boxes in the partition and
is the hooklength of a box. The hooklength of a box
is defined as 1 + number of boxes in same row as x and to the right of x + number of boxes in same column of x and below x. For example we have filled in each box in the partition of 7 below with its hooklength
and the Plancherel measure would choose this partition with probability
. Recently there has been interest in the statistical properties of partitions chosen from Plancherel measure and we refer the reader to the surveys [AlD] , [De] and the seminal papers [J] , [O1] , [BOO] for a glimpse of the remarkable recent work on Plancherel measure. We recommend [Sa] as an introduction to representation theory of the symmetric group.
Let
be a partition of
chosen from the Plancherel measure of the symmetric group
and let
be the irreducible character parameterized by
evaluated on the transposition
. The quantity
is called a character ratio and is crucial for analyzing the convergence rate of the random walk on the symmetric group generated by transpositions [DSh] . In fact Diaconis and Shahshahani prove that the eigenvalues for this random walk are the character ratios
each occurring with multiplicity
.
Character ratios on transpositions also play an essential role in work on the moduli space of curves [EO] , [OP] .
Given these motivations, it is natural to study the distribution of the character ratio
and there has been a substantial amount of work in this direction, which we now summarize. Kerov [K1] proved that if
is chosen from the Plancherel measure of the symmetric group, then for all real
,
The details of Kerov's argument appeared in [IO] , which gave a beautiful development of Kerov's work. Hora [Ho] gave another proof of Kerov's result, exploiting the fact that the kth moment of a Plancherel distributed character ratio is equal to the chance that the random walk generated by random transpositions is at the identity after k steps. Both of these proofs were essentially combinatorial in nature and used the method of moments (and so information about all moments of the character ratio). Recent work of Sniady [Sn1] , [Sn2] understands these moments in terms of the genus expansion from random matrix theory.
A more probabilistic approach to Kerov's result appeared in [F1] , which proved that for all
and real
,
The proof used Stein's method (which is fundamentally different from the method of moments as it only uses information about a few lower order moments) and random walk on the set of irreducible representations of the symmetric group. Note that unlike Kerov's original result, this result includes an error term. The paper [F3] used martingale theory to sharpen the error term to
for any
where
is a constant depending on
. The paper [ShSu] developed a refinement of Stein's method which led to a proof of the conjecture of [F1] that an error term of
holds where
is a universal constant. A second proof of this conjecture appears in [CF] , using a different refinement of Stein's method. Both of these proofs used the “exchangeable pairs” approach to Stein's method.
The purpose of the present paper is to use a completely different technique to prove a bound of
. The method is based on Bolthausen's [Bol] ingenious inductive proof of the Berry-Esseen theorem for sums of independent identically distributed random variables. As in [F3] , we write the character ratio as a sum of martingale differences, but these are neither independent nor identically distributed so some subtle combinatorics is required to adapt Bolthausen's method. This is not the first example of adapting Bolthausen's method to the non i.i.d. case; Bolthausen [Bol] used the approach to study the distribution of
where
is a fixed
matrix and
is a random permutation on
symbols. But the case of character ratios is of considerable interest and quite unlike any other example to which his technique has been applied.
Note that a central limit theorem is known for some other conjugacy classes of the symmetric group [K1] , [IO] , [Ho] , but with no information about the error term (see the end of Section 2 for a conjecture). The technique of this paper does not obviously extend, however, since in Section 2 when we write
as a sum of martingale differences, we require the fact that the expected value of the square of a summand given the previous summands is constant. This is false for general conjugacy classes. Also it is a nontrivial combinatorial problem to give upper bounds on the expected absolute value of the cubes of the summands. Fortunately for the case of transpositions this can be done without much difficulty. And the case of transpositions does seem to have unique practical importance [EO] , [OP] .
The contents of this paper are as follows. Section 2 develops the combinatorics needed to adapt Bolthausen's method to the case of character ratios, and then proves an upper bound of
. Section 3 then recalls the Jack
measure on partitions (here
is a parameter) and why it is interesting.
It then briefly indicates the modifications to the Plancherel case needed to prove a central limit theorem with an error term of
, where
is a constant depending on
. This organization is natural since many algebraically inclined readers will want to understand the result for character ratios without needing combinatorics of Jack polynomials; thus a key lemma is given an algebraic proof in Section 2 and a combinatorial proof in Section 3 .
2 Central limit theorem for Plancherel measure
The random variable we wish to study is
where
is chosen from the Plancherel measure of the symmetric group
. To begin we write
as a sum of other random variables. For this we need Kerov's growth process on partitions [K2] ; this has a natural generalization to arbitrary finite groups [F3] , but we only recall it in the case of interest. Given a partition
of size
, one obtains a partition
of size
by choosing
with probability
if
can be obtained from
by adding a single box, and with probability 0 otherwise. Thus starting from
, the unique partition of size
, one obtains a random sequence
of partitions. Kerov [K2] proves that each
is distributed according to the Plancherel measure of
.
Given Kerov's growth process, one can write
where
,
is defined as 0, and
for
.
Lemma 2.1 states that the
are martingale differences satisfying special properties. We remark that [F3] extends this lemma to more general conjugacy classes and groups. The notation
means the expected value of
given
.
Lemma 2.1.
([
F3]
)
-
(1)
for
and all partitions
.
-
(2)
for all
.
-
(3)
.
-
(4)
.
Frobenius [Fr] found the following explicit formula for the character ratio of the symmetric group on transpositions:
where
is the length of row
of
and
is the length of column
of
. From his formula it follows that
where
is the box added to
to obtain
and the “content”
of a box is defined as column number of box row number of box.
Lemma 2.2 gives the conditional second and fourth moments of the
's.
We emphasize that these were not derived or even stated in terms of character ratios, but rather were proved in a completely combinatorial way by studying the behavior of the moments of
where
is the box added during Kerov's growth process. We remark that for other conjugacy classes, there is not an analog of the fact that
is independent of
.
Lemma 2.2.
Let
be a partition of size
.
-
(1)
([K3] )
.
-
(2)
([La] )
.
Lemma 2.3 is a useful identity. Although a combinatorial proof can be given using properties of Schur functions, we defer combinatorial arguments to the more general setting of Jack polynomials in Section 3 and give an algebraic proof.
Lemma 2.3.
Let
be the rth elementary symmetric function of
. For
a partition of
, let
denote the rth elementary symmetric function of the contents of the boxes of
. Then
for
.
-
Proof.
If
the result is clear since the box in the first row and column of
has content 0, so that
for all
.
For
, we use the theory of Murphy elements [Mu] ; a friendly reference giving background on these elements is [DG] . For
, the ith Murphy element is defined as the sum of transpositions
.
Let
be the element of the group algebra of
which is the sum of all permutations with
cycles. By Proposition 2.1 of [DG] ,
is the rth elementary symmetric function of the elements
.
Since the elements
are simultaneously diagonalizable in every irreducible representation of the symmetric group, it follows from Murphy's determination of their eigenvalues that in the representation of
parameterized by
,
is a scalar multiple of the
identity matrix with scalar equal to
. In the regular representation of
the irreducible representation parameterized by
occurs with multiplicity
. Hence the trace of
in the regular representation is
. But the coefficient of the identity in
is 0, so the trace of
in the regular representation is 0, implying the result. □
Lemma 2.4 gives upper bounds for
and for
.
Lemma 2.4.
Suppose that
.
-
(1)
.
-
(2)
.
-
Proof.
By the Cauchy-Schwarz inequality,
. By Lemma 2.2 ,
and
By Lemma 2.3 with
and then part 4 of Lemma 2.1 ,
| |
| |
This proves the first assertion.
For the second assertion, note (using part 4 of Lemma 2.1 in the final equality) that
| |
| |
| |
| |
The conditional version of the Cauchy-Schwarz inequality and part 1 of Lemma 2.2 give that
is at most
Thus
and the proof of the first assertion showed this to equal
, as desired. □
Now we adapt Bolthausen's [Bol] inductive proof of the Berry-Esseen theorem for i.i.d. random variables to the setting of character ratios. We remark that the unpublished notes of Mann [Man] are a useful exposition of Bolthausen's proof and we refer to them in the proof of Theorem 2.5 .
Theorem 2.5.
Let
be chosen from the Plancherel measure on partitions of size
. Then for all
and real
,
-
Proof.
The theorem is visibly true for
, so throughout we suppose that
.
For
real, let
be the indicator function of the set
.
For
real and
, let
be the function which is 1 for
and then drops linearly to the value 0 at
and is 0 for
. Let
where
is the expected value of a function f under the normal distribution.
Note that our ultimate goal is to upper bound
.
As in Stein's method [Stn] , let
Then
, so that
Part 2 of Lemma 2.1 with
implies that
so that
By part 1 of Lemma 2.1 and part 1 of Lemma 2.2 , this is equal to
| |
| |
| |
| |
Next we upper bound
. Recall from [Bol] or [Man] that for any
and
,
Thus
where
-
.
-
.
-
.
By part 3 of Lemma 2.1 ,
; thus
. By parts 3 and 4 of Lemma 2.1 ,
Thus
. To bound
, we use the fact (explained in [Man] ) that
for any interval
and constants
with
. This, together with the fact (used in bounding
) that
, implies that
Summarizing, we conclude that
Next, we upper bound
Arguing as in the previous paragraph this is at most
where
-
.
-
.
-
.
To bound
, use part 1 of Lemma 2.4 to conclude that
. To bound
, use part 2 of Lemma 2.4 to conclude that
. To bound
, note that by the reasoning used in bounding
,
Thus
. By part 1 of Lemma 2.4 and since
, this implies that
. To summarize,
.
The previous two paragraphs imply that
From [Bol] or [Man] ,
for all
, which implies that
Choosing
gives that
. The result now follows by induction since
for
. □
To conclude this section, we note that it would be of interest to prove the following (more general) conjecture.
Conjecture: Let
be fixed. Then for all
and real
,
where
is a constant depending on
.
3 Central limit theorem for Jack measure
For
the Jack
measure on partitions of size
chooses a partition
with probability
where the product is over all boxes in the partition. Here
denotes the number of boxes in the same row of
and to the right of
(the “arm” of x) and
denotes the number of boxes in the same column of
and below
(the “leg” of x). For example the partition of 5 below
would have Jack
measure
Note that when
, Jack measure reduces to Plancherel measure of the symmetric group. The papers [O2] , [BO] emphasize that for
fixed the study of Jack
measure is an important open problem, about which relatively little is known for general values of
. It is a discrete analog of eigenvalue ensembles from random matrix theory and like Jack polynomials [GHJ] , should also be relevant to the moduli space of curves.
Given
, the quantity to be studied is
where as usual
is the length of the ith row of
and
is the length of the ith column of
. It is of interest to study the quantity
under Jack measure for several reasons. When
it reduces to the study of the character ratio of transpositions under Plancherel measure. When
it is a spherical function of the Gelfand pair
where
is the hyperoctahedral group of size
. Also by Corollary 1 of [DHol] , there is a natural random walk on perfect matchings of the complete graph on
vertices, whose eigenvalues are precisely
, occurring with multiplicity proportional to the Jack
measure of
.
The paper [F2] used the “exchangeable pairs” version of Stein's method to prove a central limit theorem for
with error term
where
is a constant depending on
. This was sharpened in [F3] using martingales to
for any
, and the paper [CF] gives a bound
using a refinement of the “exchangeable pairs” version of Stein's method.
The main result of this section is Theorem 3.1 .
Theorem 3.1.
Suppose that
and let
be chosen from the Jack
measure on partitions of size
. Then there is a constant
depending on
so that for all
and real
,
Note that in Theorem 3.1 we suppose that
since the Jack
probability of
is equal to the Jack
probability of the transpose of
, implying that for any
, the Jack
probability that
is equal to the Jack
probability that
. Also note that
must depend on
, since by Corollary 5.3 of [F2] , the random variable
has mean 0, variance 1, and third moment
.
There is no need to write out a proof of Theorem 3.1 , which uses exactly the same logic as that of Theorem 2.5 . But it is necessary to give analogs of Lemmas 2.1 , 2.2 , 2.3 , and 2.4 , and we do that.
There is an
-analog of Kerov's growth process (due to Kerov [K4] ) giving a sequence of partitions
with
distributed according to the Jack
measure on partitions of size
; see [F3] for details. Moreover from the definition of
, it follows that
Here
and if
then
where
is the box added to
to obtain
and the “
-content”
of a box
is defined to be
(column number of x-1) (row number of x-1).
Lemma 3.2 is an analog of Lemma 2.1 and is generalized in [F3] to arbitrary spherical functions of the Gelfand pair
.
Lemma 3.2.
([
F3]
)
-
(1)
for
and all partitions
.
-
(2)
for all
.
-
(3)
.
-
(4)
.
Lemma 3.3 is the
version of Lemma 2.2 .
Lemma 3.3.
Let
be a partition of size
.
-
(1)
([K4] )
.
-
(2)
([La] )
| |
| |
Lemma 3.4 is the
version of Lemma 2.3 . The proof is combinatorial, as opposed to the algebraic argument given for Lemma 2.3 .
Lemma 3.4.
Consider the Jack
measure on partitions of size
.
-
(1)
If
is an integer then
-
(2)
Let
denote the
th elementary symmetric function of the
-contents of the boxes of
. Then
for
.
-
Proof.
It suffices to prove the first assertion since the second assertion follows from the first by taking the coefficient of
on both sides. Page 324 of [Mac] proves the identity
where the sum is over all
of all sizes,
denotes a Macdonald symmetric function,
denotes the nth power sum symmetric function, and
is a number to be discussed more below. We apply the homomorphism of the ring of symmetric functions determined by
,
for all
where
are positive integers; this is possible since the
's are algebraically independent. Then we take the limit
in which Macdonald polynomials become Jack polynomials.
With these substitutions, consider the left hand side of the identity. By pages 380 and 381 of [Mac] ,
Letting
, one sees that the coefficient of
in the left-hand side of the identity is
.
Consider the right hand side of the identity with these substitutions. One obtains
. Letting
, one obtains
, and taking the coefficient of
gives
. Comparing with the previous paragraph proves the first assertion of the lemma. □
Finally, we give the analog of Lemma 2.4 .
Lemma 3.5.
Suppose that
. There is a constant
such that
-
(1)
.
-
(2)
.
-
Proof.
The proof method is the same as that of Lemma 2.4 , using the Cauchy-Schwarz inequality in the first part and the conditional Cauchy-Schwarz inequality in the second part. One uses that
for all
(part 1 of Lemma 3.3 ). Also one needs that
| |
| |
The first equality used part 2 of Lemma 3.3 and the fact that
.
The second equality used part 4 of Lemma 3.2 and Lemma 3.4 with
. □
4 Acknowledgements
The author was partially supported by NSA grant number H98230-05-1-0031. References
-
Aldous, D. and Diaconis, P., Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem, Bull. AMS (N.S.) 36 (1999), 413-432.
-
Bolthausen, E., An estimate of the remainder term in a combinatorial central limit theorem, Z. Wahrsch. Verw. Gebiete 66 (1984), 379-386.
-
Borodin, A., Okounkov, A., and Olshanski, G., Asymptotics of Plancherel measures for symmetric groups, J. Amer. Math. Soc. 13 (2000), 481-515.
-
Borodin, A. and Olshanski, G., Z-measures on partitions and their scaling limits, preprint math-ph/0210048 at http://xxx.lanl.gov.
-
Chatterjee, S. and Fulman, J., Stein's method for chi-squared approximation and spectral measure of Gelfand pairs, in preparation.
-
Deift, P., Integrable systems and combinatorial theory, Notices Amer. Math. Soc. 47 (2000), 631-640.
-
Diaconis, P. and Greene, C., Applications of Murphy's elements, Stanford University technical report no. 335 (1989).
-
Diaconis, P. and Holmes, S., Random walk on trees and matchings, Elec. J. Probab. 7 (2002), 17 pages (electronic).
-
Diaconis, P. and Shahshahani, M., Generating a random permutation with random transpositions, Z. Wahr. Verw. Gebiete 57 (1981), 159-179.
-
Eskin, A. and Okounkov, A., Asymptotics of branched coverings of a torus and volumes of moduli spaces of holomorphic differentials, Invent. Math. 145 (2001), 59-103.
-
Frobenuis, F., Uber die charaktere der symmetrischen gruppe, Sitz. Konig. Preuss. Akad. Wissen. (1900), 516-534; Gesammelte abhandlungen III, Springer-Verlag, Heidelberg, 1968, 148-166.
-
Fulman, J., Stein's method and Plancherel measure of the symmetric group, Transac. Amer. Math. Soc. 357 (2005), 555-570.
-
Fulman, J., Stein's method, Jack measure, and the Metropolis algorithm, J. Combin. Theory Ser. A 108 (2004), 275-296.
-
Fulman, J., Martingales and character ratios, to appear in Transac. Amer. Math. Soc., available at http://www.math.pitt.edu/
fulman.
-
Goulden, I., Harer, J., and Jackson, D., A geometric parametrization for the virtual Euler characteristic of the moduli spaces of real and complex algebraic curves, Trans. Amer. Math. Soc. 353 (2001), 4405-4427.
-
Hora, A., Central limit theorem for the adjacency operators on the infinite symmetric group, Comm. Math. Phys. 195 (1998), 405-416.
-
Ivanov, V. and Olshanski, G., Kerov's central limit theorem for the Plancherel measure on Young diagrams, in Symmetric Functions 2001: Surveys of developments and perspectives, Kluwer Academic Publishers, Dodrecht, 2002.
-
Johansson, K., Discrete orthogonal polynomial ensembles and the Plancherel measure, Ann. of Math. (2) 153 (2001), 259-296.
-
Kerov. S.V., Gaussian limit for the Plancherel measure of the symmetric group, Compt. Rend. Acad. Sci. Paris, Serie I, 316 (1993), 303-308.
-
Kerov, S.V., The boundary of Young lattice and random Young tableaux, in Formal power series and algebraic combinatorics, DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 24, Amer. Math. Soc., Providence, RI, (1996), 133-158.
-
Kerov, S.V., Transition probabilities of continual Young diagrams and the Markov moment problem, Funct. Anal. Appl. 27 (1993), 104-117.
-
Kerov, S.V., Anisotropic Young diagrams and Jack symmetric functions, Funct. Anal. Appl. 34 (2000), 41-51.
-
Lassalle, M., Jack polynomials and some identities for partitions, Trans. Amer. Math. Soc. 356 (2004), 3455-3476.
-
Macdonald, I., Symmetric functions and Hall polynomials, Second edition, Oxford University Press, New York, 1995.
-
Mann, B., Bolthausen's proof of Berry-Esseen, unpublished manuscript (1994).
-
Murphy, G.E., A new construction of Young's seminormal representation of the symmetric group, J. Algebra 69 (1981), 287-291.
-
Okounkov, A., Random matrices and random permutations, Internat. Math. Res. Notices 20 (2000), 1043-1095.
-
Okounkov, A., The uses of random partitions, preprint math-ph/0309015 at http://xxx.lanl.gov.
-
Okounkov, A. and Pandaripandhe, R., Gromov-Witten theory, Hurwitz numbers, and Matrix models, I, preprint math.AG/0101147 at http://xxx.lanl.gov.
-
Sagan, B., The symmetric group. Representations, combinatorial algorithms, and symmetric functions, Springer-Verlag, New York, 1991.
-
Shao, Q., and Su, Z., The Berry-Esseen bound for character ratios, preprint (2004).
-
Sniady, P., Asymptotics of characters of symmetric groups, Gaussian fluctuations of Young diagrams and genus expansion, preprint math.CO/0411647 at xxx.lanl.gov.
-
Sniady, P., Gaussian fluctuations of characters of symmetric groups and of Young diagrams, preprint math.CO/0501112 at xxx.lanl.gov.
-
Stein, C., Approximate computation of expectations, Institute of Mathematical Statistics Lecture Notes, Volume 7, 1986.
U