Global Regularity for the Yang–Mills Equations on High Dimensional Minkowski Space

Joachim Krieger

Jacob Sterbenz

1 Introduction

In this work we investigate the global in time regularity properties of the Yang-Mills equations on high dimensional Minkowski space with compact semi-simple gauge group $G$. Specifically, we show that if a certain gauge covariant Sobolev norm is small, the so called critical regularity $\dot{H}_A^{\frac{n-4}{2}}$, and the dimension satisfies $6 \leqslant n$, then a global solution exists and remains regular for all times given that the initial data is regular. This is in the same spirit as the recent result [8] for the Maxwell-Klein-Gordon system, as well as earlier results for high dimensional wave-maps (see [11], [6], [9], and [7]). Our approach shares many similarities with those works, whose underlying philosophy is basically the same. That is, to introduce Coulomb type gauges in order to treat a specific potential term as a quadratic error. In our setup, we use a non-abelian variant of the remarkable parametrix construction contained in [8], in conjunction with a version of the Uhlenbeck lemma [13] on the existence of global Coulomb gauges. This latter result has been used for high dimensional wave-maps to globally “renormalize” the equation so that the existence theory can be treated directly through Strichartz estimates applied to multi-linear expressions.
In the present situation, as was the case with the Maxwell-Klein-Gordon system, the corresponding renormalization procedure is necessarily more involved because it needs to be done separately for each distinct direction in phase space. That is, we provide a renormalization of the Yang-Mills equations through the construction of a Fourier integral operator with $G$-valued phase. The construction and estimation of such an object relies heavily on elliptic-Coulomb theory, primarily due to the difficulty one faces in that the $G$-valued phase function cannot be localized within a neighborhood of any given point on the group due to the critical nature of the problem (if you like, there is a logarithmic “twisting” of the group element as one moves around in physical space; fortunately the group is compact so this doesn't ruin things).
To get things started, we now give a simple gauge covariant description of the equations we are considering. The (hyperbolic) Yang-Mills equations arise as the evolution equations for a connection on the bundle $V = \mathcal{M}^n \times \mathfrak{g}$, where $\mathcal{M}^n$ is some $n$ (spatial) dimensional Minkowski space, with metric $g := (-1, 1, \ldots, 1)$ in inertial coordinates $(x^0, x^i)$, and $\mathfrak{g}$ is the Lie algebra of some compact semi-simple Lie group $G$. Here we are considering $V$ with the $Ad(G)$ gauge structure. If $\phi$ is any section to $V$ over $\mathcal{M}^n$, then a connection assigns to every vector-field $X$ on the base $\mathcal{M}^n$ a derivative which we denote as $D_X$, such that the following Leibniz rule is satisfied for every scalar field $f$: $D_X(f\phi) = X(f)\phi + f D_X\phi$. In this setup, we assume that $V$ is equipped with an $Ad(G)$ invariant metric $\langle\cdot,\cdot\rangle$ which respects the action of $D$. That is, one has the formula:
$$ d\langle \phi, \psi\rangle = \langle D\phi, \psi\rangle + \langle \phi, D\psi\rangle . \tag{1} $$
In the present situation we will take $\langle\cdot,\cdot\rangle$ to be the Killing form on $\mathfrak{g}$. The curvature associated to $D$ is the $\mathfrak{g}$ valued two-form $F$ which arises from the commutation of covariant derivatives and is defined via the formula: $D_X D_Y\phi - D_Y D_X\phi - D_{[X,Y]}\phi = [F(X,Y), \phi]$. We say that the connection $D$ satisfies the Yang-Mills equations if its curvature is a (formal) local minimum of the following Maxwell type functional:
$$ \mathcal{L}[F] = -\frac{1}{4}\int_{\mathcal{M}^n} \langle F_{\alpha\beta}, F^{\alpha\beta}\rangle \, DV_{\mathcal{M}^n} . \tag{2} $$
The Euler-Lagrange equations of (2) read:
$$ D^\beta F_{\alpha\beta} = 0 . \tag{3} $$
Also, from the fact that $F$ arises as the curvature of some connection, we have that the following identity known as “Bianchi” is satisfied:
$$ D_{[\alpha} F_{\beta\gamma]} = 0 . \tag{4} $$
From now on we will refer to the system (3)–(4) as the first order Yang–Mills equations (FYM).
As we have already mentioned, our aim is to study the regularity properties of the Cauchy problem for the (FYM) system. To describe this in a geometrically invariant way, we make use of the following splitting of the connection-curvature pair $(F, D)$: Foliating $\mathcal{M}^n$ into the standard Cauchy hypersurfaces $t = const.$, we decompose: $(F, D) = (\underline{F}, \underline{D}) \oplus (E, D_0)$, where $(\underline{F}, \underline{D})$ denotes the portion of $(F, D)$ which is tangent to the surfaces $t = const.$ (i.e. the induced connection), and $(E, D_0)$ denotes respectively the interior product of $F$ with the foliation generator $T = \partial_t$, and the normal portion of $D$. In inertial coordinates we have: $E_i = F_{0i}$. On the initial Cauchy hypersurface $t = 0$ we call a set $(\underline{F}(0), \underline{D}(0), E(0))$ admissible Cauchy data¹ if it satisfies the following compatibility condition:
$$ \underline{D}^i E_i(0) = 0 . \tag{5} $$
We define the Cauchy problem for the Yang-Mills equation to be the task of constructing a connection $(F, D)$ which solves (3), and has Cauchy data equal to $(\underline{F}(0), \underline{D}(0), E(0))$.
In order to understand what the appropriate condition on the initial data should be (and what we would like it to be!), it is necessary to consider the following two basic mathematical features of the system (3)–(4). The first is conservation. From the Lagrangian nature of the field equations (3)–(4), we have the tensorial conservation law:
$$ Q_{\alpha\beta}[F] = \langle F_{\alpha\gamma}, F_{\beta}{}^{\gamma}\rangle - \frac{1}{4} g_{\alpha\beta} \langle F_{\gamma\delta}, F^{\gamma\delta}\rangle , $$
$$ \nabla^\alpha Q_{\alpha\beta}[F] = 0 , $$
where $\nabla$ is the covariant derivative on $\mathcal{M}^n$. In particular, contracting $Q$ with the vector-field $T = \partial_t$, we arrive at the following constant of motion for the system (3)–(4):
$$ \int_{\mathbb{R}^n} Q_{00} \, dx = \frac{1}{2}\int_{\mathbb{R}^n} \left( |E|^2 + |\underline{F}|^2 \right) dx . \tag{6} $$
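As a sanity check on (6), the following sketch expands $Q_{00}$ in inertial coordinates. It assumes the signature $g = (-1, 1, \ldots, 1)$, the identification $E_i = F_{0i}$, and the convention (an assumption made here only for the illustration) that $|\underline{F}|^2 = \sum_{i<j} |F_{ij}|^2$, where $|\cdot|$ is the norm induced by $\langle\cdot,\cdot\rangle$.

```latex
\begin{align*}
\langle F_{0\gamma}, F_{0}{}^{\gamma}\rangle
  &= \sum_{i} \langle F_{0i}, F_{0i}\rangle = |E|^2 ,
  && (F_{00} = 0) \\
\langle F_{\gamma\delta}, F^{\gamma\delta}\rangle
  &= 2\, g^{00} \sum_{i} |F_{0i}|^2 + \sum_{i,j} |F_{ij}|^2
   = -2|E|^2 + 2|\underline{F}|^2 , \\
Q_{00}
  &= |E|^2 - \tfrac14\, g_{00} \bigl( -2|E|^2 + 2|\underline{F}|^2 \bigr)
   = \tfrac12 \bigl( |E|^2 + |\underline{F}|^2 \bigr) .
\end{align*}
```

With a different normalization of $|\underline{F}|^2$ the magnetic term would pick up a factor of two, but the positivity of $Q_{00}$ is unaffected.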
The second main aspect of the system (3)–(4) is that of scaling. If we perform the transformation:
$$ (x^0, x^i) \rightsquigarrow (\lambda x^0, \lambda x^i) , \tag{7} $$
on $\mathcal{M}^n$, then an easy calculation shows that:
$$ D \rightsquigarrow \lambda D , \qquad F \rightsquigarrow \lambda^2 F . \tag{8} $$
If we now define the gauge covariant (integer) Sobolev spaces:
$$ \| F \|^2_{\dot{H}_A^s} := \sum_{|I| = s} \| \underline{D}_I F \|^2_{L^2(\mathbb{R}^n)} , \tag{9} $$
where for each multiindex $I = (i_1, \ldots, i_s)$ we have that $D_I = D_{i_1} \cdots D_{i_s}$ is the repeated covariant differentiation with respect to the translation invariant spatial vector-fields $\{\partial_1, \ldots, \partial_n\}$, then for even² spatial dimensions, the norm $\dot{H}_A^{\frac{n-4}{2}}$ is invariant with respect to the scaling transformation (8). In particular, the conserved quantity (6) is invariant when $n = 4$ and this is called the critical dimension.
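The exponent $\frac{n-4}{2}$ can be read off from a one-line change of variables. The sketch below works on a fixed time slice and writes $F_\lambda(x) = \lambda^2 F(\lambda x)$ and $D_\lambda = \lambda D$ for the rescaled pair in (8); the subscript-$\lambda$ notation is introduced here only for the illustration.

```latex
\begin{align*}
\bigl( (\underline{D}_\lambda)_I F_\lambda \bigr)(x)
  &= \lambda^{s+2}\, (\underline{D}_I F)(\lambda x) ,
  \qquad |I| = s , \\
\| F_\lambda \|^2_{\dot{H}_A^s}
  &= \lambda^{2s+4} \sum_{|I| = s} \int_{\mathbb{R}^n}
     |\underline{D}_I F|^2(\lambda x) \, dx
   = \lambda^{2s + 4 - n}\, \| F \|^2_{\dot{H}_A^s} .
\end{align*}
% Invariance holds exactly when 2s + 4 - n = 0, i.e. s = (n-4)/2.
% For s = 0 (the energy (6)) this forces n = 4.
```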
Now, based on numerical evidence as well as analytical arguments, it is suspected that in general the Cauchy problem for (3)–(4) with smooth initial data will not be well behaved without size control of the critical regularities $s_c = \frac{n-4}{2}$ in high dimensions. What we will take this statement to mean here is simply that if $4 \leqslant n$ and the $\dot{H}_A^{s_c}$ norm of the initial data is not sufficiently small, then one can expect the existence of regular (i.e. $C_A^\infty$) sets $(\underline{F}(0), \underline{D}(0), E(0))$ such that the corresponding solution to (3)–(4) will develop a singularity in finite time. By singularity development, we mean that some higher norm of the type (9) will fail to be bounded at a later time, given that it was initially; or even more specifically, that the $L^\infty$ norm of the curvature $F$ will blow up in finite time for some regular initial data sets. Since these norms are gauge covariant, this type of singularity development would correspond to an intrinsic geometric breakdown of the equations, and could not be an artifact of poorly chosen local coordinates (gauge) on $V$. This has been rigorously demonstrated in the equivariant category for the supercritical dimensions $5 \leqslant n$ (see [3]). In the critical dimension things are much less clear, although there is numerical evidence that one still has blowup for large initial data (see [2]). This is thought to be connected with the existence of large static solutions (instantons).
One possible conjecture is that there is global regularity when the norm (6) is below the ground state energy.
Going in the other direction, it is expected that if the critical norm $\dot{H}_A^{\frac{n-4}{2}}$ is sufficiently small, then regular initial data will remain regular for all times. This can be seen as an easier preliminary step toward understanding in detail the issue of large data for dimension $n = 4$, and is furthermore an interesting problem in its own right. A central difficulty in the demonstration of this conjecture is to construct a stable set of coordinates on the bundle $V$ such that the Christoffel symbols of $D$ are well behaved in the sense that they obey the natural range of estimates one expects for this type of problem. This is precisely what we shall do in dimensions $6 \leqslant n$ through the well known procedure of using (spatial) Coulomb gauges. Unfortunately, this preliminary gauge construction is far from sufficient to close the regularity argument, and it will in fact be necessary for us to go much further and control infinitely many Coulomb gauges, each of which corresponds to a distinct polarized plane wave solution to the usual (flat) wave equation $\Box = \partial^\alpha \partial_\alpha$.
However, this does not affect the statement of our main result which is in fact quite simple:
Theorem 1.1 (Critical regularity for high dimensional Yang-Mills). Let the number of spatial dimensions be even and such that $6 \leqslant n$. Then there exist fixed constants $0 < \varepsilon_0, C$ such that if $(\underline{F}(0), \underline{D}(0), E(0))$ is an admissible data set which satisfies the smallness condition:
$$ \| (\underline{F}(0), E(0)) \|_{\dot{H}_A^{\frac{n-4}{2}}} \leqslant \varepsilon_0 , \tag{10} $$
and there exist constants $M_k < \infty$, $\frac{n-4}{2} < k \in \mathbb{N}$, such that:
$$ \| (\underline{F}(0), E(0)) \|_{\dot{H}_A^{k}} = M_k , \tag{11} $$
then there exists a unique global solution to the field equations (3)–(4) with this initial data, and furthermore one has that the following inductive norm bounds hold:
$$ \| F \|_{\dot{H}_A^{\frac{n-4}{2}}} \leqslant C \varepsilon_0 , $$
$$ \| F \|_{\dot{H}_A^{k}} \leqslant C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \, M_k . $$
In particular, in this case $F$ remains smooth (in the gauge covariant sense) and bounded for all times.
Remark 1.2. As alluded to above, we will more specifically prove the existence of a global (in space and time) spatial Coulomb gauge such that the coefficient functions of the curvature $F$, as well as the Christoffel symbols (gauge potentials) of the connection $D$, are in the classical Sobolev spaces $\dot{H}^s$, and such that they satisfy appropriate angularly and spatially microlocalized Strichartz estimates. We have elected to eliminate a discussion of this from the statement of the main theorem in favor of the simpler geometric language so that the reader can at a first glance gain an idea of the content of our result without being confronted with too many technical details.

1 Of course, this set is overdetermined as the curvature $\underline{F}$ depends completely on the connection $\underline{D}$. Also, it is perhaps not completely obvious at first that the set $(\underline{F}(0), \underline{D}(0), E(0))$ uniquely determines a solution $(F, D)$ to (3)–(4). For example, the initial normal derivative $D_0(0)$ does not need to be specified. We will show this is the case in the sequel (in particular see Proposition 5.3).

2 For odd spatial dimensions, the above discussion needs to be modified somewhat because we will not make an attempt here to define fractional powers of the spaces $\dot{H}_A^s$. Instead, what one should do is to simply put things in a Coulomb gauge and then use the usual fractional Sobolev spaces. This latter approach is what we will take in the sequel, although for the sake of concreteness we will only discuss the case of even dimensions. We have opted for the covariant approach in the introduction because it makes stating our main result a bit easier, and has an appealing simplicity. Also, since we shall need many specifics on how Coulomb gauges are constructed in order to create and control our parametrix, we will explain how the Coulomb gauge relates to the Cauchy problem in detail in the following two sections.

Acknowledgements

First and foremost, we would like to thank our advisors Sergiu Klainerman and Matei Machedon for their continuing support and encouragement. This subject matter as well as our own point of view owes much to them. We would also like to thank Igor Rodnianski, Terry Tao, and Daniel Tataru for many interesting and helpful conversations. This work began at the Institute for Advanced Study during the Fall 2003 semester when both authors were in attendance. The second author would like to thank Harvard University for its hospitality during the Spring of 2004 and Winter 2005. The first author was partially supported under NSF grant DMS-0401177. The second author was supported in part by an NSF postdoctoral fellowship.

2 Some Basic Notation

We list here some of the basic conventions used in this work, as well as some constants which will be needed in the sequel. We use the usual notation $a \lesssim b$ to denote that $a \leqslant C \cdot b$ for some (possibly large) constant $C$ which may change from line to line. Likewise we write $a \ll b$ to mean $a \leqslant C^{-1} \cdot b$ for some large constant $C$. In general, $C$ will denote a large constant, but at times we will also call $C$ a connection. The difference should be clear from context. Overall, we will have use for a family of small constants, which satisfy the hierarchy:
$$ 0 < \varepsilon_0 \ll \epsilon_0 \ll \widetilde{\epsilon}_0 \ll \gamma \ll \delta \ll 1 . \tag{12} $$

3 Some gauge-theoretic preliminaries

In this paper, we are working with a compact semi-simple Lie group $G$. However, all of our calculations will be carried out in a somewhat larger context. Firstly, we will assume that $G$ is embedded as a subgroup of matrices of some (possibly) larger orthogonal group $O(m)$. In particular, we can identify the Lie algebra $\mathfrak{g}$ with an appropriate sub-algebra of $\mathfrak{o}(m)$. This allows us to perform all of our calculations on a specific collection of matrices. Since our main computation involves complex valued integral operators, we will further need to work in the complexified algebra $\mathbb{C} \otimes \mathfrak{o}(m)$. The Killing form $\langle\cdot,\cdot\rangle$ on $\mathfrak{g}$ extends easily to this context to yield the bilinear form:
$$ \langle A, B \rangle = \mathrm{trace}(A B^*) . \tag{13} $$
Notice that this is a positive definite form when restricted to the real vector space $\mathfrak{o}(m)$, and is a sesquilinear form on the corresponding complexified algebra $\mathbb{C} \otimes \mathfrak{o}(m)$. More importantly, $\langle\cdot,\cdot\rangle$ is $Ad(O(m))$ invariant, and in fact the more general identity holds:
$$ \langle g_1^{-1} A h_1 , g_2^{-1} B h_2 \rangle = \langle g_2 g_1^{-1} A h_1 h_2^{-1} , B \rangle , \tag{14} $$
for $A, B \in \mathbb{C} \otimes \mathfrak{o}(m)$ and $g_i, h_i \in O(m)$. In fact, it is not difficult to see that the form (13) extends to a sesquilinear form on all complex matrices in $M(m \times m)$, and that it can be identified with the usual matrix inner product:
$$ \langle A, B \rangle = \sum_{i,j} a_{ij} \overline{b}_{ij} , \tag{15} $$
which comes from considering these matrices as vectors in $\mathbb{C}^{m^2}$. Furthermore, it is easy to see that the general adjoint formula (14) continues to hold in this context.
This will be of fundamental importance in the sequel. In general, we will use the notation $\| A \|^2 = \langle A, A \rangle$ to denote the action of this norm on any matrix. Also, notice that directly from (14) one has the isometric identity:
$$ \| g A \| = \| A \| , \qquad g \in O(m) . \tag{16} $$
These are all very simple algebraic identities, but our method is incredibly sensitive to them and would collapse entirely if they did not hold.
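For the reader's convenience, here is a short verification of (14) and (16) from the trace formula (13). It uses only that $g^* = g^{-1}$ for $g \in O(m)$ and the cyclicity of the trace; nothing beyond the definitions above is assumed.

```latex
\begin{align*}
\langle g_1^{-1} A h_1 , g_2^{-1} B h_2 \rangle
 &= \mathrm{trace}\bigl( g_1^{-1} A h_1 \, (g_2^{-1} B h_2)^{*} \bigr) \\
 &= \mathrm{trace}\bigl( g_1^{-1} A h_1 \, h_2^{*} B^{*} (g_2^{-1})^{*} \bigr) \\
 &= \mathrm{trace}\bigl( g_1^{-1} A h_1 h_2^{-1} B^{*} g_2 \bigr)
   && \bigl( h_2^{*} = h_2^{-1},\ (g_2^{-1})^{*} = g_2 \bigr) \\
 &= \mathrm{trace}\bigl( g_2 g_1^{-1} A h_1 h_2^{-1} B^{*} \bigr)
   && (\text{cyclicity of the trace}) \\
 &= \langle g_2 g_1^{-1} A h_1 h_2^{-1} , B \rangle .
\end{align*}
% (16): take g_1^{-1} = g_2^{-1} = g, h_1 = h_2 = I and B = A in (14):
\[
  \| g A \|^2 = \langle g A , g A \rangle
  = \langle g^{-1} g A , A \rangle = \| A \|^2 .
\]
```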
In the context of matrices, we may compute the action of the connection $D$ on sections $F$ to $V$ as follows: $D_X F = X^\alpha \left( \partial_\alpha(F) + [A_\alpha, F] \right)$. Here, the gauge potentials $\{A_\alpha\}$ are $\mathfrak{g}$-valued, and are defined via the equation: $D_\alpha 1_V = [A_\alpha, 1_V]$, where $1_V$ denotes some chosen orthonormal frame in $V$, and we are abusively writing $F = F \cdot 1_V$. In shorthand notation, we write: $D = d + A$, where $d$ is the usual exterior derivative on matrix valued functions. Likewise, in this notation we have the well known identity for the curvature of $D$: $F^A = dA + [A, A]$. In this last formula, we use the superscript notation to emphasize the fact that the curvature is not gauge invariant, but transforms according to the $Ad(G)$ action: $F \rightsquigarrow g F g^{-1}$, whenever one performs the change of frame $1_V \rightsquigarrow g 1_V g^{-1}$. As is well known, the potentials $\{A_\alpha\}$ themselves do not transform according to $Ad(G)$, but instead take on an affine group of transformations:
$$ B = g A g^{-1} + g \, d g^{-1} , \tag{17} $$
where $\{B_\alpha\}$ represents the connection $D$ in the frame $g 1_V g^{-1}$. In particular, the difference of two connections obeys the $Ad(G)$ structure, a fact we will have use for in a moment. For instance, any connection $\{C_\alpha\}$ with $F^C = 0$ obeys $Ad(G)$.
Furthermore, as is the basic fact of gauge theory, such connections always lead to a globally integrable³ ODE: $dg = g C$, where the solution $g$ belongs to $G$. Thus, we may identify flat connections $C$ with infinitesimal gauge transformations, and it is easy to see that every gauge transformation (17) leads to a flat connection which we may define as $C = g^{-1} dg$.
This completes our discussion of elementary gauge theory.
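The flatness of $C = g^{-1} dg$ can be checked directly. The sketch below assumes that the shorthand $F^A = dA + [A, A]$ stands for the component formula $F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha + [A_\alpha, A_\beta]$, which is the reading consistent with $D_\alpha = \partial_\alpha + [A_\alpha, \cdot\,]$; the last line additionally assumes $C_\alpha$ takes values in the anti-symmetric matrices $\mathfrak{o}(m)$.

```latex
\begin{align*}
\partial_\alpha ( g^{-1} ) &= - g^{-1} (\partial_\alpha g)\, g^{-1} , \\
\partial_\alpha C_\beta
  &= \partial_\alpha \bigl( g^{-1} \partial_\beta g \bigr)
   = - C_\alpha C_\beta + g^{-1} \partial_\alpha \partial_\beta g , \\
\partial_\alpha C_\beta - \partial_\beta C_\alpha
  &= - [ C_\alpha , C_\beta ]
  \quad\Longrightarrow\quad
  F^{C}_{\alpha\beta}
   = \partial_\alpha C_\beta - \partial_\beta C_\alpha
     + [ C_\alpha , C_\beta ] = 0 . \\
% Orthogonality is preserved along dg = gC when C_\alpha^* = -C_\alpha:
\partial_\alpha ( g g^{*} )
  &= g C_\alpha g^{*} + g C_\alpha^{*} g^{*} = 0 .
\end{align*}
```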
It will also be necessary for us to make use of the basic facts from (non-gauge-covariant) Hodge theory. Even though the connections we work with in this paper are on the full space-time $\mathcal{M}^n$, our use of Hodge theory will always be restricted to time slices $\{t\} \times \mathbb{R}^n$. In particular we use the general notation $d, d^*$ for the exterior derivative and its adjoint acting on $\mathfrak{g}$ (and more generally $M(m \times m)$) valued differential forms on $\mathbb{R}^n$. To emphasize this restriction, we will use Latin indices when computing these operators. For example:
$$ (dA)_{ij} = \partial_{\{i} A_{j\}} , \qquad (dF)_{ijk} = \partial_{[i} F_{jk]} , $$
where $\{\ \}$ and $[\ ]$ denote anti-symmetric and symmetric cyclic summing respectively. Also, the adjoint here is taken with respect to the Killing form (13). In particular, we have the Hodge Laplacean:
$$ \Delta = -( d d^* + d^* d ) , \tag{18} $$
which in our context is simply the usual scalar Laplacean acting component-wise on matrices. Finally, we have the Hodge decomposition which we write as $A = A^{df} + A^{cf}$ where:
$$ A^{df} = - d^* d \, \Delta^{-1} A , $$
$$ A^{cf} = - d \, d^* \Delta^{-1} A . $$
This decomposition is bounded on $L^p$ spaces for $1 < p < \infty$ as the operators involved are SIO's. Also, since these operators are all real, this decomposition respects the Lie algebra structure of $\mathfrak{g}$ inside of $\mathbb{C} \otimes \mathfrak{o}(m)$.
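To make the decomposition concrete, the following sketch checks that the two pieces add up to $A$ and records their Fourier symbols for a one-form $A = \{A_i\}$. It assumes the sign convention in which $\Delta = -(dd^* + d^*d)$ is the usual Laplacean $\sum_i \partial_i^2$, so that $d^* A = -\partial_j A_j$ on one-forms; the symbols are then homogeneous of degree zero, which is why the operators are singular integrals.

```latex
\begin{align*}
A^{df} + A^{cf}
  &= - ( d^* d + d \, d^* ) \Delta^{-1} A = \Delta \Delta^{-1} A = A , \\
d^* A^{df} &= - (d^*)^2 d \, \Delta^{-1} A = 0 ,
  \qquad
d A^{cf} = - d^2 d^* \Delta^{-1} A = 0 , \\
\widehat{A^{cf}}_i (\xi)
  &= \frac{\xi_i \xi_j}{|\xi|^2} \, \widehat{A}_j (\xi) ,
  \qquad
\widehat{A^{df}}_i (\xi)
   = \Bigl( \delta_{ij} - \frac{\xi_i \xi_j}{|\xi|^2} \Bigr)
     \widehat{A}_j (\xi) .
\end{align*}
```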
The last topic we cover here is the basic underpinning of much of analysis in the context of compact gauge groups. This is the remarkable Uhlenbeck lemma, which allows one to “straighten out” a connection as long as its curvature satisfies appropriate bounds. The important thing for us is that these bounds are precisely at the level of the critical regularity $\dot{H}_A^{\frac{n-4}{2}}$. This result is:
Lemma 3.1 (Classical Uhlenbeck lemma). Let $D_A = d + A$ be a connection with compact (matrix) group on $\mathbb{R}^n$. Then there is a pair of constants $\epsilon_0, C$ which only depend on the dimension $n$ such that if the curvature $F^A$ of $D_A$ satisfies the bound: $\| F^A \|_{L^{\frac{n}{2}}} \leqslant \epsilon_0$, then $D_A$ is gauge equivalent to a connection $D_B = d + B$ where the potentials $\{B_i\}$ satisfy the condition: $d^* B = 0$, and such that the following estimate holds:
$$ \| B \|_{L^n} \leqslant C \epsilon_0 . \tag{19} $$
In the sequel, it will be useful for us to have a somewhat more refined version of Lemma 3.1 which does not make reference to the size of the curvature, but rather to the size of the connection $\{A_\alpha\}$ itself in a critical norm which does not involve derivatives. This will allow us to prove certain connections exist more directly. Furthermore, since the basic formulas used in the proof of this result will be important in constructing our parametrix, it will set the pace for much of what follows. Finally, we mention here that our proof is a bit different from that of [13] in that it does not rely on any implicit function theorem type arguments, and is instead completely explicit, being based on a simple Picard iteration.
Lemma 3.2 (Uhlenbeck lemma for small $L^n$ perturbations of Coulomb potentials with small $L^{\frac{n}{2}}$ curvature). Let $D_A = d + A$ be a connection on $\mathbb{R}^n \times V$ with compact (matrix) gauge group $G$. Then there exist constants $\epsilon_0, C$ such that if:
$$ \| F^A \|_{L^{\frac{n}{2}}} \leqslant \epsilon_0 , \tag{20} $$
and such that $d + A$ is gauge equivalent to $d + B$ with $d^* B = 0$, where one has the bounds:
$$ \| A \|_{L^n} \leqslant C \epsilon_0 , \tag{21} $$
then for every connection $d + \widetilde{A}$ such that:
$$ \| \widetilde{A} - A \|_{L^n} \leqslant C \epsilon_0 , \tag{22} $$
there exists a gauge equivalent connection $d + \widetilde{B}$ such that $d^* \widetilde{B} = 0$, and one has the same size control:
$$ \| \widetilde{B} \|_{L^n} \leqslant C \epsilon_0 . \tag{23} $$
Remark 3.3. Before continuing with the proof, let us remark here that Lemma 3.2 is in fact more general than the classical Uhlenbeck Lemma. Specifically, Lemma 3.2 easily implies Lemma 3.1 with smallness condition $\epsilon_0 / 2$ (where $\epsilon_0$ is determined by Lemma 3.2) through a straightforward induction procedure which we outline now.
First of all, from Lemma 3.2 we see that the set of all connections $d + A$ with curvature such that:
$$ \| F^A \|_{L^{\frac{n}{2}}} \leqslant \frac{\epsilon_0}{2} , \tag{24} $$
and such that $d + A$ is equivalent to $d + B$ with $d^* B = 0$, and such that one has the bounds (19), is an open set in the intersection of $L^n$ with the set determined by (24) (in the sense of distributions). Therefore, if the conclusion of Lemma 3.1 were to be violated, it must then be the case that there is a smallest number $r^*$ such that the sphere of radius $r^*$ contains a connection $d + A$ with the property that it cannot be put in the Coulomb gauge (with $L^n$ bounds), even though the bound (24) is valid for this connection. Now, consider the set of connections $d + \lambda A$ where $0 < (1 - \lambda) \ll 1$.
A quick calculation shows that these have curvature: $F^{\lambda A} = \lambda F^A + \lambda (\lambda - 1) [A, A]$. Choose $\lambda$ such that: $(1 - \lambda)(1 + r^*)^2 \leqslant \frac{\epsilon_0}{2}$. By the triangle and Hölder inequalities, and the definition of $r^*$, we have that: $\| F^{\lambda A} \|_{L^{\frac{n}{2}}} \leqslant \epsilon_0$. Therefore, by the minimality of $r^*$ we have that $d + \lambda A$ can be Coulomb gauged.
Again, by the definition of $\lambda$, we have that: $d + A = d + \lambda A + \widetilde{A}$, where we easily have the bound (we may assume $1 \leqslant C$): $\| \widetilde{A} \|_{L^n} \leqslant C \epsilon_0$. Therefore, by an application of Lemma 3.2 we have that $d + A$ can be put in the Coulomb gauge such that the bound (19) holds. This contradicts the minimality of $r^*$, as was to be shown.
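The “quick calculation” behind the curvature of $d + \lambda A$ and the size of the remainder $\widetilde{A}$ is purely algebraic. The sketch below uses only the shorthand $F^A = dA + [A, A]$ from above; the final Hölder bound is stated with an unspecified implicit constant, which is an assumption about how the commutator is normalized.

```latex
\begin{align*}
F^{\lambda A}
  &= d(\lambda A) + [ \lambda A , \lambda A ]
   = \lambda \, dA + \lambda^2 [A, A] \\
  &= \lambda \bigl( dA + [A, A] \bigr) + ( \lambda^2 - \lambda ) [A, A]
   = \lambda F^{A} + \lambda ( \lambda - 1 ) [A, A] , \\
\widetilde{A}
  &= A - \lambda A = ( 1 - \lambda ) A ,
  \qquad
  \| \widetilde{A} \|_{L^n} = ( 1 - \lambda ) \, \| A \|_{L^n} , \\
\| [A, A] \|_{L^{\frac{n}{2}}}
  &\lesssim \| A \|_{L^n}^2
  \qquad (\text{H\"older with } \tfrac{2}{n} = \tfrac{1}{n} + \tfrac{1}{n}) .
\end{align*}
```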

3 Of course this ODE is non-linear, but in the present context it also satisfies the conservation law $g g^{\dagger} = I$.

4 Some analytic preliminaries

We record here some useful formulas, mostly from elementary harmonic analysis, which will be used many times in the sequel. Firstly, we define the Fourier transform on $\mathbb{C} \otimes \mathfrak{o}(m)$, which is merely the usual scalar Fourier transform acting component-wise on matrices:
$$ \widehat{A}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x \cdot \xi} A(x) \, dx . \tag{36} $$
The Plancherel theorem with respect to the Killing form (13) reads: $\int_{\mathbb{R}^n_x} \langle A, B \rangle \, dx = \int_{\mathbb{R}^n_\xi} \langle \widehat{A}, \widehat{B} \rangle \, d\xi$. This follows simply from the definition of the inner product (15). While the constructions we make in the sequel are almost exclusively based on the spatial transform (36), it will in certain places be convenient for us to work with the space-time Fourier transform: $\widehat{A}(\tau, \xi) = \int_{\mathbb{R}^{n+1}} e^{-2\pi i (t\tau + x \cdot \xi)} A(t, x) \, dt \, dx$. In the sequel, we will have much use for dyadic frequency decompositions with respect to the spatial variable. For the most part, we will use a fairly loose and heuristic notation for this operation. This will help us to avoid having to come up with different symbols for multipliers which are basically the same. First of all, we let $\chi(\xi)$ denote some smooth bump function adapted to the unit frequency annulus $\{ 2^{-a} \leqslant |\xi| \leqslant 2^{a} \}$, where $1 \leqslant a$ is some constant used to define $\chi$ which may change from line to line. For a dyadic number $\mu \in \{ 2^i \,|\, i \in \mathbb{Z} \}$, we define the rescaled cutoffs: $\chi_\mu(\xi) = \chi(\mu^{-1} \xi)$, and the associated Fourier multipliers $\widehat{P_\mu A} = \chi_\mu \widehat{A}$. The two main facts we will need about these multipliers are the Bernstein inequality:
$$ \| P_\mu A \|_{L^p} \lesssim \mu^{n(\frac{1}{q} - \frac{1}{p})} \| A \|_{L^q} , \tag{37} $$
which holds for all $1 \leqslant q \leqslant p \leqslant \infty$, and the Littlewood-Paley equivalence:
$$ \Bigl\| \Bigl( \sum_\mu | P_\mu A |^2 \Bigr)^{\frac{1}{2}} \Bigr\|_{L^p} \sim \| A \|_{L^p} , \tag{38} $$
which holds under the restriction $1 < p < \infty$. All of the norms above can be taken with respect to (13).
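The power of $\mu$ in (37) is dictated by scaling. The sketch below recovers the general case from the unit frequency case $\mu = 1$, which is taken as given; the notation $A_{(\mu)}(x) = A(\mu x)$ for the dilated field is introduced here only for the illustration (the parentheses distinguish it from a frequency localized piece).

```latex
\begin{align*}
\widehat{A_{(\mu)}}(\xi) &= \mu^{-n} \, \widehat{A}(\mu^{-1} \xi)
  \quad\Longrightarrow\quad
  P_\mu A_{(\mu)} = ( P_1 A )_{(\mu)} , \\
\| P_\mu A_{(\mu)} \|_{L^p}
  &= \mu^{-\frac{n}{p}} \, \| P_1 A \|_{L^p}
   \lesssim \mu^{-\frac{n}{p}} \, \| A \|_{L^q}
   = \mu^{n(\frac{1}{q} - \frac{1}{p})} \, \| A_{(\mu)} \|_{L^q} .
\end{align*}
% The last step uses \| A_{(\mu)} \|_{L^q} = \mu^{-n/q} \| A \|_{L^q}.
```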
There are two simple analysis lemmas involving derivatives and multipliers which will come in useful in the sequel. The first of these is the low frequency (operator) commutator estimate:
$$ \| [ A , P_{\lesssim 1} ] F \|_{L^p} \lesssim \| \nabla_x A \|_{L^q} \| F \|_{L^r} , \tag{39} $$
where $\frac{1}{p} = \frac{1}{q} + \frac{1}{r}$ (see [8]). The second is the homogeneous paraproduct estimate:
$$ \| \nabla_x^k ( A \cdot F ) \|_{L^p} \lesssim \| \nabla_x^k A \|_{L^{q_1}} \| F \|_{L^{r_1}} + \| A \|_{L^{q_2}} \| \nabla_x^k F \|_{L^{r_2}} , \tag{40} $$
for $1 < p, q_i, r_i < \infty$, $\frac{1}{p} = \frac{1}{q_1} + \frac{1}{r_1}$, and $\frac{1}{p} = \frac{1}{q_2} + \frac{1}{r_2}$ whenever $0 < k$. This estimate is true even for non-integer $0 \leqslant k$ by a simple Littlewood-Paley argument. We note here that we only use it in the integer case, and there it is only employed as a convenience. For a proof of this, see e.g. Chapter 2 of [12].
We would now like to set up a system to formalize many of the dyadic estimates which will appear in this paper. This is most easily done using the language of Besov spaces. Since we have a specific purpose for these in mind, we introduce the following notation:
$$ \| A \|^2_{\dot{B}_2^{p,(q,s)}} = \sum_\mu \mu^{2s - 2n(\frac{1}{q} - \frac{1}{p})} \| P_\mu A \|^2_{L^p} . \tag{41} $$
This notation may seem a bit mysterious at first, but the thing to keep in mind here is that the first index $p$ in some sense controls the decay, while the second double index $(q, s)$ controls the scaling, which is the same as $\dot{W}^{s,q}$ (homogeneous $L^q$ Sobolev space). In general, the second index will be fixed, so we will strive to have $p$ as low as possible (see Remark 4.2 below). This notation has the following simple significance: $\dot{B}_2^{p,(q,s)}$ is the $\ell^2$ Besov space of Lebesgue index $p$ which contains the standard Besov space $\dot{B}_2^{q,s}$ defined by: $\| A \|^2_{\dot{B}_2^{q,s}} = \sum_\mu \mu^{2s} \| P_\mu A \|^2_{L^q}$. This identification is a direct consequence of the Bernstein embedding (37). In general, one has the inclusions:
$$ \dot{B}_2^{p_1,(q,s)} \subseteq \dot{B}_2^{p_2,(q,s)} , \qquad q \leqslant p_1 \leqslant p_2 . \tag{42} $$
Furthermore, a quick application of the Littlewood-Paley identity (38) gives the Lebesgue space inclusion:
$$ \dot{B}_2^{p,(q,\, n(\frac{1}{q} - \frac{1}{p}))} \subseteq L^p , \qquad 2 \leqslant p < \infty . \tag{43} $$
The reason we prefer to use this more involved notation, instead of the usual Besov norm convention, is that ours allows one to tell at first glance which norms are critical, which is particularly useful in a scale invariant problem like the one of this paper. Specifically, the norms $\dot{B}_2^{p,(2,\frac{n-2}{2})}$ will play a prominent role in what follows.
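As an illustration of how the bookkeeping in (41) works, here is a sketch of the Lebesgue inclusion (43). With $s = n(\frac{1}{q} - \frac{1}{p})$ the frequency weight in (41) is identically one, and the rest is the Littlewood-Paley equivalence (38) together with the triangle inequality in $L^{\frac{p}{2}}$, which is where the restriction $2 \leqslant p$ enters.

```latex
\begin{align*}
\| A \|^2_{\dot{B}_2^{p,(q,\, n(\frac{1}{q} - \frac{1}{p}))}}
  &= \sum_\mu \| P_\mu A \|^2_{L^p} , \\
\| A \|^2_{L^p}
  &\sim \Bigl\| \Bigl( \sum_\mu | P_\mu A |^2 \Bigr)^{\frac{1}{2}}
        \Bigr\|^2_{L^p}
   = \Bigl\| \sum_\mu | P_\mu A |^2 \Bigr\|_{L^{\frac{p}{2}}} \\
  &\leqslant \sum_\mu \bigl\| \, | P_\mu A |^2 \bigr\|_{L^{\frac{p}{2}}}
   = \sum_\mu \| P_\mu A \|^2_{L^p} .
\end{align*}
```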
It will also be necessary for us to employ the $\ell^1$ summing version of the norm (41), which we label by $\dot{B}_1^{p,(q,s)}$. This will essentially be used for one purpose only, and that is that the $L^\infty$ end-point of (43) is true for this space:
$$ \dot{B}_1^{\infty,(q,\frac{n}{q})} \subseteq L^\infty , \qquad 1 \leqslant q \leqslant \infty . \tag{44} $$
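Besov spaces are particularly well behaved with respect to the action of Riesz operators, which is exactly why we use them. In general, we define the operator $|D_x|^{\sigma}$ to be the Fourier multiplier with symbol $|\xi|^{\sigma}$. The basic embedding we will use in the sequel is the following:
Lemma 4.1. One has the following bilinear estimate for Besov spaces for $0 \leqslant \sigma$:
$$ |D_x|^{-\sigma} : \dot{B}_2^{p,(2,s_1)} \cdot \dot{B}_2^{q,(2,s_2)} \hookrightarrow \dot{B}_1^{r,(2,s_3)} , \tag{45} $$
where the indices $1 \leqslant p, q, r \leqslant \infty$ and $\sigma, s_i$ satisfy the following conditions:
$$ s_3 = s_1 + s_2 + \sigma - \frac{n}{2} , \qquad (\text{scaling}) , \tag{46} $$
$$ \sigma + \frac{n}{2} - s_3 < n \Bigl( \frac{1}{p} + \frac{1}{q} \Bigr) , \qquad (\text{High} \times \text{High}) , \tag{47} $$
$$ s_1 < \frac{n}{2} + \min \Bigl\{ n \Bigl( \frac{1}{q} - \frac{1}{r} \Bigr) , 0 \Bigr\} , \qquad (\text{Low} \times \text{High}) , \tag{48} $$
$$ s_2 < \frac{n}{2} + \min \Bigl\{ n \Bigl( \frac{1}{p} - \frac{1}{r} \Bigr) , 0 \Bigr\} , \qquad (\text{High} \times \text{Low}) , \tag{49} $$
$$ \frac{1}{r} \leqslant \frac{1}{p} + \frac{1}{q} , \qquad (\text{Lebesgue}) . \tag{50} $$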
Remark 4.2. As will become apparent in the proof, it is possible to show frequency localized versions of the embedding (45) such that not all of the conditions (47)–(49) need to be satisfied. Indeed, we will show the following two frequency localized “improvements” are possible:
$$ |D_x|^{-\sigma} : P_{\ll \lambda} ( \dot{B}_2^{p,(2,s_1)} ) \cdot P_\lambda ( \dot{B}_2^{q,(2,s_2)} ) \hookrightarrow P_\lambda ( \dot{B}_1^{r,(2,s_3)} ) , \tag{51} $$
$$ |D_x|^{-\sigma} : P_\lambda ( \dot{B}_2^{p,(2,s_1)} ) \cdot P_\lambda ( \dot{B}_2^{q,(2,s_2)} ) \hookrightarrow \Bigl( \frac{\mu}{\lambda} \Bigr)^{\delta} P_\mu ( \dot{B}_1^{r,(2,s_3)} ) , \tag{52} $$
where $\delta = n ( \frac{1}{p} + \frac{1}{q} ) + s_3 - \sigma - \frac{n}{2}$ in estimate (52). Estimate (51) holds whenever (46), (48), and (50) are satisfied. The second estimate (52) is valid whenever we have (46), (47), and (50). In particular, notice that for larger $\sigma$ this estimate requires lower values of $p, q$. This fact will have an immense bearing on the estimates we prove in the sequel, and seems to be one of the most difficult factors in lowering the dimension of the overall argument from $n = 6$ (apart from even more difficult things such as null-form estimates).
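The origin of the exponent $\delta$ can be seen from a short heuristic computation, sketched here under simplifying assumptions that are not part of the statement above: both inputs are localized at a single frequency $\lambda$ (written $A = P_\lambda A$, $B = P_\lambda B$), the output frequency satisfies $\mu \lesssim \lambda$, and $\frac{1}{p} + \frac{1}{q} \leqslant 1$ so that Hölder's inequality followed by Bernstein (37) applies directly.

```latex
\begin{align*}
\mu^{s_3 - n(\frac12 - \frac1r)} \,
  \bigl\| \, |D_x|^{-\sigma} P_\mu ( A \cdot B ) \bigr\|_{L^r}
 &\lesssim \mu^{s_3 - \sigma - n(\frac12 - \frac1r)} \,
   \mu^{n(\frac1p + \frac1q - \frac1r)} \,
   \| A \|_{L^p} \| B \|_{L^q} \\
 &= \mu^{\delta} \, \| A \|_{L^p} \| B \|_{L^q} , \\
\| A \|_{L^p} \| B \|_{L^q}
 &\sim \lambda^{-s_1 - s_2 + n - n(\frac1p + \frac1q)} \,
   \| A \|_{\dot{B}_2^{p,(2,s_1)}} \| B \|_{\dot{B}_2^{q,(2,s_2)}} \\
 &= \lambda^{-\delta} \,
   \| A \|_{\dot{B}_2^{p,(2,s_1)}} \| B \|_{\dot{B}_2^{q,(2,s_2)}} ,
 && \text{by } (46) .
\end{align*}
% Hence the gain (\mu/\lambda)^{\delta}, and \delta > 0 is exactly (47).
```

Since $\delta$ decreases as $\sigma$ grows, keeping it positive forces $\frac{1}{p} + \frac{1}{q}$ up, which is the sense in which larger $\sigma$ requires lower values of $p, q$.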
Before continuing on, let us note here a slight refinement of the Besov norms (41) and the embedding (45). This involves taking into account functions which live at frequency $\lesssim 1$. If we let $\langle D_x \rangle$ denote the multiplier with symbol $( 1 + |\xi|^2 )^{\frac{1}{2}}$, then we form the low frequency spaces:
$$ \| A \|_{\dot{B}_{2,10n}^{p,(q,s)}} = \| \langle D_x \rangle^{10n} A \|_{\dot{B}_2^{p,(q,s)}} , \tag{54} $$
with a similar definition for the $\ell^1$ version $\dot{B}_{1,10n}^{p,(q,s)}$. By a straightforward adaptation of the previous argument, it is easy to see that the embedding (45) is equally valid for these low frequency spaces. We leave the details to the reader.
It will also be necessary for us to perform various dyadic decompositions with respect to the angular frequency variable. For each fixed direction $\omega$ in the frequency plane $\mathbb{R}^n_\xi$, we decompose the unit sphere $S^{n-1}_\xi$ into dyadic conical regions:
$$ (\omega, \theta) = \{ \eta \in S^{n-1}_\xi \,|\, \angle(\omega, \eta) \sim \theta \} , \tag{55} $$
where $\theta \in \{ \frac{\pi}{2} 2^{i} \,|\, i \in \mathbb{Z} , i \leqslant 0 \}$. Here we will not bother to fix the constant in the $\sim$ notation used to define the regions (55), but we will let it change from line to line as we have done for the spatial multipliers above. We also define a smooth partition of unity adapted to these regions, which we label by $b^\omega_\theta$. These can always be chosen (e.g. by defining them on a larger sphere and then rescaling) so that they satisfy the differential bounds:
| ( ω ξ ) ω k p 1 b θ ω | 1 , | ( ω ξ ) k p 1 b θ ω | θ k ,
where the implicit constants depend on $k$ but are uniform in $\theta$. In particular, if we define the multipliers $\widehat{{}^\omega \Pi_\theta A} = b^\omega_\theta \widehat{A}$, then the operators ${}^\omega \Pi_\theta P_\mu$ are bounded on all $L^p$ spaces uniformly in $\mu$ and $\theta$. In fact, the following refinement of the inequality (37) holds, which we also call Bernstein:
$$ \| {}^\omega \Pi_\theta P_\mu A \|_{L^p} \lesssim \mu^{n(\frac{1}{q} - \frac{1}{p})} \theta^{(n-1)(\frac{1}{q} - \frac{1}{p})} \| A \|_{L^q} . \tag{56} $$
In all of the above inequalities, we have kept $\omega$ as a fixed directional value.
However, it will also be necessary for us to have an account of how our multipliers depend on this parameter. In particular, we will need to have bounds for the operators $\nabla_\omega {}^\omega \Pi_\theta$. This is easily achieved by differentiating the associated multiplier.
In fact, one has the bounds for fixed $\xi$:
$$ | \nabla_\omega^k b^\omega_\theta | \lesssim \theta^{-k} . \tag{57} $$
The way we shall express this bound in calculations is through the following heuristic operator identity:
$$ \nabla_\omega^k \, {}^\omega \Pi_\theta \approx \theta^{-k} \, {}^\omega \Pi_\theta , \tag{58} $$
which we shall take to mean that the left hand side satisfies all of the same $L^p$ space bounds as the right hand side. Notice that this relation has a preferred direction (left $\Rightarrow$ right).
In practice, this means that we have the bound (56) for the operator on the left hand side of (58) with the added factor of $\theta^{-k}$.
Finally, let us end this section by making the following conventions. Firstly, it will be convenient for us at times to write $P_\mu A = A_\mu$ for a localized object. This should not be confused with the $\mu$-th component of $A$ in the case that it is a one-form; which of the two is meant should usually be clear from context. Secondly, it will be necessary for us to ensure that certain of our multipliers have real symbol so that they respect the subalgebra $\mathfrak{g}(m) \subseteq M(m\times m)$. This will be done by taking their real part, which simply symmetrizes their (real) symbols. In particular, we will denote this by $\Re({}^\omega\Pi_\theta) = {}^\omega\overline{\Pi}_\theta$. Thirdly, we use the following bulleted notation for the sum of various cutoffs over a given range:
\[ P_{<c} = \sum_{\mu<c} P_\mu, \qquad {}^\omega\Pi_{<c} = \sum_{\theta<c} {}^\omega\Pi_\theta, \]
etc. We will also use the notation $A_{<c}$ etc. for these operators applied to tensors.
Finally, we will set aside a special notation here for cutting off on angular sectors whose width depends on the frequency:
\[ {}^\omega\overline{\Pi}^{(\sigma)} = \sum_\mu {}^\omega\overline{\Pi}_{\mu^\sigma < \bullet}\, P_\mu. \tag{59} \]
Notice that this multiplier does not satisfy good bounds of the form (57). However, it can be dealt with using the Littlewood-Paley equivalence (38) if there is a little extra room left to sum over fixed angular dyadics. This ends our description of the basic analysis we will use in this paper.

5 Gauge construction for the initial data; Reduction to a second order system and the main a-priori estimate

We now begin our proof of the main theorem 1.1. As we have already mentioned, one of the central components of the proof is to construct a stable set of “elliptic coordinates” on the bundle $V$. The way we will do this is to construct the desired frame on the $t = 0$ slice $\mathbb{R}^n \times \mathfrak{g}$. We will then show that this frame propagates as the system evolves by solving an auxiliary set of equations for the gauge potentials which respects the chosen frame automatically. The regularity of this system of equations will be provided in the usual translation invariant Sobolev spaces. We then show that our auxiliary solution is in fact a true solution to the system of equations (3)–(4) by employing a bootstrapping procedure which is similar to that used in the proof of Lemma 3.2. The desired gauge covariant regularity, which is contained in the statement of Theorem 1.1, will be provided by a comparison principle. These constructions are all local in time and are more or less standard. We have included them here for the convenience of the reader, the sake of completeness, and the fact that some of the formulas we develop along the way will be central to what we do in later sections.
With the local theory established, the global conclusion of Theorem 1.1 will then be a consequence of a certain a-priori estimate on the (usual Sobolev) energy of solutions to (3)–(4) in the gauge we construct. Our task will then be to show that this a-priori estimate is true for all solutions to yet another system of auxiliary equations, this time for the curvature. This can be considered to be the main estimate of the paper. The proof turns out to be quite involved, and will occupy the rest of the paper. In the next section, we will prove the main a-priori estimate itself with the help of a certain family of microlocalized space-time (Strichartz) estimates for solutions to second order covariant wave equations on bundles with connections satisfying estimates consistent with our bootstrapping assumptions.
The breakdown here is based on the Smith-Tataru parametrix idea (see [10]), which allows one to reduce the needed Strichartz estimates to proving them for a suitable family of approximate frequency localized fundamental solutions. Our rendition of this is essentially equivalent to that contained in the paper [8].
Finally, in the remaining sections of the paper we develop the linear theory. This is by far the most involved portion of the present work, and requires the construction of some fairly sophisticated oscillatory integrals and microlocal function spaces.
This material can be read without reference to the non-linear problem, as long as one is familiar with the algebraic and analytic assumptions we make on the geometry (frequency localized connection). While these come from the non-linear problem, they are of course a bit more general.

5.1 Construction of the initial frame, and the comparison principle

The first thing we do here is to put the initial connection $\underline{D}$ into the Coulomb gauge. Via the Uhlenbeck lemma 3.1, we simply need to show that $\| \underline{F} \|_{L^{\frac{n}{2}}} \lesssim \epsilon_0$, for $\epsilon_0$ the sufficiently small parameter from line (10) (which should not be confused with the small constant from Lemma 3.1 above). This $L^p$ bound follows immediately from the gauge covariant Sobolev embedding (for $n$ even) $\dot{H}_A^{\frac{n-4}{2}} \subseteq L^{\frac{n}{2}}$, which in turn follows from repeated application of the usual single derivative Sobolev embeddings and the Kato estimate (which follows immediately from (1) and Cauchy-Schwarz):
\[ \big| d|F| \big| \leq | \underline{D} F |, \tag{60} \]
where F   is any section to × g   and the absolute norm | |   is taken with respect to the Killing inner product  13 .
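For the reader's convenience, here is a sketch of why (60) holds. It assumes that (1) expresses the compatibility of the connection with the Killing inner product, that is $d\langle F, G\rangle = \langle \underline{D}F, G\rangle + \langle F, \underline{D}G\rangle$; this reading of (1) is an assumption made for the illustration. Wherever $|F| \neq 0$ one then has:
\[ 2|F|\, d|F| = d\langle F, F\rangle = 2\langle \underline{D}F, F\rangle, \qquad \hbox{so that} \qquad \big| d|F| \big| = \frac{|\langle \underline{D}F, F\rangle|}{|F|} \leq |\underline{D}F| \]
by Cauchy-Schwarz. The exponent in the embedding into $L^{\frac{n}{2}}$ is the usual Sobolev count for $\frac{n-4}{2}$ derivatives in $L^2$:
\[ \frac{1}{2} - \frac{1}{n}\cdot\frac{n-4}{2} = \frac{2}{n}. \]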
We may now assume that we are dealing with an initial data set:
\[ ( \underline{F}(0), \underline{D}(0), E(0) ), \tag{61} \]
for the system which is such that the connection $\underline{D}(0) = d + \underline{A}(0)$ satisfies the elliptic div-curl system:
\[ d\underline{A}(0) + [\underline{A}(0), \underline{A}(0)] = \underline{F}(0), \qquad d^*\underline{A}(0) = 0, \tag{62} \]
and such that the compatibility condition (5) is satisfied. Furthermore, from (19) we have the bound $\| \underline{A}(0) \|_{L^n} \lesssim \epsilon_0$. We will now use this last bound to show that the initial data set (61) is in fact in the classical Sobolev spaces $\dot{H}^k$. This is a consequence of the following:
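Lemma 5.1 (Comparison principle for Sobolev norms on $\mathbb{R}^n$). Let $\underline{D} = d + \underline{A}$ be a connection on $\mathbb{R}^n$, with $n$ even, such that one has the potential and curvature bounds:
\[ \| \underline{A} \|_{L^n}, \ \| \underline{F} \|_{\dot{H}_A^{\frac{n-4}{2}}} \leq \varepsilon_0, \tag{63} \]
\[ \| \underline{F} \|_{\dot{H}_A^{k}} \leq M_k, \tag{64} \]
for $\frac{n-4}{2} < k$. Suppose also that $\underline{D}$ is in the gauge $d^*\underline{A} = 0$. Then we have the critical classical Sobolev bounds:
\[ \| \underline{F} \|_{\dot{H}^{\frac{n-4}{2}}} \leq C\varepsilon_0, \tag{65} \]
\[ \| \underline{A} \|_{\dot{H}^{\frac{n-2}{2}}} \leq C\varepsilon_0. \tag{66} \]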
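Furthermore, if $G$ is any $\mathfrak{g}$ valued function, then we have the following inductive comparison of norms:
\[ C^{-1}(M_{\frac{n-4}{2}}, \ldots, M_{k-1})\, \| G \|_{H^{[k_*,k]}} \leq \| G \|_{H_A^{[k_*,k]}}, \tag{67} \]
\[ \phantom{C^{-1}(M_{\frac{n-4}{2}}, \ldots, M_{k-1})\, \| G \|_{H^{[k_*,k]}}} \leq C(M_{\frac{n-4}{2}}, \ldots, M_{k-1})\, \| G \|_{H^{[k_*,k]}}, \tag{68} \]
where the index $k_*$ is such that $\frac{n-4}{2} \leq k_* < n$, and where we have set:
\[ \| G \|^2_{H_A^{[k_*,k]}} = \sum_{k_* \leq m \leq k} \| \underline{D}^m G \|^2_{L^2}, \]
to be the interval gauge-covariant Sobolev space. We use an analogous definition for the space $H^{[k_*,k]}$. We also have the non-inductive equivalence between $\nabla_x \underline{A}$ and $\underline{F}$:
\[ N_k^{-1} \| \underline{A} \|_{\dot{H}^k} \leq \| \underline{F} \|_{\dot{H}^{k-1}} \leq N_k \| \underline{A} \|_{\dot{H}^k}, \tag{69} \]
where $N_k$, $\frac{n-2}{2} \leq k$, is a set of constants which depends only on the dimension and not on the constant $\varepsilon_0$ once it is sufficiently small. In particular, combining all of this, we have the following classical Sobolev bounds on the pair $(\underline{A}, \underline{F})$:
\[ \| \underline{F} \|_{\dot{H}^k} \leq C(M_{\frac{n-4}{2}}, \ldots, M_{k-1})\, M_k, \tag{70} \]
\[ \| \underline{A} \|_{\dot{H}^{k+1}} \leq C(M_{\frac{n-4}{2}}, \ldots, M_{k-1})\, M_k, \tag{71} \]
for $\frac{n-4}{2} < k$.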
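Using Lemma 5.1 and the assumed bounds (10)–(11), we may assume that our initial data (61) is such that:
\[ \| ( \underline{F}(0), E(0) ) \|_{\dot{H}^{\frac{n-4}{2}}} \leq \tilde{\varepsilon}_0, \tag{78} \]
\[ \| \underline{A}(0) \|_{\dot{H}^{\frac{n-2}{2}}} \leq \tilde{\varepsilon}_0, \tag{79} \]
\[ \| ( \underline{F}(0), E(0) ) \|_{\dot{H}^{k}} \leq \widetilde{M}_k, \tag{80} \]
\[ \| \underline{A}(0) \|_{\dot{H}^{k+1}} \leq \widetilde{M}_k, \tag{81} \]
where $\frac{n-4}{2} < k$, and the $\widetilde{M}_k$ depend on the $M_k$ in some inductive way, and we also have that $\tilde{\varepsilon}_0 \leq C\epsilon_0$ for some constant $C$ which depends only on the dimension. Here $M_k$ and $\epsilon_0$ refer to the constants introduced in the statement of Theorem 1.1.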
We now decompose the initial field strength $\{E_i(0)\}$ in a way that will be consistent with the evolution of the system (3)–(4). This will be convenient for discussing the Cauchy problem. Our first step is to define the following elliptic quantity:
\[ \Delta a_0 = -[a_i, \partial_i a_0] + [a_i, E_i], \tag{82} \]
where for convenience we have labeled $\{a_i\} = \{\underline{A}_i(0)\}$. We then define the auxiliary set of quantities:
\[ \dot{a}_i = E_i + \partial_i a_0 - [a_0, a_i]. \tag{83} \]
Notice that as an immediate consequence of the constraint equation (5), the form of (82), and the Coulomb condition $d^*a = 0$, we have the secondary Coulomb condition $\partial_i \dot{a}_i = 0$. This will turn out to be important in a moment. Now, from the definition of the quantities (82) and (83), the already established bounds (78)–(81), and several rounds of Sobolev embeddings, we have the following differential bounds on the quantities $\{\dot{a}_i\}$:
\[ \| \dot{a} \|_{\dot{H}^{\frac{n-4}{2}}} \leq \tilde{\varepsilon}_0, \tag{84} \]
\[ \| \dot{a} \|_{\dot{H}^{k}} \leq \widetilde{M}_k, \tag{85} \]
for $\frac{n-4}{2} < k$ (after a possible slight redefinition of the constants $\tilde{\varepsilon}_0, \widetilde{M}_k$ via multiplication by some fixed dimensional constant). We now define a Coulomb admissible initial data set to be a collection $(\underline{F}, \{a_i\}, \{\dot{a}_i\})$ such that:
\[ da + [a, a] = \underline{F}, \qquad d^*a = 0, \qquad d^*\dot{a} = 0. \tag{86} \]
Notice that $\underline{F}$ is uniquely determined by the $\{a_i\}$, therefore we do not need to include it in the definition of initial data. We define the Coulomb-Cauchy problem to be the task of finding a space-time connection $D = d + A$ such that it satisfies the set of equations:
\[ D^\beta F_{\alpha\beta} = 0, \tag{87a} \]
\[ dA + [A, A] = F, \tag{87b} \]
\[ d^*\underline{A} = 0, \tag{87c} \]
and such that at time $t = 0$ we have that:
\[ \underline{A}(0) = a, \qquad \partial_t \underline{A}(0) = \dot{a}. \tag{88} \]
We remark briefly here that solving the problem (86)–(88) provides a solution to the original Yang-Mills system (3)–(4) with Cauchy data (61) as long as we define the collection $\{\dot{a}\}$ according to the equations (82)–(83). All we need to do to prove this assertion is to show that $F_{0i}(0) = E_i$. Our proof of this follows the same bootstrapping philosophy used to show the equivalence (34) in the proof of Lemma 3.2. The claim will follow at once from equation (83) if we can first establish that $A_0(0) = a_0$, where $a_0$ is defined by (82). Now, from the system of equations (87) we have that the quantity $A_0$ is elliptically determined by the equation:
\[ \Delta_{\underline{A}} A_0 = [A_i, \partial_t A_i], \tag{89} \]
where $\Delta_{\underline{A}} = \underline{D}_i \underline{D}_i$ is the gauge covariant Laplacean. Furthermore, by using equation (83) as the definition of $E_i$, and substituting this into equation (82), we have that the quantity $a_0$ is elliptically determined by the equation:
\[ \Delta_a a_0 = [a_i, \dot{a}_i]. \tag{90} \]
By subtracting (90) from (89) at time $t = 0$ we have that $\Delta_a ( A_0(0) - a_0 ) = 0$. Uniqueness now comes from the Sobolev type estimate $\| B \|_{L^n} \lesssim \| \Delta_a B \|_{L^{\frac{n}{3}}}$, which follows from the smallness condition (79) and the usual Sobolev estimates. The details of the proof are left to the reader.
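The elliptic equation (89) can be checked by a direct computation. The following sketch assumes the conventions $F_{0i} = \partial_t A_i - \partial_i A_0 + [A_0, A_i]$ and $\underline{D}_i B = \partial_i B + [A_i, B]$, which are inferred from (87b) rather than quoted from the text. The $\alpha = 0$ component of (87a) reads $\underline{D}_i F_{0i} = \partial_i F_{0i} + [A_i, F_{0i}] = 0$, and in the Coulomb gauge (87c) one has:
\begin{align*}
\partial_i F_{0i} &= -\Delta A_0 + \partial_i [A_0, A_i] = -\Delta A_0 - [A_i, \partial_i A_0], \\
[A_i, F_{0i}] &= [A_i, \partial_t A_i] - [A_i, \partial_i A_0] - [A_i, [A_i, A_0]].
\end{align*}
On the other hand, again using $\partial_i A_i = 0$:
\[ \Delta_{\underline{A}} A_0 = \underline{D}_i \underline{D}_i A_0 = \Delta A_0 + 2[A_i, \partial_i A_0] + [A_i, [A_i, A_0]]. \]
Adding the first two lines therefore gives $0 = -\Delta_{\underline{A}} A_0 + [A_i, \partial_t A_i]$, which is (89). The same computation with $(a_0, a_i, \dot{a}_i)$ in place of $(A_0, A_i, \partial_t A_i)$, starting from (82) and (83), yields (90).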
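Keeping the equivalence we have just established in mind, and the first inequality contained in the comparison estimates (68) and (69), we have reduced the demonstration of Theorem 1.1 to showing the following non-gauge covariant global regularity theorem:
Theorem 5.2 (Global regularity in the Coulomb gauge). Let the number of spatial dimensions be $6 \leq n$. Then there exists a set of constants $\tilde{\varepsilon}_0$ and $C, C_k$, $\frac{n-2}{2} \leq k$, such that if $(\underline{F}, \{a_i\}, \{\dot{a}_i\})$ is a Coulomb admissible initial data set such that it satisfies the bounds:
\[ \| \underline{F} \|_{\dot{H}^{\frac{n-4}{2}}} \leq \tilde{\varepsilon}_0, \qquad \| \underline{F} \|_{\dot{H}^{k}} \leq \widetilde{M}_k, \tag{91a} \]
\[ \| a \|_{\dot{H}^{\frac{n-2}{2}}} \leq \tilde{\varepsilon}_0, \qquad \| \dot{a} \|_{\dot{H}^{\frac{n-4}{2}}} \leq \tilde{\varepsilon}_0, \tag{91b} \]
\[ \| a \|_{\dot{H}^{k}} \leq \widetilde{M}_{k-1}, \qquad \| \dot{a} \|_{\dot{H}^{k-1}} \leq \widetilde{M}_{k-1}, \tag{91c} \]
then if $\tilde{\varepsilon}_0$ is sufficiently small there exists a unique global solution $\{A_\alpha\}$ to the system (87) with this initial data. Furthermore, this solution obeys the following differential estimates:
\[ \| A \|_{\dot{H}^{\frac{n-2}{2}}} \leq C\tilde{\varepsilon}_0, \qquad \| \partial_t A \|_{\dot{H}^{\frac{n-4}{2}}} \leq C\tilde{\varepsilon}_0, \tag{92a} \]
\[ \| A \|_{\dot{H}^{k}} \leq C_{k-1}\widetilde{M}_{k-1}, \qquad \| \partial_t A \|_{\dot{H}^{k-1}} \leq C_{k-1}\widetilde{M}_{k-1}. \tag{92b} \]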

5.2 Local existence in the Coulomb gauge

Our goal here is to reduce the proof of Theorem 5.2 to a certain a-priori estimate involving the energies of the field strength $F$. This amounts to proving a local existence theorem for the system (86)–(88). The proof of this will allow us to set up a system of equations for the Coulomb potentials $\{A_\alpha\}$ which will be of central importance in the sequel. We will show that:
Proposition 5.3 (Local existence in the Coulomb gauge). Let the number of spatial dimensions be $6 \leq n$. Then for every set of constants $C, C_k$, $\frac{n-2}{2} \leq k$, there exists an $\tilde{\varepsilon}_0$ which only depends on $C$ with the following property: If $(\{a_i\}, \{\dot{a}_i\})$ is any set of Coulomb admissible initial data such that:
\[ \| a \|_{\dot{H}^{\frac{n-2}{2}}} \leq C\tilde{\varepsilon}_0, \qquad \| \dot{a} \|_{\dot{H}^{\frac{n-4}{2}}} \leq C\tilde{\varepsilon}_0, \tag{93} \]
\[ \| a \|_{\dot{H}^{k}} \leq C_{k-1}\widetilde{M}_{k-1}, \qquad \| \dot{a} \|_{\dot{H}^{k-1}} \leq C_{k-1}\widetilde{M}_{k-1}, \tag{94} \]
then for $\tilde{\varepsilon}_0$ sufficiently small there exists a time $0 < T^*$, which only depends on the quantities $C\tilde{\varepsilon}_0$, $C_{\frac{n}{2}}\widetilde{M}_{\frac{n}{2}}$, $C_{\frac{n+2}{2}}\widetilde{M}_{\frac{n+2}{2}}$, such that there exists a unique local solution $\{A_\alpha\}$ to the system (86)–(88) with this set of initial data. Furthermore, on the time interval $[0, T^*]$ one has the following norm bounds on the collection $\{A_\alpha\}$:
\[ \sup_{0 \leq t \leq T^*} \| A(t) \|_{\dot{H}^{\frac{n-2}{2}}} \leq 2C\tilde{\varepsilon}_0, \tag{95} \]
\[ \sup_{0 \leq t \leq T^*} \| \partial_t A(t) \|_{\dot{H}^{\frac{n-4}{2}}} \leq 2C\tilde{\varepsilon}_0, \tag{96} \]
\[ \sup_{0 \leq t \leq T^*} \| A(t) \|_{\dot{H}^{k}} \leq 2C_{k-1}\widetilde{M}_{k-1}, \tag{97} \]
\[ \sup_{0 \leq t \leq T^*} \| \partial_t A(t) \|_{\dot{H}^{k-1}} \leq 2C_{k-1}\widetilde{M}_{k-1}. \tag{98} \]

5.3 The second order curvature equation and the main a-priori estimate

Through a repeated application of the local existence theorem 5.3, we may reduce the proof of the global existence theorem 5.2 to showing a-priori that any solution to the Coulomb system (86)–(88) which exists on a time interval $[0, T^*]$ (possibly large!), and which obeys both the initial data bounds (91a)–(91c) and the evolution bounds (95)–(98), in fact obeys the improved evolution bounds (92a)–(92b).
Now, it turns out that the system of equations (100) is by itself not so well adapted$^4$ to the proof of such an a-priori estimate. This stems from the fact that these equations are not covariant. This manifests itself in the projection operator $\mathcal{P}$. If one were to try to write the hyperbolic system of equations (100a) in terms of the covariant wave operator $\Box_A$ and a source term, the projection operator, which is non-local, would end up causing problems in various commutator terms. The way around this is to not only consider the system (100), but to also work directly with the curvature in the equations (87a)–(87b). This is possible because we are not attempting to set up an iteration scheme, but are instead merely trying to prove an a-priori estimate, so we may safely assume that the quantities we work with satisfy any equation which results from the system (87). We will in fact use several such elliptic and hyperbolic equations. As a very rough description of this kind of philosophy, the reader may find it useful to keep in mind the following schematic:
\[ \hbox{Weak control of the connection} \ \Longrightarrow \ \hbox{Improved control of the curvature}, \]
\[ \Longrightarrow \ \hbox{Improved control of the connection}, \]
\[ \Longrightarrow \ \hbox{Weak control of the connection for longer times}. \]
To provide the improved control on the curvature, we will employ a second order equation for it. To derive this, we write the Bianchi identities (87b) in the form (4) and then contract this expression with the covariant derivative $D$. This yields the equations:
\begin{align*}
0 &= D^\gamma ( D_\alpha F_{\beta\gamma} + D_\gamma F_{\alpha\beta} + D_\beta F_{\gamma\alpha} ), \\
&= \Box_A F_{\alpha\beta} + [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}] + [F^{\gamma}{}_{\beta}, F_{\gamma\alpha}], \\
&= \Box_A F_{\alpha\beta} - 2[F_{\alpha}{}^{\gamma}, F_{\beta\gamma}]. \tag{103}
\end{align*}
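The passage between the lines of (103) uses only the commutator identity $[D_\gamma, D_\alpha] G = [F_{\gamma\alpha}, G]$ for $\mathfrak{g}$ valued $G$, together with the field equation (87a). As a sketch (the placement of raised indices here is chosen for definiteness and is not taken from the source):
\begin{align*}
D^\gamma D_\alpha F_{\beta\gamma} &= D_\alpha D^\gamma F_{\beta\gamma} + [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}] = [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}], \\
D^\gamma D_\beta F_{\gamma\alpha} &= D_\beta D^\gamma F_{\gamma\alpha} + [F^{\gamma}{}_{\beta}, F_{\gamma\alpha}] = [F^{\gamma}{}_{\beta}, F_{\gamma\alpha}],
\end{align*}
since $D^\gamma F_{\beta\gamma} = 0 = D^\gamma F_{\gamma\alpha}$. By the antisymmetry of $F$ and of the bracket, each of these two terms equals $-[F_{\alpha}{}^{\gamma}, F_{\beta\gamma}]$, while the middle term $D^\gamma D_\gamma F_{\alpha\beta}$ is by definition $\Box_A F_{\alpha\beta}$.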
In addition to (103) and the system (100), it will also be useful for us to employ a secondary elliptic equation. This will be for the quantity $\partial_t A_0$:
\[ \partial_t A_0 = \Delta^{-1} \partial_i \big( -[A_i, \partial_t A_0] + [A_0, \partial_t A_i] + [A^\alpha, F_{i\alpha}] \big). \tag{104} \]
This equation follows immediately from differentiating the equation (100b) with respect to time, and then applying the conservation law $\partial^\alpha [A^\beta, F_{\alpha\beta}] = 0$ to the resulting expression. We are now ready to state our main a-priori estimate:
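Theorem 5.4 (Main a-priori estimate for the curvature of the Coulomb system (86)–(88)). Let the space-time connection $D = d + A$ on $\mathbb{R}^{(n+1)}$, where $6 \leq n$, be given such that it satisfies the following system of equations on some finite time interval $[0, T^*]$:
\[ \Box_A F_{\alpha\beta} = 2[F_{\alpha}{}^{\gamma}, F_{\beta\gamma}], \tag{105a} \]
\[ dA + [A, A] = F, \tag{105b} \]
\[ d^*\underline{A} = 0, \tag{105c} \]
\[ \Box A_i = \mathcal{P}\big( [\partial_t A_0, A_i] - [A^\alpha, \partial_\alpha A_i] - [A^\alpha, F_{\alpha i}] \big), \tag{105d} \]
\[ \Delta A_0 = \partial_i [A_0, A_i] + [A_i, F_{0i}], \tag{105e} \]
\[ \Delta (\partial_t A_0) = \partial_i \big( -[A_i, (\partial_t A_0)] + [A_0, \partial_t A_i] + [A^\alpha, F_{i\alpha}] \big). \tag{105f} \]
Here we have split $\{A_\alpha\} = (A_0, \{\underline{A}_i\})$. Let there also be given a set of fixed constants $L, N, L_k, N_k$ for the indices $\frac{n-2}{2} \leq k$, such that at time $t = 0$ we have the initial bounds:
\[ \| F(0) \|_{\dot{H}^{\frac{n-4}{2}}} \leq \tilde{\varepsilon}_0, \qquad \| \partial_t F(0) \|_{\dot{H}^{\frac{n-6}{2}}} \leq L\tilde{\varepsilon}_0, \tag{106} \]
\[ \| F(0) \|_{\dot{H}^{k}} \leq \widetilde{M}_k, \qquad \| \partial_t F(0) \|_{\dot{H}^{k-1}} \leq L_k\widetilde{M}_k. \tag{107} \]
Then if $\tilde{\varepsilon}_0$ is chosen so as to be sufficiently small on line (106) above, there exists a collection of constants $C, C_k$, which only depend on the dimension and the collection $L, N, L_k, N_k$ but not on $\tilde{\varepsilon}_0$ (once it is small enough) or the collection $\widetilde{M}_k$, such that if at later times we have the bounds:
\[ \sup_{0 \leq t \leq T^*} \| \underline{A}(t) \|_{\dot{H}^{\frac{n-2}{2}}} \leq 2NC\tilde{\varepsilon}_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t \underline{A}(t) \|_{\dot{H}^{\frac{n-4}{2}}} \leq 2NC\tilde{\varepsilon}_0, \tag{108} \]
\[ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot{H}^{\frac{n-4}{2}}} \leq 2NC\tilde{\varepsilon}_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot{H}^{\frac{n-6}{2}}} \leq 2NC\tilde{\varepsilon}_0, \tag{109} \]
\[ \sup_{0 \leq t \leq T^*} \| \underline{A}(t) \|_{\dot{H}^{k}} < \infty, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t \underline{A}(t) \|_{\dot{H}^{k-1}} < \infty, \tag{110} \]
\[ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot{H}^{k}} < \infty, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot{H}^{k-1}} < \infty, \tag{111} \]
the following set of stronger bounds holds:
\[ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot{H}^{\frac{n-4}{2}}} \leq N^{-1}C\tilde{\varepsilon}_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot{H}^{\frac{n-6}{2}}} \leq N^{-1}C\tilde{\varepsilon}_0, \tag{112} \]
\[ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot{H}^{k}} \leq N_k^{-1}C_k\widetilde{M}_k, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot{H}^{k-1}} \leq N_k^{-1}C_k\widetilde{M}_k. \tag{113} \]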
Remark 5.5. The bounds involving (111) and (113) express the fact that the control we provide here is at the critical level. That is, bounds on the higher norms are completely irrelevant in the bootstrapping procedure, except for the fact that they are finite. The only place where we need higher norms to accomplish anything here is in the local existence theorem 5.3. The way we will prove Theorem 5.4 is by first establishing control at the critical level through a bootstrapping argument. The control of the higher norms will then be provided through an a-priori estimate whose proof is essentially identical to that of the critical bootstrapping bound, and will therefore be left to the reader.
Remark 5.6. The reader may find it useful to have a brief description of the various constants appearing in Proposition 5.3 and Theorem 5.4. The constants $L, L_k, N, N_k$ are input into the a-priori machine, and these are meant to cover the transition to and from estimates involving the connection and curvature. The set $L, L_k$ is only needed to deal with the initial data. This is necessary because we must have an account of bounds involving the quantities $\partial_t F$. The other constants $N, N_k$ govern comparison type estimates similar to (69). The constants $C, C_k$ are byproducts of the proof of the a-priori estimate itself. These will very much depend on the $L, L_k, N, N_k$, but are independent of $\tilde{\varepsilon}_0$ when it is small enough. Finally, the main adjusting parameter $\tilde{\varepsilon}_0$ has two important roles. First and foremost, it is needed to prove the a-priori estimate itself. However, it has a second purpose which is also crucial, and that is to keep the dependence of $C, C_k$ on $L, L_k, N, N_k$ from creating a feedback loop. Specifically, we need our various comparison estimates to have constants which do not depend on the large constants $C, C_k$. Since the critical energy of the curvature can grow by a factor of $C$, we will need the extra influence of $\tilde{\varepsilon}_0$ to make sure this does not cycle back to $L, L_k, N, N_k$.

4 Strictly speaking, this is not entirely true. This can be seen from the fact that if one looks at the localized commutator $[\Box_A, \mathcal{P}] P_\lambda$, where the connection $\{A_\alpha\}$ is assumed to be of much lower frequency than $\lambda$, then this is essentially a “derivative falls on low” interaction which can be handled with the available Strichartz estimates in $5 \leq n$ dimensions. We have elected instead to follow a formulation of the YM system which is based on the curvature because of its conceptual appeal. However, in lower dimensions, it may be best to work directly with the connection $\{A_\alpha\}$, in part to help mitigate bad $High \times High \Rightarrow Low$ frequency interactions which come from the quadratic term on the right hand side of (103).

6 Proof of the Main Bootstrapping Estimate

We are now ready to begin our proof of the (improved) main critical a-priori estimate (112). In order to do this, we will need to bootstrap in a function space which is much stronger than the energy type spaces of Theorem 5.4. This will cost us another bootstrapping procedure, but this will be easy to set up because it will be clear that the extra norms we create have good bounds on some very small initial time interval, due to the fact that we are assuming the higher energy boundedness (111) and that these norms involve integration in time. All of the norms we construct here will be of Strichartz type, with an $\ell^2$ Besov structure in the spatial variable.
It will also be necessary for us to include an angular square sum structure in many of the estimates we prove. This may seem a bit odd at first because we will not need such bounds directly in our proof of Theorem 5.4. These extra bounds will instead be used to give the fine control which is needed to handle the linear part of the problem. At each fixed frequency, we form the square-sum norms:
\[ \| P_\lambda A \|_{SL^p} = \sup_{\theta \lesssim 1} \Big( \sum_{\varphi \, : \, \omega_0 \in \Gamma_\varphi} \| {}^{\omega_0}\Pi_\theta P_\lambda A \|^2_{L^p} \Big)^{\frac{1}{2}}, \tag{116} \]
where $\Gamma_\varphi$ is taken to be a (uniformly) finitely overlapping set of spherical caps such that $S^{n-1} = \cup_\varphi \Gamma_\varphi$, each of which has size $\sim \theta$ and constructed in such a way that one has the bounds:
\[ \Big( \sum_{\varphi \, : \, \omega_0 \in \Gamma_\varphi} \| {}^{\omega_0}\Pi_\theta P_\lambda A \|^2_{L^2} \Big)^{\frac{1}{2}} \lesssim \| P_\lambda A \|_{L^2}, \]
independent of the size of $\theta$. Here we take the condition $\omega_0 \in \Gamma_\varphi$ to mean that the variable $\omega_0$ is essentially in the center of that spherical cap $\Gamma_\varphi$. The exact placement is not essential. Notice that by construction, these norms are contained in the usual $L^p$ spaces because we can assume that one set of angular sectors we are summing over contains the whole sphere.
Next, using the same prescription that defined the Besov spaces (41), we define the angular square sum Besov spaces to be:
\[ \| A \|_{S\dot{B}_2^{p,(q,s)}} = \Big( \sum_\lambda \lambda^{2s} \lambda^{-2n(\frac{1}{q}-\frac{1}{p})} \| P_\lambda A \|^2_{SL^p} \Big)^{\frac{1}{2}}. \tag{117} \]
We now define the main dispersive component of the function spaces we will be working with. These are $L^2_t$ based Strichartz spaces, built on the norms (117) and (41). These are all defined on a finite time interval $[0, T^*]$, which will for the most part be left implicit:
\[ \| A \|_{\dot{Z}^s} = \| A \|_{L^2_t ( \dot{B}_2^{\frac{2(n-1)}{n-3},(2,s+\frac{1}{2})} )[0,T^*]}, \tag{118} \]
\[ \| A \|_{S\dot{Z}^s} = \| A \|_{L^2_t ( S\dot{B}_2^{\frac{2(n-1)}{n-3},(2,s+\frac{1}{2})} )[0,T^*]}. \tag{119} \]
To gain some intuition about these spaces, notice that they all scale like $L^\infty(\dot{H}^s)$ under the change of variables (7). Therefore, they all scale like solutions to the wave equation with $\dot{H}^s$ initial data. Indeed, these spaces are consistent with the available range of Strichartz estimates for the usual scalar wave equation, and it will be our goal to show that one has bounds on the norm (119) for solutions of the covariant wave operator on the left hand side of (105).
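The scaling claim can be verified by a short exponent count. As a sketch, consider the dilation $A_\mu(t,x) = A(\mu t, \mu x)$; any amplitude factor carried by the change of variables (7) is ignored here, since it multiplies every norm equally. We also assume that the Besov norms (41) carry the same frequency weights as (117). Since $P_\lambda A_\mu(t,x) = (P_{\lambda/\mu} A)(\mu t, \mu x)$, one finds at each fixed time:
\[ \| A_\mu(t) \|_{\dot{H}^s} = \mu^{s-\frac{n}{2}} \| A(\mu t) \|_{\dot{H}^s}, \qquad \| A_\mu(t) \|_{\dot{B}_2^{p,(q,\sigma)}} = \mu^{\sigma - \frac{n}{q}} \| A(\mu t) \|_{\dot{B}_2^{p,(q,\sigma)}}, \]
independently of $p$. Taking $q = 2$ and $\sigma = s + \frac{1}{2}$, and noting that the $L^2_t$ integration contributes a further factor of $\mu^{-\frac{1}{2}}$ (over the correspondingly rescaled time interval), the norms (118) and (119) both scale by:
\[ \mu^{(s+\frac{1}{2}) - \frac{n}{2}} \cdot \mu^{-\frac{1}{2}} = \mu^{s - \frac{n}{2}}, \]
which is exactly the scaling of $L^\infty(\dot{H}^s)$.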
To form the overall spaces we will bootstrap in, we add the above space-time norms to the energy type norms used in the statement of the main a-priori estimate 5.4:
\[ \dot{X}^s = L^\infty[0,T^*](\dot{H}^s) \cap S\dot{Z}^s, \tag{120} \]
\[ \dot{Y}^s = L^\infty[0,T^*](\dot{H}^s) \cap \dot{Z}^s. \tag{121} \]
It will also be necessary for us to estimate time derivatives in the above spaces. Since differentiation will decrease the scaling by one unit, we use the norms:
\[ \| A \|_{\dot{X}^s \times \partial_t^{-1}(\dot{X}^{s-1})} = \| A \|_{\dot{X}^s} + \| \partial_t A \|_{\dot{X}^{s-1}}, \]
with an analogous definition for $\dot{Y}^s \times \partial_t^{-1}(\dot{Y}^{s-1})$.

6.1 Proof of the Critical Bootstrapping Estimate

We are now ready to prove the critical component of Theorem 5.4 (we will now change notation from $\tilde{\varepsilon}_0$ back to $\varepsilon_0$):
Proposition 6.1 (Critical bootstrapping estimate in the $\dot{X}^s$ spaces). Let the dimension be $6 \leq n$. Let the collection $(F, A)$ be a space-time connection curvature pair which obeys the general smoothness conditions (110)–(111), and which satisfies the system of equations (105). Let $L, N$ be given constants such that one has the initial bounds:
\[ \| F(0) \|_{\dot{H}^{\frac{n-4}{2}}} + \| \partial_t F(0) \|_{\dot{H}^{\frac{n-6}{2}}} \leq L\varepsilon_0. \tag{122} \]
Then there exists a constant $C$ which depends only on $L, N$ and the dimension such that if one has the bootstrapping bounds on a time interval $[0, T^*]$:
\[ \sup_{0 \leq t \leq T^*} \| ( \underline{A}, \partial_t \underline{A} )(t) \|_{\dot{H}^{\frac{n-2}{2}} \times \dot{H}^{\frac{n-4}{2}}} \leq 2NC\varepsilon_0, \tag{123} \]
\[ \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1}(\dot{X}^{\frac{n-6}{2}})} \leq 2NC\varepsilon_0, \tag{124} \]
then for $\varepsilon_0$ sufficiently small, we have the following improved bound on the same time interval $[0, T^*]$:
\[ \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1}(\dot{X}^{\frac{n-6}{2}})} \leq N^{-1}C\varepsilon_0. \tag{125} \]
The proof of Proposition 6.1 will be accomplished through the standard use of Littlewood-Paley paraproduct decompositions, and the application of space-time estimates. All of the linear bounds we will need are provided by the following, which is the main technical result of this work:
Theorem 6.2 (Gauge covariant angular square-sum Strichartz estimates for Yang-Mills connections). Let the number of dimensions be such that $6 \leq n$, and let $d + \widetilde{\underline{A}}$ be a space-time connection defined on all of Minkowski space (of space-time dimension $n+1$) such that it satisfies the conditions:
\[ \widetilde{\underline{A}}_0 = 0 \qquad \hbox{(Temporal Gauge)}, \tag{126a} \]
\[ d^*\widetilde{\underline{A}} = 0 \qquad \hbox{(Coulomb Gauge)}, \tag{126b} \]
\[ P_{|\xi| \ll |\tau|} ( \widetilde{\underline{A}} ) = 0 \qquad \hbox{(Space-time frequency localization)}, \tag{126c} \]
\[ \| \widetilde{\underline{A}} \|_{\dot{X}^{\frac{n-2}{2}}} \leq \mathcal{E} \qquad \hbox{(Space-time estimate)}, \tag{126d} \]
\[ \Box \widetilde{\underline{A}} = \widetilde{\mathcal{P}} ( [B, H] ) \qquad \hbox{(Structure equation)}, \tag{126e} \]
\[ \| (B, H) \|_{\dot{Y}^{\frac{n-2}{2}} \times \dot{Y}^{\frac{n-4}{2}}} \leq \mathcal{E} \qquad \hbox{(Structure estimates)}, \tag{126f} \]
where $(B, H)$ is an auxiliary set of $\mathfrak{g}$ valued functions defined on all of Minkowski space. The symbol $\widetilde{\mathcal{P}}$ denotes a composition of the Leray projection $\mathcal{P}$ with some frequency cutoff function which is bounded on all mixed Lebesgue-Besov spaces of the type $L^p(\dot{B}_2^{p,(2,s)})$. We assume also that the connection $d + \widetilde{\underline{A}}$ satisfies the general smoothness bounds:
\[ \sup_{-T^* \leq t \leq T^*} \| \widetilde{\underline{A}}(t) \|_{\dot{H}^k} < \infty, \qquad \tfrac{n-2}{2} < k, \tag{127} \]
for each fixed time $T^*$. Let now $F$ be any other $\mathfrak{g}$ valued function which satisfies the inhomogeneous equation:
\[ \Box_{\widetilde{\underline{A}}} F = G, \tag{128} \]
with Cauchy data:
\[ F(0) = f, \qquad \partial_t F(0) = \dot{f}. \tag{129} \]
Then if the constant $\mathcal{E}$ in lines (126d) and (126f) above is sufficiently small, one has the following family of space-time estimates:
\[ \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1}(\dot{X}^{\frac{n-6}{2}})} \lesssim \| (f, \dot{f}) \|_{\dot{H}^{\frac{n-4}{2}} \times \dot{H}^{\frac{n-6}{2}}} + \| G \|_{L^1(\dot{H}^{\frac{n-6}{2}})}. \tag{130} \]
Remark 6.3. In the above Theorem, the Strichartz estimates have a preferred scaling. This is consistent with the application we have in mind. In general, it is not possible to prove estimates of the type (130) for higher Sobolev indices without assuming that the connection $\widetilde{\underline{A}}$ itself has more regularity. In the case where $\widetilde{\underline{A}}$ does have better regularity, a proof similar to that given after Proposition 7.1 below can be used to show estimates for those higher norms.

7 Reduction to Approximate Half-Wave Operators

This is a preliminary technical section where we reduce the proof of the Strichartz estimates (130) to a more easily managed form. This material is more or less standard, and we again follow closely what was done in [8]. Our first step here is to reduce the proof of Theorem 6.2 to the following:
Proposition 7.1 (Existence of a fixed frequency parametrix). Let the number of dimensions be $6 \leq n$, and let $d + \underline{A}_\lambda$ be a connection which satisfies the conditions (126). In addition assume that we have the frequency localization condition:
\[ P_\lambda ( \underline{A}_\lambda ) = 0, \tag{158} \]
where $P_\lambda$ is a frequency cutoff on the region where $2^{-10a}\lambda \leq |\xi|$, where $1 \leq a$ is some fixed parameter. Then if the constant on the right hand side of lines (126d) and (126f) is sufficiently small, there exists a family of approximate propagation operators $W^\lambda_{\underline{A}_\lambda}(s)$ (or just $W^\lambda_s$ for short) such that if $(f_\lambda, g_\lambda)$ is any set of $\lambda$–frequency initial data with Fourier support in the region $2^{-a}\lambda \leq |\xi| \leq 2^{a}\lambda$, the following estimates hold:
W s λ ( f λ , g λ ) X ˙ 0 × t 1 ( X ˙ 1 ) E 1 2 ( f λ , g λ ) , (159a)
W s λ ( f λ , g λ ) ( s ) f λ L 2 1 2 E 1 2 ( f λ , g λ ) , (159b)
t W s λ ( f λ , g λ ) ( s ) g λ L 2 λ 1 2 E 1 2 ( f λ , g λ ) , (159c)
A ̲ λ W s λ ( f λ , g λ ) L 1 ( L 2 ) λ E 1 2 ( f λ , g λ ) . (159d)
Here we have set $E(f_\lambda, g_\lambda)$ to be the $L^2$ normalized energy $E(f_\lambda, g_\lambda) = \| f_\lambda \|^2_{L^2} + \lambda^{-2} \| g_\lambda \|^2_{L^2}$. Finally, we have that the frequency support of the parametrix is contained in the set $2^{-2a}\lambda \leq |\xi| \leq 2^{2a}\lambda$, where $a$ is as above.
The final thing we will do in this section is to make one further reduction of the Strichartz estimates  130 . This involves the following proposition:
Proposition 7.2 (Existence of approximate half-wave parametrices). Let the number of dimensions be $6 \leq n$, and let $d + \underline{A}_1$ be a connection which satisfies the conditions (126) as well as the frequency localization condition (158) for $\lambda = 1$. Then there exists a pair of evolution operators $\Phi^\pm(\hat{f})(t)$ from $L^2(\mathbb{R}^n_\xi)$ to $L^2(\mathbb{R}^n_x)$ such that the fixed time adjoints $(\Phi^\pm(t))^*$ are always supported in the region $2^{-a} \leq |\xi| \leq 2^{a}$ for some fixed $1 \leq a$, and such that they obey the following estimates:
( P 1 Φ ± ( f ^ ) , Φ ± ( f ^ ) ) X ˙ 0 × L x 2 f ^ L ξ 2 , (178a)
x Φ ± ( f ^ ) L t 2 ( L x 2 ( n 1 ) n 3 ) f ^ L ξ 2 , (178b)
t P 1 Φ ± ( f ^ ) P 1 Φ ± ( 2 π i | ξ | f ^ ) X ˙ 0 f ^ L ξ 2 , (178c)
Φ ± ( 0 ) ( ( 2 π | ξ | ) α ( Φ ± ( 0 ) ) * ) g ( Δ ) α 2 P 1 ( g ) L x 2 1 2 g L x 2 , (178d)
A ̲ 1 Φ ± ( f ^ ) L t 1 ( L x 2 ) f ^ L ξ 2 . (178e)

8 Construction of the half wave operators

We now begin construction of our approximate solutions $\Phi^\pm$ to the reduced covariant wave equation $\Box_{\underline{A}_1}$. This will be accomplished by integrating over a collection of gauge transformations designed to eliminate the highest order effect of the troublesome term $\underline{A}_1^\alpha \partial_\alpha$. In order to understand what such a gauge transformation should be, we begin with a simple calculation. We consider the covariant wave equation $\Box_{{}^\omega\! A}$, where the connection ${}^\omega\! D = d + {}^\omega\! A$ will be determined in a moment, acting on a vector valued plane wave $e^{2\pi i \lambda\, {}^\omega u^\pm} \hat{f}$. Here $\hat{f}$ is a constant complex valued matrix in $\mathbb{C} \otimes \mathfrak{o}(m)$, and the ${}^\omega u^\pm$ are the standard plane wave optical functions:
\[ {}^\omega u^+ = t + \omega\cdot x, \qquad {}^\omega u^- = -t + \omega\cdot x. \]
In particular, $\partial^\alpha({}^\omega u^\pm) = ({}^\omega L^\mp)^\alpha$, where the ${}^\omega L^\pm$ are the associated null hyper-surface generators:
\[ {}^\omega L^+ = \partial_t + \omega\cdot\nabla_x, \qquad {}^\omega L^- = -\partial_t + \omega\cdot\nabla_x. \]
With these identifications, we easily have the calculation:
\[ \Box_{{}^\omega\! A} ( e^{2\pi i \lambda\, {}^\omega u^\pm} \hat{f} ) = e^{2\pi i \lambda\, {}^\omega u^\pm} \big( 4\pi i \lambda\, [{}^\omega\! A({}^\omega L^\mp), \hat{f}] + D_\alpha^{{}^\omega\! A} [{}^\omega\! A^\alpha, \hat{f}] \big). \tag{182} \]
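The identity (182) follows from a direct expansion. As a sketch, write $\phi = 2\pi\lambda\, {}^\omega u^\pm$ and $D_\alpha G = \partial_\alpha G + [{}^\omega\! A_\alpha, G]$; the adjoint action is an assumption about how the connection acts on $\hat{f}$, suggested by the commutators in (182). Since $\hat{f}$ is constant:
\begin{align*}
D_\alpha ( e^{i\phi} \hat{f} ) &= e^{i\phi} \big( i\, \partial_\alpha\phi\, \hat{f} + [{}^\omega\! A_\alpha, \hat{f}] \big), \\
D^\alpha D_\alpha ( e^{i\phi} \hat{f} ) &= e^{i\phi} \big( -\partial^\alpha\phi\, \partial_\alpha\phi\, \hat{f} + i\, \Box\phi\, \hat{f} + 2i\, \partial^\alpha\phi\, [{}^\omega\! A_\alpha, \hat{f}] + D_\alpha [{}^\omega\! A^\alpha, \hat{f}] \big).
\end{align*}
The optical functions are linear and null, so $\Box\phi = 0$ and $\partial^\alpha\phi\, \partial_\alpha\phi = 0$. Moreover, raising the index with the Minkowski metric flips the sign of the time component, which gives $\partial^\alpha\phi = 2\pi\lambda\, ({}^\omega L^\mp)^\alpha$. The cross term therefore becomes $4\pi i \lambda\, [{}^\omega\! A({}^\omega L^\mp), \hat{f}]$, as in (182).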
Using the heuristic$^5$ that terms of the form $\nabla({}^\omega\! A)$ and $[{}^\omega\! A, {}^\omega\! A]$ are lower order, and splitting the potentials $\{{}^\omega\! A_\alpha\}$ into the sets $\{{}^\omega\! A_\alpha^\pm\}$ associated with the optical functions ${}^\omega u^\pm$ (resp.), we see that in order to eliminate the highest order term on the right hand side of (182) we would need to assume this connection is in the backward (resp. forward) $\omega$-null-gauge:
\[ {}^\omega\! A^+ ({}^\omega L^-) = 0, \qquad {}^\omega\! A^- ({}^\omega L^+) = 0. \tag{183} \]
Of course, it is not possible to assume that a given fixed connection will simultaneously be in the null-gauge for every direction $\omega$. However, it is more or less clear that since these gauges are of Cronström type, it is always possible to transform a given connection so that it is in the null-gauge for a fixed direction. This motivates the following form of an approximate solution to $\Box_{\underline{A}_1}$:
\[ \Phi^\pm(\hat{f}) = \int_{\mathbb{R}^n} e^{2\pi i \lambda\, {}^\omega u^\pm}\ {}^\omega g_\pm^{-1}\, \hat{f}(\lambda\omega)\, {}^\omega g_\pm\ \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega, \tag{184} \]
where $\chi_{(\frac{1}{2},2)}$ is a smooth bump function such that $\chi_{(\frac{1}{2},2)} \equiv 1$ on the interval $[2^{-1}, 2]$ and such that $\chi_{(\frac{1}{2},2)} \equiv 0$ outside of $[4^{-1}, 4]$ (the variable width assumption of Proposition 7.2 can be achieved with similar bump functions). Here, the gauge transformation:
\[ {}^\omega B^\pm = {}^\omega g_\pm\, \underline{A}_1\, ({}^\omega g_\pm^{-1}) + {}^\omega g_\pm\, d({}^\omega g_\pm^{-1}), \tag{185} \]
will be chosen so that ${}^\omega B^\pm$ approximately satisfies (183). It seems that there are in fact many choices of how to do this, although the naive choice of letting ${}^\omega B^\pm$ satisfy (183) directly by solving the appropriate transport equations$^6$ leads to group elements with poor regularity properties. Therefore, the procedure for arriving at the correct choice deserves some motivation.
The heart of the matter is two-fold. First and foremost, we need to come up with a construction that gives us explicit formulas so that we may perform certain standard calculations on the integral
 184 . In particular, we will need to perform integration by parts with respect to the variable ω   . Since G   is assumed to be non-abelian, and since we will not be able to localize things to a neighborhood of any fixed point on the group7 , this is actually a non-trivial matter. For example, it is not possible to do this directly through a use of the exponential map because we would run into trouble with conjugate points.
Secondly, we will need to replace the transport equation which defines the naive pure null-gauge transformation, with something that has more “elliptic” features.
That such a choice is possible is, strangely enough, determined by the fact that the connection { A ̲ 1 }   is not arbitrary, but instead evolves according to a hyperbolic equation. This is taken into account by condition  126e . This kind of structure seems to be ubiquitous in geometric wave equations, both semi and quasi-linear, and the observation that it makes the crucial difference goes back to work of Klainerman-Rodnianski on quasi-linear wave equations [5]. The particular form we will use it in here is almost identical to that of [8], but since everything we do is non-abelian, the derivation will seem a bit different at first.
The first observation we use is that just like the Cronström gauge, the null-gauge allows one to recover the potentials directly from the curvature. However, since we aim to derive a (sub)-elliptic equation, we do not do this by simply integrating along null directions. Instead, we write:
ω L ω B α ± = F ω B ± ( ω L , α ) . (186)
Making now the approximate assumption that the { ω B ± }   are simply a solution to the scalar wave equation = α α   , which we write as:
= ω L ± ω L + Δ ω , (187)
the identity  186 can be written in the integral form:
ω B α ± = ω L ± Δ ω 1 F ω B ± ( ω L , α ) . (188)
Here Δ ω = Δ ω 2   is the Laplacean on the plane perpendicular to the ω   direction in R n   . We would now like to make  188 our “choice” for the gauge transformed connection on the right hand side of  185 . For example, even though it was based on the approximate assumption that the { ω B ± }   satisfy the scalar wave equation, it still respects the null-gauge  183 simply by the skew-symmetry property of the curvature. Unfortunately,  188 has several undesirable features. Firstly, we would like an expression which involves the curvature of { A ̲ 1 }   , not the curvature F ω B ±   .
Secondly, the sub-Laplacean on the right hand side of this expression needs to be smoothed out in some way so that its dependence on the angular variable ω   is not so rough.
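As a sanity check on the algebra behind  187 and  188 , the following sketch works under assumed normalizations that may differ from the ones used in this paper by signs: the metric signature $(-,+,\ldots,+)$, the null vector fields ${}^\omega\! L^\pm = \partial_t \pm \partial_\omega$, and the shorthand $\partial_\omega = \omega\cdot\nabla_x$. All three are assumptions made only for this illustration, and $B$, $F$ stand for generic scalar functions.

```latex
% Assumed conventions (illustrative only; signs may differ from the paper):
%   \Box = \partial^\alpha\partial_\alpha = -\partial_t^2 + \Delta_x,
%   {}^\omega L^\pm = \partial_t \pm \partial_\omega,  \partial_\omega = \omega\cdot\nabla_x.
\begin{align*}
  {}^\omega\! L^\pm\,{}^\omega\! L^\mp &= \partial_t^2 - \partial_\omega^2, \\
  \Delta_x &= \partial_\omega^2 + \Delta_{\omega^\perp}, \\
  \Box &= -\partial_t^2 + \partial_\omega^2 + \Delta_{\omega^\perp}
        = -\,{}^\omega\! L^\pm\,{}^\omega\! L^\mp + \Delta_{\omega^\perp}.
\end{align*}
% Hence, if \Box B = 0 and {}^\omega L^\mp B = F, then formally:
\begin{align*}
  \Delta_{\omega^\perp} B
    = {}^\omega\! L^\pm\,{}^\omega\! L^\mp B
    = {}^\omega\! L^\pm F
  \quad\Longrightarrow\quad
  B = {}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1} F .
\end{align*}
```

The inversion of $\Delta_{\omega^\perp}$ is only formal here, since its symbol degenerates on frequencies parallel to $\omega$; this is the rough angular dependence that the mollification in the text is meant to tame.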
To get around the first of these problems, we simply pretend that the various differential operators on the right hand side of  188 are gauge covariant. Assuming this and then conjugating both sides of that expression by ω g ±   , moving these group elements past the differential operators on the right, and throwing away quadratic terms from the curvature while assuming that the reduced connection A ̲ 1   satisfies the usual homogeneous wave equation, we are left with the approximate identities:
ω g ± 1 ω B α ± ω g ± ω L ± Δ ω 1 F A ̲ 1 ( ω L , α ) ,
( A ̲ 1 ) α + α ω L ± Δ ω 1 A ̲ 1 ( ω L ) .
To get around the second problem, we mollify the angular variable of the second term on the right hand side of this last expression. Doing this and looking back on the definition  185 , we see that we would like our group elements to be such that:
ω g ± 1 d ( ω g ± ) ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) . (189)
Here we have set:
0 < γ δ 1 , (190)
where γ   is our small all purpose constant from line  12 above. Now the problem is, of course, that the right hand side of the above formula does not in general represent a flat connection. However, as one can see immediately, its curvature is small in some sense because it is a quadratic expression. At this point, the problem now looks essentially like what happens for wave-maps8 (see e.g. [11] and [9]). In particular, it is clear that the right way to define the group elements ω g ±   so that the approximate formula  189 holds is to flatten out the right hand side of that expression as much as possible by using the potential version  3.2 of the Uhlenbeck lemma. Therefore, what we need to do is to show the fixed time estimate:
ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) L n , (191)
and then assume that   is chosen small enough so that we may use it as the constant in  22 . Because of its utility in the sequel, we will in fact prove the more general estimate:
ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) , (192)
where p γ   is a dimension dependent Lebesgue index which we set to:
p γ = 2 ( n 1 ) n 3 2 γ . (193)
Here 0 < γ 1   is again the all-purpose constant which we have fixed in section  2 to be small enough so that it is compatible with its use here. Notice that  192 implies the estimate  191 thanks to the embedding  43 and the fact that for γ   sufficiently small there is plenty of room in the inequality p γ < n   .
Now, because the norm B ˙ 2 , 10 n p γ , ( 2 , n 2 2 )   is 2   based, by orthogonality and the L ( H ˙ n 2 2 )   estimate contained in the bootstrapping assumption  126d , we see that in order to conclude  192 it is enough to show the fixed frequency estimate (note that there are no high frequencies here): x ω L ± Δ ω 1 ( A ̲ 1 ) μ ( ω ) L p γ μ n ( 1 2 1 p γ ) ( A ̲ 1 ) μ L 2 .   Decomposing the spatial frequency variable into fixed dyadic angular sectors spread from the direction ω   : P μ = θ ω Π θ P μ   , this estimate further reduces (after dyadic summing) to being able to prove that:
ω Π θ x ω L ± Δ ω 1 ( A ̲ 1 ) μ ( ω ) L p γ θ γ μ n ( 1 2 1 p γ ) ( A ̲ 1 ) μ L 2 . (194)
We are now almost at the point where we can apply the angular Bernstein inequality  56 directly, because in the current localized setting we have the symbol bounds:
ω Π θ x ω L ± Δ ω 1 S | τ | | ξ | P μ θ 2 P μ , (195)
where we are enforcing the heuristic notation introduced on line  58 . However, since Bernstein only nets us a savings of: θ ( n 1 ) ( 1 2 1 p γ ) = θ 1 + γ ,   in this context, we need to be a bit more careful in order to gain an extra power of θ   . This is provided by the fact that the potentials { A ̲ 1 }   are in the Coulomb gauge. Notice that if say, 1 10 < θ   there is nothing to worry about and we have estimate  194 without any problem. On the other-hand, if it is the case that θ < 1 10   , then we can use the fact that ω Π θ ω 1   is elliptic (in terms of symbol bounds) in conjunction with the gauge condition d * A ̲ 1 = 0   to write:
ω Π θ A ̲ 1 ( ω ) = ω 1 ω Π θ / d * / A ̲ 1 θ ω Π θ / A ̲ 1 . (196)
Here { / A ̲ 1 }   the induced connection (angular portion) on the hyperplane ω   perpendicular to ω   , and / d *   is the associated divergence. We note here that this identity will turn out to be very useful and will be used many times throughout the sequel. With these extra savings in mind, an application of Bernstein now directly yields the desired estimate  194 .
We have now constructed the infinitesimal group elements ω g ±   in equations  185 , which are explicitly defined by the formulas  31 in Lemma  3.2 applied to the connection:
ω A ̲ ± = ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) . (197)
This has the pleasant effect that we will never need to explicitly refer to the connection { ω B ± }   in line  185 . We can calculate the conjugated right hand side of that expression to be:
ω g ± 1 ω B ± ω g ± = A ̲ 1 ω C ± , (198)
where we have set:
ω g ± 1 d ( ω g ± ) = ω C ± . (199)
Using the formulas  31 , we have the following expressions for the spatial components { ω C ̲ ± }   :
( ω C ̲ ± ) d f = d * Δ 1 [ ω C ̲ ± , ω C ̲ ± ] , (200a)
( ω C ̲ ± ) c f = ω A ̲ ± x Δ 1 [ ω A ̲ ± , ω C ̲ ± ] . (200b)
In order to compute a formula for the temporal potential ω C 0 ±   , we simply use the fact that F ω C ± = 0   and the formula  200b which together imply (by computing d * E ω C ±   ):
ω C 0 ± = ω A 0 ± t Δ 1 [ ω A ̲ ± , ω C ̲ ± ] d * Δ 1 [ ω C 0 ± , ω C ̲ ± ] , (201)
where we have: ω A 0 ± = ω Π ¯ ( 1 2 δ ) t ω L ± Δ ω 1 A ̲ 1 ( ω ) .   We remark here that the importance of the system of equations  200a  201 is that they give the following decomposition of the infinitesimal gauge transformation { ω C ± }   :
ω C ± = t , x ω Π ¯ ( 1 2 δ ) ω L ± Δ ω 1 A ̲ 1 ( ω ) + { Q u a d r a t i c E r r o r } . (202)
The linear term in the above expression is enough to kill off the worst error term when differentiating the parametrix  184 . It should be noted that this linear term is precisely what one gets more directly in the abelian case studied in [8]. We should also point out here that the quadratic error on the right hand side of  202 above is much more delicate than the quadratic error resulting from the cancellation involving the linear term in this expression. In order to control this, we will need the full force of the orthogonality properties of our parametrix, which are contained in the bootstrapping assumption  126d , as well as some rather technical function spaces and multilinear estimates which we will develop in Section  11 .
To close out this section, we apply the truncated covariant wave operator A ̲ 1   to the parametrix  184 and record the various error terms which result. We gather this together in the following proposition:
Proposition 8.1 (Error terms for the differentiated parametrix). Consider the parametrix Φ ± ( f ^ )   defined by the formula  184 , with infinitesimal gauge transformations given by equations  200a  201 . Then one has the identity:
A ̲ 1 Φ ± ( f ^ ) (203)
= 4 π i R n e 2 π i λ ω u ± [ A ̲ 1 ( ω L ) ω C ± ( ω L ) , ω g ± 1 f ^ ( λ ω ) ω g ± ] χ ( 2 1 , 2 ) ( λ ) λ n d λ d ω
R n e 2 π i λ ω u ± [ D α A ̲ 1 ( ω C ± ) α , ω g ± 1 f ^ ( λ ω ) ω g ± ] χ ( 2 1 , 2 ) ( λ ) λ n 1 d λ d ω
+ R n e 2 π i λ ω u ± [ A ̲ α 1 ( ω C ± ) α , [ ( A ̲ 1 ) α ω C α ± , ω g ± 1 f ^ ( λ ω ) ω g ± ] ] χ ( 2 1 , 2 ) ( λ ) λ n 1 d λ d ω .
Remark 8.2. The worst error term in the expression  203 is of course the “derivative fall on high” term which is the first on the right hand side. However, using the structure equation  126e , this takes the form:
A ̲ 1 ( ω L ) ω C ± ( ω L ) , (204)
= A ̲ 1 ( ω ) + ω Π ( 1 2 δ ) ω L ω L ± Δ ω 1 A ̲ 1 ( ω ) + { Q u a d r a t i c E r r o r } ,
= ( I ω Π ( 1 2 δ ) ) A ̲ 1 ( ω ) + { Q u a d r a t i c E r r o r } .
The key observation now is that since the operator ( I ω Π ( 1 2 δ ) )   cuts off on such a small angular sector with respect to the spatial frequency, an application of Bernstein's inequality gains enough extra spatial derivatives to put this term in the mixed Lebesgue space L 2 ( L n 1 )   . Furthermore, the quadratic error term which is left over involves enough bilinear interactions to go in L 1 ( L )   . So in this sense, as we have mentioned before, the problem reduces to something which is reminiscent of wave-maps. Of course, there is a somewhat heavy price to pay for this “renormalization”, which is that it must take place under an integral sign. Finally, it is worth pointing out that this top order cancellation is completely analogous to what happens in the abelian case [8].

5 For those who are familiar with this kind of problem, this is precisely a reduction to the famous L o w × H i g h   frequency interaction A ̲ α 1 α Φ 1   .

6 This would end up being the usual frequency based Hadamard parametrix for the operator A ̲ 1   .

7 This is an artifact of the critical nature of the problem. Specifically, the group elements have the heuristic form ω g = e x p ( 1 ω A )   . Since we do not have L   control on 1 ω A   we cannot localize its image.

8 It is very much our philosophy here that this problem is essentially equivalent to wave-maps after a microlocalization. Of course, as the reader will see, this microlocalization is quite costly and introduces many objects that are not present in the original wave-maps problem.

9 Fixed Time L 2   Estimates for the Parametrix

We now begin our proof of the estimates  178 for the integral operator  184 introduced in the last section. Here we cover bounds which are of non-differentiated energy type. Specifically, we will show the undifferentiated L ( L 2 )   estimate contained in  178a , as well as the multiplier-approximation bound  178d . Both of these will follow from the same set of estimates. At a heuristic level, they are not much more involved than a standard T T *   argument followed by some integration by parts, although the details turn out to be a bit involved. Things will be computed more or less directly by an appeal to the explicit equations  200a  201 , taking a little bit of care to use them properly. This will be done by considering them as “path lifting” formulas from Minkowski space n   to the compact group G   .
This allows us to employ an integral form of the intermediate value theorem from elementary calculus which is valid in the context of Lie groups. It turns out that this identity can be differentiated as many times as necessary with respect to the angular frequency variable, although this fact is provided through a surprisingly delicate bootstrapping argument. Here the unitarity of the group is needed in a crucial way to keep everything from collapsing. Once the bootstrapping is complete, the estimates themselves will be proved using a “trace-Bernstein” type inequality that we construct by hand using various multipliers. Once the integration by parts portion of things is taken care of, we will close the $L^2$ estimate by showing that a “non-smooth” remainder kernel has small amplitudes after integration in the angular frequency variable. This involves some fairly technical bilinear estimates because the necessary orthogonality arguments are difficult to pass through Hodge systems. The details of these procedures are as follows.
Throughout this section we will replace the specific cutoff function $\chi_{(\frac{1}{2},2)}$ appearing in the definition of parametrix  184 with an arbitrary smooth scalar bump function $\chi(\xi)$ that we may assume to be supported in the frequency annulus $\{\frac{1}{4} < |\xi| < 4\}$.
At fixed time $t_0$, we define the operator $T(\hat f) = \Phi(\hat f)(t_0)$, where we have suppressed the $\pm$ notation because it will be irrelevant for what we do here. Our first goal is to prove the bound:
$$\|T(\hat f)\|_{L^2} \lesssim \|\hat f\|_{L^2} . \tag{209}$$
Squaring this, it suffices to show that (here f   has no relation to f ^   and simply represents a function of the physical-space variables):
$$\|T T^*(f)\|_{L^2} \lesssim \|f\|_{L^2} , \tag{210}$$
where the adjoint T *   is taken with respect to the Killing form  13 . A quick calculation of the kernel of this operator shows that:
$$K^{TT^*}(x,y) = \int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\xi}\ {}^\omega\! g^{-1}(x)\,{}^\omega\! g(y)\,[\,\cdot\,]\,{}^\omega\! g^{-1}(y)\,{}^\omega\! g(x)\ \chi(\xi)\, d\xi , \tag{211}$$
where we use the $[\,\cdot\,]$ notation to emphasize the fact that this operator acts via conjugation. Our task is now to show the estimates:
$$\|K^{TT^*}\|_{L^\infty_y(L^1_x)}\ ,\ \ \|K^{TT^*}\|_{L^\infty_x(L^1_y)} \lesssim 1 . \tag{212}$$
Since K T T *   is essentially symmetric in ( x , y )   , we may concentrate on the first such estimate.
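For orientation, the passage from  210 to the kernel bounds  212 is the classical Schur test. The sketch below records it for a scalar kernel $K$, with hypothetical constants $A$ and $B$ naming the two mixed norms; for the conjugation-valued kernel above one would read $|K|$ as the operator norm of the matrix action.

```latex
% Schur test, scalar version.  A and B are names introduced only for this sketch:
%   A = \sup_y \int |K(x,y)|\,dx ,   B = \sup_x \int |K(x,y)|\,dy .
\begin{align*}
  |Tf(x)| &\leq \int |K(x,y)|\,|f(y)|\,dy
    \leq \Big(\int |K(x,y)|\,dy\Big)^{\frac12}
         \Big(\int |K(x,y)|\,|f(y)|^2\,dy\Big)^{\frac12}, \\
  \|Tf\|_{L^2}^2 &\leq B \int\!\!\int |K(x,y)|\,|f(y)|^2\,dy\,dx
    \leq A\,B\,\|f\|_{L^2}^2 .
\end{align*}
```

In particular, if both mixed norms are $\lesssim 1$ the operator is bounded on $L^2$, which is why only the $L^\infty(L^1)$ norms of the kernel need to be controlled.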
To proceed, we first decompose the product physical space R n × R n   into the dyadic regions:
$$D_\sigma = \big\{\, |x-y| \sim \sigma \ \big|\ \sigma = 2^i ,\ i \in \mathbb{N} \,\big\} . \tag{213}$$
We then decompose the $TT^*$ kernel into the dyadic sum: $K^{TT^*} = \sum_\sigma \chi_{D_\sigma} K^{TT^*} = \sum_\sigma K_\sigma^{TT^*}$. By dyadic summing, to show  212 it suffices to be able to show the single estimate:
$$\|K_\sigma^{TT^*}\|_{L^\infty_y(L^1_x)} \lesssim \sigma^{-\gamma} , \tag{214}$$
where $0 < \gamma \ll 1$ now represents a small savings in physical space decay. Now  214 would be easy to show if we had the absolute decay estimate: $|K_\sigma^{TT^*}(x,y)| \lesssim |x-y|^{-(n+\gamma)}$, and this is almost true. Unfortunately, there is a regularity problem due to the degeneracy of the sub-Laplacean $\Delta_{\omega^\perp}$ used in the connection  200 which provides the group elements ${}^\omega\! g$. This forces us to write the kernel $K_\sigma^{TT^*}$ as a sum of two terms:
$$K_\sigma^{TT^*} = \widetilde{K}_\sigma^{TT^*} + \mathcal{R}_\sigma^{TT^*} . \tag{215}$$
We will then prove that both:
$$|\widetilde{K}_\sigma^{TT^*}(x,y)| \lesssim |x-y|^{-(n+\gamma)} , \tag{216}$$
$$\|\mathcal{R}_\sigma^{TT^*}\|_{L^\infty_y(L^1_x)} \lesssim \sigma^{-\gamma} . \tag{217}$$
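To make the role of the exponent $n+\gamma$ explicit, here is a short bookkeeping sketch (implicit constants suppressed) of how a pointwise bound of the type  216 produces the dyadic decay in  214 , and why that decay is summable over $\sigma = 2^i$.

```latex
% On the region |x-y| \sim \sigma the volume in x (for fixed y) is \sim \sigma^n, so:
\[
  \sup_y \int_{|x-y|\sim\sigma} |x-y|^{-(n+\gamma)}\,dx
  \;\lesssim\; \sigma^{-(n+\gamma)}\cdot\sigma^{n}
  \;=\; \sigma^{-\gamma}.
\]
% Summation over the dyadic scales \sigma = 2^i, i \in \mathbb{N}, is a geometric series:
\[
  \sum_{i \geq 0} 2^{-i\gamma} \;=\; \frac{1}{1-2^{-\gamma}} \;<\; \infty
  \qquad (\gamma > 0).
\]
```

The sum is finite for every fixed $\gamma > 0$, although its value grows like $\gamma^{-1}$ as $\gamma \to 0$.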
To define the splitting  215 , we factor the group elements ω g   into a product of smooth and small parts. This is completely analogous to the procedure used in [8], but since things are non-abelian (and hence non-linear) here, the estimates required are quite a bit more involved. What we will do is construct another gauge transformation ω g ~   , which is based on a further smoothing of the connection  197 .
This will produce a group element which can be treated as a standard symbol. To this end, we define the scale mollified connection:
ω A ̲ ( σ ) ~ = ω Π ¯ σ 1 + γ < ω Π ¯ ( 1 2 δ ) x ω L Δ ω 1 A ̲ 1 ( ω ) , (218)
where γ   is, again, the small dimensional constant from line  190 . Again, we have dropped the ±   notation because it is irrelevant. Following the proof of  191 , and using the fact that the multipliers ω Π ¯ σ 1 + γ <   are bounded on frequency localized Lebesgue spaces, we may apply Lemma  3.2 to the connection { ω A ̲ ( σ ) ~ }   .
This produces a group element ω g ~   , which is defined by the infinitesimal generator:
ω g ~ 1 d ( ω g ~ ) = ω C ̲ ~ . (219)
Furthermore, this generator is itself defined via the Hodge system:
( ω C ̲ ~ ) d f = d * Δ 1 [ ω C ̲ ~ , ω C ̲ ~ ] , (220a)
( ω C ̲ ~ ) c f = ω A ̲ ( σ ) ~ x Δ 1 [ ω A ̲ ( σ ) ~ , ω C ̲ ~ ] . (220b)
Using this new group element ω g ~   , we define the remainder group element ω h   via the product:
ω g = ω h ω g ~ . (221)
To compute the infinitesimal generator of ω h   , we first use the identity:
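$$\begin{aligned} d({}^\omega\! h) &= d({}^\omega\! g)\,{}^\omega\!\tilde{g}^{-1} + {}^\omega\! g\ d({}^\omega\!\tilde{g}^{-1}) , \\ &= {}^\omega\! h\ {}^\omega\!\tilde{g}\,\big({}^\omega\underline{C} - {}^\omega\underline{\widetilde{C}}\big)\,{}^\omega\!\tilde{g}^{-1} . \end{aligned} \tag{222}$$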
This leads us to define the difference connection:
ω C ̲ ~ ~ = ω C ̲ ω C ̲ ~ . (223)
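The computation behind  222 is short enough to spell out. The sketch below drops the $\omega$ superscripts and underlines for readability and assumes the sign convention $g^{-1}d(g) = C$, $\tilde g^{-1}d(\tilde g) = \widetilde C$ suggested by  199 and  219 ; with the opposite convention the final expression changes by an overall sign.

```latex
% Assumed conventions for this sketch: g^{-1} d(g) = C, \tilde g^{-1} d(\tilde g) = \widetilde C,
% and h = g\,\tilde g^{-1} as in (221).
\begin{align*}
  d(\tilde g^{-1}) &= -\,\tilde g^{-1}\,d(\tilde g)\,\tilde g^{-1}
                    = -\,\widetilde C\,\tilde g^{-1}, \\
  d(h) &= d(g)\,\tilde g^{-1} + g\,d(\tilde g^{-1})
        = g\,C\,\tilde g^{-1} - g\,\widetilde C\,\tilde g^{-1} \\
       &= g\,(C-\widetilde C)\,\tilde g^{-1}
        = h\,\tilde g\,(C-\widetilde C)\,\tilde g^{-1}.
\end{align*}
```

Equivalently, $h^{-1}d(h)$ is the difference connection conjugated by $\tilde g$, which is why smallness of the difference connection translates into $h$ staying close to a constant.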
A quick calculation using the systems  200 and  220 shows that this new connection can be pinned down via the Hodge system:
( ω C ̲ ~ ~ ) d f = d * Δ 1 ( [ ω C ̲ ~ , ω C ̲ ~ ~ ] + [ ω C ̲ ~ ~ , ω C ̲ ~ ] ) , (224a)
( ω C ̲ ~ ~ ) c f = ω A ̲ ~ ω A ̲ ( σ ) ~ x Δ 1 ( [ ω A ̲ ~ ω A ̲ ( σ ) ~ , ω C ̲ ~ ] + [ ω A ̲ ( σ ) ~ , ω C ̲ ~ ~ ] ) , (224b)
where a simple computation shows that:
ω A ̲ ω A ̲ ( σ ) ~ = ω Π ¯ σ 1 + γ ω Π ¯ ( 1 2 δ ) x ω L Δ ω 1 A ̲ 1 ( ω ) , (225)
We now define the decomposition  215 along the following decompositions of the group element products in the kernel  211 :
ω g 1 ( x ) ω g ( y ) = ω g ~ 1 ( x ) ω g ~ ( y ) + ω g ~ 1 ( x ) ( ω h 1 ( x ) ω h ( y ) I ) ω g ~ ( y ) , (226)
ω g 1 ( y ) ω g ( x ) = ω g ~ 1 ( y ) ω g ~ ( x ) + ω g ~ 1 ( y ) ( ω h 1 ( y ) ω h ( x ) I ) ω g ~ ( x ) . (227)
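The identities  226  227 follow in one line from the factorization  221 . The sketch below writes this out for the first of them, again dropping the $\omega$ superscripts as a notational shortcut used only here.

```latex
% Uses only g = h\,\tilde g, cf. (221).
\begin{align*}
  g^{-1}(x)\,g(y)
    &= \tilde g^{-1}(x)\,h^{-1}(x)\,h(y)\,\tilde g(y) \\
    &= \tilde g^{-1}(x)\,\tilde g(y)
       + \tilde g^{-1}(x)\,\big(h^{-1}(x)\,h(y) - I\big)\,\tilde g(y).
\end{align*}
```

Exchanging the roles of $x$ and $y$ gives the second identity.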
Accordingly, we define:
K ~ T T * ( x , y ) = R n e 2 π i ( x y ) ξ ω g ~ 1 ( x ) ω g ~ ( y ) [ ] ω g ~ 1 ( y ) ω g ~ ( x ) χ ( ξ ) d ξ , (228)
and then define σ T T *   according to the formula  215 . The idea now is that while one can only perform integration by parts in the kernel  228 above, the group element ω h 1 ( x ) ω h ( y )   and its inverse, which must be contained as at least one factor in the remainder, are so close to the identity matrix that the resulting difference expression can be estimated without use of the oscillations which take place under the integral sign.
We now begin our proof of the estimate  216 . To do this, we simply integrate by parts as many times as necessary with respect to the variable $\xi$ in order to pick up the needed point-wise decay. Doing this, we see that in order to draw our conclusion, it suffices to show the following symbol bounds for $1 \leq k$:
$$\big\|\chi_{D_\sigma}\nabla_\xi^k\big({}^\omega\!\tilde{g}^{-1}(x)\,{}^\omega\!\tilde{g}(y)\big)\big\| \lesssim \sigma^{k(1-\gamma)} , \tag{229}$$
$$\big\|\chi_{D_\sigma}\nabla_\xi^k\big({}^\omega\!\tilde{g}^{-1}(y)\,{}^\omega\!\tilde{g}(x)\big)\big\| \lesssim \sigma^{k(1-\gamma)} . \tag{230}$$
In fact, we shall prove the following more general bounds, which contain  229  230 as a special case, and which will be useful in the sequel:
Proposition 9.1 (Symbol bounds for the smoothed amplitudes ω g ~ 1 ( t , x ) ω g ~ ( s , y )   and ω g ~ 1 ( s , y ) ω g ~ ( t , x )   ). Let the group elements ω g ~   be defined infinitesimally by the Hodge system  220 , where the parameter σ 1 + γ   is replaced by M 1   , where M   lies in the range:
$$(|t-s| + |x-y|)^{\frac{1}{2}} \leq M \leq |t-s| + |x-y| . \tag{231}$$
Then for any integer 1 k   , one has the following symbol bounds assuming that the bootstrapping constant   from line  126d is chosen sufficiently small (with respect to each fixed k   ):
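$$\big\|\nabla_\xi^k\big({}^\omega\!\tilde{g}^{-1}(t,x)\,{}^\omega\!\tilde{g}(s,y)\big)\big\| \lesssim M^k , \tag{232}$$
$$\big\|\nabla_\xi^k\big({}^\omega\!\tilde{g}^{-1}(s,y)\,{}^\omega\!\tilde{g}(t,x)\big)\big\| \lesssim M^k . \tag{233}$$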
Here the ξ k   notation is shorthand for all k t h   order partial derivatives involving the variable ξ   , and   is the standard matrix vector-norm from line  15 . The implicit constants on the right hand side depend on k   , but are uniform in the parameter M   for each fixed k   .
We now proceed to prove the second main estimate  217 for remainder kernel in the splitting  215 . This involves a sum of kernels, each of which according to the identities  226  227 has at least one copy of the terms ω h 1 ( x ) ω h ( y ) I   and ω h ( x ) ω h 1 ( y ) I   . Therefore, without loss of generality, we may assume that we are trying to prove the estimate:
R x n χ D σ ( x ) R ξ n e 2 π i ( x y ) ξ ω G ( x , y ) χ ( ξ ) d ξ d x σ γ . (302)
where we have set: ω G ( x , y ) = ω g ~ 1 ( x ) ( ω h 1 ( x ) ω h ( y ) I ) ω g ~ ( y ) [ ] ω g 1 ( y ) ω g ( x ) .   We note here that the corresponding estimates for the other terms in σ T T *   are similar and are left to the reader.
To prove  302 , we use the following angular cutoff functions to split: ${}^\omega G = \chi_{|\cos(\theta_{\xi,x-y})| \geq |x-y|^{-1+\gamma}}\ {}^\omega G + \chi_{|\cos(\theta_{\xi,x-y})| < |x-y|^{-1+\gamma}}\ {}^\omega G$. Therefore, using the triangle and Minkowski inequalities, we see that it suffices to prove the pair of bounds:
R x n χ D σ ( x ) R ξ n e 2 π i ( x y ) ξ χ | cos ( θ ξ , x y ) | | x y | 1 + γ ω G ( x , y ) χ ( ξ ) d ξ d x σ γ , (303)
R x n χ D σ ( x ) R ξ n e 2 π i ( x y ) ξ χ | cos ( θ ξ , x y ) | < | x y | 1 + γ ω G ( x , y ) χ ( ξ ) d ξ d x σ γ . (304)
The proof of the first estimate,  303 , is a simple matter of integrating by parts as many times as necessary with respect to the weighted radial derivative 1 2 π i | x y | cos ( θ ξ , x y ) | ξ |   , taking account of the fact that ω G   is independent of the variable | ξ |   . Assuming that | x y | σ   is sufficiently large, we will eventually have that:
| ( 1 2 π i | x y | cos ( θ ξ , x y ) | ξ | ) k χ ( ξ ) | σ n γ , (305)
at which point we may stop the integration by parts and put absolute value signs around the remaining integral. The right hand side of  303 will then follow as a direct consequence of  305 and the simple bounds:
$$\int_{\mathbb{R}^n_x} \chi_{D_\sigma}(x)\, dx \lesssim \sigma^n ,$$
$$\sup_{x,\omega} \|{}^\omega G(x,y)\| \lesssim 1 .$$
To conclude the proof of estimate  302 , we need to show the second estimate  304 above. At this point, we have stripped things down to where oscillations under the integral sign are no longer of any use, so we simply strive to estimate the absolute value of the integrand. Here the smallness of the function ω G ( x , y )   is essential. To make use of this, we rearrange the order in the absolute integral and use Hölder's inequality to bound:
 304  (306) ( L . H . S . ) S n 1 sup x D σ ω G ( x , y ) d ω sup ω R x n χ | cos ( θ ξ , x y ) | < | x y | 1 + γ ( x ) χ D σ ( x ) d x .
To bound the second integral on the right hand side of the above product, we translate by the vector y   and then apply a rotation to reduce the bound we wish to show to the following:
$$\int_{|x| \sim \sigma} \chi_{|\cos(\theta_{(1,0),x})| < |x|^{-1+\gamma}}(x)\ dx \lesssim \sigma^{n-1+\gamma} . \tag{307}$$
The validity of  307 follows trivially from the fact that if we split $x = (x_1, x')$, we have the bounds $|x_1| \lesssim \sigma^\gamma$ and $|x'| \lesssim \sigma$ over the range of integration thanks to the angular cutoff and the identity: $\cos(\theta_{(1,0),x}) = \frac{x_1}{|x|}$. Thus, keeping in mind the bound  307 , we see from estimate  306 that the proof of  304 follows from a Cauchy-Schwarz on the sphere $S^{n-1}$ and the following integrated bounds:
$$\Big( \int_{S^{n-1}} \sup_{x \in D_\sigma} \|{}^\omega G(x,y)\|^2\ d\omega \Big)^{\frac{1}{2}} \lesssim \sigma^{1-n-2\gamma} . \tag{308}$$
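The exponents in  307 and  308 are matched so that their product is exactly the decay required in  304 . A short bookkeeping sketch, writing $x = (x_1, x')$ as in the text and suppressing implicit constants:

```latex
% Angular slab in (307): the cutoff |cos| < |x|^{-1+\gamma} with cos = x_1/|x|
% is the condition |x_1| < |x|^{\gamma}, so on |x| \sim \sigma one has
% |x_1| \lesssim \sigma^{\gamma} and |x'| \lesssim \sigma.
\[
  \int_{|x|\sim\sigma} \chi_{\{|x_1| < |x|^{\gamma}\}}(x)\,dx
  \;\lesssim\; \sigma^{\gamma}\cdot\sigma^{\,n-1}
  \;=\; \sigma^{\,n-1+\gamma}.
\]
% Cauchy-Schwarz on S^{n-1} (finite measure) bounds the L^1_\omega norm in (306)
% by the L^2_\omega norm in (308); multiplying by the slab volume gives
\[
  \sigma^{\,1-n-2\gamma}\cdot\sigma^{\,n-1+\gamma} \;=\; \sigma^{-\gamma}.
\]
```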
Due to its use in the next section, we will in fact show the following more general set of estimates which includes  308 as a special case:
Proposition 9.4 (Estimates for integrated remainder group elements ). Let the group elements ω h   be defined infinitesimally via the equations  222  223 and the Hodge system  224 , where the parameter σ 1 + γ   is replaced by M 1   . Then upon integration, one has the following bounds:
( S n 1 sup | x y | N ω h 1 ( t , x ) ω h ( s , y ) I 2 d ω ) 1 2 ( 1 + | t s | + N ) M n δ , (309)
( S n 1 sup | x y | N ω h ( t , x ) ω h 1 ( s , y ) I 2 d ω ) 1 2 ( 1 + | t s | + N ) M n δ , (310)
where   is the bootstrapping constant from line  126d . The above estimates are uniform in the value of M   when it is sufficiently large.

9.1 Proof of the Accuracy estimate  178d 

We will now give a short proof of the multiplier equivalence bound  178d . This will follow almost directly from the estimates we have already shown. We compute the kernel of the operator Φ ( 0 ) ( ( 2 π | ξ | ) α ( Φ ( 0 ) ) * ) ( Δ ) α 2 P 1   to be (again suppressing ±   notations):
(364) K α ( x , y ) = R n e 2 π i ( x y ) ξ ω g 1 ( x ) ω g ( y ) [ ] ω g 1 ( y ) ω g ( x ) χ α ( ξ ) d ξ R n e 2 π i ( x y ) ξ [ ] χ α ( ξ ) d ξ ,
where χ α ( ξ ) = ( 2 π | ξ | ) α χ ( 1 2 , 2 ) ( ξ )   . Notice that this cutoff function satisfies the general requirements of the generic bump function χ   used throughout this section.
In particular, there exist constants C k   which depend only on α   and the original χ ( 1 2 , 2 )   such that:
R n | ξ k χ α ( ξ ) | d ξ C k . (365)
We now decompose the kernel K α = σ K σ α   according to the dyadic physical space decomposition  213 . For each fixed value of the small constant   on line  178d we write this sum in terms of two pieces, a “close” part and a “far” part:
K α = K 1 2 ( n + 1 ) α + K 1 2 ( n + 1 ) < α , (366)
= σ : σ 1 2 ( n + 1 ) K σ α + σ : 1 2 ( n + 1 ) < σ K σ α .
To estimate the near portion of things, we do a little algebraic manipulation and write the kernel as:
K 1 2 ( n + 1 ) α = χ D 1 2 ( n + 1 ) ( R n e 2 π i ( x y ) ξ ( ω g 1 ( x ) ω g ( y ) I ) [ ] ω g 1 ( y ) ω g ( x ) χ α ( ξ ) d ξ + R n e 2 π i ( x y ) ξ [ ] ( ω g 1 ( y ) ω g ( x ) I ) χ α ( ξ ) d ξ ) .  
By a direct application of the pair of integrated bounds  309  310 (with M 1   ) this last expression gives us the absolute kernel bound: | K 1 2 ( n + 1 ) α ( x , y ) | ( 1 + | x y | ) χ D 1 2 ( n + 1 ) ( | x y | ) .   By integrating the right hand side of this last inequality we easily arrive at the pair of Schur-test bounds:
K 1 2 ( n + 1 ) α L y ( L x 1 ) , K 1 2 ( n + 1 ) α L x ( L y 1 ) 1 2 . (367)
To estimate the second kernel on the right hand side of  366 , we do things separately for each term in the sum  364 . For the second term, which does not contain the group elements, a simple application of the estimate  365 and integration by parts shows that one has the absolute bounds:
| χ D 1 2 ( n + 1 ) < ( | x y | ) R n e 2 π i ( x y ) ξ [ ] χ α ( ξ ) d ξ | , (368)
χ D 1 2 ( n + 1 ) < ( | x y | ) ( 1 + | x y | ) 2 ( n + 1 ) ,
1 2 ( 1 + | x y | ) ( n + 1 ) .
This easily yields Schur-test bounds of the form  367 . Therefore, it remains to prove these bounds for the first integral expression on the right hand side  364 after it has been cut off in the far region 1 2 ( n + 1 ) < | x y |   . This follows at once from writing this kernel as a sum over various dyadic regions, and using the symbol bounds  229  230 as well as the reduction to the integrated estimates  309  310 . The key thing to notice is that there are only two places where we do not pick up the factor of   in the resulting estimates. The first is in the main integration by parts argument when the derivatives ξ k   all fall on the cutoff function χ α   . In that case we can simply use the compactness of the group elements and proceed in a way that is analogous to the computation which started on line  368 above. The second place is where we estimate the integral  303 . In that case we can easily upgrade the bound  305 to have the factor σ 2 ( n + 1 )   on the right hand side. We are then essentially in the same situation as was reached starting on line  368 above. This completes our proof of the general multiplier approximation estimate  178d .

10 The Dispersive Estimate

In this section, we complete our proof of the non-microlocalized version of the Strichartz estimates contained in  178a . Using the abstract machinery of [4], these will follow once we can show that the parametrix  184 satisfies a dispersive estimate.
If at fixed time $t$ we write that operator as: $T(t)(\hat f) = \Phi(t)(\hat f)$, where we have suppressed the $\pm$ notation, then we seek to prove the bound (where $f$ has nothing to do with the original $\hat f$, but just represents a function of the physical space variables):
$$\|T(t)T^*(s) f\|_{L^\infty_x} \lesssim (1 + |t-s|)^{-\frac{n-1}{2}}\, \|f\|_{L^1_x} . \tag{369}$$
Now, a calculation similar to that used to produce  211 shows that the kernel of the above operator can be computed to be:
$$K^{TT^*}(t,s;x,y) = \int_{\mathbb{R}^n} e^{2\pi i ((t-s)|\xi| + (x-y)\cdot\xi)}\ {}^\omega\! g^{-1}(t,x)\,{}^\omega\! g(s,y)\,[\,\cdot\,]\,{}^\omega\! g^{-1}(s,y)\,{}^\omega\! g(t,x)\ \chi(\xi)\, d\xi . \tag{370}$$
Therefore, as is usually the case, we see that it suffices to show the fixed time uniform bound:
$$\|K^{TT^*}(t,s;\cdot,\cdot)\|_{L^\infty_{x,y}} \lesssim (1 + |t-s|)^{-\frac{n-1}{2}} . \tag{371}$$
The proof of  371 turns out to be a straightforward consequence of the bounds established in the previous section. The strategy we follow here is almost identical.
We first decompose the
K T T *   kernel into a sum of two pieces: K σ T T * = K ~ T T * + T T * ,   for which we'll show the bound  371 individually. The K ~ T T *   kernel will be smooth enough that we can use a standard stationary phase computation on it.
The remainder kernel
T T *   will be small in absolute value without using any sophisticated integration by parts (although, as in the previous section, there will be some use for oscillations in this term also). As in the previous section, the definition of K ~ T T *   will depend on a physical space scale, in this case the value of ( 1 + | t s | + | x y | )   . This will again be effected by the choice of an auxiliary gauge transformation ω g ~   . This time we define ω g ~   to be the transformation into the Coulomb gauge of the smoothed out potential:
ω A ̲ ( M ) ~ = ω Π ¯ M 1 < ω Π ¯ ( 1 2 δ ) x ω L Δ ω 1 A ̲ 1 ( ω ) , (372)
where we define the scale $M$ to be such that: $M = (1 + |t-s| + |x-y|)^{\frac{1}{2}}$. As before, we use the splitting  226  227 to compute:
(373) K ~ T T * ( t , s ; x , y ) = R n e 2 π i ( ( t s ) | ξ | + ( x y ) ξ ) ω g ~ 1 ( t , x ) ω g ~ ( s , y ) [ ] ω g ~ 1 ( s , y ) ω g ~ ( t , x ) χ ( ξ ) d ξ .
Our first step here is to notice that it suffices to show  371 for the kernel  373 under the condition that | x y | > 1 2 ( 1 + | t s | )   , for if this were not the case then we could simply integrate by parts as many times as necessary with respect to the variable λ = | ξ |   in the expression  373 and easily achieve  371 . Therefore, we will now show that:
K ~ T T * ( t , s ; x , y ) | x y | n 1 2 . (374)
We now factor the phase in  373 as: $e^{2\pi i ((t-s)|\xi| + (x-y)\cdot\xi)} = e^{2\pi i (t-s)\lambda}\, e^{2\pi i \lambda |x-y| \cos(\Theta_{x-y,\omega})}$, where we are using the frequency polar coordinates $\xi = \lambda\omega$. Integrating first on the sphere $S^{n-1}$, we see that to conclude  374 it is enough to show that:
(375) S n 1 e 2 π i λ | x y | cos ( Θ x y , ω ) ω g ~ 1 ( t , x ) ω g ~ ( s , y ) [ ] ω g ~ 1 ( s , y ) ω g ~ ( t , x ) d ω | x y | n 1 2 .
This last estimate will follow easily from the Morse lemma and the already established symbol bounds  232  233 . To implement this, we first cut off the above integral into small neighborhoods of stationary points of the phase and a remainder. We do this with the smooth partition of unity: 1 = χ | 1 cos ( Θ x y , ω ) | < 1 8 + χ | 1 + cos ( Θ x y , ω ) | < 1 8 + χ ~ .   The cutoff χ ~   cuts off on the region where cos ( Θ x y , ω )   is bounded away from ± 1   , and there we have the gradient estimate: c < | ω cos ( Θ x y , ω ) | ,   for a sufficiently small constant c   . Using this, and integrating by parts n 1   times while using the symbol bounds  232  233 , we easily have that:
S n 1 e 2 π i λ | x y | cos ( Θ x y , ω ) ω g ~ 1 ( t , x ) ω g ~ ( s , y ) [ ] ω g ~ 1 ( s , y ) ω g ~ ( t , x ) χ ~ ( ω ) d ω λ 1 n | x y | n 1 2 .  
This proves  375 because we may assume that 1 4 < λ   . Our goal is now to prove the localized estimate:
S n 1 e 2 π i λ | x y | cos ( Θ x y , ω ) ω g ~ 1 ( t , x ) ω g ~ ( s , y ) [ ] ω g ~ 1 ( s , y ) ω g ~ ( t , x ) χ ~ | 1 cos ( Θ x y , ω ) | < 1 8 ( ω ) d ω | x y | n 1 2 .  
It will become clear that the corresponding estimate for the region where | 1 + cos ( Θ x y , ω ) | < 1 8   follows from identical calculations.
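Now, the angular function $\cos(\Theta_{x-y,\omega})$ has a single non-degenerate critical point in a neighborhood of the unit vector $(x-y)/|x-y|$ with index $n-1$. Therefore, by the Morse lemma there exists a diffeomorphism $\theta = \varphi(\omega)$ in a neighborhood of this point such that:
$$1 - \cos(\Theta_{x-y,\omega}) = \theta_1^2 + \cdots + \theta_{n-1}^2 .$$
By making this change of variables, we see that we are trying to prove that:
$$\Big|\Big|\Big| \int_{\mathbb{R}^{n-1}} e^{2\pi i\lambda|x-y||\theta|^2}\ {}^{\varphi^{-1}(\theta)}\widetilde{g}^{-1}(t,x)\, {}^{\varphi^{-1}(\theta)}\widetilde{g}(s,y)\,[\,\cdot\,]\, {}^{\varphi^{-1}(\theta)}\widetilde{g}^{-1}(s,y)\, {}^{\varphi^{-1}(\theta)}\widetilde{g}(t,x)\, \chi(\theta)\, J_{\varphi^{-1}}(\theta)\, d\theta \Big|\Big|\Big| \lesssim |x-y|^{-\frac{n-1}{2}} .$$
Here $J_{\varphi^{-1}}$ denotes the Jacobian of $\varphi^{-1}$, and $\chi$ is some smooth function which is supported where $|\theta| \lesssim 1$. Making now the simple change of variables $\sqrt{\lambda|x-y|}\,\theta = \theta'$, and then writing $\theta$ again for the new variable, it suffices to be able to show that:
$$\Big|\Big|\Big| \int_{\mathbb{R}^{n-1}} e^{2\pi i|\theta|^2}\ {}^{\widetilde{\varphi}(\theta)}\widetilde{g}^{-1}(t,x)\, {}^{\widetilde{\varphi}(\theta)}\widetilde{g}(s,y)\,[\,\cdot\,]\, {}^{\widetilde{\varphi}(\theta)}\widetilde{g}^{-1}(s,y)\, {}^{\widetilde{\varphi}(\theta)}\widetilde{g}(t,x)\, \widetilde{J}(\theta)\, d\theta \Big|\Big|\Big| \lesssim 1 . \tag{376}$$
Here $\widetilde{J}(\theta)$ denotes a smooth function with (large) compact support and uniform gradient bounds:
$$|\nabla_\theta^k \widetilde{J}| \lesssim 1 .$$
Furthermore, the function $\widetilde{\varphi}(\theta)$ obeys the gradient bounds:
$$|\nabla_\theta^k \widetilde{\varphi}| \lesssim |x-y|^{-\frac{k}{2}} .$$
Combining this last estimate with the symbol bounds (232)–(233) and the truncation condition $M = |x-y|^{\frac{1}{2}}$, we have the uniform gradient estimates:
$$\big\| \nabla_\theta^k \big( {}^{\widetilde{\varphi}}\widetilde{g}^{-1}(t,x)\, {}^{\widetilde{\varphi}}\widetilde{g}(s,y) \big) \big\| \lesssim 1 ,$$
$$\big\| \nabla_\theta^k \big( {}^{\widetilde{\varphi}}\widetilde{g}^{-1}(s,y)\, {}^{\widetilde{\varphi}}\widetilde{g}(t,x) \big) \big\| \lesssim 1 .$$
Using these bounds, we can prove the bound (376) by treating the quantity on the left hand side as a Fresnel-type integral and performing $n$ integrations by parts in the region where $1 < |\theta|$.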
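The rescaling step above can be written out as a short worked computation. This is only a sketch: $F$ is a hypothetical shorthand (not in the original) for the whole matrix-valued amplitude, including the cutoff and the Jacobian factor, and $\lambda \sim 1$ is assumed as in the frequency localization.

```latex
\begin{align*}
\theta' &= (\lambda|x-y|)^{1/2}\,\theta , \qquad
d\theta = (\lambda|x-y|)^{-\frac{n-1}{2}}\, d\theta' , \qquad
\lambda|x-y|\,|\theta|^2 = |\theta'|^2 , \\
\int_{\mathbb{R}^{n-1}} e^{2\pi i\lambda|x-y||\theta|^2}\, F(\theta)\, d\theta
 &= (\lambda|x-y|)^{-\frac{n-1}{2}}
    \int_{\mathbb{R}^{n-1}} e^{2\pi i|\theta'|^2}\,
    F\big((\lambda|x-y|)^{-1/2}\theta'\big)\, d\theta' .
\end{align*}
```

With $\frac{1}{4} < \lambda \lesssim 1$, the prefactor is comparable to $|x-y|^{-\frac{n-1}{2}}$, so a uniform $O(1)$ bound on the rescaled integral is exactly what is needed. Each $\theta'$-derivative of the rescaled amplitude picks up a factor $(\lambda|x-y|)^{-1/2}$, which is where the $|x-y|^{-k/2}$ gradient bounds come from.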
To complete our proof of (371) we need to show that:
$$\big\| \mathcal{R}^{TT^*}(t,s;\cdot,\cdot) \big\|_{L^\infty_{x,y}} \lesssim (1+|t-s|)^{-\frac{n-1}{2}} , \tag{377}$$
where $\mathcal{R}^{TT^*}$ is the kernel which is defined by subtracting (373) from (370). Using the splitting (226)–(227) we see that this has at least one factor involving the expressions ${}^{\omega}h^{-1}(x)\,{}^{\omega}h(y) - I$ or ${}^{\omega}h(x)\,{}^{\omega}h^{-1}(y) - I$ under the integral sign. There are several such combinations, but we will choose to estimate only one such term and leave the others to the reader as they can be treated analogously. Therefore, we may without loss of generality assume that we are trying to prove the bound:
$$\Big\| \int_0^\infty \int_{S^{n-1}} e^{2\pi i\lambda((t-s) + (x-y)\cdot\omega)}\ {}^{\omega}G(t,x;s,y)\, \chi(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega \Big\| \lesssim (1+|t-s|)^{-\frac{n-1}{2}} , \tag{378}$$
where we have set:
$${}^{\omega}G(t,x;s,y) = {}^{\omega}\widetilde{g}^{-1}(t,x) \big( {}^{\omega}h^{-1}(t,x)\, {}^{\omega}h(s,y) - I \big)\, {}^{\omega}\widetilde{g}(s,y)\,[\,\cdot\,]\, {}^{\omega}g^{-1}(s,y)\, {}^{\omega}g(t,x) .$$
As in the proof of (371) above for the smoothed out kernel $\widetilde{K}^{TT^*}$, we may without loss of generality assume that we are trying to prove (378) in the region where $|x-y| > \frac{1}{2}(1+|t-s|)$, because otherwise we may integrate by parts as many times as necessary with respect to the radial frequency variable to pick up the desired decay.
To proceed further, we will first decompose the range of frequency integration into a small set and a remainder where we can again integrate by parts with respect to $\lambda$. This is accomplished by using the angular partition of unity:
$$1 = \chi_{\left| \frac{t-s}{|x-y|} + \cos(\Theta_{x-y,\omega}) \right| > |x-y|^{\gamma-1}} + \chi_{\left| \frac{t-s}{|x-y|} + \cos(\Theta_{x-y,\omega}) \right| \leqslant |x-y|^{\gamma-1}} .$$
To deal with the bound (378) for the first cutoff function above, we need to show that:
$$\Big\| \int_{S^{n-1}} {}^{\omega}G(t,x;s,y)\, d\omega \int_0^\infty e^{2\pi i\lambda((t-s) + (x-y)\cdot\omega)}\, \chi_{\left| \frac{t-s}{|x-y|} + \cos(\Theta_{x-y,\omega}) \right| > |x-y|^{\gamma-1}}\, \chi(\lambda)\, \lambda^{n-1}\, d\lambda \Big\| \lesssim |x-y|^{-\frac{n-1}{2}} .$$
This bound follows easily from radial integration by parts in the inner integral, followed by the simple compactness estimate:
$$\int_{S^{n-1}} \big\| {}^{\omega}G(t,x;s,y) \big\|\, d\omega \lesssim 1 ,$$
which is of course uniform in the variables $(t,x;s,y)$.
To wrap things up here, we need to show the absolute estimate:
$$\int_{\mathbb{R}^n} \big\| {}^{\omega}G(t,x;s,y) \big\|\, \chi_{\left| \frac{t-s}{|x-y|} + \cos(\Theta_{x-y,\omega}) \right| \leqslant |x-y|^{\gamma-1}}\, \chi(\xi)\, d\xi \lesssim |x-y|^{-\frac{n-1}{2}} .$$
After an application of Cauchy–Schwarz, this will follow once we can establish that both:
$$\Big( \int_{\mathbb{R}^n} \chi_{\left| \frac{t-s}{|x-y|} + \cos(\Theta_{x-y,\omega}) \right| \leqslant |x-y|^{\gamma-1}}\, \chi(\xi)\, d\xi \Big)^{\frac{1}{2}} \lesssim |x-y|^{\frac{1}{2}(\gamma-1)} , \tag{379}$$
$$\Big( \int_{S^{n-1}} \big\| {}^{\omega}G(t,x;s,y) \big\|^2\, d\omega \Big)^{\frac{1}{2}} \lesssim |x-y|^{-\frac{1}{2}(n-2+\gamma)} . \tag{380}$$
The first estimate (379) follows from elementary bounds. Notice first that after a rotation, it suffices to assume that the vector $x-y$ lies along the $(1,0)$ direction. Then the cutoff function is supported in the region where:
$$\frac{\xi_1}{|\xi|} = -\frac{t-s}{|x-y|} + O(|x-y|^{\gamma-1}) ,$$
which is a conical set about the $\xi_1$-axis of volume no greater than a constant times $|x-y|^{\gamma-1}$ in the region where $|\xi| \sim 1$. The second estimate (380) above we have already shown. It is a special case of the bound (309) which was proved in the previous section. This completes our proof of (377), and hence our demonstration of the dispersive estimate (371).
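The exponent bookkeeping behind the Cauchy–Schwarz step can be checked in two lines. This is a schematic sketch only: $\chi_{\mathrm{cone}}$ is a hypothetical abbreviation (not in the original) for the second cutoff in the angular partition of unity, and the passage between the $d\xi$ and $d\omega$ integrals uses $|\xi| \sim 1$ on the support of $\chi(\xi)$.

```latex
\begin{align*}
\int_{\mathbb{R}^n} \big\| {}^{\omega}G \big\|\, \chi_{\mathrm{cone}}\, \chi(\xi)\, d\xi
 &\lesssim \Big( \int_{\mathbb{R}^n} \chi_{\mathrm{cone}}\, \chi(\xi)\, d\xi \Big)^{\frac12}
           \Big( \int_{S^{n-1}} \big\| {}^{\omega}G \big\|^2\, d\omega \Big)^{\frac12} \\
 &\lesssim |x-y|^{\frac12(\gamma-1)} \cdot |x-y|^{-\frac12(n-2+\gamma)}
  \;=\; |x-y|^{-\frac{n-1}{2}} .
\end{align*}
```

The parameter $\gamma$ cancels in the product, so the size of the conical region only redistributes the decay between the two factors.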

11 The Decomposable Function Spaces: Proof of the Square-Sum and Differentiated Strichartz Estimates

We now introduce a piece of machinery which will be of central importance for the remainder of the paper. This is a suitable reinterpretation of the important “decomposable function” criterion from the work [8]. In our context, we set up the general situation as follows: Suppose we are given an $M(m\times m)$-valued Fourier integral operator:
$$\Phi(\hat{f})(t,x) = \int_{\mathbb{R}^n} e^{2\pi i\psi(t,x;\xi)}\, e^{2\pi i x\cdot\xi}\, g_1(t,x;\xi)\, \hat{f}(\xi)\, g_2(t,x;\xi)\, d\xi , \tag{381}$$
where the $g_i$ are arbitrary matrix valued functions, such that this operator satisfies certain mixed Lebesgue space mapping properties (uniform in $y_0$):
$$\big\| \Phi_{y_0}(\hat{f}) \big\|_{L^{q_1}(L^{r_1})} \lesssim \| \hat{f} \|_{L^2} , \tag{382}$$
where $\Phi_{y_0}$ is the same operator as (381) but with phase $\psi$ replaced by $\psi_{y_0} = \psi(t, x-y_0; \xi)$. Suppose now that we are given a matrix valued function $C(t,x;\omega)$ which only depends on the angular variable $\omega = \xi/|\xi|$ in frequency. We would like to prove estimates for the coupled operator (we only discuss left multiplication here; the case of right multiplication is analogous):
$$\widetilde{\Phi}(\hat{f})(t,x) = \int_{\mathbb{R}^n} e^{2\pi i\psi(t,x;\xi)}\, e^{2\pi i x\cdot\xi}\, C(t,x;\omega)\, g_1(t,x;\xi)\, \hat{f}(\xi)\, g_2(t,x;\xi)\, d\xi . \tag{383}$$
These should be done in a way that the decay properties of the function $C(t,x;\omega)$ can be used to improve the range of the estimates (382). A robust way for doing this has been worked out in the paper of Rodnianski–Tao [8]. The answer is to fix an angular scale, say $\theta$, and then to form the norm (“classical” decomposable norm):
$$\| C \|^2_{D^{cl}_\theta(L^{q_2}_t(L^{r_2}_x))} = \sum_{k=0}^{10n} \theta^{-n+1} \int_{S^{n-1}_\omega} \big\| (\theta\nabla_\xi)^k C \big\|^2_{L^{q_2}_t(L^{r_2}_x)}\, d\omega . \tag{384}$$
By decomposing the frequency variable in (383) into angular sectors of size $\theta$, a straightforward computation then shows that one has the estimate:
$$\big\| \widetilde{\Phi}(\hat{f}) \big\|_{L^q(L^r)} \lesssim \| C \|_{D^{cl}_\theta(L^{q_2}_t(L^{r_2}_x))}\, \| \hat{f} \|_{L^2} , \tag{385}$$
whenever estimate (382) holds with $\frac{1}{q} = \frac{1}{q_1} + \frac{1}{q_2}$ and $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}$.
There are two problems which occur when trying to apply (384) in the present context. The first is that this norm is for a single scale, which causes problems in products where many different scales interact with each other. The other problem, which is conceptually much more serious, is that the estimate (384) contains the highly singular factor of $\theta^{-\frac{n-1}{2}}$, which needs to be eliminated with a delicate orthogonality argument of a kind which is not preserved in this problem for a variety of reasons (non-linear Hodge systems, a covariant wave equation that does not commute with angular cutoffs, etc.). However, with only a slight reworking the basic idea behind (384) can be shown to be surprisingly robust. First of all, for a fixed scale we replace (384) with a square function norm which has the same effect, and which will be very easy to verify in the present context. Since we will be using multiple scales in a moment, we introduce the solid angular cutoff functions $\overline{b}{}^{\varphi}_{\theta}(\omega)$ (not to be confused with the hollow multipliers $b^{\omega}_{\theta}(\xi)$ introduced in Section 4), such that:
$$\overline{b}{}^{\varphi}_{\theta}(\omega) \equiv 1 , \tag{386}$$
when $\omega \in \Gamma_\varphi$, for the angular sector $\Gamma_\varphi$ which we interpret as a cap in a finitely overlapping collection on the sphere $S^{n-1}_\omega = \bigcup_\varphi \Gamma_\varphi$. Here the scale is determined by the condition $|\Gamma_\varphi| \sim \theta$. On this scale, we replace (384) with the norm:
$$\| C \|_{D_\theta(L^{q_2}_t(L^{r_2}_x))} = \Big\| \Big( \sum_{k=0}^{10n} \sum_{\varphi} \sup_{\omega} \big\| \overline{b}{}^{\varphi}_{\theta}\, (\theta\nabla_\xi)^k C \big\|^2_{L^{r_2}_x} \Big)^{\frac{1}{2}} \Big\|_{L^{q_2}_t} . \tag{387}$$
It is not difficult to see that by decomposing the integral on the right hand side of (384) into fine and coarse scales, and applying Hölder's inequality on the fine (continuous) scales, the Rodnianski–Tao norm (384) with the time integral on the outside is bounded by the norm (387). Furthermore, it is easy to see from the proof given in [8] that having the time integral on the outside does not affect the bound (385) so long as the index $q_1$ implicitly appearing in this bound is such that $2 \leqslant q_1$. This allows one to use Minkowski's inequality to pull the square sum on the parametrix through the time integral. For us this index condition will always hold because we are working with Strichartz type norms. We leave it to the reader to work out the details of these claims.
We now form an $\ell^1$ Banach space based on incorporating the norms (387) over all dyadic angular scales $\theta \lesssim 1$. The elements of this space we denote by $\{C\} = \{C^{(\theta)}\}$, and we define its $\ell^1(D_\theta)$ norm as:
$$\big\| \{C\} \big\|_{\ell^1(D_\theta(L^{q_2}_t(L^{r_2}_x)))} = \sum_\theta \big\| C^{(\theta)} \big\|_{D_\theta(L^{q_2}_t(L^{r_2}_x))} . \tag{388}$$
There is also the forgetful map from the space $\ell^1(D_\theta)$ to functions, which we define as:
$$\{C\} \mapsto C = \sum_\theta C^{(\theta)} , \tag{389}$$
and we will in practice abusively identify $\{C\}$ with $C$ via the map (389). The main point is that given any function $C$, there may be a variety of ways in which we can embed $C$ in the space $\ell^1(D_\theta)$, and it is up to the structure of the application to decide how this should be done. Of course, given the square function norms (116) we are working with, our choice here is somewhat canonical.
Now, if we consider the $C$ in (389) as embedded in the integral (383), we easily have the estimate:
$$\big\| \widetilde{\Phi}(\hat{f}) \big\|_{L^q(L^r)} \lesssim \big\| \{C\} \big\|_{\ell^1(D_\theta(L^{q_2}_t(L^{r_2}_x)))}\, \| \hat{f} \|_{L^2} . \tag{390}$$
We also form spatial Besov versions of the norm (390), which we denote as $\ell^1 D_\theta(L^q(\dot{B}_2^{r,(2,s)}))$.
This leads us to the basic notation of this section:
Definition 11.1. For a given matrix valued function, we say it is in the decomposable space $D(L^q(\dot{B}_2^{r,(2,s)}))$ if the following norm is finite:
$$\| C \|_{D(L^q(\dot{B}_2^{r,(2,s)}))} = \inf_{C = \sum_\theta C^{(\theta)}} \Big\{ \sum_\theta \big\| C^{(\theta)} \big\|_{D_\theta(L^q(\dot{B}_2^{r,(2,s)}))} \Big\} . \tag{391}$$
We also define the low frequency analog of these spaces, which we denote by $D(L^q(\dot{B}_{2,10n}^{r,(2,s)}))$, similarly.
We remark here that it is easy to see that the norm (391) leads to a Banach space. This will be important in a moment. Also, it is easy to show that the various Besov–Lebesgue space inclusions (42)–(44) hold for these spaces if we define $D(L^p)$ analogously to (391). This is a simple consequence of the fact that the Littlewood–Paley theory commutes with the derivatives $\nabla_\xi^k$. We now show that this space satisfies the expected range of bilinear Riesz operator estimates:
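Lemma 11.2 (A decomposable Besov calculus). Let the indices $0 \leqslant \sigma$, $1 \leqslant q_i, r_i \leqslant \infty$, and $s_i$ be given. Then one has the following family of bilinear estimates:
$$|D_x|^{-\sigma} : D\big(L^{q_1}_t(\dot{B}_2^{r_1,(2,s_1)})\big) \cdot D\big(L^{q_2}_t(\dot{B}_2^{r_2,(2,s_2)})\big) \hookrightarrow D\big(L^{q_3}_t(\dot{B}_1^{r_3,(2,s_3)})\big) , \tag{392}$$
where the various indices satisfy the conditions:
$$s_3 = s_1 + s_2 + \sigma - \frac{n}{2} , \tag{393}$$
$$\sigma + \frac{n}{2} - s_3 < n\Big( \frac{1}{r_1} + \frac{1}{r_2} \Big) , \tag{394}$$
$$s_1 < \frac{n}{2} + \min\Big\{ n\Big( \frac{1}{r_2} - \frac{1}{r_3} \Big) , 0 \Big\} , \tag{395}$$
$$s_2 < \frac{n}{2} + \min\Big\{ n\Big( \frac{1}{r_1} - \frac{1}{r_3} \Big) , 0 \Big\} , \tag{396}$$
$$\frac{1}{q_3} = \frac{1}{q_1} + \frac{1}{q_2} , \tag{397}$$
$$\frac{1}{r_3} \leqslant \frac{1}{r_1} + \frac{1}{r_2} . \tag{398}$$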
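We now establish the link which relates the norms (391) to the $\dot{X}^s$ norms we have proved for the parametrix $\Phi$:
Lemma 11.3 (Core decomposable estimates for the potentials $\{{}^{\omega}A^{\pm}\}$ and $\{{}^{\omega}C^{\pm}\}$).
Let the sets of potentials $\{{}^{\omega}A^{\pm}\}$ and $\{{}^{\omega}C^{\pm}\}$ be defined as on lines (197), (200), and (201) above. Then one has the following family of decomposable bounds:
$$\big\| {}^{\omega}A^{\pm} \big\|_{D(L^\infty_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-2}{2})}))} \lesssim \mathcal{E} , \qquad \big\| {}^{\omega}A^{\pm} \big\|_{D(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})}))} \lesssim \mathcal{E} , \tag{403}$$
$$\big\| \partial_t\, {}^{\omega}\underline{A}^{\pm} \big\|_{D(L^\infty_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-4}{2})}))} \lesssim \mathcal{E} , \qquad \big\| \partial_t\, {}^{\omega}\underline{A}^{\pm} \big\|_{D(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-3}{2})}))} \lesssim \mathcal{E} , \tag{404}$$
$$\big\| {}^{\omega}C^{\pm} \big\|_{D(L^\infty_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-2}{2})}))} \lesssim \mathcal{E} , \qquad \big\| {}^{\omega}C^{\pm} \big\|_{D(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})}))} \lesssim \mathcal{E} , \tag{405}$$
$$\big\| \partial_t\, {}^{\omega}\underline{C}^{\pm} \big\|_{D(L^\infty_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-4}{2})}))} \lesssim \mathcal{E} , \qquad \big\| \partial_t\, {}^{\omega}\underline{C}^{\pm} \big\|_{D(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-3}{2})}))} \lesssim \mathcal{E} , \tag{406}$$
where $p_\gamma$ and $q_\gamma$ are the dimensional constants from lines (193) and (278) above. Furthermore, one has the following improved null-differentiated space-time bounds:
$$\big\| \big( {}^{\omega}L\, {}^{\omega}\underline{A}^{\pm} ,\ \partial_t \Delta^{-\frac{1}{2}}\, {}^{\omega}L\, {}^{\omega}\underline{A}^{\pm} \big) \big\|_{D(L^2_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-3}{2})}))} \lesssim \mathcal{E} , \tag{407}$$
$$\big\| \big( {}^{\omega}L\, {}^{\omega}\underline{C}^{\pm} ,\ \partial_t \Delta^{-\frac{1}{2}}\, {}^{\omega}L\, {}^{\omega}\underline{C}^{\pm} \big) \big\|_{D(L^2_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-3}{2})}))} \lesssim \mathcal{E} . \tag{408}$$
In all of these estimates, the small constant $\mathcal{E}$ is the same as on lines (126d) and (126f) above.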

11.1 Proof of the Square Sum Strichartz Estimates

We now come to what is perhaps the linchpin of our argument. These are the square sum structure estimates contained in (178a). With the current machinery in hand, these will be quite easy to establish. At the heart of things is whether the angular multipliers ${}^{\omega}\Pi_\theta$ “commute with the dynamics” of the covariant wave operator $\Box_{\underline{A}_{\bullet\ll 1}}$. At a quick first glance using Duhamel's principle, this seems to be connected with whether one can control the commutator $[{}^{\omega}\Pi_\theta , \Box_{\underline{A}_{\bullet\ll 1}}]$. Unfortunately, it is not too difficult to see that one runs into serious difficulties as soon as $\theta \ll 1$.
This is not the end of the story however, because it turns out that modulo a very nice error term, one can control the commutator with the “integrated” form of the equations, $[{}^{\omega}\Pi_\theta , \Phi]$. This shows one of the deep advantages to working with the parametrix as opposed to dealing directly with the equations themselves$^{10}$. We proceed as follows.
Our first step is to fix a scale $\theta$ and run a cap decomposition $S^{n-1} = \bigcup_\varphi \Gamma_\varphi$. The next thing we do is to decompose the parametrix $\Phi(\hat{f})$ into a sum of three pieces:
\begin{align*}
\Phi(\hat{f}) &= \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g^{-1}_{<\theta}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{<\theta}\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega \\
&\quad + \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g^{-1}_{<\theta}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{\geqslant\theta}\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega \\
&\quad + \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g^{-1}_{\geqslant\theta}\, \hat{f}(\lambda\omega)\, {}^{\omega}g\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega , \\
&= I_1 + I_2 + I_3 .
\end{align*}
Here:
$${}^{\omega}g = {}^{\omega}g_{<\theta} + {}^{\omega}g_{\geqslant\theta} = P_{<\theta}({}^{\omega}g) + P_{\geqslant\theta}({}^{\omega}g) ,$$
is a low-high frequency decomposition of the group element ${}^{\omega}g$. We define the decomposition for ${}^{\omega}g^{-1}$ similarly. Our goal is now to prove the following three estimates:
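$$\sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\Pi_\theta P_1(I_1) \big\|^2_{L^2(L^{\frac{2(n-1)}{n-3}})} \lesssim \| \hat{f} \|^2_{L^2} , \tag{423}$$
$$\sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\Pi_\theta P_1(I_2) \big\|^2_{L^2(L^{\frac{2(n-1)}{n-3}})} \lesssim \| \hat{f} \|^2_{L^2} , \tag{424}$$
$$\sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\Pi_\theta P_1(I_3) \big\|^2_{L^2(L^{\frac{2(n-1)}{n-3}})} \lesssim \| \hat{f} \|^2_{L^2} . \tag{425}$$
The proof of the first bound (423) follows easily from the plain endpoint Strichartz estimate we have already established. To see this, first notice that for a fixed angle one has the identity:
\begin{align*}
&{}^{\omega_0}\Pi_\theta P_1 \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g^{-1}_{<\theta}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{<\theta}\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega \\
&\qquad = {}^{\omega_0}\Pi_\theta P_1 \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g^{-1}_{<\theta}\, \overline{b}{}^{\varphi'}_{\theta}(\omega)\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{<\theta}\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega ,
\end{align*}
where $\Gamma_\varphi \subseteq \frac{1}{2}\Gamma_{\varphi'}$ for some fixed thickening $\Gamma_{\varphi'}$ of the spherical cap for which $\omega_0 \in \Gamma_\varphi$.
That this is the case follows easily from the fact that the Fourier transform of the function:
$$e^{2\pi i x\cdot\xi}\ {}^{\omega}g^{-1}_{<\theta}(x)\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{<\theta}(x) ,$$
is a tempered distribution with support contained in an $O(c\theta)$ neighborhood of the point $\xi$ for some small constant $c$, uniform in the value of $\theta$. Using now the boundedness of the multiplier ${}^{\omega_0}\Pi_\theta P_1$, we only need to establish that the truncated parametrix $I_1$ obeys the endpoint Strichartz estimate. We reduce this claim further by writing the integral in the form:
$$I_1 = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} K^{P_{<\theta}}(w)\, K^{P_{<\theta}}(y) \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g_w^{-1}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_y\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega\, dw\, dy ,$$
where ${}^{\omega}g_w^{-1}(x) = {}^{\omega}g^{-1}(x-w)$ and ${}^{\omega}g_y(x) = {}^{\omega}g(x-y)$ denote the translated group elements. Using the fact that the convolution kernel $K^{P_{<\theta}}$ has $O(1)$ $L^1$ norm uniform in the value of $\theta$, we are left with establishing the $L^2$ and dispersive estimates of the previous sections for more general kernels of the form:
$$\Phi_{g_1,g_2}(\hat{f}) = \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g_1^{-1}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_2\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega , \tag{426}$$
where ${}^{\omega}g_1$ and ${}^{\omega}g_2$ are unrelated group elements which are generated from Hodge systems and connections of the form (197)–(201), which satisfy the general requirements (126) and (158) for $\lambda = 1$. This indeed turns out to be the case, and the key observation is that by using the identity (14), all of the $TT^*$ arguments go through just as they did in previous sections.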
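It remains for us to prove the bounds (424)–(425). These are essentially identical to each other so we concentrate on the proof of the first of these, leaving the other one to the reader. By an application of Bernstein's inequality and orthogonality, we see that it suffices for us to show the estimate:
$$\theta\, \| I_2 \|_{L^2(L^2)} \lesssim \| \hat{f} \|_{L^2} . \tag{427}$$
At a heuristic level, this estimate is true because one has the identity $\theta\, {}^{\omega}g_{\geqslant\theta} \approx \nabla_x\, {}^{\omega}g = g\, {}^{\omega}\underline{C}$, in which case things would follow easily from the $D(L^2(L^\infty))$ bound contained in the estimates (405). To implement this in a rigorous way, we derive the following elliptic equation for ${}^{\omega}g_{\geqslant\theta}$ based on the formulas (199):
\begin{align*}
{}^{\omega}g_{\geqslant\theta} &= \nabla^i \Delta^{-1} P_{\geqslant\theta}\big( {}^{\omega}g\, {}^{\omega}\underline{C}_i \big) , \\
&= \sum_{\lambda\,:\,\theta\lesssim\lambda} \nabla^i \Delta^{-1} P_\lambda\big( {}^{\omega}g\, {}^{\omega}\underline{C}_i \big) .
\end{align*}
If we denote the (vector) kernel of the operator $\nabla_x \Delta^{-1} P_\lambda$ by $K^{\nabla\Delta^{-1}}_\lambda$, then we have the uniform $L^1$ bounds:
$$\big\| K^{\nabla\Delta^{-1}}_\lambda \big\|_{L^1} \lesssim \lambda^{-1} .$$
Using this and taking into account the previous reductions used in the proof of estimate (423) above we easily arrive at the bound:
\begin{align*}
\theta\, \| I_2 \|_{L^2(L^2)} &\lesssim \sum_{\lambda\,:\,\theta\lesssim\lambda} \Big( \frac{\theta}{\lambda} \Big) \sup_{w,y} \big\| \widetilde{I}_{w,y} \big\|_{L^2(L^2)} , \\
&\lesssim \sup_{w,y} \big\| \widetilde{I}_{w,y} \big\|_{L^2(L^2)} ,
\end{align*}
where $\widetilde{I}_{w,y}$ is the family of translated kernels:
$$\widetilde{I}_{w,y} = \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u}\ {}^{\omega}g_w^{-1}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_y\, {}^{\omega}\underline{C}_y\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega , \tag{428}$$
where we have also now set ${}^{\omega}\underline{C}_y(x) = {}^{\omega}\underline{C}(x-y)$. Using the decomposable estimate (390), we now have that:
$$\big\| \widetilde{I}_{w,y} \big\|_{L^2(L^2)} \lesssim \big\| {}^{\omega}\underline{C}_y \big\|_{D(L^2(L^\infty))}\, \big\| I_{w,y} \big\|_{L^\infty(L^2)} ,$$
where the integral $I_{w,y}$ is the same as $\widetilde{I}_{w,y}$ but with the matrix ${}^{\omega}\underline{C}_y$ removed.
Using now the nesting:
$$D\big(L^2(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})})\big) \subseteq D\big(L^2(L^\infty)\big) , \tag{429}$$
the estimate (405), and the remarks made above about general kernels of the form (426), we have the pair of estimates:
$$\big\| {}^{\omega}\underline{C}_y \big\|_{D(L^2(L^\infty))} \lesssim \mathcal{E} , \qquad \big\| I_{w,y} \big\|_{L^\infty(L^2)} \lesssim \| \hat{f} \|_{L^2} ,$$
uniform in the values of $w, y$. This is enough to prove the estimate (424). This completes our proof of the square sum Strichartz estimates contained in (178a).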
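The passage from the dyadic sum to the uniform bound is a geometric series. As a sketch, assume (this normalization is not spelled out in the text) that $\theta$ is itself dyadic and that the high frequency piece consists of the dyadic values $\lambda = 2^j\theta$ with $j \geqslant 0$:

```latex
\[
\theta\, \big\| K^{\nabla\Delta^{-1}}_\lambda \big\|_{L^1} \lesssim \frac{\theta}{\lambda} ,
\qquad
\sum_{\lambda = 2^j\theta,\ j\geqslant 0} \frac{\theta}{\lambda}
 = \sum_{j\geqslant 0} 2^{-j} = 2 .
\]
```

So the factor $\theta$ gained from Bernstein's inequality is exactly what is needed to absorb the $\lambda^{-1}$ kernel bounds uniformly in $\theta$; a different cutoff constant in the low-high split would only change the value of the sum by a fixed factor.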

10 This also seems to have far reaching philosophical consequences for how one should proceed in lower dimensions. Specifically, it seems to suggest that the correct “covariant” $X^{s,\theta}$ space should be defined in terms of the parametrix and not in terms of the symbol of the covariant equation.

11.2 Proof of the Differentiated Strichartz Estimates (178b)–(178c)

To wrap things up for this overall section, we prove the estimates (178b)–(178c). This will follow easily from the general list of decomposable estimates contained in Lemma 11.3. In what follows, we will only bother to prove the time differentiated estimate (178c). The proof of the gradient estimate (178b) follows from identical reasoning and is left to the reader (in fact, one only need apply the plain Strichartz estimates shown in previous sections followed by a $D(L^\infty(L^\infty))$ estimate for the spatial potentials $\{{}^{\omega}\underline{C}\}$). Time differentiating the parametrix $\Phi^{\pm}(\hat{f})$ we see that:
\begin{align*}
\partial_t \Phi^{\pm}(\hat{f}) &= \int_{\mathbb{R}^n} (\pm 2\pi i|\xi|)\, e^{2\pi i\lambda\, {}^{\omega}u^{\pm}}\ {}^{\omega}g_{\pm}^{-1}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{\pm}\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega \\
&\quad + \int_{\mathbb{R}^n} e^{2\pi i\lambda\, {}^{\omega}u^{\pm}}\, \big[ {}^{\omega}g_{\pm}^{-1}\, \hat{f}(\lambda\omega)\, {}^{\omega}g_{\pm} ,\ {}^{\omega}C_0^{\pm} \big]\, \chi_{(\frac{1}{2},2)}(\lambda)\, \lambda^{n-1}\, d\lambda\, d\omega , \\
&= \Phi^{\pm}(\pm 2\pi i|\xi| \hat{f}) + \widetilde{I} .
\end{align*}
Therefore, our task is to show the pair of estimates:
P 1 I ~ 1 L 2 ( S L 2 ( n 1 ) n 3 ) f ^ L 2 , (430)
P 1 I ~ 1 L ( L 2 ) f ^ L 2 . (431)
The estimate (430) follows from essentially identical reasoning to that employed in the proof of estimates (424)–(425) above. The main point is to drop to $L^2(L^2)$ via Bernstein, and then use the $D(L^2(L^\infty))$ estimate for the potential ${}^{\omega}C_0$ contained on line (405) above. The proof of the second estimate (431) above follows easily from the $D(L^\infty(L^\infty))$ estimate for ${}^{\omega}C_0$ contained on line (405) above. Specifically, one has the nesting:
$$D\big(L^\infty(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-2}{2})})\big) \subseteq D\big(L^\infty(L^\infty)\big) .$$
This completes our demonstration of (178b)–(178c) and ends this section.

12 Completion of the proof: Controlling the $L^1(L^2)$ Norm of the Differentiated Parametrix

Our final task here is to prove the estimate (178e), which guarantees that our parametrix is a good approximation to the covariant wave equation $\Box_{\underline{A}_{\bullet\ll 1}}$. This essentially boils down to applying the estimates (403)–(408) to the various error terms listed on the right hand side of equation (203) above. We will prove the desired estimates for each of these terms separately.

  Decomposing the term $\underline{A}_{\bullet\ll 1}({}^{\omega}L) - {}^{\omega}C^{\pm}({}^{\omega}L)$

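This represents the worst error term which comes out of our approximation, as well as the main “renormalization” which the parametrix creates. In what follows we will eliminate the $\pm$ notation in favor of the ${}^{\omega}\underline{L}$ notation introduced on line (328) above. Using this convention, a short computation involving the formulas (200)–(201) and the structure equation (126e) yields the identity:
\begin{align*}
&\underline{A}_{\bullet\ll 1}({}^{\omega}\underline{L}) - {}^{\omega}C({}^{\omega}\underline{L}) \\
&\qquad = \big( I - {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \big) \underline{A}_{\bullet\ll 1}(\omega) + {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \Delta_\omega^{-1} \widetilde{P}([B,H])(\omega) \\
&\qquad\quad + {}^{\omega}\underline{L}\, \Delta^{-1}\big( [{}^{\omega}\underline{A} , {}^{\omega}\underline{C}] \big) - d^* \Delta^{-1} \big[ {}^{\omega}\underline{C} , {}^{\omega}C({}^{\omega}\underline{L}) \big] , \\
&\qquad = T_1 + T_2 + T_3 + T_4 . \tag{432}
\end{align*}
Our goal is to prove the following four estimates:
$$\| T_1 \|_{D(L^2(L^{n-1}))} \lesssim \mathcal{E} , \qquad \| T_2 \|_{D(L^1(L^\infty))} \lesssim \mathcal{E} , \tag{433}$$
$$\| T_3 \|_{D(L^1(L^\infty))} \lesssim \mathcal{E} , \qquad \| T_4 \|_{D(L^1(L^\infty))} \lesssim \mathcal{E} . \tag{434}$$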
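To prove the first estimate on line (433), we see from the decomposable version of the Besov nesting (44) that it suffices to prove the following:
$$\big\| \big( I - {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \big) \underline{A}_{\bullet\ll 1}(\omega) \big\|_{D(L^2(\dot{B}_2^{n-1,(2,\frac{n(n-3)}{2(n-1)})}))} \lesssim \mathcal{E} .$$
By the square sum nature of the Besov and decomposable norms, and keeping in mind the Besov version of the endpoint Strichartz estimate contained in the bootstrapping estimate (126d), we see that it suffices to prove this estimate at fixed frequency. Thus, we are trying to prove that:
$$\big\| \big( I - {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \big) P_\mu \underline{A}_{\bullet\ll 1}(\omega) \big\|_{D(L^2(L^{n-1}))} \lesssim \big\| P_\mu \underline{A}_{\bullet\ll 1} \big\|_{L^2(S\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})})} . \tag{435}$$
Decomposing the term on the left hand side of this expression into all dyadic angular regions spread from the direction $\omega$, this is further reduced to showing that:
$$\big\| {}^{\omega}\Pi_\theta \big( I - {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \big) P_\mu \underline{A}_{\bullet\ll 1}(\omega) \big\|_{D_\theta(L^2(L^{n-1}))} \lesssim \theta^{\gamma}\, \big\| P_\mu \underline{A}_{\bullet\ll 1} \big\|_{L^2(S\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})})} .$$
Notice that we are only trying to show this for values $\theta \lesssim \mu^{\frac{1}{2}-\delta}$. We further compute the term on the left hand side of this last expression by applying the heuristic multiplier bound (also using the Coulomb savings (196)):
$$(\theta\nabla_\xi)^k\, {}^{\omega}\Pi_\theta \big( I - {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \big) P_\mu \underline{A}_{\bullet\ll 1}(\omega) \approx \theta\, {}^{\omega}\Pi_\theta P_\mu \underline{A}_{\bullet\ll 1} .$$
Plugging this into the definition of the norm $D_\theta(L^2(L^{n-1}))$, using the multiplier-sum reductions employed in the proof of the inequality (409), and reverting back to Besov notation we have the inequality sequence involving Bernstein's inequality (56) and a simple index manipulation:
\begin{align*}
(\mathrm{L.H.S.})(435) &\lesssim \theta \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\mu \underline{A}_{\bullet\ll 1} \big\|^2_{L^2(\dot{B}_2^{n-1,(2,\frac{n(n-3)}{2(n-1)})})} \Big)^{\frac{1}{2}} , \\
&\lesssim \theta^{\frac{n-3}{2}} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\mu \underline{A}_{\bullet\ll 1} \big\|^2_{L^2(\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n(n-3)}{2(n-1)})})} \Big)^{\frac{1}{2}} , \\
&\lesssim \theta^{\frac{n-3}{2}} \mu^{-\frac{n+1}{2(n-1)}} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\mu \underline{A}_{\bullet\ll 1} \big\|^2_{L^2(\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})})} \Big)^{\frac{1}{2}} .
\end{align*}
Estimate (435) now follows from the fact that:
$$\theta^{\frac{n-3}{2}} \mu^{-\frac{n+1}{2(n-1)}} \lesssim \theta^{\gamma} ,$$
which is a consequence of the truncation condition $\theta \lesssim \mu^{\frac{1}{2}-\delta}$, the fact that $6 \leqslant n$, and the fact that we have chosen $\delta, \gamma$ according to (190). This ends our proof of the first estimate on line (433).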
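The role of the dimension restriction in this last step can be checked with exact rational arithmetic. The sketch below assumes the reading $\theta \lesssim \mu^{\frac12-\delta}$ with $\theta, \mu \leqslant 1$ and a $\mu$-loss of $\mu^{-\frac{n+1}{2(n-1)}}$; the function names are hypothetical, and the actual values of $\delta, \gamma$ fixed on line (190) are not reproduced here.

```python
from fractions import Fraction


def net_theta_power(n, delta=0):
    """Net power of theta after trading mu for theta.

    Assumption: theta <= mu^(1/2 - delta) <= 1, so mu >= theta^(2/(1 - 2*delta))
    and mu^(-a) <= theta^(-2a/(1 - 2*delta)) with a = (n+1)/(2(n-1)).
    """
    a = Fraction(n + 1, 2 * (n - 1))
    return Fraction(n - 3, 2) - 2 * a / (1 - 2 * delta)


def admissible(n, delta, gamma):
    """True when theta^((n-3)/2) * mu^(-a) <= theta^gamma holds for all theta <= 1."""
    return net_theta_power(n, delta) >= gamma


if __name__ == "__main__":
    for n in range(4, 9):
        print(n, net_theta_power(n))
```

Under these assumptions the net power is negative for $n = 5$ and equals $\frac{1}{10}$ for $n = 6$, which leaves room for small positive $\delta$ and $\gamma$ only from dimension six onward.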
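Our next step is to prove the second estimate on line (433) above. We will show the somewhat more regular estimate:
$$\big\| {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \Delta_\omega^{-1} \widetilde{P}([B,H])(\omega) \big\|_{D(L^1(\dot{B}_1^{\infty,(n,\frac{n}{2})}))} \lesssim \mathcal{E} . \tag{436}$$
We decompose the term inside the norm on the left hand side of this last inequality into dyadic angular scales, apply the definition of the fixed scale decomposable norms $D_\theta(L^1(\dot{B}_1^{\infty,(n,\frac{n}{2})}))$, and use the (fixed time) fixed frequency heuristic multiplier bound (which again takes into account the savings (196)):
$$(\theta\nabla_\xi)^k\, {}^{\omega}\Pi_\theta\, {}^{\omega}\overline{\Pi}^{(\frac{1}{2}-\delta)} \Delta_\omega^{-1} P_\lambda \widetilde{P}([B,H])(\omega) \approx \theta^{-1} \lambda^{-2}\, {}^{\omega}\Pi_\theta P_\lambda([B,H]) .$$
Expanding the resulting expression into a trichotomy, applying the multiplier square sum reduction used previously in the proof of estimate (409) above, and keeping in mind the bootstrapping structure estimates (126f), we see that the estimate (436) reduces to the demonstration of the following three fixed time bounds:
\begin{align}
&\sum_{\lambda,\mu_i\,:\,\mu_1\ll\mu_2\sim\lambda} \lambda^{-2} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^\infty} \Big)^{\frac{1}{2}} \tag{437} \\
&\qquad \lesssim \theta^{1+\gamma}\, \| B(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| H(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} , \tag{438} \\
&\sum_{\lambda,\mu_i\,:\,\mu_2\ll\mu_1\sim\lambda} \lambda^{-2} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^\infty} \Big)^{\frac{1}{2}} \tag{439} \\
&\qquad \lesssim \theta^{1+\gamma}\, \| B(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| H(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} , \tag{440} \\
&\sum_{\lambda,\mu_i\,:\,\lambda\lesssim\mu_1\sim\mu_2} \lambda^{-2} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^\infty} \Big)^{\frac{1}{2}} \tag{441} \\
&\qquad \lesssim \theta^{1+\gamma}\, \| B(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| H(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} . \tag{442}
\end{align}
We begin with the proof of the first estimate (438). This is the most singular of the three. Fixing all of the spatial frequencies on the left hand side of this bound, we see that by an application of Young's inequality, it suffices to prove the following refinement:
\begin{align*}
&\lambda^{-2} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^\infty} \Big)^{\frac{1}{2}} \\
&\qquad \lesssim \Big( \frac{\mu_1}{\mu_2} \Big)^{\gamma} \theta^{1+\gamma}\, \| P_{\mu_1}(B)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| P_{\mu_2}(H)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} . \tag{443}
\end{align*}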
This bound is scale invariant, so we may assume that $1 = \lambda \sim \mu_2$. To aid in the demonstration, we introduce the auxiliary index:
$$\widetilde{r}_\gamma = \frac{2n(n-1)}{n^2 - 2n - 1 - 2(n-1)\gamma} .$$
Notice that this has been chosen precisely so that one has the identity:
$$\gamma = \frac{1}{2} + n\Big( \frac{n-3}{2(n-1)} - \frac{1}{\widetilde{r}_\gamma} \Big) ,$$
so that ultimately we can make a reference to the fixed frequency bound (53). The problem here is that we have $2 < \widetilde{r}_\gamma$ (in any dimension), so we are going to run into orthogonality issues in the square-sum on the left hand side of (443). This will end up costing some extra powers of $\theta^{-1}$, but luckily the Bernstein inequality will more than make up for this. Applying Bernstein to each term in the sum on the left hand side of (443) we arrive at the bound:
$$(\mathrm{L.H.S.})(443) \lesssim \theta^{\frac{n-1}{\widetilde{r}_\gamma}} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_1\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^{\widetilde{r}_\gamma}} \Big)^{\frac{1}{2}} . \tag{444}$$
To get rid of the square-sum on the right hand side of this last expression, we introduce the following map from $L^p(\mathbb{R}^n)$ to $\ell^2(L^p(\mathbb{R}^n))$:
$$T_\theta(A) = \big( {}^{\omega_1}\widetilde{\Pi}_\theta P_1(A) , \ldots , {}^{\omega_N}\widetilde{\Pi}_\theta P_1(A) \big) ,$$
where $(\omega_1, \ldots, \omega_N)$ is some ordering of the $\Gamma_\varphi$ spherical cap “base-points”. Notice that there are $N \sim \theta^{1-n}$ of these. By orthogonality, and using the uniform boundedness of the multipliers ${}^{\omega}\Pi_\theta P_1$ on $L^\infty$, we have the pair of estimates:
$$\| T_\theta(A) \|_{\ell^2(L^2)} \lesssim \| P_1(A) \|_{L^2} ,$$
$$\| T_\theta(A) \|_{\ell^2(L^\infty)} \lesssim \theta^{\frac{1-n}{2}} \| P_1(A) \|_{L^\infty} .$$
By interpolating these two bounds in the pair of spaces $(\ell^2(L^2), \ell^2(L^\infty))$ and $(L^2, L^\infty)$ (see [1]), we have the bound:
$$\| T_\theta(A) \|_{\ell^2(L^{\widetilde{r}_\gamma})} \lesssim \theta^{(1-n)(\frac{1}{2} - \frac{1}{\widetilde{r}_\gamma})} \| P_1(A) \|_{L^{\widetilde{r}_\gamma}} .$$
Plugging this last estimate into the right hand side of (444) above, and finally applying the generic fixed frequency estimate (53), we have that:
\begin{align*}
(\mathrm{L.H.S.})(443) &\lesssim \theta^{(n-1)(\frac{2}{\widetilde{r}_\gamma} - \frac{1}{2})}\, \big\| P_1\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|_{L^{\widetilde{r}_\gamma}} , \\
&\lesssim \Big( \frac{\mu_1}{\mu_2} \Big)^{\gamma} \theta^{(n-1)(\frac{2}{\widetilde{r}_\gamma} - \frac{1}{2})}\, \| P_{\mu_1}(B)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| P_{\mu_2}(H)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} .
\end{align*}
The estimate (443) now follows from the bound:
$$\theta^{(n-1)(\frac{2}{\widetilde{r}_\gamma} - \frac{1}{2})} \lesssim \theta^{1+\gamma} ,$$
which holds in dimensions $6 \leqslant n$. We leave the verification of this to the reader.
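For readers who want to carry out the verification, the index algebra can be checked exactly. The sketch below assumes the auxiliary index is read as $\widetilde{r}_\gamma = \frac{2n(n-1)}{n^2-2n-1-2(n-1)\gamma}$ and that $\theta \leqslant 1$, so that the final bound is equivalent to the exponent inequality $(n-1)(\frac{2}{\widetilde{r}_\gamma} - \frac12) \geqslant 1+\gamma$. The function names are hypothetical.

```python
from fractions import Fraction


def inv_r_tilde(n, gamma):
    """1 / r~_gamma = (n^2 - 2n - 1 - 2(n-1)*gamma) / (2n(n-1)) (assumed reading)."""
    return (n * n - 2 * n - 1 - 2 * (n - 1) * gamma) / Fraction(2 * n * (n - 1))


def gamma_from_index(n, gamma):
    """Evaluate 1/2 + n((n-3)/(2(n-1)) - 1/r~_gamma); should return gamma."""
    return Fraction(1, 2) + n * (Fraction(n - 3, 2 * (n - 1)) - inv_r_tilde(n, gamma))


def theta_power(n, gamma):
    """Bernstein gain (n-1)/r~ plus interpolation loss (1-n)(1/2 - 1/r~)."""
    ir = inv_r_tilde(n, gamma)
    return (n - 1) * ir + (1 - n) * (Fraction(1, 2) - ir)


def bound_holds(n, gamma):
    """theta^theta_power <= theta^(1+gamma) for theta <= 1."""
    return theta_power(n, gamma) >= 1 + gamma


if __name__ == "__main__":
    for n in range(5, 9):
        print(n, theta_power(n, 0), bound_holds(n, Fraction(1, 100)))
```

Under this reading the total power of $\theta$ simplifies to $\frac{n-3}{2} - \frac{1}{n} - \frac{2(n-1)}{n}\gamma$, so the bound fails in dimension five even with $\gamma = 0$ and holds in dimension six for all $\gamma \leqslant \frac18$.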
This ends our demonstration of the $Low \times High$ frequency estimate (438). Notice that the second estimate (440) is simply a less singular version of this. In fact, repeating the above procedure, we see that in that case there is an extra factor of $\big( \frac{\mu_2}{\mu_1} \big)$ in the analog of the fixed frequency bound (443).
We have now reduced the proof of the second estimate on line (433) to the $High \times High$ interaction bound (442). By applying the $L^\infty$–$L^2$ version of Bernstein, using orthogonality, and then applying the general bound (52), we have the fixed frequency estimate:
\begin{align*}
&\lambda^{-2} \Big( \sum_{\varphi\,:\,\omega_0\in\Gamma_\varphi} \big\| {}^{\omega_0}\widetilde{\Pi}_\theta P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|^2_{L^\infty} \Big)^{\frac{1}{2}} \\
&\qquad \lesssim \theta^{\frac{n-1}{2}} \lambda^{\frac{n-4}{2}}\, \big\| P_\lambda\big( [P_{\mu_1}(B)(t) , P_{\mu_2}(H)(t)] \big) \big\|_{L^2} , \\
&\qquad \lesssim \Big( \frac{\lambda}{\mu_1} \Big)^{\sigma} \theta^{\frac{n-1}{2}}\, \| P_{\mu_1}(B)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-1}{2})}}\, \| P_{\mu_2}(H)(t) \|_{\dot{B}_2^{\frac{2(n-1)}{n-3},(2,\frac{n-3}{2})}} ,
\end{align*}
where $0 < \sigma = n\big( \frac{n-3}{n-1} \big) - 2$. By summing this last line and then applying Cauchy–Schwarz, we easily arrive at the bound (442).
To finish this subsection, we only need to prove the two estimates on line (434) above. To show the first estimate involving the $T_3$ term, we simply expand the ${}^{\omega}\underline{L}$ derivative into the product via the Leibniz rule, and then use the decomposable bounds (403), (405), and (407)–(408) in conjunction with the following instance of the bilinear decomposable estimate (392):
$$\Delta^{-1} : D\big(L^2_t(\dot{B}_{2,10n}^{p_\gamma,(2,\frac{n-3}{2})})\big) \cdot D\big(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})})\big) \hookrightarrow D\big(L^1_t(\dot{B}_1^{\infty,(2,\frac{n}{2})})\big) .$$
To show the second bound on line (434), we again use the estimates (403) and (405), this time in conjunction with:
$$\nabla_x \Delta^{-1} : D\big(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})})\big) \cdot D\big(L^2_t(\dot{B}_{2,10n}^{q_\gamma,(2,\frac{n-1}{2})})\big) \hookrightarrow D\big(L^1_t(\dot{B}_1^{\infty,(2,\frac{n}{2})})\big) .$$
This completes our decomposable estimates for the error term $\underline{A}_{\bullet\ll 1}({}^{\omega}L) - {}^{\omega}C^{\pm}({}^{\omega}L)$.
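The two instances of the bilinear estimate used here can be checked against the scaling conditions of Lemma 11.2. The sketch below assumes those conditions read $s_3 = s_1 + s_2 + \sigma - \frac{n}{2}$ and $\frac{1}{q_3} = \frac{1}{q_1} + \frac{1}{q_2}$, with $\Delta^{-1}$ corresponding to $\sigma = 2$ and $\nabla_x\Delta^{-1}$ to $\sigma = 1$; the helper names are hypothetical, and the remaining conditions, which involve $p_\gamma$ and $q_\gamma$, are not checked.

```python
from fractions import Fraction


def target_regularity(n, s1, s2, sigma):
    """s3 = s1 + s2 + sigma - n/2 (assumed form of the scaling condition)."""
    return s1 + s2 + sigma - Fraction(n, 2)


def target_time_index(q1, q2):
    """q3 from the Hölder relation 1/q3 = 1/q1 + 1/q2."""
    return 1 / (Fraction(1, q1) + Fraction(1, q2))


if __name__ == "__main__":
    for n in range(6, 10):
        t3 = target_regularity(n, Fraction(n - 3, 2), Fraction(n - 1, 2), 2)
        t4 = target_regularity(n, Fraction(n - 1, 2), Fraction(n - 1, 2), 1)
        print(n, t3, t4, target_time_index(2, 2))
```

In both cases the output regularity is $\frac{n}{2}$ and the time index is $1$, matching the target space $D(L^1_t(\dot{B}_1^{\infty,(2,\frac{n}{2})}))$.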

  Decomposing the term $D^{\underline{A}_{\bullet\ll 1}}_\alpha ({}^{\omega}C^{\pm})^\alpha$

Again dropping the ±   notation and using the equations  200  201 and the identity  328 as well as the structure equation  126e , we can write this as:
D α A ̲ 1 ( ω C ) α = ω Π ¯ ( 1 2 δ ) ω L Δ ω 1 P ~ ( [ B , H ] ) ( ω )
+ ( ± ω L ̲ ω x ) t Δ 1 [ ω A ̲ , ω C ̲ ] + t d * Δ 1 [ ω C 0 , ω C ̲ ]
[ ω A ̲ , ω C ̲ ] + [ A ̲ 1 , ω C ̲ ] ,
= T ~ 2 + T ~ 3 + T ~ 4 + T ~ 5 + T ~ 6 .
We will show that all of these terms obey the estimate:
T ~ k D ( L 1 ( L ) ) , 2 k 6 . (445)
Notice that, for the most part, the terms T ~ k   represent less singular versions of the T k   on line  432 above. In fact, they can all be treated using similar embeddings by simply wasting one derivative. Specifically, the estimate  445 for the first term T ~ 2   follows directly from  436 above once one takes into account the presence of the truncation  126c inherent in the projection P ~   . To prove the estimate  445 for the portion of term T ~ 3   containing the ω L ̲   derivative, we use the same embedding employed in the proof of the estimate for T 3   on line  434 above. This follows because one can distribute the time derivative and simply waste smoothness in the estimates  404 ,  406 , and  407  408 . Specifically, by taking advantage of the low frequency behavior of these estimates, we have the bounds:
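\begin{align*}
\nabla_t{}^\omega\underline{A} &\in D\big(L_t^2(\dot B_{2,9n}^{q_\gamma,(2,\frac{n-1}{2})})\big) , &
\nabla_t{}^\omega\underline{C} &\in D\big(L_t^2(\dot B_{2,9n}^{q_\gamma,(2,\frac{n-1}{2})})\big) , \tag{446} \\
\nabla_t{}^\omega\underline{L}\,{}^\omega\underline{A} &\in D\big(L_t^2(\dot B_{2,9n}^{p_\gamma,(2,\frac{n-3}{2})})\big) , &
\nabla_t{}^\omega\underline{L}\,{}^\omega\underline{C} &\in D\big(L_t^2(\dot B_{2,9n}^{p_\gamma,(2,\frac{n-3}{2})})\big) . \tag{447}
\end{align*}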
Using a similar strategy, we can prove the estimate (445) for the portion of $\widetilde{T}_3$ containing the $\omega\cdot\nabla_x$ derivative (notice that the functions $\omega_i$ are trivially decomposable) as well as the term $\widetilde{T}_4$ in the same way as we showed (434) for the term $T_4$ above. All we need to do is to show the estimate:
\[
\nabla_t{}^\omega C_0 \in D\big(L_t^2(\dot B_{2,9n}^{q_\gamma,(2,\frac{n-1}{2})})\big) .
\]
This follows in the same way we proved the undifferentiated estimate (405) for ${}^\omega C_0$ above, but instead of using the undifferentiated versions of (403), (405), and (407)–(408), we simply use (446)–(447). Finally, notice that the proof of the estimate (445) for the terms $\widetilde{T}_5$ and $\widetilde{T}_6$ above follows by simply multiplying (decompose twice!) the $D(L^2(L^\infty))$ estimate which is implied by the bounds (403) and (405) above. This completes our decomposition of the second error term on the right hand side of (203) above.
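The multiplication step for the last two terms rests on Hölder's inequality in time. As a minimal sketch, stated for ordinary functions $f$ and $g$ rather than in the decomposable norms $D(\cdot)$, whose definition is given earlier in the paper and is not reproduced here:

```latex
\[
\| fg \|_{L_t^1(L_x^\infty)}
\;\leq\; \int \| f(t) \|_{L_x^\infty}\, \| g(t) \|_{L_x^\infty}\, dt
\;\leq\; \| f \|_{L_t^2(L_x^\infty)}\, \| g \|_{L_t^2(L_x^\infty)} .
\]
```

The first inequality is the pointwise bound $|fg| \leq |f|\,|g|$ and the second is Cauchy–Schwarz in $t$. The decomposable analogue is obtained in the text by applying the decomposition to each factor separately.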

  Decomposing the term $\big[\underline{A}_\alpha^1 - ({}^\omega C^\pm)_\alpha , \big[(\underline{A}^1)^\alpha - {}^\omega C^{\alpha\,\pm} , \,\cdot\,\big]\big]$

Here we again use the norm $D(L^1(L^\infty))$, which we can achieve as a product of $D(L^2(L^\infty))$ estimates, again making an appeal to (403) and (405) above.
This completes our proof of the approximation estimate (178e) and thus, at last, the proof of Proposition 7.2, which allows us to close the bootstrapping begun in Proposition 6.1.

References

  1. Jöran Bergh, Jörgen Löfström, Interpolation spaces. An introduction. Grundlehren der Mathematischen Wissenschaften, No. 223. Springer-Verlag, Berlin-New York, 1976.
  2. P. Bizoń, Z. Tabor, On blowup of Yang-Mills fields. Phys. Rev. D (3) 64 (2001), no. 12, 121701, 4 pp.
  3. Thierry Cazenave, Jalal Shatah, Shadi A. Tahvildar-Zadeh, Harmonic maps of the hyperbolic space and development of singularities in wave maps and Yang-Mills fields. Ann. Inst. H. Poincaré Phys. Théor. 68 (1998), no. 3, 315–349.
  4. Markus Keel, Terence Tao, Endpoint Strichartz estimates. Amer. J. Math. 120 (1998), no. 5, 955–980.
  5. Sergiu Klainerman, Igor Rodnianski, Improved local well-posedness for quasilinear wave equations in dimension three. Duke Math. J. 117 (2003), no. 1, 1–124.
  6. Sergiu Klainerman, Igor Rodnianski, On the global regularity of wave maps in the critical Sobolev norm. Internat. Math. Res. Notices 2001, no. 13, 655–677.
  7. Andrea Nahmod, Atanas Stefanov, Karen Uhlenbeck, On the well-posedness of the wave map problem in high dimensions. Comm. Anal. Geom. 11 (2003), no. 1, 49–83.
  8. Igor Rodnianski, Terence Tao, Global regularity for the Maxwell-Klein-Gordon equation with small critical Sobolev norm in high dimensions. Comm. Math. Phys. 251 (2004), no. 2, 377–426.
  9. Jalal Shatah, Michael Struwe, The Cauchy problem for wave maps. Int. Math. Res. Not. 2002, no. 11, 555–571.
  10. Hart F. Smith, Daniel Tataru, Sharp local well-posedness results for the nonlinear wave equation. To appear in Annals of Mathematics.
  11. Terence Tao, Global regularity of wave maps. I. Small critical Sobolev norm in high dimension. Internat. Math. Res. Notices 2001, no. 6, 299–328.
  12. Michael E. Taylor, Tools for PDE. Pseudodifferential operators, paradifferential operators, and layer potentials. Mathematical Surveys and Monographs, 81. American Mathematical Society, Providence, RI, 2000.
  13. Karen K. Uhlenbeck, Connections with $L^p$ bounds on curvature. Comm. Math. Phys. 83 (1982), no. 1, 31–42.