Zhang's Theorem on Bounded Gaps Between Primes

by Dan Goldston

In late April 2013 Yitang Zhang of the University of New Hampshire submitted a paper to the Annals of Mathematics proving that there are infinitely many pairs of primes that differ by less than 70 million. The proof of this amazing result was verified with high confidence by several experts in the field and accepted for publication. A slightly revised public version of the paper should be available shortly.

Before describing the proof, we make one caveat: as with any major result, the proof needs to be widely examined for errors before it can be completely accepted as correct. In this case the 55-page paper is detailed and precise, and the probationary period will probably be a few months.

Zhang's theorem is a huge step forward in the direction of the twin prime conjecture. We now know for the first time that there are infinitely many pairs of primes that differ by some fixed number. The proof does not identify any specific number, only that there is at least one that is less than 70 million. (We actually expect that every even number will occur as the difference of two primes infinitely often.) Since the average spacing of primes around $x$ is $\log x$, which grows slowly to infinity, this sequence of pairs defies the isolating trend. If not twins, then certainly siblings.
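The isolating trend is easy to observe numerically. As a small illustration (the trial-division primality test below is written just for this sketch), here is a Python check comparing the average gap between consecutive primes near $10^6$ with $\log 10^6 \approx 13.8$:

```python
import math

def is_prime(n):
    # Trial division: sufficient for the small numbers used here.
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

# Consecutive primes in a window just above x = 10**6.
x = 10**6
primes = [n for n in range(x, x + 10**4) if is_prime(n)]
gaps = [q - p for p, q in zip(primes, primes[1:])]

print("average gap:", sum(gaps) / len(gaps))  # roughly 13.8
print("log x      :", math.log(x))            # about 13.8
```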

Zhang's proof makes use of a modified version of the 2005 method of Goldston, Pintz, and Yildirim (GPY). Consider the tuple $(n+h_1,n+h_2, \ldots , n+h_k)$, where the $h_i$'s are specified integer shifts and $n$ runs through the positive integers. The Hardy-Littlewood $k$-tuple conjecture states that there are infinitely many $n$ where every component of the tuple is simultaneously prime, except for choices of $h_i$'s where this is clearly impossible. Thus $(n,n+1)$ is only a prime tuple for $(2,3)$, since one of $n$ or $n+1$ is even and divisible by 2. Similarly $(n,n+2,n+4)$ is only a prime tuple for $(3,5,7)$, since one component is always divisible by $3$. On the other hand, $(n,n+2)$ and $(n,n+2,n+6)$ have no such obstruction, and we expect them to be prime tuples infinitely often; the former gives the twin primes. A tuple without the obstruction that some prime always divides one of its components is called admissible.

GPY works by applying a sieve, specifically the Selberg sieve, to a given tuple as $n$ runs over the positive integers $\le x$. Roughly speaking, the sieve removes the tuples where any component has a small prime factor and leaves only tuples where all the components have only large prime factors. The tuples left therefore have relatively few prime factors distributed among their components, and we obtain a sequence where we are likely to find primes close together. If, for example, we had a 10-tuple whose components together have at most 18 prime factors, this would force at least two components of the tuple to be prime, and we would have produced two primes whose difference is at most the width of the tuple. Sieves never work this well, or we would have solved most of the problems concerning primes long ago. Despite this, the Selberg sieve applied to larger $k$-tuples is still highly effective at producing "almost prime" tuples. We need some further idea in order to detect the primes in the tuple, and what we use is the level of distribution of primes in arithmetic progressions.
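Admissibility is easy to test by machine: a tuple of shifts fails only if, for some prime $p \le k$, the shifts cover every residue class modulo $p$. Here is a short Python sketch of such a check (the function name admissible is ours, not from the literature):

```python
def admissible(shifts):
    """Check whether a tuple of integer shifts is admissible.

    The tuple fails only if for some prime p the shifts hit every
    residue class mod p; this can only happen for p <= len(shifts),
    so it suffices to test those primes.
    """
    k = len(shifts)
    for p in range(2, k + 1):
        if all(p % d for d in range(2, p)):        # p is prime
            residues = {h % p for h in shifts}
            if len(residues) == p:                 # all classes mod p covered
                return False
    return True

print(admissible((0, 2)))        # True:  the twin prime tuple (n, n+2)
print(admissible((0, 2, 4)))     # False: some component is divisible by 3
print(admissible((0, 2, 6)))     # True:  (n, n+2, n+6)
```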

The prime number theorem says that the number of primes $\le x$ is asymptotically $x/\log x$ as $x\to \infty$. This was proved in 1896 by Hadamard and de la Vallée Poussin, and in 1899 de la Vallée Poussin extended the result to prove the prime number theorem for arithmetic progressions. Thus, for example, if we look at the three arithmetic progressions modulo 3, we have no primes except 3 in $3, 6, 9, 12, \ldots$, but asymptotically half of all the primes occur in $1, 4, 7, 10, \ldots$ and the other half in $2, 5, 8, 11, \ldots$. The principle is that among the $q$ progressions modulo $q$ the primes flow evenly into the progressions that allow primes, namely those for which no prime divides every term of the progression. There are $\phi(q)$ such progressions, where $\phi(q)$ is the Euler phi-function, equal to the number of integers $a$, $1\le a \le q$, for which $a$ and $q$ are relatively prime. Hence each progression gets asymptotically $1/\phi(q) \times x/\log x$ primes. For applications we usually need $q$ to be a function of $x$, and here our knowledge is extremely limited. For example, if $q=x^\alpha$ then there is no $\alpha >0$ known for which the asymptotic formulas are always true. Assuming the Generalized Riemann Hypothesis (GRH), this is known for any $\alpha < 1/2$. A major advance in the field occurred in 1965 when Bombieri and, independently, Vinogradov proved that the asymptotic formula is true for $\alpha < 1/2$ for "almost all" progressions. Since in many applications we break our expression up over many progressions, the Bombieri-Vinogradov Theorem has essentially the same strength as GRH for these applications. If in place of $1/2$ we assume that almost all progressions satisfy the prime number theorem for arithmetic progressions for $q=x^\alpha$ and $\alpha <\theta$, then we say the primes have level of distribution $\theta$. It is conjectured that the primes have level of distribution 1, which is called the Elliott-Halberstam conjecture, but it appears very difficult to go beyond the known level of distribution $1/2$, and GRH does not help with this either.
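In symbols, one standard way to make "level of distribution $\theta$" precise is the following. Writing $\pi(x;q,a)$ for the number of primes $\le x$ congruent to $a$ modulo $q$, the primes have level of distribution $\theta$ if for every $\epsilon > 0$ and every $A > 0$
$$\sum_{q \le x^{\theta - \epsilon}} \max_{\substack{a \\ (a,q)=1}} \left| \pi(x;q,a) - \frac{\pi(x)}{\phi(q)} \right| \ll_{A,\epsilon} \frac{x}{(\log x)^A}.$$
The Bombieri-Vinogradov Theorem is exactly this statement with $\theta = 1/2$; the "almost all" above is reflected in the averaging over the moduli $q$.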

With this preparation we return to GPY, take the sequence of almost prime tuples produced by the Selberg sieve, and freeze (or twist) one of the components to be a prime. We can compute this proportion because the sieving is implemented through multiples of the divisors of the other components, and we then count how often the frozen component is prime in these arithmetic progressions. We can freeze each of the $k$ components individually to be a prime, and we get the same result in each case. If the amount we get when we freeze one component to be a prime is greater than $1/k$ times the contribution of all the almost prime tuples, then the $k$ events where one component is frozen to be a prime cannot all be disjoint from each other, and there must be some overlap where two of the components are prime simultaneously. This produces two primes in the given tuple. The success of the method therefore relies on the ratio of the almost prime tuples where one component is a prime to all the almost prime tuples produced by the Selberg sieve. The Selberg sieve coefficients need to be chosen appropriately, and while the optimal choice is not clear, it is not too hard to make a natural and very good choice. In evaluating the twisted sum, the size of the numbers we use in sieving is limited by the level of distribution, and therefore the sieving is improved by a larger level of distribution.
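Schematically (this display is our paraphrase of the pigeonhole step, not a formula from Zhang's paper): if $w(n) \ge 0$ denotes the Selberg sieve weight attached to the tuple at $n$, and each of the $k$ frozen sums satisfies
$$\sum_{n \le x} \mathbf{1}_{\mathrm{prime}}(n+h_i)\, w(n) > \frac{1}{k} \sum_{n \le x} w(n), \qquad 1 \le i \le k,$$
then summing over $i$ gives
$$\sum_{n \le x} \Big( \sum_{i=1}^{k} \mathbf{1}_{\mathrm{prime}}(n+h_i) \Big) w(n) > \sum_{n \le x} w(n),$$
and since the inner sum is a nonnegative integer, there must be some $n$ with $w(n) > 0$ for which at least two of the components $n+h_i$ are prime.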

The result of this procedure obtained in GPY is the following. If the level of distribution is $\theta >1/2$, then you actually win and obtain two primes in your tuple for sufficiently large $k$. Thus you get bounded gaps between primes under this unproved assumption. If we assume level of distribution 1, the Elliott-Halberstam Conjecture, then we get two primes infinitely often in every admissible $6$-tuple. The smallest admissible $6$-tuple is $(n, n+4, n+6, n+10, n+12, n+16)$, which gives pairs of primes with difference $\le 16$. Since we only know level of distribution $1/2$, GPY fails unconditionally to get two primes in any tuple. However, not all is lost, and by separate arguments one can still produce two primes significantly closer than the average spacing between primes. These small gaps between primes, though, are unbounded and grow to infinity.
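This $6$-tuple is easy to check by machine. A small self-contained Python sketch (we use sympy's isprime for convenience, and the search bound of 200 is an arbitrary choice):

```python
from sympy import isprime  # any primality test will do

shifts = (0, 4, 6, 10, 12, 16)

# Admissibility: for each prime p <= 6 (so p = 2, 3, 5), the shifts
# must miss at least one residue class mod p.
for p in (2, 3, 5):
    assert len({h % p for h in shifts}) < p
print("admissible")

# Hunt for n where all six components are simultaneously prime.
hits = [n for n in range(2, 200) if all(isprime(n + h) for h in shifts)]
print(hits)  # [7, 97]: e.g. 97, 101, 103, 107, 109, 113 are all prime
```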

The most tantalizing question left by GPY was whether one could edge over the level of distribution barrier at $\theta =1/2$ to get bounded gaps between primes. Soundararajan showed that improving the choices in the Selberg sieve could not succeed, which shifted the question onto the level of distribution error terms that arise in the method. In the early 1980s Fouvry and Iwaniec were able to obtain results connected to primes in arithmetic progressions with a level of distribution greater than $1/2$. During 1985-90 Bombieri, Friedlander, and Iwaniec wrote three well-known papers in which they obtained many results that go beyond level of distribution $1/2$. In particular these papers obtain results beyond what GRH implies, and they require estimates from the spectral theory of automorphic forms. Their results in some cases correspond to a level of distribution $4/7$. One limitation of these methods is that they do not apply directly to sums of the absolute values of the error terms in the progressions, but rather to weighted sums of the error terms, where the weights have a fair degree of flexibility. Since in applications the error terms also come weighted by various arithmetic functions, these new results have found a number of important uses. For example, Brian Conrey's proof that more than $2/5$ of the zeros of the Riemann zeta-function lie on the critical line uses this analysis. In the case of GPY, however, despite attempts by a number of mathematicians, no one was able to use these techniques to deal with the error terms generated by the method.

Now Zhang has found how to break this barrier. The first step is a modification of the GPY method. Zhang shows that in using the Selberg sieve one can restrict the sieving to divisors with no large prime factors. This of course decreases the effectiveness of the sieve, but he shows that within the GPY method the decrease has a surprisingly small effect. Next, he increases the sieving range to correspond to a level of distribution $1/2 + 1/584$. This turns out to be large enough to overcome the loss from removing the large prime factors above and to give bounded gaps between primes, provided all the error terms can be controlled. This leads to a very intricate analysis using the methods of Bombieri, Friedlander, and Iwaniec. Of crucial importance is that the divisors in the sieving have no large prime factors and therefore may themselves be factored into factors of various sizes with considerable flexibility. Ultimately it is shown that all the error terms are controlled, and the result follows. It should be noted that this does not provide a proof that the primes themselves have a level of distribution greater than $1/2$ according to our definition of level of distribution above. It is too early to see how Zhang's method may be used in other applications.
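The factoring flexibility can be illustrated concretely: if every prime factor of $q$ is at most $y$, then by multiplying in prime factors one at a time, $q$ can be split as $q = rs$ with $r$ landing within a factor $y$ of any target. A toy Python sketch of this observation (the particular number, target, and function name are arbitrary choices of ours):

```python
def smooth_split(prime_factors, target):
    """Greedily multiply prime factors until the partial product r
    reaches the target.  If every factor is <= y and the full product
    exceeds the target, then target <= r < target * y, since just
    before the final multiplication r was still below the target."""
    r = 1
    for p in sorted(prime_factors):
        if r >= target:
            break
        r *= p
    return r

# q = 2 * 3 * 5 * 7 * 11 * 13 * 17 = 510510: a 17-smooth number.
factors = [2, 3, 5, 7, 11, 13, 17]
q = 1
for p in factors:
    q *= p

r = smooth_split(factors, 500)
print(r, q // r)   # 2310 221: a factorization q = r*s with r within 17x of 500
```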

Zhang's proof ends up showing that every admissible $k$-tuple with $k=3,500,000$ contains two primes. It is not easy to find the smallest such admissible tuple, but a good choice is to take the shifts $h_i$ to be the first $k$ primes that are larger than $k$. Using a math package and rounding up to the nearest 10 million gives the size $70,000,000$ for the width of the tuple, and thus for the difference between the two primes. (Actually 60,000,000 seems to work here.) This number has attracted attention, but it has no intrinsic significance. In working out a complicated proof, one often uses whatever numbers make the argument work out reasonably simply. No doubt much smaller numbers will be found in later papers of greater complexity as the limits of the method are explored. At the moment it is hard to predict even the rough size of the ultimate answer by this method. On the other hand, it would be nice to have simpler and shorter proofs which make no attempt to get any specific number.
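One can reproduce this computation directly. The Python sketch below sieves up to a limit we chose with room to spare and measures the width of the tuple formed by the first 3,500,000 primes exceeding 3,500,000; expect it to take a minute or two and a few hundred megabytes of memory:

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [n for n in range(limit + 1) if sieve[n]]

k = 3_500_000
primes = primes_up_to(70_000_000)          # limit chosen by trial
shifts = [p for p in primes if p > k][:k]  # first k primes above k
assert len(shifts) == k                    # sieve limit was large enough
print(shifts[-1] - shifts[0])              # tuple width: just under 60,000,000
```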