Progress on the Twin Primes Conjecture

Number theory is chock-full of easily stated problems that are very difficult to solve. One such is the twin primes conjecture, which asserts simply that there are infinitely many twin primes. I'll assume you know what a prime number is. Twin primes are primes that differ by exactly two, such as 3 and 5, 5 and 7, 11 and 13, 17 and 19, or 101 and 103.

Of course, everyone knows there are infinitely many primes. If you have not seen it before, here's one easy way to prove that fact. We assume for a contradiction that there are only finitely many primes, let's say k of them. Then we can make a list:

$latex p_1, p_2, p_3, \ldots, p_k$

of all the primes. But now we can define a number N by multiplying all of these numbers together and adding one. That is, we have:

$latex N=p_1p_2p_3 \cdots p_k+1$

But we know that N must be divisible by some prime number, since every number has at least one prime factor. (Of course, the number N might itself be prime, but that's OK, since every number is a factor of itself.)

We now ask, what prime number divides N? It cannot be any of the primes on our list, since N leaves a remainder of 1 when divided by any of those primes. The only possible conclusion is that there must be a prime number that is not on the list. This contradicts our assumption that our list was complete. It follows that no finite list can possibly contain all the primes, and therefore there are infinitely many prime numbers.

There is also a clever trick for showing that there are arbitrarily long sequences of non-prime numbers. Recall that if n is a positive integer, then by n!, read “n factorial”, we mean the product of all the integers from 1 to n. For example, we have

$latex 6!=6 \times 5 \times 4 \times 3 \times 2 \times 1=720$

Notice that 6! is a multiple of the numbers 2, 3, 4, 5 and 6. I can use this fact to produce a sequence of five consecutive non-prime numbers, namely 722, 723, 724, 725 and 726. 722 is a multiple of 2, 723 is a multiple of 3, 724 is a multiple of 4, 725 is a multiple of 5 and 726 is a multiple of 6. Do you see the point? If you want n consecutive non-prime numbers, just consider the sequence:

$latex (n+1)!+2,\ (n+1)!+3, \ (n+1)!+4, \ldots, (n+1)!+n, \ (n+1)!+(n+1)$

The first number is divisible by 2, the next is divisible by 3, the next is divisible by 4, and so on until the last number, which is a multiple of n+1. Clever!

So, there are infinitely prime numbers, but also arbitrarily long gaps between primes. There is also the prime number theorem, which tells us that for very large values of x, the number of primes smaller than x is closely approximated by

$latex \dfrac{x}{\ln x}$

(A more precise statement of the theorem can be found here.)

None of this really tells us much about twin primes, alas. But we do now know that there is some finite number with the property that there are infinitely many pairs of primes by differ by no more than that number:

On April 17, a paper arrived in the inbox of Annals of Mathematics, one of the discipline's preeminent journals. Written by a mathematician virtually unknown to the experts in his field -- a 50-something lecturer at the University of New Hampshire named Yitang Zhang -- the paper claimed to have taken a huge step forward in understanding one of mathematics' oldest problems, the twin primes conjecture.

Editors of prominent mathematics journals are used to fielding grandiose claims from obscure authors, but this paper was different. Written with crystalline clarity and a total command of the topic’s current state of the art, it was evidently a serious piece of work, and the Annals editors decided to put it on the fast track.

Just three weeks later -- a blink of an eye compared to the usual pace of mathematics journals -- Zhang received the referee report on his paper.

“The main results are of the first rank,” one of the referees wrote. The author had proved “a landmark theorem in the distribution of prime numbers.”

What is that finite number? According to the conjecture, that number should be two. Here's what Zhang achieved:

His paper shows that there is some number N smaller than 70 million such that there are infinitely many pairs of primes that differ by N. No matter how far you go into the deserts of the truly gargantuan prime numbers -- no matter how sparse the primes become -- you will keep finding prime pairs that differ by less than 70 million.

70 million might seem pretty far from 2. What you have to remember, though, is that the difference between 2 and 70 million pales in comparison to the difference between 70 million and infinity. So Zhang's theorem is definitely a big deal!

More like this

Neat! So it's correct to say that prior to this, we didn't know (for certain) the size of the largest possible gap between two "adjacent" primes? I suppose that no two primes can actually be infinitely apart, but it's still right to say the number has been reduced from "infinity" because there had been no particular upper bound. (In principle, there could be a point where the gap gets bigger and bigger, thus meaning that for every integer you can name, there is a pair of adjacent primes that distance apart. Except now we know that can't be the case.)

Anyway, before anyone makes the mistake I'd made, it should be noted that the proof of infinite primes cannot be used to generate any new primes. For example, the first n primes are 2, 3, 5, 7, 11, and 13. Their product is 30,030. Add 1 and you get 30,031, which is the product of the primes 59 and 509.

This doesn't somehow contradict the original proof because those primes are outside the set we used, and the proof is considering the assumption that its set contains all the primes, hence the couldn't be any outside it.

If I'm not mistaken, you could also make a proof like it where you subtract 1. Of course, it can't generate new primes either.

the proof of infinite primes cannot be used to generate any new primes

Neither, to the best of my knowledge, is there any other method which is 100% guaranteed to produce prime numbers. Even Mersenne's algorithm (2^p - 1 where p is a prime number), which is the method used to find the largest primes, does not always produce a prime number. There are, however, methods that are guaranteed to produce non-primes. The Sieve of Eratosthenes works by finding all of the non-primes less than or equal to a particular value: start with 2, eliminate all larger numbers that are divisible by 2, then repeat with each successive number p which has not previously been eliminated. At each stage, if p is the next number to use as a trial divisor, then any number less than p^2 which you have not already eliminated must be prime (if such a number had a prime factor greater than or equal to p, it would also have to have a prime factor less than p, and you have shown that it does not).

By Eric Lund (not verified) on 24 May 2013 #permalink

1) There is no limit on the distance between "adjacent" primes, if what you mean by adjacent is in their normal ordering. That is the point of Jason's 4th para.
2) One wouldn't use -1 in the proof just to avoid the extra explanation required if one began with the single prime 2.

I'm always fascinated by mathematical proofs that end up referencing apparently arbitrary numbers (by which I mean basically anything other than the classic 0, 1, infinity). How does a number like 70 million arise? Presumably it's the result of some rapidly increasing function somewhere in the calculations. Is 70 million precise though? It seems really unexpected that it be 'round' in base 10. Is it just journalistic rounding for 2^26+some stuff ?

Hi Zed

I'm told [I haven't looked myself] that according to WolframAlpha, the proof gives a gap size of 63,374,610

By Michael Fisher (not verified) on 24 May 2013 #permalink


What does that mean that Mersenne's algorithm can't produce primes? You generate numbers and check them for primality until you get one, 100% accurate (it's rather easy to check a number for primality, see AKS, it's new-ish, actually, but there's an obvious naive algorithm, too). People generate large prime numbers all the time for cryptography, etc. I don't know what the etc is, mostly just for crypto, probably.


I think what Eric meant is that there's no determinate (or non-recursive) algorithm to produce a prime number.

Generate and check is indeterminate, as it requires recursion when the check fails, an arbitrary number of times.


there are deterministic algorithms to find primes. They can be pretty slow though.


I am not familiar with the usage "determinate." If you're only interested in algorithms that *don't recurse*, that's seems very restrictive and not practically interesting when talking about algorithms that are actually used for things. I mean, don't they almost all recurse? Is there an especially interesting subset of practical algorithms that don't recurse?

And it's not quite true that this prime generating algorithms will recurse an arbitrary number of times. The complexity of something called a wheel sieve for finding all primes less than N is O(N/(((log N)^L)log log N), so there's a bound. And the next prime after that has to be less than the product of those primes (so that is a very huge upper bound, true).

Sorry, that's space complexity on the wheel sieve, time complexity is O(N/log log N).

GUYS I HAVE PROVED IT WHEREE CAN I SUBmit my papers??? i Proved the twin prime conjecture!!!!

So I'm still interested in this claim of "no algorithms for generating primes," -- obviously false but wondering if it's just a sloppy statement of something that's actually interesting.

I am perhaps misunderstanding Thanny's claim that there are no "determinate" algorithms for it: he can't mean that it's interesting that there are no non-recursive algorithms for finding primes. I'll grant that there are no non-recursive algorithms for finding primes, but there are also no non-recursive algorithms for finding out if a list contains a given element, so it would be very weird to me to claim that's an *interesting* restriction to put on an algorithm, because it excludes almost everything you might want an algorithm for.

As far as time bound, a quick google search turns up this comment on stackexchange:…

claiming that if Cramer's conjecture is true, then finding a prime greater than N is in P. I don't know the number theory here, i.e. I don't know what Cramer's conjecture is and whether it actually means there's only O((log N)^2) tests if it's true. And I also don't know whether Cramer's conjecture is one of those things that people tend to believe without proof, or whether people tend to think it's implausible, even though we can't find a counterexample.

But I'm assuming that post is correct, and that we can't *prove* that finding a prime greater than N is in P, but that it's reasonable to expect you will actually find that prime in polynomial time.

Certainly, computer scientists don't treat the problem of finding primes as if it's intractable.

I think that maybe Thanny meant that there's no FORMULA for generating primes. For instance, the formula 2^p -- 1, where p is a prime sometimes gives a prime result (for p=2 for instance), but sometimes gives a non-prime result (p=11, which gives 2047=89*23). There is no known formula for which we can plug in an input number and use the formula to generate ONLY prime numbers.

Aha Sean, I have a formula: (4+(-1)^n). Plug any value of n in and you will always get a prime!


Pete, Cramer's conjecture says that the next prime after p is never bigger than p + C log(p)^2 for some (fixed) C. In other words, the gaps between primes never get "too large".

I think it's generally expected to be true, but really hard to prove.

Aha, Joffan, but that's not really true! Try n = 1/2, for instance. We both made the same mistake; we didn't express ourselves precisely enough. Your formula works only for integral values of n, but you didn't specify that. My statement SHOULD have been that there's no formula that functions as a one-to-one and onto map from some subset (not necessarily a proper one) of the integers to the primes. Your example obviously is not onto the primes as only 3 and 5 are generated by it.