Last week we saw that every positive integer greater than one can be factored into primes in an essentially unique way. This week we ask a different question: Just how many primes are there?

Euclid solved this problem a little over two thousand years ago by showing there are infinitely many primes. His proof was by contradiction. If there are only finitely many primes then we can list them all:

\[

p_1, \ p_2, \ p_3, \ p_4, \ \dots, \ p_k

\]

We can now define a new number, which we shall call $N$, by the formula

\[

N=p_1p_2p_3 \dots p_k +1

\]

That is, $N$ is obtained by multiplying all the primes together and adding one.

The fundamental theorem of arithmetic tells us that $N$ must have at least one prime factor. (Of course, $N$ might be prime itself, but that is OK since every number is a factor of itself.)

But $N$ is clearly not divisible by any of the primes on the list, since it leaves a remainder of one upon division by any of our primes. That shows there is a prime that is not on our list, which is a contradiction.

An ingenious argument! G. H. Hardy used it as one of two examples of mathematical elegance in his famous essay “A Mathematician's Apology.” I doubt many mathematicians would disagree with the choice. (Hardy' other selection, for the record, was the classic proof of the irrationality of the square root of two.)

There are many other proofs, of course. Kummer devised a clever variation on Euclid's argument. He defined $N$ by the formula

\[

N=p_1p_2p_3 \dots p_k.

\]

No adding one this time. But then he observed that $N-1$ must have a prime factor $p_i$. Since $N$ itself is also a multiple of $p_i$ we see that

\[

p_i \mid N-(N-1)=1,

\]

which is plainly impossible.

Actually, that last little trick shows that any two consecutive numbers must be relatively prime. (That means they have no prime factors in common.) Filip Saidak used that to produce a direct proof of the infinitude of the primes. Let $n$ be a positive integer greater than one. Define the numbers

\[

A=n(n+1) \phantom{xxx} \textrm{and} \phantom{xxx} B=A(A+1)

\]

Then $A$ has at least two prime factors, since it is the product of consecutive numbers. But then $B$ must have at least three prime factors. Since we can continue this sequence indefinitely we see that the number of primes must be greater than any finite number. It follows that there are infinitely many primes.

Euclid's trick of multiplying all the primes and adding one is so good that it would be a shame if the infinitude of the primes was the only thing we could prove with it. So let us generalize things.

By an *arithmetic progression* we mean a sequence of positive integers in which each element is obtained from the one before by adding the same number. An example of an arithmetic progression is

\[

1, \ 2, \ 3, \ 4, \ 5, \ 6, \ \dots

\]

That is the progression where we begin with one, and keep adding one to produce the subsequent numbers. We have shown there are infinitely many primes in that sequence. So let us ask the question: If we pick an arbitrary arithmetic progression, how many prime numbers will it contain?

Let's experiment! Here are two progressions:

\[

\begin{align*}

4k+1 &: \phantom{x} 1, \ 5, \ 9, \ 13, \ 17, \ 21, \ 25, \ 29, \ \dots \\

4k+3 &: \phantom{x} 3, \ 7, \ 11, \ 15, \ 19, \ 23, \ 27, \ 31, \ \dots

\end{align*}

\]

In the first sequence we begin with one and keep adding four. In the second we began with three and kept adding four. A quick scan reveals that some elements of the sequence are prime and some are not. Let us try to use Euclid's method to prove there are infinitely many primes in the progression $4k+3$.

Suppose there were only finitely many such primes. Then list them all. For reasons that will become clear momentarily, remove the number three itself from the list. Now construct the number

\[

N=4 \left( p_1p_2 \dots p_k)+3,

\]

Plainly $N$ has a prime factor. But it is not divisible by any of the primes on our list. It also is not divisible by three, since we deliberately left that number off our list. (Had we not done so, the number $N$ would be a multiple of three.) It follows that there are primes not on our list.

So far, all is essentially identical to Euclid. The trouble is, in this case we already knew there were primes not on our list. Specifically, all of the primes that are one more than a multiple four were omitted. We have not yet shown there are any primes of the form $4k+3$ that have been left off our list.

But wait! It cannot be that all of $N$'s factors are of the from $4k+1$. Note that $N$ itself is three more than a multiple of four. But if we multiply two numbers together, each of which is one more than a multiple of four, we get this:

\[

\left( 4k+1 \right) \left( 4 \ell + 1 \right)=\left( 16k \ell + 4k + 4 \ell \right) +1

\]

The number in the parentheses is clearly a multiple of four. It follows that our product is one more than a multiple of four. This shows that it cannot be that all of $N$'s prime factors are one more than a multiple of four. It must have at least one prime factor that is three more than a multiple of four. And since none of the primes on our list fits the bill, the proof is complete.

Well, that's one sequence down! Alas, we cannot apply the same method to the sequence $4k+1$. You see, it is possible for a number that is one more than a multiple of four to have all of its prime factors be three more than a multiple of four. For example:

\[

21=7 \times 3

\]

In fact, while Euclid's method will allow us to pick off certain progressions, it cannot be used to solve the general question we asked. So I repeat, if $ak+b$ is an arithmetic progression (in which we start with $b$ and keep adding $a$ over and over again), how many prime numbers will it contain?

Obviously we need to require that $a$ and $b$ be relatively prime. If they had a common factor other than one, then every number in the sequence would be divisible by that common factor. That means every number after the first would definitely fail to be prime. So let us make this one assumption.

It turns out that with that one simple assumption, the progression $ak+b$ contains infinitely primes. Sadly, there is no elementary proof of that fact. Dirichlet proved that in 1837, but he since he needed to invent the field of analytic number theory to do it we shall definitely save that for a different post.

- Log in to post comments

Perhaps an overly-pedantic point -- given that 3 is the first prime in the "4k+3" sequence, and you're specifically leaving it out, shouldn't that be:

\[

N=4 \left( p_2p_3 \dots p_k)+3,

\]

</hyperpedant>

Pedantic indeed! Good point though.

Reminds me of this lovely song:

http://www.youtube.com/watch?v=2b4u3SwUIe4 'Twin Prime Conjecture'

So, do mathematicians who are looking for the next largest known prime utilize the fact that it is between Pk and N?

Darlingtonia: No, for several reasons.

1. N is much, much larger than the next prime in all reasonable cases.

2. The largest known primes all have very special forms (e.g., 2^n-1) because there are very efficient algorithms for testing primality for numbers of these forms. If you want to find a larger prime number than anyone has yet found, what you do is go looking more or less at random for primes of the form 2^n-1 for n somewhat bigger than the current record. These numbers would, again, be much smaller than N.

3. In particular, no one goes looking for "the next largest prime". It's probably not at all feasible to determine the next prime after the currently-biggest-known prime.

4. There are other provable bounds on "the next prime" that are much, much smaller than N. For instance (though this is still far from the best one can do) there is a theorem called "Bertrand's postulate" that says that the next prime after n is always less than 2n. (Unless n<2.)