In physics, you come up with an idea, formulate it mathematically, find the theory’s predictions about the real world, and test those predictions by experiment. This works because God is subtle, but not malicious (to borrow Einstein’s words). In more concrete language, the laws of physics fit well together and make sense as a coherent, testable experimental whole.

Mathematical truths are not so easily verifiable. Because mathematics in some sense encompasses a much wider spectrum of possibilities, experimental tests can’t cover all the needed ground in most cases. The laws of physics are all mathematical statements, but math is a much broader subject than just the small subset that’s directly useful in physics. Much broader. As a result, most mathematical statements have to be proven mathematically. Mathematical “experiments” with computers are usually inadequate. There are exceptions, like the four-color theorem, which can be checked by brute-force computation, but those are not the rule.

A commenter who writes the very interesting blog The Outer Hoard mentioned a post of his from several years ago doing a mathematical experiment on the prime numbers. It goes like this: get some paper with a fine grid drawn on it. Start counting, and for each number go forward one space from your starting point. When you reach a prime number, turn left. Repeat. You get patterns that look like this (bigger pictures at his site):

It certainly doesn’t *look* like a random glob. There are stretches where the path seems to have a strongly preferred direction. Is this actually what’s happening, or is it just an illusion of randomness?
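The walk itself is simple enough to sketch in a few lines of Python. This is a minimal sketch rather than the commenter’s actual code; the `is_prime` and `prime_walk` names are my own, and “left” here means left in ordinary math coordinates (y pointing up):

```python
def is_prime(n):
    """Trial division; slow in general but fine at these sizes."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def prime_walk(steps):
    """Step one grid square per number, turning left after each prime.
    Returns the list of (x, y) grid points visited."""
    x, y = 0, 0
    dx, dy = 1, 0              # start heading east
    path = [(x, y)]
    for n in range(1, steps + 1):
        x, y = x + dx, y + dy
        path.append((x, y))
        if is_prime(n):
            dx, dy = -dy, dx   # left turn: (dx, dy) -> (-dy, dx)
    return path
```

Feeding the resulting points to any plotting routine reproduces the kind of picture shown above.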

> We have observed that there are regions of numbers in which every fourth interval between one prime and the next tends to be larger than the other intervals. Several intellectuals (though none of them mathematicians) have confidently told me that there’s nothing significant about this, that it boils down to the same clustering tendency you get when dealing with random numbers. And they add, as though it were some sort of reductio ad absurdum, that if my observation were significant, it would imply that there’s something special about the number four.

Unfortunately I’m not a mathematician either. I’m a recreational amateur at best. I like my lab, and the direct connection to physical facts. Theory is great and I use and develop it when needed, but I’m not naturally a theorist by temperament or innate skill. Still, I know a few things and we might be able to make some headway here.

The problem with trying to mathematically determine the expected behavior from first principles is that the properties of the spacing between the primes are not completely understood. The most important open problem in mathematics is the Riemann hypothesis, and it has to do with how “randomly” the primes are spaced. At the small scale, the tightest a “turn” can be is when consecutive primes are as close together as possible, differing by two: (5,7) or (11,13), for instance. But even for something that simple, it’s not known for sure whether there are infinitely many twin prime pairs, much less how they’re distributed. So we have to use experiments for the moment, until a good mathematician shows up.

First, we know the primes are not distributed randomly. They’re in exactly the places where the prime numbers are! But we do know that in a way their spacing has certain overall properties that can be thought of as statistical in nature. For instance, and relevant to this exploration, the density of the primes near n is about 1/log(n). So if we expect a random walk, it’s a random walk with a gradually increasing step size as the primes thin out. We can minimize this by starting at a high n and stopping before n gets tremendously higher. If we start at 1,000,000 and end at 2,000,000, the average gap between prime numbers only increases by about 5%.
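That 5% figure is just a ratio of logarithms, since the average gap near n is roughly log(n):

```python
import math

# Average prime gap near n is about log(n), so the growth in the
# typical gap across the range 1,000,000..2,000,000 is:
ratio = math.log(2_000_000) / math.log(1_000_000)
print(round((ratio - 1) * 100, 1))  # → 5.0
```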

Suggested experiment number 1: create actual random walks with a 1/log(n) probability of turning left at each step, run many of those walks for (say) 10,000 steps, and measure the distance from the starting point. Then do the same thing for the actual numbers using the original algorithm between n = 1,000,000 and 1,010,000. Then do it again between 1,010,000 and 1,020,000. Keep going until you have lots of random-walk data and lots of prime-number data, and see if there’s a significant difference in the distance from the origin between the random data and the prime data.
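Experiment 1 could be sketched like this. The function name `walk_distance` and the sample count are my own choices, and the trial-division primality test is slow in general but adequate at these sizes:

```python
import math
import random

def is_prime(n):
    """Trial division; fine for numbers around a million."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def walk_distance(start, stop, turn_at_prime=True):
    """Step once per integer n in [start, stop), turning left either at
    the actual primes or, as the control, with probability 1/log(n).
    Returns the straight-line distance from the starting point."""
    x, y, dx, dy = 0, 0, 1, 0
    for n in range(start, stop):
        x, y = x + dx, y + dy
        if turn_at_prime:
            turn = is_prime(n)
        else:
            turn = random.random() < 1.0 / math.log(n)
        if turn:
            dx, dy = -dy, dx   # left turn
    return math.hypot(x, y)

# One prime-walk sample plus a handful of random controls:
prime_d = walk_distance(1_000_000, 1_010_000)
random_d = [walk_distance(1_000_000, 1_010_000, turn_at_prime=False)
            for _ in range(20)]
```

Repeating this over many 10,000-number windows and comparing the two distributions of distances is the whole experiment.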

Suggested experiment number 2: Run control groups. Instead of always turning left, turn in a random direction. And to supplement that, run a test where the turns are made after *every other* prime number, which should wash out any preferred direction in the original experiment while retaining the overall global average spacing of the primes. How do those distances and directions from the origin statistically differ from the original algorithm?
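The two controls in experiment 2 are small variations on the same walk. Again just a sketch; `control_walk` and its mode names are inventions for illustration:

```python
import math
import random

def is_prime(n):
    """Trial division; fine for the sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def control_walk(start, stop, mode):
    """mode='random_dir': turn left or right at random at each prime.
    mode='every_other': turn left only at every second prime.
    Returns the straight-line distance from the starting point."""
    x, y, dx, dy = 0, 0, 1, 0
    prime_count = 0
    for n in range(start, stop):
        x, y = x + dx, y + dy
        if is_prime(n):
            prime_count += 1
            if mode == 'random_dir':
                if random.random() < 0.5:
                    dx, dy = -dy, dx   # left turn
                else:
                    dx, dy = dy, -dx   # right turn
            elif mode == 'every_other' and prime_count % 2 == 0:
                dx, dy = -dy, dx       # left turn at every second prime
    return math.hypot(x, y)
```

Both modes preserve the prime spacing itself while scrambling (or thinning) the turns, which is exactly what makes them useful controls.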

There is a pretty good physical analogy to this: Brownian motion, a physical process that is pretty much a random walk. It’s the random jiggling a tiny particle such as a pollen grain undergoes as it’s battered by the molecules surrounding it. From the Wikipedia article, a picture of a 2-d random walk:

Quite similar. My guess is that the apparent directional preferences in the prime numbers are just statistical flukes that you’d expect to see every once in a while, but I’m not at all certain. It *looks* a lot like Brownian motion, but that’s not a rigorous observation. Rigor will require quantitative measurements, such as in the suggested experiments above. And even then, since this is mathematics, computational experiment can’t prove anything anyway. Nevertheless, it’s a good starting point.