God may not play dice with the universe, but something strange is going on with the prime numbers.-Paul Erdős

Primo, the distribution of prime numbers is algorithmically random:

Assuming you went over my proof that Cramér’s conjecture is almost surely true, we’ll now take an information-theoretic approach to the same problem. Chaitin’s result that almost all integers are incompressible implies:

\begin{equation} K_U(N) \sim \log_2 N = \sum_{i=1}^N \alpha_i \cdot \log_2 p_i \end{equation}

where \(\prod_{i=1}^N p_i^{\alpha_i}\) is the unique prime factorisation of \(N\).

It follows that the prime numbers are incompressible and therefore the distribution of prime numbers is algorithmically random. This is an important point we shall revisit towards the end of this investigation.

An information-theoretic upper-bound on prime gaps:

Let’s suppose that the locations of the nth consecutive prime numbers are given by \(p_n = l\) and \(p_{n+1}= l + N\) so we may construct a chain of indicator variables \(\{X_n[i]\}_{i=1}^{N+1}\) such that \(X_n[1]=1,X_n[N+1]=1\) and \(\forall j \in [2,N], X_n[j] = 0\).

Now, in the worst case the occurrences of \(p_n\) and \(p_{n+1}\) are independent so we have:

\begin{equation} P(X_n[1] = 1 \land X_n[N+1]=1) \geq \frac{1}{(\ln p_n)^2} \end{equation}

Therefore, \(X_n\) carries at most:

\begin{equation} H(X_n) \leq - \log_2 \frac{1}{(\ln p_n)^2} = \log_2 (\ln p_n)^2 \end{equation}

bits of information.

Given that information is additive and each link in the chain contains one bit of information, the length of \(X_n\) is bounded by:

\begin{equation} p_{n+1}-p_n \leq (\ln p_n)^2 \end{equation}

and so we may conclude that in the worst case:

\begin{equation} p_{n+1}-p_n = \mathcal{O}((\ln p_n)^2) \end{equation}


It is worth considering why information-theoretic methods are so effective. The only Universe where Occam’s razor is generally applicable is one where information is conserved. Moreover, the law of conservation of information implies that fundamental physical laws are time-reversible. This has important implications for the distribution of prime numbers as a second implication of the law of conservation of information is that all computations are observer-dependent.

In particular if our observer is represented by the Turing Machine \(\Omega\) then an application of the Minimum Description Length Principle yields:

\begin{equation} K_{\Omega}(X_N) = K_U(X_N|\Omega) + K_U(\Omega) \implies K_{\Omega}(X_N) \geq K_U(\Omega) \end{equation}

and since \(\Omega\) must contain arithmetic we must conclude that the distribution of prime numbers is algorithmically random relative to \(\Omega\). To be precise, if \(X_N = \{x_n\}_{n=1}^N\) denotes the encoding of the prime numbers where \(x_n=1\) if \(n\) is prime and \(x_n = 0\) otherwise then the algorithmic randomness of the prime numbers implies:

\begin{equation} K_U(X_N) \sim N \sim \pi(N) \cdot \ln N \end{equation}

and so the prime numbers behave as if they are uniformly distributed in \([1,N]\). This is all that could have been known by a Universal Turing Machine at the instant the Universe was created.


  1. Cramér, H. “On the Order of Magnitude of the Difference Between Consecutive Prime Numbers.” Acta Arith. 2, 23-46, 1936.

  2. Hardy, G. H.; Ramanujan, S. (1917), “The normal number of prime factors of a number n”, Quarterly Journal of Mathematics

  3. Turán, Pál (1934), “On a theorem of Hardy and Ramanujan”, Journal of the London Mathematical Society

  4. Yufei Zhao. The Probabilistic Method in Combinatorics. 2019.

  5. F. Mertens. J. reine angew. Math. 78 (1874)