The Prime Number Theorem and the Ergodic Hypothesis:

If a statistical physicist were to analyse the sequential data \(\{\ln p_n \}_{n=1}^N\), where \(p_n\) is the \(n\)-th prime, they might define a discrete random variable \(X_n\) with the same distribution as the primes, so that:

\begin{equation} \lvert \{ n : X_n \leq N \} \rvert \sim \pi(N) \sim \frac{N}{\ln N} \end{equation}

The key idea is that constraining the statistical behaviour of \(X_n\) by the Prime Number Theorem lets us compare two averages of \(\ln X_n\), one over the integers and one over the primes, and thereby gain insight into the ergodic behaviour of the distribution of primes.

Given \(g(x) = \ln x\), we may use the Law of the Unconscious Statistician to calculate the space average of \(g(X_n)\):

\begin{equation} \mathbb{E}[g(X_n)] = \lim_{N \to \infty} \frac{1}{N} \sum_{2 \leq x \leq N} g(x) \cdot P(x \in \mathbb{P}) = \lim_{N \to \infty} \frac{1}{N} \sum_{2 \leq x \leq N} \ln x \cdot \frac{1}{\ln x} = 1 \end{equation}

where we used the heuristic that for large \(x\), \(P(x \in \mathbb{P}) \sim \frac{1}{\ln x}\), and started the sum at \(x = 2\) because \(\frac{1}{\ln x}\) is undefined at \(x = 1\).
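The density heuristic \(P(x \in \mathbb{P}) \sim \frac{1}{\ln x}\) can be checked numerically by comparing the true prime count \(\pi(N)\) with the sum of \(\frac{1}{\ln x}\) over \(2 \leq x \leq N\). The following is a minimal sketch; the function names `prime_flags` and `density_check` are illustrative and not part of the original argument.

```python
import math

def prime_flags(limit):
    """Sieve of Eratosthenes: flags[k] is True exactly when k is prime."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for k in range(2, math.isqrt(limit) + 1):
        if flags[k]:
            # Cross out every multiple of k, starting at k*k.
            for m in range(k * k, limit + 1, k):
                flags[m] = False
    return flags

def density_check(limit):
    """Return (pi(limit), sum of 1/ln x for 2 <= x <= limit)."""
    prime_count = sum(prime_flags(limit))
    heuristic_count = sum(1.0 / math.log(x) for x in range(2, limit + 1))
    return prime_count, heuristic_count

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        exact, heuristic = density_check(n)
        print(n, exact, round(heuristic, 1), round(exact / heuristic, 4))
```

If the heuristic is reasonable, the ratio in the last column should move towards 1 as \(N\) grows.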

Likewise, we may define the time average:

\begin{equation} \lim_{N \to \infty} \frac{1}{N} \sum_{X_n \leq N} \ln X_n = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{\pi(N)} \ln X_n \end{equation}

and given that:

\begin{equation} \pi(N) \sim \frac{N}{\ln N} \implies \vartheta(N) = \sum_{p \leq N} \ln p \sim N \end{equation}

we may equate the time average with the space average:

\begin{equation} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{\pi(N)} \ln X_n = \mathbb{E}[\ln X_n] \end{equation}

which implies that the distribution of primes satisfies an ergodic hypothesis.
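As a numerical illustration of this equality, the time average \(\frac{1}{N} \sum_{p \leq N} \ln p = \vartheta(N)/N\) can be computed for increasing \(N\) and compared with the space average, which equals 1. This is a sketch under the assumption that a simple sieve is adequate for the ranges shown; `primes_up_to` and `time_average` are hypothetical helper names.

```python
import math

def primes_up_to(limit):
    """Return all primes p <= limit via the sieve of Eratosthenes."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for k in range(2, math.isqrt(limit) + 1):
        if flags[k]:
            for m in range(k * k, limit + 1, k):
                flags[m] = False
    return [k for k, is_prime in enumerate(flags) if is_prime]

def time_average(limit):
    """(1/N) * sum of ln p over primes p <= N, i.e. theta(N) / N."""
    return sum(math.log(p) for p in primes_up_to(limit)) / limit

if __name__ == "__main__":
    # The space average computed in the text is exactly 1.
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, round(time_average(n), 4))
```

The convergence is slow, so a finite computation like this only illustrates the limit and does not establish it.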

Information-theoretic corollaries:

If \(X_N = \{x_n\}_{n=1}^N\) denotes the prime encoding of \([1,N]\), where \(x_n = 1\) if \(n \in \mathbb{P}\) and \(x_n = 0\) otherwise, then its expected Kolmogorov complexity is given by:

\begin{equation} \mathbb{E}[K_U(X_N)] \sim \pi(N) \cdot \ln N \sim N \end{equation}

and we also have:

\begin{equation} \sum_{n=1}^{\pi(N)} \ln p_n \sim \pi(N) \cdot \ln N \sim N \end{equation}
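One way to justify replacing each \(\ln p_n\) by \(\ln N\) in this relation is partial (Abel) summation, a standard step that is not spelled out in the argument above:

\begin{equation} \sum_{n=1}^{\pi(N)} \ln p_n = \pi(N) \cdot \ln N - \int_2^N \frac{\pi(t)}{t} \, dt \end{equation}

Since \(\pi(t) = O\left(\frac{t}{\ln t}\right)\), the integral is \(O\left(\frac{N}{\ln N}\right)\), which is of lower order than \(\pi(N) \cdot \ln N \sim N\), so the sum is asymptotic to the leading term.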

In fact, the last asymptotic relation is consistent with Chaitin’s theorem that almost all integers are incompressible. If \(\mathbb{P}^N\) denotes the sequence of the first \(N\) primes, then:

\begin{equation} K_U(\mathbb{P}^{\pi(N)}) \sim \sum_{n=1}^{\pi(N)} K_U(p_n) \sim \frac{1}{\ln 2} \sum_{n=1}^{\pi(N)} \ln p_n \end{equation}

and these asymptotic relations together suggest that, to leading order, all the information in the integers is contained in the prime numbers.
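Kolmogorov complexity is uncomputable, so the relation above cannot be tested directly. As a rough illustration only, the sketch below substitutes the binary length of \(p_n\) for \(K_U(p_n)\). That substitution is an assumption made here, not part of the original argument. It compares the total bit length of the primes up to \(N\) with \(\frac{1}{\ln 2} \sum \ln p_n\). The names `primes_up_to` and `bit_length_ratio` are hypothetical.

```python
import math

def primes_up_to(limit):
    """Return all primes p <= limit via the sieve of Eratosthenes."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for k in range(2, math.isqrt(limit) + 1):
        if flags[k]:
            for m in range(k * k, limit + 1, k):
                flags[m] = False
    return [k for k, is_prime in enumerate(flags) if is_prime]

def bit_length_ratio(limit):
    """Ratio of total binary length of primes <= limit to theta(limit) / ln 2.

    Assumption: p.bit_length() stands in for K_U(p), which is uncomputable.
    """
    primes = primes_up_to(limit)
    total_bits = sum(p.bit_length() for p in primes)
    theta_in_bits = sum(math.log(p) for p in primes) / math.log(2)
    return total_bits / theta_in_bits

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, round(bit_length_ratio(n), 4))
```

Because each prime's bit length exceeds \(\log_2 p\) by at most one bit, the ratio stays above 1 and exceeds it by at most \(\pi(N) \ln 2 / \vartheta(N)\), a bound which shrinks like \(1/\ln N\).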

