# Chebyshev’s theorem via Occam’s razor

An information-theoretic derivation of Chebyshev’s theorem (1852), an important precursor of the Prime Number Theorem.

Aidan Rocke https://github.com/AidanRocke
01-09-2022

For an integer sampled uniformly from the interval $$[1,N]$$ we may define its random prime factorisation in terms of the random variables $$X_p$$:

$\begin{equation} Z \sim U([1,N]), \qquad Z = \prod_{p \leq N} p^{X_p} \tag{1} \end{equation}$
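As a concrete illustration (a minimal Python sketch, not part of the derivation), the exponent vector $$(X_p)_p$$ of any particular integer may be computed by trial division:

```python
from collections import Counter

def prime_factorisation(n):
    """Return {p: X_p} such that n = product of p**X_p (trial division)."""
    exponents = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            exponents[d] += 1
            n //= d
        d += 1
    if n > 1:            # any leftover factor greater than 1 is prime
        exponents[n] += 1
    return dict(exponents)

print(prime_factorisation(360))  # 360 = 2**3 * 3**2 * 5 -> {2: 3, 3: 2, 5: 1}
```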

While the description length of $$Z$$ is not computable, we may calculate its Expected Kolmogorov Complexity using the relation:

$\begin{equation} \mathbb{E}[K_U(Z)] = H(Z) + \mathcal{O}(1) \tag{2} \end{equation}$

In our case, the asymptotic formula for $$\mathbb{E}[K_U(Z)]$$ may be explicitly calculated using the fact that almost all integers are algorithmically random:

$\begin{equation} \mathbb{E}[K_U(Z)] \sim \mathbb{E}[\log_2 Z] = \frac{1}{N} \sum_{k=1}^N \log_2 k = \frac{\log_2 N!}{N} \sim \log_2 N \tag{3} \end{equation}$
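This asymptotic is easy to check numerically. The sketch below (assuming only the Python standard library) computes $$\frac{\log_2 N!}{N}$$ via `math.lgamma`, which returns $$\ln \Gamma(N+1) = \ln N!$$ without ever forming the factorial:

```python
import math

N = 10**6
# E[log2 Z] = log2(N!)/N; lgamma(N + 1) = ln(N!) avoids computing N! itself
mean_log = math.lgamma(N + 1) / (N * math.log(2))
print(mean_log, math.log2(N))  # ~18.49 vs ~19.93; the ratio tends to 1 as N grows
```

By Stirling's approximation the gap between the two quantities is $$\log_2 e \approx 1.44$$ bits, which vanishes relative to $$\log_2 N$$.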

Hence, taking base-2 logarithms in (1) and expectations via (3), the Shannon Entropy of $$Z$$ satisfies:

$\begin{equation} \log_2 N \sim \sum_{p \leq N} \mathbb{E}[X_p] \cdot \log_2 p \tag{4} \end{equation}$
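Before taking asymptotics, this relation is in fact an exact identity: by Legendre's formula, $$\sum_{p \leq N} \mathbb{E}[X_p] \cdot \log_2 p = \frac{\log_2 N!}{N}$$. A small Python check of this identity (a sketch, assuming nothing beyond the standard library):

```python
import math

def primes_up_to(N):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(N**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(range(i * i, N + 1, i)))
    return [i for i in range(2, N + 1) if sieve[i]]

def mean_exponent(p, N):
    """E[X_p] = (1/N) * sum_k floor(N / p**k), the average p-adic valuation."""
    total, pk = 0, p
    while pk <= N:
        total += N // pk
        pk *= p
    return total / N

N = 10**4
lhs = sum(mean_exponent(p, N) * math.log2(p) for p in primes_up_to(N))
rhs = math.lgamma(N + 1) / (N * math.log(2))
print(lhs, rhs)  # agree to floating-point precision
```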

where we may calculate $$\mathbb{E}[X_p]$$ using what we know about the Algorithmic Probability of a prime number:

$\begin{equation} P(Z \bmod p = 0) \sim \frac{1}{p} \tag{5} \end{equation}$

Using the Universal Distribution (5), we find:

$\begin{equation} \mathbb{E}[X_p] = \sum_{k \geq 1} P(X_p \geq k) = \sum_{k \geq 1} P(Z \bmod p^k = 0) \sim \sum_{k \geq 1} \frac{1}{p^k} \tag{6} \end{equation}$

which simplifies to:

$\begin{equation} \mathbb{E}[X_p] \sim \frac{1}{p} \tag{7} \end{equation}$
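The geometric series in (6) sums exactly to $$\frac{1}{p-1}$$, which is $$\sim \frac{1}{p}$$ as $$p$$ grows. A quick numerical check of this step (a Python sketch; the exact count of multiples of $$p^k$$ in $$[1,N]$$ is $$\lfloor N/p^k \rfloor$$):

```python
def mean_exponent(p, N):
    """Empirical E[X_p] for Z uniform on [1, N]: count multiples of p, p**2, ..."""
    total, pk = 0, p
    while pk <= N:
        total += N // pk   # multiples of p**k in [1, N]
        pk *= p
    return total / N

N = 10**6
for p in (2, 3, 5, 101):
    print(p, mean_exponent(p, N), 1 / (p - 1), 1 / p)
```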

Thus, we may deduce Chebyshev’s remarkable theorem in base-2:

$\begin{equation} \sum_{p \leq N} \frac{1}{p} \cdot \log_2 p \sim \log_2 N \tag{8} \end{equation}$
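This is Mertens' first theorem, often credited to Chebyshev, and it can be tested directly. A Python sketch using a sieve (by Mertens the difference between the two sides stays bounded, roughly $$1.9$$ bits, rather than tending to zero):

```python
import math

def primes_up_to(N):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(N**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(range(i * i, N + 1, i)))
    return [i for i in range(2, N + 1) if sieve[i]]

N = 10**6
lhs = sum(math.log2(p) / p for p in primes_up_to(N))
print(lhs, math.log2(N))  # the difference remains O(1) as N grows
```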

What does this theorem mean? Given the Algorithmic Probability $$m(X)$$ of an event $$X$$, we may consider (8) in light of Levin’s Coding theorem:

$\begin{equation} -\log_2 m(X) = K_U(X) + \mathcal{O}(1) \implies -m(X) \cdot \log_2 m(X) \sim m(X) \cdot K_U(X) \tag{9} \end{equation}$

and the information-theoretic independence of distinct primes:

$\begin{equation} \forall p,q \in \mathbb{P}, P(Z \bmod p = 0 \land Z \bmod q = 0) \sim \frac{1}{p} \cdot \frac{1}{q} \tag{10} \end{equation}$
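Equation (10) can also be verified by direct counting; a minimal Python check (illustrative choice of primes):

```python
N = 10**6
p, q = 3, 7
# fraction of integers in [1, N] divisible by both p and q
both = sum(1 for n in range(1, N + 1) if n % p == 0 and n % q == 0)
print(both / N, 1 / (p * q))  # 0.047619 vs 0.0476190...
```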

so Chebyshev’s theorem tells us that the Algorithmic Probability of observing a prime number in the interval $$[1,N]$$ is on the order of $$2^{-\log_2 N} = \frac{1}{N}$$ as $$N \to \infty$$.

1. P.L. Chebychev. Mémoire sur les nombres premiers. J. de Math. Pures Appl., 17:366–390, 1852.

2. P.L. Chebychev. Sur la totalité des nombres premiers inférieurs à une limite donnée. J. de Math. Pures Appl., 17:341–365, 1852.

3. I. Kontoyiannis. “Some information-theoretic computations related to the distribution of prime numbers.” In Festschrift in Honor of Jorma Rissanen, (P. Grunwald, P. Myllymaki, I. Tabus, M. Weinberger, B. Yu, eds.), pp. 135-143, Tampere University Press, May 2008.

4. Peter Grünwald and Paul Vitányi. Shannon Information and Kolmogorov Complexity. 2010.

5. L.A. Levin. Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems Information Transmission, 10(3):206-210, 1974.