Chebyshev’s theorem via Occam’s razor

An information-theoretic derivation of Chebyshev’s theorem(1852), an important precursor of the Prime Number Theorem.

Aidan Rocke https://github.com/AidanRocke
01-09-2022

For an integer sampled uniformly from the interval $$[1,N]$$ we may define its random prime factorisation in terms of the random variables $$X_p$$:

$$$\forall Z \sim U([1,N]), Z = \prod_{p \leq N} p^{X_p} \tag{1}$$$

While the description length of $$Z$$ is not computable, we may calculate its Expected Kolmogorov Complexity using the relation [4]:

$$$\mathbb{E}[K_U(Z)] = H(Z) + \mathcal{O}(1) \tag{2}$$$

In our case, the asymptotic formula for $$\mathbb{E}[K_U(Z)]$$ may be explicitly calculated using the fact that almost all integers are algorithmically random:

$$$\mathbb{E}[K_U(Z)] \sim \mathbb{E}[\log_2 Z] = \frac{1}{N} \sum_{k=1}^N \log_2 k = \frac{\log_2 N!}{N} \sim \log_2 N \tag{3}$$$

Hence, the Shannon Entropy of (1) simplifies to:

$$$\log_2 N \sim \sum_{p \leq N} \mathbb{E}[X_p] \cdot \log_2 p \tag{4}$$$

where we may calculate $$\mathbb{E}[X_p]$$ using what we know about the Algorithmic Probability of a prime number:

$$$P(Z \bmod p = 0) \sim \frac{1}{p} \tag{5}$$$

Using the Universal Distribution (5), we find:

$$$\mathbb{E}[X_p] = \sum_{k \geq 1} P(X_p \geq k) = \sum_{k \geq 1} P(Z \bmod p^k = 0) \sim \sum_{k \geq 1} \frac{1}{p^k} \tag{6}$$$

which simplifies to:

$$$\mathbb{E}[X_p] \sim \frac{1}{p} \tag{7}$$$

Thus, we may deduce Chebyshev’s remarkable theorem in base-2:

$$$\sum_{p \leq N} \frac{1}{p} \cdot \log_2 p \sim \log_2 N \tag{8}$$$

What does this theorem mean? Given the Algorithmic Probability $$m(X)$$ of an event $$X$$, we may consider (8) in light of Levin’s Coding theorem [5]:

$$$-\log_2 m(X) = K_U(X) + \mathcal{O}(1) \implies -\log_2 m(X)^{m(X)} \sim m(X) \cdot K_U(X) \tag{9}$$$

and the information-theoretic independence of distinct primes:

$$$\forall p,q \in \mathbb{P}, P(Z \bmod p = 0 \land Z \bmod q = 0) \sim \frac{1}{p} \cdot \frac{1}{q} \tag{10}$$$

so Chebyshev’s theorem tells us that the Algorithmic Probability that we observe a prime number in the interval $$[1,N]$$ is on the order of $$\sim 2^{-\log_2 N} = \frac{1}{N}$$ as $$N \to \infty$$.

References:

1. P.L. Chebychev. Mémoire sur les nombres premiers. J. de Math. Pures Appl., 17:366–390, 1852.

2. P.L. Chebychev. Sur la totalité des nombres premiers inférieurs à une limite donnée. J. de Math. Pures Appl., 17:341–365, 1852.

3. I. Kontoyiannis. “Some information-theoretic computations related to the distribution of prime numbers.” In Festschrift in Honor of Jorma Rissanen, (P. Grunwald, P. Myllymaki, I. Tabus, M. Weinberger, B. Yu, eds.), pp. 135-143, Tampere University Press, May 2008.

4. Peter Grünwald and Paul Vitányi. Shannon Information and Kolmogorov Complexity. 2010.

5. L.A. Levin. Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems Information Transmission, 10(3):206-210, 1974.

Citation

Rocke (2022, Jan. 9). Kepler Lounge: Chebyshev's theorem via Occam's razor. Retrieved from keplerlounge.com
@misc{rocke2022chebyshev's,
}