An information-theoretic derivation of Chebyshev’s theorem (1852), an important precursor of the Prime Number Theorem.
The aim of this article is to derive Chebyshev’s theorem (1852) using an adaptation of Kontoyiannis’ original arguments [3].
For an integer sampled uniformly from the interval \([1,N]\) we may define its random prime factorisation in terms of the random variables \(X_p\):
\[\begin{equation} Z \sim U([1,N]), \quad Z = \prod_{p \leq N} p^{X_p} \tag{1} \end{equation}\]
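To make (1) concrete, the following sketch (the helper names `primes_up_to` and `exponents` are mine, not from the original article) samples \(Z\) uniformly from \([1,N]\) and verifies that it is recovered from its exponents \(X_p\):

```python
import random

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [p for p, is_p in enumerate(sieve) if is_p]

def exponents(z, primes):
    """X_p: the exponent of each prime p in the factorisation of z."""
    xs = {}
    for p in primes:
        k = 0
        while z % p == 0:
            z //= p
            k += 1
        xs[p] = k
    return xs

random.seed(0)
N = 1000
Z = random.randint(1, N)            # Z ~ U([1, N])
X = exponents(Z, primes_up_to(N))

# Equation (1): Z is recovered as the product of p^{X_p}
product = 1
for p, k in X.items():
    product *= p ** k
assert product == Z
```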
While the description length of \(Z\) is not computable, we may calculate its Expected Kolmogorov Complexity using the relation [4]:
\[\begin{equation} \mathbb{E}[K_U(Z)] = H(Z) + \mathcal{O}(1) \tag{2} \end{equation}\]
In our case, the asymptotic formula for \(\mathbb{E}[K_U(Z)]\) may be explicitly calculated using the fact that almost all integers are algorithmically random, together with Stirling’s approximation:
\[\begin{equation} \mathbb{E}[K_U(Z)] \sim \mathbb{E}[\log_2 Z] = \frac{1}{N} \sum_{k=1}^N \log_2 k = \frac{\log_2 N!}{N} \sim \log_2 N \tag{3} \end{equation}\]
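As a quick numerical sanity check of (3) (a sketch, not part of the original argument), we can compare \(\frac{\log_2 N!}{N}\) against \(\log_2 N\) using the identity \(\ln N! = \mathtt{lgamma}(N+1)\):

```python
import math

# E[log2 Z] = (1/N) * sum_{k=1}^N log2 k = log2(N!) / N,
# computed via lgamma(N + 1) = ln(N!)
N = 10**6
expected_log2_Z = math.lgamma(N + 1) / (math.log(2) * N)
ratio = expected_log2_Z / math.log2(N)
# By Stirling's approximation, ratio -> 1 as N -> infinity
```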
Hence, combining (2) and (3) with the expected logarithm of (1), the Shannon Entropy of \(Z\) simplifies to:
\[\begin{equation} \log_2 N \sim \sum_{p \leq N} \mathbb{E}[X_p] \cdot \log_2 p \tag{4} \end{equation}\]
where we may calculate \(\mathbb{E}[X_p]\) using what we know about the Algorithmic Probability of a prime number:
\[\begin{equation} P(Z \bmod p = 0) \sim \frac{1}{p} \tag{5} \end{equation}\]
Using the Universal Distribution (5), we find:
\[\begin{equation} \mathbb{E}[X_p] = \sum_{k \geq 1} P(X_p \geq k) = \sum_{k \geq 1} P(Z \bmod p^k = 0) \sim \sum_{k \geq 1} \frac{1}{p^k} \tag{6} \end{equation}\]
which simplifies to:
\[\begin{equation} \mathbb{E}[X_p] \sim \frac{1}{p} \tag{7} \end{equation}\]
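We can check (6) and (7) against the exact expectation for a uniform \(Z\), namely \(\mathbb{E}[X_p] = \sum_{k \geq 1} \lfloor N/p^k \rfloor / N\), whose geometric-series limit is \(\frac{1}{p-1} \sim \frac{1}{p}\) (a numerical sketch, not part of the original argument):

```python
# Exact expectation of X_p for Z ~ U([1, N]):
# E[X_p] = sum_{k >= 1} P(X_p >= k) = sum_{k >= 1} floor(N / p^k) / N
N = 10**7
exact = {}
for p in (2, 3, 5, 7, 11):
    e, pk = 0.0, p
    while pk <= N:
        e += (N // pk) / N    # P(Z mod p^k = 0) = floor(N / p^k) / N
        pk *= p
    exact[p] = e
# The geometric series in (6) sums to 1/(p - 1), which is ~ 1/p
```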
Thus, we may deduce Chebyshev’s remarkable theorem in base-2:
\[\begin{equation} \sum_{p \leq N} \frac{1}{p} \cdot \log_2 p \sim \log_2 N \tag{8} \end{equation}\]
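A numerical check of (8) (the sieve helper is mine) shows the ratio of the two sides approaching 1, though only logarithmically fast, since Chebyshev’s estimate carries a bounded additive error term:

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    return [p for p in range(2, n + 1) if sieve[p]]

N = 10**6
lhs = sum(math.log2(p) / p for p in primes_up_to(N))
ratio = lhs / math.log2(N)   # -> 1 as N -> infinity, but slowly
```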
What does this theorem mean? Given the Algorithmic Probability \(m(X)\) of an event \(X\), we may consider (8) in light of Levin’s Coding theorem [5]:
\[\begin{equation} -\log_2 m(X) = K_U(X) + \mathcal{O}(1) \implies -m(X) \cdot \log_2 m(X) \sim m(X) \cdot K_U(X) \tag{9} \end{equation}\]
and the information-theoretic independence of distinct primes:
\[\begin{equation} \forall p,q \in \mathbb{P}, P(Z \bmod p = 0 \land Z \bmod q = 0) \sim \frac{1}{p} \cdot \frac{1}{q} \tag{10} \end{equation}\]
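The independence claim (10) can also be checked exactly for a uniform \(Z\), since \(P(Z \bmod pq = 0) = \lfloor N/pq \rfloor / N\) for distinct primes \(p, q\) (a numerical sketch under that assumption):

```python
# Exact joint divisibility for Z ~ U([1, N]):
# P(Z mod p = 0 and Z mod q = 0) = floor(N / (p*q)) / N for distinct primes p, q
N = 10**6
max_gap = 0.0
for p, q in ((2, 3), (3, 5), (5, 11)):
    joint = (N // (p * q)) / N
    independent = ((N // p) / N) * ((N // q) / N)
    max_gap = max(max_gap, abs(joint - independent))
# max_gap is tiny: divisibility by distinct primes is asymptotically independent
```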
so Chebyshev’s theorem tells us that the Algorithmic Probability that we observe a prime number in the interval \([1,N]\) is on the order of \(2^{-\log_2 N} = \frac{1}{N}\) as \(N \to \infty\).
[1] P.L. Chebyshev. Mémoire sur les nombres premiers. J. de Math. Pures Appl., 17:366–390, 1852.
[2] P.L. Chebyshev. Sur la totalité des nombres premiers inférieurs à une limite donnée. J. de Math. Pures Appl., 17:341–365, 1852.
[3] I. Kontoyiannis. Some information-theoretic computations related to the distribution of prime numbers. In Festschrift in Honor of Jorma Rissanen (P. Grünwald, P. Myllymäki, I. Tabus, M. Weinberger, B. Yu, eds.), pp. 135–143, Tampere University Press, May 2008.
[4] P. Grünwald and P. Vitányi. Shannon Information and Kolmogorov Complexity. 2010.
[5] L.A. Levin. Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems of Information Transmission, 10(3):206–210, 1974.
For attribution, please cite this work as
Rocke (2022, Jan. 9). Kepler Lounge: Chebyshev's theorem via Occam's razor. Retrieved from keplerlounge.com
BibTeX citation
@misc{rocke2022chebyshev,
  author = {Rocke, Aidan},
  title = {Kepler Lounge: Chebyshev's theorem via Occam's razor},
  url = {keplerlounge.com},
  year = {2022}
}