Informationtheoretic corollaries of Mertens' theorem
Informationtheoretic interpretations of the Prime Number Theorem:
From an informationtheoretic perspective, the Prime Number theorem states that:
\(1.\) \(2^{\log_2 N}\) messages may be transmitted using \(\pi(N)\) prime numbers where each prime number, on average, encodes \(\ln(N)\) bits of information:
\begin{equation} 2^{\log_2 N} \sim \pi(N) \cdot \ln(N) \end{equation}
\(2.\) Each prime that is observed in the interval \([1,N]\) is essentially a surprise, sampled uniformly from \([1,N]\), with respect to any finitestate gambler:
\begin{equation} \mathbb{E}[K(X_N)] \sim \pi(N) \cdot \ln(N) \sim N \end{equation}
where \(X_N = \{x_i\}_{i=1}^N\) denotes the prime encoding of \([1,N]\) i.e. \(x_n = 1\) if \(n\) is prime and \(x_n = 0\) otherwise.
Given the Prime Number theorem, we may readily deduce an approximation of Mertens’ theorem:
\begin{equation} \sum_{n=1}^{\pi(N)} \frac{\ln p_n}{p_n} \sim \ln N + \mathcal{O}(\ln \ln N) \end{equation}
which follows immediately from the fact that \(\pi(p_n) = n \sim \frac{p_n}{\ln p_n}\):
\begin{equation} \sum_{n=1}^{\pi(N)} \frac{\ln p_n}{p_n} \approx \sum_{n=1}^{\pi(N)} \frac{1}{n} \sim \ln N + \mathcal{O}(\ln \ln N) \end{equation}
However, Mertens’ theorem is a bit more than a mere corollary of the Prime Number Theorem. It is a profound observation that raises important questions.
Informationtheoretic corollaries of Mertens’ theorem:
Most mathematicians know Mertens’ first theorem in its familiar form:
\begin{equation} \sum_{n=1}^{\pi(N)} \frac{\ln p_n}{p_n} = \ln N + \mathcal{O}(1) \end{equation}
but it recently occurred to me that Mertens’ theorem has an informationtheoretic corollary:
\begin{equation} X_i \sim U([1,p_i]) \implies \sum_{i=1}^{\pi(N)} P(X_i = p_i) \cdot H(X_i) = \ln N \end{equation}
so the prime numbers behave as if they were sampled uniformly without replacement from compact subsets of \(\mathbb{N}\) and the weighted average of \(\pi(N)\) primes encodes \(\ln N\) bits of information.
However, with respect to a finitestate gambler that has no a priori knowledge of locations of the primes in \([1,N]\) nor their relative magnitudes it is accurate to say that each prime is sampled uniformly from \([1,N]\) using the frequency distribution \(f(X=p) = \frac{1}{p}\) so we have:
\begin{equation} \sum_{1 \leq p \leq N} f(X=p) = \sum_{p=1}^N \frac{1}{p} \approx \ln N \end{equation}
which is an unnormalised Zipfian distribution. This relation shall be investigated in more detail in the future.
References:

Mark B. Villarino. Mertens’ Proof of Mertens’ Theorem. 2005.

Sondow, Jonathan and Weisstein, Eric W. “Mertens Theorem.” From MathWorld–A Wolfram Web Resource. https://mathworld.wolfram.com/MertensTheorem.html