## Introduction:

Cramér’s conjecture states that the upper-bound on prime gaps is well-behaved:

$$\exists M \in \mathbb{N}, p_{n+1}-p_n \leq M \cdot (\ln p_n)^2$$

or equivalently,

$$p_{n+1}-p_n = \mathcal{O}((\ln p_n)^2)$$

A probabilistic proof of Cramér’s conjecture is developed within a frequentist perspective. Other than prior knowledge of Bertrand’s postulate, Mertens’ second theorem and the Prime Number Theorem this proof is entirely self-contained.

The key idea is to show that Cramér’s conjecture holds almost surely.

## An upper-bound on prime gaps:

Given that $$\int \ln x dx = x \cdot \ln x + x$$ we may define the integrals:

$$I_{N+1} = \int_{1}^{P_{N+1}} \ln x dx = P_{N+1} \cdot \ln P_{N+1} - P_{N+1} + 1$$

$$I_{N} = \int_{1}^{P_{N}} \ln x dx = P_{N} \cdot \ln P_{N} - P_{N} + 1$$

where $$P_N$$ and $$P_{N+1}$$ are consecutive prime numbers.

Now, if we analyse the difference $$0 \leq \Delta I_N = I_{N+1} - I_N$$ we find:

$$0 \leq \Delta I_N = P_{N+1} \cdot \ln P_{N+1} - P_{N+1} - (P_{N} \cdot \ln P_{N} - P_{N})$$

and from (3) we may deduce:

$$0 \leq P_{N+1} - P_N \leq P_{N+1} \cdot \ln P_{N+1} - P_N \cdot \ln P_N$$

Meanwhile, the Prime Number Theorem tells us that:

$$P_N \sim \pi(P_N) \cdot \ln P_N = N \cdot \ln P_N$$

from which we may deduce:

$$P_N \cdot \ln P_N \sim N \cdot (\ln P_N)^2$$

and as Bertrand’s postulate implies $$\ln P_N < \ln P_{N+1} < \ln P_N + 1$$ we also have:

$$P_{N+1} \cdot \ln P_{N+1} \sim (N+1) \ln (P_{N})^2$$

Finally, if we combine (4), (6) and (7) we may conjecture a sensible upper-bound on prime gaps:

$$0 \leq P_{N+1} - P_N \leq (\ln P_N)^2 + \delta$$

and the remainder of this analysis will show that $$\delta$$ is essentially negligible.

## A derivation of Cramér’s heuristic, or the typical probability of observing a prime number:

As is done in probabilistic proofs of the Hardy-Ramanujan theorem, we may define the indicator variable $$X_p$$ where $$X_p = 1$$ if $$x$$ is divisible by the prime number $$p$$ and $$X_p = 0$$ otherwise. From a frequentist perspective:

$$\mathbb{E}[X_p] = 1 \cdot \frac{1}{p} + 0 \cdot (1- \frac{1}{p}) = \frac{1}{p}$$

as numbers divisible by $$p$$ occur with frequency $$\frac{1}{p}$$.

Given the Nth prime number $$P_N$$, we may use the frequentist perspective to define the typical frequency $$P$$ with which a number $$x \in [1,P_N]$$ is not divisible by any prime in $$[1,P_N]$$:

$$P = \prod_{p \leq P_N} \big(1-\frac{1}{p}\big)$$

and due to Mertens’ second theorem we have:

$$-\ln P \approx \sum_{p \leq P_N} = \ln \ln P_N + \mathcal{O}(1)$$

which implies that:

$$P \approx \frac{1}{\ln P_N}$$

so a prime number $$p \in [1,P_N] \cap \mathbb{P}$$ typically occurs with frequency $$\frac{1}{\ln P_N}$$ and the expected number of primes in $$[1,P_N]$$ scales as follows:

$$N \sim \frac{P_N}{\ln P_N}$$

as there are exactly $$N$$ primes in $$[1,P_N]$$.

## A probabilistic proof of Cramér’s theorem:

Bertrand’s postulate implies that $$\ln \frac{p_{n+1}}{p_n} \leq \ln 2$$ and therefore the density of the primes in $$[P_N, 2 \cdot P_N]$$ may be approximated by $$\frac{1}{\ln P_N}$$ without affecting the asymptotic analysis that follows.

In particular, let’s consider the events that $$\exists \alpha, \beta, \gamma \in \mathbb{R}_{+}^3$$ such that:

$$A_N : \exists p_{n+1} \in [P_N + 1, 2 \cdot P_N], p_{n+1}-p_n \geq \alpha \cdot (\ln p_n)^2 + \beta \cdot \ln p_n + \gamma$$

$$B_N : \mathbb{P} \cap [P_N, 2 \cdot P_N] \neq \emptyset$$

Due to Bertrand’s postulate, we know that $$P(B_N) = 1$$ and since we also have $$A_N \implies B_N$$ we have:

$$A_N \subset B_N$$

and therefore:

$$P(A_N|B_N) = \frac{P(A_N \cap B_N)}{P(B_N)} = P(A_N \cap B_N) = P(A_N)$$

It follows that we may define the upper-bound:

$$P(A_N |B_N) \leq P_N \cdot P(A_N)$$

$$P(A_N) = (1- \frac{1}{\ln P_N})^{\alpha \cdot (\ln P_N)^2} \cdot (1- \frac{1}{\ln P_N})^{\beta \cdot \ln P_N} \cdot (1- \frac{1}{\ln P_N})^{\gamma}$$

where $$P(A_N) \approx \frac{e^{-\beta}}{(P_N)^{\alpha}} \leq \big(\frac{1}{P_N}\big)^{\alpha}$$.

Now, in order to consider any $$\alpha \in (1,\infty)$$ we’ll use the fact that:

$$\int_{1}^{\infty} e^{-x \ln c} dx = \frac{1}{c \ln c}$$

so we have:

$$P(A_N | B_N) \leq P_N \cdot (\frac{1}{P_N \cdot \ln P_N}) = \frac{1}{\ln P_N}$$

so we may deduce that:

$$\lim_{N \to \infty} P(A_N | B_N) \leq \lim_{N \to \infty} \frac{1}{\ln P_N} = 0$$

so $$A_N$$ is a measure zero event.

## References:

1. Cramér, H. “On the Order of Magnitude of the Difference Between Consecutive Prime Numbers.” Acta Arith. 2, 23-46, 1936.

2. Hardy, G. H.; Ramanujan, S. (1917), “The normal number of prime factors of a number n”, Quarterly Journal of Mathematics

3. Turán, Pál (1934), “On a theorem of Hardy and Ramanujan”, Journal of the London Mathematical Society

4. Yufei Zhao. The Probabilistic Method in Combinatorics. 2019.

5. F. Mertens. J. reine angew. Math. 78 (1874)