The sum of the first \(n\) primes less than \(\sqrt{n}\):

On the MathOverflow, I recently came across the following theorem:

\begin{equation} S(n) = \sum_{p \leq \sqrt{n}} p \end{equation}

\begin{equation} S(n) \sim \pi(n) \end{equation}

although the proof of this theorem is rather elementary, I don’t think anybody would have expected this result unless they first came across Mertens’ three theorems.

Even then however, this result is quite unexpected.


If we let \(X_p = 1\) if \(p \in \mathbb{P}\) and \(X_p = 0\) otherwise, then we have:

\begin{equation} \pi(n) \sim \sum_{p \leq n} X_p \end{equation}

and we also have:

\begin{equation} \lfloor \sqrt{n} \rfloor \cdot \pi(\sqrt{n}) = \sum_{p \leq \sqrt{n}} \lfloor \sqrt{n} \rfloor \cdot X_p \end{equation}

\begin{equation} \sum_{p \leq \sqrt{n}} \pi(p) = \sum_{p \leq \lfloor \sqrt{n} \rfloor - 1} (\lfloor \sqrt{n} \rfloor - 1 - (p-1)) \cdot X_p = \sum_{p \leq \lfloor \sqrt{n} \rfloor - 1} (\lfloor \sqrt{n} \rfloor - p) \cdot X_p \end{equation}

Given the definition of \(S(n)\) we have:

\begin{equation} S(n) = \lfloor \sqrt{n} \rfloor \cdot \pi(\sqrt{n}) - \sum_{p \leq \sqrt{n}} \pi(p) \end{equation}

Now, if we use the error term of the Prime Number Theorem we have:

\begin{equation} \lvert \pi(N) - \frac{N}{\ln N} \rvert = \mathcal{O}\big(\frac{N}{(\ln N)^2}\big) \end{equation}

and therefore:

\begin{equation} S(n) = \lfloor \sqrt{n} \rfloor \cdot \frac{\sqrt{n}}{\ln \sqrt{n}} - \sum_{k=2}^{\lfloor \sqrt{n} \rfloor - 1} \frac{k}{\ln k} + \mathcal{O}\big(\frac{n}{(\ln \sqrt{n})^2}\big) \end{equation}

Meanwhile, using integration by parts we find:

\begin{equation} \int_{2}^{\sqrt{n}} \frac{x}{\ln x} dx \approx \frac{n}{\ln n} - \int_{2}^{\sqrt{n}} \frac{x^2}{(\ln x)^2} dx \leq \frac{n}{\ln n} -4 \cdot \frac{n}{(\ln n)^2} = \frac{n}{\ln n} + \mathcal{O}\big(\frac{n}{(\ln n)^2}\big) \end{equation}

so (8) simplifies to:

\begin{equation} S(n) \sim \frac{n}{\ln n} + \mathcal{O}\big(\frac{n}{(\ln n)^2}\big) \end{equation}