## Motivation:

A couple weeks ago I was working on a problem that involved the expected value of a ratio of two random variables:

\begin{equation} \mathbb{E}\big[\frac{X_n}{Z_n}\big] \approx \frac{\mu_{X_n}}{\mu_{Z_n}} - \frac{\mathrm{Cov}(X_n,Z_n)}{\mu_{Z_n}^2} + \frac{\mathrm{Var(Z_n)}\mu_{X_n}}{\mu_{Z_n}^3} \end{equation}

where $$Z_n$$ was a sum of $$n$$ i.i.d. random variables with a symmetric distribution centred at zero.

Everything about this approximation worked fine in computer simulations where $$n$$ was large but mathematically there appeared to be a problem since:

\begin{equation} \mathbb{E}\big[Z_n\big] = 0 \end{equation}

Given that (2) didn’t appear to be an issue in simulation, I went through the code several times to check whether there was an error but found none. After thinking about the problem for a bit longer it occurred to me to formalise the problem and analyse:

\begin{equation} P(\sum_{n=1}^N a_n = 0) \end{equation}

where $$a_n$$ are i.i.d. random variables with a uniform distribution centred at zero so $$\mathbb{E}[a_i]=0$$. My intuition suggested that under relatively weak assumptions:

\begin{equation} \lim_{N \to \infty} P(\sum_{n=1}^N a_n = 0) = 0 \end{equation}

We may think of this as a measure-theoretic phenomenon in high-dimensional spaces where $$N \in \mathbb{N}$$ is our dimension and $$\vec{a} \in \mathbb{R}^N$$ is a random vector.

## Analysis of a special case:

Given that (3) is a very general problem, I decided to start by analysing the special case of $$a_i \sim \mathcal{U}(\{-1,1\})$$ where:

\begin{equation} \forall n \in \mathbb{N}, P(a_n=1)=P(a_n=-1)=\frac{1}{2} \end{equation}

\begin{equation} S_0 = \{ (a_n)_{n=1}^N \in \{-1,1\} : \sum_n a_n = 0\} \end{equation}

Knowing that $$S_0$$ is non-empty only if we have parity of positive and negative terms, we may deduce that:

\begin{equation} S_0 \neq \emptyset \iff N \in 2\mathbb{N} \end{equation}

For the above reason, I focused my analysis on the following sequence:

\begin{equation} u_N = P(\sum_{n=1}^{2N} a_n = 0)= \frac{2N \choose N}{2^{2N}} = \frac{(2N)!}{2^{2N}(N!)^2} \end{equation}

## Proof that $$u_N$$ is decreasing:

We can demonstrate that $$u_N$$ is strictly decreasing by considering the ratio:

\begin{equation} \frac{u_{N+1}}{u_N}=\frac{\frac{(2N+2)!}{2^{2N+2}((N+1)!)^2}}{\frac{(2N)!}{2^{2N}(N!)^2}}=\frac{(2N+2)(2N+1)}{4(N+1)^2}=\frac{2N+1}{2N+2} < 1 \end{equation}

Now, with (9) we have what is necessary to show that:

\begin{equation} \lim_{n \to \infty} u_N = 0 \end{equation}

## Analysis of the limit $$\lim\limits_{N \to \infty} u_N$$:

Using (9) we may derive a recursive definition of $$u_N$$:

\begin{equation} u_{N+1}=\frac{2N+1}{2N+2} \cdot u_N \end{equation}

and given that $$u_0=1$$ we have:

\begin{equation} u_{N}=\prod_{n=0}^{N-1} \frac{2n+1}{2n+2}= \frac{1}{3} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdot … \end{equation}

At this point we can make the useful observation:

\begin{equation} \lim_{N \to \infty} u_N = 0 \implies \lim_{N \to \infty} - \ln u_N = \infty \end{equation}

## Proof that $$\lim\limits_{N \to \infty} u_N=0$$:

By combining (12) and (13) we find that:

\begin{equation} -\ln u_N = -\ln \prod_{n=0}^{N-1} \frac{2n+1}{2n+2}= \sum_{n=0}^{N-1} \ln \frac{2n+2}{2n+1}= \sum_{n=0}^{N-1} \ln \big(1+\frac{1}{2n+1}\big) \end{equation}

We note that when $$n\in \mathbb{N}$$ is large:

\begin{equation} \ln \big(1+\frac{1}{n}\big) \approx \frac{1}{n} \end{equation}

Now, from (15) it follows that:

\begin{equation} \sum_{n=1}^\infty \frac{1}{2n+1} = \infty \implies \sum_{n=0}^{\infty} \ln \big(1+\frac{1}{2n+1}\big) = \infty \end{equation}

Using (15) we may conclude that (10) is indeed true. In some sense, when $$n$$ is large we can expect to observe the expected value with vanishing probability.

## Discussion:

A natural question that follows is whether the above method may be used to handle other cases. Let’s consider $$a_i \sim \mathcal{U}(\{-1,0,1\})$$ where:

\begin{equation} \forall n \in \mathbb{N}, P(a_n=1)=P(a_n=0)=P(a_n=-1)=\frac{1}{3} \end{equation}

so we may define:

\begin{equation} u_N = P(\sum_{n=1}^{4N} a_n = 0)= \frac{(4N)!}{3^{4N}} \sum_{k=1}^N \frac{1}{(2k)!^2 (4N-4k)!} \end{equation}

I actually tried to analyse the combinatorics of this sequence but quickly realised that even if I managed to show that this sequence converged to zero, it wasn’t clear how this method would manage to handle the most general setting, the case of all integer dimensions $$N \in \mathbb{N}$$, and it didn’t appear to be very effective in terms of the number of calculations per case.

In order to make progress, I decided to model this problem from a different perspective.