## Motivation:

A couple of weeks ago I was working on a problem that involved the expected value of a ratio of two random variables:

$$\mathbb{E}\big[\frac{X_n}{Z_n}\big] \approx \frac{\mu_{X_n}}{\mu_{Z_n}} - \frac{\mathrm{Cov}(X_n,Z_n)}{\mu_{Z_n}^2} + \frac{\mathrm{Var}(Z_n)\mu_{X_n}}{\mu_{Z_n}^3}$$

where $Z_n$ was defined as a sum of $n$ i.i.d. random variables with a symmetric distribution centred at zero.

Everything about this approximation worked fine in computer simulations where $n$ was large, but mathematically there appeared to be a problem since:

$$\mathbb{E}\big[Z_n\big] = 0$$

Given that (2) didn’t appear to be an issue in simulation, I went through the code several times to check whether there was an error but found none. After thinking about the problem for a bit longer, it occurred to me to formalise the problem and analyse:

$$P\big(\sum_{i=1}^N a_i = 0\big)$$

where the $a_i$ are i.i.d. random variables with a uniform distribution centred at zero, so $\mathbb{E}[a_i]=0$. We may think of this as a measure-theoretic phenomenon in high-dimensional spaces, where $N \in \mathbb{N}$ is our dimension and $\vec{a} \in \mathbb{R}^N$ is a random vector.

Now, in a previous article I analysed this problem as an infinite series for the special case $a_i \sim \mathcal{U}(\{-1,1\})$. For the more general case $a_i \sim \mathcal{U}([-N,N])$, where $[-N,N] \subset \mathbb{Z}$, it occurred to me that modelling the problem as a random walk on $\mathbb{Z}$ might be an effective approach.

## A random walk on $\mathbb{Z}$:

Let’s suppose $a_i \sim \mathcal{U}([-N,N])$ where $[-N,N] \subset \mathbb{Z}$. We may then define:

$$S_n = \sum_{i=1}^n a_i$$

Due to the i.i.d. assumption we have:

$$\mathbb{E}\big[S_n\big]= n \cdot \mathbb{E}\big[a_i\big]=0$$

We may now define:

$$u_n = P(S_n=0)$$

and ask whether $u_n$ is decreasing. In other words, what is the probability that we observe the expected value as $n$ becomes large?
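Before attempting a proof, we can sanity-check this question numerically. The sketch below (function name is my own choosing; it assumes nothing beyond the setup above) computes the law of $S_n$ by exact convolution using rational arithmetic:

```python
from fractions import Fraction

def u_sequence(N, n_max):
    """Exact values of u_n = P(S_n = 0) for n = 1, ..., n_max,
    where the steps a_i are uniform on {-N, ..., N}."""
    step = Fraction(1, 2 * N + 1)
    dist = {0: Fraction(1)}              # law of S_0: a point mass at 0
    u = []
    for _ in range(n_max):
        new = {}
        for s, p in dist.items():        # convolve with one uniform step
            for a in range(-N, N + 1):
                new[s + a] = new.get(s + a, Fraction(0)) + p * step
        dist = new
        u.append(dist.get(0, Fraction(0)))
    return u
```

For $N=1$ this gives $u_1 = u_2 = \frac{1}{3}$ and $u_3 = \frac{7}{27}$, and the sequence continues to shrink from there.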

## Small and Large deviations:

It’s useful to observe the following nested structure:

$$\forall k \in [0,N], \{\lvert S_n \rvert \leq k\} \subset \{\lvert S_n \rvert \leq k+1 \}$$

Since $\{\lvert S_n \rvert \leq N\}$ and $\{\lvert S_n \rvert > N\}$ are complementary events, we have:

$$P(\lvert S_n \rvert \leq N) + P(\lvert S_n \rvert > N) = 1$$

So we are now ready to define the probability of a ‘small’ deviation:

$$\alpha_n = P(\lvert S_n \rvert \leq N)$$

as well as the probability of ‘large’ deviations:

$$\beta_n = P(\lvert S_n \rvert > N)$$

Additional motivation for analysing $\alpha_n$ and $\beta_n$ arises from:

$$P(S_{n+1} = 0 \mid \lvert S_n \rvert > N) = 0$$

$$P(S_{n+1} = 0 \mid \lvert S_n \rvert \leq N) = \frac{1}{2N+1}$$

Furthermore, by the law of total probability we have:

$$\begin{split} P(S_{n+1}=0) & = P(S_{n+1}=0 \mid \lvert S_n \rvert \leq N) \cdot P(\lvert S_n \rvert \leq N) + P(S_{n+1}=0 \mid \lvert S_n \rvert > N) \cdot P(\lvert S_n \rvert > N) \\ & = P(S_{n+1}=0 \mid \lvert S_n \rvert \leq N) \cdot P(\lvert S_n \rvert \leq N) \\ & = \frac{P(\lvert S_n \rvert \leq N)}{2N+1} \end{split}$$
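This identity is easy to verify with exact arithmetic; the following sketch (helper names are my own) compares both sides for small $N$ and $n$:

```python
from fractions import Fraction

def law(N, n):
    """Exact law of S_n as a dict {value: probability},
    steps uniform on {-N, ..., N}."""
    step = Fraction(1, 2 * N + 1)
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for s, p in dist.items():
            for a in range(-N, N + 1):
                new[s + a] = new.get(s + a, Fraction(0)) + p * step
        dist = new
    return dist

def check_recursion(N, n):
    """P(S_{n+1} = 0) should equal P(|S_n| <= N) / (2N + 1)."""
    alpha_n = sum(p for s, p in law(N, n).items() if abs(s) <= N)
    return law(N, n + 1)[0] == alpha_n / (2 * N + 1)
```

Both sides agree exactly for every small case I tried, e.g. `check_recursion(2, 3)` and `check_recursion(1, 6)` both return `True`.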

## A remark on symmetry:

It’s useful to note the following alternative definitions of $\alpha_n$ and $\beta_n$ that emerge due to symmetries intrinsic to the problem:

$$\beta_n = P(\lvert S_n \rvert > N) = 2 \cdot P(S_n > N) = 2 \cdot P(S_n < -N)$$

$$\alpha_n = P(\lvert S_n \rvert \leq N) = 1-2 \cdot P(S_n > N)=1-2 \cdot P(S_n < -N)$$

## The case of $n=1$ and $n=2$:

Given that $S_0=0$:

$$P(S_1=0)=\frac{P(\lvert S_0 \rvert \leq N)}{2N+1}= \frac{1}{2N+1}$$

As for the case of $n=2$:

$$P(\lvert S_1 \rvert \leq N) = 1 \implies P(S_2=0) = \frac{P(\lvert S_1 \rvert \leq N)}{2N+1} = \frac{1}{2N+1}$$
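Both base cases can be confirmed by brute-force enumeration of every possible walk; a throwaway check (with a function name of my own choosing):

```python
from fractions import Fraction
from itertools import product

def p_zero(N, n):
    """P(S_n = 0) by enumerating every length-n walk with steps in {-N,...,N}."""
    steps = range(-N, N + 1)
    hits = sum(1 for walk in product(steps, repeat=n) if sum(walk) == 0)
    return Fraction(hits, (2 * N + 1) ** n)
```

For example, `p_zero(2, 1)` and `p_zero(2, 2)` both return $\frac{1}{5}$, in agreement with $u_1 = u_2 = \frac{1}{2N+1}$.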

## The case of $n=3$:

The case of $n=3$ requires that we calculate:

$$P(S_3=0)=\frac{P(\lvert S_2 \rvert \leq N)}{2N+1}$$

so we first need the probability of a large deviation at $n=2$. Conditioning on $S_1 = i$ with $i \geq 1$, the event $S_2 > N$ requires $a_2 > N-i$, which has probability $\frac{i}{2N+1}$, hence:

$$\begin{split} P(S_{2} > N) & = \sum_{i=1}^N P(S_{2} > N \mid S_1 = i) \cdot P( S_1 = i) \\ & = \frac{1}{2N+1} \sum_{i=1}^N \frac{i}{2N+1} \\ & = \frac{N \cdot (N+1)}{2 \cdot (2N+1)^2} \end{split}$$

and using (19) we may derive $P(\lvert S_{2} \rvert \leq N)$:

$$\begin{split} P(\lvert S_{2} \rvert \leq N) & = 1 - 2 \cdot P(S_{2} > N) \\ & = 1- \frac{N \cdot (N+1)}{(2N+1)^2} \\ & = \frac{3N^2+3N+1}{(2N+1)^2} \sim \frac{3}{4} \end{split}$$

and so for $n=3$ we have:

$$\begin{split} P(S_{3} = 0) & = P(S_{3} = 0 \mid \lvert S_2 \rvert \leq N) \cdot P(\lvert S_2 \rvert \leq N) \\ & = \frac{3N^2+3N+1}{(2N+1)^3} \sim \frac{3}{8N} \end{split}$$
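As a sanity check, we can enumerate all $(2N+1)^3$ triples of steps directly: the number of triples summing to zero comes out to $3N^2+3N+1$ for every $N$ I tested, consistent with the $\frac{3}{8N}$ asymptotic rate. A throwaway script (names are my own):

```python
from fractions import Fraction
from itertools import product

def p_S3_zero(N):
    """P(S_3 = 0) by enumerating all (2N+1)^3 triples of steps."""
    steps = range(-N, N + 1)
    hits = sum(1 for t in product(steps, repeat=3) if sum(t) == 0)
    return Fraction(hits, (2 * N + 1) ** 3)

# compare the enumeration against the closed form (3N^2 + 3N + 1) / (2N+1)^3
for N in range(1, 8):
    assert p_S3_zero(N) == Fraction(3 * N**2 + 3 * N + 1, (2 * N + 1) ** 3)
```

For $N=1$ this gives $\frac{7}{27}$ and for $N=2$ it gives $\frac{19}{125}$.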

## Average drift or why $P(S_n=k) > P(S_n = k+1)$:

It’s useful to note that we may decompose $n$ into:

$$n = \hat{n} + n_z$$

where $\hat{n}$ is the number of nonzero (positive or negative) terms and $n_z$ is the number of zero terms, which contribute nothing to the sum.

For the above reason, it’s convenient to decompose $S_n$ into:

$$S_n = S_n^+ + S_n^{-}$$

where $S_n^+$ defines the sum of the positive terms and $S_n^{-}$ defines the sum of the negative terms.

By grouping the terms in the manner of (23), we may observe that when $\hat{n}$ is large the average magnitude of a nonzero step is, by the law of large numbers, approximately:

$$\Delta = \frac{N+1}{2}$$

so that if $\tau$ positive steps and $\hat{n}-\tau$ negative steps are taken:

$$\mathbb{E}[S_n^+] = \tau \cdot \Delta$$

$$\mathbb{E}[S_n^-] = (\hat{n}-\tau) \cdot (-\Delta)$$

$$\mathbb{E}[S_n] = \mathbb{E}[S_n^+] + \mathbb{E}[S_n^-] = \Delta \cdot (2\tau-\hat{n})$$

and we note that:

$$\mathbb{E}[S_n] \geq 0 \implies \tau \geq \lceil \frac{\hat{n}}{2} \rceil$$

Furthermore, due to symmetry:

$$P(\lvert S_n \rvert =k) > P(\lvert S_n \rvert =k+1) \iff P(S_n =k) > P(S_n =k+1)$$

so it suffices to demonstrate $P(S_n =k) > P(S_n =k+1)$.

In order to proceed with our demonstration we choose $\tau \in [\lceil \frac{\hat{n}}{2} \rceil, \hat{n}-1]$, so that conditioning on the number of positive steps gives:

$$P(S_{n} = \lfloor \Delta \cdot (2\tau-\hat{n}) \rfloor) \propto {\hat{n} \choose \tau} \frac{1}{2^{\hat{n}}}$$

where $\tau \geq \lceil \frac{\hat{n}}{2} \rceil$ implies that:

$$\forall k \geq 0, \frac{P(S_n=k)}{P(S_n=k+1)} \sim \frac{(\tau+1)!(\hat{n}-\tau-1)!}{\tau!(\hat{n}-\tau)!} = \frac{\tau+1}{\hat{n}-\tau} > 1$$

and since this holds for all $n_z \leq n$ we may use $n$ in place of $\hat{n}$ in the next section.

## Proof that $u_n$ is decreasing:

Given (13) we may derive the following ratio:

$$\frac{u_{n+1}}{u_n} = \frac{P(\lvert S_n \rvert \leq N)}{(2N+1) \cdot P(S_n = 0)}$$

So in order to prove that $u_n$ is decreasing we must show that:

$$P(\lvert S_n \rvert \leq N) < (2N+1) \cdot P(S_n=0)$$

and we note that for $n \geq 2$ this follows immediately from (31), since $P(S_n=k) < P(S_n=0)$ for every $1 \leq k \leq N$ (for $n=1$ the law of $S_1$ is uniform on $[-N,N]$, so $u_2 = u_1$):

$$P(\lvert S_n \rvert \leq N) = 2 \sum_{k=1}^N P(S_n=k) + P(S_n=0) < (2N+1) \cdot P(S_n=0)$$
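The strict inequality behind this bound, $P(S_n=k) > P(S_n=k+1)$ for $k \geq 0$ and $n \geq 2$, can also be checked directly on the exact law of $S_n$ for small cases (the helpers below are a minimal sketch with names of my own):

```python
from fractions import Fraction

def law(N, n):
    """Exact law of S_n as a dict {value: probability},
    steps uniform on {-N, ..., N}."""
    step = Fraction(1, 2 * N + 1)
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for s, p in dist.items():
            for a in range(-N, N + 1):
                new[s + a] = new.get(s + a, Fraction(0)) + p * step
        dist = new
    return dist

def strictly_decreasing_tail(N, n):
    """True if P(S_n = k) > P(S_n = k+1) for every k in [0, nN - 1]."""
    d = law(N, n)
    return all(d[k] > d.get(k + 1, Fraction(0)) for k in range(n * N))
```

For instance, `strictly_decreasing_tail(2, 3)` returns `True`, whereas for $n=1$ the law is flat, so the check is only meaningful from $n=2$ onwards.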

## Proof that $\lim\limits_{n \to \infty} u_n = \lim\limits_{n \to \infty} \alpha_n = 0$:

Now, given (34) we may define:

$$q_n = \frac{P(\lvert S_n \rvert \leq N)}{(2N+1)P(S_n=0)} = \frac{u_{n+1}}{u_n} < 1 \quad \text{for } n \geq 2$$

Each $q_n < 1$, so $u_n$ is strictly decreasing for $n \geq 2$. Moreover, the partial products telescope:

$$\frac{P(S_{n+1}=0)}{P(S_1=0)} = \prod_{m=1}^n \frac{P(S_{m+1}=0)}{P(S_m=0)} = \prod_{m=1}^n q_m$$

and the infinite product $\prod_m q_m$ vanishes because $\sum_m (1-q_m)$ diverges: by the local central limit theorem $u_n$ decays on the order of $n^{-1/2}$, so $1-q_m \sim \frac{1}{2m}$. We may therefore deduce that:

$$\lim_{n \to \infty} u_n = \lim_{n \to \infty} P(S_{n+1}=0) = 0$$

Likewise, given that:

$$\alpha_n = P(\lvert S_n \rvert \leq N) = (2N+1) \cdot P(S_{n+1}=0)$$

we may conclude that large deviations dominate as $n$ becomes large:

$$\lim_{n \to \infty} \alpha_n = \lim_{n \to \infty} (2N+1) \cdot P(S_{n+1}=0) = 0$$

$$\lim_{n \to \infty} \beta_n = \lim_{n \to \infty} P(\lvert S_n \rvert > N) = \lim_{n \to \infty} 1 - \alpha_n = 1$$

One interpretation of the last two limits is that the mass of the discrete hypercube moves away from the centre and towards the corners, which is a concentration-of-measure phenomenon.
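Both limits are easy to illustrate with an exact computation for a small $N$ (a sketch; the sampling points are chosen arbitrarily and the function name is my own):

```python
from fractions import Fraction

def alpha(N, n):
    """alpha_n = P(|S_n| <= N), computed from the exact law of S_n."""
    step = Fraction(1, 2 * N + 1)
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for s, p in dist.items():
            for a in range(-N, N + 1):
                new[s + a] = new.get(s + a, Fraction(0)) + p * step
        dist = new
    return sum(p for s, p in dist.items() if abs(s) <= N)

# alpha_n shrinks towards 0 while beta_n = 1 - alpha_n climbs towards 1
N = 2
values = [alpha(N, n) for n in (1, 4, 16, 64)]
```

For $N=2$ the sampled values of $\alpha_n$ fall monotonically from $1$ towards zero, consistent with the limits above.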

## Discussion:

I find it quite surprising that random structures, in this case a random walk, are useful for analysing high-dimensional systems. Indeed, I have to say that for such a general result thirty-four equations isn’t much. But what about the case of uniform distributions on closed intervals of the form $[-N,N] \subset \mathbb{R}$?

It’s useful to note that $[-N,N]^n \subset \mathbb{R}^n$ defines the hypercube with volume $(2N)^n$ and I suspect that in the continuous setting, hypercube geometry and convex analysis might be particularly insightful.