Introduction:

Two days ago, while thinking about brain networks, it occurred to me that almost all simple graphs are small world networks in the sense that if is a simple graph with nodes sampled from the Erdös-Rényi random graph distribution with probability half then when is large:

\begin{equation} \mathbb{E}[d(v_i,v_j)] \leq \log_2 N \end{equation}

My strategy for proving this was to show that when is large, there exists a chain of distinct nodes of length originating from almost surely. This implies that:

\begin{equation} \forall v_i, v_j \in G_N, d(v_i,v_j) \leq \log_2 N \end{equation}

almost surely when is large.

Now, by using the above method of proof I managed to show that almost all simple graphs are very small in the sense that:

\begin{equation} \mathbb{E}[d(v_i,v_j)] \leq \log_2\log_2 N \end{equation}

when tends to infinity. We can actually do even better.

Using my proof that almost all simple graphs are connected, I can show that almost all simple graphs have diameter 2. However, I think there is more value in going through my original proof which in my opinion provides greater insight into the problem.

Degrees of separation and the neighborhood of a node:

We may think of degrees of separation as a sequence of ‘hops’ between the neighborhoods of distinct nodes . Given a node we may define as follows:

\begin{equation} \mathcal{N}(v_i) = \{v_j \in G_N: \overline{v_i v_j} \in G_N \} \end{equation}

where is a graph with nodes.

Now, given the E-R model we can say that implies:

\begin{equation} P(v_k \notin \mathcal{N}(v_i) \land v_k \notin \mathcal{N}(v_j)) = P(v_k \notin \mathcal{N}(v_i)) \cdot P(v_k \notin \mathcal{N}(v_j)) = \frac{1}{4} \end{equation}

and by induction:

\begin{equation} P(v_k \notin \mathcal{N}(v_{1}) \land … \land v_k \notin \mathcal{N}(v_{n})) = \frac{1}{2^n} \end{equation}

It follows that if there is a chain of distinct nodes we can say that:

\begin{equation} P(d(v_1,v_k) \leq n) = 1- \frac{1}{2^n} \end{equation}

Almost all simple graphs are very small world networks:

A chain of distinct nodes exists almost surely:

The probability that there exists a chain of nodes of length :

\begin{equation} \overline{v_1 … v_{\log_2\log_2 N}} \end{equation}

such that is given by:

\begin{equation} P(\overline{v_1 … v_{\log_2\log_2 N}} \in G_N) = \prod_{k=1}^{\log_2\log_2 N} \big(1-\frac{1}{2^{N-k}} \big) \geq \big(1- \frac{\log_2 N}{2^N}\big)^{\log_2\log_2 N} \end{equation}

and we note that:

\begin{equation} \lim\limits_{N \to \infty} \big(1- \frac{\log_2 N}{2^N}\big)^{\log_2\log_2 N} = 1 \end{equation}

this guarantees the existence of a chain of distinct nodes of length originating from any almost surely.

Given that exists almost surely we may deduce that almost surely:

If exists we have:

\begin{equation} \forall \{v_i\}_{i=1}^n, v_k \in G_N, P(d(v_1,v_k) \leq \log_2\log_2 N) = 1 - \frac{1}{2^{\log_2\log_2 N}} \end{equation}

and so we have:

\begin{equation} \lim\limits_{N \to \infty} \forall \{v_i\}_{i=1}^n, v_k \in G_N, P(d(v_1,v_k) \leq \log_2\log_2 N) = 1 \end{equation}

Discussion:

I must say that initially I found the above result quite surprising and I think it partially explains why small world networks frequently occur in nature. Granted, in natural settings the graph is typically embedded in some kind of Euclidean space so in addition to the degrees of separation we must consider the Euclidean distance. But, I suspect that in real-world networks with small world effects the Euclidean distance plays a marginal role.

In particular, I believe that wherever small-world networks prevail the Euclidean distance is dominated by ergodic dynamics between nodes. There is probably some kind of stochastic communication process between the nodes.