Introduction:

While analysing the weighted AM-GM inequality, it recently occurred to me that the exponential of the Shannon entropy might have a useful geometric interpretation. In fact, the level sets of the exponential of the Shannon entropy admit a natural embedding in terms of hyperbolic functions.

The extension of this analysis to the setting of Quantum Information is a challenge I shall save for a future date.

A geometric inequality concerning the exponential of the Shannon entropy:

The weighted AM-GM inequality states that if \(\{a_i\}_{i=1}^n\) and \(\{\lambda_i\}_{i=1}^n\) are collections of positive reals with \(\sum_{i=1}^n \lambda_i = 1\), then:

\begin{equation} \prod_{i=1}^n a_i^{\lambda_i} \leq \sum_{i=1}^n \lambda_i \cdot a_i \end{equation}
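For concreteness, here is a minimal numerical illustration of (1) in Python; the particular values of \(a_i\) and \(\lambda_i\) below are arbitrary choices for the example:

import numpy as np

# Arbitrary positive values and weights that sum to one (example data only).
a = np.array([2.0, 5.0, 0.5])
lam = np.array([0.2, 0.3, 0.5])

weighted_gm = np.prod(a ** lam)   # weighted geometric mean
weighted_am = np.sum(lam * a)     # weighted arithmetic mean

assert weighted_gm <= weighted_am
print(weighted_gm, weighted_am)   # roughly 1.32 <= 2.15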

As an application, I found that if \(H(\vec{p})\) denotes the Shannon entropy of a discrete probability distribution \(\vec{p} = \{p_i\}_{i=1}^n\) and \(r_p = \lVert \vec{p} \rVert\) denotes its \(l_2\) norm, then:

\begin{equation} e^{H(\vec{p})} \geq \frac{1}{r_p^2} \end{equation}

The proof is quite direct. Setting \(a_i = p_i\) and \(\lambda_i = p_i\), we have

\begin{equation} e^{-H(\vec{p})} = e^{\sum_i p_i \ln p_i} = \prod_{i=1}^n p_i^{p_i} \end{equation}

\begin{equation} \sum_{i=1}^n p_i^2 = \lVert \vec{p} \rVert^2 \end{equation}

and applying (1) to these choices, \(e^{-H(\vec{p})} = \prod_{i=1}^n p_i^{p_i} \leq \sum_{i=1}^n p_i^2 = \lVert \vec{p} \rVert^2\), from which we deduce that \(e^{H(\vec{p})} \geq \frac{1}{r_p^2}\). Equality holds precisely when \(\vec{p}\) is the uniform distribution, i.e. \(p_i = \frac{1}{n}\) for all \(i\), which is also where the Shannon entropy is maximised.
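Inequality (2) is also easy to check numerically. The following sketch samples random distributions (NumPy's Dirichlet sampler is an arbitrary choice here) and verifies both the inequality and the equality case at the uniform distribution:

import numpy as np

rng = np.random.default_rng(0)

def exp_entropy(p):
    # Exponential of the Shannon entropy (natural logarithm), e^{H(p)}.
    return np.exp(-np.sum(p * np.log(p)))

for _ in range(1000):
    p = rng.dirichlet(np.ones(5))                # random distribution on 5 outcomes
    assert exp_entropy(p) >= 1.0 / np.sum(p ** 2) - 1e-12

# Equality at the uniform distribution: both sides equal n.
u = np.ones(5) / 5
print(exp_entropy(u), 1.0 / np.sum(u ** 2))      # both print 5.0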

An analysis of natural hyperbolic embeddings:

If we consider that the Shannon entropy measures the quantity of hidden information in a stochastic system in the state \(\vec{p} \in [0,1]^n\) (with \(\sum_{i=1}^n p_i = 1\)), we may define the level sets \(\mathcal{L}_q\) in terms of the typical probability \(q \in (0,1)\):

\begin{equation} \mathcal{L}_q = \{\vec{p} \in [0,1]^n: e^{H(\vec{p})} = e^{-\ln q}\} \end{equation}

which allows us to define an equivalence relation over states \(\vec{p} \in [0,1]^n\). Such a model is appropriate for events which may have \(n\) distinct outcomes.
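As a concrete illustration of this equivalence relation, the sketch below constructs two distinct three-outcome distributions lying on the same level set \(\mathcal{L}_q\). The value \(q = 0.35\) and the one-parameter family \((t, \frac{1-t}{2}, \frac{1-t}{2})\) are arbitrary choices for the example, and SciPy's brentq does the root-finding:

import numpy as np
from scipy.optimize import brentq

def H(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

q = 0.35                     # example typical probability
target = -np.log(q)          # the entropy shared by every member of L_q

# One-parameter family (t, (1-t)/2, (1-t)/2); solve H = -ln(q) for t.
f = lambda t: H([t, (1 - t) / 2, (1 - t) / 2]) - target
t_lo = brentq(f, 1e-9, 1/3 - 1e-9)        # branch with t below 1/3
t_hi = brentq(f, 1/3 + 1e-9, 1 - 1e-9)    # branch with t above 1/3

p1 = np.array([t_lo, (1 - t_lo) / 2, (1 - t_lo) / 2])
p2 = np.array([t_hi, (1 - t_hi) / 2, (1 - t_hi) / 2])
print(np.exp(H(p1)), np.exp(H(p2)), 1 / q)   # all three agree, so p1 and p2 are equivalent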

Now, we’ll note that \(e^{H(\vec{p})}\) has a natural interpretation as a measure of hidden information, while \(e^{-H(\vec{p})}\) may be interpreted as the typical probability of the state \(\vec{p}\). Given (5), a natural relation between these measures may be found using the hyperbolic identities:

\begin{equation} \cosh^2(-\ln q) - \sinh^2(-\ln q) = 1 \end{equation}

\begin{equation} \cosh(-\ln q) - \sinh(-\ln q) = q \end{equation}

where, since \(e^{H(\vec{p})} = \frac{1}{q}\) and \(e^{-H(\vec{p})} = q\) on \(\mathcal{L}_q\), \(2 \cdot \cosh(-\ln q)\) is the sum of these two measures and \(2 \cdot \sinh(- \ln q)\) is their difference. This brief analysis indicates that the level sets \(\mathcal{L}_q\) have a natural embedding in terms of hyperbolic functions.
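A quick numerical confirmation of these relations (the value \(q = 0.35\) is an arbitrary example):

import numpy as np

q = 0.35                 # example typical probability
x = -np.log(q)           # the entropy level shared by L_q

sum_of_measures = np.exp(x) + np.exp(-x)    # e^{H} + e^{-H} = 1/q + q
diff_of_measures = np.exp(x) - np.exp(-x)   # e^{H} - e^{-H} = 1/q - q

assert np.isclose(sum_of_measures, 2 * np.cosh(x))
assert np.isclose(diff_of_measures, 2 * np.sinh(x))
assert np.isclose(np.cosh(x) ** 2 - np.sinh(x) ** 2, 1)   # identity (6)
assert np.isclose(np.cosh(x) - np.sinh(x), q)             # identity (7)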

A generalisation of the original geometric inequality:

It is worth pointing out that the level sets \(\mathcal{L}_q\) also allow a natural generalisation of the original geometric inequality (2):

\begin{equation} \forall \vec{p} \in \mathcal{L}_q, \quad e^{H(\vec{p})} \geq \max_{\vec{u} \in \mathcal{L}_q} \frac{1}{\lVert \vec{u} \rVert^2} \end{equation}

which allows us to define the Euclidean ball with radius \(r_q\), the smallest \(l_2\) norm attained on \(\mathcal{L}_q\):

\begin{equation} \mathcal{B}(0,r_q) = \bigcap_{\vec{p} \in \mathcal{L}_q} \mathcal{B}(0,\lVert \vec{p} \rVert) \end{equation}

where:

\begin{equation} \frac{1}{r_q^2} = \max_{\vec{p} \in \mathcal{L}_q} \frac{1}{\lVert \vec{p} \rVert^2} \end{equation}
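A rough Monte Carlo sketch of (8) and (10): sample random distributions, keep those whose entropy is within a small tolerance of \(-\ln q\), and check that the largest value of \(\frac{1}{\lVert \vec{p} \rVert^2}\) encountered never exceeds \(\frac{1}{q}\). The values of \(q\), \(n\), the tolerance and the sample size are arbitrary choices:

import numpy as np

rng = np.random.default_rng(0)
q, n, tol = 0.35, 4, 1e-2
target = -np.log(q)

best = 0.0    # running maximum of 1/||p||^2 over approximate members of L_q
for _ in range(200_000):
    p = rng.dirichlet(np.ones(n))
    h = -np.sum(p * np.log(p))
    if abs(h - target) < tol:
        r = 1.0 / np.sum(p ** 2)
        assert r <= np.exp(h) + 1e-12     # inequality (2) for each sampled member
        best = max(best, r)

print(best, 1.0 / q)    # the maximum found remains below 1/q (here about 2.86)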

Discussion:

After sharing this analysis on MathOverflow, I received an interesting response from Tom Leinster, who pointed out that important research has been done on the application of the exponential of entropy to ecological diversity measures [4]. This was not within the scope of my original considerations, but it was a welcome surprise.

In the near future, I look forward to exploring the exponential of entropy in the settings of Quantum Information theory and biological evolution.

References:

  1. Olivier Rioul. This is IT: A Primer on Shannon’s Entropy and Information. Séminaire Poincaré. 2018.

  2. David J.C. MacKay. Information Theory, Inference and Learning Algorithms. Cambridge University Press, 2003.

  3. John C. Baez, Tobias Fritz, Tom Leinster. A Characterization of Entropy in Terms of Information Loss. arXiv. 2011.

  4. Tom Leinster. Entropy and Diversity: The Axiomatic Approach. Cambridge University Press, 2021.