Stirling’s log-factorial approximation

In my statistical physics course, I have frequently encountered the following approximation due to Stirling:

\ln(N!) \approx N \ln (N) -N

It’s very useful but my professor didn’t explain how good the approximation was. The derivation I found turns out to be very simple and so I can present it in a few lines here:

  1. Note that:\int \ln(x) dx = x \ln (x) -x \quad (1)
  2. Now, if we defineS = \sum_{n=1}^{N} \ln (n) \quad (2)
    we have an upper-Riemann sum with \Delta x =1 .
  3. So we basically have the following approximation:
    S = \sum_{n=1}^{N} \ln (n \Delta x)\Delta x \approx \int_{1}^{N} \ln(x) dx \quad (3)
  4. By the intermediate value theorem,\forall n \in\mathbb{N} \thinspace \exists c_n \in (n-1,n), S' =  \sum_{n=1}^{N} \ln (c_n \Delta x)\Delta x =\int_{1}^{N} \ln(x) dx \quad(4) where \Delta x =1 as defined previously.
  5. Let’s check how good this approximation is:|S - S'| = | \sum_{n=1}^{N} \ln (n) - \sum_{n=1}^{N} \ln (c_n) | \Rightarrow |S - S'| \leq | \sum_{n=1}^{N} \ln (n) -\ln (n+1)| = \ln (N+1) \quad (5)
  6. This error grows very slowly. In fact, if N = 10^{24} i.e. the number of molecules in a glass of water,| \ln(N!) - (N \ln (N) -N)| < 60 which is a minuscule error relative to the number of molecules.

Note: A pdf version of this blog post is available here.


generalized AM-GM inequality

As I was reading through my linear analysis notes today, there was a small
passage where our lecturer Jim Wright provided a proof of a generalisation
of the inequality of arithmetic and geometric means:

Let A, B \geq 0 and 0 \leq \theta \leq 1. Then A^{\theta} B^{1-\theta} \leq \theta A + (1-\theta) B .

I didn’t bother to read the proof as I thought I could probably come up with a
good one myself. Indeed, I am sufficiently happy with the proof I found that
I think it’s worth sharing. Here it is:


If AB=0 the proof is trivial. So I shall proceed by assuming the contrary:

  1. Let’s define \beta := \frac{A}{B} \in \mathbb{R}_{+} and divide by  B . We then obtain:\beta^{\theta} \leq \theta \beta + (1-\theta)
  2. Now, let’s define the following differentiable functions:\begin{cases} f(x) =\theta x + (1-\theta)  \\ g(x) =x^{\theta} \\ \end{cases}
  3. The derivatives of these functions are given by:\begin{cases} f'(x) =\theta  \\ g'(x) =\theta x^{\theta-1} \\ \end{cases}
  4. Now, we can quickly reach the following conclusions:a) f(x) \geq g(x) for x \geq 1
    b) For x \in [0,1]  \begin{cases} f(0)  \geq g(0)=0  \\ f(1)=g(1)=1 \\ \end{cases}

and both functions are monotonically increasing on (0,1) so there can’t be x \in (0,1) such that f(x) = g(x) unless \theta equals zero or one.

It follows that f(x) \geq g(x) on (0,1).

I must say that this was an easy problem but I liked the method I found for
solving it. Namely, reducing the number of variables and then replacing
variables with functions that can then be readily analysed. But, we can go a bit further
and show how Hölder’s inequality follows easily.

Hölder’s inequality:

For any two sequences (a_i)_{i=1}^n,(b_i)_{i=1}^n \subset \mathbb{N} , we have:

\sum_{j=1}^{n} a_j b_j = 1 \leq (\sum_{j=1}^{n} a_j^p)^{\frac{1}{p}}(\sum_{j=1}^{n} b_j^q)^{\frac{1}{q}}

for 1 \leq p \leq \infty where \frac{1}{p} + \frac{1}{q}=1 .


If we define a=(a_i)_{i=1}^n, b= (b_i)_{i=1}^n , and normalize these vectors we have:

\begin{cases} \alpha = \frac{a}{||a||_p} \\ \beta = \frac{b}{||b||_q} \\ \end{cases}

Now, we may apply the generalised AM-GM inequality to deduce:

\sum_{j=1}^{n} \alpha_j \beta_j \leq \frac{1}{p}(\sum_{j=1}^{n} \alpha_j^p) +\frac{1}{q}(\sum_{j=1}^{n} \beta_j^q)= ||\alpha||_p||\beta||_q

The Quadrupedal Conjecture

Last summer, I asked myself several related questions concerning the locomotion of quadrupeds:

1) What if the cheetah had more than four legs?
2) Why don’t hexapods gallop?
3) Have we ever had F1 cars with more than 4 wheels?

I found all these related questions interesting but after more reflection I summarised my problem mathematically:

Could four legs be optimal for rapid linear and quasi-linear locomotion on rough planar surfaces?

After searching for papers on this subject and finding none, I decided to name this problem ‘The Quadrupedal Conjecture’ and it may be informally described as saying that four legs allow a creature to travel fastest on a nearly-flat surface. However, I thought it might be interesting to tackle a simpler problem first which we may call the ‘Little Polypedal Conjecture’:

Can we show that given a polyped with 2 n legs having pre-defined mass and energy, that there exists N such that for 2 N limbs or greater, its maximum linear velocity on a rough planar surface would be reduced?

I believe that there is an elegant solution to the above problem and I think that the quadrupedal conjecture has an elegant solution as well. But it is by no means guaranteed that a solution to such a problem would be simple. Finding the right degree of abstraction would be one of the main challenges as there are many different ways of approaching this problem with varying degrees of realism.

First, I must say that this is not primarily a problem in biology although this question has attracted the attention of biologists on the biology stackexchange. Second, I think this is a question that would heavily involve mathematical physics although this question has been controversial on the physics stackexchange. Further, I think that the solution to this problem would be affected by mass scaling. By this I mean that the optimal number of legs would probably vary with the range of masses available to the polyped.

Finally, I think that this question is highly relevant to roboticists who build legged robots as a thorough investigation of this question would probably lead to better models for polypedal locomotion.

That’s all I can reasonably say for now.

Note: I thought I’d add a touch of melodrama to this blog post as a friend of mine told me that my blog posts can be a bit dry.

Computing the Omega constant

I) Solving a simple-looking equation:  

Last Sunday I was trying to compute a complicated integral suggested to me by a friend and was led to consider the problem of solving the following innocent-looking equation:

x\log(x)=1 \quad (1)  

Little did I know that this little problem would lead me to transcendental functions and the Lambert W function in particular. But, before saying more on those two topics I’ll go through what I tried:

1. Given that x \in (1,e), I tried x:=e^\alpha for \alpha \in [0,1] so that I obtain \alpha e^\alpha=1. This resembles the familiar exponential function f(t)=e^{\alpha t}.

2. I tried to recover some useful relations and I found the following:

\forall t \in [0,1]\quad f^n(t)f^{n+1}(t)=\alpha^{2n} \quad (2)

Using the mean-value theorem:

\int_{0}^{1} f^n(t)dt=\alpha^{n-1}(\frac{1}{\alpha}-1) \quad (3)

which isn’t too difficult to show given that

\exists c_n \in (0,1) \quad f^{n+1}(c_n)=f^n(1)-f^n(0)=\alpha^n f'(c_0) \\
c_0 = \frac{1}{\alpha}\log(\frac{1}{\alpha^2}-\frac{1}{\alpha})\\

After this I still couldn’t figure out how to solve for \alpha analytically. I wondered whether there might be a relation which I failed to discover but at this point it was a bit late so I thought I might want to try a numerical method.

3. Using Newton’s method:

\large t_{n+1}=t_n-\frac{g(t_n)}{g'(t_{n+1)}}=t_n-\frac{t_n\ln(t_n)+t_n^2}{1+t_n} \quad (4)

where g(t)=\log(t e^t) 

I found that \alpha \approx 0.567143 and wondered whether it might be related to other numbers I knew. I tried a few calculations which got me nowhere and then I became more interested in whether this problem could be solved without resorting to numerical methods so I asked this on the math stackexchange. From that forum I learned that \alpha is actually known as the Omega constant and it appears as W(1) where W(x) is the Lambert W function, a transcendental function.

II) An equation that can’t be solved analytically: 

A transcendental function, f(x) is basically a function that transcends algebra in the sense it can’t be expressed in terms of a finite number of algebraic operations: addition, subtraction, multiplication, division, raising to a power, and taking roots. In particular it can’t satisfy:

\sum_{k=1}^{n} c_k(x) (f(x))^k = 0 \quad (5) 

where c_k(x) are rational functions(i.e. a ratio of polynomials).

Now, it isn’t too difficult to show that:

t(x) =e^{x\log(x)}=x^x \quad (6)

is actually transcendental if we suppose that t(x) actually satisfies (5). If we choose x \geq 1, |t(x)| \geq 1 we find that

|c_n(x)| |t(x)|^n \leq b(x) |t(x)|^{n-1} \\
b(x) =\sum_{k=1}^{n} |c_k(x)|  \\
\end{cases} \quad (6)

It then follows that

\forall x \in \mathbb{R}, |t(x)| \leq \frac{b(x)}{|c_n(x)|} \quad (7)

However, in (7) the expression on the left grows exponentially while the expression on the right has at most polynomial growth so t(x) can’t be algebraic and hence can’t be solved analytically.

I’ll write a separate blog post on the Lambert W function as I think this blog post is getting a bit long and that function is worth it’s own post as it has many applications.