perturbations that preserve constants of motion

Simulation of the three-body problem with random initial conditions


Within the context of the gravitational n-body problem, I’m interested in perturbations of the initial conditions \{p_i(0),\dot{p}_i(0)\}_{i=1}^n which leave all the constants of motion unchanged.

It’s clear to me that the linear momentum (P) is preserved under any transformation of the initial position vectors, and that the energy (H) is preserved under any isometry applied to the initial position vectors. However, when I add the constraint of conserving the angular momentum (L), the general nature of the transformations which leave (H, L, P) unchanged is not clear.

Granted, time translations would preserve the constants of motion but in a trivial manner. Now, I’m not sure whether the approach I’ve taken can be considerably improved, but within the context of the three-body problem in the plane, I transformed this problem into an optimisation problem as follows:

  1. We first calculate the constants of motion, namely the linear momentum, the energy and the angular momentum, prior to perturbing the position of one of the n bodies. Let’s suppose that these constants are given by C_1, C_2, C_3.
  2. Assuming our bodies are restricted to move in the plane, there are a total of 4n scalars that determine the initial conditions of this n-body system. It follows that perturbing the position of one body (q) generally requires searching for a (4n-2)-dimensional vector x of positions and velocities. This can be done using Nelder-Mead, a derivative-free method, provided that we define a cost function.
  3. The cost function to be minimised is obtained directly from the linear momentum (P), angular momentum (L) and Hamiltonian (H) equations, combined into a single smooth function:

\begin{aligned} C(x,q) =(P(x,q)-C_1)^2+(H(x,q)-C_2)^2+(L(x,q)-C_3)^2 \end{aligned}
  4. In order to increase the probability of convergence, the Nelder-Mead algorithm was randomly seeded for each perturbed position q.

I think this method can be easily extended to n-bodies.
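The steps above can be sketched in a few lines of Python. Everything numerical here is illustrative rather than taken from my actual project: unit masses, G = 1 units, an arbitrary random seed and an arbitrary small perturbation of body 0.

```python
import numpy as np
from scipy.optimize import minimize

G = 1.0
m = np.array([1.0, 1.0, 1.0])  # three unit masses (illustrative)

def constants(pos, vel):
    """Return (P_x, P_y, H, L) for a planar n-body state."""
    P = (m[:, None] * vel).sum(axis=0)
    ke = 0.5 * (m * (vel ** 2).sum(axis=1)).sum()
    pe = 0.0
    n = len(m)
    for i in range(n):
        for j in range(i + 1, n):
            pe -= G * m[i] * m[j] / np.linalg.norm(pos[i] - pos[j])
    L = (m * (pos[:, 0] * vel[:, 1] - pos[:, 1] * vel[:, 0])).sum()
    return np.array([P[0], P[1], ke + pe, L])

rng = np.random.default_rng(0)
pos0 = rng.normal(size=(3, 2))
vel0 = rng.normal(size=(3, 2))
C = constants(pos0, vel0)              # step 1: constants before the perturbation

q = pos0[0] + np.array([0.05, -0.03])  # the perturbed position of body 0

def cost(x):
    # step 3: x packs the remaining 4n - 2 = 10 scalars
    pos = np.vstack([q, x[:4].reshape(2, 2)])
    vel = x[4:].reshape(3, 2)
    return ((constants(pos, vel) - C) ** 2).sum()

# step 4: seed the simplex at the unperturbed values and minimise
x0 = np.concatenate([pos0[1:].ravel(), vel0.ravel()])
res = minimize(cost, x0, method="Nelder-Mead",
               options={"maxiter": 50000, "maxfev": 50000,
                        "fatol": 1e-14, "xatol": 1e-12})
print(res.fun)  # residual in (P, H, L) after re-matching the constants
```

Since the solution set near the unperturbed state is a manifold of dimension 10 - 4 = 6, the optimiser has plenty of room to drive the residual towards zero.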


Pseudo-anonymous forums

After using the Stack Exchange (SE) forums for several years, it’s clear to me that pseudo-anonymous user profiles are synonymous with a lousy user experience. Depending on the particular Stack Exchange this manifests itself in different ways, but I believe the underlying reasons are the same.

Consider these two questions:
1. Why don’t hexapods gallop?
2. Rigorous derivation of isoperimetric inequality from ideal gas equation?

In the first case you have a pseudo-anonymous user on the Physics SE with a lot of reputation who tries to reformat the question, basically saying that it has nothing to do with physics. Eventually, I demonstrate that his claim is baseless and other users on the Physics SE support my arguments. A different user with less confidence might have responded differently, however.

In the second case, we have a clear question on MathOverflow that gets a clear answer from an identifiable person. Now, if you check the top users on MathOverflow you’ll realise that almost every user is identifiable. In fact, among the top 20 users, the fraction of identifiable users stands at 95% on MathOverflow and 75% on the Physics SE. I believe that the fraction of pseudo-anonymous users is an important indicator of the overall forum experience.

Pseudo-anonymity is not merely a different type of user setting. It leads to fundamentally different forum interactions for systemic reasons:

1. Users are less inhibited because they won’t be identified. 
2. Users feel less pressure to share good content because any social benefits won’t affect their actual person. 

Due to this reward system, pseudo-anonymous users tend to behave like idiots if they can get away with it and they won’t try as hard to share great content. If you’re Terence Tao on MathOverflow the situation is very different. In general, I suspect that the reason for the greater fraction of identifiable MathOverflow users is that they are actually proud of what they share.

While it’s not clear how the Physics Stack Exchange can incentivise users to use their real names, I think they can simply require it. I have no doubt that this would improve the user experience.

The world we understand

A regular problem I encounter is that people create false dichotomies and use this to guide their reasoning as well as influence the reasoning of others. The problem is that we live in an increasingly complex world with lots of tuneable parameters. Some of them are known, most of them are unknown and almost all of them behave in a highly non-linear manner. 

If a project is ineffective that doesn’t mean that the original idea was inherently bad nor does it mean that the particular implementation of that idea wasn’t carefully thought out. It could have failed for any number of reasons. My guess is that many of us reason in a simplistic manner in order to maintain the belief that we live in a world we understand. 

Let me give some examples of nonsense I often hear:

  1. “curiosity-driven education doesn’t work”
  2. “foreign aid is a terrible idea”
  3. “renewables are the future and always will be”
  4. “Mars-bound rockets can’t be built by a company”

In every case the educated person in question will point me to a list of experimental or empirical studies and then try to convince me that the particular idea is lousy. However, what many don’t seem to realise is that in many cases it’s difficult to separate the conclusion of a study from its design. Moreover, what is particularly pernicious about this sloppy method of reasoning is that it often targets unconventional ideas.

In many cases enterprising ideas aren’t inherently more risky than conventional methods. Very often the unusual idea is more carefully thought out whereas the conventional approach merely relies on tradition for justification. If you should take an unusual path there’s less social support in the event of failure and that’s an important reason why many people judge that it’s safer to fit in. There might be a greater chance of succeeding with approach X but if people are guaranteed to say “I told you so” in the event of failure you wouldn’t be irrational to choose conventional methods.  

This might seem like a fuzzy cultural problem but I think that this is one of the most important problems facing our society today. We may not be one technology away from utopia but if our society doesn’t become more adaptive I don’t see how we’ll collectively overcome the biggest problems facing us in the 21st century. Tradition should not be the default answer. 

However, modern society’s collective desire to be accepted means that even the truth surrounding historically important innovators gets distorted. Paul Graham touches on this topic very succinctly in his essay, The Risk of Discovery:

Biographies of Newton, for example, understandably focus more on physics than alchemy or theology. The impression we get is that his unerring judgment led him straight to truths no one else had noticed. How to explain all the time he spent on alchemy and theology?

In Newton’s day the three problems seemed roughly equally promising. No one knew yet what the payoff would be for inventing what we now call physics; if they had, more people would have been working on it.

Now, among those people who are aware that Newton spent a considerable amount of time on things other than physics, almost all tend to interpret this by saying that he was a bit crazy, like all geniuses. This is simply a careful attempt to recast an important innovator as someone more conventional, so we can maintain an illusion that comforts us rather than rationalise unusual and possibly risky projects.

In reality we aren’t safer by collectively walking off the side of a cliff. We need important innovation in the areas of energy, education and government. For this to happen we need to encourage people who take calculated risks, and if they should fail, instead of saying “I told you so” we should say “It was worth a try”.

point masses and Gauss’ Law of Gravity

My final year project involves searching for stable orbits for the gravitational n-body problem, where we use Newton’s point-mass approximation to describe the force field around masses, assuming that these are approximately spherical. Now, given that this model uses point masses, we can’t model rotations. In fact, it’s generally assumed that they are negligible.

However, I wondered whether this assumption can break down at some point. Consider the following:

  1. On the scale of millennia, the surface of our planet actually behaves like a fluid.
  2. If the distance between two planetary bodies is small and we assume zero rotation, this would imply a growing deviation from sphericity over time. It follows that the point-mass approximation would no longer make sense.
  3. In fact, our bodies would look something like blobs of gooey paint slowly dripping towards each other. 

While it’s true that the probability of zero rotation is close to zero, the challenge of dealing with non-spherical masses is something that Newton wouldn’t have handled very well with his point-mass approximation. However, this is not quite the end of Newton’s model.

Given that the gravitational potential in his model holds for point masses, a natural solution would be to integrate over the volume enclosed by a compact body. This is exactly what Gauss did when he discovered the divergence theorem, from which the Gauss Law of Gravity emerges as an immediate corollary. As this handles the case where our masses aren’t spherical, the rest of my blog post will focus on the statement and proof of this theorem.

Gauss’ Law of Gravitation:

Given U as the gravitational potential and \vec{g} = -\nabla U as the gravitational field, Gauss’ Law of Gravitation rests on the divergence theorem:

\begin{aligned} \int_{V} (\nabla \cdot \vec{g}) dV =\int_{\partial V} \vec{g} \cdot d \vec{A}  \end{aligned}

In order to arrive at this result, I’ll first need to introduce two useful definitions:

flux: The flux \phi of a vector field through a surface is the integral of the scalar product of the field at each point with the infinitesimal oriented surface element at that point:

\begin{aligned} \phi =  \int_{\partial V} \vec{g} \cdot d \vec{A} \end{aligned}

where \vec{g} denotes the gravitational field, \partial V the boundary of our mass and d \vec{A} is our infinitesimal oriented surface.

divergence: The divergence represents the volume density of the outward flux of a vector field from an infinitesimal volume around a given point:

\begin{aligned} \nabla \cdot \vec{g} \end{aligned}

The flux and divergence of a field are actually related in a local manner. Consider a box with dimensions \delta x_1, \delta x_2, \delta x_3. Using Cartesian coordinates, let’s suppose that (x_1,x_2,x_3) denotes a corner of this box. Then the flux of \vec{g} projected onto \hat{x}_3 is given by:

\begin{aligned} \phi_3 =(g_3(x_1,x_2,x_3+\delta x_3)-g_3(x_1,x_2,x_3)) \delta x_1 \delta x_2 \end{aligned}

For very small \delta x_3 ,

\begin{aligned} g_3(x_1,x_2,x_3+\delta x_3)-g_3(x_1,x_2,x_3) \approx \frac{\partial g_3}{\partial x_3} \delta x_3 \end{aligned}

where we get equality in the limit as \delta x_3 tends to zero.

As a result we have \forall i \in \{1,2,3 \}  :

\begin{aligned} \phi_i \approx \frac{\partial g_i}{\partial x_i} \delta x_1 \delta x_2 \delta x_3 =\frac{\partial g_i}{\partial x_i} \delta V  \end{aligned}

\begin{aligned} \phi =\sum_{i=1}^{3} \phi_i \end{aligned}

Now, we may deduce that the flux and divergence are related as follows:

\begin{aligned} \frac{\phi}{\delta V}\approx  \sum_{i=1}^{3} \frac{\partial g_i}{\partial x_i} \end{aligned}

\begin{aligned}\lim_{\delta V \to 0} \frac{\phi}{\delta V} = \nabla \cdot \vec{g} =  \sum_{i=1}^{3} \frac{\partial g_i}{\partial x_i} \end{aligned}
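This limit can be checked numerically. For a toy field of my own choosing, \vec{g} = (x_1 x_2, x_2 x_3, x_3 x_1), whose exact divergence is x_1 + x_2 + x_3, the flux per unit volume through a small box converges to the divergence as the box shrinks:

```python
def g(x, y, z):
    # toy field used only for this check; div g = x + y + z exactly
    return (x * y, y * z, z * x)

def flux_over_volume(x, y, z, d):
    """phi / delta V for a box with corner (x, y, z) and side d,
    sampling the field at the centre of each face."""
    phi = 0.0
    # opposite-face differences along each axis, times the face area d*d
    phi += (g(x + d, y + d/2, z + d/2)[0] - g(x, y + d/2, z + d/2)[0]) * d * d
    phi += (g(x + d/2, y + d, z + d/2)[1] - g(x + d/2, y, z + d/2)[1]) * d * d
    phi += (g(x + d/2, y + d/2, z + d)[2] - g(x + d/2, y + d/2, z)[2]) * d * d
    return phi / d**3

exact = 1.0 + 2.0 + 3.0                        # div g at the point (1, 2, 3)
print(abs(flux_over_volume(1.0, 2.0, 3.0, 1e-4) - exact))  # small, shrinks with d
```

Shrinking d by a factor of 100 shrinks the discrepancy by the same factor, as expected from the first-order Taylor expansion above.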

If we take an arbitrary solid and slice it into compact volumes V_i with associated surfaces \partial V_i, the fluxes through shared internal faces cancel in pairs, since each internal face is counted twice with opposite orientations, so we have:

\begin{aligned} \phi =\sum_{i=1}^{N} \phi_i =\sum_{i=1}^{N} \int_{\partial V_i} \vec{g_i} \cdot d \vec{A} =\int_{\partial V} \vec{g} \cdot d \vec{A} \end{aligned}

In addition, using the relationship between flux and divergence we have:

\begin{aligned} \phi =\lim_{N \to \infty} \lim_{V_i \to 0} \sum_{i=1}^{N} \nabla \cdot \vec{g_i} V_i = \int_{V} \nabla \cdot \vec{g} dV \end{aligned}

By equating the last two expressions we conclude the proof of this marvellous result by Gauss which is succinctly summarised on Wolfram as follows:

in the absence of the creation or destruction of matter, the density within a region of space can change only by having it flow into or away from the region through its boundary
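As a sanity check on the full theorem, both sides can be computed numerically for a toy field \vec{g} = (x_1 x_2, x_2 x_3, x_3 x_1) (my own choice, not a gravitational field) over the unit cube; both integrals come out to 3/2:

```python
import numpy as np

N = 200
mid = (np.arange(N) + 0.5) / N                  # midpoint grid on [0, 1]
X, Y, Z = np.meshgrid(mid, mid, mid, indexing="ij")

# left-hand side: volume integral of div g = x + y + z over the unit cube
lhs = (X + Y + Z).sum() / N**3

# right-hand side: flux of g = (xy, yz, zx) through the cube's boundary.
# On the x=0, y=0, z=0 faces the normal component vanishes; on x=1 it
# is y, on y=1 it is z and on z=1 it is x, each integrating to 1/2.
U, _ = np.meshgrid(mid, mid, indexing="ij")
face = U.sum() / N**2                           # integral of the varying coordinate
rhs = 3 * face

print(lhs, rhs)   # both are approximately 1.5
```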

At this point it’s natural to ask how we can describe the gravitational field, which is defined in terms of the gravitational potential. In general, for non-spherical masses we can no longer use Newton’s usual inverse-radius gravitational potential.

In some sense we’ll need an infinite sum of perturbations in the gravitational potential which can be done using spherical harmonics, another wonderful mathematical invention that has many applications. This shall be the subject of my next blog post. 



Can a biologist fix a radio?

In 2002, Professor Lazebnik published a seminal article titled ‘Can a biologist fix a radio?’ At first glance the title and its contents seem demeaning to the biology profession. However, a close reading shows that this landmark article not only anticipated the rise of systems biology but still has important consequences for the field of bioinformatics in general.

Around the time of writing, Lazebnik’s field of programmed cell death (aka apoptosis) was growing very quickly. So quickly, in fact, that the large number of publications (est. 10k per year) meant that reviewers were just as overwhelmed as the authors themselves. To be honest, I have no idea how a scientist can read and understand one tenth of that number of papers in a year. But that’s just the beginning.

The terrible part is that key results were found not to be reproducible, and as the evidence mounted, models began to fall apart in a seemingly irreparable manner. Naturally, doubt entered the minds of many who were looking for some miracle drug. Moreover, as Lazebnik looked around, he noticed that this phenomenon was persistent across many different fields of biology.

Some might say that this is simply because biology is a field with many unknowns, and for most that’s the end of the discussion. But Lazebnik then asks whether the general problem might involve the manner in which biologists approach problems:

To understand what this flaw is, I decided to follow the advice of my high school mathematics teacher, who recommended testing an approach by applying it to a problem that has a known solution. To abstract from peculiarities of biological experimental systems, I looked for a problem that would involve a reasonably complex but well understood system. …I started to contemplate how biologists would determine why my radio does not work and how they would attempt to fix it.

He then goes through a brilliant thought experiment that surveys the various methods an experimental biologist would employ and their means of describing their findings. After going through the methods of dissection, cataloguing connections, and breaking particular components, he concludes that a biologist with their methods of interpreting and analysing the resulting data still wouldn’t have a clue how a radio worked. Indeed, he remarks that a radio, like cells and organisms, doesn’t simply have connected components but also tuneable components, so it may malfunction even if all components are undamaged.

The methodological problem doesn’t have anything to do with the amount of data collected but with the language used by biologists for modelling. He actually says:

…the radio analogy suggests that an approach that is inefficient in analysing a simple system is unlikely to be more useful if the system is more complex. 

However, he then describes what appears to be a serious aversion to precise logical models among biologists. In fact, he characterises the average biologist in the following manner:

In biology, we use several arguments to convince ourselves that problems that require calculus can be solved with arithmetic if one tries hard enough and does another series of experiments.

The rest of the article then focuses on his proposed solution: the development of systems biology for experimental biologists in his field. This basically involves developing powerful predictive models of intra- and intercellular interactions using the right level of mathematical abstraction. I must say that he was very prescient for his time, as this field has shown great potential since.

The article also helped me appreciate the importance of open source projects like OpenWorm which I have contributed to in the past as these help make sense of the massive amounts of data that come out of laboratories. Clearly, large amounts of data are useless if you don’t have appropriate tools for modelling.  

The Ten Cupcake Problem

During my Christmas holidays I went through the regular ritual of eating cake with relatives. Occasionally I would observe crumbs fall off the sides of the cake and then think about the relative share each person got. These rituals continued until, for the first time, I found an interesting maths problem based on the challenge of dividing a cake into equal volumes without leaving any crumbs. In fact, it led me to an interesting conjecture at the intersection of geometry, topology and measure theory.


A normal cake can be considered a compact manifold M \subset \mathbb{R}^3 whose boundary is homeomorphic to the sphere. If this manifold is also convex:

For any point on the boundary of our manifold, there exists a hyperplane H through this point such that we may decompose our manifold into three path-connected components that satisfy: 

\begin{aligned} M=M_1 \cup (M \cap H) \cup M_2 \end{aligned} 

\begin{aligned} Vol(M_1)=Vol(M_2)=Vol(M)/2 \end{aligned} 

The challenge is to show that the above volume-splitting properties hold for any compact manifold that is also convex but doesn’t hold otherwise. By this I mean that if our manifold happens to be non-convex then the above properties don’t hold for every point on its boundary. In other words we would observe crumbs.

This might seem like a relatively simple problem but it’s actually a more complex variant of the Ham Sandwich theorem, which I learned about from other users of the Math Stack Exchange when I first shared the problem.

I leave it as an exercise to the reader to show that the volume-splitting property holds for any compact and convex manifold. The hard part is showing that for any compact and non-convex manifold the volume-splitting property doesn’t hold for all points on the boundary.
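For the convex case, here is a sketch of the argument I have in mind (my own sketch via the intermediate value theorem, not a full proof):

```latex
% Fix a boundary point p and parametrise the planes H through p by a
% unit normal \hat{n}, writing the volume on one side as:
\begin{aligned} V(\hat{n}) = Vol(\{x \in M : \langle x - p, \hat{n} \rangle > 0\}) \end{aligned}
% V is continuous and satisfies V(-\hat{n}) = Vol(M) - V(\hat{n}), so
% moving continuously from \hat{n} to -\hat{n}, the intermediate value
% theorem yields some \hat{n}^* with V(\hat{n}^*) = Vol(M)/2.
% Convexity then makes M_1, M \cap H and M_2 convex, hence
% path-connected, which is exactly the required decomposition.
```

Note that the bisecting plane exists for any compact body; it is the path-connectedness of the pieces that relies on convexity, which is why the non-convex direction is the hard part.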

Right now it’s not clear to me whether this problem might exist under another name. If it happens to be unsolved I promise ten cupcakes to the first person that manages to solve it before me. 

Note: I have written a more structured article based on this blog post here. 

Differentiating under the integral sign

Back when I was in high school I really enjoyed reading about Feynman’s entertaining numerical exploits. In particular, I remember integration challenges where he would use the Leibniz method of differentiating under the integral sign. I wasn’t taught this method at university, but it resurfaced in my mind recently when I tackled a problem in Hamiltonian dynamics where I had to differentiate under the integral sign. After using it I decided to take a closer look at its mathematical justification.

Leibniz method:
For an integral of the form:

\begin{aligned} u(x) = \int_{y_0}^{y_1} f(x,y) dy \end{aligned} 


For all x in some open interval, the derivative of this integral is expressible as

\begin{aligned} u'(x) = \int_{y_0}^{y_1} f_x(x,y) dy  \end{aligned}

provided that f and f_x are both continuous over a region [x_0,x_1] \times [y_0,y_1].



To derive this, we start from the definition of u(x) and of the derivative as a limit:

\begin{aligned} u(x) = \int_{y_0}^{y_1} f(x,y) dy \end{aligned}

\begin{aligned} u'(x) = \lim_{h\to 0} \frac{u(x+h)-u(x)}{h} \end{aligned}  

from which we may deduce:

\begin{aligned} u'(x) = \lim_{h\to 0} \frac{\int_{y_0}^{y_1} f(x+h,y) dy-\int_{y_0}^{y_1} f(x,y) dy}{h} = \lim_{n\to \infty} \int_{y_0}^{y_1} \frac{f(x+\frac{1}{n},y)-f(x,y)}{\frac{1}{n}} dy \end{aligned}

Now, if we define:

\begin{aligned} f_n (y) = \frac{f(x+\frac{1}{n},y)-f(x,y)}{\frac{1}{n}} \end{aligned}

it’s clear that we may apply the Bounded convergence theorem to obtain:

\begin{aligned} \int_{y_0}^{y_1} \lim_{n\to \infty} f_n(y) dy = \lim_{n\to \infty} \int_{y_0}^{y_1} f_n(y) dy \end{aligned}

This is justified as the existence and continuity of f_x(x,y), combined with the compactness of the rectangle [x_0,x_1] \times [y_0,y_1], implies that f_x is bounded there; moreover, by the mean value theorem each f_n(y) equals f_x(x+\theta_n/n, y) for some \theta_n \in (0,1), so the sequence f_n is uniformly bounded.
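As a quick numerical illustration of the rule (with a toy integrand of my own choosing, f(x,y) = \sin(xy) on [0,1]), the derivative obtained by differentiating under the integral sign agrees with a finite-difference derivative of u:

```python
import numpy as np

y = (np.arange(200000) + 0.5) / 200000     # midpoint rule on [0, 1]

def u(x):
    # u(x) = integral of sin(x y) over y in [0, 1]
    return np.sin(x * y).mean()

def u_prime_leibniz(x):
    # differentiate under the integral sign: the integrand becomes y cos(x y)
    return (y * np.cos(x * y)).mean()

x, h = 1.0, 1e-5
fd = (u(x + h) - u(x - h)) / (2 * h)       # central finite difference
print(abs(u_prime_leibniz(x) - fd))        # tiny discrepancy
```

For this integrand u(x) = (1 - \cos x)/x in closed form, so the result can also be checked against u'(1) = \sin 1 + \cos 1 - 1.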

Note 1: I plan to share the generalization of this result to higher dimensions in the near future.

Note 2: This blog post is almost identical to the Wikipedia entry with some modifications and clarifications which I find useful.