What is the probability that the sun will rise tomorrow?

The sunrise problem, first considered by Laplace, asks for the probability that the sun will rise tomorrow given a history of sunrises. Though this may seem like a silly problem, it can serve to illustrate fundamental differences between Frequentists and Bayesians.
While Frequentists use probability only to model processes broadly described as ‘sampling’, Bayesians use probability to model both sampling and their ‘degree of belief’.

First, let’s consider the Frequentist approach. The Frequentist has to
cheat in some way, because this problem isn’t well-defined in the Frequentist
framework: ‘tomorrow’ is a sample of size one (with an undefined sample standard
deviation). Some Frequentists try to define this probability by assuming that
there are many worlds with a sun potentially rising on each, but this is a really
silly bastardization of Laplace’s principle of insufficient reason. In all honesty,
the best the Frequentist can do is to calculate the probability that the sun will rise on
any day, and not the probability that the sun rises on a particular day. Here we go:

1) Let’s assume that this phenomenon can be modeled as n i.i.d. Bernoulli trials, so that X, the sum of n
Bernoulli random variables, follows a binomial distribution and represents the number of days that the sun
rises out of n observations.

2) By the Law of Large Numbers, \hat{p}=X/n converges to p as n grows,
where p is the probability that the sun rises on any day and \hat{p} is our estimate of this probability. Equivalently, X=n \hat{p} has expected value E[X]=np.

3) By the Central Limit Theorem, for large n the number of sunrises X is approximately normally distributed with mean np and variance np(1-p).

4) Furthermore, for large n we may construct confidence intervals
with coverage approximately 1-\alpha:

\displaystyle\hat{p}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

where z_{\alpha/2} is the 100(1-\alpha/2)th percentile of the standard normal distribution. So, after a large number of sunrises the Frequentist can give a reasonable answer for the probability that the sun would rise on any day provided that his assumption holds true.
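As a rough illustration of step 4, here is a minimal Python sketch of the normal-approximation interval. The function name `wald_interval`, the clipping to [0, 1], and the example counts are my own choices for illustration, not part of the original argument.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(sunrises: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Normal-approximation interval for p, clipped to [0, 1] (illustrative helper)."""
    if n <= 0 or not 0 <= sunrises <= n:
        raise ValueError("need n > 0 and 0 <= sunrises <= n")
    p_hat = sunrises / n
    # z_{alpha/2}: the 100(1 - alpha/2)th percentile of the standard normal
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical counts, purely for illustration
print(wald_interval(950, 1000))
print(wald_interval(1000, 1000))
```

Note that when every observed day had a sunrise, \hat{p}=1 and the half-width is zero, so this particular interval collapses to the single point 1 no matter how large n is.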

Now compare this with the Bayesian solution, which does allow questions about a particular day:

1) X is defined exactly as the Frequentist defined it, but in addition we can
define Y, the indicator of the event that the sun rises tomorrow, where Y equals 1 or 0.

2) Let \theta be the probability of a sunrise on any given day.

3) We assume that before observing X we had no prior information concerning
\theta. Hence, by the principle of insufficient reason we may assume that
our prior is uniformly distributed on [0,1].

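4) Now for the calculation of P(Y=1|X=n), where X=n means the sun rose on all n observed days, so that P(X=n|\theta)=\theta^n and P(\theta)=1 on [0,1]:
P(Y=1|X=n)=\int^1_0 P(Y=1|\theta)P(\theta|X=n)\,d\theta= \int^1_0 \theta \frac{P(X=n|\theta)P(\theta)}{\int^1_0 P(X=n|\theta')P(\theta')\,d\theta'} \,d\theta = \frac{\int^1_0 \theta^{n+1}\,d\theta}{\int^1_0 \theta^{n}\,d\theta} = \frac{n+1}{n+2}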
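
The closed form (n+1)/(n+2) can be sanity-checked numerically. The sketch below assumes the uniform prior from step 3 and the case where the sun rose on all n observed days, so the posterior predictive is the ratio of the integral of \theta^{n+1} to the integral of \theta^n over [0,1]. The function names and the midpoint-rule integration are illustrative choices of mine.

```python
from fractions import Fraction


def rule_of_succession(n: int) -> Fraction:
    """Closed form (n + 1) / (n + 2) for P(Y=1 | X=n) under a uniform prior."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return Fraction(n + 1, n + 2)


def posterior_predictive_numeric(n: int, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the ratio of integrals on [0, 1]."""
    h = 1.0 / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    numerator = sum(t ** (n + 1) for t in grid) * h    # integral of theta * theta^n
    denominator = sum(t ** n for t in grid) * h        # integral of theta^n
    return numerator / denominator


# Compare the exact fraction with the numerical estimate for a few n
for n in (0, 1, 10, 100):
    print(n, rule_of_succession(n), posterior_predictive_numeric(n))
```

With n=0 (no observations) the answer is 1/2, which is just the mean of the uniform prior, and it approaches 1 as the run of observed sunrises grows.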

Note: the Bayesian approach isn’t better than the Frequentist solution. But building a statistical model without any domain knowledge is doomed to fail whatever your approach…Bayesian, Frequentist, or otherwise.
