The Normal Distribution

On the importance of the normal distribution:

The normal distribution owes its importance to the Central Limit Theorem (CLT), which broadly states that the suitably normalized sum of i.i.d. random variables with finite variance tends towards a Gaussian distribution as the number of terms in the sum approaches infinity. In fact, De Moivre’s discovery of the normal distribution (1733) in the context of gambling is a special case of this important result. Later, the normal distribution was rediscovered by Gauss in 1809 as a result of his analysis of astronomical data.
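The CLT statement can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the choice of Uniform(0, 1) summands, the function names, the seed and the sample sizes are all hypothetical and not part of the original argument. It standardizes sums of n uniform variables and compares the fraction falling within one standard deviation of the mean with the corresponding normal probability.

```python
import math
import random

def standardized_sum(n, rng):
    """Sum n Uniform(0, 1) draws, then subtract the mean and divide by the standard deviation."""
    # Uniform(0, 1) has mean 1/2 and variance 1/12, so the sum has mean n/2 and variance n/12.
    total = sum(rng.random() for _ in range(n))
    return (total - n * 0.5) / math.sqrt(n / 12.0)

def fraction_within_one_sd(n, trials, seed=0):
    """Estimate P(|Z_n| <= 1) for the standardized sum Z_n by simulation."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    hits = sum(1 for _ in range(trials) if abs(standardized_sum(n, rng)) <= 1.0)
    return hits / trials

# P(|Z| <= 1) for a standard normal variable, about 0.6827.
normal_value = math.erf(1 / math.sqrt(2))

for n in (1, 2, 30):
    print(n, round(fraction_within_one_sd(n, 20000), 4), round(normal_value, 4))
```

For n = 1 the estimate should sit visibly below the normal value, and it should move towards it as n grows.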

The derivation in the next section builds on the work of Galileo and other astronomers, who made the following assumptions [3]:
1) all observations are encumbered with errors due to the observer, the instruments, and other observational conditions
2) the observations are distributed symmetrically about the true value i.e. the errors are distributed symmetrically about zero
3) small errors occur more frequently than large errors

Until 1809 it still wasn’t clear whether the best estimate of the true value was the mean, which minimizes the sum of squared deviations, or the median, which minimizes the sum of absolute deviations. Gauss’ innovation was to choose the mean as the best estimator of the ‘true value’. But until the development of the CLT, Gauss couldn’t fully defend his decision, except perhaps by saying that, by symmetry, the mean and median should be the same for the normal distribution if assumptions 2 and 3 hold true.
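The two minimization properties mentioned above can be checked by brute force. This is a minimal sketch, assuming a made-up list of measurements with one large error (the values, the grid step and the function names are illustrative only): it searches a fine grid of candidate values for the minimizer of each criterion and compares the results with the sample mean and median.

```python
import statistics

def sum_squared_dev(data, c):
    """Sum of squared deviations of the data from the candidate value c."""
    return sum((x - c) ** 2 for x in data)

def sum_abs_dev(data, c):
    """Sum of absolute deviations of the data from the candidate value c."""
    return sum(abs(x - c) for x in data)

# Hypothetical measurements with one large error (illustrative values only).
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 13.5]

# Candidate 'true values' on a grid with step 0.001 spanning the data.
step = 0.001
grid = [min(data) + i * step for i in range(int((max(data) - min(data)) / step) + 1)]

best_sq = min(grid, key=lambda c: sum_squared_dev(data, c))
best_abs = min(grid, key=lambda c: sum_abs_dev(data, c))
mean_value = statistics.mean(data)
median_value = statistics.median(data)

print("squared deviations minimized near", round(best_sq, 3), "mean =", round(mean_value, 3))
print("absolute deviations minimized near", round(best_abs, 3), "median =", round(median_value, 3))
```

With an outlier in the data the two estimates differ noticeably, which is why the choice between them mattered.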

Deriving the normal distribution:

Assumptions 2 and 3 in the previous section may be condensed to form the following necessary and sufficient criterion[2]:

data are normally distributed if the rate at which the frequencies fall off is
proportional to the distance of the measurements from the mean and to the
frequencies themselves

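1) If the frequencies are denoted by y(x) and k > 0 is the constant of proportionality, this gives us:

\frac{dy}{dx} = -k (x- \mu) y(x)

Separating variables and integrating both sides gives

\ln(y) = \frac{-k (x- \mu)^{2}}{2} + \ln(C)

and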

\forall x\in \Re, y(x) = C e^{ \frac{-k (x- \mu)^{2}}{2}}
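As a sanity check on this solution, the differential equation can be integrated numerically and compared with the closed form. The sketch below uses a classical fourth-order Runge-Kutta scheme, which is my choice for the check and not part of the derivation; the parameter values k = 2, \mu = 1, C = 0.5 and all function names are arbitrary illustrations.

```python
import math

def rhs(x, y, k, mu):
    """Right-hand side of dy/dx = -k (x - mu) y."""
    return -k * (x - mu) * y

def rk4_solve(k, mu, y0, x_end, steps):
    """Integrate the ODE from x = mu, where y(mu) = y0, up to x_end with classical RK4."""
    h = (x_end - mu) / steps
    x, y = mu, y0
    for _ in range(steps):
        s1 = rhs(x, y, k, mu)
        s2 = rhs(x + h / 2, y + h * s1 / 2, k, mu)
        s3 = rhs(x + h / 2, y + h * s2 / 2, k, mu)
        s4 = rhs(x + h, y + h * s3, k, mu)
        y += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        x += h
    return y

def closed_form(x, k, mu, c):
    """y(x) = C exp(-k (x - mu)^2 / 2), the solution obtained above."""
    return c * math.exp(-k * (x - mu) ** 2 / 2)

# Arbitrary illustrative parameters; note that y(mu) = C in the closed form.
k, mu, c = 2.0, 1.0, 0.5
numeric = rk4_solve(k, mu, c, x_end=3.0, steps=2000)
exact = closed_form(3.0, k, mu, c)
print(numeric, exact)
```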

2) By setting u = \sqrt{(k/2)} (x-\mu) and using the fact that y(x) is a probability distribution:

I = C \int^\infty_{-\infty} e^{ \frac{-k (x- \mu)^{2}}{2}}\,dx = C \sqrt{\frac{2}{k}}\int^\infty_{-\infty} e^{-u^{2}}\,du = 1

3) By squaring the integral and using polar coordinates:

I^{2} = 2 \frac{C^{2}}{k} \int^\infty_{-\infty} \int^\infty_{-\infty} e^{-(u^{2} + v^{2})}\,du\,dv = 1

\Rightarrow I^{2} = 2 \frac{C^{2}}{k} \int^{2 \pi}_0 \int^\infty_0 r e^{-r^{2}}\,dr\,d\theta = 2 \frac{C^{2}}{k} \pi = \frac{C^{2}}{k} 2 \pi = 1

so

C = \sqrt{ \frac{k}{2 \pi}}

y(x) = \sqrt{\frac{k}{2 \pi}}e^{\frac{-k (x- \mu)^{2}}{2}}
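The value of C can be checked numerically for a few values of k. This is a rough sketch, assuming a composite midpoint rule on a window of ten standard deviations either side of the mean; the rule, the window and the function names are my own choices for illustration.

```python
import math

def density(x, k, mu):
    """y(x) = sqrt(k / (2 pi)) exp(-k (x - mu)^2 / 2), with the constant C found above."""
    return math.sqrt(k / (2 * math.pi)) * math.exp(-k * (x - mu) ** 2 / 2)

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def total_mass(k, mu):
    """Integrate y over a window wide enough that the neglected tails are negligible."""
    half_width = 10 / math.sqrt(k)
    return midpoint_integral(lambda x: density(x, k, mu), mu - half_width, mu + half_width)

for k in (0.5, 1.0, 4.0):
    print(k, total_mass(k, mu=0.0))
```

Each printed total should be very close to 1, whatever the value of k.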

4) Computing the expected value:

let v = x - \mu , dv = dx
so, since the integrand is an odd function of v,

E(v) = \sqrt{ \frac{k}{2 \pi}} \int^\infty_{-\infty} v e^{-\frac{k v^2}{2}}\,dv = 0

\Rightarrow E(x) = E(v) + \mu = \mu

5) Computing the variance:

i) let w = x - \mu , dx = dw then \sigma^{2} = \sqrt{ \frac{k}{2 \pi}} \int^\infty_{-\infty} w^{2} e^{-\frac{k w^2}{2}}\,dw

ii) using integration by parts: u = w, du = dw, dv = w e^{-\frac{k w^2}{2}}\,dw, v = \frac{-1}{k} e^{-\frac{k w^2}{2}}

\Rightarrow \sigma^{2} = \sqrt{ \frac{k}{2 \pi}} \left[ \frac{-w}{k} e^{-\frac{k w^2}{2}} \right]_{-\infty}^{\infty} + \frac{1}{k} \sqrt{ \frac{k}{2 \pi}} \int^\infty_{-\infty} e^{-\frac{k w^2}{2}}\,dw = 0 + \frac{1}{k} = \frac{1}{k}

So k = \frac{1}{\sigma^{2}}, and we have

\forall x \in \Re, y(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} ( \frac{x- \mu}{\sigma})^{2}}
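The moment calculations in steps 4 and 5 can also be checked numerically. The following is a sketch under assumptions of my own (a midpoint rule, a window of twelve standard deviations either side of the mean, and arbitrary test values k = 4, \mu = 1.5); it should return a mean close to \mu and a variance close to 1/k.

```python
import math

def density(x, k, mu):
    """y(x) = sqrt(k / (2 pi)) exp(-k (x - mu)^2 / 2), written in terms of k."""
    return math.sqrt(k / (2 * math.pi)) * math.exp(-k * (x - mu) ** 2 / 2)

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def mean_and_variance(k, mu):
    """Numerically evaluate E(x) and E((x - mu)^2) for the density y."""
    half_width = 12 / math.sqrt(k)
    a, b = mu - half_width, mu + half_width
    mean = midpoint_integral(lambda x: x * density(x, k, mu), a, b)
    variance = midpoint_integral(lambda x: (x - mu) ** 2 * density(x, k, mu), a, b)
    return mean, variance

# Arbitrary illustrative parameters.
mean, variance = mean_and_variance(k=4.0, mu=1.5)
print(mean, variance, 1 / 4.0)
```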

Showing that the normal distribution is a probability distribution:

We assumed that the normal distribution was a probability distribution but this can also be proven:

1) I' = \int^\infty_{-\infty} \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} ( \frac{x- \mu}{\sigma})^{2}}\,dx = \sqrt{(\int^\infty_{-\infty} \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} ( \frac{x- \mu}{\sigma})^{2}}\,dx)^{2}}
so I' = \sqrt{\int^\infty_{-\infty} \int^\infty_{-\infty} \frac{1}{\sigma^{2} 2 \pi} e^{- ( \frac{(x- \mu)^{2} + (y- \mu)^{2}}{2 \sigma ^{2}})}\,dxdy}

2) Now substitute \begin{cases} x' = x - \mu \\ y' = y - \mu\end{cases} and \begin{cases}dx' = dx \\ dy' = dy\end{cases}:

I' = \sqrt{\int^\infty_{-\infty} \int^\infty_{-\infty} \frac{1}{\sigma^{2} 2 \pi} e^{- ( \frac{(x')^{2} + (y')^{2}}{2 \sigma ^{2}})}\,dx'dy'}

…and, converting to polar coordinates:

\Rightarrow I' = \sqrt{\int^{2 \pi}_0 \int^\infty_0 \frac{1}{\sigma^{2} 2 \pi} r e^{-( \frac{r^{2}}{2 \sigma ^{2}})}\,drd\theta} = \sqrt{1} = 1
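The polar-coordinate step can be checked numerically as well. In the sketch below (function names, the midpoint rule and the cutoff at 20 \sigma are my own assumptions), the angular integral contributes 2 \pi \cdot \frac{1}{2 \pi} = 1, so the squared integral reduces to the radial integral of \frac{r}{\sigma^{2}} e^{-r^{2}/(2 \sigma^{2})}, which should be close to 1 for any \sigma > 0.

```python
import math

def radial_integrand(r, sigma):
    """(r / sigma^2) exp(-r^2 / (2 sigma^2)), the radial part after the polar substitution."""
    return r / sigma ** 2 * math.exp(-r ** 2 / (2 * sigma ** 2))

def midpoint_integral(f, a, b, n=100000):
    """Composite midpoint rule for the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def squared_integral(sigma):
    """Approximate (I')^2 as (angular factor) * (radial integral), truncating r at 20 sigma."""
    angular = 2 * math.pi * (1 / (2 * math.pi))  # integral of 1/(2 pi) over [0, 2 pi]
    radial = midpoint_integral(lambda r: radial_integrand(r, sigma), 0.0, 20 * sigma)
    return angular * radial

for sigma in (0.5, 1.0, 3.0):
    value = squared_integral(sigma)
    print(sigma, value, math.sqrt(value))
```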

References:

1) “Proof That Normal Distribution Is a Distribution | Planetmath.org.” Planetmath.org. Planetmath, 21 Mar. 2013. Web. 24 Apr. 2015.

2) Wilson, Robert S. “The Normal Distribution.” The Normal Distribution. Sonoma, n.d. Web. 24 Apr. 2015.

3) Stahl, Saul. “The Evolution of the Normal Distribution.” Mathematics Magazine 1 Apr. 2006: 96-113. Print.
