The Central Limit Theorem

The Central Limit Theorem (CLT) says that for data drawn independently from any distribution with finite variance (normal or otherwise), the distribution of the sample mean is approximately normal once the sample is large enough. Historically, this has simplified the development of sampling methods, statistical tests, and statistical algorithms, and it is why statisticians give special importance to the normal distribution.

To be precise, the CLT is stated in this manner:
Let \{X_{n}\} be a sequence of i.i.d. random variables with mean \mu = 0 and variance \sigma^{2} = 1. If Z \sim N(0,1) and S_{n} = \sum_{i=1}^{n} X_{i}, then S_{n}/\sqrt{n} \longrightarrow Z in distribution as n \longrightarrow \infty, i.e. \forall x \in \Re, \lim_{n \to \infty}P(S_{n}/\sqrt{n} \leq x) = \frac{1}{\sqrt{2 \pi}} \int^x_{-\infty} e^{-\frac{u^2}{2}}\,du
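The statement can be checked empirically. The following is a minimal simulation sketch, not part of the proof: it assumes the X_i are uniform on [-\sqrt{3}, \sqrt{3}] (an arbitrary choice with mean 0 and variance 1), and the function names, n = 30, and the trial count are illustrative.

```python
import math
import random

def normal_cdf(x):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def empirical_cdf_of_scaled_sum(n, x, trials, rng):
    """Estimate P(S_n / sqrt(n) <= x) by simulation.

    Assumption: each X_i is uniform on [-sqrt(3), sqrt(3)],
    which has mean 0 and variance 1.
    """
    half_width = math.sqrt(3.0)
    hits = 0
    for _ in range(trials):
        s = sum(rng.uniform(-half_width, half_width) for _ in range(n))
        if s / math.sqrt(n) <= x:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
for x in (-1.0, 0.0, 1.0):
    estimate = empirical_cdf_of_scaled_sum(n=30, x=x, trials=20000, rng=rng)
    print(f"x={x:+.1f}  simulated={estimate:.3f}  normal={normal_cdf(x):.3f}")
```

The simulated and normal columns should agree up to Monte Carlo noise, even though a single X_i is far from normal.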

Lemma: Levy’s continuity theorem states that convergence in distribution is equivalent to point-wise convergence of the corresponding characteristic functions.

In order to use Levy’s continuity theorem, we must use the following estimates on Taylor expansions of exponential functions:
a) u \geq 0, 0 \leq e^{-u} -1 +u \leq u^{2}/2
b) \forall t \in \Re, |e^{it} -1-it| \leq |t|^{2}/2
c) \forall t \in \Re, |e^{it} -1-it-(it)^{2}/2| \leq |t|^{3}/6
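As a sanity check (not a proof), the three estimates can be tested numerically on a grid of points. The helper names, the grid, and the floating-point tolerance below are illustrative choices.

```python
import cmath
import math

def estimate_a_gap(u):
    """e^{-u} - 1 + u, which estimate (a) places in [0, u^2/2] for u >= 0."""
    return math.exp(-u) - 1.0 + u

def estimate_b_gap(t):
    """|e^{it} - 1 - it|, bounded by t^2/2 in estimate (b)."""
    return abs(cmath.exp(1j * t) - 1 - 1j * t)

def estimate_c_gap(t):
    """|e^{it} - 1 - it - (it)^2/2|, bounded by |t|^3/6 in estimate (c)."""
    return abs(cmath.exp(1j * t) - 1 - 1j * t - (1j * t) ** 2 / 2)

tol = 1e-12  # slack for floating-point rounding
grid = [k / 10.0 for k in range(-100, 101)]  # points from -10 to 10
for t in grid:
    assert estimate_b_gap(t) <= t * t / 2 + tol
    assert estimate_c_gap(t) <= abs(t) ** 3 / 6 + tol
    if t >= 0:
        assert -tol <= estimate_a_gap(t) <= t * t / 2 + tol
print("estimates (a), (b), (c) hold at every grid point")
```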

Now we may proceed with the proof:

i) let F be the characteristic function of the common distribution of the \{X_{n}\}. Then for every t \in \Re, the characteristic function of S_{n}/\sqrt{n} is given by E(e^{itS_{n}/\sqrt{n}}) = [F(t/\sqrt{n})]^{n}

ii) Consequently, our task is to prove that \forall t \in \Re, \lim_{n \to \infty}[F(t/\sqrt{n})]^{n} = e^{-t^{2}/2}
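To make the target of step ii) concrete, here is a small sketch for one specific distribution. It assumes X takes the values +1 and -1 with probability 1/2 each (mean 0, variance 1), so that F(t) = \cos(t); this choice and the function names are illustrative and do not appear in the proof.

```python
import math

def rademacher_cf(t):
    """Characteristic function of X = +1 or -1 with probability 1/2 each:
    E[e^{itX}] = cos(t)."""
    return math.cos(t)

def scaled_sum_cf(t, n):
    """Characteristic function of S_n / sqrt(n), i.e. [F(t / sqrt(n))]^n."""
    return rademacher_cf(t / math.sqrt(n)) ** n

t = 1.5
target = math.exp(-t * t / 2)
for n in (10, 100, 1000, 10000):
    value = scaled_sum_cf(t, n)
    print(f"n={n:>5}  [F(t/sqrt(n))]^n={value:.6f}  gap={abs(value - target):.2e}")
print(f"limit e^(-t^2/2) = {target:.6f}")
```

The printed gap should shrink as n grows, which is the point-wise convergence that Levy’s continuity theorem requires.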

iii) We begin our estimation by noting that |[F(t/\sqrt{n})]^{n}-e^{-t^{2}/2}| \leq n |F(t/\sqrt{n})-e^{-t^{2}/2n}| since |F(t/\sqrt{n})| \leq 1 and |e^{-t^{2}/2n}| \leq 1
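The inequality in step iii) rests on an elementary fact about complex numbers of modulus at most 1. A short derivation, where a and b are shorthand introduced here for a = F(t/\sqrt{n}) and b = e^{-t^{2}/2n}, so that b^{n} = e^{-t^{2}/2}:

```latex
a^{n} - b^{n} = (a - b) \sum_{k=0}^{n-1} a^{k} b^{\,n-1-k}
\quad\Longrightarrow\quad
|a^{n} - b^{n}| \leq |a - b| \sum_{k=0}^{n-1} |a|^{k} |b|^{\,n-1-k} \leq n\,|a - b|
\qquad \text{whenever } |a| \leq 1,\ |b| \leq 1 .
```

The sum has n terms, each of modulus at most 1, which is where the factor n comes from.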

iv) Now, we may use the triangle inequality to show that |[F(t/\sqrt{n})]^{n}-e^{-t^{2}/2}| \leq n |F(t/\sqrt{n})-(1-t^{2}/2n)| + n |(1-t^{2}/2n)-e^{-t^{2}/2n}|

v) By our first estimate, letting u = t^{2}/2n \geq 0, we see that n |(1-t^{2}/2n)-e^{-t^{2}/2n}| \leq \frac{n(t^{2}/2n)^{2}}{2} = t^{4}/8n, which approaches 0 as n \longrightarrow \infty.

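vi) For the first term, let X denote any one of the X_{i}. Since E[X] = 0 and E[X^{2}] = 1, we have 1-t^{2}/2n = E[1+\frac{itX}{\sqrt{n}}+i^{2}t^{2}X^{2}/2n], and so n |F(t/\sqrt{n})-(1-t^{2}/2n)|=n |E[e^{itX/\sqrt{n}}-(1+\frac{itX}{\sqrt{n}}+i^{2}t^{2}X^{2}/2n)]| \leq n E[|e^{itX/\sqrt{n}}-(1+\frac{itX}{\sqrt{n}}+i^{2}t^{2}X^{2}/2n)|]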

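For any \delta > 0 and positive integer n, let A=A(\delta,n) = \{|X|>\delta\sqrt{n}\}. Then |e^{itX/\sqrt{n}}-(1+\frac{itX}{\sqrt{n}}+i^{2}t^{2}X^{2}/2n)| \leq (\frac{t^{2}X^{2}}{n}) I_{A} + (\frac{1}{6}\frac{|tX|^{3}}{n^{3/2}}) I_{A^{c}}. On A we use estimate b) together with the triangle inequality, which gives the bound \frac{t^{2}X^{2}}{2n} + \frac{t^{2}X^{2}}{2n} = \frac{t^{2}X^{2}}{n}; on A^{c} we use estimate c) with tX/\sqrt{n} in place of t.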

Consequently,

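n E[|e^{itX/\sqrt{n}}-(1+\frac{itX}{\sqrt{n}}+i^{2}t^{2}X^{2}/2n)|] \leq n E[(\frac{t^{2}X^{2}}{n}) I_{A}] + n E[\frac{1}{6}\frac{|tX|^{3}}{n^{3/2}} I_{A^{c}}] \leq t^{2} E[X^{2} I_{A}] + \frac{\delta}{6} E[|t|^{3} X^{2}] = t^{2} E[X^{2} I_{A}] + \frac{\delta |t|^{3}}{6}

where the second inequality uses |X| \leq \delta\sqrt{n} on A^{c}, and the final equality uses E[X^{2}] = 1.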

Now, given \varepsilon > 0, we first choose \delta > 0 so that \frac{|t|^{3} \delta}{6} \leq \frac{\varepsilon}{2}, and for this \delta we choose N so that for all n \geq N we have

t^{2}E[X^{2} I_{A}] \leq \frac{\varepsilon}{2}
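The choice of \delta and then of the sample size can be sketched numerically for one concrete case. The example below assumes X \sim N(0,1) and t \neq 0, purely for illustration (the proof covers any distribution with mean 0 and variance 1). For the standard normal, integration by parts gives E[X^{2} I_{\{|X|>c\}}] = 2(c\,\varphi(c) + P(X>c)), where \varphi is the standard normal density; the function names are hypothetical.

```python
import math

def normal_pdf(c):
    """Standard normal density at c."""
    return math.exp(-c * c / 2) / math.sqrt(2 * math.pi)

def normal_upper_tail(c):
    """P(X > c) for X ~ N(0, 1)."""
    return 0.5 * math.erfc(c / math.sqrt(2))

def truncated_second_moment(c):
    """E[X^2 ; |X| > c] for X ~ N(0, 1), for c >= 0:
    2 * (c * pdf(c) + P(X > c)), by integration by parts."""
    return 2.0 * (c * normal_pdf(c) + normal_upper_tail(c))

def smallest_n(t, eps):
    """Pick delta with |t|^3 * delta / 6 = eps / 2 (requires t != 0), then
    search for the first n with t^2 * E[X^2 I_A] <= eps / 2, where
    A = {|X| > delta * sqrt(n)}. For this X the truncated moment decreases
    in n, so the bound also holds for every larger n."""
    delta = 3.0 * eps / abs(t) ** 3
    n = 1
    while t * t * truncated_second_moment(delta * math.sqrt(n)) > eps / 2:
        n += 1
    return delta, n

delta, n = smallest_n(t=1.0, eps=0.1)
print(f"delta={delta:.3f}  first n with t^2 E[X^2 I_A] <= eps/2: {n}")
```

Smaller \varepsilon or larger |t| forces a smaller \delta and therefore a larger n, which matches the order of the choices in the argument.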

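Since X^{2} I_{A} \leq X^{2}, E[X^{2}] = 1 < \infty, and I_{A} \longrightarrow 0 almost surely as n \longrightarrow \infty, the dominated convergence theorem gives E[X^{2} I_{A}] \longrightarrow 0, so this choice is always possible and the proof is complete. But it’s important to note that: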
1) You can never collect an infinite amount of data.
2) A lot of data isn’t generated by stationary processes and so the i.i.d. assumption doesn’t necessarily hold. You can check this discussion for more details.

*This proof generalizes easily to i.i.d. random variables with finite mean \mu and finite, nonzero variance \sigma^{2}: simply apply it to the normalized variables (X_{i}-\mu)/\sigma, which have mean 0 and variance 1.
