# Lecture 11: Central Limit Theorem

STA237: Probability, Statistics, and Data Analysis I

Michael Jongho Moon

PhD Student, DoSS, University of Toronto

Monday, June 19, 2023

## Recall: Law of large numbers

Suppose $X_1$, $X_2$, …, $X_n$ are independent random variables with expectation $\mu$ and variance $\sigma^2$. Then for any $\varepsilon > 0$,

$\lim_{n\to\infty}P\left(\left|\overline{X}_n-\mu\right|>\varepsilon\right)=0,$

where $\overline{X}_n=\left.\sum_{i=1}^n X_i\right/n$.

That is, $\overline{X}_n$ converges in probability to $\mu$.
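The statement above can be checked empirically. Below is a minimal Python sketch (the course examples use R, but this uses only the Python standard library) under an assumed Uniform(0, 1) population, so $\mu = 0.5$: for growing $n$, the estimated probability that $\overline{X}_n$ misses $\mu$ by more than $\varepsilon$ shrinks toward zero.

```python
import random
import statistics

random.seed(1)

def deviation_prob(n, eps=0.05, reps=2000):
    """Estimate P(|mean(X_1..X_n) - mu| > eps) for Uniform(0,1) draws (mu = 0.5)."""
    mu = 0.5
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(random.random() for _ in range(n))
        if abs(xbar - mu) > eps:
            hits += 1
    return hits / reps

# The estimated deviation probabilities shrink toward 0 as n grows.
probs = [deviation_prob(n) for n in (10, 100, 1000)]
print(probs)
```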

## Distributions of sample means

• Convergence in probability to the mean suggests that the sampling distribution of $\overline{X}_n$ becomes narrower as $n$ increases.
• The convergence occurs regardless of the originating distribution.

## Distributions of sample means

• We also observe that their distributions become roughly symmetric bell shapes with larger sample sizes.
• This behaviour also seems to occur regardless of the underlying distribution shape.

## Convergence in distribution

The distribution of $Y_n$ becomes closer and closer to that of $W$.

Let $Y_1$, $Y_2$, $Y_3$, … be an infinite sequence of random variables, and let $W$ be another random variable. Then, we say the sequence $\left\{Y_n\right\}$ converges in distribution to $W$ if for all $w\in\mathbb{R}$ such that $P\left(W = w\right)=0$, we have

$\lim_{n\to\infty}P\left(Y_n\le w\right)=P\left(W\le w\right)$

and we write

$Y_n\overset{d}{\to}W.$

## Example: Binomial for infinite trials

Suppose $X_n\sim\text{Binom}\left(n,\theta_n\right)$ describes the number of successes across $n$ independent sub-intervals of equal length, where $\theta_n=\frac{\lambda}{n}$ for some $\lambda>0$ that represents a rate of success.

What happens when you make the sub-intervals infinitesimally small?

We have seen that $\lim_{n\to\infty}p_{X_n}(x)=p_X(x)$ where $X\sim\text{Pois}(\lambda)$. Recall how we derived the pmf of a Poisson random variable in Lecture 3.

$$
\begin{aligned}
\lim_{n\to\infty}p_{X_n}(x) &= \lim_{n\to\infty}\binom{n}{x}\left(\frac{\lambda}{n}\right)^x\left(1-\frac{\lambda}{n}\right)^{n-x} \\
&= \frac{\lambda^x}{x!}\lim_{n\to\infty}\frac{n!}{\left(n-x\right)!\,n^x}\left(1-\frac{\lambda}{n}\right)^{n-x} \\
&\;\;\vdots \\
&= \frac{\lambda^x e^{-\lambda}}{x!}
\end{aligned}
$$

$X_n\overset{d}{\to}X, \quad X\sim\text{Pois}\left(\lambda\right)$
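This convergence can be verified numerically. A small sketch (Python standard library rather than the slides' R; the values $\lambda = 3$ and $x = 2$ are assumed for illustration) compares the $\text{Binom}(n, \lambda/n)$ pmf with the $\text{Pois}(\lambda)$ pmf as $n$ grows:

```python
import math

lam, x = 3.0, 2  # assumed example values

def binom_pmf(n, p, x):
    """pmf of Binom(n, p) at x."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Pois(lam) pmf at x: lam^x * exp(-lam) / x!
pois = lam**x * math.exp(-lam) / math.factorial(x)

# The binomial pmf approaches the Poisson pmf as n grows.
for n in (10, 100, 10000):
    print(n, round(binom_pmf(n, lam / n, x), 6), "vs Poisson", round(pois, 6))
```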

# Central limit theorem

• We have observed that the sampling distributions of sample means $\overline{X}_n$ take on similar shapes regardless of the originating distribution.
• The central limit theorem explains to which distribution they converge.

## The central limit theorem

Let $X_1$, $X_2$, $X_3$, … be independent and identically distributed random variables with $E\left(X_1\right)=\mu<\infty$ and $0< \text{Var}\left(X_1\right)=\sigma^2<\infty$. For $n\ge1$, let

$Z_n=\frac{\sqrt{n}\left(\overline{X}_n-\mu\right)}{\sigma},$

where $\overline{X}_n=\left.\sum_{i=1}^nX_i\right/n$. Then, for any number $a\in\mathbb{R}$,

$\lim_{n\to\infty}P\left(Z_n\le a\right)=\Phi\left(a\right),$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.

## The central limit theorem

In practice, $\overline{X}_n$ approximately follows the distribution of $\left(Z\frac{\sigma}{\sqrt{n}}+\mu\right)$ or $N\left(\mu, \frac{\sigma^2}{n}\right)$ for large $n$.

In other words,

$\frac{\sqrt{n}\left(\overline{X}_n-\mu\right)}{\sigma}\overset{d}{\to}Z,$

where $Z\sim N\left(0,1\right)$.
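A simulation sketch of this standardization (assumed example: i.i.d. Exponential(1) draws, so $\mu = \sigma = 1$; Python standard library rather than the slides' R) compares the empirical $P\left(Z_n \le a\right)$ with $\Phi(a)$:

```python
import math
import random
from statistics import NormalDist

random.seed(2)
n, reps, a = 200, 5000, 1.0

# Simulate Z_n = sqrt(n) * (Xbar_n - mu) / sigma for Exponential(1) draws.
count = 0
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    z = math.sqrt(n) * (xbar - 1.0) / 1.0
    if z <= a:
        count += 1

# The empirical probability should be close to Phi(a) ≈ 0.841.
print(count / reps, "vs", NormalDist().cdf(a))
```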

Recall the survey on Canadian student smoking prevalence.

• As you increase the sample size $n$, the sampling distribution of $T_n$ not only became narrower by the LLN

• but also closer to a symmetrical and bell-shaped distribution by the CLT.

## Example: Normal approximation of the binomial distribution

Suppose $Y\sim \text{Binom}\left(50, 0.3\right)$ and we are interested in $P(Y\le 20)$.

• $Y\overset{d}{=}\sum_{i=1}^{50} W_i$ where $W_i\sim\text{Ber}(0.3)$ independently.
• We may use the CLT to approximate $\phantom{=}P\left(Y\le 20\right)$ $=P\left(\frac{Y}{50} \le \frac{20}{50}\right)$ $=P\left(\overline{W}_{50}\le 0.4\right)$
• Recall $E(W_1)=0.3$ and $\text{Var}(W_1)=0.3\cdot 0.7=0.21$

Using exact $F_Y(y)$

• $F_Y(20) = \sum_{y=0}^{20} p_Y(y)$
• $\phantom{F_Y(20)} = \sum_{y=0}^{20} \binom{50}{y}0.3^{y}0.7^{50 - y}$

`pbinom(20, 50, 0.3)` in R.

• $\phantom{F_Y(20)} \approx 0.952$

Approximating via $Z\sim N(0,1)$

• $F_Y(20) \approx P\left(Z\cdot \sqrt{0.21 / 50} + 0.3 \le 0.4\right)$

`pnorm(.4, .3, sqrt(.21 / 50))` in R.

• $\phantom{F_Y(20)} \approx 0.939$
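The two columns above can be reproduced in a few lines. A sketch in Python (the slides use R's `pbinom`/`pnorm`; this uses only the standard library) computes the exact $\text{Binom}(50, 0.3)$ cdf at 20 and the CLT approximation:

```python
import math
from statistics import NormalDist

n, p, y = 50, 0.3, 20

# Exact: F_Y(20) = sum of Binom(50, 0.3) pmf over y = 0..20.
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

# CLT approximation: Wbar_50 is approximately N(p, p(1-p)/n).
approx = NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n)).cdf(y / n)

print(round(exact, 3), round(approx, 3))  # 0.952 and 0.939, matching the slide
```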

$Z_n = n\left(Z\sqrt{\frac{0.21}{n}}+0.3\right), \quad\text{e.g.,}\quad Z_{50} = 50\left(Z\sqrt{\frac{0.21}{50}}+0.3\right)$

$\overline{W}_{5}\quad\text{vs}\quad Z_5$

• pbinom(2, 5, .3) $\approx 0.837$
• pnorm(2, 1.5, sqrt(.21 * 5)) $\approx 0.687$

$\overline{W}_{50}\quad\text{vs}\quad Z_{50}$

• pbinom(20, 50, .3) $\approx 0.952$
• pnorm(20, 15, sqrt(.21 * 50)) $\approx 0.939$

$\overline{W}_{500}\quad\text{vs}\quad Z_{500}$

• pbinom(200, 500, .3) $\approx 0.9999992$
• pnorm(200, 150, sqrt(.21 * 500)) $\approx 0.9999995$

• With a smaller number of trials $n$, the “gaps” between the discrete binomial steps are larger and the approximation is less precise.
• With a larger number of trials $n$, the approximation becomes more precise.

# Summary

Given an independent and identically distributed sample from a population of finite mean $\mu$ and positive finite variance $\sigma^2$,

• the standardized sample mean converges in distribution to $N(0,1)$; equivalently, $\overline{X}_n$ approximately follows $N\left(\mu, \sigma^2/n\right)$ for large $n$.

We often use the central limit theorem to approximate distributions of finite samples when the sample size is sufficiently large.

# Review

• Weekly Activity 5 Questions
• Selected questions from past exams
• Questions