STA237: Probability, Statistics, and Data Analysis I

Michael Jongho Moon

PhD Student, DoSS, University of Toronto

Monday, June 19, 2023

Suppose \(X_1\), \(X_2\), …, \(X_n\) are independent random variables with expectation \(\mu\) and variance \(\sigma^2\). Then for any \(\varepsilon > 0\),

\[\lim_{n\to\infty}P\left(\left|\overline{X}_n-\mu\right|>\varepsilon\right)=0,\]

where \(\overline{X}_n=\left.\sum_{i=1}^n X_i\right/n\).

That is, \(\overline{X}_n\) converges in probability to \(\mu\).

- Convergence in probability to the mean suggests that the sampling distribution of \(\overline{X}_n\) becomes narrower as \(n\) increases.
- The convergence occurs regardless of the originating distribution.
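A quick simulation illustrates this (a Python sketch, not part of the original slides; the exponential population, \(\varepsilon = 0.1\), and the seed are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(237)  # seed chosen arbitrarily for reproducibility

# Illustrative choice: an exponential population with mean mu = 1,
# deliberately skewed to show the LLN does not depend on the shape.
mu, eps = 1.0, 0.1
for n in (10, 100, 10_000):
    # 1,000 replications of the sample mean of n observations
    means = rng.exponential(scale=mu, size=(1_000, n)).mean(axis=1)
    # Monte Carlo estimate of P(|mean - mu| > eps); shrinks toward 0 as n grows
    print(n, np.mean(np.abs(means - mu) > eps))
```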

- We also observe that the sampling distributions become roughly symmetric and bell-shaped with larger sample sizes.
- This behaviour also seems to occur regardless of the underlying distribution shape.

The distribution of \(Y_n\) becomes closer and closer to that of \(W\).

Let \(Y_1\), \(Y_2\), \(Y_3\), … be an infinite sequence of random variables, and let \(W\) be another random variable. Then, we say the sequence \(\left\{Y_n\right\}\) **converges in distribution** to \(W\) if for all \(w\in\mathbb{R}\) such that \(P\left(W = w\right)=0\), we have

\[\lim_{n\to\infty}P\left(Y_n\le w\right)=P\left(W\le w\right)\]

and we write

\[Y_n\overset{d}{\to}W.\]

Suppose \(X_n\sim\text{Binom}\left(n,\theta_n\right)\) describes the number of successes across \(n\) independent sub-intervals of equal length, where \(\theta_n=\frac{\lambda}{n}\) for some \(\lambda>0\) that represents a rate of success.

What happens when you make the sub-intervals infinitesimally small?

We have seen that \(\lim_{n\to\infty}p_{X_n}(x)=p_X(x)\) where \(X\sim\text{Pois}(\lambda)\). Recall how we derived the pmf of a Poisson random variable in Lecture 3.

- \(\lim_{n\to\infty}p_{X_n}(x)=\lim_{n\to\infty}\binom{n}{x}\left(\frac{\lambda}{n}\right)^x\left(1-\frac{\lambda}{n}\right)^{n-x}\)
- \(\phantom{\lim_{n\to\infty}p_X(x)}=\frac{\lambda^x}{x!}\lim_{n\to\infty}\frac{n!}{\left(n-x\right)!n^x}\left(1-\frac{\lambda}{n}\right)^{n-x}\)
- \(\phantom{\lim_{n\to\infty}p_X(x)}\vdots\)
- \(\phantom{\lim_{n\to\infty}p_X(x)}=\frac{\lambda^xe^{-\lambda}}{x!}\)

\[X_n\overset{d}{\to}X, \quad X\sim\text{Pois}\left(\lambda\right)\]
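This limit can be checked numerically. The sketch below (Python; \(\lambda=3\) and \(x=4\) are arbitrary illustrative values, not from the slides) evaluates the \(\text{Binom}(n,\lambda/n)\) pmf at a fixed \(x\) for growing \(n\):

```python
from math import comb, exp, factorial

lam, x = 3.0, 4  # illustrative rate and count, not from the slides

def binom_pmf(n):
    """pmf of Binom(n, lam/n) evaluated at x."""
    theta = lam / n
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Target: the Pois(lam) pmf at x
pois_pmf = lam**x * exp(-lam) / factorial(x)

# The absolute gap shrinks as the sub-intervals get smaller (n grows)
for n in (10, 100, 10_000):
    print(n, binom_pmf(n), abs(binom_pmf(n) - pois_pmf))
```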

- We have observed that the sampling distributions of the sample means \(\overline{X}_n\) take on similar shapes as \(n\) grows, regardless of the originating distribution.
- The central limit theorem explains to which distribution they converge.

Let \(X_1\), \(X_2\), \(X_3\), … be *independent and identically distributed random variables* with \(E\left(X_1\right)=\mu<\infty\) and \(0< \text{Var}\left(X_1\right)=\sigma^2<\infty\). For \(n\ge1\), let

\[Z_n=\frac{\sqrt{n}\left(\overline{X}_n-\mu\right)}{\sigma},\]

where \(\overline{X}_n=\left.\sum_{i=1}^nX_i\right/n\). Then, for any number \(a\in\mathbb{R}\),

\[\lim_{n\to\infty}P\left(Z_n\le a\right)=\Phi\left(a\right),\]

where \(\Phi\) is the cumulative distribution function of the standard normal distribution.

In practice, \(\overline{X}_n\) approximately follows the distribution of \(\left(Z\frac{\sigma}{\sqrt{n}}+\mu\right)\) or \(N\left(\mu, \frac{\sigma^2}{n}\right)\) for large \(n\).

In other words,

\[\frac{\sqrt{n}\left(\overline{X}_n-\mu\right)}{\sigma}\overset{d}{\to}Z,\]

where \(Z\sim N\left(0,1\right)\).
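As an illustration (a Python sketch, not from the slides; the \(\text{Unif}(0,1)\) population, \(a=1\), and the simulation sizes are arbitrary choices), the empirical \(P(Z_n\le a)\) approaches \(\Phi(a)\):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(237)

# Illustrative population: Uniform(0, 1), so mu = 1/2 and sigma^2 = 1/12
mu, sigma = 0.5, sqrt(1 / 12)
n, reps, a = 500, 100_000, 1.0

means = rng.uniform(size=(reps, n)).mean(axis=1)
z = sqrt(n) * (means - mu) / sigma  # the standardized sample means Z_n

phi_a = 0.5 * (1 + erf(a / sqrt(2)))  # Phi(a) via the error function
print(np.mean(z <= a), phi_a)  # the two values nearly agree
```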

Recall the survey on Canadian student smoking prevalence.

- As you increase the sample size \(n\), the sampling distribution of \(T_n\) not only became narrower by the LLN . . .
- but also closer to a symmetric, bell-shaped distribution by the CLT.

Suppose \(Y\sim \text{Binom}\left(50, 0.3\right)\) and we are interested in \(P(Y\le 20)\).

- \(Y\overset{d}{=}\sum_{i=1}^{50} W_i\) where \(W_i\sim\text{Ber}(0.3)\) independently.
- We may use the CLT to approximate \[\phantom{=}P\left(Y\le 20\right)\] \[=P\left(\frac{Y}{50} \le \frac{20}{50}\right)\] \[=P\left(\overline{W}_{50}\le 0.4\right)\]
- Recall \(E(W_1)=0.3\) and \(\text{Var}(W_1)=0.3\cdot 0.7=0.21\)

Using exact \(F_Y(y)\)

- \(F_Y(20) = \sum_{y=0}^{20} p_Y(y)\)
- \(\phantom{F_Y(20)} = \sum_{y=0}^{20} \binom{50}{y}0.3^{y}0.7^{50 - y}\)

`pbinom(20, 50, 0.3)`

in R.

- \(\phantom{F_Y(20)} \approx 0.952\)

Approximating via \(Z\sim N(0,1)\)

- \(F_Y(20) \approx P\left(Z\cdot \sqrt{0.21 / 50} + 0.3 \le 0.4\right)\)

`pnorm(.4, .3, sqrt(.21 / 50))`

in R.

- \(\phantom{F_Y(y)} \approx 0.939\)
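The two R one-liners above can be mirrored from first principles (a Python sketch; the exact cdf is the binomial sum and the approximation evaluates \(\Phi\) via the error function):

```python
from math import comb, erf, sqrt

n, theta, y = 50, 0.3, 20

# Exact cdf: F_Y(20) = sum over k of C(50, k) 0.3^k 0.7^(50-k),
# the quantity R computes as pbinom(20, 50, 0.3)
exact = sum(comb(n, k) * theta**k * (1 - theta)**(n - k) for k in range(y + 1))

# CLT approximation: P(Z * sqrt(0.21/50) + 0.3 <= 0.4),
# the quantity R computes as pnorm(.4, .3, sqrt(.21 / 50))
mu, sd = theta, sqrt(theta * (1 - theta) / n)
approx = 0.5 * (1 + erf((y / n - mu) / (sd * sqrt(2))))

print(round(exact, 3), round(approx, 3))  # 0.952 and 0.939, as on the slide
```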

\[Z_{50} = 50 \cdot \left(Z\sqrt{\frac{0.21}{50}}+0.3\right)\]

\[\overline{W}_{5}\quad\text{vs}\quad Z_5\]

`pbinom(2, 5, .3)` \(\approx 0.837\)

`pnorm(2, 1.5, sqrt(.21 * 5))` \(\approx 0.687\)

\[\overline{W}_{50}\quad\text{vs}\quad Z_{50}\]

`pbinom(20, 50, .3)` \(\approx 0.952\)

`pnorm(20, 15, sqrt(.21 * 50))` \(\approx 0.939\)

\[\overline{W}_{500}\quad\text{vs}\quad Z_{500}\]

`pbinom(200, 500, .3)` \(\approx 0.9999992\)

`pnorm(200, 150, sqrt(.21 * 500))` \(\approx 0.9999995\)

\[\overline{W}_{5}\quad\text{vs}\quad Z_5\]

- With a smaller number of trials \(n\), the “gaps” between the discrete binomial probabilities are larger and the approximation is less precise.

- With a larger number of trials \(n\), the approximation becomes more precise.
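The shrinking error can be tabulated directly. This Python sketch reproduces the three comparisons above and prints the absolute gap between the exact and approximate cdf values:

```python
from math import comb, erf, sqrt

theta = 0.3

def exact_cdf(y, n):
    # Binomial cdf, as computed by R's pbinom(y, n, theta)
    return sum(comb(n, k) * theta**k * (1 - theta)**(n - k) for k in range(y + 1))

def clt_approx(y, n):
    # Normal approximation with mean n*theta and sd sqrt(n*theta*(1-theta)),
    # as computed by R's pnorm(y, n*theta, sqrt(n*theta*(1-theta)))
    mu, sd = n * theta, sqrt(n * theta * (1 - theta))
    return 0.5 * (1 + erf((y - mu) / (sd * sqrt(2))))

# The three slide comparisons: n = 5, 50, 500 with y = 0.4 * n
for y, n in [(2, 5), (20, 50), (200, 500)]:
    print(n, round(abs(exact_cdf(y, n) - clt_approx(y, n)), 4))
```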

Given an independent and identically distributed sample from a population with finite mean \(\mu\) and positive, finite variance \(\sigma^2\),

- the sample mean approximately follows \(N\left(\mu, \sigma^2/n\right)\) for large \(n\); more precisely, the standardized sample mean converges in distribution to \(N(0,1)\).

We often use the central limit theorem to approximate distributions of finite samples when the sample size is sufficiently large.

- Weekly Activity 5 Questions
- Selected questions from past exams
- Questions

- If your group’s player is selected, please explain your group’s strategy.
- Make your guess on Quercus

© 2023. Michael J. Moon. University of Toronto.

Sharing, posting, selling, or using this material outside of your personal use in this course is **NOT** permitted under any circumstances.