STA237: Probability, Statistics, and Data Analysis I
PhD Student, DoSS, University of Toronto
Monday, June 19, 2023
Suppose \(X_1\), \(X_2\), …, \(X_n\) are independent random variables with expectation \(\mu\) and variance \(\sigma^2\). Then for any \(\varepsilon > 0\),
\[\lim_{n\to\infty}P\left(\left|\overline{X}_n-\mu\right|>\varepsilon\right)=0,\]
where \(\overline{X}_n=\left.\sum_{i=1}^n X_i\right/n\).
That is, \(\overline{X}_n\) converges in probability to \(\mu\).
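As a quick illustration (a simulation sketch, not from the lecture; the Exp(1) population and the seed are arbitrary choices with \(\mu = 1\)), the sample mean settles near \(\mu\) as \(n\) grows:

```r
# LLN sketch: running sample means of iid Exp(1) draws, so mu = 1
set.seed(237)                  # arbitrary seed for reproducibility
n <- 10000
x <- rexp(n, rate = 1)         # iid draws with E(X_i) = 1, Var(X_i) = 1
xbar <- cumsum(x) / seq_len(n) # sample means for n = 1, ..., 10000
abs(xbar[n] - 1)               # close to 0 for large n
```

Plotting `xbar` against `seq_len(n)` shows the running mean stabilizing around 1.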
Informally, a sequence of random variables \(Y_1\), \(Y_2\), … converges in distribution to \(W\) when the distribution of \(Y_n\) becomes closer and closer to that of \(W\).
Let \(Y_1\), \(Y_2\), \(Y_3\), … be an infinite sequence of random variables, and let \(W\) be another random variable. Then, we say the sequence \(\left\{Y_n\right\}\) converges in distribution to \(W\) if for all \(w\in\mathbb{R}\) such that \(P\left(W = w\right)=0\), we have
\[\lim_{n\to\infty}P\left(Y_n\le w\right)=P\left(W\le w\right)\]
and we write
\[Y_n\overset{d}{\to}W.\]
Suppose \(X_n\sim\text{Binom}\left(n,\theta_n\right)\) counts the number of successes across \(n\) independent sub-intervals of equal length, where \(\theta_n=\frac{\lambda}{n}\) for some \(\lambda>0\) that represents a rate of success.
What happens when you make the sub-intervals infinitesimally small?
We have seen that \(\lim_{n\to\infty}p_{X_n}(x)=p_X(x)\), where \(X\sim\text{Pois}(\lambda)\). Recall how we derived the pmf of a Poisson random variable in Lecture 3.
\[X_n\overset{d}{\to}X, \quad X\sim\text{Pois}\left(\lambda\right)\]
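A one-line check in R (a sketch; the choice \(\lambda = 3\), the grid of \(x\) values, and \(n = 1000\) are illustrative) confirms the two pmfs nearly agree for large \(n\):

```r
# Binom(n, lambda/n) pmf vs Pois(lambda) pmf for large n
lambda <- 3
x <- 0:10
max(abs(dbinom(x, 1000, lambda / 1000) - dpois(x, lambda)))  # near 0
```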
Let \(X_1\), \(X_2\), \(X_3\), … be independent and identically distributed random variables with \(E\left(X_1\right)=\mu<\infty\) and \(0< \text{Var}\left(X_1\right)=\sigma^2<\infty\). For \(n\ge1\), let
\[Z_n=\frac{\sqrt{n}\left(\overline{X}_n-\mu\right)}{\sigma},\]
where \(\overline{X}_n=\left.\sum_{i=1}^nX_i\right/n\). Then, for any number \(a\in\mathbb{R}\),
\[\lim_{n\to\infty}P\left(Z_n\le a\right)=\Phi\left(a\right),\]
where \(\Phi\) is the cumulative distribution function of the standard normal distribution.
In practice, \(\overline{X}_n\) approximately follows the distribution of \(\left(Z\frac{\sigma}{\sqrt{n}}+\mu\right)\) or \(N\left(\mu, \frac{\sigma^2}{n}\right)\) for large \(n\).
In other words,
\[\frac{\sqrt{n}\left(\overline{X_n}-\mu\right)}{\sigma}\overset{d}{\to}Z,\]
where \(Z\sim N\left(0,1\right)\).
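For instance (a simulation sketch; the Exp(1) population is an arbitrary choice with \(\mu=\sigma=1\), and the seed and replication counts are illustrative), the empirical distribution of \(Z_n\) matches \(\Phi\):

```r
# CLT sketch: standardized sample means of iid Exp(1) samples (mu = sigma = 1)
set.seed(237)
n <- 1000
zn <- replicate(5000, sqrt(n) * (mean(rexp(n, rate = 1)) - 1) / 1)
c(mean(zn <= 1), pnorm(1))  # empirical P(Z_n <= 1) vs Phi(1)
```

A histogram of `zn` is close to the standard normal bell curve even though each underlying draw is skewed.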
Recall the survey on Canadian student smoking prevalence.
As you increase the sample size \(n\), the sampling distribution of \(T_n\) not only becomes narrower by the LLN but also closer to a symmetric, bell-shaped distribution by the CLT.
Suppose \(Y\sim \text{Binom}\left(50, 0.3\right)\) and we are interested in \(P(Y\le 20)\).

Using the exact cdf \(F_Y(y)\), we can compute

pbinom(20, 50, 0.3)

in R.

Approximating via \(Z\sim N(0,1)\): by the CLT, \(Y\) approximately follows
\[Z_{50} = 50 \cdot \left(Z\sqrt{\frac{0.21}{50}}+0.3\right),\]
where \(0.21 = 0.3\times 0.7\) is the Bernoulli variance, so \(P\left(Y\le 20\right)=P\left(Y/50\le 0.4\right)\) and we can compute

pnorm(.4, .3, sqrt(.21 / 50))

in R.
\[\overline{W}_{5}\quad\text{vs}\quad Z_5\]
pbinom(2, 5, .3) \(\approx 0.837\) vs pnorm(2, 1.5, sqrt(.21 * 5)) \(\approx 0.687\)

\[\overline{W}_{50}\quad\text{vs}\quad Z_{50}\]
pbinom(20, 50, .3) \(\approx 0.952\) vs pnorm(20, 15, sqrt(.21 * 50)) \(\approx 0.939\)

\[\overline{W}_{500}\quad\text{vs}\quad Z_{500}\]
pbinom(200, 500, .3) \(\approx 0.9999992\) vs pnorm(200, 150, sqrt(.21 * 500)) \(\approx 0.9999995\)
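As a compact check (a sketch consolidating the calls above into one vectorized comparison), the gap between the exact and approximate probabilities shrinks as \(n\) grows:

```r
# Exact binomial cdf vs CLT-based normal approximation at Y <= 0.4 * n
n <- c(5, 50, 500)
exact  <- pbinom(0.4 * n, n, 0.3)                 # exact cdf of Binom(n, 0.3)
approx <- pnorm(0.4 * n, 0.3 * n, sqrt(0.21 * n)) # normal approximation
data.frame(n, exact, approx, gap = exact - approx)
```

The absolute gap drops from roughly 0.15 at \(n = 5\) to below \(10^{-6}\) at \(n = 500\).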
Given an independent and identically distributed sample from a population with finite mean \(\mu\) and positive, finite variance \(\sigma^2\), we often use the central limit theorem to approximate distributions of statistics from finite samples when the sample size is sufficiently large.
© 2023. Michael J. Moon. University of Toronto.
Sharing, posting, selling, or using this material outside of your personal use in this course is NOT permitted under any circumstances.