STA237: Probability, Statistics, and Data Analysis I
Michael Jongho Moon
PhD Student, DoSS, University of Toronto
June 6, 2022
Michael realizes his coffee shop won’t survive selling coffee alone and decides to sell muffins as well.
After a month of selling both coffee and muffins, Michael estimates the probability distribution of the daily coffee and muffin sales as shown on the right.
For example, the probability of selling 5 cups of coffee and 5 muffins on a day would be indicated by …
How about the probability of selling exactly 5 cups of coffee on a day?
How about the probability of selling fewer than 5 cups of coffee and fewer than 4 muffins on a day?
This is an example of a joint distribution of two discrete random variables.
The two random variables arise from the same sample space, and the joint distribution describes the probabilities of all possible pairs of their values.
The joint probability mass function \(p\) of two discrete random variables \(X\) and \(Y\) is the function \(p:\mathbb{R}^2\to\left[0,1\right]\), defined by
\[p\left(a,b\right) = P\left(X=a, Y=b\right)\quad\text{for }-\infty<a,b<\infty.\]
(From Dekking et al. Section 9.1)
Let \(S\) be the sum of two fair dice rolls and \(M\) be the maximum of the two.
Compute the following probabilities.
\(P(S=7,M=5)\)
\(=2/36=1/18\)
s \ m | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
2 | 1/36 | 0 | 0 | 0 | 0 | 0 |
3 | 0 | 2/36 | 0 | 0 | 0 | 0 |
4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 |
5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 |
6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 |
7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 |
8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 |
9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 |
11 | 0 | 0 | 0 | 0 | 0 | 2/36 |
12 | 0 | 0 | 0 | 0 | 0 | 1/36 |
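The table above can be reproduced by listing all 36 equally likely outcomes of the two rolls. The following is a minimal sketch; the function name `joint_pmf_sum_max` is made up for this illustration, and exact fractions are used so the values match the table.

```python
from fractions import Fraction
from itertools import product

def joint_pmf_sum_max():
    """Return {(s, m): probability} for the sum S and maximum M of two fair dice."""
    pmf = {}
    # Each ordered pair (d1, d2) has probability 1/36 for fair dice.
    for d1, d2 in product(range(1, 7), repeat=2):
        key = (d1 + d2, max(d1, d2))
        pmf[key] = pmf.get(key, Fraction(0)) + Fraction(1, 36)
    return pmf

pmf = joint_pmf_sum_max()
print(pmf[(7, 5)])        # P(S=7, M=5), prints 1/18
print(sum(pmf.values()))  # total probability, prints 1
```

Pairs that do not appear as keys, such as \((s,m)=(3,1)\), have probability 0.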
\(P(S=7)\)
\(=\left(2+2+2\right)/36=1/6\)
s \ m | 1 | 2 | 3 | 4 | 5 | 6 | P(S=s) |
---|---|---|---|---|---|---|---|
2 | 1/36 | 0 | 0 | 0 | 0 | 0 | 1/36 |
3 | 0 | 2/36 | 0 | 0 | 0 | 0 | 2/36 |
4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 | 3/36 |
5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 | 4/36 |
6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 | 5/36 |
7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 6/36 |
8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 | 5/36 |
9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 4/36 |
10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 | 3/36 |
11 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
12 | 0 | 0 | 0 | 0 | 0 | 1/36 | 1/36 |
\(P(M=5)\)
\(=\left(2+2+2+2+1\right)/36=1/4\)
s \ m | 1 | 2 | 3 | 4 | 5 | 6 | P(S=s) |
---|---|---|---|---|---|---|---|
2 | 1/36 | 0 | 0 | 0 | 0 | 0 | 1/36 |
3 | 0 | 2/36 | 0 | 0 | 0 | 0 | 2/36 |
4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 | 3/36 |
5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 | 4/36 |
6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 | 5/36 |
7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 6/36 |
8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 | 5/36 |
9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 4/36 |
10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 | 3/36 |
11 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
12 | 0 | 0 | 0 | 0 | 0 | 1/36 | 1/36 |
P(M=m) | 1/36 | 3/36 | 5/36 | 7/36 | 9/36 | 11/36 | 1 |
Let \(X\) and \(Y\) be two discrete random variables, with joint probability mass function \(p_{X,Y}\). Then, the marginal probability mass function \(p_X\) of \(X\) can be computed as
\[p_X(x)=\sum_{y}p_{X,Y}\left(x,y\right),\quad\text{and}\]
the marginal probability mass function \(p_Y\) of \(Y\) can be computed as
\[p_Y(y)=\sum_{x}p_{X,Y}\left(x,y\right).\]
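As a sketch of the two sums above, the marginals of the dice example can be computed by accumulating the joint probabilities over the other variable. The helper name `marginals` is an assumption made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of (S, M): sum and maximum of two fair dice, built by enumeration.
joint = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(d1 + d2, max(d1, d2))] += Fraction(1, 36)

def marginals(joint_pmf):
    """Sum the joint pmf over y to get p_X and over x to get p_Y."""
    p_x = defaultdict(Fraction)
    p_y = defaultdict(Fraction)
    for (x, y), p in joint_pmf.items():
        p_x[x] += p
        p_y[y] += p
    return dict(p_x), dict(p_y)

p_S, p_M = marginals(joint)
print(p_S[7])  # prints 1/6
print(p_M[5])  # prints 1/4
```

Each marginal sums to 1, which is a quick check that no probability was lost.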
The joint cumulative distribution function \(F\) of two random variables \(X\) and \(Y\) is the function \(F:\mathbb{R}^2\to[0,1]\) defined by
\[F\left(a,b\right)=P\left(X\le a, Y\le b\right)\quad\text{for }-\infty<a,b<\infty.\]
(From Dekking et al. Section 9.1)
Let \(S\) be the sum of two fair dice rolls and \(M\) be the maximum of the two.
\(F_{S,M}\left(6, 2\right)=?\)
s \ m | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
2 | 1/36 | 0 | 0 | 0 | 0 | 0 |
3 | 0 | 2/36 | 0 | 0 | 0 | 0 |
4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 |
5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 |
6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 |
7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 |
8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 |
9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 |
11 | 0 | 0 | 0 | 0 | 0 | 2/36 |
12 | 0 | 0 | 0 | 0 | 0 | 1/36 |
\(F_{S,M}\left(6, 2\right)=\left(1+2+1\right)/36=1/9\)
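For discrete random variables, the joint cumulative distribution function adds up every joint probability with \(s\le a\) and \(m\le b\). A minimal sketch, with `joint_cdf` as a name assumed for this illustration:

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (S, M): sum and maximum of two fair dice.
joint = {}
for d1, d2 in product(range(1, 7), repeat=2):
    key = (d1 + d2, max(d1, d2))
    joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 36)

def joint_cdf(joint_pmf, a, b):
    """F(a, b) = P(X <= a, Y <= b) for a discrete joint pmf."""
    return sum(
        (p for (x, y), p in joint_pmf.items() if x <= a and y <= b),
        Fraction(0),
    )

print(joint_cdf(joint, 6, 2))   # prints 1/9
print(joint_cdf(joint, 12, 6))  # prints 1
```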
Random variables \(X\) and \(Y\) have a joint continuous distribution if for some function \(f:\mathbb{R}^2\to\mathbb{R}\) and for all numbers \(a_1\), \(a_2\), \(b_1\), and \(b_2\) with \(a_1\le b_1\) and \(a_2\le b_2\),
\[P\left(a_1 \le X\le b_1, a_2\le Y\le b_2\right)=\int_{a_2}^{b_2}\int_{a_1}^{b_1} f\left(x,y\right) dx dy.\]
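The function \(f\) has to satisfy \(f(x,y)\ge 0\) for all \(x\) and \(y\), and \(\int_{-\infty}^\infty\int_{-\infty}^\infty f(x,y)\,dx\,dy=1\).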
We call \(f\) the joint probability density function of \(X\) and \(Y\).
Suppose \(X\) and \(Y\) have a joint continuous distribution with joint density
\[f_{X,Y}\left(x,y\right)=\begin{cases}120\cdot x^3\cdot y & x\ge 0, y\ge 0, x+y\le1 \\ 0 &\text{otherwise.}\end{cases}\]
Compute the following.
\[F_{X,Y}(0.5, 0.5)\]
\[P(X\le 0.5)\]
For the second quantity, integrating over the full rectangle is incorrect:
\[P(X\le 0.5)\neq\int_0^{1}\int_{0}^{1/2} 120 \cdot x^3 \cdot y \,dx\,dy.\]
When \(x\) is between 0 and 1/2, \(f(x,y)\) equals \(120\cdot x^3 \cdot y\) only for \(0\le y\le 1-x\), not for all values of \(y\) from 0 to 1.
Instead, let the upper limit of \(y\) depend on \(x\). We can switch the order of integration when working with probability density functions, so we integrate over \(y\) first:
\[P(X\le 0.5)=\int_{0}^{1/2}\int_0^{1-x} 120 \cdot x^3 \cdot y dydx\]
\[=\int_0^{1/2} 120 \cdot x^3 \cdot \frac{(1-x)^2}{2} dx\] \[=\int_0^{1/2} 60 \cdot \left(x^5-2x^4+x^3\right) dx\]
\[=60 \cdot\left(\frac{(1/2)^6}{6} - \frac{2(1/2)^5}{5}+\frac{(1/2)^4}{4}\right)\] \[=60\cdot \left(\frac{10/64}{60}-\frac{12/16}{60}+\frac{15/16}{60}\right)\]
\[=\frac{5-24+30}{32}=\frac{11}{32}\]
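The result can be checked numerically. The sketch below uses a plain midpoint rule, chosen here only for illustration and not part of the lecture; the names `density` and `prob_rect` and the grid size `n` are assumptions. The inner integral runs from 0 to \(\min(b, 1-x)\), so the triangular support is respected.

```python
def density(x, y):
    """Joint density of the example: 120 x^3 y on x >= 0, y >= 0, x + y <= 1."""
    if x >= 0 and y >= 0 and x + y <= 1:
        return 120 * x**3 * y
    return 0.0

def prob_rect(a, b, n=200):
    """Midpoint-rule approximation of P(X <= a, Y <= b) for 0 <= a, b <= 1."""
    total = 0.0
    hx = a / n
    for i in range(n):
        x = (i + 0.5) * hx
        y_max = min(b, 1 - x)  # the upper limit of y depends on x
        hy = y_max / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += density(x, y) * hx * hy
    return total

print(prob_rect(0.5, 1.0))  # approximates P(X <= 0.5) = 11/32 = 0.34375
print(prob_rect(0.5, 0.5))  # approximates F(0.5, 0.5)
print(prob_rect(1.0, 1.0))  # approximates the total probability, 1
```

For \(F_{X,Y}(0.5,0.5)\) the whole square lies inside the support, so the integral factors: \(\int_0^{1/2}120x^3\,dx\cdot\int_0^{1/2}y\,dy=\frac{15}{8}\cdot\frac{1}{8}=\frac{15}{64}\).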
Let \(F\) be the joint distribution function of random variables \(X\) and \(Y\). Then the marginal cumulative distribution function of \(X\) is given by
\[F_X\left(a\right)=P\left(X\le a\right)=F\left(a,\infty\right)=\lim_{b\to\infty}F\left(a,b\right)\]
and the marginal cumulative distribution function of \(Y\) is given by
\[F_Y\left(b\right)=P\left(Y\le b\right)=F\left(\infty,b\right)=\lim_{a\to\infty}F\left(a,b\right).\]
Let \(X\) and \(Y\) have a joint continuous distribution, with joint density function \(f_{X,Y}\). Then the marginal density \(f_X\) of \(X\) satisfies
\[f_X\left(x\right) = \int_{-\infty}^\infty f_{X,Y}\left(x,y\right) dy\]
for all \(x\in\mathbb{R}\) and the marginal density \(f_Y\) of \(Y\) satisfies
\[f_Y\left(y\right)=\int_{-\infty}^\infty f_{X,Y}\left(x,y\right)dx\]
for all \(y\in\mathbb{R}\).
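As a worked illustration, consider the joint density \(f_{X,Y}(x,y)=120\cdot x^3\cdot y\) on \(x\ge0\), \(y\ge0\), \(x+y\le1\) from the earlier example. For a fixed \(x\in[0,1]\) the density is nonzero only for \(0\le y\le 1-x\), and for a fixed \(y\in[0,1]\) only for \(0\le x\le 1-y\), so
\[f_X\left(x\right)=\int_0^{1-x}120\cdot x^3\cdot y\,dy=60\cdot x^3\left(1-x\right)^2\quad\text{for }0\le x\le 1,\]
\[f_Y\left(y\right)=\int_0^{1-y}120\cdot x^3\cdot y\,dx=30\cdot y\left(1-y\right)^4\quad\text{for }0\le y\le 1,\]
and both marginal densities are 0 otherwise. Integrating \(f_X\) from 0 to 1/2 gives \(P(X\le 0.5)=11/32\), in agreement with the double integral.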
Recall for events \(A\) and \(B\) …
… if \(P(A)\cdot P(B)=P(A\cap B)\) then they are independent.
For random variables \(X\) and \(Y\) …
\[P\left(\left\{X\in I_A\right\}\right)\cdot P\left(\left\{Y\in I_B\right\}\right)=P\left(\left\{X\in I_A\right\} \cap \left\{Y\in I_B\right\}\right)\]
The random variables \(X\) and \(Y\), with joint distribution function \(F\), are independent if
\[P\left(X\le x, Y\le y\right)=P\left(X\le x\right)\cdot P\left(Y\le y\right),\]
that is,
\[F\left(x,y\right)=F_X\left(x\right)\cdot F_Y\left(y\right)\]
for all possible values \(x\) and \(y\). Random variables that are not independent are called dependent.
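Discrete random variables \(X\) and \(Y\) are independent when \(P(X=x,Y=y)=P(X=x)\cdot P(Y=y)\).
That is, \(p_{X,Y}(x,y)=p_X(x)p_Y(y)\)
for all possible values of \(x\) and \(y\).
Continuous random variables \(X\) and \(Y\) are independent when their joint density is the product of the marginal densities.
That is, \(f_{X,Y}(x,y)=f_X(x)f_Y(y)\)
for all possible values of \(x\) and \(y\).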
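The discrete condition can be checked directly by comparing every joint probability with the product of the marginals. The sketch below, with `is_independent` as a name assumed for this illustration, applies it to the sum \(S\) and maximum \(M\) of two fair dice and, for contrast, to the two individual rolls.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint_pmf):
    """Check p(x, y) == p_X(x) * p_Y(y) for every pair of possible values."""
    xs = sorted({x for x, _ in joint_pmf})
    ys = sorted({y for _, y in joint_pmf})
    p_x = {x: sum(joint_pmf.get((x, y), Fraction(0)) for y in ys) for x in xs}
    p_y = {y: sum(joint_pmf.get((x, y), Fraction(0)) for x in xs) for y in ys}
    return all(
        joint_pmf.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
        for x in xs
        for y in ys
    )

sum_max = {}   # joint pmf of (S, M)
two_dice = {}  # joint pmf of the two individual rolls
for d1, d2 in product(range(1, 7), repeat=2):
    key = (d1 + d2, max(d1, d2))
    sum_max[key] = sum_max.get(key, Fraction(0)) + Fraction(1, 36)
    two_dice[(d1, d2)] = Fraction(1, 36)

print(is_independent(sum_max))   # prints False
print(is_independent(two_dice))  # prints True
```

A single failing pair is enough to show dependence: \(P(S=7,M=5)=1/18\), while \(P(S=7)\cdot P(M=5)=\frac{1}{6}\cdot\frac{1}{4}=\frac{1}{24}\).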
Random variables \(X_1\), \(X_2\), \(X_3\), …, \(X_n\) are pairwise independent if \(X_j\) and \(X_k\) are independent for all \(j\neq k\), \(1\le j,k \le n\).
Random variables \(X_1\), \(X_2\), \(X_3\), …, \(X_n\) with joint distribution function \(F\) are independent if \(F\left(x_1,x_2,x_3,\ldots,x_n\right)=\prod_{i=1}^n F_{X_i}\left(x_i\right)\) for all values \(x_1\), \(x_2\), \(x_3\), …, \(x_n\).
You can also write the definition with \(p_{X_i}\) for discrete random variables with joint probability mass function \(p\) or with \(f_{X_i}\) for continuous random variables with joint density function \(f\).
For further details, you can check Section 2.8 from Evans & Rosenthal
Let \(X_1\), \(X_2\), \(X_3\), …, \(X_n\) be independent random variables. For each \(i\in\left\{1,2,\ldots,n\right\}\), let \(h_i:\mathbb{R}\to\mathbb{R}\) be a function and define the random variable
\[Y_i=h_i\left(X_i\right).\]
Then \(Y_1\), \(Y_2\), \(Y_3\), …, \(Y_n\) are also independent.
Let \(X_1\), \(X_2\), \(X_3\), …, \(X_n\) be independent and identically distributed \(U(0,1)\) random variables. Let \(X_{(n)}\) be the maximum value among them.
What is the cumulative distribution function of \(X_{(n)}\)? How about its probability density function?
We want \(F_{X_{(n)}}(x) = P(X_{(n)}\le x)\).
\[\left\{X_{(n)}\le x \right\}=\left\{X_1 \le x\right\}\cap\left\{X_2 \le x\right\}\cap\cdots\cap\left\{X_n \le x\right\}\]
\[P\left(X_{(n)}\le x\right)=P\left(X_1\le x, X_2\le x, \cdots, X_n\le x\right)\]
\[P\left(X_1\le x, X_2\le x, \cdots, X_n\le x\right)\]
\[=P\left(X_1\le x\right)\cdot P\left(X_2\le x\right)\cdots P\left(X_n\le x\right)\]
\[P\left(X_1\le x\right)\cdot P\left(X_2\le x\right)\cdots P\left(X_n\le x\right)\]
\[=P\left(X_1\le x\right)\cdot P\left(X_1\le x\right)\cdots P\left(X_1\le x\right)\]
\(\implies F_{X_{(n)}}(x)=\prod_{i=1}^n F_{X_1}(x)=\left[F_{X_1}\left(x\right)\right]^n=\begin{cases}0 & x<0 \\x^n & x\in[0,1]\\1 &x > 1\end{cases}\)
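Differentiating the cumulative distribution function on \((0,1)\) answers the second question: \(f_{X_{(n)}}(x)=n\cdot x^{n-1}\) for \(0\le x\le 1\) and 0 otherwise. The simulation below is a rough sanity check of \(F_{X_{(n)}}(x)=x^n\); the function name, the choice \(n=5\), \(x=0.8\), the number of trials, and the seed are all assumptions made for this illustration.

```python
import random

def empirical_cdf_of_max(n, x, trials=100_000, seed=237):
    """Estimate P(max(X_1, ..., X_n) <= x) for i.i.d. U(0, 1) draws by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if max(rng.random() for _ in range(n)) <= x:
            hits += 1
    return hits / trials

n, x = 5, 0.8
estimate = empirical_cdf_of_max(n, x)
print(estimate)  # simulated value of F(x)
print(x**n)      # theoretical value x^n
```

The two printed numbers should agree to about two decimal places; the gap shrinks as `trials` grows.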
© 2022. Michael J. Moon. University of Toronto.
Sharing, posting, selling, or using this material outside of your personal use in this course is NOT permitted under any circumstances.