STA237: Probability, Statistics, and Data Analysis I

Michael Jongho Moon

PhD Student, DoSS, University of Toronto

Monday, June 5, 2023

- Recall \(E[R] < 0\) which isn’t the best business
- Michael starts selling muffins as well
- Let \(M\) be the number of muffins sold per day

\[P\left(\left\{D=5\right\} \cap \left\{M=5\right\}\right)\]

Each point represents a probability associated with a pair of values.

How about \(P\left(\left\{D\le 5\right\} \cap \left\{M\le 5\right\}\right)\)?

\[P\left(\left\{D\le 5\right\} \cap \left\{M\le 5\right\}\right)\]

You will add the probabilities represented by all points in the range.

What if you were only interested in \(D\le 5\)?

\[P\left(D\le 5\right)\]

You will include all possible values of \(M\) while restricting \(D\le 5\).

This is an example of a **joint distribution** of two discrete random variables.

The two random variables arise from the *same sample space*, and the joint distribution describes the likelihoods of all possible pairs of their values.

We can drop the set notation with random variables.

\[P\left(\left\{D\le 5\right\} \cap \left\{M\le 5\right\}\right)=P\left(D\le 5, M\le5\right)\]

To emphasize the random variables, we can write \(p_{X,Y}(a,b)\).

Note that \(X\) and \(Y\) are defined on the **same sample space**, \(\Omega\).

The **joint probability mass function** \(p\) of two discrete random variables \(X\) and \(Y\) is the function \(p:\mathbb{R}^2\to\left[0,1\right]\), defined by

\[p\left(a,b\right) = P\left(X=a, Y=b\right)\quad\text{for }-\infty<a,b<\infty.\]

(Dekking et al. Section 9.1)

Let \(S\) be the sum of two fair dice rolls and \(M\) be the maximum of the two.

Compute the following probabilities.

\[P(S=7,M=5)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 2 | 1/36 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 2/36 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 |
| 5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 |
| 6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 |
| 7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 |
| 8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 |
| 9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
| 10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 |
| 11 | 0 | 0 | 0 | 0 | 0 | 2/36 |
| 12 | 0 | 0 | 0 | 0 | 0 | 1/36 |


\[P(S=7,M=5)=\frac{2}{36}=\frac{1}{18}\]

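The table entries can be reproduced by enumerating the 36 equally likely outcomes of the two dice. The following is a minimal Python sketch, not part of the course material; the names `joint_pmf_sum_max` and `p_SM` are illustrative, and exact fractions are used so the values match the table.

```python
from collections import defaultdict
from fractions import Fraction


def joint_pmf_sum_max():
    """Joint pmf of S (sum) and M (maximum) for two fair six-sided dice."""
    pmf = defaultdict(Fraction)  # pairs that never occur keep probability 0
    for first in range(1, 7):
        for second in range(1, 7):
            # Each of the 36 ordered outcomes has probability 1/36.
            pmf[(first + second, max(first, second))] += Fraction(1, 36)
    return dict(pmf)


p_SM = joint_pmf_sum_max()
print(p_SM[(7, 5)])  # 1/18, i.e. 2/36
```

Pairs that do not appear as keys, such as \((7, 3)\), correspond to the zero entries of the table.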


\[P(S=7)=\frac{2}{36}+\frac{2}{36}+\frac{2}{36}=\frac{1}{6}\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 | \(p_S(s)\) |
|---|---|---|---|---|---|---|---|
| 2 | 1/36 | 0 | 0 | 0 | 0 | 0 | 1/36 |
| 3 | 0 | 2/36 | 0 | 0 | 0 | 0 | 2/36 |
| 4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 | 3/36 |
| 5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 | 4/36 |
| 6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 | 5/36 |
| 7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 6/36 |
| 8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 | 5/36 |
| 9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 4/36 |
| 10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 | 3/36 |
| 11 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
| 12 | 0 | 0 | 0 | 0 | 0 | 1/36 | 1/36 |


\[P(M=m)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 | \(p_S(s)\) |
|---|---|---|---|---|---|---|---|
| 2 | 1/36 | 0 | 0 | 0 | 0 | 0 | 1/36 |
| 3 | 0 | 2/36 | 0 | 0 | 0 | 0 | 2/36 |
| 4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 | 3/36 |
| 5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 | 4/36 |
| 6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 | 5/36 |
| 7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 6/36 |
| 8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 | 5/36 |
| 9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 4/36 |
| 10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 | 3/36 |
| 11 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
| 12 | 0 | 0 | 0 | 0 | 0 | 1/36 | 1/36 |
| \(p_M(m)\) | 1/36 | 3/36 | 5/36 | 7/36 | 9/36 | 11/36 | |

The relationship shows how we extract distributions of a subset of random variables that belong to a larger set.

A **marginal distribution** is a distribution of a subset of random variables that belong to a larger set.

Let \(X\) and \(Y\) be two discrete random variables, with joint probability mass function \(p_{X,Y}\). Then, the **marginal** probability mass function \(p_X\) of \(X\) can be computed as

\[p_X(x)=\sum_{y}p_{X,Y}\left(x,y\right),\quad\text{and}\]

the **marginal** probability mass function \(p_Y\) of \(Y\) can be computed as

\[p_Y(y)=\sum_{x}p_{X,Y}\left(x,y\right).\]
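The two sums translate directly into code. The sketch below, written in Python as an illustration rather than as course material, rebuilds the joint pmf of the dice example (\(S\) the sum, \(M\) the maximum) and sums it over the other variable; the helper name `marginals` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Joint pmf of S (sum) and M (maximum) of two fair dice, built by enumeration.
p_SM = defaultdict(Fraction)
for first in range(1, 7):
    for second in range(1, 7):
        p_SM[(first + second, max(first, second))] += Fraction(1, 36)


def marginals(joint):
    """Return (p_X, p_Y) by summing the joint pmf over the other variable."""
    p_x, p_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), prob in joint.items():
        p_x[x] += prob  # sum over y for fixed x
        p_y[y] += prob  # sum over x for fixed y
    return dict(p_x), dict(p_y)


p_S, p_M = marginals(p_SM)
print(p_S[7], p_M[4])  # 1/6 7/36
```

The printed values agree with the \(p_S(s)\) column and the \(p_M(m)\) row of the table (6/36 and 7/36).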

Consider random variables \(X\) and \(Y\) with the joint probability mass function shown below for some \(\varepsilon > 0\).

| \(a \backslash b\) | 0 | 1 | \(p_X(a)\) |
|---|---|---|---|
| 0 | \(1/4-\varepsilon\) | \(1/4+\varepsilon\) | ... |
| 1 | \(1/4+\varepsilon\) | \(1/4-\varepsilon\) | ... |
| \(p_Y(b)\) | ... | ... | |

- The marginal probability masses are \(1/2\) for all possible values.

- Can we recover the value of \(\varepsilon\) from the marginal distributions?

No. Combining the marginal distributions does NOT provide the full information about the joint distribution.

| \(a \backslash b\) | 0 | 1 | \(p_X(a)\) |
|---|---|---|---|
| 0 | \(1/4-\varepsilon\) | \(1/4+\varepsilon\) | \(1/2\) |
| 1 | \(1/4+\varepsilon\) | \(1/4-\varepsilon\) | \(1/2\) |
| \(p_Y(b)\) | \(1/2\) | \(1/2\) | |
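To see concretely that the marginals do not pin down \(\varepsilon\), the following Python sketch compares two joint pmfs from this family. The values \(\varepsilon = 1/10\) and \(\varepsilon = 1/5\) and all function names are illustrative assumptions; any \(0 \le \varepsilon \le 1/4\) keeps the entries valid probabilities.

```python
from fractions import Fraction


def joint_pmf(eps):
    """Joint pmf from the table; eps is assumed to satisfy 0 <= eps <= 1/4."""
    quarter = Fraction(1, 4)
    return {(0, 0): quarter - eps, (0, 1): quarter + eps,
            (1, 0): quarter + eps, (1, 1): quarter - eps}


def marginal_x(joint):
    """p_X(a): sum the joint pmf over b."""
    return {a: sum(p for (x, _), p in joint.items() if x == a) for a in (0, 1)}


def marginal_y(joint):
    """p_Y(b): sum the joint pmf over a."""
    return {b: sum(p for (_, y), p in joint.items() if y == b) for b in (0, 1)}


joint_a = joint_pmf(Fraction(1, 10))
joint_b = joint_pmf(Fraction(1, 5))

# Different joint distributions, identical marginal distributions.
same_marginals = (marginal_x(joint_a) == marginal_x(joint_b)
                  and marginal_y(joint_a) == marginal_y(joint_b))
print(joint_a != joint_b, same_marginals)  # True True
```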

The **joint cumulative distribution function** \(F\) of two random variables \(X\) and \(Y\) is the function \(F:\mathbb{R}^2\to[0,1]\) defined by

\[F\left(a,b\right)=P\left(X\le a, Y \le b\right)\quad\text{for }-\infty<a,b<\infty.\]

(Dekking et al. Section 9.1)

Let \(S\) be the sum of two fair dice rolls and \(M\) be the maximum of the two.


\[p_{S,M}(s,m)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 2 | 1/36 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 2/36 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 |
| 5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 |
| 6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 |
| 7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 |
| 8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 |
| 9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
| 10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 |
| 11 | 0 | 0 | 0 | 0 | 0 | 2/36 |
| 12 | 0 | 0 | 0 | 0 | 0 | 1/36 |

\[F_{S,M}(s,m)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 2 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| 3 | 1/36 | 3/36 | 3/36 | 3/36 | 3/36 | 3/36 |
| 4 | 1/36 | ... | ... | ... | ... | ... |
| 5 | 1/36 | ... | ... | ... | ... | ... |
| 6 | 1/36 | ... | ... | ... | ... | ... |
| 7 | 1/36 | ... | ... | ... | ... | ... |
| 8 | 1/36 | ... | ... | ... | ... | ... |
| 9 | 1/36 | ... | ... | ... | ... | ... |
| 10 | 1/36 | ... | ... | ... | ... | ... |
| 11 | 1/36 | ... | ... | ... | ... | ... |
| 12 | 1/36 | ... | ... | ... | ... | ... |

\[F_{S,M}(6,4)=\sum_{s=2}^6\sum_{m=1}^4p_{S,M}(s,m)=\frac{13}{36}\]


| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 2 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| 3 | 1/36 | 3/36 | 3/36 | 3/36 | 3/36 | 3/36 |
| 4 | 1/36 | ... | ... | ... | ... | ... |
| 5 | 1/36 | ... | ... | ... | ... | ... |
| 6 | 1/36 | ... | ... | 13/36 | ... | ... |
| 7 | 1/36 | ... | ... | ... | ... | ... |
| 8 | 1/36 | ... | ... | ... | ... | ... |
| 9 | 1/36 | ... | ... | ... | ... | ... |
| 10 | 1/36 | ... | ... | ... | ... | ... |
| 11 | 1/36 | ... | ... | ... | ... | ... |
| 12 | 1/36 | ... | ... | ... | ... | ... |
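Any entry of the joint cumulative distribution table is a sum of joint pmf values over the pairs with \(s \le a\) and \(m \le b\). A minimal Python sketch of that sum follows; the function name `joint_cdf` is an illustrative choice and the joint pmf is rebuilt by enumerating the dice outcomes.

```python
from collections import defaultdict
from fractions import Fraction

# Joint pmf of S (sum) and M (maximum) of two fair dice.
p_SM = defaultdict(Fraction)
for first in range(1, 7):
    for second in range(1, 7):
        p_SM[(first + second, max(first, second))] += Fraction(1, 36)


def joint_cdf(joint, a, b):
    """F(a, b) = P(X <= a, Y <= b) for a discrete joint pmf."""
    return sum((prob for (x, y), prob in joint.items() if x <= a and y <= b),
               Fraction(0))


print(joint_cdf(p_SM, 6, 4))  # 13/36
```

Calling `joint_cdf(p_SM, s, m)` for each pair of table indices would fill in the cells left as `...`.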

Similar to the case of a single random variable, joint cumulative distribution functions can describe pairs of *discrete* random variables and pairs of *continuous* random variables.


From an airport, you can either take a bus or a taxi to get to your hotel. The bus takes \(B\) minutes to reach your hotel, where \(B\) follows a distribution defined by the cumulative distribution function \(F_B\).

\[F_B(t)=\begin{cases}1-\frac{15^2}{t^2} & t\ge 15 \\ 0 & t < 15\end{cases}\]

The time until the next bus is \(N\sim U(0,7)\). You decide to take the bus if it arrives within the next 5 minutes. Otherwise, you will take a taxi which takes 20 minutes. Let \(T\) be the travel time to your hotel.

What is \(F_{T,N}(20,5)\)?

- \(F_{T,N}(20,5)=P(T\le 20, N\le 5)\)

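Recall the multiplication rule for conditional probabilities.

- \(\phantom{F_{T,N}}=P(T\le20|N\le 5)P(N\le5)\)

When \(N\le 5\) you take the bus, so \(T=B\). The next step also assumes that the bus travel time \(B\) does not depend on the waiting time \(N\).

- \(\phantom{F_{T,N}}=P(B\le20)P(N\le5)\)
- \(\phantom{F_{T,N}}=\left(1-\frac{15^2}{20^2}\right)\frac{5}{7}\)
- \(\phantom{F_{T,N}}=\frac{7}{16}\cdot\frac{5}{7}=\frac{5}{16}\)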
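The calculation can be checked both exactly and by simulation. The Python sketch below is illustrative only: it assumes \(T=B\) whenever the bus is taken (waiting time not counted in \(T\)) and that \(B\) and \(N\) are independent, which is how the steps above treat them. Bus times are drawn by inverting \(F_B\), that is, \(B = 15/\sqrt{1-U}\) with \(U\sim U(0,1)\). The names `cdf_bus` and `simulate` and the seed are hypothetical choices.

```python
import random
from fractions import Fraction


def cdf_bus(t):
    """F_B(t) = 1 - 15^2 / t^2 for t >= 15, and 0 otherwise."""
    t = Fraction(t)
    return 1 - Fraction(15**2) / t**2 if t >= 15 else Fraction(0)


# Exact value: P(B <= 20) * P(N <= 5) with N ~ U(0, 7).
exact = cdf_bus(20) * Fraction(5, 7)


def simulate(n_trials, seed=237):
    """Monte Carlo estimate of P(T <= 20, N <= 5)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        wait = rng.uniform(0, 7)                      # N ~ U(0, 7)
        bus_time = 15 / (1 - rng.random()) ** 0.5     # inverse-CDF draw of B
        travel = bus_time if wait <= 5 else 20.0      # taxi takes 20 minutes
        if travel <= 20 and wait <= 5:
            hits += 1
    return hits / n_trials


print(exact)  # 5/16
estimate = simulate(200_000)
```

With a few hundred thousand trials the estimate should land close to \(5/16 = 0.3125\), up to simulation noise.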

- Recall that for a single continuous random variable, we compute a probability by integrating its probability density function, that is, by finding the area under the curve.
- For joint distributions with two variables, we want a density function whose integral, or the volume under the surface, represents a probability.

Random variables \(X\) and \(Y\) have a **joint continuous distribution** if for some function \(f:\mathbb{R}^2\to\mathbb{R}\) and for all real numbers \(a_1\), \(a_2\), \(b_1\), and \(b_2\) with \(a_1\le b_1\) and \(a_2\le b_2\),

\[P\left(a_1 \le X\le b_1, a_2\le Y\le b_2\right)=\int_{a_2}^{b_2}\int_{a_1}^{b_1} f\left(x,y\right) dx dy.\]

The function \(f\) has to satisfy

- \(f\left(x,y\right)\ge 0\) for all \(x\in\mathbb{R}\) and \(y\in\mathbb{R}\); and
- \(\int_{-\infty}^\infty\int_{-\infty}^\infty f\left(x,y\right) dxdy = 1\).

We call \(f\) the **joint probability density function** of \(X\) and \(Y\).

Suppose \(X\) and \(Y\) have a joint continuous distribution with joint density

\[f_{X,Y}\left(x,y\right)=\begin{cases}120 x^3 y & x\ge 0, y\ge 0, \\ & \quad x+y\le1 \\ 0 &\text{otherwise.}\end{cases}\]

\(f_{X,Y}(x,y)\ge 0\) for all \((x,y)\in\mathbb{R}^2\)

\(\int_{-\infty}^\infty\int_{-\infty}^\infty f_{X,Y}(x,y)dxdy =1\)

- Compute \(F_{X,Y}(1/2, 1/2)\) and \(P(X\le 1/2)\).


\[F_{X,Y}(1/2, 1/2)\]

- \(=\int_{-\infty}^{1/2}\int_{-\infty}^{1/2} f(x,y) dxdy\)
- \(=\int_{0}^{1/2}\int_{0}^{1/2} 120x^3y\ dxdy\)
- \(=\int_{0}^{1/2} 30\cdot \left(\frac{1}{2}\right)^4 \cdot y\ dy\)
- \(=15\cdot\left(\frac{1}{2}\right)^4\cdot\left(\frac{1}{2}\right)^2\)
- \(=\frac{15}{64}\)
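The value \(15/64\) can be sanity-checked numerically. Below is a rough Python sketch using a midpoint Riemann sum over the square \([0,1/2]\times[0,1/2]\); the grid size `n=200` and the function names are arbitrary illustrative choices, and the result is an approximation rather than an exact evaluation.

```python
def density(x, y):
    """Joint density 120 x^3 y on the triangle x >= 0, y >= 0, x + y <= 1."""
    inside = x >= 0 and y >= 0 and x + y <= 1
    return 120 * x**3 * y if inside else 0.0


def midpoint_double_integral(f, x_lo, x_hi, y_lo, y_hi, n=200):
    """Approximate the double integral of f over a rectangle on an n-by-n grid."""
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx  # midpoint of the i-th x cell
        for j in range(n):
            y = y_lo + (j + 0.5) * dy  # midpoint of the j-th y cell
            total += f(x, y)
    return total * dx * dy


approx = midpoint_double_integral(density, 0.0, 0.5, 0.0, 0.5)
print(approx, 15 / 64)
```

The lower limits can start at 0 rather than \(-\infty\) because the density is zero for negative \(x\) or \(y\).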


\[P(X\le 1/2)\]

- \(=\int_{-\infty}^{1/2}\int_{-\infty}^{\infty} f(x,y) \color{forestgreen}{dydx}\)

The order of the integrals is exchangeable for probability density functions.

- \(=\int_{0}^{1/2}\int_{0}^{1 - x} 120x^3y\ dydx\)
- \(=\int_{0}^{1/2} \color{DarkOrchid}{60\cdot x^3\cdot \left(1-x\right)^2}\ dx\)
- \(=\cdots=\frac{11}{32}\)

\[P(X\le 1/2)=\int_{0}^{1/2} \color{DarkOrchid}{60\cdot x^3\cdot \left(1-x\right)^2}\ dx\]

\(\color{DarkOrchid}{60\cdot x^3\cdot \left(1-x\right)^2}\) for \(x\in[0,1]\) is the probability density function of \(X\).

\[f_X(x)=\begin{cases} \color{DarkOrchid}{60\cdot x^3\cdot \left(1-x\right)^2} & x \in [0,1] \\ 0 & \text{otherwise}\end{cases}\]
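As a check on both the marginal density and the value \(11/32\), the Python sketch below integrates out \(y\) numerically and evaluates the integral of \(60x^3(1-x)^2 = 60(x^3-2x^4+x^5)\) exactly through its antiderivative. The function names are illustrative, and `cdf_x` is only meant for \(0\le a\le 1\).

```python
from fractions import Fraction


def marginal_density_x(x):
    """f_X(x) = 60 x^3 (1 - x)^2 on [0, 1], and 0 otherwise."""
    return 60 * x**3 * (1 - x)**2 if 0 <= x <= 1 else 0


def integrate_out_y(x, n=1000):
    """Midpoint-rule approximation of the integral of 120 x^3 y over 0 <= y <= 1 - x."""
    if not 0 <= x <= 1:
        return 0.0
    dy = (1 - x) / n
    return sum(120 * x**3 * (j + 0.5) * dy for j in range(n)) * dy


def cdf_x(a):
    """Exact integral of 60 (x^3 - 2 x^4 + x^5) from 0 to a, for 0 <= a <= 1."""
    a = Fraction(a)
    return 60 * (a**4 / 4 - 2 * a**5 / 5 + a**6 / 6)


print(cdf_x(Fraction(1, 2)))  # 11/32
```

Evaluating `cdf_x(1)` gives 1, consistent with \(f_X\) being a probability density function.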

Let \(X\) and \(Y\) have a joint continuous distribution, with joint density function \(f_{X,Y}\).

Then, the **marginal probability density function** \(f_X\) of \(X\) satisfies

\[f_X\left(x\right) = \int_{-\infty}^\infty f_{X,Y}\left(x,y\right) dy\]

for all \(x\in\mathbb{R}\) and the **marginal probability density function** \(f_Y\) of \(Y\) satisfies

\[f_Y\left(y\right)=\int_{-\infty}^\infty f_{X,Y}\left(x,y\right)dx\]

for all \(y\in\mathbb{R}\).

\(P\left(X\le a\right)=F\left(a,\infty\right)=\lim_{b\to\infty}F\left(a,b\right)\)

\(P\left(Y\le b\right)=F\left(\infty, b\right)=\lim_{a\to\infty}F\left(a,b\right)\)

A **marginal distribution** is the distribution of a subset of random variables that belong to a larger set, in *both discrete and continuous cases*.

Let \(F\) be the joint cumulative distribution function of random variables \(X\) and \(Y\).

Then, the **marginal cumulative distribution function** of \(X\) is given by

\[F_X\left(a\right)=\lim_{b\to\infty}F\left(a,b\right)\]

and the **marginal cumulative distribution function** of \(Y\) is given by

\[F_Y\left(b\right)=\lim_{a\to\infty}F\left(a,b\right).\]
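For the dice example, where \(M\le 6\) and \(S\le 12\) always hold, the limits are already reached at finite values, so the marginal cumulative distribution functions can be read off the joint one. A small Python sketch under that observation follows (names are illustrative; `math.inf` stands in for the limit).

```python
import math
from collections import defaultdict
from fractions import Fraction

# Joint pmf of S (sum) and M (maximum) of two fair dice.
p_SM = defaultdict(Fraction)
for first in range(1, 7):
    for second in range(1, 7):
        p_SM[(first + second, max(first, second))] += Fraction(1, 36)


def joint_cdf(a, b):
    """F_{S,M}(a, b) = P(S <= a, M <= b)."""
    return sum((p for (s, m), p in p_SM.items() if s <= a and m <= b),
               Fraction(0))


# Letting the other argument grow without bound gives the marginal CDFs.
F_S = {s: joint_cdf(s, math.inf) for s in range(2, 13)}
F_M = {m: joint_cdf(math.inf, m) for m in range(1, 7)}

print(F_S[6], F_M[4])  # 5/12 4/9
```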

Recall for events \(A\) and \(B\),

- if \(P(A)\cdot P(B) = P(A\cap B)\) then they are independent.

\[P\left(\left\{X\in I_A\right\}\right)\cdot P\left(\left\{Y\in I_B\right\}\right)=P\left(\left\{X\in I_A\right\} \cap \left\{Y\in I_B\right\}\right)\]

where \(I_A\) and \(I_B\) are intervals such that \(A=\{X\in I_A\}\) and \(B=\{Y\in I_B\}\).

For random variables \(X\) and \(Y\),

- if \(F_X(x)F_Y(y)=F_{X,Y}(x,y)\) for all possible values of \(x\) and \(y\) then they are independent.

When \(X\) and \(Y\) are independent, \(P\left(\left\{X\in I_A\right\}\right) P\left(\left\{Y\in I_B\right\}\right)=P\left(\left\{X\in I_A\right\} \cap \left\{Y\in I_B\right\}\right)\) is true for ALL \(I_A\) and \(I_B\).

The random variables \(X\) and \(Y\), with joint cumulative distribution function \(F\), are **independent** if

\[P\left(X\le x, Y\le y\right)=P\left(X\le x\right)\cdot P\left(Y\le y\right),\]

that is,

\[F\left(x,y\right)=F_X\left(x\right)\cdot F_Y\left(y\right)\]

for all possible values \(x\) and \(y\). Random variables that are not independent are called **dependent**.

\(P\left(X\le x, Y\le y\right) = P\left(X\le x\right)P\left(Y\le y\right)\) for all possible values of \(x\) and \(y\) implies \(P(X=x, Y=y)= P(X=x)P(Y=y)\) for all possible values of \(x\) and \(y\).

The discrete random variables \(X\) and \(Y\), with joint probability mass function \(p\), are **independent** if \(p(x,y)=p_X(x)p_Y(y)\) for all possible values of \(x\) and \(y\).

Are \(S\) and \(M\) independent?

- \(p_S(6)p_M(4)=\frac{35}{{36^2}}\neq \frac{2}{36}=p_{S,M}(6,4)\)
- \(S\) and \(M\) are dependent.

\[p_{S,M}(s,m)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 | \(p_S(s)\) |
|---|---|---|---|---|---|---|---|
| 2 | 1/36 | 0 | 0 | 0 | 0 | 0 | 1/36 |
| 3 | 0 | 2/36 | 0 | 0 | 0 | 0 | 2/36 |
| 4 | 0 | 1/36 | 2/36 | 0 | 0 | 0 | 3/36 |
| 5 | 0 | 0 | 2/36 | 2/36 | 0 | 0 | 4/36 |
| 6 | 0 | 0 | 1/36 | 2/36 | 2/36 | 0 | 5/36 |
| 7 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 6/36 |
| 8 | 0 | 0 | 0 | 1/36 | 2/36 | 2/36 | 5/36 |
| 9 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 4/36 |
| 10 | 0 | 0 | 0 | 0 | 1/36 | 2/36 | 3/36 |
| 11 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 |
| 12 | 0 | 0 | 0 | 0 | 0 | 1/36 | 1/36 |
| \(p_M(m)\) | 1/36 | 3/36 | 5/36 | 7/36 | 9/36 | 11/36 | |
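The product check can be run over every cell of the table. The Python sketch below is an illustration with hypothetical helper names; for contrast it also checks the two individual dice rolls, which do factorize.

```python
from collections import defaultdict
from fractions import Fraction


def marginals(joint):
    """Return (p_X, p_Y) by summing the joint pmf over the other variable."""
    p_x, p_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), prob in joint.items():
        p_x[x] += prob
        p_y[y] += prob
    return dict(p_x), dict(p_y)


def is_independent(joint):
    """True if p(x, y) = p_X(x) p_Y(y) for every pair of possible values."""
    p_x, p_y = marginals(joint)
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y]
               for x in p_x for y in p_y)


p_rolls = {}                      # joint pmf of (first roll, second roll)
p_SM = defaultdict(Fraction)      # joint pmf of (sum, maximum)
for first in range(1, 7):
    for second in range(1, 7):
        p_rolls[(first, second)] = Fraction(1, 36)
        p_SM[(first + second, max(first, second))] += Fraction(1, 36)
p_SM = dict(p_SM)

print(is_independent(p_SM), is_independent(p_rolls))  # False True
```

A single failing pair, such as \((s,m)=(6,4)\), is enough to conclude dependence.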


You can also check using the cumulative distribution functions.

\[F_{S,M}(s,m)\]

| \(s \backslash m\) | 1 | 2 | 3 | 4 | 5 | 6 | \(F_S(s)\) |
|---|---|---|---|---|---|---|---|
| 2 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 | 1/36 |
| 3 | 1/36 | 3/36 | 3/36 | 3/36 | 3/36 | 3/36 | 3/36 |
| 4 | 1/36 | ... | ... | ... | ... | ... | ... |
| 5 | 1/36 | ... | ... | ... | ... | ... | ... |
| 6 | 1/36 | ... | ... | 13/36 | ... | ... | ... |
| 7 | 1/36 | ... | ... | ... | ... | ... | ... |
| 8 | 1/36 | ... | ... | ... | ... | ... | ... |
| 9 | 1/36 | ... | ... | ... | ... | ... | ... |
| 10 | 1/36 | ... | ... | ... | ... | ... | ... |
| 11 | 1/36 | ... | ... | ... | ... | ... | ... |
| 12 | 1/36 | ... | ... | ... | ... | ... | ... |
| \(F_M(m)\) | 1/36 | ... | ... | ... | ... | ... | ... |

If \(X\) and \(Y\) are independent, can we recover the value of \(\varepsilon\)?

If they are independent, \(p_X(a)p_Y(b)=p(a,b)\) for \(a\in\{0,1\}\) and \(b\in\{0,1\}\).

- Yes, \(\varepsilon = 0\).

| \(a \backslash b\) | 0 | 1 | \(p_X(a)\) |
|---|---|---|---|
| 0 | \(1/4-\varepsilon\) | \(1/4+\varepsilon\) | \(1/2\) |
| 1 | \(1/4+\varepsilon\) | \(1/4-\varepsilon\) | \(1/2\) |
| \(p_Y(b)\) | \(1/2\) | \(1/2\) | |
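Since every marginal mass is \(1/2\), independence requires each joint entry to equal \(1/2\cdot 1/2=1/4\), which forces \(\varepsilon=0\). A short Python sketch of that check follows; the function names and the trial value \(\varepsilon=1/10\) are illustrative assumptions.

```python
from fractions import Fraction


def joint_pmf(eps):
    """Joint pmf from the table; eps is assumed to satisfy 0 <= eps <= 1/4."""
    quarter = Fraction(1, 4)
    return {(0, 0): quarter - eps, (0, 1): quarter + eps,
            (1, 0): quarter + eps, (1, 1): quarter - eps}


def is_independent(eps):
    """Check p(a, b) = p_X(a) p_Y(b) for all a, b in {0, 1}."""
    joint = joint_pmf(eps)
    p_x = {a: joint[(a, 0)] + joint[(a, 1)] for a in (0, 1)}
    p_y = {b: joint[(0, b)] + joint[(1, b)] for b in (0, 1)}
    return all(joint[(a, b)] == p_x[a] * p_y[b] for a in (0, 1) for b in (0, 1))


print(is_independent(Fraction(0)), is_independent(Fraction(1, 10)))  # True False
```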

\(F\left(x, y\right) = F_X\left(x\right)F_Y\left(y\right)\) for all possible values of \(x\) and \(y\) implies \(\frac{\partial^2}{\partial x\,\partial y} F\left(x, y\right) = \frac{d}{dx} F_X\left(x\right) \cdot \frac{d}{dy} F_Y\left(y\right)\) for all possible values of \(x\) and \(y\).

The continuous random variables \(X\) and \(Y\), with joint probability density function \(f\), are **independent** if \(f(x,y)=f_X(x)f_Y(y)\) for all possible values of \(x\) and \(y\).

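Suppose \(X\) and \(Y\) have a joint continuous distribution with joint density

\[f_{X,Y}\left(x,y\right)=\begin{cases}120 x^3 y & x\ge 0, y\ge 0, \\ & \quad x+y\le1 \\ 0 &\text{otherwise.}\end{cases}\]

Are \(X\) and \(Y\) independent?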

\(f_X(x)=\begin{cases}60x^3(1-x)^2 & 0\le x \le 1 \\ 0 & \text{otherwise}\end{cases}\)

- When \(0\le y \le 1\),

\[f_Y(y)= \int_0^{1-y} 120x^3y\ dx = 30y(1-y)^4.\]

- For \((x,y)\) that satisfy \(x\ge 0\), \(y\ge 0\), \(x+y\le 1\), the product is \(f_X(x)f_Y(y)=1800x^3y(1-x)^2(1-y)^4\), which is not the same function as \(f_{X,Y}(x,y)=120x^3y\).

- NO, they are not independent.
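One point where the product and the joint density disagree is enough to rule out independence. The Python sketch below evaluates both at \((1/4,1/4)\), an arbitrary illustrative point inside the triangle, using exact fractions; the function names are hypothetical.

```python
from fractions import Fraction


def joint_density(x, y):
    """f_{X,Y}(x, y) = 120 x^3 y on x >= 0, y >= 0, x + y <= 1."""
    inside = x >= 0 and y >= 0 and x + y <= 1
    return 120 * x**3 * y if inside else Fraction(0)


def density_x(x):
    """f_X(x) = 60 x^3 (1 - x)^2 on [0, 1]."""
    return 60 * x**3 * (1 - x)**2 if 0 <= x <= 1 else Fraction(0)


def density_y(y):
    """f_Y(y) = 30 y (1 - y)^4 on [0, 1]."""
    return 30 * y * (1 - y)**4 if 0 <= y <= 1 else Fraction(0)


x0, y0 = Fraction(1, 4), Fraction(1, 4)
joint_value = joint_density(x0, y0)            # 15/32
product_value = density_x(x0) * density_y(y0)  # 164025/131072
print(joint_value, product_value, joint_value == product_value)
```

Points outside the triangle give the same conclusion: at \((3/4,3/4)\) the joint density is 0 while both marginal densities are positive.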

For any number of random variables, \(X_1\), \(X_2\), …, \(X_n\), they are **pairwise independent** if \(X_j\) and \(X_k\) are independent for all \(j\neq k\), \(1\le j,k \le n\).

For any number of random variables, \(X_1\), \(X_2\), …, \(X_n\), they are **independent** if \(F\left(x_1,x_2,\ldots,x_n\right)=\prod_{i=1}^n F_{X_i}\left(x_i\right)\) for all values \(x_1\), \(x_2\), …, \(x_n\).

You can also write the definition with \(p_{X_i}\) for discrete random variables with joint probability mass function \(p\) or with \(f_{X_i}\) for continuous random variables with joint density function \(f\).
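Pairwise independence does not guarantee independence of the whole collection. The Python sketch below uses a standard illustration that is not from the slides: \(X_1\) and \(X_2\) are independent fair bits and \(X_3\) is their exclusive-or. It applies the probability mass function form of the definitions; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (X1, X2, X3) with X3 = X1 XOR X2 and X1, X2 fair independent bits.
joint = {(x1, x2, x1 ^ x2): Fraction(1, 4) for x1, x2 in product((0, 1), repeat=2)}


def marginal(pmf, indices):
    """Marginal pmf of the coordinates listed in `indices`."""
    result = {}
    for values, prob in pmf.items():
        key = tuple(values[i] for i in indices)
        result[key] = result.get(key, 0) + prob
    return result


def factorizes(pmf, indices):
    """True if the marginal pmf on `indices` equals the product of single marginals."""
    sub = marginal(pmf, indices)
    singles = [marginal(pmf, (i,)) for i in indices]
    supports = [sorted(v for (v,) in single) for single in singles]
    for values in product(*supports):
        expected = Fraction(1)
        for single, v in zip(singles, values):
            expected *= single[(v,)]
        if sub.get(values, 0) != expected:
            return False
    return True


pairwise = all(factorizes(joint, pair) for pair in [(0, 1), (0, 2), (1, 2)])
mutual = factorizes(joint, (0, 1, 2))
print(pairwise, mutual)  # True False
```

For instance, \(P(X_1=0,X_2=0,X_3=1)=0\) while the product of the three marginal masses is \(1/8\).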

Let \(X_1\), \(X_2\), \(X_3\), …, \(X_n\) be independent and identically distributed \(U(0,1)\) random variables. Let \(X_{(n)}\) be the maximum value among them.

What is the cumulative distribution function of \(X_{(n)}\)? How about its probability density function?

\(X_{(n)} \le x\) if and only if \(X_i \le x\) for all \(i=1,2,\ldots,n\).

- \(P\left(X_{(n)}\le x\right)\) \(=P\left(X_1\le x, X_2\le x, \cdots, X_n\le x\right)\)

\(X_1\), \(X_2\), … \(X_n\) are independent.

- \(=P(X_1\le x)P(X_2\le x)\cdots P(X_n\le x)\)

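- \(=\begin{cases} 0 & x <0 \\ x^n & 0\le x \le 1 \\ 1 & x>1\end{cases}\)

- Differentiating the cumulative distribution function gives the probability density function \(f_{X_{(n)}}(x)=nx^{n-1}\) for \(0\le x\le 1\) and \(0\) otherwise.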
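A quick simulation can be compared against \(x^n\). The Python sketch below is illustrative: the choices \(n=5\), \(x=0.8\), the number of trials, and the seed are arbitrary, and the density \(nx^{n-1}\) on \([0,1]\) is the derivative of the cumulative distribution function.

```python
import random


def max_uniform_cdf(x, n):
    """CDF of the maximum of n independent U(0, 1) random variables."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x ** n


def max_uniform_pdf(x, n):
    """Density of the maximum: derivative of x^n on [0, 1], and 0 elsewhere."""
    return n * x ** (n - 1) if 0 <= x <= 1 else 0.0


def empirical_cdf_of_max(x, n, n_trials=100_000, seed=237):
    """Fraction of simulated maxima that are at most x."""
    rng = random.Random(seed)
    hits = sum(max(rng.random() for _ in range(n)) <= x for _ in range(n_trials))
    return hits / n_trials


theory = max_uniform_cdf(0.8, 5)          # 0.8 ** 5
estimate = empirical_cdf_of_max(0.8, 5)   # close to theory, up to simulation noise
print(theory, estimate)
```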

Let \(X_1\), \(X_2\), …, \(X_n\) be independent random variables. For each \(i\in\left\{1,2,\ldots,n\right\}\), let \(h_i:\mathbb{R}\to\mathbb{R}\) be a function and define the random variable

\[Y_i=h_i\left(X_i\right).\]

Then, \(Y_1\), \(Y_2\), …, \(Y_n\) are also independent.
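The statement can be illustrated with two independent fair dice. In the Python sketch below the functions \(h_1(x)=x^2\) and \(h_2(x)=x \bmod 2\) are arbitrary illustrative choices, and the check confirms that the joint pmf of \(Y_1=h_1(X_1)\) and \(Y_2=h_2(X_2)\) factorizes into its marginals.

```python
from collections import defaultdict
from fractions import Fraction


def h1(x):
    return x ** 2   # Y1 = X1 squared


def h2(x):
    return x % 2    # Y2 = parity of X2


# X1 and X2 are independent fair dice, so each ordered pair has probability 1/36.
joint_Y = defaultdict(Fraction)
for x1 in range(1, 7):
    for x2 in range(1, 7):
        joint_Y[(h1(x1), h2(x2))] += Fraction(1, 36)
joint_Y = dict(joint_Y)

p_Y1, p_Y2 = defaultdict(Fraction), defaultdict(Fraction)
for (y1, y2), prob in joint_Y.items():
    p_Y1[y1] += prob
    p_Y2[y2] += prob

factorizes = all(joint_Y.get((y1, y2), 0) == p_Y1[y1] * p_Y2[y2]
                 for y1 in list(p_Y1) for y2 in list(p_Y2))
print(factorizes)  # True
```

This checks one example only; it does not replace the general result stated above.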

Install `learnr` and run the R worksheet

- Install `learnr` on r.datatools.utoronto.ca
- Open the worksheet

If you see an error, try:

- Log in to r.datatools.utoronto.ca
- Find `rlesson07` from the *Files* pane
- Click *Run Document*

Other steps you may try:

- Remove any `.Rmd` and `.R` files on the home directory of r.datatools.utoronto.ca
- In RStudio,
  - Click `Tools` > `Global Options`
  - Uncheck *“Restore most recently opened project at startup”*
- Run `install.packages("learnr")` in RStudio after the steps above

- Joint distributions of two or more random variables can be described using their joint cumulative distribution functions and joint probability mass functions or joint probability density functions.
- Joint distributions contain the information on the relationship between random variables which the marginal distributions of the individual random variables do not explain.
- Independent random variables each describe events that are independent of each other.

Chapter 9, Dekking et al.

Read Section 9.3, 9.5

Quick Exercises 9.3, 9.4, 9.5

All Exercises

See a collection of corrections by the author here

© 2023. Michael J. Moon. University of Toronto.

Sharing, posting, selling, or using this material outside of your personal use in this course is **NOT** permitted under any circumstances.