Lecture 1: Outcomes, Events, and Probability

STA237: Probability, Statistics, and Data Analysis I

Michael Jongho Moon

PhD Student, DoSS, University of Toronto

Monday, May 8, 2023

Introduction to probability

At which station will I experience the next subway delay during my commute?


  • Definitely Eglinton.
  • Probably Bloor.
  • Maybe Union…
  • None of them. I am just too lucky.

In plain language, definitely, probably, and maybe express a degree of uncertainty or a degree of belief.

Number of TTC delays longer than 5 min in 2022 along my commute.

  • We learn from what we observe to make conclusions about what we haven’t observed
  • We have uncertainties about the conclusions and would like to study the uncertainties
  • We assign a numeric value, called probability, to represent our level of certainty

In this class, we will study how we describe uncertainty with probability.

Guess the next TTC subway delay.

What probability is

A discipline

Probability is the science of uncertainty.

(Evans and Rosenthal)

An expression

a number between 0 and 1 that expresses how likely [an] event is to occur…

(Dekking et al.)

Another way of thinking about probability is in terms of [long-term] relative frequency.

(Evans and Rosenthal)

Probability in terms of relative frequency

  • The more frequent the delays in the past, the more likely I will experience a delay
  • We assume the TTC of tomorrow is similar to that of yesterday

Relative frequencies inform our beliefs.
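
The long-term relative frequency idea can be illustrated with a small simulation. The sketch below is an assumption-laden stand-in: it uses simulated rolls of a fair die rather than TTC delay data (which is not reproduced here), and the function name `relative_frequency` and the seed are hypothetical choices.

```python
import random


def relative_frequency(n_rolls, target=6, seed=237):
    """Fraction of n_rolls simulated fair-die rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls


# The relative frequency should settle near 1/6 as the number of rolls grows.
for n in (60, 600, 60_000):
    print(n, round(relative_frequency(n), 4))
```

With few rolls the relative frequency can be far from 1/6; with many rolls it tends to stay close to it.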

Why we study probability

Probability is everywhere, and understanding probability can help you…

  • plan your next subway trip
  • understand that launching the space shuttle Challenger was a bad idea without launching it (Section 1.4 of Dekking et al.)
  • estimate the prevalence of COVID-19-infected individuals in Ontario communities from wastewater (COVID-19 Wastewater Surveillance in Ontario, Public Health Ontario, 2023.)

Random experiment, outcomes and events

Definitions

A (random) experiment is a mechanism/phenomenon that results in random or unpredictable outcomes.

The station where I experience my next TTC subway delay is the outcome.

A sample space is the collection of all possible outcomes from an experiment. It’s often denoted \(\Omega\) (Omega).

\[\Omega=\{\text{Sheppard}, \ldots, \text{Queen's Park}\}\] \[=\text{All stations along my commute}\]

An event is a subset of the sample space.

\[D=\{\text{Bloor}, \text{Wellesley}, \ldots, \text{Queen's Park}\}\] \[=\text{Stations in downtown along my commute}\]

Some basic set theory

Events

Consider the following events.


\(A\): My next delay is in downtown Toronto.

\(B\): My next delay is along Yonge St.

Event A.

Event B.

Venn Diagrams

Event A and B.

Intersection

\[A\cap B\]

  • Represents the event that consists of the outcomes that belong to both \(A\) and \(B\)
  • A delay in downtown AND on Yonge St.

Event A or B.

Union

\[A\cup B\]

  • Represents the event that consists of the outcomes that belong to \(A\) or \(B\) (or both)
  • A delay in downtown OR on Yonge St.

Event A complement.

Complement

\[A^c\]

  • Represents the event that consists of all outcomes in \(\Omega\) that do not belong to \(A\)
  • A delay that is NOT in downtown
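
As a minimal sketch, the three operations map directly onto Python's built-in set operators. Because the full list of stations is not given here, the example assumes a different experiment, a single roll of a die, with \(A\) the even numbers and \(B\) the numbers less than 3; the variable names are illustrative only.

```python
# Sample space and two events for one roll of a die (assumed example).
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # an even number
B = {1, 2}     # a number less than 3

intersection = A & B    # outcomes in both A and B
union = A | B           # outcomes in A or B (or both)
complement_A = omega - A  # outcomes in omega that are not in A

print(sorted(intersection))  # [2]
print(sorted(union))         # [1, 2, 4, 6]
print(sorted(complement_A))  # [1, 3, 5]
```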

Example: Neither \(A\) nor \(B\)

How would you write an event that includes outcomes that belong to neither \(A\) nor \(B\) using set notation?


We could write…

a delay that is NOT \((\cdot^c)\) [in downtown \((A)\) OR \((\cup)\) on Yonge St. \((B)\)],

\[(A\cup B)^c\]

or a delay that is NOT \((\cdot^c)\) in downtown \((A)\) AND \((\cap)\) NOT \((\cdot^c)\) on Yonge St. \((B)\),

\[A^c \cap B^c\]

Both describe the same event:

\[(A\cup B)^c = A^c \cap B^c\]

De Morgan’s Laws

For any two events \(A\) and \(B\), we have

\[(A\cup B)^c = A^c \cap B^c\]

and

\[(A\cap B)^c = A^c\cup B^c.\]
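
De Morgan's laws can be checked by brute force on a small sample space. The sketch below assumes a hypothetical four-outcome sample space and a helper `all_events` that lists every subset; it is a sanity check on one sample space, not a proof.

```python
from itertools import combinations


def all_events(omega):
    """Return every subset (event) of a finite sample space."""
    items = sorted(omega)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


omega = {1, 2, 3, 4}  # hypothetical sample space


def complement(event):
    """Outcomes of omega that are not in the event."""
    return omega - event


events = all_events(omega)  # 2**4 = 16 events

# Check both laws for every pair of events.
de_morgan_holds = all(
    complement(A | B) == complement(A) & complement(B)
    and complement(A & B) == complement(A) | complement(B)
    for A in events
    for B in events
)
print(de_morgan_holds)  # True
```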

Example: Neither \(A\) nor \(B\)

What is left?

Every station along my commute is in downtown or along Yonge St., so no outcome lies outside both \(A\) and \(B\): \[A^c \cap B^c = \{\}= \emptyset,\] an empty set.

Example: Exactly one of \(A\) and \(B\)

Exactly one of A or B.

An event that includes outcomes that belong to one of \(A\) or \(B\), but not both.

How can we represent the event using set notation?

Remove \(A\cap B\) from \(A\cup B\).

  • To “remove” \(A\cap B\), we can find the intersection, \(\cap\), with its complement, \(\left(A\cap B\right)^c\).
  • We can express the event as \[(A\cup B) \cap (A \cap B)^c = (A\cup B) \cap (A^c \cup B^c)\]
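
A quick check of this expression, again assuming the roll of a die with \(A\) the even numbers and \(B\) the numbers less than 3 (an illustrative choice, not the commute example). Python calls this event the symmetric difference and writes it `A ^ B`.

```python
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # an even number
B = {1, 2}     # a number less than 3

# (A ∪ B) ∩ (A ∩ B)^c
exactly_one = (A | B) & (omega - (A & B))

# (A ∪ B) ∩ (A^c ∪ B^c), the form obtained with De Morgan's law
exactly_one_alt = (A | B) & ((omega - A) | (omega - B))

print(sorted(exactly_one))              # [1, 4, 6]
print(exactly_one == exactly_one_alt)   # True
print(exactly_one == A ^ B)             # True
```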

Other useful terminologies and properties to remember

Disjoint \(A\) and \(B\)

(mutually exclusive)

\[A\cap B=\{\}=\emptyset\]

\(A\) implies \(B\)

\(A\) is a subset of \(B\)

\[A\cap B=A\]

\[A\subset B\]

Commutative

\[A\cup B=B\cup A\] \[A\cap B=B\cap A\]

Associative

\[(A\cup B)\cup C=A\cup(B\cup C)\] \[(A\cap B)\cap C=A\cap(B\cap C)\]

Distributive

\[A\cup (B\cap C)=(A\cup B) \cap (A\cup C)\] \[A\cap (B\cup C)=(A\cap B) \cup (A\cap C)\]
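
The three properties can also be verified exhaustively on a small sample space. This is a sketch under the assumption of a hypothetical three-outcome sample space; passing the check on one sample space illustrates the identities but does not prove them in general.

```python
from itertools import combinations, product


def all_events(omega):
    """Return every subset (event) of a finite sample space."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = all_events({1, 2, 3})  # 2**3 = 8 events

commutative = all(
    A | B == B | A and A & B == B & A
    for A, B in product(events, repeat=2)
)
associative = all(
    (A | B) | C == A | (B | C) and (A & B) & C == A & (B & C)
    for A, B, C in product(events, repeat=3)
)
distributive = all(
    A | (B & C) == (A | B) & (A | C) and A & (B | C) == (A & B) | (A & C)
    for A, B, C in product(events, repeat=3)
)
print(commutative, associative, distributive)  # True True True
```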

Probability

Flowchart: Event → Probability Function → Probability Value

Probability function

A probability function \(P\) defined on a finite sample space \(\Omega\) assigns each event \(A\) in \(\Omega\) a number \(P(A)\) such that

  1. \(P(A) \ge 0\);
  2. \(P(\Omega) = 1\); and
  3. \(P(A\cup B) = P(A) + P(B)\)
    if \(A\) and \(B\) are disjoint.


(axioms of probability)

The number \(P(A)\) is called the probability that \(A\) occurs.
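
One way to make the definition concrete is to build a probability function from per-outcome weights and check the three axioms directly. The sketch below assumes a fair six-sided die; the names `weights`, `P`, and `events` are illustrative, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: each of the six outcomes gets weight 1/6.
weights = {w: Fraction(1, 6) for w in range(1, 7)}
omega = set(weights)


def P(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


# Every event (subset) of the sample space: 2**6 = 64 of them.
events = [set(c) for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

axiom_1 = all(P(E) >= 0 for E in events)
axiom_2 = P(omega) == 1
axiom_3 = all(P(E | F) == P(E) + P(F)
              for E in events for F in events
              if not (E & F))  # only disjoint pairs

print(axiom_1, axiom_2, axiom_3)  # True True True
```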

Probability function

A probability function \(P\) defined on an infinite sample space \(\Omega\) assigns each event \(A\) in \(\Omega\) a number \(P(A)\) such that

  1. \(P(A) \ge 0\);
  2. \(P(\Omega) = 1\); and
  3. \(P(A_1\cup A_2 \cup A_3 \cup \cdots)\) \(= P(A_1) + P(A_2) + P(A_3) + \cdots\)
    if \(A_1\), \(A_2\), \(A_3\), … are disjoint.


(axioms of probability)

The number \(P(A)\) is called the probability that \(A\) occurs.

Example: My next TTC delay

Higher relative frequencies, higher probability.

  • \(P\left(\left\{\text{Eglinton}\right\}\right) > P\left(\left\{\text{Lawrence}\right\}\right)\)
  • \(P\left(\left\{\text{Eglinton}\right\}\right) + P\left(\left\{\text{Lawrence}\right\}\right)\) \(= P\left(\left\{\text{Eglinton, Lawrence}\right\}\right)\)

\(\left\{\text{Eglinton, Lawrence}\right\}\) \(=\left\{\text{Eglinton}\right\}\cup\left\{\text{Lawrence}\right\}\)

The event that the next delay is at Eglinton or Lawrence

  • \(P\left(\text{All stations}\right)=1\)

assuming I will eventually experience a delay at one of the stations

Probability and set operations

Probability of a union

Consider \(P(A)\)

For any two events \(A\) and \(B\), we can decompose each into two disjoint subsets.

\[A=(A\cap B^c)\cup (A\cap B)\]

\[P(A)=P(A\cap B^c) + P(A\cap B)\]

(probability axiom iii)

Example: My next TTC delay

Recall events \(A\) and \(B\).

\(A\): My next delay is in downtown Toronto.

\(B\): My next delay is along Yonge St.

Assume I will eventually experience a delay at a TTC station during my commute.

\(\implies P(A\cup B)=P(\Omega)=1\)

(probability axiom ii)

Suppose

\[P(A)={4}/{10}\]

Event A.

\[P(B)={2}/{3}\]

Event B.

\[P(A\cap B)=?\]

Example: My next TTC delay

\[P(A)={4}/{10}\] \[P(B)={2}/{3}\] \[P(A\cup B)=1\]

\(P(A)=P(A\cap B^c)+P(A\cap B)\)

\(P(B)=P(A^c\cap B)+P(A\cap B)\)

\(P(A\cup B)=P(A\cap B^c)+P(A^c\cap B)+P(A\cap B)\)

\(\implies\)

\(P(A) + P(B)=P(A\cap B^c)+P(A^c\cap B)+2\cdot P(A\cap B)\)

\(P(A) + P(B)=P(A \cup B)+P(A\cap B)\)

\(\implies\)

\(P(A\cap B)=P(A)+P(B)-P(A\cup B)\)

\(\phantom{P(A\cap B)}=4/10 + 2/3 - 1\)

\(\phantom{P(A\cap B)}=1/15\)
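
The arithmetic can be checked with exact fractions. This is a minimal sketch using only the three probabilities given in the example; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(4, 10)   # P(A)
p_b = Fraction(2, 3)    # P(B)
p_union = Fraction(1)   # P(A ∪ B)

# P(A ∩ B) = P(A) + P(B) - P(A ∪ B)
p_both = p_a + p_b - p_union
print(p_both)  # 1/15

# The three disjoint pieces of A ∪ B should add back up to P(A ∪ B).
p_a_only = p_a - p_both  # P(A ∩ B^c)
p_b_only = p_b - p_both  # P(A^c ∩ B)
print(p_a_only + p_b_only + p_both == p_union)  # True
```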

Probability of a union

Consider \(P(A\cup B)\)

\[A\cup B=(A\cap B^c)\cup (A\cap B)\cup (A^c\cap B)\]

\[P(A\cup B)=P(A\cap B^c)+ P(A\cap B) + P(A^c\cap B)\]

Probability of a complement

For any event \(A\), we can decompose the sample space \(\Omega\) into two disjoint subsets.

\[\Omega=A\cup A^c\]

\[\implies P(\Omega)=P(A) + P(A^c)\] \[\implies P(A^c)=1-P(A)\]

Probability of equally likely outcomes

Calculating probability by counting

Applies only when

  • all outcomes of the sample space are equally likely; and
  • \(\Omega\) is finite.

For any event \(A\) of such sample space \(\Omega\),

\[P(A)=\frac{\text{number of outcomes that belong to }A}{\text{total number of outcomes in }\Omega}\]

Example: Rolling a die

Suppose you roll a fair die once.

\[A=\text{You roll an even number.}\] \[B=\text{You roll a number less than 3.}\]

Compute the following probabilities.

\[P(A)=P\left(\left\{2,4,6\right\}\right)=3/6=1/2\]

\[P(A\cap B)=P\left(\left\{2\right\}\right)=1/6\]

\[P(A\cup B)=P\left(\left\{1,2,4,6\right\}\right)=4/6=2/3\]
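
The counting formula can be written as a short function. The sketch below assumes equally likely outcomes on a finite sample space, as the formula requires; the function name `prob` is illustrative.

```python
from fractions import Fraction

omega = set(range(1, 7))  # outcomes of one roll of a fair die


def prob(event):
    """P(event) by counting, valid only for equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))


A = {w for w in omega if w % 2 == 0}  # an even number
B = {w for w in omega if w < 3}       # a number less than 3

print(prob(A))      # 1/2
print(prob(A & B))  # 1/6
print(prob(A | B))  # 2/3

# Complement rule: P(A^c) = 1 - P(A)
print(prob(omega - A) == 1 - prob(A))  # True
```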

Multiple experiments

Example: Rolling a die twice

Suppose you roll the die twice.


Let \(\Omega_1\) be the sample space for the first roll and \(\Omega_2\) the sample space for the second.

We will denote the sample space of rolling the die twice with \(\Omega\).

What is \(\Omega\)?

Product of sample space

In general,

\[\Omega=\Omega_1 \times \Omega_2=\left\{\left(\omega_1, \omega_2\right):\omega_1\in \Omega_1, \omega_2\in\Omega_2\right\}\]

That is, the sample space generated by observing multiple experiments is a product of the individual sample spaces consisting of all combinations of outcomes of individual experiments.

Example: Rolling a die twice

What is \(P\left(\left\{\left(1,6\right)\right\}\right)\)?

  • The number of possible outcomes in \(\Omega\) is \(6\times6=36\).
  • \(\left\{\left(1,6\right)\right\}\) is an event with a single outcome.
  • \(P\left(\left\{\left(1,6\right)\right\}\right)=1/36\).
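
The product sample space can be built explicitly with `itertools.product`. This sketch assumes a fair die, so that all 36 ordered pairs are equally likely; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

omega_1 = range(1, 7)  # first roll
omega_2 = range(1, 7)  # second roll

# All ordered pairs (first roll, second roll).
omega = set(product(omega_1, omega_2))


def prob(event):
    """P(event) by counting, assuming equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))


print(len(omega))         # 36
print(prob({(1, 6)}))     # 1/36
print((6, 1) in omega)    # True: (6, 1) is a different outcome from (1, 6)
```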

Example: Drawing 5 cards from a deck

In a standard deck of 52 playing cards, there are 13 cards in each of the four suits (♠, ♥, ♦, ♣).

What is the probability of drawing

A♠, K♠, Q♠, J♠, 10♠

consecutively from a standard deck in the specific order?

  • What is the number of uniquely ordered ways of drawing 5 cards from a deck of 52 cards?
  • \(52\times51\times50\times49\times48 = 311,875,200\) ways
  • Or, \(\frac{52!}{(52 - 5)!}\)
  • \(\implies P\left(\left\{\left(\text{A♠, K♠, Q♠, J♠, 10♠}\right)\right\}\right)=1/311,875,200\approx\) 1 in 300 million.
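
The count of ordered draws can be reproduced in Python. A minimal sketch, assuming Python 3.8 or later for `math.perm`:

```python
from fractions import Fraction
from math import factorial, perm

# Ordered ways of drawing 5 cards from 52 without replacement.
ordered_draws = perm(52, 5)
print(ordered_draws)  # 311875200

# Same count from the factorial formula 52! / (52 - 5)!
print(ordered_draws == factorial(52) // factorial(52 - 5))  # True

# Probability of one specific ordered sequence of 5 cards.
p_specific_order = Fraction(1, ordered_draws)
print(float(p_specific_order))
```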

Permutation

Any ordered sequence of \(n\) objects taken from a set of \(N\) distinct objects is called a permutation. The number of possible permutations of size \(n\) from \(N\) objects is

\[{}_NP_{n}=\frac{N!}{\left(N-n\right)!}.\]

Examples

  • Selecting individuals from a baseball team of 12 for a starting lineup by position
  • Allocating 10 pre-construction condo units to a group of 7 applicants
  • Allocating 7 pre-construction condo units to a group of 10 applicants
  • Forming other “words” by rearranging letters in “MICHAEL”
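
For the last example, the arrangements can be enumerated directly and compared with the formula. The helper name `n_permutations` is hypothetical; it is a sketch of the formula above, not a library function.

```python
from itertools import permutations
from math import factorial


def n_permutations(N, n):
    """Number of ordered sequences of n objects taken from N distinct objects."""
    return factorial(N) // factorial(N - n)


# "MICHAEL" has 7 distinct letters, so every rearrangement is a different "word".
arrangements = {"".join(p) for p in permutations("MICHAEL")}

print(len(arrangements))         # 5040
print(n_permutations(7, 7))      # 5040
print(len(arrangements) - 1)     # 5039 "words" other than MICHAEL itself
```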

Example: Drawing 5 cards from a deck

What is the probability of drawing

A♠, K♠, Q♠, J♠, 10♠

consecutively from a standard deck in any order?

  • What is the number of ways to order the 5 cards?
  • \(5\times4\times3\times2\times1=5!=120\)
  • \(\implies P(\text{A♠, K♠, Q♠, J♠, 10♠ in any order})=120/311,875,200\approx\) 120 in 300 million, or about 1 in 2.6 million.
  • In other words, there are \(311,875,200/120=2,598,960\) ways to select 5 cards when we don’t consider the order.
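
The same numbers can be checked in Python. A minimal sketch, assuming Python 3.8 or later for `math.perm` and `math.comb`:

```python
from fractions import Fraction
from math import comb, factorial, perm

ordered_draws = perm(52, 5)   # ordered ways to draw 5 cards
orderings = factorial(5)      # ways to order the 5 chosen cards

# Probability of drawing the five named cards in any order.
p_any_order = Fraction(orderings, ordered_draws)
print(p_any_order)  # 1/2598960

# Number of unordered 5-card selections.
print(ordered_draws // orderings)  # 2598960
print(comb(52, 5))                 # 2598960
```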

Combination

Any unordered collection of \(n\) objects taken from a set of \(N\) distinct objects is called a combination. The number of possible combinations of size \(n\) from \(N\) objects is

\[\binom{N}{n}=\frac{N!}{\left(N-n\right)!\cdot n!}.\]

Examples

  • Dividing a baseball team of 12 into 2 teams for practice
  • Selecting 7 winners from 10 scholarship applicants
  • Forming other “words” by rearranging letters in “MOON”
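
Two of these examples can be counted in Python. The link between "MOON" and combinations is an interpretation, not something stated above: because the two O's are interchangeable, one way to count the distinct arrangements is to choose 2 of the 4 positions for the O's and then order M and N in the remaining positions.

```python
from itertools import combinations, permutations
from math import comb, factorial

# Selecting 7 winners from 10 scholarship applicants.
print(comb(10, 7))  # 120
# Enumerating the selections gives the same count.
print(len(list(combinations(range(10), 7))))  # 120

# Distinct rearrangements of "MOON" (swapping the two O's changes nothing).
moon_words = {"".join(p) for p in permutations("MOON")}
print(len(moon_words))  # 12

# Assumed interpretation: positions for the O's, then order M and N.
print(comb(4, 2) * factorial(2))  # 12
```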

R worksheet

Install learnr and run R worksheet

  1. Click here to install learnr on r.datatools.utoronto.ca

  2. Follow this link to open the worksheet



If you see an error, try:

  1. Log in to r.datatools.utoronto.ca
  2. Find rlesson01 from Files pane
  3. Click Run Document

Other steps you may try:

  1. Remove any .Rmd and .R files on the home directory of r.datatools.utoronto.ca
  2. In RStudio,
    1. Click Tools > Global Options
    2. Uncheck “Restore most recently opened project at startup”
  3. Run install.packages("learnr") in RStudio after the steps above or click here

Summary

  • Probability maps events to numbers representing the level of uncertainty associated with the events
  • The three axioms of probability provide the basic mathematical properties of probability
  • In simple experiments with finite and equally likely outcomes, we can compute probabilities by counting the number of possible outcomes

Practice questions

Chapter 2, Dekking et al.

  • Quick Exercises 2.1, 2.3, 2.5, 2.7
  • Exercises from Dekking et al. Chapter 2: 2.1, 2.2, 2.6, 2.7, 2.9-2.19